From: "Paul E. McKenney" <paulmck@kernel.org>
To: Kent Overstreet <kent.overstreet@gmail.com>
Cc: rcu@vger.kernel.org
Subject: Re: SRCU question
Date: Sun, 15 Nov 2020 12:11:24 -0800
Message-ID: <20201115201124.GL3249@paulmck-ThinkPad-P72>
In-Reply-To: <20201112201547.GF3365678@moria.home.lan>

On Thu, Nov 12, 2020 at 03:15:47PM -0500, Kent Overstreet wrote:
> Hi to Paul & the rcu mailing list,
> 
> I've got a problem I'm trying to figure out if I can adapt SRCU for.
> 
> Within bcachefs, struct btree objects, and now also btree_cached_key
> objects, currently aren't freed until the filesystem is torn down, because
> btree iterators contain pointers to them and will drop and retake locks
> (iff a sequence number hasn't changed) without holding a ref. With the
> btree key cache code, this is now something I need to fix.
> 
> What I plan on doing is having struct btree_trans (container for btree
> iterators) hold an srcu read lock while it's alive. But, I don't want to just
> use call_srcu to free the btree/btree_cached_key objects, because btree trans
> objects can at times be fairly long lived and the existing code can reuse these
> objects for other btree nodes/btree cached keys immediately. Freeing them with
> call_srcu() would break that; I could have my callback function check if the
> object has been reused before freeing, but I'd still have a problem when the
> object gets freed a second time before the first call_srcu() has finished.
> 
> What I'm wondering is if the SRCU code has a well defined notion of a clock that
> I could make use of. What I would like to do is, instead of doing call_srcu() to
> free the object - just mark that object with the current SRCU context time, and
> then when my shrinkers run they can free objects that haven't been reused and
> are old enough according to the current SRCU time.
> 
> Thoughts?

An early prototype is available on -rcu [1].  The Tree SRCU version
seems to be reasonably reliable, but I do not yet trust the Tiny SRCU
implementation.  So please avoid !SMP&&!PREEMPT if you would like to
give it a try.  Unless you -really- like helping me find bugs, in which
case full speed ahead!!!

Here is the API:

unsigned long start_poll_synchronize_srcu(struct srcu_struct *ssp)

	Returns a "cookie" that can be thought of as a snapshot of your
	"clock" above.	(SRCU calls it a "grace-period sequence number".)
	Also ensures that enough future grace periods happen to eventually
	make the grace-period sequence number reach the cookie.

bool poll_state_synchronize_srcu(struct srcu_struct *ssp, unsigned long cookie)

	Given a cookie from start_poll_synchronize_srcu(), returns true if
	at least one full SRCU grace period has elapsed in the meantime.
	Given finite SRCU readers in a well-behaved kernel, the following
	code will complete in finite time:

		cookie = start_poll_synchronize_srcu(&my_srcu);
		while (!poll_state_synchronize_srcu(&my_srcu, cookie))
			schedule_timeout_uninterruptible(1);

unsigned long get_state_synchronize_srcu(struct srcu_struct *ssp)

	Like start_poll_synchronize_srcu(), except that it does not start
	any grace periods.  This means that the following code is -not-
	guaranteed to complete:

		cookie = get_state_synchronize_srcu(&my_srcu);
		while (!poll_state_synchronize_srcu(&my_srcu, cookie))
			schedule_timeout_uninterruptible(1);

	Use this if you know that something else will be starting the
	needed SRCU grace periods.  This might also be useful if you
	had items that were likely to be reused before the SRCU grace
	period elapsed, so that you avoid burning CPU on SRCU grace
	periods that prove to be unnecessary.  Or if you don't want
	to have more than (say) 100 SRCU grace periods per second,
	in which case you might use a timer to start the grace periods.
	Or maybe you don't bother starting the SRCU grace period until
	some sort of emergency situation has arisen.  Or...

	OK, maybe you don't need it, but I do need it for rcutorture,
	so here it is anyway.
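
	For illustration only, a minimal sketch of the timer-driven variant
	mentioned above.  The names (my_srcu, my_gp_timer, gp_timer_fn) are
	made up, not taken from any existing code; the point is just that
	the timer is the only thing starting grace periods, while everything
	else only tags objects with get_state_synchronize_srcu():

		#include <linux/srcu.h>
		#include <linux/timer.h>

		static void gp_timer_fn(struct timer_list *t);

		DEFINE_SRCU(my_srcu);
		static DEFINE_TIMER(my_gp_timer, gp_timer_fn);

		static void gp_timer_fn(struct timer_list *t)
		{
			/* The only place that actually starts grace periods. */
			(void)start_poll_synchronize_srcu(&my_srcu);
			/* Re-arm: at most ~100 grace-period starts per second. */
			mod_timer(t, jiffies + HZ / 100);
		}

	(The timer would be armed once from init code with mod_timer().)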

All of these can be invoked anywhere that call_srcu() can be invoked.
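
For concreteness, here is a rough sketch of the tag-and-shrink scheme you
describe, built on the above API.  The names (cached_obj, free_cookie,
obj_retire, obj_shrink_one) are invented for illustration and are not
bcachefs code; locking around the list and the dead flag is omitted:

	#include <linux/list.h>
	#include <linux/slab.h>
	#include <linux/srcu.h>

	struct cached_obj {
		struct list_head list;
		unsigned long	 free_cookie;	/* SRCU grace-period cookie */
		bool		 dead;		/* set on retire, cleared on reuse */
		/* ... payload ... */
	};

	/* Object is no longer visible to new readers; stamp the "clock". */
	static void obj_retire(struct srcu_struct *ssp, struct cached_obj *obj)
	{
		obj->dead = true;
		obj->free_cookie = start_poll_synchronize_srcu(ssp);
	}

	/* Called from the shrinker for each candidate object. */
	static bool obj_shrink_one(struct srcu_struct *ssp, struct cached_obj *obj)
	{
		/* Reused in the meantime?  Then leave it alone. */
		if (!obj->dead)
			return false;
		/* Old readers might still hold a pointer; try again later. */
		if (!poll_state_synchronize_srcu(ssp, obj->free_cookie))
			return false;
		list_del(&obj->list);
		kfree(obj);
		return true;
	}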

Does this look like it will work for you?

							Thanx, Paul

[1] git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu.git

