All of lore.kernel.org
 help / color / mirror / Atom feed
From: "Paul E. McKenney" <paulmck@kernel.org>
To: Frederic Weisbecker <frederic@kernel.org>
Cc: LKML <linux-kernel@vger.kernel.org>, rcu <rcu@vger.kernel.org>,
	Uladzislau Rezki <urezki@gmail.com>,
	Neeraj Upadhyay <quic_neeraju@quicinc.com>,
	Boqun Feng <boqun.feng@gmail.com>,
	Joel Fernandes <joel@joelfernandes.org>
Subject: Re: [PATCH 1/4] rcu/nocb: Protect lazy shrinker against concurrent (de-)offloading
Date: Fri, 24 Mar 2023 15:51:54 -0700	[thread overview]
Message-ID: <ae1cb391-aeed-4587-8d9d-50909c918fb1@paulmck-laptop> (raw)
In-Reply-To: <ZB4fhA1BafN7h2N3@localhost.localdomain>

On Fri, Mar 24, 2023 at 11:09:08PM +0100, Frederic Weisbecker wrote:
> Le Wed, Mar 22, 2023 at 04:18:24PM -0700, Paul E. McKenney a écrit :
> > > @@ -1336,13 +1336,25 @@ lazy_rcu_shrink_scan(struct shrinker *shrink, struct shrink_control *sc)
> > >  	unsigned long flags;
> > >  	unsigned long count = 0;
> > >  
> > > +	/*
> > > +	 * Protect against concurrent (de-)offloading. Otherwise nocb locking
> > > +	 * may be ignored or imbalanced.
> > > +	 */
> > > +	mutex_lock(&rcu_state.barrier_mutex);
> > 
> > I was worried about this possibly leading to out-of-memory deadlock,
> > but if I recall correctly, the (de-)offloading process never allocates
> > memory, so this should be OK?
> 
> Good point. It _should_ be fine but like you, Joel and Hillf pointed out
> it's asking for trouble.
> 
> We could try Joel's idea to use mutex_trylock() as a best effort, which
> should be fine as it's mostly uncontended.
> 
> The alternative is to force nocb locking and check the offloading state
> right after. So instead of:
> 
> 	rcu_nocb_lock_irqsave(rdp, flags);
> 	//flush stuff
> 	rcu_nocb_unlock_irqrestore(rdp, flags);
> 
> Have:
> 
> 	raw_spin_lock_irqsave(rdp->nocb_lock, flags);
> 	if (!rcu_rdp_is_offloaded(rdp))
> 		raw_spin_unlock_irqrestore(rdp->nocb_lock, flags);
> 		continue;
> 	}
> 	//flush stuff
> 	rcu_nocb_unlock_irqrestore(rdp, flags);
> 
> But it's not pretty and also disqualifies the last two patches as
> rcu_nocb_mask can't be iterated safely anymore.
> 
> What do you think?

The mutex_trylock() approach does have the advantage of simplicity,
and as you say should do well given low contention.

Which reminds me, what sort of test strategy did you have in mind?
Memory exhaustion can have surprising effects.

> > >  	/* Snapshot count of all CPUs */
> > >  	for_each_possible_cpu(cpu) {
> > >  		struct rcu_data *rdp = per_cpu_ptr(&rcu_data, cpu);
> > > -		int _count = READ_ONCE(rdp->lazy_len);
> > > +		int _count;
> > > +
> > > +		if (!rcu_rdp_is_offloaded(rdp))
> > > +			continue;
> > 
> > If the CPU is offloaded, isn't ->lazy_len guaranteed to be zero?
> > 
> > Or can it contain garbage after a de-offloading operation?
> 
> If it's deoffloaded, ->lazy_len is indeed (supposed to be) guaranteed to be zero.
> Bypass is flushed and disabled atomically early on de-offloading and the
> flush resets ->lazy_len.

Whew!  At the moment, I don't feel strongly about whether or not
the following code should (1) read the value, (2) warn on non-zero,
(3) assume zero without reading, or (4) some other option that is not
occurring to me.  Your choice!

							Thanx, Paul

  reply	other threads:[~2023-03-24 22:52 UTC|newest]

Thread overview: 20+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-03-22 19:44 [PATCH 0/4] rcu/nocb: Shrinker related boring fixes Frederic Weisbecker
2023-03-22 19:44 ` [PATCH 1/4] rcu/nocb: Protect lazy shrinker against concurrent (de-)offloading Frederic Weisbecker
2023-03-22 23:18   ` Paul E. McKenney
2023-03-24  0:55     ` Joel Fernandes
2023-03-24  1:06       ` Paul E. McKenney
2023-03-24 22:09     ` Frederic Weisbecker
2023-03-24 22:51       ` Paul E. McKenney [this message]
2023-03-26 20:01         ` Frederic Weisbecker
2023-03-26 21:45           ` Paul E. McKenney
2023-03-29 16:07             ` Frederic Weisbecker
2023-03-29 20:45               ` Paul E. McKenney
2023-03-22 19:44 ` [PATCH 2/4] rcu/nocb: Fix shrinker race against callback enqueuer Frederic Weisbecker
2023-03-22 23:19   ` Paul E. McKenney
2023-03-22 19:44 ` [PATCH 3/4] rcu/nocb: Recheck lazy callbacks under the ->nocb_lock from shrinker Frederic Weisbecker
2023-03-22 23:21   ` Paul E. McKenney
2023-03-22 19:44 ` [PATCH 4/4] rcu/nocb: Make shrinker to iterate only NOCB CPUs Frederic Weisbecker
2023-03-24  0:41   ` Joel Fernandes
2023-03-29 16:01 [PATCH 0/4 v2] rcu/nocb: Shrinker related boring fixes Frederic Weisbecker
2023-03-29 16:02 ` [PATCH 1/4] rcu/nocb: Protect lazy shrinker against concurrent (de-)offloading Frederic Weisbecker
2023-03-29 20:44   ` Paul E. McKenney
2023-03-29 21:18     ` Frederic Weisbecker

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=ae1cb391-aeed-4587-8d9d-50909c918fb1@paulmck-laptop \
    --to=paulmck@kernel.org \
    --cc=boqun.feng@gmail.com \
    --cc=frederic@kernel.org \
    --cc=joel@joelfernandes.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=quic_neeraju@quicinc.com \
    --cc=rcu@vger.kernel.org \
    --cc=urezki@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.