From: Mukesh Ojha <quic_mojha@quicinc.com>
To: <paulmck@kernel.org>, Tejun Heo <tj@kernel.org>
Cc: lkml <linux-kernel@vger.kernel.org>,
	Thomas Gleixner <tglx@linutronix.de>, <jiangshanlai@gmail.com>
Subject: Re: synchronize_rcu_expedited gets stuck in hotplug path
Date: Mon, 24 Jan 2022 19:32:01 +0530
Message-ID: <4f2ada96-234f-31d8-664a-c84f5b461385@quicinc.com>
In-Reply-To: <20220118214155.GK947480@paulmck-ThinkPad-P17-Gen-1>


On 1/19/2022 3:11 AM, Paul E. McKenney wrote:
> On Tue, Jan 18, 2022 at 10:11:34AM -1000, Tejun Heo wrote:
>> Hello,
>>
>> On Tue, Jan 18, 2022 at 12:06:46PM -0800, Paul E. McKenney wrote:
>>> Interesting.  Adding Tejun and Lai on CC for their perspective.
>>>
>>> As you say, the incoming CPU invoked synchronize_rcu_expedited() which
>>> in turn invoked queue_work().  By default, workqueues will of course
>>> queue that work on the current CPU.  But in this case, the CPU's bit
>>> is not yet set in the cpu_active_mask.  Thus, a workqueue scheduled on
>>> the incoming CPU won't be invoked until CPUHP_AP_ACTIVE, which won't
>>> be reached until after the grace period ends, which cannot happen until
>>> the workqueue handler is invoked.
>>>
>>> I could imagine doing something as shown in the (untested) patch below,
>>> but first does this help?
>>>
>>> If it does help, would this sort of check be appropriate here or
>>> should it instead go into workqueues?
>> Maybe it can be solved by rearranging the hotplug sequence but it's fragile
>> to schedule per-cpu work items from hotplug paths. Maybe the whole issue can
>> be side-stepped by making synchronize_rcu_expedited() use unbound workqueue
>> instead? Does it require to be per-cpu?
> Good point!
>
> And now that you mention it, RCU expedited grace periods already avoid
> using workqueues during early boot.  The (again untested) patch below
> extends that approach to incoming CPUs.
>
> Thoughts?

Hi Paul,

We no longer see the issue with this patch applied.
Can we merge this patch?

-Mukesh

>
> 							Thanx, Paul
>
> ------------------------------------------------------------------------
>
> diff --git a/kernel/rcu/tree_exp.h b/kernel/rcu/tree_exp.h
> index 60197ea24ceb9..1a45667402260 100644
> --- a/kernel/rcu/tree_exp.h
> +++ b/kernel/rcu/tree_exp.h
> @@ -816,7 +816,7 @@ static int rcu_print_task_exp_stall(struct rcu_node *rnp)
>    */
>   void synchronize_rcu_expedited(void)
>   {
> -	bool boottime = (rcu_scheduler_active == RCU_SCHEDULER_INIT);
> +	bool no_wq;
>   	struct rcu_exp_work rew;
>   	struct rcu_node *rnp;
>   	unsigned long s;
> @@ -841,9 +841,15 @@ void synchronize_rcu_expedited(void)
>   	if (exp_funnel_lock(s))
>   		return;  /* Someone else did our work for us. */
>   
> +	/* Don't use workqueue during boot or from an incoming CPU. */
> +	preempt_disable();
> +	no_wq = rcu_scheduler_active == RCU_SCHEDULER_INIT ||
> +		!cpumask_test_cpu(smp_processor_id(), cpu_active_mask);
> +	preempt_enable();
> +
>   	/* Ensure that load happens before action based on it. */
> -	if (unlikely(boottime)) {
> -		/* Direct call during scheduler init and early_initcalls(). */
> +	if (unlikely(no_wq)) {
> +		/* Direct call for scheduler init, early_initcall()s, and incoming CPUs. */
>   		rcu_exp_sel_wait_wake(s);
>   	} else {
>   		/* Marshall arguments & schedule the expedited grace period. */
> @@ -861,7 +867,7 @@ void synchronize_rcu_expedited(void)
>   	/* Let the next expedited grace period start. */
>   	mutex_unlock(&rcu_state.exp_mutex);
>   
> -	if (likely(!boottime))
> +	if (likely(!no_wq))
>   		destroy_work_on_stack(&rew.rew_work);
>   }
>   EXPORT_SYMBOL_GPL(synchronize_rcu_expedited);
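
For readers following the patch above, the decision it adds can be modeled
in a few lines of userspace C. This is only a sketch of the no_wq test, not
kernel code: enum sched_state, struct cpu_model, exp_gp_bypass_wq() and
exp_gp_model_selftest() are hypothetical names introduced here for
illustration, and the "active" flag stands in for the calling CPU's bit in
cpu_active_mask.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for rcu_scheduler_active; not the kernel's type. */
enum sched_state { SCHED_INACTIVE, SCHED_INIT, SCHED_RUNNING };

/* Hypothetical model of one CPU; "active" mirrors its cpu_active_mask bit. */
struct cpu_model {
	bool active;
};

/*
 * Models the patch's no_wq computation: bypass the workqueue and make the
 * direct call during scheduler init, or when the calling CPU is not yet
 * marked active (an incoming CPU that has not reached CPUHP_AP_ACTIVE).
 */
bool exp_gp_bypass_wq(enum sched_state state, const struct cpu_model *cpu)
{
	return state == SCHED_INIT || !cpu->active;
}

/* Exercises the three cases discussed in the thread. */
void exp_gp_model_selftest(void)
{
	struct cpu_model incoming = { .active = false };
	struct cpu_model online = { .active = true };

	assert(exp_gp_bypass_wq(SCHED_INIT, &online));      /* early boot */
	assert(exp_gp_bypass_wq(SCHED_RUNNING, &incoming)); /* incoming CPU */
	assert(!exp_gp_bypass_wq(SCHED_RUNNING, &online));  /* normal case */
}
```

In the real patch the test is wrapped in preempt_disable()/preempt_enable()
so that smp_processor_id() is stable while the mask is read; the model
omits that because it has no notion of preemption.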

Thread overview: 10+ messages
2022-01-18 11:46 synchronize_rcu_expedited gets stuck in hotplug path Mukesh Ojha
2022-01-18 20:06 ` Paul E. McKenney
2022-01-18 20:11   ` Tejun Heo
2022-01-18 21:41     ` Paul E. McKenney
2022-01-24 14:02       ` Mukesh Ojha [this message]
2022-01-24 16:44         ` Paul E. McKenney
2022-01-24 16:58           ` Mukesh Ojha
2022-01-25 20:21             ` Paul E. McKenney
2022-01-26  7:32               ` Mukesh Ojha
2022-01-26  7:33               ` Mukesh Ojha
