linux-kernel.vger.kernel.org archive mirror
From: Jakub Kicinski <kuba@kernel.org>
To: Seth Forshee <seth@forshee.me>
Cc: "David S. Miller" <davem@davemloft.net>,
	Jamal Hadi Salim <jhs@mojatatu.com>,
	Cong Wang <xiyou.wangcong@gmail.com>,
	Jiri Pirko <jiri@resnulli.us>,
	"Paul E. McKenney" <paulmck@kernel.org>,
	netdev@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH] net: sch: eliminate unnecessary RCU waits in mini_qdisc_pair_swap()
Date: Mon, 25 Oct 2021 12:48:28 -0700
Message-ID: <20211025124828.1e4900e3@kicinski-fedora-pc1c0hjn.dhcp.thefacebook.com>
In-Reply-To: <20211022161747.81609-1-seth@forshee.me>

On Fri, 22 Oct 2021 11:17:46 -0500 Seth Forshee wrote:
> From: Seth Forshee <sforshee@digitalocean.com>
> 
> Currently rcu_barrier() is used to ensure that no readers of the
> inactive mini_Qdisc buffer remain before it is reused. This waits for
> any pending RCU callbacks to complete, when all that is actually
> required is to wait for one RCU grace period to elapse after the buffer
> was made inactive. This means that using rcu_barrier() may result in
> unnecessary waits.
> 
> To improve this, store the current RCU state when a buffer is made
> inactive and use poll_state_synchronize_rcu() to check whether a full
> grace period has elapsed before reusing it. If a full grace period has
> not elapsed, wait for a grace period to elapse, and in the non-RT case
> use synchronize_rcu_expedited() to hasten it.
> 
> Since this approach eliminates the RCU callback it is no longer
> necessary to synchronize_rcu() in the tp_head==NULL case. However, the
> RCU state should still be saved for the previously active buffer.
> 
> Before this change I would typically see mini_qdisc_pair_swap() take
> tens of milliseconds to complete. After this change it typically
> finishes in less than 1 ms, and often it takes just a few microseconds.
> 
> Thanks to Paul for walking me through the options for improving this.
> 
> Cc: "Paul E. McKenney" <paulmck@kernel.org>
> Signed-off-by: Seth Forshee <sforshee@digitalocean.com>

LGTM, but please rebase and retest on top of latest net-next.

>  void mini_qdisc_pair_swap(struct mini_Qdisc_pair *miniqp,
>  			  struct tcf_proto *tp_head)
>  {
> @@ -1423,28 +1419,30 @@ void mini_qdisc_pair_swap(struct mini_Qdisc_pair *miniqp,
>  
>  	if (!tp_head) {
>  		RCU_INIT_POINTER(*miniqp->p_miniq, NULL);
> -		/* Wait for flying RCU callback before it is freed. */
> -		rcu_barrier();
> -		return;
> -	}
> +	} else {
> +		miniq = !miniq_old || miniq_old == &miniqp->miniq2 ?
> +			&miniqp->miniq1 : &miniqp->miniq2;
>  
> -	miniq = !miniq_old || miniq_old == &miniqp->miniq2 ?
> -		&miniqp->miniq1 : &miniqp->miniq2;

nit: any reason this doesn't read:

	miniq = miniq_old != &miniqp->miniq1 ?
		&miniqp->miniq1 : &miniqp->miniq2;

Surely it's not equal to miniq1 or miniq2 if it's NULL.

> +		/* We need to make sure that readers won't see the miniq
> +		 * we are about to modify. So ensure that at least one RCU
> +		 * grace period has elapsed since the miniq was made
> +		 * inactive.
> +		 */
> +		if (IS_ENABLED(CONFIG_PREEMPT_RT))
> +			cond_synchronize_rcu(miniq->rcu_state);
> +		else if (!poll_state_synchronize_rcu(miniq->rcu_state))
> +			synchronize_rcu_expedited();
>  
> -	/* We need to make sure that readers won't see the miniq
> -	 * we are about to modify. So wait until previous call_rcu callback
> -	 * is done.
> -	 */
> -	rcu_barrier();
> -	miniq->filter_list = tp_head;
> -	rcu_assign_pointer(*miniqp->p_miniq, miniq);
> +		miniq->filter_list = tp_head;
> +		rcu_assign_pointer(*miniqp->p_miniq, miniq);
> +	}
>  
>  	if (miniq_old)
> -		/* This is counterpart of the rcu barriers above. We need to
> +		/* This is counterpart of the rcu sync above. We need to
>  		 * block potential new user of miniq_old until all readers
>  		 * are not seeing it.
>  		 */
> -		call_rcu(&miniq_old->rcu, mini_qdisc_rcu_func);
> +		miniq_old->rcu_state = start_poll_synchronize_rcu();
>  }
>  EXPORT_SYMBOL(mini_qdisc_pair_swap);
>  
> @@ -1463,6 +1461,8 @@ void mini_qdisc_pair_init(struct mini_Qdisc_pair *miniqp, struct Qdisc *qdisc,
>  	miniqp->miniq1.cpu_qstats = qdisc->cpu_qstats;
>  	miniqp->miniq2.cpu_bstats = qdisc->cpu_bstats;
>  	miniqp->miniq2.cpu_qstats = qdisc->cpu_qstats;
> +	miniqp->miniq1.rcu_state = get_state_synchronize_rcu();
> +	miniqp->miniq2.rcu_state = miniqp->miniq1.rcu_state;
>  	miniqp->p_miniq = p_miniq;
>  }
>  EXPORT_SYMBOL(mini_qdisc_pair_init);


Thread overview: 4+ messages
2021-10-22 16:17 [PATCH] net: sch: eliminate unnecessary RCU waits in mini_qdisc_pair_swap() Seth Forshee
2021-10-22 20:36 ` Paul E. McKenney
2021-10-25 19:48 ` Jakub Kicinski [this message]
2021-10-26 12:27   ` Seth Forshee
