From: Yunsheng Lin <linyunsheng@huawei.com>
To: <davem@davemloft.net>, <kuba@kernel.org>
Cc: <ast@kernel.org>, <daniel@iogearbox.net>, <andriin@fb.com>,
	<edumazet@google.com>, <weiwan@google.com>,
	<cong.wang@bytedance.com>, <ap420073@gmail.com>,
	<netdev@vger.kernel.org>, <linux-kernel@vger.kernel.org>,
	<linuxarm@openeuler.org>
Subject: [PATCH RFC] net: sched: implement TCQ_F_CAN_BYPASS for lockless qdisc
Date: Sat, 13 Mar 2021 10:47:47 +0800
Message-ID: <1615603667-22568-1-git-send-email-linyunsheng@huawei.com>

Currently pfifo_fast has both the TCQ_F_CAN_BYPASS and TCQ_F_NOLOCK
flags set, but queue discipline bypass does not work for a lockless
qdisc because the skb is always enqueued to the qdisc, even when the
qdisc is empty; see __dev_xmit_skb().

This patch calls sch_direct_xmit() to transmit the skb directly
to the driver for an empty lockless qdisc too, which avoids the
enqueue and dequeue operations. qdisc->empty is set to false
whenever an skb is enqueued, and is set to true when skb dequeuing
returns NULL; see pfifo_fast_dequeue().

Also, the qdisc is scheduled at the end of qdisc_run_end() when
q->empty is false, to avoid packets getting stuck in the qdisc.

Performance in the ip_forward test improves by about 10% with this
patch.

Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com>
---
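For readers who want to trace the control flow outside the kernel,
the bypass decision described above can be sketched as a small
single-threaded user-space model. This is only an illustration, not
kernel code: every model_* name, MODEL_QLEN and the three counters
are hypothetical, plain bool fields stand in for spin_trylock() and
READ_ONCE()/WRITE_ONCE(), and the direct transmit is assumed to
always succeed.

```c
/* Single-threaded user-space model of the bypass path; not kernel code. */
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MODEL_QLEN 8		/* hypothetical ring capacity */

struct model_qdisc {
	bool running;		/* stands in for holding qdisc->seqlock */
	bool empty;		/* stands in for qdisc->empty */
	int ring[MODEL_QLEN];
	size_t head, tail;	/* free-running consumer/producer indices */
	int sent_direct;	/* packets that bypassed the queue */
	int sent_queued;	/* packets sent after enqueue + dequeue */
	int rescheduled;	/* stand-in for __netif_schedule() calls */
};

static void model_init(struct model_qdisc *q)
{
	q->running = false;
	q->empty = true;
	q->head = q->tail = 0;
	q->sent_direct = q->sent_queued = q->rescheduled = 0;
}

/* Like qdisc_run_begin(): try to become the only runner. */
static bool model_run_begin(struct model_qdisc *q)
{
	if (q->running)
		return false;
	q->running = true;
	return true;
}

/* Like the patched qdisc_run_end(): reschedule if packets remain. */
static void model_run_end(struct model_qdisc *q)
{
	assert(q->running);
	q->running = false;
	if (!q->empty)
		q->rescheduled++;
}

static bool model_enqueue(struct model_qdisc *q, int pkt)
{
	if (q->tail - q->head == MODEL_QLEN)
		return false;	/* ring full, packet dropped */
	q->ring[q->tail++ % MODEL_QLEN] = pkt;
	return true;
}

/* Returns -1 (the model's NULL) and marks the qdisc empty when drained. */
static int model_dequeue(struct model_qdisc *q)
{
	if (q->head == q->tail) {
		q->empty = true;
		return -1;
	}
	return q->ring[q->head++ % MODEL_QLEN];
}

static void model_run(struct model_qdisc *q)
{
	while (model_dequeue(q) >= 0)
		q->sent_queued++;
}

/* Mirrors the TCQ_F_NOLOCK branch of the patched __dev_xmit_skb(). */
static bool model_xmit(struct model_qdisc *q, int pkt)
{
	bool ok;

	if (q->empty && model_run_begin(q)) {
		q->sent_direct++;	/* direct transmit, assumed to succeed */
		if (!q->empty)
			model_run(q);
		model_run_end(q);
		return true;
	}

	ok = model_enqueue(q, pkt);
	q->empty = false;
	if (model_run_begin(q)) {
		model_run(q);
		model_run_end(q);
	}
	return ok;
}
```

To exercise the model, call model_init() and then model_xmit() a few
times. A packet sent to an idle, empty qdisc goes out directly. If
running is set by hand to imitate another CPU holding the lock, the
packet is queued instead and the next model_run_end() counts a
reschedule.
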
 include/net/sch_generic.h |  7 +++++--
 net/core/dev.c            | 11 +++++++++++
 2 files changed, 16 insertions(+), 2 deletions(-)

diff --git a/include/net/sch_generic.h b/include/net/sch_generic.h
index 2d6eb60..6591356 100644
--- a/include/net/sch_generic.h
+++ b/include/net/sch_generic.h
@@ -161,7 +161,6 @@ static inline bool qdisc_run_begin(struct Qdisc *qdisc)
 	if (qdisc->flags & TCQ_F_NOLOCK) {
 		if (!spin_trylock(&qdisc->seqlock))
 			return false;
-		WRITE_ONCE(qdisc->empty, false);
 	} else if (qdisc_is_running(qdisc)) {
 		return false;
 	}
@@ -176,8 +175,12 @@ static inline bool qdisc_run_begin(struct Qdisc *qdisc)
 static inline void qdisc_run_end(struct Qdisc *qdisc)
 {
 	write_seqcount_end(&qdisc->running);
-	if (qdisc->flags & TCQ_F_NOLOCK)
+	if (qdisc->flags & TCQ_F_NOLOCK) {
 		spin_unlock(&qdisc->seqlock);
+
+		if (unlikely(!READ_ONCE(qdisc->empty)))
+			__netif_schedule(qdisc);
+	}
 }
 
 static inline bool qdisc_may_bulk(const struct Qdisc *qdisc)
diff --git a/net/core/dev.c b/net/core/dev.c
index 2bfdd52..fa8504d 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -3791,7 +3791,18 @@ static inline int __dev_xmit_skb(struct sk_buff *skb, struct Qdisc *q,
 	qdisc_calculate_pkt_len(skb, q);
 
 	if (q->flags & TCQ_F_NOLOCK) {
+		if (q->flags & TCQ_F_CAN_BYPASS && READ_ONCE(q->empty) && qdisc_run_begin(q)) {
+			qdisc_bstats_cpu_update(q, skb);
+
+			if (sch_direct_xmit(skb, q, dev, txq, NULL, true) && !READ_ONCE(q->empty))
+				__qdisc_run(q);
+
+			qdisc_run_end(q);
+			return NET_XMIT_SUCCESS;
+		}
+
 		rc = q->enqueue(skb, q, &to_free) & NET_XMIT_MASK;
+		WRITE_ONCE(q->empty, false);
 		qdisc_run(q);
 
 		if (unlikely(to_free))
-- 
2.7.4


Thread overview: 26+ messages
2021-03-13  2:47 Yunsheng Lin [this message]
2021-03-14  0:03 ` [PATCH RFC] net: sched: implement TCQ_F_CAN_BYPASS for lockless qdisc Vladimir Oltean
2021-03-14 10:15   ` Marc Kleine-Budde
2021-03-15  0:50     ` Yunsheng Lin
2021-03-15  3:10 ` [RFC v2] " Yunsheng Lin
2021-03-15 12:29   ` Vladimir Oltean
2021-03-15 13:09   ` Marc Kleine-Budde
2021-03-15 18:53   ` Jakub Kicinski
2021-03-16  0:35     ` Yunsheng Lin
2021-03-16  3:47       ` [Linuxarm] " Yunsheng Lin
2021-03-16  8:15       ` Eric Dumazet
2021-03-16 12:36         ` Yunsheng Lin
2021-03-16 22:48     ` Cong Wang
2021-03-17  1:14       ` Yunsheng Lin
2021-03-17 13:35       ` Toke Høiland-Jørgensen
2021-03-17 13:45         ` Jason A. Donenfeld
2021-03-18  7:33           ` [Linuxarm] " Yunsheng Lin
2021-03-19 18:15             ` Cong Wang
2021-03-22  0:55               ` Yunsheng Lin
2021-03-24  1:49                 ` Cong Wang
2021-03-24  2:36                   ` Yunsheng Lin
2021-03-19 19:03             ` Jason A. Donenfeld
2021-03-22  1:05               ` Yunsheng Lin
2021-03-18  7:10   ` Ahmad Fatoum
2021-03-18  7:46     ` Yunsheng Lin
2021-03-18  9:09       ` Ahmad Fatoum
