* [PATCH net-next v3 0/2] net: dev: BYPASS for lockless qdisc
@ 2019-03-22 15:01 Paolo Abeni
2019-03-22 15:01 ` [PATCH net-next v3 1/2] net: sched: add empty status flag for NOLOCK qdisc Paolo Abeni
` (2 more replies)
0 siblings, 3 replies; 6+ messages in thread
From: Paolo Abeni @ 2019-03-22 15:01 UTC (permalink / raw)
To: netdev; +Cc: David S. Miller, Eric Dumazet, John Fastabend, Ivan Vecera
This patch series is aimed at improving xmit performances of lockless qdisc
in the uncontended scenario.
After the lockless refactor pfifo_fast can't leverage the BYPASS optimization.
Due to retpolines the overhead for the avoidables enqueue and dequeue operations
has increased and we see measurable regressions.
The first patch introduces the BYPASS code path for lockless qdisc, and the
second one optimizes such path further. Overall this avoids up to 3 indirect
calls per xmit packet. Detailed performance figures are reported in the 2nd
patch.
v2 -> v3:
- qdisc_is_empty() has a const argument (Eric)
v1 -> v2:
- use really an 'empty' flag instead of 'not_empty', as
suggested by Eric
Paolo Abeni (2):
net: sched: add empty status flag for NOLOCK qdisc
net: dev: introduce support for sch BYPASS for lockless qdisc
include/net/sch_generic.h | 11 +++++++++++
net/core/dev.c | 9 +++++++++
net/sched/sch_generic.c | 3 +++
3 files changed, 23 insertions(+)
--
2.20.1
^ permalink raw reply [flat|nested] 6+ messages in thread
* [PATCH net-next v3 1/2] net: sched: add empty status flag for NOLOCK qdisc
2019-03-22 15:01 [PATCH net-next v3 0/2] net: dev: BYPASS for lockless qdisc Paolo Abeni
@ 2019-03-22 15:01 ` Paolo Abeni
2019-03-22 15:18 ` Eric Dumazet
2019-03-22 15:24 ` Ivan Vecera
2019-03-22 15:01 ` [PATCH net-next v3 2/2] net: dev: introduce support for sch BYPASS for lockless qdisc Paolo Abeni
2019-03-24 1:54 ` [PATCH net-next v3 0/2] net: dev: " David Miller
2 siblings, 2 replies; 6+ messages in thread
From: Paolo Abeni @ 2019-03-22 15:01 UTC (permalink / raw)
To: netdev; +Cc: David S. Miller, Eric Dumazet, John Fastabend, Ivan Vecera
The queue is marked not empty after acquiring the seqlock,
and it's up to the NOLOCK qdisc clearing such flag on dequeue.
Since the empty status lays on the same cache-line of the
seqlock, it's always hot on cache during the updates.
This makes the empty flag update a little bit loosy. Given
the lack of synchronization between enqueue and dequeue, this
is unavoidable.
v2 -> v3:
- qdisc_is_empty() has a const argument (Eric)
v1 -> v2:
- use really an 'empty' flag instead of 'not_empty', as
suggested by Eric
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
---
include/net/sch_generic.h | 11 +++++++++++
net/sched/sch_generic.c | 3 +++
2 files changed, 14 insertions(+)
diff --git a/include/net/sch_generic.h b/include/net/sch_generic.h
index 31284c078d06..e227475e78ca 100644
--- a/include/net/sch_generic.h
+++ b/include/net/sch_generic.h
@@ -113,6 +113,9 @@ struct Qdisc {
spinlock_t busylock ____cacheline_aligned_in_smp;
spinlock_t seqlock;
+
+ /* for NOLOCK qdisc, true if there are no enqueued skbs */
+ bool empty;
struct rcu_head rcu;
};
@@ -143,11 +146,19 @@ static inline bool qdisc_is_running(struct Qdisc *qdisc)
return (raw_read_seqcount(&qdisc->running) & 1) ? true : false;
}
+static inline bool qdisc_is_empty(const struct Qdisc *qdisc)
+{
+ if (qdisc->flags & TCQ_F_NOLOCK)
+ return qdisc->empty;
+ return !qdisc->q.qlen;
+}
+
static inline bool qdisc_run_begin(struct Qdisc *qdisc)
{
if (qdisc->flags & TCQ_F_NOLOCK) {
if (!spin_trylock(&qdisc->seqlock))
return false;
+ qdisc->empty = false;
} else if (qdisc_is_running(qdisc)) {
return false;
}
diff --git a/net/sched/sch_generic.c b/net/sched/sch_generic.c
index a117d9260558..81356ef38d1d 100644
--- a/net/sched/sch_generic.c
+++ b/net/sched/sch_generic.c
@@ -671,6 +671,8 @@ static struct sk_buff *pfifo_fast_dequeue(struct Qdisc *qdisc)
qdisc_qstats_cpu_backlog_dec(qdisc, skb);
qdisc_bstats_cpu_update(qdisc, skb);
qdisc_qstats_atomic_qlen_dec(qdisc);
+ } else {
+ qdisc->empty = true;
}
return skb;
@@ -880,6 +882,7 @@ struct Qdisc *qdisc_alloc(struct netdev_queue *dev_queue,
sch->enqueue = ops->enqueue;
sch->dequeue = ops->dequeue;
sch->dev_queue = dev_queue;
+ sch->empty = true;
dev_hold(dev);
refcount_set(&sch->refcnt, 1);
--
2.20.1
^ permalink raw reply related [flat|nested] 6+ messages in thread
* [PATCH net-next v3 2/2] net: dev: introduce support for sch BYPASS for lockless qdisc
2019-03-22 15:01 [PATCH net-next v3 0/2] net: dev: BYPASS for lockless qdisc Paolo Abeni
2019-03-22 15:01 ` [PATCH net-next v3 1/2] net: sched: add empty status flag for NOLOCK qdisc Paolo Abeni
@ 2019-03-22 15:01 ` Paolo Abeni
2019-03-24 1:54 ` [PATCH net-next v3 0/2] net: dev: " David Miller
2 siblings, 0 replies; 6+ messages in thread
From: Paolo Abeni @ 2019-03-22 15:01 UTC (permalink / raw)
To: netdev; +Cc: David S. Miller, Eric Dumazet, John Fastabend, Ivan Vecera
With commit c5ad119fb6c0 ("net: sched: pfifo_fast use skb_array")
pfifo_fast no longer benefit from the TCQ_F_CAN_BYPASS optimization.
Due to retpolines the cost of the enqueue()/dequeue() pair has become
relevant and we observe measurable regression for the uncontended
scenario when the packet-rate is below line rate.
After commit 46b1c18f9deb ("net: sched: put back q.qlen into a
single location") we can check for empty qdisc with a reasonably
fast operation even for nolock qdiscs.
This change extends TCQ_F_CAN_BYPASS support to nolock qdisc.
The new chunk of code mirrors closely the existing one for traditional
qdisc, leveraging a newly introduced helper to read atomically the
qdisc length.
Tested with pktgen in queue xmit mode, with pfifo_fast, a MQ
device, and MQ root qdisc:
threads vanilla patched
kpps kpps
1 2465 2889
2 4304 5188
4 7898 9589
Same as above, but with a single queue device:
threads vanilla patched
kpps kpps
1 2556 2827
2 2900 2900
4 5000 5000
8 4700 4700
No mesaurable changes in the contended scenarios, and more 10%
improvement in the uncontended ones.
v1 -> v2:
- rebased after flag name change
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Tested-by: Ivan Vecera <ivecera@redhat.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Ivan Vecera <ivecera@redhat.com>
---
net/core/dev.c | 9 +++++++++
1 file changed, 9 insertions(+)
diff --git a/net/core/dev.c b/net/core/dev.c
index 2b67f2aa59dd..e6cf39a9d3c5 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -3468,6 +3468,15 @@ static inline int __dev_xmit_skb(struct sk_buff *skb, struct Qdisc *q,
if (unlikely(test_bit(__QDISC_STATE_DEACTIVATED, &q->state))) {
__qdisc_drop(skb, &to_free);
rc = NET_XMIT_DROP;
+ } else if ((q->flags & TCQ_F_CAN_BYPASS) && q->empty &&
+ qdisc_run_begin(q)) {
+ qdisc_bstats_cpu_update(q, skb);
+
+ if (sch_direct_xmit(skb, q, dev, txq, NULL, true))
+ __qdisc_run(q);
+
+ qdisc_run_end(q);
+ rc = NET_XMIT_SUCCESS;
} else {
rc = q->enqueue(skb, q, &to_free) & NET_XMIT_MASK;
qdisc_run(q);
--
2.20.1
^ permalink raw reply related [flat|nested] 6+ messages in thread
* Re: [PATCH net-next v3 1/2] net: sched: add empty status flag for NOLOCK qdisc
2019-03-22 15:01 ` [PATCH net-next v3 1/2] net: sched: add empty status flag for NOLOCK qdisc Paolo Abeni
@ 2019-03-22 15:18 ` Eric Dumazet
2019-03-22 15:24 ` Ivan Vecera
1 sibling, 0 replies; 6+ messages in thread
From: Eric Dumazet @ 2019-03-22 15:18 UTC (permalink / raw)
To: Paolo Abeni, netdev
Cc: David S. Miller, Eric Dumazet, John Fastabend, Ivan Vecera
On 03/22/2019 08:01 AM, Paolo Abeni wrote:
> The queue is marked not empty after acquiring the seqlock,
> and it's up to the NOLOCK qdisc clearing such flag on dequeue.
> Since the empty status lays on the same cache-line of the
> seqlock, it's always hot on cache during the updates.
>
> This makes the empty flag update a little bit loosy. Given
> the lack of synchronization between enqueue and dequeue, this
> is unavoidable.
>
Reviewed-by: Eric Dumazet <edumazet@google.com>
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [PATCH net-next v3 1/2] net: sched: add empty status flag for NOLOCK qdisc
2019-03-22 15:01 ` [PATCH net-next v3 1/2] net: sched: add empty status flag for NOLOCK qdisc Paolo Abeni
2019-03-22 15:18 ` Eric Dumazet
@ 2019-03-22 15:24 ` Ivan Vecera
1 sibling, 0 replies; 6+ messages in thread
From: Ivan Vecera @ 2019-03-22 15:24 UTC (permalink / raw)
To: Paolo Abeni, netdev; +Cc: David S. Miller, Eric Dumazet, John Fastabend
On 22. 03. 19 16:01, Paolo Abeni wrote:
> The queue is marked not empty after acquiring the seqlock,
> and it's up to the NOLOCK qdisc clearing such flag on dequeue.
> Since the empty status lays on the same cache-line of the
> seqlock, it's always hot on cache during the updates.
>
> This makes the empty flag update a little bit loosy. Given
> the lack of synchronization between enqueue and dequeue, this
> is unavoidable.
>
> v2 -> v3:
> - qdisc_is_empty() has a const argument (Eric)
>
> v1 -> v2:
> - use really an 'empty' flag instead of 'not_empty', as
> suggested by Eric
>
> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
> ---
> include/net/sch_generic.h | 11 +++++++++++
> net/sched/sch_generic.c | 3 +++
> 2 files changed, 14 insertions(+)
>
> diff --git a/include/net/sch_generic.h b/include/net/sch_generic.h
> index 31284c078d06..e227475e78ca 100644
> --- a/include/net/sch_generic.h
> +++ b/include/net/sch_generic.h
> @@ -113,6 +113,9 @@ struct Qdisc {
>
> spinlock_t busylock ____cacheline_aligned_in_smp;
> spinlock_t seqlock;
> +
> + /* for NOLOCK qdisc, true if there are no enqueued skbs */
> + bool empty;
> struct rcu_head rcu;
> };
>
> @@ -143,11 +146,19 @@ static inline bool qdisc_is_running(struct Qdisc *qdisc)
> return (raw_read_seqcount(&qdisc->running) & 1) ? true : false;
> }
>
> +static inline bool qdisc_is_empty(const struct Qdisc *qdisc)
> +{
> + if (qdisc->flags & TCQ_F_NOLOCK)
> + return qdisc->empty;
> + return !qdisc->q.qlen;
> +}
> +
> static inline bool qdisc_run_begin(struct Qdisc *qdisc)
> {
> if (qdisc->flags & TCQ_F_NOLOCK) {
> if (!spin_trylock(&qdisc->seqlock))
> return false;
> + qdisc->empty = false;
> } else if (qdisc_is_running(qdisc)) {
> return false;
> }
> diff --git a/net/sched/sch_generic.c b/net/sched/sch_generic.c
> index a117d9260558..81356ef38d1d 100644
> --- a/net/sched/sch_generic.c
> +++ b/net/sched/sch_generic.c
> @@ -671,6 +671,8 @@ static struct sk_buff *pfifo_fast_dequeue(struct Qdisc *qdisc)
> qdisc_qstats_cpu_backlog_dec(qdisc, skb);
> qdisc_bstats_cpu_update(qdisc, skb);
> qdisc_qstats_atomic_qlen_dec(qdisc);
> + } else {
> + qdisc->empty = true;
> }
>
> return skb;
> @@ -880,6 +882,7 @@ struct Qdisc *qdisc_alloc(struct netdev_queue *dev_queue,
> sch->enqueue = ops->enqueue;
> sch->dequeue = ops->dequeue;
> sch->dev_queue = dev_queue;
> + sch->empty = true;
> dev_hold(dev);
> refcount_set(&sch->refcnt, 1);
>
>
Reviewed-by: Ivan Vecera <ivecera@redhat.com>
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [PATCH net-next v3 0/2] net: dev: BYPASS for lockless qdisc
2019-03-22 15:01 [PATCH net-next v3 0/2] net: dev: BYPASS for lockless qdisc Paolo Abeni
2019-03-22 15:01 ` [PATCH net-next v3 1/2] net: sched: add empty status flag for NOLOCK qdisc Paolo Abeni
2019-03-22 15:01 ` [PATCH net-next v3 2/2] net: dev: introduce support for sch BYPASS for lockless qdisc Paolo Abeni
@ 2019-03-24 1:54 ` David Miller
2 siblings, 0 replies; 6+ messages in thread
From: David Miller @ 2019-03-24 1:54 UTC (permalink / raw)
To: pabeni; +Cc: netdev, edumazet, john.fastabend, ivecera
From: Paolo Abeni <pabeni@redhat.com>
Date: Fri, 22 Mar 2019 16:01:54 +0100
> This patch series is aimed at improving xmit performances of lockless qdisc
> in the uncontended scenario.
>
> After the lockless refactor pfifo_fast can't leverage the BYPASS optimization.
> Due to retpolines the overhead for the avoidables enqueue and dequeue operations
> has increased and we see measurable regressions.
>
> The first patch introduces the BYPASS code path for lockless qdisc, and the
> second one optimizes such path further. Overall this avoids up to 3 indirect
> calls per xmit packet. Detailed performance figures are reported in the 2nd
> patch.
>
> v2 -> v3:
> - qdisc_is_empty() has a const argument (Eric)
>
> v1 -> v2:
> - use really an 'empty' flag instead of 'not_empty', as
> suggested by Eric
Looks great, series applied, thanks Paolo.
^ permalink raw reply [flat|nested] 6+ messages in thread
end of thread, other threads:[~2019-03-24 1:54 UTC | newest]
Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-03-22 15:01 [PATCH net-next v3 0/2] net: dev: BYPASS for lockless qdisc Paolo Abeni
2019-03-22 15:01 ` [PATCH net-next v3 1/2] net: sched: add empty status flag for NOLOCK qdisc Paolo Abeni
2019-03-22 15:18 ` Eric Dumazet
2019-03-22 15:24 ` Ivan Vecera
2019-03-22 15:01 ` [PATCH net-next v3 2/2] net: dev: introduce support for sch BYPASS for lockless qdisc Paolo Abeni
2019-03-24 1:54 ` [PATCH net-next v3 0/2] net: dev: " David Miller
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.