netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH net-next 0/2] net: dev: BYPASS for lockless qdisc
@ 2019-03-21 10:14 Paolo Abeni
  2019-03-21 10:14 ` [PATCH net-next 1/2] net: sched: add empty status flag for NOLOCK qdisc Paolo Abeni
  2019-03-21 10:14 ` [PATCH net-next 2/2] net: dev: introduce support for sch BYPASS for lockless qdisc Paolo Abeni
  0 siblings, 2 replies; 5+ messages in thread
From: Paolo Abeni @ 2019-03-21 10:14 UTC (permalink / raw)
  To: netdev; +Cc: David S. Miller, Eric Dumazet, John Fastabend, Ivan Vecera

This patch series is aimed at improving xmit performances of lockless qdisc
in the uncontended scenario.

After the lockless refactor pfifo_fast can't leverage the BYPASS optimization.
Due to retpolines the overhead for the avoidables enqueue and dequeue operations
has increased and we see measurable regressions.

The first patch introduces and maintains an additional flag inside the qdisc
struct, tracking the queue status (empty/not empty) and the second patch uses
such info to implement the BYPASS code path for lockless qdisc.
Overall this avoids up to 2 indirect calls per xmit packet when the packet
rate is below line rate.
Detailed performance figures are reported in the 2nd patch.

Paolo Abeni (2):
  net: sched: add empty status flag for NOLOCK qdisc
  net: dev: introduce support for sch BYPASS for lockless qdisc

 include/net/sch_generic.h | 17 +++++++++++++++++
 net/core/dev.c            |  9 +++++++++
 net/sched/sch_generic.c   |  2 ++
 3 files changed, 28 insertions(+)

-- 
2.20.1


^ permalink raw reply	[flat|nested] 5+ messages in thread

* [PATCH net-next 1/2] net: sched: add empty status flag for NOLOCK qdisc
  2019-03-21 10:14 [PATCH net-next 0/2] net: dev: BYPASS for lockless qdisc Paolo Abeni
@ 2019-03-21 10:14 ` Paolo Abeni
  2019-03-21 14:42   ` Eric Dumazet
  2019-03-21 10:14 ` [PATCH net-next 2/2] net: dev: introduce support for sch BYPASS for lockless qdisc Paolo Abeni
  1 sibling, 1 reply; 5+ messages in thread
From: Paolo Abeni @ 2019-03-21 10:14 UTC (permalink / raw)
  To: netdev; +Cc: David S. Miller, Eric Dumazet, John Fastabend, Ivan Vecera

The queue is marked not empty after acquiring the seqlock,
and it's up to the NOLOCK qdisc clearing such flag on dequeue.
Since the empty status lays on the same cache-line of the
seqlock, it's always hot on cache during the updates.

This makes the empty flag update a little bit loosy. Given
the lack of synchronization between enqueue and dequeue, this
is unavoidable.

Signed-off-by: Paolo Abeni <pabeni@redhat.com>
---
 include/net/sch_generic.h | 17 +++++++++++++++++
 net/sched/sch_generic.c   |  2 ++
 2 files changed, 19 insertions(+)

diff --git a/include/net/sch_generic.h b/include/net/sch_generic.h
index 31284c078d06..1dc0aeec5d39 100644
--- a/include/net/sch_generic.h
+++ b/include/net/sch_generic.h
@@ -113,6 +113,9 @@ struct Qdisc {
 
 	spinlock_t		busylock ____cacheline_aligned_in_smp;
 	spinlock_t		seqlock;
+
+	/* for NOLOCK qdisc, true if there are enqueued skbs */
+	bool			not_empty;
 	struct rcu_head		rcu;
 };
 
@@ -143,11 +146,25 @@ static inline bool qdisc_is_running(struct Qdisc *qdisc)
 	return (raw_read_seqcount(&qdisc->running) & 1) ? true : false;
 }
 
+static inline bool qdisc_is_empty(struct Qdisc *qdisc)
+{
+	if (qdisc->flags & TCQ_F_NOLOCK)
+		return !qdisc->not_empty;
+	return !qdisc->q.qlen;
+}
+
+/* for NOLOCK qdisc, caller must held seqlock */
+static inline void qdisc_set_empty(struct Qdisc *qdisc)
+{
+	qdisc->not_empty = false;
+}
+
 static inline bool qdisc_run_begin(struct Qdisc *qdisc)
 {
 	if (qdisc->flags & TCQ_F_NOLOCK) {
 		if (!spin_trylock(&qdisc->seqlock))
 			return false;
+		qdisc->not_empty = true;
 	} else if (qdisc_is_running(qdisc)) {
 		return false;
 	}
diff --git a/net/sched/sch_generic.c b/net/sched/sch_generic.c
index a117d9260558..3b03c6b9be1e 100644
--- a/net/sched/sch_generic.c
+++ b/net/sched/sch_generic.c
@@ -671,6 +671,8 @@ static struct sk_buff *pfifo_fast_dequeue(struct Qdisc *qdisc)
 		qdisc_qstats_cpu_backlog_dec(qdisc, skb);
 		qdisc_bstats_cpu_update(qdisc, skb);
 		qdisc_qstats_atomic_qlen_dec(qdisc);
+	} else {
+		qdisc_set_empty(qdisc);
 	}
 
 	return skb;
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 5+ messages in thread

* [PATCH net-next 2/2] net: dev: introduce support for sch BYPASS for lockless qdisc
  2019-03-21 10:14 [PATCH net-next 0/2] net: dev: BYPASS for lockless qdisc Paolo Abeni
  2019-03-21 10:14 ` [PATCH net-next 1/2] net: sched: add empty status flag for NOLOCK qdisc Paolo Abeni
@ 2019-03-21 10:14 ` Paolo Abeni
  1 sibling, 0 replies; 5+ messages in thread
From: Paolo Abeni @ 2019-03-21 10:14 UTC (permalink / raw)
  To: netdev; +Cc: David S. Miller, Eric Dumazet, John Fastabend, Ivan Vecera

With commit c5ad119fb6c0 ("net: sched: pfifo_fast use skb_array")
pfifo_fast no longer benefit from the TCQ_F_CAN_BYPASS optimization.
Due to retpolines the cost of the enqueue()/dequeue() pair has become
relevant and we observe measurable regression for the uncontended
scenario when the packet-rate is below line rate.

After commit 46b1c18f9deb ("net: sched: put back q.qlen into a
single location") we can check for empty qdisc with a reasonably
fast operation even for nolock qdiscs.

This change extends TCQ_F_CAN_BYPASS support to nolock qdisc.
The new chunk of code mirrors closely the existing one for traditional
qdisc, leveraging a newly introduced helper to read atomically the
qdisc length.

Tested with pktgen in queue xmit mode, with pfifo_fast, a MQ
device, and MQ root qdisc:

threads         vanilla         patched
                kpps            kpps
1               2465            2889
2               4304            5188
4               7898            9589

Same as above, but with a single queue device:

threads         vanilla         patched
                kpps            kpps
1               2556            2827
2               2900            2900
4               5000            5000
8               4700            4700

No mesaurable changes in the contended scenarios, and more 10%
improvement in the uncontended ones.

Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Tested-by: Ivan Vecera <ivecera@redhat.com>
---
 net/core/dev.c | 9 +++++++++
 1 file changed, 9 insertions(+)

diff --git a/net/core/dev.c b/net/core/dev.c
index 2b67f2aa59dd..9a6d3617f03a 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -3468,6 +3468,15 @@ static inline int __dev_xmit_skb(struct sk_buff *skb, struct Qdisc *q,
 		if (unlikely(test_bit(__QDISC_STATE_DEACTIVATED, &q->state))) {
 			__qdisc_drop(skb, &to_free);
 			rc = NET_XMIT_DROP;
+		} else if ((q->flags & TCQ_F_CAN_BYPASS) && !q->not_empty &&
+			   qdisc_run_begin(q)) {
+			qdisc_bstats_cpu_update(q, skb);
+
+			if (sch_direct_xmit(skb, q, dev, txq, NULL, true))
+				__qdisc_run(q);
+
+			qdisc_run_end(q);
+			rc = NET_XMIT_SUCCESS;
 		} else {
 			rc = q->enqueue(skb, q, &to_free) & NET_XMIT_MASK;
 			qdisc_run(q);
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 5+ messages in thread

* Re: [PATCH net-next 1/2] net: sched: add empty status flag for NOLOCK qdisc
  2019-03-21 10:14 ` [PATCH net-next 1/2] net: sched: add empty status flag for NOLOCK qdisc Paolo Abeni
@ 2019-03-21 14:42   ` Eric Dumazet
  2019-03-21 15:10     ` Paolo Abeni
  0 siblings, 1 reply; 5+ messages in thread
From: Eric Dumazet @ 2019-03-21 14:42 UTC (permalink / raw)
  To: Paolo Abeni, netdev
  Cc: David S. Miller, Eric Dumazet, John Fastabend, Ivan Vecera



On 03/21/2019 03:14 AM, Paolo Abeni wrote:
> The queue is marked not empty after acquiring the seqlock,
> and it's up to the NOLOCK qdisc clearing such flag on dequeue.
> Since the empty status lays on the same cache-line of the
> seqlock, it's always hot on cache during the updates.
> 
> This makes the empty flag update a little bit loosy. Given
> the lack of synchronization between enqueue and dequeue, this
> is unavoidable.
> 

Seems nice !

> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
> ---
>  include/net/sch_generic.h | 17 +++++++++++++++++
>  net/sched/sch_generic.c   |  2 ++
>  2 files changed, 19 insertions(+)
> 
> diff --git a/include/net/sch_generic.h b/include/net/sch_generic.h
> index 31284c078d06..1dc0aeec5d39 100644
> --- a/include/net/sch_generic.h
> +++ b/include/net/sch_generic.h
> @@ -113,6 +113,9 @@ struct Qdisc {
>  
>  	spinlock_t		busylock ____cacheline_aligned_in_smp;
>  	spinlock_t		seqlock;
> +
> +	/* for NOLOCK qdisc, true if there are enqueued skbs */
> +	bool			not_empty;
>  	struct rcu_head		rcu;
>  };
>  
> @@ -143,11 +146,25 @@ static inline bool qdisc_is_running(struct Qdisc *qdisc)
>  	return (raw_read_seqcount(&qdisc->running) & 1) ? true : false;
>  }
>  
> +static inline bool qdisc_is_empty(struct Qdisc *qdisc)
> +{
> +	if (qdisc->flags & TCQ_F_NOLOCK)
> +		return !qdisc->not_empty;

double negations are not very readable IMO ...

Why not using qdisc->empty instead ?


> +	return !qdisc->q.qlen;
> +}
> +
> +/* for NOLOCK qdisc, caller must held seqlock */
> +static inline void qdisc_set_empty(struct Qdisc *qdisc)
> +{
> +	qdisc->not_empty = false;
> +}
> +
I guess the helper is not really needed ?


>  static inline bool qdisc_run_begin(struct Qdisc *qdisc)
>  {
>  	if (qdisc->flags & TCQ_F_NOLOCK) {
>  		if (!spin_trylock(&qdisc->seqlock))
>  			return false;
> +		qdisc->not_empty = true;

Not clear why you have a qdisc_set_empty() helper only for one case.



>  	} else if (qdisc_is_running(qdisc)) {
>  		return false;
>  	}
> diff --git a/net/sched/sch_generic.c b/net/sched/sch_generic.c
> index a117d9260558..3b03c6b9be1e 100644
> --- a/net/sched/sch_generic.c
> +++ b/net/sched/sch_generic.c
> @@ -671,6 +671,8 @@ static struct sk_buff *pfifo_fast_dequeue(struct Qdisc *qdisc)
>  		qdisc_qstats_cpu_backlog_dec(qdisc, skb);
>  		qdisc_bstats_cpu_update(qdisc, skb);
>  		qdisc_qstats_atomic_qlen_dec(qdisc);
> +	} else {
> +		qdisc_set_empty(qdisc);
>  	}
>  
>  	return skb;
> 

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH net-next 1/2] net: sched: add empty status flag for NOLOCK qdisc
  2019-03-21 14:42   ` Eric Dumazet
@ 2019-03-21 15:10     ` Paolo Abeni
  0 siblings, 0 replies; 5+ messages in thread
From: Paolo Abeni @ 2019-03-21 15:10 UTC (permalink / raw)
  To: Eric Dumazet, netdev
  Cc: David S. Miller, Eric Dumazet, John Fastabend, Ivan Vecera

Hi,

Thank you for the review.

On Thu, 2019-03-21 at 07:42 -0700, Eric Dumazet wrote:
> 
> On 03/21/2019 03:14 AM, Paolo Abeni wrote:
> > The queue is marked not empty after acquiring the seqlock,
> > and it's up to the NOLOCK qdisc clearing such flag on dequeue.
> > Since the empty status lays on the same cache-line of the
> > seqlock, it's always hot on cache during the updates.
> > 
> > This makes the empty flag update a little bit loosy. Given
> > the lack of synchronization between enqueue and dequeue, this
> > is unavoidable.
> > 
> 
> Seems nice !
> 
> > Signed-off-by: Paolo Abeni <pabeni@redhat.com>
> > ---
> >  include/net/sch_generic.h | 17 +++++++++++++++++
> >  net/sched/sch_generic.c   |  2 ++
> >  2 files changed, 19 insertions(+)
> > 
> > diff --git a/include/net/sch_generic.h b/include/net/sch_generic.h
> > index 31284c078d06..1dc0aeec5d39 100644
> > --- a/include/net/sch_generic.h
> > +++ b/include/net/sch_generic.h
> > @@ -113,6 +113,9 @@ struct Qdisc {
> >  
> >  	spinlock_t		busylock ____cacheline_aligned_in_smp;
> >  	spinlock_t		seqlock;
> > +
> > +	/* for NOLOCK qdisc, true if there are enqueued skbs */
> > +	bool			not_empty;
> >  	struct rcu_head		rcu;
> >  };
> >  
> > @@ -143,11 +146,25 @@ static inline bool qdisc_is_running(struct Qdisc *qdisc)
> >  	return (raw_read_seqcount(&qdisc->running) & 1) ? true : false;
> >  }
> >  
> > +static inline bool qdisc_is_empty(struct Qdisc *qdisc)
> > +{
> > +	if (qdisc->flags & TCQ_F_NOLOCK)
> > +		return !qdisc->not_empty;
> 
> double negations are not very readable IMO ...
> 
> Why not using qdisc->empty instead ?

So that the initial kzalloc() initialize this field corrently and no
explicit initialization is needed. Not a winning argument, I see. Will
use 'empty' in next version.

> > +	return !qdisc->q.qlen;
> > +}
> > +
> > +/* for NOLOCK qdisc, caller must held seqlock */
> > +static inline void qdisc_set_empty(struct Qdisc *qdisc)
> > +{
> > +	qdisc->not_empty = false;
> > +}
> > +
> I guess the helper is not really needed ?

I added this as an helper, to prepare the ground for other NOLOCK
qdiscs. Since none of them is on the horizon, I'll drop this in the
next version.

I'd like to leverage this patch in a later series to avoid the
qdisc_qstats_atomic_qlen_inc()/qdisc_qstats_atomic_qlen_dec() for
NOLOCK qdisc, as you suggested in
46b1c18f9deb326a7e18348e668e4c7ab7c7458b

In the control path we could resort to accumulate percpu values, and in
the data path we always check for empty/not empty status, AFAICS with
only one exception, in net/sched/sch_cbq.c:

static inline void
cbq_update_toplevel(struct cbq_sched_data *q, struct cbq_class *cl,
                    struct cbq_class *borrowed)
{
        if (cl && q->toplevel >= borrowed->level) {
                 if (cl->q->q.qlen > 1) {


This looks both problematic and strange to me. This test predates git
history, and the related comment is not very helpful to me. Could it be
a typo slipped in from the beginning ?!? (instead of 'cl->q->q.qlen >
0')

Thanks!

Paolo


^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2019-03-21 15:10 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-03-21 10:14 [PATCH net-next 0/2] net: dev: BYPASS for lockless qdisc Paolo Abeni
2019-03-21 10:14 ` [PATCH net-next 1/2] net: sched: add empty status flag for NOLOCK qdisc Paolo Abeni
2019-03-21 14:42   ` Eric Dumazet
2019-03-21 15:10     ` Paolo Abeni
2019-03-21 10:14 ` [PATCH net-next 2/2] net: dev: introduce support for sch BYPASS for lockless qdisc Paolo Abeni

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).