* [PATCH 0/5] netem performance improvements
@ 2007-03-21 17:42 Stephen Hemminger
  2007-03-21 17:42 ` [PATCH 1/5] netem: report reorder percent correctly Stephen Hemminger
                   ` (5 more replies)
  0 siblings, 6 replies; 11+ messages in thread
From: Stephen Hemminger @ 2007-03-21 17:42 UTC (permalink / raw)
  To: David Miller; +Cc: netdev

The following patches for the 2.6.22 net tree increase the
performance of netem by about 2x.  With 2.6.20 it was getting about
100K (out of a possible 300K) packets per second; after these
patches it is at over 200K pps.

-- 


^ permalink raw reply	[flat|nested] 11+ messages in thread

* [PATCH 1/5] netem: report reorder percent correctly.
  2007-03-21 17:42 [PATCH 0/5] netem performance improvements Stephen Hemminger
@ 2007-03-21 17:42 ` Stephen Hemminger
  2007-03-21 17:42 ` [PATCH 2/5] netem: use better types for time values Stephen Hemminger
                   ` (4 subsequent siblings)
  5 siblings, 0 replies; 11+ messages in thread
From: Stephen Hemminger @ 2007-03-21 17:42 UTC (permalink / raw)
  To: David Miller; +Cc: netdev

[-- Attachment #1: netem-report-reorder --]
[-- Type: text/plain, Size: 792 bytes --]

If you set up netem to just delay packets, "tc qdisc ls" will report
the reordering as 100%. Well, it's a lie: reorder isn't used unless
gap is set, so just set the value to 0 so the output of the utility
is correct.

Signed-off-by: Stephen Hemminger <shemminger@linux-foundation.org>


---
 net/sched/sch_netem.c |    3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

--- net-2.6.22.orig/net/sched/sch_netem.c
+++ net-2.6.22/net/sched/sch_netem.c
@@ -428,7 +428,8 @@ static int netem_change(struct Qdisc *sc
 	/* for compatiablity with earlier versions.
 	 * if gap is set, need to assume 100% probablity
 	 */
-	q->reorder = ~0;
+	if (q->gap)
+		q->reorder = ~0;
 
 	/* Handle nested options after initial queue options.
 	 * Should have put all options in nested format but too late now.

-- 



* [PATCH 2/5] netem: use better types for time values
  2007-03-21 17:42 [PATCH 0/5] netem performance improvements Stephen Hemminger
  2007-03-21 17:42 ` [PATCH 1/5] netem: report reorder percent correctly Stephen Hemminger
@ 2007-03-21 17:42 ` Stephen Hemminger
  2007-03-21 17:42 ` [PATCH 3/5] netem: optimize tfifo Stephen Hemminger
                   ` (3 subsequent siblings)
  5 siblings, 0 replies; 11+ messages in thread
From: Stephen Hemminger @ 2007-03-21 17:42 UTC (permalink / raw)
  To: David Miller; +Cc: netdev

[-- Attachment #1: netem-typefix.patch --]
[-- Type: text/plain, Size: 1708 bytes --]

The random number generator always generates 32-bit values, and
the time values are limited by psched_tdiff_t, so use those types
instead of unsigned long.

Signed-off-by: Stephen Hemminger <shemminger@linux-foundation.org>

---
 net/sched/sch_netem.c |   23 +++++++++++++----------
 1 file changed, 13 insertions(+), 10 deletions(-)

--- net-2.6.22.orig/net/sched/sch_netem.c
+++ net-2.6.22/net/sched/sch_netem.c
@@ -56,19 +56,20 @@ struct netem_sched_data {
 	struct Qdisc	*qdisc;
 	struct qdisc_watchdog watchdog;
 
-	u32 latency;
+	psched_tdiff_t latency;
+	psched_tdiff_t jitter;
+
 	u32 loss;
 	u32 limit;
 	u32 counter;
 	u32 gap;
-	u32 jitter;
 	u32 duplicate;
 	u32 reorder;
 	u32 corrupt;
 
 	struct crndstate {
-		unsigned long last;
-		unsigned long rho;
+		u32 last;
+		u32 rho;
 	} delay_cor, loss_cor, dup_cor, reorder_cor, corrupt_cor;
 
 	struct disttable {
@@ -95,7 +96,7 @@ static void init_crandom(struct crndstat
  * Next number depends on last value.
  * rho is scaled to avoid floating point.
  */
-static unsigned long get_crandom(struct crndstate *state)
+static u32 get_crandom(struct crndstate *state)
 {
 	u64 value, rho;
 	unsigned long answer;
@@ -114,11 +115,13 @@ static unsigned long get_crandom(struct 
  * std deviation sigma.  Uses table lookup to approximate the desired
  * distribution, and a uniformly-distributed pseudo-random source.
  */
-static long tabledist(unsigned long mu, long sigma,
-		      struct crndstate *state, const struct disttable *dist)
-{
-	long t, x;
-	unsigned long rnd;
+static psched_tdiff_t tabledist(psched_tdiff_t mu, psched_tdiff_t sigma,
+				struct crndstate *state,
+				const struct disttable *dist)
+{
+	psched_tdiff_t x;
+	long t;
+	u32 rnd;
 
 	if (sigma == 0)
 		return mu;

-- 



* [PATCH 3/5] netem: optimize tfifo
  2007-03-21 17:42 [PATCH 0/5] netem performance improvements Stephen Hemminger
  2007-03-21 17:42 ` [PATCH 1/5] netem: report reorder percent correctly Stephen Hemminger
  2007-03-21 17:42 ` [PATCH 2/5] netem: use better types for time values Stephen Hemminger
@ 2007-03-21 17:42 ` Stephen Hemminger
  2007-03-21 17:42 ` [PATCH 4/5] netem: avoid excessive requeues Stephen Hemminger
                   ` (2 subsequent siblings)
  5 siblings, 0 replies; 11+ messages in thread
From: Stephen Hemminger @ 2007-03-21 17:42 UTC (permalink / raw)
  To: David Miller; +Cc: netdev

[-- Attachment #1: netem-tfifo-opt.patch --]
[-- Type: text/plain, Size: 1699 bytes --]

In most cases, the next packet will be sent after the
last one, so optimize for that case.

Signed-off-by: Stephen Hemminger <shemminger@linux-foundation.org>

---
 net/sched/sch_netem.c |   15 +++++++++++----
 1 file changed, 11 insertions(+), 4 deletions(-)

--- net-2.6.22.orig/net/sched/sch_netem.c
+++ net-2.6.22/net/sched/sch_netem.c
@@ -478,22 +478,28 @@ static int netem_change(struct Qdisc *sc
  */
 struct fifo_sched_data {
 	u32 limit;
+	psched_time_t oldest;
 };
 
 static int tfifo_enqueue(struct sk_buff *nskb, struct Qdisc *sch)
 {
 	struct fifo_sched_data *q = qdisc_priv(sch);
 	struct sk_buff_head *list = &sch->q;
-	const struct netem_skb_cb *ncb
-		= (const struct netem_skb_cb *)nskb->cb;
+	psched_time_t tnext = ((struct netem_skb_cb *)nskb->cb)->time_to_send;
 	struct sk_buff *skb;
 
 	if (likely(skb_queue_len(list) < q->limit)) {
+		/* Optimize for add at tail */
+		if (likely(skb_queue_empty(list) || !PSCHED_TLESS(tnext, q->oldest))) {
+			q->oldest = tnext;
+			return qdisc_enqueue_tail(nskb, sch);
+		}
+
 		skb_queue_reverse_walk(list, skb) {
 			const struct netem_skb_cb *cb
 				= (const struct netem_skb_cb *)skb->cb;
 
-			if (!PSCHED_TLESS(ncb->time_to_send, cb->time_to_send))
+			if (!PSCHED_TLESS(tnext, cb->time_to_send))
 				break;
 		}
 
@@ -506,7 +512,7 @@ static int tfifo_enqueue(struct sk_buff 
 		return NET_XMIT_SUCCESS;
 	}
 
-	return qdisc_drop(nskb, sch);
+	return qdisc_reshape_fail(nskb, sch);
 }
 
 static int tfifo_init(struct Qdisc *sch, struct rtattr *opt)
@@ -522,6 +528,7 @@ static int tfifo_init(struct Qdisc *sch,
 	} else
 		q->limit = max_t(u32, sch->dev->tx_queue_len, 1);
 
+	PSCHED_SET_PASTPERFECT(q->oldest);
 	return 0;
 }
 

-- 



* [PATCH 4/5] netem: avoid excessive requeues
  2007-03-21 17:42 [PATCH 0/5] netem performance improvements Stephen Hemminger
                   ` (2 preceding siblings ...)
  2007-03-21 17:42 ` [PATCH 3/5] netem: optimize tfifo Stephen Hemminger
@ 2007-03-21 17:42 ` Stephen Hemminger
  2007-03-22 20:40   ` Patrick McHardy
  2007-03-21 17:42 ` [PATCH 5/5] qdisc: avoid transmit softirq on watchdog wakeup Stephen Hemminger
  2007-03-22 19:19 ` [PATCH 0/5] netem performance improvements David Miller
  5 siblings, 1 reply; 11+ messages in thread
From: Stephen Hemminger @ 2007-03-21 17:42 UTC (permalink / raw)
  To: David Miller; +Cc: netdev

[-- Attachment #1: netem-throttle-opt.patch --]
[-- Type: text/plain, Size: 2339 bytes --]

The netem code would call getnstimeofday() and dequeue/requeue
a packet on every pass, even while it was still waiting for the
timer. Avoid this overhead by checking the throttled flag.

Signed-off-by: Stephen Hemminger <shemminger@linux-foundation.org>

---
 net/sched/sch_api.c   |    3 +++
 net/sched/sch_netem.c |   21 ++++++++++++---------
 2 files changed, 15 insertions(+), 9 deletions(-)

--- net-2.6.22.orig/net/sched/sch_api.c
+++ net-2.6.22/net/sched/sch_api.c
@@ -298,6 +298,7 @@ static enum hrtimer_restart qdisc_watchd
 						 timer);
 
 	wd->qdisc->flags &= ~TCQ_F_THROTTLED;
+	smp_wmb();
 	netif_schedule(wd->qdisc->dev);
 	return HRTIMER_NORESTART;
 }
@@ -315,6 +316,7 @@ void qdisc_watchdog_schedule(struct qdis
 	ktime_t time;
 
 	wd->qdisc->flags |= TCQ_F_THROTTLED;
+	smp_wmb();
 	time = ktime_set(0, 0);
 	time = ktime_add_ns(time, PSCHED_US2NS(expires));
 	hrtimer_start(&wd->timer, time, HRTIMER_MODE_ABS);
@@ -325,6 +327,7 @@ void qdisc_watchdog_cancel(struct qdisc_
 {
 	hrtimer_cancel(&wd->timer);
 	wd->qdisc->flags &= ~TCQ_F_THROTTLED;
+	smp_wmb();
 }
 EXPORT_SYMBOL(qdisc_watchdog_cancel);
 
--- net-2.6.22.orig/net/sched/sch_netem.c
+++ net-2.6.22/net/sched/sch_netem.c
@@ -272,6 +272,10 @@ static struct sk_buff *netem_dequeue(str
 	struct netem_sched_data *q = qdisc_priv(sch);
 	struct sk_buff *skb;
 
+	smp_mb();
+	if (sch->flags & TCQ_F_THROTTLED)
+		return NULL;
+
 	skb = q->qdisc->dequeue(q->qdisc);
 	if (skb) {
 		const struct netem_skb_cb *cb
@@ -284,18 +288,17 @@ static struct sk_buff *netem_dequeue(str
 		if (PSCHED_TLESS(cb->time_to_send, now)) {
 			pr_debug("netem_dequeue: return skb=%p\n", skb);
 			sch->q.qlen--;
-			sch->flags &= ~TCQ_F_THROTTLED;
 			return skb;
-		} else {
-			qdisc_watchdog_schedule(&q->watchdog, cb->time_to_send);
+		}
 
-			if (q->qdisc->ops->requeue(skb, q->qdisc) != NET_XMIT_SUCCESS) {
-				qdisc_tree_decrease_qlen(q->qdisc, 1);
-				sch->qstats.drops++;
-				printk(KERN_ERR "netem: queue discpline %s could not requeue\n",
-				       q->qdisc->ops->id);
-			}
+		if (unlikely(q->qdisc->ops->requeue(skb, q->qdisc) != NET_XMIT_SUCCESS)) {
+			qdisc_tree_decrease_qlen(q->qdisc, 1);
+			sch->qstats.drops++;
+			printk(KERN_ERR "netem: %s could not requeue\n",
+			       q->qdisc->ops->id);
 		}
+
+		qdisc_watchdog_schedule(&q->watchdog, cb->time_to_send);
 	}
 
 	return NULL;

-- 



* [PATCH 5/5] qdisc: avoid transmit softirq on watchdog wakeup
  2007-03-21 17:42 [PATCH 0/5] netem performance improvements Stephen Hemminger
                   ` (3 preceding siblings ...)
  2007-03-21 17:42 ` [PATCH 4/5] netem: avoid excessive requeues Stephen Hemminger
@ 2007-03-21 17:42 ` Stephen Hemminger
  2007-03-22 19:19 ` [PATCH 0/5] netem performance improvements David Miller
  5 siblings, 0 replies; 11+ messages in thread
From: Stephen Hemminger @ 2007-03-21 17:42 UTC (permalink / raw)
  To: David Miller; +Cc: netdev

[-- Attachment #1: qdisc-avoid-softirq.patch --]
[-- Type: text/plain, Size: 886 bytes --]

If possible, avoid having to do a transmit softirq when a qdisc
watchdog decides to re-enable.  The watchdog routine runs off
a timer, so it is already in the same effective context as
the softirq.

Signed-off-by: Stephen Hemminger <shemminger@linux-foundation.org>

---
 net/sched/sch_api.c |    8 +++++++-
 1 file changed, 7 insertions(+), 1 deletion(-)

--- net-2.6.22.orig/net/sched/sch_api.c
+++ net-2.6.22/net/sched/sch_api.c
@@ -296,10 +296,16 @@ static enum hrtimer_restart qdisc_watchd
 {
 	struct qdisc_watchdog *wd = container_of(timer, struct qdisc_watchdog,
 						 timer);
+	struct net_device *dev = wd->qdisc->dev;
 
 	wd->qdisc->flags &= ~TCQ_F_THROTTLED;
 	smp_wmb();
-	netif_schedule(wd->qdisc->dev);
+	if (spin_trylock(&dev->queue_lock)) {
+		qdisc_run(dev);
+		spin_unlock(&dev->queue_lock);
+	} else
+		netif_schedule(dev);
+
 	return HRTIMER_NORESTART;
 }
 

-- 



* Re: [PATCH 0/5] netem performance improvements
  2007-03-21 17:42 [PATCH 0/5] netem performance improvements Stephen Hemminger
                   ` (4 preceding siblings ...)
  2007-03-21 17:42 ` [PATCH 5/5] qdisc: avoid transmit softirq on watchdog wakeup Stephen Hemminger
@ 2007-03-22 19:19 ` David Miller
  5 siblings, 0 replies; 11+ messages in thread
From: David Miller @ 2007-03-22 19:19 UTC (permalink / raw)
  To: shemminger; +Cc: netdev

From: Stephen Hemminger <shemminger@linux-foundation.org>
Date: Wed, 21 Mar 2007 10:42:31 -0700

> The following patches for the 2.6.22 net tree increase the
> performance of netem by about 2x.  With 2.6.20 it was getting about
> 100K (out of a possible 300K) packets per second; after these
> patches it is at over 200K pps.

All patches applied to net-2.6.22, thanks Stephen.


* Re: [PATCH 4/5] netem: avoid excessive requeues
  2007-03-21 17:42 ` [PATCH 4/5] netem: avoid excessive requeues Stephen Hemminger
@ 2007-03-22 20:40   ` Patrick McHardy
  2007-03-22 21:08     ` David Miller
  0 siblings, 1 reply; 11+ messages in thread
From: Patrick McHardy @ 2007-03-22 20:40 UTC (permalink / raw)
  To: Stephen Hemminger; +Cc: David Miller, netdev

Stephen Hemminger wrote:
> @@ -315,6 +316,7 @@ void qdisc_watchdog_schedule(struct qdis
>  	ktime_t time;
>  
>  	wd->qdisc->flags |= TCQ_F_THROTTLED;
> +	smp_wmb();
>  	time = ktime_set(0, 0);
>  	time = ktime_add_ns(time, PSCHED_US2NS(expires));
>  	hrtimer_start(&wd->timer, time, HRTIMER_MODE_ABS);
> @@ -325,6 +327,7 @@ void qdisc_watchdog_cancel(struct qdisc_
>  {
>  	hrtimer_cancel(&wd->timer);
>  	wd->qdisc->flags &= ~TCQ_F_THROTTLED;
> +	smp_wmb();
>  }
>  EXPORT_SYMBOL(qdisc_watchdog_cancel);


These two look unnecessary, we're holding the queue lock.

> --- net-2.6.22.orig/net/sched/sch_netem.c
> +++ net-2.6.22/net/sched/sch_netem.c
> @@ -272,6 +272,10 @@ static struct sk_buff *netem_dequeue(str
>  	struct netem_sched_data *q = qdisc_priv(sch);
>  	struct sk_buff *skb;
>  
> +	smp_mb();
> +	if (sch->flags & TCQ_F_THROTTLED)
> +		return NULL;
> +


Perhaps we should put this in qdisc_restart, other qdiscs have the
same problem.


* Re: [PATCH 4/5] netem: avoid excessive requeues
  2007-03-22 20:40   ` Patrick McHardy
@ 2007-03-22 21:08     ` David Miller
  2007-03-23 11:06       ` Patrick McHardy
  0 siblings, 1 reply; 11+ messages in thread
From: David Miller @ 2007-03-22 21:08 UTC (permalink / raw)
  To: kaber; +Cc: shemminger, netdev

From: Patrick McHardy <kaber@trash.net>
Date: Thu, 22 Mar 2007 21:40:43 +0100

> > --- net-2.6.22.orig/net/sched/sch_netem.c
> > +++ net-2.6.22/net/sched/sch_netem.c
> > @@ -272,6 +272,10 @@ static struct sk_buff *netem_dequeue(str
> >  	struct netem_sched_data *q = qdisc_priv(sch);
> >  	struct sk_buff *skb;
> >  
> > +	smp_mb();
> > +	if (sch->flags & TCQ_F_THROTTLED)
> > +		return NULL;
> > +
> 
> 
> Perhaps we should put this in qdisc_restart, other qdiscs have the
> same problem.

Agreed, patches welcome :)


* Re: [PATCH 4/5] netem: avoid excessive requeues
  2007-03-22 21:08     ` David Miller
@ 2007-03-23 11:06       ` Patrick McHardy
  2007-03-23 13:26         ` Patrick McHardy
  0 siblings, 1 reply; 11+ messages in thread
From: Patrick McHardy @ 2007-03-23 11:06 UTC (permalink / raw)
  To: David Miller; +Cc: shemminger, netdev

David Miller wrote:
> From: Patrick McHardy <kaber@trash.net>
> Date: Thu, 22 Mar 2007 21:40:43 +0100
> 
>>Perhaps we should put this in qdisc_restart, other qdiscs have the
>>same problem.
> 
> 
> Agreed, patches welcome :)


I've tried this, but for some reason it makes TBF stay about
5% under the configured rate, probably because of late timers.
The strange thing is that the 5% happens consistently even with
very low rates.



* Re: [PATCH 4/5] netem: avoid excessive requeues
  2007-03-23 11:06       ` Patrick McHardy
@ 2007-03-23 13:26         ` Patrick McHardy
  0 siblings, 0 replies; 11+ messages in thread
From: Patrick McHardy @ 2007-03-23 13:26 UTC (permalink / raw)
  To: David Miller; +Cc: shemminger, netdev

Patrick McHardy wrote:
> David Miller wrote:
> 
>>>Perhaps we should put this in qdisc_restart, other qdiscs have the
>>>same problem.
>>
>>
>>Agreed, patches welcome :)
> 
> 
> I've tried this, but for some reason it makes TBF stay about
> 5% under the configured rate, probably because of late timers.
> The strange thing is that the 5% happens consistently even with
> very low rates.


Turns out it was a mistake during testing: I measured on the
ethernet device that is also used by my PPPoE connection, which
had some traffic too. iptraf only measures IP packets, so it
didn't show the bandwidth used by PPPoE.

Patch coming up :)

