* [PATCH v3 net-next 1/2] netem: rate extension
@ 2011-11-30 22:20 Hagen Paul Pfeifer
  2011-11-30 22:20 ` [PATCH v3 net-next 2/2] netem: add cell concept to simulate special MAC behavior Hagen Paul Pfeifer
  2011-12-01  4:19 ` [PATCH v3 net-next 1/2] netem: rate extension David Miller
  0 siblings, 2 replies; 16+ messages in thread
From: Hagen Paul Pfeifer @ 2011-11-30 22:20 UTC (permalink / raw)
  To: netdev; +Cc: eric.dumazet, Stephen Hemminger, Hagen Paul Pfeifer

Currently netem is not able to emulate channel bandwidth. Only a static
delay (and optional random jitter) can be configured.

To emulate a channel rate the token bucket filter (sch_tbf) can be used,
but TBF has some major emulation flaws. The buffer (token bucket depth/rate)
cannot be 0. Also, the idea behind TBF is that the credit (tokens in the
bucket) fills up if no packet is transmitted, so that there is always a
"positive" credit for new packets. In real life this behavior contradicts
the laws of nature, where nothing can travel faster than the speed of light.
E.g. on an emulated 1000 byte/s link a small IPv4/TCP SYN packet of ~50
bytes requires ~0.05 seconds - not 0 seconds.
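
For illustration, the delay such a rate implies is simply
len * NSEC_PER_SEC / rate nanoseconds; a stand-alone user-space sketch of
that arithmetic (an illustration only, not the kernel code; the helper
name is made up):

	#include <stdio.h>
	#include <stdint.h>

	#define NSEC_PER_SEC 1000000000ULL

	/* serialization delay in nanoseconds for a packet of 'len' bytes
	 * on a link of 'rate' bytes per second */
	static uint64_t len_to_delay_ns(unsigned int len, uint32_t rate)
	{
		return (uint64_t)len * NSEC_PER_SEC / rate;
	}

	int main(void)
	{
		/* the SYN example from above: ~50 bytes at 1000 byte/s */
		printf("%llu ns\n",
		       (unsigned long long)len_to_delay_ns(50, 1000));
		/* prints 50000000 ns, i.e. ~0.05 seconds */
		return 0;
	}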

Netem is an excellent place to implement a rate limiting feature: static
delay is already implemented, tfifo already has time information and the
user can skip TBF configuration completely.

This patch implements a rate feature which can be configured via tc, e.g.:

	tc qdisc add dev eth0 root netem rate 10kbit

To emulate a link of 5000 byte/s and add an additional static delay of 10ms:

	tc qdisc add dev eth0 root netem delay 10ms rate 5KBps

Note: similar to TBF, the rate extension is bound to the kernel timing
system. Depending on the architecture's timer granularity, higher rates
(e.g. 10 Mbit/s and higher) tend to produce transmission bursts. Also note
that further queues live in the network adapter; see ethtool(8).

Signed-off-by: Hagen Paul Pfeifer <hagen@jauu.net>
Acked-by: Eric Dumazet <eric.dumazet@gmail.com>
---
 include/linux/pkt_sched.h |    5 +++++
 net/sched/sch_netem.c     |   40 ++++++++++++++++++++++++++++++++++++++++
 2 files changed, 45 insertions(+), 0 deletions(-)

diff --git a/include/linux/pkt_sched.h b/include/linux/pkt_sched.h
index c533670..26c37ca 100644
--- a/include/linux/pkt_sched.h
+++ b/include/linux/pkt_sched.h
@@ -465,6 +465,7 @@ enum {
 	TCA_NETEM_REORDER,
 	TCA_NETEM_CORRUPT,
 	TCA_NETEM_LOSS,
+	TCA_NETEM_RATE,
 	__TCA_NETEM_MAX,
 };
 
@@ -495,6 +496,10 @@ struct tc_netem_corrupt {
 	__u32	correlation;
 };
 
+struct tc_netem_rate {
+	__u32	rate;	/* byte/s */
+};
+
 enum {
 	NETEM_LOSS_UNSPEC,
 	NETEM_LOSS_GI,		/* General Intuitive - 4 state model */
diff --git a/net/sched/sch_netem.c b/net/sched/sch_netem.c
index eb3b9a8..9b7af9f 100644
--- a/net/sched/sch_netem.c
+++ b/net/sched/sch_netem.c
@@ -79,6 +79,7 @@ struct netem_sched_data {
 	u32 duplicate;
 	u32 reorder;
 	u32 corrupt;
+	u32 rate;
 
 	struct crndstate {
 		u32 last;
@@ -298,6 +299,11 @@ static psched_tdiff_t tabledist(psched_tdiff_t mu, psched_tdiff_t sigma,
 	return  x / NETEM_DIST_SCALE + (sigma / NETEM_DIST_SCALE) * t + mu;
 }
 
+static psched_time_t packet_len_2_sched_time(unsigned int len, u32 rate)
+{
+	return PSCHED_NS2TICKS((u64)len * NSEC_PER_SEC / rate);
+}
+
 /*
  * Insert one skb into qdisc.
  * Note: parent depends on return value to account for queue length.
@@ -371,6 +377,24 @@ static int netem_enqueue(struct sk_buff *skb, struct Qdisc *sch)
 				  &q->delay_cor, q->delay_dist);
 
 		now = psched_get_time();
+
+		if (q->rate) {
+			struct sk_buff_head *list = &q->qdisc->q;
+
+			delay += packet_len_2_sched_time(skb->len, q->rate);
+
+			if (!skb_queue_empty(list)) {
+				/*
+				 * Last packet in queue is reference point (now).
+				 * First packet in queue is already in flight,
+				 * calculate this time bonus and subtract
+				 * from delay.
+				 */
+				delay -= now - netem_skb_cb(skb_peek(list))->time_to_send;
+				now = netem_skb_cb(skb_peek_tail(list))->time_to_send;
+			}
+		}
+
 		cb->time_to_send = now + delay;
 		++q->counter;
 		ret = qdisc_enqueue(skb, q->qdisc);
@@ -535,6 +559,14 @@ static void get_corrupt(struct Qdisc *sch, const struct nlattr *attr)
 	init_crandom(&q->corrupt_cor, r->correlation);
 }
 
+static void get_rate(struct Qdisc *sch, const struct nlattr *attr)
+{
+	struct netem_sched_data *q = qdisc_priv(sch);
+	const struct tc_netem_rate *r = nla_data(attr);
+
+	q->rate = r->rate;
+}
+
 static int get_loss_clg(struct Qdisc *sch, const struct nlattr *attr)
 {
 	struct netem_sched_data *q = qdisc_priv(sch);
@@ -594,6 +626,7 @@ static const struct nla_policy netem_policy[TCA_NETEM_MAX + 1] = {
 	[TCA_NETEM_CORR]	= { .len = sizeof(struct tc_netem_corr) },
 	[TCA_NETEM_REORDER]	= { .len = sizeof(struct tc_netem_reorder) },
 	[TCA_NETEM_CORRUPT]	= { .len = sizeof(struct tc_netem_corrupt) },
+	[TCA_NETEM_RATE]	= { .len = sizeof(struct tc_netem_rate) },
 	[TCA_NETEM_LOSS]	= { .type = NLA_NESTED },
 };
 
@@ -666,6 +699,9 @@ static int netem_change(struct Qdisc *sch, struct nlattr *opt)
 	if (tb[TCA_NETEM_CORRUPT])
 		get_corrupt(sch, tb[TCA_NETEM_CORRUPT]);
 
+	if (tb[TCA_NETEM_RATE])
+		get_rate(sch, tb[TCA_NETEM_RATE]);
+
 	q->loss_model = CLG_RANDOM;
 	if (tb[TCA_NETEM_LOSS])
 		ret = get_loss_clg(sch, tb[TCA_NETEM_LOSS]);
@@ -846,6 +882,7 @@ static int netem_dump(struct Qdisc *sch, struct sk_buff *skb)
 	struct tc_netem_corr cor;
 	struct tc_netem_reorder reorder;
 	struct tc_netem_corrupt corrupt;
+	struct tc_netem_rate rate;
 
 	qopt.latency = q->latency;
 	qopt.jitter = q->jitter;
@@ -868,6 +905,9 @@ static int netem_dump(struct Qdisc *sch, struct sk_buff *skb)
 	corrupt.correlation = q->corrupt_cor.rho;
 	NLA_PUT(skb, TCA_NETEM_CORRUPT, sizeof(corrupt), &corrupt);
 
+	rate.rate = q->rate;
+	NLA_PUT(skb, TCA_NETEM_RATE, sizeof(rate), &rate);
+
 	if (dump_loss_model(q, skb) != 0)
 		goto nla_put_failure;
 
-- 
1.7.7


* [PATCH v3 net-next 2/2] netem: add cell concept to simulate special MAC behavior
  2011-11-30 22:20 [PATCH v3 net-next 1/2] netem: rate extension Hagen Paul Pfeifer
@ 2011-11-30 22:20 ` Hagen Paul Pfeifer
  2011-12-01  3:30   ` Eric Dumazet
  2011-12-01  4:19 ` [PATCH v3 net-next 1/2] netem: rate extension David Miller
  1 sibling, 1 reply; 16+ messages in thread
From: Hagen Paul Pfeifer @ 2011-11-30 22:20 UTC (permalink / raw)
  To: netdev; +Cc: eric.dumazet, Stephen Hemminger, Hagen Paul Pfeifer

This extension can be used to simulate special link layer
characteristics. "Simulate" because packet data is not modified; only the
calculation base is changed to delay a packet based on the original
packet size and artificial cell information.

packet_overhead can be used to simulate a link layer header compression
scheme (e.g. set packet_overhead to -20), or, with a positive
packet_overhead value, an additional MAC header can be simulated. It is
also possible to "replace" the 14 byte Ethernet header with something
else.

cell_size and cell_overhead can be used to simulate cell-based link layer
schemes, like some TDMA schemes. Another application area is MAC schemes
using link layer fragmentation with a (small) header per fragment.
Cell size is the maximum number of data bytes within one cell. Cell
overhead is an additional variable to change the per-cell overhead (e.g. a
5 byte header per fragment).
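
To make the arithmetic concrete, here is a stand-alone user-space sketch
of the calculation as described above (an illustration only, not the patch
code; the 250 byte sample length is made up):

	#include <stdio.h>

	/* effective length used for the delay: data is split into cells,
	 * the last cell is padded, and every cell pays cell_overhead */
	static unsigned int effective_len(unsigned int len, int packet_overhead,
					  unsigned int cell_size, int cell_overhead)
	{
		len += packet_overhead;
		if (cell_size) {
			unsigned int cells = (len + cell_size - 1) / cell_size;

			len = cells * (cell_size + cell_overhead);
		}
		return len;
	}

	int main(void)
	{
		/* example values from below: overhead 20, cell 100, per-cell 5;
		 * a 250 byte packet becomes 270 bytes of data, i.e. 3 cells */
		printf("%u\n", effective_len(250, 20, 100, 5));	/* 315 */
		return 0;
	}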

Example (5 kbit/s, 20 byte per packet overhead, cell-size 100 byte, per
cell overhead 5 byte):

	tc qdisc add dev eth0 root netem rate 5kbit 20 100 5

Signed-off-by: Hagen Paul Pfeifer <hagen@jauu.net>
---

The current version of packet_len_2_sched_time() addresses Eric's div/mod
instruction concerns. I benchmarked the version in the patch against the
following version:


	if (q->cell_size) {
		u32 mod_carry = len % q->cell_size;
		u32 cells     = len / q->cell_size;
		if (mod_carry)
			mod_carry = (len > q->cell_size || !cells) ?
				q->cell_size - mod_carry : len - mod_carry;

		if (q->cell_overhead) {
			if (mod_carry)
				++cells;
			len += cells * q->cell_overhead;
		}
		len += mod_carry;
	}
	return len;


The patch version is a little bit faster for "all" packet sizes. For common
cases (e.g. max. 1000 byte packets, cell size 100 bytes) the patch version
exhibits significant improvements. IMHO the current version is also more
understandable. Replacing div and mod with do_div() was not that successful.


 include/linux/pkt_sched.h |    3 +++
 net/sched/sch_netem.c     |   32 +++++++++++++++++++++++++++++---
 2 files changed, 32 insertions(+), 3 deletions(-)

diff --git a/include/linux/pkt_sched.h b/include/linux/pkt_sched.h
index 26c37ca..63845cf 100644
--- a/include/linux/pkt_sched.h
+++ b/include/linux/pkt_sched.h
@@ -498,6 +498,9 @@ struct tc_netem_corrupt {
 
 struct tc_netem_rate {
 	__u32	rate;	/* byte/s */
+	__s32   packet_overhead;
+	__u32   cell_size;
+	__s32   cell_overhead;
 };
 
 enum {
diff --git a/net/sched/sch_netem.c b/net/sched/sch_netem.c
index 9b7af9f..bcd2b3f 100644
--- a/net/sched/sch_netem.c
+++ b/net/sched/sch_netem.c
@@ -80,6 +80,9 @@ struct netem_sched_data {
 	u32 reorder;
 	u32 corrupt;
 	u32 rate;
+	s32 packet_overhead;
+	u32 cell_size;
+	s32 cell_overhead;
 
 	struct crndstate {
 		u32 last;
@@ -299,9 +302,26 @@ static psched_tdiff_t tabledist(psched_tdiff_t mu, psched_tdiff_t sigma,
 	return  x / NETEM_DIST_SCALE + (sigma / NETEM_DIST_SCALE) * t + mu;
 }
 
-static psched_time_t packet_len_2_sched_time(unsigned int len, u32 rate)
+static psched_time_t packet_len_2_sched_time(unsigned int len,
+					     struct netem_sched_data *q)
 {
-	return PSCHED_NS2TICKS((u64)len * NSEC_PER_SEC / rate);
+	u32 cells = 0;
+	u32 datalen;
+
+	len += q->packet_overhead;
+
+	if (q->cell_size) {
+		for (datalen = len; datalen >  q->cell_size; datalen -= q->cell_size)
+			cells++;
+
+		if (q->cell_overhead)
+			len += cells * q->cell_overhead;
+
+		if (datalen)
+			len += (q->cell_size - datalen);
+	}
+
+	return PSCHED_NS2TICKS((u64)len * NSEC_PER_SEC / q->rate);
 }
 
 /*
@@ -381,7 +401,7 @@ static int netem_enqueue(struct sk_buff *skb, struct Qdisc *sch)
 		if (q->rate) {
 			struct sk_buff_head *list = &q->qdisc->q;
 
-			delay += packet_len_2_sched_time(skb->len, q->rate);
+			delay += packet_len_2_sched_time(skb->len, q);
 
 			if (!skb_queue_empty(list)) {
 				/*
@@ -565,6 +585,9 @@ static void get_rate(struct Qdisc *sch, const struct nlattr *attr)
 	const struct tc_netem_rate *r = nla_data(attr);
 
 	q->rate = r->rate;
+	q->packet_overhead = r->packet_overhead;
+	q->cell_size       = r->cell_size;
+	q->cell_overhead   = r->cell_overhead;
 }
 
 static int get_loss_clg(struct Qdisc *sch, const struct nlattr *attr)
@@ -906,6 +929,9 @@ static int netem_dump(struct Qdisc *sch, struct sk_buff *skb)
 	NLA_PUT(skb, TCA_NETEM_CORRUPT, sizeof(corrupt), &corrupt);
 
 	rate.rate = q->rate;
+	rate.packet_overhead = q->packet_overhead;
+	rate.cell_size       = q->cell_size;
+	rate.cell_overhead   = q->cell_overhead;
 	NLA_PUT(skb, TCA_NETEM_RATE, sizeof(rate), &rate);
 
 	if (dump_loss_model(q, skb) != 0)
-- 
1.7.7


* Re: [PATCH v3 net-next 2/2] netem: add cell concept to simulate special MAC behavior
  2011-11-30 22:20 ` [PATCH v3 net-next 2/2] netem: add cell concept to simulate special MAC behavior Hagen Paul Pfeifer
@ 2011-12-01  3:30   ` Eric Dumazet
  2011-12-01  8:25     ` Hagen Paul Pfeifer
  0 siblings, 1 reply; 16+ messages in thread
From: Eric Dumazet @ 2011-12-01  3:30 UTC (permalink / raw)
  To: Hagen Paul Pfeifer; +Cc: netdev, Stephen Hemminger

On Wednesday, 30 November 2011 at 23:20 +0100, Hagen Paul Pfeifer wrote:
> This extension can be used to simulate special link layer
> characteristics. Simulate because packet data is not modified, only the
> calculation base is changed to delay a packet based on the original
> packet size and artificial cell information.
> 
> packet_overhead can be used to simulate a link layer header compression
> scheme (e.g. set packet_overhead to -20) or with a positive
> packet_overhead value an additional MAC header can be simulated. It is
> also possible to "replace" the 14 byte Ethernet header with something
> else.
> 
> cell_size and cell_overhead can be used to simulate link layer schemes,
> based on cells, like some TDMA schemes. Another application area are MAC
> schemes using a link layer fragmentation with a (small) header each.
> Cell size is the maximum amount of data bytes within one cell. Cell
> overhead is an additional variable to change the per-cell-overhead (e.g.
> 5 byte header per fragment).
> 
> Example (5 kbit/s, 20 byte per packet overhead, cell-size 100 byte, per
> cell overhead 5 byte):
> 
> 	tc qdisc add dev eth0 root netem rate 5kbit 20 100 5
> 
> Signed-off-by: Hagen Paul Pfeifer <hagen@jauu.net>
> ---
> 
> The actual version of packet_len_2_sched_time() address Eric's div/mod
> instruction concerns. I benchmarked the version in the patch with the
> following version:
> 
> 
> 	if (q->cell_size) {
> 		u32 mod_carry = len % q->cell_size;
> 		u32 cells     = len / q->cell_size;
> 		if (mod_carry)
> 			mod_carry = (len > q->cell_size || !cells) ?
> 				q->cell_size - mod_carry : len - mod_carry;
> 
> 		if (q->cell_overhead) {
> 			if (mod_carry)
> 				++cells;
> 			len += cells * q->cell_overhead;
> 		}
> 		len += mod_carry;
> 	}
> 	return len;
> 
> 
> The patch version is a little bit faster for "all" packet sizes. For common
> cases (e.g. max. 1000 byte packets, cellsize 100 byte, the patch version
> exhibit significant improvements). IMHO the actual version is also more
> understandable. Replace div and mod by do_div() was not that successful.
> 
> 
>  include/linux/pkt_sched.h |    3 +++
>  net/sched/sch_netem.c     |   32 +++++++++++++++++++++++++++++---
>  2 files changed, 32 insertions(+), 3 deletions(-)
> 
> diff --git a/include/linux/pkt_sched.h b/include/linux/pkt_sched.h
> index 26c37ca..63845cf 100644
> --- a/include/linux/pkt_sched.h
> +++ b/include/linux/pkt_sched.h
> @@ -498,6 +498,9 @@ struct tc_netem_corrupt {
>  
>  struct tc_netem_rate {
>  	__u32	rate;	/* byte/s */
> +	__s32   packet_overhead;
> +	__u32   cell_size;
> +	__s32   cell_overhead;
>  };
>  
>  enum {
> diff --git a/net/sched/sch_netem.c b/net/sched/sch_netem.c
> index 9b7af9f..bcd2b3f 100644
> --- a/net/sched/sch_netem.c
> +++ b/net/sched/sch_netem.c
> @@ -80,6 +80,9 @@ struct netem_sched_data {
>  	u32 reorder;
>  	u32 corrupt;
>  	u32 rate;
> +	s32 packet_overhead;
> +	u32 cell_size;
> +	s32 cell_overhead;
>  
>  	struct crndstate {
>  		u32 last;
> @@ -299,9 +302,26 @@ static psched_tdiff_t tabledist(psched_tdiff_t mu, psched_tdiff_t sigma,
>  	return  x / NETEM_DIST_SCALE + (sigma / NETEM_DIST_SCALE) * t + mu;
>  }
>  
> -static psched_time_t packet_len_2_sched_time(unsigned int len, u32 rate)
> +static psched_time_t packet_len_2_sched_time(unsigned int len,
> +					     struct netem_sched_data *q)
>  {
> -	return PSCHED_NS2TICKS((u64)len * NSEC_PER_SEC / rate);
> +	u32 cells = 0;
> +	u32 datalen;
> +
> +	len += q->packet_overhead;
> +
> +	if (q->cell_size) {
> +		for (datalen = len; datalen >  q->cell_size; datalen -= q->cell_size)
> +			cells++;

Oh well... you can exit this loop with datalen == q->cell_size


Hmm, take a look at reciprocal divide ...

(include/linux/reciprocal_div.h)


Instead of :

u32 cells     = len / q->cell_size;

You set once q->cell_size_reciprocal = reciprocal_value(q->cell_size);
(in Qdisc init)

Then you do :

cells = reciprocal_divide(len, q->cell_size_reciprocal);

Thats a multiply instead of a divide. On many cpus thats a lot faster.

Think about a super packet (TSO) of 65000 bytes and cell_size=64
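
A rough sketch of how that could look (illustrative only; it reuses the
reciprocal_value()/reciprocal_divide() helpers from
include/linux/reciprocal_div.h and the q->cell_size_reciprocal field
suggested above, not a final patch):

	#include <linux/reciprocal_div.h>

	/* set once when the parameters change, e.g. in get_rate():
	 *   q->cell_size_reciprocal = reciprocal_value(q->cell_size);
	 */
	static psched_time_t packet_len_2_sched_time(unsigned int len,
						     struct netem_sched_data *q)
	{
		len += q->packet_overhead;

		if (q->cell_size) {
			/* multiply-based divide: cells = len / q->cell_size */
			u32 cells = reciprocal_divide(len, q->cell_size_reciprocal);

			if (len > cells * q->cell_size)	/* extra cell for the remainder */
				cells++;
			len = cells * (q->cell_size + q->cell_overhead);
		}

		return PSCHED_NS2TICKS((u64)len * NSEC_PER_SEC / q->rate);
	}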

> +
> +		if (q->cell_overhead)
> +			len += cells * q->cell_overhead;
> +
> +		if (datalen)
> +			len += (q->cell_size - datalen);
> +	}
> +
> +	return PSCHED_NS2TICKS((u64)len * NSEC_PER_SEC / q->rate);
>  }
>  
>  /*
> @@ -381,7 +401,7 @@ static int netem_enqueue(struct sk_buff *skb, struct Qdisc *sch)
>  		if (q->rate) {
>  			struct sk_buff_head *list = &q->qdisc->q;
>  
> -			delay += packet_len_2_sched_time(skb->len, q->rate);
> +			delay += packet_len_2_sched_time(skb->len, q);
>  
>  			if (!skb_queue_empty(list)) {
>  				/*
> @@ -565,6 +585,9 @@ static void get_rate(struct Qdisc *sch, const struct nlattr *attr)
>  	const struct tc_netem_rate *r = nla_data(attr);
>  
>  	q->rate = r->rate;
> +	q->packet_overhead = r->packet_overhead;
> +	q->cell_size       = r->cell_size;
> +	q->cell_overhead   = r->cell_overhead;
>  }
>  
>  static int get_loss_clg(struct Qdisc *sch, const struct nlattr *attr)
> @@ -906,6 +929,9 @@ static int netem_dump(struct Qdisc *sch, struct sk_buff *skb)
>  	NLA_PUT(skb, TCA_NETEM_CORRUPT, sizeof(corrupt), &corrupt);
>  
>  	rate.rate = q->rate;
> +	rate.packet_overhead = q->packet_overhead;
> +	rate.cell_size       = q->cell_size;
> +	rate.cell_overhead   = q->cell_overhead;
>  	NLA_PUT(skb, TCA_NETEM_RATE, sizeof(rate), &rate);
>  
>  	if (dump_loss_model(q, skb) != 0)


* Re: [PATCH v3 net-next 1/2] netem: rate extension
  2011-11-30 22:20 [PATCH v3 net-next 1/2] netem: rate extension Hagen Paul Pfeifer
  2011-11-30 22:20 ` [PATCH v3 net-next 2/2] netem: add cell concept to simulate special MAC behavior Hagen Paul Pfeifer
@ 2011-12-01  4:19 ` David Miller
  1 sibling, 0 replies; 16+ messages in thread
From: David Miller @ 2011-12-01  4:19 UTC (permalink / raw)
  To: hagen; +Cc: netdev, eric.dumazet, shemminger

From: Hagen Paul Pfeifer <hagen@jauu.net>
Date: Wed, 30 Nov 2011 23:20:26 +0100

> Currently netem is not in the ability to emulate channel bandwidth. Only static
> delay (and optional random jitter) can be configured.
> 
> To emulate the channel rate the token bucket filter (sch_tbf) can be used.  But
> TBF has some major emulation flaws. The buffer (token bucket depth/rate) cannot
> be 0. Also the idea behind TBF is that the credit (token in buckets) fills if
> no packet is transmitted. So that there is always a "positive" credit for new
> packets. In real life this behavior contradicts the law of nature where
> nothing can travel faster as speed of light. E.g.: on an emulated 1000 byte/s
> link a small IPv4/TCP SYN packet with ~50 byte require ~0.05 seconds - not 0
> seconds.
> 
> Netem is an excellent place to implement a rate limiting feature: static
> delay is already implemented, tfifo already has time information and the
> user can skip TBF configuration completely.
> 
> This patch implement rate feature which can be configured via tc. e.g:
> 
> 	tc qdisc add dev eth0 root netem rate 10kbit
> 
> To emulate a link of 5000byte/s and add an additional static delay of 10ms:
> 
> 	tc qdisc add dev eth0 root netem delay 10ms rate 5KBps
> 
> Note: similar to TBF the rate extension is bounded to the kernel timing
> system. Depending on the architecture timer granularity, higher rates (e.g.
> 10mbit/s and higher) tend to transmission bursts. Also note: further queues
> living in network adaptors; see ethtool(8).
> 
> Signed-off-by: Hagen Paul Pfeifer <hagen@jauu.net>
> Acked-by: Eric Dumazet <eric.dumazet@gmail.com>

Applied, thanks.


* Re: [PATCH v3 net-next 2/2] netem: add cell concept to simulate special MAC behavior
  2011-12-01  3:30   ` Eric Dumazet
@ 2011-12-01  8:25     ` Hagen Paul Pfeifer
  2011-12-01  9:01       ` Eric Dumazet
  0 siblings, 1 reply; 16+ messages in thread
From: Hagen Paul Pfeifer @ 2011-12-01  8:25 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: netdev, Stephen Hemminger

On Thu, 01 Dec 2011 04:30:25 +0100, Eric Dumazet wrote:

> Thats a multiply instead of a divide. On many cpus thats a lot faster.
>
> Think about a super packet (TSO) of 65000 bytes and cell_size=64

I've never imagined that I am going to say the following: you are wrong,
Eric! (ok, maybe you are right ;-)

TSO and netem is a no-go. With netem you are strongly advised to disable
offloading. I mean TSO will result in _one_ delay of several minutes,
followed by a burst of packets, instead of packets spaced by several
seconds (at a rate of 1000 byte/s) - which is what you want.

To sum up: skb->len is _never_ larger than the MTU for (normal, correct)
network emulation setups with netem. That assumption is why I preferred
the iterative solution over the div/mod solution.

Did I miss something?

Hagen


* Re: [PATCH v3 net-next 2/2] netem: add cell concept to simulate special MAC behavior
  2011-12-01  8:25     ` Hagen Paul Pfeifer
@ 2011-12-01  9:01       ` Eric Dumazet
  2011-12-01  9:32         ` [PATCH net-next] netem: fix build error on 32bit arches Eric Dumazet
  2011-12-01  9:36         ` [PATCH v3 net-next 2/2] netem: add cell concept to simulate special MAC behavior Hagen Paul Pfeifer
  0 siblings, 2 replies; 16+ messages in thread
From: Eric Dumazet @ 2011-12-01  9:01 UTC (permalink / raw)
  To: Hagen Paul Pfeifer; +Cc: netdev, Stephen Hemminger

On Thursday, 01 December 2011 at 09:25 +0100, Hagen Paul Pfeifer wrote:
> On Thu, 01 Dec 2011 04:30:25 +0100, Eric Dumazet wrote:
> 
> > Thats a multiply instead of a divide. On many cpus thats a lot faster.
> >
> > Think about a super packet (TSO) of 65000 bytes and cell_size=64
> 
> I've never imagined that I am going to say the following: you are wrong,
> Eric! (ok, maybe you are right ;-)
> 
> TSO and Netem is a no-go. With netem you are strongly advised to disable
> offloading. I mean TSO will result in _one_ delay of several minutes,
> followed by a burst of packets. Instead of packets spaced by several
> seconds (with the rate of 1000byte/s) - which is what you wan't.
> 
> To sum up: skb->len is _never_ larger as the MTU for (normal, correct)
> network emulation setups with netem. This was the assumption why I
> preferred the iterative solution over the div/mod solution.
> 
> Did I miss something?
> 

Yes :)

I want to be able to use netem on a 10Gigabit link, and simulate a 5ms
delay. I already will hit the shared qdisc bottleneck, dont force me to
use small packets !

We did cleanups in net/sched to properly handle large packets as well.
(SFQ for example is OK)

Really, reciprocal divide is the way to go; it's faster anyway on modern
CPUs than your loop.


* [PATCH net-next] netem: fix build error on 32bit arches
  2011-12-01  9:01       ` Eric Dumazet
@ 2011-12-01  9:32         ` Eric Dumazet
  2011-12-01  9:46           ` Hagen Paul Pfeifer
                             ` (2 more replies)
  2011-12-01  9:36         ` [PATCH v3 net-next 2/2] netem: add cell concept to simulate special MAC behavior Hagen Paul Pfeifer
  1 sibling, 3 replies; 16+ messages in thread
From: Eric Dumazet @ 2011-12-01  9:32 UTC (permalink / raw)
  To: Hagen Paul Pfeifer, David Miller; +Cc: netdev

ERROR: "__udivdi3" [net/sched/sch_netem.ko] undefined!

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
---
 net/sched/sch_netem.c |    5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/net/sched/sch_netem.c b/net/sched/sch_netem.c
index 9b7af9f..3bfd733 100644
--- a/net/sched/sch_netem.c
+++ b/net/sched/sch_netem.c
@@ -301,7 +301,10 @@ static psched_tdiff_t tabledist(psched_tdiff_t mu, psched_tdiff_t sigma,
 
 static psched_time_t packet_len_2_sched_time(unsigned int len, u32 rate)
 {
-	return PSCHED_NS2TICKS((u64)len * NSEC_PER_SEC / rate);
+	u64 ticks = (u64)len * NSEC_PER_SEC;
+
+	do_div(ticks, rate);
+	return PSCHED_NS2TICKS(ticks);
 }
 
 /*


* Re: [PATCH v3 net-next 2/2] netem: add cell concept to simulate special MAC behavior
  2011-12-01  9:01       ` Eric Dumazet
  2011-12-01  9:32         ` [PATCH net-next] netem: fix build error on 32bit arches Eric Dumazet
@ 2011-12-01  9:36         ` Hagen Paul Pfeifer
  2011-12-01 16:24           ` Stephen Hemminger
  1 sibling, 1 reply; 16+ messages in thread
From: Hagen Paul Pfeifer @ 2011-12-01  9:36 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: netdev, Stephen Hemminger

On Thu, 01 Dec 2011 10:01:48 +0100, Eric Dumazet wrote:

> Yes :)

damn!

> I want to be able to use netem on a 10Gigabit link, and simulate a 5ms
> delay. I already will hit the shared qdisc bottleneck, dont force me to
> use small packets !

No, I don't want that. But with 10Gb/s links you will have packet
scheduling problems anyway - if you focus on an _accurate_ delay. A static
delay differs from rate shaping in its use case. In the latter we (and
probably you) want an exact/realistic spacing between packets.

Due to timer and scheduling granularity, somewhere in between 1 bit/s and
10Gb/s the netem rate feature (and tbf) will not scale anymore. You will
see bursts and inaccurate spacings, far away from what you want to
emulate. We want a realistic and clean behavior; if the result of the
emulation is not identical to the emulated link/device, we cannot use it
(some background information).

Anyway: I was not sure which solution you prefer - for us both are
identical. That's why I presented two solutions, so you can pick the
favorite one. I will re-code the calculation using a reciprocal divide.
Thanks Eric!

Hagen


* Re: [PATCH net-next] netem: fix build error on 32bit arches
  2011-12-01  9:32         ` [PATCH net-next] netem: fix build error on 32bit arches Eric Dumazet
@ 2011-12-01  9:46           ` Hagen Paul Pfeifer
  2011-12-01 11:04           ` David Laight
  2011-12-01 17:46           ` David Miller
  2 siblings, 0 replies; 16+ messages in thread
From: Hagen Paul Pfeifer @ 2011-12-01  9:46 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: David Miller, netdev

On Thu, 01 Dec 2011 10:32:14 +0100, Eric Dumazet wrote:

> ERROR: "__udivdi3" [net/sched/sch_netem.ko] undefined!
>
> Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>

Acked-by: Hagen Paul Pfeifer <hagen@jauu.net>


* RE: [PATCH net-next] netem: fix build error on 32bit arches
  2011-12-01  9:32         ` [PATCH net-next] netem: fix build error on 32bit arches Eric Dumazet
  2011-12-01  9:46           ` Hagen Paul Pfeifer
@ 2011-12-01 11:04           ` David Laight
  2011-12-01 11:19             ` Eric Dumazet
  2011-12-01 17:46           ` David Miller
  2 siblings, 1 reply; 16+ messages in thread
From: David Laight @ 2011-12-01 11:04 UTC (permalink / raw)
  To: Eric Dumazet, Hagen Paul Pfeifer, David Miller; +Cc: netdev

 
> +	u64 ticks = (u64)len * NSEC_PER_SEC;
Writing:
	u64 ticks = len * (u64)NSEC_PER_SEC;
probably generates better code since the compiler
is much more likely to spot that a single 32x32 -> 64
multiply is adequate.

	David


* RE: [PATCH net-next] netem: fix build error on 32bit arches
  2011-12-01 11:04           ` David Laight
@ 2011-12-01 11:19             ` Eric Dumazet
  0 siblings, 0 replies; 16+ messages in thread
From: Eric Dumazet @ 2011-12-01 11:19 UTC (permalink / raw)
  To: David Laight; +Cc: Hagen Paul Pfeifer, David Miller, netdev

On Thursday, 01 December 2011 at 11:04 +0000, David Laight wrote:
> > +	u64 ticks = (u64)len * NSEC_PER_SEC;
> Writing:
> 	u64 ticks = len * (u64)NSEC_PER_SEC;
> probably generates better code since the compiler
> is much more likely to spot that a single 32x32 -> 64
> multiply is adequate.
> 

Your copy of gcc is very dumb then :)


* Re: [PATCH v3 net-next 2/2] netem: add cell concept to simulate special MAC behavior
  2011-12-01  9:36         ` [PATCH v3 net-next 2/2] netem: add cell concept to simulate special MAC behavior Hagen Paul Pfeifer
@ 2011-12-01 16:24           ` Stephen Hemminger
  2011-12-01 16:38             ` David Laight
  0 siblings, 1 reply; 16+ messages in thread
From: Stephen Hemminger @ 2011-12-01 16:24 UTC (permalink / raw)
  To: Hagen Paul Pfeifer; +Cc: Eric Dumazet, netdev

One idea to do small delays at higher speed is to insert dummy pad frames
into the device. It would mean generating garbage, but would allow for
highly accurate fine grain delay.


* RE: [PATCH v3 net-next 2/2] netem: add cell concept to simulate special MAC behavior
  2011-12-01 16:24           ` Stephen Hemminger
@ 2011-12-01 16:38             ` David Laight
  2011-12-01 16:57               ` Stephen Hemminger
  0 siblings, 1 reply; 16+ messages in thread
From: David Laight @ 2011-12-01 16:38 UTC (permalink / raw)
  To: Stephen Hemminger, Hagen Paul Pfeifer; +Cc: Eric Dumazet, netdev

 
> One idea to do small delays at higher speed is to insert 
> dummy pad frames into the device.
> It would mean generating garbage, but would allow for
> highly accurate fine grain delay.

Not a good idea.
They would have to be sent to a known MAC address
otherwise all the ethernet switches would forward them
on all output ports.

	David


* Re: [PATCH v3 net-next 2/2] netem: add cell concept to simulate special MAC behavior
  2011-12-01 16:38             ` David Laight
@ 2011-12-01 16:57               ` Stephen Hemminger
  2011-12-01 18:25                 ` Rick Jones
  0 siblings, 1 reply; 16+ messages in thread
From: Stephen Hemminger @ 2011-12-01 16:57 UTC (permalink / raw)
  To: David Laight; +Cc: Hagen Paul Pfeifer, Eric Dumazet, netdev

On Thu, 1 Dec 2011 16:38:51 -0000
"David Laight" <David.Laight@ACULAB.COM> wrote:

>  
> > One idea to do small delays at higher speed is to insert 
> > dummy pad frames into the device.
> > It would mean generating garbage, but would allow for
> > highly accurate fine grain delay.
> 
> Not a good idea.
> They would have to be sent to a known MAC address
> otherwise all the ethernet switches would forward them
> on all output ports.
> 
> 	David
> 
> 

Yes it would have to be a constant destination, not sure if there
is a discard value in Ethernet protocol spec.


* Re: [PATCH net-next] netem: fix build error on 32bit arches
  2011-12-01  9:32         ` [PATCH net-next] netem: fix build error on 32bit arches Eric Dumazet
  2011-12-01  9:46           ` Hagen Paul Pfeifer
  2011-12-01 11:04           ` David Laight
@ 2011-12-01 17:46           ` David Miller
  2 siblings, 0 replies; 16+ messages in thread
From: David Miller @ 2011-12-01 17:46 UTC (permalink / raw)
  To: eric.dumazet; +Cc: hagen, netdev

From: Eric Dumazet <eric.dumazet@gmail.com>
Date: Thu, 01 Dec 2011 10:32:14 +0100

> ERROR: "__udivdi3" [net/sched/sch_netem.ko] undefined!
> 
> Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>

Applied.


* Re: [PATCH v3 net-next 2/2] netem: add cell concept to simulate special MAC behavior
  2011-12-01 16:57               ` Stephen Hemminger
@ 2011-12-01 18:25                 ` Rick Jones
  0 siblings, 0 replies; 16+ messages in thread
From: Rick Jones @ 2011-12-01 18:25 UTC (permalink / raw)
  To: Stephen Hemminger; +Cc: David Laight, Hagen Paul Pfeifer, Eric Dumazet, netdev

On 12/01/2011 08:57 AM, Stephen Hemminger wrote:
> On Thu, 1 Dec 2011 16:38:51 -0000
> "David Laight"<David.Laight@ACULAB.COM>  wrote:
>
>>
>>> One idea to do small delays at higher speed is to insert
>>> dummy pad frames into the device.
>>> It would mean generating garbage, but would allow for
>>> highly accurate fine grain delay.
>>
>> Not a good idea.
>> They would have to be sent to a known MAC address
>> otherwise all the ethernet switches would forward them
>> on all output ports.
>>
>> 	David
>>
>>
>
> Yes it would have to be a constant destination, not sure if there
> is a discard value in Ethernet protocol spec.

Aren't there special addresses that aren't supposed to be forwarded by 
(intelligent) switches?  IIRC LLDP uses such things.  Though the IEEE 
may take a dim view of using it for such a purpose, and knuth only knows 
what switch bugs would be uncovered that way...

http://standards.ieee.org/develop/regauth/grpmac/public.html
http://en.wikipedia.org/wiki/Link_Layer_Discovery_Protocol

rick jones

