* [patch net-next v2 00/11] couple of net/sched fixes+improvements
From: Jiri Pirko @ 2013-02-08 18:59 UTC (permalink / raw)
To: netdev; +Cc: davem, edumazet, jhs, kuznet, j.vimal
v1->v2:
- made struct psched_ratecfg const in the parameters of a couple of inline functions
  (patch "sch: make htb_rate_cfg and functions around that generic")
- fixed the misspelled "peak"
  (patch "tbf: improved accuracy at high rates")
- added the last 4 patches to this set
Jiri Pirko (11):
  htb: use PSCHED_TICKS2NS()
  htb: fix values in opt dump
  htb: remove pointless first initialization of buffer and cbuffer
  htb: initialize cl->tokens and cl->ctokens correctly
  sch: make htb_rate_cfg and functions around that generic
  tbf: improved accuracy at high rates
  tbf: ignore max_size check for gso skbs
  tbf: fix value set for q->ptokens
  act_police: move struct tcf_police to act_police.c
  act_police: improved accuracy at high rates
  act_police: remove <=mtu check for gso skbs

 include/net/act_api.h     |  15 ------
 include/net/sch_generic.h |  19 +++++++
 net/sched/act_police.c    | 124 +++++++++++++++++++++++++---------------------
 net/sched/sch_generic.c   |  37 ++++++++++++++
 net/sched/sch_htb.c       |  80 ++++++------------------------
 net/sched/sch_tbf.c       |  71 +++++++++++++-------------
 6 files changed, 173 insertions(+), 173 deletions(-)
--
1.8.1.2
* [patch net-next v2 01/11] htb: use PSCHED_TICKS2NS()
From: Jiri Pirko @ 2013-02-08 18:59 UTC (permalink / raw)
To: netdev; +Cc: davem, edumazet, jhs, kuznet, j.vimal
Signed-off-by: Jiri Pirko <jiri@resnulli.us>
Acked-by: Eric Dumazet <edumazet@google.com>
---
net/sched/sch_htb.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/net/sched/sch_htb.c b/net/sched/sch_htb.c
index 51561ea..476992c 100644
--- a/net/sched/sch_htb.c
+++ b/net/sched/sch_htb.c
@@ -1512,8 +1512,8 @@ static int htb_change_class(struct Qdisc *sch, u32 classid,
htb_precompute_ratedata(&cl->rate);
htb_precompute_ratedata(&cl->ceil);
- cl->buffer = hopt->buffer << PSCHED_SHIFT;
- cl->cbuffer = hopt->buffer << PSCHED_SHIFT;
+ cl->buffer = PSCHED_TICKS2NS(hopt->buffer);
+ cl->cbuffer = PSCHED_TICKS2NS(hopt->buffer);
sch_tree_unlock(sch);
--
1.8.1.2
* [patch net-next v2 02/11] htb: fix values in opt dump
From: Jiri Pirko @ 2013-02-08 18:59 UTC (permalink / raw)
To: netdev; +Cc: davem, edumazet, jhs, kuznet, j.vimal
In htb_change_class(), cl->buffer and cl->cbuffer are stored in ns.
So in the dump, convert them back to psched ticks.
Note this was introduced by:
commit 56b765b79e9a78dc7d3f8850ba5e5567205a3ecd
htb: improved accuracy at high rates
Please consider this for -net/-stable.
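
For illustration only, here is a minimal userspace sketch of the unit mismatch this patch addresses. The `sketch_*` names are hypothetical stand-ins for PSCHED_TICKS2NS()/PSCHED_NS2TICKS(), and the shift of 6 (one psched tick = 64 ns) is an assumption taken from the kernel headers of this period rather than from this patch.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed value of PSCHED_SHIFT: one psched tick corresponds to 64 ns. */
#define SKETCH_PSCHED_SHIFT 6

/* Hypothetical stand-in for PSCHED_TICKS2NS(): ticks from userspace -> ns. */
static inline int64_t sketch_ticks2ns(uint32_t ticks)
{
	return (int64_t)ticks << SKETCH_PSCHED_SHIFT;
}

/* Hypothetical stand-in for PSCHED_NS2TICKS(): ns kept in the class -> ticks. */
static inline uint32_t sketch_ns2ticks(int64_t ns)
{
	assert(ns >= 0);
	return (uint32_t)(ns >> SKETCH_PSCHED_SHIFT);
}
```

Under these assumptions, a buffer of 1000 ticks is stored as 64000 ns. Dumping the stored value without converting it back would report 64000 to userspace instead of 1000, which is the mismatch the NS2TICKS conversion in the dump path avoids.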
Signed-off-by: Jiri Pirko <jiri@resnulli.us>
Acked-by: Eric Dumazet <edumazet@google.com>
---
net/sched/sch_htb.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/net/sched/sch_htb.c b/net/sched/sch_htb.c
index 476992c..14a83dc 100644
--- a/net/sched/sch_htb.c
+++ b/net/sched/sch_htb.c
@@ -1135,9 +1135,9 @@ static int htb_dump_class(struct Qdisc *sch, unsigned long arg,
memset(&opt, 0, sizeof(opt));
opt.rate.rate = cl->rate.rate_bps >> 3;
- opt.buffer = cl->buffer;
+ opt.buffer = PSCHED_NS2TICKS(cl->buffer);
opt.ceil.rate = cl->ceil.rate_bps >> 3;
- opt.cbuffer = cl->cbuffer;
+ opt.cbuffer = PSCHED_NS2TICKS(cl->cbuffer);
opt.quantum = cl->quantum;
opt.prio = cl->prio;
opt.level = cl->level;
--
1.8.1.2
* [patch net-next v2 03/11] htb: remove pointless first initialization of buffer and cbuffer
From: Jiri Pirko @ 2013-02-08 18:59 UTC (permalink / raw)
To: netdev; +Cc: davem, edumazet, jhs, kuznet, j.vimal
These are initialized correctly a couple of lines later in the function.
Signed-off-by: Jiri Pirko <jiri@resnulli.us>
Acked-by: Eric Dumazet <edumazet@google.com>
---
net/sched/sch_htb.c | 3 ---
1 file changed, 3 deletions(-)
diff --git a/net/sched/sch_htb.c b/net/sched/sch_htb.c
index 14a83dc..547912e9 100644
--- a/net/sched/sch_htb.c
+++ b/net/sched/sch_htb.c
@@ -1503,9 +1503,6 @@ static int htb_change_class(struct Qdisc *sch, u32 classid,
cl->prio = TC_HTB_NUMPRIO - 1;
}
- cl->buffer = hopt->buffer;
- cl->cbuffer = hopt->cbuffer;
-
cl->rate.rate_bps = (u64)hopt->rate.rate << 3;
cl->ceil.rate_bps = (u64)hopt->ceil.rate << 3;
--
1.8.1.2
* [patch net-next v2 04/11] htb: initialize cl->tokens and cl->ctokens correctly
From: Jiri Pirko @ 2013-02-08 18:59 UTC (permalink / raw)
To: netdev; +Cc: davem, edumazet, jhs, kuznet, j.vimal
cl->tokens and cl->ctokens are kept in ns, so convert the initial values from ticks to ns.
Signed-off-by: Jiri Pirko <jiri@resnulli.us>
Acked-by: Eric Dumazet <edumazet@google.com>
---
net/sched/sch_htb.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/net/sched/sch_htb.c b/net/sched/sch_htb.c
index 547912e9..2b22544 100644
--- a/net/sched/sch_htb.c
+++ b/net/sched/sch_htb.c
@@ -1459,8 +1459,8 @@ static int htb_change_class(struct Qdisc *sch, u32 classid,
cl->parent = parent;
/* set class to be in HTB_CAN_SEND state */
- cl->tokens = hopt->buffer;
- cl->ctokens = hopt->cbuffer;
+ cl->tokens = PSCHED_TICKS2NS(hopt->buffer);
+ cl->ctokens = PSCHED_TICKS2NS(hopt->cbuffer);
cl->mbuffer = 60 * PSCHED_TICKS_PER_SEC; /* 1min */
cl->t_c = psched_get_time();
cl->cmode = HTB_CAN_SEND;
--
1.8.1.2
* [patch net-next v2 05/11] sch: make htb_rate_cfg and functions around that generic
From: Jiri Pirko @ 2013-02-08 18:59 UTC (permalink / raw)
To: netdev; +Cc: davem, edumazet, jhs, kuznet, j.vimal
As htb_rate_cfg and its helper functions are going to be used in tbf as well, move them to generic code.
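
As a rough illustration of what the generic helpers compute, here is a userspace sketch of the mult/shift calibration. All `sketch_*` names are hypothetical; the loop follows the structure of the function in the diff, with the byte rate widened to 64 bits before the shift as an assumption of this sketch.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_NSEC_PER_SEC 1000000000ULL

/* Hypothetical userspace mirror of struct psched_ratecfg. */
struct sketch_ratecfg {
	uint64_t rate_bps;	/* rate in bits per second */
	uint32_t mult;
	uint32_t shift;
};

/* Length (bytes) to transmission time (ns): (len * mult) >> shift. */
static uint64_t sketch_l2t_ns(const struct sketch_ratecfg *r, unsigned int len)
{
	return ((uint64_t)len * r->mult) >> r->shift;
}

/* rate is in bytes per second, as passed from the tc rate spec. */
static void sketch_ratecfg_precompute(struct sketch_ratecfg *r, uint32_t rate)
{
	uint64_t mult;
	int shift;

	assert(r);
	r->rate_bps = (uint64_t)rate << 3;	/* widened first: sketch assumption */
	r->shift = 0;
	r->mult = 1;
	if (r->rate_bps == 0)
		return;

	/* Find the largest shift (below 16) for which mult still fits in 32 bits. */
	for (shift = 0; shift < 16; shift++) {
		mult = ((8ULL * SKETCH_NSEC_PER_SEC) << shift) / r->rate_bps;
		if (mult > UINT32_MAX)
			break;
	}
	r->shift = (uint32_t)(shift - 1);
	r->mult = (uint32_t)(((8ULL * SKETCH_NSEC_PER_SEC) << r->shift) / r->rate_bps);
}
```

With a rate of 125000000 bytes/s (1 Gbit/s) this sketch settles on shift 15 and mult 262144, so a 1500-byte packet costs 12000 ns.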
Signed-off-by: Jiri Pirko <jiri@resnulli.us>
Acked-by: Eric Dumazet <edumazet@google.com>
---
include/net/sch_generic.h | 19 ++++++++++++++
net/sched/sch_generic.c | 37 +++++++++++++++++++++++++++
net/sched/sch_htb.c | 65 +++++++----------------------------------------
3 files changed, 65 insertions(+), 56 deletions(-)
diff --git a/include/net/sch_generic.h b/include/net/sch_generic.h
index 2d06c2a..2761c90 100644
--- a/include/net/sch_generic.h
+++ b/include/net/sch_generic.h
@@ -679,4 +679,23 @@ static inline struct sk_buff *skb_act_clone(struct sk_buff *skb, gfp_t gfp_mask,
}
#endif
+struct psched_ratecfg {
+ u64 rate_bps;
+ u32 mult;
+ u32 shift;
+};
+
+static inline u64 psched_l2t_ns(const struct psched_ratecfg *r,
+ unsigned int len)
+{
+ return ((u64)len * r->mult) >> r->shift;
+}
+
+extern void psched_ratecfg_precompute(struct psched_ratecfg *r, u32 rate);
+
+static inline u32 psched_ratecfg_getrate(const struct psched_ratecfg *r)
+{
+ return r->rate_bps >> 3;
+}
+
#endif
diff --git a/net/sched/sch_generic.c b/net/sched/sch_generic.c
index 5d81a44..ffad481 100644
--- a/net/sched/sch_generic.c
+++ b/net/sched/sch_generic.c
@@ -25,6 +25,7 @@
#include <linux/rcupdate.h>
#include <linux/list.h>
#include <linux/slab.h>
+#include <net/sch_generic.h>
#include <net/pkt_sched.h>
#include <net/dst.h>
@@ -896,3 +897,39 @@ void dev_shutdown(struct net_device *dev)
WARN_ON(timer_pending(&dev->watchdog_timer));
}
+
+void psched_ratecfg_precompute(struct psched_ratecfg *r, u32 rate)
+{
+ u64 factor;
+ u64 mult;
+ int shift;
+
+ r->rate_bps = rate << 3;
+ r->shift = 0;
+ r->mult = 1;
+ /*
+ * Calibrate mult, shift so that token counting is accurate
+ * for smallest packet size (64 bytes). Token (time in ns) is
+ * computed as (bytes * 8) * NSEC_PER_SEC / rate_bps. It will
+ * work as long as the smallest packet transfer time can be
+ * accurately represented in nanosec.
+ */
+ if (r->rate_bps > 0) {
+ /*
+ * Higher shift gives better accuracy. Find the largest
+ * shift such that mult fits in 32 bits.
+ */
+ for (shift = 0; shift < 16; shift++) {
+ r->shift = shift;
+ factor = 8LLU * NSEC_PER_SEC * (1 << r->shift);
+ mult = div64_u64(factor, r->rate_bps);
+ if (mult > UINT_MAX)
+ break;
+ }
+
+ r->shift = shift - 1;
+ factor = 8LLU * NSEC_PER_SEC * (1 << r->shift);
+ r->mult = div64_u64(factor, r->rate_bps);
+ }
+}
+EXPORT_SYMBOL(psched_ratecfg_precompute);
diff --git a/net/sched/sch_htb.c b/net/sched/sch_htb.c
index 2b22544..03c2692 100644
--- a/net/sched/sch_htb.c
+++ b/net/sched/sch_htb.c
@@ -38,6 +38,7 @@
#include <linux/workqueue.h>
#include <linux/slab.h>
#include <net/netlink.h>
+#include <net/sch_generic.h>
#include <net/pkt_sched.h>
/* HTB algorithm.
@@ -71,12 +72,6 @@ enum htb_cmode {
HTB_CAN_SEND /* class can send */
};
-struct htb_rate_cfg {
- u64 rate_bps;
- u32 mult;
- u32 shift;
-};
-
/* interior & leaf nodes; props specific to leaves are marked L: */
struct htb_class {
struct Qdisc_class_common common;
@@ -124,8 +119,8 @@ struct htb_class {
int filter_cnt;
/* token bucket parameters */
- struct htb_rate_cfg rate;
- struct htb_rate_cfg ceil;
+ struct psched_ratecfg rate;
+ struct psched_ratecfg ceil;
s64 buffer, cbuffer; /* token bucket depth/rate */
psched_tdiff_t mbuffer; /* max wait time */
s64 tokens, ctokens; /* current number of tokens */
@@ -168,45 +163,6 @@ struct htb_sched {
struct work_struct work;
};
-static u64 l2t_ns(struct htb_rate_cfg *r, unsigned int len)
-{
- return ((u64)len * r->mult) >> r->shift;
-}
-
-static void htb_precompute_ratedata(struct htb_rate_cfg *r)
-{
- u64 factor;
- u64 mult;
- int shift;
-
- r->shift = 0;
- r->mult = 1;
- /*
- * Calibrate mult, shift so that token counting is accurate
- * for smallest packet size (64 bytes). Token (time in ns) is
- * computed as (bytes * 8) * NSEC_PER_SEC / rate_bps. It will
- * work as long as the smallest packet transfer time can be
- * accurately represented in nanosec.
- */
- if (r->rate_bps > 0) {
- /*
- * Higher shift gives better accuracy. Find the largest
- * shift such that mult fits in 32 bits.
- */
- for (shift = 0; shift < 16; shift++) {
- r->shift = shift;
- factor = 8LLU * NSEC_PER_SEC * (1 << r->shift);
- mult = div64_u64(factor, r->rate_bps);
- if (mult > UINT_MAX)
- break;
- }
-
- r->shift = shift - 1;
- factor = 8LLU * NSEC_PER_SEC * (1 << r->shift);
- r->mult = div64_u64(factor, r->rate_bps);
- }
-}
-
/* find class in global hash table using given handle */
static inline struct htb_class *htb_find(u32 handle, struct Qdisc *sch)
{
@@ -632,7 +588,7 @@ static inline void htb_accnt_tokens(struct htb_class *cl, int bytes, s64 diff)
if (toks > cl->buffer)
toks = cl->buffer;
- toks -= (s64) l2t_ns(&cl->rate, bytes);
+ toks -= (s64) psched_l2t_ns(&cl->rate, bytes);
if (toks <= -cl->mbuffer)
toks = 1 - cl->mbuffer;
@@ -645,7 +601,7 @@ static inline void htb_accnt_ctokens(struct htb_class *cl, int bytes, s64 diff)
if (toks > cl->cbuffer)
toks = cl->cbuffer;
- toks -= (s64) l2t_ns(&cl->ceil, bytes);
+ toks -= (s64) psched_l2t_ns(&cl->ceil, bytes);
if (toks <= -cl->mbuffer)
toks = 1 - cl->mbuffer;
@@ -1134,9 +1090,9 @@ static int htb_dump_class(struct Qdisc *sch, unsigned long arg,
memset(&opt, 0, sizeof(opt));
- opt.rate.rate = cl->rate.rate_bps >> 3;
+ opt.rate.rate = psched_ratecfg_getrate(&cl->rate);
opt.buffer = PSCHED_NS2TICKS(cl->buffer);
- opt.ceil.rate = cl->ceil.rate_bps >> 3;
+ opt.ceil.rate = psched_ratecfg_getrate(&cl->ceil);
opt.cbuffer = PSCHED_NS2TICKS(cl->cbuffer);
opt.quantum = cl->quantum;
opt.prio = cl->prio;
@@ -1503,11 +1459,8 @@ static int htb_change_class(struct Qdisc *sch, u32 classid,
cl->prio = TC_HTB_NUMPRIO - 1;
}
- cl->rate.rate_bps = (u64)hopt->rate.rate << 3;
- cl->ceil.rate_bps = (u64)hopt->ceil.rate << 3;
-
- htb_precompute_ratedata(&cl->rate);
- htb_precompute_ratedata(&cl->ceil);
+ psched_ratecfg_precompute(&cl->rate, hopt->rate.rate);
+ psched_ratecfg_precompute(&cl->ceil, hopt->ceil.rate);
cl->buffer = PSCHED_TICKS2NS(hopt->buffer);
cl->cbuffer = PSCHED_TICKS2NS(hopt->buffer);
--
1.8.1.2
* [patch net-next v2 06/11] tbf: improved accuracy at high rates
From: Jiri Pirko @ 2013-02-08 18:59 UTC (permalink / raw)
To: netdev; +Cc: davem, edumazet, jhs, kuznet, j.vimal
The current TBF uses a rate table computed by the "tc" userspace program,
which has the following issue:
The rate table has 256 entries to map packet lengths to
tokens (time units). With TSO-sized packets, the 256-entry granularity
leads to loss/gain of rate, making the token bucket inaccurate.
Thus, instead of relying on the rate table, this patch explicitly computes
the time and accounts for packet transmission times with nanosecond
granularity.
This is a followup to commit 56b765b79e9a78dc7d3f8850ba5e5567205a3ecd
("htb: improved accuracy at high rates").
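
To make the granularity issue concrete, here is a deliberately simplified userspace model of a 256-entry table next to a direct computation. It is not tc's exact table-building algorithm: the per-cell rounding and the clamping of oversized lengths to the last entry are assumptions of this sketch, and all `sketch_*` names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_CELLS 256

/* Exact transmission time in ns for len bytes at rate_bps bits per second. */
static uint64_t sketch_exact_ns(uint64_t rate_bps, unsigned int len)
{
	assert(rate_bps > 0);
	return (uint64_t)len * 8ULL * 1000000000ULL / rate_bps;
}

/*
 * Simplified table: entry i holds the cost of the largest length that
 * falls into cell i, where a cell spans (1 << cell_log) bytes.
 */
static void sketch_fill_table(uint64_t table[SKETCH_CELLS], uint64_t rate_bps,
			      unsigned int cell_log)
{
	for (unsigned int i = 0; i < SKETCH_CELLS; i++)
		table[i] = sketch_exact_ns(rate_bps, (i + 1) << cell_log);
}

/* Table lookup: every length inside the same cell gets the same cost. */
static uint64_t sketch_table_ns(const uint64_t table[SKETCH_CELLS],
				unsigned int cell_log, unsigned int len)
{
	unsigned int slot = len >> cell_log;

	if (slot >= SKETCH_CELLS)
		slot = SKETCH_CELLS - 1;	/* clamp: simplification of this sketch */
	return table[slot];
}
```

In this model, at 1 Gbit/s with 8-byte cells, packets of 1000 and 1007 bytes are both charged 8064 ns by the table, while the direct computation gives 8000 ns and 8056 ns. Larger packets need larger cells to fit into 256 entries, so the rounding error per packet grows accordingly.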
Signed-off-by: Jiri Pirko <jiri@resnulli.us>
---
net/sched/sch_tbf.c | 60 ++++++++++++++++++++++++++---------------------------
1 file changed, 29 insertions(+), 31 deletions(-)
diff --git a/net/sched/sch_tbf.c b/net/sched/sch_tbf.c
index 4b056c15..e05710a 100644
--- a/net/sched/sch_tbf.c
+++ b/net/sched/sch_tbf.c
@@ -19,6 +19,7 @@
#include <linux/errno.h>
#include <linux/skbuff.h>
#include <net/netlink.h>
+#include <net/sch_generic.h>
#include <net/pkt_sched.h>
@@ -100,23 +101,21 @@
struct tbf_sched_data {
/* Parameters */
u32 limit; /* Maximal length of backlog: bytes */
- u32 buffer; /* Token bucket depth/rate: MUST BE >= MTU/B */
+ s64 buffer; /* Token bucket depth/rate: MUST BE >= MTU/B */
u32 mtu;
u32 max_size;
- struct qdisc_rate_table *R_tab;
- struct qdisc_rate_table *P_tab;
+ struct psched_ratecfg rate;
+ struct psched_ratecfg peak;
+ bool peak_present;
/* Variables */
- long tokens; /* Current number of B tokens */
- long ptokens; /* Current number of P tokens */
+ s64 tokens; /* Current number of B tokens */
+ s64 ptokens; /* Current number of P tokens */
psched_time_t t_c; /* Time check-point */
struct Qdisc *qdisc; /* Inner qdisc, default - bfifo queue */
struct qdisc_watchdog watchdog; /* Watchdog timer */
};
-#define L2T(q, L) qdisc_l2t((q)->R_tab, L)
-#define L2T_P(q, L) qdisc_l2t((q)->P_tab, L)
-
static int tbf_enqueue(struct sk_buff *skb, struct Qdisc *sch)
{
struct tbf_sched_data *q = qdisc_priv(sch);
@@ -157,23 +156,23 @@ static struct sk_buff *tbf_dequeue(struct Qdisc *sch)
if (skb) {
psched_time_t now;
- long toks;
- long ptoks = 0;
+ s64 toks;
+ s64 ptoks = 0;
unsigned int len = qdisc_pkt_len(skb);
- now = psched_get_time();
- toks = psched_tdiff_bounded(now, q->t_c, q->buffer);
+ now = ktime_to_ns(ktime_get());
+ toks = min_t(s64, now - q->t_c, q->buffer);
- if (q->P_tab) {
+ if (q->peak_present) {
ptoks = toks + q->ptokens;
if (ptoks > (long)q->mtu)
ptoks = q->mtu;
- ptoks -= L2T_P(q, len);
+ ptoks -= (s64) psched_l2t_ns(&q->peak, len);
}
toks += q->tokens;
- if (toks > (long)q->buffer)
+ if (toks > q->buffer)
toks = q->buffer;
- toks -= L2T(q, len);
+ toks -= (s64) psched_l2t_ns(&q->rate, len);
if ((toks|ptoks) >= 0) {
skb = qdisc_dequeue_peeked(q->qdisc);
@@ -214,7 +213,7 @@ static void tbf_reset(struct Qdisc *sch)
qdisc_reset(q->qdisc);
sch->q.qlen = 0;
- q->t_c = psched_get_time();
+ q->t_c = ktime_to_ns(ktime_get());
q->tokens = q->buffer;
q->ptokens = q->mtu;
qdisc_watchdog_cancel(&q->watchdog);
@@ -295,12 +294,17 @@ static int tbf_change(struct Qdisc *sch, struct nlattr *opt)
q->limit = qopt->limit;
q->mtu = qopt->mtu;
q->max_size = max_size;
- q->buffer = qopt->buffer;
+ q->buffer = PSCHED_TICKS2NS(qopt->buffer);
q->tokens = q->buffer;
q->ptokens = q->mtu;
- swap(q->R_tab, rtab);
- swap(q->P_tab, ptab);
+ psched_ratecfg_precompute(&q->rate, rtab->rate.rate);
+ if (ptab) {
+ psched_ratecfg_precompute(&q->peak, ptab->rate.rate);
+ q->peak_present = true;
+ } else {
+ q->peak_present = false;
+ }
sch_tree_unlock(sch);
err = 0;
@@ -319,7 +323,7 @@ static int tbf_init(struct Qdisc *sch, struct nlattr *opt)
if (opt == NULL)
return -EINVAL;
- q->t_c = psched_get_time();
+ q->t_c = ktime_to_ns(ktime_get());
qdisc_watchdog_init(&q->watchdog, sch);
q->qdisc = &noop_qdisc;
@@ -331,12 +335,6 @@ static void tbf_destroy(struct Qdisc *sch)
struct tbf_sched_data *q = qdisc_priv(sch);
qdisc_watchdog_cancel(&q->watchdog);
-
- if (q->P_tab)
- qdisc_put_rtab(q->P_tab);
- if (q->R_tab)
- qdisc_put_rtab(q->R_tab);
-
qdisc_destroy(q->qdisc);
}
@@ -352,13 +350,13 @@ static int tbf_dump(struct Qdisc *sch, struct sk_buff *skb)
goto nla_put_failure;
opt.limit = q->limit;
- opt.rate = q->R_tab->rate;
- if (q->P_tab)
- opt.peakrate = q->P_tab->rate;
+ opt.rate.rate = psched_ratecfg_getrate(&q->rate);
+ if (q->peak_present)
+ opt.peakrate.rate = psched_ratecfg_getrate(&q->peak);
else
memset(&opt.peakrate, 0, sizeof(opt.peakrate));
opt.mtu = q->mtu;
- opt.buffer = q->buffer;
+ opt.buffer = PSCHED_NS2TICKS(q->buffer);
if (nla_put(skb, TCA_TBF_PARMS, sizeof(opt), &opt))
goto nla_put_failure;
--
1.8.1.2
* [patch net-next v2 07/11] tbf: ignore max_size check for gso skbs
From: Jiri Pirko @ 2013-02-08 18:59 UTC (permalink / raw)
To: netdev; +Cc: davem, edumazet, jhs, kuznet, j.vimal
This check caused bigger packets to be dropped incorrectly. Remove this
limitation for gso skbs.
Signed-off-by: Jiri Pirko <jiri@resnulli.us>
Acked-by: Eric Dumazet <edumazet@google.com>
---
net/sched/sch_tbf.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/net/sched/sch_tbf.c b/net/sched/sch_tbf.c
index e05710a..dc562a8 100644
--- a/net/sched/sch_tbf.c
+++ b/net/sched/sch_tbf.c
@@ -121,7 +121,7 @@ static int tbf_enqueue(struct sk_buff *skb, struct Qdisc *sch)
struct tbf_sched_data *q = qdisc_priv(sch);
int ret;
- if (qdisc_pkt_len(skb) > q->max_size)
+ if (qdisc_pkt_len(skb) > q->max_size && !skb_is_gso(skb))
return qdisc_reshape_fail(skb, sch);
ret = qdisc_enqueue(skb, q->qdisc);
--
1.8.1.2
* [patch net-next v2 08/11] tbf: fix value set for q->ptokens
From: Jiri Pirko @ 2013-02-08 18:59 UTC (permalink / raw)
To: netdev; +Cc: davem, edumazet, jhs, kuznet, j.vimal
q->ptokens is in ns and we are assigning q->mtu (a length in bytes) directly
to it. That is wrong. psched_l2t_ns() should be used to compute the correct value.
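
The following userspace sketch illustrates why the peak bucket depth has to be expressed in ns. The `sketch_*` names are hypothetical, and the peak rate is given directly as a mult/shift pair (262144 and 15 would correspond to 1 Gbit/s, i.e. 8 ns per byte); it models only the peak-rate part of the dequeue check.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical fixed-point rate: cost_ns = (len * mult) >> shift. */
struct sketch_rate {
	uint32_t mult;
	uint32_t shift;
};

static int64_t sketch_len_to_ns(const struct sketch_rate *r, uint32_t len)
{
	return (int64_t)(((uint64_t)len * r->mult) >> r->shift);
}

/*
 * Refill the peak bucket with elapsed_ns, cap it at the cost of one MTU
 * (converted to ns, not taken as a raw byte count), then charge the packet.
 * A negative result means the packet has to wait.
 */
static int64_t sketch_peak_tokens_after(const struct sketch_rate *peak,
					int64_t ptokens, int64_t elapsed_ns,
					uint32_t mtu, uint32_t len)
{
	int64_t depth = sketch_len_to_ns(peak, mtu);
	int64_t ptoks = ptokens + elapsed_ns;

	assert(elapsed_ns >= 0);
	if (ptoks > depth)
		ptoks = depth;
	return ptoks - sketch_len_to_ns(peak, len);
}
```

With an MTU of 1500 bytes at 1 Gbit/s the depth is 12000 ns, so a full-size packet conforms once the bucket has refilled. If the depth were capped at the raw value 1500 instead, the same packet would always leave the bucket negative (1500 - 12000), which is the kind of error the patch removes.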
Signed-off-by: Jiri Pirko <jiri@resnulli.us>
Acked-by: Eric Dumazet <edumazet@google.com>
---
net/sched/sch_tbf.c | 9 +++++----
1 file changed, 5 insertions(+), 4 deletions(-)
diff --git a/net/sched/sch_tbf.c b/net/sched/sch_tbf.c
index dc562a8..6e8b670 100644
--- a/net/sched/sch_tbf.c
+++ b/net/sched/sch_tbf.c
@@ -165,8 +165,8 @@ static struct sk_buff *tbf_dequeue(struct Qdisc *sch)
if (q->peak_present) {
ptoks = toks + q->ptokens;
- if (ptoks > (long)q->mtu)
- ptoks = q->mtu;
+ if (ptoks > (s64) psched_l2t_ns(&q->peak, q->mtu))
+ ptoks = (s64) psched_l2t_ns(&q->peak, q->mtu);
ptoks -= (s64) psched_l2t_ns(&q->peak, len);
}
toks += q->tokens;
@@ -215,7 +215,8 @@ static void tbf_reset(struct Qdisc *sch)
sch->q.qlen = 0;
q->t_c = ktime_to_ns(ktime_get());
q->tokens = q->buffer;
- q->ptokens = q->mtu;
+ if (q->peak_present)
+ q->ptokens = (s64) psched_l2t_ns(&q->peak, q->mtu);
qdisc_watchdog_cancel(&q->watchdog);
}
@@ -296,11 +297,11 @@ static int tbf_change(struct Qdisc *sch, struct nlattr *opt)
q->max_size = max_size;
q->buffer = PSCHED_TICKS2NS(qopt->buffer);
q->tokens = q->buffer;
- q->ptokens = q->mtu;
psched_ratecfg_precompute(&q->rate, rtab->rate.rate);
if (ptab) {
psched_ratecfg_precompute(&q->peak, ptab->rate.rate);
+ q->ptokens = (s64) psched_l2t_ns(&q->peak, q->mtu);
q->peak_present = true;
} else {
q->peak_present = false;
--
1.8.1.2
* [patch net-next v2 09/11] act_police: move struct tcf_police to act_police.c
From: Jiri Pirko @ 2013-02-08 18:59 UTC (permalink / raw)
To: netdev; +Cc: davem, edumazet, jhs, kuznet, j.vimal
It's not used anywhere else, so move it.
Signed-off-by: Jiri Pirko <jiri@resnulli.us>
---
include/net/act_api.h | 15 ---------------
net/sched/act_police.c | 15 +++++++++++++++
2 files changed, 15 insertions(+), 15 deletions(-)
diff --git a/include/net/act_api.h b/include/net/act_api.h
index 112c25c..06ef7e9 100644
--- a/include/net/act_api.h
+++ b/include/net/act_api.h
@@ -35,21 +35,6 @@ struct tcf_common {
#define tcf_lock common.tcfc_lock
#define tcf_rcu common.tcfc_rcu
-struct tcf_police {
- struct tcf_common common;
- int tcfp_result;
- u32 tcfp_ewma_rate;
- u32 tcfp_burst;
- u32 tcfp_mtu;
- u32 tcfp_toks;
- u32 tcfp_ptoks;
- psched_time_t tcfp_t_c;
- struct qdisc_rate_table *tcfp_R_tab;
- struct qdisc_rate_table *tcfp_P_tab;
-};
-#define to_police(pc) \
- container_of(pc, struct tcf_police, common)
-
struct tcf_hashinfo {
struct tcf_common **htab;
unsigned int hmask;
diff --git a/net/sched/act_police.c b/net/sched/act_police.c
index 8dbd695..378a649 100644
--- a/net/sched/act_police.c
+++ b/net/sched/act_police.c
@@ -22,6 +22,21 @@
#include <net/act_api.h>
#include <net/netlink.h>
+struct tcf_police {
+ struct tcf_common common;
+ int tcfp_result;
+ u32 tcfp_ewma_rate;
+ u32 tcfp_burst;
+ u32 tcfp_mtu;
+ u32 tcfp_toks;
+ u32 tcfp_ptoks;
+ psched_time_t tcfp_t_c;
+ struct qdisc_rate_table *tcfp_R_tab;
+ struct qdisc_rate_table *tcfp_P_tab;
+};
+#define to_police(pc) \
+ container_of(pc, struct tcf_police, common)
+
#define L2T(p, L) qdisc_l2t((p)->tcfp_R_tab, L)
#define L2T_P(p, L) qdisc_l2t((p)->tcfp_P_tab, L)
--
1.8.1.2
* [patch net-next v2 10/11] act_police: improved accuracy at high rates
From: Jiri Pirko @ 2013-02-08 18:59 UTC (permalink / raw)
To: netdev; +Cc: davem, edumazet, jhs, kuznet, j.vimal
The current act_police uses a rate table computed by the "tc" userspace program,
which has the following issue:
The rate table has 256 entries to map packet lengths to
tokens (time units). With TSO-sized packets, the 256-entry granularity
leads to loss/gain of rate, making the token bucket inaccurate.
Thus, instead of relying on the rate table, this patch explicitly computes
the time and accounts for packet transmission times with nanosecond
granularity.
This is a followup to commit 56b765b79e9a78dc7d3f8850ba5e5567205a3ecd
("htb: improved accuracy at high rates").
Signed-off-by: Jiri Pirko <jiri@resnulli.us>
---
net/sched/act_police.c | 119 +++++++++++++++++++++++--------------------------
1 file changed, 57 insertions(+), 62 deletions(-)
diff --git a/net/sched/act_police.c b/net/sched/act_police.c
index 378a649..8723183 100644
--- a/net/sched/act_police.c
+++ b/net/sched/act_police.c
@@ -26,20 +26,19 @@ struct tcf_police {
struct tcf_common common;
int tcfp_result;
u32 tcfp_ewma_rate;
- u32 tcfp_burst;
+ s64 tcfp_burst;
u32 tcfp_mtu;
- u32 tcfp_toks;
- u32 tcfp_ptoks;
+ s64 tcfp_toks;
+ s64 tcfp_ptoks;
psched_time_t tcfp_t_c;
- struct qdisc_rate_table *tcfp_R_tab;
- struct qdisc_rate_table *tcfp_P_tab;
+ struct psched_ratecfg rate;
+ bool rate_present;
+ struct psched_ratecfg peak;
+ bool peak_present;
};
#define to_police(pc) \
container_of(pc, struct tcf_police, common)
-#define L2T(p, L) qdisc_l2t((p)->tcfp_R_tab, L)
-#define L2T_P(p, L) qdisc_l2t((p)->tcfp_P_tab, L)
-
#define POL_TAB_MASK 15
static struct tcf_common *tcf_police_ht[POL_TAB_MASK + 1];
static u32 police_idx_gen;
@@ -123,10 +122,6 @@ static void tcf_police_destroy(struct tcf_police *p)
write_unlock_bh(&police_lock);
gen_kill_estimator(&p->tcf_bstats,
&p->tcf_rate_est);
- if (p->tcfp_R_tab)
- qdisc_put_rtab(p->tcfp_R_tab);
- if (p->tcfp_P_tab)
- qdisc_put_rtab(p->tcfp_P_tab);
/*
* gen_estimator est_timer() might access p->tcf_lock
* or bstats, wait a RCU grace period before freeing p
@@ -154,7 +149,6 @@ static int tcf_act_police_locate(struct net *net, struct nlattr *nla,
struct nlattr *tb[TCA_POLICE_MAX + 1];
struct tc_police *parm;
struct tcf_police *police;
- struct qdisc_rate_table *R_tab = NULL, *P_tab = NULL;
int size;
if (nla == NULL)
@@ -197,21 +191,37 @@ static int tcf_act_police_locate(struct net *net, struct nlattr *nla,
if (bind)
police->tcf_bindcnt = 1;
override:
+ spin_lock_bh(&police->tcf_lock);
+ police->tcfp_mtu = parm->mtu;
+ police->rate_present = false;
+ police->peak_present = false;
if (parm->rate.rate) {
+ struct qdisc_rate_table *tab;
+
err = -ENOMEM;
- R_tab = qdisc_get_rtab(&parm->rate, tb[TCA_POLICE_RATE]);
- if (R_tab == NULL)
- goto failure;
+ tab = qdisc_get_rtab(&parm->rate, tb[TCA_POLICE_RATE]);
+ if (!tab)
+ goto failure_unlock;
+ police->rate_present = true;
+ psched_ratecfg_precompute(&police->rate, tab->rate.rate);
+ if (!police->tcfp_mtu)
+ police->tcfp_mtu = 255 << tab->rate.cell_log;
+ qdisc_put_rtab(tab);
if (parm->peakrate.rate) {
- P_tab = qdisc_get_rtab(&parm->peakrate,
- tb[TCA_POLICE_PEAKRATE]);
- if (P_tab == NULL)
- goto failure;
+ tab = qdisc_get_rtab(&parm->peakrate,
+ tb[TCA_POLICE_PEAKRATE]);
+ if (!tab)
+ goto failure_unlock;
+ police->peak_present = true;
+ psched_ratecfg_precompute(&police->peak,
+ tab->rate.rate);
+ qdisc_put_rtab(tab);
}
}
+ if (!police->tcfp_mtu)
+ police->tcfp_mtu = ~0;
- spin_lock_bh(&police->tcf_lock);
if (est) {
err = gen_replace_estimator(&police->tcf_bstats,
&police->tcf_rate_est,
@@ -227,26 +237,13 @@ override:
}
/* No failure allowed after this point */
- if (R_tab != NULL) {
- qdisc_put_rtab(police->tcfp_R_tab);
- police->tcfp_R_tab = R_tab;
- }
- if (P_tab != NULL) {
- qdisc_put_rtab(police->tcfp_P_tab);
- police->tcfp_P_tab = P_tab;
- }
-
if (tb[TCA_POLICE_RESULT])
police->tcfp_result = nla_get_u32(tb[TCA_POLICE_RESULT]);
- police->tcfp_toks = police->tcfp_burst = parm->burst;
- police->tcfp_mtu = parm->mtu;
- if (police->tcfp_mtu == 0) {
- police->tcfp_mtu = ~0;
- if (police->tcfp_R_tab)
- police->tcfp_mtu = 255<<police->tcfp_R_tab->rate.cell_log;
- }
- if (police->tcfp_P_tab)
- police->tcfp_ptoks = L2T_P(police, police->tcfp_mtu);
+ police->tcfp_burst = PSCHED_TICKS2NS(parm->burst);
+ police->tcfp_toks = police->tcfp_burst;
+ if (police->peak_present)
+ police->tcfp_ptoks = (s64) psched_l2t_ns(&police->peak,
+ police->tcfp_mtu);
police->tcf_action = parm->action;
if (tb[TCA_POLICE_AVRATE])
@@ -256,7 +253,7 @@ override:
if (ret != ACT_P_CREATED)
return ret;
- police->tcfp_t_c = psched_get_time();
+ police->tcfp_t_c = ktime_to_ns(ktime_get());
police->tcf_index = parm->index ? parm->index :
tcf_hash_new_index(&police_idx_gen, &police_hash_info);
h = tcf_hash(police->tcf_index, POL_TAB_MASK);
@@ -270,11 +267,6 @@ override:
failure_unlock:
spin_unlock_bh(&police->tcf_lock);
-failure:
- if (P_tab)
- qdisc_put_rtab(P_tab);
- if (R_tab)
- qdisc_put_rtab(R_tab);
if (ret == ACT_P_CREATED)
kfree(police);
return err;
@@ -303,8 +295,8 @@ static int tcf_act_police(struct sk_buff *skb, const struct tc_action *a,
{
struct tcf_police *police = a->priv;
psched_time_t now;
- long toks;
- long ptoks = 0;
+ s64 toks;
+ s64 ptoks = 0;
spin_lock(&police->tcf_lock);
@@ -320,24 +312,27 @@ static int tcf_act_police(struct sk_buff *skb, const struct tc_action *a,
}
if (qdisc_pkt_len(skb) <= police->tcfp_mtu) {
- if (police->tcfp_R_tab == NULL) {
+ if (!police->rate_present) {
spin_unlock(&police->tcf_lock);
return police->tcfp_result;
}
- now = psched_get_time();
- toks = psched_tdiff_bounded(now, police->tcfp_t_c,
- police->tcfp_burst);
- if (police->tcfp_P_tab) {
+ now = ktime_to_ns(ktime_get());
+ toks = min_t(s64, now - police->tcfp_t_c,
+ police->tcfp_burst);
+ if (police->peak_present) {
ptoks = toks + police->tcfp_ptoks;
- if (ptoks > (long)L2T_P(police, police->tcfp_mtu))
- ptoks = (long)L2T_P(police, police->tcfp_mtu);
- ptoks -= L2T_P(police, qdisc_pkt_len(skb));
+ if (ptoks > (s64) psched_l2t_ns(&police->peak,
+ police->tcfp_mtu))
+ ptoks = (s64) psched_l2t_ns(&police->peak,
+ police->tcfp_mtu);
+ ptoks -= (s64) psched_l2t_ns(&police->peak,
+ qdisc_pkt_len(skb));
}
toks += police->tcfp_toks;
- if (toks > (long)police->tcfp_burst)
+ if (toks > police->tcfp_burst)
toks = police->tcfp_burst;
- toks -= L2T(police, qdisc_pkt_len(skb));
+ toks -= (s64) psched_l2t_ns(&police->rate, qdisc_pkt_len(skb));
if ((toks|ptoks) >= 0) {
police->tcfp_t_c = now;
police->tcfp_toks = toks;
@@ -363,15 +358,15 @@ tcf_act_police_dump(struct sk_buff *skb, struct tc_action *a, int bind, int ref)
.index = police->tcf_index,
.action = police->tcf_action,
.mtu = police->tcfp_mtu,
- .burst = police->tcfp_burst,
+ .burst = PSCHED_NS2TICKS(police->tcfp_burst),
.refcnt = police->tcf_refcnt - ref,
.bindcnt = police->tcf_bindcnt - bind,
};
- if (police->tcfp_R_tab)
- opt.rate = police->tcfp_R_tab->rate;
- if (police->tcfp_P_tab)
- opt.peakrate = police->tcfp_P_tab->rate;
+ if (police->rate_present)
+ opt.rate.rate = psched_ratecfg_getrate(&police->rate);
+ if (police->peak_present)
+ opt.peakrate.rate = psched_ratecfg_getrate(&police->peak);
if (nla_put(skb, TCA_POLICE_TBF, sizeof(opt), &opt))
goto nla_put_failure;
if (police->tcfp_result &&
--
1.8.1.2
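
For readers following the arithmetic, here is a minimal userspace sketch of the nanosecond token bucket that the hunk above implements. It is a model under stated assumptions, not the kernel code: struct policer, l2t_ns() and police_conforms() are hypothetical names, the rate is given in bytes per second, the clock is passed in as an argument instead of being read with ktime_get(), and the MTU check that precedes the bucket update in tcf_act_police() is left out.

```c
#include <assert.h>
#include <stdint.h>

#define NSEC_PER_SEC 1000000000ULL

/* Hypothetical userspace stand-in for the fields used by the policer. */
struct policer {
	int64_t burst_ns;   /* bucket depth, expressed as time in ns */
	int64_t toks;       /* tokens left in the rate bucket (ns) */
	int64_t ptoks;      /* tokens left in the peak bucket (ns) */
	int64_t t_c;        /* time of the last conforming packet (ns) */
	uint64_t rate_Bps;  /* committed rate, bytes per second */
	uint64_t peak_Bps;  /* peak rate, bytes per second; 0 = not set */
	uint32_t mtu;
};

/* Time in ns needed to send len bytes at rate_Bps (assumed helper). */
static int64_t l2t_ns(uint64_t rate_Bps, uint32_t len)
{
	assert(rate_Bps > 0);
	return (int64_t)((uint64_t)len * NSEC_PER_SEC / rate_Bps);
}

/* Returns 1 if the packet conforms (state is updated), 0 otherwise. */
static int police_conforms(struct policer *p, int64_t now, uint32_t len)
{
	int64_t toks = now - p->t_c;
	int64_t ptoks = 0;

	/* Elapsed time refills the bucket, but never beyond the burst. */
	if (toks > p->burst_ns)
		toks = p->burst_ns;

	if (p->peak_Bps) {
		int64_t pmax = l2t_ns(p->peak_Bps, p->mtu);

		ptoks = toks + p->ptoks;
		if (ptoks > pmax)
			ptoks = pmax;
		ptoks -= l2t_ns(p->peak_Bps, len);
	}

	toks += p->toks;
	if (toks > p->burst_ns)
		toks = p->burst_ns;
	toks -= l2t_ns(p->rate_Bps, len);

	/* Both buckets are non-negative iff the OR has a clear sign bit. */
	if ((toks | ptoks) >= 0) {
		p->t_c = now;
		p->toks = toks;
		p->ptoks = ptoks;
		return 1;
	}
	return 0;
}
```

With a 1 Gbit/s rate (125000000 bytes/s) and a burst worth two 1500-byte packets, two back-to-back packets conform, a third is rejected, and one more is accepted once 12000 ns have passed.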
* [patch net-next v2 11/11] act_police: remove <=mtu check for gso skbs
2013-02-08 18:59 [patch net-next v2 00/11] couple of net/sched fixes+improvements Jiri Pirko
` (9 preceding siblings ...)
2013-02-08 18:59 ` [patch net-next v2 10/11] act_police: improved accuracy at high rates Jiri Pirko
@ 2013-02-08 18:59 ` Jiri Pirko
10 siblings, 0 replies; 15+ messages in thread
From: Jiri Pirko @ 2013-02-08 18:59 UTC (permalink / raw)
To: netdev; +Cc: davem, edumazet, jhs, kuznet, j.vimal
This check caused bigger packets to be dropped incorrectly. Remove this
limitation for gso skbs.
Signed-off-by: Jiri Pirko <jiri@resnulli.us>
---
net/sched/act_police.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/net/sched/act_police.c b/net/sched/act_police.c
index 8723183..2a55cea 100644
--- a/net/sched/act_police.c
+++ b/net/sched/act_police.c
@@ -311,7 +311,7 @@ static int tcf_act_police(struct sk_buff *skb, const struct tc_action *a,
return police->tcf_action;
}
- if (qdisc_pkt_len(skb) <= police->tcfp_mtu) {
+ if (qdisc_pkt_len(skb) <= police->tcfp_mtu || skb_is_gso(skb)) {
if (!police->rate_present) {
spin_unlock(&police->tcf_lock);
return police->tcfp_result;
--
1.8.1.2
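
The one-line change above can be illustrated with a small userspace sketch. It assumes a hypothetical struct pkt carrying only a length and a gso_size field, with pkt_is_gso() standing in for skb_is_gso(); pkt_reaches_rate_check() is likewise an invented name. A GSO skb aggregates several segments, so its length is normally far above the MTU even though each segment on the wire will fit.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, stripped-down packet descriptor (not struct sk_buff). */
struct pkt {
	uint32_t len;       /* total length in bytes */
	uint16_t gso_size;  /* per-segment size; 0 means not a GSO packet */
};

/* Stand-in for skb_is_gso(): a non-zero gso_size marks a GSO packet. */
static bool pkt_is_gso(const struct pkt *p)
{
	assert(p != NULL);
	return p->gso_size != 0;
}

/*
 * Mirrors the patched condition: a packet goes on to the token bucket
 * if it fits the MTU, or if it is GSO and will be segmented later.
 */
static bool pkt_reaches_rate_check(const struct pkt *p, uint32_t mtu)
{
	return p->len <= mtu || pkt_is_gso(p);
}
```

A 64000-byte GSO packet with an MTU of 1500 now reaches the rate check, while a non-GSO packet of the same size is still treated as over the limit.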
* Re: [patch net-next v2 06/11] tbf: improved accuracy at high rates
2013-02-08 18:59 ` [patch net-next v2 06/11] tbf: improved accuracy at high rates Jiri Pirko
@ 2013-02-08 19:09 ` Eric Dumazet
0 siblings, 0 replies; 15+ messages in thread
From: Eric Dumazet @ 2013-02-08 19:09 UTC (permalink / raw)
To: Jiri Pirko; +Cc: netdev, davem, edumazet, jhs, kuznet, j.vimal
On Fri, 2013-02-08 at 19:59 +0100, Jiri Pirko wrote:
> Current TBF uses rate table computed by the "tc" userspace program,
> which has the following issue:
>
> The rate table has 256 entries to map packet lengths to
> token (time units). With TSO sized packets, the 256 entry granularity
> leads to loss/gain of rate, making the token bucket inaccurate.
>
> Thus, instead of relying on rate table, this patch explicitly computes
> the time and accounts for packet transmission times with nanosecond
> granularity.
>
> This is a followup to 56b765b79e9a78dc7d3f8850ba5e5567205a3ecd
>
> Signed-off-by: Jiri Pirko <jiri@resnulli.us>
> ---
Acked-by: Eric Dumazet <edumazet@google.com>
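
The granularity problem described in the quoted changelog can be shown with a simplified userspace model. This is a sketch under assumptions, not the tc or kernel implementation: l2t_exact_ns() and l2t_table_ns() are hypothetical names, the rate is in bytes per second, every length in a table slot is charged the cost of the slot's upper bound, and lengths past the last slot are simply clamped.

```c
#include <assert.h>
#include <stdint.h>

#define NSEC_PER_SEC 1000000000ULL
#define RTAB_SLOTS   256

/* Exact transmission time in ns for len bytes at rate_Bps bytes/s. */
static uint64_t l2t_exact_ns(uint64_t rate_Bps, uint32_t len)
{
	assert(rate_Bps > 0);
	return (uint64_t)len * NSEC_PER_SEC / rate_Bps;
}

/*
 * Simplified rate-table model: all lengths with the same value of
 * (len >> cell_log) share one slot and therefore one cost.
 */
static uint64_t l2t_table_ns(uint64_t rate_Bps, uint32_t len,
			     unsigned int cell_log)
{
	uint32_t slot = len >> cell_log;

	if (slot >= RTAB_SLOTS)
		slot = RTAB_SLOTS - 1;
	return l2t_exact_ns(rate_Bps, (slot + 1) << cell_log);
}
```

At 1 Gbit/s with cell_log = 8 (256 slots of 256 bytes, enough to cover 64 KB packets), a 1300-byte and a 1500-byte packet are charged the same 12288 ns, whereas the exact costs are 10400 ns and 12000 ns.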
* Re: [patch net-next v2 10/11] act_police: improved accuracy at high rates
2013-02-08 18:59 ` [patch net-next v2 10/11] act_police: improved accuracy at high rates Jiri Pirko
@ 2013-02-08 19:12 ` Eric Dumazet
2013-02-08 22:16 ` Jiri Pirko
0 siblings, 1 reply; 15+ messages in thread
From: Eric Dumazet @ 2013-02-08 19:12 UTC (permalink / raw)
To: Jiri Pirko; +Cc: netdev, davem, edumazet, jhs, kuznet, j.vimal
On Fri, 2013-02-08 at 19:59 +0100, Jiri Pirko wrote:
> Current act_police uses rate table computed by the "tc" userspace program,
> which has the following issue:
>
> The rate table has 256 entries to map packet lengths to
> token (time units). With TSO sized packets, the 256 entry granularity
> leads to loss/gain of rate, making the token bucket inaccurate.
>
> Thus, instead of relying on rate table, this patch explicitly computes
> the time and accounts for packet transmission times with nanosecond
> granularity.
>
> This is a followup to 56b765b79e9a78dc7d3f8850ba5e5567205a3ecd
>
> Signed-off-by: Jiri Pirko <jiri@resnulli.us>
> ---
> net/sched/act_police.c | 119 +++++++++++++++++++++++--------------------------
> 1 file changed, 57 insertions(+), 62 deletions(-)
>
> diff --git a/net/sched/act_police.c b/net/sched/act_police.c
> index 378a649..8723183 100644
> --- a/net/sched/act_police.c
> +++ b/net/sched/act_police.c
> @@ -26,20 +26,19 @@ struct tcf_police {
> struct tcf_common common;
> int tcfp_result;
> u32 tcfp_ewma_rate;
> - u32 tcfp_burst;
> + s64 tcfp_burst;
> u32 tcfp_mtu;
> - u32 tcfp_toks;
> - u32 tcfp_ptoks;
> + s64 tcfp_toks;
> + s64 tcfp_ptoks;
> psched_time_t tcfp_t_c;
> - struct qdisc_rate_table *tcfp_R_tab;
> - struct qdisc_rate_table *tcfp_P_tab;
> + struct psched_ratecfg rate;
> + bool rate_present;
> + struct psched_ratecfg peak;
> + bool peak_present;
> };
> #define to_police(pc) \
> container_of(pc, struct tcf_police, common)
>
> -#define L2T(p, L) qdisc_l2t((p)->tcfp_R_tab, L)
> -#define L2T_P(p, L) qdisc_l2t((p)->tcfp_P_tab, L)
> -
> #define POL_TAB_MASK 15
> static struct tcf_common *tcf_police_ht[POL_TAB_MASK + 1];
> static u32 police_idx_gen;
> @@ -123,10 +122,6 @@ static void tcf_police_destroy(struct tcf_police *p)
> write_unlock_bh(&police_lock);
> gen_kill_estimator(&p->tcf_bstats,
> &p->tcf_rate_est);
> - if (p->tcfp_R_tab)
> - qdisc_put_rtab(p->tcfp_R_tab);
> - if (p->tcfp_P_tab)
> - qdisc_put_rtab(p->tcfp_P_tab);
> /*
> * gen_estimator est_timer() might access p->tcf_lock
> * or bstats, wait a RCU grace period before freeing p
> @@ -154,7 +149,6 @@ static int tcf_act_police_locate(struct net *net, struct nlattr *nla,
> struct nlattr *tb[TCA_POLICE_MAX + 1];
> struct tc_police *parm;
> struct tcf_police *police;
> - struct qdisc_rate_table *R_tab = NULL, *P_tab = NULL;
> int size;
>
> if (nla == NULL)
> @@ -197,21 +191,37 @@ static int tcf_act_police_locate(struct net *net, struct nlattr *nla,
> if (bind)
> police->tcf_bindcnt = 1;
> override:
> + spin_lock_bh(&police->tcf_lock);
> + police->tcfp_mtu = parm->mtu;
> + police->rate_present = false;
> + police->peak_present = false;
> if (parm->rate.rate) {
> + struct qdisc_rate_table *tab;
> +
> err = -ENOMEM;
> - R_tab = qdisc_get_rtab(&parm->rate, tb[TCA_POLICE_RATE]);
> - if (R_tab == NULL)
> - goto failure;
> + tab = qdisc_get_rtab(&parm->rate, tb[TCA_POLICE_RATE]);
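This patch was not tested; it cannot possibly work.

qdisc_get_rtab() is now called with police->tcf_lock held, and it
allocates with GFP_KERNEL, which may sleep:

spin_lock_bh();
rtab = kmalloc(sizeof(*rtab), GFP_KERNEL);

That should crash or complain loudly.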
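
One common way out of this kind of problem is to do the sleeping allocation before taking the lock. The sketch below is a userspace model only, with every name invented for illustration: atomic_depth, model_spin_lock(), model_spin_unlock() and model_alloc_may_sleep() imitate a spinlock and an allocator that refuses to run in atomic context, and cfg_set_rate() shows the allocate-first ordering. It does not claim to be how the reposted patch solved it.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Greater than 0 while the model "spinlock" is held (atomic context). */
static int atomic_depth;

static void model_spin_lock(void)
{
	atomic_depth++;
}

static void model_spin_unlock(void)
{
	assert(atomic_depth > 0);
	atomic_depth--;
}

/*
 * Model of an allocation that may sleep: it refuses to run while the
 * lock is held. This imitates the debug check only; it is not kmalloc().
 */
static void *model_alloc_may_sleep(size_t n)
{
	if (atomic_depth > 0)
		return NULL;
	return malloc(n);
}

struct rate_tab {
	uint64_t rate_Bps;
};

struct police_cfg {
	uint64_t rate_Bps;
	int rate_present;
};

/* Allocate first, then take the lock only for the short update. */
static int cfg_set_rate(struct police_cfg *cfg, uint64_t rate_Bps)
{
	struct rate_tab *tab = model_alloc_may_sleep(sizeof(*tab));

	if (!tab)
		return -1;
	tab->rate_Bps = rate_Bps;

	model_spin_lock();
	cfg->rate_Bps = tab->rate_Bps;
	cfg->rate_present = 1;
	model_spin_unlock();

	free(tab);
	return 0;
}
```

Calling model_alloc_may_sleep() between model_spin_lock() and model_spin_unlock() returns NULL in this model, which is the userspace analogue of the warning the kernel prints when the relevant debug option is enabled.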
* Re: [patch net-next v2 10/11] act_police: improved accuracy at high rates
2013-02-08 19:12 ` Eric Dumazet
@ 2013-02-08 22:16 ` Jiri Pirko
0 siblings, 0 replies; 15+ messages in thread
From: Jiri Pirko @ 2013-02-08 22:16 UTC (permalink / raw)
To: Eric Dumazet; +Cc: netdev, davem, edumazet, jhs, kuznet, j.vimal
Fri, Feb 08, 2013 at 08:12:35PM CET, eric.dumazet@gmail.com wrote:
>On Fri, 2013-02-08 at 19:59 +0100, Jiri Pirko wrote:
>
>This patch was not tested, it cannot possibly work
>
>spin_lock_bh();
>rtab = kmalloc(sizeof(*rtab), GFP_KERNEL);
>
>should crash or complain loudly.
Thanks, you are right, I had this debug option disabled. Will repost.