* [PATCH/RFC 00/11] HFSC patches
@ 2011-11-05  2:32 Michal Soltys
  2011-11-05  2:32 ` [PATCH 01/11] sch_hfsc.c: update_d() fixup Michal Soltys
                   ` (11 more replies)
  0 siblings, 12 replies; 15+ messages in thread
From: Michal Soltys @ 2011-11-05  2:32 UTC (permalink / raw)
  To: kaber; +Cc: davem, netdev

These are most of the patches I've been sitting on for a while. For the most
part they are either small corrections or simplifications (marked with s) - but
there are also some new things (marked with !), probably the most interesting
being #8.

All changes are richly commented in their respective commit messages. I've
been using them for a while - so unless I missed some subtlety, everything
should be fine.

Apart from these, there's still one subtle thing to do w.r.t. cl_cvtmin
during init_vf(), as this value lags behind the actual situation at the time
of enqueue.

On a side note, I was thinking about something like hfsc-strict or so - where
the [uplink] interface could be upperlimited at the hfsc qdisc level, but all
the per-class upperlimit machinery would otherwise be gone. Not sure if anyone
would even be interested in something like that at all.


Michal Soltys (11):
  sch_hfsc.c: update_d() fixup
  sch_hfsc.c: go passive after cl_vt update in update_vf()
  sch_hfsc.c: rtsc_min() convex/convex case fixup
s sch_hfsc.c: remove initial check in vttree_get_minvt() (optional)
s sch_hfsc.c: don't subtract from cl_vtoff and cl_cvtoff
s sch_hfsc.c: seg_y2x(), rtsc_y2x(), hfsc_change_class() minor changes
s sch_hfsc.c: make cl_vt keep all periods in itself
! sch_hfsc.c: update speed table, add explanations, increase accuracy
s sch_hfsc.c: split update_vf()
! sch_hfsc.c: make sure classes are able to use 1st segment on fresh backlog period
s sch_hfsc.c: remove cl_vtadj, shift virtual curve instead

 net/sched/sch_hfsc.c |  373 +++++++++++++++++++++++++++-----------------------
 1 files changed, 203 insertions(+), 170 deletions(-)

-- 
1.7.7.1


* [PATCH 01/11] sch_hfsc.c: update_d() fixup
  2011-11-05  2:32 [PATCH/RFC 00/11] HFSC patches Michal Soltys
@ 2011-11-05  2:32 ` Michal Soltys
  2011-11-05  2:32 ` [PATCH 02/11] sch_hfsc.c: go passive after cl_vt update in update_vf() Michal Soltys
                   ` (10 subsequent siblings)
  11 siblings, 0 replies; 15+ messages in thread
From: Michal Soltys @ 2011-11-05  2:32 UTC (permalink / raw)
  To: kaber; +Cc: davem, netdev

The deadline time is generated from the total real-time work done so far
plus the size of the next packet.

Now, when a packet gets dequeued by the link-sharing criterion, we have to
update the deadline time to match the new situation (that's the job of
update_d()) - but to do that properly, we have to subtract the size of the
packet just dequeued when calling rtsc_y2x().

This is stated very clearly in the HFSC paper, but (probably) got missed
during the altq days (or due to some patch at some point).
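
For illustration only (made-up numbers, not from the patch): with
cl_cumul = 3000, curr_len = 1500 and next_len = 1000, the argument fed to
the deadline curve changes from 3000 + 1000 = 4000 (old code) to
3000 - 1500 + 1000 = 2500 (fixed code).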

Signed-off-by: Michal Soltys <soltys@ziu.info>
---
 net/sched/sch_hfsc.c |   14 ++++++++------
 1 files changed, 8 insertions(+), 6 deletions(-)

diff --git a/net/sched/sch_hfsc.c b/net/sched/sch_hfsc.c
index 6488e64..261accc 100644
--- a/net/sched/sch_hfsc.c
+++ b/net/sched/sch_hfsc.c
@@ -650,9 +650,9 @@ update_ed(struct hfsc_class *cl, unsigned int next_len)
 }
 
 static inline void
-update_d(struct hfsc_class *cl, unsigned int next_len)
+update_d(struct hfsc_class *cl, unsigned int curr_len, unsigned int next_len)
 {
-	cl->cl_d = rtsc_y2x(&cl->cl_deadline, cl->cl_cumul + next_len);
+	cl->cl_d = rtsc_y2x(&cl->cl_deadline, cl->cl_cumul - curr_len + next_len);
 }
 
 static inline void
@@ -1610,7 +1610,7 @@ hfsc_dequeue(struct Qdisc *sch)
 	struct hfsc_class *cl;
 	struct sk_buff *skb;
 	u64 cur_time;
-	unsigned int next_len;
+	unsigned int curr_len, next_len;
 	int realtime = 0;
 
 	if (sch->q.qlen == 0)
@@ -1640,14 +1640,16 @@ hfsc_dequeue(struct Qdisc *sch)
 	}
 
 	skb = qdisc_dequeue_peeked(cl->qdisc);
 	if (skb == NULL) {
 		qdisc_warn_nonwc("HFSC", cl->qdisc);
 		return NULL;
 	}
 
-	update_vf(cl, qdisc_pkt_len(skb), cur_time);
+	curr_len = qdisc_pkt_len(skb);
+
+	update_vf(cl, curr_len, cur_time);
 	if (realtime)
-		cl->cl_cumul += qdisc_pkt_len(skb);
+		cl->cl_cumul += curr_len;
 
 	if (cl->qdisc->q.qlen != 0) {
 		if (cl->cl_flags & HFSC_RSC) {
@@ -1656,7 +1658,7 @@ hfsc_dequeue(struct Qdisc *sch)
 			if (realtime)
 				update_ed(cl, next_len);
 			else
-				update_d(cl, next_len);
+				update_d(cl, curr_len, next_len);
 		}
 	} else {
 		/* the class becomes passive */
-- 
1.7.7.1


* [PATCH 02/11] sch_hfsc.c: go passive after cl_vt update in update_vf()
  2011-11-05  2:32 [PATCH/RFC 00/11] HFSC patches Michal Soltys
  2011-11-05  2:32 ` [PATCH 01/11] sch_hfsc.c: update_d() fixup Michal Soltys
@ 2011-11-05  2:32 ` Michal Soltys
  2011-11-05  2:32 ` [PATCH 03/11] sch_hfsc.c: rtsc_min() convex/convex case fixup Michal Soltys
                   ` (9 subsequent siblings)
  11 siblings, 0 replies; 15+ messages in thread
From: Michal Soltys @ 2011-11-05  2:32 UTC (permalink / raw)
  To: kaber; +Cc: davem, netdev

In contrast to RT and UL curves - which are based on actual time and can
always be properly initialized - LS is based on virtual time, which we
initialize manually, in one of three possible ways (a condensed sketch of
the corresponding init_vf() logic follows the list):

1. (max + min) / 2 of active siblings (min must not be upperlimited)
2. do not adjust existing vt, if parent has the same backlog period and
   newly calculated vt is smaller
3. initialize to cumulative cvtmax of all periods
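
A condensed sketch of the relevant init_vf() logic (simplified, not the
verbatim kernel code; (1)-(3) refer to the list above):

	max_cl = ...;	/* active sibling with the highest vt, if any */
	if (max_cl != NULL) {
		vt = max_cl->cl_vt;
		if (cl->cl_parent->cl_cvtmin != 0)		/* (1) */
			vt = (cl->cl_parent->cl_cvtmin + vt) / 2;
		if (cl->cl_parent->cl_vtperiod != cl->cl_parentperiod
		    || vt > cl->cl_vt)				/* (2) */
			cl->cl_vt = vt;
	} else {
		/* first child of a new parent backlog period:
		   start from the cumulative cvtmax (3) */
	}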

In the HFSC paper, update_vf() was not responsible for setting the class
passive - it only updated link-sharing related state. So: update chores
first, passivity second (if applicable).

In one of the patches to altq, setting the class passive was merged into
update_vf(), but in doing so, the go_passive condition got checked /before/
the vt update.

This conflicts with point (2.), as the vt tested at that point doesn't
reflect the situation after the last dequeue - and we update the virtual
time curve later, after updating vt itself.

It also kind of conflicts with (3.) for single-packet backlog periods, as
vt will never move to the right while the total work keeps increasing (this
is handled by rtsc_min() conditions though, so the results are still
generally valid).

Signed-off-by: Michal Soltys <soltys@ziu.info>
---
 net/sched/sch_hfsc.c |   33 +++++++++++++++++----------------
 1 files changed, 17 insertions(+), 16 deletions(-)

diff --git a/net/sched/sch_hfsc.c b/net/sched/sch_hfsc.c
index 261accc..6b73725 100644
--- a/net/sched/sch_hfsc.c
+++ b/net/sched/sch_hfsc.c
@@ -776,6 +776,22 @@ update_vf(struct hfsc_class *cl, unsigned int len, u64 cur_time)
 		if (!(cl->cl_flags & HFSC_FSC) || cl->cl_nactive == 0)
 			continue;
 
+		/* update vt */
+		cl->cl_vt = rtsc_y2x(&cl->cl_virtual, cl->cl_total)
+			    - cl->cl_vtoff + cl->cl_vtadj;
+
+		/*
+		 * fixup vt due to upperlimit (if applicable)
+		 *
+		 * if vt of the class is smaller than cvtmin,
+		 * the class was skipped in the past due to non-fit.
+		 * if so, we need to adjust vtadj.
+		 */
+		if (cl->cl_vt < cl->cl_parent->cl_cvtmin) {
+			cl->cl_vtadj += cl->cl_parent->cl_cvtmin - cl->cl_vt;
+			cl->cl_vt = cl->cl_parent->cl_cvtmin;
+		}
+
 		if (go_passive && --cl->cl_nactive == 0)
 			go_passive = 1;
 		else
@@ -797,25 +813,10 @@ update_vf(struct hfsc_class *cl, unsigned int len, u64 cur_time)
 			continue;
 		}
 
-		/*
-		 * update vt and f
-		 */
-		cl->cl_vt = rtsc_y2x(&cl->cl_virtual, cl->cl_total)
-			    - cl->cl_vtoff + cl->cl_vtadj;
-
-		/*
-		 * if vt of the class is smaller than cvtmin,
-		 * the class was skipped in the past due to non-fit.
-		 * if so, we need to adjust vtadj.
-		 */
-		if (cl->cl_vt < cl->cl_parent->cl_cvtmin) {
-			cl->cl_vtadj += cl->cl_parent->cl_cvtmin - cl->cl_vt;
-			cl->cl_vt = cl->cl_parent->cl_cvtmin;
-		}
-
 		/* update the vt tree */
 		vttree_update(cl);
 
+		/* update f */
 		if (cl->cl_flags & HFSC_USC) {
 			cl->cl_myf = cl->cl_myfadj + rtsc_y2x(&cl->cl_ulimit,
 							      cl->cl_total);
-- 
1.7.7.1


* [PATCH 03/11] sch_hfsc.c: rtsc_min() convex/convex case fixup
  2011-11-05  2:32 [PATCH/RFC 00/11] HFSC patches Michal Soltys
  2011-11-05  2:32 ` [PATCH 01/11] sch_hfsc.c: update_d() fixup Michal Soltys
  2011-11-05  2:32 ` [PATCH 02/11] sch_hfsc.c: go passive after cl_vt update in update_vf() Michal Soltys
@ 2011-11-05  2:32 ` Michal Soltys
  2011-11-05  2:32 ` [PATCH 04/11] sch_hfsc.c: remove initial check in vttree_get_minvt() (optional) Michal Soltys
                   ` (8 subsequent siblings)
  11 siblings, 0 replies; 15+ messages in thread
From: Michal Soltys @ 2011-11-05  2:32 UTC (permalink / raw)
  To: kaber; +Cc: davem, netdev

The condition which chooses the smaller curve in the convex/convex case is
a bit too simple. If we actually have intersecting curves and base the test
on the initial (x,y) instead of (x+dx,y+dy), then we can choose the wrong
curve:

                     +
                *   +
               *++++
       x,y  ++*+
    new ++++ *
            *
        ****
old ****

From the ASCII art above, (x,y) will win the comparison and the old curve
will be chosen, but it's the new one that should be selected (as its
infinite 2nd segment is below the old one).
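
A concrete instance (made-up numbers): let the old curve be a line through
(0,0) with slope 10, and the new convex curve start at (x,y) = (1,12) with
a 1st segment of slope 2 over dx = 2 (so dy = 4, junction at (3,16)) and a
2nd segment of slope 4. The old test compares rtsc_x2y(old, 1) = 10 with
y = 12 and keeps the old curve; the fixed test compares rtsc_x2y(old, 3)
= 30 with y + dy = 16 and takes the new curve - correctly, since its 2nd
segment (slope 4 < 10) stays below the old curve from the junction on, at
the cost of a brief stretch of the 1st segment being above.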

Another possibility is to flatten the 1st segment (see the HFSC paper),
which has the effect of making the *y2x() functions return the maximum x
such that f(x) = y. In this way the old curve would take precedence on the
initial part. Worth considering for another patch.

Signed-off-by: Michal Soltys <soltys@ziu.info>
---
 net/sched/sch_hfsc.c |   22 +++++++++++++++-------
 1 files changed, 15 insertions(+), 7 deletions(-)

diff --git a/net/sched/sch_hfsc.c b/net/sched/sch_hfsc.c
index 6b73725..710cbda 100644
--- a/net/sched/sch_hfsc.c
+++ b/net/sched/sch_hfsc.c
@@ -559,13 +559,21 @@ rtsc_min(struct runtime_sc *rtsc, struct internal_sc *isc, u64 x, u64 y)
 	u32 dsm;
 
 	if (isc->sm1 <= isc->sm2) {
-		/* service curve is convex */
-		y1 = rtsc_x2y(rtsc, x);
-		if (y1 < y)
-			/* the current rtsc is smaller */
-			return;
-		rtsc->x = x;
-		rtsc->y = y;
+		/* service curve is convex or linear */
+		y1 = rtsc_x2y(rtsc, x + isc->dx);
+		if (y1 > y + isc->dy) {
+			/*
+			 * sm1/sm2 junction point is below the old curve, so
+			 * update the curve with ours; note that for strictly
+			 * convex curves a brief part of the first segment
+			 * might be above, but it's important that 2nd infinite
+			 * segment wins (which is below the old curve in this
+			 * case)
+			 */
+			rtsc->x = x;
+			rtsc->y = y;
+		}
+		/* else: the current rtsc is smaller */
 		return;
 	}
 
-- 
1.7.7.1


* [PATCH 04/11] sch_hfsc.c: remove initial check in vttree_get_minvt() (optional)
  2011-11-05  2:32 [PATCH/RFC 00/11] HFSC patches Michal Soltys
                   ` (2 preceding siblings ...)
  2011-11-05  2:32 ` [PATCH 03/11] sch_hfsc.c: rtsc_min() convex/convex case fixup Michal Soltys
@ 2011-11-05  2:32 ` Michal Soltys
  2011-11-05  2:32 ` [PATCH 05/11] sch_hfsc.c: don't subtract from cl_vtoff and cl_cvtoff Michal Soltys
                   ` (7 subsequent siblings)
  11 siblings, 0 replies; 15+ messages in thread
From: Michal Soltys @ 2011-11-05  2:32 UTC (permalink / raw)
  To: kaber; +Cc: davem, netdev

Remove the initial check in vttree_get_minvt(), as it's implicitly
handled by vttree_firstfit().

Also see:
http://marc.info/?l=linux-netdev&m=128457329614914&w=2

Signed-off-by: Michal Soltys <soltys@ziu.info>
---
 net/sched/sch_hfsc.c |    4 ----
 1 files changed, 0 insertions(+), 4 deletions(-)

diff --git a/net/sched/sch_hfsc.c b/net/sched/sch_hfsc.c
index 710cbda..97d720c 100644
--- a/net/sched/sch_hfsc.c
+++ b/net/sched/sch_hfsc.c
@@ -307,10 +307,6 @@ vttree_firstfit(struct hfsc_class *cl, u64 cur_time)
 static struct hfsc_class *
 vttree_get_minvt(struct hfsc_class *cl, u64 cur_time)
 {
-	/* if root-class's cfmin is bigger than cur_time nothing to do */
-	if (cl->cl_cfmin > cur_time)
-		return NULL;
-
 	while (cl->level > 0) {
 		cl = vttree_firstfit(cl, cur_time);
 		if (cl == NULL)
-- 
1.7.7.1


* [PATCH 05/11] sch_hfsc.c: don't subtract from cl_vtoff and cl_cvtoff
  2011-11-05  2:32 [PATCH/RFC 00/11] HFSC patches Michal Soltys
                   ` (3 preceding siblings ...)
  2011-11-05  2:32 ` [PATCH 04/11] sch_hfsc.c: remove initial check in vttree_get_minvt() (optional) Michal Soltys
@ 2011-11-05  2:32 ` Michal Soltys
  2011-11-05  2:32 ` [PATCH 06/11] sch_hfsc.c: seg_y2x(), rtsc_y2x(), hfsc_change_class() minor changes Michal Soltys
                   ` (6 subsequent siblings)
  11 siblings, 0 replies; 15+ messages in thread
From: Michal Soltys @ 2011-11-05  2:32 UTC (permalink / raw)
  To: kaber; +Cc: davem, netdev

Using pcvtoff to keep vtoff as close to 0 as possible is not necessary. All
curves (whether based on real or virtual time) use the same mechanics, and
the real-time based ones obviously can't be adjusted like that anyway.

This might be a leftover from earlier hfsc versions (or perhaps from
versions limited to 32-bit values only, where such a thing would help a bit
against hitting overflow ... though what about RSC/USC then?).

Signed-off-by: Michal Soltys <soltys@ziu.info>
---
 net/sched/sch_hfsc.c |   11 +----------
 1 files changed, 1 insertions(+), 10 deletions(-)

diff --git a/net/sched/sch_hfsc.c b/net/sched/sch_hfsc.c
index 97d720c..e0cf11b 100644
--- a/net/sched/sch_hfsc.c
+++ b/net/sched/sch_hfsc.c
@@ -155,8 +155,6 @@ struct hfsc_class {
 	u64	cl_vtoff;		/* inter-period cumulative vt offset */
 	u64	cl_cvtmax;		/* max child's vt in the last period */
 	u64	cl_cvtoff;		/* cumulative cvtmax of all periods */
-	u64	cl_pcvtoff;		/* parent's cvtoff at initialization
-					   time */
 
 	struct internal_sc cl_rsc;	/* internal real-time service curve */
 	struct internal_sc cl_fsc;	/* internal fair service curve */
@@ -719,17 +717,12 @@ init_vf(struct hfsc_class *cl, unsigned int len)
 				cl->cl_vt = 0;
 			}
 
-			cl->cl_vtoff = cl->cl_parent->cl_cvtoff -
-							cl->cl_pcvtoff;
+			cl->cl_vtoff = cl->cl_parent->cl_cvtoff;
 
 			/* update the virtual curve */
 			vt = cl->cl_vt + cl->cl_vtoff;
 			rtsc_min(&cl->cl_virtual, &cl->cl_fsc, vt,
 						      cl->cl_total);
-			if (cl->cl_virtual.x == vt) {
-				cl->cl_virtual.x -= cl->cl_vtoff;
-				cl->cl_vtoff = 0;
-			}
 			cl->cl_vtadj = 0;
 
 			cl->cl_vtperiod++;  /* increment vt period */
@@ -1102,7 +1095,6 @@ hfsc_change_class(struct Qdisc *sch, u32 classid, u32 parentid,
 	if (parent->level == 0)
 		hfsc_purge_queue(sch, parent);
 	hfsc_adjust_levels(parent);
-	cl->cl_pcvtoff = parent->cl_cvtoff;
 	sch_tree_unlock(sch);
 
 	qdisc_class_hash_grow(sch, &q->clhash);
@@ -1500,7 +1492,6 @@ hfsc_reset_class(struct hfsc_class *cl)
 	cl->cl_cvtmin       = 0;
 	cl->cl_cvtmax       = 0;
 	cl->cl_cvtoff       = 0;
-	cl->cl_pcvtoff      = 0;
 	cl->cl_vtperiod     = 0;
 	cl->cl_parentperiod = 0;
 	cl->cl_f            = 0;
-- 
1.7.7.1


* [PATCH 06/11] sch_hfsc.c: seg_y2x(), rtsc_y2x(), hfsc_change_class() minor changes
  2011-11-05  2:32 [PATCH/RFC 00/11] HFSC patches Michal Soltys
                   ` (4 preceding siblings ...)
  2011-11-05  2:32 ` [PATCH 05/11] sch_hfsc.c: don't subtract from cl_vtoff and cl_cvtoff Michal Soltys
@ 2011-11-05  2:32 ` Michal Soltys
  2011-11-05  2:32 ` [PATCH 07/11] sch_hfsc.c: make cl_vt keep all periods in itself Michal Soltys
                   ` (5 subsequent siblings)
  11 siblings, 0 replies; 15+ messages in thread
From: Michal Soltys @ 2011-11-05  2:32 UTC (permalink / raw)
  To: kaber; +Cc: davem, netdev

hfsc_change_class():

Make sure that invalid input (flat 2nd segment, USC without FSC) is
rejected. Even though userspace (tc) does the respective checks as well, we
want to verify that everything is in order before the following seg_y2x()
change (and userspace tools that don't do such checks might exist):

seg_y2x():

All callers (actually there's only one - rtsc_y2x()) guarantee that the
slope argument actually makes sense (otherwise the results would be
unusable) - so we don't have to check for the +inf case.

rtsc_y2x():

In the case of a flat 1st segment of a service curve, the function always
returns the maximum a such that f(a) = y (for our curves, a = x + dx).
To keep things consistent (and continuous), we also return x + dx for
values < y.
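
For example (illustration only): take a runtime curve anchored at
(x,y) = (10,100) with a flat 1st segment (dx = 5, dy = 0). Then
rtsc_y2x(rtsc, 100) = x + dx = 15 (the maximum a with f(a) = 100), and
with this change rtsc_y2x(rtsc, 90) now also returns 15 instead of the
old x = 10 - consistent and continuous.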
---
 net/sched/sch_hfsc.c |   39 +++++++++++++++++++++++++++++----------
 1 files changed, 29 insertions(+), 10 deletions(-)

diff --git a/net/sched/sch_hfsc.c b/net/sched/sch_hfsc.c
index e0cf11b..8da2d88 100644
--- a/net/sched/sch_hfsc.c
+++ b/net/sched/sch_hfsc.c
@@ -400,6 +400,8 @@ seg_y2x(u64 y, u64 ism)
 {
 	u64 x;
 
+	/* callers guarantee that ism is not infinity */
+#if 0
 	if (y == 0)
 		x = 0;
 	else if (ism == HT_INFINITY)
@@ -408,6 +410,8 @@ seg_y2x(u64 y, u64 ism)
 		x = (y >> ISM_SHIFT) * ism
 		    + (((y & ISM_MASK) * ism) >> ISM_SHIFT);
 	}
+#endif
+	x = (y >> ISM_SHIFT) * ism + (((y & ISM_MASK) * ism) >> ISM_SHIFT);
 	return x;
 }
 
@@ -509,7 +513,7 @@ rtsc_y2x(struct runtime_sc *rtsc, u64 y)
 {
 	u64 x;
 
-	if (y < rtsc->y)
+	if (y < rtsc->y && rtsc->dy != 0)
 		x = rtsc->x;
 	else if (y <= rtsc->y + rtsc->dy) {
 		/* x belongs to the 1st segment */
@@ -985,22 +989,40 @@ hfsc_change_class(struct Qdisc *sch, u32 classid, u32 parentid,
 
 	if (tb[TCA_HFSC_RSC]) {
 		rsc = nla_data(tb[TCA_HFSC_RSC]);
-		if (rsc->m1 == 0 && rsc->m2 == 0)
-			rsc = NULL;
+		if (rsc->m2 == 0) {
+			if (rsc->m1 != 0)
+				return -EINVAL;
+			else
+				rsc = NULL;
+		}
 	}
 
 	if (tb[TCA_HFSC_FSC]) {
 		fsc = nla_data(tb[TCA_HFSC_FSC]);
-		if (fsc->m1 == 0 && fsc->m2 == 0)
-			fsc = NULL;
+		if (fsc->m2 == 0) {
+			if (fsc->m1 != 0)
+				return -EINVAL;
+			else
+				fsc = NULL;
+		}
 	}
 
 	if (tb[TCA_HFSC_USC]) {
 		usc = nla_data(tb[TCA_HFSC_USC]);
-		if (usc->m1 == 0 && usc->m2 == 0)
-			usc = NULL;
+		if (usc->m2 == 0) {
+			if (usc->m1 != 0)
+				return -EINVAL;
+			else
+				usc = NULL;
+		}
 	}
 
+	if (rsc == NULL && fsc == NULL)
+		return -EINVAL;
+
+	if (fsc == NULL && usc != NULL)
+		return -EINVAL;
+
 	if (cl != NULL) {
 		if (parentid) {
 			if (cl->cl_parent &&
@@ -1053,9 +1075,6 @@ hfsc_change_class(struct Qdisc *sch, u32 classid, u32 parentid,
 	if (hfsc_find_class(classid, sch))
 		return -EEXIST;
 
-	if (rsc == NULL && fsc == NULL)
-		return -EINVAL;
-
 	cl = kzalloc(sizeof(struct hfsc_class), GFP_KERNEL);
 	if (cl == NULL)
 		return -ENOBUFS;
-- 
1.7.7.1


* [PATCH 07/11] sch_hfsc.c: make cl_vt keep all periods in itself
  2011-11-05  2:32 [PATCH/RFC 00/11] HFSC patches Michal Soltys
                   ` (5 preceding siblings ...)
  2011-11-05  2:32 ` [PATCH 06/11] sch_hfsc.c: seg_y2x(), rtsc_y2x(), hfsc_change_class() minor changes Michal Soltys
@ 2011-11-05  2:32 ` Michal Soltys
  2011-11-05  2:32 ` [PATCH 08/11] sch_hfsc.c: update speed table, add explanations, increase accuracy Michal Soltys
                   ` (4 subsequent siblings)
  11 siblings, 0 replies; 15+ messages in thread
From: Michal Soltys @ 2011-11-05  2:32 UTC (permalink / raw)
  To: kaber; +Cc: davem, netdev

Currently, cl_vt is kept at all costs relative to the actual backlog
period - thus we have cl_vtoff, which is carefully updated to track the
current "start" of vt; cl_cvtmax, which keeps the highest vt seen among
siblings in the last backlog period; and cl_cvtoff, which keeps the sum of
all previous cl_cvtmax values.

Despite that, the actual service curve is always updated from the "full"
vt, and subsequent computations have the vtoff part subtracted.

This is unnecessary, as vt can simply keep all the vtoff values in itself -
while keeping the algorithm's behaviour exactly the same.

This change allows us to remove cl_vtoff (no longer needed - part of
cl_vt now) and cl_cvtmax (no longer needed, cl_cvtoff is updated
directly from cl_vt).
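
Schematically (a sketch of the invariant, not code from the patch):

	before:	curve updated at (cl_vtoff + cl_vt),
		cl_vt = rtsc_y2x(&cl_virtual, cl_total) - cl_vtoff + cl_vtadj
	after:	cl_vt carries the offset itself, so the curve is
		updated at cl_vt and
		cl_vt = rtsc_y2x(&cl_virtual, cl_total) + cl_vtadj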

Signed-off-by: Michal Soltys <soltys@ziu.info>
---
 net/sched/sch_hfsc.c |   37 ++++++++++++-------------------------
 1 files changed, 12 insertions(+), 25 deletions(-)

diff --git a/net/sched/sch_hfsc.c b/net/sched/sch_hfsc.c
index 8da2d88..ceff8a6 100644
--- a/net/sched/sch_hfsc.c
+++ b/net/sched/sch_hfsc.c
@@ -152,9 +152,8 @@ struct hfsc_class {
 					   (monotonic within a period) */
 	u64	cl_vtadj;		/* intra-period cumulative vt
 					   adjustment */
-	u64	cl_vtoff;		/* inter-period cumulative vt offset */
-	u64	cl_cvtmax;		/* max child's vt in the last period */
-	u64	cl_cvtoff;		/* cumulative cvtmax of all periods */
+	u64	cl_cvtoff;		/* last (biggest) vt seen between
+					   siblings */
 
 	struct internal_sc cl_rsc;	/* internal real-time service curve */
 	struct internal_sc cl_fsc;	/* internal fair service curve */
@@ -710,26 +709,17 @@ init_vf(struct hfsc_class *cl, unsigned int len)
 			} else {
 				/*
 				 * first child for a new parent backlog period.
-				 * add parent's cvtmax to cvtoff to make a new
-				 * vt (vtoff + vt) larger than the vt in the
-				 * last period for all children.
+				 * use the biggest vt seen between siblings so far
 				 */
-				vt = cl->cl_parent->cl_cvtmax;
-				cl->cl_parent->cl_cvtoff += vt;
-				cl->cl_parent->cl_cvtmax = 0;
-				cl->cl_parent->cl_cvtmin = 0;
-				cl->cl_vt = 0;
+				cl->cl_vt = cl->cl_parent->cl_cvtoff;
+				cl->cl_parent->cl_cvtmin = cl->cl_vt;
 			}
 
-			cl->cl_vtoff = cl->cl_parent->cl_cvtoff;
-
 			/* update the virtual curve */
-			vt = cl->cl_vt + cl->cl_vtoff;
-			rtsc_min(&cl->cl_virtual, &cl->cl_fsc, vt,
-						      cl->cl_total);
-			cl->cl_vtadj = 0;
+			rtsc_min(&cl->cl_virtual, &cl->cl_fsc, cl->cl_vt, cl->cl_total);
 
-			cl->cl_vtperiod++;  /* increment vt period */
+			cl->cl_vtadj = 0;
+			cl->cl_vtperiod++;
 			cl->cl_parentperiod = cl->cl_parent->cl_vtperiod;
 			if (cl->cl_parent->cl_nactive == 0)
 				cl->cl_parentperiod++;
@@ -778,8 +768,7 @@ update_vf(struct hfsc_class *cl, unsigned int len, u64 cur_time)
 			continue;
 
 		/* update vt */
-		cl->cl_vt = rtsc_y2x(&cl->cl_virtual, cl->cl_total)
-			    - cl->cl_vtoff + cl->cl_vtadj;
+		cl->cl_vt = rtsc_y2x(&cl->cl_virtual, cl->cl_total) + cl->cl_vtadj;
 
 		/*
 		 * fixup vt due to upperlimit (if applicable)
@@ -801,9 +790,9 @@ update_vf(struct hfsc_class *cl, unsigned int len, u64 cur_time)
 		if (go_passive) {
 			/* no more active child, going passive */
 
-			/* update cvtmax of the parent class */
-			if (cl->cl_vt > cl->cl_parent->cl_cvtmax)
-				cl->cl_parent->cl_cvtmax = cl->cl_vt;
+			/* update cl_cvtoff of the parent class */
+			if (cl->cl_vt > cl->cl_parent->cl_cvtoff)
+				cl->cl_parent->cl_cvtoff = cl->cl_vt;
 
 			/* remove this class from the vt tree */
 			vttree_remove(cl);
@@ -1507,9 +1496,7 @@ hfsc_reset_class(struct hfsc_class *cl)
 	cl->cl_e            = 0;
 	cl->cl_vt           = 0;
 	cl->cl_vtadj        = 0;
-	cl->cl_vtoff        = 0;
 	cl->cl_cvtmin       = 0;
-	cl->cl_cvtmax       = 0;
 	cl->cl_cvtoff       = 0;
 	cl->cl_vtperiod     = 0;
 	cl->cl_parentperiod = 0;
-- 
1.7.7.1


* [PATCH 08/11] sch_hfsc.c: update speed table, add explanations, increase accuracy
  2011-11-05  2:32 [PATCH/RFC 00/11] HFSC patches Michal Soltys
                   ` (6 preceding siblings ...)
  2011-11-05  2:32 ` [PATCH 07/11] sch_hfsc.c: make cl_vt keep all periods in itself Michal Soltys
@ 2011-11-05  2:32 ` Michal Soltys
  2011-11-06  1:09   ` [PATCH 12/11] sch_hfsc.c: converter fixups Michal Soltys
  2011-11-05  2:32 ` [PATCH 09/11] sch_hfsc.c: split update_vf() Michal Soltys
                   ` (3 subsequent siblings)
  11 siblings, 1 reply; 15+ messages in thread
From: Michal Soltys @ 2011-11-05  2:32 UTC (permalink / raw)
  To: kaber; +Cc: davem, netdev

This patch:

- updates the "speed table" in commented section before service curve
  helpers, to reflect current PSCHED_SHIFT value, and how things look in
  current kernels

- adds detailed explanation about possible shifts

- updates seg_x2y() and seg_y2x() functions to retain full accuracy
  regardless of the chosen shifts (see the note after the list)

- with help of seg_*() changes, and one div64_u64(), allows slopes to be
  shifted by 32

- swaps do_div()s to div_u64()s and <linux/math64.h> header
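
(Note on the seg_*() change: the 3-stage computation is plain 64x64
schoolbook multiplication split at 32 bits. With x = xh*2^32 + xl and
sm = sh*2^32 + sl, we get

	(x * sm) >> 32 = ((xl * sl) >> 32) + xh * sm + xl * sh

where the xh * sh * 2^32 part is folded into xh * sm. This is what the
new seg_x2y()/seg_y2x() bodies compute, with SM_SHIFT == ISM_SHIFT == 32.)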

Another thing that got fixed as a side effect is that the earlier way of
deriving [I]SM_SHIFT was just an estimate and didn't yield proper values
after changing PSCHED_SHIFT. For example, if PSCHED_SHIFT was
changed from 6 to 8:

at 100 kbit:  0.0032 bytes / tick
at 1 gbit:        32 bytes / tick -> 0.03125 ticks / byte

Under the assumption of keeping at least 4 full decimal digits:

0.0032  requires shift = 22 (log2(10000/.0032))
0.03125 requires shift = 19

But with old definitions:

SM_SHIFT = (30 - PSCHED_SHIFT)
ISM_SHIFT = (8 + PSCHED_SHIFT)

we would get 22 for SM_SHIFT (still fine, but for PSCHED_SHIFT == 10 it
would be 1 too little) and 16 for ISM_SHIFT (too small).

Not an issue anymore, as shifts are now set simply to 32.
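
A quick sanity check of the flat 32 bit shifts (a sketch; assumes
PSCHED_SHIFT == 6, i.e. PTPS == 1e9 >> 6 == 15,625,000, and omits the
rounding-up done by the real converters):

	u32 m   = 125000000;			/* 1 gbit in bytes/s */
	u64 sm  = ((u64)m << 32) / PTPS;	/* ~ 8 << 32: 8 bytes/tick */
	u64 ism = ((u64)PTPS << 32) / m;	/* ~ 0.125 << 32: 0.125 ticks/byte */

Both values sit comfortably within u64 across the whole 10kbit..1gbit
table above.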

Signed-off-by: Michal Soltys <soltys@ziu.info>
---
 net/sched/sch_hfsc.c |  190 +++++++++++++++++++++++++++++++-------------------
 1 files changed, 117 insertions(+), 73 deletions(-)

diff --git a/net/sched/sch_hfsc.c b/net/sched/sch_hfsc.c
index ceff8a6..e73d2e0 100644
--- a/net/sched/sch_hfsc.c
+++ b/net/sched/sch_hfsc.c
@@ -63,10 +63,10 @@
 #include <linux/init.h>
 #include <linux/rtnetlink.h>
 #include <linux/pkt_sched.h>
+#include <linux/math64.h>
 #include <net/netlink.h>
 #include <net/pkt_sched.h>
 #include <net/pkt_cls.h>
-#include <asm/div64.h>
 
 /*
  * kernel internal service curve representation:
@@ -350,80 +350,147 @@ cftree_update(struct hfsc_class *cl)
 }
 
 /*
- * service curve support functions
+ * -- service curve support functions --
  *
  *  external service curve parameters
- *	m: bps
- *	d: us
+ *	m: bps (bytes per second)
+ *	d: us (microseconds)
  *  internal service curve parameters
- *	sm: (bytes/psched_us) << SM_SHIFT
- *	ism: (psched_us/byte) << ISM_SHIFT
- *	dx: psched_us
+ *	sm:  (bytes/pt) << SM_SHIFT
+ *	ism: (pts/byte) << ISM_SHIFT
+ *	dx: pts (psched ticks)
  *
- * The clock source resolution with ktime and PSCHED_SHIFT 10 is 1.024us.
+ * pt stands for psched tick. With PSCHED_SHIFT = 6, we have
+ * PSCHED_TICKS_PER_SEC = 1e9 >> 6 - so 1 psched tick is 64ns.
  *
  * sm and ism are scaled in order to keep effective digits.
  * SM_SHIFT and ISM_SHIFT are selected to keep at least 4 effective
- * digits in decimal using the following table.
+ * full digits in decimal.
  *
- *  bits/sec      100Kbps     1Mbps     10Mbps     100Mbps    1Gbps
- *  ------------+-------------------------------------------------------
- *  bytes/1.024us 12.8e-3    128e-3     1280e-3    12800e-3   128000e-3
+ * -- calculation info, assuming 64ns psched tick --
  *
- *  1.024us/byte  78.125     7.8125     0.78125    0.078125   0.0078125
+ * 1gbit = 125,000,000 bytes / 1 s = 8 bytes / 1 pt
+ * inverse of that is 0.125 pts / 1 byte
+ * to keep our assumed 4 effective digits, we need to mul that by 8e4
+ * log2(8e4) ~= 16.3 - thus ISM_SHIFT = 17
  *
- * So, for PSCHED_SHIFT 10 we need: SM_SHIFT 20, ISM_SHIFT 18.
+ * similarly:
+ * 10kbit = 1,250 bytes / 1 s = 0.00008 bytes / 1 pt
+ * we want at least full 4 digits, so we need to mul by 125e6
+ * log2(125e6) ~= 26.9 - thus SM_SHIFT = 27
+ *
+ *  bits/sec         10kbit   100kbit   1mbit    10mbit    100mbit    1gbit
+ *  ---------------+---------------------------------------------------------
+ *  bytes/pt (sm)     8e-5     8e-4     8e-3      8e-2      8e-1        8
+ *  pts/byte (ism)    12500    1250      125     125e-1     125e-2    125e-3
+ *
+ * as you can see:
+ *
+ * SM_SHIFT comes from the left-top
+ * ISM_SHIFT comes from the right-bottom
+ *
+ * In the code below we actually use flat 32bit shifts, giving us decent
+ * headroom on both ends of the above table.
+ *
+ * -- about allowed values (under assumption of 64ns psched tick) --
+ *
+ * We have to make sure, we have no overflows anywhere in the code. It's not a
+ * problem to just scale sm and ism, but then during computations we would
+ * easily end with >64bit values (remember our time resolution is 64ns).
+ *
+ * 1) seg_x2y and seg_y2x
+ *
+ * These use "school" math to derive final values, making sure all possible
+ * accuracy is preserved, regardless of the shifts. Note, that with this
+ * method we could use arbitrarily scaled sm/ism values, but we would have
+ * problems with divisions in the other places (and more complex code).
+ *
+ * 2) initialization functions: m2sm, m2ism, d2dx, dx2d
+ *
+ * These are used only during [re]setup/reports. Nothing particularly special
+ * about those, as user input is limited to 32bit. One remark though:
+ *
+ * Theoretically, due to rounding up in d2dx() and m2sm(), it's possible that
+ * in dx2d() and sm2m() we could overflow as well, but that would require input
+ * values near 2^32-1 provided during d2dx() and m2sm(). Just in case, we catch
+ * that possibility with WARN_ON().
+ *
+ * 3) rtsc_min
+ *
+ * The only other place, where a shift is used directly due to division. We are
+ * forced to use 64/64 division here: the (y1 - y) difference is guaranteed to
+ * take more than 32 bits due to the shift. Similarly, the dsm difference is
+ * scaled, and with scalers == 32 will require more than 32 bits.
+ *
+ * Note - since v2.6.37-rc1~105^2~18, full 64/64 bit division produces fully
+ * accurate results.
+ *
+ */
+
+/*
+ * Within 10kbit .. 1gbit range, 32bit shifts still manage to keep 4 decimal
+ * digits even with PSCHED_SHIFT == 1. With the current, reasonable value of
+ * 6, there's plenty of headroom for even slower or faster speeds.
  */
-#define	SM_SHIFT	(30 - PSCHED_SHIFT)
-#define	ISM_SHIFT	(8 + PSCHED_SHIFT)
 
-#define	SM_MASK		((1ULL << SM_SHIFT) - 1)
-#define	ISM_MASK	((1ULL << ISM_SHIFT) - 1)
+#define SM_SHIFT	32
+#define ISM_SHIFT	32
+
+#define SM_MASK		((1ULL << SM_SHIFT) - 1)
+#define ISM_MASK	((1ULL << ISM_SHIFT) - 1)
+#define B32_MASK	((1ULL << 32) - 1)
+
+/* for readability */
+#define PTPS		PSCHED_TICKS_PER_SEC
+#define USPS		USEC_PER_SEC
 
 static inline u64
 seg_x2y(u64 x, u64 sm)
 {
-	u64 y;
-
 	/*
-	 * compute
-	 *	y = x * sm >> SM_SHIFT
-	 * but divide it for the upper and lower bits to avoid overflow
+	 * perform 3-stage computation, to allow shifts as high
+	 * as 32 while retaining full accuracy
 	 */
-	y = (x >> SM_SHIFT) * sm + (((x & SM_MASK) * sm) >> SM_SHIFT);
-	return y;
+	return
+	(( (x & B32_MASK) * (sm & B32_MASK) ) >> SM_SHIFT) +
+	((
+		( (x >> 32) * sm ) +
+		( (x & B32_MASK) * (sm >> 32) )
+	) << (32 - SM_SHIFT));
 }
 
 static inline u64
 seg_y2x(u64 y, u64 ism)
 {
-	u64 x;
-
-	/* callers guarantee that ism is not infinity */
-#if 0
-	if (y == 0)
-		x = 0;
-	else if (ism == HT_INFINITY)
-		x = HT_INFINITY;
-	else {
-		x = (y >> ISM_SHIFT) * ism
-		    + (((y & ISM_MASK) * ism) >> ISM_SHIFT);
-	}
-#endif
-	x = (y >> ISM_SHIFT) * ism + (((y & ISM_MASK) * ism) >> ISM_SHIFT);
-	return x;
+	/*
+	 * callers guarantee that ism is sane, otherwise we would have to check
+	 * for HT_INFINITY and return it in case of match
+	 */
+	/*
+	 * perform 3-stage computation, to allow shifts as high
+	 * as 32 while retaining full accuracy
+	 */
+	return
+	(( (y & B32_MASK) * (ism & B32_MASK) ) >> ISM_SHIFT) +
+	((
+		( (y >> 32) * ism ) +
+		( (y & B32_MASK) * (ism >> 32) )
+	) << (32 - ISM_SHIFT));
 }
 
 /* Convert m (bps) into sm (bytes/psched us) */
 static u64
 m2sm(u32 m)
 {
-	u64 sm;
+	WARN_ON(m & (1<<31));
+	return div_u64(((u64)m << SM_SHIFT) + PTPS - 1, PTPS);
+}
 
-	sm = ((u64)m << SM_SHIFT);
-	sm += PSCHED_TICKS_PER_SEC - 1;
-	do_div(sm, PSCHED_TICKS_PER_SEC);
-	return sm;
+/* convert sm (bytes/psched us) into m (bps) */
+static u32
+sm2m(u64 sm)
+{
+	return (u32)((sm * PTPS) >> SM_SHIFT);
 }
 
 /* convert m (bps) into ism (psched us/byte) */
@@ -435,9 +502,7 @@ m2ism(u32 m)
 	if (m == 0)
 		ism = HT_INFINITY;
 	else {
-		ism = ((u64)PSCHED_TICKS_PER_SEC << ISM_SHIFT);
-		ism += m - 1;
-		do_div(ism, m);
+		ism = div_u64(((u64)PTPS << ISM_SHIFT) + m - 1, m);
 	}
 	return ism;
 }
@@ -446,33 +511,15 @@ m2ism(u32 m)
 static u64
 d2dx(u32 d)
 {
-	u64 dx;
-
-	dx = ((u64)d * PSCHED_TICKS_PER_SEC);
-	dx += USEC_PER_SEC - 1;
-	do_div(dx, USEC_PER_SEC);
-	return dx;
-}
-
-/* convert sm (bytes/psched us) into m (bps) */
-static u32
-sm2m(u64 sm)
-{
-	u64 m;
-
-	m = (sm * PSCHED_TICKS_PER_SEC) >> SM_SHIFT;
-	return (u32)m;
+	WARN_ON(d & (1<<31));
+	return div_u64(((u64)d * PTPS) + USPS - 1, USPS);
 }
 
 /* convert dx (psched us) into d (us) */
 static u32
 dx2d(u64 dx)
 {
-	u64 d;
-
-	d = dx * USEC_PER_SEC;
-	do_div(d, PSCHED_TICKS_PER_SEC);
-	return (u32)d;
+	return (u32)div_u64(dx * USPS, PTPS);
 }
 
 static void
@@ -553,7 +600,6 @@ static void
 rtsc_min(struct runtime_sc *rtsc, struct internal_sc *isc, u64 x, u64 y)
 {
 	u64 y1, y2, dx, dy;
-	u32 dsm;
 
 	if (isc->sm1 <= isc->sm2) {
 		/* service curve is convex or linear */
@@ -602,9 +648,7 @@ rtsc_min(struct runtime_sc *rtsc, struct internal_sc *isc, u64 x, u64 y)
 	 * function of seg_x2y()
 	 *	seg_x2y(dx, sm1) == seg_x2y(dx, sm2) + (y1 - y)
 	 */
-	dx = (y1 - y) << SM_SHIFT;
-	dsm = isc->sm1 - isc->sm2;
-	do_div(dx, dsm);
+	dx = div64_u64((y1 - y) << SM_SHIFT, isc->sm1 - isc->sm2);
 	/*
 	 * check if (x, y1) belongs to the 1st segment of rtsc.
 	 * if so, add the offset.
-- 
1.7.7.1


* [PATCH 09/11] sch_hfsc.c: split update_vf()
  2011-11-05  2:32 [PATCH/RFC 00/11] HFSC patches Michal Soltys
                   ` (7 preceding siblings ...)
  2011-11-05  2:32 ` [PATCH 08/11] sch_hfsc.c: update speed table, add explanations, increase accuracy Michal Soltys
@ 2011-11-05  2:32 ` Michal Soltys
  2011-11-05  2:32 ` [PATCH 10/11] sch_hfsc.c: make sure classes are able to use 1st segment on fresh backlog period Michal Soltys
                   ` (2 subsequent siblings)
  11 siblings, 0 replies; 15+ messages in thread
From: Michal Soltys @ 2011-11-05  2:32 UTC (permalink / raw)
  To: kaber; +Cc: davem, netdev

Split update_vf() into 2 separate functions: one responsible for just
updating the vt, and the other responsible for setting the class (and
possibly parent nodes) passive.

This is actually how it was once done in the past, but at some point it
was merged into one larger function.

The two functions are shorter and cleaner, and the normal update_vf() path
has a bit less work to do.
---
 net/sched/sch_hfsc.c |   51 +++++++++++++++++--------------------------------
 1 files changed, 18 insertions(+), 33 deletions(-)

diff --git a/net/sched/sch_hfsc.c b/net/sched/sch_hfsc.c
index e73d2e0..26cdfaa 100644
--- a/net/sched/sch_hfsc.c
+++ b/net/sched/sch_hfsc.c
@@ -800,15 +800,11 @@ static void
 update_vf(struct hfsc_class *cl, unsigned int len, u64 cur_time)
 {
 	u64 f; /* , myf_bound, delta; */
-	int go_passive = 0;
-
-	if (cl->qdisc->q.qlen == 0 && cl->cl_flags & HFSC_FSC)
-		go_passive = 1;
 
 	for (; cl->cl_parent != NULL; cl = cl->cl_parent) {
 		cl->cl_total += len;
 
-		if (!(cl->cl_flags & HFSC_FSC) || cl->cl_nactive == 0)
+		if (!(cl->cl_flags & HFSC_FSC))
 			continue;
 
 		/* update vt */
@@ -826,27 +822,6 @@ update_vf(struct hfsc_class *cl, unsigned int len, u64 cur_time)
 			cl->cl_vt = cl->cl_parent->cl_cvtmin;
 		}
 
-		if (go_passive && --cl->cl_nactive == 0)
-			go_passive = 1;
-		else
-			go_passive = 0;
-
-		if (go_passive) {
-			/* no more active child, going passive */
-
-			/* update cl_cvtoff of the parent class */
-			if (cl->cl_vt > cl->cl_parent->cl_cvtoff)
-				cl->cl_parent->cl_cvtoff = cl->cl_vt;
-
-			/* remove this class from the vt tree */
-			vttree_remove(cl);
-
-			cftree_remove(cl);
-			update_cfmin(cl->cl_parent);
-
-			continue;
-		}
-
 		/* update the vt tree */
 		vttree_update(cl);
 
@@ -899,15 +874,27 @@ set_active(struct hfsc_class *cl, unsigned int len)
 static void
 set_passive(struct hfsc_class *cl)
 {
+	list_del(&cl->dlist);
+
 	if (cl->cl_flags & HFSC_RSC)
 		eltree_remove(cl);
 
-	list_del(&cl->dlist);
+	if (cl->cl_flags & HFSC_FSC)
+	for (; cl->cl_parent != NULL; cl = cl->cl_parent) {
+		if (--cl->cl_nactive > 0)
+			break;
 
-	/*
-	 * vttree is now handled in update_vf() so that update_vf(cl, 0, 0)
-	 * needs to be called explicitly to remove a class from vttree.
-	 */
+		/* update cl_cvtoff of the parent class */
+		if (cl->cl_vt > cl->cl_parent->cl_cvtoff)
+			cl->cl_parent->cl_cvtoff = cl->cl_vt;
+
+		/* remove this class from the parent's vt & cf trees */
+		vttree_remove(cl);
+		cftree_remove(cl);
+
+		/* update cfmin of the parent, after removal */
+		update_cfmin(cl->cl_parent);
+	}
 }
 
 static unsigned int
@@ -1286,7 +1273,6 @@ hfsc_qlen_notify(struct Qdisc *sch, unsigned long arg)
 	struct hfsc_class *cl = (struct hfsc_class *)arg;
 
 	if (cl->qdisc->q.qlen == 0) {
-		update_vf(cl, 0, 0);
 		set_passive(cl);
 	}
 }
@@ -1729,7 +1715,6 @@ hfsc_drop(struct Qdisc *sch)
 		if (cl->qdisc->ops->drop != NULL &&
 		    (len = cl->qdisc->ops->drop(cl->qdisc)) > 0) {
 			if (cl->qdisc->q.qlen == 0) {
-				update_vf(cl, 0, 0);
 				set_passive(cl);
 			} else {
 				list_move_tail(&cl->dlist, &q->droplist);
-- 
1.7.7.1


* [PATCH 10/11] sch_hfsc.c: make sure classes are able to use 1st segment on fresh backlog period
  2011-11-05  2:32 [PATCH/RFC 00/11] HFSC patches Michal Soltys
                   ` (8 preceding siblings ...)
  2011-11-05  2:32 ` [PATCH 09/11] sch_hfsc.c: split update_vf() Michal Soltys
@ 2011-11-05  2:32 ` Michal Soltys
  2011-11-05  2:32 ` [PATCH 11/11] sch_hfsc.c: remove cl_vtadj, shift virtual curve instead Michal Soltys
  2011-11-05  3:48 ` [PATCH/RFC 00/11] HFSC patches Patrick McHardy
  11 siblings, 0 replies; 15+ messages in thread
From: Michal Soltys @ 2011-11-05  2:32 UTC (permalink / raw)
  To: kaber; +Cc: davem, netdev

This change guarantees that when a class starts a fresh backlog period, it
will be able to use its 1st segment for the subsequent vt updates.
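
The mechanism: when a class goes passive, the parent's cl_cvtoff is now
raised to cl_vt + i2dy, where i2dy = seg_y2x(dy, ism2) is the amount of
virtual time the 2nd slope would need to serve the 1st segment's dy bytes.
Illustration only (made-up numbers): with m1 = 16 and m2 = 8 bytes/tick and
dx = 100 ticks, dy = 1600 bytes and i2dy = 1600 / 8 = 200 virtual-time
ticks of headroom above the final vt of the departing class.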
---
 net/sched/sch_hfsc.c |    6 ++++--
 1 files changed, 4 insertions(+), 2 deletions(-)

diff --git a/net/sched/sch_hfsc.c b/net/sched/sch_hfsc.c
index 26cdfaa..2109878 100644
--- a/net/sched/sch_hfsc.c
+++ b/net/sched/sch_hfsc.c
@@ -88,6 +88,7 @@ struct internal_sc {
 	u64	dy;	/* the y-projection of the 1st segment */
 	u64	sm2;	/* scaled slope of the 2nd segment */
 	u64	ism2;	/* scaled inverse-slope of the 2nd segment */
+	u64	i2dy;	/* x-projection of dy, using 2nd slope */
 };
 
 /* runtime service curve */
@@ -531,6 +532,7 @@ sc2isc(struct tc_service_curve *sc, struct internal_sc *isc)
 	isc->dy   = seg_x2y(isc->dx, isc->sm1);
 	isc->sm2  = m2sm(sc->m2);
 	isc->ism2 = m2ism(sc->m2);
+	isc->i2dy = seg_y2x(isc->dy, isc->ism2);
 }
 
 /*
@@ -885,8 +887,8 @@ set_passive(struct hfsc_class *cl)
 			break;
 
 		/* update cl_cvtoff of the parent class */
-		if (cl->cl_vt > cl->cl_parent->cl_cvtoff)
-			cl->cl_parent->cl_cvtoff = cl->cl_vt;
+		if (cl->cl_vt + cl->cl_fsc.i2dy > cl->cl_parent->cl_cvtoff)
+			cl->cl_parent->cl_cvtoff = cl->cl_vt + cl->cl_fsc.i2dy;
 
 		/* remove this class from the parent's vt & cf trees */
 		vttree_remove(cl);
-- 
1.7.7.1


* [PATCH 11/11] sch_hfsc.c: remove cl_vtadj, shift virtual curve instead
  2011-11-05  2:32 [PATCH/RFC 00/11] HFSC patches Michal Soltys
                   ` (9 preceding siblings ...)
  2011-11-05  2:32 ` [PATCH 10/11] sch_hfsc.c: make sure classes are able to use 1st segment on fresh backlog period Michal Soltys
@ 2011-11-05  2:32 ` Michal Soltys
  2011-11-05  3:48 ` [PATCH/RFC 00/11] HFSC patches Patrick McHardy
  11 siblings, 0 replies; 15+ messages in thread
From: Michal Soltys @ 2011-11-05  2:32 UTC (permalink / raw)
  To: kaber; +Cc: davem, netdev

Instead of keeping the intra-period difference in cl_vtadj, we can simply
shift the virtual curve to the right (as the adjustment is permanent).
---
 net/sched/sch_hfsc.c |   16 +++++++---------
 1 files changed, 7 insertions(+), 9 deletions(-)

diff --git a/net/sched/sch_hfsc.c b/net/sched/sch_hfsc.c
index 2109878..bf38551 100644
--- a/net/sched/sch_hfsc.c
+++ b/net/sched/sch_hfsc.c
@@ -151,8 +151,6 @@ struct hfsc_class {
 	u64	cl_cvtmin;		/* minimal virtual time among the
 					   children fit for link-sharing
 					   (monotonic within a period) */
-	u64	cl_vtadj;		/* intra-period cumulative vt
-					   adjustment */
 	u64	cl_cvtoff;		/* last (biggest) vt seen between
 					   siblings */
 
@@ -764,7 +762,6 @@ init_vf(struct hfsc_class *cl, unsigned int len)
 			/* update the virtual curve */
 			rtsc_min(&cl->cl_virtual, &cl->cl_fsc, cl->cl_vt, cl->cl_total);
 
-			cl->cl_vtadj = 0;
 			cl->cl_vtperiod++;
 			cl->cl_parentperiod = cl->cl_parent->cl_vtperiod;
 			if (cl->cl_parent->cl_nactive == 0)
@@ -810,17 +807,19 @@ update_vf(struct hfsc_class *cl, unsigned int len, u64 cur_time)
 			continue;
 
 		/* update vt */
-		cl->cl_vt = rtsc_y2x(&cl->cl_virtual, cl->cl_total) + cl->cl_vtadj;
+		cl->cl_vt = rtsc_y2x(&cl->cl_virtual, cl->cl_total);
 
 		/*
 		 * fixup vt due to upperlimit (if applicable)
 		 *
-		 * if vt of the class is smaller than cvtmin,
-		 * the class was skipped in the past due to non-fit.
-		 * if so, we need to adjust vtadj.
+		 * if we detect we're lagging behind virtual times of other
+		 * siblings, that means we were skipped in the past due to non-fit.
+		 *
+		 * If so, we adjust vt and shift the virtual curve to match the
+		 * current situation.
 		 */
 		if (cl->cl_vt < cl->cl_parent->cl_cvtmin) {
-			cl->cl_vtadj += cl->cl_parent->cl_cvtmin - cl->cl_vt;
+			cl->cl_virtual.x += cl->cl_parent->cl_cvtmin - cl->cl_vt;
 			cl->cl_vt = cl->cl_parent->cl_cvtmin;
 		}
 
@@ -1527,7 +1526,6 @@ hfsc_reset_class(struct hfsc_class *cl)
 	cl->cl_d            = 0;
 	cl->cl_e            = 0;
 	cl->cl_vt           = 0;
-	cl->cl_vtadj        = 0;
 	cl->cl_cvtmin       = 0;
 	cl->cl_cvtoff       = 0;
 	cl->cl_vtperiod     = 0;
-- 
1.7.7.1


* Re: [PATCH/RFC 00/11] HFSC patches
  2011-11-05  2:32 [PATCH/RFC 00/11] HFSC patches Michal Soltys
                   ` (10 preceding siblings ...)
  2011-11-05  2:32 ` [PATCH 11/11] sch_hfsc.c: remove cl_vtadj, shift virtual curve instead Michal Soltys
@ 2011-11-05  3:48 ` Patrick McHardy
  2011-11-05 17:43   ` Michal Soltys
  11 siblings, 1 reply; 15+ messages in thread
From: Patrick McHardy @ 2011-11-05  3:48 UTC (permalink / raw)
  To: Michal Soltys; +Cc: davem, netdev

On 05.11.2011 03:32, Michal Soltys wrote:
> These are most of the patches I've been sitting on for a while. For the most
> part they are either small corrections or simplifications (marked with s) - but
> there are also some new things (marked with !), probably the most interesting
> being #8.
> 
> All changes are richly commented in their respective commit messages. I've
> been using them for a while - so unless I missed some subtlety, everything
> should be fine.

Thanks Michal. It has been quite a while since I've last looked
at this and this is complicated stuff, please give me a few days
to review your patches.

> Apart from these, there's still one subtle thing to do w.r.t. cl_cvtmin
> during init_vf(), as this value lags behind the actual situation at the time
> of enqueue.
> 
> On a side note, I was thinking about something like hfsc-strict or so - where
> the [uplink] interface could be upperlimited at the hfsc qdisc level, but all
> the per-class upperlimit machinery would otherwise be gone. Not sure if anyone
> would even be interested in something like that at all.

So classes would just use link-sharing curves? That's
already possible, so I probably don't get your idea.


* Re: [PATCH/RFC 00/11] HFSC patches
  2011-11-05  3:48 ` [PATCH/RFC 00/11] HFSC patches Patrick McHardy
@ 2011-11-05 17:43   ` Michal Soltys
  0 siblings, 0 replies; 15+ messages in thread
From: Michal Soltys @ 2011-11-05 17:43 UTC (permalink / raw)
  To: Patrick McHardy; +Cc: davem, netdev

On 11-11-05 04:48, Patrick McHardy wrote:
>
> Thanks Michal. It has been quite a while since I've last looked
> at this and this is complicated stuff, please give me a few days
> to review your patches.

Of course, and thanks for doing it. If I have any corrections, I'll add 
them as v2 versions in this thread (under the respective patches).

>
>>  Apart from these, there's still one subtle thing to do w.r.t. cl_cvtmin
>>  during init_vf(), as this value lags behind the actual situation at the
>>  time of enqueue.
>>
>>  On a side note, I was thinking about something like hfsc-strict or so - where
>>  the [uplink] interface could be upperlimited at the hfsc qdisc level, but all
>>  the per-class upperlimit machinery would otherwise be gone. Not sure if anyone
>>  would even be interested in something like that at all.
>
> So classes would just use link-sharing curves? That's
> already possible, so I probably don't get your idea.
>

I mean that upperlimit's main use is matching an [upstream] router's 
capability (at least that's how I've always seen it used). Other scenarios, 
where upperlimit is used somewhere lower, can be transformed into just 
proper linksharing ratios and realtime leaves w/o the linksharing part (if 
applicable) - thus the idea of no upperlimit at the class level at all (and 
no related code), but the ability to define one at the qdisc level (added 
during tc qdisc add hfsc ...) and enforced during hfsc_dequeue().

Note - this is just a purist's idea, and I realize it's unacceptable in the 
context of the existing hfsc scheduler for many reasons (compatibility with 
existing configurations, for one). But the idea of 
hfsc-{light,strict,pure,etc.} has been crawling around in my head for a while.


Apart from that - in the sch_hfsc.c code there are a few things you once 
commented out, related to myf adjustments that "overshoot" and made classes 
stay way too far under their respective linksharing curves. Do you have the 
configuration examples under which that happened saved somewhere?


* [PATCH 12/11] sch_hfsc.c: converter fixups
  2011-11-05  2:32 ` [PATCH 08/11] sch_hfsc.c: update speed table, add explanations, increase accuracy Michal Soltys
@ 2011-11-06  1:09   ` Michal Soltys
  0 siblings, 0 replies; 15+ messages in thread
From: Michal Soltys @ 2011-11-06  1:09 UTC (permalink / raw)
  To: kaber; +Cc: davem, netdev

Remove the WARN_ON()s added in an earlier commit, as the cases they are
supposed to warn about are purely theoretical (they would require - aside
from extreme input values - a much smaller SM_SHIFT (sm<->m) and a larger
PSCHED_SHIFT (dx<->d)).

Signed-off-by: Michal Soltys <soltys@ziu.info>
---
 net/sched/sch_hfsc.c |    9 +--------
 1 files changed, 1 insertions(+), 8 deletions(-)

diff --git a/net/sched/sch_hfsc.c b/net/sched/sch_hfsc.c
index 05cecc0..d0cef0f 100644
--- a/net/sched/sch_hfsc.c
+++ b/net/sched/sch_hfsc.c
@@ -407,12 +407,7 @@ cftree_update(struct hfsc_class *cl)
  * 2) initialization functions: m2sm, m2ism, d2dx, dx2d
  *
  * These are used only during [re]setup/reports. Nothing particularly special
- * about those, as user input is limited to 32bit. One remark though:
- *
- * Theoretically, due to rounding up in d2dx() and m2sm(), it's possible that
- * in dx2d() and sm2m() we could overflow as well, but that would require input
- * values near 2^32-1 provided during d2dx() and m2sm(). Just in case, we catch
- * that possibility with WARN_ON().
+ * about those, as user input is limited to 32bit.
  *
  * 3) rtsc_min
  *
@@ -481,7 +476,6 @@ seg_y2x(u64 y, u64 ism)
 static u64
 m2sm(u32 m)
 {
-	WARN_ON(m & (1<<31));
 	return div_u64(((u64)m << SM_SHIFT) + PTPS - 1, PTPS);
 }
 
@@ -510,7 +504,6 @@ m2ism(u32 m)
 static u64
 d2dx(u32 d)
 {
-	WARN_ON(d & (1<<31));
 	return div_u64(((u64)d * PTPS) + USPS - 1, USPS);
 }
 
-- 
1.7.7.1

