* [PATCH -next] hfsc: reduce hfsc_sched to 14 cachelines
@ 2016-07-04 14:22 Florian Westphal
From: Florian Westphal @ 2016-07-04 14:22 UTC
  To: netdev; +Cc: Florian Westphal, Michal Soltys

hfsc_sched is huge (size: 920, cachelines: 15), but we can get it to 14
cachelines by placing level after filter_cnt (covering 4 byte hole) and
reducing period/nactive/flags to u32 (period is just a counter,
incremented when class becomes active -- 2**32 is plenty for this
purpose, also, long is only 32bit wide on 32bit platforms anyway).

cl_vtperiod is exported to userspace via tc_hfsc_stats, but its period
member is already u32, so no precision is lost there either.

Cc: Michal Soltys <soltys@ziu.info>
Signed-off-by: Florian Westphal <fw@strlen.de>
---
 net/sched/sch_hfsc.c | 10 +++++-----
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/net/sched/sch_hfsc.c b/net/sched/sch_hfsc.c
index 0fd8da7..6a6fb30 100644
--- a/net/sched/sch_hfsc.c
+++ b/net/sched/sch_hfsc.c
@@ -115,9 +115,9 @@ struct hfsc_class {
 	struct gnet_stats_basic_packed bstats;
 	struct gnet_stats_queue qstats;
 	struct gnet_stats_rate_est64 rate_est;
-	unsigned int	level;		/* class level in hierarchy */
 	struct tcf_proto __rcu *filter_list; /* filter list */
 	unsigned int	filter_cnt;	/* filter count */
+	unsigned int	level;		/* class level in hierarchy */
 
 	struct hfsc_sched *sched;	/* scheduler data */
 	struct hfsc_class *cl_parent;	/* parent class */
@@ -165,10 +165,10 @@ struct hfsc_class {
 	struct runtime_sc cl_virtual;	/* virtual curve */
 	struct runtime_sc cl_ulimit;	/* upperlimit curve */
 
-	unsigned long	cl_flags;	/* which curves are valid */
-	unsigned long	cl_vtperiod;	/* vt period sequence number */
-	unsigned long	cl_parentperiod;/* parent's vt period sequence number*/
-	unsigned long	cl_nactive;	/* number of active children */
+	u8		cl_flags;	/* which curves are valid */
+	u32		cl_vtperiod;	/* vt period sequence number */
+	u32		cl_parentperiod;/* parent's vt period sequence number*/
+	u32		cl_nactive;	/* number of active children */
 };
 
 struct hfsc_sched {
-- 
2.7.3
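
The layout effect of the diff above can be sketched in userspace. This is only an illustration: `demo_before` and `demo_after` are hypothetical stand-ins that copy the touched members and their neighbours, not the real struct hfsc_class, and the helper names are made up for this example. The exact numbers depend on the ABI; on a typical LP64 target the "before" layout has a 4-byte hole after each unsigned int that is followed by a pointer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical stand-in for the old member order and types. */
struct demo_before {
	unsigned int	level;		/* followed by padding on LP64 */
	void		*filter_list;
	unsigned int	filter_cnt;	/* followed by padding on LP64 */
	void		*sched;
	unsigned long	cl_flags;
	unsigned long	cl_vtperiod;
	unsigned long	cl_parentperiod;
	unsigned long	cl_nactive;
};

/* Hypothetical stand-in for the new member order and types. */
struct demo_after {
	void		*filter_list;
	unsigned int	filter_cnt;
	unsigned int	level;		/* sits where the padding used to be */
	void		*sched;
	uint8_t		cl_flags;
	uint32_t	cl_vtperiod;
	uint32_t	cl_parentperiod;
	uint32_t	cl_nactive;
};

/* Bytes saved by the new layout on the ABI this is compiled for. */
size_t demo_bytes_saved(void)
{
	assert(sizeof(struct demo_after) <= sizeof(struct demo_before));
	return sizeof(struct demo_before) - sizeof(struct demo_after);
}

/* Print sizes and the offsets of the two members that swapped places. */
void demo_print_layout(void)
{
	printf("before: size %zu, level at %zu, filter_cnt at %zu\n",
	       sizeof(struct demo_before),
	       offsetof(struct demo_before, level),
	       offsetof(struct demo_before, filter_cnt));
	printf("after:  size %zu, filter_cnt at %zu, level at %zu\n",
	       sizeof(struct demo_after),
	       offsetof(struct demo_after, filter_cnt),
	       offsetof(struct demo_after, level));
	printf("saved:  %zu bytes\n", demo_bytes_saved());
}
```

Calling demo_print_layout() from a small main() shows the layout for the local compiler. For the real structure, pahole on the built object is the tool to trust.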


* Re: [PATCH -next] hfsc: reduce hfsc_sched to 14 cachelines
From: Michal Soltys @ 2016-07-04 17:57 UTC
  To: Florian Westphal, netdev

On 2016-07-04 16:22, Florian Westphal wrote:
> hfsc_sched is huge (size: 920, cachelines: 15), but we can get it to 14
> cachelines by placing level after filter_cnt (covering 4 byte hole) and
> reducing period/nactive/flags to u32 (period is just a counter,
> incremented when class becomes active -- 2**32 is plenty for this
> purpose, also, long is only 32bit wide on 32bit platforms anyway).
> 
> cl_vtperiod is exported to userspace via tc_hfsc_stats, but its period
> member is already u32, so no precision is lost there either.
> 

It should be fine even if it overflowed, which in theory isn't that hard:
with a 1500-byte MTU on a 1 Gbit interface and a 900 Mbit transfer (meaning
some process throttling itself to 900 Mbit, not hfsc upperlimiting it), that
is about 75000 period changes/s, or ~16 hours to overflow. What really
matters (in init_vf()) is whether the period is different.
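
The arithmetic above can be checked with a short userspace sketch. It assumes, as the estimate seems to, that every full-size packet starts a new period. The helper names are hypothetical and nothing here is taken from sch_hfsc.c; the last helper only illustrates why a "has the period changed" test is unaffected by a u32 counter wrapping.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Period changes per second, assuming one new period per MTU-sized packet. */
uint64_t demo_periods_per_sec(uint64_t bits_per_sec, uint64_t mtu_bytes)
{
	return bits_per_sec / (mtu_bytes * 8);
}

/* Whole seconds until a 32-bit counter incremented at that rate wraps. */
uint64_t demo_secs_to_wrap(uint64_t periods_per_sec)
{
	return (UINT64_C(1) << 32) / periods_per_sec;
}

/* Hypothetical check: only inequality is tested, so a wrap is harmless. */
int demo_period_changed(uint32_t stored, uint32_t current)
{
	return stored != current;
}

/* Print the estimate for the 900 Mbit / 1500-byte MTU case. */
void demo_print_estimate(void)
{
	uint64_t rate = demo_periods_per_sec(900000000, 1500);
	uint64_t secs = demo_secs_to_wrap(rate);

	printf("%llu period changes/s, wrap after %llu s (about %.1f hours)\n",
	       (unsigned long long)rate, (unsigned long long)secs,
	       (double)secs / 3600.0);
}
```

With these inputs the rate is 75000 per second and the counter wraps after 57266 seconds, which is the ~16 hours quoted above.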

For the record, I have 2 patches that will trim some stuff further.
Unfortunately I have another 2 that will almost certainly put it back at
[hopefully only] 16 (if they get accepted, that is).

But there are some other candidates that might help (some not-so-tiny
functions defined as inline that are called from more than one place). E.g.
update_cfmin() is called from 3 places.


* Re: [PATCH -next] hfsc: reduce hfsc_sched to 14 cachelines
From: David Miller @ 2016-07-09  3:09 UTC
  To: fw; +Cc: netdev, soltys

From: Florian Westphal <fw@strlen.de>
Date: Mon,  4 Jul 2016 16:22:20 +0200

> hfsc_sched is huge (size: 920, cachelines: 15), but we can get it to 14
> cachelines by placing level after filter_cnt (covering 4 byte hole) and
> reducing period/nactive/flags to u32 (period is just a counter,
> incremented when class becomes active -- 2**32 is plenty for this
> purpose, also, long is only 32bit wide on 32bit platforms anyway).
> 
> cl_vtperiod is exported to userspace via tc_hfsc_stats, but its period
> member is already u32, so no precision is lost there either.
> 
> Cc: Michal Soltys <soltys@ziu.info>
> Signed-off-by: Florian Westphal <fw@strlen.de>

Applied, thanks.
