All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH net-next v8 0/3] net sched actions: improve dump performance
@ 2017-04-25 11:54 Jamal Hadi Salim
  2017-04-25 11:54 ` [PATCH net-next v8 1/3] net sched actions: Use proper root attribute table for actions Jamal Hadi Salim
                   ` (2 more replies)
  0 siblings, 3 replies; 33+ messages in thread
From: Jamal Hadi Salim @ 2017-04-25 11:54 UTC (permalink / raw)
  To: davem; +Cc: jiri, xiyou.wangcong, eric.dumazet, netdev, Jamal Hadi Salim

From: Jamal Hadi Salim <jhs@mojatatu.com>

Changes since v7:
-----------------

Jamal:
Patch 1 went out twice. Resend without two copies of patch 1

changes since v6:
-----------------

1) DaveM:
New rules for netlink messages. From now on we are going to start
checking for bits that are not used and rejecting anything we dont
understand. In the future this is going to require major changes
to user space code (tc etc). This is just a start.

To quote, David:
"
 Again, bits you aren't using now, make sure userspace doesn't
   set them.  And if it does, reject.
"

2) Jiri:
a)Fix the commit message to properly use "Fixes" description
b)Align assignments for nla_policy

Changes since v5:
----------------

0)
Remove use of BIT() because it is kernel specific. Requires a separate
patch (Jiri can submit that in his cleanups)

1)To paraphrase Eric D.

"memcpy(nla_data(count_attr), &cb->args[1], sizeof(u32));
wont work on 64bit BE machines because cb->args[1]
(which is 64 bit is larger in size than sizeof(u32))"

Fixed

2) Jiri Pirko

i) Spotted a bug fix mixed in the patch for wrong TLV
fix. Add patch 1/3 to address this. Make part of this
series because of dependencies.

ii) Rename ACT_LARGE_DUMP_ON -> TCA_FLAG_LARGE_DUMP_ON

iii) Satisfy Jiri's obsession against the noun "tcaa"
a)Rename struct nlattr *tcaa --> struct nlattr *tb
b)Rename TCAA_ACT_XXX -> TCA_ROOT_XXX

Changes since v4:
-----------------

1) Eric D.

pointed out that when all skb space is used up by the dump
there will be no space to insert the TCAA_ACT_COUNT attribute.

2) Jiri:

i) Change:

enum {
        TCAA_UNSPEC,
        TCAA_ACT_TAB,
        TCAA_ACT_FLAGS,
        TCAA_ACT_COUNT,
        TCAA_ACT_TIME_FILTER,
        __TCAA_MAX
};

#define TCAA_MAX (__TCAA_MAX - 1)
#define ACT_LARGE_DUMP_ON               (1 << 0)

to:
enum {
       TCAA_UNSPEC,
       TCAA_ACT_TAB,
#define TCA_ACT_TAB TCAA_ACT_TAB
       TCAA_ACT_FLAGS,
       TCAA_ACT_COUNT,
       __TCAA_MAX,
#define        TCAA_MAX (__TCAA_MAX - 1)
};

#define ACT_LARGE_DUMP_ON              BIT(0)

Jiri plans to followup with the rest of the code to make the
style consistent.

ii) Rename attribute TCAA_ACT_TIME_FILTER --> TCAA_ACT_TIME_DELTA

iii) Rename variable jiffy_filter --> jiffy_since
iv) Rename msecs_filter --> msecs_since
v) get rid of unused cb->args[0] and rename cb->args[4] to cb->args[0]

Jamal Hadi Salim (3):
  net sched actions: User proper root attribute table for actions
  net sched actions: dump more than TCA_ACT_MAX_PRIO actions per batch
  net sched actions: add time filter for action dumping

 include/uapi/linux/rtnetlink.h | 22 ++++++++++++--
 net/sched/act_api.c            | 66 +++++++++++++++++++++++++++++++++++-------
 2 files changed, 75 insertions(+), 13 deletions(-)

-- 
1.9.1


Jamal Hadi Salim (3):
  net sched actions: Use proper root attribute table for actions
  net sched actions: dump more than TCA_ACT_MAX_PRIO actions per batch
  net sched actions: add time filter for action dumping

 include/uapi/linux/rtnetlink.h | 23 ++++++++++--
 net/sched/act_api.c            | 79 ++++++++++++++++++++++++++++++++++++------
 2 files changed, 89 insertions(+), 13 deletions(-)

-- 
1.9.1

^ permalink raw reply	[flat|nested] 33+ messages in thread

* [PATCH net-next v8 1/3] net sched actions: Use proper root attribute table for actions
  2017-04-25 11:54 [PATCH net-next v8 0/3] net sched actions: improve dump performance Jamal Hadi Salim
@ 2017-04-25 11:54 ` Jamal Hadi Salim
  2017-04-25 18:42   ` Simon Horman
  2017-04-25 11:54 ` [PATCH net-next v8 2/3] net sched actions: dump more than TCA_ACT_MAX_PRIO actions per batch Jamal Hadi Salim
  2017-04-25 11:54 ` [PATCH net-next v8 3/3] net sched actions: add time filter for action dumping Jamal Hadi Salim
  2 siblings, 1 reply; 33+ messages in thread
From: Jamal Hadi Salim @ 2017-04-25 11:54 UTC (permalink / raw)
  To: davem; +Cc: jiri, xiyou.wangcong, eric.dumazet, netdev, Jamal Hadi Salim

From: Jamal Hadi Salim <jhs@mojatatu.com>

Bug fix for an issue which has been around for about a decade.
We got away with it because the enumeration was larger than needed.

Fixes: 7ba699c604ab ("[NET_SCHED]: Convert actions from rtnetlink to new netlink API")
Suggested-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com>
---
 net/sched/act_api.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/net/sched/act_api.c b/net/sched/act_api.c
index 82b1d48..9ce22b7 100644
--- a/net/sched/act_api.c
+++ b/net/sched/act_api.c
@@ -997,7 +997,7 @@ static int tc_ctl_action(struct sk_buff *skb, struct nlmsghdr *n,
 			 struct netlink_ext_ack *extack)
 {
 	struct net *net = sock_net(skb->sk);
-	struct nlattr *tca[TCA_ACT_MAX + 1];
+	struct nlattr *tca[TCAA_MAX + 1];
 	u32 portid = skb ? NETLINK_CB(skb).portid : 0;
 	int ret = 0, ovr = 0;
 
@@ -1005,7 +1005,7 @@ static int tc_ctl_action(struct sk_buff *skb, struct nlmsghdr *n,
 	    !netlink_capable(skb, CAP_NET_ADMIN))
 		return -EPERM;
 
-	ret = nlmsg_parse(n, sizeof(struct tcamsg), tca, TCA_ACT_MAX, NULL,
+	ret = nlmsg_parse(n, sizeof(struct tcamsg), tca, TCAA_MAX, NULL,
 			  extack);
 	if (ret < 0)
 		return ret;
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 33+ messages in thread

* [PATCH net-next v8 2/3] net sched actions: dump more than TCA_ACT_MAX_PRIO actions per batch
  2017-04-25 11:54 [PATCH net-next v8 0/3] net sched actions: improve dump performance Jamal Hadi Salim
  2017-04-25 11:54 ` [PATCH net-next v8 1/3] net sched actions: Use proper root attribute table for actions Jamal Hadi Salim
@ 2017-04-25 11:54 ` Jamal Hadi Salim
  2017-04-25 12:13   ` Jiri Pirko
  2017-04-25 11:54 ` [PATCH net-next v8 3/3] net sched actions: add time filter for action dumping Jamal Hadi Salim
  2 siblings, 1 reply; 33+ messages in thread
From: Jamal Hadi Salim @ 2017-04-25 11:54 UTC (permalink / raw)
  To: davem; +Cc: jiri, xiyou.wangcong, eric.dumazet, netdev, Jamal Hadi Salim

From: Jamal Hadi Salim <jhs@mojatatu.com>

When you dump hundreds of thousands of actions, getting only 32 per
dump batch even when the socket buffer and memory allocations allow
is inefficient.

With this change, the user will get as many as possibly fitting
within the given constraints available to the kernel.

The top level action TLV space is extended. An attribute
TCA_ROOT_FLAGS is used to carry flags; flag TCA_FLAG_LARGE_DUMP_ON
is set by the user indicating the user is capable of processing
these large dumps. Older user space which doesnt set this flag
doesnt get the large (than 32) batches.
The kernel uses the TCA_ROOT_COUNT attribute to tell the user how many
actions are put in a single batch. As such user space app knows how long
to iterate (independent of the type of action being dumped)
instead of hardcoded maximum of 32.

Some results dumping 1.5M actions, first unpatched tc which the
kernel doesnt help:

prompt$ time -p tc actions ls action gact | grep index | wc -l
1500000
real 1388.43
user 2.07
sys 1386.79

Now lets see a patched tc which sets the correct flags when requesting
a dump:

prompt$ time -p updatedtc actions ls action gact | grep index | wc -l
1500000
real 178.13
user 2.02
sys 176.96

That is about 8x performance improvement for tc which sets its
receive buffer to about 32K.

Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com>
---
 include/uapi/linux/rtnetlink.h | 22 ++++++++++++++--
 net/sched/act_api.c            | 59 +++++++++++++++++++++++++++++++++++-------
 2 files changed, 69 insertions(+), 12 deletions(-)

diff --git a/include/uapi/linux/rtnetlink.h b/include/uapi/linux/rtnetlink.h
index cce0613..5875467 100644
--- a/include/uapi/linux/rtnetlink.h
+++ b/include/uapi/linux/rtnetlink.h
@@ -674,10 +674,28 @@ struct tcamsg {
 	unsigned char	tca__pad1;
 	unsigned short	tca__pad2;
 };
+
+enum {
+	TCA_ROOT_UNSPEC,
+	TCA_ROOT_TAB,
+#define TCA_ACT_TAB TCA_ROOT_TAB
+	TCA_ROOT_FLAGS,
+	TCA_ROOT_COUNT,
+	__TCA_ROOT_MAX,
+#define	TCA_ROOT_MAX (__TCA_ROOT_MAX - 1)
+};
+
 #define TA_RTA(r)  ((struct rtattr*)(((char*)(r)) + NLMSG_ALIGN(sizeof(struct tcamsg))))
 #define TA_PAYLOAD(n) NLMSG_PAYLOAD(n,sizeof(struct tcamsg))
-#define TCA_ACT_TAB 1 /* attr type must be >=1 */	
-#define TCAA_MAX 1
+/* tcamsg flags stored in attribute TCA_ROOT_FLAGS
+ *
+ * TCA_FLAG_LARGE_DUMP_ON user->kernel to request for larger than TCA_ACT_MAX_PRIO
+ * actions in a dump. All dump responses will contain the number of actions
+ * being dumped stored in for user app's consumption in TCA_ROOT_COUNT
+ *
+ */
+#define TCA_FLAG_LARGE_DUMP_ON		(1 << 0)
+#define VALID_TCA_FLAGS TCA_FLAG_LARGE_DUMP_ON
 
 /* New extended info filters for IFLA_EXT_MASK */
 #define RTEXT_FILTER_VF		(1 << 0)
diff --git a/net/sched/act_api.c b/net/sched/act_api.c
index 9ce22b7..2756213 100644
--- a/net/sched/act_api.c
+++ b/net/sched/act_api.c
@@ -83,6 +83,7 @@ static int tcf_dump_walker(struct tcf_hashinfo *hinfo, struct sk_buff *skb,
 			   struct netlink_callback *cb)
 {
 	int err = 0, index = -1, i = 0, s_i = 0, n_i = 0;
+	u32 act_flags = cb->args[2];
 	struct nlattr *nest;
 
 	spin_lock_bh(&hinfo->lock);
@@ -111,14 +112,18 @@ static int tcf_dump_walker(struct tcf_hashinfo *hinfo, struct sk_buff *skb,
 			}
 			nla_nest_end(skb, nest);
 			n_i++;
-			if (n_i >= TCA_ACT_MAX_PRIO)
+			if (!(act_flags & TCA_FLAG_LARGE_DUMP_ON) &&
+			    n_i >= TCA_ACT_MAX_PRIO)
 				goto done;
 		}
 	}
 done:
 	spin_unlock_bh(&hinfo->lock);
-	if (n_i)
+	if (n_i) {
 		cb->args[0] += n_i;
+		if (act_flags & TCA_FLAG_LARGE_DUMP_ON)
+			cb->args[1] = n_i;
+	}
 	return n_i;
 
 nla_put_failure:
@@ -993,11 +998,15 @@ static int tcf_action_add(struct net *net, struct nlattr *nla,
 	return tcf_add_notify(net, n, &actions, portid);
 }
 
+static const struct nla_policy tcaa_policy[TCA_ROOT_MAX + 1] = {
+	[TCA_ROOT_FLAGS]      = { .type = NLA_U32 },
+};
+
 static int tc_ctl_action(struct sk_buff *skb, struct nlmsghdr *n,
 			 struct netlink_ext_ack *extack)
 {
 	struct net *net = sock_net(skb->sk);
-	struct nlattr *tca[TCAA_MAX + 1];
+	struct nlattr *tca[TCA_ROOT_MAX + 1];
 	u32 portid = skb ? NETLINK_CB(skb).portid : 0;
 	int ret = 0, ovr = 0;
 
@@ -1005,7 +1014,7 @@ static int tc_ctl_action(struct sk_buff *skb, struct nlmsghdr *n,
 	    !netlink_capable(skb, CAP_NET_ADMIN))
 		return -EPERM;
 
-	ret = nlmsg_parse(n, sizeof(struct tcamsg), tca, TCAA_MAX, NULL,
+	ret = nlmsg_parse(n, sizeof(struct tcamsg), tca, TCA_ROOT_MAX, NULL,
 			  extack);
 	if (ret < 0)
 		return ret;
@@ -1046,16 +1055,12 @@ static int tc_ctl_action(struct sk_buff *skb, struct nlmsghdr *n,
 	return ret;
 }
 
-static struct nlattr *find_dump_kind(const struct nlmsghdr *n)
+static struct nlattr *find_dump_kind(struct nlattr **nla)
 {
 	struct nlattr *tb1, *tb2[TCA_ACT_MAX + 1];
 	struct nlattr *tb[TCA_ACT_MAX_PRIO + 1];
-	struct nlattr *nla[TCAA_MAX + 1];
 	struct nlattr *kind;
 
-	if (nlmsg_parse(n, sizeof(struct tcamsg), nla, TCAA_MAX,
-			NULL, NULL) < 0)
-		return NULL;
 	tb1 = nla[TCA_ACT_TAB];
 	if (tb1 == NULL)
 		return NULL;
@@ -1073,6 +1078,16 @@ static struct nlattr *find_dump_kind(const struct nlmsghdr *n)
 	return kind;
 }
 
+static inline bool tca_flags_valid(u32 act_flags)
+{
+	u32 invalid_flags_mask  = ~VALID_TCA_FLAGS;
+
+	if (act_flags & invalid_flags_mask)
+		return false;
+
+	return true;
+}
+
 static int tc_dump_action(struct sk_buff *skb, struct netlink_callback *cb)
 {
 	struct net *net = sock_net(skb->sk);
@@ -1082,8 +1097,18 @@ static int tc_dump_action(struct sk_buff *skb, struct netlink_callback *cb)
 	struct tc_action_ops *a_o;
 	int ret = 0;
 	struct tcamsg *t = (struct tcamsg *) nlmsg_data(cb->nlh);
-	struct nlattr *kind = find_dump_kind(cb->nlh);
+	struct nlattr *count_attr = NULL;
+	struct nlattr *tb[TCA_ROOT_MAX + 1];
+	struct nlattr *kind = NULL;
+	u32 act_flags = 0;
+	u32 act_count = 0;
+
+	ret = nlmsg_parse(cb->nlh, sizeof(struct tcamsg), tb, TCA_ROOT_MAX,
+			  tcaa_policy, NULL);
+	if (ret < 0)
+		return ret;
 
+	kind = find_dump_kind(tb);
 	if (kind == NULL) {
 		pr_info("tc_dump_action: action bad kind\n");
 		return 0;
@@ -1093,14 +1118,25 @@ static int tc_dump_action(struct sk_buff *skb, struct netlink_callback *cb)
 	if (a_o == NULL)
 		return 0;
 
+	if (tb[TCA_ROOT_FLAGS])
+		act_flags = nla_get_u32(tb[TCA_ROOT_FLAGS]);
+
+	if (act_flags && !tca_flags_valid(act_flags))
+		return -EINVAL;
+
 	nlh = nlmsg_put(skb, NETLINK_CB(cb->skb).portid, cb->nlh->nlmsg_seq,
 			cb->nlh->nlmsg_type, sizeof(*t), 0);
 	if (!nlh)
 		goto out_module_put;
+
+	cb->args[2] = act_flags;
 	t = nlmsg_data(nlh);
 	t->tca_family = AF_UNSPEC;
 	t->tca__pad1 = 0;
 	t->tca__pad2 = 0;
+	count_attr = nla_reserve(skb, TCA_ROOT_COUNT, sizeof(u32));
+	if (!count_attr)
+		goto out_module_put;
 
 	nest = nla_nest_start(skb, TCA_ACT_TAB);
 	if (nest == NULL)
@@ -1113,6 +1149,9 @@ static int tc_dump_action(struct sk_buff *skb, struct netlink_callback *cb)
 	if (ret > 0) {
 		nla_nest_end(skb, nest);
 		ret = skb->len;
+		act_count = cb->args[1];
+		memcpy(nla_data(count_attr), &act_count, sizeof(u32));
+		cb->args[1] = 0;
 	} else
 		nlmsg_trim(skb, b);
 
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 33+ messages in thread

* [PATCH net-next v8 3/3] net sched actions: add time filter for action dumping
  2017-04-25 11:54 [PATCH net-next v8 0/3] net sched actions: improve dump performance Jamal Hadi Salim
  2017-04-25 11:54 ` [PATCH net-next v8 1/3] net sched actions: Use proper root attribute table for actions Jamal Hadi Salim
  2017-04-25 11:54 ` [PATCH net-next v8 2/3] net sched actions: dump more than TCA_ACT_MAX_PRIO actions per batch Jamal Hadi Salim
@ 2017-04-25 11:54 ` Jamal Hadi Salim
  2 siblings, 0 replies; 33+ messages in thread
From: Jamal Hadi Salim @ 2017-04-25 11:54 UTC (permalink / raw)
  To: davem; +Cc: jiri, xiyou.wangcong, eric.dumazet, netdev, Jamal Hadi Salim

From: Jamal Hadi Salim <jhs@mojatatu.com>

This adds support for filtering based on time since last used.
When we are dumping a large number of actions it is useful to
have the option of filtering based on when the action was last
used to reduce the amount of data crossing to user space.

With this patch the user space app sets the TCA_ROOT_TIME_DELTA
attribute with the value in milliseconds with "time of interest
since now".  The kernel converts this to jiffies and does the
filtering comparison matching entries that have seen activity
since then and returns them to user space.
Old kernels and old tc continue to work in legacy mode since
they dont specify this attribute.

Some example (we have 400 actions bound to 400 filters); at
installation time using  hacked tc which sets the time of
interest to 120 seconds:
prompt$ hackedtc actions ls action gact | grep index | wc -l
400

go get some coffee and  wait for > 120 seconds and try again:

prompt$ hackedtc actions ls action gact | grep index | wc -l
0

Lets see a filter bound to one of these actions:
..
filter pref 10 u32
filter pref 10 u32 fh 800: ht divisor 1
filter pref 10 u32 fh 800::800 order 2048 key ht 800 bkt 0 flowid 1:10  (rule hit 2 success 1)
  match 7f000002/ffffffff at 12 (success 1 )
    action order 1: gact action pass
     random type none pass val 0
     index 23 ref 2 bind 1 installed 1145 sec used 802 sec
    Action statistics:
    Sent 84 bytes 1 pkt (dropped 0, overlimits 0 requeues 0)
    backlog 0b 0p requeues 0
....

that coffee took long, no? It was good.

Now lets ping -c 1 127.0.0.2, then run the actions again:

prompt$ hackedtc actions ls action gact | grep index | wc -l
1

More details please:

prompt$ hackedtc -s actions ls action gact

    action order 0: gact action pass
     random type none pass val 0
     index 23 ref 2 bind 1 installed 1270 sec used 30 sec
    Action statistics:
    Sent 168 bytes 2 pkt (dropped 0, overlimits 0 requeues 0)
    backlog 0b 0p requeues 0

And the filter?

filter pref 10 u32
filter pref 10 u32 fh 800: ht divisor 1
filter pref 10 u32 fh 800::800 order 2048 key ht 800 bkt 0 flowid 1:10  (rule hit 4 success 2)
  match 7f000002/ffffffff at 12 (success 2 )
    action order 1: gact action pass
     random type none pass val 0
     index 23 ref 2 bind 1 installed 1324 sec used 84 sec
    Action statistics:
    Sent 168 bytes 2 pkt (dropped 0, overlimits 0 requeues 0)
    backlog 0b 0p requeues 0

Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com>
---
 include/uapi/linux/rtnetlink.h |  1 +
 net/sched/act_api.c            | 22 ++++++++++++++++++++--
 2 files changed, 21 insertions(+), 2 deletions(-)

diff --git a/include/uapi/linux/rtnetlink.h b/include/uapi/linux/rtnetlink.h
index 5875467..39c7499 100644
--- a/include/uapi/linux/rtnetlink.h
+++ b/include/uapi/linux/rtnetlink.h
@@ -681,6 +681,7 @@ enum {
 #define TCA_ACT_TAB TCA_ROOT_TAB
 	TCA_ROOT_FLAGS,
 	TCA_ROOT_COUNT,
+	TCA_ROOT_TIME_DELTA, /* in msecs */
 	__TCA_ROOT_MAX,
 #define	TCA_ROOT_MAX (__TCA_ROOT_MAX - 1)
 };
diff --git a/net/sched/act_api.c b/net/sched/act_api.c
index 2756213..c0aee2c 100644
--- a/net/sched/act_api.c
+++ b/net/sched/act_api.c
@@ -84,6 +84,7 @@ static int tcf_dump_walker(struct tcf_hashinfo *hinfo, struct sk_buff *skb,
 {
 	int err = 0, index = -1, i = 0, s_i = 0, n_i = 0;
 	u32 act_flags = cb->args[2];
+	unsigned long jiffy_since = cb->args[3];
 	struct nlattr *nest;
 
 	spin_lock_bh(&hinfo->lock);
@@ -101,6 +102,11 @@ static int tcf_dump_walker(struct tcf_hashinfo *hinfo, struct sk_buff *skb,
 			if (index < s_i)
 				continue;
 
+			if (jiffy_since &&
+			    time_after(jiffy_since,
+				       (unsigned long)p->tcfa_tm.lastuse))
+				continue;
+
 			nest = nla_nest_start(skb, n_i);
 			if (nest == NULL)
 				goto nla_put_failure;
@@ -118,9 +124,11 @@ static int tcf_dump_walker(struct tcf_hashinfo *hinfo, struct sk_buff *skb,
 		}
 	}
 done:
+	if (index > 0)
+		cb->args[0] = index + 1;
+
 	spin_unlock_bh(&hinfo->lock);
 	if (n_i) {
-		cb->args[0] += n_i;
 		if (act_flags & TCA_FLAG_LARGE_DUMP_ON)
 			cb->args[1] = n_i;
 	}
@@ -999,7 +1007,8 @@ static int tcf_action_add(struct net *net, struct nlattr *nla,
 }
 
 static const struct nla_policy tcaa_policy[TCA_ROOT_MAX + 1] = {
-	[TCA_ROOT_FLAGS]      = { .type = NLA_U32 },
+	[TCA_ROOT_FLAGS]           = { .type = NLA_U32 },
+	[TCA_ROOT_TIME_DELTA]      = { .type = NLA_U32 },
 };
 
 static int tc_ctl_action(struct sk_buff *skb, struct nlmsghdr *n,
@@ -1099,7 +1108,9 @@ static int tc_dump_action(struct sk_buff *skb, struct netlink_callback *cb)
 	struct tcamsg *t = (struct tcamsg *) nlmsg_data(cb->nlh);
 	struct nlattr *count_attr = NULL;
 	struct nlattr *tb[TCA_ROOT_MAX + 1];
+	unsigned long jiffy_since = 0;
 	struct nlattr *kind = NULL;
+	u32 msecs_since = 0;
 	u32 act_flags = 0;
 	u32 act_count = 0;
 
@@ -1124,12 +1135,19 @@ static int tc_dump_action(struct sk_buff *skb, struct netlink_callback *cb)
 	if (act_flags && !tca_flags_valid(act_flags))
 		return -EINVAL;
 
+	if (tb[TCA_ROOT_TIME_DELTA])
+		msecs_since = nla_get_u32(tb[TCA_ROOT_TIME_DELTA]);
+
 	nlh = nlmsg_put(skb, NETLINK_CB(cb->skb).portid, cb->nlh->nlmsg_seq,
 			cb->nlh->nlmsg_type, sizeof(*t), 0);
 	if (!nlh)
 		goto out_module_put;
 
+	if (msecs_since)
+		jiffy_since = jiffies - msecs_to_jiffies(msecs_since);
+
 	cb->args[2] = act_flags;
+	cb->args[3] = jiffy_since;
 	t = nlmsg_data(nlh);
 	t->tca_family = AF_UNSPEC;
 	t->tca__pad1 = 0;
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 33+ messages in thread

* Re: [PATCH net-next v8 2/3] net sched actions: dump more than TCA_ACT_MAX_PRIO actions per batch
  2017-04-25 11:54 ` [PATCH net-next v8 2/3] net sched actions: dump more than TCA_ACT_MAX_PRIO actions per batch Jamal Hadi Salim
@ 2017-04-25 12:13   ` Jiri Pirko
  2017-04-25 13:01     ` Jamal Hadi Salim
  0 siblings, 1 reply; 33+ messages in thread
From: Jiri Pirko @ 2017-04-25 12:13 UTC (permalink / raw)
  To: Jamal Hadi Salim; +Cc: davem, xiyou.wangcong, eric.dumazet, netdev

Tue, Apr 25, 2017 at 01:54:06PM CEST, jhs@mojatatu.com wrote:
>From: Jamal Hadi Salim <jhs@mojatatu.com>
>
>When you dump hundreds of thousands of actions, getting only 32 per
>dump batch even when the socket buffer and memory allocations allow
>is inefficient.
>
>With this change, the user will get as many as possibly fitting
>within the given constraints available to the kernel.
>
>The top level action TLV space is extended. An attribute
>TCA_ROOT_FLAGS is used to carry flags; flag TCA_FLAG_LARGE_DUMP_ON
>is set by the user indicating the user is capable of processing
>these large dumps. Older user space which doesnt set this flag
>doesnt get the large (than 32) batches.
>The kernel uses the TCA_ROOT_COUNT attribute to tell the user how many
>actions are put in a single batch. As such user space app knows how long
>to iterate (independent of the type of action being dumped)
>instead of hardcoded maximum of 32.
>
>Some results dumping 1.5M actions, first unpatched tc which the
>kernel doesnt help:
>
>prompt$ time -p tc actions ls action gact | grep index | wc -l
>1500000
>real 1388.43
>user 2.07
>sys 1386.79
>
>Now lets see a patched tc which sets the correct flags when requesting
>a dump:
>
>prompt$ time -p updatedtc actions ls action gact | grep index | wc -l
>1500000
>real 178.13
>user 2.02
>sys 176.96
>
>That is about 8x performance improvement for tc which sets its
>receive buffer to about 32K.
>
>Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com>
>---
> include/uapi/linux/rtnetlink.h | 22 ++++++++++++++--
> net/sched/act_api.c            | 59 +++++++++++++++++++++++++++++++++++-------
> 2 files changed, 69 insertions(+), 12 deletions(-)
>
>diff --git a/include/uapi/linux/rtnetlink.h b/include/uapi/linux/rtnetlink.h
>index cce0613..5875467 100644
>--- a/include/uapi/linux/rtnetlink.h
>+++ b/include/uapi/linux/rtnetlink.h
>@@ -674,10 +674,28 @@ struct tcamsg {
> 	unsigned char	tca__pad1;
> 	unsigned short	tca__pad2;
> };
>+
>+enum {
>+	TCA_ROOT_UNSPEC,
>+	TCA_ROOT_TAB,
>+#define TCA_ACT_TAB TCA_ROOT_TAB
>+	TCA_ROOT_FLAGS,
>+	TCA_ROOT_COUNT,
>+	__TCA_ROOT_MAX,
>+#define	TCA_ROOT_MAX (__TCA_ROOT_MAX - 1)
>+};
>+
> #define TA_RTA(r)  ((struct rtattr*)(((char*)(r)) + NLMSG_ALIGN(sizeof(struct tcamsg))))
> #define TA_PAYLOAD(n) NLMSG_PAYLOAD(n,sizeof(struct tcamsg))
>-#define TCA_ACT_TAB 1 /* attr type must be >=1 */	
>-#define TCAA_MAX 1
>+/* tcamsg flags stored in attribute TCA_ROOT_FLAGS
>+ *
>+ * TCA_FLAG_LARGE_DUMP_ON user->kernel to request for larger than TCA_ACT_MAX_PRIO
>+ * actions in a dump. All dump responses will contain the number of actions
>+ * being dumped stored in for user app's consumption in TCA_ROOT_COUNT
>+ *
>+ */
>+#define TCA_FLAG_LARGE_DUMP_ON		(1 << 0)

BIT (I think I mentioned this before)

>+#define VALID_TCA_FLAGS TCA_FLAG_LARGE_DUMP_ON

Again, namespace please. "TCA_ROOT_FLAGS_VALID"
Why is this UAPI?


> 
> /* New extended info filters for IFLA_EXT_MASK */
> #define RTEXT_FILTER_VF		(1 << 0)
>diff --git a/net/sched/act_api.c b/net/sched/act_api.c
>index 9ce22b7..2756213 100644
>--- a/net/sched/act_api.c
>+++ b/net/sched/act_api.c
>@@ -83,6 +83,7 @@ static int tcf_dump_walker(struct tcf_hashinfo *hinfo, struct sk_buff *skb,
> 			   struct netlink_callback *cb)
> {
> 	int err = 0, index = -1, i = 0, s_i = 0, n_i = 0;
>+	u32 act_flags = cb->args[2];
> 	struct nlattr *nest;
> 
> 	spin_lock_bh(&hinfo->lock);
>@@ -111,14 +112,18 @@ static int tcf_dump_walker(struct tcf_hashinfo *hinfo, struct sk_buff *skb,
> 			}
> 			nla_nest_end(skb, nest);
> 			n_i++;
>-			if (n_i >= TCA_ACT_MAX_PRIO)
>+			if (!(act_flags & TCA_FLAG_LARGE_DUMP_ON) &&
>+			    n_i >= TCA_ACT_MAX_PRIO)
> 				goto done;
> 		}
> 	}
> done:
> 	spin_unlock_bh(&hinfo->lock);
>-	if (n_i)
>+	if (n_i) {
> 		cb->args[0] += n_i;
>+		if (act_flags & TCA_FLAG_LARGE_DUMP_ON)
>+			cb->args[1] = n_i;
>+	}
> 	return n_i;
> 
> nla_put_failure:
>@@ -993,11 +998,15 @@ static int tcf_action_add(struct net *net, struct nlattr *nla,
> 	return tcf_add_notify(net, n, &actions, portid);
> }
> 
>+static const struct nla_policy tcaa_policy[TCA_ROOT_MAX + 1] = {
>+	[TCA_ROOT_FLAGS]      = { .type = NLA_U32 },
>+};
>+
> static int tc_ctl_action(struct sk_buff *skb, struct nlmsghdr *n,
> 			 struct netlink_ext_ack *extack)
> {
> 	struct net *net = sock_net(skb->sk);
>-	struct nlattr *tca[TCAA_MAX + 1];
>+	struct nlattr *tca[TCA_ROOT_MAX + 1];
> 	u32 portid = skb ? NETLINK_CB(skb).portid : 0;
> 	int ret = 0, ovr = 0;
> 
>@@ -1005,7 +1014,7 @@ static int tc_ctl_action(struct sk_buff *skb, struct nlmsghdr *n,
> 	    !netlink_capable(skb, CAP_NET_ADMIN))
> 		return -EPERM;
> 
>-	ret = nlmsg_parse(n, sizeof(struct tcamsg), tca, TCAA_MAX, NULL,
>+	ret = nlmsg_parse(n, sizeof(struct tcamsg), tca, TCA_ROOT_MAX, NULL,
> 			  extack);
> 	if (ret < 0)
> 		return ret;
>@@ -1046,16 +1055,12 @@ static int tc_ctl_action(struct sk_buff *skb, struct nlmsghdr *n,
> 	return ret;
> }
> 
>-static struct nlattr *find_dump_kind(const struct nlmsghdr *n)
>+static struct nlattr *find_dump_kind(struct nlattr **nla)
> {
> 	struct nlattr *tb1, *tb2[TCA_ACT_MAX + 1];
> 	struct nlattr *tb[TCA_ACT_MAX_PRIO + 1];
>-	struct nlattr *nla[TCAA_MAX + 1];
> 	struct nlattr *kind;
> 
>-	if (nlmsg_parse(n, sizeof(struct tcamsg), nla, TCAA_MAX,
>-			NULL, NULL) < 0)
>-		return NULL;
> 	tb1 = nla[TCA_ACT_TAB];
> 	if (tb1 == NULL)
> 		return NULL;
>@@ -1073,6 +1078,16 @@ static struct nlattr *find_dump_kind(const struct nlmsghdr *n)
> 	return kind;
> }
> 
>+static inline bool tca_flags_valid(u32 act_flags)
>+{
>+	u32 invalid_flags_mask  = ~VALID_TCA_FLAGS;
>+
>+	if (act_flags & invalid_flags_mask)
>+		return false;

I don't see how this resolves anything. VALID_TCA_FLAGS is set in stone
not going to change anytime in future, right? Then the TCA_ROOT_FLAGS
attr will always contain only one flag, right?
Then, I don't see why do we need this dance... u8 flag attr resolves
this in elegant way. I know I sound like a broken record. This is the
last time I'm commenting on this. I give up.


>+
>+	return true;
>+}
>+
> static int tc_dump_action(struct sk_buff *skb, struct netlink_callback *cb)
> {
> 	struct net *net = sock_net(skb->sk);
>@@ -1082,8 +1097,18 @@ static int tc_dump_action(struct sk_buff *skb, struct netlink_callback *cb)
> 	struct tc_action_ops *a_o;
> 	int ret = 0;
> 	struct tcamsg *t = (struct tcamsg *) nlmsg_data(cb->nlh);
>-	struct nlattr *kind = find_dump_kind(cb->nlh);
>+	struct nlattr *count_attr = NULL;
>+	struct nlattr *tb[TCA_ROOT_MAX + 1];
>+	struct nlattr *kind = NULL;
>+	u32 act_flags = 0;
>+	u32 act_count = 0;
>+
>+	ret = nlmsg_parse(cb->nlh, sizeof(struct tcamsg), tb, TCA_ROOT_MAX,
>+			  tcaa_policy, NULL);
>+	if (ret < 0)
>+		return ret;
> 
>+	kind = find_dump_kind(tb);
> 	if (kind == NULL) {
> 		pr_info("tc_dump_action: action bad kind\n");
> 		return 0;
>@@ -1093,14 +1118,25 @@ static int tc_dump_action(struct sk_buff *skb, struct netlink_callback *cb)
> 	if (a_o == NULL)
> 		return 0;
> 
>+	if (tb[TCA_ROOT_FLAGS])
>+		act_flags = nla_get_u32(tb[TCA_ROOT_FLAGS]);
>+
>+	if (act_flags && !tca_flags_valid(act_flags))
>+		return -EINVAL;
>+
> 	nlh = nlmsg_put(skb, NETLINK_CB(cb->skb).portid, cb->nlh->nlmsg_seq,
> 			cb->nlh->nlmsg_type, sizeof(*t), 0);
> 	if (!nlh)
> 		goto out_module_put;
>+
>+	cb->args[2] = act_flags;
> 	t = nlmsg_data(nlh);
> 	t->tca_family = AF_UNSPEC;
> 	t->tca__pad1 = 0;
> 	t->tca__pad2 = 0;
>+	count_attr = nla_reserve(skb, TCA_ROOT_COUNT, sizeof(u32));
>+	if (!count_attr)
>+		goto out_module_put;
> 
> 	nest = nla_nest_start(skb, TCA_ACT_TAB);
> 	if (nest == NULL)
>@@ -1113,6 +1149,9 @@ static int tc_dump_action(struct sk_buff *skb, struct netlink_callback *cb)
> 	if (ret > 0) {
> 		nla_nest_end(skb, nest);
> 		ret = skb->len;
>+		act_count = cb->args[1];
>+		memcpy(nla_data(count_attr), &act_count, sizeof(u32));
>+		cb->args[1] = 0;
> 	} else
> 		nlmsg_trim(skb, b);
> 
>-- 
>1.9.1
>

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH net-next v8 2/3] net sched actions: dump more than TCA_ACT_MAX_PRIO actions per batch
  2017-04-25 12:13   ` Jiri Pirko
@ 2017-04-25 13:01     ` Jamal Hadi Salim
  2017-04-25 16:04       ` Jiri Pirko
  0 siblings, 1 reply; 33+ messages in thread
From: Jamal Hadi Salim @ 2017-04-25 13:01 UTC (permalink / raw)
  To: Jiri Pirko; +Cc: davem, xiyou.wangcong, eric.dumazet, netdev

On 17-04-25 08:13 AM, Jiri Pirko wrote:
> Tue, Apr 25, 2017 at 01:54:06PM CEST, jhs@mojatatu.com wrote:


[..]

>> -#define TCAA_MAX 1
>> +/* tcamsg flags stored in attribute TCA_ROOT_FLAGS
>> + *
>> + * TCA_FLAG_LARGE_DUMP_ON user->kernel to request for larger than TCA_ACT_MAX_PRIO
>> + * actions in a dump. All dump responses will contain the number of actions
>> + * being dumped stored in for user app's consumption in TCA_ROOT_COUNT
>> + *
>> + */
>> +#define TCA_FLAG_LARGE_DUMP_ON		(1 << 0)
>
> BIT (I think I mentioned this before)
>

You did - but i took it out about two submissions back (per cover
letter) because it is no part of UAPI today. I noticed devlink was
using it but they defined  their own variant.
So if i added this, iproute2 doesnt compile. I could fix iproute2
to move it somewhere to a common header then restore this.

>> +#define VALID_TCA_FLAGS TCA_FLAG_LARGE_DUMP_ON
>
> Again, namespace please. "TCA_ROOT_FLAGS_VALID"

Good point.

> Why is this UAPI?
>

Should not be. I will fix in next update.


>
>>
>> /* New extended info filters for IFLA_EXT_MASK */
>> #define RTEXT_FILTER_VF		(1 << 0)
>> diff --git a/net/sched/act_api.c b/net/sched/act_api.c
>> index 9ce22b7..2756213 100644
>> --- a/net/sched/act_api.c
>> +++ b/net/sched/act_api.c
>> @@ -83,6 +83,7 @@ static int tcf_dump_walker(struct tcf_hashinfo *hinfo, struct sk_buff *skb,
>> 			   struct netlink_callback *cb)
>> {
>> 	int err = 0, index = -1, i = 0, s_i = 0, n_i = 0;
>> +	u32 act_flags = cb->args[2];
>> 	struct nlattr *nest;
>>
>> 	spin_lock_bh(&hinfo->lock);
>> @@ -111,14 +112,18 @@ static int tcf_dump_walker(struct tcf_hashinfo *hinfo, struct sk_buff *skb,
>> 			}
>> 			nla_nest_end(skb, nest);
>> 			n_i++;
>> -			if (n_i >= TCA_ACT_MAX_PRIO)
>> +			if (!(act_flags & TCA_FLAG_LARGE_DUMP_ON) &&
>> +			    n_i >= TCA_ACT_MAX_PRIO)
>> 				goto done;
>> 		}
>> 	}
>> done:
>> 	spin_unlock_bh(&hinfo->lock);
>> -	if (n_i)
>> +	if (n_i) {
>> 		cb->args[0] += n_i;
>> +		if (act_flags & TCA_FLAG_LARGE_DUMP_ON)
>> +			cb->args[1] = n_i;
>> +	}
>> 	return n_i;
>>
>> nla_put_failure:
>> @@ -993,11 +998,15 @@ static int tcf_action_add(struct net *net, struct nlattr *nla,
>> 	return tcf_add_notify(net, n, &actions, portid);
>> }
>>
>> +static const struct nla_policy tcaa_policy[TCA_ROOT_MAX + 1] = {
>> +	[TCA_ROOT_FLAGS]      = { .type = NLA_U32 },
>> +};
>> +
>> static int tc_ctl_action(struct sk_buff *skb, struct nlmsghdr *n,
>> 			 struct netlink_ext_ack *extack)
>> {
>> 	struct net *net = sock_net(skb->sk);
>> -	struct nlattr *tca[TCAA_MAX + 1];
>> +	struct nlattr *tca[TCA_ROOT_MAX + 1];
>> 	u32 portid = skb ? NETLINK_CB(skb).portid : 0;
>> 	int ret = 0, ovr = 0;
>>
>> @@ -1005,7 +1014,7 @@ static int tc_ctl_action(struct sk_buff *skb, struct nlmsghdr *n,
>> 	    !netlink_capable(skb, CAP_NET_ADMIN))
>> 		return -EPERM;
>>
>> -	ret = nlmsg_parse(n, sizeof(struct tcamsg), tca, TCAA_MAX, NULL,
>> +	ret = nlmsg_parse(n, sizeof(struct tcamsg), tca, TCA_ROOT_MAX, NULL,
>> 			  extack);
>> 	if (ret < 0)
>> 		return ret;
>> @@ -1046,16 +1055,12 @@ static int tc_ctl_action(struct sk_buff *skb, struct nlmsghdr *n,
>> 	return ret;
>> }
>>
>> -static struct nlattr *find_dump_kind(const struct nlmsghdr *n)
>> +static struct nlattr *find_dump_kind(struct nlattr **nla)
>> {
>> 	struct nlattr *tb1, *tb2[TCA_ACT_MAX + 1];
>> 	struct nlattr *tb[TCA_ACT_MAX_PRIO + 1];
>> -	struct nlattr *nla[TCAA_MAX + 1];
>> 	struct nlattr *kind;
>>
>> -	if (nlmsg_parse(n, sizeof(struct tcamsg), nla, TCAA_MAX,
>> -			NULL, NULL) < 0)
>> -		return NULL;
>> 	tb1 = nla[TCA_ACT_TAB];
>> 	if (tb1 == NULL)
>> 		return NULL;
>> @@ -1073,6 +1078,16 @@ static struct nlattr *find_dump_kind(const struct nlmsghdr *n)
>> 	return kind;
>> }
>>
>> +static inline bool tca_flags_valid(u32 act_flags)
>> +{
>> +	u32 invalid_flags_mask  = ~VALID_TCA_FLAGS;
>> +
>> +	if (act_flags & invalid_flags_mask)
>> +		return false;
>
> I don't see how this resolves anything. VALID_TCA_FLAGS is set in stone
> not going to change anytime in future, right?

Every time a new bit gets added VALID_TCA_FLAGS changes.

>Then the TCA_ROOT_FLAGS
> attr will always contain only one flag, right?

Not true - forever. Just in this patch discussion:
I have added 2 other flags since removed. So I cant predict the
future. You keep coming back to this assumption there will always
ever be this one flag. I am not following how you reach this
conclusion.

> Then, I don't see why do we need this dance... u8 flag attr resolves
> this in elegant way. I know I sound like a broken record. This is the
> last time I'm commenting on this. I give up.
>

Why is a u8 better than u32 Jiri?
The TLV space consumed is still 64 bits in both cases. If I define u8,
u16, u32 - it is still 64 bits being alloced. If i use a u8 then i have
24 bits which are pads - given current discussions I can never ever use
again!

If i keep 32 bits I am free to use those without ever changing the
data structure (except for the restrictions to now make sure nothing
gets set).

cheers,
jamal

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH net-next v8 2/3] net sched actions: dump more than TCA_ACT_MAX_PRIO actions per batch
  2017-04-25 13:01     ` Jamal Hadi Salim
@ 2017-04-25 16:04       ` Jiri Pirko
  2017-04-25 20:29         ` Jamal Hadi Salim
  2017-04-26 11:02         ` Simon Horman
  0 siblings, 2 replies; 33+ messages in thread
From: Jiri Pirko @ 2017-04-25 16:04 UTC (permalink / raw)
  To: Jamal Hadi Salim; +Cc: davem, xiyou.wangcong, eric.dumazet, netdev

Tue, Apr 25, 2017 at 03:01:22PM CEST, jhs@mojatatu.com wrote:
>On 17-04-25 08:13 AM, Jiri Pirko wrote:
>> Tue, Apr 25, 2017 at 01:54:06PM CEST, jhs@mojatatu.com wrote:
>
>
>[..]
>
>> > -#define TCAA_MAX 1
>> > +/* tcamsg flags stored in attribute TCA_ROOT_FLAGS
>> > + *
>> > + * TCA_FLAG_LARGE_DUMP_ON user->kernel to request for larger than TCA_ACT_MAX_PRIO
>> > + * actions in a dump. All dump responses will contain the number of actions
>> > + * being dumped stored in for user app's consumption in TCA_ROOT_COUNT
>> > + *
>> > + */
>> > +#define TCA_FLAG_LARGE_DUMP_ON		(1 << 0)
>> 
>> BIT (I think I mentioned this before)
>> 
>
>You did - but i took it out about two submissions back (per cover
>letter) because it is no part of UAPI today. I noticed devlink was
>using it but they defined  their own variant.
>So if i added this, iproute2 doesnt compile. I could fix iproute2
>to move it somewhere to a common header then restore this.

So fix iproute2. It is always first kernel, then iproute2.


>
>> > +#define VALID_TCA_FLAGS TCA_FLAG_LARGE_DUMP_ON
>> 
>> Again, namespace please. "TCA_ROOT_FLAGS_VALID"
>
>Good point.
>
>> Why is this UAPI?
>> 
>
>Should not be. I will fix in next update.
>
>
>> 
>> > 
>> > /* New extended info filters for IFLA_EXT_MASK */
>> > #define RTEXT_FILTER_VF		(1 << 0)
>> > diff --git a/net/sched/act_api.c b/net/sched/act_api.c
>> > index 9ce22b7..2756213 100644
>> > --- a/net/sched/act_api.c
>> > +++ b/net/sched/act_api.c
>> > @@ -83,6 +83,7 @@ static int tcf_dump_walker(struct tcf_hashinfo *hinfo, struct sk_buff *skb,
>> > 			   struct netlink_callback *cb)
>> > {
>> > 	int err = 0, index = -1, i = 0, s_i = 0, n_i = 0;
>> > +	u32 act_flags = cb->args[2];
>> > 	struct nlattr *nest;
>> > 
>> > 	spin_lock_bh(&hinfo->lock);
>> > @@ -111,14 +112,18 @@ static int tcf_dump_walker(struct tcf_hashinfo *hinfo, struct sk_buff *skb,
>> > 			}
>> > 			nla_nest_end(skb, nest);
>> > 			n_i++;
>> > -			if (n_i >= TCA_ACT_MAX_PRIO)
>> > +			if (!(act_flags & TCA_FLAG_LARGE_DUMP_ON) &&
>> > +			    n_i >= TCA_ACT_MAX_PRIO)
>> > 				goto done;
>> > 		}
>> > 	}
>> > done:
>> > 	spin_unlock_bh(&hinfo->lock);
>> > -	if (n_i)
>> > +	if (n_i) {
>> > 		cb->args[0] += n_i;
>> > +		if (act_flags & TCA_FLAG_LARGE_DUMP_ON)
>> > +			cb->args[1] = n_i;
>> > +	}
>> > 	return n_i;
>> > 
>> > nla_put_failure:
>> > @@ -993,11 +998,15 @@ static int tcf_action_add(struct net *net, struct nlattr *nla,
>> > 	return tcf_add_notify(net, n, &actions, portid);
>> > }
>> > 
>> > +static const struct nla_policy tcaa_policy[TCA_ROOT_MAX + 1] = {
>> > +	[TCA_ROOT_FLAGS]      = { .type = NLA_U32 },
>> > +};
>> > +
>> > static int tc_ctl_action(struct sk_buff *skb, struct nlmsghdr *n,
>> > 			 struct netlink_ext_ack *extack)
>> > {
>> > 	struct net *net = sock_net(skb->sk);
>> > -	struct nlattr *tca[TCAA_MAX + 1];
>> > +	struct nlattr *tca[TCA_ROOT_MAX + 1];
>> > 	u32 portid = skb ? NETLINK_CB(skb).portid : 0;
>> > 	int ret = 0, ovr = 0;
>> > 
>> > @@ -1005,7 +1014,7 @@ static int tc_ctl_action(struct sk_buff *skb, struct nlmsghdr *n,
>> > 	    !netlink_capable(skb, CAP_NET_ADMIN))
>> > 		return -EPERM;
>> > 
>> > -	ret = nlmsg_parse(n, sizeof(struct tcamsg), tca, TCAA_MAX, NULL,
>> > +	ret = nlmsg_parse(n, sizeof(struct tcamsg), tca, TCA_ROOT_MAX, NULL,
>> > 			  extack);
>> > 	if (ret < 0)
>> > 		return ret;
>> > @@ -1046,16 +1055,12 @@ static int tc_ctl_action(struct sk_buff *skb, struct nlmsghdr *n,
>> > 	return ret;
>> > }
>> > 
>> > -static struct nlattr *find_dump_kind(const struct nlmsghdr *n)
>> > +static struct nlattr *find_dump_kind(struct nlattr **nla)
>> > {
>> > 	struct nlattr *tb1, *tb2[TCA_ACT_MAX + 1];
>> > 	struct nlattr *tb[TCA_ACT_MAX_PRIO + 1];
>> > -	struct nlattr *nla[TCAA_MAX + 1];
>> > 	struct nlattr *kind;
>> > 
>> > -	if (nlmsg_parse(n, sizeof(struct tcamsg), nla, TCAA_MAX,
>> > -			NULL, NULL) < 0)
>> > -		return NULL;
>> > 	tb1 = nla[TCA_ACT_TAB];
>> > 	if (tb1 == NULL)
>> > 		return NULL;
>> > @@ -1073,6 +1078,16 @@ static struct nlattr *find_dump_kind(const struct nlmsghdr *n)
>> > 	return kind;
>> > }
>> > 
>> > +static inline bool tca_flags_valid(u32 act_flags)
>> > +{
>> > +	u32 invalid_flags_mask  = ~VALID_TCA_FLAGS;
>> > +
>> > +	if (act_flags & invalid_flags_mask)
>> > +		return false;
>> 
>> I don't see how this resolves anything. VALID_TCA_FLAGS is set in stone
>> not going to change anytime in future, right?
>
>Every time a new bit gets added VALID_TCA_FLAGS changes.

You mean flag that user can set? If that is the case, you are breaking
UAPI for newer app running on older kernel.


>
>> Then the TCA_ROOT_FLAGS
>> attr will always contain only one flag, right?
>
>Not true - forever. Just in this patch discussion:
>I have added 2 other flags since removed. So I cant predict the
>future. You keep coming back to this assumption there will always
>ever be this one flag. I am not following how you reach this
>conclusion.

I read this paragraph 5 times, still I don't get what you say :/


>
>> Then, I don't see why do we need this dance... u8 flag attr resolves
>> this in elegant way. I know I sound like a broken record. This is the
>> last time I'm commenting on this. I give up.
>> 
>
>Why is a u8 better than u32 Jiri?
>The TLV space consumed is still 64 bits in both cases. If I define u8,
>u16, u32 - it is still 64 bits being alloced. If i use a u8 then i have
>24 bits which are pads - given current discussions I can never ever use
>again!

I don't care, use u8 or u32. Just 1 attr - 1 flag.


>
>If i keep 32 bits I am free to use those without ever changing the
>data structure (except for the restrictions to now make sure nothing
>gets set).

what datastructure? I'm confused.

>
>cheers,
>jamal

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH net-next v8 1/3] net sched actions: Use proper root attribute table for actions
  2017-04-25 11:54 ` [PATCH net-next v8 1/3] net sched actions: Use proper root attribute table for actions Jamal Hadi Salim
@ 2017-04-25 18:42   ` Simon Horman
  0 siblings, 0 replies; 33+ messages in thread
From: Simon Horman @ 2017-04-25 18:42 UTC (permalink / raw)
  To: Jamal Hadi Salim; +Cc: davem, jiri, xiyou.wangcong, eric.dumazet, netdev

On Tue, Apr 25, 2017 at 07:54:05AM -0400, Jamal Hadi Salim wrote:
> From: Jamal Hadi Salim <jhs@mojatatu.com>
> 
> Bug fix for an issue which has been around for about a decade.
> We got away with it because the enumeration was larger than needed.
> 
> Fixes: 7ba699c604ab ("[NET_SCHED]: Convert actions from rtnetlink to new netlink API")
> Suggested-by: Jiri Pirko <jiri@mellanox.com>
> Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com>

Reviewed-by: Simon Horman <simon.horman@netronome.com>

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH net-next v8 2/3] net sched actions: dump more than TCA_ACT_MAX_PRIO actions per batch
  2017-04-25 16:04       ` Jiri Pirko
@ 2017-04-25 20:29         ` Jamal Hadi Salim
  2017-04-26  6:19           ` Jiri Pirko
  2017-04-26 11:02         ` Simon Horman
  1 sibling, 1 reply; 33+ messages in thread
From: Jamal Hadi Salim @ 2017-04-25 20:29 UTC (permalink / raw)
  To: Jiri Pirko; +Cc: davem, xiyou.wangcong, eric.dumazet, netdev

On 17-04-25 12:04 PM, Jiri Pirko wrote:
> Tue, Apr 25, 2017 at 03:01:22PM CEST, jhs@mojatatu.com wrote:
>> On 17-04-25 08:13 AM, Jiri Pirko wrote:
>>> Tue, Apr 25, 2017 at 01:54:06PM CEST, jhs@mojatatu.com wrote:

>>>>
>>>> +static inline bool tca_flags_valid(u32 act_flags)
>>>> +{
>>>> +	u32 invalid_flags_mask  = ~VALID_TCA_FLAGS;
>>>> +
>>>> +	if (act_flags & invalid_flags_mask)
>>>> +		return false;
>>>
>>> I don't see how this resolves anything. VALID_TCA_FLAGS is set in stone
>>> not going to change anytime in future, right?
>>
>> Every time a new bit gets added VALID_TCA_FLAGS changes.
>
> You mean flag that user can set? If that is the case, you are breaking
> UAPI for newer app running on older kernel.
>

Ok, let me try to explain with more clarity. The rules Iam
trying to follow are:
if i see any bit set that i dont understand I will reject.

So lets in first kernel I have support for bit 0.
My validation check is to make sure only bit 0 is set.
The valid_flags currently then only constitutes bit 0.
i.e
If you set bit 2 or 3, the function above will reject and i return
the error to the user.

That is expected behavior correct?

3 months down the road:
I add two flags - bit 1 and 2.
So now my valid_flags changes to bits 1, 2 and 0.

The function above will now return true for bits 0-2 but
will reject if you set bit 3.

That is expected behavior, correct?

On u32/16/8:
I am choosing u32 so it allows me to add more upto 32 bit flags.
Not all 32 are needed today but it is better insurance.
If I used u8 then the 24 of those 32 bits i dont use will be used
as pads in the TLV. So it doesnt make sense for me to use a u8/16.

Does that make more sense?

cheers,
jamal

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH net-next v8 2/3] net sched actions: dump more than TCA_ACT_MAX_PRIO actions per batch
  2017-04-25 20:29         ` Jamal Hadi Salim
@ 2017-04-26  6:19           ` Jiri Pirko
  2017-04-26 11:07             ` Simon Horman
  2017-04-26 11:48             ` Jamal Hadi Salim
  0 siblings, 2 replies; 33+ messages in thread
From: Jiri Pirko @ 2017-04-26  6:19 UTC (permalink / raw)
  To: Jamal Hadi Salim; +Cc: davem, xiyou.wangcong, eric.dumazet, netdev

Tue, Apr 25, 2017 at 10:29:40PM CEST, jhs@mojatatu.com wrote:
>On 17-04-25 12:04 PM, Jiri Pirko wrote:
>> Tue, Apr 25, 2017 at 03:01:22PM CEST, jhs@mojatatu.com wrote:
>> > On 17-04-25 08:13 AM, Jiri Pirko wrote:
>> > > Tue, Apr 25, 2017 at 01:54:06PM CEST, jhs@mojatatu.com wrote:
>
>> > > > 
>> > > > +static inline bool tca_flags_valid(u32 act_flags)
>> > > > +{
>> > > > +	u32 invalid_flags_mask  = ~VALID_TCA_FLAGS;
>> > > > +
>> > > > +	if (act_flags & invalid_flags_mask)
>> > > > +		return false;
>> > > 
>> > > I don't see how this resolves anything. VALID_TCA_FLAGS is set in stone
>> > > not going to change anytime in future, right?
>> > 
>> > Every time a new bit gets added VALID_TCA_FLAGS changes.
>> 
>> You mean flag that user can set? If that is the case, you are breaking
>> UAPI for newer app running on older kernel.
>> 
>
>Ok, let me try to explain with more clarity. The rules Iam
>trying to follow are:
>if i see any bit set that i dont understand I will reject.
>
>So lets in first kernel I have support for bit 0.
>My validation check is to make sure only bit 0 is set.
>The valid_flags currently then only constitutes bit 0.
>i.e
>If you set bit 2 or 3, the function above will reject and i return
>the error to the user.
>
>That is expected behavior correct?
>
>3 months down the road:
>I add two flags - bit 1 and 2.
>So now my valid_flags changes to bits 1, 2 and 0.
>
>The function above will now return true for bits 0-2 but
>will reject if you set bit 3.
>
>That is expected behavior, correct?

The same app compiled against new kernel with bits (0, 1, 2) will run with
this kernel good. But if you run it with older kernel, the kernel (0)
would refuse. Is that ok?



>
>On u32/16/8:
>I am choosing u32 so it allows me to add more upto 32 bit flags.
>Not all 32 are needed today but it is better insurance.
>If I used u8 then the 24 of those 32 bits i dont use will be used
>as pads in the TLV. So it doesnt make sense for me to use a u8/16.

Jamal, note that I never suggested having more flags in a single attr.
Therefore I suggested u8 to carry a single flag.

You say that it has performance impact having 3 flag attrs in compare to
one bit flag attr. Could you please provide some numbers?

I expect that you will not be able to show the difference. And if there
is no difference in performance, your main argument goes away. And we
can do this in a nice, clear, TLV fashion.


>
>Does that make more sense?
>
>cheers,
>jamal

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH net-next v8 2/3] net sched actions: dump more than TCA_ACT_MAX_PRIO actions per batch
  2017-04-25 16:04       ` Jiri Pirko
  2017-04-25 20:29         ` Jamal Hadi Salim
@ 2017-04-26 11:02         ` Simon Horman
  2017-04-26 12:46           ` Jamal Hadi Salim
  1 sibling, 1 reply; 33+ messages in thread
From: Simon Horman @ 2017-04-26 11:02 UTC (permalink / raw)
  To: Jiri Pirko; +Cc: Jamal Hadi Salim, davem, xiyou.wangcong, eric.dumazet, netdev

On Tue, Apr 25, 2017 at 06:04:45PM +0200, Jiri Pirko wrote:
> Tue, Apr 25, 2017 at 03:01:22PM CEST, jhs@mojatatu.com wrote:
> >On 17-04-25 08:13 AM, Jiri Pirko wrote:
> >> Tue, Apr 25, 2017 at 01:54:06PM CEST, jhs@mojatatu.com wrote:
> >
> >
> >[..]
> >
> >> > -#define TCAA_MAX 1
> >> > +/* tcamsg flags stored in attribute TCA_ROOT_FLAGS
> >> > + *
> >> > + * TCA_FLAG_LARGE_DUMP_ON user->kernel to request for larger than TCA_ACT_MAX_PRIO
> >> > + * actions in a dump. All dump responses will contain the number of actions
> >> > + * being dumped stored in for user app's consumption in TCA_ROOT_COUNT
> >> > + *
> >> > + */
> >> > +#define TCA_FLAG_LARGE_DUMP_ON		(1 << 0)
> >> 
> >> BIT (I think I mentioned this before)
> >> 
> >
> >You did - but i took it out about two submissions back (per cover
> >letter) because it is no part of UAPI today. I noticed devlink was
> >using it but they defined  their own variant.
> >So if i added this, iproute2 doesnt compile. I could fix iproute2
> >to move it somewhere to a common header then restore this.
> 
> So fix iproute2. It is always first kernel, then iproute2.

Perhaps I am missing the point or somehow misguided but I would expect that
if the UAPI uses BIT() it also provides BIT().

...

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH net-next v8 2/3] net sched actions: dump more than TCA_ACT_MAX_PRIO actions per batch
  2017-04-26  6:19           ` Jiri Pirko
@ 2017-04-26 11:07             ` Simon Horman
  2017-04-26 13:00               ` Jamal Hadi Salim
  2017-04-26 11:48             ` Jamal Hadi Salim
  1 sibling, 1 reply; 33+ messages in thread
From: Simon Horman @ 2017-04-26 11:07 UTC (permalink / raw)
  To: Jiri Pirko; +Cc: Jamal Hadi Salim, davem, xiyou.wangcong, eric.dumazet, netdev

On Wed, Apr 26, 2017 at 08:19:04AM +0200, Jiri Pirko wrote:
> Tue, Apr 25, 2017 at 10:29:40PM CEST, jhs@mojatatu.com wrote:

...

> >So lets in first kernel I have support for bit 0.
> >My validation check is to make sure only bit 0 is set.
> >The valid_flags currently then only constitutes bit 0.
> >i.e
> >If you set bit 2 or 3, the function above will reject and i return
> >the error to the user.
> >
> >That is expected behavior correct?
> >
> >3 months down the road:
> >I add two flags - bit 1 and 2.
> >So now my valid_flags changes to bits 1, 2 and 0.
> >
> >The function above will now return true for bits 0-2 but
> >will reject if you set bit 3.
> >
> >That is expected behavior, correct?
> 
> The same app compiled against new kernel with bits (0, 1, 2) will run with
> this kernel good. But if you run it with older kernel, the kernel (0)
> would refuse. Is that ok?

Conversely, if an app is compiled against a new kernel and uses ATTR0, ATTR1
and ATTR2 all will be well. But if you run it against the older kernel
ATTR1 and ATTR2 will be silently ignored. I believe that is how its always
been but is that ok?

> >On u32/16/8:
> >I am choosing u32 so it allows me to add more upto 32 bit flags.
> >Not all 32 are needed today but it is better insurance.
> >If I used u8 then the 24 of those 32 bits i dont use will be used
> >as pads in the TLV. So it doesnt make sense for me to use a u8/16.
> 
> Jamal, note that I never suggested having more flags in a single attr.
> Therefore I suggested u8 to carry a single flag.
> 
> You say that it has performance impact having 3 flag attrs in compare to
> one bit flag attr. Could you please provide some numbers?
> 
> I expect that you will not be able to show the difference. And if there
> is no difference in performance, your main argument goes away. And we
> can do this in a nice, clear, TLV fashion.

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH net-next v8 2/3] net sched actions: dump more than TCA_ACT_MAX_PRIO actions per batch
  2017-04-26  6:19           ` Jiri Pirko
  2017-04-26 11:07             ` Simon Horman
@ 2017-04-26 11:48             ` Jamal Hadi Salim
  2017-04-26 12:08               ` Jiri Pirko
  1 sibling, 1 reply; 33+ messages in thread
From: Jamal Hadi Salim @ 2017-04-26 11:48 UTC (permalink / raw)
  To: Jiri Pirko
  Cc: davem, xiyou.wangcong, eric.dumazet, netdev, Simon Horman,
	Benjamin LaHaise

On 17-04-26 02:19 AM, Jiri Pirko wrote:
> Tue, Apr 25, 2017 at 10:29:40PM CEST, jhs@mojatatu.com wrote:
>> On 17-04-25 12:04 PM, Jiri Pirko wrote:
[..]
>> That is expected behavior correct?
>>
>> 3 months down the road:
>> I add two flags - bit 1 and 2.
>> So now my valid_flags changes to bits 1, 2 and 0.
>>
>> The function above will now return true for bits 0-2 but
>> will reject if you set bit 3.
>>
>> That is expected behavior, correct?
>
> The same app compiled against new kernel with bits (0, 1, 2) will run with
> this kernel good. But if you run it with older kernel, the kernel (0)
> would refuse. Is that ok?
>


Dave said that is what has to be done.
To quote from the cover letter:

--------- START QUOTE -------------
changes since v6:
-----------------

1) DaveM:
New rules for netlink messages. From now on we are going to start
checking for bits that are not used and rejecting anything we dont
understand. In the future this is going to require major changes
to user space code (tc etc). This is just a start.

To quote, David:
"
  Again, bits you aren't using now, make sure userspace doesn't
    set them.  And if it does, reject.
"

---------- END QUOTE -----------

I am going to send the patches - if you dont like this then speak up and
David needs to be convinced.  This is UAPI - once patches are in it is
cast in stone and I dont mind a discussion to make sure we get it right.

> Jamal, note that I never suggested having more flags in a single attr.
> Therefore I suggested u8 to carry a single flag.
>

Jiri, thats our main difference unless I am misunderstanding you.

I believe you should squeeze as many as you can in one single attribute.
You believe you should have only one flag per attribute.

Aesthetically a u8 looks good. Performance wise it is bad when you
have many entries to deal with.


> You say that it has performance impact having 3 flag attrs in compare to
> one bit flag attr. Could you please provide some numbers?
>

I have experience with dealing with a massive amount of various dumps
and (batch) sets and it always boils down to one thing:
_how much data is exchanged between user and kernel_
3 flags encoded as bits in a u32 attribute cost 64 bits.
Encoded separately cost 3x that.

Believe me, it _does make a difference_ in performance.

My least favorite subsystem is bridge. The bridge code has
tons of flags in those entries that are sent to/from kernel as u8
attributes. It is painful.

For something more recent, lets look at this commit from Ben on Flower:
+       TCA_FLOWER_KEY_MPLS_TTL,        /* u8 - 8 bits */
+       TCA_FLOWER_KEY_MPLS_BOS,        /* u8 - 1 bit */
+       TCA_FLOWER_KEY_MPLS_TC,         /* u8 - 3 bits */
+       TCA_FLOWER_KEY_MPLS_LABEL,      /* be32 - 20 bits */

Yes, that looks pretty, but:
That would have fit in one attribute with a u32. Mask attributes would
be eliminated with a second 32 bit - all in the same singular
attribute.

Imagine if i have 1M flower entries. If you add up the mask the cost
of these things is about 3*2*64 bits more per entry compared to putting
the mpls info/mask in one attribute.
At 1M entries that is a few MBs of data being exchanged.

> I expect that you will not be able to show the difference. And if there
> is no difference in performance, your main argument goes away. And we
> can do this in a nice, clear, TLV fashion.
>

I love TLVs Jiri. But there is a difference between management and
control. The former cares more about humans the later needs to get shit
done faster. The extreme version of the later is using json. But you
try to get the json guy to do setting or dumping 1M entries and you
can take a long distance trip and come back and they are not done.

I want to use TLVs but plan for optimization/performance as well.

cheers,
jamal

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH net-next v8 2/3] net sched actions: dump more than TCA_ACT_MAX_PRIO actions per batch
  2017-04-26 11:48             ` Jamal Hadi Salim
@ 2017-04-26 12:08               ` Jiri Pirko
  2017-04-26 13:14                 ` Jamal Hadi Salim
  0 siblings, 1 reply; 33+ messages in thread
From: Jiri Pirko @ 2017-04-26 12:08 UTC (permalink / raw)
  To: Jamal Hadi Salim
  Cc: davem, xiyou.wangcong, eric.dumazet, netdev, Simon Horman,
	Benjamin LaHaise

Wed, Apr 26, 2017 at 01:48:29PM CEST, jhs@mojatatu.com wrote:
>On 17-04-26 02:19 AM, Jiri Pirko wrote:
>> Tue, Apr 25, 2017 at 10:29:40PM CEST, jhs@mojatatu.com wrote:
>> > On 17-04-25 12:04 PM, Jiri Pirko wrote:
>[..]
>> > That is expected behavior correct?
>> > 
>> > 3 months down the road:
>> > I add two flags - bit 1 and 2.
>> > So now my valid_flags changes to bits 1, 2 and 0.
>> > 
>> > The function above will now return true for bits 0-2 but
>> > will reject if you set bit 3.
>> > 
>> > That is expected behavior, correct?
>> 
>> The same app compiled against new kernel with bits (0, 1, 2) will run with
>> this kernel good. But if you run it with older kernel, the kernel (0)
>> would refuse. Is that ok?
>> 
>
>
>Dave said that is what has to be done.
>To quote from the cover letter:
>
>--------- START QUOTE -------------
>changes since v6:
>-----------------
>
>1) DaveM:
>New rules for netlink messages. From now on we are going to start
>checking for bits that are not used and rejecting anything we dont
>understand. In the future this is going to require major changes
>to user space code (tc etc). This is just a start.
>
>To quote, David:
>"
> Again, bits you aren't using now, make sure userspace doesn't
>   set them.  And if it does, reject.
>"
>
>---------- END QUOTE -----------
>
>I am going to send the patches - if you dont like this then speak up and
>David needs to be convinced.  This is UAPI - once patches are in it is
>cast in stone and I dont mind a discussion to make sure we get it right.
>
>> Jamal, note that I never suggested having more flags in a single attr.
>> Therefore I suggested u8 to carry a single flag.
>> 
>
>Jiri, thats our main difference unless I am misunderstanding you.
>
>I believe you should squeeze as many as you can in one single attribute.
>You believe you should have only one flag per attribute.
>
>Aesthetically a u8 looks good. Performance wise it is bad when you
>have many entries to deal with.
>
>
>> You say that it has performance impact having 3 flag attrs in compare to
>> one bit flag attr. Could you please provide some numbers?
>> 
>
>I have experience with dealing with a massive amount of various dumps
>and (batch) sets and it always boils down to one thing:
>_how much data is exchanged between user and kernel_
>3 flags encoded as bits in a u32 attribute cost 64 bits.
>Encoded separately cost 3x that.
>
>Believe me, it _does make a difference_ in performance.
>
>My least favorite subsystem is bridge. The bridge code has
>tons of flags in those entries that are sent to/from kernel as u8
>attributes. It is painful.
>
>For something more recent, lets look at this commit from Ben on Flower:
>+       TCA_FLOWER_KEY_MPLS_TTL,        /* u8 - 8 bits */
>+       TCA_FLOWER_KEY_MPLS_BOS,        /* u8 - 1 bit */
>+       TCA_FLOWER_KEY_MPLS_TC,         /* u8 - 3 bits */
>+       TCA_FLOWER_KEY_MPLS_LABEL,      /* be32 - 20 bits */
>
>Yes, that looks pretty, but:
>That would have fit in one attribute with a u32. Mask attributes would
>be eliminated with a second 32 bit - all in the same singular
>attribute.
>
>Imagine if i have 1M flower entries. If you add up the mask the cost
>of these things is about 3*2*64 bits more per entry compared to putting
>the mpls info/mask in one attribute.
>At 1M entries that is a few MBs of data being exchanged.

I can do the math :) Yet still, I would like to see the numbers :)
Because I believe that is the only way to end this lenghty converstation
once and forever...


>
>> I expect that you will not be able to show the difference. And if there
>> is no difference in performance, your main argument goes away. And we
>> can do this in a nice, clear, TLV fashion.
>> 
>
>I love TLVs Jiri. But there is a difference between management and
>control. The former cares more about humans the later needs to get shit
>done faster. The extreme version of the later is using json. But you
>try to get the json guy to do setting or dumping 1M entries and you
>can take a long distance trip and come back and they are not done.
>
>I want to use TLVs but plan for optimization/performance as well.
>
>cheers,
>jamal
>
>
>

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH net-next v8 2/3] net sched actions: dump more than TCA_ACT_MAX_PRIO actions per batch
  2017-04-26 11:02         ` Simon Horman
@ 2017-04-26 12:46           ` Jamal Hadi Salim
  2017-04-26 13:05             ` Jiri Pirko
  0 siblings, 1 reply; 33+ messages in thread
From: Jamal Hadi Salim @ 2017-04-26 12:46 UTC (permalink / raw)
  To: Simon Horman, Jiri Pirko; +Cc: davem, xiyou.wangcong, eric.dumazet, netdev

On 17-04-26 07:02 AM, Simon Horman wrote:
> On Tue, Apr 25, 2017 at 06:04:45PM +0200, Jiri Pirko wrote:
[..]

>> So fix iproute2. It is always first kernel, then iproute2.
>
> Perhaps I am missing the point or somehow misguided but I would expect that
> if the UAPI uses BIT() it also provides BIT().

There is a user of BIT() already in iproute2 (devlink). We can move
the code to be more generally available for other iproute2 users.
Then this UAPI change makes use of it.

cheers,
jamal

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH net-next v8 2/3] net sched actions: dump more than TCA_ACT_MAX_PRIO actions per batch
  2017-04-26 11:07             ` Simon Horman
@ 2017-04-26 13:00               ` Jamal Hadi Salim
  0 siblings, 0 replies; 33+ messages in thread
From: Jamal Hadi Salim @ 2017-04-26 13:00 UTC (permalink / raw)
  To: Simon Horman, Jiri Pirko; +Cc: davem, xiyou.wangcong, eric.dumazet, netdev

On 17-04-26 07:07 AM, Simon Horman wrote:
> On Wed, Apr 26, 2017 at 08:19:04AM +0200, Jiri Pirko wrote:
>> Tue, Apr 25, 2017 at 10:29:40PM CEST, jhs@mojatatu.com wrote:
>
> ...
>
>>> So lets in first kernel I have support for bit 0.
>>> My validation check is to make sure only bit 0 is set.
>>> The valid_flags currently then only constitutes bit 0.
>>> i.e
>>> If you set bit 2 or 3, the function above will reject and i return
>>> the error to the user.
>>>
>>> That is expected behavior correct?
>>>
>>> 3 months down the road:
>>> I add two flags - bit 1 and 2.
>>> So now my valid_flags changes to bits 1, 2 and 0.
>>>
>>> The function above will now return true for bits 0-2 but
>>> will reject if you set bit 3.
>>>
>>> That is expected behavior, correct?
>>
>> The same app compiled against new kernel with bits (0, 1, 2) will run with
>> this kernel good. But if you run it with older kernel, the kernel (0)
>> would refuse. Is that ok?
>
> Conversely, if an app is compiled against a new kernel and uses ATTR0, ATTR1
> and ATTR2 all will be well. But if you run it against the older kernel
> ATTR1 and ATTR2 will be silently ignored. I believe that is how its always
> been but is that ok?
>

I think the answer is much complex than ok/notok.

If i have bits that when not supported by the kernel would result in
bad operations then of course kernel ignoring their presence is a very
bad thing (Dave's example is "I want you to encrypt" which an old
kernel wont understand).
There's also a security concern where flags are being set and are being
randomly accepted by old kernels resulting in root access etc.

The other side of the coin, if I really dont care if you dont understand
the extra bits I have set i would like you to do what you can.
Thats status quo upto now.

So my overall view:
Ignoring how it used to work is a revolution not an evolution.
There will be a blood bath with existing apps before the new norm comes
into proper effect.

cheers,
jamal

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH net-next v8 2/3] net sched actions: dump more than TCA_ACT_MAX_PRIO actions per batch
  2017-04-26 12:46           ` Jamal Hadi Salim
@ 2017-04-26 13:05             ` Jiri Pirko
  2017-04-26 14:46               ` David Miller
  0 siblings, 1 reply; 33+ messages in thread
From: Jiri Pirko @ 2017-04-26 13:05 UTC (permalink / raw)
  To: Jamal Hadi Salim
  Cc: Simon Horman, davem, xiyou.wangcong, eric.dumazet, netdev

Wed, Apr 26, 2017 at 02:46:22PM CEST, jhs@mojatatu.com wrote:
>On 17-04-26 07:02 AM, Simon Horman wrote:
>> On Tue, Apr 25, 2017 at 06:04:45PM +0200, Jiri Pirko wrote:
>[..]
>
>> > So fix iproute2. It is always first kernel, then iproute2.
>> 
>> Perhaps I am missing the point or somehow misguided but I would expect that
>> if the UAPI uses BIT() it also provides BIT().
>
>There is a user of BIT() already in iproute2 (devlink). We can move
>the code to be more generally available for other iproute2 users.
>Then this UAPI change makes use of it.

Should be part of UAPI as well
I see that include/uapi/rdma/vmw_pvrdma-abi.h is using BIT macro.
I don't see BIT macro defined in UAPI (I thought it is). So either
define it there (not sure where) or just use "<<"

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH net-next v8 2/3] net sched actions: dump more than TCA_ACT_MAX_PRIO actions per batch
  2017-04-26 12:08               ` Jiri Pirko
@ 2017-04-26 13:14                 ` Jamal Hadi Salim
  2017-04-26 13:56                   ` Jiri Pirko
  0 siblings, 1 reply; 33+ messages in thread
From: Jamal Hadi Salim @ 2017-04-26 13:14 UTC (permalink / raw)
  To: Jiri Pirko
  Cc: davem, xiyou.wangcong, eric.dumazet, netdev, Simon Horman,
	Benjamin LaHaise

On 17-04-26 08:08 AM, Jiri Pirko wrote:
> Wed, Apr 26, 2017 at 01:48:29PM CEST, jhs@mojatatu.com wrote:
>> On 17-04-26 02:19 AM, Jiri Pirko wrote:
>>> Tue, Apr 25, 2017 at 10:29:40PM CEST, jhs@mojatatu.com wrote:
>>>> On 17-04-25 12:04 PM, Jiri Pirko wrote:

>> I have experience with dealing with a massive amount of various dumps
>> and (batch) sets and it always boils down to one thing:
>> _how much data is exchanged between user and kernel_
>> 3 flags encoded as bits in a u32 attribute cost 64 bits.
>> Encoded separately cost 3x that.
>>
>> Believe me, it _does make a difference_ in performance.
>>
>> My least favorite subsystem is bridge. The bridge code has
>> tons of flags in those entries that are sent to/from kernel as u8
>> attributes. It is painful.
>>
>> For something more recent, lets look at this commit from Ben on Flower:
>> +       TCA_FLOWER_KEY_MPLS_TTL,        /* u8 - 8 bits */
>> +       TCA_FLOWER_KEY_MPLS_BOS,        /* u8 - 1 bit */
>> +       TCA_FLOWER_KEY_MPLS_TC,         /* u8 - 3 bits */
>> +       TCA_FLOWER_KEY_MPLS_LABEL,      /* be32 - 20 bits */
>>
>> Yes, that looks pretty, but:
>> That would have fit in one attribute with a u32. Mask attributes would
>> be eliminated with a second 32 bit - all in the same singular
>> attribute.
>>
>> Imagine if i have 1M flower entries. If you add up the mask the cost
>> of these things is about 3*2*64 bits more per entry compared to putting
>> the mpls info/mask in one attribute.
>> At 1M entries that is a few MBs of data being exchanged.
>
> I can do the math :) Yet still, I would like to see the numbers :)
> Because I believe that is the only way to end this lenghty converstation
> once and forever...
>

Jiri, what are you arguing about if you have done the math? ;->
You want me to show you that getting or setting less data is good for
performance?
Look at the third patch: Why do i think it is necessary to send only
actions that have changed? Precisely to reduce the amount of data
being transported. The second patch - to reduce the amount of crossing
user space to kernel space (which is going to happen more with increased
data I have to transport between the user and the kernel).

Again: You are looking at this from a manageability point of view which
is useful but not the only input into a design. If i can squeeze more
data without killing usability - I am all for it. It just doesnt
compute that it is ok to use a flag per attribute because it looks
beautiful.

cheers,
jamal

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH net-next v8 2/3] net sched actions: dump more than TCA_ACT_MAX_PRIO actions per batch
  2017-04-26 13:14                 ` Jamal Hadi Salim
@ 2017-04-26 13:56                   ` Jiri Pirko
  2017-04-26 20:07                     ` Jamal Hadi Salim
  0 siblings, 1 reply; 33+ messages in thread
From: Jiri Pirko @ 2017-04-26 13:56 UTC (permalink / raw)
  To: Jamal Hadi Salim
  Cc: davem, xiyou.wangcong, eric.dumazet, netdev, Simon Horman,
	Benjamin LaHaise

Wed, Apr 26, 2017 at 03:14:38PM CEST, jhs@mojatatu.com wrote:
>On 17-04-26 08:08 AM, Jiri Pirko wrote:
>> Wed, Apr 26, 2017 at 01:48:29PM CEST, jhs@mojatatu.com wrote:
>> > On 17-04-26 02:19 AM, Jiri Pirko wrote:
>> > > Tue, Apr 25, 2017 at 10:29:40PM CEST, jhs@mojatatu.com wrote:
>> > > > On 17-04-25 12:04 PM, Jiri Pirko wrote:
>
>> > I have experience with dealing with a massive amount of various dumps
>> > and (batch) sets and it always boils down to one thing:
>> > _how much data is exchanged between user and kernel_
>> > 3 flags encoded as bits in a u32 attribute cost 64 bits.
>> > Encoded separately cost 3x that.
>> > 
>> > Believe me, it _does make a difference_ in performance.
>> > 
>> > My least favorite subsystem is bridge. The bridge code has
>> > tons of flags in those entries that are sent to/from kernel as u8
>> > attributes. It is painful.
>> > 
>> > For something more recent, lets look at this commit from Ben on Flower:
>> > +       TCA_FLOWER_KEY_MPLS_TTL,        /* u8 - 8 bits */
>> > +       TCA_FLOWER_KEY_MPLS_BOS,        /* u8 - 1 bit */
>> > +       TCA_FLOWER_KEY_MPLS_TC,         /* u8 - 3 bits */
>> > +       TCA_FLOWER_KEY_MPLS_LABEL,      /* be32 - 20 bits */
>> > 
>> > Yes, that looks pretty, but:
>> > That would have fit in one attribute with a u32. Mask attributes would
>> > be eliminated with a second 32 bit - all in the same singular
>> > attribute.
>> > 
>> > Imagine if i have 1M flower entries. If you add up the mask the cost
>> > of these things is about 3*2*64 bits more per entry compared to putting
>> > the mpls info/mask in one attribute.
>> > At 1M entries that is a few MBs of data being exchanged.
>> 
>> I can do the math :) Yet still, I would like to see the numbers :)
>> Because I believe that is the only way to end this lenghty converstation
>> once and forever...
>> 
>
>Jiri, what are you arguing about if you have done the math? ;->

I can do 3*2*64. What I cannot do is to figure out the real performance
impact.


>You want me to show you that getting or setting less data is good for
>performance?
>Look at the third patch: Why do i think it is necessary to send only
>actions that have changed? Precisely to reduce the amount of data
>being transported. The second patch - to reduce the amount of crossing
>user space to kernel space (which is going to happen more with increased
>data I have to transport between the user and the kernel).
>
>Again: You are looking at this from a manageability point of view which
>is useful but not the only input into a design. If i can squeeze more
>data without killing usability - I am all for it. It just doesnt
>compute that it is ok to use a flag per attribute because it looks
>beautiful.

Hmm. Now that I'm thinking about it, why don't we have NLA_FLAGS with
couple of helpers around it? It will be obvious what the attr is, all
kernel code would use the same helpers. Would be nice.

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH net-next v8 2/3] net sched actions: dump more than TCA_ACT_MAX_PRIO actions per batch
  2017-04-26 13:05             ` Jiri Pirko
@ 2017-04-26 14:46               ` David Miller
  2017-04-26 14:58                 ` Jiri Pirko
  2017-04-28 12:21                 ` Simon Horman
  0 siblings, 2 replies; 33+ messages in thread
From: David Miller @ 2017-04-26 14:46 UTC (permalink / raw)
  To: jiri; +Cc: jhs, simon.horman, xiyou.wangcong, eric.dumazet, netdev

From: Jiri Pirko <jiri@resnulli.us>
Date: Wed, 26 Apr 2017 15:05:06 +0200

> Wed, Apr 26, 2017 at 02:46:22PM CEST, jhs@mojatatu.com wrote:
>>On 17-04-26 07:02 AM, Simon Horman wrote:
>>> On Tue, Apr 25, 2017 at 06:04:45PM +0200, Jiri Pirko wrote:
>>[..]
>>
>>> > So fix iproute2. It is always first kernel, then iproute2.
>>> 
>>> Perhaps I am missing the point or somehow misguided but I would expect that
>>> if the UAPI uses BIT() it also provides BIT().
>>
>>There is a user of BIT() already in iproute2 (devlink). We can move
>>the code to be more generally available for other iproute2 users.
>>Then this UAPI change makes use of it.
> 
> Should be part of UAPI as well
> I see that include/uapi/rdma/vmw_pvrdma-abi.h is using BIT macro.
> I don't see BIT macro defined in UAPI (I thought it is). So either
> define it there (not sure where) or just use "<<"

"BIT" is a pretty crazy small simple name to pollute into the global
namespace, IMHO.

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH net-next v8 2/3] net sched actions: dump more than TCA_ACT_MAX_PRIO actions per batch
  2017-04-26 14:46               ` David Miller
@ 2017-04-26 14:58                 ` Jiri Pirko
  2017-04-28 12:21                 ` Simon Horman
  1 sibling, 0 replies; 33+ messages in thread
From: Jiri Pirko @ 2017-04-26 14:58 UTC (permalink / raw)
  To: David Miller; +Cc: jhs, simon.horman, xiyou.wangcong, eric.dumazet, netdev

Wed, Apr 26, 2017 at 04:46:58PM CEST, davem@davemloft.net wrote:
>From: Jiri Pirko <jiri@resnulli.us>
>Date: Wed, 26 Apr 2017 15:05:06 +0200
>
>> Wed, Apr 26, 2017 at 02:46:22PM CEST, jhs@mojatatu.com wrote:
>>>On 17-04-26 07:02 AM, Simon Horman wrote:
>>>> On Tue, Apr 25, 2017 at 06:04:45PM +0200, Jiri Pirko wrote:
>>>[..]
>>>
>>>> > So fix iproute2. It is always first kernel, then iproute2.
>>>> 
>>>> Perhaps I am missing the point or somehow misguided but I would expect that
>>>> if the UAPI uses BIT() it also provides BIT().
>>>
>>>There is a user of BIT() already in iproute2 (devlink). We can move
>>>the code to be more generally available for other iproute2 users.
>>>Then this UAPI change makes use of it.
>> 
>> Should be part of UAPI as well
>> I see that include/uapi/rdma/vmw_pvrdma-abi.h is using BIT macro.
>> I don't see BIT macro defined in UAPI (I thought it is). So either
>> define it there (not sure where) or just use "<<"
>
>"BIT" is a pretty crazy small simple name to pollute into the global
>namespace, IMHO.

Btw, this is also something resolvable nicely if we have NLA_FLAGS
netlink attribute type. We can have some helper in UAPI like:

#define TCA_FLAG_LARGE_DUMP_ON	NLA_FLAGS_F(0)

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH net-next v8 2/3] net sched actions: dump more than TCA_ACT_MAX_PRIO actions per batch
  2017-04-26 13:56                   ` Jiri Pirko
@ 2017-04-26 20:07                     ` Jamal Hadi Salim
  2017-04-27  6:30                       ` Jiri Pirko
  0 siblings, 1 reply; 33+ messages in thread
From: Jamal Hadi Salim @ 2017-04-26 20:07 UTC (permalink / raw)
  To: Jiri Pirko
  Cc: davem, xiyou.wangcong, eric.dumazet, netdev, Simon Horman,
	Benjamin LaHaise

On 17-04-26 09:56 AM, Jiri Pirko wrote:
> Wed, Apr 26, 2017 at 03:14:38PM CEST, jhs@mojatatu.com wrote:
>> On 17-04-26 08:08 AM, Jiri Pirko wrote:

[..]

>> Jiri, what are you arguing about if you have done the math? ;->
>
> I can do 3*2*64. What I cannot do is to figure out the real performance
> impact.
>

Jiri, I do a lot of very large data dumping and setting towards the
kernel. You know that. It is why I even have these patches to begin
with.

The math should be convincing enough.
48B per rule extra for just MPLS in a filter rule. I havent started
testing the overhead of flower but i do plan to use it - with about a
million rules for offloading. I will give you the numbers then.

I think we are at a stalemate.
You are not going to convince me to use an attribute with a
u8 for a bit flag when I can fit 32 of them in one attribute (with
the same cost). And I am not able to convince you that you are
wrong to put beauty first.

>> Again: You are looking at this from a manageability point of view which
>> is useful but not the only input into a design. If i can squeeze more
>> data without killing usability - I am all for it. It just doesnt
>> compute that it is ok to use a flag per attribute because it looks
>> beautiful.
>
> Hmm. Now that I'm thinking about it, why don't we have NLA_FLAGS with
> couple of helpers around it? It will be obvious what the attr is, all
> kernel code would use the same helpers. Would be nice.
>

I think to have flags at that level is useful but it
is a different hierarchy level. I am not sure the
"actions dump large messages" is a fit for that level.

cheers,
jamal

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH net-next v8 2/3] net sched actions: dump more than TCA_ACT_MAX_PRIO actions per batch
  2017-04-26 20:07                     ` Jamal Hadi Salim
@ 2017-04-27  6:30                       ` Jiri Pirko
  2017-04-28  1:22                         ` Jamal Hadi Salim
  0 siblings, 1 reply; 33+ messages in thread
From: Jiri Pirko @ 2017-04-27  6:30 UTC (permalink / raw)
  To: Jamal Hadi Salim
  Cc: davem, xiyou.wangcong, eric.dumazet, netdev, Simon Horman,
	Benjamin LaHaise

Wed, Apr 26, 2017 at 10:07:08PM CEST, jhs@mojatatu.com wrote:
>On 17-04-26 09:56 AM, Jiri Pirko wrote:
>> Wed, Apr 26, 2017 at 03:14:38PM CEST, jhs@mojatatu.com wrote:
>> > On 17-04-26 08:08 AM, Jiri Pirko wrote:
>
>[..]
>

[...]

>
>> > Again: You are looking at this from a manageability point of view which
>> > is useful but not the only input into a design. If i can squeeze more
>> > data without killing usability - I am all for it. It just doesnt
>> > compute that it is ok to use a flag per attribute because it looks
>> > beautiful.
>> 
>> Hmm. Now that I'm thinking about it, why don't we have NLA_FLAGS with
>> couple of helpers around it? It will be obvious what the attr is, all
>> kernel code would use the same helpers. Would be nice.
>> 
>
>I think to have flags at that level is useful but it
>is a different hierarchy level. I am not sure the
>"actions dump large messages" is a fit for that level.

Jamal, the idea is to have exactly what you want to have. Only does not
use NLA_U32 attr for that but a special attr NLA_FLAGS which would have
well defined semantics and set of helpers to work with and enforce it.

Then, this could be easily reused in other subsystem that uses netlink

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH net-next v8 2/3] net sched actions: dump more than TCA_ACT_MAX_PRIO actions per batch
  2017-04-27  6:30                       ` Jiri Pirko
@ 2017-04-28  1:22                         ` Jamal Hadi Salim
  2017-04-28  7:02                           ` Jiri Pirko
  0 siblings, 1 reply; 33+ messages in thread
From: Jamal Hadi Salim @ 2017-04-28  1:22 UTC (permalink / raw)
  To: Jiri Pirko
  Cc: davem, xiyou.wangcong, eric.dumazet, netdev, Simon Horman,
	Benjamin LaHaise

On 17-04-27 02:30 AM, Jiri Pirko wrote:
> Wed, Apr 26, 2017 at 10:07:08PM CEST, jhs@mojatatu.com wrote:
>> On 17-04-26 09:56 AM, Jiri Pirko wrote:
>>> Wed, Apr 26, 2017 at 03:14:38PM CEST, jhs@mojatatu.com wrote:

>> I think to have flags at that level is useful but it
>> is a different hierarchy level. I am not sure the
>> "actions dump large messages" is a fit for that level.
>
> Jamal, the idea is to have exactly what you want to have. Only does not
> use NLA_U32 attr for that but a special attr NLA_FLAGS which would have
> well defined semantics and set of helpers to work with and enforce it.
>
> Then, this could be easily reused in other subsystem that uses netlink
>

Maybe I am misunderstanding:
Recall, this is what it looks like with this patchset:
<nlh><subsytem-header>[TCA_ROOT_XXXX]

TCA_ROOT_XXX is very subsystem specific. classifiers, qdiscs and many
subsystems defined their own semantics for that TLV level. This specific
"dump max" is very very specific to actions. They were crippled by the
fact you could only send 32 at a time - this allows more to be sent.

I thought initially you meant:
<nlh>[NLA_XXX]<subsytem-header>[TCA_ROOT_XXXX]

I think at the NLA_XXX you could fit netlink wide TLVs - but if i said
"do a large dump" it is of no use to any other subsystem.

cheers,
jamal

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH net-next v8 2/3] net sched actions: dump more than TCA_ACT_MAX_PRIO actions per batch
  2017-04-28  1:22                         ` Jamal Hadi Salim
@ 2017-04-28  7:02                           ` Jiri Pirko
  2017-04-28 12:30                             ` Jamal Hadi Salim
  0 siblings, 1 reply; 33+ messages in thread
From: Jiri Pirko @ 2017-04-28  7:02 UTC (permalink / raw)
  To: Jamal Hadi Salim
  Cc: davem, xiyou.wangcong, eric.dumazet, netdev, Simon Horman,
	Benjamin LaHaise

Fri, Apr 28, 2017 at 03:22:53AM CEST, jhs@mojatatu.com wrote:
>On 17-04-27 02:30 AM, Jiri Pirko wrote:
>> Wed, Apr 26, 2017 at 10:07:08PM CEST, jhs@mojatatu.com wrote:
>> > On 17-04-26 09:56 AM, Jiri Pirko wrote:
>> > > Wed, Apr 26, 2017 at 03:14:38PM CEST, jhs@mojatatu.com wrote:
>
>> > I think to have flags at that level is useful but it
>> > is a different hierarchy level. I am not sure the
>> > "actions dump large messages" is a fit for that level.
>> 
>> Jamal, the idea is to have exactly what you want to have. Only does not
>> use NLA_U32 attr for that but a special attr NLA_FLAGS which would have
>> well defined semantics and set of helpers to work with and enforce it.
>> 
>> Then, this could be easily reused in other subsystem that uses netlink
>> 
>
>Maybe I am misunderstanding:
>Recall, this is what it looks like with this patchset:
><nlh><subsytem-header>[TCA_ROOT_XXXX]
>
>TCA_ROOT_XXX is very subsystem specific. classifiers, qdiscs and many
>subsystems defined their own semantics for that TLV level. This specific
>"dump max" is very very specific to actions. They were crippled by the
>fact you could only send 32 at a time - this allows more to be sent.
>
>I thought initially you meant:
><nlh>[NLA_XXX]<subsytem-header>[TCA_ROOT_XXXX]
>
>I think at the NLA_XXX you could fit netlink wide TLVs - but if i said
>"do a large dump" it is of no use to any other subsystem.

Okay, I'm sorry, I had couple of beers yesterday so that might be
the cause why your msg makes me totally confused :O

All I suggest is to replace NLA_U32 flags you want that does not
have any semantics with NLA_FLAGS flags, which eventually will carry
the exact same u32, but with predefined semantics, helpers, everything.

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH net-next v8 2/3] net sched actions: dump more than TCA_ACT_MAX_PRIO actions per batch
  2017-04-26 14:46               ` David Miller
  2017-04-26 14:58                 ` Jiri Pirko
@ 2017-04-28 12:21                 ` Simon Horman
  2017-04-28 12:55                   ` Jamal Hadi Salim
  1 sibling, 1 reply; 33+ messages in thread
From: Simon Horman @ 2017-04-28 12:21 UTC (permalink / raw)
  To: David Miller; +Cc: jiri, jhs, xiyou.wangcong, eric.dumazet, netdev

On Wed, Apr 26, 2017 at 10:46:58AM -0400, David Miller wrote:
> From: Jiri Pirko <jiri@resnulli.us>
> Date: Wed, 26 Apr 2017 15:05:06 +0200
> 
> > Wed, Apr 26, 2017 at 02:46:22PM CEST, jhs@mojatatu.com wrote:
> >>On 17-04-26 07:02 AM, Simon Horman wrote:
> >>> On Tue, Apr 25, 2017 at 06:04:45PM +0200, Jiri Pirko wrote:
> >>[..]
> >>
> >>> > So fix iproute2. It is always first kernel, then iproute2.
> >>> 
> >>> Perhaps I am missing the point or somehow misguided but I would expect that
> >>> if the UAPI uses BIT() it also provides BIT().
> >>
> >>There is a user of BIT() already in iproute2 (devlink). We can move
> >>the code to be more generally available for other iproute2 users.
> >>Then this UAPI change makes use of it.
> > 
> > Should be part of UAPI as well
> > I see that include/uapi/rdma/vmw_pvrdma-abi.h is using BIT macro.
> > I don't see BIT macro defined in UAPI (I thought it is). So either
> > define it there (not sure where) or just use "<<"
> 
> "BIT" is a pretty crazy small simple name to pollute into the global
> namespace, IMHO.

It sounds to me that it would be best to just use "<<" rather than
spending cycles posturing on how to add it to the UAPI. Existing users
of BIT in the UAPI could also be updated to use "<<" to avoid having
a misleading precedence in-tree.

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH net-next v8 2/3] net sched actions: dump more than TCA_ACT_MAX_PRIO actions per batch
  2017-04-28  7:02                           ` Jiri Pirko
@ 2017-04-28 12:30                             ` Jamal Hadi Salim
  2017-04-28 13:21                               ` Jiri Pirko
  0 siblings, 1 reply; 33+ messages in thread
From: Jamal Hadi Salim @ 2017-04-28 12:30 UTC (permalink / raw)
  To: Jiri Pirko
  Cc: davem, xiyou.wangcong, eric.dumazet, netdev, Simon Horman,
	Benjamin LaHaise

On 17-04-28 03:02 AM, Jiri Pirko wrote:
> Fri, Apr 28, 2017 at 03:22:53AM CEST, jhs@mojatatu.com wrote:

[..]
>> Maybe I am misunderstanding:
>> Recall, this is what it looks like with this patchset:
>> <nlh><subsytem-header>[TCA_ROOT_XXXX]
>>
>> TCA_ROOT_XXX is very subsystem specific. classifiers, qdiscs and many
>> subsystems defined their own semantics for that TLV level. This specific
>> "dump max" is very very specific to actions. They were crippled by the
>> fact you could only send 32 at a time - this allows more to be sent.
>>
>> I thought initially you meant:
>> <nlh>[NLA_XXX]<subsytem-header>[TCA_ROOT_XXXX]
>>
>> I think at the NLA_XXX you could fit netlink wide TLVs - but if i said
>> "do a large dump" it is of no use to any other subsystem.
>
> Okay, I'm sorry, I had couple of beers yesterday so that might be
> the cause why your msg makes me totally confused :O
>
> All I suggest is to replace NLA_U32 flags you want that does not
> have any semantics with NLA_FLAGS flags, which eventually will carry
> the exact same u32, but with predefined semantics, helpers, everything.
>

I didnt understand fully Jiri. Are you suggesting a new type called
NLA_FLAGS which is re-usable elsewhere?

cheers,
jamal

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH net-next v8 2/3] net sched actions: dump more than TCA_ACT_MAX_PRIO actions per batch
  2017-04-28 12:21                 ` Simon Horman
@ 2017-04-28 12:55                   ` Jamal Hadi Salim
  2017-04-28 13:21                     ` Jiri Pirko
  0 siblings, 1 reply; 33+ messages in thread
From: Jamal Hadi Salim @ 2017-04-28 12:55 UTC (permalink / raw)
  To: Simon Horman, David Miller; +Cc: jiri, xiyou.wangcong, eric.dumazet, netdev

On 17-04-28 08:21 AM, Simon Horman wrote:
> On Wed, Apr 26, 2017 at 10:46:58AM -0400, David Miller wrote:
>> From: Jiri Pirko <jiri@resnulli.us>
>> Date: Wed, 26 Apr 2017 15:05:06 +0200
>>

[..]
>>> Should be part of UAPI as well
>>> I see that include/uapi/rdma/vmw_pvrdma-abi.h is using BIT macro.
>>> I don't see BIT macro defined in UAPI (I thought it is). So either
>>> define it there (not sure where) or just use "<<"
>>
>> "BIT" is a pretty crazy small simple name to pollute into the global
>> namespace, IMHO.
>
> It sounds to me that it would be best to just use "<<" rather than
> spending cycles posturing on how to add it to the UAPI. Existing users
> of BIT in the UAPI could also be updated to use "<<" to avoid having
> a misleading precedence in-tree.
>

We need to convince Jiri ;-> He has plans to do a lot of cleanups and
I feel like I am pioneering (A lot of new things on my patchset).
Jiri, could this come in your cleanups later?

cheers,
jamal

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH net-next v8 2/3] net sched actions: dump more than TCA_ACT_MAX_PRIO actions per batch
  2017-04-28 12:30                             ` Jamal Hadi Salim
@ 2017-04-28 13:21                               ` Jiri Pirko
  2017-04-28 13:42                                 ` Jamal Hadi Salim
  2017-04-30 10:34                                 ` Jamal Hadi Salim
  0 siblings, 2 replies; 33+ messages in thread
From: Jiri Pirko @ 2017-04-28 13:21 UTC (permalink / raw)
  To: Jamal Hadi Salim
  Cc: davem, xiyou.wangcong, eric.dumazet, netdev, Simon Horman,
	Benjamin LaHaise

Fri, Apr 28, 2017 at 02:30:17PM CEST, jhs@mojatatu.com wrote:
>On 17-04-28 03:02 AM, Jiri Pirko wrote:
>> Fri, Apr 28, 2017 at 03:22:53AM CEST, jhs@mojatatu.com wrote:
>
>[..]
>> > Maybe I am misunderstanding:
>> > Recall, this is what it looks like with this patchset:
>> > <nlh><subsytem-header>[TCA_ROOT_XXXX]
>> > 
>> > TCA_ROOT_XXX is very subsystem specific. classifiers, qdiscs and many
>> > subsystems defined their own semantics for that TLV level. This specific
>> > "dump max" is very very specific to actions. They were crippled by the
>> > fact you could only send 32 at a time - this allows more to be sent.
>> > 
>> > I thought initially you meant:
>> > <nlh>[NLA_XXX]<subsytem-header>[TCA_ROOT_XXXX]
>> > 
>> > I think at the NLA_XXX you could fit netlink wide TLVs - but if i said
>> > "do a large dump" it is of no use to any other subsystem.
>> 
>> Okay, I'm sorry, I had couple of beers yesterday so that might be
>> the cause why your msg makes me totally confused :O
>> 
>> All I suggest is to replace NLA_U32 flags you want that does not
>> have any semantics with NLA_FLAGS flags, which eventually will carry
>> the exact same u32, but with predefined semantics, helpers, everything.
>> 
>
>I didnt understand fully Jiri. Are you suggesting a new type called
>NLA_FLAGS which is re-usable elsewhere?

Exactly. That's what I'm saying.

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH net-next v8 2/3] net sched actions: dump more than TCA_ACT_MAX_PRIO actions per batch
  2017-04-28 12:55                   ` Jamal Hadi Salim
@ 2017-04-28 13:21                     ` Jiri Pirko
  0 siblings, 0 replies; 33+ messages in thread
From: Jiri Pirko @ 2017-04-28 13:21 UTC (permalink / raw)
  To: Jamal Hadi Salim
  Cc: Simon Horman, David Miller, xiyou.wangcong, eric.dumazet, netdev

Fri, Apr 28, 2017 at 02:55:40PM CEST, jhs@mojatatu.com wrote:
>On 17-04-28 08:21 AM, Simon Horman wrote:
>> On Wed, Apr 26, 2017 at 10:46:58AM -0400, David Miller wrote:
>> > From: Jiri Pirko <jiri@resnulli.us>
>> > Date: Wed, 26 Apr 2017 15:05:06 +0200
>> > 
>
>[..]
>> > > Should be part of UAPI as well
>> > > I see that include/uapi/rdma/vmw_pvrdma-abi.h is using BIT macro.
>> > > I don't see BIT macro defined in UAPI (I thought it is). So either
>> > > define it there (not sure where) or just use "<<"
>> > 
>> > "BIT" is a pretty crazy small simple name to pollute into the global
>> > namespace, IMHO.
>> 
>> It sounds to me that it would be best to just use "<<" rather than
>> spending cycles posturing on how to add it to the UAPI. Existing users
>> of BIT in the UAPI could also be updated to use "<<" to avoid having
>> a misleading precedence in-tree.
>> 
>
>We need to convince Jiri ;-> He has plans to do a lot of cleanups and
>I feel like I am pioneering (A lot of new things on my patchset).
>Jiri, could this come in your cleanups later?
 
Sure.

>
>cheers,
>jamal

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH net-next v8 2/3] net sched actions: dump more than TCA_ACT_MAX_PRIO actions per batch
  2017-04-28 13:21                               ` Jiri Pirko
@ 2017-04-28 13:42                                 ` Jamal Hadi Salim
  2017-04-28 14:02                                   ` Jiri Pirko
  2017-04-30 10:34                                 ` Jamal Hadi Salim
  1 sibling, 1 reply; 33+ messages in thread
From: Jamal Hadi Salim @ 2017-04-28 13:42 UTC (permalink / raw)
  To: Jiri Pirko
  Cc: davem, xiyou.wangcong, eric.dumazet, netdev, Simon Horman,
	Benjamin LaHaise

On 17-04-28 09:21 AM, Jiri Pirko wrote:
> Fri, Apr 28, 2017 at 02:30:17PM CEST, jhs@mojatatu.com wrote:
>> On 17-04-28 03:02 AM, Jiri Pirko wrote:
>>> Fri, Apr 28, 2017 at 03:22:53AM CEST, jhs@mojatatu.com wrote:
>>
>> [..]
>>>> Maybe I am misunderstanding:
>>>> Recall, this is what it looks like with this patchset:
>>>> <nlh><subsytem-header>[TCA_ROOT_XXXX]
>>>>
>>>> TCA_ROOT_XXX is very subsystem specific. classifiers, qdiscs and many
>>>> subsystems defined their own semantics for that TLV level. This specific
>>>> "dump max" is very very specific to actions. They were crippled by the
>>>> fact you could only send 32 at a time - this allows more to be sent.

>>>
>>> All I suggest is to replace NLA_U32 flags you want that does not
>>> have any semantics with NLA_FLAGS flags, which eventually will carry
>>> the exact same u32, but with predefined semantics, helpers, everything.
>>>
>>
>> I didnt understand fully Jiri. Are you suggesting a new type called
>> NLA_FLAGS which is re-usable elsewhere?
>
> Exactly. That's what I'm saying.
>

If you want to make it general:
I see the semantics of this thing as more detailed than what I had.
It would have a u32 bitmap + u32 bitmask.

cheers,
jamal

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH net-next v8 2/3] net sched actions: dump more than TCA_ACT_MAX_PRIO actions per batch
  2017-04-28 13:42                                 ` Jamal Hadi Salim
@ 2017-04-28 14:02                                   ` Jiri Pirko
  0 siblings, 0 replies; 33+ messages in thread
From: Jiri Pirko @ 2017-04-28 14:02 UTC (permalink / raw)
  To: Jamal Hadi Salim
  Cc: davem, xiyou.wangcong, eric.dumazet, netdev, Simon Horman,
	Benjamin LaHaise

Fri, Apr 28, 2017 at 03:42:31PM CEST, jhs@mojatatu.com wrote:
>On 17-04-28 09:21 AM, Jiri Pirko wrote:
>> Fri, Apr 28, 2017 at 02:30:17PM CEST, jhs@mojatatu.com wrote:
>> > On 17-04-28 03:02 AM, Jiri Pirko wrote:
>> > > Fri, Apr 28, 2017 at 03:22:53AM CEST, jhs@mojatatu.com wrote:
>> > 
>> > [..]
>> > > > Maybe I am misunderstanding:
>> > > > Recall, this is what it looks like with this patchset:
>> > > > <nlh><subsytem-header>[TCA_ROOT_XXXX]
>> > > > 
>> > > > TCA_ROOT_XXX is very subsystem specific. classifiers, qdiscs and many
>> > > > subsystems defined their own semantics for that TLV level. This specific
>> > > > "dump max" is very very specific to actions. They were crippled by the
>> > > > fact you could only send 32 at a time - this allows more to be sent.
>
>> > > 
>> > > All I suggest is to replace NLA_U32 flags you want that does not
>> > > have any semantics with NLA_FLAGS flags, which eventually will carry
>> > > the exact same u32, but with predefined semantics, helpers, everything.
>> > > 
>> > 
>> > I didnt understand fully Jiri. Are you suggesting a new type called
>> > NLA_FLAGS which is re-usable elsewhere?
>> 
>> Exactly. That's what I'm saying.
>> 
>
>If you want to make it general:
>I see the semantics of this thing as more detailed than what I had.
>It would have a u32 bitmap + u32 bitmask.

Sure, lets make this nice.

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH net-next v8 2/3] net sched actions: dump more than TCA_ACT_MAX_PRIO actions per batch
  2017-04-28 13:21                               ` Jiri Pirko
  2017-04-28 13:42                                 ` Jamal Hadi Salim
@ 2017-04-30 10:34                                 ` Jamal Hadi Salim
  1 sibling, 0 replies; 33+ messages in thread
From: Jamal Hadi Salim @ 2017-04-30 10:34 UTC (permalink / raw)
  To: Jiri Pirko
  Cc: davem, xiyou.wangcong, eric.dumazet, netdev, Simon Horman,
	Benjamin LaHaise

On 17-04-28 09:21 AM, Jiri Pirko wrote:
> Fri, Apr 28, 2017 at 02:30:17PM CEST, jhs@mojatatu.com wrote:
[..]
>> I didnt understand fully Jiri. Are you suggesting a new type called
>> NLA_FLAGS which is re-usable elsewhere?
>
> Exactly. That's what I'm saying.
>

Okay, I will post something.

cheers,
jamal

^ permalink raw reply	[flat|nested] 33+ messages in thread

end of thread, other threads:[~2017-04-30 10:34 UTC | newest]

Thread overview: 33+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-04-25 11:54 [PATCH net-next v8 0/3] net sched actions: improve dump performance Jamal Hadi Salim
2017-04-25 11:54 ` [PATCH net-next v8 1/3] net sched actions: Use proper root attribute table for actions Jamal Hadi Salim
2017-04-25 18:42   ` Simon Horman
2017-04-25 11:54 ` [PATCH net-next v8 2/3] net sched actions: dump more than TCA_ACT_MAX_PRIO actions per batch Jamal Hadi Salim
2017-04-25 12:13   ` Jiri Pirko
2017-04-25 13:01     ` Jamal Hadi Salim
2017-04-25 16:04       ` Jiri Pirko
2017-04-25 20:29         ` Jamal Hadi Salim
2017-04-26  6:19           ` Jiri Pirko
2017-04-26 11:07             ` Simon Horman
2017-04-26 13:00               ` Jamal Hadi Salim
2017-04-26 11:48             ` Jamal Hadi Salim
2017-04-26 12:08               ` Jiri Pirko
2017-04-26 13:14                 ` Jamal Hadi Salim
2017-04-26 13:56                   ` Jiri Pirko
2017-04-26 20:07                     ` Jamal Hadi Salim
2017-04-27  6:30                       ` Jiri Pirko
2017-04-28  1:22                         ` Jamal Hadi Salim
2017-04-28  7:02                           ` Jiri Pirko
2017-04-28 12:30                             ` Jamal Hadi Salim
2017-04-28 13:21                               ` Jiri Pirko
2017-04-28 13:42                                 ` Jamal Hadi Salim
2017-04-28 14:02                                   ` Jiri Pirko
2017-04-30 10:34                                 ` Jamal Hadi Salim
2017-04-26 11:02         ` Simon Horman
2017-04-26 12:46           ` Jamal Hadi Salim
2017-04-26 13:05             ` Jiri Pirko
2017-04-26 14:46               ` David Miller
2017-04-26 14:58                 ` Jiri Pirko
2017-04-28 12:21                 ` Simon Horman
2017-04-28 12:55                   ` Jamal Hadi Salim
2017-04-28 13:21                     ` Jiri Pirko
2017-04-25 11:54 ` [PATCH net-next v8 3/3] net sched actions: add time filter for action dumping Jamal Hadi Salim

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.