All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH net-next v11 0/4]  net sched actions: improve dump performance
@ 2017-07-24  1:35 Jamal Hadi Salim
  2017-07-24  1:35 ` [PATCH net-next v11 1/4] net netlink: Add new type NLA_BITFIELD_32 Jamal Hadi Salim
                   ` (3 more replies)
  0 siblings, 4 replies; 34+ messages in thread
From: Jamal Hadi Salim @ 2017-07-24  1:35 UTC (permalink / raw)
  To: davem
  Cc: netdev, jiri, xiyou.wangcong, dsahern, eric.dumazet, mrv,
	simon.horman, alex.aring, Jamal Hadi Salim

From: Jamal Hadi Salim <jhs@mojatatu.com>


Changes since v10:
-----------------
1) Jiri: move type->validate_content() to its own patch
Jamal: decided to remove it altogether so we can get this patch set in.

2) Change name of NLA_FLAG_BITS to NLA_BITFIELD_32 based on discussions
with D. Ahern and Jiri. D. Ahern suggests to make this a variable bitmap size.
My analysis at this point is it too complex and i only need a few bit
flags. If we run out of bits someone else can create a new NLA_BITFIELD_XXX
and start using that. So please let this go.

3) Jamal - Add Suggested-by: Jiri for type NLA_BITFIELD_32

4) Jiri: Change name allowed_flags to tcaa_root_flags_allowed

5) Jiri: Introduce nla_get_flag_bits_values() helper instead of using
memcpy for retrieving nla_bitfield_32 fields.

Changes since v9:
-----------------

1) General consensus:
- remove again the use of BIT() to maintain uapi consistency ;->

1) Jiri:
- Add a new netlink type NLA_FLAG_BITS to check for valid bits 
  and use it instead of inline vetting (patch 4/4 now)
  
Changes since v8:
-----------------

1) Jiri:
- Add back the use of BIT(). Eventually fix iproute2 instead
- Rename VALID_TCA_FLAGS to VALID_TCA_ROOT_FLAGS

Changes since v7:
-----------------

Jamal:
No changes.
Patch 1 went out twice. Resend without two copies of patch 1

changes since v6:
-----------------

1) DaveM:
New rules for netlink messages. From now on we are going to start
checking for bits that are not used and rejecting anything we dont
understand. In the future this is going to require major changes
to user space code (tc etc). This is just a start.

To quote, David:
"
 Again, bits you aren't using now, make sure userspace doesn't
   set them.  And if it does, reject.
"
Added checks for ensuring things work as above.

2) Jiri:
a)Fix the commit message to properly use "Fixes" description
b)Align assignments for nla_policy

Changes since v5:
----------------

0)
Remove use of BIT() because it is kernel specific. Requires a separate
patch (Jiri can submit that in his cleanups)

1)To paraphrase Eric D.

"memcpy(nla_data(count_attr), &cb->args[1], sizeof(u32));
wont work on 64bit BE machines because cb->args[1]
(which is 64 bit is larger in size than sizeof(u32))"

Fixed

2) Jiri Pirko

i) Spotted a bug fix mixed in the patch for wrong TLV
fix. Add patch 1/3 to address this. Make part of this
series because of dependencies.

ii) Rename ACT_LARGE_DUMP_ON -> TCA_FLAG_LARGE_DUMP_ON

iii) Satisfy Jiri's obsession against the noun "tcaa"
a)Rename struct nlattr *tcaa --> struct nlattr *tb
b)Rename TCAA_ACT_XXX -> TCA_ROOT_XXX

Changes since v4:
-----------------

1) Eric D.

pointed out that when all skb space is used up by the dump
there will be no space to insert the TCAA_ACT_COUNT attribute.

2) Jiri:

i) Change:

enum {
        TCAA_UNSPEC,
        TCAA_ACT_TAB,
        TCAA_ACT_FLAGS,
        TCAA_ACT_COUNT,
        TCAA_ACT_TIME_FILTER,
        __TCAA_MAX
};

#define TCAA_MAX (__TCAA_MAX - 1)
#define ACT_LARGE_DUMP_ON               (1 << 0)

to:
enum {
       TCAA_UNSPEC,
       TCAA_ACT_TAB,
#define TCA_ACT_TAB TCAA_ACT_TAB
       TCAA_ACT_FLAGS,
       TCAA_ACT_COUNT,
       __TCAA_MAX,
#define        TCAA_MAX (__TCAA_MAX - 1)
};

#define ACT_LARGE_DUMP_ON              BIT(0)

Jiri plans to followup with the rest of the code to make the
style consistent.

ii) Rename attribute TCAA_ACT_TIME_FILTER --> TCAA_ACT_TIME_DELTA

iii) Rename variable jiffy_filter --> jiffy_since
iv) Rename msecs_filter --> msecs_since
v) get rid of unused cb->args[0] and rename cb->args[4] to cb->args[0]

Earlier Changes
----------------
- Jiri mostly on names of things.

Jamal Hadi Salim (4):
  net netlink: Add new type NLA_BITFIELD_32
  net sched actions: Use proper root attribute table for actions
  net sched actions: dump more than TCA_ACT_MAX_PRIO actions per batch
  net sched actions: add time filter for action dumping

 include/net/netlink.h          | 16 ++++++++++
 include/uapi/linux/netlink.h   | 17 +++++++++++
 include/uapi/linux/rtnetlink.h | 23 ++++++++++++--
 lib/nlattr.c                   | 21 +++++++++++++
 net/sched/act_api.c            | 68 +++++++++++++++++++++++++++++++++++-------
 5 files changed, 132 insertions(+), 13 deletions(-)

-- 
1.9.1

^ permalink raw reply	[flat|nested] 34+ messages in thread

* [PATCH net-next v11 1/4] net netlink: Add new type NLA_BITFIELD_32
  2017-07-24  1:35 [PATCH net-next v11 0/4] net sched actions: improve dump performance Jamal Hadi Salim
@ 2017-07-24  1:35 ` Jamal Hadi Salim
  2017-07-24 11:14   ` Jiri Pirko
                     ` (2 more replies)
  2017-07-24  1:35 ` [PATCH net-next v11 2/4] net sched actions: Use proper root attribute table for actions Jamal Hadi Salim
                   ` (2 subsequent siblings)
  3 siblings, 3 replies; 34+ messages in thread
From: Jamal Hadi Salim @ 2017-07-24  1:35 UTC (permalink / raw)
  To: davem
  Cc: netdev, jiri, xiyou.wangcong, dsahern, eric.dumazet, mrv,
	simon.horman, alex.aring, Jamal Hadi Salim

From: Jamal Hadi Salim <jhs@mojatatu.com>

Generic bitflags attribute content sent to the kernel by user.
With this type the user can either set or unset a flag in the
kernel.

The nla_value is a bitmap that defines the values being set
The nla_selector is a bitmask that defines which value is legit.

A check is made to ensure the rules that a kernel subsystem always
conforms to bitflags the kernel already knows about. i.e
if the user tries to set a bit flag that is not understood then
the _it will be rejected_.

In the most basic form, the user specifies the attribute policy as:
[ATTR_GOO] = { .type = NLA_BITFIELD_32, .validation_data = &myvalidflags },

where myvalidflags is the bit mask of the flags the kernel understands.

If the user _does not_ provide myvalidflags then the attribute will
also be rejected.

Examples:
nla_value = 0x0, and nla_selector = 0x1
implies we are selecting bit 1 and we want to set its value to 0.

nla_value = 0x2, and nla_selector = 0x2
implies we are selecting bit 2 and we want to set its value to 1.

Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com>
---
 include/net/netlink.h        |  4 ++++
 include/uapi/linux/netlink.h | 17 +++++++++++++++++
 lib/nlattr.c                 | 21 +++++++++++++++++++++
 3 files changed, 42 insertions(+)

diff --git a/include/net/netlink.h b/include/net/netlink.h
index ef8e6c3..e33d1fb 100644
--- a/include/net/netlink.h
+++ b/include/net/netlink.h
@@ -178,6 +178,7 @@ enum {
 	NLA_S16,
 	NLA_S32,
 	NLA_S64,
+	NLA_BITFIELD_32,
 	__NLA_TYPE_MAX,
 };
 
@@ -206,6 +207,7 @@ enum {
  *    NLA_MSECS            Leaving the length field zero will verify the
  *                         given type fits, using it verifies minimum length
  *                         just like "All other"
+ *    NLA_BITFIELD_32      A 32-bit bitmap/bitselector attribute
  *    All other            Minimum length of attribute payload
  *
  * Example:
@@ -213,11 +215,13 @@ enum {
  * 	[ATTR_FOO] = { .type = NLA_U16 },
  *	[ATTR_BAR] = { .type = NLA_STRING, .len = BARSIZ },
  *	[ATTR_BAZ] = { .len = sizeof(struct mystruct) },
+ *	[ATTR_GOO] = { .type = NLA_BITFIELD_32, .validation_data = &myvalidflags },
  * };
  */
 struct nla_policy {
 	u16		type;
 	u16		len;
+	void            *validation_data;
 };
 
 /**
diff --git a/include/uapi/linux/netlink.h b/include/uapi/linux/netlink.h
index f86127a..0ac05a6 100644
--- a/include/uapi/linux/netlink.h
+++ b/include/uapi/linux/netlink.h
@@ -226,5 +226,22 @@ struct nlattr {
 #define NLA_ALIGN(len)		(((len) + NLA_ALIGNTO - 1) & ~(NLA_ALIGNTO - 1))
 #define NLA_HDRLEN		((int) NLA_ALIGN(sizeof(struct nlattr)))
 
+/* Generic 32 bitflags attribute content sent to the kernel.
+ *
+ * The nla_value is a bitmap that defines the values being set
+ * The nla_selector is a bitmask that defines which value is legit
+ *
+ * Examples:
+ *  nla_value = 0x0, and nla_selector = 0x1
+ *  implies we are selecting bit 1 and we want to set its value to 0.
+ *
+ *  nla_value = 0x2, and nla_selector = 0x2
+ *  implies we are selecting bit 2 and we want to set its value to 1.
+ *
+ */
+struct nla_bitfield_32 {
+	__u32 nla_value;
+	__u32 nla_selector;
+};
 
 #endif /* _UAPI__LINUX_NETLINK_H */
diff --git a/lib/nlattr.c b/lib/nlattr.c
index fb52435..c8aad7e 100644
--- a/lib/nlattr.c
+++ b/lib/nlattr.c
@@ -27,6 +27,20 @@
 	[NLA_S64]	= sizeof(s64),
 };
 
+static int validate_nla_bitfield_32(const struct nlattr *nla, void *valid_data)
+{
+	const struct nla_bitfield_32 *nbf = nla_data(nla);
+	u32 *valid_flags_mask = valid_data;
+
+	if (!valid_data)
+		return -EINVAL;
+
+	if (nbf->nla_value & ~*valid_flags_mask)
+		return -EINVAL;
+
+	return 0;
+}
+
 static int validate_nla(const struct nlattr *nla, int maxtype,
 			const struct nla_policy *policy)
 {
@@ -46,6 +60,13 @@ static int validate_nla(const struct nlattr *nla, int maxtype,
 			return -ERANGE;
 		break;
 
+	case NLA_BITFIELD_32:
+		if (attrlen != sizeof(struct nla_bitfield_32))
+			return -ERANGE;
+
+		return validate_nla_bitfield_32(nla, pt->validation_data);
+		break;
+
 	case NLA_NUL_STRING:
 		if (pt->len)
 			minlen = min_t(int, attrlen, pt->len + 1);
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 34+ messages in thread

* [PATCH net-next v11 2/4] net sched actions: Use proper root attribute table for actions
  2017-07-24  1:35 [PATCH net-next v11 0/4] net sched actions: improve dump performance Jamal Hadi Salim
  2017-07-24  1:35 ` [PATCH net-next v11 1/4] net netlink: Add new type NLA_BITFIELD_32 Jamal Hadi Salim
@ 2017-07-24  1:35 ` Jamal Hadi Salim
  2017-07-24  1:35 ` [PATCH net-next v11 3/4] net sched actions: dump more than TCA_ACT_MAX_PRIO actions per batch Jamal Hadi Salim
  2017-07-24  1:35 ` [PATCH net-next v11 4/4] net sched actions: add time filter for action dumping Jamal Hadi Salim
  3 siblings, 0 replies; 34+ messages in thread
From: Jamal Hadi Salim @ 2017-07-24  1:35 UTC (permalink / raw)
  To: davem
  Cc: netdev, jiri, xiyou.wangcong, dsahern, eric.dumazet, mrv,
	simon.horman, alex.aring, Jamal Hadi Salim

From: Jamal Hadi Salim <jhs@mojatatu.com>

Bug fix for an issue which has been around for about a decade.
We got away with it because the enumeration was larger than needed.

Fixes: 7ba699c604ab ("[NET_SCHED]: Convert actions from rtnetlink to new netlink API")
Suggested-by: Jiri Pirko <jiri@mellanox.com>
Reviewed-by: Simon Horman <simon.horman@netronome.com>
Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com>
---
 net/sched/act_api.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/net/sched/act_api.c b/net/sched/act_api.c
index f2e9ed3..848370e 100644
--- a/net/sched/act_api.c
+++ b/net/sched/act_api.c
@@ -1072,7 +1072,7 @@ static int tc_ctl_action(struct sk_buff *skb, struct nlmsghdr *n,
 			 struct netlink_ext_ack *extack)
 {
 	struct net *net = sock_net(skb->sk);
-	struct nlattr *tca[TCA_ACT_MAX + 1];
+	struct nlattr *tca[TCAA_MAX + 1];
 	u32 portid = skb ? NETLINK_CB(skb).portid : 0;
 	int ret = 0, ovr = 0;
 
@@ -1080,7 +1080,7 @@ static int tc_ctl_action(struct sk_buff *skb, struct nlmsghdr *n,
 	    !netlink_capable(skb, CAP_NET_ADMIN))
 		return -EPERM;
 
-	ret = nlmsg_parse(n, sizeof(struct tcamsg), tca, TCA_ACT_MAX, NULL,
+	ret = nlmsg_parse(n, sizeof(struct tcamsg), tca, TCAA_MAX, NULL,
 			  extack);
 	if (ret < 0)
 		return ret;
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 34+ messages in thread

* [PATCH net-next v11 3/4] net sched actions: dump more than TCA_ACT_MAX_PRIO actions per batch
  2017-07-24  1:35 [PATCH net-next v11 0/4] net sched actions: improve dump performance Jamal Hadi Salim
  2017-07-24  1:35 ` [PATCH net-next v11 1/4] net netlink: Add new type NLA_BITFIELD_32 Jamal Hadi Salim
  2017-07-24  1:35 ` [PATCH net-next v11 2/4] net sched actions: Use proper root attribute table for actions Jamal Hadi Salim
@ 2017-07-24  1:35 ` Jamal Hadi Salim
  2017-07-24 11:27   ` Jiri Pirko
  2017-07-24  1:35 ` [PATCH net-next v11 4/4] net sched actions: add time filter for action dumping Jamal Hadi Salim
  3 siblings, 1 reply; 34+ messages in thread
From: Jamal Hadi Salim @ 2017-07-24  1:35 UTC (permalink / raw)
  To: davem
  Cc: netdev, jiri, xiyou.wangcong, dsahern, eric.dumazet, mrv,
	simon.horman, alex.aring, Jamal Hadi Salim

From: Jamal Hadi Salim <jhs@mojatatu.com>

When you dump hundreds of thousands of actions, getting only 32 per
dump batch even when the socket buffer and memory allocations allow
is inefficient.

With this change, the user will get as many as possibly fitting
within the given constraints available to the kernel.

The top level action TLV space is extended. An attribute
TCA_ROOT_FLAGS is used to carry flags; flag TCA_FLAG_LARGE_DUMP_ON
is set by the user indicating the user is capable of processing
these large dumps. Older user space which doesnt set this flag
doesnt get the large (than 32) batches.
The kernel uses the TCA_ROOT_COUNT attribute to tell the user how many
actions are put in a single batch. As such user space app knows how long
to iterate (independent of the type of action being dumped)
instead of hardcoded maximum of 32 thus maintaining backward compat.

Some results dumping 1.5M actions below:
first an unpatched tc which doesnt understand these features...

prompt$ time -p tc actions ls action gact | grep index | wc -l
1500000
real 1388.43
user 2.07
sys 1386.79

Now lets see a patched tc which sets the correct flags when requesting
a dump:

prompt$ time -p updatedtc actions ls action gact | grep index | wc -l
1500000
real 178.13
user 2.02
sys 176.96

That is about 8x performance improvement for tc app which sets its
receive buffer to about 32K.

Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com>
---
 include/net/netlink.h          | 12 +++++++++++
 include/uapi/linux/rtnetlink.h | 22 +++++++++++++++++--
 net/sched/act_api.c            | 48 +++++++++++++++++++++++++++++++++---------
 3 files changed, 70 insertions(+), 12 deletions(-)

diff --git a/include/net/netlink.h b/include/net/netlink.h
index e33d1fb..87c0b15 100644
--- a/include/net/netlink.h
+++ b/include/net/netlink.h
@@ -1207,6 +1207,18 @@ static inline struct in6_addr nla_get_in6_addr(const struct nlattr *nla)
 }
 
 /**
+ * nla_get_bitfield_32 - return payload of 32 bitfield attribute
+ * @nla: nla_bitfield_32 attribute
+ */
+static inline struct nla_bitfield_32 nla_get_bitfield_32(const struct nlattr *nla)
+{
+	struct nla_bitfield_32 tmp;
+
+	nla_memcpy(&tmp, nla, sizeof(tmp));
+	return tmp;
+}
+
+/**
  * nla_memdup - duplicate attribute memory (kmemdup)
  * @src: netlink attribute to duplicate from
  * @gfp: GFP mask
diff --git a/include/uapi/linux/rtnetlink.h b/include/uapi/linux/rtnetlink.h
index d148505..bfa80a6 100644
--- a/include/uapi/linux/rtnetlink.h
+++ b/include/uapi/linux/rtnetlink.h
@@ -683,10 +683,28 @@ struct tcamsg {
 	unsigned char	tca__pad1;
 	unsigned short	tca__pad2;
 };
+
+enum {
+	TCA_ROOT_UNSPEC,
+	TCA_ROOT_TAB,
+#define TCA_ACT_TAB TCA_ROOT_TAB
+#define TCAA_MAX TCA_ROOT_TAB
+	TCA_ROOT_FLAGS,
+	TCA_ROOT_COUNT,
+	__TCA_ROOT_MAX,
+#define	TCA_ROOT_MAX (__TCA_ROOT_MAX - 1)
+};
+
 #define TA_RTA(r)  ((struct rtattr*)(((char*)(r)) + NLMSG_ALIGN(sizeof(struct tcamsg))))
 #define TA_PAYLOAD(n) NLMSG_PAYLOAD(n,sizeof(struct tcamsg))
-#define TCA_ACT_TAB 1 /* attr type must be >=1 */	
-#define TCAA_MAX 1
+/* tcamsg flags stored in attribute TCA_ROOT_FLAGS
+ *
+ * TCA_FLAG_LARGE_DUMP_ON user->kernel to request for larger than TCA_ACT_MAX_PRIO
+ * actions in a dump. All dump responses will contain the number of actions
+ * being dumped stored in for user app's consumption in TCA_ROOT_COUNT
+ *
+ */
+#define TCA_FLAG_LARGE_DUMP_ON		(1 << 0)
 
 /* New extended info filters for IFLA_EXT_MASK */
 #define RTEXT_FILTER_VF		(1 << 0)
diff --git a/net/sched/act_api.c b/net/sched/act_api.c
index 848370e..15d6c46 100644
--- a/net/sched/act_api.c
+++ b/net/sched/act_api.c
@@ -110,6 +110,7 @@ static int tcf_dump_walker(struct tcf_hashinfo *hinfo, struct sk_buff *skb,
 			   struct netlink_callback *cb)
 {
 	int err = 0, index = -1, i = 0, s_i = 0, n_i = 0;
+	u32 act_flags = cb->args[2];
 	struct nlattr *nest;
 
 	spin_lock_bh(&hinfo->lock);
@@ -138,14 +139,18 @@ static int tcf_dump_walker(struct tcf_hashinfo *hinfo, struct sk_buff *skb,
 			}
 			nla_nest_end(skb, nest);
 			n_i++;
-			if (n_i >= TCA_ACT_MAX_PRIO)
+			if (!(act_flags & TCA_FLAG_LARGE_DUMP_ON) &&
+			    n_i >= TCA_ACT_MAX_PRIO)
 				goto done;
 		}
 	}
 done:
 	spin_unlock_bh(&hinfo->lock);
-	if (n_i)
+	if (n_i) {
 		cb->args[0] += n_i;
+		if (act_flags & TCA_FLAG_LARGE_DUMP_ON)
+			cb->args[1] = n_i;
+	}
 	return n_i;
 
 nla_put_failure:
@@ -1068,11 +1073,17 @@ static int tcf_action_add(struct net *net, struct nlattr *nla,
 	return tcf_add_notify(net, n, &actions, portid);
 }
 
+static u32 tcaa_root_flags_allowed = TCA_FLAG_LARGE_DUMP_ON;
+static const struct nla_policy tcaa_policy[TCA_ROOT_MAX + 1] = {
+	[TCA_ROOT_FLAGS] = { .type = NLA_BITFIELD_32,
+			     .validation_data = &tcaa_root_flags_allowed },
+};
+
 static int tc_ctl_action(struct sk_buff *skb, struct nlmsghdr *n,
 			 struct netlink_ext_ack *extack)
 {
 	struct net *net = sock_net(skb->sk);
-	struct nlattr *tca[TCAA_MAX + 1];
+	struct nlattr *tca[TCA_ROOT_MAX + 1];
 	u32 portid = skb ? NETLINK_CB(skb).portid : 0;
 	int ret = 0, ovr = 0;
 
@@ -1080,7 +1091,7 @@ static int tc_ctl_action(struct sk_buff *skb, struct nlmsghdr *n,
 	    !netlink_capable(skb, CAP_NET_ADMIN))
 		return -EPERM;
 
-	ret = nlmsg_parse(n, sizeof(struct tcamsg), tca, TCAA_MAX, NULL,
+	ret = nlmsg_parse(n, sizeof(struct tcamsg), tca, TCA_ROOT_MAX, NULL,
 			  extack);
 	if (ret < 0)
 		return ret;
@@ -1121,16 +1132,12 @@ static int tc_ctl_action(struct sk_buff *skb, struct nlmsghdr *n,
 	return ret;
 }
 
-static struct nlattr *find_dump_kind(const struct nlmsghdr *n)
+static struct nlattr *find_dump_kind(struct nlattr **nla)
 {
 	struct nlattr *tb1, *tb2[TCA_ACT_MAX + 1];
 	struct nlattr *tb[TCA_ACT_MAX_PRIO + 1];
-	struct nlattr *nla[TCAA_MAX + 1];
 	struct nlattr *kind;
 
-	if (nlmsg_parse(n, sizeof(struct tcamsg), nla, TCAA_MAX,
-			NULL, NULL) < 0)
-		return NULL;
 	tb1 = nla[TCA_ACT_TAB];
 	if (tb1 == NULL)
 		return NULL;
@@ -1157,8 +1164,18 @@ static int tc_dump_action(struct sk_buff *skb, struct netlink_callback *cb)
 	struct tc_action_ops *a_o;
 	int ret = 0;
 	struct tcamsg *t = (struct tcamsg *) nlmsg_data(cb->nlh);
-	struct nlattr *kind = find_dump_kind(cb->nlh);
+	struct nla_bitfield_32 fb;
+	struct nlattr *count_attr = NULL;
+	struct nlattr *tb[TCA_ROOT_MAX + 1];
+	struct nlattr *kind = NULL;
+	u32 act_count = 0;
+
+	ret = nlmsg_parse(cb->nlh, sizeof(struct tcamsg), tb, TCA_ROOT_MAX,
+			  tcaa_policy, NULL);
+	if (ret < 0)
+		return ret;
 
+	kind = find_dump_kind(tb);
 	if (kind == NULL) {
 		pr_info("tc_dump_action: action bad kind\n");
 		return 0;
@@ -1168,14 +1185,22 @@ static int tc_dump_action(struct sk_buff *skb, struct netlink_callback *cb)
 	if (a_o == NULL)
 		return 0;
 
+	if (tb[TCA_ROOT_FLAGS])
+		fb = nla_get_bitfield_32(tb[TCA_ROOT_FLAGS]);
+
 	nlh = nlmsg_put(skb, NETLINK_CB(cb->skb).portid, cb->nlh->nlmsg_seq,
 			cb->nlh->nlmsg_type, sizeof(*t), 0);
 	if (!nlh)
 		goto out_module_put;
+
+	cb->args[2] = fb.nla_value;
 	t = nlmsg_data(nlh);
 	t->tca_family = AF_UNSPEC;
 	t->tca__pad1 = 0;
 	t->tca__pad2 = 0;
+	count_attr = nla_reserve(skb, TCA_ROOT_COUNT, sizeof(u32));
+	if (!count_attr)
+		goto out_module_put;
 
 	nest = nla_nest_start(skb, TCA_ACT_TAB);
 	if (nest == NULL)
@@ -1188,6 +1213,9 @@ static int tc_dump_action(struct sk_buff *skb, struct netlink_callback *cb)
 	if (ret > 0) {
 		nla_nest_end(skb, nest);
 		ret = skb->len;
+		act_count = cb->args[1];
+		memcpy(nla_data(count_attr), &act_count, sizeof(u32));
+		cb->args[1] = 0;
 	} else
 		nlmsg_trim(skb, b);
 
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 34+ messages in thread

* [PATCH net-next v11 4/4] net sched actions: add time filter for action dumping
  2017-07-24  1:35 [PATCH net-next v11 0/4] net sched actions: improve dump performance Jamal Hadi Salim
                   ` (2 preceding siblings ...)
  2017-07-24  1:35 ` [PATCH net-next v11 3/4] net sched actions: dump more than TCA_ACT_MAX_PRIO actions per batch Jamal Hadi Salim
@ 2017-07-24  1:35 ` Jamal Hadi Salim
  2017-07-24 11:34   ` Jiri Pirko
  3 siblings, 1 reply; 34+ messages in thread
From: Jamal Hadi Salim @ 2017-07-24  1:35 UTC (permalink / raw)
  To: davem
  Cc: netdev, jiri, xiyou.wangcong, dsahern, eric.dumazet, mrv,
	simon.horman, alex.aring, Jamal Hadi Salim

From: Jamal Hadi Salim <jhs@mojatatu.com>

This patch adds support for filtering based on time since last used.
When we are dumping a large number of actions it is useful to
have the option of filtering based on when the action was last
used to reduce the amount of data crossing to user space.

With this patch the user space app sets the TCA_ROOT_TIME_DELTA
attribute with the value in milliseconds with "time of interest
since now".  The kernel converts this to jiffies and does the
filtering comparison matching entries that have seen activity
since then and returns them to user space.
Old kernels and old tc continue to work in legacy mode since
they dont specify this attribute.

Some example (we have 400 actions bound to 400 filters); at
installation time using hacked tc setting the time of
interest to 120 seconds earlier:
prompt$ hackedtc actions ls action gact since 120000| grep index | wc -l
400

go get some coffee and wait for > 120 seconds and try again:

prompt$ hackedtc actions ls action gact since 120000 | grep index | wc -l
0

Lets see a filter bound to one of these actions:
..
filter pref 10 u32
filter pref 10 u32 fh 800: ht divisor 1
filter pref 10 u32 fh 800::800 order 2048 key ht 800 bkt 0 flowid 1:10  (rule hit 2 success 1)
  match 7f000002/ffffffff at 12 (success 1 )
    action order 1: gact action pass
     random type none pass val 0
     index 23 ref 2 bind 1 installed 1145 sec used 802 sec
    Action statistics:
    Sent 84 bytes 1 pkt (dropped 0, overlimits 0 requeues 0)
    backlog 0b 0p requeues 0
....

that coffee took long, no? It was good.

Now lets ping -c 1 127.0.0.2, then run the actions again:
prompt$ hackedtc actions ls action gact since 120 | grep index | wc -l
1

More details please:

prompt$ hackedtc -s actions ls action gact since 120000

    action order 0: gact action pass
     random type none pass val 0
     index 23 ref 2 bind 1 installed 1270 sec used 30 sec
    Action statistics:
    Sent 168 bytes 2 pkt (dropped 0, overlimits 0 requeues 0)
    backlog 0b 0p requeues 0

And the filter?

filter pref 10 u32
filter pref 10 u32 fh 800: ht divisor 1
filter pref 10 u32 fh 800::800 order 2048 key ht 800 bkt 0 flowid 1:10  (rule hit 4 success 2)
  match 7f000002/ffffffff at 12 (success 2 )
    action order 1: gact action pass
     random type none pass val 0
     index 23 ref 2 bind 1 installed 1324 sec used 84 sec
    Action statistics:
    Sent 168 bytes 2 pkt (dropped 0, overlimits 0 requeues 0)
    backlog 0b 0p requeues 0

Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com>
---
 include/uapi/linux/rtnetlink.h |  1 +
 net/sched/act_api.c            | 20 +++++++++++++++++++-
 2 files changed, 20 insertions(+), 1 deletion(-)

diff --git a/include/uapi/linux/rtnetlink.h b/include/uapi/linux/rtnetlink.h
index bfa80a6..dab7dad 100644
--- a/include/uapi/linux/rtnetlink.h
+++ b/include/uapi/linux/rtnetlink.h
@@ -691,6 +691,7 @@ enum {
 #define TCAA_MAX TCA_ROOT_TAB
 	TCA_ROOT_FLAGS,
 	TCA_ROOT_COUNT,
+	TCA_ROOT_TIME_DELTA, /* in msecs */
 	__TCA_ROOT_MAX,
 #define	TCA_ROOT_MAX (__TCA_ROOT_MAX - 1)
 };
diff --git a/net/sched/act_api.c b/net/sched/act_api.c
index 15d6c46..9ffad9c 100644
--- a/net/sched/act_api.c
+++ b/net/sched/act_api.c
@@ -111,6 +111,7 @@ static int tcf_dump_walker(struct tcf_hashinfo *hinfo, struct sk_buff *skb,
 {
 	int err = 0, index = -1, i = 0, s_i = 0, n_i = 0;
 	u32 act_flags = cb->args[2];
+	unsigned long jiffy_since = cb->args[3];
 	struct nlattr *nest;
 
 	spin_lock_bh(&hinfo->lock);
@@ -128,6 +129,11 @@ static int tcf_dump_walker(struct tcf_hashinfo *hinfo, struct sk_buff *skb,
 			if (index < s_i)
 				continue;
 
+			if (jiffy_since &&
+			    time_after(jiffy_since,
+				       (unsigned long)p->tcfa_tm.lastuse))
+				continue;
+
 			nest = nla_nest_start(skb, n_i);
 			if (nest == NULL)
 				goto nla_put_failure;
@@ -145,9 +151,11 @@ static int tcf_dump_walker(struct tcf_hashinfo *hinfo, struct sk_buff *skb,
 		}
 	}
 done:
+	if (index >= 0)
+		cb->args[0] = index + 1;
+
 	spin_unlock_bh(&hinfo->lock);
 	if (n_i) {
-		cb->args[0] += n_i;
 		if (act_flags & TCA_FLAG_LARGE_DUMP_ON)
 			cb->args[1] = n_i;
 	}
@@ -1077,6 +1085,7 @@ static int tcf_action_add(struct net *net, struct nlattr *nla,
 static const struct nla_policy tcaa_policy[TCA_ROOT_MAX + 1] = {
 	[TCA_ROOT_FLAGS] = { .type = NLA_BITFIELD_32,
 			     .validation_data = &tcaa_root_flags_allowed },
+	[TCA_ROOT_TIME_DELTA]      = { .type = NLA_U32 },
 };
 
 static int tc_ctl_action(struct sk_buff *skb, struct nlmsghdr *n,
@@ -1167,7 +1176,9 @@ static int tc_dump_action(struct sk_buff *skb, struct netlink_callback *cb)
 	struct nla_bitfield_32 fb;
 	struct nlattr *count_attr = NULL;
 	struct nlattr *tb[TCA_ROOT_MAX + 1];
+	unsigned long jiffy_since = 0;
 	struct nlattr *kind = NULL;
+	u32 msecs_since = 0;
 	u32 act_count = 0;
 
 	ret = nlmsg_parse(cb->nlh, sizeof(struct tcamsg), tb, TCA_ROOT_MAX,
@@ -1188,12 +1199,19 @@ static int tc_dump_action(struct sk_buff *skb, struct netlink_callback *cb)
 	if (tb[TCA_ROOT_FLAGS])
 		fb = nla_get_bitfield_32(tb[TCA_ROOT_FLAGS]);
 
+	if (tb[TCA_ROOT_TIME_DELTA])
+		msecs_since = nla_get_u32(tb[TCA_ROOT_TIME_DELTA]);
+
 	nlh = nlmsg_put(skb, NETLINK_CB(cb->skb).portid, cb->nlh->nlmsg_seq,
 			cb->nlh->nlmsg_type, sizeof(*t), 0);
 	if (!nlh)
 		goto out_module_put;
 
+	if (msecs_since)
+		jiffy_since = jiffies - msecs_to_jiffies(msecs_since);
+
 	cb->args[2] = fb.nla_value;
+	cb->args[3] = jiffy_since;
 	t = nlmsg_data(nlh);
 	t->tca_family = AF_UNSPEC;
 	t->tca__pad1 = 0;
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 34+ messages in thread

* Re: [PATCH net-next v11 1/4] net netlink: Add new type NLA_BITFIELD_32
  2017-07-24  1:35 ` [PATCH net-next v11 1/4] net netlink: Add new type NLA_BITFIELD_32 Jamal Hadi Salim
@ 2017-07-24 11:14   ` Jiri Pirko
  2017-07-25 11:14     ` Jamal Hadi Salim
  2017-07-24 11:18   ` Jiri Pirko
  2017-07-25 14:41   ` David Ahern
  2 siblings, 1 reply; 34+ messages in thread
From: Jiri Pirko @ 2017-07-24 11:14 UTC (permalink / raw)
  To: Jamal Hadi Salim
  Cc: davem, netdev, xiyou.wangcong, dsahern, eric.dumazet, mrv,
	simon.horman, alex.aring

Mon, Jul 24, 2017 at 03:35:43AM CEST, jhs@mojatatu.com wrote:
>From: Jamal Hadi Salim <jhs@mojatatu.com>
>
>Generic bitflags attribute content sent to the kernel by user.
>With this type the user can either set or unset a flag in the
>kernel.
>
>The nla_value is a bitmap that defines the values being set
>The nla_selector is a bitmask that defines which value is legit.
>
>A check is made to ensure the rules that a kernel subsystem always
>conforms to bitflags the kernel already knows about. i.e
>if the user tries to set a bit flag that is not understood then
>the _it will be rejected_.
>
>In the most basic form, the user specifies the attribute policy as:
>[ATTR_GOO] = { .type = NLA_BITFIELD_32, .validation_data = &myvalidflags },
>
>where myvalidflags is the bit mask of the flags the kernel understands.
>
>If the user _does not_ provide myvalidflags then the attribute will
>also be rejected.
>
>Examples:
>nla_value = 0x0, and nla_selector = 0x1
>implies we are selecting bit 1 and we want to set its value to 0.
>
>nla_value = 0x2, and nla_selector = 0x2
>implies we are selecting bit 2 and we want to set its value to 1.
>
>Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com>
>---
> include/net/netlink.h        |  4 ++++
> include/uapi/linux/netlink.h | 17 +++++++++++++++++
> lib/nlattr.c                 | 21 +++++++++++++++++++++
> 3 files changed, 42 insertions(+)
>
>diff --git a/include/net/netlink.h b/include/net/netlink.h
>index ef8e6c3..e33d1fb 100644
>--- a/include/net/netlink.h
>+++ b/include/net/netlink.h
>@@ -178,6 +178,7 @@ enum {
> 	NLA_S16,
> 	NLA_S32,
> 	NLA_S64,
>+	NLA_BITFIELD_32,
> 	__NLA_TYPE_MAX,
> };
> 
>@@ -206,6 +207,7 @@ enum {
>  *    NLA_MSECS            Leaving the length field zero will verify the
>  *                         given type fits, using it verifies minimum length
>  *                         just like "All other"
>+ *    NLA_BITFIELD_32      A 32-bit bitmap/bitselector attribute
>  *    All other            Minimum length of attribute payload
>  *
>  * Example:
>@@ -213,11 +215,13 @@ enum {
>  * 	[ATTR_FOO] = { .type = NLA_U16 },
>  *	[ATTR_BAR] = { .type = NLA_STRING, .len = BARSIZ },
>  *	[ATTR_BAZ] = { .len = sizeof(struct mystruct) },
>+ *	[ATTR_GOO] = { .type = NLA_BITFIELD_32, .validation_data = &myvalidflags },
>  * };
>  */
> struct nla_policy {
> 	u16		type;
> 	u16		len;
>+	void            *validation_data;
> };
> 
> /**
>diff --git a/include/uapi/linux/netlink.h b/include/uapi/linux/netlink.h
>index f86127a..0ac05a6 100644
>--- a/include/uapi/linux/netlink.h
>+++ b/include/uapi/linux/netlink.h
>@@ -226,5 +226,22 @@ struct nlattr {
> #define NLA_ALIGN(len)		(((len) + NLA_ALIGNTO - 1) & ~(NLA_ALIGNTO - 1))
> #define NLA_HDRLEN		((int) NLA_ALIGN(sizeof(struct nlattr)))
> 
>+/* Generic 32 bitflags attribute content sent to the kernel.
>+ *
>+ * The nla_value is a bitmap that defines the values being set
>+ * The nla_selector is a bitmask that defines which value is legit
>+ *
>+ * Examples:
>+ *  nla_value = 0x0, and nla_selector = 0x1
>+ *  implies we are selecting bit 1 and we want to set its value to 0.
>+ *
>+ *  nla_value = 0x2, and nla_selector = 0x2
>+ *  implies we are selecting bit 2 and we want to set its value to 1.
>+ *
>+ */
>+struct nla_bitfield_32 {
>+	__u32 nla_value;
>+	__u32 nla_selector;

I would just have "value" and "selector" here. "nla_" prefix indicates
netlink attrubute, like in struct nlattr - nla_type, nla_len,
and that is wrong indication. 



>+};
> 
> #endif /* _UAPI__LINUX_NETLINK_H */
>diff --git a/lib/nlattr.c b/lib/nlattr.c
>index fb52435..c8aad7e 100644
>--- a/lib/nlattr.c
>+++ b/lib/nlattr.c
>@@ -27,6 +27,20 @@
> 	[NLA_S64]	= sizeof(s64),
> };
> 
>+static int validate_nla_bitfield_32(const struct nlattr *nla, void *valid_data)

This should be:
static int validate_nla_bitfield_32(const struct nlattr *nla, u32 *valid_flags_allowed)


Other than this 2 nits, this looks good.



>+{
>+	const struct nla_bitfield_32 *nbf = nla_data(nla);
>+	u32 *valid_flags_mask = valid_data;
>+
>+	if (!valid_data)
>+		return -EINVAL;
>+
>+	if (nbf->nla_value & ~*valid_flags_mask)
>+		return -EINVAL;
>+
>+	return 0;
>+}
>+
> static int validate_nla(const struct nlattr *nla, int maxtype,
> 			const struct nla_policy *policy)
> {
>@@ -46,6 +60,13 @@ static int validate_nla(const struct nlattr *nla, int maxtype,
> 			return -ERANGE;
> 		break;
> 
>+	case NLA_BITFIELD_32:
>+		if (attrlen != sizeof(struct nla_bitfield_32))
>+			return -ERANGE;
>+
>+		return validate_nla_bitfield_32(nla, pt->validation_data);
>+		break;
>+
> 	case NLA_NUL_STRING:
> 		if (pt->len)
> 			minlen = min_t(int, attrlen, pt->len + 1);
>-- 
>1.9.1
>

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH net-next v11 1/4] net netlink: Add new type NLA_BITFIELD_32
  2017-07-24  1:35 ` [PATCH net-next v11 1/4] net netlink: Add new type NLA_BITFIELD_32 Jamal Hadi Salim
  2017-07-24 11:14   ` Jiri Pirko
@ 2017-07-24 11:18   ` Jiri Pirko
  2017-07-25 11:15     ` Jamal Hadi Salim
  2017-07-25 14:41   ` David Ahern
  2 siblings, 1 reply; 34+ messages in thread
From: Jiri Pirko @ 2017-07-24 11:18 UTC (permalink / raw)
  To: Jamal Hadi Salim
  Cc: davem, netdev, xiyou.wangcong, dsahern, eric.dumazet, mrv,
	simon.horman, alex.aring

Mon, Jul 24, 2017 at 03:35:43AM CEST, jhs@mojatatu.com wrote:
>From: Jamal Hadi Salim <jhs@mojatatu.com>
>
>Generic bitflags attribute content sent to the kernel by user.
>With this type the user can either set or unset a flag in the
>kernel.
>
>The nla_value is a bitmap that defines the values being set
>The nla_selector is a bitmask that defines which value is legit.
>
>A check is made to ensure the rules that a kernel subsystem always
>conforms to bitflags the kernel already knows about. i.e
>if the user tries to set a bit flag that is not understood then
>the _it will be rejected_.
>
>In the most basic form, the user specifies the attribute policy as:
>[ATTR_GOO] = { .type = NLA_BITFIELD_32, .validation_data = &myvalidflags },
>
>where myvalidflags is the bit mask of the flags the kernel understands.
>
>If the user _does not_ provide myvalidflags then the attribute will
>also be rejected.
>
>Examples:
>nla_value = 0x0, and nla_selector = 0x1
>implies we are selecting bit 1 and we want to set its value to 0.
>
>nla_value = 0x2, and nla_selector = 0x2
>implies we are selecting bit 2 and we want to set its value to 1.

Oh, 2 more things:

[...]


>@@ -46,6 +60,13 @@ static int validate_nla(const struct nlattr *nla, int maxtype,
> 			return -ERANGE;
> 		break;
> 
>+	case NLA_BITFIELD_32:

Now that I'm looking at it, perhaps just "NLA_BITFIELD32" looks nicer
and aligns with "NLA_U32" and others.



>+		if (attrlen != sizeof(struct nla_bitfield_32))
>+			return -ERANGE;
>+
>+		return validate_nla_bitfield_32(nla, pt->validation_data);
>+		break;

Remove the pointless "break" from here.

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH net-next v11 3/4] net sched actions: dump more than TCA_ACT_MAX_PRIO actions per batch
  2017-07-24  1:35 ` [PATCH net-next v11 3/4] net sched actions: dump more than TCA_ACT_MAX_PRIO actions per batch Jamal Hadi Salim
@ 2017-07-24 11:27   ` Jiri Pirko
  2017-07-25 11:22     ` Jamal Hadi Salim
  0 siblings, 1 reply; 34+ messages in thread
From: Jiri Pirko @ 2017-07-24 11:27 UTC (permalink / raw)
  To: Jamal Hadi Salim
  Cc: davem, netdev, xiyou.wangcong, dsahern, eric.dumazet, mrv,
	simon.horman, alex.aring

Mon, Jul 24, 2017 at 03:35:45AM CEST, jhs@mojatatu.com wrote:
>From: Jamal Hadi Salim <jhs@mojatatu.com>
>
>When you dump hundreds of thousands of actions, getting only 32 per
>dump batch even when the socket buffer and memory allocations allow
>is inefficient.
>
>With this change, the user will get as many as possibly fitting
>within the given constraints available to the kernel.
>
>The top level action TLV space is extended. An attribute
>TCA_ROOT_FLAGS is used to carry flags; flag TCA_FLAG_LARGE_DUMP_ON
>is set by the user indicating the user is capable of processing
>these large dumps. Older user space which doesnt set this flag
>doesnt get the large (than 32) batches.
>The kernel uses the TCA_ROOT_COUNT attribute to tell the user how many
>actions are put in a single batch. As such user space app knows how long
>to iterate (independent of the type of action being dumped)
>instead of hardcoded maximum of 32 thus maintaining backward compat.
>
>Some results dumping 1.5M actions below:
>first an unpatched tc which doesnt understand these features...
>
>prompt$ time -p tc actions ls action gact | grep index | wc -l
>1500000
>real 1388.43
>user 2.07
>sys 1386.79
>
>Now lets see a patched tc which sets the correct flags when requesting
>a dump:
>
>prompt$ time -p updatedtc actions ls action gact | grep index | wc -l
>1500000
>real 178.13
>user 2.02
>sys 176.96
>
>That is about 8x performance improvement for tc app which sets its
>receive buffer to about 32K.
>
>Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com>
>---
> include/net/netlink.h          | 12 +++++++++++
> include/uapi/linux/rtnetlink.h | 22 +++++++++++++++++--
> net/sched/act_api.c            | 48 +++++++++++++++++++++++++++++++++---------
> 3 files changed, 70 insertions(+), 12 deletions(-)
>
>diff --git a/include/net/netlink.h b/include/net/netlink.h
>index e33d1fb..87c0b15 100644
>--- a/include/net/netlink.h
>+++ b/include/net/netlink.h
>@@ -1207,6 +1207,18 @@ static inline struct in6_addr nla_get_in6_addr(const struct nlattr *nla)
> }
> 
> /**
>+ * nla_get_bitfield_32 - return payload of 32 bitfield attribute
>+ * @nla: nla_bitfield_32 attribute
>+ */
>+static inline struct nla_bitfield_32 nla_get_bitfield_32(const struct nlattr *nla)
>+{
>+	struct nla_bitfield_32 tmp;
>+
>+	nla_memcpy(&tmp, nla, sizeof(tmp));
>+	return tmp;
>+}

This helper should be part of the previous patch.


>+
>+/**
>  * nla_memdup - duplicate attribute memory (kmemdup)
>  * @src: netlink attribute to duplicate from
>  * @gfp: GFP mask
>diff --git a/include/uapi/linux/rtnetlink.h b/include/uapi/linux/rtnetlink.h
>index d148505..bfa80a6 100644
>--- a/include/uapi/linux/rtnetlink.h
>+++ b/include/uapi/linux/rtnetlink.h
>@@ -683,10 +683,28 @@ struct tcamsg {
> 	unsigned char	tca__pad1;
> 	unsigned short	tca__pad2;
> };
>+
>+enum {
>+	TCA_ROOT_UNSPEC,
>+	TCA_ROOT_TAB,
>+#define TCA_ACT_TAB TCA_ROOT_TAB
>+#define TCAA_MAX TCA_ROOT_TAB
>+	TCA_ROOT_FLAGS,
>+	TCA_ROOT_COUNT,
>+	__TCA_ROOT_MAX,
>+#define	TCA_ROOT_MAX (__TCA_ROOT_MAX - 1)
>+};
>+
> #define TA_RTA(r)  ((struct rtattr*)(((char*)(r)) + NLMSG_ALIGN(sizeof(struct tcamsg))))
> #define TA_PAYLOAD(n) NLMSG_PAYLOAD(n,sizeof(struct tcamsg))
>-#define TCA_ACT_TAB 1 /* attr type must be >=1 */	
>-#define TCAA_MAX 1
>+/* tcamsg flags stored in attribute TCA_ROOT_FLAGS
>+ *
>+ * TCA_FLAG_LARGE_DUMP_ON user->kernel to request for larger than TCA_ACT_MAX_PRIO
>+ * actions in a dump. All dump responses will contain the number of actions
>+ * being dumped stored in for user app's consumption in TCA_ROOT_COUNT
>+ *
>+ */
>+#define TCA_FLAG_LARGE_DUMP_ON		(1 << 0)
> 
> /* New extended info filters for IFLA_EXT_MASK */
> #define RTEXT_FILTER_VF		(1 << 0)
>diff --git a/net/sched/act_api.c b/net/sched/act_api.c
>index 848370e..15d6c46 100644
>--- a/net/sched/act_api.c
>+++ b/net/sched/act_api.c
>@@ -110,6 +110,7 @@ static int tcf_dump_walker(struct tcf_hashinfo *hinfo, struct sk_buff *skb,
> 			   struct netlink_callback *cb)
> {
> 	int err = 0, index = -1, i = 0, s_i = 0, n_i = 0;
>+	u32 act_flags = cb->args[2];
> 	struct nlattr *nest;
> 
> 	spin_lock_bh(&hinfo->lock);
>@@ -138,14 +139,18 @@ static int tcf_dump_walker(struct tcf_hashinfo *hinfo, struct sk_buff *skb,
> 			}
> 			nla_nest_end(skb, nest);
> 			n_i++;
>-			if (n_i >= TCA_ACT_MAX_PRIO)
>+			if (!(act_flags & TCA_FLAG_LARGE_DUMP_ON) &&
>+			    n_i >= TCA_ACT_MAX_PRIO)
> 				goto done;
> 		}
> 	}
> done:
> 	spin_unlock_bh(&hinfo->lock);
>-	if (n_i)
>+	if (n_i) {
> 		cb->args[0] += n_i;
>+		if (act_flags & TCA_FLAG_LARGE_DUMP_ON)
>+			cb->args[1] = n_i;
>+	}
> 	return n_i;
> 
> nla_put_failure:
>@@ -1068,11 +1073,17 @@ static int tcf_action_add(struct net *net, struct nlattr *nla,
> 	return tcf_add_notify(net, n, &actions, portid);
> }
> 
>+static u32 tcaa_root_flags_allowed = TCA_FLAG_LARGE_DUMP_ON;
>+static const struct nla_policy tcaa_policy[TCA_ROOT_MAX + 1] = {
>+	[TCA_ROOT_FLAGS] = { .type = NLA_BITFIELD_32,
>+			     .validation_data = &tcaa_root_flags_allowed },
>+};
>+
> static int tc_ctl_action(struct sk_buff *skb, struct nlmsghdr *n,
> 			 struct netlink_ext_ack *extack)
> {
> 	struct net *net = sock_net(skb->sk);
>-	struct nlattr *tca[TCAA_MAX + 1];
>+	struct nlattr *tca[TCA_ROOT_MAX + 1];
> 	u32 portid = skb ? NETLINK_CB(skb).portid : 0;
> 	int ret = 0, ovr = 0;
> 
>@@ -1080,7 +1091,7 @@ static int tc_ctl_action(struct sk_buff *skb, struct nlmsghdr *n,
> 	    !netlink_capable(skb, CAP_NET_ADMIN))
> 		return -EPERM;
> 
>-	ret = nlmsg_parse(n, sizeof(struct tcamsg), tca, TCAA_MAX, NULL,
>+	ret = nlmsg_parse(n, sizeof(struct tcamsg), tca, TCA_ROOT_MAX, NULL,
> 			  extack);
> 	if (ret < 0)
> 		return ret;
>@@ -1121,16 +1132,12 @@ static int tc_ctl_action(struct sk_buff *skb, struct nlmsghdr *n,
> 	return ret;
> }
> 
>-static struct nlattr *find_dump_kind(const struct nlmsghdr *n)
>+static struct nlattr *find_dump_kind(struct nlattr **nla)
> {
> 	struct nlattr *tb1, *tb2[TCA_ACT_MAX + 1];
> 	struct nlattr *tb[TCA_ACT_MAX_PRIO + 1];
>-	struct nlattr *nla[TCAA_MAX + 1];
> 	struct nlattr *kind;
> 
>-	if (nlmsg_parse(n, sizeof(struct tcamsg), nla, TCAA_MAX,
>-			NULL, NULL) < 0)
>-		return NULL;
> 	tb1 = nla[TCA_ACT_TAB];
> 	if (tb1 == NULL)
> 		return NULL;
>@@ -1157,8 +1164,18 @@ static int tc_dump_action(struct sk_buff *skb, struct netlink_callback *cb)
> 	struct tc_action_ops *a_o;
> 	int ret = 0;
> 	struct tcamsg *t = (struct tcamsg *) nlmsg_data(cb->nlh);
>-	struct nlattr *kind = find_dump_kind(cb->nlh);
>+	struct nla_bitfield_32 fb;
>+	struct nlattr *count_attr = NULL;
>+	struct nlattr *tb[TCA_ROOT_MAX + 1];
>+	struct nlattr *kind = NULL;

Reverse christmas tree :D


>+	u32 act_count = 0;
>+
>+	ret = nlmsg_parse(cb->nlh, sizeof(struct tcamsg), tb, TCA_ROOT_MAX,
>+			  tcaa_policy, NULL);
>+	if (ret < 0)
>+		return ret;
> 
>+	kind = find_dump_kind(tb);
> 	if (kind == NULL) {
> 		pr_info("tc_dump_action: action bad kind\n");
> 		return 0;
>@@ -1168,14 +1185,22 @@ static int tc_dump_action(struct sk_buff *skb, struct netlink_callback *cb)
> 	if (a_o == NULL)
> 		return 0;
> 
>+	if (tb[TCA_ROOT_FLAGS])
>+		fb = nla_get_bitfield_32(tb[TCA_ROOT_FLAGS]);

fb? bf? nbf? Please make this synced within the patchset.


Don't you need to mask value with selector? In fact, I think that
nla_get_bitfield_32 could just return u32 which would be (value&selector).
The validation takes care of unsupported bits.


>+
> 	nlh = nlmsg_put(skb, NETLINK_CB(cb->skb).portid, cb->nlh->nlmsg_seq,
> 			cb->nlh->nlmsg_type, sizeof(*t), 0);
> 	if (!nlh)
> 		goto out_module_put;
>+
>+	cb->args[2] = fb.nla_value;
> 	t = nlmsg_data(nlh);
> 	t->tca_family = AF_UNSPEC;
> 	t->tca__pad1 = 0;
> 	t->tca__pad2 = 0;
>+	count_attr = nla_reserve(skb, TCA_ROOT_COUNT, sizeof(u32));
>+	if (!count_attr)
>+		goto out_module_put;
> 
> 	nest = nla_nest_start(skb, TCA_ACT_TAB);
> 	if (nest == NULL)
>@@ -1188,6 +1213,9 @@ static int tc_dump_action(struct sk_buff *skb, struct netlink_callback *cb)
> 	if (ret > 0) {
> 		nla_nest_end(skb, nest);
> 		ret = skb->len;
>+		act_count = cb->args[1];
>+		memcpy(nla_data(count_attr), &act_count, sizeof(u32));
>+		cb->args[1] = 0;
> 	} else
> 		nlmsg_trim(skb, b);
> 
>-- 
>1.9.1
>

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH net-next v11 4/4] net sched actions: add time filter for action dumping
  2017-07-24  1:35 ` [PATCH net-next v11 4/4] net sched actions: add time filter for action dumping Jamal Hadi Salim
@ 2017-07-24 11:34   ` Jiri Pirko
  2017-07-25 11:27     ` Jamal Hadi Salim
  0 siblings, 1 reply; 34+ messages in thread
From: Jiri Pirko @ 2017-07-24 11:34 UTC (permalink / raw)
  To: Jamal Hadi Salim
  Cc: davem, netdev, xiyou.wangcong, dsahern, eric.dumazet, mrv,
	simon.horman, alex.aring

Mon, Jul 24, 2017 at 03:35:46AM CEST, jhs@mojatatu.com wrote:
>From: Jamal Hadi Salim <jhs@mojatatu.com>
>
>This patch adds support for filtering based on time since last used.
>When we are dumping a large number of actions it is useful to
>have the option of filtering based on when the action was last
>used to reduce the amount of data crossing to user space.
>
>With this patch the user space app sets the TCA_ROOT_TIME_DELTA
>attribute with the value in milliseconds with "time of interest
>since now".  The kernel converts this to jiffies and does the
>filtering comparison matching entries that have seen activity
>since then and returns them to user space.
>Old kernels and old tc continue to work in legacy mode since
>they dont specify this attribute.

[...]


>@@ -128,6 +129,11 @@ static int tcf_dump_walker(struct tcf_hashinfo *hinfo, struct sk_buff *skb,
> 			if (index < s_i)
> 				continue;
> 
>+			if (jiffy_since &&
>+			    time_after(jiffy_since,
>+				       (unsigned long)p->tcfa_tm.lastuse))

You don't need to check jiffy_since==0. Also, nicer  ^^ this with a space :)

Other than this, looks fine. Thanks.

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH net-next v11 1/4] net netlink: Add new type NLA_BITFIELD_32
  2017-07-24 11:14   ` Jiri Pirko
@ 2017-07-25 11:14     ` Jamal Hadi Salim
  0 siblings, 0 replies; 34+ messages in thread
From: Jamal Hadi Salim @ 2017-07-25 11:14 UTC (permalink / raw)
  To: Jiri Pirko
  Cc: davem, netdev, xiyou.wangcong, dsahern, eric.dumazet, mrv,
	simon.horman, alex.aring

On 17-07-24 07:14 AM, Jiri Pirko wrote:
> Mon, Jul 24, 2017 at 03:35:43AM CEST, jhs@mojatatu.com wrote:
>> From: Jamal Hadi Salim <jhs@mojatatu.com>
>>
>> Generic bitflags attribute content sent to the kernel by user.
>> With this type the user can either set or unset a flag in the
>> kernel.
>>
>> The nla_value is a bitmap that defines the values being set
>> The nla_selector is a bitmask that defines which value is legit.
>>
>> A check is made to ensure the rules that a kernel subsystem always
>> conforms to bitflags the kernel already knows about. i.e
>> if the user tries to set a bit flag that is not understood then
>> the _it will be rejected_.
>>
>> In the most basic form, the user specifies the attribute policy as:
>> [ATTR_GOO] = { .type = NLA_BITFIELD_32, .validation_data = &myvalidflags },
>>
>> where myvalidflags is the bit mask of the flags the kernel understands.
>>
>> If the user _does not_ provide myvalidflags then the attribute will
>> also be rejected.
>>
>> Examples:
>> nla_value = 0x0, and nla_selector = 0x1
>> implies we are selecting bit 1 and we want to set its value to 0.
>>
>> nla_value = 0x2, and nla_selector = 0x2
>> implies we are selecting bit 2 and we want to set its value to 1.
>>
>> Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com>
>> ---
>> include/net/netlink.h        |  4 ++++
>> include/uapi/linux/netlink.h | 17 +++++++++++++++++
>> lib/nlattr.c                 | 21 +++++++++++++++++++++
>> 3 files changed, 42 insertions(+)
>>
>> diff --git a/include/net/netlink.h b/include/net/netlink.h
>> index ef8e6c3..e33d1fb 100644
>> --- a/include/net/netlink.h
>> +++ b/include/net/netlink.h
>> @@ -178,6 +178,7 @@ enum {
>> 	NLA_S16,
>> 	NLA_S32,
>> 	NLA_S64,
>> +	NLA_BITFIELD_32,
>> 	__NLA_TYPE_MAX,
>> };
>>
>> @@ -206,6 +207,7 @@ enum {
>>   *    NLA_MSECS            Leaving the length field zero will verify the
>>   *                         given type fits, using it verifies minimum length
>>   *                         just like "All other"
>> + *    NLA_BITFIELD_32      A 32-bit bitmap/bitselector attribute
>>   *    All other            Minimum length of attribute payload
>>   *
>>   * Example:
>> @@ -213,11 +215,13 @@ enum {
>>   * 	[ATTR_FOO] = { .type = NLA_U16 },
>>   *	[ATTR_BAR] = { .type = NLA_STRING, .len = BARSIZ },
>>   *	[ATTR_BAZ] = { .len = sizeof(struct mystruct) },
>> + *	[ATTR_GOO] = { .type = NLA_BITFIELD_32, .validation_data = &myvalidflags },
>>   * };
>>   */
>> struct nla_policy {
>> 	u16		type;
>> 	u16		len;
>> +	void            *validation_data;
>> };
>>
>> /**
>> diff --git a/include/uapi/linux/netlink.h b/include/uapi/linux/netlink.h
>> index f86127a..0ac05a6 100644
>> --- a/include/uapi/linux/netlink.h
>> +++ b/include/uapi/linux/netlink.h
>> @@ -226,5 +226,22 @@ struct nlattr {
>> #define NLA_ALIGN(len)		(((len) + NLA_ALIGNTO - 1) & ~(NLA_ALIGNTO - 1))
>> #define NLA_HDRLEN		((int) NLA_ALIGN(sizeof(struct nlattr)))
>>
>> +/* Generic 32 bitflags attribute content sent to the kernel.
>> + *
>> + * The nla_value is a bitmap that defines the values being set
>> + * The nla_selector is a bitmask that defines which value is legit
>> + *
>> + * Examples:
>> + *  nla_value = 0x0, and nla_selector = 0x1
>> + *  implies we are selecting bit 1 and we want to set its value to 0.
>> + *
>> + *  nla_value = 0x2, and nla_selector = 0x2
>> + *  implies we are selecting bit 2 and we want to set its value to 1.
>> + *
>> + */
>> +struct nla_bitfield_32 {
>> +	__u32 nla_value;
>> +	__u32 nla_selector;
> 
> I would just have "value" and "selector" here. "nla_" prefix indicates
> netlink attrubute, like in struct nlattr - nla_type, nla_len,
> and that is wrong indication.
> 

Sure ;->

> 
> 
>> +};
>>
>> #endif /* _UAPI__LINUX_NETLINK_H */
>> diff --git a/lib/nlattr.c b/lib/nlattr.c
>> index fb52435..c8aad7e 100644
>> --- a/lib/nlattr.c
>> +++ b/lib/nlattr.c
>> @@ -27,6 +27,20 @@
>> 	[NLA_S64]	= sizeof(s64),
>> };
>>
>> +static int validate_nla_bitfield_32(const struct nlattr *nla, void *valid_data)
> 
> This should be:
> static int validate_nla_bitfield_32(const struct nlattr *nla, u32 *valid_flags_allowed)
> 
> 
> Other than this 2 nits, this looks good.
> 
> 

Are you sure Jiri? ;->

cheers,
jamal

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH net-next v11 1/4] net netlink: Add new type NLA_BITFIELD_32
  2017-07-24 11:18   ` Jiri Pirko
@ 2017-07-25 11:15     ` Jamal Hadi Salim
  0 siblings, 0 replies; 34+ messages in thread
From: Jamal Hadi Salim @ 2017-07-25 11:15 UTC (permalink / raw)
  To: Jiri Pirko
  Cc: davem, netdev, xiyou.wangcong, dsahern, eric.dumazet, mrv,
	simon.horman, alex.aring

On 17-07-24 07:18 AM, Jiri Pirko wrote:
> Mon, Jul 24, 2017 at 03:35:43AM CEST, jhs@mojatatu.com wrote:
>> From: Jamal Hadi Salim <jhs@mojatatu.com>
>>
>> Generic bitflags attribute content sent to the kernel by user.
>> With this type the user can either set or unset a flag in the
>> kernel.
>>
>> The nla_value is a bitmap that defines the values being set
>> The nla_selector is a bitmask that defines which value is legit.
>>
>> A check is made to ensure the rules that a kernel subsystem always
>> conforms to bitflags the kernel already knows about. i.e
>> if the user tries to set a bit flag that is not understood then
>> the _it will be rejected_.
>>
>> In the most basic form, the user specifies the attribute policy as:
>> [ATTR_GOO] = { .type = NLA_BITFIELD_32, .validation_data = &myvalidflags },
>>
>> where myvalidflags is the bit mask of the flags the kernel understands.
>>
>> If the user _does not_ provide myvalidflags then the attribute will
>> also be rejected.
>>
>> Examples:
>> nla_value = 0x0, and nla_selector = 0x1
>> implies we are selecting bit 1 and we want to set its value to 0.
>>
>> nla_value = 0x2, and nla_selector = 0x2
>> implies we are selecting bit 2 and we want to set its value to 1.
> 
> Oh, 2 more things:
> 
> [...]
> 

;-> ok will do that next opportunity (probably at 30K feet).

cheers,
jamal

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH net-next v11 3/4] net sched actions: dump more than TCA_ACT_MAX_PRIO actions per batch
  2017-07-24 11:27   ` Jiri Pirko
@ 2017-07-25 11:22     ` Jamal Hadi Salim
  2017-07-25 11:33       ` Jiri Pirko
  0 siblings, 1 reply; 34+ messages in thread
From: Jamal Hadi Salim @ 2017-07-25 11:22 UTC (permalink / raw)
  To: Jiri Pirko
  Cc: davem, netdev, xiyou.wangcong, dsahern, eric.dumazet, mrv,
	simon.horman, alex.aring

On 17-07-24 07:27 AM, Jiri Pirko wrote:
> Mon, Jul 24, 2017 at 03:35:45AM CEST, jhs@mojatatu.com wrote:
>> From: Jamal Hadi Salim <jhs@mojatatu.com>
[..]
> 
> This helper should be part of the previous patch.
> 

Will do next update.
> 

 >> @@ -1157,8 +1164,18 @@ static int tc_dump_action(struct sk_buff 
*skb, struct netlink_callback *cb)
 >> 	struct tc_action_ops *a_o;
 >> 	int ret = 0;
 >> 	struct tcamsg *t = (struct tcamsg *) nlmsg_data(cb->nlh);
 >> -	struct nlattr *kind = find_dump_kind(cb->nlh);
 >> +	struct nla_bitfield_32 fb;
 >> +	struct nlattr *count_attr = NULL;
 >> +	struct nlattr *tb[TCA_ROOT_MAX + 1];
 >> +	struct nlattr *kind = NULL;
 >
 > Reverse christmas tree :D
 >

There were already 2 christmas trees in place;->
I will re-arrange this.

>>
>> +	if (tb[TCA_ROOT_FLAGS])
>> +		fb = nla_get_bitfield_32(tb[TCA_ROOT_FLAGS]);
> 
> fb? bf? nbf? Please make this synced within the patchset.
> 
>

Ok, what do you like best? ;->

> Don't you need to mask value with selector? In fact, I think that
> nla_get_bitfield_32 could just return u32 which would be (value&selector).
> The validation takes care of unsupported bits.

For my use case I dont need any of the above since I dont need to
unset things. In other use cases you will need both selector and
value in case someone wants a bit to be set to 0.
Infact I think i will rename that helper to "nla_get_bitvalue_32"
to be more precise.

cheers,
jamal

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH net-next v11 4/4] net sched actions: add time filter for action dumping
  2017-07-24 11:34   ` Jiri Pirko
@ 2017-07-25 11:27     ` Jamal Hadi Salim
  2017-07-25 11:34       ` Jiri Pirko
  0 siblings, 1 reply; 34+ messages in thread
From: Jamal Hadi Salim @ 2017-07-25 11:27 UTC (permalink / raw)
  To: Jiri Pirko
  Cc: davem, netdev, xiyou.wangcong, dsahern, eric.dumazet, mrv,
	simon.horman, alex.aring

On 17-07-24 07:34 AM, Jiri Pirko wrote:
> Mon, Jul 24, 2017 at 03:35:46AM CEST, jhs@mojatatu.com wrote:
>> From: Jamal Hadi Salim <jhs@mojatatu.com>

> 
>> @@ -128,6 +129,11 @@ static int tcf_dump_walker(struct tcf_hashinfo *hinfo, struct sk_buff *skb,
>> 			if (index < s_i)
>> 				continue;
>>
>> +			if (jiffy_since &&
>> +			    time_after(jiffy_since,
>> +				       (unsigned long)p->tcfa_tm.lastuse))
> 
> You don't need to check jiffy_since==0. Also, nicer  ^^ this with a space :)
> 

Assuming that time_after() would work fine for jiffy_since being zero,
but:
wouldnt it be more efficient to just not call time_after() altogether?

> Other than this, looks fine.

Ok, please no more changes - I am exhausted ;-> So speak for this
update or send patches afterwards if you dont like something.

cheers,
jamal

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH net-next v11 3/4] net sched actions: dump more than TCA_ACT_MAX_PRIO actions per batch
  2017-07-25 11:22     ` Jamal Hadi Salim
@ 2017-07-25 11:33       ` Jiri Pirko
  2017-07-25 12:34         ` Jamal Hadi Salim
  0 siblings, 1 reply; 34+ messages in thread
From: Jiri Pirko @ 2017-07-25 11:33 UTC (permalink / raw)
  To: Jamal Hadi Salim
  Cc: davem, netdev, xiyou.wangcong, dsahern, eric.dumazet, mrv,
	simon.horman, alex.aring

Tue, Jul 25, 2017 at 01:22:44PM CEST, jhs@mojatatu.com wrote:
>On 17-07-24 07:27 AM, Jiri Pirko wrote:
>> Mon, Jul 24, 2017 at 03:35:45AM CEST, jhs@mojatatu.com wrote:
>> > From: Jamal Hadi Salim <jhs@mojatatu.com>
>[..]
>> 
>> This helper should be part of the previous patch.
>> 
>
>Will do next update.
>> 
>
>>> @@ -1157,8 +1164,18 @@ static int tc_dump_action(struct sk_buff *skb,
>struct netlink_callback *cb)
>>> 	struct tc_action_ops *a_o;
>>> 	int ret = 0;
>>> 	struct tcamsg *t = (struct tcamsg *) nlmsg_data(cb->nlh);
>>> -	struct nlattr *kind = find_dump_kind(cb->nlh);
>>> +	struct nla_bitfield_32 fb;
>>> +	struct nlattr *count_attr = NULL;
>>> +	struct nlattr *tb[TCA_ROOT_MAX + 1];
>>> +	struct nlattr *kind = NULL;
>>
>> Reverse christmas tree :D
>>
>
>There were already 2 christmas trees in place;->
>I will re-arrange this.
>
>> > 
>> > +	if (tb[TCA_ROOT_FLAGS])
>> > +		fb = nla_get_bitfield_32(tb[TCA_ROOT_FLAGS]);
>> 
>> fb? bf? nbf? Please make this synced within the patchset.
>> 
>> 
>
>Ok, what do you like best? ;->

"bf"


>
>> Don't you need to mask value with selector? In fact, I think that
>> nla_get_bitfield_32 could just return u32 which would be (value&selector).
>> The validation takes care of unsupported bits.
>
>For my use case I dont need any of the above since I dont need to
>unset things. In other use cases you will need both selector and
>value in case someone wants a bit to be set to 0.
>Infact I think i will rename that helper to "nla_get_bitvalue_32"
>to be more precise.

The getter should contain name of the type, so "nla_get_bitfield32_val"
is much better.

What if I pass val 0x1 and selector 0x0 from userspace. I don't have the
bit selected, so you should not process it in kernel, no?


>
>cheers,
>jamal

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH net-next v11 4/4] net sched actions: add time filter for action dumping
  2017-07-25 11:27     ` Jamal Hadi Salim
@ 2017-07-25 11:34       ` Jiri Pirko
  0 siblings, 0 replies; 34+ messages in thread
From: Jiri Pirko @ 2017-07-25 11:34 UTC (permalink / raw)
  To: Jamal Hadi Salim
  Cc: davem, netdev, xiyou.wangcong, dsahern, eric.dumazet, mrv,
	simon.horman, alex.aring

Tue, Jul 25, 2017 at 01:27:03PM CEST, jhs@mojatatu.com wrote:
>On 17-07-24 07:34 AM, Jiri Pirko wrote:
>> Mon, Jul 24, 2017 at 03:35:46AM CEST, jhs@mojatatu.com wrote:
>> > From: Jamal Hadi Salim <jhs@mojatatu.com>
>
>> 
>> > @@ -128,6 +129,11 @@ static int tcf_dump_walker(struct tcf_hashinfo *hinfo, struct sk_buff *skb,
>> > 			if (index < s_i)
>> > 				continue;
>> > 
>> > +			if (jiffy_since &&
>> > +			    time_after(jiffy_since,
>> > +				       (unsigned long)p->tcfa_tm.lastuse))
>> 
>> You don't need to check jiffy_since==0. Also, nicer  ^^ this with a space :)
>> 
>
>Assuming that time_after() would work fine for jiffy_since being zero,
>but:
>wouldnt it be more efficient to just not call time_after() altogether?

time_after is pretty trivial. But your call.


>
>> Other than this, looks fine.
>
>Ok, please no more changes - I am exhausted ;-> So speak for this
>update or send patches afterwards if you dont like something.
>
>cheers,
>jamal
>

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH net-next v11 3/4] net sched actions: dump more than TCA_ACT_MAX_PRIO actions per batch
  2017-07-25 11:33       ` Jiri Pirko
@ 2017-07-25 12:34         ` Jamal Hadi Salim
  2017-07-25 12:37           ` Jiri Pirko
  0 siblings, 1 reply; 34+ messages in thread
From: Jamal Hadi Salim @ 2017-07-25 12:34 UTC (permalink / raw)
  To: Jiri Pirko
  Cc: davem, netdev, xiyou.wangcong, dsahern, eric.dumazet, mrv,
	simon.horman, alex.aring

On 17-07-25 07:33 AM, Jiri Pirko wrote:
> Tue, Jul 25, 2017 at 01:22:44PM CEST, jhs@mojatatu.com wrote:

>>> fb? bf? nbf? Please make this synced within the patchset.
>>>
>>>
>>
>> Ok, what do you like best? ;->
> 
> "bf"
> 

Ok.

> 
>>
>>> Don't you need to mask value with selector? In fact, I think that
>>> nla_get_bitfield_32 could just return u32 which would be (value&selector).
>>> The validation takes care of unsupported bits.
>>
>> For my use case I dont need any of the above since I dont need to
>> unset things. In other use cases you will need both selector and
>> value in case someone wants a bit to be set to 0.
>> Infact I think i will rename that helper to "nla_get_bitvalue_32"
>> to be more precise.
> 
> The getter should contain name of the type, so "nla_get_bitfield32_val"
> is much better.
> 

Actually I mispoke. I was returning the struct not the value. So
nla_get_bitfield32() is a better name.

> What if I pass val 0x1 and selector 0x0 from userspace. I don't have the
> bit selected, so you should not process it in kernel, no?
>

Yes, valid point. I am not sure - should we reject?

cheers,
jamal

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH net-next v11 3/4] net sched actions: dump more than TCA_ACT_MAX_PRIO actions per batch
  2017-07-25 12:34         ` Jamal Hadi Salim
@ 2017-07-25 12:37           ` Jiri Pirko
  2017-07-28 13:41             ` Jamal Hadi Salim
  0 siblings, 1 reply; 34+ messages in thread
From: Jiri Pirko @ 2017-07-25 12:37 UTC (permalink / raw)
  To: Jamal Hadi Salim
  Cc: davem, netdev, xiyou.wangcong, dsahern, eric.dumazet, mrv,
	simon.horman, alex.aring

Tue, Jul 25, 2017 at 02:34:58PM CEST, jhs@mojatatu.com wrote:
>On 17-07-25 07:33 AM, Jiri Pirko wrote:
>> Tue, Jul 25, 2017 at 01:22:44PM CEST, jhs@mojatatu.com wrote:
>
>> > > fb? bf? nbf? Please make this synced within the patchset.
>> > > 
>> > > 
>> > 
>> > Ok, what do you like best? ;->
>> 
>> "bf"
>> 
>
>Ok.
>
>> 
>> > 
>> > > Don't you need to mask value with selector? In fact, I think that
>> > > nla_get_bitfield_32 could just return u32 which would be (value&selector).
>> > > The validation takes care of unsupported bits.
>> > 
>> > For my use case I dont need any of the above since I dont need to
>> > unset things. In other use cases you will need both selector and
>> > value in case someone wants a bit to be set to 0.
>> > Infact I think i will rename that helper to "nla_get_bitvalue_32"
>> > to be more precise.
>> 
>> The getter should contain name of the type, so "nla_get_bitfield32_val"
>> is much better.
>> 
>
>Actually I mispoke. I was returning the struct not the value. So
>nla_get_bitfield32() is a better name.

ack


>
>> What if I pass val 0x1 and selector 0x0 from userspace. I don't have the
>> bit selected, so you should not process it in kernel, no?
>> 
>
>Yes, valid point. I am not sure - should we reject?

I think that the validation might check this and reject. Makes sense to
me.

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH net-next v11 1/4] net netlink: Add new type NLA_BITFIELD_32
  2017-07-24  1:35 ` [PATCH net-next v11 1/4] net netlink: Add new type NLA_BITFIELD_32 Jamal Hadi Salim
  2017-07-24 11:14   ` Jiri Pirko
  2017-07-24 11:18   ` Jiri Pirko
@ 2017-07-25 14:41   ` David Ahern
  2017-07-28 13:51     ` Jamal Hadi Salim
  2 siblings, 1 reply; 34+ messages in thread
From: David Ahern @ 2017-07-25 14:41 UTC (permalink / raw)
  To: Jamal Hadi Salim, davem
  Cc: netdev, jiri, xiyou.wangcong, eric.dumazet, mrv, simon.horman,
	alex.aring

On 7/23/17 7:35 PM, Jamal Hadi Salim wrote:
> In the most basic form, the user specifies the attribute policy as:
> [ATTR_GOO] = { .type = NLA_BITFIELD_32, .validation_data = &myvalidflags },
> 
> where myvalidflags is the bit mask of the flags the kernel understands.
> 
> If the user _does not_ provide myvalidflags then the attribute will
> also be rejected.

No other netlink attribute has this requirement. Users of the attributes
are the only ones that know if a value is valid or not (e.g, attribute
passing a device index) and those are always checked in line.
Furthermore, you are locking this attribute into a static meaning of
what is a valid value when flags can be valid or invalid based on other
attributes passed in the request.

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH net-next v11 3/4] net sched actions: dump more than TCA_ACT_MAX_PRIO actions per batch
  2017-07-25 12:37           ` Jiri Pirko
@ 2017-07-28 13:41             ` Jamal Hadi Salim
  2017-07-28 14:12               ` Jiri Pirko
  0 siblings, 1 reply; 34+ messages in thread
From: Jamal Hadi Salim @ 2017-07-28 13:41 UTC (permalink / raw)
  To: Jiri Pirko
  Cc: davem, netdev, xiyou.wangcong, dsahern, eric.dumazet, mrv,
	simon.horman, alex.aring

On 17-07-25 08:37 AM, Jiri Pirko wrote:
> Tue, Jul 25, 2017 at 02:34:58PM CEST, jhs@mojatatu.com wrote:
>> On 17-07-25 07:33 AM, Jiri Pirko wrote:
>>> Tue, Jul 25, 2017 at 01:22:44PM CEST, jhs@mojatatu.com wrote:

[..]
>>> What if I pass val 0x1 and selector 0x0 from userspace. I don't have the
>>> bit selected, so you should not process it in kernel, no?
>>>
>>
>> Yes, valid point. I am not sure - should we reject?
> 
> I think that the validation might check this and reject. Makes sense to
> me.
> 

How does this look? I havent tested it but covers all angles
I can think of.

static int validate_nla_bitfield32(const struct nlattr *nla,
                                     void *valid_flags_allowed)
{
         const struct nla_bitfield32 *bf = nla_data(nla);
         u32 *valid_flags_mask = valid_flags_allowed;

         if (!valid_flags_allowed)
                 return -EINVAL;
         /*disallow invalid selector */
         if ((bf->selector & valid_flags_allowed) >*valid_flags_allowed)
                 return -EINVAL;
         /*disallow invalid bit values */
         if (bf->value & ~*valid_flags_mask)
                 return -EINVAL;
         /*disallow valid bit values that are not selected*/
         if (bf->value & ~nbf->selector)
                 return -EINVAL;

         return 0;
}

cheers,
jamal

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH net-next v11 1/4] net netlink: Add new type NLA_BITFIELD_32
  2017-07-25 14:41   ` David Ahern
@ 2017-07-28 13:51     ` Jamal Hadi Salim
  2017-07-28 14:08       ` Jiri Pirko
  2017-07-28 14:19       ` David Ahern
  0 siblings, 2 replies; 34+ messages in thread
From: Jamal Hadi Salim @ 2017-07-28 13:51 UTC (permalink / raw)
  To: David Ahern, davem
  Cc: netdev, jiri, xiyou.wangcong, eric.dumazet, mrv, simon.horman,
	alex.aring

On 17-07-25 10:41 AM, David Ahern wrote:
> On 7/23/17 7:35 PM, Jamal Hadi Salim wrote:
>> In the most basic form, the user specifies the attribute policy as:
>> [ATTR_GOO] = { .type = NLA_BITFIELD_32, .validation_data = &myvalidflags },
>>
>> where myvalidflags is the bit mask of the flags the kernel understands.
>>
>> If the user _does not_ provide myvalidflags then the attribute will
>> also be rejected.
> 
> No other netlink attribute has this requirement. 

This is the first one where we have to inspect content. We add things
when we need them - as in this case.

> Users of the attributes
> are the only ones that know if a value is valid or not (e.g, attribute
> passing a device index) and those are always checked in line.

It doesnt make sense that every user of the API has to repeat that
validation code. Same principle as someone specifying that a type is
u32 and have the nla validation check it. At some point we never had
the u32 validation code. Then it was factored out because everyone
repeats the same boilerplate code.
I see this in the same spirit.

> Furthermore, you are locking this attribute into a static meaning of
> what is a valid value when flags can be valid or invalid based on other
> attributes passed in the request.
> 

That doesnt disqualify that i factored out common code that all users
of this nltype are going to cutnpaste.

On the dependency on bit presence topic: I had added an "extra
validation" ops - but it was distracting enough that i removed that
patch altogether.

cheers,
jamal

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH net-next v11 1/4] net netlink: Add new type NLA_BITFIELD_32
  2017-07-28 13:51     ` Jamal Hadi Salim
@ 2017-07-28 14:08       ` Jiri Pirko
  2017-07-28 14:19       ` David Ahern
  1 sibling, 0 replies; 34+ messages in thread
From: Jiri Pirko @ 2017-07-28 14:08 UTC (permalink / raw)
  To: Jamal Hadi Salim
  Cc: David Ahern, davem, netdev, xiyou.wangcong, eric.dumazet, mrv,
	simon.horman, alex.aring

Fri, Jul 28, 2017 at 03:51:41PM CEST, jhs@mojatatu.com wrote:
>On 17-07-25 10:41 AM, David Ahern wrote:
>> On 7/23/17 7:35 PM, Jamal Hadi Salim wrote:
>> > In the most basic form, the user specifies the attribute policy as:
>> > [ATTR_GOO] = { .type = NLA_BITFIELD_32, .validation_data = &myvalidflags },
>> > 
>> > where myvalidflags is the bit mask of the flags the kernel understands.
>> > 
>> > If the user _does not_ provide myvalidflags then the attribute will
>> > also be rejected.
>> 
>> No other netlink attribute has this requirement.
>
>This is the first one where we have to inspect content. We add things
>when we need them - as in this case.
>
>> Users of the attributes
>> are the only ones that know if a value is valid or not (e.g, attribute
>> passing a device index) and those are always checked in line.
>
>It doesnt make sense that every user of the API has to repeat that
>validation code. Same principle as someone specifying that a type is
>u32 and have the nla validation check it. At some point we never had
>the u32 validation code. Then it was factored out because everyone
>repeats the same boilerplate code.
>I see this in the same spirit.

Agreed.

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH net-next v11 3/4] net sched actions: dump more than TCA_ACT_MAX_PRIO actions per batch
  2017-07-28 13:41             ` Jamal Hadi Salim
@ 2017-07-28 14:12               ` Jiri Pirko
  2017-07-28 14:52                 ` Jamal Hadi Salim
  0 siblings, 1 reply; 34+ messages in thread
From: Jiri Pirko @ 2017-07-28 14:12 UTC (permalink / raw)
  To: Jamal Hadi Salim
  Cc: davem, netdev, xiyou.wangcong, dsahern, eric.dumazet, mrv,
	simon.horman, alex.aring

Fri, Jul 28, 2017 at 03:41:44PM CEST, jhs@mojatatu.com wrote:
>On 17-07-25 08:37 AM, Jiri Pirko wrote:
>> Tue, Jul 25, 2017 at 02:34:58PM CEST, jhs@mojatatu.com wrote:
>> > On 17-07-25 07:33 AM, Jiri Pirko wrote:
>> > > Tue, Jul 25, 2017 at 01:22:44PM CEST, jhs@mojatatu.com wrote:
>
>[..]
>> > > What if I pass val 0x1 and selector 0x0 from userspace. I don't have the
>> > > bit selected, so you should not process it in kernel, no?
>> > > 
>> > 
>> > Yes, valid point. I am not sure - should we reject?
>> 
>> I think that the validation might check this and reject. Makes sense to
>> me.
>> 
>
>How does this look? I havent tested it but covers all angles

Looks like a big mess to be honest. Mixing up u32* u32 void*. I don't
understand ****. Would be probably good to first apply my review comment
on the function itselt, then to add the checks :)


>I can think of.
>
>static int validate_nla_bitfield32(const struct nlattr *nla,
>                                    void *valid_flags_allowed)
>{
>        const struct nla_bitfield32 *bf = nla_data(nla);
>        u32 *valid_flags_mask = valid_flags_allowed;
>
>        if (!valid_flags_allowed)
>                return -EINVAL;
>        /*disallow invalid selector */
>        if ((bf->selector & valid_flags_allowed) >*valid_flags_allowed)
>                return -EINVAL;
>        /*disallow invalid bit values */
>        if (bf->value & ~*valid_flags_mask)
>                return -EINVAL;
>        /*disallow valid bit values that are not selected*/
>        if (bf->value & ~nbf->selector)
>                return -EINVAL;
>
>        return 0;
>}
>
>cheers,
>jamal

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH net-next v11 1/4] net netlink: Add new type NLA_BITFIELD_32
  2017-07-28 13:51     ` Jamal Hadi Salim
  2017-07-28 14:08       ` Jiri Pirko
@ 2017-07-28 14:19       ` David Ahern
  2017-07-28 14:55         ` Jiri Pirko
  2017-07-28 15:04         ` Jamal Hadi Salim
  1 sibling, 2 replies; 34+ messages in thread
From: David Ahern @ 2017-07-28 14:19 UTC (permalink / raw)
  To: Jamal Hadi Salim, davem
  Cc: netdev, jiri, xiyou.wangcong, eric.dumazet, mrv, simon.horman,
	alex.aring

On 7/28/17 7:51 AM, Jamal Hadi Salim wrote:
> On 17-07-25 10:41 AM, David Ahern wrote:
>> On 7/23/17 7:35 PM, Jamal Hadi Salim wrote:
>>> In the most basic form, the user specifies the attribute policy as:
>>> [ATTR_GOO] = { .type = NLA_BITFIELD_32, .validation_data =
>>> &myvalidflags },
>>>
>>> where myvalidflags is the bit mask of the flags the kernel understands.
>>>
>>> If the user _does not_ provide myvalidflags then the attribute will
>>> also be rejected.
>>
>> No other netlink attribute has this requirement. 
> 
> This is the first one where we have to inspect content. We add things
> when we need them - as in this case.

Sure, the validation is required. My argument is that the validation
should be done where other attributes are validated -- inline with its
use. Nothing about this new bitfield says it must have a generic
validation code.

> 
>> Users of the attributes
>> are the only ones that know if a value is valid or not (e.g, attribute
>> passing a device index) and those are always checked in line.
> 
> It doesnt make sense that every user of the API has to repeat that
> validation code. Same principle as someone specifying that a type is
> u32 and have the nla validation check it. At some point we never had
> the u32 validation code. Then it was factored out because everyone
> repeats the same boilerplate code.

Every user of an attribute that uses a device index must verify the
device index is valid. The same code is repeated over and over.

Now you are suggesting to have 1 attribute whose content is validated by
generic infra and the rest are validated inline by the code using it. I
believe it is wrong and going to lead to problems.

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH net-next v11 3/4] net sched actions: dump more than TCA_ACT_MAX_PRIO actions per batch
  2017-07-28 14:12               ` Jiri Pirko
@ 2017-07-28 14:52                 ` Jamal Hadi Salim
  2017-07-28 14:57                   ` Jiri Pirko
  2017-07-28 15:08                   ` Jamal Hadi Salim
  0 siblings, 2 replies; 34+ messages in thread
From: Jamal Hadi Salim @ 2017-07-28 14:52 UTC (permalink / raw)
  To: Jiri Pirko
  Cc: davem, netdev, xiyou.wangcong, dsahern, eric.dumazet, mrv,
	simon.horman, alex.aring

On 17-07-28 10:12 AM, Jiri Pirko wrote:
> Fri, Jul 28, 2017 at 03:41:44PM CEST, jhs@mojatatu.com wrote:
[..]
> 
> Looks like a big mess to be honest. Mixing up u32* u32 void*. I don't
> understand ****. Would be probably good to first apply my review comment
> on the function itselt, then to add the checks :)
> 

I havent even compiled/test that Jiri.
Just ignore the void * and assume it is a u32 *.

I am trying to avoid doing unlucky number 13 patch.
So feedback on this is good. Just look at what it is disallowing
first.

back later.

cheers,
jamal

> 
>> I can think of.
>>
>> static int validate_nla_bitfield32(const struct nlattr *nla,
>>                                     void *valid_flags_allowed)
>> {
>>         const struct nla_bitfield32 *bf = nla_data(nla);
>>         u32 *valid_flags_mask = valid_flags_allowed;
>>
>>         if (!valid_flags_allowed)
>>                 return -EINVAL;
>>         /*disallow invalid selector */
>>         if ((bf->selector & valid_flags_allowed) >*valid_flags_allowed)
>>                 return -EINVAL;
>>         /*disallow invalid bit values */
>>         if (bf->value & ~*valid_flags_mask)
>>                 return -EINVAL;
>>         /*disallow valid bit values that are not selected*/
>>         if (bf->value & ~nbf->selector)
>>                 return -EINVAL;
>>
>>         return 0;
>> }
>>
>> cheers,
>> jamal

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH net-next v11 1/4] net netlink: Add new type NLA_BITFIELD_32
  2017-07-28 14:19       ` David Ahern
@ 2017-07-28 14:55         ` Jiri Pirko
  2017-07-28 15:04         ` Jamal Hadi Salim
  1 sibling, 0 replies; 34+ messages in thread
From: Jiri Pirko @ 2017-07-28 14:55 UTC (permalink / raw)
  To: David Ahern
  Cc: Jamal Hadi Salim, davem, netdev, xiyou.wangcong, eric.dumazet,
	mrv, simon.horman, alex.aring

Fri, Jul 28, 2017 at 04:19:06PM CEST, dsahern@gmail.com wrote:
>On 7/28/17 7:51 AM, Jamal Hadi Salim wrote:
>> On 17-07-25 10:41 AM, David Ahern wrote:
>>> On 7/23/17 7:35 PM, Jamal Hadi Salim wrote:
>>>> In the most basic form, the user specifies the attribute policy as:
>>>> [ATTR_GOO] = { .type = NLA_BITFIELD_32, .validation_data =
>>>> &myvalidflags },
>>>>
>>>> where myvalidflags is the bit mask of the flags the kernel understands.
>>>>
>>>> If the user _does not_ provide myvalidflags then the attribute will
>>>> also be rejected.
>>>
>>> No other netlink attribute has this requirement. 
>> 
>> This is the first one where we have to inspect content. We add things
>> when we need them - as in this case.
>
>Sure, the validation is required. My argument is that the validation
>should be done where other attributes are validated -- inline with its
>use. Nothing about this new bitfield says it must have a generic
>validation code.
>
>> 
>>> Users of the attributes
>>> are the only ones that know if a value is valid or not (e.g, attribute
>>> passing a device index) and those are always checked in line.
>> 
>> It doesnt make sense that every user of the API has to repeat that
>> validation code. Same principle as someone specifying that a type is
>> u32 and have the nla validation check it. At some point we never had
>> the u32 validation code. Then it was factored out because everyone
>> repeats the same boilerplate code.
>
>Every user of an attribute that uses a device index must verify the
>device index is valid. The same code is repeated over and over.

This is something different. You don't have NLA_IFINDEX. If you'd have it,
might make sense to do validation on Netlink level. Ofc this is highly
hypothetical. But in Jamal's case, there is indeed NLA_BITFIELD32 and
this attribute type itself assumes some format. Therefore the validation
on Netlink level makes sense here. At least that is how I feel it.


>
>Now you are suggesting to have 1 attribute whose content is validated by
>generic infra and the rest are validated inline by the code using it. I
>believe it is wrong and going to lead to problems.

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH net-next v11 3/4] net sched actions: dump more than TCA_ACT_MAX_PRIO actions per batch
  2017-07-28 14:52                 ` Jamal Hadi Salim
@ 2017-07-28 14:57                   ` Jiri Pirko
  2017-07-28 15:08                   ` Jamal Hadi Salim
  1 sibling, 0 replies; 34+ messages in thread
From: Jiri Pirko @ 2017-07-28 14:57 UTC (permalink / raw)
  To: Jamal Hadi Salim
  Cc: davem, netdev, xiyou.wangcong, dsahern, eric.dumazet, mrv,
	simon.horman, alex.aring

Fri, Jul 28, 2017 at 04:52:02PM CEST, jhs@mojatatu.com wrote:
>On 17-07-28 10:12 AM, Jiri Pirko wrote:
>> Fri, Jul 28, 2017 at 03:41:44PM CEST, jhs@mojatatu.com wrote:
>[..]
>> 
>> Looks like a big mess to be honest. Mixing up u32* u32 void*. I don't
>> understand ****. Would be probably good to first apply my review comment
>> on the function itselt, then to add the checks :)
>> 
>
>I havent even compiled/test that Jiri.
>Just ignore the void * and assume it is a u32 *.
>
>I am trying to avoid doing unlucky number 13 patch.

I'll rather wait for you to send the code in proper shape so I can
decypher easily and don't have to guess.


>So feedback on this is good. Just look at what it is disallowing
>first.
>
>back later.
>
>cheers,
>jamal
>
>> 
>> > I can think of.
>> > 
>> > static int validate_nla_bitfield32(const struct nlattr *nla,
>> >                                     void *valid_flags_allowed)
>> > {
>> >         const struct nla_bitfield32 *bf = nla_data(nla);
>> >         u32 *valid_flags_mask = valid_flags_allowed;
>> > 
>> >         if (!valid_flags_allowed)
>> >                 return -EINVAL;
>> >         /*disallow invalid selector */
>> >         if ((bf->selector & valid_flags_allowed) >*valid_flags_allowed)
>> >                 return -EINVAL;
>> >         /*disallow invalid bit values */
>> >         if (bf->value & ~*valid_flags_mask)
>> >                 return -EINVAL;
>> >         /*disallow valid bit values that are not selected*/
>> >         if (bf->value & ~nbf->selector)
>> >                 return -EINVAL;
>> > 
>> >         return 0;
>> > }
>> > 
>> > cheers,
>> > jamal
>

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH net-next v11 1/4] net netlink: Add new type NLA_BITFIELD_32
  2017-07-28 14:19       ` David Ahern
  2017-07-28 14:55         ` Jiri Pirko
@ 2017-07-28 15:04         ` Jamal Hadi Salim
  2017-07-28 15:13           ` David Ahern
  1 sibling, 1 reply; 34+ messages in thread
From: Jamal Hadi Salim @ 2017-07-28 15:04 UTC (permalink / raw)
  To: David Ahern, davem
  Cc: netdev, jiri, xiyou.wangcong, eric.dumazet, mrv, simon.horman,
	alex.aring

On 17-07-28 10:19 AM, David Ahern wrote:
> On 7/28/17 7:51 AM, Jamal Hadi Salim wrote:
>> On 17-07-25 10:41 AM, David Ahern wrote:
>>> On 7/23/17 7:35 PM, Jamal Hadi Salim wrote:
>>>> In the most basic form, the user specifies the attribute policy as:
>>>> [ATTR_GOO] = { .type = NLA_BITFIELD_32, .validation_data =
>>>> &myvalidflags },
>>>>
>>>> where myvalidflags is the bit mask of the flags the kernel understands.
>>>>
>>>> If the user _does not_ provide myvalidflags then the attribute will
>>>> also be rejected.
>>>
>>> No other netlink attribute has this requirement.
>>
>> This is the first one where we have to inspect content. We add things
>> when we need them - as in this case.
> 
> Sure, the validation is required. My argument is that the validation
> should be done where other attributes are validated -- inline with its
> use. Nothing about this new bitfield says it must have a generic
> validation code.
>
Response below.

>>
>>> Users of the attributes
>>> are the only ones that know if a value is valid or not (e.g, attribute
>>> passing a device index) and those are always checked in line.
>>
>> It doesnt make sense that every user of the API has to repeat that
>> validation code. Same principle as someone specifying that a type is
>> u32 and have the nla validation check it. At some point we never had
>> the u32 validation code. Then it was factored out because everyone
>> repeats the same boilerplate code.
> 
> Every user of an attribute that uses a device index must verify the
> device index is valid. The same code is repeated over and over.
> 
> Now you are suggesting to have 1 attribute whose content is validated by
> generic infra and the rest are validated inline by the code using it. I
> believe it is wrong and going to lead to problems.
> 

Kernel side checking for device ifindex must know what a device
ifindex means.
That doesnt disqualify that the generic code checks that it
is of the same size as a signed 32b, etc. That is generic
stuff that can be factored out.

In this case:
Checking for whether bits selected are in the allowed range
that the kernel understands, that the bit value are set in
the right bit position, that the bits set in the correct bit
value position are also selected in the transaction.
That is generic code (which the content validation does).

Checking what bit 5 means - that is specific per kernel user
code. Checking that when bit 5 is selected it is only useful
if bit 3 is unselected - that is specific to the kernel code
consuming those bits.

cheers,
jamal

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH net-next v11 3/4] net sched actions: dump more than TCA_ACT_MAX_PRIO actions per batch
  2017-07-28 14:52                 ` Jamal Hadi Salim
  2017-07-28 14:57                   ` Jiri Pirko
@ 2017-07-28 15:08                   ` Jamal Hadi Salim
  2017-07-28 15:45                     ` Jiri Pirko
  1 sibling, 1 reply; 34+ messages in thread
From: Jamal Hadi Salim @ 2017-07-28 15:08 UTC (permalink / raw)
  To: Jiri Pirko
  Cc: davem, netdev, xiyou.wangcong, dsahern, eric.dumazet, mrv,
	simon.horman, alex.aring

On 17-07-28 10:52 AM, Jamal Hadi Salim wrote:
> On 17-07-28 10:12 AM, Jiri Pirko wrote:
>> Fri, Jul 28, 2017 at 03:41:44PM CEST, jhs@mojatatu.com wrote:
> [..]
>>
>> Looks like a big mess to be honest. Mixing up u32* u32 void*. I don't
>> understand ****. Would be probably good to first apply my review comment
>> on the function itselt, then to add the checks :)
>>
> 
> I havent even compiled/test that Jiri.
> Just ignore the void * and assume it is a u32 *.
> 

This compiled - but dont have much time right now to test.
===
static int validate_nla_bitfield32(const struct nlattr *nla,
                                    u32 *valid_flags_allowed)
{
         const struct nla_bitfield32 *bf = nla_data(nla);
         u32 *valid_flags_mask = valid_flags_allowed;

         if (!valid_flags_allowed)
                 return -EINVAL;

         /*disallow invalid selector */
         if ((bf->selector & *valid_flags_allowed) > *valid_flags_allowed)
                 return -EINVAL;

         /*disallow invalid bit values */
         if (bf->value & ~*valid_flags_mask)
                 return -EINVAL;

         /*disallow valid bit values that are not selected*/
         if (bf->value & ~bf->selector)
                 return -EINVAL;

         return 0;
}
========

cheers,
jamal

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH net-next v11 1/4] net netlink: Add new type NLA_BITFIELD_32
  2017-07-28 15:04         ` Jamal Hadi Salim
@ 2017-07-28 15:13           ` David Ahern
  2017-07-28 21:55             ` Jamal Hadi Salim
  0 siblings, 1 reply; 34+ messages in thread
From: David Ahern @ 2017-07-28 15:13 UTC (permalink / raw)
  To: Jamal Hadi Salim, davem
  Cc: netdev, jiri, xiyou.wangcong, eric.dumazet, mrv, simon.horman,
	alex.aring

On 7/28/17 9:04 AM, Jamal Hadi Salim wrote:
> 
> Kernel side checking for device ifindex must know what a device
> ifindex means.
> That doesnt disqualify that the generic code checks that it
> is of the same size as a signed 32b, etc. That is generic
> stuff that can be factored out.
> 
> In this case:
> Checking for whether bits selected are in the allowed range
> that the kernel understands, that the bit value are set in
> the right bit position, that the bits set in the correct bit
> value position are also selected in the transaction.
> That is generic code (which the content validation does).

Create a helper function then. It's the validation of attribute content
in 2 places that I object to. 1 attribute is validated in 1 place
(generic infra for this bitfield attribute), the others are validated in
line. Asymmetric validations is not a good design.

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH net-next v11 3/4] net sched actions: dump more than TCA_ACT_MAX_PRIO actions per batch
  2017-07-28 15:08                   ` Jamal Hadi Salim
@ 2017-07-28 15:45                     ` Jiri Pirko
  2017-07-28 22:10                       ` Jamal Hadi Salim
  0 siblings, 1 reply; 34+ messages in thread
From: Jiri Pirko @ 2017-07-28 15:45 UTC (permalink / raw)
  To: Jamal Hadi Salim
  Cc: davem, netdev, xiyou.wangcong, dsahern, eric.dumazet, mrv,
	simon.horman, alex.aring

Fri, Jul 28, 2017 at 05:08:10PM CEST, jhs@mojatatu.com wrote:
>On 17-07-28 10:52 AM, Jamal Hadi Salim wrote:
>> On 17-07-28 10:12 AM, Jiri Pirko wrote:
>> > Fri, Jul 28, 2017 at 03:41:44PM CEST, jhs@mojatatu.com wrote:
>> [..]
>> > 
>> > Looks like a big mess to be honest. Mixing up u32* u32 void*. I don't
>> > understand ****. Would be probably good to first apply my review comment
>> > on the function itselt, then to add the checks :)
>> > 
>> 
>> I havent even compiled/test that Jiri.
>> Just ignore the void * and assume it is a u32 *.
>> 
>
>This compiled - but dont have much time right now to test.
>===
>static int validate_nla_bitfield32(const struct nlattr *nla,
>                                   u32 *valid_flags_allowed)
>{
>        const struct nla_bitfield32 *bf = nla_data(nla);
>        u32 *valid_flags_mask = valid_flags_allowed;

good one :D


>
>        if (!valid_flags_allowed)
>                return -EINVAL;
>
>        /*disallow invalid selector */
>        if ((bf->selector & *valid_flags_allowed) > *valid_flags_allowed)

I don't get the ">"....
Just (bf->selector & ~*valid_flags_allowed) should be enought, no?


>                return -EINVAL;
>
>        /*disallow invalid bit values */
>        if (bf->value & ~*valid_flags_mask)
>                return -EINVAL;
>
>        /*disallow valid bit values that are not selected*/
>        if (bf->value & ~bf->selector)
>                return -EINVAL;
>
>        return 0;
>}
>========
>
>cheers,
>jamal

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH net-next v11 1/4] net netlink: Add new type NLA_BITFIELD_32
  2017-07-28 15:13           ` David Ahern
@ 2017-07-28 21:55             ` Jamal Hadi Salim
  0 siblings, 0 replies; 34+ messages in thread
From: Jamal Hadi Salim @ 2017-07-28 21:55 UTC (permalink / raw)
  To: David Ahern, davem
  Cc: netdev, jiri, xiyou.wangcong, eric.dumazet, mrv, simon.horman,
	alex.aring

On 17-07-28 11:13 AM, David Ahern wrote:
> On 7/28/17 9:04 AM, Jamal Hadi Salim wrote:
>>
>> Kernel side checking for device ifindex must know what a device
>> ifindex means.
>> That doesnt disqualify that the generic code checks that it
>> is of the same size as a signed 32b, etc. That is generic
>> stuff that can be factored out.
>>
>> In this case:
>> Checking for whether bits selected are in the allowed range
>> that the kernel understands, that the bit value are set in
>> the right bit position, that the bits set in the correct bit
>> value position are also selected in the transaction.
>> That is generic code (which the content validation does).
> 
> Create a helper function then. It's the validation of attribute content
> in 2 places that I object to. 1 attribute is validated in 1 place
> (generic infra for this bitfield attribute), the others are validated in
> line. Asymmetric validations is not a good design.
> 

What is not generic in what is being validated though?
Content validation is when you say "This bit means the sky is blue".
You have to know the meaning of the bit.
The generic validation is opaque; "this bit is not supposed to be here".
I would argue that "this is bit is not supposed to be here unless
this other bit is present" is had to generalize but would be infra
as well (why i provided the ops for it, but removed it so we could
move on).

cheers,
jamal

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH net-next v11 3/4] net sched actions: dump more than TCA_ACT_MAX_PRIO actions per batch
  2017-07-28 15:45                     ` Jiri Pirko
@ 2017-07-28 22:10                       ` Jamal Hadi Salim
  2017-07-29  7:19                         ` Jiri Pirko
  0 siblings, 1 reply; 34+ messages in thread
From: Jamal Hadi Salim @ 2017-07-28 22:10 UTC (permalink / raw)
  To: Jiri Pirko
  Cc: davem, netdev, xiyou.wangcong, dsahern, eric.dumazet, mrv,
	simon.horman, alex.aring

On 17-07-28 11:45 AM, Jiri Pirko wrote:
> Fri, Jul 28, 2017 at 05:08:10PM CEST, jhs@mojatatu.com wrote:
>> On 17-07-28 10:52 AM, Jamal Hadi Salim wrote:
>>> On 17-07-28 10:12 AM, Jiri Pirko wrote:


>>         /*disallow invalid selector */
>>         if ((bf->selector & *valid_flags_allowed) > *valid_flags_allowed)
> 
> I don't get the ">"....
> Just (bf->selector & ~*valid_flags_allowed) should be enought, no?
> 

It may be enough - I will try it out tommorow.
I was worried about the selector having more bits than the allowed flags.

cheers,
jamal

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH net-next v11 3/4] net sched actions: dump more than TCA_ACT_MAX_PRIO actions per batch
  2017-07-28 22:10                       ` Jamal Hadi Salim
@ 2017-07-29  7:19                         ` Jiri Pirko
  2017-07-29 11:21                           ` Jamal Hadi Salim
  0 siblings, 1 reply; 34+ messages in thread
From: Jiri Pirko @ 2017-07-29  7:19 UTC (permalink / raw)
  To: Jamal Hadi Salim
  Cc: davem, netdev, xiyou.wangcong, dsahern, eric.dumazet, mrv,
	simon.horman, alex.aring

Sat, Jul 29, 2017 at 12:10:20AM CEST, jhs@mojatatu.com wrote:
>On 17-07-28 11:45 AM, Jiri Pirko wrote:
>> Fri, Jul 28, 2017 at 05:08:10PM CEST, jhs@mojatatu.com wrote:
>> > On 17-07-28 10:52 AM, Jamal Hadi Salim wrote:
>> > > On 17-07-28 10:12 AM, Jiri Pirko wrote:
>
>
>> >         /*disallow invalid selector */
>> >         if ((bf->selector & *valid_flags_allowed) > *valid_flags_allowed)
>> 
>> I don't get the ">"....
>> Just (bf->selector & ~*valid_flags_allowed) should be enought, no?
>> 
>
>It may be enough - I will try it out tommorow.
>I was worried about the selector having more bits than the allowed flags.

That is what the code I wrote above do. And I think that your code does
not (if you don't assume that allowed flags are always starting from the
least significant bit)

>
>cheers,
>jamal

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH net-next v11 3/4] net sched actions: dump more than TCA_ACT_MAX_PRIO actions per batch
  2017-07-29  7:19                         ` Jiri Pirko
@ 2017-07-29 11:21                           ` Jamal Hadi Salim
  0 siblings, 0 replies; 34+ messages in thread
From: Jamal Hadi Salim @ 2017-07-29 11:21 UTC (permalink / raw)
  To: Jiri Pirko
  Cc: davem, netdev, xiyou.wangcong, dsahern, eric.dumazet, mrv,
	simon.horman, alex.aring

[-- Attachment #1: Type: text/plain, Size: 362 bytes --]

On 17-07-29 03:19 AM, Jiri Pirko wrote:

> That is what the code I wrote above do. And I think that your code does
> not (if you don't assume that allowed flags are always starting from the
> least significant bit)

Ok, here's how it looks now..
I still have to move things around in the other patches and do testing
but i think this looks good.

cheers,
jamal


[-- Attachment #2: patchlet --]
[-- Type: text/plain, Size: 4976 bytes --]

commit 79e3e2a9d292179513df1591fb50b5bbe3bf8cc3
Author: Jamal Hadi Salim <jhs@mojatatu.com>
Date:   Sun Jul 23 08:58:04 2017 -0400

    net netlink: Add new type NLA_BITFIELD32
    
    Generic bitflags attribute content sent to the kernel by user.
    With this netlink attr type the user can either set or unset a
    flag in the kernel.
    
    The value is a bitmap that defines the bit values being set
    The selector is a bitmask that defines which value bit is to be
    considered.
    
    A check is made to ensure the rules that a kernel subsystem always
    conforms to bitflags the kernel already knows about. i.e
    if the user tries to set a bit flag that is not understood then
    the _it will be rejected_.
    
    In the most basic form, the user specifies the attribute policy as:
    [ATTR_GOO] = { .type = NLA_BITFIELD32, .validation_data = &myvalidflags },
    
    where myvalidflags is the bit mask of the flags the kernel understands.
    
    If the user _does not_ provide myvalidflags then the attribute will
    also be rejected.
    
    Examples:
    value = 0x0, and selector = 0x1
    implies we are selecting bit 1 and we want to set its value to 0.
    
    value = 0x2, and selector = 0x2
    implies we are selecting bit 2 and we want to set its value to 1.
    
    Suggested-by: Jiri Pirko <jiri@mellanox.com>
    Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com>

diff --git a/include/net/netlink.h b/include/net/netlink.h
index ef8e6c3..82dd298 100644
--- a/include/net/netlink.h
+++ b/include/net/netlink.h
@@ -178,6 +178,7 @@ enum {
 	NLA_S16,
 	NLA_S32,
 	NLA_S64,
+	NLA_BITFIELD32,
 	__NLA_TYPE_MAX,
 };
 
@@ -206,6 +207,7 @@ enum {
  *    NLA_MSECS            Leaving the length field zero will verify the
  *                         given type fits, using it verifies minimum length
  *                         just like "All other"
+ *    NLA_BITFIELD32      A 32-bit bitmap/bitselector attribute
  *    All other            Minimum length of attribute payload
  *
  * Example:
@@ -213,11 +215,13 @@ enum {
  * 	[ATTR_FOO] = { .type = NLA_U16 },
  *	[ATTR_BAR] = { .type = NLA_STRING, .len = BARSIZ },
  *	[ATTR_BAZ] = { .len = sizeof(struct mystruct) },
+ *	[ATTR_GOO] = { .type = NLA_BITFIELD32, .validation_data = &myvalidflags },
  * };
  */
 struct nla_policy {
 	u16		type;
 	u16		len;
+	void            *validation_data;
 };
 
 /**
@@ -1203,6 +1207,18 @@ static inline struct in6_addr nla_get_in6_addr(const struct nlattr *nla)
 }
 
 /**
+ * nla_get_bitfield32 - return payload of 32 bitfield attribute
+ * @nla: nla_bitfield32 attribute
+ */
+static inline struct nla_bitfield32 nla_get_bitfield32(const struct nlattr *nla)
+{
+	struct nla_bitfield32 tmp;
+
+	nla_memcpy(&tmp, nla, sizeof(tmp));
+	return tmp;
+}
+
+/**
  * nla_memdup - duplicate attribute memory (kmemdup)
  * @src: netlink attribute to duplicate from
  * @gfp: GFP mask
diff --git a/include/uapi/linux/netlink.h b/include/uapi/linux/netlink.h
index f86127a..f4fc9c9 100644
--- a/include/uapi/linux/netlink.h
+++ b/include/uapi/linux/netlink.h
@@ -226,5 +226,22 @@ struct nlattr {
 #define NLA_ALIGN(len)		(((len) + NLA_ALIGNTO - 1) & ~(NLA_ALIGNTO - 1))
 #define NLA_HDRLEN		((int) NLA_ALIGN(sizeof(struct nlattr)))
 
+/* Generic 32 bitflags attribute content sent to the kernel.
+ *
+ * The value is a bitmap that defines the values being set
+ * The selector is a bitmask that defines which value is legit
+ *
+ * Examples:
+ *  value = 0x0, and selector = 0x1
+ *  implies we are selecting bit 1 and we want to set its value to 0.
+ *
+ *  value = 0x2, and selector = 0x2
+ *  implies we are selecting bit 2 and we want to set its value to 1.
+ *
+ */
+struct nla_bitfield32 {
+	__u32 value;
+	__u32 selector;
+};
 
 #endif /* _UAPI__LINUX_NETLINK_H */
diff --git a/lib/nlattr.c b/lib/nlattr.c
index fb52435..ee79b7a 100644
--- a/lib/nlattr.c
+++ b/lib/nlattr.c
@@ -27,6 +27,30 @@
 	[NLA_S64]	= sizeof(s64),
 };
 
+static int validate_nla_bitfield32(const struct nlattr *nla,
+				   u32 *valid_flags_allowed)
+{
+	const struct nla_bitfield32 *bf = nla_data(nla);
+	u32 *valid_flags_mask = valid_flags_allowed;
+
+	if (!valid_flags_allowed)
+		return -EINVAL;
+
+	/*disallow invalid bit selector */
+	if (bf->selector & ~*valid_flags_mask)
+		return -EINVAL;
+
+	/*disallow invalid bit values */
+	if (bf->value & ~*valid_flags_mask)
+		return -EINVAL;
+
+	/*disallow valid bit values that are not selected*/
+	if (bf->value & ~bf->selector)
+		return -EINVAL;
+
+	return 0;
+}
+
 static int validate_nla(const struct nlattr *nla, int maxtype,
 			const struct nla_policy *policy)
 {
@@ -46,6 +70,12 @@ static int validate_nla(const struct nlattr *nla, int maxtype,
 			return -ERANGE;
 		break;
 
+	case NLA_BITFIELD32:
+		if (attrlen != sizeof(struct nla_bitfield32))
+			return -ERANGE;
+
+		return validate_nla_bitfield32(nla, pt->validation_data);
+
 	case NLA_NUL_STRING:
 		if (pt->len)
 			minlen = min_t(int, attrlen, pt->len + 1);

^ permalink raw reply related	[flat|nested] 34+ messages in thread

end of thread, other threads:[~2017-07-29 11:21 UTC | newest]

Thread overview: 34+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-07-24  1:35 [PATCH net-next v11 0/4] net sched actions: improve dump performance Jamal Hadi Salim
2017-07-24  1:35 ` [PATCH net-next v11 1/4] net netlink: Add new type NLA_BITFIELD_32 Jamal Hadi Salim
2017-07-24 11:14   ` Jiri Pirko
2017-07-25 11:14     ` Jamal Hadi Salim
2017-07-24 11:18   ` Jiri Pirko
2017-07-25 11:15     ` Jamal Hadi Salim
2017-07-25 14:41   ` David Ahern
2017-07-28 13:51     ` Jamal Hadi Salim
2017-07-28 14:08       ` Jiri Pirko
2017-07-28 14:19       ` David Ahern
2017-07-28 14:55         ` Jiri Pirko
2017-07-28 15:04         ` Jamal Hadi Salim
2017-07-28 15:13           ` David Ahern
2017-07-28 21:55             ` Jamal Hadi Salim
2017-07-24  1:35 ` [PATCH net-next v11 2/4] net sched actions: Use proper root attribute table for actions Jamal Hadi Salim
2017-07-24  1:35 ` [PATCH net-next v11 3/4] net sched actions: dump more than TCA_ACT_MAX_PRIO actions per batch Jamal Hadi Salim
2017-07-24 11:27   ` Jiri Pirko
2017-07-25 11:22     ` Jamal Hadi Salim
2017-07-25 11:33       ` Jiri Pirko
2017-07-25 12:34         ` Jamal Hadi Salim
2017-07-25 12:37           ` Jiri Pirko
2017-07-28 13:41             ` Jamal Hadi Salim
2017-07-28 14:12               ` Jiri Pirko
2017-07-28 14:52                 ` Jamal Hadi Salim
2017-07-28 14:57                   ` Jiri Pirko
2017-07-28 15:08                   ` Jamal Hadi Salim
2017-07-28 15:45                     ` Jiri Pirko
2017-07-28 22:10                       ` Jamal Hadi Salim
2017-07-29  7:19                         ` Jiri Pirko
2017-07-29 11:21                           ` Jamal Hadi Salim
2017-07-24  1:35 ` [PATCH net-next v11 4/4] net sched actions: add time filter for action dumping Jamal Hadi Salim
2017-07-24 11:34   ` Jiri Pirko
2017-07-25 11:27     ` Jamal Hadi Salim
2017-07-25 11:34       ` Jiri Pirko

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.