* [PATCH,nf-next RFC 0/2] add NFTA_SET_ELEM_KEY_END @ 2019-12-02 13:14 Pablo Neira Ayuso 2019-12-02 13:14 ` [PATCH,nf-next RFC 1/2] netfilter: nf_tables: add nft_setelem_parse_key() Pablo Neira Ayuso ` (2 more replies) 0 siblings, 3 replies; 13+ messages in thread From: Pablo Neira Ayuso @ 2019-12-02 13:14 UTC (permalink / raw) To: sbrivio; +Cc: netfilter-devel Hi Stefano, This patchset extends the netlink API to allow to express an interval with one single element. This simplifies this interface since userspace does not need to send two independent elements anymore, one of the including the NFT_SET_ELEM_INTERVAL_END flag. The idea is to use the _DESC to specify that userspace speaks the kernel that new API representation. In your case, the new description attribute that tells that this set contains interval + concatenation implicitly tells the kernel that userspace supports for this new API. If you're fine with this, I can scratch a bit of time to finish the libnftnl part. The nft code will need a small update too. You will not need to use the nft_set_pipapo object as scratchpad area anymore. Compile-tested only. Let me know, thanks. Pablo Neira Ayuso (2): netfilter: nf_tables: add nft_setelem_parse_key() netfilter: nf_tables: add NFTA_SET_ELEM_KEY_END attribute include/net/netfilter/nf_tables.h | 14 +++- include/uapi/linux/netfilter/nf_tables.h | 2 + net/netfilter/nf_tables_api.c | 134 +++++++++++++++++++++---------- net/netfilter/nft_dynset.c | 2 +- 4 files changed, 106 insertions(+), 46 deletions(-) -- 2.11.0 ^ permalink raw reply [flat|nested] 13+ messages in thread
* [PATCH,nf-next RFC 1/2] netfilter: nf_tables: add nft_setelem_parse_key() 2019-12-02 13:14 [PATCH,nf-next RFC 0/2] add NFTA_SET_ELEM_KEY_END Pablo Neira Ayuso @ 2019-12-02 13:14 ` Pablo Neira Ayuso 2019-12-05 22:43 ` Stefano Brivio 2019-12-02 13:14 ` [PATCH,nf-next RFC 2/2] netfilter: nf_tables: add NFTA_SET_ELEM_KEY_END attribute Pablo Neira Ayuso 2019-12-02 16:19 ` [PATCH,nf-next RFC 0/2] add NFTA_SET_ELEM_KEY_END Stefano Brivio 2 siblings, 1 reply; 13+ messages in thread From: Pablo Neira Ayuso @ 2019-12-02 13:14 UTC (permalink / raw) To: sbrivio; +Cc: netfilter-devel Add helper function to parse the set element key netlink attribute. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> --- net/netfilter/nf_tables_api.c | 56 ++++++++++++++++++++++++------------------- 1 file changed, 32 insertions(+), 24 deletions(-) diff --git a/net/netfilter/nf_tables_api.c b/net/netfilter/nf_tables_api.c index 0db2784fee9a..13e291fac26f 100644 --- a/net/netfilter/nf_tables_api.c +++ b/net/netfilter/nf_tables_api.c @@ -4490,11 +4490,31 @@ static int nft_setelem_parse_flags(const struct nft_set *set, return 0; } +static int nft_setelem_parse_key(struct nft_ctx *ctx, struct nft_set *set, + struct nft_data *key, struct nlattr *attr) +{ + struct nft_data_desc desc; + int err; + + err = nft_data_init(ctx, key, sizeof(*key), &desc, attr); + if (err < 0) + goto err1; + + err = -EINVAL; + if (desc.type != NFT_DATA_VALUE || desc.len != set->klen) + goto err2; + + return 0; +err2: + nft_data_release(key, desc.type); +err1: + return err; +} + static int nft_get_set_elem(struct nft_ctx *ctx, struct nft_set *set, const struct nlattr *attr) { struct nlattr *nla[NFTA_SET_ELEM_MAX + 1]; - struct nft_data_desc desc; struct nft_set_elem elem; struct sk_buff *skb; uint32_t flags = 0; @@ -4513,15 +4533,11 @@ static int nft_get_set_elem(struct nft_ctx *ctx, struct nft_set *set, if (err < 0) return err; - err = nft_data_init(ctx, &elem.key.val, sizeof(elem.key), &desc, - nla[NFTA_SET_ELEM_KEY]); + err = nft_setelem_parse_key(ctx, set, &elem.key.val, + nla[NFTA_SET_ELEM_KEY]); if (err < 0) return err; - err = -EINVAL; - if (desc.type != NFT_DATA_VALUE || desc.len != set->klen) - return err; - priv = set->ops->get(ctx->net, set, &elem, flags); if (IS_ERR(priv)) return PTR_ERR(priv); @@ -4720,13 +4736,13 @@ static int nft_add_set_elem(struct nft_ctx *ctx, struct nft_set *set, { struct nlattr *nla[NFTA_SET_ELEM_MAX + 1]; u8 genmask = nft_genmask_next(ctx->net); - struct nft_data_desc d1, d2; struct nft_set_ext_tmpl tmpl; struct nft_set_ext *ext, *ext2; struct nft_set_elem elem; struct nft_set_binding *binding; struct nft_object *obj = NULL; struct nft_userdata *udata; + struct nft_data_desc d2; struct nft_data data; enum nft_registers dreg; struct nft_trans *trans; @@ -4792,15 +4808,12 @@ static int nft_add_set_elem(struct nft_ctx *ctx, struct nft_set *set, return err; } - err = nft_data_init(ctx, &elem.key.val, sizeof(elem.key), &d1, - nla[NFTA_SET_ELEM_KEY]); + err = nft_setelem_parse_key(ctx, set, &elem.key.val, + nla[NFTA_SET_ELEM_KEY]); if (err < 0) goto err1; - err = -EINVAL; - if (d1.type != NFT_DATA_VALUE || d1.len != set->klen) - goto err2; - nft_set_ext_add_length(&tmpl, NFT_SET_EXT_KEY, d1.len); + nft_set_ext_add_length(&tmpl, NFT_SET_EXT_KEY, set->klen); if (timeout > 0) { nft_set_ext_add(&tmpl, NFT_SET_EXT_EXPIRATION); if (timeout != set->timeout) @@ -4942,7 +4955,7 @@ static int nft_add_set_elem(struct nft_ctx *ctx, struct nft_set *set, if (nla[NFTA_SET_ELEM_DATA] != NULL) nft_data_release(&data, d2.type); err2: - nft_data_release(&elem.key.val, d1.type); + nft_data_release(&elem.key.val, NFT_DATA_VALUE); err1: return err; } @@ -5038,7 +5051,6 @@ static int nft_del_setelem(struct nft_ctx *ctx, struct nft_set *set, { struct nlattr *nla[NFTA_SET_ELEM_MAX + 1]; struct nft_set_ext_tmpl tmpl; - struct nft_data_desc desc; struct nft_set_elem elem; struct nft_set_ext *ext; struct nft_trans *trans; @@ -5063,16 +5075,12 @@ static int nft_del_setelem(struct nft_ctx *ctx, struct nft_set *set, if (flags != 0) nft_set_ext_add(&tmpl, NFT_SET_EXT_FLAGS); - err = nft_data_init(ctx, &elem.key.val, sizeof(elem.key), &desc, - nla[NFTA_SET_ELEM_KEY]); + err = nft_setelem_parse_key(ctx, set, &elem.key.val, + nla[NFTA_SET_ELEM_KEY]); if (err < 0) goto err1; - err = -EINVAL; - if (desc.type != NFT_DATA_VALUE || desc.len != set->klen) - goto err2; - - nft_set_ext_add_length(&tmpl, NFT_SET_EXT_KEY, desc.len); + nft_set_ext_add_length(&tmpl, NFT_SET_EXT_KEY, set->klen); err = -ENOMEM; elem.priv = nft_set_elem_init(set, &tmpl, elem.key.val.data, NULL, 0, @@ -5109,7 +5117,7 @@ static int nft_del_setelem(struct nft_ctx *ctx, struct nft_set *set, err3: kfree(elem.priv); err2: - nft_data_release(&elem.key.val, desc.type); + nft_data_release(&elem.key.val, NFT_DATA_VALUE); err1: return err; } -- 2.11.0 ^ permalink raw reply related [flat|nested] 13+ messages in thread
* Re: [PATCH,nf-next RFC 1/2] netfilter: nf_tables: add nft_setelem_parse_key() 2019-12-02 13:14 ` [PATCH,nf-next RFC 1/2] netfilter: nf_tables: add nft_setelem_parse_key() Pablo Neira Ayuso @ 2019-12-05 22:43 ` Stefano Brivio 2019-12-06 19:45 ` Pablo Neira Ayuso 0 siblings, 1 reply; 13+ messages in thread From: Stefano Brivio @ 2019-12-05 22:43 UTC (permalink / raw) To: Pablo Neira Ayuso; +Cc: netfilter-devel Hi Pablo, Just two nits: On Mon, 2 Dec 2019 14:14:06 +0100 Pablo Neira Ayuso <pablo@netfilter.org> wrote: > Add helper function to parse the set element key netlink attribute. > > Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> > --- > net/netfilter/nf_tables_api.c | 56 ++++++++++++++++++++++++------------------- > 1 file changed, 32 insertions(+), 24 deletions(-) > > diff --git a/net/netfilter/nf_tables_api.c b/net/netfilter/nf_tables_api.c > index 0db2784fee9a..13e291fac26f 100644 > --- a/net/netfilter/nf_tables_api.c > +++ b/net/netfilter/nf_tables_api.c > @@ -4490,11 +4490,31 @@ static int nft_setelem_parse_flags(const struct nft_set *set, > return 0; > } > > +static int nft_setelem_parse_key(struct nft_ctx *ctx, struct nft_set *set, > + struct nft_data *key, struct nlattr *attr) > +{ > + struct nft_data_desc desc; > + int err; > + > + err = nft_data_init(ctx, key, sizeof(*key), &desc, attr); > + if (err < 0) > + goto err1; > + > + err = -EINVAL; > + if (desc.type != NFT_DATA_VALUE || desc.len != set->klen) > + goto err2; > + > + return 0; > +err2: > + nft_data_release(key, desc.type); > +err1: > + return err; > +} > + > static int nft_get_set_elem(struct nft_ctx *ctx, struct nft_set *set, > const struct nlattr *attr) > { > struct nlattr *nla[NFTA_SET_ELEM_MAX + 1]; > - struct nft_data_desc desc; > struct nft_set_elem elem; > struct sk_buff *skb; > uint32_t flags = 0; > @@ -4513,15 +4533,11 @@ static int nft_get_set_elem(struct nft_ctx *ctx, struct nft_set *set, > if (err < 0) > return err; > > - err = nft_data_init(ctx, &elem.key.val, sizeof(elem.key), &desc, > - nla[NFTA_SET_ELEM_KEY]); > + err = nft_setelem_parse_key(ctx, set, &elem.key.val, > + nla[NFTA_SET_ELEM_KEY]); > if (err < 0) > return err; > > - err = -EINVAL; > - if (desc.type != NFT_DATA_VALUE || desc.len != set->klen) > - return err; > - > priv = set->ops->get(ctx->net, set, &elem, flags); > if (IS_ERR(priv)) > return PTR_ERR(priv); > @@ -4720,13 +4736,13 @@ static int nft_add_set_elem(struct nft_ctx *ctx, struct nft_set *set, > { > struct nlattr *nla[NFTA_SET_ELEM_MAX + 1]; > u8 genmask = nft_genmask_next(ctx->net); > - struct nft_data_desc d1, d2; > struct nft_set_ext_tmpl tmpl; > struct nft_set_ext *ext, *ext2; > struct nft_set_elem elem; > struct nft_set_binding *binding; > struct nft_object *obj = NULL; > struct nft_userdata *udata; > + struct nft_data_desc d2; At this point, this could simply be desc, or data_desc. > struct nft_data data; > enum nft_registers dreg; > struct nft_trans *trans; > @@ -4792,15 +4808,12 @@ static int nft_add_set_elem(struct nft_ctx *ctx, struct nft_set *set, > return err; > } > > - err = nft_data_init(ctx, &elem.key.val, sizeof(elem.key), &d1, > - nla[NFTA_SET_ELEM_KEY]); > + err = nft_setelem_parse_key(ctx, set, &elem.key.val, > + nla[NFTA_SET_ELEM_KEY]); > if (err < 0) > goto err1; > - err = -EINVAL; > - if (d1.type != NFT_DATA_VALUE || d1.len != set->klen) > - goto err2; > > - nft_set_ext_add_length(&tmpl, NFT_SET_EXT_KEY, d1.len); > + nft_set_ext_add_length(&tmpl, NFT_SET_EXT_KEY, set->klen); > if (timeout > 0) { > nft_set_ext_add(&tmpl, NFT_SET_EXT_EXPIRATION); > if (timeout != set->timeout) > @@ -4942,7 +4955,7 @@ static int nft_add_set_elem(struct nft_ctx *ctx, struct nft_set *set, > if (nla[NFTA_SET_ELEM_DATA] != NULL) > nft_data_release(&data, d2.type); > err2: > - nft_data_release(&elem.key.val, d1.type); > + nft_data_release(&elem.key.val, NFT_DATA_VALUE); > err1: > return err; > } > @@ -5038,7 +5051,6 @@ static int nft_del_setelem(struct nft_ctx *ctx, struct nft_set *set, > { > struct nlattr *nla[NFTA_SET_ELEM_MAX + 1]; > struct nft_set_ext_tmpl tmpl; > - struct nft_data_desc desc; > struct nft_set_elem elem; > struct nft_set_ext *ext; > struct nft_trans *trans; > @@ -5063,16 +5075,12 @@ static int nft_del_setelem(struct nft_ctx *ctx, struct nft_set *set, > if (flags != 0) > nft_set_ext_add(&tmpl, NFT_SET_EXT_FLAGS); > > - err = nft_data_init(ctx, &elem.key.val, sizeof(elem.key), &desc, > - nla[NFTA_SET_ELEM_KEY]); > + err = nft_setelem_parse_key(ctx, set, &elem.key.val, > + nla[NFTA_SET_ELEM_KEY]); > if (err < 0) > goto err1; > > - err = -EINVAL; > - if (desc.type != NFT_DATA_VALUE || desc.len != set->klen) > - goto err2; > - > - nft_set_ext_add_length(&tmpl, NFT_SET_EXT_KEY, desc.len); > + nft_set_ext_add_length(&tmpl, NFT_SET_EXT_KEY, set->klen); > > err = -ENOMEM; > elem.priv = nft_set_elem_init(set, &tmpl, elem.key.val.data, NULL, 0, > @@ -5109,7 +5117,7 @@ static int nft_del_setelem(struct nft_ctx *ctx, struct nft_set *set, > err3: > kfree(elem.priv); > err2: > - nft_data_release(&elem.key.val, desc.type); > + nft_data_release(&elem.key.val, NFT_DATA_VALUE); I'm not sure if this can actually happen, but in nft_setelem_parse_key() you are checking that the type is NFT_DATA_VALUE, and returning error if it's not. If the type is not NFT_DATA_VALUE, I guess we shouldn't pass NFT_DATA_VALUE to nft_data_release() here. Maybe nft_setelem_parse_key() could clean up after itself on error. After all, nft_data_init() is now called from there. > err1: > return err; > } Otherwise, it looks good to me. Note that, while I'm working on integrating this with the rest of kernel changes right now, I haven't tested it in any way (that needs your changes for userspace). -- Stefano ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH,nf-next RFC 1/2] netfilter: nf_tables: add nft_setelem_parse_key() 2019-12-05 22:43 ` Stefano Brivio @ 2019-12-06 19:45 ` Pablo Neira Ayuso 2019-12-07 22:51 ` Stefano Brivio 0 siblings, 1 reply; 13+ messages in thread From: Pablo Neira Ayuso @ 2019-12-06 19:45 UTC (permalink / raw) To: Stefano Brivio; +Cc: netfilter-devel On Thu, Dec 05, 2019 at 11:43:50PM +0100, Stefano Brivio wrote: > Hi Pablo, > > Just two nits: > > On Mon, 2 Dec 2019 14:14:06 +0100 > Pablo Neira Ayuso <pablo@netfilter.org> wrote: > > > Add helper function to parse the set element key netlink attribute. > > > > Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> > > --- > > net/netfilter/nf_tables_api.c | 56 ++++++++++++++++++++++++------------------- > > 1 file changed, 32 insertions(+), 24 deletions(-) > > > > diff --git a/net/netfilter/nf_tables_api.c b/net/netfilter/nf_tables_api.c > > index 0db2784fee9a..13e291fac26f 100644 > > --- a/net/netfilter/nf_tables_api.c > > +++ b/net/netfilter/nf_tables_api.c > > @@ -4490,11 +4490,31 @@ static int nft_setelem_parse_flags(const struct nft_set *set, > > return 0; > > } > > > > +static int nft_setelem_parse_key(struct nft_ctx *ctx, struct nft_set *set, > > + struct nft_data *key, struct nlattr *attr) > > +{ > > + struct nft_data_desc desc; > > + int err; > > + > > + err = nft_data_init(ctx, key, sizeof(*key), &desc, attr); > > + if (err < 0) > > + goto err1; > > + > > + err = -EINVAL; > > + if (desc.type != NFT_DATA_VALUE || desc.len != set->klen) > > + goto err2; > > + > > + return 0; > > +err2: > > + nft_data_release(key, desc.type); > > +err1: > > + return err; > > +} > > + > > static int nft_get_set_elem(struct nft_ctx *ctx, struct nft_set *set, > > const struct nlattr *attr) > > { > > struct nlattr *nla[NFTA_SET_ELEM_MAX + 1]; > > - struct nft_data_desc desc; > > struct nft_set_elem elem; > > struct sk_buff *skb; > > uint32_t flags = 0; > > @@ -4513,15 +4533,11 @@ static int nft_get_set_elem(struct nft_ctx *ctx, struct nft_set *set, > > if (err < 0) > > return err; > > > > - err = nft_data_init(ctx, &elem.key.val, sizeof(elem.key), &desc, > > - nla[NFTA_SET_ELEM_KEY]); > > + err = nft_setelem_parse_key(ctx, set, &elem.key.val, > > + nla[NFTA_SET_ELEM_KEY]); > > if (err < 0) > > return err; > > > > - err = -EINVAL; > > - if (desc.type != NFT_DATA_VALUE || desc.len != set->klen) > > - return err; > > - > > priv = set->ops->get(ctx->net, set, &elem, flags); > > if (IS_ERR(priv)) > > return PTR_ERR(priv); > > @@ -4720,13 +4736,13 @@ static int nft_add_set_elem(struct nft_ctx *ctx, struct nft_set *set, > > { > > struct nlattr *nla[NFTA_SET_ELEM_MAX + 1]; > > u8 genmask = nft_genmask_next(ctx->net); > > - struct nft_data_desc d1, d2; > > struct nft_set_ext_tmpl tmpl; > > struct nft_set_ext *ext, *ext2; > > struct nft_set_elem elem; > > struct nft_set_binding *binding; > > struct nft_object *obj = NULL; > > struct nft_userdata *udata; > > + struct nft_data_desc d2; > > At this point, this could simply be desc, or data_desc. > > > struct nft_data data; > > enum nft_registers dreg; > > struct nft_trans *trans; > > @@ -4792,15 +4808,12 @@ static int nft_add_set_elem(struct nft_ctx *ctx, struct nft_set *set, > > return err; > > } > > > > - err = nft_data_init(ctx, &elem.key.val, sizeof(elem.key), &d1, > > - nla[NFTA_SET_ELEM_KEY]); > > + err = nft_setelem_parse_key(ctx, set, &elem.key.val, > > + nla[NFTA_SET_ELEM_KEY]); > > if (err < 0) > > goto err1; > > - err = -EINVAL; > > - if (d1.type != NFT_DATA_VALUE || d1.len != set->klen) > > - goto err2; > > > > - nft_set_ext_add_length(&tmpl, NFT_SET_EXT_KEY, d1.len); > > + nft_set_ext_add_length(&tmpl, NFT_SET_EXT_KEY, set->klen); > > if (timeout > 0) { > > nft_set_ext_add(&tmpl, NFT_SET_EXT_EXPIRATION); > > if (timeout != set->timeout) > > @@ -4942,7 +4955,7 @@ static int nft_add_set_elem(struct nft_ctx *ctx, struct nft_set *set, > > if (nla[NFTA_SET_ELEM_DATA] != NULL) > > nft_data_release(&data, d2.type); > > err2: > > - nft_data_release(&elem.key.val, d1.type); > > + nft_data_release(&elem.key.val, NFT_DATA_VALUE); > > err1: > > return err; > > } > > @@ -5038,7 +5051,6 @@ static int nft_del_setelem(struct nft_ctx *ctx, struct nft_set *set, > > { > > struct nlattr *nla[NFTA_SET_ELEM_MAX + 1]; > > struct nft_set_ext_tmpl tmpl; > > - struct nft_data_desc desc; > > struct nft_set_elem elem; > > struct nft_set_ext *ext; > > struct nft_trans *trans; > > @@ -5063,16 +5075,12 @@ static int nft_del_setelem(struct nft_ctx *ctx, struct nft_set *set, > > if (flags != 0) > > nft_set_ext_add(&tmpl, NFT_SET_EXT_FLAGS); > > > > - err = nft_data_init(ctx, &elem.key.val, sizeof(elem.key), &desc, > > - nla[NFTA_SET_ELEM_KEY]); > > + err = nft_setelem_parse_key(ctx, set, &elem.key.val, > > + nla[NFTA_SET_ELEM_KEY]); > > if (err < 0) > > goto err1; > > > > - err = -EINVAL; > > - if (desc.type != NFT_DATA_VALUE || desc.len != set->klen) > > - goto err2; > > - > > - nft_set_ext_add_length(&tmpl, NFT_SET_EXT_KEY, desc.len); > > + nft_set_ext_add_length(&tmpl, NFT_SET_EXT_KEY, set->klen); > > > > err = -ENOMEM; > > elem.priv = nft_set_elem_init(set, &tmpl, elem.key.val.data, NULL, 0, > > @@ -5109,7 +5117,7 @@ static int nft_del_setelem(struct nft_ctx *ctx, struct nft_set *set, > > err3: > > kfree(elem.priv); > > err2: > > - nft_data_release(&elem.key.val, desc.type); > > + nft_data_release(&elem.key.val, NFT_DATA_VALUE); > > I'm not sure if this can actually happen, but in > nft_setelem_parse_key() you are checking that the type is > NFT_DATA_VALUE, and returning error if it's not. Exactly. > If the type is not NFT_DATA_VALUE, I guess we shouldn't pass > NFT_DATA_VALUE to nft_data_release() here. The new nft_setelem_parse_key() function makes sure that the key is NFT_DATA_VALUE, otherwise bails out and calls nft_data_release() with desc.type. Then, moving forward in nft_add_set_elem() after the nft_setelem_parse_key(), if an error occurs, nft_data_release() can be called with NFT_DATA_VALUE, because that was already validated by nft_setelem_parse_key(). > Maybe nft_setelem_parse_key() could clean up after itself on error. It's doing so already, right? See err2: label. > After all, nft_data_init() is now called from there. > > > err1: > > return err; > > } > > Otherwise, it looks good to me. Note that, while I'm working on > integrating this with the rest of kernel changes right now, I haven't > tested it in any way (that needs your changes for userspace). Great. Let me know if you have more questions, thanks! ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH,nf-next RFC 1/2] netfilter: nf_tables: add nft_setelem_parse_key() 2019-12-06 19:45 ` Pablo Neira Ayuso @ 2019-12-07 22:51 ` Stefano Brivio 2019-12-09 20:44 ` Pablo Neira Ayuso 0 siblings, 1 reply; 13+ messages in thread From: Stefano Brivio @ 2019-12-07 22:51 UTC (permalink / raw) To: Pablo Neira Ayuso; +Cc: netfilter-devel On Fri, 6 Dec 2019 20:45:17 +0100 Pablo Neira Ayuso <pablo@netfilter.org> wrote: > On Thu, Dec 05, 2019 at 11:43:50PM +0100, Stefano Brivio wrote: > > Hi Pablo, > > > > Just two nits: > > > > On Mon, 2 Dec 2019 14:14:06 +0100 > > Pablo Neira Ayuso <pablo@netfilter.org> wrote: > > > > > Add helper function to parse the set element key netlink attribute. > > > > > > Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> > > > --- > > > net/netfilter/nf_tables_api.c | 56 ++++++++++++++++++++++++------------------- > > > 1 file changed, 32 insertions(+), 24 deletions(-) > > > > > > diff --git a/net/netfilter/nf_tables_api.c b/net/netfilter/nf_tables_api.c > > > index 0db2784fee9a..13e291fac26f 100644 > > > --- a/net/netfilter/nf_tables_api.c > > > +++ b/net/netfilter/nf_tables_api.c > > > @@ -4490,11 +4490,31 @@ static int nft_setelem_parse_flags(const struct nft_set *set, > > > return 0; > > > } > > > > > > +static int nft_setelem_parse_key(struct nft_ctx *ctx, struct nft_set *set, > > > + struct nft_data *key, struct nlattr *attr) > > > +{ > > > + struct nft_data_desc desc; > > > + int err; > > > + > > > + err = nft_data_init(ctx, key, sizeof(*key), &desc, attr); > > > + if (err < 0) > > > + goto err1; > > > + > > > + err = -EINVAL; > > > + if (desc.type != NFT_DATA_VALUE || desc.len != set->klen) > > > + goto err2; > > > + > > > + return 0; > > > +err2: > > > + nft_data_release(key, desc.type); > > > +err1: > > > + return err; > > > +} > > > + > > > static int nft_get_set_elem(struct nft_ctx *ctx, struct nft_set *set, > > > const struct nlattr *attr) > > > { > > > struct nlattr *nla[NFTA_SET_ELEM_MAX + 1]; > > > - struct nft_data_desc desc; > > > struct nft_set_elem elem; > > > struct sk_buff *skb; > > > uint32_t flags = 0; > > > @@ -4513,15 +4533,11 @@ static int nft_get_set_elem(struct nft_ctx *ctx, struct nft_set *set, > > > if (err < 0) > > > return err; > > > > > > - err = nft_data_init(ctx, &elem.key.val, sizeof(elem.key), &desc, > > > - nla[NFTA_SET_ELEM_KEY]); > > > + err = nft_setelem_parse_key(ctx, set, &elem.key.val, > > > + nla[NFTA_SET_ELEM_KEY]); > > > if (err < 0) > > > return err; > > > > > > - err = -EINVAL; > > > - if (desc.type != NFT_DATA_VALUE || desc.len != set->klen) > > > - return err; > > > - > > > priv = set->ops->get(ctx->net, set, &elem, flags); > > > if (IS_ERR(priv)) > > > return PTR_ERR(priv); > > > @@ -4720,13 +4736,13 @@ static int nft_add_set_elem(struct nft_ctx *ctx, struct nft_set *set, > > > { > > > struct nlattr *nla[NFTA_SET_ELEM_MAX + 1]; > > > u8 genmask = nft_genmask_next(ctx->net); > > > - struct nft_data_desc d1, d2; > > > struct nft_set_ext_tmpl tmpl; > > > struct nft_set_ext *ext, *ext2; > > > struct nft_set_elem elem; > > > struct nft_set_binding *binding; > > > struct nft_object *obj = NULL; > > > struct nft_userdata *udata; > > > + struct nft_data_desc d2; > > > > At this point, this could simply be desc, or data_desc. > > > > > struct nft_data data; > > > enum nft_registers dreg; > > > struct nft_trans *trans; > > > @@ -4792,15 +4808,12 @@ static int nft_add_set_elem(struct nft_ctx *ctx, struct nft_set *set, > > > return err; > > > } > > > > > > - err = nft_data_init(ctx, &elem.key.val, sizeof(elem.key), &d1, > > > - nla[NFTA_SET_ELEM_KEY]); > > > + err = nft_setelem_parse_key(ctx, set, &elem.key.val, > > > + nla[NFTA_SET_ELEM_KEY]); > > > if (err < 0) > > > goto err1; > > > - err = -EINVAL; > > > - if (d1.type != NFT_DATA_VALUE || d1.len != set->klen) > > > - goto err2; > > > > > > - nft_set_ext_add_length(&tmpl, NFT_SET_EXT_KEY, d1.len); > > > + nft_set_ext_add_length(&tmpl, NFT_SET_EXT_KEY, set->klen); > > > if (timeout > 0) { > > > nft_set_ext_add(&tmpl, NFT_SET_EXT_EXPIRATION); > > > if (timeout != set->timeout) > > > @@ -4942,7 +4955,7 @@ static int nft_add_set_elem(struct nft_ctx *ctx, struct nft_set *set, > > > if (nla[NFTA_SET_ELEM_DATA] != NULL) > > > nft_data_release(&data, d2.type); > > > err2: > > > - nft_data_release(&elem.key.val, d1.type); > > > + nft_data_release(&elem.key.val, NFT_DATA_VALUE); > > > err1: > > > return err; > > > } > > > @@ -5038,7 +5051,6 @@ static int nft_del_setelem(struct nft_ctx *ctx, struct nft_set *set, > > > { > > > struct nlattr *nla[NFTA_SET_ELEM_MAX + 1]; > > > struct nft_set_ext_tmpl tmpl; > > > - struct nft_data_desc desc; > > > struct nft_set_elem elem; > > > struct nft_set_ext *ext; > > > struct nft_trans *trans; > > > @@ -5063,16 +5075,12 @@ static int nft_del_setelem(struct nft_ctx *ctx, struct nft_set *set, > > > if (flags != 0) > > > nft_set_ext_add(&tmpl, NFT_SET_EXT_FLAGS); > > > > > > - err = nft_data_init(ctx, &elem.key.val, sizeof(elem.key), &desc, > > > - nla[NFTA_SET_ELEM_KEY]); > > > + err = nft_setelem_parse_key(ctx, set, &elem.key.val, > > > + nla[NFTA_SET_ELEM_KEY]); > > > if (err < 0) > > > goto err1; > > > > > > - err = -EINVAL; > > > - if (desc.type != NFT_DATA_VALUE || desc.len != set->klen) > > > - goto err2; > > > - > > > - nft_set_ext_add_length(&tmpl, NFT_SET_EXT_KEY, desc.len); > > > + nft_set_ext_add_length(&tmpl, NFT_SET_EXT_KEY, set->klen); > > > > > > err = -ENOMEM; > > > elem.priv = nft_set_elem_init(set, &tmpl, elem.key.val.data, NULL, 0, > > > @@ -5109,7 +5117,7 @@ static int nft_del_setelem(struct nft_ctx *ctx, struct nft_set *set, > > > err3: > > > kfree(elem.priv); > > > err2: > > > - nft_data_release(&elem.key.val, desc.type); > > > + nft_data_release(&elem.key.val, NFT_DATA_VALUE); > > > > I'm not sure if this can actually happen, but in > > nft_setelem_parse_key() you are checking that the type is > > NFT_DATA_VALUE, and returning error if it's not. > > Exactly. > > > If the type is not NFT_DATA_VALUE, I guess we shouldn't pass > > NFT_DATA_VALUE to nft_data_release() here. > > The new nft_setelem_parse_key() function makes sure that the key is > NFT_DATA_VALUE, otherwise bails out and calls nft_data_release() with > desc.type. > > Then, moving forward in nft_add_set_elem() after the > nft_setelem_parse_key(), if an error occurs, nft_data_release() can be > called with NFT_DATA_VALUE, because that was already validated by > nft_setelem_parse_key(). > > > Maybe nft_setelem_parse_key() could clean up after itself on error. > > It's doing so already, right? See err2: label. Right you are, my bad, I mixed up err2: and err1: in nft_set_delelem() and then forgot about err2: in nft_setelem_parse_key(). Well, on the other hand, 'return err;" and 'goto fail_elem;" would have been easier to follow, but maybe it's just my taste. :) -- Stefano ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH,nf-next RFC 1/2] netfilter: nf_tables: add nft_setelem_parse_key() 2019-12-07 22:51 ` Stefano Brivio @ 2019-12-09 20:44 ` Pablo Neira Ayuso 0 siblings, 0 replies; 13+ messages in thread From: Pablo Neira Ayuso @ 2019-12-09 20:44 UTC (permalink / raw) To: Stefano Brivio; +Cc: netfilter-devel On Sat, Dec 07, 2019 at 11:51:38PM +0100, Stefano Brivio wrote: > On Fri, 6 Dec 2019 20:45:17 +0100 > Pablo Neira Ayuso <pablo@netfilter.org> wrote: [...] > > On Thu, Dec 05, 2019 at 11:43:50PM +0100, Stefano Brivio wrote: [...] > > > If the type is not NFT_DATA_VALUE, I guess we shouldn't pass > > > NFT_DATA_VALUE to nft_data_release() here. > > > > The new nft_setelem_parse_key() function makes sure that the key is > > NFT_DATA_VALUE, otherwise bails out and calls nft_data_release() with > > desc.type. > > > > Then, moving forward in nft_add_set_elem() after the > > nft_setelem_parse_key(), if an error occurs, nft_data_release() can be > > called with NFT_DATA_VALUE, because that was already validated by > > nft_setelem_parse_key(). > > > > > Maybe nft_setelem_parse_key() could clean up after itself on error. > > > > It's doing so already, right? See err2: label. > > Right you are, my bad, I mixed up err2: and err1: in nft_set_delelem() > and then forgot about err2: in nft_setelem_parse_key(). > > Well, on the other hand, 'return err;" and 'goto fail_elem;" would have > been easier to follow, but maybe it's just my taste. :) Feel free to update this patch to use the goto tags you are suggesting. Thanks. ^ permalink raw reply [flat|nested] 13+ messages in thread
* [PATCH,nf-next RFC 2/2] netfilter: nf_tables: add NFTA_SET_ELEM_KEY_END attribute 2019-12-02 13:14 [PATCH,nf-next RFC 0/2] add NFTA_SET_ELEM_KEY_END Pablo Neira Ayuso 2019-12-02 13:14 ` [PATCH,nf-next RFC 1/2] netfilter: nf_tables: add nft_setelem_parse_key() Pablo Neira Ayuso @ 2019-12-02 13:14 ` Pablo Neira Ayuso 2019-12-05 22:44 ` Stefano Brivio 2019-12-02 16:19 ` [PATCH,nf-next RFC 0/2] add NFTA_SET_ELEM_KEY_END Stefano Brivio 2 siblings, 1 reply; 13+ messages in thread From: Pablo Neira Ayuso @ 2019-12-02 13:14 UTC (permalink / raw) To: sbrivio; +Cc: netfilter-devel Add NFTA_SET_ELEM_KEY_END attribute to convey the closing element of the interval between kernel and userspace. This patch also adds the NFT_SET_EXT_KEY_END extension to store the closing element value in this interval. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> --- include/net/netfilter/nf_tables.h | 14 +++++- include/uapi/linux/netfilter/nf_tables.h | 2 + net/netfilter/nf_tables_api.c | 82 +++++++++++++++++++++++--------- net/netfilter/nft_dynset.c | 2 +- 4 files changed, 76 insertions(+), 24 deletions(-) diff --git a/include/net/netfilter/nf_tables.h b/include/net/netfilter/nf_tables.h index fe7c50acc681..2252a3892124 100644 --- a/include/net/netfilter/nf_tables.h +++ b/include/net/netfilter/nf_tables.h @@ -231,6 +231,7 @@ struct nft_userdata { * struct nft_set_elem - generic representation of set elements * * @key: element key + * @key_end: closing element key * @priv: element private data and extensions */ struct nft_set_elem { @@ -238,6 +239,10 @@ struct nft_set_elem { u32 buf[NFT_DATA_VALUE_MAXLEN / sizeof(u32)]; struct nft_data val; } key; + union { + u32 buf[NFT_DATA_VALUE_MAXLEN / sizeof(u32)]; + struct nft_data val; + } key_end; void *priv; }; @@ -502,6 +507,7 @@ void nf_tables_destroy_set(const struct nft_ctx *ctx, struct nft_set *set); * enum nft_set_extensions - set extension type IDs * * @NFT_SET_EXT_KEY: element key + * @NFT_SET_EXT_KEY_END: closing element key * @NFT_SET_EXT_DATA: mapping data * @NFT_SET_EXT_FLAGS: element flags * @NFT_SET_EXT_TIMEOUT: element timeout @@ -513,6 +519,7 @@ void nf_tables_destroy_set(const struct nft_ctx *ctx, struct nft_set *set); */ enum nft_set_extensions { NFT_SET_EXT_KEY, + NFT_SET_EXT_KEY_END, NFT_SET_EXT_DATA, NFT_SET_EXT_FLAGS, NFT_SET_EXT_TIMEOUT, @@ -606,6 +613,11 @@ static inline struct nft_data *nft_set_ext_key(const struct nft_set_ext *ext) return nft_set_ext(ext, NFT_SET_EXT_KEY); } +static inline struct nft_data *nft_set_ext_key_end(const struct nft_set_ext *ext) +{ + return nft_set_ext(ext, NFT_SET_EXT_KEY_END); +} + static inline struct nft_data *nft_set_ext_data(const struct nft_set_ext *ext) { return nft_set_ext(ext, NFT_SET_EXT_DATA); @@ -655,7 +667,7 @@ static inline struct nft_object **nft_set_ext_obj(const struct nft_set_ext *ext) void *nft_set_elem_init(const struct nft_set *set, const struct nft_set_ext_tmpl *tmpl, - const u32 *key, const u32 *data, + const u32 *key, const u32 *key_end, const u32 *data, u64 timeout, u64 expiration, gfp_t gfp); void nft_set_elem_destroy(const struct nft_set *set, void *elem, bool destroy_expr); diff --git a/include/uapi/linux/netfilter/nf_tables.h b/include/uapi/linux/netfilter/nf_tables.h index bb9b049310df..1d62552a12a7 100644 --- a/include/uapi/linux/netfilter/nf_tables.h +++ b/include/uapi/linux/netfilter/nf_tables.h @@ -370,6 +370,7 @@ enum nft_set_elem_flags { * @NFTA_SET_ELEM_USERDATA: user data (NLA_BINARY) * @NFTA_SET_ELEM_EXPR: expression (NLA_NESTED: nft_expr_attributes) * @NFTA_SET_ELEM_OBJREF: stateful object reference (NLA_STRING) + * @NFTA_SET_ELEM_KEY_END: closing key value (NLA_STRING) */ enum nft_set_elem_attributes { NFTA_SET_ELEM_UNSPEC, @@ -382,6 +383,7 @@ enum nft_set_elem_attributes { NFTA_SET_ELEM_EXPR, NFTA_SET_ELEM_PAD, NFTA_SET_ELEM_OBJREF, + NFTA_SET_ELEM_KEY_END, __NFTA_SET_ELEM_MAX }; #define NFTA_SET_ELEM_MAX (__NFTA_SET_ELEM_MAX - 1) diff --git a/net/netfilter/nf_tables_api.c b/net/netfilter/nf_tables_api.c index 13e291fac26f..927f6de5f65c 100644 --- a/net/netfilter/nf_tables_api.c +++ b/net/netfilter/nf_tables_api.c @@ -4199,6 +4199,7 @@ static const struct nla_policy nft_set_elem_policy[NFTA_SET_ELEM_MAX + 1] = { .len = NFT_USERDATA_MAXLEN }, [NFTA_SET_ELEM_EXPR] = { .type = NLA_NESTED }, [NFTA_SET_ELEM_OBJREF] = { .type = NLA_STRING }, + [NFTA_SET_ELEM_KEY_END] = { .type = NLA_NESTED }, }; static const struct nla_policy nft_set_elem_list_policy[NFTA_SET_ELEM_LIST_MAX + 1] = { @@ -4248,6 +4249,11 @@ static int nf_tables_fill_setelem(struct sk_buff *skb, NFT_DATA_VALUE, set->klen) < 0) goto nla_put_failure; + if (nft_set_ext_exists(ext, NFT_SET_EXT_KEY_END) && + nft_data_dump(skb, NFTA_SET_ELEM_KEY_END, nft_set_ext_key_end(ext), + NFT_DATA_VALUE, set->klen) < 0) + goto nla_put_failure; + if (nft_set_ext_exists(ext, NFT_SET_EXT_DATA) && nft_data_dump(skb, NFTA_SET_ELEM_DATA, nft_set_ext_data(ext), set->dtype == NFT_DATA_VERDICT ? NFT_DATA_VERDICT : NFT_DATA_VALUE, @@ -4538,6 +4544,13 @@ static int nft_get_set_elem(struct nft_ctx *ctx, struct nft_set *set, if (err < 0) return err; + if (nla[NFTA_SET_ELEM_KEY_END]) { + err = nft_setelem_parse_key(ctx, set, &elem.key_end.val, + nla[NFTA_SET_ELEM_KEY_END]); + if (err < 0) + return err; + } + priv = set->ops->get(ctx->net, set, &elem, flags); if (IS_ERR(priv)) return PTR_ERR(priv); @@ -4663,8 +4676,8 @@ static struct nft_trans *nft_trans_elem_alloc(struct nft_ctx *ctx, void *nft_set_elem_init(const struct nft_set *set, const struct nft_set_ext_tmpl *tmpl, - const u32 *key, const u32 *data, - u64 timeout, u64 expiration, gfp_t gfp) + const u32 *key, const u32 *key_end, + const u32 *data, u64 timeout, u64 expiration, gfp_t gfp) { struct nft_set_ext *ext; void *elem; @@ -4677,6 +4690,8 @@ void *nft_set_elem_init(const struct nft_set *set, nft_set_ext_init(ext, tmpl); memcpy(nft_set_ext_key(ext), key, set->klen); + if (nft_set_ext_exists(ext, NFT_SET_EXT_KEY_END)) + memcpy(nft_set_ext_key_end(ext), key_end, set->klen); if (nft_set_ext_exists(ext, NFT_SET_EXT_DATA)) memcpy(nft_set_ext_data(ext), data, set->dlen); if (nft_set_ext_exists(ext, NFT_SET_EXT_EXPIRATION)) { @@ -4811,9 +4826,19 @@ static int nft_add_set_elem(struct nft_ctx *ctx, struct nft_set *set, err = nft_setelem_parse_key(ctx, set, &elem.key.val, nla[NFTA_SET_ELEM_KEY]); if (err < 0) - goto err1; + return err; nft_set_ext_add_length(&tmpl, NFT_SET_EXT_KEY, set->klen); + + if (nla[NFTA_SET_ELEM_KEY_END]) { + err = nft_setelem_parse_key(ctx, set, &elem.key_end.val, + nla[NFTA_SET_ELEM_KEY_END]); + if (err < 0) + goto err_parse_key; + + nft_set_ext_add_length(&tmpl, NFT_SET_EXT_KEY_END, set->klen); + } + if (timeout > 0) { nft_set_ext_add(&tmpl, NFT_SET_EXT_EXPIRATION); if (timeout != set->timeout) @@ -4823,14 +4848,14 @@ static int nft_add_set_elem(struct nft_ctx *ctx, struct nft_set *set, if (nla[NFTA_SET_ELEM_OBJREF] != NULL) { if (!(set->flags & NFT_SET_OBJECT)) { err = -EINVAL; - goto err2; + goto err_parse_key_end; } obj = nft_obj_lookup(ctx->net, ctx->table, nla[NFTA_SET_ELEM_OBJREF], set->objtype, genmask); if (IS_ERR(obj)) { err = PTR_ERR(obj); - goto err2; + goto err_parse_key_end; } nft_set_ext_add(&tmpl, NFT_SET_EXT_OBJREF); } @@ -4839,11 +4864,11 @@ static int nft_add_set_elem(struct nft_ctx *ctx, struct nft_set *set, err = nft_data_init(ctx, &data, sizeof(data), &d2, nla[NFTA_SET_ELEM_DATA]); if (err < 0) - goto err2; + goto err_parse_key_end; err = -EINVAL; if (set->dtype != NFT_DATA_VERDICT && d2.len != set->dlen) - goto err3; + goto err_parse_data; dreg = nft_type_to_reg(set->dtype); list_for_each_entry(binding, &set->bindings, list) { @@ -4861,7 +4886,7 @@ static int nft_add_set_elem(struct nft_ctx *ctx, struct nft_set *set, &data, d2.type, d2.len); if (err < 0) - goto err3; + goto err_parse_data; if (d2.type == NFT_DATA_VERDICT && (data.verdict.code == NFT_GOTO || @@ -4886,10 +4911,11 @@ static int nft_add_set_elem(struct nft_ctx *ctx, struct nft_set *set, } err = -ENOMEM; - elem.priv = nft_set_elem_init(set, &tmpl, elem.key.val.data, data.data, + elem.priv = nft_set_elem_init(set, &tmpl, elem.key.val.data, + elem.key_end.val.data, data.data, timeout, expiration, GFP_KERNEL); if (elem.priv == NULL) - goto err3; + goto err_parse_data; ext = nft_set_elem_ext(set, elem.priv); if (flags) @@ -4906,7 +4932,7 @@ static int nft_add_set_elem(struct nft_ctx *ctx, struct nft_set *set, trans = nft_trans_elem_alloc(ctx, NFT_MSG_NEWSETELEM, set); if (trans == NULL) - goto err4; + goto err_trans; ext->genmask = nft_genmask_cur(ctx->net) | NFT_SET_ELEM_BUSY_MASK; err = set->ops->insert(ctx->net, set, &elem, &ext2); @@ -4917,7 +4943,7 @@ static int nft_add_set_elem(struct nft_ctx *ctx, struct nft_set *set, nft_set_ext_exists(ext, NFT_SET_EXT_OBJREF) ^ nft_set_ext_exists(ext2, NFT_SET_EXT_OBJREF)) { err = -EBUSY; - goto err5; + goto err_element_clash; } if ((nft_set_ext_exists(ext, NFT_SET_EXT_DATA) && nft_set_ext_exists(ext2, NFT_SET_EXT_DATA) && @@ -4930,33 +4956,35 @@ static int nft_add_set_elem(struct nft_ctx *ctx, struct nft_set *set, else if (!(nlmsg_flags & NLM_F_EXCL)) err = 0; } - goto err5; + goto err_element_clash; } if (set->size && !atomic_add_unless(&set->nelems, 1, set->size + set->ndeact)) { err = -ENFILE; - goto err6; + goto err_set_full; } nft_trans_elem(trans) = elem; list_add_tail(&trans->list, &ctx->net->nft.commit_list); return 0; -err6: +err_set_full: set->ops->remove(ctx->net, set, &elem); -err5: +err_element_clash: kfree(trans); -err4: +err_trans: if (obj) obj->use--; kfree(elem.priv); -err3: +err_parse_data: if (nla[NFTA_SET_ELEM_DATA] != NULL) nft_data_release(&data, d2.type); -err2: +err_parse_key_end: + nft_data_release(&elem.key_end.val, NFT_DATA_VALUE); +err_parse_key: nft_data_release(&elem.key.val, NFT_DATA_VALUE); -err1: + return err; } @@ -5082,9 +5110,19 @@ static int nft_del_setelem(struct nft_ctx *ctx, struct nft_set *set, nft_set_ext_add_length(&tmpl, NFT_SET_EXT_KEY, set->klen); + if (nla[NFTA_SET_ELEM_KEY_END]) { + err = nft_setelem_parse_key(ctx, set, &elem.key_end.val, + nla[NFTA_SET_ELEM_KEY_END]); + if (err < 0) + return err; + + nft_set_ext_add_length(&tmpl, NFT_SET_EXT_KEY_END, set->klen); + } + err = -ENOMEM; - elem.priv = nft_set_elem_init(set, &tmpl, elem.key.val.data, NULL, 0, - 0, GFP_KERNEL); + elem.priv = nft_set_elem_init(set, &tmpl, elem.key.val.data, + elem.key_end.val.data, NULL, 0, 0, + GFP_KERNEL); if (elem.priv == NULL) goto err2; diff --git a/net/netfilter/nft_dynset.c b/net/netfilter/nft_dynset.c index 8887295414dc..683785225a3e 100644 --- a/net/netfilter/nft_dynset.c +++ b/net/netfilter/nft_dynset.c @@ -54,7 +54,7 @@ static void *nft_dynset_new(struct nft_set *set, const struct nft_expr *expr, timeout = priv->timeout ? : set->timeout; elem = nft_set_elem_init(set, &priv->tmpl, - ®s->data[priv->sreg_key], + ®s->data[priv->sreg_key], NULL, ®s->data[priv->sreg_data], timeout, 0, GFP_ATOMIC); if (elem == NULL) -- 2.11.0 ^ permalink raw reply related [flat|nested] 13+ messages in thread
* Re: [PATCH,nf-next RFC 2/2] netfilter: nf_tables: add NFTA_SET_ELEM_KEY_END attribute 2019-12-02 13:14 ` [PATCH,nf-next RFC 2/2] netfilter: nf_tables: add NFTA_SET_ELEM_KEY_END attribute Pablo Neira Ayuso @ 2019-12-05 22:44 ` Stefano Brivio 2019-12-06 19:52 ` Pablo Neira Ayuso 0 siblings, 1 reply; 13+ messages in thread From: Stefano Brivio @ 2019-12-05 22:44 UTC (permalink / raw) To: Pablo Neira Ayuso; +Cc: netfilter-devel On Mon, 2 Dec 2019 14:14:07 +0100 Pablo Neira Ayuso <pablo@netfilter.org> wrote: > Add NFTA_SET_ELEM_KEY_END attribute to convey the closing element of the > interval between kernel and userspace. > > This patch also adds the NFT_SET_EXT_KEY_END extension to store the > closing element value in this interval. > > Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> > --- > include/net/netfilter/nf_tables.h | 14 +++++- > include/uapi/linux/netfilter/nf_tables.h | 2 + > net/netfilter/nf_tables_api.c | 82 +++++++++++++++++++++++--------- > net/netfilter/nft_dynset.c | 2 +- > 4 files changed, 76 insertions(+), 24 deletions(-) > > diff --git a/include/net/netfilter/nf_tables.h b/include/net/netfilter/nf_tables.h > index fe7c50acc681..2252a3892124 100644 > --- a/include/net/netfilter/nf_tables.h > +++ b/include/net/netfilter/nf_tables.h > @@ -231,6 +231,7 @@ struct nft_userdata { > * struct nft_set_elem - generic representation of set elements > * > * @key: element key > + * @key_end: closing element key The "closing" here takes for granted that we're talking about ranges, but perhaps it's not obvious from the context. Maybe something on the lines of "upper bound element key, for ranges" would be more explanatory. > * @priv: element private data and extensions > */ > struct nft_set_elem { > @@ -238,6 +239,10 @@ struct nft_set_elem { > u32 buf[NFT_DATA_VALUE_MAXLEN / sizeof(u32)]; > struct nft_data val; > } key; > + union { > + u32 buf[NFT_DATA_VALUE_MAXLEN / sizeof(u32)]; > + struct nft_data val; > + } key_end; > void *priv; > }; I wonder if this special need justifies almost doubling the size (for other set types) here. As far as I can tell, *priv doesn't need to be at the end, so we might even consider to have key[0] at the end, with 1 to 2 elements, and I guess nft_set_elem_init() has the information needed to allocate the right size. > > @@ -502,6 +507,7 @@ void nf_tables_destroy_set(const struct nft_ctx *ctx, struct nft_set *set); > * enum nft_set_extensions - set extension type IDs > * > * @NFT_SET_EXT_KEY: element key > + * @NFT_SET_EXT_KEY_END: closing element key > * @NFT_SET_EXT_DATA: mapping data > * @NFT_SET_EXT_FLAGS: element flags > * @NFT_SET_EXT_TIMEOUT: element timeout > @@ -513,6 +519,7 @@ void nf_tables_destroy_set(const struct nft_ctx *ctx, struct nft_set *set); > */ > enum nft_set_extensions { > NFT_SET_EXT_KEY, > + NFT_SET_EXT_KEY_END, > NFT_SET_EXT_DATA, > NFT_SET_EXT_FLAGS, > NFT_SET_EXT_TIMEOUT, > @@ -606,6 +613,11 @@ static inline struct nft_data *nft_set_ext_key(const struct nft_set_ext *ext) > return nft_set_ext(ext, NFT_SET_EXT_KEY); > } > > +static inline struct nft_data *nft_set_ext_key_end(const struct nft_set_ext *ext) > +{ > + return nft_set_ext(ext, NFT_SET_EXT_KEY_END); > +} > + > static inline struct nft_data *nft_set_ext_data(const struct nft_set_ext *ext) > { > return nft_set_ext(ext, NFT_SET_EXT_DATA); > @@ -655,7 +667,7 @@ static inline struct nft_object **nft_set_ext_obj(const struct nft_set_ext *ext) > > void *nft_set_elem_init(const struct nft_set *set, > const struct nft_set_ext_tmpl *tmpl, > - const u32 *key, const u32 *data, > + const u32 *key, const u32 *key_end, const u32 *data, > u64 timeout, u64 expiration, gfp_t gfp); > void nft_set_elem_destroy(const struct nft_set *set, void *elem, > bool destroy_expr); > diff --git a/include/uapi/linux/netfilter/nf_tables.h b/include/uapi/linux/netfilter/nf_tables.h > index bb9b049310df..1d62552a12a7 100644 > --- a/include/uapi/linux/netfilter/nf_tables.h > +++ b/include/uapi/linux/netfilter/nf_tables.h > @@ -370,6 +370,7 @@ enum nft_set_elem_flags { > * @NFTA_SET_ELEM_USERDATA: user data (NLA_BINARY) > * @NFTA_SET_ELEM_EXPR: expression (NLA_NESTED: nft_expr_attributes) > * @NFTA_SET_ELEM_OBJREF: stateful object reference (NLA_STRING) > + * @NFTA_SET_ELEM_KEY_END: closing key value (NLA_STRING) s/NLA_STRING/NLA_NESTED/ > */ > enum nft_set_elem_attributes { > NFTA_SET_ELEM_UNSPEC, > @@ -382,6 +383,7 @@ enum nft_set_elem_attributes { > NFTA_SET_ELEM_EXPR, > NFTA_SET_ELEM_PAD, > NFTA_SET_ELEM_OBJREF, > + NFTA_SET_ELEM_KEY_END, > __NFTA_SET_ELEM_MAX > }; > #define NFTA_SET_ELEM_MAX (__NFTA_SET_ELEM_MAX - 1) > diff --git a/net/netfilter/nf_tables_api.c b/net/netfilter/nf_tables_api.c > index 13e291fac26f..927f6de5f65c 100644 > --- a/net/netfilter/nf_tables_api.c > +++ b/net/netfilter/nf_tables_api.c > @@ -4199,6 +4199,7 @@ static const struct nla_policy nft_set_elem_policy[NFTA_SET_ELEM_MAX + 1] = { > .len = NFT_USERDATA_MAXLEN }, > [NFTA_SET_ELEM_EXPR] = { .type = NLA_NESTED }, > [NFTA_SET_ELEM_OBJREF] = { .type = NLA_STRING }, > + [NFTA_SET_ELEM_KEY_END] = { .type = NLA_NESTED }, > }; > > static const struct nla_policy nft_set_elem_list_policy[NFTA_SET_ELEM_LIST_MAX + 1] = { > @@ -4248,6 +4249,11 @@ static int nf_tables_fill_setelem(struct sk_buff *skb, > NFT_DATA_VALUE, set->klen) < 0) > goto nla_put_failure; > > + if (nft_set_ext_exists(ext, NFT_SET_EXT_KEY_END) && > + nft_data_dump(skb, NFTA_SET_ELEM_KEY_END, nft_set_ext_key_end(ext), > + NFT_DATA_VALUE, set->klen) < 0) > + goto nla_put_failure; > + > if (nft_set_ext_exists(ext, NFT_SET_EXT_DATA) && > nft_data_dump(skb, NFTA_SET_ELEM_DATA, nft_set_ext_data(ext), > set->dtype == NFT_DATA_VERDICT ? NFT_DATA_VERDICT : NFT_DATA_VALUE, > @@ -4538,6 +4544,13 @@ static int nft_get_set_elem(struct nft_ctx *ctx, struct nft_set *set, > if (err < 0) > return err; > > + if (nla[NFTA_SET_ELEM_KEY_END]) { > + err = nft_setelem_parse_key(ctx, set, &elem.key_end.val, > + nla[NFTA_SET_ELEM_KEY_END]); > + if (err < 0) > + return err; > + } > + > priv = set->ops->get(ctx->net, set, &elem, flags); > if (IS_ERR(priv)) > return PTR_ERR(priv); > @@ -4663,8 +4676,8 @@ static struct nft_trans *nft_trans_elem_alloc(struct nft_ctx *ctx, > > void *nft_set_elem_init(const struct nft_set *set, > const struct nft_set_ext_tmpl *tmpl, > - const u32 *key, const u32 *data, > - u64 timeout, u64 expiration, gfp_t gfp) > + const u32 *key, const u32 *key_end, > + const u32 *data, u64 timeout, u64 expiration, gfp_t gfp) > { > struct nft_set_ext *ext; > void *elem; > @@ -4677,6 +4690,8 @@ void *nft_set_elem_init(const struct nft_set *set, > nft_set_ext_init(ext, tmpl); > > memcpy(nft_set_ext_key(ext), key, set->klen); > + if (nft_set_ext_exists(ext, NFT_SET_EXT_KEY_END)) > + memcpy(nft_set_ext_key_end(ext), key_end, set->klen); > if (nft_set_ext_exists(ext, NFT_SET_EXT_DATA)) > memcpy(nft_set_ext_data(ext), data, set->dlen); > if (nft_set_ext_exists(ext, NFT_SET_EXT_EXPIRATION)) { > @@ -4811,9 +4826,19 @@ static int nft_add_set_elem(struct nft_ctx *ctx, struct nft_set *set, > err = nft_setelem_parse_key(ctx, set, &elem.key.val, > nla[NFTA_SET_ELEM_KEY]); > if (err < 0) > - goto err1; > + return err; I think this makes sense as labels get meaningful names with this patch, but I wonder if this change is actually intended. > > nft_set_ext_add_length(&tmpl, NFT_SET_EXT_KEY, set->klen); > + > + if (nla[NFTA_SET_ELEM_KEY_END]) { > + err = nft_setelem_parse_key(ctx, set, &elem.key_end.val, > + nla[NFTA_SET_ELEM_KEY_END]); > + if (err < 0) > + goto err_parse_key; Same comment as patch 1/2, this would be more straightforward if nft_setelem_parse_key() cleaned up after itself (only on error). > + > + nft_set_ext_add_length(&tmpl, NFT_SET_EXT_KEY_END, set->klen); > + } > + > if (timeout > 0) { > nft_set_ext_add(&tmpl, NFT_SET_EXT_EXPIRATION); > if (timeout != set->timeout) > @@ -4823,14 +4848,14 @@ static int nft_add_set_elem(struct nft_ctx *ctx, struct nft_set *set, > if (nla[NFTA_SET_ELEM_OBJREF] != NULL) { > if (!(set->flags & NFT_SET_OBJECT)) { > err = -EINVAL; > - goto err2; > + goto err_parse_key_end; > } > obj = nft_obj_lookup(ctx->net, ctx->table, > nla[NFTA_SET_ELEM_OBJREF], > set->objtype, genmask); > if (IS_ERR(obj)) { > err = PTR_ERR(obj); > - goto err2; > + goto err_parse_key_end; > } > nft_set_ext_add(&tmpl, NFT_SET_EXT_OBJREF); > } > @@ -4839,11 +4864,11 @@ static int nft_add_set_elem(struct nft_ctx *ctx, struct nft_set *set, > err = nft_data_init(ctx, &data, sizeof(data), &d2, > nla[NFTA_SET_ELEM_DATA]); > if (err < 0) > - goto err2; > + goto err_parse_key_end; > > err = -EINVAL; > if (set->dtype != NFT_DATA_VERDICT && d2.len != set->dlen) > - goto err3; > + goto err_parse_data; > > dreg = nft_type_to_reg(set->dtype); > list_for_each_entry(binding, &set->bindings, list) { > @@ -4861,7 +4886,7 @@ static int nft_add_set_elem(struct nft_ctx *ctx, struct nft_set *set, > &data, > d2.type, d2.len); > if (err < 0) > - goto err3; > + goto err_parse_data; > > if (d2.type == NFT_DATA_VERDICT && > (data.verdict.code == NFT_GOTO || > @@ -4886,10 +4911,11 @@ static int nft_add_set_elem(struct nft_ctx *ctx, struct nft_set *set, > } > > err = -ENOMEM; > - elem.priv = nft_set_elem_init(set, &tmpl, elem.key.val.data, data.data, > + elem.priv = nft_set_elem_init(set, &tmpl, elem.key.val.data, > + elem.key_end.val.data, data.data, > timeout, expiration, GFP_KERNEL); > if (elem.priv == NULL) > - goto err3; > + goto err_parse_data; > > ext = nft_set_elem_ext(set, elem.priv); > if (flags) > @@ -4906,7 +4932,7 @@ static int nft_add_set_elem(struct nft_ctx *ctx, struct nft_set *set, > > trans = nft_trans_elem_alloc(ctx, NFT_MSG_NEWSETELEM, set); > if (trans == NULL) > - goto err4; > + goto err_trans; > > ext->genmask = nft_genmask_cur(ctx->net) | NFT_SET_ELEM_BUSY_MASK; > err = set->ops->insert(ctx->net, set, &elem, &ext2); > @@ -4917,7 +4943,7 @@ static int nft_add_set_elem(struct nft_ctx *ctx, struct nft_set *set, > nft_set_ext_exists(ext, NFT_SET_EXT_OBJREF) ^ > nft_set_ext_exists(ext2, NFT_SET_EXT_OBJREF)) { > err = -EBUSY; > - goto err5; > + goto err_element_clash; > } > if ((nft_set_ext_exists(ext, NFT_SET_EXT_DATA) && > nft_set_ext_exists(ext2, NFT_SET_EXT_DATA) && > @@ -4930,33 +4956,35 @@ static int nft_add_set_elem(struct nft_ctx *ctx, struct nft_set *set, > else if (!(nlmsg_flags & NLM_F_EXCL)) > err = 0; > } > - goto err5; > + goto err_element_clash; > } > > if (set->size && > !atomic_add_unless(&set->nelems, 1, set->size + set->ndeact)) { > err = -ENFILE; Unrelated: I think -ENFILE is abused here, and -ENOSPC would be a better fit. > - goto err6; > + goto err_set_full; > } > > nft_trans_elem(trans) = elem; > list_add_tail(&trans->list, &ctx->net->nft.commit_list); > return 0; > > -err6: > +err_set_full: > set->ops->remove(ctx->net, set, &elem); > -err5: > +err_element_clash: > kfree(trans); > -err4: > +err_trans: > if (obj) > obj->use--; > kfree(elem.priv); > -err3: > +err_parse_data: > if (nla[NFTA_SET_ELEM_DATA] != NULL) > nft_data_release(&data, d2.type); > -err2: > +err_parse_key_end: > + nft_data_release(&elem.key_end.val, NFT_DATA_VALUE); > +err_parse_key: > nft_data_release(&elem.key.val, NFT_DATA_VALUE); > -err1: > + > return err; > } > > @@ -5082,9 +5110,19 @@ static int nft_del_setelem(struct nft_ctx *ctx, struct nft_set *set, > > nft_set_ext_add_length(&tmpl, NFT_SET_EXT_KEY, set->klen); > > + if (nla[NFTA_SET_ELEM_KEY_END]) { > + err = nft_setelem_parse_key(ctx, set, &elem.key_end.val, > + nla[NFTA_SET_ELEM_KEY_END]); > + if (err < 0) > + return err; > + > + nft_set_ext_add_length(&tmpl, NFT_SET_EXT_KEY_END, set->klen); > + } > + > err = -ENOMEM; > - elem.priv = nft_set_elem_init(set, &tmpl, elem.key.val.data, NULL, 0, > - 0, GFP_KERNEL); > + elem.priv = nft_set_elem_init(set, &tmpl, elem.key.val.data, > + elem.key_end.val.data, NULL, 0, 0, > + GFP_KERNEL); > if (elem.priv == NULL) > goto err2; > > diff --git a/net/netfilter/nft_dynset.c b/net/netfilter/nft_dynset.c > index 8887295414dc..683785225a3e 100644 > --- a/net/netfilter/nft_dynset.c > +++ b/net/netfilter/nft_dynset.c > @@ -54,7 +54,7 @@ static void *nft_dynset_new(struct nft_set *set, const struct nft_expr *expr, > > timeout = priv->timeout ? : set->timeout; > elem = nft_set_elem_init(set, &priv->tmpl, > - ®s->data[priv->sreg_key], > + ®s->data[priv->sreg_key], NULL, > ®s->data[priv->sreg_data], > timeout, 0, GFP_ATOMIC); > if (elem == NULL) The rest looks good to me! -- Stefano ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH,nf-next RFC 2/2] netfilter: nf_tables: add NFTA_SET_ELEM_KEY_END attribute 2019-12-05 22:44 ` Stefano Brivio @ 2019-12-06 19:52 ` Pablo Neira Ayuso 2019-12-07 22:52 ` Stefano Brivio 0 siblings, 1 reply; 13+ messages in thread From: Pablo Neira Ayuso @ 2019-12-06 19:52 UTC (permalink / raw) To: Stefano Brivio; +Cc: netfilter-devel On Thu, Dec 05, 2019 at 11:44:21PM +0100, Stefano Brivio wrote: > On Mon, 2 Dec 2019 14:14:07 +0100 > Pablo Neira Ayuso <pablo@netfilter.org> wrote: > > > Add NFTA_SET_ELEM_KEY_END attribute to convey the closing element of the > > interval between kernel and userspace. > > > > This patch also adds the NFT_SET_EXT_KEY_END extension to store the > > closing element value in this interval. > > > > Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> > > --- > > include/net/netfilter/nf_tables.h | 14 +++++- > > include/uapi/linux/netfilter/nf_tables.h | 2 + > > net/netfilter/nf_tables_api.c | 82 +++++++++++++++++++++++--------- > > net/netfilter/nft_dynset.c | 2 +- > > 4 files changed, 76 insertions(+), 24 deletions(-) > > > > diff --git a/include/net/netfilter/nf_tables.h b/include/net/netfilter/nf_tables.h > > index fe7c50acc681..2252a3892124 100644 > > --- a/include/net/netfilter/nf_tables.h > > +++ b/include/net/netfilter/nf_tables.h > > @@ -231,6 +231,7 @@ struct nft_userdata { > > * struct nft_set_elem - generic representation of set elements > > * > > * @key: element key > > + * @key_end: closing element key > > The "closing" here takes for granted that we're talking about ranges, > but perhaps it's not obvious from the context. Maybe something on the > lines of "upper bound element key, for ranges" would be more > explanatory. You mean to update the comment? That's fine indeed. > > * @priv: element private data and extensions > > */ > > struct nft_set_elem { > > @@ -238,6 +239,10 @@ struct nft_set_elem { > > u32 buf[NFT_DATA_VALUE_MAXLEN / sizeof(u32)]; > > struct nft_data val; > > } key; > > + union { > > + u32 buf[NFT_DATA_VALUE_MAXLEN / sizeof(u32)]; > > + struct nft_data val; > > + } key_end; > > void *priv; > > }; > > I wonder if this special need justifies almost doubling the size (for > other set types) here. IIRC, this nft_set_elem structure is only used from the control plane. > As far as I can tell, *priv doesn't need to be at the end, so we might > even consider to have key[0] at the end, with 1 to 2 elements, and I > guess nft_set_elem_init() has the information needed to allocate the > right size. The priv pointer stores data in a linear area through the extension infrastructure. I think the layout of this elem.priv pointer (actually the nft_set_ext object) is what matters in terms of memory efficiency, since it is used from the packet path. We can probably simplify this set extension infrastructure later. At least one key is always guaranteed to be in place. > > @@ -502,6 +507,7 @@ void nf_tables_destroy_set(const struct nft_ctx *ctx, struct nft_set *set); > > * enum nft_set_extensions - set extension type IDs > > * > > * @NFT_SET_EXT_KEY: element key > > + * @NFT_SET_EXT_KEY_END: closing element key > > * @NFT_SET_EXT_DATA: mapping data > > * @NFT_SET_EXT_FLAGS: element flags > > * @NFT_SET_EXT_TIMEOUT: element timeout > > @@ -513,6 +519,7 @@ void nf_tables_destroy_set(const struct nft_ctx *ctx, struct nft_set *set); > > */ > > enum nft_set_extensions { > > NFT_SET_EXT_KEY, > > + NFT_SET_EXT_KEY_END, > > NFT_SET_EXT_DATA, > > NFT_SET_EXT_FLAGS, > > NFT_SET_EXT_TIMEOUT, > > @@ -606,6 +613,11 @@ static inline struct nft_data *nft_set_ext_key(const struct nft_set_ext *ext) > > return nft_set_ext(ext, NFT_SET_EXT_KEY); > > } > > > > +static inline struct nft_data *nft_set_ext_key_end(const struct nft_set_ext *ext) > > +{ > > + return nft_set_ext(ext, NFT_SET_EXT_KEY_END); > > +} > > + > > static inline struct nft_data *nft_set_ext_data(const struct nft_set_ext *ext) > > { > > return nft_set_ext(ext, NFT_SET_EXT_DATA); > > @@ -655,7 +667,7 @@ static inline struct nft_object **nft_set_ext_obj(const struct nft_set_ext *ext) > > > > void *nft_set_elem_init(const struct nft_set *set, > > const struct nft_set_ext_tmpl *tmpl, > > - const u32 *key, const u32 *data, > > + const u32 *key, const u32 *key_end, const u32 *data, > > u64 timeout, u64 expiration, gfp_t gfp); > > void nft_set_elem_destroy(const struct nft_set *set, void *elem, > > bool destroy_expr); > > diff --git a/include/uapi/linux/netfilter/nf_tables.h b/include/uapi/linux/netfilter/nf_tables.h > > index bb9b049310df..1d62552a12a7 100644 > > --- a/include/uapi/linux/netfilter/nf_tables.h > > +++ b/include/uapi/linux/netfilter/nf_tables.h > > @@ -370,6 +370,7 @@ enum nft_set_elem_flags { > > * @NFTA_SET_ELEM_USERDATA: user data (NLA_BINARY) > > * @NFTA_SET_ELEM_EXPR: expression (NLA_NESTED: nft_expr_attributes) > > * @NFTA_SET_ELEM_OBJREF: stateful object reference (NLA_STRING) > > + * @NFTA_SET_ELEM_KEY_END: closing key value (NLA_STRING) > > s/NLA_STRING/NLA_NESTED/ > > > */ > > enum nft_set_elem_attributes { > > NFTA_SET_ELEM_UNSPEC, > > @@ -382,6 +383,7 @@ enum nft_set_elem_attributes { > > NFTA_SET_ELEM_EXPR, > > NFTA_SET_ELEM_PAD, > > NFTA_SET_ELEM_OBJREF, > > + NFTA_SET_ELEM_KEY_END, > > __NFTA_SET_ELEM_MAX > > }; > > #define NFTA_SET_ELEM_MAX (__NFTA_SET_ELEM_MAX - 1) > > diff --git a/net/netfilter/nf_tables_api.c b/net/netfilter/nf_tables_api.c > > index 13e291fac26f..927f6de5f65c 100644 > > --- a/net/netfilter/nf_tables_api.c > > +++ b/net/netfilter/nf_tables_api.c > > @@ -4199,6 +4199,7 @@ static const struct nla_policy nft_set_elem_policy[NFTA_SET_ELEM_MAX + 1] = { > > .len = NFT_USERDATA_MAXLEN }, > > [NFTA_SET_ELEM_EXPR] = { .type = NLA_NESTED }, > > [NFTA_SET_ELEM_OBJREF] = { .type = NLA_STRING }, > > + [NFTA_SET_ELEM_KEY_END] = { .type = NLA_NESTED }, > > }; > > > > static const struct nla_policy nft_set_elem_list_policy[NFTA_SET_ELEM_LIST_MAX + 1] = { > > @@ -4248,6 +4249,11 @@ static int nf_tables_fill_setelem(struct sk_buff *skb, > > NFT_DATA_VALUE, set->klen) < 0) > > goto nla_put_failure; > > > > + if (nft_set_ext_exists(ext, NFT_SET_EXT_KEY_END) && > > + nft_data_dump(skb, NFTA_SET_ELEM_KEY_END, nft_set_ext_key_end(ext), > > + NFT_DATA_VALUE, set->klen) < 0) > > + goto nla_put_failure; > > + > > if (nft_set_ext_exists(ext, NFT_SET_EXT_DATA) && > > nft_data_dump(skb, NFTA_SET_ELEM_DATA, nft_set_ext_data(ext), > > set->dtype == NFT_DATA_VERDICT ? NFT_DATA_VERDICT : NFT_DATA_VALUE, > > @@ -4538,6 +4544,13 @@ static int nft_get_set_elem(struct nft_ctx *ctx, struct nft_set *set, > > if (err < 0) > > return err; > > > > + if (nla[NFTA_SET_ELEM_KEY_END]) { > > + err = nft_setelem_parse_key(ctx, set, &elem.key_end.val, > > + nla[NFTA_SET_ELEM_KEY_END]); > > + if (err < 0) > > + return err; > > + } > > + > > priv = set->ops->get(ctx->net, set, &elem, flags); > > if (IS_ERR(priv)) > > return PTR_ERR(priv); > > @@ -4663,8 +4676,8 @@ static struct nft_trans *nft_trans_elem_alloc(struct nft_ctx *ctx, > > > > void *nft_set_elem_init(const struct nft_set *set, > > const struct nft_set_ext_tmpl *tmpl, > > - const u32 *key, const u32 *data, > > - u64 timeout, u64 expiration, gfp_t gfp) > > + const u32 *key, const u32 *key_end, > > + const u32 *data, u64 timeout, u64 expiration, gfp_t gfp) > > { > > struct nft_set_ext *ext; > > void *elem; > > @@ -4677,6 +4690,8 @@ void *nft_set_elem_init(const struct nft_set *set, > > nft_set_ext_init(ext, tmpl); > > > > memcpy(nft_set_ext_key(ext), key, set->klen); > > + if (nft_set_ext_exists(ext, NFT_SET_EXT_KEY_END)) > > + memcpy(nft_set_ext_key_end(ext), key_end, set->klen); > > if (nft_set_ext_exists(ext, NFT_SET_EXT_DATA)) > > memcpy(nft_set_ext_data(ext), data, set->dlen); > > if (nft_set_ext_exists(ext, NFT_SET_EXT_EXPIRATION)) { > > @@ -4811,9 +4826,19 @@ static int nft_add_set_elem(struct nft_ctx *ctx, struct nft_set *set, > > err = nft_setelem_parse_key(ctx, set, &elem.key.val, > > nla[NFTA_SET_ELEM_KEY]); > > if (err < 0) > > - goto err1; > > + return err; > > I think this makes sense as labels get meaningful names with this > patch, but I wonder if this change is actually intended. > > > > > nft_set_ext_add_length(&tmpl, NFT_SET_EXT_KEY, set->klen); > > + > > + if (nla[NFTA_SET_ELEM_KEY_END]) { > > + err = nft_setelem_parse_key(ctx, set, &elem.key_end.val, > > + nla[NFTA_SET_ELEM_KEY_END]); > > + if (err < 0) > > + goto err_parse_key; > > Same comment as patch 1/2, this would be more straightforward if > nft_setelem_parse_key() cleaned up after itself (only on error). > > > + > > + nft_set_ext_add_length(&tmpl, NFT_SET_EXT_KEY_END, set->klen); > > + } > > + > > if (timeout > 0) { > > nft_set_ext_add(&tmpl, NFT_SET_EXT_EXPIRATION); > > if (timeout != set->timeout) > > @@ -4823,14 +4848,14 @@ static int nft_add_set_elem(struct nft_ctx *ctx, struct nft_set *set, > > if (nla[NFTA_SET_ELEM_OBJREF] != NULL) { > > if (!(set->flags & NFT_SET_OBJECT)) { > > err = -EINVAL; > > - goto err2; > > + goto err_parse_key_end; > > } > > obj = nft_obj_lookup(ctx->net, ctx->table, > > nla[NFTA_SET_ELEM_OBJREF], > > set->objtype, genmask); > > if (IS_ERR(obj)) { > > err = PTR_ERR(obj); > > - goto err2; > > + goto err_parse_key_end; > > } > > nft_set_ext_add(&tmpl, NFT_SET_EXT_OBJREF); > > } > > @@ -4839,11 +4864,11 @@ static int nft_add_set_elem(struct nft_ctx *ctx, struct nft_set *set, > > err = nft_data_init(ctx, &data, sizeof(data), &d2, > > nla[NFTA_SET_ELEM_DATA]); > > if (err < 0) > > - goto err2; > > + goto err_parse_key_end; > > > > err = -EINVAL; > > if (set->dtype != NFT_DATA_VERDICT && d2.len != set->dlen) > > - goto err3; > > + goto err_parse_data; > > > > dreg = nft_type_to_reg(set->dtype); > > list_for_each_entry(binding, &set->bindings, list) { > > @@ -4861,7 +4886,7 @@ static int nft_add_set_elem(struct nft_ctx *ctx, struct nft_set *set, > > &data, > > d2.type, d2.len); > > if (err < 0) > > - goto err3; > > + goto err_parse_data; > > > > if (d2.type == NFT_DATA_VERDICT && > > (data.verdict.code == NFT_GOTO || > > @@ -4886,10 +4911,11 @@ static int nft_add_set_elem(struct nft_ctx *ctx, struct nft_set *set, > > } > > > > err = -ENOMEM; > > - elem.priv = nft_set_elem_init(set, &tmpl, elem.key.val.data, data.data, > > + elem.priv = nft_set_elem_init(set, &tmpl, elem.key.val.data, > > + elem.key_end.val.data, data.data, > > timeout, expiration, GFP_KERNEL); > > if (elem.priv == NULL) > > - goto err3; > > + goto err_parse_data; > > > > ext = nft_set_elem_ext(set, elem.priv); > > if (flags) > > @@ -4906,7 +4932,7 @@ static int nft_add_set_elem(struct nft_ctx *ctx, struct nft_set *set, > > > > trans = nft_trans_elem_alloc(ctx, NFT_MSG_NEWSETELEM, set); > > if (trans == NULL) > > - goto err4; > > + goto err_trans; > > > > ext->genmask = nft_genmask_cur(ctx->net) | NFT_SET_ELEM_BUSY_MASK; > > err = set->ops->insert(ctx->net, set, &elem, &ext2); > > @@ -4917,7 +4943,7 @@ static int nft_add_set_elem(struct nft_ctx *ctx, struct nft_set *set, > > nft_set_ext_exists(ext, NFT_SET_EXT_OBJREF) ^ > > nft_set_ext_exists(ext2, NFT_SET_EXT_OBJREF)) { > > err = -EBUSY; > > - goto err5; > > + goto err_element_clash; > > } > > if ((nft_set_ext_exists(ext, NFT_SET_EXT_DATA) && > > nft_set_ext_exists(ext2, NFT_SET_EXT_DATA) && > > @@ -4930,33 +4956,35 @@ static int nft_add_set_elem(struct nft_ctx *ctx, struct nft_set *set, > > else if (!(nlmsg_flags & NLM_F_EXCL)) > > err = 0; > > } > > - goto err5; > > + goto err_element_clash; > > } > > > > if (set->size && > > !atomic_add_unless(&set->nelems, 1, set->size + set->ndeact)) { > > err = -ENFILE; > > Unrelated: I think -ENFILE is abused here, and -ENOSPC would be a > better fit. Yes ENOSPC is better indeed, 3dd0673ac3 added introduced this misleading error reporting. From a uAPI perspective, we should not update errors that are exposed to the user, but this one is so wrong that I would take a patch for this. > > - goto err6; > > + goto err_set_full; > > } > > > > nft_trans_elem(trans) = elem; > > list_add_tail(&trans->list, &ctx->net->nft.commit_list); > > return 0; > > > > -err6: > > +err_set_full: > > set->ops->remove(ctx->net, set, &elem); > > -err5: > > +err_element_clash: > > kfree(trans); > > -err4: > > +err_trans: > > if (obj) > > obj->use--; > > kfree(elem.priv); > > -err3: > > +err_parse_data: > > if (nla[NFTA_SET_ELEM_DATA] != NULL) > > nft_data_release(&data, d2.type); > > -err2: > > +err_parse_key_end: > > + nft_data_release(&elem.key_end.val, NFT_DATA_VALUE); > > +err_parse_key: > > nft_data_release(&elem.key.val, NFT_DATA_VALUE); > > -err1: > > + > > return err; > > } > > > > @@ -5082,9 +5110,19 @@ static int nft_del_setelem(struct nft_ctx *ctx, struct nft_set *set, > > > > nft_set_ext_add_length(&tmpl, NFT_SET_EXT_KEY, set->klen); > > > > + if (nla[NFTA_SET_ELEM_KEY_END]) { > > + err = nft_setelem_parse_key(ctx, set, &elem.key_end.val, > > + nla[NFTA_SET_ELEM_KEY_END]); > > + if (err < 0) > > + return err; > > + > > + nft_set_ext_add_length(&tmpl, NFT_SET_EXT_KEY_END, set->klen); > > + } > > + > > err = -ENOMEM; > > - elem.priv = nft_set_elem_init(set, &tmpl, elem.key.val.data, NULL, 0, > > - 0, GFP_KERNEL); > > + elem.priv = nft_set_elem_init(set, &tmpl, elem.key.val.data, > > + elem.key_end.val.data, NULL, 0, 0, > > + GFP_KERNEL); > > if (elem.priv == NULL) > > goto err2; > > > > diff --git a/net/netfilter/nft_dynset.c b/net/netfilter/nft_dynset.c > > index 8887295414dc..683785225a3e 100644 > > --- a/net/netfilter/nft_dynset.c > > +++ b/net/netfilter/nft_dynset.c > > @@ -54,7 +54,7 @@ static void *nft_dynset_new(struct nft_set *set, const struct nft_expr *expr, > > > > timeout = priv->timeout ? : set->timeout; > > elem = nft_set_elem_init(set, &priv->tmpl, > > - ®s->data[priv->sreg_key], > > + ®s->data[priv->sreg_key], NULL, > > ®s->data[priv->sreg_data], > > timeout, 0, GFP_ATOMIC); > > if (elem == NULL) > > The rest looks good to me! > > -- > Stefano > ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH,nf-next RFC 2/2] netfilter: nf_tables: add NFTA_SET_ELEM_KEY_END attribute 2019-12-06 19:52 ` Pablo Neira Ayuso @ 2019-12-07 22:52 ` Stefano Brivio 0 siblings, 0 replies; 13+ messages in thread From: Stefano Brivio @ 2019-12-07 22:52 UTC (permalink / raw) To: Pablo Neira Ayuso; +Cc: netfilter-devel On Fri, 6 Dec 2019 20:52:55 +0100 Pablo Neira Ayuso <pablo@netfilter.org> wrote: > On Thu, Dec 05, 2019 at 11:44:21PM +0100, Stefano Brivio wrote: > > On Mon, 2 Dec 2019 14:14:07 +0100 > > Pablo Neira Ayuso <pablo@netfilter.org> wrote: > > > > > Add NFTA_SET_ELEM_KEY_END attribute to convey the closing element of the > > > interval between kernel and userspace. > > > > > > This patch also adds the NFT_SET_EXT_KEY_END extension to store the > > > closing element value in this interval. > > > > > > Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> > > > --- > > > include/net/netfilter/nf_tables.h | 14 +++++- > > > include/uapi/linux/netfilter/nf_tables.h | 2 + > > > net/netfilter/nf_tables_api.c | 82 +++++++++++++++++++++++--------- > > > net/netfilter/nft_dynset.c | 2 +- > > > 4 files changed, 76 insertions(+), 24 deletions(-) > > > > > > diff --git a/include/net/netfilter/nf_tables.h b/include/net/netfilter/nf_tables.h > > > index fe7c50acc681..2252a3892124 100644 > > > --- a/include/net/netfilter/nf_tables.h > > > +++ b/include/net/netfilter/nf_tables.h > > > @@ -231,6 +231,7 @@ struct nft_userdata { > > > * struct nft_set_elem - generic representation of set elements > > > * > > > * @key: element key > > > + * @key_end: closing element key > > > > The "closing" here takes for granted that we're talking about ranges, > > but perhaps it's not obvious from the context. Maybe something on the > > lines of "upper bound element key, for ranges" would be more > > explanatory. > > You mean to update the comment? That's fine indeed. Yes, that. > > > * @priv: element private data and extensions > > > */ > > > struct nft_set_elem { > > > @@ -238,6 +239,10 @@ struct nft_set_elem { > > > u32 buf[NFT_DATA_VALUE_MAXLEN / sizeof(u32)]; > > > struct nft_data val; > > > } key; > > > + union { > > > + u32 buf[NFT_DATA_VALUE_MAXLEN / sizeof(u32)]; > > > + struct nft_data val; > > > + } key_end; > > > void *priv; > > > }; > > > > I wonder if this special need justifies almost doubling the size (for > > other set types) here. > > IIRC, this nft_set_elem structure is only used from the control plane. Ah, yes, I got confused by the fact that nft_set_elem_init() allocates 'elem', but it's not the same thing as 'elem' in nft_add_set_elem(). Please discard my comment. > > As far as I can tell, *priv doesn't need to be at the end, so we might > > even consider to have key[0] at the end, with 1 to 2 elements, and I > > guess nft_set_elem_init() has the information needed to allocate the > > right size. > > The priv pointer stores data in a linear area through the extension > infrastructure. I think the layout of this elem.priv pointer (actually > the nft_set_ext object) is what matters in terms of memory efficiency, > since it is used from the packet path. Right, and I had tried to play with it already, dropping the offset[] field and using fixed offsets instead, but I couldn't see a significant improvement (at least on aarch64 and x86_64). I would rather invest time later in trying to avoid dereferencing it altogether in the packet path (at least for sets without timeout). As I was mentioning in my other email, that would be possible by modifying the transaction model, but it's not exactly straightforward, and I have no clue how it affect existing set implementations. > We can probably simplify this set extension infrastructure later. At > least one key is always guaranteed to be in place. It already looks simple enough to me -- I just missed the fact this is only used from the control path. > > > @@ -502,6 +507,7 @@ void nf_tables_destroy_set(const struct nft_ctx *ctx, struct nft_set *set); > > > * enum nft_set_extensions - set extension type IDs > > > * > > > * @NFT_SET_EXT_KEY: element key > > > + * @NFT_SET_EXT_KEY_END: closing element key > > > * @NFT_SET_EXT_DATA: mapping data > > > * @NFT_SET_EXT_FLAGS: element flags > > > * @NFT_SET_EXT_TIMEOUT: element timeout > > > @@ -513,6 +519,7 @@ void nf_tables_destroy_set(const struct nft_ctx *ctx, struct nft_set *set); > > > */ > > > enum nft_set_extensions { > > > NFT_SET_EXT_KEY, > > > + NFT_SET_EXT_KEY_END, > > > NFT_SET_EXT_DATA, > > > NFT_SET_EXT_FLAGS, > > > NFT_SET_EXT_TIMEOUT, > > > @@ -606,6 +613,11 @@ static inline struct nft_data *nft_set_ext_key(const struct nft_set_ext *ext) > > > return nft_set_ext(ext, NFT_SET_EXT_KEY); > > > } > > > > > > +static inline struct nft_data *nft_set_ext_key_end(const struct nft_set_ext *ext) > > > +{ > > > + return nft_set_ext(ext, NFT_SET_EXT_KEY_END); > > > +} > > > + > > > static inline struct nft_data *nft_set_ext_data(const struct nft_set_ext *ext) > > > { > > > return nft_set_ext(ext, NFT_SET_EXT_DATA); > > > @@ -655,7 +667,7 @@ static inline struct nft_object **nft_set_ext_obj(const struct nft_set_ext *ext) > > > > > > void *nft_set_elem_init(const struct nft_set *set, > > > const struct nft_set_ext_tmpl *tmpl, > > > - const u32 *key, const u32 *data, > > > + const u32 *key, const u32 *key_end, const u32 *data, > > > u64 timeout, u64 expiration, gfp_t gfp); > > > void nft_set_elem_destroy(const struct nft_set *set, void *elem, > > > bool destroy_expr); > > > diff --git a/include/uapi/linux/netfilter/nf_tables.h b/include/uapi/linux/netfilter/nf_tables.h > > > index bb9b049310df..1d62552a12a7 100644 > > > --- a/include/uapi/linux/netfilter/nf_tables.h > > > +++ b/include/uapi/linux/netfilter/nf_tables.h > > > @@ -370,6 +370,7 @@ enum nft_set_elem_flags { > > > * @NFTA_SET_ELEM_USERDATA: user data (NLA_BINARY) > > > * @NFTA_SET_ELEM_EXPR: expression (NLA_NESTED: nft_expr_attributes) > > > * @NFTA_SET_ELEM_OBJREF: stateful object reference (NLA_STRING) > > > + * @NFTA_SET_ELEM_KEY_END: closing key value (NLA_STRING) > > > > s/NLA_STRING/NLA_NESTED/ > > > > > */ > > > enum nft_set_elem_attributes { > > > NFTA_SET_ELEM_UNSPEC, > > > @@ -382,6 +383,7 @@ enum nft_set_elem_attributes { > > > NFTA_SET_ELEM_EXPR, > > > NFTA_SET_ELEM_PAD, > > > NFTA_SET_ELEM_OBJREF, > > > + NFTA_SET_ELEM_KEY_END, > > > __NFTA_SET_ELEM_MAX > > > }; > > > #define NFTA_SET_ELEM_MAX (__NFTA_SET_ELEM_MAX - 1) > > > diff --git a/net/netfilter/nf_tables_api.c b/net/netfilter/nf_tables_api.c > > > index 13e291fac26f..927f6de5f65c 100644 > > > --- a/net/netfilter/nf_tables_api.c > > > +++ b/net/netfilter/nf_tables_api.c > > > @@ -4199,6 +4199,7 @@ static const struct nla_policy nft_set_elem_policy[NFTA_SET_ELEM_MAX + 1] = { > > > .len = NFT_USERDATA_MAXLEN }, > > > [NFTA_SET_ELEM_EXPR] = { .type = NLA_NESTED }, > > > [NFTA_SET_ELEM_OBJREF] = { .type = NLA_STRING }, > > > + [NFTA_SET_ELEM_KEY_END] = { .type = NLA_NESTED }, > > > }; > > > > > > static const struct nla_policy nft_set_elem_list_policy[NFTA_SET_ELEM_LIST_MAX + 1] = { > > > @@ -4248,6 +4249,11 @@ static int nf_tables_fill_setelem(struct sk_buff *skb, > > > NFT_DATA_VALUE, set->klen) < 0) > > > goto nla_put_failure; > > > > > > + if (nft_set_ext_exists(ext, NFT_SET_EXT_KEY_END) && > > > + nft_data_dump(skb, NFTA_SET_ELEM_KEY_END, nft_set_ext_key_end(ext), > > > + NFT_DATA_VALUE, set->klen) < 0) > > > + goto nla_put_failure; > > > + > > > if (nft_set_ext_exists(ext, NFT_SET_EXT_DATA) && > > > nft_data_dump(skb, NFTA_SET_ELEM_DATA, nft_set_ext_data(ext), > > > set->dtype == NFT_DATA_VERDICT ? NFT_DATA_VERDICT : NFT_DATA_VALUE, > > > @@ -4538,6 +4544,13 @@ static int nft_get_set_elem(struct nft_ctx *ctx, struct nft_set *set, > > > if (err < 0) > > > return err; > > > > > > + if (nla[NFTA_SET_ELEM_KEY_END]) { > > > + err = nft_setelem_parse_key(ctx, set, &elem.key_end.val, > > > + nla[NFTA_SET_ELEM_KEY_END]); > > > + if (err < 0) > > > + return err; > > > + } > > > + > > > priv = set->ops->get(ctx->net, set, &elem, flags); > > > if (IS_ERR(priv)) > > > return PTR_ERR(priv); > > > @@ -4663,8 +4676,8 @@ static struct nft_trans *nft_trans_elem_alloc(struct nft_ctx *ctx, > > > > > > void *nft_set_elem_init(const struct nft_set *set, > > > const struct nft_set_ext_tmpl *tmpl, > > > - const u32 *key, const u32 *data, > > > - u64 timeout, u64 expiration, gfp_t gfp) > > > + const u32 *key, const u32 *key_end, > > > + const u32 *data, u64 timeout, u64 expiration, gfp_t gfp) > > > { > > > struct nft_set_ext *ext; > > > void *elem; > > > @@ -4677,6 +4690,8 @@ void *nft_set_elem_init(const struct nft_set *set, > > > nft_set_ext_init(ext, tmpl); > > > > > > memcpy(nft_set_ext_key(ext), key, set->klen); > > > + if (nft_set_ext_exists(ext, NFT_SET_EXT_KEY_END)) > > > + memcpy(nft_set_ext_key_end(ext), key_end, set->klen); > > > if (nft_set_ext_exists(ext, NFT_SET_EXT_DATA)) > > > memcpy(nft_set_ext_data(ext), data, set->dlen); > > > if (nft_set_ext_exists(ext, NFT_SET_EXT_EXPIRATION)) { > > > @@ -4811,9 +4826,19 @@ static int nft_add_set_elem(struct nft_ctx *ctx, struct nft_set *set, > > > err = nft_setelem_parse_key(ctx, set, &elem.key.val, > > > nla[NFTA_SET_ELEM_KEY]); > > > if (err < 0) > > > - goto err1; > > > + return err; > > > > I think this makes sense as labels get meaningful names with this > > patch, but I wonder if this change is actually intended. > > > > > > > > nft_set_ext_add_length(&tmpl, NFT_SET_EXT_KEY, set->klen); > > > + > > > + if (nla[NFTA_SET_ELEM_KEY_END]) { > > > + err = nft_setelem_parse_key(ctx, set, &elem.key_end.val, > > > + nla[NFTA_SET_ELEM_KEY_END]); > > > + if (err < 0) > > > + goto err_parse_key; > > > > Same comment as patch 1/2, this would be more straightforward if > > nft_setelem_parse_key() cleaned up after itself (only on error). > > > > > + > > > + nft_set_ext_add_length(&tmpl, NFT_SET_EXT_KEY_END, set->klen); > > > + } > > > + > > > if (timeout > 0) { > > > nft_set_ext_add(&tmpl, NFT_SET_EXT_EXPIRATION); > > > if (timeout != set->timeout) > > > @@ -4823,14 +4848,14 @@ static int nft_add_set_elem(struct nft_ctx *ctx, struct nft_set *set, > > > if (nla[NFTA_SET_ELEM_OBJREF] != NULL) { > > > if (!(set->flags & NFT_SET_OBJECT)) { > > > err = -EINVAL; > > > - goto err2; > > > + goto err_parse_key_end; > > > } > > > obj = nft_obj_lookup(ctx->net, ctx->table, > > > nla[NFTA_SET_ELEM_OBJREF], > > > set->objtype, genmask); > > > if (IS_ERR(obj)) { > > > err = PTR_ERR(obj); > > > - goto err2; > > > + goto err_parse_key_end; > > > } > > > nft_set_ext_add(&tmpl, NFT_SET_EXT_OBJREF); > > > } > > > @@ -4839,11 +4864,11 @@ static int nft_add_set_elem(struct nft_ctx *ctx, struct nft_set *set, > > > err = nft_data_init(ctx, &data, sizeof(data), &d2, > > > nla[NFTA_SET_ELEM_DATA]); > > > if (err < 0) > > > - goto err2; > > > + goto err_parse_key_end; > > > > > > err = -EINVAL; > > > if (set->dtype != NFT_DATA_VERDICT && d2.len != set->dlen) > > > - goto err3; > > > + goto err_parse_data; > > > > > > dreg = nft_type_to_reg(set->dtype); > > > list_for_each_entry(binding, &set->bindings, list) { > > > @@ -4861,7 +4886,7 @@ static int nft_add_set_elem(struct nft_ctx *ctx, struct nft_set *set, > > > &data, > > > d2.type, d2.len); > > > if (err < 0) > > > - goto err3; > > > + goto err_parse_data; > > > > > > if (d2.type == NFT_DATA_VERDICT && > > > (data.verdict.code == NFT_GOTO || > > > @@ -4886,10 +4911,11 @@ static int nft_add_set_elem(struct nft_ctx *ctx, struct nft_set *set, > > > } > > > > > > err = -ENOMEM; > > > - elem.priv = nft_set_elem_init(set, &tmpl, elem.key.val.data, data.data, > > > + elem.priv = nft_set_elem_init(set, &tmpl, elem.key.val.data, > > > + elem.key_end.val.data, data.data, > > > timeout, expiration, GFP_KERNEL); > > > if (elem.priv == NULL) > > > - goto err3; > > > + goto err_parse_data; > > > > > > ext = nft_set_elem_ext(set, elem.priv); > > > if (flags) > > > @@ -4906,7 +4932,7 @@ static int nft_add_set_elem(struct nft_ctx *ctx, struct nft_set *set, > > > > > > trans = nft_trans_elem_alloc(ctx, NFT_MSG_NEWSETELEM, set); > > > if (trans == NULL) > > > - goto err4; > > > + goto err_trans; > > > > > > ext->genmask = nft_genmask_cur(ctx->net) | NFT_SET_ELEM_BUSY_MASK; > > > err = set->ops->insert(ctx->net, set, &elem, &ext2); > > > @@ -4917,7 +4943,7 @@ static int nft_add_set_elem(struct nft_ctx *ctx, struct nft_set *set, > > > nft_set_ext_exists(ext, NFT_SET_EXT_OBJREF) ^ > > > nft_set_ext_exists(ext2, NFT_SET_EXT_OBJREF)) { > > > err = -EBUSY; > > > - goto err5; > > > + goto err_element_clash; > > > } > > > if ((nft_set_ext_exists(ext, NFT_SET_EXT_DATA) && > > > nft_set_ext_exists(ext2, NFT_SET_EXT_DATA) && > > > @@ -4930,33 +4956,35 @@ static int nft_add_set_elem(struct nft_ctx *ctx, struct nft_set *set, > > > else if (!(nlmsg_flags & NLM_F_EXCL)) > > > err = 0; > > > } > > > - goto err5; > > > + goto err_element_clash; > > > } > > > > > > if (set->size && > > > !atomic_add_unless(&set->nelems, 1, set->size + set->ndeact)) { > > > err = -ENFILE; > > > > Unrelated: I think -ENFILE is abused here, and -ENOSPC would be a > > better fit. > > Yes ENOSPC is better indeed, 3dd0673ac3 added introduced this > misleading error reporting. > > From a uAPI perspective, we should not update errors that are exposed > to the user, but this one is so wrong that I would take a patch for > this. Okay, I'll send a patch once we're done with this. -- Stefano ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH,nf-next RFC 0/2] add NFTA_SET_ELEM_KEY_END 2019-12-02 13:14 [PATCH,nf-next RFC 0/2] add NFTA_SET_ELEM_KEY_END Pablo Neira Ayuso 2019-12-02 13:14 ` [PATCH,nf-next RFC 1/2] netfilter: nf_tables: add nft_setelem_parse_key() Pablo Neira Ayuso 2019-12-02 13:14 ` [PATCH,nf-next RFC 2/2] netfilter: nf_tables: add NFTA_SET_ELEM_KEY_END attribute Pablo Neira Ayuso @ 2019-12-02 16:19 ` Stefano Brivio 2019-12-03 11:02 ` Pablo Neira Ayuso 2 siblings, 1 reply; 13+ messages in thread From: Stefano Brivio @ 2019-12-02 16:19 UTC (permalink / raw) To: Pablo Neira Ayuso; +Cc: netfilter-devel Hi Pablo, On Mon, 2 Dec 2019 14:14:05 +0100 Pablo Neira Ayuso <pablo@netfilter.org> wrote: > Hi Stefano, > > This patchset extends the netlink API to allow to express an interval > with one single element. > > This simplifies this interface since userspace does not need to send two > independent elements anymore, one of the including the > NFT_SET_ELEM_INTERVAL_END flag. > > The idea is to use the _DESC to specify that userspace speaks the kernel > that new API representation. In your case, the new description attribute > that tells that this set contains interval + concatenation implicitly > tells the kernel that userspace supports for this new API. Thanks! I just had a quick look, I think the new set implementation would indeed look more elegant this way. As to design choices, I'm afraid I'm not familiar enough with the big picture to comment on the general idea, but my uninformed opinion agrees with this approach. :) For what it's worth, I'd review this in deeper detail next. > If you're fine with this, I can scratch a bit of time to finish the > libnftnl part. The nft code will need a small update too. You will not > need to use the nft_set_pipapo object as scratchpad area anymore. On my side, I'm almost done with nft/libnftnl/kernel changes for the NFT_SET_DESC_CONCAT thing. How should we proceed? Do you want me to share those patches so that you can add this bit on top, or should this come first, or in a separate series? I could also just share the new nft/libnftnl patches (I should have them ready between today and tomorrow), and proceed adapting the kernel part according to your changes. Related question: to avoid copying data around, I'm now dynamically allocating a struct nft_data_desc in nf_tables_newset() with a reference from struct nft_set: desc->dlen, desc->klen, desc->size would all live there, together with the "subkey" stuff. Is it a bad idea? I can undo it easily, I just don't know if there's a specific reason why those fields are repeated in struct nft_set. -- Stefano ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH,nf-next RFC 0/2] add NFTA_SET_ELEM_KEY_END 2019-12-02 16:19 ` [PATCH,nf-next RFC 0/2] add NFTA_SET_ELEM_KEY_END Stefano Brivio @ 2019-12-03 11:02 ` Pablo Neira Ayuso 2019-12-03 15:56 ` Stefano Brivio 0 siblings, 1 reply; 13+ messages in thread From: Pablo Neira Ayuso @ 2019-12-03 11:02 UTC (permalink / raw) To: Stefano Brivio; +Cc: netfilter-devel Hi Stefano, On Mon, Dec 02, 2019 at 05:19:52PM +0100, Stefano Brivio wrote: [...] > On Mon, 2 Dec 2019 14:14:05 +0100 > Pablo Neira Ayuso <pablo@netfilter.org> wrote: [...] > > This patchset extends the netlink API to allow to express an interval > > with one single element. > > > > This simplifies this interface since userspace does not need to send two > > independent elements anymore, one of the including the > > NFT_SET_ELEM_INTERVAL_END flag. > > > > The idea is to use the _DESC to specify that userspace speaks the kernel > > that new API representation. In your case, the new description attribute > > that tells that this set contains interval + concatenation implicitly > > tells the kernel that userspace supports for this new API. > > Thanks! I just had a quick look, I think the new set implementation > would indeed look more elegant this way. As to design choices, I'm > afraid I'm not familiar enough with the big picture to comment on the > general idea, but my uninformed opinion agrees with this approach. :) > > For what it's worth, I'd review this in deeper detail next. Thanks. > > If you're fine with this, I can scratch a bit of time to finish the > > libnftnl part. The nft code will need a small update too. You will not > > need to use the nft_set_pipapo object as scratchpad area anymore. > > On my side, I'm almost done with nft/libnftnl/kernel changes for the > NFT_SET_DESC_CONCAT thing. How should we proceed? Do you want me to > share those patches so that you can add this bit on top, or should this > come first, or in a separate series? My suggestion is that you can take them and place them at the beginning of your batch since it will be the first client for this new netlink attribute, you will have to adapt pipapo to use the new key_end value too. > I could also just share the new nft/libnftnl patches (I should have them > ready between today and tomorrow), and proceed adapting the kernel part > according to your changes. I still have to send you the libnftnl part for this. > Related question: to avoid copying data around, I'm now dynamically > allocating a struct nft_data_desc in nf_tables_newset() with a > reference from struct nft_set: desc->dlen, desc->klen, desc->size would > all live there, together with the "subkey" stuff. > > Is it a bad idea? I can undo it easily, I just don't know if there's a > specific reason why those fields are repeated in struct nft_set. Not sure I understand, probably some code sketch? From your words it does not look like a major issue though, but let me know. BTW, there is also one more pending issue: I can see there is a clone point in nft_set_pipapo, you mentioned some problems to make things fit in into the transaction infrastructure. Could you describe how you integrate with it? Probably there is a chance to extend the front-end API too to make it easier for pipapo. Thanks. ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH,nf-next RFC 0/2] add NFTA_SET_ELEM_KEY_END 2019-12-03 11:02 ` Pablo Neira Ayuso @ 2019-12-03 15:56 ` Stefano Brivio 0 siblings, 0 replies; 13+ messages in thread From: Stefano Brivio @ 2019-12-03 15:56 UTC (permalink / raw) To: Pablo Neira Ayuso; +Cc: netfilter-devel On Tue, 3 Dec 2019 12:02:54 +0100 Pablo Neira Ayuso <pablo@netfilter.org> wrote: > Hi Stefano, > > On Mon, Dec 02, 2019 at 05:19:52PM +0100, Stefano Brivio wrote: > [...] > > On Mon, 2 Dec 2019 14:14:05 +0100 > > Pablo Neira Ayuso <pablo@netfilter.org> wrote: > [...] > > > This patchset extends the netlink API to allow to express an interval > > > with one single element. > > > > > > This simplifies this interface since userspace does not need to send two > > > independent elements anymore, one of the including the > > > NFT_SET_ELEM_INTERVAL_END flag. > > > > > > The idea is to use the _DESC to specify that userspace speaks the kernel > > > that new API representation. In your case, the new description attribute > > > that tells that this set contains interval + concatenation implicitly > > > tells the kernel that userspace supports for this new API. > > > > Thanks! I just had a quick look, I think the new set implementation > > would indeed look more elegant this way. As to design choices, I'm > > afraid I'm not familiar enough with the big picture to comment on the > > general idea, but my uninformed opinion agrees with this approach. :) > > > > For what it's worth, I'd review this in deeper detail next. > > Thanks. > > > > If you're fine with this, I can scratch a bit of time to finish the > > > libnftnl part. The nft code will need a small update too. You will not > > > need to use the nft_set_pipapo object as scratchpad area anymore. > > > > On my side, I'm almost done with nft/libnftnl/kernel changes for the > > NFT_SET_DESC_CONCAT thing. How should we proceed? Do you want me to > > share those patches so that you can add this bit on top, or should this > > come first, or in a separate series? > > My suggestion is that you can take them and place them at the > beginning of your batch since it will be the first client for this new > netlink attribute, you will have to adapt pipapo to use the new > key_end value too. Okay, then I'll adjust the NFT_SET_DESC_CONCAT changes on top of your changes. > > I could also just share the new nft/libnftnl patches (I should have them > > ready between today and tomorrow), and proceed adapting the kernel part > > according to your changes. > > I still have to send you the libnftnl part for this. If that's the order of changes you suggest, yes. Anyway, I guess I can already proceed adjusting the kernel part meanwhile. > > Related question: to avoid copying data around, I'm now dynamically > > allocating a struct nft_data_desc in nf_tables_newset() with a > > reference from struct nft_set: desc->dlen, desc->klen, desc->size would > > all live there, together with the "subkey" stuff. > > > > Is it a bad idea? I can undo it easily, I just don't know if there's a > > specific reason why those fields are repeated in struct nft_set. > > Not sure I understand, probably some code sketch? From your words it > does not look like a major issue though, but let me know. Right now, nf_tables_newset() in nf_tables_api.c has: struct nft_set_desc desc; [...] memset(&desc, 0, sizeof(desc)); [...] desc.klen = ntohl(nla_get_be32(nla[NFTA_SET_KEY_LEN])); [and so on] then values from desc are copied one by one into struct nft_set *set, later in the same function: set->klen = desc.klen; set->dlen = desc.dlen; set->size = desc.size; and I don't see the point (well, we avoid one further dereference in e.g. __nft_rbtree_lookup(), but is that worth it?). As I'm adding two fields (field_len[] and field_count) to struct nft_set_desc, I would instead allocate it dynamically: struct nlattr *desc; [...] desc = kzalloc(sizeof(*desc), GFP_KERNEL); fill it: desc->klen = ntohl(nla_get_be32(nla[NFTA_SET_KEY_LEN])); [and so on] and then store a reference to it in struct nft_set *set: set->desc = desc; instead of copying fields one by one, and having their declaration duplicated in struct nft_set and struct nft_set_desc. I can also stick to the current way, that is, add those fields to both struct nft_set and struct nft_set_desc, and copy them. It's mostly a matter of taste, unless there's a specific reason why we shouldn't point to a struct nft_set_desc from struct nft_set. > BTW, there is also one more pending issue: I can see there is a clone > point in nft_set_pipapo, you mentioned some problems to make things > fit in into the transaction infrastructure. Could you describe how you > integrate with it? Currently, there are no problems, and the description of how relevant operations integrate with the API is in the kerneldoc comments for nft_pipapo_activate(), _deactivate(), _flush(), _remove(). While the operations are, per se, conceptually not atomic, atomicity is then guaranteed by the genmask mechanism. The problems would start once we want to add some specific optimisations that rely on the fact there are no (entirely) overlapping entries in current, live matching data. If we could rely on that, and we opt to opportunistically coalesce per-field classifier regions, we can reduce the complexity of some steps. Let's say we have a set matching, separately, two entries: - 192.168.1.1:1024 -> x - 192.168.1.1:2048 -> y then in the classifier region for the IPv4 address field we could naturally have a single rule, that, if matched, would lead to evaluation of the port bits against both 1024 and 2048, which would then map to two different entries. Let's name: - "192.168.1.1" rule #0 for the IPv4 address - "1024" rule #0 for the port (mapping to x) - "2048" rule #1 for the port (mapping to y) Then rule #0 for the IPv4 address activates processing of rule #0 and rule #1 for the port bits, and at each step we select up to one single rule, that results in up to one single match. However, we can have this situation: - 192.168.1.1:1024 -> x (active element reference) - 192.168.1.1:2048 -> y (active element reference) - 192.168.1.1:1024 -> z (inactive element reference) so we need: - "1024" rule #0 for the port (mapping to x) - "2048" rule #1 for the port (mapping to y) - "1024" rule #2 for the port (mapping to z) and once we have evaluated the port field, we might have n > 1 rules selected, so we might need to check multiple elements before returning a match, and this introduces complexity in the fast path. We also need the nft_set_elem_active() check, which is quite expensive. I wouldn't introduce this kind of optimisation right now, mostly because of the complexity it adds for listing and deletion, and I'd rather prefer the basic implementation to be reasonably mature before proceeding with this. In general, I find the current API a bit unnatural. As I mentioned, I would have expected to implement a single insert operation, a single delete operation, and a commit routine, instead of five operations that can't be clearly defined in terms of a typical transaction/commit model. I think that a clear distinction between insert/delete and commit would also more naturally fit the publish/subscribe RCU model. If we had a single commit callback, there couldn't be entirely overlapping entries in live matching data, because if any such entry is present in the commit phase, the commit would fail. On top of that, the flush() operation would actually correspond to its name. > Probably there is a chance to extend the front-end > API too to make it easier for pipapo. I think that any change going in this direction, whether desirable or not, is going to be quite invasive. In particular, I would first try to figure out pros and cons for existing set implementations. Some set implementations would probably benefit from having a completely separated copy of matching data where pending operations are performed. In general, we would be able to skip the nft_set_elem_active() check in lookup functions, which looks expensive in case the NFT_SET_MAP flag is not set, because just for that single check we need to read data quite far away. Inserting elements directly on live data might also result in some amount of false sharing, but I wouldn't comment further on that without some actual test results. In conclusion, I'd rather defer this kind of API change (again, if it even makes sense) to a later time, I would start with some simplified implementation to check how existing set types behave with it, and I don't see any "quick" way to simplify pipapo's operations any further right now. -- Stefano ^ permalink raw reply [flat|nested] 13+ messages in thread
end of thread, other threads:[~2019-12-09 20:44 UTC | newest] Thread overview: 13+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2019-12-02 13:14 [PATCH,nf-next RFC 0/2] add NFTA_SET_ELEM_KEY_END Pablo Neira Ayuso 2019-12-02 13:14 ` [PATCH,nf-next RFC 1/2] netfilter: nf_tables: add nft_setelem_parse_key() Pablo Neira Ayuso 2019-12-05 22:43 ` Stefano Brivio 2019-12-06 19:45 ` Pablo Neira Ayuso 2019-12-07 22:51 ` Stefano Brivio 2019-12-09 20:44 ` Pablo Neira Ayuso 2019-12-02 13:14 ` [PATCH,nf-next RFC 2/2] netfilter: nf_tables: add NFTA_SET_ELEM_KEY_END attribute Pablo Neira Ayuso 2019-12-05 22:44 ` Stefano Brivio 2019-12-06 19:52 ` Pablo Neira Ayuso 2019-12-07 22:52 ` Stefano Brivio 2019-12-02 16:19 ` [PATCH,nf-next RFC 0/2] add NFTA_SET_ELEM_KEY_END Stefano Brivio 2019-12-03 11:02 ` Pablo Neira Ayuso 2019-12-03 15:56 ` Stefano Brivio
This is an external index of several public inboxes, see mirroring instructions on how to clone and mirror all data and code used by this external index.