netfilter-devel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Stefano Brivio <sbrivio@redhat.com>
To: Pablo Neira Ayuso <pablo@netfilter.org>
Cc: netfilter-devel@vger.kernel.org,
	"Florian Westphal" <fw@strlen.de>,
	"Kadlecsik József" <kadlec@blackhole.kfki.hu>,
	"Eric Garver" <eric@garver.life>, "Phil Sutter" <phil@nwl.cc>
Subject: Re: [PATCH nf-next v2 3/8] nf_tables: Add set type for arbitrary concatenation of ranges
Date: Wed, 27 Nov 2019 12:02:49 +0100	[thread overview]
Message-ID: <20191127120249.292d4a69@elisabeth> (raw)
In-Reply-To: <20191127092945.kp25vjfwxcrbjapx@salvia>

On Wed, 27 Nov 2019 10:29:45 +0100
Pablo Neira Ayuso <pablo@netfilter.org> wrote:

> Hi Stefano,
> 
> Just started reading, a few initial questions.
> 
> On Fri, Nov 22, 2019 at 02:40:02PM +0100, Stefano Brivio wrote:
> [...]
>
> > +static int nft_pipapo_insert(const struct net *net, const struct nft_set *set,
> > +			     const struct nft_set_elem *elem,
> > +			     struct nft_set_ext **ext2)
> > +{
> > +	const struct nft_set_ext *ext = nft_set_elem_ext(set, elem->priv);
> > +	const u8 *data = (const u8 *)elem->key.val.data, *start, *end;
> > +	union nft_pipapo_map_bucket rulemap[NFT_PIPAPO_MAX_FIELDS];
> > +	struct nft_pipapo *priv = nft_set_priv(set);
> > +	struct nft_pipapo_match *m = priv->clone;
> > +	struct nft_pipapo_elem *e = elem->priv;
> > +	struct nft_pipapo_field *f;
> > +	int i, bsize_max, err = 0;
> > +	void *dup;
> > +
> > +	dup = nft_pipapo_get(net, set, elem, 0);
> > +	if (PTR_ERR(dup) != -ENOENT) {
> > +		priv->start_elem = NULL;
> > +		if (IS_ERR(dup))
> > +			return PTR_ERR(dup);
> > +		*ext2 = dup;  
> 
> dup should be of nft_set_ext type. I just had a look at
> nft_pipapo_get() and I think this returns nft_pipapo_elem, which is
> almost good, since it contains nft_set_ext, right?

Right, it returns nft_pipapo_elem which contains that.

> I think you also need to check if the object is active in the next
> generation via nft_genmask_next() and nft_set_elem_active(), otherwise
> ignore it.

I guess I should actually do this in nft_pipapo_get(), also because we
don't want to return inactive elements when userspace "gets" them.

I just noticed this is currently inconsistent with the lookup, because
nft_pipapo_lookup() correctly does:

--
next_match:
		b = pipapo_refill(res_map, f->bsize, f->rules, fill_map, f->mt,
				  last);
		if (b < 0) {
			raw_cpu_write(nft_pipapo_scratch_index, map_index);
			local_bh_enable();

			return false;
		}

		if (last) {
			*ext = &f->mt[b].e->ext;
			if (unlikely(nft_set_elem_expired(*ext) ||
				     !nft_set_elem_active(*ext, genmask)))
				goto next_match;
--

but I forgot to implement the same check in pipapo_get():

--
next_match:
		b = pipapo_refill(res_map, f->bsize, f->rules, fill_map, f->mt,
				  last);
		if (b < 0)
			goto out;

		if (last) {
			if (nft_set_elem_expired(&f->mt[b].e->ext))
				goto next_match;
--

this check should simply include || !nft_set_elem_active(...), and then
I wouldn't need any further check in nft_pipapo_init(). I'd fix this in
v3.

I'm actually not sure if I need to report these elements to
nft_pipapo_remove(). If it's needed, I would add some kind of
"get_inactive" flag to pipapo_get(), which is true on the call from
nft_pipapo_remove(), and false on other paths. If the flag is true, the
nft_set_elem_active() check is then skipped.

> Note that the datastructure needs to temporarily deal with duplicates,
> ie. one inactive object (just deleted) and one active object (just
> added) for the next generation.

Yes, this is taken care of (except for the problem described above),
specifically, there can be n inactive objects, and a single active
object that are entirely overlapping.

This makes some optimisations harder to implement, namely, step 5.2.1
from:
	https://pipapo.lameexcu.se/pipapo/tree/pipapo.c#n337

because we need to allow entirely overlapping entries and map them to
possibly distinct elements.

Now, I think this would all be easier if the API implemented
transactions and commit in a way that appears more natural to me.

When I started working on this, I initially thought activate() would be
called once per transaction, not per element, so that insert() and
remove() would add or remove elements pending for a given transaction,
and activate() would commit it. Same for flush().

At that point, we would have a copy of lookup data with pending
insertions and without pending deletions, and on transaction commit,
this copy would become active, with no inactive elements into it.
Hence, no overlapping elements in live data.

This way we could also make transactions atomic. If activate() is
called once for each element in the transaction, that can't be atomic.

I plan to work on this (if it makes sense), but it looks rather
complicated to match this with existing set implementations and
especially current UAPI, that's the main reason why I "worked around"
this aspect for the moment being. I guess that having at least one set
implementation that can play along with this model would help later.

> > +		return -EEXIST;
> > +	}
> > +
> > +	if (!nft_set_ext_exists(ext, NFT_SET_EXT_FLAGS) ||
> > +	    !(*nft_set_ext_flags(ext) & NFT_SET_ELEM_INTERVAL_END)) {
> > +		priv->start_elem = e;
> > +		*ext2 = &e->ext;
> > +		memcpy(priv->start_data, data, priv->width);
> > +		return 0;
> > +	}
> > +
> > +	if (!priv->start_elem)
> > +		return -EINVAL;  
> 
> I'm working on a sketch patch to extend the front-end code to make
> this easier for you, will post it asap, so you don't need this special
> handling to collect both ends of the interval.

Nice, thanks. Mind that I think this is actually a bit ugly but fine.
As I was mentioning to Florian, it doesn't present any particular race
with bad consequences (at least in v2).

Right now I was trying to get the NFTA_SET_DESC_CONCAT >
NFTA_LIST_ELEM > NFTA_SET_FIELD_LEN nesting implemented in libnftnl in
a somewhat acceptable way. Let me know if the front-end changes would
affect this significantly, I'll wait for your patch in that case.

> So far, just spend a bit of time on this, will get back to you with
> more feedback.
> 
> Thanks for working on this!

And thanks for reviewing it!

-- 
Stefano


  reply	other threads:[~2019-11-27 11:03 UTC|newest]

Thread overview: 24+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-11-22 13:39 [PATCH nf-next v2 0/8] nftables: Set implementation for arbitrary concatenation of ranges Stefano Brivio
2019-11-22 13:40 ` [PATCH nf-next v2 1/8] netfilter: nf_tables: Support for subkeys, set with multiple ranged fields Stefano Brivio
2019-11-23 20:01   ` Pablo Neira Ayuso
2019-11-25  9:30     ` Stefano Brivio
2019-11-25  9:58       ` Pablo Neira Ayuso
2019-11-25 13:26         ` Stefano Brivio
2019-11-25 14:30           ` Pablo Neira Ayuso
2019-11-25 14:54             ` Stefano Brivio
2019-11-25 20:38               ` Pablo Neira Ayuso
2019-11-22 13:40 ` [PATCH nf-next v2 2/8] bitmap: Introduce bitmap_cut(): cut bits and shift remaining Stefano Brivio
2019-11-22 13:40 ` [PATCH nf-next v2 3/8] nf_tables: Add set type for arbitrary concatenation of ranges Stefano Brivio
2019-11-27  9:29   ` Pablo Neira Ayuso
2019-11-27 11:02     ` Stefano Brivio [this message]
2019-11-27 18:29       ` Pablo Neira Ayuso
2019-11-22 13:40 ` [PATCH nf-next v2 4/8] selftests: netfilter: Introduce tests for sets with range concatenation Stefano Brivio
2019-11-22 13:40 ` [PATCH nf-next v2 5/8] nft_set_pipapo: Provide unrolled lookup loops for common field sizes Stefano Brivio
2019-11-22 13:40 ` [PATCH nf-next v2 6/8] nft_set_pipapo: Prepare for vectorised implementation: alignment Stefano Brivio
2019-11-22 13:40 ` [PATCH nf-next v2 7/8] nft_set_pipapo: Prepare for vectorised implementation: helpers Stefano Brivio
2019-11-22 13:40 ` [PATCH nf-next v2 8/8] nft_set_pipapo: Introduce AVX2-based lookup implementation Stefano Brivio
2019-11-26  6:36   ` kbuild test robot
2019-11-23 20:05 ` [PATCH nf-next v2 0/8] nftables: Set implementation for arbitrary concatenation of ranges Pablo Neira Ayuso
2019-11-25  9:31   ` Stefano Brivio
2019-11-25 10:02     ` Pablo Neira Ayuso
2019-11-25 13:36       ` Stefano Brivio

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20191127120249.292d4a69@elisabeth \
    --to=sbrivio@redhat.com \
    --cc=eric@garver.life \
    --cc=fw@strlen.de \
    --cc=kadlec@blackhole.kfki.hu \
    --cc=netfilter-devel@vger.kernel.org \
    --cc=pablo@netfilter.org \
    --cc=phil@nwl.cc \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).