All of lore.kernel.org
 help / color / mirror / Atom feed
* [nft PATCH] libnftables: Fix for multiple context instances
@ 2017-11-16 19:10 Phil Sutter
  2017-11-20 12:37 ` Pablo Neira Ayuso
  0 siblings, 1 reply; 28+ messages in thread
From: Phil Sutter @ 2017-11-16 19:10 UTC (permalink / raw)
  To: Pablo Neira Ayuso; +Cc: netfilter-devel

If a second context is created, the second call to nft_ctx_free() leads
to freeing invalid pointers in nft_exit(). Fix this by introducing a
reference counter so that neither nft_init() nor nft_exit() run more
than once in a row.

This reference counter has to be protected from parallel access. Do this
using a mutex in a way that once nft_init() returns, the first call to
that function running in parallel is guaranteed to be finished -
otherwise it could happen that things being initialized in one thread
are already accessed in another one.

Signed-off-by: Phil Sutter <phil@nwl.cc>
---
 src/libnftables.c | 20 ++++++++++++++++++++
 1 file changed, 20 insertions(+)

diff --git a/src/libnftables.c b/src/libnftables.c
index e8fa6742f7d17..9df9658930c39 100644
--- a/src/libnftables.c
+++ b/src/libnftables.c
@@ -14,6 +14,7 @@
 #include <iface.h>
 
 #include <errno.h>
+#include <pthread.h>
 #include <stdlib.h>
 #include <string.h>
 
@@ -102,8 +103,16 @@ err1:
 	return ret;
 }
 
+static int nft_refcnt;
+static pthread_mutex_t nft_refcnt_mutex = PTHREAD_MUTEX_INITIALIZER;
+
 static void nft_init(void)
 {
+	pthread_mutex_lock(&nft_refcnt_mutex);
+
+	if (nft_refcnt++)
+		goto unlock;
+
 	mark_table_init();
 	realm_table_rt_init();
 	devgroup_table_init();
@@ -113,15 +122,26 @@ static void nft_init(void)
 #ifdef HAVE_LIBXTABLES
 	xt_init();
 #endif
+
+unlock:
+	pthread_mutex_unlock(&nft_refcnt_mutex);
 }
 
 static void nft_exit(void)
 {
+	pthread_mutex_lock(&nft_refcnt_mutex);
+
+	if (--nft_refcnt)
+		goto unlock;
+
 	ct_label_table_exit();
 	realm_table_rt_exit();
 	devgroup_table_exit();
 	realm_table_meta_exit();
 	mark_table_exit();
+
+unlock:
+	pthread_mutex_unlock(&nft_refcnt_mutex);
 }
 
 int nft_ctx_add_include_path(struct nft_ctx *ctx, const char *path)
-- 
2.13.1


^ permalink raw reply related	[flat|nested] 28+ messages in thread

* Re: [nft PATCH] libnftables: Fix for multiple context instances
  2017-11-16 19:10 [nft PATCH] libnftables: Fix for multiple context instances Phil Sutter
@ 2017-11-20 12:37 ` Pablo Neira Ayuso
  2017-11-20 12:54   ` Phil Sutter
  0 siblings, 1 reply; 28+ messages in thread
From: Pablo Neira Ayuso @ 2017-11-20 12:37 UTC (permalink / raw)
  To: Phil Sutter; +Cc: netfilter-devel

On Thu, Nov 16, 2017 at 08:10:24PM +0100, Phil Sutter wrote:
> If a second context is created, the second call to nft_ctx_free() leads
> to freeing invalid pointers in nft_exit(). Fix this by introducing a
> reference counter so that neither nft_init() nor nft_exit() run more
> than once in a row.
> 
> This reference counter has to be protected from parallel access. Do this
> using a mutex in a way that once nft_init() returns, the first call to
> that function running in parallel is guaranteed to be finished -
> otherwise it could happen that things being initialized in one thread
> are already accessed in another one.

I would prefer this table is placed into the context object, so they
are not global anymore. So I would prefer we fix this the right way(tm).

Let me know your thoughts, thanks!

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [nft PATCH] libnftables: Fix for multiple context instances
  2017-11-20 12:37 ` Pablo Neira Ayuso
@ 2017-11-20 12:54   ` Phil Sutter
  2017-11-20 13:07     ` Pablo Neira Ayuso
  0 siblings, 1 reply; 28+ messages in thread
From: Phil Sutter @ 2017-11-20 12:54 UTC (permalink / raw)
  To: Pablo Neira Ayuso; +Cc: netfilter-devel

On Mon, Nov 20, 2017 at 01:37:26PM +0100, Pablo Neira Ayuso wrote:
> On Thu, Nov 16, 2017 at 08:10:24PM +0100, Phil Sutter wrote:
> > If a second context is created, the second call to nft_ctx_free() leads
> > to freeing invalid pointers in nft_exit(). Fix this by introducing a
> > reference counter so that neither nft_init() nor nft_exit() run more
> > than once in a row.
> > 
> > This reference counter has to be protected from parallel access. Do this
> > using a mutex in a way that once nft_init() returns, the first call to
> > that function running in parallel is guaranteed to be finished -
> > otherwise it could happen that things being initialized in one thread
> > are already accessed in another one.
> 
> I would prefer this table is placed into the context object, so they
> are not global anymore. So I would prefer we fix this the right way(tm).
> 
> Let me know your thoughts, thanks!

I don't think that's feasible: Looking at 'mark_tbl' for instance, this is
accessed from callbacks in 'mark_type', so we would have to make nft_ctx
accessible by all functions dealing with datatypes.

Cheers, Phil

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [nft PATCH] libnftables: Fix for multiple context instances
  2017-11-20 12:54   ` Phil Sutter
@ 2017-11-20 13:07     ` Pablo Neira Ayuso
  2017-11-20 15:58       ` Phil Sutter
  0 siblings, 1 reply; 28+ messages in thread
From: Pablo Neira Ayuso @ 2017-11-20 13:07 UTC (permalink / raw)
  To: Phil Sutter, netfilter-devel

On Mon, Nov 20, 2017 at 01:54:18PM +0100, Phil Sutter wrote:
> On Mon, Nov 20, 2017 at 01:37:26PM +0100, Pablo Neira Ayuso wrote:
> > On Thu, Nov 16, 2017 at 08:10:24PM +0100, Phil Sutter wrote:
> > > If a second context is created, the second call to nft_ctx_free() leads
> > > to freeing invalid pointers in nft_exit(). Fix this by introducing a
> > > reference counter so that neither nft_init() nor nft_exit() run more
> > > than once in a row.
> > > 
> > > This reference counter has to be protected from parallel access. Do this
> > > using a mutex in a way that once nft_init() returns, the first call to
> > > that function running in parallel is guaranteed to be finished -
> > > otherwise it could happen that things being initialized in one thread
> > > are already accessed in another one.
> > 
> > I would prefer this table is placed into the context object, so they
> > are not global anymore. So I would prefer we fix this the right way(tm).
> > 
> > Let me know your thoughts, thanks!
> 
> I don't think that's feasible: Looking at 'mark_tbl' for instance, this is
> accessed from callbacks in 'mark_type', so we would have to make nft_ctx
> accessible by all functions dealing with datatypes.

Probably some specific new context object that wrap access to these
tables, not necessarily nft_ctx.

It's just more code, it will not be a small patch, but I don't see any
reason this can't be done.

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [nft PATCH] libnftables: Fix for multiple context instances
  2017-11-20 13:07     ` Pablo Neira Ayuso
@ 2017-11-20 15:58       ` Phil Sutter
  2017-11-20 16:53         ` Pablo Neira Ayuso
  0 siblings, 1 reply; 28+ messages in thread
From: Phil Sutter @ 2017-11-20 15:58 UTC (permalink / raw)
  To: Pablo Neira Ayuso; +Cc: netfilter-devel

On Mon, Nov 20, 2017 at 02:07:32PM +0100, Pablo Neira Ayuso wrote:
> On Mon, Nov 20, 2017 at 01:54:18PM +0100, Phil Sutter wrote:
> > On Mon, Nov 20, 2017 at 01:37:26PM +0100, Pablo Neira Ayuso wrote:
> > > On Thu, Nov 16, 2017 at 08:10:24PM +0100, Phil Sutter wrote:
> > > > If a second context is created, the second call to nft_ctx_free() leads
> > > > to freeing invalid pointers in nft_exit(). Fix this by introducing a
> > > > reference counter so that neither nft_init() nor nft_exit() run more
> > > > than once in a row.
> > > > 
> > > > This reference counter has to be protected from parallel access. Do this
> > > > using a mutex in a way that once nft_init() returns, the first call to
> > > > that function running in parallel is guaranteed to be finished -
> > > > otherwise it could happen that things being initialized in one thread
> > > > are already accessed in another one.
> > > 
> > > I would prefer this table is placed into the context object, so they
> > > are not global anymore. So I would prefer we fix this the right way(tm).
> > > 
> > > Let me know your thoughts, thanks!
> > 
> > I don't think that's feasible: Looking at 'mark_tbl' for instance, this is
> > accessed from callbacks in 'mark_type', so we would have to make nft_ctx
> > accessible by all functions dealing with datatypes.
> 
> Probably some specific new context object that wrap access to these
> tables, not necessarily nft_ctx.
> 
> It's just more code, it will not be a small patch, but I don't see any
> reason this can't be done.

Yes, this is my guess as well. Though for what benefit? I don't think
having this logic for global run-time data is bad per se. What is it
that you don't like about it?

Cheers, Phil

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [nft PATCH] libnftables: Fix for multiple context instances
  2017-11-20 15:58       ` Phil Sutter
@ 2017-11-20 16:53         ` Pablo Neira Ayuso
  2017-11-22 17:49           ` Phil Sutter
  0 siblings, 1 reply; 28+ messages in thread
From: Pablo Neira Ayuso @ 2017-11-20 16:53 UTC (permalink / raw)
  To: Phil Sutter, netfilter-devel

On Mon, Nov 20, 2017 at 04:58:22PM +0100, Phil Sutter wrote:
> On Mon, Nov 20, 2017 at 02:07:32PM +0100, Pablo Neira Ayuso wrote:
> > On Mon, Nov 20, 2017 at 01:54:18PM +0100, Phil Sutter wrote:
> > > On Mon, Nov 20, 2017 at 01:37:26PM +0100, Pablo Neira Ayuso wrote:
> > > > On Thu, Nov 16, 2017 at 08:10:24PM +0100, Phil Sutter wrote:
> > > > > If a second context is created, the second call to nft_ctx_free() leads
> > > > > to freeing invalid pointers in nft_exit(). Fix this by introducing a
> > > > > reference counter so that neither nft_init() nor nft_exit() run more
> > > > > than once in a row.
> > > > > 
> > > > > This reference counter has to be protected from parallel access. Do this
> > > > > using a mutex in a way that once nft_init() returns, the first call to
> > > > > that function running in parallel is guaranteed to be finished -
> > > > > otherwise it could happen that things being initialized in one thread
> > > > > are already accessed in another one.
> > > > 
> > > > I would prefer this table is placed into the context object, so they
> > > > are not global anymore. So I would prefer we fix this the right way(tm).
> > > > 
> > > > Let me know your thoughts, thanks!
> > > 
> > > I don't think that's feasible: Looking at 'mark_tbl' for instance, this is
> > > accessed from callbacks in 'mark_type', so we would have to make nft_ctx
> > > accessible by all functions dealing with datatypes.
> > 
> > Probably some specific new context object that wrap access to these
> > tables, not necessarily nft_ctx.
> > 
> > It's just more code, it will not be a small patch, but I don't see any
> > reason this can't be done.
> 
> Yes, this is my guess as well. Though for what benefit? I don't think
> having this logic for global run-time data is bad per se. What is it
> that you don't like about it?

This is breaking the assumption that releasing the context object will
be clearing all state behind us.

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [nft PATCH] libnftables: Fix for multiple context instances
  2017-11-20 16:53         ` Pablo Neira Ayuso
@ 2017-11-22 17:49           ` Phil Sutter
  2017-11-22 18:18             ` Pablo Neira Ayuso
       [not found]             ` <20171204100955.GA1822@salvia>
  0 siblings, 2 replies; 28+ messages in thread
From: Phil Sutter @ 2017-11-22 17:49 UTC (permalink / raw)
  To: Pablo Neira Ayuso; +Cc: netfilter-devel

On Mon, Nov 20, 2017 at 05:53:13PM +0100, Pablo Neira Ayuso wrote:
> On Mon, Nov 20, 2017 at 04:58:22PM +0100, Phil Sutter wrote:
> > On Mon, Nov 20, 2017 at 02:07:32PM +0100, Pablo Neira Ayuso wrote:
> > > On Mon, Nov 20, 2017 at 01:54:18PM +0100, Phil Sutter wrote:
> > > > On Mon, Nov 20, 2017 at 01:37:26PM +0100, Pablo Neira Ayuso wrote:
> > > > > On Thu, Nov 16, 2017 at 08:10:24PM +0100, Phil Sutter wrote:
> > > > > > If a second context is created, the second call to nft_ctx_free() leads
> > > > > > to freeing invalid pointers in nft_exit(). Fix this by introducing a
> > > > > > reference counter so that neither nft_init() nor nft_exit() run more
> > > > > > than once in a row.
> > > > > > 
> > > > > > This reference counter has to be protected from parallel access. Do this
> > > > > > using a mutex in a way that once nft_init() returns, the first call to
> > > > > > that function running in parallel is guaranteed to be finished -
> > > > > > otherwise it could happen that things being initialized in one thread
> > > > > > are already accessed in another one.
> > > > > 
> > > > > I would prefer this table is placed into the context object, so they
> > > > > are not global anymore. So I would prefer we fix this the right way(tm).
> > > > > 
> > > > > Let me know your thoughts, thanks!
> > > > 
> > > > I don't think that's feasible: Looking at 'mark_tbl' for instance, this is
> > > > accessed from callbacks in 'mark_type', so we would have to make nft_ctx
> > > > accessible by all functions dealing with datatypes.
> > > 
> > > Probably some specific new context object that wrap access to these
> > > tables, not necessarily nft_ctx.
> > > 
> > > It's just more code, it will not be a small patch, but I don't see any
> > > reason this can't be done.
> > 
> > Yes, this is my guess as well. Though for what benefit? I don't think
> > having this logic for global run-time data is bad per se. What is it
> > that you don't like about it?
> 
> This is breaking the assumption that releasing the context object will
> be clearing all state behind us.

Hmm. Does that matter here? I don't think it makes sense to make the
generated iproute2 tables context-local data. And I don't even see a way
to make gmp_init() and xt_init() apply per context only.

The effort of implementing this is quite high as well, since the table
pointers would have to be passed down to all places where datatypes are
parsed or printed.

Cheers, Phil

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [nft PATCH] libnftables: Fix for multiple context instances
  2017-11-22 17:49           ` Phil Sutter
@ 2017-11-22 18:18             ` Pablo Neira Ayuso
       [not found]             ` <20171204100955.GA1822@salvia>
  1 sibling, 0 replies; 28+ messages in thread
From: Pablo Neira Ayuso @ 2017-11-22 18:18 UTC (permalink / raw)
  To: Phil Sutter, netfilter-devel

Hi Phil,

On Wed, Nov 22, 2017 at 06:49:43PM +0100, Phil Sutter wrote:
> On Mon, Nov 20, 2017 at 05:53:13PM +0100, Pablo Neira Ayuso wrote:
> > On Mon, Nov 20, 2017 at 04:58:22PM +0100, Phil Sutter wrote:
> > > On Mon, Nov 20, 2017 at 02:07:32PM +0100, Pablo Neira Ayuso wrote:
> > > > On Mon, Nov 20, 2017 at 01:54:18PM +0100, Phil Sutter wrote:
> > > > > On Mon, Nov 20, 2017 at 01:37:26PM +0100, Pablo Neira Ayuso wrote:
> > > > > > On Thu, Nov 16, 2017 at 08:10:24PM +0100, Phil Sutter wrote:
> > > > > > > If a second context is created, the second call to nft_ctx_free() leads
> > > > > > > to freeing invalid pointers in nft_exit(). Fix this by introducing a
> > > > > > > reference counter so that neither nft_init() nor nft_exit() run more
> > > > > > > than once in a row.
> > > > > > > 
> > > > > > > This reference counter has to be protected from parallel access. Do this
> > > > > > > using a mutex in a way that once nft_init() returns, the first call to
> > > > > > > that function running in parallel is guaranteed to be finished -
> > > > > > > otherwise it could happen that things being initialized in one thread
> > > > > > > are already accessed in another one.
> > > > > > 
> > > > > > I would prefer this table is placed into the context object, so they
> > > > > > are not global anymore. So I would prefer we fix this the right way(tm).
> > > > > > 
> > > > > > Let me know your thoughts, thanks!
> > > > > 
> > > > > I don't think that's feasible: Looking at 'mark_tbl' for instance, this is
> > > > > accessed from callbacks in 'mark_type', so we would have to make nft_ctx
> > > > > accessible by all functions dealing with datatypes.
> > > > 
> > > > Probably some specific new context object that wrap access to these
> > > > tables, not necessarily nft_ctx.
> > > > 
> > > > It's just more code, it will not be a small patch, but I don't see any
> > > > reason this can't be done.
> > > 
> > > Yes, this is my guess as well. Though for what benefit? I don't think
> > > having this logic for global run-time data is bad per se. What is it
> > > that you don't like about it?
> > 
> > This is breaking the assumption that releasing the context object will
> > be clearing all state behind us.
> 
> Hmm. Does that matter here? I don't think it makes sense to make the
> generated iproute2 tables context-local data. And I don't even see a way
> to make gmp_init() and xt_init() apply per context only.

Actually, there's no real internal state neither in gmp_init() nor in
xt_init(), so we could be just a one-time initialization things, once
and never ever again.

Moreover, the existing behaviour allows simple API clients to refresh
context on demand, via poor-man release context object and alloc it
again. Actually, we could add a function to force an context update,
so we re-parse all those files again and get fresh context.

> The effort of implementing this is quite high as well, since the table
> pointers would have to be passed down to all places where datatypes are
> parsed or printed.

I understand, but high? I'm seeing a bunch of datatype_lookup() and
datatype_print() calls, and we already have context objects all over
the place to propagate this.

BTW, do you have a usecase to make this library thread safe now?  This
is just going to hit the single nfnl mutex in the kernel side, so
there is not much gain in having multiple threads sending commands to
the kernel. We can just leave this task to some Outreachy intern if
you're busy now ;-). We've been doing quite a bit of work so far is to
de-global everything, we could have just started by adding this mutex
since the beginning. So we're almost there.

Anyway, looking at more essencial stuff, probably it's time to have a
look at the batch mode, this one-command per call is expensive.

^ permalink raw reply	[flat|nested] 28+ messages in thread

* libnftables extended API proposal (Was: Re: [nft PATCH] libnftables: Fix for multiple context instances)
       [not found]                     ` <20171204184604.GA1556@salvia>
@ 2017-12-05 13:43                       ` Phil Sutter
  2017-12-07  0:05                         ` Pablo Neira Ayuso
  0 siblings, 1 reply; 28+ messages in thread
From: Phil Sutter @ 2017-12-05 13:43 UTC (permalink / raw)
  To: Pablo Neira Ayuso; +Cc: netfilter-devel, Florian Westphal

Hi Pablo,

Since I was about to start explaining my extended API idea as part of my
reply, let's take this on-list and I'll give a full overview.

On Mon, Dec 04, 2017 at 07:46:04PM +0100, Pablo Neira Ayuso wrote:
[...]
> Kernel code to check if an element is exists is already upstream, it's
> in current -rc2. And I have a patch here that I'm finishing to add a
> command to do something like:
> 
> # nft get element x y { 1.1.1.1 }
> 
> to see if element 1.1.1.1 is there. You can check for several elements
> in one go. For intervals, you can check both if an element exists in an
> interval and if an interval exists itself.

Looks good!

> Is that what you need? It would be a matter of offering an API for
> this.

It was just an example. The problem with simple API is that for any
task, a command string has to be created and the output parsed again.
Actually, there are good reasons to just call 'nft' instead:

- Debugging is a bit easier since behaviour is fully defined by the
  command string. In simple API, one has to look at settings of nft
  context object as well to determine behaviour.
- For reading output, popen() is a bit simpler to use than fmemopen().
  Also, simple API might lose messages due to bugs in code (missing
  nft_print() conversion) - whatever nft binary prints is caught by
  popen().
- Success/fail state of command is clearly defined by nft exit code, no
  need to check for presence of erec messages or similar "hacks" to find
  out.

> > I have a draft for this extended API and have been working on a PoC to
> > sched a light on the biggest shortcomings. Expect me to start upstream
> > discussion soon. :)
> 
> Great.

My "vision" for an extended API which actually provides an additional
benefit is something that allows to work with the entities nft language
defines in an abstract manner, ideally without having to invoke the
parser all the time.

Naturally, nftables entities are hierarchical: rules are contained in
chains, chains contained in tables, etc. At the topmost level, there is
something I call 'ruleset', which is basically just an instance of
struct nft_cache. Since we have that in nft context already, it was
optimized out (for now at least). As a leftover, I have a function which
does a cache update (although this might be done implicitly as well).

For each entity contained in the ruleset, I wrote two functions, lookup
and create, to reference them later. Due to the hierarchical layout,
both functions take the higher-level entity as an argument. For
instance:

| struct nft_table *nft_table_lookup(struct nft_ctx *nft,
| 				     unsigned int family,
| 				     const char *name);
| struct nft_chain *nft_chain_new(struct nft_ctx *nft,
| 				  struct nft_table *table,
| 				  const char *name);

Family and name are enough to uniquely identify a table. By passing the
returned object to the second function and a name, a new chain in that
table can be created - or more precisely, a command (struct cmd
instance) is created and stored in a new field of struct nft_ctx for
later, when calling:

| int nft_ruleset_commit(struct nft_ctx *nft);

This constructs a new batch job using the previously created commands
and calls netlink_batch_send().

The entities I've defined so far are:

struct nft_table;
struct nft_chain;
struct nft_rule;
struct nft_set;
struct nft_expr; /* actually this should be setelem */

The implementation is very incomplete and merely a playground at this
point. I started with using the parser for everything, then tried to
eliminate as much as possible. E.g. the first version to add an element
to a set looked roughly like this (pseudo-code):

| int nft_set_add_element(struct nft_ctx *nft, struct nft_set *set,
| 			  const char *elem)
| {
| 	char buf[1024];
| 
|	sprintf(buf, "add element ip t %s %s", set->name, elem);
|	scanner_push_buffer(scanner, &indesc_cmdline, buf);
|	nft_parse(nft, scanner, &state);
|	list_splice_tail(&state.cmds, &nft->cmds);
| }

After tweaking the parser a bit, I can use it now to parse just a
set_list_member_expr and use the struct expr it returns. This made it
possible to create the desired struct cmd in above function without
having to invoke the parser there.

Exercising this refining consequently should allow to reach arbitrary
levels of granularity. For instance, one could stop at statement level,
i.e. statements are created using a string representation. Or one could
go down to expression level, and statements are created using one or two
expressions (depending on whether it is relational or not). Of course
this means the library will eventually become as complicated as the
parser itself, not necessarily a good thing.

On the other hand, having an abstract representation for set elements is
quite convenient - their string representations might differ (take e.g.
"22" vs. "ssh") so strcmp() is not sufficient to compare them.

I hope this allows you to get an idea of how I imagine extended API
although certainly details are missing here. What do you think about it?
Are you fine with the general concept so we can discuss details or do
you see a fundamental problem with it?

Cheers, Phil

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: libnftables extended API proposal (Was: Re: [nft PATCH] libnftables: Fix for multiple context instances)
  2017-12-05 13:43                       ` libnftables extended API proposal (Was: Re: [nft PATCH] libnftables: Fix for multiple context instances) Phil Sutter
@ 2017-12-07  0:05                         ` Pablo Neira Ayuso
  2017-12-07 11:34                           ` Phil Sutter
  0 siblings, 1 reply; 28+ messages in thread
From: Pablo Neira Ayuso @ 2017-12-07  0:05 UTC (permalink / raw)
  To: Phil Sutter, netfilter-devel, Florian Westphal

Hi Phil,

On Tue, Dec 05, 2017 at 02:43:17PM +0100, Phil Sutter wrote:
[...]
> My "vision" for an extended API which actually provides an additional
> benefit is something that allows to work with the entities nft language
> defines in an abstract manner, ideally without having to invoke the
> parser all the time.
>
> Naturally, nftables entities are hierarchical: rules are contained in
> chains, chains contained in tables, etc. At the topmost level, there is
> something I call 'ruleset', which is basically just an instance of
> struct nft_cache. Since we have that in nft context already, it was
> optimized out (for now at least). As a leftover, I have a function which
> does a cache update (although this might be done implicitly as well).
> 
> For each entity contained in the ruleset, I wrote two functions, lookup
> and create, to reference them later. Due to the hierarchical layout,
> both functions take the higher-level entity as an argument. For
> instance:
> 
> | struct nft_table *nft_table_lookup(struct nft_ctx *nft,
> | 				     unsigned int family,
> | 				     const char *name);
> | struct nft_chain *nft_chain_new(struct nft_ctx *nft,
> | 				  struct nft_table *table,
> | 				  const char *name);
> 
> Family and name are enough to uniquely identify a table. By passing the
> returned object to the second function and a name, a new chain in that
> table can be created - or more precisely, a command (struct cmd
> instance) is created and stored in a new field of struct nft_ctx for
> later, when calling:
> 
> | int nft_ruleset_commit(struct nft_ctx *nft);
> 
> This constructs a new batch job using the previously created commands
> and calls netlink_batch_send().
> 
> The entities I've defined so far are:
> 
> struct nft_table;
> struct nft_chain;
> struct nft_rule;
> struct nft_set;
> struct nft_expr; /* actually this should be setelem */
> 
> The implementation is very incomplete and merely a playground at this
> point. I started with using the parser for everything, then tried to
> eliminate as much as possible. E.g. the first version to add an element
> to a set looked roughly like this (pseudo-code):
> 
> | int nft_set_add_element(struct nft_ctx *nft, struct nft_set *set,
> | 			  const char *elem)
> | {
> | 	char buf[1024];
> | 
> |	sprintf(buf, "add element ip t %s %s", set->name, elem);
> |	scanner_push_buffer(scanner, &indesc_cmdline, buf);
> |	nft_parse(nft, scanner, &state);
> |	list_splice_tail(&state.cmds, &nft->cmds);
> | }
> 
> After tweaking the parser a bit, I can use it now to parse just a
> set_list_member_expr and use the struct expr it returns. This made it
> possible to create the desired struct cmd in above function without
> having to invoke the parser there.
> 
> Exercising this refining consequently should allow to reach arbitrary
> levels of granularity. For instance, one could stop at statement level,
> i.e. statements are created using a string representation. Or one could
> go down to expression level, and statements are created using one or two
> expressions (depending on whether it is relational or not). Of course
> this means the library will eventually become as complicated as the
> parser itself, not necessarily a good thing.

Yes, and we'll expose all internal representation details, that we
will need to maintain forever if we don't want to break backward.

> On the other hand, having an abstract representation for set elements is
> quite convenient - their string representations might differ (take e.g.
> "22" vs. "ssh") so strcmp() is not sufficient to compare them.
> 
> I hope this allows you to get an idea of how I imagine extended API
> although certainly details are missing here. What do you think about it?
> Are you fine with the general concept so we can discuss details or do
> you see a fundamental problem with it?

OK, my understanding is that you would like to operate with some
native library object representation.

Most objects (table, chain...) are easy to represent, as you
mentioned. Rules are the most complex ones internally, but you can
probably abstract a simplified representation that suits well for your
usecases, e.g expose them in an iptables like representation -
something like adding matches and actions - Obviously, this needs to
allow to take sets as input, eg.

        int meta_match_immediate(struct nft_rule *r, enum nft_meta_type, void *data);
        int meta_match_set(struct nft_rule *r, enum nft_meta_type, struct nft_set *set);

meta_match_immediate() adds a meta + cmp to the rule, to compare for
an immediate value. meta_match_set() adds meta + lookup.

A list of use-cases, for the third party application, would be good to
have to design this API.

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: libnftables extended API proposal (Was: Re: [nft PATCH] libnftables: Fix for multiple context instances)
  2017-12-07  0:05                         ` Pablo Neira Ayuso
@ 2017-12-07 11:34                           ` Phil Sutter
  2017-12-10 21:55                             ` Pablo Neira Ayuso
  0 siblings, 1 reply; 28+ messages in thread
From: Phil Sutter @ 2017-12-07 11:34 UTC (permalink / raw)
  To: Pablo Neira Ayuso; +Cc: netfilter-devel, Florian Westphal

Hi Pablo,

On Thu, Dec 07, 2017 at 01:05:45AM +0100, Pablo Neira Ayuso wrote:
> On Tue, Dec 05, 2017 at 02:43:17PM +0100, Phil Sutter wrote:
[...]
> > After tweaking the parser a bit, I can use it now to parse just a
> > set_list_member_expr and use the struct expr it returns. This made it
> > possible to create the desired struct cmd in above function without
> > having to invoke the parser there.
> > 
> > Exercising this refining consequently should allow to reach arbitrary
> > levels of granularity. For instance, one could stop at statement level,
> > i.e. statements are created using a string representation. Or one could
> > go down to expression level, and statements are created using one or two
> > expressions (depending on whether it is relational or not). Of course
> > this means the library will eventually become as complicated as the
> > parser itself, not necessarily a good thing.
> 
> Yes, and we'll expose all internal representation details, that we
> will need to maintain forever if we don't want to break backward.

Not necessarily. I had this in mind when declaring 'struct nft_table'
instead of reusing 'struct table'. :)

The parser defines the grammar, the library would just follow it. So if
a given internal change complies with the old grammar, it should comply
with the library as well. Though this is quite theoretical, of course.

Let's take relational expressions as simple example: In bison, we define
'expr op rhs_expr'. An equivalent library function could be:

| struct nft_expr *nft_relational_new(struct nft_expr *,
| 				      enum rel_ops,
| 				      struct nft_expr *);

What is allowed in rhs_expr may change internally without breaking ABI
or the parser-defined language.

Can you think of a problematic situation? My view is probably a bit
rose-coloured. ;)

> > On the other hand, having an abstract representation for set elements is
> > quite convenient - their string representations might differ (take e.g.
> > "22" vs. "ssh") so strcmp() is not sufficient to compare them.
> > 
> > I hope this allows you to get an idea of how I imagine extended API
> > although certainly details are missing here. What do you think about it?
> > Are you fine with the general concept so we can discuss details or do
> > you see a fundamental problem with it?
> 
> OK, my understanding is that you would like to operate with some
> native library object representation.
> 
> Most objects (table, chain...) are easy to represent, as you
> mentioned. Rules are the most complex ones internally, but you can
> probably abstract a simplified representation that suits well for your
> usecases, e.g expose them in an iptables like representation -
> something like adding matches and actions - Obviously, this needs to
> allow to take sets as input, eg.
> 
>         int meta_match_immediate(struct nft_rule *r, enum nft_meta_type, void *data);
>         int meta_match_set(struct nft_rule *r, enum nft_meta_type, struct nft_set *set);
> 
> meta_match_immediate() adds a meta + cmp to the rule, to compare for
> an immediate value. meta_match_set() adds meta + lookup.

Yes, that sounds good. I had something like this in mind:

| struct nft_stmt *nft_meta_match_immediate(enum nft_meta_type, void *data);
| int nft_rule_append_stmt(struct nft_rule *r, struct nft_stmt *stmt);

The obvious problem is that at the time that meta match is created,
there is no context information. So the second function would have to
do that.

I am not sure if this kind of context evaluation works in any case. E.g.
set elements are interpreted depending on the set they are added to. To
my surprise, that wasn't really an issue - the parser interprets it as
constant symbol, when evaluating the expression as part of adding it to
the set it is resolved properly. This might not work in any case,
though.

> A list of use-cases, for the third party application, would be good to
> have to design this API.

OK, I'll take firewalld as an example and come up with a number of
use-cases which would help there.

Thanks, Phil

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: libnftables extended API proposal (Was: Re: [nft PATCH] libnftables: Fix for multiple context instances)
  2017-12-07 11:34                           ` Phil Sutter
@ 2017-12-10 21:55                             ` Pablo Neira Ayuso
  2017-12-16 16:06                               ` libnftables extended API proposal Phil Sutter
  0 siblings, 1 reply; 28+ messages in thread
From: Pablo Neira Ayuso @ 2017-12-10 21:55 UTC (permalink / raw)
  To: Phil Sutter, netfilter-devel, Florian Westphal

On Thu, Dec 07, 2017 at 12:34:31PM +0100, Phil Sutter wrote:
> Hi Pablo,
> 
> On Thu, Dec 07, 2017 at 01:05:45AM +0100, Pablo Neira Ayuso wrote:
> > On Tue, Dec 05, 2017 at 02:43:17PM +0100, Phil Sutter wrote:
> [...]
> > > After tweaking the parser a bit, I can use it now to parse just a
> > > set_list_member_expr and use the struct expr it returns. This made it
> > > possible to create the desired struct cmd in above function without
> > > having to invoke the parser there.
> > > 
> > > Exercising this refining consequently should allow to reach arbitrary
> > > levels of granularity. For instance, one could stop at statement level,
> > > i.e. statements are created using a string representation. Or one could
> > > go down to expression level, and statements are created using one or two
> > > expressions (depending on whether it is relational or not). Of course
> > > this means the library will eventually become as complicated as the
> > > parser itself, not necessarily a good thing.
> > 
> > Yes, and we'll expose all internal representation details, that we
> > will need to maintain forever if we don't want to break backward.
> 
> Not necessarily. I had this in mind when declaring 'struct nft_table'
> instead of reusing 'struct table'. :)
> 
> The parser defines the grammar, the library would just follow it. So if
> a given internal change complies with the old grammar, it should comply
> with the library as well. Though this is quite theoretical, of course.
> 
> Let's take relational expressions as simple example: In bison, we define
> 'expr op rhs_expr'. An equivalent library function could be:
> 
> | struct nft_expr *nft_relational_new(struct nft_expr *,
> | 				      enum rel_ops,
> | 				      struct nft_expr *);

Then that means you would like to expose an API that allows you to
build the abstract syntax tree.

> What is allowed in rhs_expr may change internally without breaking ABI
> or the parser-defined language.
> 
> Can you think of a problematic situation? My view is probably a bit
> rose-coloured. ;)
>
> > > On the other hand, having an abstract representation for set elements is
> > > quite convenient - their string representations might differ (take e.g.
> > > "22" vs. "ssh") so strcmp() is not sufficient to compare them.
> > > 
> > > I hope this allows you to get an idea of how I imagine extended API
> > > although certainly details are missing here. What do you think about it?
> > > Are you fine with the general concept so we can discuss details or do
> > > you see a fundamental problem with it?
> > 
> > OK, my understanding is that you would like to operate with some
> > native library object representation.
> > 
> > Most objects (table, chain...) are easy to represent, as you
> > mentioned. Rules are the most complex ones internally, but you can
> > probably abstract a simplified representation that suits well for your
> > usecases, e.g expose them in an iptables like representation -
> > something like adding matches and actions - Obviously, this needs to
> > allow to take sets as input, eg.
> > 
> >         int meta_match_immediate(struct nft_rule *r, enum nft_meta_type, void *data);
> >         int meta_match_set(struct nft_rule *r, enum nft_meta_type, struct nft_set *set);
> > 
> > meta_match_immediate() adds a meta + cmp to the rule, to compare for
> > an immediate value. meta_match_set() adds meta + lookup.
> 
> Yes, that sounds good. I had something like this in mind:
> 
> | struct nft_stmt *nft_meta_match_immediate(enum nft_meta_type, void *data);
> | int nft_rule_append_stmt(struct nft_rule *r, struct nft_stmt *stmt);
> 
> The obvious problem is that at the time that meta match is created,
> there is no context information. So the second function would have to
> do that.
> 
> I am not sure if this kind of context evaluation works in any case. E.g.
> set elements are interpreted depending on the set they are added to. To
> my surprise, that wasn't really an issue - the parser interprets it as
> constant symbol, when evaluating the expression as part of adding it to
> the set it is resolved properly. This might not work in any case,
> though.

Not sure I follow, what's the problem with the "missing context"?

> > A list of use-cases, for the third party application, would be good to
> > have to design this API.
> 
> OK, I'll take firewalld as an example and come up with a number of
> use-cases which would help there.

Thanks, those use-cases would be very useful to design this.

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: libnftables extended API proposal
  2017-12-10 21:55                             ` Pablo Neira Ayuso
@ 2017-12-16 16:06                               ` Phil Sutter
  2017-12-18 23:00                                 ` Pablo Neira Ayuso
  0 siblings, 1 reply; 28+ messages in thread
From: Phil Sutter @ 2017-12-16 16:06 UTC (permalink / raw)
  To: Pablo Neira Ayuso; +Cc: netfilter-devel, Florian Westphal

Hi Pablo,

On Sun, Dec 10, 2017 at 10:55:40PM +0100, Pablo Neira Ayuso wrote:
> On Thu, Dec 07, 2017 at 12:34:31PM +0100, Phil Sutter wrote:
> > On Thu, Dec 07, 2017 at 01:05:45AM +0100, Pablo Neira Ayuso wrote:
> > > On Tue, Dec 05, 2017 at 02:43:17PM +0100, Phil Sutter wrote:
> > [...]
> > > > After tweaking the parser a bit, I can use it now to parse just a
> > > > set_list_member_expr and use the struct expr it returns. This made it
> > > > possible to create the desired struct cmd in above function without
> > > > having to invoke the parser there.
> > > > 
> > > > Exercising this refining consequently should allow to reach arbitrary
> > > > levels of granularity. For instance, one could stop at statement level,
> > > > i.e. statements are created using a string representation. Or one could
> > > > go down to expression level, and statements are created using one or two
> > > > expressions (depending on whether it is relational or not). Of course
> > > > this means the library will eventually become as complicated as the
> > > > parser itself, not necessarily a good thing.
> > > 
> > > Yes, and we'll expose all internal representation details, that we
> > > will need to maintain forever if we don't want to break backward.
> > 
> > Not necessarily. I had this in mind when declaring 'struct nft_table'
> > instead of reusing 'struct table'. :)
> > 
> > The parser defines the grammar, the library would just follow it. So if
> > a given internal change complies with the old grammar, it should comply
> > with the library as well. Though this is quite theoretical, of course.
> > 
> > Let's take relational expressions as simple example: In bison, we define
> > 'expr op rhs_expr'. An equivalent library function could be:
> > 
> > | struct nft_expr *nft_relational_new(struct nft_expr *,
> > | 				      enum rel_ops,
> > | 				      struct nft_expr *);
> 
> Then that means you would like to expose an API that allows you to
> build the abstract syntax tree.

That was the idea I had when I thought about how to transition from
fully text-based simple API to an extended one which allows to work with
objects instead. We could start simple and refine further if
required/sensible. At the basic level, adding a new rule could be
something like:

| myrule = nft_rule_create("tcp dport 22 accept");

If required, one could implement rule building based on statements:

| stmt1 = nft_stmt_create("tcp dport 22");
| stmt2 = nft_stmt_create("accept");
| myrule = nft_rule_create();
| nft_rule_add_stmt(myrule, stmt1);
| nft_rule_add_stmt(myrule, stmt2);

[...]
> > Yes, that sounds good. I had something like this in mind:
> > 
> > | struct nft_stmt *nft_meta_match_immediate(enum nft_meta_type, void *data);
> > | int nft_rule_append_stmt(struct nft_rule *r, struct nft_stmt *stmt);
> > 
> > The obvious problem is that at the time that meta match is created,
> > there is no context information. So the second function would have to
> > do that.
> > 
> > I am not sure if this kind of context evaluation works in any case. E.g.
> > set elements are interpreted depending on the set they are added to. To
> > my surprise, that wasn't really an issue - the parser interprets it as
> > constant symbol, when evaluating the expression as part of adding it to
> > the set it is resolved properly. This might not work in any case,
> > though.
> 
> Not sure I follow, what's the problem with the "missing context"?

The parser always has the full context available. E.g. if you pass it a
string such as this:

| add rule ip t c tcp dport 22 accept

By the time it parses the individual statements, it already knows e.g.
the table family and that might make a difference. As far as I can tell
from my hacking, it is possible to tweak the parser so that one could
use it to parse just a statement ('tcp dport 22').

> > > A list of use-cases, for the third party application, would be good to
> > > have to design this API.
> > 
> > OK, I'll take firewalld as an example and come up with a number of
> > use-cases which would help there.
> 
> Thanks, those use-cases would be very useful to design this.

OK, we've collected a bit:

Ability to verify ruleset state
-------------------------------
I want to find out whether a given element (table, chain, rule, set, set
element) previously added to the ruleset still exists.

Possibility to remove previously added ruleset components
---------------------------------------------------------
After making an addition to the ruleset, it should be possible to remove
the added items again. For tables, chains, sets, set elements and
stateful objects, the data used when adding the item is sufficient for
later removal (as removal happens by value). For rules though, the
corresponding handle is required.
XXX: This is inconsistent regarding items removed and added back
again by another instance: As the rule's handle changes, it is not
found anymore afterwards.

Create/delete user chains
-------------------------
Firewalld makes heavy use of custom named chains to organize rules. It
needs to be able to create/delete them easily.

Sets
----
I want to be able to freely read/modify sets of IPv4/IPv6/MAC addresses
and have them apply to existing rules.

Transactions
------------
I want to be able to add many custom chains and rules in
batch/transaction mode to reduce round trips. Ideally failure of a
transaction would return something that can indicate the invalid piece
of the transaction.

Cheers, Phil

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: libnftables extended API proposal
  2017-12-16 16:06                               ` libnftables extended API proposal Phil Sutter
@ 2017-12-18 23:00                                 ` Pablo Neira Ayuso
  2017-12-20 12:32                                   ` Phil Sutter
  0 siblings, 1 reply; 28+ messages in thread
From: Pablo Neira Ayuso @ 2017-12-18 23:00 UTC (permalink / raw)
  To: Phil Sutter, netfilter-devel, Florian Westphal

Hi Phil,

On Sat, Dec 16, 2017 at 05:06:51PM +0100, Phil Sutter wrote:
> Hi Pablo,
> 
> On Sun, Dec 10, 2017 at 10:55:40PM +0100, Pablo Neira Ayuso wrote:
> > On Thu, Dec 07, 2017 at 12:34:31PM +0100, Phil Sutter wrote:
> > > On Thu, Dec 07, 2017 at 01:05:45AM +0100, Pablo Neira Ayuso wrote:
> > > > On Tue, Dec 05, 2017 at 02:43:17PM +0100, Phil Sutter wrote:
> > > [...]
> > > > > After tweaking the parser a bit, I can use it now to parse just a
> > > > > set_list_member_expr and use the struct expr it returns. This made it
> > > > > possible to create the desired struct cmd in above function without
> > > > > having to invoke the parser there.
> > > > > 
> > > > > Exercising this refining consequently should allow to reach arbitrary
> > > > > levels of granularity. For instance, one could stop at statement level,
> > > > > i.e. statements are created using a string representation. Or one could
> > > > > go down to expression level, and statements are created using one or two
> > > > > expressions (depending on whether it is relational or not). Of course
> > > > > this means the library will eventually become as complicated as the
> > > > > parser itself, not necessarily a good thing.
> > > > 
> > > > Yes, and we'll expose all internal representation details, that we
> > > > will need to maintain forever if we don't want to break backward.
> > > 
> > > Not necessarily. I had this in mind when declaring 'struct nft_table'
> > > instead of reusing 'struct table'. :)
> > > 
> > > The parser defines the grammar, the library would just follow it. So if
> > > a given internal change complies with the old grammar, it should comply
> > > with the library as well. Though this is quite theoretical, of course.
> > > 
> > > Let's take relational expressions as simple example: In bison, we define
> > > 'expr op rhs_expr'. An equivalent library function could be:
> > > 
> > > | struct nft_expr *nft_relational_new(struct nft_expr *,
> > > | 				      enum rel_ops,
> > > | 				      struct nft_expr *);
> > 
> > Then that means you would like to expose an API that allows you to
> > build the abstract syntax tree.
> 
> That was the idea I had when I thought about how to transition from
> fully text-based simple API to an extended one which allows to work with
> objects instead. We could start simple and refine further if
> required/sensible. At the basic level, adding a new rule could be
> something like:
> 
> | myrule = nft_rule_create("tcp dport 22 accept");
> 
> If required, one could implement rule building based on statements:
> 
> | stmt1 = nft_stmt_create("tcp dport 22");
> | stmt2 = nft_stmt_create("accept");
> | myrule = nft_rule_create();
> | nft_rule_add_stmt(myrule, stmt1);
> | nft_rule_add_stmt(myrule, stmt2);

This is mixing parsing and abstract syntax tree creation.

If you want to expose the syntax tree, then I would the parsing layer
entirely and expose the syntax tree, which is what the json
representation for the high level library will be doing.

To support new protocol, we will need a new library version too, when
the abstraction to represent a payload is already well-defined, ie.
[ base, offset, length ], which is pretty much everywhere the same,
not only in nftables.

I wonder if firewalld could generate high level json representation
instead, so it becomes a compiler/translator from its own
representation to nftables abstract syntax tree. As I said, the json
representation is mapping to the abstract syntax tree we have in nft.
I'm refering to the high level json representation that doesn't exist
yet, not the low level one for libnftnl.

> [...]
> > > Yes, that sounds good. I had something like this in mind:
> > > 
> > > | struct nft_stmt *nft_meta_match_immediate(enum nft_meta_type, void *data);
> > > | int nft_rule_append_stmt(struct nft_rule *r, struct nft_stmt *stmt);
> > > 
> > > The obvious problem is that at the time that meta match is created,
> > > there is no context information. So the second function would have to
> > > do that.
> > > 
> > > I am not sure if this kind of context evaluation works in any case. E.g.
> > > set elements are interpreted depending on the set they are added to. To
> > > my surprise, that wasn't really an issue - the parser interprets it as
> > > constant symbol, when evaluating the expression as part of adding it to
> > > the set it is resolved properly. This might not work in any case,
> > > though.
> > 
> > Not sure I follow, what's the problem with the "missing context"?
> 
> The parser always has the full context available. E.g. if you pass it a
> string such as this:
> 
> | add rule ip t c tcp dport 22 accept
> 
> By the time it parses the individual statements, it already knows e.g.
> the table family and that might make a difference. As far as I can tell
> from my hacking, it is possible to tweak the parser so that one could
> use it to parse just a statement ('tcp dport 22').

You could just postpone evaluation once you have the full rule.

> > > > A list of use-cases, for the third party application, would be good to
> > > > have to design this API.
> > > 
> > > OK, I'll take firewalld as an example and come up with a number of
> > > use-cases which would help there.
> > 
> > Thanks, those use-cases would be very useful to design this.
> 
> OK, we've collected a bit:
> 
> Ability to verify ruleset state
> -------------------------------
> I want to find out whether a given element (table, chain, rule, set, set
> element) previously added to the ruleset still exists.

We need to add handle number for every object, I think that would
suffice for this.

> Possibility to remove previously added ruleset components
> ---------------------------------------------------------
> After making an addition to the ruleset, it should be possible to remove
> the added items again. For tables, chains, sets, set elements and
> stateful objects, the data used when adding the item is sufficient for
> later removal (as removal happens by value). For rules though, the
> corresponding handle is required.
> XXX: This is inconsistent regarding items removed and added back
> again by another instance: As the rule's handle changes, it is not
> found anymore afterwards.

Then it's time to add unique handles for everything.

> Create/delete user chains
> -------------------------
> Firewalld makes heavy use of custom named chains to organize rules. It
> needs to be able to create/delete them easily.
>
> Sets
> ----
> I want to be able to freely read/modify sets of IPv4/IPv6/MAC addresses
> and have them apply to existing rules.

These two, I think we already support.

> Transactions
> ------------
> I want to be able to add many custom chains and rules in
> batch/transaction mode to reduce round trips. Ideally failure of a
> transaction would return something that can indicate the invalid piece
> of the transaction.

I need to find the time to finish the fine grain netlink error
reporting for nftables.

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: libnftables extended API proposal
  2017-12-18 23:00                                 ` Pablo Neira Ayuso
@ 2017-12-20 12:32                                   ` Phil Sutter
  2017-12-20 22:23                                     ` Pablo Neira Ayuso
  0 siblings, 1 reply; 28+ messages in thread
From: Phil Sutter @ 2017-12-20 12:32 UTC (permalink / raw)
  To: Pablo Neira Ayuso; +Cc: netfilter-devel, Florian Westphal

Hi Pablo,

On Tue, Dec 19, 2017 at 12:00:48AM +0100, Pablo Neira Ayuso wrote:
> On Sat, Dec 16, 2017 at 05:06:51PM +0100, Phil Sutter wrote:
> > On Sun, Dec 10, 2017 at 10:55:40PM +0100, Pablo Neira Ayuso wrote:
> > > On Thu, Dec 07, 2017 at 12:34:31PM +0100, Phil Sutter wrote:
> > > > On Thu, Dec 07, 2017 at 01:05:45AM +0100, Pablo Neira Ayuso wrote:
> > > > > On Tue, Dec 05, 2017 at 02:43:17PM +0100, Phil Sutter wrote:
> > > > [...]
> > > > > > After tweaking the parser a bit, I can use it now to parse just a
> > > > > > set_list_member_expr and use the struct expr it returns. This made it
> > > > > > possible to create the desired struct cmd in above function without
> > > > > > having to invoke the parser there.
> > > > > > 
> > > > > > Exercising this refining consequently should allow to reach arbitrary
> > > > > > levels of granularity. For instance, one could stop at statement level,
> > > > > > i.e. statements are created using a string representation. Or one could
> > > > > > go down to expression level, and statements are created using one or two
> > > > > > expressions (depending on whether it is relational or not). Of course
> > > > > > this means the library will eventually become as complicated as the
> > > > > > parser itself, not necessarily a good thing.
> > > > > 
> > > > > Yes, and we'll expose all internal representation details, that we
> > > > > will need to maintain forever if we don't want to break backward.
> > > > 
> > > > Not necessarily. I had this in mind when declaring 'struct nft_table'
> > > > instead of reusing 'struct table'. :)
> > > > 
> > > > The parser defines the grammar, the library would just follow it. So if
> > > > a given internal change complies with the old grammar, it should comply
> > > > with the library as well. Though this is quite theoretical, of course.
> > > > 
> > > > Let's take relational expressions as simple example: In bison, we define
> > > > 'expr op rhs_expr'. An equivalent library function could be:
> > > > 
> > > > | struct nft_expr *nft_relational_new(struct nft_expr *,
> > > > | 				      enum rel_ops,
> > > > | 				      struct nft_expr *);
> > > 
> > > Then that means you would like to expose an API that allows you to
> > > build the abstract syntax tree.
> > 
> > That was the idea I had when I thought about how to transition from
> > fully text-based simple API to an extended one which allows to work with
> > objects instead. We could start simple and refine further if
> > required/sensible. At the basic level, adding a new rule could be
> > something like:
> > 
> > | myrule = nft_rule_create("tcp dport 22 accept");
> > 
> > If required, one could implement rule building based on statements:
> > 
> > | stmt1 = nft_stmt_create("tcp dport 22");
> > | stmt2 = nft_stmt_create("accept");
> > | myrule = nft_rule_create();
> > | nft_rule_add_stmt(myrule, stmt1);
> > | nft_rule_add_stmt(myrule, stmt2);
> 
> This is mixing parsing and abstract syntax tree creation.
> 
> If you want to expose the syntax tree, then I would the parsing layer
> entirely and expose the syntax tree, which is what the json
> representation for the high level library will be doing.

But that means having to provide a creating function for every
expression there is, no?

> To support new protocol, we will need a new library version too, when
> the abstraction to represent a payload is already well-defined, ie.
> [ base, offset, length ], which is pretty much everywhere the same,
> not only in nftables.

Sorry, I didn't get that. Are you talking about that JSON
representation?

> I wonder if firewalld could generate high level json representation
> instead, so it becomes a compiler/translator from its own
> representation to nftables abstract syntax tree. As I said, the json
> representation is mapping to the abstract syntax tree we have in nft.
> I'm refering to the high level json representation that doesn't exist
> yet, not the low level one for libnftnl.

Can you point me to some information about that high level JSON
representation? Seems I'm missing something here.

Thanks, Phil

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: libnftables extended API proposal
  2017-12-20 12:32                                   ` Phil Sutter
@ 2017-12-20 22:23                                     ` Pablo Neira Ayuso
  2017-12-22 13:08                                       ` Phil Sutter
  0 siblings, 1 reply; 28+ messages in thread
From: Pablo Neira Ayuso @ 2017-12-20 22:23 UTC (permalink / raw)
  To: Phil Sutter, netfilter-devel, Florian Westphal

Hi Phil,

On Wed, Dec 20, 2017 at 01:32:25PM +0100, Phil Sutter wrote:
[...]
> On Tue, Dec 19, 2017 at 12:00:48AM +0100, Pablo Neira Ayuso wrote:
> > On Sat, Dec 16, 2017 at 05:06:51PM +0100, Phil Sutter wrote:
> > > On Sun, Dec 10, 2017 at 10:55:40PM +0100, Pablo Neira Ayuso wrote:
> > > > On Thu, Dec 07, 2017 at 12:34:31PM +0100, Phil Sutter wrote:
> > > > > On Thu, Dec 07, 2017 at 01:05:45AM +0100, Pablo Neira Ayuso wrote:
> > > > > > On Tue, Dec 05, 2017 at 02:43:17PM +0100, Phil Sutter wrote:
> > > > > [...]
> > > > > > > After tweaking the parser a bit, I can use it now to parse just a
> > > > > > > set_list_member_expr and use the struct expr it returns. This made it
> > > > > > > possible to create the desired struct cmd in above function without
> > > > > > > having to invoke the parser there.
> > > > > > > 
> > > > > > > Exercising this refining consequently should allow to reach arbitrary
> > > > > > > levels of granularity. For instance, one could stop at statement level,
> > > > > > > i.e. statements are created using a string representation. Or one could
> > > > > > > go down to expression level, and statements are created using one or two
> > > > > > > expressions (depending on whether it is relational or not). Of course
> > > > > > > this means the library will eventually become as complicated as the
> > > > > > > parser itself, not necessarily a good thing.
> > > > > > 
> > > > > > Yes, and we'll expose all internal representation details, that we
> > > > > > will need to maintain forever if we don't want to break backward.
> > > > > 
> > > > > Not necessarily. I had this in mind when declaring 'struct nft_table'
> > > > > instead of reusing 'struct table'. :)
> > > > > 
> > > > > The parser defines the grammar, the library would just follow it. So if
> > > > > a given internal change complies with the old grammar, it should comply
> > > > > with the library as well. Though this is quite theoretical, of course.
> > > > > 
> > > > > Let's take relational expressions as simple example: In bison, we define
> > > > > 'expr op rhs_expr'. An equivalent library function could be:
> > > > > 
> > > > > | struct nft_expr *nft_relational_new(struct nft_expr *,
> > > > > | 				      enum rel_ops,
> > > > > | 				      struct nft_expr *);
> > > > 
> > > > Then that means you would like to expose an API that allows you to
> > > > build the abstract syntax tree.
> > > 
> > > That was the idea I had when I thought about how to transition from
> > > fully text-based simple API to an extended one which allows to work with
> > > objects instead. We could start simple and refine further if
> > > required/sensible. At the basic level, adding a new rule could be
> > > something like:
> > > 
> > > | myrule = nft_rule_create("tcp dport 22 accept");
> > > 
> > > If required, one could implement rule building based on statements:
> > > 
> > > | stmt1 = nft_stmt_create("tcp dport 22");
> > > | stmt2 = nft_stmt_create("accept");
> > > | myrule = nft_rule_create();
> > > | nft_rule_add_stmt(myrule, stmt1);
> > > | nft_rule_add_stmt(myrule, stmt2);
> > 
> > This is mixing parsing and abstract syntax tree creation.
> > 
> > If you want to expose the syntax tree, then I would the parsing layer
> > entirely and expose the syntax tree, which is what the json
> > representation for the high level library will be doing.
> 
> But that means having to provide a creating function for every
> expression there is, no?

Yes.

> > To support new protocol, we will need a new library version too, when
> > the abstraction to represent a payload is already well-defined, ie.
> > [ base, offset, length ], which is pretty much everywhere the same,
> > not only in nftables.
> 
> Sorry, I didn't get that. Are you talking about that JSON
> representation?

Yes. The one that does not exist.

> > I wonder if firewalld could generate high level json representation
> > instead, so it becomes a compiler/translator from its own
> > representation to nftables abstract syntax tree. As I said, the json
> > representation is mapping to the abstract syntax tree we have in nft.
> > I'm refering to the high level json representation that doesn't exist
> > yet, not the low level one for libnftnl.
> 
> Can you point me to some information about that high level JSON
> representation? Seems I'm missing something here.

It doesn't exist :-), if we expose a json-based API, third party tool
only have to generate the json high-level representation, we would
need very few API calls for this, and anyone could generate rulesets
for nftables, without relying on the bison parser, given the json
representation exposes the abstract syntax tree.

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: libnftables extended API proposal
  2017-12-20 22:23                                     ` Pablo Neira Ayuso
@ 2017-12-22 13:08                                       ` Phil Sutter
  2017-12-22 13:49                                         ` Pablo Neira Ayuso
  0 siblings, 1 reply; 28+ messages in thread
From: Phil Sutter @ 2017-12-22 13:08 UTC (permalink / raw)
  To: Pablo Neira Ayuso; +Cc: netfilter-devel, Florian Westphal

Hi Pablo,

On Wed, Dec 20, 2017 at 11:23:36PM +0100, Pablo Neira Ayuso wrote:
> On Wed, Dec 20, 2017 at 01:32:25PM +0100, Phil Sutter wrote:
> [...]
> > On Tue, Dec 19, 2017 at 12:00:48AM +0100, Pablo Neira Ayuso wrote:
> > > On Sat, Dec 16, 2017 at 05:06:51PM +0100, Phil Sutter wrote:
> > > > On Sun, Dec 10, 2017 at 10:55:40PM +0100, Pablo Neira Ayuso wrote:
> > > > > On Thu, Dec 07, 2017 at 12:34:31PM +0100, Phil Sutter wrote:
> > > > > > On Thu, Dec 07, 2017 at 01:05:45AM +0100, Pablo Neira Ayuso wrote:
> > > > > > > On Tue, Dec 05, 2017 at 02:43:17PM +0100, Phil Sutter wrote:
> > > > > > [...]
> > > > > > > > After tweaking the parser a bit, I can use it now to parse just a
> > > > > > > > set_list_member_expr and use the struct expr it returns. This made it
> > > > > > > > possible to create the desired struct cmd in above function without
> > > > > > > > having to invoke the parser there.
> > > > > > > > 
> > > > > > > > Exercising this refining consequently should allow to reach arbitrary
> > > > > > > > levels of granularity. For instance, one could stop at statement level,
> > > > > > > > i.e. statements are created using a string representation. Or one could
> > > > > > > > go down to expression level, and statements are created using one or two
> > > > > > > > expressions (depending on whether it is relational or not). Of course
> > > > > > > > this means the library will eventually become as complicated as the
> > > > > > > > parser itself, not necessarily a good thing.
> > > > > > > 
> > > > > > > Yes, and we'll expose all internal representation details, that we
> > > > > > > will need to maintain forever if we don't want to break backward.
> > > > > > 
> > > > > > Not necessarily. I had this in mind when declaring 'struct nft_table'
> > > > > > instead of reusing 'struct table'. :)
> > > > > > 
> > > > > > The parser defines the grammar, the library would just follow it. So if
> > > > > > a given internal change complies with the old grammar, it should comply
> > > > > > with the library as well. Though this is quite theoretical, of course.
> > > > > > 
> > > > > > Let's take relational expressions as simple example: In bison, we define
> > > > > > 'expr op rhs_expr'. An equivalent library function could be:
> > > > > > 
> > > > > > | struct nft_expr *nft_relational_new(struct nft_expr *,
> > > > > > | 				      enum rel_ops,
> > > > > > | 				      struct nft_expr *);
> > > > > 
> > > > > Then that means you would like to expose an API that allows you to
> > > > > build the abstract syntax tree.
> > > > 
> > > > That was the idea I had when I thought about how to transition from
> > > > fully text-based simple API to an extended one which allows to work with
> > > > objects instead. We could start simple and refine further if
> > > > required/sensible. At the basic level, adding a new rule could be
> > > > something like:
> > > > 
> > > > | myrule = nft_rule_create("tcp dport 22 accept");
> > > > 
> > > > If required, one could implement rule building based on statements:
> > > > 
> > > > | stmt1 = nft_stmt_create("tcp dport 22");
> > > > | stmt2 = nft_stmt_create("accept");
> > > > | myrule = nft_rule_create();
> > > > | nft_rule_add_stmt(myrule, stmt1);
> > > > | nft_rule_add_stmt(myrule, stmt2);
> > > 
> > > This is mixing parsing and abstract syntax tree creation.
> > > 
> > > If you want to expose the syntax tree, then I would the parsing layer
> > > entirely and expose the syntax tree, which is what the json
> > > representation for the high level library will be doing.
> > 
> > But that means having to provide a creating function for every
> > expression there is, no?
> 
> Yes.
> 
> > > To support new protocol, we will need a new library version too, when
> > > the abstraction to represent a payload is already well-defined, ie.
> > > [ base, offset, length ], which is pretty much everywhere the same,
> > > not only in nftables.
> > 
> > Sorry, I didn't get that. Are you talking about that JSON
> > representation?
> 
> Yes. The one that does not exist.
> 
> > > I wonder if firewalld could generate high level json representation
> > > instead, so it becomes a compiler/translator from its own
> > > representation to nftables abstract syntax tree. As I said, the json
> > > representation is mapping to the abstract syntax tree we have in nft.
> > > I'm refering to the high level json representation that doesn't exist
> > > yet, not the low level one for libnftnl.
> > 
> > Can you point me to some information about that high level JSON
> > representation? Seems I'm missing something here.
> 
> It doesn't exist :-), if we expose a json-based API, third party tool
> only have to generate the json high-level representation, we would
> need very few API calls for this, and anyone could generate rulesets
> for nftables, without relying on the bison parser, given the json
> representation exposes the abstract syntax tree.

So you're idea is to accept a whole command in JSON format from
applications? And output in JSON format as well since that is easier for
parsing than human readable text we have right now?

I'm not sure about the '[ base, offset, length ]' part though:
Applications would have to care about protocol header layout including
any specialties themselves, or should libnftables provide them with
convenience functions to generate the correct JSON markup? For simple
stuff like matching on a TCP port there's probably no need, but
correctly interpreting IPv4 ToS field is rather error-prone I guess.

The approach seems simple at first, but application input in JSON format
has to be validated as well, so I fear we'll end up with a second parser
to avoid the first one.

Cheers, Phil

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: libnftables extended API proposal
  2017-12-22 13:08                                       ` Phil Sutter
@ 2017-12-22 13:49                                         ` Pablo Neira Ayuso
  2017-12-22 15:30                                           ` Phil Sutter
  0 siblings, 1 reply; 28+ messages in thread
From: Pablo Neira Ayuso @ 2017-12-22 13:49 UTC (permalink / raw)
  To: Phil Sutter, netfilter-devel, Florian Westphal

Hi Phil,

On Fri, Dec 22, 2017 at 02:08:16PM +0100, Phil Sutter wrote:
> On Wed, Dec 20, 2017 at 11:23:36PM +0100, Pablo Neira Ayuso wrote:
> > On Wed, Dec 20, 2017 at 01:32:25PM +0100, Phil Sutter wrote:
> > [...]
> > > On Tue, Dec 19, 2017 at 12:00:48AM +0100, Pablo Neira Ayuso wrote:
> > > > On Sat, Dec 16, 2017 at 05:06:51PM +0100, Phil Sutter wrote:
[...]
> > > > I wonder if firewalld could generate high level json representation
> > > > instead, so it becomes a compiler/translator from its own
> > > > representation to nftables abstract syntax tree. As I said, the json
> > > > representation is mapping to the abstract syntax tree we have in nft.
> > > > I'm refering to the high level json representation that doesn't exist
> > > > yet, not the low level one for libnftnl.
> > > 
> > > Can you point me to some information about that high level JSON
> > > representation? Seems I'm missing something here.
> > 
> > It doesn't exist :-), if we expose a json-based API, third party tool
> > only have to generate the json high-level representation, we would
> > need very few API calls for this, and anyone could generate rulesets
> > for nftables, without relying on the bison parser, given the json
> > representation exposes the abstract syntax tree.
> 
> So you're idea is to accept a whole command in JSON format from
> applications? And output in JSON format as well since that is easier for
> parsing than human readable text we have right now?

Just brainstorming here, we're discussing an API for third party
applications. In this case, they just need to build the json
representation for the ruleset they want to add. They could even embed
this into a network message that they can send of the wire.

> I'm not sure about the '[ base, offset, length ]' part though:
> Applications would have to care about protocol header layout including
> any specialties themselves, or should libnftables provide them with
> convenience functions to generate the correct JSON markup?

It depends, you may want to expose json representations for each
protocol payload you support.

> For simple stuff like matching on a TCP port there's probably no
> need, but correctly interpreting IPv4 ToS field is rather
> error-prone I guess.

And bitfields are going to be cumbersome too, so we would indeed need
a json representation for each protocol that we support, so third
party applications don't need to deal with this.

> The approach seems simple at first, but application input in JSON format
> has to be validated as well, so I fear we'll end up with a second parser
> to avoid the first one.

There are libraries like jansson that already do the parsing for us,
so we don't need to maintain our own json parser. We would still need
internal code to libnftables, to navigate the json representation and
create the objects.

On our side, we would need to maintain a very simple API, basically
that allows you to parse a json representation and to export it. For
backward compatibility reasons, we have to keep supporting the json
layout, instead of a large number of functions.

I guess the question here is if this would be good for firewalld, I
didn't have a look at that code, but many third party applications I
have seen are basically creating iptables commands in text, so this
approach would be similar, well, actually better since we would be
providing a well-structured representation.

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: libnftables extended API proposal
  2017-12-22 13:49                                         ` Pablo Neira Ayuso
@ 2017-12-22 15:30                                           ` Phil Sutter
  2017-12-22 20:39                                             ` Pablo Neira Ayuso
  0 siblings, 1 reply; 28+ messages in thread
From: Phil Sutter @ 2017-12-22 15:30 UTC (permalink / raw)
  To: Pablo Neira Ayuso; +Cc: netfilter-devel, Florian Westphal

Hi Pablo,

On Fri, Dec 22, 2017 at 02:49:06PM +0100, Pablo Neira Ayuso wrote:
> On Fri, Dec 22, 2017 at 02:08:16PM +0100, Phil Sutter wrote:
> > On Wed, Dec 20, 2017 at 11:23:36PM +0100, Pablo Neira Ayuso wrote:
> > > On Wed, Dec 20, 2017 at 01:32:25PM +0100, Phil Sutter wrote:
> > > [...]
> > > > On Tue, Dec 19, 2017 at 12:00:48AM +0100, Pablo Neira Ayuso wrote:
> > > > > On Sat, Dec 16, 2017 at 05:06:51PM +0100, Phil Sutter wrote:
> [...]
> > > > > I wonder if firewalld could generate high level json representation
> > > > > instead, so it becomes a compiler/translator from its own
> > > > > representation to nftables abstract syntax tree. As I said, the json
> > > > > representation is mapping to the abstract syntax tree we have in nft.
> > > > > I'm refering to the high level json representation that doesn't exist
> > > > > yet, not the low level one for libnftnl.
> > > > 
> > > > Can you point me to some information about that high level JSON
> > > > representation? Seems I'm missing something here.
> > > 
> > > It doesn't exist :-), if we expose a json-based API, third party tool
> > > only have to generate the json high-level representation, we would
> > > need very few API calls for this, and anyone could generate rulesets
> > > for nftables, without relying on the bison parser, given the json
> > > representation exposes the abstract syntax tree.
> > 
> > So you're idea is to accept a whole command in JSON format from
> > applications? And output in JSON format as well since that is easier for
> > parsing than human readable text we have right now?
> 
> Just brainstorming here, we're discussing an API for third party
> applications. In this case, they just need to build the json
> representation for the ruleset they want to add. They could even embed
> this into a network message that they can send of the wire.
> 
> > I'm not sure about the '[ base, offset, length ]' part though:
> > Applications would have to care about protocol header layout including
> > any specialties themselves, or should libnftables provide them with
> > convenience functions to generate the correct JSON markup?
> 
> It depends, you may want to expose json representations for each
> protocol payload you support.
> 
> > For simple stuff like matching on a TCP port there's probably no
> > need, but correctly interpreting IPv4 ToS field is rather
> > error-prone I guess.
> 
> And bitfields are going to be cumbersome too, so we would indeed need
> a json representation for each protocol that we support, so third
> party applications don't need to deal with this.
> 
> > The approach seems simple at first, but application input in JSON format
> > has to be validated as well, so I fear we'll end up with a second parser
> > to avoid the first one.
> 
> There are libraries like jansson that already do the parsing for us,
> so we don't need to maintain our own json parser. We would still need
> internal code to libnftables, to navigate the json representation and
> create the objects.

Yes sure, there are libraries doing the actual parsing of JSON -
probably I wasn't clear enough. My point is what happens if you have a
parsed JSON tree (or array, however it may look like in practice). The
data sent by the application is either explicit enough for the
translation into netlink messages to be really trivial, or it is not
(which I prefer, otherwise applications could use libnftnl directly with
no drawback) - then we still have to implement a middle layer between
data in JSON and nftables objects. Maybe an example will do:

| [{
| 	"type": "relational",
| 	"left": {
| 		"type": "expression",
| 		"name": "tcp_hdr_expr",
| 		"value": {
| 			"type": "tcp_hdr_field",
| 			"value": "dport",
| 		},
| 	},
| 	"right": {
| 		"type": "expression",
| 		"name": "integer_expr",
| 		"value": 22,
| 	}
| }]

So this might be how a relational expression could be represented in
JSON. Note that I intentionally didn't break it down to payload_expr,
otherwise it had to contain TCP header offset, etc. (In this case that
might be preferred, but as stated above it's not the best option in
every case.)

Parsing^WInterpreting code would then probably look like:

| type = json_object_get(data, "type");
| if (!strcmp(type, "relational")) {
| 	left = parse_expr(json_object_get(data, "left"));
| 	right = parse_expr(json_object_get(data, "right"));
| 	expr = relational_expr_alloc(&internal_location,
| 				     OP_IMPLICIT, left, right);
| }

I think this last part might easily become bigger than parser_bison.y
and scanner.l combined.

> On our side, we would need to maintain a very simple API, basically
> that allows you to parse a json representation and to export it. For
> backward compatibility reasons, we have to keep supporting the json
> layout, instead of a large number of functions.

Yes, the API *itself* might be a lot smaller since it just takes a chunk
of JSON for about anything. But the middle layer (which is not exported
to applications) will be the relevant factor instead.

> I guess the question here is if this would be good for firewalld, I
> didn't have a look at that code, but many third party applications I
> have seen are basically creating iptables commands in text, so this
> approach would be similar, well, actually better since we would be
> providing a well-structured representation.

Yes, of course firewalld builds iptables commands, but just because
there is no better option. Hence the request for a better libnftables
API, to avoid repeating that with another command (or function call
to which the command's parameters are passed).

Firewalld is written in Python, so it won't be able to use libnftables
directly, anyway. At least a thin layer of wrapping code will be there,
even if it's just via ctypes module.

>From my perspective, the argument of being well-structured doesn't quite
hold. Of course, JSON will provide something like "here starts a
statement" and "here it ends", but e.g. asynchronism between input and
output will be reflected by it as well if not solved in common code
already.

Cheers, Phil

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: libnftables extended API proposal
  2017-12-22 15:30                                           ` Phil Sutter
@ 2017-12-22 20:39                                             ` Pablo Neira Ayuso
  2017-12-23 13:19                                               ` Phil Sutter
  0 siblings, 1 reply; 28+ messages in thread
From: Pablo Neira Ayuso @ 2017-12-22 20:39 UTC (permalink / raw)
  To: Phil Sutter, netfilter-devel, Florian Westphal

On Fri, Dec 22, 2017 at 04:30:49PM +0100, Phil Sutter wrote:
> Hi Pablo,
> 
> On Fri, Dec 22, 2017 at 02:49:06PM +0100, Pablo Neira Ayuso wrote:
> > On Fri, Dec 22, 2017 at 02:08:16PM +0100, Phil Sutter wrote:
> > > On Wed, Dec 20, 2017 at 11:23:36PM +0100, Pablo Neira Ayuso wrote:
> > > > On Wed, Dec 20, 2017 at 01:32:25PM +0100, Phil Sutter wrote:
> > > > [...]
> > > > > On Tue, Dec 19, 2017 at 12:00:48AM +0100, Pablo Neira Ayuso wrote:
> > > > > > On Sat, Dec 16, 2017 at 05:06:51PM +0100, Phil Sutter wrote:
> > [...]
> > > > > > I wonder if firewalld could generate high level json representation
> > > > > > instead, so it becomes a compiler/translator from its own
> > > > > > representation to nftables abstract syntax tree. As I said, the json
> > > > > > representation is mapping to the abstract syntax tree we have in nft.
> > > > > > I'm refering to the high level json representation that doesn't exist
> > > > > > yet, not the low level one for libnftnl.
> > > > > 
> > > > > Can you point me to some information about that high level JSON
> > > > > representation? Seems I'm missing something here.
> > > > 
> > > > It doesn't exist :-), if we expose a json-based API, third party tool
> > > > only have to generate the json high-level representation, we would
> > > > need very few API calls for this, and anyone could generate rulesets
> > > > for nftables, without relying on the bison parser, given the json
> > > > representation exposes the abstract syntax tree.
> > > 
> > > So you're idea is to accept a whole command in JSON format from
> > > applications? And output in JSON format as well since that is easier for
> > > parsing than human readable text we have right now?
> > 
> > Just brainstorming here, we're discussing an API for third party
> > applications. In this case, they just need to build the json
> > representation for the ruleset they want to add. They could even embed
> > this into a network message that they can send of the wire.
> > 
> > > I'm not sure about the '[ base, offset, length ]' part though:
> > > Applications would have to care about protocol header layout including
> > > any specialties themselves, or should libnftables provide them with
> > > convenience functions to generate the correct JSON markup?
> > 
> > It depends, you may want to expose json representations for each
> > protocol payload you support.
> > 
> > > For simple stuff like matching on a TCP port there's probably no
> > > need, but correctly interpreting IPv4 ToS field is rather
> > > error-prone I guess.
> > 
> > And bitfields are going to be cumbersome too, so we would indeed need
> > a json representation for each protocol that we support, so third
> > party applications don't need to deal with this.
> > 
> > > The approach seems simple at first, but application input in JSON format
> > > has to be validated as well, so I fear we'll end up with a second parser
> > > to avoid the first one.
> > 
> > There are libraries like jansson that already do the parsing for us,
> > so we don't need to maintain our own json parser. We would still need
> > internal code to libnftables, to navigate the json representation and
> > create the objects.
> 
> Yes sure, there are libraries doing the actual parsing of JSON -
> probably I wasn't clear enough. My point is what happens if you have a
> parsed JSON tree (or array, however it may look like in practice). The
> data sent by the application is either explicit enough for the
> translation into netlink messages to be really trivial, or it is not
> (which I prefer, otherwise applications could use libnftnl directly with
> no drawback) - then we still have to implement a middle layer between
> data in JSON and nftables objects. Maybe an example will do:
> 
> | [{
> | 	"type": "relational",
> | 	"left": {
> | 		"type": "expression",
> | 		"name": "tcp_hdr_expr",
> | 		"value": {
> | 			"type": "tcp_hdr_field",
> | 			"value": "dport",
> | 		},
> | 	},
> | 	"right": {
> | 		"type": "expression",
> | 		"name": "integer_expr",
> | 		"value": 22,
> | 	}
> | }]

Probably something more simple representation, like this?

[{
        "match": {
                "left": {
                        "type": "payload",
                        "name": "tcp",
                        "field: "dport",
        	},
                "right": {
                        "type": "immediate",
                        "value": 22,
                }
        }
}]

For non-matching things, we can add an "action".

I wonder if this can even be made more simple and more compact indeed.

> So this might be how a relational expression could be represented in
> JSON. Note that I intentionally didn't break it down to payload_expr,
> otherwise it had to contain TCP header offset, etc. (In this case that
> might be preferred, but as stated above it's not the best option in
> every case.)
> 
> Parsing^WInterpreting code would then probably look like:
> 
> | type = json_object_get(data, "type");
> | if (!strcmp(type, "relational")) {
> | 	left = parse_expr(json_object_get(data, "left"));
> | 	right = parse_expr(json_object_get(data, "right"));
> | 	expr = relational_expr_alloc(&internal_location,
> | 				     OP_IMPLICIT, left, right);
> | }
>
> I think this last part might easily become bigger than parser_bison.y
> and scanner.l combined.
>
> > On our side, we would need to maintain a very simple API, basically
> > that allows you to parse a json representation and to export it. For
> > backward compatibility reasons, we have to keep supporting the json
> > layout, instead of a large number of functions.
> 
> Yes, the API *itself* might be a lot smaller since it just takes a chunk
> of JSON for about anything. But the middle layer (which is not exported
> to applications) will be the relevant factor instead.

Yes, in C I guess it will be quite a bit of new boiler plate code.
Unless we find a way to autogenerate code skeletons in some way.

I'm not so worry about maintaining more code. Real problem is API
in the longterm: you have to stick to them for long time (or forever
if you want if you want to take backward compatibility seriously, we
have a very good record on this).

And having little API mean, library can internally evolve more freely.

> > I guess the question here is if this would be good for firewalld, I
> > didn't have a look at that code, but many third party applications I
> > have seen are basically creating iptables commands in text, so this
> > approach would be similar, well, actually better since we would be
> > providing a well-structured representation.
> 
> Yes, of course firewalld builds iptables commands, but just because
> there is no better option. Hence the request for a better libnftables
> API, to avoid repeating that with another command (or function call
> to which the command's parameters are passed).
> 
> Firewalld is written in Python, so it won't be able to use libnftables
> directly, anyway. At least a thin layer of wrapping code will be there,
> even if it's just via ctypes module.

Parsing of json in python is actually rather easy, right? I remember
to have seen code mapping XML to an object whose attributes can be
accessed as object fields. I wonder about how hard would be to
generate it too.

> From my perspective, the argument of being well-structured doesn't quite
> hold. Of course, JSON will provide something like "here starts a
> statement" and "here it ends", but e.g. asynchronism between input and
> output will be reflected by it as well if not solved in common code
> already.

I'm not sure we should expose statements and relationals in the way we
do in nftables, it's too low level still for a high level library :-).
We can probably provide a more simplistic json representation that is
human readable and understandable.

Regarding asynchronism between input and output, not sure I follow.

Thanks!

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: libnftables extended API proposal
  2017-12-22 20:39                                             ` Pablo Neira Ayuso
@ 2017-12-23 13:19                                               ` Phil Sutter
  2017-12-28 19:21                                                 ` Pablo Neira Ayuso
  0 siblings, 1 reply; 28+ messages in thread
From: Phil Sutter @ 2017-12-23 13:19 UTC (permalink / raw)
  To: Pablo Neira Ayuso; +Cc: netfilter-devel, Florian Westphal

On Fri, Dec 22, 2017 at 09:39:03PM +0100, Pablo Neira Ayuso wrote:
> On Fri, Dec 22, 2017 at 04:30:49PM +0100, Phil Sutter wrote:
> > Hi Pablo,
> > 
> > On Fri, Dec 22, 2017 at 02:49:06PM +0100, Pablo Neira Ayuso wrote:
> > > On Fri, Dec 22, 2017 at 02:08:16PM +0100, Phil Sutter wrote:
> > > > On Wed, Dec 20, 2017 at 11:23:36PM +0100, Pablo Neira Ayuso wrote:
> > > > > On Wed, Dec 20, 2017 at 01:32:25PM +0100, Phil Sutter wrote:
> > > > > [...]
> > > > > > On Tue, Dec 19, 2017 at 12:00:48AM +0100, Pablo Neira Ayuso wrote:
> > > > > > > On Sat, Dec 16, 2017 at 05:06:51PM +0100, Phil Sutter wrote:
> > > [...]
> > > > > > > I wonder if firewalld could generate high level json representation
> > > > > > > instead, so it becomes a compiler/translator from its own
> > > > > > > representation to nftables abstract syntax tree. As I said, the json
> > > > > > > representation is mapping to the abstract syntax tree we have in nft.
> > > > > > > I'm refering to the high level json representation that doesn't exist
> > > > > > > yet, not the low level one for libnftnl.
> > > > > > 
> > > > > > Can you point me to some information about that high level JSON
> > > > > > representation? Seems I'm missing something here.
> > > > > 
> > > > > It doesn't exist :-), if we expose a json-based API, third party tool
> > > > > only have to generate the json high-level representation, we would
> > > > > need very few API calls for this, and anyone could generate rulesets
> > > > > for nftables, without relying on the bison parser, given the json
> > > > > representation exposes the abstract syntax tree.
> > > > 
> > > > So you're idea is to accept a whole command in JSON format from
> > > > applications? And output in JSON format as well since that is easier for
> > > > parsing than human readable text we have right now?
> > > 
> > > Just brainstorming here, we're discussing an API for third party
> > > applications. In this case, they just need to build the json
> > > representation for the ruleset they want to add. They could even embed
> > > this into a network message that they can send of the wire.
> > > 
> > > > I'm not sure about the '[ base, offset, length ]' part though:
> > > > Applications would have to care about protocol header layout including
> > > > any specialties themselves, or should libnftables provide them with
> > > > convenience functions to generate the correct JSON markup?
> > > 
> > > It depends, you may want to expose json representations for each
> > > protocol payload you support.
> > > 
> > > > For simple stuff like matching on a TCP port there's probably no
> > > > need, but correctly interpreting IPv4 ToS field is rather
> > > > error-prone I guess.
> > > 
> > > And bitfields are going to be cumbersome too, so we would indeed need
> > > a json representation for each protocol that we support, so third
> > > party applications don't need to deal with this.
> > > 
> > > > The approach seems simple at first, but application input in JSON format
> > > > has to be validated as well, so I fear we'll end up with a second parser
> > > > to avoid the first one.
> > > 
> > > There are libraries like jansson that already do the parsing for us,
> > > so we don't need to maintain our own json parser. We would still need
> > > internal code to libnftables, to navigate the json representation and
> > > create the objects.
> > 
> > Yes sure, there are libraries doing the actual parsing of JSON -
> > probably I wasn't clear enough. My point is what happens if you have a
> > parsed JSON tree (or array, however it may look like in practice). The
> > data sent by the application is either explicit enough for the
> > translation into netlink messages to be really trivial, or it is not
> > (which I prefer, otherwise applications could use libnftnl directly with
> > no drawback) - then we still have to implement a middle layer between
> > data in JSON and nftables objects. Maybe an example will do:
> > 
> > | [{
> > | 	"type": "relational",
> > | 	"left": {
> > | 		"type": "expression",
> > | 		"name": "tcp_hdr_expr",
> > | 		"value": {
> > | 			"type": "tcp_hdr_field",
> > | 			"value": "dport",
> > | 		},
> > | 	},
> > | 	"right": {
> > | 		"type": "expression",
> > | 		"name": "integer_expr",
> > | 		"value": 22,
> > | 	}
> > | }]
> 
> Probably something more simple representation, like this?
> 
> [{
>         "match": {
>                 "left": {
>                         "type": "payload",
>                         "name": "tcp",
>                         "field: "dport",
>         	},
>                 "right": {
>                         "type": "immediate",
>                         "value": 22,
>                 }
>         }
> }]
> 
> For non-matching things, we can add an "action".

You mean for non-EQ type relationals? I would just add a third field
below "match", e.g. '"op": "GT"' or so.

> I wonder if this can even be made more simple and more compact indeed.

I guess the more compact it becomes, the less work (less diversity) for
applications and the more work ("parsing") on libnftables side.

> > So this might be how a relational expression could be represented in
> > JSON. Note that I intentionally didn't break it down to payload_expr,
> > otherwise it had to contain TCP header offset, etc. (In this case that
> > might be preferred, but as stated above it's not the best option in
> > every case.)
> > 
> > Parsing^WInterpreting code would then probably look like:
> > 
> > | type = json_object_get(data, "type");
> > | if (!strcmp(type, "relational")) {
> > | 	left = parse_expr(json_object_get(data, "left"));
> > | 	right = parse_expr(json_object_get(data, "right"));
> > | 	expr = relational_expr_alloc(&internal_location,
> > | 				     OP_IMPLICIT, left, right);
> > | }
> >
> > I think this last part might easily become bigger than parser_bison.y
> > and scanner.l combined.
> >
> > > On our side, we would need to maintain a very simple API, basically
> > > that allows you to parse a json representation and to export it. For
> > > backward compatibility reasons, we have to keep supporting the json
> > > layout, instead of a large number of functions.
> > 
> > Yes, the API *itself* might be a lot smaller since it just takes a chunk
> > of JSON for about anything. But the middle layer (which is not exported
> > to applications) will be the relevant factor instead.
> 
> Yes, in C I guess it will be quite a bit of new boiler plate code.
> Unless we find a way to autogenerate code skeletons in some way.

Maybe use something like lex/yacc? *SCNR* ;)

> I'm not so worry about maintaining more code. Real problem is API
> in the longterm: you have to stick to them for long time (or forever
> if you want if you want to take backward compatibility seriously, we
> have a very good record on this).

But isn't the problem of keeping the API compatible comparable to
the problem of keeping the JSON representation compatible?

> And having little API mean, library can internally evolve more freely.

Either way, keeping the API or JSON representation as abstract as
possible (without losing too much functionality) is advisable I guess.

> > > I guess the question here is if this would be good for firewalld, I
> > > didn't have a look at that code, but many third party applications I
> > > have seen are basically creating iptables commands in text, so this
> > > approach would be similar, well, actually better since we would be
> > > providing a well-structured representation.
> > 
> > Yes, of course firewalld builds iptables commands, but just because
> > there is no better option. Hence the request for a better libnftables
> > API, to avoid repeating that with another command (or function call
> > to which the command's parameters are passed).
> > 
> > Firewalld is written in Python, so it won't be able to use libnftables
> > directly, anyway. At least a thin layer of wrapping code will be there,
> > even if it's just via ctypes module.
> 
> Parsing of json in python is actually rather easy, right? I remember
> to have seen code mapping XML to an object whose attributes can be
> accessed as object fields. I wonder about how hard would be to
> generate it too.

Yes, I think JSON suits interpreted languages quite well since (at
least Python) comes with datatypes resembling how JSON structures data.
E.g. something like:

| [{
| 	"foo": [ "bar", "baz" ],
| 	"feed": "babe",
| }]

In Python would translate to:

| obj = {
| 	"foo": [ "bar", "baz" ],
| 	"feed": "babe"
| }

Undeniable similarity! :)

> > From my perspective, the argument of being well-structured doesn't quite
> > hold. Of course, JSON will provide something like "here starts a
> > statement" and "here it ends", but e.g. asynchronism between input and
> > output will be reflected by it as well if not solved in common code
> > already.
> 
> I'm not sure we should expose statements and relationals in the way we
> do in nftables, it's too low level still for a high level library :-).
> We can probably provide a more simplistic json representation that is
> human readable and understandable.

Yes, I agree. The big question is who will define the JSON format. :)

> Regarding asynchronism between input and output, not sure I follow.

I am grepping through tests/py/*/*.t for pattern 'ok;':

- Adjacent ranges are combined.
- meta l4proto ipv6-icmp icmpv6 type nd-router-advert;ok;icmpv6 type nd-router-advert
  -> more abstraction needed.
- meta priority 0x87654321;ok;meta priority 8765:4321
  -> What the heck? :)
- meta iif "lo" accept;ok;iif "lo" accept
  -> Maybe less abstraction?
- tcp dport 22 iiftype ether ip daddr 1.2.3.4 ether saddr 00:0f:54:0c:11:4 accept ok;tcp dport 22 ether saddr 00:0f:54:0c:11:04 ip daddr 1.2.3.4 accept
  -> Here something is "optimized out", not sure if it should be kept in
  JSON.
- reject with icmpx type port-unreachable;ok;reject
  -> This simplification should be easy (or even required) to undo in JSON.

Those are just some examples. If an application compares JSON it sent to
libnftables with what it returned, there are definitely things JSON
wouldn't "hide" (such as reordering in sets or differing number
formatting).

Cheers, Phil

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: libnftables extended API proposal
  2017-12-23 13:19                                               ` Phil Sutter
@ 2017-12-28 19:21                                                 ` Pablo Neira Ayuso
  2017-12-29 14:58                                                   ` Phil Sutter
  0 siblings, 1 reply; 28+ messages in thread
From: Pablo Neira Ayuso @ 2017-12-28 19:21 UTC (permalink / raw)
  To: Phil Sutter, netfilter-devel, Florian Westphal

Hi Phil,

On Sat, Dec 23, 2017 at 02:19:41PM +0100, Phil Sutter wrote:
> On Fri, Dec 22, 2017 at 09:39:03PM +0100, Pablo Neira Ayuso wrote:
> > On Fri, Dec 22, 2017 at 04:30:49PM +0100, Phil Sutter wrote:
> > > Hi Pablo,
> > > 
> > > On Fri, Dec 22, 2017 at 02:49:06PM +0100, Pablo Neira Ayuso wrote:
> > > > On Fri, Dec 22, 2017 at 02:08:16PM +0100, Phil Sutter wrote:
> > > > > On Wed, Dec 20, 2017 at 11:23:36PM +0100, Pablo Neira Ayuso wrote:
> > > > > > On Wed, Dec 20, 2017 at 01:32:25PM +0100, Phil Sutter wrote:
> > > > > > [...]
> > > > > > > On Tue, Dec 19, 2017 at 12:00:48AM +0100, Pablo Neira Ayuso wrote:
> > > > > > > > On Sat, Dec 16, 2017 at 05:06:51PM +0100, Phil Sutter wrote:
> > > > [...]
> > > > > > > > I wonder if firewalld could generate high level json representation
> > > > > > > > instead, so it becomes a compiler/translator from its own
> > > > > > > > representation to nftables abstract syntax tree. As I said, the json
> > > > > > > > representation is mapping to the abstract syntax tree we have in nft.
> > > > > > > > I'm refering to the high level json representation that doesn't exist
> > > > > > > > yet, not the low level one for libnftnl.
> > > > > > > 
> > > > > > > Can you point me to some information about that high level JSON
> > > > > > > representation? Seems I'm missing something here.
> > > > > > 
> > > > > > It doesn't exist :-), if we expose a json-based API, third party tool
> > > > > > only have to generate the json high-level representation, we would
> > > > > > need very few API calls for this, and anyone could generate rulesets
> > > > > > for nftables, without relying on the bison parser, given the json
> > > > > > representation exposes the abstract syntax tree.
> > > > > 
> > > > > So you're idea is to accept a whole command in JSON format from
> > > > > applications? And output in JSON format as well since that is easier for
> > > > > parsing than human readable text we have right now?
> > > > 
> > > > Just brainstorming here, we're discussing an API for third party
> > > > applications. In this case, they just need to build the json
> > > > representation for the ruleset they want to add. They could even embed
> > > > this into a network message that they can send of the wire.
> > > > 
> > > > > I'm not sure about the '[ base, offset, length ]' part though:
> > > > > Applications would have to care about protocol header layout including
> > > > > any specialties themselves, or should libnftables provide them with
> > > > > convenience functions to generate the correct JSON markup?
> > > > 
> > > > It depends, you may want to expose json representations for each
> > > > protocol payload you support.
> > > > 
> > > > > For simple stuff like matching on a TCP port there's probably no
> > > > > need, but correctly interpreting IPv4 ToS field is rather
> > > > > error-prone I guess.
> > > > 
> > > > And bitfields are going to be cumbersome too, so we would indeed need
> > > > a json representation for each protocol that we support, so third
> > > > party applications don't need to deal with this.
> > > > 
> > > > > The approach seems simple at first, but application input in JSON format
> > > > > has to be validated as well, so I fear we'll end up with a second parser
> > > > > to avoid the first one.
> > > > 
> > > > There are libraries like jansson that already do the parsing for us,
> > > > so we don't need to maintain our own json parser. We would still need
> > > > internal code to libnftables, to navigate the json representation and
> > > > create the objects.
> > > 
> > > Yes sure, there are libraries doing the actual parsing of JSON -
> > > probably I wasn't clear enough. My point is what happens if you have a
> > > parsed JSON tree (or array, however it may look like in practice). The
> > > data sent by the application is either explicit enough for the
> > > translation into netlink messages to be really trivial, or it is not
> > > (which I prefer, otherwise applications could use libnftnl directly with
> > > no drawback) - then we still have to implement a middle layer between
> > > data in JSON and nftables objects. Maybe an example will do:
> > > 
> > > | [{
> > > | 	"type": "relational",
> > > | 	"left": {
> > > | 		"type": "expression",
> > > | 		"name": "tcp_hdr_expr",
> > > | 		"value": {
> > > | 			"type": "tcp_hdr_field",
> > > | 			"value": "dport",
> > > | 		},
> > > | 	},
> > > | 	"right": {
> > > | 		"type": "expression",
> > > | 		"name": "integer_expr",
> > > | 		"value": 22,
> > > | 	}
> > > | }]
> > 
> > Probably something more simple representation, like this?
> > 
> > [{
> >         "match": {
> >                 "left": {
> >                         "type": "payload",
> >                         "name": "tcp",
> >                         "field: "dport",
> >         	},
> >                 "right": {
> >                         "type": "immediate",
> >                         "value": 22,
> >                 }
> >         }
> > }]
> > 
> > For non-matching things, we can add an "action".
> 
> You mean for non-EQ type relationals? I would just add a third field
> below "match", e.g. '"op": "GT"' or so.

Agreed.

> > I wonder if this can even be made more simple and more compact indeed.
> 
> I guess the more compact it becomes, the less work (less diversity) for
> applications and the more work ("parsing") on libnftables side.

Yes, that would place a bit more work on the library, but I think we
should provide a high level representation that makes it easy for
people to express things. People should not deal with bitfield
handling, that's very tricky. Just have a look at commits that I and
Florian made to fix this. We don't want users to fall into this trap
by creating incorrect expressions, or worse, making them feel our
library does not work (even if it's their own mistake to build
incorrect expressions).

So I would go for a high level representation that represent things in
a way that is understandable to users, and that makes it easy to
generate ruleset for third party application like firewalld.

> > > So this might be how a relational expression could be represented in
> > > JSON. Note that I intentionally didn't break it down to payload_expr,
> > > otherwise it had to contain TCP header offset, etc. (In this case that
> > > might be preferred, but as stated above it's not the best option in
> > > every case.)
> > > 
> > > Parsing^WInterpreting code would then probably look like:
> > > 
> > > | type = json_object_get(data, "type");
> > > | if (!strcmp(type, "relational")) {
> > > | 	left = parse_expr(json_object_get(data, "left"));
> > > | 	right = parse_expr(json_object_get(data, "right"));
> > > | 	expr = relational_expr_alloc(&internal_location,
> > > | 				     OP_IMPLICIT, left, right);
> > > | }
> > >
> > > I think this last part might easily become bigger than parser_bison.y
> > > and scanner.l combined.
> > >
> > > > On our side, we would need to maintain a very simple API, basically
> > > > that allows you to parse a json representation and to export it. For
> > > > backward compatibility reasons, we have to keep supporting the json
> > > > layout, instead of a large number of functions.
> > > 
> > > Yes, the API *itself* might be a lot smaller since it just takes a chunk
> > > of JSON for about anything. But the middle layer (which is not exported
> > > to applications) will be the relevant factor instead.
> > 
> > Yes, in C I guess it will be quite a bit of new boiler plate code.
> > Unless we find a way to autogenerate code skeletons in some way.
> 
> Maybe use something like lex/yacc? *SCNR* ;)

Yes, that's one possibility we could explore.

> > I'm not so worry about maintaining more code. Real problem is API
> > in the longterm: you have to stick to them for long time (or forever
> > if you want if you want to take backward compatibility seriously, we
> > have a very good record on this).
> 
> But isn't the problem of keeping the API compatible comparable to
> the problem of keeping the JSON representation compatible?

Well, keeping backward compatibility is always a burden to carry on.
Even though, to me, JSON is as extensible as netlink is, ie. we can
add new keys and deprecate things without breaking backward.

I worry about exposing the expression as they are in nftables now, I
consider it still a too low level representation.

> > And having little API mean, library can internally evolve more freely.
> 
> Either way, keeping the API or JSON representation as abstract as
> possible (without losing too much functionality) is advisable I guess.

Right.

> > > > I guess the question here is if this would be good for firewalld, I
> > > > didn't have a look at that code, but many third party applications I
> > > > have seen are basically creating iptables commands in text, so this
> > > > approach would be similar, well, actually better since we would be
> > > > providing a well-structured representation.
> > > 
> > > Yes, of course firewalld builds iptables commands, but just because
> > > there is no better option. Hence the request for a better libnftables
> > > API, to avoid repeating that with another command (or function call
> > > to which the command's parameters are passed).
> > > 
> > > Firewalld is written in Python, so it won't be able to use libnftables
> > > directly, anyway. At least a thin layer of wrapping code will be there,
> > > even if it's just via ctypes module.
> > 
> > Parsing of json in python is actually rather easy, right? I remember
> > to have seen code mapping XML to an object whose attributes can be
> > accessed as object fields. I wonder about how hard would be to
> > generate it too.
> 
> Yes, I think JSON suits interpreted languages quite well since (at
> least Python) comes with datatypes resembling how JSON structures data.
> E.g. something like:
> 
> | [{
> | 	"foo": [ "bar", "baz" ],
> | 	"feed": "babe",
> | }]
> 
> In Python would translate to:
> 
> | obj = {
> | 	"foo": [ "bar", "baz" ],
> | 	"feed": "babe"
> | }
> 
> Undeniable similarity! :)

That confirms my feeling with these modern languages :)

> > > From my perspective, the argument of being well-structured doesn't quite
> > > hold. Of course, JSON will provide something like "here starts a
> > > statement" and "here it ends", but e.g. asynchronism between input and
> > > output will be reflected by it as well if not solved in common code
> > > already.
> > 
> > I'm not sure we should expose statements and relationals in the way we
> > do in nftables, it's too low level still for a high level library :-).
> > We can probably provide a more simplistic json representation that is
> > human readable and understandable.
> 
> Yes, I agree. The big question is who will define the JSON format. :)

Oh, I can help you on that. Although you're very much closer to
firewalld usecases that I'm, so probably a draft from you on this
would be nice ;-)

> > Regarding asynchronism between input and output, not sure I follow.
> 
> I am grepping through tests/py/*/*.t for pattern 'ok;':
> 
> - Adjacent ranges are combined.

We should disable this, there was some discussion on this recently.
User should request this explicitly to happen through some knob.

> - meta l4proto ipv6-icmp icmpv6 type nd-router-advert;ok;icmpv6 type nd-router-advert
>   -> more abstraction needed.

Not sure what you mean.

> - meta priority 0x87654321;ok;meta priority 8765:4321
>   -> What the heck? :)

This is testing that we accept basetypes, given classid is an integer
basetype.

> - meta iif "lo" accept;ok;iif "lo" accept
>   -> Maybe less abstraction?

This is just dealing with something that is causing us problems, that
is, iif is handled as primary key, so we cannot reuse it in the
grammar given it results in shift-reduce conflicts.

> - tcp dport 22 iiftype ether ip daddr 1.2.3.4 ether saddr 00:0f:54:0c:11:4 accept ok;tcp dport 22 ether saddr 00:0f:54:0c:11:04 ip daddr 1.2.3.4 accept
>   -> Here something is "optimized out", not sure if it should be kept in
>   JSON.

This is testing redudant information, that we can remove given we can
infer it.

> - reject with icmpx type port-unreachable;ok;reject
>   -> This simplification should be easy (or even required) to undo in JSON.

We can probably remove this simplification, so user doesn't need to
remember our default behaviour.

> Those are just some examples. If an application compares JSON it sent to
> libnftables with what it returned, there are definitely things JSON
> wouldn't "hide" (such as reordering in sets or differing number
> formatting).

All these problems/inconsistency you're mentioning will be good if we
can fix them, if json helps us spot them, we'll be killing two birds
in one shot.

Let me know,
Thanks!

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: libnftables extended API proposal
  2017-12-28 19:21                                                 ` Pablo Neira Ayuso
@ 2017-12-29 14:58                                                   ` Phil Sutter
  2018-01-02 18:02                                                     ` Pablo Neira Ayuso
  0 siblings, 1 reply; 28+ messages in thread
From: Phil Sutter @ 2017-12-29 14:58 UTC (permalink / raw)
  To: Pablo Neira Ayuso; +Cc: netfilter-devel, Florian Westphal

[-- Attachment #1: Type: text/plain, Size: 15426 bytes --]

On Thu, Dec 28, 2017 at 08:21:41PM +0100, Pablo Neira Ayuso wrote:
> Hi Phil,
> 
> On Sat, Dec 23, 2017 at 02:19:41PM +0100, Phil Sutter wrote:
> > On Fri, Dec 22, 2017 at 09:39:03PM +0100, Pablo Neira Ayuso wrote:
> > > On Fri, Dec 22, 2017 at 04:30:49PM +0100, Phil Sutter wrote:
> > > > Hi Pablo,
> > > > 
> > > > On Fri, Dec 22, 2017 at 02:49:06PM +0100, Pablo Neira Ayuso wrote:
> > > > > On Fri, Dec 22, 2017 at 02:08:16PM +0100, Phil Sutter wrote:
> > > > > > On Wed, Dec 20, 2017 at 11:23:36PM +0100, Pablo Neira Ayuso wrote:
> > > > > > > On Wed, Dec 20, 2017 at 01:32:25PM +0100, Phil Sutter wrote:
> > > > > > > [...]
> > > > > > > > On Tue, Dec 19, 2017 at 12:00:48AM +0100, Pablo Neira Ayuso wrote:
> > > > > > > > > On Sat, Dec 16, 2017 at 05:06:51PM +0100, Phil Sutter wrote:
> > > > > [...]
> > > > > > > > > I wonder if firewalld could generate high level json representation
> > > > > > > > > instead, so it becomes a compiler/translator from its own
> > > > > > > > > representation to nftables abstract syntax tree. As I said, the json
> > > > > > > > > representation is mapping to the abstract syntax tree we have in nft.
> > > > > > > > > I'm refering to the high level json representation that doesn't exist
> > > > > > > > > yet, not the low level one for libnftnl.
> > > > > > > > 
> > > > > > > > Can you point me to some information about that high level JSON
> > > > > > > > representation? Seems I'm missing something here.
> > > > > > > 
> > > > > > > It doesn't exist :-), if we expose a json-based API, third party tool
> > > > > > > only have to generate the json high-level representation, we would
> > > > > > > need very few API calls for this, and anyone could generate rulesets
> > > > > > > for nftables, without relying on the bison parser, given the json
> > > > > > > representation exposes the abstract syntax tree.
> > > > > > 
> > > > > > So you're idea is to accept a whole command in JSON format from
> > > > > > applications? And output in JSON format as well since that is easier for
> > > > > > parsing than human readable text we have right now?
> > > > > 
> > > > > Just brainstorming here, we're discussing an API for third party
> > > > > applications. In this case, they just need to build the json
> > > > > representation for the ruleset they want to add. They could even embed
> > > > > this into a network message that they can send of the wire.
> > > > > 
> > > > > > I'm not sure about the '[ base, offset, length ]' part though:
> > > > > > Applications would have to care about protocol header layout including
> > > > > > any specialties themselves, or should libnftables provide them with
> > > > > > convenience functions to generate the correct JSON markup?
> > > > > 
> > > > > It depends, you may want to expose json representations for each
> > > > > protocol payload you support.
> > > > > 
> > > > > > For simple stuff like matching on a TCP port there's probably no
> > > > > > need, but correctly interpreting IPv4 ToS field is rather
> > > > > > error-prone I guess.
> > > > > 
> > > > > And bitfields are going to be cumbersome too, so we would indeed need
> > > > > a json representation for each protocol that we support, so third
> > > > > party applications don't need to deal with this.
> > > > > 
> > > > > > The approach seems simple at first, but application input in JSON format
> > > > > > has to be validated as well, so I fear we'll end up with a second parser
> > > > > > to avoid the first one.
> > > > > 
> > > > > There are libraries like jansson that already do the parsing for us,
> > > > > so we don't need to maintain our own json parser. We would still need
> > > > > internal code to libnftables, to navigate the json representation and
> > > > > create the objects.
> > > > 
> > > > Yes sure, there are libraries doing the actual parsing of JSON -
> > > > probably I wasn't clear enough. My point is what happens if you have a
> > > > parsed JSON tree (or array, however it may look like in practice). The
> > > > data sent by the application is either explicit enough for the
> > > > translation into netlink messages to be really trivial, or it is not
> > > > (which I prefer, otherwise applications could use libnftnl directly with
> > > > no drawback) - then we still have to implement a middle layer between
> > > > data in JSON and nftables objects. Maybe an example will do:
> > > > 
> > > > | [{
> > > > | 	"type": "relational",
> > > > | 	"left": {
> > > > | 		"type": "expression",
> > > > | 		"name": "tcp_hdr_expr",
> > > > | 		"value": {
> > > > | 			"type": "tcp_hdr_field",
> > > > | 			"value": "dport",
> > > > | 		},
> > > > | 	},
> > > > | 	"right": {
> > > > | 		"type": "expression",
> > > > | 		"name": "integer_expr",
> > > > | 		"value": 22,
> > > > | 	}
> > > > | }]
> > > 
> > > Probably something more simple representation, like this?
> > > 
> > > [{
> > >         "match": {
> > >                 "left": {
> > >                         "type": "payload",
> > >                         "name": "tcp",
> > >                         "field: "dport",
> > >         	},
> > >                 "right": {
> > >                         "type": "immediate",
> > >                         "value": 22,
> > >                 }
> > >         }
> > > }]
> > > 
> > > For non-matching things, we can add an "action".
> > 
> > You mean for non-EQ type relationals? I would just add a third field
> > below "match", e.g. '"op": "GT"' or so.
> 
> Agreed.
> 
> > > I wonder if this can even be made more simple and more compact indeed.
> > 
> > I guess the more compact it becomes, the less work (less diversity) for
> > applications and the more work ("parsing") on libnftables side.
> 
> Yes, that would place a bit more work on the library, but I think we
> should provide a high level representation that makes it easy for
> people to express things. People should not deal with bitfield
> handling, that's very tricky. Just have a look at commits that I and
> Florian made to fix this. We don't want users to fall into this trap
> by creating incorrect expressions, or worse, making them feel our
> library does not work (even if it's their own mistake to build
> incorrect expressions).

I wonder if it made sense to provide JSON generating helpers. Maybe even
just for sample purposes.

Anyway, I had a look at JSON Schema[1] which should help in this regard
as well.

> So I would go for a high level representation that represent things in
> a way that is understandable to users, and that makes it easy to
> generate ruleset for third party application like firewalld.

ACK.

> > > > So this might be how a relational expression could be represented in
> > > > JSON. Note that I intentionally didn't break it down to payload_expr,
> > > > otherwise it had to contain TCP header offset, etc. (In this case that
> > > > might be preferred, but as stated above it's not the best option in
> > > > every case.)
> > > > 
> > > > Parsing^WInterpreting code would then probably look like:
> > > > 
> > > > | type = json_object_get(data, "type");
> > > > | if (!strcmp(type, "relational")) {
> > > > | 	left = parse_expr(json_object_get(data, "left"));
> > > > | 	right = parse_expr(json_object_get(data, "right"));
> > > > | 	expr = relational_expr_alloc(&internal_location,
> > > > | 				     OP_IMPLICIT, left, right);
> > > > | }
> > > >
> > > > I think this last part might easily become bigger than parser_bison.y
> > > > and scanner.l combined.
> > > >
> > > > > On our side, we would need to maintain a very simple API, basically
> > > > > that allows you to parse a json representation and to export it. For
> > > > > backward compatibility reasons, we have to keep supporting the json
> > > > > layout, instead of a large number of functions.
> > > > 
> > > > Yes, the API *itself* might be a lot smaller since it just takes a chunk
> > > > of JSON for about anything. But the middle layer (which is not exported
> > > > to applications) will be the relevant factor instead.
> > > 
> > > Yes, in C I guess it will be quite a bit of new boiler plate code.
> > > Unless we find a way to autogenerate code skeletons in some way.
> > 
> > Maybe use something like lex/yacc? *SCNR* ;)
> 
> Yes, that's one possibility we could explore.
> 
> > > I'm not so worry about maintaining more code. Real problem is API
> > > in the longterm: you have to stick to them for long time (or forever
> > > if you want if you want to take backward compatibility seriously, we
> > > have a very good record on this).
> > 
> > But isn't the problem of keeping the API compatible comparable to
> > the problem of keeping the JSON representation compatible?
> 
> Well, keeping backward compatibility is always a burden to carry on.
> Even though, to me, JSON is as extensible as netlink is, ie. we can
> add new keys and deprecate things without breaking backward.

Yes, the format is very flexible. But how does that help with
compatibility? You'll have to support the deprecated attributes or JSON
without the new attributes either way, no?

> I worry about exposing the expression as they are in nftables now, I
> consider it still a too low level representation.
> 
> > > And having little API mean, library can internally evolve more freely.
> > 
> > Either way, keeping the API or JSON representation as abstract as
> > possible (without losing too much functionality) is advisable I guess.
> 
> Right.
> 
> > > > > I guess the question here is if this would be good for firewalld, I
> > > > > didn't have a look at that code, but many third party applications I
> > > > > have seen are basically creating iptables commands in text, so this
> > > > > approach would be similar, well, actually better since we would be
> > > > > providing a well-structured representation.
> > > > 
> > > > Yes, of course firewalld builds iptables commands, but just because
> > > > there is no better option. Hence the request for a better libnftables
> > > > API, to avoid repeating that with another command (or function call
> > > > to which the command's parameters are passed).
> > > > 
> > > > Firewalld is written in Python, so it won't be able to use libnftables
> > > > directly, anyway. At least a thin layer of wrapping code will be there,
> > > > even if it's just via ctypes module.
> > > 
> > > Parsing of json in python is actually rather easy, right? I remember
> > > to have seen code mapping XML to an object whose attributes can be
> > > accessed as object fields. I wonder about how hard would be to
> > > generate it too.
> > 
> > Yes, I think JSON suits interpreted languages quite well since (at
> > least Python) comes with datatypes resembling how JSON structures data.
> > E.g. something like:
> > 
> > | [{
> > | 	"foo": [ "bar", "baz" ],
> > | 	"feed": "babe",
> > | }]
> > 
> > In Python would translate to:
> > 
> > | obj = {
> > | 	"foo": [ "bar", "baz" ],
> > | 	"feed": "babe"
> > | }
> > 
> > Undeniable similarity! :)
> 
> That confirms my feeling with these modern languages :)

In fact, the only incompatibility I found so far between JSON and Python
is camel vs. lower case in booleans.

> > > > From my perspective, the argument of being well-structured doesn't quite
> > > > hold. Of course, JSON will provide something like "here starts a
> > > > statement" and "here it ends", but e.g. asynchronism between input and
> > > > output will be reflected by it as well if not solved in common code
> > > > already.
> > > 
> > > I'm not sure we should expose statements and relationals in the way we
> > > do in nftables, it's too low level still for a high level library :-).
> > > We can probably provide a more simplistic json representation that is
> > > human readable and understandable.
> > 
> > Yes, I agree. The big question is who will define the JSON format. :)
> 
> Oh, I can help you on that. Although you're very much closer to
> firewalld usecases that I'm, so probably a draft from you on this
> would be nice ;-)

I went ahead and converted my local ruleset into JSON manually (until I
got bored), see attached ruleset.json. Then I wrote a schema to validate
it (also attached). Please let me know if you're OK with the base
format at least, i.e. everything except expressions and statements. Any
feedback on the few statements and expressions I put in there is highly
appreciated as well, of course! :)

> > > Regarding asynchronism between input and output, not sure I follow.
> > 
> > I am grepping through tests/py/*/*.t for pattern 'ok;':
> > 
> > - Adjacent ranges are combined.
> 
> We should disable this, there was some discussion on this recently.
> User should request this explicitly to happen through some knob.

I agree. While it's a nice idea, adding two adjacent ranges and later
wanting to delete one of them is troublesome. Do you have a name in mind
for that knob? Maybe something more generic which could cover other
cases of overly intelligent nft (in the future) as well?

> > - meta l4proto ipv6-icmp icmpv6 type nd-router-advert;ok;icmpv6 type nd-router-advert
> >   -> more abstraction needed.
> 
> Not sure what you mean.

I meant 'icmpv6 ...' should imply 'meta l4proto ipv6-icmp', but that's
the case already. Maybe this is a similar case as the combining of
adjacent ranges in that it's a good thing per se but might not suit
users' preferences.

> > - meta priority 0x87654321;ok;meta priority 8765:4321
> >   -> What the heck? :)
> 
> This is testing that we accept basetypes, given classid is an integer
> basetype.

Ah, I see.

> > - meta iif "lo" accept;ok;iif "lo" accept
> >   -> Maybe less abstraction?
> 
> This is just dealing with something that is causing us problems, that
> is, iif is handled as primary key, so we cannot reuse it in the
> grammar given it results in shift-reduce conflicts.

The question is why allow both variants then? Since 'iif' is being used
as fib_flag as well, using 'iif' alone should be deprecated. Or is this
a case of backwards compatibility?

Anyway, this won't be an issue in JSON at least.

> > - tcp dport 22 iiftype ether ip daddr 1.2.3.4 ether saddr 00:0f:54:0c:11:4 accept ok;tcp dport 22 ether saddr 00:0f:54:0c:11:04 ip daddr 1.2.3.4 accept
> >   -> Here something is "optimized out", not sure if it should be kept in
> >   JSON.
> 
> This is testing redudant information, that we can remove given we can
> infer it.

Yeah, similar situation to the 'meta l4proto' one above. The (still)
tricky part is to communicate assigned handles back to the application.
Maybe we could return an exact copy of their JSON with just handle
properties added?

> > - reject with icmpx type port-unreachable;ok;reject
> >   -> This simplification should be easy (or even required) to undo in JSON.
> 
> We can probably remove this simplification, so user doesn't need to
> remember our default behaviour.

As long as doing this consequently doesn't bloat rules too much, sounds
fine!

> > Those are just some examples. If an application compares JSON it sent to
> > libnftables with what it returned, there are definitely things JSON
> > wouldn't "hide" (such as reordering in sets or differing number
> > formatting).
> 
> All these problems/inconsistency you're mentioning will be good if we
> can fix them, if json helps us spot them, we'll be killing two birds
> in one shot.

It doesn't, but just suffers from it as well. Grepping 'ok;' in
tests/py/*/*.t does, though! :)

Cheers, Phil

[1] http://json-schema.org/

[-- Attachment #2: ruleset.json --]
[-- Type: application/json, Size: 3152 bytes --]

[-- Attachment #3: ruleset.schema --]
[-- Type: text/plain, Size: 2910 bytes --]

{
	"id": "http://netfilter.org/nftables/ruleset-schema",
	"$schema": "http://json-schema.org/draft-06/schema#",
	"description": "schema for an nftables ruleset dump",
	"type": "array",
	"minitems": 0,
	"items": {
		"type": "object",
		"required": [ "name", "family" ],
		"properties": {
			"name": { "type": "string" },
			"family": { "type": "string" },
			"sets": { "$ref": "#/definitions/sets" },
			"chains": { "$ref": "#/definitions/chains" }
		}
	},
        "uniqueItems": true,
	"definitions": {
		"expression": {
			"type": "object",
			"required": [ "type" ],
			"properties": {
				"type": { "type": "string" },
				"name": { "type": "string" },
				"field": { "type": "string" },
				"val": {
					"oneOf": [
						{ "type": "string" },
						{ "type": "integer" }
					]
				}
			}
		},
		"set": {
			"type": "object",
			"required": [ "name", "type" ],
			"properties": {
				"name": { "type": "string" },
				"type": { "type": "string" },
				"elements": {
					"type": "array",
					"minitems": 0,
					"items": { "type": "string" }
				}
			}
		},
		"sets": {
			"type": "array",
			"minitems": 0,
			"items": { "$ref": "#/definitions/set" },
			"uniqueItems": true
		},
		"statement": {
			"type": "object",
			"oneOf": [{
				"required": [ "match" ],
				"properties": {
					"match": {
						"type": "object",
						"required": [ "left", "right" ],
						"properties": {
							"left": {
								"$ref": "#/definitions/expression"
							},
							"op": { "enum": [ "EQ", "NEQ", "LT", "GT", "GTE", "LTE" ] },
							"right": {
								"$ref": "#/definitions/expression"
							}
						}
					}
				}
			}, {
				"required": [ "accept" ],
				"properties": { "accept": { "type": "string" } }
			}, {
				"required": [ "drop" ],
				"properties": { "drop": { "type": "string" } }
			}, {
				"required": [ "reject" ],
				"properties": { "reject": { "type": "string" } }
			}, {
				"required": [ "jump" ],
				"properties": { "jump": { "type": "string" } }
			}]
		},
		"statements": {
			"type": "array",
			"minitems": 0,
			"items": { "$ref": "#/definitions/statement" }
		},
		"rule": {
			"type": "object",
			"required": [ "handle" ],
			"properties": {
				"handle": {
					"type": "integer",
					"minimum": 1
				},
				"statements": { "$ref": "#/definitions/statements" }
			}
		},
		"rules": {
			"type": "array",
			"minitems": 0,
			"items": { "$ref": "#/definitions/rule" }
		},
		"chain": {
			"type": "object",
			"required": [ "name" ],
			"properties": {
				"name": { "type": "string" },
				"type": { "type": "string" },
				"hook": { "type": "string" },
				"priority": {
					"type": "integer",
					"minimum": -1000,
					"maximum": 1000
				},
				"policy": { "type": "string" },
				"rules": { "$ref": "#/definitions/rules" }
			}
		},
		"chains": {
			"type": "array",
			"minitems": 0,
			"items": { "$ref": "#/definitions/chain" },
			"uniqueItems": true
		}
	}
}

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: libnftables extended API proposal
  2017-12-29 14:58                                                   ` Phil Sutter
@ 2018-01-02 18:02                                                     ` Pablo Neira Ayuso
  2018-01-05 17:52                                                       ` Phil Sutter
  0 siblings, 1 reply; 28+ messages in thread
From: Pablo Neira Ayuso @ 2018-01-02 18:02 UTC (permalink / raw)
  To: Phil Sutter, netfilter-devel, Florian Westphal

Hi Phil,

On Fri, Dec 29, 2017 at 03:58:16PM +0100, Phil Sutter wrote:
> On Thu, Dec 28, 2017 at 08:21:41PM +0100, Pablo Neira Ayuso wrote:
> > On Sat, Dec 23, 2017 at 02:19:41PM +0100, Phil Sutter wrote:
[...]
> > Yes, that would place a bit more work on the library, but I think we
> > should provide a high level representation that makes it easy for
> > people to express things. People should not deal with bitfield
> > handling, that's very tricky. Just have a look at commits that I and
> > Florian made to fix this. We don't want users to fall into this trap
> > by creating incorrect expressions, or worse, making them feel our
> > library does not work (even if it's their own mistake to build
> > incorrect expressions).
> 
> I wonder if it made sense to provide JSON generating helpers. Maybe even
> just for sample purposes.

If you add them, I prefer you keep them private internally, not
exposed through API.

[...]
> > > But isn't the problem of keeping the API compatible comparable to
> > > the problem of keeping the JSON representation compatible?
> > 
> > Well, keeping backward compatibility is always a burden to carry on.
> > Even though, to me, JSON is as extensible as netlink is, ie. we can
> > add new keys and deprecate things without breaking backward.
> 
> Yes, the format is very flexible. But how does that help with
> compatibility? You'll have to support the deprecated attributes or JSON
> without the new attributes either way, no?

Probably it's my subjective judging that maintaing json layout will be
easier rather than a large bunch of APIs and internal object
representations through getter/setter.

Anyway, these days, we expect people to use modern languages to build
upper layers in the stack, right? And that seems to mix well with JSON.
Again core infrastructure person here talking about upper layers, so
take this lightly ;-).

[...]
> > Oh, I can help you on that. Although you're very much closer to
> > firewalld usecases that I'm, so probably a draft from you on this
> > would be nice ;-)
> 
> I went ahead and converted my local ruleset into JSON manually (until I
> got bored), see attached ruleset.json. Then I wrote a schema to validate
> it (also attached). Please let me know if you're OK with the base
> format at least, i.e. everything except expressions and statements. Any
> feedback on the few statements and expressions I put in there is highly
> appreciated as well, of course! :)

Probably instead of having left and right, we can replace it by:

"match" : {
        "key": {
                ...
        },
        "value": "lo"
}

Then, allow to place expressions as "value" when it comes from a set
lookup.

> > > > Regarding asynchronism between input and output, not sure I follow.
> > > 
> > > I am grepping through tests/py/*/*.t for pattern 'ok;':
> > > 
> > > - Adjacent ranges are combined.
> > 
> > We should disable this, there was some discussion on this recently.
> > User should request this explicitly to happen through some knob.
> 
> I agree. While it's a nice idea, adding two adjacent ranges and later
> wanting to delete one of them is troublesome. Do you have a name in mind
> for that knob? Maybe something more generic which could cover other
> cases of overly intelligent nft (in the future) as well?

Probably "coalesce" or "merge".

> > > - meta l4proto ipv6-icmp icmpv6 type nd-router-advert;ok;icmpv6 type nd-router-advert
> > >   -> more abstraction needed.
> > 
> > Not sure what you mean.
> 
> I meant 'icmpv6 ...' should imply 'meta l4proto ipv6-icmp', but that's
> the case already. Maybe this is a similar case as the combining of
> adjacent ranges in that it's a good thing per se but might not suit
> users' preferences.

This is an implicit dependency, right? I think this test just makes sure
that we handle an explicit dependencies - in case the user decides to be
too verbose - just fine.

> > > - meta priority 0x87654321;ok;meta priority 8765:4321
> > >   -> What the heck? :)
> > 
> > This is testing that we accept basetypes, given classid is an integer
> > basetype.
> 
> Ah, I see.
> 
> > > - meta iif "lo" accept;ok;iif "lo" accept
> > >   -> Maybe less abstraction?
> > 
> > This is just dealing with something that is causing us problems, that
> > is, iif is handled as primary key, so we cannot reuse it in the
> > grammar given it results in shift-reduce conflicts.
> 
> The question is why allow both variants then? Since 'iif' is being used
> as fib_flag as well, using 'iif' alone should be deprecated. Or is this
> a case of backwards compatibility?

It was compromise solution, not to break syntax all of a sudden,
allowing old and new ways for a little while. But this one, I think it
was not add this.

> Anyway, this won't be an issue in JSON at least.
> 
> > > - tcp dport 22 iiftype ether ip daddr 1.2.3.4 ether saddr 00:0f:54:0c:11:4 accept ok;tcp dport 22 ether saddr 00:0f:54:0c:11:04 ip daddr 1.2.3.4 accept
> > >   -> Here something is "optimized out", not sure if it should be kept in
> > >   JSON.
> > 
> > This is testing redudant information, that we can remove given we can
> > infer it.
> 
> Yeah, similar situation to the 'meta l4proto' one above. The (still)
> tricky part is to communicate assigned handles back to the application.
> Maybe we could return an exact copy of their JSON with just handle
> properties added?

Yes, that should be fine. We already have the NLM_F_ECHO in place,
just a matter of supporting json there.

BTW, a simple testsuite for this would be good too.

Thanks!

P.S: Happy new year BTW.

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: libnftables extended API proposal
  2018-01-02 18:02                                                     ` Pablo Neira Ayuso
@ 2018-01-05 17:52                                                       ` Phil Sutter
  2018-01-09 23:31                                                         ` Pablo Neira Ayuso
  0 siblings, 1 reply; 28+ messages in thread
From: Phil Sutter @ 2018-01-05 17:52 UTC (permalink / raw)
  To: Pablo Neira Ayuso; +Cc: netfilter-devel, Florian Westphal

Hi Pablo,

On Tue, Jan 02, 2018 at 07:02:19PM +0100, Pablo Neira Ayuso wrote:
> On Fri, Dec 29, 2017 at 03:58:16PM +0100, Phil Sutter wrote:
> > On Thu, Dec 28, 2017 at 08:21:41PM +0100, Pablo Neira Ayuso wrote:
> > > On Sat, Dec 23, 2017 at 02:19:41PM +0100, Phil Sutter wrote:
[...]
> > > > But isn't the problem of keeping the API compatible comparable to
> > > > the problem of keeping the JSON representation compatible?
> > > 
> > > Well, keeping backward compatibility is always a burden to carry on.
> > > Even though, to me, JSON is as extensible as netlink is, ie. we can
> > > add new keys and deprecate things without breaking backward.
> > 
> > Yes, the format is very flexible. But how does that help with
> > compatibility? You'll have to support the deprecated attributes or JSON
> > without the new attributes either way, no?
> 
> Probably it's my subjective judging that maintaing json layout will be
> easier rather than a large bunch of APIs and internal object
> representations through getter/setter.
> 
> Anyway, these days, we expect people to use modern languages to build
> upper layers in the stack, right? And that seems to mix well with JSON.
> Again core infrastructure person here talking about upper layers, so
> take this lightly ;-).

Yeah, for firewalld at least JSON is not a disadvantage. Not sure about
projects written in C though, but that's a different kettle of fish. :)

> [...]
> > > Oh, I can help you on that. Although you're very much closer to
> > > firewalld usecases that I'm, so probably a draft from you on this
> > > would be nice ;-)
> > 
> > I went ahead and converted my local ruleset into JSON manually (until I
> > got bored), see attached ruleset.json. Then I wrote a schema to validate
> > it (also attached). Please let me know if you're OK with the base
> > format at least, i.e. everything except expressions and statements. Any
> > feedback on the few statements and expressions I put in there is highly
> > appreciated as well, of course! :)
> 
> Probably instead of having left and right, we can replace it by:
> 
> "match" : {
>         "key": {
>                 ...
>         },
>         "value": "lo"
> }
> 
> Then, allow to place expressions as "value" when it comes from a set
> lookup.

Yes, JSON Schema allows to define multiple possible types for an
attribute (see #/definitions/expression/val for instance). But I don't
follow regarding set lookup: There are other uses for an expression on
RHS as well, no?

> > > > > Regarding asynchronism between input and output, not sure I follow.
> > > > 
> > > > I am grepping through tests/py/*/*.t for pattern 'ok;':
> > > > 
> > > > - Adjacent ranges are combined.
> > > 
> > > We should disable this, there was some discussion on this recently.
> > > User should request this explicitly to happen through some knob.
> > 
> > I agree. While it's a nice idea, adding two adjacent ranges and later
> > wanting to delete one of them is troublesome. Do you have a name in mind
> > for that knob? Maybe something more generic which could cover other
> > cases of overly intelligent nft (in the future) as well?
> 
> Probably "coalesce" or "merge".

So you'd prefer a specific flag just for that feature? I had a more
generic one in mind, something like "optimize" for instance.

[...]
> > > > - meta iif "lo" accept;ok;iif "lo" accept
> > > >   -> Maybe less abstraction?
> > > 
> > > This is just dealing with something that is causing us problems, that
> > > is, iif is handled as primary key, so we cannot reuse it in the
> > > grammar given it results in shift-reduce conflicts.
> > 
> > The question is why allow both variants then? Since 'iif' is being used
> > as fib_flag as well, using 'iif' alone should be deprecated. Or is this
> > a case of backwards compatibility?
> 
> It was compromise solution, not to break syntax all of a sudden,
> allowing old and new ways for a little while. But this one, I think it
> was not add this.

I couldn't parse your last sentence here. :)

> > > > - tcp dport 22 iiftype ether ip daddr 1.2.3.4 ether saddr 00:0f:54:0c:11:4 accept ok;tcp dport 22 ether saddr 00:0f:54:0c:11:04 ip daddr 1.2.3.4 accept
> > > >   -> Here something is "optimized out", not sure if it should be kept in
> > > >   JSON.
> > > 
> > > This is testing redudant information, that we can remove given we can
> > > infer it.
> > 
> > Yeah, similar situation to the 'meta l4proto' one above. The (still)
> > tricky part is to communicate assigned handles back to the application.
> > Maybe we could return an exact copy of their JSON with just handle
> > properties added?
> 
> Yes, that should be fine. We already have the NLM_F_ECHO in place,
> just a matter of supporting json there.
> 
> BTW, a simple testsuite for this would be good too.

Sure! Maybe the existing data in tests/py could be reused (the *.payload
files at least).

> P.S: Happy new year BTW.

Thanks, likewise! :)

Cheers, Phil

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: libnftables extended API proposal
  2018-01-05 17:52                                                       ` Phil Sutter
@ 2018-01-09 23:31                                                         ` Pablo Neira Ayuso
  2018-01-10  4:46                                                           ` mark diener
  0 siblings, 1 reply; 28+ messages in thread
From: Pablo Neira Ayuso @ 2018-01-09 23:31 UTC (permalink / raw)
  To: Phil Sutter, netfilter-devel, Florian Westphal

On Fri, Jan 05, 2018 at 06:52:03PM +0100, Phil Sutter wrote:
> Hi Pablo,
> 
> On Tue, Jan 02, 2018 at 07:02:19PM +0100, Pablo Neira Ayuso wrote:
> > On Fri, Dec 29, 2017 at 03:58:16PM +0100, Phil Sutter wrote:
> > > On Thu, Dec 28, 2017 at 08:21:41PM +0100, Pablo Neira Ayuso wrote:
> > > > On Sat, Dec 23, 2017 at 02:19:41PM +0100, Phil Sutter wrote:
> [...]
> > > > > But isn't the problem of keeping the API compatible comparable to
> > > > > the problem of keeping the JSON representation compatible?
> > > > 
> > > > Well, keeping backward compatibility is always a burden to carry on.
> > > > Even though, to me, JSON is as extensible as netlink is, ie. we can
> > > > add new keys and deprecate things without breaking backward.
> > > 
> > > Yes, the format is very flexible. But how does that help with
> > > compatibility? You'll have to support the deprecated attributes or JSON
> > > without the new attributes either way, no?
> > 
> > Probably it's my subjective judging that maintaing json layout will be
> > easier rather than a large bunch of APIs and internal object
> > representations through getter/setter.
> > 
> > Anyway, these days, we expect people to use modern languages to build
> > upper layers in the stack, right? And that seems to mix well with JSON.
> > Again core infrastructure person here talking about upper layers, so
> > take this lightly ;-).
> 
> Yeah, for firewalld at least JSON is not a disadvantage. Not sure about
> projects written in C though, but that's a different kettle of fish. :)

It seems to me many of these new control plane/orchestration software
is not done in C, so this representation can be also useful to them.

> > [...]
> > > > Oh, I can help you on that. Although you're very much closer to
> > > > firewalld usecases that I'm, so probably a draft from you on this
> > > > would be nice ;-)
> > > 
> > > I went ahead and converted my local ruleset into JSON manually (until I
> > > got bored), see attached ruleset.json. Then I wrote a schema to validate
> > > it (also attached). Please let me know if you're OK with the base
> > > format at least, i.e. everything except expressions and statements. Any
> > > feedback on the few statements and expressions I put in there is highly
> > > appreciated as well, of course! :)
> > 
> > Probably instead of having left and right, we can replace it by:
> > 
> > "match" : {
> >         "key": {
> >                 ...
> >         },
> >         "value": "lo"
> > }
> > 
> > Then, allow to place expressions as "value" when it comes from a set
> > lookup.
> 
> Yes, JSON Schema allows to define multiple possible types for an
> attribute (see #/definitions/expression/val for instance). But I don't
> follow regarding set lookup: There are other uses for an expression on
> RHS as well, no?
> 
> > > > > > Regarding asynchronism between input and output, not sure I follow.
> > > > > 
> > > > > I am grepping through tests/py/*/*.t for pattern 'ok;':
> > > > > 
> > > > > - Adjacent ranges are combined.
> > > > 
> > > > We should disable this, there was some discussion on this recently.
> > > > User should request this explicitly to happen through some knob.
> > > 
> > > I agree. While it's a nice idea, adding two adjacent ranges and later
> > > wanting to delete one of them is troublesome. Do you have a name in mind
> > > for that knob? Maybe something more generic which could cover other
> > > cases of overly intelligent nft (in the future) as well?
> > 
> > Probably "coalesce" or "merge".
> 
> So you'd prefer a specific flag just for that feature? I had a more
> generic one in mind, something like "optimize" for instance.

At this stage, I'm consider to disable range automerge before 0.8.1 is
released. So we can revisit this later on with something that users
will explicitly enable on demand.

> [...]
> > > > > - meta iif "lo" accept;ok;iif "lo" accept
> > > > >   -> Maybe less abstraction?
> > > > 
> > > > This is just dealing with something that is causing us problems, that
> > > > is, iif is handled as primary key, so we cannot reuse it in the
> > > > grammar given it results in shift-reduce conflicts.
> > > 
> > > The question is why allow both variants then? Since 'iif' is being used
> > > as fib_flag as well, using 'iif' alone should be deprecated. Or is this
> > > a case of backwards compatibility?
> > 
> > It was compromise solution, not to break syntax all of a sudden,
> > allowing old and new ways for a little while. But this one, I think it
> > was not add this.
> 
> I couldn't parse your last sentence here. :)

Sorry, I mean. 'meta iff' came first, then 'iif' was added. To avoid
breaking things, the old meta iff has been preserved.

> > > > > - tcp dport 22 iiftype ether ip daddr 1.2.3.4 ether saddr 00:0f:54:0c:11:4 accept ok;tcp dport 22 ether saddr 00:0f:54:0c:11:04 ip daddr 1.2.3.4 accept
> > > > >   -> Here something is "optimized out", not sure if it should be kept in
> > > > >   JSON.
> > > > 
> > > > This is testing redudant information, that we can remove given we can
> > > > infer it.
> > > 
> > > Yeah, similar situation to the 'meta l4proto' one above. The (still)
> > > tricky part is to communicate assigned handles back to the application.
> > > Maybe we could return an exact copy of their JSON with just handle
> > > properties added?
> > 
> > Yes, that should be fine. We already have the NLM_F_ECHO in place,
> > just a matter of supporting json there.
> > 
> > BTW, a simple testsuite for this would be good too.
> 
> Sure! Maybe the existing data in tests/py could be reused (the *.payload
> files at least).

That would be fine.

Thanks!

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: libnftables extended API proposal
  2018-01-09 23:31                                                         ` Pablo Neira Ayuso
@ 2018-01-10  4:46                                                           ` mark diener
  2018-01-10 10:39                                                             ` Phil Sutter
  0 siblings, 1 reply; 28+ messages in thread
From: mark diener @ 2018-01-10  4:46 UTC (permalink / raw)
  To: Pablo Neira Ayuso; +Cc: Phil Sutter, netfilter-devel, Florian Westphal

Why don't you just put a JSON layer above the c-based libnftl 0.9 ?

That way, whatever is working in C-based API can then get JSON support
and disrupt the apple cart.

Call it libnftljson-0.9.so, which is then dependent on libnftl-0.9.so

But keep the c-based api the c-based api
and the JSON calling translater-to-c api a different optional library.


Good idea?  Already done?

On Tue, Jan 9, 2018 at 5:31 PM, Pablo Neira Ayuso <pablo@netfilter.org> wrote:
> On Fri, Jan 05, 2018 at 06:52:03PM +0100, Phil Sutter wrote:
>> Hi Pablo,
>>
>> On Tue, Jan 02, 2018 at 07:02:19PM +0100, Pablo Neira Ayuso wrote:
>> > On Fri, Dec 29, 2017 at 03:58:16PM +0100, Phil Sutter wrote:
>> > > On Thu, Dec 28, 2017 at 08:21:41PM +0100, Pablo Neira Ayuso wrote:
>> > > > On Sat, Dec 23, 2017 at 02:19:41PM +0100, Phil Sutter wrote:
>> [...]
>> > > > > But isn't the problem of keeping the API compatible comparable to
>> > > > > the problem of keeping the JSON representation compatible?
>> > > >
>> > > > Well, keeping backward compatibility is always a burden to carry on.
>> > > > Even though, to me, JSON is as extensible as netlink is, ie. we can
>> > > > add new keys and deprecate things without breaking backward.
>> > >
>> > > Yes, the format is very flexible. But how does that help with
>> > > compatibility? You'll have to support the deprecated attributes or JSON
>> > > without the new attributes either way, no?
>> >
>> > Probably it's my subjective judging that maintaing json layout will be
>> > easier rather than a large bunch of APIs and internal object
>> > representations through getter/setter.
>> >
>> > Anyway, these days, we expect people to use modern languages to build
>> > upper layers in the stack, right? And that seems to mix well with JSON.
>> > Again core infrastructure person here talking about upper layers, so
>> > take this lightly ;-).
>>
>> Yeah, for firewalld at least JSON is not a disadvantage. Not sure about
>> projects written in C though, but that's a different kettle of fish. :)
>
> It seems to me many of these new control plane/orchestration software
> is not done in C, so this representation can be also useful to them.
>
>> > [...]
>> > > > Oh, I can help you on that. Although you're very much closer to
>> > > > firewalld usecases that I'm, so probably a draft from you on this
>> > > > would be nice ;-)
>> > >
>> > > I went ahead and converted my local ruleset into JSON manually (until I
>> > > got bored), see attached ruleset.json. Then I wrote a schema to validate
>> > > it (also attached). Please let me know if you're OK with the base
>> > > format at least, i.e. everything except expressions and statements. Any
>> > > feedback on the few statements and expressions I put in there is highly
>> > > appreciated as well, of course! :)
>> >
>> > Probably instead of having left and right, we can replace it by:
>> >
>> > "match" : {
>> >         "key": {
>> >                 ...
>> >         },
>> >         "value": "lo"
>> > }
>> >
>> > Then, allow to place expressions as "value" when it comes from a set
>> > lookup.
>>
>> Yes, JSON Schema allows to define multiple possible types for an
>> attribute (see #/definitions/expression/val for instance). But I don't
>> follow regarding set lookup: There are other uses for an expression on
>> RHS as well, no?
>>
>> > > > > > Regarding asynchronism between input and output, not sure I follow.
>> > > > >
>> > > > > I am grepping through tests/py/*/*.t for pattern 'ok;':
>> > > > >
>> > > > > - Adjacent ranges are combined.
>> > > >
>> > > > We should disable this, there was some discussion on this recently.
>> > > > User should request this explicitly to happen through some knob.
>> > >
>> > > I agree. While it's a nice idea, adding two adjacent ranges and later
>> > > wanting to delete one of them is troublesome. Do you have a name in mind
>> > > for that knob? Maybe something more generic which could cover other
>> > > cases of overly intelligent nft (in the future) as well?
>> >
>> > Probably "coalesce" or "merge".
>>
>> So you'd prefer a specific flag just for that feature? I had a more
>> generic one in mind, something like "optimize" for instance.
>
> At this stage, I'm consider to disable range automerge before 0.8.1 is
> released. So we can revisit this later on with something that users
> will explicitly enable on demand.
>
>> [...]
>> > > > > - meta iif "lo" accept;ok;iif "lo" accept
>> > > > >   -> Maybe less abstraction?
>> > > >
>> > > > This is just dealing with something that is causing us problems, that
>> > > > is, iif is handled as primary key, so we cannot reuse it in the
>> > > > grammar given it results in shift-reduce conflicts.
>> > >
>> > > The question is why allow both variants then? Since 'iif' is being used
>> > > as fib_flag as well, using 'iif' alone should be deprecated. Or is this
>> > > a case of backwards compatibility?
>> >
>> > It was compromise solution, not to break syntax all of a sudden,
>> > allowing old and new ways for a little while. But this one, I think it
>> > was not add this.
>>
>> I couldn't parse your last sentence here. :)
>
> Sorry, I mean. 'meta iff' came first, then 'iif' was added. To avoid
> breaking things, the old meta iff has been preserved.
>
>> > > > > - tcp dport 22 iiftype ether ip daddr 1.2.3.4 ether saddr 00:0f:54:0c:11:4 accept ok;tcp dport 22 ether saddr 00:0f:54:0c:11:04 ip daddr 1.2.3.4 accept
>> > > > >   -> Here something is "optimized out", not sure if it should be kept in
>> > > > >   JSON.
>> > > >
>> > > > This is testing redudant information, that we can remove given we can
>> > > > infer it.
>> > >
>> > > Yeah, similar situation to the 'meta l4proto' one above. The (still)
>> > > tricky part is to communicate assigned handles back to the application.
>> > > Maybe we could return an exact copy of their JSON with just handle
>> > > properties added?
>> >
>> > Yes, that should be fine. We already have the NLM_F_ECHO in place,
>> > just a matter of supporting json there.
>> >
>> > BTW, a simple testsuite for this would be good too.
>>
>> Sure! Maybe the existing data in tests/py could be reused (the *.payload
>> files at least).
>
> That would be fine.
>
> Thanks!
> --
> To unsubscribe from this list: send the line "unsubscribe netfilter-devel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: libnftables extended API proposal
  2018-01-10  4:46                                                           ` mark diener
@ 2018-01-10 10:39                                                             ` Phil Sutter
  0 siblings, 0 replies; 28+ messages in thread
From: Phil Sutter @ 2018-01-10 10:39 UTC (permalink / raw)
  To: mark diener; +Cc: Pablo Neira Ayuso, netfilter-devel, Florian Westphal

Hi Mark,

On Tue, Jan 09, 2018 at 10:46:14PM -0600, mark diener wrote:
> Why don't you just put a JSON layer above the c-based libnftl 0.9 ?
> 
> That way, whatever is working in C-based API can then get JSON support
> and disrupt the apple cart.
> 
> Call it libnftljson-0.9.so, which is then dependent on libnftl-0.9.so
> 
> But keep the c-based api the c-based api
> and the JSON calling translater-to-c api a different optional library.

I'm not sure I follow. The reason we started working on this libnftables
in the first place was that everyone considered libnftnl way too
low-level for direct use in applications.

Are you suggesting to basically reimplement libnftables with JSON
interfacing in both directions?

Cheers, Phil

^ permalink raw reply	[flat|nested] 28+ messages in thread

end of thread, other threads:[~2018-01-10 10:39 UTC | newest]

Thread overview: 28+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-11-16 19:10 [nft PATCH] libnftables: Fix for multiple context instances Phil Sutter
2017-11-20 12:37 ` Pablo Neira Ayuso
2017-11-20 12:54   ` Phil Sutter
2017-11-20 13:07     ` Pablo Neira Ayuso
2017-11-20 15:58       ` Phil Sutter
2017-11-20 16:53         ` Pablo Neira Ayuso
2017-11-22 17:49           ` Phil Sutter
2017-11-22 18:18             ` Pablo Neira Ayuso
     [not found]             ` <20171204100955.GA1822@salvia>
     [not found]               ` <20171204105324.GX32305@orbyte.nwl.cc>
     [not found]                 ` <20171204110142.GA19776@salvia>
     [not found]                   ` <20171204164327.GA32305@orbyte.nwl.cc>
     [not found]                     ` <20171204184604.GA1556@salvia>
2017-12-05 13:43                       ` libnftables extended API proposal (Was: Re: [nft PATCH] libnftables: Fix for multiple context instances) Phil Sutter
2017-12-07  0:05                         ` Pablo Neira Ayuso
2017-12-07 11:34                           ` Phil Sutter
2017-12-10 21:55                             ` Pablo Neira Ayuso
2017-12-16 16:06                               ` libnftables extended API proposal Phil Sutter
2017-12-18 23:00                                 ` Pablo Neira Ayuso
2017-12-20 12:32                                   ` Phil Sutter
2017-12-20 22:23                                     ` Pablo Neira Ayuso
2017-12-22 13:08                                       ` Phil Sutter
2017-12-22 13:49                                         ` Pablo Neira Ayuso
2017-12-22 15:30                                           ` Phil Sutter
2017-12-22 20:39                                             ` Pablo Neira Ayuso
2017-12-23 13:19                                               ` Phil Sutter
2017-12-28 19:21                                                 ` Pablo Neira Ayuso
2017-12-29 14:58                                                   ` Phil Sutter
2018-01-02 18:02                                                     ` Pablo Neira Ayuso
2018-01-05 17:52                                                       ` Phil Sutter
2018-01-09 23:31                                                         ` Pablo Neira Ayuso
2018-01-10  4:46                                                           ` mark diener
2018-01-10 10:39                                                             ` Phil Sutter

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.