All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH v3] netfilter: nft_meta: Add NFT_META_I/OIFKIND meta type
@ 2019-01-15 23:53 wenxu
  2019-01-18 14:24 ` Pablo Neira Ayuso
  0 siblings, 1 reply; 8+ messages in thread
From: wenxu @ 2019-01-15 23:53 UTC (permalink / raw)
  To: pablo, fw; +Cc: netdev, netfilter-devel

From: wenxu <wenxu@ucloud.cn>

In the ip_rcv the skb go through the PREROUTING hook first,
Then jump in vrf device go through the same hook again.
When conntrack dnat work with vrf, there will be some conflict for rules.
Because the package go through the hook twice with different nf status

ip link add user1 type vrf table 1
ip link add user2 type vrf table 2
ip l set dev tun1 master user1
ip l set dev tun2 master user2

nft add table firewall
nft add chain firewall zones { type filter hook prerouting  priority - 300 \; }
nft add rule firewall zones counter ct zone set iif map { "tun1" : 1, "tun2" : 2 }
nft add chain firewall rule-1000-ingress
nft add rule firewall rule-1000-ingress ct zone 1 tcp dport 22 ct state new counter accept
nft add rule firewall rule-1000-ingress counter drop
nft add chain firewall rule-1000-egress
nft add rule firewall rule-1000-egress tcp dport 22 ct state new counter drop
nft add rule firewall rule-1000-egress counter accept

nft add chain firewall rules-all { type filter hook prerouting priority - 150 \; }
nft add rule firewall rules-all ip daddr vmap { "2.2.2.11" : jump rule-1000-ingress }
nft add rule firewall rules-all ct zone vmap { 1 : jump rule-1000-egress }

nft add rule firewall dnat-all ct zone vmap { 1 : jump dnat-1000 }
nft add rule firewall dnat-1000 ip daddr 2.2.2.11 counter dnat to 10.0.0.7

For a package with ip daddr 2.2.2.11 and tcp dport 22, first time accept in the
rule-1000-ingress and dnat to 10.0.0.7. Then second time the packet goto the wrong
chain rule-1000-egress which leads the packet drop

so with this patch userspace can add the 'don't re-do entire ruleset for vrf' policy
itself like the following

nft add rule firewall rules-all meta iifkind "vrf" counter accept

Signed-off-by: wenxu <wenxu@ucloud.cn>
---
 include/uapi/linux/netfilter/nf_tables.h |  4 ++++
 net/netfilter/nft_meta.c                 | 12 ++++++++++++
 2 files changed, 16 insertions(+)

diff --git a/include/uapi/linux/netfilter/nf_tables.h b/include/uapi/linux/netfilter/nf_tables.h
index 7de4f1b..046b997 100644
--- a/include/uapi/linux/netfilter/nf_tables.h
+++ b/include/uapi/linux/netfilter/nf_tables.h
@@ -789,6 +789,8 @@ enum nft_exthdr_attributes {
  * @NFT_META_CGROUP: socket control group (skb->sk->sk_classid)
  * @NFT_META_PRANDOM: a 32bit pseudo-random number
  * @NFT_META_SECPATH: boolean, secpath_exists (!!skb->sp)
+ * @NFT_META_IIFKIND: packet input interface kind name (dev->rtnl_link_ops->kind)
+ * @NFT_META_OIFKIND: packet output interface kind name (dev->rtnl_link_ops->kind)
  */
 enum nft_meta_keys {
 	NFT_META_LEN,
@@ -817,6 +819,8 @@ enum nft_meta_keys {
 	NFT_META_CGROUP,
 	NFT_META_PRANDOM,
 	NFT_META_SECPATH,
+	NFT_META_IIFKIND,
+	NFT_META_OIFKIND,
 };
 
 /**
diff --git a/net/netfilter/nft_meta.c b/net/netfilter/nft_meta.c
index 6df486c..987d2d6 100644
--- a/net/netfilter/nft_meta.c
+++ b/net/netfilter/nft_meta.c
@@ -244,6 +244,16 @@ void nft_meta_get_eval(const struct nft_expr *expr,
 		strncpy((char *)dest, p->br->dev->name, IFNAMSIZ);
 		return;
 #endif
+	case NFT_META_IIFKIND:
+		if (in == NULL || in->rtnl_link_ops == NULL)
+			goto err;
+		strncpy((char *)dest, in->rtnl_link_ops->kind, IFNAMSIZ);
+		break;
+	case NFT_META_OIFKIND:
+		if (out == NULL || out->rtnl_link_ops == NULL)
+			goto err;
+		strncpy((char *)dest, out->rtnl_link_ops->kind, IFNAMSIZ);
+		break;
 	default:
 		WARN_ON(1);
 		goto err;
@@ -340,6 +350,8 @@ static int nft_meta_get_init(const struct nft_ctx *ctx,
 		break;
 	case NFT_META_IIFNAME:
 	case NFT_META_OIFNAME:
+	case NFT_META_IIFKIND:
+	case NFT_META_OIFKIND:
 		len = IFNAMSIZ;
 		break;
 	case NFT_META_PRANDOM:
-- 
1.8.3.1


^ permalink raw reply related	[flat|nested] 8+ messages in thread

* Re: [PATCH v3] netfilter: nft_meta: Add NFT_META_I/OIFKIND meta type
  2019-01-15 23:53 [PATCH v3] netfilter: nft_meta: Add NFT_META_I/OIFKIND meta type wenxu
@ 2019-01-18 14:24 ` Pablo Neira Ayuso
  2019-01-18 14:27   ` David Ahern
  2019-01-18 14:32   ` Florian Westphal
  0 siblings, 2 replies; 8+ messages in thread
From: Pablo Neira Ayuso @ 2019-01-18 14:24 UTC (permalink / raw)
  To: wenxu; +Cc: fw, netdev, netfilter-devel, David Ahern

Cc'ing dsahern.

On Wed, Jan 16, 2019 at 07:53:51AM +0800, wenxu@ucloud.cn wrote:
> From: wenxu <wenxu@ucloud.cn>
> 
> In the ip_rcv the skb go through the PREROUTING hook first,
> Then jump in vrf device go through the same hook again.
> When conntrack dnat work with vrf, there will be some conflict for rules.
> Because the package go through the hook twice with different nf status
> 
> ip link add user1 type vrf table 1
> ip link add user2 type vrf table 2
> ip l set dev tun1 master user1
> ip l set dev tun2 master user2
> 
> nft add table firewall
> nft add chain firewall zones { type filter hook prerouting  priority - 300 \; }
> nft add rule firewall zones counter ct zone set iif map { "tun1" : 1, "tun2" : 2 }
> nft add chain firewall rule-1000-ingress
> nft add rule firewall rule-1000-ingress ct zone 1 tcp dport 22 ct state new counter accept
> nft add rule firewall rule-1000-ingress counter drop
> nft add chain firewall rule-1000-egress
> nft add rule firewall rule-1000-egress tcp dport 22 ct state new counter drop
> nft add rule firewall rule-1000-egress counter accept
> 
> nft add chain firewall rules-all { type filter hook prerouting priority - 150 \; }
> nft add rule firewall rules-all ip daddr vmap { "2.2.2.11" : jump rule-1000-ingress }
> nft add rule firewall rules-all ct zone vmap { 1 : jump rule-1000-egress }
> 
> nft add rule firewall dnat-all ct zone vmap { 1 : jump dnat-1000 }
> nft add rule firewall dnat-1000 ip daddr 2.2.2.11 counter dnat to 10.0.0.7
> 
> For a package with ip daddr 2.2.2.11 and tcp dport 22, first time accept in the
> rule-1000-ingress and dnat to 10.0.0.7. Then second time the packet goto the wrong
> chain rule-1000-egress which leads the packet drop
> 
> so with this patch userspace can add the 'don't re-do entire ruleset for vrf' policy
> itself like the following
> 
> nft add rule firewall rules-all meta iifkind "vrf" counter accept
> 
> Signed-off-by: wenxu <wenxu@ucloud.cn>
> ---
>  include/uapi/linux/netfilter/nf_tables.h |  4 ++++
>  net/netfilter/nft_meta.c                 | 12 ++++++++++++
>  2 files changed, 16 insertions(+)
> 
> diff --git a/include/uapi/linux/netfilter/nf_tables.h b/include/uapi/linux/netfilter/nf_tables.h
> index 7de4f1b..046b997 100644
> --- a/include/uapi/linux/netfilter/nf_tables.h
> +++ b/include/uapi/linux/netfilter/nf_tables.h
> @@ -789,6 +789,8 @@ enum nft_exthdr_attributes {
>   * @NFT_META_CGROUP: socket control group (skb->sk->sk_classid)
>   * @NFT_META_PRANDOM: a 32bit pseudo-random number
>   * @NFT_META_SECPATH: boolean, secpath_exists (!!skb->sp)
> + * @NFT_META_IIFKIND: packet input interface kind name (dev->rtnl_link_ops->kind)
> + * @NFT_META_OIFKIND: packet output interface kind name (dev->rtnl_link_ops->kind)
>   */
>  enum nft_meta_keys {
>  	NFT_META_LEN,
> @@ -817,6 +819,8 @@ enum nft_meta_keys {
>  	NFT_META_CGROUP,
>  	NFT_META_PRANDOM,
>  	NFT_META_SECPATH,
> +	NFT_META_IIFKIND,
> +	NFT_META_OIFKIND,
>  };
>  
>  /**
> diff --git a/net/netfilter/nft_meta.c b/net/netfilter/nft_meta.c
> index 6df486c..987d2d6 100644
> --- a/net/netfilter/nft_meta.c
> +++ b/net/netfilter/nft_meta.c
> @@ -244,6 +244,16 @@ void nft_meta_get_eval(const struct nft_expr *expr,
>  		strncpy((char *)dest, p->br->dev->name, IFNAMSIZ);
>  		return;
>  #endif
> +	case NFT_META_IIFKIND:
> +		if (in == NULL || in->rtnl_link_ops == NULL)
> +			goto err;
> +		strncpy((char *)dest, in->rtnl_link_ops->kind, IFNAMSIZ);

It seems kind can be arbitrarily large, no limitation in its length.

Thinking...

There is no other way to identify a vft device rather than this
string? The only l3mdev that exists if vrf, right?

If there is no other alternative, we can just place this in the tree,
but probably it would be better to have a numeric way to identify a
vrf device?

Thanks.

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH v3] netfilter: nft_meta: Add NFT_META_I/OIFKIND meta type
  2019-01-18 14:24 ` Pablo Neira Ayuso
@ 2019-01-18 14:27   ` David Ahern
  2019-01-18 14:30     ` Pablo Neira Ayuso
  2019-01-18 14:32   ` Florian Westphal
  1 sibling, 1 reply; 8+ messages in thread
From: David Ahern @ 2019-01-18 14:27 UTC (permalink / raw)
  To: Pablo Neira Ayuso, wenxu; +Cc: fw, netdev, netfilter-devel

On 1/18/19 7:24 AM, Pablo Neira Ayuso wrote:
> 
> There is no other way to identify a vft device rather than this
> string? The only l3mdev that exists if vrf, right?

ipvlan uses some of the hooks.

> 
> If there is no other alternative, we can just place this in the tree,
> but probably it would be better to have a numeric way to identify a
> vrf device?

IFF_L3MDEV_MASTER and IFF_L3MDEV_SLAVE are used via netif_is_l3_master
and netif_is_l3_slave in all of the code. This was done because of
requests to not bleed 'vrf' all over core kernel code.

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH v3] netfilter: nft_meta: Add NFT_META_I/OIFKIND meta type
  2019-01-18 14:27   ` David Ahern
@ 2019-01-18 14:30     ` Pablo Neira Ayuso
  0 siblings, 0 replies; 8+ messages in thread
From: Pablo Neira Ayuso @ 2019-01-18 14:30 UTC (permalink / raw)
  To: David Ahern; +Cc: wenxu, fw, netdev, netfilter-devel

On Fri, Jan 18, 2019 at 07:27:48AM -0700, David Ahern wrote:
> On 1/18/19 7:24 AM, Pablo Neira Ayuso wrote:
> > 
> > There is no other way to identify a vft device rather than this
> > string? The only l3mdev that exists if vrf, right?
> 
> ipvlan uses some of the hooks.
> 
> > 
> > If there is no other alternative, we can just place this in the tree,
> > but probably it would be better to have a numeric way to identify a
> > vrf device?
> 
> IFF_L3MDEV_MASTER and IFF_L3MDEV_SLAVE are used via netif_is_l3_master
> and netif_is_l3_slave in all of the code. This was done because of
> requests to not bleed 'vrf' all over core kernel code.

Thanks for explaining.

So no other way than this string to identify vrf device, right?

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH v3] netfilter: nft_meta: Add NFT_META_I/OIFKIND meta type
  2019-01-18 14:24 ` Pablo Neira Ayuso
  2019-01-18 14:27   ` David Ahern
@ 2019-01-18 14:32   ` Florian Westphal
  2019-01-18 14:35     ` Pablo Neira Ayuso
  1 sibling, 1 reply; 8+ messages in thread
From: Florian Westphal @ 2019-01-18 14:32 UTC (permalink / raw)
  To: Pablo Neira Ayuso; +Cc: wenxu, fw, netdev, netfilter-devel, David Ahern

Pablo Neira Ayuso <pablo@netfilter.org> wrote:
> > +	case NFT_META_IIFKIND:
> > +		if (in == NULL || in->rtnl_link_ops == NULL)
> > +			goto err;
> > +		strncpy((char *)dest, in->rtnl_link_ops->kind, IFNAMSIZ);
> 
> It seems kind can be arbitrarily large, no limitation in its length.

Its limited to 60 or 56 bytes it seems:
char kind[MODULE_NAME_LEN];

nla_strlcpy(kind, linfo[IFLA_INFO_KIND], sizeof(kind));

(linkinfo_to_kind_ops in rtnetlink.c).

> Thinking...
> 
> There is no other way to identify a vft device rather than this
> string? The only l3mdev that exists if vrf, right?

There is, I suggested this more generic approach, as it would allow
to create rules that match on the kind of device used (vrf, ppp, etc.).

If you think its too generic, ok.

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH v3] netfilter: nft_meta: Add NFT_META_I/OIFKIND meta type
  2019-01-18 14:32   ` Florian Westphal
@ 2019-01-18 14:35     ` Pablo Neira Ayuso
  2019-01-24 13:51       ` wenxu
  0 siblings, 1 reply; 8+ messages in thread
From: Pablo Neira Ayuso @ 2019-01-18 14:35 UTC (permalink / raw)
  To: Florian Westphal; +Cc: wenxu, netdev, netfilter-devel, David Ahern

On Fri, Jan 18, 2019 at 03:32:08PM +0100, Florian Westphal wrote:
> Pablo Neira Ayuso <pablo@netfilter.org> wrote:
> > > +	case NFT_META_IIFKIND:
> > > +		if (in == NULL || in->rtnl_link_ops == NULL)
> > > +			goto err;
> > > +		strncpy((char *)dest, in->rtnl_link_ops->kind, IFNAMSIZ);
> > 
> > It seems kind can be arbitrarily large, no limitation in its length.
> 
> Its limited to 60 or 56 bytes it seems:
> char kind[MODULE_NAME_LEN];
> 
> nla_strlcpy(kind, linfo[IFLA_INFO_KIND], sizeof(kind));
> 
> (linkinfo_to_kind_ops in rtnetlink.c).
> 
> > Thinking...
> > 
> > There is no other way to identify a vft device rather than this
> > string? The only l3mdev that exists if vrf, right?
> 
> There is, I suggested this more generic approach, as it would allow
> to create rules that match on the kind of device used (vrf, ppp, etc.).

Ah I see.

> If you think its too generic, ok.

I think it's fine, thanks for explaining.

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH v3] netfilter: nft_meta: Add NFT_META_I/OIFKIND meta type
  2019-01-18 14:35     ` Pablo Neira Ayuso
@ 2019-01-24 13:51       ` wenxu
  2019-01-24 13:59         ` Florian Westphal
  0 siblings, 1 reply; 8+ messages in thread
From: wenxu @ 2019-01-24 13:51 UTC (permalink / raw)
  To: Pablo Neira Ayuso, Florian Westphal; +Cc: netdev, netfilter-devel, David Ahern

Hi all,
Are there any other idear for this patch?  Maye should modify  IFNAMSIZ to MODULE_NAME_LEN

BR
wenxu
On 2019/1/18 下午10:35, Pablo Neira Ayuso wrote:
> On Fri, Jan 18, 2019 at 03:32:08PM +0100, Florian Westphal wrote:
>> Pablo Neira Ayuso <pablo@netfilter.org> wrote:
>>>> +	case NFT_META_IIFKIND:
>>>> +		if (in == NULL || in->rtnl_link_ops == NULL)
>>>> +			goto err;
>>>> +		strncpy((char *)dest, in->rtnl_link_ops->kind, IFNAMSIZ);
>>> It seems kind can be arbitrarily large, no limitation in its length.
>> Its limited to 60 or 56 bytes it seems:
>> char kind[MODULE_NAME_LEN];
>>
>> nla_strlcpy(kind, linfo[IFLA_INFO_KIND], sizeof(kind));
>>
>> (linkinfo_to_kind_ops in rtnetlink.c).
>>
>>> Thinking...
>>>
>>> There is no other way to identify a vft device rather than this
>>> string? The only l3mdev that exists if vrf, right?
>> There is, I suggested this more generic approach, as it would allow
>> to create rules that match on the kind of device used (vrf, ppp, etc.).
> Ah I see.
>
>> If you think its too generic, ok.
> I think it's fine, thanks for explaining.
>


^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH v3] netfilter: nft_meta: Add NFT_META_I/OIFKIND meta type
  2019-01-24 13:51       ` wenxu
@ 2019-01-24 13:59         ` Florian Westphal
  0 siblings, 0 replies; 8+ messages in thread
From: Florian Westphal @ 2019-01-24 13:59 UTC (permalink / raw)
  To: wenxu
  Cc: Pablo Neira Ayuso, Florian Westphal, netdev, netfilter-devel,
	David Ahern

wenxu <wenxu@ucloud.cn> wrote:
> Hi all,
> Are there any other idear for this patch?  Maye should modify  IFNAMSIZ to MODULE_NAME_LEN

i think its fine as-is.

^ permalink raw reply	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2019-01-24 13:59 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-01-15 23:53 [PATCH v3] netfilter: nft_meta: Add NFT_META_I/OIFKIND meta type wenxu
2019-01-18 14:24 ` Pablo Neira Ayuso
2019-01-18 14:27   ` David Ahern
2019-01-18 14:30     ` Pablo Neira Ayuso
2019-01-18 14:32   ` Florian Westphal
2019-01-18 14:35     ` Pablo Neira Ayuso
2019-01-24 13:51       ` wenxu
2019-01-24 13:59         ` Florian Westphal

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.