netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 0/3] xfrm: add incoming interface to selector
@ 2011-11-28 20:14 Ulrich Weber
  2011-11-28 20:14 ` [PATCH 1/3] " Ulrich Weber
                   ` (2 more replies)
  0 siblings, 3 replies; 11+ messages in thread
From: Ulrich Weber @ 2011-11-28 20:14 UTC (permalink / raw)
  To: netdev; +Cc: davem

Current implementation allows to match IPsec policies based
on the outgoing interface. This however is kind of broken:
 a) decode_session will not fill oif and is therefore always set to zero and
    will never match (see patch #2).
 b) its not possible to match the incoming interface. This makes it impossible,
    to create redundant IPsec tunnels over two uplink interfaces.

Over a year ago there was a discussion about adding the incoming interface to
the xfrm selector. See the following thread for more details:
http://kerneltrap.org/mailarchive/linux-netdev/2010/3/31/6273708


The first patch will reuse the user attribute in the xfrm selector as
incoming interface. Yes, this breaks the ABI. But user in xfrm selector was
never used and user space programs set it to zero. The alternative would
be to have different xfrm_selector structs in kernel and user space.

Second patch makes matching outgoing and incoming interfaces operational by
setting oif and iif in flowi struct. Local generated packets will have iif
set to loopback interface.

Third patch allows to override the incoming interface of decrypted packets.
This allows to create virtual IPsec devices for redundant IPsec tunnels, when
only one Internet uplink is available (e.g. used by Amazon VPC Connector).

Ulrich Weber (3):
  xfrm: add incoming interface to selector
  route: set iif and oif information in flowi struct
  xfrm: allow to overwrite incoming dev after decryption

 include/linux/xfrm.h    |    5 +++--
 include/net/xfrm.h      |    6 +++++-
 net/ipv4/route.c        |    4 ++++
 net/ipv4/xfrm4_policy.c |   19 +++++++++++++++++--
 net/ipv4/xfrm4_state.c  |    3 ++-
 net/ipv6/fib6_rules.c   |   10 ++++++++--
 net/ipv6/mip6.c         |    3 ++-
 net/ipv6/xfrm6_policy.c |   18 ++++++++++++++++--
 net/ipv6/xfrm6_state.c  |    3 ++-
 net/key/af_key.c        |    2 +-
 net/xfrm/xfrm_input.c   |    5 +++++
 net/xfrm/xfrm_policy.c  |    6 ++++--
 net/xfrm/xfrm_state.c   |   44 +++++++++++++++++++++++++++++++++++++++-----
 net/xfrm/xfrm_user.c    |   14 +++++++++++++-
 14 files changed, 121 insertions(+), 21 deletions(-)

-- 
1.7.4.1

^ permalink raw reply	[flat|nested] 11+ messages in thread

* [PATCH 1/3] xfrm: add incoming interface to selector
  2011-11-28 20:14 [PATCH 0/3] xfrm: add incoming interface to selector Ulrich Weber
@ 2011-11-28 20:14 ` Ulrich Weber
  2011-11-30  0:00   ` David Miller
  2011-11-28 20:14 ` [PATCH 2/3] route: set iif and oif information in flowi struct Ulrich Weber
  2011-11-28 20:14 ` [PATCH 3/3] xfrm: allow to overwrite incoming dev after decryption Ulrich Weber
  2 siblings, 1 reply; 11+ messages in thread
From: Ulrich Weber @ 2011-11-28 20:14 UTC (permalink / raw)
  To: netdev; +Cc: davem

Current implementation only allows to match for outgoing
interface in xfrm policy database.

This replaces the user ID (which was never used) with the
incoming interface, so both interfaces can be matched against.

Signed-off-by: Ulrich Weber <ulrich.weber@sophos.com>
---
 include/linux/xfrm.h   |    4 ++--
 net/ipv4/xfrm4_state.c |    3 ++-
 net/ipv6/mip6.c        |    3 ++-
 net/ipv6/xfrm6_state.c |    3 ++-
 net/xfrm/xfrm_policy.c |    6 ++++--
 5 files changed, 12 insertions(+), 7 deletions(-)

diff --git a/include/linux/xfrm.h b/include/linux/xfrm.h
index 22e61fd..bb1bb49 100644
--- a/include/linux/xfrm.h
+++ b/include/linux/xfrm.h
@@ -54,8 +54,8 @@ struct xfrm_selector {
 	__u8	prefixlen_d;
 	__u8	prefixlen_s;
 	__u8	proto;
-	int	ifindex;
-	__kernel_uid32_t	user;
+	int	oif;
+	int 	iif;
 };
 
 #define XFRM_INF (~(__u64)0)
diff --git a/net/ipv4/xfrm4_state.c b/net/ipv4/xfrm4_state.c
index 9258e75..96d9b55 100644
--- a/net/ipv4/xfrm4_state.c
+++ b/net/ipv4/xfrm4_state.c
@@ -36,7 +36,8 @@ __xfrm4_init_tempsel(struct xfrm_selector *sel, const struct flowi *fl)
 	sel->prefixlen_d = 32;
 	sel->prefixlen_s = 32;
 	sel->proto = fl4->flowi4_proto;
-	sel->ifindex = fl4->flowi4_oif;
+	sel->oif = fl4->flowi4_oif;
+	sel->iif = fl4->flowi4_iif;
 }
 
 static void
diff --git a/net/ipv6/mip6.c b/net/ipv6/mip6.c
index 7e1e0fb..c0ebe40 100644
--- a/net/ipv6/mip6.c
+++ b/net/ipv6/mip6.c
@@ -248,7 +248,8 @@ static int mip6_destopt_reject(struct xfrm_state *x, struct sk_buff *skb,
 	sel.sport = xfrm_flowi_sport(fl, &fl6->uli);
 	if (sel.sport)
 		sel.sport_mask = htons(~0);
-	sel.ifindex = fl6->flowi6_oif;
+	sel.oif = fl6->flowi6_oif;
+	sel.iif = fl6->flowi6_iif;
 
 	err = km_report(net, IPPROTO_DSTOPTS, &sel,
 			(hao ? (xfrm_address_t *)&hao->addr : NULL));
diff --git a/net/ipv6/xfrm6_state.c b/net/ipv6/xfrm6_state.c
index 3f2f7c4..bdb13df 100644
--- a/net/ipv6/xfrm6_state.c
+++ b/net/ipv6/xfrm6_state.c
@@ -37,7 +37,8 @@ __xfrm6_init_tempsel(struct xfrm_selector *sel, const struct flowi *fl)
 	sel->prefixlen_d = 128;
 	sel->prefixlen_s = 128;
 	sel->proto = fl6->flowi6_proto;
-	sel->ifindex = fl6->flowi6_oif;
+	sel->oif = fl6->flowi6_oif;
+	sel->iif = fl6->flowi6_iif;
 }
 
 static void
diff --git a/net/xfrm/xfrm_policy.c b/net/xfrm/xfrm_policy.c
index 4fce1ce..648c9e7 100644
--- a/net/xfrm/xfrm_policy.c
+++ b/net/xfrm/xfrm_policy.c
@@ -66,7 +66,8 @@ __xfrm4_selector_match(const struct xfrm_selector *sel, const struct flowi *fl)
 		!((xfrm_flowi_dport(fl, &fl4->uli) ^ sel->dport) & sel->dport_mask) &&
 		!((xfrm_flowi_sport(fl, &fl4->uli) ^ sel->sport) & sel->sport_mask) &&
 		(fl4->flowi4_proto == sel->proto || !sel->proto) &&
-		(fl4->flowi4_oif == sel->ifindex || !sel->ifindex);
+		(fl4->flowi4_oif == sel->oif || !sel->oif) &&
+		(fl4->flowi4_iif == sel->iif || !sel->iif);
 }
 
 static inline int
@@ -79,7 +80,8 @@ __xfrm6_selector_match(const struct xfrm_selector *sel, const struct flowi *fl)
 		!((xfrm_flowi_dport(fl, &fl6->uli) ^ sel->dport) & sel->dport_mask) &&
 		!((xfrm_flowi_sport(fl, &fl6->uli) ^ sel->sport) & sel->sport_mask) &&
 		(fl6->flowi6_proto == sel->proto || !sel->proto) &&
-		(fl6->flowi6_oif == sel->ifindex || !sel->ifindex);
+		(fl6->flowi6_oif == sel->oif || !sel->oif) &&
+		(fl6->flowi6_iif == sel->iif || !sel->iif);
 }
 
 int xfrm_selector_match(const struct xfrm_selector *sel, const struct flowi *fl,
-- 
1.7.4.1

^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [PATCH 2/3] route: set iif and oif information in flowi struct
  2011-11-28 20:14 [PATCH 0/3] xfrm: add incoming interface to selector Ulrich Weber
  2011-11-28 20:14 ` [PATCH 1/3] " Ulrich Weber
@ 2011-11-28 20:14 ` Ulrich Weber
  2011-11-28 23:53   ` Julian Anastasov
  2011-11-30  0:01   ` David Miller
  2011-11-28 20:14 ` [PATCH 3/3] xfrm: allow to overwrite incoming dev after decryption Ulrich Weber
  2 siblings, 2 replies; 11+ messages in thread
From: Ulrich Weber @ 2011-11-28 20:14 UTC (permalink / raw)
  To: netdev; +Cc: davem

Outgoing packets have loopback interface as incoming interface.

Signed-off-by: Ulrich Weber <ulrich.weber@sophos.com>
---
 net/ipv4/route.c        |    4 ++++
 net/ipv4/xfrm4_policy.c |   19 +++++++++++++++++--
 net/ipv6/fib6_rules.c   |   10 ++++++++--
 net/ipv6/xfrm6_policy.c |   18 ++++++++++++++++--
 4 files changed, 45 insertions(+), 6 deletions(-)

diff --git a/net/ipv4/route.c b/net/ipv4/route.c
index fb47c8f..1702ec0 100644
--- a/net/ipv4/route.c
+++ b/net/ipv4/route.c
@@ -2744,6 +2744,10 @@ struct rtable *__ip_route_output_key(struct net *net, struct flowi4 *flp4)
 				flp4->saddr = rth->rt_src;
 			if (!flp4->daddr)
 				flp4->daddr = rth->rt_dst;
+			if (!flp4->flowi4_iif)
+				flp4->flowi4_iif = net->loopback_dev->ifindex;
+			if (!flp4->flowi4_oif)
+				flp4->flowi4_oif = rth->rt_iif;
 			return rth;
 		}
 		RT_CACHE_STAT_INC(out_hlist_search);
diff --git a/net/ipv4/xfrm4_policy.c b/net/ipv4/xfrm4_policy.c
index a0b4c5d..ad9c620 100644
--- a/net/ipv4/xfrm4_policy.c
+++ b/net/ipv4/xfrm4_policy.c
@@ -110,6 +110,8 @@ static int xfrm4_fill_dst(struct xfrm_dst *xdst, struct net_device *dev,
 static void
 _decode_session4(struct sk_buff *skb, struct flowi *fl, int reverse)
 {
+	const struct net *net = dev_net(skb->dev);
+	const struct rtable *rt = skb_rtable(skb);
 	const struct iphdr *iph = ip_hdr(skb);
 	u8 *xprth = skb_network_header(skb) + iph->ihl * 4;
 	struct flowi4 *fl4 = &fl->u.ip4;
@@ -185,9 +187,22 @@ _decode_session4(struct sk_buff *skb, struct flowi *fl, int reverse)
 		}
 	}
 	fl4->flowi4_proto = iph->protocol;
-	fl4->daddr = reverse ? iph->saddr : iph->daddr;
-	fl4->saddr = reverse ? iph->daddr : iph->saddr;
 	fl4->flowi4_tos = iph->tos;
+
+	if (reverse) {
+		fl4->daddr = iph->saddr;
+		fl4->saddr = iph->daddr;
+		fl4->flowi4_oif = skb->skb_iif ?: net->loopback_dev->ifindex;
+		if (rt)
+			fl4->flowi4_iif = rt->rt_oif ?: rt->dst.dev->ifindex;
+	}
+	else {
+		fl4->daddr = iph->daddr;
+		fl4->saddr = iph->saddr;
+		fl4->flowi4_iif = skb->skb_iif ?: net->loopback_dev->ifindex;
+		if (rt)
+			fl4->flowi4_oif = rt->rt_oif ?: rt->dst.dev->ifindex;
+	}
 }
 
 static inline int xfrm4_garbage_collect(struct dst_ops *ops)
diff --git a/net/ipv6/fib6_rules.c b/net/ipv6/fib6_rules.c
index b6c5731..fd520b5 100644
--- a/net/ipv6/fib6_rules.c
+++ b/net/ipv6/fib6_rules.c
@@ -108,8 +108,14 @@ again:
 discard_pkt:
 	dst_hold(&rt->dst);
 out:
-	arg->result = rt;
-	return rt == NULL ? -EAGAIN : 0;
+	if ((arg->result = rt)) {
+		if (!flp6->flowi6_iif)
+			flp6->flowi6_iif = net->loopback_dev->ifindex;
+		if (!flp6->flowi6_oif)
+			flp6->flowi6_oif = rt->rt6i_dev->ifindex;
+		return 0;
+	}
+	return -EAGAIN;
 }
 
 
diff --git a/net/ipv6/xfrm6_policy.c b/net/ipv6/xfrm6_policy.c
index 8ea65e0..7c0196f 100644
--- a/net/ipv6/xfrm6_policy.c
+++ b/net/ipv6/xfrm6_policy.c
@@ -121,6 +121,8 @@ static int xfrm6_fill_dst(struct xfrm_dst *xdst, struct net_device *dev,
 static inline void
 _decode_session6(struct sk_buff *skb, struct flowi *fl, int reverse)
 {
+	const struct net *net = dev_net(skb->dev);
+	const struct rt6_info *rt = (struct rt6_info*)skb_dst(skb);
 	struct flowi6 *fl6 = &fl->u.ip6;
 	int onlyproto = 0;
 	u16 offset = skb_network_header_len(skb);
@@ -132,8 +134,20 @@ _decode_session6(struct sk_buff *skb, struct flowi *fl, int reverse)
 	memset(fl6, 0, sizeof(struct flowi6));
 	fl6->flowi6_mark = skb->mark;
 
-	fl6->daddr = reverse ? hdr->saddr : hdr->daddr;
-	fl6->saddr = reverse ? hdr->daddr : hdr->saddr;
+	if (reverse) {
+		fl6->daddr = hdr->saddr;
+		fl6->saddr = hdr->daddr;
+		fl6->flowi6_oif = skb->skb_iif ?: net->loopback_dev->ifindex;
+		if (rt)
+			fl6->flowi6_iif = rt->rt6i_dev->ifindex;
+	}
+	else {
+		fl6->daddr = hdr->daddr;
+		fl6->saddr = hdr->saddr;
+		fl6->flowi6_iif = skb->skb_iif ?: net->loopback_dev->ifindex;
+		if (rt)
+			fl6->flowi6_oif = rt->rt6i_dev->ifindex;
+	}
 
 	while (nh + offset + 1 < skb->data ||
 	       pskb_may_pull(skb, nh + offset + 1 - skb->data)) {
-- 
1.7.4.1

^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [PATCH 3/3] xfrm: allow to overwrite incoming dev after decryption
  2011-11-28 20:14 [PATCH 0/3] xfrm: add incoming interface to selector Ulrich Weber
  2011-11-28 20:14 ` [PATCH 1/3] " Ulrich Weber
  2011-11-28 20:14 ` [PATCH 2/3] route: set iif and oif information in flowi struct Ulrich Weber
@ 2011-11-28 20:14 ` Ulrich Weber
  2 siblings, 0 replies; 11+ messages in thread
From: Ulrich Weber @ 2011-11-28 20:14 UTC (permalink / raw)
  To: netdev; +Cc: davem

and flush related xfrm states if interface goes down.

If XFRMA_RECV_DEV is set to an interface index, all decrypted
packets of the associated xfrm state will have the incoming
interface set to that interface.

This allows to create multiple virtual IPsec devices, which
all receive their packets via the same uplink interface.

Xfrm policies can then match against these virtual IPsec
interfaces. Otherwise only one policy could be installed
matching the uplink interface.

Signed-off-by: Ulrich Weber <ulrich.weber@sophos.com>
---
 include/linux/xfrm.h  |    1 +
 include/net/xfrm.h    |    6 +++++-
 net/key/af_key.c      |    2 +-
 net/xfrm/xfrm_input.c |    5 +++++
 net/xfrm/xfrm_state.c |   44 +++++++++++++++++++++++++++++++++++++++-----
 net/xfrm/xfrm_user.c  |   14 +++++++++++++-
 6 files changed, 64 insertions(+), 8 deletions(-)

diff --git a/include/linux/xfrm.h b/include/linux/xfrm.h
index bb1bb49..13b04f4 100644
--- a/include/linux/xfrm.h
+++ b/include/linux/xfrm.h
@@ -295,6 +295,7 @@ enum xfrm_attr_type_t {
 	XFRMA_MARK,		/* struct xfrm_mark */
 	XFRMA_TFCPAD,		/* __u32 */
 	XFRMA_REPLAY_ESN_VAL,	/* struct xfrm_replay_esn */
+	XFRMA_RECV_DEV,		/* __u32 */
 	__XFRMA_MAX
 
 #define XFRMA_MAX (__XFRMA_MAX - 1)
diff --git a/include/net/xfrm.h b/include/net/xfrm.h
index 89174e2..3febf6a 100644
--- a/include/net/xfrm.h
+++ b/include/net/xfrm.h
@@ -226,6 +226,9 @@ struct xfrm_state {
 	/* Security context */
 	struct xfrm_sec_ctx	*security;
 
+	/* Overwrite incoming device */
+	struct net_device	*dev;
+
 	/* Private data of this transformer, format is opaque,
 	 * interpreted by xfrm_type methods. */
 	void			*data;
@@ -1442,7 +1445,8 @@ struct xfrmk_spdinfo {
 extern struct xfrm_state *xfrm_find_acq_byseq(struct net *net, u32 mark,
 					      u32 seq);
 extern int xfrm_state_delete(struct xfrm_state *x);
-extern int xfrm_state_flush(struct net *net, u8 proto, struct xfrm_audit *audit_info);
+extern int xfrm_state_flush(struct net *net, u8 proto, struct net_device *dev,
+			    struct xfrm_audit *audit_info);
 extern void xfrm_sad_getinfo(struct net *net, struct xfrmk_sadinfo *si);
 extern void xfrm_spd_getinfo(struct net *net, struct xfrmk_spdinfo *si);
 extern u32 xfrm_replay_seqhi(struct xfrm_state *x, __be32 net_seq);
diff --git a/net/key/af_key.c b/net/key/af_key.c
index bfc0bef..3bd8075 100644
--- a/net/key/af_key.c
+++ b/net/key/af_key.c
@@ -1726,7 +1726,7 @@ static int pfkey_flush(struct sock *sk, struct sk_buff *skb, const struct sadb_m
 	audit_info.loginuid = audit_get_loginuid(current);
 	audit_info.sessionid = audit_get_sessionid(current);
 	audit_info.secid = 0;
-	err = xfrm_state_flush(net, proto, &audit_info);
+	err = xfrm_state_flush(net, proto, NULL, &audit_info);
 	err2 = unicast_flush_resp(sk, hdr);
 	if (err || err2) {
 		if (err == -ESRCH) /* empty table - go quietly */
diff --git a/net/xfrm/xfrm_input.c b/net/xfrm/xfrm_input.c
index 54a0dc2..80eb73a 100644
--- a/net/xfrm/xfrm_input.c
+++ b/net/xfrm/xfrm_input.c
@@ -239,6 +239,11 @@ resume:
 			goto drop;
 		}
 
+		if (x->dev) {
+			skb->dev = x->dev;
+			skb->skb_iif = skb->dev->ifindex;
+		}
+
 		if (x->outer_mode->flags & XFRM_MODE_FLAG_TUNNEL) {
 			decaps = 1;
 			break;
diff --git a/net/xfrm/xfrm_state.c b/net/xfrm/xfrm_state.c
index 5b228f9..41c6cc5 100644
--- a/net/xfrm/xfrm_state.c
+++ b/net/xfrm/xfrm_state.c
@@ -368,6 +368,8 @@ static void xfrm_state_gc_destroy(struct xfrm_state *x)
 		x->type->destructor(x);
 		xfrm_put_type(x->type);
 	}
+	if (x->dev)
+		dev_put(x->dev);
 	security_xfrm_state_free(x);
 	kfree(x);
 }
@@ -567,7 +569,8 @@ EXPORT_SYMBOL(xfrm_state_delete);
 
 #ifdef CONFIG_SECURITY_NETWORK_XFRM
 static inline int
-xfrm_state_flush_secctx_check(struct net *net, u8 proto, struct xfrm_audit *audit_info)
+xfrm_state_flush_secctx_check(struct net *net, u8 proto, struct net_device *dev,
+			      struct xfrm_audit *audit_info)
 {
 	int i, err = 0;
 
@@ -577,6 +580,7 @@ xfrm_state_flush_secctx_check(struct net *net, u8 proto, struct xfrm_audit *audi
 
 		hlist_for_each_entry(x, entry, net->xfrm.state_bydst+i, bydst) {
 			if (xfrm_id_proto_match(x->id.proto, proto) &&
+			   (!dev || dev == x->dev) &&
 			   (err = security_xfrm_state_delete(x)) != 0) {
 				xfrm_audit_state_delete(x, 0,
 							audit_info->loginuid,
@@ -591,18 +595,20 @@ xfrm_state_flush_secctx_check(struct net *net, u8 proto, struct xfrm_audit *audi
 }
 #else
 static inline int
-xfrm_state_flush_secctx_check(struct net *net, u8 proto, struct xfrm_audit *audit_info)
+xfrm_state_flush_secctx_check(struct net *net, u8 proto, struct net_device *dev,
+			      struct xfrm_audit *audit_info)
 {
 	return 0;
 }
 #endif
 
-int xfrm_state_flush(struct net *net, u8 proto, struct xfrm_audit *audit_info)
+int xfrm_state_flush(struct net *net, u8 proto, struct net_device *dev,
+		     struct xfrm_audit *audit_info)
 {
 	int i, err = 0, cnt = 0;
 
 	spin_lock_bh(&xfrm_state_lock);
-	err = xfrm_state_flush_secctx_check(net, proto, audit_info);
+	err = xfrm_state_flush_secctx_check(net, proto, dev, audit_info);
 	if (err)
 		goto out;
 
@@ -613,6 +619,7 @@ int xfrm_state_flush(struct net *net, u8 proto, struct xfrm_audit *audit_info)
 restart:
 		hlist_for_each_entry(x, entry, net->xfrm.state_bydst+i, bydst) {
 			if (!xfrm_state_kern(x) &&
+			    (!dev || x->dev == dev) &&
 			    xfrm_id_proto_match(x->id.proto, proto)) {
 				xfrm_state_hold(x);
 				spin_unlock_bh(&xfrm_state_lock);
@@ -1185,6 +1192,11 @@ static struct xfrm_state *xfrm_state_clone(struct xfrm_state *orig, int *errp)
 
 	memcpy(&x->mark, &orig->mark, sizeof(x->mark));
 
+	if (orig->dev) {
+		dev_hold(dev);
+		x->dev = orig->dev;
+	}
+
 	err = xfrm_init_state(x);
 	if (err)
 		goto error;
@@ -2005,6 +2017,26 @@ int xfrm_init_state(struct xfrm_state *x)
 
 EXPORT_SYMBOL(xfrm_init_state);
 
+
+static int xfrm_dev_event(struct notifier_block *this, unsigned long event, void *ptr)
+{
+	struct net_device *dev = ptr;
+	struct xfrm_audit audit_info;
+
+	switch (event) {
+	case NETDEV_DOWN:
+		audit_info.loginuid = -1;
+		audit_info.sessionid = -1;
+		audit_info.secid = 0;
+		xfrm_state_flush(dev_net(dev), IPSEC_PROTO_ANY, dev, &audit_info);
+	}
+	return NOTIFY_DONE;
+}
+
+static struct notifier_block xfrm_dev_notifier = {
+	.notifier_call	= xfrm_dev_event,
+};
+
 int __net_init xfrm_state_init(struct net *net)
 {
 	unsigned int sz;
@@ -2029,6 +2061,8 @@ int __net_init xfrm_state_init(struct net *net)
 	INIT_HLIST_HEAD(&net->xfrm.state_gc_list);
 	INIT_WORK(&net->xfrm.state_gc_work, xfrm_state_gc_task);
 	init_waitqueue_head(&net->xfrm.km_waitq);
+	if (net_eq(net, &init_net))
+		register_netdevice_notifier(&xfrm_dev_notifier);
 	return 0;
 
 out_byspi:
@@ -2048,7 +2082,7 @@ void xfrm_state_fini(struct net *net)
 	audit_info.loginuid = -1;
 	audit_info.sessionid = -1;
 	audit_info.secid = 0;
-	xfrm_state_flush(net, IPSEC_PROTO_ANY, &audit_info);
+	xfrm_state_flush(net, IPSEC_PROTO_ANY, NULL, &audit_info);
 	flush_work(&net->xfrm.state_gc_work);
 
 	WARN_ON(!list_empty(&net->xfrm.state_all));
diff --git a/net/xfrm/xfrm_user.c b/net/xfrm/xfrm_user.c
index d0a42df..46bd7ad 100644
--- a/net/xfrm/xfrm_user.c
+++ b/net/xfrm/xfrm_user.c
@@ -532,6 +532,12 @@ static struct xfrm_state *xfrm_state_construct(struct net *net,
 			goto error;
 	}
 
+	if (attrs[XFRMA_RECV_DEV]) {
+		x->dev = dev_get_by_index(net, *(int *) nla_data(attrs[XFRMA_RECV_DEV]));
+		if (x->dev == NULL)
+			goto error;
+	}
+
 	xfrm_mark_get(attrs, &x->mark);
 
 	err = __xfrm_init_state(x, false);
@@ -762,6 +768,9 @@ static int copy_to_user_state_extra(struct xfrm_state *x,
 	if (x->lastused)
 		NLA_PUT_U64(skb, XFRMA_LASTUSED, x->lastused);
 
+	if (x->dev)
+		NLA_PUT_U32(skb, XFRMA_RECV_DEV, x->dev->ifindex);
+
 	if (x->aead)
 		NLA_PUT(skb, XFRMA_ALG_AEAD, aead_len(x->aead), x->aead);
 	if (x->aalg) {
@@ -1642,7 +1651,7 @@ static int xfrm_flush_sa(struct sk_buff *skb, struct nlmsghdr *nlh,
 	audit_info.loginuid = audit_get_loginuid(current);
 	audit_info.sessionid = audit_get_sessionid(current);
 	security_task_getsecid(current, &audit_info.secid);
-	err = xfrm_state_flush(net, p->proto, &audit_info);
+	err = xfrm_state_flush(net, p->proto, NULL, &audit_info);
 	if (err) {
 		if (err == -ESRCH) /* empty table */
 			return 0;
@@ -2243,6 +2252,7 @@ static const struct nla_policy xfrma_policy[XFRMA_MAX+1] = {
 	[XFRMA_MARK]		= { .len = sizeof(struct xfrm_mark) },
 	[XFRMA_TFCPAD]		= { .type = NLA_U32 },
 	[XFRMA_REPLAY_ESN_VAL]	= { .len = sizeof(struct xfrm_replay_state_esn) },
+	[XFRMA_RECV_DEV]	= { .type = NLA_U32 },
 };
 
 static struct xfrm_link {
@@ -2432,6 +2442,8 @@ static inline size_t xfrm_sa_len(struct xfrm_state *x)
 				    x->security->ctx_len);
 	if (x->coaddr)
 		l += nla_total_size(sizeof(*x->coaddr));
+	if (x->dev)
+		l += nla_total_size(sizeof(u32));
 
 	/* Must count x->lastused as it may become non-zero behind our back. */
 	l += nla_total_size(sizeof(u64));
-- 
1.7.4.1

^ permalink raw reply related	[flat|nested] 11+ messages in thread

* Re: [PATCH 2/3] route: set iif and oif information in flowi struct
  2011-11-28 20:14 ` [PATCH 2/3] route: set iif and oif information in flowi struct Ulrich Weber
@ 2011-11-28 23:53   ` Julian Anastasov
  2011-11-30 17:21     ` Ulrich Weber
  2011-11-30  0:01   ` David Miller
  1 sibling, 1 reply; 11+ messages in thread
From: Julian Anastasov @ 2011-11-28 23:53 UTC (permalink / raw)
  To: Ulrich Weber; +Cc: netdev, davem


	Hello,

On Mon, 28 Nov 2011, Ulrich Weber wrote:

> Outgoing packets have loopback interface as incoming interface.
> 
> Signed-off-by: Ulrich Weber <ulrich.weber@sophos.com>
> ---
>  net/ipv4/route.c        |    4 ++++
>  net/ipv4/xfrm4_policy.c |   19 +++++++++++++++++--
>  net/ipv6/fib6_rules.c   |   10 ++++++++--
>  net/ipv6/xfrm6_policy.c |   18 ++++++++++++++++--
>  4 files changed, 45 insertions(+), 6 deletions(-)
> 
> diff --git a/net/ipv4/route.c b/net/ipv4/route.c
> index fb47c8f..1702ec0 100644
> --- a/net/ipv4/route.c
> +++ b/net/ipv4/route.c
> @@ -2744,6 +2744,10 @@ struct rtable *__ip_route_output_key(struct net *net, struct flowi4 *flp4)
>  				flp4->saddr = rth->rt_src;
>  			if (!flp4->daddr)
>  				flp4->daddr = rth->rt_dst;
> +			if (!flp4->flowi4_iif)
> +				flp4->flowi4_iif = net->loopback_dev->ifindex;
> +			if (!flp4->flowi4_oif)
> +				flp4->flowi4_oif = rth->rt_iif;

	May be setting flowi4_oif unconditionally here is more
correct because ip_route_output_slow fills flowi4_oif with
the selected oif, it can even change the provided original
oif in flowi4_oif. What about this?:

			flp4->flowi4_oif = rth->dst.dev->ifindex;

	OTOH, rt_iif has some complex semantic: original oif
or the selected oif. May be you prefer flowi4_oif to hold
the selected oif, right?

	I see one dangerous place that must be checked:
icmp_route_lookup. Before now __ip_route_output_key was
called after xfrm_decode_session_reverse with 0 in
flowi4_oif, i.e. no oif binding was used. But now when
decode_session sets flowi4_oif we will restrict the route
via this interface?

>  			return rth;
>  		}
>  		RT_CACHE_STAT_INC(out_hlist_search);

Regards

--
Julian Anastasov <ja@ssi.bg>

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH 1/3] xfrm: add incoming interface to selector
  2011-11-28 20:14 ` [PATCH 1/3] " Ulrich Weber
@ 2011-11-30  0:00   ` David Miller
  2011-11-30 17:33     ` Ulrich Weber
  0 siblings, 1 reply; 11+ messages in thread
From: David Miller @ 2011-11-30  0:00 UTC (permalink / raw)
  To: ulrich.weber; +Cc: netdev

From: Ulrich Weber <ulrich.weber@Sophos.com>
Date: Mon, 28 Nov 2011 21:14:50 +0100

> Current implementation only allows to match for outgoing
> interface in xfrm policy database.
> 
> This replaces the user ID (which was never used) with the
> incoming interface, so both interfaces can be matched against.
> 
> Signed-off-by: Ulrich Weber <ulrich.weber@sophos.com>

This isn't safe, because we have no idea if existing users are putting
garbage there.  So your change can break things.

You'll have to add a netlink attribute or similar.

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH 2/3] route: set iif and oif information in flowi struct
  2011-11-28 20:14 ` [PATCH 2/3] route: set iif and oif information in flowi struct Ulrich Weber
  2011-11-28 23:53   ` Julian Anastasov
@ 2011-11-30  0:01   ` David Miller
  1 sibling, 0 replies; 11+ messages in thread
From: David Miller @ 2011-11-30  0:01 UTC (permalink / raw)
  To: ulrich.weber; +Cc: netdev

From: Ulrich Weber <ulrich.weber@Sophos.com>
Date: Mon, 28 Nov 2011 21:14:51 +0100

> +	}
> +	else {
 ...
> +	}
> +	else {

Besides Julian's comments, these are not properly formatted, use:

	} else {

instead.

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH 2/3] route: set iif and oif information in flowi struct
  2011-11-28 23:53   ` Julian Anastasov
@ 2011-11-30 17:21     ` Ulrich Weber
  2011-11-30 22:37       ` Julian Anastasov
  0 siblings, 1 reply; 11+ messages in thread
From: Ulrich Weber @ 2011-11-30 17:21 UTC (permalink / raw)
  To: Julian Anastasov; +Cc: netdev, davem

On 29.11.2011 00:53, Julian Anastasov wrote:
>
> 	May be setting flowi4_oif unconditionally here is more
> correct because ip_route_output_slow fills flowi4_oif with
> the selected oif, it can even change the provided original
> oif in flowi4_oif. What about this?:
>
> 			flp4->flowi4_oif = rth->dst.dev->ifindex;
>
> 	OTOH, rt_iif has some complex semantic: original oif
> or the selected oif. May be you prefer flowi4_oif to hold
> the selected oif, right?
I wasn't aware the ip_route_output_slow() might change the original oif.
You know why this might happen? Shouldn't fib_lookup only return
a route matching the given oif?

Anyway, if thats the case your code above is more correct. The packet
should always match the xfrm policy where it was originally routed.

> 	I see one dangerous place that must be checked:
> icmp_route_lookup. Before now __ip_route_output_key was
> called after xfrm_decode_session_reverse with 0 in
> flowi4_oif, i.e. no oif binding was used. But now when
> decode_session sets flowi4_oif we will restrict the route
> via this interface?
Thanks for the hint! Yes the current patch will force the ICMP packet
over the received interface.

Will add "fl4_dec.flowi4_oif = 0;" in case the saddr is local, so the
behavior will be the same. fl4_dec.flowi4_oif will then be set in
_ip_route_output_key() again.

Cheers
Ulrich

-- 
Ulrich Weber | ulrich.weber@sophos.com | Senior Software Engineer
Astaro - a Sophos company | Amalienbadstr 41 | 76227 Karlsruhe | Germany
Phone +49-721-25516-0 | Fax –200 | www.astaro.com

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH 1/3] xfrm: add incoming interface to selector
  2011-11-30  0:00   ` David Miller
@ 2011-11-30 17:33     ` Ulrich Weber
  2011-11-30 17:47       ` David Miller
  0 siblings, 1 reply; 11+ messages in thread
From: Ulrich Weber @ 2011-11-30 17:33 UTC (permalink / raw)
  To: David Miller; +Cc: netdev

On 30.11.2011 01:00, David Miller wrote:
>
> This isn't safe, because we have no idea if existing users are putting
> garbage there.  So your change can break things.
>
> You'll have to add a netlink attribute or similar.
But a implementation matching xfrm against UID would break
existing programs too, where sel->user is set to garbage.

I checked all common programs, they set sel->user to zero:

iproute 3.1.0: sel->user is shown if set, but its not possible to set it
openswan 2.6.37: xfrm_selector memset to zero
strongswan 4.6.1: ifindex and user set to zero
ipsec-tools 0.8.0: PF_KEY only (memset to zero in net/key/af_key.c)
ike 2.1.7: PF_KEY only
isakmpd 20041012: PF_KEY only

Cheers
Ulrich

-- 
Ulrich Weber | ulrich.weber@sophos.com | Senior Software Engineer
Astaro - a Sophos company | Amalienbadstr 41 | 76227 Karlsruhe | Germany
Phone +49-721-25516-0 | Fax –200 | www.astaro.com

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH 1/3] xfrm: add incoming interface to selector
  2011-11-30 17:33     ` Ulrich Weber
@ 2011-11-30 17:47       ` David Miller
  0 siblings, 0 replies; 11+ messages in thread
From: David Miller @ 2011-11-30 17:47 UTC (permalink / raw)
  To: ulrich.weber; +Cc: netdev

From: Ulrich Weber <ulrich.weber@Sophos.com>
Date: Wed, 30 Nov 2011 18:33:54 +0100

> On 30.11.2011 01:00, David Miller wrote:
>>
>> This isn't safe, because we have no idea if existing users are putting
>> garbage there.  So your change can break things.
>>
>> You'll have to add a netlink attribute or similar.
> But a implementation matching xfrm against UID would break
> existing programs too, where sel->user is set to garbage.

Such a program would have to be used with other agents which are known
to not modify the sel->user of existing entries.

I'm not letting you reassign this existing structure member for another
use, it's been exposed to userspace for nearly 10 years.

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH 2/3] route: set iif and oif information in flowi struct
  2011-11-30 17:21     ` Ulrich Weber
@ 2011-11-30 22:37       ` Julian Anastasov
  0 siblings, 0 replies; 11+ messages in thread
From: Julian Anastasov @ 2011-11-30 22:37 UTC (permalink / raw)
  To: Ulrich Weber; +Cc: netdev, davem


	Hello,

On Wed, 30 Nov 2011, Ulrich Weber wrote:

> > 	OTOH, rt_iif has some complex semantic: original oif
> > or the selected oif. May be you prefer flowi4_oif to hold
> > the selected oif, right?
> I wasn't aware the ip_route_output_slow() might change the original oif.
> You know why this might happen? Shouldn't fib_lookup only return
> a route matching the given oif?

	There are exceptions but only in this form:

	dev_out = net->loopback_dev;

	i.e. when traffic is diverted to loopback device.
In all others cases the provided oif is considered and never
changed.

> Anyway, if thats the case your code above is more correct. The packet
> should always match the xfrm policy where it was originally routed.
> 
> > 	I see one dangerous place that must be checked:
> > icmp_route_lookup. Before now __ip_route_output_key was
> > called after xfrm_decode_session_reverse with 0 in
> > flowi4_oif, i.e. no oif binding was used. But now when
> > decode_session sets flowi4_oif we will restrict the route
> > via this interface?
> Thanks for the hint! Yes the current patch will force the ICMP packet
> over the received interface.
> 
> Will add "fl4_dec.flowi4_oif = 0;" in case the saddr is local, so the
> behavior will be the same. fl4_dec.flowi4_oif will then be set in
> _ip_route_output_key() again.

	Yes, I think this should work. I now see another
unrelated problem with this code. _decode_session4 uses
fl4->flowi4_tos = iph->tos; But we then feed fl4_dec
to __ip_route_output_key without applying IPTOS_RT_MASK.
__ip_route_output_key does not work with plain TOS from IP
header.

	Currently, the situation around flowi4_tos is as follows:

- flowi4_tos:
	- (IPTOS_RT_MASK | RTO_ONLINK) bits when provided to
	__ip_route_output_key

	- on return above value is preserved if route is
	in cache, it is not changed

	- only IPTOS_RT_MASK bits are preserved by
	ip_route_output_slow if the route is not cached
	because flowi4_tos holds the bits that are
	matched by ip rules tos XX

	As result, we can not rely on the RTO_ONLINK bit
	being valid on return. IPTOS_RT_MASK bits are preserved.

- rt_key_tos:
	- (IPTOS_RT_MASK | RTO_ONLINK) bits

	- we have a bug here: ip_route_output_slow filters
	the provided flowi4_tos by removing RTO_ONLINK bit,
	then __mkroute_output tries to get original value
	with the RTO_ONLINK bit but ends up with the
	modified value. We need to provide orig_tos
	to __mkroute_output.

	In XFRM. Currently only xfrm_bundle_create calls
xfrm_get_tos to get a routable TOS value and such value is
provided to xfrm_dst_lookup. It means __xfrm4_dst_lookup gets
such filtered TOS value, lets name it rtos (routing tos) and
provides it to output routing correctly.

	xfrm4_fill_dst probably correctly copies
flowi4_tos into rt_key_tos if fl is result of output route
and not returned by decode_session. Of course, RTO_ONLINK
can be lost. But may be problem can happen at least for icmp.c
for the xfrm_decode_session_reverse -> xfrm_lookup code
where a full IP TOS can be copied by xfrm4_fill_dst:

	rt->rt_key_tos = fl4->flowi4_tos;

	What significance has this assignment? Should we have
valid RTO_ONLINK bit in flowi4_tos ?

	So, the fl4->flowi4_tos = iph->tos code does not
look nice, I don't know if this field is used for some
matching and we need to hold all IP TOS bits. If not, may be
we do not need to play games and have to use this instead:

	fl4->flowi4_tos = iph->tos & IPTOS_RT_MASK;

	because flowi4_tos is a rtos, not an IP TOS field.
Then we can reach xfrm_lookup or __ip_route_output_key safely
after _decode_session4.

Regards

--
Julian Anastasov <ja@ssi.bg>

^ permalink raw reply	[flat|nested] 11+ messages in thread

end of thread, other threads:[~2011-11-30 22:36 UTC | newest]

Thread overview: 11+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2011-11-28 20:14 [PATCH 0/3] xfrm: add incoming interface to selector Ulrich Weber
2011-11-28 20:14 ` [PATCH 1/3] " Ulrich Weber
2011-11-30  0:00   ` David Miller
2011-11-30 17:33     ` Ulrich Weber
2011-11-30 17:47       ` David Miller
2011-11-28 20:14 ` [PATCH 2/3] route: set iif and oif information in flowi struct Ulrich Weber
2011-11-28 23:53   ` Julian Anastasov
2011-11-30 17:21     ` Ulrich Weber
2011-11-30 22:37       ` Julian Anastasov
2011-11-30  0:01   ` David Miller
2011-11-28 20:14 ` [PATCH 3/3] xfrm: allow to overwrite incoming dev after decryption Ulrich Weber

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).