netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH net-next 0/1] rtnetlink: request RTM_GETLINK by pid or fd
@ 2018-01-18 20:21 Christian Brauner
  2018-01-18 20:21 ` [PATCH net-next 1/1] " Christian Brauner
  0 siblings, 1 reply; 23+ messages in thread
From: Christian Brauner @ 2018-01-18 20:21 UTC (permalink / raw)
  To: davem, dsahern, fw, daniel, lucien.xin, mschiffer,
	jakub.kicinski, vyasevich, jbenc, netdev, linux-kernel
  Cc: Christian Brauner

Hey everyone,

This makes it possible to identify the target network namespace of a
RTM_GETLINK message by pid or fd.
Often userspace tools that make heavy use of network namespaces need a simple
and cheap way of querying network devices and network device properties. This
becomes even more crucial when the network namespaces in question are
transient. In such scenarios setting a netns id property is not really wanted
and it is preferable to avoid the hit of (possibly multiple) setns() syscalls
(e.g. attaching to the target network namespace and back to the original
network namespace.). This commit lets userspace set the IFLA_NET_NS_{FD,PID}
property to identify a target network namespace where the device in question is
to be queried.
I couldn't find any obvious reason why this shouldn't be allowed but I haven't
done a deep dive into the possible security implications. So if I missed a very
obvious point why this wasn't possible so far, I'm sorry.

Christian

Christian Brauner (1):
  rtnetlink: request RTM_GETLINK by pid or fd

 net/core/rtnetlink.c | 63 +++++++++++++++++++++++++++++++++++++++++-----------
 1 file changed, 50 insertions(+), 13 deletions(-)

-- 
2.14.1

^ permalink raw reply	[flat|nested] 23+ messages in thread

* [PATCH net-next 1/1] rtnetlink: request RTM_GETLINK by pid or fd
  2018-01-18 20:21 [PATCH net-next 0/1] rtnetlink: request RTM_GETLINK by pid or fd Christian Brauner
@ 2018-01-18 20:21 ` Christian Brauner
  2018-01-18 20:29   ` Jiri Benc
  0 siblings, 1 reply; 23+ messages in thread
From: Christian Brauner @ 2018-01-18 20:21 UTC (permalink / raw)
  To: davem, dsahern, fw, daniel, lucien.xin, mschiffer,
	jakub.kicinski, vyasevich, jbenc, netdev, linux-kernel
  Cc: Christian Brauner

This makes it possible to identify the target network namespace of a
RTM_GETLINK message by pid or fd.
Often userspace tools that make heavy use of network namespaces need a
simple and cheap way of querying network devices and network device
properties. This becomes even more crucial when the network namespaces in
question are transient. In such scenarios setting a netns id property is
not really wanted and it is preferable to avoid the hit of (possibly
multiple) setns() syscalls (e.g. attaching to the target network namespace
and back to the original network namespace.). This commit lets userspace
set the IFLA_NET_NS_{FD,PID} property to identify a target network
namespace where the device in question is to be queried.

Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
---
 net/core/rtnetlink.c | 63 +++++++++++++++++++++++++++++++++++++++++-----------
 1 file changed, 50 insertions(+), 13 deletions(-)

diff --git a/net/core/rtnetlink.c b/net/core/rtnetlink.c
index 16d644a4f974..5210448dcf1f 100644
--- a/net/core/rtnetlink.c
+++ b/net/core/rtnetlink.c
@@ -1509,7 +1509,8 @@ static int rtnl_fill_ifinfo(struct sk_buff *skb,
 			    struct net_device *dev, struct net *src_net,
 			    int type, u32 pid, u32 seq, u32 change,
 			    unsigned int flags, u32 ext_filter_mask,
-			    u32 event, int *new_nsid, int tgt_netnsid)
+			    u32 event, int *new_nsid, int tgt_netnsid,
+			    int tgt_netnsid_type)
 {
 	struct ifinfomsg *ifm;
 	struct nlmsghdr *nlh;
@@ -1527,8 +1528,23 @@ static int rtnl_fill_ifinfo(struct sk_buff *skb,
 	ifm->ifi_flags = dev_get_flags(dev);
 	ifm->ifi_change = change;
 
-	if (tgt_netnsid >= 0 && nla_put_s32(skb, IFLA_IF_NETNSID, tgt_netnsid))
-		goto nla_put_failure;
+	if (tgt_netnsid >= 0) {
+		u32 err = -1;
+
+		switch (tgt_netnsid_type) {
+		case IFLA_IF_NETNSID:
+			err = nla_put_s32(skb, IFLA_IF_NETNSID, tgt_netnsid);
+			break;
+		case IFLA_NET_NS_PID:
+			err = nla_put_s32(skb, IFLA_NET_NS_PID, tgt_netnsid);
+			break;
+		case IFLA_NET_NS_FD:
+			err = nla_put_s32(skb, IFLA_NET_NS_FD, tgt_netnsid);
+			break;
+		}
+		if (err)
+			goto nla_put_failure;
+	}
 
 	if (nla_put_string(skb, IFLA_IFNAME, dev->name) ||
 	    nla_put_u32(skb, IFLA_TXQLEN, dev->tx_queue_len) ||
@@ -1791,6 +1807,7 @@ static int rtnl_dump_ifinfo(struct sk_buff *skb, struct netlink_callback *cb)
 	unsigned int flags = NLM_F_MULTI;
 	int master_idx = 0;
 	int netnsid = -1;
+	int netnsid_type = -1;
 	int err;
 	int hdrlen;
 
@@ -1810,12 +1827,22 @@ static int rtnl_dump_ifinfo(struct sk_buff *skb, struct netlink_callback *cb)
 	if (nlmsg_parse(cb->nlh, hdrlen, tb, IFLA_MAX,
 			ifla_policy, NULL) >= 0) {
 		if (tb[IFLA_IF_NETNSID]) {
+			netnsid_type = IFLA_IF_NETNSID;
 			netnsid = nla_get_s32(tb[IFLA_IF_NETNSID]);
 			tgt_net = get_target_net(skb->sk, netnsid);
-			if (IS_ERR(tgt_net)) {
-				tgt_net = net;
-				netnsid = -1;
-			}
+		} else if (tb[IFLA_NET_NS_PID]) {
+			netnsid_type = IFLA_NET_NS_PID;
+			netnsid = nla_get_s32(tb[IFLA_NET_NS_PID]);
+			tgt_net = get_net_ns_by_pid(netnsid);
+		} else if (tb[IFLA_NET_NS_FD]) {
+			netnsid_type = IFLA_NET_NS_FD;
+			netnsid = nla_get_s32(tb[IFLA_NET_NS_FD]);
+			tgt_net = get_net_ns_by_fd(netnsid);
+		}
+		if (IS_ERR(tgt_net)) {
+			tgt_net = net;
+			netnsid = -1;
+			netnsid_type = -1;
 		}
 
 		if (tb[IFLA_EXT_MASK])
@@ -1845,7 +1872,7 @@ static int rtnl_dump_ifinfo(struct sk_buff *skb, struct netlink_callback *cb)
 					       cb->nlh->nlmsg_seq, 0,
 					       flags,
 					       ext_filter_mask, 0, NULL,
-					       netnsid);
+					       netnsid, netnsid_type);
 
 			if (err < 0) {
 				if (likely(skb->len))
@@ -2984,6 +3011,7 @@ static int rtnl_getlink(struct sk_buff *skb, struct nlmsghdr *nlh,
 	struct net_device *dev = NULL;
 	struct sk_buff *nskb;
 	int netnsid = -1;
+	int netnsid_type = -1;
 	int err;
 	u32 ext_filter_mask = 0;
 
@@ -2992,11 +3020,20 @@ static int rtnl_getlink(struct sk_buff *skb, struct nlmsghdr *nlh,
 		return err;
 
 	if (tb[IFLA_IF_NETNSID]) {
+		netnsid_type = IFLA_IF_NETNSID;
 		netnsid = nla_get_s32(tb[IFLA_IF_NETNSID]);
 		tgt_net = get_target_net(NETLINK_CB(skb).sk, netnsid);
-		if (IS_ERR(tgt_net))
-			return PTR_ERR(tgt_net);
-	}
+	} else if (tb[IFLA_NET_NS_PID]) {
+		netnsid_type = IFLA_NET_NS_PID;
+		netnsid = nla_get_s32(tb[IFLA_NET_NS_PID]);
+		tgt_net = get_net_ns_by_pid(netnsid);
+	} else if (tb[IFLA_NET_NS_FD]) {
+		netnsid_type = IFLA_NET_NS_FD;
+		netnsid = nla_get_s32(tb[IFLA_NET_NS_FD]);
+		tgt_net = get_net_ns_by_fd(netnsid);
+	}
+	if (IS_ERR(tgt_net))
+		return PTR_ERR(tgt_net);
 
 	if (tb[IFLA_IFNAME])
 		nla_strlcpy(ifname, tb[IFLA_IFNAME], IFNAMSIZ);
@@ -3025,7 +3062,7 @@ static int rtnl_getlink(struct sk_buff *skb, struct nlmsghdr *nlh,
 	err = rtnl_fill_ifinfo(nskb, dev, net,
 			       RTM_NEWLINK, NETLINK_CB(skb).portid,
 			       nlh->nlmsg_seq, 0, 0, ext_filter_mask,
-			       0, NULL, netnsid);
+			       0, NULL, netnsid, netnsid_type);
 	if (err < 0) {
 		/* -EMSGSIZE implies BUG in if_nlmsg_size */
 		WARN_ON(err == -EMSGSIZE);
@@ -3134,7 +3171,7 @@ struct sk_buff *rtmsg_ifinfo_build_skb(int type, struct net_device *dev,
 
 	err = rtnl_fill_ifinfo(skb, dev, dev_net(dev),
 			       type, 0, 0, change, 0, 0, event,
-			       new_nsid, -1);
+			       new_nsid, -1, -1);
 	if (err < 0) {
 		/* -EMSGSIZE implies BUG in if_nlmsg_size() */
 		WARN_ON(err == -EMSGSIZE);
-- 
2.14.1

^ permalink raw reply related	[flat|nested] 23+ messages in thread

* Re: [PATCH net-next 1/1] rtnetlink: request RTM_GETLINK by pid or fd
  2018-01-18 20:21 ` [PATCH net-next 1/1] " Christian Brauner
@ 2018-01-18 20:29   ` Jiri Benc
  2018-01-18 20:55     ` Christian Brauner
  0 siblings, 1 reply; 23+ messages in thread
From: Jiri Benc @ 2018-01-18 20:29 UTC (permalink / raw)
  To: Christian Brauner
  Cc: davem, dsahern, fw, daniel, lucien.xin, mschiffer,
	jakub.kicinski, vyasevich, netdev, linux-kernel

On Thu, 18 Jan 2018 21:21:24 +0100, Christian Brauner wrote:
> In such scenarios setting a netns id property is
> not really wanted

Why? I think that's what you should do if you want to avoid setns. Just
use netnsid. I don't see any problem with that and you didn't provide
any explanation.

 Jiri

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH net-next 1/1] rtnetlink: request RTM_GETLINK by pid or fd
  2018-01-18 20:29   ` Jiri Benc
@ 2018-01-18 20:55     ` Christian Brauner
  2018-01-22 21:00       ` Jiri Benc
  0 siblings, 1 reply; 23+ messages in thread
From: Christian Brauner @ 2018-01-18 20:55 UTC (permalink / raw)
  To: Jiri Benc
  Cc: Christian Brauner, davem, dsahern, fw, daniel, lucien.xin,
	mschiffer, jakub.kicinski, vyasevich, netdev, linux-kernel,
	stephen

On Thu, Jan 18, 2018 at 09:29:14PM +0100, Jiri Benc wrote:
> On Thu, 18 Jan 2018 21:21:24 +0100, Christian Brauner wrote:
> > In such scenarios setting a netns id property is
> > not really wanted
> 
> Why? I think that's what you should do if you want to avoid setns. Just
> use netnsid. I don't see any problem with that and you didn't provide
> any explanation.

Ah, sorry, maybe I was to brief. When creating and destroying a lot of
short-lived network namespaces it seems unnecessary to give them all
label/netns id. Using a netns id makes much more sense when you want a
persistent, long-living network namespace. For example, iproute2 where
you want to create a persistent network namespace that sticks around via
ip netns add bla && ip netns set bla 5.

A more concrete scenario is creating a network namespace, moving a
device into it via RTM_SETLINK which also supports IFLA_NET_NS_{FD,PID}
and then wanting to query the device. This would be very easy to do if
one could reuse the IFLA_NET_NS_{FD,PID} without having to set a
netnsid.

Christian

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH net-next 1/1] rtnetlink: request RTM_GETLINK by pid or fd
  2018-01-18 20:55     ` Christian Brauner
@ 2018-01-22 21:00       ` Jiri Benc
  2018-01-22 21:23         ` Christian Brauner
  0 siblings, 1 reply; 23+ messages in thread
From: Jiri Benc @ 2018-01-22 21:00 UTC (permalink / raw)
  To: Christian Brauner
  Cc: Christian Brauner, davem, dsahern, fw, daniel, lucien.xin,
	mschiffer, jakub.kicinski, vyasevich, netdev, linux-kernel,
	stephen

On Thu, 18 Jan 2018 21:55:53 +0100, Christian Brauner wrote:
> A more concrete scenario is creating a network namespace, moving a
> device into it via RTM_SETLINK which also supports IFLA_NET_NS_{FD,PID}
> and then wanting to query the device. This would be very easy to do if
> one could reuse the IFLA_NET_NS_{FD,PID} without having to set a
> netnsid.

Wouldn't be a better solution to have a way to request the netnsid
allocation (and return it) right away when creating the name space,
then?

 Jiri

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH net-next 1/1] rtnetlink: request RTM_GETLINK by pid or fd
  2018-01-22 21:00       ` Jiri Benc
@ 2018-01-22 21:23         ` Christian Brauner
  2018-01-22 22:06           ` Jiri Benc
  0 siblings, 1 reply; 23+ messages in thread
From: Christian Brauner @ 2018-01-22 21:23 UTC (permalink / raw)
  To: Jiri Benc
  Cc: Christian Brauner, davem, dsahern, fw, daniel, lucien.xin,
	mschiffer, jakub.kicinski, vyasevich, netdev, linux-kernel,
	stephen

On Mon, Jan 22, 2018 at 10:00:46PM +0100, Jiri Benc wrote:
> On Thu, 18 Jan 2018 21:55:53 +0100, Christian Brauner wrote:
> > A more concrete scenario is creating a network namespace, moving a
> > device into it via RTM_SETLINK which also supports IFLA_NET_NS_{FD,PID}
> > and then wanting to query the device. This would be very easy to do if
> > one could reuse the IFLA_NET_NS_{FD,PID} without having to set a
> > netnsid.
> 
> Wouldn't be a better solution to have a way to request the netnsid
> allocation (and return it) right away when creating the name space,
> then?

That is certainly a good idea and I'm happy to send a follow-up patch
for that! But there's still value in being able to use
IFLA_NET_NS_{FD,PID} in scenarios where the network namespace has been
created by another process. In this case we don't know what its netnsid
is and not even if it had been assigned one at creation time or not. In
this case it would be useful to refer to the netns via a pid or fd. A
more concrete and frequent example is querying a network namespace of a
(sorry for the buzzword :)) container for all defined network
interfaces.

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH net-next 1/1] rtnetlink: request RTM_GETLINK by pid or fd
  2018-01-22 21:23         ` Christian Brauner
@ 2018-01-22 22:06           ` Jiri Benc
  2018-01-22 22:25             ` Christian Brauner
  2018-01-23 16:37             ` Nicolas Dichtel
  0 siblings, 2 replies; 23+ messages in thread
From: Jiri Benc @ 2018-01-22 22:06 UTC (permalink / raw)
  To: Christian Brauner
  Cc: Christian Brauner, davem, dsahern, fw, daniel, lucien.xin,
	mschiffer, jakub.kicinski, vyasevich, netdev, linux-kernel,
	stephen

On Mon, 22 Jan 2018 22:23:54 +0100, Christian Brauner wrote:
> That is certainly a good idea and I'm happy to send a follow-up patch
> for that!

Note that I haven't looked into that and I don't know whether it is
easily possible. I'll appreciate if you could try that.

> But there's still value in being able to use
> IFLA_NET_NS_{FD,PID} in scenarios where the network namespace has been
> created by another process. In this case we don't know what its netnsid
> is and not even if it had been assigned one at creation time or not. In
> this case it would be useful to refer to the netns via a pid or fd. A
> more concrete and frequent example is querying a network namespace of a
> (sorry for the buzzword :)) container for all defined network
> interfaces.

That's what spurred my original comment. If you don't know the netnsid
in such case, we're missing something in uAPI but at a different point
than RTM_GETLINK.

When you find yourself in a need to query an interface in another
netns, you had to learn about that interface in the first place.
Meaning you got its ifindex (or ifname, perhaps) somehow. My point is,
you should have learned the netnsid at the same time.

ifindex alone is not an unique identifier of an interface. The
(ifindex, netnsid) pair is. (Also, note that ifindex can change when
moving the interface to a different netns.) So you should never get
ifindex alone when the interface is in another netns than the current
one. If that happens, that is the uAPI that needs to be fixed. You have
to always get the (ifindex, netnsid) pair.

And that's also the way how it should operate in the other direction:
(ifindex, netnsid) is the identifier to query interface in another
netns.

Note that many APIs in networking are based around netnsid - look at
NETLINK_LISTEN_ALL_NSID. It allows you to keep track of interfaces as
they are moved from name spaces to name spaces using a single socket:
you get notifications about interfaces disappearing or appearing in
"watched" name spaces. The name spaces are, of course, referenced by
their netnsid. In order to add another "watched" name space, just
assign it (or, more typically, let the kernel assign) a netnsid.

Btw, we have one missing piece here: when an interface is moved to a
name space that does not have netnsid attached, we want to find out
where the interface was moved to. But there's currently no reliable way
to do it. For veth, the other end can be used to get the netnsid (note
that RTM_GETLINK will return the correct link netnsid even when the
queried veth interface is in a different name space), for openvswitch,
we can now use genetlink, etc., but using different ways for different
interface types is not the best API ever and for standalone interfaces
we have nothing. I'd like to see something added to uAPI to cover this
in a generic way.

But as for this patch, I don't think it's the correct way. We do have
missing pieces in uAPI wrt. netns support but I think they're at
different places. If you have a counter example, please speak up.

Thanks,

 Jiri

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH net-next 1/1] rtnetlink: request RTM_GETLINK by pid or fd
  2018-01-22 22:06           ` Jiri Benc
@ 2018-01-22 22:25             ` Christian Brauner
  2018-01-23  9:30               ` Jiri Benc
  2018-01-23 16:37             ` Nicolas Dichtel
  1 sibling, 1 reply; 23+ messages in thread
From: Christian Brauner @ 2018-01-22 22:25 UTC (permalink / raw)
  To: Jiri Benc
  Cc: Christian Brauner, davem, dsahern, fw, daniel, lucien.xin,
	mschiffer, jakub.kicinski, vyasevich, netdev, linux-kernel,
	stephen

On Mon, Jan 22, 2018 at 11:06:16PM +0100, Jiri Benc wrote:
> On Mon, 22 Jan 2018 22:23:54 +0100, Christian Brauner wrote:
> > That is certainly a good idea and I'm happy to send a follow-up patch
> > for that!
> 
> Note that I haven't looked into that and I don't know whether it is
> easily possible. I'll appreciate if you could try that.
> 
> > But there's still value in being able to use
> > IFLA_NET_NS_{FD,PID} in scenarios where the network namespace has been
> > created by another process. In this case we don't know what its netnsid
> > is and not even if it had been assigned one at creation time or not. In
> > this case it would be useful to refer to the netns via a pid or fd. A
> > more concrete and frequent example is querying a network namespace of a
> > (sorry for the buzzword :)) container for all defined network
> > interfaces.
> 
> That's what spurred my original comment. If you don't know the netnsid
> in such case, we're missing something in uAPI but at a different point
> than RTM_GETLINK.
> 
> When you find yourself in a need to query an interface in another
> netns, you had to learn about that interface in the first place.
> Meaning you got its ifindex (or ifname, perhaps) somehow. My point is,
> you should have learned the netnsid at the same time.

<snip>

> different places. If you have a counter example, please speak up.

This is not necessarily true in scenarios where I move a network device
via RTM_NEWLINK + IFLA_NET_NS_PID into a network namespace I haven't
created. Here is an example:

nlmsghdr->nlmsg_flags = NLM_F_REQUEST | NLM_F_ACK;
nlmsghdr->nlmsg_type = RTM_NEWLINK;
/* move to network namespace of pid */
nla_put_u32(nlmsg, IFLA_NET_NS_PID, pid)
/* give interface new name */
nla_put_string(nlmsg, IFLA_IFNAME, ifname)

The only thing I have is the pid that identifies the network namespace.
There's no non-syscall way to learn the netnsid. In this case I just
want to be able to do the analogue to RTM_NEWLINK + IFLA_NET_NS_PID aka
RTM_GETLINK + IFLA_NET_NS_PID to e.g. retrieve the list of all network
devices in the network namespace identified by pid.

Christian

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH net-next 1/1] rtnetlink: request RTM_GETLINK by pid or fd
  2018-01-22 22:25             ` Christian Brauner
@ 2018-01-23  9:30               ` Jiri Benc
  2018-01-23 10:26                 ` Wolfgang Bumiller
  0 siblings, 1 reply; 23+ messages in thread
From: Jiri Benc @ 2018-01-23  9:30 UTC (permalink / raw)
  To: Christian Brauner
  Cc: Christian Brauner, davem, dsahern, fw, daniel, lucien.xin,
	mschiffer, jakub.kicinski, vyasevich, netdev, linux-kernel,
	stephen

On Mon, 22 Jan 2018 23:25:41 +0100, Christian Brauner wrote:
> This is not necessarily true in scenarios where I move a network device
> via RTM_NEWLINK + IFLA_NET_NS_PID into a network namespace I haven't
> created. Here is an example:
> 
> nlmsghdr->nlmsg_flags = NLM_F_REQUEST | NLM_F_ACK;
> nlmsghdr->nlmsg_type = RTM_NEWLINK;
> /* move to network namespace of pid */
> nla_put_u32(nlmsg, IFLA_NET_NS_PID, pid)
> /* give interface new name */
> nla_put_string(nlmsg, IFLA_IFNAME, ifname)
> 
> The only thing I have is the pid that identifies the network namespace.

How do you know the interface did not get renamed in the new netns?

This is racy and won't work reliably. You really need to know the
netnsid before moving the interface to the netns to be able to do
meaningful queries.

You may argue that for your case, you're fine with the race. But I know
about use cases where it matters a lot: those are tools that show
network topology including changes in real time, such as Skydive. It's
important to have the uAPI designed right. And we don't want two
different APIs for the same thing.

If you want to do any watching over the interfaces (as opposed to "move
to the netns and forget"), you really have to work with netnsids. Let's
focus on how to do that more easily. We don't return netnsid at all
places where we should return it and we need to fix that.

> There's no non-syscall way to learn the netnsid.

And that is the primary problem.

 Jiri

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH net-next 1/1] rtnetlink: request RTM_GETLINK by pid or fd
  2018-01-23  9:30               ` Jiri Benc
@ 2018-01-23 10:26                 ` Wolfgang Bumiller
  2018-01-23 10:42                   ` Jiri Benc
  2018-01-23 16:50                   ` [PATCH net-next 1/1] rtnetlink: request RTM_GETLINK by pid or fd Nicolas Dichtel
  0 siblings, 2 replies; 23+ messages in thread
From: Wolfgang Bumiller @ 2018-01-23 10:26 UTC (permalink / raw)
  To: Jiri Benc
  Cc: Christian Brauner, Christian Brauner, davem, dsahern, fw, daniel,
	lucien.xin, mschiffer, jakub.kicinski, vyasevich, netdev,
	linux-kernel, stephen

On Tue, Jan 23, 2018 at 10:30:09AM +0100, Jiri Benc wrote:
> On Mon, 22 Jan 2018 23:25:41 +0100, Christian Brauner wrote:
> > This is not necessarily true in scenarios where I move a network device
> > via RTM_NEWLINK + IFLA_NET_NS_PID into a network namespace I haven't
> > created. Here is an example:
> > 
> > nlmsghdr->nlmsg_flags = NLM_F_REQUEST | NLM_F_ACK;
> > nlmsghdr->nlmsg_type = RTM_NEWLINK;
> > /* move to network namespace of pid */
> > nla_put_u32(nlmsg, IFLA_NET_NS_PID, pid)
> > /* give interface new name */
> > nla_put_string(nlmsg, IFLA_IFNAME, ifname)
> > 
> > The only thing I have is the pid that identifies the network namespace.
> 
> How do you know the interface did not get renamed in the new netns?
> 
> This is racy and won't work reliably. You really need to know the
> netnsid before moving the interface to the netns to be able to do
> meaningful queries.

Even if you know the netnsid, do the mentioned watches work for
nested/child namespaces if eg. a container creates new namespace before
and/or after the watch was established and moves interfaces to these
child namespaces, would you just see them disappear, or can you keep
track of them later on as well?

Even if that works, from what the documentation tells me netlink is an
unreliable protocol, so if my watcher's socket buffer is full, wouldn't
I be losing important tracking information?

I think one possible solution to tracking interfaces would be to have a
unique identifier that never changes (even if it's just a simple
uint64_t incremented whenever an interface is created). But since
they're not local to the current namespace that may require a lot of
extra permission checks (but I'm just speculating here...).

In any case, IFLA_NET_NS_FD/PID are already there and I had been
wondering previously why they couldn't be used with RTM_GETLINK, it
would just make sense.

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH net-next 1/1] rtnetlink: request RTM_GETLINK by pid or fd
  2018-01-23 10:26                 ` Wolfgang Bumiller
@ 2018-01-23 10:42                   ` Jiri Benc
       [not found]                     ` <20180123114218.vsm5nu2jajrqjvko@gmail.com>
  2018-01-23 16:50                   ` [PATCH net-next 1/1] rtnetlink: request RTM_GETLINK by pid or fd Nicolas Dichtel
  1 sibling, 1 reply; 23+ messages in thread
From: Jiri Benc @ 2018-01-23 10:42 UTC (permalink / raw)
  To: Wolfgang Bumiller
  Cc: Christian Brauner, Christian Brauner, davem, dsahern, fw, daniel,
	lucien.xin, mschiffer, jakub.kicinski, vyasevich, netdev,
	linux-kernel, stephen

On Tue, 23 Jan 2018 11:26:58 +0100, Wolfgang Bumiller wrote:
> Even if you know the netnsid, do the mentioned watches work for
> nested/child namespaces if eg. a container creates new namespace before
> and/or after the watch was established and moves interfaces to these
> child namespaces, would you just see them disappear, or can you keep
> track of them later on as well?

What do you mean by "nested namespaces"? There's no such thing for net
name spaces.

As for missing API to get netnsid of the netns the interface is moved
to, see my previous emails in this thread. This needs to be added.

> Even if that works, from what the documentation tells me netlink is an
> unreliable protocol, so if my watcher's socket buffer is full, wouldn't
> I be losing important tracking information?

Sure. But that's fundamentally unfixable independently on netlink, the
kernel needs to take an action if a program is not reading its
messages. Either some messages get dropped or the program is killed or
infinite amount of memory is consumed. This has nothing to do with uAPI
design.

> I think one possible solution to tracking interfaces would be to have a
> unique identifier that never changes (even if it's just a simple
> uint64_t incremented whenever an interface is created). But since
> they're not local to the current namespace that may require a lot of
> extra permission checks (but I'm just speculating here...).

You'll get a hard NACK from CRIU folks if you try to propose this.

> In any case, IFLA_NET_NS_FD/PID are already there and I had been
> wondering previously why they couldn't be used with RTM_GETLINK, it
> would just make sense.

Those predate netnsids and we can't get rid of them now, since they're
part of uAPI. But we can (and should) make sure we don't add more of
those.

 Jiri

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH net-next 1/1] rtnetlink: request RTM_GETLINK by pid or fd
       [not found]                     ` <20180123114218.vsm5nu2jajrqjvko@gmail.com>
@ 2018-01-23 12:22                       ` Jiri Benc
  2018-01-23 16:55                         ` Nicolas Dichtel
                                           ` (3 more replies)
  0 siblings, 4 replies; 23+ messages in thread
From: Jiri Benc @ 2018-01-23 12:22 UTC (permalink / raw)
  To: Christian Brauner; +Cc: netdev

(Christian, I'm adding back the netdev list, there's no reason not to
have the discussion in open.)

On Tue, 23 Jan 2018 12:42:19 +0100, Christian Brauner wrote:
> Thanks for the comments and discussion. Sorry, for not going through the
> list but I just have a quick question that doesn't deserve to spam the
> whole netdev list. :) I have written another set of patches that would
> add support for IFLA_IF_NETNSID to RTM_{N,S}ETLINK and friends which I
> planned to send out after the pid and fd ones. Would you be interested
> in those if I sent them out over the next week?

Yes, please CC me. I will have limited time next week (whole day
meetings for the most of the week) but I'll try to review.

In general, I see value in adding netnsid support to setlink, newlink
and dellink. It's a bit of minefield, though. Look for and be sure to
resolve:

- Backwards compatibility with old kernels. Old kernels will ignore
  the IFLA_IF_NETNSID attribute that is unknown to them. And it's
  certainly undesirable for an application to attempt to set/create an
  interface in a given netns and instead the kernel sets/creates an
  interface in the current netns.

  In 79e1ad148c844, I put into place means to handle that: any kernel
  that understands IFLA_IF_NETNSID but does not support
  setlink/newlink/dellink will return -EOPNOTSUPP. This reduces the
  problem to detecting whether the kernel understands IFLA_IF_NETNSID.
  This can be done by invoking getlink with IFLA_IF_NETNSID and
  looking whether the reply contains IFLA_IF_NETNSID.

  The current getlink calls are also limited to CAP_NET_ADMIN in the
  *target* netns. If/when this is relaxed in the future, we still need
  to stay backwards compatible.

  So, the application needs to do this:
  1. call RTM_GETLINK with IFLA_IF_NETNSID on lo (or do a dump)
  2. if getting EACCESS, assume IFLA_IF_NETNSID is not supported =>
     fall back to other means
  2. if the reply does not contain IFLA_IF_NETNSID, the kernel does not
     support IFLA_IF_NETNSID => fall back to other means
  3. try RTM_SETLINK/NEWLINK/DELLINK and check for EOPNOTSUPP => fall
     back to other means

  What is important for your kernel patches is to not break this. If
  you implement only partial support for some of the operation, think
  about the future extensibility. Applications have to be able to
  reliably detect what is supported on the running kernel, regardless
  of the kernel version.

- Access rights. This is security sensitive stuff. I took the big
  hammer and limited everything to CAP_NET_ADMIN in the target netns.
  If you want to relax this, be sure to get review from people involved
  in name space security.

 Jiri

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH net-next 1/1] rtnetlink: request RTM_GETLINK by pid or fd
  2018-01-22 22:06           ` Jiri Benc
  2018-01-22 22:25             ` Christian Brauner
@ 2018-01-23 16:37             ` Nicolas Dichtel
  2018-01-23 17:08               ` Jiri Benc
  1 sibling, 1 reply; 23+ messages in thread
From: Nicolas Dichtel @ 2018-01-23 16:37 UTC (permalink / raw)
  To: Jiri Benc, Christian Brauner
  Cc: Christian Brauner, davem, dsahern, fw, daniel, lucien.xin,
	mschiffer, jakub.kicinski, vyasevich, netdev, linux-kernel,
	stephen

Le 22/01/2018 à 23:06, Jiri Benc a écrit :
[snip]
> Btw, we have one missing piece here: when an interface is moved to a
> name space that does not have netnsid attached, we want to find out
> where the interface was moved to. But there's currently no reliable way
> to do it. For veth, the other end can be used to get the netnsid (note
> that RTM_GETLINK will return the correct link netnsid even when the
> queried veth interface is in a different name space), for openvswitch,
> we can now use genetlink, etc., but using different ways for different
> interface types is not the best API ever and for standalone interfaces
> we have nothing. I'd like to see something added to uAPI to cover this
> in a generic way.
When a virtual interface moves to another netns, the netlink RTM_DELLINK message
contains the attribute IFLA_NEW_NETNSID, which identifies where the interface
moves. The nsid may be allocated if needed.
I don't know if it's acceptable to also allocate an nsid in case of a physical
interface.


Regards,
Nicolas

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH net-next 1/1] rtnetlink: request RTM_GETLINK by pid or fd
  2018-01-23 10:26                 ` Wolfgang Bumiller
  2018-01-23 10:42                   ` Jiri Benc
@ 2018-01-23 16:50                   ` Nicolas Dichtel
  1 sibling, 0 replies; 23+ messages in thread
From: Nicolas Dichtel @ 2018-01-23 16:50 UTC (permalink / raw)
  To: Wolfgang Bumiller, Jiri Benc
  Cc: Christian Brauner, Christian Brauner, davem, dsahern, fw, daniel,
	lucien.xin, mschiffer, jakub.kicinski, vyasevich, netdev,
	linux-kernel, stephen

Le 23/01/2018 à 11:26, Wolfgang Bumiller a écrit :
> On Tue, Jan 23, 2018 at 10:30:09AM +0100, Jiri Benc wrote:
>> On Mon, 22 Jan 2018 23:25:41 +0100, Christian Brauner wrote:
>>> This is not necessarily true in scenarios where I move a network device
>>> via RTM_NEWLINK + IFLA_NET_NS_PID into a network namespace I haven't
>>> created. Here is an example:
>>>
>>> nlmsghdr->nlmsg_flags = NLM_F_REQUEST | NLM_F_ACK;
>>> nlmsghdr->nlmsg_type = RTM_NEWLINK;
>>> /* move to network namespace of pid */
>>> nla_put_u32(nlmsg, IFLA_NET_NS_PID, pid)
>>> /* give interface new name */
>>> nla_put_string(nlmsg, IFLA_IFNAME, ifname)
>>>
>>> The only thing I have is the pid that identifies the network namespace.
>>
>> How do you know the interface did not get renamed in the new netns?
>>
>> This is racy and won't work reliably. You really need to know the
>> netnsid before moving the interface to the netns to be able to do
>> meaningful queries.
> 
> Even if you know the netnsid, do the mentioned watches work for
> nested/child namespaces if eg. a container creates new namespace before
> and/or after the watch was established and moves interfaces to these
> child namespaces, would you just see them disappear, or can you keep
> track of them later on as well?
nsid can be monitored (see ip monitor nsid).

> 
> Even if that works, from what the documentation tells me netlink is an
> unreliable protocol, so if my watcher's socket buffer is full, wouldn't
> I be losing important tracking information?
You can track socket error statistics. In case of error, you can start a dump to
ensure that you have the right view of the system.

> 
> I think one possible solution to tracking interfaces would be to have a
> unique identifier that never changes (even if it's just a simple
> uint64_t incremented whenever an interface is created). But since
> they're not local to the current namespace that may require a lot of
> extra permission checks (but I'm just speculating here...).
It's not possible to have unique identifiers. With CRIU, you need to be able to
reassign all existing ids.

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH net-next 1/1] rtnetlink: request RTM_GETLINK by pid or fd
  2018-01-23 12:22                       ` Jiri Benc
@ 2018-01-23 16:55                         ` Nicolas Dichtel
  2018-01-23 18:05                           ` Christian Brauner
  2018-01-24 11:32                         ` [PATCH net-next 0/3] rtnetlink: enable IFLA_IF_NETNSID for RTM_{DEL,SET}LINK Christian Brauner
                                           ` (2 subsequent siblings)
  3 siblings, 1 reply; 23+ messages in thread
From: Nicolas Dichtel @ 2018-01-23 16:55 UTC (permalink / raw)
  To: Jiri Benc, Christian Brauner; +Cc: netdev

Le 23/01/2018 à 13:22, Jiri Benc a écrit :
> (Christian, I'm adding back the netdev list, there's no reason not to
> have the discussion in open.)
> 
> On Tue, 23 Jan 2018 12:42:19 +0100, Christian Brauner wrote:
>> Thanks for the comments and discussion. Sorry, for not going through the
>> list but I just have a quick question that doesn't deserve to spam the
>> whole netdev list. :) I have written another set of patches that would
>> add support for IFLA_IF_NETNSID to RTM_{N,S}ETLINK and friends which I
>> planned to send out after the pid and fd ones. Would you be interested
>> in those if I sent them out over the next week?
> 
> Yes, please CC me. I will have limited time next week (whole day
> meetings for the most of the week) but I'll try to review.
CC me also please.


Regards,
Nicolas

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH net-next 1/1] rtnetlink: request RTM_GETLINK by pid or fd
  2018-01-23 16:37             ` Nicolas Dichtel
@ 2018-01-23 17:08               ` Jiri Benc
  2018-01-24 10:53                 ` Nicolas Dichtel
  0 siblings, 1 reply; 23+ messages in thread
From: Jiri Benc @ 2018-01-23 17:08 UTC (permalink / raw)
  To: Nicolas Dichtel
  Cc: Christian Brauner, Christian Brauner, davem, dsahern, fw, daniel,
	lucien.xin, mschiffer, jakub.kicinski, vyasevich, netdev,
	linux-kernel, stephen

On Tue, 23 Jan 2018 17:37:11 +0100, Nicolas Dichtel wrote:
> When a virtual interface moves to another netns, the netlink RTM_DELLINK message
> contains the attribute IFLA_NEW_NETNSID, which identifies where the interface
> moves. The nsid may be allocated if needed.

The problem is that ifindex may change and it's not announced. The only
way is to track both ifindex and ifname, watch for the ifname to appear
in the target netns and update the application's view of ifindex.

It would be much better if the whole (ifindex, netnsid) pair was
returned. I think it could be added.

> I don't know if it's acceptable to also allocate an nsid in case of a physical
> interface.

I think we need that.

Thanks,

 Jiri

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH net-next 1/1] rtnetlink: request RTM_GETLINK by pid or fd
  2018-01-23 16:55                         ` Nicolas Dichtel
@ 2018-01-23 18:05                           ` Christian Brauner
  0 siblings, 0 replies; 23+ messages in thread
From: Christian Brauner @ 2018-01-23 18:05 UTC (permalink / raw)
  To: Nicolas Dichtel; +Cc: Jiri Benc, netdev

On Tue, Jan 23, 2018 at 05:55:27PM +0100, Nicolas Dichtel wrote:
> Le 23/01/2018 à 13:22, Jiri Benc a écrit :
> > (Christian, I'm adding back the netdev list, there's no reason not to
> > have the discussion in open.)
> > 
> > On Tue, 23 Jan 2018 12:42:19 +0100, Christian Brauner wrote:
> >> Thanks for the comments and discussion. Sorry, for not going through the
> >> list but I just have a quick question that doesn't deserve to spam the
> >> whole netdev list. :) I have written another set of patches that would
> >> add support for IFLA_IF_NETNSID to RTM_{N,S}ETLINK and friends which I
> >> planned to send out after the pid and fd ones. Would you be interested
> >> in those if I sent them out over the next week?
> > 
> > Yes, please CC me. I will have limited time next week (whole day
> > meetings for the most of the week) but I'll try to review.
> CC me also please.

Of course, will do. I plan on sending out the patches for RTM_SETLINK
and RTM_DELLINK tomorrow after I've done some more testing on it.

Thanks!
Christian

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH net-next 1/1] rtnetlink: request RTM_GETLINK by pid or fd
  2018-01-23 17:08               ` Jiri Benc
@ 2018-01-24 10:53                 ` Nicolas Dichtel
  2018-01-24 11:03                   ` Jiri Benc
  0 siblings, 1 reply; 23+ messages in thread
From: Nicolas Dichtel @ 2018-01-24 10:53 UTC (permalink / raw)
  To: Jiri Benc
  Cc: Christian Brauner, Christian Brauner, davem, dsahern, fw, daniel,
	lucien.xin, mschiffer, jakub.kicinski, vyasevich, netdev,
	linux-kernel, stephen

Le 23/01/2018 à 18:08, Jiri Benc a écrit :
> On Tue, 23 Jan 2018 17:37:11 +0100, Nicolas Dichtel wrote:
>> When a virtual interface moves to another netns, the netlink RTM_DELLINK message
>> contains the attribute IFLA_NEW_NETNSID, which identifies where the interface
>> moves. The nsid may be allocated if needed.
> 
> The problem is that ifindex may change and it's not announced. The only
> way is to track both ifindex and ifname, watch for the ifname to appear
> in the target netns and update the application's view of ifindex.
Yes, you're right.

> 
> It would be much better if the whole (ifindex, netnsid) pair was
> returned. I think it could be added.
Sure. Do you plan to send a patch?

> 
>> I don't know if it's acceptable to also allocate an nsid in case of a physical
>> interface.
> 
> I think we need that.
If you agree, I can send a patch to remove this limitation.


Regards,
Nicolas

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH net-next 1/1] rtnetlink: request RTM_GETLINK by pid or fd
  2018-01-24 10:53                 ` Nicolas Dichtel
@ 2018-01-24 11:03                   ` Jiri Benc
  0 siblings, 0 replies; 23+ messages in thread
From: Jiri Benc @ 2018-01-24 11:03 UTC (permalink / raw)
  To: Nicolas Dichtel
  Cc: Christian Brauner, Christian Brauner, davem, dsahern, fw, daniel,
	lucien.xin, mschiffer, jakub.kicinski, vyasevich, netdev,
	linux-kernel, stephen

On Wed, 24 Jan 2018 11:53:11 +0100, Nicolas Dichtel wrote:
> Le 23/01/2018 à 18:08, Jiri Benc a écrit :
> > It would be much better if the whole (ifindex, netnsid) pair was
> > returned. I think it could be added.  
> Sure. Do you plan to send a patch?

I can do that but it will take a while. I'll be happy to leave that to
someone else but if there's no one, I'll eventually do it.

> > I think we need that.  
> If you agree, I can send a patch to remove this limitation.

That would be great.

Thanks!

 Jiri

^ permalink raw reply	[flat|nested] 23+ messages in thread

* [PATCH net-next 0/3] rtnetlink: enable IFLA_IF_NETNSID for RTM_{DEL,SET}LINK
  2018-01-23 12:22                       ` Jiri Benc
  2018-01-23 16:55                         ` Nicolas Dichtel
@ 2018-01-24 11:32                         ` Christian Brauner
  2018-01-24 11:32                           ` [PATCH net-next 1/3] rtnetlink: enable IFLA_IF_NETNSID in do_setlink() Christian Brauner
  2018-01-24 11:52                         ` [PATCH net-next 2/3] rtnetlink: enable IFLA_IF_NETNSID for RTM_SETLINK Christian Brauner
  2018-01-24 11:53                         ` [PATCH net-next 3/3] rtnetlink: enable IFLA_IF_NETNSID for RTM_DELLINK Christian Brauner
  3 siblings, 1 reply; 23+ messages in thread
From: Christian Brauner @ 2018-01-24 11:32 UTC (permalink / raw)
  To: netdev
  Cc: davem, dsahern, fw, daniel, lucien.xin, mschiffer,
	jakub.kicinski, vyasevich, linux-kernel, stephen, jbenc,
	w.bumiller, nicolas.dichtel, Christian Brauner

Hi,

Based on the previous discussion this enables passing a IFLA_IF_NETNSID
property along with RTM_SETLINK and RTM_DELLINK requests. The patch for
RTM_NEWLINK will be sent out in a separate patch since there are more
corner-cases to think about.

Best,
Christian

Christian Brauner (3):
  rtnetlink: enable IFLA_IF_NETNSID in do_setlink()
  rtnetlink: enable IFLA_IF_NETNSID for RTM_SETLINK
  rtnetlink: enable IFLA_IF_NETNSID for RTM_DELLINK

 net/core/rtnetlink.c | 97 ++++++++++++++++++++++++++++++++++++++++------------
 1 file changed, 76 insertions(+), 21 deletions(-)

-- 
2.14.1

^ permalink raw reply	[flat|nested] 23+ messages in thread

* [PATCH net-next 1/3] rtnetlink: enable IFLA_IF_NETNSID in do_setlink()
  2018-01-24 11:32                         ` [PATCH net-next 0/3] rtnetlink: enable IFLA_IF_NETNSID for RTM_{DEL,SET}LINK Christian Brauner
@ 2018-01-24 11:32                           ` Christian Brauner
  0 siblings, 0 replies; 23+ messages in thread
From: Christian Brauner @ 2018-01-24 11:32 UTC (permalink / raw)
  To: netdev
  Cc: davem, dsahern, fw, daniel, lucien.xin, mschiffer,
	jakub.kicinski, vyasevich, linux-kernel, stephen, jbenc,
	w.bumiller, nicolas.dichtel, Christian Brauner

RTM_{NEW,SET}LINK already allow operations on other network namespaces
by identifying the target network namespace through IFLA_NET_NS_{FD,PID}
properties. This is done by looking for the corresponding properties in
do_setlink(). Extend do_setlink() to also look for the IFLA_IF_NETNSID
property. This introduces no functional changes since all callers of
do_setlink() currently block IFLA_IF_NETNSID by reporting an error before
they reach do_setlink().

This introduces the helpers:

static struct net *rtnl_link_get_net_by_nlattr(struct net *src_net, struct
                                               nlattr *tb[])

static struct net *rtnl_link_get_net_capable(const struct sk_buff *skb,
                                             struct net *src_net,
					     struct nlattr *tb[], int cap)

to simplify permission checks and target network namespace retrieval for
RTM_* requests that already support IFLA_NET_NS_{FD,PID} but get extended
to IFLA_IF_NETNSID. To perserve backwards compatibility the helpers look
for IFLA_NET_NS_{FD,PID} properties first before checking for
IFLA_IF_NETNSID.

Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
---
 net/core/rtnetlink.c | 54 +++++++++++++++++++++++++++++++++++++++++++++-------
 1 file changed, 47 insertions(+), 7 deletions(-)

diff --git a/net/core/rtnetlink.c b/net/core/rtnetlink.c
index 16d644a4f974..54134187485b 100644
--- a/net/core/rtnetlink.c
+++ b/net/core/rtnetlink.c
@@ -1893,6 +1893,49 @@ struct net *rtnl_link_get_net(struct net *src_net, struct nlattr *tb[])
 }
 EXPORT_SYMBOL(rtnl_link_get_net);
 
+/* Figure out which network namespace we are talking about by
+ * examining the link attributes in the following order:
+ *
+ * 1. IFLA_NET_NS_PID
+ * 2. IFLA_NET_NS_FD
+ * 3. IFLA_IF_NETNSID
+ */
+static struct net *rtnl_link_get_net_by_nlattr(struct net *src_net,
+					       struct nlattr *tb[])
+{
+	struct net *net;
+
+	if (tb[IFLA_NET_NS_PID] || tb[IFLA_NET_NS_FD])
+		return rtnl_link_get_net(src_net, tb);
+
+	if (!tb[IFLA_IF_NETNSID])
+		return get_net(src_net);
+
+	net = get_net_ns_by_id(src_net, nla_get_u32(tb[IFLA_IF_NETNSID]));
+	if (!net)
+		return ERR_PTR(-EINVAL);
+
+	return net;
+}
+
+static struct net *rtnl_link_get_net_capable(const struct sk_buff *skb,
+					     struct net *src_net,
+					     struct nlattr *tb[], int cap)
+{
+	struct net *net;
+
+	net = rtnl_link_get_net_by_nlattr(src_net, tb);
+	if (IS_ERR(net))
+		return net;
+
+	if (!netlink_ns_capable(skb, net->user_ns, cap)) {
+		put_net(net);
+		return ERR_PTR(-EPERM);
+	}
+
+	return net;
+}
+
 static int validate_linkmsg(struct net_device *dev, struct nlattr *tb[])
 {
 	if (dev) {
@@ -2155,17 +2198,14 @@ static int do_setlink(const struct sk_buff *skb,
 	const struct net_device_ops *ops = dev->netdev_ops;
 	int err;
 
-	if (tb[IFLA_NET_NS_PID] || tb[IFLA_NET_NS_FD]) {
-		struct net *net = rtnl_link_get_net(dev_net(dev), tb);
+	if (tb[IFLA_NET_NS_PID] || tb[IFLA_NET_NS_FD] || tb[IFLA_IF_NETNSID]) {
+		struct net *net = rtnl_link_get_net_capable(skb, dev_net(dev),
+							    tb, CAP_NET_ADMIN);
 		if (IS_ERR(net)) {
 			err = PTR_ERR(net);
 			goto errout;
 		}
-		if (!netlink_ns_capable(skb, net->user_ns, CAP_NET_ADMIN)) {
-			put_net(net);
-			err = -EPERM;
-			goto errout;
-		}
+
 		err = dev_change_net_namespace(dev, net, ifname);
 		put_net(net);
 		if (err)
-- 
2.14.1

^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH net-next 2/3] rtnetlink: enable IFLA_IF_NETNSID for RTM_SETLINK
  2018-01-23 12:22                       ` Jiri Benc
  2018-01-23 16:55                         ` Nicolas Dichtel
  2018-01-24 11:32                         ` [PATCH net-next 0/3] rtnetlink: enable IFLA_IF_NETNSID for RTM_{DEL,SET}LINK Christian Brauner
@ 2018-01-24 11:52                         ` Christian Brauner
  2018-01-24 11:53                         ` [PATCH net-next 3/3] rtnetlink: enable IFLA_IF_NETNSID for RTM_DELLINK Christian Brauner
  3 siblings, 0 replies; 23+ messages in thread
From: Christian Brauner @ 2018-01-24 11:52 UTC (permalink / raw)
  To: netdev
  Cc: davem, dsahern, fw, daniel, lucien.xin, mschiffer,
	jakub.kicinski, vyasevich, linux-kernel, jbenc, w.bumiller,
	nicolas.dichtel, Christian Brauner

- Backwards Compatibility:
  If userspace wants to determine whether RTM_SETLINK supports the
  IFLA_IF_NETNSID property they should first send an RTM_GETLINK request
  with IFLA_IF_NETNSID on lo. If either EACCESS is returned or the reply
  does not include IFLA_IF_NETNSID userspace should assume that
  IFLA_IF_NETNSID is not supported on this kernel.
  If the reply does contain an IFLA_IF_NETNSID property userspace
  can send an RTM_SETLINK with a IFLA_IF_NETNSID property. If they receive
  EOPNOTSUPP then the kernel does not support the IFLA_IF_NETNSID property
  with RTM_SETLINK. Userpace should then fallback to other means.

  To retain backwards compatibility the kernel will first check whether a
  IFLA_NET_NS_PID or IFLA_NET_NS_FD property has been passed. If either
  one is found it will be used to identify the target network namespace.
  This implies that users who do not care whether their running kernel
  supports IFLA_IF_NETNSID with RTM_SETLINK can pass both
  IFLA_NET_NS_{FD,PID} and IFLA_IF_NETNSID referring to the same network
  namespace.

- Security:
  Callers must have CAP_NET_ADMIN in the owning user namespace of the
  target network namespace.

Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
---
 net/core/rtnetlink.c | 3 ---
 1 file changed, 3 deletions(-)

diff --git a/net/core/rtnetlink.c b/net/core/rtnetlink.c
index 54134187485b..a4d4409685e3 100644
--- a/net/core/rtnetlink.c
+++ b/net/core/rtnetlink.c
@@ -2546,9 +2546,6 @@ static int rtnl_setlink(struct sk_buff *skb, struct nlmsghdr *nlh,
 	if (err < 0)
 		goto errout;
 
-	if (tb[IFLA_IF_NETNSID])
-		return -EOPNOTSUPP;
-
 	if (tb[IFLA_IFNAME])
 		nla_strlcpy(ifname, tb[IFLA_IFNAME], IFNAMSIZ);
 	else
-- 
2.14.1

^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH net-next 3/3] rtnetlink: enable IFLA_IF_NETNSID for RTM_DELLINK
  2018-01-23 12:22                       ` Jiri Benc
                                           ` (2 preceding siblings ...)
  2018-01-24 11:52                         ` [PATCH net-next 2/3] rtnetlink: enable IFLA_IF_NETNSID for RTM_SETLINK Christian Brauner
@ 2018-01-24 11:53                         ` Christian Brauner
  3 siblings, 0 replies; 23+ messages in thread
From: Christian Brauner @ 2018-01-24 11:53 UTC (permalink / raw)
  To: netdev
  Cc: davem, dsahern, fw, daniel, lucien.xin, mschiffer,
	jakub.kicinski, vyasevich, linux-kernel, jbenc, w.bumiller,
	nicolas.dichtel, Christian Brauner

- Backwards Compatibility:
  If userspace wants to determine whether RTM_DELLINK supports the
  IFLA_IF_NETNSID property they should first send an RTM_GETLINK request
  with IFLA_IF_NETNSID on lo. If either EACCESS is returned or the reply
  does not include IFLA_IF_NETNSID userspace should assume that
  IFLA_IF_NETNSID is not supported on this kernel.
  If the reply does contain an IFLA_IF_NETNSID property userspace
  can send an RTM_DELLINK with a IFLA_IF_NETNSID property. If they receive
  EOPNOTSUPP then the kernel does not support the IFLA_IF_NETNSID property
  with RTM_DELLINK. Userpace should then fallback to other means.

- Security:
  Callers must have CAP_NET_ADMIN in the owning user namespace of the
  target network namespace.

Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
---
 net/core/rtnetlink.c | 40 +++++++++++++++++++++++++++++-----------
 1 file changed, 29 insertions(+), 11 deletions(-)

diff --git a/net/core/rtnetlink.c b/net/core/rtnetlink.c
index a4d4409685e3..17f7c3508a0c 100644
--- a/net/core/rtnetlink.c
+++ b/net/core/rtnetlink.c
@@ -2630,36 +2630,54 @@ static int rtnl_dellink(struct sk_buff *skb, struct nlmsghdr *nlh,
 			struct netlink_ext_ack *extack)
 {
 	struct net *net = sock_net(skb->sk);
-	struct net_device *dev;
+	struct net *tgt_net = net;
+	struct net_device *dev = NULL;
 	struct ifinfomsg *ifm;
 	char ifname[IFNAMSIZ];
 	struct nlattr *tb[IFLA_MAX+1];
 	int err;
+	int netnsid = -1;
 
 	err = nlmsg_parse(nlh, sizeof(*ifm), tb, IFLA_MAX, ifla_policy, extack);
 	if (err < 0)
 		return err;
 
-	if (tb[IFLA_IF_NETNSID])
-		return -EOPNOTSUPP;
-
 	if (tb[IFLA_IFNAME])
 		nla_strlcpy(ifname, tb[IFLA_IFNAME], IFNAMSIZ);
 
+	if (tb[IFLA_IF_NETNSID]) {
+		netnsid = nla_get_s32(tb[IFLA_IF_NETNSID]);
+		tgt_net = get_target_net(NETLINK_CB(skb).sk, netnsid);
+		if (IS_ERR(tgt_net))
+			return PTR_ERR(tgt_net);
+	}
+
+	err = -EINVAL;
 	ifm = nlmsg_data(nlh);
 	if (ifm->ifi_index > 0)
-		dev = __dev_get_by_index(net, ifm->ifi_index);
+		dev = __dev_get_by_index(tgt_net, ifm->ifi_index);
 	else if (tb[IFLA_IFNAME])
-		dev = __dev_get_by_name(net, ifname);
+		dev = __dev_get_by_name(tgt_net, ifname);
 	else if (tb[IFLA_GROUP])
-		return rtnl_group_dellink(net, nla_get_u32(tb[IFLA_GROUP]));
+		err = rtnl_group_dellink(tgt_net, nla_get_u32(tb[IFLA_GROUP]));
 	else
-		return -EINVAL;
+		goto out;
 
-	if (!dev)
-		return -ENODEV;
+	if (!dev) {
+		if (tb[IFLA_GROUP])
+			goto out;
 
-	return rtnl_delete_link(dev);
+		err = -ENODEV;
+		goto out;
+	}
+
+	err = rtnl_delete_link(dev);
+
+out:
+	if (netnsid >= 0)
+		put_net(tgt_net);
+
+	return err;
 }
 
 int rtnl_configure_link(struct net_device *dev, const struct ifinfomsg *ifm)
-- 
2.14.1

^ permalink raw reply related	[flat|nested] 23+ messages in thread

end of thread, other threads:[~2018-01-24 11:53 UTC | newest]

Thread overview: 23+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-01-18 20:21 [PATCH net-next 0/1] rtnetlink: request RTM_GETLINK by pid or fd Christian Brauner
2018-01-18 20:21 ` [PATCH net-next 1/1] " Christian Brauner
2018-01-18 20:29   ` Jiri Benc
2018-01-18 20:55     ` Christian Brauner
2018-01-22 21:00       ` Jiri Benc
2018-01-22 21:23         ` Christian Brauner
2018-01-22 22:06           ` Jiri Benc
2018-01-22 22:25             ` Christian Brauner
2018-01-23  9:30               ` Jiri Benc
2018-01-23 10:26                 ` Wolfgang Bumiller
2018-01-23 10:42                   ` Jiri Benc
     [not found]                     ` <20180123114218.vsm5nu2jajrqjvko@gmail.com>
2018-01-23 12:22                       ` Jiri Benc
2018-01-23 16:55                         ` Nicolas Dichtel
2018-01-23 18:05                           ` Christian Brauner
2018-01-24 11:32                         ` [PATCH net-next 0/3] rtnetlink: enable IFLA_IF_NETNSID for RTM_{DEL,SET}LINK Christian Brauner
2018-01-24 11:32                           ` [PATCH net-next 1/3] rtnetlink: enable IFLA_IF_NETNSID in do_setlink() Christian Brauner
2018-01-24 11:52                         ` [PATCH net-next 2/3] rtnetlink: enable IFLA_IF_NETNSID for RTM_SETLINK Christian Brauner
2018-01-24 11:53                         ` [PATCH net-next 3/3] rtnetlink: enable IFLA_IF_NETNSID for RTM_DELLINK Christian Brauner
2018-01-23 16:50                   ` [PATCH net-next 1/1] rtnetlink: request RTM_GETLINK by pid or fd Nicolas Dichtel
2018-01-23 16:37             ` Nicolas Dichtel
2018-01-23 17:08               ` Jiri Benc
2018-01-24 10:53                 ` Nicolas Dichtel
2018-01-24 11:03                   ` Jiri Benc

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).