netdev.vger.kernel.org archive mirror
* [PATCH 0/5] RFC: connector: Add network namespace awareness
@ 2020-07-02  0:26 Matt Bennett
  2020-07-02  0:26 ` [PATCH 1/5] connector: Use task pid helpers Matt Bennett
                   ` (6 more replies)
  0 siblings, 7 replies; 19+ messages in thread
From: Matt Bennett @ 2020-07-02  0:26 UTC (permalink / raw)
  To: netdev; +Cc: zbr, ebiederm, linux-kernel, Matt Bennett

Previously the connector could only be used by processes running in the default network
namespace, which meant that any process relying on the connector could not operate
correctly when run inside a container. This is a draft patch series that attempts to make
the connector usable outside of the default network namespace.

I see this has been discussed previously [1], but I am not sure how my changes relate to
the topics discussed there, or whether my draft changes introduce any unintended side
effects.

Thanks.

[1] https://marc.info/?l=linux-kernel&m=150806196728365&w=2

Matt Bennett (5):
  connector: Use task pid helpers
  connector: Use 'current_user_ns' function
  connector: Ensure callback entry is released
  connector: Prepare for supporting multiple namespaces
  connector: Create connector per namespace

 Documentation/driver-api/connector.rst |   6 +-
 drivers/connector/cn_proc.c            | 110 +++++++-------
 drivers/connector/cn_queue.c           |   9 +-
 drivers/connector/connector.c          | 192 ++++++++++++++++++++-----
 drivers/hv/hv_fcopy.c                  |   1 +
 drivers/hv/hv_utils_transport.c        |   6 +-
 drivers/md/dm-log-userspace-transfer.c |   6 +-
 drivers/video/fbdev/uvesafb.c          |   8 +-
 drivers/w1/w1_netlink.c                |  19 +--
 include/linux/connector.h              |  38 +++--
 include/net/net_namespace.h            |   4 +
 kernel/exit.c                          |   2 +-
 samples/connector/cn_test.c            |   6 +-
 13 files changed, 286 insertions(+), 121 deletions(-)

-- 
2.27.0


^ permalink raw reply	[flat|nested] 19+ messages in thread

* [PATCH 1/5] connector: Use task pid helpers
  2020-07-02  0:26 [PATCH 0/5] RFC: connector: Add network namespace awareness Matt Bennett
@ 2020-07-02  0:26 ` Matt Bennett
  2020-07-02  0:26 ` [PATCH 2/5] connector: Use 'current_user_ns' function Matt Bennett
                   ` (5 subsequent siblings)
  6 siblings, 0 replies; 19+ messages in thread
From: Matt Bennett @ 2020-07-02  0:26 UTC (permalink / raw)
  To: netdev; +Cc: zbr, ebiederm, linux-kernel, Matt Bennett

In preparation for supporting the connector outside of the default
network namespace, switch to the task_pid_vnr()/task_tgid_vnr() helpers
now. As the connector is still only supported in the default namespace,
this change is a no-op.

Signed-off-by: Matt Bennett <matt.bennett@alliedtelesis.co.nz>
---
 drivers/connector/cn_proc.c | 48 ++++++++++++++++++-------------------
 1 file changed, 24 insertions(+), 24 deletions(-)

diff --git a/drivers/connector/cn_proc.c b/drivers/connector/cn_proc.c
index 646ad385e490..36a7823c56ec 100644
--- a/drivers/connector/cn_proc.c
+++ b/drivers/connector/cn_proc.c
@@ -83,11 +83,11 @@ void proc_fork_connector(struct task_struct *task)
 	ev->what = PROC_EVENT_FORK;
 	rcu_read_lock();
 	parent = rcu_dereference(task->real_parent);
-	ev->event_data.fork.parent_pid = parent->pid;
-	ev->event_data.fork.parent_tgid = parent->tgid;
+	ev->event_data.fork.parent_pid = task_pid_vnr(parent);
+	ev->event_data.fork.parent_tgid = task_tgid_vnr(parent);
 	rcu_read_unlock();
-	ev->event_data.fork.child_pid = task->pid;
-	ev->event_data.fork.child_tgid = task->tgid;
+	ev->event_data.fork.child_pid = task_pid_vnr(task);
+	ev->event_data.fork.child_tgid = task_tgid_vnr(task);
 
 	memcpy(&msg->id, &cn_proc_event_id, sizeof(msg->id));
 	msg->ack = 0; /* not used */
@@ -110,8 +110,8 @@ void proc_exec_connector(struct task_struct *task)
 	memset(&ev->event_data, 0, sizeof(ev->event_data));
 	ev->timestamp_ns = ktime_get_ns();
 	ev->what = PROC_EVENT_EXEC;
-	ev->event_data.exec.process_pid = task->pid;
-	ev->event_data.exec.process_tgid = task->tgid;
+	ev->event_data.exec.process_pid = task_pid_vnr(task);
+	ev->event_data.exec.process_tgid = task_tgid_vnr(task);
 
 	memcpy(&msg->id, &cn_proc_event_id, sizeof(msg->id));
 	msg->ack = 0; /* not used */
@@ -134,8 +134,8 @@ void proc_id_connector(struct task_struct *task, int which_id)
 	ev = (struct proc_event *)msg->data;
 	memset(&ev->event_data, 0, sizeof(ev->event_data));
 	ev->what = which_id;
-	ev->event_data.id.process_pid = task->pid;
-	ev->event_data.id.process_tgid = task->tgid;
+	ev->event_data.id.process_pid = task_pid_vnr(task);
+	ev->event_data.id.process_tgid = task_tgid_vnr(task);
 	rcu_read_lock();
 	cred = __task_cred(task);
 	if (which_id == PROC_EVENT_UID) {
@@ -172,8 +172,8 @@ void proc_sid_connector(struct task_struct *task)
 	memset(&ev->event_data, 0, sizeof(ev->event_data));
 	ev->timestamp_ns = ktime_get_ns();
 	ev->what = PROC_EVENT_SID;
-	ev->event_data.sid.process_pid = task->pid;
-	ev->event_data.sid.process_tgid = task->tgid;
+	ev->event_data.sid.process_pid = task_pid_vnr(task);
+	ev->event_data.sid.process_tgid = task_tgid_vnr(task);
 
 	memcpy(&msg->id, &cn_proc_event_id, sizeof(msg->id));
 	msg->ack = 0; /* not used */
@@ -196,11 +196,11 @@ void proc_ptrace_connector(struct task_struct *task, int ptrace_id)
 	memset(&ev->event_data, 0, sizeof(ev->event_data));
 	ev->timestamp_ns = ktime_get_ns();
 	ev->what = PROC_EVENT_PTRACE;
-	ev->event_data.ptrace.process_pid  = task->pid;
-	ev->event_data.ptrace.process_tgid = task->tgid;
+	ev->event_data.ptrace.process_pid  = task_pid_vnr(task);
+	ev->event_data.ptrace.process_tgid = task_tgid_vnr(task);
 	if (ptrace_id == PTRACE_ATTACH) {
-		ev->event_data.ptrace.tracer_pid  = current->pid;
-		ev->event_data.ptrace.tracer_tgid = current->tgid;
+		ev->event_data.ptrace.tracer_pid  = task_pid_vnr(current);
+		ev->event_data.ptrace.tracer_tgid = task_tgid_vnr(current);
 	} else if (ptrace_id == PTRACE_DETACH) {
 		ev->event_data.ptrace.tracer_pid  = 0;
 		ev->event_data.ptrace.tracer_tgid = 0;
@@ -228,8 +228,8 @@ void proc_comm_connector(struct task_struct *task)
 	memset(&ev->event_data, 0, sizeof(ev->event_data));
 	ev->timestamp_ns = ktime_get_ns();
 	ev->what = PROC_EVENT_COMM;
-	ev->event_data.comm.process_pid  = task->pid;
-	ev->event_data.comm.process_tgid = task->tgid;
+	ev->event_data.comm.process_pid  = task_pid_vnr(task);
+	ev->event_data.comm.process_tgid = task_tgid_vnr(task);
 	get_task_comm(ev->event_data.comm.comm, task);
 
 	memcpy(&msg->id, &cn_proc_event_id, sizeof(msg->id));
@@ -254,14 +254,14 @@ void proc_coredump_connector(struct task_struct *task)
 	memset(&ev->event_data, 0, sizeof(ev->event_data));
 	ev->timestamp_ns = ktime_get_ns();
 	ev->what = PROC_EVENT_COREDUMP;
-	ev->event_data.coredump.process_pid = task->pid;
-	ev->event_data.coredump.process_tgid = task->tgid;
+	ev->event_data.coredump.process_pid = task_pid_vnr(task);
+	ev->event_data.coredump.process_tgid = task_tgid_vnr(task);
 
 	rcu_read_lock();
 	if (pid_alive(task)) {
 		parent = rcu_dereference(task->real_parent);
-		ev->event_data.coredump.parent_pid = parent->pid;
-		ev->event_data.coredump.parent_tgid = parent->tgid;
+		ev->event_data.coredump.parent_pid = task_pid_vnr(parent);
+		ev->event_data.coredump.parent_tgid = task_tgid_vnr(parent);
 	}
 	rcu_read_unlock();
 
@@ -287,16 +287,16 @@ void proc_exit_connector(struct task_struct *task)
 	memset(&ev->event_data, 0, sizeof(ev->event_data));
 	ev->timestamp_ns = ktime_get_ns();
 	ev->what = PROC_EVENT_EXIT;
-	ev->event_data.exit.process_pid = task->pid;
-	ev->event_data.exit.process_tgid = task->tgid;
+	ev->event_data.exit.process_pid = task_pid_vnr(task);
+	ev->event_data.exit.process_tgid = task_tgid_vnr(task);
 	ev->event_data.exit.exit_code = task->exit_code;
 	ev->event_data.exit.exit_signal = task->exit_signal;
 
 	rcu_read_lock();
 	if (pid_alive(task)) {
 		parent = rcu_dereference(task->real_parent);
-		ev->event_data.exit.parent_pid = parent->pid;
-		ev->event_data.exit.parent_tgid = parent->tgid;
+		ev->event_data.exit.parent_pid = task_pid_vnr(parent);
+		ev->event_data.exit.parent_tgid = task_tgid_vnr(parent);
 	}
 	rcu_read_unlock();
 
-- 
2.27.0



* [PATCH 2/5] connector: Use 'current_user_ns' function
  2020-07-02  0:26 [PATCH 0/5] RFC: connector: Add network namespace awareness Matt Bennett
  2020-07-02  0:26 ` [PATCH 1/5] connector: Use task pid helpers Matt Bennett
@ 2020-07-02  0:26 ` Matt Bennett
  2020-07-02  0:26 ` [PATCH 3/5] connector: Ensure callback entry is released Matt Bennett
                   ` (4 subsequent siblings)
  6 siblings, 0 replies; 19+ messages in thread
From: Matt Bennett @ 2020-07-02  0:26 UTC (permalink / raw)
  To: netdev; +Cc: zbr, ebiederm, linux-kernel, Matt Bennett

In preparation for supporting the connector outside of the default
network namespace, switch to the current_user_ns() function now. As the
connector is still only supported in the default namespace, this change
is a no-op.

Signed-off-by: Matt Bennett <matt.bennett@alliedtelesis.co.nz>
---
 drivers/connector/cn_proc.c | 10 +++++-----
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/drivers/connector/cn_proc.c b/drivers/connector/cn_proc.c
index 36a7823c56ec..d90aea555a21 100644
--- a/drivers/connector/cn_proc.c
+++ b/drivers/connector/cn_proc.c
@@ -139,11 +139,11 @@ void proc_id_connector(struct task_struct *task, int which_id)
 	rcu_read_lock();
 	cred = __task_cred(task);
 	if (which_id == PROC_EVENT_UID) {
-		ev->event_data.id.r.ruid = from_kuid_munged(&init_user_ns, cred->uid);
-		ev->event_data.id.e.euid = from_kuid_munged(&init_user_ns, cred->euid);
+		ev->event_data.id.r.ruid = from_kuid_munged(current_user_ns(), cred->uid);
+		ev->event_data.id.e.euid = from_kuid_munged(current_user_ns(), cred->euid);
 	} else if (which_id == PROC_EVENT_GID) {
-		ev->event_data.id.r.rgid = from_kgid_munged(&init_user_ns, cred->gid);
-		ev->event_data.id.e.egid = from_kgid_munged(&init_user_ns, cred->egid);
+		ev->event_data.id.r.rgid = from_kgid_munged(current_user_ns(), cred->gid);
+		ev->event_data.id.e.egid = from_kgid_munged(current_user_ns(), cred->egid);
 	} else {
 		rcu_read_unlock();
 		return;
@@ -362,7 +362,7 @@ static void cn_proc_mcast_ctl(struct cn_msg *msg,
 		return;
 
 	/* Can only change if privileged. */
-	if (!__netlink_ns_capable(nsp, &init_user_ns, CAP_NET_ADMIN)) {
+	if (!__netlink_ns_capable(nsp, current_user_ns(), CAP_NET_ADMIN)) {
 		err = EPERM;
 		goto out;
 	}
-- 
2.27.0



* [PATCH 3/5] connector: Ensure callback entry is released
  2020-07-02  0:26 [PATCH 0/5] RFC: connector: Add network namespace awareness Matt Bennett
  2020-07-02  0:26 ` [PATCH 1/5] connector: Use task pid helpers Matt Bennett
  2020-07-02  0:26 ` [PATCH 2/5] connector: Use 'current_user_ns' function Matt Bennett
@ 2020-07-02  0:26 ` Matt Bennett
  2020-07-02  0:26 ` [PATCH 4/5] connector: Prepare for supporting multiple namespaces Matt Bennett
                   ` (3 subsequent siblings)
  6 siblings, 0 replies; 19+ messages in thread
From: Matt Bennett @ 2020-07-02  0:26 UTC (permalink / raw)
  To: netdev; +Cc: zbr, ebiederm, linux-kernel, Matt Bennett

Currently the callback entry itself is leaked: it is removed from the
queue list but never released.

Signed-off-by: Matt Bennett <matt.bennett@alliedtelesis.co.nz>
---
 drivers/connector/cn_queue.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/drivers/connector/cn_queue.c b/drivers/connector/cn_queue.c
index 49295052ba8b..a82ceeb37f26 100644
--- a/drivers/connector/cn_queue.c
+++ b/drivers/connector/cn_queue.c
@@ -132,8 +132,10 @@ void cn_queue_free_dev(struct cn_queue_dev *dev)
 	struct cn_callback_entry *cbq, *n;
 
 	spin_lock_bh(&dev->queue_lock);
-	list_for_each_entry_safe(cbq, n, &dev->queue_list, callback_entry)
+	list_for_each_entry_safe(cbq, n, &dev->queue_list, callback_entry) {
 		list_del(&cbq->callback_entry);
+		cn_queue_release_callback(cbq);
+	}
 	spin_unlock_bh(&dev->queue_lock);
 
 	while (atomic_read(&dev->refcnt)) {
-- 
2.27.0



* [PATCH 4/5] connector: Prepare for supporting multiple namespaces
  2020-07-02  0:26 [PATCH 0/5] RFC: connector: Add network namespace awareness Matt Bennett
                   ` (2 preceding siblings ...)
  2020-07-02  0:26 ` [PATCH 3/5] connector: Ensure callback entry is released Matt Bennett
@ 2020-07-02  0:26 ` Matt Bennett
  2020-07-02  0:26 ` [PATCH 5/5] connector: Create connector per namespace Matt Bennett
                   ` (2 subsequent siblings)
  6 siblings, 0 replies; 19+ messages in thread
From: Matt Bennett @ 2020-07-02  0:26 UTC (permalink / raw)
  To: netdev; +Cc: zbr, ebiederm, linux-kernel, Matt Bennett

Extend the existing function definitions and call sites to pass the
network namespace through. For now only the default namespace
(&init_net) is passed, so behaviour is unchanged.

Signed-off-by: Matt Bennett <matt.bennett@alliedtelesis.co.nz>
---
 Documentation/driver-api/connector.rst |  6 +++---
 drivers/connector/cn_proc.c            |  5 +++--
 drivers/connector/cn_queue.c           |  5 +++--
 drivers/connector/connector.c          | 21 ++++++++++++---------
 drivers/hv/hv_utils_transport.c        |  6 ++++--
 drivers/md/dm-log-userspace-transfer.c |  6 ++++--
 drivers/video/fbdev/uvesafb.c          |  8 +++++---
 drivers/w1/w1_netlink.c                | 19 +++++++++++--------
 include/linux/connector.h              | 24 ++++++++++++++++--------
 samples/connector/cn_test.c            |  6 ++++--
 10 files changed, 65 insertions(+), 41 deletions(-)

diff --git a/Documentation/driver-api/connector.rst b/Documentation/driver-api/connector.rst
index c100c7482289..4fb1f73d76ad 100644
--- a/Documentation/driver-api/connector.rst
+++ b/Documentation/driver-api/connector.rst
@@ -25,9 +25,9 @@ handling, etc...  The Connector driver allows any kernelspace agents to use
 netlink based networking for inter-process communication in a significantly
 easier way::
 
-  int cn_add_callback(struct cb_id *id, char *name, void (*callback) (struct cn_msg *, struct netlink_skb_parms *));
-  void cn_netlink_send_multi(struct cn_msg *msg, u16 len, u32 portid, u32 __group, int gfp_mask);
-  void cn_netlink_send(struct cn_msg *msg, u32 portid, u32 __group, int gfp_mask);
+  int cn_add_callback(struct cb_id *id, char *name, void (*callback) (struct net *, struct cn_msg *, struct netlink_skb_parms *));
+  void cn_netlink_send_multi(struct net *net, struct cn_msg *msg, u16 len, u32 portid, u32 __group, int gfp_mask);
+  void cn_netlink_send(struct net *net, struct cn_msg *msg, u32 portid, u32 __group, int gfp_mask);
 
   struct cb_id
   {
diff --git a/drivers/connector/cn_proc.c b/drivers/connector/cn_proc.c
index d90aea555a21..9202be177a30 100644
--- a/drivers/connector/cn_proc.c
+++ b/drivers/connector/cn_proc.c
@@ -16,6 +16,7 @@
 #include <linux/ptrace.h>
 #include <linux/atomic.h>
 #include <linux/pid_namespace.h>
+#include <net/net_namespace.h>
 
 #include <linux/cn_proc.h>
 #include <linux/local_lock.h>
@@ -61,7 +62,7 @@ static inline void send_msg(struct cn_msg *msg)
 	 *
 	 * If cn_netlink_send() fails, the data is not sent.
 	 */
-	cn_netlink_send(msg, 0, CN_IDX_PROC, GFP_NOWAIT);
+	cn_netlink_send(&init_net, msg, 0, CN_IDX_PROC, GFP_NOWAIT);
 
 	local_unlock(&local_event.lock);
 }
@@ -343,7 +344,7 @@ static void cn_proc_ack(int err, int rcvd_seq, int rcvd_ack)
  * cn_proc_mcast_ctl
  * @data: message sent from userspace via the connector
  */
-static void cn_proc_mcast_ctl(struct cn_msg *msg,
+static void cn_proc_mcast_ctl(struct net *net, struct cn_msg *msg,
 			      struct netlink_skb_parms *nsp)
 {
 	enum proc_cn_mcast_op *mc_op = NULL;
diff --git a/drivers/connector/cn_queue.c b/drivers/connector/cn_queue.c
index a82ceeb37f26..22fdd2b149af 100644
--- a/drivers/connector/cn_queue.c
+++ b/drivers/connector/cn_queue.c
@@ -16,11 +16,12 @@
 #include <linux/suspend.h>
 #include <linux/connector.h>
 #include <linux/delay.h>
+#include <net/net_namespace.h>
 
 static struct cn_callback_entry *
 cn_queue_alloc_callback_entry(struct cn_queue_dev *dev, const char *name,
 			      struct cb_id *id,
-			      void (*callback)(struct cn_msg *,
+			      void (*callback)(struct net *, struct cn_msg *,
 					       struct netlink_skb_parms *))
 {
 	struct cn_callback_entry *cbq;
@@ -58,7 +59,7 @@ int cn_cb_equal(struct cb_id *i1, struct cb_id *i2)
 
 int cn_queue_add_callback(struct cn_queue_dev *dev, const char *name,
 			  struct cb_id *id,
-			  void (*callback)(struct cn_msg *,
+			  void (*callback)(struct net *, struct cn_msg *,
 					   struct netlink_skb_parms *))
 {
 	struct cn_callback_entry *cbq, *__cbq;
diff --git a/drivers/connector/connector.c b/drivers/connector/connector.c
index 2d22d6bf52f2..82fcaa4d8be3 100644
--- a/drivers/connector/connector.c
+++ b/drivers/connector/connector.c
@@ -58,8 +58,8 @@ static int cn_already_initialized;
  * The message is sent to, the portid if given, the group if given, both if
  * both, or if both are zero then the group is looked up and sent there.
  */
-int cn_netlink_send_mult(struct cn_msg *msg, u16 len, u32 portid, u32 __group,
-	gfp_t gfp_mask)
+int cn_netlink_send_mult(struct net *net, struct cn_msg *msg, u16 len,
+			 u32 portid, u32 __group, gfp_t gfp_mask)
 {
 	struct cn_callback_entry *__cbq;
 	unsigned int size;
@@ -118,17 +118,18 @@ int cn_netlink_send_mult(struct cn_msg *msg, u16 len, u32 portid, u32 __group,
 EXPORT_SYMBOL_GPL(cn_netlink_send_mult);
 
 /* same as cn_netlink_send_mult except msg->len is used for len */
-int cn_netlink_send(struct cn_msg *msg, u32 portid, u32 __group,
-	gfp_t gfp_mask)
+int cn_netlink_send(struct net *net, struct cn_msg *msg, u32 portid,
+		    u32 __group, gfp_t gfp_mask)
 {
-	return cn_netlink_send_mult(msg, msg->len, portid, __group, gfp_mask);
+	return cn_netlink_send_mult(net, msg, msg->len, portid, __group,
+				    gfp_mask);
 }
 EXPORT_SYMBOL_GPL(cn_netlink_send);
 
 /*
  * Callback helper - queues work and setup destructor for given data.
  */
-static int cn_call_callback(struct sk_buff *skb)
+static int cn_call_callback(struct net *net, struct sk_buff *skb)
 {
 	struct nlmsghdr *nlh;
 	struct cn_callback_entry *i, *cbq = NULL;
@@ -153,7 +154,7 @@ static int cn_call_callback(struct sk_buff *skb)
 	spin_unlock_bh(&dev->cbdev->queue_lock);
 
 	if (cbq != NULL) {
-		cbq->callback(msg, nsp);
+		cbq->callback(net, msg, nsp);
 		kfree_skb(skb);
 		cn_queue_release_callback(cbq);
 		err = 0;
@@ -172,6 +173,8 @@ static void cn_rx_skb(struct sk_buff *skb)
 	struct nlmsghdr *nlh;
 	int len, err;
 
+	struct net *net = sock_net(skb->sk);
+
 	if (skb->len >= NLMSG_HDRLEN) {
 		nlh = nlmsg_hdr(skb);
 		len = nlmsg_len(nlh);
@@ -181,7 +184,7 @@ static void cn_rx_skb(struct sk_buff *skb)
 		    len > CONNECTOR_MAX_MSG_SIZE)
 			return;
 
-		err = cn_call_callback(skb_get(skb));
+		err = cn_call_callback(net, skb_get(skb));
 		if (err < 0)
 			kfree_skb(skb);
 	}
@@ -194,7 +197,7 @@ static void cn_rx_skb(struct sk_buff *skb)
  * May sleep.
  */
 int cn_add_callback(struct cb_id *id, const char *name,
-		    void (*callback)(struct cn_msg *,
+		    void (*callback)(struct net *, struct cn_msg *,
 				     struct netlink_skb_parms *))
 {
 	int err;
diff --git a/drivers/hv/hv_utils_transport.c b/drivers/hv/hv_utils_transport.c
index eb2833d2b5d0..1a67efe59e91 100644
--- a/drivers/hv/hv_utils_transport.c
+++ b/drivers/hv/hv_utils_transport.c
@@ -8,6 +8,7 @@
 #include <linux/slab.h>
 #include <linux/fs.h>
 #include <linux/poll.h>
+#include <net/net_namespace.h>
 
 #include "hyperv_vmbus.h"
 #include "hv_utils_transport.h"
@@ -181,7 +182,8 @@ static int hvt_op_release(struct inode *inode, struct file *file)
 	return 0;
 }
 
-static void hvt_cn_callback(struct cn_msg *msg, struct netlink_skb_parms *nsp)
+static void hvt_cn_callback(struct net *net, struct cn_msg *msg,
+			    struct netlink_skb_parms *nsp)
 {
 	struct hvutil_transport *hvt, *hvt_found = NULL;
 
@@ -231,7 +233,7 @@ int hvutil_transport_send(struct hvutil_transport *hvt, void *msg, int len,
 		cn_msg->id.val = hvt->cn_id.val;
 		cn_msg->len = len;
 		memcpy(cn_msg->data, msg, len);
-		ret = cn_netlink_send(cn_msg, 0, 0, GFP_ATOMIC);
+		ret = cn_netlink_send(&init_net, cn_msg, 0, 0, GFP_ATOMIC);
 		kfree(cn_msg);
 		/*
 		 * We don't know when netlink messages are delivered but unlike
diff --git a/drivers/md/dm-log-userspace-transfer.c b/drivers/md/dm-log-userspace-transfer.c
index fdf8ec304f8d..0e835acf14da 100644
--- a/drivers/md/dm-log-userspace-transfer.c
+++ b/drivers/md/dm-log-userspace-transfer.c
@@ -12,6 +12,7 @@
 #include <linux/connector.h>
 #include <linux/device-mapper.h>
 #include <linux/dm-log-userspace.h>
+#include <net/net_namespace.h>
 
 #include "dm-log-userspace-transfer.h"
 
@@ -66,7 +67,7 @@ static int dm_ulog_sendto_server(struct dm_ulog_request *tfr)
 	msg->seq = tfr->seq;
 	msg->len = sizeof(struct dm_ulog_request) + tfr->data_size;
 
-	r = cn_netlink_send(msg, 0, 0, gfp_any());
+	r = cn_netlink_send(&init_net, msg, 0, 0, gfp_any());
 
 	return r;
 }
@@ -130,7 +131,8 @@ static int fill_pkg(struct cn_msg *msg, struct dm_ulog_request *tfr)
  * This is the connector callback that delivers data
  * that was sent from userspace.
  */
-static void cn_ulog_callback(struct cn_msg *msg, struct netlink_skb_parms *nsp)
+static void cn_ulog_callback(struct net *net, struct cn_msg *msg,
+			     struct netlink_skb_parms *nsp)
 {
 	struct dm_ulog_request *tfr = (struct dm_ulog_request *)(msg + 1);
 
diff --git a/drivers/video/fbdev/uvesafb.c b/drivers/video/fbdev/uvesafb.c
index def14ac0ebe1..f9b6ed7b97f2 100644
--- a/drivers/video/fbdev/uvesafb.c
+++ b/drivers/video/fbdev/uvesafb.c
@@ -25,6 +25,7 @@
 #include <linux/slab.h>
 #include <video/edid.h>
 #include <video/uvesafb.h>
+#include <net/net_namespace.h>
 #ifdef CONFIG_X86
 #include <video/vga.h>
 #endif
@@ -69,7 +70,8 @@ static DEFINE_MUTEX(uvfb_lock);
  * find the kernel part of the task struct, copy the registers and
  * the buffer contents and then complete the task.
  */
-static void uvesafb_cn_callback(struct cn_msg *msg, struct netlink_skb_parms *nsp)
+static void uvesafb_cn_callback(struct net *net, struct cn_msg *msg,
+				struct netlink_skb_parms *nsp)
 {
 	struct uvesafb_task *utask;
 	struct uvesafb_ktask *task;
@@ -194,7 +196,7 @@ static int uvesafb_exec(struct uvesafb_ktask *task)
 	uvfb_tasks[seq] = task;
 	mutex_unlock(&uvfb_lock);
 
-	err = cn_netlink_send(m, 0, 0, GFP_KERNEL);
+	err = cn_netlink_send(&init_net, m, 0, 0, GFP_KERNEL);
 	if (err == -ESRCH) {
 		/*
 		 * Try to start the userspace helper if sending
@@ -206,7 +208,7 @@ static int uvesafb_exec(struct uvesafb_ktask *task)
 			pr_err("make sure that the v86d helper is installed and executable\n");
 		} else {
 			v86d_started = 1;
-			err = cn_netlink_send(m, 0, 0, gfp_any());
+			err = cn_netlink_send(&init_net, m, 0, 0, gfp_any());
 			if (err == -ENOBUFS)
 				err = 0;
 		}
diff --git a/drivers/w1/w1_netlink.c b/drivers/w1/w1_netlink.c
index fa490aa4407c..246844b61613 100644
--- a/drivers/w1/w1_netlink.c
+++ b/drivers/w1/w1_netlink.c
@@ -7,6 +7,7 @@
 #include <linux/skbuff.h>
 #include <linux/netlink.h>
 #include <linux/connector.h>
+#include <net/net_namespace.h>
 
 #include "w1_internal.h"
 #include "w1_netlink.h"
@@ -64,8 +65,8 @@ static void w1_unref_block(struct w1_cb_block *block)
 	if (atomic_sub_return(1, &block->refcnt) == 0) {
 		u16 len = w1_reply_len(block);
 		if (len) {
-			cn_netlink_send_mult(block->first_cn, len,
-				block->portid, 0, GFP_KERNEL);
+			cn_netlink_send_mult(&init_net, block->first_cn, len,
+					     block->portid, 0, GFP_KERNEL);
 		}
 		kfree(block);
 	}
@@ -83,7 +84,8 @@ static void w1_reply_make_space(struct w1_cb_block *block, u16 space)
 {
 	u16 len = w1_reply_len(block);
 	if (len + space >= block->maxlen) {
-		cn_netlink_send_mult(block->first_cn, len, block->portid, 0, GFP_KERNEL);
+		cn_netlink_send_mult(&init_net, block->first_cn, len,
+				     block->portid, 0, GFP_KERNEL);
 		block->first_cn->len = 0;
 		block->cn = NULL;
 		block->msg = NULL;
@@ -201,7 +203,7 @@ static void w1_netlink_send_error(struct cn_msg *cn, struct w1_netlink_msg *msg,
 	packet.cn.len = sizeof(packet.msg);
 	packet.msg.len = 0;
 	packet.msg.status = (u8)-error;
-	cn_netlink_send(&packet.cn, portid, 0, GFP_KERNEL);
+	cn_netlink_send(&init_net, &packet.cn, portid, 0, GFP_KERNEL);
 }
 
 /**
@@ -228,7 +230,7 @@ void w1_netlink_send(struct w1_master *dev, struct w1_netlink_msg *msg)
 	memcpy(&packet.msg, msg, sizeof(*msg));
 	packet.msg.len = 0;
 
-	cn_netlink_send(&packet.cn, 0, 0, GFP_KERNEL);
+	cn_netlink_send(&init_net, &packet.cn, 0, 0, GFP_KERNEL);
 }
 
 static void w1_send_slave(struct w1_master *dev, u64 rn)
@@ -421,7 +423,7 @@ static int w1_process_command_root(struct cn_msg *req_cn, u32 portid)
 	mutex_lock(&w1_mlock);
 	list_for_each_entry(dev, &w1_masters, w1_master_entry) {
 		if (cn->len + sizeof(*id) > PAGE_SIZE - sizeof(struct cn_msg)) {
-			cn_netlink_send(cn, portid, 0, GFP_KERNEL);
+			cn_netlink_send(&init_net, cn, portid, 0, GFP_KERNEL);
 			cn->len = sizeof(struct w1_netlink_msg);
 			msg->len = 0;
 			id = (u32 *)msg->data;
@@ -432,7 +434,7 @@ static int w1_process_command_root(struct cn_msg *req_cn, u32 portid)
 		cn->len += sizeof(*id);
 		id++;
 	}
-	cn_netlink_send(cn, portid, 0, GFP_KERNEL);
+	cn_netlink_send(&init_net, cn, portid, 0, GFP_KERNEL);
 	mutex_unlock(&w1_mlock);
 
 	kfree(cn);
@@ -532,7 +534,8 @@ static void w1_list_count_cmds(struct w1_netlink_msg *msg, int *cmd_count,
 	}
 }
 
-static void w1_cn_callback(struct cn_msg *cn, struct netlink_skb_parms *nsp)
+static void w1_cn_callback(struct net *net, struct cn_msg *cn,
+			   struct netlink_skb_parms *nsp)
 {
 	struct w1_netlink_msg *msg = (struct w1_netlink_msg *)(cn + 1);
 	struct w1_slave *sl;
diff --git a/include/linux/connector.h b/include/linux/connector.h
index cb732643471b..8e9385eb18f8 100644
--- a/include/linux/connector.h
+++ b/include/linux/connector.h
@@ -40,7 +40,8 @@ struct cn_callback_entry {
 	struct cn_queue_dev *pdev;
 
 	struct cn_callback_id id;
-	void (*callback) (struct cn_msg *, struct netlink_skb_parms *);
+	void (*callback)(struct net *, struct cn_msg *,
+			 struct netlink_skb_parms *);
 
 	u32 seq, group;
 };
@@ -62,10 +63,12 @@ struct cn_dev {
  *		in-kernel users.
  * @name:	connector's callback symbolic name.
  * @callback:	connector's callback.
- * 		parameters are %cn_msg and the sender's credentials
+ *		parameters are network namespace, %cn_msg and
+ *		the sender's credentials
  */
 int cn_add_callback(struct cb_id *id, const char *name,
-		    void (*callback)(struct cn_msg *, struct netlink_skb_parms *));
+		    void (*callback)(struct net *, struct cn_msg *,
+				     struct netlink_skb_parms *));
 /**
  * cn_del_callback() - Unregisters new callback with connector core.
  *
@@ -75,8 +78,9 @@ void cn_del_callback(struct cb_id *id);
 
 
 /**
- * cn_netlink_send_mult - Sends message to the specified groups.
+ * cn_netlink_send_mult - Sends messages to the specified groups.
  *
+ * @net:	network namespace
  * @msg: 	message header(with attached data).
  * @len:	Number of @msg to be sent.
  * @portid:	destination port.
@@ -96,11 +100,13 @@ void cn_del_callback(struct cb_id *id);
  *
  * If there are no listeners for given group %-ESRCH can be returned.
  */
-int cn_netlink_send_mult(struct cn_msg *msg, u16 len, u32 portid, u32 group, gfp_t gfp_mask);
+int cn_netlink_send_mult(struct net *net, struct cn_msg *msg, u16 len,
+			 u32 portid, u32 group, gfp_t gfp_mask);
 
 /**
- * cn_netlink_send_mult - Sends message to the specified groups.
+ * cn_netlink_send - Sends message to the specified groups.
  *
+ * @net:	network namespace
  * @msg:	message header(with attached data).
  * @portid:	destination port.
  *		If non-zero the message will be sent to the given port,
@@ -119,11 +125,13 @@ int cn_netlink_send_mult(struct cn_msg *msg, u16 len, u32 portid, u32 group, gfp
  *
  * If there are no listeners for given group %-ESRCH can be returned.
  */
-int cn_netlink_send(struct cn_msg *msg, u32 portid, u32 group, gfp_t gfp_mask);
+int cn_netlink_send(struct net *net, struct cn_msg *msg, u32 portid, u32 group,
+		    gfp_t gfp_mask);
 
 int cn_queue_add_callback(struct cn_queue_dev *dev, const char *name,
 			  struct cb_id *id,
-			  void (*callback)(struct cn_msg *, struct netlink_skb_parms *));
+			  void (*callback)(struct net *, struct cn_msg *,
+					   struct netlink_skb_parms *));
 void cn_queue_del_callback(struct cn_queue_dev *dev, struct cb_id *id);
 void cn_queue_release_callback(struct cn_callback_entry *);
 
diff --git a/samples/connector/cn_test.c b/samples/connector/cn_test.c
index 0958a171d048..9eaf40bbd714 100644
--- a/samples/connector/cn_test.c
+++ b/samples/connector/cn_test.c
@@ -16,13 +16,15 @@
 #include <linux/timer.h>
 
 #include <linux/connector.h>
+#include <net/net_namespace.h>
 
 static struct cb_id cn_test_id = { CN_NETLINK_USERS + 3, 0x456 };
 static char cn_test_name[] = "cn_test";
 static struct sock *nls;
 static struct timer_list cn_test_timer;
 
-static void cn_test_callback(struct cn_msg *msg, struct netlink_skb_parms *nsp)
+static void cn_test_callback(struct net *net, struct cn_msg *msg,
+			     struct netlink_skb_parms *nsp)
 {
 	pr_info("%s: %lu: idx=%x, val=%x, seq=%u, ack=%u, len=%d: %s.\n",
 	        __func__, jiffies, msg->id.idx, msg->id.val,
@@ -132,7 +134,7 @@ static void cn_test_timer_func(struct timer_list *unused)
 
 		memcpy(m + 1, data, m->len);
 
-		cn_netlink_send(m, 0, 0, GFP_ATOMIC);
+		cn_netlink_send(&init_net, m, 0, 0, GFP_ATOMIC);
 		kfree(m);
 	}
 
-- 
2.27.0



* [PATCH 5/5] connector: Create connector per namespace
  2020-07-02  0:26 [PATCH 0/5] RFC: connector: Add network namespace awareness Matt Bennett
                   ` (3 preceding siblings ...)
  2020-07-02  0:26 ` [PATCH 4/5] connector: Prepare for supporting multiple namespaces Matt Bennett
@ 2020-07-02  0:26 ` Matt Bennett
  2020-07-02  5:52   ` kernel test robot
                     ` (2 more replies)
  2020-07-02 13:17 ` [PATCH 0/5] RFC: connector: Add network namespace awareness Eric W. Biederman
  2020-07-02 18:59 ` Eric W. Biederman
  6 siblings, 3 replies; 19+ messages in thread
From: Matt Bennett @ 2020-07-02  0:26 UTC (permalink / raw)
  To: netdev; +Cc: zbr, ebiederm, linux-kernel, Matt Bennett

Store a connector instance per network namespace. This makes the
connector functionality available outside the default namespace.

Signed-off-by: Matt Bennett <matt.bennett@alliedtelesis.co.nz>
---
 drivers/connector/cn_proc.c   |  49 ++++++----
 drivers/connector/connector.c | 171 ++++++++++++++++++++++++++++------
 drivers/hv/hv_fcopy.c         |   1 +
 include/linux/connector.h     |  14 ++-
 include/net/net_namespace.h   |   4 +
 kernel/exit.c                 |   2 +-
 6 files changed, 190 insertions(+), 51 deletions(-)

diff --git a/drivers/connector/cn_proc.c b/drivers/connector/cn_proc.c
index 9202be177a30..661d921fd146 100644
--- a/drivers/connector/cn_proc.c
+++ b/drivers/connector/cn_proc.c
@@ -17,6 +17,7 @@
 #include <linux/atomic.h>
 #include <linux/pid_namespace.h>
 #include <net/net_namespace.h>
+#include <linux/netlink.h>
 
 #include <linux/cn_proc.h>
 #include <linux/local_lock.h>
@@ -37,7 +38,6 @@ static inline struct cn_msg *buffer_to_cn_msg(__u8 *buffer)
 	return (struct cn_msg *)(buffer + 4);
 }
 
-static atomic_t proc_event_num_listeners = ATOMIC_INIT(0);
 static struct cb_id cn_proc_event_id = { CN_IDX_PROC, CN_VAL_PROC };
 
 /* local_event.count is used as the sequence number of the netlink message */
@@ -51,6 +51,9 @@ static DEFINE_PER_CPU(struct local_event, local_event) = {
 
 static inline void send_msg(struct cn_msg *msg)
 {
+	int ret = 0;
+	struct net *net = current->nsproxy->net_ns;
+
 	local_lock(&local_event.lock);
 
 	msg->seq = __this_cpu_inc_return(local_event.count) - 1;
@@ -62,7 +65,9 @@ static inline void send_msg(struct cn_msg *msg)
 	 *
 	 * If cn_netlink_send() fails, the data is not sent.
 	 */
-	cn_netlink_send(&init_net, msg, 0, CN_IDX_PROC, GFP_NOWAIT);
+	ret = cn_netlink_send(net, msg, 0, CN_IDX_PROC, GFP_NOWAIT);
+	if (ret == -ESRCH && netlink_has_listeners(net->cdev.nls, CN_IDX_PROC) == 0)
+		atomic_set(&(net->cdev.proc_event_num_listeners), 0);
 
 	local_unlock(&local_event.lock);
 }
@@ -73,8 +78,9 @@ void proc_fork_connector(struct task_struct *task)
 	struct proc_event *ev;
 	__u8 buffer[CN_PROC_MSG_SIZE] __aligned(8);
 	struct task_struct *parent;
+	struct net *net = current->nsproxy->net_ns;
 
-	if (atomic_read(&proc_event_num_listeners) < 1)
+	if (atomic_read(&(net->cdev.proc_event_num_listeners)) < 1)
 		return;
 
 	msg = buffer_to_cn_msg(buffer);
@@ -102,8 +108,9 @@ void proc_exec_connector(struct task_struct *task)
 	struct cn_msg *msg;
 	struct proc_event *ev;
 	__u8 buffer[CN_PROC_MSG_SIZE] __aligned(8);
+	struct net *net = current->nsproxy->net_ns;
 
-	if (atomic_read(&proc_event_num_listeners) < 1)
+	if (atomic_read(&(net->cdev.proc_event_num_listeners)) < 1)
 		return;
 
 	msg = buffer_to_cn_msg(buffer);
@@ -127,8 +134,9 @@ void proc_id_connector(struct task_struct *task, int which_id)
 	struct proc_event *ev;
 	__u8 buffer[CN_PROC_MSG_SIZE] __aligned(8);
 	const struct cred *cred;
+	struct net *net = current->nsproxy->net_ns;
 
-	if (atomic_read(&proc_event_num_listeners) < 1)
+	if (atomic_read(&(net->cdev.proc_event_num_listeners)) < 1)
 		return;
 
 	msg = buffer_to_cn_msg(buffer);
@@ -164,8 +172,9 @@ void proc_sid_connector(struct task_struct *task)
 	struct cn_msg *msg;
 	struct proc_event *ev;
 	__u8 buffer[CN_PROC_MSG_SIZE] __aligned(8);
+	struct net *net = current->nsproxy->net_ns;
 
-	if (atomic_read(&proc_event_num_listeners) < 1)
+	if (atomic_read(&(net->cdev.proc_event_num_listeners)) < 1)
 		return;
 
 	msg = buffer_to_cn_msg(buffer);
@@ -188,8 +197,9 @@ void proc_ptrace_connector(struct task_struct *task, int ptrace_id)
 	struct cn_msg *msg;
 	struct proc_event *ev;
 	__u8 buffer[CN_PROC_MSG_SIZE] __aligned(8);
+	struct net *net = current->nsproxy->net_ns;
 
-	if (atomic_read(&proc_event_num_listeners) < 1)
+	if (atomic_read(&(net->cdev.proc_event_num_listeners)) < 1)
 		return;
 
 	msg = buffer_to_cn_msg(buffer);
@@ -220,8 +230,9 @@ void proc_comm_connector(struct task_struct *task)
 	struct cn_msg *msg;
 	struct proc_event *ev;
 	__u8 buffer[CN_PROC_MSG_SIZE] __aligned(8);
+	struct net *net = current->nsproxy->net_ns;
 
-	if (atomic_read(&proc_event_num_listeners) < 1)
+	if (atomic_read(&(net->cdev.proc_event_num_listeners)) < 1)
 		return;
 
 	msg = buffer_to_cn_msg(buffer);
@@ -246,8 +257,9 @@ void proc_coredump_connector(struct task_struct *task)
 	struct proc_event *ev;
 	struct task_struct *parent;
 	__u8 buffer[CN_PROC_MSG_SIZE] __aligned(8);
+	struct net *net = current->nsproxy->net_ns;
 
-	if (atomic_read(&proc_event_num_listeners) < 1)
+	if (atomic_read(&(net->cdev.proc_event_num_listeners)) < 1)
 		return;
 
 	msg = buffer_to_cn_msg(buffer);
@@ -279,8 +291,9 @@ void proc_exit_connector(struct task_struct *task)
 	struct proc_event *ev;
 	struct task_struct *parent;
 	__u8 buffer[CN_PROC_MSG_SIZE] __aligned(8);
+	struct net *net = current->nsproxy->net_ns;
 
-	if (atomic_read(&proc_event_num_listeners) < 1)
+	if (atomic_read(&(net->cdev.proc_event_num_listeners)) < 1)
 		return;
 
 	msg = buffer_to_cn_msg(buffer);
@@ -321,8 +334,9 @@ static void cn_proc_ack(int err, int rcvd_seq, int rcvd_ack)
 	struct cn_msg *msg;
 	struct proc_event *ev;
 	__u8 buffer[CN_PROC_MSG_SIZE] __aligned(8);
+	struct net *net = current->nsproxy->net_ns;
 
-	if (atomic_read(&proc_event_num_listeners) < 1)
+	if (atomic_read(&(net->cdev.proc_event_num_listeners)) < 1)
 		return;
 
 	msg = buffer_to_cn_msg(buffer);
@@ -353,13 +367,10 @@ static void cn_proc_mcast_ctl(struct net *net, struct cn_msg *msg,
 	if (msg->len != sizeof(*mc_op))
 		return;
 
-	/* 
-	 * Events are reported with respect to the initial pid
-	 * and user namespaces so ignore requestors from
-	 * other namespaces.
+	/*
+	 * Events are reported with respect to network namespaces.
 	 */
-	if ((current_user_ns() != &init_user_ns) ||
-	    (task_active_pid_ns(current) != &init_pid_ns))
+	if (current->nsproxy->net_ns != net)
 		return;
 
 	/* Can only change if privileged. */
@@ -371,10 +382,10 @@ static void cn_proc_mcast_ctl(struct net *net, struct cn_msg *msg,
 	mc_op = (enum proc_cn_mcast_op *)msg->data;
 	switch (*mc_op) {
 	case PROC_CN_MCAST_LISTEN:
-		atomic_inc(&proc_event_num_listeners);
+		atomic_inc(&(net->cdev.proc_event_num_listeners));
 		break;
 	case PROC_CN_MCAST_IGNORE:
-		atomic_dec(&proc_event_num_listeners);
+		atomic_dec(&(net->cdev.proc_event_num_listeners));
 		break;
 	default:
 		err = EINVAL;
diff --git a/drivers/connector/connector.c b/drivers/connector/connector.c
index 82fcaa4d8be3..30efcf39751f 100644
--- a/drivers/connector/connector.c
+++ b/drivers/connector/connector.c
@@ -26,9 +26,7 @@ MODULE_AUTHOR("Evgeniy Polyakov <zbr@ioremap.net>");
 MODULE_DESCRIPTION("Generic userspace <-> kernelspace connector.");
 MODULE_ALIAS_NET_PF_PROTO(PF_NETLINK, NETLINK_CONNECTOR);
 
-static struct cn_dev cdev;
-
-static int cn_already_initialized;
+static DEFINE_MUTEX(cn_mutex);
 
 /*
  * Sends mult (multiple) cn_msg at a time.
@@ -66,10 +64,13 @@ int cn_netlink_send_mult(struct net *net, struct cn_msg *msg, u16 len,
 	struct sk_buff *skb;
 	struct nlmsghdr *nlh;
 	struct cn_msg *data;
-	struct cn_dev *dev = &cdev;
+	struct cn_dev *dev = &(net->cdev);
 	u32 group = 0;
 	int found = 0;
 
+	if (!msg || len < 0)
+		return -EINVAL;
+
 	if (portid || __group) {
 		group = __group;
 	} else {
@@ -133,7 +134,7 @@ static int cn_call_callback(struct net *net, struct sk_buff *skb)
 {
 	struct nlmsghdr *nlh;
 	struct cn_callback_entry *i, *cbq = NULL;
-	struct cn_dev *dev = &cdev;
+	struct cn_dev *dev = &net->cdev;
 	struct cn_msg *msg = nlmsg_data(nlmsg_hdr(skb));
 	struct netlink_skb_parms *nsp = &NETLINK_CB(skb);
 	int err = -ENODEV;
@@ -168,7 +169,7 @@ static int cn_call_callback(struct net *net, struct sk_buff *skb)
  *
  * It checks skb, netlink header and msg sizes, and calls callback helper.
  */
-static void cn_rx_skb(struct sk_buff *skb)
+static void __cn_rx_skb(struct sk_buff *skb)
 {
 	struct nlmsghdr *nlh;
 	int len, err;
@@ -190,6 +191,13 @@ static void cn_rx_skb(struct sk_buff *skb)
 	}
 }
 
+static void cn_rx_skb(struct sk_buff *skb)
+{
+	mutex_lock(&cn_mutex);
+	__cn_rx_skb(skb);
+	mutex_unlock(&cn_mutex);
+}
+
 /*
  * Callback add routing - adds callback with given ID and name.
  * If there is registered callback with the same ID it will not be added.
@@ -200,20 +208,47 @@ int cn_add_callback(struct cb_id *id, const char *name,
 		    void (*callback)(struct net *, struct cn_msg *,
 				     struct netlink_skb_parms *))
 {
-	int err;
-	struct cn_dev *dev = &cdev;
-
-	if (!cn_already_initialized)
-		return -EAGAIN;
+	int err = -EINVAL;
+	struct net *net = NULL;
+	struct cn_dev *dev = NULL;
 
-	err = cn_queue_add_callback(dev->cbdev, name, id, callback);
-	if (err)
+	if (!id || !name || !callback)
 		return err;
 
-	return 0;
+	down_read(&net_rwsem);
+	for_each_net(net) {
+		dev = &net->cdev;
+		err = cn_queue_add_callback(dev->cbdev, name, id, callback);
+		if (err)
+			break;
+	}
+
+	if (err) {
+		for_each_net(net) {
+			dev = &net->cdev;
+			cn_queue_del_callback(dev->cbdev, id);
+		}
+	}
+	up_read(&net_rwsem);
+
+	return err;
 }
 EXPORT_SYMBOL_GPL(cn_add_callback);
 
+int cn_add_callback_one(struct net *net, struct cb_id *id, const char *name,
+			void (*callback)(struct net *, struct cn_msg *,
+					 struct netlink_skb_parms *))
+{
+	struct cn_dev *dev = NULL;
+
+	if (!net || !id || !name || !callback)
+		return -EINVAL;
+
+	dev = &(net->cdev);
+	return cn_queue_add_callback(dev->cbdev, name, id, callback);
+}
+EXPORT_SYMBOL_GPL(cn_add_callback_one);
+
 /*
  * Callback remove routing - removes callback
  * with given ID.
@@ -224,15 +259,25 @@ EXPORT_SYMBOL_GPL(cn_add_callback);
  */
 void cn_del_callback(struct cb_id *id)
 {
-	struct cn_dev *dev = &cdev;
+	struct net *net = NULL;
+	struct cn_dev *dev = NULL;
+
+	if (!id)
+		return;
 
-	cn_queue_del_callback(dev->cbdev, id);
+	down_read(&net_rwsem);
+	for_each_net(net) {
+		dev = &net->cdev;
+		cn_queue_del_callback(dev->cbdev, id);
+	}
+	up_read(&net_rwsem);
 }
 EXPORT_SYMBOL_GPL(cn_del_callback);
 
 static int __maybe_unused cn_proc_show(struct seq_file *m, void *v)
 {
-	struct cn_queue_dev *dev = cdev.cbdev;
+	struct net *net = seq_file_single_net(m);
+	struct cn_queue_dev *dev = net->cdev.cbdev;
 	struct cn_callback_entry *cbq;
 
 	seq_printf(m, "Name            ID\n");
@@ -251,15 +296,62 @@ static int __maybe_unused cn_proc_show(struct seq_file *m, void *v)
 	return 0;
 }
 
-static int cn_init(void)
+static int init_cn_net(struct net *net)
+{
+	int ret = 0;
+	struct cn_dev *init_dev = &(init_net.cdev);
+	struct cn_queue_dev *cbdev = init_dev->cbdev;
+
+	struct cn_callback_entry *cbq = NULL;
+	struct cn_callback_entry_ex *cbq_ex = NULL;
+	struct cn_callback_entry_ex *tmp = NULL;
+	LIST_HEAD(head);
+
+	if (!net)
+		return -EINVAL;
+
+	spin_lock_bh(&cbdev->queue_lock);
+	list_for_each_entry(cbq, &cbdev->queue_list, callback_entry) {
+		cbq_ex = kmalloc(sizeof(*cbq_ex), GFP_ATOMIC);
+		if (!cbq_ex) {
+			ret = -ENOMEM;
+			break;
+		}
+		INIT_LIST_HEAD(&(cbq_ex->list));
+
+		memcpy(&cbq_ex->id, &(cbq->id.id), sizeof(struct cb_id));
+		memcpy(cbq_ex->name, &(cbq->id.name), CN_CBQ_NAMELEN);
+		cbq_ex->callback =  cbq->callback;
+
+		list_add_tail(&(cbq_ex->list), &head);
+	}
+	spin_unlock_bh(&cbdev->queue_lock);
+
+	if (ret < 0) {
+		list_for_each_entry_safe(cbq_ex, tmp, &head, list) {
+			kfree(cbq_ex);
+		}
+	} else {
+		list_for_each_entry_safe(cbq_ex, tmp, &head, list) {
+			cn_add_callback_one(net, &(cbq_ex->id), cbq_ex->name,
+					    cbq_ex->callback);
+			kfree(cbq_ex);
+		}
+	}
+
+	return ret;
+}
+
+static int __net_init cn_init(struct net *net)
 {
-	struct cn_dev *dev = &cdev;
+	int ret = 0;
+	struct cn_dev *dev = &net->cdev;
 	struct netlink_kernel_cfg cfg = {
 		.groups	= CN_NETLINK_USERS + 0xf,
 		.input	= cn_rx_skb,
 	};
 
-	dev->nls = netlink_kernel_create(&init_net, NETLINK_CONNECTOR, &cfg);
+	dev->nls = netlink_kernel_create(net, NETLINK_CONNECTOR, &cfg);
 	if (!dev->nls)
 		return -EIO;
 
@@ -268,25 +360,44 @@ static int cn_init(void)
 		netlink_kernel_release(dev->nls);
 		return -EINVAL;
 	}
+	atomic_set(&(dev->proc_event_num_listeners), 0);
 
-	cn_already_initialized = 1;
+	ret = init_cn_net(net);
+	if (ret < 0) {
+		cn_queue_free_dev(dev->cbdev);
+		netlink_kernel_release(dev->nls);
+		return ret;
+	}
 
-	proc_create_single("connector", S_IRUGO, init_net.proc_net, cn_proc_show);
+	proc_create_net_single("connector", 0444, net->proc_net, cn_proc_show,
+			       NULL);
 
 	return 0;
 }
 
-static void cn_fini(void)
+static void __net_exit cn_fini(struct net *net)
 {
-	struct cn_dev *dev = &cdev;
-
-	cn_already_initialized = 0;
-
-	remove_proc_entry("connector", init_net.proc_net);
+	struct cn_dev *dev = &net->cdev;
 
+	remove_proc_entry("connector", net->proc_net);
 	cn_queue_free_dev(dev->cbdev);
 	netlink_kernel_release(dev->nls);
 }
 
-subsys_initcall(cn_init);
-module_exit(cn_fini);
+static struct pernet_operations cn_netlink_net_ops = {
+	.init = cn_init,
+	.exit = cn_fini,
+};
+
+static int __init connector_init(void)
+{
+	return register_pernet_subsys(&cn_netlink_net_ops);
+}
+
+static void __exit connector_exit(void)
+{
+	unregister_pernet_subsys(&cn_netlink_net_ops);
+}
+
+subsys_initcall(connector_init);
+module_exit(connector_exit);
diff --git a/drivers/hv/hv_fcopy.c b/drivers/hv/hv_fcopy.c
index 5040d7e0cd9e..a7151296af5c 100644
--- a/drivers/hv/hv_fcopy.c
+++ b/drivers/hv/hv_fcopy.c
@@ -13,6 +13,7 @@
 #include <linux/workqueue.h>
 #include <linux/hyperv.h>
 #include <linux/sched.h>
+#include <linux/slab.h>
 #include <asm/hyperv-tlfs.h>
 
 #include "hyperv_vmbus.h"
diff --git a/include/linux/connector.h b/include/linux/connector.h
index 8e9385eb18f8..17febd6946ce 100644
--- a/include/linux/connector.h
+++ b/include/linux/connector.h
@@ -14,11 +14,14 @@
 #include <linux/list.h>
 #include <linux/workqueue.h>
 
-#include <net/sock.h>
 #include <uapi/linux/connector.h>
 
 #define CN_CBQ_NAMELEN		32
 
+struct net;
+struct sock;
+struct netlink_skb_parms;
+
 struct cn_queue_dev {
 	atomic_t refcnt;
 	unsigned char name[CN_CBQ_NAMELEN];
@@ -46,11 +49,20 @@ struct cn_callback_entry {
 	u32 seq, group;
 };
 
+struct cn_callback_entry_ex {
+	struct list_head list;
+	struct cb_id id;
+	unsigned char name[CN_CBQ_NAMELEN];
+	void (*callback)(struct net *net, struct cn_msg *cn_msg,
+			 struct netlink_skb_parms *nsp);
+};
+
 struct cn_dev {
 	struct cb_id id;
 
 	u32 seq, groups;
 	struct sock *nls;
+	atomic_t proc_event_num_listeners;
 
 	struct cn_queue_dev *cbdev;
 };
diff --git a/include/net/net_namespace.h b/include/net/net_namespace.h
index 2ee5901bec7a..312972fb2dcc 100644
--- a/include/net/net_namespace.h
+++ b/include/net/net_namespace.h
@@ -38,6 +38,7 @@
 #include <linux/idr.h>
 #include <linux/skbuff.h>
 #include <linux/notifier.h>
+#include <linux/connector.h>
 
 struct user_namespace;
 struct proc_dir_entry;
@@ -187,6 +188,9 @@ struct net {
 #endif
 #if IS_ENABLED(CONFIG_CRYPTO_USER)
 	struct sock		*crypto_nlsk;
+#endif
+#if IS_ENABLED(CONFIG_CONNECTOR)
+	struct cn_dev		cdev;
 #endif
 	struct sock		*diag_nlsk;
 } __randomize_layout;
diff --git a/kernel/exit.c b/kernel/exit.c
index 727150f28103..976fd6032024 100644
--- a/kernel/exit.c
+++ b/kernel/exit.c
@@ -788,6 +788,7 @@ void __noreturn do_exit(long code)
 
 	tsk->exit_code = code;
 	taskstats_exit(tsk, group_dead);
+	proc_exit_connector(tsk);
 
 	exit_mm();
 
@@ -824,7 +825,6 @@ void __noreturn do_exit(long code)
 
 	exit_tasks_rcu_start();
 	exit_notify(tsk, group_dead);
-	proc_exit_connector(tsk);
 	mpol_put_task_policy(tsk);
 #ifdef CONFIG_FUTEX
 	if (unlikely(current->pi_state_cache))
-- 
2.27.0


^ permalink raw reply related	[flat|nested] 19+ messages in thread

* Re: [PATCH 5/5] connector: Create connector per namespace
  2020-07-02  0:26 ` [PATCH 5/5] connector: Create connector per namespace Matt Bennett
@ 2020-07-02  5:52   ` kernel test robot
  2020-07-02  6:40   ` kernel test robot
  2020-07-02 14:32   ` [kbuild] " Dan Carpenter
  2 siblings, 0 replies; 19+ messages in thread
From: kernel test robot @ 2020-07-02  5:52 UTC (permalink / raw)
  To: Matt Bennett, netdev
  Cc: kbuild-all, zbr, ebiederm, linux-kernel, Matt Bennett

[-- Attachment #1: Type: text/plain, Size: 4918 bytes --]

Hi Matt,

Thank you for the patch! Perhaps something to improve:

[auto build test WARNING on ipvs/master]
[also build test WARNING on dm/for-next linux/master linus/master v5.8-rc3 next-20200701]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use  as documented in
https://git-scm.com/docs/git-format-patch]

url:    https://github.com/0day-ci/linux/commits/Matt-Bennett/RFC-connector-Add-network-namespace-awareness/20200702-083030
base:   https://git.kernel.org/pub/scm/linux/kernel/git/horms/ipvs.git master
config: powerpc-pmac32_defconfig (attached as .config)
compiler: powerpc-linux-gcc (GCC) 9.3.0
reproduce (this is a W=1 build):
        wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
        chmod +x ~/bin/make.cross
        # save the attached .config to linux build tree
        COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-9.3.0 make.cross ARCH=powerpc 

If you fix the issue, kindly add the following tag as appropriate
Reported-by: kernel test robot <lkp@intel.com>

All warnings (new ones prefixed by >>):

   drivers/connector/connector.c: In function 'cn_netlink_send_mult':
>> drivers/connector/connector.c:71:18: warning: comparison is always false due to limited range of data type [-Wtype-limits]
      71 |  if (!msg || len < 0)
         |                  ^
   drivers/connector/connector.c: At top level:
   drivers/connector/connector.c:238:5: warning: no previous prototype for 'cn_add_callback_one' [-Wmissing-prototypes]
     238 | int cn_add_callback_one(struct net *net, struct cb_id *id, const char *name,
         |     ^~~~~~~~~~~~~~~~~~~

vim +71 drivers/connector/connector.c

    30	
    31	/*
    32	 * Sends mult (multiple) cn_msg at a time.
    33	 *
    34	 * msg->seq and msg->ack are used to determine message genealogy.
    35	 * When someone sends message it puts there locally unique sequence
    36	 * and random acknowledge numbers.  Sequence number may be copied into
    37	 * nlmsghdr->nlmsg_seq too.
    38	 *
    39	 * Sequence number is incremented with each message to be sent.
    40	 *
    41	 * If we expect a reply to our message then the sequence number in
    42	 * received message MUST be the same as in original message, and
    43	 * acknowledge number MUST be the same + 1.
    44	 *
    45	 * If we receive a message and its sequence number is not equal to the
    46	 * one we are expecting then it is a new message.
    47	 *
    48	 * If we receive a message and its sequence number is the same as one
    49	 * we are expecting but its acknowledgement number is not equal to
    50	 * the acknowledgement number in the original message + 1, then it is
    51	 * a new message.
    52	 *
    53	 * If msg->len != len, then additional cn_msg messages are expected following
    54	 * the first msg.
    55	 *
    56	 * The message is sent to, the portid if given, the group if given, both if
    57	 * both, or if both are zero then the group is looked up and sent there.
    58	 */
    59	int cn_netlink_send_mult(struct net *net, struct cn_msg *msg, u16 len,
    60				 u32 portid, u32 __group, gfp_t gfp_mask)
    61	{
    62		struct cn_callback_entry *__cbq;
    63		unsigned int size;
    64		struct sk_buff *skb;
    65		struct nlmsghdr *nlh;
    66		struct cn_msg *data;
    67		struct cn_dev *dev = &(net->cdev);
    68		u32 group = 0;
    69		int found = 0;
    70	
  > 71		if (!msg || len < 0)
    72			return -EINVAL;
    73	
    74		if (portid || __group) {
    75			group = __group;
    76		} else {
    77			spin_lock_bh(&dev->cbdev->queue_lock);
    78			list_for_each_entry(__cbq, &dev->cbdev->queue_list,
    79					    callback_entry) {
    80				if (cn_cb_equal(&__cbq->id.id, &msg->id)) {
    81					found = 1;
    82					group = __cbq->group;
    83					break;
    84				}
    85			}
    86			spin_unlock_bh(&dev->cbdev->queue_lock);
    87	
    88			if (!found)
    89				return -ENODEV;
    90		}
    91	
    92		if (!portid && !netlink_has_listeners(dev->nls, group))
    93			return -ESRCH;
    94	
    95		size = sizeof(*msg) + len;
    96	
    97		skb = nlmsg_new(size, gfp_mask);
    98		if (!skb)
    99			return -ENOMEM;
   100	
   101		nlh = nlmsg_put(skb, 0, msg->seq, NLMSG_DONE, size, 0);
   102		if (!nlh) {
   103			kfree_skb(skb);
   104			return -EMSGSIZE;
   105		}
   106	
   107		data = nlmsg_data(nlh);
   108	
   109		memcpy(data, msg, size);
   110	
   111		NETLINK_CB(skb).dst_group = group;
   112	
   113		if (group)
   114			return netlink_broadcast(dev->nls, skb, portid, group,
   115						 gfp_mask);
   116		return netlink_unicast(dev->nls, skb, portid,
   117				!gfpflags_allow_blocking(gfp_mask));
   118	}
   119	EXPORT_SYMBOL_GPL(cn_netlink_send_mult);
   120	

---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-all@lists.01.org

[-- Attachment #2: .config.gz --]
[-- Type: application/gzip, Size: 24974 bytes --]


* Re: [PATCH 5/5] connector: Create connector per namespace
  2020-07-02  0:26 ` [PATCH 5/5] connector: Create connector per namespace Matt Bennett
  2020-07-02  5:52   ` kernel test robot
@ 2020-07-02  6:40   ` kernel test robot
  2020-07-02 14:32   ` [kbuild] " Dan Carpenter
  2 siblings, 0 replies; 19+ messages in thread
From: kernel test robot @ 2020-07-02  6:40 UTC (permalink / raw)
  To: Matt Bennett, netdev
  Cc: kbuild-all, zbr, ebiederm, linux-kernel, Matt Bennett

[-- Attachment #1: Type: text/plain, Size: 6573 bytes --]

Hi Matt,

Thank you for the patch! Perhaps something to improve:

[auto build test WARNING on ipvs/master]
[also build test WARNING on dm/for-next linux/master linus/master v5.8-rc3 next-20200701]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use  as documented in
https://git-scm.com/docs/git-format-patch]

url:    https://github.com/0day-ci/linux/commits/Matt-Bennett/RFC-connector-Add-network-namespace-awareness/20200702-083030
base:   https://git.kernel.org/pub/scm/linux/kernel/git/horms/ipvs.git master
config: s390-randconfig-r016-20200701 (attached as .config)
compiler: s390-linux-gcc (GCC) 9.3.0
reproduce (this is a W=1 build):
        wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
        chmod +x ~/bin/make.cross
        # save the attached .config to linux build tree
        COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-9.3.0 make.cross ARCH=s390 

If you fix the issue, kindly add the following tag as appropriate
Reported-by: kernel test robot <lkp@intel.com>

All warnings (new ones prefixed by >>):

   In file included from drivers/connector/connector.c:9:
   drivers/connector/connector.c: In function 'cn_netlink_send_mult':
   drivers/connector/connector.c:71:18: warning: comparison is always false due to limited range of data type [-Wtype-limits]
      71 |  if (!msg || len < 0)
         |                  ^
   include/linux/compiler.h:58:52: note: in definition of macro '__trace_if_var'
      58 | #define __trace_if_var(cond) (__builtin_constant_p(cond) ? (cond) : __trace_if_value(cond))
         |                                                    ^~~~
>> drivers/connector/connector.c:71:2: note: in expansion of macro 'if'
      71 |  if (!msg || len < 0)
         |  ^~
   drivers/connector/connector.c:71:18: warning: comparison is always false due to limited range of data type [-Wtype-limits]
      71 |  if (!msg || len < 0)
         |                  ^
   include/linux/compiler.h:58:61: note: in definition of macro '__trace_if_var'
      58 | #define __trace_if_var(cond) (__builtin_constant_p(cond) ? (cond) : __trace_if_value(cond))
         |                                                             ^~~~
>> drivers/connector/connector.c:71:2: note: in expansion of macro 'if'
      71 |  if (!msg || len < 0)
         |  ^~
   drivers/connector/connector.c:71:18: warning: comparison is always false due to limited range of data type [-Wtype-limits]
      71 |  if (!msg || len < 0)
         |                  ^
   include/linux/compiler.h:69:3: note: in definition of macro '__trace_if_value'
      69 |  (cond) ?     \
         |   ^~~~
   include/linux/compiler.h:56:28: note: in expansion of macro '__trace_if_var'
      56 | #define if(cond, ...) if ( __trace_if_var( !!(cond , ## __VA_ARGS__) ) )
         |                            ^~~~~~~~~~~~~~
>> drivers/connector/connector.c:71:2: note: in expansion of macro 'if'
      71 |  if (!msg || len < 0)
         |  ^~
   drivers/connector/connector.c: At top level:
   drivers/connector/connector.c:238:5: warning: no previous prototype for 'cn_add_callback_one' [-Wmissing-prototypes]
     238 | int cn_add_callback_one(struct net *net, struct cb_id *id, const char *name,
         |     ^~~~~~~~~~~~~~~~~~~

vim +/if +71 drivers/connector/connector.c

    30	
    31	/*
    32	 * Sends mult (multiple) cn_msg at a time.
    33	 *
    34	 * msg->seq and msg->ack are used to determine message genealogy.
    35	 * When someone sends message it puts there locally unique sequence
    36	 * and random acknowledge numbers.  Sequence number may be copied into
    37	 * nlmsghdr->nlmsg_seq too.
    38	 *
    39	 * Sequence number is incremented with each message to be sent.
    40	 *
    41	 * If we expect a reply to our message then the sequence number in
    42	 * received message MUST be the same as in original message, and
    43	 * acknowledge number MUST be the same + 1.
    44	 *
    45	 * If we receive a message and its sequence number is not equal to the
    46	 * one we are expecting then it is a new message.
    47	 *
    48	 * If we receive a message and its sequence number is the same as one
    49	 * we are expecting but its acknowledgement number is not equal to
    50	 * the acknowledgement number in the original message + 1, then it is
    51	 * a new message.
    52	 *
    53	 * If msg->len != len, then additional cn_msg messages are expected following
    54	 * the first msg.
    55	 *
    56	 * The message is sent to, the portid if given, the group if given, both if
    57	 * both, or if both are zero then the group is looked up and sent there.
    58	 */
    59	int cn_netlink_send_mult(struct net *net, struct cn_msg *msg, u16 len,
    60				 u32 portid, u32 __group, gfp_t gfp_mask)
    61	{
    62		struct cn_callback_entry *__cbq;
    63		unsigned int size;
    64		struct sk_buff *skb;
    65		struct nlmsghdr *nlh;
    66		struct cn_msg *data;
    67		struct cn_dev *dev = &(net->cdev);
    68		u32 group = 0;
    69		int found = 0;
    70	
  > 71		if (!msg || len < 0)
    72			return -EINVAL;
    73	
    74		if (portid || __group) {
    75			group = __group;
    76		} else {
    77			spin_lock_bh(&dev->cbdev->queue_lock);
    78			list_for_each_entry(__cbq, &dev->cbdev->queue_list,
    79					    callback_entry) {
    80				if (cn_cb_equal(&__cbq->id.id, &msg->id)) {
    81					found = 1;
    82					group = __cbq->group;
    83					break;
    84				}
    85			}
    86			spin_unlock_bh(&dev->cbdev->queue_lock);
    87	
    88			if (!found)
    89				return -ENODEV;
    90		}
    91	
    92		if (!portid && !netlink_has_listeners(dev->nls, group))
    93			return -ESRCH;
    94	
    95		size = sizeof(*msg) + len;
    96	
    97		skb = nlmsg_new(size, gfp_mask);
    98		if (!skb)
    99			return -ENOMEM;
   100	
   101		nlh = nlmsg_put(skb, 0, msg->seq, NLMSG_DONE, size, 0);
   102		if (!nlh) {
   103			kfree_skb(skb);
   104			return -EMSGSIZE;
   105		}
   106	
   107		data = nlmsg_data(nlh);
   108	
   109		memcpy(data, msg, size);
   110	
   111		NETLINK_CB(skb).dst_group = group;
   112	
   113		if (group)
   114			return netlink_broadcast(dev->nls, skb, portid, group,
   115						 gfp_mask);
   116		return netlink_unicast(dev->nls, skb, portid,
   117				!gfpflags_allow_blocking(gfp_mask));
   118	}
   119	EXPORT_SYMBOL_GPL(cn_netlink_send_mult);
   120	

---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-all@lists.01.org

[-- Attachment #2: .config.gz --]
[-- Type: application/gzip, Size: 28134 bytes --]


* Re: [PATCH 0/5] RFC: connector: Add network namespace awareness
  2020-07-02  0:26 [PATCH 0/5] RFC: connector: Add network namespace awareness Matt Bennett
                   ` (4 preceding siblings ...)
  2020-07-02  0:26 ` [PATCH 5/5] connector: Create connector per namespace Matt Bennett
@ 2020-07-02 13:17 ` Eric W. Biederman
  2020-07-02 19:10   ` Christian Brauner
  2020-07-02 18:59 ` Eric W. Biederman
  6 siblings, 1 reply; 19+ messages in thread
From: Eric W. Biederman @ 2020-07-02 13:17 UTC (permalink / raw)
  To: Matt Bennett; +Cc: netdev, zbr, linux-kernel, Linux Containers

Matt Bennett <matt.bennett@alliedtelesis.co.nz> writes:

> Previously the connector functionality could only be used by processes running in the
> default network namespace. This meant that any process that uses the connector functionality
> could not operate correctly when run inside a container. This is a draft patch series that
> attempts to now allow this functionality outside of the default network namespace.
>
> I see this has been discussed previously [1], but am not sure how my changes relate to all
> of the topics discussed there and/or if there are any unintended side effects from my draft
> changes.

Is there a piece of software that uses connector that you want to get
working in containers?

I am curious what the motivation is because up until now there has been
nothing very interesting using this functionality.  So it hasn't been
worth anyone's time to make the necessary changes to the code.

Eric


> Thanks.
>
> [1] https://marc.info/?l=linux-kernel&m=150806196728365&w=2
>
> Matt Bennett (5):
>   connector: Use task pid helpers
>   connector: Use 'current_user_ns' function
>   connector: Ensure callback entry is released
>   connector: Prepare for supporting multiple namespaces
>   connector: Create connector per namespace
>
>  Documentation/driver-api/connector.rst |   6 +-
>  drivers/connector/cn_proc.c            | 110 +++++++-------
>  drivers/connector/cn_queue.c           |   9 +-
>  drivers/connector/connector.c          | 192 ++++++++++++++++++++-----
>  drivers/hv/hv_fcopy.c                  |   1 +
>  drivers/hv/hv_utils_transport.c        |   6 +-
>  drivers/md/dm-log-userspace-transfer.c |   6 +-
>  drivers/video/fbdev/uvesafb.c          |   8 +-
>  drivers/w1/w1_netlink.c                |  19 +--
>  include/linux/connector.h              |  38 +++--
>  include/net/net_namespace.h            |   4 +
>  kernel/exit.c                          |   2 +-
>  samples/connector/cn_test.c            |   6 +-
>  13 files changed, 286 insertions(+), 121 deletions(-)


* [kbuild] Re: [PATCH 5/5] connector: Create connector per namespace
  2020-07-02  0:26 ` [PATCH 5/5] connector: Create connector per namespace Matt Bennett
  2020-07-02  5:52   ` kernel test robot
  2020-07-02  6:40   ` kernel test robot
@ 2020-07-02 14:32   ` Dan Carpenter
  2 siblings, 0 replies; 19+ messages in thread
From: Dan Carpenter @ 2020-07-02 14:32 UTC (permalink / raw)
  To: kbuild, Matt Bennett, netdev
  Cc: lkp, Dan Carpenter, kbuild-all, zbr, ebiederm, linux-kernel,
	Matt Bennett

[-- Attachment #1: Type: text/plain, Size: 5919 bytes --]

Hi Matt,

url:    https://github.com/0day-ci/linux/commits/Matt-Bennett/RFC-connector-Add-network-namespace-awareness/20200702-083030
base:   https://git.kernel.org/pub/scm/linux/kernel/git/horms/ipvs.git master
config: arm-randconfig-m031-20200701 (attached as .config)
compiler: arm-linux-gnueabi-gcc (GCC) 9.3.0

If you fix the issue, kindly add the following tag as appropriate
Reported-by: kernel test robot <lkp@intel.com>
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>

smatch warnings:
drivers/connector/connector.c:71 cn_netlink_send_mult() warn: impossible condition '(len < 0) => (0-u16max < 0)'

# https://github.com/0day-ci/linux/commit/32f49fbf9dcfe5c3bc100728f5b8b5f9c95757cb
git remote add linux-review https://github.com/0day-ci/linux
git remote update linux-review
git checkout 32f49fbf9dcfe5c3bc100728f5b8b5f9c95757cb
vim +71 drivers/connector/connector.c

e921441014422d Matt Bennett             2020-07-02   59  int cn_netlink_send_mult(struct net *net, struct cn_msg *msg, u16 len,
                                                                                                                       ^^^^^^^

e921441014422d Matt Bennett             2020-07-02   60  			 u32 portid, u32 __group, gfp_t gfp_mask)
7672d0b5441137 Evgeniy Polyakov         2005-09-11   61  {
7672d0b5441137 Evgeniy Polyakov         2005-09-11   62  	struct cn_callback_entry *__cbq;
7672d0b5441137 Evgeniy Polyakov         2005-09-11   63  	unsigned int size;
7672d0b5441137 Evgeniy Polyakov         2005-09-11   64  	struct sk_buff *skb;
7672d0b5441137 Evgeniy Polyakov         2005-09-11   65  	struct nlmsghdr *nlh;
7672d0b5441137 Evgeniy Polyakov         2005-09-11   66  	struct cn_msg *data;
32f49fbf9dcfe5 Matt Bennett             2020-07-02   67  	struct cn_dev *dev = &(net->cdev);
7672d0b5441137 Evgeniy Polyakov         2005-09-11   68  	u32 group = 0;
7672d0b5441137 Evgeniy Polyakov         2005-09-11   69  	int found = 0;
7672d0b5441137 Evgeniy Polyakov         2005-09-11   70  
32f49fbf9dcfe5 Matt Bennett             2020-07-02  @71  	if (!msg || len < 0)
                                                                            ^^^^^^^
Unsigned can't be less than zero.

32f49fbf9dcfe5 Matt Bennett             2020-07-02   72  		return -EINVAL;
32f49fbf9dcfe5 Matt Bennett             2020-07-02   73  
ac8f73305eea8a David Fries              2014-01-15   74  	if (portid || __group) {
ac8f73305eea8a David Fries              2014-01-15   75  		group = __group;
ac8f73305eea8a David Fries              2014-01-15   76  	} else {
7672d0b5441137 Evgeniy Polyakov         2005-09-11   77  		spin_lock_bh(&dev->cbdev->queue_lock);
7672d0b5441137 Evgeniy Polyakov         2005-09-11   78  		list_for_each_entry(__cbq, &dev->cbdev->queue_list,
7672d0b5441137 Evgeniy Polyakov         2005-09-11   79  				    callback_entry) {
acd042bb2de50d Evgeniy Polyakov         2005-09-26   80  			if (cn_cb_equal(&__cbq->id.id, &msg->id)) {
7672d0b5441137 Evgeniy Polyakov         2005-09-11   81  				found = 1;
7672d0b5441137 Evgeniy Polyakov         2005-09-11   82  				group = __cbq->group;
fd00eeccd92b7b Li Zefan                 2008-01-04   83  				break;
7672d0b5441137 Evgeniy Polyakov         2005-09-11   84  			}
7672d0b5441137 Evgeniy Polyakov         2005-09-11   85  		}
7672d0b5441137 Evgeniy Polyakov         2005-09-11   86  		spin_unlock_bh(&dev->cbdev->queue_lock);
7672d0b5441137 Evgeniy Polyakov         2005-09-11   87  
7672d0b5441137 Evgeniy Polyakov         2005-09-11   88  		if (!found)
7672d0b5441137 Evgeniy Polyakov         2005-09-11   89  			return -ENODEV;
7672d0b5441137 Evgeniy Polyakov         2005-09-11   90  	}
7672d0b5441137 Evgeniy Polyakov         2005-09-11   91  
ac8f73305eea8a David Fries              2014-01-15   92  	if (!portid && !netlink_has_listeners(dev->nls, group))
b191ba0d599928 Evgeniy Polyakov         2006-03-20   93  		return -ESRCH;
b191ba0d599928 Evgeniy Polyakov         2006-03-20   94  
34470e0bfae223 David Fries              2014-04-08   95  	size = sizeof(*msg) + len;
7672d0b5441137 Evgeniy Polyakov         2005-09-11   96  
9631d79e815197 Hong zhi guo             2013-03-27   97  	skb = nlmsg_new(size, gfp_mask);
7672d0b5441137 Evgeniy Polyakov         2005-09-11   98  	if (!skb)
7672d0b5441137 Evgeniy Polyakov         2005-09-11   99  		return -ENOMEM;
7672d0b5441137 Evgeniy Polyakov         2005-09-11  100  
9631d79e815197 Hong zhi guo             2013-03-27  101  	nlh = nlmsg_put(skb, 0, msg->seq, NLMSG_DONE, size, 0);
85c931665d822f Javier Martinez Canillas 2012-06-26  102  	if (!nlh) {
85c931665d822f Javier Martinez Canillas 2012-06-26  103  		kfree_skb(skb);
85c931665d822f Javier Martinez Canillas 2012-06-26  104  		return -EMSGSIZE;
85c931665d822f Javier Martinez Canillas 2012-06-26  105  	}
7672d0b5441137 Evgeniy Polyakov         2005-09-11  106  
85c931665d822f Javier Martinez Canillas 2012-06-26  107  	data = nlmsg_data(nlh);
7672d0b5441137 Evgeniy Polyakov         2005-09-11  108  
ac73bf50b709de Mathias Krause           2013-09-30  109  	memcpy(data, msg, size);
7672d0b5441137 Evgeniy Polyakov         2005-09-11  110  
7672d0b5441137 Evgeniy Polyakov         2005-09-11  111  	NETLINK_CB(skb).dst_group = group;
7672d0b5441137 Evgeniy Polyakov         2005-09-11  112  
ac8f73305eea8a David Fries              2014-01-15  113  	if (group)
ac8f73305eea8a David Fries              2014-01-15  114  		return netlink_broadcast(dev->nls, skb, portid, group,
ac8f73305eea8a David Fries              2014-01-15  115  					 gfp_mask);
d0164adc89f6bb Mel Gorman               2015-11-06  116  	return netlink_unicast(dev->nls, skb, portid,
d0164adc89f6bb Mel Gorman               2015-11-06  117  			!gfpflags_allow_blocking(gfp_mask));
7672d0b5441137 Evgeniy Polyakov         2005-09-11  118  }

---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-all@lists.01.org

[-- Attachment #2: .config.gz --]
[-- Type: application/gzip, Size: 27431 bytes --]

[-- Attachment #3: Type: text/plain, Size: 149 bytes --]


^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH 0/5] RFC: connector: Add network namespace awareness
  2020-07-02  0:26 [PATCH 0/5] RFC: connector: Add network namespace awareness Matt Bennett
                   ` (5 preceding siblings ...)
  2020-07-02 13:17 ` [PATCH 0/5] RFC: connector: Add network namespace awareness Eric W. Biederman
@ 2020-07-02 18:59 ` Eric W. Biederman
  2020-07-05 22:31   ` Matt Bennett
  6 siblings, 1 reply; 19+ messages in thread
From: Eric W. Biederman @ 2020-07-02 18:59 UTC (permalink / raw)
  To: Matt Bennett; +Cc: netdev, zbr, linux-kernel, Linux Containers

Matt Bennett <matt.bennett@alliedtelesis.co.nz> writes:

> Previously the connector functionality could only be used by processes running in the
> default network namespace. This meant that any process that uses the connector functionality
> could not operate correctly when run inside a container. This is a draft patch series that
> attempts to now allow this functionality outside of the default network namespace.
>
> I see this has been discussed previously [1], but am not sure how my changes relate to all
> of the topics discussed there and/or if there are any unintended side
> effects from my draft

In a quick skim this patchset does not look like it approaches a correct
conversion to having code that works in multiple namespaces.

I will take the changes to proc_id_connector for example.
You report the values in the caller's current namespaces.

Which means an unprivileged user can create a user namespace and get
connector to report whichever ids they want to users in another
namespace.  AKA lie.

So this appears to make connector completely unreliable.

Eric



> changes.
>
> Thanks.
>
> [1] https://marc.info/?l=linux-kernel&m=150806196728365&w=2
>
> Matt Bennett (5):
>   connector: Use task pid helpers
>   connector: Use 'current_user_ns' function
>   connector: Ensure callback entry is released
>   connector: Prepare for supporting multiple namespaces
>   connector: Create connector per namespace
>
>  Documentation/driver-api/connector.rst |   6 +-
>  drivers/connector/cn_proc.c            | 110 +++++++-------
>  drivers/connector/cn_queue.c           |   9 +-
>  drivers/connector/connector.c          | 192 ++++++++++++++++++++-----
>  drivers/hv/hv_fcopy.c                  |   1 +
>  drivers/hv/hv_utils_transport.c        |   6 +-
>  drivers/md/dm-log-userspace-transfer.c |   6 +-
>  drivers/video/fbdev/uvesafb.c          |   8 +-
>  drivers/w1/w1_netlink.c                |  19 +--
>  include/linux/connector.h              |  38 +++--
>  include/net/net_namespace.h            |   4 +
>  kernel/exit.c                          |   2 +-
>  samples/connector/cn_test.c            |   6 +-
>  13 files changed, 286 insertions(+), 121 deletions(-)


* Re: [PATCH 0/5] RFC: connector: Add network namespace awareness
  2020-07-02 13:17 ` [PATCH 0/5] RFC: connector: Add network namespace awareness Eric W. Biederman
@ 2020-07-02 19:10   ` Christian Brauner
  2020-07-02 22:44     ` Aleksa Sarai
  2020-07-05 22:32     ` Matt Bennett
  0 siblings, 2 replies; 19+ messages in thread
From: Christian Brauner @ 2020-07-02 19:10 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: Matt Bennett, netdev, Linux Containers, linux-kernel, zbr

On Thu, Jul 02, 2020 at 08:17:38AM -0500, Eric W. Biederman wrote:
> Matt Bennett <matt.bennett@alliedtelesis.co.nz> writes:
> 
> > Previously the connector functionality could only be used by processes running in the
> > default network namespace. This meant that any process that uses the connector functionality
> > could not operate correctly when run inside a container. This is a draft patch series that
> > attempts to now allow this functionality outside of the default network namespace.
> >
> > I see this has been discussed previously [1], but am not sure how my changes relate to all
> > of the topics discussed there and/or if there are any unintended side effects from my draft
> > changes.
> 
> Is there a piece of software that uses connector that you want to get
> working in containers?
> 
> I am curious what the motivation is because up until now there has been
> nothing very interesting using this functionality.  So it hasn't been
> worth anyone's time to make the necessary changes to the code.

Imho, we should just state once and for all that the proc connector will
not be namespaced. This is such a corner-case thing and has been
non-namespaced for such a long time without consistent push for it to be
namespaced combined with the fact that this needs quite some code to
make it work correctly that I fear we end up buying more bugs than we're
selling features. And realistically, you and I will end up maintaining
this and I feel this is not worth the time(?). Maybe I'm being too
pessimistic though.

Christian


* Re: [PATCH 0/5] RFC: connector: Add network namespace awareness
  2020-07-02 19:10   ` Christian Brauner
@ 2020-07-02 22:44     ` Aleksa Sarai
  2020-07-05 22:32     ` Matt Bennett
  1 sibling, 0 replies; 19+ messages in thread
From: Aleksa Sarai @ 2020-07-02 22:44 UTC (permalink / raw)
  To: Christian Brauner
  Cc: Eric W. Biederman, netdev, Linux Containers, linux-kernel,
	Matt Bennett, zbr

[-- Attachment #1: Type: text/plain, Size: 3874 bytes --]

On 2020-07-02, Christian Brauner <christian.brauner@ubuntu.com> wrote:
> On Thu, Jul 02, 2020 at 08:17:38AM -0500, Eric W. Biederman wrote:
> > Matt Bennett <matt.bennett@alliedtelesis.co.nz> writes:
> > 
> > > Previously the connector functionality could only be used by processes running in the
> > > default network namespace. This meant that any process that uses the connector functionality
> > > could not operate correctly when run inside a container. This is a draft patch series that
> > > attempts to now allow this functionality outside of the default network namespace.
> > >
> > > I see this has been discussed previously [1], but am not sure how my changes relate to all
> > > of the topics discussed there and/or if there are any unintended side effects from my draft
> > > changes.
> > 
> > Is there a piece of software that uses connector that you want to get
> > working in containers?
> > 
> > I am curious what the motivation is because up until now there has been
> > nothing very interesting using this functionality.  So it hasn't been
> > worth anyone's time to make the necessary changes to the code.
> 
> Imho, we should just state once and for all that the proc connector will
> not be namespaced. This is such a corner-case thing and has been
> non-namespaced for such a long time without consistent push for it to be
> namespaced combined with the fact that this needs quite some code to
> make it work correctly that I fear we end up buying more bugs than we're
> selling features. And realistically, you and I will end up maintaining
> this and I feel this is not worth the time(?). Maybe I'm being too
> pessimistic though.

It would be nice to have the proc connector be namespaced, because it
would allow you to have init systems that don't depend on cgroups to
operate -- and it would allow us to have a subset of FreeBSD's kqueue
functionality that doesn't exist today under Linux. However, arguably
pidfds might be a better path forward toward implementing such events
these days -- and is maybe something we should look into.

All of that being said, I agree that doing this is going to be
particularly hairy and likely not worth the effort. In particular, the
proc connector is:

 * Almost entirely unused (and largely unknown) by userspace.

 * Fairly fundamentally broken right now (the "security feature" of
   PROC_CN_MCAST_LISTEN doesn't work because once there is one listener,
   anyone who opens an cn_proc socket can get all events on the system
   -- and if the process which opened the socket dies with calling
   PROC_CN_MCAST_IGNORE then that information is now always streaming).
   So if we end up supporting this, we'll need to fix those bugs too.

 * Is so deeply intertwined with netlink and thus is so deeply embedded
   with network namespaces (rather than pid namespaces) meaning that
   getting it to correctly handle shared-network-namespace cases is
   going to be a nightmare. I agree with Eric that this patchset looks
   like it doesn't approach the problem from the right angle (and
   thinking about how you could fix it makes me a little nervous).

Not to mention that when I brought this up with the maintainer listed in
MAINTAINERS a few years ago (soon after I posted [1]), I was told that
they no longer maintain this code -- so whoever touches it next is the
new maintainer.

In 2017, I wrote that GNU Shepherd uses cn_proc; however, I'm pretty sure
(looking at the code now) that it wasn't true then and isn't true now
(Shepherd seems to just do basic pidfile liveness checks). So even the
niche example I used then doesn't actually use cn_proc.

[1]: https://lore.kernel.org/lkml/a2fa1602-2280-c5e8-cac9-b718eaea5d22@suse.de/

-- 
Aleksa Sarai
Senior Software Engineer (Containers)
SUSE Linux GmbH
<https://www.cyphar.com/>

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]


* Re: [PATCH 0/5] RFC: connector: Add network namespace awareness
  2020-07-02 18:59 ` Eric W. Biederman
@ 2020-07-05 22:31   ` Matt Bennett
  2020-07-13 18:39     ` Eric W. Biederman
  0 siblings, 1 reply; 19+ messages in thread
From: Matt Bennett @ 2020-07-05 22:31 UTC (permalink / raw)
  To: ebiederm; +Cc: netdev, zbr, linux-kernel, containers

On Thu, 2020-07-02 at 13:59 -0500, Eric W. Biederman wrote:
> Matt Bennett <matt.bennett@alliedtelesis.co.nz> writes:
> 
> > Previously the connector functionality could only be used by processes running in the
> > default network namespace. This meant that any process that uses the connector functionality
> > could not operate correctly when run inside a container. This is a draft patch series that
> > attempts to now allow this functionality outside of the default network namespace.
> > 
> > I see this has been discussed previously [1], but am not sure how my changes relate to all
> > of the topics discussed there and/or if there are any unintended side
> > effects from my draft
> 
> In a quick skim this patchset does not look like it approaches a correct
> conversion to having code that works in multiple namespaces.
> 
> I will take the changes to proc_id_connector for example.
> You report the values in the callers current namespaces.
> 
> Which means an unprivileged user can create a user namespace and get
> connector to report whichever ids they want to users in another
> namespace.  AKA lie.
> 
> So this appears to make connector completely unreliable.
> 
> Eric
> 

Hi Eric,

Thank you for taking the time to review. I wrote these patches in an attempt to show that I was willing to do the work myself rather than simply
asking for someone else to do it for me. The changes worked for my use cases when I tested them, but I expected that some of the changes would be
incorrect and that I would need some guidance. I can spend some time to really dig in and fully understand the changes I am trying to make (I have
limited kernel development experience) but based on the rest of the discussion threads it seems that there is likely no appetite to ever support
namespaces with the connector.

Best regards,
Matt


* Re: [PATCH 0/5] RFC: connector: Add network namespace awareness
  2020-07-02 19:10   ` Christian Brauner
  2020-07-02 22:44     ` Aleksa Sarai
@ 2020-07-05 22:32     ` Matt Bennett
  2020-07-13 18:34       ` Eric W. Biederman
  1 sibling, 1 reply; 19+ messages in thread
From: Matt Bennett @ 2020-07-05 22:32 UTC (permalink / raw)
  To: christian.brauner, ebiederm; +Cc: netdev, linux-kernel, containers, zbr

On Thu, 2020-07-02 at 21:10 +0200, Christian Brauner wrote:
> On Thu, Jul 02, 2020 at 08:17:38AM -0500, Eric W. Biederman wrote:
> > Matt Bennett <matt.bennett@alliedtelesis.co.nz> writes:
> > 
> > > Previously the connector functionality could only be used by processes running in the
> > > default network namespace. This meant that any process that uses the connector functionality
> > > could not operate correctly when run inside a container. This is a draft patch series that
> > > attempts to now allow this functionality outside of the default network namespace.
> > > 
> > > I see this has been discussed previously [1], but am not sure how my changes relate to all
> > > of the topics discussed there and/or if there are any unintended side effects from my draft
> > > changes.
> > 
> > Is there a piece of software that uses connector that you want to get
> > working in containers?

We have an IPC system [1] where processes can register their socket details (unix, tcp, tipc, ...) to a 'monitor' process. Processes can then get
notified when other processes they are interested in start/stop their servers and use the registered details to connect to them. Everything works
unless a process crashes, in which case the monitoring process never removes their details. Therefore the monitoring process uses the connector
functionality with PROC_EVENT_EXIT to detect when a process crashes and removes the details if it is a previously registered PID.

This was working for us until we tried to run our system in a container.

> > 
> > I am curious what the motivation is because up until now there has been
> > nothing very interesting using this functionality.  So it hasn't been
> > worth anyone's time to make the necessary changes to the code.
> 
> Imho, we should just state once and for all that the proc connector will
> not be namespaced. This is such a corner-case thing and has been
> non-namespaced for such a long time without consistent push for it to be
> namespaced combined with the fact that this needs quite some code to
> make it work correctly that I fear we end up buying more bugs than we're
> selling features. And realistically, you and I will end up maintaining
> this and I feel this is not worth the time(?). Maybe I'm being too
> pessimistic though.
> 

Fair enough. I can certainly look for another way to detect process crashes. Interestingly, I found a patch set [2] on the mailing list that attempts
to solve the problem I wish to solve, but it doesn't look like the patches were ever developed further. From reading the discussion thread on that
patch set it appears that I should be doing some form of polling on the /proc files.

Best regards,
Matt

[1] https://github.com/alliedtelesis/cmsg/blob/master/cmsg/src/service_listener/netlink.c#L61
[2] https://lkml.org/lkml/2018/10/29/638



* Re: [PATCH 0/5] RFC: connector: Add network namespace awareness
  2020-07-05 22:32     ` Matt Bennett
@ 2020-07-13 18:34       ` Eric W. Biederman
  2020-07-14  5:03         ` Aleksa Sarai
  0 siblings, 1 reply; 19+ messages in thread
From: Eric W. Biederman @ 2020-07-13 18:34 UTC (permalink / raw)
  To: Matt Bennett; +Cc: christian.brauner, netdev, linux-kernel, containers, zbr

Matt Bennett <Matt.Bennett@alliedtelesis.co.nz> writes:

> On Thu, 2020-07-02 at 21:10 +0200, Christian Brauner wrote:
>> On Thu, Jul 02, 2020 at 08:17:38AM -0500, Eric W. Biederman wrote:
>> > Matt Bennett <matt.bennett@alliedtelesis.co.nz> writes:
>> > 
>> > > Previously the connector functionality could only be used by processes running in the
>> > > default network namespace. This meant that any process that uses the connector functionality
>> > > could not operate correctly when run inside a container. This is a draft patch series that
>> > > attempts to now allow this functionality outside of the default network namespace.
>> > > 
>> > > I see this has been discussed previously [1], but am not sure how my changes relate to all
>> > > of the topics discussed there and/or if there are any unintended side effects from my draft
>> > > changes.
>> > 
>> > Is there a piece of software that uses connector that you want to get
>> > working in containers?
>
> We have an IPC system [1] where processes can register their socket
> details (unix, tcp, tipc, ...) to a 'monitor' process. Processes can
> then get notified when other processes they are interested in
> start/stop their servers and use the registered details to connect to
> them. Everything works unless a process crashes, in which case the
> monitoring process never removes their details. Therefore the
> monitoring process uses the connector functionality with
> PROC_EVENT_EXIT to detect when a process crashes and removes the
> details if it is a previously registered PID.
>
> This was working for us until we tried to run our system in a container.
>
>> > 
>> > I am curious what the motivation is because up until now there has been
>> > nothing very interesting using this functionality.  So it hasn't been
>> > worth anyone's time to make the necessary changes to the code.
>> 
>> Imho, we should just state once and for all that the proc connector will
>> not be namespaced. This is such a corner-case thing and has been
>> non-namespaced for such a long time without consistent push for it to be
>> namespaced combined with the fact that this needs quite some code to
>> make it work correctly that I fear we end up buying more bugs than we're
>> selling features. And realistically, you and I will end up maintaining
>> this and I feel this is not worth the time(?). Maybe I'm being too
>> pessimistic though.
>> 
>
> Fair enough. I can certainly look for another way to detect process
> crashes. Interestingly I found a patch set [2] on the mailing list
> that attempts to solve the problem I wish to solve, but it doesn't
> look like the patches were ever developed further. From reading the
> discussion thread on that patch set it appears that I should be doing
> some form of polling on the /proc files.

Recently Christian Brauner implemented pidfd complete with a poll
operation that reports when a process terminates.

If you are willing to change your userspace code switching to pidfd
should be all that you need.

Eric


* Re: [PATCH 0/5] RFC: connector: Add network namespace awareness
  2020-07-05 22:31   ` Matt Bennett
@ 2020-07-13 18:39     ` Eric W. Biederman
  0 siblings, 0 replies; 19+ messages in thread
From: Eric W. Biederman @ 2020-07-13 18:39 UTC (permalink / raw)
  To: Matt Bennett; +Cc: netdev, zbr, linux-kernel, containers

Matt Bennett <Matt.Bennett@alliedtelesis.co.nz> writes:

> On Thu, 2020-07-02 at 13:59 -0500, Eric W. Biederman wrote:
>> Matt Bennett <matt.bennett@alliedtelesis.co.nz> writes:
>> 
>> > Previously the connector functionality could only be used by processes running in the
>> > default network namespace. This meant that any process that uses the connector functionality
>> > could not operate correctly when run inside a container. This is a draft patch series that
>> > attempts to now allow this functionality outside of the default network namespace.
>> > 
>> > I see this has been discussed previously [1], but am not sure how my changes relate to all
>> > of the topics discussed there and/or if there are any unintended side
>> > effects from my draft
>> 
>> In a quick skim this patchset does not look like it approaches a correct
>> conversion to having code that works in multiple namespaces.
>> 
>> I will take the changes to proc_id_connector for example.
>> You report the values in the caller's current namespaces.
>> 
>> Which means an unprivileged user can create a user namespace and get
>> connector to report whichever ids they want to users in another
>> namespace.  AKA lie.
>> 
>> So this appears to make connector completely unreliable.
>> 
>> Eric
>> 
>
> Hi Eric,
>
> Thank you for taking the time to review. I wrote these patches in an
> attempt to show that I was willing to do the work myself rather than
> simply asking for someone else to do it for me. The changes worked for
> my use cases when I tested them, but I expected that some of the
> changes would be incorrect and that I would need some guidance. I can
> spend some time to really dig in and fully understand the changes I am
> trying to make (I have limited kernel development experience) but
> based on the rest of the discussion threads it seems that there is
> likely no appetite to ever support namespaces with the connector.

Good approach to this.

My sense is that there are few enough uses of connector that if you don't
mind changing your code so that it works in a container (and the pidfd
support appears to already provide what you need) that is probably the
path of least resistance.

I don't think maintaining connector support would be much more work
than it is now, if someone went through and did the work to carefully
convert the code.  So if someone really wants to use connector we can
namespace the code.

Otherwise it probably makes sense to let the few users gradually stop
using connector so the code can eventually be removed.

Please check out the pidfd support and tell us how it meets your
needs.  If there is something that connector really does better it would
be good to know.

Thank you,
Eric


* Re: [PATCH 0/5] RFC: connector: Add network namespace awareness
  2020-07-13 18:34       ` Eric W. Biederman
@ 2020-07-14  5:03         ` Aleksa Sarai
  2020-07-14  5:19           ` Matt Bennett
  0 siblings, 1 reply; 19+ messages in thread
From: Aleksa Sarai @ 2020-07-14  5:03 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: Matt Bennett, netdev, christian.brauner, containers, linux-kernel, zbr

[-- Attachment #1: Type: text/plain, Size: 3544 bytes --]

On 2020-07-13, Eric W. Biederman <ebiederm@xmission.com> wrote:
> Matt Bennett <Matt.Bennett@alliedtelesis.co.nz> writes:
> 
> > On Thu, 2020-07-02 at 21:10 +0200, Christian Brauner wrote:
> >> On Thu, Jul 02, 2020 at 08:17:38AM -0500, Eric W. Biederman wrote:
> >> > Matt Bennett <matt.bennett@alliedtelesis.co.nz> writes:
> >> > 
> >> > > Previously the connector functionality could only be used by processes running in the
> >> > > default network namespace. This meant that any process that uses the connector functionality
> >> > > could not operate correctly when run inside a container. This is a draft patch series that
> >> > > attempts to now allow this functionality outside of the default network namespace.
> >> > > 
> >> > > I see this has been discussed previously [1], but am not sure how my changes relate to all
> >> > > of the topics discussed there and/or if there are any unintended side effects from my draft
> >> > > changes.
> >> > 
> >> > Is there a piece of software that uses connector that you want to get
> >> > working in containers?
> >
> > We have an IPC system [1] where processes can register their socket
> > details (unix, tcp, tipc, ...) to a 'monitor' process. Processes can
> > then get notified when other processes they are interested in
> > start/stop their servers and use the registered details to connect to
> > them. Everything works unless a process crashes, in which case the
> > monitoring process never removes their details. Therefore the
> > monitoring process uses the connector functionality with
> > PROC_EVENT_EXIT to detect when a process crashes and removes the
> > details if it is a previously registered PID.
> >
> > This was working for us until we tried to run our system in a container.
> >
> >> > 
> >> > I am curious what the motivation is because up until now there has been
> >> > nothing very interesting using this functionality.  So it hasn't been
> >> > worth anyone's time to make the necessary changes to the code.
> >> 
> >> Imho, we should just state once and for all that the proc connector will
> >> not be namespaced. This is such a corner-case thing and has been
> >> non-namespaced for such a long time without consistent push for it to be
> >> namespaced combined with the fact that this needs quite some code to
> >> make it work correctly that I fear we end up buying more bugs than we're
> >> selling features. And realistically, you and I will end up maintaining
> >> this and I feel this is not worth the time(?). Maybe I'm being too
> >> pessimistic though.
> >> 
> >
> > Fair enough. I can certainly look for another way to detect process
> > crashes. Interestingly I found a patch set [2] on the mailing list
> > that attempts to solve the problem I wish to solve, but it doesn't
> > look like the patches were ever developed further. From reading the
> > discussion thread on that patch set it appears that I should be doing
> > some form of polling on the /proc files.
> 
> Recently Christian Brauner implemented pidfd complete with a poll
> operation that reports when a process terminates.
> 
> If you are willing to change your userspace code switching to pidfd
> should be all that you need.

While this does solve the problem of getting exit notifications in
general, you cannot get the exit code. But if they don't care about that
then we can solve that problem another time. :D

-- 
Aleksa Sarai
Senior Software Engineer (Containers)
SUSE Linux GmbH
<https://www.cyphar.com/>

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 228 bytes --]


* Re: [PATCH 0/5] RFC: connector: Add network namespace awareness
  2020-07-14  5:03         ` Aleksa Sarai
@ 2020-07-14  5:19           ` Matt Bennett
  0 siblings, 0 replies; 19+ messages in thread
From: Matt Bennett @ 2020-07-14  5:19 UTC (permalink / raw)
  To: ebiederm, cyphar; +Cc: netdev, christian.brauner, zbr, containers, linux-kernel

On Tue, 2020-07-14 at 15:03 +1000, Aleksa Sarai wrote:
> On 2020-07-13, Eric W. Biederman <ebiederm@xmission.com> wrote:
> > Matt Bennett <Matt.Bennett@alliedtelesis.co.nz> writes:
> > 
> > > On Thu, 2020-07-02 at 21:10 +0200, Christian Brauner wrote:
> > > > On Thu, Jul 02, 2020 at 08:17:38AM -0500, Eric W. Biederman wrote:
> > > > > Matt Bennett <matt.bennett@alliedtelesis.co.nz> writes:
> > > > > 
> > > > > > Previously the connector functionality could only be used by processes running in the
> > > > > > default network namespace. This meant that any process that uses the connector functionality
> > > > > > could not operate correctly when run inside a container. This is a draft patch series that
> > > > > > attempts to now allow this functionality outside of the default network namespace.
> > > > > > 
> > > > > > I see this has been discussed previously [1], but am not sure how my changes relate to all
> > > > > > of the topics discussed there and/or if there are any unintended side effects from my draft
> > > > > > changes.
> > > > > 
> > > > > Is there a piece of software that uses connector that you want to get
> > > > > working in containers?
> > > 
> > > We have an IPC system [1] where processes can register their socket
> > > details (unix, tcp, tipc, ...) to a 'monitor' process. Processes can
> > > then get notified when other processes they are interested in
> > > start/stop their servers and use the registered details to connect to
> > > them. Everything works unless a process crashes, in which case the
> > > monitoring process never removes their details. Therefore the
> > > monitoring process uses the connector functionality with
> > > PROC_EVENT_EXIT to detect when a process crashes and removes the
> > > details if it is a previously registered PID.
> > > 
> > > This was working for us until we tried to run our system in a container.
> > > 
> > > > > 
> > > > > I am curious what the motivation is because up until now there has been
> > > > > nothing very interesting using this functionality.  So it hasn't been
> > > > > worth anyone's time to make the necessary changes to the code.
> > > > 
> > > > Imho, we should just state once and for all that the proc connector will
> > > > not be namespaced. This is such a corner-case thing and has been
> > > > non-namespaced for such a long time without consistent push for it to be
> > > > namespaced combined with the fact that this needs quite some code to
> > > > make it work correctly that I fear we end up buying more bugs than we're
> > > > selling features. And realistically, you and I will end up maintaining
> > > > this and I feel this is not worth the time(?). Maybe I'm being too
> > > > pessimistic though.
> > > > 
> > > 
> > > Fair enough. I can certainly look for another way to detect process
> > > crashes. Interestingly I found a patch set [2] on the mailing list
> > > that attempts to solve the problem I wish to solve, but it doesn't
> > > look like the patches were ever developed further. From reading the
> > > discussion thread on that patch set it appears that I should be doing
> > > some form of polling on the /proc files.
> > 
> > Recently Christian Brauner implemented pidfd complete with a poll
> > operation that reports when a process terminates.
> > 
> > If you are willing to change your userspace code switching to pidfd
> > should be all that you need.
> 
> While this does solve the problem of getting exit notifications in
> general, you cannot get the exit code. But if they don't care about that
> then we can solve that problem another time. :D
> 

At first glance, using pidfd will do exactly what we need. Not being able
to get the exit code will not be an issue. In fact I think it will be an
improvement over the connector, as the listener will now only wait on the
PIDs we actually care about, rather than being woken on every single
process exit and having to check whether it cares about the PID.

Many thanks Eric and others,
Matt




end of thread, other threads:[~2020-07-14  5:19 UTC | newest]

Thread overview: 19+ messages
-- links below jump to the message on this page --
2020-07-02  0:26 [PATCH 0/5] RFC: connector: Add network namespace awareness Matt Bennett
2020-07-02  0:26 ` [PATCH 1/5] connector: Use task pid helpers Matt Bennett
2020-07-02  0:26 ` [PATCH 2/5] connector: Use 'current_user_ns' function Matt Bennett
2020-07-02  0:26 ` [PATCH 3/5] connector: Ensure callback entry is released Matt Bennett
2020-07-02  0:26 ` [PATCH 4/5] connector: Prepare for supporting multiple namespaces Matt Bennett
2020-07-02  0:26 ` [PATCH 5/5] connector: Create connector per namespace Matt Bennett
2020-07-02  5:52   ` kernel test robot
2020-07-02  6:40   ` kernel test robot
2020-07-02 14:32   ` [kbuild] " Dan Carpenter
2020-07-02 13:17 ` [PATCH 0/5] RFC: connector: Add network namespace awareness Eric W. Biederman
2020-07-02 19:10   ` Christian Brauner
2020-07-02 22:44     ` Aleksa Sarai
2020-07-05 22:32     ` Matt Bennett
2020-07-13 18:34       ` Eric W. Biederman
2020-07-14  5:03         ` Aleksa Sarai
2020-07-14  5:19           ` Matt Bennett
2020-07-02 18:59 ` Eric W. Biederman
2020-07-05 22:31   ` Matt Bennett
2020-07-13 18:39     ` Eric W. Biederman
