linux-rdma.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH rdma-next 0/2] Remove PID namespaces support from restrack
@ 2019-10-02 12:32 Leon Romanovsky
  2019-10-02 12:32 ` [PATCH rdma-next 1/2] RDMA/restrack: Remove PID namespace support Leon Romanovsky
  2019-10-02 12:32 ` [PATCH rdma-next 2/2] RDMA/core: Check that process is still alive before sending it to the users Leon Romanovsky
  0 siblings, 2 replies; 7+ messages in thread
From: Leon Romanovsky @ 2019-10-02 12:32 UTC (permalink / raw)
  To: Doug Ledford, Jason Gunthorpe; +Cc: Leon Romanovsky, RDMA mailing list

From: Leon Romanovsky <leonro@mellanox.com>

Hi,

Please see individual commit messages for more details.

Thanks

Leon Romanovsky (2):
  RDMA/restrack: Remove PID namespace support
  RDMA/core: Check that process is still alive before sending it to the
    users

 drivers/infiniband/core/counters.c | 28 ++++++++-----------
 drivers/infiniband/core/nldev.c    | 44 ++++++++++++++++--------------
 drivers/infiniband/core/restrack.c | 20 +-------------
 drivers/infiniband/core/restrack.h |  1 -
 4 files changed, 37 insertions(+), 56 deletions(-)

--
2.20.1


^ permalink raw reply	[flat|nested] 7+ messages in thread

* [PATCH rdma-next 1/2] RDMA/restrack: Remove PID namespace support
  2019-10-02 12:32 [PATCH rdma-next 0/2] Remove PID namespaces support from restrack Leon Romanovsky
@ 2019-10-02 12:32 ` Leon Romanovsky
  2019-10-02 12:32 ` [PATCH rdma-next 2/2] RDMA/core: Check that process is still alive before sending it to the users Leon Romanovsky
  1 sibling, 0 replies; 7+ messages in thread
From: Leon Romanovsky @ 2019-10-02 12:32 UTC (permalink / raw)
  To: Doug Ledford, Jason Gunthorpe; +Cc: Leon Romanovsky, RDMA mailing list

From: Leon Romanovsky <leonro@mellanox.com>

IB resources are bounded to IB device and file descriptors,
both entities are unaware to PID namespaces and to task lifetime.

The difference in model caused to unpredictable behavior for the
following scenario:
 1. Create FD and context
 2. Share it with ephemeral child
 3. Create any object and exit that child

The end result of this flow, that those newly created objects will be
tracked by restrack, but won't be visible for users because task_struct
associated with them already exited.

The right thing is to rely on net namespace only for any filtering
purposes  and drop PID namespace.

Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
---
 drivers/infiniband/core/counters.c | 14 --------------
 drivers/infiniband/core/nldev.c    | 13 +------------
 drivers/infiniband/core/restrack.c | 20 +-------------------
 drivers/infiniband/core/restrack.h |  1 -
 4 files changed, 2 insertions(+), 46 deletions(-)

diff --git a/drivers/infiniband/core/counters.c b/drivers/infiniband/core/counters.c
index 736ab760025d..12ba2685abcf 100644
--- a/drivers/infiniband/core/counters.c
+++ b/drivers/infiniband/core/counters.c
@@ -149,9 +149,6 @@ static bool auto_mode_match(struct ib_qp *qp, struct rdma_counter *counter,
 	struct auto_mode_param *param = &counter->mode.param;
 	bool match = true;

-	if (!rdma_is_visible_in_pid_ns(&qp->res))
-		return false;
-
 	/* Ensure that counter belongs to the right PID */
 	if (task_pid_nr(counter->res.task) != task_pid_nr(qp->res.task))
 		return false;
@@ -229,9 +226,6 @@ static struct rdma_counter *rdma_get_counter_auto_mode(struct ib_qp *qp,
 	rt = &dev->res[RDMA_RESTRACK_COUNTER];
 	xa_lock(&rt->xa);
 	xa_for_each(&rt->xa, id, res) {
-		if (!rdma_is_visible_in_pid_ns(res))
-			continue;
-
 		counter = container_of(res, struct rdma_counter, res);
 		if ((counter->device != qp->device) || (counter->port != port))
 			goto next;
@@ -412,9 +406,6 @@ static struct ib_qp *rdma_counter_get_qp(struct ib_device *dev, u32 qp_num)
 	if (IS_ERR(res))
 		return NULL;

-	if (!rdma_is_visible_in_pid_ns(res))
-		goto err;
-
 	qp = container_of(res, struct ib_qp, res);
 	if (qp->qp_type == IB_QPT_RAW_PACKET && !capable(CAP_NET_RAW))
 		goto err;
@@ -445,11 +436,6 @@ static struct rdma_counter *rdma_get_counter_by_id(struct ib_device *dev,
 	if (IS_ERR(res))
 		return NULL;

-	if (!rdma_is_visible_in_pid_ns(res)) {
-		rdma_restrack_put(res);
-		return NULL;
-	}
-
 	counter = container_of(res, struct rdma_counter, res);
 	kref_get(&counter->kref);
 	rdma_restrack_put(res);
diff --git a/drivers/infiniband/core/nldev.c b/drivers/infiniband/core/nldev.c
index 7a7474000100..71bc08510064 100644
--- a/drivers/infiniband/core/nldev.c
+++ b/drivers/infiniband/core/nldev.c
@@ -698,9 +698,6 @@ static int fill_stat_counter_qps(struct sk_buff *msg,
 	rt = &counter->device->res[RDMA_RESTRACK_QP];
 	xa_lock(&rt->xa);
 	xa_for_each(&rt->xa, id, res) {
-		if (!rdma_is_visible_in_pid_ns(res))
-			continue;
-
 		qp = container_of(res, struct ib_qp, res);
 		if (qp->qp_type == IB_QPT_RAW_PACKET && !capable(CAP_NET_RAW))
 			continue;
@@ -1222,15 +1219,10 @@ static int res_get_common_doit(struct sk_buff *skb, struct nlmsghdr *nlh,
 		goto err;
 	}

-	if (!rdma_is_visible_in_pid_ns(res)) {
-		ret = -ENOENT;
-		goto err_get;
-	}
-
 	msg = nlmsg_new(NLMSG_DEFAULT_SIZE, GFP_KERNEL);
 	if (!msg) {
 		ret = -ENOMEM;
-		goto err;
+		goto err_get;
 	}

 	nlh = nlmsg_put(msg, NETLINK_CB(skb).portid, nlh->nlmsg_seq,
@@ -1334,9 +1326,6 @@ static int res_get_common_dumpit(struct sk_buff *skb,
 	 * objects.
 	 */
 	xa_for_each(&rt->xa, id, res) {
-		if (!rdma_is_visible_in_pid_ns(res))
-			continue;
-
 		if (idx < start || !rdma_restrack_get(res))
 			goto next;

diff --git a/drivers/infiniband/core/restrack.c b/drivers/infiniband/core/restrack.c
index a07665f7ef8c..62fbb0ae9cb4 100644
--- a/drivers/infiniband/core/restrack.c
+++ b/drivers/infiniband/core/restrack.c
@@ -116,11 +116,8 @@ int rdma_restrack_count(struct ib_device *dev, enum rdma_restrack_type type)
 	u32 cnt = 0;

 	xa_lock(&rt->xa);
-	xas_for_each(&xas, e, U32_MAX) {
-		if (!rdma_is_visible_in_pid_ns(e))
-			continue;
+	xas_for_each(&xas, e, U32_MAX)
 		cnt++;
-	}
 	xa_unlock(&rt->xa);
 	return cnt;
 }
@@ -346,18 +343,3 @@ void rdma_restrack_del(struct rdma_restrack_entry *res)
 	}
 }
 EXPORT_SYMBOL(rdma_restrack_del);
-
-bool rdma_is_visible_in_pid_ns(struct rdma_restrack_entry *res)
-{
-	/*
-	 * 1. Kern resources should be visible in init
-	 *    namespace only
-	 * 2. Present only resources visible in the current
-	 *     namespace
-	 */
-	if (rdma_is_kernel_res(res))
-		return task_active_pid_ns(current) == &init_pid_ns;
-
-	/* PID 0 means that resource is not found in current namespace */
-	return task_pid_vnr(res->task);
-}
diff --git a/drivers/infiniband/core/restrack.h b/drivers/infiniband/core/restrack.h
index 7bd177cc0a61..d084e5f89849 100644
--- a/drivers/infiniband/core/restrack.h
+++ b/drivers/infiniband/core/restrack.h
@@ -27,5 +27,4 @@ int rdma_restrack_init(struct ib_device *dev);
 void rdma_restrack_clean(struct ib_device *dev);
 void rdma_restrack_attach_task(struct rdma_restrack_entry *res,
 			       struct task_struct *task);
-bool rdma_is_visible_in_pid_ns(struct rdma_restrack_entry *res);
 #endif /* _RDMA_CORE_RESTRACK_H_ */
--
2.20.1


^ permalink raw reply related	[flat|nested] 7+ messages in thread

* [PATCH rdma-next 2/2] RDMA/core: Check that process is still alive before sending it to the users
  2019-10-02 12:32 [PATCH rdma-next 0/2] Remove PID namespaces support from restrack Leon Romanovsky
  2019-10-02 12:32 ` [PATCH rdma-next 1/2] RDMA/restrack: Remove PID namespace support Leon Romanovsky
@ 2019-10-02 12:32 ` Leon Romanovsky
  2019-10-07 18:58   ` Parav Pandit
  1 sibling, 1 reply; 7+ messages in thread
From: Leon Romanovsky @ 2019-10-02 12:32 UTC (permalink / raw)
  To: Doug Ledford, Jason Gunthorpe; +Cc: Leon Romanovsky, RDMA mailing list

From: Leon Romanovsky <leonro@mellanox.com>

The PID information can disappear asynchronically because task can be
killed and moved to zombie state. In such case, PID will be zero in
similar way to the kernel tasks. Recognize such situation where
we are asking to return orphaned object and simply skip filling
PID attribute.

As part of this change, document the same scenario in counter.c code.

Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
---
 drivers/infiniband/core/counters.c | 14 ++++++++++++--
 drivers/infiniband/core/nldev.c    | 31 ++++++++++++++++++++++--------
 2 files changed, 35 insertions(+), 10 deletions(-)

diff --git a/drivers/infiniband/core/counters.c b/drivers/infiniband/core/counters.c
index 12ba2685abcf..47c551a0bcb0 100644
--- a/drivers/infiniband/core/counters.c
+++ b/drivers/infiniband/core/counters.c
@@ -149,8 +149,18 @@ static bool auto_mode_match(struct ib_qp *qp, struct rdma_counter *counter,
 	struct auto_mode_param *param = &counter->mode.param;
 	bool match = true;

-	/* Ensure that counter belongs to the right PID */
-	if (task_pid_nr(counter->res.task) != task_pid_nr(qp->res.task))
+	/*
+	 * Ensure that counter belongs to the right PID.
+	 * This operation can race with user space which kills
+	 * the process and leaves QP and counters orphans.
+	 *
+	 * It is not a big deal because exitted task will leave both
+	 * QP and counter in the same bucket of zombie process. Just ensure
+	 * that process is still alive before procedding.
+	 *
+	 */
+	if (task_pid_nr(counter->res.task) != task_pid_nr(qp->res.task) ||
+	    !task_pid_nr(qp->res.task))
 		return false;

 	if (auto_mask & RDMA_COUNTER_MASK_QP_TYPE)
diff --git a/drivers/infiniband/core/nldev.c b/drivers/infiniband/core/nldev.c
index 71bc08510064..c6fe0c52f6dc 100644
--- a/drivers/infiniband/core/nldev.c
+++ b/drivers/infiniband/core/nldev.c
@@ -399,20 +399,35 @@ static int fill_res_info(struct sk_buff *msg, struct ib_device *device)
 static int fill_res_name_pid(struct sk_buff *msg,
 			     struct rdma_restrack_entry *res)
 {
+	int err = 0;
+	pid_t pid;
+
 	/*
 	 * For user resources, user is should read /proc/PID/comm to get the
 	 * name of the task file.
 	 */
 	if (rdma_is_kernel_res(res)) {
-		if (nla_put_string(msg, RDMA_NLDEV_ATTR_RES_KERN_NAME,
-		    res->kern_name))
-			return -EMSGSIZE;
-	} else {
-		if (nla_put_u32(msg, RDMA_NLDEV_ATTR_RES_PID,
-		    task_pid_vnr(res->task)))
-			return -EMSGSIZE;
+		err = nla_put_string(msg, RDMA_NLDEV_ATTR_RES_KERN_NAME,
+				     res->kern_name);
+		goto out;
 	}
-	return 0;
+
+	pid = task_pid_vnr(res->task);
+	/*
+	 * PID == 0 returns in two scenarios:
+	 * 1. It is kernel task, but because we checked above, it won't be possible.
+	 * 2. Task is dead and in zombie state. There is no need to print PID anymore.
+	 */
+	if (pid)
+		/*
+		 * This part is racy, task can be killed and PID will be zero right
+		 * here but it is ok, next query won't return PID. We don't promise
+		 * real-time reflection of SW objects.
+		 */
+		err = nla_put_u32(msg, RDMA_NLDEV_ATTR_RES_PID, pid);
+
+out:
+	return err ? -EMSGSIZE : 0;
 }

 static bool fill_res_entry(struct ib_device *dev, struct sk_buff *msg,
--
2.20.1


^ permalink raw reply related	[flat|nested] 7+ messages in thread

* RE: [PATCH rdma-next 2/2] RDMA/core: Check that process is still alive before sending it to the users
  2019-10-02 12:32 ` [PATCH rdma-next 2/2] RDMA/core: Check that process is still alive before sending it to the users Leon Romanovsky
@ 2019-10-07 18:58   ` Parav Pandit
  2019-10-08  7:52     ` Leon Romanovsky
  2019-10-08 19:11     ` Jason Gunthorpe
  0 siblings, 2 replies; 7+ messages in thread
From: Parav Pandit @ 2019-10-07 18:58 UTC (permalink / raw)
  To: Leon Romanovsky, Doug Ledford, Jason Gunthorpe
  Cc: Leon Romanovsky, RDMA mailing list



> -----Original Message-----
> From: linux-rdma-owner@vger.kernel.org <linux-rdma-
> owner@vger.kernel.org> On Behalf Of Leon Romanovsky
> Sent: Wednesday, October 2, 2019 7:33 AM
> To: Doug Ledford <dledford@redhat.com>; Jason Gunthorpe
> <jgg@mellanox.com>
> Cc: Leon Romanovsky <leonro@mellanox.com>; RDMA mailing list <linux-
> rdma@vger.kernel.org>
> Subject: [PATCH rdma-next 2/2] RDMA/core: Check that process is still alive
> before sending it to the users
> 
> From: Leon Romanovsky <leonro@mellanox.com>
> 
> The PID information can disappear asynchronically because task can be killed
> and moved to zombie state. In such case, PID will be zero in similar way to the
> kernel tasks. Recognize such situation where we are asking to return orphaned
> object and simply skip filling PID attribute.
> 
> As part of this change, document the same scenario in counter.c code.
> 
> Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
> ---
>  drivers/infiniband/core/counters.c | 14 ++++++++++++--
>  drivers/infiniband/core/nldev.c    | 31 ++++++++++++++++++++++--------
>  2 files changed, 35 insertions(+), 10 deletions(-)
> 
> diff --git a/drivers/infiniband/core/counters.c
> b/drivers/infiniband/core/counters.c
> index 12ba2685abcf..47c551a0bcb0 100644
> --- a/drivers/infiniband/core/counters.c
> +++ b/drivers/infiniband/core/counters.c
> @@ -149,8 +149,18 @@ static bool auto_mode_match(struct ib_qp *qp, struct
> rdma_counter *counter,
>  	struct auto_mode_param *param = &counter->mode.param;
>  	bool match = true;
> 
> -	/* Ensure that counter belongs to the right PID */
> -	if (task_pid_nr(counter->res.task) != task_pid_nr(qp->res.task))
> +	/*
> +	 * Ensure that counter belongs to the right PID.
> +	 * This operation can race with user space which kills
> +	 * the process and leaves QP and counters orphans.
> +	 *
> +	 * It is not a big deal because exitted task will leave both
> +	 * QP and counter in the same bucket of zombie process. Just ensure
> +	 * that process is still alive before procedding.
> +	 *
> +	 */
> +	if (task_pid_nr(counter->res.task) != task_pid_nr(qp->res.task) ||
> +	    !task_pid_nr(qp->res.task))
>  		return false;
> 
>  	if (auto_mask & RDMA_COUNTER_MASK_QP_TYPE) diff --git
> a/drivers/infiniband/core/nldev.c b/drivers/infiniband/core/nldev.c index
> 71bc08510064..c6fe0c52f6dc 100644
> --- a/drivers/infiniband/core/nldev.c
> +++ b/drivers/infiniband/core/nldev.c
> @@ -399,20 +399,35 @@ static int fill_res_info(struct sk_buff *msg, struct
> ib_device *device)  static int fill_res_name_pid(struct sk_buff *msg,
>  			     struct rdma_restrack_entry *res)  {
> +	int err = 0;
> +	pid_t pid;
> +
>  	/*
>  	 * For user resources, user is should read /proc/PID/comm to get the
>  	 * name of the task file.
>  	 */
>  	if (rdma_is_kernel_res(res)) {
> -		if (nla_put_string(msg,
> RDMA_NLDEV_ATTR_RES_KERN_NAME,
> -		    res->kern_name))
> -			return -EMSGSIZE;
> -	} else {
> -		if (nla_put_u32(msg, RDMA_NLDEV_ATTR_RES_PID,
> -		    task_pid_vnr(res->task)))
> -			return -EMSGSIZE;
> +		err = nla_put_string(msg,
> RDMA_NLDEV_ATTR_RES_KERN_NAME,
> +				     res->kern_name);
> +		goto out;
>  	}
> -	return 0;
> +
> +	pid = task_pid_vnr(res->task);
> +	/*
> +	 * PID == 0 returns in two scenarios:
> +	 * 1. It is kernel task, but because we checked above, it won't be
> possible.
Please drop above comment point 1. See more below.

> +	 * 2. Task is dead and in zombie state. There is no need to print PID
> anymore.
> +	 */
> +	if (pid)
> +		/*
> +		 * This part is racy, task can be killed and PID will be zero right
> +		 * here but it is ok, next query won't return PID. We don't
> promise
> +		 * real-time reflection of SW objects.
> +		 */
> +		err = nla_put_u32(msg, RDMA_NLDEV_ATTR_RES_PID, pid);
> +
> +out:
> +	return err ? -EMSGSIZE : 0;
>  }

Below code reads better along with rest of the comments in the patch.

if (kern_resource) {
	err = nla_put_string(msg, RDMA_NLDEV_ATTR_RES_KERN_NAME,
			     res->kern_name);
} else {
	pid_t pid;

	pid = task_pid_vnr(res->task);
	if (pid)
		err = nla_put_u32(msg, RDMA_NLDEV_ATTR_RES_PID, pid);
}

> 
>  static bool fill_res_entry(struct ib_device *dev, struct sk_buff *msg,
> --
> 2.20.1


^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH rdma-next 2/2] RDMA/core: Check that process is still alive before sending it to the users
  2019-10-07 18:58   ` Parav Pandit
@ 2019-10-08  7:52     ` Leon Romanovsky
  2019-10-08 13:46       ` Parav Pandit
  2019-10-08 19:11     ` Jason Gunthorpe
  1 sibling, 1 reply; 7+ messages in thread
From: Leon Romanovsky @ 2019-10-08  7:52 UTC (permalink / raw)
  To: Parav Pandit; +Cc: Doug Ledford, Jason Gunthorpe, RDMA mailing list

On Mon, Oct 07, 2019 at 06:58:13PM +0000, Parav Pandit wrote:
>
>
> > -----Original Message-----
> > From: linux-rdma-owner@vger.kernel.org <linux-rdma-
> > owner@vger.kernel.org> On Behalf Of Leon Romanovsky
> > Sent: Wednesday, October 2, 2019 7:33 AM
> > To: Doug Ledford <dledford@redhat.com>; Jason Gunthorpe
> > <jgg@mellanox.com>
> > Cc: Leon Romanovsky <leonro@mellanox.com>; RDMA mailing list <linux-
> > rdma@vger.kernel.org>
> > Subject: [PATCH rdma-next 2/2] RDMA/core: Check that process is still alive
> > before sending it to the users
> >
> > From: Leon Romanovsky <leonro@mellanox.com>
> >
> > The PID information can disappear asynchronically because task can be killed
> > and moved to zombie state. In such case, PID will be zero in similar way to the
> > kernel tasks. Recognize such situation where we are asking to return orphaned
> > object and simply skip filling PID attribute.
> >
> > As part of this change, document the same scenario in counter.c code.
> >
> > Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
> > ---
> >  drivers/infiniband/core/counters.c | 14 ++++++++++++--
> >  drivers/infiniband/core/nldev.c    | 31 ++++++++++++++++++++++--------
> >  2 files changed, 35 insertions(+), 10 deletions(-)
> >
> > diff --git a/drivers/infiniband/core/counters.c
> > b/drivers/infiniband/core/counters.c
> > index 12ba2685abcf..47c551a0bcb0 100644
> > --- a/drivers/infiniband/core/counters.c
> > +++ b/drivers/infiniband/core/counters.c
> > @@ -149,8 +149,18 @@ static bool auto_mode_match(struct ib_qp *qp, struct
> > rdma_counter *counter,
> >  	struct auto_mode_param *param = &counter->mode.param;
> >  	bool match = true;
> >
> > -	/* Ensure that counter belongs to the right PID */
> > -	if (task_pid_nr(counter->res.task) != task_pid_nr(qp->res.task))
> > +	/*
> > +	 * Ensure that counter belongs to the right PID.
> > +	 * This operation can race with user space which kills
> > +	 * the process and leaves QP and counters orphans.
> > +	 *
> > +	 * It is not a big deal because exitted task will leave both
> > +	 * QP and counter in the same bucket of zombie process. Just ensure
> > +	 * that process is still alive before procedding.
> > +	 *
> > +	 */
> > +	if (task_pid_nr(counter->res.task) != task_pid_nr(qp->res.task) ||
> > +	    !task_pid_nr(qp->res.task))
> >  		return false;
> >
> >  	if (auto_mask & RDMA_COUNTER_MASK_QP_TYPE) diff --git
> > a/drivers/infiniband/core/nldev.c b/drivers/infiniband/core/nldev.c index
> > 71bc08510064..c6fe0c52f6dc 100644
> > --- a/drivers/infiniband/core/nldev.c
> > +++ b/drivers/infiniband/core/nldev.c
> > @@ -399,20 +399,35 @@ static int fill_res_info(struct sk_buff *msg, struct
> > ib_device *device)  static int fill_res_name_pid(struct sk_buff *msg,
> >  			     struct rdma_restrack_entry *res)  {
> > +	int err = 0;
> > +	pid_t pid;
> > +
> >  	/*
> >  	 * For user resources, user is should read /proc/PID/comm to get the
> >  	 * name of the task file.
> >  	 */
> >  	if (rdma_is_kernel_res(res)) {
> > -		if (nla_put_string(msg,
> > RDMA_NLDEV_ATTR_RES_KERN_NAME,
> > -		    res->kern_name))
> > -			return -EMSGSIZE;
> > -	} else {
> > -		if (nla_put_u32(msg, RDMA_NLDEV_ATTR_RES_PID,
> > -		    task_pid_vnr(res->task)))
> > -			return -EMSGSIZE;
> > +		err = nla_put_string(msg,
> > RDMA_NLDEV_ATTR_RES_KERN_NAME,
> > +				     res->kern_name);
> > +		goto out;
> >  	}
> > -	return 0;
> > +
> > +	pid = task_pid_vnr(res->task);
> > +	/*
> > +	 * PID == 0 returns in two scenarios:
> > +	 * 1. It is kernel task, but because we checked above, it won't be
> > possible.
> Please drop above comment point 1. See more below.
>
> > +	 * 2. Task is dead and in zombie state. There is no need to print PID
> > anymore.
> > +	 */
> > +	if (pid)
> > +		/*
> > +		 * This part is racy, task can be killed and PID will be zero right
> > +		 * here but it is ok, next query won't return PID. We don't
> > promise
> > +		 * real-time reflection of SW objects.
> > +		 */
> > +		err = nla_put_u32(msg, RDMA_NLDEV_ATTR_RES_PID, pid);
> > +
> > +out:
> > +	return err ? -EMSGSIZE : 0;
> >  }
>
> Below code reads better along with rest of the comments in the patch.
>
> if (kern_resource) {
> 	err = nla_put_string(msg, RDMA_NLDEV_ATTR_RES_KERN_NAME,
> 			     res->kern_name);
> } else {
> 	pid_t pid;
>
> 	pid = task_pid_vnr(res->task);
> 	if (pid)
> 		err = nla_put_u32(msg, RDMA_NLDEV_ATTR_RES_PID, pid);
> }

Why do you think that nested "if" reads better?

Thanks

>
> >
> >  static bool fill_res_entry(struct ib_device *dev, struct sk_buff *msg,
> > --
> > 2.20.1
>

^ permalink raw reply	[flat|nested] 7+ messages in thread

* RE: [PATCH rdma-next 2/2] RDMA/core: Check that process is still alive before sending it to the users
  2019-10-08  7:52     ` Leon Romanovsky
@ 2019-10-08 13:46       ` Parav Pandit
  0 siblings, 0 replies; 7+ messages in thread
From: Parav Pandit @ 2019-10-08 13:46 UTC (permalink / raw)
  To: Leon Romanovsky; +Cc: Doug Ledford, Jason Gunthorpe, RDMA mailing list



> -----Original Message-----
> From: Leon Romanovsky <leon@kernel.org>
> Sent: Tuesday, October 8, 2019 2:52 AM
> To: Parav Pandit <parav@mellanox.com>
> Cc: Doug Ledford <dledford@redhat.com>; Jason Gunthorpe
> <jgg@mellanox.com>; RDMA mailing list <linux-rdma@vger.kernel.org>
> Subject: Re: [PATCH rdma-next 2/2] RDMA/core: Check that process is still
> alive before sending it to the users
> 
> On Mon, Oct 07, 2019 at 06:58:13PM +0000, Parav Pandit wrote:
> >
> >
> > > -----Original Message-----
> > > From: linux-rdma-owner@vger.kernel.org <linux-rdma-
> > > owner@vger.kernel.org> On Behalf Of Leon Romanovsky
> > > Sent: Wednesday, October 2, 2019 7:33 AM
> > > To: Doug Ledford <dledford@redhat.com>; Jason Gunthorpe
> > > <jgg@mellanox.com>
> > > Cc: Leon Romanovsky <leonro@mellanox.com>; RDMA mailing list <linux-
> > > rdma@vger.kernel.org>
> > > Subject: [PATCH rdma-next 2/2] RDMA/core: Check that process is
> > > still alive before sending it to the users
> > >
> > > From: Leon Romanovsky <leonro@mellanox.com>
> > >
> > > The PID information can disappear asynchronically because task can
> > > be killed and moved to zombie state. In such case, PID will be zero
> > > in similar way to the kernel tasks. Recognize such situation where
> > > we are asking to return orphaned object and simply skip filling PID
> attribute.
> > >
> > > As part of this change, document the same scenario in counter.c code.
> > >
> > > Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
> > > ---
> > >  drivers/infiniband/core/counters.c | 14 ++++++++++++--
> > >  drivers/infiniband/core/nldev.c    | 31 ++++++++++++++++++++++--------
> > >  2 files changed, 35 insertions(+), 10 deletions(-)
> > >
> > > diff --git a/drivers/infiniband/core/counters.c
> > > b/drivers/infiniband/core/counters.c
> > > index 12ba2685abcf..47c551a0bcb0 100644
> > > --- a/drivers/infiniband/core/counters.c
> > > +++ b/drivers/infiniband/core/counters.c
> > > @@ -149,8 +149,18 @@ static bool auto_mode_match(struct ib_qp *qp,
> > > struct rdma_counter *counter,
> > >  	struct auto_mode_param *param = &counter->mode.param;
> > >  	bool match = true;
> > >
> > > -	/* Ensure that counter belongs to the right PID */
> > > -	if (task_pid_nr(counter->res.task) != task_pid_nr(qp->res.task))
> > > +	/*
> > > +	 * Ensure that counter belongs to the right PID.
> > > +	 * This operation can race with user space which kills
> > > +	 * the process and leaves QP and counters orphans.
> > > +	 *
> > > +	 * It is not a big deal because exitted task will leave both
> > > +	 * QP and counter in the same bucket of zombie process. Just ensure
> > > +	 * that process is still alive before procedding.
> > > +	 *
> > > +	 */
> > > +	if (task_pid_nr(counter->res.task) != task_pid_nr(qp->res.task) ||
> > > +	    !task_pid_nr(qp->res.task))
> > >  		return false;
> > >
> > >  	if (auto_mask & RDMA_COUNTER_MASK_QP_TYPE) diff --git
> > > a/drivers/infiniband/core/nldev.c b/drivers/infiniband/core/nldev.c
> > > index 71bc08510064..c6fe0c52f6dc 100644
> > > --- a/drivers/infiniband/core/nldev.c
> > > +++ b/drivers/infiniband/core/nldev.c
> > > @@ -399,20 +399,35 @@ static int fill_res_info(struct sk_buff *msg,
> > > struct ib_device *device)  static int fill_res_name_pid(struct sk_buff
> *msg,
> > >  			     struct rdma_restrack_entry *res)  {
> > > +	int err = 0;
> > > +	pid_t pid;
> > > +
> > >  	/*
> > >  	 * For user resources, user is should read /proc/PID/comm to get the
> > >  	 * name of the task file.
> > >  	 */
> > >  	if (rdma_is_kernel_res(res)) {
> > > -		if (nla_put_string(msg,
> > > RDMA_NLDEV_ATTR_RES_KERN_NAME,
> > > -		    res->kern_name))
> > > -			return -EMSGSIZE;
> > > -	} else {
> > > -		if (nla_put_u32(msg, RDMA_NLDEV_ATTR_RES_PID,
> > > -		    task_pid_vnr(res->task)))
> > > -			return -EMSGSIZE;
> > > +		err = nla_put_string(msg,
> > > RDMA_NLDEV_ATTR_RES_KERN_NAME,
> > > +				     res->kern_name);
> > > +		goto out;
> > >  	}
> > > -	return 0;
> > > +
> > > +	pid = task_pid_vnr(res->task);
> > > +	/*
> > > +	 * PID == 0 returns in two scenarios:
> > > +	 * 1. It is kernel task, but because we checked above, it won't be
> > > possible.
> > Please drop above comment point 1. See more below.
> >
> > > +	 * 2. Task is dead and in zombie state. There is no need to print
> > > +PID
> > > anymore.
> > > +	 */
> > > +	if (pid)
> > > +		/*
> > > +		 * This part is racy, task can be killed and PID will be zero
> right
> > > +		 * here but it is ok, next query won't return PID. We don't
> > > promise
> > > +		 * real-time reflection of SW objects.
> > > +		 */
> > > +		err = nla_put_u32(msg, RDMA_NLDEV_ATTR_RES_PID, pid);
> > > +
> > > +out:
> > > +	return err ? -EMSGSIZE : 0;
> > >  }
> >
> > Below code reads better along with rest of the comments in the patch.
> >
> > if (kern_resource) {
> > 	err = nla_put_string(msg, RDMA_NLDEV_ATTR_RES_KERN_NAME,
> > 			     res->kern_name);
> > } else {
> > 	pid_t pid;
> >
> > 	pid = task_pid_vnr(res->task);
> > 	if (pid)
> > 		err = nla_put_u32(msg, RDMA_NLDEV_ATTR_RES_PID, pid); }
> 
> Why do you think that nested "if" reads better?
> 
Because pid access is required only for non-kernel resource.
Hence it shouldn't be called for kernel resource; it doesn't matter it return zero or not.

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH rdma-next 2/2] RDMA/core: Check that process is still alive before sending it to the users
  2019-10-07 18:58   ` Parav Pandit
  2019-10-08  7:52     ` Leon Romanovsky
@ 2019-10-08 19:11     ` Jason Gunthorpe
  1 sibling, 0 replies; 7+ messages in thread
From: Jason Gunthorpe @ 2019-10-08 19:11 UTC (permalink / raw)
  To: Parav Pandit
  Cc: Leon Romanovsky, Doug Ledford, Leon Romanovsky, RDMA mailing list

On Mon, Oct 07, 2019 at 06:58:13PM +0000, Parav Pandit wrote:
> > +	 * 2. Task is dead and in zombie state. There is no need to print PID
> > anymore.
> > +	 */
> > +	if (pid)
> > +		/*
> > +		 * This part is racy, task can be killed and PID will be zero right
> > +		 * here but it is ok, next query won't return PID. We don't
> > promise
> > +		 * real-time reflection of SW objects.
> > +		 */
> > +		err = nla_put_u32(msg, RDMA_NLDEV_ATTR_RES_PID, pid);
> > +
> > +out:
> > +	return err ? -EMSGSIZE : 0;
> >  }
> 
> Below code reads better along with rest of the comments in the patch.
> 
> if (kern_resource) {
> 	err = nla_put_string(msg, RDMA_NLDEV_ATTR_RES_KERN_NAME,
> 			     res->kern_name);
> } else {
> 	pid_t pid;
> 
> 	pid = task_pid_vnr(res->task);
> 	if (pid)
> 		err = nla_put_u32(msg, RDMA_NLDEV_ATTR_RES_PID, pid);
> }

I tend to agree, that the pid == 0 happens for !kernel is pretty
indirect

Jason

^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2019-10-08 19:11 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-10-02 12:32 [PATCH rdma-next 0/2] Remove PID namespaces support from restrack Leon Romanovsky
2019-10-02 12:32 ` [PATCH rdma-next 1/2] RDMA/restrack: Remove PID namespace support Leon Romanovsky
2019-10-02 12:32 ` [PATCH rdma-next 2/2] RDMA/core: Check that process is still alive before sending it to the users Leon Romanovsky
2019-10-07 18:58   ` Parav Pandit
2019-10-08  7:52     ` Leon Romanovsky
2019-10-08 13:46       ` Parav Pandit
2019-10-08 19:11     ` Jason Gunthorpe

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).