* [RFC PATCH rdma-next 00/14] RDMA resource tracking
@ 2017-12-21 18:17 Leon Romanovsky
       [not found] ` <20171221181748.17126-1-leon-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
  0 siblings, 1 reply; 22+ messages in thread
From: Leon Romanovsky @ 2017-12-21 18:17 UTC (permalink / raw)
  To: Doug Ledford, Jason Gunthorpe
  Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA, Mark Bloch, Steve Wise

Hi,

I would like to start a discussion of the following series. It is marked
as RFC because its verification is not complete yet; however, it is stable
enough to share with you. It works for me and Steve :).

The user space part was uploaded to [1], and I'll publish it early next week.

The original goal of this series was to make it possible to view connection
(QP) information about running processes; however, I used this opportunity to
create a common infrastructure to track and report various resources. The
reporting part is implemented in netlink (nldev), but smart ULPs can now build
advanced usage models based on device utilization.

The current implementation relies on one lock per object type per device, so
creating and destroying various objects (CQ, PD, etc.) on different devices
or on the same device don't interfere with one another.

The data protection is performed with SRCU, and its reader-writer model
ensures that a resource won't be destroyed until readers have finished their
work.
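
The locking granularity described above can be pictured with a small user
space sketch. It is not the code from this series: the names (res_root,
res_track, res_untrack) are made up for illustration, pthread mutexes stand
in for kernel mutexes, and the SRCU part is not modeled at all.

```c
#include <pthread.h>
#include <stddef.h>

/* Hypothetical resource types, loosely mirroring PD, CQ and QP. */
enum res_type { RES_PD, RES_CQ, RES_QP, RES_MAX };

/* Per-device root: one lock and one counter per resource type. */
struct res_root {
	pthread_mutex_t lock[RES_MAX];
	int count[RES_MAX];
};

void res_root_init(struct res_root *root)
{
	for (int i = 0; i < RES_MAX; i++) {
		pthread_mutex_init(&root->lock[i], NULL);
		root->count[i] = 0;
	}
}

/* Takes only the lock of the affected type on the affected device. */
void res_track(struct res_root *root, enum res_type type)
{
	pthread_mutex_lock(&root->lock[type]);
	root->count[type]++;
	pthread_mutex_unlock(&root->lock[type]);
}

void res_untrack(struct res_root *root, enum res_type type)
{
	pthread_mutex_lock(&root->lock[type]);
	root->count[type]--;
	pthread_mutex_unlock(&root->lock[type]);
}
```

Because every (device, type) pair has its own mutex, tracking a CQ on one
device never waits for a PD on the same device or for anything on another
device.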

Such scheme,

Possible future work will include:
 * Reducing the number of locks in RDMA, thanks to SRCU.
 * Converting CMA to be based completely on resource tracking.
 * Adding other objects and extending the current ones to give a full
   and detailed view of the state of the RDMA kernel stack.
 * Replacing synchronize_srcu with call_srcu to make the destroy flow
   non-blocking.
 * Providing a reliable device reset flow that preserves resource
   creation ordering.

The patches are available in the git repository at:
  git.kernel.org/pub/scm/linux/kernel/git/leon/linux-rdma.git topic/restrack-srcu

[1]
https://git.kernel.org/pub/scm/linux/kernel/git/leon/iproute2.git/log/?h=topic/restrack

	Thanks

CC: Steve Wise <swise-7bPotxP6k4+P2YhJcF5u+vpXobYPEAuW@public.gmane.org>
---------------------------------------

Leon Romanovsky (14):
  RDMA/netlink: Simplify code of autoload modules
  RDMA/core: Enforce requirement to hold lists_rwsem semaphore
  RDMA/core: Replace open-coded variant of put_device
  RDMA/nldev: Refactor nldev handle to be common function
  RDMA/core: Provide locked variant of device name to index function
  RDMA/netlink: Protect device query from device removal
  RDMA/nldev: Protect port query from accidental device removal
  RDMA/restrack: Add general infrastructure to track RDMA resources
  RDMA/core: Add helper function to create named QPs
  RDMA: Annotate create QP callers
  RDMA/core: Add resource tracking for create and destroy CQs
  RDMA/core: Add resource tracking for create and destroy PDs
  RDMA/nldev: Provide global resource utilization
  RDMA/nldev: Provide detailed QP information

 drivers/infiniband/core/Makefile           |   2 +-
 drivers/infiniband/core/cma.c              |   1 +
 drivers/infiniband/core/core_priv.h        |  21 ++
 drivers/infiniband/core/cq.c               |   3 +
 drivers/infiniband/core/device.c           |  32 +-
 drivers/infiniband/core/mad.c              |   1 +
 drivers/infiniband/core/netlink.c          |   8 +-
 drivers/infiniband/core/nldev.c            | 454 +++++++++++++++++++++++++++--
 drivers/infiniband/core/restrack.c         | 177 +++++++++++
 drivers/infiniband/core/uverbs_cmd.c       |   5 +-
 drivers/infiniband/core/uverbs_std_types.c |   2 +
 drivers/infiniband/core/verbs.c            |   8 +-
 drivers/infiniband/hw/mlx4/mad.c           |   1 +
 drivers/infiniband/hw/mlx4/qp.c            |   1 +
 drivers/infiniband/hw/mlx5/gsi.c           |   2 +
 drivers/infiniband/ulp/ipoib/ipoib_cm.c    |   4 +-
 drivers/infiniband/ulp/ipoib/ipoib_verbs.c |   1 +
 drivers/infiniband/ulp/srp/ib_srp.c        |   1 +
 drivers/infiniband/ulp/srpt/ib_srpt.c      |   1 +
 include/rdma/ib_verbs.h                    |  23 +-
 include/rdma/restrack.h                    | 149 ++++++++++
 include/uapi/rdma/rdma_netlink.h           |  54 ++++
 net/smc/smc_ib.c                           |   1 +
 23 files changed, 908 insertions(+), 44 deletions(-)
 create mode 100644 drivers/infiniband/core/restrack.c
 create mode 100644 include/rdma/restrack.h

--
2.15.1

--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


* [RFC PATCH rdma-next 01/14] RDMA/netlink: Simplify code of autoload modules
       [not found] ` <20171221181748.17126-1-leon-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
@ 2017-12-21 18:17   ` Leon Romanovsky
  2017-12-21 18:17   ` [RFC PATCH rdma-next 02/14] RDMA/core: Enforce requirement to hold lists_rwsem semaphore Leon Romanovsky
                     ` (14 subsequent siblings)
  15 siblings, 0 replies; 22+ messages in thread
From: Leon Romanovsky @ 2017-12-21 18:17 UTC (permalink / raw)
  To: Doug Ledford, Jason Gunthorpe
  Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA, Mark Bloch, Steve Wise,
	Leon Romanovsky

From: Leon Romanovsky <leonro-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>

The request_module() call is internally wrapped by CONFIG_MODULES,
so there is no need to check it in our RDMA code too.

Remove the #ifdef and read cb_table once, after the optional module load.
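
The stub pattern that makes the #ifdef at the call site unnecessary can be
sketched in plain C as below. This is an illustration only: HAVE_LOADER,
load_helper(), cb_table and is_type_valid() are made-up names, with
HAVE_LOADER standing in for CONFIG_MODULES and load_helper() for
request_module().

```c
#include <errno.h>
#include <stddef.h>

#define NUM_TYPES 4

/* Hypothetical callback table; NULL means "nothing registered". */
const char *cb_table[NUM_TYPES];

/*
 * HAVE_LOADER is a made-up switch standing in for CONFIG_MODULES.
 * Without it, load_helper() is a stub with the same signature, so
 * call sites need no #ifdef of their own.
 */
#ifdef HAVE_LOADER
int load_helper(unsigned int type)
{
	cb_table[type] = "loaded";
	return 0;
}
#else
static inline int load_helper(unsigned int type)
{
	(void)type;
	return -ENOSYS;
}
#endif

int is_type_valid(unsigned int type)
{
	if (type >= NUM_TYPES)
		return 0;

	if (!cb_table[type])
		load_helper(type);	/* may or may not register a handler */

	/* Read the table once, after the optional load attempt. */
	return cb_table[type] != NULL;
}
```

The caller looks the same whether or not the loader is compiled in; only the
result of the second table read differs.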

Reviewed-by: Mark Bloch <markb-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
Signed-off-by: Leon Romanovsky <leonro-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
---
 drivers/infiniband/core/netlink.c | 8 +++-----
 1 file changed, 3 insertions(+), 5 deletions(-)

diff --git a/drivers/infiniband/core/netlink.c b/drivers/infiniband/core/netlink.c
index 1fb72c356e36..fdff071c3784 100644
--- a/drivers/infiniband/core/netlink.c
+++ b/drivers/infiniband/core/netlink.c
@@ -83,15 +83,13 @@ static bool is_nl_valid(unsigned int type, unsigned int op)
 	if (!is_nl_msg_valid(type, op))
 		return false;

-	cb_table = rdma_nl_types[type].cb_table;
-#ifdef CONFIG_MODULES
-	if (!cb_table) {
+	if (!rdma_nl_types[type].cb_table) {
 		mutex_unlock(&rdma_nl_mutex);
 		request_module("rdma-netlink-subsys-%d", type);
 		mutex_lock(&rdma_nl_mutex);
-		cb_table = rdma_nl_types[type].cb_table;
 	}
-#endif
+
+	cb_table = rdma_nl_types[type].cb_table;

 	if (!cb_table || (!cb_table[op].dump && !cb_table[op].doit))
 		return false;
--
2.15.1


* [RFC PATCH rdma-next 02/14] RDMA/core: Enforce requirement to hold lists_rwsem semaphore
       [not found] ` <20171221181748.17126-1-leon-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
  2017-12-21 18:17   ` [RFC PATCH rdma-next 01/14] RDMA/netlink: Simplify code of autoload modules Leon Romanovsky
@ 2017-12-21 18:17   ` Leon Romanovsky
       [not found]     ` <20171221181748.17126-3-leon-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
  2017-12-21 18:17   ` [RFC PATCH rdma-next 03/14] RDMA/core: Replace open-coded variant of put_device Leon Romanovsky
                     ` (13 subsequent siblings)
  15 siblings, 1 reply; 22+ messages in thread
From: Leon Romanovsky @ 2017-12-21 18:17 UTC (permalink / raw)
  To: Doug Ledford, Jason Gunthorpe
  Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA, Mark Bloch, Steve Wise,
	Leon Romanovsky

From: Leon Romanovsky <leonro-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>

Add a comment and a run-time check to the __ib_device_get_by_index()
function to remind callers that they should hold the lists_rwsem semaphore.
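
A user space sketch of such a "warn once if the lock is not held" check is
shown below. All names are hypothetical; the list_locked flag stands in for
rwsem_is_locked(&lists_rwsem) and the one-shot message for WARN_ON_ONCE().

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

struct dev { unsigned int index; };

struct dev devices[] = { { 0 }, { 1 }, { 2 } };
bool list_locked;	/* stand-in for rwsem_is_locked(&lists_rwsem) */
int warnings;		/* how many times the check fired */

/* Caller should hold the list lock. */
struct dev *dev_lookup_locked(unsigned int index)
{
	static bool warned;

	/* Warn once, like WARN_ON_ONCE(), then carry on with the lookup. */
	if (!list_locked && !warned) {
		warned = true;
		warnings++;
		fprintf(stderr, "lookup called without the list lock\n");
	}

	for (size_t i = 0; i < sizeof(devices) / sizeof(devices[0]); i++)
		if (devices[i].index == index)
			return &devices[i];

	return NULL;
}
```

The check is advisory: the lookup still runs, and repeated violations do not
flood the log. Note that a check of this kind only tells that somebody holds
the lock, not that the current caller does.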

Reviewed-by: Mark Bloch <markb-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
Signed-off-by: Leon Romanovsky <leonro-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
---
 drivers/infiniband/core/device.c | 7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/drivers/infiniband/core/device.c b/drivers/infiniband/core/device.c
index 950310143ef2..7fe00a9b2318 100644
--- a/drivers/infiniband/core/device.c
+++ b/drivers/infiniband/core/device.c
@@ -134,10 +134,15 @@ static int ib_device_check_mandatory(struct ib_device *device)
 	return 0;
 }

+/*
+ * Caller to this function should hold lists_rwsem
+ */
 struct ib_device *__ib_device_get_by_index(u32 index)
 {
 	struct ib_device *device;

+	WARN_ON_ONCE(!rwsem_is_locked(&lists_rwsem));
+
 	list_for_each_entry(device, &device_list, core_list)
 		if (device->index == index)
 			return device;
@@ -526,8 +531,8 @@ int ib_register_device(struct ib_device *device,
 		if (!add_client_context(device, client) && client->add)
 			client->add(device);

-	device->index = __dev_new_index();
 	down_write(&lists_rwsem);
+	device->index = __dev_new_index();
 	list_add_tail(&device->core_list, &device_list);
 	up_write(&lists_rwsem);
 	mutex_unlock(&device_mutex);
--
2.15.1


* [RFC PATCH rdma-next 03/14] RDMA/core: Replace open-coded variant of put_device
       [not found] ` <20171221181748.17126-1-leon-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
  2017-12-21 18:17   ` [RFC PATCH rdma-next 01/14] RDMA/netlink: Simplify code of autoload modules Leon Romanovsky
  2017-12-21 18:17   ` [RFC PATCH rdma-next 02/14] RDMA/core: Enforce requirement to hold lists_rwsem semaphore Leon Romanovsky
@ 2017-12-21 18:17   ` Leon Romanovsky
  2017-12-21 18:17   ` [RFC PATCH rdma-next 04/14] RDMA/nldev: Refactor nldev handle to be common function Leon Romanovsky
                     ` (12 subsequent siblings)
  15 siblings, 0 replies; 22+ messages in thread
From: Leon Romanovsky @ 2017-12-21 18:17 UTC (permalink / raw)
  To: Doug Ledford, Jason Gunthorpe
  Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA, Mark Bloch, Steve Wise,
	Leon Romanovsky

From: Leon Romanovsky <leonro-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>

There is an existing function to decrease the reference counter
of the device, put_device(); let's use it.

Reviewed-by: Mark Bloch <markb-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
Signed-off-by: Leon Romanovsky <leonro-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
---
 drivers/infiniband/core/device.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/infiniband/core/device.c b/drivers/infiniband/core/device.c
index 7fe00a9b2318..cb69357a1909 100644
--- a/drivers/infiniband/core/device.c
+++ b/drivers/infiniband/core/device.c
@@ -277,7 +277,7 @@ void ib_dealloc_device(struct ib_device *device)
 {
 	WARN_ON(device->reg_state != IB_DEV_UNREGISTERED &&
 		device->reg_state != IB_DEV_UNINITIALIZED);
-	kobject_put(&device->dev.kobj);
+	put_device(&device->dev);
 }
 EXPORT_SYMBOL(ib_dealloc_device);

--
2.15.1


* [RFC PATCH rdma-next 04/14] RDMA/nldev: Refactor nldev handle to be common function
       [not found] ` <20171221181748.17126-1-leon-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
                     ` (2 preceding siblings ...)
  2017-12-21 18:17   ` [RFC PATCH rdma-next 03/14] RDMA/core: Replace open-coded variant of put_device Leon Romanovsky
@ 2017-12-21 18:17   ` Leon Romanovsky
  2017-12-21 18:17   ` [RFC PATCH rdma-next 05/14] RDMA/core: Provide locked variant of device name to index function Leon Romanovsky
                     ` (11 subsequent siblings)
  15 siblings, 0 replies; 22+ messages in thread
From: Leon Romanovsky @ 2017-12-21 18:17 UTC (permalink / raw)
  To: Doug Ledford, Jason Gunthorpe
  Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA, Mark Bloch, Steve Wise,
	Leon Romanovsky

From: Leon Romanovsky <leonro-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>

The NLDEV commands use the IB device index and name as a handle
for netlink communications. Put the relevant code into one function
so that it can be reused easily.
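
The shape of this refactoring can be sketched in user space as follows. The
message buffer and all function names here are hypothetical stand-ins, not
the netlink API; msg_put() plays the role of the nla_put_*() helpers and
reports -EMSGSIZE when the buffer is full.

```c
#include <errno.h>
#include <stdint.h>
#include <string.h>

struct msg { char buf[64]; size_t len; };
struct dev { uint32_t index; const char *name; };

/* Append n bytes, or fail with -EMSGSIZE if they do not fit. */
int msg_put(struct msg *m, const void *data, size_t n)
{
	if (n > sizeof(m->buf) - m->len)
		return -EMSGSIZE;
	memcpy(m->buf + m->len, data, n);
	m->len += n;
	return 0;
}

/* The shared "handle": device index followed by device name. */
int fill_handle(struct msg *m, const struct dev *d)
{
	if (msg_put(m, &d->index, sizeof(d->index)))
		return -EMSGSIZE;
	if (msg_put(m, d->name, strlen(d->name) + 1))
		return -EMSGSIZE;
	return 0;
}

/* Both fillers start with the same handle instead of duplicating it. */
int fill_dev_info(struct msg *m, const struct dev *d, uint32_t last_port)
{
	if (fill_handle(m, d))
		return -EMSGSIZE;
	return msg_put(m, &last_port, sizeof(last_port));
}

int fill_port_info(struct msg *m, const struct dev *d, uint32_t port)
{
	if (fill_handle(m, d))
		return -EMSGSIZE;
	return msg_put(m, &port, sizeof(port));
}
```

Any later command that needs to identify a device can call fill_handle()
first and then append its own attributes.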

Reviewed-by: Mark Bloch <markb-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
Signed-off-by: Leon Romanovsky <leonro-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
---
 drivers/infiniband/core/nldev.c | 20 ++++++++++++++------
 1 file changed, 14 insertions(+), 6 deletions(-)

diff --git a/drivers/infiniband/core/nldev.c b/drivers/infiniband/core/nldev.c
index 9a05245a1acf..2b631307349d 100644
--- a/drivers/infiniband/core/nldev.c
+++ b/drivers/infiniband/core/nldev.c
@@ -54,14 +54,23 @@ static const struct nla_policy nldev_policy[RDMA_NLDEV_ATTR_MAX] = {
 	[RDMA_NLDEV_ATTR_DEV_NODE_TYPE] = { .type = NLA_U8 },
 };

-static int fill_dev_info(struct sk_buff *msg, struct ib_device *device)
+static int fill_nldev_handle(struct sk_buff *msg, struct ib_device *device)
 {
-	char fw[IB_FW_VERSION_NAME_MAX];
-
 	if (nla_put_u32(msg, RDMA_NLDEV_ATTR_DEV_INDEX, device->index))
 		return -EMSGSIZE;
 	if (nla_put_string(msg, RDMA_NLDEV_ATTR_DEV_NAME, device->name))
 		return -EMSGSIZE;
+
+	return 0;
+}
+
+static int fill_dev_info(struct sk_buff *msg, struct ib_device *device)
+{
+	char fw[IB_FW_VERSION_NAME_MAX];
+
+	if (fill_nldev_handle(msg, device))
+		return -EMSGSIZE;
+
 	if (nla_put_u32(msg, RDMA_NLDEV_ATTR_PORT_INDEX, rdma_end_port(device)))
 		return -EMSGSIZE;

@@ -92,10 +101,9 @@ static int fill_port_info(struct sk_buff *msg,
 	struct ib_port_attr attr;
 	int ret;

-	if (nla_put_u32(msg, RDMA_NLDEV_ATTR_DEV_INDEX, device->index))
-		return -EMSGSIZE;
-	if (nla_put_string(msg, RDMA_NLDEV_ATTR_DEV_NAME, device->name))
+	if (fill_nldev_handle(msg, device))
 		return -EMSGSIZE;
+
 	if (nla_put_u32(msg, RDMA_NLDEV_ATTR_PORT_INDEX, port))
 		return -EMSGSIZE;

--
2.15.1


* [RFC PATCH rdma-next 05/14] RDMA/core: Provide locked variant of device name to index function
       [not found] ` <20171221181748.17126-1-leon-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
                     ` (3 preceding siblings ...)
  2017-12-21 18:17   ` [RFC PATCH rdma-next 04/14] RDMA/nldev: Refactor nldev handle to be common function Leon Romanovsky
@ 2017-12-21 18:17   ` Leon Romanovsky
  2017-12-21 18:17   ` [RFC PATCH rdma-next 06/14] RDMA/netlink: Protect device query from device removal Leon Romanovsky
                     ` (10 subsequent siblings)
  15 siblings, 0 replies; 22+ messages in thread
From: Leon Romanovsky @ 2017-12-21 18:17 UTC (permalink / raw)
  To: Doug Ledford, Jason Gunthorpe
  Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA, Mark Bloch, Steve Wise,
	Leon Romanovsky

From: Leon Romanovsky <leonro-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>

Add a self-contained variant of the device lookup-by-index function that
takes lists_rwsem itself and returns the device with a reference held.
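
The idea of a lookup that takes the lock itself and hands back a referenced
object can be sketched in user space as below. The names are hypothetical, a
pthread mutex stands in for lists_rwsem, and a plain counter updated under
that mutex stands in for get_device()/put_device().

```c
#include <pthread.h>
#include <stddef.h>

struct dev { unsigned int index; int refs; };

pthread_mutex_t list_lock = PTHREAD_MUTEX_INITIALIZER;
struct dev devices[] = { { 0, 1 }, { 1, 1 } };

/* Unlocked helper: the caller must hold list_lock. */
static struct dev *lookup_unlocked(unsigned int index)
{
	for (size_t i = 0; i < sizeof(devices) / sizeof(devices[0]); i++)
		if (devices[i].index == index)
			return &devices[i];
	return NULL;
}

/*
 * Self-contained variant: takes the lock, looks the device up and
 * grabs a reference. The caller must drop it with dev_put().
 */
struct dev *dev_get_by_index(unsigned int index)
{
	struct dev *d;

	pthread_mutex_lock(&list_lock);
	d = lookup_unlocked(index);
	if (d)
		d->refs++;	/* taken before the lock is released */
	pthread_mutex_unlock(&list_lock);
	return d;
}

void dev_put(struct dev *d)
{
	pthread_mutex_lock(&list_lock);
	d->refs--;
	pthread_mutex_unlock(&list_lock);
}
```

Taking the reference while the lock is still held is the important part:
once the lock is dropped, only the reference keeps the object alive.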

Reviewed-by: Mark Bloch <markb-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
Signed-off-by: Leon Romanovsky <leonro-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
---
 drivers/infiniband/core/core_priv.h |  1 +
 drivers/infiniband/core/device.c    | 16 ++++++++++++++++
 2 files changed, 17 insertions(+)

diff --git a/drivers/infiniband/core/core_priv.h b/drivers/infiniband/core/core_priv.h
index ded3850721e0..e71dd1814bf0 100644
--- a/drivers/infiniband/core/core_priv.h
+++ b/drivers/infiniband/core/core_priv.h
@@ -301,6 +301,7 @@ static inline int ib_mad_enforce_security(struct ib_mad_agent_private *map,
 #endif

 struct ib_device *__ib_device_get_by_index(u32 ifindex);
+struct ib_device *ib_device_get_by_index(u32 ifindex);
 /* RDMA device netlink */
 void nldev_init(void);
 void nldev_exit(void);
diff --git a/drivers/infiniband/core/device.c b/drivers/infiniband/core/device.c
index cb69357a1909..adf3a4ca038b 100644
--- a/drivers/infiniband/core/device.c
+++ b/drivers/infiniband/core/device.c
@@ -150,6 +150,22 @@ struct ib_device *__ib_device_get_by_index(u32 index)
 	return NULL;
 }

+/*
+ * Caller is responsible to return reference count by calling put_device()
+ */
+struct ib_device *ib_device_get_by_index(u32 index)
+{
+	struct ib_device *device;
+
+	down_write(&lists_rwsem);
+	device = __ib_device_get_by_index(index);
+	if (device)
+		get_device(&device->dev);
+
+	up_write(&lists_rwsem);
+	return device;
+}
+
 static struct ib_device *__ib_device_get_by_name(const char *name)
 {
 	struct ib_device *device;
--
2.15.1


* [RFC PATCH rdma-next 06/14] RDMA/netlink: Protect device query from device removal
       [not found] ` <20171221181748.17126-1-leon-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
                     ` (4 preceding siblings ...)
  2017-12-21 18:17   ` [RFC PATCH rdma-next 05/14] RDMA/core: Provide locked variant of device name to index function Leon Romanovsky
@ 2017-12-21 18:17   ` Leon Romanovsky
  2017-12-21 18:17   ` [RFC PATCH rdma-next 07/14] RDMA/nldev: Protect port query from accidental " Leon Romanovsky
                     ` (9 subsequent siblings)
  15 siblings, 0 replies; 22+ messages in thread
From: Leon Romanovsky @ 2017-12-21 18:17 UTC (permalink / raw)
  To: Doug Ledford, Jason Gunthorpe
  Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA, Mark Bloch, Steve Wise,
	Leon Romanovsky

From: Leon Romanovsky <leonro-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>

There is a chance that the device will be removed during device query
operations, which will cause a kernel panic in flows that don't hold
the lists_rwsem semaphore.
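
Holding a reference for the duration of a query means every exit path has to
drop it again. A generic user space sketch of a goto-based unwind for that is
shown below; it is not the nldev code, the names are hypothetical, and the
fail_alloc/fail_fill switches exist only to exercise the error paths. Each
failure sets its own error code before jumping.

```c
#include <errno.h>
#include <stdlib.h>

struct dev { int refs; };

/* Hypothetical fault-injection switches for the two failure points. */
int fail_alloc;
int fail_fill;

int query(struct dev *d)
{
	char *msg;
	int ret;

	d->refs++;		/* reference held while the query runs */

	msg = fail_alloc ? NULL : malloc(32);
	if (!msg) {
		ret = -ENOMEM;
		goto err;
	}

	ret = fail_fill ? -EMSGSIZE : 0;
	if (ret)
		goto err_free;

	free(msg);		/* a real handler would send the message here */
	d->refs--;
	return 0;

err_free:
	free(msg);
err:
	d->refs--;		/* the reference is dropped on every path */
	return ret;
}
```

The labels are ordered so that a later failure falls through the cleanup of
everything acquired before it.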

Fixes: e5c9469efcb1 ("RDMA/netlink: Add nldev device doit implementation")
Reviewed-by: Mark Bloch <markb-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
Signed-off-by: Leon Romanovsky <leonro-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
---
 drivers/infiniband/core/nldev.c | 25 +++++++++++++++----------
 1 file changed, 15 insertions(+), 10 deletions(-)

diff --git a/drivers/infiniband/core/nldev.c b/drivers/infiniband/core/nldev.c
index 2b631307349d..e3033d7a4029 100644
--- a/drivers/infiniband/core/nldev.c
+++ b/drivers/infiniband/core/nldev.c
@@ -141,36 +141,41 @@ static int nldev_get_doit(struct sk_buff *skb, struct nlmsghdr *nlh,
 	struct ib_device *device;
 	struct sk_buff *msg;
 	u32 index;
-	int err;
+	int ret = -ENOMEM;

-	err = nlmsg_parse(nlh, 0, tb, RDMA_NLDEV_ATTR_MAX - 1,
+	ret = nlmsg_parse(nlh, 0, tb, RDMA_NLDEV_ATTR_MAX - 1,
 			  nldev_policy, extack);
-	if (err || !tb[RDMA_NLDEV_ATTR_DEV_INDEX])
+	if (ret || !tb[RDMA_NLDEV_ATTR_DEV_INDEX])
 		return -EINVAL;

 	index = nla_get_u32(tb[RDMA_NLDEV_ATTR_DEV_INDEX]);

-	device = __ib_device_get_by_index(index);
+	device = ib_device_get_by_index(index);
 	if (!device)
 		return -EINVAL;

 	msg = nlmsg_new(NLMSG_DEFAULT_SIZE, GFP_KERNEL);
 	if (!msg)
-		return -ENOMEM;
+		goto err;

 	nlh = nlmsg_put(msg, NETLINK_CB(skb).portid, nlh->nlmsg_seq,
 			RDMA_NL_GET_TYPE(RDMA_NL_NLDEV, RDMA_NLDEV_CMD_GET),
 			0, 0);

-	err = fill_dev_info(msg, device);
-	if (err) {
-		nlmsg_free(msg);
-		return err;
-	}
+	ret = fill_dev_info(msg, device);
+	if (ret)
+		goto err_free;

 	nlmsg_end(msg, nlh);

+	put_device(&device->dev);
 	return rdma_nl_unicast(msg, NETLINK_CB(skb).portid);
+
+err_free:
+	nlmsg_free(msg);
+err:
+	put_device(&device->dev);
+	return ret;
 }

 static int _nldev_get_dumpit(struct ib_device *device,
--
2.15.1


* [RFC PATCH rdma-next 07/14] RDMA/nldev: Protect port query from accidental device removal
       [not found] ` <20171221181748.17126-1-leon-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
                     ` (5 preceding siblings ...)
  2017-12-21 18:17   ` [RFC PATCH rdma-next 06/14] RDMA/netlink: Protect device query from device removal Leon Romanovsky
@ 2017-12-21 18:17   ` Leon Romanovsky
  2017-12-21 18:17   ` [RFC PATCH rdma-next 08/14] RDMA/restrack: Add general infrastructure to track RDMA resources Leon Romanovsky
                     ` (8 subsequent siblings)
  15 siblings, 0 replies; 22+ messages in thread
From: Leon Romanovsky @ 2017-12-21 18:17 UTC (permalink / raw)
  To: Doug Ledford, Jason Gunthorpe
  Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA, Mark Bloch, Steve Wise,
	Leon Romanovsky

From: Leon Romanovsky <leonro-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>

There is a chance that the device will be removed during port query
operations, which will cause a kernel panic in flows that don't
hold the lists_rwsem semaphore.

Fixes: c3f66f7b0052 ("RDMA/netlink: Implement nldev port doit callback")
Fixes: 7d02f605f0dc ("RDMA/netlink: Add nldev port dumpit implementation")
Reviewed-by: Mark Bloch <markb-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
Signed-off-by: Leon Romanovsky <leonro-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
---
 drivers/infiniband/core/nldev.c | 37 +++++++++++++++++++++++--------------
 1 file changed, 23 insertions(+), 14 deletions(-)

diff --git a/drivers/infiniband/core/nldev.c b/drivers/infiniband/core/nldev.c
index e3033d7a4029..ed7e639e7dee 100644
--- a/drivers/infiniband/core/nldev.c
+++ b/drivers/infiniband/core/nldev.c
@@ -223,41 +223,48 @@ static int nldev_port_get_doit(struct sk_buff *skb, struct nlmsghdr *nlh,
 	struct sk_buff *msg;
 	u32 index;
 	u32 port;
-	int err;
+	int ret = -EINVAL;

-	err = nlmsg_parse(nlh, 0, tb, RDMA_NLDEV_ATTR_MAX - 1,
+	ret = nlmsg_parse(nlh, 0, tb, RDMA_NLDEV_ATTR_MAX - 1,
 			  nldev_policy, extack);
-	if (err ||
+	if (ret ||
 	    !tb[RDMA_NLDEV_ATTR_DEV_INDEX] ||
 	    !tb[RDMA_NLDEV_ATTR_PORT_INDEX])
 		return -EINVAL;

 	index = nla_get_u32(tb[RDMA_NLDEV_ATTR_DEV_INDEX]);
-	device = __ib_device_get_by_index(index);
+	device = ib_device_get_by_index(index);
 	if (!device)
 		return -EINVAL;

 	port = nla_get_u32(tb[RDMA_NLDEV_ATTR_PORT_INDEX]);
 	if (!rdma_is_port_valid(device, port))
-		return -EINVAL;
+		goto err;

 	msg = nlmsg_new(NLMSG_DEFAULT_SIZE, GFP_KERNEL);
-	if (!msg)
-		return -ENOMEM;
+	if (!msg) {
+		ret = -ENOMEM;
+		goto err;
+	}

 	nlh = nlmsg_put(msg, NETLINK_CB(skb).portid, nlh->nlmsg_seq,
 			RDMA_NL_GET_TYPE(RDMA_NL_NLDEV, RDMA_NLDEV_CMD_GET),
 			0, 0);

-	err = fill_port_info(msg, device, port);
-	if (err) {
-		nlmsg_free(msg);
-		return err;
-	}
+	ret = fill_port_info(msg, device, port);
+	if (ret)
+		goto err_free;

 	nlmsg_end(msg, nlh);
+	put_device(&device->dev);

 	return rdma_nl_unicast(msg, NETLINK_CB(skb).portid);
+
+err_free:
+	nlmsg_free(msg);
+err:
+	put_device(&device->dev);
+	return ret;
 }

 static int nldev_port_get_dumpit(struct sk_buff *skb,
@@ -278,7 +285,7 @@ static int nldev_port_get_dumpit(struct sk_buff *skb,
 		return -EINVAL;

 	ifindex = nla_get_u32(tb[RDMA_NLDEV_ATTR_DEV_INDEX]);
-	device = __ib_device_get_by_index(ifindex);
+	device = ib_device_get_by_index(ifindex);
 	if (!device)
 		return -EINVAL;

@@ -312,7 +319,9 @@ static int nldev_port_get_dumpit(struct sk_buff *skb,
 		nlmsg_end(skb, nlh);
 	}

-out:	cb->args[0] = idx;
+out:
+	put_device(&device->dev);
+	cb->args[0] = idx;
 	return skb->len;
 }

--
2.15.1


* [RFC PATCH rdma-next 08/14] RDMA/restrack: Add general infrastructure to track RDMA resources
       [not found] ` <20171221181748.17126-1-leon-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
                     ` (6 preceding siblings ...)
  2017-12-21 18:17   ` [RFC PATCH rdma-next 07/14] RDMA/nldev: Protect port query from accidental " Leon Romanovsky
@ 2017-12-21 18:17   ` Leon Romanovsky
  2017-12-21 18:17   ` [RFC PATCH rdma-next 09/14] RDMA/core: Add helper function to create named QPs Leon Romanovsky
                     ` (7 subsequent siblings)
  15 siblings, 0 replies; 22+ messages in thread
From: Leon Romanovsky @ 2017-12-21 18:17 UTC (permalink / raw)
  To: Doug Ledford, Jason Gunthorpe
  Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA, Mark Bloch, Steve Wise,
	Leon Romanovsky

From: Leon Romanovsky <leonro-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>

The RDMA subsystem has a very strict set of objects to work on,
but it completely lacks tracking facilities and has no visibility into
resource utilization.

The following patch adds such infrastructure to keep track of RDMA
resources to help with debugging of user space applications. The primary
user of this infrastructure is RDMA nldev netlink (following patches),
but it is not limited to it.

At this stage, the main three objects (PD, CQ and QP) are added,
and more will be added later.

There are four new functions in use by RDMA/core:
 * rdma_restrack_init(...)   - initializes restrack database
 * rdma_restrack_clean(...)  - cleans restrack database
 * rdma_restrack_add(...)    - adds object to be tracked
 * rdma_restrack_del(...)    - removes object from tracking

Three functions and one iterator are visible to kernel users:
 * rdma_restrack_count(...)  - returns number of allocated objects of
                               specific type
 * rdma_restrack_lock(...)   - lock primitive to protect access to list
                               of resources
 * rdma_restrack_unlock(...) - unlock primitive to protect access to list
                               of resources
 * for_each_res_safe(...)    - iterates over all relevant objects in
                               the restrack database
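
The init/add/del/count life cycle listed above can be modeled in user space
with the sketch below. It is a simplification under stated assumptions: the
names are hypothetical, there is no locking or SRCU, and a plain singly
linked list replaces the kernel list. The counter starts at 1 and the count
is reported minus one, as in the patch, presumably because refcount_t treats
an increment from zero as an error. Unlike the patch, the sketch clears the
valid flag on delete so that a second delete is a no-op.

```c
#include <stdbool.h>
#include <stddef.h>

enum res_type { RES_PD, RES_CQ, RES_QP, RES_MAX };

struct res_entry {
	struct res_entry *next;
	bool valid;
};

struct res_root {
	int cnt[RES_MAX];
	struct res_entry *head[RES_MAX];
};

void res_init(struct res_root *root)
{
	for (int i = 0; i < RES_MAX; i++) {
		root->cnt[i] = 1;	/* biased by one, as in the patch */
		root->head[i] = NULL;
	}
}

int res_count(const struct res_root *root, enum res_type type)
{
	if (type >= RES_MAX)
		return 0;
	return root->cnt[type] - 1;	/* undo the initial bias */
}

void res_add(struct res_root *root, struct res_entry *e, enum res_type type)
{
	if (type >= RES_MAX)
		return;
	root->cnt[type]++;
	e->next = root->head[type];
	root->head[type] = e;
	e->valid = true;
}

void res_del(struct res_root *root, struct res_entry *e, enum res_type type)
{
	struct res_entry **p;

	if (type >= RES_MAX || !e->valid)
		return;

	root->cnt[type]--;
	for (p = &root->head[type]; *p; p = &(*p)->next) {
		if (*p == e) {
			*p = e->next;
			break;
		}
	}
	e->valid = false;	/* sketch-only: makes a repeated delete harmless */
}
```

A reporting path would call res_count() for a summary and walk head[type]
for per-object details.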

Reviewed-by: Mark Bloch <markb-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
Signed-off-by: Leon Romanovsky <leonro-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
---
 drivers/infiniband/core/Makefile    |   2 +-
 drivers/infiniband/core/core_priv.h |   1 +
 drivers/infiniband/core/device.c    |   7 ++
 drivers/infiniband/core/restrack.c  | 177 ++++++++++++++++++++++++++++++++++++
 include/rdma/ib_verbs.h             |  17 +++-
 include/rdma/restrack.h             | 149 ++++++++++++++++++++++++++++++
 6 files changed, 351 insertions(+), 2 deletions(-)
 create mode 100644 drivers/infiniband/core/restrack.c
 create mode 100644 include/rdma/restrack.h

diff --git a/drivers/infiniband/core/Makefile b/drivers/infiniband/core/Makefile
index 504b926552c6..f69833db0a32 100644
--- a/drivers/infiniband/core/Makefile
+++ b/drivers/infiniband/core/Makefile
@@ -12,7 +12,7 @@ ib_core-y :=			packer.o ud_header.o verbs.o cq.o rw.o sysfs.o \
 				device.o fmr_pool.o cache.o netlink.o \
 				roce_gid_mgmt.o mr_pool.o addr.o sa_query.o \
 				multicast.o mad.o smi.o agent.o mad_rmpp.o \
-				security.o nldev.o
+				security.o nldev.o restrack.o

 ib_core-$(CONFIG_INFINIBAND_USER_MEM) += umem.o
 ib_core-$(CONFIG_INFINIBAND_ON_DEMAND_PAGING) += umem_odp.o
diff --git a/drivers/infiniband/core/core_priv.h b/drivers/infiniband/core/core_priv.h
index e71dd1814bf0..1fe2b92fe357 100644
--- a/drivers/infiniband/core/core_priv.h
+++ b/drivers/infiniband/core/core_priv.h
@@ -40,6 +40,7 @@
 #include <rdma/ib_verbs.h>
 #include <rdma/opa_addr.h>
 #include <rdma/ib_mad.h>
+#include <rdma/restrack.h>
 #include "mad_priv.h"

 struct pkey_index_qp_list {
diff --git a/drivers/infiniband/core/device.c b/drivers/infiniband/core/device.c
index adf3a4ca038b..0e8955f62e5a 100644
--- a/drivers/infiniband/core/device.c
+++ b/drivers/infiniband/core/device.c
@@ -268,6 +268,11 @@ struct ib_device *ib_alloc_device(size_t size)
 	if (!device)
 		return NULL;

+	if (rdma_restrack_init(&device->res)) {
+		kfree(device);
+		return NULL;
+	}
+
 	device->dev.class = &ib_class;
 	device_initialize(&device->dev);

@@ -593,6 +598,8 @@ void ib_unregister_device(struct ib_device *device)
 	}
 	up_read(&lists_rwsem);

+	rdma_restrack_clean(&device->res);
+
 	ib_device_unregister_rdmacg(device);
 	ib_device_unregister_sysfs(device);

diff --git a/drivers/infiniband/core/restrack.c b/drivers/infiniband/core/restrack.c
new file mode 100644
index 000000000000..7a25e3aa43b3
--- /dev/null
+++ b/drivers/infiniband/core/restrack.c
@@ -0,0 +1,177 @@
+/*
+ * Copyright (c) 2017 Mellanox Technologies. All rights reserved.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions are met:
+ *
+ * 1. Redistributions of source code must retain the above copyright
+ *    notice, this list of conditions and the following disclaimer.
+ * 2. Redistributions in binary form must reproduce the above copyright
+ *    notice, this list of conditions and the following disclaimer in the
+ *    documentation and/or other materials provided with the distribution.
+ * 3. Neither the names of the copyright holders nor the names of its
+ *    contributors may be used to endorse or promote products derived from
+ *    this software without specific prior written permission.
+ *
+ * Alternatively, this software may be distributed under the terms of the
+ * GNU General Public License ("GPL") version 2 as published by the Free
+ * Software Foundation.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
+ * AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
+ * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
+ * ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE
+ * LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
+ * CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
+ * SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS
+ * INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN
+ * CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
+ * ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
+ * POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#include <rdma/ib_verbs.h>
+#include <rdma/restrack.h>
+#include <linux/rculist.h>
+
+int rdma_restrack_init(struct rdma_restrack_root *res)
+{
+	int i = 0;
+
+	for (; i < _RDMA_RESTRACK_MAX; i++) {
+		refcount_set(&res->cnt[i], 1);
+		INIT_LIST_HEAD_RCU(&res->list[i]);
+		mutex_init(&res->lock[i]);
+	}
+
+	return 0;
+}
+
+void rdma_restrack_clean(struct rdma_restrack_root *res)
+{
+	int i = 0;
+
+	for (; i < _RDMA_RESTRACK_MAX; i++) {
+		WARN_ON_ONCE(!refcount_dec_and_test(&res->cnt[i]));
+		WARN_ON_ONCE(!list_empty(&res->list[i]));
+	}
+}
+
+static bool is_restrack_valid(enum rdma_restrack_obj type)
+{
+	return !(type >= _RDMA_RESTRACK_MAX);
+}
+
+int rdma_restrack_count(struct rdma_restrack_root *res,
+			enum rdma_restrack_obj type)
+{
+	if (!is_restrack_valid(type))
+		return 0;
+
+	/*
+	 * The counter was initialized to 1 at the beginning.
+	 */
+	return refcount_read(&res->cnt[type]) - 1;
+}
+EXPORT_SYMBOL(rdma_restrack_count);
+
+void rdma_restrack_add(struct rdma_restrack_entry *res,
+		       enum rdma_restrack_obj type, const char *comm)
+{
+	struct ib_device *dev;
+	struct ib_pd *pd;
+	struct ib_cq *cq;
+	struct ib_qp *qp;
+
+	if (!is_restrack_valid(type))
+		return;
+
+	switch (type) {
+	case RDMA_RESTRACK_PD:
+		pd = container_of(res, struct ib_pd, res);
+		dev = pd->device;
+		break;
+	case RDMA_RESTRACK_CQ:
+		cq = container_of(res, struct ib_cq, res);
+		dev = cq->device;
+		break;
+	case RDMA_RESTRACK_QP:
+		qp = container_of(res, struct ib_qp, res);
+		dev = qp->device;
+		break;
+	default:
+		/* unreachable */
+		return;
+	}
+
+	refcount_inc(&dev->res.cnt[type]);
+
+	if (!comm || !strlen(comm)) {
+		get_task_comm(res->task_comm, current);
+		/*
+		 * Return global PID
+		 */
+		res->pid = task_pid_nr(current);
+	} else {
+		/*
+		 * no need to set PID, it comes from
+		 * core kernel, so pid will be zero
+		 */
+		strncpy(res->task_comm, comm, TASK_COMM_LEN);
+	}
+	mutex_lock(&dev->res.lock[type]);
+	if (init_srcu_struct(&res->srcu))
+		/*
+		 * We do not return an error, because there is nothing
+		 * the caller can do about it at this point, and failing
+		 * the driver only because resource tracking failed is wrong.
+		 *
+		 * Simply leave this resource marked as not valid.
+		 */
+		goto out;
+
+	list_add(&res->list, &dev->res.list[type]);
+	res->valid = true;
+
+out:
+	mutex_unlock(&dev->res.lock[type]);
+}
+EXPORT_SYMBOL(rdma_restrack_add);
+
+void rdma_restrack_del(struct rdma_restrack_entry *res,
+		       enum rdma_restrack_obj type)
+{
+	struct ib_device *dev;
+	struct ib_pd *pd;
+	struct ib_cq *cq;
+	struct ib_qp *qp;
+
+	if (!is_restrack_valid(type) || !res->valid)
+		return;
+
+	switch (type) {
+	case RDMA_RESTRACK_PD:
+		pd = container_of(res, struct ib_pd, res);
+		dev = pd->device;
+		break;
+	case RDMA_RESTRACK_CQ:
+		cq = container_of(res, struct ib_cq, res);
+		dev = cq->device;
+		break;
+	case RDMA_RESTRACK_QP:
+		qp = container_of(res, struct ib_qp, res);
+		dev = qp->device;
+		break;
+	default:
+		/* unreachable */
+		return;
+	}
+
+	refcount_dec(&dev->res.cnt[type]);
+	mutex_lock(&dev->res.lock[type]);
+	list_del(&res->list);
+	mutex_unlock(&dev->res.lock[type]);
+	synchronize_srcu(&res->srcu);
+	cleanup_srcu_struct(&res->srcu);
+}
+EXPORT_SYMBOL(rdma_restrack_del);
diff --git a/include/rdma/ib_verbs.h b/include/rdma/ib_verbs.h
index c7c8032b1ecd..a2678e80c2a7 100644
--- a/include/rdma/ib_verbs.h
+++ b/include/rdma/ib_verbs.h
@@ -63,6 +63,7 @@
 #include <linux/uaccess.h>
 #include <linux/cgroup_rdma.h>
 #include <uapi/rdma/ib_user_verbs.h>
+#include <rdma/restrack.h>

 #define IB_FW_VERSION_NAME_MAX	ETHTOOL_FWVERS_LEN

@@ -1526,9 +1527,10 @@ struct ib_pd {
 	u32			unsafe_global_rkey;

 	/*
-	 * Implementation details of the RDMA core, don't use in drivers:
+	 * Implementation details of the RDMA core, don't use in the drivers
 	 */
 	struct ib_mr	       *__internal_mr;
+	struct rdma_restrack_entry res;
 };

 struct ib_xrcd {
@@ -1569,6 +1571,10 @@ struct ib_cq {
 		struct irq_poll		iop;
 		struct work_struct	work;
 	};
+	/*
+	 * Internal to RDMA/core, don't use in the drivers
+	 */
+	struct rdma_restrack_entry res;
 };

 struct ib_srq {
@@ -1745,6 +1751,11 @@ struct ib_qp {
 	struct ib_rwq_ind_table *rwq_ind_tbl;
 	struct ib_qp_security  *qp_sec;
 	u8			port;
+
+	/*
+	 * Internal to RDMA/core, don't use in the drivers
+	 */
+	struct rdma_restrack_entry     res;
 };

 struct ib_mr {
@@ -2351,6 +2362,10 @@ struct ib_device {
 #endif

 	u32                          index;
+	/*
+	 * Implementation details of the RDMA core, don't use in the drivers
+	 */
+	struct rdma_restrack_root     res;

 	/**
 	 * The following mandatory functions are used only at device
diff --git a/include/rdma/restrack.h b/include/rdma/restrack.h
new file mode 100644
index 000000000000..0e12346d2c2c
--- /dev/null
+++ b/include/rdma/restrack.h
@@ -0,0 +1,149 @@
+/*
+ * Copyright (c) 2017 Mellanox Technologies. All rights reserved.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions are met:
+ *
+ * 1. Redistributions of source code must retain the above copyright
+ *    notice, this list of conditions and the following disclaimer.
+ * 2. Redistributions in binary form must reproduce the above copyright
+ *    notice, this list of conditions and the following disclaimer in the
+ *    documentation and/or other materials provided with the distribution.
+ * 3. Neither the names of the copyright holders nor the names of its
+ *    contributors may be used to endorse or promote products derived from
+ *    this software without specific prior written permission.
+ *
+ * Alternatively, this software may be distributed under the terms of the
+ * GNU General Public License ("GPL") version 2 as published by the Free
+ * Software Foundation.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
+ * AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
+ * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
+ * ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE
+ * LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
+ * CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
+ * SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS
+ * INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN
+ * CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
+ * ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
+ * POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef _RDMA_RESTRACK_H_
+#define _RDMA_RESTRACK_H_
+
+#include <linux/typecheck.h>
+#include <linux/srcu.h>
+#include <linux/refcount.h>
+#include <linux/sched.h>
+
+/*
+ * HW objects to track
+ */
+enum rdma_restrack_obj {
+	RDMA_RESTRACK_PD,
+	RDMA_RESTRACK_CQ,
+	RDMA_RESTRACK_QP,
+	/* Always last, counts number of elements */
+	_RDMA_RESTRACK_MAX
+};
+
+/*
+ * Resource tracking management entity per restrack object
+ */
+struct rdma_restrack_root {
+	/*
+	 * Global counter to avoid the need to count number
+	 * of elements in the object's list.
+	 *
+	 * It can be different from list_count, because we don't
+	 * grab lock for the additions of new objects and don't
+	 * synchronize the RCU.
+	 */
+	refcount_t		cnt[_RDMA_RESTRACK_MAX];
+	struct list_head	list[_RDMA_RESTRACK_MAX];
+	/*
+	 * Internal lock to protect the add/delete list operations.
+	 */
+	struct mutex		lock[_RDMA_RESTRACK_MAX];
+};
+
+struct rdma_restrack_entry {
+	struct list_head	list;
+
+	/*
+	 * The entry is filled and marked valid in rdma_restrack_add();
+	 * rdma_restrack_del() may be called on an entry that was never added.
+	 *
+	 * As an example for that, see mlx5 QPs with type MLX5_IB_QPT_HW_GSI
+	 */
+	bool			valid;
+
+	/*
+	 * Sleepable RCU to protect object data.
+	 */
+	struct srcu_struct	srcu;
+
+	/*
+	 * Information for resource tracking,
+	 * Copied here to save locking of task_struct
+	 * while accessing this information from NLDEV
+	 */
+	pid_t                   pid;
+
+	/*
+	 * User can get this information from /proc/PID/comm file,
+	 * but it will create a lot of syscalls for reads for many QPs,
+	 * let's store it here to save work for users.
+	 */
+	char                    task_comm[TASK_COMM_LEN];
+};
+
+int rdma_restrack_init(struct rdma_restrack_root *res);
+void rdma_restrack_clean(struct rdma_restrack_root *res);
+
+/*
+ * Iterator - use rdma_restrack_lock/rdma_restrack_unlock to protect it
+ */
+#define for_each_res_safe(r, n, type, dev) \
+	list_for_each_safe(r, n, &(dev)->res.list[type])
+
+/*
+ * lock/unlock to protect reads of restrack_obj structs
+ */
+static inline void rdma_restrack_lock(struct rdma_restrack_root *res,
+				      enum rdma_restrack_obj type)
+{
+	mutex_lock(&res->lock[type]);
+}
+
+static inline void rdma_restrack_unlock(struct rdma_restrack_root *res,
+					enum rdma_restrack_obj type)
+{
+	mutex_unlock(&res->lock[type]);
+}
+
+/*
+ * Returns the current usage of specific object.
+ * Users can get device utilization by comparing with max_objname
+ * (e.g. max_qp, max_pd, etc.).
+ */
+int rdma_restrack_count(struct rdma_restrack_root *res,
+			enum rdma_restrack_obj type);
+
+/*
+ * Track object:
+ *  res - restrack entry embedded in the tracked object (PD, CQ or QP)
+ *  type - actual type of object to operate.
+ *  comm - the owner of this resource. For kernel created resources,
+ *         there is a need to pass a name here, which will be visible to users.
+ *         For user created resources, there is a need to pass NULL here and the
+ *         owner will be taken from current struct task_struct.
+ */
+
+void rdma_restrack_add(struct rdma_restrack_entry *res,
+		       enum rdma_restrack_obj type, const char *comm);
+void rdma_restrack_del(struct rdma_restrack_entry *res,
+		       enum rdma_restrack_obj type);
+#endif /* _RDMA_RESTRACK_H_ */
--
2.15.1

--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


* [RFC PATCH rdma-next 09/14] RDMA/core: Add helper function to create named QPs
       [not found] ` <20171221181748.17126-1-leon-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
                     ` (7 preceding siblings ...)
  2017-12-21 18:17   ` [RFC PATCH rdma-next 08/14] RDMA/restrack: Add general infrastructure to track RDMA resources Leon Romanovsky
@ 2017-12-21 18:17   ` Leon Romanovsky
  2017-12-21 18:17   ` [RFC PATCH rdma-next 10/14] RDMA: Annotate create QP callers Leon Romanovsky
                     ` (6 subsequent siblings)
  15 siblings, 0 replies; 22+ messages in thread
From: Leon Romanovsky @ 2017-12-21 18:17 UTC (permalink / raw)
  To: Doug Ledford, Jason Gunthorpe
  Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA, Mark Bloch, Steve Wise,
	Leon Romanovsky

From: Leon Romanovsky <leonro-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>

QPs in the RDMA stack can be created by the kernel or by user space,
but the owner is not visible to users afterwards.

The added helper tracks each newly created QP together with the name
of its owner. Kernel callers of create_qp are expected to set the name
in the QP init attributes (the new comm field). User space callers need
no change; the name is taken from the process name.

The helper also sets the qp->device field for all QP types, including
XRC_TGT, which RDMA/core didn't do before.
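
The owner selection described above can be modelled in user space. This
is only a sketch, not the kernel code: struct model_entry, struct
model_task and model_set_owner() are hypothetical names, the "current"
task is passed in explicitly, and snprintf() is used so the stored name
is always NUL-terminated.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define TASK_COMM_LEN 16 /* same value as the kernel's TASK_COMM_LEN */

/* Hypothetical stand-in for the owner fields of rdma_restrack_entry. */
struct model_entry {
	int pid;
	char task_comm[TASK_COMM_LEN];
};

/* Hypothetical stand-in for the calling task ("current" in the kernel). */
struct model_task {
	int pid;
	char comm[TASK_COMM_LEN];
};

/*
 * Pick the owner of a tracked object:
 *  - NULL or empty comm: user-space owner, copy name and PID from the task;
 *  - non-empty comm: kernel owner, keep the given name and leave PID zero.
 */
static void model_set_owner(struct model_entry *e, const char *comm,
			    const struct model_task *cur)
{
	assert(e && cur);
	memset(e, 0, sizeof(*e));

	if (!comm || !comm[0]) {
		snprintf(e->task_comm, sizeof(e->task_comm), "%s", cur->comm);
		e->pid = cur->pid;
	} else {
		/* Names longer than TASK_COMM_LEN - 1 are truncated. */
		snprintf(e->task_comm, sizeof(e->task_comm), "%s", comm);
	}
}
```

In this model a kernel caller would pass a string such as "rdma-cm",
while a user-space path would pass NULL or an empty string.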

Reviewed-by: Mark Bloch <markb-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
Signed-off-by: Leon Romanovsky <leonro-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
---
 drivers/infiniband/core/core_priv.h | 19 +++++++++++++++++++
 include/rdma/ib_verbs.h             |  6 ++++++
 2 files changed, 25 insertions(+)

diff --git a/drivers/infiniband/core/core_priv.h b/drivers/infiniband/core/core_priv.h
index 1fe2b92fe357..4c2ff4a02114 100644
--- a/drivers/infiniband/core/core_priv.h
+++ b/drivers/infiniband/core/core_priv.h
@@ -306,4 +306,23 @@ struct ib_device *ib_device_get_by_index(u32 ifindex);
 /* RDMA device netlink */
 void nldev_init(void);
 void nldev_exit(void);
+
+static inline struct ib_qp *_ib_create_qp(struct ib_device *dev,
+					  struct ib_pd *pd,
+					  struct ib_qp_init_attr *attr,
+					  struct ib_udata *udata)
+{
+	struct ib_qp *qp;
+
+	qp = dev->create_qp(pd, attr, udata);
+	if (!IS_ERR(qp)) {
+		qp->device = dev;
+		if (attr->qp_type < IB_QPT_MAX)
+			rdma_restrack_add(&qp->res,
+					  RDMA_RESTRACK_QP,
+					  attr->comm);
+	}
+
+	return qp;
+}
 #endif /* _CORE_PRIV_H */
diff --git a/include/rdma/ib_verbs.h b/include/rdma/ib_verbs.h
index a2678e80c2a7..64032f4c8d5a 100644
--- a/include/rdma/ib_verbs.h
+++ b/include/rdma/ib_verbs.h
@@ -1140,6 +1140,12 @@ struct ib_qp_init_attr {
 	u8			port_num;
 	struct ib_rwq_ind_table *rwq_ind_tbl;
 	u32			source_qpn;
+
+	/*
+	 * Name of entity which created this QP, empty string means that
+	 * it will be taken automatically from task_struct.
+	 */
+	char comm[TASK_COMM_LEN];
 };

 struct ib_qp_open_attr {
--
2.15.1


* [RFC PATCH rdma-next 10/14] RDMA: Annotate create QP callers
       [not found] ` <20171221181748.17126-1-leon-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
                     ` (8 preceding siblings ...)
  2017-12-21 18:17   ` [RFC PATCH rdma-next 09/14] RDMA/core: Add helper function to create named QPs Leon Romanovsky
@ 2017-12-21 18:17   ` Leon Romanovsky
  2017-12-21 18:17   ` [RFC PATCH rdma-next 11/14] RDMA/core: Add resource tracking for create and destroy CQs Leon Romanovsky
                     ` (5 subsequent siblings)
  15 siblings, 0 replies; 22+ messages in thread
From: Leon Romanovsky @ 2017-12-21 18:17 UTC (permalink / raw)
  To: Doug Ledford, Jason Gunthorpe
  Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA, Mark Bloch, Steve Wise,
	Leon Romanovsky

From: Leon Romanovsky <leonro-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>

Update all callers to provide the owner name through the QP attribute
structure, and route create_qp through the helper that supports
resource tracking.

Reviewed-by: Mark Bloch <markb-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
Signed-off-by: Leon Romanovsky <leonro-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
---
 drivers/infiniband/core/cma.c              | 1 +
 drivers/infiniband/core/mad.c              | 1 +
 drivers/infiniband/core/uverbs_cmd.c       | 3 +--
 drivers/infiniband/core/verbs.c            | 4 ++--
 drivers/infiniband/hw/mlx4/mad.c           | 1 +
 drivers/infiniband/hw/mlx4/qp.c            | 1 +
 drivers/infiniband/hw/mlx5/gsi.c           | 2 ++
 drivers/infiniband/ulp/ipoib/ipoib_cm.c    | 4 +++-
 drivers/infiniband/ulp/ipoib/ipoib_verbs.c | 1 +
 drivers/infiniband/ulp/srp/ib_srp.c        | 1 +
 drivers/infiniband/ulp/srpt/ib_srpt.c      | 1 +
 net/smc/smc_ib.c                           | 1 +
 12 files changed, 16 insertions(+), 5 deletions(-)

diff --git a/drivers/infiniband/core/cma.c b/drivers/infiniband/core/cma.c
index e5c9f42a7e4b..b13968cc5a30 100644
--- a/drivers/infiniband/core/cma.c
+++ b/drivers/infiniband/core/cma.c
@@ -857,6 +857,7 @@ int rdma_create_qp(struct rdma_cm_id *id, struct ib_pd *pd,
 		return -EINVAL;

 	qp_init_attr->port_num = id->port_num;
+	strncpy(qp_init_attr->comm, "rdma-cm", TASK_COMM_LEN);
 	qp = ib_create_qp(pd, qp_init_attr);
 	if (IS_ERR(qp))
 		return PTR_ERR(qp);
diff --git a/drivers/infiniband/core/mad.c b/drivers/infiniband/core/mad.c
index cb91245e9163..f58a2b8e6979 100644
--- a/drivers/infiniband/core/mad.c
+++ b/drivers/infiniband/core/mad.c
@@ -3104,6 +3104,7 @@ static int create_mad_qp(struct ib_mad_qp_info *qp_info,
 	qp_init_attr.port_num = qp_info->port_priv->port_num;
 	qp_init_attr.qp_context = qp_info;
 	qp_init_attr.event_handler = qp_event_handler;
+	strncpy(qp_init_attr.comm, "rdma-mad", TASK_COMM_LEN);
 	qp_info->qp = ib_create_qp(qp_info->port_priv->pd, &qp_init_attr);
 	if (IS_ERR(qp_info->qp)) {
 		dev_err(&qp_info->port_priv->device->dev,
diff --git a/drivers/infiniband/core/uverbs_cmd.c b/drivers/infiniband/core/uverbs_cmd.c
index 16d55710b116..24609be0c5cf 100644
--- a/drivers/infiniband/core/uverbs_cmd.c
+++ b/drivers/infiniband/core/uverbs_cmd.c
@@ -1517,7 +1517,7 @@ static int create_qp(struct ib_uverbs_file *file,
 	if (cmd->qp_type == IB_QPT_XRC_TGT)
 		qp = ib_create_qp(pd, &attr);
 	else
-		qp = device->create_qp(pd, &attr, uhw);
+		qp = _ib_create_qp(device, pd, &attr, uhw);

 	if (IS_ERR(qp)) {
 		ret = PTR_ERR(qp);
@@ -1530,7 +1530,6 @@ static int create_qp(struct ib_uverbs_file *file,
 			goto err_cb;

 		qp->real_qp	  = qp;
-		qp->device	  = device;
 		qp->pd		  = pd;
 		qp->send_cq	  = attr.send_cq;
 		qp->recv_cq	  = attr.recv_cq;
diff --git a/drivers/infiniband/core/verbs.c b/drivers/infiniband/core/verbs.c
index b53fb0e98751..6d95263a40d6 100644
--- a/drivers/infiniband/core/verbs.c
+++ b/drivers/infiniband/core/verbs.c
@@ -866,7 +866,7 @@ struct ib_qp *ib_create_qp(struct ib_pd *pd,
 	if (qp_init_attr->cap.max_rdma_ctxs)
 		rdma_rw_init_qp(device, qp_init_attr);

-	qp = device->create_qp(pd, qp_init_attr, NULL);
+	qp = _ib_create_qp(device, pd, qp_init_attr, NULL);
 	if (IS_ERR(qp))
 		return qp;

@@ -876,7 +876,6 @@ struct ib_qp *ib_create_qp(struct ib_pd *pd,
 		return ERR_PTR(ret);
 	}

-	qp->device     = device;
 	qp->real_qp    = qp;
 	qp->uobject    = NULL;
 	qp->qp_type    = qp_init_attr->qp_type;
@@ -1503,6 +1502,7 @@ int ib_destroy_qp(struct ib_qp *qp)
 	if (!qp->uobject)
 		rdma_rw_cleanup_mrs(qp);

+	rdma_restrack_del(&qp->res, RDMA_RESTRACK_QP);
 	ret = qp->device->destroy_qp(qp);
 	if (!ret) {
 		if (pd)
diff --git a/drivers/infiniband/hw/mlx4/mad.c b/drivers/infiniband/hw/mlx4/mad.c
index 0793a21d76f4..f816df420fb9 100644
--- a/drivers/infiniband/hw/mlx4/mad.c
+++ b/drivers/infiniband/hw/mlx4/mad.c
@@ -1834,6 +1834,7 @@ static int create_pv_sqp(struct mlx4_ib_demux_pv_ctx *ctx,
 	qp_init_attr.init_attr.port_num = ctx->port;
 	qp_init_attr.init_attr.qp_context = ctx;
 	qp_init_attr.init_attr.event_handler = pv_qp_event_handler;
+	strncpy(qp_init_attr.init_attr.comm, "mlx4-sriov", TASK_COMM_LEN);
 	tun_qp->qp = ib_create_qp(ctx->pd, &qp_init_attr.init_attr);
 	if (IS_ERR(tun_qp->qp)) {
 		ret = PTR_ERR(tun_qp->qp);
diff --git a/drivers/infiniband/hw/mlx4/qp.c b/drivers/infiniband/hw/mlx4/qp.c
index 0172e1514527..6aebc50dae79 100644
--- a/drivers/infiniband/hw/mlx4/qp.c
+++ b/drivers/infiniband/hw/mlx4/qp.c
@@ -1677,6 +1677,7 @@ struct ib_qp *mlx4_ib_create_qp(struct ib_pd *pd,
 		if (is_eth &&
 		    dev->dev->caps.flags2 & MLX4_DEV_CAP_FLAG2_ROCE_V1_V2) {
 			init_attr->create_flags |= MLX4_IB_QP_CREATE_ROCE_V2_GSI;
+			strncpy(init_attr->comm, "mlx4-gsi", TASK_COMM_LEN);
 			sqp->roce_v2_gsi = ib_create_qp(pd, init_attr);

 			if (IS_ERR(sqp->roce_v2_gsi)) {
diff --git a/drivers/infiniband/hw/mlx5/gsi.c b/drivers/infiniband/hw/mlx5/gsi.c
index 79e6309460dc..b1b177d1a0dd 100644
--- a/drivers/infiniband/hw/mlx5/gsi.c
+++ b/drivers/infiniband/hw/mlx5/gsi.c
@@ -184,6 +184,7 @@ struct ib_qp *mlx5_ib_gsi_create_qp(struct ib_pd *pd,
 		hw_init_attr.cap.max_send_sge = 0;
 		hw_init_attr.cap.max_inline_data = 0;
 	}
+	strncpy(hw_init_attr.comm, "mlx5-gsi", TASK_COMM_LEN);
 	gsi->rx_qp = ib_create_qp(pd, &hw_init_attr);
 	if (IS_ERR(gsi->rx_qp)) {
 		mlx5_ib_warn(dev, "unable to create hardware GSI QP. error %ld\n",
@@ -264,6 +265,7 @@ static struct ib_qp *create_gsi_ud_qp(struct mlx5_ib_gsi_qp *gsi)
 		.sq_sig_type = gsi->sq_sig_type,
 		.qp_type = IB_QPT_UD,
 		.create_flags = mlx5_ib_create_qp_sqpn_qp1(),
+		.comm = "mlx5-gsi",
 	};

 	return ib_create_qp(pd, &init_attr);
diff --git a/drivers/infiniband/ulp/ipoib/ipoib_cm.c b/drivers/infiniband/ulp/ipoib/ipoib_cm.c
index 87f4bd99cdf7..e660db0f8884 100644
--- a/drivers/infiniband/ulp/ipoib/ipoib_cm.c
+++ b/drivers/infiniband/ulp/ipoib/ipoib_cm.c
@@ -263,6 +263,7 @@ static struct ib_qp *ipoib_cm_create_rx_qp(struct net_device *dev,
 		.sq_sig_type = IB_SIGNAL_ALL_WR,
 		.qp_type = IB_QPT_RC,
 		.qp_context = p,
+		.comm = "ipoib-cm",
 	};

 	if (!ipoib_cm_has_srq(dev)) {
@@ -1060,7 +1061,8 @@ static struct ib_qp *ipoib_cm_create_tx_qp(struct net_device *dev, struct ipoib_
 		.sq_sig_type		= IB_SIGNAL_ALL_WR,
 		.qp_type		= IB_QPT_RC,
 		.qp_context		= tx,
-		.create_flags		= 0
+		.create_flags		= 0,
+		.comm			= "ipoib-cm",
 	};
 	struct ib_qp *tx_qp;

diff --git a/drivers/infiniband/ulp/ipoib/ipoib_verbs.c b/drivers/infiniband/ulp/ipoib/ipoib_verbs.c
index a1ed25422b72..9fd0a9c9022e 100644
--- a/drivers/infiniband/ulp/ipoib/ipoib_verbs.c
+++ b/drivers/infiniband/ulp/ipoib/ipoib_verbs.c
@@ -206,6 +206,7 @@ int ipoib_transport_dev_init(struct net_device *dev, struct ib_device *ca)
 	if (priv->hca_caps & IB_DEVICE_MANAGED_FLOW_STEERING)
 		init_attr.create_flags |= IB_QP_CREATE_NETIF_QP;

+	strncpy(init_attr.comm, "ipoib-verbs", TASK_COMM_LEN);
 	priv->qp = ib_create_qp(priv->pd, &init_attr);
 	if (IS_ERR(priv->qp)) {
 		printk(KERN_WARNING "%s: failed to create QP\n", ca->name);
diff --git a/drivers/infiniband/ulp/srp/ib_srp.c b/drivers/infiniband/ulp/srp/ib_srp.c
index 972d4b3c5223..65ef854cb640 100644
--- a/drivers/infiniband/ulp/srp/ib_srp.c
+++ b/drivers/infiniband/ulp/srp/ib_srp.c
@@ -521,6 +521,7 @@ static int srp_create_ch_ib(struct srp_rdma_ch *ch)
 	init_attr->send_cq             = send_cq;
 	init_attr->recv_cq             = recv_cq;

+	strncpy(init_attr->comm, "srp", TASK_COMM_LEN);
 	qp = ib_create_qp(dev->pd, init_attr);
 	if (IS_ERR(qp)) {
 		ret = PTR_ERR(qp);
diff --git a/drivers/infiniband/ulp/srpt/ib_srpt.c b/drivers/infiniband/ulp/srpt/ib_srpt.c
index 8a1bd354b1cc..a35563da9b47 100644
--- a/drivers/infiniband/ulp/srpt/ib_srpt.c
+++ b/drivers/infiniband/ulp/srpt/ib_srpt.c
@@ -1673,6 +1673,7 @@ static int srpt_create_ch_ib(struct srpt_rdma_ch *ch)
 		qp_init->cap.max_recv_sge = qp_init->cap.max_send_sge;
 	}

+	strncpy(qp_init->comm, "srpt", TASK_COMM_LEN);
 	ch->qp = ib_create_qp(sdev->pd, qp_init);
 	if (IS_ERR(ch->qp)) {
 		ret = PTR_ERR(ch->qp);
diff --git a/net/smc/smc_ib.c b/net/smc/smc_ib.c
index 90f1a7f9085c..9f5bca333cce 100644
--- a/net/smc/smc_ib.c
+++ b/net/smc/smc_ib.c
@@ -243,6 +243,7 @@ int smc_ib_create_queue_pair(struct smc_link *lnk)
 		},
 		.sq_sig_type = IB_SIGNAL_REQ_WR,
 		.qp_type = IB_QPT_RC,
+		.comm = "smc-ib",
 	};
 	int rc;

--
2.15.1


* [RFC PATCH rdma-next 11/14] RDMA/core: Add resource tracking for create and destroy CQs
       [not found] ` <20171221181748.17126-1-leon-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
                     ` (9 preceding siblings ...)
  2017-12-21 18:17   ` [RFC PATCH rdma-next 10/14] RDMA: Annotate create QP callers Leon Romanovsky
@ 2017-12-21 18:17   ` Leon Romanovsky
  2017-12-21 18:17   ` [RFC PATCH rdma-next 12/14] RDMA/core: Add resource tracking for create and destroy PDs Leon Romanovsky
                     ` (4 subsequent siblings)
  15 siblings, 0 replies; 22+ messages in thread
From: Leon Romanovsky @ 2017-12-21 18:17 UTC (permalink / raw)
  To: Doug Ledford, Jason Gunthorpe
  Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA, Mark Bloch, Steve Wise,
	Leon Romanovsky

From: Leon Romanovsky <leonro-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>

Track creation and destruction of all CQ objects.

Reviewed-by: Mark Bloch <markb-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
Signed-off-by: Leon Romanovsky <leonro-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
---
 drivers/infiniband/core/cq.c               | 3 +++
 drivers/infiniband/core/uverbs_cmd.c       | 1 +
 drivers/infiniband/core/uverbs_std_types.c | 2 ++
 drivers/infiniband/core/verbs.c            | 2 ++
 4 files changed, 8 insertions(+)

diff --git a/drivers/infiniband/core/cq.c b/drivers/infiniband/core/cq.c
index f2ae75fa3128..6344e007dbce 100644
--- a/drivers/infiniband/core/cq.c
+++ b/drivers/infiniband/core/cq.c
@@ -154,6 +154,8 @@ struct ib_cq *ib_alloc_cq(struct ib_device *dev, void *private,
 	if (!cq->wc)
 		goto out_destroy_cq;

+	rdma_restrack_add(&cq->res, RDMA_RESTRACK_CQ, NULL);
+
 	switch (cq->poll_ctx) {
 	case IB_POLL_DIRECT:
 		cq->comp_handler = ib_cq_completion_direct;
@@ -208,6 +210,7 @@ void ib_free_cq(struct ib_cq *cq)
 		WARN_ON_ONCE(1);
 	}

+	rdma_restrack_del(&cq->res, RDMA_RESTRACK_CQ);
 	kfree(cq->wc);
 	ret = cq->device->destroy_cq(cq);
 	WARN_ON_ONCE(ret);
diff --git a/drivers/infiniband/core/uverbs_cmd.c b/drivers/infiniband/core/uverbs_cmd.c
index 24609be0c5cf..f462bde6e3b4 100644
--- a/drivers/infiniband/core/uverbs_cmd.c
+++ b/drivers/infiniband/core/uverbs_cmd.c
@@ -1033,6 +1033,7 @@ static struct ib_ucq_object *create_cq(struct ib_uverbs_file *file,
 		goto err_cb;

 	uobj_alloc_commit(&obj->uobject);
+	rdma_restrack_add(&cq->res, RDMA_RESTRACK_CQ, NULL);

 	return obj;

diff --git a/drivers/infiniband/core/uverbs_std_types.c b/drivers/infiniband/core/uverbs_std_types.c
index c3ee5d9b336d..2d0d27865b09 100644
--- a/drivers/infiniband/core/uverbs_std_types.c
+++ b/drivers/infiniband/core/uverbs_std_types.c
@@ -35,6 +35,7 @@
 #include <rdma/ib_verbs.h>
 #include <linux/bug.h>
 #include <linux/file.h>
+#include <rdma/restrack.h>
 #include "rdma_core.h"
 #include "uverbs.h"

@@ -319,6 +320,7 @@ static int uverbs_create_cq_handler(struct ib_device *ib_dev,
 	obj->uobject.object = cq;
 	obj->uobject.user_handle = user_handle;
 	atomic_set(&cq->usecnt, 0);
+	rdma_restrack_add(&cq->res, RDMA_RESTRACK_CQ, NULL);

 	ret = uverbs_copy_to(attrs, CREATE_CQ_RESP_CQE, &cq->cqe);
 	if (ret)
diff --git a/drivers/infiniband/core/verbs.c b/drivers/infiniband/core/verbs.c
index 6d95263a40d6..1e023e2d1dde 100644
--- a/drivers/infiniband/core/verbs.c
+++ b/drivers/infiniband/core/verbs.c
@@ -1545,6 +1545,7 @@ struct ib_cq *ib_create_cq(struct ib_device *device,
 		cq->event_handler = event_handler;
 		cq->cq_context    = cq_context;
 		atomic_set(&cq->usecnt, 0);
+		rdma_restrack_add(&cq->res, RDMA_RESTRACK_CQ, NULL);
 	}

 	return cq;
@@ -1563,6 +1564,7 @@ int ib_destroy_cq(struct ib_cq *cq)
 	if (atomic_read(&cq->usecnt))
 		return -EBUSY;

+	rdma_restrack_del(&cq->res, RDMA_RESTRACK_CQ);
 	return cq->device->destroy_cq(cq);
 }
 EXPORT_SYMBOL(ib_destroy_cq);
--
2.15.1


* [RFC PATCH rdma-next 12/14] RDMA/core: Add resource tracking for create and destroy PDs
       [not found] ` <20171221181748.17126-1-leon-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
                     ` (10 preceding siblings ...)
  2017-12-21 18:17   ` [RFC PATCH rdma-next 11/14] RDMA/core: Add resource tracking for create and destroy CQs Leon Romanovsky
@ 2017-12-21 18:17   ` Leon Romanovsky
  2017-12-21 18:17   ` [RFC PATCH rdma-next 13/14] RDMA/nldev: Provide global resource utilization Leon Romanovsky
                     ` (3 subsequent siblings)
  15 siblings, 0 replies; 22+ messages in thread
From: Leon Romanovsky @ 2017-12-21 18:17 UTC (permalink / raw)
  To: Doug Ledford, Jason Gunthorpe
  Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA, Mark Bloch, Steve Wise,
	Leon Romanovsky

From: Leon Romanovsky <leonro-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>

Track creation and destruction of all PD objects.

Reviewed-by: Mark Bloch <markb-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
Signed-off-by: Leon Romanovsky <leonro-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
---
 drivers/infiniband/core/uverbs_cmd.c | 1 +
 drivers/infiniband/core/verbs.c      | 2 ++
 2 files changed, 3 insertions(+)

diff --git a/drivers/infiniband/core/uverbs_cmd.c b/drivers/infiniband/core/uverbs_cmd.c
index f462bde6e3b4..ef50c919b09d 100644
--- a/drivers/infiniband/core/uverbs_cmd.c
+++ b/drivers/infiniband/core/uverbs_cmd.c
@@ -340,6 +340,7 @@ ssize_t ib_uverbs_alloc_pd(struct ib_uverbs_file *file,
 	uobj->object = pd;
 	memset(&resp, 0, sizeof resp);
 	resp.pd_handle = uobj->id;
+	rdma_restrack_add(&pd->res, RDMA_RESTRACK_PD, NULL);

 	if (copy_to_user(u64_to_user_ptr(cmd.response), &resp, sizeof resp)) {
 		ret = -EFAULT;
diff --git a/drivers/infiniband/core/verbs.c b/drivers/infiniband/core/verbs.c
index 1e023e2d1dde..8dd5ffc5de40 100644
--- a/drivers/infiniband/core/verbs.c
+++ b/drivers/infiniband/core/verbs.c
@@ -246,6 +246,7 @@ struct ib_pd *__ib_alloc_pd(struct ib_device *device, unsigned int flags,
 		pr_warn("%s: enabling unsafe global rkey\n", caller);
 		mr_access_flags |= IB_ACCESS_REMOTE_READ | IB_ACCESS_REMOTE_WRITE;
 	}
+	rdma_restrack_add(&pd->res, RDMA_RESTRACK_PD, NULL);

 	if (mr_access_flags) {
 		struct ib_mr *mr;
@@ -296,6 +297,7 @@ void ib_dealloc_pd(struct ib_pd *pd)
 	   requires the caller to guarantee we can't race here. */
 	WARN_ON(atomic_read(&pd->usecnt));

+	rdma_restrack_del(&pd->res, RDMA_RESTRACK_PD);
 	/* Making delalloc_pd a void return is a WIP, no driver should return
 	   an error here. */
 	ret = pd->device->dealloc_pd(pd);
--
2.15.1


* [RFC PATCH rdma-next 13/14] RDMA/nldev: Provide global resource utilization
       [not found] ` <20171221181748.17126-1-leon-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
                     ` (11 preceding siblings ...)
  2017-12-21 18:17   ` [RFC PATCH rdma-next 12/14] RDMA/core: Add resource tracking for create and destroy PDs Leon Romanovsky
@ 2017-12-21 18:17   ` Leon Romanovsky
  2017-12-21 18:17   ` [RFC PATCH rdma-next 14/14] RDMA/nldev: Provide detailed QP information Leon Romanovsky
                     ` (2 subsequent siblings)
  15 siblings, 0 replies; 22+ messages in thread
From: Leon Romanovsky @ 2017-12-21 18:17 UTC (permalink / raw)
  To: Doug Ledford, Jason Gunthorpe
  Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA, Mark Bloch, Steve Wise,
	Leon Romanovsky

From: Leon Romanovsky <leonro-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>

Export the global device utilization through the netlink interface,
with rdmatool as the main user of the RDMA nldev interface.

Provide both dumpit and doit callbacks.

An example of possible output from rdmatool on a system with five
Mellanox cards:

$ rdma res
1: mlx5_0: curr/max: qp 4/262144 cq 5/16777216 pd 3/16777216
2: mlx5_1: curr/max: qp 4/262144 cq 5/16777216 pd 3/16777216
3: mlx5_2: curr/max: qp 4/262144 cq 5/16777216 pd 3/16777216
4: mlx5_3: curr/max: qp 2/262144 cq 3/16777216 pd 2/16777216
5: mlx5_4: curr/max: qp 4/262144 cq 5/16777216 pd 3/16777216
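
How one such line relates to the counters can be sketched in user
space. This is an illustration under assumptions, not the rdmatool or
nldev code: struct model_res, model_count() and model_format() are
hypothetical names. The only parts taken from the series are that each
counter starts at 1, so the reported value is the counter minus one,
and that it is shown next to the device maximum.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical model of one tracked resource type on a device. */
struct model_res {
	const char *name;	/* "qp", "cq" or "pd" */
	unsigned int refcnt;	/* starts at 1, like res->cnt[] in restrack */
	unsigned int max;	/* device limit, e.g. attrs.max_qp */
};

/* Current usage: the counter was initialized to 1, so subtract it. */
static unsigned int model_count(const struct model_res *r)
{
	assert(r->refcnt >= 1);
	return r->refcnt - 1;
}

/* Build one summary line; returns the number of characters written. */
static int model_format(char *buf, size_t len, int idx, const char *dev,
			const struct model_res *res, size_t n)
{
	int off = snprintf(buf, len, "%d: %s: curr/max:", idx, dev);
	size_t i;

	for (i = 0; i < n && off > 0 && (size_t)off < len; i++)
		off += snprintf(buf + off, len - off, " %s %u/%u",
				res[i].name, model_count(&res[i]),
				res[i].max);
	return off;
}
```

With counters of 5, 6 and 4 for qp, cq and pd, this model yields the
same text as the first line of the output above.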

Reviewed-by: Mark Bloch <markb-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
Signed-off-by: Leon Romanovsky <leonro-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
---
 drivers/infiniband/core/nldev.c  | 165 +++++++++++++++++++++++++++++++++++++++
 include/uapi/rdma/rdma_netlink.h |  11 +++
 2 files changed, 176 insertions(+)

diff --git a/drivers/infiniband/core/nldev.c b/drivers/infiniband/core/nldev.c
index ed7e639e7dee..7aca9458e946 100644
--- a/drivers/infiniband/core/nldev.c
+++ b/drivers/infiniband/core/nldev.c
@@ -52,6 +52,12 @@ static const struct nla_policy nldev_policy[RDMA_NLDEV_ATTR_MAX] = {
 	[RDMA_NLDEV_ATTR_PORT_STATE]	= { .type = NLA_U8 },
 	[RDMA_NLDEV_ATTR_PORT_PHYS_STATE] = { .type = NLA_U8 },
 	[RDMA_NLDEV_ATTR_DEV_NODE_TYPE] = { .type = NLA_U8 },
+	[RDMA_NLDEV_ATTR_RES_SUMMARY]	= { .type = NLA_NESTED },
+	[RDMA_NLDEV_ATTR_RES_SUMMARY_ENTRY]	= { .type = NLA_NESTED },
+	[RDMA_NLDEV_ATTR_RES_SUMMARY_ENTRY_NAME] = { .type = NLA_NUL_STRING,
+					     .len = 16 },
+	[RDMA_NLDEV_ATTR_RES_SUMMARY_ENTRY_CURR] = { .type = NLA_U64 },
+	[RDMA_NLDEV_ATTR_RES_SUMMARY_ENTRY_MAX]	= { .type = NLA_U64 },
 };

 static int fill_nldev_handle(struct sk_buff *msg, struct ib_device *device)
@@ -134,6 +140,79 @@ static int fill_port_info(struct sk_buff *msg,
 	return 0;
 }

+static int fill_res_info_entry(struct sk_buff *msg,
+			       const char *name, u64 curr, u64 max)
+{
+	struct nlattr *entry_attr;
+
+	entry_attr = nla_nest_start(msg, RDMA_NLDEV_ATTR_RES_SUMMARY_ENTRY);
+	if (!entry_attr)
+		return -EMSGSIZE;
+
+	if (nla_put_string(msg, RDMA_NLDEV_ATTR_RES_SUMMARY_ENTRY_NAME, name))
+		goto err;
+	if (nla_put_u64_64bit(msg,
+			      RDMA_NLDEV_ATTR_RES_SUMMARY_ENTRY_CURR, curr, 0))
+		goto err;
+	if (nla_put_u64_64bit(msg,
+			      RDMA_NLDEV_ATTR_RES_SUMMARY_ENTRY_MAX, max, 0))
+		goto err;
+
+	nla_nest_end(msg, entry_attr);
+	return 0;
+
+err:
+	nla_nest_cancel(msg, entry_attr);
+	return -EMSGSIZE;
+}
+
+static u32 get_res_max(struct ib_device *dev, int idx)
+{
+	switch (idx) {
+	case RDMA_RESTRACK_PD: return dev->attrs.max_pd;
+	case RDMA_RESTRACK_CQ: return dev->attrs.max_cq;
+	case RDMA_RESTRACK_QP: return dev->attrs.max_qp;
+	default:	       return 0; /* unreachable */
+	}
+}
+
+static int fill_res_info(struct sk_buff *msg, struct ib_device *device)
+{
+	static const char *names[_RDMA_RESTRACK_MAX] = {
+		[RDMA_RESTRACK_PD] = "pd",
+		[RDMA_RESTRACK_CQ] = "cq",
+		[RDMA_RESTRACK_QP] = "qp",
+	};
+
+	struct rdma_restrack_root *res = &device->res;
+	struct nlattr *table_attr;
+	int ret, i;
+
+	if (fill_nldev_handle(msg, device))
+		return -EMSGSIZE;
+
+	table_attr = nla_nest_start(msg, RDMA_NLDEV_ATTR_RES_SUMMARY);
+	if (!table_attr)
+		return -EMSGSIZE;
+
+	for (i = 0; i < _RDMA_RESTRACK_MAX; i++) {
+		if (!names[i])
+			continue;
+		ret = fill_res_info_entry(msg, names[i],
+					  rdma_restrack_count(res, i),
+					  get_res_max(device, i));
+		if (ret)
+			goto err;
+	}
+
+	nla_nest_end(msg, table_attr);
+	return 0;
+
+err:
+	nla_nest_cancel(msg, table_attr);
+	return ret;
+}
+
 static int nldev_get_doit(struct sk_buff *skb, struct nlmsghdr *nlh,
 			  struct netlink_ext_ack *extack)
 {
@@ -325,6 +404,88 @@ static int nldev_port_get_dumpit(struct sk_buff *skb,
 	return skb->len;
 }

+static int nldev_res_get_doit(struct sk_buff *skb, struct nlmsghdr *nlh,
+			      struct netlink_ext_ack *extack)
+{
+	struct nlattr *tb[RDMA_NLDEV_ATTR_MAX];
+	struct ib_device *device;
+	struct sk_buff *msg;
+	u32 index;
+	int ret = -ENOMEM;
+
+	ret = nlmsg_parse(nlh, 0, tb, RDMA_NLDEV_ATTR_MAX - 1,
+			  nldev_policy, extack);
+	if (ret || !tb[RDMA_NLDEV_ATTR_DEV_INDEX])
+		return -EINVAL;
+
+	index = nla_get_u32(tb[RDMA_NLDEV_ATTR_DEV_INDEX]);
+	device = ib_device_get_by_index(index);
+	if (!device)
+		return -EINVAL;
+
+	msg = nlmsg_new(NLMSG_DEFAULT_SIZE, GFP_KERNEL);
+	if (!msg)
+		goto err;
+
+	nlh = nlmsg_put(msg, NETLINK_CB(skb).portid, nlh->nlmsg_seq,
+			RDMA_NL_GET_TYPE(RDMA_NL_NLDEV, RDMA_NLDEV_CMD_RES_GET),
+			0, 0);
+
+	ret = fill_res_info(msg, device);
+	if (ret)
+		goto err_free;
+
+	nlmsg_end(msg, nlh);
+	put_device(&device->dev);
+
+	return rdma_nl_unicast(msg, NETLINK_CB(skb).portid);
+
+err_free:
+	nlmsg_free(msg);
+err:
+	put_device(&device->dev);
+	return ret;
+}
+
+static int _nldev_res_get_dumpit(struct ib_device *device,
+				 struct sk_buff *skb,
+				 struct netlink_callback *cb,
+				 unsigned int idx)
+{
+	int start = cb->args[0];
+	struct nlmsghdr *nlh;
+
+	if (idx < start)
+		return 0;
+
+	nlh = nlmsg_put(skb, NETLINK_CB(cb->skb).portid, cb->nlh->nlmsg_seq,
+			RDMA_NL_GET_TYPE(RDMA_NL_NLDEV, RDMA_NLDEV_CMD_RES_GET),
+			0, NLM_F_MULTI);
+
+	if (fill_res_info(skb, device)) {
+		nlmsg_cancel(skb, nlh);
+		goto out;
+	}
+
+	nlmsg_end(skb, nlh);
+
+	idx++;
+
+out:
+	cb->args[0] = idx;
+	return skb->len;
+}
+
+static int nldev_res_get_dumpit(struct sk_buff *skb,
+				struct netlink_callback *cb)
+{
+	/*
+	 * There is no need to take a lock here, because
+	 * we rely on ib_core's lists_rwsem.
+	 */
+	return ib_enum_all_devs(_nldev_res_get_dumpit, skb, cb);
+}
+
 static const struct rdma_nl_cbs nldev_cb_table[RDMA_NLDEV_NUM_OPS] = {
 	[RDMA_NLDEV_CMD_GET] = {
 		.doit = nldev_get_doit,
@@ -334,6 +495,10 @@ static const struct rdma_nl_cbs nldev_cb_table[RDMA_NLDEV_NUM_OPS] = {
 		.doit = nldev_port_get_doit,
 		.dump = nldev_port_get_dumpit,
 	},
+	[RDMA_NLDEV_CMD_RES_GET] = {
+		.doit = nldev_res_get_doit,
+		.dump = nldev_res_get_dumpit,
+	},
 };

 void __init nldev_init(void)
diff --git a/include/uapi/rdma/rdma_netlink.h b/include/uapi/rdma/rdma_netlink.h
index cc002e316d09..e041d2eca4b8 100644
--- a/include/uapi/rdma/rdma_netlink.h
+++ b/include/uapi/rdma/rdma_netlink.h
@@ -236,6 +236,11 @@ enum rdma_nldev_command {
 	RDMA_NLDEV_CMD_PORT_NEW,
 	RDMA_NLDEV_CMD_PORT_DEL,

+	RDMA_NLDEV_CMD_RES_GET, /* can dump */
+	RDMA_NLDEV_CMD_RES_SET,
+	RDMA_NLDEV_CMD_RES_NEW,
+	RDMA_NLDEV_CMD_RES_DEL,
+
 	RDMA_NLDEV_NUM_OPS
 };

@@ -303,6 +308,12 @@ enum rdma_nldev_attr {

 	RDMA_NLDEV_ATTR_DEV_NODE_TYPE,		/* u8 */

+	RDMA_NLDEV_ATTR_RES_SUMMARY,		/* nested table */
+	RDMA_NLDEV_ATTR_RES_SUMMARY_ENTRY,	/* nested table */
+	RDMA_NLDEV_ATTR_RES_SUMMARY_ENTRY_NAME,	/* string */
+	RDMA_NLDEV_ATTR_RES_SUMMARY_ENTRY_CURR,	/* u64 */
+	RDMA_NLDEV_ATTR_RES_SUMMARY_ENTRY_MAX,	/* u64 */
+
 	RDMA_NLDEV_ATTR_MAX
 };
 #endif /* _UAPI_RDMA_NETLINK_H */
--
2.15.1
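
The all-or-nothing nesting used by fill_res_info_entry() in this patch can be
illustrated with a minimal userspace sketch. It is not kernel code: struct
msg_buf, put_attr(), nest_start(), nest_end(), nest_cancel() and fill_entry()
are hypothetical stand-ins for the sk_buff and nla_* helpers, and the record
layout is simplified (no padding or alignment).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Userspace model of a message buffer; not the kernel's struct sk_buff. */
struct msg_buf {
	uint8_t data[64];
	size_t len;
};

/* Append one (type, length, value) record; fail if it does not fit. */
static int put_attr(struct msg_buf *m, uint16_t type,
		    const void *val, uint16_t vlen)
{
	uint16_t hdr[2] = { type, vlen };

	if (m->len + sizeof(hdr) + vlen > sizeof(m->data))
		return -1;
	memcpy(m->data + m->len, hdr, sizeof(hdr));
	m->len += sizeof(hdr);
	if (vlen)
		memcpy(m->data + m->len, val, vlen);
	m->len += vlen;
	return 0;
}

/* Open a nested record; return its start offset, or -1 if it does not fit. */
static long nest_start(struct msg_buf *m, uint16_t type)
{
	long start = (long)m->len;

	if (put_attr(m, type, NULL, 0))
		return -1;
	return start;
}

/* Close a nest: patch its length field to cover everything added since. */
static void nest_end(struct msg_buf *m, long start)
{
	uint16_t vlen;

	assert(start >= 0);
	vlen = (uint16_t)(m->len - (size_t)start - 2 * sizeof(uint16_t));
	memcpy(m->data + start + sizeof(uint16_t), &vlen, sizeof(vlen));
}

/* Abandon a nest: drop the header and every child written after it. */
static void nest_cancel(struct msg_buf *m, long start)
{
	m->len = (size_t)start;
}

enum { ATTR_ENTRY = 1, ATTR_NAME, ATTR_CURR, ATTR_MAX };

/* Same shape as fill_res_info_entry(): each entry is all-or-nothing. */
static int fill_entry(struct msg_buf *m, const char *name,
		      uint64_t curr, uint64_t max)
{
	long start = nest_start(m, ATTR_ENTRY);

	if (start < 0)
		return -1;
	if (put_attr(m, ATTR_NAME, name, (uint16_t)(strlen(name) + 1)) ||
	    put_attr(m, ATTR_CURR, &curr, sizeof(curr)) ||
	    put_attr(m, ATTR_MAX, &max, sizeof(max))) {
		nest_cancel(m, start);
		return -1;
	}
	nest_end(m, start);
	return 0;
}
```

With the 64-byte buffer assumed here, a first entry named "pd" takes 35 bytes
and fits. A second entry does not fit, and the cancel path leaves the buffer
exactly as it was after the first entry, so the reader never sees a
half-written nest.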

--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

* [RFC PATCH rdma-next 14/14] RDMA/nldev: Provide detailed QP information
       [not found] ` <20171221181748.17126-1-leon-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
                     ` (12 preceding siblings ...)
  2017-12-21 18:17   ` [RFC PATCH rdma-next 13/14] RDMA/nldev: Provide global resource utilization Leon Romanovsky
@ 2017-12-21 18:17   ` Leon Romanovsky
  2017-12-21 18:21   ` [RFC PATCH rdma-next 00/14] RDMA resource tracking Leon Romanovsky
  2017-12-21 19:14   ` Jason Gunthorpe
  15 siblings, 0 replies; 22+ messages in thread
From: Leon Romanovsky @ 2017-12-21 18:17 UTC (permalink / raw)
  To: Doug Ledford, Jason Gunthorpe
  Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA, Mark Bloch, Steve Wise,
	Leon Romanovsky

From: Leon Romanovsky <leonro-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>

Implement the RDMA nldev netlink interface for retrieving detailed
QP information.

Currently only the dumpit variant is implemented.

Reviewed-by: Mark Bloch <markb-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
Signed-off-by: Leon Romanovsky <leonro-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
---
 drivers/infiniband/core/nldev.c  | 207 +++++++++++++++++++++++++++++++++++++++
 include/uapi/rdma/rdma_netlink.h |  43 ++++++++
 2 files changed, 250 insertions(+)

diff --git a/drivers/infiniband/core/nldev.c b/drivers/infiniband/core/nldev.c
index 7aca9458e946..6b22f1f2d084 100644
--- a/drivers/infiniband/core/nldev.c
+++ b/drivers/infiniband/core/nldev.c
@@ -58,6 +58,18 @@ static const struct nla_policy nldev_policy[RDMA_NLDEV_ATTR_MAX] = {
 					     .len = 16 },
 	[RDMA_NLDEV_ATTR_RES_SUMMARY_ENTRY_CURR] = { .type = NLA_U64 },
 	[RDMA_NLDEV_ATTR_RES_SUMMARY_ENTRY_MAX]	= { .type = NLA_U64 },
+	[RDMA_NLDEV_ATTR_RES_QP]		= { .type = NLA_NESTED },
+	[RDMA_NLDEV_ATTR_RES_QP_ENTRY]		= { .type = NLA_NESTED },
+	[RDMA_NLDEV_ATTR_RES_LQPN]		= { .type = NLA_U32 },
+	[RDMA_NLDEV_ATTR_RES_RQPN]		= { .type = NLA_U32 },
+	[RDMA_NLDEV_ATTR_RES_RQ_PSN]		= { .type = NLA_U32 },
+	[RDMA_NLDEV_ATTR_RES_SQ_PSN]		= { .type = NLA_U32 },
+	[RDMA_NLDEV_ATTR_RES_PATH_MIG_STATE] = { .type = NLA_U8 },
+	[RDMA_NLDEV_ATTR_RES_TYPE]		= { .type = NLA_U8 },
+	[RDMA_NLDEV_ATTR_RES_STATE]		= { .type = NLA_U8 },
+	[RDMA_NLDEV_ATTR_RES_PID]		= { .type = NLA_U32 },
+	[RDMA_NLDEV_ATTR_RES_PID_COMM]	= { .type = NLA_NUL_STRING,
+						    .len = TASK_COMM_LEN },
 };

 static int fill_nldev_handle(struct sk_buff *msg, struct ib_device *device)
@@ -213,6 +225,74 @@ static int fill_res_info(struct sk_buff *msg, struct ib_device *device)
 	return ret;
 }

+static int fill_res_qp_entry(struct sk_buff *msg,
+			     struct ib_qp *qp, uint32_t port)
+{
+	struct ib_qp_init_attr qp_init_attr;
+	struct nlattr *entry_attr;
+	struct ib_qp_attr qp_attr;
+	int ret;
+
+	ret = ib_query_qp(qp, &qp_attr, 0, &qp_init_attr);
+	if (ret)
+		return ret;
+
+	if (port && port != qp_attr.port_num)
+		return 0;
+
+	entry_attr = nla_nest_start(msg, RDMA_NLDEV_ATTR_RES_QP_ENTRY);
+	if (!entry_attr)
+		goto out;
+
+	/* In create_qp() port is not set yet */
+	if (qp_attr.port_num &&
+	    nla_put_u32(msg, RDMA_NLDEV_ATTR_PORT_INDEX, qp_attr.port_num))
+		goto err;
+
+	if (nla_put_u32(msg, RDMA_NLDEV_ATTR_RES_LQPN, qp->qp_num))
+		goto err;
+	if (qp->qp_type == IB_QPT_RC || qp->qp_type == IB_QPT_UC) {
+		if (nla_put_u32(msg, RDMA_NLDEV_ATTR_RES_RQPN,
+				qp_attr.dest_qp_num))
+			goto err;
+		if (nla_put_u32(msg, RDMA_NLDEV_ATTR_RES_RQ_PSN,
+				qp_attr.rq_psn))
+			goto err;
+	}
+
+	if (nla_put_u32(msg, RDMA_NLDEV_ATTR_RES_SQ_PSN, qp_attr.sq_psn))
+		goto err;
+
+	if (qp->qp_type == IB_QPT_RC || qp->qp_type == IB_QPT_UC ||
+	    qp->qp_type == IB_QPT_XRC_INI || qp->qp_type == IB_QPT_XRC_TGT) {
+		if (nla_put_u8(msg, RDMA_NLDEV_ATTR_RES_PATH_MIG_STATE,
+			       qp_attr.path_mig_state))
+			goto err;
+	}
+	if (nla_put_u8(msg, RDMA_NLDEV_ATTR_RES_TYPE, qp->qp_type))
+		goto err;
+	if (nla_put_u8(msg, RDMA_NLDEV_ATTR_RES_STATE, qp_attr.qp_state))
+		goto err;
+
+	/* PID == 0 means that this QP was created by kernel */
+	if (qp->res.pid && nla_put_u32(msg,
+				       RDMA_NLDEV_ATTR_RES_PID, qp->res.pid))
+		goto err;
+
+	if (nla_put_string(msg,
+			   RDMA_NLDEV_ATTR_RES_PID_COMM, qp->res.task_comm))
+		goto err;
+
+	nla_nest_end(msg, entry_attr);
+
+	return 0;
+
+err:
+	nla_nest_cancel(msg, entry_attr);
+out:
+	return -EMSGSIZE;
+}
+
 static int nldev_get_doit(struct sk_buff *skb, struct nlmsghdr *nlh,
 			  struct netlink_ext_ack *extack)
 {
@@ -486,6 +566,120 @@ static int nldev_res_get_dumpit(struct sk_buff *skb,
 	return ib_enum_all_devs(_nldev_res_get_dumpit, skb, cb);
 }

+static int nldev_res_get_qp_dumpit(struct sk_buff *skb,
+				   struct netlink_callback *cb)
+{
+	struct nlattr *tb[RDMA_NLDEV_ATTR_MAX];
+	struct rdma_restrack_entry *res;
+	struct list_head *pos, *nxt;
+	int err, ret, key, idx = 0;
+	struct nlattr *table_attr;
+	struct ib_device *device;
+	int start = cb->args[0];
+	struct nlmsghdr *nlh;
+	u32 index, port = 0;
+	struct ib_qp *qp;
+
+	err = nlmsg_parse(cb->nlh, 0, tb, RDMA_NLDEV_ATTR_MAX - 1,
+			  nldev_policy, NULL);
+	/*
+	 * Right now, we are expecting the device index to get QP information,
+	 * but it is possible to extend this code to return all devices in
+	 * one shot by checking the existence of RDMA_NLDEV_ATTR_DEV_INDEX.
+	 * if it doesn't exist, we will iterate over all devices.
+	 *
+	 * But it is not needed for now.
+	 */
+	if (err || !tb[RDMA_NLDEV_ATTR_DEV_INDEX])
+		return -EINVAL;
+
+	index = nla_get_u32(tb[RDMA_NLDEV_ATTR_DEV_INDEX]);
+	device = ib_device_get_by_index(index);
+	if (!device)
+		return -EINVAL;
+
+	/*
+	 * If no PORT_INDEX is supplied, we return QPs from the whole device
+	 */
+	if (tb[RDMA_NLDEV_ATTR_PORT_INDEX]) {
+		port = nla_get_u32(tb[RDMA_NLDEV_ATTR_PORT_INDEX]);
+		if (!rdma_is_port_valid(device, port)) {
+			ret = -EINVAL;
+			goto err_index;
+		}
+	}
+
+	nlh = nlmsg_put(skb, NETLINK_CB(cb->skb).portid, cb->nlh->nlmsg_seq,
+			RDMA_NL_GET_TYPE(RDMA_NL_NLDEV, RDMA_NLDEV_CMD_RES_QP_GET),
+			0, NLM_F_MULTI);
+
+	if (fill_nldev_handle(skb, device)) {
+		ret = -EMSGSIZE;
+		goto err;
+	}
+
+	table_attr = nla_nest_start(skb, RDMA_NLDEV_ATTR_RES_QP);
+	if (!table_attr) {
+		ret = -EMSGSIZE;
+		goto err;
+	}
+
+	rdma_restrack_lock(&device->res, RDMA_RESTRACK_QP);
+	for_each_res_safe(pos, nxt, RDMA_RESTRACK_QP, device) {
+		if (idx < start) {
+			idx++;
+			continue;
+		}
+
+		res = list_entry(pos, struct rdma_restrack_entry, list);
+		if (!res->valid)
+			/*
+			 * This can happen if the resource failed to initialize
+			 * srcu; in all other cases the restrack-internal lock
+			 * ensures that this list holds only valid entries.
+			 */
+			continue;
+
+		key = srcu_read_lock(&res->srcu);
+
+		qp = container_of(res, struct ib_qp, res);
+		rdma_restrack_unlock(&device->res, RDMA_RESTRACK_QP);
+		ret = fill_res_qp_entry(skb, qp, port);
+		rdma_restrack_lock(&device->res, RDMA_RESTRACK_QP);
+
+		srcu_read_unlock(&res->srcu, key);
+
+		if (ret == -EMSGSIZE)
+			/*
+			 * There is a chance to optimize here.
+			 * It can be done by using list_prepare_entry
+			 * and list_for_each_entry_continue afterwards.
+			 */
+			break;
+		if (ret)
+			goto res_err;
+		idx++;
+	}
+	rdma_restrack_unlock(&device->res, RDMA_RESTRACK_QP);
+
+	nla_nest_end(skb, table_attr);
+	nlmsg_end(skb, nlh);
+	cb->args[0] = idx;
+	put_device(&device->dev);
+	return 0;
+
+res_err:
+	nla_nest_cancel(skb, table_attr);
+	rdma_restrack_unlock(&device->res, RDMA_RESTRACK_QP);
+
+err:
+	nlmsg_cancel(skb, nlh);
+
+err_index:
+	put_device(&device->dev);
+	return ret;
+}
+
 static const struct rdma_nl_cbs nldev_cb_table[RDMA_NLDEV_NUM_OPS] = {
 	[RDMA_NLDEV_CMD_GET] = {
 		.doit = nldev_get_doit,
@@ -499,6 +693,19 @@ static const struct rdma_nl_cbs nldev_cb_table[RDMA_NLDEV_NUM_OPS] = {
 		.doit = nldev_res_get_doit,
 		.dump = nldev_res_get_dumpit,
 	},
+	[RDMA_NLDEV_CMD_RES_QP_GET] = {
+		.dump = nldev_res_get_qp_dumpit,
+		/*
+		 * .doit is not implemented yet for two reasons:
+		 * 1. It is not needed yet.
+		 * 2. It requires an identifier. That is easy for QPs
+		 * (device index + port index + LQPN), but not for the
+		 * other resources (PD and CQ). Because it is better
+		 * to provide a similar interface for all resources,
+		 * let's wait until the other resources are implemented
+		 * too.
+		 */
+	},
 };

 void __init nldev_init(void)
diff --git a/include/uapi/rdma/rdma_netlink.h b/include/uapi/rdma/rdma_netlink.h
index e041d2eca4b8..9a90cd9f614e 100644
--- a/include/uapi/rdma/rdma_netlink.h
+++ b/include/uapi/rdma/rdma_netlink.h
@@ -241,6 +241,11 @@ enum rdma_nldev_command {
 	RDMA_NLDEV_CMD_RES_NEW,
 	RDMA_NLDEV_CMD_RES_DEL,

+	RDMA_NLDEV_CMD_RES_QP_GET, /* can dump */
+	RDMA_NLDEV_CMD_RES_QP_SET,
+	RDMA_NLDEV_CMD_RES_QP_NEW,
+	RDMA_NLDEV_CMD_RES_QP_DEL,
+
 	RDMA_NLDEV_NUM_OPS
 };

@@ -314,6 +319,44 @@ enum rdma_nldev_attr {
 	RDMA_NLDEV_ATTR_RES_SUMMARY_ENTRY_CURR,	/* u64 */
 	RDMA_NLDEV_ATTR_RES_SUMMARY_ENTRY_MAX,	/* u64 */

+	RDMA_NLDEV_ATTR_RES_QP,			/* nested table */
+	RDMA_NLDEV_ATTR_RES_QP_ENTRY,		/* nested table */
+	/*
+	 * Local QPN
+	 */
+	RDMA_NLDEV_ATTR_RES_LQPN,		/* u32 */
+	/*
+	 * Remote QPN,
+	 * Applicable for RC and UC only IBTA 11.2.5.3 QUERY QUEUE PAIR
+	 */
+	RDMA_NLDEV_ATTR_RES_RQPN,		/* u32 */
+	/*
+	 * Receive Queue PSN,
+	 * Applicable for RC and UC only 11.2.5.3 QUERY QUEUE PAIR
+	 */
+	RDMA_NLDEV_ATTR_RES_RQ_PSN,		/* u32 */
+	/*
+	 * Send Queue PSN
+	 */
+	RDMA_NLDEV_ATTR_RES_SQ_PSN,		/* u32 */
+	RDMA_NLDEV_ATTR_RES_PATH_MIG_STATE,	/* u8 */
+	/*
+	 * QP types as visible to RDMA/core, the reserved QPT
+	 * are not exported through this interface.
+	 */
+	RDMA_NLDEV_ATTR_RES_TYPE,		/* u8 */
+	RDMA_NLDEV_ATTR_RES_STATE,		/* u8 */
+	/*
+	 * PID of the process that created the QP. For kernel QPs the PID
+	 * is 0 and this attribute is omitted, so users can tell user from
+	 * kernel owners without relying on the PID value.
+	 */
+	RDMA_NLDEV_ATTR_RES_PID,		/* u32 */
+	/*
+	 * Name of the process that created the resource.
+	 */
+	RDMA_NLDEV_ATTR_RES_PID_COMM,		/* string */
+
 	RDMA_NLDEV_ATTR_MAX
 };
 #endif /* _UAPI_RDMA_NETLINK_H */
--
2.15.1
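
The resume logic in the dumpit handlers (skip entries with idx < start, then
store idx in cb->args[0]) can be sketched outside the kernel as follows. This
is an illustrative model only: struct dump_cb, dump_once() and the
fixed-capacity output array are assumptions made for the example and do not
correspond to real netlink structures.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct netlink_callback: only the resume slot. */
struct dump_cb {
	long args0;	/* index of the first entry not yet delivered */
};

/*
 * Copy entries into one bounded "message" with room for cap values,
 * skipping those delivered by earlier calls. Returns how many entries
 * were written; 0 means the dump is complete.
 */
static size_t dump_once(const int *items, size_t n, struct dump_cb *cb,
			int *out, size_t cap)
{
	long start = cb->args0;
	long idx = 0;
	size_t written = 0;
	size_t i;

	assert(start >= 0);

	for (i = 0; i < n; i++) {
		if (idx < start) {	/* already sent in a previous pass */
			idx++;
			continue;
		}
		if (written == cap)	/* message full: stop, resume later */
			break;
		out[written++] = items[i];
		idx++;
	}

	cb->args0 = idx;	/* remember where the next pass starts */
	return written;
}
```

A caller would invoke dump_once() repeatedly with the same struct dump_cb
until it returns 0. The sketch rescans skipped entries from the start on
every pass, which is the same cost the "chance to optimize" comment in the
patch refers to.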

* Re: [RFC PATCH rdma-next 00/14] RDMA resource tracking
       [not found] ` <20171221181748.17126-1-leon-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
                     ` (13 preceding siblings ...)
  2017-12-21 18:17   ` [RFC PATCH rdma-next 14/14] RDMA/nldev: Provide detailed QP information Leon Romanovsky
@ 2017-12-21 18:21   ` Leon Romanovsky
  2017-12-21 19:14   ` Jason Gunthorpe
  15 siblings, 0 replies; 22+ messages in thread
From: Leon Romanovsky @ 2017-12-21 18:21 UTC (permalink / raw)
  To: Doug Ledford, Jason Gunthorpe
  Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA, Mark Bloch, Steve Wise

On Thu, Dec 21, 2017 at 08:17:34PM +0200, Leon Romanovsky wrote:
>
> Such scheme,

Oops, I forgot to remove it.

Sorry

* Re: [RFC PATCH rdma-next 02/14] RDMA/core: Enforce requirement to hold lists_rwsem semaphore
       [not found]     ` <20171221181748.17126-3-leon-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
@ 2017-12-21 19:12       ` Jason Gunthorpe
       [not found]         ` <20171221191200.GE20015-uk2M96/98Pc@public.gmane.org>
  0 siblings, 1 reply; 22+ messages in thread
From: Jason Gunthorpe @ 2017-12-21 19:12 UTC (permalink / raw)
  To: Leon Romanovsky
  Cc: Doug Ledford, linux-rdma-u79uwXL29TY76Z2rM5mHXA, Mark Bloch,
	Steve Wise, Leon Romanovsky

On Thu, Dec 21, 2017 at 08:17:36PM +0200, Leon Romanovsky wrote:
> @@ -526,8 +531,8 @@ int ib_register_device(struct ib_device *device,
>  		if (!add_client_context(device, client) && client->add)
>  			client->add(device);
> 
> -	device->index = __dev_new_index();
>  	down_write(&lists_rwsem);
> +	device->index = __dev_new_index();

Isn't this hunk a for-rc bugfix??

Jason
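
The reason the quoted hunk matters can be shown with a small single-threaded
model. It assumes that the index allocator scans the list of registered
devices for a free value, which is why it has to run inside the same write-side
critical section that publishes the device. Everything here (struct registry,
reg_lock(), reg_unlock(), new_index(), register_dev()) is hypothetical; the
boolean flag only models the locking discipline of lists_rwsem and provides no
real mutual exclusion.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_DEVS 8

/* Hypothetical registry standing in for ib_core's device list. */
struct registry {
	bool write_locked;		/* models down_write()/up_write() */
	uint32_t index[MAX_DEVS];	/* indices of registered devices */
	size_t count;
	uint32_t last;			/* last index handed out */
};

static void reg_lock(struct registry *r)
{
	assert(!r->write_locked);
	r->write_locked = true;
}

static void reg_unlock(struct registry *r)
{
	assert(r->write_locked);
	r->write_locked = false;
}

static bool index_in_use(const struct registry *r, uint32_t idx)
{
	size_t i;

	for (i = 0; i < r->count; i++)
		if (r->index[i] == idx)
			return true;
	return false;
}

/* Pick the next unused non-zero index; the caller must hold the lock. */
static uint32_t new_index(struct registry *r)
{
	assert(r->write_locked);
	for (;;) {
		if (!++r->last)		/* skip 0 on wrap-around */
			r->last = 1;
		if (!index_in_use(r, r->last))
			return r->last;
	}
}

/* Choose the index and publish the device in one critical section. */
static int register_dev(struct registry *r, uint32_t *out)
{
	int ret = -1;

	reg_lock(r);
	if (r->count < MAX_DEVS) {
		*out = new_index(r);
		r->index[r->count++] = *out;
		ret = 0;
	}
	reg_unlock(r);
	return ret;
}
```

If new_index() ran before reg_lock(), two concurrent registrations could both
scan the list, see the same value as free, and publish duplicate indices. The
assertion in new_index() makes that ordering mistake fail immediately in this
model.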

* Re: [RFC PATCH rdma-next 00/14] RDMA resource tracking
       [not found] ` <20171221181748.17126-1-leon-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
                     ` (14 preceding siblings ...)
  2017-12-21 18:21   ` [RFC PATCH rdma-next 00/14] RDMA resource tracking Leon Romanovsky
@ 2017-12-21 19:14   ` Jason Gunthorpe
       [not found]     ` <20171221191424.GF20015-uk2M96/98Pc@public.gmane.org>
  15 siblings, 1 reply; 22+ messages in thread
From: Jason Gunthorpe @ 2017-12-21 19:14 UTC (permalink / raw)
  To: Leon Romanovsky
  Cc: Doug Ledford, linux-rdma-u79uwXL29TY76Z2rM5mHXA, Mark Bloch, Steve Wise

On Thu, Dec 21, 2017 at 08:17:34PM +0200, Leon Romanovsky wrote:

> Leon Romanovsky (14):
>   RDMA/netlink: Simplify code of autoload modules
>   RDMA/core: Enforce requirement to hold lists_rwsem semaphore
>   RDMA/core: Replace open-coded variant of put_device
>   RDMA/nldev: Refactor nldev handle to be common function
>   RDMA/core: Provide locked variant of device name to index function
>   RDMA/netlink: Protect device query from device removal
>   RDMA/nldev: Protect port query from accidental device removal

These all seem totally unrelated. Shouldn't these cleanups go
ahead right away? Can you send them as a non-RFC series?

Then we are down to something less intimidating to review for resource
tracking..

Jason

* Re: [RFC PATCH rdma-next 02/14] RDMA/core: Enforce requirement to hold lists_rwsem semaphore
       [not found]         ` <20171221191200.GE20015-uk2M96/98Pc@public.gmane.org>
@ 2017-12-21 19:51           ` Leon Romanovsky
       [not found]             ` <20171221195134.GM2942-U/DQcQFIOTAAJjI8aNfphQ@public.gmane.org>
  0 siblings, 1 reply; 22+ messages in thread
From: Leon Romanovsky @ 2017-12-21 19:51 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: Doug Ledford, linux-rdma-u79uwXL29TY76Z2rM5mHXA, Mark Bloch, Steve Wise

On Thu, Dec 21, 2017 at 12:12:00PM -0700, Jason Gunthorpe wrote:
> On Thu, Dec 21, 2017 at 08:17:36PM +0200, Leon Romanovsky wrote:
> > @@ -526,8 +531,8 @@ int ib_register_device(struct ib_device *device,
> >  		if (!add_client_context(device, client) && client->add)
> >  			client->add(device);
> >
> > -	device->index = __dev_new_index();
> >  	down_write(&lists_rwsem);
> > +	device->index = __dev_new_index();
>
> Isn't this hunk a for-rc bugfix??

The whole patch can go to for-rc.

This for-rc/for-next thing is new to me. In RDMA, we didn't bother ourselves too
much and sent everything to for-next.

Thanks

>
> Jason

* Re: [RFC PATCH rdma-next 00/14] RDMA resource tracking
       [not found]     ` <20171221191424.GF20015-uk2M96/98Pc@public.gmane.org>
@ 2017-12-21 19:52       ` Leon Romanovsky
  0 siblings, 0 replies; 22+ messages in thread
From: Leon Romanovsky @ 2017-12-21 19:52 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: Doug Ledford, linux-rdma-u79uwXL29TY76Z2rM5mHXA, Mark Bloch, Steve Wise

On Thu, Dec 21, 2017 at 12:14:24PM -0700, Jason Gunthorpe wrote:
> On Thu, Dec 21, 2017 at 08:17:34PM +0200, Leon Romanovsky wrote:
>
> > Leon Romanovsky (14):
> >   RDMA/netlink: Simplify code of autoload modules
> >   RDMA/core: Enforce requirement to hold lists_rwsem semaphore
> >   RDMA/core: Replace open-coded variant of put_device
> >   RDMA/nldev: Refactor nldev handle to be common function
> >   RDMA/core: Provide locked variant of device name to index function
> >   RDMA/netlink: Protect device query from device removal
> >   RDMA/nldev: Protect port query from accidental device removal
>
> These all seem totally unrelated. Shouldn't these cleanups go
> ahead right away? Can you send them as a non-RFC series?
>

I can, but it won't change the fact that we should review those patches anyway.


> Then we are down to something less intimidating to review for resource
> tracking..
>
> Jason

* Re: [RFC PATCH rdma-next 02/14] RDMA/core: Enforce requirement to hold lists_rwsem semaphore
       [not found]             ` <20171221195134.GM2942-U/DQcQFIOTAAJjI8aNfphQ@public.gmane.org>
@ 2017-12-21 19:57               ` Leon Romanovsky
  2017-12-21 20:23               ` Jason Gunthorpe
  1 sibling, 0 replies; 22+ messages in thread
From: Leon Romanovsky @ 2017-12-21 19:57 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: Doug Ledford, linux-rdma-u79uwXL29TY76Z2rM5mHXA, Mark Bloch, Steve Wise

On Thu, Dec 21, 2017 at 09:51:34PM +0200, Leon Romanovsky wrote:
> On Thu, Dec 21, 2017 at 12:12:00PM -0700, Jason Gunthorpe wrote:
> > On Thu, Dec 21, 2017 at 08:17:36PM +0200, Leon Romanovsky wrote:
> > > @@ -526,8 +531,8 @@ int ib_register_device(struct ib_device *device,
> > >  		if (!add_client_context(device, client) && client->add)
> > >  			client->add(device);
> > >
> > > -	device->index = __dev_new_index();
> > >  	down_write(&lists_rwsem);
> > > +	device->index = __dev_new_index();
> >
> > Isn't this hunk a for-rc bugfix??
>
> The whole patch can go to for-rc.
>

But remember that it will create a dependency on for-rc for the
whole series.

> This for-rc/for-next thing is new to me. In RDMA, we didn't bother ourselves too
> much and sent everything to for-next.
>
> Thanks
>
> >
> > Jason

* Re: [RFC PATCH rdma-next 02/14] RDMA/core: Enforce requirement to hold lists_rwsem semaphore
       [not found]             ` <20171221195134.GM2942-U/DQcQFIOTAAJjI8aNfphQ@public.gmane.org>
  2017-12-21 19:57               ` Leon Romanovsky
@ 2017-12-21 20:23               ` Jason Gunthorpe
  1 sibling, 0 replies; 22+ messages in thread
From: Jason Gunthorpe @ 2017-12-21 20:23 UTC (permalink / raw)
  To: Leon Romanovsky
  Cc: Doug Ledford, linux-rdma-u79uwXL29TY76Z2rM5mHXA, Mark Bloch,
	Steve Wise, Dennis Dalessandro

On Thu, Dec 21, 2017 at 09:51:34PM +0200, Leon Romanovsky wrote:

> This for-rc/for-next thing is new to me. In RDMA, we didn't bother ourselves too
> much and sent everything to for-next.

Okay. But I think it helps distros if we classify our work a little
more finely, e.g. anything merged through for-rc should be
inspected more closely for distro or stable backporting.

The dependencies are easy enough to deal with if you mark them in the
cover letter.

And I prefer a workflow where the series are smaller, focused on a
topic, and sent more often, rather than one giant patch bomb per
release. At least so far anyhow.

Though, I know Doug works quite differently.

Jason

end of thread, other threads:[~2017-12-21 20:23 UTC | newest]

Thread overview: 22+ messages
2017-12-21 18:17 [RFC PATCH rdma-next 00/14] RDMA resource tracking Leon Romanovsky
     [not found] ` <20171221181748.17126-1-leon-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
2017-12-21 18:17   ` [RFC PATCH rdma-next 01/14] RDMA/netlink: Simplify code of autoload modules Leon Romanovsky
2017-12-21 18:17   ` [RFC PATCH rdma-next 02/14] RDMA/core: Enforce requirement to hold lists_rwsem semaphore Leon Romanovsky
     [not found]     ` <20171221181748.17126-3-leon-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
2017-12-21 19:12       ` Jason Gunthorpe
     [not found]         ` <20171221191200.GE20015-uk2M96/98Pc@public.gmane.org>
2017-12-21 19:51           ` Leon Romanovsky
     [not found]             ` <20171221195134.GM2942-U/DQcQFIOTAAJjI8aNfphQ@public.gmane.org>
2017-12-21 19:57               ` Leon Romanovsky
2017-12-21 20:23               ` Jason Gunthorpe
2017-12-21 18:17   ` [RFC PATCH rdma-next 03/14] RDMA/core: Replace open-coded variant of put_device Leon Romanovsky
2017-12-21 18:17   ` [RFC PATCH rdma-next 04/14] RDMA/nldev: Refactor nldev handle to be common function Leon Romanovsky
2017-12-21 18:17   ` [RFC PATCH rdma-next 05/14] RDMA/core: Provide locked variant of device name to index function Leon Romanovsky
2017-12-21 18:17   ` [RFC PATCH rdma-next 06/14] RDMA/netlink: Protect device query from device removal Leon Romanovsky
2017-12-21 18:17   ` [RFC PATCH rdma-next 07/14] RDMA/nldev: Protect port query from accidental " Leon Romanovsky
2017-12-21 18:17   ` [RFC PATCH rdma-next 08/14] RDMA/restrack: Add general infrastructure to track RDMA resources Leon Romanovsky
2017-12-21 18:17   ` [RFC PATCH rdma-next 09/14] RDMA/core: Add helper function to create named QPs Leon Romanovsky
2017-12-21 18:17   ` [RFC PATCH rdma-next 10/14] RDMA: Annotate create QP callers Leon Romanovsky
2017-12-21 18:17   ` [RFC PATCH rdma-next 11/14] RDMA/core: Add resource tracking for create and destroy CQs Leon Romanovsky
2017-12-21 18:17   ` [RFC PATCH rdma-next 12/14] RDMA/core: Add resource tracking for create and destroy PDs Leon Romanovsky
2017-12-21 18:17   ` [RFC PATCH rdma-next 13/14] RDMA/nldev: Provide global resource utilization Leon Romanovsky
2017-12-21 18:17   ` [RFC PATCH rdma-next 14/14] RDMA/nldev: Provide detailed QP information Leon Romanovsky
2017-12-21 18:21   ` [RFC PATCH rdma-next 00/14] RDMA resource tracking Leon Romanovsky
2017-12-21 19:14   ` Jason Gunthorpe
     [not found]     ` <20171221191424.GF20015-uk2M96/98Pc@public.gmane.org>
2017-12-21 19:52       ` Leon Romanovsky