* [PATCH V3 for-next 0/7] Change IDR usage and locking in uverbs
@ 2017-04-04 10:31 Matan Barak
From: Matan Barak @ 2017-04-04 10:31 UTC (permalink / raw)
  To: Doug Ledford
  Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA, Liran Liss, Jason Gunthorpe,
	Sean Hefty, Leon Romanovsky, Majd Dibbiny, Tal Alon,
	Yishai Hadas, Ira Weiny, Haggai Eran, Christoph Lameter,
	Matan Barak

Hi Doug,

This series is the first part of introducing the new ABI.
It changes the IDR and locking mechanisms in ib_uverbs.
This allows handling all types in the same way, both IB/core
standard types and driver-specific types. Here, a type refers
either to IDR based uobjects (such as QPs, CQs, etc.) or to FD
based uobjects (such as completion channels).

Previously, we had global per-type IDR tables and a per-type list on
the user context, which implied that the type system was fixed.
This patch series changes that into a per-uverbs-file IDR table
(since in the new proposed ABI each device could have a different
type system and a uverbs-file refers to a specific device) and a
unified per-context objects list. Objects in this list can be either
IDR based or FD based.

A type is represented by an identifier, an allocation size, a free
function used on context teardown or object removal, and a release
order. The allocation function is effectively the handler itself
(for example, create_qp). The release order encodes the relations
between objects; some relations are established by the hardware or
by user-space. For example, an MW could be created before an MR, but
if the MW is bound to the MR, the MW must be released before the MR.
Later on, we'll add actions to each type in an object-oriented
manner. A type therefore contains all the information required for
allocating and freeing its objects.
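
The type descriptor described above might look roughly like the
following userspace sketch (field and function names here are
hypothetical, for illustration only; they are not the kernel API):

```c
#include <stddef.h>

/* Sketch of a type descriptor: an allocation size, a free function
 * used on teardown or removal, and a release order encoding the
 * dependencies between objects (e.g. MWs released before MRs). */
struct uverbs_obj_type_sim {
	size_t obj_size;                         /* allocation size of the uobject */
	void (*destroy)(void *uobj, int reason); /* called on removal or teardown */
	unsigned int destroy_order;              /* lower order is released first */
};

/* Context teardown would release the type with the lower order first. */
static int released_first(const struct uverbs_obj_type_sim *a,
			  const struct uverbs_obj_type_sim *b)
{
	return a->destroy_order < b->destroy_order;
}
```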

Since a type now contains all the information required for freeing
its objects, we refactor all the destroy handlers to simply use this
infrastructure. The reason for the destruction is passed to the free
function.

The locking mechanism is changed as well. Previously, the uverbs_cmd
handlers created new objects themselves and dealt with the internals
of locking/unlocking them. This is now moved to separate code which
either creates a new object, destroys an object, or locks it for
read/write. This is possible because we now have a descriptive type
system, so shared code can create, lock and destroy any type.

In contrast to the previous locking approach, we don't block the
user-space thread if an object is already locked; we just return
-EBUSY and expect the user to handle this. In order to maintain
backward compatibility, we've added explicit locks to the uverbs_cmd
handlers (in non-racy scenarios, of course), so for sane flows the
behaviour is the same as before.
The incentive here is to provide a robust infrastructure for adding
new actions that can't deadlock (e.g. two commands that try to lock
objects in an AB-BA manner).
In addition, since object creation and locking are handled in a
centralized place, the new proposed ABI won't need to deal with them
explicitly in its handlers.
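
The non-blocking read/write semantics can be sketched in userspace
with C11 atomics, following the usecnt convention used by this series
(0 unlocked, -1 write-locked, >0 reader count). This is an
illustration only, not the kernel implementation:

```c
#include <stdatomic.h>
#include <errno.h>

/* usecnt: 0 = unlocked, -1 = locked for write, >0 = reader count. */
struct uobj {
	atomic_int usecnt;
};

/* Try to lock for read; fail with -EBUSY if write-locked. */
static int uobj_try_read(struct uobj *o)
{
	int cur = atomic_load(&o->usecnt);

	while (cur >= 0) {
		/* A failed CAS reloads cur, so we simply retry. */
		if (atomic_compare_exchange_weak(&o->usecnt, &cur, cur + 1))
			return 0;
	}
	return -EBUSY;
}

/* Try to lock for write; only succeeds from the unlocked state. */
static int uobj_try_write(struct uobj *o)
{
	int expected = 0;

	if (atomic_compare_exchange_strong(&o->usecnt, &expected, -1))
		return 0;
	return -EBUSY;
}

static void uobj_read_unlock(struct uobj *o)  { atomic_fetch_sub(&o->usecnt, 1); }
static void uobj_write_unlock(struct uobj *o) { atomic_store(&o->usecnt, 0); }
```

A handler that gets -EBUSY simply returns it to user-space instead of
sleeping, which is what removes the AB-BA deadlock possibility.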

A typical flow of a handler will be:
1. Copy the user-space command into kernel-space structures.
2. Get all objects the handler needs (through the new centralized
     mechanism).
3. Do the actual command.
4. Commit or abort objects allocation/fetching/destruction.
5. Write response to the user.

We use the following locks/krefs:
1. ib_uobject: usecnt (atomic)
	o Protects against concurrent read and write/destroy
	o The possible values are
		- (0):   Unlocked
		- (-1):  Locked for writing
		- (> 0): Locked for reading. The exact value is number
			 of readers.
2. ib_ucontext: uobjects_lock
	o Protects the context's uobjects list from concurrent modification
3. ib_uobject: ref [kref]
	o Lets a handler delay the memory release of a uobject even if
	  it was already removed from the ucontext's objects list
	  (e.g. CQ takes a reference count on the completion handler)
	o rdma_core takes one reference count for all objects in its
	  repository.
4. ib_uverbs_file: cleanup_mutex [existing lock]
	o Protects against a concurrent file close releasing the ucontext.
	  Since removing a uobject requires taking it off the uobjects
	  list under the ucontext lock (clause 2), we need to make sure
	  the ucontext stays alive throughout this process.
5. ib_ucontext: cleanup_rwsem
	o Protects cleanup_context from concurrent lookup_get, remove_commit
	  and alloc_commit.
	o If alloc_commit is running during context cleanup, we remove the
	  object with a special reason.
	o It makes rdma_core more self-contained and less dependent on
	  external locks.
6. ib_uobject_file: idr_lock
	o Protects concurrent write access to the idr.
	o RCU protects idr read access.
7. File reference count:
	o Protects race between lookup_get on a fd uobject and a release of
	  such an object.
	o If we get the file, we have increased its reference count, so
	  the .release file operation won't be reached.
	o Otherwise, the .release file operation is already called (or would
	  be called soon) and then the fd was already removed from the files
	  table of the process.
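
The kref in clause 3 can be sketched as follows (a minimal userspace
illustration of the semantics, not the kernel's struct kref): rdma_core
holds one reference for the repository, a handler takes another, and
the memory is released only when the last reference is dropped.

```c
#include <stdatomic.h>

/* Minimal kref-style reference count; 'freed' stands in for the
 * actual memory release (real code would kfree() the uobject). */
struct uobj_kref {
	atomic_int ref;
	int freed;
};

static void uobj_get(struct uobj_kref *o)
{
	atomic_fetch_add(&o->ref, 1);
}

/* Release runs only when the last reference is dropped. */
static void uobj_put(struct uobj_kref *o)
{
	if (atomic_fetch_sub(&o->ref, 1) == 1)
		o->freed = 1;
}
```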

This patch-set is applied on top of Doug's k.o/for-4.12 branch.

Regards,
Matan

Changes from V2:
1. Rebased on top of Doug's latest branch
    a. Moved the cgroups code into the uobject infrastructure
2. Cleaned up rdma_lookup_get_uobject a bit

Changes from V1 - address Jason's comments:
1. Added ucontext->cleanup_rwsem to make the locking more self contained.
2. When calling a destroy handler, the object's free function is called
   with the appropriate reason.
3. Bug fix of double free when uobject file allocation failed.
4. Wrap the type->ops with appropriate inline functions and macros.
5. Reduce the amount of macros.
6. Add lockdep style debugging and some other safety checks.
7. Replace a release function with needs_rcu bool flag.
8. Rebase the patches.

Changes from V0 - address some of Jason's comments:
1. Change idr table to be per uverbs_file and not per device.
2. Make changes more gradual
    a. Instead of doing a flag day approach, first introduce the idr
       uobjects with a new API and change their usages:
	 o Ditch the live flag
	 o Manage all idr objects through rdma_core
	 o create and cleanup context through rdma_core
    b. Add a lock to preserve old attach_mcast/detach_mcast behaviour.
    c. Add fd based objects infrastructure
    d. Change current completion channel to use this fd infrastructure.
3. Ditch the switch-case enums and favor a more OOP approach through a
   vtable.
4. Instead of having a kref-ed lock, put a new lock on the uverbs_file
   to synchronize fd objects deletion from context removal
    a. We favored this approach over keeping the ucontext alive, as
       the ucontext is allocated by the driver and thus we want to
       deallocate it through the same entity. We don't want to defer
       the whole deallocation process, as in the current state we
       could rmmod the driver's code while applications are still
       running and trigger a disassociate flow.
5. Reduce the amount of macros - use only a macro for declaring idr
   types and a macro to declare fd types.
6. Use kref to manage all uobjects.
7. Use proper types for idr.

Matan Barak (7):
  IB/core: Refactor idr to be per uverbs_file
  IB/core: Add support for idr types
  IB/core: Add idr based standard types
  IB/core: Change idr objects to use the new schema
  IB/core: Add lock to multicast handlers
  IB/core: Add support for fd objects
  IB/core: Change completion channel to use the reworked objects schema

 drivers/infiniband/core/Makefile           |    3 +-
 drivers/infiniband/core/rdma_core.c        |  625 +++++++++++++
 drivers/infiniband/core/rdma_core.h        |   78 ++
 drivers/infiniband/core/uverbs.h           |   53 +-
 drivers/infiniband/core/uverbs_cmd.c       | 1388 ++++++++--------------------
 drivers/infiniband/core/uverbs_main.c      |  450 ++++-----
 drivers/infiniband/core/uverbs_std_types.c |  275 ++++++
 include/rdma/ib_verbs.h                    |   41 +-
 include/rdma/uverbs_std_types.h            |  114 +++
 include/rdma/uverbs_types.h                |  171 ++++
 10 files changed, 1863 insertions(+), 1335 deletions(-)
 create mode 100644 drivers/infiniband/core/rdma_core.c
 create mode 100644 drivers/infiniband/core/rdma_core.h
 create mode 100644 drivers/infiniband/core/uverbs_std_types.c
 create mode 100644 include/rdma/uverbs_std_types.h
 create mode 100644 include/rdma/uverbs_types.h

-- 
1.8.3.1



* [PATCH V3 for-next 1/7] IB/core: Refactor idr to be per uverbs_file
From: Matan Barak @ 2017-04-04 10:31 UTC (permalink / raw)
  To: Doug Ledford
  Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA, Liran Liss, Jason Gunthorpe,
	Sean Hefty, Leon Romanovsky, Majd Dibbiny, Tal Alon,
	Yishai Hadas, Ira Weiny, Haggai Eran, Christoph Lameter,
	Matan Barak

The current code creates an idr per object type. Since the types are
currently common to all drivers and known in advance, this was good
enough. However, the proposed ioctl based infrastructure allows each
driver to declare only some of the common types and to declare its
own specific types.

Thus, we change the idr to be per uverbs_file.

Signed-off-by: Matan Barak <matanb-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
Signed-off-by: Leon Romanovsky <leonro-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
Signed-off-by: Haggai Eran <haggaie-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
---
 drivers/infiniband/core/uverbs.h      |  19 ++--
 drivers/infiniband/core/uverbs_cmd.c  | 157 ++++++++++++++++------------------
 drivers/infiniband/core/uverbs_main.c |  45 +++-------
 include/rdma/ib_verbs.h               |   1 +
 4 files changed, 95 insertions(+), 127 deletions(-)

diff --git a/drivers/infiniband/core/uverbs.h b/drivers/infiniband/core/uverbs.h
index e1bedf0..6215735 100644
--- a/drivers/infiniband/core/uverbs.h
+++ b/drivers/infiniband/core/uverbs.h
@@ -123,6 +123,10 @@ struct ib_uverbs_file {
 	struct ib_uverbs_event_file	       *async_file;
 	struct list_head			list;
 	int					is_closed;
+
+	struct idr		idr;
+	/* spinlock protects write access to idr */
+	spinlock_t		idr_lock;
 };
 
 struct ib_uverbs_event {
@@ -176,20 +180,7 @@ struct ib_ucq_object {
 	u32			async_events_reported;
 };
 
-extern spinlock_t ib_uverbs_idr_lock;
-extern struct idr ib_uverbs_pd_idr;
-extern struct idr ib_uverbs_mr_idr;
-extern struct idr ib_uverbs_mw_idr;
-extern struct idr ib_uverbs_ah_idr;
-extern struct idr ib_uverbs_cq_idr;
-extern struct idr ib_uverbs_qp_idr;
-extern struct idr ib_uverbs_srq_idr;
-extern struct idr ib_uverbs_xrcd_idr;
-extern struct idr ib_uverbs_rule_idr;
-extern struct idr ib_uverbs_wq_idr;
-extern struct idr ib_uverbs_rwq_ind_tbl_idr;
-
-void idr_remove_uobj(struct idr *idp, struct ib_uobject *uobj);
+void idr_remove_uobj(struct ib_uobject *uobj);
 
 struct file *ib_uverbs_alloc_event_file(struct ib_uverbs_file *uverbs_file,
 					struct ib_device *ib_dev,
diff --git a/drivers/infiniband/core/uverbs_cmd.c b/drivers/infiniband/core/uverbs_cmd.c
index 7b7a76e..03c4f68 100644
--- a/drivers/infiniband/core/uverbs_cmd.c
+++ b/drivers/infiniband/core/uverbs_cmd.c
@@ -120,37 +120,36 @@ static void put_uobj_write(struct ib_uobject *uobj)
 	put_uobj(uobj);
 }
 
-static int idr_add_uobj(struct idr *idr, struct ib_uobject *uobj)
+static int idr_add_uobj(struct ib_uobject *uobj)
 {
 	int ret;
 
 	idr_preload(GFP_KERNEL);
-	spin_lock(&ib_uverbs_idr_lock);
+	spin_lock(&uobj->context->ufile->idr_lock);
 
-	ret = idr_alloc(idr, uobj, 0, 0, GFP_NOWAIT);
+	ret = idr_alloc(&uobj->context->ufile->idr, uobj, 0, 0, GFP_NOWAIT);
 	if (ret >= 0)
 		uobj->id = ret;
 
-	spin_unlock(&ib_uverbs_idr_lock);
+	spin_unlock(&uobj->context->ufile->idr_lock);
 	idr_preload_end();
 
 	return ret < 0 ? ret : 0;
 }
 
-void idr_remove_uobj(struct idr *idr, struct ib_uobject *uobj)
+void idr_remove_uobj(struct ib_uobject *uobj)
 {
-	spin_lock(&ib_uverbs_idr_lock);
-	idr_remove(idr, uobj->id);
-	spin_unlock(&ib_uverbs_idr_lock);
+	spin_lock(&uobj->context->ufile->idr_lock);
+	idr_remove(&uobj->context->ufile->idr, uobj->id);
+	spin_unlock(&uobj->context->ufile->idr_lock);
 }
 
-static struct ib_uobject *__idr_get_uobj(struct idr *idr, int id,
-					 struct ib_ucontext *context)
+static struct ib_uobject *__idr_get_uobj(int id, struct ib_ucontext *context)
 {
 	struct ib_uobject *uobj;
 
 	rcu_read_lock();
-	uobj = idr_find(idr, id);
+	uobj = idr_find(&context->ufile->idr, id);
 	if (uobj) {
 		if (uobj->context == context)
 			kref_get(&uobj->ref);
@@ -162,12 +161,12 @@ static struct ib_uobject *__idr_get_uobj(struct idr *idr, int id,
 	return uobj;
 }
 
-static struct ib_uobject *idr_read_uobj(struct idr *idr, int id,
-					struct ib_ucontext *context, int nested)
+static struct ib_uobject *idr_read_uobj(int id, struct ib_ucontext *context,
+					int nested)
 {
 	struct ib_uobject *uobj;
 
-	uobj = __idr_get_uobj(idr, id, context);
+	uobj = __idr_get_uobj(id, context);
 	if (!uobj)
 		return NULL;
 
@@ -183,12 +182,11 @@ static struct ib_uobject *idr_read_uobj(struct idr *idr, int id,
 	return uobj;
 }
 
-static struct ib_uobject *idr_write_uobj(struct idr *idr, int id,
-					 struct ib_ucontext *context)
+static struct ib_uobject *idr_write_uobj(int id, struct ib_ucontext *context)
 {
 	struct ib_uobject *uobj;
 
-	uobj = __idr_get_uobj(idr, id, context);
+	uobj = __idr_get_uobj(id, context);
 	if (!uobj)
 		return NULL;
 
@@ -201,18 +199,18 @@ static struct ib_uobject *idr_write_uobj(struct idr *idr, int id,
 	return uobj;
 }
 
-static void *idr_read_obj(struct idr *idr, int id, struct ib_ucontext *context,
+static void *idr_read_obj(int id, struct ib_ucontext *context,
 			  int nested)
 {
 	struct ib_uobject *uobj;
 
-	uobj = idr_read_uobj(idr, id, context, nested);
+	uobj = idr_read_uobj(id, context, nested);
 	return uobj ? uobj->object : NULL;
 }
 
 static struct ib_pd *idr_read_pd(int pd_handle, struct ib_ucontext *context)
 {
-	return idr_read_obj(&ib_uverbs_pd_idr, pd_handle, context, 0);
+	return idr_read_obj(pd_handle, context, 0);
 }
 
 static void put_pd_read(struct ib_pd *pd)
@@ -222,7 +220,7 @@ static void put_pd_read(struct ib_pd *pd)
 
 static struct ib_cq *idr_read_cq(int cq_handle, struct ib_ucontext *context, int nested)
 {
-	return idr_read_obj(&ib_uverbs_cq_idr, cq_handle, context, nested);
+	return idr_read_obj(cq_handle, context, nested);
 }
 
 static void put_cq_read(struct ib_cq *cq)
@@ -232,7 +230,7 @@ static void put_cq_read(struct ib_cq *cq)
 
 static struct ib_ah *idr_read_ah(int ah_handle, struct ib_ucontext *context)
 {
-	return idr_read_obj(&ib_uverbs_ah_idr, ah_handle, context, 0);
+	return idr_read_obj(ah_handle, context, 0);
 }
 
 static void put_ah_read(struct ib_ah *ah)
@@ -242,12 +240,12 @@ static void put_ah_read(struct ib_ah *ah)
 
 static struct ib_qp *idr_read_qp(int qp_handle, struct ib_ucontext *context)
 {
-	return idr_read_obj(&ib_uverbs_qp_idr, qp_handle, context, 0);
+	return idr_read_obj(qp_handle, context, 0);
 }
 
 static struct ib_wq *idr_read_wq(int wq_handle, struct ib_ucontext *context)
 {
-	return idr_read_obj(&ib_uverbs_wq_idr, wq_handle, context, 0);
+	return idr_read_obj(wq_handle, context, 0);
 }
 
 static void put_wq_read(struct ib_wq *wq)
@@ -258,7 +256,7 @@ static void put_wq_read(struct ib_wq *wq)
 static struct ib_rwq_ind_table *idr_read_rwq_indirection_table(int ind_table_handle,
 							       struct ib_ucontext *context)
 {
-	return idr_read_obj(&ib_uverbs_rwq_ind_tbl_idr, ind_table_handle, context, 0);
+	return idr_read_obj(ind_table_handle, context, 0);
 }
 
 static void put_rwq_indirection_table_read(struct ib_rwq_ind_table *ind_table)
@@ -270,7 +268,7 @@ static struct ib_qp *idr_write_qp(int qp_handle, struct ib_ucontext *context)
 {
 	struct ib_uobject *uobj;
 
-	uobj = idr_write_uobj(&ib_uverbs_qp_idr, qp_handle, context);
+	uobj = idr_write_uobj(qp_handle, context);
 	return uobj ? uobj->object : NULL;
 }
 
@@ -286,7 +284,7 @@ static void put_qp_write(struct ib_qp *qp)
 
 static struct ib_srq *idr_read_srq(int srq_handle, struct ib_ucontext *context)
 {
-	return idr_read_obj(&ib_uverbs_srq_idr, srq_handle, context, 0);
+	return idr_read_obj(srq_handle, context, 0);
 }
 
 static void put_srq_read(struct ib_srq *srq)
@@ -297,7 +295,7 @@ static void put_srq_read(struct ib_srq *srq)
 static struct ib_xrcd *idr_read_xrcd(int xrcd_handle, struct ib_ucontext *context,
 				     struct ib_uobject **uobj)
 {
-	*uobj = idr_read_uobj(&ib_uverbs_xrcd_idr, xrcd_handle, context, 0);
+	*uobj = idr_read_uobj(xrcd_handle, context, 0);
 	return *uobj ? (*uobj)->object : NULL;
 }
 
@@ -305,7 +303,6 @@ static void put_xrcd_read(struct ib_uobject *uobj)
 {
 	put_uobj_read(uobj);
 }
-
 ssize_t ib_uverbs_get_context(struct ib_uverbs_file *file,
 			      struct ib_device *ib_dev,
 			      const char __user *buf,
@@ -348,6 +345,8 @@ ssize_t ib_uverbs_get_context(struct ib_uverbs_file *file,
 
 	ucontext->device = ib_dev;
 	ucontext->cg_obj = cg_obj;
+	/* ufile is required when some objects are released */
+	ucontext->ufile = file;
 	INIT_LIST_HEAD(&ucontext->pd_list);
 	INIT_LIST_HEAD(&ucontext->mr_list);
 	INIT_LIST_HEAD(&ucontext->mw_list);
@@ -591,7 +590,7 @@ ssize_t ib_uverbs_alloc_pd(struct ib_uverbs_file *file,
 	atomic_set(&pd->usecnt, 0);
 
 	uobj->object = pd;
-	ret = idr_add_uobj(&ib_uverbs_pd_idr, uobj);
+	ret = idr_add_uobj(uobj);
 	if (ret)
 		goto err_idr;
 
@@ -615,7 +614,7 @@ ssize_t ib_uverbs_alloc_pd(struct ib_uverbs_file *file,
 	return in_len;
 
 err_copy:
-	idr_remove_uobj(&ib_uverbs_pd_idr, uobj);
+	idr_remove_uobj(uobj);
 
 err_idr:
 	ib_dealloc_pd(pd);
@@ -639,7 +638,7 @@ ssize_t ib_uverbs_dealloc_pd(struct ib_uverbs_file *file,
 	if (copy_from_user(&cmd, buf, sizeof cmd))
 		return -EFAULT;
 
-	uobj = idr_write_uobj(&ib_uverbs_pd_idr, cmd.pd_handle, file->ucontext);
+	uobj = idr_write_uobj(cmd.pd_handle, file->ucontext);
 	if (!uobj)
 		return -EINVAL;
 	pd = uobj->object;
@@ -659,7 +658,7 @@ ssize_t ib_uverbs_dealloc_pd(struct ib_uverbs_file *file,
 	uobj->live = 0;
 	put_uobj_write(uobj);
 
-	idr_remove_uobj(&ib_uverbs_pd_idr, uobj);
+	idr_remove_uobj(uobj);
 
 	mutex_lock(&file->mutex);
 	list_del(&uobj->list);
@@ -835,7 +834,7 @@ ssize_t ib_uverbs_open_xrcd(struct ib_uverbs_file *file,
 
 	atomic_set(&obj->refcnt, 0);
 	obj->uobject.object = xrcd;
-	ret = idr_add_uobj(&ib_uverbs_xrcd_idr, &obj->uobject);
+	ret = idr_add_uobj(&obj->uobject);
 	if (ret)
 		goto err_idr;
 
@@ -879,7 +878,7 @@ ssize_t ib_uverbs_open_xrcd(struct ib_uverbs_file *file,
 	}
 
 err_insert_xrcd:
-	idr_remove_uobj(&ib_uverbs_xrcd_idr, &obj->uobject);
+	idr_remove_uobj(&obj->uobject);
 
 err_idr:
 	ib_dealloc_xrcd(xrcd);
@@ -913,7 +912,7 @@ ssize_t ib_uverbs_close_xrcd(struct ib_uverbs_file *file,
 		return -EFAULT;
 
 	mutex_lock(&file->device->xrcd_tree_mutex);
-	uobj = idr_write_uobj(&ib_uverbs_xrcd_idr, cmd.xrcd_handle, file->ucontext);
+	uobj = idr_write_uobj(cmd.xrcd_handle, file->ucontext);
 	if (!uobj) {
 		ret = -EINVAL;
 		goto out;
@@ -946,7 +945,7 @@ ssize_t ib_uverbs_close_xrcd(struct ib_uverbs_file *file,
 	if (inode && !live)
 		xrcd_table_delete(file->device, inode);
 
-	idr_remove_uobj(&ib_uverbs_xrcd_idr, uobj);
+	idr_remove_uobj(uobj);
 	mutex_lock(&file->mutex);
 	list_del(&uobj->list);
 	mutex_unlock(&file->mutex);
@@ -1043,7 +1042,7 @@ ssize_t ib_uverbs_reg_mr(struct ib_uverbs_file *file,
 	atomic_inc(&pd->usecnt);
 
 	uobj->object = mr;
-	ret = idr_add_uobj(&ib_uverbs_mr_idr, uobj);
+	ret = idr_add_uobj(uobj);
 	if (ret)
 		goto err_unreg;
 
@@ -1071,7 +1070,7 @@ ssize_t ib_uverbs_reg_mr(struct ib_uverbs_file *file,
 	return in_len;
 
 err_copy:
-	idr_remove_uobj(&ib_uverbs_mr_idr, uobj);
+	idr_remove_uobj(uobj);
 
 err_unreg:
 	ib_dereg_mr(mr);
@@ -1119,8 +1118,7 @@ ssize_t ib_uverbs_rereg_mr(struct ib_uverbs_file *file,
 	     (cmd.start & ~PAGE_MASK) != (cmd.hca_va & ~PAGE_MASK)))
 			return -EINVAL;
 
-	uobj = idr_write_uobj(&ib_uverbs_mr_idr, cmd.mr_handle,
-			      file->ucontext);
+	uobj = idr_write_uobj(cmd.mr_handle, file->ucontext);
 
 	if (!uobj)
 		return -EINVAL;
@@ -1189,7 +1187,7 @@ ssize_t ib_uverbs_dereg_mr(struct ib_uverbs_file *file,
 	if (copy_from_user(&cmd, buf, sizeof cmd))
 		return -EFAULT;
 
-	uobj = idr_write_uobj(&ib_uverbs_mr_idr, cmd.mr_handle, file->ucontext);
+	uobj = idr_write_uobj(cmd.mr_handle, file->ucontext);
 	if (!uobj)
 		return -EINVAL;
 
@@ -1205,8 +1203,7 @@ ssize_t ib_uverbs_dereg_mr(struct ib_uverbs_file *file,
 		return ret;
 
 	ib_rdmacg_uncharge(&uobj->cg_obj, ib_dev, RDMACG_RESOURCE_HCA_OBJECT);
-
-	idr_remove_uobj(&ib_uverbs_mr_idr, uobj);
+	idr_remove_uobj(uobj);
 
 	mutex_lock(&file->mutex);
 	list_del(&uobj->list);
@@ -1271,7 +1268,7 @@ ssize_t ib_uverbs_alloc_mw(struct ib_uverbs_file *file,
 	atomic_inc(&pd->usecnt);
 
 	uobj->object = mw;
-	ret = idr_add_uobj(&ib_uverbs_mw_idr, uobj);
+	ret = idr_add_uobj(uobj);
 	if (ret)
 		goto err_unalloc;
 
@@ -1298,7 +1295,7 @@ ssize_t ib_uverbs_alloc_mw(struct ib_uverbs_file *file,
 	return in_len;
 
 err_copy:
-	idr_remove_uobj(&ib_uverbs_mw_idr, uobj);
+	idr_remove_uobj(uobj);
 
 err_unalloc:
 	uverbs_dealloc_mw(mw);
@@ -1327,7 +1324,7 @@ ssize_t ib_uverbs_dealloc_mw(struct ib_uverbs_file *file,
 	if (copy_from_user(&cmd, buf, sizeof(cmd)))
 		return -EFAULT;
 
-	uobj = idr_write_uobj(&ib_uverbs_mw_idr, cmd.mw_handle, file->ucontext);
+	uobj = idr_write_uobj(cmd.mw_handle, file->ucontext);
 	if (!uobj)
 		return -EINVAL;
 
@@ -1343,8 +1340,7 @@ ssize_t ib_uverbs_dealloc_mw(struct ib_uverbs_file *file,
 		return ret;
 
 	ib_rdmacg_uncharge(&uobj->cg_obj, ib_dev, RDMACG_RESOURCE_HCA_OBJECT);
-
-	idr_remove_uobj(&ib_uverbs_mw_idr, uobj);
+	idr_remove_uobj(uobj);
 
 	mutex_lock(&file->mutex);
 	list_del(&uobj->list);
@@ -1463,7 +1459,7 @@ static struct ib_ucq_object *create_cq(struct ib_uverbs_file *file,
 	atomic_set(&cq->usecnt, 0);
 
 	obj->uobject.object = cq;
-	ret = idr_add_uobj(&ib_uverbs_cq_idr, &obj->uobject);
+	ret = idr_add_uobj(&obj->uobject);
 	if (ret)
 		goto err_free;
 
@@ -1489,7 +1485,7 @@ static struct ib_ucq_object *create_cq(struct ib_uverbs_file *file,
 	return obj;
 
 err_cb:
-	idr_remove_uobj(&ib_uverbs_cq_idr, &obj->uobject);
+	idr_remove_uobj(&obj->uobject);
 
 err_free:
 	ib_destroy_cq(cq);
@@ -1763,7 +1759,7 @@ ssize_t ib_uverbs_destroy_cq(struct ib_uverbs_file *file,
 	if (copy_from_user(&cmd, buf, sizeof cmd))
 		return -EFAULT;
 
-	uobj = idr_write_uobj(&ib_uverbs_cq_idr, cmd.cq_handle, file->ucontext);
+	uobj = idr_write_uobj(cmd.cq_handle, file->ucontext);
 	if (!uobj)
 		return -EINVAL;
 	cq      = uobj->object;
@@ -1780,8 +1776,7 @@ ssize_t ib_uverbs_destroy_cq(struct ib_uverbs_file *file,
 		return ret;
 
 	ib_rdmacg_uncharge(&uobj->cg_obj, ib_dev, RDMACG_RESOURCE_HCA_OBJECT);
-
-	idr_remove_uobj(&ib_uverbs_cq_idr, uobj);
+	idr_remove_uobj(uobj);
 
 	mutex_lock(&file->mutex);
 	list_del(&uobj->list);
@@ -1994,7 +1989,7 @@ static int create_qp(struct ib_uverbs_file *file,
 	qp->uobject = &obj->uevent.uobject;
 
 	obj->uevent.uobject.object = qp;
-	ret = idr_add_uobj(&ib_uverbs_qp_idr, &obj->uevent.uobject);
+	ret = idr_add_uobj(&obj->uevent.uobject);
 	if (ret)
 		goto err_destroy;
 
@@ -2042,7 +2037,7 @@ static int create_qp(struct ib_uverbs_file *file,
 
 	return 0;
 err_cb:
-	idr_remove_uobj(&ib_uverbs_qp_idr, &obj->uevent.uobject);
+	idr_remove_uobj(&obj->uevent.uobject);
 
 err_destroy:
 	ib_destroy_qp(qp);
@@ -2232,7 +2227,7 @@ ssize_t ib_uverbs_open_qp(struct ib_uverbs_file *file,
 	qp->uobject = &obj->uevent.uobject;
 
 	obj->uevent.uobject.object = qp;
-	ret = idr_add_uobj(&ib_uverbs_qp_idr, &obj->uevent.uobject);
+	ret = idr_add_uobj(&obj->uevent.uobject);
 	if (ret)
 		goto err_destroy;
 
@@ -2261,7 +2256,7 @@ ssize_t ib_uverbs_open_qp(struct ib_uverbs_file *file,
 	return in_len;
 
 err_remove:
-	idr_remove_uobj(&ib_uverbs_qp_idr, &obj->uevent.uobject);
+	idr_remove_uobj(&obj->uevent.uobject);
 
 err_destroy:
 	ib_destroy_qp(qp);
@@ -2557,7 +2552,7 @@ ssize_t ib_uverbs_destroy_qp(struct ib_uverbs_file *file,
 
 	memset(&resp, 0, sizeof resp);
 
-	uobj = idr_write_uobj(&ib_uverbs_qp_idr, cmd.qp_handle, file->ucontext);
+	uobj = idr_write_uobj(cmd.qp_handle, file->ucontext);
 	if (!uobj)
 		return -EINVAL;
 	qp  = uobj->object;
@@ -2582,7 +2577,7 @@ ssize_t ib_uverbs_destroy_qp(struct ib_uverbs_file *file,
 	if (obj->uxrcd)
 		atomic_dec(&obj->uxrcd->refcnt);
 
-	idr_remove_uobj(&ib_uverbs_qp_idr, uobj);
+	idr_remove_uobj(uobj);
 
 	mutex_lock(&file->mutex);
 	list_del(&uobj->list);
@@ -3048,7 +3043,7 @@ ssize_t ib_uverbs_create_ah(struct ib_uverbs_file *file,
 	ah->uobject  = uobj;
 	uobj->object = ah;
 
-	ret = idr_add_uobj(&ib_uverbs_ah_idr, uobj);
+	ret = idr_add_uobj(uobj);
 	if (ret)
 		goto err_destroy;
 
@@ -3073,7 +3068,7 @@ ssize_t ib_uverbs_create_ah(struct ib_uverbs_file *file,
 	return in_len;
 
 err_copy:
-	idr_remove_uobj(&ib_uverbs_ah_idr, uobj);
+	idr_remove_uobj(uobj);
 
 err_destroy:
 	ib_destroy_ah(ah);
@@ -3101,7 +3096,7 @@ ssize_t ib_uverbs_destroy_ah(struct ib_uverbs_file *file,
 	if (copy_from_user(&cmd, buf, sizeof cmd))
 		return -EFAULT;
 
-	uobj = idr_write_uobj(&ib_uverbs_ah_idr, cmd.ah_handle, file->ucontext);
+	uobj = idr_write_uobj(cmd.ah_handle, file->ucontext);
 	if (!uobj)
 		return -EINVAL;
 	ah = uobj->object;
@@ -3116,8 +3111,7 @@ ssize_t ib_uverbs_destroy_ah(struct ib_uverbs_file *file,
 		return ret;
 
 	ib_rdmacg_uncharge(&uobj->cg_obj, ib_dev, RDMACG_RESOURCE_HCA_OBJECT);
-
-	idr_remove_uobj(&ib_uverbs_ah_idr, uobj);
+	idr_remove_uobj(uobj);
 
 	mutex_lock(&file->mutex);
 	list_del(&uobj->list);
@@ -3450,7 +3444,7 @@ int ib_uverbs_ex_create_wq(struct ib_uverbs_file *file,
 	atomic_inc(&cq->usecnt);
 	wq->uobject = &obj->uevent.uobject;
 	obj->uevent.uobject.object = wq;
-	err = idr_add_uobj(&ib_uverbs_wq_idr, &obj->uevent.uobject);
+	err = idr_add_uobj(&obj->uevent.uobject);
 	if (err)
 		goto destroy_wq;
 
@@ -3477,7 +3471,7 @@ int ib_uverbs_ex_create_wq(struct ib_uverbs_file *file,
 	return 0;
 
 err_copy:
-	idr_remove_uobj(&ib_uverbs_wq_idr, &obj->uevent.uobject);
+	idr_remove_uobj(&obj->uevent.uobject);
 destroy_wq:
 	ib_destroy_wq(wq);
 err_put_cq:
@@ -3526,7 +3520,7 @@ int ib_uverbs_ex_destroy_wq(struct ib_uverbs_file *file,
 		return -EOPNOTSUPP;
 
 	resp.response_length = required_resp_len;
-	uobj = idr_write_uobj(&ib_uverbs_wq_idr, cmd.wq_handle,
+	uobj = idr_write_uobj(cmd.wq_handle,
 			      file->ucontext);
 	if (!uobj)
 		return -EINVAL;
@@ -3541,7 +3535,7 @@ int ib_uverbs_ex_destroy_wq(struct ib_uverbs_file *file,
 	if (ret)
 		return ret;
 
-	idr_remove_uobj(&ib_uverbs_wq_idr, uobj);
+	idr_remove_uobj(uobj);
 
 	mutex_lock(&file->mutex);
 	list_del(&uobj->list);
@@ -3713,7 +3707,7 @@ int ib_uverbs_ex_create_rwq_ind_table(struct ib_uverbs_file *file,
 	for (i = 0; i < num_wq_handles; i++)
 		atomic_inc(&wqs[i]->usecnt);
 
-	err = idr_add_uobj(&ib_uverbs_rwq_ind_tbl_idr, uobj);
+	err = idr_add_uobj(uobj);
 	if (err)
 		goto destroy_ind_tbl;
 
@@ -3741,7 +3735,7 @@ int ib_uverbs_ex_create_rwq_ind_table(struct ib_uverbs_file *file,
 	return 0;
 
 err_copy:
-	idr_remove_uobj(&ib_uverbs_rwq_ind_tbl_idr, uobj);
+	idr_remove_uobj(uobj);
 destroy_ind_tbl:
 	ib_destroy_rwq_ind_table(rwq_ind_tbl);
 err_uobj:
@@ -3784,7 +3778,7 @@ int ib_uverbs_ex_destroy_rwq_ind_table(struct ib_uverbs_file *file,
 	if (cmd.comp_mask)
 		return -EOPNOTSUPP;
 
-	uobj = idr_write_uobj(&ib_uverbs_rwq_ind_tbl_idr, cmd.ind_tbl_handle,
+	uobj = idr_write_uobj(cmd.ind_tbl_handle,
 			      file->ucontext);
 	if (!uobj)
 		return -EINVAL;
@@ -3800,7 +3794,7 @@ int ib_uverbs_ex_destroy_rwq_ind_table(struct ib_uverbs_file *file,
 	if (ret)
 		return ret;
 
-	idr_remove_uobj(&ib_uverbs_rwq_ind_tbl_idr, uobj);
+	idr_remove_uobj(uobj);
 
 	mutex_lock(&file->mutex);
 	list_del(&uobj->list);
@@ -3945,7 +3939,7 @@ int ib_uverbs_ex_create_flow(struct ib_uverbs_file *file,
 	flow_id->uobject = uobj;
 	uobj->object = flow_id;
 
-	err = idr_add_uobj(&ib_uverbs_rule_idr, uobj);
+	err = idr_add_uobj(uobj);
 	if (err)
 		goto destroy_flow;
 
@@ -3970,7 +3964,7 @@ int ib_uverbs_ex_create_flow(struct ib_uverbs_file *file,
 		kfree(kern_flow_attr);
 	return 0;
 err_copy:
-	idr_remove_uobj(&ib_uverbs_rule_idr, uobj);
+	idr_remove_uobj(uobj);
 destroy_flow:
 	ib_destroy_flow(flow_id);
 err_create:
@@ -4007,8 +4001,7 @@ int ib_uverbs_ex_destroy_flow(struct ib_uverbs_file *file,
 	if (cmd.comp_mask)
 		return -EINVAL;
 
-	uobj = idr_write_uobj(&ib_uverbs_rule_idr, cmd.flow_handle,
-			      file->ucontext);
+	uobj = idr_write_uobj(cmd.flow_handle, file->ucontext);
 	if (!uobj)
 		return -EINVAL;
 	flow_id = uobj->object;
@@ -4022,7 +4015,7 @@ int ib_uverbs_ex_destroy_flow(struct ib_uverbs_file *file,
 
 	put_uobj_write(uobj);
 
-	idr_remove_uobj(&ib_uverbs_rule_idr, uobj);
+	idr_remove_uobj(uobj);
 
 	mutex_lock(&file->mutex);
 	list_del(&uobj->list);
@@ -4115,7 +4108,7 @@ static int __uverbs_create_xsrq(struct ib_uverbs_file *file,
 	atomic_set(&srq->usecnt, 0);
 
 	obj->uevent.uobject.object = srq;
-	ret = idr_add_uobj(&ib_uverbs_srq_idr, &obj->uevent.uobject);
+	ret = idr_add_uobj(&obj->uevent.uobject);
 	if (ret)
 		goto err_destroy;
 
@@ -4149,7 +4142,7 @@ static int __uverbs_create_xsrq(struct ib_uverbs_file *file,
 	return 0;
 
 err_copy:
-	idr_remove_uobj(&ib_uverbs_srq_idr, &obj->uevent.uobject);
+	idr_remove_uobj(&obj->uevent.uobject);
 
 err_destroy:
 	ib_destroy_srq(srq);
@@ -4327,7 +4320,7 @@ ssize_t ib_uverbs_destroy_srq(struct ib_uverbs_file *file,
 	if (copy_from_user(&cmd, buf, sizeof cmd))
 		return -EFAULT;
 
-	uobj = idr_write_uobj(&ib_uverbs_srq_idr, cmd.srq_handle, file->ucontext);
+	uobj = idr_write_uobj(cmd.srq_handle, file->ucontext);
 	if (!uobj)
 		return -EINVAL;
 	srq = uobj->object;
@@ -4350,7 +4343,7 @@ ssize_t ib_uverbs_destroy_srq(struct ib_uverbs_file *file,
 		atomic_dec(&us->uxrcd->refcnt);
 	}
 
-	idr_remove_uobj(&ib_uverbs_srq_idr, uobj);
+	idr_remove_uobj(uobj);
 
 	mutex_lock(&file->mutex);
 	list_del(&uobj->list);
diff --git a/drivers/infiniband/core/uverbs_main.c b/drivers/infiniband/core/uverbs_main.c
index 35c788a..f6812fb 100644
--- a/drivers/infiniband/core/uverbs_main.c
+++ b/drivers/infiniband/core/uverbs_main.c
@@ -67,19 +67,6 @@ enum {
 
 static struct class *uverbs_class;
 
-DEFINE_SPINLOCK(ib_uverbs_idr_lock);
-DEFINE_IDR(ib_uverbs_pd_idr);
-DEFINE_IDR(ib_uverbs_mr_idr);
-DEFINE_IDR(ib_uverbs_mw_idr);
-DEFINE_IDR(ib_uverbs_ah_idr);
-DEFINE_IDR(ib_uverbs_cq_idr);
-DEFINE_IDR(ib_uverbs_qp_idr);
-DEFINE_IDR(ib_uverbs_srq_idr);
-DEFINE_IDR(ib_uverbs_xrcd_idr);
-DEFINE_IDR(ib_uverbs_rule_idr);
-DEFINE_IDR(ib_uverbs_wq_idr);
-DEFINE_IDR(ib_uverbs_rwq_ind_tbl_idr);
-
 static DEFINE_SPINLOCK(map_lock);
 static DECLARE_BITMAP(dev_map, IB_UVERBS_MAX_DEVICES);
 
@@ -236,7 +223,7 @@ static int ib_uverbs_cleanup_ucontext(struct ib_uverbs_file *file,
 	list_for_each_entry_safe(uobj, tmp, &context->ah_list, list) {
 		struct ib_ah *ah = uobj->object;
 
-		idr_remove_uobj(&ib_uverbs_ah_idr, uobj);
+		idr_remove_uobj(uobj);
 		ib_destroy_ah(ah);
 		ib_rdmacg_uncharge(&uobj->cg_obj, context->device,
 				   RDMACG_RESOURCE_HCA_OBJECT);
@@ -247,7 +234,7 @@ static int ib_uverbs_cleanup_ucontext(struct ib_uverbs_file *file,
 	list_for_each_entry_safe(uobj, tmp, &context->mw_list, list) {
 		struct ib_mw *mw = uobj->object;
 
-		idr_remove_uobj(&ib_uverbs_mw_idr, uobj);
+		idr_remove_uobj(uobj);
 		uverbs_dealloc_mw(mw);
 		ib_rdmacg_uncharge(&uobj->cg_obj, context->device,
 				   RDMACG_RESOURCE_HCA_OBJECT);
@@ -257,7 +244,7 @@ static int ib_uverbs_cleanup_ucontext(struct ib_uverbs_file *file,
 	list_for_each_entry_safe(uobj, tmp, &context->rule_list, list) {
 		struct ib_flow *flow_id = uobj->object;
 
-		idr_remove_uobj(&ib_uverbs_rule_idr, uobj);
+		idr_remove_uobj(uobj);
 		ib_destroy_flow(flow_id);
 		ib_rdmacg_uncharge(&uobj->cg_obj, context->device,
 				   RDMACG_RESOURCE_HCA_OBJECT);
@@ -269,7 +256,7 @@ static int ib_uverbs_cleanup_ucontext(struct ib_uverbs_file *file,
 		struct ib_uqp_object *uqp =
 			container_of(uobj, struct ib_uqp_object, uevent.uobject);
 
-		idr_remove_uobj(&ib_uverbs_qp_idr, uobj);
+		idr_remove_uobj(uobj);
 		if (qp == qp->real_qp)
 			ib_uverbs_detach_umcast(qp, uqp);
 		ib_destroy_qp(qp);
@@ -283,7 +270,7 @@ static int ib_uverbs_cleanup_ucontext(struct ib_uverbs_file *file,
 		struct ib_rwq_ind_table *rwq_ind_tbl = uobj->object;
 		struct ib_wq **ind_tbl = rwq_ind_tbl->ind_tbl;
 
-		idr_remove_uobj(&ib_uverbs_rwq_ind_tbl_idr, uobj);
+		idr_remove_uobj(uobj);
 		ib_destroy_rwq_ind_table(rwq_ind_tbl);
 		kfree(ind_tbl);
 		kfree(uobj);
@@ -294,7 +281,7 @@ static int ib_uverbs_cleanup_ucontext(struct ib_uverbs_file *file,
 		struct ib_uwq_object *uwq =
 			container_of(uobj, struct ib_uwq_object, uevent.uobject);
 
-		idr_remove_uobj(&ib_uverbs_wq_idr, uobj);
+		idr_remove_uobj(uobj);
 		ib_destroy_wq(wq);
 		ib_uverbs_release_uevent(file, &uwq->uevent);
 		kfree(uwq);
@@ -305,7 +292,7 @@ static int ib_uverbs_cleanup_ucontext(struct ib_uverbs_file *file,
 		struct ib_uevent_object *uevent =
 			container_of(uobj, struct ib_uevent_object, uobject);
 
-		idr_remove_uobj(&ib_uverbs_srq_idr, uobj);
+		idr_remove_uobj(uobj);
 		ib_destroy_srq(srq);
 		ib_rdmacg_uncharge(&uobj->cg_obj, context->device,
 				   RDMACG_RESOURCE_HCA_OBJECT);
@@ -319,7 +306,7 @@ static int ib_uverbs_cleanup_ucontext(struct ib_uverbs_file *file,
 		struct ib_ucq_object *ucq =
 			container_of(uobj, struct ib_ucq_object, uobject);
 
-		idr_remove_uobj(&ib_uverbs_cq_idr, uobj);
+		idr_remove_uobj(uobj);
 		ib_destroy_cq(cq);
 		ib_rdmacg_uncharge(&uobj->cg_obj, context->device,
 				   RDMACG_RESOURCE_HCA_OBJECT);
@@ -330,7 +317,7 @@ static int ib_uverbs_cleanup_ucontext(struct ib_uverbs_file *file,
 	list_for_each_entry_safe(uobj, tmp, &context->mr_list, list) {
 		struct ib_mr *mr = uobj->object;
 
-		idr_remove_uobj(&ib_uverbs_mr_idr, uobj);
+		idr_remove_uobj(uobj);
 		ib_dereg_mr(mr);
 		ib_rdmacg_uncharge(&uobj->cg_obj, context->device,
 				   RDMACG_RESOURCE_HCA_OBJECT);
@@ -343,7 +330,7 @@ static int ib_uverbs_cleanup_ucontext(struct ib_uverbs_file *file,
 		struct ib_uxrcd_object *uxrcd =
 			container_of(uobj, struct ib_uxrcd_object, uobject);
 
-		idr_remove_uobj(&ib_uverbs_xrcd_idr, uobj);
+		idr_remove_uobj(uobj);
 		ib_uverbs_dealloc_xrcd(file->device, xrcd);
 		kfree(uxrcd);
 	}
@@ -352,7 +339,7 @@ static int ib_uverbs_cleanup_ucontext(struct ib_uverbs_file *file,
 	list_for_each_entry_safe(uobj, tmp, &context->pd_list, list) {
 		struct ib_pd *pd = uobj->object;
 
-		idr_remove_uobj(&ib_uverbs_pd_idr, uobj);
+		idr_remove_uobj(uobj);
 		ib_dealloc_pd(pd);
 		ib_rdmacg_uncharge(&uobj->cg_obj, context->device,
 				   RDMACG_RESOURCE_HCA_OBJECT);
@@ -986,6 +973,8 @@ static int ib_uverbs_open(struct inode *inode, struct file *filp)
 	}
 
 	file->device	 = dev;
+	spin_lock_init(&file->idr_lock);
+	idr_init(&file->idr);
 	file->ucontext	 = NULL;
 	file->async_file = NULL;
 	kref_init(&file->ref);
@@ -1023,6 +1012,7 @@ static int ib_uverbs_close(struct inode *inode, struct file *filp)
 		file->ucontext = NULL;
 	}
 	mutex_unlock(&file->cleanup_mutex);
+	idr_destroy(&file->idr);
 
 	mutex_lock(&file->device->lists_mutex);
 	if (!file->is_closed) {
@@ -1396,13 +1386,6 @@ static void __exit ib_uverbs_cleanup(void)
 	unregister_chrdev_region(IB_UVERBS_BASE_DEV, IB_UVERBS_MAX_DEVICES);
 	if (overflow_maj)
 		unregister_chrdev_region(overflow_maj, IB_UVERBS_MAX_DEVICES);
-	idr_destroy(&ib_uverbs_pd_idr);
-	idr_destroy(&ib_uverbs_mr_idr);
-	idr_destroy(&ib_uverbs_mw_idr);
-	idr_destroy(&ib_uverbs_ah_idr);
-	idr_destroy(&ib_uverbs_cq_idr);
-	idr_destroy(&ib_uverbs_qp_idr);
-	idr_destroy(&ib_uverbs_srq_idr);
 }
 
 module_init(ib_uverbs_init);
diff --git a/include/rdma/ib_verbs.h b/include/rdma/ib_verbs.h
index 0f1813c..319e691 100644
--- a/include/rdma/ib_verbs.h
+++ b/include/rdma/ib_verbs.h
@@ -1365,6 +1365,7 @@ struct ib_rdmacg_object {
 
 struct ib_ucontext {
 	struct ib_device       *device;
+	struct ib_uverbs_file  *ufile;
 	struct list_head	pd_list;
 	struct list_head	mr_list;
 	struct list_head	mw_list;
-- 
1.8.3.1

--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [PATCH V3 for-next 2/7] IB/core: Add support for idr types
       [not found] ` <1491301907-32290-1-git-send-email-matanb-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
  2017-04-04 10:31   ` [PATCH V3 for-next 1/7] IB/core: Refactor idr to be per uverbs_file Matan Barak
@ 2017-04-04 10:31   ` Matan Barak
       [not found]     ` <1491301907-32290-3-git-send-email-matanb-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
  2017-04-04 10:31   ` [PATCH V3 for-next 3/7] IB/core: Add idr based standard types Matan Barak
                     ` (4 subsequent siblings)
  6 siblings, 1 reply; 25+ messages in thread
From: Matan Barak @ 2017-04-04 10:31 UTC (permalink / raw)
  To: Doug Ledford
  Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA, Liran Liss, Jason Gunthorpe,
	Sean Hefty, Leon Romanovsky, Majd Dibbiny, Tal Alon,
	Yishai Hadas, Ira Weiny, Haggai Eran, Christoph Lameter,
	Matan Barak

The new ioctl infrastructure supports driver specific objects.
Each such object type has a hot unplug function, allocation size and
an order of destruction.

When a ucontext is created, a new list is created in this ib_ucontext.
This list contains all objects created under this ib_ucontext.
When an ib_ucontext is destroyed, we traverse this list several times,
destroying the various objects in the order given by their object type
descriptions. If several object types share the same destruction order,
they are destroyed in the reverse order of their creation.

Adding an object is done in two parts.
First, an object is allocated and added to the idr tree. Then, the
command's handlers (in downstream patches) can work on this object
and fill in its required details.
After a successful command, the commit part is called and the user
object becomes visible to the ucontext. If the handler failed,
alloc_abort should be called.

Removing a uobject is done by calling lookup_get with the write flag
and finalizing it with destroy_commit. A major change from the previous
code is that we now destroy the kernel object itself in destroy_commit
(rather than just the uobject).

We must make sure the idr (per-uverbs-file) and the list (per-ucontext)
can be accessed concurrently without corrupting them.

Signed-off-by: Matan Barak <matanb-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
Reviewed-by: Yishai Hadas <yishaih-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
---
 drivers/infiniband/core/Makefile    |   3 +-
 drivers/infiniband/core/rdma_core.c | 450 ++++++++++++++++++++++++++++++++++++
 drivers/infiniband/core/rdma_core.h |  55 +++++
 include/rdma/ib_verbs.h             |  21 ++
 include/rdma/uverbs_types.h         | 132 +++++++++++
 5 files changed, 660 insertions(+), 1 deletion(-)
 create mode 100644 drivers/infiniband/core/rdma_core.c
 create mode 100644 drivers/infiniband/core/rdma_core.h
 create mode 100644 include/rdma/uverbs_types.h

diff --git a/drivers/infiniband/core/Makefile b/drivers/infiniband/core/Makefile
index e426ac8..d29f910 100644
--- a/drivers/infiniband/core/Makefile
+++ b/drivers/infiniband/core/Makefile
@@ -29,4 +29,5 @@ ib_umad-y :=			user_mad.o
 
 ib_ucm-y :=			ucm.o
 
-ib_uverbs-y :=			uverbs_main.o uverbs_cmd.o uverbs_marshall.o
+ib_uverbs-y :=			uverbs_main.o uverbs_cmd.o uverbs_marshall.o \
+				rdma_core.o
diff --git a/drivers/infiniband/core/rdma_core.c b/drivers/infiniband/core/rdma_core.c
new file mode 100644
index 0000000..1cbc053
--- /dev/null
+++ b/drivers/infiniband/core/rdma_core.c
@@ -0,0 +1,450 @@
+/*
+ * Copyright (c) 2016, Mellanox Technologies inc.  All rights reserved.
+ *
+ * This software is available to you under a choice of one of two
+ * licenses.  You may choose to be licensed under the terms of the GNU
+ * General Public License (GPL) Version 2, available from the file
+ * COPYING in the main directory of this source tree, or the
+ * OpenIB.org BSD license below:
+ *
+ *     Redistribution and use in source and binary forms, with or
+ *     without modification, are permitted provided that the following
+ *     conditions are met:
+ *
+ *      - Redistributions of source code must retain the above
+ *        copyright notice, this list of conditions and the following
+ *        disclaimer.
+ *
+ *      - Redistributions in binary form must reproduce the above
+ *        copyright notice, this list of conditions and the following
+ *        disclaimer in the documentation and/or other materials
+ *        provided with the distribution.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
+ * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+ * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
+ * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
+ * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
+ * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
+ * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+ * SOFTWARE.
+ */
+
+#include <linux/file.h>
+#include <linux/anon_inodes.h>
+#include <rdma/ib_verbs.h>
+#include <rdma/uverbs_types.h>
+#include <linux/rcupdate.h>
+#include "uverbs.h"
+#include "core_priv.h"
+#include "rdma_core.h"
+
+void uverbs_uobject_get(struct ib_uobject *uobject)
+{
+	kref_get(&uobject->ref);
+}
+
+static void uverbs_uobject_put_ref(struct kref *ref)
+{
+	struct ib_uobject *uobj =
+		container_of(ref, struct ib_uobject, ref);
+
+	if (uobj->type->type_class->needs_kfree_rcu)
+		kfree_rcu(uobj, rcu);
+	else
+		kfree(uobj);
+}
+
+void uverbs_uobject_put(struct ib_uobject *uobject)
+{
+	kref_put(&uobject->ref, uverbs_uobject_put_ref);
+}
+
+static int uverbs_try_lock_object(struct ib_uobject *uobj, bool write)
+{
+	/*
+	 * When a read is required, we use a positive counter. Each read
+	 * request checks that the value != -1 and increments it. Write
+	 * requires an exclusive access, thus we check that the counter is
+	 * zero (nobody claimed this object) and we set it to -1.
+	 * Releasing a read lock is done by simply decreasing the counter.
+	 * As for writes, since only a single write is permitted, setting
+	 * it to zero is enough for releasing it.
+	 */
+	if (!write)
+		return __atomic_add_unless(&uobj->usecnt, 1, -1) == -1 ?
+			-EBUSY : 0;
+
+	/* lock is either WRITE or DESTROY - should be exclusive */
+	return atomic_cmpxchg(&uobj->usecnt, 0, -1) == 0 ? 0 : -EBUSY;
+}
+
+static struct ib_uobject *alloc_uobj(struct ib_ucontext *context,
+				     const struct uverbs_obj_type *type)
+{
+	struct ib_uobject *uobj = kmalloc(type->obj_size, GFP_KERNEL);
+
+	if (!uobj)
+		return ERR_PTR(-ENOMEM);
+	/*
+	 * user_handle should be filled in by the handler.
+	 * The object is added to the list in the commit stage.
+	 */
+	uobj->context = context;
+	uobj->type = type;
+	atomic_set(&uobj->usecnt, 0);
+	kref_init(&uobj->ref);
+
+	return uobj;
+}
+
+static int idr_add_uobj(struct ib_uobject *uobj)
+{
+	int ret;
+
+	idr_preload(GFP_KERNEL);
+	spin_lock(&uobj->context->ufile->idr_lock);
+
+	/*
+	 * We start with allocating an idr pointing to NULL. This represents an
+	 * object which isn't initialized yet. We'll replace it later on with
+	 * the real object once we commit.
+	 */
+	ret = idr_alloc(&uobj->context->ufile->idr, NULL, 0,
+			min_t(unsigned long, U32_MAX - 1, INT_MAX), GFP_NOWAIT);
+	if (ret >= 0)
+		uobj->id = ret;
+
+	spin_unlock(&uobj->context->ufile->idr_lock);
+	idr_preload_end();
+
+	return ret < 0 ? ret : 0;
+}
+
+/*
+ * This only removes the uobject from the idr; uverbs_uobject_put() is still
+ * required.
+ */
+static void uverbs_idr_remove_uobj(struct ib_uobject *uobj)
+{
+	spin_lock(&uobj->context->ufile->idr_lock);
+	idr_remove(&uobj->context->ufile->idr, uobj->id);
+	spin_unlock(&uobj->context->ufile->idr_lock);
+}
+
+/* Returns the ib_uobject or an error. The caller should check for IS_ERR. */
+static struct ib_uobject *lookup_get_idr_uobject(const struct uverbs_obj_type *type,
+						 struct ib_ucontext *ucontext,
+						 int id, bool write)
+{
+	struct ib_uobject *uobj;
+
+	rcu_read_lock();
+	/* object won't be released as we're protected in rcu */
+	uobj = idr_find(&ucontext->ufile->idr, id);
+	if (!uobj) {
+		uobj = ERR_PTR(-ENOENT);
+		goto free;
+	}
+
+	uverbs_uobject_get(uobj);
+free:
+	rcu_read_unlock();
+	return uobj;
+}
+
+struct ib_uobject *rdma_lookup_get_uobject(const struct uverbs_obj_type *type,
+					   struct ib_ucontext *ucontext,
+					   int id, bool write)
+{
+	struct ib_uobject *uobj;
+	int ret;
+
+	uobj = type->type_class->lookup_get(type, ucontext, id, write);
+	if (IS_ERR(uobj))
+		return uobj;
+
+	if (uobj->type != type) {
+		ret = -EINVAL;
+		goto free;
+	}
+
+	ret = uverbs_try_lock_object(uobj, write);
+	if (ret) {
+		WARN(ucontext->cleanup_reason,
+		     "ib_uverbs: Trying to lookup_get while cleanup context\n");
+		goto free;
+	}
+
+	return uobj;
+free:
+	uobj->type->type_class->lookup_put(uobj, write);
+	uverbs_uobject_put(uobj);
+	return ERR_PTR(ret);
+}
+
+static struct ib_uobject *alloc_begin_idr_uobject(const struct uverbs_obj_type *type,
+						  struct ib_ucontext *ucontext)
+{
+	int ret;
+	struct ib_uobject *uobj;
+
+	uobj = alloc_uobj(ucontext, type);
+	if (IS_ERR(uobj))
+		return uobj;
+
+	ret = idr_add_uobj(uobj);
+	if (ret)
+		goto uobj_put;
+
+	ret = ib_rdmacg_try_charge(&uobj->cg_obj, ucontext->device,
+				   RDMACG_RESOURCE_HCA_OBJECT);
+	if (ret)
+		goto idr_remove;
+
+	return uobj;
+
+idr_remove:
+	uverbs_idr_remove_uobj(uobj);
+uobj_put:
+	uverbs_uobject_put(uobj);
+	return ERR_PTR(ret);
+}
+
+struct ib_uobject *rdma_alloc_begin_uobject(const struct uverbs_obj_type *type,
+					    struct ib_ucontext *ucontext)
+{
+	return type->type_class->alloc_begin(type, ucontext);
+}
+
+static void uverbs_uobject_add(struct ib_uobject *uobject)
+{
+	mutex_lock(&uobject->context->uobjects_lock);
+	list_add(&uobject->list, &uobject->context->uobjects);
+	mutex_unlock(&uobject->context->uobjects_lock);
+}
+
+static int __must_check remove_commit_idr_uobject(struct ib_uobject *uobj,
+						  enum rdma_remove_reason why)
+{
+	const struct uverbs_obj_idr_type *idr_type =
+		container_of(uobj->type, struct uverbs_obj_idr_type,
+			     type);
+	int ret = idr_type->destroy_object(uobj, why);
+
+	/*
+	 * We can only fail gracefully if the user requested to destroy the
+	 * object. In the rest of the cases, just remove whatever you can.
+	 */
+	if (why == RDMA_REMOVE_DESTROY && ret)
+		return ret;
+
+	ib_rdmacg_uncharge(&uobj->cg_obj, uobj->context->device,
+			   RDMACG_RESOURCE_HCA_OBJECT);
+	uverbs_idr_remove_uobj(uobj);
+
+	return ret;
+}
+
+static void lockdep_check(struct ib_uobject *uobj, bool write)
+{
+#ifdef CONFIG_LOCKDEP
+	if (write)
+		WARN_ON(atomic_read(&uobj->usecnt) > 0);
+	else
+		WARN_ON(atomic_read(&uobj->usecnt) == -1);
+#endif
+}
+
+static int __must_check _rdma_remove_commit_uobject(struct ib_uobject *uobj,
+						    enum rdma_remove_reason why,
+						    bool lock)
+{
+	int ret;
+	struct ib_ucontext *ucontext = uobj->context;
+
+	ret = uobj->type->type_class->remove_commit(uobj, why);
+	if (ret && why == RDMA_REMOVE_DESTROY) {
+		/* We couldn't remove the object, so just unlock the uobject */
+		atomic_set(&uobj->usecnt, 0);
+		uobj->type->type_class->lookup_put(uobj, true);
+	} else {
+		if (lock)
+			mutex_lock(&ucontext->uobjects_lock);
+		list_del(&uobj->list);
+		if (lock)
+			mutex_unlock(&ucontext->uobjects_lock);
+		/* put the ref we took when we created the object */
+		uverbs_uobject_put(uobj);
+	}
+
+	return ret;
+}
+
+/* This is called only for user requested DESTROY reasons */
+int __must_check rdma_remove_commit_uobject(struct ib_uobject *uobj)
+{
+	int ret;
+	struct ib_ucontext *ucontext = uobj->context;
+
+	/* put the ref count we took at lookup_get */
+	uverbs_uobject_put(uobj);
+	/* Cleanup is running. Calling this should have been impossible */
+	if (!down_read_trylock(&ucontext->cleanup_rwsem)) {
+		WARN(true, "ib_uverbs: Cleanup is running while removing an uobject\n");
+		return 0;
+	}
+	lockdep_check(uobj, true);
+	ret = _rdma_remove_commit_uobject(uobj, RDMA_REMOVE_DESTROY, true);
+
+	up_read(&ucontext->cleanup_rwsem);
+	return ret;
+}
+
+static void alloc_commit_idr_uobject(struct ib_uobject *uobj)
+{
+	uverbs_uobject_add(uobj);
+	spin_lock(&uobj->context->ufile->idr_lock);
+	/*
+	 * We already allocated this IDR with a NULL object, so
+	 * this shouldn't fail.
+	 */
+	WARN_ON(idr_replace(&uobj->context->ufile->idr,
+			    uobj, uobj->id));
+	spin_unlock(&uobj->context->ufile->idr_lock);
+}
+
+int rdma_alloc_commit_uobject(struct ib_uobject *uobj)
+{
+	/* Cleanup is running. Calling this should have been impossible */
+	if (!down_read_trylock(&uobj->context->cleanup_rwsem)) {
+		int ret;
+
+		WARN(true, "ib_uverbs: Cleanup is running while allocating an uobject\n");
+		ret = uobj->type->type_class->remove_commit(uobj,
+							    RDMA_REMOVE_DURING_CLEANUP);
+		if (ret)
+			pr_warn("ib_uverbs: cleanup of idr object %d failed\n",
+				uobj->id);
+		return ret;
+	}
+
+	uobj->type->type_class->alloc_commit(uobj);
+	up_read(&uobj->context->cleanup_rwsem);
+
+	return 0;
+}
+
+static void alloc_abort_idr_uobject(struct ib_uobject *uobj)
+{
+	uverbs_idr_remove_uobj(uobj);
+	ib_rdmacg_uncharge(&uobj->cg_obj, uobj->context->device,
+			   RDMACG_RESOURCE_HCA_OBJECT);
+	uverbs_uobject_put(uobj);
+}
+
+void rdma_alloc_abort_uobject(struct ib_uobject *uobj)
+{
+	uobj->type->type_class->alloc_abort(uobj);
+}
+
+static void lookup_put_idr_uobject(struct ib_uobject *uobj, bool write)
+{
+}
+
+void rdma_lookup_put_uobject(struct ib_uobject *uobj, bool write)
+{
+	lockdep_check(uobj, write);
+	uobj->type->type_class->lookup_put(uobj, write);
+	/*
+	 * In order to unlock an object, either decrease its usecnt for
+	 * read access or zero it in case of write access. See
+	 * uverbs_try_lock_object for locking schema information.
+	 */
+	if (!write)
+		atomic_dec(&uobj->usecnt);
+	else
+		atomic_set(&uobj->usecnt, 0);
+
+	uverbs_uobject_put(uobj);
+}
+
+const struct uverbs_obj_type_class uverbs_idr_class = {
+	.alloc_begin = alloc_begin_idr_uobject,
+	.lookup_get = lookup_get_idr_uobject,
+	.alloc_commit = alloc_commit_idr_uobject,
+	.alloc_abort = alloc_abort_idr_uobject,
+	.lookup_put = lookup_put_idr_uobject,
+	.remove_commit = remove_commit_idr_uobject,
+	/*
+	 * When we destroy an object, we first just lock it for WRITE and
+	 * actually DESTROY it in the finalize stage. So, the problematic
+	 * scenario is when we just started the finalize stage of the
+	 * destruction (nothing was executed yet). Now, the other thread
+	 * fetched the object for READ access, but it didn't lock it yet.
+	 * The DESTROY thread continues and starts destroying the object.
+	 * When the other thread continue - without the RCU, it would
+	 * access freed memory. However, the rcu_read_lock delays the free
+	 * until the rcu_read_lock of the READ operation quits. Since the
+	 * write lock of the object is still taken by the DESTROY flow, the
+	 * READ operation will get -EBUSY and it'll just bail out.
+	 */
+	.needs_kfree_rcu = true,
+};
+
+void uverbs_cleanup_ucontext(struct ib_ucontext *ucontext, bool device_removed)
+{
+	enum rdma_remove_reason reason = device_removed ?
+		RDMA_REMOVE_DRIVER_REMOVE : RDMA_REMOVE_CLOSE;
+	unsigned int cur_order = 0;
+
+	ucontext->cleanup_reason = reason;
+	/*
+	 * Waits for all remove_commit and alloc_commit to finish. Logically, we
+	 * want to hold this forever as the context is going to be destroyed,
+	 * but we'll release it since it causes a "held lock freed" BUG message.
+	 */
+	down_write(&ucontext->cleanup_rwsem);
+
+	while (!list_empty(&ucontext->uobjects)) {
+		struct ib_uobject *obj, *next_obj;
+		unsigned int next_order = UINT_MAX;
+
+		/*
+		 * This shouldn't run while executing other commands on this
+		 * context.
+		 */
+		mutex_lock(&ucontext->uobjects_lock);
+		list_for_each_entry_safe(obj, next_obj, &ucontext->uobjects,
+					 list)
+			if (obj->type->destroy_order == cur_order) {
+				int ret;
+
+				/*
+				 * if we hit this WARN_ON, that means we are
+				 * racing with a lookup_get.
+				 */
+				WARN_ON(uverbs_try_lock_object(obj, true));
+				ret = _rdma_remove_commit_uobject(obj, reason,
+								  false);
+				if (ret)
+					pr_warn("ib_uverbs: failed to remove uobject id %d order %u\n",
+						obj->id, cur_order);
+			} else {
+				next_order = min(next_order,
+						 obj->type->destroy_order);
+			}
+		mutex_unlock(&ucontext->uobjects_lock);
+		cur_order = next_order;
+	}
+	up_write(&ucontext->cleanup_rwsem);
+}
+
+void uverbs_initialize_ucontext(struct ib_ucontext *ucontext)
+{
+	ucontext->cleanup_reason = 0;
+	mutex_init(&ucontext->uobjects_lock);
+	INIT_LIST_HEAD(&ucontext->uobjects);
+	init_rwsem(&ucontext->cleanup_rwsem);
+}
+
diff --git a/drivers/infiniband/core/rdma_core.h b/drivers/infiniband/core/rdma_core.h
new file mode 100644
index 0000000..ab665a6
--- /dev/null
+++ b/drivers/infiniband/core/rdma_core.h
@@ -0,0 +1,55 @@
+/*
+ * Copyright (c) 2005 Topspin Communications.  All rights reserved.
+ * Copyright (c) 2005, 2006 Cisco Systems.  All rights reserved.
+ * Copyright (c) 2005-2017 Mellanox Technologies. All rights reserved.
+ * Copyright (c) 2005 Voltaire, Inc. All rights reserved.
+ * Copyright (c) 2005 PathScale, Inc. All rights reserved.
+ *
+ * This software is available to you under a choice of one of two
+ * licenses.  You may choose to be licensed under the terms of the GNU
+ * General Public License (GPL) Version 2, available from the file
+ * COPYING in the main directory of this source tree, or the
+ * OpenIB.org BSD license below:
+ *
+ *     Redistribution and use in source and binary forms, with or
+ *     without modification, are permitted provided that the following
+ *     conditions are met:
+ *
+ *      - Redistributions of source code must retain the above
+ *        copyright notice, this list of conditions and the following
+ *        disclaimer.
+ *
+ *      - Redistributions in binary form must reproduce the above
+ *        copyright notice, this list of conditions and the following
+ *        disclaimer in the documentation and/or other materials
+ *        provided with the distribution.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
+ * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+ * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
+ * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
+ * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
+ * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
+ * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+ * SOFTWARE.
+ */
+
+#ifndef RDMA_CORE_H
+#define RDMA_CORE_H
+
+#include <linux/idr.h>
+#include <rdma/uverbs_types.h>
+#include <rdma/ib_verbs.h>
+#include <linux/mutex.h>
+
+/*
+ * These functions initialize the context and clean up its uobjects.
+ * The context has a list of objects which is protected by a mutex
+ * on the context. initialize_ucontext should be called when we create
+ * a context.
+ * cleanup_ucontext removes all uobjects from the context and puts them.
+ */
+void uverbs_cleanup_ucontext(struct ib_ucontext *ucontext, bool device_removed);
+void uverbs_initialize_ucontext(struct ib_ucontext *ucontext);
+
+#endif /* RDMA_CORE_H */
diff --git a/include/rdma/ib_verbs.h b/include/rdma/ib_verbs.h
index 319e691..d3efd22 100644
--- a/include/rdma/ib_verbs.h
+++ b/include/rdma/ib_verbs.h
@@ -1357,6 +1357,17 @@ struct ib_fmr_attr {
 
 struct ib_umem;
 
+enum rdma_remove_reason {
+	/* Userspace requested uobject deletion. Call could fail */
+	RDMA_REMOVE_DESTROY,
+	/* Context deletion. This call should delete the actual object itself */
+	RDMA_REMOVE_CLOSE,
+	/* Driver is being hot-unplugged. This call should delete the actual object itself */
+	RDMA_REMOVE_DRIVER_REMOVE,
+	/* Context is being cleaned-up, but commit was just completed */
+	RDMA_REMOVE_DURING_CLEANUP,
+};
+
 struct ib_rdmacg_object {
 #ifdef CONFIG_CGROUP_RDMA
 	struct rdma_cgroup	*cg;		/* owner rdma cgroup */
@@ -1379,6 +1390,13 @@ struct ib_ucontext {
 	struct list_head	rwq_ind_tbl_list;
 	int			closing;
 
+	/* locking the uobjects_list */
+	struct mutex		uobjects_lock;
+	struct list_head	uobjects;
+	/* protects cleanup process from other actions */
+	struct rw_semaphore	cleanup_rwsem;
+	enum rdma_remove_reason cleanup_reason;
+
 	struct pid             *tgid;
 #ifdef CONFIG_INFINIBAND_ON_DEMAND_PAGING
 	struct rb_root      umem_tree;
@@ -1409,8 +1427,11 @@ struct ib_uobject {
 	int			id;		/* index into kernel idr */
 	struct kref		ref;
 	struct rw_semaphore	mutex;		/* protects .live */
+	atomic_t		usecnt;		/* protects exclusive access */
 	struct rcu_head		rcu;		/* kfree_rcu() overhead */
 	int			live;
+
+	const struct uverbs_obj_type *type;
 };
 
 struct ib_udata {
diff --git a/include/rdma/uverbs_types.h b/include/rdma/uverbs_types.h
new file mode 100644
index 0000000..0777e40
--- /dev/null
+++ b/include/rdma/uverbs_types.h
@@ -0,0 +1,132 @@
+/*
+ * Copyright (c) 2017, Mellanox Technologies inc.  All rights reserved.
+ *
+ * This software is available to you under a choice of one of two
+ * licenses.  You may choose to be licensed under the terms of the GNU
+ * General Public License (GPL) Version 2, available from the file
+ * COPYING in the main directory of this source tree, or the
+ * OpenIB.org BSD license below:
+ *
+ *     Redistribution and use in source and binary forms, with or
+ *     without modification, are permitted provided that the following
+ *     conditions are met:
+ *
+ *      - Redistributions of source code must retain the above
+ *        copyright notice, this list of conditions and the following
+ *        disclaimer.
+ *
+ *      - Redistributions in binary form must reproduce the above
+ *        copyright notice, this list of conditions and the following
+ *        disclaimer in the documentation and/or other materials
+ *        provided with the distribution.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
+ * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+ * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
+ * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
+ * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
+ * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
+ * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+ * SOFTWARE.
+ */
+
+#ifndef _UVERBS_TYPES_
+#define _UVERBS_TYPES_
+
+#include <linux/kernel.h>
+#include <rdma/ib_verbs.h>
+
+struct uverbs_obj_type;
+
+struct uverbs_obj_type_class {
+	/*
+	 * Get an ib_uobject that corresponds to the given id from ucontext.
+	 * These functions could create or destroy objects if required.
+	 * The action will be finalized only when commit, abort or put fops are
+	 * called.
+	 * The flow of the different actions is:
+	 * [alloc]:	 Starts with alloc_begin. The handler's logic is then
+	 *		 executed. If the handler is successful, alloc_commit
+	 *		 is called and the object is inserted into the repository.
+	 *		 Once alloc_commit completes the object is visible to
+	 *		 other threads and userspace.
+	 *		 Otherwise, alloc_abort is called and the object is
+	 *		 destroyed.
+	 * [lookup]:	 Starts with lookup_get which fetches and locks the
+	 *		 object. After the handler finished using the object, it
+	 *		 needs to call lookup_put to unlock it. The write flag
+	 *		 indicates if the object is locked for exclusive access.
+	 * [remove]:	 Starts with lookup_get with write flag set. This locks
+	 *		 the object for exclusive access. If the handler code
+	 *		 completed successfully, remove_commit is called and
+	 *		 the ib_uobject is removed from the context's uobjects
+	 *		 repository and put. The object itself is destroyed as
+	 *		 well. Once remove succeeds new krefs to the object
+	 *		 cannot be acquired by other threads or userspace and
+	 *		 the hardware driver is removed from the object.
+	 *		 Other krefs on the object may still exist.
+	 *		 If the handler code failed, lookup_put should be
+	 *		 called. This callback is used when the context
+	 *		 is destroyed as well (process termination,
+	 *		 reset flow).
+	 */
+	struct ib_uobject *(*alloc_begin)(const struct uverbs_obj_type *type,
+					  struct ib_ucontext *ucontext);
+	void (*alloc_commit)(struct ib_uobject *uobj);
+	void (*alloc_abort)(struct ib_uobject *uobj);
+
+	struct ib_uobject *(*lookup_get)(const struct uverbs_obj_type *type,
+					 struct ib_ucontext *ucontext, int id,
+					 bool write);
+	void (*lookup_put)(struct ib_uobject *uobj, bool write);
+	/*
+	 * Must be called with the write lock held. If successful uobj is
+	 * invalid on return. On failure uobject is left completely
+	 * unchanged
+	 */
+	int __must_check (*remove_commit)(struct ib_uobject *uobj,
+					  enum rdma_remove_reason why);
+	u8    needs_kfree_rcu;
+};
+
+struct uverbs_obj_type {
+	const struct uverbs_obj_type_class * const type_class;
+	size_t	     obj_size;
+	unsigned int destroy_order;
+};
+
+/*
+ * Object type classes which support a detach state (the object is still alive
+ * but is not attached to any context) need to make sure that:
+ * (a) no call goes through to the driver after detach is called
+ * (b) detach isn't called concurrently with context_cleanup
+ */
+
+struct uverbs_obj_idr_type {
+	/*
+	 * In idr based objects, uverbs_obj_type_class points to a generic
+	 * idr operations. In order to specialize the underlying types (e.g. CQ,
+	 * QPs, etc.), we add destroy_object specific callbacks.
+	 */
+	struct uverbs_obj_type  type;
+
+	/* Free driver resources from the uobject, make the driver uncallable,
+	 * and move the uobject to the detached state. If the object was
+	 * destroyed by the user's request, a failure should leave the uobject
+	 * completely unchanged.
+	 */
+	int __must_check (*destroy_object)(struct ib_uobject *uobj,
+					   enum rdma_remove_reason why);
+};
+
+struct ib_uobject *rdma_lookup_get_uobject(const struct uverbs_obj_type *type,
+					   struct ib_ucontext *ucontext,
+					   int id, bool write);
+void rdma_lookup_put_uobject(struct ib_uobject *uobj, bool write);
+struct ib_uobject *rdma_alloc_begin_uobject(const struct uverbs_obj_type *type,
+					    struct ib_ucontext *ucontext);
+void rdma_alloc_abort_uobject(struct ib_uobject *uobj);
+int __must_check rdma_remove_commit_uobject(struct ib_uobject *uobj);
+int rdma_alloc_commit_uobject(struct ib_uobject *uobj);
+
+#endif
-- 
1.8.3.1


^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [PATCH V3 for-next 3/7] IB/core: Add idr based standard types
       [not found] ` <1491301907-32290-1-git-send-email-matanb-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
  2017-04-04 10:31   ` [PATCH V3 for-next 1/7] IB/core: Refactor idr to be per uverbs_file Matan Barak
  2017-04-04 10:31   ` [PATCH V3 for-next 2/7] IB/core: Add support for idr types Matan Barak
@ 2017-04-04 10:31   ` Matan Barak
       [not found]     ` <1491301907-32290-4-git-send-email-matanb-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
  2017-04-04 10:31   ` [PATCH V3 for-next 4/7] IB/core: Change idr objects to use the new schema Matan Barak
                     ` (3 subsequent siblings)
  6 siblings, 1 reply; 25+ messages in thread
From: Matan Barak @ 2017-04-04 10:31 UTC (permalink / raw)
  To: Doug Ledford
  Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA, Liran Liss, Jason Gunthorpe,
	Sean Hefty, Leon Romanovsky, Majd Dibbiny, Tal Alon,
	Yishai Hadas, Ira Weiny, Haggai Eran, Christoph Lameter,
	Matan Barak

This patch adds the standard idr based types. These types are
used in downstream patches in order to initialize, destroy and
look up IB standard objects, which are based on idr objects.

An idr type requires filling out several parameters. Its type_class
pointer should point to uverbs_idr_class and its size should be at
least the size of ib_uobject. We add a macro to make the type
declaration easier.

Signed-off-by: Matan Barak <matanb-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
Reviewed-by: Yishai Hadas <yishaih-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
---
 drivers/infiniband/core/Makefile           |   2 +-
 drivers/infiniband/core/uverbs.h           |   5 +-
 drivers/infiniband/core/uverbs_cmd.c       |  16 +-
 drivers/infiniband/core/uverbs_main.c      |   8 +-
 drivers/infiniband/core/uverbs_std_types.c | 244 +++++++++++++++++++++++++++++
 include/rdma/uverbs_std_types.h            |  50 ++++++
 include/rdma/uverbs_types.h                |  14 ++
 7 files changed, 329 insertions(+), 10 deletions(-)
 create mode 100644 drivers/infiniband/core/uverbs_std_types.c
 create mode 100644 include/rdma/uverbs_std_types.h

diff --git a/drivers/infiniband/core/Makefile b/drivers/infiniband/core/Makefile
index d29f910..6ebd9ad 100644
--- a/drivers/infiniband/core/Makefile
+++ b/drivers/infiniband/core/Makefile
@@ -30,4 +30,4 @@ ib_umad-y :=			user_mad.o
 ib_ucm-y :=			ucm.o
 
 ib_uverbs-y :=			uverbs_main.o uverbs_cmd.o uverbs_marshall.o \
-				rdma_core.o
+				rdma_core.o uverbs_std_types.o
diff --git a/drivers/infiniband/core/uverbs.h b/drivers/infiniband/core/uverbs.h
index 6215735..cf0519d 100644
--- a/drivers/infiniband/core/uverbs.h
+++ b/drivers/infiniband/core/uverbs.h
@@ -201,9 +201,12 @@ void ib_uverbs_release_uevent(struct ib_uverbs_file *file,
 void ib_uverbs_srq_event_handler(struct ib_event *event, void *context_ptr);
 void ib_uverbs_event_handler(struct ib_event_handler *handler,
 			     struct ib_event *event);
-void ib_uverbs_dealloc_xrcd(struct ib_uverbs_device *dev, struct ib_xrcd *xrcd);
+int ib_uverbs_dealloc_xrcd(struct ib_uverbs_device *dev, struct ib_xrcd *xrcd,
+			   enum rdma_remove_reason why);
 
 int uverbs_dealloc_mw(struct ib_mw *mw);
+void ib_uverbs_detach_umcast(struct ib_qp *qp,
+			     struct ib_uqp_object *uobj);
 
 struct ib_uverbs_flow_spec {
 	union {
diff --git a/drivers/infiniband/core/uverbs_cmd.c b/drivers/infiniband/core/uverbs_cmd.c
index 03c4f68..79de69d 100644
--- a/drivers/infiniband/core/uverbs_cmd.c
+++ b/drivers/infiniband/core/uverbs_cmd.c
@@ -958,19 +958,25 @@ ssize_t ib_uverbs_close_xrcd(struct ib_uverbs_file *file,
 	return ret;
 }
 
-void ib_uverbs_dealloc_xrcd(struct ib_uverbs_device *dev,
-			    struct ib_xrcd *xrcd)
+int ib_uverbs_dealloc_xrcd(struct ib_uverbs_device *dev,
+			   struct ib_xrcd *xrcd,
+			   enum rdma_remove_reason why)
 {
 	struct inode *inode;
+	int ret;
 
 	inode = xrcd->inode;
 	if (inode && !atomic_dec_and_test(&xrcd->usecnt))
-		return;
+		return 0;
 
-	ib_dealloc_xrcd(xrcd);
+	ret = ib_dealloc_xrcd(xrcd);
 
-	if (inode)
+	if (why == RDMA_REMOVE_DESTROY && ret)
+		atomic_inc(&xrcd->usecnt);
+	else if (inode)
 		xrcd_table_delete(dev, inode);
+
+	return ret;
 }
 
 ssize_t ib_uverbs_reg_mr(struct ib_uverbs_file *file,
diff --git a/drivers/infiniband/core/uverbs_main.c b/drivers/infiniband/core/uverbs_main.c
index f6812fb..e1db678 100644
--- a/drivers/infiniband/core/uverbs_main.c
+++ b/drivers/infiniband/core/uverbs_main.c
@@ -201,8 +201,8 @@ void ib_uverbs_release_uevent(struct ib_uverbs_file *file,
 	spin_unlock_irq(&file->async_file->lock);
 }
 
-static void ib_uverbs_detach_umcast(struct ib_qp *qp,
-				    struct ib_uqp_object *uobj)
+void ib_uverbs_detach_umcast(struct ib_qp *qp,
+			     struct ib_uqp_object *uobj)
 {
 	struct ib_uverbs_mcast_entry *mcast, *tmp;
 
@@ -331,7 +331,9 @@ static int ib_uverbs_cleanup_ucontext(struct ib_uverbs_file *file,
 			container_of(uobj, struct ib_uxrcd_object, uobject);
 
 		idr_remove_uobj(uobj);
-		ib_uverbs_dealloc_xrcd(file->device, xrcd);
+		ib_uverbs_dealloc_xrcd(file->device, xrcd,
+				       file->ucontext ? RDMA_REMOVE_CLOSE :
+				       RDMA_REMOVE_DRIVER_REMOVE);
 		kfree(uxrcd);
 	}
 	mutex_unlock(&file->device->xrcd_tree_mutex);
diff --git a/drivers/infiniband/core/uverbs_std_types.c b/drivers/infiniband/core/uverbs_std_types.c
new file mode 100644
index 0000000..a514556
--- /dev/null
+++ b/drivers/infiniband/core/uverbs_std_types.c
@@ -0,0 +1,244 @@
+/*
+ * Copyright (c) 2017, Mellanox Technologies inc.  All rights reserved.
+ *
+ * This software is available to you under a choice of one of two
+ * licenses.  You may choose to be licensed under the terms of the GNU
+ * General Public License (GPL) Version 2, available from the file
+ * COPYING in the main directory of this source tree, or the
+ * OpenIB.org BSD license below:
+ *
+ *     Redistribution and use in source and binary forms, with or
+ *     without modification, are permitted provided that the following
+ *     conditions are met:
+ *
+ *      - Redistributions of source code must retain the above
+ *        copyright notice, this list of conditions and the following
+ *        disclaimer.
+ *
+ *      - Redistributions in binary form must reproduce the above
+ *        copyright notice, this list of conditions and the following
+ *        disclaimer in the documentation and/or other materials
+ *        provided with the distribution.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
+ * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+ * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
+ * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
+ * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
+ * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
+ * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+ * SOFTWARE.
+ */
+
+#include <rdma/uverbs_std_types.h>
+#include <rdma/ib_user_verbs.h>
+#include <rdma/ib_verbs.h>
+#include <linux/bug.h>
+#include <linux/file.h>
+#include "rdma_core.h"
+#include "uverbs.h"
+
+int uverbs_free_ah(struct ib_uobject *uobject,
+		   enum rdma_remove_reason why)
+{
+	return ib_destroy_ah((struct ib_ah *)uobject->object);
+}
+
+int uverbs_free_flow(struct ib_uobject *uobject,
+		     enum rdma_remove_reason why)
+{
+	return ib_destroy_flow((struct ib_flow *)uobject->object);
+}
+
+int uverbs_free_mw(struct ib_uobject *uobject,
+		   enum rdma_remove_reason why)
+{
+	return uverbs_dealloc_mw((struct ib_mw *)uobject->object);
+}
+
+int uverbs_free_qp(struct ib_uobject *uobject,
+		   enum rdma_remove_reason why)
+{
+	struct ib_qp *qp = uobject->object;
+	struct ib_uqp_object *uqp =
+		container_of(uobject, struct ib_uqp_object, uevent.uobject);
+	int ret;
+
+	if (why == RDMA_REMOVE_DESTROY) {
+		if (!list_empty(&uqp->mcast_list))
+			return -EBUSY;
+	} else if (qp == qp->real_qp) {
+		ib_uverbs_detach_umcast(qp, uqp);
+	}
+
+	ret = ib_destroy_qp(qp);
+	if (ret && why == RDMA_REMOVE_DESTROY)
+		return ret;
+
+	if (uqp->uxrcd)
+		atomic_dec(&uqp->uxrcd->refcnt);
+
+	ib_uverbs_release_uevent(uobject->context->ufile, &uqp->uevent);
+	return ret;
+}
+
+int uverbs_free_rwq_ind_tbl(struct ib_uobject *uobject,
+			    enum rdma_remove_reason why)
+{
+	struct ib_rwq_ind_table *rwq_ind_tbl = uobject->object;
+	struct ib_wq **ind_tbl = rwq_ind_tbl->ind_tbl;
+	int ret;
+
+	ret = ib_destroy_rwq_ind_table(rwq_ind_tbl);
+	if (!ret || why != RDMA_REMOVE_DESTROY)
+		kfree(ind_tbl);
+	return ret;
+}
+
+int uverbs_free_wq(struct ib_uobject *uobject,
+		   enum rdma_remove_reason why)
+{
+	struct ib_wq *wq = uobject->object;
+	struct ib_uwq_object *uwq =
+		container_of(uobject, struct ib_uwq_object, uevent.uobject);
+	int ret;
+
+	ret = ib_destroy_wq(wq);
+	if (!ret || why != RDMA_REMOVE_DESTROY)
+		ib_uverbs_release_uevent(uobject->context->ufile, &uwq->uevent);
+	return ret;
+}
+
+int uverbs_free_srq(struct ib_uobject *uobject,
+		    enum rdma_remove_reason why)
+{
+	struct ib_srq *srq = uobject->object;
+	struct ib_uevent_object *uevent =
+		container_of(uobject, struct ib_uevent_object, uobject);
+	enum ib_srq_type  srq_type = srq->srq_type;
+	int ret;
+
+	ret = ib_destroy_srq(srq);
+
+	if (ret && why == RDMA_REMOVE_DESTROY)
+		return ret;
+
+	if (srq_type == IB_SRQT_XRC) {
+		struct ib_usrq_object *us =
+			container_of(uevent, struct ib_usrq_object, uevent);
+
+		atomic_dec(&us->uxrcd->refcnt);
+	}
+
+	ib_uverbs_release_uevent(uobject->context->ufile, uevent);
+	return ret;
+}
+
+int uverbs_free_cq(struct ib_uobject *uobject,
+		   enum rdma_remove_reason why)
+{
+	struct ib_cq *cq = uobject->object;
+	struct ib_uverbs_event_file *ev_file = cq->cq_context;
+	struct ib_ucq_object *ucq =
+		container_of(uobject, struct ib_ucq_object, uobject);
+	int ret;
+
+	ret = ib_destroy_cq(cq);
+	if (!ret || why != RDMA_REMOVE_DESTROY)
+		ib_uverbs_release_ucq(uobject->context->ufile, ev_file, ucq);
+	return ret;
+}
+
+int uverbs_free_mr(struct ib_uobject *uobject,
+		   enum rdma_remove_reason why)
+{
+	return ib_dereg_mr((struct ib_mr *)uobject->object);
+}
+
+int uverbs_free_xrcd(struct ib_uobject *uobject,
+		     enum rdma_remove_reason why)
+{
+	struct ib_xrcd *xrcd = uobject->object;
+	struct ib_uxrcd_object *uxrcd =
+		container_of(uobject, struct ib_uxrcd_object, uobject);
+	int ret;
+
+	mutex_lock(&uobject->context->ufile->device->xrcd_tree_mutex);
+	if (why == RDMA_REMOVE_DESTROY && atomic_read(&uxrcd->refcnt))
+		ret = -EBUSY;
+	else
+		ret = ib_uverbs_dealloc_xrcd(uobject->context->ufile->device,
+					     xrcd, why);
+	mutex_unlock(&uobject->context->ufile->device->xrcd_tree_mutex);
+
+	return ret;
+}
+
+int uverbs_free_pd(struct ib_uobject *uobject,
+		   enum rdma_remove_reason why)
+{
+	struct ib_pd *pd = uobject->object;
+
+	if (why == RDMA_REMOVE_DESTROY && atomic_read(&pd->usecnt))
+		return -EBUSY;
+
+	ib_dealloc_pd((struct ib_pd *)uobject->object);
+	return 0;
+}
+
+const struct uverbs_obj_idr_type uverbs_type_attrs_cq = {
+	.type = UVERBS_TYPE_ALLOC_IDR_SZ(sizeof(struct ib_ucq_object), 0),
+	.destroy_object = uverbs_free_cq,
+};
+
+const struct uverbs_obj_idr_type uverbs_type_attrs_qp = {
+	.type = UVERBS_TYPE_ALLOC_IDR_SZ(sizeof(struct ib_uqp_object), 0),
+	.destroy_object = uverbs_free_qp,
+};
+
+const struct uverbs_obj_idr_type uverbs_type_attrs_mw = {
+	.type = UVERBS_TYPE_ALLOC_IDR(0),
+	.destroy_object = uverbs_free_mw,
+};
+
+const struct uverbs_obj_idr_type uverbs_type_attrs_mr = {
+	/* 1 is used in order to free the MR after all the MWs */
+	.type = UVERBS_TYPE_ALLOC_IDR(1),
+	.destroy_object = uverbs_free_mr,
+};
+
+const struct uverbs_obj_idr_type uverbs_type_attrs_srq = {
+	.type = UVERBS_TYPE_ALLOC_IDR_SZ(sizeof(struct ib_usrq_object), 0),
+	.destroy_object = uverbs_free_srq,
+};
+
+const struct uverbs_obj_idr_type uverbs_type_attrs_ah = {
+	.type = UVERBS_TYPE_ALLOC_IDR(0),
+	.destroy_object = uverbs_free_ah,
+};
+
+const struct uverbs_obj_idr_type uverbs_type_attrs_flow = {
+	.type = UVERBS_TYPE_ALLOC_IDR(0),
+	.destroy_object = uverbs_free_flow,
+};
+
+const struct uverbs_obj_idr_type uverbs_type_attrs_wq = {
+	.type = UVERBS_TYPE_ALLOC_IDR_SZ(sizeof(struct ib_uwq_object), 0),
+	.destroy_object = uverbs_free_wq,
+};
+
+const struct uverbs_obj_idr_type uverbs_type_attrs_rwq_ind_table = {
+	.type = UVERBS_TYPE_ALLOC_IDR(0),
+	.destroy_object = uverbs_free_rwq_ind_tbl,
+};
+
+const struct uverbs_obj_idr_type uverbs_type_attrs_xrcd = {
+	.type = UVERBS_TYPE_ALLOC_IDR_SZ(sizeof(struct ib_uxrcd_object), 0),
+	.destroy_object = uverbs_free_xrcd,
+};
+
+const struct uverbs_obj_idr_type uverbs_type_attrs_pd = {
+	/* 2 is used in order to free the PD after MRs */
+	.type = UVERBS_TYPE_ALLOC_IDR(2),
+	.destroy_object = uverbs_free_pd,
+};
diff --git a/include/rdma/uverbs_std_types.h b/include/rdma/uverbs_std_types.h
new file mode 100644
index 0000000..2edb776
--- /dev/null
+++ b/include/rdma/uverbs_std_types.h
@@ -0,0 +1,50 @@
+/*
+ * Copyright (c) 2017, Mellanox Technologies inc.  All rights reserved.
+ *
+ * This software is available to you under a choice of one of two
+ * licenses.  You may choose to be licensed under the terms of the GNU
+ * General Public License (GPL) Version 2, available from the file
+ * COPYING in the main directory of this source tree, or the
+ * OpenIB.org BSD license below:
+ *
+ *     Redistribution and use in source and binary forms, with or
+ *     without modification, are permitted provided that the following
+ *     conditions are met:
+ *
+ *      - Redistributions of source code must retain the above
+ *        copyright notice, this list of conditions and the following
+ *        disclaimer.
+ *
+ *      - Redistributions in binary form must reproduce the above
+ *        copyright notice, this list of conditions and the following
+ *        disclaimer in the documentation and/or other materials
+ *        provided with the distribution.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
+ * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+ * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
+ * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
+ * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
+ * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
+ * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+ * SOFTWARE.
+ */
+
+#ifndef _UVERBS_STD_TYPES__
+#define _UVERBS_STD_TYPES__
+
+#include <rdma/uverbs_types.h>
+
+extern const struct uverbs_obj_idr_type uverbs_type_attrs_cq;
+extern const struct uverbs_obj_idr_type uverbs_type_attrs_qp;
+extern const struct uverbs_obj_idr_type uverbs_type_attrs_rwq_ind_table;
+extern const struct uverbs_obj_idr_type uverbs_type_attrs_wq;
+extern const struct uverbs_obj_idr_type uverbs_type_attrs_srq;
+extern const struct uverbs_obj_idr_type uverbs_type_attrs_ah;
+extern const struct uverbs_obj_idr_type uverbs_type_attrs_flow;
+extern const struct uverbs_obj_idr_type uverbs_type_attrs_mr;
+extern const struct uverbs_obj_idr_type uverbs_type_attrs_mw;
+extern const struct uverbs_obj_idr_type uverbs_type_attrs_pd;
+extern const struct uverbs_obj_idr_type uverbs_type_attrs_xrcd;
+#endif
+
diff --git a/include/rdma/uverbs_types.h b/include/rdma/uverbs_types.h
index 0777e40..66368b5 100644
--- a/include/rdma/uverbs_types.h
+++ b/include/rdma/uverbs_types.h
@@ -129,4 +129,18 @@ struct ib_uobject *rdma_alloc_begin_uobject(const struct uverbs_obj_type *type,
 int __must_check rdma_remove_commit_uobject(struct ib_uobject *uobj);
 int rdma_alloc_commit_uobject(struct ib_uobject *uobj);
 
+extern const struct uverbs_obj_type_class uverbs_idr_class;
+
+#define UVERBS_BUILD_BUG_ON(cond) (sizeof(char[1 - 2 * !!(cond)]) -	\
+				   sizeof(char))
+#define UVERBS_TYPE_ALLOC_IDR_SZ(_size, _order)				\
+	{								\
+		.destroy_order = _order,				\
+		.type_class = &uverbs_idr_class,			\
+		.obj_size = (_size) +					\
+			  UVERBS_BUILD_BUG_ON((_size) <			\
+					      sizeof(struct ib_uobject)), \
+	}
+#define UVERBS_TYPE_ALLOC_IDR(_order)					\
+	 UVERBS_TYPE_ALLOC_IDR_SZ(sizeof(struct ib_uobject), _order)
 #endif
-- 
1.8.3.1


^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [PATCH V3 for-next 4/7] IB/core: Change idr objects to use the new schema
       [not found] ` <1491301907-32290-1-git-send-email-matanb-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
                     ` (2 preceding siblings ...)
  2017-04-04 10:31   ` [PATCH V3 for-next 3/7] IB/core: Add idr based standard types Matan Barak
@ 2017-04-04 10:31   ` Matan Barak
       [not found]     ` <1491301907-32290-5-git-send-email-matanb-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
  2017-04-04 10:31   ` [PATCH V3 for-next 5/7] IB/core: Add lock to multicast handlers Matan Barak
                     ` (2 subsequent siblings)
  6 siblings, 1 reply; 25+ messages in thread
From: Matan Barak @ 2017-04-04 10:31 UTC (permalink / raw)
  To: Doug Ledford
  Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA, Liran Liss, Jason Gunthorpe,
	Sean Hefty, Leon Romanovsky, Majd Dibbiny, Tal Alon,
	Yishai Hadas, Ira Weiny, Haggai Eran, Christoph Lameter,
	Matan Barak

This changes only the handlers that deal with idr based objects to
use the new idr allocation, fetching and destruction schema.
This patch consists of the following changes:
(1) Allocation, fetching and destruction are done via idr ops.
(2) Context initialization and release are done through
    uverbs_initialize_ucontext and uverbs_cleanup_ucontext.
(3) Ditching the live flag. Mostly, this is pretty
    straightforward. The only place that is a bit trickier is
    ib_uverbs_open_qp. Commit [1] added code to check whether
    the uobject is already live and initialized. This mostly
    happens because of a race between open_qp and events.
    We delayed assigning the uobject's pointer in order to
    eliminate this race without using the live variable.

[1] commit a040f95dc819
	("IB/core: Fix XRC race condition in ib_uverbs_open_qp")

Signed-off-by: Matan Barak <matanb-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
Reviewed-by: Yishai Hadas <yishaih-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
---
 drivers/infiniband/core/rdma_core.h   |   15 +
 drivers/infiniband/core/uverbs.h      |    2 -
 drivers/infiniband/core/uverbs_cmd.c  | 1311 ++++++++-------------------------
 drivers/infiniband/core/uverbs_main.c |  142 +---
 include/rdma/ib_verbs.h               |   13 -
 include/rdma/uverbs_std_types.h       |   63 ++
 6 files changed, 402 insertions(+), 1144 deletions(-)

diff --git a/drivers/infiniband/core/rdma_core.h b/drivers/infiniband/core/rdma_core.h
index ab665a6..0247bb5 100644
--- a/drivers/infiniband/core/rdma_core.h
+++ b/drivers/infiniband/core/rdma_core.h
@@ -52,4 +52,19 @@
 void uverbs_cleanup_ucontext(struct ib_ucontext *ucontext, bool device_removed);
 void uverbs_initialize_ucontext(struct ib_ucontext *ucontext);
 
+/*
+ * uverbs_uobject_get is called in order to increase the reference count on
+ * a uobject. This is useful when a handler wants to keep the uobject's memory
+ * alive, regardless of whether this uobject is still alive in the context's
+ * objects repository. Objects are put via uverbs_uobject_put.
+ */
+void uverbs_uobject_get(struct ib_uobject *uobject);
+
+/*
+ * In order to indicate that we no longer need this uobject, uverbs_uobject_put
+ * is called. When the reference count reaches zero, the uobject is freed.
+ * For example, this is used when attaching a completion channel to a CQ.
+ */
+void uverbs_uobject_put(struct ib_uobject *uobject);
+
 #endif /* RDMA_CORE_H */
diff --git a/drivers/infiniband/core/uverbs.h b/drivers/infiniband/core/uverbs.h
index cf0519d..3660278 100644
--- a/drivers/infiniband/core/uverbs.h
+++ b/drivers/infiniband/core/uverbs.h
@@ -180,8 +180,6 @@ struct ib_ucq_object {
 	u32			async_events_reported;
 };
 
-void idr_remove_uobj(struct ib_uobject *uobj);
-
 struct file *ib_uverbs_alloc_event_file(struct ib_uverbs_file *uverbs_file,
 					struct ib_device *ib_dev,
 					int is_async);
diff --git a/drivers/infiniband/core/uverbs_cmd.c b/drivers/infiniband/core/uverbs_cmd.c
index 79de69d..2f258aa 100644
--- a/drivers/infiniband/core/uverbs_cmd.c
+++ b/drivers/infiniband/core/uverbs_cmd.c
@@ -40,269 +40,13 @@
 
 #include <linux/uaccess.h>
 
+#include <rdma/uverbs_types.h>
+#include <rdma/uverbs_std_types.h>
+#include "rdma_core.h"
+
 #include "uverbs.h"
 #include "core_priv.h"
 
-struct uverbs_lock_class {
-	struct lock_class_key	key;
-	char			name[16];
-};
-
-static struct uverbs_lock_class pd_lock_class	= { .name = "PD-uobj" };
-static struct uverbs_lock_class mr_lock_class	= { .name = "MR-uobj" };
-static struct uverbs_lock_class mw_lock_class	= { .name = "MW-uobj" };
-static struct uverbs_lock_class cq_lock_class	= { .name = "CQ-uobj" };
-static struct uverbs_lock_class qp_lock_class	= { .name = "QP-uobj" };
-static struct uverbs_lock_class ah_lock_class	= { .name = "AH-uobj" };
-static struct uverbs_lock_class srq_lock_class	= { .name = "SRQ-uobj" };
-static struct uverbs_lock_class xrcd_lock_class = { .name = "XRCD-uobj" };
-static struct uverbs_lock_class rule_lock_class = { .name = "RULE-uobj" };
-static struct uverbs_lock_class wq_lock_class = { .name = "WQ-uobj" };
-static struct uverbs_lock_class rwq_ind_table_lock_class = { .name = "IND_TBL-uobj" };
-
-/*
- * The ib_uobject locking scheme is as follows:
- *
- * - ib_uverbs_idr_lock protects the uverbs idrs themselves, so it
- *   needs to be held during all idr write operations.  When an object is
- *   looked up, a reference must be taken on the object's kref before
- *   dropping this lock.  For read operations, the rcu_read_lock()
- *   and rcu_write_lock() but similarly the kref reference is grabbed
- *   before the rcu_read_unlock().
- *
- * - Each object also has an rwsem.  This rwsem must be held for
- *   reading while an operation that uses the object is performed.
- *   For example, while registering an MR, the associated PD's
- *   uobject.mutex must be held for reading.  The rwsem must be held
- *   for writing while initializing or destroying an object.
- *
- * - In addition, each object has a "live" flag.  If this flag is not
- *   set, then lookups of the object will fail even if it is found in
- *   the idr.  This handles a reader that blocks and does not acquire
- *   the rwsem until after the object is destroyed.  The destroy
- *   operation will set the live flag to 0 and then drop the rwsem;
- *   this will allow the reader to acquire the rwsem, see that the
- *   live flag is 0, and then drop the rwsem and its reference to
- *   object.  The underlying storage will not be freed until the last
- *   reference to the object is dropped.
- */
-
-static void init_uobj(struct ib_uobject *uobj, u64 user_handle,
-		      struct ib_ucontext *context, struct uverbs_lock_class *c)
-{
-	uobj->user_handle = user_handle;
-	uobj->context     = context;
-	kref_init(&uobj->ref);
-	init_rwsem(&uobj->mutex);
-	lockdep_set_class_and_name(&uobj->mutex, &c->key, c->name);
-	uobj->live        = 0;
-}
-
-static void release_uobj(struct kref *kref)
-{
-	kfree_rcu(container_of(kref, struct ib_uobject, ref), rcu);
-}
-
-static void put_uobj(struct ib_uobject *uobj)
-{
-	kref_put(&uobj->ref, release_uobj);
-}
-
-static void put_uobj_read(struct ib_uobject *uobj)
-{
-	up_read(&uobj->mutex);
-	put_uobj(uobj);
-}
-
-static void put_uobj_write(struct ib_uobject *uobj)
-{
-	up_write(&uobj->mutex);
-	put_uobj(uobj);
-}
-
-static int idr_add_uobj(struct ib_uobject *uobj)
-{
-	int ret;
-
-	idr_preload(GFP_KERNEL);
-	spin_lock(&uobj->context->ufile->idr_lock);
-
-	ret = idr_alloc(&uobj->context->ufile->idr, uobj, 0, 0, GFP_NOWAIT);
-	if (ret >= 0)
-		uobj->id = ret;
-
-	spin_unlock(&uobj->context->ufile->idr_lock);
-	idr_preload_end();
-
-	return ret < 0 ? ret : 0;
-}
-
-void idr_remove_uobj(struct ib_uobject *uobj)
-{
-	spin_lock(&uobj->context->ufile->idr_lock);
-	idr_remove(&uobj->context->ufile->idr, uobj->id);
-	spin_unlock(&uobj->context->ufile->idr_lock);
-}
-
-static struct ib_uobject *__idr_get_uobj(int id, struct ib_ucontext *context)
-{
-	struct ib_uobject *uobj;
-
-	rcu_read_lock();
-	uobj = idr_find(&context->ufile->idr, id);
-	if (uobj) {
-		if (uobj->context == context)
-			kref_get(&uobj->ref);
-		else
-			uobj = NULL;
-	}
-	rcu_read_unlock();
-
-	return uobj;
-}
-
-static struct ib_uobject *idr_read_uobj(int id, struct ib_ucontext *context,
-					int nested)
-{
-	struct ib_uobject *uobj;
-
-	uobj = __idr_get_uobj(id, context);
-	if (!uobj)
-		return NULL;
-
-	if (nested)
-		down_read_nested(&uobj->mutex, SINGLE_DEPTH_NESTING);
-	else
-		down_read(&uobj->mutex);
-	if (!uobj->live) {
-		put_uobj_read(uobj);
-		return NULL;
-	}
-
-	return uobj;
-}
-
-static struct ib_uobject *idr_write_uobj(int id, struct ib_ucontext *context)
-{
-	struct ib_uobject *uobj;
-
-	uobj = __idr_get_uobj(id, context);
-	if (!uobj)
-		return NULL;
-
-	down_write(&uobj->mutex);
-	if (!uobj->live) {
-		put_uobj_write(uobj);
-		return NULL;
-	}
-
-	return uobj;
-}
-
-static void *idr_read_obj(int id, struct ib_ucontext *context,
-			  int nested)
-{
-	struct ib_uobject *uobj;
-
-	uobj = idr_read_uobj(id, context, nested);
-	return uobj ? uobj->object : NULL;
-}
-
-static struct ib_pd *idr_read_pd(int pd_handle, struct ib_ucontext *context)
-{
-	return idr_read_obj(pd_handle, context, 0);
-}
-
-static void put_pd_read(struct ib_pd *pd)
-{
-	put_uobj_read(pd->uobject);
-}
-
-static struct ib_cq *idr_read_cq(int cq_handle, struct ib_ucontext *context, int nested)
-{
-	return idr_read_obj(cq_handle, context, nested);
-}
-
-static void put_cq_read(struct ib_cq *cq)
-{
-	put_uobj_read(cq->uobject);
-}
-
-static struct ib_ah *idr_read_ah(int ah_handle, struct ib_ucontext *context)
-{
-	return idr_read_obj(ah_handle, context, 0);
-}
-
-static void put_ah_read(struct ib_ah *ah)
-{
-	put_uobj_read(ah->uobject);
-}
-
-static struct ib_qp *idr_read_qp(int qp_handle, struct ib_ucontext *context)
-{
-	return idr_read_obj(qp_handle, context, 0);
-}
-
-static struct ib_wq *idr_read_wq(int wq_handle, struct ib_ucontext *context)
-{
-	return idr_read_obj(wq_handle, context, 0);
-}
-
-static void put_wq_read(struct ib_wq *wq)
-{
-	put_uobj_read(wq->uobject);
-}
-
-static struct ib_rwq_ind_table *idr_read_rwq_indirection_table(int ind_table_handle,
-							       struct ib_ucontext *context)
-{
-	return idr_read_obj(ind_table_handle, context, 0);
-}
-
-static void put_rwq_indirection_table_read(struct ib_rwq_ind_table *ind_table)
-{
-	put_uobj_read(ind_table->uobject);
-}
-
-static struct ib_qp *idr_write_qp(int qp_handle, struct ib_ucontext *context)
-{
-	struct ib_uobject *uobj;
-
-	uobj = idr_write_uobj(qp_handle, context);
-	return uobj ? uobj->object : NULL;
-}
-
-static void put_qp_read(struct ib_qp *qp)
-{
-	put_uobj_read(qp->uobject);
-}
-
-static void put_qp_write(struct ib_qp *qp)
-{
-	put_uobj_write(qp->uobject);
-}
-
-static struct ib_srq *idr_read_srq(int srq_handle, struct ib_ucontext *context)
-{
-	return idr_read_obj(srq_handle, context, 0);
-}
-
-static void put_srq_read(struct ib_srq *srq)
-{
-	put_uobj_read(srq->uobject);
-}
-
-static struct ib_xrcd *idr_read_xrcd(int xrcd_handle, struct ib_ucontext *context,
-				     struct ib_uobject **uobj)
-{
-	*uobj = idr_read_uobj(xrcd_handle, context, 0);
-	return *uobj ? (*uobj)->object : NULL;
-}
-
-static void put_xrcd_read(struct ib_uobject *uobj)
-{
-	put_uobj_read(uobj);
-}
 ssize_t ib_uverbs_get_context(struct ib_uverbs_file *file,
 			      struct ib_device *ib_dev,
 			      const char __user *buf,
@@ -347,17 +91,8 @@ ssize_t ib_uverbs_get_context(struct ib_uverbs_file *file,
 	ucontext->cg_obj = cg_obj;
 	/* ufile is required when some objects are released */
 	ucontext->ufile = file;
-	INIT_LIST_HEAD(&ucontext->pd_list);
-	INIT_LIST_HEAD(&ucontext->mr_list);
-	INIT_LIST_HEAD(&ucontext->mw_list);
-	INIT_LIST_HEAD(&ucontext->cq_list);
-	INIT_LIST_HEAD(&ucontext->qp_list);
-	INIT_LIST_HEAD(&ucontext->srq_list);
-	INIT_LIST_HEAD(&ucontext->ah_list);
-	INIT_LIST_HEAD(&ucontext->wq_list);
-	INIT_LIST_HEAD(&ucontext->rwq_ind_tbl_list);
-	INIT_LIST_HEAD(&ucontext->xrcd_list);
-	INIT_LIST_HEAD(&ucontext->rule_list);
+	uverbs_initialize_ucontext(ucontext);
+
 	rcu_read_lock();
 	ucontext->tgid = get_task_pid(current->group_leader, PIDTYPE_PID);
 	rcu_read_unlock();
@@ -564,19 +299,9 @@ ssize_t ib_uverbs_alloc_pd(struct ib_uverbs_file *file,
 		   (unsigned long) cmd.response + sizeof resp,
 		   in_len - sizeof cmd, out_len - sizeof resp);
 
-	uobj = kmalloc(sizeof *uobj, GFP_KERNEL);
-	if (!uobj)
-		return -ENOMEM;
-
-	init_uobj(uobj, 0, file->ucontext, &pd_lock_class);
-	ret = ib_rdmacg_try_charge(&uobj->cg_obj, ib_dev,
-				   RDMACG_RESOURCE_HCA_OBJECT);
-	if (ret) {
-		kfree(uobj);
-		return ret;
-	}
-
-	down_write(&uobj->mutex);
+	uobj  = uobj_alloc(uobj_get_type(pd), file->ucontext);
+	if (IS_ERR(uobj))
+		return PTR_ERR(uobj);
 
 	pd = ib_dev->alloc_pd(ib_dev, file->ucontext, &udata);
 	if (IS_ERR(pd)) {
@@ -590,10 +315,6 @@ ssize_t ib_uverbs_alloc_pd(struct ib_uverbs_file *file,
 	atomic_set(&pd->usecnt, 0);
 
 	uobj->object = pd;
-	ret = idr_add_uobj(uobj);
-	if (ret)
-		goto err_idr;
-
 	memset(&resp, 0, sizeof resp);
 	resp.pd_handle = uobj->id;
 
@@ -603,25 +324,15 @@ ssize_t ib_uverbs_alloc_pd(struct ib_uverbs_file *file,
 		goto err_copy;
 	}
 
-	mutex_lock(&file->mutex);
-	list_add_tail(&uobj->list, &file->ucontext->pd_list);
-	mutex_unlock(&file->mutex);
-
-	uobj->live = 1;
-
-	up_write(&uobj->mutex);
+	uobj_alloc_commit(uobj);
 
 	return in_len;
 
 err_copy:
-	idr_remove_uobj(uobj);
-
-err_idr:
 	ib_dealloc_pd(pd);
 
 err:
-	ib_rdmacg_uncharge(&uobj->cg_obj, ib_dev, RDMACG_RESOURCE_HCA_OBJECT);
-	put_uobj_write(uobj);
+	uobj_alloc_abort(uobj);
 	return ret;
 }
 
@@ -632,45 +343,19 @@ ssize_t ib_uverbs_dealloc_pd(struct ib_uverbs_file *file,
 {
 	struct ib_uverbs_dealloc_pd cmd;
 	struct ib_uobject          *uobj;
-	struct ib_pd		   *pd;
 	int                         ret;
 
 	if (copy_from_user(&cmd, buf, sizeof cmd))
 		return -EFAULT;
 
-	uobj = idr_write_uobj(cmd.pd_handle, file->ucontext);
-	if (!uobj)
-		return -EINVAL;
-	pd = uobj->object;
+	uobj  = uobj_get_write(uobj_get_type(pd), cmd.pd_handle,
+			       file->ucontext);
+	if (IS_ERR(uobj))
+		return PTR_ERR(uobj);
 
-	if (atomic_read(&pd->usecnt)) {
-		ret = -EBUSY;
-		goto err_put;
-	}
+	ret = uobj_remove_commit(uobj);
 
-	ret = pd->device->dealloc_pd(uobj->object);
-	WARN_ONCE(ret, "Infiniband HW driver failed dealloc_pd");
-	if (ret)
-		goto err_put;
-
-	ib_rdmacg_uncharge(&uobj->cg_obj, ib_dev, RDMACG_RESOURCE_HCA_OBJECT);
-
-	uobj->live = 0;
-	put_uobj_write(uobj);
-
-	idr_remove_uobj(uobj);
-
-	mutex_lock(&file->mutex);
-	list_del(&uobj->list);
-	mutex_unlock(&file->mutex);
-
-	put_uobj(uobj);
-
-	return in_len;
-
-err_put:
-	put_uobj_write(uobj);
-	return ret;
+	return ret ?: in_len;
 }
 
 struct xrcd_table_entry {
@@ -807,16 +492,13 @@ ssize_t ib_uverbs_open_xrcd(struct ib_uverbs_file *file,
 		}
 	}
 
-	obj = kmalloc(sizeof *obj, GFP_KERNEL);
-	if (!obj) {
-		ret = -ENOMEM;
+	obj  = (struct ib_uxrcd_object *)uobj_alloc(uobj_get_type(xrcd),
+						    file->ucontext);
+	if (IS_ERR(obj)) {
+		ret = PTR_ERR(obj);
 		goto err_tree_mutex_unlock;
 	}
 
-	init_uobj(&obj->uobject, 0, file->ucontext, &xrcd_lock_class);
-
-	down_write(&obj->uobject.mutex);
-
 	if (!xrcd) {
 		xrcd = ib_dev->alloc_xrcd(ib_dev, file->ucontext, &udata);
 		if (IS_ERR(xrcd)) {
@@ -834,10 +516,6 @@ ssize_t ib_uverbs_open_xrcd(struct ib_uverbs_file *file,
 
 	atomic_set(&obj->refcnt, 0);
 	obj->uobject.object = xrcd;
-	ret = idr_add_uobj(&obj->uobject);
-	if (ret)
-		goto err_idr;
-
 	memset(&resp, 0, sizeof resp);
 	resp.xrcd_handle = obj->uobject.id;
 
@@ -846,7 +524,7 @@ ssize_t ib_uverbs_open_xrcd(struct ib_uverbs_file *file,
 			/* create new inode/xrcd table entry */
 			ret = xrcd_table_insert(file->device, inode, xrcd);
 			if (ret)
-				goto err_insert_xrcd;
+				goto err_dealloc_xrcd;
 		}
 		atomic_inc(&xrcd->usecnt);
 	}
@@ -860,12 +538,7 @@ ssize_t ib_uverbs_open_xrcd(struct ib_uverbs_file *file,
 	if (f.file)
 		fdput(f);
 
-	mutex_lock(&file->mutex);
-	list_add_tail(&obj->uobject.list, &file->ucontext->xrcd_list);
-	mutex_unlock(&file->mutex);
-
-	obj->uobject.live = 1;
-	up_write(&obj->uobject.mutex);
+	uobj_alloc_commit(&obj->uobject);
 
 	mutex_unlock(&file->device->xrcd_tree_mutex);
 	return in_len;
@@ -877,14 +550,11 @@ ssize_t ib_uverbs_open_xrcd(struct ib_uverbs_file *file,
 		atomic_dec(&xrcd->usecnt);
 	}
 
-err_insert_xrcd:
-	idr_remove_uobj(&obj->uobject);
-
-err_idr:
+err_dealloc_xrcd:
 	ib_dealloc_xrcd(xrcd);
 
 err:
-	put_uobj_write(&obj->uobject);
+	uobj_alloc_abort(&obj->uobject);
 
 err_tree_mutex_unlock:
 	if (f.file)
@@ -902,60 +572,20 @@ ssize_t ib_uverbs_close_xrcd(struct ib_uverbs_file *file,
 {
 	struct ib_uverbs_close_xrcd cmd;
 	struct ib_uobject           *uobj;
-	struct ib_xrcd              *xrcd = NULL;
-	struct inode                *inode = NULL;
-	struct ib_uxrcd_object      *obj;
-	int                         live;
 	int                         ret = 0;
 
 	if (copy_from_user(&cmd, buf, sizeof cmd))
 		return -EFAULT;
 
-	mutex_lock(&file->device->xrcd_tree_mutex);
-	uobj = idr_write_uobj(cmd.xrcd_handle, file->ucontext);
-	if (!uobj) {
-		ret = -EINVAL;
-		goto out;
-	}
-
-	xrcd  = uobj->object;
-	inode = xrcd->inode;
-	obj   = container_of(uobj, struct ib_uxrcd_object, uobject);
-	if (atomic_read(&obj->refcnt)) {
-		put_uobj_write(uobj);
-		ret = -EBUSY;
-		goto out;
-	}
-
-	if (!inode || atomic_dec_and_test(&xrcd->usecnt)) {
-		ret = ib_dealloc_xrcd(uobj->object);
-		if (!ret)
-			uobj->live = 0;
+	uobj  = uobj_get_write(uobj_get_type(xrcd), cmd.xrcd_handle,
+			       file->ucontext);
+	if (IS_ERR(uobj)) {
+		mutex_unlock(&file->device->xrcd_tree_mutex);
+		return PTR_ERR(uobj);
 	}
 
-	live = uobj->live;
-	if (inode && ret)
-		atomic_inc(&xrcd->usecnt);
-
-	put_uobj_write(uobj);
-
-	if (ret)
-		goto out;
-
-	if (inode && !live)
-		xrcd_table_delete(file->device, inode);
-
-	idr_remove_uobj(uobj);
-	mutex_lock(&file->mutex);
-	list_del(&uobj->list);
-	mutex_unlock(&file->mutex);
-
-	put_uobj(uobj);
-	ret = in_len;
-
-out:
-	mutex_unlock(&file->device->xrcd_tree_mutex);
-	return ret;
+	ret = uobj_remove_commit(uobj);
+	return ret ?: in_len;
 }
 
 int ib_uverbs_dealloc_xrcd(struct ib_uverbs_device *dev,
@@ -1009,14 +639,11 @@ ssize_t ib_uverbs_reg_mr(struct ib_uverbs_file *file,
 	if (ret)
 		return ret;
 
-	uobj = kmalloc(sizeof *uobj, GFP_KERNEL);
-	if (!uobj)
-		return -ENOMEM;
-
-	init_uobj(uobj, 0, file->ucontext, &mr_lock_class);
-	down_write(&uobj->mutex);
+	uobj  = uobj_alloc(uobj_get_type(mr), file->ucontext);
+	if (IS_ERR(uobj))
+		return PTR_ERR(uobj);
 
-	pd = idr_read_pd(cmd.pd_handle, file->ucontext);
+	pd = uobj_get_obj_read(pd, cmd.pd_handle, file->ucontext);
 	if (!pd) {
 		ret = -EINVAL;
 		goto err_free;
@@ -1030,10 +657,6 @@ ssize_t ib_uverbs_reg_mr(struct ib_uverbs_file *file,
 			goto err_put;
 		}
 	}
-	ret = ib_rdmacg_try_charge(&uobj->cg_obj, ib_dev,
-				   RDMACG_RESOURCE_HCA_OBJECT);
-	if (ret)
-		goto err_charge;
 
 	mr = pd->device->reg_user_mr(pd, cmd.start, cmd.length, cmd.hca_va,
 				     cmd.access_flags, &udata);
@@ -1048,9 +671,6 @@ ssize_t ib_uverbs_reg_mr(struct ib_uverbs_file *file,
 	atomic_inc(&pd->usecnt);
 
 	uobj->object = mr;
-	ret = idr_add_uobj(uobj);
-	if (ret)
-		goto err_unreg;
 
 	memset(&resp, 0, sizeof resp);
 	resp.lkey      = mr->lkey;
@@ -1063,32 +683,20 @@ ssize_t ib_uverbs_reg_mr(struct ib_uverbs_file *file,
 		goto err_copy;
 	}
 
-	put_pd_read(pd);
-
-	mutex_lock(&file->mutex);
-	list_add_tail(&uobj->list, &file->ucontext->mr_list);
-	mutex_unlock(&file->mutex);
+	uobj_put_obj_read(pd);
 
-	uobj->live = 1;
-
-	up_write(&uobj->mutex);
+	uobj_alloc_commit(uobj);
 
 	return in_len;
 
 err_copy:
-	idr_remove_uobj(uobj);
-
-err_unreg:
 	ib_dereg_mr(mr);
 
 err_put:
-	ib_rdmacg_uncharge(&uobj->cg_obj, ib_dev, RDMACG_RESOURCE_HCA_OBJECT);
-
-err_charge:
-	put_pd_read(pd);
+	uobj_put_obj_read(pd);
 
 err_free:
-	put_uobj_write(uobj);
+	uobj_alloc_abort(uobj);
 	return ret;
 }
 
@@ -1124,10 +732,10 @@ ssize_t ib_uverbs_rereg_mr(struct ib_uverbs_file *file,
 	     (cmd.start & ~PAGE_MASK) != (cmd.hca_va & ~PAGE_MASK)))
 			return -EINVAL;
 
-	uobj = idr_write_uobj(cmd.mr_handle, file->ucontext);
-
-	if (!uobj)
-		return -EINVAL;
+	uobj  = uobj_get_write(uobj_get_type(mr), cmd.mr_handle,
+			       file->ucontext);
+	if (IS_ERR(uobj))
+		return PTR_ERR(uobj);
 
 	mr = uobj->object;
 
@@ -1138,7 +746,7 @@ ssize_t ib_uverbs_rereg_mr(struct ib_uverbs_file *file,
 	}
 
 	if (cmd.flags & IB_MR_REREG_PD) {
-		pd = idr_read_pd(cmd.pd_handle, file->ucontext);
+		pd = uobj_get_obj_read(pd, cmd.pd_handle, file->ucontext);
 		if (!pd) {
 			ret = -EINVAL;
 			goto put_uobjs;
@@ -1171,11 +779,10 @@ ssize_t ib_uverbs_rereg_mr(struct ib_uverbs_file *file,
 
 put_uobj_pd:
 	if (cmd.flags & IB_MR_REREG_PD)
-		put_pd_read(pd);
+		uobj_put_obj_read(pd);
 
 put_uobjs:
-
-	put_uobj_write(mr->uobject);
+	uobj_put_write(uobj);
 
 	return ret;
 }
@@ -1186,38 +793,20 @@ ssize_t ib_uverbs_dereg_mr(struct ib_uverbs_file *file,
 			   int out_len)
 {
 	struct ib_uverbs_dereg_mr cmd;
-	struct ib_mr             *mr;
 	struct ib_uobject	 *uobj;
 	int                       ret = -EINVAL;
 
 	if (copy_from_user(&cmd, buf, sizeof cmd))
 		return -EFAULT;
 
-	uobj = idr_write_uobj(cmd.mr_handle, file->ucontext);
-	if (!uobj)
-		return -EINVAL;
-
-	mr = uobj->object;
-
-	ret = ib_dereg_mr(mr);
-	if (!ret)
-		uobj->live = 0;
-
-	put_uobj_write(uobj);
+	uobj  = uobj_get_write(uobj_get_type(mr), cmd.mr_handle,
+			       file->ucontext);
+	if (IS_ERR(uobj))
+		return PTR_ERR(uobj);
 
-	if (ret)
-		return ret;
+	ret = uobj_remove_commit(uobj);
 
-	ib_rdmacg_uncharge(&uobj->cg_obj, ib_dev, RDMACG_RESOURCE_HCA_OBJECT);
-	idr_remove_uobj(uobj);
-
-	mutex_lock(&file->mutex);
-	list_del(&uobj->list);
-	mutex_unlock(&file->mutex);
-
-	put_uobj(uobj);
-
-	return in_len;
+	return ret ?: in_len;
 }
 
 ssize_t ib_uverbs_alloc_mw(struct ib_uverbs_file *file,
@@ -1239,14 +828,11 @@ ssize_t ib_uverbs_alloc_mw(struct ib_uverbs_file *file,
 	if (copy_from_user(&cmd, buf, sizeof(cmd)))
 		return -EFAULT;
 
-	uobj = kmalloc(sizeof(*uobj), GFP_KERNEL);
-	if (!uobj)
-		return -ENOMEM;
+	uobj  = uobj_alloc(uobj_get_type(mw), file->ucontext);
+	if (IS_ERR(uobj))
+		return PTR_ERR(uobj);
 
-	init_uobj(uobj, 0, file->ucontext, &mw_lock_class);
-	down_write(&uobj->mutex);
-
-	pd = idr_read_pd(cmd.pd_handle, file->ucontext);
+	pd = uobj_get_obj_read(pd, cmd.pd_handle, file->ucontext);
 	if (!pd) {
 		ret = -EINVAL;
 		goto err_free;
@@ -1257,11 +843,6 @@ ssize_t ib_uverbs_alloc_mw(struct ib_uverbs_file *file,
 		   in_len - sizeof(cmd) - sizeof(struct ib_uverbs_cmd_hdr),
 		   out_len - sizeof(resp));
 
-	ret = ib_rdmacg_try_charge(&uobj->cg_obj, ib_dev,
-				   RDMACG_RESOURCE_HCA_OBJECT);
-	if (ret)
-		goto err_charge;
-
 	mw = pd->device->alloc_mw(pd, cmd.mw_type, &udata);
 	if (IS_ERR(mw)) {
 		ret = PTR_ERR(mw);
@@ -1274,9 +855,6 @@ ssize_t ib_uverbs_alloc_mw(struct ib_uverbs_file *file,
 	atomic_inc(&pd->usecnt);
 
 	uobj->object = mw;
-	ret = idr_add_uobj(uobj);
-	if (ret)
-		goto err_unalloc;
 
 	memset(&resp, 0, sizeof(resp));
 	resp.rkey      = mw->rkey;
@@ -1288,32 +866,17 @@ ssize_t ib_uverbs_alloc_mw(struct ib_uverbs_file *file,
 		goto err_copy;
 	}
 
-	put_pd_read(pd);
-
-	mutex_lock(&file->mutex);
-	list_add_tail(&uobj->list, &file->ucontext->mw_list);
-	mutex_unlock(&file->mutex);
-
-	uobj->live = 1;
-
-	up_write(&uobj->mutex);
+	uobj_put_obj_read(pd);
+	uobj_alloc_commit(uobj);
 
 	return in_len;
 
 err_copy:
-	idr_remove_uobj(uobj);
-
-err_unalloc:
 	uverbs_dealloc_mw(mw);
-
 err_put:
-	ib_rdmacg_uncharge(&uobj->cg_obj, ib_dev, RDMACG_RESOURCE_HCA_OBJECT);
-
-err_charge:
-	put_pd_read(pd);
-
+	uobj_put_obj_read(pd);
 err_free:
-	put_uobj_write(uobj);
+	uobj_alloc_abort(uobj);
 	return ret;
 }
 
@@ -1323,38 +886,19 @@ ssize_t ib_uverbs_dealloc_mw(struct ib_uverbs_file *file,
 			     int out_len)
 {
 	struct ib_uverbs_dealloc_mw cmd;
-	struct ib_mw               *mw;
 	struct ib_uobject	   *uobj;
 	int                         ret = -EINVAL;
 
 	if (copy_from_user(&cmd, buf, sizeof(cmd)))
 		return -EFAULT;
 
-	uobj = idr_write_uobj(cmd.mw_handle, file->ucontext);
-	if (!uobj)
-		return -EINVAL;
-
-	mw = uobj->object;
+	uobj  = uobj_get_write(uobj_get_type(mw), cmd.mw_handle,
+			       file->ucontext);
+	if (IS_ERR(uobj))
+		return PTR_ERR(uobj);
 
-	ret = uverbs_dealloc_mw(mw);
-	if (!ret)
-		uobj->live = 0;
-
-	put_uobj_write(uobj);
-
-	if (ret)
-		return ret;
-
-	ib_rdmacg_uncharge(&uobj->cg_obj, ib_dev, RDMACG_RESOURCE_HCA_OBJECT);
-	idr_remove_uobj(uobj);
-
-	mutex_lock(&file->mutex);
-	list_del(&uobj->list);
-	mutex_unlock(&file->mutex);
-
-	put_uobj(uobj);
-
-	return in_len;
+	ret = uobj_remove_commit(uobj);
+	return ret ?: in_len;
 }
 
 ssize_t ib_uverbs_create_comp_channel(struct ib_uverbs_file *file,
@@ -1418,12 +962,10 @@ static struct ib_ucq_object *create_cq(struct ib_uverbs_file *file,
 	if (cmd->comp_vector >= file->device->num_comp_vectors)
 		return ERR_PTR(-EINVAL);
 
-	obj = kmalloc(sizeof *obj, GFP_KERNEL);
-	if (!obj)
-		return ERR_PTR(-ENOMEM);
-
-	init_uobj(&obj->uobject, cmd->user_handle, file->ucontext, &cq_lock_class);
-	down_write(&obj->uobject.mutex);
+	obj  = (struct ib_ucq_object *)uobj_alloc(uobj_get_type(cq),
+						  file->ucontext);
+	if (IS_ERR(obj))
+		return obj;
 
 	if (cmd->comp_channel >= 0) {
 		ev_file = ib_uverbs_lookup_comp_file(cmd->comp_channel);
@@ -1433,6 +975,7 @@ static struct ib_ucq_object *create_cq(struct ib_uverbs_file *file,
 		}
 	}
 
+	obj->uobject.user_handle = cmd->user_handle;
 	obj->uverbs_file	   = file;
 	obj->comp_events_reported  = 0;
 	obj->async_events_reported = 0;
@@ -1445,13 +988,7 @@ static struct ib_ucq_object *create_cq(struct ib_uverbs_file *file,
 	if (cmd_sz > offsetof(typeof(*cmd), flags) + sizeof(cmd->flags))
 		attr.flags = cmd->flags;
 
-	ret = ib_rdmacg_try_charge(&obj->uobject.cg_obj, ib_dev,
-				   RDMACG_RESOURCE_HCA_OBJECT);
-	if (ret)
-		goto err_charge;
-
-	cq = ib_dev->create_cq(ib_dev, &attr,
-					     file->ucontext, uhw);
+	cq = ib_dev->create_cq(ib_dev, &attr, file->ucontext, uhw);
 	if (IS_ERR(cq)) {
 		ret = PTR_ERR(cq);
 		goto err_file;
@@ -1465,10 +1002,6 @@ static struct ib_ucq_object *create_cq(struct ib_uverbs_file *file,
 	atomic_set(&cq->usecnt, 0);
 
 	obj->uobject.object = cq;
-	ret = idr_add_uobj(&obj->uobject);
-	if (ret)
-		goto err_free;
-
 	memset(&resp, 0, sizeof resp);
 	resp.base.cq_handle = obj->uobject.id;
 	resp.base.cqe       = cq->cqe;
@@ -1480,32 +1013,19 @@ static struct ib_ucq_object *create_cq(struct ib_uverbs_file *file,
 	if (ret)
 		goto err_cb;
 
-	mutex_lock(&file->mutex);
-	list_add_tail(&obj->uobject.list, &file->ucontext->cq_list);
-	mutex_unlock(&file->mutex);
-
-	obj->uobject.live = 1;
-
-	up_write(&obj->uobject.mutex);
+	uobj_alloc_commit(&obj->uobject);
 
 	return obj;
 
 err_cb:
-	idr_remove_uobj(&obj->uobject);
-
-err_free:
 	ib_destroy_cq(cq);
 
 err_file:
-	ib_rdmacg_uncharge(&obj->uobject.cg_obj, ib_dev,
-			   RDMACG_RESOURCE_HCA_OBJECT);
-
-err_charge:
 	if (ev_file)
 		ib_uverbs_release_ucq(file, ev_file, obj);
 
 err:
-	put_uobj_write(&obj->uobject);
+	uobj_alloc_abort(&obj->uobject);
 
 	return ERR_PTR(ret);
 }
@@ -1628,7 +1148,7 @@ ssize_t ib_uverbs_resize_cq(struct ib_uverbs_file *file,
 		   (unsigned long) cmd.response + sizeof resp,
 		   in_len - sizeof cmd, out_len - sizeof resp);
 
-	cq = idr_read_cq(cmd.cq_handle, file->ucontext, 0);
+	cq = uobj_get_obj_read(cq, cmd.cq_handle, file->ucontext);
 	if (!cq)
 		return -EINVAL;
 
@@ -1643,7 +1163,7 @@ ssize_t ib_uverbs_resize_cq(struct ib_uverbs_file *file,
 		ret = -EFAULT;
 
 out:
-	put_cq_read(cq);
+	uobj_put_obj_read(cq);
 
 	return ret ? ret : in_len;
 }
@@ -1690,7 +1210,7 @@ ssize_t ib_uverbs_poll_cq(struct ib_uverbs_file *file,
 	if (copy_from_user(&cmd, buf, sizeof cmd))
 		return -EFAULT;
 
-	cq = idr_read_cq(cmd.cq_handle, file->ucontext, 0);
+	cq = uobj_get_obj_read(cq, cmd.cq_handle, file->ucontext);
 	if (!cq)
 		return -EINVAL;
 
@@ -1722,7 +1242,7 @@ ssize_t ib_uverbs_poll_cq(struct ib_uverbs_file *file,
 	ret = in_len;
 
 out_put:
-	put_cq_read(cq);
+	uobj_put_obj_read(cq);
 	return ret;
 }
 
@@ -1737,14 +1257,14 @@ ssize_t ib_uverbs_req_notify_cq(struct ib_uverbs_file *file,
 	if (copy_from_user(&cmd, buf, sizeof cmd))
 		return -EFAULT;
 
-	cq = idr_read_cq(cmd.cq_handle, file->ucontext, 0);
+	cq = uobj_get_obj_read(cq, cmd.cq_handle, file->ucontext);
 	if (!cq)
 		return -EINVAL;
 
 	ib_req_notify_cq(cq, cmd.solicited_only ?
 			 IB_CQ_SOLICITED : IB_CQ_NEXT_COMP);
 
-	put_cq_read(cq);
+	uobj_put_obj_read(cq);
 
 	return in_len;
 }
@@ -1765,37 +1285,32 @@ ssize_t ib_uverbs_destroy_cq(struct ib_uverbs_file *file,
 	if (copy_from_user(&cmd, buf, sizeof cmd))
 		return -EFAULT;
 
-	uobj = idr_write_uobj(cmd.cq_handle, file->ucontext);
-	if (!uobj)
-		return -EINVAL;
+	uobj  = uobj_get_write(uobj_get_type(cq), cmd.cq_handle,
+			       file->ucontext);
+	if (IS_ERR(uobj))
+		return PTR_ERR(uobj);
+
+	/*
+	 * Make sure we don't free the memory in remove_commit as we still
+	 * need the uobject memory to create the response.
+	 */
+	uverbs_uobject_get(uobj);
 	cq      = uobj->object;
 	ev_file = cq->cq_context;
 	obj     = container_of(cq->uobject, struct ib_ucq_object, uobject);
 
-	ret = ib_destroy_cq(cq);
-	if (!ret)
-		uobj->live = 0;
-
-	put_uobj_write(uobj);
+	memset(&resp, 0, sizeof(resp));
 
-	if (ret)
+	ret = uobj_remove_commit(uobj);
+	if (ret) {
+		uverbs_uobject_put(uobj);
 		return ret;
+	}
 
-	ib_rdmacg_uncharge(&uobj->cg_obj, ib_dev, RDMACG_RESOURCE_HCA_OBJECT);
-	idr_remove_uobj(uobj);
-
-	mutex_lock(&file->mutex);
-	list_del(&uobj->list);
-	mutex_unlock(&file->mutex);
-
-	ib_uverbs_release_ucq(file, ev_file, obj);
-
-	memset(&resp, 0, sizeof resp);
 	resp.comp_events_reported  = obj->comp_events_reported;
 	resp.async_events_reported = obj->async_events_reported;
 
-	put_uobj(uobj);
-
+	uverbs_uobject_put(uobj);
 	if (copy_to_user((void __user *) (unsigned long) cmd.response,
 			 &resp, sizeof resp))
 		return -EFAULT;
@@ -1817,7 +1332,7 @@ static int create_qp(struct ib_uverbs_file *file,
 	struct ib_device		*device;
 	struct ib_pd			*pd = NULL;
 	struct ib_xrcd			*xrcd = NULL;
-	struct ib_uobject		*uninitialized_var(xrcd_uobj);
+	struct ib_uobject		*xrcd_uobj = ERR_PTR(-ENOENT);
 	struct ib_cq			*scq = NULL, *rcq = NULL;
 	struct ib_srq			*srq = NULL;
 	struct ib_qp			*qp;
@@ -1831,18 +1346,19 @@ static int create_qp(struct ib_uverbs_file *file,
 	if (cmd->qp_type == IB_QPT_RAW_PACKET && !capable(CAP_NET_RAW))
 		return -EPERM;
 
-	obj = kzalloc(sizeof *obj, GFP_KERNEL);
-	if (!obj)
-		return -ENOMEM;
+	obj  = (struct ib_uqp_object *)uobj_alloc(uobj_get_type(qp),
+						  file->ucontext);
+	if (IS_ERR(obj))
+		return PTR_ERR(obj);
+	obj->uxrcd = NULL;
+	obj->uevent.uobject.user_handle = cmd->user_handle;
 
-	init_uobj(&obj->uevent.uobject, cmd->user_handle, file->ucontext,
-		  &qp_lock_class);
-	down_write(&obj->uevent.uobject.mutex);
 	if (cmd_sz >= offsetof(typeof(*cmd), rwq_ind_tbl_handle) +
 		      sizeof(cmd->rwq_ind_tbl_handle) &&
 		      (cmd->comp_mask & IB_UVERBS_CREATE_QP_MASK_IND_TABLE)) {
-		ind_tbl = idr_read_rwq_indirection_table(cmd->rwq_ind_tbl_handle,
-							 file->ucontext);
+		ind_tbl = uobj_get_obj_read(rwq_ind_table,
+					    cmd->rwq_ind_tbl_handle,
+					    file->ucontext);
 		if (!ind_tbl) {
 			ret = -EINVAL;
 			goto err_put;
@@ -1866,8 +1382,15 @@ static int create_qp(struct ib_uverbs_file *file,
 		has_sq = false;
 
 	if (cmd->qp_type == IB_QPT_XRC_TGT) {
-		xrcd = idr_read_xrcd(cmd->pd_handle, file->ucontext,
-				     &xrcd_uobj);
+		xrcd_uobj = uobj_get_read(uobj_get_type(xrcd), cmd->pd_handle,
+					  file->ucontext);
+
+		if (IS_ERR(xrcd_uobj)) {
+			ret = -EINVAL;
+			goto err_put;
+		}
+
+		xrcd = (struct ib_xrcd *)xrcd_uobj->object;
 		if (!xrcd) {
 			ret = -EINVAL;
 			goto err_put;
@@ -1879,8 +1402,8 @@ static int create_qp(struct ib_uverbs_file *file,
 			cmd->max_recv_sge = 0;
 		} else {
 			if (cmd->is_srq) {
-				srq = idr_read_srq(cmd->srq_handle,
-						   file->ucontext);
+				srq = uobj_get_obj_read(srq, cmd->srq_handle,
+							file->ucontext);
 				if (!srq || srq->srq_type != IB_SRQT_BASIC) {
 					ret = -EINVAL;
 					goto err_put;
@@ -1889,8 +1412,8 @@ static int create_qp(struct ib_uverbs_file *file,
 
 			if (!ind_tbl) {
 				if (cmd->recv_cq_handle != cmd->send_cq_handle) {
-					rcq = idr_read_cq(cmd->recv_cq_handle,
-							  file->ucontext, 0);
+					rcq = uobj_get_obj_read(cq, cmd->recv_cq_handle,
+								file->ucontext);
 					if (!rcq) {
 						ret = -EINVAL;
 						goto err_put;
@@ -1900,10 +1423,11 @@ static int create_qp(struct ib_uverbs_file *file,
 		}
 
 		if (has_sq)
-			scq = idr_read_cq(cmd->send_cq_handle, file->ucontext, !!rcq);
+			scq = uobj_get_obj_read(cq, cmd->send_cq_handle,
+						file->ucontext);
 		if (!ind_tbl)
 			rcq = rcq ?: scq;
-		pd  = idr_read_pd(cmd->pd_handle, file->ucontext);
+		pd  = uobj_get_obj_read(pd, cmd->pd_handle, file->ucontext);
 		if (!pd || (!scq && has_sq)) {
 			ret = -EINVAL;
 			goto err_put;
@@ -1955,11 +1479,6 @@ static int create_qp(struct ib_uverbs_file *file,
 			goto err_put;
 		}
 
-	ret = ib_rdmacg_try_charge(&obj->uevent.uobject.cg_obj, device,
-				   RDMACG_RESOURCE_HCA_OBJECT);
-	if (ret)
-		goto err_put;
-
 	if (cmd->qp_type == IB_QPT_XRC_TGT)
 		qp = ib_create_qp(pd, &attr);
 	else
@@ -1967,7 +1486,7 @@ static int create_qp(struct ib_uverbs_file *file,
 
 	if (IS_ERR(qp)) {
 		ret = PTR_ERR(qp);
-		goto err_create;
+		goto err_put;
 	}
 
 	if (cmd->qp_type != IB_QPT_XRC_TGT) {
@@ -1995,9 +1514,6 @@ static int create_qp(struct ib_uverbs_file *file,
 	qp->uobject = &obj->uevent.uobject;
 
 	obj->uevent.uobject.object = qp;
-	ret = idr_add_uobj(&obj->uevent.uobject);
-	if (ret)
-		goto err_destroy;
 
 	memset(&resp, 0, sizeof resp);
 	resp.base.qpn             = qp->qp_num;
@@ -2019,54 +1535,41 @@ static int create_qp(struct ib_uverbs_file *file,
 		obj->uxrcd = container_of(xrcd_uobj, struct ib_uxrcd_object,
 					  uobject);
 		atomic_inc(&obj->uxrcd->refcnt);
-		put_xrcd_read(xrcd_uobj);
+		uobj_put_read(xrcd_uobj);
 	}
 
 	if (pd)
-		put_pd_read(pd);
+		uobj_put_obj_read(pd);
 	if (scq)
-		put_cq_read(scq);
+		uobj_put_obj_read(scq);
 	if (rcq && rcq != scq)
-		put_cq_read(rcq);
+		uobj_put_obj_read(rcq);
 	if (srq)
-		put_srq_read(srq);
+		uobj_put_obj_read(srq);
 	if (ind_tbl)
-		put_rwq_indirection_table_read(ind_tbl);
-
-	mutex_lock(&file->mutex);
-	list_add_tail(&obj->uevent.uobject.list, &file->ucontext->qp_list);
-	mutex_unlock(&file->mutex);
-
-	obj->uevent.uobject.live = 1;
+		uobj_put_obj_read(ind_tbl);
 
-	up_write(&obj->uevent.uobject.mutex);
+	uobj_alloc_commit(&obj->uevent.uobject);
 
 	return 0;
 err_cb:
-	idr_remove_uobj(&obj->uevent.uobject);
-
-err_destroy:
 	ib_destroy_qp(qp);
 
-err_create:
-	ib_rdmacg_uncharge(&obj->uevent.uobject.cg_obj, device,
-			   RDMACG_RESOURCE_HCA_OBJECT);
-
 err_put:
-	if (xrcd)
-		put_xrcd_read(xrcd_uobj);
+	if (!IS_ERR(xrcd_uobj))
+		uobj_put_read(xrcd_uobj);
 	if (pd)
-		put_pd_read(pd);
+		uobj_put_obj_read(pd);
 	if (scq)
-		put_cq_read(scq);
+		uobj_put_obj_read(scq);
 	if (rcq && rcq != scq)
-		put_cq_read(rcq);
+		uobj_put_obj_read(rcq);
 	if (srq)
-		put_srq_read(srq);
+		uobj_put_obj_read(srq);
 	if (ind_tbl)
-		put_rwq_indirection_table_read(ind_tbl);
+		uobj_put_obj_read(ind_tbl);
 
-	put_uobj_write(&obj->uevent.uobject);
+	uobj_alloc_abort(&obj->uevent.uobject);
 	return ret;
 }
 
@@ -2202,17 +1705,22 @@ ssize_t ib_uverbs_open_qp(struct ib_uverbs_file *file,
 		   (unsigned long) cmd.response + sizeof resp,
 		   in_len - sizeof cmd, out_len - sizeof resp);
 
-	obj = kmalloc(sizeof *obj, GFP_KERNEL);
-	if (!obj)
-		return -ENOMEM;
+	obj  = (struct ib_uqp_object *)uobj_alloc(uobj_get_type(qp),
+						  file->ucontext);
+	if (IS_ERR(obj))
+		return PTR_ERR(obj);
 
-	init_uobj(&obj->uevent.uobject, cmd.user_handle, file->ucontext, &qp_lock_class);
-	down_write(&obj->uevent.uobject.mutex);
+	xrcd_uobj = uobj_get_read(uobj_get_type(xrcd), cmd.pd_handle,
+				  file->ucontext);
+	if (IS_ERR(xrcd_uobj)) {
+		ret = -EINVAL;
+		goto err_put;
+	}
 
-	xrcd = idr_read_xrcd(cmd.pd_handle, file->ucontext, &xrcd_uobj);
+	xrcd = (struct ib_xrcd *)xrcd_uobj->object;
 	if (!xrcd) {
 		ret = -EINVAL;
-		goto err_put;
+		goto err_xrcd;
 	}
 
 	attr.event_handler = ib_uverbs_qp_event_handler;
@@ -2227,15 +1735,11 @@ ssize_t ib_uverbs_open_qp(struct ib_uverbs_file *file,
 	qp = ib_open_qp(xrcd, &attr);
 	if (IS_ERR(qp)) {
 		ret = PTR_ERR(qp);
-		goto err_put;
+		goto err_xrcd;
 	}
 
-	qp->uobject = &obj->uevent.uobject;
-
 	obj->uevent.uobject.object = qp;
-	ret = idr_add_uobj(&obj->uevent.uobject);
-	if (ret)
-		goto err_destroy;
+	obj->uevent.uobject.user_handle = cmd.user_handle;
 
 	memset(&resp, 0, sizeof resp);
 	resp.qpn       = qp->qp_num;
@@ -2244,32 +1748,25 @@ ssize_t ib_uverbs_open_qp(struct ib_uverbs_file *file,
 	if (copy_to_user((void __user *) (unsigned long) cmd.response,
 			 &resp, sizeof resp)) {
 		ret = -EFAULT;
-		goto err_remove;
+		goto err_destroy;
 	}
 
 	obj->uxrcd = container_of(xrcd_uobj, struct ib_uxrcd_object, uobject);
 	atomic_inc(&obj->uxrcd->refcnt);
-	put_xrcd_read(xrcd_uobj);
-
-	mutex_lock(&file->mutex);
-	list_add_tail(&obj->uevent.uobject.list, &file->ucontext->qp_list);
-	mutex_unlock(&file->mutex);
+	qp->uobject = &obj->uevent.uobject;
+	uobj_put_read(xrcd_uobj);
 
-	obj->uevent.uobject.live = 1;
 
-	up_write(&obj->uevent.uobject.mutex);
+	uobj_alloc_commit(&obj->uevent.uobject);
 
 	return in_len;
 
-err_remove:
-	idr_remove_uobj(&obj->uevent.uobject);
-
 err_destroy:
 	ib_destroy_qp(qp);
-
+err_xrcd:
+	uobj_put_read(xrcd_uobj);
 err_put:
-	put_xrcd_read(xrcd_uobj);
-	put_uobj_write(&obj->uevent.uobject);
+	uobj_alloc_abort(&obj->uevent.uobject);
 	return ret;
 }
 
@@ -2295,7 +1792,7 @@ ssize_t ib_uverbs_query_qp(struct ib_uverbs_file *file,
 		goto out;
 	}
 
-	qp = idr_read_qp(cmd.qp_handle, file->ucontext);
+	qp = uobj_get_obj_read(qp, cmd.qp_handle, file->ucontext);
 	if (!qp) {
 		ret = -EINVAL;
 		goto out;
@@ -2303,7 +1800,7 @@ ssize_t ib_uverbs_query_qp(struct ib_uverbs_file *file,
 
 	ret = ib_query_qp(qp, attr, cmd.attr_mask, init_attr);
 
-	put_qp_read(qp);
+	uobj_put_obj_read(qp);
 
 	if (ret)
 		goto out;
@@ -2399,7 +1896,7 @@ static int modify_qp(struct ib_uverbs_file *file,
 	if (!attr)
 		return -ENOMEM;
 
-	qp = idr_read_qp(cmd->base.qp_handle, file->ucontext);
+	qp = uobj_get_obj_read(qp, cmd->base.qp_handle, file->ucontext);
 	if (!qp) {
 		ret = -EINVAL;
 		goto out;
@@ -2471,7 +1968,7 @@ static int modify_qp(struct ib_uverbs_file *file,
 	}
 
 release_qp:
-	put_qp_read(qp);
+	uobj_put_obj_read(qp);
 
 out:
 	kfree(attr);
@@ -2558,42 +2055,27 @@ ssize_t ib_uverbs_destroy_qp(struct ib_uverbs_file *file,
 
 	memset(&resp, 0, sizeof resp);
 
-	uobj = idr_write_uobj(cmd.qp_handle, file->ucontext);
-	if (!uobj)
-		return -EINVAL;
+	uobj  = uobj_get_write(uobj_get_type(qp), cmd.qp_handle,
+			       file->ucontext);
+	if (IS_ERR(uobj))
+		return PTR_ERR(uobj);
+
 	qp  = uobj->object;
 	obj = container_of(uobj, struct ib_uqp_object, uevent.uobject);
+	/*
+	 * Make sure we don't free the memory in remove_commit as we still
+	 * need the uobject memory to create the response.
+	 */
+	uverbs_uobject_get(uobj);
 
-	if (!list_empty(&obj->mcast_list)) {
-		put_uobj_write(uobj);
-		return -EBUSY;
-	}
-
-	ret = ib_destroy_qp(qp);
-	if (!ret)
-		uobj->live = 0;
-
-	put_uobj_write(uobj);
-
-	if (ret)
+	ret = uobj_remove_commit(uobj);
+	if (ret) {
+		uverbs_uobject_put(uobj);
 		return ret;
-
-	ib_rdmacg_uncharge(&uobj->cg_obj, ib_dev, RDMACG_RESOURCE_HCA_OBJECT);
-
-	if (obj->uxrcd)
-		atomic_dec(&obj->uxrcd->refcnt);
-
-	idr_remove_uobj(uobj);
-
-	mutex_lock(&file->mutex);
-	list_del(&uobj->list);
-	mutex_unlock(&file->mutex);
-
-	ib_uverbs_release_uevent(file, &obj->uevent);
+	}
 
 	resp.events_reported = obj->uevent.events_reported;
-
-	put_uobj(uobj);
+	uverbs_uobject_put(uobj);
 
 	if (copy_to_user((void __user *) (unsigned long) cmd.response,
 			 &resp, sizeof resp))
@@ -2637,7 +2119,7 @@ ssize_t ib_uverbs_post_send(struct ib_uverbs_file *file,
 	if (!user_wr)
 		return -ENOMEM;
 
-	qp = idr_read_qp(cmd.qp_handle, file->ucontext);
+	qp = uobj_get_obj_read(qp, cmd.qp_handle, file->ucontext);
 	if (!qp)
 		goto out;
 
@@ -2673,7 +2155,8 @@ ssize_t ib_uverbs_post_send(struct ib_uverbs_file *file,
 				goto out_put;
 			}
 
-			ud->ah = idr_read_ah(user_wr->wr.ud.ah, file->ucontext);
+			ud->ah = uobj_get_obj_read(ah, user_wr->wr.ud.ah,
+						   file->ucontext);
 			if (!ud->ah) {
 				kfree(ud);
 				ret = -EINVAL;
@@ -2780,11 +2263,11 @@ ssize_t ib_uverbs_post_send(struct ib_uverbs_file *file,
 		ret = -EFAULT;
 
 out_put:
-	put_qp_read(qp);
+	uobj_put_obj_read(qp);
 
 	while (wr) {
 		if (is_ud && ud_wr(wr)->ah)
-			put_ah_read(ud_wr(wr)->ah);
+			uobj_put_obj_read(ud_wr(wr)->ah);
 		next = wr->next;
 		kfree(wr);
 		wr = next;
@@ -2901,21 +2384,21 @@ ssize_t ib_uverbs_post_recv(struct ib_uverbs_file *file,
 	if (IS_ERR(wr))
 		return PTR_ERR(wr);
 
-	qp = idr_read_qp(cmd.qp_handle, file->ucontext);
+	qp = uobj_get_obj_read(qp, cmd.qp_handle, file->ucontext);
 	if (!qp)
 		goto out;
 
 	resp.bad_wr = 0;
 	ret = qp->device->post_recv(qp->real_qp, wr, &bad_wr);
 
-	put_qp_read(qp);
-
-	if (ret)
+	uobj_put_obj_read(qp);
+	if (ret) {
 		for (next = wr; next; next = next->next) {
 			++resp.bad_wr;
 			if (next == bad_wr)
 				break;
 		}
+	}
 
 	if (copy_to_user((void __user *) (unsigned long) cmd.response,
 			 &resp, sizeof resp))
@@ -2951,14 +2434,14 @@ ssize_t ib_uverbs_post_srq_recv(struct ib_uverbs_file *file,
 	if (IS_ERR(wr))
 		return PTR_ERR(wr);
 
-	srq = idr_read_srq(cmd.srq_handle, file->ucontext);
+	srq = uobj_get_obj_read(srq, cmd.srq_handle, file->ucontext);
 	if (!srq)
 		goto out;
 
 	resp.bad_wr = 0;
 	ret = srq->device->post_srq_recv(srq, wr, &bad_wr);
 
-	put_srq_read(srq);
+	uobj_put_obj_read(srq);
 
 	if (ret)
 		for (next = wr; next; next = next->next) {
@@ -3005,14 +2488,11 @@ ssize_t ib_uverbs_create_ah(struct ib_uverbs_file *file,
 		   (unsigned long)cmd.response + sizeof(resp),
 		   in_len - sizeof(cmd), out_len - sizeof(resp));
 
-	uobj = kmalloc(sizeof *uobj, GFP_KERNEL);
-	if (!uobj)
-		return -ENOMEM;
-
-	init_uobj(uobj, cmd.user_handle, file->ucontext, &ah_lock_class);
-	down_write(&uobj->mutex);
+	uobj  = uobj_alloc(uobj_get_type(ah), file->ucontext);
+	if (IS_ERR(uobj))
+		return PTR_ERR(uobj);
 
-	pd = idr_read_pd(cmd.pd_handle, file->ucontext);
+	pd = uobj_get_obj_read(pd, cmd.pd_handle, file->ucontext);
 	if (!pd) {
 		ret = -EINVAL;
 		goto err;
@@ -3031,28 +2511,20 @@ ssize_t ib_uverbs_create_ah(struct ib_uverbs_file *file,
 	memset(&attr.dmac, 0, sizeof(attr.dmac));
 	memcpy(attr.grh.dgid.raw, cmd.attr.grh.dgid, 16);
 
-	ret = ib_rdmacg_try_charge(&uobj->cg_obj, ib_dev,
-				   RDMACG_RESOURCE_HCA_OBJECT);
-	if (ret)
-		goto err_charge;
-
 	ah = pd->device->create_ah(pd, &attr, &udata);
 
 	if (IS_ERR(ah)) {
 		ret = PTR_ERR(ah);
-		goto err_create;
+		goto err_put;
 	}
 
 	ah->device  = pd->device;
 	ah->pd      = pd;
 	atomic_inc(&pd->usecnt);
 	ah->uobject  = uobj;
+	uobj->user_handle = cmd.user_handle;
 	uobj->object = ah;
 
-	ret = idr_add_uobj(uobj);
-	if (ret)
-		goto err_destroy;
-
 	resp.ah_handle = uobj->id;
 
 	if (copy_to_user((void __user *) (unsigned long) cmd.response,
@@ -3061,32 +2533,19 @@ ssize_t ib_uverbs_create_ah(struct ib_uverbs_file *file,
 		goto err_copy;
 	}
 
-	put_pd_read(pd);
-
-	mutex_lock(&file->mutex);
-	list_add_tail(&uobj->list, &file->ucontext->ah_list);
-	mutex_unlock(&file->mutex);
-
-	uobj->live = 1;
-
-	up_write(&uobj->mutex);
+	uobj_put_obj_read(pd);
+	uobj_alloc_commit(uobj);
 
 	return in_len;
 
 err_copy:
-	idr_remove_uobj(uobj);
-
-err_destroy:
 	ib_destroy_ah(ah);
 
-err_create:
-	ib_rdmacg_uncharge(&uobj->cg_obj, ib_dev, RDMACG_RESOURCE_HCA_OBJECT);
-
-err_charge:
-	put_pd_read(pd);
+err_put:
+	uobj_put_obj_read(pd);
 
 err:
-	put_uobj_write(uobj);
+	uobj_alloc_abort(uobj);
 	return ret;
 }
 
@@ -3095,37 +2554,19 @@ ssize_t ib_uverbs_destroy_ah(struct ib_uverbs_file *file,
 			     const char __user *buf, int in_len, int out_len)
 {
 	struct ib_uverbs_destroy_ah cmd;
-	struct ib_ah		   *ah;
 	struct ib_uobject	   *uobj;
 	int			    ret;
 
 	if (copy_from_user(&cmd, buf, sizeof cmd))
 		return -EFAULT;
 
-	uobj = idr_write_uobj(cmd.ah_handle, file->ucontext);
-	if (!uobj)
-		return -EINVAL;
-	ah = uobj->object;
-
-	ret = ib_destroy_ah(ah);
-	if (!ret)
-		uobj->live = 0;
-
-	put_uobj_write(uobj);
-
-	if (ret)
-		return ret;
-
-	ib_rdmacg_uncharge(&uobj->cg_obj, ib_dev, RDMACG_RESOURCE_HCA_OBJECT);
-	idr_remove_uobj(uobj);
-
-	mutex_lock(&file->mutex);
-	list_del(&uobj->list);
-	mutex_unlock(&file->mutex);
+	uobj  = uobj_get_write(uobj_get_type(ah), cmd.ah_handle,
+			       file->ucontext);
+	if (IS_ERR(uobj))
+		return PTR_ERR(uobj);
 
-	put_uobj(uobj);
-
-	return in_len;
+	ret = uobj_remove_commit(uobj);
+	return ret ?: in_len;
 }
 
 ssize_t ib_uverbs_attach_mcast(struct ib_uverbs_file *file,
@@ -3142,7 +2583,7 @@ ssize_t ib_uverbs_attach_mcast(struct ib_uverbs_file *file,
 	if (copy_from_user(&cmd, buf, sizeof cmd))
 		return -EFAULT;
 
-	qp = idr_write_qp(cmd.qp_handle, file->ucontext);
+	qp = uobj_get_obj_read(qp, cmd.qp_handle, file->ucontext);
 	if (!qp)
 		return -EINVAL;
 
@@ -3171,7 +2612,7 @@ ssize_t ib_uverbs_attach_mcast(struct ib_uverbs_file *file,
 		kfree(mcast);
 
 out_put:
-	put_qp_write(qp);
+	uobj_put_obj_read(qp);
 
 	return ret ? ret : in_len;
 }
@@ -3190,16 +2631,16 @@ ssize_t ib_uverbs_detach_mcast(struct ib_uverbs_file *file,
 	if (copy_from_user(&cmd, buf, sizeof cmd))
 		return -EFAULT;
 
-	qp = idr_write_qp(cmd.qp_handle, file->ucontext);
+	qp = uobj_get_obj_read(qp, cmd.qp_handle, file->ucontext);
 	if (!qp)
 		return -EINVAL;
 
+	obj = container_of(qp->uobject, struct ib_uqp_object, uevent.uobject);
+
 	ret = ib_detach_mcast(qp, (union ib_gid *) cmd.gid, cmd.mlid);
 	if (ret)
 		goto out_put;
 
-	obj = container_of(qp->uobject, struct ib_uqp_object, uevent.uobject);
-
 	list_for_each_entry(mcast, &obj->mcast_list, list)
 		if (cmd.mlid == mcast->lid &&
 		    !memcmp(cmd.gid, mcast->gid.raw, sizeof mcast->gid.raw)) {
@@ -3209,8 +2650,7 @@ ssize_t ib_uverbs_detach_mcast(struct ib_uverbs_file *file,
 		}
 
 out_put:
-	put_qp_write(qp);
-
+	uobj_put_obj_read(qp);
 	return ret ? ret : in_len;
 }
 
@@ -3402,20 +2842,18 @@ int ib_uverbs_ex_create_wq(struct ib_uverbs_file *file,
 	if (cmd.comp_mask)
 		return -EOPNOTSUPP;
 
-	obj = kmalloc(sizeof(*obj), GFP_KERNEL);
-	if (!obj)
-		return -ENOMEM;
+	obj  = (struct ib_uwq_object *)uobj_alloc(uobj_get_type(wq),
+						  file->ucontext);
+	if (IS_ERR(obj))
+		return PTR_ERR(obj);
 
-	init_uobj(&obj->uevent.uobject, cmd.user_handle, file->ucontext,
-		  &wq_lock_class);
-	down_write(&obj->uevent.uobject.mutex);
-	pd  = idr_read_pd(cmd.pd_handle, file->ucontext);
+	pd  = uobj_get_obj_read(pd, cmd.pd_handle, file->ucontext);
 	if (!pd) {
 		err = -EINVAL;
 		goto err_uobj;
 	}
 
-	cq = idr_read_cq(cmd.cq_handle, file->ucontext, 0);
+	cq = uobj_get_obj_read(cq, cmd.cq_handle, file->ucontext);
 	if (!cq) {
 		err = -EINVAL;
 		goto err_put_pd;
@@ -3450,9 +2888,6 @@ int ib_uverbs_ex_create_wq(struct ib_uverbs_file *file,
 	atomic_inc(&cq->usecnt);
 	wq->uobject = &obj->uevent.uobject;
 	obj->uevent.uobject.object = wq;
-	err = idr_add_uobj(&obj->uevent.uobject);
-	if (err)
-		goto destroy_wq;
 
 	memset(&resp, 0, sizeof(resp));
 	resp.wq_handle = obj->uevent.uobject.id;
@@ -3465,27 +2900,19 @@ int ib_uverbs_ex_create_wq(struct ib_uverbs_file *file,
 	if (err)
 		goto err_copy;
 
-	put_pd_read(pd);
-	put_cq_read(cq);
-
-	mutex_lock(&file->mutex);
-	list_add_tail(&obj->uevent.uobject.list, &file->ucontext->wq_list);
-	mutex_unlock(&file->mutex);
-
-	obj->uevent.uobject.live = 1;
-	up_write(&obj->uevent.uobject.mutex);
+	uobj_put_obj_read(pd);
+	uobj_put_obj_read(cq);
+	uobj_alloc_commit(&obj->uevent.uobject);
 	return 0;
 
 err_copy:
-	idr_remove_uobj(&obj->uevent.uobject);
-destroy_wq:
 	ib_destroy_wq(wq);
 err_put_cq:
-	put_cq_read(cq);
+	uobj_put_obj_read(cq);
 err_put_pd:
-	put_pd_read(pd);
+	uobj_put_obj_read(pd);
 err_uobj:
-	put_uobj_write(&obj->uevent.uobject);
+	uobj_alloc_abort(&obj->uevent.uobject);
 
 	return err;
 }
@@ -3526,31 +2953,27 @@ int ib_uverbs_ex_destroy_wq(struct ib_uverbs_file *file,
 		return -EOPNOTSUPP;
 
 	resp.response_length = required_resp_len;
-	uobj = idr_write_uobj(cmd.wq_handle,
-			      file->ucontext);
-	if (!uobj)
-		return -EINVAL;
+	uobj  = uobj_get_write(uobj_get_type(wq), cmd.wq_handle,
+			       file->ucontext);
+	if (IS_ERR(uobj))
+		return PTR_ERR(uobj);
 
 	wq = uobj->object;
 	obj = container_of(uobj, struct ib_uwq_object, uevent.uobject);
-	ret = ib_destroy_wq(wq);
-	if (!ret)
-		uobj->live = 0;
+	/*
+	 * Make sure we don't free the memory in remove_commit as we still
+	 * need the uobject memory to create the response.
+	 */
+	uverbs_uobject_get(uobj);
 
-	put_uobj_write(uobj);
-	if (ret)
+	ret = uobj_remove_commit(uobj);
+	if (ret) {
+		uverbs_uobject_put(uobj);
 		return ret;
+	}
 
-	idr_remove_uobj(uobj);
-
-	mutex_lock(&file->mutex);
-	list_del(&uobj->list);
-	mutex_unlock(&file->mutex);
-
-	ib_uverbs_release_uevent(file, &obj->uevent);
 	resp.events_reported = obj->uevent.events_reported;
-	put_uobj(uobj);
-
+	uverbs_uobject_put(uobj);
 	ret = ib_copy_to_udata(ucore, &resp, resp.response_length);
 	if (ret)
 		return ret;
@@ -3588,7 +3011,7 @@ int ib_uverbs_ex_modify_wq(struct ib_uverbs_file *file,
 	if (cmd.attr_mask > (IB_WQ_STATE | IB_WQ_CUR_STATE | IB_WQ_FLAGS))
 		return -EINVAL;
 
-	wq = idr_read_wq(cmd.wq_handle, file->ucontext);
+	wq = uobj_get_obj_read(wq, cmd.wq_handle, file->ucontext);
 	if (!wq)
 		return -EINVAL;
 
@@ -3599,7 +3022,7 @@ int ib_uverbs_ex_modify_wq(struct ib_uverbs_file *file,
 		wq_attr.flags_mask = cmd.flags_mask;
 	}
 	ret = wq->device->modify_wq(wq, &wq_attr, cmd.attr_mask, uhw);
-	put_wq_read(wq);
+	uobj_put_obj_read(wq);
 	return ret;
 }
 
@@ -3677,7 +3100,8 @@ int ib_uverbs_ex_create_rwq_ind_table(struct ib_uverbs_file *file,
 
 	for (num_read_wqs = 0; num_read_wqs < num_wq_handles;
 			num_read_wqs++) {
-		wq = idr_read_wq(wqs_handles[num_read_wqs], file->ucontext);
+		wq = uobj_get_obj_read(wq, wqs_handles[num_read_wqs],
+				       file->ucontext);
 		if (!wq) {
 			err = -EINVAL;
 			goto put_wqs;
@@ -3686,14 +3110,12 @@ int ib_uverbs_ex_create_rwq_ind_table(struct ib_uverbs_file *file,
 		wqs[num_read_wqs] = wq;
 	}
 
-	uobj = kmalloc(sizeof(*uobj), GFP_KERNEL);
-	if (!uobj) {
-		err = -ENOMEM;
+	uobj  = uobj_alloc(uobj_get_type(rwq_ind_table), file->ucontext);
+	if (IS_ERR(uobj)) {
+		err = PTR_ERR(uobj);
 		goto put_wqs;
 	}
 
-	init_uobj(uobj, 0, file->ucontext, &rwq_ind_table_lock_class);
-	down_write(&uobj->mutex);
 	init_attr.log_ind_tbl_size = cmd.log_ind_tbl_size;
 	init_attr.ind_tbl = wqs;
 	rwq_ind_tbl = ib_dev->create_rwq_ind_table(ib_dev, &init_attr, uhw);
@@ -3713,10 +3135,6 @@ int ib_uverbs_ex_create_rwq_ind_table(struct ib_uverbs_file *file,
 	for (i = 0; i < num_wq_handles; i++)
 		atomic_inc(&wqs[i]->usecnt);
 
-	err = idr_add_uobj(uobj);
-	if (err)
-		goto destroy_ind_tbl;
-
 	resp.ind_tbl_handle = uobj->id;
 	resp.ind_tbl_num = rwq_ind_tbl->ind_tbl_num;
 	resp.response_length = required_resp_len;
@@ -3729,26 +3147,18 @@ int ib_uverbs_ex_create_rwq_ind_table(struct ib_uverbs_file *file,
 	kfree(wqs_handles);
 
 	for (j = 0; j < num_read_wqs; j++)
-		put_wq_read(wqs[j]);
-
-	mutex_lock(&file->mutex);
-	list_add_tail(&uobj->list, &file->ucontext->rwq_ind_tbl_list);
-	mutex_unlock(&file->mutex);
+		uobj_put_obj_read(wqs[j]);
 
-	uobj->live = 1;
-
-	up_write(&uobj->mutex);
+	uobj_alloc_commit(uobj);
 	return 0;
 
 err_copy:
-	idr_remove_uobj(uobj);
-destroy_ind_tbl:
 	ib_destroy_rwq_ind_table(rwq_ind_tbl);
 err_uobj:
-	put_uobj_write(uobj);
+	uobj_alloc_abort(uobj);
 put_wqs:
 	for (j = 0; j < num_read_wqs; j++)
-		put_wq_read(wqs[j]);
+		uobj_put_obj_read(wqs[j]);
 err_free:
 	kfree(wqs_handles);
 	kfree(wqs);
@@ -3761,10 +3171,8 @@ int ib_uverbs_ex_destroy_rwq_ind_table(struct ib_uverbs_file *file,
 				       struct ib_udata *uhw)
 {
 	struct ib_uverbs_ex_destroy_rwq_ind_table	cmd = {};
-	struct ib_rwq_ind_table *rwq_ind_tbl;
 	struct ib_uobject		*uobj;
 	int			ret;
-	struct ib_wq	**ind_tbl;
 	size_t required_cmd_sz;
 
 	required_cmd_sz = offsetof(typeof(cmd), ind_tbl_handle) + sizeof(cmd.ind_tbl_handle);
@@ -3784,31 +3192,12 @@ int ib_uverbs_ex_destroy_rwq_ind_table(struct ib_uverbs_file *file,
 	if (cmd.comp_mask)
 		return -EOPNOTSUPP;
 
-	uobj = idr_write_uobj(cmd.ind_tbl_handle,
-			      file->ucontext);
-	if (!uobj)
-		return -EINVAL;
-	rwq_ind_tbl = uobj->object;
-	ind_tbl = rwq_ind_tbl->ind_tbl;
-
-	ret = ib_destroy_rwq_ind_table(rwq_ind_tbl);
-	if (!ret)
-		uobj->live = 0;
-
-	put_uobj_write(uobj);
-
-	if (ret)
-		return ret;
-
-	idr_remove_uobj(uobj);
-
-	mutex_lock(&file->mutex);
-	list_del(&uobj->list);
-	mutex_unlock(&file->mutex);
+	uobj  = uobj_get_write(uobj_get_type(rwq_ind_table), cmd.ind_tbl_handle,
+			       file->ucontext);
+	if (IS_ERR(uobj))
+		return PTR_ERR(uobj);
 
-	put_uobj(uobj);
-	kfree(ind_tbl);
-	return ret;
+	return uobj_remove_commit(uobj);
 }
 
 int ib_uverbs_ex_create_flow(struct ib_uverbs_file *file,
@@ -3882,15 +3271,13 @@ int ib_uverbs_ex_create_flow(struct ib_uverbs_file *file,
 		kern_flow_attr = &cmd.flow_attr;
 	}
 
-	uobj = kmalloc(sizeof(*uobj), GFP_KERNEL);
-	if (!uobj) {
-		err = -ENOMEM;
+	uobj  = uobj_alloc(uobj_get_type(flow), file->ucontext);
+	if (IS_ERR(uobj)) {
+		err = PTR_ERR(uobj);
 		goto err_free_attr;
 	}
-	init_uobj(uobj, 0, file->ucontext, &rule_lock_class);
-	down_write(&uobj->mutex);
 
-	qp = idr_read_qp(cmd.qp_handle, file->ucontext);
+	qp = uobj_get_obj_read(qp, cmd.qp_handle, file->ucontext);
 	if (!qp) {
 		err = -EINVAL;
 		goto err_uobj;
@@ -3931,24 +3318,14 @@ int ib_uverbs_ex_create_flow(struct ib_uverbs_file *file,
 		err = -EINVAL;
 		goto err_free;
 	}
-
-	err = ib_rdmacg_try_charge(&uobj->cg_obj, ib_dev,
-				   RDMACG_RESOURCE_HCA_OBJECT);
-	if (err)
-		goto err_free;
-
 	flow_id = ib_create_flow(qp, flow_attr, IB_FLOW_DOMAIN_USER);
 	if (IS_ERR(flow_id)) {
 		err = PTR_ERR(flow_id);
-		goto err_create;
+		goto err_free;
 	}
 	flow_id->uobject = uobj;
 	uobj->object = flow_id;
 
-	err = idr_add_uobj(uobj);
-	if (err)
-		goto destroy_flow;
-
 	memset(&resp, 0, sizeof(resp));
 	resp.flow_handle = uobj->id;
 
@@ -3957,30 +3334,20 @@ int ib_uverbs_ex_create_flow(struct ib_uverbs_file *file,
 	if (err)
 		goto err_copy;
 
-	put_qp_read(qp);
-	mutex_lock(&file->mutex);
-	list_add_tail(&uobj->list, &file->ucontext->rule_list);
-	mutex_unlock(&file->mutex);
-
-	uobj->live = 1;
-
-	up_write(&uobj->mutex);
+	uobj_put_obj_read(qp);
+	uobj_alloc_commit(uobj);
 	kfree(flow_attr);
 	if (cmd.flow_attr.num_of_specs)
 		kfree(kern_flow_attr);
 	return 0;
 err_copy:
-	idr_remove_uobj(uobj);
-destroy_flow:
 	ib_destroy_flow(flow_id);
-err_create:
-	ib_rdmacg_uncharge(&uobj->cg_obj, ib_dev, RDMACG_RESOURCE_HCA_OBJECT);
 err_free:
 	kfree(flow_attr);
 err_put:
-	put_qp_read(qp);
+	uobj_put_obj_read(qp);
 err_uobj:
-	put_uobj_write(uobj);
+	uobj_alloc_abort(uobj);
 err_free_attr:
 	if (cmd.flow_attr.num_of_specs)
 		kfree(kern_flow_attr);
@@ -3993,7 +3360,6 @@ int ib_uverbs_ex_destroy_flow(struct ib_uverbs_file *file,
 			      struct ib_udata *uhw)
 {
 	struct ib_uverbs_destroy_flow	cmd;
-	struct ib_flow			*flow_id;
 	struct ib_uobject		*uobj;
 	int				ret;
 
@@ -4007,28 +3373,12 @@ int ib_uverbs_ex_destroy_flow(struct ib_uverbs_file *file,
 	if (cmd.comp_mask)
 		return -EINVAL;
 
-	uobj = idr_write_uobj(cmd.flow_handle, file->ucontext);
-	if (!uobj)
-		return -EINVAL;
-	flow_id = uobj->object;
-
-	ret = ib_destroy_flow(flow_id);
-	if (!ret) {
-		ib_rdmacg_uncharge(&uobj->cg_obj, ib_dev,
-				   RDMACG_RESOURCE_HCA_OBJECT);
-		uobj->live = 0;
-	}
-
-	put_uobj_write(uobj);
-
-	idr_remove_uobj(uobj);
-
-	mutex_lock(&file->mutex);
-	list_del(&uobj->list);
-	mutex_unlock(&file->mutex);
-
-	put_uobj(uobj);
+	uobj  = uobj_get_write(uobj_get_type(flow), cmd.flow_handle,
+			       file->ucontext);
+	if (IS_ERR(uobj))
+		return PTR_ERR(uobj);
 
+	ret = uobj_remove_commit(uobj);
 	return ret;
 }
 
@@ -4045,31 +3395,37 @@ static int __uverbs_create_xsrq(struct ib_uverbs_file *file,
 	struct ib_srq_init_attr          attr;
 	int ret;
 
-	obj = kmalloc(sizeof *obj, GFP_KERNEL);
-	if (!obj)
-		return -ENOMEM;
-
-	init_uobj(&obj->uevent.uobject, cmd->user_handle, file->ucontext, &srq_lock_class);
-	down_write(&obj->uevent.uobject.mutex);
+	obj  = (struct ib_usrq_object *)uobj_alloc(uobj_get_type(srq),
+						   file->ucontext);
+	if (IS_ERR(obj))
+		return PTR_ERR(obj);
 
 	if (cmd->srq_type == IB_SRQT_XRC) {
-		attr.ext.xrc.xrcd  = idr_read_xrcd(cmd->xrcd_handle, file->ucontext, &xrcd_uobj);
-		if (!attr.ext.xrc.xrcd) {
+		xrcd_uobj = uobj_get_read(uobj_get_type(xrcd), cmd->xrcd_handle,
+					  file->ucontext);
+		if (IS_ERR(xrcd_uobj)) {
 			ret = -EINVAL;
 			goto err;
 		}
 
+		attr.ext.xrc.xrcd = (struct ib_xrcd *)xrcd_uobj->object;
+		if (!attr.ext.xrc.xrcd) {
+			ret = -EINVAL;
+			goto err_put_xrcd;
+		}
+
 		obj->uxrcd = container_of(xrcd_uobj, struct ib_uxrcd_object, uobject);
 		atomic_inc(&obj->uxrcd->refcnt);
 
-		attr.ext.xrc.cq  = idr_read_cq(cmd->cq_handle, file->ucontext, 0);
+		attr.ext.xrc.cq  = uobj_get_obj_read(cq, cmd->cq_handle,
+						     file->ucontext);
 		if (!attr.ext.xrc.cq) {
 			ret = -EINVAL;
 			goto err_put_xrcd;
 		}
 	}
 
-	pd  = idr_read_pd(cmd->pd_handle, file->ucontext);
+	pd  = uobj_get_obj_read(pd, cmd->pd_handle, file->ucontext);
 	if (!pd) {
 		ret = -EINVAL;
 		goto err_put_cq;
@@ -4085,11 +3441,6 @@ static int __uverbs_create_xsrq(struct ib_uverbs_file *file,
 	obj->uevent.events_reported = 0;
 	INIT_LIST_HEAD(&obj->uevent.event_list);
 
-	ret = ib_rdmacg_try_charge(&obj->uevent.uobject.cg_obj, ib_dev,
-				   RDMACG_RESOURCE_HCA_OBJECT);
-	if (ret)
-		goto err_put_cq;
-
 	srq = pd->device->create_srq(pd, &attr, udata);
 	if (IS_ERR(srq)) {
 		ret = PTR_ERR(srq);
@@ -4114,9 +3465,7 @@ static int __uverbs_create_xsrq(struct ib_uverbs_file *file,
 	atomic_set(&srq->usecnt, 0);
 
 	obj->uevent.uobject.object = srq;
-	ret = idr_add_uobj(&obj->uevent.uobject);
-	if (ret)
-		goto err_destroy;
+	obj->uevent.uobject.user_handle = cmd->user_handle;
 
 	memset(&resp, 0, sizeof resp);
 	resp.srq_handle = obj->uevent.uobject.id;
@@ -4132,44 +3481,32 @@ static int __uverbs_create_xsrq(struct ib_uverbs_file *file,
 	}
 
 	if (cmd->srq_type == IB_SRQT_XRC) {
-		put_uobj_read(xrcd_uobj);
-		put_cq_read(attr.ext.xrc.cq);
+		uobj_put_read(xrcd_uobj);
+		uobj_put_obj_read(attr.ext.xrc.cq);
 	}
-	put_pd_read(pd);
-
-	mutex_lock(&file->mutex);
-	list_add_tail(&obj->uevent.uobject.list, &file->ucontext->srq_list);
-	mutex_unlock(&file->mutex);
-
-	obj->uevent.uobject.live = 1;
-
-	up_write(&obj->uevent.uobject.mutex);
+	uobj_put_obj_read(pd);
+	uobj_alloc_commit(&obj->uevent.uobject);
 
 	return 0;
 
 err_copy:
-	idr_remove_uobj(&obj->uevent.uobject);
-
-err_destroy:
 	ib_destroy_srq(srq);
 
 err_put:
-	ib_rdmacg_uncharge(&obj->uevent.uobject.cg_obj, ib_dev,
-			   RDMACG_RESOURCE_HCA_OBJECT);
-	put_pd_read(pd);
+	uobj_put_obj_read(pd);
 
 err_put_cq:
 	if (cmd->srq_type == IB_SRQT_XRC)
-		put_cq_read(attr.ext.xrc.cq);
+		uobj_put_obj_read(attr.ext.xrc.cq);
 
 err_put_xrcd:
 	if (cmd->srq_type == IB_SRQT_XRC) {
 		atomic_dec(&obj->uxrcd->refcnt);
-		put_uobj_read(xrcd_uobj);
+		uobj_put_read(xrcd_uobj);
 	}
 
 err:
-	put_uobj_write(&obj->uevent.uobject);
+	uobj_alloc_abort(&obj->uevent.uobject);
 	return ret;
 }
 
@@ -4254,7 +3591,7 @@ ssize_t ib_uverbs_modify_srq(struct ib_uverbs_file *file,
 	INIT_UDATA(&udata, buf + sizeof cmd, NULL, in_len - sizeof cmd,
 		   out_len);
 
-	srq = idr_read_srq(cmd.srq_handle, file->ucontext);
+	srq = uobj_get_obj_read(srq, cmd.srq_handle, file->ucontext);
 	if (!srq)
 		return -EINVAL;
 
@@ -4263,7 +3600,7 @@ ssize_t ib_uverbs_modify_srq(struct ib_uverbs_file *file,
 
 	ret = srq->device->modify_srq(srq, &attr, cmd.attr_mask, &udata);
 
-	put_srq_read(srq);
+	uobj_put_obj_read(srq);
 
 	return ret ? ret : in_len;
 }
@@ -4285,13 +3622,13 @@ ssize_t ib_uverbs_query_srq(struct ib_uverbs_file *file,
 	if (copy_from_user(&cmd, buf, sizeof cmd))
 		return -EFAULT;
 
-	srq = idr_read_srq(cmd.srq_handle, file->ucontext);
+	srq = uobj_get_obj_read(srq, cmd.srq_handle, file->ucontext);
 	if (!srq)
 		return -EINVAL;
 
 	ret = ib_query_srq(srq, &attr);
 
-	put_srq_read(srq);
+	uobj_put_obj_read(srq);
 
 	if (ret)
 		return ret;
@@ -4320,53 +3657,39 @@ ssize_t ib_uverbs_destroy_srq(struct ib_uverbs_file *file,
 	struct ib_srq               	 *srq;
 	struct ib_uevent_object        	 *obj;
 	int                         	  ret = -EINVAL;
-	struct ib_usrq_object		 *us;
 	enum ib_srq_type		  srq_type;
 
 	if (copy_from_user(&cmd, buf, sizeof cmd))
 		return -EFAULT;
 
-	uobj = idr_write_uobj(cmd.srq_handle, file->ucontext);
-	if (!uobj)
-		return -EINVAL;
+	uobj  = uobj_get_write(uobj_get_type(srq), cmd.srq_handle,
+			       file->ucontext);
+	if (IS_ERR(uobj))
+		return PTR_ERR(uobj);
+
 	srq = uobj->object;
 	obj = container_of(uobj, struct ib_uevent_object, uobject);
 	srq_type = srq->srq_type;
+	/*
+	 * Make sure we don't free the memory in remove_commit as we still
+	 * need the uobject memory to create the response.
+	 */
+	uverbs_uobject_get(uobj);
 
-	ret = ib_destroy_srq(srq);
-	if (!ret)
-		uobj->live = 0;
-
-	put_uobj_write(uobj);
+	memset(&resp, 0, sizeof(resp));
 
-	if (ret)
+	ret = uobj_remove_commit(uobj);
+	if (ret) {
+		uverbs_uobject_put(uobj);
 		return ret;
-
-	ib_rdmacg_uncharge(&uobj->cg_obj, ib_dev, RDMACG_RESOURCE_HCA_OBJECT);
-
-	if (srq_type == IB_SRQT_XRC) {
-		us = container_of(obj, struct ib_usrq_object, uevent);
-		atomic_dec(&us->uxrcd->refcnt);
 	}
-
-	idr_remove_uobj(uobj);
-
-	mutex_lock(&file->mutex);
-	list_del(&uobj->list);
-	mutex_unlock(&file->mutex);
-
-	ib_uverbs_release_uevent(file, obj);
-
-	memset(&resp, 0, sizeof resp);
 	resp.events_reported = obj->events_reported;
+	uverbs_uobject_put(uobj);
+	if (copy_to_user((void __user *)(unsigned long)cmd.response,
+			 &resp, sizeof(resp)))
+		return -EFAULT;
 
-	put_uobj(uobj);
-
-	if (copy_to_user((void __user *) (unsigned long) cmd.response,
-			 &resp, sizeof resp))
-		ret = -EFAULT;
-
-	return ret ? ret : in_len;
+	return in_len;
 }
 
 int ib_uverbs_ex_query_device(struct ib_uverbs_file *file,
diff --git a/drivers/infiniband/core/uverbs_main.c b/drivers/infiniband/core/uverbs_main.c
index e1db678..7ccb525 100644
--- a/drivers/infiniband/core/uverbs_main.c
+++ b/drivers/infiniband/core/uverbs_main.c
@@ -52,6 +52,7 @@
 
 #include "uverbs.h"
 #include "core_priv.h"
+#include "rdma_core.h"
 
 MODULE_AUTHOR("Roland Dreier");
 MODULE_DESCRIPTION("InfiniBand userspace verbs access");
@@ -214,140 +215,11 @@ void ib_uverbs_detach_umcast(struct ib_qp *qp,
 }
 
 static int ib_uverbs_cleanup_ucontext(struct ib_uverbs_file *file,
-				      struct ib_ucontext *context)
+				      struct ib_ucontext *context,
+				      bool device_removed)
 {
-	struct ib_uobject *uobj, *tmp;
-
 	context->closing = 1;
-
-	list_for_each_entry_safe(uobj, tmp, &context->ah_list, list) {
-		struct ib_ah *ah = uobj->object;
-
-		idr_remove_uobj(uobj);
-		ib_destroy_ah(ah);
-		ib_rdmacg_uncharge(&uobj->cg_obj, context->device,
-				   RDMACG_RESOURCE_HCA_OBJECT);
-		kfree(uobj);
-	}
-
-	/* Remove MWs before QPs, in order to support type 2A MWs. */
-	list_for_each_entry_safe(uobj, tmp, &context->mw_list, list) {
-		struct ib_mw *mw = uobj->object;
-
-		idr_remove_uobj(uobj);
-		uverbs_dealloc_mw(mw);
-		ib_rdmacg_uncharge(&uobj->cg_obj, context->device,
-				   RDMACG_RESOURCE_HCA_OBJECT);
-		kfree(uobj);
-	}
-
-	list_for_each_entry_safe(uobj, tmp, &context->rule_list, list) {
-		struct ib_flow *flow_id = uobj->object;
-
-		idr_remove_uobj(uobj);
-		ib_destroy_flow(flow_id);
-		ib_rdmacg_uncharge(&uobj->cg_obj, context->device,
-				   RDMACG_RESOURCE_HCA_OBJECT);
-		kfree(uobj);
-	}
-
-	list_for_each_entry_safe(uobj, tmp, &context->qp_list, list) {
-		struct ib_qp *qp = uobj->object;
-		struct ib_uqp_object *uqp =
-			container_of(uobj, struct ib_uqp_object, uevent.uobject);
-
-		idr_remove_uobj(uobj);
-		if (qp == qp->real_qp)
-			ib_uverbs_detach_umcast(qp, uqp);
-		ib_destroy_qp(qp);
-		ib_rdmacg_uncharge(&uobj->cg_obj, context->device,
-				   RDMACG_RESOURCE_HCA_OBJECT);
-		ib_uverbs_release_uevent(file, &uqp->uevent);
-		kfree(uqp);
-	}
-
-	list_for_each_entry_safe(uobj, tmp, &context->rwq_ind_tbl_list, list) {
-		struct ib_rwq_ind_table *rwq_ind_tbl = uobj->object;
-		struct ib_wq **ind_tbl = rwq_ind_tbl->ind_tbl;
-
-		idr_remove_uobj(uobj);
-		ib_destroy_rwq_ind_table(rwq_ind_tbl);
-		kfree(ind_tbl);
-		kfree(uobj);
-	}
-
-	list_for_each_entry_safe(uobj, tmp, &context->wq_list, list) {
-		struct ib_wq *wq = uobj->object;
-		struct ib_uwq_object *uwq =
-			container_of(uobj, struct ib_uwq_object, uevent.uobject);
-
-		idr_remove_uobj(uobj);
-		ib_destroy_wq(wq);
-		ib_uverbs_release_uevent(file, &uwq->uevent);
-		kfree(uwq);
-	}
-
-	list_for_each_entry_safe(uobj, tmp, &context->srq_list, list) {
-		struct ib_srq *srq = uobj->object;
-		struct ib_uevent_object *uevent =
-			container_of(uobj, struct ib_uevent_object, uobject);
-
-		idr_remove_uobj(uobj);
-		ib_destroy_srq(srq);
-		ib_rdmacg_uncharge(&uobj->cg_obj, context->device,
-				   RDMACG_RESOURCE_HCA_OBJECT);
-		ib_uverbs_release_uevent(file, uevent);
-		kfree(uevent);
-	}
-
-	list_for_each_entry_safe(uobj, tmp, &context->cq_list, list) {
-		struct ib_cq *cq = uobj->object;
-		struct ib_uverbs_event_file *ev_file = cq->cq_context;
-		struct ib_ucq_object *ucq =
-			container_of(uobj, struct ib_ucq_object, uobject);
-
-		idr_remove_uobj(uobj);
-		ib_destroy_cq(cq);
-		ib_rdmacg_uncharge(&uobj->cg_obj, context->device,
-				   RDMACG_RESOURCE_HCA_OBJECT);
-		ib_uverbs_release_ucq(file, ev_file, ucq);
-		kfree(ucq);
-	}
-
-	list_for_each_entry_safe(uobj, tmp, &context->mr_list, list) {
-		struct ib_mr *mr = uobj->object;
-
-		idr_remove_uobj(uobj);
-		ib_dereg_mr(mr);
-		ib_rdmacg_uncharge(&uobj->cg_obj, context->device,
-				   RDMACG_RESOURCE_HCA_OBJECT);
-		kfree(uobj);
-	}
-
-	mutex_lock(&file->device->xrcd_tree_mutex);
-	list_for_each_entry_safe(uobj, tmp, &context->xrcd_list, list) {
-		struct ib_xrcd *xrcd = uobj->object;
-		struct ib_uxrcd_object *uxrcd =
-			container_of(uobj, struct ib_uxrcd_object, uobject);
-
-		idr_remove_uobj(uobj);
-		ib_uverbs_dealloc_xrcd(file->device, xrcd,
-				       file->ucontext ? RDMA_REMOVE_CLOSE :
-				       RDMA_REMOVE_DRIVER_REMOVE);
-		kfree(uxrcd);
-	}
-	mutex_unlock(&file->device->xrcd_tree_mutex);
-
-	list_for_each_entry_safe(uobj, tmp, &context->pd_list, list) {
-		struct ib_pd *pd = uobj->object;
-
-		idr_remove_uobj(uobj);
-		ib_dealloc_pd(pd);
-		ib_rdmacg_uncharge(&uobj->cg_obj, context->device,
-				   RDMACG_RESOURCE_HCA_OBJECT);
-		kfree(uobj);
-	}
-
+	uverbs_cleanup_ucontext(context, device_removed);
 	put_pid(context->tgid);
 
 	ib_rdmacg_uncharge(&context->cg_obj, context->device,
@@ -592,7 +464,7 @@ void ib_uverbs_qp_event_handler(struct ib_event *event, void *context_ptr)
 	struct ib_uevent_object *uobj;
 
 	/* for XRC target qp's, check that qp is live */
-	if (!event->element.qp->uobject || !event->element.qp->uobject->live)
+	if (!event->element.qp->uobject)
 		return;
 
 	uobj = container_of(event->element.qp->uobject,
@@ -1010,7 +882,7 @@ static int ib_uverbs_close(struct inode *inode, struct file *filp)
 
 	mutex_lock(&file->cleanup_mutex);
 	if (file->ucontext) {
-		ib_uverbs_cleanup_ucontext(file, file->ucontext);
+		ib_uverbs_cleanup_ucontext(file, file->ucontext, false);
 		file->ucontext = NULL;
 	}
 	mutex_unlock(&file->cleanup_mutex);
@@ -1260,7 +1132,7 @@ static void ib_uverbs_free_hw_resources(struct ib_uverbs_device *uverbs_dev,
 			 * (e.g mmput).
 			 */
 			ib_dev->disassociate_ucontext(ucontext);
-			ib_uverbs_cleanup_ucontext(file, ucontext);
+			ib_uverbs_cleanup_ucontext(file, ucontext, true);
 		}
 
 		mutex_lock(&uverbs_dev->lists_mutex);
diff --git a/include/rdma/ib_verbs.h b/include/rdma/ib_verbs.h
index d3efd22..2e8f661 100644
--- a/include/rdma/ib_verbs.h
+++ b/include/rdma/ib_verbs.h
@@ -1377,17 +1377,6 @@ struct ib_rdmacg_object {
 struct ib_ucontext {
 	struct ib_device       *device;
 	struct ib_uverbs_file  *ufile;
-	struct list_head	pd_list;
-	struct list_head	mr_list;
-	struct list_head	mw_list;
-	struct list_head	cq_list;
-	struct list_head	qp_list;
-	struct list_head	srq_list;
-	struct list_head	ah_list;
-	struct list_head	xrcd_list;
-	struct list_head	rule_list;
-	struct list_head	wq_list;
-	struct list_head	rwq_ind_tbl_list;
 	int			closing;
 
 	/* locking the uobjects_list */
@@ -1426,10 +1415,8 @@ struct ib_uobject {
 	struct ib_rdmacg_object	cg_obj;		/* rdmacg object */
 	int			id;		/* index into kernel idr */
 	struct kref		ref;
-	struct rw_semaphore	mutex;		/* protects .live */
 	atomic_t		usecnt;		/* protects exclusive access */
 	struct rcu_head		rcu;		/* kfree_rcu() overhead */
-	int			live;
 
 	const struct uverbs_obj_type *type;
 };
diff --git a/include/rdma/uverbs_std_types.h b/include/rdma/uverbs_std_types.h
index 2edb776..8885664 100644
--- a/include/rdma/uverbs_std_types.h
+++ b/include/rdma/uverbs_std_types.h
@@ -46,5 +46,68 @@
 extern const struct uverbs_obj_idr_type uverbs_type_attrs_mw;
 extern const struct uverbs_obj_idr_type uverbs_type_attrs_pd;
 extern const struct uverbs_obj_idr_type uverbs_type_attrs_xrcd;
+
+static inline struct ib_uobject *__uobj_get(const struct uverbs_obj_type *type,
+					    bool write,
+					    struct ib_ucontext *ucontext,
+					    int id)
+{
+	return rdma_lookup_get_uobject(type, ucontext, id, write);
+}
+
+#define uobj_get_type(_type) uverbs_type_attrs_##_type.type
+
+#define uobj_get_read(_type, _id, _ucontext)				\
+	 __uobj_get(&(_type), false, _ucontext, _id)
+
+#define uobj_get_obj_read(_type, _id, _ucontext)			\
+({									\
+	struct ib_uobject *uobj =					\
+		__uobj_get(&uobj_get_type(_type),			\
+			   false, _ucontext, _id);			\
+									\
+	(struct ib_##_type *)(IS_ERR(uobj) ? NULL : uobj->object);	\
+})
+
+#define uobj_get_write(_type, _id, _ucontext)				\
+	 __uobj_get(&(_type), true, _ucontext, _id)
+
+static inline void uobj_put_read(struct ib_uobject *uobj)
+{
+	rdma_lookup_put_uobject(uobj, false);
+}
+
+#define uobj_put_obj_read(_obj)					\
+	uobj_put_read((_obj)->uobject)
+
+static inline void uobj_put_write(struct ib_uobject *uobj)
+{
+	rdma_lookup_put_uobject(uobj, true);
+}
+
+static inline int __must_check uobj_remove_commit(struct ib_uobject *uobj)
+{
+	return rdma_remove_commit_uobject(uobj);
+}
+
+static inline void uobj_alloc_commit(struct ib_uobject *uobj)
+{
+	rdma_alloc_commit_uobject(uobj);
+}
+
+static inline void uobj_alloc_abort(struct ib_uobject *uobj)
+{
+	rdma_alloc_abort_uobject(uobj);
+}
+
+static inline struct ib_uobject *__uobj_alloc(const struct uverbs_obj_type *type,
+					      struct ib_ucontext *ucontext)
+{
+	return rdma_alloc_begin_uobject(type, ucontext);
+}
+
+#define uobj_alloc(_type, ucontext)	\
+	__uobj_alloc(&(_type), ucontext)
+
 #endif
 
-- 
1.8.3.1

--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [PATCH V3 for-next 5/7] IB/core: Add lock to multicast handlers
       [not found] ` <1491301907-32290-1-git-send-email-matanb-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
                     ` (3 preceding siblings ...)
  2017-04-04 10:31   ` [PATCH V3 for-next 4/7] IB/core: Change idr objects to use the new schema Matan Barak
@ 2017-04-04 10:31   ` Matan Barak
  2017-04-04 10:31   ` [PATCH V3 for-next 6/7] IB/core: Add support for fd objects Matan Barak
  2017-04-04 10:31   ` [PATCH V3 for-next 7/7] IB/core: Change completion channel to use the reworked objects schema Matan Barak
  6 siblings, 0 replies; 25+ messages in thread
From: Matan Barak @ 2017-04-04 10:31 UTC (permalink / raw)
  To: Doug Ledford
  Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA, Liran Liss, Jason Gunthorpe,
	Sean Hefty, Leon Romanovsky, Majd Dibbiny, Tal Alon,
	Yishai Hadas, Ira Weiny, Haggai Eran, Christoph Lameter,
	Matan Barak

When two handlers used the same object in the old schema, we blocked
the process in the kernel. The new schema just returns -EBUSY. This
could lead to different behaviour in applications between the old
and the new schema. In most cases, using such handlers concurrently
could crash the process. For example, if thread A destroys a QP
while thread B modifies it, the destruction could complete before
the modification, which would then access freed memory and crash
the process. However, concurrently attaching a multicast address
to and detaching it from a QP is safe. Therefore, we preserve the
original behaviour by adding a lock there.

Signed-off-by: Matan Barak <matanb-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
Reviewed-by: Yishai Hadas <yishaih-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
---
 drivers/infiniband/core/uverbs.h     | 2 ++
 drivers/infiniband/core/uverbs_cmd.c | 5 +++++
 2 files changed, 7 insertions(+)

diff --git a/drivers/infiniband/core/uverbs.h b/drivers/infiniband/core/uverbs.h
index 3660278..27c8b98 100644
--- a/drivers/infiniband/core/uverbs.h
+++ b/drivers/infiniband/core/uverbs.h
@@ -163,6 +163,8 @@ struct ib_usrq_object {
 
 struct ib_uqp_object {
 	struct ib_uevent_object	uevent;
+	/* lock for mcast list */
+	struct mutex		mcast_lock;
 	struct list_head 	mcast_list;
 	struct ib_uxrcd_object *uxrcd;
 };
diff --git a/drivers/infiniband/core/uverbs_cmd.c b/drivers/infiniband/core/uverbs_cmd.c
index 2f258aa..119c10d 100644
--- a/drivers/infiniband/core/uverbs_cmd.c
+++ b/drivers/infiniband/core/uverbs_cmd.c
@@ -1352,6 +1352,7 @@ static int create_qp(struct ib_uverbs_file *file,
 		return PTR_ERR(obj);
 	obj->uxrcd = NULL;
 	obj->uevent.uobject.user_handle = cmd->user_handle;
+	mutex_init(&obj->mcast_lock);
 
 	if (cmd_sz >= offsetof(typeof(*cmd), rwq_ind_tbl_handle) +
 		      sizeof(cmd->rwq_ind_tbl_handle) &&
@@ -2589,6 +2590,7 @@ ssize_t ib_uverbs_attach_mcast(struct ib_uverbs_file *file,
 
 	obj = container_of(qp->uobject, struct ib_uqp_object, uevent.uobject);
 
+	mutex_lock(&obj->mcast_lock);
 	list_for_each_entry(mcast, &obj->mcast_list, list)
 		if (cmd.mlid == mcast->lid &&
 		    !memcmp(cmd.gid, mcast->gid.raw, sizeof mcast->gid.raw)) {
@@ -2612,6 +2614,7 @@ ssize_t ib_uverbs_attach_mcast(struct ib_uverbs_file *file,
 		kfree(mcast);
 
 out_put:
+	mutex_unlock(&obj->mcast_lock);
 	uobj_put_obj_read(qp);
 
 	return ret ? ret : in_len;
@@ -2636,6 +2639,7 @@ ssize_t ib_uverbs_detach_mcast(struct ib_uverbs_file *file,
 		return -EINVAL;
 
 	obj = container_of(qp->uobject, struct ib_uqp_object, uevent.uobject);
+	mutex_lock(&obj->mcast_lock);
 
 	ret = ib_detach_mcast(qp, (union ib_gid *) cmd.gid, cmd.mlid);
 	if (ret)
@@ -2650,6 +2654,7 @@ ssize_t ib_uverbs_detach_mcast(struct ib_uverbs_file *file,
 		}
 
 out_put:
+	mutex_unlock(&obj->mcast_lock);
 	uobj_put_obj_read(qp);
 	return ret ? ret : in_len;
 }
-- 
1.8.3.1


* [PATCH V3 for-next 6/7] IB/core: Add support for fd objects
       [not found] ` <1491301907-32290-1-git-send-email-matanb-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
                     ` (4 preceding siblings ...)
  2017-04-04 10:31   ` [PATCH V3 for-next 5/7] IB/core: Add lock to multicast handlers Matan Barak
@ 2017-04-04 10:31   ` Matan Barak
  2017-04-04 10:31   ` [PATCH V3 for-next 7/7] IB/core: Change completion channel to use the reworked objects schema Matan Barak
  6 siblings, 0 replies; 25+ messages in thread
From: Matan Barak @ 2017-04-04 10:31 UTC (permalink / raw)
  To: Doug Ledford
  Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA, Liran Liss, Jason Gunthorpe,
	Sean Hefty, Leon Romanovsky, Majd Dibbiny, Tal Alon,
	Yishai Hadas, Ira Weiny, Haggai Eran, Christoph Lameter,
	Matan Barak

The completion channel we use in the verbs infrastructure is FD based.
Previously, we had a separate way to manage this object. Since we
strive for a single way to manage any kind of object in this
infrastructure, we conceptually treat all objects as subclasses
of ib_uobject.

This commit adds the necessary mechanism to support FD based objects
like their IDR counterparts. The release of FD based objects needs to
be synchronized with context release. We use the cleanup_mutex on the
uverbs_file for that.

Signed-off-by: Matan Barak <matanb-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
Reviewed-by: Yishai Hadas <yishaih-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
---
 drivers/infiniband/core/rdma_core.c   | 177 +++++++++++++++++++++++++++++++++-
 drivers/infiniband/core/rdma_core.h   |   8 ++
 drivers/infiniband/core/uverbs.h      |   1 +
 drivers/infiniband/core/uverbs_main.c |   4 +-
 include/rdma/ib_verbs.h               |   6 ++
 include/rdma/uverbs_types.h           |  16 +++
 6 files changed, 210 insertions(+), 2 deletions(-)

diff --git a/drivers/infiniband/core/rdma_core.c b/drivers/infiniband/core/rdma_core.c
index 1cbc053..e5bdf7f 100644
--- a/drivers/infiniband/core/rdma_core.c
+++ b/drivers/infiniband/core/rdma_core.c
@@ -153,6 +153,37 @@ static struct ib_uobject *lookup_get_idr_uobject(const struct uverbs_obj_type *t
 	return uobj;
 }
 
+static struct ib_uobject *lookup_get_fd_uobject(const struct uverbs_obj_type *type,
+						struct ib_ucontext *ucontext,
+						int id, bool write)
+{
+	struct file *f;
+	struct ib_uobject *uobject;
+	const struct uverbs_obj_fd_type *fd_type =
+		container_of(type, struct uverbs_obj_fd_type, type);
+
+	if (write)
+		return ERR_PTR(-EOPNOTSUPP);
+
+	f = fget(id);
+	if (!f)
+		return ERR_PTR(-EBADF);
+
+	uobject = f->private_data;
+	/*
+	 * fget(id) ensures we are not currently running uverbs_close_fd,
+	 * and the caller is expected to ensure that uverbs_close_fd is never
+	 * done while a call to lookup is possible.
+	 */
+	if (f->f_op != fd_type->fops) {
+		fput(f);
+		return ERR_PTR(-EBADF);
+	}
+
+	uverbs_uobject_get(uobject);
+	return uobject;
+}
+
 struct ib_uobject *rdma_lookup_get_uobject(const struct uverbs_obj_type *type,
 					   struct ib_ucontext *ucontext,
 					   int id, bool write)
@@ -211,6 +242,46 @@ static struct ib_uobject *alloc_begin_idr_uobject(const struct uverbs_obj_type *
 	return ERR_PTR(ret);
 }
 
+static struct ib_uobject *alloc_begin_fd_uobject(const struct uverbs_obj_type *type,
+						 struct ib_ucontext *ucontext)
+{
+	const struct uverbs_obj_fd_type *fd_type =
+		container_of(type, struct uverbs_obj_fd_type, type);
+	int new_fd;
+	struct ib_uobject *uobj;
+	struct ib_uobject_file *uobj_file;
+	struct file *filp;
+
+	new_fd = get_unused_fd_flags(O_CLOEXEC);
+	if (new_fd < 0)
+		return ERR_PTR(new_fd);
+
+	uobj = alloc_uobj(ucontext, type);
+	if (IS_ERR(uobj)) {
+		put_unused_fd(new_fd);
+		return uobj;
+	}
+
+	uobj_file = container_of(uobj, struct ib_uobject_file, uobj);
+	filp = anon_inode_getfile(fd_type->name,
+				  fd_type->fops,
+				  uobj_file,
+				  fd_type->flags);
+	if (IS_ERR(filp)) {
+		put_unused_fd(new_fd);
+		uverbs_uobject_put(uobj);
+		return (void *)filp;
+	}
+
+	uobj_file->uobj.id = new_fd;
+	uobj_file->uobj.object = filp;
+	uobj_file->ufile = ucontext->ufile;
+	INIT_LIST_HEAD(&uobj->list);
+	kref_get(&uobj_file->ufile->ref);
+
+	return uobj;
+}
+
 struct ib_uobject *rdma_alloc_begin_uobject(const struct uverbs_obj_type *type,
 					    struct ib_ucontext *ucontext)
 {
@@ -246,6 +317,39 @@ static int __must_check remove_commit_idr_uobject(struct ib_uobject *uobj,
 	return ret;
 }
 
+static void alloc_abort_fd_uobject(struct ib_uobject *uobj)
+{
+	struct ib_uobject_file *uobj_file =
+		container_of(uobj, struct ib_uobject_file, uobj);
+	struct file *filp = uobj->object;
+	int id = uobj_file->uobj.id;
+
+	/* Unsuccessful NEW */
+	fput(filp);
+	put_unused_fd(id);
+}
+
+static int __must_check remove_commit_fd_uobject(struct ib_uobject *uobj,
+						 enum rdma_remove_reason why)
+{
+	const struct uverbs_obj_fd_type *fd_type =
+		container_of(uobj->type, struct uverbs_obj_fd_type, type);
+	struct ib_uobject_file *uobj_file =
+		container_of(uobj, struct ib_uobject_file, uobj);
+	int ret = fd_type->context_closed(uobj_file, why);
+
+	if (why == RDMA_REMOVE_DESTROY && ret)
+		return ret;
+
+	if (why == RDMA_REMOVE_DURING_CLEANUP) {
+		alloc_abort_fd_uobject(uobj);
+		return ret;
+	}
+
+	uobj_file->uobj.context = NULL;
+	return ret;
+}
+
 static void lockdep_check(struct ib_uobject *uobj, bool write)
 {
 #ifdef CONFIG_LOCKDEP
@@ -314,6 +418,19 @@ static void alloc_commit_idr_uobject(struct ib_uobject *uobj)
 	spin_unlock(&uobj->context->ufile->idr_lock);
 }
 
+static void alloc_commit_fd_uobject(struct ib_uobject *uobj)
+{
+	struct ib_uobject_file *uobj_file =
+		container_of(uobj, struct ib_uobject_file, uobj);
+
+	uverbs_uobject_add(&uobj_file->uobj);
+	fd_install(uobj_file->uobj.id, uobj->object);
+	/* This shouldn't be used anymore. Use the file object instead */
+	uobj_file->uobj.id = 0;
+	/* Get another reference as we export this to the fops */
+	uverbs_uobject_get(&uobj_file->uobj);
+}
+
 int rdma_alloc_commit_uobject(struct ib_uobject *uobj)
 {
 	/* Cleanup is running. Calling this should have been impossible */
@@ -352,6 +469,15 @@ static void lookup_put_idr_uobject(struct ib_uobject *uobj, bool write)
 {
 }
 
+static void lookup_put_fd_uobject(struct ib_uobject *uobj, bool write)
+{
+	struct file *filp = uobj->object;
+
+	WARN_ON(write);
+	/* This indirectly calls uverbs_close_fd and frees the object */
+	fput(filp);
+}
+
 void rdma_lookup_put_uobject(struct ib_uobject *uobj, bool write)
 {
 	lockdep_check(uobj, write);
@@ -392,6 +518,39 @@ void rdma_lookup_put_uobject(struct ib_uobject *uobj, bool write)
 	.needs_kfree_rcu = true,
 };
 
+static void _uverbs_close_fd(struct ib_uobject_file *uobj_file)
+{
+	struct ib_ucontext *ucontext;
+	struct ib_uverbs_file *ufile = uobj_file->ufile;
+	int ret;
+
+	mutex_lock(&uobj_file->ufile->cleanup_mutex);
+
+	/* uobject was either already cleaned up or is being cleaned up right now */
+	if (!uobj_file->uobj.context ||
+	    !down_read_trylock(&uobj_file->uobj.context->cleanup_rwsem))
+		goto unlock;
+
+	ucontext = uobj_file->uobj.context;
+	ret = _rdma_remove_commit_uobject(&uobj_file->uobj, RDMA_REMOVE_CLOSE,
+					  true);
+	up_read(&ucontext->cleanup_rwsem);
+	if (ret)
+		pr_warn("uverbs: unable to clean up uobject file in uverbs_close_fd.\n");
+unlock:
+	mutex_unlock(&ufile->cleanup_mutex);
+}
+
+void uverbs_close_fd(struct file *f)
+{
+	struct ib_uobject_file *uobj_file = f->private_data;
+	struct kref *uverbs_file_ref = &uobj_file->ufile->ref;
+
+	_uverbs_close_fd(uobj_file);
+	uverbs_uobject_put(&uobj_file->uobj);
+	kref_put(uverbs_file_ref, ib_uverbs_release_file);
+}
+
 void uverbs_cleanup_ucontext(struct ib_ucontext *ucontext, bool device_removed)
 {
 	enum rdma_remove_reason reason = device_removed ?
@@ -412,7 +571,13 @@ void uverbs_cleanup_ucontext(struct ib_ucontext *ucontext, bool device_removed)
 
 		/*
 		 * This shouldn't run while executing other commands on this
-		 * context.
+		 * context. Thus, the only case we need to handle is an FD
+		 * being closed while we traverse this list: the FD could be
+		 * closed and released via its _release fop.
+		 * We guard against this with a lock.
+		 * We take and release the lock once per traversal order so
+		 * that other threads (which might still use the FDs) get a
+		 * chance to run.
 		 */
 		mutex_lock(&ucontext->uobjects_lock);
 		list_for_each_entry_safe(obj, next_obj, &ucontext->uobjects,
@@ -448,3 +613,13 @@ void uverbs_initialize_ucontext(struct ib_ucontext *ucontext)
 	init_rwsem(&ucontext->cleanup_rwsem);
 }
 
+const struct uverbs_obj_type_class uverbs_fd_class = {
+	.alloc_begin = alloc_begin_fd_uobject,
+	.lookup_get = lookup_get_fd_uobject,
+	.alloc_commit = alloc_commit_fd_uobject,
+	.alloc_abort = alloc_abort_fd_uobject,
+	.lookup_put = lookup_put_fd_uobject,
+	.remove_commit = remove_commit_fd_uobject,
+	.needs_kfree_rcu = false,
+};
+
diff --git a/drivers/infiniband/core/rdma_core.h b/drivers/infiniband/core/rdma_core.h
index 0247bb5..1b82e7f 100644
--- a/drivers/infiniband/core/rdma_core.h
+++ b/drivers/infiniband/core/rdma_core.h
@@ -67,4 +67,12 @@
  */
 void uverbs_uobject_put(struct ib_uobject *uobject);
 
+/* Indicates that this fd is no longer used by this consumer, but its memory
+ * isn't necessarily released yet. The memory is released once the last
+ * reference is put. After this call returns, calling uverbs_uobject_get
+ * isn't allowed.
+ * This must be called from the release file_operations of the file!
+ */
+void uverbs_close_fd(struct file *f);
+
 #endif /* RDMA_CORE_H */
diff --git a/drivers/infiniband/core/uverbs.h b/drivers/infiniband/core/uverbs.h
index 27c8b98..5f8a7f2 100644
--- a/drivers/infiniband/core/uverbs.h
+++ b/drivers/infiniband/core/uverbs.h
@@ -193,6 +193,7 @@ void ib_uverbs_release_ucq(struct ib_uverbs_file *file,
 			   struct ib_ucq_object *uobj);
 void ib_uverbs_release_uevent(struct ib_uverbs_file *file,
 			      struct ib_uevent_object *uobj);
+void ib_uverbs_release_file(struct kref *ref);
 
 void ib_uverbs_comp_handler(struct ib_cq *cq, void *cq_context);
 void ib_uverbs_cq_event_handler(struct ib_event *event, void *context_ptr);
diff --git a/drivers/infiniband/core/uverbs_main.c b/drivers/infiniband/core/uverbs_main.c
index 7ccb525..8ee1d08 100644
--- a/drivers/infiniband/core/uverbs_main.c
+++ b/drivers/infiniband/core/uverbs_main.c
@@ -233,7 +233,7 @@ static void ib_uverbs_comp_dev(struct ib_uverbs_device *dev)
 	complete(&dev->comp);
 }
 
-static void ib_uverbs_release_file(struct kref *ref)
+void ib_uverbs_release_file(struct kref *ref)
 {
 	struct ib_uverbs_file *file =
 		container_of(ref, struct ib_uverbs_file, ref);
@@ -1132,7 +1132,9 @@ static void ib_uverbs_free_hw_resources(struct ib_uverbs_device *uverbs_dev,
 			 * (e.g mmput).
 			 */
 			ib_dev->disassociate_ucontext(ucontext);
+			mutex_lock(&file->cleanup_mutex);
 			ib_uverbs_cleanup_ucontext(file, ucontext, true);
+			mutex_unlock(&file->cleanup_mutex);
 		}
 
 		mutex_lock(&uverbs_dev->lists_mutex);
diff --git a/include/rdma/ib_verbs.h b/include/rdma/ib_verbs.h
index 2e8f661..3a8e058 100644
--- a/include/rdma/ib_verbs.h
+++ b/include/rdma/ib_verbs.h
@@ -1421,6 +1421,12 @@ struct ib_uobject {
 	const struct uverbs_obj_type *type;
 };
 
+struct ib_uobject_file {
+	struct ib_uobject	uobj;
+	/* ufile contains the lock between context release and file close */
+	struct ib_uverbs_file	*ufile;
+};
+
 struct ib_udata {
 	const void __user *inbuf;
 	void __user *outbuf;
diff --git a/include/rdma/uverbs_types.h b/include/rdma/uverbs_types.h
index 66368b5..5867429 100644
--- a/include/rdma/uverbs_types.h
+++ b/include/rdma/uverbs_types.h
@@ -129,6 +129,22 @@ struct ib_uobject *rdma_alloc_begin_uobject(const struct uverbs_obj_type *type,
 int __must_check rdma_remove_commit_uobject(struct ib_uobject *uobj);
 int rdma_alloc_commit_uobject(struct ib_uobject *uobj);
 
+struct uverbs_obj_fd_type {
+	/*
+	 * In fd based objects, uverbs_obj_type_ops points to generic
+	 * fd operations. In order to specialize the underlying types (e.g.
+	 * completion_channel), we use fops, name and flags for fd creation.
+	 * context_closed is called when the context is closed, either
+	 * because the driver was removed or the process terminated.
+	 */
+	struct uverbs_obj_type  type;
+	int (*context_closed)(struct ib_uobject_file *uobj_file,
+			      enum rdma_remove_reason why);
+	const struct file_operations	*fops;
+	const char			*name;
+	int				flags;
+};
+
 extern const struct uverbs_obj_type_class uverbs_idr_class;
 
 #define UVERBS_BUILD_BUG_ON(cond) (sizeof(char[1 - 2 * !!(cond)]) -	\
-- 
1.8.3.1


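The fd-based class above splits object creation into a reserve phase (alloc_begin: get_unused_fd_flags plus anon_inode_getfile) and a publish phase (alloc_commit: fd_install), so any failure in between can be unwound by alloc_abort without userspace ever observing a half-built fd. A minimal user-space model of that contract, with a toy table standing in for the kernel's fd table (all names here are illustrative, not kernel APIs):

```c
#include <assert.h>
#include <stddef.h>

/* Toy model of the two-phase fd allocation used by the fd uobject class:
 * toy_get_unused_fd() reserves a slot (get_unused_fd_flags), toy_fd_install()
 * publishes the object on commit (fd_install), and toy_put_unused_fd()
 * returns the slot on abort (put_unused_fd). An object becomes visible in
 * fd_table only after commit. */
#define TABLE_SIZE 8

static void *fd_table[TABLE_SIZE];	/* what "userspace" can see */
static int fd_reserved[TABLE_SIZE];	/* reserved but not yet installed */

static int toy_get_unused_fd(void)
{
	for (int i = 0; i < TABLE_SIZE; i++) {
		if (!fd_reserved[i] && !fd_table[i]) {
			fd_reserved[i] = 1;
			return i;
		}
	}
	return -1;
}

static void toy_fd_install(int fd, void *obj)	/* alloc_commit path */
{
	fd_table[fd] = obj;
	fd_reserved[fd] = 0;
}

static void toy_put_unused_fd(int fd)		/* alloc_abort path */
{
	fd_reserved[fd] = 0;
}
```

The same ordering shows up in ib_uverbs_create_comp_channel after the rework: copy_to_user runs between uobj_alloc and uobj_alloc_commit, so a failed copy leads to uobj_alloc_abort and the fd number is never installed.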
* [PATCH V3 for-next 7/7] IB/core: Change completion channel to use the reworked objects schema
       [not found] ` <1491301907-32290-1-git-send-email-matanb-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
                     ` (5 preceding siblings ...)
  2017-04-04 10:31   ` [PATCH V3 for-next 6/7] IB/core: Add support for fd objects Matan Barak
@ 2017-04-04 10:31   ` Matan Barak
       [not found]     ` <1491301907-32290-8-git-send-email-matanb-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
  6 siblings, 1 reply; 25+ messages in thread
From: Matan Barak @ 2017-04-04 10:31 UTC (permalink / raw)
  To: Doug Ledford
  Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA, Liran Liss, Jason Gunthorpe,
	Sean Hefty, Leon Romanovsky, Majd Dibbiny, Tal Alon,
	Yishai Hadas, Ira Weiny, Haggai Eran, Christoph Lameter,
	Matan Barak

This patch adds the standard fd based type - completion_channel.
The completion_channel is now prefixed with ib_uobject, similarly
to the rest of the uobjects.
This requires a few changes:
(1) We define a new completion channel fd based object type.
(2) completion_event and async_event are now two different types.
    This means they use different fops.
(3) We release the completion_channel exactly as we release other
    idr based objects.
(4) Since ib_uobjects are already kref-ed, we only add the kref to the
    async event.

An fd object requires filling out several parameters. Its op pointer
should point to uverbs_fd_ops and its size should be at least the
size of ib_uobject. We use a macro to make the type declaration
easier.
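The type declaration works because uverbs_obj_fd_type embeds the generic uverbs_obj_type, and the core recovers the specialization with container_of, as lookup_get_fd_uobject does in the previous patch. A user-space sketch of that embedding (struct layouts heavily simplified; container_of reimplemented here for illustration only):

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* User-space stand-in for the kernel's container_of() macro. */
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Simplified stand-ins for the structs in this series. */
struct uverbs_obj_type {
	size_t obj_size;
};

struct uverbs_obj_fd_type {
	struct uverbs_obj_type type;	/* embedded generic type */
	const char *name;		/* passed to anon_inode_getfile() */
	int flags;
};

/* Mirrors how lookup_get_fd_uobject() recovers the fd specialization
 * from the generic type pointer it is handed. */
static const char *fd_type_name(struct uverbs_obj_type *type)
{
	struct uverbs_obj_fd_type *fd_type =
		container_of(type, struct uverbs_obj_fd_type, type);

	return fd_type->name;
}
```

Because the generic core only ever sees a struct uverbs_obj_type pointer, the same alloc/lookup/commit machinery serves both IDR and fd objects; the fd-specific fops, name and flags live in the outer struct.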

Signed-off-by: Matan Barak <matanb-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
Reviewed-by: Yishai Hadas <yishaih-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
---
 drivers/infiniband/core/uverbs.h           |  26 ++-
 drivers/infiniband/core/uverbs_cmd.c       |  57 +++---
 drivers/infiniband/core/uverbs_main.c      | 279 +++++++++++++++++------------
 drivers/infiniband/core/uverbs_std_types.c |  33 +++-
 include/rdma/uverbs_std_types.h            |   1 +
 include/rdma/uverbs_types.h                |   9 +
 6 files changed, 257 insertions(+), 148 deletions(-)

diff --git a/drivers/infiniband/core/uverbs.h b/drivers/infiniband/core/uverbs.h
index 5f8a7f2..826f827 100644
--- a/drivers/infiniband/core/uverbs.h
+++ b/drivers/infiniband/core/uverbs.h
@@ -102,17 +102,25 @@ struct ib_uverbs_device {
 };
 
 struct ib_uverbs_event_file {
-	struct kref				ref;
-	int					is_async;
-	struct ib_uverbs_file		       *uverbs_file;
 	spinlock_t				lock;
 	int					is_closed;
 	wait_queue_head_t			poll_wait;
 	struct fasync_struct		       *async_queue;
 	struct list_head			event_list;
+};
+
+struct ib_uverbs_async_event_file {
+	struct ib_uverbs_event_file		ev_file;
+	struct ib_uverbs_file		       *uverbs_file;
+	struct kref				ref;
 	struct list_head			list;
 };
 
+struct ib_uverbs_completion_event_file {
+	struct ib_uobject_file			uobj_file;
+	struct ib_uverbs_event_file		ev_file;
+};
+
 struct ib_uverbs_file {
 	struct kref				ref;
 	struct mutex				mutex;
@@ -120,7 +128,7 @@ struct ib_uverbs_file {
 	struct ib_uverbs_device		       *device;
 	struct ib_ucontext		       *ucontext;
 	struct ib_event_handler			event_handler;
-	struct ib_uverbs_event_file	       *async_file;
+	struct ib_uverbs_async_event_file       *async_file;
 	struct list_head			list;
 	int					is_closed;
 
@@ -182,14 +190,14 @@ struct ib_ucq_object {
 	u32			async_events_reported;
 };
 
-struct file *ib_uverbs_alloc_event_file(struct ib_uverbs_file *uverbs_file,
-					struct ib_device *ib_dev,
-					int is_async);
+extern const struct file_operations uverbs_event_fops;
+void ib_uverbs_init_event_file(struct ib_uverbs_event_file *ev_file);
+struct file *ib_uverbs_alloc_async_event_file(struct ib_uverbs_file *uverbs_file,
+					      struct ib_device *ib_dev);
 void ib_uverbs_free_async_event_file(struct ib_uverbs_file *uverbs_file);
-struct ib_uverbs_event_file *ib_uverbs_lookup_comp_file(int fd);
 
 void ib_uverbs_release_ucq(struct ib_uverbs_file *file,
-			   struct ib_uverbs_event_file *ev_file,
+			   struct ib_uverbs_completion_event_file *ev_file,
 			   struct ib_ucq_object *uobj);
 void ib_uverbs_release_uevent(struct ib_uverbs_file *file,
 			      struct ib_uevent_object *uobj);
diff --git a/drivers/infiniband/core/uverbs_cmd.c b/drivers/infiniband/core/uverbs_cmd.c
index 119c10d..b9024fa 100644
--- a/drivers/infiniband/core/uverbs_cmd.c
+++ b/drivers/infiniband/core/uverbs_cmd.c
@@ -47,6 +47,24 @@
 #include "uverbs.h"
 #include "core_priv.h"
 
+static struct ib_uverbs_completion_event_file *
+ib_uverbs_lookup_comp_file(int fd, struct ib_ucontext *context)
+{
+	struct ib_uobject *uobj = uobj_get_read(uobj_get_type(comp_channel),
+						fd, context);
+	struct ib_uobject_file *uobj_file;
+
+	if (IS_ERR(uobj))
+		return (void *)uobj;
+
+	uverbs_uobject_get(uobj);
+	uobj_put_read(uobj);
+
+	uobj_file = container_of(uobj, struct ib_uobject_file, uobj);
+	return container_of(uobj_file, struct ib_uverbs_completion_event_file,
+			    uobj_file);
+}
+
 ssize_t ib_uverbs_get_context(struct ib_uverbs_file *file,
 			      struct ib_device *ib_dev,
 			      const char __user *buf,
@@ -116,7 +134,7 @@ ssize_t ib_uverbs_get_context(struct ib_uverbs_file *file,
 		goto err_free;
 	resp.async_fd = ret;
 
-	filp = ib_uverbs_alloc_event_file(file, ib_dev, 1);
+	filp = ib_uverbs_alloc_async_event_file(file, ib_dev);
 	if (IS_ERR(filp)) {
 		ret = PTR_ERR(filp);
 		goto err_fd;
@@ -908,8 +926,8 @@ ssize_t ib_uverbs_create_comp_channel(struct ib_uverbs_file *file,
 {
 	struct ib_uverbs_create_comp_channel	   cmd;
 	struct ib_uverbs_create_comp_channel_resp  resp;
-	struct file				  *filp;
-	int ret;
+	struct ib_uobject			  *uobj;
+	struct ib_uverbs_completion_event_file	  *ev_file;
 
 	if (out_len < sizeof resp)
 		return -ENOSPC;
@@ -917,25 +935,23 @@ ssize_t ib_uverbs_create_comp_channel(struct ib_uverbs_file *file,
 	if (copy_from_user(&cmd, buf, sizeof cmd))
 		return -EFAULT;
 
-	ret = get_unused_fd_flags(O_CLOEXEC);
-	if (ret < 0)
-		return ret;
-	resp.fd = ret;
+	uobj = uobj_alloc(uobj_get_type(comp_channel), file->ucontext);
+	if (IS_ERR(uobj))
+		return PTR_ERR(uobj);
 
-	filp = ib_uverbs_alloc_event_file(file, ib_dev, 0);
-	if (IS_ERR(filp)) {
-		put_unused_fd(resp.fd);
-		return PTR_ERR(filp);
-	}
+	resp.fd = uobj->id;
+
+	ev_file = container_of(uobj, struct ib_uverbs_completion_event_file,
+			       uobj_file.uobj);
+	ib_uverbs_init_event_file(&ev_file->ev_file);
 
 	if (copy_to_user((void __user *) (unsigned long) cmd.response,
 			 &resp, sizeof resp)) {
-		put_unused_fd(resp.fd);
-		fput(filp);
+		uobj_alloc_abort(uobj);
 		return -EFAULT;
 	}
 
-	fd_install(resp.fd, filp);
+	uobj_alloc_commit(uobj);
 	return in_len;
 }
 
@@ -953,7 +969,7 @@ static struct ib_ucq_object *create_cq(struct ib_uverbs_file *file,
 				       void *context)
 {
 	struct ib_ucq_object           *obj;
-	struct ib_uverbs_event_file    *ev_file = NULL;
+	struct ib_uverbs_completion_event_file    *ev_file = NULL;
 	struct ib_cq                   *cq;
 	int                             ret;
 	struct ib_uverbs_ex_create_cq_resp resp;
@@ -968,9 +984,10 @@ static struct ib_ucq_object *create_cq(struct ib_uverbs_file *file,
 		return obj;
 
 	if (cmd->comp_channel >= 0) {
-		ev_file = ib_uverbs_lookup_comp_file(cmd->comp_channel);
-		if (!ev_file) {
-			ret = -EINVAL;
+		ev_file = ib_uverbs_lookup_comp_file(cmd->comp_channel,
+						     file->ucontext);
+		if (IS_ERR(ev_file)) {
+			ret = PTR_ERR(ev_file);
 			goto err;
 		}
 	}
@@ -998,7 +1015,7 @@ static struct ib_ucq_object *create_cq(struct ib_uverbs_file *file,
 	cq->uobject       = &obj->uobject;
 	cq->comp_handler  = ib_uverbs_comp_handler;
 	cq->event_handler = ib_uverbs_cq_event_handler;
-	cq->cq_context    = ev_file;
+	cq->cq_context    = &ev_file->ev_file;
 	atomic_set(&cq->usecnt, 0);
 
 	obj->uobject.object = cq;
diff --git a/drivers/infiniband/core/uverbs_main.c b/drivers/infiniband/core/uverbs_main.c
index 8ee1d08..0b0dab8 100644
--- a/drivers/infiniband/core/uverbs_main.c
+++ b/drivers/infiniband/core/uverbs_main.c
@@ -156,37 +156,37 @@ static void ib_uverbs_release_dev(struct kobject *kobj)
 	.release = ib_uverbs_release_dev,
 };
 
-static void ib_uverbs_release_event_file(struct kref *ref)
+static void ib_uverbs_release_async_event_file(struct kref *ref)
 {
-	struct ib_uverbs_event_file *file =
-		container_of(ref, struct ib_uverbs_event_file, ref);
+	struct ib_uverbs_async_event_file *file =
+		container_of(ref, struct ib_uverbs_async_event_file, ref);
 
 	kfree(file);
 }
 
 void ib_uverbs_release_ucq(struct ib_uverbs_file *file,
-			  struct ib_uverbs_event_file *ev_file,
+			  struct ib_uverbs_completion_event_file *ev_file,
 			  struct ib_ucq_object *uobj)
 {
 	struct ib_uverbs_event *evt, *tmp;
 
 	if (ev_file) {
-		spin_lock_irq(&ev_file->lock);
+		spin_lock_irq(&ev_file->ev_file.lock);
 		list_for_each_entry_safe(evt, tmp, &uobj->comp_list, obj_list) {
 			list_del(&evt->list);
 			kfree(evt);
 		}
-		spin_unlock_irq(&ev_file->lock);
+		spin_unlock_irq(&ev_file->ev_file.lock);
 
-		kref_put(&ev_file->ref, ib_uverbs_release_event_file);
+		uverbs_uobject_put(&ev_file->uobj_file.uobj);
 	}
 
-	spin_lock_irq(&file->async_file->lock);
+	spin_lock_irq(&file->async_file->ev_file.lock);
 	list_for_each_entry_safe(evt, tmp, &uobj->async_list, obj_list) {
 		list_del(&evt->list);
 		kfree(evt);
 	}
-	spin_unlock_irq(&file->async_file->lock);
+	spin_unlock_irq(&file->async_file->ev_file.lock);
 }
 
 void ib_uverbs_release_uevent(struct ib_uverbs_file *file,
@@ -194,12 +194,12 @@ void ib_uverbs_release_uevent(struct ib_uverbs_file *file,
 {
 	struct ib_uverbs_event *evt, *tmp;
 
-	spin_lock_irq(&file->async_file->lock);
+	spin_lock_irq(&file->async_file->ev_file.lock);
 	list_for_each_entry_safe(evt, tmp, &uobj->event_list, obj_list) {
 		list_del(&evt->list);
 		kfree(evt);
 	}
-	spin_unlock_irq(&file->async_file->lock);
+	spin_unlock_irq(&file->async_file->ev_file.lock);
 }
 
 void ib_uverbs_detach_umcast(struct ib_qp *qp,
@@ -253,10 +253,12 @@ void ib_uverbs_release_file(struct kref *ref)
 	kfree(file);
 }
 
-static ssize_t ib_uverbs_event_read(struct file *filp, char __user *buf,
-				    size_t count, loff_t *pos)
+static ssize_t ib_uverbs_event_read(struct ib_uverbs_event_file *file,
+				    struct ib_uverbs_file *uverbs_file,
+				    struct file *filp, char __user *buf,
+				    size_t count, loff_t *pos,
+				    bool is_async)
 {
-	struct ib_uverbs_event_file *file = filp->private_data;
 	struct ib_uverbs_event *event;
 	int eventsz;
 	int ret = 0;
@@ -275,12 +277,12 @@ static ssize_t ib_uverbs_event_read(struct file *filp, char __user *buf,
 			 * and wake_up() guarentee this will see the null set
 			 * without using RCU
 			 */
-					     !file->uverbs_file->device->ib_dev)))
+					     !uverbs_file->device->ib_dev)))
 			return -ERESTARTSYS;
 
 		/* If device was disassociated and no event exists set an error */
 		if (list_empty(&file->event_list) &&
-		    !file->uverbs_file->device->ib_dev)
+		    !uverbs_file->device->ib_dev)
 			return -EIO;
 
 		spin_lock_irq(&file->lock);
@@ -288,7 +290,7 @@ static ssize_t ib_uverbs_event_read(struct file *filp, char __user *buf,
 
 	event = list_entry(file->event_list.next, struct ib_uverbs_event, list);
 
-	if (file->is_async)
+	if (is_async)
 		eventsz = sizeof (struct ib_uverbs_async_event_desc);
 	else
 		eventsz = sizeof (struct ib_uverbs_comp_event_desc);
@@ -318,11 +320,31 @@ static ssize_t ib_uverbs_event_read(struct file *filp, char __user *buf,
 	return ret;
 }
 
-static unsigned int ib_uverbs_event_poll(struct file *filp,
+static ssize_t ib_uverbs_async_event_read(struct file *filp, char __user *buf,
+					  size_t count, loff_t *pos)
+{
+	struct ib_uverbs_async_event_file *file = filp->private_data;
+
+	return ib_uverbs_event_read(&file->ev_file, file->uverbs_file, filp,
+				    buf, count, pos, true);
+}
+
+static ssize_t ib_uverbs_comp_event_read(struct file *filp, char __user *buf,
+					 size_t count, loff_t *pos)
+{
+	struct ib_uverbs_completion_event_file *comp_ev_file =
+		filp->private_data;
+
+	return ib_uverbs_event_read(&comp_ev_file->ev_file,
+				    comp_ev_file->uobj_file.ufile, filp,
+				    buf, count, pos, false);
+}
+
+static unsigned int ib_uverbs_event_poll(struct ib_uverbs_event_file *file,
+					 struct file *filp,
 					 struct poll_table_struct *wait)
 {
 	unsigned int pollflags = 0;
-	struct ib_uverbs_event_file *file = filp->private_data;
 
 	poll_wait(filp, &file->poll_wait, wait);
 
@@ -334,49 +356,98 @@ static unsigned int ib_uverbs_event_poll(struct file *filp,
 	return pollflags;
 }
 
-static int ib_uverbs_event_fasync(int fd, struct file *filp, int on)
+static unsigned int ib_uverbs_async_event_poll(struct file *filp,
+					       struct poll_table_struct *wait)
+{
+	return ib_uverbs_event_poll(filp->private_data, filp, wait);
+}
+
+static unsigned int ib_uverbs_comp_event_poll(struct file *filp,
+					      struct poll_table_struct *wait)
+{
+	struct ib_uverbs_completion_event_file *comp_ev_file =
+		filp->private_data;
+
+	return ib_uverbs_event_poll(&comp_ev_file->ev_file, filp, wait);
+}
+
+static int ib_uverbs_async_event_fasync(int fd, struct file *filp, int on)
 {
 	struct ib_uverbs_event_file *file = filp->private_data;
 
 	return fasync_helper(fd, filp, on, &file->async_queue);
 }
 
-static int ib_uverbs_event_close(struct inode *inode, struct file *filp)
+static int ib_uverbs_comp_event_fasync(int fd, struct file *filp, int on)
 {
-	struct ib_uverbs_event_file *file = filp->private_data;
+	struct ib_uverbs_completion_event_file *comp_ev_file =
+		filp->private_data;
+
+	return fasync_helper(fd, filp, on, &comp_ev_file->ev_file.async_queue);
+}
+
+static int ib_uverbs_async_event_close(struct inode *inode, struct file *filp)
+{
+	struct ib_uverbs_async_event_file *file = filp->private_data;
+	struct ib_uverbs_file *uverbs_file = file->uverbs_file;
 	struct ib_uverbs_event *entry, *tmp;
 	int closed_already = 0;
 
-	mutex_lock(&file->uverbs_file->device->lists_mutex);
-	spin_lock_irq(&file->lock);
-	closed_already = file->is_closed;
-	file->is_closed = 1;
-	list_for_each_entry_safe(entry, tmp, &file->event_list, list) {
+	mutex_lock(&uverbs_file->device->lists_mutex);
+	spin_lock_irq(&file->ev_file.lock);
+	closed_already = file->ev_file.is_closed;
+	file->ev_file.is_closed = 1;
+	list_for_each_entry_safe(entry, tmp, &file->ev_file.event_list, list) {
 		if (entry->counter)
 			list_del(&entry->obj_list);
 		kfree(entry);
 	}
-	spin_unlock_irq(&file->lock);
+	spin_unlock_irq(&file->ev_file.lock);
 	if (!closed_already) {
 		list_del(&file->list);
-		if (file->is_async)
-			ib_unregister_event_handler(&file->uverbs_file->
-				event_handler);
+		ib_unregister_event_handler(&uverbs_file->event_handler);
+	}
+	mutex_unlock(&uverbs_file->device->lists_mutex);
+
+	kref_put(&uverbs_file->ref, ib_uverbs_release_file);
+	kref_put(&file->ref, ib_uverbs_release_async_event_file);
+
+	return 0;
+}
+
+static int ib_uverbs_comp_event_close(struct inode *inode, struct file *filp)
+{
+	struct ib_uverbs_completion_event_file *file = filp->private_data;
+	struct ib_uverbs_event *entry, *tmp;
+
+	spin_lock_irq(&file->ev_file.lock);
+	list_for_each_entry_safe(entry, tmp, &file->ev_file.event_list, list) {
+		if (entry->counter)
+			list_del(&entry->obj_list);
+		kfree(entry);
 	}
-	mutex_unlock(&file->uverbs_file->device->lists_mutex);
+	spin_unlock_irq(&file->ev_file.lock);
 
-	kref_put(&file->uverbs_file->ref, ib_uverbs_release_file);
-	kref_put(&file->ref, ib_uverbs_release_event_file);
+	uverbs_close_fd(filp);
 
 	return 0;
 }
 
-static const struct file_operations uverbs_event_fops = {
+const struct file_operations uverbs_event_fops = {
 	.owner	 = THIS_MODULE,
-	.read	 = ib_uverbs_event_read,
-	.poll    = ib_uverbs_event_poll,
-	.release = ib_uverbs_event_close,
-	.fasync  = ib_uverbs_event_fasync,
+	.read	 = ib_uverbs_comp_event_read,
+	.poll    = ib_uverbs_comp_event_poll,
+	.release = ib_uverbs_comp_event_close,
+	.fasync  = ib_uverbs_comp_event_fasync,
+	.llseek	 = no_llseek,
+};
+
+static const struct file_operations uverbs_async_event_fops = {
+	.owner	 = THIS_MODULE,
+	.read	 = ib_uverbs_async_event_read,
+	.poll    = ib_uverbs_async_event_poll,
+	.release = ib_uverbs_async_event_close,
+	.fasync  = ib_uverbs_async_event_fasync,
 	.llseek	 = no_llseek,
 };
 
@@ -423,15 +494,15 @@ static void ib_uverbs_async_handler(struct ib_uverbs_file *file,
 	struct ib_uverbs_event *entry;
 	unsigned long flags;
 
-	spin_lock_irqsave(&file->async_file->lock, flags);
-	if (file->async_file->is_closed) {
-		spin_unlock_irqrestore(&file->async_file->lock, flags);
+	spin_lock_irqsave(&file->async_file->ev_file.lock, flags);
+	if (file->async_file->ev_file.is_closed) {
+		spin_unlock_irqrestore(&file->async_file->ev_file.lock, flags);
 		return;
 	}
 
 	entry = kmalloc(sizeof *entry, GFP_ATOMIC);
 	if (!entry) {
-		spin_unlock_irqrestore(&file->async_file->lock, flags);
+		spin_unlock_irqrestore(&file->async_file->ev_file.lock, flags);
 		return;
 	}
 
@@ -440,13 +511,13 @@ static void ib_uverbs_async_handler(struct ib_uverbs_file *file,
 	entry->desc.async.reserved   = 0;
 	entry->counter               = counter;
 
-	list_add_tail(&entry->list, &file->async_file->event_list);
+	list_add_tail(&entry->list, &file->async_file->ev_file.event_list);
 	if (obj_list)
 		list_add_tail(&entry->obj_list, obj_list);
-	spin_unlock_irqrestore(&file->async_file->lock, flags);
+	spin_unlock_irqrestore(&file->async_file->ev_file.lock, flags);
 
-	wake_up_interruptible(&file->async_file->poll_wait);
-	kill_fasync(&file->async_file->async_queue, SIGIO, POLL_IN);
+	wake_up_interruptible(&file->async_file->ev_file.poll_wait);
+	kill_fasync(&file->async_file->ev_file.async_queue, SIGIO, POLL_IN);
 }
 
 void ib_uverbs_cq_event_handler(struct ib_event *event, void *context_ptr)
@@ -509,15 +580,23 @@ void ib_uverbs_event_handler(struct ib_event_handler *handler,
 
 void ib_uverbs_free_async_event_file(struct ib_uverbs_file *file)
 {
-	kref_put(&file->async_file->ref, ib_uverbs_release_event_file);
+	kref_put(&file->async_file->ref, ib_uverbs_release_async_event_file);
 	file->async_file = NULL;
 }
 
-struct file *ib_uverbs_alloc_event_file(struct ib_uverbs_file *uverbs_file,
-					struct ib_device	*ib_dev,
-					int is_async)
+void ib_uverbs_init_event_file(struct ib_uverbs_event_file *ev_file)
 {
-	struct ib_uverbs_event_file *ev_file;
+	spin_lock_init(&ev_file->lock);
+	INIT_LIST_HEAD(&ev_file->event_list);
+	init_waitqueue_head(&ev_file->poll_wait);
+	ev_file->is_closed   = 0;
+	ev_file->async_queue = NULL;
+}
+
+struct file *ib_uverbs_alloc_async_event_file(struct ib_uverbs_file *uverbs_file,
+					      struct ib_device	*ib_dev)
+{
+	struct ib_uverbs_async_event_file *ev_file;
 	struct file *filp;
 	int ret;
 
@@ -525,16 +604,11 @@ struct file *ib_uverbs_alloc_event_file(struct ib_uverbs_file *uverbs_file,
 	if (!ev_file)
 		return ERR_PTR(-ENOMEM);
 
-	kref_init(&ev_file->ref);
-	spin_lock_init(&ev_file->lock);
-	INIT_LIST_HEAD(&ev_file->event_list);
-	init_waitqueue_head(&ev_file->poll_wait);
+	ib_uverbs_init_event_file(&ev_file->ev_file);
 	ev_file->uverbs_file = uverbs_file;
 	kref_get(&ev_file->uverbs_file->ref);
-	ev_file->async_queue = NULL;
-	ev_file->is_closed   = 0;
-
-	filp = anon_inode_getfile("[infinibandevent]", &uverbs_event_fops,
+	kref_init(&ev_file->ref);
+	filp = anon_inode_getfile("[infinibandevent]", &uverbs_async_event_fops,
 				  ev_file, O_RDONLY);
 	if (IS_ERR(filp))
 		goto err_put_refs;
@@ -544,64 +618,33 @@ struct file *ib_uverbs_alloc_event_file(struct ib_uverbs_file *uverbs_file,
 		      &uverbs_file->device->uverbs_events_file_list);
 	mutex_unlock(&uverbs_file->device->lists_mutex);
 
-	if (is_async) {
-		WARN_ON(uverbs_file->async_file);
-		uverbs_file->async_file = ev_file;
-		kref_get(&uverbs_file->async_file->ref);
-		INIT_IB_EVENT_HANDLER(&uverbs_file->event_handler,
-				      ib_dev,
-				      ib_uverbs_event_handler);
-		ret = ib_register_event_handler(&uverbs_file->event_handler);
-		if (ret)
-			goto err_put_file;
-
-		/* At that point async file stuff was fully set */
-		ev_file->is_async = 1;
-	}
+	WARN_ON(uverbs_file->async_file);
+	uverbs_file->async_file = ev_file;
+	kref_get(&uverbs_file->async_file->ref);
+	INIT_IB_EVENT_HANDLER(&uverbs_file->event_handler,
+			      ib_dev,
+			      ib_uverbs_event_handler);
+	ret = ib_register_event_handler(&uverbs_file->event_handler);
+	if (ret)
+		goto err_put_file;
+
+	/* At that point async file stuff was fully set */
 
 	return filp;
 
 err_put_file:
 	fput(filp);
-	kref_put(&uverbs_file->async_file->ref, ib_uverbs_release_event_file);
+	kref_put(&uverbs_file->async_file->ref,
+		 ib_uverbs_release_async_event_file);
 	uverbs_file->async_file = NULL;
 	return ERR_PTR(ret);
 
 err_put_refs:
 	kref_put(&ev_file->uverbs_file->ref, ib_uverbs_release_file);
-	kref_put(&ev_file->ref, ib_uverbs_release_event_file);
+	kref_put(&ev_file->ref, ib_uverbs_release_async_event_file);
 	return filp;
 }
 
-/*
- * Look up a completion event file by FD.  If lookup is successful,
- * takes a ref to the event file struct that it returns; if
- * unsuccessful, returns NULL.
- */
-struct ib_uverbs_event_file *ib_uverbs_lookup_comp_file(int fd)
-{
-	struct ib_uverbs_event_file *ev_file = NULL;
-	struct fd f = fdget(fd);
-
-	if (!f.file)
-		return NULL;
-
-	if (f.file->f_op != &uverbs_event_fops)
-		goto out;
-
-	ev_file = f.file->private_data;
-	if (ev_file->is_async) {
-		ev_file = NULL;
-		goto out;
-	}
-
-	kref_get(&ev_file->ref);
-
-out:
-	fdput(f);
-	return ev_file;
-}
-
 static int verify_command_mask(struct ib_device *ib_dev, __u32 command)
 {
 	u64 mask;
@@ -896,7 +939,8 @@ static int ib_uverbs_close(struct inode *inode, struct file *filp)
 	mutex_unlock(&file->device->lists_mutex);
 
 	if (file->async_file)
-		kref_put(&file->async_file->ref, ib_uverbs_release_event_file);
+		kref_put(&file->async_file->ref,
+			 ib_uverbs_release_async_event_file);
 
 	kref_put(&file->ref, ib_uverbs_release_file);
 	kobject_put(&dev->kobj);
@@ -1095,7 +1139,7 @@ static void ib_uverbs_free_hw_resources(struct ib_uverbs_device *uverbs_dev,
 					struct ib_device *ib_dev)
 {
 	struct ib_uverbs_file *file;
-	struct ib_uverbs_event_file *event_file;
+	struct ib_uverbs_async_event_file *event_file;
 	struct ib_event event;
 
 	/* Pending running commands to terminate */
@@ -1144,21 +1188,20 @@ static void ib_uverbs_free_hw_resources(struct ib_uverbs_device *uverbs_dev,
 	while (!list_empty(&uverbs_dev->uverbs_events_file_list)) {
 		event_file = list_first_entry(&uverbs_dev->
 					      uverbs_events_file_list,
-					      struct ib_uverbs_event_file,
+					      struct ib_uverbs_async_event_file,
 					      list);
-		spin_lock_irq(&event_file->lock);
-		event_file->is_closed = 1;
-		spin_unlock_irq(&event_file->lock);
+		spin_lock_irq(&event_file->ev_file.lock);
+		event_file->ev_file.is_closed = 1;
+		spin_unlock_irq(&event_file->ev_file.lock);
 
 		list_del(&event_file->list);
-		if (event_file->is_async) {
-			ib_unregister_event_handler(&event_file->uverbs_file->
-						    event_handler);
-			event_file->uverbs_file->event_handler.device = NULL;
-		}
+		ib_unregister_event_handler(
+			&event_file->uverbs_file->event_handler);
+		event_file->uverbs_file->event_handler.device =
+			NULL;
 
-		wake_up_interruptible(&event_file->poll_wait);
-		kill_fasync(&event_file->async_queue, SIGIO, POLL_IN);
+		wake_up_interruptible(&event_file->ev_file.poll_wait);
+		kill_fasync(&event_file->ev_file.async_queue, SIGIO, POLL_IN);
 	}
 	mutex_unlock(&uverbs_dev->lists_mutex);
 }
diff --git a/drivers/infiniband/core/uverbs_std_types.c b/drivers/infiniband/core/uverbs_std_types.c
index a514556..7f26af5 100644
--- a/drivers/infiniband/core/uverbs_std_types.c
+++ b/drivers/infiniband/core/uverbs_std_types.c
@@ -145,7 +145,11 @@ int uverbs_free_cq(struct ib_uobject *uobject,
 
 	ret = ib_destroy_cq(cq);
 	if (!ret || why != RDMA_REMOVE_DESTROY)
-		ib_uverbs_release_ucq(uobject->context->ufile, ev_file, ucq);
+		ib_uverbs_release_ucq(uobject->context->ufile, ev_file ?
+				      container_of(ev_file,
+						   struct ib_uverbs_completion_event_file,
+						   ev_file) : NULL,
+				      ucq);
 	return ret;
 }
 
@@ -186,6 +190,33 @@ int uverbs_free_pd(struct ib_uobject *uobject,
 	return 0;
 }
 
+int uverbs_hot_unplug_completion_event_file(struct ib_uobject_file *uobj_file,
+					    enum rdma_remove_reason why)
+{
+	struct ib_uverbs_completion_event_file *comp_event_file =
+		container_of(uobj_file, struct ib_uverbs_completion_event_file,
+			     uobj_file);
+	struct ib_uverbs_event_file *event_file = &comp_event_file->ev_file;
+
+	spin_lock_irq(&event_file->lock);
+	event_file->is_closed = 1;
+	spin_unlock_irq(&event_file->lock);
+
+	if (why == RDMA_REMOVE_DRIVER_REMOVE) {
+		wake_up_interruptible(&event_file->poll_wait);
+		kill_fasync(&event_file->async_queue, SIGIO, POLL_IN);
+	}
+	return 0;
+}
+
+const struct uverbs_obj_fd_type uverbs_type_attrs_comp_channel = {
+	.type = UVERBS_TYPE_ALLOC_FD(sizeof(struct ib_uverbs_completion_event_file), 0),
+	.context_closed = uverbs_hot_unplug_completion_event_file,
+	.fops = &uverbs_event_fops,
+	.name = "[infinibandevent]",
+	.flags = O_RDONLY,
+};
+
 const struct uverbs_obj_idr_type uverbs_type_attrs_cq = {
 	.type = UVERBS_TYPE_ALLOC_IDR_SZ(sizeof(struct ib_ucq_object), 0),
 	.destroy_object = uverbs_free_cq,
diff --git a/include/rdma/uverbs_std_types.h b/include/rdma/uverbs_std_types.h
index 8885664..7771ce9 100644
--- a/include/rdma/uverbs_std_types.h
+++ b/include/rdma/uverbs_std_types.h
@@ -35,6 +35,7 @@
 
 #include <rdma/uverbs_types.h>
 
+extern const struct uverbs_obj_fd_type uverbs_type_attrs_comp_channel;
 extern const struct uverbs_obj_idr_type uverbs_type_attrs_cq;
 extern const struct uverbs_obj_idr_type uverbs_type_attrs_qp;
 extern const struct uverbs_obj_idr_type uverbs_type_attrs_rwq_ind_table;
diff --git a/include/rdma/uverbs_types.h b/include/rdma/uverbs_types.h
index 5867429..a376921 100644
--- a/include/rdma/uverbs_types.h
+++ b/include/rdma/uverbs_types.h
@@ -146,9 +146,18 @@ struct uverbs_obj_fd_type {
 };
 
 extern const struct uverbs_obj_type_class uverbs_idr_class;
+extern const struct uverbs_obj_type_class uverbs_fd_class;
 
 #define UVERBS_BUILD_BUG_ON(cond) (sizeof(char[1 - 2 * !!(cond)]) -	\
 				   sizeof(char))
+#define UVERBS_TYPE_ALLOC_FD(_size, _order)				 \
+	{								 \
+		.destroy_order = _order,				 \
+		.type_class = &uverbs_fd_class,				 \
+		.obj_size = (_size) +					 \
+			  UVERBS_BUILD_BUG_ON((_size) <			 \
+					      sizeof(struct ib_uobject_file)),\
+	}
 #define UVERBS_TYPE_ALLOC_IDR_SZ(_size, _order)				\
 	{								\
 		.destroy_order = _order,				\
-- 
1.8.3.1

--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply related	[flat|nested] 25+ messages in thread

* RE: [PATCH V3 for-next 1/7] IB/core: Refactor idr to be per uverbs_file
       [not found]     ` <1491301907-32290-2-git-send-email-matanb-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
@ 2017-04-04 17:33       ` Hefty, Sean
       [not found]         ` <1828884A29C6694DAF28B7E6B8A82373AB10F3F4-P5GAC/sN6hkd3b2yrw5b5LfspsVTdybXVpNB7YpNyf8@public.gmane.org>
  0 siblings, 1 reply; 25+ messages in thread
From: Hefty, Sean @ 2017-04-04 17:33 UTC (permalink / raw)
  To: Matan Barak, Doug Ledford
  Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA, Liran Liss, Jason Gunthorpe,
	Leon Romanovsky, Majd Dibbiny, Tal Alon, Yishai Hadas, Weiny,
	Ira, Haggai Eran, Christoph Lameter

> The current code creates an idr per type. Since types are currently
> common for all drivers and known in advance, this was good enough.
> However, the proposed ioctl based infrastructure allows each driver
> to declare only some of the common types and declare its own specific
> types.
> 
> Thus, we decided to implement idr to be per uverbs_file.
> 
> Signed-off-by: Matan Barak <matanb-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
> Signed-off-by: Leon Romanovsky <leonro-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
> Signed-off-by: Haggai Eran <haggaie-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
> ---

Reviewed-by: Sean Hefty <sean.hefty-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>

^ permalink raw reply	[flat|nested] 25+ messages in thread

* RE: [PATCH V3 for-next 2/7] IB/core: Add support for idr types
       [not found]     ` <1491301907-32290-3-git-send-email-matanb-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
@ 2017-04-05  0:43       ` Hefty, Sean
       [not found]         ` <1828884A29C6694DAF28B7E6B8A82373AB10F5A5-P5GAC/sN6hkd3b2yrw5b5LfspsVTdybXVpNB7YpNyf8@public.gmane.org>
  0 siblings, 1 reply; 25+ messages in thread
From: Hefty, Sean @ 2017-04-05  0:43 UTC (permalink / raw)
  To: Matan Barak, Doug Ledford
  Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA, Liran Liss, Jason Gunthorpe,
	Leon Romanovsky, Majd Dibbiny, Tal Alon, Yishai Hadas, Weiny,
	Ira, Haggai Eran, Christoph Lameter

> diff --git a/drivers/infiniband/core/rdma_core.c
> b/drivers/infiniband/core/rdma_core.c
> new file mode 100644
> index 0000000..1cbc053
> --- /dev/null
> +++ b/drivers/infiniband/core/rdma_core.c
> @@ -0,0 +1,450 @@
> +/*
> + * Copyright (c) 2016, Mellanox Technologies inc.  All rights
> reserved.
> + *
> + * This software is available to you under a choice of one of two
> + * licenses.  You may choose to be licensed under the terms of the
> GNU
> + * General Public License (GPL) Version 2, available from the file
> + * COPYING in the main directory of this source tree, or the
> + * OpenIB.org BSD license below:
> + *
> + *     Redistribution and use in source and binary forms, with or
> + *     without modification, are permitted provided that the
> following
> + *     conditions are met:
> + *
> + *      - Redistributions of source code must retain the above
> + *        copyright notice, this list of conditions and the following
> + *        disclaimer.
> + *
> + *      - Redistributions in binary form must reproduce the above
> + *        copyright notice, this list of conditions and the following
> + *        disclaimer in the documentation and/or other materials
> + *        provided with the distribution.
> + *
> + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
> + * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
> + * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
> + * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT
> HOLDERS
> + * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
> + * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
> + * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
> + * SOFTWARE.
> + */
> +
> +#include <linux/file.h>
> +#include <linux/anon_inodes.h>
> +#include <rdma/ib_verbs.h>
> +#include <rdma/uverbs_types.h>
> +#include <linux/rcupdate.h>
> +#include "uverbs.h"
> +#include "core_priv.h"
> +#include "rdma_core.h"
> +
> +void uverbs_uobject_get(struct ib_uobject *uobject)
> +{
> +	kref_get(&uobject->ref);
> +}
> +
> +static void uverbs_uobject_put_ref(struct kref *ref)
> +{
> +	struct ib_uobject *uobj =
> +		container_of(ref, struct ib_uobject, ref);
> +
> +	if (uobj->type->type_class->needs_kfree_rcu)
> +		kfree_rcu(uobj, rcu);
> +	else
> +		kfree(uobj);
> +}

I would rename 'put' to 'free'.

> +
> +void uverbs_uobject_put(struct ib_uobject *uobject)
> +{
> +	kref_put(&uobject->ref, uverbs_uobject_put_ref);
> +}
> +
> +static int uverbs_try_lock_object(struct ib_uobject *uobj, bool
> write)
> +{
> +	/*
> +	 * When a read is required, we use a positive counter. Each read
> +	 * request checks that the value != -1 and increment it. Write
> +	 * requires an exclusive access, thus we check that the counter
> is
> +	 * zero (nobody claimed this object) and we set it to -1.
> +	 * Releasing a read lock is done by simply decreasing the
> counter.
> +	 * As for writes, since only a single write is permitted,
> setting
> +	 * it to zero is enough for releasing it.
> +	 */
> +	if (!write)
> +		return __atomic_add_unless(&uobj->usecnt, 1, -1) == -1 ?
> +			-EBUSY : 0;
> +
> +	/* lock is either WRITE or DESTROY - should be exclusive */
> +	return atomic_cmpxchg(&uobj->usecnt, 0, -1) == 0 ? 0 : -EBUSY;
> +}

I would replace 'write' with 'exclusive'.

> +static struct ib_uobject *alloc_uobj(struct ib_ucontext *context,
> +				     const struct uverbs_obj_type *type)
> +{
> +	struct ib_uobject *uobj = kmalloc(type->obj_size, GFP_KERNEL);

kzalloc?

> +
> +	if (!uobj)
> +		return ERR_PTR(-ENOMEM);
> +	/*
> +	 * user_handle should be filled by the handler,
> +	 * The object is added to the list in the commit stage.
> +	 */
> +	uobj->context = context;
> +	uobj->type = type;
> +	atomic_set(&uobj->usecnt, 0);
> +	kref_init(&uobj->ref);
> +
> +	return uobj;
> +}
> +
> +static int idr_add_uobj(struct ib_uobject *uobj)
> +{
> +	int ret;
> +
> +	idr_preload(GFP_KERNEL);
> +	spin_lock(&uobj->context->ufile->idr_lock);
> +
> +	/*
> +	 * We start with allocating an idr pointing to NULL. This
> represents an
> +	 * object which isn't initialized yet. We'll replace it later on
> with
> +	 * the real object once we commit.
> +	 */
> +	ret = idr_alloc(&uobj->context->ufile->idr, NULL, 0,
> +			min_t(unsigned long, U32_MAX - 1, INT_MAX),
> GFP_NOWAIT);
> +	if (ret >= 0)
> +		uobj->id = ret;
> +
> +	spin_unlock(&uobj->context->ufile->idr_lock);
> +	idr_preload_end();
> +
> +	return ret < 0 ? ret : 0;
> +}
> +
> +/*
> + * It only removes it from the uobjects list, uverbs_uobject_put() is
> still
> + * required.
> + */
> +static void uverbs_idr_remove_uobj(struct ib_uobject *uobj)
> +{
> +	spin_lock(&uobj->context->ufile->idr_lock);
> +	idr_remove(&uobj->context->ufile->idr, uobj->id);
> +	spin_unlock(&uobj->context->ufile->idr_lock);
> +}
> +
> +/* Returns the ib_uobject or an error. The caller should check for
> IS_ERR. */
> +static struct ib_uobject *lookup_get_idr_uobject(const struct
> uverbs_obj_type *type,
> +						 struct ib_ucontext *ucontext,
> +						 int id, bool write)
> +{
> +	struct ib_uobject *uobj;
> +
> +	rcu_read_lock();
> +	/* object won't be released as we're protected by RCU */
> +	uobj = idr_find(&ucontext->ufile->idr, id);
> +	if (!uobj) {
> +		uobj = ERR_PTR(-ENOENT);
> +		goto free;
> +	}
> +
> +	uverbs_uobject_get(uobj);
> +free:
> +	rcu_read_unlock();
> +	return uobj;
> +}
> +
> +struct ib_uobject *rdma_lookup_get_uobject(const struct
> uverbs_obj_type *type,
> +					   struct ib_ucontext *ucontext,
> +					   int id, bool write)
> +{
> +	struct ib_uobject *uobj;
> +	int ret;
> +
> +	uobj = type->type_class->lookup_get(type, ucontext, id, write);
> +	if (IS_ERR(uobj))
> +		return uobj;
> +
> +	if (uobj->type != type) {
> +		ret = -EINVAL;
> +		goto free;
> +	}
> +
> +	ret = uverbs_try_lock_object(uobj, write);
> +	if (ret) {
> +		WARN(ucontext->cleanup_reason,
> +		     "ib_uverbs: Trying to lookup_get while cleanup
> context\n");
> +		goto free;
> +	}
> +
> +	return uobj;
> +free:
> +	uobj->type->type_class->lookup_put(uobj, write);
> +	uverbs_uobject_put(uobj);

There's an unexpected asymmetry here: lookup_get is paired with lookup_put + uobject_put.

> +	return ERR_PTR(ret);
> +}
> +
> +static struct ib_uobject *alloc_begin_idr_uobject(const struct
> uverbs_obj_type *type,
> +						  struct ib_ucontext *ucontext)
> +{
> +	int ret;
> +	struct ib_uobject *uobj;
> +
> +	uobj = alloc_uobj(ucontext, type);
> +	if (IS_ERR(uobj))
> +		return uobj;
> +
> +	ret = idr_add_uobj(uobj);
> +	if (ret)
> +		goto uobj_put;
> +
> +	ret = ib_rdmacg_try_charge(&uobj->cg_obj, ucontext->device,
> +				   RDMACG_RESOURCE_HCA_OBJECT);
> +	if (ret)
> +		goto idr_remove;
> +
> +	return uobj;
> +
> +idr_remove:
> +	uverbs_idr_remove_uobj(uobj);
> +uobj_put:
> +	uverbs_uobject_put(uobj);
> +	return ERR_PTR(ret);
> +}
> +
> +struct ib_uobject *rdma_alloc_begin_uobject(const struct
> uverbs_obj_type *type,
> +					    struct ib_ucontext *ucontext)
> +{
> +	return type->type_class->alloc_begin(type, ucontext);
> +}
> +
> +static void uverbs_uobject_add(struct ib_uobject *uobject)
> +{
> +	mutex_lock(&uobject->context->uobjects_lock);
> +	list_add(&uobject->list, &uobject->context->uobjects);
> +	mutex_unlock(&uobject->context->uobjects_lock);
> +}
> +
> +static int __must_check remove_commit_idr_uobject(struct ib_uobject
> *uobj,
> +						  enum rdma_remove_reason why)
> +{
> +	const struct uverbs_obj_idr_type *idr_type =
> +		container_of(uobj->type, struct uverbs_obj_idr_type,
> +			     type);
> +	int ret = idr_type->destroy_object(uobj, why);
> +
> +	/*
> +	 * We can only fail gracefully if the user requested to destroy
> the
> +	 * object. In the rest of the cases, just remove whatever you
> can.
> +	 */
> +	if (why == RDMA_REMOVE_DESTROY && ret)
> +		return ret;
> +
> +	ib_rdmacg_uncharge(&uobj->cg_obj, uobj->context->device,
> +			   RDMACG_RESOURCE_HCA_OBJECT);
> +	uverbs_idr_remove_uobj(uobj);
> +
> +	return ret;
> +}
> +
> +static void lockdep_check(struct ib_uobject *uobj, bool write)
> +{
> +#ifdef CONFIG_LOCKDEP
> +	if (write)
> +		WARN_ON(atomic_read(&uobj->usecnt) > 0);
> +	else
> +		WARN_ON(atomic_read(&uobj->usecnt) == -1);
> +#endif
> +}
> +
> +static int __must_check _rdma_remove_commit_uobject(struct ib_uobject
> *uobj,
> +						    enum rdma_remove_reason why,
> +						    bool lock)
> +{
> +	int ret;
> +	struct ib_ucontext *ucontext = uobj->context;
> +
> +	ret = uobj->type->type_class->remove_commit(uobj, why);
> +	if (ret && why == RDMA_REMOVE_DESTROY) {
> +		/* We couldn't remove the object, so just unlock the
> uobject */
> +		atomic_set(&uobj->usecnt, 0);
> +		uobj->type->type_class->lookup_put(uobj, true);
> +	} else {
> +		if (lock)
> +			mutex_lock(&ucontext->uobjects_lock);
> +		list_del(&uobj->list);
> +		if (lock)
> +			mutex_unlock(&ucontext->uobjects_lock);
> +		/* put the ref we took when we created the object */
> +		uverbs_uobject_put(uobj);

Please try to restructure the code so that locking state doesn't need to be carried through to functions like this.


> +	}
> +
> +	return ret;
> +}
> +
> +/* This is called only for user requested DESTROY reasons */
> +int __must_check rdma_remove_commit_uobject(struct ib_uobject *uobj)
> +{
> +	int ret;
> +	struct ib_ucontext *ucontext = uobj->context;
> +
> +	/* put the ref count we took at lookup_get */
> +	uverbs_uobject_put(uobj);
> +	/* Cleanup is running. Calling this should have been impossible
> */
> +	if (!down_read_trylock(&ucontext->cleanup_rwsem)) {
> +		WARN(true, "ib_uverbs: Cleanup is running while removing
> an uobject\n");
> +		return 0;
> +	}
> +	lockdep_check(uobj, true);
> +	ret = _rdma_remove_commit_uobject(uobj, RDMA_REMOVE_DESTROY,
> true);
> +
> +	up_read(&ucontext->cleanup_rwsem);
> +	return ret;
> +}
> +
> +static void alloc_commit_idr_uobject(struct ib_uobject *uobj)
> +{
> +	uverbs_uobject_add(uobj);
> +	spin_lock(&uobj->context->ufile->idr_lock);
> +	/*
> +	 * We already allocated this IDR with a NULL object, so
> +	 * this shouldn't fail.
> +	 */
> +	WARN_ON(idr_replace(&uobj->context->ufile->idr,
> +			    uobj, uobj->id));
> +	spin_unlock(&uobj->context->ufile->idr_lock);
> +}
> +
> +int rdma_alloc_commit_uobject(struct ib_uobject *uobj)
> +{
> +	/* Cleanup is running. Calling this should have been impossible
> */
> +	if (!down_read_trylock(&uobj->context->cleanup_rwsem)) {
> +		int ret;
> +
> +		WARN(true, "ib_uverbs: Cleanup is running while allocating
> an uobject\n");
> +		ret = uobj->type->type_class->remove_commit(uobj,
> +
> RDMA_REMOVE_DURING_CLEANUP);
> +		if (ret)
> +			pr_warn("ib_uverbs: cleanup of idr object %d
> failed\n",
> +				uobj->id);
> +		return ret;
> +	}
> +
> +	uobj->type->type_class->alloc_commit(uobj);
> +	up_read(&uobj->context->cleanup_rwsem);
> +
> +	return 0;
> +}
> +
> +static void alloc_abort_idr_uobject(struct ib_uobject *uobj)
> +{
> +	uverbs_idr_remove_uobj(uobj);
> +	ib_rdmacg_uncharge(&uobj->cg_obj, uobj->context->device,
> +			   RDMACG_RESOURCE_HCA_OBJECT);
> +	uverbs_uobject_put(uobj);
> +}
> +
> +void rdma_alloc_abort_uobject(struct ib_uobject *uobj)
> +{
> +	uobj->type->type_class->alloc_abort(uobj);
> +}
> +
> +static void lookup_put_idr_uobject(struct ib_uobject *uobj, bool
> write)
> +{
> +}
> +
> +void rdma_lookup_put_uobject(struct ib_uobject *uobj, bool write)
> +{
> +	lockdep_check(uobj, write);
> +	uobj->type->type_class->lookup_put(uobj, write);
> +	/*
> +	 * In order to unlock an object, either decrease its usecnt for
> +	 * read access or zero it in case of write access. See
> +	 * uverbs_try_lock_object for locking schema information.
> +	 */
> +	if (!write)
> +		atomic_dec(&uobj->usecnt);
> +	else
> +		atomic_set(&uobj->usecnt, 0);
> +
> +	uverbs_uobject_put(uobj);
> +}
> +
> +const struct uverbs_obj_type_class uverbs_idr_class = {
> +	.alloc_begin = alloc_begin_idr_uobject,
> +	.lookup_get = lookup_get_idr_uobject,
> +	.alloc_commit = alloc_commit_idr_uobject,
> +	.alloc_abort = alloc_abort_idr_uobject,
> +	.lookup_put = lookup_put_idr_uobject,
> +	.remove_commit = remove_commit_idr_uobject,
> +	/*
> +	 * When we destroy an object, we first just lock it for WRITE
> and
> +	 * actually DESTROY it in the finalize stage. So, the
> problematic
> +	 * scenario is when we just started the finalize stage of the
> +	 * destruction (nothing was executed yet). Now, the other thread
> +	 * fetched the object for READ access, but it didn't lock it
> yet.
> +	 * The DESTROY thread continues and starts destroying the
> object.
> +	 * When the other thread continue - without the RCU, it would
> +	 * access freed memory. However, the rcu_read_lock delays the
> free
> +	 * until the rcu_read_lock of the READ operation quits. Since
> the
> +	 * write lock of the object is still taken by the DESTROY flow,
> the
> +	 * READ operation will get -EBUSY and it'll just bail out.
> +	 */
> +	.needs_kfree_rcu = true,
> +};
> +
> +void uverbs_cleanup_ucontext(struct ib_ucontext *ucontext, bool
> device_removed)
> +{
> +	enum rdma_remove_reason reason = device_removed ?
> +		RDMA_REMOVE_DRIVER_REMOVE : RDMA_REMOVE_CLOSE;
> +	unsigned int cur_order = 0;
> +
> +	ucontext->cleanup_reason = reason;
> +	/*
> +	 * Waits for all remove_commit and alloc_commit to finish.
> Logically, we
> +	 * want to hold this forever as the context is going to be
> destroyed,
> +	 * but we'll release it since it causes a "held lock freed" BUG
> message.
> +	 */
> +	down_write(&ucontext->cleanup_rwsem);
> +
> +	while (!list_empty(&ucontext->uobjects)) {
> +		struct ib_uobject *obj, *next_obj;
> +		unsigned int next_order = UINT_MAX;
> +
> +		/*
> +		 * This shouldn't run while executing other commands on
> this
> +		 * context.
> +		 */
> +		mutex_lock(&ucontext->uobjects_lock);
> +		list_for_each_entry_safe(obj, next_obj, &ucontext-
> >uobjects,
> +					 list)

Please add braces

> +			if (obj->type->destroy_order == cur_order) {
> +				int ret;
> +
> +				/*
> +				 * if we hit this WARN_ON, that means we are
> +				 * racing with a lookup_get.
> +				 */
> +				WARN_ON(uverbs_try_lock_object(obj, true));
> +				ret = _rdma_remove_commit_uobject(obj, reason,
> +								  false);
> +				if (ret)
> +					pr_warn("ib_uverbs: failed to remove
> uobject id %d order %u\n",
> +						obj->id, cur_order);
> +			} else {
> +				next_order = min(next_order,
> +						 obj->type->destroy_order);
> +			}
> +		mutex_unlock(&ucontext->uobjects_lock);
> +		cur_order = next_order;
> +	}
> +	up_write(&ucontext->cleanup_rwsem);
> +}
> +
> +void uverbs_initialize_ucontext(struct ib_ucontext *ucontext)
> +{
> +	ucontext->cleanup_reason = 0;
> +	mutex_init(&ucontext->uobjects_lock);
> +	INIT_LIST_HEAD(&ucontext->uobjects);
> +	init_rwsem(&ucontext->cleanup_rwsem);
> +}
> +
> diff --git a/drivers/infiniband/core/rdma_core.h
> b/drivers/infiniband/core/rdma_core.h
> new file mode 100644
> index 0000000..ab665a6
> --- /dev/null
> +++ b/drivers/infiniband/core/rdma_core.h
> @@ -0,0 +1,55 @@
> +/*
> + * Copyright (c) 2005 Topspin Communications.  All rights reserved.
> + * Copyright (c) 2005, 2006 Cisco Systems.  All rights reserved.
> + * Copyright (c) 2005-2017 Mellanox Technologies. All rights
> reserved.
> + * Copyright (c) 2005 Voltaire, Inc. All rights reserved.
> + * Copyright (c) 2005 PathScale, Inc. All rights reserved.
> + *
> + * This software is available to you under a choice of one of two
> + * licenses.  You may choose to be licensed under the terms of the
> GNU
> + * General Public License (GPL) Version 2, available from the file
> + * COPYING in the main directory of this source tree, or the
> + * OpenIB.org BSD license below:
> + *
> + *     Redistribution and use in source and binary forms, with or
> + *     without modification, are permitted provided that the
> following
> + *     conditions are met:
> + *
> + *      - Redistributions of source code must retain the above
> + *        copyright notice, this list of conditions and the following
> + *        disclaimer.
> + *
> + *      - Redistributions in binary form must reproduce the above
> + *        copyright notice, this list of conditions and the following
> + *        disclaimer in the documentation and/or other materials
> + *        provided with the distribution.
> + *
> + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
> + * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
> + * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
> + * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT
> HOLDERS
> + * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
> + * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
> + * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
> + * SOFTWARE.
> + */
> +
> +#ifndef RDMA_CORE_H
> +#define RDMA_CORE_H
> +
> +#include <linux/idr.h>
> +#include <rdma/uverbs_types.h>
> +#include <rdma/ib_verbs.h>
> +#include <linux/mutex.h>
> +
> +/*
> + * These functions initialize the context and clean up its uobjects.
> + * The context has a list of objects which is protected by a mutex
> + * on the context. initialize_ucontext should be called when we
> create
> + * a context.
> + * cleanup_ucontext removes all uobjects from the context and puts
> them.
> + */
> +void uverbs_cleanup_ucontext(struct ib_ucontext *ucontext, bool
> device_removed);
> +void uverbs_initialize_ucontext(struct ib_ucontext *ucontext);
> +
> +#endif /* RDMA_CORE_H */
> diff --git a/include/rdma/ib_verbs.h b/include/rdma/ib_verbs.h
> index 319e691..d3efd22 100644
> --- a/include/rdma/ib_verbs.h
> +++ b/include/rdma/ib_verbs.h
> @@ -1357,6 +1357,17 @@ struct ib_fmr_attr {
> 
>  struct ib_umem;
> 
> +enum rdma_remove_reason {
> +	/* Userspace requested uobject deletion. Call could fail */
> +	RDMA_REMOVE_DESTROY,
> +	/* Context deletion. This call should delete the actual object
> itself */
> +	RDMA_REMOVE_CLOSE,
> +	/* Driver is being hot-unplugged. This call should delete the
> actual object itself */
> +	RDMA_REMOVE_DRIVER_REMOVE,
> +	/* Context is being cleaned-up, but commit was just completed */
> +	RDMA_REMOVE_DURING_CLEANUP,
> +};
> +
>  struct ib_rdmacg_object {
>  #ifdef CONFIG_CGROUP_RDMA
>  	struct rdma_cgroup	*cg;		/* owner rdma cgroup */
> @@ -1379,6 +1390,13 @@ struct ib_ucontext {
>  	struct list_head	rwq_ind_tbl_list;
>  	int			closing;
> 
> +	/* locking the uobjects_list */
> +	struct mutex		uobjects_lock;
> +	struct list_head	uobjects;
> +	/* protects cleanup process from other actions */
> +	struct rw_semaphore	cleanup_rwsem;
> +	enum rdma_remove_reason cleanup_reason;
> +
>  	struct pid             *tgid;
>  #ifdef CONFIG_INFINIBAND_ON_DEMAND_PAGING
>  	struct rb_root      umem_tree;
> @@ -1409,8 +1427,11 @@ struct ib_uobject {
>  	int			id;		/* index into kernel idr */
>  	struct kref		ref;
>  	struct rw_semaphore	mutex;		/* protects .live */
> +	atomic_t		usecnt;		/* protects exclusive access
> */
>  	struct rcu_head		rcu;		/* kfree_rcu() overhead */
>  	int			live;
> +
> +	const struct uverbs_obj_type *type;
>  };
> 
>  struct ib_udata {
> diff --git a/include/rdma/uverbs_types.h b/include/rdma/uverbs_types.h
> new file mode 100644
> index 0000000..0777e40
> --- /dev/null
> +++ b/include/rdma/uverbs_types.h
> @@ -0,0 +1,132 @@
> +/*
> + * Copyright (c) 2017, Mellanox Technologies inc.  All rights
> reserved.
> + *
> + * This software is available to you under a choice of one of two
> + * licenses.  You may choose to be licensed under the terms of the
> GNU
> + * General Public License (GPL) Version 2, available from the file
> + * COPYING in the main directory of this source tree, or the
> + * OpenIB.org BSD license below:
> + *
> + *     Redistribution and use in source and binary forms, with or
> + *     without modification, are permitted provided that the
> following
> + *     conditions are met:
> + *
> + *      - Redistributions of source code must retain the above
> + *        copyright notice, this list of conditions and the following
> + *        disclaimer.
> + *
> + *      - Redistributions in binary form must reproduce the above
> + *        copyright notice, this list of conditions and the following
> + *        disclaimer in the documentation and/or other materials
> + *        provided with the distribution.
> + *
> + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
> + * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
> + * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
> + * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT
> HOLDERS
> + * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
> + * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
> + * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
> + * SOFTWARE.
> + */
> +
> +#ifndef _UVERBS_TYPES_
> +#define _UVERBS_TYPES_
> +
> +#include <linux/kernel.h>
> +#include <rdma/ib_verbs.h>
> +
> +struct uverbs_obj_type;
> +
> +struct uverbs_obj_type_class {
> +	/*
> +	 * Get an ib_uobject that corresponds to the given id from
> ucontext,
> +	 * These functions could create or destroy objects if required.
> +	 * The action will be finalized only when commit, abort or put
> fops are
> +	 * called.
> +	 * The flow of the different actions is:
> +	 * [alloc]:	 Starts with alloc_begin. The handler's logic is then
> +	 *		 executed. If the handler is successful,
> alloc_commit
> +	 *		 is called and the object is inserted to the
> repository.
> +	 *		 Once alloc_commit completes the object is visible
> to
> +	 *		 other threads and userspace.
> +	 *		 Otherwise, alloc_abort is called and the object is
> +	 *		 destroyed.
> +	 * [lookup]:	 Starts with lookup_get which fetches and
> locks the
> +	 *		 object. After the handler finished using the
> object, it
> +	 *		 needs to call lookup_put to unlock it. The write
> flag
> +	 *		 indicates if the object is locked for exclusive
> access.
> +	 * [remove]:	 Starts with lookup_get with write flag set.
> This locks
> +	 *		 the object for exclusive access. If the handler
> code
> +	 *		 completed successfully, remove_commit is called and
> +	 *		 the ib_uobject is removed from the context's
> uobjects
> +	 *		 repository and put. The object itself is destroyed
> as
> +	 *		 well. Once remove succeeds new krefs to the object
> +	 *		 cannot be acquired by other threads or userspace
> and
> +	 *		 the hardware driver is removed from the object.
> +	 *		 Other krefs on the object may still exist.
> +	 *		 If the handler code failed, lookup_put should be
> +	 *		 called. This callback is used when the context
> +	 *		 is destroyed as well (process termination,
> +	 *		 reset flow).
> +	 */
> +	struct ib_uobject *(*alloc_begin)(const struct uverbs_obj_type
> *type,
> +					  struct ib_ucontext *ucontext);
> +	void (*alloc_commit)(struct ib_uobject *uobj);
> +	void (*alloc_abort)(struct ib_uobject *uobj);
> +
> +	struct ib_uobject *(*lookup_get)(const struct uverbs_obj_type
> *type,
> +					 struct ib_ucontext *ucontext, int id,
> +					 bool write);
> +	void (*lookup_put)(struct ib_uobject *uobj, bool write);

Rather than passing in a write/exclusive flag to a bunch of different calls, why not just have separate calls?  E.g. get_shared/put_shared, get_excl/put_excl?

> +	/*
> +	 * Must be called with the write lock held. If successful uobj
> is
> +	 * invalid on return. On failure uobject is left completely
> +	 * unchanged
> +	 */
> +	int __must_check (*remove_commit)(struct ib_uobject *uobj,
> +					  enum rdma_remove_reason why);

Or add matching remove_begin()/remove_abort() calls.

> +	u8    needs_kfree_rcu;
> +};
> +
> +struct uverbs_obj_type {
> +	const struct uverbs_obj_type_class * const type_class;
> +	size_t	     obj_size;
> +	unsigned int destroy_order;
> +};
> +
> +/*
> + * Objects type classes which support a detach state (object is still
> alive but
> + * it's not attached to any context need to make sure:
> + * (a) no call through to a driver after a detach is called
> + * (b) detach isn't called concurrently with context_cleanup
> + */
> +
> +struct uverbs_obj_idr_type {
> +	/*
> +	 * In idr based objects, uverbs_obj_type_class points to a
> generic
> +	 * idr operations. In order to specialize the underlying types
> (e.g. CQ,
> +	 * QPs, etc.), we add destroy_object specific callbacks.
> +	 */
> +	struct uverbs_obj_type  type;
> +
> +	/* Free driver resources from the uobject, make the driver
> uncallable,
> +	 * and move the uobject to the detached state. If the object was
> +	 * destroyed by the user's request, a failure should leave the
> uobject
> +	 * completely unchanged.
> +	 */
> +	int __must_check (*destroy_object)(struct ib_uobject *uobj,
> +					   enum rdma_remove_reason why);
> +};
> +
> +struct ib_uobject *rdma_lookup_get_uobject(const struct
> uverbs_obj_type *type,
> +					   struct ib_ucontext *ucontext,
> +					   int id, bool write);
> +void rdma_lookup_put_uobject(struct ib_uobject *uobj, bool write);
> +struct ib_uobject *rdma_alloc_begin_uobject(const struct
> uverbs_obj_type *type,
> +					    struct ib_ucontext *ucontext);
> +void rdma_alloc_abort_uobject(struct ib_uobject *uobj);
> +int __must_check rdma_remove_commit_uobject(struct ib_uobject *uobj);
> +int rdma_alloc_commit_uobject(struct ib_uobject *uobj);
> +
> +#endif

In general, this code requires a lot of in-function commenting, which suggests complexity.  The general approach seems reasonable based on what I've read so far.

- Sean
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


* Re: [PATCH V3 for-next 2/7] IB/core: Add support for idr types
From: Matan Barak @ 2017-04-05 10:55 UTC (permalink / raw)
  To: Hefty, Sean
  Cc: Matan Barak, Doug Ledford, linux-rdma-u79uwXL29TY76Z2rM5mHXA,
	Liran Liss, Jason Gunthorpe, Leon Romanovsky, Majd Dibbiny,
	Tal Alon, Yishai Hadas, Weiny, Ira, Haggai Eran,
	Christoph Lameter

On Wed, Apr 5, 2017 at 3:43 AM, Hefty, Sean <sean.hefty-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org> wrote:
>> diff --git a/drivers/infiniband/core/rdma_core.c
>> b/drivers/infiniband/core/rdma_core.c
>> new file mode 100644
>> index 0000000..1cbc053
>> --- /dev/null
>> +++ b/drivers/infiniband/core/rdma_core.c
>> @@ -0,0 +1,450 @@
>> +/*
>> + * Copyright (c) 2016, Mellanox Technologies inc.  All rights
>> reserved.
>> + *
>> + * This software is available to you under a choice of one of two
>> + * licenses.  You may choose to be licensed under the terms of the
>> GNU
>> + * General Public License (GPL) Version 2, available from the file
>> + * COPYING in the main directory of this source tree, or the
>> + * OpenIB.org BSD license below:
>> + *
>> + *     Redistribution and use in source and binary forms, with or
>> + *     without modification, are permitted provided that the
>> following
>> + *     conditions are met:
>> + *
>> + *      - Redistributions of source code must retain the above
>> + *        copyright notice, this list of conditions and the following
>> + *        disclaimer.
>> + *
>> + *      - Redistributions in binary form must reproduce the above
>> + *        copyright notice, this list of conditions and the following
>> + *        disclaimer in the documentation and/or other materials
>> + *        provided with the distribution.
>> + *
>> + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
>> + * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
>> + * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
>> + * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT
>> HOLDERS
>> + * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
>> + * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
>> + * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
>> + * SOFTWARE.
>> + */
>> +
>> +#include <linux/file.h>
>> +#include <linux/anon_inodes.h>
>> +#include <rdma/ib_verbs.h>
>> +#include <rdma/uverbs_types.h>
>> +#include <linux/rcupdate.h>
>> +#include "uverbs.h"
>> +#include "core_priv.h"
>> +#include "rdma_core.h"
>> +
>> +void uverbs_uobject_get(struct ib_uobject *uobject)
>> +{
>> +     kref_get(&uobject->ref);
>> +}
>> +
>> +static void uverbs_uobject_put_ref(struct kref *ref)
>> +{
>> +     struct ib_uobject *uobj =
>> +             container_of(ref, struct ib_uobject, ref);
>> +
>> +     if (uobj->type->type_class->needs_kfree_rcu)
>> +             kfree_rcu(uobj, rcu);
>> +     else
>> +             kfree(uobj);
>> +}
>
> I would rename 'put' to 'free'.
>

Ok

>> +
>> +void uverbs_uobject_put(struct ib_uobject *uobject)
>> +{
>> +     kref_put(&uobject->ref, uverbs_uobject_put_ref);
>> +}
>> +
>> +static int uverbs_try_lock_object(struct ib_uobject *uobj, bool
>> write)
>> +{
>> +     /*
>> +      * When a read is required, we use a positive counter. Each read
>> +      * request checks that the value != -1 and increment it. Write
>> +      * requires an exclusive access, thus we check that the counter
>> is
>> +      * zero (nobody claimed this object) and we set it to -1.
>> +      * Releasing a read lock is done by simply decreasing the
>> counter.
>> +      * As for writes, since only a single write is permitted,
>> setting
>> +      * it to zero is enough for releasing it.
>> +      */
>> +     if (!write)
>> +             return __atomic_add_unless(&uobj->usecnt, 1, -1) == -1 ?
>> +                     -EBUSY : 0;
>> +
>> +     /* lock is either WRITE or DESTROY - should be exclusive */
>> +     return atomic_cmpxchg(&uobj->usecnt, 0, -1) == 0 ? 0 : -EBUSY;
>> +}
>
> I would replace 'write' with 'exclusive'.
>

Ok

>> +static struct ib_uobject *alloc_uobj(struct ib_ucontext *context,
>> +                                  const struct uverbs_obj_type *type)
>> +{
>> +     struct ib_uobject *uobj = kmalloc(type->obj_size, GFP_KERNEL);
>
> kzalloc?
>

All the standard uobject fields are initialized, but since this allocation can
also be used for a user struct that contains the uobject, zeroing might help
some users detect their bugs earlier. No problem changing this.

>> +
>> +     if (!uobj)
>> +             return ERR_PTR(-ENOMEM);
>> +     /*
>> +      * user_handle should be filled by the handler,
>> +      * The object is added to the list in the commit stage.
>> +      */
>> +     uobj->context = context;
>> +     uobj->type = type;
>> +     atomic_set(&uobj->usecnt, 0);
>> +     kref_init(&uobj->ref);
>> +
>> +     return uobj;
>> +}
>> +
>> +static int idr_add_uobj(struct ib_uobject *uobj)
>> +{
>> +     int ret;
>> +
>> +     idr_preload(GFP_KERNEL);
>> +     spin_lock(&uobj->context->ufile->idr_lock);
>> +
>> +     /*
>> +      * We start with allocating an idr pointing to NULL. This
>> represents an
>> +      * object which isn't initialized yet. We'll replace it later on
>> with
>> +      * the real object once we commit.
>> +      */
>> +     ret = idr_alloc(&uobj->context->ufile->idr, NULL, 0,
>> +                     min_t(unsigned long, U32_MAX - 1, INT_MAX),
>> GFP_NOWAIT);
>> +     if (ret >= 0)
>> +             uobj->id = ret;
>> +
>> +     spin_unlock(&uobj->context->ufile->idr_lock);
>> +     idr_preload_end();
>> +
>> +     return ret < 0 ? ret : 0;
>> +}
>> +
>> +/*
>> + * It only removes it from the uobjects list, uverbs_uobject_put() is
>> still
>> + * required.
>> + */
>> +static void uverbs_idr_remove_uobj(struct ib_uobject *uobj)
>> +{
>> +     spin_lock(&uobj->context->ufile->idr_lock);
>> +     idr_remove(&uobj->context->ufile->idr, uobj->id);
>> +     spin_unlock(&uobj->context->ufile->idr_lock);
>> +}
>> +
>> +/* Returns the ib_uobject or an error. The caller should check for
>> IS_ERR. */
>> +static struct ib_uobject *lookup_get_idr_uobject(const struct
>> uverbs_obj_type *type,
>> +                                              struct ib_ucontext *ucontext,
>> +                                              int id, bool write)
>> +{
>> +     struct ib_uobject *uobj;
>> +
>> +     rcu_read_lock();
>> +     /* object won't be released as we're protected in rcu */
>> +     uobj = idr_find(&ucontext->ufile->idr, id);
>> +     if (!uobj) {
>> +             uobj = ERR_PTR(-ENOENT);
>> +             goto free;
>> +     }
>> +
>> +     uverbs_uobject_get(uobj);
>> +free:
>> +     rcu_read_unlock();
>> +     return uobj;
>> +}
>> +
>> +struct ib_uobject *rdma_lookup_get_uobject(const struct
>> uverbs_obj_type *type,
>> +                                        struct ib_ucontext *ucontext,
>> +                                        int id, bool write)
>> +{
>> +     struct ib_uobject *uobj;
>> +     int ret;
>> +
>> +     uobj = type->type_class->lookup_get(type, ucontext, id, write);
>> +     if (IS_ERR(uobj))
>> +             return uobj;
>> +
>> +     if (uobj->type != type) {
>> +             ret = -EINVAL;
>> +             goto free;
>> +     }
>> +
>> +     ret = uverbs_try_lock_object(uobj, write);
>> +     if (ret) {
>> +             WARN(ucontext->cleanup_reason,
>> +                  "ib_uverbs: Trying to lookup_get while cleanup
>> context\n");
>> +             goto free;
>> +     }
>> +
>> +     return uobj;
>> +free:
>> +     uobj->type->type_class->lookup_put(uobj, write);
>> +     uverbs_uobject_put(uobj);
>
> There's an unexpected asymmetry here.  lookup_get is pairing with lookup_put + uobject_put.
>

lookup_get also calls uverbs_uobject_get. It's done in the idr/fd callback,
as sometimes we need to wrap it in RCU (or some other equivalent mechanism).
In the previous version it was more symmetrical, but Jason suggested
simplicity over symmetry and I think it looks better this way.

>> +     return ERR_PTR(ret);
>> +}
>> +
>> +static struct ib_uobject *alloc_begin_idr_uobject(const struct
>> uverbs_obj_type *type,
>> +                                               struct ib_ucontext *ucontext)
>> +{
>> +     int ret;
>> +     struct ib_uobject *uobj;
>> +
>> +     uobj = alloc_uobj(ucontext, type);
>> +     if (IS_ERR(uobj))
>> +             return uobj;
>> +
>> +     ret = idr_add_uobj(uobj);
>> +     if (ret)
>> +             goto uobj_put;
>> +
>> +     ret = ib_rdmacg_try_charge(&uobj->cg_obj, ucontext->device,
>> +                                RDMACG_RESOURCE_HCA_OBJECT);
>> +     if (ret)
>> +             goto idr_remove;
>> +
>> +     return uobj;
>> +
>> +idr_remove:
>> +     uverbs_idr_remove_uobj(uobj);
>> +uobj_put:
>> +     uverbs_uobject_put(uobj);
>> +     return ERR_PTR(ret);
>> +}
>> +
>> +struct ib_uobject *rdma_alloc_begin_uobject(const struct
>> uverbs_obj_type *type,
>> +                                         struct ib_ucontext *ucontext)
>> +{
>> +     return type->type_class->alloc_begin(type, ucontext);
>> +}
>> +
>> +static void uverbs_uobject_add(struct ib_uobject *uobject)
>> +{
>> +     mutex_lock(&uobject->context->uobjects_lock);
>> +     list_add(&uobject->list, &uobject->context->uobjects);
>> +     mutex_unlock(&uobject->context->uobjects_lock);
>> +}
>> +
>> +static int __must_check remove_commit_idr_uobject(struct ib_uobject
>> *uobj,
>> +                                               enum rdma_remove_reason why)
>> +{
>> +     const struct uverbs_obj_idr_type *idr_type =
>> +             container_of(uobj->type, struct uverbs_obj_idr_type,
>> +                          type);
>> +     int ret = idr_type->destroy_object(uobj, why);
>> +
>> +     /*
>> +      * We can only fail gracefully if the user requested to destroy
>> the
>> +      * object. In the rest of the cases, just remove whatever you
>> can.
>> +      */
>> +     if (why == RDMA_REMOVE_DESTROY && ret)
>> +             return ret;
>> +
>> +     ib_rdmacg_uncharge(&uobj->cg_obj, uobj->context->device,
>> +                        RDMACG_RESOURCE_HCA_OBJECT);
>> +     uverbs_idr_remove_uobj(uobj);
>> +
>> +     return ret;
>> +}
>> +
>> +static void lockdep_check(struct ib_uobject *uobj, bool write)
>> +{
>> +#ifdef CONFIG_LOCKDEP
>> +     if (write)
>> +             WARN_ON(atomic_read(&uobj->usecnt) > 0);
>> +     else
>> +             WARN_ON(atomic_read(&uobj->usecnt) == -1);
>> +#endif
>> +}
>> +
>> +static int __must_check _rdma_remove_commit_uobject(struct ib_uobject
>> *uobj,
>> +                                                 enum rdma_remove_reason why,
>> +                                                 bool lock)
>> +{
>> +     int ret;
>> +     struct ib_ucontext *ucontext = uobj->context;
>> +
>> +     ret = uobj->type->type_class->remove_commit(uobj, why);
>> +     if (ret && why == RDMA_REMOVE_DESTROY) {
>> +             /* We couldn't remove the object, so just unlock the
>> uobject */
>> +             atomic_set(&uobj->usecnt, 0);
>> +             uobj->type->type_class->lookup_put(uobj, true);
>> +     } else {
>> +             if (lock)
>> +                     mutex_lock(&ucontext->uobjects_lock);
>> +             list_del(&uobj->list);
>> +             if (lock)
>> +                     mutex_unlock(&ucontext->uobjects_lock);
>> +             /* put the ref we took when we created the object */
>> +             uverbs_uobject_put(uobj);
>
> Please try to restructure the code so that locking state doesn't need to be carried through to functions like this.
>
>

Yeah, actually, I could just put the uobj->...->lookup_put + list_del +
uobject_put straight in the cleanup_ucontext code.

>> +     }
>> +
>> +     return ret;
>> +}
>> +
>> +/* This is called only for user requested DESTROY reasons */
>> +int __must_check rdma_remove_commit_uobject(struct ib_uobject *uobj)
>> +{
>> +     int ret;
>> +     struct ib_ucontext *ucontext = uobj->context;
>> +
>> +     /* put the ref count we took at lookup_get */
>> +     uverbs_uobject_put(uobj);
>> +     /* Cleanup is running. Calling this should have been impossible
>> */
>> +     if (!down_read_trylock(&ucontext->cleanup_rwsem)) {
>> +             WARN(true, "ib_uverbs: Cleanup is running while removing
>> an uobject\n");
>> +             return 0;
>> +     }
>> +     lockdep_check(uobj, true);
>> +     ret = _rdma_remove_commit_uobject(uobj, RDMA_REMOVE_DESTROY,
>> true);
>> +
>> +     up_read(&ucontext->cleanup_rwsem);
>> +     return ret;
>> +}
>> +
>> +static void alloc_commit_idr_uobject(struct ib_uobject *uobj)
>> +{
>> +     uverbs_uobject_add(uobj);
>> +     spin_lock(&uobj->context->ufile->idr_lock);
>> +     /*
>> +      * We already allocated this IDR with a NULL object, so
>> +      * this shouldn't fail.
>> +      */
>> +     WARN_ON(idr_replace(&uobj->context->ufile->idr,
>> +                         uobj, uobj->id));
>> +     spin_unlock(&uobj->context->ufile->idr_lock);
>> +}
>> +
>> +int rdma_alloc_commit_uobject(struct ib_uobject *uobj)
>> +{
>> +     /* Cleanup is running. Calling this should have been impossible
>> */
>> +     if (!down_read_trylock(&uobj->context->cleanup_rwsem)) {
>> +             int ret;
>> +
>> +             WARN(true, "ib_uverbs: Cleanup is running while allocating
>> an uobject\n");
>> +             ret = uobj->type->type_class->remove_commit(uobj,
>> +
>> RDMA_REMOVE_DURING_CLEANUP);
>> +             if (ret)
>> +                     pr_warn("ib_uverbs: cleanup of idr object %d
>> failed\n",
>> +                             uobj->id);
>> +             return ret;
>> +     }
>> +
>> +     uobj->type->type_class->alloc_commit(uobj);
>> +     up_read(&uobj->context->cleanup_rwsem);
>> +
>> +     return 0;
>> +}
>> +
>> +static void alloc_abort_idr_uobject(struct ib_uobject *uobj)
>> +{
>> +     uverbs_idr_remove_uobj(uobj);
>> +     ib_rdmacg_uncharge(&uobj->cg_obj, uobj->context->device,
>> +                        RDMACG_RESOURCE_HCA_OBJECT);
>> +     uverbs_uobject_put(uobj);
>> +}
>> +
>> +void rdma_alloc_abort_uobject(struct ib_uobject *uobj)
>> +{
>> +     uobj->type->type_class->alloc_abort(uobj);
>> +}
>> +
>> +static void lookup_put_idr_uobject(struct ib_uobject *uobj, bool
>> write)
>> +{
>> +}
>> +
>> +void rdma_lookup_put_uobject(struct ib_uobject *uobj, bool write)
>> +{
>> +     lockdep_check(uobj, write);
>> +     uobj->type->type_class->lookup_put(uobj, write);
>> +     /*
>> +      * In order to unlock an object, either decrease its usecnt for
>> +      * read access or zero it in case of write access. See
>> +      * uverbs_try_lock_object for locking schema information.
>> +      */
>> +     if (!write)
>> +             atomic_dec(&uobj->usecnt);
>> +     else
>> +             atomic_set(&uobj->usecnt, 0);
>> +
>> +     uverbs_uobject_put(uobj);
>> +}
>> +
>> +const struct uverbs_obj_type_class uverbs_idr_class = {
>> +     .alloc_begin = alloc_begin_idr_uobject,
>> +     .lookup_get = lookup_get_idr_uobject,
>> +     .alloc_commit = alloc_commit_idr_uobject,
>> +     .alloc_abort = alloc_abort_idr_uobject,
>> +     .lookup_put = lookup_put_idr_uobject,
>> +     .remove_commit = remove_commit_idr_uobject,
>> +     /*
>> +      * When we destroy an object, we first just lock it for WRITE
>> and
>> +      * actually DESTROY it in the finalize stage. So, the
>> problematic
>> +      * scenario is when we just started the finalize stage of the
>> +      * destruction (nothing was executed yet). Now, the other thread
>> +      * fetched the object for READ access, but it didn't lock it
>> yet.
>> +      * The DESTROY thread continues and starts destroying the
>> object.
>> +      * When the other thread continue - without the RCU, it would
>> +      * access freed memory. However, the rcu_read_lock delays the
>> free
>> +      * until the rcu_read_lock of the READ operation quits. Since
>> the
>> +      * write lock of the object is still taken by the DESTROY flow,
>> the
>> +      * READ operation will get -EBUSY and it'll just bail out.
>> +      */
>> +     .needs_kfree_rcu = true,
>> +};
>> +
>> +void uverbs_cleanup_ucontext(struct ib_ucontext *ucontext, bool
>> device_removed)
>> +{
>> +     enum rdma_remove_reason reason = device_removed ?
>> +             RDMA_REMOVE_DRIVER_REMOVE : RDMA_REMOVE_CLOSE;
>> +     unsigned int cur_order = 0;
>> +
>> +     ucontext->cleanup_reason = reason;
>> +     /*
>> +      * Waits for all remove_commit and alloc_commit to finish.
>> Logically, We
>> +      * want to hold this forever as the context is going to be
>> destroyed,
>> +      * but we'll release it since it causes a "held lock freed" BUG
>> message.
>> +      */
>> +     down_write(&ucontext->cleanup_rwsem);
>> +
>> +     while (!list_empty(&ucontext->uobjects)) {
>> +             struct ib_uobject *obj, *next_obj;
>> +             unsigned int next_order = UINT_MAX;
>> +
>> +             /*
>> +              * This shouldn't run while executing other commands on
>> this
>> +              * context.
>> +              */
>> +             mutex_lock(&ucontext->uobjects_lock);
>> +             list_for_each_entry_safe(obj, next_obj, &ucontext-
>> >uobjects,
>> +                                      list)
>
> Please add braces
>

Sure :)

>> +                     if (obj->type->destroy_order == cur_order) {
>> +                             int ret;
>> +
>> +                             /*
>> +                              * if we hit this WARN_ON, that means we are
>> +                              * racing with a lookup_get.
>> +                              */
>> +                             WARN_ON(uverbs_try_lock_object(obj, true));
>> +                             ret = _rdma_remove_commit_uobject(obj, reason,
>> +                                                               false);
>> +                             if (ret)
>> +                                     pr_warn("ib_uverbs: failed to remove
>> uobject id %d order %u\n",
>> +                                             obj->id, cur_order);
>> +                     } else {
>> +                             next_order = min(next_order,
>> +                                              obj->type->destroy_order);
>> +                     }
>> +             mutex_unlock(&ucontext->uobjects_lock);
>> +             cur_order = next_order;
>> +     }
>> +     up_write(&ucontext->cleanup_rwsem);
>> +}
>> +
>> +void uverbs_initialize_ucontext(struct ib_ucontext *ucontext)
>> +{
>> +     ucontext->cleanup_reason = 0;
>> +     mutex_init(&ucontext->uobjects_lock);
>> +     INIT_LIST_HEAD(&ucontext->uobjects);
>> +     init_rwsem(&ucontext->cleanup_rwsem);
>> +}
>> +
>> diff --git a/drivers/infiniband/core/rdma_core.h
>> b/drivers/infiniband/core/rdma_core.h
>> new file mode 100644
>> index 0000000..ab665a6
>> --- /dev/null
>> +++ b/drivers/infiniband/core/rdma_core.h
>> @@ -0,0 +1,55 @@
>> +/*
>> + * Copyright (c) 2005 Topspin Communications.  All rights reserved.
>> + * Copyright (c) 2005, 2006 Cisco Systems.  All rights reserved.
>> + * Copyright (c) 2005-2017 Mellanox Technologies. All rights
>> reserved.
>> + * Copyright (c) 2005 Voltaire, Inc. All rights reserved.
>> + * Copyright (c) 2005 PathScale, Inc. All rights reserved.
>> + *
>> + * This software is available to you under a choice of one of two
>> + * licenses.  You may choose to be licensed under the terms of the
>> GNU
>> + * General Public License (GPL) Version 2, available from the file
>> + * COPYING in the main directory of this source tree, or the
>> + * OpenIB.org BSD license below:
>> + *
>> + *     Redistribution and use in source and binary forms, with or
>> + *     without modification, are permitted provided that the
>> following
>> + *     conditions are met:
>> + *
>> + *      - Redistributions of source code must retain the above
>> + *        copyright notice, this list of conditions and the following
>> + *        disclaimer.
>> + *
>> + *      - Redistributions in binary form must reproduce the above
>> + *        copyright notice, this list of conditions and the following
>> + *        disclaimer in the documentation and/or other materials
>> + *        provided with the distribution.
>> + *
>> + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
>> + * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
>> + * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
>> + * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT
>> HOLDERS
>> + * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
>> + * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
>> + * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
>> + * SOFTWARE.
>> + */
>> +
>> +#ifndef RDMA_CORE_H
>> +#define RDMA_CORE_H
>> +
>> +#include <linux/idr.h>
>> +#include <rdma/uverbs_types.h>
>> +#include <rdma/ib_verbs.h>
>> +#include <linux/mutex.h>
>> +
>> +/*
>> + * These functions initialize the context and cleanups its uobjects.
>> + * The context has a list of objects which is protected by a mutex
>> + * on the context. initialize_ucontext should be called when we
>> create
>> + * a context.
>> + * cleanup_ucontext removes all uobjects from the context and puts
>> them.
>> + */
>> +void uverbs_cleanup_ucontext(struct ib_ucontext *ucontext, bool
>> device_removed);
>> +void uverbs_initialize_ucontext(struct ib_ucontext *ucontext);
>> +
>> +#endif /* RDMA_CORE_H */
>> diff --git a/include/rdma/ib_verbs.h b/include/rdma/ib_verbs.h
>> index 319e691..d3efd22 100644
>> --- a/include/rdma/ib_verbs.h
>> +++ b/include/rdma/ib_verbs.h
>> @@ -1357,6 +1357,17 @@ struct ib_fmr_attr {
>>
>>  struct ib_umem;
>>
>> +enum rdma_remove_reason {
>> +     /* Userspace requested uobject deletion. Call could fail */
>> +     RDMA_REMOVE_DESTROY,
>> +     /* Context deletion. This call should delete the actual object
>> itself */
>> +     RDMA_REMOVE_CLOSE,
>> +     /* Driver is being hot-unplugged. This call should delete the
>> actual object itself */
>> +     RDMA_REMOVE_DRIVER_REMOVE,
>> +     /* Context is being cleaned-up, but commit was just completed */
>> +     RDMA_REMOVE_DURING_CLEANUP,
>> +};
>> +
>>  struct ib_rdmacg_object {
>>  #ifdef CONFIG_CGROUP_RDMA
>>       struct rdma_cgroup      *cg;            /* owner rdma cgroup */
>> @@ -1379,6 +1390,13 @@ struct ib_ucontext {
>>       struct list_head        rwq_ind_tbl_list;
>>       int                     closing;
>>
>> +     /* locking the uobjects_list */
>> +     struct mutex            uobjects_lock;
>> +     struct list_head        uobjects;
>> +     /* protects cleanup process from other actions */
>> +     struct rw_semaphore     cleanup_rwsem;
>> +     enum rdma_remove_reason cleanup_reason;
>> +
>>       struct pid             *tgid;
>>  #ifdef CONFIG_INFINIBAND_ON_DEMAND_PAGING
>>       struct rb_root      umem_tree;
>> @@ -1409,8 +1427,11 @@ struct ib_uobject {
>>       int                     id;             /* index into kernel idr */
>>       struct kref             ref;
>>       struct rw_semaphore     mutex;          /* protects .live */
>> +     atomic_t                usecnt;         /* protects exclusive access
>> */
>>       struct rcu_head         rcu;            /* kfree_rcu() overhead */
>>       int                     live;
>> +
>> +     const struct uverbs_obj_type *type;
>>  };
>>
>>  struct ib_udata {
>> diff --git a/include/rdma/uverbs_types.h b/include/rdma/uverbs_types.h
>> new file mode 100644
>> index 0000000..0777e40
>> --- /dev/null
>> +++ b/include/rdma/uverbs_types.h
>> @@ -0,0 +1,132 @@
>> +/*
>> + * Copyright (c) 2017, Mellanox Technologies inc.  All rights
>> reserved.
>> + *
>> + * This software is available to you under a choice of one of two
>> + * licenses.  You may choose to be licensed under the terms of the
>> GNU
>> + * General Public License (GPL) Version 2, available from the file
>> + * COPYING in the main directory of this source tree, or the
>> + * OpenIB.org BSD license below:
>> + *
>> + *     Redistribution and use in source and binary forms, with or
>> + *     without modification, are permitted provided that the
>> following
>> + *     conditions are met:
>> + *
>> + *      - Redistributions of source code must retain the above
>> + *        copyright notice, this list of conditions and the following
>> + *        disclaimer.
>> + *
>> + *      - Redistributions in binary form must reproduce the above
>> + *        copyright notice, this list of conditions and the following
>> + *        disclaimer in the documentation and/or other materials
>> + *        provided with the distribution.
>> + *
>> + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
>> + * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
>> + * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
>> + * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT
>> HOLDERS
>> + * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
>> + * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
>> + * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
>> + * SOFTWARE.
>> + */
>> +
>> +#ifndef _UVERBS_TYPES_
>> +#define _UVERBS_TYPES_
>> +
>> +#include <linux/kernel.h>
>> +#include <rdma/ib_verbs.h>
>> +
>> +struct uverbs_obj_type;
>> +
>> +struct uverbs_obj_type_class {
>> +     /*
>> +      * Get an ib_uobject that corresponds to the given id from
>> ucontext,
>> +      * These functions could create or destroy objects if required.
>> +      * The action will be finalized only when commit, abort or put
>> fops are
>> +      * called.
>> +      * The flow of the different actions is:
>> +      * [alloc]:      Starts with alloc_begin. The handlers logic is than
>> +      *               executed. If the handler is successful,
>> alloc_commit
>> +      *               is called and the object is inserted to the
>> repository.
>> +      *               Once alloc_commit completes the object is visible
>> to
>> +      *               other threads and userspace.
>> +      *               Otherwise, alloc_abort is called and the object is
>> +      *               destroyed.
>> +      * [lookup]:     Starts with lookup_get which fetches and
>> locks the
>> +      *               object. After the handler finished using the
>> object, it
>> +      *               needs to call lookup_put to unlock it. The write
>> flag
>> +      *               indicates if the object is locked for exclusive
>> access.
>> +      * [remove]:     Starts with lookup_get with write flag set.
>> This locks
>> +      *               the object for exclusive access. If the handler
>> code
>> +      *               completed successfully, remove_commit is called and
>> +      *               the ib_uobject is removed from the context's
>> uobjects
>> +      *               repository and put. The object itself is destroyed
>> as
>> +      *               well. Once remove succeeds new krefs to the object
>> +      *               cannot be acquired by other threads or userspace
>> and
>> +      *               the hardware driver is removed from the object.
>> +      *               Other krefs on the object may still exist.
>> +      *               If the handler code failed, lookup_put should be
>> +      *               called. This callback is used when the context
>> +      *               is destroyed as well (process termination,
>> +      *               reset flow).
>> +      */
>> +     struct ib_uobject *(*alloc_begin)(const struct uverbs_obj_type
>> *type,
>> +                                       struct ib_ucontext *ucontext);
>> +     void (*alloc_commit)(struct ib_uobject *uobj);
>> +     void (*alloc_abort)(struct ib_uobject *uobj);
>> +
>> +     struct ib_uobject *(*lookup_get)(const struct uverbs_obj_type
>> *type,
>> +                                      struct ib_ucontext *ucontext, int id,
>> +                                      bool write);
>> +     void (*lookup_put)(struct ib_uobject *uobj, bool write);
>
> Rather than passing in a write/exclusive flag to a bunch of different calls, why not just have separate calls?  E.g. get_shared/put_shared, get_excl/put_excl?
>

Actually, only two functions take the "exclusive" flag: lookup_get and
lookup_put. With respect to the idr/fd class types, this flag is currently
only used by the fd class in order to forbid exclusive access. I don't think
that justifies another set of _excl and _shared callbacks. Maybe, instead of
having these callbacks, we could add an .allow_exclusive flag on the type
itself.

Regarding the rdma_lookup_get/put_uobject APIs, we could consider splitting
them into two separate functions. However, they are so similar that I think
sharing the code might be better than having two separate calls. Trying to
sketch this up looks like:

struct ib_uobject *rdma_lookup_get_uobject_excl(const struct
uverbs_obj_type *type,
                                                struct ib_ucontext *ucontext,
                                                int id)
{
        struct ib_uobject *uobj;
        int ret;

        if (!type->type_class->allows_exclusive_access)
                return ERR_PTR(-EINVAL);

        uobj = type->type_class->lookup_get(type, ucontext, id);
        if (IS_ERR(uobj))
                return uobj;

        if (uobj->type != type) {
                ret = -EINVAL;
                goto free;
        }

        ret = uverbs_try_lock_object_excl(uobj);
        if (ret) {
                WARN(ucontext->cleanup_reason,
                     "ib_uverbs: Trying to lookup_get while cleanup context\n");
                goto free;
        }

        return uobj;
free:
        uobj->type->type_class->lookup_put(uobj);
        uverbs_uobject_put(uobj);
        return ERR_PTR(ret);
}

struct ib_uobject *rdma_lookup_get_uobject_shared(const struct
uverbs_obj_type *type,
                                                  struct ib_ucontext *ucontext,
                                                  int id)
{
        struct ib_uobject *uobj;
        int ret;

        uobj = type->type_class->lookup_get(type, ucontext, id);
        if (IS_ERR(uobj))
                return uobj;

        if (uobj->type != type) {
                ret = -EINVAL;
                goto free;
        }

        ret = uverbs_try_lock_object_shared(uobj);
        if (ret) {
                WARN(ucontext->cleanup_reason,
                     "ib_uverbs: Trying to lookup_get while cleanup context\n");
                goto free;
        }

        return uobj;
free:
        uobj->type->type_class->lookup_put(uobj);
        uverbs_uobject_put(uobj);
        return ERR_PTR(ret);
}

>> +     /*
>> +      * Must be called with the write lock held. If successful uobj
>> is
>> +      * invalid on return. On failure uobject is left completely
>> +      * unchanged
>> +      */
>> +     int __must_check (*remove_commit)(struct ib_uobject *uobj,
>> +                                       enum rdma_remove_reason why);
>
> Or add matching remove_begin()/remove_abort() calls.
>

Not sure we really need them. They're identical (functionality wise) to
callbacks we already have. If you think about it, remove requires exclusive
access. When we do that lookup_get, we have no idea if remove is going to
succeed. Since remove can fail, it's possible that after the removal attempt
the only thing we'll need is to put this object (unlock its state). So
lookup_get and lookup_put cover this logic exactly. For documentation
purposes, we can wrap them with macros or inline functions, but I'm not sure
that is even necessary or makes the code more readable.

>> +     u8    needs_kfree_rcu;
>> +};
>> +
>> +struct uverbs_obj_type {
>> +     const struct uverbs_obj_type_class * const type_class;
>> +     size_t       obj_size;
>> +     unsigned int destroy_order;
>> +};
>> +
>> +/*
>> + * Objects type classes which support a detach state (object is still
>> alive but
>> + * it's not attached to any context need to make sure:
>> + * (a) no call through to a driver after a detach is called
>> + * (b) detach isn't called concurrently with context_cleanup
>> + */
>> +
>> +struct uverbs_obj_idr_type {
>> +     /*
>> +      * In idr based objects, uverbs_obj_type_class points to a
>> generic
>> +      * idr operations. In order to specialize the underlying types
>> (e.g. CQ,
>> +      * QPs, etc.), we add destroy_object specific callbacks.
>> +      */
>> +     struct uverbs_obj_type  type;
>> +
>> +     /* Free driver resources from the uobject, make the driver
>> uncallable,
>> +      * and move the uobject to the detached state. If the object was
>> +      * destroyed by the user's request, a failure should leave the
>> uobject
>> +      * completely unchanged.
>> +      */
>> +     int __must_check (*destroy_object)(struct ib_uobject *uobj,
>> +                                        enum rdma_remove_reason why);
>> +};
>> +
>> +struct ib_uobject *rdma_lookup_get_uobject(const struct
>> uverbs_obj_type *type,
>> +                                        struct ib_ucontext *ucontext,
>> +                                        int id, bool write);
>> +void rdma_lookup_put_uobject(struct ib_uobject *uobj, bool write);
>> +struct ib_uobject *rdma_alloc_begin_uobject(const struct
>> uverbs_obj_type *type,
>> +                                         struct ib_ucontext *ucontext);
>> +void rdma_alloc_abort_uobject(struct ib_uobject *uobj);
>> +int __must_check rdma_remove_commit_uobject(struct ib_uobject *uobj);
>> +int rdma_alloc_commit_uobject(struct ib_uobject *uobj);
>> +
>> +#endif
>
> In general, this code requires a lot of in-function commenting, which suggests complexity.  The general approach seems reasonable based on what I've read so far.
>

Thanks for the review.
Regarding the simple cosmetic changes - If Doug prefers, I can send
another round. However, I think integrating these patches so people
could start evaluating them, and sending these cosmetic changes as
fixups, is preferable.

> - Sean

Matan

> --
> To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
> the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH V3 for-next 1/7] IB/core: Refactor idr to be per uverbs_file
       [not found]         ` <1828884A29C6694DAF28B7E6B8A82373AB10F3F4-P5GAC/sN6hkd3b2yrw5b5LfspsVTdybXVpNB7YpNyf8@public.gmane.org>
@ 2017-04-05 10:56           ` Matan Barak
  0 siblings, 0 replies; 25+ messages in thread
From: Matan Barak @ 2017-04-05 10:56 UTC (permalink / raw)
  To: Hefty, Sean
  Cc: Matan Barak, Doug Ledford, linux-rdma-u79uwXL29TY76Z2rM5mHXA,
	Liran Liss, Jason Gunthorpe, Leon Romanovsky, Majd Dibbiny,
	Tal Alon, Yishai Hadas, Weiny, Ira, Haggai Eran,
	Christoph Lameter

On Tue, Apr 4, 2017 at 8:33 PM, Hefty, Sean <sean.hefty-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org> wrote:
>> The current code creates an idr per type. Since types are currently
>> common for all drivers and known in advance, this was good enough.
>> However, the proposed ioctl based infrastructure allows each driver
>> to declare only some of the common types and declare its own specific
>> types.
>>
>> Thus, we decided to implement idr to be per uverbs_file.
>>
>> Signed-off-by: Matan Barak <matanb-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
>> Signed-off-by: Leon Romanovsky <leonro-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
>> Signed-off-by: Haggai Eran <haggaie-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
>> ---
>
> Reviewed-by: Sean Hefty <sean.hefty-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>

Thanks


^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH V3 for-next 2/7] IB/core: Add support for idr types
       [not found]             ` <CAAKD3BD=dM8B+bnGu_DTR220wWeo2ce2Sgoy1WwBpUYs6XHoQA-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2017-04-05 15:50               ` Jason Gunthorpe
  2017-04-05 17:33               ` Doug Ledford
  1 sibling, 0 replies; 25+ messages in thread
From: Jason Gunthorpe @ 2017-04-05 15:50 UTC (permalink / raw)
  To: Matan Barak
  Cc: Hefty, Sean, Matan Barak, Doug Ledford,
	linux-rdma-u79uwXL29TY76Z2rM5mHXA, Liran Liss, Leon Romanovsky,
	Majd Dibbiny, Tal Alon, Yishai Hadas, Weiny, Ira, Haggai Eran,
	Christoph Lameter

On Wed, Apr 05, 2017 at 01:55:22PM +0300, Matan Barak wrote:
> >> +struct ib_uobject *rdma_lookup_get_uobject(const struct
> >> uverbs_obj_type *type,
> >> +                                        struct ib_ucontext *ucontext,
> >> +                                        int id, bool write)
> >> +{
> >> +     struct ib_uobject *uobj;
> >> +     int ret;
> >> +
> >> +     uobj = type->type_class->lookup_get(type, ucontext, id, write);
> >> +     if (IS_ERR(uobj))
> >> +             return uobj;
> >> +
> >> +     if (uobj->type != type) {
> >> +             ret = -EINVAL;
> >> +             goto free;
> >> +     }
> >> +
> >> +     ret = uverbs_try_lock_object(uobj, write);
> >> +     if (ret) {
> >> +             WARN(ucontext->cleanup_reason,
> >> +                  "ib_uverbs: Trying to lookup_get while cleanup
> >> context\n");
> >> +             goto free;
> >> +     }
> >> +
> >> +     return uobj;
> >> +free:
> >> +     uobj->type->type_class->lookup_put(uobj, write);
> >> +     uverbs_uobject_put(uobj);
> >
> > There's an unexpected asymmetry here.  lookup_get is pairing with lookup_put + uobject_put.
> >
> 
> lookup_get also calls uverbs_uobject_get. It's done in the idr/fd's
> callback, as sometimes we need to wrap it in rcu (or some other
> equivalent mechanism). In the previous version, it was more
> symmetrical but Jason suggested simplicity over symmetry and I think
> it looks better this way.

The real problem here is that we have 'rdma_lookup_put' and
'uverbs_uobject_put' with very similar names, which is very confusing.

Do we really need to have lookup_put at all? This is only to hold on
to the 'struct file *' across the lookup, which doesn't seem
important.

I suspect we can simplify this by eliminating the implicit fget held
by lookup_get and instead use an accessor to access the 'struct file
*' in the few places that need to do that:

  struct file *uverbs_get_file(struct ib_uobject *object)

We don't really care about the ordering here, if a caller does

 uobj = rdma_lookup_get_uobject(...);
 filp = uverbs_get_file(uobj);
 fput(filp);
 uverbs_uobject_put(uobj);

If filp is NULL because it raced with close(), we can cope with it
just fine.

With this approach we could get rid of the confusing rdma_lookup_put
entirely.
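A hedged sketch of that accessor pattern follows. The names (`uverbs_get_file_model`, `put_file_model`) and the plain-integer refcount are stand-ins for illustration; the real kernel code would need RCU (or an equivalent) around the pointer read and a proper get_file()/fput() pair.

```c
#include <stddef.h>
#include <assert.h>

/* Toy stand-in for 'struct file' with a plain refcount. */
struct file_model {
	int refcnt;
};

/* FD-based uobject: close() clears the filp pointer. */
struct uobj_fd_model {
	struct file_model *filp;
};

/* Accessor replacing the implicit fget held by lookup_get: it may
 * observe a concurrent close() and return NULL. */
static struct file_model *uverbs_get_file_model(struct uobj_fd_model *uobj)
{
	struct file_model *f = uobj->filp;	/* in-kernel: RCU-protected read */

	if (f)
		f->refcnt++;
	return f;	/* NULL if it raced with close() */
}

static void put_file_model(struct file_model *f)
{
	if (f)		/* callers tolerate the NULL returned on a lost race */
		f->refcnt--;
}
```

The point of the sketch is that a NULL return is an acceptable outcome for the caller, which is what lets rdma_lookup_put go away entirely.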

> >> +      */
> >> +     struct ib_uobject *(*alloc_begin)(const struct uverbs_obj_type
> >> *type,
> >> +                                       struct ib_ucontext *ucontext);
> >> +     void (*alloc_commit)(struct ib_uobject *uobj);
> >> +     void (*alloc_abort)(struct ib_uobject *uobj);
> >> +
> >> +     struct ib_uobject *(*lookup_get)(const struct uverbs_obj_type
> >> *type,
> >> +                                      struct ib_ucontext *ucontext, int id,
> >> +                                      bool write);
> >> +     void (*lookup_put)(struct ib_uobject *uobj, bool write);
> >
> > Rather than passing in a write/exclusive flag to a bunch of different calls, why not just have separate calls?  E.g. get_shared/put_shared, get_excl/put_excl?
> >
> 
> Actually, there are only two functions which take the "exclusive"
> flag: lookup_get and lookup_put.
> Currently, with respect to the idr/fd type classes, this flag is
> only used by fd in order to forbid exclusive access.

Why doesn't uverbs_try_lock_object work with FDs? I understand that we
don't use it right now, but that doesn't seem to explain why we
couldn't.

try_lock_object for an FD could hold the filp and the refcount?

> I don't think that justifies another set of _excl and _shared
> callbacks. Maybe, instead of having these callbacks,
> we could add an .allow_exclusive flag on the type itself.

Yes, that is nicer if we need this.

Jason

^ permalink raw reply	[flat|nested] 25+ messages in thread

* RE: [PATCH V3 for-next 3/7] IB/core: Add idr based standard types
       [not found]     ` <1491301907-32290-4-git-send-email-matanb-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
@ 2017-04-05 17:05       ` Hefty, Sean
       [not found]         ` <1828884A29C6694DAF28B7E6B8A82373AB10F97B-P5GAC/sN6hkd3b2yrw5b5LfspsVTdybXVpNB7YpNyf8@public.gmane.org>
  0 siblings, 1 reply; 25+ messages in thread
From: Hefty, Sean @ 2017-04-05 17:05 UTC (permalink / raw)
  To: Matan Barak, Doug Ledford
  Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA, Liran Liss, Jason Gunthorpe,
	Leon Romanovsky, Majd Dibbiny, Tal Alon, Yishai Hadas, Weiny,
	Ira, Haggai Eran, Christoph Lameter

> +const struct uverbs_obj_idr_type uverbs_type_attrs_cq = {
> +	.type = UVERBS_TYPE_ALLOC_IDR_SZ(sizeof(struct ib_ucq_object),
> 0),
> +	.destroy_object = uverbs_free_cq,
> +};
> +
> +const struct uverbs_obj_idr_type uverbs_type_attrs_qp = {
> +	.type = UVERBS_TYPE_ALLOC_IDR_SZ(sizeof(struct ib_uqp_object),
> 0),
> +	.destroy_object = uverbs_free_qp,
> +};
> +
> +const struct uverbs_obj_idr_type uverbs_type_attrs_mw = {
> +	.type = UVERBS_TYPE_ALLOC_IDR(0),
> +	.destroy_object = uverbs_free_mw,
> +};
> +
> +const struct uverbs_obj_idr_type uverbs_type_attrs_mr = {
> +	/* 1 is used in order to free the MR after all the MWs */
> +	.type = UVERBS_TYPE_ALLOC_IDR(1),
> +	.destroy_object = uverbs_free_mr,
> +};
> +
> +const struct uverbs_obj_idr_type uverbs_type_attrs_srq = {
> +	.type = UVERBS_TYPE_ALLOC_IDR_SZ(sizeof(struct ib_usrq_object),
> 0),
> +	.destroy_object = uverbs_free_srq,
> +};
> +
> +const struct uverbs_obj_idr_type uverbs_type_attrs_ah = {
> +	.type = UVERBS_TYPE_ALLOC_IDR(0),
> +	.destroy_object = uverbs_free_ah,
> +};
> +
> +const struct uverbs_obj_idr_type uverbs_type_attrs_flow = {
> +	.type = UVERBS_TYPE_ALLOC_IDR(0),
> +	.destroy_object = uverbs_free_flow,
> +};
> +
> +const struct uverbs_obj_idr_type uverbs_type_attrs_wq = {
> +	.type = UVERBS_TYPE_ALLOC_IDR_SZ(sizeof(struct ib_uwq_object),
> 0),
> +	.destroy_object = uverbs_free_wq,
> +};
> +
> +const struct uverbs_obj_idr_type uverbs_type_attrs_rwq_ind_table = {
> +	.type = UVERBS_TYPE_ALLOC_IDR(0),
> +	.destroy_object = uverbs_free_rwq_ind_tbl,
> +};
> +
> +const struct uverbs_obj_idr_type uverbs_type_attrs_xrcd = {
> +	.type = UVERBS_TYPE_ALLOC_IDR_SZ(sizeof(struct ib_uxrcd_object),
> 0),
> +	.destroy_object = uverbs_free_xrcd,
> +};
> +
> +const struct uverbs_obj_idr_type uverbs_type_attrs_pd = {
> +	/* 2 is used in order to free the PD after MRs */
> +	.type = UVERBS_TYPE_ALLOC_IDR(2),
> +	.destroy_object = uverbs_free_pd,
> +};

I wonder if it wouldn't make more sense to have destroy order be independent from object creation assumptions.  For example, QPs must be destroyed prior to their associated CQs.  This code works, since the CQ is passed in during QP creation and ends up on the list correctly, but I wonder if all hardware would actually need that restriction.  Basically, the destroy order value as used is not actually capturing the true destroy order; there are other assumptions baked into the code.
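As a user-space model of the ordering those values encode (MW=0 freed before MR=1 freed before PD=2, from the quoted type definitions): the struct and function names here are illustrative only, and the kernel's actual cleanup walks the per-context list in order-numbered passes rather than sorting it.

```c
#include <stdlib.h>
#include <string.h>
#include <assert.h>

/* Toy per-context entry carrying the type's destroy_order. */
struct uobj_entry {
	const char   *name;
	unsigned int  destroy_order;
};

static int by_destroy_order(const void *a, const void *b)
{
	const struct uobj_entry *x = a, *y = b;

	/* Lower destroy_order is released first (MW before MR before PD).
	 * Orders here are distinct, so qsort's instability doesn't matter. */
	return (int)x->destroy_order - (int)y->destroy_order;
}

/* Arrange the context's objects so lower destroy_order is freed first. */
static void cleanup_sort(struct uobj_entry *objs, size_t n)
{
	qsort(objs, n, sizeof(*objs), by_destroy_order);
}
```

Sean's point stands even in this model: the QP-before-CQ requirement never shows up in destroy_order at all; it falls out of list position, so the numeric order is not the whole story.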

Otherwise:

Reviewed-by: Sean Hefty <sean.hefty-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH V3 for-next 2/7] IB/core: Add support for idr types
       [not found]             ` <CAAKD3BD=dM8B+bnGu_DTR220wWeo2ce2Sgoy1WwBpUYs6XHoQA-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
  2017-04-05 15:50               ` Jason Gunthorpe
@ 2017-04-05 17:33               ` Doug Ledford
       [not found]                 ` <1491413639.2923.0.camel-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
  1 sibling, 1 reply; 25+ messages in thread
From: Doug Ledford @ 2017-04-05 17:33 UTC (permalink / raw)
  To: Matan Barak, Hefty, Sean
  Cc: Matan Barak, linux-rdma-u79uwXL29TY76Z2rM5mHXA, Liran Liss,
	Jason Gunthorpe, Leon Romanovsky, Majd Dibbiny, Tal Alon,
	Yishai Hadas, Weiny, Ira, Haggai Eran, Christoph Lameter

On Wed, 2017-04-05 at 13:55 +0300, Matan Barak wrote:
> Thanks for the review.
> Regarding the simple cosmetic changes - If Doug prefers, I can send
> another round. However, I think integrating these patches so people
> could start evaluating them, and sending these cosmetic changes as
> fixups, is preferable.

I agree, I've pulled in the v3 patchset.  I'll push it out today and
further changes can be incremental.

-- 
Doug Ledford <dledford-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
    GPG KeyID: B826A3330E572FDD
   
Key fingerprint = AE6B 1BDA 122B 23B4 265B  1274 B826 A333 0E57 2FDD


^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH V3 for-next 2/7] IB/core: Add support for idr types
       [not found]                 ` <1491413639.2923.0.camel-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
@ 2017-04-05 17:49                   ` Leon Romanovsky
       [not found]                     ` <20170405174943.GI20443-U/DQcQFIOTAAJjI8aNfphQ@public.gmane.org>
  0 siblings, 1 reply; 25+ messages in thread
From: Leon Romanovsky @ 2017-04-05 17:49 UTC (permalink / raw)
  To: Doug Ledford
  Cc: Matan Barak, Hefty, Sean, Matan Barak,
	linux-rdma-u79uwXL29TY76Z2rM5mHXA, Liran Liss, Jason Gunthorpe,
	Majd Dibbiny, Tal Alon, Yishai Hadas, Weiny, Ira, Haggai Eran,
	Christoph Lameter


On Wed, Apr 05, 2017 at 01:33:59PM -0400, Doug Ledford wrote:
> On Wed, 2017-04-05 at 13:55 +0300, Matan Barak wrote:
> > Thanks for the review.
> > Regarding the simple cosmetic changes - If Doug prefers, I can send
> > another round. However, I think integrating these patches so people
> > could start evaluating them, and sending these cosmetic changes as
> > fixups, is preferable.
>
> I agree, I've pulled in the v3 patchset.  I'll push it out today and
> further changes can be incremental.

Thanks Doug,

Are you going to merge it into your for-4.12 branch? I need to know it,
so I'll be able to properly build our development branches.

>
> --
> Doug Ledford <dledford-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
>     GPG KeyID: B826A3330E572FDD
>    
> Key fingerprint = AE6B 1BDA 122B 23B4 265B  1274 B826 A333 0E57 2FDD
>


^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH V3 for-next 2/7] IB/core: Add support for idr types
       [not found]                     ` <20170405174943.GI20443-U/DQcQFIOTAAJjI8aNfphQ@public.gmane.org>
@ 2017-04-05 18:51                       ` Doug Ledford
  0 siblings, 0 replies; 25+ messages in thread
From: Doug Ledford @ 2017-04-05 18:51 UTC (permalink / raw)
  To: Leon Romanovsky
  Cc: Matan Barak, Hefty, Sean, Matan Barak,
	linux-rdma-u79uwXL29TY76Z2rM5mHXA, Liran Liss, Jason Gunthorpe,
	Majd Dibbiny, Tal Alon, Yishai Hadas, Weiny, Ira, Haggai Eran,
	Christoph Lameter

On Wed, 2017-04-05 at 20:49 +0300, Leon Romanovsky wrote:
> On Wed, Apr 05, 2017 at 01:33:59PM -0400, Doug Ledford wrote:
> > 
> > On Wed, 2017-04-05 at 13:55 +0300, Matan Barak wrote:
> > > 
> > > Thanks for the review.
> > > Regarding the simple cosmetic changes - If Doug prefers, I can
> > > send
> > > another round. However, I think integrating these patches so
> > > people
> > > could start evaluating them, and sending these cosmetic changes
> > > as fixups, is preferable.
> > 
> > I agree, I've pulled in the v3 patchset.  I'll push it out today
> > and
> > further changes can be incremental.
> 
> Thanks Doug,
> 
> Are you going to merge it into your for-4.12 branch? I need to know
> it,
> so I'll be able to properly build our development branches.

Well, I can't merge it into that branch because that's where I was when
I applied the patches ;-)

They've been pushed out, so you should be able to see what you need.

-- 
Doug Ledford <dledford-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
    GPG KeyID: B826A3330E572FDD
   
Key fingerprint = AE6B 1BDA 122B 23B4 265B  1274 B826 A333 0E57 2FDD


^ permalink raw reply	[flat|nested] 25+ messages in thread

* RE: [PATCH V3 for-next 4/7] IB/core: Change idr objects to use the new schema
       [not found]     ` <1491301907-32290-5-git-send-email-matanb-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
@ 2017-04-05 21:05       ` Hefty, Sean
       [not found]         ` <1828884A29C6694DAF28B7E6B8A82373AB10FAD8-P5GAC/sN6hkd3b2yrw5b5LfspsVTdybXVpNB7YpNyf8@public.gmane.org>
  0 siblings, 1 reply; 25+ messages in thread
From: Hefty, Sean @ 2017-04-05 21:05 UTC (permalink / raw)
  To: Matan Barak, Doug Ledford
  Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA, Liran Liss, Jason Gunthorpe,
	Leon Romanovsky, Majd Dibbiny, Tal Alon, Yishai Hadas, Weiny,
	Ira, Haggai Eran, Christoph Lameter

Mostly questions.

> @@ -1628,7 +1148,7 @@ ssize_t ib_uverbs_resize_cq(struct
> ib_uverbs_file *file,
>  		   (unsigned long) cmd.response + sizeof resp,
>  		   in_len - sizeof cmd, out_len - sizeof resp);
> 
> -	cq = idr_read_cq(cmd.cq_handle, file->ucontext, 0);
> +	cq = uobj_get_obj_read(cq, cmd.cq_handle, file->ucontext);

(I noticed rereg_mr used a write lock.)
Is a read lock sufficient here?

>  	if (!cq)
>  		return -EINVAL;
> 

> @@ -2399,7 +1896,7 @@ static int modify_qp(struct ib_uverbs_file
> *file,
>  	if (!attr)
>  		return -ENOMEM;
> 
> -	qp = idr_read_qp(cmd->base.qp_handle, file->ucontext);
> +	qp = uobj_get_obj_read(qp, cmd->base.qp_handle, file->ucontext);

And here? (another below)

>  	if (!qp) {
>  		ret = -EINVAL;
>  		goto out;
> @@ -2471,7 +1968,7 @@ static int modify_qp(struct ib_uverbs_file
> *file,
>  	}
> 
>  release_qp:
> -	put_qp_read(qp);
> +	uobj_put_obj_read(qp);
> 
>  out:
>  	kfree(attr);
> @@ -2558,42 +2055,27 @@ ssize_t ib_uverbs_destroy_qp(struct
> ib_uverbs_file *file,
> 
>  	memset(&resp, 0, sizeof resp);
> 
> -	uobj = idr_write_uobj(cmd.qp_handle, file->ucontext);
> -	if (!uobj)
> -		return -EINVAL;
> +	uobj  = uobj_get_write(uobj_get_type(qp), cmd.qp_handle,
> +			       file->ucontext);
> +	if (IS_ERR(uobj))
> +		return PTR_ERR(uobj);
> +
>  	qp  = uobj->object;
>  	obj = container_of(uobj, struct ib_uqp_object, uevent.uobject);
> +	/*
> +	 * Make sure we don't free the memory in remove_commit as we
> still
> +	 * needs the uobject memory to create the response.
> +	 */
> +	uverbs_uobject_get(uobj);

As an alternative, you could pass a parameter into the destroy calls, which could carry the information needed to write out the results.  (There are 3-4 other places with a similar structure.)

> 
> -	if (!list_empty(&obj->mcast_list)) {
> -		put_uobj_write(uobj);
> -		return -EBUSY;
> -	}
> -
> -	ret = ib_destroy_qp(qp);
> -	if (!ret)
> -		uobj->live = 0;
> -
> -	put_uobj_write(uobj);
> -
> -	if (ret)
> +	ret = uobj_remove_commit(uobj);
> +	if (ret) {
> +		uverbs_uobject_put(uobj);
>  		return ret;
> -
> -	ib_rdmacg_uncharge(&uobj->cg_obj, ib_dev,
> RDMACG_RESOURCE_HCA_OBJECT);
> -
> -	if (obj->uxrcd)
> -		atomic_dec(&obj->uxrcd->refcnt);
> -
> -	idr_remove_uobj(uobj);
> -
> -	mutex_lock(&file->mutex);
> -	list_del(&uobj->list);
> -	mutex_unlock(&file->mutex);
> -
> -	ib_uverbs_release_uevent(file, &obj->uevent);
> +	}
> 
>  	resp.events_reported = obj->uevent.events_reported;
> -
> -	put_uobj(uobj);
> +	uverbs_uobject_put(uobj);
> 
>  	if (copy_to_user((void __user *) (unsigned long) cmd.response,
>  			 &resp, sizeof resp))

> @@ -3142,7 +2583,7 @@ ssize_t ib_uverbs_attach_mcast(struct
> ib_uverbs_file *file,
>  	if (copy_from_user(&cmd, buf, sizeof cmd))
>  		return -EFAULT;
> 
> -	qp = idr_write_qp(cmd.qp_handle, file->ucontext);
> +	qp = uobj_get_obj_read(qp, cmd.qp_handle, file->ucontext);

This converts an idr_write to get_obj_read...

>  	if (!qp)
>  		return -EINVAL;
> 
> @@ -3171,7 +2612,7 @@ ssize_t ib_uverbs_attach_mcast(struct
> ib_uverbs_file *file,
>  		kfree(mcast);
> 
>  out_put:
> -	put_qp_write(qp);
> +	uobj_put_obj_read(qp);
> 
>  	return ret ? ret : in_len;
>  }
> @@ -3190,16 +2631,16 @@ ssize_t ib_uverbs_detach_mcast(struct
> ib_uverbs_file *file,
>  	if (copy_from_user(&cmd, buf, sizeof cmd))
>  		return -EFAULT;
> 
> -	qp = idr_write_qp(cmd.qp_handle, file->ucontext);
> +	qp = uobj_get_obj_read(qp, cmd.qp_handle, file->ucontext);

Same here.  Are these changes correct?

>  	if (!qp)
>  		return -EINVAL;
> 
> +	obj = container_of(qp->uobject, struct ib_uqp_object,
> uevent.uobject);
> +
>  	ret = ib_detach_mcast(qp, (union ib_gid *) cmd.gid, cmd.mlid);
>  	if (ret)
>  		goto out_put;
> 
> -	obj = container_of(qp->uobject, struct ib_uqp_object,
> uevent.uobject);
> -
>  	list_for_each_entry(mcast, &obj->mcast_list, list)
>  		if (cmd.mlid == mcast->lid &&
>  		    !memcmp(cmd.gid, mcast->gid.raw, sizeof mcast-
> >gid.raw)) {
> @@ -3209,8 +2650,7 @@ ssize_t ib_uverbs_detach_mcast(struct
> ib_uverbs_file *file,
>  		}
> 
>  out_put:
> -	put_qp_write(qp);
> -
> +	uobj_put_obj_read(qp);
>  	return ret ? ret : in_len;
>  }

> @@ -3526,31 +2953,27 @@ int ib_uverbs_ex_destroy_wq(struct
> ib_uverbs_file *file,
>  		return -EOPNOTSUPP;
> 
>  	resp.response_length = required_resp_len;
> -	uobj = idr_write_uobj(cmd.wq_handle,
> -			      file->ucontext);
> -	if (!uobj)
> -		return -EINVAL;
> +	uobj  = uobj_get_write(uobj_get_type(wq), cmd.wq_handle,
> +			       file->ucontext);
> +	if (IS_ERR(uobj))
> +		return PTR_ERR(uobj);
> 
>  	wq = uobj->object;
>  	obj = container_of(uobj, struct ib_uwq_object, uevent.uobject);
> -	ret = ib_destroy_wq(wq);
> -	if (!ret)
> -		uobj->live = 0;
> +	/*
> +	 * Make sure we don't free the memory in remove_commit as we
> still
> +	 * needs the uobject memory to create the response.
> +	 */
> +	uverbs_uobject_get(uobj);
> 
> -	put_uobj_write(uobj);
> -	if (ret)
> +	ret = uobj_remove_commit(uobj);
> +	if (ret) {
> +		uverbs_uobject_put(uobj);
>  		return ret;
> +	}
> 
> -	idr_remove_uobj(uobj);
> -
> -	mutex_lock(&file->mutex);
> -	list_del(&uobj->list);
> -	mutex_unlock(&file->mutex);
> -
> -	ib_uverbs_release_uevent(file, &obj->uevent);
>  	resp.events_reported = obj->uevent.events_reported;
> -	put_uobj(uobj);
> -
> +	uverbs_uobject_put(uobj);

Nit: This call can move above the if (ret) check, with the duplicate call removed.

>  	ret = ib_copy_to_udata(ucore, &resp, resp.response_length);
>  	if (ret)
>  		return ret;
> @@ -3588,7 +3011,7 @@ int ib_uverbs_ex_modify_wq(struct ib_uverbs_file
> *file,
>  	if (cmd.attr_mask > (IB_WQ_STATE | IB_WQ_CUR_STATE |
> IB_WQ_FLAGS))
>  		return -EINVAL;
> 
> -	wq = idr_read_wq(cmd.wq_handle, file->ucontext);
> +	wq = uobj_get_obj_read(wq, cmd.wq_handle, file->ucontext);
>  	if (!wq)
>  		return -EINVAL;
> 

> @@ -4254,7 +3591,7 @@ ssize_t ib_uverbs_modify_srq(struct
> ib_uverbs_file *file,
>  	INIT_UDATA(&udata, buf + sizeof cmd, NULL, in_len - sizeof cmd,
>  		   out_len);
> 
> -	srq = idr_read_srq(cmd.srq_handle, file->ucontext);
> +	srq = uobj_get_obj_read(srq, cmd.srq_handle, file->ucontext);

Use write lock here?

>  	if (!srq)
>  		return -EINVAL;
> 
> @@ -4263,7 +3600,7 @@ ssize_t ib_uverbs_modify_srq(struct
> ib_uverbs_file *file,
> 
>  	ret = srq->device->modify_srq(srq, &attr, cmd.attr_mask,
> &udata);
> 
> -	put_srq_read(srq);
> +	uobj_put_obj_read(srq);
> 
>  	return ret ? ret : in_len;
>  }

- Sean

^ permalink raw reply	[flat|nested] 25+ messages in thread

* RE: [PATCH V3 for-next 4/7] IB/core: Change idr objects to use the new schema
       [not found]         ` <1828884A29C6694DAF28B7E6B8A82373AB10FAD8-P5GAC/sN6hkd3b2yrw5b5LfspsVTdybXVpNB7YpNyf8@public.gmane.org>
@ 2017-04-05 21:59           ` Hefty, Sean
  2017-04-06 14:13           ` Matan Barak
  1 sibling, 0 replies; 25+ messages in thread
From: Hefty, Sean @ 2017-04-05 21:59 UTC (permalink / raw)
  To: Hefty, Sean, Matan Barak, Doug Ledford
  Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA, Liran Liss, Jason Gunthorpe,
	Leon Romanovsky, Majd Dibbiny, Tal Alon, Yishai Hadas, Weiny,
	Ira, Haggai Eran, Christoph Lameter

> > @@ -3142,7 +2583,7 @@ ssize_t ib_uverbs_attach_mcast(struct
> > ib_uverbs_file *file,
> >  	if (copy_from_user(&cmd, buf, sizeof cmd))
> >  		return -EFAULT;
> >
> > -	qp = idr_write_qp(cmd.qp_handle, file->ucontext);
> > +	qp = uobj_get_obj_read(qp, cmd.qp_handle, file->ucontext);
> 
> This converts an idr_write to get_obj_read...
> 
> >  	if (!qp)
> >  		return -EINVAL;
> >
> > @@ -3171,7 +2612,7 @@ ssize_t ib_uverbs_attach_mcast(struct
> > ib_uverbs_file *file,
> >  		kfree(mcast);
> >
> >  out_put:
> > -	put_qp_write(qp);
> > +	uobj_put_obj_read(qp);
> >
> >  	return ret ? ret : in_len;
> >  }
> > @@ -3190,16 +2631,16 @@ ssize_t ib_uverbs_detach_mcast(struct
> > ib_uverbs_file *file,
> >  	if (copy_from_user(&cmd, buf, sizeof cmd))
> >  		return -EFAULT;
> >
> > -	qp = idr_write_qp(cmd.qp_handle, file->ucontext);
> > +	qp = uobj_get_obj_read(qp, cmd.qp_handle, file->ucontext);
> 
> Same here.  Are these changes correct?

I think your next patch addresses this comment.  :)

^ permalink raw reply	[flat|nested] 25+ messages in thread

* RE: [PATCH V3 for-next 7/7] IB/core: Change completion channel to use the reworked objects schema
       [not found]     ` <1491301907-32290-8-git-send-email-matanb-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
@ 2017-04-05 23:30       ` Hefty, Sean
       [not found]         ` <1828884A29C6694DAF28B7E6B8A82373AB10FBBF-P5GAC/sN6hkd3b2yrw5b5LfspsVTdybXVpNB7YpNyf8@public.gmane.org>
  0 siblings, 1 reply; 25+ messages in thread
From: Hefty, Sean @ 2017-04-05 23:30 UTC (permalink / raw)
  To: Matan Barak, Doug Ledford
  Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA, Liran Liss, Jason Gunthorpe,
	Leon Romanovsky, Majd Dibbiny, Tal Alon, Yishai Hadas, Weiny,
	Ira, Haggai Eran, Christoph Lameter

I need to study this more.  Just a couple of minor comments below.

> diff --git a/drivers/infiniband/core/uverbs.h
> b/drivers/infiniband/core/uverbs.h
> index 5f8a7f2..826f827 100644
> --- a/drivers/infiniband/core/uverbs.h
> +++ b/drivers/infiniband/core/uverbs.h
> @@ -102,17 +102,25 @@ struct ib_uverbs_device {
>  };
> 
>  struct ib_uverbs_event_file {
> -	struct kref				ref;
> -	int					is_async;
> -	struct ib_uverbs_file		       *uverbs_file;
>  	spinlock_t				lock;
>  	int					is_closed;
>  	wait_queue_head_t			poll_wait;
>  	struct fasync_struct		       *async_queue;
>  	struct list_head			event_list;
> +};

I would rename this structure to something like ib_uverbs_event_queue, since the file aspect has been removed, with corresponding name changes where it is used.

> +
> +struct ib_uverbs_async_event_file {
> +	struct ib_uverbs_event_file		ev_file;
> +	struct ib_uverbs_file		       *uverbs_file;
> +	struct kref				ref;
>  	struct list_head			list;
>  };
> 
> +struct ib_uverbs_completion_event_file {
> +	struct ib_uobject_file			uobj_file;
> +	struct ib_uverbs_event_file		ev_file;
> +};
> +
>  struct ib_uverbs_file {
>  	struct kref				ref;
>  	struct mutex				mutex;
> @@ -120,7 +128,7 @@ struct ib_uverbs_file {
>  	struct ib_uverbs_device		       *device;
>  	struct ib_ucontext		       *ucontext;
>  	struct ib_event_handler			event_handler;
> -	struct ib_uverbs_event_file	       *async_file;
> +	struct ib_uverbs_async_event_file       *async_file;
>  	struct list_head			list;
>  	int					is_closed;
> 

> @@ -253,10 +253,12 @@ void ib_uverbs_release_file(struct kref *ref)
>  	kfree(file);
>  }
> 
> -static ssize_t ib_uverbs_event_read(struct file *filp, char __user
> *buf,
> -				    size_t count, loff_t *pos)
> +static ssize_t ib_uverbs_event_read(struct ib_uverbs_event_file
> *file,
> +				    struct ib_uverbs_file *uverbs_file,
> +				    struct file *filp, char __user *buf,
> +				    size_t count, loff_t *pos,
> +				    bool is_async)
>  {
> -	struct ib_uverbs_event_file *file = filp->private_data;
>  	struct ib_uverbs_event *event;
>  	int eventsz;
>  	int ret = 0;
> @@ -275,12 +277,12 @@ static ssize_t ib_uverbs_event_read(struct file
> *filp, char __user *buf,
>  			 * and wake_up() guarentee this will see the null
> set
>  			 * without using RCU
>  			 */
> -					     !file->uverbs_file->device-
> >ib_dev)))
> +					     !uverbs_file->device->ib_dev)))
>  			return -ERESTARTSYS;
> 
>  		/* If device was disassociated and no event exists set an
> error */
>  		if (list_empty(&file->event_list) &&
> -		    !file->uverbs_file->device->ib_dev)
> +		    !uverbs_file->device->ib_dev)
>  			return -EIO;
> 
>  		spin_lock_irq(&file->lock);
> @@ -288,7 +290,7 @@ static ssize_t ib_uverbs_event_read(struct file
> *filp, char __user *buf,
> 
>  	event = list_entry(file->event_list.next, struct
> ib_uverbs_event, list);
> 
> -	if (file->is_async)
> +	if (is_async)
>  		eventsz = sizeof (struct ib_uverbs_async_event_desc);
>  	else
>  		eventsz = sizeof (struct ib_uverbs_comp_event_desc);

Consider adding an event size to ib_uverbs_event_file rather than assuming the size based on is_async. 
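That suggestion could look roughly like this. A hedged user-space sketch: the descriptor layouts below are local stand-ins modelled on the rdma uapi structs, and `event_queue_model` / `event_queue_init` are illustrative names, not existing kernel symbols.

```c
#include <stddef.h>
#include <stdint.h>
#include <assert.h>

/* Local stand-ins for the uapi event descriptors. */
struct async_event_desc {
	uint64_t element;
	uint32_t event_type;
	uint32_t reserved;
};

struct comp_event_desc {
	uint64_t cq_handle;
};

/* Per-queue state: the event size is fixed once at setup, so the read
 * path never needs to re-derive it from an is_async flag. */
struct event_queue_model {
	size_t eventsz;
};

static void event_queue_init(struct event_queue_model *q, int is_async)
{
	q->eventsz = is_async ? sizeof(struct async_event_desc)
			      : sizeof(struct comp_event_desc);
}
```

The read path would then just consume `q->eventsz` bytes per event, which also removes the is_async parameter threaded through ib_uverbs_event_read.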

> @@ -318,11 +320,31 @@ static ssize_t ib_uverbs_event_read(struct file
> *filp, char __user *buf,
>  	return ret;
>  }
> 
> -static unsigned int ib_uverbs_event_poll(struct file *filp,
> +static ssize_t ib_uverbs_async_event_read(struct file *filp, char
> __user *buf,
> +					  size_t count, loff_t *pos)
> +{
> +	struct ib_uverbs_async_event_file *file = filp->private_data;
> +
> +	return ib_uverbs_event_read(&file->ev_file, file->uverbs_file,
> filp,
> +				    buf, count, pos, true);
> +}
> +
> +static ssize_t ib_uverbs_comp_event_read(struct file *filp, char
> __user *buf,
> +					 size_t count, loff_t *pos)
> +{
> +	struct ib_uverbs_completion_event_file *comp_ev_file =
> +		filp->private_data;
> +
> +	return ib_uverbs_event_read(&comp_ev_file->ev_file,
> +				    comp_ev_file->uobj_file.ufile, filp,
> +				    buf, count, pos, false);
> +}

- Sean
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH V3 for-next 4/7] IB/core: Change idr objects to use the new schema
       [not found]         ` <1828884A29C6694DAF28B7E6B8A82373AB10FAD8-P5GAC/sN6hkd3b2yrw5b5LfspsVTdybXVpNB7YpNyf8@public.gmane.org>
  2017-04-05 21:59           ` Hefty, Sean
@ 2017-04-06 14:13           ` Matan Barak
       [not found]             ` <CAAKD3BCy_JD1cu=3ZHSbrXBHmeTj-M7pJ6nM=rRXFVMi6Szvwg-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
  1 sibling, 1 reply; 25+ messages in thread
From: Matan Barak @ 2017-04-06 14:13 UTC (permalink / raw)
  To: Hefty, Sean
  Cc: Matan Barak, Doug Ledford, linux-rdma-u79uwXL29TY76Z2rM5mHXA,
	Liran Liss, Jason Gunthorpe, Leon Romanovsky, Majd Dibbiny,
	Tal Alon, Yishai Hadas, Weiny, Ira, Haggai Eran,
	Christoph Lameter

On Thu, Apr 6, 2017 at 12:05 AM, Hefty, Sean <sean.hefty-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org> wrote:
> Mostly questions.
>
>> @@ -1628,7 +1148,7 @@ ssize_t ib_uverbs_resize_cq(struct
>> ib_uverbs_file *file,
>>                  (unsigned long) cmd.response + sizeof resp,
>>                  in_len - sizeof cmd, out_len - sizeof resp);
>>
>> -     cq = idr_read_cq(cmd.cq_handle, file->ucontext, 0);
>> +     cq = uobj_get_obj_read(cq, cmd.cq_handle, file->ucontext);
>
> (I noticed rereg_mr used a write lock.)
> Is a read lock sufficient here?
>

I guess we don't want to allow two concurrent resize_cq calls, so a
write lock would be better here. However, I don't want to change the
logic in these patches; I think such a change is better off being a
separate patch with a proper commit message. We don't hit these bugs
today, as there's no lock at the verbs layer, so device drivers
usually provide the locking themselves.

>>       if (!cq)
>>               return -EINVAL;
>>
>
>> @@ -2399,7 +1896,7 @@ static int modify_qp(struct ib_uverbs_file
>> *file,
>>       if (!attr)
>>               return -ENOMEM;
>>
>> -     qp = idr_read_qp(cmd->base.qp_handle, file->ucontext);
>> +     qp = uobj_get_obj_read(qp, cmd->base.qp_handle, file->ucontext);
>
> And here? (another below)
>

It's similar to the previous case, but a little more complicated.
It's valid to carry out two concurrent modify_qp calls if they change
different attributes from the same state to the same state. However,
this seems like a very esoteric case, so a write lock would be better
here.

>>       if (!qp) {
>>               ret = -EINVAL;
>>               goto out;
>> @@ -2471,7 +1968,7 @@ static int modify_qp(struct ib_uverbs_file
>> *file,
>>       }
>>
>>  release_qp:
>> -     put_qp_read(qp);
>> +     uobj_put_obj_read(qp);
>>
>>  out:
>>       kfree(attr);
>> @@ -2558,42 +2055,27 @@ ssize_t ib_uverbs_destroy_qp(struct
>> ib_uverbs_file *file,
>>
>>       memset(&resp, 0, sizeof resp);
>>
>> -     uobj = idr_write_uobj(cmd.qp_handle, file->ucontext);
>> -     if (!uobj)
>> -             return -EINVAL;
>> +     uobj  = uobj_get_write(uobj_get_type(qp), cmd.qp_handle,
>> +                            file->ucontext);
>> +     if (IS_ERR(uobj))
>> +             return PTR_ERR(uobj);
>> +
>>       qp  = uobj->object;
>>       obj = container_of(uobj, struct ib_uqp_object, uevent.uobject);
>> +     /*
>> +      * Make sure we don't free the memory in remove_commit as we
>> still
>> +      * needs the uobject memory to create the response.
>> +      */
>> +     uverbs_uobject_get(uobj);
>
> As an alternative, you could pass a parameter into the destroy calls, which could carry the information needed to write out the results.  (There are 3-4 other places with a similar structure.)
>

Yeah, that's another option. We could pass a __user pointer to the
free functions and, in case the destruction was requested by the
user, write out the required information there.
I think that, logically, writing responses should be part of the
handler's code. Looking a little further ahead, these free functions
will be called from the ioctl code as well.
The ioctl responses could be formed a little differently, and I don't
want to pass information about the response structure to the free
functions.
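The refcount dance in the destroy handlers quoted above can be sketched in plain user-space C (invented `model_*` names; `model_put()` plays the role of uverbs_uobject_put and the final free): the handler takes an extra reference before remove_commit so the uobject memory stays valid long enough to fill in events_reported for the response, then drops it.

```c
struct model_uobject {
	int refcnt;
	int events_reported;
	int freed;		/* for illustration; a real kfree() in the kernel */
};

static void model_get(struct model_uobject *u)
{
	u->refcnt++;
}

static void model_put(struct model_uobject *u)
{
	if (--u->refcnt == 0)
		u->freed = 1;
}

/* remove_commit drops the reference the handle table held */
static int model_remove_commit(struct model_uobject *u)
{
	model_put(u);
	return 0;
}

static int model_destroy_handler(struct model_uobject *u, int *resp_events)
{
	int ret;

	model_get(u);			/* keep the memory for the response */
	ret = model_remove_commit(u);
	if (ret) {
		model_put(u);
		return ret;
	}
	*resp_events = u->events_reported;	/* still safe to read */
	model_put(u);				/* now it may be freed */
	return 0;
}
```

Without the extra `model_get()`, the read of `events_reported` would be a use-after-free whenever remove_commit dropped the last reference.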

>>
>> -     if (!list_empty(&obj->mcast_list)) {
>> -             put_uobj_write(uobj);
>> -             return -EBUSY;
>> -     }
>> -
>> -     ret = ib_destroy_qp(qp);
>> -     if (!ret)
>> -             uobj->live = 0;
>> -
>> -     put_uobj_write(uobj);
>> -
>> -     if (ret)
>> +     ret = uobj_remove_commit(uobj);
>> +     if (ret) {
>> +             uverbs_uobject_put(uobj);
>>               return ret;
>> -
>> -     ib_rdmacg_uncharge(&uobj->cg_obj, ib_dev,
>> RDMACG_RESOURCE_HCA_OBJECT);
>> -
>> -     if (obj->uxrcd)
>> -             atomic_dec(&obj->uxrcd->refcnt);
>> -
>> -     idr_remove_uobj(uobj);
>> -
>> -     mutex_lock(&file->mutex);
>> -     list_del(&uobj->list);
>> -     mutex_unlock(&file->mutex);
>> -
>> -     ib_uverbs_release_uevent(file, &obj->uevent);
>> +     }
>>
>>       resp.events_reported = obj->uevent.events_reported;
>> -
>> -     put_uobj(uobj);
>> +     uverbs_uobject_put(uobj);
>>
>>       if (copy_to_user((void __user *) (unsigned long) cmd.response,
>>                        &resp, sizeof resp))
>
>> @@ -3142,7 +2583,7 @@ ssize_t ib_uverbs_attach_mcast(struct
>> ib_uverbs_file *file,
>>       if (copy_from_user(&cmd, buf, sizeof cmd))
>>               return -EFAULT;
>>
>> -     qp = idr_write_qp(cmd.qp_handle, file->ucontext);
>> +     qp = uobj_get_obj_read(qp, cmd.qp_handle, file->ucontext);
>
> This converts an idr_write to get_obj_read...
>

This is solved in the next commit, where we add another lock here. I
don't want to return -EBUSY to user-space if two concurrent
mcast_attach (or detach) calls are made.

>>       if (!qp)
>>               return -EINVAL;
>>
>> @@ -3171,7 +2612,7 @@ ssize_t ib_uverbs_attach_mcast(struct
>> ib_uverbs_file *file,
>>               kfree(mcast);
>>
>>  out_put:
>> -     put_qp_write(qp);
>> +     uobj_put_obj_read(qp);
>>
>>       return ret ? ret : in_len;
>>  }
>> @@ -3190,16 +2631,16 @@ ssize_t ib_uverbs_detach_mcast(struct
>> ib_uverbs_file *file,
>>       if (copy_from_user(&cmd, buf, sizeof cmd))
>>               return -EFAULT;
>>
>> -     qp = idr_write_qp(cmd.qp_handle, file->ucontext);
>> +     qp = uobj_get_obj_read(qp, cmd.qp_handle, file->ucontext);
>
> Same here.  Are these changes correct?
>

Yeah, the lock is added in the next patch.

>>       if (!qp)
>>               return -EINVAL;
>>
>> +     obj = container_of(qp->uobject, struct ib_uqp_object,
>> uevent.uobject);
>> +
>>       ret = ib_detach_mcast(qp, (union ib_gid *) cmd.gid, cmd.mlid);
>>       if (ret)
>>               goto out_put;
>>
>> -     obj = container_of(qp->uobject, struct ib_uqp_object,
>> uevent.uobject);
>> -
>>       list_for_each_entry(mcast, &obj->mcast_list, list)
>>               if (cmd.mlid == mcast->lid &&
>>                   !memcmp(cmd.gid, mcast->gid.raw, sizeof mcast-
>> >gid.raw)) {
>> @@ -3209,8 +2650,7 @@ ssize_t ib_uverbs_detach_mcast(struct
>> ib_uverbs_file *file,
>>               }
>>
>>  out_put:
>> -     put_qp_write(qp);
>> -
>> +     uobj_put_obj_read(qp);
>>       return ret ? ret : in_len;
>>  }
>
>> @@ -3526,31 +2953,27 @@ int ib_uverbs_ex_destroy_wq(struct
>> ib_uverbs_file *file,
>>               return -EOPNOTSUPP;
>>
>>       resp.response_length = required_resp_len;
>> -     uobj = idr_write_uobj(cmd.wq_handle,
>> -                           file->ucontext);
>> -     if (!uobj)
>> -             return -EINVAL;
>> +     uobj  = uobj_get_write(uobj_get_type(wq), cmd.wq_handle,
>> +                            file->ucontext);
>> +     if (IS_ERR(uobj))
>> +             return PTR_ERR(uobj);
>>
>>       wq = uobj->object;
>>       obj = container_of(uobj, struct ib_uwq_object, uevent.uobject);
>> -     ret = ib_destroy_wq(wq);
>> -     if (!ret)
>> -             uobj->live = 0;
>> +     /*
>> +      * Make sure we don't free the memory in remove_commit as we
>> still
>> +      * needs the uobject memory to create the response.
>> +      */
>> +     uverbs_uobject_get(uobj);
>>
>> -     put_uobj_write(uobj);
>> -     if (ret)
>> +     ret = uobj_remove_commit(uobj);
>> +     if (ret) {
>> +             uverbs_uobject_put(uobj);
>>               return ret;
>> +     }
>>
>> -     idr_remove_uobj(uobj);
>> -
>> -     mutex_lock(&file->mutex);
>> -     list_del(&uobj->list);
>> -     mutex_unlock(&file->mutex);
>> -
>> -     ib_uverbs_release_uevent(file, &obj->uevent);
>>       resp.events_reported = obj->uevent.events_reported;
>> -     put_uobj(uobj);
>> -
>> +     uverbs_uobject_put(uobj);
>
> Nit: This call can move above the if (ret) check above, with the duplicate call removed.
>

Sure, I'll do that.

>>       ret = ib_copy_to_udata(ucore, &resp, resp.response_length);
>>       if (ret)
>>               return ret;
>> @@ -3588,7 +3011,7 @@ int ib_uverbs_ex_modify_wq(struct ib_uverbs_file
>> *file,
>>       if (cmd.attr_mask > (IB_WQ_STATE | IB_WQ_CUR_STATE |
>> IB_WQ_FLAGS))
>>               return -EINVAL;
>>
>> -     wq = idr_read_wq(cmd.wq_handle, file->ucontext);
>> +     wq = uobj_get_obj_read(wq, cmd.wq_handle, file->ucontext);
>>       if (!wq)
>>               return -EINVAL;
>>
>
>> @@ -4254,7 +3591,7 @@ ssize_t ib_uverbs_modify_srq(struct
>> ib_uverbs_file *file,
>>       INIT_UDATA(&udata, buf + sizeof cmd, NULL, in_len - sizeof cmd,
>>                  out_len);
>>
>> -     srq = idr_read_srq(cmd.srq_handle, file->ucontext);
>> +     srq = uobj_get_obj_read(srq, cmd.srq_handle, file->ucontext);
>
> Use write lock here?
>

As with the other verbs you mentioned (modify_qp, resize_cq), this
makes sense, but I don't want to change the logic in this refactoring
patch. Like modify_qp, there's an esoteric case where changing
IB_SRQ_MAX_WR and IB_SRQ_LIMIT in two concurrent calls is currently
allowed and makes sense semantically.

>>       if (!srq)
>>               return -EINVAL;
>>
>> @@ -4263,7 +3600,7 @@ ssize_t ib_uverbs_modify_srq(struct
>> ib_uverbs_file *file,
>>
>>       ret = srq->device->modify_srq(srq, &attr, cmd.attr_mask,
>> &udata);
>>
>> -     put_srq_read(srq);
>> +     uobj_put_obj_read(srq);
>>
>>       return ret ? ret : in_len;
>>  }
>
> - Sean

Thanks for reviewing this code.

Matan



* Re: [PATCH V3 for-next 7/7] IB/core: Change completion channel to use the reworked objects schema
       [not found]         ` <1828884A29C6694DAF28B7E6B8A82373AB10FBBF-P5GAC/sN6hkd3b2yrw5b5LfspsVTdybXVpNB7YpNyf8@public.gmane.org>
@ 2017-04-06 14:14           ` Matan Barak
  0 siblings, 0 replies; 25+ messages in thread
From: Matan Barak @ 2017-04-06 14:14 UTC (permalink / raw)
  To: Hefty, Sean
  Cc: Matan Barak, Doug Ledford, linux-rdma-u79uwXL29TY76Z2rM5mHXA,
	Liran Liss, Jason Gunthorpe, Leon Romanovsky, Majd Dibbiny,
	Tal Alon, Yishai Hadas, Weiny, Ira, Haggai Eran,
	Christoph Lameter

On Thu, Apr 6, 2017 at 2:30 AM, Hefty, Sean <sean.hefty-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org> wrote:
> I need to study this more.  Just a couple of minor comments below.
>
>> diff --git a/drivers/infiniband/core/uverbs.h
>> b/drivers/infiniband/core/uverbs.h
>> index 5f8a7f2..826f827 100644
>> --- a/drivers/infiniband/core/uverbs.h
>> +++ b/drivers/infiniband/core/uverbs.h
>> @@ -102,17 +102,25 @@ struct ib_uverbs_device {
>>  };
>>
>>  struct ib_uverbs_event_file {
>> -     struct kref                             ref;
>> -     int                                     is_async;
>> -     struct ib_uverbs_file                  *uverbs_file;
>>       spinlock_t                              lock;
>>       int                                     is_closed;
>>       wait_queue_head_t                       poll_wait;
>>       struct fasync_struct                   *async_queue;
>>       struct list_head                        event_list;
>> +};
>
> I would rename this structure to something like ib_uverbs_event_queue, since the file aspect has been removed, with corresponding name changes where it is used.

Sure, the base structure could be renamed to ib_uverbs_event_queue.

>
>> +
>> +struct ib_uverbs_async_event_file {
>> +     struct ib_uverbs_event_file             ev_file;
>> +     struct ib_uverbs_file                  *uverbs_file;
>> +     struct kref                             ref;
>>       struct list_head                        list;
>>  };
>>
>> +struct ib_uverbs_completion_event_file {
>> +     struct ib_uobject_file                  uobj_file;
>> +     struct ib_uverbs_event_file             ev_file;
>> +};
>> +
>>  struct ib_uverbs_file {
>>       struct kref                             ref;
>>       struct mutex                            mutex;
>> @@ -120,7 +128,7 @@ struct ib_uverbs_file {
>>       struct ib_uverbs_device                *device;
>>       struct ib_ucontext                     *ucontext;
>>       struct ib_event_handler                 event_handler;
>> -     struct ib_uverbs_event_file            *async_file;
>> +     struct ib_uverbs_async_event_file       *async_file;
>>       struct list_head                        list;
>>       int                                     is_closed;
>>
>
>> @@ -253,10 +253,12 @@ void ib_uverbs_release_file(struct kref *ref)
>>       kfree(file);
>>  }
>>
>> -static ssize_t ib_uverbs_event_read(struct file *filp, char __user
>> *buf,
>> -                                 size_t count, loff_t *pos)
>> +static ssize_t ib_uverbs_event_read(struct ib_uverbs_event_file
>> *file,
>> +                                 struct ib_uverbs_file *uverbs_file,
>> +                                 struct file *filp, char __user *buf,
>> +                                 size_t count, loff_t *pos,
>> +                                 bool is_async)
>>  {
>> -     struct ib_uverbs_event_file *file = filp->private_data;
>>       struct ib_uverbs_event *event;
>>       int eventsz;
>>       int ret = 0;
>> @@ -275,12 +277,12 @@ static ssize_t ib_uverbs_event_read(struct file
>> *filp, char __user *buf,
>>                        * and wake_up() guarentee this will see the null
>> set
>>                        * without using RCU
>>                        */
>> -                                          !file->uverbs_file->device-
>> >ib_dev)))
>> +                                          !uverbs_file->device->ib_dev)))
>>                       return -ERESTARTSYS;
>>
>>               /* If device was disassociated and no event exists set an
>> error */
>>               if (list_empty(&file->event_list) &&
>> -                 !file->uverbs_file->device->ib_dev)
>> +                 !uverbs_file->device->ib_dev)
>>                       return -EIO;
>>
>>               spin_lock_irq(&file->lock);
>> @@ -288,7 +290,7 @@ static ssize_t ib_uverbs_event_read(struct file
>> *filp, char __user *buf,
>>
>>       event = list_entry(file->event_list.next, struct
>> ib_uverbs_event, list);
>>
>> -     if (file->is_async)
>> +     if (is_async)
>>               eventsz = sizeof (struct ib_uverbs_async_event_desc);
>>       else
>>               eventsz = sizeof (struct ib_uverbs_comp_event_desc);
>
> Consider adding an event size to ib_uverbs_event_file rather than assuming the size based on is_async.
>

Ok

>> @@ -318,11 +320,31 @@ static ssize_t ib_uverbs_event_read(struct file
>> *filp, char __user *buf,
>>       return ret;
>>  }
>>
>> -static unsigned int ib_uverbs_event_poll(struct file *filp,
>> +static ssize_t ib_uverbs_async_event_read(struct file *filp, char
>> __user *buf,
>> +                                       size_t count, loff_t *pos)
>> +{
>> +     struct ib_uverbs_async_event_file *file = filp->private_data;
>> +
>> +     return ib_uverbs_event_read(&file->ev_file, file->uverbs_file,
>> filp,
>> +                                 buf, count, pos, true);
>> +}
>> +
>> +static ssize_t ib_uverbs_comp_event_read(struct file *filp, char
>> __user *buf,
>> +                                      size_t count, loff_t *pos)
>> +{
>> +     struct ib_uverbs_completion_event_file *comp_ev_file =
>> +             filp->private_data;
>> +
>> +     return ib_uverbs_event_read(&comp_ev_file->ev_file,
>> +                                 comp_ev_file->uobj_file.ufile, filp,
>> +                                 buf, count, pos, false);
>> +}
>
> - Sean

Thanks for the review.

Matan



* Re: [PATCH V3 for-next 3/7] IB/core: Add idr based standard types
       [not found]         ` <1828884A29C6694DAF28B7E6B8A82373AB10F97B-P5GAC/sN6hkd3b2yrw5b5LfspsVTdybXVpNB7YpNyf8@public.gmane.org>
@ 2017-04-06 14:14           ` Matan Barak
  0 siblings, 0 replies; 25+ messages in thread
From: Matan Barak @ 2017-04-06 14:14 UTC (permalink / raw)
  To: Hefty, Sean
  Cc: Matan Barak, Doug Ledford, linux-rdma-u79uwXL29TY76Z2rM5mHXA,
	Liran Liss, Jason Gunthorpe, Leon Romanovsky, Majd Dibbiny,
	Tal Alon, Yishai Hadas, Weiny, Ira, Haggai Eran,
	Christoph Lameter

On Wed, Apr 5, 2017 at 8:05 PM, Hefty, Sean <sean.hefty-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org> wrote:
>> +const struct uverbs_obj_idr_type uverbs_type_attrs_cq = {
>> +     .type = UVERBS_TYPE_ALLOC_IDR_SZ(sizeof(struct ib_ucq_object),
>> 0),
>> +     .destroy_object = uverbs_free_cq,
>> +};
>> +
>> +const struct uverbs_obj_idr_type uverbs_type_attrs_qp = {
>> +     .type = UVERBS_TYPE_ALLOC_IDR_SZ(sizeof(struct ib_uqp_object),
>> 0),
>> +     .destroy_object = uverbs_free_qp,
>> +};
>> +
>> +const struct uverbs_obj_idr_type uverbs_type_attrs_mw = {
>> +     .type = UVERBS_TYPE_ALLOC_IDR(0),
>> +     .destroy_object = uverbs_free_mw,
>> +};
>> +
>> +const struct uverbs_obj_idr_type uverbs_type_attrs_mr = {
>> +     /* 1 is used in order to free the MR after all the MWs */
>> +     .type = UVERBS_TYPE_ALLOC_IDR(1),
>> +     .destroy_object = uverbs_free_mr,
>> +};
>> +
>> +const struct uverbs_obj_idr_type uverbs_type_attrs_srq = {
>> +     .type = UVERBS_TYPE_ALLOC_IDR_SZ(sizeof(struct ib_usrq_object),
>> 0),
>> +     .destroy_object = uverbs_free_srq,
>> +};
>> +
>> +const struct uverbs_obj_idr_type uverbs_type_attrs_ah = {
>> +     .type = UVERBS_TYPE_ALLOC_IDR(0),
>> +     .destroy_object = uverbs_free_ah,
>> +};
>> +
>> +const struct uverbs_obj_idr_type uverbs_type_attrs_flow = {
>> +     .type = UVERBS_TYPE_ALLOC_IDR(0),
>> +     .destroy_object = uverbs_free_flow,
>> +};
>> +
>> +const struct uverbs_obj_idr_type uverbs_type_attrs_wq = {
>> +     .type = UVERBS_TYPE_ALLOC_IDR_SZ(sizeof(struct ib_uwq_object),
>> 0),
>> +     .destroy_object = uverbs_free_wq,
>> +};
>> +
>> +const struct uverbs_obj_idr_type uverbs_type_attrs_rwq_ind_table = {
>> +     .type = UVERBS_TYPE_ALLOC_IDR(0),
>> +     .destroy_object = uverbs_free_rwq_ind_tbl,
>> +};
>> +
>> +const struct uverbs_obj_idr_type uverbs_type_attrs_xrcd = {
>> +     .type = UVERBS_TYPE_ALLOC_IDR_SZ(sizeof(struct ib_uxrcd_object),
>> 0),
>> +     .destroy_object = uverbs_free_xrcd,
>> +};
>> +
>> +const struct uverbs_obj_idr_type uverbs_type_attrs_pd = {
>> +     /* 2 is used in order to free the PD after MRs */
>> +     .type = UVERBS_TYPE_ALLOC_IDR(2),
>> +     .destroy_object = uverbs_free_pd,
>> +};
>
> I wonder if it wouldn't make more sense to have destroy order be independent from object creation assumptions.  For example, QPs must be destroyed prior to their associated CQs.  This code works, since the CQ is passed in during QP creation and ends up on the list correctly, but I wonder if all hardware would actually need that restriction.  Basically, the destroy order value as used is not actually capturing the true destroy order; there are other assumptions baked into the code.
>

Actually, this code gives you the freedom to do both. You could have
every type in its own "order group", or you could share these groups.
It's guaranteed that all objects in order group x will be destroyed
after objects in order group x+1.
The rationale here is to destroy objects in LIFO order within every
order group. That should fit most objects pretty well. It's obvious
that if a QP reports to a CQ, you can't destroy that CQ before
destroying the QP.
This is the case now: when creating object x which depends on object
y, we increase the reference count of object y, and object y is only
destroyed when its reference count drops to zero.
So, effectively, we capture the same logic here.
The only exception is objects whose dependency is set or broken in
user-space/hardware (without kernel awareness). In these cases we
need to fall back to destroying objects by their type's class order
(a good example would be MWs and MRs).
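The teardown rule above can be modelled in a few lines of user-space C (invented names; the real kernel walks the per-context list under proper locking): destroy order group 0 before group 1, and within each group walk the creation-order list backwards (LIFO). The orders mirror the values in the patch: MW=0, MR=1, PD=2.

```c
struct model_obj {
	int order;	/* release order group, as in UVERBS_TYPE_ALLOC_IDR(n) */
	int id;
};

/* objs[] is the per-context list in creation order; out[] receives
 * ids in destruction order. Returns the number destroyed. */
static int model_teardown(const struct model_obj *objs, int n, int *out)
{
	int max_order = 0, k = 0;

	for (int i = 0; i < n; i++)
		if (objs[i].order > max_order)
			max_order = objs[i].order;

	for (int g = 0; g <= max_order; g++)		/* group g before g+1 */
		for (int i = n - 1; i >= 0; i--)	/* LIFO within a group */
			if (objs[i].order == g)
				out[k++] = objs[i].id;
	return k;
}
```

So a context holding a PD, an MR on that PD, and two MWs bound to the MR tears down MWs first (newest first), then the MR, then the PD, matching the MW-before-MR-before-PD constraint described above.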

> Otherwise:
>
> Reviewed-by: Sean Hefty <sean.hefty-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>

Thanks :)



* Re: [PATCH V3 for-next 4/7] IB/core: Change idr objects to use the new schema
       [not found]             ` <CAAKD3BCy_JD1cu=3ZHSbrXBHmeTj-M7pJ6nM=rRXFVMi6Szvwg-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2017-04-06 16:57               ` Jason Gunthorpe
       [not found]                 ` <20170406165722.GE7657-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
  0 siblings, 1 reply; 25+ messages in thread
From: Jason Gunthorpe @ 2017-04-06 16:57 UTC (permalink / raw)
  To: Matan Barak
  Cc: Hefty, Sean, Matan Barak, Doug Ledford,
	linux-rdma-u79uwXL29TY76Z2rM5mHXA, Liran Liss, Leon Romanovsky,
	Majd Dibbiny, Tal Alon, Yishai Hadas, Weiny, Ira, Haggai Eran,
	Christoph Lameter

On Thu, Apr 06, 2017 at 05:13:52PM +0300, Matan Barak wrote:
> On Thu, Apr 6, 2017 at 12:05 AM, Hefty, Sean <sean.hefty-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org> wrote:
> > Mostly questions.
> >
> >> @@ -1628,7 +1148,7 @@ ssize_t ib_uverbs_resize_cq(struct
> >> ib_uverbs_file *file,
> >>                  (unsigned long) cmd.response + sizeof resp,
> >>                  in_len - sizeof cmd, out_len - sizeof resp);
> >>
> >> -     cq = idr_read_cq(cmd.cq_handle, file->ucontext, 0);
> >> +     cq = uobj_get_obj_read(cq, cmd.cq_handle, file->ucontext);
> >
> > (I noticed rereg_mr used a write lock.)
> > Is a read lock sufficient here?
> >
> 
> I guess we don't want to allow two concurrent resize_cq calls, so a
> write lock would be better here. However, I don't want to change the
> logic in these patches; I think such a change is better off being a
> separate patch with a proper commit message. We don't hit these bugs
> today, as there's no lock at the verbs layer, so device drivers
> usually provide the locking themselves.

It makes sense to split the patch, but the 'write lock' nee
'exclusive' seems like the wrong approach, we need an actual mutex
just like for multicast.

Jason


* Re: [PATCH V3 for-next 4/7] IB/core: Change idr objects to use the new schema
       [not found]                 ` <20170406165722.GE7657-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
@ 2017-04-09 15:16                   ` Matan Barak
  0 siblings, 0 replies; 25+ messages in thread
From: Matan Barak @ 2017-04-09 15:16 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: Hefty, Sean, Matan Barak, Doug Ledford,
	linux-rdma-u79uwXL29TY76Z2rM5mHXA, Liran Liss, Leon Romanovsky,
	Majd Dibbiny, Tal Alon, Yishai Hadas, Weiny, Ira, Haggai Eran,
	Christoph Lameter

On Thu, Apr 6, 2017 at 7:57 PM, Jason Gunthorpe
<jgunthorpe-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org> wrote:
> On Thu, Apr 06, 2017 at 05:13:52PM +0300, Matan Barak wrote:
>> On Thu, Apr 6, 2017 at 12:05 AM, Hefty, Sean <sean.hefty-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org> wrote:
>> > Mostly questions.
>> >
>> >> @@ -1628,7 +1148,7 @@ ssize_t ib_uverbs_resize_cq(struct
>> >> ib_uverbs_file *file,
>> >>                  (unsigned long) cmd.response + sizeof resp,
>> >>                  in_len - sizeof cmd, out_len - sizeof resp);
>> >>
>> >> -     cq = idr_read_cq(cmd.cq_handle, file->ucontext, 0);
>> >> +     cq = uobj_get_obj_read(cq, cmd.cq_handle, file->ucontext);
>> >
>> > (I noticed rereg_mr used a write lock.)
>> > Is a read lock sufficient here?
>> >
>>
>> I guess we don't want to allow two concurrent resize_cq calls, so a
>> write lock would be better here. However, I don't want to change the
>> logic in these patches; I think such a change is better off being a
>> separate patch with a proper commit message. We don't hit these bugs
>> today, as there's no lock at the verbs layer, so device drivers
>> usually provide the locking themselves.
>
> It makes sense to split the patch, but the 'write lock' nee
> 'exclusive' seems like the wrong approach, we need an actual mutex
> just like for multicast.
>

Why? Let's look at it from the application's point of view. When two
threads try to resize the CQ concurrently, what is the CQ's actual
size? It could be any one of the requested sizes. More than that, it
doesn't really make sense to resize the same CQ concurrently, so I
think it's safe to return -EBUSY to one of these system calls.
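The behaviour being argued for can be sketched in portable C11 (invented `model_*` names; the kernel implementation differs): claiming a handle exclusively is a try-lock, so a second concurrent resize gets -EBUSY back instead of sleeping on a mutex.

```c
#include <errno.h>
#include <stdatomic.h>

struct model_handle {
	atomic_flag exclusive;	/* set while held for "write" */
};

static int model_get_write(struct model_handle *h)
{
	/* test_and_set returns the previous value: true means another
	 * thread already holds the handle exclusively. */
	if (atomic_flag_test_and_set(&h->exclusive))
		return -EBUSY;
	return 0;
}

static void model_put_write(struct model_handle *h)
{
	atomic_flag_clear(&h->exclusive);
}
```

A resize handler would call `model_get_write()`, fail fast with -EBUSY on contention, and `model_put_write()` on completion; Jason's counter-proposal would replace the try-lock with a sleeping mutex so the second caller waits instead of failing.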

> Jason

Matan


end of thread, other threads:[~2017-04-09 15:16 UTC | newest]

Thread overview: 25+ messages
2017-04-04 10:31 [PATCH V3 for-next 0/7] Change IDR usage and locking in uverbs Matan Barak
     [not found] ` <1491301907-32290-1-git-send-email-matanb-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
2017-04-04 10:31   ` [PATCH V3 for-next 1/7] IB/core: Refactor idr to be per uverbs_file Matan Barak
     [not found]     ` <1491301907-32290-2-git-send-email-matanb-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
2017-04-04 17:33       ` Hefty, Sean
     [not found]         ` <1828884A29C6694DAF28B7E6B8A82373AB10F3F4-P5GAC/sN6hkd3b2yrw5b5LfspsVTdybXVpNB7YpNyf8@public.gmane.org>
2017-04-05 10:56           ` Matan Barak
2017-04-04 10:31   ` [PATCH V3 for-next 2/7] IB/core: Add support for idr types Matan Barak
     [not found]     ` <1491301907-32290-3-git-send-email-matanb-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
2017-04-05  0:43       ` Hefty, Sean
     [not found]         ` <1828884A29C6694DAF28B7E6B8A82373AB10F5A5-P5GAC/sN6hkd3b2yrw5b5LfspsVTdybXVpNB7YpNyf8@public.gmane.org>
2017-04-05 10:55           ` Matan Barak
     [not found]             ` <CAAKD3BD=dM8B+bnGu_DTR220wWeo2ce2Sgoy1WwBpUYs6XHoQA-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2017-04-05 15:50               ` Jason Gunthorpe
2017-04-05 17:33               ` Doug Ledford
     [not found]                 ` <1491413639.2923.0.camel-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
2017-04-05 17:49                   ` Leon Romanovsky
     [not found]                     ` <20170405174943.GI20443-U/DQcQFIOTAAJjI8aNfphQ@public.gmane.org>
2017-04-05 18:51                       ` Doug Ledford
2017-04-04 10:31   ` [PATCH V3 for-next 3/7] IB/core: Add idr based standard types Matan Barak
     [not found]     ` <1491301907-32290-4-git-send-email-matanb-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
2017-04-05 17:05       ` Hefty, Sean
     [not found]         ` <1828884A29C6694DAF28B7E6B8A82373AB10F97B-P5GAC/sN6hkd3b2yrw5b5LfspsVTdybXVpNB7YpNyf8@public.gmane.org>
2017-04-06 14:14           ` Matan Barak
2017-04-04 10:31   ` [PATCH V3 for-next 4/7] IB/core: Change idr objects to use the new schema Matan Barak
     [not found]     ` <1491301907-32290-5-git-send-email-matanb-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
2017-04-05 21:05       ` Hefty, Sean
     [not found]         ` <1828884A29C6694DAF28B7E6B8A82373AB10FAD8-P5GAC/sN6hkd3b2yrw5b5LfspsVTdybXVpNB7YpNyf8@public.gmane.org>
2017-04-05 21:59           ` Hefty, Sean
2017-04-06 14:13           ` Matan Barak
     [not found]             ` <CAAKD3BCy_JD1cu=3ZHSbrXBHmeTj-M7pJ6nM=rRXFVMi6Szvwg-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2017-04-06 16:57               ` Jason Gunthorpe
     [not found]                 ` <20170406165722.GE7657-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
2017-04-09 15:16                   ` Matan Barak
2017-04-04 10:31   ` [PATCH V3 for-next 5/7] IB/core: Add lock to multicast handlers Matan Barak
2017-04-04 10:31   ` [PATCH V3 for-next 6/7] IB/core: Add support for fd objects Matan Barak
2017-04-04 10:31   ` [PATCH V3 for-next 7/7] IB/core: Change completion channel to use the reworked objects schema Matan Barak
     [not found]     ` <1491301907-32290-8-git-send-email-matanb-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
2017-04-05 23:30       ` Hefty, Sean
     [not found]         ` <1828884A29C6694DAF28B7E6B8A82373AB10FBBF-P5GAC/sN6hkd3b2yrw5b5LfspsVTdybXVpNB7YpNyf8@public.gmane.org>
2017-04-06 14:14           ` Matan Barak
