* [RFC ABI V6 00/14] SG-based RDMA ABI Proposal
@ 2016-12-11 12:57 Matan Barak
       [not found] ` <1481461088-56355-1-git-send-email-matanb-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
  0 siblings, 1 reply; 20+ messages in thread
From: Matan Barak @ 2016-12-11 12:57 UTC (permalink / raw)
  To: linux-rdma-u79uwXL29TY76Z2rM5mHXA
  Cc: Doug Ledford, Jason Gunthorpe, Sean Hefty, Christoph Lameter,
	Liran Liss, Haggai Eran, Majd Dibbiny, Matan Barak, Tal Alon,
	Leon Romanovsky

The following patch set enriches the security model as a follow-up
to commit e6bd18f57aad ('IB/security: Restrict use of the write() interface').

DISCLAIMER:
These patches are far from complete. They present a working init_ucontext
and query_device (both the regular and the extended version). In addition,
they are given as a basis for discussion.

The ideas presented here are based on our previous series, as well as some
ideas presented at the OFVWG and in Sean's series.

This patch series adds an ioctl() interface alongside the existing write()
interface and provides an easy route for backporting this change to legacy
supported systems. Analyzing the current role of uverbs in dispatching and
parsing commands, we find that:
(a) uverbs validates the basic properties of the command
(b) uverbs is responsible for all the IDR and uobject management and
    locking. It is also responsible for handling completion FDs.
(c) uverbs transforms the user<-->kernel ABI into the kernel API.

(a) and (b) are valid for every kABI. Although the nature of commands could
change, they still have to be validated and transformed into kernel pointers.
In order to avoid duplication between the various drivers, we would like to
keep (a) and (b) as shared code.
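The split between the shared validation/dispatch code and per-driver handlers can be sketched in plain C. This is a userspace illustration only; all names (struct ioctl_hdr, action_spec, dispatch) are hypothetical and not the actual kernel API:

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical on-the-wire command header. */
struct ioctl_hdr {
	unsigned int object_type;
	unsigned int action;
	unsigned int num_attrs;
};

typedef int (*action_handler)(const struct ioctl_hdr *hdr);

/* Per-driver dispatch entry: the handler may point to shared core
 * code or to driver-specific code. */
struct action_spec {
	unsigned int id;
	action_handler handler;
};

static int noop_handler(const struct ioctl_hdr *hdr)
{
	(void)hdr;
	return 0;
}

static int dispatch(const struct ioctl_hdr *hdr,
		    const struct action_spec *actions, size_t n)
{
	size_t i;

	if (hdr->num_attrs > 64)	/* (a) validate basic properties */
		return -EINVAL;
	for (i = 0; i < n; i++)		/* (c) route the ABI to a kernel API */
		if (actions[i].id == hdr->action)
			return actions[i].handler(hdr);
	return -EOPNOTSUPP;
}
```

Unknown actions fail cleanly with -EOPNOTSUPP, which is what lets each driver expose only the subset of actions it actually implements.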

In addition, this is a good time to expand the ABI to be more scalable, so we
added a few goals:
(1) A command's attributes shall be easily extensible, either by allowing
    drivers to have their own extensible set of attributes or by extensible
    core attributes. Moreover, driver-specific attributes could some day
    become standard core attributes. We would like to keep supporting old
    user-space while avoiding code duplication in the kernel.
(2) Each driver may have a specific type system (e.g. QP, CQ, ...). It may
    or may not implement the standard type system, and it could extend this
    type system in the future. Try to avoid duplicating existing types or
    actions.
(3) Do not change or recompile driver libraries and don't copy their data.
(4) Efficient dispatching.

Thus, in order to allow this flexibility, we decided to provide (a) and (b)
as a common infrastructure, while using per-driver guidelines for the
parsing and uobject management. Handlers are also set by the drivers
themselves, though they can point either to shared common code or to
driver-specific code.

Since types are no longer enforced by the common infrastructure, there is no
point in pre-allocating common IDR types in the common code. Instead, we
provide an API for drivers to add new types. We use one IDR per driver
for all its types. The driver declares all its supported types, their
free functions and their release order. After that, all uobject, exclusive
access and type handling is done automatically for the driver by the
infrastructure.
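The type declaration with a free function and a release order can be illustrated with a small userspace model. Everything here (uobj_type, dev_objs, the fixed-size table standing in for the per-device IDR) is a hypothetical sketch of the idea, not the kernel code:

```c
#include <stdlib.h>

/* Hypothetical per-driver type declaration: a free function plus a
 * release order, so teardown destroys e.g. QPs before their CQs/PDs. */
struct uobj_type {
	int release_order;		/* lower order is freed first */
	void (*free_fn)(void *obj);
};

struct uobj_entry {
	const struct uobj_type *type;
	void *obj;
	int live;
};

#define MAX_OBJS 32

/* One table per device stands in for the per-device IDR. */
struct dev_objs {
	struct uobj_entry tbl[MAX_OBJS];
};

/* Record which objects were freed, and in what order. */
static int freed_log[MAX_OBJS], freed_n;
static void log_free(void *obj) { freed_log[freed_n++] = *(int *)obj; }

static int obj_alloc(struct dev_objs *d, const struct uobj_type *t, void *obj)
{
	int id;

	for (id = 0; id < MAX_OBJS; id++)
		if (!d->tbl[id].live) {
			d->tbl[id] = (struct uobj_entry){ t, obj, 1 };
			return id;	/* handle returned to user space */
		}
	return -1;
}

/* Teardown: free everything of each release order in turn. */
static void dev_cleanup(struct dev_objs *d, int max_order)
{
	int order, id;

	for (order = 0; order <= max_order; order++)
		for (id = 0; id < MAX_OBJS; id++)
			if (d->tbl[id].live &&
			    d->tbl[id].type->release_order == order) {
				d->tbl[id].type->free_fn(d->tbl[id].obj);
				d->tbl[id].live = 0;
			}
}
```

The point of the release order is that a misbehaving application can exit at any time, and the infrastructure must still tear objects down in a dependency-safe sequence without any driver involvement.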

Scatter-gather was chosen in order to avoid recompiling user-space drivers.
By using pointers to driver-specific data, we can use that data directly,
without copying it and without changing the user-space driver at all.

We chose to go with non-blocking locking of user objects. When exclusive
(WRITE or DESTROY) access is required, we dispatch the action if and only if
no other action needs this object as well; otherwise, -EBUSY is returned to
user-space. Device removal is synced with SRCU, as it is today.
If we were using blocking locks, we would have needed to sort the given
user-space handles; otherwise, a user-space application could cause a
deadlock. With the non-blocking behaviour, dispatching in the kernel also
becomes more efficient.

We implement a compatibility layer between the old write() implementation and
the new ioctl() based implementation by:
(a) Creating an ioctl() header and attribute descriptors.
(b) Mapping the attribute descriptors straight to the user-space supplied
    buffers. We expect that every subset of consecutive fields in the old
    ABI can be directly mapped to an attribute in the new ABI.
(c) Passing a flag telling the parsing function whether the headers reside in
    kernel-space or user-space.
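Step (b) above can be sketched like this: consecutive fields of an old fixed-layout command become attribute descriptors that simply point into the original buffer. The command layout, struct names and attribute ids here are hypothetical examples, not the real uverbs ABI:

```c
#include <stdint.h>
#include <stddef.h>

/* Hypothetical old write() ABI command with a fixed layout. */
struct old_alloc_pd_cmd {
	uint64_t response;	/* user pointer for the response */
	uint64_t driver_data;	/* opaque driver-specific tail */
};

/* New-ABI attribute descriptor: points into the buffer, no copy. */
struct ioctl_attr {
	uint16_t attr_id;
	uint16_t len;
	const void *data;
};

/* Hypothetical attribute ids for this example. */
enum { ATTR_RESP = 1, ATTR_DRV_DATA = 2 };

/* Map consecutive fields of the old command straight onto attributes. */
static size_t map_old_cmd(const struct old_alloc_pd_cmd *cmd,
			  struct ioctl_attr attrs[2])
{
	attrs[0] = (struct ioctl_attr){ ATTR_RESP,
					sizeof(cmd->response),
					&cmd->response };
	attrs[1] = (struct ioctl_attr){ ATTR_DRV_DATA,
					sizeof(cmd->driver_data),
					&cmd->driver_data };
	return 2;
}
```

Since the descriptors alias the caller's buffer rather than copying it, the same ioctl() handlers can serve both entry points; the kernel/user flag from step (c) only changes how the pointed-to memory is fetched.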

We would like to use this opportunity to introduce a more structured way of
querying device features. Each feature is represented by a parsing tree
(which consists of type groups, types, action groups, actions, attribute
groups and attributes). When a driver registers itself with the IB
subsystem, it merges all feature trees into one parsing tree. Later on, this
parsing tree is used to parse and validate all commands. We plan to allow
user-space to read this parsing tree and thereby discover which features are
supported.
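At each level of such a merge, combining two feature trees reduces to merging sorted, deduplicated id lists (types, actions, attributes). A minimal sketch of that one operation, with an invented merge_ids() helper rather than the series' actual merge code:

```c
#include <stddef.h>

/* Merge two sorted arrays of ids (one per feature tree) into a single
 * deduplicated list, as a driver would do at registration time.
 * 'out' must have room for na + nb entries. */
static size_t merge_ids(const int *a, size_t na,
			const int *b, size_t nb, int *out)
{
	size_t i = 0, j = 0, n = 0;

	while (i < na || j < nb) {
		if (j == nb || (i < na && a[i] < b[j]))
			out[n++] = a[i++];	/* only in tree A */
		else if (i == na || b[j] < a[i])
			out[n++] = b[j++];	/* only in tree B */
		else {				/* same id in both: keep one */
			out[n++] = a[i++];
			j++;
		}
	}
	return n;
}
```

When the same id appears in both trees (a driver extending a core action, say), only one entry survives, which is what lets driver-specific attributes later migrate into the core tree without duplicating nodes.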

Further uverbs-related subsystems (such as RDMA-CM) may use other FDs or
other ioctl codes. When applying this infrastructure to RDMA-CM, we may
need to replace ib_device with an ioctl_device and ib_ucontext with an
ioctl_context. However, this can be done at a later stage.

Note: we might switch to submitting one task at a time (i.e. change the
locking schema) once the concepts are more mature.

This series is based on Doug's k.o/for-4.9-fixed branch plus Leon's series [1].

Regards,
Liran, Haggai, Leon and Matan

[0] 2937f3757519 ('staging/lustre: Disable InfiniBand support')
[1] RDMA/core: Unify style of IOCTL commands series

Changes from V5:
1. Allow inlined input attributes.
2. Use the DSL macros in a more compact, inlined way.
3. Specify mandatory attributes (both from user-space and kernel).
4. Specify minimum size check for attributes.
5. Introduce a way to merge feature trees.
   - Each feature will be defined in its own parsing tree.
   - We merge all these trees into one big parsing tree.
   - Driver data is declared exactly in such a tree.
6. Remove all unnecessary EXPORT_SYMBOLs (we'll add them back later if needed).
7. Convert __u8/16/32 to kernel's native types.
8. Make the code bisect-able.
   - Make the write uverbs handlers use the new locking/objects allocation
     infrastructure.
9. Get rid of the sizeof() requirement in the macro language.
10. Ditch the distribution function and use a simpler fixed model
    (use high bits).
11. Allocate the necessary stuff on the stack for small commands.
12. Write compatibility mode: use a flag instead of get_fs and set_fs.
13. Change handler definitions to get an array of group_attributes.
14. Remove the priv from handlers.
15. Rename unlock_idr to commit_objects.
16. Remove the live indication from ib_uobject.
17. Bugfix: proper cleanups of IDRs
18. Bugfix: Use of attr in create_qp
19. Remove close_sem

Changes from V4:
1. Rebased over Doug's k.o/for-4.9-fixed branch.
2. Added create_qp and modify_qp commands.
3. Added libibverbs POC code. Started implementing the bits required for
   ibv_rc_pingpong.
4. Added a patch that puts the foundations of a compatibility layer
   between write commands and ioctl commands. This has the limitation
   that every subset of the old write ABI must be directly mapped
   to an attribute of the new ABI.
5. Implement write's get_context using this compatibility layer.

Changes from V3:
1. Add create_cq and create_comp_channel.
2. Add FD as ib_uobject into the type system.

Changes from V2:
1. Use type declarations in order to declare release order and free functions.
2. Allow the driver to extend and use existing building blocks at any level:
        a. Add more types
        b. Add actions to existing types
        c. Add attributes to existing actions (existed in V2)
   Such a driver only duplicates the structs it actually changed.
3. Fixed bugs in ucontext teardown and type allocation/locking.
4. Add reg_mr and init_pd.

Changes from V1:
1. Refined locking system
a. try_read_lock and write lock to sync exclusive accesses
b. SRCU to sync device removal from commands execution
c. Future rwsem to sync close context from commands execution
2. Added temporary udata usage for vendor's data
3. Add query_device and init_ucontext command with mlx5 implementation
4. Fixed bugs in ioctl dispatching
5. Change callbacks to get ib_uverbs_file instead of ucontext
6. Add general types initialization and cleanups

Leon Romanovsky (1):
  IB/core: Refactor IDR to be per-device

Matan Barak (13):
  IB/core: Add support for custom types
  IB/core: Add generic ucontext initialization and teardown
  IB/core: Add macros for declaring types and type groups.
  IB/core: Declare all common IB types
  IB/core: Use the new IDR and locking infrastructure in uverbs_cmd
  IB/core: Add new ioctl interface
  IB/core: Add macros for declaring actions and attributes
  IB/core: Add uverbs types, actions, handlers and attributes
  IB/core: Add uverbs merge trees functionality
  IB/mlx5: Implement common uverb objects
  IB/{core,mlx5}: Support uhw definition per driver
  IB/core: Support getting IOCTL header/SGEs from kernel space
  IB/core: Implement compatibility layer for get context command

 drivers/infiniband/core/Makefile             |    4 +-
 drivers/infiniband/core/core_priv.h          |   14 +
 drivers/infiniband/core/device.c             |   18 +
 drivers/infiniband/core/rdma_core.c          |  527 +++++++++++
 drivers/infiniband/core/rdma_core.h          |   80 ++
 drivers/infiniband/core/uverbs.h             |   43 +-
 drivers/infiniband/core/uverbs_cmd.c         | 1310 +++++++++-----------------
 drivers/infiniband/core/uverbs_ioctl.c       |  369 ++++++++
 drivers/infiniband/core/uverbs_ioctl_cmd.c   | 1072 +++++++++++++++++++++
 drivers/infiniband/core/uverbs_ioctl_merge.c |  672 +++++++++++++
 drivers/infiniband/core/uverbs_main.c        |  263 ++----
 drivers/infiniband/hw/mlx5/Makefile          |    2 +-
 drivers/infiniband/hw/mlx5/main.c            |   20 +-
 drivers/infiniband/hw/mlx5/mlx5_ib.h         |    2 +
 drivers/infiniband/hw/mlx5/uverbs_tree.c     |   68 ++
 include/rdma/ib_verbs.h                      |   37 +-
 include/rdma/uverbs_ioctl.h                  |  380 ++++++++
 include/rdma/uverbs_ioctl_cmd.h              |  210 +++++
 include/uapi/rdma/ib_user_verbs.h            |   39 +
 include/uapi/rdma/rdma_user_ioctl.h          |   28 +
 20 files changed, 4083 insertions(+), 1075 deletions(-)
 create mode 100644 drivers/infiniband/core/rdma_core.c
 create mode 100644 drivers/infiniband/core/rdma_core.h
 create mode 100644 drivers/infiniband/core/uverbs_ioctl.c
 create mode 100644 drivers/infiniband/core/uverbs_ioctl_cmd.c
 create mode 100644 drivers/infiniband/core/uverbs_ioctl_merge.c
 create mode 100644 drivers/infiniband/hw/mlx5/uverbs_tree.c
 create mode 100644 include/rdma/uverbs_ioctl.h
 create mode 100644 include/rdma/uverbs_ioctl_cmd.h

-- 
1.8.3.1

--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


* [RFC ABI V6 01/14] IB/core: Refactor IDR to be per-device
       [not found] ` <1481461088-56355-1-git-send-email-matanb-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
@ 2016-12-11 12:57   ` Matan Barak
  2016-12-11 12:57   ` [RFC ABI V6 02/14] IB/core: Add support for custom types Matan Barak
                     ` (13 subsequent siblings)
  14 siblings, 0 replies; 20+ messages in thread
From: Matan Barak @ 2016-12-11 12:57 UTC (permalink / raw)
  To: linux-rdma-u79uwXL29TY76Z2rM5mHXA
  Cc: Doug Ledford, Jason Gunthorpe, Sean Hefty, Christoph Lameter,
	Liran Liss, Haggai Eran, Majd Dibbiny, Matan Barak, Tal Alon,
	Leon Romanovsky

From: Leon Romanovsky <leonro-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>

The current code creates an IDR per type. Since types are currently
common to all vendors and known in advance, this was good enough.
However, the proposed ioctl-based infrastructure allows each vendor
to declare only some of the common types and to declare its own
specific types.

Thus, we decided to implement the IDR per-device and to refactor the
code into a new file.

Signed-off-by: Matan Barak <matanb-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
Signed-off-by: Haggai Eran <haggaie-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
Signed-off-by: Leon Romanovsky <leonro-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
---
 drivers/infiniband/core/device.c      |  14 +++
 drivers/infiniband/core/uverbs.h      |  16 +---
 drivers/infiniband/core/uverbs_cmd.c  | 157 ++++++++++++++++------------------
 drivers/infiniband/core/uverbs_main.c |  42 +++------
 include/rdma/ib_verbs.h               |   4 +
 5 files changed, 106 insertions(+), 127 deletions(-)

diff --git a/drivers/infiniband/core/device.c b/drivers/infiniband/core/device.c
index 760ef60..c3b68f5 100644
--- a/drivers/infiniband/core/device.c
+++ b/drivers/infiniband/core/device.c
@@ -168,11 +168,23 @@ static int alloc_name(char *name)
 	return 0;
 }
 
+static void ib_device_allocate_idrs(struct ib_device *device)
+{
+	spin_lock_init(&device->idr_lock);
+	idr_init(&device->idr);
+}
+
+static void ib_device_destroy_idrs(struct ib_device *device)
+{
+	idr_destroy(&device->idr);
+}
+
 static void ib_device_release(struct device *device)
 {
 	struct ib_device *dev = container_of(device, struct ib_device, dev);
 
 	ib_cache_release_one(dev);
+	ib_device_destroy_idrs(dev);
 	kfree(dev->port_immutable);
 	kfree(dev);
 }
@@ -219,6 +231,8 @@ struct ib_device *ib_alloc_device(size_t size)
 	if (!device)
 		return NULL;
 
+	ib_device_allocate_idrs(device);
+
 	device->dev.class = &ib_class;
 	device_initialize(&device->dev);
 
diff --git a/drivers/infiniband/core/uverbs.h b/drivers/infiniband/core/uverbs.h
index df26a74..8074705 100644
--- a/drivers/infiniband/core/uverbs.h
+++ b/drivers/infiniband/core/uverbs.h
@@ -38,7 +38,6 @@
 #define UVERBS_H
 
 #include <linux/kref.h>
-#include <linux/idr.h>
 #include <linux/mutex.h>
 #include <linux/completion.h>
 #include <linux/cdev.h>
@@ -176,20 +175,7 @@ struct ib_ucq_object {
 	u32			async_events_reported;
 };
 
-extern spinlock_t ib_uverbs_idr_lock;
-extern struct idr ib_uverbs_pd_idr;
-extern struct idr ib_uverbs_mr_idr;
-extern struct idr ib_uverbs_mw_idr;
-extern struct idr ib_uverbs_ah_idr;
-extern struct idr ib_uverbs_cq_idr;
-extern struct idr ib_uverbs_qp_idr;
-extern struct idr ib_uverbs_srq_idr;
-extern struct idr ib_uverbs_xrcd_idr;
-extern struct idr ib_uverbs_rule_idr;
-extern struct idr ib_uverbs_wq_idr;
-extern struct idr ib_uverbs_rwq_ind_tbl_idr;
-
-void idr_remove_uobj(struct idr *idp, struct ib_uobject *uobj);
+void idr_remove_uobj(struct ib_uobject *uobj);
 
 struct file *ib_uverbs_alloc_event_file(struct ib_uverbs_file *uverbs_file,
 					struct ib_device *ib_dev,
diff --git a/drivers/infiniband/core/uverbs_cmd.c b/drivers/infiniband/core/uverbs_cmd.c
index cb3f515a..84daf2c 100644
--- a/drivers/infiniband/core/uverbs_cmd.c
+++ b/drivers/infiniband/core/uverbs_cmd.c
@@ -120,37 +120,36 @@ static void put_uobj_write(struct ib_uobject *uobj)
 	put_uobj(uobj);
 }
 
-static int idr_add_uobj(struct idr *idr, struct ib_uobject *uobj)
+static int idr_add_uobj(struct ib_uobject *uobj)
 {
 	int ret;
 
 	idr_preload(GFP_KERNEL);
-	spin_lock(&ib_uverbs_idr_lock);
+	spin_lock(&uobj->context->device->idr_lock);
 
-	ret = idr_alloc(idr, uobj, 0, 0, GFP_NOWAIT);
+	ret = idr_alloc(&uobj->context->device->idr, uobj, 0, 0, GFP_NOWAIT);
 	if (ret >= 0)
 		uobj->id = ret;
 
-	spin_unlock(&ib_uverbs_idr_lock);
+	spin_unlock(&uobj->context->device->idr_lock);
 	idr_preload_end();
 
 	return ret < 0 ? ret : 0;
 }
 
-void idr_remove_uobj(struct idr *idr, struct ib_uobject *uobj)
+void idr_remove_uobj(struct ib_uobject *uobj)
 {
-	spin_lock(&ib_uverbs_idr_lock);
-	idr_remove(idr, uobj->id);
-	spin_unlock(&ib_uverbs_idr_lock);
+	spin_lock(&uobj->context->device->idr_lock);
+	idr_remove(&uobj->context->device->idr, uobj->id);
+	spin_unlock(&uobj->context->device->idr_lock);
 }
 
-static struct ib_uobject *__idr_get_uobj(struct idr *idr, int id,
-					 struct ib_ucontext *context)
+static struct ib_uobject *__idr_get_uobj(int id, struct ib_ucontext *context)
 {
 	struct ib_uobject *uobj;
 
 	rcu_read_lock();
-	uobj = idr_find(idr, id);
+	uobj = idr_find(&context->device->idr, id);
 	if (uobj) {
 		if (uobj->context == context)
 			kref_get(&uobj->ref);
@@ -162,12 +161,12 @@ static struct ib_uobject *__idr_get_uobj(struct idr *idr, int id,
 	return uobj;
 }
 
-static struct ib_uobject *idr_read_uobj(struct idr *idr, int id,
-					struct ib_ucontext *context, int nested)
+static struct ib_uobject *idr_read_uobj(int id, struct ib_ucontext *context,
+					int nested)
 {
 	struct ib_uobject *uobj;
 
-	uobj = __idr_get_uobj(idr, id, context);
+	uobj = __idr_get_uobj(id, context);
 	if (!uobj)
 		return NULL;
 
@@ -183,12 +182,11 @@ static struct ib_uobject *idr_read_uobj(struct idr *idr, int id,
 	return uobj;
 }
 
-static struct ib_uobject *idr_write_uobj(struct idr *idr, int id,
-					 struct ib_ucontext *context)
+static struct ib_uobject *idr_write_uobj(int id, struct ib_ucontext *context)
 {
 	struct ib_uobject *uobj;
 
-	uobj = __idr_get_uobj(idr, id, context);
+	uobj = __idr_get_uobj(id, context);
 	if (!uobj)
 		return NULL;
 
@@ -201,18 +199,18 @@ static struct ib_uobject *idr_write_uobj(struct idr *idr, int id,
 	return uobj;
 }
 
-static void *idr_read_obj(struct idr *idr, int id, struct ib_ucontext *context,
+static void *idr_read_obj(int id, struct ib_ucontext *context,
 			  int nested)
 {
 	struct ib_uobject *uobj;
 
-	uobj = idr_read_uobj(idr, id, context, nested);
+	uobj = idr_read_uobj(id, context, nested);
 	return uobj ? uobj->object : NULL;
 }
 
 static struct ib_pd *idr_read_pd(int pd_handle, struct ib_ucontext *context)
 {
-	return idr_read_obj(&ib_uverbs_pd_idr, pd_handle, context, 0);
+	return idr_read_obj(pd_handle, context, 0);
 }
 
 static void put_pd_read(struct ib_pd *pd)
@@ -222,7 +220,7 @@ static void put_pd_read(struct ib_pd *pd)
 
 static struct ib_cq *idr_read_cq(int cq_handle, struct ib_ucontext *context, int nested)
 {
-	return idr_read_obj(&ib_uverbs_cq_idr, cq_handle, context, nested);
+	return idr_read_obj(cq_handle, context, nested);
 }
 
 static void put_cq_read(struct ib_cq *cq)
@@ -230,24 +228,24 @@ static void put_cq_read(struct ib_cq *cq)
 	put_uobj_read(cq->uobject);
 }
 
-static struct ib_ah *idr_read_ah(int ah_handle, struct ib_ucontext *context)
+static void put_ah_read(struct ib_ah *ah)
 {
-	return idr_read_obj(&ib_uverbs_ah_idr, ah_handle, context, 0);
+	put_uobj_read(ah->uobject);
 }
 
-static void put_ah_read(struct ib_ah *ah)
+static struct ib_ah *idr_read_ah(int ah_handle, struct ib_ucontext *context)
 {
-	put_uobj_read(ah->uobject);
+	return idr_read_obj(ah_handle, context, 0);
 }
 
 static struct ib_qp *idr_read_qp(int qp_handle, struct ib_ucontext *context)
 {
-	return idr_read_obj(&ib_uverbs_qp_idr, qp_handle, context, 0);
+	return idr_read_obj(qp_handle, context, 0);
 }
 
 static struct ib_wq *idr_read_wq(int wq_handle, struct ib_ucontext *context)
 {
-	return idr_read_obj(&ib_uverbs_wq_idr, wq_handle, context, 0);
+	return idr_read_obj(wq_handle, context, 0);
 }
 
 static void put_wq_read(struct ib_wq *wq)
@@ -258,7 +256,7 @@ static void put_wq_read(struct ib_wq *wq)
 static struct ib_rwq_ind_table *idr_read_rwq_indirection_table(int ind_table_handle,
 							       struct ib_ucontext *context)
 {
-	return idr_read_obj(&ib_uverbs_rwq_ind_tbl_idr, ind_table_handle, context, 0);
+	return idr_read_obj(ind_table_handle, context, 0);
 }
 
 static void put_rwq_indirection_table_read(struct ib_rwq_ind_table *ind_table)
@@ -270,7 +268,7 @@ static struct ib_qp *idr_write_qp(int qp_handle, struct ib_ucontext *context)
 {
 	struct ib_uobject *uobj;
 
-	uobj = idr_write_uobj(&ib_uverbs_qp_idr, qp_handle, context);
+	uobj = idr_write_uobj(qp_handle, context);
 	return uobj ? uobj->object : NULL;
 }
 
@@ -286,7 +284,7 @@ static void put_qp_write(struct ib_qp *qp)
 
 static struct ib_srq *idr_read_srq(int srq_handle, struct ib_ucontext *context)
 {
-	return idr_read_obj(&ib_uverbs_srq_idr, srq_handle, context, 0);
+	return idr_read_obj(srq_handle, context, 0);
 }
 
 static void put_srq_read(struct ib_srq *srq)
@@ -297,7 +295,7 @@ static void put_srq_read(struct ib_srq *srq)
 static struct ib_xrcd *idr_read_xrcd(int xrcd_handle, struct ib_ucontext *context,
 				     struct ib_uobject **uobj)
 {
-	*uobj = idr_read_uobj(&ib_uverbs_xrcd_idr, xrcd_handle, context, 0);
+	*uobj = idr_read_uobj(xrcd_handle, context, 0);
 	return *uobj ? (*uobj)->object : NULL;
 }
 
@@ -305,7 +303,6 @@ static void put_xrcd_read(struct ib_uobject *uobj)
 {
 	put_uobj_read(uobj);
 }
-
 ssize_t ib_uverbs_get_context(struct ib_uverbs_file *file,
 			      struct ib_device *ib_dev,
 			      const char __user *buf,
@@ -575,7 +572,7 @@ ssize_t ib_uverbs_alloc_pd(struct ib_uverbs_file *file,
 	atomic_set(&pd->usecnt, 0);
 
 	uobj->object = pd;
-	ret = idr_add_uobj(&ib_uverbs_pd_idr, uobj);
+	ret = idr_add_uobj(uobj);
 	if (ret)
 		goto err_idr;
 
@@ -599,7 +596,7 @@ ssize_t ib_uverbs_alloc_pd(struct ib_uverbs_file *file,
 	return in_len;
 
 err_copy:
-	idr_remove_uobj(&ib_uverbs_pd_idr, uobj);
+	idr_remove_uobj(uobj);
 
 err_idr:
 	ib_dealloc_pd(pd);
@@ -622,7 +619,7 @@ ssize_t ib_uverbs_dealloc_pd(struct ib_uverbs_file *file,
 	if (copy_from_user(&cmd, buf, sizeof cmd))
 		return -EFAULT;
 
-	uobj = idr_write_uobj(&ib_uverbs_pd_idr, cmd.pd_handle, file->ucontext);
+	uobj = idr_write_uobj(cmd.pd_handle, file->ucontext);
 	if (!uobj)
 		return -EINVAL;
 	pd = uobj->object;
@@ -640,7 +637,7 @@ ssize_t ib_uverbs_dealloc_pd(struct ib_uverbs_file *file,
 	uobj->live = 0;
 	put_uobj_write(uobj);
 
-	idr_remove_uobj(&ib_uverbs_pd_idr, uobj);
+	idr_remove_uobj(uobj);
 
 	mutex_lock(&file->mutex);
 	list_del(&uobj->list);
@@ -816,7 +813,7 @@ ssize_t ib_uverbs_open_xrcd(struct ib_uverbs_file *file,
 
 	atomic_set(&obj->refcnt, 0);
 	obj->uobject.object = xrcd;
-	ret = idr_add_uobj(&ib_uverbs_xrcd_idr, &obj->uobject);
+	ret = idr_add_uobj(&obj->uobject);
 	if (ret)
 		goto err_idr;
 
@@ -860,7 +857,7 @@ err_copy:
 	}
 
 err_insert_xrcd:
-	idr_remove_uobj(&ib_uverbs_xrcd_idr, &obj->uobject);
+	idr_remove_uobj(&obj->uobject);
 
 err_idr:
 	ib_dealloc_xrcd(xrcd);
@@ -894,7 +891,7 @@ ssize_t ib_uverbs_close_xrcd(struct ib_uverbs_file *file,
 		return -EFAULT;
 
 	mutex_lock(&file->device->xrcd_tree_mutex);
-	uobj = idr_write_uobj(&ib_uverbs_xrcd_idr, cmd.xrcd_handle, file->ucontext);
+	uobj = idr_write_uobj(cmd.xrcd_handle, file->ucontext);
 	if (!uobj) {
 		ret = -EINVAL;
 		goto out;
@@ -927,7 +924,7 @@ ssize_t ib_uverbs_close_xrcd(struct ib_uverbs_file *file,
 	if (inode && !live)
 		xrcd_table_delete(file->device, inode);
 
-	idr_remove_uobj(&ib_uverbs_xrcd_idr, uobj);
+	idr_remove_uobj(uobj);
 	mutex_lock(&file->mutex);
 	list_del(&uobj->list);
 	mutex_unlock(&file->mutex);
@@ -1020,7 +1017,7 @@ ssize_t ib_uverbs_reg_mr(struct ib_uverbs_file *file,
 	atomic_inc(&pd->usecnt);
 
 	uobj->object = mr;
-	ret = idr_add_uobj(&ib_uverbs_mr_idr, uobj);
+	ret = idr_add_uobj(uobj);
 	if (ret)
 		goto err_unreg;
 
@@ -1048,7 +1045,7 @@ ssize_t ib_uverbs_reg_mr(struct ib_uverbs_file *file,
 	return in_len;
 
 err_copy:
-	idr_remove_uobj(&ib_uverbs_mr_idr, uobj);
+	idr_remove_uobj(uobj);
 
 err_unreg:
 	ib_dereg_mr(mr);
@@ -1093,8 +1090,7 @@ ssize_t ib_uverbs_rereg_mr(struct ib_uverbs_file *file,
 	     (cmd.start & ~PAGE_MASK) != (cmd.hca_va & ~PAGE_MASK)))
 			return -EINVAL;
 
-	uobj = idr_write_uobj(&ib_uverbs_mr_idr, cmd.mr_handle,
-			      file->ucontext);
+	uobj = idr_write_uobj(cmd.mr_handle, file->ucontext);
 
 	if (!uobj)
 		return -EINVAL;
@@ -1163,7 +1159,7 @@ ssize_t ib_uverbs_dereg_mr(struct ib_uverbs_file *file,
 	if (copy_from_user(&cmd, buf, sizeof cmd))
 		return -EFAULT;
 
-	uobj = idr_write_uobj(&ib_uverbs_mr_idr, cmd.mr_handle, file->ucontext);
+	uobj = idr_write_uobj(cmd.mr_handle, file->ucontext);
 	if (!uobj)
 		return -EINVAL;
 
@@ -1178,7 +1174,7 @@ ssize_t ib_uverbs_dereg_mr(struct ib_uverbs_file *file,
 	if (ret)
 		return ret;
 
-	idr_remove_uobj(&ib_uverbs_mr_idr, uobj);
+	idr_remove_uobj(uobj);
 
 	mutex_lock(&file->mutex);
 	list_del(&uobj->list);
@@ -1238,7 +1234,7 @@ ssize_t ib_uverbs_alloc_mw(struct ib_uverbs_file *file,
 	atomic_inc(&pd->usecnt);
 
 	uobj->object = mw;
-	ret = idr_add_uobj(&ib_uverbs_mw_idr, uobj);
+	ret = idr_add_uobj(uobj);
 	if (ret)
 		goto err_unalloc;
 
@@ -1265,7 +1261,7 @@ ssize_t ib_uverbs_alloc_mw(struct ib_uverbs_file *file,
 	return in_len;
 
 err_copy:
-	idr_remove_uobj(&ib_uverbs_mw_idr, uobj);
+	idr_remove_uobj(uobj);
 
 err_unalloc:
 	uverbs_dealloc_mw(mw);
@@ -1291,7 +1287,7 @@ ssize_t ib_uverbs_dealloc_mw(struct ib_uverbs_file *file,
 	if (copy_from_user(&cmd, buf, sizeof(cmd)))
 		return -EFAULT;
 
-	uobj = idr_write_uobj(&ib_uverbs_mw_idr, cmd.mw_handle, file->ucontext);
+	uobj = idr_write_uobj(cmd.mw_handle, file->ucontext);
 	if (!uobj)
 		return -EINVAL;
 
@@ -1306,7 +1302,7 @@ ssize_t ib_uverbs_dealloc_mw(struct ib_uverbs_file *file,
 	if (ret)
 		return ret;
 
-	idr_remove_uobj(&ib_uverbs_mw_idr, uobj);
+	idr_remove_uobj(uobj);
 
 	mutex_lock(&file->mutex);
 	list_del(&uobj->list);
@@ -1420,7 +1416,7 @@ static struct ib_ucq_object *create_cq(struct ib_uverbs_file *file,
 	atomic_set(&cq->usecnt, 0);
 
 	obj->uobject.object = cq;
-	ret = idr_add_uobj(&ib_uverbs_cq_idr, &obj->uobject);
+	ret = idr_add_uobj(&obj->uobject);
 	if (ret)
 		goto err_free;
 
@@ -1446,7 +1442,7 @@ static struct ib_ucq_object *create_cq(struct ib_uverbs_file *file,
 	return obj;
 
 err_cb:
-	idr_remove_uobj(&ib_uverbs_cq_idr, &obj->uobject);
+	idr_remove_uobj(&obj->uobject);
 
 err_free:
 	ib_destroy_cq(cq);
@@ -1716,7 +1712,7 @@ ssize_t ib_uverbs_destroy_cq(struct ib_uverbs_file *file,
 	if (copy_from_user(&cmd, buf, sizeof cmd))
 		return -EFAULT;
 
-	uobj = idr_write_uobj(&ib_uverbs_cq_idr, cmd.cq_handle, file->ucontext);
+	uobj = idr_write_uobj(cmd.cq_handle, file->ucontext);
 	if (!uobj)
 		return -EINVAL;
 	cq      = uobj->object;
@@ -1732,7 +1728,7 @@ ssize_t ib_uverbs_destroy_cq(struct ib_uverbs_file *file,
 	if (ret)
 		return ret;
 
-	idr_remove_uobj(&ib_uverbs_cq_idr, uobj);
+	idr_remove_uobj(uobj);
 
 	mutex_lock(&file->mutex);
 	list_del(&uobj->list);
@@ -1939,7 +1935,7 @@ static int create_qp(struct ib_uverbs_file *file,
 	qp->uobject = &obj->uevent.uobject;
 
 	obj->uevent.uobject.object = qp;
-	ret = idr_add_uobj(&ib_uverbs_qp_idr, &obj->uevent.uobject);
+	ret = idr_add_uobj(&obj->uevent.uobject);
 	if (ret)
 		goto err_destroy;
 
@@ -1987,7 +1983,7 @@ static int create_qp(struct ib_uverbs_file *file,
 
 	return 0;
 err_cb:
-	idr_remove_uobj(&ib_uverbs_qp_idr, &obj->uevent.uobject);
+	idr_remove_uobj(&obj->uevent.uobject);
 
 err_destroy:
 	ib_destroy_qp(qp);
@@ -2173,7 +2169,7 @@ ssize_t ib_uverbs_open_qp(struct ib_uverbs_file *file,
 	qp->uobject = &obj->uevent.uobject;
 
 	obj->uevent.uobject.object = qp;
-	ret = idr_add_uobj(&ib_uverbs_qp_idr, &obj->uevent.uobject);
+	ret = idr_add_uobj(&obj->uevent.uobject);
 	if (ret)
 		goto err_destroy;
 
@@ -2202,7 +2198,7 @@ ssize_t ib_uverbs_open_qp(struct ib_uverbs_file *file,
 	return in_len;
 
 err_remove:
-	idr_remove_uobj(&ib_uverbs_qp_idr, &obj->uevent.uobject);
+	idr_remove_uobj(&obj->uevent.uobject);
 
 err_destroy:
 	ib_destroy_qp(qp);
@@ -2442,7 +2438,7 @@ ssize_t ib_uverbs_destroy_qp(struct ib_uverbs_file *file,
 
 	memset(&resp, 0, sizeof resp);
 
-	uobj = idr_write_uobj(&ib_uverbs_qp_idr, cmd.qp_handle, file->ucontext);
+	uobj = idr_write_uobj(cmd.qp_handle, file->ucontext);
 	if (!uobj)
 		return -EINVAL;
 	qp  = uobj->object;
@@ -2465,7 +2461,7 @@ ssize_t ib_uverbs_destroy_qp(struct ib_uverbs_file *file,
 	if (obj->uxrcd)
 		atomic_dec(&obj->uxrcd->refcnt);
 
-	idr_remove_uobj(&ib_uverbs_qp_idr, uobj);
+	idr_remove_uobj(uobj);
 
 	mutex_lock(&file->mutex);
 	list_del(&uobj->list);
@@ -2917,7 +2913,7 @@ ssize_t ib_uverbs_create_ah(struct ib_uverbs_file *file,
 	ah->uobject  = uobj;
 	uobj->object = ah;
 
-	ret = idr_add_uobj(&ib_uverbs_ah_idr, uobj);
+	ret = idr_add_uobj(uobj);
 	if (ret)
 		goto err_destroy;
 
@@ -2942,7 +2938,7 @@ ssize_t ib_uverbs_create_ah(struct ib_uverbs_file *file,
 	return in_len;
 
 err_copy:
-	idr_remove_uobj(&ib_uverbs_ah_idr, uobj);
+	idr_remove_uobj(uobj);
 
 err_destroy:
 	ib_destroy_ah(ah);
@@ -2967,7 +2963,7 @@ ssize_t ib_uverbs_destroy_ah(struct ib_uverbs_file *file,
 	if (copy_from_user(&cmd, buf, sizeof cmd))
 		return -EFAULT;
 
-	uobj = idr_write_uobj(&ib_uverbs_ah_idr, cmd.ah_handle, file->ucontext);
+	uobj = idr_write_uobj(cmd.ah_handle, file->ucontext);
 	if (!uobj)
 		return -EINVAL;
 	ah = uobj->object;
@@ -2981,7 +2977,7 @@ ssize_t ib_uverbs_destroy_ah(struct ib_uverbs_file *file,
 	if (ret)
 		return ret;
 
-	idr_remove_uobj(&ib_uverbs_ah_idr, uobj);
+	idr_remove_uobj(uobj);
 
 	mutex_lock(&file->mutex);
 	list_del(&uobj->list);
@@ -3263,7 +3259,7 @@ int ib_uverbs_ex_create_wq(struct ib_uverbs_file *file,
 	atomic_inc(&cq->usecnt);
 	wq->uobject = &obj->uevent.uobject;
 	obj->uevent.uobject.object = wq;
-	err = idr_add_uobj(&ib_uverbs_wq_idr, &obj->uevent.uobject);
+	err = idr_add_uobj(&obj->uevent.uobject);
 	if (err)
 		goto destroy_wq;
 
@@ -3290,7 +3286,7 @@ int ib_uverbs_ex_create_wq(struct ib_uverbs_file *file,
 	return 0;
 
 err_copy:
-	idr_remove_uobj(&ib_uverbs_wq_idr, &obj->uevent.uobject);
+	idr_remove_uobj(&obj->uevent.uobject);
 destroy_wq:
 	ib_destroy_wq(wq);
 err_put_cq:
@@ -3339,7 +3335,7 @@ int ib_uverbs_ex_destroy_wq(struct ib_uverbs_file *file,
 		return -EOPNOTSUPP;
 
 	resp.response_length = required_resp_len;
-	uobj = idr_write_uobj(&ib_uverbs_wq_idr, cmd.wq_handle,
+	uobj = idr_write_uobj(cmd.wq_handle,
 			      file->ucontext);
 	if (!uobj)
 		return -EINVAL;
@@ -3354,7 +3350,7 @@ int ib_uverbs_ex_destroy_wq(struct ib_uverbs_file *file,
 	if (ret)
 		return ret;
 
-	idr_remove_uobj(&ib_uverbs_wq_idr, uobj);
+	idr_remove_uobj(uobj);
 
 	mutex_lock(&file->mutex);
 	list_del(&uobj->list);
@@ -3522,7 +3518,7 @@ int ib_uverbs_ex_create_rwq_ind_table(struct ib_uverbs_file *file,
 	for (i = 0; i < num_wq_handles; i++)
 		atomic_inc(&wqs[i]->usecnt);
 
-	err = idr_add_uobj(&ib_uverbs_rwq_ind_tbl_idr, uobj);
+	err = idr_add_uobj(uobj);
 	if (err)
 		goto destroy_ind_tbl;
 
@@ -3550,7 +3546,7 @@ int ib_uverbs_ex_create_rwq_ind_table(struct ib_uverbs_file *file,
 	return 0;
 
 err_copy:
-	idr_remove_uobj(&ib_uverbs_rwq_ind_tbl_idr, uobj);
+	idr_remove_uobj(uobj);
 destroy_ind_tbl:
 	ib_destroy_rwq_ind_table(rwq_ind_tbl);
 err_uobj:
@@ -3593,7 +3589,7 @@ int ib_uverbs_ex_destroy_rwq_ind_table(struct ib_uverbs_file *file,
 	if (cmd.comp_mask)
 		return -EOPNOTSUPP;
 
-	uobj = idr_write_uobj(&ib_uverbs_rwq_ind_tbl_idr, cmd.ind_tbl_handle,
+	uobj = idr_write_uobj(cmd.ind_tbl_handle,
 			      file->ucontext);
 	if (!uobj)
 		return -EINVAL;
@@ -3609,7 +3605,7 @@ int ib_uverbs_ex_destroy_rwq_ind_table(struct ib_uverbs_file *file,
 	if (ret)
 		return ret;
 
-	idr_remove_uobj(&ib_uverbs_rwq_ind_tbl_idr, uobj);
+	idr_remove_uobj(uobj);
 
 	mutex_lock(&file->mutex);
 	list_del(&uobj->list);
@@ -3749,7 +3745,7 @@ int ib_uverbs_ex_create_flow(struct ib_uverbs_file *file,
 	flow_id->uobject = uobj;
 	uobj->object = flow_id;
 
-	err = idr_add_uobj(&ib_uverbs_rule_idr, uobj);
+	err = idr_add_uobj(uobj);
 	if (err)
 		goto destroy_flow;
 
@@ -3774,7 +3770,7 @@ int ib_uverbs_ex_create_flow(struct ib_uverbs_file *file,
 		kfree(kern_flow_attr);
 	return 0;
 err_copy:
-	idr_remove_uobj(&ib_uverbs_rule_idr, uobj);
+	idr_remove_uobj(uobj);
 destroy_flow:
 	ib_destroy_flow(flow_id);
 err_free:
@@ -3809,8 +3805,7 @@ int ib_uverbs_ex_destroy_flow(struct ib_uverbs_file *file,
 	if (cmd.comp_mask)
 		return -EINVAL;
 
-	uobj = idr_write_uobj(&ib_uverbs_rule_idr, cmd.flow_handle,
-			      file->ucontext);
+	uobj = idr_write_uobj(cmd.flow_handle, file->ucontext);
 	if (!uobj)
 		return -EINVAL;
 	flow_id = uobj->object;
@@ -3821,7 +3816,7 @@ int ib_uverbs_ex_destroy_flow(struct ib_uverbs_file *file,
 
 	put_uobj_write(uobj);
 
-	idr_remove_uobj(&ib_uverbs_rule_idr, uobj);
+	idr_remove_uobj(uobj);
 
 	mutex_lock(&file->mutex);
 	list_del(&uobj->list);
@@ -3909,7 +3904,7 @@ static int __uverbs_create_xsrq(struct ib_uverbs_file *file,
 	atomic_set(&srq->usecnt, 0);
 
 	obj->uevent.uobject.object = srq;
-	ret = idr_add_uobj(&ib_uverbs_srq_idr, &obj->uevent.uobject);
+	ret = idr_add_uobj(&obj->uevent.uobject);
 	if (ret)
 		goto err_destroy;
 
@@ -3943,7 +3938,7 @@ static int __uverbs_create_xsrq(struct ib_uverbs_file *file,
 	return 0;
 
 err_copy:
-	idr_remove_uobj(&ib_uverbs_srq_idr, &obj->uevent.uobject);
+	idr_remove_uobj(&obj->uevent.uobject);
 
 err_destroy:
 	ib_destroy_srq(srq);
@@ -4119,7 +4114,7 @@ ssize_t ib_uverbs_destroy_srq(struct ib_uverbs_file *file,
 	if (copy_from_user(&cmd, buf, sizeof cmd))
 		return -EFAULT;
 
-	uobj = idr_write_uobj(&ib_uverbs_srq_idr, cmd.srq_handle, file->ucontext);
+	uobj = idr_write_uobj(cmd.srq_handle, file->ucontext);
 	if (!uobj)
 		return -EINVAL;
 	srq = uobj->object;
@@ -4140,7 +4135,7 @@ ssize_t ib_uverbs_destroy_srq(struct ib_uverbs_file *file,
 		atomic_dec(&us->uxrcd->refcnt);
 	}
 
-	idr_remove_uobj(&ib_uverbs_srq_idr, uobj);
+	idr_remove_uobj(uobj);
 
 	mutex_lock(&file->mutex);
 	list_del(&uobj->list);
diff --git a/drivers/infiniband/core/uverbs_main.c b/drivers/infiniband/core/uverbs_main.c
index 0012fa5..f783723 100644
--- a/drivers/infiniband/core/uverbs_main.c
+++ b/drivers/infiniband/core/uverbs_main.c
@@ -66,19 +66,6 @@ enum {
 
 static struct class *uverbs_class;
 
-DEFINE_SPINLOCK(ib_uverbs_idr_lock);
-DEFINE_IDR(ib_uverbs_pd_idr);
-DEFINE_IDR(ib_uverbs_mr_idr);
-DEFINE_IDR(ib_uverbs_mw_idr);
-DEFINE_IDR(ib_uverbs_ah_idr);
-DEFINE_IDR(ib_uverbs_cq_idr);
-DEFINE_IDR(ib_uverbs_qp_idr);
-DEFINE_IDR(ib_uverbs_srq_idr);
-DEFINE_IDR(ib_uverbs_xrcd_idr);
-DEFINE_IDR(ib_uverbs_rule_idr);
-DEFINE_IDR(ib_uverbs_wq_idr);
-DEFINE_IDR(ib_uverbs_rwq_ind_tbl_idr);
-
 static DEFINE_SPINLOCK(map_lock);
 static DECLARE_BITMAP(dev_map, IB_UVERBS_MAX_DEVICES);
 
@@ -234,7 +221,7 @@ static int ib_uverbs_cleanup_ucontext(struct ib_uverbs_file *file,
 	list_for_each_entry_safe(uobj, tmp, &context->ah_list, list) {
 		struct ib_ah *ah = uobj->object;
 
-		idr_remove_uobj(&ib_uverbs_ah_idr, uobj);
+		idr_remove_uobj(uobj);
 		ib_destroy_ah(ah);
 		kfree(uobj);
 	}
@@ -243,7 +230,7 @@ static int ib_uverbs_cleanup_ucontext(struct ib_uverbs_file *file,
 	list_for_each_entry_safe(uobj, tmp, &context->mw_list, list) {
 		struct ib_mw *mw = uobj->object;
 
-		idr_remove_uobj(&ib_uverbs_mw_idr, uobj);
+		idr_remove_uobj(uobj);
 		uverbs_dealloc_mw(mw);
 		kfree(uobj);
 	}
@@ -251,7 +238,7 @@ static int ib_uverbs_cleanup_ucontext(struct ib_uverbs_file *file,
 	list_for_each_entry_safe(uobj, tmp, &context->rule_list, list) {
 		struct ib_flow *flow_id = uobj->object;
 
-		idr_remove_uobj(&ib_uverbs_rule_idr, uobj);
+		idr_remove_uobj(uobj);
 		ib_destroy_flow(flow_id);
 		kfree(uobj);
 	}
@@ -261,7 +248,7 @@ static int ib_uverbs_cleanup_ucontext(struct ib_uverbs_file *file,
 		struct ib_uqp_object *uqp =
 			container_of(uobj, struct ib_uqp_object, uevent.uobject);
 
-		idr_remove_uobj(&ib_uverbs_qp_idr, uobj);
+		idr_remove_uobj(uobj);
 		if (qp != qp->real_qp) {
 			ib_close_qp(qp);
 		} else {
@@ -276,7 +263,7 @@ static int ib_uverbs_cleanup_ucontext(struct ib_uverbs_file *file,
 		struct ib_rwq_ind_table *rwq_ind_tbl = uobj->object;
 		struct ib_wq **ind_tbl = rwq_ind_tbl->ind_tbl;
 
-		idr_remove_uobj(&ib_uverbs_rwq_ind_tbl_idr, uobj);
+		idr_remove_uobj(uobj);
 		ib_destroy_rwq_ind_table(rwq_ind_tbl);
 		kfree(ind_tbl);
 		kfree(uobj);
@@ -287,7 +274,7 @@ static int ib_uverbs_cleanup_ucontext(struct ib_uverbs_file *file,
 		struct ib_uwq_object *uwq =
 			container_of(uobj, struct ib_uwq_object, uevent.uobject);
 
-		idr_remove_uobj(&ib_uverbs_wq_idr, uobj);
+		idr_remove_uobj(uobj);
 		ib_destroy_wq(wq);
 		ib_uverbs_release_uevent(file, &uwq->uevent);
 		kfree(uwq);
@@ -298,7 +285,7 @@ static int ib_uverbs_cleanup_ucontext(struct ib_uverbs_file *file,
 		struct ib_uevent_object *uevent =
 			container_of(uobj, struct ib_uevent_object, uobject);
 
-		idr_remove_uobj(&ib_uverbs_srq_idr, uobj);
+		idr_remove_uobj(uobj);
 		ib_destroy_srq(srq);
 		ib_uverbs_release_uevent(file, uevent);
 		kfree(uevent);
@@ -310,7 +297,7 @@ static int ib_uverbs_cleanup_ucontext(struct ib_uverbs_file *file,
 		struct ib_ucq_object *ucq =
 			container_of(uobj, struct ib_ucq_object, uobject);
 
-		idr_remove_uobj(&ib_uverbs_cq_idr, uobj);
+		idr_remove_uobj(uobj);
 		ib_destroy_cq(cq);
 		ib_uverbs_release_ucq(file, ev_file, ucq);
 		kfree(ucq);
@@ -319,7 +306,7 @@ static int ib_uverbs_cleanup_ucontext(struct ib_uverbs_file *file,
 	list_for_each_entry_safe(uobj, tmp, &context->mr_list, list) {
 		struct ib_mr *mr = uobj->object;
 
-		idr_remove_uobj(&ib_uverbs_mr_idr, uobj);
+		idr_remove_uobj(uobj);
 		ib_dereg_mr(mr);
 		kfree(uobj);
 	}
@@ -330,7 +317,7 @@ static int ib_uverbs_cleanup_ucontext(struct ib_uverbs_file *file,
 		struct ib_uxrcd_object *uxrcd =
 			container_of(uobj, struct ib_uxrcd_object, uobject);
 
-		idr_remove_uobj(&ib_uverbs_xrcd_idr, uobj);
+		idr_remove_uobj(uobj);
 		ib_uverbs_dealloc_xrcd(file->device, xrcd);
 		kfree(uxrcd);
 	}
@@ -339,7 +326,7 @@ static int ib_uverbs_cleanup_ucontext(struct ib_uverbs_file *file,
 	list_for_each_entry_safe(uobj, tmp, &context->pd_list, list) {
 		struct ib_pd *pd = uobj->object;
 
-		idr_remove_uobj(&ib_uverbs_pd_idr, uobj);
+		idr_remove_uobj(uobj);
 		ib_dealloc_pd(pd);
 		kfree(uobj);
 	}
@@ -1375,13 +1362,6 @@ static void __exit ib_uverbs_cleanup(void)
 	unregister_chrdev_region(IB_UVERBS_BASE_DEV, IB_UVERBS_MAX_DEVICES);
 	if (overflow_maj)
 		unregister_chrdev_region(overflow_maj, IB_UVERBS_MAX_DEVICES);
-	idr_destroy(&ib_uverbs_pd_idr);
-	idr_destroy(&ib_uverbs_mr_idr);
-	idr_destroy(&ib_uverbs_mw_idr);
-	idr_destroy(&ib_uverbs_ah_idr);
-	idr_destroy(&ib_uverbs_cq_idr);
-	idr_destroy(&ib_uverbs_qp_idr);
-	idr_destroy(&ib_uverbs_srq_idr);
 }
 
 module_init(ib_uverbs_init);
diff --git a/include/rdma/ib_verbs.h b/include/rdma/ib_verbs.h
index d3fba0a..b5d2075 100644
--- a/include/rdma/ib_verbs.h
+++ b/include/rdma/ib_verbs.h
@@ -1835,6 +1835,10 @@ struct ib_device {
 
 	struct iw_cm_verbs	     *iwcm;
 
+	struct idr		idr;
+	/* Lock protecting the per-device IDR */
+	spinlock_t		idr_lock;
+
 	/**
 	 * alloc_hw_stats - Allocate a struct rdma_hw_stats and fill in the
 	 *   driver initialized data.  The struct is kfree()'ed by the sysfs
-- 
1.8.3.1

--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [RFC ABI V6 02/14] IB/core: Add support for custom types
       [not found] ` <1481461088-56355-1-git-send-email-matanb-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
  2016-12-11 12:57   ` [RFC ABI V6 01/14] IB/core: Refactor IDR to be per-device Matan Barak
@ 2016-12-11 12:57   ` Matan Barak
       [not found]     ` <1481461088-56355-3-git-send-email-matanb-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
  2016-12-11 12:57   ` [RFC ABI V6 03/14] IB/core: Add generic ucontext initialization and teardown Matan Barak
                     ` (12 subsequent siblings)
  14 siblings, 1 reply; 20+ messages in thread
From: Matan Barak @ 2016-12-11 12:57 UTC (permalink / raw)
  To: linux-rdma-u79uwXL29TY76Z2rM5mHXA
  Cc: Doug Ledford, Jason Gunthorpe, Sean Hefty, Christoph Lameter,
	Liran Liss, Haggai Eran, Majd Dibbiny, Matan Barak, Tal Alon,
	Leon Romanovsky

The new ioctl infrastructure supports driver-specific objects.
Each such object type has a free function, an allocation size and an
order of destruction. This information is embedded in the same
table that describes the various actions allowed on the object,
similar to object-oriented programming.

When a ucontext is created, a new list is created in this ib_ucontext.
This list contains all objects created under this ib_ucontext.
When an ib_ucontext is destroyed, we traverse this list several times,
destroying the various objects in the order given by their object type
descriptions. If several object types share the same destruction order,
they are destroyed in the reverse of their creation order.

Adding an object is done in two parts.
First, an object is allocated and added to the IDR/fd table. Then, the
command's handler (in downstream patches) can work on this object and
fill in its required details.
After a successful command, ib_uverbs_uobject_enable is called and
this user object becomes visible to the ucontext.

Removing a uobject is done by calling ib_uverbs_uobject_remove.

We must make sure that the IDR (per-device) and the list (per-ucontext)
can be accessed concurrently without being corrupted.

Signed-off-by: Matan Barak <matanb-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
Signed-off-by: Haggai Eran <haggaie-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
Signed-off-by: Leon Romanovsky <leonro-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
---
 drivers/infiniband/core/Makefile      |   3 +-
 drivers/infiniband/core/device.c      |   1 +
 drivers/infiniband/core/rdma_core.c   | 397 ++++++++++++++++++++++++++++++++++
 drivers/infiniband/core/rdma_core.h   |  71 ++++++
 drivers/infiniband/core/uverbs.h      |   1 +
 drivers/infiniband/core/uverbs_main.c |   2 +-
 include/rdma/ib_verbs.h               |  22 +-
 include/rdma/uverbs_ioctl.h           | 218 +++++++++++++++++++
 8 files changed, 710 insertions(+), 5 deletions(-)
 create mode 100644 drivers/infiniband/core/rdma_core.c
 create mode 100644 drivers/infiniband/core/rdma_core.h
 create mode 100644 include/rdma/uverbs_ioctl.h

diff --git a/drivers/infiniband/core/Makefile b/drivers/infiniband/core/Makefile
index edaae9f..1819623 100644
--- a/drivers/infiniband/core/Makefile
+++ b/drivers/infiniband/core/Makefile
@@ -28,4 +28,5 @@ ib_umad-y :=			user_mad.o
 
 ib_ucm-y :=			ucm.o
 
-ib_uverbs-y :=			uverbs_main.o uverbs_cmd.o uverbs_marshall.o
+ib_uverbs-y :=			uverbs_main.o uverbs_cmd.o uverbs_marshall.o \
+				rdma_core.o
diff --git a/drivers/infiniband/core/device.c b/drivers/infiniband/core/device.c
index c3b68f5..43994b1 100644
--- a/drivers/infiniband/core/device.c
+++ b/drivers/infiniband/core/device.c
@@ -243,6 +243,7 @@ struct ib_device *ib_alloc_device(size_t size)
 	spin_lock_init(&device->client_data_lock);
 	INIT_LIST_HEAD(&device->client_data_list);
 	INIT_LIST_HEAD(&device->port_list);
+	INIT_LIST_HEAD(&device->type_list);
 
 	return device;
 }
diff --git a/drivers/infiniband/core/rdma_core.c b/drivers/infiniband/core/rdma_core.c
new file mode 100644
index 0000000..398b61f
--- /dev/null
+++ b/drivers/infiniband/core/rdma_core.c
@@ -0,0 +1,397 @@
+/*
+ * Copyright (c) 2016, Mellanox Technologies inc.  All rights reserved.
+ *
+ * This software is available to you under a choice of one of two
+ * licenses.  You may choose to be licensed under the terms of the GNU
+ * General Public License (GPL) Version 2, available from the file
+ * COPYING in the main directory of this source tree, or the
+ * OpenIB.org BSD license below:
+ *
+ *     Redistribution and use in source and binary forms, with or
+ *     without modification, are permitted provided that the following
+ *     conditions are met:
+ *
+ *      - Redistributions of source code must retain the above
+ *        copyright notice, this list of conditions and the following
+ *        disclaimer.
+ *
+ *      - Redistributions in binary form must reproduce the above
+ *        copyright notice, this list of conditions and the following
+ *        disclaimer in the documentation and/or other materials
+ *        provided with the distribution.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
+ * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+ * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
+ * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
+ * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
+ * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
+ * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+ * SOFTWARE.
+ */
+
+#include <linux/file.h>
+#include <linux/anon_inodes.h>
+#include <rdma/ib_verbs.h>
+#include "uverbs.h"
+#include "rdma_core.h"
+#include <rdma/uverbs_ioctl.h>
+
+static int uverbs_lock_object(struct ib_uobject *uobj,
+			      enum uverbs_idr_access access)
+{
+	if (access == UVERBS_IDR_ACCESS_READ)
+		return down_read_trylock(&uobj->usecnt) == 1 ? 0 : -EBUSY;
+
+	/* lock is either WRITE or DESTROY - should be exclusive */
+	return down_write_trylock(&uobj->usecnt) == 1 ? 0 : -EBUSY;
+}
+
+static struct ib_uobject *get_uobj(int id, struct ib_ucontext *context)
+{
+	struct ib_uobject *uobj;
+
+	rcu_read_lock();
+	uobj = idr_find(&context->device->idr, id);
+	if (uobj) {
+		if (uobj->context != context)
+			uobj = NULL;
+	}
+	rcu_read_unlock();
+
+	return uobj;
+}
+
+bool uverbs_is_live(struct ib_uobject *uobj)
+{
+	return uobj == get_uobj(uobj->id, uobj->context);
+}
+
+struct ib_ucontext_lock {
+	struct kref  ref;
+	/* locking the uobjects_list */
+	struct mutex lock;
+};
+
+static void release_uobjects_list_lock(struct kref *ref)
+{
+	struct ib_ucontext_lock *lock = container_of(ref,
+						     struct ib_ucontext_lock,
+						     ref);
+
+	kfree(lock);
+}
+
+static void init_uobj(struct ib_uobject *uobj, struct ib_ucontext *context)
+{
+	init_rwsem(&uobj->usecnt);
+	uobj->context     = context;
+}
+
+static int add_uobj(struct ib_uobject *uobj)
+{
+	int ret;
+
+	idr_preload(GFP_KERNEL);
+	spin_lock(&uobj->context->device->idr_lock);
+
+	/* The uobject will be replaced with the actual one when we commit */
+	ret = idr_alloc(&uobj->context->device->idr, NULL, 0, 0, GFP_NOWAIT);
+	if (ret >= 0)
+		uobj->id = ret;
+
+	spin_unlock(&uobj->context->device->idr_lock);
+	idr_preload_end();
+
+	return ret < 0 ? ret : 0;
+}
+
+static void remove_uobj(struct ib_uobject *uobj)
+{
+	spin_lock(&uobj->context->device->idr_lock);
+	idr_remove(&uobj->context->device->idr, uobj->id);
+	spin_unlock(&uobj->context->device->idr_lock);
+}
+
+static void put_uobj(struct ib_uobject *uobj)
+{
+	kfree_rcu(uobj, rcu);
+}
+
+static struct ib_uobject *get_uobject_from_context(struct ib_ucontext *ucontext,
+						   const struct uverbs_type_alloc_action *type,
+						   u32 idr,
+						   enum uverbs_idr_access access)
+{
+	struct ib_uobject *uobj;
+	int ret;
+
+	rcu_read_lock();
+	uobj = get_uobj(idr, ucontext);
+	if (!uobj)
+		goto free;
+
+	if (uobj->type != type) {
+		uobj = NULL;
+		goto free;
+	}
+
+	ret = uverbs_lock_object(uobj, access);
+	if (ret)
+		uobj = ERR_PTR(ret);
+free:
+	rcu_read_unlock();
+	return uobj;
+}
+
+static int ib_uverbs_uobject_add(struct ib_uobject *uobject,
+				 const struct uverbs_type_alloc_action *uobject_type)
+{
+	uobject->type = uobject_type;
+	return add_uobj(uobject);
+}
+
+struct ib_uobject *uverbs_get_type_from_idr(const struct uverbs_type_alloc_action *type,
+					    struct ib_ucontext *ucontext,
+					    enum uverbs_idr_access access,
+					    uint32_t idr)
+{
+	struct ib_uobject *uobj;
+	int ret;
+
+	if (access == UVERBS_IDR_ACCESS_NEW) {
+		uobj = kmalloc(type->obj_size, GFP_KERNEL);
+		if (!uobj)
+			return ERR_PTR(-ENOMEM);
+
+		init_uobj(uobj, ucontext);
+
+		/* lock idr */
+		ret = ib_uverbs_uobject_add(uobj, type);
+		if (ret) {
+			kfree(uobj);
+			return ERR_PTR(ret);
+		}
+
+	} else {
+		uobj = get_uobject_from_context(ucontext, type, idr,
+						access);
+
+		if (!uobj)
+			return ERR_PTR(-ENOENT);
+	}
+
+	return uobj;
+}
+
+struct ib_uobject *uverbs_get_type_from_fd(const struct uverbs_type_alloc_action *type,
+					   struct ib_ucontext *ucontext,
+					   enum uverbs_idr_access access,
+					   int fd)
+{
+	if (access == UVERBS_IDR_ACCESS_NEW) {
+		int _fd;
+		struct ib_uobject *uobj = NULL;
+		struct file *filp;
+
+		if (WARN_ON(type->obj_size < sizeof(struct ib_uobject)))
+			return ERR_PTR(-EINVAL);
+
+		_fd = get_unused_fd_flags(O_CLOEXEC);
+		if (_fd < 0)
+			return ERR_PTR(_fd);
+
+		uobj = kmalloc(type->obj_size, GFP_KERNEL);
+		if (!uobj)
+			return ERR_PTR(-ENOMEM);
+
+		init_uobj(uobj, ucontext);
+
+		filp = anon_inode_getfile(type->fd.name, type->fd.fops,
+					  uobj + 1, type->fd.flags);
+		if (IS_ERR(filp)) {
+			put_unused_fd(_fd);
+			kfree(uobj);
+			return ERR_CAST(filp);
+		}
+
+		uobj->type = type;
+		uobj->id = _fd;
+		uobj->object = filp;
+
+		return uobj;
+	} else if (access == UVERBS_IDR_ACCESS_READ) {
+		struct file *f = fget(fd);
+		struct ib_uobject *uobject;
+
+		if (!f)
+			return ERR_PTR(-EBADF);
+
+		uobject = f->private_data - sizeof(struct ib_uobject);
+		if (f->f_op != type->fd.fops ||
+		    !uobject->context) {
+			fput(f);
+			return ERR_PTR(-EBADF);
+		}
+
+		/*
+		 * No need to protect it with a ref count, as fget increases
+		 * f_count.
+		 */
+		return uobject;
+	} else {
+		return ERR_PTR(-EOPNOTSUPP);
+	}
+}
+
+static void ib_uverbs_uobject_enable(struct ib_uobject *uobject)
+{
+	mutex_lock(&uobject->context->uobjects_lock->lock);
+	list_add(&uobject->list, &uobject->context->uobjects);
+	mutex_unlock(&uobject->context->uobjects_lock->lock);
+	spin_lock(&uobject->context->device->idr_lock);
+	idr_replace(&uobject->context->device->idr, uobject, uobject->id);
+	spin_unlock(&uobject->context->device->idr_lock);
+}
+
+static void ib_uverbs_uobject_remove(struct ib_uobject *uobject, bool lock)
+{
+	/*
+	 * Calling remove requires exclusive access, so no other thread
+	 * can be using our object.
+	 */
+	remove_uobj(uobject);
+	if (lock)
+		mutex_lock(&uobject->context->uobjects_lock->lock);
+	list_del(&uobject->list);
+	if (lock)
+		mutex_unlock(&uobject->context->uobjects_lock->lock);
+	put_uobj(uobject);
+}
+
+static void uverbs_commit_idr(struct ib_uobject *uobj,
+			      enum uverbs_idr_access access,
+			      bool success)
+{
+	switch (access) {
+	case UVERBS_IDR_ACCESS_READ:
+		up_read(&uobj->usecnt);
+		break;
+	case UVERBS_IDR_ACCESS_NEW:
+		if (success) {
+			ib_uverbs_uobject_enable(uobj);
+		} else {
+			remove_uobj(uobj);
+			put_uobj(uobj);
+		}
+		break;
+	case UVERBS_IDR_ACCESS_WRITE:
+		up_write(&uobj->usecnt);
+		break;
+	case UVERBS_IDR_ACCESS_DESTROY:
+		if (success)
+			ib_uverbs_uobject_remove(uobj, true);
+		else
+			up_write(&uobj->usecnt);
+		break;
+	}
+}
+
+static void uverbs_commit_fd(struct ib_uobject *uobj,
+			     enum uverbs_idr_access access,
+			     bool success)
+{
+	struct file *filp = uobj->object;
+
+	if (access == UVERBS_IDR_ACCESS_NEW) {
+		if (success) {
+			kref_get(&uobj->context->ufile->ref);
+			uobj->uobjects_lock = uobj->context->uobjects_lock;
+			kref_get(&uobj->uobjects_lock->ref);
+			ib_uverbs_uobject_enable(uobj);
+			fd_install(uobj->id, uobj->object);
+		} else {
+			fput(uobj->object);
+			put_unused_fd(uobj->id);
+			kfree(uobj);
+		}
+	} else {
+		fput(filp);
+	}
+}
+
+static void _uverbs_commit_object(struct ib_uobject *uobj,
+				  enum uverbs_idr_access access,
+				  bool success)
+{
+	if (uobj->type->type == UVERBS_ATTR_TYPE_IDR)
+		uverbs_commit_idr(uobj, access, success);
+	else if (uobj->type->type == UVERBS_ATTR_TYPE_FD)
+		uverbs_commit_fd(uobj, access, success);
+	else
+		WARN_ON(true);
+}
+
+void uverbs_commit_object(struct ib_uobject *uobj,
+			  enum uverbs_idr_access access)
+{
+	return _uverbs_commit_object(uobj, access, true);
+}
+
+void uverbs_rollback_object(struct ib_uobject *uobj,
+			    enum uverbs_idr_access access)
+{
+	return _uverbs_commit_object(uobj, access, false);
+}
+
+void ib_uverbs_close_fd(struct file *f)
+{
+	struct ib_uobject *uobject = f->private_data - sizeof(struct ib_uobject);
+
+	mutex_lock(&uobject->uobjects_lock->lock);
+	if (uobject->context) {
+		list_del(&uobject->list);
+		kref_put(&uobject->context->ufile->ref, ib_uverbs_release_file);
+		uobject->context = NULL;
+	}
+	mutex_unlock(&uobject->uobjects_lock->lock);
+	kref_put(&uobject->uobjects_lock->ref, release_uobjects_list_lock);
+}
+
+void ib_uverbs_cleanup_fd(void *private_data)
+{
+	struct ib_uobject *uobject = private_data - sizeof(struct ib_uobject);
+
+	kfree(uobject);
+}
+
+void uverbs_commit_objects(struct uverbs_attr_array *attr_array,
+			   size_t num,
+			   const struct uverbs_action *action,
+			   bool success)
+{
+	unsigned int i;
+
+	for (i = 0; i < num; i++) {
+		struct uverbs_attr_array *attr_spec_array = &attr_array[i];
+		const struct uverbs_attr_spec_group *attr_spec_group =
+			action->attr_groups[i];
+		unsigned int j;
+
+		for (j = 0; j < attr_spec_array->num_attrs; j++) {
+			struct uverbs_attr *attr = &attr_spec_array->attrs[j];
+			struct uverbs_attr_spec *spec = &attr_spec_group->attrs[j];
+
+			if (!uverbs_is_valid(attr_spec_array, j))
+				continue;
+
+			if (spec->type == UVERBS_ATTR_TYPE_IDR ||
+			    spec->type == UVERBS_ATTR_TYPE_FD)
+				/*
+				 * refcounts should be handled at the object
+				 * level and not at the uobject level.
+				 */
+				_uverbs_commit_object(attr->obj_attr.uobject,
+						      spec->obj.access, success);
+		}
+	}
+}
diff --git a/drivers/infiniband/core/rdma_core.h b/drivers/infiniband/core/rdma_core.h
new file mode 100644
index 0000000..0bb4be3
--- /dev/null
+++ b/drivers/infiniband/core/rdma_core.h
@@ -0,0 +1,71 @@
+/*
+ * Copyright (c) 2005 Topspin Communications.  All rights reserved.
+ * Copyright (c) 2005, 2006 Cisco Systems.  All rights reserved.
+ * Copyright (c) 2005-2016 Mellanox Technologies. All rights reserved.
+ * Copyright (c) 2005 Voltaire, Inc. All rights reserved.
+ * Copyright (c) 2005 PathScale, Inc. All rights reserved.
+ *
+ * This software is available to you under a choice of one of two
+ * licenses.  You may choose to be licensed under the terms of the GNU
+ * General Public License (GPL) Version 2, available from the file
+ * COPYING in the main directory of this source tree, or the
+ * OpenIB.org BSD license below:
+ *
+ *     Redistribution and use in source and binary forms, with or
+ *     without modification, are permitted provided that the following
+ *     conditions are met:
+ *
+ *      - Redistributions of source code must retain the above
+ *        copyright notice, this list of conditions and the following
+ *        disclaimer.
+ *
+ *      - Redistributions in binary form must reproduce the above
+ *        copyright notice, this list of conditions and the following
+ *        disclaimer in the documentation and/or other materials
+ *        provided with the distribution.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
+ * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+ * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
+ * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
+ * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
+ * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
+ * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+ * SOFTWARE.
+ */
+
+#ifndef RDMA_CORE_H
+#define RDMA_CORE_H
+
+#include <linux/idr.h>
+#include <rdma/uverbs_ioctl.h>
+#include <rdma/ib_verbs.h>
+#include <linux/mutex.h>
+
+struct ib_uobject *uverbs_get_type_from_idr(const struct uverbs_type_alloc_action *type,
+					    struct ib_ucontext *ucontext,
+					    enum uverbs_idr_access access,
+					    uint32_t idr);
+struct ib_uobject *uverbs_get_type_from_fd(const struct uverbs_type_alloc_action *type,
+					   struct ib_ucontext *ucontext,
+					   enum uverbs_idr_access access,
+					   int fd);
+bool uverbs_is_live(struct ib_uobject *uobj);
+void uverbs_rollback_object(struct ib_uobject *uobj,
+			    enum uverbs_idr_access access);
+void uverbs_commit_object(struct ib_uobject *uobj,
+				 enum uverbs_idr_access access);
+void uverbs_commit_objects(struct uverbs_attr_array *attr_array,
+			   size_t num,
+			   const struct uverbs_action *action,
+			   bool success);
+
+void ib_uverbs_close_fd(struct file *f);
+void ib_uverbs_cleanup_fd(void *private_data);
+
+static inline void *uverbs_fd_to_priv(struct ib_uobject *uobj)
+{
+	return uobj + 1;
+}
+
+#endif /* RDMA_CORE_H */
diff --git a/drivers/infiniband/core/uverbs.h b/drivers/infiniband/core/uverbs.h
index 8074705..ae7d4b8 100644
--- a/drivers/infiniband/core/uverbs.h
+++ b/drivers/infiniband/core/uverbs.h
@@ -180,6 +180,7 @@ void idr_remove_uobj(struct ib_uobject *uobj);
 struct file *ib_uverbs_alloc_event_file(struct ib_uverbs_file *uverbs_file,
 					struct ib_device *ib_dev,
 					int is_async);
+void ib_uverbs_release_file(struct kref *ref);
 void ib_uverbs_free_async_event_file(struct ib_uverbs_file *uverbs_file);
 struct ib_uverbs_event_file *ib_uverbs_lookup_comp_file(int fd);
 
diff --git a/drivers/infiniband/core/uverbs_main.c b/drivers/infiniband/core/uverbs_main.c
index f783723..e63357a 100644
--- a/drivers/infiniband/core/uverbs_main.c
+++ b/drivers/infiniband/core/uverbs_main.c
@@ -341,7 +341,7 @@ static void ib_uverbs_comp_dev(struct ib_uverbs_device *dev)
 	complete(&dev->comp);
 }
 
-static void ib_uverbs_release_file(struct kref *ref)
+void ib_uverbs_release_file(struct kref *ref)
 {
 	struct ib_uverbs_file *file =
 		container_of(ref, struct ib_uverbs_file, ref);
diff --git a/include/rdma/ib_verbs.h b/include/rdma/ib_verbs.h
index b5d2075..282b0ba 100644
--- a/include/rdma/ib_verbs.h
+++ b/include/rdma/ib_verbs.h
@@ -1329,8 +1329,11 @@ struct ib_fmr_attr {
 
 struct ib_umem;
 
+struct ib_ucontext_lock;
+
 struct ib_ucontext {
 	struct ib_device       *device;
+	struct ib_uverbs_file  *ufile;
 	struct list_head	pd_list;
 	struct list_head	mr_list;
 	struct list_head	mw_list;
@@ -1344,6 +1347,10 @@ struct ib_ucontext {
 	struct list_head	rwq_ind_tbl_list;
 	int			closing;
 
+	/* lock for uobjects list */
+	struct ib_ucontext_lock	*uobjects_lock;
+	struct list_head	uobjects;
+
 	struct pid             *tgid;
 #ifdef CONFIG_INFINIBAND_ON_DEMAND_PAGING
 	struct rb_root      umem_tree;
@@ -1363,16 +1370,22 @@ struct ib_ucontext {
 #endif
 };
 
+struct uverbs_object_list;
+
 struct ib_uobject {
 	u64			user_handle;	/* handle given to us by userspace */
 	struct ib_ucontext     *context;	/* associated user context */
 	void		       *object;		/* containing object */
 	struct list_head	list;		/* link to context's list */
-	int			id;		/* index into kernel idr */
-	struct kref		ref;
-	struct rw_semaphore	mutex;		/* protects .live */
+	int			id;		/* index into kernel idr/fd */
+	struct kref             ref;
+	struct rw_semaphore	usecnt;		/* protects exclusive access */
+	struct rw_semaphore     mutex;          /* protects .live */
 	struct rcu_head		rcu;		/* kfree_rcu() overhead */
 	int			live;
+
+	const struct uverbs_type_alloc_action *type;
+	struct ib_ucontext_lock	*uobjects_lock;
 };
 
 struct ib_udata {
@@ -2101,6 +2114,9 @@ struct ib_device {
 	 */
 	int (*get_port_immutable)(struct ib_device *, u8, struct ib_port_immutable *);
 	void (*get_dev_fw_str)(struct ib_device *, char *str, size_t str_len);
+	struct list_head type_list;
+
+	const struct uverbs_types_group	*types_group;
 };
 
 struct ib_client {
diff --git a/include/rdma/uverbs_ioctl.h b/include/rdma/uverbs_ioctl.h
new file mode 100644
index 0000000..382321b
--- /dev/null
+++ b/include/rdma/uverbs_ioctl.h
@@ -0,0 +1,218 @@
+/*
+ * Copyright (c) 2016, Mellanox Technologies inc.  All rights reserved.
+ *
+ * This software is available to you under a choice of one of two
+ * licenses.  You may choose to be licensed under the terms of the GNU
+ * General Public License (GPL) Version 2, available from the file
+ * COPYING in the main directory of this source tree, or the
+ * OpenIB.org BSD license below:
+ *
+ *     Redistribution and use in source and binary forms, with or
+ *     without modification, are permitted provided that the following
+ *     conditions are met:
+ *
+ *      - Redistributions of source code must retain the above
+ *        copyright notice, this list of conditions and the following
+ *        disclaimer.
+ *
+ *      - Redistributions in binary form must reproduce the above
+ *        copyright notice, this list of conditions and the following
+ *        disclaimer in the documentation and/or other materials
+ *        provided with the distribution.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
+ * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+ * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
+ * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
+ * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
+ * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
+ * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+ * SOFTWARE.
+ */
+
+#ifndef _UVERBS_IOCTL_
+#define _UVERBS_IOCTL_
+
+#include <linux/kernel.h>
+
+struct uverbs_object_type;
+struct ib_ucontext;
+struct ib_uobject;
+struct ib_device;
+struct uverbs_uobject_type;
+
+/*
+ * =======================================
+ *	Verbs action specifications
+ * =======================================
+ */
+
+#define UVERBS_ID_RESERVED_MASK 0xF000
+#define UVERBS_ID_RESERVED_SHIFT 12
+
+enum uverbs_attr_type {
+	UVERBS_ATTR_TYPE_NA,
+	UVERBS_ATTR_TYPE_PTR_IN,
+	UVERBS_ATTR_TYPE_PTR_OUT,
+	UVERBS_ATTR_TYPE_IDR,
+	UVERBS_ATTR_TYPE_FD,
+	UVERBS_ATTR_TYPE_FLAG,
+};
+
+enum uverbs_idr_access {
+	UVERBS_IDR_ACCESS_READ,
+	UVERBS_IDR_ACCESS_WRITE,
+	UVERBS_IDR_ACCESS_NEW,
+	UVERBS_IDR_ACCESS_DESTROY
+};
+
+enum uverbs_attr_spec_flags {
+	UVERBS_ATTR_SPEC_F_MANDATORY	= 1U << 0,
+	UVERBS_ATTR_SPEC_F_MIN_SZ	= 1U << 1,
+};
+
+struct uverbs_attr_spec {
+	enum uverbs_attr_type		type;
+	u8				flags;
+	union {
+		u16				len;
+		struct {
+			u16			obj_type;
+			u8			access;
+		} obj;
+		struct {
+			/* flags are always 64bits */
+			u64			mask;
+		} flag;
+	};
+};
+
+struct uverbs_attr_spec_group {
+	struct uverbs_attr_spec		*attrs;
+	size_t				num_attrs;
+	/* populate at runtime */
+	unsigned long			*mandatory_attrs_bitmask;
+};
+
+struct uverbs_attr_array;
+struct ib_uverbs_file;
+
+enum uverbs_action_flags {
+	UVERBS_ACTION_FLAG_CREATE_ROOT = 1 << 0,
+};
+
+struct uverbs_action {
+	const struct uverbs_attr_spec_group		**attr_groups;
+	size_t						num_groups;
+	u32 flags;
+	int (*handler)(struct ib_device *ib_dev, struct ib_uverbs_file *ufile,
+		       struct uverbs_attr_array *ctx, size_t num);
+	u16 num_child_attrs;
+};
+
+struct uverbs_type_alloc_action;
+typedef void (*free_type)(const struct uverbs_type_alloc_action *uobject_type,
+			  struct ib_uobject *uobject);
+
+struct uverbs_type_alloc_action {
+	enum uverbs_attr_type		type;
+	int				order;
+	size_t				obj_size;
+	free_type			free_fn;
+	struct {
+		const struct file_operations	*fops;
+		const char			*name;
+		int				flags;
+	} fd;
+};
+
+struct uverbs_action_group {
+	size_t					num_actions;
+	const struct uverbs_action		**actions;
+};
+
+struct uverbs_type {
+	size_t					num_groups;
+	const struct uverbs_action_group	**action_groups;
+	const struct uverbs_type_alloc_action	*alloc;
+};
+
+struct uverbs_type_group {
+	size_t					num_types;
+	const struct uverbs_type		**types;
+};
+
+struct uverbs_root {
+	const struct uverbs_type_group		**type_groups;
+	size_t					num_groups;
+};
+
+/* =================================================
+ *              Parsing infrastructure
+ * =================================================
+ */
+
+struct uverbs_ptr_attr {
+	void __user	*ptr;
+	u16		len;
+};
+
+struct uverbs_fd_attr {
+	int		fd;
+};
+
+struct uverbs_uobj_attr {
+	/*  idr handle */
+	u32	idr;
+};
+
+struct uverbs_flag_attr {
+	u64	flags;
+};
+
+struct uverbs_obj_attr {
+	/* pointer to the kernel descriptor -> type, access, etc */
+	struct ib_uverbs_attr __user	*uattr;
+	const struct uverbs_type_alloc_action	*type;
+	struct ib_uobject		*uobject;
+	union {
+		struct uverbs_fd_attr		fd;
+		struct uverbs_uobj_attr		uobj;
+	};
+};
+
+struct uverbs_attr {
+	union {
+		struct uverbs_ptr_attr	cmd_attr;
+		struct uverbs_obj_attr	obj_attr;
+		struct uverbs_flag_attr flag_attr;
+	};
+};
+
+/* output of one validator */
+struct uverbs_attr_array {
+	unsigned long *valid_bitmap;
+	size_t num_attrs;
+	/* arrays of attributes, index is the id, i.e. SEND_CQ */
+	struct uverbs_attr *attrs;
+};
+
+static inline bool uverbs_is_valid(const struct uverbs_attr_array *attr_array,
+				   unsigned int idx)
+{
+	return test_bit(idx, attr_array->valid_bitmap);
+}
+
+/* =================================================
+ *              Types infrastructure
+ * =================================================
+ */
+
+int ib_uverbs_uobject_type_add(struct list_head	*head,
+			       void (*free)(struct uverbs_uobject_type *type,
+					    struct ib_uobject *uobject,
+					    struct ib_ucontext *ucontext),
+			       uint16_t	obj_type);
+void ib_uverbs_uobject_types_remove(struct ib_device *ib_dev);
+
+#endif
-- 
1.8.3.1

--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [RFC ABI V6 03/14] IB/core: Add generic ucontext initialization and teardown
       [not found] ` <1481461088-56355-1-git-send-email-matanb-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
  2016-12-11 12:57   ` [RFC ABI V6 01/14] IB/core: Refactor IDR to be per-device Matan Barak
  2016-12-11 12:57   ` [RFC ABI V6 02/14] IB/core: Add support for custom types Matan Barak
@ 2016-12-11 12:57   ` Matan Barak
  2016-12-11 12:57   ` [RFC ABI V6 04/14] IB/core: Add macros for declaring types and type groups Matan Barak
                     ` (11 subsequent siblings)
  14 siblings, 0 replies; 20+ messages in thread
From: Matan Barak @ 2016-12-11 12:57 UTC (permalink / raw)
  To: linux-rdma-u79uwXL29TY76Z2rM5mHXA
  Cc: Doug Ledford, Jason Gunthorpe, Sean Hefty, Christoph Lameter,
	Liran Liss, Haggai Eran, Majd Dibbiny, Matan Barak, Tal Alon,
	Leon Romanovsky

When a ucontext is created, we need to initialize the list of objects.
This list consists of every user object that is associated with
this ucontext. The possible elements in this list are either a file
descriptor or an object which is represented by an IDR.
Every such object has a release function (which is called upon
object destruction) and a number denoting its release order.

When a ucontext is destroyed, the list is traversed while holding a
lock. This lock is necessary since a user might concurrently try to
close an FD file [s]he created, which also exists in this list.
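
For illustration, the per-order teardown described above can be sketched
as a standalone mock in plain C. The mock_uobject fields, function names
and order values below are illustrative only, not the kernel structures:

```c
#include <assert.h>
#include <stddef.h>

/*
 * Standalone mock of ordered ucontext teardown: every object carries a
 * release order, and cleanup makes one pass over the list per order
 * value, so e.g. a CQ (order 0) is released before the PD (order 2)
 * it ultimately depends on.
 */
struct mock_uobject {
	int order;                 /* release order; lower is freed first */
	int destroyed_at;          /* position in the destruction sequence */
	struct mock_uobject *next; /* the ucontext's object list */
};

static int mock_max_order(const struct mock_uobject *obj)
{
	int max = 0;

	for (; obj; obj = obj->next)
		if (obj->order > max)
			max = obj->order;
	return max;
}

/* Traverse the list once per order, "releasing" matching objects. */
static void mock_cleanup_ucontext(struct mock_uobject *head)
{
	int i, seq = 0, max = mock_max_order(head);
	struct mock_uobject *obj;

	for (i = 0; i <= max; i++)
		for (obj = head; obj; obj = obj->next)
			if (obj->order == i)
				obj->destroyed_at = seq++;
}
```

The list itself stays unordered; ordering cost is paid only once, at
teardown, by the repeated passes.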

Signed-off-by: Matan Barak <matanb-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
---
 drivers/infiniband/core/rdma_core.c | 87 +++++++++++++++++++++++++++++++++++++
 drivers/infiniband/core/rdma_core.h |  4 ++
 2 files changed, 91 insertions(+)

diff --git a/drivers/infiniband/core/rdma_core.c b/drivers/infiniband/core/rdma_core.c
index 398b61f..01221c0 100644
--- a/drivers/infiniband/core/rdma_core.c
+++ b/drivers/infiniband/core/rdma_core.c
@@ -73,6 +73,12 @@ struct ib_ucontext_lock {
 	struct mutex lock;
 };
 
+static void init_uobjects_list_lock(struct ib_ucontext_lock *lock)
+{
+	mutex_init(&lock->lock);
+	kref_init(&lock->ref);
+}
+
 static void release_uobjects_list_lock(struct kref *ref)
 {
 	struct ib_ucontext_lock *lock = container_of(ref,
@@ -343,6 +349,20 @@ void uverbs_rollback_object(struct ib_uobject *uobj,
 	return _uverbs_commit_object(uobj, access, false);
 }
 
+static void ib_uverbs_remove_fd(struct ib_uobject *uobject)
+{
+	/*
+	 * user should release the uobject in the release
+	 * callback.
+	 */
+	if (uobject->context) {
+		list_del(&uobject->list);
+		uobject->type->free_fn(uobject->type, uobject);
+		kref_put(&uobject->context->ufile->ref, ib_uverbs_release_file);
+		uobject->context = NULL;
+	}
+}
+
 void ib_uverbs_close_fd(struct file *f)
 {
 	struct ib_uobject *uobject = f->private_data - sizeof(struct ib_uobject);
@@ -395,3 +415,70 @@ void uverbs_commit_objects(struct uverbs_attr_array *attr_array,
 		}
 	}
 }
+
+static unsigned int get_type_orders(const struct uverbs_root *root)
+{
+	unsigned int i;
+	unsigned int max = 0;
+
+	for (i = 0; i < root->num_groups; i++) {
+		unsigned int j;
+		const struct uverbs_type_group *types = root->type_groups[i];
+
+		for (j = 0; j < types->num_types; j++) {
+			if (!types->types[j] || !types->types[j]->alloc)
+				continue;
+			if (types->types[j]->alloc->order > max)
+				max = types->types[j]->alloc->order;
+		}
+	}
+
+	return max;
+}
+
+void ib_uverbs_uobject_type_cleanup_ucontext(struct ib_ucontext *ucontext,
+					     const struct uverbs_root *root)
+{
+	unsigned int num_orders = get_type_orders(root);
+	unsigned int i;
+
+	for (i = 0; i <= num_orders; i++) {
+		struct ib_uobject *obj, *next_obj;
+
+		/*
+		 * The lock is taken since a concurrent close of an FD
+		 * file could remove its object from this list. Other
+		 * than that, cleanup is called after all commands
+		 * finished executing; newly executed commands should
+		 * fail.
+		 */
+		mutex_lock(&ucontext->uobjects_lock->lock);
+		list_for_each_entry_safe(obj, next_obj, &ucontext->uobjects,
+					 list)
+			if (obj->type->order == i) {
+				if (obj->type->type == UVERBS_ATTR_TYPE_IDR)
+					ib_uverbs_uobject_remove(obj, false);
+				else
+					ib_uverbs_remove_fd(obj);
+			}
+		mutex_unlock(&ucontext->uobjects_lock->lock);
+	}
+	kref_put(&ucontext->uobjects_lock->ref, release_uobjects_list_lock);
+}
+
+int ib_uverbs_uobject_type_initialize_ucontext(struct ib_ucontext *ucontext)
+{
+	ucontext->uobjects_lock = kmalloc(sizeof(*ucontext->uobjects_lock),
+					  GFP_KERNEL);
+	if (!ucontext->uobjects_lock)
+		return -ENOMEM;
+
+	init_uobjects_list_lock(ucontext->uobjects_lock);
+	INIT_LIST_HEAD(&ucontext->uobjects);
+
+	return 0;
+}
+
+void ib_uverbs_uobject_type_release_ucontext(struct ib_ucontext *ucontext)
+{
+	kfree(ucontext->uobjects_lock);
+}
diff --git a/drivers/infiniband/core/rdma_core.h b/drivers/infiniband/core/rdma_core.h
index 0bb4be3..9b91c1c 100644
--- a/drivers/infiniband/core/rdma_core.h
+++ b/drivers/infiniband/core/rdma_core.h
@@ -60,6 +60,10 @@ void uverbs_commit_objects(struct uverbs_attr_array *attr_array,
 			   const struct uverbs_action *action,
 			   bool success);
 
+void ib_uverbs_uobject_type_cleanup_ucontext(struct ib_ucontext *ucontext,
+					     const struct uverbs_root *root);
+int ib_uverbs_uobject_type_initialize_ucontext(struct ib_ucontext *ucontext);
+void ib_uverbs_uobject_type_release_ucontext(struct ib_ucontext *ucontext);
 void ib_uverbs_close_fd(struct file *f);
 void ib_uverbs_cleanup_fd(void *private_data);
 
-- 
1.8.3.1



* [RFC ABI V6 04/14] IB/core: Add macros for declaring types and type groups.
       [not found] ` <1481461088-56355-1-git-send-email-matanb-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
                     ` (2 preceding siblings ...)
  2016-12-11 12:57   ` [RFC ABI V6 03/14] IB/core: Add generic ucontext initialization and teardown Matan Barak
@ 2016-12-11 12:57   ` Matan Barak
  2016-12-11 12:57   ` [RFC ABI V6 05/14] IB/core: Declare all common IB types Matan Barak
                     ` (10 subsequent siblings)
  14 siblings, 0 replies; 20+ messages in thread
From: Matan Barak @ 2016-12-11 12:57 UTC (permalink / raw)
  To: linux-rdma-u79uwXL29TY76Z2rM5mHXA
  Cc: Doug Ledford, Jason Gunthorpe, Sean Hefty, Christoph Lameter,
	Liran Liss, Haggai Eran, Majd Dibbiny, Matan Barak, Tal Alon,
	Leon Romanovsky

In order to initialize and destroy types in a generic way, we need to
provide information about the allocation size, the release function
and the release order. This is done through a macro-based DSL
(domain-specific language). This patch adds macros for declaring a
type and a type group.
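
For illustration, the argument-counting trick these macros rely on
(sizeof over a compound-literal pointer array, divided by the pointer
size) can be shown standalone. The mock_* names below are illustrative,
not the kernel's:

```c
#include <assert.h>

/*
 * The DSL counts its variadic arguments at compile time with a
 * compound-literal array:
 *     sizeof((T *[]){__VA_ARGS__}) / sizeof(T *)
 * which yields the number of initializers passed in.
 */
struct mock_action {
	int id;
};

#define MOCK_ACTIONS_SZ(...)					\
	(sizeof((const struct mock_action *[]){__VA_ARGS__}) /	\
	 sizeof(const struct mock_action *))

static const struct mock_action mock_a = { 1 };
static const struct mock_action mock_b = { 2 };
static const struct mock_action mock_c = { 3 };
```

Because both the array literal and the division are constant
expressions, the count can seed .num_groups-style fields in static
initializers.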

Signed-off-by: Matan Barak <matanb-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
---
 include/rdma/uverbs_ioctl.h | 50 +++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 50 insertions(+)

diff --git a/include/rdma/uverbs_ioctl.h b/include/rdma/uverbs_ioctl.h
index 382321b..54f592a 100644
--- a/include/rdma/uverbs_ioctl.h
+++ b/include/rdma/uverbs_ioctl.h
@@ -147,6 +147,56 @@ struct uverbs_root {
 	size_t					num_groups;
 };
 
+#define _UVERBS_ACTIONS_GROUP_SZ(...)					\
+	(sizeof((const struct uverbs_action_group*[]){__VA_ARGS__}) / \
+	 sizeof(const struct uverbs_action_group *))
+#define UVERBS_TYPE_ALLOC_FD(_order, _obj_size, _free_fn, _fops, _name, _flags)\
+	((const struct uverbs_type_alloc_action)			\
+	 {.type = UVERBS_ATTR_TYPE_FD,					\
+	 .order = _order,						\
+	 .obj_size = _obj_size,						\
+	 .free_fn = _free_fn,						\
+	 .fd = {.fops = _fops,						\
+		.name = _name,						\
+		.flags = _flags} })
+#define UVERBS_TYPE_ALLOC_IDR_SZ(_size, _order, _free_fn)		\
+	((const struct uverbs_type_alloc_action)			\
+	 {.type = UVERBS_ATTR_TYPE_IDR,					\
+	 .order = _order,						\
+	 .free_fn = _free_fn,						\
+	 .obj_size = _size,})
+#define UVERBS_TYPE_ALLOC_IDR(_order, _free_fn)				\
+	 UVERBS_TYPE_ALLOC_IDR_SZ(sizeof(struct ib_uobject), _order, _free_fn)
+#define DECLARE_UVERBS_TYPE(name, _alloc, ...)				\
+	const struct uverbs_type name = {				\
+		.alloc = _alloc,					\
+		.num_groups = _UVERBS_ACTIONS_GROUP_SZ(__VA_ARGS__),	\
+		.action_groups = (const struct uverbs_action_group *[]){__VA_ARGS__} \
+	}
+#define _UVERBS_TYPE_SZ(...)						\
+	(sizeof((const struct uverbs_type *[]){__VA_ARGS__}) /	\
+	 sizeof(const struct uverbs_type *))
+#define ADD_UVERBS_TYPE_ACTIONS(type_idx, ...)				\
+	[type_idx] = &UVERBS_ACTIONS(__VA_ARGS__)
+#define ADD_UVERBS_TYPE(type_idx, type_ptr)				\
+	[type_idx] = ((const struct uverbs_type * const)&type_ptr)
+#define UVERBS_TYPES(...)  ((const struct uverbs_type_group)			\
+	{.num_types = _UVERBS_TYPE_SZ(__VA_ARGS__),			\
+	 .types = (const struct uverbs_type *[]){__VA_ARGS__} })
+#define DECLARE_UVERBS_TYPES(name, ...)				\
+	const struct uverbs_type_group name = UVERBS_TYPES(__VA_ARGS__)
+
+#define _UVERBS_TYPES_SZ(...)						\
+	(sizeof((const struct uverbs_type_group *[]){__VA_ARGS__}) /	\
+	 sizeof(const struct uverbs_type_group *))
+
+#define UVERBS_TYPES_GROUP(...)						\
+	((const struct uverbs_root){				\
+		.type_groups = (const struct uverbs_type_group *[]){__VA_ARGS__},\
+		.num_groups = _UVERBS_TYPES_SZ(__VA_ARGS__)})
+#define DECLARE_UVERBS_TYPES_GROUP(name, ...)		\
+	const struct uverbs_root name = UVERBS_TYPES_GROUP(__VA_ARGS__)
+
 /* =================================================
  *              Parsing infrastructure
  * =================================================
-- 
1.8.3.1



* [RFC ABI V6 05/14] IB/core: Declare all common IB types
       [not found] ` <1481461088-56355-1-git-send-email-matanb-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
                     ` (3 preceding siblings ...)
  2016-12-11 12:57   ` [RFC ABI V6 04/14] IB/core: Add macros for declaring types and type groups Matan Barak
@ 2016-12-11 12:57   ` Matan Barak
  2016-12-11 12:58   ` [RFC ABI V6 06/14] IB/core: Use the new IDR and locking infrastructure in uverbs_cmd Matan Barak
                     ` (9 subsequent siblings)
  14 siblings, 0 replies; 20+ messages in thread
From: Matan Barak @ 2016-12-11 12:57 UTC (permalink / raw)
  To: linux-rdma-u79uwXL29TY76Z2rM5mHXA
  Cc: Doug Ledford, Jason Gunthorpe, Sean Hefty, Christoph Lameter,
	Liran Liss, Haggai Eran, Majd Dibbiny, Matan Barak, Tal Alon,
	Leon Romanovsky

This patch declares all the required information for creating and
destroying all common IB types. The refactored infrastructure treats
types in a more object oriented way. Each type encapsulates all the
required information for its creation and destruction.

This patch is required in order to convert all the current uverbs_cmd
verbs to initialize and destroy objects using the refactored
infrastructure.

Signed-off-by: Matan Barak <matanb-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
---
 drivers/infiniband/core/Makefile           |   2 +-
 drivers/infiniband/core/uverbs.h           |   3 +
 drivers/infiniband/core/uverbs_ioctl_cmd.c | 239 +++++++++++++++++++++++++++++
 drivers/infiniband/core/uverbs_main.c      |   6 +-
 include/rdma/uverbs_ioctl_cmd.h            |  70 +++++++++
 5 files changed, 316 insertions(+), 4 deletions(-)
 create mode 100644 drivers/infiniband/core/uverbs_ioctl_cmd.c
 create mode 100644 include/rdma/uverbs_ioctl_cmd.h

diff --git a/drivers/infiniband/core/Makefile b/drivers/infiniband/core/Makefile
index 1819623..7676592 100644
--- a/drivers/infiniband/core/Makefile
+++ b/drivers/infiniband/core/Makefile
@@ -29,4 +29,4 @@ ib_umad-y :=			user_mad.o
 ib_ucm-y :=			ucm.o
 
 ib_uverbs-y :=			uverbs_main.o uverbs_cmd.o uverbs_marshall.o \
-				rdma_core.o
+				rdma_core.o uverbs_ioctl_cmd.o
diff --git a/drivers/infiniband/core/uverbs.h b/drivers/infiniband/core/uverbs.h
index ae7d4b8..05e9e83 100644
--- a/drivers/infiniband/core/uverbs.h
+++ b/drivers/infiniband/core/uverbs.h
@@ -176,6 +176,7 @@ struct ib_ucq_object {
 };
 
 void idr_remove_uobj(struct ib_uobject *uobj);
+extern const struct file_operations uverbs_event_fops;
 
 struct file *ib_uverbs_alloc_event_file(struct ib_uverbs_file *uverbs_file,
 					struct ib_device *ib_dev,
@@ -200,6 +201,8 @@ void ib_uverbs_event_handler(struct ib_event_handler *handler,
 void ib_uverbs_dealloc_xrcd(struct ib_uverbs_device *dev, struct ib_xrcd *xrcd);
 
 int uverbs_dealloc_mw(struct ib_mw *mw);
+void ib_uverbs_detach_umcast(struct ib_qp *qp,
+			     struct ib_uqp_object *uobj);
 
 struct ib_uverbs_flow_spec {
 	union {
diff --git a/drivers/infiniband/core/uverbs_ioctl_cmd.c b/drivers/infiniband/core/uverbs_ioctl_cmd.c
new file mode 100644
index 0000000..cb19f38
--- /dev/null
+++ b/drivers/infiniband/core/uverbs_ioctl_cmd.c
@@ -0,0 +1,239 @@
+/*
+ * Copyright (c) 2016, Mellanox Technologies inc.  All rights reserved.
+ *
+ * This software is available to you under a choice of one of two
+ * licenses.  You may choose to be licensed under the terms of the GNU
+ * General Public License (GPL) Version 2, available from the file
+ * COPYING in the main directory of this source tree, or the
+ * OpenIB.org BSD license below:
+ *
+ *     Redistribution and use in source and binary forms, with or
+ *     without modification, are permitted provided that the following
+ *     conditions are met:
+ *
+ *      - Redistributions of source code must retain the above
+ *        copyright notice, this list of conditions and the following
+ *        disclaimer.
+ *
+ *      - Redistributions in binary form must reproduce the above
+ *        copyright notice, this list of conditions and the following
+ *        disclaimer in the documentation and/or other materials
+ *        provided with the distribution.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
+ * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+ * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
+ * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
+ * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
+ * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
+ * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+ * SOFTWARE.
+ */
+
+#include <rdma/uverbs_ioctl_cmd.h>
+#include <rdma/ib_user_verbs.h>
+#include <rdma/ib_verbs.h>
+#include <linux/bug.h>
+#include <linux/file.h>
+#include "rdma_core.h"
+#include "uverbs.h"
+
+void uverbs_free_ah(const struct uverbs_type_alloc_action *uobject_type,
+		    struct ib_uobject *uobject)
+{
+	ib_destroy_ah((struct ib_ah *)uobject->object);
+}
+
+void uverbs_free_flow(const struct uverbs_type_alloc_action *type_alloc_action,
+		      struct ib_uobject *uobject)
+{
+	ib_destroy_flow((struct ib_flow *)uobject->object);
+}
+
+void uverbs_free_mw(const struct uverbs_type_alloc_action *type_alloc_action,
+		    struct ib_uobject *uobject)
+{
+	uverbs_dealloc_mw((struct ib_mw *)uobject->object);
+}
+
+void uverbs_free_qp(const struct uverbs_type_alloc_action *type_alloc_action,
+		    struct ib_uobject *uobject)
+{
+	struct ib_qp *qp = uobject->object;
+	struct ib_uqp_object *uqp =
+		container_of(uobject, struct ib_uqp_object, uevent.uobject);
+
+	if (qp != qp->real_qp) {
+		ib_close_qp(qp);
+	} else {
+		ib_uverbs_detach_umcast(qp, uqp);
+		ib_destroy_qp(qp);
+	}
+	ib_uverbs_release_uevent(uobject->context->ufile, &uqp->uevent);
+}
+
+void uverbs_free_rwq_ind_tbl(const struct uverbs_type_alloc_action *type_alloc_action,
+			     struct ib_uobject *uobject)
+{
+	struct ib_rwq_ind_table *rwq_ind_tbl = uobject->object;
+	struct ib_wq **ind_tbl = rwq_ind_tbl->ind_tbl;
+
+	ib_destroy_rwq_ind_table(rwq_ind_tbl);
+	kfree(ind_tbl);
+}
+
+void uverbs_free_wq(const struct uverbs_type_alloc_action *type_alloc_action,
+		    struct ib_uobject *uobject)
+{
+	struct ib_wq *wq = uobject->object;
+	struct ib_uwq_object *uwq =
+		container_of(uobject, struct ib_uwq_object, uevent.uobject);
+
+	ib_destroy_wq(wq);
+	ib_uverbs_release_uevent(uobject->context->ufile, &uwq->uevent);
+}
+
+void uverbs_free_srq(const struct uverbs_type_alloc_action *type_alloc_action,
+		     struct ib_uobject *uobject)
+{
+	struct ib_srq *srq = uobject->object;
+	struct ib_uevent_object *uevent =
+		container_of(uobject, struct ib_uevent_object, uobject);
+
+	ib_destroy_srq(srq);
+	ib_uverbs_release_uevent(uobject->context->ufile, uevent);
+}
+
+void uverbs_free_cq(const struct uverbs_type_alloc_action *type_alloc_action,
+		    struct ib_uobject *uobject)
+{
+	struct ib_cq *cq = uobject->object;
+	struct ib_uverbs_event_file *ev_file = cq->cq_context;
+	struct ib_ucq_object *ucq =
+		container_of(uobject, struct ib_ucq_object, uobject);
+
+	ib_destroy_cq(cq);
+	ib_uverbs_release_ucq(uobject->context->ufile, ev_file, ucq);
+}
+
+void uverbs_free_mr(const struct uverbs_type_alloc_action *type_alloc_action,
+		    struct ib_uobject *uobject)
+{
+	ib_dereg_mr((struct ib_mr *)uobject->object);
+}
+
+void uverbs_free_xrcd(const struct uverbs_type_alloc_action *type_alloc_action,
+		      struct ib_uobject *uobject)
+{
+	struct ib_xrcd *xrcd = uobject->object;
+
+	mutex_lock(&uobject->context->ufile->device->xrcd_tree_mutex);
+	ib_uverbs_dealloc_xrcd(uobject->context->ufile->device, xrcd);
+	mutex_unlock(&uobject->context->ufile->device->xrcd_tree_mutex);
+}
+
+void uverbs_free_pd(const struct uverbs_type_alloc_action *type_alloc_action,
+		    struct ib_uobject *uobject)
+{
+	ib_dealloc_pd((struct ib_pd *)uobject->object);
+}
+
+void uverbs_free_event_file(const struct uverbs_type_alloc_action *type_alloc_action,
+			    struct ib_uobject *uobject)
+{
+	struct ib_uverbs_event_file *event_file = (void *)(uobject + 1);
+
+	spin_lock_irq(&event_file->lock);
+	event_file->is_closed = 1;
+	spin_unlock_irq(&event_file->lock);
+
+	wake_up_interruptible(&event_file->poll_wait);
+	kill_fasync(&event_file->async_queue, SIGIO, POLL_IN);
+};
+
+DECLARE_UVERBS_TYPE(uverbs_type_comp_channel,
+		    /* 1 is used in order to free the comp_channel after the CQs */
+		    &UVERBS_TYPE_ALLOC_FD(1, sizeof(struct ib_uobject) + sizeof(struct ib_uverbs_event_file),
+					  uverbs_free_event_file,
+					  &uverbs_event_fops,
+					  "[infinibandevent]", O_RDONLY),
+		    /* TODO: implement actions for comp channel */
+		    NULL);
+
+DECLARE_UVERBS_TYPE(uverbs_type_cq,
+		    /* 0 is used so CQs are freed before the comp_channel (order 1) */
+		    &UVERBS_TYPE_ALLOC_IDR_SZ(sizeof(struct ib_ucq_object), 0,
+					      uverbs_free_cq),
+		    /* TODO: implement actions for cq */
+		    NULL);
+
+DECLARE_UVERBS_TYPE(uverbs_type_qp,
+		    /* 0 is used so QPs are freed before the PD (order 2) */
+		    &UVERBS_TYPE_ALLOC_IDR_SZ(sizeof(struct ib_uqp_object), 0,
+					      uverbs_free_qp),
+		    /* TODO: implement actions for qp */
+		    NULL);
+
+DECLARE_UVERBS_TYPE(uverbs_type_mw,
+		    &UVERBS_TYPE_ALLOC_IDR(1, uverbs_free_mw),
+		    /* TODO: implement actions for mw */
+		    NULL);
+
+DECLARE_UVERBS_TYPE(uverbs_type_mr,
+		    /* 1 is used in order to free the MR after all the MWs */
+		    &UVERBS_TYPE_ALLOC_IDR(1, uverbs_free_mr),
+		    /* TODO: implement actions for mr */
+		    NULL);
+
+DECLARE_UVERBS_TYPE(uverbs_type_srq,
+		    &UVERBS_TYPE_ALLOC_IDR_SZ(sizeof(struct ib_usrq_object), 0,
+					      uverbs_free_srq),
+		    /* TODO: implement actions for srq */
+		    NULL);
+
+DECLARE_UVERBS_TYPE(uverbs_type_ah,
+		    &UVERBS_TYPE_ALLOC_IDR(0, uverbs_free_ah),
+		    /* TODO: implement actions for ah */
+		    NULL);
+
+DECLARE_UVERBS_TYPE(uverbs_type_flow,
+		    &UVERBS_TYPE_ALLOC_IDR(0, uverbs_free_flow),
+		    /* TODO: implement actions for flow */
+		    NULL);
+
+DECLARE_UVERBS_TYPE(uverbs_type_wq,
+		    &UVERBS_TYPE_ALLOC_IDR_SZ(sizeof(struct ib_uwq_object), 0,
+					      uverbs_free_wq),
+		    /* TODO: implement actions for wq */
+		    NULL);
+
+DECLARE_UVERBS_TYPE(uverbs_type_rwq_ind_table,
+		    &UVERBS_TYPE_ALLOC_IDR(0, uverbs_free_rwq_ind_tbl),
+		    /* TODO: implement actions for rwq_ind_table */
+		    NULL);
+
+DECLARE_UVERBS_TYPE(uverbs_type_xrcd,
+		    &UVERBS_TYPE_ALLOC_IDR_SZ(sizeof(struct ib_uxrcd_object), 0,
+					      uverbs_free_xrcd),
+		    /* TODO: implement actions for xrcd */
+		    NULL);
+
+DECLARE_UVERBS_TYPE(uverbs_type_pd,
+		    /* 2 is used in order to free the PD after all objects */
+		    &UVERBS_TYPE_ALLOC_IDR(2, uverbs_free_pd),
+		    /* TODO: implement actions for pd */
+		    NULL);
+
+DECLARE_UVERBS_TYPE(uverbs_type_device, NULL,
+		    /* TODO: implement actions for device */
+		    NULL);
+
+DECLARE_UVERBS_TYPES(uverbs_common_types,
+		     ADD_UVERBS_TYPE(UVERBS_TYPE_DEVICE, uverbs_type_device),
+		     ADD_UVERBS_TYPE(UVERBS_TYPE_PD, uverbs_type_pd),
+		     ADD_UVERBS_TYPE(UVERBS_TYPE_MR, uverbs_type_mr),
+		     ADD_UVERBS_TYPE(UVERBS_TYPE_COMP_CHANNEL, uverbs_type_comp_channel),
+		     ADD_UVERBS_TYPE(UVERBS_TYPE_CQ, uverbs_type_cq),
+		     ADD_UVERBS_TYPE(UVERBS_TYPE_QP, uverbs_type_qp),
+);
+EXPORT_SYMBOL(uverbs_common_types);
diff --git a/drivers/infiniband/core/uverbs_main.c b/drivers/infiniband/core/uverbs_main.c
index e63357a..b560c88 100644
--- a/drivers/infiniband/core/uverbs_main.c
+++ b/drivers/infiniband/core/uverbs_main.c
@@ -199,8 +199,8 @@ void ib_uverbs_release_uevent(struct ib_uverbs_file *file,
 	spin_unlock_irq(&file->async_file->lock);
 }
 
-static void ib_uverbs_detach_umcast(struct ib_qp *qp,
-				    struct ib_uqp_object *uobj)
+void ib_uverbs_detach_umcast(struct ib_qp *qp,
+			     struct ib_uqp_object *uobj)
 {
 	struct ib_uverbs_mcast_entry *mcast, *tmp;
 
@@ -479,7 +479,7 @@ static int ib_uverbs_event_close(struct inode *inode, struct file *filp)
 	return 0;
 }
 
-static const struct file_operations uverbs_event_fops = {
+const struct file_operations uverbs_event_fops = {
 	.owner	 = THIS_MODULE,
 	.read	 = ib_uverbs_event_read,
 	.poll    = ib_uverbs_event_poll,
diff --git a/include/rdma/uverbs_ioctl_cmd.h b/include/rdma/uverbs_ioctl_cmd.h
new file mode 100644
index 0000000..614e80c
--- /dev/null
+++ b/include/rdma/uverbs_ioctl_cmd.h
@@ -0,0 +1,70 @@
+/*
+ * Copyright (c) 2016, Mellanox Technologies inc.  All rights reserved.
+ *
+ * This software is available to you under a choice of one of two
+ * licenses.  You may choose to be licensed under the terms of the GNU
+ * General Public License (GPL) Version 2, available from the file
+ * COPYING in the main directory of this source tree, or the
+ * OpenIB.org BSD license below:
+ *
+ *     Redistribution and use in source and binary forms, with or
+ *     without modification, are permitted provided that the following
+ *     conditions are met:
+ *
+ *      - Redistributions of source code must retain the above
+ *        copyright notice, this list of conditions and the following
+ *        disclaimer.
+ *
+ *      - Redistributions in binary form must reproduce the above
+ *        copyright notice, this list of conditions and the following
+ *        disclaimer in the documentation and/or other materials
+ *        provided with the distribution.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
+ * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+ * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
+ * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
+ * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
+ * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
+ * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+ * SOFTWARE.
+ */
+
+#ifndef _UVERBS_IOCTL_CMD_
+#define _UVERBS_IOCTL_CMD_
+
+#include <rdma/uverbs_ioctl.h>
+
+enum uverbs_common_types {
+	UVERBS_TYPE_DEVICE, /* Don't use IDRs here */
+	UVERBS_TYPE_PD,
+	UVERBS_TYPE_COMP_CHANNEL,
+	UVERBS_TYPE_CQ,
+	UVERBS_TYPE_QP,
+	UVERBS_TYPE_SRQ,
+	UVERBS_TYPE_AH,
+	UVERBS_TYPE_MR,
+	UVERBS_TYPE_MW,
+	UVERBS_TYPE_FLOW,
+	UVERBS_TYPE_XRCD,
+	UVERBS_TYPE_RWQ_IND_TBL,
+	UVERBS_TYPE_WQ,
+	UVERBS_TYPE_LAST,
+};
+
+extern const struct uverbs_type uverbs_type_cq;
+extern const struct uverbs_type uverbs_type_qp;
+extern const struct uverbs_type uverbs_type_rwq_ind_table;
+extern const struct uverbs_type uverbs_type_wq;
+extern const struct uverbs_type uverbs_type_srq;
+extern const struct uverbs_type uverbs_type_ah;
+extern const struct uverbs_type uverbs_type_flow;
+extern const struct uverbs_type uverbs_type_comp_channel;
+extern const struct uverbs_type uverbs_type_mr;
+extern const struct uverbs_type uverbs_type_mw;
+extern const struct uverbs_type uverbs_type_pd;
+extern const struct uverbs_type uverbs_type_xrcd;
+extern const struct uverbs_type uverbs_type_device;
+extern const struct uverbs_type_group uverbs_common_types;
+#endif
-- 
1.8.3.1



* [RFC ABI V6 06/14] IB/core: Use the new IDR and locking infrastructure in uverbs_cmd
       [not found] ` <1481461088-56355-1-git-send-email-matanb-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
                     ` (4 preceding siblings ...)
  2016-12-11 12:57   ` [RFC ABI V6 05/14] IB/core: Declare all common IB types Matan Barak
@ 2016-12-11 12:58   ` Matan Barak
  2016-12-11 12:58   ` [RFC ABI V6 07/14] IB/core: Add new ioctl interface Matan Barak
                     ` (8 subsequent siblings)
  14 siblings, 0 replies; 20+ messages in thread
From: Matan Barak @ 2016-12-11 12:58 UTC (permalink / raw)
  To: linux-rdma-u79uwXL29TY76Z2rM5mHXA
  Cc: Doug Ledford, Jason Gunthorpe, Sean Hefty, Christoph Lameter,
	Liran Liss, Haggai Eran, Majd Dibbiny, Matan Barak, Tal Alon,
	Leon Romanovsky

The new infrastructure introduced a new locking and objects scheme.
We rewrite the current uverbs_cmd handlers to use it. The new
infrastructure needs the type definitions, introduced in previous
patches, in order to figure out how to allocate and free resources.

This patch refactors the following things:
(*) Instead of having a list per type, we use the ucontext's list
(*) The locking semantics are changed:
      Two commands might try to lock the same object. If the object
      is locked for exclusive access, any concurrent access will get
      -EBUSY. This makes user space responsible for serializing
      access.
(*) The completion channel FD is created by using the infrastructure.
    Its release function and release context are serialized by the
    infrastructure.
(*) The live flag is no longer required.
(*) Users may need to assign the user_handle explicitly.
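
For illustration, the try-lock-or-EBUSY semantics can be sketched with
C11 atomics in a standalone mock. The mock_* names and the usecnt
encoding are illustrative; the series implements this inside the shared
core code:

```c
#include <errno.h>
#include <stdatomic.h>

/*
 * Standalone sketch of "-EBUSY instead of blocking": usecnt == 0 means
 * free, > 0 counts shared (read) users and -1 marks exclusive (write)
 * ownership. A conflicting lock attempt fails immediately, leaving
 * serialization to user space.
 */
struct mock_uobj {
	atomic_int usecnt;
};

static int mock_lock(struct mock_uobj *obj, int exclusive)
{
	int cur = atomic_load(&obj->usecnt);

	if (exclusive) {
		int expected = 0;

		/* Succeed only if nobody holds the object at all. */
		return atomic_compare_exchange_strong(&obj->usecnt,
						      &expected, -1) ?
		       0 : -EBUSY;
	}

	do {
		if (cur == -1)		/* exclusively owned */
			return -EBUSY;
		/* cur is refreshed by the failed CAS below */
	} while (!atomic_compare_exchange_weak(&obj->usecnt, &cur, cur + 1));

	return 0;
}

static void mock_unlock(struct mock_uobj *obj, int exclusive)
{
	if (exclusive)
		atomic_store(&obj->usecnt, 0);
	else
		atomic_fetch_sub(&obj->usecnt, 1);
}
```

Shared users can pile up concurrently; only the shared/exclusive and
exclusive/anything combinations are refused.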

Signed-off-by: Matan Barak <matanb-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
---
 drivers/infiniband/core/uverbs.h      |   12 +-
 drivers/infiniband/core/uverbs_cmd.c  | 1163 ++++++++++++---------------------
 drivers/infiniband/core/uverbs_main.c |  232 ++-----
 drivers/infiniband/hw/mlx5/main.c     |    4 +
 include/rdma/ib_verbs.h               |   16 +-
 5 files changed, 480 insertions(+), 947 deletions(-)

diff --git a/drivers/infiniband/core/uverbs.h b/drivers/infiniband/core/uverbs.h
index 05e9e83..64f8658 100644
--- a/drivers/infiniband/core/uverbs.h
+++ b/drivers/infiniband/core/uverbs.h
@@ -175,15 +175,12 @@ struct ib_ucq_object {
 	u32			async_events_reported;
 };
 
-void idr_remove_uobj(struct ib_uobject *uobj);
 extern const struct file_operations uverbs_event_fops;
 
-struct file *ib_uverbs_alloc_event_file(struct ib_uverbs_file *uverbs_file,
-					struct ib_device *ib_dev,
-					int is_async);
+struct file *ib_uverbs_alloc_async_event_file(struct ib_uverbs_file *uverbs_file,
+					      struct ib_device *ib_dev);
 void ib_uverbs_release_file(struct kref *ref);
 void ib_uverbs_free_async_event_file(struct ib_uverbs_file *uverbs_file);
-struct ib_uverbs_event_file *ib_uverbs_lookup_comp_file(int fd);
 
 void ib_uverbs_release_ucq(struct ib_uverbs_file *file,
 			   struct ib_uverbs_event_file *ev_file,
@@ -201,6 +198,11 @@ void ib_uverbs_event_handler(struct ib_event_handler *handler,
 void ib_uverbs_dealloc_xrcd(struct ib_uverbs_device *dev, struct ib_xrcd *xrcd);
 
 int uverbs_dealloc_mw(struct ib_mw *mw);
+void ib_uverbs_release_ucq(struct ib_uverbs_file *file,
+			   struct ib_uverbs_event_file *ev_file,
+			   struct ib_ucq_object *uobj);
+void ib_uverbs_release_uevent(struct ib_uverbs_file *file,
+			      struct ib_uevent_object *uobj);
 void ib_uverbs_detach_umcast(struct ib_qp *qp,
 			     struct ib_uqp_object *uobj);
 
diff --git a/drivers/infiniband/core/uverbs_cmd.c b/drivers/infiniband/core/uverbs_cmd.c
index 84daf2c..79a1a8b 100644
--- a/drivers/infiniband/core/uverbs_cmd.c
+++ b/drivers/infiniband/core/uverbs_cmd.c
@@ -40,269 +40,73 @@
 
 #include <asm/uaccess.h>
 
+#include <rdma/uverbs_ioctl.h>
+#include <rdma/uverbs_ioctl_cmd.h>
+#include "rdma_core.h"
+
 #include "uverbs.h"
 #include "core_priv.h"
 
-struct uverbs_lock_class {
-	struct lock_class_key	key;
-	char			name[16];
-};
-
-static struct uverbs_lock_class pd_lock_class	= { .name = "PD-uobj" };
-static struct uverbs_lock_class mr_lock_class	= { .name = "MR-uobj" };
-static struct uverbs_lock_class mw_lock_class	= { .name = "MW-uobj" };
-static struct uverbs_lock_class cq_lock_class	= { .name = "CQ-uobj" };
-static struct uverbs_lock_class qp_lock_class	= { .name = "QP-uobj" };
-static struct uverbs_lock_class ah_lock_class	= { .name = "AH-uobj" };
-static struct uverbs_lock_class srq_lock_class	= { .name = "SRQ-uobj" };
-static struct uverbs_lock_class xrcd_lock_class = { .name = "XRCD-uobj" };
-static struct uverbs_lock_class rule_lock_class = { .name = "RULE-uobj" };
-static struct uverbs_lock_class wq_lock_class = { .name = "WQ-uobj" };
-static struct uverbs_lock_class rwq_ind_table_lock_class = { .name = "IND_TBL-uobj" };
-
-/*
- * The ib_uobject locking scheme is as follows:
- *
- * - ib_uverbs_idr_lock protects the uverbs idrs themselves, so it
- *   needs to be held during all idr write operations.  When an object is
- *   looked up, a reference must be taken on the object's kref before
- *   dropping this lock.  For read operations, the rcu_read_lock()
- *   and rcu_write_lock() but similarly the kref reference is grabbed
- *   before the rcu_read_unlock().
- *
- * - Each object also has an rwsem.  This rwsem must be held for
- *   reading while an operation that uses the object is performed.
- *   For example, while registering an MR, the associated PD's
- *   uobject.mutex must be held for reading.  The rwsem must be held
- *   for writing while initializing or destroying an object.
- *
- * - In addition, each object has a "live" flag.  If this flag is not
- *   set, then lookups of the object will fail even if it is found in
- *   the idr.  This handles a reader that blocks and does not acquire
- *   the rwsem until after the object is destroyed.  The destroy
- *   operation will set the live flag to 0 and then drop the rwsem;
- *   this will allow the reader to acquire the rwsem, see that the
- *   live flag is 0, and then drop the rwsem and its reference to
- *   object.  The underlying storage will not be freed until the last
- *   reference to the object is dropped.
- */
-
-static void init_uobj(struct ib_uobject *uobj, u64 user_handle,
-		      struct ib_ucontext *context, struct uverbs_lock_class *c)
-{
-	uobj->user_handle = user_handle;
-	uobj->context     = context;
-	kref_init(&uobj->ref);
-	init_rwsem(&uobj->mutex);
-	lockdep_set_class_and_name(&uobj->mutex, &c->key, c->name);
-	uobj->live        = 0;
-}
-
-static void release_uobj(struct kref *kref)
-{
-	kfree_rcu(container_of(kref, struct ib_uobject, ref), rcu);
-}
-
-static void put_uobj(struct ib_uobject *uobj)
-{
-	kref_put(&uobj->ref, release_uobj);
-}
-
-static void put_uobj_read(struct ib_uobject *uobj)
-{
-	up_read(&uobj->mutex);
-	put_uobj(uobj);
-}
-
-static void put_uobj_write(struct ib_uobject *uobj)
-{
-	up_write(&uobj->mutex);
-	put_uobj(uobj);
-}
-
-static int idr_add_uobj(struct ib_uobject *uobj)
-{
-	int ret;
-
-	idr_preload(GFP_KERNEL);
-	spin_lock(&uobj->context->device->idr_lock);
-
-	ret = idr_alloc(&uobj->context->device->idr, uobj, 0, 0, GFP_NOWAIT);
-	if (ret >= 0)
-		uobj->id = ret;
-
-	spin_unlock(&uobj->context->device->idr_lock);
-	idr_preload_end();
-
-	return ret < 0 ? ret : 0;
-}
-
-void idr_remove_uobj(struct ib_uobject *uobj)
-{
-	spin_lock(&uobj->context->device->idr_lock);
-	idr_remove(&uobj->context->device->idr, uobj->id);
-	spin_unlock(&uobj->context->device->idr_lock);
-}
-
-static struct ib_uobject *__idr_get_uobj(int id, struct ib_ucontext *context)
-{
-	struct ib_uobject *uobj;
-
-	rcu_read_lock();
-	uobj = idr_find(&context->device->idr, id);
-	if (uobj) {
-		if (uobj->context == context)
-			kref_get(&uobj->ref);
-		else
-			uobj = NULL;
-	}
-	rcu_read_unlock();
-
-	return uobj;
-}
-
-static struct ib_uobject *idr_read_uobj(int id, struct ib_ucontext *context,
-					int nested)
-{
-	struct ib_uobject *uobj;
-
-	uobj = __idr_get_uobj(id, context);
-	if (!uobj)
-		return NULL;
-
-	if (nested)
-		down_read_nested(&uobj->mutex, SINGLE_DEPTH_NESTING);
-	else
-		down_read(&uobj->mutex);
-	if (!uobj->live) {
-		put_uobj_read(uobj);
-		return NULL;
-	}
-
-	return uobj;
-}
-
-static struct ib_uobject *idr_write_uobj(int id, struct ib_ucontext *context)
-{
-	struct ib_uobject *uobj;
-
-	uobj = __idr_get_uobj(id, context);
-	if (!uobj)
-		return NULL;
-
-	down_write(&uobj->mutex);
-	if (!uobj->live) {
-		put_uobj_write(uobj);
-		return NULL;
-	}
-
-	return uobj;
-}
-
-static void *idr_read_obj(int id, struct ib_ucontext *context,
-			  int nested)
-{
-	struct ib_uobject *uobj;
-
-	uobj = idr_read_uobj(id, context, nested);
-	return uobj ? uobj->object : NULL;
-}
+#define fld_concat(a, b) a##b
+#define idr_get_xxxx(_type, _access, _handle, _context) ({		\
+	const struct uverbs_type * const type = &uverbs_type_## _type;	\
+	struct ib_uobject *uobj = uverbs_get_type_from_idr(		\
+					type->alloc,			\
+					_context, _access, _handle);	\
+									\
+	IS_ERR(uobj) ? NULL : uobj->object; })
 
 static struct ib_pd *idr_read_pd(int pd_handle, struct ib_ucontext *context)
 {
-	return idr_read_obj(pd_handle, context, 0);
-}
-
-static void put_pd_read(struct ib_pd *pd)
-{
-	put_uobj_read(pd->uobject);
-}
-
-static struct ib_cq *idr_read_cq(int cq_handle, struct ib_ucontext *context, int nested)
-{
-	return idr_read_obj(cq_handle, context, nested);
+	return idr_get_xxxx(pd, UVERBS_IDR_ACCESS_READ, pd_handle, context);
 }
 
-static void put_cq_read(struct ib_cq *cq)
+static struct ib_cq *idr_read_cq(int cq_handle, struct ib_ucontext *context)
 {
-	put_uobj_read(cq->uobject);
-}
-
-static void put_ah_read(struct ib_ah *ah)
-{
-	put_uobj_read(ah->uobject);
+	return idr_get_xxxx(cq, UVERBS_IDR_ACCESS_READ, cq_handle, context);
 }
 
 static struct ib_ah *idr_read_ah(int ah_handle, struct ib_ucontext *context)
 {
-	return idr_read_obj(ah_handle, context, 0);
+	return idr_get_xxxx(ah, UVERBS_IDR_ACCESS_READ, ah_handle, context);
 }
 
 static struct ib_qp *idr_read_qp(int qp_handle, struct ib_ucontext *context)
 {
-	return idr_read_obj(qp_handle, context, 0);
+	return idr_get_xxxx(qp, UVERBS_IDR_ACCESS_READ, qp_handle, context);
 }
 
 static struct ib_wq *idr_read_wq(int wq_handle, struct ib_ucontext *context)
 {
-	return idr_read_obj(wq_handle, context, 0);
-}
-
-static void put_wq_read(struct ib_wq *wq)
-{
-	put_uobj_read(wq->uobject);
+	return idr_get_xxxx(wq, UVERBS_IDR_ACCESS_READ, wq_handle, context);
 }
 
 static struct ib_rwq_ind_table *idr_read_rwq_indirection_table(int ind_table_handle,
 							       struct ib_ucontext *context)
 {
-	return idr_read_obj(ind_table_handle, context, 0);
-}
-
-static void put_rwq_indirection_table_read(struct ib_rwq_ind_table *ind_table)
-{
-	put_uobj_read(ind_table->uobject);
+	return idr_get_xxxx(rwq_ind_table, UVERBS_IDR_ACCESS_READ,
+			    ind_table_handle, context);
 }
 
 static struct ib_qp *idr_write_qp(int qp_handle, struct ib_ucontext *context)
 {
-	struct ib_uobject *uobj;
-
-	uobj = idr_write_uobj(qp_handle, context);
-	return uobj ? uobj->object : NULL;
-}
-
-static void put_qp_read(struct ib_qp *qp)
-{
-	put_uobj_read(qp->uobject);
-}
-
-static void put_qp_write(struct ib_qp *qp)
-{
-	put_uobj_write(qp->uobject);
+	return idr_get_xxxx(qp, UVERBS_IDR_ACCESS_WRITE, qp_handle, context);
 }
 
 static struct ib_srq *idr_read_srq(int srq_handle, struct ib_ucontext *context)
 {
-	return idr_read_obj(srq_handle, context, 0);
-}
-
-static void put_srq_read(struct ib_srq *srq)
-{
-	put_uobj_read(srq->uobject);
+	return idr_get_xxxx(srq, UVERBS_IDR_ACCESS_READ, srq_handle, context);
 }
 
 static struct ib_xrcd *idr_read_xrcd(int xrcd_handle, struct ib_ucontext *context,
 				     struct ib_uobject **uobj)
 {
-	*uobj = idr_read_uobj(xrcd_handle, context, 0);
+	*uobj = uverbs_get_type_from_idr(uverbs_type_xrcd.alloc,
+					 context, UVERBS_IDR_ACCESS_READ,
+					 xrcd_handle);
 	return *uobj ? (*uobj)->object : NULL;
 }
 
-static void put_xrcd_read(struct ib_uobject *uobj)
-{
-	put_uobj_read(uobj);
-}
 ssize_t ib_uverbs_get_context(struct ib_uverbs_file *file,
 			      struct ib_device *ib_dev,
 			      const char __user *buf,
@@ -339,17 +143,11 @@ ssize_t ib_uverbs_get_context(struct ib_uverbs_file *file,
 	}
 
 	ucontext->device = ib_dev;
-	INIT_LIST_HEAD(&ucontext->pd_list);
-	INIT_LIST_HEAD(&ucontext->mr_list);
-	INIT_LIST_HEAD(&ucontext->mw_list);
-	INIT_LIST_HEAD(&ucontext->cq_list);
-	INIT_LIST_HEAD(&ucontext->qp_list);
-	INIT_LIST_HEAD(&ucontext->srq_list);
-	INIT_LIST_HEAD(&ucontext->ah_list);
-	INIT_LIST_HEAD(&ucontext->wq_list);
-	INIT_LIST_HEAD(&ucontext->rwq_ind_tbl_list);
-	INIT_LIST_HEAD(&ucontext->xrcd_list);
-	INIT_LIST_HEAD(&ucontext->rule_list);
+	ucontext->ufile = file;
+	ret = ib_uverbs_uobject_type_initialize_ucontext(ucontext);
+	if (ret)
+		goto err_ctx;
+
 	rcu_read_lock();
 	ucontext->tgid = get_task_pid(current->group_leader, PIDTYPE_PID);
 	rcu_read_unlock();
@@ -373,7 +171,7 @@ ssize_t ib_uverbs_get_context(struct ib_uverbs_file *file,
 		goto err_free;
 	resp.async_fd = ret;
 
-	filp = ib_uverbs_alloc_event_file(file, ib_dev, 1);
+	filp = ib_uverbs_alloc_async_event_file(file, ib_dev);
 	if (IS_ERR(filp)) {
 		ret = PTR_ERR(filp);
 		goto err_fd;
@@ -386,6 +184,7 @@ ssize_t ib_uverbs_get_context(struct ib_uverbs_file *file,
 	}
 
 	file->ucontext = ucontext;
 
 	fd_install(resp.async_fd, filp);
 
@@ -402,6 +201,8 @@ err_fd:
 
 err_free:
 	put_pid(ucontext->tgid);
+	ib_uverbs_uobject_type_release_ucontext(ucontext);
+err_ctx:
 	ib_dev->dealloc_ucontext(ucontext);
 
 err:
@@ -553,12 +354,10 @@ ssize_t ib_uverbs_alloc_pd(struct ib_uverbs_file *file,
 		   (unsigned long) cmd.response + sizeof resp,
 		   in_len - sizeof cmd, out_len - sizeof resp);
 
-	uobj = kmalloc(sizeof *uobj, GFP_KERNEL);
-	if (!uobj)
-		return -ENOMEM;
-
-	init_uobj(uobj, 0, file->ucontext, &pd_lock_class);
-	down_write(&uobj->mutex);
+	uobj = uverbs_get_type_from_idr(uverbs_type_pd.alloc, file->ucontext,
+					UVERBS_IDR_ACCESS_NEW, 0);
+	if (IS_ERR(uobj))
+		return PTR_ERR(uobj);
 
 	pd = ib_dev->alloc_pd(ib_dev, file->ucontext, &udata);
 	if (IS_ERR(pd)) {
@@ -570,12 +369,7 @@ ssize_t ib_uverbs_alloc_pd(struct ib_uverbs_file *file,
 	pd->uobject = uobj;
 	pd->__internal_mr = NULL;
 	atomic_set(&pd->usecnt, 0);
-
 	uobj->object = pd;
-	ret = idr_add_uobj(uobj);
-	if (ret)
-		goto err_idr;
-
 	memset(&resp, 0, sizeof resp);
 	resp.pd_handle = uobj->id;
 
@@ -585,24 +379,14 @@ ssize_t ib_uverbs_alloc_pd(struct ib_uverbs_file *file,
 		goto err_copy;
 	}
 
-	mutex_lock(&file->mutex);
-	list_add_tail(&uobj->list, &file->ucontext->pd_list);
-	mutex_unlock(&file->mutex);
-
-	uobj->live = 1;
-
-	up_write(&uobj->mutex);
+	uverbs_commit_object(uobj, UVERBS_IDR_ACCESS_NEW);
 
 	return in_len;
 
 err_copy:
-	idr_remove_uobj(uobj);
-
-err_idr:
 	ib_dealloc_pd(pd);
-
 err:
-	put_uobj_write(uobj);
+	uverbs_rollback_object(uobj, UVERBS_IDR_ACCESS_NEW);
 	return ret;
 }
 
@@ -619,9 +403,11 @@ ssize_t ib_uverbs_dealloc_pd(struct ib_uverbs_file *file,
 	if (copy_from_user(&cmd, buf, sizeof cmd))
 		return -EFAULT;
 
-	uobj = idr_write_uobj(cmd.pd_handle, file->ucontext);
-	if (!uobj)
-		return -EINVAL;
+	uobj = uverbs_get_type_from_idr(uverbs_type_pd.alloc, file->ucontext,
+					UVERBS_IDR_ACCESS_DESTROY,
+					cmd.pd_handle);
+	if (IS_ERR(uobj))
+		return PTR_ERR(uobj);
 	pd = uobj->object;
 
 	if (atomic_read(&pd->usecnt)) {
@@ -634,21 +420,12 @@ ssize_t ib_uverbs_dealloc_pd(struct ib_uverbs_file *file,
 	if (ret)
 		goto err_put;
 
-	uobj->live = 0;
-	put_uobj_write(uobj);
-
-	idr_remove_uobj(uobj);
-
-	mutex_lock(&file->mutex);
-	list_del(&uobj->list);
-	mutex_unlock(&file->mutex);
-
-	put_uobj(uobj);
+	uverbs_commit_object(uobj, UVERBS_IDR_ACCESS_DESTROY);
 
 	return in_len;
 
 err_put:
-	put_uobj_write(uobj);
+	uverbs_rollback_object(uobj, UVERBS_IDR_ACCESS_DESTROY);
 	return ret;
 }
 
@@ -786,15 +563,11 @@ ssize_t ib_uverbs_open_xrcd(struct ib_uverbs_file *file,
 		}
 	}
 
-	obj = kmalloc(sizeof *obj, GFP_KERNEL);
-	if (!obj) {
-		ret = -ENOMEM;
-		goto err_tree_mutex_unlock;
-	}
-
-	init_uobj(&obj->uobject, 0, file->ucontext, &xrcd_lock_class);
-
-	down_write(&obj->uobject.mutex);
+	obj = (struct ib_uxrcd_object *)
+		uverbs_get_type_from_idr(uverbs_type_xrcd.alloc, file->ucontext,
+					 UVERBS_IDR_ACCESS_NEW, 0);
+	if (IS_ERR(obj)) {
+		ret = PTR_ERR(obj);
+		goto err_tree_mutex_unlock;
+	}
 
 	if (!xrcd) {
 		xrcd = ib_dev->alloc_xrcd(ib_dev, file->ucontext, &udata);
@@ -813,10 +586,6 @@ ssize_t ib_uverbs_open_xrcd(struct ib_uverbs_file *file,
 
 	atomic_set(&obj->refcnt, 0);
 	obj->uobject.object = xrcd;
-	ret = idr_add_uobj(&obj->uobject);
-	if (ret)
-		goto err_idr;
-
 	memset(&resp, 0, sizeof resp);
 	resp.xrcd_handle = obj->uobject.id;
 
@@ -825,7 +594,7 @@ ssize_t ib_uverbs_open_xrcd(struct ib_uverbs_file *file,
 			/* create new inode/xrcd table entry */
 			ret = xrcd_table_insert(file->device, inode, xrcd);
 			if (ret)
-				goto err_insert_xrcd;
+				goto err_dealloc_xrcd;
 		}
 		atomic_inc(&xrcd->usecnt);
 	}
@@ -839,12 +608,7 @@ ssize_t ib_uverbs_open_xrcd(struct ib_uverbs_file *file,
 	if (f.file)
 		fdput(f);
 
-	mutex_lock(&file->mutex);
-	list_add_tail(&obj->uobject.list, &file->ucontext->xrcd_list);
-	mutex_unlock(&file->mutex);
-
-	obj->uobject.live = 1;
-	up_write(&obj->uobject.mutex);
+	uverbs_commit_object(&obj->uobject, UVERBS_IDR_ACCESS_NEW);
 
 	mutex_unlock(&file->device->xrcd_tree_mutex);
 	return in_len;
@@ -856,14 +620,11 @@ err_copy:
 		atomic_dec(&xrcd->usecnt);
 	}
 
-err_insert_xrcd:
-	idr_remove_uobj(&obj->uobject);
-
-err_idr:
+err_dealloc_xrcd:
 	ib_dealloc_xrcd(xrcd);
 
 err:
-	put_uobj_write(&obj->uobject);
+	uverbs_rollback_object(&obj->uobject, UVERBS_IDR_ACCESS_NEW);
 
 err_tree_mutex_unlock:
 	if (f.file)
@@ -884,24 +645,23 @@ ssize_t ib_uverbs_close_xrcd(struct ib_uverbs_file *file,
 	struct ib_xrcd              *xrcd = NULL;
 	struct inode                *inode = NULL;
 	struct ib_uxrcd_object      *obj;
-	int                         live;
 	int                         ret = 0;
+	bool			    destroyed = false;
 
 	if (copy_from_user(&cmd, buf, sizeof cmd))
 		return -EFAULT;
 
 	mutex_lock(&file->device->xrcd_tree_mutex);
-	uobj = idr_write_uobj(cmd.xrcd_handle, file->ucontext);
-	if (!uobj) {
-		ret = -EINVAL;
-		goto out;
-	}
+	uobj = uverbs_get_type_from_idr(uverbs_type_xrcd.alloc, file->ucontext,
+					UVERBS_IDR_ACCESS_DESTROY,
+					cmd.xrcd_handle);
+	if (IS_ERR(uobj)) {
+		mutex_unlock(&file->device->xrcd_tree_mutex);
+		return PTR_ERR(uobj);
+	}
 
 	xrcd  = uobj->object;
 	inode = xrcd->inode;
 	obj   = container_of(uobj, struct ib_uxrcd_object, uobject);
 	if (atomic_read(&obj->refcnt)) {
-		put_uobj_write(uobj);
 		ret = -EBUSY;
 		goto out;
 	}
@@ -909,30 +669,24 @@ ssize_t ib_uverbs_close_xrcd(struct ib_uverbs_file *file,
 	if (!inode || atomic_dec_and_test(&xrcd->usecnt)) {
 		ret = ib_dealloc_xrcd(uobj->object);
 		if (!ret)
-			uobj->live = 0;
+			uverbs_commit_object(uobj, UVERBS_IDR_ACCESS_DESTROY);
+		destroyed = !ret;
 	}
 
-	live = uobj->live;
 	if (inode && ret)
 		atomic_inc(&xrcd->usecnt);
 
-	put_uobj_write(uobj);
-
 	if (ret)
 		goto out;
 
-	if (inode && !live)
+	if (inode && destroyed)
 		xrcd_table_delete(file->device, inode);
 
-	idr_remove_uobj(uobj);
-	mutex_lock(&file->mutex);
-	list_del(&uobj->list);
-	mutex_unlock(&file->mutex);
-
-	put_uobj(uobj);
 	ret = in_len;
 
 out:
+	if (!destroyed)
+		uverbs_rollback_object(uobj, UVERBS_IDR_ACCESS_DESTROY);
 	mutex_unlock(&file->device->xrcd_tree_mutex);
 	return ret;
 }
@@ -982,12 +736,10 @@ ssize_t ib_uverbs_reg_mr(struct ib_uverbs_file *file,
 	if (ret)
 		return ret;
 
-	uobj = kmalloc(sizeof *uobj, GFP_KERNEL);
-	if (!uobj)
-		return -ENOMEM;
-
-	init_uobj(uobj, 0, file->ucontext, &mr_lock_class);
-	down_write(&uobj->mutex);
+	uobj = uverbs_get_type_from_idr(uverbs_type_mr.alloc, file->ucontext,
+					UVERBS_IDR_ACCESS_NEW, 0);
+	if (IS_ERR(uobj))
+		return PTR_ERR(uobj);
 
 	pd = idr_read_pd(cmd.pd_handle, file->ucontext);
 	if (!pd) {
@@ -1017,9 +769,6 @@ ssize_t ib_uverbs_reg_mr(struct ib_uverbs_file *file,
 	atomic_inc(&pd->usecnt);
 
 	uobj->object = mr;
-	ret = idr_add_uobj(uobj);
-	if (ret)
-		goto err_unreg;
 
 	memset(&resp, 0, sizeof resp);
 	resp.lkey      = mr->lkey;
@@ -1032,29 +781,20 @@ ssize_t ib_uverbs_reg_mr(struct ib_uverbs_file *file,
 		goto err_copy;
 	}
 
-	put_pd_read(pd);
-
-	mutex_lock(&file->mutex);
-	list_add_tail(&uobj->list, &file->ucontext->mr_list);
-	mutex_unlock(&file->mutex);
-
-	uobj->live = 1;
+	uverbs_commit_object(pd->uobject, UVERBS_IDR_ACCESS_READ);
 
-	up_write(&uobj->mutex);
+	uverbs_commit_object(uobj, UVERBS_IDR_ACCESS_NEW);
 
 	return in_len;
 
 err_copy:
-	idr_remove_uobj(uobj);
-
-err_unreg:
 	ib_dereg_mr(mr);
 
 err_put:
-	put_pd_read(pd);
+	uverbs_rollback_object(pd->uobject, UVERBS_IDR_ACCESS_READ);
 
 err_free:
-	put_uobj_write(uobj);
+	uverbs_rollback_object(uobj, UVERBS_IDR_ACCESS_NEW);
 	return ret;
 }
 
@@ -1090,10 +830,10 @@ ssize_t ib_uverbs_rereg_mr(struct ib_uverbs_file *file,
 	     (cmd.start & ~PAGE_MASK) != (cmd.hca_va & ~PAGE_MASK)))
 			return -EINVAL;
 
-	uobj = idr_write_uobj(cmd.mr_handle, file->ucontext);
-
-	if (!uobj)
-		return -EINVAL;
+	uobj = uverbs_get_type_from_idr(uverbs_type_mr.alloc, file->ucontext,
+					UVERBS_IDR_ACCESS_WRITE, cmd.mr_handle);
+	if (IS_ERR(uobj))
+		return PTR_ERR(uobj);
 
 	mr = uobj->object;
 
@@ -1136,12 +876,23 @@ ssize_t ib_uverbs_rereg_mr(struct ib_uverbs_file *file,
 		ret = in_len;
 
 put_uobj_pd:
-	if (cmd.flags & IB_MR_REREG_PD)
-		put_pd_read(pd);
+	if (cmd.flags & IB_MR_REREG_PD) {
+		if (ret == in_len)
+			uverbs_commit_object(pd->uobject,
+					     UVERBS_IDR_ACCESS_READ);
+		else
+			uverbs_rollback_object(pd->uobject,
+					       UVERBS_IDR_ACCESS_READ);
+	}
 
 put_uobjs:
 
-	put_uobj_write(mr->uobject);
+	if (ret == in_len)
+		uverbs_commit_object(uobj,
+				     UVERBS_IDR_ACCESS_WRITE);
+	else
+		uverbs_rollback_object(uobj,
+				       UVERBS_IDR_ACCESS_WRITE);
 
 	return ret;
 }
@@ -1159,28 +910,22 @@ ssize_t ib_uverbs_dereg_mr(struct ib_uverbs_file *file,
 	if (copy_from_user(&cmd, buf, sizeof cmd))
 		return -EFAULT;
 
-	uobj = idr_write_uobj(cmd.mr_handle, file->ucontext);
-	if (!uobj)
-		return -EINVAL;
+	uobj = uverbs_get_type_from_idr(uverbs_type_mr.alloc, file->ucontext,
+					UVERBS_IDR_ACCESS_DESTROY,
+					cmd.mr_handle);
+	if (IS_ERR(uobj))
+		return PTR_ERR(uobj);
 
 	mr = uobj->object;
 
 	ret = ib_dereg_mr(mr);
-	if (!ret)
-		uobj->live = 0;
 
-	put_uobj_write(uobj);
-
-	if (ret)
+	if (ret) {
+		uverbs_rollback_object(uobj, UVERBS_IDR_ACCESS_DESTROY);
 		return ret;
+	}
 
-	idr_remove_uobj(uobj);
-
-	mutex_lock(&file->mutex);
-	list_del(&uobj->list);
-	mutex_unlock(&file->mutex);
-
-	put_uobj(uobj);
+	uverbs_commit_object(uobj, UVERBS_IDR_ACCESS_DESTROY);
 
 	return in_len;
 }
@@ -1204,12 +949,10 @@ ssize_t ib_uverbs_alloc_mw(struct ib_uverbs_file *file,
 	if (copy_from_user(&cmd, buf, sizeof(cmd)))
 		return -EFAULT;
 
-	uobj = kmalloc(sizeof(*uobj), GFP_KERNEL);
-	if (!uobj)
-		return -ENOMEM;
-
-	init_uobj(uobj, 0, file->ucontext, &mw_lock_class);
-	down_write(&uobj->mutex);
+	uobj = uverbs_get_type_from_idr(uverbs_type_mw.alloc, file->ucontext,
+					UVERBS_IDR_ACCESS_NEW, 0);
+	if (IS_ERR(uobj))
+		return PTR_ERR(uobj);
 
 	pd = idr_read_pd(cmd.pd_handle, file->ucontext);
 	if (!pd) {
@@ -1234,9 +977,6 @@ ssize_t ib_uverbs_alloc_mw(struct ib_uverbs_file *file,
 	atomic_inc(&pd->usecnt);
 
 	uobj->object = mw;
-	ret = idr_add_uobj(uobj);
-	if (ret)
-		goto err_unalloc;
 
 	memset(&resp, 0, sizeof(resp));
 	resp.rkey      = mw->rkey;
@@ -1248,29 +988,17 @@ ssize_t ib_uverbs_alloc_mw(struct ib_uverbs_file *file,
 		goto err_copy;
 	}
 
-	put_pd_read(pd);
-
-	mutex_lock(&file->mutex);
-	list_add_tail(&uobj->list, &file->ucontext->mw_list);
-	mutex_unlock(&file->mutex);
-
-	uobj->live = 1;
-
-	up_write(&uobj->mutex);
+	uverbs_commit_object(pd->uobject, UVERBS_IDR_ACCESS_READ);
+	uverbs_commit_object(uobj, UVERBS_IDR_ACCESS_NEW);
 
 	return in_len;
 
 err_copy:
-	idr_remove_uobj(uobj);
-
-err_unalloc:
 	uverbs_dealloc_mw(mw);
-
 err_put:
-	put_pd_read(pd);
-
+	uverbs_rollback_object(pd->uobject, UVERBS_IDR_ACCESS_READ);
 err_free:
-	put_uobj_write(uobj);
+	uverbs_rollback_object(uobj, UVERBS_IDR_ACCESS_NEW);
 	return ret;
 }
 
@@ -1287,28 +1015,21 @@ ssize_t ib_uverbs_dealloc_mw(struct ib_uverbs_file *file,
 	if (copy_from_user(&cmd, buf, sizeof(cmd)))
 		return -EFAULT;
 
-	uobj = idr_write_uobj(cmd.mw_handle, file->ucontext);
-	if (!uobj)
-		return -EINVAL;
+	uobj = uverbs_get_type_from_idr(uverbs_type_mw.alloc, file->ucontext,
+					UVERBS_IDR_ACCESS_DESTROY,
+					cmd.mw_handle);
+	if (IS_ERR(uobj))
+		return PTR_ERR(uobj);
 
 	mw = uobj->object;
 
 	ret = uverbs_dealloc_mw(mw);
-	if (!ret)
-		uobj->live = 0;
-
-	put_uobj_write(uobj);
-
-	if (ret)
+	if (ret) {
+		uverbs_rollback_object(uobj, UVERBS_IDR_ACCESS_DESTROY);
 		return ret;
+	}
 
-	idr_remove_uobj(uobj);
-
-	mutex_lock(&file->mutex);
-	list_del(&uobj->list);
-	mutex_unlock(&file->mutex);
-
-	put_uobj(uobj);
+	uverbs_commit_object(uobj, UVERBS_IDR_ACCESS_DESTROY);
 
 	return in_len;
 }
@@ -1320,8 +1041,8 @@ ssize_t ib_uverbs_create_comp_channel(struct ib_uverbs_file *file,
 {
 	struct ib_uverbs_create_comp_channel	   cmd;
 	struct ib_uverbs_create_comp_channel_resp  resp;
-	struct file				  *filp;
-	int ret;
+	struct ib_uobject			  *uobj;
+	struct ib_uverbs_event_file		  *ev_file;
 
 	if (out_len < sizeof resp)
 		return -ENOSPC;
@@ -1329,25 +1050,30 @@ ssize_t ib_uverbs_create_comp_channel(struct ib_uverbs_file *file,
 	if (copy_from_user(&cmd, buf, sizeof cmd))
 		return -EFAULT;
 
-	ret = get_unused_fd_flags(O_CLOEXEC);
-	if (ret < 0)
-		return ret;
-	resp.fd = ret;
+	uobj = uverbs_get_type_from_fd(uverbs_type_comp_channel.alloc,
+				       file->ucontext,
+				       UVERBS_IDR_ACCESS_NEW, 0);
+	if (IS_ERR(uobj))
+		return PTR_ERR(uobj);
 
-	filp = ib_uverbs_alloc_event_file(file, ib_dev, 0);
-	if (IS_ERR(filp)) {
-		put_unused_fd(resp.fd);
-		return PTR_ERR(filp);
-	}
+	resp.fd = uobj->id;
+
+	ev_file = uverbs_fd_to_priv(uobj);
+	kref_init(&ev_file->ref);
+	spin_lock_init(&ev_file->lock);
+	INIT_LIST_HEAD(&ev_file->event_list);
+	init_waitqueue_head(&ev_file->poll_wait);
+	ev_file->async_queue = NULL;
+	ev_file->uverbs_file = file;
+	ev_file->is_closed   = 0;
 
 	if (copy_to_user((void __user *) (unsigned long) cmd.response,
 			 &resp, sizeof resp)) {
-		put_unused_fd(resp.fd);
-		fput(filp);
+		uverbs_rollback_object(uobj, UVERBS_IDR_ACCESS_NEW);
 		return -EFAULT;
 	}
 
-	fd_install(resp.fd, filp);
+	uverbs_commit_object(uobj, UVERBS_IDR_ACCESS_NEW);
 	return in_len;
 }
 
@@ -1365,6 +1091,7 @@ static struct ib_ucq_object *create_cq(struct ib_uverbs_file *file,
 				       void *context)
 {
 	struct ib_ucq_object           *obj;
+	struct ib_uobject	       *ev_uobj = NULL;
 	struct ib_uverbs_event_file    *ev_file = NULL;
 	struct ib_cq                   *cq;
 	int                             ret;
@@ -1374,21 +1101,27 @@ static struct ib_ucq_object *create_cq(struct ib_uverbs_file *file,
 	if (cmd->comp_vector >= file->device->num_comp_vectors)
 		return ERR_PTR(-EINVAL);
 
-	obj = kmalloc(sizeof *obj, GFP_KERNEL);
-	if (!obj)
-		return ERR_PTR(-ENOMEM);
-
-	init_uobj(&obj->uobject, cmd->user_handle, file->ucontext, &cq_lock_class);
-	down_write(&obj->uobject.mutex);
+	obj = (struct ib_ucq_object *)uverbs_get_type_from_idr(
+						uverbs_type_cq.alloc,
+						file->ucontext,
+						UVERBS_IDR_ACCESS_NEW, 0);
+	if (IS_ERR(obj))
+		return obj;
 
 	if (cmd->comp_channel >= 0) {
-		ev_file = ib_uverbs_lookup_comp_file(cmd->comp_channel);
-		if (!ev_file) {
+		ev_uobj = uverbs_get_type_from_fd(uverbs_type_comp_channel.alloc,
+						  file->ucontext,
+						  UVERBS_IDR_ACCESS_READ,
+						  cmd->comp_channel);
+		if (IS_ERR(ev_uobj)) {
 			ret = -EINVAL;
 			goto err;
 		}
+		ev_file = uverbs_fd_to_priv(ev_uobj);
+		kref_get(&ev_file->ref);
 	}
 
+	obj->uobject.user_handle = cmd->user_handle;
 	obj->uverbs_file	   = file;
 	obj->comp_events_reported  = 0;
 	obj->async_events_reported = 0;
@@ -1416,10 +1149,6 @@ static struct ib_ucq_object *create_cq(struct ib_uverbs_file *file,
 	atomic_set(&cq->usecnt, 0);
 
 	obj->uobject.object = cq;
-	ret = idr_add_uobj(&obj->uobject);
-	if (ret)
-		goto err_free;
-
 	memset(&resp, 0, sizeof resp);
 	resp.base.cq_handle = obj->uobject.id;
 	resp.base.cqe       = cq->cqe;
@@ -1431,28 +1160,20 @@ static struct ib_ucq_object *create_cq(struct ib_uverbs_file *file,
 	if (ret)
 		goto err_cb;
 
-	mutex_lock(&file->mutex);
-	list_add_tail(&obj->uobject.list, &file->ucontext->cq_list);
-	mutex_unlock(&file->mutex);
-
-	obj->uobject.live = 1;
-
-	up_write(&obj->uobject.mutex);
+	if (ev_uobj)
+		uverbs_commit_object(ev_uobj, UVERBS_IDR_ACCESS_READ);
+	uverbs_commit_object(&obj->uobject, UVERBS_IDR_ACCESS_NEW);
 
 	return obj;
 
 err_cb:
-	idr_remove_uobj(&obj->uobject);
-
-err_free:
 	ib_destroy_cq(cq);
 
 err_file:
-	if (ev_file)
-		ib_uverbs_release_ucq(file, ev_file, obj);
-
+	if (ev_uobj)
+		uverbs_rollback_object(ev_uobj, UVERBS_IDR_ACCESS_READ);
 err:
-	put_uobj_write(&obj->uobject);
+	uverbs_rollback_object(&obj->uobject, UVERBS_IDR_ACCESS_NEW);
 
 	return ERR_PTR(ret);
 }
@@ -1575,7 +1296,7 @@ ssize_t ib_uverbs_resize_cq(struct ib_uverbs_file *file,
 		   (unsigned long) cmd.response + sizeof resp,
 		   in_len - sizeof cmd, out_len - sizeof resp);
 
-	cq = idr_read_cq(cmd.cq_handle, file->ucontext, 0);
+	cq = idr_read_cq(cmd.cq_handle, file->ucontext);
 	if (!cq)
 		return -EINVAL;
 
@@ -1590,7 +1311,10 @@ ssize_t ib_uverbs_resize_cq(struct ib_uverbs_file *file,
 		ret = -EFAULT;
 
 out:
-	put_cq_read(cq);
+	if (!ret)
+		uverbs_commit_object(cq->uobject, UVERBS_IDR_ACCESS_READ);
+	else
+		uverbs_rollback_object(cq->uobject, UVERBS_IDR_ACCESS_READ);
 
 	return ret ? ret : in_len;
 }
@@ -1637,7 +1361,7 @@ ssize_t ib_uverbs_poll_cq(struct ib_uverbs_file *file,
 	if (copy_from_user(&cmd, buf, sizeof cmd))
 		return -EFAULT;
 
-	cq = idr_read_cq(cmd.cq_handle, file->ucontext, 0);
+	cq = idr_read_cq(cmd.cq_handle, file->ucontext);
 	if (!cq)
 		return -EINVAL;
 
@@ -1669,7 +1393,10 @@ ssize_t ib_uverbs_poll_cq(struct ib_uverbs_file *file,
 	ret = in_len;
 
 out_put:
-	put_cq_read(cq);
+	if (ret == in_len)
+		uverbs_commit_object(cq->uobject, UVERBS_IDR_ACCESS_READ);
+	else
+		uverbs_rollback_object(cq->uobject, UVERBS_IDR_ACCESS_READ);
 	return ret;
 }
 
@@ -1684,14 +1411,14 @@ ssize_t ib_uverbs_req_notify_cq(struct ib_uverbs_file *file,
 	if (copy_from_user(&cmd, buf, sizeof cmd))
 		return -EFAULT;
 
-	cq = idr_read_cq(cmd.cq_handle, file->ucontext, 0);
+	cq = idr_read_cq(cmd.cq_handle, file->ucontext);
 	if (!cq)
 		return -EINVAL;
 
 	ib_req_notify_cq(cq, cmd.solicited_only ?
 			 IB_CQ_SOLICITED : IB_CQ_NEXT_COMP);
 
-	put_cq_read(cq);
+	uverbs_commit_object(cq->uobject, UVERBS_IDR_ACCESS_READ);
 
 	return in_len;
 }
@@ -1712,36 +1439,29 @@ ssize_t ib_uverbs_destroy_cq(struct ib_uverbs_file *file,
 	if (copy_from_user(&cmd, buf, sizeof cmd))
 		return -EFAULT;
 
-	uobj = idr_write_uobj(cmd.cq_handle, file->ucontext);
-	if (!uobj)
-		return -EINVAL;
+	uobj = uverbs_get_type_from_idr(uverbs_type_cq.alloc,
+					file->ucontext,
+					UVERBS_IDR_ACCESS_DESTROY,
+					cmd.cq_handle);
+	if (IS_ERR(uobj))
+		return PTR_ERR(uobj);
+
 	cq      = uobj->object;
 	ev_file = cq->cq_context;
 	obj     = container_of(cq->uobject, struct ib_ucq_object, uobject);
 
 	ret = ib_destroy_cq(cq);
-	if (!ret)
-		uobj->live = 0;
-
-	put_uobj_write(uobj);
-
-	if (ret)
+	if (ret) {
+		uverbs_rollback_object(uobj, UVERBS_IDR_ACCESS_DESTROY);
 		return ret;
+	}
 
-	idr_remove_uobj(uobj);
-
-	mutex_lock(&file->mutex);
-	list_del(&uobj->list);
-	mutex_unlock(&file->mutex);
-
-	ib_uverbs_release_ucq(file, ev_file, obj);
+	uverbs_commit_object(uobj, UVERBS_IDR_ACCESS_DESTROY);
 
 	memset(&resp, 0, sizeof resp);
 	resp.comp_events_reported  = obj->comp_events_reported;
 	resp.async_events_reported = obj->async_events_reported;
 
-	put_uobj(uobj);
-
 	if (copy_to_user((void __user *) (unsigned long) cmd.response,
 			 &resp, sizeof resp))
 		return -EFAULT;
@@ -1777,13 +1497,15 @@ static int create_qp(struct ib_uverbs_file *file,
 	if (cmd->qp_type == IB_QPT_RAW_PACKET && !capable(CAP_NET_RAW))
 		return -EPERM;
 
-	obj = kzalloc(sizeof *obj, GFP_KERNEL);
-	if (!obj)
-		return -ENOMEM;
+	obj = (struct ib_uqp_object *)uverbs_get_type_from_idr(
+						uverbs_type_qp.alloc,
+						file->ucontext,
+						UVERBS_IDR_ACCESS_NEW, 0);
+	if (IS_ERR(obj))
+		return PTR_ERR(obj);
+	obj->uxrcd = NULL;
+	obj->uevent.uobject.user_handle = cmd->user_handle;
 
-	init_uobj(&obj->uevent.uobject, cmd->user_handle, file->ucontext,
-		  &qp_lock_class);
-	down_write(&obj->uevent.uobject.mutex);
 	if (cmd_sz >= offsetof(typeof(*cmd), rwq_ind_tbl_handle) +
 		      sizeof(cmd->rwq_ind_tbl_handle) &&
 		      (cmd->comp_mask & IB_UVERBS_CREATE_QP_MASK_IND_TABLE)) {
@@ -1836,7 +1558,7 @@ static int create_qp(struct ib_uverbs_file *file,
 			if (!ind_tbl) {
 				if (cmd->recv_cq_handle != cmd->send_cq_handle) {
 					rcq = idr_read_cq(cmd->recv_cq_handle,
-							  file->ucontext, 0);
+							  file->ucontext);
 					if (!rcq) {
 						ret = -EINVAL;
 						goto err_put;
@@ -1846,7 +1568,7 @@ static int create_qp(struct ib_uverbs_file *file,
 		}
 
 		if (has_sq)
-			scq = idr_read_cq(cmd->send_cq_handle, file->ucontext, !!rcq);
+			scq = idr_read_cq(cmd->send_cq_handle, file->ucontext);
 		if (!ind_tbl)
 			rcq = rcq ?: scq;
 		pd  = idr_read_pd(cmd->pd_handle, file->ucontext);
@@ -1935,9 +1657,6 @@ static int create_qp(struct ib_uverbs_file *file,
 	qp->uobject = &obj->uevent.uobject;
 
 	obj->uevent.uobject.object = qp;
-	ret = idr_add_uobj(&obj->uevent.uobject);
-	if (ret)
-		goto err_destroy;
 
 	memset(&resp, 0, sizeof resp);
 	resp.base.qpn             = qp->qp_num;
@@ -1959,50 +1678,42 @@ static int create_qp(struct ib_uverbs_file *file,
 		obj->uxrcd = container_of(xrcd_uobj, struct ib_uxrcd_object,
 					  uobject);
 		atomic_inc(&obj->uxrcd->refcnt);
-		put_xrcd_read(xrcd_uobj);
+		uverbs_commit_object(xrcd_uobj, UVERBS_IDR_ACCESS_READ);
 	}
 
 	if (pd)
-		put_pd_read(pd);
+		uverbs_commit_object(pd->uobject, UVERBS_IDR_ACCESS_READ);
 	if (scq)
-		put_cq_read(scq);
+		uverbs_commit_object(scq->uobject, UVERBS_IDR_ACCESS_READ);
 	if (rcq && rcq != scq)
-		put_cq_read(rcq);
+		uverbs_commit_object(rcq->uobject, UVERBS_IDR_ACCESS_READ);
 	if (srq)
-		put_srq_read(srq);
+		uverbs_commit_object(srq->uobject, UVERBS_IDR_ACCESS_READ);
 	if (ind_tbl)
-		put_rwq_indirection_table_read(ind_tbl);
+		uverbs_commit_object(ind_tbl->uobject, UVERBS_IDR_ACCESS_READ);
 
-	mutex_lock(&file->mutex);
-	list_add_tail(&obj->uevent.uobject.list, &file->ucontext->qp_list);
-	mutex_unlock(&file->mutex);
-
-	obj->uevent.uobject.live = 1;
-
-	up_write(&obj->uevent.uobject.mutex);
+	uverbs_commit_object(&obj->uevent.uobject, UVERBS_IDR_ACCESS_NEW);
 
 	return 0;
-err_cb:
-	idr_remove_uobj(&obj->uevent.uobject);
 
-err_destroy:
+err_cb:
 	ib_destroy_qp(qp);
 
 err_put:
 	if (xrcd)
-		put_xrcd_read(xrcd_uobj);
+		uverbs_rollback_object(xrcd_uobj, UVERBS_IDR_ACCESS_READ);
 	if (pd)
-		put_pd_read(pd);
+		uverbs_rollback_object(pd->uobject, UVERBS_IDR_ACCESS_READ);
 	if (scq)
-		put_cq_read(scq);
+		uverbs_rollback_object(scq->uobject, UVERBS_IDR_ACCESS_READ);
 	if (rcq && rcq != scq)
-		put_cq_read(rcq);
+		uverbs_rollback_object(rcq->uobject, UVERBS_IDR_ACCESS_READ);
 	if (srq)
-		put_srq_read(srq);
+		uverbs_rollback_object(srq->uobject, UVERBS_IDR_ACCESS_READ);
 	if (ind_tbl)
-		put_rwq_indirection_table_read(ind_tbl);
+		uverbs_rollback_object(ind_tbl->uobject, UVERBS_IDR_ACCESS_READ);
 
-	put_uobj_write(&obj->uevent.uobject);
+	uverbs_rollback_object(&obj->uevent.uobject, UVERBS_IDR_ACCESS_NEW);
 	return ret;
 }
 
@@ -2138,12 +1849,13 @@ ssize_t ib_uverbs_open_qp(struct ib_uverbs_file *file,
 		   (unsigned long) cmd.response + sizeof resp,
 		   in_len - sizeof cmd, out_len - sizeof resp);
 
-	obj = kmalloc(sizeof *obj, GFP_KERNEL);
-	if (!obj)
-		return -ENOMEM;
-
-	init_uobj(&obj->uevent.uobject, cmd.user_handle, file->ucontext, &qp_lock_class);
-	down_write(&obj->uevent.uobject.mutex);
+	obj = (struct ib_uqp_object *)uverbs_get_type_from_idr(
+						uverbs_type_qp.alloc,
+						file->ucontext,
+						UVERBS_IDR_ACCESS_NEW, 0);
+	if (IS_ERR(obj))
+		return PTR_ERR(obj);
+	obj->uxrcd = NULL;
 
 	xrcd = idr_read_xrcd(cmd.pd_handle, file->ucontext, &xrcd_uobj);
 	if (!xrcd) {
@@ -2163,15 +1875,12 @@ ssize_t ib_uverbs_open_qp(struct ib_uverbs_file *file,
 	qp = ib_open_qp(xrcd, &attr);
 	if (IS_ERR(qp)) {
 		ret = PTR_ERR(qp);
-		goto err_put;
+		goto err_xrcd;
 	}
 
 	qp->uobject = &obj->uevent.uobject;
 
 	obj->uevent.uobject.object = qp;
-	ret = idr_add_uobj(&obj->uevent.uobject);
-	if (ret)
-		goto err_destroy;
 
 	memset(&resp, 0, sizeof resp);
 	resp.qpn       = qp->qp_num;
@@ -2180,32 +1889,23 @@ ssize_t ib_uverbs_open_qp(struct ib_uverbs_file *file,
 	if (copy_to_user((void __user *) (unsigned long) cmd.response,
 			 &resp, sizeof resp)) {
 		ret = -EFAULT;
-		goto err_remove;
+		goto err_destroy;
 	}
 
 	obj->uxrcd = container_of(xrcd_uobj, struct ib_uxrcd_object, uobject);
 	atomic_inc(&obj->uxrcd->refcnt);
-	put_xrcd_read(xrcd_uobj);
-
-	mutex_lock(&file->mutex);
-	list_add_tail(&obj->uevent.uobject.list, &file->ucontext->qp_list);
-	mutex_unlock(&file->mutex);
-
-	obj->uevent.uobject.live = 1;
+	uverbs_commit_object(xrcd_uobj, UVERBS_IDR_ACCESS_READ);
 
-	up_write(&obj->uevent.uobject.mutex);
+	uverbs_commit_object(&obj->uevent.uobject, UVERBS_IDR_ACCESS_NEW);
 
 	return in_len;
 
-err_remove:
-	idr_remove_uobj(&obj->uevent.uobject);
-
 err_destroy:
 	ib_destroy_qp(qp);
-
+err_xrcd:
+	uverbs_rollback_object(xrcd_uobj, UVERBS_IDR_ACCESS_READ);
 err_put:
-	put_xrcd_read(xrcd_uobj);
-	put_uobj_write(&obj->uevent.uobject);
+	uverbs_rollback_object(&obj->uevent.uobject, UVERBS_IDR_ACCESS_NEW);
 	return ret;
 }
 
@@ -2238,11 +1938,10 @@ ssize_t ib_uverbs_query_qp(struct ib_uverbs_file *file,
 	}
 
 	ret = ib_query_qp(qp, attr, cmd.attr_mask, init_attr);
-
-	put_qp_read(qp);
-
 	if (ret)
-		goto out;
+		goto out_query;
+
+	uverbs_commit_object(qp->uobject, UVERBS_IDR_ACCESS_READ);
 
 	memset(&resp, 0, sizeof resp);
 
@@ -2308,6 +2007,10 @@ out:
 	kfree(init_attr);
 
 	return ret ? ret : in_len;
+
+out_query:
+	uverbs_rollback_object(qp->uobject, UVERBS_IDR_ACCESS_READ);
+	return ret;
 }
 
 /* Remove ignored fields set in the attribute mask */
@@ -2413,7 +2116,10 @@ ssize_t ib_uverbs_modify_qp(struct ib_uverbs_file *file,
 	ret = in_len;
 
 release_qp:
-	put_qp_read(qp);
+	if (ret == in_len)
+		uverbs_commit_object(qp->uobject, UVERBS_IDR_ACCESS_READ);
+	else
+		uverbs_rollback_object(qp->uobject, UVERBS_IDR_ACCESS_READ);
 
 out:
 	kfree(attr);
@@ -2438,40 +2144,34 @@ ssize_t ib_uverbs_destroy_qp(struct ib_uverbs_file *file,
 
 	memset(&resp, 0, sizeof resp);
 
-	uobj = idr_write_uobj(cmd.qp_handle, file->ucontext);
-	if (!uobj)
-		return -EINVAL;
+	uobj = uverbs_get_type_from_idr(uverbs_type_qp.alloc, file->ucontext,
+					UVERBS_IDR_ACCESS_DESTROY,
+					cmd.qp_handle);
+	if (IS_ERR(uobj))
+		return PTR_ERR(uobj);
+
 	qp  = uobj->object;
 	obj = container_of(uobj, struct ib_uqp_object, uevent.uobject);
 
 	if (!list_empty(&obj->mcast_list)) {
-		put_uobj_write(uobj);
+		uverbs_rollback_object(uobj, UVERBS_IDR_ACCESS_DESTROY);
 		return -EBUSY;
 	}
 
 	ret = ib_destroy_qp(qp);
-	if (!ret)
-		uobj->live = 0;
-
-	put_uobj_write(uobj);
-
-	if (ret)
+	if (ret) {
+		uverbs_rollback_object(uobj, UVERBS_IDR_ACCESS_DESTROY);
 		return ret;
+	}
 
 	if (obj->uxrcd)
 		atomic_dec(&obj->uxrcd->refcnt);
 
-	idr_remove_uobj(uobj);
-
-	mutex_lock(&file->mutex);
-	list_del(&uobj->list);
-	mutex_unlock(&file->mutex);
-
 	ib_uverbs_release_uevent(file, &obj->uevent);
 
 	resp.events_reported = obj->uevent.events_reported;
 
-	put_uobj(uobj);
+	uverbs_commit_object(uobj, UVERBS_IDR_ACCESS_DESTROY);
 
 	if (copy_to_user((void __user *) (unsigned long) cmd.response,
 			 &resp, sizeof resp))
@@ -2658,11 +2358,20 @@ ssize_t ib_uverbs_post_send(struct ib_uverbs_file *file,
 		ret = -EFAULT;
 
 out_put:
-	put_qp_read(qp);
+	if (ret)
+		uverbs_rollback_object(qp->uobject, UVERBS_IDR_ACCESS_READ);
+	else
+		uverbs_commit_object(qp->uobject, UVERBS_IDR_ACCESS_READ);
 
 	while (wr) {
-		if (is_ud && ud_wr(wr)->ah)
-			put_ah_read(ud_wr(wr)->ah);
+		if (is_ud && ud_wr(wr)->ah) {
+			if (ret)
+				uverbs_rollback_object(ud_wr(wr)->ah->uobject,
+						       UVERBS_IDR_ACCESS_READ);
+			else
+				uverbs_commit_object(ud_wr(wr)->ah->uobject,
+						     UVERBS_IDR_ACCESS_READ);
+		}
 		next = wr->next;
 		kfree(wr);
 		wr = next;
@@ -2786,14 +2495,16 @@ ssize_t ib_uverbs_post_recv(struct ib_uverbs_file *file,
 	resp.bad_wr = 0;
 	ret = qp->device->post_recv(qp->real_qp, wr, &bad_wr);
 
-	put_qp_read(qp);
-
-	if (ret)
+	if (ret) {
+		uverbs_rollback_object(qp->uobject, UVERBS_IDR_ACCESS_READ);
 		for (next = wr; next; next = next->next) {
 			++resp.bad_wr;
 			if (next == bad_wr)
 				break;
 		}
+	} else {
+		uverbs_commit_object(qp->uobject, UVERBS_IDR_ACCESS_READ);
+	}
 
 	if (copy_to_user((void __user *) (unsigned long) cmd.response,
 			 &resp, sizeof resp))
@@ -2836,7 +2547,10 @@ ssize_t ib_uverbs_post_srq_recv(struct ib_uverbs_file *file,
 	resp.bad_wr = 0;
 	ret = srq->device->post_srq_recv(srq, wr, &bad_wr);
 
-	put_srq_read(srq);
+	if (!ret)
+		uverbs_commit_object(srq->uobject, UVERBS_IDR_ACCESS_READ);
+	else
+		uverbs_rollback_object(srq->uobject, UVERBS_IDR_ACCESS_READ);
 
 	if (ret)
 		for (next = wr; next; next = next->next) {
@@ -2878,12 +2592,11 @@ ssize_t ib_uverbs_create_ah(struct ib_uverbs_file *file,
 	if (copy_from_user(&cmd, buf, sizeof cmd))
 		return -EFAULT;
 
-	uobj = kmalloc(sizeof *uobj, GFP_KERNEL);
-	if (!uobj)
-		return -ENOMEM;
-
-	init_uobj(uobj, cmd.user_handle, file->ucontext, &ah_lock_class);
-	down_write(&uobj->mutex);
+	uobj = uverbs_get_type_from_idr(uverbs_type_ah.alloc,
+				       file->ucontext,
+				       UVERBS_IDR_ACCESS_NEW, 0);
+	if (IS_ERR(uobj))
+		return PTR_ERR(uobj);
 
 	pd = idr_read_pd(cmd.pd_handle, file->ucontext);
 	if (!pd) {
@@ -2913,10 +2626,6 @@ ssize_t ib_uverbs_create_ah(struct ib_uverbs_file *file,
 	ah->uobject  = uobj;
 	uobj->object = ah;
 
-	ret = idr_add_uobj(uobj);
-	if (ret)
-		goto err_destroy;
-
 	resp.ah_handle = uobj->id;
 
 	if (copy_to_user((void __user *) (unsigned long) cmd.response,
@@ -2925,29 +2634,23 @@ ssize_t ib_uverbs_create_ah(struct ib_uverbs_file *file,
 		goto err_copy;
 	}
 
-	put_pd_read(pd);
-
-	mutex_lock(&file->mutex);
-	list_add_tail(&uobj->list, &file->ucontext->ah_list);
-	mutex_unlock(&file->mutex);
-
-	uobj->live = 1;
+	uverbs_commit_object(pd->uobject, UVERBS_IDR_ACCESS_READ);
 
-	up_write(&uobj->mutex);
+	uverbs_commit_object(uobj, UVERBS_IDR_ACCESS_NEW);
 
 	return in_len;
 
 err_copy:
-	idr_remove_uobj(uobj);
-
-err_destroy:
 	ib_destroy_ah(ah);
 
 err_put:
-	put_pd_read(pd);
+	uverbs_rollback_object(pd->uobject, UVERBS_IDR_ACCESS_READ);
 
 err:
-	put_uobj_write(uobj);
+	uverbs_rollback_object(uobj, UVERBS_IDR_ACCESS_NEW);
 	return ret;
 }
 
@@ -2963,29 +2666,22 @@ ssize_t ib_uverbs_destroy_ah(struct ib_uverbs_file *file,
 	if (copy_from_user(&cmd, buf, sizeof cmd))
 		return -EFAULT;
 
-	uobj = idr_write_uobj(cmd.ah_handle, file->ucontext);
-	if (!uobj)
-		return -EINVAL;
+	uobj = uverbs_get_type_from_idr(uverbs_type_ah.alloc, file->ucontext,
+					UVERBS_IDR_ACCESS_DESTROY,
+					cmd.ah_handle);
+	if (IS_ERR(uobj))
+		return PTR_ERR(uobj);
+
 	ah = uobj->object;
 
 	ret = ib_destroy_ah(ah);
-	if (!ret)
-		uobj->live = 0;
-
-	put_uobj_write(uobj);
-
-	if (ret)
+	if (ret) {
+		uverbs_rollback_object(uobj, UVERBS_IDR_ACCESS_DESTROY);
 		return ret;
-
-	idr_remove_uobj(uobj);
-
-	mutex_lock(&file->mutex);
-	list_del(&uobj->list);
-	mutex_unlock(&file->mutex);
-
-	put_uobj(uobj);
-
-	return in_len;
+	} else {
+		uverbs_commit_object(uobj, UVERBS_IDR_ACCESS_DESTROY);
+		return in_len;
+	}
 }
 
 ssize_t ib_uverbs_attach_mcast(struct ib_uverbs_file *file,
@@ -3031,9 +2727,13 @@ ssize_t ib_uverbs_attach_mcast(struct ib_uverbs_file *file,
 		kfree(mcast);
 
 out_put:
-	put_qp_write(qp);
+	if (ret) {
+		uverbs_rollback_object(qp->uobject, UVERBS_IDR_ACCESS_WRITE);
+		return ret;
+	}
 
-	return ret ? ret : in_len;
+	uverbs_commit_object(qp->uobject, UVERBS_IDR_ACCESS_WRITE);
+	return in_len;
 }
 
 ssize_t ib_uverbs_detach_mcast(struct ib_uverbs_file *file,
@@ -3069,9 +2769,13 @@ ssize_t ib_uverbs_detach_mcast(struct ib_uverbs_file *file,
 		}
 
 out_put:
-	put_qp_write(qp);
+	if (ret) {
+		uverbs_rollback_object(qp->uobject, UVERBS_IDR_ACCESS_WRITE);
+		return ret;
+	}
 
-	return ret ? ret : in_len;
+	uverbs_commit_object(qp->uobject, UVERBS_IDR_ACCESS_WRITE);
+	return in_len;
 }
 
 static size_t kern_spec_filter_sz(struct ib_uverbs_flow_spec_hdr *spec)
@@ -3214,20 +2918,20 @@ int ib_uverbs_ex_create_wq(struct ib_uverbs_file *file,
 	if (cmd.comp_mask)
 		return -EOPNOTSUPP;
 
-	obj = kmalloc(sizeof(*obj), GFP_KERNEL);
-	if (!obj)
-		return -ENOMEM;
+	obj = (struct ib_uwq_object *)uverbs_get_type_from_idr(
+						uverbs_type_wq.alloc,
+						file->ucontext,
+						UVERBS_IDR_ACCESS_NEW, 0);
+	if (IS_ERR(obj))
+		return PTR_ERR(obj);
 
-	init_uobj(&obj->uevent.uobject, cmd.user_handle, file->ucontext,
-		  &wq_lock_class);
-	down_write(&obj->uevent.uobject.mutex);
 	pd  = idr_read_pd(cmd.pd_handle, file->ucontext);
 	if (!pd) {
 		err = -EINVAL;
 		goto err_uobj;
 	}
 
-	cq = idr_read_cq(cmd.cq_handle, file->ucontext, 0);
+	cq = idr_read_cq(cmd.cq_handle, file->ucontext);
 	if (!cq) {
 		err = -EINVAL;
 		goto err_put_pd;
@@ -3259,9 +2963,6 @@ int ib_uverbs_ex_create_wq(struct ib_uverbs_file *file,
 	atomic_inc(&cq->usecnt);
 	wq->uobject = &obj->uevent.uobject;
 	obj->uevent.uobject.object = wq;
-	err = idr_add_uobj(&obj->uevent.uobject);
-	if (err)
-		goto destroy_wq;
 
 	memset(&resp, 0, sizeof(resp));
 	resp.wq_handle = obj->uevent.uobject.id;
@@ -3274,27 +2975,19 @@ int ib_uverbs_ex_create_wq(struct ib_uverbs_file *file,
 	if (err)
 		goto err_copy;
 
-	put_pd_read(pd);
-	put_cq_read(cq);
-
-	mutex_lock(&file->mutex);
-	list_add_tail(&obj->uevent.uobject.list, &file->ucontext->wq_list);
-	mutex_unlock(&file->mutex);
-
-	obj->uevent.uobject.live = 1;
-	up_write(&obj->uevent.uobject.mutex);
+	uverbs_commit_object(pd->uobject, UVERBS_IDR_ACCESS_READ);
+	uverbs_commit_object(cq->uobject, UVERBS_IDR_ACCESS_READ);
+	uverbs_commit_object(&obj->uevent.uobject, UVERBS_IDR_ACCESS_NEW);
 	return 0;
 
 err_copy:
-	idr_remove_uobj(&obj->uevent.uobject);
-destroy_wq:
 	ib_destroy_wq(wq);
 err_put_cq:
-	put_cq_read(cq);
+	uverbs_rollback_object(cq->uobject, UVERBS_IDR_ACCESS_READ);
 err_put_pd:
-	put_pd_read(pd);
+	uverbs_rollback_object(pd->uobject, UVERBS_IDR_ACCESS_READ);
 err_uobj:
-	put_uobj_write(&obj->uevent.uobject);
+	uverbs_rollback_object(&obj->uevent.uobject, UVERBS_IDR_ACCESS_NEW);
 
 	return err;
 }
@@ -3335,30 +3028,23 @@ int ib_uverbs_ex_destroy_wq(struct ib_uverbs_file *file,
 		return -EOPNOTSUPP;
 
 	resp.response_length = required_resp_len;
-	uobj = idr_write_uobj(cmd.wq_handle,
-			      file->ucontext);
-	if (!uobj)
-		return -EINVAL;
+	uobj = uverbs_get_type_from_idr(uverbs_type_wq.alloc, file->ucontext,
+					UVERBS_IDR_ACCESS_DESTROY,
+					cmd.wq_handle);
+	if (IS_ERR(uobj))
+		return PTR_ERR(uobj);
 
 	wq = uobj->object;
 	obj = container_of(uobj, struct ib_uwq_object, uevent.uobject);
 	ret = ib_destroy_wq(wq);
-	if (!ret)
-		uobj->live = 0;
-
-	put_uobj_write(uobj);
-	if (ret)
+	if (ret) {
+		uverbs_rollback_object(uobj, UVERBS_IDR_ACCESS_DESTROY);
 		return ret;
-
-	idr_remove_uobj(uobj);
-
-	mutex_lock(&file->mutex);
-	list_del(&uobj->list);
-	mutex_unlock(&file->mutex);
+	}
 
 	ib_uverbs_release_uevent(file, &obj->uevent);
 	resp.events_reported = obj->uevent.events_reported;
-	put_uobj(uobj);
+	uverbs_commit_object(uobj, UVERBS_IDR_ACCESS_DESTROY);
 
 	ret = ib_copy_to_udata(ucore, &resp, resp.response_length);
 	if (ret)
@@ -3404,7 +3090,10 @@ int ib_uverbs_ex_modify_wq(struct ib_uverbs_file *file,
 	wq_attr.curr_wq_state = cmd.curr_wq_state;
 	wq_attr.wq_state = cmd.wq_state;
 	ret = wq->device->modify_wq(wq, &wq_attr, cmd.attr_mask, uhw);
-	put_wq_read(wq);
+	if (ret)
+		uverbs_rollback_object(wq->uobject, UVERBS_IDR_ACCESS_READ);
+	else
+		uverbs_commit_object(wq->uobject, UVERBS_IDR_ACCESS_READ);
 	return ret;
 }
 
@@ -3491,14 +3180,15 @@ int ib_uverbs_ex_create_rwq_ind_table(struct ib_uverbs_file *file,
 		wqs[num_read_wqs] = wq;
 	}
 
-	uobj = kmalloc(sizeof(*uobj), GFP_KERNEL);
-	if (!uobj) {
+	uobj = uverbs_get_type_from_idr(uverbs_type_rwq_ind_table.alloc,
+					file->ucontext,
+					UVERBS_IDR_ACCESS_NEW,
+					0);
+	if (IS_ERR(uobj)) {
-		err = -ENOMEM;
+		err = PTR_ERR(uobj);
 		goto put_wqs;
 	}
 
-	init_uobj(uobj, 0, file->ucontext, &rwq_ind_table_lock_class);
-	down_write(&uobj->mutex);
 	init_attr.log_ind_tbl_size = cmd.log_ind_tbl_size;
 	init_attr.ind_tbl = wqs;
 	rwq_ind_tbl = ib_dev->create_rwq_ind_table(ib_dev, &init_attr, uhw);
@@ -3518,10 +3208,6 @@ int ib_uverbs_ex_create_rwq_ind_table(struct ib_uverbs_file *file,
 	for (i = 0; i < num_wq_handles; i++)
 		atomic_inc(&wqs[i]->usecnt);
 
-	err = idr_add_uobj(uobj);
-	if (err)
-		goto destroy_ind_tbl;
-
 	resp.ind_tbl_handle = uobj->id;
 	resp.ind_tbl_num = rwq_ind_tbl->ind_tbl_num;
 	resp.response_length = required_resp_len;
@@ -3534,26 +3220,18 @@ int ib_uverbs_ex_create_rwq_ind_table(struct ib_uverbs_file *file,
 	kfree(wqs_handles);
 
 	for (j = 0; j < num_read_wqs; j++)
-		put_wq_read(wqs[j]);
-
-	mutex_lock(&file->mutex);
-	list_add_tail(&uobj->list, &file->ucontext->rwq_ind_tbl_list);
-	mutex_unlock(&file->mutex);
-
-	uobj->live = 1;
+		uverbs_commit_object(wqs[j]->uobject, UVERBS_IDR_ACCESS_READ);
 
-	up_write(&uobj->mutex);
+	uverbs_commit_object(uobj, UVERBS_IDR_ACCESS_NEW);
 	return 0;
 
 err_copy:
-	idr_remove_uobj(uobj);
-destroy_ind_tbl:
 	ib_destroy_rwq_ind_table(rwq_ind_tbl);
 err_uobj:
-	put_uobj_write(uobj);
+	uverbs_rollback_object(uobj, UVERBS_IDR_ACCESS_NEW);
 put_wqs:
 	for (j = 0; j < num_read_wqs; j++)
-		put_wq_read(wqs[j]);
+		uverbs_rollback_object(wqs[j]->uobject, UVERBS_IDR_ACCESS_READ);
 err_free:
 	kfree(wqs_handles);
 	kfree(wqs);
@@ -3589,29 +3267,23 @@ int ib_uverbs_ex_destroy_rwq_ind_table(struct ib_uverbs_file *file,
 	if (cmd.comp_mask)
 		return -EOPNOTSUPP;
 
-	uobj = idr_write_uobj(cmd.ind_tbl_handle,
-			      file->ucontext);
-	if (!uobj)
-		return -EINVAL;
+	uobj = uverbs_get_type_from_idr(uverbs_type_rwq_ind_table.alloc,
+					file->ucontext,
+					UVERBS_IDR_ACCESS_DESTROY,
+					cmd.ind_tbl_handle);
+	if (IS_ERR(uobj))
+		return PTR_ERR(uobj);
+
 	rwq_ind_tbl = uobj->object;
 	ind_tbl = rwq_ind_tbl->ind_tbl;
 
 	ret = ib_destroy_rwq_ind_table(rwq_ind_tbl);
-	if (!ret)
-		uobj->live = 0;
-
-	put_uobj_write(uobj);
-
-	if (ret)
+	if (ret) {
+		uverbs_rollback_object(uobj, UVERBS_IDR_ACCESS_DESTROY);
 		return ret;
+	}
 
-	idr_remove_uobj(uobj);
-
-	mutex_lock(&file->mutex);
-	list_del(&uobj->list);
-	mutex_unlock(&file->mutex);
-
-	put_uobj(uobj);
+	uverbs_commit_object(uobj, UVERBS_IDR_ACCESS_DESTROY);
 	kfree(ind_tbl);
 	return ret;
 }
@@ -3687,13 +3359,12 @@ int ib_uverbs_ex_create_flow(struct ib_uverbs_file *file,
 		kern_flow_attr = &cmd.flow_attr;
 	}
 
-	uobj = kmalloc(sizeof(*uobj), GFP_KERNEL);
-	if (!uobj) {
+	uobj = uverbs_get_type_from_idr(uverbs_type_flow.alloc, file->ucontext,
+					UVERBS_IDR_ACCESS_NEW, 0);
+	if (IS_ERR(uobj)) {
-		err = -ENOMEM;
+		err = PTR_ERR(uobj);
 		goto err_free_attr;
 	}
-	init_uobj(uobj, 0, file->ucontext, &rule_lock_class);
-	down_write(&uobj->mutex);
 
 	qp = idr_read_qp(cmd.qp_handle, file->ucontext);
 	if (!qp) {
@@ -3745,10 +3416,6 @@ int ib_uverbs_ex_create_flow(struct ib_uverbs_file *file,
 	flow_id->uobject = uobj;
 	uobj->object = flow_id;
 
-	err = idr_add_uobj(uobj);
-	if (err)
-		goto destroy_flow;
-
 	memset(&resp, 0, sizeof(resp));
 	resp.flow_handle = uobj->id;
 
@@ -3757,28 +3424,20 @@ int ib_uverbs_ex_create_flow(struct ib_uverbs_file *file,
 	if (err)
 		goto err_copy;
 
-	put_qp_read(qp);
-	mutex_lock(&file->mutex);
-	list_add_tail(&uobj->list, &file->ucontext->rule_list);
-	mutex_unlock(&file->mutex);
-
-	uobj->live = 1;
-
-	up_write(&uobj->mutex);
+	uverbs_commit_object(qp->uobject, UVERBS_IDR_ACCESS_READ);
+	uverbs_commit_object(uobj, UVERBS_IDR_ACCESS_NEW);
 	kfree(flow_attr);
 	if (cmd.flow_attr.num_of_specs)
 		kfree(kern_flow_attr);
 	return 0;
 err_copy:
-	idr_remove_uobj(uobj);
-destroy_flow:
 	ib_destroy_flow(flow_id);
 err_free:
 	kfree(flow_attr);
 err_put:
-	put_qp_read(qp);
+	uverbs_rollback_object(qp->uobject, UVERBS_IDR_ACCESS_READ);
 err_uobj:
-	put_uobj_write(uobj);
+	uverbs_rollback_object(uobj, UVERBS_IDR_ACCESS_NEW);
 err_free_attr:
 	if (cmd.flow_attr.num_of_specs)
 		kfree(kern_flow_attr);
@@ -3805,25 +3464,22 @@ int ib_uverbs_ex_destroy_flow(struct ib_uverbs_file *file,
 	if (cmd.comp_mask)
 		return -EINVAL;
 
-	uobj = idr_write_uobj(cmd.flow_handle, file->ucontext);
-	if (!uobj)
-		return -EINVAL;
+	uobj = uverbs_get_type_from_idr(uverbs_type_flow.alloc,
+					file->ucontext,
+					UVERBS_IDR_ACCESS_DESTROY,
+					cmd.flow_handle);
+	if (IS_ERR(uobj))
+		return PTR_ERR(uobj);
+
 	flow_id = uobj->object;
 
 	ret = ib_destroy_flow(flow_id);
-	if (!ret)
-		uobj->live = 0;
-
-	put_uobj_write(uobj);
-
-	idr_remove_uobj(uobj);
-
-	mutex_lock(&file->mutex);
-	list_del(&uobj->list);
-	mutex_unlock(&file->mutex);
-
-	put_uobj(uobj);
+	if (ret) {
+		uverbs_rollback_object(uobj, UVERBS_IDR_ACCESS_DESTROY);
+		return ret;
+	}
 
+	uverbs_commit_object(uobj, UVERBS_IDR_ACCESS_DESTROY);
 	return ret;
 }
 
@@ -3840,12 +3496,12 @@ static int __uverbs_create_xsrq(struct ib_uverbs_file *file,
 	struct ib_srq_init_attr          attr;
 	int ret;
 
-	obj = kmalloc(sizeof *obj, GFP_KERNEL);
-	if (!obj)
-		return -ENOMEM;
-
-	init_uobj(&obj->uevent.uobject, cmd->user_handle, file->ucontext, &srq_lock_class);
-	down_write(&obj->uevent.uobject.mutex);
+	obj = (struct ib_usrq_object *)uverbs_get_type_from_idr(
+						uverbs_type_srq.alloc,
+						file->ucontext,
+						UVERBS_IDR_ACCESS_NEW, 0);
+	if (IS_ERR(obj))
+		return PTR_ERR(obj);
 
 	if (cmd->srq_type == IB_SRQT_XRC) {
 		attr.ext.xrc.xrcd  = idr_read_xrcd(cmd->xrcd_handle, file->ucontext, &xrcd_uobj);
@@ -3857,7 +3513,7 @@ static int __uverbs_create_xsrq(struct ib_uverbs_file *file,
 		obj->uxrcd = container_of(xrcd_uobj, struct ib_uxrcd_object, uobject);
 		atomic_inc(&obj->uxrcd->refcnt);
 
-		attr.ext.xrc.cq  = idr_read_cq(cmd->cq_handle, file->ucontext, 0);
+		attr.ext.xrc.cq  = idr_read_cq(cmd->cq_handle, file->ucontext);
 		if (!attr.ext.xrc.cq) {
 			ret = -EINVAL;
 			goto err_put_xrcd;
@@ -3904,9 +3560,7 @@ static int __uverbs_create_xsrq(struct ib_uverbs_file *file,
 	atomic_set(&srq->usecnt, 0);
 
 	obj->uevent.uobject.object = srq;
-	ret = idr_add_uobj(&obj->uevent.uobject);
-	if (ret)
-		goto err_destroy;
+	obj->uevent.uobject.user_handle = cmd->user_handle;
 
 	memset(&resp, 0, sizeof resp);
 	resp.srq_handle = obj->uevent.uobject.id;
@@ -3922,42 +3576,34 @@ static int __uverbs_create_xsrq(struct ib_uverbs_file *file,
 	}
 
 	if (cmd->srq_type == IB_SRQT_XRC) {
-		put_uobj_read(xrcd_uobj);
-		put_cq_read(attr.ext.xrc.cq);
+		uverbs_commit_object(xrcd_uobj, UVERBS_IDR_ACCESS_READ);
+		uverbs_commit_object(attr.ext.xrc.cq->uobject,
+				     UVERBS_IDR_ACCESS_READ);
 	}
-	put_pd_read(pd);
-
-	mutex_lock(&file->mutex);
-	list_add_tail(&obj->uevent.uobject.list, &file->ucontext->srq_list);
-	mutex_unlock(&file->mutex);
-
-	obj->uevent.uobject.live = 1;
-
-	up_write(&obj->uevent.uobject.mutex);
+	uverbs_commit_object(pd->uobject, UVERBS_IDR_ACCESS_READ);
+	uverbs_commit_object(&obj->uevent.uobject, UVERBS_IDR_ACCESS_NEW);
 
 	return 0;
 
 err_copy:
-	idr_remove_uobj(&obj->uevent.uobject);
-
-err_destroy:
 	ib_destroy_srq(srq);
 
 err_put:
-	put_pd_read(pd);
+	uverbs_rollback_object(pd->uobject, UVERBS_IDR_ACCESS_READ);
 
 err_put_cq:
 	if (cmd->srq_type == IB_SRQT_XRC)
-		put_cq_read(attr.ext.xrc.cq);
+		uverbs_rollback_object(attr.ext.xrc.cq->uobject,
+				       UVERBS_IDR_ACCESS_READ);
 
 err_put_xrcd:
 	if (cmd->srq_type == IB_SRQT_XRC) {
 		atomic_dec(&obj->uxrcd->refcnt);
-		put_uobj_read(xrcd_uobj);
+		uverbs_rollback_object(xrcd_uobj, UVERBS_IDR_ACCESS_READ);
 	}
 
 err:
-	put_uobj_write(&obj->uevent.uobject);
+	uverbs_rollback_object(&obj->uevent.uobject, UVERBS_IDR_ACCESS_NEW);
 	return ret;
 }
 
@@ -4051,9 +3697,13 @@ ssize_t ib_uverbs_modify_srq(struct ib_uverbs_file *file,
 
 	ret = srq->device->modify_srq(srq, &attr, cmd.attr_mask, &udata);
 
-	put_srq_read(srq);
+	if (ret) {
+		uverbs_rollback_object(srq->uobject, UVERBS_IDR_ACCESS_READ);
+		return ret;
+	}
 
-	return ret ? ret : in_len;
+	uverbs_commit_object(srq->uobject, UVERBS_IDR_ACCESS_READ);
+	return in_len;
 }
 
 ssize_t ib_uverbs_query_srq(struct ib_uverbs_file *file,
@@ -4079,10 +3729,12 @@ ssize_t ib_uverbs_query_srq(struct ib_uverbs_file *file,
 
 	ret = ib_query_srq(srq, &attr);
 
-	put_srq_read(srq);
-
-	if (ret)
+	if (ret) {
+		uverbs_rollback_object(srq->uobject, UVERBS_IDR_ACCESS_READ);
 		return ret;
+	}
+
+	uverbs_commit_object(srq->uobject, UVERBS_IDR_ACCESS_READ);
 
 	memset(&resp, 0, sizeof resp);
 
@@ -4114,39 +3766,34 @@ ssize_t ib_uverbs_destroy_srq(struct ib_uverbs_file *file,
 	if (copy_from_user(&cmd, buf, sizeof cmd))
 		return -EFAULT;
 
-	uobj = idr_write_uobj(cmd.srq_handle, file->ucontext);
-	if (!uobj)
-		return -EINVAL;
+	uobj = uverbs_get_type_from_idr(uverbs_type_srq.alloc,
+					file->ucontext,
+					UVERBS_IDR_ACCESS_DESTROY,
+					cmd.srq_handle);
+	if (IS_ERR(uobj))
+		return PTR_ERR(uobj);
+
 	srq = uobj->object;
 	obj = container_of(uobj, struct ib_uevent_object, uobject);
 	srq_type = srq->srq_type;
 
 	ret = ib_destroy_srq(srq);
-	if (!ret)
-		uobj->live = 0;
-
-	put_uobj_write(uobj);
-
-	if (ret)
+	if (ret) {
+		uverbs_rollback_object(uobj, UVERBS_IDR_ACCESS_DESTROY);
 		return ret;
+	}
 
 	if (srq_type == IB_SRQT_XRC) {
 		us = container_of(obj, struct ib_usrq_object, uevent);
 		atomic_dec(&us->uxrcd->refcnt);
 	}
 
-	idr_remove_uobj(uobj);
-
-	mutex_lock(&file->mutex);
-	list_del(&uobj->list);
-	mutex_unlock(&file->mutex);
-
 	ib_uverbs_release_uevent(file, obj);
 
 	memset(&resp, 0, sizeof resp);
 	resp.events_reported = obj->events_reported;
 
-	put_uobj(uobj);
+	uverbs_commit_object(uobj, UVERBS_IDR_ACCESS_DESTROY);
 
 	if (copy_to_user((void __user *) (unsigned long) cmd.response,
 			 &resp, sizeof resp))
diff --git a/drivers/infiniband/core/uverbs_main.c b/drivers/infiniband/core/uverbs_main.c
index b560c88..d3dacdd 100644
--- a/drivers/infiniband/core/uverbs_main.c
+++ b/drivers/infiniband/core/uverbs_main.c
@@ -51,6 +51,7 @@
 #include <rdma/ib.h>
 
 #include "uverbs.h"
+#include "rdma_core.h"
 
 MODULE_AUTHOR("Roland Dreier");
 MODULE_DESCRIPTION("InfiniBand userspace verbs access");
@@ -153,7 +154,7 @@ static struct kobj_type ib_uverbs_dev_ktype = {
 	.release = ib_uverbs_release_dev,
 };
 
-static void ib_uverbs_release_event_file(struct kref *ref)
+static void ib_uverbs_release_async_event_file(struct kref *ref)
 {
 	struct ib_uverbs_event_file *file =
 		container_of(ref, struct ib_uverbs_event_file, ref);
@@ -161,6 +162,14 @@ static void ib_uverbs_release_event_file(struct kref *ref)
 	kfree(file);
 }
 
+static void ib_uverbs_release_event_file(struct kref *ref)
+{
+	struct ib_uverbs_event_file *file =
+		container_of(ref, struct ib_uverbs_event_file, ref);
+
+	ib_uverbs_cleanup_fd(file);
+}
+
 void ib_uverbs_release_ucq(struct ib_uverbs_file *file,
 			  struct ib_uverbs_event_file *ev_file,
 			  struct ib_ucq_object *uobj)
@@ -214,123 +223,9 @@ void ib_uverbs_detach_umcast(struct ib_qp *qp,
 static int ib_uverbs_cleanup_ucontext(struct ib_uverbs_file *file,
 				      struct ib_ucontext *context)
 {
-	struct ib_uobject *uobj, *tmp;
-
 	context->closing = 1;
-
-	list_for_each_entry_safe(uobj, tmp, &context->ah_list, list) {
-		struct ib_ah *ah = uobj->object;
-
-		idr_remove_uobj(uobj);
-		ib_destroy_ah(ah);
-		kfree(uobj);
-	}
-
-	/* Remove MWs before QPs, in order to support type 2A MWs. */
-	list_for_each_entry_safe(uobj, tmp, &context->mw_list, list) {
-		struct ib_mw *mw = uobj->object;
-
-		idr_remove_uobj(uobj);
-		uverbs_dealloc_mw(mw);
-		kfree(uobj);
-	}
-
-	list_for_each_entry_safe(uobj, tmp, &context->rule_list, list) {
-		struct ib_flow *flow_id = uobj->object;
-
-		idr_remove_uobj(uobj);
-		ib_destroy_flow(flow_id);
-		kfree(uobj);
-	}
-
-	list_for_each_entry_safe(uobj, tmp, &context->qp_list, list) {
-		struct ib_qp *qp = uobj->object;
-		struct ib_uqp_object *uqp =
-			container_of(uobj, struct ib_uqp_object, uevent.uobject);
-
-		idr_remove_uobj(uobj);
-		if (qp != qp->real_qp) {
-			ib_close_qp(qp);
-		} else {
-			ib_uverbs_detach_umcast(qp, uqp);
-			ib_destroy_qp(qp);
-		}
-		ib_uverbs_release_uevent(file, &uqp->uevent);
-		kfree(uqp);
-	}
-
-	list_for_each_entry_safe(uobj, tmp, &context->rwq_ind_tbl_list, list) {
-		struct ib_rwq_ind_table *rwq_ind_tbl = uobj->object;
-		struct ib_wq **ind_tbl = rwq_ind_tbl->ind_tbl;
-
-		idr_remove_uobj(uobj);
-		ib_destroy_rwq_ind_table(rwq_ind_tbl);
-		kfree(ind_tbl);
-		kfree(uobj);
-	}
-
-	list_for_each_entry_safe(uobj, tmp, &context->wq_list, list) {
-		struct ib_wq *wq = uobj->object;
-		struct ib_uwq_object *uwq =
-			container_of(uobj, struct ib_uwq_object, uevent.uobject);
-
-		idr_remove_uobj(uobj);
-		ib_destroy_wq(wq);
-		ib_uverbs_release_uevent(file, &uwq->uevent);
-		kfree(uwq);
-	}
-
-	list_for_each_entry_safe(uobj, tmp, &context->srq_list, list) {
-		struct ib_srq *srq = uobj->object;
-		struct ib_uevent_object *uevent =
-			container_of(uobj, struct ib_uevent_object, uobject);
-
-		idr_remove_uobj(uobj);
-		ib_destroy_srq(srq);
-		ib_uverbs_release_uevent(file, uevent);
-		kfree(uevent);
-	}
-
-	list_for_each_entry_safe(uobj, tmp, &context->cq_list, list) {
-		struct ib_cq *cq = uobj->object;
-		struct ib_uverbs_event_file *ev_file = cq->cq_context;
-		struct ib_ucq_object *ucq =
-			container_of(uobj, struct ib_ucq_object, uobject);
-
-		idr_remove_uobj(uobj);
-		ib_destroy_cq(cq);
-		ib_uverbs_release_ucq(file, ev_file, ucq);
-		kfree(ucq);
-	}
-
-	list_for_each_entry_safe(uobj, tmp, &context->mr_list, list) {
-		struct ib_mr *mr = uobj->object;
-
-		idr_remove_uobj(uobj);
-		ib_dereg_mr(mr);
-		kfree(uobj);
-	}
-
-	mutex_lock(&file->device->xrcd_tree_mutex);
-	list_for_each_entry_safe(uobj, tmp, &context->xrcd_list, list) {
-		struct ib_xrcd *xrcd = uobj->object;
-		struct ib_uxrcd_object *uxrcd =
-			container_of(uobj, struct ib_uxrcd_object, uobject);
-
-		idr_remove_uobj(uobj);
-		ib_uverbs_dealloc_xrcd(file->device, xrcd);
-		kfree(uxrcd);
-	}
-	mutex_unlock(&file->device->xrcd_tree_mutex);
-
-	list_for_each_entry_safe(uobj, tmp, &context->pd_list, list) {
-		struct ib_pd *pd = uobj->object;
-
-		idr_remove_uobj(uobj);
-		ib_dealloc_pd(pd);
-		kfree(uobj);
-	}
-
+	ib_uverbs_uobject_type_cleanup_ucontext(context,
+						context->device->specs_root);
 	put_pid(context->tgid);
 
 	return context->device->dealloc_ucontext(context);
@@ -449,7 +344,7 @@ static int ib_uverbs_event_fasync(int fd, struct file *filp, int on)
 	return fasync_helper(fd, filp, on, &file->async_queue);
 }
 
-static int ib_uverbs_event_close(struct inode *inode, struct file *filp)
+static int ib_uverbs_async_event_close(struct inode *inode, struct file *filp)
 {
 	struct ib_uverbs_event_file *file = filp->private_data;
 	struct ib_uverbs_event *entry, *tmp;
@@ -474,6 +369,25 @@ static int ib_uverbs_event_close(struct inode *inode, struct file *filp)
 	mutex_unlock(&file->uverbs_file->device->lists_mutex);
 
 	kref_put(&file->uverbs_file->ref, ib_uverbs_release_file);
+	kref_put(&file->ref, ib_uverbs_release_async_event_file);
+
+	return 0;
+}
+
+static int ib_uverbs_event_close(struct inode *inode, struct file *filp)
+{
+	struct ib_uverbs_event_file *file = filp->private_data;
+	struct ib_uverbs_event *entry, *tmp;
+
+	spin_lock_irq(&file->lock);
+	list_for_each_entry_safe(entry, tmp, &file->event_list, list) {
+		if (entry->counter)
+			list_del(&entry->obj_list);
+		kfree(entry);
+	}
+	spin_unlock_irq(&file->lock);
+
+	ib_uverbs_close_fd(filp);
 	kref_put(&file->ref, ib_uverbs_release_event_file);
 
 	return 0;
@@ -488,6 +402,15 @@ const struct file_operations uverbs_event_fops = {
 	.llseek	 = no_llseek,
 };
 
+static const struct file_operations uverbs_async_event_fops = {
+	.owner	 = THIS_MODULE,
+	.read	 = ib_uverbs_event_read,
+	.poll    = ib_uverbs_event_poll,
+	.release = ib_uverbs_async_event_close,
+	.fasync  = ib_uverbs_event_fasync,
+	.llseek	 = no_llseek,
+};
+
 void ib_uverbs_comp_handler(struct ib_cq *cq, void *cq_context)
 {
 	struct ib_uverbs_event_file    *file = cq_context;
@@ -572,7 +495,8 @@ void ib_uverbs_qp_event_handler(struct ib_event *event, void *context_ptr)
 	struct ib_uevent_object *uobj;
 
 	/* for XRC target qp's, check that qp is live */
-	if (!event->element.qp->uobject || !event->element.qp->uobject->live)
+	if (!event->element.qp->uobject ||
+	    !uverbs_is_live(event->element.qp->uobject))
 		return;
 
 	uobj = container_of(event->element.qp->uobject,
@@ -617,13 +541,12 @@ void ib_uverbs_event_handler(struct ib_event_handler *handler,
 
 void ib_uverbs_free_async_event_file(struct ib_uverbs_file *file)
 {
-	kref_put(&file->async_file->ref, ib_uverbs_release_event_file);
+	kref_put(&file->async_file->ref, ib_uverbs_release_async_event_file);
 	file->async_file = NULL;
 }
 
-struct file *ib_uverbs_alloc_event_file(struct ib_uverbs_file *uverbs_file,
-					struct ib_device	*ib_dev,
-					int is_async)
+struct file *ib_uverbs_alloc_async_event_file(struct ib_uverbs_file *uverbs_file,
+					      struct ib_device	*ib_dev)
 {
 	struct ib_uverbs_event_file *ev_file;
 	struct file *filp;
@@ -642,7 +565,7 @@ struct file *ib_uverbs_alloc_event_file(struct ib_uverbs_file *uverbs_file,
 	ev_file->async_queue = NULL;
 	ev_file->is_closed   = 0;
 
-	filp = anon_inode_getfile("[infinibandevent]", &uverbs_event_fops,
+	filp = anon_inode_getfile("[infinibandevent]", &uverbs_async_event_fops,
 				  ev_file, O_RDONLY);
 	if (IS_ERR(filp))
 		goto err_put_refs;
@@ -652,26 +575,25 @@ struct file *ib_uverbs_alloc_event_file(struct ib_uverbs_file *uverbs_file,
 		      &uverbs_file->device->uverbs_events_file_list);
 	mutex_unlock(&uverbs_file->device->lists_mutex);
 
-	if (is_async) {
-		WARN_ON(uverbs_file->async_file);
-		uverbs_file->async_file = ev_file;
-		kref_get(&uverbs_file->async_file->ref);
-		INIT_IB_EVENT_HANDLER(&uverbs_file->event_handler,
-				      ib_dev,
-				      ib_uverbs_event_handler);
-		ret = ib_register_event_handler(&uverbs_file->event_handler);
-		if (ret)
-			goto err_put_file;
-
-		/* At that point async file stuff was fully set */
-		ev_file->is_async = 1;
-	}
+	WARN_ON(uverbs_file->async_file);
+	uverbs_file->async_file = ev_file;
+	kref_get(&uverbs_file->async_file->ref);
+	INIT_IB_EVENT_HANDLER(&uverbs_file->event_handler,
+			      ib_dev,
+			      ib_uverbs_event_handler);
+	ret = ib_register_event_handler(&uverbs_file->event_handler);
+	if (ret)
+		goto err_put_file;
+
+	/* At that point async file stuff was fully set */
+	ev_file->is_async = 1;
 
 	return filp;
 
 err_put_file:
 	fput(filp);
-	kref_put(&uverbs_file->async_file->ref, ib_uverbs_release_event_file);
+	kref_put(&uverbs_file->async_file->ref,
+		 ib_uverbs_release_async_event_file);
 	uverbs_file->async_file = NULL;
 	return ERR_PTR(ret);
 
@@ -681,35 +603,6 @@ err_put_refs:
 	return filp;
 }
 
-/*
- * Look up a completion event file by FD.  If lookup is successful,
- * takes a ref to the event file struct that it returns; if
- * unsuccessful, returns NULL.
- */
-struct ib_uverbs_event_file *ib_uverbs_lookup_comp_file(int fd)
-{
-	struct ib_uverbs_event_file *ev_file = NULL;
-	struct fd f = fdget(fd);
-
-	if (!f.file)
-		return NULL;
-
-	if (f.file->f_op != &uverbs_event_fops)
-		goto out;
-
-	ev_file = f.file->private_data;
-	if (ev_file->is_async) {
-		ev_file = NULL;
-		goto out;
-	}
-
-	kref_get(&ev_file->ref);
-
-out:
-	fdput(f);
-	return ev_file;
-}
-
 static int verify_command_mask(struct ib_device *ib_dev, __u32 command)
 {
 	u64 mask;
@@ -998,7 +891,8 @@ static int ib_uverbs_close(struct inode *inode, struct file *filp)
 	mutex_unlock(&file->device->lists_mutex);
 
 	if (file->async_file)
-		kref_put(&file->async_file->ref, ib_uverbs_release_event_file);
+		kref_put(&file->async_file->ref,
+			 ib_uverbs_release_async_event_file);
 
 	kref_put(&file->ref, ib_uverbs_release_file);
 	kobject_put(&dev->kobj);
diff --git a/drivers/infiniband/hw/mlx5/main.c b/drivers/infiniband/hw/mlx5/main.c
index f4160d5..a10b203 100644
--- a/drivers/infiniband/hw/mlx5/main.c
+++ b/drivers/infiniband/hw/mlx5/main.c
@@ -55,6 +55,7 @@
 #include <linux/etherdevice.h>
 #include <linux/mlx5/fs.h>
 #include "mlx5_ib.h"
+#include <rdma/uverbs_ioctl_cmd.h>
 
 #define DRIVER_NAME "mlx5_ib"
 #define DRIVER_VERSION "2.2-1"
@@ -2918,6 +2919,8 @@ free:
 	return ARRAY_SIZE(names);
 }
 
+DECLARE_UVERBS_TYPES_GROUP(root, &uverbs_common_types);
+
 static void *mlx5_ib_add(struct mlx5_core_dev *mdev)
 {
 	struct mlx5_ib_dev *dev;
@@ -3128,6 +3131,7 @@ static void *mlx5_ib_add(struct mlx5_core_dev *mdev)
 	if (err)
 		goto err_odp;
 
+	dev->ib_dev.specs_root = (struct uverbs_root *)&root;
 	err = ib_register_device(&dev->ib_dev, NULL);
 	if (err)
 		goto err_q_cnt;
diff --git a/include/rdma/ib_verbs.h b/include/rdma/ib_verbs.h
index 282b0ba..f8eeca4 100644
--- a/include/rdma/ib_verbs.h
+++ b/include/rdma/ib_verbs.h
@@ -1334,17 +1334,6 @@ struct ib_ucontext_lock;
 struct ib_ucontext {
 	struct ib_device       *device;
 	struct ib_uverbs_file  *ufile;
-	struct list_head	pd_list;
-	struct list_head	mr_list;
-	struct list_head	mw_list;
-	struct list_head	cq_list;
-	struct list_head	qp_list;
-	struct list_head	srq_list;
-	struct list_head	ah_list;
-	struct list_head	xrcd_list;
-	struct list_head	rule_list;
-	struct list_head	wq_list;
-	struct list_head	rwq_ind_tbl_list;
 	int			closing;
 
 	/* lock for uobjects list */
@@ -1378,11 +1367,8 @@ struct ib_uobject {
 	void		       *object;		/* containing object */
 	struct list_head	list;		/* link to context's list */
 	int			id;		/* index into kernel idr/fd */
-	struct kref             ref;
 	struct rw_semaphore	usecnt;		/* protects exclusive access */
-	struct rw_semaphore     mutex;          /* protects .live */
 	struct rcu_head		rcu;		/* kfree_rcu() overhead */
-	int			live;
 
 	const struct uverbs_type_alloc_action *type;
 	struct ib_ucontext_lock	*uobjects_lock;
@@ -2116,7 +2102,7 @@ struct ib_device {
 	void (*get_dev_fw_str)(struct ib_device *, char *str, size_t str_len);
 	struct list_head type_list;
 
-	const struct uverbs_types_group	*types_group;
+	struct uverbs_root                      *specs_root;
 };
 
 struct ib_client {
-- 
1.8.3.1

--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


* [RFC ABI V6 07/14] IB/core: Add new ioctl interface
@ 2016-12-11 12:58   ` Matan Barak
From: Matan Barak @ 2016-12-11 12:58 UTC (permalink / raw)
  To: linux-rdma-u79uwXL29TY76Z2rM5mHXA
  Cc: Doug Ledford, Jason Gunthorpe, Sean Hefty, Christoph Lameter,
	Liran Liss, Haggai Eran, Majd Dibbiny, Matan Barak, Tal Alon,
	Leon Romanovsky

In this proposed ioctl interface, processing a command starts with
validating the command's properties and fetching the appropriate user
objects before calling the handler.

Parsing and validation are done according to a specifier declared by
the driver's code. The driver declares all of its supported types.
These types are separated into type groups, each of which may be
declared in a different place (for example, common types and
driver-specific types).

For each type we list all supported actions. Similarly to types,
actions are separated into action groups, each declared separately.
This makes it possible to add actions to an existing type.

Each action specifies a handler, which could be either a standard
command or a driver-specific command.
Along with the handler, a group of attributes is specified as well.
This group lists all supported attributes and is used for automatic
fetching and validation of the command, its response and the related
objects.

When a group of elements is used, the high bits of an element's id
are used to calculate the group index. These high bits are then
masked out, leaving a zero-based namespace within every group. This
is mandatory for a compact representation and O(1) array access.

A group of attributes is actually an array of attributes. Each
attribute has a type (PTR_IN, PTR_OUT, IDR, FD or FLAG) and a length.
Attributes can be validated through several properties, such as:
(*) Minimum size / exact size
(*) File ops for FD
(*) Object type for IDR
(*) Allowed mask for FLAG

If an IDR/FD attribute is specified, the kernel also states the object
type and the required access (NEW, WRITE, READ or DESTROY).
All uobject/FD management is done automatically by the infrastructure,
meaning that the infrastructure fails concurrent commands when at
least one of them requires exclusive access (WRITE/DESTROY),
synchronizes actions with device removal (dissociate context events)
and takes care of reference counting (increase/decrease) for
concurrent action invocations. The reference counts on the actual
kernel objects are still handled by the handlers.

 types
+--------+
|        |
|        |   actions                                                                +--------+
|        |   group      action      action_spec                           +-----+   |len     |
+--------+  +------+[d]+-------+   +----------------+[d]+------------+    |attr1+-> |type    |
| type   +> |action+-> | spec  +-> +  attr_groups   +-> |common_chain+--> +-----+   |idr_type|
+--------+  +------+   |handler|   |                |   +------------+    |attr2|   |access  |
|        |  |      |   +-------+   +----------------+   |vendor chain|    +-----+   +--------+
|        |  |      |                                    +------------+
|        |  +------+
|        |
|        |
|        |
|        |
|        |
|        |
|        |
|        |
|        |
|        |
+--------+

[d] = distribute ids to groups using the high order bits

The right types table is likewise chosen by the high bits of the id,
indexing into uverbs_types_groups.

Once validation and object fetching (or creation) have completed, the
handler is called:
int (*handler)(struct ib_device *ib_dev, struct ib_ucontext *ucontext,
               struct uverbs_attr_array *ctx, size_t num);

Here ctx is an array of uverbs_attr_array. Each element in this array
is an array of attributes which corresponds to one group of
attributes. For example, in the common case:

 ctx                               core
+----------------------------+     +------------+
| core: uverbs_attr_array    +---> | valid      |
+----------------------------+     | cmd_attr   |
| driver: uverbs_attr_array  |     +------------+
|----------------------------+--+  | valid      |
                                |  | cmd_attr   |
                                |  +------------+
                                |  | valid      |
                                |  | obj_attr   |
                                |  +------------+
                                |
                                |  vendor
                                |  +------------+
                                +> | valid      |
                                   | cmd_attr   |
                                   +------------+
                                   | valid      |
                                   | cmd_attr   |
                                   +------------+
                                   | valid      |
                                   | obj_attr   |
                                   +------------+

The ctx array's indices correspond to the order of the attribute
groups. The indices within core and driver correspond to the
attribute namespaces of each group. Thus, we could think of the
following as one object:
1. A set of attribute specifications (with their attribute IDs)
2. The attribute group which owns the specifications in (1)
3. A function which can handle these attributes and which the
   handler could call
4. The allocation descriptor of this type, uverbs_type_alloc_action.

Upon successful handler invocation, the reference counts and use
counts of uobjects are updated automatically according to the
specification.

Signed-off-by: Matan Barak <matanb-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
Signed-off-by: Haggai Eran <haggaie-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
Signed-off-by: Leon Romanovsky <leonro-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
---
 drivers/infiniband/core/Makefile       |   2 +-
 drivers/infiniband/core/device.c       |   3 +
 drivers/infiniband/core/rdma_core.c    |  59 +++++-
 drivers/infiniband/core/rdma_core.h    |   5 +
 drivers/infiniband/core/uverbs.h       |   3 +
 drivers/infiniband/core/uverbs_ioctl.c | 353 +++++++++++++++++++++++++++++++++
 drivers/infiniband/core/uverbs_main.c  |   3 +
 include/rdma/ib_verbs.h                |   3 +-
 include/uapi/rdma/rdma_user_ioctl.h    |  28 +++
 9 files changed, 449 insertions(+), 10 deletions(-)
 create mode 100644 drivers/infiniband/core/uverbs_ioctl.c

diff --git a/drivers/infiniband/core/Makefile b/drivers/infiniband/core/Makefile
index 7676592..121373c 100644
--- a/drivers/infiniband/core/Makefile
+++ b/drivers/infiniband/core/Makefile
@@ -29,4 +29,4 @@ ib_umad-y :=			user_mad.o
 ib_ucm-y :=			ucm.o
 
 ib_uverbs-y :=			uverbs_main.o uverbs_cmd.o uverbs_marshall.o \
-				rdma_core.o uverbs_ioctl_cmd.o
+				rdma_core.o uverbs_ioctl_cmd.o uverbs_ioctl.o
diff --git a/drivers/infiniband/core/device.c b/drivers/infiniband/core/device.c
index 43994b1..c0c6365 100644
--- a/drivers/infiniband/core/device.c
+++ b/drivers/infiniband/core/device.c
@@ -245,6 +245,9 @@ struct ib_device *ib_alloc_device(size_t size)
 	INIT_LIST_HEAD(&device->port_list);
 	INIT_LIST_HEAD(&device->type_list);
 
+	/* TODO: don't forget to initialize device->driver_id, so the verbs
+	 * handshake with user space works for values other than driver_id == 0.
+	 */
 	return device;
 }
 EXPORT_SYMBOL(ib_alloc_device);
diff --git a/drivers/infiniband/core/rdma_core.c b/drivers/infiniband/core/rdma_core.c
index 01221c0..1e08ebf 100644
--- a/drivers/infiniband/core/rdma_core.c
+++ b/drivers/infiniband/core/rdma_core.c
@@ -37,6 +37,51 @@
 #include "rdma_core.h"
 #include <rdma/uverbs_ioctl.h>
 
+int uverbs_group_idx(u16 *id, unsigned int ngroups)
+{
+	int ret = (*id & UVERBS_ID_RESERVED_MASK) >> UVERBS_ID_RESERVED_SHIFT;
+
+	if (ret >= ngroups)
+		return -EINVAL;
+
+	*id &= ~UVERBS_ID_RESERVED_MASK;
+	return ret;
+}
+
+const struct uverbs_type *uverbs_get_type(const struct ib_device *ibdev,
+					  uint16_t type)
+{
+	const struct uverbs_root *groups = ibdev->specs_root;
+	const struct uverbs_type_group *types;
+	int ret = uverbs_group_idx(&type, groups->num_groups);
+
+	if (ret < 0)
+		return NULL;
+
+	types = groups->type_groups[ret];
+
+	if (type >= types->num_types)
+		return NULL;
+
+	return types->types[type];
+}
+
+const struct uverbs_action *uverbs_get_action(const struct uverbs_type *type,
+					      uint16_t action)
+{
+	const struct uverbs_action_group *action_group;
+	int ret = uverbs_group_idx(&action, type->num_groups);
+
+	if (ret < 0)
+		return NULL;
+
+	action_group = type->action_groups[ret];
+	if (action >= action_group->num_actions)
+		return NULL;
+
+	return action_group->actions[action];
+}
+
 static int uverbs_lock_object(struct ib_uobject *uobj,
 			      enum uverbs_idr_access access)
 {
@@ -357,7 +402,6 @@ static void ib_uverbs_remove_fd(struct ib_uobject *uobject)
 	 */
 	if (uobject->context) {
 		list_del(&uobject->list);
-		uobject->type->free_fn(uobject->type, uobject);
 		kref_put(&uobject->context->ufile->ref, ib_uverbs_release_file);
 		uobject->context = NULL;
 	}
@@ -368,11 +412,7 @@ void ib_uverbs_close_fd(struct file *f)
 	struct ib_uobject *uobject = f->private_data - sizeof(struct ib_uobject);
 
 	mutex_lock(&uobject->uobjects_lock->lock);
-	if (uobject->context) {
-		list_del(&uobject->list);
-		kref_put(&uobject->context->ufile->ref, ib_uverbs_release_file);
-		uobject->context = NULL;
-	}
+	ib_uverbs_remove_fd(uobject);
 	mutex_unlock(&uobject->uobjects_lock->lock);
 	kref_put(&uobject->uobjects_lock->ref, release_uobjects_list_lock);
 }
@@ -454,10 +494,13 @@ void ib_uverbs_uobject_type_cleanup_ucontext(struct ib_ucontext *ucontext,
 		list_for_each_entry_safe(obj, next_obj, &ucontext->uobjects,
 					 list)
 			if (obj->type->order == i) {
-				if (obj->type->type == UVERBS_ATTR_TYPE_IDR)
+				if (obj->type->type == UVERBS_ATTR_TYPE_IDR) {
+					obj->type->free_fn(obj->type, obj);
 					ib_uverbs_uobject_remove(obj, false);
-				else
+				} else {
+					obj->type->free_fn(obj->type, obj);
 					ib_uverbs_remove_fd(obj);
+				}
 			}
 		mutex_unlock(&ucontext->uobjects_lock->lock);
 	}
diff --git a/drivers/infiniband/core/rdma_core.h b/drivers/infiniband/core/rdma_core.h
index 9b91c1c..78a5339 100644
--- a/drivers/infiniband/core/rdma_core.h
+++ b/drivers/infiniband/core/rdma_core.h
@@ -42,6 +42,11 @@
 #include <rdma/ib_verbs.h>
 #include <linux/mutex.h>
 
+int uverbs_group_idx(u16 *id, unsigned int ngroups);
+const struct uverbs_type *uverbs_get_type(const struct ib_device *ibdev,
+					  uint16_t type);
+const struct uverbs_action *uverbs_get_action(const struct uverbs_type *type,
+					      uint16_t action);
 struct ib_uobject *uverbs_get_type_from_idr(const struct uverbs_type_alloc_action *type,
 					    struct ib_ucontext *ucontext,
 					    enum uverbs_idr_access access,
diff --git a/drivers/infiniband/core/uverbs.h b/drivers/infiniband/core/uverbs.h
index 64f8658..d3ad81c 100644
--- a/drivers/infiniband/core/uverbs.h
+++ b/drivers/infiniband/core/uverbs.h
@@ -41,6 +41,7 @@
 #include <linux/mutex.h>
 #include <linux/completion.h>
 #include <linux/cdev.h>
+#include <linux/rwsem.h>
 
 #include <rdma/ib_verbs.h>
 #include <rdma/ib_umem.h>
@@ -83,6 +84,8 @@
  * released when the CQ is destroyed.
  */
 
+long ib_uverbs_ioctl(struct file *filp, unsigned int cmd, unsigned long arg);
+
 struct ib_uverbs_device {
 	atomic_t				refcount;
 	int					num_comp_vectors;
diff --git a/drivers/infiniband/core/uverbs_ioctl.c b/drivers/infiniband/core/uverbs_ioctl.c
new file mode 100644
index 0000000..406b735
--- /dev/null
+++ b/drivers/infiniband/core/uverbs_ioctl.c
@@ -0,0 +1,353 @@
+/*
+ * Copyright (c) 2016, Mellanox Technologies inc.  All rights reserved.
+ *
+ * This software is available to you under a choice of one of two
+ * licenses.  You may choose to be licensed under the terms of the GNU
+ * General Public License (GPL) Version 2, available from the file
+ * COPYING in the main directory of this source tree, or the
+ * OpenIB.org BSD license below:
+ *
+ *     Redistribution and use in source and binary forms, with or
+ *     without modification, are permitted provided that the following
+ *     conditions are met:
+ *
+ *      - Redistributions of source code must retain the above
+ *        copyright notice, this list of conditions and the following
+ *        disclaimer.
+ *
+ *      - Redistributions in binary form must reproduce the above
+ *        copyright notice, this list of conditions and the following
+ *        disclaimer in the documentation and/or other materials
+ *        provided with the distribution.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
+ * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+ * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
+ * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
+ * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
+ * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
+ * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+ * SOFTWARE.
+ */
+
+#include <rdma/rdma_user_ioctl.h>
+#include <rdma/uverbs_ioctl.h>
+#include "rdma_core.h"
+#include "uverbs.h"
+
+static int uverbs_validate_attr(struct ib_device *ibdev,
+				struct ib_ucontext *ucontext,
+				const struct ib_uverbs_attr *uattr,
+				u16 attr_id,
+				const struct uverbs_attr_spec_group *attr_spec_group,
+				struct uverbs_attr_array *attr_array,
+				struct ib_uverbs_attr __user *uattr_ptr)
+{
+	const struct uverbs_attr_spec *spec;
+	struct uverbs_attr *e;
+	const struct uverbs_type *type;
+	struct uverbs_obj_attr *o_attr;
+	struct uverbs_attr *elements = attr_array->attrs;
+
+	if (uattr->reserved)
+		return -EINVAL;
+
+	if (attr_id >= attr_spec_group->num_attrs) {
+		if (uattr->flags & UVERBS_ATTR_F_MANDATORY)
+			return -EINVAL;
+		else
+			return 0;
+	}
+
+	spec = &attr_spec_group->attrs[attr_id];
+	e = &elements[attr_id];
+
+	switch (spec->type) {
+	case UVERBS_ATTR_TYPE_PTR_IN:
+	case UVERBS_ATTR_TYPE_PTR_OUT:
+		if (uattr->len < spec->len ||
+		    (!(spec->flags & UVERBS_ATTR_SPEC_F_MIN_SZ) &&
+		     uattr->len > spec->len))
+			return -EINVAL;
+
+		e->cmd_attr.ptr = (void __user *)uattr->ptr_idr;
+		e->cmd_attr.len = uattr->len;
+		break;
+
+	case UVERBS_ATTR_TYPE_FLAG:
+		e->flag_attr.flags = uattr->ptr_idr;
+		if (uattr->flags & UVERBS_ATTR_F_MANDATORY &&
+		    e->flag_attr.flags & ~spec->flag.mask)
+			return -EINVAL;
+		break;
+
+	case UVERBS_ATTR_TYPE_IDR:
+	case UVERBS_ATTR_TYPE_FD:
+		if (uattr->len != 0 || (uattr->ptr_idr >> 32) || (!ucontext))
+			return -EINVAL;
+
+		o_attr = &e->obj_attr;
+		type = uverbs_get_type(ibdev, spec->obj.obj_type);
+		if (!type)
+			return -EINVAL;
+		o_attr->type = type->alloc;
+		o_attr->uattr = uattr_ptr;
+
+		if (spec->type == UVERBS_ATTR_TYPE_IDR) {
+			o_attr->uobj.idr = (uint32_t)uattr->ptr_idr;
+			o_attr->uobject = uverbs_get_type_from_idr(o_attr->type,
+								   ucontext,
+								   spec->obj.access,
+								   o_attr->uobj.idr);
+		} else {
+			o_attr->fd.fd = (int)uattr->ptr_idr;
+			o_attr->uobject = uverbs_get_type_from_fd(o_attr->type,
+								  ucontext,
+								  spec->obj.access,
+								  o_attr->fd.fd);
+		}
+
+		if (IS_ERR(o_attr->uobject))
+			return -EINVAL;
+
+		if (spec->obj.access == UVERBS_IDR_ACCESS_NEW) {
+			u64 idr = o_attr->uobject->id;
+
+			if (put_user(idr, &o_attr->uattr->ptr_idr)) {
+				uverbs_rollback_object(o_attr->uobject,
+						       UVERBS_IDR_ACCESS_NEW);
+				return -EFAULT;
+			}
+		}
+
+		break;
+	default:
+		return -EOPNOTSUPP;
+	}
+
+	set_bit(attr_id, attr_array->valid_bitmap);
+	return 0;
+}
+
+static int uverbs_validate(struct ib_device *ibdev,
+			   struct ib_ucontext *ucontext,
+			   const struct ib_uverbs_attr *uattrs,
+			   size_t num_attrs,
+			   const struct uverbs_action *action,
+			   struct uverbs_attr_array *attr_array,
+			   struct ib_uverbs_attr __user *uattr_ptr)
+{
+	size_t i;
+	int ret = 0;
+	int n_val = 0;
+
+	for (i = 0; i < num_attrs; i++) {
+		const struct ib_uverbs_attr *uattr = &uattrs[i];
+		u16 attr_id = uattr->attr_id;
+		const struct uverbs_attr_spec_group *attr_spec_group;
+
+		ret = uverbs_group_idx(&attr_id, action->num_groups);
+		if (ret < 0) {
+			if (uattr->flags & UVERBS_ATTR_F_MANDATORY)
+				return ret;
+
+			ret = 0;
+			continue;
+		}
+
+		if (ret >= n_val)
+			n_val = ret + 1;
+
+		attr_spec_group = action->attr_groups[ret];
+		ret = uverbs_validate_attr(ibdev, ucontext, uattr, attr_id,
+					   attr_spec_group, &attr_array[ret],
+					   uattr_ptr++);
+		if (ret) {
+			uverbs_commit_objects(attr_array, n_val,
+					      action, false);
+			return ret;
+		}
+	}
+
+	return ret ? ret : n_val;
+}
+
+static int uverbs_handle_action(struct ib_uverbs_attr __user *uattr_ptr,
+				const struct ib_uverbs_attr *uattrs,
+				size_t num_attrs,
+				struct ib_device *ibdev,
+				struct ib_uverbs_file *ufile,
+				const struct uverbs_action *handler,
+				struct uverbs_attr_array *attr_array)
+{
+	int ret;
+	int n_val;
+	unsigned int i;
+
+	n_val = uverbs_validate(ibdev, ufile->ucontext, uattrs, num_attrs,
+				handler, attr_array, uattr_ptr);
+	if (n_val <= 0)
+		return n_val;
+
+	for (i = 0; i < n_val; i++) {
+		const struct uverbs_attr_spec_group *attr_spec_group =
+			handler->attr_groups[i];
+
+		if (!bitmap_subset(attr_spec_group->mandatory_attrs_bitmask,
+				   attr_array[i].valid_bitmap,
+				   attr_spec_group->num_attrs)) {
+			ret = -EINVAL;
+			goto cleanup;
+		}
+	}
+
+	ret = handler->handler(ibdev, ufile, attr_array, n_val);
+cleanup:
+	uverbs_commit_objects(attr_array, n_val, handler, !ret);
+
+	return ret;
+}
+
+#define UVERBS_OPTIMIZE_USING_STACK
+#ifdef UVERBS_OPTIMIZE_USING_STACK
+#define UVERBS_MAX_STACK_USAGE		512
+#endif
+static long ib_uverbs_cmd_verbs(struct ib_device *ib_dev,
+				struct ib_uverbs_file *file,
+				struct ib_uverbs_ioctl_hdr *hdr,
+				void __user *buf)
+{
+	const struct uverbs_type *type;
+	const struct uverbs_action *action;
+	long err = 0;
+	unsigned int i;
+	struct {
+		struct ib_uverbs_attr		*uattrs;
+		struct uverbs_attr_array	*uverbs_attr_array;
+	} *ctx = NULL;
+	struct uverbs_attr *curr_attr;
+	unsigned long *curr_bitmap;
+	size_t ctx_size;
+#ifdef UVERBS_OPTIMIZE_USING_STACK
+	uintptr_t data[UVERBS_MAX_STACK_USAGE / sizeof(uintptr_t)];
+#endif
+
+	if (ib_dev->driver_id != hdr->driver_id)
+		return -EINVAL;
+
+	type = uverbs_get_type(ib_dev, hdr->object_type);
+	if (!type)
+		return -EOPNOTSUPP;
+
+	action = uverbs_get_action(type, hdr->action);
+	if (!action)
+		return -EOPNOTSUPP;
+
+	if ((action->flags & UVERBS_ACTION_FLAG_CREATE_ROOT) ^ !file->ucontext)
+		return -EINVAL;
+
+	ctx_size = sizeof(*ctx->uattrs) * hdr->num_attrs +
+		   sizeof(*ctx->uverbs_attr_array->attrs) * action->num_child_attrs +
+		   sizeof(struct uverbs_attr_array) * action->num_groups +
+		   sizeof(*ctx->uverbs_attr_array->valid_bitmap) *
+			(action->num_child_attrs / BITS_PER_LONG +
+			 action->num_groups) +
+		   sizeof(*ctx);
+
+#ifdef UVERBS_OPTIMIZE_USING_STACK
+	if (ctx_size <= UVERBS_MAX_STACK_USAGE) {
+		memset(data, 0, ctx_size);
+		ctx = (void *)data;
+	}
+	if (!ctx)
+#endif
+	ctx = kzalloc(ctx_size, GFP_KERNEL);
+	if (!ctx)
+		return -ENOMEM;
+
+	ctx->uverbs_attr_array = (void *)ctx + sizeof(*ctx);
+	ctx->uattrs = (void *)(ctx->uverbs_attr_array +
+			       action->num_groups);
+	curr_attr = (void *)(ctx->uattrs + hdr->num_attrs);
+	curr_bitmap = (void *)(curr_attr + action->num_child_attrs);
+
+	for (i = 0; i < action->num_groups; i++) {
+		unsigned int curr_num_attrs = action->attr_groups[i]->num_attrs;
+
+		ctx->uverbs_attr_array[i].attrs = curr_attr;
+		curr_attr += curr_num_attrs;
+		ctx->uverbs_attr_array[i].num_attrs = curr_num_attrs;
+		ctx->uverbs_attr_array[i].valid_bitmap = curr_bitmap;
+		curr_bitmap += BITS_TO_LONGS(curr_num_attrs);
+	}
+
+	err = copy_from_user(ctx->uattrs, buf,
+			     sizeof(*ctx->uattrs) * hdr->num_attrs);
+	if (err) {
+		err = -EFAULT;
+		goto out;
+	}
+
+	err = uverbs_handle_action(buf, ctx->uattrs, hdr->num_attrs, ib_dev,
+				   file, action, ctx->uverbs_attr_array);
+out:
+#ifdef UVERBS_OPTIMIZE_USING_STACK
+	if (ctx_size > UVERBS_MAX_STACK_USAGE)
+#endif
+	kfree(ctx);
+	return err;
+}
+
+#define IB_UVERBS_MAX_CMD_SZ 4096
+
+long ib_uverbs_ioctl(struct file *filp, unsigned int cmd, unsigned long arg)
+{
+	struct ib_uverbs_file *file = filp->private_data;
+	struct ib_uverbs_ioctl_hdr __user *user_hdr =
+		(struct ib_uverbs_ioctl_hdr __user *)arg;
+	struct ib_uverbs_ioctl_hdr hdr;
+	struct ib_device *ib_dev;
+	int srcu_key;
+	long err;
+
+	srcu_key = srcu_read_lock(&file->device->disassociate_srcu);
+	ib_dev = srcu_dereference(file->device->ib_dev,
+				  &file->device->disassociate_srcu);
+	if (!ib_dev) {
+		err = -EIO;
+		goto out;
+	}
+
+	if (cmd == RDMA_DIRECT_IOCTL) {
+		/* TODO? */
+		err = -ENOSYS;
+		goto out;
+	} else {
+		if (cmd != RDMA_VERBS_IOCTL) {
+			err = -ENOIOCTLCMD;
+			goto out;
+		}
+
+		err = copy_from_user(&hdr, user_hdr, sizeof(hdr));
+
+		if (err || hdr.length > IB_UVERBS_MAX_CMD_SZ ||
+		    hdr.length <= sizeof(hdr) ||
+		    hdr.length != sizeof(hdr) + hdr.num_attrs * sizeof(struct ib_uverbs_attr)) {
+			err = -EINVAL;
+			goto out;
+		}
+
+		/* currently there are no flags supported */
+		if (hdr.flags) {
+			err = -EOPNOTSUPP;
+			goto out;
+		}
+
+		err = ib_uverbs_cmd_verbs(ib_dev, file, &hdr,
+					  (void __user *)arg + sizeof(hdr));
+	}
+out:
+	srcu_read_unlock(&file->device->disassociate_srcu, srcu_key);
+
+	return err;
+}
diff --git a/drivers/infiniband/core/uverbs_main.c b/drivers/infiniband/core/uverbs_main.c
index d3dacdd..273b48f 100644
--- a/drivers/infiniband/core/uverbs_main.c
+++ b/drivers/infiniband/core/uverbs_main.c
@@ -49,6 +49,7 @@
 #include <asm/uaccess.h>
 
 #include <rdma/ib.h>
+#include <rdma/rdma_user_ioctl.h>
 
 #include "uverbs.h"
 #include "rdma_core.h"
@@ -906,6 +907,7 @@ static const struct file_operations uverbs_fops = {
 	.open	 = ib_uverbs_open,
 	.release = ib_uverbs_close,
 	.llseek	 = no_llseek,
+	.unlocked_ioctl = ib_uverbs_ioctl,
 };
 
 static const struct file_operations uverbs_mmap_fops = {
@@ -915,6 +917,7 @@ static const struct file_operations uverbs_mmap_fops = {
 	.open	 = ib_uverbs_open,
 	.release = ib_uverbs_close,
 	.llseek	 = no_llseek,
+	.unlocked_ioctl = ib_uverbs_ioctl,
 };
 
 static struct ib_client uverbs_client = {
diff --git a/include/rdma/ib_verbs.h b/include/rdma/ib_verbs.h
index f8eeca4..d751404 100644
--- a/include/rdma/ib_verbs.h
+++ b/include/rdma/ib_verbs.h
@@ -2102,7 +2102,8 @@ struct ib_device {
 	void (*get_dev_fw_str)(struct ib_device *, char *str, size_t str_len);
 	struct list_head type_list;
 
-	struct uverbs_root                      *specs_root;
+	u16					driver_id;
+	struct uverbs_root			*specs_root;
 };
 
 struct ib_client {
diff --git a/include/uapi/rdma/rdma_user_ioctl.h b/include/uapi/rdma/rdma_user_ioctl.h
index 9388125..3e2f59a 100644
--- a/include/uapi/rdma/rdma_user_ioctl.h
+++ b/include/uapi/rdma/rdma_user_ioctl.h
@@ -43,6 +43,34 @@
 /* Legacy name, for user space application which already use it */
 #define IB_IOCTL_MAGIC		RDMA_IOCTL_MAGIC
 
+#define RDMA_VERBS_IOCTL \
+	_IOWR(RDMA_IOCTL_MAGIC, 1, struct ib_uverbs_ioctl_hdr)
+
+#define RDMA_DIRECT_IOCTL \
+	_IOWR(RDMA_IOCTL_MAGIC, 2, struct ib_uverbs_ioctl_hdr)
+
+enum ib_uverbs_attr_flags {
+	UVERBS_ATTR_F_MANDATORY = 1U << 0,
+};
+
+struct ib_uverbs_attr {
+	__u16 attr_id;		/* command specific type attribute */
+	__u16 len;		/* NA for idr */
+	__u16 flags;		/* combination of uverbs_attr_flags */
+	__u16 reserved;
+	__u64 ptr_idr;		/* ptr to command/idr handle */
+};
+
+struct ib_uverbs_ioctl_hdr {
+	__u16 length;
+	__u16 flags;
+	__u16 object_type;
+	__u16 driver_id;
+	__u16 action;
+	__u16 num_attrs;
+	struct ib_uverbs_attr  attrs[0];
+};
+
 /*
  * General blocks assignments
  * It is closed on purpose do not expose it to user space
-- 
1.8.3.1



* [RFC ABI V6 08/14] IB/core: Add macros for declaring actions and attributes
@ 2016-12-11 12:58   ` Matan Barak
From: Matan Barak @ 2016-12-11 12:58 UTC (permalink / raw)
  To: linux-rdma-u79uwXL29TY76Z2rM5mHXA
  Cc: Doug Ledford, Jason Gunthorpe, Sean Hefty, Christoph Lameter,
	Liran Liss, Haggai Eran, Majd Dibbiny, Matan Barak, Tal Alon,
	Leon Romanovsky

This patch adds macros for declaring action groups, actions,
attribute groups and attributes. These definitions are later
used by downstream patches to declare some of the common types.

In addition, we add some inline helper functions to copy to/from
user-space buffers and to check whether an attribute is valid.

Signed-off-by: Matan Barak <matanb-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
---
 include/rdma/uverbs_ioctl.h | 103 ++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 103 insertions(+)

diff --git a/include/rdma/uverbs_ioctl.h b/include/rdma/uverbs_ioctl.h
index 54f592a..5340673 100644
--- a/include/rdma/uverbs_ioctl.h
+++ b/include/rdma/uverbs_ioctl.h
@@ -34,6 +34,8 @@
 #define _UVERBS_IOCTL_
 
 #include <linux/kernel.h>
+#include <linux/uaccess.h>
+#include <rdma/rdma_user_ioctl.h>
 
 struct uverbs_object_type;
 struct ib_ucontext;
@@ -147,6 +149,77 @@ struct uverbs_root {
 	size_t					num_groups;
 };
 
+#define UA_FLAGS(_flags)  .flags = _flags
+#define UVERBS_ATTR(_id, _len, _type, ...)				\
+	[_id] = {.len = _len, .type = _type, ##__VA_ARGS__}
+#define UVERBS_ATTR_PTR_IN_SZ(_id, _len, ...)				\
+	UVERBS_ATTR(_id, _len, UVERBS_ATTR_TYPE_PTR_IN, ##__VA_ARGS__)
+#define UVERBS_ATTR_PTR_IN(_id, _type, ...)				\
+	UVERBS_ATTR_PTR_IN_SZ(_id, sizeof(_type), ##__VA_ARGS__)
+#define UVERBS_ATTR_PTR_OUT_SZ(_id, _len, ...)				\
+	UVERBS_ATTR(_id, _len, UVERBS_ATTR_TYPE_PTR_OUT, ##__VA_ARGS__)
+#define UVERBS_ATTR_PTR_OUT(_id, _type, ...)				\
+	UVERBS_ATTR_PTR_OUT_SZ(_id, sizeof(_type), ##__VA_ARGS__)
+#define UVERBS_ATTR_IDR(_id, _idr_type, _access, ...)			\
+	[_id] = {.type = UVERBS_ATTR_TYPE_IDR,				\
+		 .obj = {.obj_type = _idr_type,				\
+			 .access = _access				\
+		 }, ##__VA_ARGS__ }
+#define UVERBS_ATTR_FD(_id, _fd_type, _access, ...)			\
+	[_id] = {.type = UVERBS_ATTR_TYPE_FD,				\
+		 .obj = {.obj_type = _fd_type,				\
+			 .access = _access + BUILD_BUG_ON_ZERO(		\
+				_access != UVERBS_IDR_ACCESS_NEW &&	\
+				_access != UVERBS_IDR_ACCESS_READ)	\
+		 }, ##__VA_ARGS__ }
+#define UVERBS_ATTR_FLAG(_id, _mask, ...)				\
+	[_id] = {.type = UVERBS_ATTR_TYPE_FLAG,				\
+		 .flag = {.mask = _mask}, ##__VA_ARGS__ }
+#define _UVERBS_ATTR_SPEC_SZ(...)					\
+	(sizeof((const struct uverbs_attr_spec[]){__VA_ARGS__}) /	\
+	 sizeof(const struct uverbs_attr_spec))
+#define UVERBS_ATTR_SPEC(...)					\
+	((const struct uverbs_attr_spec_group)				\
+	 {.attrs = (struct uverbs_attr_spec[]){__VA_ARGS__},		\
+	  .num_attrs = _UVERBS_ATTR_SPEC_SZ(__VA_ARGS__)})
+#define DECLARE_UVERBS_ATTR_SPEC(name, ...)			\
+	const struct uverbs_attr_spec_group name =			\
+		UVERBS_ATTR_SPEC(__VA_ARGS__)
+#define _UVERBS_ATTR_ACTION_SPEC_SZ(...)				  \
+	(sizeof((const struct uverbs_attr_spec_group *[]){__VA_ARGS__}) / \
+	 sizeof(const struct uverbs_attr_spec_group *))
+#define _UVERBS_ACTION(_handler, _flags, ...)				\
+	((const struct uverbs_action) {					\
+		.flags = _flags,					\
+		.handler = _handler,					\
+		.num_groups =	_UVERBS_ATTR_ACTION_SPEC_SZ(__VA_ARGS__),	\
+		.attr_groups = (const struct uverbs_attr_spec_group *[]){__VA_ARGS__} })
+#define UVERBS_ACTION(_handler, ...)			\
+	_UVERBS_ACTION(_handler, 0, __VA_ARGS__)
+#define UVERBS_CTX_ACTION(_handler, ...)			\
+	_UVERBS_ACTION(_handler, UVERBS_ACTION_FLAG_CREATE_ROOT, __VA_ARGS__)
+#define _UVERBS_ACTIONS_SZ(...)					\
+	(sizeof((const struct uverbs_action *[]){__VA_ARGS__}) /	\
+	 sizeof(const struct uverbs_action *))
+#define ADD_UVERBS_ACTION(action_idx, _handler,  ...)		\
+	[action_idx] = &UVERBS_ACTION(_handler, __VA_ARGS__)
+#define DECLARE_UVERBS_ACTION(name, _handler, ...)		\
+	const struct uverbs_action name =				\
+		UVERBS_ACTION(_handler, __VA_ARGS__)
+#define ADD_UVERBS_CTX_ACTION(action_idx, _handler,  ...)	\
+	[action_idx] = &UVERBS_CTX_ACTION(_handler, __VA_ARGS__)
+#define DECLARE_UVERBS_CTX_ACTION(name, _handler, ...)	\
+	const struct uverbs_action name =				\
+		UVERBS_CTX_ACTION(_handler, __VA_ARGS__)
+#define ADD_UVERBS_ACTION_PTR(idx, ptr)					\
+	[idx] = ptr
+#define UVERBS_ACTIONS(...)						\
+	((const struct uverbs_action_group)			\
+	  {.num_actions = _UVERBS_ACTIONS_SZ(__VA_ARGS__),		\
+	   .actions = (const struct uverbs_action *[]){__VA_ARGS__} })
+#define DECLARE_UVERBS_ACTIONS(name, ...)				\
+	const struct uverbs_action_group name =				\
+		UVERBS_ACTIONS(__VA_ARGS__)
 #define _UVERBS_ACTIONS_GROUP_SZ(...)					\
 	(sizeof((const struct uverbs_action_group*[]){__VA_ARGS__}) / \
 	 sizeof(const struct uverbs_action_group *))
@@ -253,6 +326,36 @@ static inline bool uverbs_is_valid(const struct uverbs_attr_array *attr_array,
 	return test_bit(idx, attr_array->valid_bitmap);
 }
 
+/* TODO: Add debug version for these macros/inline func */
+static inline int uverbs_copy_to(struct uverbs_attr_array *attr_array,
+				 size_t idx, const void *from)
+{
+	if (!uverbs_is_valid(attr_array, idx))
+		return -ENOENT;
+
+	return copy_to_user(attr_array->attrs[idx].cmd_attr.ptr, from,
+			    attr_array->attrs[idx].cmd_attr.len) ? -EFAULT : 0;
+}
+
+#define uverbs_copy_from(to, attr_array, idx)				\
+	(uverbs_is_valid((attr_array), idx) ?				\
+	 (sizeof(*to) <= sizeof(((struct ib_uverbs_attr *)0)->ptr_idr) ?\
+	  (memcpy(to, &(attr_array)->attrs[idx].cmd_attr.ptr,		\
+		 (attr_array)->attrs[idx].cmd_attr.len), 0) :		\
+	  (copy_from_user((to), (attr_array)->attrs[idx].cmd_attr.ptr,	\
+			 (attr_array)->attrs[idx].cmd_attr.len) ?	\
+	   -EFAULT : 0)) : -ENOENT)
+#define uverbs_get_attr(to, attr_array, idx)				\
+	(uverbs_is_valid((attr_array), idx) ?				\
+	 (sizeof(to) <= sizeof(((struct ib_uverbs_attr *)0)->ptr_idr) ? \
+	  (sizeof(to) == sizeof((&(to))[0]) ?				\
+	   ((to) = *(typeof(to) *)&(attr_array)->attrs[idx].cmd_attr.ptr, 0) :\
+	   (memcpy(&(to), &(attr_array)->attrs[idx].cmd_attr.ptr,	\
+		 (attr_array)->attrs[idx].cmd_attr.len), 0)) :		\
+	  (copy_from_user(&(to), (attr_array)->attrs[idx].cmd_attr.ptr,	\
+			 (attr_array)->attrs[idx].cmd_attr.len) ?	\
+	   -EFAULT : 0)) : -ENOENT)
+
 /* =================================================
  *              Types infrastructure
  * =================================================
-- 
1.8.3.1


* [RFC ABI V6 09/14] IB/core: Add uverbs types, actions, handlers and attributes
@ 2016-12-11 12:58 Matan Barak
From: Matan Barak @ 2016-12-11 12:58 UTC (permalink / raw)
  To: linux-rdma-u79uwXL29TY76Z2rM5mHXA
  Cc: Doug Ledford, Jason Gunthorpe, Sean Hefty, Christoph Lameter,
	Liran Liss, Haggai Eran, Majd Dibbiny, Matan Barak, Tal Alon,
	Leon Romanovsky

We add the common (core) code for init_ucontext, query_device,
reg_mr, create_cq, create_qp, modify_qp, create_comp_channel and
alloc_pd.
This includes the following parts:
* Macros for defining commands and validators
* For each command
    * type declarations
          - destruction order
          - free function
          - uverbs action group
    * actions
    * handlers
    * attributes

Drivers could use these attributes, actions or types when they
want to alter or add a new type. They could use the uverbs handler
directly in the action (or just wrap it in the driver's custom code).

Currently we use ib_udata to pass vendor-specific information to the
driver. This should probably be refactored in the future.

Signed-off-by: Matan Barak <matanb-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
Signed-off-by: Haggai Eran <haggaie-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
Signed-off-by: Leon Romanovsky <leonro-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
---
 drivers/infiniband/core/core_priv.h        |  14 +
 drivers/infiniband/core/uverbs.h           |   4 +
 drivers/infiniband/core/uverbs_cmd.c       |  21 +-
 drivers/infiniband/core/uverbs_ioctl_cmd.c | 864 ++++++++++++++++++++++++++++-
 include/rdma/uverbs_ioctl_cmd.h            | 140 +++++
 include/uapi/rdma/ib_user_verbs.h          |  39 ++
 6 files changed, 1052 insertions(+), 30 deletions(-)

diff --git a/drivers/infiniband/core/core_priv.h b/drivers/infiniband/core/core_priv.h
index 19d499d..fccc7bc 100644
--- a/drivers/infiniband/core/core_priv.h
+++ b/drivers/infiniband/core/core_priv.h
@@ -153,4 +153,18 @@ int ib_nl_handle_set_timeout(struct sk_buff *skb,
 int ib_nl_handle_ip_res_resp(struct sk_buff *skb,
 			     struct netlink_callback *cb);
 
+/* Remove ignored fields set in the attribute mask */
+static inline int modify_qp_mask(enum ib_qp_type qp_type, int mask)
+{
+	switch (qp_type) {
+	case IB_QPT_XRC_INI:
+		return mask & ~(IB_QP_MAX_DEST_RD_ATOMIC | IB_QP_MIN_RNR_TIMER);
+	case IB_QPT_XRC_TGT:
+		return mask & ~(IB_QP_MAX_QP_RD_ATOMIC | IB_QP_RETRY_CNT |
+				IB_QP_RNR_RETRY);
+	default:
+		return mask;
+	}
+}
+
 #endif /* _CORE_PRIV_H */
diff --git a/drivers/infiniband/core/uverbs.h b/drivers/infiniband/core/uverbs.h
index d3ad81c..7c038a3 100644
--- a/drivers/infiniband/core/uverbs.h
+++ b/drivers/infiniband/core/uverbs.h
@@ -201,6 +201,10 @@ void ib_uverbs_event_handler(struct ib_event_handler *handler,
 void ib_uverbs_dealloc_xrcd(struct ib_uverbs_device *dev, struct ib_xrcd *xrcd);
 
 int uverbs_dealloc_mw(struct ib_mw *mw);
+void uverbs_copy_query_dev_fields(struct ib_device *ib_dev,
+				  struct ib_uverbs_query_device_resp *resp,
+				  struct ib_device_attr *attr);
+
 void ib_uverbs_release_ucq(struct ib_uverbs_file *file,
 			   struct ib_uverbs_event_file *ev_file,
 			   struct ib_ucq_object *uobj);
diff --git a/drivers/infiniband/core/uverbs_cmd.c b/drivers/infiniband/core/uverbs_cmd.c
index 79a1a8b..a3fc3f72 100644
--- a/drivers/infiniband/core/uverbs_cmd.c
+++ b/drivers/infiniband/core/uverbs_cmd.c
@@ -210,8 +210,7 @@ err:
 	return ret;
 }
 
-static void copy_query_dev_fields(struct ib_uverbs_file *file,
-				  struct ib_device *ib_dev,
+void uverbs_copy_query_dev_fields(struct ib_device *ib_dev,
 				  struct ib_uverbs_query_device_resp *resp,
 				  struct ib_device_attr *attr)
 {
@@ -272,7 +271,7 @@ ssize_t ib_uverbs_query_device(struct ib_uverbs_file *file,
 		return -EFAULT;
 
 	memset(&resp, 0, sizeof resp);
-	copy_query_dev_fields(file, ib_dev, &resp, &ib_dev->attrs);
+	uverbs_copy_query_dev_fields(ib_dev, &resp, &ib_dev->attrs);
 
 	if (copy_to_user((void __user *) (unsigned long) cmd.response,
 			 &resp, sizeof resp))
@@ -2013,20 +2012,6 @@ out_query:
 	return ret;
 }
 
-/* Remove ignored fields set in the attribute mask */
-static int modify_qp_mask(enum ib_qp_type qp_type, int mask)
-{
-	switch (qp_type) {
-	case IB_QPT_XRC_INI:
-		return mask & ~(IB_QP_MAX_DEST_RD_ATOMIC | IB_QP_MIN_RNR_TIMER);
-	case IB_QPT_XRC_TGT:
-		return mask & ~(IB_QP_MAX_QP_RD_ATOMIC | IB_QP_RETRY_CNT |
-				IB_QP_RNR_RETRY);
-	default:
-		return mask;
-	}
-}
-
 ssize_t ib_uverbs_modify_qp(struct ib_uverbs_file *file,
 			    struct ib_device *ib_dev,
 			    const char __user *buf, int in_len,
@@ -3834,7 +3819,7 @@ int ib_uverbs_ex_query_device(struct ib_uverbs_file *file,
 	if (err)
 		return err;
 
-	copy_query_dev_fields(file, ib_dev, &resp.base, &attr);
+	uverbs_copy_query_dev_fields(ib_dev, &resp.base, &attr);
 
 	if (ucore->outlen < resp.response_length + sizeof(resp.odp_caps))
 		goto end;
diff --git a/drivers/infiniband/core/uverbs_ioctl_cmd.c b/drivers/infiniband/core/uverbs_ioctl_cmd.c
index cb19f38..5ab6189 100644
--- a/drivers/infiniband/core/uverbs_ioctl_cmd.c
+++ b/drivers/infiniband/core/uverbs_ioctl_cmd.c
@@ -37,6 +37,7 @@
 #include <linux/file.h>
 #include "rdma_core.h"
 #include "uverbs.h"
+#include "core_priv.h"
 
 void uverbs_free_ah(const struct uverbs_type_alloc_action *uobject_type,
 		    struct ib_uobject *uobject)
@@ -151,28 +152,852 @@ void uverbs_free_event_file(const struct uverbs_type_alloc_action *type_alloc_ac
 	kill_fasync(&event_file->async_queue, SIGIO, POLL_IN);
 };
 
+DECLARE_UVERBS_ATTR_SPEC(
+	uverbs_uhw_compat_spec,
+	UVERBS_ATTR_PTR_IN_SZ(UVERBS_UHW_IN, 0, UA_FLAGS(UVERBS_ATTR_SPEC_F_MIN_SZ)),
+	UVERBS_ATTR_PTR_OUT_SZ(UVERBS_UHW_OUT, 0, UA_FLAGS(UVERBS_ATTR_SPEC_F_MIN_SZ)));
+
+static void create_udata(struct uverbs_attr_array *ctx, size_t num,
+			 struct ib_udata *udata)
+{
+	/*
+	 * This is for ease of conversion. The purpose is to convert all drivers
+	 * to use uverbs_attr_array instead of ib_udata.
+	 * Assume attr == 0 is input and attr == 1 is output.
+	 */
+	void __user *inbuf = NULL;
+	size_t inbuf_len = 0;
+	void __user *outbuf = NULL;
+	size_t outbuf_len = 0;
+
+	if (num >= 2) {
+		struct uverbs_attr_array *driver = &ctx[1];
+
+		WARN_ON(driver->num_attrs > 2);
+
+		if (uverbs_is_valid(driver, 0)) {
+			inbuf = driver->attrs[0].cmd_attr.ptr;
+			inbuf_len = driver->attrs[0].cmd_attr.len;
+		}
+
+		if (driver->num_attrs == 2 && uverbs_is_valid(driver, 1)) {
+			outbuf = driver->attrs[1].cmd_attr.ptr;
+			outbuf_len = driver->attrs[1].cmd_attr.len;
+		}
+	}
+	INIT_UDATA_BUF_OR_NULL(udata, inbuf, outbuf, inbuf_len, outbuf_len);
+}
+
+DECLARE_UVERBS_ATTR_SPEC(
+	uverbs_get_context_spec,
+	UVERBS_ATTR_PTR_OUT(GET_CONTEXT_RESP,
+			    struct ib_uverbs_get_context_resp));
+
+int uverbs_get_context(struct ib_device *ib_dev,
+		       struct ib_uverbs_file *file,
+		       struct uverbs_attr_array *ctx, size_t num)
+{
+	struct uverbs_attr_array *common = &ctx[0];
+	struct ib_udata uhw;
+	struct ib_uverbs_get_context_resp resp;
+	struct ib_ucontext		 *ucontext;
+	struct file			 *filp;
+	int ret;
+
+	if (!uverbs_is_valid(common, GET_CONTEXT_RESP))
+		return -EINVAL;
+
+	/* Temporary, only until drivers get the new uverbs_attr_array */
+	create_udata(ctx, num, &uhw);
+
+	mutex_lock(&file->mutex);
+
+	if (file->ucontext) {
+		ret = -EINVAL;
+		goto err;
+	}
+
+	ucontext = ib_dev->alloc_ucontext(ib_dev, &uhw);
+	if (IS_ERR(ucontext)) {
+		ret = PTR_ERR(ucontext);
+		goto err;
+	}
+
+	ucontext->device = ib_dev;
+	ret = ib_uverbs_uobject_type_initialize_ucontext(ucontext);
+	if (ret)
+		goto err_ctx;
+
+	rcu_read_lock();
+	ucontext->tgid = get_task_pid(current->group_leader, PIDTYPE_PID);
+	rcu_read_unlock();
+	ucontext->closing = 0;
+
+#ifdef CONFIG_INFINIBAND_ON_DEMAND_PAGING
+	ucontext->umem_tree = RB_ROOT;
+	init_rwsem(&ucontext->umem_rwsem);
+	ucontext->odp_mrs_count = 0;
+	INIT_LIST_HEAD(&ucontext->no_private_counters);
+
+	if (!(ib_dev->attrs.device_cap_flags & IB_DEVICE_ON_DEMAND_PAGING))
+		ucontext->invalidate_range = NULL;
+
+#endif
+
+	resp.num_comp_vectors = file->device->num_comp_vectors;
+
+	ret = get_unused_fd_flags(O_CLOEXEC);
+	if (ret < 0)
+		goto err_free;
+	resp.async_fd = ret;
+
+	filp = ib_uverbs_alloc_async_event_file(file, ib_dev);
+	if (IS_ERR(filp)) {
+		ret = PTR_ERR(filp);
+		goto err_fd;
+	}
+
+	if (copy_to_user(common->attrs[GET_CONTEXT_RESP].cmd_attr.ptr,
+			 &resp, sizeof(resp))) {
+		ret = -EFAULT;
+		goto err_file;
+	}
+
+	file->ucontext = ucontext;
+	ucontext->ufile = file;
+
+	fd_install(resp.async_fd, filp);
+
+	mutex_unlock(&file->mutex);
+
+	return 0;
+
+err_file:
+	ib_uverbs_free_async_event_file(file);
+	fput(filp);
+
+err_fd:
+	put_unused_fd(resp.async_fd);
+
+err_free:
+	put_pid(ucontext->tgid);
+	ib_uverbs_uobject_type_release_ucontext(ucontext);
+
+err_ctx:
+	ib_dev->dealloc_ucontext(ucontext);
+err:
+	mutex_unlock(&file->mutex);
+	return ret;
+}
+
+DECLARE_UVERBS_ATTR_SPEC(
+	uverbs_query_device_spec,
+	UVERBS_ATTR_PTR_OUT(QUERY_DEVICE_RESP, struct ib_uverbs_query_device_resp),
+	UVERBS_ATTR_PTR_OUT(QUERY_DEVICE_ODP, struct ib_uverbs_odp_caps),
+	UVERBS_ATTR_PTR_OUT(QUERY_DEVICE_TIMESTAMP_MASK, u64),
+	UVERBS_ATTR_PTR_OUT(QUERY_DEVICE_HCA_CORE_CLOCK, u64),
+	UVERBS_ATTR_PTR_OUT(QUERY_DEVICE_CAP_FLAGS, u64));
+
+int uverbs_query_device_handler(struct ib_device *ib_dev,
+				struct ib_uverbs_file *file,
+				struct uverbs_attr_array *ctx, size_t num)
+{
+	struct uverbs_attr_array *common = &ctx[0];
+	struct ib_device_attr attr = {};
+	struct ib_udata uhw;
+	int err;
+
+	/* Temporary, only until drivers get the new uverbs_attr_array */
+	create_udata(ctx, num, &uhw);
+
+	err = ib_dev->query_device(ib_dev, &attr, &uhw);
+	if (err)
+		return err;
+
+	if (uverbs_is_valid(common, QUERY_DEVICE_RESP)) {
+		struct ib_uverbs_query_device_resp resp = {};
+
+		uverbs_copy_query_dev_fields(ib_dev, &resp, &attr);
+		if (copy_to_user(common->attrs[QUERY_DEVICE_RESP].cmd_attr.ptr,
+				 &resp, sizeof(resp)))
+			return -EFAULT;
+	}
+
+#ifdef CONFIG_INFINIBAND_ON_DEMAND_PAGING
+	if (uverbs_is_valid(common, QUERY_DEVICE_ODP)) {
+		struct ib_uverbs_odp_caps odp_caps;
+
+		odp_caps.general_caps = attr.odp_caps.general_caps;
+		odp_caps.per_transport_caps.rc_odp_caps =
+			attr.odp_caps.per_transport_caps.rc_odp_caps;
+		odp_caps.per_transport_caps.uc_odp_caps =
+			attr.odp_caps.per_transport_caps.uc_odp_caps;
+		odp_caps.per_transport_caps.ud_odp_caps =
+			attr.odp_caps.per_transport_caps.ud_odp_caps;
+
+		if (copy_to_user(common->attrs[QUERY_DEVICE_ODP].cmd_attr.ptr,
+				 &odp_caps, sizeof(odp_caps)))
+			return -EFAULT;
+	}
+#endif
+	if (uverbs_copy_to(common, QUERY_DEVICE_TIMESTAMP_MASK,
+			   &attr.timestamp_mask) == -EFAULT)
+		return -EFAULT;
+
+	if (uverbs_copy_to(common, QUERY_DEVICE_HCA_CORE_CLOCK,
+			   &attr.hca_core_clock) == -EFAULT)
+		return -EFAULT;
+
+	if (uverbs_copy_to(common, QUERY_DEVICE_CAP_FLAGS,
+			   &attr.device_cap_flags) == -EFAULT)
+		return -EFAULT;
+
+	return 0;
+}
+
+DECLARE_UVERBS_ATTR_SPEC(
+	uverbs_alloc_pd_spec,
+	UVERBS_ATTR_IDR(ALLOC_PD_HANDLE, UVERBS_TYPE_PD,
+			UVERBS_IDR_ACCESS_NEW,
+			UA_FLAGS(UVERBS_ATTR_SPEC_F_MANDATORY)));
+
+int uverbs_alloc_pd_handler(struct ib_device *ib_dev,
+			    struct ib_uverbs_file *file,
+			    struct uverbs_attr_array *ctx, size_t num)
+{
+	struct uverbs_attr_array *common = &ctx[0];
+	struct ib_ucontext *ucontext = file->ucontext;
+	struct ib_udata uhw;
+	struct ib_uobject *uobject;
+	struct ib_pd *pd;
+
+	/* Temporary, only until drivers get the new uverbs_attr_array */
+	create_udata(ctx, num, &uhw);
+
+	pd = ib_dev->alloc_pd(ib_dev, ucontext, &uhw);
+	if (IS_ERR(pd))
+		return PTR_ERR(pd);
+
+	uobject = common->attrs[ALLOC_PD_HANDLE].obj_attr.uobject;
+	pd->device  = ib_dev;
+	pd->uobject = uobject;
+	pd->__internal_mr = NULL;
+	uobject->object = pd;
+	atomic_set(&pd->usecnt, 0);
+
+	return 0;
+}
+
+DECLARE_UVERBS_ATTR_SPEC(
+	uverbs_reg_mr_spec,
+	UVERBS_ATTR_IDR(REG_MR_HANDLE, UVERBS_TYPE_MR, UVERBS_IDR_ACCESS_NEW,
+			UA_FLAGS(UVERBS_ATTR_SPEC_F_MANDATORY)),
+	UVERBS_ATTR_IDR(REG_MR_PD_HANDLE, UVERBS_TYPE_PD, UVERBS_IDR_ACCESS_READ,
+			UA_FLAGS(UVERBS_ATTR_SPEC_F_MANDATORY)),
+	UVERBS_ATTR_PTR_IN(REG_MR_CMD, struct ib_uverbs_ioctl_reg_mr,
+			   UA_FLAGS(UVERBS_ATTR_SPEC_F_MANDATORY)),
+	UVERBS_ATTR_PTR_OUT(REG_MR_RESP, struct ib_uverbs_ioctl_reg_mr_resp,
+			    UA_FLAGS(UVERBS_ATTR_SPEC_F_MANDATORY)));
+
+int uverbs_reg_mr_handler(struct ib_device *ib_dev,
+			  struct ib_uverbs_file *file,
+			  struct uverbs_attr_array *ctx, size_t num)
+{
+	struct uverbs_attr_array *common = &ctx[0];
+	struct ib_uverbs_ioctl_reg_mr		cmd;
+	struct ib_uverbs_ioctl_reg_mr_resp	resp;
+	struct ib_udata uhw;
+	struct ib_uobject *uobject;
+	struct ib_pd                *pd;
+	struct ib_mr                *mr;
+	int                          ret;
+
+	if (copy_from_user(&cmd, common->attrs[REG_MR_CMD].cmd_attr.ptr,
+			   sizeof(cmd)))
+		return -EFAULT;
+
+	if ((cmd.start & ~PAGE_MASK) != (cmd.hca_va & ~PAGE_MASK))
+		return -EINVAL;
+
+	ret = ib_check_mr_access(cmd.access_flags);
+	if (ret)
+		return ret;
+
+	/* Temporary, only until drivers get the new uverbs_attr_array */
+	create_udata(ctx, num, &uhw);
+
+	uobject = common->attrs[REG_MR_HANDLE].obj_attr.uobject;
+	pd = common->attrs[REG_MR_PD_HANDLE].obj_attr.uobject->object;
+
+	if (cmd.access_flags & IB_ACCESS_ON_DEMAND) {
+		if (!(pd->device->attrs.device_cap_flags &
+		      IB_DEVICE_ON_DEMAND_PAGING)) {
+			pr_debug("ODP support not available\n");
+			return -EINVAL;
+		}
+	}
+
+	mr = pd->device->reg_user_mr(pd, cmd.start, cmd.length, cmd.hca_va,
+				     cmd.access_flags, &uhw);
+	if (IS_ERR(mr))
+		return PTR_ERR(mr);
+
+	mr->device  = pd->device;
+	mr->pd      = pd;
+	mr->uobject = uobject;
+	atomic_inc(&pd->usecnt);
+	uobject->object = mr;
+
+	resp.lkey      = mr->lkey;
+	resp.rkey      = mr->rkey;
+
+	if (copy_to_user(common->attrs[REG_MR_RESP].cmd_attr.ptr,
+			 &resp, sizeof(resp))) {
+		ret = -EFAULT;
+		goto err;
+	}
+
+	return 0;
+
+err:
+	ib_dereg_mr(mr);
+	return ret;
+}
+
+DECLARE_UVERBS_ATTR_SPEC(
+	uverbs_dereg_mr_spec,
+	UVERBS_ATTR_IDR(DEREG_MR_HANDLE, UVERBS_TYPE_MR, UVERBS_IDR_ACCESS_DESTROY,
+			UA_FLAGS(UVERBS_ATTR_SPEC_F_MANDATORY)));
+
+int uverbs_dereg_mr_handler(struct ib_device *ib_dev,
+			    struct ib_uverbs_file *file,
+			    struct uverbs_attr_array *ctx, size_t num)
+{
+	struct uverbs_attr_array *common = &ctx[0];
+	struct ib_mr             *mr;
+
+	mr = common->attrs[DEREG_MR_HANDLE].obj_attr.uobject->object;
+
+	/* dereg_mr doesn't support driver data */
+	return ib_dereg_mr(mr);
+}
+
+DECLARE_UVERBS_ATTR_SPEC(
+	uverbs_create_comp_channel_spec,
+	UVERBS_ATTR_FD(CREATE_COMP_CHANNEL_FD, UVERBS_TYPE_COMP_CHANNEL,
+		       UVERBS_IDR_ACCESS_NEW,
+		       UA_FLAGS(UVERBS_ATTR_SPEC_F_MANDATORY)));
+
+int uverbs_create_comp_channel_handler(struct ib_device *ib_dev,
+				       struct ib_uverbs_file *file,
+				       struct uverbs_attr_array *ctx, size_t num)
+{
+	struct uverbs_attr_array *common = &ctx[0];
+	struct ib_uverbs_event_file *ev_file;
+
+	if (!uverbs_is_valid(common, CREATE_COMP_CHANNEL_FD))
+		return -EINVAL;
+
+	ev_file = uverbs_fd_to_priv(common->attrs[CREATE_COMP_CHANNEL_FD].obj_attr.uobject);
+	kref_init(&ev_file->ref);
+	spin_lock_init(&ev_file->lock);
+	INIT_LIST_HEAD(&ev_file->event_list);
+	init_waitqueue_head(&ev_file->poll_wait);
+	ev_file->async_queue = NULL;
+	ev_file->uverbs_file = file;
+	ev_file->is_closed   = 0;
+
+	/*
+	 * The original code puts the handle in an event list....
+	 * Currently, it's on our context
+	 */
+
+	return 0;
+}
+
+DECLARE_UVERBS_ATTR_SPEC(
+	uverbs_create_cq_spec,
+	UVERBS_ATTR_IDR(CREATE_CQ_HANDLE, UVERBS_TYPE_CQ, UVERBS_IDR_ACCESS_NEW,
+			UA_FLAGS(UVERBS_ATTR_SPEC_F_MANDATORY)),
+	UVERBS_ATTR_PTR_IN(CREATE_CQ_CQE, u32,
+			   UA_FLAGS(UVERBS_ATTR_SPEC_F_MANDATORY)),
+	UVERBS_ATTR_PTR_IN(CREATE_CQ_USER_HANDLE, u64),
+	UVERBS_ATTR_FD(CREATE_CQ_COMP_CHANNEL, UVERBS_TYPE_COMP_CHANNEL, UVERBS_IDR_ACCESS_READ),
+	/*
+	 * Currently, COMP_VECTOR is mandatory, but that could be lifted in the
+	 * future.
+	 */
+	UVERBS_ATTR_PTR_IN(CREATE_CQ_COMP_VECTOR, u32,
+			   UA_FLAGS(UVERBS_ATTR_SPEC_F_MANDATORY)),
+	UVERBS_ATTR_PTR_IN(CREATE_CQ_FLAGS, u32),
+	UVERBS_ATTR_PTR_OUT(CREATE_CQ_RESP_CQE, u32,
+			    UA_FLAGS(UVERBS_ATTR_SPEC_F_MANDATORY)));
+
+int uverbs_create_cq_handler(struct ib_device *ib_dev,
+			     struct ib_uverbs_file *file,
+			     struct uverbs_attr_array *ctx, size_t num)
+{
+	struct uverbs_attr_array *common = &ctx[0];
+	struct ib_ucontext *ucontext = file->ucontext;
+	struct ib_ucq_object           *obj;
+	struct ib_udata uhw;
+	int ret;
+	u64 user_handle = 0;
+	struct ib_cq_init_attr attr = {};
+	struct ib_cq                   *cq;
+	struct ib_uverbs_event_file    *ev_file = NULL;
+
+	ret = uverbs_copy_from(&attr.comp_vector, common, CREATE_CQ_COMP_VECTOR);
+	if (!ret)
+		ret = uverbs_copy_from(&attr.cqe, common, CREATE_CQ_CQE);
+	if (ret)
+		return ret;
+
+	/* Optional params */
+	if (uverbs_copy_from(&attr.flags, common, CREATE_CQ_FLAGS) == -EFAULT ||
+	    uverbs_copy_from(&user_handle, common, CREATE_CQ_USER_HANDLE) == -EFAULT)
+		return -EFAULT;
+
+	if (attr.comp_vector >= ucontext->ufile->device->num_comp_vectors)
+		return -EINVAL;
+
+	if (uverbs_is_valid(common, CREATE_CQ_COMP_CHANNEL)) {
+		ev_file = uverbs_fd_to_priv(common->attrs[CREATE_CQ_COMP_CHANNEL].obj_attr.uobject);
+		kref_get(&ev_file->ref);
+	}
+
+	obj = container_of(common->attrs[CREATE_CQ_HANDLE].obj_attr.uobject,
+			   typeof(*obj), uobject);
+	obj->uverbs_file	   = ucontext->ufile;
+	obj->comp_events_reported  = 0;
+	obj->async_events_reported = 0;
+	INIT_LIST_HEAD(&obj->comp_list);
+	INIT_LIST_HEAD(&obj->async_list);
+
+	/* Temporary, only until drivers get the new uverbs_attr_array */
+	create_udata(ctx, num, &uhw);
+
+	cq = ib_dev->create_cq(ib_dev, &attr, ucontext, &uhw);
+	if (IS_ERR(cq))
+		return PTR_ERR(cq);
+
+	cq->device        = ib_dev;
+	cq->uobject       = &obj->uobject;
+	cq->comp_handler  = ib_uverbs_comp_handler;
+	cq->event_handler = ib_uverbs_cq_event_handler;
+	cq->cq_context    = ev_file;
+	obj->uobject.object = cq;
+	obj->uobject.user_handle = user_handle;
+	atomic_set(&cq->usecnt, 0);
+
+	ret = uverbs_copy_to(common, CREATE_CQ_RESP_CQE, &cq->cqe);
+	if (ret)
+		goto err;
+
+	return 0;
+err:
+	ib_destroy_cq(cq);
+	return ret;
+}
+
+static int qp_fill_attrs(struct ib_qp_init_attr *attr, struct ib_ucontext *ctx,
+			 const struct ib_uverbs_ioctl_create_qp *cmd,
+			 u32 create_flags)
+{
+	if (create_flags & ~(IB_QP_CREATE_BLOCK_MULTICAST_LOOPBACK |
+			     IB_QP_CREATE_CROSS_CHANNEL |
+			     IB_QP_CREATE_MANAGED_SEND |
+			     IB_QP_CREATE_MANAGED_RECV |
+			     IB_QP_CREATE_SCATTER_FCS))
+		return -EINVAL;
+
+	attr->create_flags = create_flags;
+	attr->event_handler = ib_uverbs_qp_event_handler;
+	attr->qp_context = ctx->ufile;
+	attr->sq_sig_type = cmd->sq_sig_all ? IB_SIGNAL_ALL_WR :
+		IB_SIGNAL_REQ_WR;
+	attr->qp_type = cmd->qp_type;
+
+	attr->cap.max_send_wr     = cmd->max_send_wr;
+	attr->cap.max_recv_wr     = cmd->max_recv_wr;
+	attr->cap.max_send_sge    = cmd->max_send_sge;
+	attr->cap.max_recv_sge    = cmd->max_recv_sge;
+	attr->cap.max_inline_data = cmd->max_inline_data;
+
+	return 0;
+}
+
+static void qp_init_uqp(struct ib_uqp_object *obj)
+{
+	obj->uevent.events_reported     = 0;
+	INIT_LIST_HEAD(&obj->uevent.event_list);
+	INIT_LIST_HEAD(&obj->mcast_list);
+}
+
+static int qp_write_resp(const struct ib_qp_init_attr *attr,
+			 const struct ib_qp *qp,
+			 struct uverbs_attr_array *common)
+{
+	struct ib_uverbs_ioctl_create_qp_resp resp = {
+		.qpn = qp->qp_num,
+		.max_recv_sge    = attr->cap.max_recv_sge,
+		.max_send_sge    = attr->cap.max_send_sge,
+		.max_recv_wr     = attr->cap.max_recv_wr,
+		.max_send_wr     = attr->cap.max_send_wr,
+		.max_inline_data = attr->cap.max_inline_data};
+
+	return uverbs_copy_to(common, CREATE_QP_RESP, &resp);
+}
+
+DECLARE_UVERBS_ATTR_SPEC(
+	uverbs_create_qp_spec,
+	UVERBS_ATTR_IDR(CREATE_QP_HANDLE, UVERBS_TYPE_QP, UVERBS_IDR_ACCESS_NEW,
+			UA_FLAGS(UVERBS_ATTR_SPEC_F_MANDATORY)),
+	UVERBS_ATTR_IDR(CREATE_QP_PD_HANDLE, UVERBS_TYPE_PD, UVERBS_IDR_ACCESS_READ,
+			UA_FLAGS(UVERBS_ATTR_SPEC_F_MANDATORY)),
+	UVERBS_ATTR_IDR(CREATE_QP_SEND_CQ, UVERBS_TYPE_CQ, UVERBS_IDR_ACCESS_READ,
+			UA_FLAGS(UVERBS_ATTR_SPEC_F_MANDATORY)),
+	UVERBS_ATTR_IDR(CREATE_QP_RECV_CQ, UVERBS_TYPE_CQ, UVERBS_IDR_ACCESS_READ,
+			UA_FLAGS(UVERBS_ATTR_SPEC_F_MANDATORY)),
+	UVERBS_ATTR_IDR(CREATE_QP_SRQ, UVERBS_TYPE_SRQ, UVERBS_IDR_ACCESS_READ),
+	UVERBS_ATTR_PTR_IN(CREATE_QP_USER_HANDLE, u64),
+	UVERBS_ATTR_PTR_IN(CREATE_QP_CMD, struct ib_uverbs_ioctl_create_qp),
+	UVERBS_ATTR_PTR_IN(CREATE_QP_CMD_FLAGS, u32),
+	UVERBS_ATTR_PTR_OUT(CREATE_QP_RESP, struct ib_uverbs_ioctl_create_qp_resp,
+			    UA_FLAGS(UVERBS_ATTR_SPEC_F_MANDATORY)));
+
+int uverbs_create_qp_handler(struct ib_device *ib_dev,
+			     struct ib_uverbs_file *file,
+			     struct uverbs_attr_array *ctx, size_t num)
+{
+	struct uverbs_attr_array *common = &ctx[0];
+	struct ib_ucontext *ucontext = file->ucontext;
+	struct ib_uqp_object           *obj;
+	struct ib_udata uhw;
+	int ret;
+	u64 user_handle = 0;
+	u32 create_flags = 0;
+	struct ib_uverbs_ioctl_create_qp cmd;
+	struct ib_qp_init_attr attr = {};
+	struct ib_qp                   *qp;
+	struct ib_pd			*pd;
+
+	ret = uverbs_copy_from(&cmd, common, CREATE_QP_CMD);
+	if (ret)
+		return ret;
+
+	/* Optional params */
+	if (uverbs_copy_from(&create_flags, common, CREATE_QP_CMD_FLAGS) == -EFAULT ||
+	    uverbs_copy_from(&user_handle, common, CREATE_QP_USER_HANDLE) == -EFAULT)
+		return -EFAULT;
+
+	if (cmd.qp_type == IB_QPT_XRC_INI) {
+		cmd.max_recv_wr = 0;
+		cmd.max_recv_sge = 0;
+	}
+
+	ret = qp_fill_attrs(&attr, ucontext, &cmd, create_flags);
+	if (ret)
+		return ret;
+
+	pd = common->attrs[CREATE_QP_PD_HANDLE].obj_attr.uobject->object;
+	attr.send_cq = common->attrs[CREATE_QP_SEND_CQ].obj_attr.uobject->object;
+	attr.recv_cq = common->attrs[CREATE_QP_RECV_CQ].obj_attr.uobject->object;
+	if (uverbs_is_valid(common, CREATE_QP_SRQ))
+		attr.srq = common->attrs[CREATE_QP_SRQ].obj_attr.uobject->object;
+	obj = (struct ib_uqp_object *)common->attrs[CREATE_QP_HANDLE].obj_attr.uobject;
+
+	if (attr.srq && attr.srq->srq_type != IB_SRQT_BASIC)
+		return -EINVAL;
+
+	qp_init_uqp(obj);
+	create_udata(ctx, num, &uhw);
+	qp = pd->device->create_qp(pd, &attr, &uhw);
+	if (IS_ERR(qp))
+		return PTR_ERR(qp);
+	qp->real_qp	  = qp;
+	qp->device	  = pd->device;
+	qp->pd		  = pd;
+	qp->send_cq	  = attr.send_cq;
+	qp->recv_cq	  = attr.recv_cq;
+	qp->srq		  = attr.srq;
+	qp->event_handler = attr.event_handler;
+	qp->qp_context	  = attr.qp_context;
+	qp->qp_type	  = attr.qp_type;
+	atomic_set(&qp->usecnt, 0);
+	atomic_inc(&pd->usecnt);
+	atomic_inc(&attr.send_cq->usecnt);
+	if (attr.recv_cq)
+		atomic_inc(&attr.recv_cq->usecnt);
+	if (attr.srq)
+		atomic_inc(&attr.srq->usecnt);
+	qp->uobject = &obj->uevent.uobject;
+	obj->uevent.uobject.object = qp;
+	obj->uevent.uobject.user_handle = user_handle;
+
+	ret = qp_write_resp(&attr, qp, common);
+	if (ret) {
+		ib_destroy_qp(qp);
+		return ret;
+	}
+
+	return 0;
+}
+
+DECLARE_UVERBS_ATTR_SPEC(
+	uverbs_create_qp_xrc_tgt_spec,
+	UVERBS_ATTR_IDR(CREATE_QP_XRC_TGT_HANDLE, UVERBS_TYPE_QP, UVERBS_IDR_ACCESS_NEW,
+			UA_FLAGS(UVERBS_ATTR_SPEC_F_MANDATORY)),
+	UVERBS_ATTR_IDR(CREATE_QP_XRC_TGT_XRCD, UVERBS_TYPE_XRCD, UVERBS_IDR_ACCESS_READ,
+			UA_FLAGS(UVERBS_ATTR_SPEC_F_MANDATORY)),
+	UVERBS_ATTR_PTR_IN(CREATE_QP_XRC_TGT_USER_HANDLE, u64),
+	UVERBS_ATTR_PTR_IN(CREATE_QP_XRC_TGT_CMD, struct ib_uverbs_ioctl_create_qp),
+	UVERBS_ATTR_PTR_IN(CREATE_QP_XRC_TGT_CMD_FLAGS, u32),
+	UVERBS_ATTR_PTR_OUT(CREATE_QP_XRC_TGT_RESP, struct ib_uverbs_ioctl_create_qp_resp,
+			    UA_FLAGS(UVERBS_ATTR_SPEC_F_MANDATORY)));
+
+int uverbs_create_qp_xrc_tgt_handler(struct ib_device *ib_dev,
+				     struct ib_uverbs_file *file,
+				     struct uverbs_attr_array *ctx, size_t num)
+{
+	struct uverbs_attr_array *common = &ctx[0];
+	struct ib_ucontext *ucontext = file->ucontext;
+	struct ib_uqp_object           *obj;
+	int ret;
+	u64 user_handle = 0;
+	u32 create_flags = 0;
+	struct ib_uverbs_ioctl_create_qp cmd;
+	struct ib_qp_init_attr attr = {};
+	struct ib_qp                   *qp;
+
+	ret = uverbs_copy_from(&cmd, common, CREATE_QP_XRC_TGT_CMD);
+	if (ret)
+		return ret;
+
+	/* Optional params */
+	if (uverbs_copy_from(&create_flags, common, CREATE_QP_XRC_TGT_CMD_FLAGS) == -EFAULT ||
+	    uverbs_copy_from(&user_handle, common, CREATE_QP_XRC_TGT_USER_HANDLE) == -EFAULT)
+		return -EFAULT;
+
+	ret = qp_fill_attrs(&attr, ucontext, &cmd, create_flags);
+	if (ret)
+		return ret;
+
+	obj = (struct ib_uqp_object *)common->attrs[CREATE_QP_XRC_TGT_HANDLE].obj_attr.uobject;
+	obj->uxrcd = container_of(common->attrs[CREATE_QP_XRC_TGT_XRCD].obj_attr.uobject,
+				  struct ib_uxrcd_object, uobject);
+	attr.xrcd = obj->uxrcd->uobject.object;
+
+	qp_init_uqp(obj);
+	qp = ib_create_qp(NULL, &attr);
+	if (IS_ERR(qp))
+		return PTR_ERR(qp);
+	qp->uobject = &obj->uevent.uobject;
+	obj->uevent.uobject.object = qp;
+	obj->uevent.uobject.user_handle = user_handle;
+	atomic_inc(&obj->uxrcd->refcnt);
+
+	ret = qp_write_resp(&attr, qp, common);
+	if (ret) {
+		ib_destroy_qp(qp);
+		return ret;
+	}
+
+	return 0;
+}
+
+DECLARE_UVERBS_ATTR_SPEC(
+	uverbs_modify_qp_spec,
+	UVERBS_ATTR_IDR(MODIFY_QP_HANDLE, UVERBS_TYPE_QP, UVERBS_IDR_ACCESS_WRITE),
+	UVERBS_ATTR_PTR_IN(MODIFY_QP_STATE, u8),
+	UVERBS_ATTR_PTR_IN(MODIFY_QP_CUR_STATE, u8),
+	UVERBS_ATTR_PTR_IN(MODIFY_QP_EN_SQD_ASYNC_NOTIFY, u8),
+	UVERBS_ATTR_PTR_IN(MODIFY_QP_ACCESS_FLAGS, u32),
+	UVERBS_ATTR_PTR_IN(MODIFY_QP_PKEY_INDEX, u16),
+	UVERBS_ATTR_PTR_IN(MODIFY_QP_PORT, u8),
+	UVERBS_ATTR_PTR_IN(MODIFY_QP_QKEY, u32),
+	UVERBS_ATTR_PTR_IN(MODIFY_QP_AV, struct ib_uverbs_qp_dest),
+	UVERBS_ATTR_PTR_IN(MODIFY_QP_PATH_MTU, u8),
+	UVERBS_ATTR_PTR_IN(MODIFY_QP_TIMEOUT, u8),
+	UVERBS_ATTR_PTR_IN(MODIFY_QP_RETRY_CNT, u8),
+	UVERBS_ATTR_PTR_IN(MODIFY_QP_RNR_RETRY, u8),
+	UVERBS_ATTR_PTR_IN(MODIFY_QP_RQ_PSN, u32),
+	UVERBS_ATTR_PTR_IN(MODIFY_QP_MAX_RD_ATOMIC, u8),
+	UVERBS_ATTR_PTR_IN(MODIFY_QP_ALT_PATH, struct ib_uverbs_qp_alt_path),
+	UVERBS_ATTR_PTR_IN(MODIFY_QP_MIN_RNR_TIMER, u8),
+	UVERBS_ATTR_PTR_IN(MODIFY_QP_SQ_PSN, u32),
+	UVERBS_ATTR_PTR_IN(MODIFY_QP_MAX_DEST_RD_ATOMIC, u8),
+	UVERBS_ATTR_PTR_IN(MODIFY_QP_PATH_MIG_STATE, u8),
+	UVERBS_ATTR_PTR_IN(MODIFY_QP_DEST_QPN, u32));
+
+int uverbs_modify_qp_handler(struct ib_device *ib_dev,
+			     struct ib_uverbs_file *file,
+			     struct uverbs_attr_array *ctx, size_t num)
+{
+	struct uverbs_attr_array *common = &ctx[0];
+	struct ib_udata            uhw;
+	struct ib_qp              *qp;
+	struct ib_qp_attr         *attr;
+	struct ib_uverbs_qp_dest  av;
+	struct ib_uverbs_qp_alt_path alt_path;
+	u32 attr_mask = 0;
+	int ret = 0;
+
+	if (!uverbs_is_valid(common, MODIFY_QP_HANDLE))
+		return -EINVAL;
+
+	qp = common->attrs[MODIFY_QP_HANDLE].obj_attr.uobject->object;
+	attr = kzalloc(sizeof(*attr), GFP_KERNEL);
+	if (!attr)
+		return -ENOMEM;
+
+#define MODIFY_QP_CPY(_param, _fld, _attr)				\
+	({								\
+		int _ret = uverbs_copy_from(_fld, common, _param);	\
+		if (!_ret)						\
+			attr_mask |= _attr;				\
+		_ret == -EFAULT ? _ret : 0;				\
+	})
+
+	ret = ret ?: MODIFY_QP_CPY(MODIFY_QP_STATE, &attr->qp_state,
+				   IB_QP_STATE);
+	ret = ret ?: MODIFY_QP_CPY(MODIFY_QP_CUR_STATE, &attr->cur_qp_state,
+				   IB_QP_CUR_STATE);
+	ret = ret ?: MODIFY_QP_CPY(MODIFY_QP_EN_SQD_ASYNC_NOTIFY,
+				   &attr->en_sqd_async_notify,
+				   IB_QP_EN_SQD_ASYNC_NOTIFY);
+	ret = ret ?: MODIFY_QP_CPY(MODIFY_QP_ACCESS_FLAGS,
+				   &attr->qp_access_flags, IB_QP_ACCESS_FLAGS);
+	ret = ret ?: MODIFY_QP_CPY(MODIFY_QP_PKEY_INDEX, &attr->pkey_index,
+				   IB_QP_PKEY_INDEX);
+	ret = ret ?: MODIFY_QP_CPY(MODIFY_QP_PORT, &attr->port_num, IB_QP_PORT);
+	ret = ret ?: MODIFY_QP_CPY(MODIFY_QP_QKEY, &attr->qkey, IB_QP_QKEY);
+	ret = ret ?: MODIFY_QP_CPY(MODIFY_QP_PATH_MTU, &attr->path_mtu,
+				   IB_QP_PATH_MTU);
+	ret = ret ?: MODIFY_QP_CPY(MODIFY_QP_TIMEOUT, &attr->timeout,
+				   IB_QP_TIMEOUT);
+	ret = ret ?: MODIFY_QP_CPY(MODIFY_QP_RETRY_CNT, &attr->retry_cnt,
+				   IB_QP_RETRY_CNT);
+	ret = ret ?: MODIFY_QP_CPY(MODIFY_QP_RNR_RETRY, &attr->rnr_retry,
+				   IB_QP_RNR_RETRY);
+	ret = ret ?: MODIFY_QP_CPY(MODIFY_QP_RQ_PSN, &attr->rq_psn,
+				   IB_QP_RQ_PSN);
+	ret = ret ?: MODIFY_QP_CPY(MODIFY_QP_MAX_RD_ATOMIC,
+				   &attr->max_rd_atomic,
+				   IB_QP_MAX_QP_RD_ATOMIC);
+	ret = ret ?: MODIFY_QP_CPY(MODIFY_QP_MIN_RNR_TIMER,
+				   &attr->min_rnr_timer, IB_QP_MIN_RNR_TIMER);
+	ret = ret ?: MODIFY_QP_CPY(MODIFY_QP_SQ_PSN, &attr->sq_psn,
+				   IB_QP_SQ_PSN);
+	ret = ret ?: MODIFY_QP_CPY(MODIFY_QP_MAX_DEST_RD_ATOMIC,
+				   &attr->max_dest_rd_atomic,
+				   IB_QP_MAX_DEST_RD_ATOMIC);
+	ret = ret ?: MODIFY_QP_CPY(MODIFY_QP_PATH_MIG_STATE,
+				   &attr->path_mig_state, IB_QP_PATH_MIG_STATE);
+	ret = ret ?: MODIFY_QP_CPY(MODIFY_QP_DEST_QPN, &attr->dest_qp_num,
+				   IB_QP_DEST_QPN);
+
+	if (ret)
+		goto err;
+
+	ret = uverbs_copy_from(&av, common, MODIFY_QP_AV);
+	if (!ret) {
+		attr_mask |= IB_QP_AV;
+		memcpy(attr->ah_attr.grh.dgid.raw, av.dgid, 16);
+		attr->ah_attr.grh.flow_label        = av.flow_label;
+		attr->ah_attr.grh.sgid_index        = av.sgid_index;
+		attr->ah_attr.grh.hop_limit         = av.hop_limit;
+		attr->ah_attr.grh.traffic_class     = av.traffic_class;
+		attr->ah_attr.dlid		    = av.dlid;
+		attr->ah_attr.sl		    = av.sl;
+		attr->ah_attr.src_path_bits	    = av.src_path_bits;
+		attr->ah_attr.static_rate	    = av.static_rate;
+		attr->ah_attr.ah_flags		    = av.is_global ? IB_AH_GRH : 0;
+		attr->ah_attr.port_num		    = av.port_num;
+	} else if (ret == -EFAULT) {
+		goto err;
+	}
+
+	ret = uverbs_copy_from(&alt_path, common, MODIFY_QP_ALT_PATH);
+	if (!ret) {
+		attr_mask |= IB_QP_ALT_PATH;
+		memcpy(attr->alt_ah_attr.grh.dgid.raw, alt_path.dest.dgid, 16);
+		attr->alt_ah_attr.grh.flow_label    = alt_path.dest.flow_label;
+		attr->alt_ah_attr.grh.sgid_index    = alt_path.dest.sgid_index;
+		attr->alt_ah_attr.grh.hop_limit     = alt_path.dest.hop_limit;
+		attr->alt_ah_attr.grh.traffic_class = alt_path.dest.traffic_class;
+		attr->alt_ah_attr.dlid		    = alt_path.dest.dlid;
+		attr->alt_ah_attr.sl		    = alt_path.dest.sl;
+		attr->alt_ah_attr.src_path_bits     = alt_path.dest.src_path_bits;
+		attr->alt_ah_attr.static_rate       = alt_path.dest.static_rate;
+		attr->alt_ah_attr.ah_flags	    = alt_path.dest.is_global ? IB_AH_GRH : 0;
+		attr->alt_ah_attr.port_num	    = alt_path.dest.port_num;
+		attr->alt_pkey_index		    = alt_path.pkey_index;
+		attr->alt_port_num		    = alt_path.port_num;
+		attr->alt_timeout		    = alt_path.timeout;
+	} else if (ret == -EFAULT) {
+		goto err;
+	}
+
+	create_udata(ctx, num, &uhw);
+
+	if (qp->real_qp == qp) {
+		ret = ib_resolve_eth_dmac(qp, attr, &attr_mask);
+		if (ret)
+			goto err;
+		ret = qp->device->modify_qp(qp, attr,
+			modify_qp_mask(qp->qp_type, attr_mask), &uhw);
+	} else {
+		ret = ib_modify_qp(qp, attr, modify_qp_mask(qp->qp_type, attr_mask));
+	}
+
+err:
+	/* attr must be freed on the success path as well as on error */
+	kfree(attr);
+	return ret;
+}
+
 DECLARE_UVERBS_TYPE(uverbs_type_comp_channel,
 		    /* 1 is used in order to free the comp_channel after the CQs */
 		    &UVERBS_TYPE_ALLOC_FD(1, sizeof(struct ib_uobject) + sizeof(struct ib_uverbs_event_file),
 					  uverbs_free_event_file,
 					  &uverbs_event_fops,
 					  "[infinibandevent]", O_RDONLY),
-		    /* TODO: implement actions for comp channel */
-		    NULL);
+		    &UVERBS_ACTIONS(
+			ADD_UVERBS_ACTION(UVERBS_COMP_CHANNEL_CREATE,
+					  uverbs_create_comp_channel_handler,
+					  &uverbs_create_comp_channel_spec)));
 
 DECLARE_UVERBS_TYPE(uverbs_type_cq,
 		    /* 1 is used in order to free the MR after all the MWs */
 		    &UVERBS_TYPE_ALLOC_IDR_SZ(sizeof(struct ib_ucq_object), 0,
 					      uverbs_free_cq),
-		    /* TODO: implement actions for cq */
-		    NULL);
+		    &UVERBS_ACTIONS(
+			ADD_UVERBS_ACTION(UVERBS_CQ_CREATE,
+					  uverbs_create_cq_handler,
+					  &uverbs_create_cq_spec,
+					  &uverbs_uhw_compat_spec)));
 
 DECLARE_UVERBS_TYPE(uverbs_type_qp,
 		    /* 1 is used in order to free the MR after all the MWs */
 		    &UVERBS_TYPE_ALLOC_IDR_SZ(sizeof(struct ib_uqp_object), 0,
 					      uverbs_free_qp),
-		    /* TODO: implement actions for qp */
-		    NULL);
+		    &UVERBS_ACTIONS(
+			ADD_UVERBS_ACTION(UVERBS_QP_CREATE,
+					  uverbs_create_qp_handler,
+					  &uverbs_create_qp_spec,
+					  &uverbs_uhw_compat_spec),
+			ADD_UVERBS_ACTION(UVERBS_QP_CREATE_XRC_TGT,
+					  uverbs_create_qp_xrc_tgt_handler,
+					  &uverbs_create_qp_xrc_tgt_spec),
+			ADD_UVERBS_ACTION(UVERBS_QP_MODIFY,
+					  uverbs_modify_qp_handler,
+					  &uverbs_modify_qp_spec,
+					  &uverbs_uhw_compat_spec)));
 
 DECLARE_UVERBS_TYPE(uverbs_type_mw,
 		    &UVERBS_TYPE_ALLOC_IDR(1, uverbs_free_mw),
@@ -182,8 +1007,13 @@ DECLARE_UVERBS_TYPE(uverbs_type_mw,
 DECLARE_UVERBS_TYPE(uverbs_type_mr,
 		    /* 1 is used in order to free the MR after all the MWs */
 		    &UVERBS_TYPE_ALLOC_IDR(1, uverbs_free_mr),
-		    /* TODO: implement actions for mr */
-		    NULL);
+		    &UVERBS_ACTIONS(
+			ADD_UVERBS_ACTION(UVERBS_MR_REG, uverbs_reg_mr_handler,
+					  &uverbs_reg_mr_spec,
+					  &uverbs_uhw_compat_spec),
+			ADD_UVERBS_ACTION(UVERBS_MR_DEREG,
+					  uverbs_dereg_mr_handler,
+					  &uverbs_dereg_mr_spec)));
 
 DECLARE_UVERBS_TYPE(uverbs_type_srq,
 		    &UVERBS_TYPE_ALLOC_IDR_SZ(sizeof(struct ib_usrq_object), 0,
@@ -221,12 +1051,22 @@ DECLARE_UVERBS_TYPE(uverbs_type_xrcd,
 DECLARE_UVERBS_TYPE(uverbs_type_pd,
 		    /* 2 is used in order to free the PD after all objects */
 		    &UVERBS_TYPE_ALLOC_IDR(2, uverbs_free_pd),
-		    /* TODO: implement actions for pd */
-		    NULL);
+		    &UVERBS_ACTIONS(
+			ADD_UVERBS_ACTION(UVERBS_PD_ALLOC,
+					  uverbs_alloc_pd_handler,
+					  &uverbs_alloc_pd_spec,
+					  &uverbs_uhw_compat_spec)));
 
 DECLARE_UVERBS_TYPE(uverbs_type_device, NULL,
-		    /* TODO: implement actions for device */
-		    NULL);
+		    &UVERBS_ACTIONS(
+			ADD_UVERBS_CTX_ACTION(UVERBS_DEVICE_ALLOC_CONTEXT,
+					      uverbs_get_context,
+					      &uverbs_get_context_spec,
+					      &uverbs_uhw_compat_spec),
+			ADD_UVERBS_ACTION(UVERBS_DEVICE_QUERY,
+					  &uverbs_query_device_handler,
+					  &uverbs_query_device_spec,
+					  &uverbs_uhw_compat_spec)));
 
 DECLARE_UVERBS_TYPES(uverbs_common_types,
 		     ADD_UVERBS_TYPE(UVERBS_TYPE_DEVICE, uverbs_type_device),
diff --git a/include/rdma/uverbs_ioctl_cmd.h b/include/rdma/uverbs_ioctl_cmd.h
index 614e80c..dee76d70 100644
--- a/include/rdma/uverbs_ioctl_cmd.h
+++ b/include/rdma/uverbs_ioctl_cmd.h
@@ -35,6 +35,13 @@
 
 #include <rdma/uverbs_ioctl.h>
 
+#define IB_UVERBS_VENDOR_FLAG	0x8000
+
+enum {
+	UVERBS_UHW_IN,
+	UVERBS_UHW_OUT,
+};
+
 enum uverbs_common_types {
 	UVERBS_TYPE_DEVICE, /* Don't use IDRs here */
 	UVERBS_TYPE_PD,
@@ -52,6 +59,139 @@ enum uverbs_common_types {
 	UVERBS_TYPE_LAST,
 };
 
+enum uverbs_create_qp_cmd_attr {
+	CREATE_QP_HANDLE,
+	CREATE_QP_PD_HANDLE,
+	CREATE_QP_SEND_CQ,
+	CREATE_QP_RECV_CQ,
+	CREATE_QP_SRQ,
+	CREATE_QP_USER_HANDLE,
+	CREATE_QP_CMD,
+	CREATE_QP_CMD_FLAGS,
+	CREATE_QP_RESP
+};
+
+enum uverbs_create_cq_cmd_attr {
+	CREATE_CQ_HANDLE,
+	CREATE_CQ_CQE,
+	CREATE_CQ_USER_HANDLE,
+	CREATE_CQ_COMP_CHANNEL,
+	CREATE_CQ_COMP_VECTOR,
+	CREATE_CQ_FLAGS,
+	CREATE_CQ_RESP_CQE,
+};
+
+enum uverbs_create_qp_xrc_tgt_cmd_attr {
+	CREATE_QP_XRC_TGT_HANDLE,
+	CREATE_QP_XRC_TGT_XRCD,
+	CREATE_QP_XRC_TGT_USER_HANDLE,
+	CREATE_QP_XRC_TGT_CMD,
+	CREATE_QP_XRC_TGT_CMD_FLAGS,
+	CREATE_QP_XRC_TGT_RESP
+};
+
+enum uverbs_modify_qp_cmd_attr {
+	MODIFY_QP_HANDLE,
+	MODIFY_QP_STATE,
+	MODIFY_QP_CUR_STATE,
+	MODIFY_QP_EN_SQD_ASYNC_NOTIFY,
+	MODIFY_QP_ACCESS_FLAGS,
+	MODIFY_QP_PKEY_INDEX,
+	MODIFY_QP_PORT,
+	MODIFY_QP_QKEY,
+	MODIFY_QP_AV,
+	MODIFY_QP_PATH_MTU,
+	MODIFY_QP_TIMEOUT,
+	MODIFY_QP_RETRY_CNT,
+	MODIFY_QP_RNR_RETRY,
+	MODIFY_QP_RQ_PSN,
+	MODIFY_QP_MAX_RD_ATOMIC,
+	MODIFY_QP_ALT_PATH,
+	MODIFY_QP_MIN_RNR_TIMER,
+	MODIFY_QP_SQ_PSN,
+	MODIFY_QP_MAX_DEST_RD_ATOMIC,
+	MODIFY_QP_PATH_MIG_STATE,
+	MODIFY_QP_DEST_QPN
+};
+
+enum uverbs_create_comp_channel_cmd_attr {
+	CREATE_COMP_CHANNEL_FD,
+};
+
+enum uverbs_get_context {
+	GET_CONTEXT_RESP,
+};
+
+enum uverbs_query_device {
+	QUERY_DEVICE_RESP,
+	QUERY_DEVICE_ODP,
+	QUERY_DEVICE_TIMESTAMP_MASK,
+	QUERY_DEVICE_HCA_CORE_CLOCK,
+	QUERY_DEVICE_CAP_FLAGS,
+};
+
+enum uverbs_alloc_pd {
+	ALLOC_PD_HANDLE,
+};
+
+enum uverbs_reg_mr {
+	REG_MR_HANDLE,
+	REG_MR_PD_HANDLE,
+	REG_MR_CMD,
+	REG_MR_RESP
+};
+
+enum uverbs_dereg_mr {
+	DEREG_MR_HANDLE,
+};
+
+extern const struct uverbs_attr_spec_group uverbs_uhw_compat_spec;
+extern const struct uverbs_attr_spec_group uverbs_get_context_spec;
+extern const struct uverbs_attr_spec_group uverbs_query_device_spec;
+extern const struct uverbs_attr_spec_group uverbs_alloc_pd_spec;
+extern const struct uverbs_attr_spec_group uverbs_reg_mr_spec;
+extern const struct uverbs_attr_spec_group uverbs_dereg_mr_spec;
+
+enum uverbs_actions_mr_ops {
+	UVERBS_MR_REG,
+	UVERBS_MR_DEREG,
+};
+
+extern const struct uverbs_action_group uverbs_actions_mr;
+
+enum uverbs_actions_comp_channel_ops {
+	UVERBS_COMP_CHANNEL_CREATE,
+};
+
+extern const struct uverbs_action_group uverbs_actions_comp_channel;
+
+enum uverbs_actions_cq_ops {
+	UVERBS_CQ_CREATE,
+};
+
+extern const struct uverbs_action_group uverbs_actions_cq;
+
+enum uverbs_actions_qp_ops {
+	UVERBS_QP_CREATE,
+	UVERBS_QP_CREATE_XRC_TGT,
+	UVERBS_QP_MODIFY,
+};
+
+extern const struct uverbs_action_group uverbs_actions_qp;
+
+enum uverbs_actions_pd_ops {
+	UVERBS_PD_ALLOC
+};
+
+extern const struct uverbs_action_group uverbs_actions_pd;
+
+enum uverbs_actions_device_ops {
+	UVERBS_DEVICE_ALLOC_CONTEXT,
+	UVERBS_DEVICE_QUERY,
+};
+
+extern const struct uverbs_action_group uverbs_actions_device;
+
 extern const struct uverbs_type uverbs_type_cq;
 extern const struct uverbs_type uverbs_type_qp;
 extern const struct uverbs_type uverbs_type_rwq_ind_table;
diff --git a/include/uapi/rdma/ib_user_verbs.h b/include/uapi/rdma/ib_user_verbs.h
index 25225eb..0b06c4d 100644
--- a/include/uapi/rdma/ib_user_verbs.h
+++ b/include/uapi/rdma/ib_user_verbs.h
@@ -317,12 +317,25 @@ struct ib_uverbs_reg_mr {
 	__u64 driver_data[0];
 };
 
+struct ib_uverbs_ioctl_reg_mr {
+	__u64 start;
+	__u64 length;
+	__u64 hca_va;
+	__u32 access_flags;
+	__u32 reserved;
+};
+
 struct ib_uverbs_reg_mr_resp {
 	__u32 mr_handle;
 	__u32 lkey;
 	__u32 rkey;
 };
 
+struct ib_uverbs_ioctl_reg_mr_resp {
+	__u32 lkey;
+	__u32 rkey;
+};
+
 struct ib_uverbs_rereg_mr {
 	__u64 response;
 	__u32 mr_handle;
@@ -566,6 +579,16 @@ struct ib_uverbs_ex_create_qp {
 	__u32  reserved1;
 };
 
+struct ib_uverbs_ioctl_create_qp {
+	__u32 max_send_wr;
+	__u32 max_recv_wr;
+	__u32 max_send_sge;
+	__u32 max_recv_sge;
+	__u32 max_inline_data;
+	__u8  sq_sig_all;
+	__u8  qp_type;
+};
+
 struct ib_uverbs_open_qp {
 	__u64 response;
 	__u64 user_handle;
@@ -588,6 +611,15 @@ struct ib_uverbs_create_qp_resp {
 	__u32 reserved;
 };
 
+struct ib_uverbs_ioctl_create_qp_resp {
+	__u32 qpn;
+	__u32 max_send_wr;
+	__u32 max_recv_wr;
+	__u32 max_send_sge;
+	__u32 max_recv_sge;
+	__u32 max_inline_data;
+};
+
 struct ib_uverbs_ex_create_qp_resp {
 	struct ib_uverbs_create_qp_resp base;
 	__u32 comp_mask;
@@ -613,6 +645,13 @@ struct ib_uverbs_qp_dest {
 	__u8  port_num;
 };
 
+struct ib_uverbs_qp_alt_path {
+	struct ib_uverbs_qp_dest dest;
+	__u16 pkey_index;
+	__u8  port_num;
+	__u8  timeout;
+};
+
 struct ib_uverbs_query_qp {
 	__u64 response;
 	__u32 qp_handle;
-- 
1.8.3.1

--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [RFC ABI V6 10/14] IB/core: Add uverbs merge trees functionality
       [not found] ` <1481461088-56355-1-git-send-email-matanb-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
                     ` (8 preceding siblings ...)
  2016-12-11 12:58   ` [RFC ABI V6 09/14] IB/core: Add uverbs types, actions, handlers " Matan Barak
@ 2016-12-11 12:58   ` Matan Barak
  2016-12-11 12:58   ` [RFC ABI V6 11/14] IB/mlx5: Implement common uverb objects Matan Barak
                     ` (4 subsequent siblings)
  14 siblings, 0 replies; 20+ messages in thread
From: Matan Barak @ 2016-12-11 12:58 UTC (permalink / raw)
  To: linux-rdma-u79uwXL29TY76Z2rM5mHXA
  Cc: Doug Ledford, Jason Gunthorpe, Sean Hefty, Christoph Lameter,
	Liran Liss, Haggai Eran, Majd Dibbiny, Matan Barak, Tal Alon,
	Leon Romanovsky

In order to have a more robust query system, we propose a parse
tree that is unique per device and represents only the features
this particular device supports. This is done by having a root
specification tree per feature. Before a device registers itself
as an IB device, it merges all these trees into a single parse
tree, which is then used to parse all user-space commands.
A user-space application could also read this parse tree, as it
describes which types, actions and attributes this device
supports.

This is based on the idea of
Jason Gunthorpe <jgunthorpe-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>

Signed-off-by: Matan Barak <matanb-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
---
 drivers/infiniband/core/Makefile             |   3 +-
 drivers/infiniband/core/uverbs_ioctl_merge.c | 672 +++++++++++++++++++++++++++
 include/rdma/uverbs_ioctl.h                  |   9 +
 3 files changed, 683 insertions(+), 1 deletion(-)
 create mode 100644 drivers/infiniband/core/uverbs_ioctl_merge.c

diff --git a/drivers/infiniband/core/Makefile b/drivers/infiniband/core/Makefile
index 121373c..131ea4b 100644
--- a/drivers/infiniband/core/Makefile
+++ b/drivers/infiniband/core/Makefile
@@ -29,4 +29,5 @@ ib_umad-y :=			user_mad.o
 ib_ucm-y :=			ucm.o
 
 ib_uverbs-y :=			uverbs_main.o uverbs_cmd.o uverbs_marshall.o \
-				rdma_core.o uverbs_ioctl_cmd.o uverbs_ioctl.o
+				rdma_core.o uverbs_ioctl_cmd.o uverbs_ioctl.o \
+				uverbs_ioctl_merge.o
diff --git a/drivers/infiniband/core/uverbs_ioctl_merge.c b/drivers/infiniband/core/uverbs_ioctl_merge.c
new file mode 100644
index 0000000..15eab72
--- /dev/null
+++ b/drivers/infiniband/core/uverbs_ioctl_merge.c
@@ -0,0 +1,672 @@
+/*
+ * Copyright (c) 2016, Mellanox Technologies inc.  All rights reserved.
+ *
+ * This software is available to you under a choice of one of two
+ * licenses.  You may choose to be licensed under the terms of the GNU
+ * General Public License (GPL) Version 2, available from the file
+ * COPYING in the main directory of this source tree, or the
+ * OpenIB.org BSD license below:
+ *
+ *     Redistribution and use in source and binary forms, with or
+ *     without modification, are permitted provided that the following
+ *     conditions are met:
+ *
+ *      - Redistributions of source code must retain the above
+ *        copyright notice, this list of conditions and the following
+ *        disclaimer.
+ *
+ *      - Redistributions in binary form must reproduce the above
+ *        copyright notice, this list of conditions and the following
+ *        disclaimer in the documentation and/or other materials
+ *        provided with the distribution.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
+ * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+ * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
+ * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
+ * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
+ * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
+ * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+ * SOFTWARE.
+ */
+
+#include <rdma/uverbs_ioctl.h>
+#include <linux/bitops.h>
+#include "uverbs.h"
+
+static const struct uverbs_type **get_next_type(const struct uverbs_type_group *types,
+						const struct uverbs_type **type)
+{
+	while (type - types->types < types->num_types && !(*type))
+		type++;
+
+	return type - types->types < types->num_types ? type : NULL;
+}
+
+static const struct uverbs_action **get_next_action(const struct uverbs_action_group *group,
+						    const struct uverbs_action **pcurr)
+{
+	while (pcurr - group->actions < group->num_actions && !(*pcurr))
+		pcurr++;
+
+	return pcurr - group->actions < group->num_actions ? pcurr : NULL;
+}
+
+static const struct uverbs_attr_spec *get_next_attr(const struct uverbs_attr_spec_group *group,
+						    const struct uverbs_attr_spec *pcurr)
+{
+	while (pcurr - group->attrs < group->num_attrs && !pcurr->type)
+		pcurr++;
+
+	return pcurr - group->attrs < group->num_attrs ? pcurr : NULL;
+}
+
+static void _free_attr_spec_group(struct uverbs_attr_spec_group **attr_group,
+				  unsigned int num_groups)
+{
+	unsigned int i;
+
+	for (i = 0; i < num_groups; i++)
+		kfree((void *)attr_group[i]);
+}
+
+static void free_attr_spec_group(struct uverbs_attr_spec_group **attr_group,
+				 unsigned int num_groups)
+{
+	_free_attr_spec_group(attr_group, num_groups);
+	kfree(attr_group);
+}
+
+static int get_attrs_from_trees(const struct uverbs_action **action_arr,
+				   unsigned int elements,
+				   struct uverbs_attr_spec_group ***out)
+{
+	static const unsigned int num_groups =
+		UVERBS_ID_RESERVED_MASK >> UVERBS_ID_RESERVED_SHIFT;
+	unsigned int group_idx;
+	struct uverbs_attr_spec_group *attr_spec_group[num_groups];
+	unsigned int max_action_specs = 0;
+	unsigned int i;
+	int ret;
+
+	for (group_idx = 0; group_idx < num_groups; group_idx++) {
+		const struct uverbs_attr_spec_group *attr_group_trees[elements];
+		unsigned int num_attr_group_trees = 0;
+		const struct uverbs_attr_spec *attr_trees[elements];
+		unsigned int num_attr_groups = 0;
+		unsigned int attrs_in_group = 0;
+		unsigned long *mandatory_attr_mask;
+
+		for (i = 0; i < elements; i++) {
+			const struct uverbs_action *action = action_arr[i];
+
+			if (action->num_groups > group_idx &&
+			    action->attr_groups[group_idx]) {
+				const struct uverbs_attr_spec_group *spec_group =
+					action->attr_groups[group_idx];
+
+				attr_group_trees[num_attr_group_trees++] =
+					spec_group;
+				attr_trees[num_attr_groups++] =
+					spec_group->attrs;
+				if (spec_group->num_attrs > attrs_in_group)
+					attrs_in_group = spec_group->num_attrs;
+			}
+		}
+
+		if (!attrs_in_group) {
+			attr_spec_group[group_idx] = NULL;
+			continue;
+		}
+
+		attr_spec_group[group_idx] =
+			kzalloc(sizeof(*attr_spec_group[group_idx]) +
+				sizeof(struct uverbs_attr_spec) * attrs_in_group +
+				sizeof(unsigned long) * BITS_TO_LONGS(attrs_in_group),
+				GFP_KERNEL);
+		if (!attr_spec_group[group_idx]) {
+			ret = -ENOMEM;
+			goto free_groups;
+		}
+
+		attr_spec_group[group_idx]->attrs =
+			(void *)(attr_spec_group[group_idx] + 1);
+		attr_spec_group[group_idx]->num_attrs = attrs_in_group;
+		attr_spec_group[group_idx]->mandatory_attrs_bitmask =
+			(void *)(attr_spec_group[group_idx]->attrs + attrs_in_group);
+		mandatory_attr_mask =
+			attr_spec_group[group_idx]->mandatory_attrs_bitmask;
+
+		do {
+			unsigned int tree_idx;
+			bool found_next = false;
+			unsigned int attr_trees_idx[num_attr_groups];
+			unsigned int min_attr = INT_MAX;
+			const struct uverbs_attr_spec *single_attr_trees[num_attr_groups];
+			unsigned int num_single_attr_trees = 0;
+			unsigned int num_attr_trees = 0;
+			struct uverbs_attr_spec *allocated_attr;
+			enum uverbs_attr_type cur_type = UVERBS_ATTR_TYPE_NA;
+			unsigned int attr_type_idx = 0;
+
+			for (tree_idx = 0; tree_idx < num_attr_group_trees;
+			     tree_idx++) {
+				const struct uverbs_attr_spec *next =
+					get_next_attr(attr_group_trees[tree_idx],
+						      attr_trees[tree_idx]);
+
+				if (next) {
+					found_next = true;
+					attr_trees[num_attr_trees] = next;
+					attr_trees_idx[num_attr_trees] =
+						next - attr_group_trees[tree_idx]->attrs;
+					if (min_attr > attr_trees_idx[num_attr_trees])
+						min_attr = attr_trees_idx[num_attr_trees];
+					num_attr_trees++;
+				}
+			}
+
+			if (!found_next)
+				break;
+
+			max_action_specs = group_idx + 1;
+
+			allocated_attr =
+				attr_spec_group[group_idx]->attrs + min_attr;
+
+			for (i = 0; i < num_attr_trees; i++) {
+				if (attr_trees_idx[i] == min_attr) {
+					single_attr_trees[num_single_attr_trees++] =
+						attr_trees[i];
+					attr_trees[i]++;
+				}
+			}
+
+			for (i = 0; i < num_single_attr_trees; i++)
+				switch (cur_type) {
+				case UVERBS_ATTR_TYPE_NA:
+					cur_type = single_attr_trees[i]->type;
+					attr_type_idx = i;
+					continue;
+				case UVERBS_ATTR_TYPE_PTR_IN:
+				case UVERBS_ATTR_TYPE_PTR_OUT:
+				case UVERBS_ATTR_TYPE_IDR:
+				case UVERBS_ATTR_TYPE_FD:
+					if (single_attr_trees[i]->type !=
+					    UVERBS_ATTR_TYPE_NA)
+						WARN(1, "uverbs_merge: Two types for the same attribute\n");
+					break;
+				case UVERBS_ATTR_TYPE_FLAG:
+					if (single_attr_trees[i]->type !=
+					    UVERBS_ATTR_TYPE_FLAG &&
+					    single_attr_trees[i]->type !=
+					    UVERBS_ATTR_TYPE_NA)
+						WARN(1, "uverbs_merge: Two types for the same attribute\n");
+					break;
+				default:
+					WARN(1, "uverbs_merge: Unknown attribute type given\n");
+				}
+
+			switch (cur_type) {
+			case UVERBS_ATTR_TYPE_PTR_IN:
+			case UVERBS_ATTR_TYPE_PTR_OUT:
+			case UVERBS_ATTR_TYPE_IDR:
+			case UVERBS_ATTR_TYPE_FD:
+				/* PTR_IN and PTR_OUT can't be merged between trees */
+				memcpy(allocated_attr,
+				       single_attr_trees[attr_type_idx],
+				       sizeof(*allocated_attr));
+				break;
+			case UVERBS_ATTR_TYPE_FLAG:
+				allocated_attr->type =
+					UVERBS_ATTR_TYPE_FLAG;
+				allocated_attr->flags = 0;
+				allocated_attr->flag.mask = 0;
+				for (i = 0; i < num_single_attr_trees; i++) {
+					allocated_attr->flags |=
+						single_attr_trees[i]->flags;
+					allocated_attr->flag.mask |=
+						single_attr_trees[i]->flag.mask;
+				}
+				break;
+			default:
+				kfree(attr_spec_group[group_idx]);
+				ret = -EINVAL;
+				goto free_groups;
+			}
+
+			if (allocated_attr->flags & UVERBS_ATTR_SPEC_F_MANDATORY)
+				set_bit(min_attr, mandatory_attr_mask);
+		} while (1);
+	}
+
+	*out = kcalloc(max_action_specs, sizeof(struct uverbs_attr_spec_group *),
+		       GFP_KERNEL);
+	if (!(*out)) {
+		ret = -ENOMEM;
+		goto free_groups;
+	}
+
+	for (group_idx = 0; group_idx < max_action_specs; group_idx++)
+		(*out)[group_idx] = attr_spec_group[group_idx];
+
+	return max_action_specs;
+
+free_groups:
+	_free_attr_spec_group(attr_spec_group, group_idx);
+
+	return ret;
+}
+
+struct action_alloc_list {
+	struct uverbs_action	action;
+	unsigned int		action_idx;
+	/* next is used in order to construct the group later on */
+	struct list_head	list;
+};
+
+static void _free_type_actions_group(struct uverbs_action_group **action_groups,
+				     unsigned int num_groups)
+{
+	unsigned int i, j;
+
+	for (i = 0; i < num_groups; i++) {
+		if (!action_groups[i])
+			continue;
+
+		for (j = 0; j < action_groups[i]->num_actions; j++) {
+			if (!action_groups[i]->actions[j]->attr_groups)
+				continue;
+
+			free_attr_spec_group((struct uverbs_attr_spec_group **)
+					     action_groups[i]->actions[j]->attr_groups,
+					     action_groups[i]->actions[j]->num_groups);
+			kfree((void *)action_groups[i]->actions[j]);
+		}
+		kfree(action_groups[i]);
+	}
+}
+
+static void free_type_actions_group(struct uverbs_action_group **action_groups,
+				    unsigned int num_groups)
+{
+	_free_type_actions_group(action_groups, num_groups);
+	kfree(action_groups);
+}
+
+static int get_actions_from_trees(const struct uverbs_type **type_arr,
+				  unsigned int elements,
+				  struct uverbs_action_group ***out)
+{
+	static const unsigned int num_groups =
+		UVERBS_ID_RESERVED_MASK >> UVERBS_ID_RESERVED_SHIFT;
+	unsigned int group_idx;
+	struct uverbs_action_group  *action_groups[num_groups];
+	unsigned int max_action_groups = 0;
+	struct uverbs_action_group **allocated_type_actions_group = NULL;
+	int i;
+
+	for (group_idx = 0; group_idx < num_groups; group_idx++) {
+		const struct uverbs_action_group *actions_group_trees[elements];
+		unsigned int num_actions_group_trees = 0;
+		const struct uverbs_action **action_trees[elements];
+		unsigned int num_action_trees = 0;
+		unsigned int actions_in_group = 0;
+		LIST_HEAD(allocated_group_list);
+
+		for (i = 0; i < elements; i++) {
+			if (type_arr[i]->num_groups > group_idx &&
+			    type_arr[i]->action_groups[group_idx]) {
+				actions_group_trees[num_actions_group_trees++] =
+					type_arr[i]->action_groups[group_idx];
+				action_trees[num_action_trees++] =
+					type_arr[i]->action_groups[group_idx]->actions;
+			}
+		}
+
+		do {
+			unsigned int tree_idx;
+			bool found_next = false;
+			unsigned int action_trees_idx[num_action_trees];
+			unsigned int min_action = INT_MAX;
+			const struct uverbs_action *single_action_trees[num_action_trees];
+			unsigned int num_single_action_trees = 0;
+			unsigned int num_action_trees = 0;
+			struct action_alloc_list *allocated_action = NULL;
+			int ret;
+
+			for (tree_idx = 0; tree_idx < num_actions_group_trees;
+			     tree_idx++) {
+				const struct uverbs_action **next =
+					get_next_action(actions_group_trees[tree_idx],
+							action_trees[tree_idx]);
+
+				if (!next)
+					continue;
+
+				found_next = true;
+				action_trees[num_action_trees] = next;
+				action_trees_idx[num_action_trees] =
+					next - actions_group_trees[tree_idx]->actions;
+				if (min_action > action_trees_idx[num_action_trees])
+					min_action = action_trees_idx[num_action_trees];
+				num_action_trees++;
+			}
+
+			if (!found_next)
+				break;
+
+			for (i = 0; i < num_action_trees; i++) {
+				if (action_trees_idx[i] == min_action) {
+					single_action_trees[num_single_action_trees++] =
+						*action_trees[i];
+					action_trees[i]++;
+				}
+			}
+
+			actions_in_group = min_action + 1;
+
+			/* Now we have an array of all attributes of the same actions */
+			allocated_action = kmalloc(sizeof(*allocated_action),
+						   GFP_KERNEL);
+			if (!allocated_action)
+				goto free_list;
+
+			/* Take the last tree which is parameter != NULL */
+			for (i = num_single_action_trees - 1;
+			     i >= 0 && !single_action_trees[i]->handler; i--)
+				;
+			if (WARN_ON(i < 0)) {
+				allocated_action->action.flags = 0;
+				allocated_action->action.handler = NULL;
+			} else {
+				allocated_action->action.flags =
+					single_action_trees[i]->flags;
+				allocated_action->action.handler =
+					single_action_trees[i]->handler;
+			}
+			allocated_action->action.num_child_attrs = 0;
+
+			ret = get_attrs_from_trees(single_action_trees,
+						   num_single_action_trees,
+						   (struct uverbs_attr_spec_group ***)
+						   &allocated_action->action.attr_groups);
+			if (ret < 0) {
+				kfree(allocated_action);
+				goto free_list;
+			}
+
+			allocated_action->action.num_groups = ret;
+
+			for (i = 0; i < allocated_action->action.num_groups;
+			     allocated_action->action.num_child_attrs +=
+				allocated_action->action.attr_groups[i]->num_attrs, i++)
+				;
+
+			allocated_action->action_idx = min_action;
+			list_add_tail(&allocated_action->list,
+				      &allocated_group_list);
+		} while (1);
+
+		if (!actions_in_group) {
+			action_groups[group_idx] = NULL;
+			continue;
+		}
+
+		action_groups[group_idx] =
+			kzalloc(sizeof(*action_groups[group_idx]) +
+				sizeof(struct uverbs_action *) * actions_in_group,
+				GFP_KERNEL);
+
+		if (!action_groups[group_idx])
+			goto free_list;
+
+		action_groups[group_idx]->num_actions = actions_in_group;
+		action_groups[group_idx]->actions =
+			(void *)(action_groups[group_idx] + 1);
+		{
+			struct action_alloc_list *iter;
+
+			list_for_each_entry(iter, &allocated_group_list, list)
+				action_groups[group_idx]->actions[iter->action_idx] =
+					(const struct uverbs_action *)&iter->action;
+		}
+
+		max_action_groups = group_idx + 1;
+
+		continue;
+
+free_list:
+		{
+			struct action_alloc_list *iter, *tmp;
+
+			list_for_each_entry_safe(iter, tmp,
+						 &allocated_group_list, list)
+				kfree(iter);
+
+			goto free_groups;
+		}
+	}
+
+	allocated_type_actions_group =
+		kmalloc(sizeof(*allocated_type_actions_group) * max_action_groups,
+			GFP_KERNEL);
+	if (!allocated_type_actions_group)
+		goto free_groups;
+
+	memcpy(allocated_type_actions_group, action_groups,
+	       sizeof(*allocated_type_actions_group) * max_action_groups);
+
+	*out = allocated_type_actions_group;
+
+	return max_action_groups;
+
+free_groups:
+	_free_type_actions_group(action_groups, max_action_groups);
+
+	return -ENOMEM;
+}
+
+struct type_alloc_list {
+	struct uverbs_type	type;
+	unsigned int		type_idx;
+	/* next is used in order to construct the group later on */
+	struct list_head	list;
+};
+
+static void _free_types(struct uverbs_type_group **types, unsigned int num_types)
+{
+	unsigned int i, j;
+
+	for (i = 0; i < num_types; i++) {
+		if (!types[i])
+			continue;
+
+		for (j = 0; j < types[i]->num_types; j++) {
+			if (!types[i]->types[j])
+				continue;
+
+			free_type_actions_group((struct uverbs_action_group **)
+						types[i]->types[j]->action_groups,
+						types[i]->types[j]->num_groups);
+			kfree((void *)types[i]->types[j]);
+		}
+		kfree(types[i]);
+	}
+}
+
+struct uverbs_root *uverbs_alloc_spec_tree(unsigned int num_trees,
+					   const struct uverbs_root_spec *trees)
+{
+	static const unsigned int num_groups =
+		UVERBS_ID_RESERVED_MASK >> UVERBS_ID_RESERVED_SHIFT;
+	unsigned int group_idx;
+	struct uverbs_type_group *types_groups[num_groups];
+	unsigned int max_types_groups = 0;
+	struct uverbs_root *allocated_types_group = NULL;
+	int i;
+
+	memset(types_groups, 0, sizeof(types_groups));
+
+	for (group_idx = 0; group_idx < num_groups; group_idx++) {
+		const struct uverbs_type **type_trees[num_trees];
+		unsigned int types_in_group = 0;
+		LIST_HEAD(allocated_group_list);
+
+		for (i = 0; i < num_trees; i++)
+			type_trees[i] = trees[i].types->types;
+
+		do {
+			const struct uverbs_type *curr_type[num_trees];
+			unsigned int type_trees_idx[num_trees];
+			unsigned int trees_for_curr_type = 0;
+			unsigned int min_type = INT_MAX;
+			unsigned int types_idx = 0;
+			bool found_next = false;
+			unsigned int tree_idx;
+			int res;
+			struct type_alloc_list *allocated_type = NULL;
+
+			for (tree_idx = 0; tree_idx < num_trees; tree_idx++) {
+				if (trees[tree_idx].group_id == group_idx) {
+					const struct uverbs_type **next =
+						get_next_type(trees[tree_idx].types,
+							      type_trees[tree_idx]);
+
+					if (!next)
+						continue;
+
+					found_next = true;
+					type_trees[types_idx] = next;
+					type_trees_idx[types_idx] =
+						next - trees[tree_idx].types->types;
+					if (min_type > type_trees_idx[types_idx])
+						min_type = type_trees_idx[types_idx];
+					types_idx++;
+				}
+			}
+
+			if (!found_next)
+				break;
+
+			max_types_groups = group_idx + 1;
+
+			for (i = 0; i < types_idx; i++)
+				/*
+				 * We must have at least one hit here,
+				 * as we found this min type
+				 */
+				if (type_trees_idx[i] == min_type) {
+					curr_type[trees_for_curr_type++] =
+						*type_trees[i];
+					type_trees[i]++;
+				}
+
+			types_in_group = min_type + 1;
+
+			/*
+			 * Do things for type:
+			 * 1. Get action_groups and num_group.
+			 * 2. Allocate uverbs_type. Copy alloc pointer
+			 *      (shallow copy) and fill in num_groups and
+			 *      action_groups.
+			 *      In order to hash them, allocate a struct of
+			 *      {uverbs_type, list_head}
+			 * 3. Put that pointer in types_group[group_idx].
+			 */
+			allocated_type = kmalloc(sizeof(*allocated_type),
+						 GFP_KERNEL);
+			if (!allocated_type)
+				goto free_list;
+
+			/* Take the last tree which is parameter != NULL */
+			for (i = trees_for_curr_type - 1;
+			     i >= 0 && !curr_type[i]->alloc; i--)
+				;
+			if (i < 0)
+				allocated_type->type.alloc = NULL;
+			else
+				allocated_type->type.alloc = curr_type[i]->alloc;
+
+			res = get_actions_from_trees(curr_type,
+						     trees_for_curr_type,
+						     (struct uverbs_action_group ***)
+						     &allocated_type->type.action_groups);
+			if (res < 0) {
+				kfree(allocated_type);
+				goto free_list;
+			}
+
+			allocated_type->type.num_groups = res;
+			allocated_type->type_idx = min_type;
+			list_add_tail(&allocated_type->list,
+				      &allocated_group_list);
+		} while (1);
+
+		if (!types_in_group) {
+			types_groups[group_idx] = NULL;
+			continue;
+		}
+
+		types_groups[group_idx] = kzalloc(sizeof(*types_groups[group_idx]) +
+						  sizeof(struct uverbs_type *) * types_in_group,
+						  GFP_KERNEL);
+		if (!types_groups[group_idx])
+			goto free_list;
+
+		types_groups[group_idx]->num_types = types_in_group;
+		types_groups[group_idx]->types =
+			(void *)(types_groups[group_idx] + 1);
+		{
+			struct type_alloc_list *iter;
+
+			list_for_each_entry(iter, &allocated_group_list, list)
+				types_groups[group_idx]->types[iter->type_idx] =
+					(const struct uverbs_type *)&iter->type;
+		}
+
+		continue;
+
+free_list:
+		{
+			struct type_alloc_list *iter, *tmp;
+
+			list_for_each_entry_safe(iter, tmp,
+						 &allocated_group_list, list)
+				kfree(iter);
+
+			goto free_groups;
+		}
+	}
+
+	/*
+	 * 1. Allocate struct uverbs_root + space for type_groups array.
+	 * 2. Fill it with types_group
+	 *	memcpy(allocated_space + 1, types_group,
+	 *	       sizeof(types_group[0]) * max_types_groups)
+	 * 3. If anything fails goto free_groups;
+	 */
+	allocated_types_group =
+		kmalloc(sizeof(*allocated_types_group) +
+			sizeof(*allocated_types_group->type_groups) * max_types_groups,
+			GFP_KERNEL);
+	if (!allocated_types_group)
+		goto free_groups;
+
+	allocated_types_group->type_groups = (void *)(allocated_types_group + 1);
+	memcpy(allocated_types_group->type_groups, types_groups,
+	       sizeof(*allocated_types_group->type_groups) * max_types_groups);
+	allocated_types_group->num_groups = max_types_groups;
+
+	return allocated_types_group;
+
+free_groups:
+	_free_types(types_groups, max_types_groups);
+
+	return ERR_PTR(-ENOMEM);
+}
+EXPORT_SYMBOL(uverbs_alloc_spec_tree);
+
+void uverbs_specs_free(struct uverbs_root *root)
+{
+	_free_types((struct uverbs_type_group **)root->type_groups,
+		    root->num_groups);
+	kfree(root);
+}
+EXPORT_SYMBOL(uverbs_specs_free);
+
diff --git a/include/rdma/uverbs_ioctl.h b/include/rdma/uverbs_ioctl.h
index 5340673..a5998e5 100644
--- a/include/rdma/uverbs_ioctl.h
+++ b/include/rdma/uverbs_ioctl.h
@@ -368,4 +368,13 @@ int ib_uverbs_uobject_type_add(struct list_head	*head,
 			       uint16_t	obj_type);
 void ib_uverbs_uobject_types_remove(struct ib_device *ib_dev);
 
+struct uverbs_root_spec {
+	const struct uverbs_type_group	*types;
+	u8				group_id;
+};
+
+struct uverbs_root *uverbs_alloc_spec_tree(unsigned int num_trees,
+					   const struct uverbs_root_spec *trees);
+void uverbs_specs_free(struct uverbs_root *root);
+
 #endif
-- 
1.8.3.1

--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply related	[flat|nested] 20+ messages in thread
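[Editorial sketch] The allocation pattern used throughout the patch above — a header struct and its variable-length pointer array carved out of a single allocation, e.g. kzalloc(sizeof(*types_groups[group_idx]) + sizeof(struct uverbs_type *) * types_in_group) followed by types = (void *)(types_groups[group_idx] + 1) — can be modeled in plain userspace C. The struct and function names below are invented for illustration, not the kernel symbols:

```c
#include <assert.h>
#include <stdlib.h>

/* Toy userspace model of the kernel's single-allocation pattern:
 * the struct header and its trailing pointer array share one
 * allocation, mirroring kzalloc(sizeof(*group) + n * sizeof(ptr)). */
struct toy_type_group {
	size_t num_types;
	const void **types;	/* points just past the struct itself */
};

static struct toy_type_group *toy_group_alloc(size_t num_types)
{
	struct toy_type_group *g;

	g = calloc(1, sizeof(*g) + sizeof(const void *) * num_types);
	if (!g)
		return NULL;
	g->num_types = num_types;
	/* the trailing array lives directly after the header */
	g->types = (const void **)(g + 1);
	return g;
}
```

A single free(g) (kfree() in the kernel) then releases the header and the array together, which is why the error paths above need only one kfree() per group.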

* [RFC ABI V6 11/14] IB/mlx5: Implement common uverb objects
       [not found] ` <1481461088-56355-1-git-send-email-matanb-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
                     ` (9 preceding siblings ...)
  2016-12-11 12:58   ` [RFC ABI V6 10/14] IB/core: Add uverbs merge trees functionality Matan Barak
@ 2016-12-11 12:58   ` Matan Barak
  2016-12-11 12:58   ` [RFC ABI V6 12/14] IB/{core,mlx5}: Support uhw definition per driver Matan Barak
                     ` (3 subsequent siblings)
  14 siblings, 0 replies; 20+ messages in thread
From: Matan Barak @ 2016-12-11 12:58 UTC (permalink / raw)
  To: linux-rdma-u79uwXL29TY76Z2rM5mHXA
  Cc: Doug Ledford, Jason Gunthorpe, Sean Hefty, Christoph Lameter,
	Liran Liss, Haggai Eran, Majd Dibbiny, Matan Barak, Tal Alon,
	Leon Romanovsky

This patch simply tells mlx5 to use the uverbs objects declared by
the common layer.

Signed-off-by: Matan Barak <matanb-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
---
 drivers/infiniband/hw/mlx5/main.c | 20 ++++++++++++++++----
 1 file changed, 16 insertions(+), 4 deletions(-)

diff --git a/drivers/infiniband/hw/mlx5/main.c b/drivers/infiniband/hw/mlx5/main.c
index a10b203..4e739e7 100644
--- a/drivers/infiniband/hw/mlx5/main.c
+++ b/drivers/infiniband/hw/mlx5/main.c
@@ -51,6 +51,7 @@
 #include <linux/list.h>
 #include <rdma/ib_smi.h>
 #include <rdma/ib_umem.h>
+#include <rdma/uverbs_ioctl_cmd.h>
 #include <linux/in.h>
 #include <linux/etherdevice.h>
 #include <linux/mlx5/fs.h>
@@ -2919,8 +2920,6 @@ free:
 	return ARRAY_SIZE(names);
 }
 
-DECLARE_UVERBS_TYPES_GROUP(root, &uverbs_common_types);
-
 static void *mlx5_ib_add(struct mlx5_core_dev *mdev)
 {
 	struct mlx5_ib_dev *dev;
@@ -2929,6 +2928,10 @@ static void *mlx5_ib_add(struct mlx5_core_dev *mdev)
 	const char *name;
 	int err;
 	int i;
+	static const struct uverbs_root_spec root_spec[] = {
+		[0] = {.types = &uverbs_common_types,
+			.group_id = 0},
+	};
 
 	port_type_cap = MLX5_CAP_GEN(mdev, port_type);
 	ll = mlx5_port_type_cap_to_rdma_ll(port_type_cap);
@@ -3131,10 +3134,15 @@ static void *mlx5_ib_add(struct mlx5_core_dev *mdev)
 	if (err)
 		goto err_odp;
 
-	dev->ib_dev.specs_root = (struct uverbs_root *)&root;
+	dev->ib_dev.specs_root =
+		uverbs_alloc_spec_tree(ARRAY_SIZE(root_spec),
+				       root_spec);
+	if (IS_ERR(dev->ib_dev.specs_root))
+		goto err_q_cnt;
+
 	err = ib_register_device(&dev->ib_dev, NULL);
 	if (err)
-		goto err_q_cnt;
+		goto err_alloc_spec_tree;
 
 	err = create_umr_res(dev);
 	if (err)
@@ -3157,6 +3165,9 @@ err_umrc:
 err_dev:
 	ib_unregister_device(&dev->ib_dev);
 
+err_alloc_spec_tree:
+	uverbs_specs_free(dev->ib_dev.specs_root);
+
 err_q_cnt:
 	mlx5_ib_dealloc_q_counters(dev);
 
@@ -3188,6 +3199,7 @@ static void mlx5_ib_remove(struct mlx5_core_dev *mdev, void *context)
 
 	mlx5_remove_roce_notifier(dev);
 	ib_unregister_device(&dev->ib_dev);
+	uverbs_specs_free(dev->ib_dev.specs_root);
 	mlx5_ib_dealloc_q_counters(dev);
 	destroy_umrc_res(dev);
 	mlx5_ib_odp_remove_one(dev);
-- 
1.8.3.1


^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [RFC ABI V6 12/14] IB/{core,mlx5}: Support uhw definition per driver
       [not found] ` <1481461088-56355-1-git-send-email-matanb-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
                     ` (10 preceding siblings ...)
  2016-12-11 12:58   ` [RFC ABI V6 11/14] IB/mlx5: Implement common uverb objects Matan Barak
@ 2016-12-11 12:58   ` Matan Barak
  2016-12-11 12:58   ` [RFC ABI V6 13/14] IB/core: Support getting IOCTL header/SGEs from kernel space Matan Barak
                     ` (2 subsequent siblings)
  14 siblings, 0 replies; 20+ messages in thread
From: Matan Barak @ 2016-12-11 12:58 UTC (permalink / raw)
  To: linux-rdma-u79uwXL29TY76Z2rM5mHXA
  Cc: Doug Ledford, Jason Gunthorpe, Sean Hefty, Christoph Lameter,
	Liran Liss, Haggai Eran, Majd Dibbiny, Matan Barak, Tal Alon,
	Leon Romanovsky

Instead of supporting a generic, unknown size for the uhw parameters
(the driver-specific uhw_in and uhw_out) via the common attributes,
declare the driver-specific sizes in a per-driver tree.

When we initialize the parsing tree, this per-driver tree is merged
with the common parsing tree, creating the driver-specific tree.
This tree could add driver-specific types, actions and other
attributes as well.
We propose creating a way of passing the merged parsing tree to
user-space. That way, user-space could query which features the
driver supports.
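[Editorial sketch] The merge rule described above — a later (driver) tree overriding or extending the common tree — can be illustrated with a minimal userspace model. merged_alloc() and the marker values are hypothetical stand-ins for the per-type merge done by uverbs_alloc_spec_tree(), not kernel code:

```c
#include <assert.h>
#include <stddef.h>

/* Example per-tree allocation specs; in the kernel these would be
 * the alloc pointers contributed by the common and driver trees. */
static const char common_alloc_marker;
static const char driver_alloc_marker;

/* Pick the alloc spec from the last tree that defines one, matching
 * the "take the last tree whose alloc != NULL" rule in the merge code. */
static const void *merged_alloc(const void *const *tree_allocs, int n)
{
	int i;

	for (i = n - 1; i >= 0; i--)
		if (tree_allocs[i])
			return tree_allocs[i];
	return NULL;
}
```

With { &common_alloc_marker, NULL } the common spec survives; with { &common_alloc_marker, &driver_alloc_marker } the driver tree wins.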

Signed-off-by: Matan Barak <matanb-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
---
 drivers/infiniband/core/uverbs_ioctl_cmd.c | 21 +++------
 drivers/infiniband/hw/mlx5/Makefile        |  2 +-
 drivers/infiniband/hw/mlx5/main.c          |  2 +
 drivers/infiniband/hw/mlx5/mlx5_ib.h       |  2 +
 drivers/infiniband/hw/mlx5/uverbs_tree.c   | 68 ++++++++++++++++++++++++++++++
 5 files changed, 80 insertions(+), 15 deletions(-)
 create mode 100644 drivers/infiniband/hw/mlx5/uverbs_tree.c

diff --git a/drivers/infiniband/core/uverbs_ioctl_cmd.c b/drivers/infiniband/core/uverbs_ioctl_cmd.c
index 5ab6189..ef620b4 100644
--- a/drivers/infiniband/core/uverbs_ioctl_cmd.c
+++ b/drivers/infiniband/core/uverbs_ioctl_cmd.c
@@ -978,8 +978,7 @@ DECLARE_UVERBS_TYPE(uverbs_type_cq,
 		    &UVERBS_ACTIONS(
 			ADD_UVERBS_ACTION(UVERBS_CQ_CREATE,
 					  uverbs_create_cq_handler,
-					  &uverbs_create_cq_spec,
-					  &uverbs_uhw_compat_spec)));
+					  &uverbs_create_cq_spec)));
 
 DECLARE_UVERBS_TYPE(uverbs_type_qp,
 		    /* 1 is used in order to free the MR after all the MWs */
@@ -988,15 +987,13 @@ DECLARE_UVERBS_TYPE(uverbs_type_qp,
 		    &UVERBS_ACTIONS(
 			ADD_UVERBS_ACTION(UVERBS_QP_CREATE,
 					  uverbs_create_qp_handler,
-					  &uverbs_create_qp_spec,
-					  &uverbs_uhw_compat_spec),
+					  &uverbs_create_qp_spec),
 			ADD_UVERBS_ACTION(UVERBS_QP_CREATE_XRC_TGT,
 					  uverbs_create_qp_xrc_tgt_handler,
 					  &uverbs_create_qp_xrc_tgt_spec),
 			ADD_UVERBS_ACTION(UVERBS_QP_MODIFY,
 					  uverbs_modify_qp_handler,
-					  &uverbs_modify_qp_spec,
-					  &uverbs_uhw_compat_spec)),
+					  &uverbs_modify_qp_spec)),
 );
 
 DECLARE_UVERBS_TYPE(uverbs_type_mw,
@@ -1009,8 +1006,7 @@ DECLARE_UVERBS_TYPE(uverbs_type_mr,
 		    &UVERBS_TYPE_ALLOC_IDR(1, uverbs_free_mr),
 		    &UVERBS_ACTIONS(
 			ADD_UVERBS_ACTION(UVERBS_MR_REG, uverbs_reg_mr_handler,
-					  &uverbs_reg_mr_spec,
-					  &uverbs_uhw_compat_spec),
+					  &uverbs_reg_mr_spec),
 			ADD_UVERBS_ACTION(UVERBS_MR_DEREG,
 					  uverbs_dereg_mr_handler,
 					  &uverbs_dereg_mr_spec)));
@@ -1054,19 +1050,16 @@ DECLARE_UVERBS_TYPE(uverbs_type_pd,
 		    &UVERBS_ACTIONS(
 			ADD_UVERBS_ACTION(UVERBS_PD_ALLOC,
 					  uverbs_alloc_pd_handler,
-					  &uverbs_alloc_pd_spec,
-					  &uverbs_uhw_compat_spec)));
+					  &uverbs_alloc_pd_spec)));
 
 DECLARE_UVERBS_TYPE(uverbs_type_device, NULL,
 		    &UVERBS_ACTIONS(
 			ADD_UVERBS_CTX_ACTION(UVERBS_DEVICE_ALLOC_CONTEXT,
 					      uverbs_get_context,
-					      &uverbs_get_context_spec,
-					      &uverbs_uhw_compat_spec),
+					      &uverbs_get_context_spec),
 			ADD_UVERBS_ACTION(UVERBS_DEVICE_QUERY,
 					  &uverbs_query_device_handler,
-					  &uverbs_query_device_spec,
-					  &uverbs_uhw_compat_spec)));
+					  &uverbs_query_device_spec)));
 
 DECLARE_UVERBS_TYPES(uverbs_common_types,
 		     ADD_UVERBS_TYPE(UVERBS_TYPE_DEVICE, uverbs_type_device),
diff --git a/drivers/infiniband/hw/mlx5/Makefile b/drivers/infiniband/hw/mlx5/Makefile
index 7493a83..aa035bb 100644
--- a/drivers/infiniband/hw/mlx5/Makefile
+++ b/drivers/infiniband/hw/mlx5/Makefile
@@ -1,4 +1,4 @@
 obj-$(CONFIG_MLX5_INFINIBAND)	+= mlx5_ib.o
 
-mlx5_ib-y :=	main.o cq.o doorbell.o qp.o mem.o srq.o mr.o ah.o mad.o gsi.o ib_virt.o
+mlx5_ib-y :=	main.o cq.o doorbell.o qp.o mem.o srq.o mr.o ah.o mad.o gsi.o ib_virt.o uverbs_tree.o
 mlx5_ib-$(CONFIG_INFINIBAND_ON_DEMAND_PAGING) += odp.o
diff --git a/drivers/infiniband/hw/mlx5/main.c b/drivers/infiniband/hw/mlx5/main.c
index 4e739e7..26beb6f 100644
--- a/drivers/infiniband/hw/mlx5/main.c
+++ b/drivers/infiniband/hw/mlx5/main.c
@@ -2931,6 +2931,8 @@ static void *mlx5_ib_add(struct mlx5_core_dev *mdev)
 	static const struct uverbs_root_spec root_spec[] = {
 		[0] = {.types = &uverbs_common_types,
 			.group_id = 0},
+		[1] = {.types = &mlx5_common_types,
+			.group_id = 0},
 	};
 
 	port_type_cap = MLX5_CAP_GEN(mdev, port_type);
diff --git a/drivers/infiniband/hw/mlx5/mlx5_ib.h b/drivers/infiniband/hw/mlx5/mlx5_ib.h
index 1df8a67..bb12c66 100644
--- a/drivers/infiniband/hw/mlx5/mlx5_ib.h
+++ b/drivers/infiniband/hw/mlx5/mlx5_ib.h
@@ -45,6 +45,7 @@
 #include <linux/mlx5/transobj.h>
 #include <rdma/ib_user_verbs.h>
 #include <rdma/mlx5-abi.h>
+#include <rdma/uverbs_ioctl.h>
 
 #define mlx5_ib_dbg(dev, format, arg...)				\
 pr_debug("%s:%s:%d:(pid %d): " format, (dev)->ib_dev.name, __func__,	\
@@ -906,6 +907,7 @@ int mlx5_ib_gsi_post_recv(struct ib_qp *qp, struct ib_recv_wr *wr,
 void mlx5_ib_gsi_pkey_change(struct mlx5_ib_gsi_qp *gsi);
 
 int mlx5_ib_generate_wc(struct ib_cq *ibcq, struct ib_wc *wc);
+extern const struct uverbs_type_group mlx5_common_types;
 
 static inline void init_query_mad(struct ib_smp *mad)
 {
diff --git a/drivers/infiniband/hw/mlx5/uverbs_tree.c b/drivers/infiniband/hw/mlx5/uverbs_tree.c
new file mode 100644
index 0000000..704b177
--- /dev/null
+++ b/drivers/infiniband/hw/mlx5/uverbs_tree.c
@@ -0,0 +1,68 @@
+#include <rdma/uverbs_ioctl.h>
+#include <rdma/uverbs_ioctl_cmd.h>
+#include <rdma/mlx5-abi.h>
+#include "mlx5_ib.h"
+
+DECLARE_UVERBS_ATTR_SPEC(
+	mlx5_spec_create_qp,
+	UVERBS_ATTR_PTR_IN_SZ(UVERBS_UHW_IN, 0,
+			      UA_FLAGS(UVERBS_ATTR_SPEC_F_MIN_SZ)),
+	UVERBS_ATTR_PTR_OUT_SZ(UVERBS_UHW_OUT, 0,
+			       UA_FLAGS(UVERBS_ATTR_SPEC_F_MIN_SZ)));
+
+DECLARE_UVERBS_ATTR_SPEC(
+	mlx5_spec_create_cq,
+	UVERBS_ATTR_PTR_IN_SZ(UVERBS_UHW_IN,
+			      offsetof(struct mlx5_ib_create_cq, reserved),
+			      UA_FLAGS(UVERBS_ATTR_SPEC_F_MANDATORY |
+				       UVERBS_ATTR_SPEC_F_MIN_SZ)),
+	UVERBS_ATTR_PTR_OUT(UVERBS_UHW_OUT, __u32,
+			    UA_FLAGS(UVERBS_ATTR_SPEC_F_MANDATORY)));
+
+DECLARE_UVERBS_ATTR_SPEC(
+	mlx5_spec_alloc_pd,
+	UVERBS_ATTR_PTR_OUT(UVERBS_UHW_OUT, struct mlx5_ib_alloc_pd_resp,
+			    UA_FLAGS(UVERBS_ATTR_SPEC_F_MANDATORY)));
+
+DECLARE_UVERBS_ATTR_SPEC(
+	mlx5_spec_device_query,
+	UVERBS_ATTR_PTR_OUT_SZ(UVERBS_UHW_OUT, 0,
+			       UA_FLAGS(UVERBS_ATTR_SPEC_F_MIN_SZ)));
+/* TODO: fix sizes */
+DECLARE_UVERBS_ATTR_SPEC(
+	mlx5_spec_alloc_context,
+	UVERBS_ATTR_PTR_IN(UVERBS_UHW_IN, struct mlx5_ib_alloc_ucontext_req,
+			   UA_FLAGS(UVERBS_ATTR_SPEC_F_MIN_SZ |
+				    UVERBS_ATTR_SPEC_F_MANDATORY)),
+	UVERBS_ATTR_PTR_OUT_SZ(UVERBS_UHW_OUT, 0,
+			       UA_FLAGS(UVERBS_ATTR_SPEC_F_MIN_SZ)));
+
+DECLARE_UVERBS_TYPE(mlx5_type_qp, NULL,
+		    &UVERBS_ACTIONS(
+			ADD_UVERBS_ACTION(UVERBS_QP_CREATE, NULL, NULL,
+					  &mlx5_spec_create_qp)));
+
+DECLARE_UVERBS_TYPE(mlx5_type_cq, NULL,
+		    &UVERBS_ACTIONS(
+			ADD_UVERBS_ACTION(UVERBS_CQ_CREATE, NULL, NULL,
+					  &mlx5_spec_create_cq)));
+
+DECLARE_UVERBS_TYPE(mlx5_type_pd, NULL,
+		    &UVERBS_ACTIONS(
+			ADD_UVERBS_ACTION(UVERBS_PD_ALLOC, NULL, NULL,
+					  &mlx5_spec_alloc_pd)));
+
+DECLARE_UVERBS_TYPE(mlx5_type_device, NULL,
+		    &UVERBS_ACTIONS(
+			ADD_UVERBS_CTX_ACTION(UVERBS_DEVICE_ALLOC_CONTEXT,
+					      NULL, NULL,
+					      &mlx5_spec_alloc_context),
+			ADD_UVERBS_ACTION(UVERBS_DEVICE_QUERY,
+					  NULL, NULL,
+					  &mlx5_spec_device_query)));
+
+DECLARE_UVERBS_TYPES(mlx5_common_types,
+		     ADD_UVERBS_TYPE(UVERBS_TYPE_DEVICE, mlx5_type_device),
+		     ADD_UVERBS_TYPE(UVERBS_TYPE_PD, mlx5_type_pd),
+		     ADD_UVERBS_TYPE(UVERBS_TYPE_CQ, mlx5_type_cq),
+		     ADD_UVERBS_TYPE(UVERBS_TYPE_QP, mlx5_type_qp));
-- 
1.8.3.1


^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [RFC ABI V6 13/14] IB/core: Support getting IOCTL header/SGEs from kernel space
       [not found] ` <1481461088-56355-1-git-send-email-matanb-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
                     ` (11 preceding siblings ...)
  2016-12-11 12:58   ` [RFC ABI V6 12/14] IB/{core,mlx5}: Support uhw definition per driver Matan Barak
@ 2016-12-11 12:58   ` Matan Barak
  2016-12-11 12:58   ` [RFC ABI V6 14/14] IB/core: Implement compatibility layer for get context command Matan Barak
  2016-12-19  0:48   ` [RFC ABI V6 00/14] SG-based RDMA ABI Proposal ira.weiny
  14 siblings, 0 replies; 20+ messages in thread
From: Matan Barak @ 2016-12-11 12:58 UTC (permalink / raw)
  To: linux-rdma-u79uwXL29TY76Z2rM5mHXA
  Cc: Doug Ledford, Jason Gunthorpe, Sean Hefty, Christoph Lameter,
	Liran Liss, Haggai Eran, Majd Dibbiny, Matan Barak, Tal Alon,
	Leon Romanovsky

In order to enable a compatibility layer, allow passing
ib_uverbs_ioctl_hdr and ib_uverbs_attr from kernel space.
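[Editorial sketch] The effect of the new w_legacy flag can be modeled in userspace: when the buffer already lives in kernel memory, copy_from_user() must be skipped in favor of a plain memcpy(). The names below are illustrative stand-ins, not the kernel symbols:

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Stand-in for copy_from_user(); returns the number of bytes NOT
 * copied (0 on success), matching the kernel convention. */
static unsigned long fake_copy_from_user(void *dst, const void *src, size_t n)
{
	memcpy(dst, src, n);
	return 0;
}

/* Mirror of the patch's copy logic: a kernel-resident buffer
 * (w_legacy) is copied directly, while a user buffer goes through
 * the user-copy helper, mapping failure to -EFAULT (-14). */
static int fetch_attrs(void *dst, const void *buf, size_t len, bool w_legacy)
{
	if (w_legacy) {
		memcpy(dst, buf, len);
		return 0;
	}
	if (fake_copy_from_user(dst, buf, len))
		return -14; /* -EFAULT */
	return 0;
}
```

The ioctl() entry point keeps passing w_legacy = false; only the in-kernel write() compatibility path sets it to true.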

Signed-off-by: Matan Barak <matanb-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
---
 drivers/infiniband/core/uverbs.h       |  6 ++++
 drivers/infiniband/core/uverbs_ioctl.c | 56 ++++++++++++++++++++++------------
 2 files changed, 42 insertions(+), 20 deletions(-)

diff --git a/drivers/infiniband/core/uverbs.h b/drivers/infiniband/core/uverbs.h
index 7c038a3..72858e5 100644
--- a/drivers/infiniband/core/uverbs.h
+++ b/drivers/infiniband/core/uverbs.h
@@ -84,7 +84,13 @@
  * released when the CQ is destroyed.
  */
 
+struct ib_uverbs_ioctl_hdr;
 long ib_uverbs_ioctl(struct file *filp, unsigned int cmd, unsigned long arg);
+long ib_uverbs_cmd_verbs(struct ib_device *ib_dev,
+			 struct ib_uverbs_file *file,
+			 struct ib_uverbs_ioctl_hdr *hdr,
+			 void __user *buf,
+			 bool w_legacy);
 
 struct ib_uverbs_device {
 	atomic_t				refcount;
diff --git a/drivers/infiniband/core/uverbs_ioctl.c b/drivers/infiniband/core/uverbs_ioctl.c
index 406b735..6050a64 100644
--- a/drivers/infiniband/core/uverbs_ioctl.c
+++ b/drivers/infiniband/core/uverbs_ioctl.c
@@ -41,7 +41,8 @@ static int uverbs_validate_attr(struct ib_device *ibdev,
 				u16 attr_id,
 				const struct uverbs_attr_spec_group *attr_spec_group,
 				struct uverbs_attr_array *attr_array,
-				struct ib_uverbs_attr __user *uattr_ptr)
+				struct ib_uverbs_attr __user *uattr_ptr,
+				bool w_legacy)
 {
 	const struct uverbs_attr_spec *spec;
 	struct uverbs_attr *e;
@@ -113,10 +114,14 @@ static int uverbs_validate_attr(struct ib_device *ibdev,
 		if (spec->obj.access == UVERBS_IDR_ACCESS_NEW) {
 			u64 idr = o_attr->uobject->id;
 
-			if (put_user(idr, &o_attr->uattr->ptr_idr)) {
-				uverbs_rollback_object(o_attr->uobject,
-						       UVERBS_IDR_ACCESS_NEW);
-				return -EFAULT;
+			if (!w_legacy) {
+				if (put_user(idr, &o_attr->uattr->ptr_idr)) {
+					uverbs_rollback_object(o_attr->uobject,
+							       UVERBS_IDR_ACCESS_NEW);
+					return -EFAULT;
+				}
+			} else {
+				o_attr->uattr->ptr_idr = idr;
 			}
 		}
 
@@ -135,7 +140,8 @@ static int uverbs_validate(struct ib_device *ibdev,
 			   size_t num_attrs,
 			   const struct uverbs_action *action,
 			   struct uverbs_attr_array *attr_array,
-			   struct ib_uverbs_attr __user *uattr_ptr)
+			   struct ib_uverbs_attr __user *uattr_ptr,
+			   bool w_legacy)
 {
 	size_t i;
 	int ret;
@@ -161,7 +167,7 @@ static int uverbs_validate(struct ib_device *ibdev,
 		attr_spec_group = action->attr_groups[ret];
 		ret = uverbs_validate_attr(ibdev, ucontext, uattr, attr_id,
 					   attr_spec_group, &attr_array[ret],
-					   uattr_ptr++);
+					   uattr_ptr++, w_legacy);
 		if (ret) {
 			uverbs_commit_objects(attr_array, n_val,
 					      action, false);
@@ -178,14 +184,16 @@ static int uverbs_handle_action(struct ib_uverbs_attr __user *uattr_ptr,
 				struct ib_device *ibdev,
 				struct ib_uverbs_file *ufile,
 				const struct uverbs_action *handler,
-				struct uverbs_attr_array *attr_array)
+				struct uverbs_attr_array *attr_array,
+				bool w_legacy)
 {
 	int ret;
 	int n_val;
 	unsigned int i;
 
 	n_val = uverbs_validate(ibdev, ufile->ucontext, uattrs, num_attrs,
-				handler, attr_array, uattr_ptr);
+				handler, attr_array, uattr_ptr,
+				w_legacy);
 	if (n_val <= 0)
 		return n_val;
 
@@ -212,10 +220,11 @@ cleanup:
 #ifdef UVERBS_OPTIMIZE_USING_STACK
 #define UVERBS_MAX_STACK_USAGE		512
 #endif
-static long ib_uverbs_cmd_verbs(struct ib_device *ib_dev,
-				struct ib_uverbs_file *file,
-				struct ib_uverbs_ioctl_hdr *hdr,
-				void __user *buf)
+long ib_uverbs_cmd_verbs(struct ib_device *ib_dev,
+			 struct ib_uverbs_file *file,
+			 struct ib_uverbs_ioctl_hdr *hdr,
+			 void __user *buf,
+			 bool w_legacy)
 {
 	const struct uverbs_type *type;
 	const struct uverbs_action *action;
@@ -281,15 +290,21 @@ static long ib_uverbs_cmd_verbs(struct ib_device *ib_dev,
 		curr_bitmap += BITS_TO_LONGS(curr_num_attrs);
 	}
 
-	err = copy_from_user(ctx->uattrs, buf,
-			     sizeof(*ctx->uattrs) * hdr->num_attrs);
-	if (err) {
-		err = -EFAULT;
-		goto out;
+	if (w_legacy) {
+		memcpy(ctx->uattrs, buf,
+		       sizeof(*ctx->uattrs) * hdr->num_attrs);
+	} else {
+		err = copy_from_user(ctx->uattrs, buf,
+				     sizeof(*ctx->uattrs) * hdr->num_attrs);
+		if (err) {
+			err = -EFAULT;
+			goto out;
+		}
 	}
 
 	err = uverbs_handle_action(buf, ctx->uattrs, hdr->num_attrs, ib_dev,
-				   file, action, ctx->uverbs_attr_array);
+				   file, action, ctx->uverbs_attr_array,
+				   w_legacy);
 out:
 #ifdef UVERBS_OPTIMIZE_USING_STACK
 	if (ctx_size > UVERBS_MAX_STACK_USAGE)
@@ -344,7 +359,8 @@ long ib_uverbs_ioctl(struct file *filp, unsigned int cmd, unsigned long arg)
 		}
 
 		err = ib_uverbs_cmd_verbs(ib_dev, file, &hdr,
-					  (__user void *)arg + sizeof(hdr));
+					  (__user void *)arg + sizeof(hdr),
+					  false);
 	}
 out:
 	srcu_read_unlock(&file->device->disassociate_srcu, srcu_key);
-- 
1.8.3.1


^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [RFC ABI V6 14/14] IB/core: Implement compatibility layer for get context command
       [not found] ` <1481461088-56355-1-git-send-email-matanb-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
                     ` (12 preceding siblings ...)
  2016-12-11 12:58   ` [RFC ABI V6 13/14] IB/core: Support getting IOCTL header/SGEs from kernel space Matan Barak
@ 2016-12-11 12:58   ` Matan Barak
  2016-12-19  0:48   ` [RFC ABI V6 00/14] SG-based RDMA ABI Proposal ira.weiny
  14 siblings, 0 replies; 20+ messages in thread
From: Matan Barak @ 2016-12-11 12:58 UTC (permalink / raw)
  To: linux-rdma-u79uwXL29TY76Z2rM5mHXA
  Cc: Doug Ledford, Jason Gunthorpe, Sean Hefty, Christoph Lameter,
	Liran Liss, Haggai Eran, Majd Dibbiny, Matan Barak, Tal Alon,
	Leon Romanovsky

Implement a write() -> ioctl() compatibility layer for
ib_uverbs_get_context by translating the write() headers to ioctl()
headers and invoking the ioctl parser.
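[Editorial sketch] A userspace model of the translation: the legacy write() lengths determine how many extra vendor (uhw) attributes the synthesized ioctl header needs, following the !!(cmd != in_len) + !!(resp != out_len) rule from the patch. The struct layouts below are simplified illustrations, not the exact ABI:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-ins for ib_uverbs_attr / ib_uverbs_ioctl_hdr. */
struct toy_ioctl_attr {
	uint16_t attr_id;
	uint16_t len;
	uint16_t reserved;
	uint64_t ptr;
};

struct toy_ioctl_hdr {
	uint16_t length;
	uint16_t object_type;
	uint16_t action;
	uint16_t num_attrs;
};

/* One extra attribute per direction in which the legacy buffer is
 * larger than the fixed command/response struct. */
static int vendor_num_attrs(size_t cmd, size_t resp, int in_len, int out_len)
{
	return !!((size_t)in_len != cmd) + !!((size_t)out_len != resp);
}

/* Fill the synthesized header, sizing it for the attribute array. */
static void init_toy_hdr(struct toy_ioctl_hdr *hdr, size_t num_attrs,
			 uint16_t object_type, uint16_t action)
{
	hdr->length = sizeof(*hdr) + num_attrs * sizeof(struct toy_ioctl_attr);
	hdr->object_type = object_type;
	hdr->action = action;
	hdr->num_attrs = num_attrs;
}
```

For get_context, the fixed attributes are always present and zero, one, or two uhw attributes are appended depending on whether the caller supplied trailing driver-specific input/output bytes.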

Signed-off-by: Matan Barak <matanb-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
---
 drivers/infiniband/core/uverbs_cmd.c | 157 +++++++++++++++++------------------
 1 file changed, 75 insertions(+), 82 deletions(-)

diff --git a/drivers/infiniband/core/uverbs_cmd.c b/drivers/infiniband/core/uverbs_cmd.c
index a3fc3f72..66cc589 100644
--- a/drivers/infiniband/core/uverbs_cmd.c
+++ b/drivers/infiniband/core/uverbs_cmd.c
@@ -37,6 +37,8 @@
 #include <linux/fs.h>
 #include <linux/slab.h>
 #include <linux/sched.h>
+#include <rdma/rdma_user_ioctl.h>
+#include <rdma/uverbs_ioctl_cmd.h>
 
 #include <asm/uaccess.h>
 
@@ -107,6 +109,54 @@ static struct ib_xrcd *idr_read_xrcd(int xrcd_handle, struct ib_ucontext *contex
 	return *uobj ? (*uobj)->object : NULL;
 }
 
+static int get_vendor_num_attrs(size_t cmd, size_t resp, int in_len,
+				int out_len)
+{
+	return !!(cmd != in_len) + !!(resp != out_len);
+}
+
+static void init_ioctl_hdr(struct ib_uverbs_ioctl_hdr *hdr,
+			   struct ib_device *ib_dev,
+			   size_t num_attrs,
+			   u16 object_type,
+			   u16 action)
+{
+	hdr->length = sizeof(*hdr) + num_attrs * sizeof(hdr->attrs[0]);
+	hdr->flags = 0;
+	hdr->object_type = object_type;
+	hdr->driver_id = ib_dev->driver_id;
+	hdr->action = action;
+	hdr->num_attrs = num_attrs;
+}
+
+static void fill_attr_ptr(struct ib_uverbs_attr *attr, u16 attr_id, u16 len,
+			  const void * __user source)
+{
+	attr->attr_id = attr_id;
+	attr->len = len;
+	attr->reserved = 0;
+	attr->ptr_idr = (__u64)source;
+}
+
+static void fill_hw_attrs(struct ib_uverbs_attr *hw_attrs,
+			  const void __user *in_buf,
+			  const void __user *out_buf,
+			  size_t cmd_size, size_t resp_size,
+			  int in_len, int out_len)
+{
+	if (in_len > cmd_size)
+		fill_attr_ptr(&hw_attrs[UVERBS_UHW_IN],
+			      UVERBS_UHW_IN | IB_UVERBS_VENDOR_FLAG,
+			      in_len - cmd_size,
+			      in_buf + cmd_size);
+
+	if (out_len > resp_size)
+		fill_attr_ptr(&hw_attrs[UVERBS_UHW_OUT],
+			      UVERBS_UHW_OUT | IB_UVERBS_VENDOR_FLAG,
+			      out_len - resp_size,
+			      out_buf + resp_size);
+}
+
 ssize_t ib_uverbs_get_context(struct ib_uverbs_file *file,
 			      struct ib_device *ib_dev,
 			      const char __user *buf,
@@ -114,100 +164,43 @@ ssize_t ib_uverbs_get_context(struct ib_uverbs_file *file,
 {
 	struct ib_uverbs_get_context      cmd;
 	struct ib_uverbs_get_context_resp resp;
-	struct ib_udata                   udata;
-	struct ib_ucontext		 *ucontext;
-	struct file			 *filp;
-	int ret;
+	struct {
+		struct ib_uverbs_ioctl_hdr hdr;
+		struct ib_uverbs_attr  cmd_attrs[GET_CONTEXT_RESP + 1];
+		struct ib_uverbs_attr  hw_attrs[UVERBS_UHW_OUT + 1];
+	} ioctl_cmd;
+	long err;
 
 	if (out_len < sizeof resp)
 		return -ENOSPC;
 
-	if (copy_from_user(&cmd, buf, sizeof cmd))
+	if (copy_from_user(&cmd, buf, sizeof(cmd)))
 		return -EFAULT;
 
-	mutex_lock(&file->mutex);
+	init_ioctl_hdr(&ioctl_cmd.hdr, ib_dev, ARRAY_SIZE(ioctl_cmd.cmd_attrs) +
+		       get_vendor_num_attrs(sizeof(cmd), sizeof(resp), in_len,
+					    out_len),
+		       UVERBS_TYPE_DEVICE, UVERBS_DEVICE_ALLOC_CONTEXT);
 
-	if (file->ucontext) {
-		ret = -EINVAL;
-		goto err;
-	}
+	/*
+	 * We have to have a direct mapping between the new format and the old
+	 * format. It's easily achievable with new attributes.
+	 */
+	fill_attr_ptr(&ioctl_cmd.cmd_attrs[GET_CONTEXT_RESP],
+		      GET_CONTEXT_RESP, sizeof(resp),
+		      (const void * __user)cmd.response);
+	fill_hw_attrs(ioctl_cmd.hw_attrs, buf,
+		      (const void * __user)cmd.response, sizeof(cmd),
+		      sizeof(resp), in_len, out_len);
 
-	INIT_UDATA(&udata, buf + sizeof cmd,
-		   (unsigned long) cmd.response + sizeof resp,
-		   in_len - sizeof cmd, out_len - sizeof resp);
+	err = ib_uverbs_cmd_verbs(ib_dev, file, &ioctl_cmd.hdr,
+				  ioctl_cmd.cmd_attrs, true);
 
-	ucontext = ib_dev->alloc_ucontext(ib_dev, &udata);
-	if (IS_ERR(ucontext)) {
-		ret = PTR_ERR(ucontext);
+	if (err < 0)
 		goto err;
-	}
-
-	ucontext->device = ib_dev;
-	ucontext->ufile = file;
-	ret = ib_uverbs_uobject_type_initialize_ucontext(ucontext);
-	if (ret)
-		goto err_ctx;
-
-	rcu_read_lock();
-	ucontext->tgid = get_task_pid(current->group_leader, PIDTYPE_PID);
-	rcu_read_unlock();
-	ucontext->closing = 0;
-
-#ifdef CONFIG_INFINIBAND_ON_DEMAND_PAGING
-	ucontext->umem_tree = RB_ROOT;
-	init_rwsem(&ucontext->umem_rwsem);
-	ucontext->odp_mrs_count = 0;
-	INIT_LIST_HEAD(&ucontext->no_private_counters);
-
-	if (!(ib_dev->attrs.device_cap_flags & IB_DEVICE_ON_DEMAND_PAGING))
-		ucontext->invalidate_range = NULL;
-
-#endif
-
-	resp.num_comp_vectors = file->device->num_comp_vectors;
-
-	ret = get_unused_fd_flags(O_CLOEXEC);
-	if (ret < 0)
-		goto err_free;
-	resp.async_fd = ret;
-
-	filp = ib_uverbs_alloc_async_event_file(file, ib_dev);
-	if (IS_ERR(filp)) {
-		ret = PTR_ERR(filp);
-		goto err_fd;
-	}
-
-	if (copy_to_user((void __user *) (unsigned long) cmd.response,
-			 &resp, sizeof resp)) {
-		ret = -EFAULT;
-		goto err_file;
-	}
-
-	file->ucontext = ucontext;
-	ucontext->ufile = file;
-
-	fd_install(resp.async_fd, filp);
-
-	mutex_unlock(&file->mutex);
-
-	return in_len;
-
-err_file:
-	ib_uverbs_free_async_event_file(file);
-	fput(filp);
-
-err_fd:
-	put_unused_fd(resp.async_fd);
-
-err_free:
-	put_pid(ucontext->tgid);
-	ib_uverbs_uobject_type_release_ucontext(ucontext);
-err_ctx:
-	ib_dev->dealloc_ucontext(ucontext);
 
 err:
-	mutex_unlock(&file->mutex);
-	return ret;
+	return err == 0 ? in_len : err;
 }
 
 void uverbs_copy_query_dev_fields(struct ib_device *ib_dev,
-- 
1.8.3.1


^ permalink raw reply related	[flat|nested] 20+ messages in thread

* Re: [RFC ABI V6 00/14] SG-based RDMA ABI Proposal
       [not found] ` <1481461088-56355-1-git-send-email-matanb-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
                     ` (13 preceding siblings ...)
  2016-12-11 12:58   ` [RFC ABI V6 14/14] IB/core: Implement compatibility layer for get context command Matan Barak
@ 2016-12-19  0:48   ` ira.weiny
       [not found]     ` <20161219055037.GO1074@mtr-leonro.local>
  14 siblings, 1 reply; 20+ messages in thread
From: ira.weiny @ 2016-12-19  0:48 UTC (permalink / raw)
  To: Matan Barak
  Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA, Doug Ledford, Jason Gunthorpe,
	Sean Hefty, Christoph Lameter, Liran Liss, Haggai Eran,
	Majd Dibbiny, Tal Alon, Leon Romanovsky

On Sun, Dec 11, 2016 at 02:57:54PM +0200, Matan Barak wrote:
> 
> This series is based on Doug's k.o/for-4.9-fixed branch [1] + Leon's [1] series.
> 
> Regards,
> Liran, Haggai, Leon and Matan
> 
> [0] 2937f3757519 ('staging/lustre: Disable InfiniBand support')
> [1] RDMA/core: Unify style of IOCTL commands series
> 

These don't apply to any Doug branches I have tried.  (I don't see a
for-4.9-fixed branch at:)

https://git.kernel.org/cgit/linux/kernel/git/dledford/rdma.git/refs/heads

Doug, have you accepted Leon's IOCTL series?

If so which branch is it on?

If not, then Matan could you post this series to github which includes Leon's
series?

Thanks,
Ira


^ permalink raw reply	[flat|nested] 20+ messages in thread

* RE: [RFC ABI V6 00/14] SG-based RDMA ABI Proposal
       [not found]       ` <20161219055037.GO1074-U/DQcQFIOTAAJjI8aNfphQ@public.gmane.org>
@ 2016-12-19  6:07         ` Weiny, Ira
       [not found]           ` <2807E5FD2F6FDA4886F6618EAC48510E3C611539-8k97q/ur5Z2krb+BlOpmy7fspsVTdybXVpNB7YpNyf8@public.gmane.org>
  0 siblings, 1 reply; 20+ messages in thread
From: Weiny, Ira @ 2016-12-19  6:07 UTC (permalink / raw)
  To: Leon Romanovsky
  Cc: Matan Barak, linux-rdma-u79uwXL29TY76Z2rM5mHXA, Doug Ledford,
	Jason Gunthorpe, Hefty, Sean, Christoph Lameter, Liran Liss,
	Haggai Eran, Majd Dibbiny, Tal Alon

> > These don't apply to any Doug branches I have tried.  (I don't see a
> > for-4.9-fixed branch at:)
> >
> > https://git.kernel.org/cgit/linux/kernel/git/dledford/rdma.git/refs/he
> > ads
> >
> > Doug, have you accepted Leon's IOCTL series?
> 
> Yes, Doug accepted it.
> 

Do you know what branch they are on?

Ira


^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [RFC ABI V6 00/14] SG-based RDMA ABI Proposal
       [not found]           ` <2807E5FD2F6FDA4886F6618EAC48510E3C611539-8k97q/ur5Z2krb+BlOpmy7fspsVTdybXVpNB7YpNyf8@public.gmane.org>
@ 2016-12-19  6:28             ` Leon Romanovsky
       [not found]               ` <20161219062841.GP1074-U/DQcQFIOTAAJjI8aNfphQ@public.gmane.org>
  0 siblings, 1 reply; 20+ messages in thread
From: Leon Romanovsky @ 2016-12-19  6:28 UTC (permalink / raw)
  To: Weiny, Ira
  Cc: Matan Barak, linux-rdma-u79uwXL29TY76Z2rM5mHXA, Doug Ledford,
	Jason Gunthorpe, Hefty, Sean, Christoph Lameter, Liran Liss,
	Haggai Eran, Majd Dibbiny, Tal Alon

[-- Attachment #1: Type: text/plain, Size: 781 bytes --]

On Mon, Dec 19, 2016 at 06:07:55AM +0000, Weiny, Ira wrote:
> > > These don't apply to any Doug branches I have tried.  (I don't see a
> > > for-4.9-fixed branch at:)
> > >
> > > https://git.kernel.org/cgit/linux/kernel/git/dledford/rdma.git/refs/he
> > > ads
> > >
> > > Doug, have you accepted Leon's IOCTL series?
> >
> > Yes, Doug accepted it.
> >
>
> Do you know what branch they are on?

It puzzled me too, but since the code is available in Matan's repo [1],
I didn't bother myself.

[1] https://github.com/matanb10/linux abi_rfc_pre_v6

>
> Ira
>

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [RFC ABI V6 00/14] SG-based RDMA ABI Proposal
       [not found]               ` <20161219062841.GP1074-U/DQcQFIOTAAJjI8aNfphQ@public.gmane.org>
@ 2016-12-19  8:12                 ` Matan Barak
  0 siblings, 0 replies; 20+ messages in thread
From: Matan Barak @ 2016-12-19  8:12 UTC (permalink / raw)
  To: Leon Romanovsky, Weiny, Ira
  Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA, Doug Ledford, Jason Gunthorpe,
	Hefty, Sean, Christoph Lameter, Liran Liss, Haggai Eran,
	Majd Dibbiny, Tal Alon

On 19/12/2016 08:28, Leon Romanovsky wrote:
> On Mon, Dec 19, 2016 at 06:07:55AM +0000, Weiny, Ira wrote:
>>>> These don't apply to any Doug branches I have tried.  (I don't see a
>>>> for-4.9-fixed branch at:)
>>>>
>>>> https://git.kernel.org/cgit/linux/kernel/git/dledford/rdma.git/refs/he
>>>> ads
>>>>
>>>> Doug, have you accepted Leon's IOCTL series?
>>>
>>> Yes, Doug accepted it.
>>>
>>
>> Do you know what branch they are on?
>
> It puzzled me too, but since the code is available in Matan's repo [1],
> I didn't bother myself.
>

Actually, I couldn't find Leon's patches on any official branch, so I've 
just applied them myself. Anyway, I also maintain a tag named abi-devel in 
my GitHub repo (which represents the ongoing development).

> [1] https://github.com/matanb10/linux abi_rfc_pre_v6
>
>>
>> Ira
>>



* Re: [RFC ABI V6 02/14] IB/core: Add support for custom types
       [not found]     ` <1481461088-56355-3-git-send-email-matanb-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
@ 2016-12-22  8:08       ` ira.weiny
  0 siblings, 0 replies; 20+ messages in thread
From: ira.weiny @ 2016-12-22  8:08 UTC (permalink / raw)
  To: Matan Barak
  Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA, Doug Ledford, Jason Gunthorpe,
	Sean Hefty, Christoph Lameter, Liran Liss, Haggai Eran,
	Majd Dibbiny, Tal Alon, Leon Romanovsky

On Sun, Dec 11, 2016 at 02:57:56PM +0200, Matan Barak wrote:
> The new ioctl infrastructure supports driver-specific objects.
> Each such object type has a free function, an allocation size and an
> order of destruction. This information is embedded in the same table
> that describes the various actions allowed on the object, similar to
> object-oriented programming.
> 
> When a ucontext is created, a new list is created in this ib_ucontext.
> This list contains all objects created under this ib_ucontext.
> When an ib_ucontext is destroyed, we traverse this list several times,
> destroying the various objects in the order given by the object
> type description. If several object types have the same destruction
> order, they are destroyed in an order opposite to their creation order.

Why don't we just use the krefs to decide this?

> 
> Adding an object is done in two parts.
> First, an object is allocated and added to the IDR/fd table. Then, the
> command's handlers (in downstream patches) can work on this object
> and fill in its required details.
> After a successful command, ib_uverbs_uobject_enable is called and
> this user object becomes visible to the ucontext.

Why do we need this?

> 
> Removing a uobject is done by calling ib_uverbs_uobject_remove.
> 
> We must make sure the IDR (per-device) and the list (per-ucontext) can
> be accessed concurrently without corrupting them.
> 
> Signed-off-by: Matan Barak <matanb-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
> Signed-off-by: Haggai Eran <haggaie-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
> Signed-off-by: Leon Romanovsky <leonro-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
> ---
>  drivers/infiniband/core/Makefile      |   3 +-
>  drivers/infiniband/core/device.c      |   1 +
>  drivers/infiniband/core/rdma_core.c   | 397 ++++++++++++++++++++++++++++++++++
>  drivers/infiniband/core/rdma_core.h   |  71 ++++++
>  drivers/infiniband/core/uverbs.h      |   1 +
>  drivers/infiniband/core/uverbs_main.c |   2 +-
>  include/rdma/ib_verbs.h               |  22 +-
>  include/rdma/uverbs_ioctl.h           | 218 +++++++++++++++++++
>  8 files changed, 710 insertions(+), 5 deletions(-)
>  create mode 100644 drivers/infiniband/core/rdma_core.c
>  create mode 100644 drivers/infiniband/core/rdma_core.h
>  create mode 100644 include/rdma/uverbs_ioctl.h
> 
> diff --git a/drivers/infiniband/core/Makefile b/drivers/infiniband/core/Makefile
> index edaae9f..1819623 100644
> --- a/drivers/infiniband/core/Makefile
> +++ b/drivers/infiniband/core/Makefile
> @@ -28,4 +28,5 @@ ib_umad-y :=			user_mad.o
>  
>  ib_ucm-y :=			ucm.o
>  
> -ib_uverbs-y :=			uverbs_main.o uverbs_cmd.o uverbs_marshall.o
> +ib_uverbs-y :=			uverbs_main.o uverbs_cmd.o uverbs_marshall.o \
> +				rdma_core.o
> diff --git a/drivers/infiniband/core/device.c b/drivers/infiniband/core/device.c
> index c3b68f5..43994b1 100644
> --- a/drivers/infiniband/core/device.c
> +++ b/drivers/infiniband/core/device.c
> @@ -243,6 +243,7 @@ struct ib_device *ib_alloc_device(size_t size)
>  	spin_lock_init(&device->client_data_lock);
>  	INIT_LIST_HEAD(&device->client_data_list);
>  	INIT_LIST_HEAD(&device->port_list);
> +	INIT_LIST_HEAD(&device->type_list);
>  
>  	return device;
>  }
> diff --git a/drivers/infiniband/core/rdma_core.c b/drivers/infiniband/core/rdma_core.c
> new file mode 100644
> index 0000000..398b61f
> --- /dev/null
> +++ b/drivers/infiniband/core/rdma_core.c
> @@ -0,0 +1,397 @@
> +/*
> + * Copyright (c) 2016, Mellanox Technologies inc.  All rights reserved.
> + *
> + * This software is available to you under a choice of one of two
> + * licenses.  You may choose to be licensed under the terms of the GNU
> + * General Public License (GPL) Version 2, available from the file
> + * COPYING in the main directory of this source tree, or the
> + * OpenIB.org BSD license below:
> + *
> + *     Redistribution and use in source and binary forms, with or
> + *     without modification, are permitted provided that the following
> + *     conditions are met:
> + *
> + *      - Redistributions of source code must retain the above
> + *        copyright notice, this list of conditions and the following
> + *        disclaimer.
> + *
> + *      - Redistributions in binary form must reproduce the above
> + *        copyright notice, this list of conditions and the following
> + *        disclaimer in the documentation and/or other materials
> + *        provided with the distribution.
> + *
> + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
> + * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
> + * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
> + * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
> + * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
> + * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
> + * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
> + * SOFTWARE.
> + */
> +
> +#include <linux/file.h>
> +#include <linux/anon_inodes.h>
> +#include <rdma/ib_verbs.h>
> +#include "uverbs.h"
> +#include "rdma_core.h"
> +#include <rdma/uverbs_ioctl.h>
> +
> +static int uverbs_lock_object(struct ib_uobject *uobj,
> +			      enum uverbs_idr_access access)
> +{
> +	if (access == UVERBS_IDR_ACCESS_READ)
> +		return down_read_trylock(&uobj->usecnt) == 1 ? 0 : -EBUSY;
> +
> +	/* lock is either WRITE or DESTROY - should be exclusive */
> +	return down_write_trylock(&uobj->usecnt) == 1 ? 0 : -EBUSY;
> +}
> +
> +static struct ib_uobject *get_uobj(int id, struct ib_ucontext *context)
> +{
> +	struct ib_uobject *uobj;
> +
> +	rcu_read_lock();
> +	uobj = idr_find(&context->device->idr, id);
> +	if (uobj) {
> +		if (uobj->context != context)
> +			uobj = NULL;
> +	}
> +	rcu_read_unlock();
> +
> +	return uobj;
> +}
> +
> +bool uverbs_is_live(struct ib_uobject *uobj)
> +{
> +	return uobj == get_uobj(uobj->id, uobj->context);
> +}
> +
> +struct ib_ucontext_lock {
> +	struct kref  ref;
> +	/* locking the uobjects_list */
> +	struct mutex lock;
> +};
> +
> +static void release_uobjects_list_lock(struct kref *ref)
> +{
> +	struct ib_ucontext_lock *lock = container_of(ref,
> +						     struct ib_ucontext_lock,
> +						     ref);
> +
> +	kfree(lock);
> +}
> +
> +static void init_uobj(struct ib_uobject *uobj, struct ib_ucontext *context)
> +{
> +	init_rwsem(&uobj->usecnt);
> +	uobj->context     = context;
> +}
> +
> +static int add_uobj(struct ib_uobject *uobj)
> +{
> +	int ret;
> +
> +	idr_preload(GFP_KERNEL);
> +	spin_lock(&uobj->context->device->idr_lock);
> +
> +	/* The uobject will be replaced with the actual one when we commit */

This still seems overly complicated to me?  Why not add the object to the idr
only after it has been successfully created?

> +	ret = idr_alloc(&uobj->context->device->idr, NULL, 0, 0, GFP_NOWAIT);
> +	if (ret >= 0)
> +		uobj->id = ret;

Perhaps there is a reason we need an idr in the object?  But so far I have not
seen it.

> +
> +	spin_unlock(&uobj->context->device->idr_lock);
> +	idr_preload_end();
> +
> +	return ret < 0 ? ret : 0;
> +}
> +
> +static void remove_uobj(struct ib_uobject *uobj)
> +{
> +	spin_lock(&uobj->context->device->idr_lock);
> +	idr_remove(&uobj->context->device->idr, uobj->id);
> +	spin_unlock(&uobj->context->device->idr_lock);
> +}
> +
> +static void put_uobj(struct ib_uobject *uobj)
> +{
> +	kfree_rcu(uobj, rcu);
> +}
> +
> +static struct ib_uobject *get_uobject_from_context(struct ib_ucontext *ucontext,
> +						   const struct uverbs_type_alloc_action *type,
> +						   u32 idr,
> +						   enum uverbs_idr_access access)
> +{
> +	struct ib_uobject *uobj;
> +	int ret;
> +
> +	rcu_read_lock();
> +	uobj = get_uobj(idr, ucontext);
> +	if (!uobj)
> +		goto free;
> +
> +	if (uobj->type != type) {
> +		uobj = NULL;
> +		goto free;
> +	}
> +
> +	ret = uverbs_lock_object(uobj, access);
> +	if (ret)
> +		uobj = ERR_PTR(ret);
> +free:
> +	rcu_read_unlock();
> +	return uobj;
> +
> +	return NULL;

merge/copy/paste error?

> +}
> +
> +static int ib_uverbs_uobject_add(struct ib_uobject *uobject,
> +				 const struct uverbs_type_alloc_action *uobject_type)

uobject_type is a bad name for something which is an "alloc_action".

Also, could we stop calling these actions and start calling them methods?

> +{
> +	uobject->type = uobject_type;

This should be part of "allocating" the object.

> +	return add_uobj(uobject);
> +}
> +
> +struct ib_uobject *uverbs_get_type_from_idr(const struct uverbs_type_alloc_action *type,

Please call this get _object_ from idr.  Types and objects are not the same
thing and in this case we are returning an actual instance.

> +					    struct ib_ucontext *ucontext,
> +					    enum uverbs_idr_access access,
> +					    uint32_t idr)

Why not have separate calls for allocation?

> +{
> +	struct ib_uobject *uobj;
> +	int ret;
> +
> +	if (access == UVERBS_IDR_ACCESS_NEW) {
> +		uobj = kmalloc(type->obj_size, GFP_KERNEL);
> +		if (!uobj)
> +			return ERR_PTR(-ENOMEM);
> +
> +		init_uobj(uobj, ucontext);
> +
> +		/* lock idr */

I think I commented on this in the previous series, and I'm still confused
about what this comment means.


> +		ret = ib_uverbs_uobject_add(uobj, type);

Again, why are we adding a null idr entry?

> +		if (ret) {
> +			kfree(uobj);
> +			return ERR_PTR(ret);
> +		}
> +
> +	} else {
> +		uobj = get_uobject_from_context(ucontext, type, idr,
> +						access);
> +
> +		if (!uobj)
> +			return ERR_PTR(-ENOENT);
> +	}
> +
> +	return uobj;

Why don't we take a reference when someone gets the uobject from the idr table?

> +}
> +
> +struct ib_uobject *uverbs_get_type_from_fd(const struct uverbs_type_alloc_action *type,
> +					   struct ib_ucontext *ucontext,
> +					   enum uverbs_idr_access access,
> +					   int fd)

Same comments as for the idr above.

> +{
> +	if (access == UVERBS_IDR_ACCESS_NEW) {
> +		int _fd;
> +		struct ib_uobject *uobj = NULL;
> +		struct file *filp;
> +
> +		_fd = get_unused_fd_flags(O_CLOEXEC);
> +		if (_fd < 0 || WARN_ON(type->obj_size < sizeof(struct ib_uobject)))
> +			return ERR_PTR(_fd);
> +
> +		uobj = kmalloc(type->obj_size, GFP_KERNEL);
> +		init_uobj(uobj, ucontext);
> +
> +		if (!uobj)
> +			return ERR_PTR(-ENOMEM);
> +
> +		filp = anon_inode_getfile(type->fd.name, type->fd.fops,
> +					  uobj + 1, type->fd.flags);
> +		if (IS_ERR(filp)) {
> +			put_unused_fd(_fd);
> +			kfree(uobj);
> +			return (void *)filp;
> +		}
> +
> +		uobj->type = type;
> +		uobj->id = _fd;
> +		uobj->object = filp;
> +
> +		return uobj;
> +	} else if (access == UVERBS_IDR_ACCESS_READ) {
> +		struct file *f = fget(fd);
> +		struct ib_uobject *uobject;
> +
> +		if (!f)
> +			return ERR_PTR(-EBADF);
> +
> +		uobject = f->private_data - sizeof(struct ib_uobject);
> +		if (f->f_op != type->fd.fops ||
> +		    !uobject->context) {
> +			fput(f);
> +			return ERR_PTR(-EBADF);
> +		}
> +
> +		/*
> +		 * No need to protect it with a ref count, as fget increases
> +		 * f_count.
> +		 */
> +		return uobject;
> +	} else {
> +		return ERR_PTR(-EOPNOTSUPP);
> +	}
> +}
> +
> +static void ib_uverbs_uobject_enable(struct ib_uobject *uobject)
> +{
> +	mutex_lock(&uobject->context->uobjects_lock->lock);
> +	list_add(&uobject->list, &uobject->context->uobjects);
> +	mutex_unlock(&uobject->context->uobjects_lock->lock);
> +	spin_lock(&uobject->context->device->idr_lock);
> +	idr_replace(&uobject->context->device->idr, uobject, uobject->id);
> +	spin_unlock(&uobject->context->device->idr_lock);
> +}
> +
> +static void ib_uverbs_uobject_remove(struct ib_uobject *uobject, bool lock)
> +{
> +	/*
> +	 * Calling remove requires exclusive access, so it's not possible
> +	 * another thread will use our object.
> +	 */

Based on this comment...  Why is "lock" optional?  And why is remove_uobj not
covered by the lock?  (i.e., I think the comment is wrong.)

> +	remove_uobj(uobject);
> +	if (lock)
> +		mutex_lock(&uobject->context->uobjects_lock->lock);
> +	list_del(&uobject->list);
> +	if (lock)
> +		mutex_unlock(&uobject->context->uobjects_lock->lock);
> +	put_uobj(uobject);
> +}
> +
> +static void uverbs_commit_idr(struct ib_uobject *uobj,
> +			      enum uverbs_idr_access access,
> +			      bool success)

I'm slowly learning what "commit" means in this architecture.  It seems like
you are trying to use a database design pattern, but I think it just adds
complexity that is not needed.

Basically this function is doing 3 things.

1) Activating an object which was previously not in the idr (but had an idr
   reserved.)
2) performing reference counting based on the access.
3) destroying objects

I think over time this will be confusing to most developers.  Why is it
important to have one function doing so many things?

> +{
> +	switch (access) {
> +	case UVERBS_IDR_ACCESS_READ:
> +		up_read(&uobj->usecnt);
> +		break;
> +	case UVERBS_IDR_ACCESS_NEW:
> +		if (success) {
> +			ib_uverbs_uobject_enable(uobj);
> +		} else {
> +			remove_uobj(uobj);
> +			put_uobj(uobj);
> +		}
> +		break;
> +	case UVERBS_IDR_ACCESS_WRITE:
> +		up_write(&uobj->usecnt);
> +		break;
> +	case UVERBS_IDR_ACCESS_DESTROY:
> +		if (success)
> +			ib_uverbs_uobject_remove(uobj, true);
> +		else
> +			up_write(&uobj->usecnt);
> +		break;
> +	}
> +}
> +
> +static void uverbs_commit_fd(struct ib_uobject *uobj,
> +			     enum uverbs_idr_access access,
> +			     bool success)
> +{
> +	struct file *filp = uobj->object;
> +
> +	if (access == UVERBS_IDR_ACCESS_NEW) {
> +		if (success) {
> +			kref_get(&uobj->context->ufile->ref);
> +			uobj->uobjects_lock = uobj->context->uobjects_lock;
> +			kref_get(&uobj->uobjects_lock->ref);
> +			ib_uverbs_uobject_enable(uobj);
> +			fd_install(uobj->id, uobj->object);
> +		} else {
> +			fput(uobj->object);
> +			put_unused_fd(uobj->id);
> +			kfree(uobj);
> +		}
> +	} else {
> +		fput(filp);
> +	}
> +}
> +
> +static void _uverbs_commit_object(struct ib_uobject *uobj,
> +				  enum uverbs_idr_access access,
> +				  bool success)
> +{
> +	if (uobj->type->type == UVERBS_ATTR_TYPE_IDR)
> +		uverbs_commit_idr(uobj, access, success);
> +	else if (uobj->type->type == UVERBS_ATTR_TYPE_FD)
> +		uverbs_commit_fd(uobj, access, success);
> +	else
> +		WARN_ON(true);
> +}
> +
> +void uverbs_commit_object(struct ib_uobject *uobj,
> +			  enum uverbs_idr_access access)
> +{
> +	return _uverbs_commit_object(uobj, access, true);
> +}
> +
> +void uverbs_rollback_object(struct ib_uobject *uobj,
> +			    enum uverbs_idr_access access)
> +{
> +	return _uverbs_commit_object(uobj, access, false);
> +}
> +
> +void ib_uverbs_close_fd(struct file *f)
> +{
> +	struct ib_uobject *uobject = f->private_data - sizeof(struct ib_uobject);
> +
> +	mutex_lock(&uobject->uobjects_lock->lock);
> +	if (uobject->context) {
> +		list_del(&uobject->list);
> +		kref_put(&uobject->context->ufile->ref, ib_uverbs_release_file);
> +		uobject->context = NULL;
> +	}
> +	mutex_unlock(&uobject->uobjects_lock->lock);
> +	kref_put(&uobject->uobjects_lock->ref, release_uobjects_list_lock);
> +}
> +
> +void ib_uverbs_cleanup_fd(void *private_data)
> +{
> +	struct ib_uboject *uobject = private_data - sizeof(struct ib_uobject);
> +
> +	kfree(uobject);
> +}
> +
> +void uverbs_commit_objects(struct uverbs_attr_array *attr_array,
> +			   size_t num,
> +			   const struct uverbs_action *action,
> +			   bool success)
> +{
> +	unsigned int i;
> +
> +	for (i = 0; i < num; i++) {
> +		struct uverbs_attr_array *attr_spec_array = &attr_array[i];
> +		const struct uverbs_attr_spec_group *attr_spec_group =
> +			action->attr_groups[i];
> +		unsigned int j;
> +
> +		for (j = 0; j < attr_spec_array->num_attrs; j++) {
> +			struct uverbs_attr *attr = &attr_spec_array->attrs[j];
> +			struct uverbs_attr_spec *spec = &attr_spec_group->attrs[j];
> +
> +			if (!uverbs_is_valid(attr_spec_array, j))
> +				continue;
> +
> +			if (spec->type == UVERBS_ATTR_TYPE_IDR ||
> +			    spec->type == UVERBS_ATTR_TYPE_FD)
> +				/*
> +				 * refcounts should be handled at the object
> +				 * level and not at the uobject level.
> +				 */

Why are the current ib_uobject krefs not enough to track this?

> +				_uverbs_commit_object(attr->obj_attr.uobject,
> +						      spec->obj.access, success);
> +		}
> +	}
> +}
> diff --git a/drivers/infiniband/core/rdma_core.h b/drivers/infiniband/core/rdma_core.h
> new file mode 100644
> index 0000000..0bb4be3
> --- /dev/null
> +++ b/drivers/infiniband/core/rdma_core.h
> @@ -0,0 +1,71 @@
> +/*
> + * Copyright (c) 2005 Topspin Communications.  All rights reserved.
> + * Copyright (c) 2005, 2006 Cisco Systems.  All rights reserved.
> + * Copyright (c) 2005-2016 Mellanox Technologies. All rights reserved.
> + * Copyright (c) 2005 Voltaire, Inc. All rights reserved.
> + * Copyright (c) 2005 PathScale, Inc. All rights reserved.
> + *
> + * This software is available to you under a choice of one of two
> + * licenses.  You may choose to be licensed under the terms of the GNU
> + * General Public License (GPL) Version 2, available from the file
> + * COPYING in the main directory of this source tree, or the
> + * OpenIB.org BSD license below:
> + *
> + *     Redistribution and use in source and binary forms, with or
> + *     without modification, are permitted provided that the following
> + *     conditions are met:
> + *
> + *      - Redistributions of source code must retain the above
> + *        copyright notice, this list of conditions and the following
> + *        disclaimer.
> + *
> + *      - Redistributions in binary form must reproduce the above
> + *        copyright notice, this list of conditions and the following
> + *        disclaimer in the documentation and/or other materials
> + *        provided with the distribution.
> + *
> + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
> + * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
> + * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
> + * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
> + * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
> + * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
> + * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
> + * SOFTWARE.
> + */
> +
> +#ifndef RDMA_CORE_H
> +#define RDMA_CORE_H
> +
> +#include <linux/idr.h>
> +#include <rdma/uverbs_ioctl.h>
> +#include <rdma/ib_verbs.h>
> +#include <linux/mutex.h>
> +
> +struct ib_uobject *uverbs_get_type_from_idr(const struct uverbs_type_alloc_action *type,
> +					    struct ib_ucontext *ucontext,
> +					    enum uverbs_idr_access access,
> +					    uint32_t idr);
> +struct ib_uobject *uverbs_get_type_from_fd(const struct uverbs_type_alloc_action *type,
> +					   struct ib_ucontext *ucontext,
> +					   enum uverbs_idr_access access,
> +					   int fd);
> +bool uverbs_is_live(struct ib_uobject *uobj);
> +void uverbs_rollback_object(struct ib_uobject *uobj,
> +			    enum uverbs_idr_access access);
> +void uverbs_commit_object(struct ib_uobject *uobj,
> +				 enum uverbs_idr_access access);
> +void uverbs_commit_objects(struct uverbs_attr_array *attr_array,
> +			   size_t num,
> +			   const struct uverbs_action *action,
> +			   bool success);
> +
> +void ib_uverbs_close_fd(struct file *f);
> +void ib_uverbs_cleanup_fd(void *private_data);
> +
> +static inline void *uverbs_fd_to_priv(struct ib_uobject *uobj)
> +{
> +	return uobj + 1;
> +}
> +
> +#endif /* RDMA_CORE_H */
> diff --git a/drivers/infiniband/core/uverbs.h b/drivers/infiniband/core/uverbs.h
> index 8074705..ae7d4b8 100644
> --- a/drivers/infiniband/core/uverbs.h
> +++ b/drivers/infiniband/core/uverbs.h
> @@ -180,6 +180,7 @@ void idr_remove_uobj(struct ib_uobject *uobj);
>  struct file *ib_uverbs_alloc_event_file(struct ib_uverbs_file *uverbs_file,
>  					struct ib_device *ib_dev,
>  					int is_async);
> +void ib_uverbs_release_file(struct kref *ref);
>  void ib_uverbs_free_async_event_file(struct ib_uverbs_file *uverbs_file);
>  struct ib_uverbs_event_file *ib_uverbs_lookup_comp_file(int fd);
>  
> diff --git a/drivers/infiniband/core/uverbs_main.c b/drivers/infiniband/core/uverbs_main.c
> index f783723..e63357a 100644
> --- a/drivers/infiniband/core/uverbs_main.c
> +++ b/drivers/infiniband/core/uverbs_main.c
> @@ -341,7 +341,7 @@ static void ib_uverbs_comp_dev(struct ib_uverbs_device *dev)
>  	complete(&dev->comp);
>  }
>  
> -static void ib_uverbs_release_file(struct kref *ref)
> +void ib_uverbs_release_file(struct kref *ref)
>  {
>  	struct ib_uverbs_file *file =
>  		container_of(ref, struct ib_uverbs_file, ref);
> diff --git a/include/rdma/ib_verbs.h b/include/rdma/ib_verbs.h
> index b5d2075..282b0ba 100644
> --- a/include/rdma/ib_verbs.h
> +++ b/include/rdma/ib_verbs.h
> @@ -1329,8 +1329,11 @@ struct ib_fmr_attr {
>  
>  struct ib_umem;
>  
> +struct ib_ucontext_lock;
> +
>  struct ib_ucontext {
>  	struct ib_device       *device;
> +	struct ib_uverbs_file  *ufile;
>  	struct list_head	pd_list;
>  	struct list_head	mr_list;
>  	struct list_head	mw_list;
> @@ -1344,6 +1347,10 @@ struct ib_ucontext {
>  	struct list_head	rwq_ind_tbl_list;
>  	int			closing;
>  
> +	/* lock for uobjects list */
> +	struct ib_ucontext_lock	*uobjects_lock;
> +	struct list_head	uobjects;
> +
>  	struct pid             *tgid;
>  #ifdef CONFIG_INFINIBAND_ON_DEMAND_PAGING
>  	struct rb_root      umem_tree;
> @@ -1363,16 +1370,22 @@ struct ib_ucontext {
>  #endif
>  };
>  
> +struct uverbs_object_list;
> +
>  struct ib_uobject {
>  	u64			user_handle;	/* handle given to us by userspace */
>  	struct ib_ucontext     *context;	/* associated user context */
>  	void		       *object;		/* containing object */
>  	struct list_head	list;		/* link to context's list */
> -	int			id;		/* index into kernel idr */
> -	struct kref		ref;
> -	struct rw_semaphore	mutex;		/* protects .live */
> +	int			id;		/* index into kernel idr/fd */
> +	struct kref             ref;
> +	struct rw_semaphore	usecnt;		/* protects exclusive access */
> +	struct rw_semaphore     mutex;          /* protects .live */
>  	struct rcu_head		rcu;		/* kfree_rcu() overhead */
>  	int			live;
> +
> +	const struct uverbs_type_alloc_action *type;
> +	struct ib_ucontext_lock	*uobjects_lock;
>  };
>  
>  struct ib_udata {
> @@ -2101,6 +2114,9 @@ struct ib_device {
>  	 */
>  	int (*get_port_immutable)(struct ib_device *, u8, struct ib_port_immutable *);
>  	void (*get_dev_fw_str)(struct ib_device *, char *str, size_t str_len);
> +	struct list_head type_list;
> +
> +	const struct uverbs_types_group	*types_group;
>  };
>  
>  struct ib_client {
> diff --git a/include/rdma/uverbs_ioctl.h b/include/rdma/uverbs_ioctl.h
> new file mode 100644
> index 0000000..382321b
> --- /dev/null
> +++ b/include/rdma/uverbs_ioctl.h
> @@ -0,0 +1,218 @@
> +/*
> + * Copyright (c) 2016, Mellanox Technologies inc.  All rights reserved.
> + *
> + * This software is available to you under a choice of one of two
> + * licenses.  You may choose to be licensed under the terms of the GNU
> + * General Public License (GPL) Version 2, available from the file
> + * COPYING in the main directory of this source tree, or the
> + * OpenIB.org BSD license below:
> + *
> + *     Redistribution and use in source and binary forms, with or
> + *     without modification, are permitted provided that the following
> + *     conditions are met:
> + *
> + *      - Redistributions of source code must retain the above
> + *        copyright notice, this list of conditions and the following
> + *        disclaimer.
> + *
> + *      - Redistributions in binary form must reproduce the above
> + *        copyright notice, this list of conditions and the following
> + *        disclaimer in the documentation and/or other materials
> + *        provided with the distribution.
> + *
> + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
> + * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
> + * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
> + * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
> + * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
> + * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
> + * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
> + * SOFTWARE.
> + */
> +
> +#ifndef _UVERBS_IOCTL_
> +#define _UVERBS_IOCTL_
> +
> +#include <linux/kernel.h>
> +
> +struct uverbs_object_type;
> +struct ib_ucontext;
> +struct ib_uobject;
> +struct ib_device;
> +struct uverbs_uobject_type;
> +
> +/*
> + * =======================================
> + *	Verbs action specifications
> + * =======================================
> + */
> +
> +#define UVERBS_ID_RESERVED_MASK 0xF000
> +#define UVERBS_ID_RESERVED_SHIFT 12
> +
> +enum uverbs_attr_type {
> +	UVERBS_ATTR_TYPE_NA,
> +	UVERBS_ATTR_TYPE_PTR_IN,
> +	UVERBS_ATTR_TYPE_PTR_OUT,
> +	UVERBS_ATTR_TYPE_IDR,
> +	UVERBS_ATTR_TYPE_FD,
> +	UVERBS_ATTR_TYPE_FLAG,
> +};
> +
> +enum uverbs_idr_access {
> +	UVERBS_IDR_ACCESS_READ,
> +	UVERBS_IDR_ACCESS_WRITE,
> +	UVERBS_IDR_ACCESS_NEW,
> +	UVERBS_IDR_ACCESS_DESTROY
> +};

It seems like these are not specific to IDR "access", so I think we should
remove the "_IDR_" label.

> +
> +enum uverbs_attr_spec_flags {
> +	UVERBS_ATTR_SPEC_F_MANDATORY	= 1U << 0,
> +	UVERBS_ATTR_SPEC_F_MIN_SZ	= 1U << 1,
> +};
> +
> +struct uverbs_attr_spec {
> +	enum uverbs_attr_type		type;
> +	u8				flags;
> +	union {
> +		u16				len;
> +		struct {
> +			u16			obj_type;
> +			u8			access;
> +		} obj;
> +		struct {
> +			/* flags are always 64bits */
> +			u64			mask;
> +		} flag;
> +	};
> +};

The more I look at this the more I feel like all attributes should be
"mandatory".

Furthermore, I think we could use the same data structure to describe the
attributes of a function as is used to pass the data from user space.  This
would make validation easier.

For example use something like this for the attribute definition in 
include/uapi/rdma

struct urdma_attr {
        __u8  type;             /* enum uverbs_attr_type */
	__u8  id;               /* command attribute id */
	__u16 len;              /* NA for idr or data */
	__u32 reserved;
	__u64 value;            /* ptr/idr/data */
};


I've also been working on a simplified scheme which is more object-oriented.

For every method you specify the exact list of attributes which are expected.


struct urdma_method {
       u32 id;
       int (*method)(const struct ib_device *dev,
                     const struct ib_ucontext *uctxt,
                     struct urdma_attr *user_attrs);
       u16 num_exp_attrs;
       struct urdma_attr exp_attrs[0];
};

Validation becomes a simple 1:1 comparison.

In order to expand a method we define a new one which has additional attributes
as needed.

This is similar to having the same function defined in a class:

class foo {
	public:
		A();
		A(int data);
		A(float data);
};

This is much clearer about which method/attributes are required and are being
called.

The trade-off is of course that we need more method space to account for
methods in the future.  And there is the potential for more "holes" in the
method table.  But good hashing functions can fix this.

> +
> +struct uverbs_attr_spec_group {
> +	struct uverbs_attr_spec		*attrs;
> +	size_t				num_attrs;
> +	/* populate at runtime */
> +	unsigned long			*mandatory_attrs_bitmask;
> +};

The above idea gets rid of this as well.

> +
> +struct uverbs_attr_array;
> +struct ib_uverbs_file;
> +
> +enum uverbs_action_flags {
> +	UVERBS_ACTION_FLAG_CREATE_ROOT = 1 << 0,
> +};
> +
> +struct uverbs_action {
> +	const struct uverbs_attr_spec_group		**attr_groups;
> +	size_t						num_groups;
> +	u32 flags;
> +	int (*handler)(struct ib_device *ib_dev, struct ib_uverbs_file *ufile,
> +		       struct uverbs_attr_array *ctx, size_t num);
> +	u16 num_child_attrs;
> +};
> +
> +struct uverbs_type_alloc_action;
> +typedef void (*free_type)(const struct uverbs_type_alloc_action *uobject_type,
> +			  struct ib_uobject *uobject);
> +
> +struct uverbs_type_alloc_action {
> +	enum uverbs_attr_type		type;
> +	int				order;
> +	size_t				obj_size;
> +	free_type			free_fn;
> +	struct {
> +		const struct file_operations	*fops;
> +		const char			*name;
> +		int				flags;
> +	} fd;
> +};
> +
> +struct uverbs_action_group {
> +	size_t					num_actions;
> +	const struct uverbs_action		**actions;
> +};
> +
> +struct uverbs_type {
> +	size_t					num_groups;
> +	const struct uverbs_action_group	**action_groups;
> +	const struct uverbs_type_alloc_action	*alloc;
> +};
> +
> +struct uverbs_type_group {
> +	size_t					num_types;
> +	const struct uverbs_type		**types;
> +};
> +
> +struct uverbs_root {
> +	const struct uverbs_type_group		**type_groups;
> +	size_t					num_groups;
> +};
> +
> +/* =================================================
> + *              Parsing infrastructure
> + * =================================================
> + */
> +
> +struct uverbs_ptr_attr {
> +	void	* __user ptr;
> +	u16		len;
> +};
> +
> +struct uverbs_fd_attr {
> +	int		fd;
> +};
> +
> +struct uverbs_uobj_attr {
> +	/*  idr handle */
> +	u32	idr;
> +};
> +
> +struct uverbs_flag_attr {
> +	u64	flags;
> +};
> +
> +struct uverbs_obj_attr {
> +	/* pointer to the kernel descriptor -> type, access, etc */
> +	struct ib_uverbs_attr __user	*uattr;
> +	const struct uverbs_type_alloc_action	*type;
> +	struct ib_uobject		*uobject;
> +	union {
> +		struct uverbs_fd_attr		fd;
> +		struct uverbs_uobj_attr		uobj;
> +	};
> +};
> +
> +struct uverbs_attr {
> +	union {
> +		struct uverbs_ptr_attr	cmd_attr;
                                       ^^^^^^^^^
                                        ptr_attr?

> +		struct uverbs_obj_attr	obj_attr;
> +		struct uverbs_flag_attr flag_attr;

I think "flag" should really just be "value".  The actual meaning of a 64 bit
"direct" value attribute is going to be method/attribute specific.


I have started some patches against Matan's v5 series which work on some of
this.  Now that we have this cleaned-up version, I will port them to this
series.

Also, I have not looked at the other patches except to try to understand this
one better.  So whatever I do, I will try to take the entire series into
account.

Thanks,
Ira

> +	};
> +};
> +
> +/* output of one validator */
> +struct uverbs_attr_array {
> +	unsigned long *valid_bitmap;
> +	size_t num_attrs;
> +	/* array of attributes, indexed by the attribute id, e.g. SEND_CQ */
> +	struct uverbs_attr *attrs;
> +};
> +
> +static inline bool uverbs_is_valid(const struct uverbs_attr_array *attr_array,
> +				   unsigned int idx)
> +{
> +	return test_bit(idx, attr_array->valid_bitmap);
> +}
> +
> +/* =================================================
> + *              Types infrastructure
> + * =================================================
> + */
> +
> +int ib_uverbs_uobject_type_add(struct list_head	*head,
> +			       void (*free)(struct uverbs_uobject_type *type,
> +					    struct ib_uobject *uobject,
> +					    struct ib_ucontext *ucontext),
> +			       uint16_t	obj_type);
> +void ib_uverbs_uobject_types_remove(struct ib_device *ib_dev);
> +
> +#endif
> -- 
> 1.8.3.1
> 
> --
> To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
> the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html


end of thread, other threads:[~2016-12-22  8:08 UTC | newest]

Thread overview: 20+ messages
2016-12-11 12:57 [RFC ABI V6 00/14] SG-based RDMA ABI Proposal Matan Barak
     [not found] ` <1481461088-56355-1-git-send-email-matanb-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
2016-12-11 12:57   ` [RFC ABI V6 01/14] IB/core: Refactor IDR to be per-device Matan Barak
2016-12-11 12:57   ` [RFC ABI V6 02/14] IB/core: Add support for custom types Matan Barak
     [not found]     ` <1481461088-56355-3-git-send-email-matanb-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
2016-12-22  8:08       ` ira.weiny
2016-12-11 12:57   ` [RFC ABI V6 03/14] IB/core: Add generic ucontext initialization and teardown Matan Barak
2016-12-11 12:57   ` [RFC ABI V6 04/14] IB/core: Add macros for declaring types and type groups Matan Barak
2016-12-11 12:57   ` [RFC ABI V6 05/14] IB/core: Declare all common IB types Matan Barak
2016-12-11 12:58   ` [RFC ABI V6 06/14] IB/core: Use the new IDR and locking infrastructure in uverbs_cmd Matan Barak
2016-12-11 12:58   ` [RFC ABI V6 07/14] IB/core: Add new ioctl interface Matan Barak
2016-12-11 12:58   ` [RFC ABI V6 08/14] IB/core: Add macros for declaring actions and attributes Matan Barak
2016-12-11 12:58   ` [RFC ABI V6 09/14] IB/core: Add uverbs types, actions, handlers " Matan Barak
2016-12-11 12:58   ` [RFC ABI V6 10/14] IB/core: Add uverbs merge trees functionality Matan Barak
2016-12-11 12:58   ` [RFC ABI V6 11/14] IB/mlx5: Implement common uverb objects Matan Barak
2016-12-11 12:58   ` [RFC ABI V6 12/14] IB/{core,mlx5}: Support uhw definition per driver Matan Barak
2016-12-11 12:58   ` [RFC ABI V6 13/14] IB/core: Support getting IOCTL header/SGEs from kernel space Matan Barak
2016-12-11 12:58   ` [RFC ABI V6 14/14] IB/core: Implement compatibility layer for get context command Matan Barak
2016-12-19  0:48   ` [RFC ABI V6 00/14] SG-based RDMA ABI Proposal ira.weiny
     [not found]     ` <20161219055037.GO1074@mtr-leonro.local>
     [not found]       ` <20161219055037.GO1074-U/DQcQFIOTAAJjI8aNfphQ@public.gmane.org>
2016-12-19  6:07         ` Weiny, Ira
     [not found]           ` <2807E5FD2F6FDA4886F6618EAC48510E3C611539-8k97q/ur5Z2krb+BlOpmy7fspsVTdybXVpNB7YpNyf8@public.gmane.org>
2016-12-19  6:28             ` Leon Romanovsky
     [not found]               ` <20161219062841.GP1074-U/DQcQFIOTAAJjI8aNfphQ@public.gmane.org>
2016-12-19  8:12                 ` Matan Barak
