All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH V1 for-next 0/2] HW Device hot-removal support
@ 2014-11-18 12:11 Yishai Hadas
       [not found] ` <1416312682-7899-1-git-send-email-yishaih-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
  0 siblings, 1 reply; 19+ messages in thread
From: Yishai Hadas @ 2014-11-18 12:11 UTC (permalink / raw)
  To: roland-DgEjT+Ai2ygdnm+yROfE0A
  Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA,
	yishaih-VPRAkNaXOzVWk0Htik3J/w, jackm-VPRAkNaXOzVWk0Htik3J/w

Similarly, if the device is hot-unplugged or reset, the device driver
hardware removal flow blocks until all user contexts are destroyed.

This patchset removes the above limitations. The IB-core and uverbs
layers are still required to remain loaded as long as there are user
applications using the verbs API. However, the hardware device drivers
are not blocked any more by the user space activity.

To support this, the hardware device needs to expose a new kernel API
named 'disassociate_ucontext'. The device driver is given a ucontext
to detach from, and it should block this user context from any future
hardware access. In the IB-core level, we use this interface to
deactivate all ucontext that address a specific device when handling a
remove_one callback for it.

The first patch introduces the new API between the HW device driver and
the IB core. For devices which implement the functionality, IB core
will use it in remove_one, disassociating any active ucontext from the
hardware device. Other drivers that didn't implement it will behave as
today, remove_one will block until all ucontexts referring the device
are destroyed before returning.

The second patch provides implementation of this API for the mlx4
driver. 

Changes from V0:
patch #1: ib_uverbs_close, reduced mutex scope to enable tasks run in parallel. 

Yishai Hadas (2):
  IB/uverbs: Enable device removal when there are active user space
    applications
  IB/mlx4_ib: Disassociate support

 drivers/infiniband/core/uverbs.h      |    9 +
 drivers/infiniband/core/uverbs_cmd.c  |    8 +
 drivers/infiniband/core/uverbs_main.c |  324 +++++++++++++++++++++++++++------
 drivers/infiniband/hw/mlx4/main.c     |  119 ++++++++++++-
 drivers/infiniband/hw/mlx4/mlx4_ib.h  |   12 ++
 include/rdma/ib_verbs.h               |    2 +
 6 files changed, 417 insertions(+), 57 deletions(-)

--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 19+ messages in thread

* [PATCH V1 for-next 1/2] IB/uverbs: Enable device removal when there are active user space applications
       [not found] ` <1416312682-7899-1-git-send-email-yishaih-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
@ 2014-11-18 12:11   ` Yishai Hadas
       [not found]     ` <1416312682-7899-2-git-send-email-yishaih-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
  2014-11-18 12:11   ` [PATCH V1 for-next 2/2] IB/mlx4_ib: Disassociate support Yishai Hadas
  2014-11-18 14:43   ` [PATCH V1 for-next 0/2] HW Device hot-removal support Or Gerlitz
  2 siblings, 1 reply; 19+ messages in thread
From: Yishai Hadas @ 2014-11-18 12:11 UTC (permalink / raw)
  To: roland-DgEjT+Ai2ygdnm+yROfE0A
  Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA,
	yishaih-VPRAkNaXOzVWk0Htik3J/w, jackm-VPRAkNaXOzVWk0Htik3J/w

Enables the uverbs_remove_one to succeed despite the fact that there are
running IB applications working with the given ib device.  This functionality
enables a HW device to be unbind/reset despite the fact that there are running
user space applications using it.

It exposes a new IB kernel API named 'disassociate_ucontext' which lets a
driver detaching its HW resources from a given user context without
crashing/terminating the application. In case a driver implemented the above
API and registered with ib_uverb there will be no dependency between its device
to its uverbs_device. Upon calling remove_one of ib_uverbs the call should
return after disassociating the open HW resources without waiting to clients
disconnecting. In case driver didn't implement this API there will be no change
to current behaviour and uverbs_remove_one will return only when last client
has disconnected and reference count on uverbs device became 0.

In case the lower driver device was removed any application will continue
working over some zombie HCA, further calls will ended with an immediate error.

Signed-off-by: Yishai Hadas <yishaih-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
Signed-off-by: Jack Morgenstein <jackm-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>

---
 drivers/infiniband/core/uverbs.h      |    9 +
 drivers/infiniband/core/uverbs_cmd.c  |    8 +
 drivers/infiniband/core/uverbs_main.c |  324 +++++++++++++++++++++++++++------
 include/rdma/ib_verbs.h               |    2 +
 4 files changed, 287 insertions(+), 56 deletions(-)

diff --git a/drivers/infiniband/core/uverbs.h b/drivers/infiniband/core/uverbs.h
index 643c08a..e485e67 100644
--- a/drivers/infiniband/core/uverbs.h
+++ b/drivers/infiniband/core/uverbs.h
@@ -94,6 +94,12 @@ struct ib_uverbs_device {
 	struct cdev			        cdev;
 	struct rb_root				xrcd_tree;
 	struct mutex				xrcd_tree_mutex;
+	struct mutex				disassociate_mutex; /* protect lists of files. */
+	int					disassociated;
+	int					disassociated_supported;
+	struct srcu_struct			disassociate_srcu;
+	struct list_head			uverbs_file_list;
+	struct list_head			uverbs_events_file_list;
 };
 
 struct ib_uverbs_event_file {
@@ -105,6 +111,7 @@ struct ib_uverbs_event_file {
 	wait_queue_head_t			poll_wait;
 	struct fasync_struct		       *async_queue;
 	struct list_head			event_list;
+	struct list_head			list;
 };
 
 struct ib_uverbs_file {
@@ -114,6 +121,8 @@ struct ib_uverbs_file {
 	struct ib_ucontext		       *ucontext;
 	struct ib_event_handler			event_handler;
 	struct ib_uverbs_event_file	       *async_file;
+	struct list_head			list;
+	int					fatal_event_raised;
 };
 
 struct ib_uverbs_event {
diff --git a/drivers/infiniband/core/uverbs_cmd.c b/drivers/infiniband/core/uverbs_cmd.c
index 5ba2a86..0b19361 100644
--- a/drivers/infiniband/core/uverbs_cmd.c
+++ b/drivers/infiniband/core/uverbs_cmd.c
@@ -38,6 +38,7 @@
 #include <linux/slab.h>
 
 #include <asm/uaccess.h>
+#include <linux/sched.h>
 
 #include "uverbs.h"
 #include "core_priv.h"
@@ -326,6 +327,7 @@ ssize_t ib_uverbs_get_context(struct ib_uverbs_file *file,
 	INIT_LIST_HEAD(&ucontext->xrcd_list);
 	INIT_LIST_HEAD(&ucontext->rule_list);
 	ucontext->closing = 0;
+	ucontext->tgid = get_task_pid(current->group_leader, PIDTYPE_PID);
 
 	resp.num_comp_vectors = file->device->num_comp_vectors;
 
@@ -1286,6 +1288,12 @@ ssize_t ib_uverbs_create_comp_channel(struct ib_uverbs_file *file,
 		return -EFAULT;
 	}
 
+	/* Taking ref count on uverbs_file to make sure that file won't be freed till
+	  * that event file is closed. It will enable accessing the uverbs_device fields as part of
+	  * closing the events file and making sure that uverbs device is available by that time as well.
+	  * Note: similar is already done for the async event file.
+	*/
+	kref_get(&file->ref);
 	fd_install(resp.fd, filp);
 	return in_len;
 }
diff --git a/drivers/infiniband/core/uverbs_main.c b/drivers/infiniband/core/uverbs_main.c
index 71ab83f..1d33e46 100644
--- a/drivers/infiniband/core/uverbs_main.c
+++ b/drivers/infiniband/core/uverbs_main.c
@@ -133,7 +133,12 @@ static void ib_uverbs_release_dev(struct kref *ref)
 	struct ib_uverbs_device *dev =
 		container_of(ref, struct ib_uverbs_device, ref);
 
-	complete(&dev->comp);
+	if (dev->disassociated) {
+		cleanup_srcu_struct(&dev->disassociate_srcu);
+		kfree(dev);
+	} else {
+		complete(&dev->comp);
+	}
 }
 
 static void ib_uverbs_release_event_file(struct kref *ref)
@@ -296,6 +301,7 @@ static int ib_uverbs_cleanup_ucontext(struct ib_uverbs_file *file,
 		kfree(uobj);
 	}
 
+	put_pid(context->tgid);
 	return context->device->dealloc_ucontext(context);
 }
 
@@ -304,7 +310,9 @@ static void ib_uverbs_release_file(struct kref *ref)
 	struct ib_uverbs_file *file =
 		container_of(ref, struct ib_uverbs_file, ref);
 
-	module_put(file->device->ib_dev->owner);
+	if (!file->device->disassociated_supported)
+		module_put(file->device->ib_dev->owner);
+
 	kref_put(&file->device->ref, ib_uverbs_release_dev);
 
 	kfree(file);
@@ -327,9 +335,15 @@ static ssize_t ib_uverbs_event_read(struct file *filp, char __user *buf,
 			return -EAGAIN;
 
 		if (wait_event_interruptible(file->poll_wait,
-					     !list_empty(&file->event_list)))
+					     (!list_empty(&file->event_list) ||
+					     file->uverbs_file->device->disassociated)))
+			/* will reach here in case signal has occoured */
 			return -ERESTARTSYS;
 
+		/* We reach here once list is not empty or once device was disassociated */
+		if (list_empty(&file->event_list) && file->uverbs_file->device->disassociated)
+			return -EIO;
+
 		spin_lock_irq(&file->lock);
 	}
 
@@ -402,12 +416,17 @@ static int ib_uverbs_event_close(struct inode *inode, struct file *filp)
 	}
 	spin_unlock_irq(&file->lock);
 
-	if (file->is_async) {
-		ib_unregister_event_handler(&file->uverbs_file->event_handler);
-		kref_put(&file->uverbs_file->ref, ib_uverbs_release_file);
+	mutex_lock(&file->uverbs_file->device->disassociate_mutex);
+	if (!file->uverbs_file->device->disassociated) {
+		list_del(&file->list);
+		if (file->is_async)
+			ib_unregister_event_handler(&file->uverbs_file->event_handler);
 	}
-	kref_put(&file->ref, ib_uverbs_release_event_file);
 
+	mutex_unlock(&file->uverbs_file->device->disassociate_mutex);
+
+	kref_put(&file->uverbs_file->ref, ib_uverbs_release_file);
+	kref_put(&file->ref, ib_uverbs_release_event_file);
 	return 0;
 }
 
@@ -533,6 +552,12 @@ void ib_uverbs_event_handler(struct ib_event_handler *handler,
 	struct ib_uverbs_file *file =
 		container_of(handler, struct ib_uverbs_file, event_handler);
 
+	if (event->event == IB_EVENT_DEVICE_FATAL) {
+		if (file->fatal_event_raised)
+			return;
+		file->fatal_event_raised = 1;
+	}
+
 	ib_uverbs_async_handler(file, event->element.port_num, event->event,
 				NULL, NULL);
 }
@@ -543,7 +568,7 @@ struct file *ib_uverbs_alloc_event_file(struct ib_uverbs_file *uverbs_file,
 	struct ib_uverbs_event_file *ev_file;
 	struct file *filp;
 
-	ev_file = kmalloc(sizeof *ev_file, GFP_KERNEL);
+	ev_file = kzalloc(sizeof *ev_file, GFP_KERNEL);
 	if (!ev_file)
 		return ERR_PTR(-ENOMEM);
 
@@ -558,10 +583,24 @@ struct file *ib_uverbs_alloc_event_file(struct ib_uverbs_file *uverbs_file,
 
 	filp = anon_inode_getfile("[infinibandevent]", &uverbs_event_fops,
 				  ev_file, O_RDONLY);
-	if (IS_ERR(filp))
+	if (IS_ERR(filp)) {
 		kfree(ev_file);
+		return filp;
+	}
+
+	mutex_lock(&uverbs_file->device->disassociate_mutex);
+	if (!uverbs_file->device->disassociated) {
+		list_add_tail(&ev_file->list, &uverbs_file->device->uverbs_events_file_list);
+		mutex_unlock(&uverbs_file->device->disassociate_mutex);
+
+		return filp;
+	}
+
+	mutex_unlock(&uverbs_file->device->disassociate_mutex);
 
-	return filp;
+	fput(filp);
+	kfree(ev_file);
+	return ERR_PTR(-EIO);
 }
 
 /*
@@ -599,6 +638,8 @@ static ssize_t ib_uverbs_write(struct file *filp, const char __user *buf,
 	struct ib_uverbs_file *file = filp->private_data;
 	struct ib_uverbs_cmd_hdr hdr;
 	__u32 flags;
+	int srcu_key;
+	ssize_t ret;
 
 	if (count < sizeof hdr)
 		return -EINVAL;
@@ -606,6 +647,12 @@ static ssize_t ib_uverbs_write(struct file *filp, const char __user *buf,
 	if (copy_from_user(&hdr, buf, sizeof hdr))
 		return -EFAULT;
 
+	srcu_key = srcu_read_lock(&file->device->disassociate_srcu);
+	if (file->device->disassociated) {
+		ret = -EIO;
+		goto out;
+	}
+
 	flags = (hdr.command &
 		 IB_USER_VERBS_CMD_FLAGS_MASK) >> IB_USER_VERBS_CMD_FLAGS_SHIFT;
 
@@ -613,26 +660,36 @@ static ssize_t ib_uverbs_write(struct file *filp, const char __user *buf,
 		__u32 command;
 
 		if (hdr.command & ~(__u32)(IB_USER_VERBS_CMD_FLAGS_MASK |
-					   IB_USER_VERBS_CMD_COMMAND_MASK))
-			return -EINVAL;
+					   IB_USER_VERBS_CMD_COMMAND_MASK)) {
+			ret = -EINVAL;
+			goto out;
+		}
 
 		command = hdr.command & IB_USER_VERBS_CMD_COMMAND_MASK;
 
 		if (command >= ARRAY_SIZE(uverbs_cmd_table) ||
-		    !uverbs_cmd_table[command])
-			return -EINVAL;
+		    !uverbs_cmd_table[command]) {
+			ret = -EINVAL;
+			goto out;
+		}
 
 		if (!file->ucontext &&
-		    command != IB_USER_VERBS_CMD_GET_CONTEXT)
-			return -EINVAL;
+		    command != IB_USER_VERBS_CMD_GET_CONTEXT) {
+			ret = -EINVAL;
+			goto out;
+		}
 
-		if (!(file->device->ib_dev->uverbs_cmd_mask & (1ull << command)))
-			return -ENOSYS;
+		if (!(file->device->ib_dev->uverbs_cmd_mask & (1ull << command))) {
+			ret = -ENOSYS;
+			goto out;
+		}
 
-		if (hdr.in_words * 4 != count)
-			return -EINVAL;
+		if (hdr.in_words * 4 != count) {
+			ret = -EINVAL;
+			goto out;
+		}
 
-		return uverbs_cmd_table[command](file,
+		ret = uverbs_cmd_table[command](file,
 						 buf + sizeof(hdr),
 						 hdr.in_words * 4,
 						 hdr.out_words * 4);
@@ -647,47 +704,69 @@ static ssize_t ib_uverbs_write(struct file *filp, const char __user *buf,
 		size_t written_count = count;
 
 		if (hdr.command & ~(__u32)(IB_USER_VERBS_CMD_FLAGS_MASK |
-					   IB_USER_VERBS_CMD_COMMAND_MASK))
-			return -EINVAL;
+					   IB_USER_VERBS_CMD_COMMAND_MASK)) {
+			ret = -EINVAL;
+			goto out;
+		}
 
 		command = hdr.command & IB_USER_VERBS_CMD_COMMAND_MASK;
 
 		if (command >= ARRAY_SIZE(uverbs_ex_cmd_table) ||
-		    !uverbs_ex_cmd_table[command])
-			return -ENOSYS;
+		    !uverbs_ex_cmd_table[command]) {
+			ret = -ENOSYS;
+			goto out;
+		}
 
-		if (!file->ucontext)
-			return -EINVAL;
+		if (!file->ucontext) {
+			ret = -EINVAL;
+			goto out;
+		}
 
-		if (!(file->device->ib_dev->uverbs_ex_cmd_mask & (1ull << command)))
-			return -ENOSYS;
+		if (!(file->device->ib_dev->uverbs_ex_cmd_mask & (1ull << command))) {
+			ret = -ENOSYS;
+			goto out;
+		}
 
-		if (count < (sizeof(hdr) + sizeof(ex_hdr)))
-			return -EINVAL;
+		if (count < (sizeof(hdr) + sizeof(ex_hdr))) {
+			ret = -EINVAL;
+			goto out;
+		}
 
-		if (copy_from_user(&ex_hdr, buf + sizeof(hdr), sizeof(ex_hdr)))
-			return -EFAULT;
+		if (copy_from_user(&ex_hdr, buf + sizeof(hdr), sizeof(ex_hdr))) {
+			ret = -EFAULT;
+			goto out;
+		}
 
 		count -= sizeof(hdr) + sizeof(ex_hdr);
 		buf += sizeof(hdr) + sizeof(ex_hdr);
 
-		if ((hdr.in_words + ex_hdr.provider_in_words) * 8 != count)
-			return -EINVAL;
+		if ((hdr.in_words + ex_hdr.provider_in_words) * 8 != count) {
+			ret = -EINVAL;
+			goto out;
+		}
 
-		if (ex_hdr.cmd_hdr_reserved)
-			return -EINVAL;
+		if (ex_hdr.cmd_hdr_reserved) {
+			ret = -EINVAL;
+			goto out;
+		}
 
 		if (ex_hdr.response) {
-			if (!hdr.out_words && !ex_hdr.provider_out_words)
-				return -EINVAL;
+			if (!hdr.out_words && !ex_hdr.provider_out_words) {
+				ret = -EINVAL;
+				goto out;
+			}
 
 			if (!access_ok(VERIFY_WRITE,
 				       (void __user *) (unsigned long) ex_hdr.response,
-				       (hdr.out_words + ex_hdr.provider_out_words) * 8))
-				return -EFAULT;
+				       (hdr.out_words + ex_hdr.provider_out_words) * 8)) {
+				ret = -EFAULT;
+				goto out;
+			}
 		} else {
-			if (hdr.out_words || ex_hdr.provider_out_words)
-				return -EINVAL;
+			if (hdr.out_words || ex_hdr.provider_out_words) {
+				ret = -EINVAL;
+				goto out;
+			}
 		}
 
 		INIT_UDATA_BUF_OR_NULL(&ucore, buf, (unsigned long) ex_hdr.response,
@@ -704,22 +783,37 @@ static ssize_t ib_uverbs_write(struct file *filp, const char __user *buf,
 						   &uhw);
 
 		if (err)
-			return err;
-
-		return written_count;
+			ret = err;
+		else
+			ret = written_count;
+	} else {
+		ret = -ENOSYS;
 	}
 
-	return -ENOSYS;
+out:
+	srcu_read_unlock(&file->device->disassociate_srcu, srcu_key);
+	return ret;
 }
 
 static int ib_uverbs_mmap(struct file *filp, struct vm_area_struct *vma)
 {
 	struct ib_uverbs_file *file = filp->private_data;
+	int ret = 0;
+	int srcu_key;
+
+	srcu_key = srcu_read_lock(&file->device->disassociate_srcu);
+	if (file->device->disassociated) {
+		ret = -EIO;
+		goto out;
+	}
 
 	if (!file->ucontext)
-		return -ENODEV;
+		ret = -ENODEV;
 	else
-		return file->device->ib_dev->mmap(file->ucontext, vma);
+		ret = file->device->ib_dev->mmap(file->ucontext, vma);
+out:
+	srcu_read_unlock(&file->device->disassociate_srcu, srcu_key);
+	return ret;
 }
 
 /*
@@ -737,6 +831,7 @@ static int ib_uverbs_open(struct inode *inode, struct file *filp)
 	struct ib_uverbs_device *dev;
 	struct ib_uverbs_file *file;
 	int ret;
+	int module_dependent;
 
 	dev = container_of(inode->i_cdev, struct ib_uverbs_device, cdev);
 	if (dev)
@@ -744,15 +839,31 @@ static int ib_uverbs_open(struct inode *inode, struct file *filp)
 	else
 		return -ENXIO;
 
-	if (!try_module_get(dev->ib_dev->owner)) {
-		ret = -ENODEV;
+	mutex_lock(&dev->disassociate_mutex);
+	if (dev->disassociated) {
+		ret = -EIO;
 		goto err;
 	}
 
-	file = kmalloc(sizeof *file, GFP_KERNEL);
+	/* In case IB device supports disassociate ucontext, there is no hard
+	 * dependency between uverbs device and its low level device.
+	 */
+	module_dependent = !dev->disassociated_supported;
+
+	if (module_dependent) {
+		if (!try_module_get(dev->ib_dev->owner)) {
+			ret = -ENODEV;
+			goto err;
+		}
+	}
+
+	file = kzalloc(sizeof *file, GFP_KERNEL);
 	if (!file) {
 		ret = -ENOMEM;
-		goto err_module;
+		if (module_dependent)
+			goto err_module;
+
+		goto err;
 	}
 
 	file->device	 = dev;
@@ -762,6 +873,8 @@ static int ib_uverbs_open(struct inode *inode, struct file *filp)
 	mutex_init(&file->mutex);
 
 	filp->private_data = file;
+	list_add_tail(&file->list, &dev->uverbs_file_list);
+	mutex_unlock(&dev->disassociate_mutex);
 
 	return nonseekable_open(inode, filp);
 
@@ -769,6 +882,7 @@ err_module:
 	module_put(dev->ib_dev->owner);
 
 err:
+	mutex_unlock(&dev->disassociate_mutex);
 	kref_put(&dev->ref, ib_uverbs_release_dev);
 	return ret;
 }
@@ -776,9 +890,26 @@ err:
 static int ib_uverbs_close(struct inode *inode, struct file *filp)
 {
 	struct ib_uverbs_file *file = filp->private_data;
+	struct ib_ucontext *ucontext = NULL;
+	int srcu_key;
+
+	srcu_key = srcu_read_lock(&file->device->disassociate_srcu);
+	mutex_lock(&file->device->disassociate_mutex);
+	if (!file->device->disassociated) {
+		/* No need to remove from the list once alreday disassociated. Try
+		 * doing that might race with ib_uverbs_free_hw_resources as mutex
+		 * is not held by that time.
+		*/
+		list_del(&file->list);
+		ucontext = file->ucontext;
+	}
 
-	ib_uverbs_cleanup_ucontext(file, file->ucontext);
+	mutex_unlock(&file->device->disassociate_mutex);
 
+	if (ucontext)
+		ib_uverbs_cleanup_ucontext(file, ucontext);
+
+	srcu_read_unlock(&file->device->disassociate_srcu, srcu_key);
 	if (file->async_file)
 		kref_put(&file->async_file->ref, ib_uverbs_release_event_file);
 
@@ -870,6 +1001,7 @@ static void ib_uverbs_add_one(struct ib_device *device)
 	int devnum;
 	dev_t base;
 	struct ib_uverbs_device *uverbs_dev;
+	int ret;
 
 	if (!device->alloc_ucontext)
 		return;
@@ -882,6 +1014,13 @@ static void ib_uverbs_add_one(struct ib_device *device)
 	init_completion(&uverbs_dev->comp);
 	uverbs_dev->xrcd_tree = RB_ROOT;
 	mutex_init(&uverbs_dev->xrcd_tree_mutex);
+	mutex_init(&uverbs_dev->disassociate_mutex);
+	ret = init_srcu_struct(&uverbs_dev->disassociate_srcu);
+	if (ret)
+		goto err_init;
+
+	INIT_LIST_HEAD(&uverbs_dev->uverbs_file_list);
+	INIT_LIST_HEAD(&uverbs_dev->uverbs_events_file_list);
 
 	spin_lock(&map_lock);
 	devnum = find_first_zero_bit(dev_map, IB_UVERBS_MAX_DEVICES);
@@ -923,6 +1062,7 @@ static void ib_uverbs_add_one(struct ib_device *device)
 	if (device_create_file(uverbs_dev->dev, &dev_attr_abi_version))
 		goto err_class;
 
+	uverbs_dev->disassociated_supported = device->disassociate_ucontext ? 1 : 0;
 	ib_set_client_data(device, &uverbs_client, uverbs_dev);
 
 	return;
@@ -938,15 +1078,69 @@ err_cdev:
 		clear_bit(devnum, overflow_map);
 
 err:
+	cleanup_srcu_struct(&uverbs_dev->disassociate_srcu);
+
+err_init:
 	kref_put(&uverbs_dev->ref, ib_uverbs_release_dev);
 	wait_for_completion(&uverbs_dev->comp);
 	kfree(uverbs_dev);
 	return;
 }
 
+static void ib_uverbs_free_hw_resources(struct ib_uverbs_device *uverbs_dev)
+{
+	struct ib_uverbs_file *file, *tmp_file;
+	struct ib_uverbs_event_file *event_file, *tmp_event_file;
+	struct ib_event event;
+
+	mutex_lock(&uverbs_dev->disassociate_mutex);
+	uverbs_dev->disassociated = 1;
+	/* We must release the mutex before going ahead and calling disassociate_ucontext as a nested call to
+	  * uverbs_close might be called as a result of freeing the resources (e.g mmput).
+	  * In addition, we should take an extra ref count on files to prevent them being freed as
+	  * part of parallel file closing, from other task or event internally from that one.
+	*/
+	list_for_each_entry(file, &uverbs_dev->uverbs_file_list, list)
+		kref_get(&file->ref);
+	list_for_each_entry(event_file, &uverbs_dev->uverbs_events_file_list, list)
+		kref_get(&event_file->ref);
+	mutex_unlock(&uverbs_dev->disassociate_mutex);
+
+	/* pending running commands to terminate */
+	synchronize_srcu(&uverbs_dev->disassociate_srcu);
+	event.event = IB_EVENT_DEVICE_FATAL;
+	event.element.port_num = 0;
+	event.device = uverbs_dev->ib_dev;
+
+	list_for_each_entry(file, &uverbs_dev->uverbs_file_list, list) {
+		ib_uverbs_event_handler(&file->event_handler, &event);
+		uverbs_dev->ib_dev->disassociate_ucontext(file->ucontext);
+		ib_uverbs_cleanup_ucontext(file, file->ucontext);
+	}
+
+	list_for_each_entry(event_file, &uverbs_dev->uverbs_events_file_list, list) {
+		if (event_file->is_async) {
+			/* ib_device is freed once that function/remove_one is finished,
+			  * must unregister the event handler before.
+			*/
+			ib_unregister_event_handler(&event_file->uverbs_file->event_handler);
+		}
+
+		wake_up_interruptible(&event_file->poll_wait);
+		kill_fasync(&event_file->async_queue, SIGIO, POLL_IN);
+	}
+
+	/* we need here a safe iterator as file might be freed as part of loop */
+	list_for_each_entry_safe(file, tmp_file, &uverbs_dev->uverbs_file_list, list)
+		kref_put(&file->ref, ib_uverbs_release_file);
+
+	list_for_each_entry_safe(event_file, tmp_event_file, &uverbs_dev->uverbs_events_file_list, list)
+		kref_put(&event_file->ref, ib_uverbs_release_event_file);
+}
 static void ib_uverbs_remove_one(struct ib_device *device)
 {
 	struct ib_uverbs_device *uverbs_dev = ib_get_client_data(device, &uverbs_client);
+	int wait_clients = 1;
 
 	if (!uverbs_dev)
 		return;
@@ -960,9 +1154,27 @@ static void ib_uverbs_remove_one(struct ib_device *device)
 	else
 		clear_bit(uverbs_dev->devnum - IB_UVERBS_MAX_DEVICES, overflow_map);
 
+	if (uverbs_dev->disassociated_supported) {
+		/* We disassociate HW resources and immediately returning, not pending to active
+		  * userspace client. Upon returning ib_device may be freed internally
+		  * and is not valid any more.
+		  * uverbs_device is still available, when all clients close their files, the uverbs device ref count will be zero,
+		  * and its resources will be freed.
+		  * Note: At that step no more files can be opened on that cdev, as it was deleted,
+		  * however active clients can still issue commands and close their open files.
+		*/
+		ib_uverbs_free_hw_resources(uverbs_dev);
+		wait_clients = 0;
+		/* ib device can no longer be accessed. It is freed when this procedure returns. */
+		uverbs_dev->ib_dev = NULL;
+	}
+	/* ref count taken as part of add one is put back in both modes.*/
 	kref_put(&uverbs_dev->ref, ib_uverbs_release_dev);
-	wait_for_completion(&uverbs_dev->comp);
-	kfree(uverbs_dev);
+	if (wait_clients) {
+		wait_for_completion(&uverbs_dev->comp);
+		cleanup_srcu_struct(&uverbs_dev->disassociate_srcu);
+		kfree(uverbs_dev);
+	}
 }
 
 static char *uverbs_devnode(struct device *dev, umode_t *mode)
diff --git a/include/rdma/ib_verbs.h b/include/rdma/ib_verbs.h
index 470a011..da5904b 100644
--- a/include/rdma/ib_verbs.h
+++ b/include/rdma/ib_verbs.h
@@ -1127,6 +1127,7 @@ struct ib_ucontext {
 	struct list_head	xrcd_list;
 	struct list_head	rule_list;
 	int			closing;
+	struct pid             *tgid;
 };
 
 struct ib_uobject {
@@ -1607,6 +1608,7 @@ struct ib_device {
 	int			   (*destroy_flow)(struct ib_flow *flow_id);
 	int			   (*check_mr_status)(struct ib_mr *mr, u32 check_mask,
 						      struct ib_mr_status *mr_status);
+	void			   (*disassociate_ucontext)(struct ib_ucontext *ibcontext);
 
 	struct ib_dma_mapping_ops   *dma_ops;
 
-- 
1.7.1

--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [PATCH V1 for-next 2/2] IB/mlx4_ib: Disassociate support
       [not found] ` <1416312682-7899-1-git-send-email-yishaih-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
  2014-11-18 12:11   ` [PATCH V1 for-next 1/2] IB/uverbs: Enable device removal when there are active user space applications Yishai Hadas
@ 2014-11-18 12:11   ` Yishai Hadas
       [not found]     ` <1416312682-7899-3-git-send-email-yishaih-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
  2014-11-18 14:43   ` [PATCH V1 for-next 0/2] HW Device hot-removal support Or Gerlitz
  2 siblings, 1 reply; 19+ messages in thread
From: Yishai Hadas @ 2014-11-18 12:11 UTC (permalink / raw)
  To: roland-DgEjT+Ai2ygdnm+yROfE0A
  Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA,
	yishaih-VPRAkNaXOzVWk0Htik3J/w, jackm-VPRAkNaXOzVWk0Htik3J/w

Implements the IB core disassociate_ucontext API. The driver detaches the HW
resources for a given user context to prevent a dependency between application
termination and device disconnecting. This is done by managing the VMAs that
were mapped to the HW bars such as door bell and blueflame. When need to detach
remap them to an arbitrary kernel page returned by the zap API.

Signed-off-by: Yishai Hadas <yishaih-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
Signed-off-by: Jack Morgenstein <jackm-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>

---
 drivers/infiniband/hw/mlx4/main.c    |  119 +++++++++++++++++++++++++++++++++-
 drivers/infiniband/hw/mlx4/mlx4_ib.h |   12 ++++
 2 files changed, 130 insertions(+), 1 deletions(-)

diff --git a/drivers/infiniband/hw/mlx4/main.c b/drivers/infiniband/hw/mlx4/main.c
index bda5994..76151b2 100644
--- a/drivers/infiniband/hw/mlx4/main.c
+++ b/drivers/infiniband/hw/mlx4/main.c
@@ -645,7 +645,7 @@ static struct ib_ucontext *mlx4_ib_alloc_ucontext(struct ib_device *ibdev,
 		resp.cqe_size	      = dev->dev->caps.cqe_size;
 	}
 
-	context = kmalloc(sizeof *context, GFP_KERNEL);
+	context = kzalloc(sizeof *context, GFP_KERNEL);
 	if (!context)
 		return ERR_PTR(-ENOMEM);
 
@@ -682,21 +682,134 @@ static int mlx4_ib_dealloc_ucontext(struct ib_ucontext *ibcontext)
 	return 0;
 }
 
+static void  mlx4_ib_vma_open(struct vm_area_struct *area)
+{
+	/* vma_open is called when a new VMA is created on top of our VMA.
+	 * This is done through either mremap flow or split_vma (usually due to mlock,
+	 * madvise, munmap, etc.) We do not support a clone of the vma, as this VMA is
+	 * strongly hardware related.  Therefore we set the vm_ops of the newly
+	 * created/cloned VMA to NULL, to prevent it from calling us again and trying
+	 * to do incorrect actions.  We assume that the original vma size is exactly a
+	 * single page that there will be no "splitting" operations on.
+	 */
+	area->vm_ops = NULL;
+}
+
+static void  mlx4_ib_vma_close(struct vm_area_struct *area)
+{
+	struct mlx4_ib_vma_private_data *mlx4_ib_vma_priv_data;
+
+	/* It's guaranteed that all VMAs opened on a FD are closed before the
+	 * file itself is closed, therefore no sync is needed with the regular closing
+	 * flow. (e.g. mlx4_ib_dealloc_ucontext) However need a sync with accessing the
+	 * vma as part of mlx4_ib_disassociate_ucontext.  The close operation is
+	 * usually called under mm->mmap_sem except when process is exiting.  The
+	 * exiting case is handled explicitly as part of mlx4_ib_disassociate_ucontext.
+	 */
+	mlx4_ib_vma_priv_data = (struct mlx4_ib_vma_private_data *)area->vm_private_data;
+
+	/* set the vma context pointer to null in the mlx4_ib driver's private data,
+	 * to protect against a race condition in mlx4_ib_dissassociate_ucontext().
+	 */
+	mlx4_ib_vma_priv_data->vma = NULL;
+}
+
+static const struct vm_operations_struct mlx4_ib_vm_ops = {
+	.open = mlx4_ib_vma_open,
+	.close = mlx4_ib_vma_close
+};
+
+static void mlx4_ib_disassociate_ucontext(struct ib_ucontext *ibcontext)
+{
+	int i;
+	int ret = 0;
+	struct vm_area_struct *vma;
+	struct mlx4_ib_ucontext *context = to_mucontext(ibcontext);
+	struct task_struct *owning_process  = NULL;
+	struct mm_struct   *owning_mm       = NULL;
+
+	owning_process = get_pid_task(ibcontext->tgid, PIDTYPE_PID);
+	if (!owning_process)
+		return;
+
+	owning_mm = get_task_mm(owning_process);
+	if (!owning_mm) {
+		pr_info("no mm, disassociate ucontext is pending task termination\n");
+		while (1) {
+			/* make sure that task is dead before returning, it may prevent a rare case
+			 * of module down in parallel to a call to mlx4_ib_vma_close.
+			 */
+			put_task_struct(owning_process);
+			msleep(1);
+			owning_process = get_pid_task(ibcontext->tgid, PIDTYPE_PID);
+			if (!owning_process || owning_process->state == TASK_DEAD) {
+				pr_info("disassociate ucontext done, task was terminated\n");
+				/* in case task was dead need to release the task struct */
+				if (owning_process)
+					put_task_struct(owning_process);
+				return;
+			}
+		}
+	}
+
+	/* need to protect from a race on closing the vma as part of mlx4_ib_vma_close */
+	down_read(&owning_mm->mmap_sem);
+	for (i = 0; i < HW_BAR_COUNT; i++) {
+		vma = context->hw_bar_info[i].vma;
+		if (!vma)
+			continue;
+
+		ret = zap_vma_ptes(context->hw_bar_info[i].vma, context->hw_bar_info[i].vma->vm_start,
+				   PAGE_SIZE);
+		if (ret) {
+			pr_err("Error: zap_vma_ptes failed for index=%d, ret=%d\n", i, ret);
+			BUG_ON(1);
+		}
+
+		/* context going to be destroyed, should not access ops any more */
+		context->hw_bar_info[i].vma->vm_ops = NULL;
+	}
+
+	up_read(&owning_mm->mmap_sem);
+	mmput(owning_mm);
+	put_task_struct(owning_process);
+}
+
+static void mlx4_ib_set_vma_data(struct vm_area_struct *vma,
+				 struct mlx4_ib_vma_private_data *vma_private_data)
+{
+	vma_private_data->vma = vma;
+	vma->vm_private_data = vma_private_data;
+	vma->vm_ops =  &mlx4_ib_vm_ops;
+}
+
 static int mlx4_ib_mmap(struct ib_ucontext *context, struct vm_area_struct *vma)
 {
 	struct mlx4_ib_dev *dev = to_mdev(context->device);
+	struct mlx4_ib_ucontext *mucontext = to_mucontext(context);
 
 	if (vma->vm_end - vma->vm_start != PAGE_SIZE)
 		return -EINVAL;
 
 	if (vma->vm_pgoff == 0) {
+		/* We prevent double mmaping on same context */
+		if (mucontext->hw_bar_info[HW_BAR_DB].vma != NULL)
+			return -EINVAL;
+
 		vma->vm_page_prot = pgprot_noncached(vma->vm_page_prot);
 
 		if (io_remap_pfn_range(vma, vma->vm_start,
 				       to_mucontext(context)->uar.pfn,
 				       PAGE_SIZE, vma->vm_page_prot))
 			return -EAGAIN;
+
+		mlx4_ib_set_vma_data(vma, &mucontext->hw_bar_info[HW_BAR_DB]);
+
 	} else if (vma->vm_pgoff == 1 && dev->dev->caps.bf_reg_size != 0) {
+		/* We prevent double mmaping on same context */
+		if (mucontext->hw_bar_info[HW_BAR_BF].vma != NULL)
+			return -EINVAL;
+
 		vma->vm_page_prot = pgprot_writecombine(vma->vm_page_prot);
 
 		if (io_remap_pfn_range(vma, vma->vm_start,
@@ -704,6 +817,9 @@ static int mlx4_ib_mmap(struct ib_ucontext *context, struct vm_area_struct *vma)
 				       dev->dev->caps.num_uars,
 				       PAGE_SIZE, vma->vm_page_prot))
 			return -EAGAIN;
+
+		mlx4_ib_set_vma_data(vma, &mucontext->hw_bar_info[HW_BAR_BF]);
+
 	} else
 		return -EINVAL;
 
@@ -2155,6 +2271,7 @@ static void *mlx4_ib_add(struct mlx4_dev *dev)
 	ibdev->ib_dev.attach_mcast	= mlx4_ib_mcg_attach;
 	ibdev->ib_dev.detach_mcast	= mlx4_ib_mcg_detach;
 	ibdev->ib_dev.process_mad	= mlx4_ib_process_mad;
+	ibdev->ib_dev.disassociate_ucontext = mlx4_ib_disassociate_ucontext;
 
 	if (!mlx4_is_slave(ibdev->dev)) {
 		ibdev->ib_dev.alloc_fmr		= mlx4_ib_fmr_alloc;
diff --git a/drivers/infiniband/hw/mlx4/mlx4_ib.h b/drivers/infiniband/hw/mlx4/mlx4_ib.h
index 6eb743f..e5d2e7b 100644
--- a/drivers/infiniband/hw/mlx4/mlx4_ib.h
+++ b/drivers/infiniband/hw/mlx4/mlx4_ib.h
@@ -70,11 +70,23 @@ extern int mlx4_ib_sm_guid_assign;
 
 #define MLX4_IB_UC_STEER_QPN_ALIGN 1
 #define MLX4_IB_UC_MAX_NUM_QPS     256
+
+enum hw_bar_type {
+	HW_BAR_BF,
+	HW_BAR_DB,
+	HW_BAR_COUNT
+};
+
+struct mlx4_ib_vma_private_data {
+	struct vm_area_struct *vma;
+};
+
 struct mlx4_ib_ucontext {
 	struct ib_ucontext	ibucontext;
 	struct mlx4_uar		uar;
 	struct list_head	db_page_list;
 	struct mutex		db_page_mutex;
+	struct mlx4_ib_vma_private_data hw_bar_info[HW_BAR_COUNT];
 };
 
 struct mlx4_ib_pd {
-- 
1.7.1

--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply related	[flat|nested] 19+ messages in thread

* Re: [PATCH V1 for-next 1/2] IB/uverbs: Enable device removal when there are active user space applications
       [not found]     ` <1416312682-7899-2-git-send-email-yishaih-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
@ 2014-11-18 14:38       ` Or Gerlitz
       [not found]         ` <546B59E2.2050707-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
  2014-11-18 14:40       ` Or Gerlitz
  2014-11-19 14:49       ` Or Gerlitz
  2 siblings, 1 reply; 19+ messages in thread
From: Or Gerlitz @ 2014-11-18 14:38 UTC (permalink / raw)
  To: Yishai Hadas, roland-DgEjT+Ai2ygdnm+yROfE0A
  Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA, jackm-VPRAkNaXOzVWk0Htik3J/w

On 11/18/2014 2:11 PM, Yishai Hadas wrote:
> @@ -923,6 +1062,7 @@ static void ib_uverbs_add_one(struct ib_device *device)
>   	if (device_create_file(uverbs_dev->dev, &dev_attr_abi_version))
>   		goto err_class;
>   
> +	uverbs_dev->disassociated_supported = device->disassociate_ucontext ? 1 : 0;
>   	ib_set_client_data(device, &uverbs_client, uverbs_dev);
>   
>   	return;

please no, for object->yyy_supported flags (here and elsewhere too).

You can add IB device capability (say, named IB_DEVICE_DISASSOCIATE) and 
just look it up directly from uverbs throug the IB device pointer 
existing in the uverbs device object in the spots you need this (remove 
uverbs_dev->disassociated_supported field).

Or.
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH V1 for-next 1/2] IB/uverbs: Enable device removal when there are active user space applications
       [not found]     ` <1416312682-7899-2-git-send-email-yishaih-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
  2014-11-18 14:38       ` Or Gerlitz
@ 2014-11-18 14:40       ` Or Gerlitz
  2014-11-19 14:49       ` Or Gerlitz
  2 siblings, 0 replies; 19+ messages in thread
From: Or Gerlitz @ 2014-11-18 14:40 UTC (permalink / raw)
  To: Yishai Hadas
  Cc: roland-DgEjT+Ai2ygdnm+yROfE0A, linux-rdma-u79uwXL29TY76Z2rM5mHXA,
	jackm-VPRAkNaXOzVWk0Htik3J/w

On 11/18/2014 2:11 PM, Yishai Hadas wrote:
> @@ -599,6 +638,8 @@ static ssize_t ib_uverbs_write(struct file *filp, const char __user *buf,
>   	struct ib_uverbs_file *file = filp->private_data;
>   	struct ib_uverbs_cmd_hdr hdr;
>   	__u32 flags;
> +	int srcu_key;
> +	ssize_t ret;
>   
>   	if (count < sizeof hdr)
>   		return -EINVAL;
> @@ -606,6 +647,12 @@ static ssize_t ib_uverbs_write(struct file *filp, const char __user *buf,
>   	if (copy_from_user(&hdr, buf, sizeof hdr))
>   		return -EFAULT;
>   
> +	srcu_key = srcu_read_lock(&file->device->disassociate_srcu);
> +	if (file->device->disassociated) {
> +		ret = -EIO;
> +		goto out;
> +	}
> +
>   	flags = (hdr.command &
>   		 IB_USER_VERBS_CMD_FLAGS_MASK) >> IB_USER_VERBS_CMD_FLAGS_SHIFT;
>   
> @@ -613,26 +660,36 @@ static ssize_t ib_uverbs_write(struct file *filp, const char __user *buf,
>   		__u32 command;
>   
>   		if (hdr.command & ~(__u32)(IB_USER_VERBS_CMD_FLAGS_MASK |
> -					   IB_USER_VERBS_CMD_COMMAND_MASK))
> -			return -EINVAL;
> +					   IB_USER_VERBS_CMD_COMMAND_MASK)) {
> +			ret = -EINVAL;
> +			goto out;
> +		}

put this change and it's such in a pre-patch that does refactoring of 
the ib_uverbs_write() code, to make the volume and amount of changes 
introduced by this patch smaller

>   
>   		command = hdr.command & IB_USER_VERBS_CMD_COMMAND_MASK;
>   
>   		if (command >= ARRAY_SIZE(uverbs_cmd_table) ||
> -		    !uverbs_cmd_table[command])
> -			return -EINVAL;
> +		    !uverbs_cmd_table[command]) {
> +			ret = -EINVAL;
> +			goto out;
> +		}
>   
>   		if (!file->ucontext &&
> -		    command != IB_USER_VERBS_CMD_GET_CONTEXT)
> -			return -EINVAL;
> +		    command != IB_USER_VERBS_CMD_GET_CONTEXT) {
> +			ret = -EINVAL;
> +			goto out;
> +		}
>   
> -		if (!(file->device->ib_dev->uverbs_cmd_mask & (1ull << command)))
> -			return -ENOSYS;
> +		if (!(file->device->ib_dev->uverbs_cmd_mask & (1ull << command))) {
> +			ret = -ENOSYS;
> +			goto out;
> +		}
>   
> -		if (hdr.in_words * 4 != count)
> -			return -EINVAL;
> +		if (hdr.in_words * 4 != count) {
> +			ret = -EINVAL;
> +			goto out;
> +		}
>   
> -		return uverbs_cmd_table[command](file,
> +		ret = uverbs_cmd_table[command](file,
>   						 buf + sizeof(hdr),
>   						 hdr.in_words * 4,
>   						 hdr.out_words * 4);
> @@ -647,47 +704,69 @@ static ssize_t ib_uverbs_write(struct file *filp, const char __user *buf,
>   		size_t written_count = count;
>   
>   		if (hdr.command & ~(__u32)(IB_USER_VERBS_CMD_FLAGS_MASK |
> -					   IB_USER_VERBS_CMD_COMMAND_MASK))
> -			return -EINVAL;
> +					   IB_USER_VERBS_CMD_COMMAND_MASK)) {
> +			ret = -EINVAL;
> +			goto out;
> +		}
>   
>   		command = hdr.command & IB_USER_VERBS_CMD_COMMAND_MASK;
>   
>   		if (command >= ARRAY_SIZE(uverbs_ex_cmd_table) ||
> -		    !uverbs_ex_cmd_table[command])
> -			return -ENOSYS;
> +		    !uverbs_ex_cmd_table[command]) {
> +			ret = -ENOSYS;
> +			goto out;
> +		}
>   
> -		if (!file->ucontext)
> -			return -EINVAL;
> +		if (!file->ucontext) {
> +			ret = -EINVAL;
> +			goto out;
> +		}
>   
> -		if (!(file->device->ib_dev->uverbs_ex_cmd_mask & (1ull << command)))
> -			return -ENOSYS;
> +		if (!(file->device->ib_dev->uverbs_ex_cmd_mask & (1ull << command))) {
> +			ret = -ENOSYS;
> +			goto out;
> +		}
>   
> -		if (count < (sizeof(hdr) + sizeof(ex_hdr)))
> -			return -EINVAL;
> +		if (count < (sizeof(hdr) + sizeof(ex_hdr))) {
> +			ret = -EINVAL;
> +			goto out;
> +		}
>   
> -		if (copy_from_user(&ex_hdr, buf + sizeof(hdr), sizeof(ex_hdr)))
> -			return -EFAULT;
> +		if (copy_from_user(&ex_hdr, buf + sizeof(hdr), sizeof(ex_hdr))) {
> +			ret = -EFAULT;
> +			goto out;
> +		}
>   
>   		count -= sizeof(hdr) + sizeof(ex_hdr);
>   		buf += sizeof(hdr) + sizeof(ex_hdr);
>   
> -		if ((hdr.in_words + ex_hdr.provider_in_words) * 8 != count)
> -			return -EINVAL;
> +		if ((hdr.in_words + ex_hdr.provider_in_words) * 8 != count) {
> +			ret = -EINVAL;
> +			goto out;
> +		}
>   
> -		if (ex_hdr.cmd_hdr_reserved)
> -			return -EINVAL;
> +		if (ex_hdr.cmd_hdr_reserved) {
> +			ret = -EINVAL;
> +			goto out;
> +		}
>   
>   		if (ex_hdr.response) {
> -			if (!hdr.out_words && !ex_hdr.provider_out_words)
> -				return -EINVAL;
> +			if (!hdr.out_words && !ex_hdr.provider_out_words) {
> +				ret = -EINVAL;
> +				goto out;
> +			}
>   
>   			if (!access_ok(VERIFY_WRITE,
>   				       (void __user *) (unsigned long) ex_hdr.response,
> -				       (hdr.out_words + ex_hdr.provider_out_words) * 8))
> -				return -EFAULT;
> +				       (hdr.out_words + ex_hdr.provider_out_words) * 8)) {
> +				ret = -EFAULT;
> +				goto out;
> +			}
>   		} else {
> -			if (hdr.out_words || ex_hdr.provider_out_words)
> -				return -EINVAL;
> +			if (hdr.out_words || ex_hdr.provider_out_words) {
> +				ret = -EINVAL;
> +				goto out;
> +			}
>   		}
>   
>   		INIT_UDATA_BUF_OR_NULL(&ucore, buf, (unsigned long) ex_hdr.response,
> @@ -704,22 +783,37 @@ static ssize_t ib_uverbs_write(struct file *filp, const char __user *buf,
>   						   &uhw);
>   
>   		if (err)
> -			return err;
> -
> -		return written_count;
> +			ret = err;
> +		else
> +			ret = written_count;
> +	} else {
> +		ret = -ENOSYS;
>   	}
>   
> -	return -ENOSYS;
> +out:
> +	srcu_read_unlock(&file->device->disassociate_srcu, srcu_key);
> +	return ret;
>   }

--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH V1 for-next 2/2] IB/mlx4_ib: Disassociate support
       [not found]     ` <1416312682-7899-3-git-send-email-yishaih-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
@ 2014-11-18 14:42       ` Or Gerlitz
       [not found]         ` <546B5AE3.9060606-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
  0 siblings, 1 reply; 19+ messages in thread
From: Or Gerlitz @ 2014-11-18 14:42 UTC (permalink / raw)
  To: Yishai Hadas, roland-DgEjT+Ai2ygdnm+yROfE0A,
	jackm-VPRAkNaXOzVWk0Htik3J/w, Jack Morgenstein
  Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA, Shachar Raindel, Haggai Eran

On 11/18/2014 2:11 PM, Yishai Hadas wrote:
> Implements the IB core disassociate_ucontext API. The driver detaches the HW
> resources for a given user context to prevent a dependency between application
> termination and device disconnecting. This is done by managing the VMAs that
> were mapped to the HW bars such as door bell and blueflame. When need to detach
> remap them to an arbitrary kernel page returned by the zap API.
>
>

Guys,

Can we somehow make this patch generic (e.g land in the IB core) such 
that it can apply also for mlx5 (and other HW drivers...) basically, the 
HW driver should tell the IB core which pages to zap and we should be 
OK, isn't that?

Or.
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH V1 for-next 0/2] HW Device hot-removal support
       [not found] ` <1416312682-7899-1-git-send-email-yishaih-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
  2014-11-18 12:11   ` [PATCH V1 for-next 1/2] IB/uverbs: Enable device removal when there are active user space applications Yishai Hadas
  2014-11-18 12:11   ` [PATCH V1 for-next 2/2] IB/mlx4_ib: Disassociate support Yishai Hadas
@ 2014-11-18 14:43   ` Or Gerlitz
  2 siblings, 0 replies; 19+ messages in thread
From: Or Gerlitz @ 2014-11-18 14:43 UTC (permalink / raw)
  To: Yishai Hadas
  Cc: roland-DgEjT+Ai2ygdnm+yROfE0A, linux-rdma-u79uwXL29TY76Z2rM5mHXA,
	jackm-VPRAkNaXOzVWk0Htik3J/w

On 11/18/2014 2:11 PM, Yishai Hadas wrote:
> Changes from V0:
> patch #1: ib_uverbs_close, reduced mutex scope to enable tasks run in parallel.

Yishai, still some things to sort out here...1st, there must be away for 
us to come up with two patches that do not introduce 55 "line over 80 
characters" new checkaptch warnings on the code. Next, there are few 
more checkpatch warnings, please fix them up.

# ./scripts/checkpatch.pl --strict /tmp/disc_v1/* | grep WARN  | wc -l
59

# ./scripts/checkpatch.pl --strict /tmp/disc_v1/* | grep WARN  | grep 
"line over" | wc -l
55

# ./scripts/checkpatch.pl --strict /tmp/disc_v1/* | grep WARN  | sort -u
WARNING: line over 80 characters
WARNING: msleep < 20ms can sleep for up to 20ms; see 
Documentation/timers/timers-howto.txt
WARNING: sizeof *context should be sizeof(*context)
WARNING: sizeof *ev_file should be sizeof(*ev_file)
WARNING: sizeof *file should be sizeof(*file)
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH V1 for-next 1/2] IB/uverbs: Enable device removal when there are active user space applications
       [not found]         ` <546B59E2.2050707-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
@ 2014-11-18 16:24           ` Yishai Hadas
       [not found]             ` <546B72A1.2080403-LDSdmyG8hGV8YrgS2mwiifqBs+8SCbDb@public.gmane.org>
  0 siblings, 1 reply; 19+ messages in thread
From: Yishai Hadas @ 2014-11-18 16:24 UTC (permalink / raw)
  To: Or Gerlitz, Yishai Hadas, roland-DgEjT+Ai2ygdnm+yROfE0A
  Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA, jackm-VPRAkNaXOzVWk0Htik3J/w

On 11/18/2014 4:38 PM, Or Gerlitz wrote:
> On 11/18/2014 2:11 PM, Yishai Hadas wrote:
>> @@ -923,6 +1062,7 @@ static void ib_uverbs_add_one(struct ib_device
>> *device)
>>       if (device_create_file(uverbs_dev->dev, &dev_attr_abi_version))
>>           goto err_class;
>> +    uverbs_dev->disassociated_supported =
>> device->disassociate_ucontext ? 1 : 0;
>>       ib_set_client_data(device, &uverbs_client, uverbs_dev);
>>       return;
>
> please no, for object->yyy_supported flags (here and elsewhere too).
>
> You can add IB device capability (say, named IB_DEVICE_DISASSOCIATE) and
> just look it up directly from uverbs throug the IB device pointer
> existing in the uverbs device object in the spots you need this (remove
> uverbs_dev->disassociated_supported field).
>

The IB device pointer is not valid at all spots, for example after the 
HW Device was removed, that's why we set this information on the uverbs 
device.

> Or.
> --
> To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
> the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH V1 for-next 2/2] IB/mlx4_ib: Disassociate support
       [not found]         ` <546B5AE3.9060606-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
@ 2014-11-18 16:48           ` Steve Wise
  2014-11-18 17:14           ` Yishai Hadas
  1 sibling, 0 replies; 19+ messages in thread
From: Steve Wise @ 2014-11-18 16:48 UTC (permalink / raw)
  To: Or Gerlitz, Yishai Hadas, roland-DgEjT+Ai2ygdnm+yROfE0A,
	jackm-VPRAkNaXOzVWk0Htik3J/w, Jack Morgenstein
  Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA, Shachar Raindel, Haggai Eran

On 11/18/2014 8:42 AM, Or Gerlitz wrote:
> On 11/18/2014 2:11 PM, Yishai Hadas wrote:
>> Implements the IB core disassociate_ucontext API. The driver detaches 
>> the HW
>> resources for a given user context to prevent a dependency between 
>> application
>> termination and device disconnecting. This is done by managing the 
>> VMAs that
>> were mapped to the HW bars such as door bell and blueflame. When need 
>> to detach
>> remap them to an arbitrary kernel page returned by the zap API.
>>
>>
>
> Guys,
>
> Can we somehow make this patch generic (e.g land in the IB core) such 
> that it can apply also for mlx5 (and other HW drivers...) basically, 
> the HW driver should tell the IB core which pages to zap and we should 
> be OK, isn't that?
>

I agree.  cxgb* will want to support this also.

--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH V1 for-next 2/2] IB/mlx4_ib: Disassociate support
       [not found]         ` <546B5AE3.9060606-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
  2014-11-18 16:48           ` Steve Wise
@ 2014-11-18 17:14           ` Yishai Hadas
       [not found]             ` <546B7E73.40008-LDSdmyG8hGV8YrgS2mwiifqBs+8SCbDb@public.gmane.org>
  1 sibling, 1 reply; 19+ messages in thread
From: Yishai Hadas @ 2014-11-18 17:14 UTC (permalink / raw)
  To: Or Gerlitz, Yishai Hadas, roland-DgEjT+Ai2ygdnm+yROfE0A,
	jackm-VPRAkNaXOzVWk0Htik3J/w, Jack Morgenstein
  Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA, Shachar Raindel, Haggai Eran,
	liranl-VPRAkNaXOzVWk0Htik3J/w

On 11/18/2014 4:42 PM, Or Gerlitz wrote:
> On 11/18/2014 2:11 PM, Yishai Hadas wrote:
>> Implements the IB core disassociate_ucontext API. The driver detaches
>> the HW
>> resources for a given user context to prevent a dependency between
>> application
>> termination and device disconnecting. This is done by managing the
>> VMAs that
>> were mapped to the HW bars such as door bell and blueflame. When need
>> to detach
>> remap them to an arbitrary kernel page returned by the zap API.
>>
>>
>
> Guys,
>
> Can we somehow make this patch generic (e.g land in the IB core) such
> that it can apply also for mlx5 (and other HW drivers...) basically, the
> HW driver should tell the IB core which pages to zap and we should be
> OK, isn't that?

We introduced a generic API that asked the low level driver to detach a 
given ucontext from its HW resources. The specific driver implementation 
may be different between HW devices and may not involve the zap usage, 
that's why it wasn't put in IB core. In addition, the zap API should be 
in sync with inflight VMA closing to prevent zapping an already unmapped 
address. To achieve that the driver should implement some VMA ops and 
synchronize between those flows. That code looks like something that fit 
to be done in the driver and not as part of a IB core.

> Or.
> --
> To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
> the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH V1 for-next 1/2] IB/uverbs: Enable device removal when there are active user space applications
       [not found]             ` <546B72A1.2080403-LDSdmyG8hGV8YrgS2mwiifqBs+8SCbDb@public.gmane.org>
@ 2014-11-18 21:42               ` Or Gerlitz
       [not found]                 ` <CAJ3xEMh5f-qZuYQ30frjUa0ETNxexh2igguxxRm-pVtERCwn-Q-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
  0 siblings, 1 reply; 19+ messages in thread
From: Or Gerlitz @ 2014-11-18 21:42 UTC (permalink / raw)
  To: Yishai Hadas
  Cc: Or Gerlitz, Yishai Hadas, Roland Dreier,
	linux-rdma-u79uwXL29TY76Z2rM5mHXA, jackm-VPRAkNaXOzVWk0Htik3J/w

On Tue, Nov 18, 2014 at 6:24 PM, Yishai Hadas <yishaih-LDSdmyG8hGV8YrgS2mwiifqBs+8SCbDb@public.gmane.org>

> The IB device pointer is not valid at all spots, for example after the HW
> Device was removed, that's why we set this information on the uverbs device.

let this (flag on the uverbs object) be, but still, no-no for

object->xxx_supported
object->yyy_supported
object->zzz_supported

but rather do

object->flags & XXX
object->flags & YYY
object->flags & ZZZ

where each of XXX/YYY/ZZZ are bits that mark support for features xxx/yyy/zzz
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH V1 for-next 2/2] IB/mlx4_ib: Disassociate support
       [not found]             ` <546B7E73.40008-LDSdmyG8hGV8YrgS2mwiifqBs+8SCbDb@public.gmane.org>
@ 2014-11-18 21:50               ` Or Gerlitz
       [not found]                 ` <CAJ3xEMjzZdf_aML2e+ht=cbYx3T9U7Lr7DOR6+JenEKTcxQ+Zw-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
  0 siblings, 1 reply; 19+ messages in thread
From: Or Gerlitz @ 2014-11-18 21:50 UTC (permalink / raw)
  To: Yishai Hadas
  Cc: Or Gerlitz, Yishai Hadas, Roland Dreier,
	jackm-VPRAkNaXOzVWk0Htik3J/w, Jack Morgenstein,
	linux-rdma-u79uwXL29TY76Z2rM5mHXA, Shachar Raindel, Haggai Eran,
	liranl-VPRAkNaXOzVWk0Htik3J/w

On Tue, Nov 18, 2014 at 7:14 PM, Yishai Hadas
<yishaih-LDSdmyG8hGV8YrgS2mwiifqBs+8SCbDb@public.gmane.org> wrote:
> On 11/18/2014 4:42 PM, Or Gerlitz wrote:

>> Can we somehow make this patch generic (e.g land in the IB core) such
>> that it can apply also for mlx5 (and other HW drivers...) basically, the
>> HW driver should tell the IB core which pages to zap and we should be
>> OK, isn't that?

> We introduced a generic API that asked the low level driver to detach a
> given ucontext from its HW resources. The specific driver implementation may
> be different between HW devices and may not involve the zap usage, that's
> why it wasn't put in IB core. In addition, the zap API should be in sync
> with inflight VMA closing to prevent zapping an already unmapped address. To
> achieve that the driver should implement some VMA ops and synchronize
> between those flows.

Again, drivers that don't want to go the zapping way, could just avoid
this generic code.

So we have already two more low-level drivers (cxgb3/4) that would be
happily using the 95% of the mlx4_ib code you wrote after the
re-factoring I suggested.
I tend to think mlx5_ib should join too. Why not try it out, write the
code in the way which

1. under the IB core
2. easiest to use under this twist for mlx4_ib

and let cxgb3/4 and mlx5 maintainers to see if/how they can use it.

> That code looks like something that fit to be done in
> the driver and not as part of a IB core.

None of the arguments above doesn't make me get into this conclusion,
open to more people, but why don't you just follow steps 1/2 above and
see where it gets you?


Or.
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH V1 for-next 1/2] IB/uverbs: Enable device removal when there are active user space applications
       [not found]                 ` <CAJ3xEMh5f-qZuYQ30frjUa0ETNxexh2igguxxRm-pVtERCwn-Q-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2014-11-19  8:05                   ` Or Gerlitz
  0 siblings, 0 replies; 19+ messages in thread
From: Or Gerlitz @ 2014-11-19  8:05 UTC (permalink / raw)
  To: Or Gerlitz, Yishai Hadas
  Cc: Yishai Hadas, Roland Dreier, linux-rdma-u79uwXL29TY76Z2rM5mHXA,
	jackm-VPRAkNaXOzVWk0Htik3J/w

On 11/18/2014 11:42 PM, Or Gerlitz wrote:
> On Tue, Nov 18, 2014 at 6:24 PM, Yishai Hadas<yishaih-LDSdmyG8hGV8YrgS2mwiifqBs+8SCbDb@public.gmane.org>
>
>> >The IB device pointer is not valid at all spots, for example after the HW
>> >Device was removed, that's why we set this information on the uverbs device.

In 2nd thought, they must be point @ time where it is valid, right? so 
you can query/cache the whole 32bit device caps on that minute and use  
it later.

Or.
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH V1 for-next 2/2] IB/mlx4_ib: Disassociate support
       [not found]                 ` <CAJ3xEMjzZdf_aML2e+ht=cbYx3T9U7Lr7DOR6+JenEKTcxQ+Zw-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2014-11-19 12:15                   ` Yishai Hadas
       [not found]                     ` <546C89E6.1000204-LDSdmyG8hGV8YrgS2mwiifqBs+8SCbDb@public.gmane.org>
  0 siblings, 1 reply; 19+ messages in thread
From: Yishai Hadas @ 2014-11-19 12:15 UTC (permalink / raw)
  To: Or Gerlitz
  Cc: Or Gerlitz, Yishai Hadas, Roland Dreier,
	jackm-VPRAkNaXOzVWk0Htik3J/w, Jack Morgenstein,
	linux-rdma-u79uwXL29TY76Z2rM5mHXA, Shachar Raindel, Haggai Eran,
	liranl-VPRAkNaXOzVWk0Htik3J/w, Yevgeny Petrilin,
	talal-VPRAkNaXOzVWk0Htik3J/w

On 11/18/2014 11:50 PM, Or Gerlitz wrote:
> On Tue, Nov 18, 2014 at 7:14 PM, Yishai Hadas
> <yishaih-LDSdmyG8hGV8YrgS2mwiifqBs+8SCbDb@public.gmane.org> wrote:
>> On 11/18/2014 4:42 PM, Or Gerlitz wrote:
>
>>> Can we somehow make this patch generic (e.g land in the IB core) such
>>> that it can apply also for mlx5 (and other HW drivers...) basically, the
>>> HW driver should tell the IB core which pages to zap and we should be
>>> OK, isn't that?
>
>> We introduced a generic API that asked the low level driver to detach a
>> given ucontext from its HW resources. The specific driver implementation may
>> be different between HW devices and may not involve the zap usage, that's
>> why it wasn't put in IB core. In addition, the zap API should be in sync
>> with inflight VMA closing to prevent zapping an already unmapped address. To
>> achieve that the driver should implement some VMA ops and synchronize
>> between those flows.
>
> Again, drivers that don't want to go the zapping way, could just avoid
> this generic code.
>
> So we have already two more low-level drivers (cxgb3/4) that would be
> happily using the 95% of the mlx4_ib code you wrote after the
> re-factoring I suggested.
> I tend to think mlx5_ib should join too. Why not try it out, write the
> code in the way which
>
> 1. under the IB core
> 2. easiest to use under this twist for mlx4_ib
>
> and let cxgb3/4 and mlx5 maintainers to see if/how they can use it.

The HW driver upon mmap of its UARs needs to manage their VMAs, it 
includes saving the VMAs pointers on its internal context, set its 
private VMA open/close operations, etc. In addition, the HW Driver 
should synchronize between VMA closing to a parallel call of ucontext 
detaching.

To use the ZAP API the internal context of the HW driver should be used 
and its state should be considered, as such it makes sense that it will 
be used as part of the HW Driver. In addition, there may be some others 
vendors that may not use it as part of their ucontext detaching at all.

For that reason we believe that the correct place to use the zap call is 
in the HW driver and not in IB core, at least at first phase.



--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH V1 for-next 2/2] IB/mlx4_ib: Disassociate support
       [not found]                     ` <546C89E6.1000204-LDSdmyG8hGV8YrgS2mwiifqBs+8SCbDb@public.gmane.org>
@ 2014-11-19 14:45                       ` Or Gerlitz
  0 siblings, 0 replies; 19+ messages in thread
From: Or Gerlitz @ 2014-11-19 14:45 UTC (permalink / raw)
  To: Yishai Hadas
  Cc: Yishai Hadas, Roland Dreier, jackm-VPRAkNaXOzVWk0Htik3J/w,
	Jack Morgenstein, linux-rdma-u79uwXL29TY76Z2rM5mHXA,
	Shachar Raindel, Haggai Eran, liranl-VPRAkNaXOzVWk0Htik3J/w,
	Yevgeny Petrilin, talal-VPRAkNaXOzVWk0Htik3J/w

On 11/19/2014 2:15 PM, Yishai Hadas wrote:
> For that reason we believe that the correct place to use the zap call 
> is in the HW driver and not in IB core

OK let it be - get this one (after the  trivial comments are addressed) 
in and let the ones that follow see if they
can generalize/re-use the code somehow...

Or.
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH V1 for-next 1/2] IB/uverbs: Enable device removal when there are active user space applications
       [not found]     ` <1416312682-7899-2-git-send-email-yishaih-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
  2014-11-18 14:38       ` Or Gerlitz
  2014-11-18 14:40       ` Or Gerlitz
@ 2014-11-19 14:49       ` Or Gerlitz
       [not found]         ` <546CADEF.6070905-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
  2 siblings, 1 reply; 19+ messages in thread
From: Or Gerlitz @ 2014-11-19 14:49 UTC (permalink / raw)
  To: Yishai Hadas, roland-DgEjT+Ai2ygdnm+yROfE0A
  Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA, jackm-VPRAkNaXOzVWk0Htik3J/w

On 11/18/2014 2:11 PM, Yishai Hadas wrote:
> @@ -533,6 +552,12 @@ void ib_uverbs_event_handler(struct ib_event_handler *handler,
>   	struct ib_uverbs_file *file =
>   		container_of(handler, struct ib_uverbs_file, event_handler);
>   
> +	if (event->event == IB_EVENT_DEVICE_FATAL) {
> +		if (file->fatal_event_raised)
> +			return;
> +		file->fatal_event_raised = 1;
> +	}
> +

can we avoid the fatal_event_raised flag?

In a similar manner to what defined for rdma-cm consumers, let's add 
IB_EVENT_DEVICE_REMOVAL event and generate both events from  here 
towards the uberbs consumer (the fatal for legacy apps and the device 
removal for newer apps that maybe want to distinguish between the
two)

>   	ib_uverbs_async_handler(file, event->element.port_num, event->event,
>   				NULL, NULL);
>   }

--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH V1 for-next 1/2] IB/uverbs: Enable device removal when there are active user space applications
       [not found]         ` <546CADEF.6070905-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
@ 2014-11-20  7:52           ` Yishai Hadas
       [not found]             ` <546D9DB2.8050900-LDSdmyG8hGV8YrgS2mwiifqBs+8SCbDb@public.gmane.org>
  0 siblings, 1 reply; 19+ messages in thread
From: Yishai Hadas @ 2014-11-20  7:52 UTC (permalink / raw)
  To: Or Gerlitz, Yishai Hadas, roland-DgEjT+Ai2ygdnm+yROfE0A
  Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA, jackm-VPRAkNaXOzVWk0Htik3J/w

On 11/19/2014 4:49 PM, Or Gerlitz wrote:
> On 11/18/2014 2:11 PM, Yishai Hadas wrote:
>> @@ -533,6 +552,12 @@ void ib_uverbs_event_handler(struct
>> ib_event_handler *handler,
>>       struct ib_uverbs_file *file =
>>           container_of(handler, struct ib_uverbs_file, event_handler);
>> +    if (event->event == IB_EVENT_DEVICE_FATAL) {
>> +        if (file->fatal_event_raised)
>> +            return;
>> +        file->fatal_event_raised = 1;
>> +    }
>> +
>
> can we avoid the fatal_event_raised flag?

We need this flag to prevent generating duplicate event. This may happen 
when a fatal error occurred and just later the remove one was called 
when there were active applications.

>
> In a similar manner to what defined for rdma-cm consumers, let's add
> IB_EVENT_DEVICE_REMOVAL event and generate both events from  here
> towards the uberbs consumer (the fatal for legacy apps and the device
> removal for newer apps that maybe want to distinguish between the
> two)

Need to consider whether it's important for newer applications to 
distinguish, can be opened for other input. In case we believe so, can 
be added as an extra patch to that series.

>
>>       ib_uverbs_async_handler(file, event->element.port_num,
>> event->event,
>>                   NULL, NULL);
>>   }
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
> the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH V1 for-next 1/2] IB/uverbs: Enable device removal when there are active user space applications
       [not found]             ` <546D9DB2.8050900-LDSdmyG8hGV8YrgS2mwiifqBs+8SCbDb@public.gmane.org>
@ 2014-11-20 11:21               ` Or Gerlitz
  2014-11-20 11:21               ` Or Gerlitz
  1 sibling, 0 replies; 19+ messages in thread
From: Or Gerlitz @ 2014-11-20 11:21 UTC (permalink / raw)
  To: Yishai Hadas, Yishai Hadas, roland-DgEjT+Ai2ygdnm+yROfE0A
  Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA, jackm-VPRAkNaXOzVWk0Htik3J/w

On 11/20/2014 9:52 AM, Yishai Hadas wrote:
>> can we avoid the fatal_event_raised flag?
>
> We need this flag to prevent generating duplicate event. This may 
> happen when a fatal error occurred and just later the remove one was 
> called when there were active applications.

If your only concern is not generating duplicate fatal error events, I 
would strongly recommend to go a drop this flag to make the code 
clearer/simpler, object->I_already_did_operation_X flags should be 
avoided in correct system programming IMHO.

Or.
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH V1 for-next 1/2] IB/uverbs: Enable device removal when there are active user space applications
       [not found]             ` <546D9DB2.8050900-LDSdmyG8hGV8YrgS2mwiifqBs+8SCbDb@public.gmane.org>
  2014-11-20 11:21               ` Or Gerlitz
@ 2014-11-20 11:21               ` Or Gerlitz
  1 sibling, 0 replies; 19+ messages in thread
From: Or Gerlitz @ 2014-11-20 11:21 UTC (permalink / raw)
  To: Yishai Hadas, Yishai Hadas, roland-DgEjT+Ai2ygdnm+yROfE0A
  Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA, jackm-VPRAkNaXOzVWk0Htik3J/w

On 11/20/2014 9:52 AM, Yishai Hadas wrote:
>> can we avoid the fatal_event_raised flag?
>
> We need this flag to prevent generating duplicate event. This may 
> happen when a fatal error occurred and just later the remove one was 
> called when there were active applications.

If your only concern is not generating duplicate fatal error events, I 
would strongly recommend to go and drop this flag to make the code 
clearer/simpler, object->I_already_did_operation_X flags should be 
avoided in correct system programming IMHO.

Or.
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 19+ messages in thread

end of thread, other threads:[~2014-11-20 11:21 UTC | newest]

Thread overview: 19+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2014-11-18 12:11 [PATCH V1 for-next 0/2] HW Device hot-removal support Yishai Hadas
     [not found] ` <1416312682-7899-1-git-send-email-yishaih-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
2014-11-18 12:11   ` [PATCH V1 for-next 1/2] IB/uverbs: Enable device removal when there are active user space applications Yishai Hadas
     [not found]     ` <1416312682-7899-2-git-send-email-yishaih-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
2014-11-18 14:38       ` Or Gerlitz
     [not found]         ` <546B59E2.2050707-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
2014-11-18 16:24           ` Yishai Hadas
     [not found]             ` <546B72A1.2080403-LDSdmyG8hGV8YrgS2mwiifqBs+8SCbDb@public.gmane.org>
2014-11-18 21:42               ` Or Gerlitz
     [not found]                 ` <CAJ3xEMh5f-qZuYQ30frjUa0ETNxexh2igguxxRm-pVtERCwn-Q-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2014-11-19  8:05                   ` Or Gerlitz
2014-11-18 14:40       ` Or Gerlitz
2014-11-19 14:49       ` Or Gerlitz
     [not found]         ` <546CADEF.6070905-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
2014-11-20  7:52           ` Yishai Hadas
     [not found]             ` <546D9DB2.8050900-LDSdmyG8hGV8YrgS2mwiifqBs+8SCbDb@public.gmane.org>
2014-11-20 11:21               ` Or Gerlitz
2014-11-20 11:21               ` Or Gerlitz
2014-11-18 12:11   ` [PATCH V1 for-next 2/2] IB/mlx4_ib: Disassociate support Yishai Hadas
     [not found]     ` <1416312682-7899-3-git-send-email-yishaih-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
2014-11-18 14:42       ` Or Gerlitz
     [not found]         ` <546B5AE3.9060606-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
2014-11-18 16:48           ` Steve Wise
2014-11-18 17:14           ` Yishai Hadas
     [not found]             ` <546B7E73.40008-LDSdmyG8hGV8YrgS2mwiifqBs+8SCbDb@public.gmane.org>
2014-11-18 21:50               ` Or Gerlitz
     [not found]                 ` <CAJ3xEMjzZdf_aML2e+ht=cbYx3T9U7Lr7DOR6+JenEKTcxQ+Zw-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2014-11-19 12:15                   ` Yishai Hadas
     [not found]                     ` <546C89E6.1000204-LDSdmyG8hGV8YrgS2mwiifqBs+8SCbDb@public.gmane.org>
2014-11-19 14:45                       ` Or Gerlitz
2014-11-18 14:43   ` [PATCH V1 for-next 0/2] HW Device hot-removal support Or Gerlitz

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.