* [PATCH 1/1] nvme-rdma: Add IB event handling support
@ 2018-03-21 15:48 Max Gurtovoy
  2018-03-28  8:10 ` Christoph Hellwig
  2018-04-04 13:02 ` Sagi Grimberg
  0 siblings, 2 replies; 5+ messages in thread
From: Max Gurtovoy @ 2018-03-21 15:48 UTC


From: Nitzan Carmi <nitzanc@mellanox.com>

IB devices may raise IB events that need special treatment
from the ib_client. For example, a fatal event notification is raised
to registered clients when the port/device state becomes invalid after EEH.
IB clients should be aware of this fatal event and not post any WRs
to the device. Draining the QP, for example, is forbidden and will
get stuck forever waiting for the flushed work completions.

Signed-off-by: Nitzan Carmi <nitzanc at mellanox.com>
Signed-off-by: Max Gurtovoy <maxg at mellanox.com>
---
 drivers/nvme/host/rdma.c | 36 ++++++++++++++++++++++++++++++++++++
 1 file changed, 36 insertions(+)

diff --git a/drivers/nvme/host/rdma.c b/drivers/nvme/host/rdma.c
index 4d84a73..dc5af97 100644
--- a/drivers/nvme/host/rdma.c
+++ b/drivers/nvme/host/rdma.c
@@ -45,6 +45,7 @@
 struct nvme_rdma_device {
 	struct ib_device	*dev;
 	struct ib_pd		*pd;
+	struct ib_event_handler	event_handler;
 	struct kref		ref;
 	struct list_head	entry;
 };
@@ -329,6 +330,7 @@ static void nvme_rdma_free_dev(struct kref *ref)
 	list_del(&ndev->entry);
 	mutex_unlock(&device_list_mutex);
 
+	ib_unregister_event_handler(&ndev->event_handler);
 	ib_dealloc_pd(ndev->pd);
 	kfree(ndev);
 }
@@ -343,6 +345,36 @@ static int nvme_rdma_dev_get(struct nvme_rdma_device *dev)
 	return kref_get_unless_zero(&dev->ref);
 }
 
+static void nvme_rdma_ib_event_handler(struct ib_event_handler *handler,
+				       struct ib_event *event)
+{
+	struct nvme_rdma_ctrl *ctrl;
+	int i;
+
+	pr_debug("async event %s (%d) on device %s port %d\n",
+		 ib_event_msg(event->event), event->event,
+		 event->device->name, event->element.port_num);
+
+	switch (event->event) {
+	case IB_EVENT_DEVICE_FATAL:
+		mutex_lock(&nvme_rdma_ctrl_mutex);
+		list_for_each_entry(ctrl, &nvme_rdma_ctrl_list, list) {
+			if (ctrl->device->dev != event->device)
+				continue;
+
+			for (i = 0; i < ctrl->ctrl.queue_count; i++)
+				clear_bit(NVME_RDMA_Q_LIVE,
+					  &ctrl->queues[i].flags);
+			nvme_delete_ctrl(&ctrl->ctrl);
+		}
+		mutex_unlock(&nvme_rdma_ctrl_mutex);
+		break;
+	default:
+		pr_debug("Unsupported event (%d)\n", event->event);
+		break;
+	}
+}
+
 static struct nvme_rdma_device *
 nvme_rdma_find_get_device(struct rdma_cm_id *cm_id)
 {
@@ -374,6 +406,10 @@ static int nvme_rdma_dev_get(struct nvme_rdma_device *dev)
 		goto out_free_pd;
 	}
 
+	INIT_IB_EVENT_HANDLER(&ndev->event_handler, ndev->dev,
+	                      nvme_rdma_ib_event_handler);
+	ib_register_event_handler(&ndev->event_handler);
+
 	list_add(&ndev->entry, &device_list);
 out_unlock:
 	mutex_unlock(&device_list_mutex);
-- 
1.8.3.1
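
The control flow of the handler in this patch can be modelled outside the kernel. The sketch below is a user-space approximation only: every `sim_*` name is hypothetical, a plain array stands in for `nvme_rdma_ctrl_list`, a boolean stands in for the `NVME_RDMA_Q_LIVE` bit, and a flag stands in for the call to `nvme_delete_ctrl()`. Locking is omitted.

```c
/* User-space model only: the sim_* names are hypothetical, not kernel APIs. */
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define SIM_MAX_QUEUES 4

struct sim_device { int id; };

struct sim_queue { bool live; };	/* stands in for NVME_RDMA_Q_LIVE */

struct sim_ctrl {
	struct sim_device *dev;
	struct sim_queue queues[SIM_MAX_QUEUES];
	size_t queue_count;
	bool delete_requested;		/* stands in for nvme_delete_ctrl() */
};

/*
 * On a fatal event for 'failed', mark every queue of each controller bound
 * to that device as not live and request deletion of the controller.
 * Controllers on other devices are left untouched.
 * Returns the number of controllers affected.
 */
size_t sim_handle_device_fatal(struct sim_ctrl *ctrls, size_t nr,
			       const struct sim_device *failed)
{
	size_t hit = 0;

	for (size_t c = 0; c < nr; c++) {
		struct sim_ctrl *ctrl = &ctrls[c];

		if (ctrl->dev != failed)
			continue;	/* event belongs to another device */

		assert(ctrl->queue_count <= SIM_MAX_QUEUES);
		for (size_t i = 0; i < ctrl->queue_count; i++)
			ctrl->queues[i].live = false;
		ctrl->delete_requested = true;
		hit++;
	}
	return hit;
}
```

Calling `sim_handle_device_fatal()` with two controllers on different devices should affect only the one whose `dev` pointer matches the failed device, which mirrors the `ctrl->device->dev != event->device` check in the patch.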


* [PATCH 1/1] nvme-rdma: Add IB event handling support
  2018-03-21 15:48 [PATCH 1/1] nvme-rdma: Add IB event handling support Max Gurtovoy
@ 2018-03-28  8:10 ` Christoph Hellwig
  2018-03-28  9:17   ` Max Gurtovoy
  2018-04-04 13:02 ` Sagi Grimberg
  1 sibling, 1 reply; 5+ messages in thread
From: Christoph Hellwig @ 2018-03-28  8:10 UTC


Eww.  I guess something like this is required, but we really need
one coherent rdma event model instead of this mix between rdma/cm,
struct ib_client and ib_event.


* [PATCH 1/1] nvme-rdma: Add IB event handling support
  2018-03-28  8:10 ` Christoph Hellwig
@ 2018-03-28  9:17   ` Max Gurtovoy
  0 siblings, 0 replies; 5+ messages in thread
From: Max Gurtovoy @ 2018-03-28  9:17 UTC




On 3/28/2018 11:10 AM, Christoph Hellwig wrote:
> Eww.  I guess something like this is required, but we really need
> one coherent rdma event model instead of this mix between rdma/cm,
> struct ib_client and ib_event.
> 

Yes, I agree. Let's ignore this patch for now.


* [PATCH 1/1] nvme-rdma: Add IB event handling support
  2018-03-21 15:48 [PATCH 1/1] nvme-rdma: Add IB event handling support Max Gurtovoy
  2018-03-28  8:10 ` Christoph Hellwig
@ 2018-04-04 13:02 ` Sagi Grimberg
  2018-04-11 16:23   ` Max Gurtovoy
  1 sibling, 1 reply; 5+ messages in thread
From: Sagi Grimberg @ 2018-04-04 13:02 UTC



> IB devices may raise IB events that need special treatment
> from the ib_client. For example, a fatal event notification is raised
> to registered clients when the port/device state becomes invalid after EEH.
> IB clients should be aware of this fatal event and not post any WRs
> to the device. Draining the QP, for example, is forbidden and will
> get stuck forever waiting for the flushed work completions.

Where can we find documentation on what should and should not
work in this event?


* [PATCH 1/1] nvme-rdma: Add IB event handling support
  2018-04-04 13:02 ` Sagi Grimberg
@ 2018-04-11 16:23   ` Max Gurtovoy
  0 siblings, 0 replies; 5+ messages in thread
From: Max Gurtovoy @ 2018-04-11 16:23 UTC




On 4/4/2018 4:02 PM, Sagi Grimberg wrote:
> 
>> IB devices may raise IB events that need special treatment
>> from the ib_client. For example, a fatal event notification is raised
>> to registered clients when the port/device state becomes invalid after EEH.
>> IB clients should be aware of this fatal event and not post any WRs
>> to the device. Draining the QP, for example, is forbidden and will
>> get stuck forever waiting for the flushed work completions.
> 
> Where can we find documentation on what should and should not
> work in this event?

I'll probably abandon this commit, since there should be a low-level
mechanism that is aware of attempts to post WRs to a device in a
fatal error state and generates "fake" completions.
I'll update if this patch is needed.
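
One possible reading of the "fake" completion idea can be modelled in user space. This is only an illustrative sketch under stated assumptions, not the mechanism that was eventually proposed or merged: all `sim_*` names are hypothetical, the completion queue is a small ring buffer, and a healthy device is modelled as completing every WR immediately. The point it shows is that if the posting layer itself queues a flush-error completion when the device is in a fatal state, a drain that waits for its marker WR terminates instead of hanging.

```c
/* User-space model only: the sim_* names are hypothetical, not kernel APIs. */
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum sim_wc_status { SIM_WC_SUCCESS, SIM_WC_FLUSH_ERR };

struct sim_wc {
	unsigned long wr_id;
	enum sim_wc_status status;
};

#define SIM_CQ_DEPTH 8

struct sim_qp {
	bool device_fatal;		/* set once the device reports a fatal error */
	struct sim_wc cq[SIM_CQ_DEPTH];	/* ring buffer of pending completions */
	size_t cq_head, cq_count;
};

static int sim_cq_push(struct sim_qp *qp, unsigned long wr_id,
		       enum sim_wc_status status)
{
	size_t tail;

	if (qp->cq_count == SIM_CQ_DEPTH)
		return -1;		/* completion queue overflow */
	tail = (qp->cq_head + qp->cq_count) % SIM_CQ_DEPTH;
	qp->cq[tail] = (struct sim_wc){ wr_id, status };
	qp->cq_count++;
	return 0;
}

/*
 * Post a WR. On a fatal device the WR never reaches hardware; the posting
 * layer queues a "fake" flush-error completion instead. The healthy path
 * is simplified to an immediate success completion.
 */
int sim_post_wr(struct sim_qp *qp, unsigned long wr_id)
{
	return sim_cq_push(qp, wr_id, qp->device_fatal ? SIM_WC_FLUSH_ERR
						       : SIM_WC_SUCCESS);
}

/* Pop one completion into *wc. Returns 1 if one was available, else 0. */
int sim_poll_cq(struct sim_qp *qp, struct sim_wc *wc)
{
	assert(wc != NULL);
	if (qp->cq_count == 0)
		return 0;
	*wc = qp->cq[qp->cq_head];
	qp->cq_head = (qp->cq_head + 1) % SIM_CQ_DEPTH;
	qp->cq_count--;
	return 1;
}

/*
 * Drain: post a marker WR and reap completions until the marker is seen.
 * Returns the number of completions reaped, or -1 if the marker never
 * completes (the case in which a real drain would wait forever).
 */
int sim_drain_qp(struct sim_qp *qp)
{
	const unsigned long marker = ~0UL;
	struct sim_wc wc;
	int reaped = 0;

	if (sim_post_wr(qp, marker))
		return -1;
	while (sim_poll_cq(qp, &wc)) {
		reaped++;
		if (wc.wr_id == marker)
			return reaped;
	}
	return -1;
}
```

With `device_fatal` set, each posted WR shows up as a `SIM_WC_FLUSH_ERR` completion and `sim_drain_qp()` returns once it has reaped its marker, so callers would not need a per-driver event handler just to avoid the hang.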

