All of lore.kernel.org
* [PATCH 0/9] SRP initiator patches for kernel 3.16
@ 2014-05-06 12:49 Bart Van Assche
       [not found] ` <5368DA5B.80609-HInyCGIudOg@public.gmane.org>
  0 siblings, 1 reply; 32+ messages in thread
From: Bart Van Assche @ 2014-05-06 12:49 UTC (permalink / raw)
  To: Roland Dreier
  Cc: Sagi Grimberg, Vu Pham, David Dillow, Sebastian Parschauer, linux-rdma

This patch series consists of one patch that adds fast registration
support to the SRP initiator and eight preparation patches:

0001-IB-srp-Fix-kernel-doc-warnings.patch
0002-IB-srp-Introduce-an-additional-local-variable.patch
0003-IB-srp-Introduce-srp_alloc_fmr_pool.patch
0004-IB-srp-Introduce-srp_map_fmr.patch
0005-IB-srp-Introduce-srp_finish_mapping.patch
0006-IB-srp-Make-srp_alloc_req_data-reallocate-request-da.patch
0007-IB-srp-Avoid-triggering-an-infinite-loop-if-memory-m.patch
0008-IB-srp-Rename-FMR-related-variables.patch
0009-IB-srp-Add-fast-registration-support.patch
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


* [PATCH 1/9] IB/srp: Fix kernel-doc warnings
       [not found] ` <5368DA5B.80609-HInyCGIudOg@public.gmane.org>
@ 2014-05-06 12:50   ` Bart Van Assche
       [not found]     ` <5368DAB2.2070006-HInyCGIudOg@public.gmane.org>
  2014-05-06 12:51   ` [PATCH 2/9] IB/srp: Introduce an additional local variable Bart Van Assche
                     ` (9 subsequent siblings)
  10 siblings, 1 reply; 32+ messages in thread
From: Bart Van Assche @ 2014-05-06 12:50 UTC (permalink / raw)
  To: Roland Dreier
  Cc: Sagi Grimberg, Vu Pham, David Dillow, Sebastian Parschauer, linux-rdma

Avoid kernel-doc warnings about missing argument descriptions in the
ib_srp.[ch] source files.

Signed-off-by: Bart Van Assche <bvanassche-HInyCGIudOg@public.gmane.org>
Cc: Roland Dreier <roland-BHEL68pLQRGGvPXPguhicg@public.gmane.org>
Cc: David Dillow <dave-i1Mk8JYDVaaSihdK6806/g@public.gmane.org>
Cc: Sagi Grimberg <sagig-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
Cc: Vu Pham <vu-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
Cc: Sebastian Parschauer <sebastian.riemer-EIkl63zCoXaH+58JC4qpiA@public.gmane.org>
---
 drivers/infiniband/ulp/srp/ib_srp.c | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/drivers/infiniband/ulp/srp/ib_srp.c b/drivers/infiniband/ulp/srp/ib_srp.c
index 66a908b..cf80f7a 100644
--- a/drivers/infiniband/ulp/srp/ib_srp.c
+++ b/drivers/infiniband/ulp/srp/ib_srp.c
@@ -813,6 +813,10 @@ static struct scsi_cmnd *srp_claim_req(struct srp_target_port *target,
 
 /**
  * srp_free_req() - Unmap data and add request to the free request list.
+ * @target: SRP target port.
+ * @req:    Request to be freed.
+ * @scmnd:  SCSI command associated with @req.
+ * @req_lim_delta: Amount to be added to @target->req_lim.
  */
 static void srp_free_req(struct srp_target_port *target,
 			 struct srp_request *req, struct scsi_cmnd *scmnd,
@@ -1455,6 +1459,7 @@ static void srp_handle_recv(struct srp_target_port *target, struct ib_wc *wc)
 
 /**
  * srp_tl_err_work() - handle a transport layer error
+ * @work: Work structure embedded in an SRP target port.
  *
  * Note: This function may get invoked before the rport has been created,
  * hence the target->rport test.
@@ -2310,6 +2315,8 @@ static struct class srp_class = {
 
 /**
  * srp_conn_unique() - check whether the connection to a target is unique
+ * @host:   SRP host.
+ * @target: SRP target port.
  */
 static bool srp_conn_unique(struct srp_host *host,
 			    struct srp_target_port *target)
-- 
1.8.4.5



* [PATCH 2/9] IB/srp: Introduce an additional local variable
       [not found] ` <5368DA5B.80609-HInyCGIudOg@public.gmane.org>
  2014-05-06 12:50   ` [PATCH 1/9] IB/srp: Fix kernel-doc warnings Bart Van Assche
@ 2014-05-06 12:51   ` Bart Van Assche
       [not found]     ` <5368DADF.2070105-HInyCGIudOg@public.gmane.org>
  2014-05-06 12:52   ` [PATCH 3/9] IB/srp: Introduce srp_alloc_fmr_pool() Bart Van Assche
                     ` (8 subsequent siblings)
  10 siblings, 1 reply; 32+ messages in thread
From: Bart Van Assche @ 2014-05-06 12:51 UTC (permalink / raw)
  To: Roland Dreier
  Cc: Sagi Grimberg, Vu Pham, David Dillow, Sebastian Parschauer, linux-rdma

This patch does not change any functionality.

Signed-off-by: Bart Van Assche <bvanassche-HInyCGIudOg@public.gmane.org>
Cc: Roland Dreier <roland-BHEL68pLQRGGvPXPguhicg@public.gmane.org>
Cc: David Dillow <dave-i1Mk8JYDVaaSihdK6806/g@public.gmane.org>
Cc: Sagi Grimberg <sagig-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
Cc: Vu Pham <vu-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
Cc: Sebastian Parschauer <sebastian.riemer-EIkl63zCoXaH+58JC4qpiA@public.gmane.org>
---
 drivers/infiniband/ulp/srp/ib_srp.c | 9 ++++-----
 1 file changed, 4 insertions(+), 5 deletions(-)

diff --git a/drivers/infiniband/ulp/srp/ib_srp.c b/drivers/infiniband/ulp/srp/ib_srp.c
index cf80f7a..8c03371 100644
--- a/drivers/infiniband/ulp/srp/ib_srp.c
+++ b/drivers/infiniband/ulp/srp/ib_srp.c
@@ -290,6 +290,7 @@ static int srp_new_cm_id(struct srp_target_port *target)
 
 static int srp_create_target_ib(struct srp_target_port *target)
 {
+	struct srp_device *dev = target->srp_host->srp_dev;
 	struct ib_qp_init_attr *init_attr;
 	struct ib_cq *recv_cq, *send_cq;
 	struct ib_qp *qp;
@@ -299,16 +300,14 @@ static int srp_create_target_ib(struct srp_target_port *target)
 	if (!init_attr)
 		return -ENOMEM;
 
-	recv_cq = ib_create_cq(target->srp_host->srp_dev->dev,
-			       srp_recv_completion, NULL, target,
+	recv_cq = ib_create_cq(dev->dev, srp_recv_completion, NULL, target,
 			       target->queue_size, target->comp_vector);
 	if (IS_ERR(recv_cq)) {
 		ret = PTR_ERR(recv_cq);
 		goto err;
 	}
 
-	send_cq = ib_create_cq(target->srp_host->srp_dev->dev,
-			       srp_send_completion, NULL, target,
+	send_cq = ib_create_cq(dev->dev, srp_send_completion, NULL, target,
 			       target->queue_size, target->comp_vector);
 	if (IS_ERR(send_cq)) {
 		ret = PTR_ERR(send_cq);
@@ -327,7 +326,7 @@ static int srp_create_target_ib(struct srp_target_port *target)
 	init_attr->send_cq             = send_cq;
 	init_attr->recv_cq             = recv_cq;
 
-	qp = ib_create_qp(target->srp_host->srp_dev->pd, init_attr);
+	qp = ib_create_qp(dev->pd, init_attr);
 	if (IS_ERR(qp)) {
 		ret = PTR_ERR(qp);
 		goto err_send_cq;
-- 
1.8.4.5



* [PATCH 3/9] IB/srp: Introduce srp_alloc_fmr_pool()
       [not found] ` <5368DA5B.80609-HInyCGIudOg@public.gmane.org>
  2014-05-06 12:50   ` [PATCH 1/9] IB/srp: Fix kernel-doc warnings Bart Van Assche
  2014-05-06 12:51   ` [PATCH 2/9] IB/srp: Introduce an additional local variable Bart Van Assche
@ 2014-05-06 12:52   ` Bart Van Assche
       [not found]     ` <5368DB05.4070600-HInyCGIudOg@public.gmane.org>
  2014-05-06 12:53   ` [PATCH 4/9] IB/srp: Introduce srp_map_fmr() Bart Van Assche
                     ` (7 subsequent siblings)
  10 siblings, 1 reply; 32+ messages in thread
From: Bart Van Assche @ 2014-05-06 12:52 UTC (permalink / raw)
  To: Roland Dreier
  Cc: Sagi Grimberg, Vu Pham, David Dillow, Sebastian Parschauer, linux-rdma

Introduce the srp_alloc_fmr_pool() function and only set
srp_dev->fmr_max_size if FMR pool creation succeeds. This change is
safe because that variable is only used when an FMR pool exists.
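
As an editorial illustration (not part of the patch), the halving
fallback can be sketched in plain, self-contained C. try_create_pool(),
hca_limit and the size arguments below are hypothetical stand-ins for
ib_create_fmr_pool() and the SRP_FMR_SIZE / SRP_FMR_MIN_SIZE constants:

```c
#include <assert.h>
#include <stdbool.h>

/* Stand-in for ib_create_fmr_pool(): pretend creation succeeds only
 * when the requested MR size fits what the simulated HCA supports. */
static bool try_create_pool(int max_pages_per_mr, int hca_limit)
{
	return max_pages_per_mr <= hca_limit;
}

/* Mirrors the loop in srp_alloc_fmr_pool(): start at the preferred
 * size and halve until pool creation succeeds or the minimum size is
 * passed. Returns the page count actually used, or 0 if no pool could
 * be created (the driver then simply leaves fmr_pool == NULL). */
static int alloc_pool_with_fallback(int max_size, int min_size,
				    int hca_limit)
{
	int max_pages_per_mr;

	for (max_pages_per_mr = max_size;
	     max_pages_per_mr >= min_size;
	     max_pages_per_mr /= 2) {
		if (try_create_pool(max_pages_per_mr, hca_limit))
			return max_pages_per_mr;
	}
	return 0;
}
```

Deriving fmr_max_size from the size that actually succeeded, instead of
halving it alongside the loop counter as the old code did, is what
makes it safe to leave the variable unset when no pool can be created.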

Signed-off-by: Bart Van Assche <bvanassche-HInyCGIudOg@public.gmane.org>
Cc: Roland Dreier <roland-BHEL68pLQRGGvPXPguhicg@public.gmane.org>
Cc: David Dillow <dave-i1Mk8JYDVaaSihdK6806/g@public.gmane.org>
Cc: Sagi Grimberg <sagig-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
Cc: Vu Pham <vu-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
Cc: Sebastian Parschauer <sebastian.riemer-EIkl63zCoXaH+58JC4qpiA@public.gmane.org>
---
 drivers/infiniband/ulp/srp/ib_srp.c | 56 ++++++++++++++++++++++---------------
 1 file changed, 33 insertions(+), 23 deletions(-)

diff --git a/drivers/infiniband/ulp/srp/ib_srp.c b/drivers/infiniband/ulp/srp/ib_srp.c
index 8c03371..f41cc8c 100644
--- a/drivers/infiniband/ulp/srp/ib_srp.c
+++ b/drivers/infiniband/ulp/srp/ib_srp.c
@@ -2792,13 +2792,43 @@ free_host:
 	return NULL;
 }
 
+static void srp_alloc_fmr_pool(struct srp_device *srp_dev)
+{
+	int max_pages_per_mr;
+	struct ib_fmr_pool_param fmr_param;
+	struct ib_fmr_pool *pool;
+
+	srp_dev->fmr_pool = NULL;
+
+	for (max_pages_per_mr = SRP_FMR_SIZE;
+	     max_pages_per_mr >= SRP_FMR_MIN_SIZE;
+	     max_pages_per_mr /= 2) {
+		memset(&fmr_param, 0, sizeof(fmr_param));
+		fmr_param.pool_size	    = SRP_FMR_POOL_SIZE;
+		fmr_param.dirty_watermark   = SRP_FMR_DIRTY_SIZE;
+		fmr_param.cache		    = 1;
+		fmr_param.max_pages_per_fmr = max_pages_per_mr;
+		fmr_param.page_shift	    = ilog2(srp_dev->fmr_page_size);
+		fmr_param.access	    = (IB_ACCESS_LOCAL_WRITE |
+					       IB_ACCESS_REMOTE_WRITE |
+					       IB_ACCESS_REMOTE_READ);
+
+		pool = ib_create_fmr_pool(srp_dev->pd, &fmr_param);
+		if (!IS_ERR(pool)) {
+			srp_dev->fmr_pool = pool;
+			srp_dev->fmr_max_size =
+				srp_dev->fmr_page_size * max_pages_per_mr;
+			break;
+		}
+	}
+}
+
 static void srp_add_one(struct ib_device *device)
 {
 	struct srp_device *srp_dev;
 	struct ib_device_attr *dev_attr;
-	struct ib_fmr_pool_param fmr_param;
 	struct srp_host *host;
-	int max_pages_per_fmr, fmr_page_shift, s, e, p;
+	int fmr_page_shift, s, e, p;
 
 	dev_attr = kmalloc(sizeof *dev_attr, GFP_KERNEL);
 	if (!dev_attr)
@@ -2821,7 +2851,6 @@ static void srp_add_one(struct ib_device *device)
 	fmr_page_shift		= max(12, ffs(dev_attr->page_size_cap) - 1);
 	srp_dev->fmr_page_size	= 1 << fmr_page_shift;
 	srp_dev->fmr_page_mask	= ~((u64) srp_dev->fmr_page_size - 1);
-	srp_dev->fmr_max_size	= srp_dev->fmr_page_size * SRP_FMR_SIZE;
 
 	INIT_LIST_HEAD(&srp_dev->dev_list);
 
@@ -2837,26 +2866,7 @@ static void srp_add_one(struct ib_device *device)
 	if (IS_ERR(srp_dev->mr))
 		goto err_pd;
 
-	for (max_pages_per_fmr = SRP_FMR_SIZE;
-			max_pages_per_fmr >= SRP_FMR_MIN_SIZE;
-			max_pages_per_fmr /= 2, srp_dev->fmr_max_size /= 2) {
-		memset(&fmr_param, 0, sizeof fmr_param);
-		fmr_param.pool_size	    = SRP_FMR_POOL_SIZE;
-		fmr_param.dirty_watermark   = SRP_FMR_DIRTY_SIZE;
-		fmr_param.cache		    = 1;
-		fmr_param.max_pages_per_fmr = max_pages_per_fmr;
-		fmr_param.page_shift	    = fmr_page_shift;
-		fmr_param.access	    = (IB_ACCESS_LOCAL_WRITE |
-					       IB_ACCESS_REMOTE_WRITE |
-					       IB_ACCESS_REMOTE_READ);
-
-		srp_dev->fmr_pool = ib_create_fmr_pool(srp_dev->pd, &fmr_param);
-		if (!IS_ERR(srp_dev->fmr_pool))
-			break;
-	}
-
-	if (IS_ERR(srp_dev->fmr_pool))
-		srp_dev->fmr_pool = NULL;
+	srp_alloc_fmr_pool(srp_dev);
 
 	if (device->node_type == RDMA_NODE_IB_SWITCH) {
 		s = 0;
-- 
1.8.4.5



* [PATCH 4/9] IB/srp: Introduce srp_map_fmr()
       [not found] ` <5368DA5B.80609-HInyCGIudOg@public.gmane.org>
                     ` (2 preceding siblings ...)
  2014-05-06 12:52   ` [PATCH 3/9] IB/srp: Introduce srp_alloc_fmr_pool() Bart Van Assche
@ 2014-05-06 12:53   ` Bart Van Assche
       [not found]     ` <5368DB2F.8020506-HInyCGIudOg@public.gmane.org>
  2014-05-06 12:53   ` [PATCH 5/9] IB/srp: Introduce srp_finish_mapping() Bart Van Assche
                     ` (6 subsequent siblings)
  10 siblings, 1 reply; 32+ messages in thread
From: Bart Van Assche @ 2014-05-06 12:53 UTC (permalink / raw)
  To: Roland Dreier
  Cc: Sagi Grimberg, Vu Pham, David Dillow, Sebastian Parschauer, linux-rdma

This patch does not change any functionality.

Signed-off-by: Bart Van Assche <bvanassche-HInyCGIudOg@public.gmane.org>
Cc: Roland Dreier <roland-BHEL68pLQRGGvPXPguhicg@public.gmane.org>
Cc: David Dillow <dave-i1Mk8JYDVaaSihdK6806/g@public.gmane.org>
Cc: Sagi Grimberg <sagig-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
Cc: Vu Pham <vu-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
Cc: Sebastian Parschauer <sebastian.riemer-EIkl63zCoXaH+58JC4qpiA@public.gmane.org>
---
 drivers/infiniband/ulp/srp/ib_srp.c | 77 ++++++++++++++++++++++---------------
 1 file changed, 45 insertions(+), 32 deletions(-)

diff --git a/drivers/infiniband/ulp/srp/ib_srp.c b/drivers/infiniband/ulp/srp/ib_srp.c
index f41cc8c..5fb607b 100644
--- a/drivers/infiniband/ulp/srp/ib_srp.c
+++ b/drivers/infiniband/ulp/srp/ib_srp.c
@@ -1047,12 +1047,54 @@ static int srp_map_sg_entry(struct srp_map_state *state,
 	return ret;
 }
 
+static void srp_map_fmr(struct srp_map_state *state,
+			struct srp_target_port *target, struct srp_request *req,
+			struct scatterlist *scat, int count)
+{
+	struct srp_device *dev = target->srp_host->srp_dev;
+	struct ib_device *ibdev = dev->dev;
+	struct scatterlist *sg;
+	int i, use_fmr;
+
+	state->desc	= req->indirect_desc;
+	state->pages	= req->map_page;
+	state->next_fmr	= req->fmr_list;
+
+	use_fmr = dev->fmr_pool ? SRP_MAP_ALLOW_FMR : SRP_MAP_NO_FMR;
+
+	for_each_sg(scat, sg, count, i) {
+		if (srp_map_sg_entry(state, target, sg, i, use_fmr)) {
+			/* FMR mapping failed, so backtrack to the first
+			 * unmapped entry and continue on without using FMR.
+			 */
+			dma_addr_t dma_addr;
+			unsigned int dma_len;
+
+backtrack:
+			sg = state->unmapped_sg;
+			i = state->unmapped_index;
+
+			dma_addr = ib_sg_dma_address(ibdev, sg);
+			dma_len = ib_sg_dma_len(ibdev, sg);
+			dma_len -= (state->unmapped_addr - dma_addr);
+			dma_addr = state->unmapped_addr;
+			use_fmr = SRP_MAP_NO_FMR;
+			srp_map_desc(state, dma_addr, dma_len, target->rkey);
+		}
+	}
+
+	if (use_fmr == SRP_MAP_ALLOW_FMR && srp_map_finish_fmr(state, target))
+		goto backtrack;
+
+	req->nfmr = state->nfmr;
+}
+
 static int srp_map_data(struct scsi_cmnd *scmnd, struct srp_target_port *target,
 			struct srp_request *req)
 {
-	struct scatterlist *scat, *sg;
+	struct scatterlist *scat;
 	struct srp_cmd *cmd = req->cmd->buf;
-	int i, len, nents, count, use_fmr;
+	int len, nents, count;
 	struct srp_device *dev;
 	struct ib_device *ibdev;
 	struct srp_map_state state;
@@ -1111,35 +1153,7 @@ static int srp_map_data(struct scsi_cmnd *scmnd, struct srp_target_port *target,
 				   target->indirect_size, DMA_TO_DEVICE);
 
 	memset(&state, 0, sizeof(state));
-	state.desc	= req->indirect_desc;
-	state.pages	= req->map_page;
-	state.next_fmr	= req->fmr_list;
-
-	use_fmr = dev->fmr_pool ? SRP_MAP_ALLOW_FMR : SRP_MAP_NO_FMR;
-
-	for_each_sg(scat, sg, count, i) {
-		if (srp_map_sg_entry(&state, target, sg, i, use_fmr)) {
-			/* FMR mapping failed, so backtrack to the first
-			 * unmapped entry and continue on without using FMR.
-			 */
-			dma_addr_t dma_addr;
-			unsigned int dma_len;
-
-backtrack:
-			sg = state.unmapped_sg;
-			i = state.unmapped_index;
-
-			dma_addr = ib_sg_dma_address(ibdev, sg);
-			dma_len = ib_sg_dma_len(ibdev, sg);
-			dma_len -= (state.unmapped_addr - dma_addr);
-			dma_addr = state.unmapped_addr;
-			use_fmr = SRP_MAP_NO_FMR;
-			srp_map_desc(&state, dma_addr, dma_len, target->rkey);
-		}
-	}
-
-	if (use_fmr == SRP_MAP_ALLOW_FMR && srp_map_finish_fmr(&state, target))
-		goto backtrack;
+	srp_map_fmr(&state, target, req, scat, count);
 
 	/* We've mapped the request, now pull as much of the indirect
 	 * descriptor table as we can into the command buffer. If this
@@ -1147,7 +1161,6 @@ backtrack:
 	 * guaranteed to fit into the command, as the SCSI layer won't
 	 * give us more S/G entries than we allow.
 	 */
-	req->nfmr = state.nfmr;
 	if (state.ndesc == 1) {
 		/* FMR mapping was able to collapse this to one entry,
 		 * so use a direct descriptor.
-- 
1.8.4.5




* [PATCH 5/9] IB/srp: Introduce srp_finish_mapping()
       [not found] ` <5368DA5B.80609-HInyCGIudOg@public.gmane.org>
                     ` (3 preceding siblings ...)
  2014-05-06 12:53   ` [PATCH 4/9] IB/srp: Introduce srp_map_fmr() Bart Van Assche
@ 2014-05-06 12:53   ` Bart Van Assche
       [not found]     ` <5368DB58.6070502-HInyCGIudOg@public.gmane.org>
  2014-05-06 12:54   ` [PATCH 6/9] IB/srp: Make srp_alloc_req_data() reallocate request data Bart Van Assche
                     ` (5 subsequent siblings)
  10 siblings, 1 reply; 32+ messages in thread
From: Bart Van Assche @ 2014-05-06 12:53 UTC (permalink / raw)
  To: Roland Dreier
  Cc: Sagi Grimberg, Vu Pham, David Dillow, Sebastian Parschauer, linux-rdma

This patch does not change any functionality.

Signed-off-by: Bart Van Assche <bvanassche-HInyCGIudOg@public.gmane.org>
Cc: Roland Dreier <roland-BHEL68pLQRGGvPXPguhicg@public.gmane.org>
Cc: David Dillow <dave-i1Mk8JYDVaaSihdK6806/g@public.gmane.org>
Cc: Sagi Grimberg <sagig-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
Cc: Vu Pham <vu-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
Cc: Sebastian Parschauer <sebastian.riemer-EIkl63zCoXaH+58JC4qpiA@public.gmane.org>
---
 drivers/infiniband/ulp/srp/ib_srp.c | 37 +++++++++++++++++++++++++------------
 1 file changed, 25 insertions(+), 12 deletions(-)

diff --git a/drivers/infiniband/ulp/srp/ib_srp.c b/drivers/infiniband/ulp/srp/ib_srp.c
index 5fb607b..ba434d6 100644
--- a/drivers/infiniband/ulp/srp/ib_srp.c
+++ b/drivers/infiniband/ulp/srp/ib_srp.c
@@ -935,16 +935,6 @@ static int srp_map_finish_fmr(struct srp_map_state *state,
 	struct ib_pool_fmr *fmr;
 	u64 io_addr = 0;
 
-	if (!state->npages)
-		return 0;
-
-	if (state->npages == 1) {
-		srp_map_desc(state, state->base_dma_addr, state->fmr_len,
-			     target->rkey);
-		state->npages = state->fmr_len = 0;
-		return 0;
-	}
-
 	fmr = ib_fmr_pool_map_phys(dev->fmr_pool, state->pages,
 				   state->npages, io_addr);
 	if (IS_ERR(fmr))
@@ -954,10 +944,33 @@ static int srp_map_finish_fmr(struct srp_map_state *state,
 	state->nfmr++;
 
 	srp_map_desc(state, 0, state->fmr_len, fmr->fmr->rkey);
-	state->npages = state->fmr_len = 0;
+
 	return 0;
 }
 
+static int srp_finish_mapping(struct srp_map_state *state,
+			      struct srp_target_port *target)
+{
+	int ret = 0;
+
+	if (state->npages == 0)
+		return 0;
+
+	if (state->npages == 1) {
+		srp_map_desc(state, state->base_dma_addr, state->fmr_len,
+			     target->rkey);
+	} else {
+		ret = srp_map_finish_fmr(state, target);
+	}
+
+	if (ret == 0) {
+		state->npages = 0;
+		state->fmr_len = 0;
+	}
+
+	return ret;
+}
+
 static void srp_map_update_start(struct srp_map_state *state,
 				 struct scatterlist *sg, int sg_index,
 				 dma_addr_t dma_addr)
@@ -998,7 +1011,7 @@ static int srp_map_sg_entry(struct srp_map_state *state,
 	 * avoided using FMR on such page fragments.
 	 */
 	if (dma_addr & ~dev->fmr_page_mask || dma_len > dev->fmr_max_size) {
-		ret = srp_map_finish_fmr(state, target);
+		ret = srp_finish_mapping(state, target);
 		if (ret)
 			return ret;
 
-- 
1.8.4.5



* [PATCH 6/9] IB/srp: Make srp_alloc_req_data() reallocate request data
       [not found] ` <5368DA5B.80609-HInyCGIudOg@public.gmane.org>
                     ` (4 preceding siblings ...)
  2014-05-06 12:53   ` [PATCH 5/9] IB/srp: Introduce srp_finish_mapping() Bart Van Assche
@ 2014-05-06 12:54   ` Bart Van Assche
       [not found]     ` <5368DB90.9050209-HInyCGIudOg@public.gmane.org>
  2014-05-06 12:55   ` [PATCH 7/9] IB/srp: Avoid triggering an infinite loop if memory mapping fails Bart Van Assche
                     ` (4 subsequent siblings)
  10 siblings, 1 reply; 32+ messages in thread
From: Bart Van Assche @ 2014-05-06 12:54 UTC (permalink / raw)
  To: Roland Dreier
  Cc: Sagi Grimberg, Vu Pham, David Dillow, Sebastian Parschauer, linux-rdma

This patch is needed by the patch that adds fast registration support.
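
The allocate-then-swap pattern the reworked srp_alloc_req_data() uses
(build the new ring completely, install it only on success, free
whichever ring is then unused) can be illustrated with a simplified,
self-contained sketch; struct ring and realloc_ring() below are
hypothetical stand-ins, not driver code:

```c
#include <assert.h>
#include <stdlib.h>

/* Simplified stand-in for the request ring. */
struct ring {
	int size;
	int *slots;
};

/* Build the new ring first; if allocation fails, the old ring stays
 * untouched and usable. On success, install the new ring and free the
 * old one (which may be NULL on the first call, so the same function
 * serves both initial allocation and reallocation). */
static int realloc_ring(struct ring *r, int new_size)
{
	int *new_slots = calloc(new_size, sizeof(*new_slots));

	if (!new_slots)
		return -1;

	free(r->slots);
	r->slots = new_slots;
	r->size = new_size;
	return 0;
}
```

The property the patch's kernel-doc comment promises, that a non-NULL
req_ring stays non-NULL across the call, falls out of installing the
new ring only after it has been fully built.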

Signed-off-by: Bart Van Assche <bvanassche-HInyCGIudOg@public.gmane.org>
Cc: Roland Dreier <roland-BHEL68pLQRGGvPXPguhicg@public.gmane.org>
Cc: David Dillow <dave-i1Mk8JYDVaaSihdK6806/g@public.gmane.org>
Cc: Sagi Grimberg <sagig-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
Cc: Vu Pham <vu-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
Cc: Sebastian Parschauer <sebastian.riemer-EIkl63zCoXaH+58JC4qpiA@public.gmane.org>
---
 drivers/infiniband/ulp/srp/ib_srp.c | 41 ++++++++++++++++++++++++-------------
 1 file changed, 27 insertions(+), 14 deletions(-)

diff --git a/drivers/infiniband/ulp/srp/ib_srp.c b/drivers/infiniband/ulp/srp/ib_srp.c
index ba434d6..1c4b0d3 100644
--- a/drivers/infiniband/ulp/srp/ib_srp.c
+++ b/drivers/infiniband/ulp/srp/ib_srp.c
@@ -574,17 +574,18 @@ static void srp_disconnect_target(struct srp_target_port *target)
 	}
 }
 
-static void srp_free_req_data(struct srp_target_port *target)
+static void srp_free_req_data(struct srp_target_port *target,
+			      struct srp_request *req_ring)
 {
 	struct ib_device *ibdev = target->srp_host->srp_dev->dev;
 	struct srp_request *req;
 	int i;
 
-	if (!target->req_ring)
+	if (!req_ring)
 		return;
 
 	for (i = 0; i < target->req_ring_size; ++i) {
-		req = &target->req_ring[i];
+		req = &req_ring[i];
 		kfree(req->fmr_list);
 		kfree(req->map_page);
 		if (req->indirect_dma_addr) {
@@ -595,27 +596,34 @@ static void srp_free_req_data(struct srp_target_port *target)
 		kfree(req->indirect_desc);
 	}
 
-	kfree(target->req_ring);
-	target->req_ring = NULL;
+	kfree(req_ring);
 }
 
+/**
+ * srp_alloc_req_data() - allocate or reallocate request data
+ * @target: SRP target port.
+ *
+ * If target->req_ring was non-NULL before this function got invoked it will
+ * also be non-NULL after this function has finished.
+ */
 static int srp_alloc_req_data(struct srp_target_port *target)
 {
 	struct srp_device *srp_dev = target->srp_host->srp_dev;
 	struct ib_device *ibdev = srp_dev->dev;
-	struct srp_request *req;
+	struct list_head free_reqs;
+	struct srp_request *req_ring, *req;
 	dma_addr_t dma_addr;
 	int i, ret = -ENOMEM;
 
-	INIT_LIST_HEAD(&target->free_reqs);
+	INIT_LIST_HEAD(&free_reqs);
 
-	target->req_ring = kzalloc(target->req_ring_size *
-				   sizeof(*target->req_ring), GFP_KERNEL);
-	if (!target->req_ring)
+	req_ring = kzalloc(target->req_ring_size * sizeof(*req_ring),
+			   GFP_KERNEL);
+	if (!req_ring)
 		goto out;
 
 	for (i = 0; i < target->req_ring_size; ++i) {
-		req = &target->req_ring[i];
+		req = &req_ring[i];
 		req->fmr_list = kmalloc(target->cmd_sg_cnt * sizeof(void *),
 					GFP_KERNEL);
 		req->map_page = kmalloc(SRP_FMR_SIZE * sizeof(void *),
@@ -632,11 +640,16 @@ static int srp_alloc_req_data(struct srp_target_port *target)
 
 		req->indirect_dma_addr = dma_addr;
 		req->index = i;
-		list_add_tail(&req->list, &target->free_reqs);
+		list_add_tail(&req->list, &free_reqs);
 	}
+	swap(target->req_ring, req_ring);
+	INIT_LIST_HEAD(&target->free_reqs);
+	list_splice(&free_reqs, &target->free_reqs);
 	ret = 0;
 
 out:
+	srp_free_req_data(target, req_ring);
+
 	return ret;
 }
 
@@ -669,7 +682,7 @@ static void srp_remove_target(struct srp_target_port *target)
 	srp_free_target_ib(target);
 	cancel_work_sync(&target->tl_err_work);
 	srp_rport_put(target->rport);
-	srp_free_req_data(target);
+	srp_free_req_data(target, target->req_ring);
 
 	spin_lock(&target->srp_host->target_lock);
 	list_del(&target->list);
@@ -2750,7 +2763,7 @@ err_free_ib:
 	srp_free_target_ib(target);
 
 err_free_mem:
-	srp_free_req_data(target);
+	srp_free_req_data(target, target->req_ring);
 
 err:
 	scsi_host_put(target_host);
-- 
1.8.4.5



* [PATCH 7/9] IB/srp: Avoid triggering an infinite loop if memory mapping fails
       [not found] ` <5368DA5B.80609-HInyCGIudOg@public.gmane.org>
                     ` (5 preceding siblings ...)
  2014-05-06 12:54   ` [PATCH 6/9] IB/srp: Make srp_alloc_req_data() reallocate request data Bart Van Assche
@ 2014-05-06 12:55   ` Bart Van Assche
       [not found]     ` <5368DBC5.6070609-HInyCGIudOg@public.gmane.org>
  2014-05-06 12:56   ` [PATCH 8/9] IB/srp: Rename FMR-related variables Bart Van Assche
                     ` (3 subsequent siblings)
  10 siblings, 1 reply; 32+ messages in thread
From: Bart Van Assche @ 2014-05-06 12:55 UTC (permalink / raw)
  To: Roland Dreier
  Cc: Sagi Grimberg, Vu Pham, David Dillow, Sebastian Parschauer, linux-rdma

Request that the SCSI mid-layer retry a SCSI command only after a
temporary mapping failure (-ENOMEM), not after a permanent mapping
failure. This avoids retrying SCSI commands indefinitely when a
permanent memory mapping failure occurs.
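
The resulting policy can be modeled with a small, self-contained C
sketch; handle_map_result() and the enum are hypothetical stand-ins
that capture only the return-value decision, not the real
srp_queuecommand():

```c
#include <assert.h>
#include <errno.h>

enum queue_outcome {
	QUEUE_RETRY,		/* return SCSI_MLQUEUE_HOST_BUSY; the
				 * mid-layer resubmits the command */
	QUEUE_FAIL_COMMAND,	/* complete with DID_ERROR; no
				 * further retries */
};

/* Model of the new error handling: only a temporary mapping failure
 * (-ENOMEM) asks the SCSI mid-layer to retry; any other negative
 * srp_map_data() result fails the command permanently. */
static enum queue_outcome handle_map_result(int len)
{
	if (len == -ENOMEM)
		return QUEUE_RETRY;
	return QUEUE_FAIL_COMMAND;
}
```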

Signed-off-by: Bart Van Assche <bvanassche-HInyCGIudOg@public.gmane.org>
Cc: Roland Dreier <roland-BHEL68pLQRGGvPXPguhicg@public.gmane.org>
Cc: David Dillow <dave-i1Mk8JYDVaaSihdK6806/g@public.gmane.org>
Cc: Sagi Grimberg <sagig-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
Cc: Vu Pham <vu-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
Cc: Sebastian Parschauer <sebastian.riemer-EIkl63zCoXaH+58JC4qpiA@public.gmane.org>
---
 drivers/infiniband/ulp/srp/ib_srp.c | 19 ++++++++++++-------
 1 file changed, 12 insertions(+), 7 deletions(-)

diff --git a/drivers/infiniband/ulp/srp/ib_srp.c b/drivers/infiniband/ulp/srp/ib_srp.c
index 1c4b0d3..af94381 100644
--- a/drivers/infiniband/ulp/srp/ib_srp.c
+++ b/drivers/infiniband/ulp/srp/ib_srp.c
@@ -1564,7 +1564,7 @@ static int srp_queuecommand(struct Scsi_Host *shost, struct scsi_cmnd *scmnd)
 	struct srp_cmd *cmd;
 	struct ib_device *dev;
 	unsigned long flags;
-	int len, result;
+	int len, result, ret = SCSI_MLQUEUE_HOST_BUSY;
 	const bool in_scsi_eh = !in_interrupt() && current == shost->ehandler;
 
 	/*
@@ -1580,6 +1580,7 @@ static int srp_queuecommand(struct Scsi_Host *shost, struct scsi_cmnd *scmnd)
 	if (unlikely(result)) {
 		scmnd->result = result;
 		scmnd->scsi_done(scmnd);
+		ret = 0;
 		goto unlock_rport;
 	}
 
@@ -1613,7 +1614,12 @@ static int srp_queuecommand(struct Scsi_Host *shost, struct scsi_cmnd *scmnd)
 	len = srp_map_data(scmnd, target, req);
 	if (len < 0) {
 		shost_printk(KERN_ERR, target->scsi_host,
-			     PFX "Failed to map data\n");
+			     PFX "Failed to map data (%d)\n", len);
+		if (len != -ENOMEM) {
+			scmnd->result = DID_ERROR << 16;
+			scmnd->scsi_done(scmnd);
+			ret = 0;
+		}
 		goto err_iu;
 	}
 
@@ -1625,11 +1631,13 @@ static int srp_queuecommand(struct Scsi_Host *shost, struct scsi_cmnd *scmnd)
 		goto err_unmap;
 	}
 
+	ret = 0;
+
 unlock_rport:
 	if (in_scsi_eh)
 		mutex_unlock(&rport->mutex);
 
-	return 0;
+	return ret;
 
 err_unmap:
 	srp_unmap_data(scmnd, target, req);
@@ -1643,10 +1651,7 @@ err_iu:
 err_unlock:
 	spin_unlock_irqrestore(&target->lock, flags);
 
-	if (in_scsi_eh)
-		mutex_unlock(&rport->mutex);
-
-	return SCSI_MLQUEUE_HOST_BUSY;
+	goto unlock_rport;
 }
 
 /*
-- 
1.8.4.5



* [PATCH 8/9] IB/srp: Rename FMR-related variables
       [not found] ` <5368DA5B.80609-HInyCGIudOg@public.gmane.org>
                     ` (6 preceding siblings ...)
  2014-05-06 12:55   ` [PATCH 7/9] IB/srp: Avoid triggering an infinite loop if memory mapping fails Bart Van Assche
@ 2014-05-06 12:56   ` Bart Van Assche
  2014-05-06 12:56   ` [PATCH 9/9] IB/srp: Add fast registration support Bart Van Assche
                     ` (2 subsequent siblings)
  10 siblings, 0 replies; 32+ messages in thread
From: Bart Van Assche @ 2014-05-06 12:56 UTC (permalink / raw)
  To: Roland Dreier
  Cc: Sagi Grimberg, Vu Pham, David Dillow, Sebastian Parschauer, linux-rdma

The next patch will share the renamed variables between the FMR code
and the fast registration (FR) code. Make the names of these variables
independent of the memory registration mode. This patch does not
change any functionality.

Signed-off-by: Bart Van Assche <bvanassche-HInyCGIudOg@public.gmane.org>
Cc: Roland Dreier <roland-BHEL68pLQRGGvPXPguhicg@public.gmane.org>
Cc: David Dillow <dave-i1Mk8JYDVaaSihdK6806/g@public.gmane.org>
Cc: Sagi Grimberg <sagig-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
Cc: Vu Pham <vu-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
Cc: Sebastian Parschauer <sebastian.riemer-EIkl63zCoXaH+58JC4qpiA@public.gmane.org>
---
 drivers/infiniband/ulp/srp/ib_srp.c | 44 ++++++++++++++++++-------------------
 drivers/infiniband/ulp/srp/ib_srp.h | 18 +++++++--------
 2 files changed, 31 insertions(+), 31 deletions(-)

diff --git a/drivers/infiniband/ulp/srp/ib_srp.c b/drivers/infiniband/ulp/srp/ib_srp.c
index af94381..017de46 100644
--- a/drivers/infiniband/ulp/srp/ib_srp.c
+++ b/drivers/infiniband/ulp/srp/ib_srp.c
@@ -626,7 +626,7 @@ static int srp_alloc_req_data(struct srp_target_port *target)
 		req = &req_ring[i];
 		req->fmr_list = kmalloc(target->cmd_sg_cnt * sizeof(void *),
 					GFP_KERNEL);
-		req->map_page = kmalloc(SRP_FMR_SIZE * sizeof(void *),
+		req->map_page = kmalloc(SRP_MAX_PAGES_PER_MR * sizeof(void *),
 					GFP_KERNEL);
 		req->indirect_desc = kmalloc(target->indirect_size, GFP_KERNEL);
 		if (!req->fmr_list || !req->map_page || !req->indirect_desc)
@@ -784,7 +784,7 @@ static void srp_unmap_data(struct scsi_cmnd *scmnd,
 		return;
 
 	pfmr = req->fmr_list;
-	while (req->nfmr--)
+	while (req->nmdesc--)
 		ib_fmr_pool_unmap(*pfmr++);
 
 	ib_dma_unmap_sg(ibdev, scsi_sglist(scmnd), scsi_sg_count(scmnd),
@@ -954,9 +954,9 @@ static int srp_map_finish_fmr(struct srp_map_state *state,
 		return PTR_ERR(fmr);
 
 	*state->next_fmr++ = fmr;
-	state->nfmr++;
+	state->nmdesc++;
 
-	srp_map_desc(state, 0, state->fmr_len, fmr->fmr->rkey);
+	srp_map_desc(state, 0, state->dma_len, fmr->fmr->rkey);
 
 	return 0;
 }
@@ -970,7 +970,7 @@ static int srp_finish_mapping(struct srp_map_state *state,
 		return 0;
 
 	if (state->npages == 1) {
-		srp_map_desc(state, state->base_dma_addr, state->fmr_len,
+		srp_map_desc(state, state->base_dma_addr, state->dma_len,
 			     target->rkey);
 	} else {
 		ret = srp_map_finish_fmr(state, target);
@@ -978,7 +978,7 @@ static int srp_finish_mapping(struct srp_map_state *state,
 
 	if (ret == 0) {
 		state->npages = 0;
-		state->fmr_len = 0;
+		state->dma_len = 0;
 	}
 
 	return ret;
@@ -1023,7 +1023,7 @@ static int srp_map_sg_entry(struct srp_map_state *state,
 	 * that were never quite defined, but went away when the initiator
 	 * avoided using FMR on such page fragments.
 	 */
-	if (dma_addr & ~dev->fmr_page_mask || dma_len > dev->fmr_max_size) {
+	if (dma_addr & ~dev->mr_page_mask || dma_len > dev->fmr_max_size) {
 		ret = srp_finish_mapping(state, target);
 		if (ret)
 			return ret;
@@ -1042,7 +1042,7 @@ static int srp_map_sg_entry(struct srp_map_state *state,
 		srp_map_update_start(state, sg, sg_index, dma_addr);
 
 	while (dma_len) {
-		if (state->npages == SRP_FMR_SIZE) {
+		if (state->npages == SRP_MAX_PAGES_PER_MR) {
 			ret = srp_map_finish_fmr(state, target);
 			if (ret)
 				return ret;
@@ -1050,12 +1050,12 @@ static int srp_map_sg_entry(struct srp_map_state *state,
 			srp_map_update_start(state, sg, sg_index, dma_addr);
 		}
 
-		len = min_t(unsigned int, dma_len, dev->fmr_page_size);
+		len = min_t(unsigned int, dma_len, dev->mr_page_size);
 
 		if (!state->npages)
 			state->base_dma_addr = dma_addr;
 		state->pages[state->npages++] = dma_addr;
-		state->fmr_len += len;
+		state->dma_len += len;
 		dma_addr += len;
 		dma_len -= len;
 	}
@@ -1065,7 +1065,7 @@ static int srp_map_sg_entry(struct srp_map_state *state,
 	 * boundries.
 	 */
 	ret = 0;
-	if (len != dev->fmr_page_size) {
+	if (len != dev->mr_page_size) {
 		ret = srp_map_finish_fmr(state, target);
 		if (!ret)
 			srp_map_update_start(state, NULL, 0, 0);
@@ -1112,7 +1112,7 @@ backtrack:
 	if (use_fmr == SRP_MAP_ALLOW_FMR && srp_map_finish_fmr(state, target))
 		goto backtrack;
 
-	req->nfmr = state->nfmr;
+	req->nmdesc = state->nmdesc;
 }
 
 static int srp_map_data(struct scsi_cmnd *scmnd, struct srp_target_port *target,
@@ -1165,7 +1165,7 @@ static int srp_map_data(struct scsi_cmnd *scmnd, struct srp_target_port *target,
 		buf->key = cpu_to_be32(target->rkey);
 		buf->len = cpu_to_be32(ib_sg_dma_len(ibdev, scat));
 
-		req->nfmr = 0;
+		req->nmdesc = 0;
 		goto map_complete;
 	}
 
@@ -2844,15 +2844,15 @@ static void srp_alloc_fmr_pool(struct srp_device *srp_dev)
 
 	srp_dev->fmr_pool = NULL;
 
-	for (max_pages_per_mr = SRP_FMR_SIZE;
-	     max_pages_per_mr >= SRP_FMR_MIN_SIZE;
+	for (max_pages_per_mr = SRP_MAX_PAGES_PER_MR;
+	     max_pages_per_mr >= SRP_MIN_PAGES_PER_MR;
 	     max_pages_per_mr /= 2) {
 		memset(&fmr_param, 0, sizeof(fmr_param));
-		fmr_param.pool_size	    = SRP_FMR_POOL_SIZE;
+		fmr_param.pool_size	    = SRP_MDESC_PER_POOL;
 		fmr_param.dirty_watermark   = SRP_FMR_DIRTY_SIZE;
 		fmr_param.cache		    = 1;
 		fmr_param.max_pages_per_fmr = max_pages_per_mr;
-		fmr_param.page_shift	    = ilog2(srp_dev->fmr_page_size);
+		fmr_param.page_shift	    = ilog2(srp_dev->mr_page_size);
 		fmr_param.access	    = (IB_ACCESS_LOCAL_WRITE |
 					       IB_ACCESS_REMOTE_WRITE |
 					       IB_ACCESS_REMOTE_READ);
@@ -2861,7 +2861,7 @@ static void srp_alloc_fmr_pool(struct srp_device *srp_dev)
 		if (!IS_ERR(pool)) {
 			srp_dev->fmr_pool = pool;
 			srp_dev->fmr_max_size =
-				srp_dev->fmr_page_size * max_pages_per_mr;
+				srp_dev->mr_page_size * max_pages_per_mr;
 			break;
 		}
 	}
@@ -2872,7 +2872,7 @@ static void srp_add_one(struct ib_device *device)
 	struct srp_device *srp_dev;
 	struct ib_device_attr *dev_attr;
 	struct srp_host *host;
-	int fmr_page_shift, s, e, p;
+	int mr_page_shift, s, e, p;
 
 	dev_attr = kmalloc(sizeof *dev_attr, GFP_KERNEL);
 	if (!dev_attr)
@@ -2892,9 +2892,9 @@ static void srp_add_one(struct ib_device *device)
 	 * minimum of 4096 bytes. We're unlikely to build large sglists
 	 * out of smaller entries.
 	 */
-	fmr_page_shift		= max(12, ffs(dev_attr->page_size_cap) - 1);
-	srp_dev->fmr_page_size	= 1 << fmr_page_shift;
-	srp_dev->fmr_page_mask	= ~((u64) srp_dev->fmr_page_size - 1);
+	mr_page_shift		= max(12, ffs(dev_attr->page_size_cap) - 1);
+	srp_dev->mr_page_size	= 1 << mr_page_shift;
+	srp_dev->mr_page_mask	= ~((u64) srp_dev->mr_page_size - 1);
 
 	INIT_LIST_HEAD(&srp_dev->dev_list);
 
diff --git a/drivers/infiniband/ulp/srp/ib_srp.h b/drivers/infiniband/ulp/srp/ib_srp.h
index aad27b7..89e3adb 100644
--- a/drivers/infiniband/ulp/srp/ib_srp.h
+++ b/drivers/infiniband/ulp/srp/ib_srp.h
@@ -66,10 +66,10 @@ enum {
 	SRP_TAG_NO_REQ		= ~0U,
 	SRP_TAG_TSK_MGMT	= 1U << 31,
 
-	SRP_FMR_SIZE		= 512,
-	SRP_FMR_MIN_SIZE	= 128,
-	SRP_FMR_POOL_SIZE	= 1024,
-	SRP_FMR_DIRTY_SIZE	= SRP_FMR_POOL_SIZE / 4,
+	SRP_MAX_PAGES_PER_MR	= 512,
+	SRP_MIN_PAGES_PER_MR	= 128,
+	SRP_MDESC_PER_POOL	= 1024,
+	SRP_FMR_DIRTY_SIZE	= SRP_MDESC_PER_POOL / 4,
 
 	SRP_MAP_ALLOW_FMR	= 0,
 	SRP_MAP_NO_FMR		= 1,
@@ -92,8 +92,8 @@ struct srp_device {
 	struct ib_pd	       *pd;
 	struct ib_mr	       *mr;
 	struct ib_fmr_pool     *fmr_pool;
-	u64			fmr_page_mask;
-	int			fmr_page_size;
+	u64			mr_page_mask;
+	int			mr_page_size;
 	int			fmr_max_size;
 };
 
@@ -116,7 +116,7 @@ struct srp_request {
 	u64		       *map_page;
 	struct srp_direct_buf  *indirect_desc;
 	dma_addr_t		indirect_dma_addr;
-	short			nfmr;
+	short			nmdesc;
 	short			index;
 };
 
@@ -202,10 +202,10 @@ struct srp_map_state {
 	struct srp_direct_buf  *desc;
 	u64		       *pages;
 	dma_addr_t		base_dma_addr;
-	u32			fmr_len;
+	u32			dma_len;
 	u32			total_len;
 	unsigned int		npages;
-	unsigned int		nfmr;
+	unsigned int		nmdesc;
 	unsigned int		ndesc;
 	struct scatterlist     *unmapped_sg;
 	int			unmapped_index;
-- 
1.8.4.5



* [PATCH 9/9] IB/srp: Add fast registration support
       [not found] ` <5368DA5B.80609-HInyCGIudOg@public.gmane.org>
                     ` (7 preceding siblings ...)
  2014-05-06 12:56   ` [PATCH 8/9] IB/srp: Rename FMR-related variables Bart Van Assche
@ 2014-05-06 12:56   ` Bart Van Assche
       [not found]     ` <5368DC09.70608-HInyCGIudOg@public.gmane.org>
  2014-05-06 14:06   ` [PATCH 0/9] SRP initiator patches for kernel 3.16 Jack Wang
  2014-05-06 14:21   ` [PATCH 3/9] IB/srp: Introduce srp_alloc_fmr_pool() Bart Van Assche
  10 siblings, 1 reply; 32+ messages in thread
From: Bart Van Assche @ 2014-05-06 12:56 UTC (permalink / raw)
  To: Roland Dreier
  Cc: Sagi Grimberg, Vu Pham, David Dillow, Sebastian Parschauer, linux-rdma

Certain HCA types (e.g. Connect-IB) and certain configurations (e.g.
ConnectX VF) support fast registration (FR) but not fast memory
registration (FMR). Hence add fast registration support.
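
For readers unfamiliar with the pool this patch introduces: srp_fr_pool_get()
and srp_fr_pool_put() in the diff below manage descriptors through a plain
free list protected by a spinlock. A simplified user-space model of that
free-list discipline (hypothetical names, a singly linked list standing in
for the kernel's list_head, and no locking) looks roughly like this:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for struct srp_fr_desc / struct srp_fr_pool. */
struct fr_desc {
	struct fr_desc *next;	/* models the list_head linkage */
	unsigned int rkey;	/* models desc->mr->rkey */
};

struct fr_pool {
	struct fr_desc *free_list;
	int size;
};

/* Mirrors srp_fr_pool_get(): take one descriptor, or NULL if exhausted. */
static struct fr_desc *pool_get(struct fr_pool *p)
{
	struct fr_desc *d = p->free_list;

	if (d)
		p->free_list = d->next;
	return d;
}

/*
 * Mirrors srp_fr_pool_put(): push n descriptors back onto the free list.
 * In the driver the caller must already have posted an IB_WR_LOCAL_INV
 * for each descriptor's rkey before returning it.
 */
static void pool_put(struct fr_pool *p, struct fr_desc **desc, int n)
{
	int i;

	for (i = 0; i < n; i++) {
		desc[i]->next = p->free_list;
		p->free_list = desc[i];
	}
}
```

Since put pushes onto the head and get pops from it, reuse is LIFO, matching
the driver's list_add()/list_first_entry() pair; the real pool additionally
serializes both operations with pool->lock.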

Signed-off-by: Bart Van Assche <bvanassche-HInyCGIudOg@public.gmane.org>
Cc: Roland Dreier <roland-BHEL68pLQRGGvPXPguhicg@public.gmane.org>
Cc: David Dillow <dave-i1Mk8JYDVaaSihdK6806/g@public.gmane.org>
Cc: Sagi Grimberg <sagig-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
Cc: Vu Pham <vu-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
Cc: Sebastian Parschauer <sebastian.riemer-EIkl63zCoXaH+58JC4qpiA@public.gmane.org>
---
 drivers/infiniband/ulp/srp/ib_srp.c | 442 ++++++++++++++++++++++++++++++------
 drivers/infiniband/ulp/srp/ib_srp.h |  82 ++++++-
 2 files changed, 451 insertions(+), 73 deletions(-)

diff --git a/drivers/infiniband/ulp/srp/ib_srp.c b/drivers/infiniband/ulp/srp/ib_srp.c
index 017de46..fbda2ca 100644
--- a/drivers/infiniband/ulp/srp/ib_srp.c
+++ b/drivers/infiniband/ulp/srp/ib_srp.c
@@ -66,6 +66,8 @@ static unsigned int srp_sg_tablesize;
 static unsigned int cmd_sg_entries;
 static unsigned int indirect_sg_entries;
 static bool allow_ext_sg;
+static bool prefer_fr;
+static bool register_always;
 static int topspin_workarounds = 1;
 
 module_param(srp_sg_tablesize, uint, 0444);
@@ -87,6 +89,14 @@ module_param(topspin_workarounds, int, 0444);
 MODULE_PARM_DESC(topspin_workarounds,
 		 "Enable workarounds for Topspin/Cisco SRP target bugs if != 0");
 
+module_param(prefer_fr, bool, 0444);
+MODULE_PARM_DESC(prefer_fr,
+		 "Whether to use FR if both FMR and FR are supported");
+
+module_param(register_always, bool, 0444);
+MODULE_PARM_DESC(register_always,
+		 "Use memory registration even for contiguous memory regions");
+
 static struct kernel_param_ops srp_tmo_ops;
 
 static int srp_reconnect_delay = 10;
@@ -288,12 +298,154 @@ static int srp_new_cm_id(struct srp_target_port *target)
 	return 0;
 }
 
+/**
+ * srp_destroy_fr_pool() - free the resources owned by a pool
+ * @pool: Fast registration pool to be destroyed.
+ */
+static void srp_destroy_fr_pool(struct srp_fr_pool *pool)
+{
+	int i;
+	struct srp_fr_desc *d;
+
+	if (!pool)
+		return;
+
+	for (i = 0, d = &pool->desc[0]; i < pool->size; i++, d++) {
+		if (d->frpl)
+			ib_free_fast_reg_page_list(d->frpl);
+		if (d->mr)
+			ib_dereg_mr(d->mr);
+	}
+	kfree(pool);
+}
+
+/**
+ * srp_create_fr_pool() - allocate and initialize a pool for fast registration
+ * @device:            IB device to allocate fast registration descriptors for.
+ * @pd:                Protection domain associated with the FR descriptors.
+ * @pool_size:         Number of descriptors to allocate.
+ * @max_page_list_len: Maximum fast registration work request page list length.
+ */
+static struct srp_fr_pool *srp_create_fr_pool(struct ib_device *device,
+					      struct ib_pd *pd, int pool_size,
+					      int max_page_list_len)
+{
+	struct srp_fr_pool *pool;
+	struct srp_fr_desc *d;
+	struct ib_mr *mr;
+	struct ib_fast_reg_page_list *frpl;
+	int i, ret = -EINVAL;
+
+	if (pool_size <= 0)
+		goto err;
+	ret = -ENOMEM;
+	pool = kzalloc(sizeof(struct srp_fr_pool) +
+		       pool_size * sizeof(struct srp_fr_desc), GFP_KERNEL);
+	if (!pool)
+		goto err;
+	pool->size = pool_size;
+	pool->max_page_list_len = max_page_list_len;
+	spin_lock_init(&pool->lock);
+	INIT_LIST_HEAD(&pool->free_list);
+
+	for (i = 0, d = &pool->desc[0]; i < pool->size; i++, d++) {
+		mr = ib_alloc_fast_reg_mr(pd, max_page_list_len);
+		if (IS_ERR(mr)) {
+			ret = PTR_ERR(mr);
+			goto destroy_pool;
+		}
+		d->mr = mr;
+		frpl = ib_alloc_fast_reg_page_list(device, max_page_list_len);
+		if (IS_ERR(frpl)) {
+			ret = PTR_ERR(frpl);
+			goto destroy_pool;
+		}
+		d->frpl = frpl;
+		list_add_tail(&d->entry, &pool->free_list);
+	}
+
+out:
+	return pool;
+
+destroy_pool:
+	srp_destroy_fr_pool(pool);
+
+err:
+	pool = ERR_PTR(ret);
+	goto out;
+}
+
+/**
+ * srp_fr_pool_get() - obtain a descriptor suitable for fast registration
+ * @pool: Pool to obtain descriptor from.
+ */
+static struct srp_fr_desc *srp_fr_pool_get(struct srp_fr_pool *pool)
+{
+	struct srp_fr_desc *d = NULL;
+	unsigned long flags;
+
+	spin_lock_irqsave(&pool->lock, flags);
+	if (!list_empty(&pool->free_list)) {
+		d = list_first_entry(&pool->free_list, typeof(*d), entry);
+		list_del(&d->entry);
+	}
+	spin_unlock_irqrestore(&pool->lock, flags);
+
+	return d;
+}
+
+/**
+ * srp_fr_pool_put() - put an FR descriptor back in the free list
+ * @pool: Pool the descriptor was allocated from.
+ * @desc: Pointer to an array of fast registration descriptor pointers.
+ * @n:    Number of descriptors to put back.
+ *
+ * Note: The caller must already have queued an invalidation request for
+ * desc->mr->rkey before calling this function.
+ */
+static void srp_fr_pool_put(struct srp_fr_pool *pool, struct srp_fr_desc **desc,
+			    int n)
+{
+	unsigned long flags;
+	int i;
+
+	spin_lock_irqsave(&pool->lock, flags);
+	for (i = 0; i < n; i++)
+		list_add(&desc[i]->entry, &pool->free_list);
+	spin_unlock_irqrestore(&pool->lock, flags);
+}
+
+static struct srp_fr_pool *srp_alloc_fr_pool(struct srp_target_port *target)
+{
+	struct srp_device *dev = target->srp_host->srp_dev;
+	struct srp_fr_pool *pool;
+	int max_pages_per_mr;
+
+	for (max_pages_per_mr = SRP_MAX_PAGES_PER_MR;
+	     max_pages_per_mr >= SRP_MIN_PAGES_PER_MR;
+	     max_pages_per_mr /= 2) {
+		pool = srp_create_fr_pool(dev->dev, dev->pd,
+					  SRP_MDESC_PER_POOL, max_pages_per_mr);
+		if (!IS_ERR(pool))
+			goto out;
+	}
+
+	if (IS_ERR(pool))
+		pr_warn("Fast registration pool creation for %s failed: %ld\n",
+			dev->dev->name, PTR_ERR(pool));
+
+out:
+	return pool;
+}
+
 static int srp_create_target_ib(struct srp_target_port *target)
 {
 	struct srp_device *dev = target->srp_host->srp_dev;
 	struct ib_qp_init_attr *init_attr;
 	struct ib_cq *recv_cq, *send_cq;
 	struct ib_qp *qp;
+	struct srp_fr_pool *fr_pool = NULL;
+	const int m = 1 + dev->use_fast_reg;
 	int ret;
 
 	init_attr = kzalloc(sizeof *init_attr, GFP_KERNEL);
@@ -308,7 +460,7 @@ static int srp_create_target_ib(struct srp_target_port *target)
 	}
 
 	send_cq = ib_create_cq(dev->dev, srp_send_completion, NULL, target,
-			       target->queue_size, target->comp_vector);
+			       m * target->queue_size, target->comp_vector);
 	if (IS_ERR(send_cq)) {
 		ret = PTR_ERR(send_cq);
 		goto err_recv_cq;
@@ -317,11 +469,11 @@ static int srp_create_target_ib(struct srp_target_port *target)
 	ib_req_notify_cq(recv_cq, IB_CQ_NEXT_COMP);
 
 	init_attr->event_handler       = srp_qp_event;
-	init_attr->cap.max_send_wr     = target->queue_size;
+	init_attr->cap.max_send_wr     = m * target->queue_size;
 	init_attr->cap.max_recv_wr     = target->queue_size;
 	init_attr->cap.max_recv_sge    = 1;
 	init_attr->cap.max_send_sge    = 1;
-	init_attr->sq_sig_type         = IB_SIGNAL_ALL_WR;
+	init_attr->sq_sig_type         = IB_SIGNAL_REQ_WR;
 	init_attr->qp_type             = IB_QPT_RC;
 	init_attr->send_cq             = send_cq;
 	init_attr->recv_cq             = recv_cq;
@@ -336,6 +488,14 @@ static int srp_create_target_ib(struct srp_target_port *target)
 	if (ret)
 		goto err_qp;
 
+	if (dev->use_fast_reg) {
+		fr_pool = srp_alloc_fr_pool(target);
+		if (IS_ERR(fr_pool)) {
+			ret = PTR_ERR(fr_pool);
+			goto err_qp;
+		}
+	}
+
 	if (target->qp)
 		ib_destroy_qp(target->qp);
 	if (target->recv_cq)
@@ -343,6 +503,15 @@ static int srp_create_target_ib(struct srp_target_port *target)
 	if (target->send_cq)
 		ib_destroy_cq(target->send_cq);
 
+	if (dev->use_fast_reg) {
+		srp_destroy_fr_pool(target->fr_pool);
+		target->fr_pool = fr_pool;
+		target->mr_max_size = dev->mr_page_size *
+			fr_pool->max_page_list_len;
+	} else {
+		target->mr_max_size = dev->fmr_max_size;
+	}
+
 	target->qp = qp;
 	target->recv_cq = recv_cq;
 	target->send_cq = send_cq;
@@ -370,12 +539,16 @@ err:
  */
 static void srp_free_target_ib(struct srp_target_port *target)
 {
+	struct srp_device *dev = target->srp_host->srp_dev;
 	int i;
 
 	ib_destroy_qp(target->qp);
 	ib_destroy_cq(target->send_cq);
 	ib_destroy_cq(target->recv_cq);
 
+	if (dev->use_fast_reg)
+		srp_destroy_fr_pool(target->fr_pool);
+
 	target->qp = NULL;
 	target->send_cq = target->recv_cq = NULL;
 
@@ -577,7 +750,8 @@ static void srp_disconnect_target(struct srp_target_port *target)
 static void srp_free_req_data(struct srp_target_port *target,
 			      struct srp_request *req_ring)
 {
-	struct ib_device *ibdev = target->srp_host->srp_dev->dev;
+	struct srp_device *dev = target->srp_host->srp_dev;
+	struct ib_device *ibdev = dev->dev;
 	struct srp_request *req;
 	int i;
 
@@ -586,7 +760,10 @@ static void srp_free_req_data(struct srp_target_port *target,
 
 	for (i = 0; i < target->req_ring_size; ++i) {
 		req = &req_ring[i];
-		kfree(req->fmr_list);
+		if (dev->use_fast_reg)
+			kfree(req->fr.fr_list);
+		else
+			kfree(req->fmr.fmr_list);
 		kfree(req->map_page);
 		if (req->indirect_dma_addr) {
 			ib_dma_unmap_single(ibdev, req->indirect_dma_addr,
@@ -612,6 +789,7 @@ static int srp_alloc_req_data(struct srp_target_port *target)
 	struct ib_device *ibdev = srp_dev->dev;
 	struct list_head free_reqs;
 	struct srp_request *req_ring, *req;
+	void *mr_list;
 	dma_addr_t dma_addr;
 	int i, ret = -ENOMEM;
 
@@ -624,12 +802,20 @@ static int srp_alloc_req_data(struct srp_target_port *target)
 
 	for (i = 0; i < target->req_ring_size; ++i) {
 		req = &req_ring[i];
-		req->fmr_list = kmalloc(target->cmd_sg_cnt * sizeof(void *),
-					GFP_KERNEL);
+		mr_list = kmalloc(target->cmd_sg_cnt * sizeof(void *),
+				  GFP_KERNEL);
+		if (!mr_list)
+			goto out;
+		if (srp_dev->use_fast_reg)
+			req->fr.fr_list = mr_list;
+		else
+			req->fmr.fmr_list = mr_list;
 		req->map_page = kmalloc(SRP_MAX_PAGES_PER_MR * sizeof(void *),
 					GFP_KERNEL);
+		if (!req->map_page)
+			goto out;
 		req->indirect_desc = kmalloc(target->indirect_size, GFP_KERNEL);
-		if (!req->fmr_list || !req->map_page || !req->indirect_desc)
+		if (!req->indirect_desc)
 			goto out;
 
 		dma_addr = ib_dma_map_single(ibdev, req->indirect_desc,
@@ -771,21 +957,49 @@ static int srp_connect_target(struct srp_target_port *target)
 	}
 }
 
+static int srp_inv_rkey(struct srp_target_port *target, u32 rkey)
+{
+	struct ib_send_wr *bad_wr;
+	struct ib_send_wr wr = {
+		.opcode		    = IB_WR_LOCAL_INV,
+		.wr_id		    = LOCAL_INV_WR_ID_MASK,
+		.next		    = NULL,
+		.num_sge	    = 0,
+		.send_flags	    = 0,
+		.ex.invalidate_rkey = rkey,
+	};
+
+	return ib_post_send(target->qp, &wr, &bad_wr);
+}
+
 static void srp_unmap_data(struct scsi_cmnd *scmnd,
 			   struct srp_target_port *target,
 			   struct srp_request *req)
 {
-	struct ib_device *ibdev = target->srp_host->srp_dev->dev;
-	struct ib_pool_fmr **pfmr;
+	struct srp_device *dev = target->srp_host->srp_dev;
+	struct ib_device *ibdev = dev->dev;
+	int i;
 
 	if (!scsi_sglist(scmnd) ||
 	    (scmnd->sc_data_direction != DMA_TO_DEVICE &&
 	     scmnd->sc_data_direction != DMA_FROM_DEVICE))
 		return;
 
-	pfmr = req->fmr_list;
-	while (req->nmdesc--)
-		ib_fmr_pool_unmap(*pfmr++);
+	if (dev->use_fast_reg) {
+		struct srp_fr_desc **pfr;
+
+		for (i = req->nmdesc, pfr = req->fr.fr_list; i > 0; i--, pfr++)
+			srp_inv_rkey(target, (*pfr)->mr->rkey);
+		if (req->nmdesc)
+			srp_fr_pool_put(target->fr_pool, req->fr.fr_list,
+					req->nmdesc);
+	} else {
+		struct ib_pool_fmr **pfmr;
+
+		for (i = req->nmdesc, pfmr = req->fmr.fmr_list; i > 0;
+		     i--, pfmr++)
+			ib_fmr_pool_unmap(*pfmr);
+	}
 
 	ib_dma_unmap_sg(ibdev, scsi_sglist(scmnd), scsi_sg_count(scmnd),
 			scmnd->sc_data_direction);
@@ -898,20 +1112,25 @@ static int srp_rport_reconnect(struct srp_rport *rport)
 	 * callbacks will have finished before a new QP is allocated.
 	 */
 	ret = srp_new_cm_id(target);
+
+	for (i = 0; i < target->req_ring_size; ++i) {
+		struct srp_request *req = &target->req_ring[i];
+		srp_finish_req(target, req, NULL, DID_RESET << 16);
+	}
+
 	/*
 	 * Whether or not creating a new CM ID succeeded, create a new
-	 * QP. This guarantees that all completion callback function
-	 * invocations have finished before request resetting starts.
+	 * QP. This guarantees that all QP callback function invocations have
+	 * finished before request reallocating starts.
 	 */
 	if (ret == 0)
 		ret = srp_create_target_ib(target);
 	else
 		srp_create_target_ib(target);
 
-	for (i = 0; i < target->req_ring_size; ++i) {
-		struct srp_request *req = &target->req_ring[i];
-		srp_finish_req(target, req, NULL, DID_RESET << 16);
-	}
+	/* Reallocate requests to reset the MR state in FR mode. */
+	if (ret == 0)
+		ret = srp_alloc_req_data(target);
 
 	INIT_LIST_HEAD(&target->free_tx);
 	for (i = 0; i < target->queue_size; ++i)
@@ -961,6 +1180,47 @@ static int srp_map_finish_fmr(struct srp_map_state *state,
 	return 0;
 }
 
+static int srp_map_finish_fr(struct srp_map_state *state,
+			     struct srp_target_port *target)
+{
+	struct srp_device *dev = target->srp_host->srp_dev;
+	struct ib_send_wr *bad_wr;
+	struct ib_send_wr wr;
+	struct srp_fr_desc *desc;
+	u32 rkey;
+
+	desc = srp_fr_pool_get(target->fr_pool);
+	if (!desc)
+		return -ENOMEM;
+
+	rkey = ib_inc_rkey(desc->mr->rkey);
+	ib_update_fast_reg_key(desc->mr, rkey);
+
+	memcpy(desc->frpl->page_list, state->pages,
+	       sizeof(state->pages[0]) * state->npages);
+
+	memset(&wr, 0, sizeof(wr));
+	wr.opcode = IB_WR_FAST_REG_MR;
+	wr.wr_id = FAST_REG_WR_ID_MASK;
+	wr.wr.fast_reg.iova_start = state->base_dma_addr;
+	wr.wr.fast_reg.page_list = desc->frpl;
+	wr.wr.fast_reg.page_list_len = state->npages;
+	wr.wr.fast_reg.page_shift = ilog2(dev->mr_page_size);
+	wr.wr.fast_reg.length = state->dma_len;
+	wr.wr.fast_reg.access_flags = (IB_ACCESS_LOCAL_WRITE |
+				       IB_ACCESS_REMOTE_READ |
+				       IB_ACCESS_REMOTE_WRITE);
+	wr.wr.fast_reg.rkey = desc->mr->lkey;
+
+	*state->next_fr++ = desc;
+	state->nmdesc++;
+
+	srp_map_desc(state, state->base_dma_addr, state->dma_len,
+		     desc->mr->rkey);
+
+	return ib_post_send(target->qp, &wr, &bad_wr);
+}
+
 static int srp_finish_mapping(struct srp_map_state *state,
 			      struct srp_target_port *target)
 {
@@ -969,11 +1229,13 @@ static int srp_finish_mapping(struct srp_map_state *state,
 	if (state->npages == 0)
 		return 0;
 
-	if (state->npages == 1) {
+	if (state->npages == 1 && !register_always) {
 		srp_map_desc(state, state->base_dma_addr, state->dma_len,
 			     target->rkey);
 	} else {
-		ret = srp_map_finish_fmr(state, target);
+		ret = target->srp_host->srp_dev->use_fast_reg ?
+			srp_map_finish_fr(state, target) :
+			srp_map_finish_fmr(state, target);
 	}
 
 	if (ret == 0) {
@@ -996,7 +1258,7 @@ static void srp_map_update_start(struct srp_map_state *state,
 static int srp_map_sg_entry(struct srp_map_state *state,
 			    struct srp_target_port *target,
 			    struct scatterlist *sg, int sg_index,
-			    int use_fmr)
+			    bool use_memory_registration)
 {
 	struct srp_device *dev = target->srp_host->srp_dev;
 	struct ib_device *ibdev = dev->dev;
@@ -1008,22 +1270,24 @@ static int srp_map_sg_entry(struct srp_map_state *state,
 	if (!dma_len)
 		return 0;
 
-	if (use_fmr == SRP_MAP_NO_FMR) {
-		/* Once we're in direct map mode for a request, we don't
-		 * go back to FMR mode, so no need to update anything
+	if (!use_memory_registration) {
+		/*
+		 * Once we're in direct map mode for a request, we don't
+		 * go back to FMR or FR mode, so no need to update anything
 		 * other than the descriptor.
 		 */
 		srp_map_desc(state, dma_addr, dma_len, target->rkey);
 		return 0;
 	}
 
-	/* If we start at an offset into the FMR page, don't merge into
-	 * the current FMR. Finish it out, and use the kernel's MR for this
-	 * sg entry. This is to avoid potential bugs on some SRP targets
-	 * that were never quite defined, but went away when the initiator
-	 * avoided using FMR on such page fragments.
+	/*
+	 * Since not all RDMA HW drivers support non-zero page offsets for
+	 * FMR, if we start at an offset into a page, don't merge into the
+	 * current FMR mapping. Finish it out, and use the kernel's MR for
+	 * this sg entry.
 	 */
-	if (dma_addr & ~dev->mr_page_mask || dma_len > dev->fmr_max_size) {
+	if ((!dev->use_fast_reg && (dma_addr & ~dev->mr_page_mask)) ||
+	    dma_len > target->mr_max_size) {
 		ret = srp_finish_mapping(state, target);
 		if (ret)
 			return ret;
@@ -1033,28 +1297,30 @@ static int srp_map_sg_entry(struct srp_map_state *state,
 		return 0;
 	}
 
-	/* If this is the first sg to go into the FMR, save our position.
-	 * We need to know the first unmapped entry, its index, and the
-	 * first unmapped address within that entry to be able to restart
-	 * mapping after an error.
+	/*
+	 * If this is the first sg that will be mapped via FMR or via FR, save
+	 * our position. We need to know the first unmapped entry, its index,
+	 * and the first unmapped address within that entry to be able to
+	 * restart mapping after an error.
 	 */
 	if (!state->unmapped_sg)
 		srp_map_update_start(state, sg, sg_index, dma_addr);
 
 	while (dma_len) {
-		if (state->npages == SRP_MAX_PAGES_PER_MR) {
-			ret = srp_map_finish_fmr(state, target);
+		unsigned offset = dma_addr & ~dev->mr_page_mask;
+		if (state->npages == SRP_MAX_PAGES_PER_MR || offset != 0) {
+			ret = srp_finish_mapping(state, target);
 			if (ret)
 				return ret;
 
 			srp_map_update_start(state, sg, sg_index, dma_addr);
 		}
 
-		len = min_t(unsigned int, dma_len, dev->mr_page_size);
+		len = min_t(unsigned int, dma_len, dev->mr_page_size - offset);
 
 		if (!state->npages)
 			state->base_dma_addr = dma_addr;
-		state->pages[state->npages++] = dma_addr;
+		state->pages[state->npages++] = dma_addr & dev->mr_page_mask;
 		state->dma_len += len;
 		dma_addr += len;
 		dma_len -= len;
@@ -1066,32 +1332,40 @@ static int srp_map_sg_entry(struct srp_map_state *state,
 	 */
 	ret = 0;
 	if (len != dev->mr_page_size) {
-		ret = srp_map_finish_fmr(state, target);
+		ret = srp_finish_mapping(state, target);
 		if (!ret)
 			srp_map_update_start(state, NULL, 0, 0);
 	}
 	return ret;
 }
 
-static void srp_map_fmr(struct srp_map_state *state,
-			struct srp_target_port *target, struct srp_request *req,
-			struct scatterlist *scat, int count)
+static int srp_map_sg(struct srp_map_state *state,
+		      struct srp_target_port *target, struct srp_request *req,
+		      struct scatterlist *scat, int count)
 {
 	struct srp_device *dev = target->srp_host->srp_dev;
 	struct ib_device *ibdev = dev->dev;
 	struct scatterlist *sg;
-	int i, use_fmr;
+	int i;
+	bool use_memory_registration;
 
 	state->desc	= req->indirect_desc;
 	state->pages	= req->map_page;
-	state->next_fmr	= req->fmr_list;
-
-	use_fmr = dev->fmr_pool ? SRP_MAP_ALLOW_FMR : SRP_MAP_NO_FMR;
+	if (dev->use_fast_reg) {
+		state->next_fr = req->fr.fr_list;
+		use_memory_registration = !!target->fr_pool;
+	} else {
+		state->next_fmr = req->fmr.fmr_list;
+		use_memory_registration = !!dev->fmr_pool;
+	}
 
 	for_each_sg(scat, sg, count, i) {
-		if (srp_map_sg_entry(state, target, sg, i, use_fmr)) {
-			/* FMR mapping failed, so backtrack to the first
-			 * unmapped entry and continue on without using FMR.
+		if (srp_map_sg_entry(state, target, sg, i,
+				     use_memory_registration)) {
+			/*
+			 * Memory registration failed, so backtrack to the
+			 * first unmapped entry and continue on without using
+			 * memory registration.
 			 */
 			dma_addr_t dma_addr;
 			unsigned int dma_len;
@@ -1104,15 +1378,17 @@ backtrack:
 			dma_len = ib_sg_dma_len(ibdev, sg);
 			dma_len -= (state->unmapped_addr - dma_addr);
 			dma_addr = state->unmapped_addr;
-			use_fmr = SRP_MAP_NO_FMR;
+			use_memory_registration = false;
 			srp_map_desc(state, dma_addr, dma_len, target->rkey);
 		}
 	}
 
-	if (use_fmr == SRP_MAP_ALLOW_FMR && srp_map_finish_fmr(state, target))
+	if (use_memory_registration && srp_finish_mapping(state, target))
 		goto backtrack;
 
 	req->nmdesc = state->nmdesc;
+
+	return 0;
 }
 
 static int srp_map_data(struct scsi_cmnd *scmnd, struct srp_target_port *target,
@@ -1120,7 +1396,7 @@ static int srp_map_data(struct scsi_cmnd *scmnd, struct srp_target_port *target,
 {
 	struct scatterlist *scat;
 	struct srp_cmd *cmd = req->cmd->buf;
-	int len, nents, count;
+	int len, nents, count, res;
 	struct srp_device *dev;
 	struct ib_device *ibdev;
 	struct srp_map_state state;
@@ -1152,7 +1428,7 @@ static int srp_map_data(struct scsi_cmnd *scmnd, struct srp_target_port *target,
 	fmt = SRP_DATA_DESC_DIRECT;
 	len = sizeof (struct srp_cmd) +	sizeof (struct srp_direct_buf);
 
-	if (count == 1) {
+	if (count == 1 && !register_always) {
 		/*
 		 * The midlayer only generated a single gather/scatter
 		 * entry, or DMA mapping coalesced everything to a
@@ -1169,9 +1445,9 @@ static int srp_map_data(struct scsi_cmnd *scmnd, struct srp_target_port *target,
 		goto map_complete;
 	}
 
-	/* We have more than one scatter/gather entry, so build our indirect
-	 * descriptor table, trying to merge as many entries with FMR as we
-	 * can.
+	/*
+	 * We have more than one scatter/gather entry, so build our indirect
+	 * descriptor table, trying to merge as many entries as we can.
 	 */
 	indirect_hdr = (void *) cmd->add_data;
 
@@ -1179,7 +1455,9 @@ static int srp_map_data(struct scsi_cmnd *scmnd, struct srp_target_port *target,
 				   target->indirect_size, DMA_TO_DEVICE);
 
 	memset(&state, 0, sizeof(state));
-	srp_map_fmr(&state, target, req, scat, count);
+	res = srp_map_sg(&state, target, req, scat, count);
+	if (res < 0)
+		return res;
 
 	/* We've mapped the request, now pull as much of the indirect
 	 * descriptor table as we can into the command buffer. If this
@@ -1188,7 +1466,8 @@ static int srp_map_data(struct scsi_cmnd *scmnd, struct srp_target_port *target,
 	 * give us more S/G entries than we allow.
 	 */
 	if (state.ndesc == 1) {
-		/* FMR mapping was able to collapse this to one entry,
+		/*
+		 * Memory registration collapsed the sg-list into one entry,
 		 * so use a direct descriptor.
 		 */
 		struct srp_direct_buf *buf = (void *) cmd->add_data;
@@ -1511,14 +1790,24 @@ static void srp_tl_err_work(struct work_struct *work)
 		srp_start_tl_fail_timers(target->rport);
 }
 
-static void srp_handle_qp_err(enum ib_wc_status wc_status, bool send_err,
-			      struct srp_target_port *target)
+static void srp_handle_qp_err(u64 wr_id, enum ib_wc_status wc_status,
+			      bool send_err, struct srp_target_port *target)
 {
 	if (target->connected && !target->qp_in_error) {
-		shost_printk(KERN_ERR, target->scsi_host,
-			     PFX "failed %s status %d\n",
-			     send_err ? "send" : "receive",
-			     wc_status);
+		if (wr_id & LOCAL_INV_WR_ID_MASK) {
+			shost_printk(KERN_ERR, target->scsi_host,
+				     "LOCAL_INV failed with status %d\n",
+				     wc_status);
+		} else if (wr_id & FAST_REG_WR_ID_MASK) {
+			shost_printk(KERN_ERR, target->scsi_host,
+				     "FAST_REG_MR failed with status %d\n",
+				     wc_status);
+		} else {
+			shost_printk(KERN_ERR, target->scsi_host,
+				     PFX "failed %s status %d for iu %p\n",
+				     send_err ? "send" : "receive",
+				     wc_status, (void *)(uintptr_t)wr_id);
+		}
 		queue_work(system_long_wq, &target->tl_err_work);
 	}
 	target->qp_in_error = true;
@@ -1534,7 +1823,7 @@ static void srp_recv_completion(struct ib_cq *cq, void *target_ptr)
 		if (likely(wc.status == IB_WC_SUCCESS)) {
 			srp_handle_recv(target, &wc);
 		} else {
-			srp_handle_qp_err(wc.status, false, target);
+			srp_handle_qp_err(wc.wr_id, wc.status, false, target);
 		}
 	}
 }
@@ -1550,7 +1839,7 @@ static void srp_send_completion(struct ib_cq *cq, void *target_ptr)
 			iu = (struct srp_iu *) (uintptr_t) wc.wr_id;
 			list_add(&iu->list, &target->free_tx);
 		} else {
-			srp_handle_qp_err(wc.status, true, target);
+			srp_handle_qp_err(wc.wr_id, wc.status, true, target);
 		}
 	}
 }
@@ -2873,6 +3162,7 @@ static void srp_add_one(struct ib_device *device)
 	struct ib_device_attr *dev_attr;
 	struct srp_host *host;
 	int mr_page_shift, s, e, p;
+	bool have_fmr = false, have_fr = false;
 
 	dev_attr = kmalloc(sizeof *dev_attr, GFP_KERNEL);
 	if (!dev_attr)
@@ -2887,6 +3177,19 @@ static void srp_add_one(struct ib_device *device)
 	if (!srp_dev)
 		goto free_attr;
 
+	if (device->alloc_fmr && device->dealloc_fmr && device->map_phys_fmr &&
+	    device->unmap_fmr) {
+		have_fmr = true;
+	}
+	if (dev_attr->device_cap_flags & IB_DEVICE_MEM_MGT_EXTENSIONS)
+		have_fr = true;
+	if (!have_fmr && !have_fr) {
+		dev_err(&device->dev, "neither FMR nor FR is supported\n");
+		goto free_dev;
+	}
+
+	srp_dev->use_fast_reg = have_fr && (!have_fmr || prefer_fr);
+
 	/*
 	 * Use the smallest page size supported by the HCA, down to a
 	 * minimum of 4096 bytes. We're unlikely to build large sglists
@@ -2910,7 +3213,8 @@ static void srp_add_one(struct ib_device *device)
 	if (IS_ERR(srp_dev->mr))
 		goto err_pd;
 
-	srp_alloc_fmr_pool(srp_dev);
+	if (!srp_dev->use_fast_reg)
+		srp_alloc_fmr_pool(srp_dev);
 
 	if (device->node_type == RDMA_NODE_IB_SWITCH) {
 		s = 0;
@@ -2974,7 +3278,7 @@ static void srp_remove_one(struct ib_device *device)
 		kfree(host);
 	}
 
-	if (srp_dev->fmr_pool)
+	if (!srp_dev->use_fast_reg && srp_dev->fmr_pool)
 		ib_destroy_fmr_pool(srp_dev->fmr_pool);
 	ib_dereg_mr(srp_dev->mr);
 	ib_dealloc_pd(srp_dev->pd);
diff --git a/drivers/infiniband/ulp/srp/ib_srp.h b/drivers/infiniband/ulp/srp/ib_srp.h
index 89e3adb..4ec44b5 100644
--- a/drivers/infiniband/ulp/srp/ib_srp.h
+++ b/drivers/infiniband/ulp/srp/ib_srp.h
@@ -71,8 +71,8 @@ enum {
 	SRP_MDESC_PER_POOL	= 1024,
 	SRP_FMR_DIRTY_SIZE	= SRP_MDESC_PER_POOL / 4,
 
-	SRP_MAP_ALLOW_FMR	= 0,
-	SRP_MAP_NO_FMR		= 1,
+	LOCAL_INV_WR_ID_MASK	= 1,
+	FAST_REG_WR_ID_MASK	= 2,
 };
 
 enum srp_target_state {
@@ -86,6 +86,12 @@ enum srp_iu_type {
 	SRP_IU_RSP,
 };
 
+/*
+ * @mr_page_mask: HCA memory registration page mask.
+ * @mr_page_size: HCA memory registration page size.
+ * @fmr_max_size: Maximum size of a single HCA memory registration request
+ *                when using FMR.
+ */
 struct srp_device {
 	struct list_head	dev_list;
 	struct ib_device       *dev;
@@ -95,6 +101,7 @@ struct srp_device {
 	u64			mr_page_mask;
 	int			mr_page_size;
 	int			fmr_max_size;
+	bool			use_fast_reg;
 };
 
 struct srp_host {
@@ -108,18 +115,32 @@ struct srp_host {
 	struct mutex		add_target_mutex;
 };
 
+/*
+ * In the union below 'fmr' stands for 'Fast Memory Registration' and fr for
+ * 'Fast Registration'.
+ */
 struct srp_request {
 	struct list_head	list;
 	struct scsi_cmnd       *scmnd;
 	struct srp_iu	       *cmd;
-	struct ib_pool_fmr    **fmr_list;
 	u64		       *map_page;
 	struct srp_direct_buf  *indirect_desc;
 	dma_addr_t		indirect_dma_addr;
 	short			nmdesc;
 	short			index;
+	union {
+		struct {
+			struct ib_pool_fmr **fmr_list;
+		} fmr;
+		struct {
+			struct srp_fr_desc **fr_list;
+		} fr;
+	};
 };
 
+/*
+ * @mr_max_size: Maximum size of a single HCA memory registration request.
+ */
 struct srp_target_port {
 	/* These are RW in the hot path, and commonly used together */
 	struct list_head	free_tx;
@@ -131,6 +152,8 @@ struct srp_target_port {
 	struct ib_cq	       *send_cq ____cacheline_aligned_in_smp;
 	struct ib_cq	       *recv_cq;
 	struct ib_qp	       *qp;
+	struct srp_fr_pool     *fr_pool;
+	int			mr_max_size;
 	u32			lkey;
 	u32			rkey;
 	enum srp_target_state	state;
@@ -197,8 +220,59 @@ struct srp_iu {
 	enum dma_data_direction	direction;
 };
 
+/**
+ * struct srp_fr_desc - fast registration work request arguments
+ * @entry: Entry in free_list.
+ * @mr:    Memory region.
+ * @frpl:  Fast registration page list.
+ */
+struct srp_fr_desc {
+	struct list_head		entry;
+	struct ib_mr			*mr;
+	struct ib_fast_reg_page_list	*frpl;
+};
+
+/**
+ * struct srp_fr_pool - pool of fast registration descriptors
+ *
+ * An entry is available for allocation if and only if it occurs in @free_list.
+ *
+ * @size:      Number of descriptors in this pool.
+ * @max_page_list_len: Maximum fast registration work request page list length.
+ * @lock:      Protects free_list.
+ * @free_list: List of free descriptors.
+ * @desc:      Fast registration descriptor pool.
+ */
+struct srp_fr_pool {
+	int			size;
+	int			max_page_list_len;
+	spinlock_t		lock;
+	struct list_head	free_list;
+	struct srp_fr_desc	desc[0];
+};
+
+/**
+ * struct srp_map_state - per-request DMA memory mapping state
+ * @desc:	    Pointer to the element of the SRP buffer descriptor array
+ *		    that is being filled in.
+ * @pages:	    Array with DMA addresses of pages being considered for
+ *		    memory registration.
+ * @base_dma_addr:  DMA address of the first page that has not yet been mapped.
+ * @dma_len:	    Number of bytes that will be registered with the next
+ *		    FMR or FR memory registration call.
+ * @total_len:	    Total number of bytes in the sg-list being mapped.
+ * @npages:	    Number of page addresses in the pages[] array.
+ * @nmdesc:	    Number of FMR or FR memory descriptors used for mapping.
+ * @ndesc:	    Number of SRP buffer descriptors that have been filled in.
+ * @unmapped_sg:    First element of the sg-list that is mapped via FMR or FR.
+ * @unmapped_index: Index of the first element mapped via FMR or FR.
+ * @unmapped_addr:  DMA address of the first element mapped via FMR or FR.
+ */
 struct srp_map_state {
-	struct ib_pool_fmr    **next_fmr;
+	union {
+		struct ib_pool_fmr **next_fmr;
+		struct srp_fr_desc **next_fr;
+	};
 	struct srp_direct_buf  *desc;
 	u64		       *pages;
 	dma_addr_t		base_dma_addr;
-- 
1.8.4.5

--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

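The srp_fr_pool layout introduced above — a flexible array of descriptors where an entry is available if and only if it sits on the free list — can be sketched in plain user-space C. The sketch below is an illustration with invented names (`struct desc`, `pool_get`/`pool_put`); the real pool additionally protects the list with a spinlock and stores an `ib_mr` and fast-registration page list per descriptor:

```c
#include <assert.h>
#include <stdlib.h>

/*
 * Minimal model of the srp_fr_pool idea: a pool of descriptors with an
 * intrusive free list. A descriptor is free iff it is on the free list.
 * Locking is omitted; this is not the driver code.
 */
struct desc {
	struct desc *next;	/* takes the role of the list_head entry */
	int id;
};

struct pool {
	int size;
	struct desc *free_list;
	struct desc desc[];	/* flexible array member, like desc[0] above */
};

static struct pool *pool_create(int size)
{
	struct pool *p = calloc(1, sizeof(*p) + size * sizeof(p->desc[0]));
	int i;

	if (!p)
		return NULL;
	p->size = size;
	for (i = 0; i < size; i++) {
		p->desc[i].id = i;
		p->desc[i].next = p->free_list;	/* push onto the free list */
		p->free_list = &p->desc[i];
	}
	return p;
}

static struct desc *pool_get(struct pool *p)
{
	struct desc *d = p->free_list;

	if (d)
		p->free_list = d->next;		/* pop: entry is now in use */
	return d;
}

static void pool_put(struct pool *p, struct desc *d)
{
	d->next = p->free_list;			/* push: entry is free again */
	p->free_list = d;
}
```

Allocating the descriptor array together with the pool header in one allocation mirrors the `desc[0]` trailing-array trick in the patch.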

* Re: [PATCH 0/9] SRP initiator patches for kernel 3.16
       [not found] ` <5368DA5B.80609-HInyCGIudOg@public.gmane.org>
                     ` (8 preceding siblings ...)
  2014-05-06 12:56   ` [PATCH 9/9] IB/srp: Add fast registration support Bart Van Assche
@ 2014-05-06 14:06   ` Jack Wang
       [not found]     ` <5368EC54.8020802-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
  2014-05-06 14:21   ` [PATCH 3/9] IB/srp: Introduce srp_alloc_fmr_pool() Bart Van Assche
  10 siblings, 1 reply; 32+ messages in thread
From: Jack Wang @ 2014-05-06 14:06 UTC (permalink / raw)
  To: Bart Van Assche
  Cc: Roland Dreier, Sagi Grimberg, Vu Pham, David Dillow,
	Sebastian Parschauer, linux-rdma

On 05/06/2014 02:49 PM, Bart Van Assche wrote:
> This patch series consists of one patch that adds fast registration
> support to the SRP initiator and eight preparation patches:
> 
> 0001-IB-srp-Fix-kernel-doc-warnings.patch
> 0002-IB-srp-Introduce-an-additional-local-variable.patch
> 0003-IB-srp-Introduce-srp_alloc_fmr_pool.patch
> 0004-IB-srp-Introduce-srp_map_fmr.patch
> 0005-IB-srp-Introduce-srp_finish_mapping.patch
> 0006-IB-srp-Make-srp_alloc_req_data-reallocate-request-da.patch
> 0007-IB-srp-Avoid-triggering-an-infinite-loop-if-memory-m.patch
> 0008-IB-srp-Rename-FMR-related-variables.patch
> 0009-IB-srp-Add-fast-registration-support.patch
> 

Hi Bart,

The 3rd patch does not show up on the mailing list; could you resend it?
The first 2 look clear :)

Jack



* [PATCH 3/9] IB/srp: Introduce srp_alloc_fmr_pool()
       [not found] ` <5368DA5B.80609-HInyCGIudOg@public.gmane.org>
                     ` (9 preceding siblings ...)
  2014-05-06 14:06   ` [PATCH 0/9] SRP initiator patches for kernel 3.16 Jack Wang
@ 2014-05-06 14:21   ` Bart Van Assche
  10 siblings, 0 replies; 32+ messages in thread
From: Bart Van Assche @ 2014-05-06 14:21 UTC (permalink / raw)
  To: Roland Dreier
  Cc: Sagi Grimberg, Vu Pham, David Dillow, Sebastian Parschauer, linux-rdma

Introduce the srp_alloc_fmr_pool() function. Only set
srp_dev->fmr_max_size if FMR pool creation succeeded. This change is
safe since that variable is only used if FMR pool creation succeeded.

Signed-off-by: Bart Van Assche <bvanassche-HInyCGIudOg@public.gmane.org>
Cc: Roland Dreier <roland-BHEL68pLQRGGvPXPguhicg@public.gmane.org>
Cc: David Dillow <dave-i1Mk8JYDVaaSihdK6806/g@public.gmane.org>
Cc: Sagi Grimberg <sagig-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
Cc: Vu Pham <vu-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
Cc: Sebastian Parschauer <sebastian.riemer-EIkl63zCoXaH+58JC4qpiA@public.gmane.org>
---
 drivers/infiniband/ulp/srp/ib_srp.c | 56 ++++++++++++++++++++++---------------
 1 file changed, 33 insertions(+), 23 deletions(-)

diff --git a/drivers/infiniband/ulp/srp/ib_srp.c b/drivers/infiniband/ulp/srp/ib_srp.c
index 8c03371..f41cc8c 100644
--- a/drivers/infiniband/ulp/srp/ib_srp.c
+++ b/drivers/infiniband/ulp/srp/ib_srp.c
@@ -2792,13 +2792,43 @@ free_host:
 	return NULL;
 }
 
+static void srp_alloc_fmr_pool(struct srp_device *srp_dev)
+{
+	int max_pages_per_mr;
+	struct ib_fmr_pool_param fmr_param;
+	struct ib_fmr_pool *pool;
+
+	srp_dev->fmr_pool = NULL;
+
+	for (max_pages_per_mr = SRP_FMR_SIZE;
+	     max_pages_per_mr >= SRP_FMR_MIN_SIZE;
+	     max_pages_per_mr /= 2) {
+		memset(&fmr_param, 0, sizeof(fmr_param));
+		fmr_param.pool_size	    = SRP_FMR_POOL_SIZE;
+		fmr_param.dirty_watermark   = SRP_FMR_DIRTY_SIZE;
+		fmr_param.cache		    = 1;
+		fmr_param.max_pages_per_fmr = max_pages_per_mr;
+		fmr_param.page_shift	    = ilog2(srp_dev->fmr_page_size);
+		fmr_param.access	    = (IB_ACCESS_LOCAL_WRITE |
+					       IB_ACCESS_REMOTE_WRITE |
+					       IB_ACCESS_REMOTE_READ);
+
+		pool = ib_create_fmr_pool(srp_dev->pd, &fmr_param);
+		if (!IS_ERR(pool)) {
+			srp_dev->fmr_pool = pool;
+			srp_dev->fmr_max_size =
+				srp_dev->fmr_page_size * max_pages_per_mr;
+			break;
+		}
+	}
+}
+
 static void srp_add_one(struct ib_device *device)
 {
 	struct srp_device *srp_dev;
 	struct ib_device_attr *dev_attr;
-	struct ib_fmr_pool_param fmr_param;
 	struct srp_host *host;
-	int max_pages_per_fmr, fmr_page_shift, s, e, p;
+	int fmr_page_shift, s, e, p;
 
 	dev_attr = kmalloc(sizeof *dev_attr, GFP_KERNEL);
 	if (!dev_attr)
@@ -2821,7 +2851,6 @@ static void srp_add_one(struct ib_device *device)
 	fmr_page_shift		= max(12, ffs(dev_attr->page_size_cap) - 1);
 	srp_dev->fmr_page_size	= 1 << fmr_page_shift;
 	srp_dev->fmr_page_mask	= ~((u64) srp_dev->fmr_page_size - 1);
-	srp_dev->fmr_max_size	= srp_dev->fmr_page_size * SRP_FMR_SIZE;
  	INIT_LIST_HEAD(&srp_dev->dev_list);
 
@@ -2837,26 +2866,7 @@ static void srp_add_one(struct ib_device *device)
 	if (IS_ERR(srp_dev->mr))
 		goto err_pd;
 
-	for (max_pages_per_fmr = SRP_FMR_SIZE;
-			max_pages_per_fmr >= SRP_FMR_MIN_SIZE;
-			max_pages_per_fmr /= 2, srp_dev->fmr_max_size /= 2) {
-		memset(&fmr_param, 0, sizeof fmr_param);
-		fmr_param.pool_size	    = SRP_FMR_POOL_SIZE;
-		fmr_param.dirty_watermark   = SRP_FMR_DIRTY_SIZE;
-		fmr_param.cache		    = 1;
-		fmr_param.max_pages_per_fmr = max_pages_per_fmr;
-		fmr_param.page_shift	    = fmr_page_shift;
-		fmr_param.access	    = (IB_ACCESS_LOCAL_WRITE |
-					       IB_ACCESS_REMOTE_WRITE |
-					       IB_ACCESS_REMOTE_READ);
-
-		srp_dev->fmr_pool = ib_create_fmr_pool(srp_dev->pd, &fmr_param);
-		if (!IS_ERR(srp_dev->fmr_pool))
-			break;
-	}
-
-	if (IS_ERR(srp_dev->fmr_pool))
-		srp_dev->fmr_pool = NULL;
+	srp_alloc_fmr_pool(srp_dev);
 
 	if (device->node_type == RDMA_NODE_IB_SWITCH) {
 		s = 0;
-- 
1.8.4.5


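The retry loop in srp_alloc_fmr_pool() above can be modelled in user-space C: start at the largest page count per mapping and halve it until pool creation succeeds or a minimum is reached. The constants and the create_pool() stand-in below are invented for illustration; the real code calls ib_create_fmr_pool() and checks IS_ERR():

```c
#include <assert.h>

/* Made-up limits; the driver uses SRP_FMR_SIZE / SRP_FMR_MIN_SIZE. */
enum { FMR_SIZE = 256, FMR_MIN_SIZE = 4, HCA_MAX_PAGES = 100 };

/* Stand-in for ib_create_fmr_pool(): fail if the HCA cannot do it. */
static int create_pool(int max_pages_per_mr)
{
	return max_pages_per_mr <= HCA_MAX_PAGES ? 0 : -1;
}

/* Returns the page count actually used, or 0 if even the minimum failed. */
static int alloc_pool(void)
{
	int max_pages_per_mr;

	for (max_pages_per_mr = FMR_SIZE;
	     max_pages_per_mr >= FMR_MIN_SIZE;
	     max_pages_per_mr /= 2) {
		if (create_pool(max_pages_per_mr) == 0)
			return max_pages_per_mr;
	}
	return 0;
}
```

Note that, as in the patch, the maximum mapping size is derived from the value that actually succeeded, which is why fmr_max_size is only set inside the success branch.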

* Re: [PATCH 0/9] SRP initiator patches for kernel 3.16
       [not found]     ` <5368EC54.8020802-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
@ 2014-05-06 14:26       ` Bart Van Assche
  0 siblings, 0 replies; 32+ messages in thread
From: Bart Van Assche @ 2014-05-06 14:26 UTC (permalink / raw)
  To: Jack Wang
  Cc: Roland Dreier, Sagi Grimberg, Vu Pham, David Dillow,
	Sebastian Parschauer, linux-rdma

On 05/06/14 16:06, Jack Wang wrote:
> On 05/06/2014 02:49 PM, Bart Van Assche wrote:
>> This patch series consists of one patch that adds fast registration
>> support to the SRP initiator and eight preparation patches:
>>
>> 0001-IB-srp-Fix-kernel-doc-warnings.patch
>> 0002-IB-srp-Introduce-an-additional-local-variable.patch
>> 0003-IB-srp-Introduce-srp_alloc_fmr_pool.patch
>> 0004-IB-srp-Introduce-srp_map_fmr.patch
>> 0005-IB-srp-Introduce-srp_finish_mapping.patch
>> 0006-IB-srp-Make-srp_alloc_req_data-reallocate-request-da.patch
>> 0007-IB-srp-Avoid-triggering-an-infinite-loop-if-memory-m.patch
>> 0008-IB-srp-Rename-FMR-related-variables.patch
>> 0009-IB-srp-Add-fast-registration-support.patch
> 
> The 3rd patch does not show up on the mailing list; could you resend it?
> The first 2 look clear :)

Hello Jack,

That's strange ... anyway, thanks for the notification. I have resent
patch 3/9.

Bart.



* Re: [PATCH 1/9] IB/srp: Fix kernel-doc warnings
       [not found]     ` <5368DAB2.2070006-HInyCGIudOg@public.gmane.org>
@ 2014-05-07 10:35       ` Sagi Grimberg
  0 siblings, 0 replies; 32+ messages in thread
From: Sagi Grimberg @ 2014-05-07 10:35 UTC (permalink / raw)
  To: Bart Van Assche, Roland Dreier
  Cc: Sagi Grimberg, Vu Pham, David Dillow, Sebastian Parschauer, linux-rdma

On 5/6/2014 3:50 PM, Bart Van Assche wrote:
> Avoid that the kernel-doc tool warns about missing argument descriptions
> for the ib_srp.[ch] source files.
>
> Signed-off-by: Bart Van Assche <bvanassche-HInyCGIudOg@public.gmane.org>
> Cc: Roland Dreier <roland-BHEL68pLQRGGvPXPguhicg@public.gmane.org>
> Cc: David Dillow <dave-i1Mk8JYDVaaSihdK6806/g@public.gmane.org>
> Cc: Sagi Grimberg <sagig-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
> Cc: Vu Pham <vu-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
> Cc: Sebastian Parschauer <sebastian.riemer-EIkl63zCoXaH+58JC4qpiA@public.gmane.org>
> ---
>   drivers/infiniband/ulp/srp/ib_srp.c | 7 +++++++
>   1 file changed, 7 insertions(+)
>
> diff --git a/drivers/infiniband/ulp/srp/ib_srp.c b/drivers/infiniband/ulp/srp/ib_srp.c
> index 66a908b..cf80f7a 100644
> --- a/drivers/infiniband/ulp/srp/ib_srp.c
> +++ b/drivers/infiniband/ulp/srp/ib_srp.c
> @@ -813,6 +813,10 @@ static struct scsi_cmnd *srp_claim_req(struct srp_target_port *target,
>   
>   /**
>    * srp_free_req() - Unmap data and add request to the free request list.
> + * @target: SRP target port.
> + * @req:    Request to be freed.
> + * @scmnd:  SCSI command associated with @req.
> + * @req_lim_delta: Amount to be added to @target->req_lim.
>    */
>   static void srp_free_req(struct srp_target_port *target,
>   			 struct srp_request *req, struct scsi_cmnd *scmnd,
> @@ -1455,6 +1459,7 @@ static void srp_handle_recv(struct srp_target_port *target, struct ib_wc *wc)
>   
>   /**
>    * srp_tl_err_work() - handle a transport layer error
> + * @work: Work structure embedded in an SRP target port.
>    *
>    * Note: This function may get invoked before the rport has been created,
>    * hence the target->rport test.
> @@ -2310,6 +2315,8 @@ static struct class srp_class = {
>   
>   /**
>    * srp_conn_unique() - check whether the connection to a target is unique
> + * @host:   SRP host.
> + * @target: SRP target port.
>    */
>   static bool srp_conn_unique(struct srp_host *host,
>   			    struct srp_target_port *target)

Looks good,

Reviewed-by: Sagi Grimberg <sagig-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>


* Re: [PATCH 2/9] IB/srp: Introduce an additional local variable
       [not found]     ` <5368DADF.2070105-HInyCGIudOg@public.gmane.org>
@ 2014-05-07 10:37       ` Sagi Grimberg
       [not found]         ` <536A0CF7.7080307-LDSdmyG8hGV8YrgS2mwiifqBs+8SCbDb@public.gmane.org>
  0 siblings, 1 reply; 32+ messages in thread
From: Sagi Grimberg @ 2014-05-07 10:37 UTC (permalink / raw)
  To: Bart Van Assche, Roland Dreier
  Cc: Sagi Grimberg, Vu Pham, David Dillow, Sebastian Parschauer, linux-rdma

On 5/6/2014 3:51 PM, Bart Van Assche wrote:
> This patch does not change any functionality.
>
> Signed-off-by: Bart Van Assche <bvanassche-HInyCGIudOg@public.gmane.org>
> Cc: Roland Dreier <roland-BHEL68pLQRGGvPXPguhicg@public.gmane.org>
> Cc: David Dillow <dave-i1Mk8JYDVaaSihdK6806/g@public.gmane.org>
> Cc: Sagi Grimberg <sagig-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
> Cc: Vu Pham <vu-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
> Cc: Sebastian Parschauer <sebastian.riemer-EIkl63zCoXaH+58JC4qpiA@public.gmane.org>
> ---
>   drivers/infiniband/ulp/srp/ib_srp.c | 9 ++++-----
>   1 file changed, 4 insertions(+), 5 deletions(-)
>
> diff --git a/drivers/infiniband/ulp/srp/ib_srp.c b/drivers/infiniband/ulp/srp/ib_srp.c
> index cf80f7a..8c03371 100644
> --- a/drivers/infiniband/ulp/srp/ib_srp.c
> +++ b/drivers/infiniband/ulp/srp/ib_srp.c
> @@ -290,6 +290,7 @@ static int srp_new_cm_id(struct srp_target_port *target)
>   
>   static int srp_create_target_ib(struct srp_target_port *target)
>   {
> +	struct srp_device *dev = target->srp_host->srp_dev;
>   	struct ib_qp_init_attr *init_attr;
>   	struct ib_cq *recv_cq, *send_cq;
>   	struct ib_qp *qp;
> @@ -299,16 +300,14 @@ static int srp_create_target_ib(struct srp_target_port *target)
>   	if (!init_attr)
>   		return -ENOMEM;
>   
> -	recv_cq = ib_create_cq(target->srp_host->srp_dev->dev,
> -			       srp_recv_completion, NULL, target,
> +	recv_cq = ib_create_cq(dev->dev, srp_recv_completion, NULL, target,
>   			       target->queue_size, target->comp_vector);
>   	if (IS_ERR(recv_cq)) {
>   		ret = PTR_ERR(recv_cq);
>   		goto err;
>   	}
>   
> -	send_cq = ib_create_cq(target->srp_host->srp_dev->dev,
> -			       srp_send_completion, NULL, target,
> +	send_cq = ib_create_cq(dev->dev, srp_send_completion, NULL, target,
>   			       target->queue_size, target->comp_vector);
>   	if (IS_ERR(send_cq)) {
>   		ret = PTR_ERR(send_cq);
> @@ -327,7 +326,7 @@ static int srp_create_target_ib(struct srp_target_port *target)
>   	init_attr->send_cq             = send_cq;
>   	init_attr->recv_cq             = recv_cq;
>   
> -	qp = ib_create_qp(target->srp_host->srp_dev->pd, init_attr);
> +	qp = ib_create_qp(dev->pd, init_attr);
>   	if (IS_ERR(qp)) {
>   		ret = PTR_ERR(qp);
>   		goto err_send_cq;

I understand why you need it later, but I'm not sure that use_fast_reg
should be a device attribute.

Sagi.


* Re: [PATCH 3/9] IB/srp: Introduce srp_alloc_fmr_pool()
       [not found]     ` <5368DB05.4070600-HInyCGIudOg@public.gmane.org>
@ 2014-05-07 10:38       ` Sagi Grimberg
  0 siblings, 0 replies; 32+ messages in thread
From: Sagi Grimberg @ 2014-05-07 10:38 UTC (permalink / raw)
  To: Bart Van Assche, Roland Dreier
  Cc: Sagi Grimberg, Vu Pham, David Dillow, Sebastian Parschauer, linux-rdma

On 5/6/2014 3:52 PM, Bart Van Assche wrote:
> Introduce the srp_alloc_fmr_pool() function. Only set
> srp_dev->fmr_max_size if FMR pool creation succeeded. This change is
> safe since that variable is only used if FMR pool creation succeeded.
>
> Signed-off-by: Bart Van Assche <bvanassche-HInyCGIudOg@public.gmane.org>
> Cc: Roland Dreier <roland-BHEL68pLQRGGvPXPguhicg@public.gmane.org>
> Cc: David Dillow <dave-i1Mk8JYDVaaSihdK6806/g@public.gmane.org>
> Cc: Sagi Grimberg <sagig-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
> Cc: Vu Pham <vu-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
> Cc: Sebastian Parschauer <sebastian.riemer-EIkl63zCoXaH+58JC4qpiA@public.gmane.org>
> ---
>   drivers/infiniband/ulp/srp/ib_srp.c | 56 ++++++++++++++++++++++---------------
>   1 file changed, 33 insertions(+), 23 deletions(-)
>
> diff --git a/drivers/infiniband/ulp/srp/ib_srp.c b/drivers/infiniband/ulp/srp/ib_srp.c
> index 8c03371..f41cc8c 100644
> --- a/drivers/infiniband/ulp/srp/ib_srp.c
> +++ b/drivers/infiniband/ulp/srp/ib_srp.c
> @@ -2792,13 +2792,43 @@ free_host:
>   	return NULL;
>   }
>   
> +static void srp_alloc_fmr_pool(struct srp_device *srp_dev)
> +{
> +	int max_pages_per_mr;
> +	struct ib_fmr_pool_param fmr_param;
> +	struct ib_fmr_pool *pool;
> +
> +	srp_dev->fmr_pool = NULL;
> +
> +	for (max_pages_per_mr = SRP_FMR_SIZE;
> +	     max_pages_per_mr >= SRP_FMR_MIN_SIZE;
> +	     max_pages_per_mr /= 2) {
> +		memset(&fmr_param, 0, sizeof(fmr_param));
> +		fmr_param.pool_size	    = SRP_FMR_POOL_SIZE;
> +		fmr_param.dirty_watermark   = SRP_FMR_DIRTY_SIZE;
> +		fmr_param.cache		    = 1;
> +		fmr_param.max_pages_per_fmr = max_pages_per_mr;
> +		fmr_param.page_shift	    = ilog2(srp_dev->fmr_page_size);
> +		fmr_param.access	    = (IB_ACCESS_LOCAL_WRITE |
> +					       IB_ACCESS_REMOTE_WRITE |
> +					       IB_ACCESS_REMOTE_READ);
> +
> +		pool = ib_create_fmr_pool(srp_dev->pd, &fmr_param);
> +		if (!IS_ERR(pool)) {
> +			srp_dev->fmr_pool = pool;
> +			srp_dev->fmr_max_size =
> +				srp_dev->fmr_page_size * max_pages_per_mr;
> +			break;
> +		}
> +	}
> +}
> +
>   static void srp_add_one(struct ib_device *device)
>   {
>   	struct srp_device *srp_dev;
>   	struct ib_device_attr *dev_attr;
> -	struct ib_fmr_pool_param fmr_param;
>   	struct srp_host *host;
> -	int max_pages_per_fmr, fmr_page_shift, s, e, p;
> +	int fmr_page_shift, s, e, p;
>   
>   	dev_attr = kmalloc(sizeof *dev_attr, GFP_KERNEL);
>   	if (!dev_attr)
> @@ -2821,7 +2851,6 @@ static void srp_add_one(struct ib_device *device)
>   	fmr_page_shift		= max(12, ffs(dev_attr->page_size_cap) - 1);
>   	srp_dev->fmr_page_size	= 1 << fmr_page_shift;
>   	srp_dev->fmr_page_mask	= ~((u64) srp_dev->fmr_page_size - 1);
> -	srp_dev->fmr_max_size	= srp_dev->fmr_page_size * SRP_FMR_SIZE;
>   
>   	INIT_LIST_HEAD(&srp_dev->dev_list);
>   
> @@ -2837,26 +2866,7 @@ static void srp_add_one(struct ib_device *device)
>   	if (IS_ERR(srp_dev->mr))
>   		goto err_pd;
>   
> -	for (max_pages_per_fmr = SRP_FMR_SIZE;
> -			max_pages_per_fmr >= SRP_FMR_MIN_SIZE;
> -			max_pages_per_fmr /= 2, srp_dev->fmr_max_size /= 2) {
> -		memset(&fmr_param, 0, sizeof fmr_param);
> -		fmr_param.pool_size	    = SRP_FMR_POOL_SIZE;
> -		fmr_param.dirty_watermark   = SRP_FMR_DIRTY_SIZE;
> -		fmr_param.cache		    = 1;
> -		fmr_param.max_pages_per_fmr = max_pages_per_fmr;
> -		fmr_param.page_shift	    = fmr_page_shift;
> -		fmr_param.access	    = (IB_ACCESS_LOCAL_WRITE |
> -					       IB_ACCESS_REMOTE_WRITE |
> -					       IB_ACCESS_REMOTE_READ);
> -
> -		srp_dev->fmr_pool = ib_create_fmr_pool(srp_dev->pd, &fmr_param);
> -		if (!IS_ERR(srp_dev->fmr_pool))
> -			break;
> -	}
> -
> -	if (IS_ERR(srp_dev->fmr_pool))
> -		srp_dev->fmr_pool = NULL;
> +	srp_alloc_fmr_pool(srp_dev);
>   
>   	if (device->node_type == RDMA_NODE_IB_SWITCH) {
>   		s = 0;

Looks good,

Reviewed-by: Sagi Grimberg <sagig-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>


* Re: [PATCH 4/9] IB/srp: Introduce srp_map_fmr()
       [not found]     ` <5368DB2F.8020506-HInyCGIudOg@public.gmane.org>
@ 2014-05-07 10:39       ` Sagi Grimberg
  0 siblings, 0 replies; 32+ messages in thread
From: Sagi Grimberg @ 2014-05-07 10:39 UTC (permalink / raw)
  To: Bart Van Assche, Roland Dreier
  Cc: Sagi Grimberg, Vu Pham, David Dillow, Sebastian Parschauer, linux-rdma

On 5/6/2014 3:53 PM, Bart Van Assche wrote:
> This patch does not change any functionality.
>
> Signed-off-by: Bart Van Assche <bvanassche-HInyCGIudOg@public.gmane.org>
> Cc: Roland Dreier <roland-BHEL68pLQRGGvPXPguhicg@public.gmane.org>
> Cc: David Dillow <dave-i1Mk8JYDVaaSihdK6806/g@public.gmane.org>
> Cc: Sagi Grimberg <sagig-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
> Cc: Vu Pham <vu-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
> Cc: Sebastian Parschauer <sebastian.riemer-EIkl63zCoXaH+58JC4qpiA@public.gmane.org>
> ---
>   drivers/infiniband/ulp/srp/ib_srp.c | 77 ++++++++++++++++++++++---------------
>   1 file changed, 45 insertions(+), 32 deletions(-)
>
> diff --git a/drivers/infiniband/ulp/srp/ib_srp.c b/drivers/infiniband/ulp/srp/ib_srp.c
> index f41cc8c..5fb607b 100644
> --- a/drivers/infiniband/ulp/srp/ib_srp.c
> +++ b/drivers/infiniband/ulp/srp/ib_srp.c
> @@ -1047,12 +1047,54 @@ static int srp_map_sg_entry(struct srp_map_state *state,
>   	return ret;
>   }
>   
> +static void srp_map_fmr(struct srp_map_state *state,
> +			struct srp_target_port *target, struct srp_request *req,
> +			struct scatterlist *scat, int count)
> +{
> +	struct srp_device *dev = target->srp_host->srp_dev;
> +	struct ib_device *ibdev = dev->dev;
> +	struct scatterlist *sg;
> +	int i, use_fmr;
> +
> +	state->desc	= req->indirect_desc;
> +	state->pages	= req->map_page;
> +	state->next_fmr	= req->fmr_list;
> +
> +	use_fmr = dev->fmr_pool ? SRP_MAP_ALLOW_FMR : SRP_MAP_NO_FMR;
> +
> +	for_each_sg(scat, sg, count, i) {
> +		if (srp_map_sg_entry(state, target, sg, i, use_fmr)) {
> +			/* FMR mapping failed, so backtrack to the first
> +			 * unmapped entry and continue on without using FMR.
> +			 */
> +			dma_addr_t dma_addr;
> +			unsigned int dma_len;
> +
> +backtrack:
> +			sg = state->unmapped_sg;
> +			i = state->unmapped_index;
> +
> +			dma_addr = ib_sg_dma_address(ibdev, sg);
> +			dma_len = ib_sg_dma_len(ibdev, sg);
> +			dma_len -= (state->unmapped_addr - dma_addr);
> +			dma_addr = state->unmapped_addr;
> +			use_fmr = SRP_MAP_NO_FMR;
> +			srp_map_desc(state, dma_addr, dma_len, target->rkey);
> +		}
> +	}
> +
> +	if (use_fmr == SRP_MAP_ALLOW_FMR && srp_map_finish_fmr(state, target))
> +		goto backtrack;
> +
> +	req->nfmr = state->nfmr;
> +}
> +
>   static int srp_map_data(struct scsi_cmnd *scmnd, struct srp_target_port *target,
>   			struct srp_request *req)
>   {
> -	struct scatterlist *scat, *sg;
> +	struct scatterlist *scat;
>   	struct srp_cmd *cmd = req->cmd->buf;
> -	int i, len, nents, count, use_fmr;
> +	int len, nents, count;
>   	struct srp_device *dev;
>   	struct ib_device *ibdev;
>   	struct srp_map_state state;
> @@ -1111,35 +1153,7 @@ static int srp_map_data(struct scsi_cmnd *scmnd, struct srp_target_port *target,
>   				   target->indirect_size, DMA_TO_DEVICE);
>   
>   	memset(&state, 0, sizeof(state));
> -	state.desc	= req->indirect_desc;
> -	state.pages	= req->map_page;
> -	state.next_fmr	= req->fmr_list;
> -
> -	use_fmr = dev->fmr_pool ? SRP_MAP_ALLOW_FMR : SRP_MAP_NO_FMR;
> -
> -	for_each_sg(scat, sg, count, i) {
> -		if (srp_map_sg_entry(&state, target, sg, i, use_fmr)) {
> -			/* FMR mapping failed, so backtrack to the first
> -			 * unmapped entry and continue on without using FMR.
> -			 */
> -			dma_addr_t dma_addr;
> -			unsigned int dma_len;
> -
> -backtrack:
> -			sg = state.unmapped_sg;
> -			i = state.unmapped_index;
> -
> -			dma_addr = ib_sg_dma_address(ibdev, sg);
> -			dma_len = ib_sg_dma_len(ibdev, sg);
> -			dma_len -= (state.unmapped_addr - dma_addr);
> -			dma_addr = state.unmapped_addr;
> -			use_fmr = SRP_MAP_NO_FMR;
> -			srp_map_desc(&state, dma_addr, dma_len, target->rkey);
> -		}
> -	}
> -
> -	if (use_fmr == SRP_MAP_ALLOW_FMR && srp_map_finish_fmr(&state, target))
> -		goto backtrack;
> +	srp_map_fmr(&state, target, req, scat, count);
>   
>   	/* We've mapped the request, now pull as much of the indirect
>   	 * descriptor table as we can into the command buffer. If this
> @@ -1147,7 +1161,6 @@ backtrack:
>   	 * guaranteed to fit into the command, as the SCSI layer won't
>   	 * give us more S/G entries than we allow.
>   	 */
> -	req->nfmr = state.nfmr;
>   	if (state.ndesc == 1) {
>   		/* FMR mapping was able to collapse this to one entry,
>   		 * so use a direct descriptor.
Looks good.

Reviewed-by: Sagi Grimberg <sagig-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>


* Re: [PATCH 5/9] IB/srp: Introduce srp_finish_mapping()
       [not found]     ` <5368DB58.6070502-HInyCGIudOg@public.gmane.org>
@ 2014-05-07 10:41       ` Sagi Grimberg
  0 siblings, 0 replies; 32+ messages in thread
From: Sagi Grimberg @ 2014-05-07 10:41 UTC (permalink / raw)
  To: Bart Van Assche, Roland Dreier
  Cc: Sagi Grimberg, Vu Pham, David Dillow, Sebastian Parschauer, linux-rdma

On 5/6/2014 3:53 PM, Bart Van Assche wrote:
> This patch does not change any functionality.
>
> Signed-off-by: Bart Van Assche <bvanassche-HInyCGIudOg@public.gmane.org>
> Cc: Roland Dreier <roland-BHEL68pLQRGGvPXPguhicg@public.gmane.org>
> Cc: David Dillow <dave-i1Mk8JYDVaaSihdK6806/g@public.gmane.org>
> Cc: Sagi Grimberg <sagig-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
> Cc: Vu Pham <vu-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
> Cc: Sebastian Parschauer <sebastian.riemer-EIkl63zCoXaH+58JC4qpiA@public.gmane.org>
> ---
>   drivers/infiniband/ulp/srp/ib_srp.c | 37 +++++++++++++++++++++++++------------
>   1 file changed, 25 insertions(+), 12 deletions(-)
>
> diff --git a/drivers/infiniband/ulp/srp/ib_srp.c b/drivers/infiniband/ulp/srp/ib_srp.c
> index 5fb607b..ba434d6 100644
> --- a/drivers/infiniband/ulp/srp/ib_srp.c
> +++ b/drivers/infiniband/ulp/srp/ib_srp.c
> @@ -935,16 +935,6 @@ static int srp_map_finish_fmr(struct srp_map_state *state,
>   	struct ib_pool_fmr *fmr;
>   	u64 io_addr = 0;
>   
> -	if (!state->npages)
> -		return 0;
> -
> -	if (state->npages == 1) {
> -		srp_map_desc(state, state->base_dma_addr, state->fmr_len,
> -			     target->rkey);
> -		state->npages = state->fmr_len = 0;
> -		return 0;
> -	}
> -
>   	fmr = ib_fmr_pool_map_phys(dev->fmr_pool, state->pages,
>   				   state->npages, io_addr);
>   	if (IS_ERR(fmr))
> @@ -954,10 +944,33 @@ static int srp_map_finish_fmr(struct srp_map_state *state,
>   	state->nfmr++;
>   
>   	srp_map_desc(state, 0, state->fmr_len, fmr->fmr->rkey);
> -	state->npages = state->fmr_len = 0;
> +
>   	return 0;
>   }
>   
> +static int srp_finish_mapping(struct srp_map_state *state,
> +			      struct srp_target_port *target)
> +{
> +	int ret = 0;
> +
> +	if (state->npages == 0)
> +		return 0;
> +
> +	if (state->npages == 1) {
> +		srp_map_desc(state, state->base_dma_addr, state->fmr_len,
> +			     target->rkey);
> +	} else {
> +		ret = srp_map_finish_fmr(state, target);
> +	}
> +

Nit, can lose the brackets.
Other than that, Looks good.

Reviewed-by: Sagi Grimberg <sagig-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>

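The split that patch 5 introduces in srp_finish_mapping() can be reduced to a three-way branch: nothing buffered means nothing to do, a single page becomes a direct descriptor, and multiple pages go through the FMR path. The sketch below is an illustration only — the return codes and the helper are invented stand-ins for srp_map_desc() and srp_map_finish_fmr():

```c
#include <assert.h>

/* Invented markers for which path was taken; not driver constants. */
enum { DIRECT_DESC = 1, FMR_DESC = 2 };

static int finish_mapping(int npages)
{
	if (npages == 0)
		return 0;		/* nothing buffered, nothing to do */
	if (npages == 1)
		return DIRECT_DESC;	/* srp_map_desc() with the base address */
	return FMR_DESC;		/* srp_map_finish_fmr() */
}
```

Keeping the npages == 0 and npages == 1 short-cuts in the caller-facing helper is what lets srp_map_finish_fmr() itself shrink to the pure FMR-mapping case in the patch.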

* Re: [PATCH 6/9] IB/srp: Make srp_alloc_req_data() reallocate request data
       [not found]     ` <5368DB90.9050209-HInyCGIudOg@public.gmane.org>
@ 2014-05-07 10:50       ` Sagi Grimberg
       [not found]         ` <536A0FF7.7050608-LDSdmyG8hGV8YrgS2mwiifqBs+8SCbDb@public.gmane.org>
  0 siblings, 1 reply; 32+ messages in thread
From: Sagi Grimberg @ 2014-05-07 10:50 UTC (permalink / raw)
  To: Bart Van Assche, Roland Dreier
  Cc: Sagi Grimberg, Vu Pham, David Dillow, Sebastian Parschauer, linux-rdma

On 5/6/2014 3:54 PM, Bart Van Assche wrote:
> This patch is needed by the patch that adds fast registration support.
>
> Signed-off-by: Bart Van Assche <bvanassche-HInyCGIudOg@public.gmane.org>
> Cc: Roland Dreier <roland-BHEL68pLQRGGvPXPguhicg@public.gmane.org>
> Cc: David Dillow <dave-i1Mk8JYDVaaSihdK6806/g@public.gmane.org>
> Cc: Sagi Grimberg <sagig-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
> Cc: Vu Pham <vu-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
> Cc: Sebastian Parschauer <sebastian.riemer-EIkl63zCoXaH+58JC4qpiA@public.gmane.org>
> ---
>   drivers/infiniband/ulp/srp/ib_srp.c | 41 ++++++++++++++++++++++++-------------
>   1 file changed, 27 insertions(+), 14 deletions(-)
>
> diff --git a/drivers/infiniband/ulp/srp/ib_srp.c b/drivers/infiniband/ulp/srp/ib_srp.c
> index ba434d6..1c4b0d3 100644
> --- a/drivers/infiniband/ulp/srp/ib_srp.c
> +++ b/drivers/infiniband/ulp/srp/ib_srp.c
> @@ -574,17 +574,18 @@ static void srp_disconnect_target(struct srp_target_port *target)
>   	}
>   }
>   
> -static void srp_free_req_data(struct srp_target_port *target)
> +static void srp_free_req_data(struct srp_target_port *target,
> +			      struct srp_request *req_ring)
>   {

Something here feels wrong (or partially right).

>   	struct ib_device *ibdev = target->srp_host->srp_dev->dev;
>   	struct srp_request *req;
>   	int i;
>   
> -	if (!target->req_ring)
> +	if (!req_ring)
>   		return;
>   
>   	for (i = 0; i < target->req_ring_size; ++i) {
> -		req = &target->req_ring[i];
> +		req = &req_ring[i];

You loop over {ring A size} but operate on ring B's elements. They will
probably be the same, but the pattern looks fragile.
Would it be better to decouple this routine from srp_target_port entirely?

>   		kfree(req->fmr_list);
>   		kfree(req->map_page);
>   		if (req->indirect_dma_addr) {
> @@ -595,27 +596,34 @@ static void srp_free_req_data(struct srp_target_port *target)
>   		kfree(req->indirect_desc);
>   	}
>   
> -	kfree(target->req_ring);
> -	target->req_ring = NULL;
> +	kfree(req_ring);
>   }
>   
> +/**
> + * srp_alloc_req_data() - allocate or reallocate request data
> + * @target: SRP target port.
> + *
> + * If target->req_ring was non-NULL before this function got invoked it will
> + * also be non-NULL after this function has finished.
> + */
>   static int srp_alloc_req_data(struct srp_target_port *target)
>   {
>   	struct srp_device *srp_dev = target->srp_host->srp_dev;
>   	struct ib_device *ibdev = srp_dev->dev;
> -	struct srp_request *req;
> +	struct list_head free_reqs;
> +	struct srp_request *req_ring, *req;
>   	dma_addr_t dma_addr;
>   	int i, ret = -ENOMEM;
>   
> -	INIT_LIST_HEAD(&target->free_reqs);
> +	INIT_LIST_HEAD(&free_reqs);
>   
> -	target->req_ring = kzalloc(target->req_ring_size *
> -				   sizeof(*target->req_ring), GFP_KERNEL);
> -	if (!target->req_ring)
> +	req_ring = kzalloc(target->req_ring_size * sizeof(*req_ring),
> +			   GFP_KERNEL);
> +	if (!req_ring)
>   		goto out;
>   
>   	for (i = 0; i < target->req_ring_size; ++i) {
> -		req = &target->req_ring[i];
> +		req = &req_ring[i];
>   		req->fmr_list = kmalloc(target->cmd_sg_cnt * sizeof(void *),
>   					GFP_KERNEL);
>   		req->map_page = kmalloc(SRP_FMR_SIZE * sizeof(void *),
> @@ -632,11 +640,16 @@ static int srp_alloc_req_data(struct srp_target_port *target)
>   
>   		req->indirect_dma_addr = dma_addr;
>   		req->index = i;
> -		list_add_tail(&req->list, &target->free_reqs);
> +		list_add_tail(&req->list, &free_reqs);
>   	}
> +	swap(target->req_ring, req_ring);
> +	INIT_LIST_HEAD(&target->free_reqs);
> +	list_splice(&free_reqs, &target->free_reqs);
>   	ret = 0;
>   
>   out:
> +	srp_free_req_data(target, req_ring);
> +
>   	return ret;
>   }
>   
> @@ -669,7 +682,7 @@ static void srp_remove_target(struct srp_target_port *target)
>   	srp_free_target_ib(target);
>   	cancel_work_sync(&target->tl_err_work);
>   	srp_rport_put(target->rport);
> -	srp_free_req_data(target);
> +	srp_free_req_data(target, target->req_ring);
>   
>   	spin_lock(&target->srp_host->target_lock);
>   	list_del(&target->list);
> @@ -2750,7 +2763,7 @@ err_free_ib:
>   	srp_free_target_ib(target);
>   
>   err_free_mem:
> -	srp_free_req_data(target);
> +	srp_free_req_data(target, target->req_ring);
>   
>   err:
>   	scsi_host_put(target_host);



* Re: [PATCH 7/9] IB/srp: Avoid triggering an infinite loop if memory mapping fails
       [not found]     ` <5368DBC5.6070609-HInyCGIudOg@public.gmane.org>
@ 2014-05-07 10:50       ` Sagi Grimberg
  0 siblings, 0 replies; 32+ messages in thread
From: Sagi Grimberg @ 2014-05-07 10:50 UTC (permalink / raw)
  To: Bart Van Assche, Roland Dreier
  Cc: Sagi Grimberg, Vu Pham, David Dillow, Sebastian Parschauer, linux-rdma

On 5/6/2014 3:55 PM, Bart Van Assche wrote:
> Only request the SCSI mid-layer to retry a SCSI command after a
> temporary mapping failure (-ENOMEM) but not after a permanent
> mapping failure. This patch avoids that SCSI commands are retried
> indefinitely if a permanent memory mapping failure occurs.
>
> Signed-off-by: Bart Van Assche <bvanassche-HInyCGIudOg@public.gmane.org>
> Cc: Roland Dreier <roland-BHEL68pLQRGGvPXPguhicg@public.gmane.org>
> Cc: David Dillow <dave-i1Mk8JYDVaaSihdK6806/g@public.gmane.org>
> Cc: Sagi Grimberg <sagig-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
> Cc: Vu Pham <vu-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
> Cc: Sebastian Parschauer <sebastian.riemer-EIkl63zCoXaH+58JC4qpiA@public.gmane.org>
> ---
>   drivers/infiniband/ulp/srp/ib_srp.c | 19 ++++++++++++-------
>   1 file changed, 12 insertions(+), 7 deletions(-)
>
> diff --git a/drivers/infiniband/ulp/srp/ib_srp.c b/drivers/infiniband/ulp/srp/ib_srp.c
> index 1c4b0d3..af94381 100644
> --- a/drivers/infiniband/ulp/srp/ib_srp.c
> +++ b/drivers/infiniband/ulp/srp/ib_srp.c
> @@ -1564,7 +1564,7 @@ static int srp_queuecommand(struct Scsi_Host *shost, struct scsi_cmnd *scmnd)
>   	struct srp_cmd *cmd;
>   	struct ib_device *dev;
>   	unsigned long flags;
> -	int len, result;
> +	int len, result, ret = SCSI_MLQUEUE_HOST_BUSY;
>   	const bool in_scsi_eh = !in_interrupt() && current == shost->ehandler;
>   
>   	/*
> @@ -1580,6 +1580,7 @@ static int srp_queuecommand(struct Scsi_Host *shost, struct scsi_cmnd *scmnd)
>   	if (unlikely(result)) {
>   		scmnd->result = result;
>   		scmnd->scsi_done(scmnd);
> +		ret = 0;
>   		goto unlock_rport;
>   	}
>   
> @@ -1613,7 +1614,12 @@ static int srp_queuecommand(struct Scsi_Host *shost, struct scsi_cmnd *scmnd)
>   	len = srp_map_data(scmnd, target, req);
>   	if (len < 0) {
>   		shost_printk(KERN_ERR, target->scsi_host,
> -			     PFX "Failed to map data\n");
> +			     PFX "Failed to map data (%d)\n", len);
> +		if (len != -ENOMEM) {
> +			scmnd->result = DID_ERROR << 16;
> +			scmnd->scsi_done(scmnd);
> +			ret = 0;
> +		}
>   		goto err_iu;
>   	}
>   
> @@ -1625,11 +1631,13 @@ static int srp_queuecommand(struct Scsi_Host *shost, struct scsi_cmnd *scmnd)
>   		goto err_unmap;
>   	}
>   
> +	ret = 0;
> +
>   unlock_rport:
>   	if (in_scsi_eh)
>   		mutex_unlock(&rport->mutex);
>   
> -	return 0;
> +	return ret;
>   
>   err_unmap:
>   	srp_unmap_data(scmnd, target, req);
> @@ -1643,10 +1651,7 @@ err_iu:
>   err_unlock:
>   	spin_unlock_irqrestore(&target->lock, flags);
>   
> -	if (in_scsi_eh)
> -		mutex_unlock(&rport->mutex);
> -
> -	return SCSI_MLQUEUE_HOST_BUSY;
> +	goto unlock_rport;
>   }
>   
>   /*
Looks good.

Reviewed-by: Sagi Grimberg <sagig-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>


* Re: [PATCH 9/9] IB/srp: Add fast registration support
       [not found]     ` <5368DC09.70608-HInyCGIudOg@public.gmane.org>
@ 2014-05-07 11:34       ` Sagi Grimberg
       [not found]         ` <536A1A3A.9090208-LDSdmyG8hGV8YrgS2mwiifqBs+8SCbDb@public.gmane.org>
  0 siblings, 1 reply; 32+ messages in thread
From: Sagi Grimberg @ 2014-05-07 11:34 UTC (permalink / raw)
  To: Bart Van Assche, Roland Dreier
  Cc: Sagi Grimberg, Vu Pham, David Dillow, Sebastian Parschauer, linux-rdma

On 5/6/2014 3:56 PM, Bart Van Assche wrote:
> Certain HCA types (e.g. Connect-IB) and certain configurations (e.g.
> ConnectX VF) support FR but not FMR. Hence add FR support.
>
> Signed-off-by: Bart Van Assche <bvanassche-HInyCGIudOg@public.gmane.org>
> Cc: Roland Dreier <roland-BHEL68pLQRGGvPXPguhicg@public.gmane.org>
> Cc: David Dillow <dave-i1Mk8JYDVaaSihdK6806/g@public.gmane.org>
> Cc: Sagi Grimberg <sagig-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
> Cc: Vu Pham <vu-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
> Cc: Sebastian Parschauer <sebastian.riemer-EIkl63zCoXaH+58JC4qpiA@public.gmane.org>
> ---
>   drivers/infiniband/ulp/srp/ib_srp.c | 442 ++++++++++++++++++++++++++++++------
>   drivers/infiniband/ulp/srp/ib_srp.h |  82 ++++++-
>   2 files changed, 451 insertions(+), 73 deletions(-)
>
> diff --git a/drivers/infiniband/ulp/srp/ib_srp.c b/drivers/infiniband/ulp/srp/ib_srp.c
> index 017de46..fbda2ca 100644
> --- a/drivers/infiniband/ulp/srp/ib_srp.c
> +++ b/drivers/infiniband/ulp/srp/ib_srp.c
> @@ -66,6 +66,8 @@ static unsigned int srp_sg_tablesize;
>   static unsigned int cmd_sg_entries;
>   static unsigned int indirect_sg_entries;
>   static bool allow_ext_sg;
> +static bool prefer_fr;
> +static bool register_always;
>   static int topspin_workarounds = 1;
>   
>   module_param(srp_sg_tablesize, uint, 0444);
> @@ -87,6 +89,14 @@ module_param(topspin_workarounds, int, 0444);
>   MODULE_PARM_DESC(topspin_workarounds,
>   		 "Enable workarounds for Topspin/Cisco SRP target bugs if != 0");
>   
> +module_param(prefer_fr, bool, 0444);
> +MODULE_PARM_DESC(prefer_fr,
> +		 "Whether to use FR if both FMR and FR are supported");
> +

Would it be better to make this a per-target attribute?

> +module_param(register_always, bool, 0444);
> +MODULE_PARM_DESC(register_always,
> +		 "Use memory registration even for contiguous memory regions");
> +

This is a separate change, unrelated to fast registration support; I
suggest posting it as patch #10.

>   static struct kernel_param_ops srp_tmo_ops;
>   
>   static int srp_reconnect_delay = 10;
> @@ -288,12 +298,154 @@ static int srp_new_cm_id(struct srp_target_port *target)
>   	return 0;
>   }
>   
> +/**
> + * srp_destroy_fr_pool() - free the resources owned by a pool
> + * @pool: Fast registration pool to be destroyed.
> + */
> +static void srp_destroy_fr_pool(struct srp_fr_pool *pool)
> +{
> +	int i;
> +	struct srp_fr_desc *d;
> +
> +	if (!pool)
> +		return;
> +
> +	for (i = 0, d = &pool->desc[0]; i < pool->size; i++, d++) {
> +		if (d->frpl)
> +			ib_free_fast_reg_page_list(d->frpl);
> +		if (d->mr)
> +			ib_dereg_mr(d->mr);
> +	}
> +	kfree(pool);
> +}
> +
> +/**
> + * srp_create_fr_pool() - allocate and initialize a pool for fast registration
> + * @device:            IB device to allocate fast registration descriptors for.
> + * @pd:                Protection domain associated with the FR descriptors.
> + * @pool_size:         Number of descriptors to allocate.
> + * @max_page_list_len: Maximum fast registration work request page list length.
> + */
> +static struct srp_fr_pool *srp_create_fr_pool(struct ib_device *device,
> +					      struct ib_pd *pd, int pool_size,
> +					      int max_page_list_len)
> +{
> +	struct srp_fr_pool *pool;
> +	struct srp_fr_desc *d;
> +	struct ib_mr *mr;
> +	struct ib_fast_reg_page_list *frpl;
> +	int i, ret = -EINVAL;
> +
> +	if (pool_size <= 0)
> +		goto err;
> +	ret = -ENOMEM;
> +	pool = kzalloc(sizeof(struct srp_fr_pool) +
> +		       pool_size * sizeof(struct srp_fr_desc), GFP_KERNEL);
> +	if (!pool)
> +		goto err;
> +	pool->size = pool_size;
> +	pool->max_page_list_len = max_page_list_len;
> +	spin_lock_init(&pool->lock);
> +	INIT_LIST_HEAD(&pool->free_list);
> +
> +	for (i = 0, d = &pool->desc[0]; i < pool->size; i++, d++) {
> +		mr = ib_alloc_fast_reg_mr(pd, max_page_list_len);
> +		if (IS_ERR(mr)) {
> +			ret = PTR_ERR(mr);
> +			goto destroy_pool;
> +		}
> +		d->mr = mr;
> +		frpl = ib_alloc_fast_reg_page_list(device, max_page_list_len);
> +		if (IS_ERR(frpl)) {
> +			ret = PTR_ERR(frpl);
> +			goto destroy_pool;
> +		}
> +		d->frpl = frpl;
> +		list_add_tail(&d->entry, &pool->free_list);
> +	}
> +
> +out:
> +	return pool;
> +
> +destroy_pool:
> +	srp_destroy_fr_pool(pool);
> +
> +err:
> +	pool = ERR_PTR(ret);
> +	goto out;
> +}
> +
> +/**
> + * srp_fr_pool_get() - obtain a descriptor suitable for fast registration
> + * @pool: Pool to obtain descriptor from.
> + */
> +static struct srp_fr_desc *srp_fr_pool_get(struct srp_fr_pool *pool)
> +{
> +	struct srp_fr_desc *d = NULL;
> +	unsigned long flags;
> +
> +	spin_lock_irqsave(&pool->lock, flags);
> +	if (!list_empty(&pool->free_list)) {
> +		d = list_first_entry(&pool->free_list, typeof(*d), entry);
> +		list_del(&d->entry);
> +	}
> +	spin_unlock_irqrestore(&pool->lock, flags);
> +
> +	return d;
> +}
> +
> +/**
> + * srp_fr_pool_put() - put an FR descriptor back in the free list
> + * @pool: Pool the descriptor was allocated from.
> + * @desc: Pointer to an array of fast registration descriptor pointers.
> + * @n:    Number of descriptors to put back.
> + *
> + * Note: The caller must already have queued an invalidation request for
> + * desc->mr->rkey before calling this function.
> + */
> +static void srp_fr_pool_put(struct srp_fr_pool *pool, struct srp_fr_desc **desc,
> +			    int n)
> +{
> +	unsigned long flags;
> +	int i;
> +
> +	spin_lock_irqsave(&pool->lock, flags);
> +	for (i = 0; i < n; i++)
> +		list_add(&desc[i]->entry, &pool->free_list);
> +	spin_unlock_irqrestore(&pool->lock, flags);
> +}
> +
> +static struct srp_fr_pool *srp_alloc_fr_pool(struct srp_target_port *target)
> +{
> +	struct srp_device *dev = target->srp_host->srp_dev;
> +	struct srp_fr_pool *pool;
> +	int max_pages_per_mr;
> +
> +	for (max_pages_per_mr = SRP_MAX_PAGES_PER_MR;
> +	     max_pages_per_mr >= SRP_MIN_PAGES_PER_MR;
> +	     max_pages_per_mr /= 2) {

Isn't there a device capability for this? Maybe max_mr_size?
Also, with a smaller max_pages you won't be able to handle 512 pages in a
single command (you would need two FRMRs).
How about at least a warning print?

> +		pool = srp_create_fr_pool(dev->dev, dev->pd,
> +					  SRP_MDESC_PER_POOL, max_pages_per_mr);

1024 FRMRs per connection?! We use 1024 FMRs for all connections combined
(per device).
I'd say that's a major over-allocation.

> +		if (!IS_ERR(pool))
> +			goto out;
> +	}
> +
> +	if (IS_ERR(pool))
> +		pr_warn("Fast registration pool creation for %s failed: %d\n",
> +			dev->dev->name, PTR_RET(pool));
> +
> +out:
> +	return pool;
> +}
> +
>   static int srp_create_target_ib(struct srp_target_port *target)
>   {
>   	struct srp_device *dev = target->srp_host->srp_dev;
>   	struct ib_qp_init_attr *init_attr;
>   	struct ib_cq *recv_cq, *send_cq;
>   	struct ib_qp *qp;
> +	struct srp_fr_pool *fr_pool = NULL;
> +	const int m = 1 + dev->use_fast_reg;
>   	int ret;
>   
>   	init_attr = kzalloc(sizeof *init_attr, GFP_KERNEL);
> @@ -308,7 +460,7 @@ static int srp_create_target_ib(struct srp_target_port *target)
>   	}
>   
>   	send_cq = ib_create_cq(dev->dev, srp_send_completion, NULL, target,
> -			       target->queue_size, target->comp_vector);
> +			       m * target->queue_size, target->comp_vector);
>   	if (IS_ERR(send_cq)) {
>   		ret = PTR_ERR(send_cq);
>   		goto err_recv_cq;
> @@ -317,11 +469,11 @@ static int srp_create_target_ib(struct srp_target_port *target)
>   	ib_req_notify_cq(recv_cq, IB_CQ_NEXT_COMP);
>   
>   	init_attr->event_handler       = srp_qp_event;
> -	init_attr->cap.max_send_wr     = target->queue_size;
> +	init_attr->cap.max_send_wr     = m * target->queue_size;
>   	init_attr->cap.max_recv_wr     = target->queue_size;
>   	init_attr->cap.max_recv_sge    = 1;
>   	init_attr->cap.max_send_sge    = 1;
> -	init_attr->sq_sig_type         = IB_SIGNAL_ALL_WR;
> +	init_attr->sq_sig_type         = IB_SIGNAL_REQ_WR;

Would it be better to do this in a preparation patch? (Not that important
to me.)

>   	init_attr->qp_type             = IB_QPT_RC;
>   	init_attr->send_cq             = send_cq;
>   	init_attr->recv_cq             = recv_cq;
> @@ -336,6 +488,14 @@ static int srp_create_target_ib(struct srp_target_port *target)
>   	if (ret)
>   		goto err_qp;
>   
> +	if (dev->use_fast_reg) {
> +		fr_pool = srp_alloc_fr_pool(target);
> +		if (IS_ERR(fr_pool)) {
> +			ret = PTR_ERR(fr_pool);
> +			goto err_qp;
> +		}
> +	}
> +
>   	if (target->qp)
>   		ib_destroy_qp(target->qp);
>   	if (target->recv_cq)
> @@ -343,6 +503,15 @@ static int srp_create_target_ib(struct srp_target_port *target)
>   	if (target->send_cq)
>   		ib_destroy_cq(target->send_cq);
>   
> +	if (dev->use_fast_reg) {
> +		srp_destroy_fr_pool(target->fr_pool);
> +		target->fr_pool = fr_pool;
> +		target->mr_max_size = dev->mr_page_size *
> +			fr_pool->max_page_list_len;
> +	} else {
> +		target->mr_max_size = dev->fmr_max_size;
> +	}
> +
>   	target->qp = qp;
>   	target->recv_cq = recv_cq;
>   	target->send_cq = send_cq;
> @@ -370,12 +539,16 @@ err:
>    */
>   static void srp_free_target_ib(struct srp_target_port *target)
>   {
> +	struct srp_device *dev = target->srp_host->srp_dev;
>   	int i;
>   
>   	ib_destroy_qp(target->qp);
>   	ib_destroy_cq(target->send_cq);
>   	ib_destroy_cq(target->recv_cq);
>   
> +	if (dev->use_fast_reg)
> +		srp_destroy_fr_pool(target->fr_pool);
> +
>   	target->qp = NULL;
>   	target->send_cq = target->recv_cq = NULL;
>   
> @@ -577,7 +750,8 @@ static void srp_disconnect_target(struct srp_target_port *target)
>   static void srp_free_req_data(struct srp_target_port *target,
>   			      struct srp_request *req_ring)
>   {
> -	struct ib_device *ibdev = target->srp_host->srp_dev->dev;
> +	struct srp_device *dev = target->srp_host->srp_dev;
> +	struct ib_device *ibdev = dev->dev;
>   	struct srp_request *req;
>   	int i;
>   
> @@ -586,7 +760,10 @@ static void srp_free_req_data(struct srp_target_port *target,
>   
>   	for (i = 0; i < target->req_ring_size; ++i) {
>   		req = &req_ring[i];
> -		kfree(req->fmr_list);
> +		if (dev->use_fast_reg)
> +			kfree(req->fr.fr_list);
> +		else
> +			kfree(req->fmr.fmr_list);
>   		kfree(req->map_page);
>   		if (req->indirect_dma_addr) {
>   			ib_dma_unmap_single(ibdev, req->indirect_dma_addr,
> @@ -612,6 +789,7 @@ static int srp_alloc_req_data(struct srp_target_port *target)
>   	struct ib_device *ibdev = srp_dev->dev;
>   	struct list_head free_reqs;
>   	struct srp_request *req_ring, *req;
> +	void *mr_list;
>   	dma_addr_t dma_addr;
>   	int i, ret = -ENOMEM;
>   
> @@ -624,12 +802,20 @@ static int srp_alloc_req_data(struct srp_target_port *target)
>   
>   	for (i = 0; i < target->req_ring_size; ++i) {
>   		req = &req_ring[i];
> -		req->fmr_list = kmalloc(target->cmd_sg_cnt * sizeof(void *),
> -					GFP_KERNEL);
> +		mr_list = kmalloc(target->cmd_sg_cnt * sizeof(void *),
> +				  GFP_KERNEL);
> +		if (!mr_list)
> +			goto out;
> +		if (srp_dev->use_fast_reg)
> +			req->fr.fr_list = mr_list;
> +		else
> +			req->fmr.fmr_list = mr_list;
>   		req->map_page = kmalloc(SRP_MAX_PAGES_PER_MR * sizeof(void *),
>   					GFP_KERNEL);
> +		if (!req->map_page)
> +			goto out;
>   		req->indirect_desc = kmalloc(target->indirect_size, GFP_KERNEL);
> -		if (!req->fmr_list || !req->map_page || !req->indirect_desc)
> +		if (!req->indirect_desc)
>   			goto out;
>   
>   		dma_addr = ib_dma_map_single(ibdev, req->indirect_desc,
> @@ -771,21 +957,49 @@ static int srp_connect_target(struct srp_target_port *target)
>   	}
>   }
>   
> +static int srp_inv_rkey(struct srp_target_port *target, u32 rkey)
> +{
> +	struct ib_send_wr *bad_wr;
> +	struct ib_send_wr wr = {
> +		.opcode		    = IB_WR_LOCAL_INV,
> +		.wr_id		    = LOCAL_INV_WR_ID_MASK,
> +		.next		    = NULL,
> +		.num_sge	    = 0,
> +		.send_flags	    = 0,
> +		.ex.invalidate_rkey = rkey,
> +	};
> +
> +	return ib_post_send(target->qp, &wr, &bad_wr);
> +}
> +
>   static void srp_unmap_data(struct scsi_cmnd *scmnd,
>   			   struct srp_target_port *target,
>   			   struct srp_request *req)
>   {
> -	struct ib_device *ibdev = target->srp_host->srp_dev->dev;
> -	struct ib_pool_fmr **pfmr;
> +	struct srp_device *dev = target->srp_host->srp_dev;
> +	struct ib_device *ibdev = dev->dev;
> +	int i;
>   
>   	if (!scsi_sglist(scmnd) ||
>   	    (scmnd->sc_data_direction != DMA_TO_DEVICE &&
>   	     scmnd->sc_data_direction != DMA_FROM_DEVICE))
>   		return;
>   
> -	pfmr = req->fmr_list;
> -	while (req->nmdesc--)
> -		ib_fmr_pool_unmap(*pfmr++);
> +	if (dev->use_fast_reg) {
> +		struct srp_fr_desc **pfr;
> +
> +		for (i = req->nmdesc, pfr = req->fr.fr_list; i > 0; i--, pfr++)
> +			srp_inv_rkey(target, (*pfr)->mr->rkey);
> +		if (req->nmdesc)
> +			srp_fr_pool_put(target->fr_pool, req->fr.fr_list,
> +					req->nmdesc);
> +	} else {
> +		struct ib_pool_fmr **pfmr;
> +
> +		for (i = req->nmdesc, pfmr = req->fmr.fmr_list; i > 0;
> +		     i--, pfmr++)
> +			ib_fmr_pool_unmap(*pfmr);
> +	}
>   
>   	ib_dma_unmap_sg(ibdev, scsi_sglist(scmnd), scsi_sg_count(scmnd),
>   			scmnd->sc_data_direction);
> @@ -898,20 +1112,25 @@ static int srp_rport_reconnect(struct srp_rport *rport)
>   	 * callbacks will have finished before a new QP is allocated.
>   	 */
>   	ret = srp_new_cm_id(target);
> +
> +	for (i = 0; i < target->req_ring_size; ++i) {
> +		struct srp_request *req = &target->req_ring[i];
> +		srp_finish_req(target, req, NULL, DID_RESET << 16);
> +	}

Can you explain why this was moved here? I'm not sure I understand.

> +
>   	/*
>   	 * Whether or not creating a new CM ID succeeded, create a new
> -	 * QP. This guarantees that all completion callback function
> -	 * invocations have finished before request resetting starts.
> +	 * QP. This guarantees that all QP callback function invocations have
> +	 * finished before request reallocating starts.
>   	 */
>   	if (ret == 0)
>   		ret = srp_create_target_ib(target);
>   	else
>   		srp_create_target_ib(target);
>   
> -	for (i = 0; i < target->req_ring_size; ++i) {
> -		struct srp_request *req = &target->req_ring[i];
> -		srp_finish_req(target, req, NULL, DID_RESET << 16);
> -	}
> +	/* Reallocate requests to reset the MR state in FR mode. */
> +	if (ret == 0)
> +		ret = srp_alloc_req_data(target);
>   
>   	INIT_LIST_HEAD(&target->free_tx);
>   	for (i = 0; i < target->queue_size; ++i)
> @@ -961,6 +1180,47 @@ static int srp_map_finish_fmr(struct srp_map_state *state,
>   	return 0;
>   }
>   
> +static int srp_map_finish_fr(struct srp_map_state *state,
> +			     struct srp_target_port *target)
> +{
> +	struct srp_device *dev = target->srp_host->srp_dev;
> +	struct ib_send_wr *bad_wr;
> +	struct ib_send_wr wr;
> +	struct srp_fr_desc *desc;
> +	u32 rkey;
> +
> +	desc = srp_fr_pool_get(target->fr_pool);
> +	if (!desc)
> +		return -ENOMEM;
> +
> +	rkey = ib_inc_rkey(desc->mr->rkey);
> +	ib_update_fast_reg_key(desc->mr, rkey);

Usually the rkey increment is done right after the invalidation (but
that's not really an issue).
Maybe it is better here, in case the target did a remote invalidate (not
supported at the moment, but maybe in the future?).

> +
> +	memcpy(desc->frpl->page_list, state->pages,
> +	       sizeof(state->pages[0]) * state->npages);
> +
> +	memset(&wr, 0, sizeof(wr));
> +	wr.opcode = IB_WR_FAST_REG_MR;
> +	wr.wr_id = FAST_REG_WR_ID_MASK;
> +	wr.wr.fast_reg.iova_start = state->base_dma_addr;
> +	wr.wr.fast_reg.page_list = desc->frpl;
> +	wr.wr.fast_reg.page_list_len = state->npages;
> +	wr.wr.fast_reg.page_shift = ilog2(dev->mr_page_size);
> +	wr.wr.fast_reg.length = state->dma_len;
> +	wr.wr.fast_reg.access_flags = (IB_ACCESS_LOCAL_WRITE |
> +				       IB_ACCESS_REMOTE_READ |
> +				       IB_ACCESS_REMOTE_WRITE);
> +	wr.wr.fast_reg.rkey = desc->mr->lkey;
> +
> +	*state->next_fr++ = desc;
> +	state->nmdesc++;
> +
> +	srp_map_desc(state, state->base_dma_addr, state->dma_len,
> +		     desc->mr->rkey);
> +
> +	return ib_post_send(target->qp, &wr, &bad_wr);
> +}
> +
>   static int srp_finish_mapping(struct srp_map_state *state,
>   			      struct srp_target_port *target)
>   {
> @@ -969,11 +1229,13 @@ static int srp_finish_mapping(struct srp_map_state *state,
>   	if (state->npages == 0)
>   		return 0;
>   
> -	if (state->npages == 1) {
> +	if (state->npages == 1 && !register_always) {
>   		srp_map_desc(state, state->base_dma_addr, state->dma_len,
>   			     target->rkey);
>   	} else {
> -		ret = srp_map_finish_fmr(state, target);
> +		ret = target->srp_host->srp_dev->use_fast_reg ?
> +			srp_map_finish_fr(state, target) :
> +			srp_map_finish_fmr(state, target);
>   	}
>   
>   	if (ret == 0) {
> @@ -996,7 +1258,7 @@ static void srp_map_update_start(struct srp_map_state *state,
>   static int srp_map_sg_entry(struct srp_map_state *state,
>   			    struct srp_target_port *target,
>   			    struct scatterlist *sg, int sg_index,
> -			    int use_fmr)
> +			    bool use_memory_registration)
>   {
>   	struct srp_device *dev = target->srp_host->srp_dev;
>   	struct ib_device *ibdev = dev->dev;
> @@ -1008,22 +1270,24 @@ static int srp_map_sg_entry(struct srp_map_state *state,
>   	if (!dma_len)
>   		return 0;
>   
> -	if (use_fmr == SRP_MAP_NO_FMR) {
> -		/* Once we're in direct map mode for a request, we don't
> -		 * go back to FMR mode, so no need to update anything
> +	if (!use_memory_registration) {
> +		/*
> +		 * Once we're in direct map mode for a request, we don't
> +		 * go back to FMR or FR mode, so no need to update anything
>   		 * other than the descriptor.
>   		 */
>   		srp_map_desc(state, dma_addr, dma_len, target->rkey);
>   		return 0;
>   	}
>   
> -	/* If we start at an offset into the FMR page, don't merge into
> -	 * the current FMR. Finish it out, and use the kernel's MR for this
> -	 * sg entry. This is to avoid potential bugs on some SRP targets
> -	 * that were never quite defined, but went away when the initiator
> -	 * avoided using FMR on such page fragments.
> +	/*
> +	 * Since not all RDMA HW drivers support non-zero page offsets for
> +	 * FMR, if we start at an offset into a page, don't merge into the
> +	 * current FMR mapping. Finish it out, and use the kernel's MR for
> +	 * this sg entry.
>   	 */
> -	if (dma_addr & ~dev->mr_page_mask || dma_len > dev->fmr_max_size) {
> +	if ((!dev->use_fast_reg && dma_addr & ~dev->mr_page_mask) ||
> +	    dma_len > target->mr_max_size) {
>   		ret = srp_finish_mapping(state, target);
>   		if (ret)
>   			return ret;
> @@ -1033,28 +1297,30 @@ static int srp_map_sg_entry(struct srp_map_state *state,
>   		return 0;
>   	}
>   
> -	/* If this is the first sg to go into the FMR, save our position.
> -	 * We need to know the first unmapped entry, its index, and the
> -	 * first unmapped address within that entry to be able to restart
> -	 * mapping after an error.
> +	/*
> +	 * If this is the first sg that will be mapped via FMR or via FR, save
> +	 * our position. We need to know the first unmapped entry, its index,
> +	 * and the first unmapped address within that entry to be able to
> +	 * restart mapping after an error.
>   	 */
>   	if (!state->unmapped_sg)
>   		srp_map_update_start(state, sg, sg_index, dma_addr);
>   
>   	while (dma_len) {
> -		if (state->npages == SRP_MAX_PAGES_PER_MR) {
> -			ret = srp_map_finish_fmr(state, target);
> +		unsigned offset = dma_addr & ~dev->mr_page_mask;
> +		if (state->npages == SRP_MAX_PAGES_PER_MR || offset != 0) {
> +			ret = srp_finish_mapping(state, target);
>   			if (ret)
>   				return ret;
>   
>   			srp_map_update_start(state, sg, sg_index, dma_addr);
>   		}
>   
> -		len = min_t(unsigned int, dma_len, dev->mr_page_size);
> +		len = min_t(unsigned int, dma_len, dev->mr_page_size - offset);
>   
>   		if (!state->npages)
>   			state->base_dma_addr = dma_addr;
> -		state->pages[state->npages++] = dma_addr;
> +		state->pages[state->npages++] = dma_addr & dev->mr_page_mask;
>   		state->dma_len += len;
>   		dma_addr += len;
>   		dma_len -= len;
> @@ -1066,32 +1332,40 @@ static int srp_map_sg_entry(struct srp_map_state *state,
>   	 */
>   	ret = 0;
>   	if (len != dev->mr_page_size) {
> -		ret = srp_map_finish_fmr(state, target);
> +		ret = srp_finish_mapping(state, target);
>   		if (!ret)
>   			srp_map_update_start(state, NULL, 0, 0);
>   	}
>   	return ret;
>   }
>   
> -static void srp_map_fmr(struct srp_map_state *state,
> -			struct srp_target_port *target, struct srp_request *req,
> -			struct scatterlist *scat, int count)
> +static int srp_map_sg(struct srp_map_state *state,
> +		      struct srp_target_port *target, struct srp_request *req,
> +		      struct scatterlist *scat, int count)
>   {
>   	struct srp_device *dev = target->srp_host->srp_dev;
>   	struct ib_device *ibdev = dev->dev;
>   	struct scatterlist *sg;
> -	int i, use_fmr;
> +	int i;
> +	bool use_memory_registration;
>   
>   	state->desc	= req->indirect_desc;
>   	state->pages	= req->map_page;
> -	state->next_fmr	= req->fmr_list;
> -
> -	use_fmr = dev->fmr_pool ? SRP_MAP_ALLOW_FMR : SRP_MAP_NO_FMR;
> +	if (dev->use_fast_reg) {
> +		state->next_fr = req->fr.fr_list;
> +		use_memory_registration = !!target->fr_pool;
> +	} else {
> +		state->next_fmr = req->fmr.fmr_list;
> +		use_memory_registration = !!dev->fmr_pool;
> +	}
>   
>   	for_each_sg(scat, sg, count, i) {
> -		if (srp_map_sg_entry(state, target, sg, i, use_fmr)) {
> -			/* FMR mapping failed, so backtrack to the first
> -			 * unmapped entry and continue on without using FMR.
> +		if (srp_map_sg_entry(state, target, sg, i,
> +				     use_memory_registration)) {
> +			/*
> +			 * Memory registration failed, so backtrack to the
> +			 * first unmapped entry and continue on without using
> +			 * memory registration.
>   			 */
>   			dma_addr_t dma_addr;
>   			unsigned int dma_len;
> @@ -1104,15 +1378,17 @@ backtrack:
>   			dma_len = ib_sg_dma_len(ibdev, sg);
>   			dma_len -= (state->unmapped_addr - dma_addr);
>   			dma_addr = state->unmapped_addr;
> -			use_fmr = SRP_MAP_NO_FMR;
> +			use_memory_registration = false;
>   			srp_map_desc(state, dma_addr, dma_len, target->rkey);
>   		}
>   	}
>   
> -	if (use_fmr == SRP_MAP_ALLOW_FMR && srp_map_finish_fmr(state, target))
> +	if (use_memory_registration && srp_finish_mapping(state, target))
>   		goto backtrack;
>   
>   	req->nmdesc = state->nmdesc;
> +
> +	return 0;
>   }
>   
>   static int srp_map_data(struct scsi_cmnd *scmnd, struct srp_target_port *target,
> @@ -1120,7 +1396,7 @@ static int srp_map_data(struct scsi_cmnd *scmnd, struct srp_target_port *target,
>   {
>   	struct scatterlist *scat;
>   	struct srp_cmd *cmd = req->cmd->buf;
> -	int len, nents, count;
> +	int len, nents, count, res;
>   	struct srp_device *dev;
>   	struct ib_device *ibdev;
>   	struct srp_map_state state;
> @@ -1152,7 +1428,7 @@ static int srp_map_data(struct scsi_cmnd *scmnd, struct srp_target_port *target,
>   	fmt = SRP_DATA_DESC_DIRECT;
>   	len = sizeof (struct srp_cmd) +	sizeof (struct srp_direct_buf);
>   
> -	if (count == 1) {
> +	if (count == 1 && !register_always) {
>   		/*
>   		 * The midlayer only generated a single gather/scatter
>   		 * entry, or DMA mapping coalesced everything to a
> @@ -1169,9 +1445,9 @@ static int srp_map_data(struct scsi_cmnd *scmnd, struct srp_target_port *target,
>   		goto map_complete;
>   	}
>   
> -	/* We have more than one scatter/gather entry, so build our indirect
> -	 * descriptor table, trying to merge as many entries with FMR as we
> -	 * can.
> +	/*
> +	 * We have more than one scatter/gather entry, so build our indirect
> +	 * descriptor table, trying to merge as many entries as we can.
>   	 */
>   	indirect_hdr = (void *) cmd->add_data;
>   
> @@ -1179,7 +1455,9 @@ static int srp_map_data(struct scsi_cmnd *scmnd, struct srp_target_port *target,
>   				   target->indirect_size, DMA_TO_DEVICE);
>   
>   	memset(&state, 0, sizeof(state));
> -	srp_map_fmr(&state, target, req, scat, count);
> +	res = srp_map_sg(&state, target, req, scat, count);
> +	if (res < 0)
> +		return res;
>   
>   	/* We've mapped the request, now pull as much of the indirect
>   	 * descriptor table as we can into the command buffer. If this
> @@ -1188,7 +1466,8 @@ static int srp_map_data(struct scsi_cmnd *scmnd, struct srp_target_port *target,
>   	 * give us more S/G entries than we allow.
>   	 */
>   	if (state.ndesc == 1) {
> -		/* FMR mapping was able to collapse this to one entry,
> +		/*
> +		 * Memory registration collapsed the sg-list into one entry,
>   		 * so use a direct descriptor.
>   		 */
>   		struct srp_direct_buf *buf = (void *) cmd->add_data;
> @@ -1511,14 +1790,24 @@ static void srp_tl_err_work(struct work_struct *work)
>   		srp_start_tl_fail_timers(target->rport);
>   }
>   
> -static void srp_handle_qp_err(enum ib_wc_status wc_status, bool send_err,
> -			      struct srp_target_port *target)
> +static void srp_handle_qp_err(u64 wr_id, enum ib_wc_status wc_status,
> +			      bool send_err, struct srp_target_port *target)
>   {
>   	if (target->connected && !target->qp_in_error) {
> -		shost_printk(KERN_ERR, target->scsi_host,
> -			     PFX "failed %s status %d\n",
> -			     send_err ? "send" : "receive",
> -			     wc_status);
> +		if (wr_id & LOCAL_INV_WR_ID_MASK) {
> +			shost_printk(KERN_ERR, target->scsi_host,
> +				     "LOCAL_INV failed with status %d\n",
> +				     wc_status);
> +		} else if (wr_id & FAST_REG_WR_ID_MASK) {
> +			shost_printk(KERN_ERR, target->scsi_host,
> +				     "FAST_REG_MR failed status %d\n",
> +				     wc_status);
> +		} else {
> +			shost_printk(KERN_ERR, target->scsi_host,
> +				     PFX "failed %s status %d for iu %p\n",
> +				     send_err ? "send" : "receive",
> +				     wc_status, (void *)(uintptr_t)wr_id);
> +		}
>   		queue_work(system_long_wq, &target->tl_err_work);
>   	}
>   	target->qp_in_error = true;
> @@ -1534,7 +1823,7 @@ static void srp_recv_completion(struct ib_cq *cq, void *target_ptr)
>   		if (likely(wc.status == IB_WC_SUCCESS)) {
>   			srp_handle_recv(target, &wc);
>   		} else {
> -			srp_handle_qp_err(wc.status, false, target);
> +			srp_handle_qp_err(wc.wr_id, wc.status, false, target);
>   		}
>   	}
>   }
> @@ -1550,7 +1839,7 @@ static void srp_send_completion(struct ib_cq *cq, void *target_ptr)
>   			iu = (struct srp_iu *) (uintptr_t) wc.wr_id;
>   			list_add(&iu->list, &target->free_tx);
>   		} else {
> -			srp_handle_qp_err(wc.status, true, target);
> +			srp_handle_qp_err(wc.wr_id, wc.status, true, target);
>   		}
>   	}
>   }
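The wr_id demultiplexing in srp_handle_qp_err() above works because information units are kmalloc()'ed and therefore at least 4-byte aligned, which leaves the two low bits of a pointer-valued wr_id free to act as flags. A minimal userspace sketch of that tagging scheme (the helper names are illustrative, not from the patch):

```c
#include <assert.h>
#include <stdint.h>

/* Same flag values as the patch: LOCAL_INV and FAST_REG work requests
 * carry a bit in wr_id, while send/receive completions carry an iu
 * pointer whose two low bits are guaranteed to be zero by alignment. */
enum { LOCAL_INV_WR_ID_MASK = 1, FAST_REG_WR_ID_MASK = 2 };

static uint64_t make_local_inv_wr_id(void) { return LOCAL_INV_WR_ID_MASK; }
static uint64_t make_fast_reg_wr_id(void)  { return FAST_REG_WR_ID_MASK; }
static uint64_t make_iu_wr_id(void *iu)    { return (uint64_t)(uintptr_t)iu; }

/* Returns 0 for an iu completion, 1 for LOCAL_INV, 2 for FAST_REG_MR,
 * mirroring the if/else if/else chain in srp_handle_qp_err(). */
static int classify_wr_id(uint64_t wr_id)
{
	if (wr_id & LOCAL_INV_WR_ID_MASK)
		return 1;
	if (wr_id & FAST_REG_WR_ID_MASK)
		return 2;
	return 0;
}
```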
> @@ -2873,6 +3162,7 @@ static void srp_add_one(struct ib_device *device)
>   	struct ib_device_attr *dev_attr;
>   	struct srp_host *host;
>   	int mr_page_shift, s, e, p;
> +	bool have_fmr = false, have_fr = false;
>   
>   	dev_attr = kmalloc(sizeof *dev_attr, GFP_KERNEL);
>   	if (!dev_attr)
> @@ -2887,6 +3177,19 @@ static void srp_add_one(struct ib_device *device)
>   	if (!srp_dev)
>   		goto free_attr;
>   
> +	if (device->alloc_fmr && device->dealloc_fmr && device->map_phys_fmr &&
> +	    device->unmap_fmr) {
> +		have_fmr = true;
> +	}
> +	if (dev_attr->device_cap_flags & IB_DEVICE_MEM_MGT_EXTENSIONS)
> +		have_fr = true;
> +	if (!have_fmr && !have_fr) {
> +		dev_err(&device->dev, "neither FMR nor FR is supported\n");
> +		goto free_dev;
> +	}
> +
> +	srp_dev->use_fast_reg = have_fr && (!have_fmr || prefer_fr);
> +
>   	/*
>   	 * Use the smallest page size supported by the HCA, down to a
>   	 * minimum of 4096 bytes. We're unlikely to build large sglists
> @@ -2910,7 +3213,8 @@ static void srp_add_one(struct ib_device *device)
>   	if (IS_ERR(srp_dev->mr))
>   		goto err_pd;
>   
> -	srp_alloc_fmr_pool(srp_dev);
> +	if (!srp_dev->use_fast_reg)
> +		srp_alloc_fmr_pool(srp_dev);

So I ask here, are we OK with this asymmetry? FMRs are per-device and 
FRMRs are per-target?
I have a feeling that converting FMRs to per-connection might simplify 
things. What do you think?

>   
>   	if (device->node_type == RDMA_NODE_IB_SWITCH) {
>   		s = 0;
> @@ -2974,7 +3278,7 @@ static void srp_remove_one(struct ib_device *device)
>   		kfree(host);
>   	}
>   
> -	if (srp_dev->fmr_pool)
> +	if (!srp_dev->use_fast_reg && srp_dev->fmr_pool)
>   		ib_destroy_fmr_pool(srp_dev->fmr_pool);
>   	ib_dereg_mr(srp_dev->mr);
>   	ib_dealloc_pd(srp_dev->pd);
> diff --git a/drivers/infiniband/ulp/srp/ib_srp.h b/drivers/infiniband/ulp/srp/ib_srp.h
> index 89e3adb..4ec44b5 100644
> --- a/drivers/infiniband/ulp/srp/ib_srp.h
> +++ b/drivers/infiniband/ulp/srp/ib_srp.h
> @@ -71,8 +71,8 @@ enum {
>   	SRP_MDESC_PER_POOL	= 1024,
>   	SRP_FMR_DIRTY_SIZE	= SRP_MDESC_PER_POOL / 4,
>   
> -	SRP_MAP_ALLOW_FMR	= 0,
> -	SRP_MAP_NO_FMR		= 1,
> +	LOCAL_INV_WR_ID_MASK	= 1,
> +	FAST_REG_WR_ID_MASK	= 2,
>   };
>   
>   enum srp_target_state {
> @@ -86,6 +86,12 @@ enum srp_iu_type {
>   	SRP_IU_RSP,
>   };
>   
> +/*
> + * @mr_page_mask: HCA memory registration page mask.
> + * @mr_page_size: HCA memory registration page size.
> + * @fmr_max_size: Maximum size of a single HCA memory registration request
> + *                when using FMR.
> + */
>   struct srp_device {
>   	struct list_head	dev_list;
>   	struct ib_device       *dev;
> @@ -95,6 +101,7 @@ struct srp_device {
>   	u64			mr_page_mask;
>   	int			mr_page_size;
>   	int			fmr_max_size;
> +	bool			use_fast_reg;
>   };
>   
>   struct srp_host {
> @@ -108,18 +115,32 @@ struct srp_host {
>   	struct mutex		add_target_mutex;
>   };
>   
> +/*
> + * In the union below 'fmr' stands for 'Fast Memory Registration' and fr for
> + * 'Fast Registration'.
> + */
>   struct srp_request {
>   	struct list_head	list;
>   	struct scsi_cmnd       *scmnd;
>   	struct srp_iu	       *cmd;
> -	struct ib_pool_fmr    **fmr_list;
>   	u64		       *map_page;
>   	struct srp_direct_buf  *indirect_desc;
>   	dma_addr_t		indirect_dma_addr;
>   	short			nmdesc;
>   	short			index;
> +	union {
> +		struct {
> +			struct ib_pool_fmr **fmr_list;
> +		} fmr;
> +		struct {
> +			struct srp_fr_desc **fr_list;
> +		} fr;
> +	};
>   };
>   
> +/*
> + * @mr_max_size: Maximum size of a single HCA memory registration request.
> + */
>   struct srp_target_port {
>   	/* These are RW in the hot path, and commonly used together */
>   	struct list_head	free_tx;
> @@ -131,6 +152,8 @@ struct srp_target_port {
>   	struct ib_cq	       *send_cq ____cacheline_aligned_in_smp;
>   	struct ib_cq	       *recv_cq;
>   	struct ib_qp	       *qp;
> +	struct srp_fr_pool     *fr_pool;
> +	int			mr_max_size;
>   	u32			lkey;
>   	u32			rkey;
>   	enum srp_target_state	state;
> @@ -197,8 +220,59 @@ struct srp_iu {
>   	enum dma_data_direction	direction;
>   };
>   
> +/**
> + * struct srp_fr_desc - fast registration work request arguments
> + * @entry: Entry in free_list.
> + * @mr:    Memory region.
> + * @frpl:  Fast registration page list.
> + */
> +struct srp_fr_desc {
> +	struct list_head		entry;
> +	struct ib_mr			*mr;
> +	struct ib_fast_reg_page_list	*frpl;
> +};
> +
> +/**
> + * struct srp_fr_pool - pool of fast registration descriptors
> + *
> + * An entry is available for allocation if and only if it occurs in @free_list.
> + *
> + * @size:      Number of descriptors in this pool.
> + * @max_page_list_len: Maximum fast registration work request page list length.
> + * @lock:      Protects free_list.
> + * @free_list: List of free descriptors.
> + * @desc:      Fast registration descriptor pool.
> + */
> +struct srp_fr_pool {
> +	int			size;
> +	int			max_page_list_len;
> +	spinlock_t		lock;
> +	struct list_head	free_list;
> +	struct srp_fr_desc	desc[0];
> +};
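The pool above hands out descriptors through free_list; per the kdoc, an entry is available for allocation if and only if it sits on that list. A single-threaded userspace model of the same idea (the kernel version guards the list with pool->lock and uses list_head; everything here is an illustrative stand-in):

```c
#include <assert.h>
#include <stdlib.h>

/* Userspace model of srp_fr_pool: descriptors live in a flexible array
 * member and circulate through a free list. No locking, since this
 * model is single-threaded. */
struct fr_desc {
	struct fr_desc *next;	/* stands in for the list_head entry */
};

struct fr_pool {
	int size;
	struct fr_desc *free_list;
	struct fr_desc desc[];	/* flexible array, as in the patch */
};

static struct fr_pool *fr_pool_create(int size)
{
	struct fr_pool *pool;
	int i;

	pool = calloc(1, sizeof(*pool) + size * sizeof(struct fr_desc));
	if (!pool)
		return NULL;
	pool->size = size;
	for (i = 0; i < size; i++) {	/* push every descriptor */
		pool->desc[i].next = pool->free_list;
		pool->free_list = &pool->desc[i];
	}
	return pool;
}

static struct fr_desc *fr_pool_get(struct fr_pool *pool)
{
	struct fr_desc *d = pool->free_list;

	if (d)
		pool->free_list = d->next;	/* NULL when exhausted */
	return d;
}

static void fr_pool_put(struct fr_pool *pool, struct fr_desc *d)
{
	d->next = pool->free_list;
	pool->free_list = d;
}
```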
> +
> +/**
> + * struct srp_map_state - per-request DMA memory mapping state
> + * @desc:	    Pointer to the element of the SRP buffer descriptor array
> + *		    that is being filled in.
> + * @pages:	    Array with DMA addresses of pages being considered for
> + *		    memory registration.
> + * @base_dma_addr:  DMA address of the first page that has not yet been mapped.
> + * @dma_len:	    Number of bytes that will be registered with the next
> + *		    FMR or FR memory registration call.
> + * @total_len:	    Total number of bytes in the sg-list being mapped.
> + * @npages:	    Number of page addresses in the pages[] array.
> + * @nmdesc:	    Number of FMR or FR memory descriptors used for mapping.
> + * @ndesc:	    Number of SRP buffer descriptors that have been filled in.
> + * @unmapped_sg:    First element of the sg-list that is mapped via FMR or FR.
> + * @unmapped_index: Index of the first element mapped via FMR or FR.
> + * @unmapped_addr:  DMA address of the first element mapped via FMR or FR.
> + */
>   struct srp_map_state {
> -	struct ib_pool_fmr    **next_fmr;
> +	union {
> +		struct ib_pool_fmr **next_fmr;
> +		struct srp_fr_desc **next_fr;
> +	};
>   	struct srp_direct_buf  *desc;
>   	u64		       *pages;
>   	dma_addr_t		base_dma_addr;
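The kdoc above describes the bookkeeping srp_map_state performs while walking an sg-list: page DMA addresses accumulate in pages[] until a registration is issued, which bumps nmdesc. A minimal userspace model of that accumulate-and-flush loop (MAX_PAGES and the helper names are illustrative; the real driver flushes through its mapping-finish helper, which posts an FMR or FR registration):

```c
#include <assert.h>
#include <stdint.h>

enum { MAX_PAGES = 4 };	/* stands in for the HCA page-list limit */

/* Model of the npages/nmdesc bookkeeping only: pages are appended until
 * the array fills, then the pending registration is "finished". */
struct map_state {
	uint64_t pages[MAX_PAGES];
	unsigned int npages;	/* pages collected but not yet registered */
	unsigned int nmdesc;	/* memory registrations issued so far */
};

static void finish_mapping(struct map_state *state)
{
	if (state->npages) {
		state->nmdesc++;	/* one FMR/FR per flushed page set */
		state->npages = 0;
	}
}

static void map_page(struct map_state *state, uint64_t dma_addr)
{
	if (state->npages == MAX_PAGES)
		finish_mapping(state);
	state->pages[state->npages++] = dma_addr;
}
```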

--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH 2/9] IB/srp: Introduce an additional local variable
       [not found]         ` <536A0CF7.7080307-LDSdmyG8hGV8YrgS2mwiifqBs+8SCbDb@public.gmane.org>
@ 2014-05-07 13:52           ` Bart Van Assche
       [not found]             ` <536A3AA4.5010100-HInyCGIudOg@public.gmane.org>
  0 siblings, 1 reply; 32+ messages in thread
From: Bart Van Assche @ 2014-05-07 13:52 UTC (permalink / raw)
  To: Sagi Grimberg, Roland Dreier
  Cc: Sagi Grimberg, Vu Pham, David Dillow, Sebastian Parschauer, linux-rdma

On 05/07/14 12:37, Sagi Grimberg wrote:
> On 5/6/2014 3:51 PM, Bart Van Assche wrote:
>> This patch does not change any functionality.
>>
>> Signed-off-by: Bart Van Assche <bvanassche-HInyCGIudOg@public.gmane.org>
>> Cc: Roland Dreier <roland-BHEL68pLQRGGvPXPguhicg@public.gmane.org>
>> Cc: David Dillow <dave-i1Mk8JYDVaaSihdK6806/g@public.gmane.org>
>> Cc: Sagi Grimberg <sagig-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
>> Cc: Vu Pham <vu-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
>> Cc: Sebastian Parschauer <sebastian.riemer-EIkl63zCoXaH+58JC4qpiA@public.gmane.org>
>> ---
>>   drivers/infiniband/ulp/srp/ib_srp.c | 9 ++++-----
>>   1 file changed, 4 insertions(+), 5 deletions(-)
>>
>> diff --git a/drivers/infiniband/ulp/srp/ib_srp.c
>> b/drivers/infiniband/ulp/srp/ib_srp.c
>> index cf80f7a..8c03371 100644
>> --- a/drivers/infiniband/ulp/srp/ib_srp.c
>> +++ b/drivers/infiniband/ulp/srp/ib_srp.c
>> @@ -290,6 +290,7 @@ static int srp_new_cm_id(struct srp_target_port
>> *target)
>>     static int srp_create_target_ib(struct srp_target_port *target)
>>   {
>> +    struct srp_device *dev = target->srp_host->srp_dev;
>>       struct ib_qp_init_attr *init_attr;
>>       struct ib_cq *recv_cq, *send_cq;
>>       struct ib_qp *qp;
>> @@ -299,16 +300,14 @@ static int srp_create_target_ib(struct
>> srp_target_port *target)
>>       if (!init_attr)
>>           return -ENOMEM;
>>   -    recv_cq = ib_create_cq(target->srp_host->srp_dev->dev,
>> -                   srp_recv_completion, NULL, target,
>> +    recv_cq = ib_create_cq(dev->dev, srp_recv_completion, NULL, target,
>>                      target->queue_size, target->comp_vector);
>>       if (IS_ERR(recv_cq)) {
>>           ret = PTR_ERR(recv_cq);
>>           goto err;
>>       }
>>   -    send_cq = ib_create_cq(target->srp_host->srp_dev->dev,
>> -                   srp_send_completion, NULL, target,
>> +    send_cq = ib_create_cq(dev->dev, srp_send_completion, NULL, target,
>>                      target->queue_size, target->comp_vector);
>>       if (IS_ERR(send_cq)) {
>>           ret = PTR_ERR(send_cq);
>> @@ -327,7 +326,7 @@ static int srp_create_target_ib(struct
>> srp_target_port *target)
>>       init_attr->send_cq             = send_cq;
>>       init_attr->recv_cq             = recv_cq;
>>   -    qp = ib_create_qp(target->srp_host->srp_dev->pd, init_attr);
>> +    qp = ib_create_qp(dev->pd, init_attr);
>>       if (IS_ERR(qp)) {
>>           ret = PTR_ERR(qp);
>>           goto err_send_cq;
> 
> I understand why you need it later, but I'm not sure that use_fastreg
> should be a device attribute.

Hello Sagi,

Can you clarify this comment? The use_fast_reg member variable is only
introduced in patch 9/9, so I'm not sure why a comment about that member
variable has been made on patch 2/9.

Bart.


* Re: [PATCH 6/9] IB/srp: Make srp_alloc_req_data() reallocate request data
       [not found]         ` <536A0FF7.7050608-LDSdmyG8hGV8YrgS2mwiifqBs+8SCbDb@public.gmane.org>
@ 2014-05-07 14:11           ` Bart Van Assche
       [not found]             ` <536A3F08.7030108-HInyCGIudOg@public.gmane.org>
  0 siblings, 1 reply; 32+ messages in thread
From: Bart Van Assche @ 2014-05-07 14:11 UTC (permalink / raw)
  To: Sagi Grimberg, Roland Dreier
  Cc: Sagi Grimberg, Vu Pham, David Dillow, Sebastian Parschauer, linux-rdma

On 05/07/14 12:50, Sagi Grimberg wrote:
> On 5/6/2014 3:54 PM, Bart Van Assche wrote:
>> -static void srp_free_req_data(struct srp_target_port *target)
>> +static void srp_free_req_data(struct srp_target_port *target,
>> +                  struct srp_request *req_ring)
>>   {
>>       struct ib_device *ibdev = target->srp_host->srp_dev->dev;
>>       struct srp_request *req;
>>       int i;
>>   -    if (!target->req_ring)
>> +    if (!req_ring)
>>           return;
>>         for (i = 0; i < target->req_ring_size; ++i) {
>> -        req = &target->req_ring[i];
>> +        req = &req_ring[i];
> 
> You loop for {ring A size} and operates on ring B elements. They will
> probably be the same but the notion seems buggy.
> Will it be better to untie this routine from srp_target_port at all?

Hello Sagi,

Had you noticed that target->req_ring_size is not modified during
resource allocation or reallocation? That is why target->req_ring_size
has not been converted into a function argument in this patch.

Bart.

* Re: [PATCH 9/9] IB/srp: Add fast registration support
       [not found]         ` <536A1A3A.9090208-LDSdmyG8hGV8YrgS2mwiifqBs+8SCbDb@public.gmane.org>
@ 2014-05-07 14:59           ` Bart Van Assche
       [not found]             ` <536A4A68.4080605-HInyCGIudOg@public.gmane.org>
  0 siblings, 1 reply; 32+ messages in thread
From: Bart Van Assche @ 2014-05-07 14:59 UTC (permalink / raw)
  To: Sagi Grimberg, Roland Dreier
  Cc: Sagi Grimberg, Vu Pham, David Dillow, Sebastian Parschauer, linux-rdma

On 05/07/14 13:34, Sagi Grimberg wrote:
>> +module_param(prefer_fr, bool, 0444);
>> +MODULE_PARM_DESC(prefer_fr,
>> +         "Whether to use FR if both FMR and FR are supported");
> 
> Will it be better to make it a per-target attribute?

Hello Sagi,

The only reason I introduced this kernel module parameter was to allow
me to test fast registration support on HCAs that support both FMR and
FR. I'm not sure this parameter is useful for end users to configure.
Hence my preference to leave it global instead of converting it into a
per-target attribute.

>> +module_param(register_always, bool, 0444);
>> +MODULE_PARM_DESC(register_always,
>> +         "Use memory registration even for contiguous memory regions");
>> +
> 
> This is a separate patch not related to fast registration support, I
> suggest to post it as patch #10

OK, I will split this out into a separate patch.

>> +static struct srp_fr_pool *srp_alloc_fr_pool(struct srp_target_port
>> *target)
>> +{
>> +    struct srp_device *dev = target->srp_host->srp_dev;
>> +    struct srp_fr_pool *pool;
>> +    int max_pages_per_mr;
>> +
>> +    for (max_pages_per_mr = SRP_MAX_PAGES_PER_MR;
>> +         max_pages_per_mr >= SRP_MIN_PAGES_PER_MR;
>> +         max_pages_per_mr /= 2) {
> 
> Isn't there some device capability for that? maybe max_mr_size?
> Also for smaller max_pages you won't be able to handle 512 pages in a
> single cmnd (you would need 2 FRMRs).
> At least a warning print?

I will look into using max_mr_size here. Using multiple registrations
per command should be fine in the context of the SRP protocol as long as
cmd_sg_entries has not been set to a very small value.

>> +        pool = srp_create_fr_pool(dev->dev, dev->pd,
>> +                      SRP_MDESC_PER_POOL, max_pages_per_mr);
> 
> 1024 FRMRs per connection?! we use 1024 FMRs for all connections
> (per-device). I'd say that's a major over-allocation.

It depends on how many discontiguous I/O requests are submitted
concurrently. Anyway, how about limiting the number of memory regions to
the queue size?

>> -    init_attr->sq_sig_type         = IB_SIGNAL_ALL_WR;
>> +    init_attr->sq_sig_type         = IB_SIGNAL_REQ_WR;
> 
> Will it be better to do this in a preparation patch? (not so important
> to me)

This patch is the first patch for the SRP initiator in which work
requests are queued for which the IB_SEND_SIGNALED flag is not set. That
is why I included the above change in this patch.

>> @@ -898,20 +1112,25 @@ static int srp_rport_reconnect(struct srp_rport
>> *rport)
>>        * callbacks will have finished before a new QP is allocated.
>>        */
>>       ret = srp_new_cm_id(target);
>> +
>> +    for (i = 0; i < target->req_ring_size; ++i) {
>> +        struct srp_request *req = &target->req_ring[i];
>> +        srp_finish_req(target, req, NULL, DID_RESET << 16);
>> +    }
> 
> Can you explain why this moved here? I'm not sure I understand.

In fast registration mode calling srp_finish_req() can trigger rkey
invalidation. Trying to invalidate an rkey after the queue pair has been
destroyed and recreated would be wrong since that would cause the new
queue pair to transition into the error state. That is why this loop has
been moved up. Moving the srp_finish_req() loop up is safe since
srp_claim_req() ensures that any work completions received after
srp_finish_req() has finished do not cause any harm.

>> +static int srp_map_finish_fr(struct srp_map_state *state,
>> +                 struct srp_target_port *target)
>> +{
>> +    struct srp_device *dev = target->srp_host->srp_dev;
>> +    struct ib_send_wr *bad_wr;
>> +    struct ib_send_wr wr;
>> +    struct srp_fr_desc *desc;
>> +    u32 rkey;
>> +
>> +    desc = srp_fr_pool_get(target->fr_pool);
>> +    if (!desc)
>> +        return -ENOMEM;
>> +
>> +    rkey = ib_inc_rkey(desc->mr->rkey);
>> +    ib_update_fast_reg_key(desc->mr, rkey);
> 
> Usually the inc_rkey is done right after the invalidation (but that's
> not really an issue).
> Maybe it is better here in case the target did remote invalidate (not
> supported at the moment but maybe in the future?)

Indeed, if remote invalidation were added later on, the code for
incrementing the rkey would have to be moved anyway.
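For readers unfamiliar with the key layout: only the low 8 bits of an rkey are the consumer-owned key portion, and that is what ib_inc_rkey() bumps so that a stale remote access still carrying the old key is rejected by the HCA. A self-contained sketch of that increment (inc_rkey here is an illustrative stand-in, not the kernel helper itself):

```c
#include <assert.h>
#include <stdint.h>

/* Bump only the low 8 "consumer key" bits of an rkey, leaving the
 * HCA-owned index bits untouched; wraps from 0xff back to 0x00. */
static uint32_t inc_rkey(uint32_t rkey)
{
	const uint32_t mask = 0x000000ff;

	return ((rkey + 1) & mask) | (rkey & ~mask);
}
```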

>> @@ -2910,7 +3213,8 @@ static void srp_add_one(struct ib_device *device)
>>       if (IS_ERR(srp_dev->mr))
>>           goto err_pd;
>>   -    srp_alloc_fmr_pool(srp_dev);
>> +    if (!srp_dev->use_fast_reg)
>> +        srp_alloc_fmr_pool(srp_dev);
> 
> So I ask here, are we OK with this asymmetry? FMRs are per-device and
> FRMRs are per-target?
> I have a feeling that converting FMRs to per-connection might simplify
> things, what do you think?

Making the FMR pools per-device instead of per-target would cause the
FMR and FR pool allocation and deallocation code to appear in the same
functions but unfortunately wouldn't make this patch shorter. My goal
with this patch was to introduce fast registration support without
changing the FMR code too much.

Bart.

* Re: [PATCH 6/9] IB/srp: Make srp_alloc_req_data() reallocate request data
       [not found]             ` <536A3F08.7030108-HInyCGIudOg@public.gmane.org>
@ 2014-05-07 16:58               ` Sagi Grimberg
  0 siblings, 0 replies; 32+ messages in thread
From: Sagi Grimberg @ 2014-05-07 16:58 UTC (permalink / raw)
  To: Bart Van Assche, Roland Dreier
  Cc: Sagi Grimberg, Vu Pham, David Dillow, Sebastian Parschauer, linux-rdma

On 5/7/2014 5:11 PM, Bart Van Assche wrote:
> On 05/07/14 12:50, Sagi Grimberg wrote:
>> On 5/6/2014 3:54 PM, Bart Van Assche wrote:
>>> -static void srp_free_req_data(struct srp_target_port *target)
>>> +static void srp_free_req_data(struct srp_target_port *target,
>>> +                  struct srp_request *req_ring)
>>>    {
>>>        struct ib_device *ibdev = target->srp_host->srp_dev->dev;
>>>        struct srp_request *req;
>>>        int i;
>>>    -    if (!target->req_ring)
>>> +    if (!req_ring)
>>>            return;
>>>          for (i = 0; i < target->req_ring_size; ++i) {
>>> -        req = &target->req_ring[i];
>>> +        req = &req_ring[i];
>> You loop for {ring A size} and operates on ring B elements. They will
>> probably be the same but the notion seems buggy.
>> Will it be better to untie this routine from srp_target_port at all?
> Hello Sagi,
>
> Had you noticed that target->req_ring_size is not modified during
> resource allocation or reallocation ? That is why target->req_ring_size
> has not been converted into a function argument in this patch.

Yes, I guess I'm fine with that.

Reviewed-by: Sagi Grimberg <sagig-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>


* Re: [PATCH 2/9] IB/srp: Introduce an additional local variable
       [not found]             ` <536A3AA4.5010100-HInyCGIudOg@public.gmane.org>
@ 2014-05-07 16:58               ` Sagi Grimberg
  0 siblings, 0 replies; 32+ messages in thread
From: Sagi Grimberg @ 2014-05-07 16:58 UTC (permalink / raw)
  To: Bart Van Assche, Roland Dreier
  Cc: Sagi Grimberg, Vu Pham, David Dillow, Sebastian Parschauer, linux-rdma

On 5/7/2014 4:52 PM, Bart Van Assche wrote:
> On 05/07/14 12:37, Sagi Grimberg wrote:
>> On 5/6/2014 3:51 PM, Bart Van Assche wrote:
>>> This patch does not change any functionality.
>>>
>>> Signed-off-by: Bart Van Assche <bvanassche-HInyCGIudOg@public.gmane.org>
>>> Cc: Roland Dreier <roland-BHEL68pLQRGGvPXPguhicg@public.gmane.org>
>>> Cc: David Dillow <dave-i1Mk8JYDVaaSihdK6806/g@public.gmane.org>
>>> Cc: Sagi Grimberg <sagig-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
>>> Cc: Vu Pham <vu-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
>>> Cc: Sebastian Parschauer <sebastian.riemer-EIkl63zCoXaH+58JC4qpiA@public.gmane.org>
>>> ---
>>>    drivers/infiniband/ulp/srp/ib_srp.c | 9 ++++-----
>>>    1 file changed, 4 insertions(+), 5 deletions(-)
>>>
>>> diff --git a/drivers/infiniband/ulp/srp/ib_srp.c
>>> b/drivers/infiniband/ulp/srp/ib_srp.c
>>> index cf80f7a..8c03371 100644
>>> --- a/drivers/infiniband/ulp/srp/ib_srp.c
>>> +++ b/drivers/infiniband/ulp/srp/ib_srp.c
>>> @@ -290,6 +290,7 @@ static int srp_new_cm_id(struct srp_target_port
>>> *target)
>>>      static int srp_create_target_ib(struct srp_target_port *target)
>>>    {
>>> +    struct srp_device *dev = target->srp_host->srp_dev;
>>>        struct ib_qp_init_attr *init_attr;
>>>        struct ib_cq *recv_cq, *send_cq;
>>>        struct ib_qp *qp;
>>> @@ -299,16 +300,14 @@ static int srp_create_target_ib(struct
>>> srp_target_port *target)
>>>        if (!init_attr)
>>>            return -ENOMEM;
>>>    -    recv_cq = ib_create_cq(target->srp_host->srp_dev->dev,
>>> -                   srp_recv_completion, NULL, target,
>>> +    recv_cq = ib_create_cq(dev->dev, srp_recv_completion, NULL, target,
>>>                       target->queue_size, target->comp_vector);
>>>        if (IS_ERR(recv_cq)) {
>>>            ret = PTR_ERR(recv_cq);
>>>            goto err;
>>>        }
>>>    -    send_cq = ib_create_cq(target->srp_host->srp_dev->dev,
>>> -                   srp_send_completion, NULL, target,
>>> +    send_cq = ib_create_cq(dev->dev, srp_send_completion, NULL, target,
>>>                       target->queue_size, target->comp_vector);
>>>        if (IS_ERR(send_cq)) {
>>>            ret = PTR_ERR(send_cq);
>>> @@ -327,7 +326,7 @@ static int srp_create_target_ib(struct
>>> srp_target_port *target)
>>>        init_attr->send_cq             = send_cq;
>>>        init_attr->recv_cq             = recv_cq;
>>>    -    qp = ib_create_qp(target->srp_host->srp_dev->pd, init_attr);
>>> +    qp = ib_create_qp(dev->pd, init_attr);
>>>        if (IS_ERR(qp)) {
>>>            ret = PTR_ERR(qp);
>>>            goto err_send_cq;
>> I understand why you need it later, but I'm not sure that use_fastreg
>> should be a device attribute.
> Hello Sagi,
>
> Can you clarify this comment ? The use_fast_reg member variable is
> introduced in patch 9/9, so I'm not sure why a comment about that member
> variable is made on patch 2/9 ?

I was referring to the reason why you added the local variable dev - it
is for use_fastreg, right?
Since I agree with you that use_fastreg can be a per-device attribute,
this comment is irrelevant.

So,
Reviewed-by: Sagi Grimberg <sagig-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>

Sagi.


* Re: [PATCH 9/9] IB/srp: Add fast registration support
       [not found]             ` <536A4A68.4080605-HInyCGIudOg@public.gmane.org>
@ 2014-05-07 17:19               ` Sagi Grimberg
       [not found]                 ` <536A6B2F.70807-LDSdmyG8hGV8YrgS2mwiifqBs+8SCbDb@public.gmane.org>
  0 siblings, 1 reply; 32+ messages in thread
From: Sagi Grimberg @ 2014-05-07 17:19 UTC (permalink / raw)
  To: Bart Van Assche, Roland Dreier
  Cc: Sagi Grimberg, Vu Pham, David Dillow, Sebastian Parschauer, linux-rdma

On 5/7/2014 5:59 PM, Bart Van Assche wrote:
> On 05/07/14 13:34, Sagi Grimberg wrote:
>>> +module_param(prefer_fr, bool, 0444);
>>> +MODULE_PARM_DESC(prefer_fr,
>>> +         "Whether to use FR if both FMR and FR are supported");
>> Will it be better to make it a per-target attribute?
> Hello Sagi,
>
> The only reason I introduced this kernel module parameter was to allow
> me to test fast registration support on HCA's that support both FMR and
> FR. I'm not sure it is useful to end users to configure this parameter.
> Hence my preference to leave it global instead of converting it into a
> per-target attribute.

If you add a modparam, users are allowed to use it.

AFAICT a user could set prefer_fr on a station with one device that
supports FR and another device that doesn't (I don't know if such a
setup even exists). If the attribute were per-target, the conditions in
srp_add_one would still be correct per-device.

>>> +module_param(register_always, bool, 0444);
>>> +MODULE_PARM_DESC(register_always,
>>> +         "Use memory registration even for contiguous memory regions");
>>> +
>> This is a separate patch not related to fast registration support, I
>> suggest to post it as patch #10
> OK, I will split this out into a separate patch.
>
>>> +static struct srp_fr_pool *srp_alloc_fr_pool(struct srp_target_port
>>> *target)
>>> +{
>>> +    struct srp_device *dev = target->srp_host->srp_dev;
>>> +    struct srp_fr_pool *pool;
>>> +    int max_pages_per_mr;
>>> +
>>> +    for (max_pages_per_mr = SRP_MAX_PAGES_PER_MR;
>>> +         max_pages_per_mr >= SRP_MIN_PAGES_PER_MR;
>>> +         max_pages_per_mr /= 2) {
>> Isn't there some device capability for that? maybe max_mr_size?
>> Also for smaller max_pages you won't be able to handle 512 pages in a
>> single cmnd (you would need 2 FRMRs).
>> At least a warning print?
> I will look into using max_mr_size here. Using multiple registrations
> per command should be fine in the context of the SRP protocol as long as
> cmd_sg_entries has not been set to a very small value.
>
>>> +        pool = srp_create_fr_pool(dev->dev, dev->pd,
>>> +                      SRP_MDESC_PER_POOL, max_pages_per_mr);
>> 1024 FRMRs per connection?! we use 1024 FMRs for all connections
>> (per-device). I'd say that's a major over-allocation.
> It depends on how many discontiguous I/O requests are submitted
> concurrently. Anyway, how about limiting the number of memory regions to
> the queue size ?

Perhaps, but we would need to reserve some more for discontiguous I/Os.
It is a heuristic: for a filesystem the I/O will align nicely in most
cases; for some crazy DB applications I'm not sure.

>
>>> -    init_attr->sq_sig_type         = IB_SIGNAL_ALL_WR;
>>> +    init_attr->sq_sig_type         = IB_SIGNAL_REQ_WR;
>> Will it be better to do this in a preparation patch? (not so important
>> to me)
> This patch is the first patch for the SRP initiator in which work
> requests are queued for which the IB_SEND_SIGNALED flag is not set. That
> is why I had included the above change in this patch.

Agreed.
P.S. Can we also consider skipping scsi_cmnd SEND completions?
Currently SRP asks for a completion but never really arms the CQ (only
the first time).

>>> @@ -898,20 +1112,25 @@ static int srp_rport_reconnect(struct srp_rport
>>> *rport)
>>>         * callbacks will have finished before a new QP is allocated.
>>>         */
>>>        ret = srp_new_cm_id(target);
>>> +
>>> +    for (i = 0; i < target->req_ring_size; ++i) {
>>> +        struct srp_request *req = &target->req_ring[i];
>>> +        srp_finish_req(target, req, NULL, DID_RESET << 16);
>>> +    }
>> Can you explain why this moved here? I'm not sure I understand.
> In fast registration mode calling srp_finish_req() can trigger rkey
> invalidation. Trying to invalidate an rkey after the queue pair has been
> destroyed and recreated would be wrong since that would cause the new
> queue pair to transition into the error state. That is why this loop has
> been moved up. Moving the srp_finish_req() loop up is safe since
> srp_claim_req() prevents that any work completions that are received
> after srp_finish_req() has finished cause any harm.

Got it.

>
>>> +static int srp_map_finish_fr(struct srp_map_state *state,
>>> +                 struct srp_target_port *target)
>>> +{
>>> +    struct srp_device *dev = target->srp_host->srp_dev;
>>> +    struct ib_send_wr *bad_wr;
>>> +    struct ib_send_wr wr;
>>> +    struct srp_fr_desc *desc;
>>> +    u32 rkey;
>>> +
>>> +    desc = srp_fr_pool_get(target->fr_pool);
>>> +    if (!desc)
>>> +        return -ENOMEM;
>>> +
>>> +    rkey = ib_inc_rkey(desc->mr->rkey);
>>> +    ib_update_fast_reg_key(desc->mr, rkey);
>> Usually the inc_rkey is done right after the invalidation (but that's
>> not really an issue).
>> Maybe it is better here in case the target did remote invalidate (not
>> supported at the moment but maybe in the future?)
> Indeed - if remote invalidation would be added later on the code for
> incrementing the rkey would have to be moved anyway.
>
>>> @@ -2910,7 +3213,8 @@ static void srp_add_one(struct ib_device *device)
>>>        if (IS_ERR(srp_dev->mr))
>>>            goto err_pd;
>>>    -    srp_alloc_fmr_pool(srp_dev);
>>> +    if (!srp_dev->use_fast_reg)
>>> +        srp_alloc_fmr_pool(srp_dev);
>> So I ask here, are we OK with this asymmetry? FMRs are per-device and
>> FRMRs are per-target?
>> I have a feeling that converting FMRs to per-connection might simplify
>> things, what do you think?
> Making the FMR pools per-device instead of per-target would cause the
> FMR and FR pool allocation and deallocation code to appear in the same
> functions but unfortunately wouldn't make this patch shorter. My goal
> with this patch was to introduce fast registration support without
> changing the FMR code too much.

I understand; my concern is that with this asymmetric design we may get
different behaviors for the same workload.
For example, when revisiting the configurable queue_size work in SRP, I
pointed out that with a large number of connections with long queues all
stressing the system, the global FMR pool may not suffice.
This may appear to the user as periodic I/O stalls. That will not happen
with per-target FRMR pools (or is at least less likely to happen, or
will happen under different conditions).

Sagi.

* Re: [PATCH 9/9] IB/srp: Add fast registration support
       [not found]                 ` <536A6B2F.70807-LDSdmyG8hGV8YrgS2mwiifqBs+8SCbDb@public.gmane.org>
@ 2014-05-08 12:38                   ` Bart Van Assche
       [not found]                     ` <536B7ABE.8080502-HInyCGIudOg@public.gmane.org>
  2014-05-12 13:21                   ` Bart Van Assche
  1 sibling, 1 reply; 32+ messages in thread
From: Bart Van Assche @ 2014-05-08 12:38 UTC (permalink / raw)
  To: Sagi Grimberg, Roland Dreier
  Cc: Sagi Grimberg, Vu Pham, David Dillow, Sebastian Parschauer, linux-rdma

On 05/07/14 19:19, Sagi Grimberg wrote:
> On 5/7/2014 5:59 PM, Bart Van Assche wrote:
>> On 05/07/14 13:34, Sagi Grimberg wrote:
>>>> +        pool = srp_create_fr_pool(dev->dev, dev->pd,
>>>> +                      SRP_MDESC_PER_POOL, max_pages_per_mr);
>>> 1024 FRMRs per connection?! we use 1024 FMRs for all connections
>>> (per-device). I'd say that's a major over-allocation.
>> It depends on how many discontiguous I/O requests are submitted
>> concurrently. Anyway, how about limiting the number of memory regions to
>> the queue size ?
> 
> Perhaps, but we will need to reserve some more for discontinuous IOs.
> It is heuristic, for FS in most cases IO will align nicely, for some
> crazy DB applications - I'm not sure.

If srp_map_sg() runs out of memory descriptors it adds the remaining sg
entries to the SRP indirect descriptor without using memory
registration. In other words, I think a queue full of discontiguous I/O
requests should be handled properly by srp_map_sg().

Bart.


* Re: [PATCH 9/9] IB/srp: Add fast registration support
       [not found]                     ` <536B7ABE.8080502-HInyCGIudOg@public.gmane.org>
@ 2014-05-08 15:16                       ` Sagi Grimberg
       [not found]                         ` <536B9FB4.5040009-LDSdmyG8hGV8YrgS2mwiifqBs+8SCbDb@public.gmane.org>
  0 siblings, 1 reply; 32+ messages in thread
From: Sagi Grimberg @ 2014-05-08 15:16 UTC (permalink / raw)
  To: Bart Van Assche, Roland Dreier
  Cc: Sagi Grimberg, Vu Pham, David Dillow, Sebastian Parschauer, linux-rdma

On 5/8/2014 3:38 PM, Bart Van Assche wrote:
> On 05/07/14 19:19, Sagi Grimberg wrote:
>> On 5/7/2014 5:59 PM, Bart Van Assche wrote:
>>> On 05/07/14 13:34, Sagi Grimberg wrote:
>>>>> +        pool = srp_create_fr_pool(dev->dev, dev->pd,
>>>>> +                      SRP_MDESC_PER_POOL, max_pages_per_mr);
>>>> 1024 FRMRs per connection?! we use 1024 FMRs for all connections
>>>> (per-device). I'd say that's a major over-allocation.
>>> It depends on how many discontiguous I/O requests are submitted
>>> concurrently. Anyway, how about limiting the number of memory regions to
>>> the queue size ?
>> Perhaps, but we will need to reserve some more for discontinuous IOs.
>> It is heuristic, for FS in most cases IO will align nicely, for some
>> crazy DB applications - I'm not sure.
> If srp_map_sg() runs out of memory descriptors it adds the remaining sg
> entries to the SRP indirect descriptor without using memory
> registration. In other words, I think a queue full of discontiguous I/O
> requests should be handled properly by srp_map_sg().

True, SRP can do without registration altogether by passing multiple 
SGEs; we still strive to register memory, though.

Would you consider making it configurable, with a default of queue_size?
This way a user who is (hopefully) aware of their typical I/O pattern 
can choose a suitable value.

Sagi.


* Re: [PATCH 9/9] IB/srp: Add fast registration support
       [not found]                         ` <536B9FB4.5040009-LDSdmyG8hGV8YrgS2mwiifqBs+8SCbDb@public.gmane.org>
@ 2014-05-08 16:00                           ` Bart Van Assche
       [not found]                             ` <536BAA13.3050905-HInyCGIudOg@public.gmane.org>
  0 siblings, 1 reply; 32+ messages in thread
From: Bart Van Assche @ 2014-05-08 16:00 UTC (permalink / raw)
  To: Sagi Grimberg, Roland Dreier
  Cc: Sagi Grimberg, Vu Pham, David Dillow, Sebastian Parschauer, linux-rdma

On 05/08/14 17:16, Sagi Grimberg wrote:
> On 5/8/2014 3:38 PM, Bart Van Assche wrote:
>> On 05/07/14 19:19, Sagi Grimberg wrote:
>>> On 5/7/2014 5:59 PM, Bart Van Assche wrote:
>>>> On 05/07/14 13:34, Sagi Grimberg wrote:
>>>>>> +        pool = srp_create_fr_pool(dev->dev, dev->pd,
>>>>>> +                      SRP_MDESC_PER_POOL, max_pages_per_mr);
>>>>> 1024 FRMRs per connection?! we use 1024 FMRs for all connections
>>>>> (per-device). I'd say that's a major over-allocation.
>>>> It depends on how many discontiguous I/O requests are submitted
>>>> concurrently. Anyway, how about limiting the number of memory
>>>> regions to
>>>> the queue size ?
>>> Perhaps, but we will need to reserve some more for discontinuous IOs.
>>> It is heuristic, for FS in most cases IO will align nicely, for some
>>> crazy DB applications - I'm not sure.
>> If srp_map_sg() runs out of memory descriptors it adds the remaining sg
>> entries to the SRP indirect descriptor without using memory
>> registration. In other words, I think a queue full of discontiguous I/O
>> requests should be handled properly by srp_map_sg().
> 
> True, SRP can do without registration altogether by passing multiple
> SGEs, we still strive
> to register memory though.
> 
> Would you consider making it configurable with default to queue_size?
> This way the user which is (hopefully) aware of it's typical IO pattern
> can choose a suitable value.

I don't think another configuration parameter is needed. The SRP
initiator already allows configuring the queue size. If an application
queues many I/O requests that cannot be registered via a single memory
descriptor and the SRP initiator runs out of memory descriptors
temporarily then srp_queuecommand() will return SCSI_MLQUEUE_HOST_BUSY.
If we change that behavior such that these commands are finished with
e.g. result code (DID_OK << 16) | (QUEUE_FULL << 1) then the SCSI
mid-layer will use that information to decrease the queue depth
dynamically via srp_change_queue_depth(). See e.g.
scsi_decide_disposition() in drivers/scsi/scsi_error.c.

Bart.



* Re: [PATCH 9/9] IB/srp: Add fast registration support
       [not found]                             ` <536BAA13.3050905-HInyCGIudOg@public.gmane.org>
@ 2014-05-08 16:18                               ` Sagi Grimberg
  0 siblings, 0 replies; 32+ messages in thread
From: Sagi Grimberg @ 2014-05-08 16:18 UTC (permalink / raw)
  To: Bart Van Assche, Roland Dreier
  Cc: Sagi Grimberg, Vu Pham, David Dillow, Sebastian Parschauer, linux-rdma

On 5/8/2014 7:00 PM, Bart Van Assche wrote:
> On 05/08/14 17:16, Sagi Grimberg wrote:
>> On 5/8/2014 3:38 PM, Bart Van Assche wrote:
>>> On 05/07/14 19:19, Sagi Grimberg wrote:
>>>> On 5/7/2014 5:59 PM, Bart Van Assche wrote:
>>>>> On 05/07/14 13:34, Sagi Grimberg wrote:
>>>>>>> +        pool = srp_create_fr_pool(dev->dev, dev->pd,
>>>>>>> +                      SRP_MDESC_PER_POOL, max_pages_per_mr);
>>>>>> 1024 FRMRs per connection?! we use 1024 FMRs for all connections
>>>>>> (per-device). I'd say that's a major over-allocation.
>>>>> It depends on how many discontiguous I/O requests are submitted
>>>>> concurrently. Anyway, how about limiting the number of memory
>>>>> regions to
>>>>> the queue size ?
>>>> Perhaps, but we will need to reserve some more for discontinuous IOs.
>>>> It is heuristic, for FS in most cases IO will align nicely, for some
>>>> crazy DB applications - I'm not sure.
>>> If srp_map_sg() runs out of memory descriptors it adds the remaining sg
>>> entries to the SRP indirect descriptor without using memory
>>> registration. In other words, I think a queue full of discontiguous I/O
>>> requests should be handled properly by srp_map_sg().
>> True, SRP can do without registration altogether by passing multiple
>> SGEs, we still strive
>> to register memory though.
>>
>> Would you consider making it configurable with default to queue_size?
>> This way the user which is (hopefully) aware of it's typical IO pattern
>> can choose a suitable value.
> I don't think another configuration parameter is needed. The SRP
> initiator already allows to configure the queue size. If an application
> queues many I/O requests that cannot be registered via a single memory
> descriptor and the SRP initiator runs out of memory descriptors
> temporarily then srp_queuecommand() will return SCSI_MLQUEUE_HOST_BUSY.
> If we change that behavior such that these commands are finished with
> e.g. result code (DID_OK << 16) | (QUEUE_FULL << 1) then the SCSI
> mid-layer will use that information to decrease the queue depth
> dynamically via srp_change_queue_depth(). See e.g.
> scsi_decide_disposition() in drivers/scsi/scsi_error.c.

I see, I'm fine with that. Let's set it to queue_size.

Sagi.


* Re: [PATCH 9/9] IB/srp: Add fast registration support
       [not found]                 ` <536A6B2F.70807-LDSdmyG8hGV8YrgS2mwiifqBs+8SCbDb@public.gmane.org>
  2014-05-08 12:38                   ` Bart Van Assche
@ 2014-05-12 13:21                   ` Bart Van Assche
  1 sibling, 0 replies; 32+ messages in thread
From: Bart Van Assche @ 2014-05-12 13:21 UTC (permalink / raw)
  To: Sagi Grimberg, Roland Dreier
  Cc: Sagi Grimberg, Vu Pham, David Dillow, Sebastian Parschauer, linux-rdma

On 05/07/14 19:19, Sagi Grimberg wrote:
> P.S, can we consider also skip scsi_cmnd SEND completions?
> Currently SRP asks for completion but never really arm the CQ (only for
> the first time).

Hello Sagi,

Can you clarify this further? Removing the IB_SEND_SIGNALED flag from
send requests is not possible because that would result in the HCA
delivering work completions only for failed work requests to the SRP
initiator driver. The loop in srp_send_completion() also needs the work
completions for send work requests that succeeded. In other words, it's
not clear to me what you are referring to.

Bart.


end of thread, other threads:[~2014-05-12 13:21 UTC | newest]

Thread overview: 32+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2014-05-06 12:49 [PATCH 0/9] SRP initiator patches for kernel 3.16 Bart Van Assche
     [not found] ` <5368DA5B.80609-HInyCGIudOg@public.gmane.org>
2014-05-06 12:50   ` [PATCH 1/9] IB/srp: Fix kernel-doc warnings Bart Van Assche
     [not found]     ` <5368DAB2.2070006-HInyCGIudOg@public.gmane.org>
2014-05-07 10:35       ` Sagi Grimberg
2014-05-06 12:51   ` [PATCH 2/9] IB/srp: Introduce an additional local variable Bart Van Assche
     [not found]     ` <5368DADF.2070105-HInyCGIudOg@public.gmane.org>
2014-05-07 10:37       ` Sagi Grimberg
     [not found]         ` <536A0CF7.7080307-LDSdmyG8hGV8YrgS2mwiifqBs+8SCbDb@public.gmane.org>
2014-05-07 13:52           ` Bart Van Assche
     [not found]             ` <536A3AA4.5010100-HInyCGIudOg@public.gmane.org>
2014-05-07 16:58               ` Sagi Grimberg
2014-05-06 12:52   ` [PATCH 3/9] IB/srp: Introduce srp_alloc_fmr_pool() Bart Van Assche
     [not found]     ` <5368DB05.4070600-HInyCGIudOg@public.gmane.org>
2014-05-07 10:38       ` Sagi Grimberg
2014-05-06 12:53   ` [PATCH 4/9] IB/srp: Introduce srp_map_fmr() Bart Van Assche
     [not found]     ` <5368DB2F.8020506-HInyCGIudOg@public.gmane.org>
2014-05-07 10:39       ` Sagi Grimberg
2014-05-06 12:53   ` [PATCH 5/9] IB/srp: Introduce srp_finish_mapping() Bart Van Assche
     [not found]     ` <5368DB58.6070502-HInyCGIudOg@public.gmane.org>
2014-05-07 10:41       ` Sagi Grimberg
2014-05-06 12:54   ` [PATCH 6/9] IB/srp: Make srp_alloc_req_data() reallocate request data Bart Van Assche
     [not found]     ` <5368DB90.9050209-HInyCGIudOg@public.gmane.org>
2014-05-07 10:50       ` Sagi Grimberg
     [not found]         ` <536A0FF7.7050608-LDSdmyG8hGV8YrgS2mwiifqBs+8SCbDb@public.gmane.org>
2014-05-07 14:11           ` Bart Van Assche
     [not found]             ` <536A3F08.7030108-HInyCGIudOg@public.gmane.org>
2014-05-07 16:58               ` Sagi Grimberg
2014-05-06 12:55   ` [PATCH 7/9] IB/srp: Avoid triggering an infinite loop if memory mapping fails Bart Van Assche
     [not found]     ` <5368DBC5.6070609-HInyCGIudOg@public.gmane.org>
2014-05-07 10:50       ` Sagi Grimberg
2014-05-06 12:56   ` [PATCH 8/9] IB/srp: Rename FMR-related variables Bart Van Assche
2014-05-06 12:56   ` [PATCH 9/9] IB/srp: Add fast registration support Bart Van Assche
     [not found]     ` <5368DC09.70608-HInyCGIudOg@public.gmane.org>
2014-05-07 11:34       ` Sagi Grimberg
     [not found]         ` <536A1A3A.9090208-LDSdmyG8hGV8YrgS2mwiifqBs+8SCbDb@public.gmane.org>
2014-05-07 14:59           ` Bart Van Assche
     [not found]             ` <536A4A68.4080605-HInyCGIudOg@public.gmane.org>
2014-05-07 17:19               ` Sagi Grimberg
     [not found]                 ` <536A6B2F.70807-LDSdmyG8hGV8YrgS2mwiifqBs+8SCbDb@public.gmane.org>
2014-05-08 12:38                   ` Bart Van Assche
     [not found]                     ` <536B7ABE.8080502-HInyCGIudOg@public.gmane.org>
2014-05-08 15:16                       ` Sagi Grimberg
     [not found]                         ` <536B9FB4.5040009-LDSdmyG8hGV8YrgS2mwiifqBs+8SCbDb@public.gmane.org>
2014-05-08 16:00                           ` Bart Van Assche
     [not found]                             ` <536BAA13.3050905-HInyCGIudOg@public.gmane.org>
2014-05-08 16:18                               ` Sagi Grimberg
2014-05-12 13:21                   ` Bart Van Assche
2014-05-06 14:06   ` [PATCH 0/9] SRP initiator patches for kernel 3.16 Jack Wang
     [not found]     ` <5368EC54.8020802-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
2014-05-06 14:26       ` Bart Van Assche
2014-05-06 14:21   ` [PATCH 3/9] IB/srp: Introduce srp_alloc_fmr_pool() Bart Van Assche
