All of lore.kernel.org
 help / color / mirror / Atom feed
* [RFC PATCH 0/5] scsi: target: XCOPY performance
@ 2020-03-23 16:54 ` David Disseldorp
  0 siblings, 0 replies; 24+ messages in thread
From: David Disseldorp @ 2020-03-23 16:54 UTC (permalink / raw)
  To: target-devel; +Cc: linux-scsi, martin.petersen, bvanassche

These changes remove unnecessary heap allocations in the XCOPY
READ/WRITE dispatch loop.

Synthetic benchmarks on my laptop using the libiscsi iscsi-dd utility
(--xcopy --max 1 --blocks 65535 src=dst) against a target backed by an 8G
zram (DEBUG_KMEMLEAK=y) iblock backstore (avg across four runs) show:
before: 5.35776G/s
after:  6.12636G/s (approx. +14%)

Feedback appreciated.

Cheers, David

----------------------------------------------------------------
David Disseldorp (5):
      scsi: target: use #def for xcopy descriptor len
      scsi: target: drop xcopy DISK BLOCK LENGTH debug
      scsi: target: avoid per-loop XCOPY buffer allocations
      scsi: target: increase XCOPY I/O size
      scsi: target: avoid XCOPY per-loop read/write cmd allocations

 drivers/target/target_core_xcopy.c | 135 ++++++++++-------------------
 drivers/target/target_core_xcopy.h |  14 +--
 2 files changed, 57 insertions(+), 92 deletions(-)

^ permalink raw reply	[flat|nested] 24+ messages in thread

* [RFC PATCH 0/5] scsi: target: XCOPY performance
@ 2020-03-23 16:54 ` David Disseldorp
  0 siblings, 0 replies; 24+ messages in thread
From: David Disseldorp @ 2020-03-23 16:54 UTC (permalink / raw)
  To: target-devel; +Cc: linux-scsi, martin.petersen, bvanassche

These changes remove unnecessary heap allocations in the XCOPY
READ/WRITE dispatch loop.

Synthetic benchmarks on my laptop using the libiscsi iscsi-dd utility
(--xcopy --max 1 --blocks 65535 src=dst) against a target backed by an 8G
zram (DEBUG_KMEMLEAK=y) iblock backstore (avg across four runs) show:
before: 5.35776G/s
after:  6.12636G/s (approx. +14%)

Feedback appreciated.

Cheers, David

----------------------------------------------------------------
David Disseldorp (5):
      scsi: target: use #def for xcopy descriptor len
      scsi: target: drop xcopy DISK BLOCK LENGTH debug
      scsi: target: avoid per-loop XCOPY buffer allocations
      scsi: target: increase XCOPY I/O size
      scsi: target: avoid XCOPY per-loop read/write cmd allocations

 drivers/target/target_core_xcopy.c | 135 ++++++++++-------------------
 drivers/target/target_core_xcopy.h |  14 +--
 2 files changed, 57 insertions(+), 92 deletions(-)

^ permalink raw reply	[flat|nested] 24+ messages in thread

* [RFC PATCH 1/5] scsi: target: use #def for xcopy descriptor len
  2020-03-23 16:54 ` David Disseldorp
@ 2020-03-23 16:54   ` David Disseldorp
  -1 siblings, 0 replies; 24+ messages in thread
From: David Disseldorp @ 2020-03-23 16:54 UTC (permalink / raw)
  To: target-devel; +Cc: linux-scsi, martin.petersen, bvanassche, David Disseldorp

Signed-off-by: David Disseldorp <ddiss@suse.de>
---
 drivers/target/target_core_xcopy.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/target/target_core_xcopy.c b/drivers/target/target_core_xcopy.c
index 425c1070de08..7e5b13da0c20 100644
--- a/drivers/target/target_core_xcopy.c
+++ b/drivers/target/target_core_xcopy.c
@@ -134,7 +134,7 @@ static int target_xcopy_parse_tiddesc_e4(struct se_cmd *se_cmd, struct xcopy_op
 	 * Assigned designator
 	 */
 	desig_len = desc[7];
-	if (desig_len != 16) {
+	if (desig_len != XCOPY_NAA_IEEE_REGEX_LEN) {
 		pr_err("XCOPY 0xe4: invalid desig_len: %d\n", (int)desig_len);
 		return -EINVAL;
 	}
-- 
2.16.4

^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [RFC PATCH 1/5] scsi: target: use #def for xcopy descriptor len
@ 2020-03-23 16:54   ` David Disseldorp
  0 siblings, 0 replies; 24+ messages in thread
From: David Disseldorp @ 2020-03-23 16:54 UTC (permalink / raw)
  To: target-devel; +Cc: linux-scsi, martin.petersen, bvanassche, David Disseldorp

Signed-off-by: David Disseldorp <ddiss@suse.de>
---
 drivers/target/target_core_xcopy.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/target/target_core_xcopy.c b/drivers/target/target_core_xcopy.c
index 425c1070de08..7e5b13da0c20 100644
--- a/drivers/target/target_core_xcopy.c
+++ b/drivers/target/target_core_xcopy.c
@@ -134,7 +134,7 @@ static int target_xcopy_parse_tiddesc_e4(struct se_cmd *se_cmd, struct xcopy_op
 	 * Assigned designator
 	 */
 	desig_len = desc[7];
-	if (desig_len != 16) {
+	if (desig_len != XCOPY_NAA_IEEE_REGEX_LEN) {
 		pr_err("XCOPY 0xe4: invalid desig_len: %d\n", (int)desig_len);
 		return -EINVAL;
 	}
-- 
2.16.4


^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [RFC PATCH 2/5] scsi: target: drop xcopy DISK BLOCK LENGTH debug
  2020-03-23 16:54 ` David Disseldorp
@ 2020-03-23 16:54   ` David Disseldorp
  -1 siblings, 0 replies; 24+ messages in thread
From: David Disseldorp @ 2020-03-23 16:54 UTC (permalink / raw)
  To: target-devel; +Cc: linux-scsi, martin.petersen, bvanassche, David Disseldorp

The DISK BLOCK LENGTH field is carried with XCOPY target descriptors on
the wire, but is currently unmarshalled during 0x02 segment descriptor
passing. The unmarshalled value is currently unused, so drop it.

Signed-off-by: David Disseldorp <ddiss@suse.de>
---
 drivers/target/target_core_xcopy.c | 5 -----
 drivers/target/target_core_xcopy.h | 1 -
 2 files changed, 6 deletions(-)

diff --git a/drivers/target/target_core_xcopy.c b/drivers/target/target_core_xcopy.c
index 7e5b13da0c20..66b68295c50f 100644
--- a/drivers/target/target_core_xcopy.c
+++ b/drivers/target/target_core_xcopy.c
@@ -315,11 +315,6 @@ static int target_xcopy_parse_segdesc_02(struct se_cmd *se_cmd, struct xcopy_op
 		xop->nolb, (unsigned long long)xop->src_lba,
 		(unsigned long long)xop->dst_lba);
 
-	if (dc != 0) {
-		xop->dbl = get_unaligned_be24(&desc[29]);
-
-		pr_debug("XCOPY seg desc 0x02: DC=1 w/ dbl: %u\n", xop->dbl);
-	}
 	return 0;
 }
 
diff --git a/drivers/target/target_core_xcopy.h b/drivers/target/target_core_xcopy.h
index 26ba4c3c9cff..0840b03e8faa 100644
--- a/drivers/target/target_core_xcopy.h
+++ b/drivers/target/target_core_xcopy.h
@@ -35,7 +35,6 @@ struct xcopy_op {
 	unsigned short stdi;
 	unsigned short dtdi;
 	unsigned short nolb;
-	unsigned int dbl;
 
 	struct xcopy_pt_cmd *src_pt_cmd;
 	struct xcopy_pt_cmd *dst_pt_cmd;
-- 
2.16.4

^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [RFC PATCH 2/5] scsi: target: drop xcopy DISK BLOCK LENGTH debug
@ 2020-03-23 16:54   ` David Disseldorp
  0 siblings, 0 replies; 24+ messages in thread
From: David Disseldorp @ 2020-03-23 16:54 UTC (permalink / raw)
  To: target-devel; +Cc: linux-scsi, martin.petersen, bvanassche, David Disseldorp

The DISK BLOCK LENGTH field is carried with XCOPY target descriptors on
the wire, but is currently unmarshalled during 0x02 segment descriptor
passing. The unmarshalled value is currently unused, so drop it.

Signed-off-by: David Disseldorp <ddiss@suse.de>
---
 drivers/target/target_core_xcopy.c | 5 -----
 drivers/target/target_core_xcopy.h | 1 -
 2 files changed, 6 deletions(-)

diff --git a/drivers/target/target_core_xcopy.c b/drivers/target/target_core_xcopy.c
index 7e5b13da0c20..66b68295c50f 100644
--- a/drivers/target/target_core_xcopy.c
+++ b/drivers/target/target_core_xcopy.c
@@ -315,11 +315,6 @@ static int target_xcopy_parse_segdesc_02(struct se_cmd *se_cmd, struct xcopy_op
 		xop->nolb, (unsigned long long)xop->src_lba,
 		(unsigned long long)xop->dst_lba);
 
-	if (dc != 0) {
-		xop->dbl = get_unaligned_be24(&desc[29]);
-
-		pr_debug("XCOPY seg desc 0x02: DC=1 w/ dbl: %u\n", xop->dbl);
-	}
 	return 0;
 }
 
diff --git a/drivers/target/target_core_xcopy.h b/drivers/target/target_core_xcopy.h
index 26ba4c3c9cff..0840b03e8faa 100644
--- a/drivers/target/target_core_xcopy.h
+++ b/drivers/target/target_core_xcopy.h
@@ -35,7 +35,6 @@ struct xcopy_op {
 	unsigned short stdi;
 	unsigned short dtdi;
 	unsigned short nolb;
-	unsigned int dbl;
 
 	struct xcopy_pt_cmd *src_pt_cmd;
 	struct xcopy_pt_cmd *dst_pt_cmd;
-- 
2.16.4


^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [RFC PATCH 3/5] scsi: target: avoid per-loop XCOPY buffer allocations
  2020-03-23 16:54 ` David Disseldorp
@ 2020-03-23 16:54   ` David Disseldorp
  -1 siblings, 0 replies; 24+ messages in thread
From: David Disseldorp @ 2020-03-23 16:54 UTC (permalink / raw)
  To: target-devel; +Cc: linux-scsi, martin.petersen, bvanassche, David Disseldorp

The main target_xcopy_do_work() loop unnecessarily allocates an I/O
buffer with each synchronous READ / WRITE pair. This commit
significantly reduces allocations by reusing the XCOPY I/O buffer when
possible.

Signed-off-by: David Disseldorp <ddiss@suse.de>
---
 drivers/target/target_core_xcopy.c | 95 +++++++++++++++-----------------------
 drivers/target/target_core_xcopy.h |  1 +
 2 files changed, 39 insertions(+), 57 deletions(-)

diff --git a/drivers/target/target_core_xcopy.c b/drivers/target/target_core_xcopy.c
index 66b68295c50f..e9c50766d4fc 100644
--- a/drivers/target/target_core_xcopy.c
+++ b/drivers/target/target_core_xcopy.c
@@ -499,7 +499,6 @@ void target_xcopy_release_pt(void)
  * @cdb:	 SCSI CDB to be copied into @xpt_cmd.
  * @remote_port: If false, use the LUN through which the XCOPY command has
  *		 been received. If true, use @se_dev->xcopy_lun.
- * @alloc_mem:	 Whether or not to allocate an SGL list.
  *
  * Set up a SCSI command (READ or WRITE) that will be used to execute an
  * XCOPY command.
@@ -509,12 +508,11 @@ static int target_xcopy_setup_pt_cmd(
 	struct xcopy_op *xop,
 	struct se_device *se_dev,
 	unsigned char *cdb,
-	bool remote_port,
-	bool alloc_mem)
+	bool remote_port)
 {
 	struct se_cmd *cmd = &xpt_cmd->se_cmd;
 	sense_reason_t sense_rc;
-	int ret = 0, rc;
+	int ret = 0;
 
 	/*
 	 * Setup LUN+port to honor reservations based upon xop->op_origin for
@@ -536,36 +534,21 @@ static int target_xcopy_setup_pt_cmd(
 		goto out;
 	}
 
-	if (alloc_mem) {
-		rc = target_alloc_sgl(&cmd->t_data_sg, &cmd->t_data_nents,
-				      cmd->data_length, false, false);
-		if (rc < 0) {
-			ret = rc;
-			goto out;
-		}
-		/*
-		 * Set this bit so that transport_free_pages() allows the
-		 * caller to release SGLs + physical memory allocated by
-		 * transport_generic_get_mem()..
-		 */
-		cmd->se_cmd_flags |= SCF_PASSTHROUGH_SG_TO_MEM_NOALLOC;
-	} else {
-		/*
-		 * Here the previously allocated SGLs for the internal READ
-		 * are mapped zero-copy to the internal WRITE.
-		 */
-		sense_rc = transport_generic_map_mem_to_cmd(cmd,
-					xop->xop_data_sg, xop->xop_data_nents,
-					NULL, 0);
-		if (sense_rc) {
-			ret = -EINVAL;
-			goto out;
-		}
-
-		pr_debug("Setup PASSTHROUGH_NOALLOC t_data_sg: %p t_data_nents:"
-			 " %u\n", cmd->t_data_sg, cmd->t_data_nents);
+	/*
+	 * Here the previously allocated SGLs for the XCOPY are mapped zero-copy
+	 * to the internal READ or WRITE.
+	 */
+	sense_rc = transport_generic_map_mem_to_cmd(cmd,
+				xop->xop_data_sg, xop->xop_data_nents,
+				NULL, 0);
+	if (sense_rc) {
+		ret = -EINVAL;
+		goto out;
 	}
 
+	pr_debug("Setup PASSTHROUGH_NOALLOC t_data_sg: %p t_data_nents:"
+		 " %u\n", cmd->t_data_sg, cmd->t_data_nents);
+
 	return 0;
 
 out:
@@ -626,15 +609,13 @@ static int target_xcopy_read_source(
 	xop->src_pt_cmd = xpt_cmd;
 
 	rc = target_xcopy_setup_pt_cmd(xpt_cmd, xop, src_dev, &cdb[0],
-				remote_port, true);
+				remote_port);
 	if (rc < 0) {
 		ec_cmd->scsi_status = xpt_cmd->se_cmd.scsi_status;
 		transport_generic_free_cmd(se_cmd, 0);
 		return rc;
 	}
 
-	xop->xop_data_sg = se_cmd->t_data_sg;
-	xop->xop_data_nents = se_cmd->t_data_nents;
 	pr_debug("XCOPY-READ: Saved xop->xop_data_sg: %p, num: %u for READ"
 		" memory\n", xop->xop_data_sg, xop->xop_data_nents);
 
@@ -644,12 +625,6 @@ static int target_xcopy_read_source(
 		transport_generic_free_cmd(se_cmd, 0);
 		return rc;
 	}
-	/*
-	 * Clear off the allocated t_data_sg, that has been saved for
-	 * zero-copy WRITE submission reuse in struct xcopy_op..
-	 */
-	se_cmd->t_data_sg = NULL;
-	se_cmd->t_data_nents = 0;
 
 	return 0;
 }
@@ -688,19 +663,9 @@ static int target_xcopy_write_destination(
 	xop->dst_pt_cmd = xpt_cmd;
 
 	rc = target_xcopy_setup_pt_cmd(xpt_cmd, xop, dst_dev, &cdb[0],
-				remote_port, false);
+				remote_port);
 	if (rc < 0) {
-		struct se_cmd *src_cmd = &xop->src_pt_cmd->se_cmd;
 		ec_cmd->scsi_status = xpt_cmd->se_cmd.scsi_status;
-		/*
-		 * If the failure happened before the t_mem_list hand-off in
-		 * target_xcopy_setup_pt_cmd(), Reset memory + clear flag so that
-		 * core releases this memory on error during X-COPY WRITE I/O.
-		 */
-		src_cmd->se_cmd_flags &= ~SCF_PASSTHROUGH_SG_TO_MEM_NOALLOC;
-		src_cmd->t_data_sg = xop->xop_data_sg;
-		src_cmd->t_data_nents = xop->xop_data_nents;
-
 		transport_generic_free_cmd(se_cmd, 0);
 		return rc;
 	}
@@ -708,7 +673,6 @@ static int target_xcopy_write_destination(
 	rc = target_xcopy_issue_pt_cmd(xpt_cmd);
 	if (rc < 0) {
 		ec_cmd->scsi_status = xpt_cmd->se_cmd.scsi_status;
-		se_cmd->se_cmd_flags &= ~SCF_PASSTHROUGH_SG_TO_MEM_NOALLOC;
 		transport_generic_free_cmd(se_cmd, 0);
 		return rc;
 	}
@@ -724,7 +688,7 @@ static void target_xcopy_do_work(struct work_struct *work)
 	sector_t src_lba, dst_lba, end_lba;
 	unsigned int max_sectors;
 	int rc = 0;
-	unsigned short nolb, cur_nolb, max_nolb, copied_nolb = 0;
+	unsigned short nolb, max_nolb, copied_nolb = 0;
 
 	if (target_parse_xcopy_cmd(xop) != TCM_NO_SENSE)
 		goto err_free;
@@ -754,7 +718,24 @@ static void target_xcopy_do_work(struct work_struct *work)
 			(unsigned long long)src_lba, (unsigned long long)dst_lba);
 
 	while (src_lba < end_lba) {
-		cur_nolb = min(nolb, max_nolb);
+		unsigned short cur_nolb = min(nolb, max_nolb);
+		u32 cur_bytes = cur_nolb * src_dev->dev_attrib.block_size;
+
+		if (cur_bytes != xop->xop_data_bytes) {
+			/*
+			 * (Re)allocate a buffer large enough to hold the XCOPY
+			 * I/O size, which can be reused each read / write loop.
+			 */
+			target_free_sgl(xop->xop_data_sg, xop->xop_data_nents);
+			rc = target_alloc_sgl(&xop->xop_data_sg,
+					      &xop->xop_data_nents,
+					      cur_bytes,
+					      false, false);
+			if (rc < 0) {
+				goto out;
+			}
+			xop->xop_data_bytes = cur_bytes;
+		}
 
 		pr_debug("target_xcopy_do_work: Calling read src_dev: %p src_lba: %llu,"
 			" cur_nolb: %hu\n", src_dev, (unsigned long long)src_lba, cur_nolb);
@@ -785,12 +766,11 @@ static void target_xcopy_do_work(struct work_struct *work)
 		nolb -= cur_nolb;
 
 		transport_generic_free_cmd(&xop->src_pt_cmd->se_cmd, 0);
-		xop->dst_pt_cmd->se_cmd.se_cmd_flags &= ~SCF_PASSTHROUGH_SG_TO_MEM_NOALLOC;
-
 		transport_generic_free_cmd(&xop->dst_pt_cmd->se_cmd, 0);
 	}
 
 	xcopy_pt_undepend_remotedev(xop);
+	target_free_sgl(xop->xop_data_sg, xop->xop_data_nents);
 	kfree(xop);
 
 	pr_debug("target_xcopy_do_work: Final src_lba: %llu, dst_lba: %llu\n",
@@ -804,6 +784,7 @@ static void target_xcopy_do_work(struct work_struct *work)
 
 out:
 	xcopy_pt_undepend_remotedev(xop);
+	target_free_sgl(xop->xop_data_sg, xop->xop_data_nents);
 
 err_free:
 	kfree(xop);
diff --git a/drivers/target/target_core_xcopy.h b/drivers/target/target_core_xcopy.h
index 0840b03e8faa..9558974185ea 100644
--- a/drivers/target/target_core_xcopy.h
+++ b/drivers/target/target_core_xcopy.h
@@ -39,6 +39,7 @@ struct xcopy_op {
 	struct xcopy_pt_cmd *src_pt_cmd;
 	struct xcopy_pt_cmd *dst_pt_cmd;
 
+	u32 xop_data_bytes;
 	u32 xop_data_nents;
 	struct scatterlist *xop_data_sg;
 	struct work_struct xop_work;
-- 
2.16.4

^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [RFC PATCH 3/5] scsi: target: avoid per-loop XCOPY buffer allocations
@ 2020-03-23 16:54   ` David Disseldorp
  0 siblings, 0 replies; 24+ messages in thread
From: David Disseldorp @ 2020-03-23 16:54 UTC (permalink / raw)
  To: target-devel; +Cc: linux-scsi, martin.petersen, bvanassche, David Disseldorp

The main target_xcopy_do_work() loop unnecessarily allocates an I/O
buffer with each synchronous READ / WRITE pair. This commit
significantly reduces allocations by reusing the XCOPY I/O buffer when
possible.

Signed-off-by: David Disseldorp <ddiss@suse.de>
---
 drivers/target/target_core_xcopy.c | 95 +++++++++++++++-----------------------
 drivers/target/target_core_xcopy.h |  1 +
 2 files changed, 39 insertions(+), 57 deletions(-)

diff --git a/drivers/target/target_core_xcopy.c b/drivers/target/target_core_xcopy.c
index 66b68295c50f..e9c50766d4fc 100644
--- a/drivers/target/target_core_xcopy.c
+++ b/drivers/target/target_core_xcopy.c
@@ -499,7 +499,6 @@ void target_xcopy_release_pt(void)
  * @cdb:	 SCSI CDB to be copied into @xpt_cmd.
  * @remote_port: If false, use the LUN through which the XCOPY command has
  *		 been received. If true, use @se_dev->xcopy_lun.
- * @alloc_mem:	 Whether or not to allocate an SGL list.
  *
  * Set up a SCSI command (READ or WRITE) that will be used to execute an
  * XCOPY command.
@@ -509,12 +508,11 @@ static int target_xcopy_setup_pt_cmd(
 	struct xcopy_op *xop,
 	struct se_device *se_dev,
 	unsigned char *cdb,
-	bool remote_port,
-	bool alloc_mem)
+	bool remote_port)
 {
 	struct se_cmd *cmd = &xpt_cmd->se_cmd;
 	sense_reason_t sense_rc;
-	int ret = 0, rc;
+	int ret = 0;
 
 	/*
 	 * Setup LUN+port to honor reservations based upon xop->op_origin for
@@ -536,36 +534,21 @@ static int target_xcopy_setup_pt_cmd(
 		goto out;
 	}
 
-	if (alloc_mem) {
-		rc = target_alloc_sgl(&cmd->t_data_sg, &cmd->t_data_nents,
-				      cmd->data_length, false, false);
-		if (rc < 0) {
-			ret = rc;
-			goto out;
-		}
-		/*
-		 * Set this bit so that transport_free_pages() allows the
-		 * caller to release SGLs + physical memory allocated by
-		 * transport_generic_get_mem()..
-		 */
-		cmd->se_cmd_flags |= SCF_PASSTHROUGH_SG_TO_MEM_NOALLOC;
-	} else {
-		/*
-		 * Here the previously allocated SGLs for the internal READ
-		 * are mapped zero-copy to the internal WRITE.
-		 */
-		sense_rc = transport_generic_map_mem_to_cmd(cmd,
-					xop->xop_data_sg, xop->xop_data_nents,
-					NULL, 0);
-		if (sense_rc) {
-			ret = -EINVAL;
-			goto out;
-		}
-
-		pr_debug("Setup PASSTHROUGH_NOALLOC t_data_sg: %p t_data_nents:"
-			 " %u\n", cmd->t_data_sg, cmd->t_data_nents);
+	/*
+	 * Here the previously allocated SGLs for the XCOPY are mapped zero-copy
+	 * to the internal READ or WRITE.
+	 */
+	sense_rc = transport_generic_map_mem_to_cmd(cmd,
+				xop->xop_data_sg, xop->xop_data_nents,
+				NULL, 0);
+	if (sense_rc) {
+		ret = -EINVAL;
+		goto out;
 	}
 
+	pr_debug("Setup PASSTHROUGH_NOALLOC t_data_sg: %p t_data_nents:"
+		 " %u\n", cmd->t_data_sg, cmd->t_data_nents);
+
 	return 0;
 
 out:
@@ -626,15 +609,13 @@ static int target_xcopy_read_source(
 	xop->src_pt_cmd = xpt_cmd;
 
 	rc = target_xcopy_setup_pt_cmd(xpt_cmd, xop, src_dev, &cdb[0],
-				remote_port, true);
+				remote_port);
 	if (rc < 0) {
 		ec_cmd->scsi_status = xpt_cmd->se_cmd.scsi_status;
 		transport_generic_free_cmd(se_cmd, 0);
 		return rc;
 	}
 
-	xop->xop_data_sg = se_cmd->t_data_sg;
-	xop->xop_data_nents = se_cmd->t_data_nents;
 	pr_debug("XCOPY-READ: Saved xop->xop_data_sg: %p, num: %u for READ"
 		" memory\n", xop->xop_data_sg, xop->xop_data_nents);
 
@@ -644,12 +625,6 @@ static int target_xcopy_read_source(
 		transport_generic_free_cmd(se_cmd, 0);
 		return rc;
 	}
-	/*
-	 * Clear off the allocated t_data_sg, that has been saved for
-	 * zero-copy WRITE submission reuse in struct xcopy_op..
-	 */
-	se_cmd->t_data_sg = NULL;
-	se_cmd->t_data_nents = 0;
 
 	return 0;
 }
@@ -688,19 +663,9 @@ static int target_xcopy_write_destination(
 	xop->dst_pt_cmd = xpt_cmd;
 
 	rc = target_xcopy_setup_pt_cmd(xpt_cmd, xop, dst_dev, &cdb[0],
-				remote_port, false);
+				remote_port);
 	if (rc < 0) {
-		struct se_cmd *src_cmd = &xop->src_pt_cmd->se_cmd;
 		ec_cmd->scsi_status = xpt_cmd->se_cmd.scsi_status;
-		/*
-		 * If the failure happened before the t_mem_list hand-off in
-		 * target_xcopy_setup_pt_cmd(), Reset memory + clear flag so that
-		 * core releases this memory on error during X-COPY WRITE I/O.
-		 */
-		src_cmd->se_cmd_flags &= ~SCF_PASSTHROUGH_SG_TO_MEM_NOALLOC;
-		src_cmd->t_data_sg = xop->xop_data_sg;
-		src_cmd->t_data_nents = xop->xop_data_nents;
-
 		transport_generic_free_cmd(se_cmd, 0);
 		return rc;
 	}
@@ -708,7 +673,6 @@ static int target_xcopy_write_destination(
 	rc = target_xcopy_issue_pt_cmd(xpt_cmd);
 	if (rc < 0) {
 		ec_cmd->scsi_status = xpt_cmd->se_cmd.scsi_status;
-		se_cmd->se_cmd_flags &= ~SCF_PASSTHROUGH_SG_TO_MEM_NOALLOC;
 		transport_generic_free_cmd(se_cmd, 0);
 		return rc;
 	}
@@ -724,7 +688,7 @@ static void target_xcopy_do_work(struct work_struct *work)
 	sector_t src_lba, dst_lba, end_lba;
 	unsigned int max_sectors;
 	int rc = 0;
-	unsigned short nolb, cur_nolb, max_nolb, copied_nolb = 0;
+	unsigned short nolb, max_nolb, copied_nolb = 0;
 
 	if (target_parse_xcopy_cmd(xop) != TCM_NO_SENSE)
 		goto err_free;
@@ -754,7 +718,24 @@ static void target_xcopy_do_work(struct work_struct *work)
 			(unsigned long long)src_lba, (unsigned long long)dst_lba);
 
 	while (src_lba < end_lba) {
-		cur_nolb = min(nolb, max_nolb);
+		unsigned short cur_nolb = min(nolb, max_nolb);
+		u32 cur_bytes = cur_nolb * src_dev->dev_attrib.block_size;
+
+		if (cur_bytes != xop->xop_data_bytes) {
+			/*
+			 * (Re)allocate a buffer large enough to hold the XCOPY
+			 * I/O size, which can be reused each read / write loop.
+			 */
+			target_free_sgl(xop->xop_data_sg, xop->xop_data_nents);
+			rc = target_alloc_sgl(&xop->xop_data_sg,
+					      &xop->xop_data_nents,
+					      cur_bytes,
+					      false, false);
+			if (rc < 0) {
+				goto out;
+			}
+			xop->xop_data_bytes = cur_bytes;
+		}
 
 		pr_debug("target_xcopy_do_work: Calling read src_dev: %p src_lba: %llu,"
 			" cur_nolb: %hu\n", src_dev, (unsigned long long)src_lba, cur_nolb);
@@ -785,12 +766,11 @@ static void target_xcopy_do_work(struct work_struct *work)
 		nolb -= cur_nolb;
 
 		transport_generic_free_cmd(&xop->src_pt_cmd->se_cmd, 0);
-		xop->dst_pt_cmd->se_cmd.se_cmd_flags &= ~SCF_PASSTHROUGH_SG_TO_MEM_NOALLOC;
-
 		transport_generic_free_cmd(&xop->dst_pt_cmd->se_cmd, 0);
 	}
 
 	xcopy_pt_undepend_remotedev(xop);
+	target_free_sgl(xop->xop_data_sg, xop->xop_data_nents);
 	kfree(xop);
 
 	pr_debug("target_xcopy_do_work: Final src_lba: %llu, dst_lba: %llu\n",
@@ -804,6 +784,7 @@ static void target_xcopy_do_work(struct work_struct *work)
 
 out:
 	xcopy_pt_undepend_remotedev(xop);
+	target_free_sgl(xop->xop_data_sg, xop->xop_data_nents);
 
 err_free:
 	kfree(xop);
diff --git a/drivers/target/target_core_xcopy.h b/drivers/target/target_core_xcopy.h
index 0840b03e8faa..9558974185ea 100644
--- a/drivers/target/target_core_xcopy.h
+++ b/drivers/target/target_core_xcopy.h
@@ -39,6 +39,7 @@ struct xcopy_op {
 	struct xcopy_pt_cmd *src_pt_cmd;
 	struct xcopy_pt_cmd *dst_pt_cmd;
 
+	u32 xop_data_bytes;
 	u32 xop_data_nents;
 	struct scatterlist *xop_data_sg;
 	struct work_struct xop_work;
-- 
2.16.4


^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [RFC PATCH 4/5] scsi: target: increase XCOPY I/O size
  2020-03-23 16:54 ` David Disseldorp
@ 2020-03-23 16:54   ` David Disseldorp
  -1 siblings, 0 replies; 24+ messages in thread
From: David Disseldorp @ 2020-03-23 16:54 UTC (permalink / raw)
  To: target-devel; +Cc: linux-scsi, martin.petersen, bvanassche, David Disseldorp

The I/O size is already bound by dev_attrib.hw_max_sectors, so increase
the hardcoded XCOPY_MAX_SECTORS maximum to improve performance against
backstores with high-latency.

Signed-off-by: David Disseldorp <ddiss@suse.de>
---
 drivers/target/target_core_xcopy.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/target/target_core_xcopy.h b/drivers/target/target_core_xcopy.h
index 9558974185ea..f1aaf7140798 100644
--- a/drivers/target/target_core_xcopy.h
+++ b/drivers/target/target_core_xcopy.h
@@ -5,7 +5,7 @@
 #define XCOPY_TARGET_DESC_LEN		32
 #define XCOPY_SEGMENT_DESC_LEN		28
 #define XCOPY_NAA_IEEE_REGEX_LEN	16
-#define XCOPY_MAX_SECTORS		1024
+#define XCOPY_MAX_SECTORS		4096
 
 /*
  * SPC4r37 6.4.6.1
-- 
2.16.4

^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [RFC PATCH 4/5] scsi: target: increase XCOPY I/O size
@ 2020-03-23 16:54   ` David Disseldorp
  0 siblings, 0 replies; 24+ messages in thread
From: David Disseldorp @ 2020-03-23 16:54 UTC (permalink / raw)
  To: target-devel; +Cc: linux-scsi, martin.petersen, bvanassche, David Disseldorp

The I/O size is already bound by dev_attrib.hw_max_sectors, so increase
the hardcoded XCOPY_MAX_SECTORS maximum to improve performance against
backstores with high-latency.

Signed-off-by: David Disseldorp <ddiss@suse.de>
---
 drivers/target/target_core_xcopy.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/target/target_core_xcopy.h b/drivers/target/target_core_xcopy.h
index 9558974185ea..f1aaf7140798 100644
--- a/drivers/target/target_core_xcopy.h
+++ b/drivers/target/target_core_xcopy.h
@@ -5,7 +5,7 @@
 #define XCOPY_TARGET_DESC_LEN		32
 #define XCOPY_SEGMENT_DESC_LEN		28
 #define XCOPY_NAA_IEEE_REGEX_LEN	16
-#define XCOPY_MAX_SECTORS		1024
+#define XCOPY_MAX_SECTORS		4096
 
 /*
  * SPC4r37 6.4.6.1
-- 
2.16.4


^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [RFC PATCH 5/5] scsi: target: avoid XCOPY per-loop read/write cmd allocations
  2020-03-23 16:54 ` David Disseldorp
@ 2020-03-23 16:54   ` David Disseldorp
  -1 siblings, 0 replies; 24+ messages in thread
From: David Disseldorp @ 2020-03-23 16:54 UTC (permalink / raw)
  To: target-devel; +Cc: linux-scsi, martin.petersen, bvanassche, David Disseldorp

Reads and writes in the XCOPY loop are synchronous, so needn't be
allocated / freed with each loop.

Signed-off-by: David Disseldorp <ddiss@suse.de>
---
 drivers/target/target_core_xcopy.c | 33 +++++++++------------------------
 drivers/target/target_core_xcopy.h | 10 +++++++---
 2 files changed, 16 insertions(+), 27 deletions(-)

diff --git a/drivers/target/target_core_xcopy.c b/drivers/target/target_core_xcopy.c
index e9c50766d4fc..cdc6c65168da 100644
--- a/drivers/target/target_core_xcopy.c
+++ b/drivers/target/target_core_xcopy.c
@@ -374,12 +374,6 @@ static int target_xcopy_parse_segment_descriptors(struct se_cmd *se_cmd,
  * Start xcopy_pt ops
  */
 
-struct xcopy_pt_cmd {
-	struct se_cmd se_cmd;
-	struct completion xpt_passthrough_sem;
-	unsigned char sense_buffer[TRANSPORT_SENSE_BUFFER];
-};
-
 struct se_portal_group xcopy_pt_tpg;
 static struct se_session xcopy_pt_sess;
 static struct se_node_acl xcopy_pt_nacl;
@@ -410,7 +404,8 @@ static void xcopy_pt_release_cmd(struct se_cmd *se_cmd)
 	struct xcopy_pt_cmd *xpt_cmd = container_of(se_cmd,
 				struct xcopy_pt_cmd, se_cmd);
 
-	kfree(xpt_cmd);
+	/* xpt_cmd is backed by parent xop, nothing to free */
+	pr_debug("xpt_cmd done: %p\n", xpt_cmd);
 }
 
 static int xcopy_pt_check_stop_free(struct se_cmd *se_cmd)
@@ -582,18 +577,14 @@ static int target_xcopy_read_source(
 	sector_t src_lba,
 	u32 src_sectors)
 {
-	struct xcopy_pt_cmd *xpt_cmd;
+	struct xcopy_pt_cmd *xpt_cmd = &xop->src_pt_cmd;
 	struct se_cmd *se_cmd;
 	u32 length = (src_sectors * src_dev->dev_attrib.block_size);
 	int rc;
 	unsigned char cdb[16];
 	bool remote_port = (xop->op_origin = XCOL_DEST_RECV_OP);
 
-	xpt_cmd = kzalloc(sizeof(struct xcopy_pt_cmd), GFP_KERNEL);
-	if (!xpt_cmd) {
-		pr_err("Unable to allocate xcopy_pt_cmd\n");
-		return -ENOMEM;
-	}
+	memset(xpt_cmd, 0, sizeof(*xpt_cmd));
 	init_completion(&xpt_cmd->xpt_passthrough_sem);
 	se_cmd = &xpt_cmd->se_cmd;
 
@@ -606,7 +597,6 @@ static int target_xcopy_read_source(
 
 	transport_init_se_cmd(se_cmd, &xcopy_pt_tfo, &xcopy_pt_sess, length,
 			      DMA_FROM_DEVICE, 0, &xpt_cmd->sense_buffer[0]);
-	xop->src_pt_cmd = xpt_cmd;
 
 	rc = target_xcopy_setup_pt_cmd(xpt_cmd, xop, src_dev, &cdb[0],
 				remote_port);
@@ -636,18 +626,14 @@ static int target_xcopy_write_destination(
 	sector_t dst_lba,
 	u32 dst_sectors)
 {
-	struct xcopy_pt_cmd *xpt_cmd;
+	struct xcopy_pt_cmd *xpt_cmd = &xop->dst_pt_cmd;
 	struct se_cmd *se_cmd;
 	u32 length = (dst_sectors * dst_dev->dev_attrib.block_size);
 	int rc;
 	unsigned char cdb[16];
 	bool remote_port = (xop->op_origin = XCOL_SOURCE_RECV_OP);
 
-	xpt_cmd = kzalloc(sizeof(struct xcopy_pt_cmd), GFP_KERNEL);
-	if (!xpt_cmd) {
-		pr_err("Unable to allocate xcopy_pt_cmd\n");
-		return -ENOMEM;
-	}
+	memset(xpt_cmd, 0, sizeof(*xpt_cmd));
 	init_completion(&xpt_cmd->xpt_passthrough_sem);
 	se_cmd = &xpt_cmd->se_cmd;
 
@@ -660,7 +646,6 @@ static int target_xcopy_write_destination(
 
 	transport_init_se_cmd(se_cmd, &xcopy_pt_tfo, &xcopy_pt_sess, length,
 			      DMA_TO_DEVICE, 0, &xpt_cmd->sense_buffer[0]);
-	xop->dst_pt_cmd = xpt_cmd;
 
 	rc = target_xcopy_setup_pt_cmd(xpt_cmd, xop, dst_dev, &cdb[0],
 				remote_port);
@@ -754,7 +739,7 @@ static void target_xcopy_do_work(struct work_struct *work)
 		rc = target_xcopy_write_destination(ec_cmd, xop, dst_dev,
 						dst_lba, cur_nolb);
 		if (rc < 0) {
-			transport_generic_free_cmd(&xop->src_pt_cmd->se_cmd, 0);
+			transport_generic_free_cmd(&xop->src_pt_cmd.se_cmd, 0);
 			goto out;
 		}
 
@@ -765,8 +750,8 @@ static void target_xcopy_do_work(struct work_struct *work)
 		copied_nolb += cur_nolb;
 		nolb -= cur_nolb;
 
-		transport_generic_free_cmd(&xop->src_pt_cmd->se_cmd, 0);
-		transport_generic_free_cmd(&xop->dst_pt_cmd->se_cmd, 0);
+		transport_generic_free_cmd(&xop->src_pt_cmd.se_cmd, 0);
+		transport_generic_free_cmd(&xop->dst_pt_cmd.se_cmd, 0);
 	}
 
 	xcopy_pt_undepend_remotedev(xop);
diff --git a/drivers/target/target_core_xcopy.h b/drivers/target/target_core_xcopy.h
index f1aaf7140798..f52640fcb12d 100644
--- a/drivers/target/target_core_xcopy.h
+++ b/drivers/target/target_core_xcopy.h
@@ -18,7 +18,11 @@ enum xcopy_origin_list {
 	XCOL_DEST_RECV_OP = 0x02,
 };
 
-struct xcopy_pt_cmd;
+struct xcopy_pt_cmd {
+	struct se_cmd se_cmd;
+	struct completion xpt_passthrough_sem;
+	unsigned char sense_buffer[TRANSPORT_SENSE_BUFFER];
+};
 
 struct xcopy_op {
 	int op_origin;
@@ -36,8 +40,8 @@ struct xcopy_op {
 	unsigned short dtdi;
 	unsigned short nolb;
 
-	struct xcopy_pt_cmd *src_pt_cmd;
-	struct xcopy_pt_cmd *dst_pt_cmd;
+	struct xcopy_pt_cmd src_pt_cmd;
+	struct xcopy_pt_cmd dst_pt_cmd;
 
 	u32 xop_data_bytes;
 	u32 xop_data_nents;
-- 
2.16.4

^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [RFC PATCH 5/5] scsi: target: avoid XCOPY per-loop read/write cmd allocations
@ 2020-03-23 16:54   ` David Disseldorp
  0 siblings, 0 replies; 24+ messages in thread
From: David Disseldorp @ 2020-03-23 16:54 UTC (permalink / raw)
  To: target-devel; +Cc: linux-scsi, martin.petersen, bvanassche, David Disseldorp

Reads and writes in the XCOPY loop are synchronous, so needn't be
allocated / freed with each loop.

Signed-off-by: David Disseldorp <ddiss@suse.de>
---
 drivers/target/target_core_xcopy.c | 33 +++++++++------------------------
 drivers/target/target_core_xcopy.h | 10 +++++++---
 2 files changed, 16 insertions(+), 27 deletions(-)

diff --git a/drivers/target/target_core_xcopy.c b/drivers/target/target_core_xcopy.c
index e9c50766d4fc..cdc6c65168da 100644
--- a/drivers/target/target_core_xcopy.c
+++ b/drivers/target/target_core_xcopy.c
@@ -374,12 +374,6 @@ static int target_xcopy_parse_segment_descriptors(struct se_cmd *se_cmd,
  * Start xcopy_pt ops
  */
 
-struct xcopy_pt_cmd {
-	struct se_cmd se_cmd;
-	struct completion xpt_passthrough_sem;
-	unsigned char sense_buffer[TRANSPORT_SENSE_BUFFER];
-};
-
 struct se_portal_group xcopy_pt_tpg;
 static struct se_session xcopy_pt_sess;
 static struct se_node_acl xcopy_pt_nacl;
@@ -410,7 +404,8 @@ static void xcopy_pt_release_cmd(struct se_cmd *se_cmd)
 	struct xcopy_pt_cmd *xpt_cmd = container_of(se_cmd,
 				struct xcopy_pt_cmd, se_cmd);
 
-	kfree(xpt_cmd);
+	/* xpt_cmd is backed by parent xop, nothing to free */
+	pr_debug("xpt_cmd done: %p\n", xpt_cmd);
 }
 
 static int xcopy_pt_check_stop_free(struct se_cmd *se_cmd)
@@ -582,18 +577,14 @@ static int target_xcopy_read_source(
 	sector_t src_lba,
 	u32 src_sectors)
 {
-	struct xcopy_pt_cmd *xpt_cmd;
+	struct xcopy_pt_cmd *xpt_cmd = &xop->src_pt_cmd;
 	struct se_cmd *se_cmd;
 	u32 length = (src_sectors * src_dev->dev_attrib.block_size);
 	int rc;
 	unsigned char cdb[16];
 	bool remote_port = (xop->op_origin == XCOL_DEST_RECV_OP);
 
-	xpt_cmd = kzalloc(sizeof(struct xcopy_pt_cmd), GFP_KERNEL);
-	if (!xpt_cmd) {
-		pr_err("Unable to allocate xcopy_pt_cmd\n");
-		return -ENOMEM;
-	}
+	memset(xpt_cmd, 0, sizeof(*xpt_cmd));
 	init_completion(&xpt_cmd->xpt_passthrough_sem);
 	se_cmd = &xpt_cmd->se_cmd;
 
@@ -606,7 +597,6 @@ static int target_xcopy_read_source(
 
 	transport_init_se_cmd(se_cmd, &xcopy_pt_tfo, &xcopy_pt_sess, length,
 			      DMA_FROM_DEVICE, 0, &xpt_cmd->sense_buffer[0]);
-	xop->src_pt_cmd = xpt_cmd;
 
 	rc = target_xcopy_setup_pt_cmd(xpt_cmd, xop, src_dev, &cdb[0],
 				remote_port);
@@ -636,18 +626,14 @@ static int target_xcopy_write_destination(
 	sector_t dst_lba,
 	u32 dst_sectors)
 {
-	struct xcopy_pt_cmd *xpt_cmd;
+	struct xcopy_pt_cmd *xpt_cmd = &xop->dst_pt_cmd;
 	struct se_cmd *se_cmd;
 	u32 length = (dst_sectors * dst_dev->dev_attrib.block_size);
 	int rc;
 	unsigned char cdb[16];
 	bool remote_port = (xop->op_origin == XCOL_SOURCE_RECV_OP);
 
-	xpt_cmd = kzalloc(sizeof(struct xcopy_pt_cmd), GFP_KERNEL);
-	if (!xpt_cmd) {
-		pr_err("Unable to allocate xcopy_pt_cmd\n");
-		return -ENOMEM;
-	}
+	memset(xpt_cmd, 0, sizeof(*xpt_cmd));
 	init_completion(&xpt_cmd->xpt_passthrough_sem);
 	se_cmd = &xpt_cmd->se_cmd;
 
@@ -660,7 +646,6 @@ static int target_xcopy_write_destination(
 
 	transport_init_se_cmd(se_cmd, &xcopy_pt_tfo, &xcopy_pt_sess, length,
 			      DMA_TO_DEVICE, 0, &xpt_cmd->sense_buffer[0]);
-	xop->dst_pt_cmd = xpt_cmd;
 
 	rc = target_xcopy_setup_pt_cmd(xpt_cmd, xop, dst_dev, &cdb[0],
 				remote_port);
@@ -754,7 +739,7 @@ static void target_xcopy_do_work(struct work_struct *work)
 		rc = target_xcopy_write_destination(ec_cmd, xop, dst_dev,
 						dst_lba, cur_nolb);
 		if (rc < 0) {
-			transport_generic_free_cmd(&xop->src_pt_cmd->se_cmd, 0);
+			transport_generic_free_cmd(&xop->src_pt_cmd.se_cmd, 0);
 			goto out;
 		}
 
@@ -765,8 +750,8 @@ static void target_xcopy_do_work(struct work_struct *work)
 		copied_nolb += cur_nolb;
 		nolb -= cur_nolb;
 
-		transport_generic_free_cmd(&xop->src_pt_cmd->se_cmd, 0);
-		transport_generic_free_cmd(&xop->dst_pt_cmd->se_cmd, 0);
+		transport_generic_free_cmd(&xop->src_pt_cmd.se_cmd, 0);
+		transport_generic_free_cmd(&xop->dst_pt_cmd.se_cmd, 0);
 	}
 
 	xcopy_pt_undepend_remotedev(xop);
diff --git a/drivers/target/target_core_xcopy.h b/drivers/target/target_core_xcopy.h
index f1aaf7140798..f52640fcb12d 100644
--- a/drivers/target/target_core_xcopy.h
+++ b/drivers/target/target_core_xcopy.h
@@ -18,7 +18,11 @@ enum xcopy_origin_list {
 	XCOL_DEST_RECV_OP = 0x02,
 };
 
-struct xcopy_pt_cmd;
+struct xcopy_pt_cmd {
+	struct se_cmd se_cmd;
+	struct completion xpt_passthrough_sem;
+	unsigned char sense_buffer[TRANSPORT_SENSE_BUFFER];
+};
 
 struct xcopy_op {
 	int op_origin;
@@ -36,8 +40,8 @@ struct xcopy_op {
 	unsigned short dtdi;
 	unsigned short nolb;
 
-	struct xcopy_pt_cmd *src_pt_cmd;
-	struct xcopy_pt_cmd *dst_pt_cmd;
+	struct xcopy_pt_cmd src_pt_cmd;
+	struct xcopy_pt_cmd dst_pt_cmd;
 
 	u32 xop_data_bytes;
 	u32 xop_data_nents;
-- 
2.16.4


^ permalink raw reply related	[flat|nested] 24+ messages in thread

* Re: [RFC PATCH 1/5] scsi: target: use #def for xcopy descriptor len
  2020-03-23 16:54   ` David Disseldorp
@ 2020-03-24  9:06     ` Christoph Hellwig
  -1 siblings, 0 replies; 24+ messages in thread
From: Christoph Hellwig @ 2020-03-24  9:06 UTC (permalink / raw)
  To: David Disseldorp; +Cc: target-devel, linux-scsi, martin.petersen, bvanassche

On Mon, Mar 23, 2020 at 05:54:06PM +0100, David Disseldorp wrote:
> Signed-off-by: David Disseldorp <ddiss@suse.de>

Looks good,

Reviewed-by: Christoph Hellwig <hch@lst.de>

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [RFC PATCH 1/5] scsi: target: use #def for xcopy descriptor len
@ 2020-03-24  9:06     ` Christoph Hellwig
  0 siblings, 0 replies; 24+ messages in thread
From: Christoph Hellwig @ 2020-03-24  9:06 UTC (permalink / raw)
  To: David Disseldorp; +Cc: target-devel, linux-scsi, martin.petersen, bvanassche

On Mon, Mar 23, 2020 at 05:54:06PM +0100, David Disseldorp wrote:
> Signed-off-by: David Disseldorp <ddiss@suse.de>

Looks good,

Reviewed-by: Christoph Hellwig <hch@lst.de>

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [RFC PATCH 2/5] scsi: target: drop xcopy DISK BLOCK LENGTH debug
  2020-03-23 16:54   ` David Disseldorp
@ 2020-03-24  9:07     ` Christoph Hellwig
  -1 siblings, 0 replies; 24+ messages in thread
From: Christoph Hellwig @ 2020-03-24  9:07 UTC (permalink / raw)
  To: David Disseldorp; +Cc: target-devel, linux-scsi, martin.petersen, bvanassche

On Mon, Mar 23, 2020 at 05:54:07PM +0100, David Disseldorp wrote:
> The DISK BLOCK LENGTH field is carried with XCOPY target descriptors on
> the wire, but is currently unmarshalled during 0x02 segment descriptor
> passing. The unmarshalled value is currently unused, so drop it.

Looks good,

Reviewed-by: Christoph Hellwig <hch@lst.de>

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [RFC PATCH 2/5] scsi: target: drop xcopy DISK BLOCK LENGTH debug
@ 2020-03-24  9:07     ` Christoph Hellwig
  0 siblings, 0 replies; 24+ messages in thread
From: Christoph Hellwig @ 2020-03-24  9:07 UTC (permalink / raw)
  To: David Disseldorp; +Cc: target-devel, linux-scsi, martin.petersen, bvanassche

On Mon, Mar 23, 2020 at 05:54:07PM +0100, David Disseldorp wrote:
> The DISK BLOCK LENGTH field is carried with XCOPY target descriptors on
> the wire, but is currently unmarshalled during 0x02 segment descriptor
> passing. The unmarshalled value is currently unused, so drop it.

Looks good,

Reviewed-by: Christoph Hellwig <hch@lst.de>

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [RFC PATCH 3/5] scsi: target: avoid per-loop XCOPY buffer allocations
  2020-03-23 16:54   ` David Disseldorp
@ 2020-03-24  9:16     ` Christoph Hellwig
  -1 siblings, 0 replies; 24+ messages in thread
From: Christoph Hellwig @ 2020-03-24  9:16 UTC (permalink / raw)
  To: David Disseldorp; +Cc: target-devel, linux-scsi, martin.petersen, bvanassche

> +	/*
> +	 * Here the previously allocated SGLs for the XCOPY are mapped zero-copy
> +	 * to the internal READ or WRITE.
> +	 */

This comments is rather pointless and a little confusing.  I'd use the
chance to kill it rather than bringing it over from the previous code.

> +	sense_rc = transport_generic_map_mem_to_cmd(cmd,
> +				xop->xop_data_sg, xop->xop_data_nents,
> +				NULL, 0);
> +	if (sense_rc) {
> +		ret = -EINVAL;
> +		goto out;
>  	}

This could simply be:

	if (transport_generic_map_mem_to_cmd(cmd, xop->xop_data_sg,
			xop->xop_data_nents, NULL, 0))
		return -EINVAL;

> +			rc = target_alloc_sgl(&xop->xop_data_sg,
> +					      &xop->xop_data_nents,
> +					      cur_bytes,
> +					      false, false);
> +			if (rc < 0) {
> +				goto out;
> +			}

			rc = target_alloc_sgl(&xop->xop_data_sg,
					&xop->xop_data_nents, cur_bytes,
					false, false);
			if (rc < 0)
				goto out;

Otherwise looks good:

Reviewed-by: Christoph Hellwig <hch@lst.de>

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [RFC PATCH 3/5] scsi: target: avoid per-loop XCOPY buffer allocations
@ 2020-03-24  9:16     ` Christoph Hellwig
  0 siblings, 0 replies; 24+ messages in thread
From: Christoph Hellwig @ 2020-03-24  9:16 UTC (permalink / raw)
  To: David Disseldorp; +Cc: target-devel, linux-scsi, martin.petersen, bvanassche

> +	/*
> +	 * Here the previously allocated SGLs for the XCOPY are mapped zero-copy
> +	 * to the internal READ or WRITE.
> +	 */

This comments is rather pointless and a little confusing.  I'd use the
chance to kill it rather than bringing it over from the previous code.

> +	sense_rc = transport_generic_map_mem_to_cmd(cmd,
> +				xop->xop_data_sg, xop->xop_data_nents,
> +				NULL, 0);
> +	if (sense_rc) {
> +		ret = -EINVAL;
> +		goto out;
>  	}

This could simply be:

	if (transport_generic_map_mem_to_cmd(cmd, xop->xop_data_sg,
			xop->xop_data_nents, NULL, 0))
		return -EINVAL;

> +			rc = target_alloc_sgl(&xop->xop_data_sg,
> +					      &xop->xop_data_nents,
> +					      cur_bytes,
> +					      false, false);
> +			if (rc < 0) {
> +				goto out;
> +			}

			rc = target_alloc_sgl(&xop->xop_data_sg,
					&xop->xop_data_nents, cur_bytes,
					false, false);
			if (rc < 0)
				goto out;

Otherwise looks good:

Reviewed-by: Christoph Hellwig <hch@lst.de>

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [RFC PATCH 4/5] scsi: target: increase XCOPY I/O size
  2020-03-23 16:54   ` David Disseldorp
@ 2020-03-24  9:16     ` Christoph Hellwig
  -1 siblings, 0 replies; 24+ messages in thread
From: Christoph Hellwig @ 2020-03-24  9:16 UTC (permalink / raw)
  To: David Disseldorp; +Cc: target-devel, linux-scsi, martin.petersen, bvanassche

On Mon, Mar 23, 2020 at 05:54:09PM +0100, David Disseldorp wrote:
> The I/O size is already bound by dev_attrib.hw_max_sectors, so increase
> the hardcoded XCOPY_MAX_SECTORS maximum to improve performance against
> backstores with high-latency.

Looks good,

Reviewed-by: Christoph Hellwig <hch@lst.de>

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [RFC PATCH 4/5] scsi: target: increase XCOPY I/O size
@ 2020-03-24  9:16     ` Christoph Hellwig
  0 siblings, 0 replies; 24+ messages in thread
From: Christoph Hellwig @ 2020-03-24  9:16 UTC (permalink / raw)
  To: David Disseldorp; +Cc: target-devel, linux-scsi, martin.petersen, bvanassche

On Mon, Mar 23, 2020 at 05:54:09PM +0100, David Disseldorp wrote:
> The I/O size is already bound by dev_attrib.hw_max_sectors, so increase
> the hardcoded XCOPY_MAX_SECTORS maximum to improve performance against
> backstores with high-latency.

Looks good,

Reviewed-by: Christoph Hellwig <hch@lst.de>

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [RFC PATCH 5/5] scsi: target: avoid XCOPY per-loop read/write cmd allocations
  2020-03-23 16:54   ` David Disseldorp
@ 2020-03-24  9:23     ` Christoph Hellwig
  -1 siblings, 0 replies; 24+ messages in thread
From: Christoph Hellwig @ 2020-03-24  9:23 UTC (permalink / raw)
  To: David Disseldorp; +Cc: target-devel, linux-scsi, martin.petersen, bvanassche

On Mon, Mar 23, 2020 at 05:54:10PM +0100, David Disseldorp wrote:
> Reads and writes in the XCOPY loop are synchronous, so needn't be
> allocated / freed with each loop.

That is true, but I think with your previous cleanups we can easily
go one step further and just allocate a single command and sense buffer
directly in struct xcopy_op, and just have local completions on the
stack.

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [RFC PATCH 5/5] scsi: target: avoid XCOPY per-loop read/write cmd allocations
@ 2020-03-24  9:23     ` Christoph Hellwig
  0 siblings, 0 replies; 24+ messages in thread
From: Christoph Hellwig @ 2020-03-24  9:23 UTC (permalink / raw)
  To: David Disseldorp; +Cc: target-devel, linux-scsi, martin.petersen, bvanassche

On Mon, Mar 23, 2020 at 05:54:10PM +0100, David Disseldorp wrote:
> Reads and writes in the XCOPY loop are synchronous, so needn't be
> allocated / freed with each loop.

That is true, but I think with your previous cleanups we can easily
go one step further and just allocate a single command and sense buffer
directly in struct xcopy_op, and just have local completions on the
stack.

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [RFC PATCH 5/5] scsi: target: avoid XCOPY per-loop read/write cmd allocations
  2020-03-24  9:23     ` Christoph Hellwig
@ 2020-03-26 21:20       ` David Disseldorp
  -1 siblings, 0 replies; 24+ messages in thread
From: David Disseldorp @ 2020-03-26 21:20 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: target-devel, linux-scsi, martin.petersen, bvanassche

Hi Christoph,

Thanks for the feedback...

On Tue, 24 Mar 2020 02:23:23 -0700, Christoph Hellwig wrote:

> On Mon, Mar 23, 2020 at 05:54:10PM +0100, David Disseldorp wrote:
> > Reads and writes in the XCOPY loop are synchronous, so needn't be
> > allocated / freed with each loop.  
> 
> That is true, but I think with your previous cleanups we can easily
> go one step further and just allocate a single command and sense buffer
> directly in struct xcopy_op, and just have local completions on the
> stack.

I'm probably missing something, but having the (stack) completion
separate to the se_cmd and sense buffer would mean that it's no longer a
straightforward container_of() in the .check_stop_free() callback.

I've reworked this patch to put the entire xcopy_pt_cmd on the stack and
will send it out with v2.

Cheers, David

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [RFC PATCH 5/5] scsi: target: avoid XCOPY per-loop read/write cmd allocations
@ 2020-03-26 21:20       ` David Disseldorp
  0 siblings, 0 replies; 24+ messages in thread
From: David Disseldorp @ 2020-03-26 21:20 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: target-devel, linux-scsi, martin.petersen, bvanassche

Hi Christoph,

Thanks for the feedback...

On Tue, 24 Mar 2020 02:23:23 -0700, Christoph Hellwig wrote:

> On Mon, Mar 23, 2020 at 05:54:10PM +0100, David Disseldorp wrote:
> > Reads and writes in the XCOPY loop are synchronous, so needn't be
> > allocated / freed with each loop.  
> 
> That is true, but I think with your previous cleanups we can easily
> go one step further and just allocate a single command and sense buffer
> directly in struct xcopy_op, and just have local completions on the
> stack.

I'm probably missing something, but having the (stack) completion
separate to the se_cmd and sense buffer would mean that it's no longer a
straightforward container_of() in the .check_stop_free() callback.

I've reworked this patch to put the entire xcopy_pt_cmd on the stack and
will send it out with v2.

Cheers, David

^ permalink raw reply	[flat|nested] 24+ messages in thread

end of thread, other threads:[~2020-03-26 21:20 UTC | newest]

Thread overview: 24+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-03-23 16:54 [RFC PATCH 0/5] scsi: target: XCOPY performance David Disseldorp
2020-03-23 16:54 ` David Disseldorp
2020-03-23 16:54 ` [RFC PATCH 1/5] scsi: target: use #def for xcopy descriptor len David Disseldorp
2020-03-23 16:54   ` David Disseldorp
2020-03-24  9:06   ` Christoph Hellwig
2020-03-24  9:06     ` Christoph Hellwig
2020-03-23 16:54 ` [RFC PATCH 2/5] scsi: target: drop xcopy DISK BLOCK LENGTH debug David Disseldorp
2020-03-23 16:54   ` David Disseldorp
2020-03-24  9:07   ` Christoph Hellwig
2020-03-24  9:07     ` Christoph Hellwig
2020-03-23 16:54 ` [RFC PATCH 3/5] scsi: target: avoid per-loop XCOPY buffer allocations David Disseldorp
2020-03-23 16:54   ` David Disseldorp
2020-03-24  9:16   ` Christoph Hellwig
2020-03-24  9:16     ` Christoph Hellwig
2020-03-23 16:54 ` [RFC PATCH 4/5] scsi: target: increase XCOPY I/O size David Disseldorp
2020-03-23 16:54   ` David Disseldorp
2020-03-24  9:16   ` Christoph Hellwig
2020-03-24  9:16     ` Christoph Hellwig
2020-03-23 16:54 ` [RFC PATCH 5/5] scsi: target: avoid XCOPY per-loop read/write cmd allocations David Disseldorp
2020-03-23 16:54   ` David Disseldorp
2020-03-24  9:23   ` Christoph Hellwig
2020-03-24  9:23     ` Christoph Hellwig
2020-03-26 21:20     ` David Disseldorp
2020-03-26 21:20       ` David Disseldorp

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.