* [PATCH 0/20] Use block pr_ops in LIO
@ 2022-08-09  0:03 ` Mike Christie
  0 siblings, 0 replies; 94+ messages in thread
From: Mike Christie @ 2022-08-09  0:03 UTC (permalink / raw)
  To: bvanassche, linux-block, dm-devel, snitzer, axboe, hch,
	linux-nvme, martin.petersen, linux-scsi, james.bottomley

The following patches were built over Linus's tree and this patchset,
which fixes some SCSI error handling issues:

https://lore.kernel.org/linux-scsi/1136e369-49b0-c3ef-340a-ab337f514fc5@oracle.com/T/#meebd5040bc360f8c86532b792b48dbe3efe88619

The patches allow us to use the block pr_ops with LIO's target_core_iblock
module to support cluster applications in VMs. Currently, to use Windows
clustering or Linux clustering (pacemaker + ClusterLabs SCSI fence agents)
in VMs with LIO and vhost-scsi, you have to use tcmu or pscsi, or use a
cluster-aware FS/framework for the LIO PR file. Setting up a cluster
FS/framework is a pain and a waste when your real backend device is
already a distributed device. pscsi and tcmu are nice for specific use
cases, but iblock gives you the best performance and allows you to use
stacked devices like dm-multipath. So these patches allow iblock to work
like pscsi/tcmu, which can pass a PR command to the backend module. iblock
will then use the pr_ops to pass the PR command to the real devices,
similar to what we do for unmap today.
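
As a rough illustration of the passthrough idea described above, here is a
minimal userspace sketch. It is not code from this series: the
pr_ops_sketch struct, the simplified block_device layout, and the
backend_read_reservation() and fake_read_reservation() names are all
hypothetical stand-ins. In the kernel the ops table is reached through the
gendisk's fops rather than stored on the device as it is here. The sketch
only shows the "call the callout if the driver provides one, otherwise
report -EOPNOTSUPP" pattern.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

struct block_device;

/* Hypothetical, trimmed-down ops table; the real struct pr_ops has more callouts. */
struct pr_ops_sketch {
	int (*pr_read_reservation)(struct block_device *bdev, uint64_t *key);
};

/* Hypothetical stand-in for the kernel's struct block_device. */
struct block_device {
	const struct pr_ops_sketch *pr_ops;	/* NULL if the driver has no PR support */
	uint64_t holder_key;			/* key of the fake reservation holder */
};

/* Forward a "read reservation" request to the backing device if it supports it. */
static int backend_read_reservation(struct block_device *bdev, uint64_t *key)
{
	const struct pr_ops_sketch *ops = bdev->pr_ops;

	if (!ops || !ops->pr_read_reservation)
		return -EOPNOTSUPP;
	return ops->pr_read_reservation(bdev, key);
}

/* Fake driver callout that reports the key stored in the fake device. */
static int fake_read_reservation(struct block_device *bdev, uint64_t *key)
{
	assert(key != NULL);
	*key = bdev->holder_key;
	return 0;
}

static const struct pr_ops_sketch fake_pr_ops = {
	.pr_read_reservation = fake_read_reservation,
};
```

A device whose pr_ops pointer is NULL makes backend_read_reservation()
return -EOPNOTSUPP, while one wired to fake_pr_ops gets its stored key
back.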

The patches are separated in the following groups:

patches 1 - 11
- Add callouts to read a reservation and its keys.

patches 12 - 16
- Have pr_ops return a blk_status_t.

patches 17 - 20
- Support for target_core_iblock to bypass the emulate PR code and call
the pr_ops.

This patchset has been tested with the libiscsi PGR ops and with the
Windows failover cluster verification test.

v2:
- Drop BLK_STS_NEXUS rename changes. Will do separately.
- Add NVMe support.
- Fixed a bug in target_core_iblock, reported by Christoph, where a
variable was not initialized.
- Fixed sd pr_ops UA handling issue found when running libiscsi PGR tests.
- Added patches to allow pr_ops to pass up a BLK_STS so we could return
a RESERVATION_CONFLICT status when a pr_ops callout fails.





* [PATCH v2 01/20] block: Add PR callouts for read keys and reservation
  2022-08-09  0:03 ` [dm-devel] " Mike Christie
@ 2022-08-09  0:04   ` Mike Christie
  -1 siblings, 0 replies; 94+ messages in thread
From: Mike Christie @ 2022-08-09  0:04 UTC (permalink / raw)
  To: bvanassche, linux-block, dm-devel, snitzer, axboe, hch,
	linux-nvme, martin.petersen, linux-scsi, james.bottomley
  Cc: Mike Christie

Add callouts for reading keys and reservations. This allows LIO to support
the READ_KEYS and READ_RESERVATION commands and will allow dm-multipath
to optimize its error handling so it can check whether it's getting an
error because there's an existing reservation or whether we need to retry
different paths.

Note: This only initially adds the struct definitions in the kernel as I'm
not sure if we wanted to export the interface to userspace yet. read_keys
and read_reservation are exactly what dm-multipath and LIO need, but for a
userspace interface we may want something like SCSI's READ_FULL_STATUS and
NVMe's report reservation commands. Those are overkill for dm/LIO and
READ_FULL_STATUS is sometimes broken for SCSI devices.

Signed-off-by: Mike Christie <michael.christie@oracle.com>
---
 include/linux/pr.h | 24 ++++++++++++++++++++++++
 1 file changed, 24 insertions(+)

diff --git a/include/linux/pr.h b/include/linux/pr.h
index 94ceec713afe..79b3d2853a20 100644
--- a/include/linux/pr.h
+++ b/include/linux/pr.h
@@ -4,6 +4,18 @@
 
 #include <uapi/linux/pr.h>
 
+struct pr_keys {
+	u32	generation;
+	u32	num_keys;
+	u64	keys[];
+};
+
+struct pr_held_reservation {
+	u64	key;
+	u32	type;
+	u32	generation;
+};
+
 struct pr_ops {
 	int (*pr_register)(struct block_device *bdev, u64 old_key, u64 new_key,
 			u32 flags);
@@ -14,6 +26,18 @@ struct pr_ops {
 	int (*pr_preempt)(struct block_device *bdev, u64 old_key, u64 new_key,
 			enum pr_type type, bool abort);
 	int (*pr_clear)(struct block_device *bdev, u64 key);
+	/*
+	 * pr_read_keys - Read the registered keys and return them in the
+	 * pr_keys->keys array. The keys array will have been allocated at the
+	 * end of the pr_keys struct and is keys_len bytes. If there are more
+	 * keys than can fit in the array, success will still be returned and
+	 * pr_keys->num_keys will reflect the total number of keys the device
+	 * contains, so the caller can retry with a larger array.
+	 */
+	int (*pr_read_keys)(struct block_device *bdev,
+			struct pr_keys *keys_info, u32 keys_len);
+	int (*pr_read_reservation)(struct block_device *bdev,
+			struct pr_held_reservation *rsv);
 };
 
 #endif /* LINUX_PR_H */
-- 
2.18.2
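
The pr_read_keys comment in the patch above describes a retry contract: a
too-small array still returns success, and num_keys reports the device's
total so the caller can try again. A minimal userspace sketch of a caller
following that contract might look like the following. The struct mirrors
the one added by the patch (with stdint types), but fake_pr_read_keys(),
read_all_keys(), and the five-key fake device are assumptions made up for
illustration, not part of the series.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Userspace mirror of the struct added by the patch (u32/u64 -> stdint). */
struct pr_keys {
	uint32_t generation;
	uint32_t num_keys;
	uint64_t keys[];
};

/* Hypothetical device with five registered keys, standing in for a driver. */
static const uint64_t fake_dev_keys[] = { 0x11, 0x22, 0x33, 0x44, 0x55 };
#define FAKE_NUM_KEYS (sizeof(fake_dev_keys) / sizeof(fake_dev_keys[0]))

/* Mock callout: copies what fits, always reports the device's total. */
static int fake_pr_read_keys(struct pr_keys *info, uint32_t keys_len)
{
	uint32_t fit = keys_len / sizeof(uint64_t);
	uint32_t n = fit < FAKE_NUM_KEYS ? fit : FAKE_NUM_KEYS;

	assert(info != NULL);
	info->generation = 7;
	info->num_keys = FAKE_NUM_KEYS;
	memcpy(info->keys, fake_dev_keys, n * sizeof(uint64_t));
	return 0;
}

/*
 * Caller side: start with room for 'guess' keys and grow until the
 * reported total fits. Returns a malloc'ed pr_keys, or NULL on failure.
 */
static struct pr_keys *read_all_keys(uint32_t guess)
{
	uint32_t capacity = guess;

	for (;;) {
		uint32_t keys_len = capacity * sizeof(uint64_t);
		struct pr_keys *info = calloc(1, sizeof(*info) + keys_len);

		if (!info)
			return NULL;
		if (fake_pr_read_keys(info, keys_len)) {
			free(info);
			return NULL;
		}
		if (info->num_keys <= capacity)
			return info;		/* everything fit */

		capacity = info->num_keys;	/* retry with the reported total */
		free(info);
	}
}
```

The loop form, rather than a single retry, is a design choice in this
sketch: on a real device the key count could grow between the two calls,
so the caller checks the reported total again after each attempt.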



* [PATCH v2 02/20] scsi: Rename sd_pr_command.
  2022-08-09  0:03 ` [dm-devel] " Mike Christie
@ 2022-08-09  0:04   ` Mike Christie
  -1 siblings, 0 replies; 94+ messages in thread
From: Mike Christie @ 2022-08-09  0:04 UTC (permalink / raw)
  To: bvanassche, linux-block, dm-devel, snitzer, axboe, hch,
	linux-nvme, martin.petersen, linux-scsi, james.bottomley
  Cc: Mike Christie

Rename sd_pr_command to sd_pr_out_command to match a
sd_pr_in_command helper added in the next patches.

Signed-off-by: Mike Christie <michael.christie@oracle.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
---
 drivers/scsi/sd.c | 12 ++++++------
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/drivers/scsi/sd.c b/drivers/scsi/sd.c
index 8f79fa6318fe..18ea9ea6bd68 100644
--- a/drivers/scsi/sd.c
+++ b/drivers/scsi/sd.c
@@ -1702,7 +1702,7 @@ static char sd_pr_type(enum pr_type type)
 	}
 };
 
-static int sd_pr_command(struct block_device *bdev, u8 sa,
+static int sd_pr_out_command(struct block_device *bdev, u8 sa,
 		u64 key, u64 sa_key, u8 type, u8 flags)
 {
 	struct scsi_disk *sdkp = scsi_disk(bdev->bd_disk);
@@ -1738,7 +1738,7 @@ static int sd_pr_register(struct block_device *bdev, u64 old_key, u64 new_key,
 {
 	if (flags & ~PR_FL_IGNORE_KEY)
 		return -EOPNOTSUPP;
-	return sd_pr_command(bdev, (flags & PR_FL_IGNORE_KEY) ? 0x06 : 0x00,
+	return sd_pr_out_command(bdev, (flags & PR_FL_IGNORE_KEY) ? 0x06 : 0x00,
 			old_key, new_key, 0,
 			(1 << 0) /* APTPL */);
 }
@@ -1748,24 +1748,24 @@ static int sd_pr_reserve(struct block_device *bdev, u64 key, enum pr_type type,
 {
 	if (flags)
 		return -EOPNOTSUPP;
-	return sd_pr_command(bdev, 0x01, key, 0, sd_pr_type(type), 0);
+	return sd_pr_out_command(bdev, 0x01, key, 0, sd_pr_type(type), 0);
 }
 
 static int sd_pr_release(struct block_device *bdev, u64 key, enum pr_type type)
 {
-	return sd_pr_command(bdev, 0x02, key, 0, sd_pr_type(type), 0);
+	return sd_pr_out_command(bdev, 0x02, key, 0, sd_pr_type(type), 0);
 }
 
 static int sd_pr_preempt(struct block_device *bdev, u64 old_key, u64 new_key,
 		enum pr_type type, bool abort)
 {
-	return sd_pr_command(bdev, abort ? 0x05 : 0x04, old_key, new_key,
+	return sd_pr_out_command(bdev, abort ? 0x05 : 0x04, old_key, new_key,
 			     sd_pr_type(type), 0);
 }
 
 static int sd_pr_clear(struct block_device *bdev, u64 key)
 {
-	return sd_pr_command(bdev, 0x03, key, 0, 0, 0);
+	return sd_pr_out_command(bdev, 0x03, key, 0, 0, 0);
 }
 
 static const struct pr_ops sd_pr_ops = {
-- 
2.18.2



* [PATCH v2 03/20] scsi: Move sd_pr_type to header to share.
  2022-08-09  0:03 ` [dm-devel] " Mike Christie
@ 2022-08-09  0:04   ` Mike Christie
  -1 siblings, 0 replies; 94+ messages in thread
From: Mike Christie @ 2022-08-09  0:04 UTC (permalink / raw)
  To: bvanassche, linux-block, dm-devel, snitzer, axboe, hch,
	linux-nvme, martin.petersen, linux-scsi, james.bottomley
  Cc: Mike Christie

LIO is going to want to do the same block to/from SCSI PR type conversions
as sd.c, so this moves the sd_pr_type helper to a new header and adds a
helper to go from the SCSI value to the block one.

Signed-off-by: Mike Christie <michael.christie@oracle.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
---
 drivers/scsi/sd.c            | 29 +++++-----------------
 include/scsi/scsi_block_pr.h | 47 ++++++++++++++++++++++++++++++++++++
 2 files changed, 53 insertions(+), 23 deletions(-)
 create mode 100644 include/scsi/scsi_block_pr.h

diff --git a/drivers/scsi/sd.c b/drivers/scsi/sd.c
index 18ea9ea6bd68..88ce1464527c 100644
--- a/drivers/scsi/sd.c
+++ b/drivers/scsi/sd.c
@@ -67,6 +67,7 @@
 #include <scsi/scsi_host.h>
 #include <scsi/scsi_ioctl.h>
 #include <scsi/scsicam.h>
+#include <scsi/scsi_block_pr.h>
 
 #include "sd.h"
 #include "scsi_priv.h"
@@ -1682,26 +1683,6 @@ static int sd_get_unique_id(struct gendisk *disk, u8 id[16],
 	return ret;
 }
 
-static char sd_pr_type(enum pr_type type)
-{
-	switch (type) {
-	case PR_WRITE_EXCLUSIVE:
-		return 0x01;
-	case PR_EXCLUSIVE_ACCESS:
-		return 0x03;
-	case PR_WRITE_EXCLUSIVE_REG_ONLY:
-		return 0x05;
-	case PR_EXCLUSIVE_ACCESS_REG_ONLY:
-		return 0x06;
-	case PR_WRITE_EXCLUSIVE_ALL_REGS:
-		return 0x07;
-	case PR_EXCLUSIVE_ACCESS_ALL_REGS:
-		return 0x08;
-	default:
-		return 0;
-	}
-};
-
 static int sd_pr_out_command(struct block_device *bdev, u8 sa,
 		u64 key, u64 sa_key, u8 type, u8 flags)
 {
@@ -1748,19 +1729,21 @@ static int sd_pr_reserve(struct block_device *bdev, u64 key, enum pr_type type,
 {
 	if (flags)
 		return -EOPNOTSUPP;
-	return sd_pr_out_command(bdev, 0x01, key, 0, sd_pr_type(type), 0);
+	return sd_pr_out_command(bdev, 0x01, key, 0,
+				 block_pr_type_to_scsi(type), 0);
 }
 
 static int sd_pr_release(struct block_device *bdev, u64 key, enum pr_type type)
 {
-	return sd_pr_out_command(bdev, 0x02, key, 0, sd_pr_type(type), 0);
+	return sd_pr_out_command(bdev, 0x02, key, 0,
+				 block_pr_type_to_scsi(type), 0);
 }
 
 static int sd_pr_preempt(struct block_device *bdev, u64 old_key, u64 new_key,
 		enum pr_type type, bool abort)
 {
 	return sd_pr_out_command(bdev, abort ? 0x05 : 0x04, old_key, new_key,
-			     sd_pr_type(type), 0);
+				 block_pr_type_to_scsi(type), 0);
 }
 
 static int sd_pr_clear(struct block_device *bdev, u64 key)
diff --git a/include/scsi/scsi_block_pr.h b/include/scsi/scsi_block_pr.h
new file mode 100644
index 000000000000..36d6e742fd98
--- /dev/null
+++ b/include/scsi/scsi_block_pr.h
@@ -0,0 +1,47 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef _SCSI_BLOCK_PR_H
+#define _SCSI_BLOCK_PR_H
+
+#include <uapi/linux/pr.h>
+
+static inline u8 block_pr_type_to_scsi(enum pr_type type)
+{
+	switch (type) {
+	case PR_WRITE_EXCLUSIVE:
+		return 0x01;
+	case PR_EXCLUSIVE_ACCESS:
+		return 0x03;
+	case PR_WRITE_EXCLUSIVE_REG_ONLY:
+		return 0x05;
+	case PR_EXCLUSIVE_ACCESS_REG_ONLY:
+		return 0x06;
+	case PR_WRITE_EXCLUSIVE_ALL_REGS:
+		return 0x07;
+	case PR_EXCLUSIVE_ACCESS_ALL_REGS:
+		return 0x08;
+	default:
+		return 0;
+	}
+}
+
+static inline enum pr_type scsi_pr_type_to_block(u8 type)
+{
+	switch (type) {
+	case 0x01:
+		return PR_WRITE_EXCLUSIVE;
+	case 0x03:
+		return PR_EXCLUSIVE_ACCESS;
+	case 0x05:
+		return PR_WRITE_EXCLUSIVE_REG_ONLY;
+	case 0x06:
+		return PR_EXCLUSIVE_ACCESS_REG_ONLY;
+	case 0x07:
+		return PR_WRITE_EXCLUSIVE_ALL_REGS;
+	case 0x08:
+		return PR_EXCLUSIVE_ACCESS_ALL_REGS;
+	default:
+		return 0;
+	}
+}
+
+#endif
-- 
2.18.2
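
To illustrate how the two conversion helpers in this patch relate to each
other, here is a small userspace sketch. The enum is a local copy of
enum pr_type from uapi/linux/pr.h so the file builds outside the kernel,
the helpers are transcribed from the patch with u8 replaced by uint8_t,
and pr_types_round_trip() is a hypothetical check added only for this
example.

```c
#include <assert.h>
#include <stdint.h>

/* Local copy of enum pr_type from uapi/linux/pr.h, for userspace builds. */
enum pr_type {
	PR_WRITE_EXCLUSIVE		= 1,
	PR_EXCLUSIVE_ACCESS		= 2,
	PR_WRITE_EXCLUSIVE_REG_ONLY	= 3,
	PR_EXCLUSIVE_ACCESS_REG_ONLY	= 4,
	PR_WRITE_EXCLUSIVE_ALL_REGS	= 5,
	PR_EXCLUSIVE_ACCESS_ALL_REGS	= 6,
};

/* Block PR type -> SCSI PR type code; 0 means "no mapping". */
static inline uint8_t block_pr_type_to_scsi(enum pr_type type)
{
	switch (type) {
	case PR_WRITE_EXCLUSIVE:		return 0x01;
	case PR_EXCLUSIVE_ACCESS:		return 0x03;
	case PR_WRITE_EXCLUSIVE_REG_ONLY:	return 0x05;
	case PR_EXCLUSIVE_ACCESS_REG_ONLY:	return 0x06;
	case PR_WRITE_EXCLUSIVE_ALL_REGS:	return 0x07;
	case PR_EXCLUSIVE_ACCESS_ALL_REGS:	return 0x08;
	default:				return 0;
	}
}

/* SCSI PR type code -> block PR type; 0 is not a valid enum pr_type value. */
static inline enum pr_type scsi_pr_type_to_block(uint8_t type)
{
	switch (type) {
	case 0x01:	return PR_WRITE_EXCLUSIVE;
	case 0x03:	return PR_EXCLUSIVE_ACCESS;
	case 0x05:	return PR_WRITE_EXCLUSIVE_REG_ONLY;
	case 0x06:	return PR_EXCLUSIVE_ACCESS_REG_ONLY;
	case 0x07:	return PR_WRITE_EXCLUSIVE_ALL_REGS;
	case 0x08:	return PR_EXCLUSIVE_ACCESS_ALL_REGS;
	default:	return 0;
	}
}

/* Hypothetical check: every block type maps to a SCSI code and back. */
static int pr_types_round_trip(void)
{
	int t;

	for (t = PR_WRITE_EXCLUSIVE; t <= PR_EXCLUSIVE_ACCESS_ALL_REGS; t++) {
		uint8_t scsi = block_pr_type_to_scsi((enum pr_type)t);

		if (scsi == 0 || scsi_pr_type_to_block(scsi) != (enum pr_type)t)
			return 0;
	}
	return 1;
}
```

Both helpers return 0 for values they do not map, so a caller that needs
to tell "unknown type" apart from a valid one has to check for 0
explicitly.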



* [PATCH v2 04/20] scsi: Add support for block PR read keys/reservation.
  2022-08-09  0:03 ` [dm-devel] " Mike Christie
@ 2022-08-09  0:04   ` Mike Christie
  -1 siblings, 0 replies; 94+ messages in thread
From: Mike Christie @ 2022-08-09  0:04 UTC (permalink / raw)
  To: bvanassche, linux-block, dm-devel, snitzer, axboe, hch,
	linux-nvme, martin.petersen, linux-scsi, james.bottomley
  Cc: Mike Christie

This adds support in sd.c for the block PR read keys and read reservation
callouts.

Signed-off-by: Mike Christie <michael.christie@oracle.com>
---
 drivers/scsi/sd.c         | 88 +++++++++++++++++++++++++++++++++++++++
 include/scsi/scsi_proto.h |  5 +++
 2 files changed, 93 insertions(+)

diff --git a/drivers/scsi/sd.c b/drivers/scsi/sd.c
index 88ce1464527c..f1d4d0568075 100644
--- a/drivers/scsi/sd.c
+++ b/drivers/scsi/sd.c
@@ -1683,6 +1683,92 @@ static int sd_get_unique_id(struct gendisk *disk, u8 id[16],
 	return ret;
 }
 
+static int sd_pr_in_command(struct block_device *bdev, u8 sa,
+			    unsigned char *data, int data_len)
+{
+	struct scsi_disk *sdkp = scsi_disk(bdev->bd_disk);
+	struct scsi_device *sdev = sdkp->device;
+	struct scsi_sense_hdr sshdr;
+	u8 cmd[10] = { 0, };
+	int result;
+
+	cmd[0] = PERSISTENT_RESERVE_IN;
+	cmd[1] = sa;
+	put_unaligned_be16(data_len, &cmd[7]);
+
+	result = scsi_execute_req(sdev, cmd, DMA_FROM_DEVICE, data, data_len,
+				  &sshdr, SD_TIMEOUT, sdkp->max_retries, NULL);
+	if (scsi_status_is_check_condition(result) &&
+	    scsi_sense_valid(&sshdr)) {
+		sdev_printk(KERN_INFO, sdev, "PR command failed: %d\n", result);
+		scsi_print_sense_hdr(sdev, NULL, &sshdr);
+	}
+
+	return result;
+}
+
+static int sd_pr_read_keys(struct block_device *bdev, struct pr_keys *keys_info,
+			   u32 keys_len)
+{
+	int result, i, data_offset, num_copy_keys;
+	int data_len = keys_len + 8;
+	u8 *data;
+
+	data = kzalloc(data_len, GFP_KERNEL);
+	if (!data)
+		return -ENOMEM;
+
+	result = sd_pr_in_command(bdev, READ_KEYS, data, data_len);
+	if (result)
+		goto free_data;
+
+	keys_info->generation = get_unaligned_be32(&data[0]);
+	keys_info->num_keys = get_unaligned_be32(&data[4]) / 8;
+
+	data_offset = 8;
+	num_copy_keys = min(keys_len / 8, keys_info->num_keys);
+
+	for (i = 0; i < num_copy_keys; i++) {
+		keys_info->keys[i] = get_unaligned_be64(&data[data_offset]);
+		data_offset += 8;
+	}
+
+free_data:
+	kfree(data);
+	return result;
+}
+
+static int sd_pr_read_reservation(struct block_device *bdev,
+				  struct pr_held_reservation *rsv)
+{
+	struct scsi_disk *sdkp = scsi_disk(bdev->bd_disk);
+	struct scsi_device *sdev = sdkp->device;
+	u8 data[24] = { 0, };
+	int result, len;
+
+	result = sd_pr_in_command(bdev, READ_RESERVATION, data, sizeof(data));
+	if (result)
+		return result;
+
+	memset(rsv, 0, sizeof(*rsv));
+	len = get_unaligned_be32(&data[4]);
+	if (!len)
+		return result;
+
+	/* Make sure we have at least the key and type */
+	if (len < 14) {
+		sdev_printk(KERN_INFO, sdev,
+			    "READ RESERVATION failed due to short return buffer of %d bytes\n",
+			    len);
+		return -EINVAL;
+	}
+
+	rsv->generation = get_unaligned_be32(&data[0]);
+	rsv->key = get_unaligned_be64(&data[8]);
+	rsv->type = scsi_pr_type_to_block(data[21] & 0x0f);
+	return 0;
+}
+
 static int sd_pr_out_command(struct block_device *bdev, u8 sa,
 		u64 key, u64 sa_key, u8 type, u8 flags)
 {
@@ -1757,6 +1843,8 @@ static const struct pr_ops sd_pr_ops = {
 	.pr_release	= sd_pr_release,
 	.pr_preempt	= sd_pr_preempt,
 	.pr_clear	= sd_pr_clear,
+	.pr_read_keys	= sd_pr_read_keys,
+	.pr_read_reservation = sd_pr_read_reservation,
 };
 
 static void scsi_disk_free_disk(struct gendisk *disk)
diff --git a/include/scsi/scsi_proto.h b/include/scsi/scsi_proto.h
index c03e35fc382c..0fd6e295375a 100644
--- a/include/scsi/scsi_proto.h
+++ b/include/scsi/scsi_proto.h
@@ -151,6 +151,11 @@
 #define ZO_FINISH_ZONE	      0x02
 #define ZO_OPEN_ZONE	      0x03
 #define ZO_RESET_WRITE_POINTER 0x04
+/* values for PR in service action */
+#define READ_KEYS             0x00
+#define READ_RESERVATION      0x01
+#define REPORT_CAPABILITES    0x02
+#define READ_FULL_STATUS      0x03
 /* values for variable length command */
 #define XDREAD_32	      0x03
 #define XDWRITE_32	      0x04
-- 
2.18.2
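
The parsing done by sd_pr_read_keys() and sd_pr_read_reservation() in the
patch above can be tried outside the kernel with a sketch like this one.
The byte offsets follow the patch (generation at 0, additional length at
4, keys or the reservation key from 8, type in the low nibble of byte
21). The be32_at()/be64_at() helpers stand in for get_unaligned_be32() and
get_unaligned_be64(), and the parse_* function names and return
conventions are assumptions made for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Userspace equivalents of the kernel's get_unaligned_be32/be64 helpers. */
static uint32_t be32_at(const uint8_t *p)
{
	return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
	       ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}

static uint64_t be64_at(const uint8_t *p)
{
	return ((uint64_t)be32_at(p) << 32) | be32_at(p + 4);
}

/*
 * Parse a READ KEYS parameter buffer. Copies at most max_keys keys into
 * 'keys' and returns the total number of keys the device reported.
 */
static uint32_t parse_read_keys(const uint8_t *data, size_t data_len,
				uint32_t *generation, uint64_t *keys,
				uint32_t max_keys)
{
	size_t fit;
	uint32_t total, i;

	assert(data_len >= 8);
	fit = (data_len - 8) / 8;		/* keys actually present in the buffer */
	*generation = be32_at(&data[0]);
	total = be32_at(&data[4]) / 8;		/* additional length is in bytes */

	for (i = 0; i < total && i < max_keys && i < fit; i++)
		keys[i] = be64_at(&data[8 + (size_t)i * 8]);
	return total;
}

/*
 * Parse a 24-byte READ RESERVATION buffer. Returns 1 if a reservation is
 * held, 0 if none is held, and -1 if the response is too short to carry
 * the key and type.
 */
static int parse_read_reservation(const uint8_t data[24], uint32_t *generation,
				  uint64_t *key, uint8_t *scsi_type)
{
	uint32_t len = be32_at(&data[4]);

	if (!len)
		return 0;
	if (len < 14)
		return -1;

	*generation = be32_at(&data[0]);
	*key = be64_at(&data[8]);
	*scsi_type = data[21] & 0x0f;		/* low nibble is the SCSI PR type */
	return 1;
}
```

The type returned here is still the raw SCSI code; in the patch it is
passed through scsi_pr_type_to_block() before being stored in the
pr_held_reservation.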



* [PATCH v2 05/20] dm: Add support for block PR read keys/reservation.
  2022-08-09  0:03 ` [dm-devel] " Mike Christie
@ 2022-08-09  0:04   ` Mike Christie
  -1 siblings, 0 replies; 94+ messages in thread
From: Mike Christie @ 2022-08-09  0:04 UTC (permalink / raw)
  To: bvanassche, linux-block, dm-devel, snitzer, axboe, hch,
	linux-nvme, martin.petersen, linux-scsi, james.bottomley
  Cc: Mike Christie

This adds support in dm for the block PR read keys and read reservation
callouts.

Signed-off-by: Mike Christie <michael.christie@oracle.com>
---
 drivers/md/dm.c | 44 ++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 44 insertions(+)

diff --git a/drivers/md/dm.c b/drivers/md/dm.c
index 60549b65c799..1b15295bdf24 100644
--- a/drivers/md/dm.c
+++ b/drivers/md/dm.c
@@ -3313,12 +3313,56 @@ static int dm_pr_clear(struct block_device *bdev, u64 key)
 	return r;
 }
 
+static int dm_pr_read_keys(struct block_device *bdev, struct pr_keys *keys,
+			   u32 keys_len)
+{
+	struct mapped_device *md = bdev->bd_disk->private_data;
+	const struct pr_ops *ops;
+	int r, srcu_idx;
+
+	r = dm_prepare_ioctl(md, &srcu_idx, &bdev);
+	if (r < 0)
+		goto out;
+
+	ops = bdev->bd_disk->fops->pr_ops;
+	if (ops && ops->pr_read_keys)
+		r = ops->pr_read_keys(bdev, keys, keys_len);
+	else
+		r = -EOPNOTSUPP;
+out:
+	dm_unprepare_ioctl(md, srcu_idx);
+	return r;
+}
+
+static int dm_pr_read_reservation(struct block_device *bdev,
+				  struct pr_held_reservation *rsv)
+{
+	struct mapped_device *md = bdev->bd_disk->private_data;
+	const struct pr_ops *ops;
+	int r, srcu_idx;
+
+	r = dm_prepare_ioctl(md, &srcu_idx, &bdev);
+	if (r < 0)
+		goto out;
+
+	ops = bdev->bd_disk->fops->pr_ops;
+	if (ops && ops->pr_read_reservation)
+		r = ops->pr_read_reservation(bdev, rsv);
+	else
+		r = -EOPNOTSUPP;
+out:
+	dm_unprepare_ioctl(md, srcu_idx);
+	return r;
+}
+
 static const struct pr_ops dm_pr_ops = {
 	.pr_register	= dm_pr_register,
 	.pr_reserve	= dm_pr_reserve,
 	.pr_release	= dm_pr_release,
 	.pr_preempt	= dm_pr_preempt,
 	.pr_clear	= dm_pr_clear,
+	.pr_read_keys	= dm_pr_read_keys,
+	.pr_read_reservation = dm_pr_read_reservation,
 };
 
 static const struct block_device_operations dm_blk_dops = {
-- 
2.18.2


^ permalink raw reply related	[flat|nested] 94+ messages in thread
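
The forwarding pattern used by dm_pr_read_keys() and
dm_pr_read_reservation() above can be sketched outside the kernel. This is
an illustration only: struct mock_pr_ops, struct mock_dev and
stacked_read_reservation() are hypothetical names, and the
dm_prepare_ioctl()/dm_unprepare_ioctl() locking is left out.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, cut-down stand-ins for the kernel types. */
struct resv_info {
	uint64_t key;
};

struct mock_pr_ops {
	/* Optional callout; a driver may leave it NULL. */
	int (*pr_read_reservation)(void *dev, struct resv_info *rsv);
};

struct mock_dev {
	const struct mock_pr_ops *ops;	/* may be NULL */
	uint64_t held_key;
};

/* A leaf driver that implements the callout. */
static int leaf_read_reservation(void *dev, struct resv_info *rsv)
{
	rsv->key = ((struct mock_dev *)dev)->held_key;
	return 0;
}

/* Forward to the underlying device, or report the op as unsupported. */
static int stacked_read_reservation(struct mock_dev *lower,
				    struct resv_info *rsv)
{
	const struct mock_pr_ops *ops = lower->ops;

	if (ops && ops->pr_read_reservation)
		return ops->pr_read_reservation(lower, rsv);
	return -EOPNOTSUPP;
}
```

Both a missing ops table and a missing callout end up as -EOPNOTSUPP, which
is the same result the dm wrappers in the patch produce.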

* [PATCH v2 06/20] nvme: Fix reservation status related structs
  2022-08-09  0:03 ` [dm-devel] " Mike Christie
@ 2022-08-09  0:04   ` Mike Christie
  -1 siblings, 0 replies; 94+ messages in thread
From: Mike Christie @ 2022-08-09  0:04 UTC (permalink / raw)
  To: bvanassche, linux-block, dm-devel, snitzer, axboe, hch,
	linux-nvme, martin.petersen, linux-scsi, james.bottomley
  Cc: Mike Christie

This fixes the following issues with the reservation status structs:

1. resv10 is bytes 23:10 so it should be 14 bytes.
2. regctl_ds only supports 64-bit host IDs, so add an extended variant for
   128-bit host IDs.

Signed-off-by: Mike Christie <michael.christie@oracle.com>
---
 include/linux/nvme.h | 33 +++++++++++++++++++++++++--------
 1 file changed, 25 insertions(+), 8 deletions(-)

diff --git a/include/linux/nvme.h b/include/linux/nvme.h
index ae53d74f3696..ae4a76076420 100644
--- a/include/linux/nvme.h
+++ b/include/linux/nvme.h
@@ -757,20 +757,37 @@ enum {
 	NVME_LBART_ATTRIB_HIDE	= 1 << 1,
 };
 
+struct nvme_registered_ctrl {
+	__le16	cntlid;
+	__u8	rcsts;
+	__u8	rsvd3[5];
+	__le64	hostid;
+	__le64	rkey;
+};
+
+struct nvme_registered_ctrl_ext {
+	__le16	cntlid;
+	__u8	rcsts;
+	__u8	rsvd3[5];
+	__le64	rkey;
+	__u8	hostid[16];
+	__u8	rsvd32[32];
+};
+
 struct nvme_reservation_status {
 	__le32	gen;
 	__u8	rtype;
 	__u8	regctl[2];
 	__u8	resv5[2];
 	__u8	ptpls;
-	__u8	resv10[13];
-	struct {
-		__le16	cntlid;
-		__u8	rcsts;
-		__u8	resv3[5];
-		__le64	hostid;
-		__le64	rkey;
-	} regctl_ds[];
+	__u8	resv10[14];
+	union {
+		struct {
+			__u8	rsvd24[40];
+			struct nvme_registered_ctrl_ext regctl_eds[0];
+		};
+		struct nvme_registered_ctrl regctl_ds[0];
+	};
 };
 
 enum nvme_async_event_type {
-- 
2.18.2


^ permalink raw reply related	[flat|nested] 94+ messages in thread
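
The layout the patch above describes can be checked with compile-time
assertions. This is a userspace sketch under stated assumptions: the struct
names are hypothetical mirrors of the ones in the patch, uintN_t replaces the
__leN types, and entry_offset() is an illustrative helper that does not exist
in the kernel.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Userspace mirrors of the structs in the patch (names are hypothetical). */
struct reg_ctrl {		/* entry with a 64-bit host ID */
	uint16_t cntlid;
	uint8_t  rcsts;
	uint8_t  rsvd3[5];
	uint64_t hostid;
	uint64_t rkey;
};

struct reg_ctrl_ext {		/* entry with a 128-bit host ID */
	uint16_t cntlid;
	uint8_t  rcsts;
	uint8_t  rsvd3[5];
	uint64_t rkey;
	uint8_t  hostid[16];
	uint8_t  rsvd32[32];
};

struct resv_status_hdr {	/* bytes 0..23 of the report */
	uint32_t gen;
	uint8_t  rtype;
	uint8_t  regctl[2];
	uint8_t  resv5[2];
	uint8_t  ptpls;
	uint8_t  resv10[14];	/* bytes 23:10, hence 14 bytes */
};

_Static_assert(offsetof(struct resv_status_hdr, resv10) == 10, "resv10 offset");
_Static_assert(sizeof(struct resv_status_hdr) == 24, "header size");
_Static_assert(sizeof(struct reg_ctrl) == 24, "64-bit host ID entry size");
_Static_assert(sizeof(struct reg_ctrl_ext) == 64, "128-bit host ID entry size");

/*
 * Byte offset of entry i in the report. Extended entries start after 40
 * more reserved bytes (rsvd24[40] in the patch).
 */
static size_t entry_offset(int eds, size_t i)
{
	if (eds)
		return sizeof(struct resv_status_hdr) + 40 +
		       i * sizeof(struct reg_ctrl_ext);
	return sizeof(struct resv_status_hdr) + i * sizeof(struct reg_ctrl);
}
```

Note that rkey and hostid swap places between the two entry formats, which
is why the patch needs two separate structs rather than one with a wider
hostid field.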

* [PATCH v2 07/20] nvme: Don't hardcode the data len for pr commands.
  2022-08-09  0:03 ` [dm-devel] " Mike Christie
@ 2022-08-09  0:04   ` Mike Christie
  -1 siblings, 0 replies; 94+ messages in thread
From: Mike Christie @ 2022-08-09  0:04 UTC (permalink / raw)
  To: bvanassche, linux-block, dm-devel, snitzer, axboe, hch,
	linux-nvme, martin.petersen, linux-scsi, james.bottomley
  Cc: Mike Christie

Reservation Report support needs to pass in a variable sized buffer, so
this patch has the pr command helpers take a data length argument.

Signed-off-by: Mike Christie <michael.christie@oracle.com>
---
 drivers/nvme/host/core.c | 14 ++++++++------
 1 file changed, 8 insertions(+), 6 deletions(-)

diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
index af367b22871b..3f223641f321 100644
--- a/drivers/nvme/host/core.c
+++ b/drivers/nvme/host/core.c
@@ -2085,7 +2085,7 @@ static char nvme_pr_type(enum pr_type type)
 }
 
 static int nvme_send_ns_head_pr_command(struct block_device *bdev,
-		struct nvme_command *c, u8 data[16])
+		struct nvme_command *c, u8 *data, unsigned int data_len)
 {
 	struct nvme_ns_head *head = bdev->bd_disk->private_data;
 	int srcu_idx = srcu_read_lock(&head->srcu);
@@ -2094,17 +2094,17 @@ static int nvme_send_ns_head_pr_command(struct block_device *bdev,
 
 	if (ns) {
 		c->common.nsid = cpu_to_le32(ns->head->ns_id);
-		ret = nvme_submit_sync_cmd(ns->queue, c, data, 16);
+		ret = nvme_submit_sync_cmd(ns->queue, c, data, data_len);
 	}
 	srcu_read_unlock(&head->srcu, srcu_idx);
 	return ret;
 }
 	
 static int nvme_send_ns_pr_command(struct nvme_ns *ns, struct nvme_command *c,
-		u8 data[16])
+		u8 *data, unsigned int data_len)
 {
 	c->common.nsid = cpu_to_le32(ns->head->ns_id);
-	return nvme_submit_sync_cmd(ns->queue, c, data, 16);
+	return nvme_submit_sync_cmd(ns->queue, c, data, data_len);
 }
 
 static int nvme_pr_command(struct block_device *bdev, u32 cdw10,
@@ -2121,8 +2121,10 @@ static int nvme_pr_command(struct block_device *bdev, u32 cdw10,
 
 	if (IS_ENABLED(CONFIG_NVME_MULTIPATH) &&
 	    bdev->bd_disk->fops == &nvme_ns_head_ops)
-		return nvme_send_ns_head_pr_command(bdev, &c, data);
-	return nvme_send_ns_pr_command(bdev->bd_disk->private_data, &c, data);
+		return nvme_send_ns_head_pr_command(bdev, &c, data,
+						    sizeof(data));
+	return nvme_send_ns_pr_command(bdev->bd_disk->private_data, &c, data,
+				       sizeof(data));
 }
 
 static int nvme_pr_register(struct block_device *bdev, u64 old,
-- 
2.18.2


^ permalink raw reply related	[flat|nested] 94+ messages in thread

* [PATCH v2 08/20] nvme: Add helper to convert to a pr_ops PR type
  2022-08-09  0:03 ` [dm-devel] " Mike Christie
@ 2022-08-09  0:04   ` Mike Christie
  -1 siblings, 0 replies; 94+ messages in thread
From: Mike Christie @ 2022-08-09  0:04 UTC (permalink / raw)
  To: bvanassche, linux-block, dm-devel, snitzer, axboe, hch,
	linux-nvme, martin.petersen, linux-scsi, james.bottomley
  Cc: Mike Christie

This adds a helper that converts an NVMe spec PR type value to the block
layer pr_type, so that Reservation Report support can convert the type it
reads back from the device.

Signed-off-by: Mike Christie <michael.christie@oracle.com>
---
 drivers/nvme/host/core.c | 20 ++++++++++++++++++++
 1 file changed, 20 insertions(+)

diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
index 3f223641f321..0dc768ae0c16 100644
--- a/drivers/nvme/host/core.c
+++ b/drivers/nvme/host/core.c
@@ -2064,6 +2064,26 @@ static int nvme_update_ns_info(struct nvme_ns *ns, struct nvme_ns_info *info)
 	}
 }
 
+static enum pr_type block_pr_type(u8 nvme_type)
+{
+	switch (nvme_type) {
+	case 1:
+		return PR_WRITE_EXCLUSIVE;
+	case 2:
+		return PR_EXCLUSIVE_ACCESS;
+	case 3:
+		return PR_WRITE_EXCLUSIVE_REG_ONLY;
+	case 4:
+		return PR_EXCLUSIVE_ACCESS_REG_ONLY;
+	case 5:
+		return PR_WRITE_EXCLUSIVE_ALL_REGS;
+	case 6:
+		return PR_EXCLUSIVE_ACCESS_ALL_REGS;
+	default:
+		return 0;
+	}
+}
+
 static char nvme_pr_type(enum pr_type type)
 {
 	switch (type) {
-- 
2.18.2


^ permalink raw reply related	[flat|nested] 94+ messages in thread
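
The new helper is the inverse of the existing nvme_pr_type(). A small
userspace sketch of the round trip follows. The enum is a local copy of the
pr_type values from the uapi header, and nvme_pr_type() is reproduced here
from memory of the driver (it returns char there), so treat both as
assumptions rather than as the kernel source.

```c
#include <assert.h>
#include <stdint.h>

/* Local copy of the uapi enum pr_type values, for illustration only. */
enum pr_type {
	PR_WRITE_EXCLUSIVE		= 1,
	PR_EXCLUSIVE_ACCESS		= 2,
	PR_WRITE_EXCLUSIVE_REG_ONLY	= 3,
	PR_EXCLUSIVE_ACCESS_REG_ONLY	= 4,
	PR_WRITE_EXCLUSIVE_ALL_REGS	= 5,
	PR_EXCLUSIVE_ACCESS_ALL_REGS	= 6,
};

/* NVMe reservation type -> block layer type (as in the patch). */
static enum pr_type block_pr_type(uint8_t nvme_type)
{
	switch (nvme_type) {
	case 1: return PR_WRITE_EXCLUSIVE;
	case 2: return PR_EXCLUSIVE_ACCESS;
	case 3: return PR_WRITE_EXCLUSIVE_REG_ONLY;
	case 4: return PR_EXCLUSIVE_ACCESS_REG_ONLY;
	case 5: return PR_WRITE_EXCLUSIVE_ALL_REGS;
	case 6: return PR_EXCLUSIVE_ACCESS_ALL_REGS;
	default: return 0;	/* unknown or no reservation */
	}
}

/* Block layer type -> NVMe reservation type (the existing direction). */
static uint8_t nvme_pr_type(enum pr_type type)
{
	switch (type) {
	case PR_WRITE_EXCLUSIVE:		return 1;
	case PR_EXCLUSIVE_ACCESS:		return 2;
	case PR_WRITE_EXCLUSIVE_REG_ONLY:	return 3;
	case PR_EXCLUSIVE_ACCESS_REG_ONLY:	return 4;
	case PR_WRITE_EXCLUSIVE_ALL_REGS:	return 5;
	case PR_EXCLUSIVE_ACCESS_ALL_REGS:	return 6;
	default:				return 0;
	}
}
```

Returning 0 for an unknown value lets a caller treat it the same way as
"no reservation held", since 0 is not a valid pr_type.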

* [PATCH v2 09/20] nvme: Add helper to execute Reservation Report
  2022-08-09  0:03 ` [dm-devel] " Mike Christie
@ 2022-08-09  0:04   ` Mike Christie
  -1 siblings, 0 replies; 94+ messages in thread
From: Mike Christie @ 2022-08-09  0:04 UTC (permalink / raw)
  To: bvanassche, linux-block, dm-devel, snitzer, axboe, hch,
	linux-nvme, martin.petersen, linux-scsi, james.bottomley
  Cc: Mike Christie

This adds a helper to execute the Reservation Report command. The next
patches will call it and convert the returned info for the read_keys and
read_reservation callouts.

Signed-off-by: Mike Christie <michael.christie@oracle.com>
---
 drivers/nvme/host/core.c | 27 +++++++++++++++++++++++++++
 1 file changed, 27 insertions(+)

diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
index 0dc768ae0c16..6b22a5dec122 100644
--- a/drivers/nvme/host/core.c
+++ b/drivers/nvme/host/core.c
@@ -2196,6 +2196,33 @@ static int nvme_pr_release(struct block_device *bdev, u64 key, enum pr_type type
 	return nvme_pr_command(bdev, cdw10, key, 0, nvme_cmd_resv_release);
 }
 
+static int nvme_pr_resv_report(struct block_device *bdev, u8 *data,
+		u32 data_len, bool *eds)
+{
+	struct nvme_command c = { };
+	int ret;
+
+	c.common.opcode = nvme_cmd_resv_report;
+	c.common.cdw10 = cpu_to_le32(nvme_bytes_to_numd(data_len));
+	c.common.cdw11 = cpu_to_le32(1);
+	*eds = true;
+
+retry:
+	if (IS_ENABLED(CONFIG_NVME_MULTIPATH) &&
+	    bdev->bd_disk->fops == &nvme_ns_head_ops)
+		ret = nvme_send_ns_head_pr_command(bdev, &c, data, data_len);
+	else
+		ret = nvme_send_ns_pr_command(bdev->bd_disk->private_data, &c,
+					      data, data_len);
+	if (ret == NVME_SC_HOST_ID_INCONSIST && c.common.cdw11) {
+		c.common.cdw11 = 0;
+		*eds = false;
+		goto retry;
+	}
+
+	return ret;
+}
+
 const struct pr_ops nvme_pr_ops = {
 	.pr_register	= nvme_pr_register,
 	.pr_reserve	= nvme_pr_reserve,
-- 
2.18.2


^ permalink raw reply related	[flat|nested] 94+ messages in thread
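
The fallback in nvme_pr_resv_report() above (ask for the extended data
structure first, then retry once in the short format if the controller
rejects it) can be modelled with a mock controller. Everything here is a
sketch: struct mock_ctrl, mock_submit() and the MOCK_SC_HOST_ID_INCONSIST
value are stand-ins, not the driver's definitions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MOCK_SC_HOST_ID_INCONSIST 0x18	/* stand-in status code */

/* Hypothetical controller that rejects EDS when it uses 64-bit host IDs. */
struct mock_ctrl {
	bool uses_128bit_hostid;
	int submissions;
};

static int mock_submit(struct mock_ctrl *ctrl, uint32_t cdw11)
{
	ctrl->submissions++;
	if ((cdw11 & 1) && !ctrl->uses_128bit_hostid)
		return MOCK_SC_HOST_ID_INCONSIST;
	return 0;
}

/* Try the extended format first, then fall back once to the short one. */
static int resv_report(struct mock_ctrl *ctrl, bool *eds)
{
	uint32_t cdw11 = 1;	/* bit 0: request extended data structure */
	int ret;

	*eds = true;
	for (;;) {
		ret = mock_submit(ctrl, cdw11);
		if (ret != MOCK_SC_HOST_ID_INCONSIST || !cdw11)
			return ret;
		cdw11 = 0;
		*eds = false;
	}
}
```

The caller needs the eds flag back because it decides which entry layout to
use when walking the returned buffer.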

* [PATCH v2 10/20] nvme: Add pr_ops read_keys support
  2022-08-09  0:03 ` [dm-devel] " Mike Christie
@ 2022-08-09  0:04   ` Mike Christie
  -1 siblings, 0 replies; 94+ messages in thread
From: Mike Christie @ 2022-08-09  0:04 UTC (permalink / raw)
  To: bvanassche, linux-block, dm-devel, snitzer, axboe, hch,
	linux-nvme, martin.petersen, linux-scsi, james.bottomley
  Cc: Mike Christie

This patch adds support for the pr_ops read_keys callout by calling the
NVMe Reservation Report helper, then parsing that info to get the
controller's registered keys. Because the callout is only used in the
kernel, where the callers do not know about controller/host IDs, the
callout returns only the registered keys, which is all the SCSI PR IN
READ KEYS command requires.

Signed-off-by: Mike Christie <michael.christie@oracle.com>
---
 drivers/nvme/host/core.c | 45 ++++++++++++++++++++++++++++++++++++++++
 1 file changed, 45 insertions(+)

diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
index 6b22a5dec122..230e5deca391 100644
--- a/drivers/nvme/host/core.c
+++ b/drivers/nvme/host/core.c
@@ -2223,12 +2223,57 @@ static int nvme_pr_resv_report(struct block_device *bdev, u8 *data,
 	return ret;
 }
 
+static int nvme_pr_read_keys(struct block_device *bdev,
+		struct pr_keys *keys_info, u32 keys_len)
+{
+	struct nvme_reservation_status *status;
+	u32 data_len, num_ret_keys;
+	int ret, i;
+	bool eds;
+	u8 *data;
+
+	/*
+	 * Assume we are using 128-bit host IDs and allocate a buffer large
+	 * enough to get enough keys to fill the return keys buffer.
+	 */
+	num_ret_keys = keys_len / 8;
+	data_len = sizeof(*status) +
+			num_ret_keys * sizeof(struct nvme_registered_ctrl_ext);
+	data = kzalloc(data_len, GFP_KERNEL);
+	if (!data)
+		return -ENOMEM;
+
+	ret = nvme_pr_resv_report(bdev, data, data_len, &eds);
+	if (ret)
+		goto free_data;
+
+	status = (struct nvme_reservation_status *)data;
+	keys_info->generation = le32_to_cpu(status->gen);
+	keys_info->num_keys = get_unaligned_le16(&status->regctl);
+
+	num_ret_keys = min(num_ret_keys, keys_info->num_keys);
+	for (i = 0; i < num_ret_keys; i++) {
+		if (eds) {
+			keys_info->keys[i] =
+					le64_to_cpu(status->regctl_eds[i].rkey);
+		} else {
+			keys_info->keys[i] =
+					le64_to_cpu(status->regctl_ds[i].rkey);
+		}
+	}
+
+free_data:
+	kfree(data);
+	return ret;
+}
+
 const struct pr_ops nvme_pr_ops = {
 	.pr_register	= nvme_pr_register,
 	.pr_reserve	= nvme_pr_reserve,
 	.pr_release	= nvme_pr_release,
 	.pr_preempt	= nvme_pr_preempt,
 	.pr_clear	= nvme_pr_clear,
+	.pr_read_keys	= nvme_pr_read_keys,
 };
 
 #ifdef CONFIG_BLK_SED_OPAL
-- 
2.18.2


^ permalink raw reply related	[flat|nested] 94+ messages in thread
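
The buffer sizing in nvme_pr_read_keys() above comes down to a little
arithmetic. The sketch below assumes that sizeof(struct
nvme_reservation_status) is 64 bytes with the layout from patch 06 (24-byte
header plus the 40 reserved bytes in the union) and that an extended entry is
64 bytes. The helper names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define STATUS_SIZE	64u	/* assumed sizeof(struct nvme_reservation_status) */
#define EXT_ENTRY_SIZE	64u	/* assumed sizeof(struct nvme_registered_ctrl_ext) */

/* Report buffer size needed to fill a keys buffer of keys_len bytes. */
static uint32_t report_buf_len(uint32_t keys_len)
{
	uint32_t num_ret_keys = keys_len / 8;	/* one u64 per key */

	return STATUS_SIZE + num_ret_keys * EXT_ENTRY_SIZE;
}

/* Number of keys actually copied: limited by buffer and by the device. */
static uint32_t keys_to_copy(uint32_t keys_len, uint32_t reported)
{
	uint32_t num_ret_keys = keys_len / 8;

	return num_ret_keys < reported ? num_ret_keys : reported;
}
```

Sizing for the extended format is the conservative choice: if the controller
falls back to 64-bit host IDs, the entries are smaller and the buffer is
simply larger than needed. As in the patch, a very large keys_len could
overflow the 32-bit multiplication, so a caller would want to bound it.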

* [PATCH v2 11/20] nvme: Add pr_ops read_reservation support
  2022-08-09  0:03 ` [dm-devel] " Mike Christie
@ 2022-08-09  0:04   ` Mike Christie
  -1 siblings, 0 replies; 94+ messages in thread
From: Mike Christie @ 2022-08-09  0:04 UTC (permalink / raw)
  To: bvanassche, linux-block, dm-devel, snitzer, axboe, hch,
	linux-nvme, martin.petersen, linux-scsi, james.bottomley
  Cc: Mike Christie

This patch adds support for the pr_ops read_reservation callout by
calling the NVMe Reservation Report helper. It then parses that info to
detect whether a reservation is held and, if so, converts the returned
info to a pr_ops pr_held_reservation struct.

Signed-off-by: Mike Christie <michael.christie@oracle.com>
---
 drivers/nvme/host/core.c | 68 ++++++++++++++++++++++++++++++++++++++++
 1 file changed, 68 insertions(+)

diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
index 230e5deca391..5bbc1d84a87e 100644
--- a/drivers/nvme/host/core.c
+++ b/drivers/nvme/host/core.c
@@ -2267,6 +2267,73 @@ static int nvme_pr_read_keys(struct block_device *bdev,
 	return ret;
 }
 
+static int nvme_pr_read_reservation(struct block_device *bdev,
+		struct pr_held_reservation *resv)
+{
+	struct nvme_reservation_status tmp_status, *status;
+	int ret, i, num_regs;
+	u32 data_len;
+	bool eds;
+	u8 *data;
+
+	memset(resv, 0, sizeof(*resv));
+
+retry:
+	/*
+	 * Get the number of registrations so we know how big to allocate
+	 * the response buffer.
+	 */
+	ret = nvme_pr_resv_report(bdev, (u8 *)&tmp_status, sizeof(tmp_status),
+				  &eds);
+	if (ret)
+		return ret;
+
+	num_regs = get_unaligned_le16(&tmp_status.regctl);
+	if (!num_regs) {
+		resv->generation = le32_to_cpu(tmp_status.gen);
+		return 0;
+	}
+
+	data_len = sizeof(*status) +
+			num_regs * sizeof(struct nvme_registered_ctrl_ext);
+	data = kzalloc(data_len, GFP_KERNEL);
+	if (!data)
+		return -ENOMEM;
+
+	ret = nvme_pr_resv_report(bdev, data, data_len, &eds);
+	if (ret)
+		goto free_data;
+	status = (struct nvme_reservation_status *)data;
+
+	if (num_regs != get_unaligned_le16(&status->regctl)) {
+		kfree(data);
+		goto retry;
+	}
+
+	resv->generation = le32_to_cpu(status->gen);
+	resv->type = block_pr_type(status->rtype);
+
+	for (i = 0; i < num_regs; i++) {
+		if (eds) {
+			if (status->regctl_eds[i].rcsts) {
+				resv->key =
+					le64_to_cpu(status->regctl_eds[i].rkey);
+				break;
+			}
+		} else {
+			if (status->regctl_ds[i].rcsts) {
+				resv->key =
+					le64_to_cpu(status->regctl_ds[i].rkey);
+				break;
+			}
+		}
+	}
+
+free_data:
+	kfree(data);
+	return ret;
+}
+
 const struct pr_ops nvme_pr_ops = {
 	.pr_register	= nvme_pr_register,
 	.pr_reserve	= nvme_pr_reserve,
@@ -2274,6 +2341,7 @@ const struct pr_ops nvme_pr_ops = {
 	.pr_preempt	= nvme_pr_preempt,
 	.pr_clear	= nvme_pr_clear,
 	.pr_read_keys	= nvme_pr_read_keys,
+	.pr_read_reservation = nvme_pr_read_reservation,
 };
 
 #ifdef CONFIG_BLK_SED_OPAL
-- 
2.18.2


^ permalink raw reply related	[flat|nested] 94+ messages in thread
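
The two-pass read in nvme_pr_read_reservation() above (probe the registrant
count with a header-sized read, size the buffer from it, then retry if the
count changed in between) can be modelled as follows. This is a sketch with
a mock device: struct mock_ns, mock_report_count() and read_stable_count()
are hypothetical, and the allocation step is only indicated by a comment.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical device whose registrant count may change between reads. */
struct mock_ns {
	const uint16_t *counts;	/* count reported by each successive read */
	size_t ncounts;		/* must be at least 1 */
	size_t calls;
};

static uint16_t mock_report_count(struct mock_ns *ns)
{
	size_t idx;

	assert(ns->ncounts > 0);
	idx = ns->calls < ns->ncounts ? ns->calls : ns->ncounts - 1;
	ns->calls++;
	return ns->counts[idx];
}

/* Returns the count once the probe and the full read agree on it. */
static uint16_t read_stable_count(struct mock_ns *ns)
{
	for (;;) {
		uint16_t probed = mock_report_count(ns);  /* header-only read */

		if (!probed)
			return 0;	/* no registrants, nothing to parse */

		/* A real caller would allocate probed * entry_size here. */
		if (mock_report_count(ns) == probed)	  /* full read */
			return probed;
		/* Count changed between the two reads: size again. */
	}
}
```

Like the patch, the loop has no retry limit, so it only terminates once two
consecutive reads agree on the count.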

* [PATCH v2 12/20] block,nvme,scsi,dm: Add blk_status to pr_ops callouts.
  2022-08-09  0:03 ` [dm-devel] " Mike Christie
@ 2022-08-09  0:04   ` Mike Christie
  -1 siblings, 0 replies; 94+ messages in thread
From: Mike Christie @ 2022-08-09  0:04 UTC (permalink / raw)
  To: bvanassche, linux-block, dm-devel, snitzer, axboe, hch,
	linux-nvme, martin.petersen, linux-scsi, james.bottomley
  Cc: Mike Christie

Kernel pr_ops users like LIO need to know whether a failure was the
result of a reservation conflict, and then convert the lower level's
definition of that error to SCSI so it can be returned to the initiator.
To do this they currently have to know the lower level device type, which
can be difficult when dm-multipath sits between LIO and the device.

dm-multipath would also like to be able to distinguish between path
failures and reservation conflicts so it can optimize its error
handlers for its pr_ops.
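
As a rough illustration of the distinction dm-multipath wants, the sketch
below walks a list of per-path results and stops as soon as one path
reports success or a reservation conflict, moving on only after a path
failure. The path_result enum and sketch_try_paths() are hypothetical
names made up for this example, and the stop-on-conflict policy is an
assumption about one possible handler, not something this patch
implements.

```c
#include <stddef.h>

/* Hypothetical per-path outcomes; these are not kernel definitions. */
enum path_result {
	PATH_OK,		/* command succeeded on this path */
	PATH_RESV_CONFLICT,	/* device reported a reservation conflict */
	PATH_FAILED,		/* path or transport failure */
};

/*
 * Walk the per-path results in order. A path failure moves on to the
 * next path; success or a reservation conflict ends the walk (assumed
 * policy). Returns the last result seen and, when 'tried' is non-NULL,
 * stores the number of paths that were tried.
 */
static enum path_result sketch_try_paths(const enum path_result *paths,
					 size_t npaths, size_t *tried)
{
	enum path_result last = PATH_FAILED;
	size_t i;

	for (i = 0; i < npaths; i++) {
		last = paths[i];
		if (last != PATH_FAILED) {
			i++;	/* count the path that ended the walk */
			break;
		}
	}
	if (tried)
		*tried = i;
	return last;
}
```

With results of { PATH_FAILED, PATH_RESV_CONFLICT, PATH_OK } the walk
ends on the second path and never touches the third.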

To handle both cases, this patch adds a blk_status_t arg to the pr_ops
callouts. The lower levels will convert their device specific error to
a blk_status_t so the upper levels can easily check that code without
knowing the device type. It also allows us to keep userspace compat,
where userspace expects a negative -Exyz error code if the command fails
before it's sent to the device, or a device/transport specific value if
the error is > 0.

This patch just wires in the blk_status_t to the pr_ops callouts. The
next patches will then have the drivers pass up a blk_status_t.
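
The calling convention described above can be sketched in plain C. This
is only an illustration under stated assumptions: sketch_status_t stands
in for blk_status_t, the DEV_* codes are made-up device values, and every
sketch_* function is hypothetical. Reporting a generic I/O error for
negative returns is also an assumption; this patch only adds the
argument and does not fill it in yet.

```c
#include <errno.h>	/* callers pass -Exyz values as 'result' */
#include <stddef.h>

/* Stand-in for blk_status_t; names and values are illustrative. */
typedef enum {
	SKETCH_STS_OK,
	SKETCH_STS_NEXUS,	/* reservation conflict */
	SKETCH_STS_TRANSPORT,	/* path or transport failure */
	SKETCH_STS_IOERR,	/* any other failure */
} sketch_status_t;

/* Hypothetical device-specific codes seen by a lower level driver. */
enum {
	DEV_GOOD		= 0,
	DEV_RESV_CONFLICT	= 0x18,
	DEV_PATH_DOWN		= 0x100,
};

/* Lower level: map a device-specific code to the generic status. */
static sketch_status_t sketch_dev_to_status(int dev_code)
{
	switch (dev_code) {
	case DEV_GOOD:
		return SKETCH_STS_OK;
	case DEV_RESV_CONFLICT:
		return SKETCH_STS_NEXUS;
	case DEV_PATH_DOWN:
		return SKETCH_STS_TRANSPORT;
	default:
		return SKETCH_STS_IOERR;
	}
}

/*
 * Callout shaped like the new pr_ops. 'result' simulates what the
 * device layer produced: 0, a negative -Exyz, or a positive device
 * code. The return value is passed through unchanged so ioctl callers
 * see what they always saw; 'stat' is optional and may be NULL.
 */
static int sketch_pr_reserve(int result, sketch_status_t *stat)
{
	if (stat)
		*stat = result < 0 ? SKETCH_STS_IOERR :
				     sketch_dev_to_status(result);
	return result;
}

/* Upper level: detect a conflict without knowing the device type. */
static int sketch_is_resv_conflict(int result)
{
	sketch_status_t stat;

	sketch_pr_reserve(result, &stat);
	return stat == SKETCH_STS_NEXUS;
}
```

Callers that only care about the legacy return value pass NULL, as the
block ioctl and pNFS callers do in the diff below.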

Signed-off-by: Mike Christie <michael.christie@oracle.com>
---
 block/ioctl.c            | 11 ++++++-----
 drivers/md/dm.c          | 41 +++++++++++++++++++++++++---------------
 drivers/nvme/host/core.c | 16 +++++++++-------
 drivers/scsi/sd.c        | 21 +++++++++++---------
 fs/nfs/blocklayout/dev.c |  4 ++--
 fs/nfsd/blocklayout.c    |  6 +++---
 include/linux/pr.h       | 17 ++++++++++-------
 7 files changed, 68 insertions(+), 48 deletions(-)

diff --git a/block/ioctl.c b/block/ioctl.c
index 60121e89052b..72338c56e235 100644
--- a/block/ioctl.c
+++ b/block/ioctl.c
@@ -269,7 +269,8 @@ static int blkdev_pr_register(struct block_device *bdev,
 
 	if (reg.flags & ~PR_FL_IGNORE_KEY)
 		return -EOPNOTSUPP;
-	return ops->pr_register(bdev, reg.old_key, reg.new_key, reg.flags);
+	return ops->pr_register(bdev, reg.old_key, reg.new_key, reg.flags,
+				NULL);
 }
 
 static int blkdev_pr_reserve(struct block_device *bdev,
@@ -287,7 +288,7 @@ static int blkdev_pr_reserve(struct block_device *bdev,
 
 	if (rsv.flags & ~PR_FL_IGNORE_KEY)
 		return -EOPNOTSUPP;
-	return ops->pr_reserve(bdev, rsv.key, rsv.type, rsv.flags);
+	return ops->pr_reserve(bdev, rsv.key, rsv.type, rsv.flags, NULL);
 }
 
 static int blkdev_pr_release(struct block_device *bdev,
@@ -305,7 +306,7 @@ static int blkdev_pr_release(struct block_device *bdev,
 
 	if (rsv.flags)
 		return -EOPNOTSUPP;
-	return ops->pr_release(bdev, rsv.key, rsv.type);
+	return ops->pr_release(bdev, rsv.key, rsv.type, NULL);
 }
 
 static int blkdev_pr_preempt(struct block_device *bdev,
@@ -323,7 +324,7 @@ static int blkdev_pr_preempt(struct block_device *bdev,
 
 	if (p.flags)
 		return -EOPNOTSUPP;
-	return ops->pr_preempt(bdev, p.old_key, p.new_key, p.type, abort);
+	return ops->pr_preempt(bdev, p.old_key, p.new_key, p.type, abort, NULL);
 }
 
 static int blkdev_pr_clear(struct block_device *bdev,
@@ -341,7 +342,7 @@ static int blkdev_pr_clear(struct block_device *bdev,
 
 	if (c.flags)
 		return -EOPNOTSUPP;
-	return ops->pr_clear(bdev, c.key);
+	return ops->pr_clear(bdev, c.key, NULL);
 }
 
 static int blkdev_flushbuf(struct block_device *bdev, fmode_t mode,
diff --git a/drivers/md/dm.c b/drivers/md/dm.c
index 1b15295bdf24..ac39e5d303b9 100644
--- a/drivers/md/dm.c
+++ b/drivers/md/dm.c
@@ -3080,7 +3080,8 @@ struct dm_pr {
 	bool	abort;
 	bool	fail_early;
 	int	ret;
-	enum pr_type type;
+	enum pr_type	type;
+	blk_status_t	*blk_stat;
 };
 
 static int dm_call_pr(struct block_device *bdev, iterate_devices_callout_fn fn,
@@ -3131,7 +3132,8 @@ static int __dm_pr_register(struct dm_target *ti, struct dm_dev *dev,
 		return -1;
 	}
 
-	ret = ops->pr_register(dev->bdev, pr->old_key, pr->new_key, pr->flags);
+	ret = ops->pr_register(dev->bdev, pr->old_key, pr->new_key, pr->flags,
+			       pr->blk_stat);
 	if (!ret)
 		return 0;
 
@@ -3145,7 +3147,7 @@ static int __dm_pr_register(struct dm_target *ti, struct dm_dev *dev,
 }
 
 static int dm_pr_register(struct block_device *bdev, u64 old_key, u64 new_key,
-			  u32 flags)
+			  u32 flags, blk_status_t *blk_stat)
 {
 	struct dm_pr pr = {
 		.old_key	= old_key,
@@ -3153,6 +3155,7 @@ static int dm_pr_register(struct block_device *bdev, u64 old_key, u64 new_key,
 		.flags		= flags,
 		.fail_early	= true,
 		.ret		= 0,
+		.blk_stat	= blk_stat,
 	};
 	int ret;
 
@@ -3190,7 +3193,8 @@ static int __dm_pr_reserve(struct dm_target *ti, struct dm_dev *dev,
 		return -1;
 	}
 
-	pr->ret = ops->pr_reserve(dev->bdev, pr->old_key, pr->type, pr->flags);
+	pr->ret = ops->pr_reserve(dev->bdev, pr->old_key, pr->type, pr->flags,
+				  pr->blk_stat);
 	if (!pr->ret)
 		return -1;
 
@@ -3198,7 +3202,7 @@ static int __dm_pr_reserve(struct dm_target *ti, struct dm_dev *dev,
 }
 
 static int dm_pr_reserve(struct block_device *bdev, u64 key, enum pr_type type,
-			 u32 flags)
+			 u32 flags, blk_status_t *blk_stat)
 {
 	struct dm_pr pr = {
 		.old_key	= key,
@@ -3206,6 +3210,7 @@ static int dm_pr_reserve(struct block_device *bdev, u64 key, enum pr_type type,
 		.type		= type,
 		.fail_early	= false,
 		.ret		= 0,
+		.blk_stat	= blk_stat,
 	};
 	int ret;
 
@@ -3233,19 +3238,22 @@ static int __dm_pr_release(struct dm_target *ti, struct dm_dev *dev,
 		return -1;
 	}
 
-	pr->ret = ops->pr_release(dev->bdev, pr->old_key, pr->type);
+	pr->ret = ops->pr_release(dev->bdev, pr->old_key, pr->type,
+				  pr->blk_stat);
 	if (pr->ret)
 		return -1;
 
 	return 0;
 }
 
-static int dm_pr_release(struct block_device *bdev, u64 key, enum pr_type type)
+static int dm_pr_release(struct block_device *bdev, u64 key, enum pr_type type,
+			 blk_status_t *blk_stat)
 {
 	struct dm_pr pr = {
 		.old_key	= key,
 		.type		= type,
 		.fail_early	= false,
+		.blk_stat	= blk_stat,
 	};
 	int ret;
 
@@ -3268,7 +3276,7 @@ static int __dm_pr_preempt(struct dm_target *ti, struct dm_dev *dev,
 	}
 
 	pr->ret = ops->pr_preempt(dev->bdev, pr->old_key, pr->new_key, pr->type,
-				  pr->abort);
+				  pr->abort, pr->blk_stat);
 	if (!pr->ret)
 		return -1;
 
@@ -3276,13 +3284,14 @@ static int __dm_pr_preempt(struct dm_target *ti, struct dm_dev *dev,
 }
 
 static int dm_pr_preempt(struct block_device *bdev, u64 old_key, u64 new_key,
-			 enum pr_type type, bool abort)
+			 enum pr_type type, bool abort, blk_status_t *blk_stat)
 {
 	struct dm_pr pr = {
 		.new_key	= new_key,
 		.old_key	= old_key,
 		.type		= type,
 		.fail_early	= false,
+		.blk_stat	= blk_stat,
 	};
 	int ret;
 
@@ -3293,7 +3302,8 @@ static int dm_pr_preempt(struct block_device *bdev, u64 old_key, u64 new_key,
 	return pr.ret;
 }
 
-static int dm_pr_clear(struct block_device *bdev, u64 key)
+static int dm_pr_clear(struct block_device *bdev, u64 key,
+		       blk_status_t *blk_stat)
 {
 	struct mapped_device *md = bdev->bd_disk->private_data;
 	const struct pr_ops *ops;
@@ -3305,7 +3315,7 @@ static int dm_pr_clear(struct block_device *bdev, u64 key)
 
 	ops = bdev->bd_disk->fops->pr_ops;
 	if (ops && ops->pr_clear)
-		r = ops->pr_clear(bdev, key);
+		r = ops->pr_clear(bdev, key, blk_stat);
 	else
 		r = -EOPNOTSUPP;
 out:
@@ -3314,7 +3324,7 @@ static int dm_pr_clear(struct block_device *bdev, u64 key)
 }
 
 static int dm_pr_read_keys(struct block_device *bdev, struct pr_keys *keys,
-			   u32 keys_len)
+			   u32 keys_len, blk_status_t *blk_stat)
 {
 	struct mapped_device *md = bdev->bd_disk->private_data;
 	const struct pr_ops *ops;
@@ -3326,7 +3336,7 @@ static int dm_pr_read_keys(struct block_device *bdev, struct pr_keys *keys,
 
 	ops = bdev->bd_disk->fops->pr_ops;
 	if (ops && ops->pr_read_keys)
-		r = ops->pr_read_keys(bdev, keys, keys_len);
+		r = ops->pr_read_keys(bdev, keys, keys_len, blk_stat);
 	else
 		r = -EOPNOTSUPP;
 out:
@@ -3335,7 +3345,8 @@ static int dm_pr_read_keys(struct block_device *bdev, struct pr_keys *keys,
 }
 
 static int dm_pr_read_reservation(struct block_device *bdev,
-				  struct pr_held_reservation *rsv)
+				  struct pr_held_reservation *rsv,
+				  blk_status_t *blk_stat)
 {
 	struct mapped_device *md = bdev->bd_disk->private_data;
 	const struct pr_ops *ops;
@@ -3347,7 +3358,7 @@ static int dm_pr_read_reservation(struct block_device *bdev,
 
 	ops = bdev->bd_disk->fops->pr_ops;
 	if (ops && ops->pr_read_reservation)
-		r = ops->pr_read_reservation(bdev, rsv);
+		r = ops->pr_read_reservation(bdev, rsv, blk_stat);
 	else
 		r = -EOPNOTSUPP;
 out:
diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
index 5bbc1d84a87e..49bd745d28e2 100644
--- a/drivers/nvme/host/core.c
+++ b/drivers/nvme/host/core.c
@@ -2148,7 +2148,7 @@ static int nvme_pr_command(struct block_device *bdev, u32 cdw10,
 }
 
 static int nvme_pr_register(struct block_device *bdev, u64 old,
-		u64 new, unsigned flags)
+		u64 new, unsigned flags, blk_status_t *blk_stat)
 {
 	u32 cdw10;
 
@@ -2162,7 +2162,7 @@ static int nvme_pr_register(struct block_device *bdev, u64 old,
 }
 
 static int nvme_pr_reserve(struct block_device *bdev, u64 key,
-		enum pr_type type, unsigned flags)
+		enum pr_type type, unsigned flags, blk_status_t *blk_stat)
 {
 	u32 cdw10;
 
@@ -2175,21 +2175,23 @@ static int nvme_pr_reserve(struct block_device *bdev, u64 key,
 }
 
 static int nvme_pr_preempt(struct block_device *bdev, u64 old, u64 new,
-		enum pr_type type, bool abort)
+		enum pr_type type, bool abort, blk_status_t *blk_stat)
 {
 	u32 cdw10 = nvme_pr_type(type) << 8 | (abort ? 2 : 1);
 
 	return nvme_pr_command(bdev, cdw10, old, new, nvme_cmd_resv_acquire);
 }
 
-static int nvme_pr_clear(struct block_device *bdev, u64 key)
+static int nvme_pr_clear(struct block_device *bdev, u64 key,
+		blk_status_t *blk_stat)
 {
 	u32 cdw10 = 1 | (key ? 1 << 3 : 0);
 
 	return nvme_pr_command(bdev, cdw10, key, 0, nvme_cmd_resv_register);
 }
 
-static int nvme_pr_release(struct block_device *bdev, u64 key, enum pr_type type)
+static int nvme_pr_release(struct block_device *bdev, u64 key, enum pr_type type,
+		blk_status_t *blk_stat)
 {
 	u32 cdw10 = nvme_pr_type(type) << 8 | (key ? 1 << 3 : 0);
 
@@ -2224,7 +2226,7 @@ static int nvme_pr_resv_report(struct block_device *bdev, u8 *data,
 }
 
 static int nvme_pr_read_keys(struct block_device *bdev,
-		struct pr_keys *keys_info, u32 keys_len)
+		struct pr_keys *keys_info, u32 keys_len, blk_status_t *blk_stat)
 {
 	struct nvme_reservation_status *status;
 	u32 data_len, num_ret_keys;
@@ -2268,7 +2270,7 @@ static int nvme_pr_read_keys(struct block_device *bdev,
 }
 
 static int nvme_pr_read_reservation(struct block_device *bdev,
-		struct pr_held_reservation *resv)
+		struct pr_held_reservation *resv, blk_status_t *blk_stat)
 {
 	struct nvme_reservation_status tmp_status, *status;
 	int ret, i, num_regs;
diff --git a/drivers/scsi/sd.c b/drivers/scsi/sd.c
index f1d4d0568075..bf080de9866d 100644
--- a/drivers/scsi/sd.c
+++ b/drivers/scsi/sd.c
@@ -1708,7 +1708,7 @@ static int sd_pr_in_command(struct block_device *bdev, u8 sa,
 }
 
 static int sd_pr_read_keys(struct block_device *bdev, struct pr_keys *keys_info,
-			   u32 keys_len)
+			   u32 keys_len, blk_status_t *blk_stat)
 {
 	int result, i, data_offset, num_copy_keys;
 	int data_len = keys_len + 8;
@@ -1739,7 +1739,8 @@ static int sd_pr_read_keys(struct block_device *bdev, struct pr_keys *keys_info,
 }
 
 static int sd_pr_read_reservation(struct block_device *bdev,
-				  struct pr_held_reservation *rsv)
+				  struct pr_held_reservation *rsv,
+				  blk_status_t *blk_stat)
 {
 	struct scsi_disk *sdkp = scsi_disk(bdev->bd_disk);
 	struct scsi_device *sdev = sdkp->device;
@@ -1769,8 +1770,8 @@ static int sd_pr_read_reservation(struct block_device *bdev,
 	return 0;
 }
 
-static int sd_pr_out_command(struct block_device *bdev, u8 sa,
-		u64 key, u64 sa_key, u8 type, u8 flags)
+static int sd_pr_out_command(struct block_device *bdev, u8 sa, u64 key,
+		u64 sa_key, u8 type, u8 flags)
 {
 	struct scsi_disk *sdkp = scsi_disk(bdev->bd_disk);
 	struct scsi_device *sdev = sdkp->device;
@@ -1801,7 +1802,7 @@ static int sd_pr_out_command(struct block_device *bdev, u8 sa,
 }
 
 static int sd_pr_register(struct block_device *bdev, u64 old_key, u64 new_key,
-		u32 flags)
+		u32 flags, blk_status_t *blk_stat)
 {
 	if (flags & ~PR_FL_IGNORE_KEY)
 		return -EOPNOTSUPP;
@@ -1811,7 +1812,7 @@ static int sd_pr_register(struct block_device *bdev, u64 old_key, u64 new_key,
 }
 
 static int sd_pr_reserve(struct block_device *bdev, u64 key, enum pr_type type,
-		u32 flags)
+		u32 flags, blk_status_t *blk_stat)
 {
 	if (flags)
 		return -EOPNOTSUPP;
@@ -1819,20 +1820,22 @@ static int sd_pr_reserve(struct block_device *bdev, u64 key, enum pr_type type,
 				 block_pr_type_to_scsi(type), 0);
 }
 
-static int sd_pr_release(struct block_device *bdev, u64 key, enum pr_type type)
+static int sd_pr_release(struct block_device *bdev, u64 key, enum pr_type type,
+		blk_status_t *blk_stat)
 {
 	return sd_pr_out_command(bdev, 0x02, key, 0,
 				 block_pr_type_to_scsi(type), 0);
 }
 
 static int sd_pr_preempt(struct block_device *bdev, u64 old_key, u64 new_key,
-		enum pr_type type, bool abort)
+		enum pr_type type, bool abort, blk_status_t *blk_stat)
 {
 	return sd_pr_out_command(bdev, abort ? 0x05 : 0x04, old_key, new_key,
 				 block_pr_type_to_scsi(type), 0);
 }
 
-static int sd_pr_clear(struct block_device *bdev, u64 key)
+static int sd_pr_clear(struct block_device *bdev, u64 key,
+		blk_status_t *blk_stat)
 {
 	return sd_pr_out_command(bdev, 0x03, key, 0, 0, 0);
 }
diff --git a/fs/nfs/blocklayout/dev.c b/fs/nfs/blocklayout/dev.c
index 5e56da748b2a..8726c1473d55 100644
--- a/fs/nfs/blocklayout/dev.c
+++ b/fs/nfs/blocklayout/dev.c
@@ -29,7 +29,7 @@ bl_free_device(struct pnfs_block_dev *dev)
 			int error;
 
 			error = ops->pr_register(dev->bdev, dev->pr_key, 0,
-				false);
+				false, NULL);
 			if (error)
 				pr_err("failed to unregister PR key.\n");
 		}
@@ -382,7 +382,7 @@ bl_parse_scsi(struct nfs_server *server, struct pnfs_block_dev *d,
 		goto out_blkdev_put;
 	}
 
-	error = ops->pr_register(d->bdev, 0, d->pr_key, true);
+	error = ops->pr_register(d->bdev, 0, d->pr_key, true, NULL);
 	if (error) {
 		pr_err("pNFS: failed to register key for block device %s.",
 				d->bdev->bd_disk->disk_name);
diff --git a/fs/nfsd/blocklayout.c b/fs/nfsd/blocklayout.c
index b6d01d51a746..a302ea026f72 100644
--- a/fs/nfsd/blocklayout.c
+++ b/fs/nfsd/blocklayout.c
@@ -277,7 +277,7 @@ nfsd4_block_get_device_info_scsi(struct super_block *sb,
 		goto out_free_dev;
 	}
 
-	ret = ops->pr_register(sb->s_bdev, 0, NFSD_MDS_PR_KEY, true);
+	ret = ops->pr_register(sb->s_bdev, 0, NFSD_MDS_PR_KEY, true, NULL);
 	if (ret) {
 		pr_err("pNFS: failed to register key for device %s.\n",
 			sb->s_id);
@@ -285,7 +285,7 @@ nfsd4_block_get_device_info_scsi(struct super_block *sb,
 	}
 
 	ret = ops->pr_reserve(sb->s_bdev, NFSD_MDS_PR_KEY,
-			PR_EXCLUSIVE_ACCESS_REG_ONLY, 0);
+			PR_EXCLUSIVE_ACCESS_REG_ONLY, 0, NULL);
 	if (ret) {
 		pr_err("pNFS: failed to reserve device %s.\n",
 			sb->s_id);
@@ -331,7 +331,7 @@ nfsd4_scsi_fence_client(struct nfs4_layout_stateid *ls)
 	struct block_device *bdev = ls->ls_file->nf_file->f_path.mnt->mnt_sb->s_bdev;
 
 	bdev->bd_disk->fops->pr_ops->pr_preempt(bdev, NFSD_MDS_PR_KEY,
-			nfsd4_scsi_pr_key(clp), 0, true);
+			nfsd4_scsi_pr_key(clp), 0, true, NULL);
 }
 
 const struct nfsd4_layout_ops scsi_layout_ops = {
diff --git a/include/linux/pr.h b/include/linux/pr.h
index 79b3d2853a20..2cbe97f06490 100644
--- a/include/linux/pr.h
+++ b/include/linux/pr.h
@@ -18,14 +18,15 @@ struct pr_held_reservation {
 
 struct pr_ops {
 	int (*pr_register)(struct block_device *bdev, u64 old_key, u64 new_key,
-			u32 flags);
+			u32 flags, blk_status_t *blk_stat);
 	int (*pr_reserve)(struct block_device *bdev, u64 key,
-			enum pr_type type, u32 flags);
+			enum pr_type type, u32 flags, blk_status_t *blk_stat);
 	int (*pr_release)(struct block_device *bdev, u64 key,
-			enum pr_type type);
+			enum pr_type type, blk_status_t *blk_stat);
 	int (*pr_preempt)(struct block_device *bdev, u64 old_key, u64 new_key,
-			enum pr_type type, bool abort);
-	int (*pr_clear)(struct block_device *bdev, u64 key);
+			enum pr_type type, bool abort, blk_status_t *blk_stat);
+	int (*pr_clear)(struct block_device *bdev, u64 key,
+			blk_status_t *blk_stat);
 	/*
 	 * pr_read_keys - Read the registered keys and return them in the
 	 * pr_keys->keys array. The keys array will have been allocated at the
@@ -35,9 +36,11 @@ struct pr_ops {
 	 * contains, so the caller can retry with a larger array.
 	 */
 	int (*pr_read_keys)(struct block_device *bdev,
-			struct pr_keys *keys_info, u32 keys_len);
+			struct pr_keys *keys_info, u32 keys_len,
+			blk_status_t *blk_stat);
 	int (*pr_read_reservation)(struct block_device *bdev,
-			struct pr_held_reservation *rsv);
+			struct pr_held_reservation *rsv,
+			blk_status_t *blk_stat);
 };
 
 #endif /* LINUX_PR_H */
-- 
2.18.2


^ permalink raw reply related	[flat|nested] 94+ messages in thread

* [dm-devel] [PATCH v2 12/20] block, nvme, scsi, dm: Add blk_status to pr_ops callouts.
@ 2022-08-09  0:04   ` Mike Christie
  0 siblings, 0 replies; 94+ messages in thread
From: Mike Christie @ 2022-08-09  0:04 UTC (permalink / raw)
  To: bvanassche, linux-block, dm-devel, snitzer, axboe, hch,
	linux-nvme, martin.petersen, linux-scsi, james.bottomley
  Cc: Mike Christie

Kernel pr_ops users like LIO need to know whether a failure was the
result of a reservation conflict, and then convert the lower level's
definition of that error to SCSI so it can be returned to the initiator.
To do this they currently have to know the lower level device type, which
can be difficult when dm-multipath sits between LIO and the device.

dm-multipath would also like to be able to distinguish between path
failures and reservation conflicts so it can optimize its error
handlers for its pr_ops.

To handle both cases, this patch adds a blk_status_t arg to the pr_ops
callouts. The lower levels will convert their device specific error to
a blk_status_t so the upper levels can easily check that code without
knowing the device type. It also allows us to keep userspace compat,
where userspace expects a negative -Exyz error code if the command fails
before it's sent to the device, or a device/transport specific value if
the error is > 0.

This patch just wires in the blk_status_t to the pr_ops callouts. The
next patches will then have the drivers pass up a blk_status_t.

Signed-off-by: Mike Christie <michael.christie@oracle.com>
---
 block/ioctl.c            | 11 ++++++-----
 drivers/md/dm.c          | 41 +++++++++++++++++++++++++---------------
 drivers/nvme/host/core.c | 16 +++++++++-------
 drivers/scsi/sd.c        | 21 +++++++++++---------
 fs/nfs/blocklayout/dev.c |  4 ++--
 fs/nfsd/blocklayout.c    |  6 +++---
 include/linux/pr.h       | 17 ++++++++++-------
 7 files changed, 68 insertions(+), 48 deletions(-)

diff --git a/block/ioctl.c b/block/ioctl.c
index 60121e89052b..72338c56e235 100644
--- a/block/ioctl.c
+++ b/block/ioctl.c
@@ -269,7 +269,8 @@ static int blkdev_pr_register(struct block_device *bdev,
 
 	if (reg.flags & ~PR_FL_IGNORE_KEY)
 		return -EOPNOTSUPP;
-	return ops->pr_register(bdev, reg.old_key, reg.new_key, reg.flags);
+	return ops->pr_register(bdev, reg.old_key, reg.new_key, reg.flags,
+				NULL);
 }
 
 static int blkdev_pr_reserve(struct block_device *bdev,
@@ -287,7 +288,7 @@ static int blkdev_pr_reserve(struct block_device *bdev,
 
 	if (rsv.flags & ~PR_FL_IGNORE_KEY)
 		return -EOPNOTSUPP;
-	return ops->pr_reserve(bdev, rsv.key, rsv.type, rsv.flags);
+	return ops->pr_reserve(bdev, rsv.key, rsv.type, rsv.flags, NULL);
 }
 
 static int blkdev_pr_release(struct block_device *bdev,
@@ -305,7 +306,7 @@ static int blkdev_pr_release(struct block_device *bdev,
 
 	if (rsv.flags)
 		return -EOPNOTSUPP;
-	return ops->pr_release(bdev, rsv.key, rsv.type);
+	return ops->pr_release(bdev, rsv.key, rsv.type, NULL);
 }
 
 static int blkdev_pr_preempt(struct block_device *bdev,
@@ -323,7 +324,7 @@ static int blkdev_pr_preempt(struct block_device *bdev,
 
 	if (p.flags)
 		return -EOPNOTSUPP;
-	return ops->pr_preempt(bdev, p.old_key, p.new_key, p.type, abort);
+	return ops->pr_preempt(bdev, p.old_key, p.new_key, p.type, abort, NULL);
 }
 
 static int blkdev_pr_clear(struct block_device *bdev,
@@ -341,7 +342,7 @@ static int blkdev_pr_clear(struct block_device *bdev,
 
 	if (c.flags)
 		return -EOPNOTSUPP;
-	return ops->pr_clear(bdev, c.key);
+	return ops->pr_clear(bdev, c.key, NULL);
 }
 
 static int blkdev_flushbuf(struct block_device *bdev, fmode_t mode,
diff --git a/drivers/md/dm.c b/drivers/md/dm.c
index 1b15295bdf24..ac39e5d303b9 100644
--- a/drivers/md/dm.c
+++ b/drivers/md/dm.c
@@ -3080,7 +3080,8 @@ struct dm_pr {
 	bool	abort;
 	bool	fail_early;
 	int	ret;
-	enum pr_type type;
+	enum pr_type	type;
+	blk_status_t	*blk_stat;
 };
 
 static int dm_call_pr(struct block_device *bdev, iterate_devices_callout_fn fn,
@@ -3131,7 +3132,8 @@ static int __dm_pr_register(struct dm_target *ti, struct dm_dev *dev,
 		return -1;
 	}
 
-	ret = ops->pr_register(dev->bdev, pr->old_key, pr->new_key, pr->flags);
+	ret = ops->pr_register(dev->bdev, pr->old_key, pr->new_key, pr->flags,
+			       pr->blk_stat);
 	if (!ret)
 		return 0;
 
@@ -3145,7 +3147,7 @@ static int __dm_pr_register(struct dm_target *ti, struct dm_dev *dev,
 }
 
 static int dm_pr_register(struct block_device *bdev, u64 old_key, u64 new_key,
-			  u32 flags)
+			  u32 flags, blk_status_t *blk_stat)
 {
 	struct dm_pr pr = {
 		.old_key	= old_key,
@@ -3153,6 +3155,7 @@ static int dm_pr_register(struct block_device *bdev, u64 old_key, u64 new_key,
 		.flags		= flags,
 		.fail_early	= true,
 		.ret		= 0,
+		.blk_stat	= blk_stat,
 	};
 	int ret;
 
@@ -3190,7 +3193,8 @@ static int __dm_pr_reserve(struct dm_target *ti, struct dm_dev *dev,
 		return -1;
 	}
 
-	pr->ret = ops->pr_reserve(dev->bdev, pr->old_key, pr->type, pr->flags);
+	pr->ret = ops->pr_reserve(dev->bdev, pr->old_key, pr->type, pr->flags,
+				  pr->blk_stat);
 	if (!pr->ret)
 		return -1;
 
@@ -3198,7 +3202,7 @@ static int __dm_pr_reserve(struct dm_target *ti, struct dm_dev *dev,
 }
 
 static int dm_pr_reserve(struct block_device *bdev, u64 key, enum pr_type type,
-			 u32 flags)
+			 u32 flags, blk_status_t *blk_stat)
 {
 	struct dm_pr pr = {
 		.old_key	= key,
@@ -3206,6 +3210,7 @@ static int dm_pr_reserve(struct block_device *bdev, u64 key, enum pr_type type,
 		.type		= type,
 		.fail_early	= false,
 		.ret		= 0,
+		.blk_stat	= blk_stat,
 	};
 	int ret;
 
@@ -3233,19 +3238,22 @@ static int __dm_pr_release(struct dm_target *ti, struct dm_dev *dev,
 		return -1;
 	}
 
-	pr->ret = ops->pr_release(dev->bdev, pr->old_key, pr->type);
+	pr->ret = ops->pr_release(dev->bdev, pr->old_key, pr->type,
+				  pr->blk_stat);
 	if (pr->ret)
 		return -1;
 
 	return 0;
 }
 
-static int dm_pr_release(struct block_device *bdev, u64 key, enum pr_type type)
+static int dm_pr_release(struct block_device *bdev, u64 key, enum pr_type type,
+			 blk_status_t *blk_stat)
 {
 	struct dm_pr pr = {
 		.old_key	= key,
 		.type		= type,
 		.fail_early	= false,
+		.blk_stat	= blk_stat,
 	};
 	int ret;
 
@@ -3268,7 +3276,7 @@ static int __dm_pr_preempt(struct dm_target *ti, struct dm_dev *dev,
 	}
 
 	pr->ret = ops->pr_preempt(dev->bdev, pr->old_key, pr->new_key, pr->type,
-				  pr->abort);
+				  pr->abort, pr->blk_stat);
 	if (!pr->ret)
 		return -1;
 
@@ -3276,13 +3284,14 @@ static int __dm_pr_preempt(struct dm_target *ti, struct dm_dev *dev,
 }
 
 static int dm_pr_preempt(struct block_device *bdev, u64 old_key, u64 new_key,
-			 enum pr_type type, bool abort)
+			 enum pr_type type, bool abort, blk_status_t *blk_stat)
 {
 	struct dm_pr pr = {
 		.new_key	= new_key,
 		.old_key	= old_key,
 		.type		= type,
 		.fail_early	= false,
+		.blk_stat	= blk_stat,
 	};
 	int ret;
 
@@ -3293,7 +3302,8 @@ static int dm_pr_preempt(struct block_device *bdev, u64 old_key, u64 new_key,
 	return pr.ret;
 }
 
-static int dm_pr_clear(struct block_device *bdev, u64 key)
+static int dm_pr_clear(struct block_device *bdev, u64 key,
+		       blk_status_t *blk_stat)
 {
 	struct mapped_device *md = bdev->bd_disk->private_data;
 	const struct pr_ops *ops;
@@ -3305,7 +3315,7 @@ static int dm_pr_clear(struct block_device *bdev, u64 key)
 
 	ops = bdev->bd_disk->fops->pr_ops;
 	if (ops && ops->pr_clear)
-		r = ops->pr_clear(bdev, key);
+		r = ops->pr_clear(bdev, key, blk_stat);
 	else
 		r = -EOPNOTSUPP;
 out:
@@ -3314,7 +3324,7 @@ static int dm_pr_clear(struct block_device *bdev, u64 key)
 }
 
 static int dm_pr_read_keys(struct block_device *bdev, struct pr_keys *keys,
-			   u32 keys_len)
+			   u32 keys_len, blk_status_t *blk_stat)
 {
 	struct mapped_device *md = bdev->bd_disk->private_data;
 	const struct pr_ops *ops;
@@ -3326,7 +3336,7 @@ static int dm_pr_read_keys(struct block_device *bdev, struct pr_keys *keys,
 
 	ops = bdev->bd_disk->fops->pr_ops;
 	if (ops && ops->pr_read_keys)
-		r = ops->pr_read_keys(bdev, keys, keys_len);
+		r = ops->pr_read_keys(bdev, keys, keys_len, blk_stat);
 	else
 		r = -EOPNOTSUPP;
 out:
@@ -3335,7 +3345,8 @@ static int dm_pr_read_keys(struct block_device *bdev, struct pr_keys *keys,
 }
 
 static int dm_pr_read_reservation(struct block_device *bdev,
-				  struct pr_held_reservation *rsv)
+				  struct pr_held_reservation *rsv,
+				  blk_status_t *blk_stat)
 {
 	struct mapped_device *md = bdev->bd_disk->private_data;
 	const struct pr_ops *ops;
@@ -3347,7 +3358,7 @@ static int dm_pr_read_reservation(struct block_device *bdev,
 
 	ops = bdev->bd_disk->fops->pr_ops;
 	if (ops && ops->pr_read_reservation)
-		r = ops->pr_read_reservation(bdev, rsv);
+		r = ops->pr_read_reservation(bdev, rsv, blk_stat);
 	else
 		r = -EOPNOTSUPP;
 out:
diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
index 5bbc1d84a87e..49bd745d28e2 100644
--- a/drivers/nvme/host/core.c
+++ b/drivers/nvme/host/core.c
@@ -2148,7 +2148,7 @@ static int nvme_pr_command(struct block_device *bdev, u32 cdw10,
 }
 
 static int nvme_pr_register(struct block_device *bdev, u64 old,
-		u64 new, unsigned flags)
+		u64 new, unsigned flags, blk_status_t *blk_stat)
 {
 	u32 cdw10;
 
@@ -2162,7 +2162,7 @@ static int nvme_pr_register(struct block_device *bdev, u64 old,
 }
 
 static int nvme_pr_reserve(struct block_device *bdev, u64 key,
-		enum pr_type type, unsigned flags)
+		enum pr_type type, unsigned flags, blk_status_t *blk_stat)
 {
 	u32 cdw10;
 
@@ -2175,21 +2175,23 @@ static int nvme_pr_reserve(struct block_device *bdev, u64 key,
 }
 
 static int nvme_pr_preempt(struct block_device *bdev, u64 old, u64 new,
-		enum pr_type type, bool abort)
+		enum pr_type type, bool abort, blk_status_t *blk_stat)
 {
 	u32 cdw10 = nvme_pr_type(type) << 8 | (abort ? 2 : 1);
 
 	return nvme_pr_command(bdev, cdw10, old, new, nvme_cmd_resv_acquire);
 }
 
-static int nvme_pr_clear(struct block_device *bdev, u64 key)
+static int nvme_pr_clear(struct block_device *bdev, u64 key,
+		blk_status_t *blk_stat)
 {
 	u32 cdw10 = 1 | (key ? 1 << 3 : 0);
 
 	return nvme_pr_command(bdev, cdw10, key, 0, nvme_cmd_resv_register);
 }
 
-static int nvme_pr_release(struct block_device *bdev, u64 key, enum pr_type type)
+static int nvme_pr_release(struct block_device *bdev, u64 key, enum pr_type type,
+		blk_status_t *blk_stat)
 {
 	u32 cdw10 = nvme_pr_type(type) << 8 | (key ? 1 << 3 : 0);
 
@@ -2224,7 +2226,7 @@ static int nvme_pr_resv_report(struct block_device *bdev, u8 *data,
 }
 
 static int nvme_pr_read_keys(struct block_device *bdev,
-		struct pr_keys *keys_info, u32 keys_len)
+		struct pr_keys *keys_info, u32 keys_len, blk_status_t *blk_stat)
 {
 	struct nvme_reservation_status *status;
 	u32 data_len, num_ret_keys;
@@ -2268,7 +2270,7 @@ static int nvme_pr_read_keys(struct block_device *bdev,
 }
 
 static int nvme_pr_read_reservation(struct block_device *bdev,
-		struct pr_held_reservation *resv)
+		struct pr_held_reservation *resv, blk_status_t *blk_stat)
 {
 	struct nvme_reservation_status tmp_status, *status;
 	int ret, i, num_regs;
diff --git a/drivers/scsi/sd.c b/drivers/scsi/sd.c
index f1d4d0568075..bf080de9866d 100644
--- a/drivers/scsi/sd.c
+++ b/drivers/scsi/sd.c
@@ -1708,7 +1708,7 @@ static int sd_pr_in_command(struct block_device *bdev, u8 sa,
 }
 
 static int sd_pr_read_keys(struct block_device *bdev, struct pr_keys *keys_info,
-			   u32 keys_len)
+			   u32 keys_len, blk_status_t *blk_stat)
 {
 	int result, i, data_offset, num_copy_keys;
 	int data_len = keys_len + 8;
@@ -1739,7 +1739,8 @@ static int sd_pr_read_keys(struct block_device *bdev, struct pr_keys *keys_info,
 }
 
 static int sd_pr_read_reservation(struct block_device *bdev,
-				  struct pr_held_reservation *rsv)
+				  struct pr_held_reservation *rsv,
+				  blk_status_t *blk_stat)
 {
 	struct scsi_disk *sdkp = scsi_disk(bdev->bd_disk);
 	struct scsi_device *sdev = sdkp->device;
@@ -1769,8 +1770,8 @@ static int sd_pr_read_reservation(struct block_device *bdev,
 	return 0;
 }
 
-static int sd_pr_out_command(struct block_device *bdev, u8 sa,
-		u64 key, u64 sa_key, u8 type, u8 flags)
+static int sd_pr_out_command(struct block_device *bdev, u8 sa, u64 key,
+		u64 sa_key, u8 type, u8 flags)
 {
 	struct scsi_disk *sdkp = scsi_disk(bdev->bd_disk);
 	struct scsi_device *sdev = sdkp->device;
@@ -1801,7 +1802,7 @@ static int sd_pr_out_command(struct block_device *bdev, u8 sa,
 }
 
 static int sd_pr_register(struct block_device *bdev, u64 old_key, u64 new_key,
-		u32 flags)
+		u32 flags, blk_status_t *blk_stat)
 {
 	if (flags & ~PR_FL_IGNORE_KEY)
 		return -EOPNOTSUPP;
@@ -1811,7 +1812,7 @@ static int sd_pr_register(struct block_device *bdev, u64 old_key, u64 new_key,
 }
 
 static int sd_pr_reserve(struct block_device *bdev, u64 key, enum pr_type type,
-		u32 flags)
+		u32 flags, blk_status_t *blk_stat)
 {
 	if (flags)
 		return -EOPNOTSUPP;
@@ -1819,20 +1820,22 @@ static int sd_pr_reserve(struct block_device *bdev, u64 key, enum pr_type type,
 				 block_pr_type_to_scsi(type), 0);
 }
 
-static int sd_pr_release(struct block_device *bdev, u64 key, enum pr_type type)
+static int sd_pr_release(struct block_device *bdev, u64 key, enum pr_type type,
+		blk_status_t *blk_stat)
 {
 	return sd_pr_out_command(bdev, 0x02, key, 0,
 				 block_pr_type_to_scsi(type), 0);
 }
 
 static int sd_pr_preempt(struct block_device *bdev, u64 old_key, u64 new_key,
-		enum pr_type type, bool abort)
+		enum pr_type type, bool abort, blk_status_t *blk_stat)
 {
 	return sd_pr_out_command(bdev, abort ? 0x05 : 0x04, old_key, new_key,
 				 block_pr_type_to_scsi(type), 0);
 }
 
-static int sd_pr_clear(struct block_device *bdev, u64 key)
+static int sd_pr_clear(struct block_device *bdev, u64 key,
+		blk_status_t *blk_stat)
 {
 	return sd_pr_out_command(bdev, 0x03, key, 0, 0, 0);
 }
diff --git a/fs/nfs/blocklayout/dev.c b/fs/nfs/blocklayout/dev.c
index 5e56da748b2a..8726c1473d55 100644
--- a/fs/nfs/blocklayout/dev.c
+++ b/fs/nfs/blocklayout/dev.c
@@ -29,7 +29,7 @@ bl_free_device(struct pnfs_block_dev *dev)
 			int error;
 
 			error = ops->pr_register(dev->bdev, dev->pr_key, 0,
-				false);
+				false, NULL);
 			if (error)
 				pr_err("failed to unregister PR key.\n");
 		}
@@ -382,7 +382,7 @@ bl_parse_scsi(struct nfs_server *server, struct pnfs_block_dev *d,
 		goto out_blkdev_put;
 	}
 
-	error = ops->pr_register(d->bdev, 0, d->pr_key, true);
+	error = ops->pr_register(d->bdev, 0, d->pr_key, true, NULL);
 	if (error) {
 		pr_err("pNFS: failed to register key for block device %s.",
 				d->bdev->bd_disk->disk_name);
diff --git a/fs/nfsd/blocklayout.c b/fs/nfsd/blocklayout.c
index b6d01d51a746..a302ea026f72 100644
--- a/fs/nfsd/blocklayout.c
+++ b/fs/nfsd/blocklayout.c
@@ -277,7 +277,7 @@ nfsd4_block_get_device_info_scsi(struct super_block *sb,
 		goto out_free_dev;
 	}
 
-	ret = ops->pr_register(sb->s_bdev, 0, NFSD_MDS_PR_KEY, true);
+	ret = ops->pr_register(sb->s_bdev, 0, NFSD_MDS_PR_KEY, true, NULL);
 	if (ret) {
 		pr_err("pNFS: failed to register key for device %s.\n",
 			sb->s_id);
@@ -285,7 +285,7 @@ nfsd4_block_get_device_info_scsi(struct super_block *sb,
 	}
 
 	ret = ops->pr_reserve(sb->s_bdev, NFSD_MDS_PR_KEY,
-			PR_EXCLUSIVE_ACCESS_REG_ONLY, 0);
+			PR_EXCLUSIVE_ACCESS_REG_ONLY, 0, NULL);
 	if (ret) {
 		pr_err("pNFS: failed to reserve device %s.\n",
 			sb->s_id);
@@ -331,7 +331,7 @@ nfsd4_scsi_fence_client(struct nfs4_layout_stateid *ls)
 	struct block_device *bdev = ls->ls_file->nf_file->f_path.mnt->mnt_sb->s_bdev;
 
 	bdev->bd_disk->fops->pr_ops->pr_preempt(bdev, NFSD_MDS_PR_KEY,
-			nfsd4_scsi_pr_key(clp), 0, true);
+			nfsd4_scsi_pr_key(clp), 0, true, NULL);
 }
 
 const struct nfsd4_layout_ops scsi_layout_ops = {
diff --git a/include/linux/pr.h b/include/linux/pr.h
index 79b3d2853a20..2cbe97f06490 100644
--- a/include/linux/pr.h
+++ b/include/linux/pr.h
@@ -18,14 +18,15 @@ struct pr_held_reservation {
 
 struct pr_ops {
 	int (*pr_register)(struct block_device *bdev, u64 old_key, u64 new_key,
-			u32 flags);
+			u32 flags, blk_status_t *blk_stat);
 	int (*pr_reserve)(struct block_device *bdev, u64 key,
-			enum pr_type type, u32 flags);
+			enum pr_type type, u32 flags, blk_status_t *blk_stat);
 	int (*pr_release)(struct block_device *bdev, u64 key,
-			enum pr_type type);
+			enum pr_type type, blk_status_t *blk_stat);
 	int (*pr_preempt)(struct block_device *bdev, u64 old_key, u64 new_key,
-			enum pr_type type, bool abort);
-	int (*pr_clear)(struct block_device *bdev, u64 key);
+			enum pr_type type, bool abort, blk_status_t *blk_stat);
+	int (*pr_clear)(struct block_device *bdev, u64 key,
+			blk_status_t *blk_stat);
 	/*
 	 * pr_read_keys - Read the registered keys and return them in the
 	 * pr_keys->keys array. The keys array will have been allocated at the
@@ -35,9 +36,11 @@ struct pr_ops {
 	 * contains, so the caller can retry with a larger array.
 	 */
 	int (*pr_read_keys)(struct block_device *bdev,
-			struct pr_keys *keys_info, u32 keys_len);
+			struct pr_keys *keys_info, u32 keys_len,
+			blk_status_t *blk_stat);
 	int (*pr_read_reservation)(struct block_device *bdev,
-			struct pr_held_reservation *rsv);
+			struct pr_held_reservation *rsv,
+			blk_status_t *blk_stat);
 };
 
 #endif /* LINUX_PR_H */
-- 
2.18.2

--
dm-devel mailing list
dm-devel@redhat.com
https://listman.redhat.com/mailman/listinfo/dm-devel


^ permalink raw reply related	[flat|nested] 94+ messages in thread

* [PATCH v2 13/20] nvme: Have nvme pr_ops return a blk_status_t
  2022-08-09  0:03 ` [dm-devel] " Mike Christie
@ 2022-08-09  0:04   ` Mike Christie
  -1 siblings, 0 replies; 94+ messages in thread
From: Mike Christie @ 2022-08-09  0:04 UTC (permalink / raw)
  To: bvanassche, linux-block, dm-devel, snitzer, axboe, hch,
	linux-nvme, martin.petersen, linux-scsi, james.bottomley
  Cc: Mike Christie

This patch has the nvme pr_ops convert the NVMe status value to a
blk_status_t, which is returned through the new blk_stat argument.

Signed-off-by: Mike Christie <michael.christie@oracle.com>
---
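A minimal userspace sketch of the convention used by the helpers in this
patch, not the kernel code itself: the return value keeps its meaning
(negative errno, zero, or positive device status), and the optional blk_stat
is only written when the command reached the device (ret >= 0). The names
sketch_blk_status, sketch_error_status() and sketch_send_pr_command(), and
the status values, are hypothetical stand-ins for blk_status_t,
nvme_error_status() and the nvme send helpers.

```c
#include <stddef.h>

/* Hypothetical stand-in for blk_status_t; values are illustrative only. */
enum sketch_blk_status {
	SKETCH_STS_OK = 0,
	SKETCH_STS_IOERR,
	SKETCH_STS_NEXUS,	/* used here for a reservation conflict */
};

/* Hypothetical device status code standing in for a reservation conflict. */
#define SKETCH_SC_RESV_CONFLICT 0x83

/* Stand-in for nvme_error_status(): map a device status to a block status. */
static enum sketch_blk_status sketch_error_status(int status)
{
	switch (status) {
	case 0:
		return SKETCH_STS_OK;
	case SKETCH_SC_RESV_CONFLICT:
		return SKETCH_STS_NEXUS;
	default:
		return SKETCH_STS_IOERR;
	}
}

/*
 * Stand-in for the send helpers. submit_ret plays the role of the value
 * returned by the submit call: < 0 is an errno, 0 is success, > 0 is a
 * device status. blk_stat may be NULL and is only written when ret >= 0.
 */
static int sketch_send_pr_command(int submit_ret,
				  enum sketch_blk_status *blk_stat)
{
	int ret = submit_ret;

	if (blk_stat && ret >= 0)
		*blk_stat = sketch_error_status(ret);
	return ret;
}
```

Because blk_stat is left untouched on a negative return, a caller should
either initialize it before the call or only read it when the return value
is non-negative.
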
 drivers/nvme/host/core.c | 54 ++++++++++++++++++++++++++--------------
 1 file changed, 36 insertions(+), 18 deletions(-)

diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
index 49bd745d28e2..46188b3d9df8 100644
--- a/drivers/nvme/host/core.c
+++ b/drivers/nvme/host/core.c
@@ -2105,7 +2105,8 @@ static char nvme_pr_type(enum pr_type type)
 }
 
 static int nvme_send_ns_head_pr_command(struct block_device *bdev,
-		struct nvme_command *c, u8 *data, unsigned int data_len)
+		struct nvme_command *c, u8 *data, unsigned int data_len,
+		blk_status_t *blk_stat)
 {
 	struct nvme_ns_head *head = bdev->bd_disk->private_data;
 	int srcu_idx = srcu_read_lock(&head->srcu);
@@ -2115,20 +2116,28 @@ static int nvme_send_ns_head_pr_command(struct block_device *bdev,
 	if (ns) {
 		c->common.nsid = cpu_to_le32(ns->head->ns_id);
 		ret = nvme_submit_sync_cmd(ns->queue, c, data, data_len);
+		if (blk_stat && ret >= 0)
+			*blk_stat = nvme_error_status(ret);
 	}
 	srcu_read_unlock(&head->srcu, srcu_idx);
 	return ret;
 }
 	
 static int nvme_send_ns_pr_command(struct nvme_ns *ns, struct nvme_command *c,
-		u8 *data, unsigned int data_len)
+		u8 *data, unsigned int data_len, blk_status_t *blk_stat)
 {
+	int ret;
+
 	c->common.nsid = cpu_to_le32(ns->head->ns_id);
-	return nvme_submit_sync_cmd(ns->queue, c, data, data_len);
+	ret = nvme_submit_sync_cmd(ns->queue, c, data, data_len);
+	if (blk_stat && ret >= 0)
+		*blk_stat = nvme_error_status(ret);
+	return ret;
 }
 
 static int nvme_pr_command(struct block_device *bdev, u32 cdw10,
-				u64 key, u64 sa_key, u8 op)
+				u64 key, u64 sa_key, u8 op,
+				blk_status_t *blk_stat)
 {
 	struct nvme_command c = { };
 	u8 data[16] = { 0, };
@@ -2142,9 +2151,9 @@ static int nvme_pr_command(struct block_device *bdev, u32 cdw10,
 	if (IS_ENABLED(CONFIG_NVME_MULTIPATH) &&
 	    bdev->bd_disk->fops == &nvme_ns_head_ops)
 		return nvme_send_ns_head_pr_command(bdev, &c, data,
-						    sizeof(data));
-	return nvme_send_ns_pr_command(bdev->bd_disk->private_data, &c, data,
-				       sizeof(data));
+						    sizeof(data), blk_stat);
+	return nvme_send_ns_pr_command(bdev->bd_disk->private_data, &c,
+				       data, sizeof(data), blk_stat);
 }
 
 static int nvme_pr_register(struct block_device *bdev, u64 old,
@@ -2158,7 +2167,8 @@ static int nvme_pr_register(struct block_device *bdev, u64 old,
 	cdw10 = old ? 2 : 0;
 	cdw10 |= (flags & PR_FL_IGNORE_KEY) ? 1 << 3 : 0;
 	cdw10 |= (1 << 30) | (1 << 31); /* PTPL=1 */
-	return nvme_pr_command(bdev, cdw10, old, new, nvme_cmd_resv_register);
+	return nvme_pr_command(bdev, cdw10, old, new, nvme_cmd_resv_register,
+			       blk_stat);
 }
 
 static int nvme_pr_reserve(struct block_device *bdev, u64 key,
@@ -2171,7 +2181,8 @@ static int nvme_pr_reserve(struct block_device *bdev, u64 key,
 
 	cdw10 = nvme_pr_type(type) << 8;
 	cdw10 |= ((flags & PR_FL_IGNORE_KEY) ? 1 << 3 : 0);
-	return nvme_pr_command(bdev, cdw10, key, 0, nvme_cmd_resv_acquire);
+	return nvme_pr_command(bdev, cdw10, key, 0, nvme_cmd_resv_acquire,
+			       blk_stat);
 }
 
 static int nvme_pr_preempt(struct block_device *bdev, u64 old, u64 new,
@@ -2179,7 +2190,8 @@ static int nvme_pr_preempt(struct block_device *bdev, u64 old, u64 new,
 {
 	u32 cdw10 = nvme_pr_type(type) << 8 | (abort ? 2 : 1);
 
-	return nvme_pr_command(bdev, cdw10, old, new, nvme_cmd_resv_acquire);
+	return nvme_pr_command(bdev, cdw10, old, new, nvme_cmd_resv_acquire,
+			       blk_stat);
 }
 
 static int nvme_pr_clear(struct block_device *bdev, u64 key,
@@ -2187,7 +2199,8 @@ static int nvme_pr_clear(struct block_device *bdev, u64 key,
 {
 	u32 cdw10 = 1 | (key ? 1 << 3 : 0);
 
-	return nvme_pr_command(bdev, cdw10, key, 0, nvme_cmd_resv_register);
+	return nvme_pr_command(bdev, cdw10, key, 0, nvme_cmd_resv_register,
+			       blk_stat);
 }
 
 static int nvme_pr_release(struct block_device *bdev, u64 key, enum pr_type type,
@@ -2195,11 +2208,12 @@ static int nvme_pr_release(struct block_device *bdev, u64 key, enum pr_type type
 {
 	u32 cdw10 = nvme_pr_type(type) << 8 | (key ? 1 << 3 : 0);
 
-	return nvme_pr_command(bdev, cdw10, key, 0, nvme_cmd_resv_release);
+	return nvme_pr_command(bdev, cdw10, key, 0, nvme_cmd_resv_release,
+			       blk_stat);
 }
 
 static int nvme_pr_resv_report(struct block_device *bdev, u8 *data,
-		u32 data_len, bool *eds)
+		u32 data_len, bool *eds, blk_status_t *blk_stat)
 {
 	struct nvme_command c = { };
 	int ret;
@@ -2210,12 +2224,16 @@ static int nvme_pr_resv_report(struct block_device *bdev, u8 *data,
 	*eds = true;
 
 retry:
+	if (blk_stat)
+		*blk_stat = 0;
+
 	if (IS_ENABLED(CONFIG_NVME_MULTIPATH) &&
 	    bdev->bd_disk->fops == &nvme_ns_head_ops)
-		ret = nvme_send_ns_head_pr_command(bdev, &c, data, data_len);
+		ret = nvme_send_ns_head_pr_command(bdev, &c, data, data_len,
+						   blk_stat);
 	else
 		ret = nvme_send_ns_pr_command(bdev->bd_disk->private_data, &c,
-					      data, data_len);
+					      data, data_len, blk_stat);
 	if (ret == NVME_SC_HOST_ID_INCONSIST && c.common.cdw11) {
 		c.common.cdw11 = 0;
 		*eds = false;
@@ -2245,7 +2263,7 @@ static int nvme_pr_read_keys(struct block_device *bdev,
 	if (!data)
 		return -ENOMEM;
 
-	ret = nvme_pr_resv_report(bdev, data, data_len, &eds);
+	ret = nvme_pr_resv_report(bdev, data, data_len, &eds, blk_stat);
 	if (ret)
 		goto free_data;
 
@@ -2286,7 +2304,7 @@ static int nvme_pr_read_reservation(struct block_device *bdev,
 	 * the response buffer.
 	 */
 	ret = nvme_pr_resv_report(bdev, (u8 *)&tmp_status, sizeof(tmp_status),
-				  &eds);
+				  &eds, blk_stat);
 	if (ret)
 		return 0;
 
@@ -2302,7 +2320,7 @@ static int nvme_pr_read_reservation(struct block_device *bdev,
 	if (!data)
 		return -ENOMEM;
 
-	ret = nvme_pr_resv_report(bdev, data, data_len, &eds);
+	ret = nvme_pr_resv_report(bdev, data, data_len, &eds, blk_stat);
 	if (ret)
 		goto free_data;
 	status = (struct nvme_reservation_status *)data;
-- 
2.18.2


^ permalink raw reply related	[flat|nested] 94+ messages in thread

* [PATCH v2 14/20] scsi: Retry pr_ops commands if a UA is returned.
  2022-08-09  0:03 ` [dm-devel] " Mike Christie
@ 2022-08-09  0:04   ` Mike Christie
  -1 siblings, 0 replies; 94+ messages in thread
From: Mike Christie @ 2022-08-09  0:04 UTC (permalink / raw)
  To: bvanassche, linux-block, dm-devel, snitzer, axboe, hch,
	linux-nvme, martin.petersen, linux-scsi, james.bottomley
  Cc: Mike Christie

It's common to get a unit attention (UA) when doing PR commands. It can be
caused by a target restart, a transport-level relogin, or another PR command
such as a release. The upper layers don't get the sense data and in some
cases don't know whether the device is SCSI, so this has the sd layer retry
the command up to SCSI_PR_UA_RETRIES (5) times.

Signed-off-by: Mike Christie <michael.christie@oracle.com>
---
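The retry logic can be sketched in userspace as a bounded loop. This is only
an illustration under stated assumptions: struct sketch_result, the
sketch_issue_fn callback and the sketch_dev model are hypothetical and stand
in for the scsi_execute_req() result, the sense header checks and a real
device. Only the retry count mirrors SCSI_PR_UA_RETRIES.

```c
#include <stdbool.h>

#define SKETCH_PR_UA_RETRIES 5	/* mirrors SCSI_PR_UA_RETRIES in the patch */

/* Hypothetical result of one command attempt. */
struct sketch_result {
	bool check_condition;	/* command ended with CHECK CONDITION */
	bool sense_valid;	/* sense data could be decoded */
	bool unit_attention;	/* sense key was UNIT ATTENTION */
};

/* Hypothetical callback that issues the command once. */
typedef struct sketch_result (*sketch_issue_fn)(void *ctx);

/*
 * Issue the command, reissuing it while it fails with a unit attention
 * and retries remain. Returns the result of the last attempt, so the
 * command is sent at most 1 + SKETCH_PR_UA_RETRIES times.
 */
static struct sketch_result sketch_pr_command(sketch_issue_fn issue, void *ctx)
{
	int ua_retries = SKETCH_PR_UA_RETRIES;
	struct sketch_result res;

	for (;;) {
		res = issue(ctx);
		if (res.check_condition && res.sense_valid &&
		    res.unit_attention && ua_retries-- > 0)
			continue;
		return res;
	}
}

/* Example device model: reports a unit attention for the first N attempts. */
struct sketch_dev {
	int ua_pending;	/* unit attentions still queued */
	int attempts;	/* commands seen so far */
};

static struct sketch_result sketch_dev_issue(void *ctx)
{
	struct sketch_dev *dev = ctx;
	struct sketch_result res = { false, false, false };

	dev->attempts++;
	if (dev->ua_pending > 0) {
		dev->ua_pending--;
		res.check_condition = true;
		res.sense_valid = true;
		res.unit_attention = true;
	}
	return res;
}
```

With a device that has two unit attentions queued, the third attempt goes
through. A device that keeps returning a UA is given up on after the sixth
attempt, and the last failure is what the caller sees.
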
 drivers/scsi/sd.c | 14 ++++++++++++--
 1 file changed, 12 insertions(+), 2 deletions(-)

diff --git a/drivers/scsi/sd.c b/drivers/scsi/sd.c
index bf080de9866d..61e88c7ffa44 100644
--- a/drivers/scsi/sd.c
+++ b/drivers/scsi/sd.c
@@ -1683,6 +1683,8 @@ static int sd_get_unique_id(struct gendisk *disk, u8 id[16],
 	return ret;
 }
 
+#define SCSI_PR_UA_RETRIES 5
+
 static int sd_pr_in_command(struct block_device *bdev, u8 sa,
 			    unsigned char *data, int data_len)
 {
@@ -1690,8 +1692,9 @@ static int sd_pr_in_command(struct block_device *bdev, u8 sa,
 	struct scsi_device *sdev = sdkp->device;
 	struct scsi_sense_hdr sshdr;
 	u8 cmd[10] = { 0, };
-	int result;
+	int result, ua_retries = SCSI_PR_UA_RETRIES;
 
+retry:
 	cmd[0] = PERSISTENT_RESERVE_IN;
 	cmd[1] = sa;
 	put_unaligned_be16(data_len, &cmd[7]);
@@ -1700,6 +1703,9 @@ static int sd_pr_in_command(struct block_device *bdev, u8 sa,
 				  &sshdr, SD_TIMEOUT, sdkp->max_retries, NULL);
 	if (scsi_status_is_check_condition(result) &&
 	    scsi_sense_valid(&sshdr)) {
+		if (sshdr.sense_key == UNIT_ATTENTION && ua_retries-- > 0)
+			goto retry;
+
 		sdev_printk(KERN_INFO, sdev, "PR command failed: %d\n", result);
 		scsi_print_sense_hdr(sdev, NULL, &sshdr);
 	}
@@ -1776,10 +1782,11 @@ static int sd_pr_out_command(struct block_device *bdev, u8 sa, u64 key,
 	struct scsi_disk *sdkp = scsi_disk(bdev->bd_disk);
 	struct scsi_device *sdev = sdkp->device;
 	struct scsi_sense_hdr sshdr;
-	int result;
+	int result, ua_retries = SCSI_PR_UA_RETRIES;
 	u8 cmd[16] = { 0, };
 	u8 data[24] = { 0, };
 
+retry:
 	cmd[0] = PERSISTENT_RESERVE_OUT;
 	cmd[1] = sa;
 	cmd[2] = type;
@@ -1794,6 +1801,9 @@ static int sd_pr_out_command(struct block_device *bdev, u8 sa, u64 key,
 
 	if (scsi_status_is_check_condition(result) &&
 	    scsi_sense_valid(&sshdr)) {
+		if (sshdr.sense_key == UNIT_ATTENTION && ua_retries-- > 0)
+			goto retry;
+
 		sdev_printk(KERN_INFO, sdev, "PR command failed: %d\n", result);
 		scsi_print_sense_hdr(sdev, NULL, &sshdr);
 	}
-- 
2.18.2


^ permalink raw reply related	[flat|nested] 94+ messages in thread

* [PATCH v2 15/20] scsi: Export scsi_result_to_blk_status.
  2022-08-09  0:03 ` [dm-devel] " Mike Christie
@ 2022-08-09  0:04   ` Mike Christie
  -1 siblings, 0 replies; 94+ messages in thread
From: Mike Christie @ 2022-08-09  0:04 UTC (permalink / raw)
  To: bvanassche, linux-block, dm-devel, snitzer, axboe, hch,
	linux-nvme, martin.petersen, linux-scsi, james.bottomley
  Cc: Mike Christie

Export scsi_result_to_blk_status() so the sd pr_ops can get a BLK_STS error
that can be returned to other in-kernel pr_ops users.

Signed-off-by: Mike Christie <michael.christie@oracle.com>
---
 drivers/scsi/scsi_lib.c  | 3 ++-
 include/scsi/scsi_cmnd.h | 1 +
 2 files changed, 3 insertions(+), 1 deletion(-)

diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
index a2a3a9bd5ba1..d7825ff8915d 100644
--- a/drivers/scsi/scsi_lib.c
+++ b/drivers/scsi/scsi_lib.c
@@ -587,7 +587,7 @@ static inline u8 get_scsi_ml_byte(int result)
  *
  * Translate a SCSI result code into a blk_status_t value.
  */
-static blk_status_t scsi_result_to_blk_status(int result)
+blk_status_t scsi_result_to_blk_status(int result)
 {
 	/*
 	 * Check the scsi-ml byte first in case we converted a host or status
@@ -618,6 +618,7 @@ static blk_status_t scsi_result_to_blk_status(int result)
 		return BLK_STS_IOERR;
 	}
 }
+EXPORT_SYMBOL_GPL(scsi_result_to_blk_status);
 
 /**
  * scsi_rq_err_bytes - determine number of bytes till the next failure boundary
diff --git a/include/scsi/scsi_cmnd.h b/include/scsi/scsi_cmnd.h
index bac55decf900..c4de69ba859f 100644
--- a/include/scsi/scsi_cmnd.h
+++ b/include/scsi/scsi_cmnd.h
@@ -155,6 +155,7 @@ static inline void *scsi_cmd_priv(struct scsi_cmnd *cmd)
 void scsi_done(struct scsi_cmnd *cmd);
 void scsi_done_direct(struct scsi_cmnd *cmd);
 
+blk_status_t scsi_result_to_blk_status(int result);
 extern void scsi_finish_command(struct scsi_cmnd *cmd);
 
 extern void *scsi_kmap_atomic_sg(struct scatterlist *sg, int sg_count,
-- 
2.18.2


^ permalink raw reply related	[flat|nested] 94+ messages in thread

* [PATCH v2 16/20] scsi: Have sd pr_ops return a blk_status_t
  2022-08-09  0:03 ` [dm-devel] " Mike Christie
@ 2022-08-09  0:04   ` Mike Christie
  -1 siblings, 0 replies; 94+ messages in thread
From: Mike Christie @ 2022-08-09  0:04 UTC (permalink / raw)
  To: bvanassche, linux-block, dm-devel, snitzer, axboe, hch,
	linux-nvme, martin.petersen, linux-scsi, james.bottomley
  Cc: Mike Christie

This patch has the sd pr_ops convert the low-level SCSI errors to a
blk_status_t, which is returned through the blk_stat argument.

Signed-off-by: Mike Christie <michael.christie@oracle.com>
---
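A hedged sketch of how an in-kernel pr_ops caller might consume the two
values: the int return still says whether the command failed, and blk_stat
says why when the failure came from the device. Everything named sketch_* is
hypothetical, and the sketch assumes, for illustration only, that a
reservation conflict is reported as a nexus-type block status.

```c
/* Hypothetical stand-in for blk_status_t; only the values used here. */
enum sketch_blk_status {
	SKETCH_STS_OK = 0,
	SKETCH_STS_IOERR,
	SKETCH_STS_NEXUS,
};

/* What a hypothetical pr_ops caller wants to tell its own client. */
enum sketch_outcome {
	SKETCH_PR_DONE,		/* command succeeded */
	SKETCH_PR_CONFLICT,	/* report a reservation conflict */
	SKETCH_PR_FAILED,	/* report a generic failure */
};

/*
 * ret is the value returned by the pr_ops callout and blk_stat is the
 * value it stored. blk_stat is only meaningful when ret >= 0.
 */
static enum sketch_outcome sketch_classify(int ret,
					   enum sketch_blk_status blk_stat)
{
	if (ret < 0)
		return SKETCH_PR_FAILED;	/* errno; blk_stat not written */
	if (ret == 0)
		return SKETCH_PR_DONE;
	if (blk_stat == SKETCH_STS_NEXUS)
		return SKETCH_PR_CONFLICT;
	return SKETCH_PR_FAILED;
}
```

The point of the extra argument is that the caller no longer needs to know
whether the positive return value is a SCSI result or some other driver's
status code.
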
 drivers/scsi/sd.c | 27 ++++++++++++++++++---------
 1 file changed, 18 insertions(+), 9 deletions(-)

diff --git a/drivers/scsi/sd.c b/drivers/scsi/sd.c
index 61e88c7ffa44..31b4eafadc44 100644
--- a/drivers/scsi/sd.c
+++ b/drivers/scsi/sd.c
@@ -1686,7 +1686,8 @@ static int sd_get_unique_id(struct gendisk *disk, u8 id[16],
 #define SCSI_PR_UA_RETRIES 5
 
 static int sd_pr_in_command(struct block_device *bdev, u8 sa,
-			    unsigned char *data, int data_len)
+			    unsigned char *data, int data_len,
+			    blk_status_t *blk_stat)
 {
 	struct scsi_disk *sdkp = scsi_disk(bdev->bd_disk);
 	struct scsi_device *sdev = sdkp->device;
@@ -1710,6 +1711,9 @@ static int sd_pr_in_command(struct block_device *bdev, u8 sa,
 		scsi_print_sense_hdr(sdev, NULL, &sshdr);
 	}
 
+	if (blk_stat && result >= 0)
+		*blk_stat = scsi_result_to_blk_status(result);
+
 	return result;
 }
 
@@ -1724,7 +1728,7 @@ static int sd_pr_read_keys(struct block_device *bdev, struct pr_keys *keys_info,
 	if (!data)
 		return -ENOMEM;
 
-	result = sd_pr_in_command(bdev, READ_KEYS, data, data_len);
+	result = sd_pr_in_command(bdev, READ_KEYS, data, data_len, blk_stat);
 	if (result)
 		goto free_data;
 
@@ -1753,7 +1757,8 @@ static int sd_pr_read_reservation(struct block_device *bdev,
 	u8 data[24] = { 0, };
 	int result, len;
 
-	result = sd_pr_in_command(bdev, READ_RESERVATION, data, sizeof(data));
+	result = sd_pr_in_command(bdev, READ_RESERVATION, data, sizeof(data),
+				  blk_stat);
 	if (result)
 		return result;
 
@@ -1777,7 +1782,7 @@ static int sd_pr_read_reservation(struct block_device *bdev,
 }
 
 static int sd_pr_out_command(struct block_device *bdev, u8 sa, u64 key,
-		u64 sa_key, u8 type, u8 flags)
+		u64 sa_key, u8 type, u8 flags, blk_status_t *blk_stat)
 {
 	struct scsi_disk *sdkp = scsi_disk(bdev->bd_disk);
 	struct scsi_device *sdev = sdkp->device;
@@ -1808,6 +1813,9 @@ static int sd_pr_out_command(struct block_device *bdev, u8 sa, u64 key,
 		scsi_print_sense_hdr(sdev, NULL, &sshdr);
 	}
 
+	if (blk_stat && result >= 0)
+		*blk_stat = scsi_result_to_blk_status(result);
+
 	return result;
 }
 
@@ -1818,7 +1826,8 @@ static int sd_pr_register(struct block_device *bdev, u64 old_key, u64 new_key,
 		return -EOPNOTSUPP;
 	return sd_pr_out_command(bdev, (flags & PR_FL_IGNORE_KEY) ? 0x06 : 0x00,
 			old_key, new_key, 0,
-			(1 << 0) /* APTPL */);
+			(1 << 0) /* APTPL */,
+			blk_stat);
 }
 
 static int sd_pr_reserve(struct block_device *bdev, u64 key, enum pr_type type,
@@ -1827,27 +1836,27 @@ static int sd_pr_reserve(struct block_device *bdev, u64 key, enum pr_type type,
 	if (flags)
 		return -EOPNOTSUPP;
 	return sd_pr_out_command(bdev, 0x01, key, 0,
-				 block_pr_type_to_scsi(type), 0);
+				 block_pr_type_to_scsi(type), 0, blk_stat);
 }
 
 static int sd_pr_release(struct block_device *bdev, u64 key, enum pr_type type,
 		blk_status_t *blk_stat)
 {
 	return sd_pr_out_command(bdev, 0x02, key, 0,
-				 block_pr_type_to_scsi(type), 0);
+				 block_pr_type_to_scsi(type), 0, blk_stat);
 }
 
 static int sd_pr_preempt(struct block_device *bdev, u64 old_key, u64 new_key,
 		enum pr_type type, bool abort, blk_status_t *blk_stat)
 {
 	return sd_pr_out_command(bdev, abort ? 0x05 : 0x04, old_key, new_key,
-				 block_pr_type_to_scsi(type), 0);
+				 block_pr_type_to_scsi(type), 0, blk_stat);
 }
 
 static int sd_pr_clear(struct block_device *bdev, u64 key,
 		blk_status_t *blk_stat)
 {
-	return sd_pr_out_command(bdev, 0x03, key, 0, 0, 0);
+	return sd_pr_out_command(bdev, 0x03, key, 0, 0, 0, blk_stat);
 }
 
 static const struct pr_ops sd_pr_ops = {
-- 
2.18.2


^ permalink raw reply related	[flat|nested] 94+ messages in thread

* [dm-devel] [PATCH v2 16/20] scsi: Have sd pr_ops return a blk_status_t
@ 2022-08-09  0:04   ` Mike Christie
  0 siblings, 0 replies; 94+ messages in thread
From: Mike Christie @ 2022-08-09  0:04 UTC (permalink / raw)
  To: bvanassche, linux-block, dm-devel, snitzer, axboe, hch,
	linux-nvme, martin.petersen, linux-scsi, james.bottomley
  Cc: Mike Christie

This patch has the sd pr_ops convert from the low level SCSI errors to a
blk_status_t.

Signed-off-by: Mike Christie <michael.christie@oracle.com>
---
 drivers/scsi/sd.c | 27 ++++++++++++++++++---------
 1 file changed, 18 insertions(+), 9 deletions(-)

diff --git a/drivers/scsi/sd.c b/drivers/scsi/sd.c
index 61e88c7ffa44..31b4eafadc44 100644
--- a/drivers/scsi/sd.c
+++ b/drivers/scsi/sd.c
@@ -1686,7 +1686,8 @@ static int sd_get_unique_id(struct gendisk *disk, u8 id[16],
 #define SCSI_PR_UA_RETRIES 5
 
 static int sd_pr_in_command(struct block_device *bdev, u8 sa,
-			    unsigned char *data, int data_len)
+			    unsigned char *data, int data_len,
+			    blk_status_t *blk_stat)
 {
 	struct scsi_disk *sdkp = scsi_disk(bdev->bd_disk);
 	struct scsi_device *sdev = sdkp->device;
@@ -1710,6 +1711,9 @@ static int sd_pr_in_command(struct block_device *bdev, u8 sa,
 		scsi_print_sense_hdr(sdev, NULL, &sshdr);
 	}
 
+	if (blk_stat && result >= 0)
+		*blk_stat = scsi_result_to_blk_status(result);
+
 	return result;
 }
 
@@ -1724,7 +1728,7 @@ static int sd_pr_read_keys(struct block_device *bdev, struct pr_keys *keys_info,
 	if (!data)
 		return -ENOMEM;
 
-	result = sd_pr_in_command(bdev, READ_KEYS, data, data_len);
+	result = sd_pr_in_command(bdev, READ_KEYS, data, data_len, blk_stat);
 	if (result)
 		goto free_data;
 
@@ -1753,7 +1757,8 @@ static int sd_pr_read_reservation(struct block_device *bdev,
 	u8 data[24] = { 0, };
 	int result, len;
 
-	result = sd_pr_in_command(bdev, READ_RESERVATION, data, sizeof(data));
+	result = sd_pr_in_command(bdev, READ_RESERVATION, data, sizeof(data),
+				  blk_stat);
 	if (result)
 		return result;
 
@@ -1777,7 +1782,7 @@ static int sd_pr_read_reservation(struct block_device *bdev,
 }
 
 static int sd_pr_out_command(struct block_device *bdev, u8 sa, u64 key,
-		u64 sa_key, u8 type, u8 flags)
+		u64 sa_key, u8 type, u8 flags, blk_status_t *blk_stat)
 {
 	struct scsi_disk *sdkp = scsi_disk(bdev->bd_disk);
 	struct scsi_device *sdev = sdkp->device;
@@ -1808,6 +1813,9 @@ static int sd_pr_out_command(struct block_device *bdev, u8 sa, u64 key,
 		scsi_print_sense_hdr(sdev, NULL, &sshdr);
 	}
 
+	if (blk_stat && result >= 0)
+		*blk_stat = scsi_result_to_blk_status(result);
+
 	return result;
 }
 
@@ -1818,7 +1826,8 @@ static int sd_pr_register(struct block_device *bdev, u64 old_key, u64 new_key,
 		return -EOPNOTSUPP;
 	return sd_pr_out_command(bdev, (flags & PR_FL_IGNORE_KEY) ? 0x06 : 0x00,
 			old_key, new_key, 0,
-			(1 << 0) /* APTPL */);
+			(1 << 0) /* APTPL */,
+			blk_stat);
 }
 
 static int sd_pr_reserve(struct block_device *bdev, u64 key, enum pr_type type,
@@ -1827,27 +1836,27 @@ static int sd_pr_reserve(struct block_device *bdev, u64 key, enum pr_type type,
 	if (flags)
 		return -EOPNOTSUPP;
 	return sd_pr_out_command(bdev, 0x01, key, 0,
-				 block_pr_type_to_scsi(type), 0);
+				 block_pr_type_to_scsi(type), 0, blk_stat);
 }
 
 static int sd_pr_release(struct block_device *bdev, u64 key, enum pr_type type,
 		blk_status_t *blk_stat)
 {
 	return sd_pr_out_command(bdev, 0x02, key, 0,
-				 block_pr_type_to_scsi(type), 0);
+				 block_pr_type_to_scsi(type), 0, blk_stat);
 }
 
 static int sd_pr_preempt(struct block_device *bdev, u64 old_key, u64 new_key,
 		enum pr_type type, bool abort, blk_status_t *blk_stat)
 {
 	return sd_pr_out_command(bdev, abort ? 0x05 : 0x04, old_key, new_key,
-				 block_pr_type_to_scsi(type), 0);
+				 block_pr_type_to_scsi(type), 0, blk_stat);
 }
 
 static int sd_pr_clear(struct block_device *bdev, u64 key,
 		blk_status_t *blk_stat)
 {
-	return sd_pr_out_command(bdev, 0x03, key, 0, 0, 0);
+	return sd_pr_out_command(bdev, 0x03, key, 0, 0, 0, blk_stat);
 }
 
 static const struct pr_ops sd_pr_ops = {
-- 
2.18.2

--
dm-devel mailing list
dm-devel@redhat.com
https://listman.redhat.com/mailman/listinfo/dm-devel


^ permalink raw reply related	[flat|nested] 94+ messages in thread
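
The out-parameter pattern in the sd change above can be sketched in plain
userspace C. This is only a minimal sketch under stated assumptions, not the
kernel code: every sketch_* name is hypothetical, and the mapping function is
a simplified stand-in keyed on the SAM status byte. The kernel's
scsi_result_to_blk_status() may map results differently.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for a few blk_status_t values. */
enum sketch_blk_status {
	SKETCH_STS_OK,
	SKETCH_STS_NEXUS,
	SKETCH_STS_IOERR,
};

/* SAM status byte values. */
#define SKETCH_SAM_GOOD			0x00
#define SKETCH_SAM_RESERVATION_CONFLICT	0x18

/* Simplified mapping that looks at the status byte only (assumption). */
enum sketch_blk_status sketch_result_to_status(int result)
{
	/* Only non-negative SCSI results are converted, never errnos. */
	assert(result >= 0);

	switch (result & 0xff) {
	case SKETCH_SAM_GOOD:
		return SKETCH_STS_OK;
	case SKETCH_SAM_RESERVATION_CONFLICT:
		return SKETCH_STS_NEXUS;
	default:
		return SKETCH_STS_IOERR;
	}
}

/*
 * Mirrors the lines added at the end of sd_pr_in_command() and
 * sd_pr_out_command(): a negative result leaves *blk_stat untouched,
 * anything else is converted when the caller passed a pointer.
 */
int sketch_finish_pr_command(int result, enum sketch_blk_status *blk_stat)
{
	if (blk_stat && result >= 0)
		*blk_stat = sketch_result_to_status(result);
	return result;
}

/* Helper: report what a caller's status variable holds afterwards. */
enum sketch_blk_status sketch_status_after(int result,
					   enum sketch_blk_status initial)
{
	enum sketch_blk_status st = initial;

	sketch_finish_pr_command(result, &st);
	return st;
}
```

Keeping the int return value unchanged means existing callers that only check
for non-zero keep working, while callers that need a block layer status can
opt in by passing a pointer.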

* [PATCH v2 17/20] scsi: target: Rename sbc_ops to exec_cmd_ops
  2022-08-09  0:03 ` [dm-devel] " Mike Christie
@ 2022-08-09  0:04   ` Mike Christie
  -1 siblings, 0 replies; 94+ messages in thread
From: Mike Christie @ 2022-08-09  0:04 UTC (permalink / raw)
  To: bvanassche, linux-block, dm-devel, snitzer, axboe, hch,
	linux-nvme, martin.petersen, linux-scsi, james.bottomley
  Cc: Mike Christie

The next patches allow us to call the block layer's pr_ops from the
backends. This will require allowing the backends to hook into the cmd
processing for SPC commands, so this renames sbc_ops to a more generic
exec_cmd_ops.

Signed-off-by: Mike Christie <michael.christie@oracle.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
---
 drivers/target/target_core_file.c    |  4 ++--
 drivers/target/target_core_iblock.c  |  4 ++--
 drivers/target/target_core_rd.c      |  4 ++--
 drivers/target/target_core_sbc.c     | 13 +++++++------
 include/target/target_core_backend.h |  4 ++--
 5 files changed, 15 insertions(+), 14 deletions(-)

diff --git a/drivers/target/target_core_file.c b/drivers/target/target_core_file.c
index 28aa643be5d5..8336c0e0b1db 100644
--- a/drivers/target/target_core_file.c
+++ b/drivers/target/target_core_file.c
@@ -903,7 +903,7 @@ static void fd_free_prot(struct se_device *dev)
 	fd_dev->fd_prot_file = NULL;
 }
 
-static struct sbc_ops fd_sbc_ops = {
+static struct exec_cmd_ops fd_exec_cmd_ops = {
 	.execute_rw		= fd_execute_rw,
 	.execute_sync_cache	= fd_execute_sync_cache,
 	.execute_write_same	= fd_execute_write_same,
@@ -913,7 +913,7 @@ static struct sbc_ops fd_sbc_ops = {
 static sense_reason_t
 fd_parse_cdb(struct se_cmd *cmd)
 {
-	return sbc_parse_cdb(cmd, &fd_sbc_ops);
+	return sbc_parse_cdb(cmd, &fd_exec_cmd_ops);
 }
 
 static const struct target_backend_ops fileio_ops = {
diff --git a/drivers/target/target_core_iblock.c b/drivers/target/target_core_iblock.c
index 8351c974cee3..5db7318b4822 100644
--- a/drivers/target/target_core_iblock.c
+++ b/drivers/target/target_core_iblock.c
@@ -878,7 +878,7 @@ static unsigned int iblock_get_io_opt(struct se_device *dev)
 	return bdev_io_opt(bd);
 }
 
-static struct sbc_ops iblock_sbc_ops = {
+static struct exec_cmd_ops iblock_exec_cmd_ops = {
 	.execute_rw		= iblock_execute_rw,
 	.execute_sync_cache	= iblock_execute_sync_cache,
 	.execute_write_same	= iblock_execute_write_same,
@@ -888,7 +888,7 @@ static struct sbc_ops iblock_sbc_ops = {
 static sense_reason_t
 iblock_parse_cdb(struct se_cmd *cmd)
 {
-	return sbc_parse_cdb(cmd, &iblock_sbc_ops);
+	return sbc_parse_cdb(cmd, &iblock_exec_cmd_ops);
 }
 
 static bool iblock_get_write_cache(struct se_device *dev)
diff --git a/drivers/target/target_core_rd.c b/drivers/target/target_core_rd.c
index 6648c1c90e19..6f67cc09c2b5 100644
--- a/drivers/target/target_core_rd.c
+++ b/drivers/target/target_core_rd.c
@@ -643,14 +643,14 @@ static void rd_free_prot(struct se_device *dev)
 	rd_release_prot_space(rd_dev);
 }
 
-static struct sbc_ops rd_sbc_ops = {
+static struct exec_cmd_ops rd_exec_cmd_ops = {
 	.execute_rw		= rd_execute_rw,
 };
 
 static sense_reason_t
 rd_parse_cdb(struct se_cmd *cmd)
 {
-	return sbc_parse_cdb(cmd, &rd_sbc_ops);
+	return sbc_parse_cdb(cmd, &rd_exec_cmd_ops);
 }
 
 static const struct target_backend_ops rd_mcp_ops = {
diff --git a/drivers/target/target_core_sbc.c b/drivers/target/target_core_sbc.c
index 1e3216de1e04..74133efda529 100644
--- a/drivers/target/target_core_sbc.c
+++ b/drivers/target/target_core_sbc.c
@@ -192,7 +192,7 @@ EXPORT_SYMBOL(sbc_get_write_same_sectors);
 static sense_reason_t
 sbc_execute_write_same_unmap(struct se_cmd *cmd)
 {
-	struct sbc_ops *ops = cmd->protocol_data;
+	struct exec_cmd_ops *ops = cmd->protocol_data;
 	sector_t nolb = sbc_get_write_same_sectors(cmd);
 	sense_reason_t ret;
 
@@ -279,7 +279,8 @@ static inline unsigned long long transport_lba_64_ext(unsigned char *cdb)
 }
 
 static sense_reason_t
-sbc_setup_write_same(struct se_cmd *cmd, unsigned char flags, struct sbc_ops *ops)
+sbc_setup_write_same(struct se_cmd *cmd, unsigned char flags,
+		     struct exec_cmd_ops *ops)
 {
 	struct se_device *dev = cmd->se_dev;
 	sector_t end_lba = dev->transport->get_blocks(dev) + 1;
@@ -348,7 +349,7 @@ sbc_setup_write_same(struct se_cmd *cmd, unsigned char flags, struct sbc_ops *op
 static sense_reason_t
 sbc_execute_rw(struct se_cmd *cmd)
 {
-	struct sbc_ops *ops = cmd->protocol_data;
+	struct exec_cmd_ops *ops = cmd->protocol_data;
 
 	return ops->execute_rw(cmd, cmd->t_data_sg, cmd->t_data_nents,
 			       cmd->data_direction);
@@ -564,7 +565,7 @@ static sense_reason_t compare_and_write_callback(struct se_cmd *cmd, bool succes
 static sense_reason_t
 sbc_compare_and_write(struct se_cmd *cmd)
 {
-	struct sbc_ops *ops = cmd->protocol_data;
+	struct exec_cmd_ops *ops = cmd->protocol_data;
 	struct se_device *dev = cmd->se_dev;
 	sense_reason_t ret;
 	int rc;
@@ -762,7 +763,7 @@ sbc_check_dpofua(struct se_device *dev, struct se_cmd *cmd, unsigned char *cdb)
 }
 
 sense_reason_t
-sbc_parse_cdb(struct se_cmd *cmd, struct sbc_ops *ops)
+sbc_parse_cdb(struct se_cmd *cmd, struct exec_cmd_ops *ops)
 {
 	struct se_device *dev = cmd->se_dev;
 	unsigned char *cdb = cmd->t_task_cdb;
@@ -1074,7 +1075,7 @@ EXPORT_SYMBOL(sbc_get_device_type);
 static sense_reason_t
 sbc_execute_unmap(struct se_cmd *cmd)
 {
-	struct sbc_ops *ops = cmd->protocol_data;
+	struct exec_cmd_ops *ops = cmd->protocol_data;
 	struct se_device *dev = cmd->se_dev;
 	unsigned char *buf, *ptr = NULL;
 	sector_t lba;
diff --git a/include/target/target_core_backend.h b/include/target/target_core_backend.h
index a3c193df25b3..c5df78959532 100644
--- a/include/target/target_core_backend.h
+++ b/include/target/target_core_backend.h
@@ -62,7 +62,7 @@ struct target_backend_ops {
 	struct configfs_attribute **tb_dev_action_attrs;
 };
 
-struct sbc_ops {
+struct exec_cmd_ops {
 	sense_reason_t (*execute_rw)(struct se_cmd *cmd, struct scatterlist *,
 				     u32, enum dma_data_direction);
 	sense_reason_t (*execute_sync_cache)(struct se_cmd *cmd);
@@ -86,7 +86,7 @@ sense_reason_t	spc_emulate_report_luns(struct se_cmd *cmd);
 sense_reason_t	spc_emulate_inquiry_std(struct se_cmd *, unsigned char *);
 sense_reason_t	spc_emulate_evpd_83(struct se_cmd *, unsigned char *);
 
-sense_reason_t	sbc_parse_cdb(struct se_cmd *cmd, struct sbc_ops *ops);
+sense_reason_t	sbc_parse_cdb(struct se_cmd *cmd, struct exec_cmd_ops *ops);
 u32	sbc_get_device_rev(struct se_device *dev);
 u32	sbc_get_device_type(struct se_device *dev);
 sector_t	sbc_get_write_same_sectors(struct se_cmd *cmd);
-- 
2.18.2


^ permalink raw reply related	[flat|nested] 94+ messages in thread
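
The handlers touched by the rename above all recover the backend's table from
cmd->protocol_data. A minimal userspace sketch of that pattern follows. It
assumes the parse step is what stores the table in that slot, and every
sketch_* name is hypothetical rather than taken from the target code.

```c
#include <assert.h>
#include <stddef.h>

struct sketch_cmd;

/* One generically named table, so non-SBC callbacks can be added later. */
struct sketch_exec_cmd_ops {
	int (*execute_rw)(struct sketch_cmd *cmd);
	int (*execute_sync_cache)(struct sketch_cmd *cmd);	/* optional */
};

struct sketch_cmd {
	void *protocol_data;	/* opaque slot holding the ops table */
	int (*execute_cmd)(struct sketch_cmd *cmd);
};

/* Generic handler: fetch the backend's table from the opaque slot. */
static int sketch_execute_rw(struct sketch_cmd *cmd)
{
	struct sketch_exec_cmd_ops *ops = cmd->protocol_data;

	return ops->execute_rw(cmd);
}

/* Parse step: remember the table and select the generic handler. */
int sketch_parse_cdb(struct sketch_cmd *cmd, struct sketch_exec_cmd_ops *ops)
{
	assert(ops && ops->execute_rw);

	cmd->protocol_data = ops;
	cmd->execute_cmd = sketch_execute_rw;
	return 0;
}

/* A ramdisk-like backend that only implements execute_rw. */
static int sketch_rd_execute_rw(struct sketch_cmd *cmd)
{
	(void)cmd;
	return 42;	/* arbitrary marker so the call path is observable */
}

struct sketch_exec_cmd_ops sketch_rd_ops = {
	.execute_rw = sketch_rd_execute_rw,
};

/* Run one command through parse and execute. */
int sketch_run_rd_cmd(void)
{
	struct sketch_cmd cmd = { 0 };

	if (sketch_parse_cdb(&cmd, &sketch_rd_ops))
		return -1;
	return cmd.execute_cmd(&cmd);
}
```

Callbacks left out of the designated initializer are NULL, which is what lets
later code test for an optional callback before calling it.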

* [dm-devel] [PATCH v2 18/20] scsi: target: Allow backends to hook into PR handling.
  2022-08-09  0:03 ` [dm-devel] " Mike Christie
@ 2022-08-09  0:04   ` Mike Christie
  -1 siblings, 0 replies; 94+ messages in thread
From: Mike Christie @ 2022-08-09  0:04 UTC (permalink / raw)
  To: bvanassche, linux-block, dm-devel, snitzer, axboe, hch,
	linux-nvme, martin.petersen, linux-scsi, james.bottomley
  Cc: Mike Christie

For the cases where you want to export a device to a VM via a single
I_T nexus and want to pass the PR handling through to the physical/real
device, you have to use pscsi or tcmu. Both are good for specific uses,
but for the case where you want good performance and are not using
SCSI devices directly (using DM/MD RAID or multipath devices), we are
out of luck.

The following patches allow iblock to minimally hook into the LIO PR code
and then pass the PR handling to the physical device. Note that, as with
the tcmu and pscsi cases, it's only supported when you export the device via
one I_T nexus.

This patch adds the initial LIO callouts. The next patch will modify
iblock.

Signed-off-by: Mike Christie <michael.christie@oracle.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
---
 drivers/target/target_core_pr.c      | 60 ++++++++++++++++++++++++++++
 include/target/target_core_backend.h |  5 +++
 2 files changed, 65 insertions(+)

diff --git a/drivers/target/target_core_pr.c b/drivers/target/target_core_pr.c
index 3829b61b56c1..1c11f884e12f 100644
--- a/drivers/target/target_core_pr.c
+++ b/drivers/target/target_core_pr.c
@@ -3531,6 +3531,26 @@ core_scsi3_emulate_pro_register_and_move(struct se_cmd *cmd, u64 res_key,
 	return ret;
 }
 
+static sense_reason_t
+target_try_pr_out_pt(struct se_cmd *cmd, u8 sa, u64 res_key, u64 sa_res_key,
+		     u8 type, bool aptpl, bool all_tg_pt, bool spec_i_pt)
+{
+	struct exec_cmd_ops *ops = cmd->protocol_data;
+
+	if (!cmd->se_sess || !cmd->se_lun) {
+		pr_err("SPC-3 PR: se_sess || struct se_lun is NULL!\n");
+		return TCM_LOGICAL_UNIT_COMMUNICATION_FAILURE;
+	}
+
+	if (!ops->execute_pr_out) {
+		pr_err("SPC-3 PR: Device has been configured for PR passthrough but it's not supported by the backend.\n");
+		return TCM_UNSUPPORTED_SCSI_OPCODE;
+	}
+
+	return ops->execute_pr_out(cmd, sa, res_key, sa_res_key, type,
+				   aptpl, all_tg_pt, spec_i_pt);
+}
+
 /*
  * See spc4r17 section 6.14 Table 170
  */
@@ -3634,6 +3654,12 @@ target_scsi3_emulate_pr_out(struct se_cmd *cmd)
 		return TCM_PARAMETER_LIST_LENGTH_ERROR;
 	}
 
+	if (dev->transport_flags & TRANSPORT_FLAG_PASSTHROUGH_PGR) {
+		ret = target_try_pr_out_pt(cmd, sa, res_key, sa_res_key, type,
+					   aptpl, all_tg_pt, spec_i_pt);
+		goto done;
+	}
+
 	/*
 	 * (core_scsi3_emulate_pro_* function parameters
 	 * are defined by spc4r17 Table 174:
@@ -3675,6 +3701,7 @@ target_scsi3_emulate_pr_out(struct se_cmd *cmd)
 		return TCM_INVALID_CDB_FIELD;
 	}
 
+done:
 	if (!ret)
 		target_complete_cmd(cmd, SAM_STAT_GOOD);
 	return ret;
@@ -4032,6 +4059,33 @@ core_scsi3_pri_read_full_status(struct se_cmd *cmd)
 	return 0;
 }
 
+static sense_reason_t target_try_pr_in_pt(struct se_cmd *cmd)
+{
+	struct exec_cmd_ops *ops = cmd->protocol_data;
+	unsigned char *buf;
+	sense_reason_t ret;
+
+	if (cmd->data_length < 8) {
+		pr_err("PRIN SA SCSI Data Length: %u too small\n",
+		       cmd->data_length);
+		return TCM_INVALID_CDB_FIELD;
+	}
+
+	if (!ops->execute_pr_in) {
+		pr_err("SPC-3 PR: Device has been configured for PR passthrough but it's not supported by the backend.\n");
+		return TCM_UNSUPPORTED_SCSI_OPCODE;
+	}
+
+	buf = transport_kmap_data_sg(cmd);
+	if (!buf)
+		return TCM_LOGICAL_UNIT_COMMUNICATION_FAILURE;
+
+	ret = ops->execute_pr_in(cmd, cmd->t_task_cdb[1] & 0x1f, buf);
+
+	transport_kunmap_data_sg(cmd);
+	return ret;
+}
+
 sense_reason_t
 target_scsi3_emulate_pr_in(struct se_cmd *cmd)
 {
@@ -4053,6 +4107,11 @@ target_scsi3_emulate_pr_in(struct se_cmd *cmd)
 		return TCM_RESERVATION_CONFLICT;
 	}
 
+	if (cmd->se_dev->transport_flags & TRANSPORT_FLAG_PASSTHROUGH_PGR) {
+		ret = target_try_pr_in_pt(cmd);
+		goto done;
+	}
+
 	switch (cmd->t_task_cdb[1] & 0x1f) {
 	case PRI_READ_KEYS:
 		ret = core_scsi3_pri_read_keys(cmd);
@@ -4072,6 +4131,7 @@ target_scsi3_emulate_pr_in(struct se_cmd *cmd)
 		return TCM_INVALID_CDB_FIELD;
 	}
 
+done:
 	if (!ret)
 		target_complete_cmd(cmd, SAM_STAT_GOOD);
 	return ret;
diff --git a/include/target/target_core_backend.h b/include/target/target_core_backend.h
index c5df78959532..84bfdfb14997 100644
--- a/include/target/target_core_backend.h
+++ b/include/target/target_core_backend.h
@@ -69,6 +69,11 @@ struct exec_cmd_ops {
 	sense_reason_t (*execute_write_same)(struct se_cmd *cmd);
 	sense_reason_t (*execute_unmap)(struct se_cmd *cmd,
 				sector_t lba, sector_t nolb);
+	sense_reason_t (*execute_pr_out)(struct se_cmd *cmd, u8 sa, u64 key,
+					 u64 sa_key, u8 type, bool aptpl,
+					 bool all_tg_pt, bool spec_i_pt);
+	sense_reason_t (*execute_pr_in)(struct se_cmd *cmd, u8 sa,
+					unsigned char *param_data);
 };
 
 int	transport_backend_register(const struct target_backend_ops *);
-- 
2.18.2


^ permalink raw reply related	[flat|nested] 94+ messages in thread
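
The dispatch added by the patch above can be sketched in userspace C as
follows. This is a minimal sketch, assuming only what the diff shows: a
passthrough flag on the device and an optional backend callback. All sketch_*
names and the flag value are hypothetical, and the real code returns
sense_reason_t values rather than this enum.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_FLAG_PASSTHROUGH_PGR	0x1	/* value is arbitrary here */

enum sketch_result {
	SKETCH_EMULATED,	/* handled by the target's own PR emulation */
	SKETCH_PASSED_THROUGH,	/* handed to the backend callback */
	SKETCH_UNSUPPORTED,	/* passthrough configured, no callback */
};

struct sketch_ops {
	/* Optional: NULL when the backend cannot pass PR OUT through. */
	void (*execute_pr_out)(unsigned char sa);
};

struct sketch_dev {
	unsigned int transport_flags;
	const struct sketch_ops *ops;
};

/* Decide who handles a PR OUT command with service action 'sa'. */
enum sketch_result sketch_pr_out(const struct sketch_dev *dev,
				 unsigned char sa)
{
	assert(dev && dev->ops);

	if (dev->transport_flags & SKETCH_FLAG_PASSTHROUGH_PGR) {
		if (!dev->ops->execute_pr_out)
			return SKETCH_UNSUPPORTED;
		dev->ops->execute_pr_out(sa);
		return SKETCH_PASSED_THROUGH;
	}

	/* No passthrough flag: fall back to the emulation path. */
	return SKETCH_EMULATED;
}

/* Backend callback that would forward the command to the real device. */
static void sketch_backend_pr_out(unsigned char sa)
{
	(void)sa;
}

const struct sketch_ops sketch_ops_with_pr = {
	.execute_pr_out = sketch_backend_pr_out,
};
const struct sketch_ops sketch_ops_without_pr = {
	.execute_pr_out = NULL,
};

const struct sketch_dev sketch_emulated_dev = {
	.transport_flags = 0,
	.ops = &sketch_ops_without_pr,
};
const struct sketch_dev sketch_passthrough_dev = {
	.transport_flags = SKETCH_FLAG_PASSTHROUGH_PGR,
	.ops = &sketch_ops_with_pr,
};
const struct sketch_dev sketch_misconfigured_dev = {
	.transport_flags = SKETCH_FLAG_PASSTHROUGH_PGR,
	.ops = &sketch_ops_without_pr,
};
```

The misconfigured case corresponds to the pr_err() plus
TCM_UNSUPPORTED_SCSI_OPCODE branch in the diff: the flag asks for passthrough,
but the backend never filled in the callback.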

* [dm-devel] [PATCH v2 19/20] scsi: target: Don't support SCSI-2 RESERVE/RELEASE
  2022-08-09  0:03 ` [dm-devel] " Mike Christie
@ 2022-08-09  0:04   ` Mike Christie
  -1 siblings, 0 replies; 94+ messages in thread
From: Mike Christie @ 2022-08-09  0:04 UTC (permalink / raw)
  To: bvanassche, linux-block, dm-devel, snitzer, axboe, hch,
	linux-nvme, martin.petersen, linux-scsi, james.bottomley
  Cc: Mike Christie

The pr_ops don't support SCSI-2 RESERVE/RELEASE so fail them during
parsing.

Signed-off-by: Mike Christie <michael.christie@oracle.com>
---
 drivers/target/target_core_spc.c | 25 +++++++++++++++++++------
 1 file changed, 19 insertions(+), 6 deletions(-)

diff --git a/drivers/target/target_core_spc.c b/drivers/target/target_core_spc.c
index c14441c89bed..64ac9b92f8cf 100644
--- a/drivers/target/target_core_spc.c
+++ b/drivers/target/target_core_spc.c
@@ -1314,12 +1314,25 @@ spc_parse_cdb(struct se_cmd *cmd, unsigned int *size)
 	struct se_device *dev = cmd->se_dev;
 	unsigned char *cdb = cmd->t_task_cdb;
 
-	if (!dev->dev_attrib.emulate_pr &&
-	    ((cdb[0] == PERSISTENT_RESERVE_IN) ||
-	     (cdb[0] == PERSISTENT_RESERVE_OUT) ||
-	     (cdb[0] == RELEASE || cdb[0] == RELEASE_10) ||
-	     (cdb[0] == RESERVE || cdb[0] == RESERVE_10))) {
-		return TCM_UNSUPPORTED_SCSI_OPCODE;
+	switch (cdb[0]) {
+	case RESERVE:
+	case RESERVE_10:
+	case RELEASE:
+	case RELEASE_10:
+		if (!dev->dev_attrib.emulate_pr)
+			return TCM_UNSUPPORTED_SCSI_OPCODE;
+		/*
+		 * The block layer pr_ops don't support the old RESERVE/RELEASE
+		 * commands.
+		 */
+		if (dev->transport_flags & TRANSPORT_FLAG_PASSTHROUGH_PGR)
+			return TCM_UNSUPPORTED_SCSI_OPCODE;
+		break;
+	case PERSISTENT_RESERVE_IN:
+	case PERSISTENT_RESERVE_OUT:
+		if (!dev->dev_attrib.emulate_pr)
+			return TCM_UNSUPPORTED_SCSI_OPCODE;
+		break;
 	}
 
 	switch (cdb[0]) {
-- 
2.18.2


^ permalink raw reply related	[flat|nested] 94+ messages in thread
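
The parse-time filter in the patch above amounts to a small truth table over
the opcode, the emulate_pr attribute and the passthrough flag. The sketch
below restates it as a standalone predicate. The function and macro names are
hypothetical; the opcode values are the standard SCSI operation codes.

```c
#include <assert.h>
#include <stdbool.h>

/* Standard SCSI operation codes for the commands the patch filters. */
#define SKETCH_RESERVE			0x16
#define SKETCH_RELEASE			0x17
#define SKETCH_RESERVE_10		0x56
#define SKETCH_RELEASE_10		0x57
#define SKETCH_PERSISTENT_RESERVE_IN	0x5e
#define SKETCH_PERSISTENT_RESERVE_OUT	0x5f

/*
 * Returns true when parsing may continue, false when the command would be
 * failed as an unsupported opcode.
 */
bool sketch_pr_opcode_allowed(unsigned char opcode, bool emulate_pr,
			      bool passthrough_pgr)
{
	switch (opcode) {
	case SKETCH_RESERVE:
	case SKETCH_RESERVE_10:
	case SKETCH_RELEASE:
	case SKETCH_RELEASE_10:
		/* SCSI-2 reservations need emulation and no passthrough. */
		return emulate_pr && !passthrough_pgr;
	case SKETCH_PERSISTENT_RESERVE_IN:
	case SKETCH_PERSISTENT_RESERVE_OUT:
		/* Persistent reservations only need emulate_pr enabled. */
		return emulate_pr;
	default:
		/* Every other opcode is left to the rest of the parser. */
		return true;
	}
}
```

Compared with the single if statement it replaces, the switch makes room for
the extra passthrough check on the SCSI-2 commands without affecting
PERSISTENT RESERVE IN/OUT.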

* [PATCH v2 20/20] scsi: target: Add block PR support to iblock.
  2022-08-09  0:03 ` [dm-devel] " Mike Christie
@ 2022-08-09  0:04   ` Mike Christie
  -1 siblings, 0 replies; 94+ messages in thread
From: Mike Christie @ 2022-08-09  0:04 UTC (permalink / raw)
  To: bvanassche, linux-block, dm-devel, snitzer, axboe, hch,
	linux-nvme, martin.petersen, linux-scsi, james.bottomley
  Cc: Mike Christie

This adds support for the block PR callouts to target_core_iblock. This
patch doesn't attempt to implement the entire spec because there's no way
to support all of it, e.g. SPEC_I_PT and ALL_TG_PT. This only supports
exporting the iblock device from one path on the local target.

Signed-off-by: Mike Christie <michael.christie@oracle.com>
---
 drivers/target/target_core_iblock.c | 292 +++++++++++++++++++++++++++-
 1 file changed, 287 insertions(+), 5 deletions(-)

diff --git a/drivers/target/target_core_iblock.c b/drivers/target/target_core_iblock.c
index 5db7318b4822..caf6958dd75d 100644
--- a/drivers/target/target_core_iblock.c
+++ b/drivers/target/target_core_iblock.c
@@ -23,13 +23,16 @@
 #include <linux/file.h>
 #include <linux/module.h>
 #include <linux/scatterlist.h>
+#include <linux/pr.h>
 #include <scsi/scsi_proto.h>
+#include <scsi/scsi_block_pr.h>
 #include <asm/unaligned.h>
 
 #include <target/target_core_base.h>
 #include <target/target_core_backend.h>
 
 #include "target_core_iblock.h"
+#include "target_core_pr.h"
 
 #define IBLOCK_MAX_BIO_PER_TASK	 32	/* max # of bios to submit at a time */
 #define IBLOCK_BIO_POOL_SIZE	128
@@ -310,7 +313,7 @@ static unsigned long long iblock_emulate_read_cap_with_block_size(
 	return blocks_long;
 }
 
-static void iblock_complete_cmd(struct se_cmd *cmd)
+static void iblock_complete_cmd(struct se_cmd *cmd, blk_status_t blk_status)
 {
 	struct iblock_req *ibr = cmd->priv;
 	u8 status;
@@ -318,7 +321,9 @@ static void iblock_complete_cmd(struct se_cmd *cmd)
 	if (!refcount_dec_and_test(&ibr->pending))
 		return;
 
-	if (atomic_read(&ibr->ib_bio_err_cnt))
+	if (blk_status == BLK_STS_NEXUS)
+		status = SAM_STAT_RESERVATION_CONFLICT;
+	else if (atomic_read(&ibr->ib_bio_err_cnt))
 		status = SAM_STAT_CHECK_CONDITION;
 	else
 		status = SAM_STAT_GOOD;
@@ -331,6 +336,7 @@ static void iblock_bio_done(struct bio *bio)
 {
 	struct se_cmd *cmd = bio->bi_private;
 	struct iblock_req *ibr = cmd->priv;
+	blk_status_t blk_status = bio->bi_status;
 
 	if (bio->bi_status) {
 		pr_err("bio error: %p,  err: %d\n", bio, bio->bi_status);
@@ -343,7 +349,7 @@ static void iblock_bio_done(struct bio *bio)
 
 	bio_put(bio);
 
-	iblock_complete_cmd(cmd);
+	iblock_complete_cmd(cmd, blk_status);
 }
 
 static struct bio *iblock_get_bio(struct se_cmd *cmd, sector_t lba, u32 sg_num,
@@ -759,7 +765,7 @@ iblock_execute_rw(struct se_cmd *cmd, struct scatterlist *sgl, u32 sgl_nents,
 
 	if (!sgl_nents) {
 		refcount_set(&ibr->pending, 1);
-		iblock_complete_cmd(cmd);
+		iblock_complete_cmd(cmd, BLK_STS_OK);
 		return 0;
 	}
 
@@ -817,7 +823,7 @@ iblock_execute_rw(struct se_cmd *cmd, struct scatterlist *sgl, u32 sgl_nents,
 	}
 
 	iblock_submit_bios(&list);
-	iblock_complete_cmd(cmd);
+	iblock_complete_cmd(cmd, BLK_STS_OK);
 	return 0;
 
 fail_put_bios:
@@ -829,6 +835,279 @@ iblock_execute_rw(struct se_cmd *cmd, struct scatterlist *sgl, u32 sgl_nents,
 	return TCM_LOGICAL_UNIT_COMMUNICATION_FAILURE;
 }
 
+static sense_reason_t iblock_execute_pr_out(struct se_cmd *cmd, u8 sa, u64 key,
+					    u64 sa_key, u8 type, bool aptpl,
+					    bool all_tg_pt, bool spec_i_pt)
+{
+	struct se_device *dev = cmd->se_dev;
+	struct iblock_dev *ib_dev = IBLOCK_DEV(dev);
+	struct block_device *bdev = ib_dev->ibd_bd;
+	const struct pr_ops *ops = bdev->bd_disk->fops->pr_ops;
+	blk_status_t blk_stat = 0;
+	int ret;
+
+	if (!ops) {
+		pr_err("Block device does not support pr_ops but iblock device has been configured for PR passthrough.\n");
+		return TCM_UNSUPPORTED_SCSI_OPCODE;
+	}
+
+	switch (sa) {
+	case PRO_REGISTER_AND_MOVE:
+		pr_err("PRO_REGISTER_AND_MOVE is not supported by iblock PR passthrough.\n");
+		return TCM_UNSUPPORTED_SCSI_OPCODE;
+	case PRO_REPLACE_LOST_RESERVATION:
+		pr_err("PRO_REPLACE_LOST_RESERVATION is not supported by iblock PR passthrough.\n");
+		return TCM_UNSUPPORTED_SCSI_OPCODE;
+	case PRO_REGISTER:
+	case PRO_REGISTER_AND_IGNORE_EXISTING_KEY:
+		if (!ops->pr_register) {
+			pr_err("block device does not support pr_register.\n");
+			return TCM_UNSUPPORTED_SCSI_OPCODE;
+		}
+
+		/*
+		 * We only support one target port. We don't know the target
+		 * port config at this level and the block layer has a
+		 * different view.
+		 */
+		if (spec_i_pt || all_tg_pt) {
+			pr_err("SPC-3 PR: SPEC_I_PT and ALL_TG_PT are not supported by PR passthrough.\n");
+
+			return TCM_UNSUPPORTED_SCSI_OPCODE;
+		}
+
+		/* The block layer pr ops always enables aptpl */
+		if (!aptpl)
+			pr_info("APTPL not set by initiator, but will be used.\n");
+
+		ret = ops->pr_register(bdev, key, sa_key,
+				sa == PRO_REGISTER ? 0 : PR_FL_IGNORE_KEY,
+				&blk_stat);
+		break;
+	case PRO_RESERVE:
+		if (!ops->pr_reserve) {
+			pr_err("block_device does not support pr_reserve.\n");
+			return TCM_UNSUPPORTED_SCSI_OPCODE;
+		}
+
+		ret = ops->pr_reserve(bdev, key, scsi_pr_type_to_block(type), 0,
+				      &blk_stat);
+		break;
+	case PRO_CLEAR:
+		if (!ops->pr_clear) {
+			pr_err("block_device does not support pr_clear.\n");
+			return TCM_UNSUPPORTED_SCSI_OPCODE;
+		}
+
+		ret = ops->pr_clear(bdev, key, &blk_stat);
+		break;
+	case PRO_PREEMPT:
+	case PRO_PREEMPT_AND_ABORT:
+		if (!ops->pr_preempt) {
+			pr_err("block_device does not support pr_preempt.\n");
+			return TCM_UNSUPPORTED_SCSI_OPCODE;
+		}
+
+		ret = ops->pr_preempt(bdev, key, sa_key,
+				      scsi_pr_type_to_block(type),
+				      sa == PRO_PREEMPT ? false : true,
+				      &blk_stat);
+		break;
+	case PRO_RELEASE:
+		if (!ops->pr_release) {
+			pr_err("block_device does not support pr_release.\n");
+			return TCM_UNSUPPORTED_SCSI_OPCODE;
+		}
+
+		ret = ops->pr_release(bdev, key, scsi_pr_type_to_block(type),
+				      &blk_stat);
+		break;
+	default:
+		pr_err("Unknown PERSISTENT_RESERVE_OUT SA: 0x%02x\n", sa);
+		return TCM_UNSUPPORTED_SCSI_OPCODE;
+	}
+
+	if (!ret)
+		return TCM_NO_SENSE;
+	else if (blk_stat == BLK_STS_NEXUS)
+		return TCM_RESERVATION_CONFLICT;
+	else
+		return TCM_LOGICAL_UNIT_COMMUNICATION_FAILURE;
+}
+
+static void iblock_pr_report_caps(unsigned char *param_data)
+{
+	u16 len = 8;
+
+	put_unaligned_be16(len, &param_data[0]);
+	/*
+	 * When using the pr_ops passthrough method we only support exporting
+	 * the device through one target port because from the backend module
+	 * level we can't see the target port config. As a result we only
+	 * support registration directly from the I_T nexus the cmd is sent
+	 * through and do not set ATP_C here.
+	 *
+	 * The block layer pr_ops do not support passing in initiators so
+	 * we don't set SIP_C here.
+	 */
+	/* PTPL_C: Persistence across Target Power Loss bit */
+	param_data[2] |= 0x01;
+	/*
+	 * We are filling in the PERSISTENT RESERVATION TYPE MASK below, so
+	 * set the TMV: Type Mask Valid bit.
+	 */
+	param_data[3] |= 0x80;
+	/*
+	 * Change ALLOW COMMANDs to 0x20 or 0x40 later from Table 166
+	 */
+	param_data[3] |= 0x10; /* ALLOW COMMANDs field 001b */
+	/*
+	 * PTPL_A: Persistence across Target Power Loss Active bit. The block
+	 * layer pr ops always enables this so report it active.
+	 */
+	param_data[3] |= 0x01;
+	/*
+	 * Setup the PERSISTENT RESERVATION TYPE MASK from Table 212 spc4r37.
+	 */
+	param_data[4] |= 0x80; /* PR_TYPE_WRITE_EXCLUSIVE_ALLREG */
+	param_data[4] |= 0x40; /* PR_TYPE_EXCLUSIVE_ACCESS_REGONLY */
+	param_data[4] |= 0x20; /* PR_TYPE_WRITE_EXCLUSIVE_REGONLY */
+	param_data[4] |= 0x08; /* PR_TYPE_EXCLUSIVE_ACCESS */
+	param_data[4] |= 0x02; /* PR_TYPE_WRITE_EXCLUSIVE */
+	param_data[5] |= 0x01; /* PR_TYPE_EXCLUSIVE_ACCESS_ALLREG */
+}
+
+static int iblock_pr_read_keys(struct se_cmd *cmd, unsigned char *param_data)
+{
+	struct se_device *dev = cmd->se_dev;
+	struct iblock_dev *ib_dev = IBLOCK_DEV(dev);
+	struct block_device *bdev = ib_dev->ibd_bd;
+	const struct pr_ops *ops = bdev->bd_disk->fops->pr_ops;
+	int i, ret, len, paths, data_offset;
+	struct pr_keys *keys;
+
+	if (!ops) {
+		pr_err("Block device does not support pr_ops but iblock device has been configured for PR passthrough.\n");
+		return TCM_UNSUPPORTED_SCSI_OPCODE;
+	}
+
+	if (!ops->pr_read_keys) {
+		pr_err("Block device does not support read_keys.\n");
+		return TCM_UNSUPPORTED_SCSI_OPCODE;
+	}
+
+	/*
+	 * We don't know what's under us, but dm-multipath will register every
+	 * path with the same key, so start off with enough space for 16 paths.
+	 */
+	paths = 16;
+retry:
+	len = 8 * paths;
+	keys = kzalloc(sizeof(*keys) + len, GFP_KERNEL);
+	if (!keys)
+		return TCM_LOGICAL_UNIT_COMMUNICATION_FAILURE;
+
+	ret = ops->pr_read_keys(bdev, keys, len, NULL);
+	if (!ret) {
+		if (keys->num_keys > paths) {
+			kfree(keys);
+			paths *= 2;
+			goto retry;
+		}
+	} else {
+		ret = TCM_LOGICAL_UNIT_COMMUNICATION_FAILURE;
+		goto free_keys;
+	}
+
+	ret = TCM_NO_SENSE;
+
+	put_unaligned_be32(keys->generation, &param_data[0]);
+	if (!keys->num_keys) {
+		put_unaligned_be32(0, &param_data[4]);
+		goto free_keys;
+	}
+
+	put_unaligned_be32(8 * keys->num_keys, &param_data[4]);
+
+	data_offset = 8;
+	for (i = 0; i < keys->num_keys; i++) {
+		if (data_offset + 8 > cmd->data_length)
+			break;
+
+		put_unaligned_be64(keys->keys[i], &param_data[data_offset]);
+		data_offset += 8;
+	}
+
+free_keys:
+	kfree(keys);
+	return ret;
+}
+
+static int iblock_pr_read_reservation(struct se_cmd *cmd,
+				      unsigned char *param_data)
+{
+	struct se_device *dev = cmd->se_dev;
+	struct iblock_dev *ib_dev = IBLOCK_DEV(dev);
+	struct block_device *bdev = ib_dev->ibd_bd;
+	const struct pr_ops *ops = bdev->bd_disk->fops->pr_ops;
+	struct pr_held_reservation rsv;
+
+	if (!ops) {
+		pr_err("Block device does not support pr_ops but iblock device has been configured for PR passthrough.\n");
+		return TCM_UNSUPPORTED_SCSI_OPCODE;
+	}
+
+	if (!ops->pr_read_reservation) {
+		pr_err("Block device does not support read_reservation.\n");
+		return TCM_UNSUPPORTED_SCSI_OPCODE;
+	}
+
+	if (ops->pr_read_reservation(bdev, &rsv, NULL))
+		return TCM_LOGICAL_UNIT_COMMUNICATION_FAILURE;
+
+	put_unaligned_be32(rsv.generation, &param_data[0]);
+	if (!block_pr_type_to_scsi(rsv.type)) {
+		put_unaligned_be32(0, &param_data[4]);
+		return TCM_NO_SENSE;
+	}
+
+	put_unaligned_be32(16, &param_data[4]);
+
+	if (cmd->data_length < 16)
+		return TCM_NO_SENSE;
+	put_unaligned_be64(rsv.key, &param_data[8]);
+
+	if (cmd->data_length < 22)
+		return TCM_NO_SENSE;
+	param_data[21] = block_pr_type_to_scsi(rsv.type);
+
+	return TCM_NO_SENSE;
+}
+
+static sense_reason_t iblock_execute_pr_in(struct se_cmd *cmd, u8 sa,
+					   unsigned char *param_data)
+{
+	sense_reason_t ret = TCM_NO_SENSE;
+
+	switch (sa) {
+	case PRI_REPORT_CAPABILITIES:
+		iblock_pr_report_caps(param_data);
+		break;
+	case PRI_READ_KEYS:
+		ret = iblock_pr_read_keys(cmd, param_data);
+		break;
+	case PRI_READ_RESERVATION:
+		ret = iblock_pr_read_reservation(cmd, param_data);
+		break;
+	case PRI_READ_FULL_STATUS:
+	default:
+		pr_err("Unknown PERSISTENT_RESERVE_IN SA: 0x%02x\n", sa);
+		return TCM_UNSUPPORTED_SCSI_OPCODE;
+	}
+
+	return ret;
+}
+
 static sector_t iblock_get_blocks(struct se_device *dev)
 {
 	struct iblock_dev *ib_dev = IBLOCK_DEV(dev);
@@ -883,6 +1162,8 @@ static struct exec_cmd_ops iblock_exec_cmd_ops = {
 	.execute_sync_cache	= iblock_execute_sync_cache,
 	.execute_write_same	= iblock_execute_write_same,
 	.execute_unmap		= iblock_execute_unmap,
+	.execute_pr_out		= iblock_execute_pr_out,
+	.execute_pr_in		= iblock_execute_pr_in,
 };
 
 static sense_reason_t
@@ -899,6 +1180,7 @@ static bool iblock_get_write_cache(struct se_device *dev)
 static const struct target_backend_ops iblock_ops = {
 	.name			= "iblock",
 	.inquiry_prod		= "IBLOCK",
+	.transport_flags_changeable = TRANSPORT_FLAG_PASSTHROUGH_PGR,
 	.inquiry_rev		= IBLOCK_VERSION,
 	.owner			= THIS_MODULE,
 	.attach_hba		= iblock_attach_hba,
-- 
2.18.2
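
For readers following the READ KEYS path in the patch above, here is a
minimal userspace sketch of how the parameter data is laid out: a 4-byte
generation, a 4-byte additional length, then one 8-byte key per
registration, all big-endian, with the key list cut off at the
initiator's allocation length. The helper names (put_be32, put_be64,
fill_read_keys) are made up for this illustration and are not part of
the patch; the kernel code uses put_unaligned_be32/be64 instead.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Store a 32-bit value big-endian (stand-in for put_unaligned_be32()). */
static void put_be32(uint32_t v, uint8_t *p)
{
	p[0] = (uint8_t)(v >> 24);
	p[1] = (uint8_t)(v >> 16);
	p[2] = (uint8_t)(v >> 8);
	p[3] = (uint8_t)v;
}

/* Store a 64-bit value big-endian (stand-in for put_unaligned_be64()). */
static void put_be64(uint64_t v, uint8_t *p)
{
	put_be32((uint32_t)(v >> 32), p);
	put_be32((uint32_t)v, p + 4);
}

/*
 * Hypothetical helper: fill READ KEYS parameter data into buf.
 * buf must be at least 8 bytes long. Returns the number of keys
 * that actually fit into the buffer.
 */
static size_t fill_read_keys(uint8_t *buf, size_t buf_len, uint32_t generation,
			     const uint64_t *keys, uint32_t num_keys)
{
	size_t off = 8, copied = 0;
	uint32_t i;

	assert(buf && buf_len >= 8);

	put_be32(generation, &buf[0]);
	/* The length field reports all keys, including those that do not fit. */
	put_be32(8 * num_keys, &buf[4]);

	for (i = 0; i < num_keys; i++) {
		if (off + 8 > buf_len)
			break;
		put_be64(keys[i], &buf[off]);
		off += 8;
		copied++;
	}
	return copied;
}
```

With a 20-byte buffer and two registered keys, only the first key is
copied, while the length field still reports 16 bytes so the initiator
can retry with a larger buffer.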


^ permalink raw reply related	[flat|nested] 94+ messages in thread

* [dm-devel] [PATCH v2 20/20] scsi: target: Add block PR support to iblock.
@ 2022-08-09  0:04   ` Mike Christie
  0 siblings, 0 replies; 94+ messages in thread
From: Mike Christie @ 2022-08-09  0:04 UTC (permalink / raw)
  To: bvanassche, linux-block, dm-devel, snitzer, axboe, hch,
	linux-nvme, martin.petersen, linux-scsi, james.bottomley
  Cc: Mike Christie

This adds support for the block PR callouts to target_core_iblock. This
patch doesn't attempt to implement the entire spec because there's no way
to support all of it, for example SPEC_I_PT and ALL_TG_PT. It only supports
exporting the iblock device from one path on the local target.

Signed-off-by: Mike Christie <michael.christie@oracle.com>
---
 drivers/target/target_core_iblock.c | 292 +++++++++++++++++++++++++++-
 1 file changed, 287 insertions(+), 5 deletions(-)

diff --git a/drivers/target/target_core_iblock.c b/drivers/target/target_core_iblock.c
index 5db7318b4822..caf6958dd75d 100644
--- a/drivers/target/target_core_iblock.c
+++ b/drivers/target/target_core_iblock.c
@@ -23,13 +23,16 @@
 #include <linux/file.h>
 #include <linux/module.h>
 #include <linux/scatterlist.h>
+#include <linux/pr.h>
 #include <scsi/scsi_proto.h>
+#include <scsi/scsi_block_pr.h>
 #include <asm/unaligned.h>
 
 #include <target/target_core_base.h>
 #include <target/target_core_backend.h>
 
 #include "target_core_iblock.h"
+#include "target_core_pr.h"
 
 #define IBLOCK_MAX_BIO_PER_TASK	 32	/* max # of bios to submit at a time */
 #define IBLOCK_BIO_POOL_SIZE	128
@@ -310,7 +313,7 @@ static unsigned long long iblock_emulate_read_cap_with_block_size(
 	return blocks_long;
 }
 
-static void iblock_complete_cmd(struct se_cmd *cmd)
+static void iblock_complete_cmd(struct se_cmd *cmd, blk_status_t blk_status)
 {
 	struct iblock_req *ibr = cmd->priv;
 	u8 status;
@@ -318,7 +321,9 @@ static void iblock_complete_cmd(struct se_cmd *cmd)
 	if (!refcount_dec_and_test(&ibr->pending))
 		return;
 
-	if (atomic_read(&ibr->ib_bio_err_cnt))
+	if (blk_status == BLK_STS_NEXUS)
+		status = SAM_STAT_RESERVATION_CONFLICT;
+	else if (atomic_read(&ibr->ib_bio_err_cnt))
 		status = SAM_STAT_CHECK_CONDITION;
 	else
 		status = SAM_STAT_GOOD;
@@ -331,6 +336,7 @@ static void iblock_bio_done(struct bio *bio)
 {
 	struct se_cmd *cmd = bio->bi_private;
 	struct iblock_req *ibr = cmd->priv;
+	blk_status_t blk_status = bio->bi_status;
 
 	if (bio->bi_status) {
 		pr_err("bio error: %p,  err: %d\n", bio, bio->bi_status);
@@ -343,7 +349,7 @@ static void iblock_bio_done(struct bio *bio)
 
 	bio_put(bio);
 
-	iblock_complete_cmd(cmd);
+	iblock_complete_cmd(cmd, blk_status);
 }
 
 static struct bio *iblock_get_bio(struct se_cmd *cmd, sector_t lba, u32 sg_num,
@@ -759,7 +765,7 @@ iblock_execute_rw(struct se_cmd *cmd, struct scatterlist *sgl, u32 sgl_nents,
 
 	if (!sgl_nents) {
 		refcount_set(&ibr->pending, 1);
-		iblock_complete_cmd(cmd);
+		iblock_complete_cmd(cmd, BLK_STS_OK);
 		return 0;
 	}
 
@@ -817,7 +823,7 @@ iblock_execute_rw(struct se_cmd *cmd, struct scatterlist *sgl, u32 sgl_nents,
 	}
 
 	iblock_submit_bios(&list);
-	iblock_complete_cmd(cmd);
+	iblock_complete_cmd(cmd, BLK_STS_OK);
 	return 0;
 
 fail_put_bios:
@@ -829,6 +835,279 @@ iblock_execute_rw(struct se_cmd *cmd, struct scatterlist *sgl, u32 sgl_nents,
 	return TCM_LOGICAL_UNIT_COMMUNICATION_FAILURE;
 }
 
+static sense_reason_t iblock_execute_pr_out(struct se_cmd *cmd, u8 sa, u64 key,
+					    u64 sa_key, u8 type, bool aptpl,
+					    bool all_tg_pt, bool spec_i_pt)
+{
+	struct se_device *dev = cmd->se_dev;
+	struct iblock_dev *ib_dev = IBLOCK_DEV(dev);
+	struct block_device *bdev = ib_dev->ibd_bd;
+	const struct pr_ops *ops = bdev->bd_disk->fops->pr_ops;
+	blk_status_t blk_stat = 0;
+	int ret;
+
+	if (!ops) {
+		pr_err("Block device does not support pr_ops but iblock device has been configured for PR passthrough.\n");
+		return TCM_UNSUPPORTED_SCSI_OPCODE;
+	}
+
+	switch (sa) {
+	case PRO_REGISTER_AND_MOVE:
+		pr_err("PRO_REGISTER_AND_MOVE is not supported by iblock PR passthrough.\n");
+		return TCM_UNSUPPORTED_SCSI_OPCODE;
+	case PRO_REPLACE_LOST_RESERVATION:
+		pr_err("PRO_REPLACE_LOST_RESERVATION is not supported by iblock PR passthrough.\n");
+		return TCM_UNSUPPORTED_SCSI_OPCODE;
+	case PRO_REGISTER:
+	case PRO_REGISTER_AND_IGNORE_EXISTING_KEY:
+		if (!ops->pr_register) {
+			pr_err("block device does not support pr_register.\n");
+			return TCM_UNSUPPORTED_SCSI_OPCODE;
+		}
+
+		/*
+		 * We only support one target port. We don't know the target
+		 * port config at this level and the block layer has a
+		 * different view.
+		 */
+		if (spec_i_pt || all_tg_pt) {
+			pr_err("SPC-3 PR: SPEC_I_PT and ALL_TG_PT are not supported by PR passthrough.\n");
+
+			return TCM_UNSUPPORTED_SCSI_OPCODE;
+		}
+
+		/* The block layer pr ops always enables aptpl */
+		if (!aptpl)
+			pr_info("APTPL not set by initiator, but will be used.\n");
+
+		ret = ops->pr_register(bdev, key, sa_key,
+				sa == PRO_REGISTER ? 0 : PR_FL_IGNORE_KEY,
+				&blk_stat);
+		break;
+	case PRO_RESERVE:
+		if (!ops->pr_reserve) {
+			pr_err("block_device does not support pr_reserve.\n");
+			return TCM_UNSUPPORTED_SCSI_OPCODE;
+		}
+
+		ret = ops->pr_reserve(bdev, key, scsi_pr_type_to_block(type), 0,
+				      &blk_stat);
+		break;
+	case PRO_CLEAR:
+		if (!ops->pr_clear) {
+			pr_err("block_device does not support pr_clear.\n");
+			return TCM_UNSUPPORTED_SCSI_OPCODE;
+		}
+
+		ret = ops->pr_clear(bdev, key, &blk_stat);
+		break;
+	case PRO_PREEMPT:
+	case PRO_PREEMPT_AND_ABORT:
+		if (!ops->pr_preempt) {
+			pr_err("block_device does not support pr_preempt.\n");
+			return TCM_UNSUPPORTED_SCSI_OPCODE;
+		}
+
+		ret = ops->pr_preempt(bdev, key, sa_key,
+				      scsi_pr_type_to_block(type),
+				      sa == PRO_PREEMPT ? false : true,
+				      &blk_stat);
+		break;
+	case PRO_RELEASE:
+		if (!ops->pr_release) {
+			pr_err("block_device does not support pr_release.\n");
+			return TCM_UNSUPPORTED_SCSI_OPCODE;
+		}
+
+		ret = ops->pr_release(bdev, key, scsi_pr_type_to_block(type),
+				      &blk_stat);
+		break;
+	default:
+		pr_err("Unknown PERSISTENT_RESERVE_OUT SA: 0x%02x\n", sa);
+		return TCM_UNSUPPORTED_SCSI_OPCODE;
+	}
+
+	if (!ret)
+		return TCM_NO_SENSE;
+	else if (blk_stat == BLK_STS_NEXUS)
+		return TCM_RESERVATION_CONFLICT;
+	else
+		return TCM_LOGICAL_UNIT_COMMUNICATION_FAILURE;
+}
+
+static void iblock_pr_report_caps(unsigned char *param_data)
+{
+	u16 len = 8;
+
+	put_unaligned_be16(len, &param_data[0]);
+	/*
+	 * When using the pr_ops passthrough method we only support exporting
+	 * the device through one target port because from the backend module
+	 * level we can't see the target port config. As a result we only
+	 * support registration directly from the I_T nexus the cmd is sent
+	 * through and do not set ATP_C here.
+	 *
+	 * The block layer pr_ops do not support passing in initiators so
+	 * we don't set SIP_C here.
+	 */
+	/* PTPL_C: Persistence across Target Power Loss bit */
+	param_data[2] |= 0x01;
+	/*
+	 * We are filling in the PERSISTENT RESERVATION TYPE MASK below, so
+	 * set the TMV: Type Mask Valid bit.
+	 */
+	param_data[3] |= 0x80;
+	/*
+	 * Change ALLOW COMMANDs to 0x20 or 0x40 later from Table 166
+	 */
+	param_data[3] |= 0x10; /* ALLOW COMMANDs field 001b */
+	/*
+	 * PTPL_A: Persistence across Target Power Loss Active bit. The block
+	 * layer pr ops always enables this so report it active.
+	 */
+	param_data[3] |= 0x01;
+	/*
+	 * Setup the PERSISTENT RESERVATION TYPE MASK from Table 212 spc4r37.
+	 */
+	param_data[4] |= 0x80; /* PR_TYPE_WRITE_EXCLUSIVE_ALLREG */
+	param_data[4] |= 0x40; /* PR_TYPE_EXCLUSIVE_ACCESS_REGONLY */
+	param_data[4] |= 0x20; /* PR_TYPE_WRITE_EXCLUSIVE_REGONLY */
+	param_data[4] |= 0x08; /* PR_TYPE_EXCLUSIVE_ACCESS */
+	param_data[4] |= 0x02; /* PR_TYPE_WRITE_EXCLUSIVE */
+	param_data[5] |= 0x01; /* PR_TYPE_EXCLUSIVE_ACCESS_ALLREG */
+}
+
+static int iblock_pr_read_keys(struct se_cmd *cmd, unsigned char *param_data)
+{
+	struct se_device *dev = cmd->se_dev;
+	struct iblock_dev *ib_dev = IBLOCK_DEV(dev);
+	struct block_device *bdev = ib_dev->ibd_bd;
+	const struct pr_ops *ops = bdev->bd_disk->fops->pr_ops;
+	int i, ret, len, paths, data_offset;
+	struct pr_keys *keys;
+
+	if (!ops) {
+		pr_err("Block device does not support pr_ops but iblock device has been configured for PR passthrough.\n");
+		return TCM_UNSUPPORTED_SCSI_OPCODE;
+	}
+
+	if (!ops->pr_read_keys) {
+		pr_err("Block device does not support read_keys.\n");
+		return TCM_UNSUPPORTED_SCSI_OPCODE;
+	}
+
+	/*
+	 * We don't know what's under us, but dm-multipath will register every
+	 * path with the same key, so start off with enough space for 16 paths.
+	 */
+	paths = 16;
+retry:
+	len = 8 * paths;
+	keys = kzalloc(sizeof(*keys) + len, GFP_KERNEL);
+	if (!keys)
+		return TCM_LOGICAL_UNIT_COMMUNICATION_FAILURE;
+
+	ret = ops->pr_read_keys(bdev, keys, len, NULL);
+	if (!ret) {
+		if (keys->num_keys > paths) {
+			kfree(keys);
+			paths *= 2;
+			goto retry;
+		}
+	} else {
+		ret = TCM_LOGICAL_UNIT_COMMUNICATION_FAILURE;
+		goto free_keys;
+	}
+
+	ret = TCM_NO_SENSE;
+
+	put_unaligned_be32(keys->generation, &param_data[0]);
+	if (!keys->num_keys) {
+		put_unaligned_be32(0, &param_data[4]);
+		goto free_keys;
+	}
+
+	put_unaligned_be32(8 * keys->num_keys, &param_data[4]);
+
+	data_offset = 8;
+	for (i = 0; i < keys->num_keys; i++) {
+		if (data_offset + 8 > cmd->data_length)
+			break;
+
+		put_unaligned_be64(keys->keys[i], &param_data[data_offset]);
+		data_offset += 8;
+	}
+
+free_keys:
+	kfree(keys);
+	return ret;
+}
+
+static int iblock_pr_read_reservation(struct se_cmd *cmd,
+				      unsigned char *param_data)
+{
+	struct se_device *dev = cmd->se_dev;
+	struct iblock_dev *ib_dev = IBLOCK_DEV(dev);
+	struct block_device *bdev = ib_dev->ibd_bd;
+	const struct pr_ops *ops = bdev->bd_disk->fops->pr_ops;
+	struct pr_held_reservation rsv;
+
+	if (!ops) {
+		pr_err("Block device does not support pr_ops but iblock device has been configured for PR passthrough.\n");
+		return TCM_UNSUPPORTED_SCSI_OPCODE;
+	}
+
+	if (!ops->pr_read_reservation) {
+		pr_err("Block device does not support read_reservation.\n");
+		return TCM_UNSUPPORTED_SCSI_OPCODE;
+	}
+
+	if (ops->pr_read_reservation(bdev, &rsv, NULL))
+		return TCM_LOGICAL_UNIT_COMMUNICATION_FAILURE;
+
+	put_unaligned_be32(rsv.generation, &param_data[0]);
+	if (!block_pr_type_to_scsi(rsv.type)) {
+		put_unaligned_be32(0, &param_data[4]);
+		return TCM_NO_SENSE;
+	}
+
+	put_unaligned_be32(16, &param_data[4]);
+
+	if (cmd->data_length < 16)
+		return TCM_NO_SENSE;
+	put_unaligned_be64(rsv.key, &param_data[8]);
+
+	if (cmd->data_length < 22)
+		return TCM_NO_SENSE;
+	param_data[21] = block_pr_type_to_scsi(rsv.type);
+
+	return TCM_NO_SENSE;
+}
+
+static sense_reason_t iblock_execute_pr_in(struct se_cmd *cmd, u8 sa,
+					   unsigned char *param_data)
+{
+	sense_reason_t ret = TCM_NO_SENSE;
+
+	switch (sa) {
+	case PRI_REPORT_CAPABILITIES:
+		iblock_pr_report_caps(param_data);
+		break;
+	case PRI_READ_KEYS:
+		ret = iblock_pr_read_keys(cmd, param_data);
+		break;
+	case PRI_READ_RESERVATION:
+		ret = iblock_pr_read_reservation(cmd, param_data);
+		break;
+	case PRI_READ_FULL_STATUS:
+	default:
+		pr_err("Unknown PERSISTENT_RESERVE_IN SA: 0x%02x\n", sa);
+		return TCM_UNSUPPORTED_SCSI_OPCODE;
+	}
+
+	return ret;
+}
+
 static sector_t iblock_get_blocks(struct se_device *dev)
 {
 	struct iblock_dev *ib_dev = IBLOCK_DEV(dev);
@@ -883,6 +1162,8 @@ static struct exec_cmd_ops iblock_exec_cmd_ops = {
 	.execute_sync_cache	= iblock_execute_sync_cache,
 	.execute_write_same	= iblock_execute_write_same,
 	.execute_unmap		= iblock_execute_unmap,
+	.execute_pr_out		= iblock_execute_pr_out,
+	.execute_pr_in		= iblock_execute_pr_in,
 };
 
 static sense_reason_t
@@ -899,6 +1180,7 @@ static bool iblock_get_write_cache(struct se_device *dev)
 static const struct target_backend_ops iblock_ops = {
 	.name			= "iblock",
 	.inquiry_prod		= "IBLOCK",
+	.transport_flags_changeable = TRANSPORT_FLAG_PASSTHROUGH_PGR,
 	.inquiry_rev		= IBLOCK_VERSION,
 	.owner			= THIS_MODULE,
 	.attach_hba		= iblock_attach_hba,
-- 
2.18.2

--
dm-devel mailing list
dm-devel@redhat.com
https://listman.redhat.com/mailman/listinfo/dm-devel


^ permalink raw reply related	[flat|nested] 94+ messages in thread

* Re: [PATCH v2 14/20] scsi: Retry pr_ops commands if a UA is returned.
  2022-08-09  0:04   ` [dm-devel] " Mike Christie
@ 2022-08-09  7:16     ` Christoph Hellwig
  -1 siblings, 0 replies; 94+ messages in thread
From: Christoph Hellwig @ 2022-08-09  7:16 UTC (permalink / raw)
  To: Mike Christie
  Cc: bvanassche, linux-block, dm-devel, snitzer, axboe, hch,
	linux-nvme, martin.petersen, linux-scsi, james.bottomley

On Mon, Aug 08, 2022 at 07:04:13PM -0500, Mike Christie wrote:
> It's common to get a UA when doing PR commands. It could be due to a
> target restarting, transport level relogin or other PR commands like a
> release causing it. The upper layers don't get the sense and in some cases
> have no idea if it's a SCSI device, so this has the sd layer retry.

This seems like another case for the generic in-kernel passthrough
command retry discussed in the other thread.

Can you split out two series with just bug fixes for nvme and scsi
as I think we should probably get those into 6.0, and then we can
do the actual feature on top of those?
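
As a rough illustration of the retry behaviour described in the quoted
text, the userspace sketch below resubmits a command while it fails with
a UNIT ATTENTION sense key, up to a fixed limit. Everything here is an
assumption made for the example: the struct and function names, the
retry limit, and the fake device are hypothetical, and neither the sd
code in the patch nor a generic passthrough retry helper necessarily
looks like this.

```c
#include <assert.h>

#define SENSE_KEY_UNIT_ATTENTION 0x06	/* sense key value from SPC */
#define PR_MAX_RETRIES 5		/* illustrative limit, not from the patch */

/* Hypothetical result of one PR command submission. */
struct pr_result {
	int status;			/* 0 on success, non-zero on failure */
	unsigned char sense_key;	/* only meaningful when status != 0 */
};

typedef struct pr_result (*pr_submit_fn)(void *ctx);

/* Resubmit while the device answers with UNIT ATTENTION. */
static struct pr_result pr_submit_with_retry(pr_submit_fn submit, void *ctx)
{
	struct pr_result res;
	int tries;

	assert(submit);

	for (tries = 0; tries <= PR_MAX_RETRIES; tries++) {
		res = submit(ctx);
		if (res.status == 0 ||
		    res.sense_key != SENSE_KEY_UNIT_ATTENTION)
			break;
	}
	return res;
}

/* Fake device: reports UNIT ATTENTION ua_left times, then succeeds. */
struct fake_dev {
	int ua_left;
	int calls;
};

static struct pr_result fake_submit(void *ctx)
{
	struct fake_dev *dev = ctx;
	struct pr_result res = { 0, 0 };

	dev->calls++;
	if (dev->ua_left > 0) {
		dev->ua_left--;
		res.status = 2;		/* CHECK CONDITION */
		res.sense_key = SENSE_KEY_UNIT_ATTENTION;
	}
	return res;
}
```

A device that raises two unit attentions succeeds on the third
submission; one that never stops is given up on after
PR_MAX_RETRIES + 1 attempts and the last failure is returned.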

^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: [dm-devel] [PATCH v2 14/20] scsi: Retry pr_ops commands if a UA is returned.
@ 2022-08-09  7:16     ` Christoph Hellwig
  0 siblings, 0 replies; 94+ messages in thread
From: Christoph Hellwig @ 2022-08-09  7:16 UTC (permalink / raw)
  To: Mike Christie
  Cc: axboe, james.bottomley, bvanassche, martin.petersen, snitzer,
	linux-nvme, linux-block, dm-devel, linux-scsi, hch

On Mon, Aug 08, 2022 at 07:04:13PM -0500, Mike Christie wrote:
> It's common to get a UA when doing PR commands. It could be due to a
> target restarting, transport level relogin or other PR commands like a
> release causing it. The upper layers don't get the sense and in some cases
> have no idea if it's a SCSI device, so this has the sd layer retry.

This seems like another case for the generic in-kernel passthrough
command retry discussed in the other thread.

Can you split out two series with just bug fixes for nvme and scsi
as I think we should probably get those into 6.0, and then we can
do the actual feature on top of those?

--
dm-devel mailing list
dm-devel@redhat.com
https://listman.redhat.com/mailman/listinfo/dm-devel


^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: [PATCH v2 16/20] scsi: Have sd pr_ops return a blk_status_t
  2022-08-09  0:04   ` [dm-devel] " Mike Christie
@ 2022-08-09  7:18     ` Christoph Hellwig
  -1 siblings, 0 replies; 94+ messages in thread
From: Christoph Hellwig @ 2022-08-09  7:18 UTC (permalink / raw)
  To: Mike Christie
  Cc: bvanassche, linux-block, dm-devel, snitzer, axboe, hch,
	linux-nvme, martin.petersen, linux-scsi, james.bottomley

On Mon, Aug 08, 2022 at 07:04:15PM -0500, Mike Christie wrote:
> This patch has the sd pr_ops convert from the low level SCSI errors to a
> blk_status_t.

Can you document the why here?

^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: [dm-devel] [PATCH v2 16/20] scsi: Have sd pr_ops return a blk_status_t
@ 2022-08-09  7:18     ` Christoph Hellwig
  0 siblings, 0 replies; 94+ messages in thread
From: Christoph Hellwig @ 2022-08-09  7:18 UTC (permalink / raw)
  To: Mike Christie
  Cc: axboe, james.bottomley, bvanassche, martin.petersen, snitzer,
	linux-nvme, linux-block, dm-devel, linux-scsi, hch

On Mon, Aug 08, 2022 at 07:04:15PM -0500, Mike Christie wrote:
> This patch has the sd pr_ops convert from the low level SCSI errors to a
> blk_status_t.

Can you document the why here?

--
dm-devel mailing list
dm-devel@redhat.com
https://listman.redhat.com/mailman/listinfo/dm-devel


^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: [PATCH v2 06/20] nvme: Fix reservation status related structs
  2022-08-09  0:04   ` [dm-devel] " Mike Christie
@ 2022-08-09  7:19     ` Christoph Hellwig
  -1 siblings, 0 replies; 94+ messages in thread
From: Christoph Hellwig @ 2022-08-09  7:19 UTC (permalink / raw)
  To: Mike Christie
  Cc: bvanassche, linux-block, dm-devel, snitzer, axboe, hch,
	linux-nvme, martin.petersen, linux-scsi, james.bottomley

On Mon, Aug 08, 2022 at 07:04:05PM -0500, Mike Christie wrote:
> This fixes the following issues with the reservation status structs:
> 
> 1. resv10 is bytes 23:10 so it should be 14 bytes.
> 2. regctl_ds only supports 64 bit host IDs.

This doesn't actually seem to be used by the kernel at all.  Which
I guess means I need to go back into my todo list and tackle the
discussion if we want to have non-kernel bits in nvme.h to start
with.
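
The size arithmetic in item 1 of the quoted text can be checked with a
small userspace sketch: a reserved field covering bytes 23:10 inclusive
is 23 - 10 + 1 = 14 bytes, which makes the whole header 24 bytes. The
struct below is a byte-array mirror written for this illustration only.
Its name, the resv7 field name, and the accessor are assumptions; the
field offsets follow my reading of the NVMe Reservation Status data
structure and should be checked against the spec.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical userspace mirror of the Reservation Status header. */
struct resv_status_hdr {
	uint8_t gen[4];		/* bytes 3:0, generation (little-endian) */
	uint8_t rtype;		/* byte 4, reservation type */
	uint8_t regctl[2];	/* bytes 6:5, number of registered controllers */
	uint8_t resv7[2];	/* bytes 8:7, reserved */
	uint8_t ptpls;		/* byte 9, persist through power loss state */
	uint8_t resv10[14];	/* bytes 23:10, reserved: 23 - 10 + 1 = 14 */
};

/* Compile-time checks for the layout described above. */
_Static_assert(sizeof(struct resv_status_hdr) == 24,
	       "header must be 24 bytes");
_Static_assert(offsetof(struct resv_status_hdr, resv10) == 10,
	       "resv10 must start at byte 10");

/* Decode the little-endian registered controller count. */
static unsigned int resv_hdr_regctl(const struct resv_status_hdr *hdr)
{
	assert(hdr);
	return hdr->regctl[0] | ((unsigned int)hdr->regctl[1] << 8);
}
```

If resv10 were declared with any other length, the first static
assertion would fail at compile time.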

^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: [dm-devel] [PATCH v2 06/20] nvme: Fix reservation status related structs
@ 2022-08-09  7:19     ` Christoph Hellwig
  0 siblings, 0 replies; 94+ messages in thread
From: Christoph Hellwig @ 2022-08-09  7:19 UTC (permalink / raw)
  To: Mike Christie
  Cc: axboe, james.bottomley, bvanassche, martin.petersen, snitzer,
	linux-nvme, linux-block, dm-devel, linux-scsi, hch

On Mon, Aug 08, 2022 at 07:04:05PM -0500, Mike Christie wrote:
> This fixes the following issues with the reservation status structs:
> 
> 1. resv10 is bytes 23:10 so it should be 14 bytes.
> 2. regctl_ds only supports 64 bit host IDs.

This doesn't actually seem to be used by the kernel at all.  Which
I guess means I need to go back into my todo list and tackle the
discussion if we want to have non-kernel bits in nvme.h to start
with.

--
dm-devel mailing list
dm-devel@redhat.com
https://listman.redhat.com/mailman/listinfo/dm-devel


^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: [PATCH v2 07/20] nvme: Don't hardcode the data len for pr commands.
  2022-08-09  0:04   ` [dm-devel] " Mike Christie
@ 2022-08-09  7:19     ` Christoph Hellwig
  -1 siblings, 0 replies; 94+ messages in thread
From: Christoph Hellwig @ 2022-08-09  7:19 UTC (permalink / raw)
  To: Mike Christie
  Cc: bvanassche, linux-block, dm-devel, snitzer, axboe, hch,
	linux-nvme, martin.petersen, linux-scsi, james.bottomley

Looks good:

Reviewed-by: Christoph Hellwig <hch@lst.de>

^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: [dm-devel] [PATCH v2 07/20] nvme: Don't hardcode the data len for pr commands.
@ 2022-08-09  7:19     ` Christoph Hellwig
  0 siblings, 0 replies; 94+ messages in thread
From: Christoph Hellwig @ 2022-08-09  7:19 UTC (permalink / raw)
  To: Mike Christie
  Cc: axboe, james.bottomley, bvanassche, martin.petersen, snitzer,
	linux-nvme, linux-block, dm-devel, linux-scsi, hch

Looks good:

Reviewed-by: Christoph Hellwig <hch@lst.de>

--
dm-devel mailing list
dm-devel@redhat.com
https://listman.redhat.com/mailman/listinfo/dm-devel


^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: [PATCH v2 08/20] nvme: Add helper to convert to a pr_ops PR type
  2022-08-09  0:04   ` [dm-devel] " Mike Christie
@ 2022-08-09  7:20     ` Christoph Hellwig
  -1 siblings, 0 replies; 94+ messages in thread
From: Christoph Hellwig @ 2022-08-09  7:20 UTC (permalink / raw)
  To: Mike Christie
  Cc: bvanassche, linux-block, dm-devel, snitzer, axboe, hch,
	linux-nvme, martin.petersen, linux-scsi, james.bottomley

On Mon, Aug 08, 2022 at 07:04:07PM -0500, Mike Christie wrote:
> This adds a helper to go from the NVMe spec PR type value to the block
> layer pr_type, so for Reservation Report support we can convert from its
> value.

Without a user this is going to create a compiler warning.

^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: [dm-devel] [PATCH v2 08/20] nvme: Add helper to convert to a pr_ops PR type
@ 2022-08-09  7:20     ` Christoph Hellwig
  0 siblings, 0 replies; 94+ messages in thread
From: Christoph Hellwig @ 2022-08-09  7:20 UTC (permalink / raw)
  To: Mike Christie
  Cc: axboe, james.bottomley, bvanassche, martin.petersen, snitzer,
	linux-nvme, linux-block, dm-devel, linux-scsi, hch

On Mon, Aug 08, 2022 at 07:04:07PM -0500, Mike Christie wrote:
> This adds a helper to go from the NVMe spec PR type value to the block
> layer pr_type, so for Reservation Report support we can convert from its
> value.

Without a user this is going to create a compiler warning.

--
dm-devel mailing list
dm-devel@redhat.com
https://listman.redhat.com/mailman/listinfo/dm-devel


^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: [PATCH v2 12/20] block,nvme,scsi,dm: Add blk_status to pr_ops callouts.
  2022-08-09  0:04   ` [dm-devel] [PATCH v2 12/20] block, nvme, scsi, dm: " Mike Christie
@ 2022-08-09  7:21     ` Christoph Hellwig
  -1 siblings, 0 replies; 94+ messages in thread
From: Christoph Hellwig @ 2022-08-09  7:21 UTC (permalink / raw)
  To: Mike Christie
  Cc: bvanassche, linux-block, dm-devel, snitzer, axboe, hch,
	linux-nvme, martin.petersen, linux-scsi, james.bottomley

On Mon, Aug 08, 2022 at 07:04:11PM -0500, Mike Christie wrote:
> To handle both cases, this patch adds a blk_status_t arg to the pr_ops
> callouts. The lower levels will convert their device specific error to
> the blk_status_t then the upper levels can easily check that code
> without knowing the device type. It also allows us to keep userspace
> compat where it expects a negative -Exyz error code if the command fails
> before it's sent to the device or a device/transport specific value if the
> error is > 0.

Why do we need two return values here?
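
For reference, the convention described in the quoted text can be
sketched in userspace as follows: the callout keeps returning the raw
value (0, a negative errno, or a positive device-specific status) for
userspace compatibility, and additionally fills in a device-independent
status that an in-kernel caller can test. All names and numeric values
prefixed with sketch_ or SKETCH_ are stand-ins invented for this
example; they are not the kernel's definitions, and the mapping mirrors
the tail of iblock_execute_pr_out() in patch 20 only in spirit.

```c
#include <assert.h>
#include <errno.h>

/* Illustrative stand-ins only; not the kernel's types or values. */
typedef unsigned char sketch_blk_status_t;
#define SKETCH_BLK_STS_OK	0
#define SKETCH_BLK_STS_NEXUS	1
#define SKETCH_BLK_STS_IOERR	2

#define SKETCH_SCSI_RESERVATION_CONFLICT 0x18	/* SAM status code */

enum sketch_sense {
	SKETCH_NO_SENSE,
	SKETCH_RESERVATION_CONFLICT,
	SKETCH_COMMUNICATION_FAILURE,
};

/*
 * Hypothetical lower-level callout. raw is what the device path produced:
 * 0 on success, a negative errno if the command was never sent, or a
 * positive device-specific status. raw is returned unchanged, while
 * *blk_stat carries the device-independent result.
 */
static int sketch_pr_reserve(int raw, sketch_blk_status_t *blk_stat)
{
	assert(blk_stat);

	if (raw == 0)
		*blk_stat = SKETCH_BLK_STS_OK;
	else if (raw == SKETCH_SCSI_RESERVATION_CONFLICT)
		*blk_stat = SKETCH_BLK_STS_NEXUS;
	else
		*blk_stat = SKETCH_BLK_STS_IOERR;
	return raw;
}

/* Upper-level mapping that never needs to know the device type. */
static enum sketch_sense sketch_to_sense(int ret, sketch_blk_status_t blk_stat)
{
	if (!ret)
		return SKETCH_NO_SENSE;
	if (blk_stat == SKETCH_BLK_STS_NEXUS)
		return SKETCH_RESERVATION_CONFLICT;
	return SKETCH_COMMUNICATION_FAILURE;
}
```

A single return value would work for in-kernel callers alone; the second
one exists in this scheme only because the raw value is what existing
userspace callers already see.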

^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: [dm-devel] [PATCH v2 12/20] block, nvme, scsi, dm: Add blk_status to pr_ops callouts.
@ 2022-08-09  7:21     ` Christoph Hellwig
  0 siblings, 0 replies; 94+ messages in thread
From: Christoph Hellwig @ 2022-08-09  7:21 UTC (permalink / raw)
  To: Mike Christie
  Cc: axboe, james.bottomley, bvanassche, martin.petersen, snitzer,
	linux-nvme, linux-block, dm-devel, linux-scsi, hch

On Mon, Aug 08, 2022 at 07:04:11PM -0500, Mike Christie wrote:
> To handle both cases, this patch adds a blk_status_t arg to the pr_ops
> callouts. The lower levels will convert their device specific error to
> the blk_status_t then the upper levels can easily check that code
> without knowing the device type. It also allows us to keep userspace
> compat where it expects a negative -Exyz error code if the command fails
> before it's sent to the device or a device/transport specific value if the
> error is > 0.

Why do we need two return values here?


^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: [PATCH v2 09/20] nvme: Add helper to execute Reservation Report
  2022-08-09  0:04   ` [dm-devel] " Mike Christie
@ 2022-08-09 10:55     ` Chaitanya Kulkarni
  -1 siblings, 0 replies; 94+ messages in thread
From: Chaitanya Kulkarni @ 2022-08-09 10:55 UTC (permalink / raw)
  To: Mike Christie
  Cc: bvanassche, linux-block, dm-devel, snitzer, axboe, hch,
	linux-nvme, martin.petersen, linux-scsi, james.bottomley

On 8/8/22 17:04, Mike Christie wrote:
> This adds a helper to execute the Reservation Report. The next patches
> will then call it and convert that info to read_keys and
> read_reservation.
> 
> Signed-off-by: Mike Christie <michael.christie@oracle.com>
> ---
>   drivers/nvme/host/core.c | 27 +++++++++++++++++++++++++++
>   1 file changed, 27 insertions(+)
> 

From the comments I've received in the past, please add a function
in the patch that actually uses it.

Also, please consider whether we can move the pr code into its own file,
if others are okay with it, as host/core.c is getting bigger.

-ck



^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: [dm-devel] [PATCH v2 09/20] nvme: Add helper to execute Reservation Report
@ 2022-08-09 10:55     ` Chaitanya Kulkarni
  0 siblings, 0 replies; 94+ messages in thread
From: Chaitanya Kulkarni @ 2022-08-09 10:55 UTC (permalink / raw)
  To: Mike Christie
  Cc: axboe, james.bottomley, bvanassche, martin.petersen, snitzer,
	linux-nvme, linux-block, dm-devel, linux-scsi, hch

On 8/8/22 17:04, Mike Christie wrote:
> This adds a helper to execute the Reservation Report. The next patches
> will then call it and convert that info to read_keys and
> read_reservation.
> 
> Signed-off-by: Mike Christie <michael.christie@oracle.com>
> ---
>   drivers/nvme/host/core.c | 27 +++++++++++++++++++++++++++
>   1 file changed, 27 insertions(+)
> 

From the comments I've received in the past, please add a function
in the patch that actually uses it.

Also, please consider whether we can move the pr code into its own file,
if others are okay with it, as host/core.c is getting bigger.

-ck



^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: [PATCH v2 09/20] nvme: Add helper to execute Reservation Report
  2022-08-09  0:04   ` [dm-devel] " Mike Christie
@ 2022-08-09 10:56     ` Chaitanya Kulkarni
  -1 siblings, 0 replies; 94+ messages in thread
From: Chaitanya Kulkarni @ 2022-08-09 10:56 UTC (permalink / raw)
  To: Mike Christie
  Cc: bvanassche, linux-block, dm-devel, snitzer, axboe, hch,
	linux-nvme, martin.petersen, linux-scsi, james.bottomley

On 8/8/22 17:04, Mike Christie wrote:
> This adds a helper to execute the Reservation Report. The next patches
> will then call it and convert that info to read_keys and
> read_reservation.
> 
> Signed-off-by: Mike Christie <michael.christie@oracle.com>
> ---
>   drivers/nvme/host/core.c | 27 +++++++++++++++++++++++++++
>   1 file changed, 27 insertions(+)
> 
> diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
> index 0dc768ae0c16..6b22a5dec122 100644
> --- a/drivers/nvme/host/core.c
> +++ b/drivers/nvme/host/core.c
> @@ -2196,6 +2196,33 @@ static int nvme_pr_release(struct block_device *bdev, u64 key, enum pr_type type
>   	return nvme_pr_command(bdev, cdw10, key, 0, nvme_cmd_resv_release);
>   }
>   
> +static int nvme_pr_resv_report(struct block_device *bdev, u8 *data,
> +		u32 data_len, bool *eds)
> +{
> +	struct nvme_command c = { };
> +	int ret;
> +
> +	c.common.opcode = nvme_cmd_resv_report;
> +	c.common.cdw10 = cpu_to_le32(nvme_bytes_to_numd(data_len));
> +	c.common.cdw11 = 1;
> +	*eds = true;
> +
> +retry:
> +	if (IS_ENABLED(CONFIG_NVME_MULTIPATH) &&
> +	    bdev->bd_disk->fops == &nvme_ns_head_ops)
> +		ret = nvme_send_ns_head_pr_command(bdev, &c, data, data_len);
> +	else
> +		ret = nvme_send_ns_pr_command(bdev->bd_disk->private_data, &c,
> +					      data, data_len);
> +	if (ret == NVME_SC_HOST_ID_INCONSIST && c.common.cdw11) {
> +		c.common.cdw11 = 0;
> +		*eds = false;
> +		goto retry;

Unconditional retries without any limit can create problems,
perhaps consider adding some soft limits.

-ck



^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: [dm-devel] [PATCH v2 09/20] nvme: Add helper to execute Reservation Report
@ 2022-08-09 10:56     ` Chaitanya Kulkarni
  0 siblings, 0 replies; 94+ messages in thread
From: Chaitanya Kulkarni @ 2022-08-09 10:56 UTC (permalink / raw)
  To: Mike Christie
  Cc: axboe, james.bottomley, bvanassche, martin.petersen, snitzer,
	linux-nvme, linux-block, dm-devel, linux-scsi, hch

On 8/8/22 17:04, Mike Christie wrote:
> This adds a helper to execute the Reservation Report. The next patches
> will then call it and convert that info to read_keys and
> read_reservation.
> 
> Signed-off-by: Mike Christie <michael.christie@oracle.com>
> ---
>   drivers/nvme/host/core.c | 27 +++++++++++++++++++++++++++
>   1 file changed, 27 insertions(+)
> 
> diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
> index 0dc768ae0c16..6b22a5dec122 100644
> --- a/drivers/nvme/host/core.c
> +++ b/drivers/nvme/host/core.c
> @@ -2196,6 +2196,33 @@ static int nvme_pr_release(struct block_device *bdev, u64 key, enum pr_type type
>   	return nvme_pr_command(bdev, cdw10, key, 0, nvme_cmd_resv_release);
>   }
>   
> +static int nvme_pr_resv_report(struct block_device *bdev, u8 *data,
> +		u32 data_len, bool *eds)
> +{
> +	struct nvme_command c = { };
> +	int ret;
> +
> +	c.common.opcode = nvme_cmd_resv_report;
> +	c.common.cdw10 = cpu_to_le32(nvme_bytes_to_numd(data_len));
> +	c.common.cdw11 = 1;
> +	*eds = true;
> +
> +retry:
> +	if (IS_ENABLED(CONFIG_NVME_MULTIPATH) &&
> +	    bdev->bd_disk->fops == &nvme_ns_head_ops)
> +		ret = nvme_send_ns_head_pr_command(bdev, &c, data, data_len);
> +	else
> +		ret = nvme_send_ns_pr_command(bdev->bd_disk->private_data, &c,
> +					      data, data_len);
> +	if (ret == NVME_SC_HOST_ID_INCONSIST && c.common.cdw11) {
> +		c.common.cdw11 = 0;
> +		*eds = false;
> +		goto retry;

Unconditional retries without any limit can create problems,
perhaps consider adding some soft limits.

-ck



^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: [PATCH v2 13/20] nvme: Have nvme pr_ops return a blk_status_t
  2022-08-09  0:04   ` [dm-devel] " Mike Christie
@ 2022-08-09 10:58     ` Chaitanya Kulkarni
  -1 siblings, 0 replies; 94+ messages in thread
From: Chaitanya Kulkarni @ 2022-08-09 10:58 UTC (permalink / raw)
  To: Mike Christie
  Cc: bvanassche, linux-block, dm-devel, snitzer, axboe, hch,
	linux-nvme, martin.petersen, linux-scsi, james.bottomley

On 8/8/22 17:04, Mike Christie wrote:
> This patch has the nvme pr_ops convert from a nvme status value to a
> blk_status_t.
> 
> Signed-off-by: Mike Christie <michael.christie@oracle.com>
> ---
>   drivers/nvme/host/core.c | 54 ++++++++++++++++++++++++++--------------
>   1 file changed, 36 insertions(+), 18 deletions(-)
> 
> diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c

Again, please consider moving the pr code into its own file,
if others are okay with it.

-ck



^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: [dm-devel] [PATCH v2 13/20] nvme: Have nvme pr_ops return a blk_status_t
@ 2022-08-09 10:58     ` Chaitanya Kulkarni
  0 siblings, 0 replies; 94+ messages in thread
From: Chaitanya Kulkarni @ 2022-08-09 10:58 UTC (permalink / raw)
  To: Mike Christie
  Cc: axboe, james.bottomley, bvanassche, martin.petersen, snitzer,
	linux-nvme, linux-block, dm-devel, linux-scsi, hch

On 8/8/22 17:04, Mike Christie wrote:
> This patch has the nvme pr_ops convert from a nvme status value to a
> blk_status_t.
> 
> Signed-off-by: Mike Christie <michael.christie@oracle.com>
> ---
>   drivers/nvme/host/core.c | 54 ++++++++++++++++++++++++++--------------
>   1 file changed, 36 insertions(+), 18 deletions(-)
> 
> diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c

Again, please consider moving the pr code into its own file,
if others are okay with it.

-ck



^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: [PATCH v2 06/20] nvme: Fix reservation status related structs
  2022-08-09  7:19     ` [dm-devel] " Christoph Hellwig
@ 2022-08-09 11:09       ` Chaitanya Kulkarni
  -1 siblings, 0 replies; 94+ messages in thread
From: Chaitanya Kulkarni @ 2022-08-09 11:09 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: bvanassche, linux-block, dm-devel, snitzer, axboe, linux-nvme,
	martin.petersen, linux-scsi, james.bottomley, Mike Christie

On 8/9/22 00:19, Christoph Hellwig wrote:
> On Mon, Aug 08, 2022 at 07:04:05PM -0500, Mike Christie wrote:
>> This fixes the following issues with the reservation status structs:
>>
>> 1. resv10 is bytes 23:10 so it should be 14 bytes.
>> 2. regctl_ds only supports 64 bit host IDs.
> 
> This doesn't actually seem to be used by the kernel at all.  Which
> I guess means I need to go back into my todo list and tackle the
> discussion if we want to have non-kernel bits in nvme.h to start
> with.

Having non-kernel bits in nvme.h creates confusion. I've raised this
question in the past. If the old nvme-cli (without libnvme) was the
reason to keep them in the kernel, maybe we can sort this out now
that nvme-cli and libnvme are split?

-ck



^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: [dm-devel] [PATCH v2 06/20] nvme: Fix reservation status related structs
@ 2022-08-09 11:09       ` Chaitanya Kulkarni
  0 siblings, 0 replies; 94+ messages in thread
From: Chaitanya Kulkarni @ 2022-08-09 11:09 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: axboe, james.bottomley, bvanassche, martin.petersen, snitzer,
	linux-nvme, linux-block, dm-devel, linux-scsi, Mike Christie

On 8/9/22 00:19, Christoph Hellwig wrote:
> On Mon, Aug 08, 2022 at 07:04:05PM -0500, Mike Christie wrote:
>> This fixes the following issues with the reservation status structs:
>>
>> 1. resv10 is bytes 23:10 so it should be 14 bytes.
>> 2. regctl_ds only supports 64 bit host IDs.
> 
> This doesn't actually seem to be used by the kernel at all.  Which
> I guess means I need to go back into my todo list and tackle the
> discussion if we want to have non-kernel bits in nvme.h to start
> with.

Having non-kernel bits in nvme.h creates confusion. I've raised this
question in the past. If the old nvme-cli (without libnvme) was the
reason to keep them in the kernel, maybe we can sort this out now
that nvme-cli and libnvme are split?

-ck



^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: [PATCH v2 08/20] nvme: Add helper to convert to a pr_ops PR type
  2022-08-09  0:04   ` [dm-devel] " Mike Christie
@ 2022-08-09 11:12     ` Chaitanya Kulkarni
  -1 siblings, 0 replies; 94+ messages in thread
From: Chaitanya Kulkarni @ 2022-08-09 11:12 UTC (permalink / raw)
  To: Mike Christie
  Cc: bvanassche, linux-block, dm-devel, snitzer, axboe, hch,
	linux-nvme, martin.petersen, linux-scsi, james.bottomley

On 8/8/22 17:04, Mike Christie wrote:
> This adds a helper to go from the NVMe spec PR type value to the block
> layer pr_type, so for Reservation Report support we can convert from its
> value.
> 
> Signed-off-by: Mike Christie <michael.christie@oracle.com>
> ---
>   drivers/nvme/host/core.c | 20 ++++++++++++++++++++
>   1 file changed, 20 insertions(+)
> 
> diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
> index 3f223641f321..0dc768ae0c16 100644
> --- a/drivers/nvme/host/core.c
> +++ b/drivers/nvme/host/core.c
> @@ -2064,6 +2064,26 @@ static int nvme_update_ns_info(struct nvme_ns *ns, struct nvme_ns_info *info)
>   	}
>   }
>   
> +static enum pr_type block_pr_type(u8 nvme_type)
> +{
> +	switch (nvme_type) {
> +	case 1:
> +		return PR_WRITE_EXCLUSIVE;
> +	case 2:
> +		return PR_EXCLUSIVE_ACCESS;
> +	case 3:
> +		return PR_WRITE_EXCLUSIVE_REG_ONLY;
> +	case 4:
> +		return PR_EXCLUSIVE_ACCESS_REG_ONLY;
> +	case 5:
> +		return PR_WRITE_EXCLUSIVE_ALL_REGS;
> +	case 6:
> +		return PR_EXCLUSIVE_ACCESS_ALL_REGS;
> +	default:
> +		return 0;
> +	}
> +}
> +

Missing caller for this one? Also, we can use a sparse array
to avoid adding a switch case for every new nvme_type.

-ck
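
The sparse-array suggestion above can be sketched roughly as follows. This is
only an illustration built as a standalone program, not code from the series:
the enum is a local stand-in for the kernel's enum pr_type, and the table name
nvme_to_block_pr_type is hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Local stand-in for the kernel's enum pr_type (uapi/linux/pr.h). */
enum pr_type {
	PR_WRITE_EXCLUSIVE		= 1,
	PR_EXCLUSIVE_ACCESS		= 2,
	PR_WRITE_EXCLUSIVE_REG_ONLY	= 3,
	PR_EXCLUSIVE_ACCESS_REG_ONLY	= 4,
	PR_WRITE_EXCLUSIVE_ALL_REGS	= 5,
	PR_EXCLUSIVE_ACCESS_ALL_REGS	= 6,
};

/*
 * Hypothetical lookup table indexed by the NVMe reservation type.
 * Index 0 and any other unlisted index are zero-initialized, which
 * plays the role of the "default: return 0" arm in the switch.
 */
static const enum pr_type nvme_to_block_pr_type[] = {
	[1] = PR_WRITE_EXCLUSIVE,
	[2] = PR_EXCLUSIVE_ACCESS,
	[3] = PR_WRITE_EXCLUSIVE_REG_ONLY,
	[4] = PR_EXCLUSIVE_ACCESS_REG_ONLY,
	[5] = PR_WRITE_EXCLUSIVE_ALL_REGS,
	[6] = PR_EXCLUSIVE_ACCESS_ALL_REGS,
};

enum pr_type block_pr_type(uint8_t nvme_type)
{
	size_t n = sizeof(nvme_to_block_pr_type) /
		   sizeof(nvme_to_block_pr_type[0]);

	/* Bounds check first so an unknown type never indexes past the table. */
	if (nvme_type >= n)
		return 0;
	return nvme_to_block_pr_type[nvme_type];
}
```

Compared with the switch, a new type would only need one more table entry, at
the cost of an explicit bounds check.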



^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: [dm-devel] [PATCH v2 08/20] nvme: Add helper to convert to a pr_ops PR type
@ 2022-08-09 11:12     ` Chaitanya Kulkarni
  0 siblings, 0 replies; 94+ messages in thread
From: Chaitanya Kulkarni @ 2022-08-09 11:12 UTC (permalink / raw)
  To: Mike Christie
  Cc: axboe, james.bottomley, bvanassche, martin.petersen, snitzer,
	linux-nvme, linux-block, dm-devel, linux-scsi, hch

On 8/8/22 17:04, Mike Christie wrote:
> This adds a helper to go from the NVMe spec PR type value to the block
> layer pr_type, so for Reservation Report support we can convert from its
> value.
> 
> Signed-off-by: Mike Christie <michael.christie@oracle.com>
> ---
>   drivers/nvme/host/core.c | 20 ++++++++++++++++++++
>   1 file changed, 20 insertions(+)
> 
> diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
> index 3f223641f321..0dc768ae0c16 100644
> --- a/drivers/nvme/host/core.c
> +++ b/drivers/nvme/host/core.c
> @@ -2064,6 +2064,26 @@ static int nvme_update_ns_info(struct nvme_ns *ns, struct nvme_ns_info *info)
>   	}
>   }
>   
> +static enum pr_type block_pr_type(u8 nvme_type)
> +{
> +	switch (nvme_type) {
> +	case 1:
> +		return PR_WRITE_EXCLUSIVE;
> +	case 2:
> +		return PR_EXCLUSIVE_ACCESS;
> +	case 3:
> +		return PR_WRITE_EXCLUSIVE_REG_ONLY;
> +	case 4:
> +		return PR_EXCLUSIVE_ACCESS_REG_ONLY;
> +	case 5:
> +		return PR_WRITE_EXCLUSIVE_ALL_REGS;
> +	case 6:
> +		return PR_EXCLUSIVE_ACCESS_ALL_REGS;
> +	default:
> +		return 0;
> +	}
> +}
> +

Missing caller for this one? Also, we can use a sparse array
to avoid adding a switch case for every new nvme_type.

-ck



^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: [PATCH v2 09/20] nvme: Add helper to execute Reservation Report
  2022-08-09 10:56     ` [dm-devel] " Chaitanya Kulkarni
@ 2022-08-09 14:51       ` Keith Busch
  -1 siblings, 0 replies; 94+ messages in thread
From: Keith Busch @ 2022-08-09 14:51 UTC (permalink / raw)
  To: Chaitanya Kulkarni
  Cc: Mike Christie, bvanassche, linux-block, dm-devel, snitzer, axboe,
	hch, linux-nvme, martin.petersen, linux-scsi, james.bottomley

On Tue, Aug 09, 2022 at 10:56:55AM +0000, Chaitanya Kulkarni wrote:
> On 8/8/22 17:04, Mike Christie wrote:
> > +
> > +	c.common.opcode = nvme_cmd_resv_report;
> > +	c.common.cdw10 = cpu_to_le32(nvme_bytes_to_numd(data_len));
> > +	c.common.cdw11 = 1;
> > +	*eds = true;
> > +
> > +retry:
> > +	if (IS_ENABLED(CONFIG_NVME_MULTIPATH) &&
> > +	    bdev->bd_disk->fops == &nvme_ns_head_ops)
> > +		ret = nvme_send_ns_head_pr_command(bdev, &c, data, data_len);
> > +	else
> > +		ret = nvme_send_ns_pr_command(bdev->bd_disk->private_data, &c,
> > +					      data, data_len);
> > +	if (ret == NVME_SC_HOST_ID_INCONSIST && c.common.cdw11) {
> > +		c.common.cdw11 = 0;
> > +		*eds = false;
> > +		goto retry;
> 
> Unconditional retries without any limit can create problems,
> perhaps consider adding some soft limits.

It's already conditioned on cdw11, which is cleared to 0 on the 2nd try. Not
that that's particularly clear. I'd suggest naming an enum value for it so the
code tells us what the significance of cdw11 is in this context (it's the
Extended Data Structure control flag).
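
As a rough illustration of the point above, here is a standalone sketch (not
the code from the patch) where the cdw11 bit has a name and the fallback is
written as a single second attempt instead of a goto. DEMO_RESV_REPORT_EDS,
demo_send_fn and the two fake devices are hypothetical names made up for this
example; the status value is a local stand-in that is believed to match
include/linux/nvme.h.

```c
#include <stdbool.h>
#include <stdint.h>

/* Local stand-in for the NVMe status code used in the quoted patch. */
#define NVME_SC_HOST_ID_INCONSIST	0x18

/* Hypothetical name for the Extended Data Structure bit in cdw11. */
enum {
	DEMO_RESV_REPORT_EDS = 1 << 0,
};

/* Hypothetical transport hook: sends one Reservation Report. */
typedef int (*demo_send_fn)(uint32_t cdw11, int *calls);

int demo_resv_report(demo_send_fn send, int *calls, bool *eds)
{
	int ret;

	/* First try the extended data structure format. */
	*eds = true;
	ret = send(DEMO_RESV_REPORT_EDS, calls);
	if (ret == NVME_SC_HOST_ID_INCONSIST) {
		/* Exactly one fallback to the non-extended format. */
		*eds = false;
		ret = send(0, calls);
	}
	return ret;
}

/* Fake device that rejects only the extended format. */
int demo_dev_no_eds(uint32_t cdw11, int *calls)
{
	(*calls)++;
	return (cdw11 & DEMO_RESV_REPORT_EDS) ? NVME_SC_HOST_ID_INCONSIST : 0;
}

/* Fake misbehaving device that rejects every request. */
int demo_dev_broken(uint32_t cdw11, int *calls)
{
	(void)cdw11;
	(*calls)++;
	return NVME_SC_HOST_ID_INCONSIST;
}
```

With either fake device the command is sent at most twice, and the caller
simply gets the second status back if the fallback also fails.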

^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: [dm-devel] [PATCH v2 09/20] nvme: Add helper to execute Reservation Report
@ 2022-08-09 14:51       ` Keith Busch
  0 siblings, 0 replies; 94+ messages in thread
From: Keith Busch @ 2022-08-09 14:51 UTC (permalink / raw)
  To: Chaitanya Kulkarni
  Cc: axboe, james.bottomley, bvanassche, martin.petersen, snitzer,
	linux-nvme, linux-block, dm-devel, linux-scsi, hch,
	Mike Christie

On Tue, Aug 09, 2022 at 10:56:55AM +0000, Chaitanya Kulkarni wrote:
> On 8/8/22 17:04, Mike Christie wrote:
> > +
> > +	c.common.opcode = nvme_cmd_resv_report;
> > +	c.common.cdw10 = cpu_to_le32(nvme_bytes_to_numd(data_len));
> > +	c.common.cdw11 = 1;
> > +	*eds = true;
> > +
> > +retry:
> > +	if (IS_ENABLED(CONFIG_NVME_MULTIPATH) &&
> > +	    bdev->bd_disk->fops == &nvme_ns_head_ops)
> > +		ret = nvme_send_ns_head_pr_command(bdev, &c, data, data_len);
> > +	else
> > +		ret = nvme_send_ns_pr_command(bdev->bd_disk->private_data, &c,
> > +					      data, data_len);
> > +	if (ret == NVME_SC_HOST_ID_INCONSIST && c.common.cdw11) {
> > +		c.common.cdw11 = 0;
> > +		*eds = false;
> > +		goto retry;
> 
> Unconditional retries without any limit can create problems,
> perhaps consider adding some soft limits.

It's already conditioned on cdw11, which is cleared to 0 on the 2nd try. Not
that that's particularly clear. I'd suggest naming an enum value for it so the
code tells us what the significance of cdw11 is in this context (it's the
Extended Data Structure control flag).


^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: [PATCH v2 09/20] nvme: Add helper to execute Reservation Report
  2022-08-09 10:55     ` [dm-devel] " Chaitanya Kulkarni
@ 2022-08-09 16:18       ` Mike Christie
  -1 siblings, 0 replies; 94+ messages in thread
From: Mike Christie @ 2022-08-09 16:18 UTC (permalink / raw)
  To: Chaitanya Kulkarni
  Cc: bvanassche, linux-block, dm-devel, snitzer, axboe, hch,
	linux-nvme, martin.petersen, linux-scsi, james.bottomley

On 8/9/22 5:55 AM, Chaitanya Kulkarni wrote:
> On 8/8/22 17:04, Mike Christie wrote:
>> This adds a helper to execute the Reservation Report. The next patches
>> will then call it and convert that info to read_keys and
>> read_reservation.
>>
>> Signed-off-by: Mike Christie <michael.christie@oracle.com>
>> ---
>>   drivers/nvme/host/core.c | 27 +++++++++++++++++++++++++++
>>   1 file changed, 27 insertions(+)
>>
> 
> from the comments I've received in the past, please add a function
> in the patch where it is actually using it.
> 

Per your comment and Christoph's, I'll merge patches 8 - 11 together
so the helpers and their users are in the same patch.


> Also, please consider if we can move pr code into its separate file
> if others are okay with it as host/core.c file is getting bigger.
> 

Ok.


^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: [dm-devel] [PATCH v2 09/20] nvme: Add helper to execute Reservation Report
@ 2022-08-09 16:18       ` Mike Christie
  0 siblings, 0 replies; 94+ messages in thread
From: Mike Christie @ 2022-08-09 16:18 UTC (permalink / raw)
  To: Chaitanya Kulkarni
  Cc: axboe, james.bottomley, bvanassche, martin.petersen, snitzer,
	linux-nvme, linux-block, dm-devel, linux-scsi, hch

On 8/9/22 5:55 AM, Chaitanya Kulkarni wrote:
> On 8/8/22 17:04, Mike Christie wrote:
>> This adds a helper to execute the Reservation Report. The next patches
>> will then call it and convert that info to read_keys and
>> read_reservation.
>>
>> Signed-off-by: Mike Christie <michael.christie@oracle.com>
>> ---
>>   drivers/nvme/host/core.c | 27 +++++++++++++++++++++++++++
>>   1 file changed, 27 insertions(+)
>>
> 
> from the comments I've received in the past, please add a function
> in the patch where it is actually using it.
> 

Per your comment and Christoph's, I'll merge patches 8 - 11 together
so the helpers and their users are in the same patch.


> Also, please consider if we can move pr code into its separate file
> if others are okay with it as host/core.c file is getting bigger.
> 

Ok.


^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: [PATCH v2 09/20] nvme: Add helper to execute Reservation Report
  2022-08-09 14:51       ` [dm-devel] " Keith Busch
@ 2022-08-09 16:21         ` Mike Christie
  -1 siblings, 0 replies; 94+ messages in thread
From: Mike Christie @ 2022-08-09 16:21 UTC (permalink / raw)
  To: Keith Busch, Chaitanya Kulkarni
  Cc: bvanassche, linux-block, dm-devel, snitzer, axboe, hch,
	linux-nvme, martin.petersen, linux-scsi, james.bottomley

On 8/9/22 9:51 AM, Keith Busch wrote:
> On Tue, Aug 09, 2022 at 10:56:55AM +0000, Chaitanya Kulkarni wrote:
>> On 8/8/22 17:04, Mike Christie wrote:
>>> +
>>> +	c.common.opcode = nvme_cmd_resv_report;
>>> +	c.common.cdw10 = cpu_to_le32(nvme_bytes_to_numd(data_len));
>>> +	c.common.cdw11 = 1;
>>> +	*eds = true;
>>> +
>>> +retry:
>>> +	if (IS_ENABLED(CONFIG_NVME_MULTIPATH) &&
>>> +	    bdev->bd_disk->fops == &nvme_ns_head_ops)
>>> +		ret = nvme_send_ns_head_pr_command(bdev, &c, data, data_len);
>>> +	else
>>> +		ret = nvme_send_ns_pr_command(bdev->bd_disk->private_data, &c,
>>> +					      data, data_len);
>>> +	if (ret == NVME_SC_HOST_ID_INCONSIST && c.common.cdw11) {
>>> +		c.common.cdw11 = 0;
>>> +		*eds = false;
>>> +		goto retry;
>>
>> Unconditional retries without any limit can create problems,
>> perhaps consider adding some soft limits.
> 
> It's already conditioned on cdw11, which is cleared to 0 on the 2nd try. Not
> that that's particularly clear. I'd suggest naming an enum value for it so the
> code tells us what the significance of cdw11 is in this context (it's the
> Extended Data Structure control flag).

Will do that.

Chaitanya, for your comment: with a bad device we could hit an issue where we
cleared the Extended Data Structure control flag and it still returned
NVME_SC_HOST_ID_INCONSIST and we'd be in an infinite loop, so I'll handle that.


^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: [dm-devel] [PATCH v2 09/20] nvme: Add helper to execute Reservation Report
@ 2022-08-09 16:21         ` Mike Christie
  0 siblings, 0 replies; 94+ messages in thread
From: Mike Christie @ 2022-08-09 16:21 UTC (permalink / raw)
  To: Keith Busch, Chaitanya Kulkarni
  Cc: axboe, james.bottomley, bvanassche, martin.petersen, snitzer,
	linux-nvme, linux-block, dm-devel, linux-scsi, hch

On 8/9/22 9:51 AM, Keith Busch wrote:
> On Tue, Aug 09, 2022 at 10:56:55AM +0000, Chaitanya Kulkarni wrote:
>> On 8/8/22 17:04, Mike Christie wrote:
>>> +
>>> +	c.common.opcode = nvme_cmd_resv_report;
>>> +	c.common.cdw10 = cpu_to_le32(nvme_bytes_to_numd(data_len));
>>> +	c.common.cdw11 = 1;
>>> +	*eds = true;
>>> +
>>> +retry:
>>> +	if (IS_ENABLED(CONFIG_NVME_MULTIPATH) &&
>>> +	    bdev->bd_disk->fops == &nvme_ns_head_ops)
>>> +		ret = nvme_send_ns_head_pr_command(bdev, &c, data, data_len);
>>> +	else
>>> +		ret = nvme_send_ns_pr_command(bdev->bd_disk->private_data, &c,
>>> +					      data, data_len);
>>> +	if (ret == NVME_SC_HOST_ID_INCONSIST && c.common.cdw11) {
>>> +		c.common.cdw11 = 0;
>>> +		*eds = false;
>>> +		goto retry;
>>
>> Unconditional retries without any limit can create problems,
>> perhaps consider adding some soft limits.
> 
> It's already conditioned on cdw11, which is cleared to 0 on the 2nd try. Not
> that that's particularly clear. I'd suggest naming an enum value for it so the
> code tells us what the significance of cdw11 is in this context (it's the
> Extended Data Structure control flag).

Will do that.

Chaitanya, for your comment: with a bad device we could hit an issue where we
cleared the Extended Data Structure control flag and it still returned
NVME_SC_HOST_ID_INCONSIST and we'd be in an infinite loop, so I'll handle that.


^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: [PATCH v2 16/20] scsi: Have sd pr_ops return a blk_status_t
  2022-08-09  7:18     ` [dm-devel] " Christoph Hellwig
@ 2022-08-09 16:22       ` Mike Christie
  -1 siblings, 0 replies; 94+ messages in thread
From: Mike Christie @ 2022-08-09 16:22 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: bvanassche, linux-block, dm-devel, snitzer, axboe, linux-nvme,
	martin.petersen, linux-scsi, james.bottomley

On 8/9/22 2:18 AM, Christoph Hellwig wrote:
> On Mon, Aug 08, 2022 at 07:04:15PM -0500, Mike Christie wrote:
>> This patch has the sd pr_ops convert from the low level SCSI errors to a
>> blk_status_t.
> 
> Can you document the why here?

Will do. Also will fix up the comments in the other patches.

^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: [dm-devel] [PATCH v2 16/20] scsi: Have sd pr_ops return a blk_status_t
@ 2022-08-09 16:22       ` Mike Christie
  0 siblings, 0 replies; 94+ messages in thread
From: Mike Christie @ 2022-08-09 16:22 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: axboe, james.bottomley, bvanassche, martin.petersen, snitzer,
	linux-nvme, linux-block, dm-devel, linux-scsi

On 8/9/22 2:18 AM, Christoph Hellwig wrote:
> On Mon, Aug 08, 2022 at 07:04:15PM -0500, Mike Christie wrote:
>> This patch has the sd pr_ops convert from the low level SCSI errors to a
>> blk_status_t.
> 
> Can you document the why here?

Will do. Also will fix up the comments in the other patches.


^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: [PATCH v2 14/20] scsi: Retry pr_ops commands if a UA is returned.
  2022-08-09  7:16     ` [dm-devel] " Christoph Hellwig
@ 2022-08-09 16:24       ` Mike Christie
  -1 siblings, 0 replies; 94+ messages in thread
From: Mike Christie @ 2022-08-09 16:24 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: bvanassche, linux-block, dm-devel, snitzer, axboe, linux-nvme,
	martin.petersen, linux-scsi, james.bottomley, Martin Wilck

On 8/9/22 2:16 AM, Christoph Hellwig wrote:
> On Mon, Aug 08, 2022 at 07:04:13PM -0500, Mike Christie wrote:
>> It's common to get a UA when doing PR commands. It could be due to a
>> target restarting, transport level relogin or other PR commands like a
>> release causing it. The upper layers don't get the sense and in some cases
>> have no idea if it's a SCSI device, so this has the sd layer retry.
> 
> This seems like another case for the generic in-kernel passthrough
> command retry discussed in the other thread.

It is.

> 
> Can you split out two series with just bug fixes for nvme and scsi
> as I think we should probably get those into 6.0, and then we can
> do the actual feature on top of those?

Ok.

Martin W, I'll submit a patch with a new SCMD flag that will fix
both our problems.

^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: [dm-devel] [PATCH v2 14/20] scsi: Retry pr_ops commands if a UA is returned.
@ 2022-08-09 16:24       ` Mike Christie
  0 siblings, 0 replies; 94+ messages in thread
From: Mike Christie @ 2022-08-09 16:24 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: axboe, james.bottomley, bvanassche, martin.petersen, snitzer,
	linux-nvme, linux-block, dm-devel, linux-scsi, Martin Wilck

On 8/9/22 2:16 AM, Christoph Hellwig wrote:
> On Mon, Aug 08, 2022 at 07:04:13PM -0500, Mike Christie wrote:
>> It's common to get a UA when doing PR commands. It could be due to a
>> target restarting, transport level relogin or other PR commands like a
>> release causing it. The upper layers don't get the sense and in some cases
>> have no idea if it's a SCSI device, so this has the sd layer retry.
> 
> This seems like another case for the generic in-kernel passthrough
> command retry discussed in the other thread.

It is.

> 
> Can you split out two series with just bug fixes for nvme and scsi
> as I think we should probably get those into 6.0, and then we can
> do the actual feature on top of those?

Ok.

Martin W, I'll submit a patch with a new SCMD flag that will fix
both our problems.


^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: [PATCH v2 12/20] block,nvme,scsi,dm: Add blk_status to pr_ops callouts.
  2022-08-09  7:21     ` [dm-devel] [PATCH v2 12/20] block, nvme, scsi, dm: " Christoph Hellwig
@ 2022-08-09 18:08       ` Mike Christie
  -1 siblings, 0 replies; 94+ messages in thread
From: Mike Christie @ 2022-08-09 18:08 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: bvanassche, linux-block, dm-devel, snitzer, axboe, linux-nvme,
	martin.petersen, linux-scsi, james.bottomley

On 8/9/22 2:21 AM, Christoph Hellwig wrote:
> On Mon, Aug 08, 2022 at 07:04:11PM -0500, Mike Christie wrote:
>> To handle both cases, this patch adds a blk_status_t arg to the pr_ops
>> callouts. The lower levels will convert their device specific error to
>> the blk_status_t then the upper levels can easily check that code
>> without knowing the device type. It also allows us to keep userspace
>> compat where it expects a negative -Exyz error code if the command fails
>> before it's sent to the device or a device/transport specific value if the
>> error is > 0.
> 
> Why do we need two return values here?

I know the 2 return values are gross :) I can do it in one, but I wasn't sure
what's worse. See below for the other possible solutions. I think they are all
bad.


0. Convert device specific conflict error to -EBADE then back:

sd_pr_command()

.....

/* would add similar check for NVME_SC_RESERVATION_CONFLICT in nvme */
if (result == SAM_STAT_RESERVATION_CONFLICT)
	return -EBADE;
else
	return result;


LIO then just checks for -EBADE but when going to userspace we have to
convert:


blkdev_pr_register()

...
	result = ops->pr_register()
	if (result < 0) {
		/* For compat we must convert back to the nvme/scsi code */
		if (result == -EBADE) {
			/* need some helper for this that calls down the stack */
			if (bdev == SCSI)
				return SAM_STAT_RESERVATION_CONFLICT
			else
				return NVME_SC_RESERVATION_CONFLICT
		} else
			return blk_status_to_str(result)
	} else
		return result;


The conversion is kind of gross, and I was thinking it's going to get worse
in the future. I'm going to want more advanced error handling in LIO and
dm-multipath. For example, dm-multipath wants to know whether a pr_op failed
because of a path failure, so it can retry on another path, or because of a
hard device/target error. It would also be nice for LIO if, when a PGR had
bad/illegal values and the device returned an error, I could detect that.


1. Drop the -Exyz error type and use blk_status_t in the kernel:

sd_pr_command()

.....
if (result < 0)
	return -errno_to_blk_status(result);
else if (result == SAM_STAT_RESERVATION_CONFLICT)
	return -BLK_STS_NEXUS;
else
	return result;

blkdev_pr_register()

...
	result = ops->pr_register()
	if (result < 0) {
		/* For compat we must convert back to the nvme/scsi code */
		if (result == -BLK_STS_NEXUS) {
			/* need some helper for this that calls down the stack */
			if (bdev == SCSI)
				return SAM_STAT_RESERVATION_CONFLICT
			else
				return NVME_SC_RESERVATION_CONFLICT
		} else
			return blk_status_to_str(result)
	} else
		return result;

This has similar issues as #0 where we have to convert before returning to
userspace.


Note: In this case, if the block layer uses an -Exyz error code that there's
no BLK_STS for, then we would return -EIO to userspace now. I was thinking
that might not be ok, but I could also just add a BLK_STS error code
for errors like EINVAL, EWOULDBLOCK, ENOMEM, etc. so that doesn't happen.
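The fallback described in the note can be illustrated with a toy mapping
table. This is only a sketch: `toy_sts` and its four values are hypothetical
stand-ins and do not reproduce the kernel's blk_status_t table. The point is
just that any errno without its own entry comes back as -EIO after the round
trip.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#ifndef EBADE
#define EBADE 52	/* Linux value; fallback for platforms without EBADE */
#endif

/* Hypothetical, cut-down status type standing in for blk_status_t. */
enum toy_sts { TOY_STS_OK, TOY_STS_NOSPC, TOY_STS_NEXUS, TOY_STS_IOERR };

static const struct {
	int err;
	enum toy_sts sts;
} toy_map[] = {
	{ 0,       TOY_STS_OK },
	{ -ENOSPC, TOY_STS_NOSPC },
	{ -EBADE,  TOY_STS_NEXUS },
	{ -EIO,    TOY_STS_IOERR },
};

#define TOY_MAP_LEN (sizeof(toy_map) / sizeof(toy_map[0]))

/* errno -> status; anything without an entry degrades to a generic I/O error. */
enum toy_sts toy_errno_to_sts(int err)
{
	for (size_t i = 0; i < TOY_MAP_LEN; i++)
		if (toy_map[i].err == err)
			return toy_map[i].sts;
	return TOY_STS_IOERR;
}

/* status -> errno, for the path back to userspace. */
int toy_sts_to_errno(enum toy_sts sts)
{
	for (size_t i = 0; i < TOY_MAP_LEN; i++)
		if (toy_map[i].sts == sts)
			return toy_map[i].err;
	return -EIO;
}
```

With this table, -ENOSPC survives the round trip, while -EINVAL goes in and
-EIO comes out. Adding a dedicated status value per errno is the way to avoid
that loss.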


2. We could do something like below where the low levels are not changed but the
caller converts:

sd_pr_command()
	/* no changes */

lio()
	result = ops->pr_register()
	if (result > 0) {
		/* add some stacked helper again that goes through dm and
		 * to the low level device
		 */
		if (bdev == SCSI)
			result = scsi_result_to_blk_status(result)
		else
			result = nvme_error_status(result)
	}


This looks simple, but it felt wrong to make upper layers know the
device type and call conversion functions.

^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: [PATCH v2 02/20] scsi: Rename sd_pr_command.
  2022-08-09  0:04   ` [dm-devel] " Mike Christie
@ 2022-08-09 19:22     ` Bart Van Assche
  -1 siblings, 0 replies; 94+ messages in thread
From: Bart Van Assche @ 2022-08-09 19:22 UTC (permalink / raw)
  To: Mike Christie, linux-block, dm-devel, snitzer, axboe, hch,
	linux-nvme, martin.petersen, linux-scsi, james.bottomley

On 8/8/22 17:04, Mike Christie wrote:
> Rename sd_pr_command to sd_pr_out_command to match a
> sd_pr_in_command helper added in the next patches.

No trailing dots at the end of the patch subject please (this patch and 
other patches). Otherwise this patch looks good to me.

Thanks,

Bart.

^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: [PATCH v2 04/20] scsi: Add support for block PR read keys/reservation.
  2022-08-09  0:04   ` [dm-devel] " Mike Christie
@ 2022-08-09 19:26     ` Bart Van Assche
  -1 siblings, 0 replies; 94+ messages in thread
From: Bart Van Assche @ 2022-08-09 19:26 UTC (permalink / raw)
  To: Mike Christie, linux-block, dm-devel, snitzer, axboe, hch,
	linux-nvme, martin.petersen, linux-scsi, james.bottomley

On 8/8/22 17:04, Mike Christie wrote:
> +static int sd_pr_in_command(struct block_device *bdev, u8 sa,
> +			    unsigned char *data, int data_len)
> +{
> +	struct scsi_disk *sdkp = scsi_disk(bdev->bd_disk);
> +	struct scsi_device *sdev = sdkp->device;
> +	struct scsi_sense_hdr sshdr;
> +	u8 cmd[10] = { 0, };
> +	int result;

Isn't "{ }" instead of "{ 0, }" the preferred way to zero-initialize a 
data structure?

> +
> +	cmd[0] = PERSISTENT_RESERVE_IN;
> +	cmd[1] = sa;

Can the above two assignments be moved into the initializer of cmd[]?
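For illustration, a small userspace sketch of the initializer form being
asked about. The wrapper function `build_pr_in_cdb` is hypothetical and only
exists so the result can be inspected; the opcode is defined locally for the
sketch. It relies on the C rule that array elements not named in an
initializer list are zero-initialized.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define PERSISTENT_RESERVE_IN 0x5e	/* SCSI opcode, defined locally for the sketch */

/* Hypothetical helper: build the 10-byte CDB with the opcode and service
 * action placed directly in the initializer instead of assigned afterwards.
 */
void build_pr_in_cdb(uint8_t out[10], uint8_t sa)
{
	/* Bytes 2..9 are not listed, so they are zero-initialized. */
	uint8_t cmd[10] = { PERSISTENT_RESERVE_IN, sa };

	memcpy(out, cmd, sizeof(cmd));
}
```

The two forms should produce the same bytes; the difference is purely one of
style.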

Thanks,

Bart.

^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: [PATCH v2 14/20] scsi: Retry pr_ops commands if a UA is returned.
  2022-08-09 16:24       ` [dm-devel] " Mike Christie
@ 2022-08-09 19:31         ` Bart Van Assche
  -1 siblings, 0 replies; 94+ messages in thread
From: Bart Van Assche @ 2022-08-09 19:31 UTC (permalink / raw)
  To: Mike Christie, Christoph Hellwig
  Cc: linux-block, dm-devel, snitzer, axboe, linux-nvme,
	martin.petersen, linux-scsi, james.bottomley, Martin Wilck

On 8/9/22 09:24, Mike Christie wrote:
> On 8/9/22 2:16 AM, Christoph Hellwig wrote:
>> On Mon, Aug 08, 2022 at 07:04:13PM -0500, Mike Christie wrote:
>>> It's common to get a UA when doing PR commands. It could be due to a
>>> target restarting, transport level relogin or other PR commands like a
>>> release causing it. The upper layers don't get the sense and in some cases
>>> have no idea if it's a SCSI device, so this has the sd layer retry.
>>
>> This seems like another case for the generic in-kernel passthrough
>> command retry discussed in the other thread.
> 
> It is.

Has it been considered to introduce a flag that makes scsi_noretry_cmd() 
retry passthrough commands?

Thanks,

Bart.

^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: [PATCH v2 12/20] block,nvme,scsi,dm: Add blk_status to pr_ops callouts.
  2022-08-09 18:08       ` [dm-devel] [PATCH v2 12/20] block, nvme, scsi, dm: " Mike Christie
@ 2022-08-09 19:33         ` Bart Van Assche
  -1 siblings, 0 replies; 94+ messages in thread
From: Bart Van Assche @ 2022-08-09 19:33 UTC (permalink / raw)
  To: Mike Christie, Christoph Hellwig
  Cc: linux-block, dm-devel, snitzer, axboe, linux-nvme,
	martin.petersen, linux-scsi, james.bottomley

On 8/9/22 11:08, Mike Christie wrote:
> On 8/9/22 2:21 AM, Christoph Hellwig wrote:
>> On Mon, Aug 08, 2022 at 07:04:11PM -0500, Mike Christie wrote:
>>> To handle both cases, this patch adds a blk_status_t arg to the pr_ops
>>> callouts. The lower levels will convert their device specific error to
>>> the blk_status_t then the upper levels can easily check that code
>>> without knowing the device type. It also allows us to keep userspace
>>> compat where it expects a negative -Exyz error code if the command fails
>>> before it's sent to the device or a device/transport specific value if the
>>> error is > 0.
>>
>> Why do we need two return values here?
> 
> I know the 2 return values are gross :) I can do it in one, but I wasn't sure
> what's worse. See below for the other possible solutions. I think they are all
> bad.
> 
> 
> 0. Convert device specific conflict error to -EBADE then back:
> 
> sd_pr_command()
> 
> .....
> 
> /* would add similar check for NVME_SC_RESERVATION_CONFLICT in nvme */
> if (result == SAM_STAT_RESERVATION_CONFLICT)
> 	return -EBADE;
> else
> 	return result;
> 
> 
> LIO then just checks for -EBADE but when going to userspace we have to
> convert:
> 
> 
> blkdev_pr_register()
> 
> ...
> 	result = ops->pr_register()
> 	if (result < 0) {
> 		/* For compat we must convert back to the nvme/scsi code */
> 		if (result == -EBADE) {
> 			/* need some helper for this that calls down the stack */
> 			if (bdev == SCSI)
> 				return SAM_STAT_RESERVATION_CONFLICT
> 			else
> 				return NVME_SC_RESERVATION_CONFLICT
> 		} else
> 			return result;
> 	} else
> 		return result;
> 
> 
> The conversion is kind of gross and I was thinking in the future it's going
> to get worse. I'm going to want to have more advanced error handling in LIO
> and dm-multipath. For example, dm-multipath wants to know whether a pr_op
> failed because of a path failure, so it can retry on another path, or because
> of a hard device/target error. It would also be nice for LIO if, when a PGR
> has bad/illegal values and the device returns an error, I could detect that.
> 
> 
> 1. Drop the -Exyz error type and use blk_status_t in the kernel:
> 
> sd_pr_command()
> 
> .....
> if (result < 0)
> 	return -errno_to_blk_status(result);
> else if (result == SAM_STAT_RESERVATION_CONFLICT)
> 	return -BLK_STS_NEXUS;
> else
> 	return result;
> 
> blkdev_pr_register()
> 
> ...
> 	result = ops->pr_register()
> 	if (result < 0) {
> 		/* For compat we must convert back to the nvme/scsi code */
> 		if (result == -BLK_STS_NEXUS) {
> 			/* need some helper for this that calls down the stack */
> 			if (bdev == SCSI)
> 				return SAM_STAT_RESERVATION_CONFLICT
> 			else
> 				return NVME_SC_RESERVATION_CONFLICT
> 		} else
> 			return blk_status_to_errno(-result)
> 	} else
> 		return result;
> 
> This has similar issues to #0: we have to convert before returning to
> userspace.
> 
> 
> Note: In this case, if the block layer uses an -Exyz error code that there's
> no BLK_STS for, then we would return -EIO to userspace now. I was thinking
> that might not be ok, but I could also just add a BLK_STS error code
> for errors like EINVAL, EWOULDBLOCK, ENOMEM, etc. so that doesn't happen.
> 
> 
> 2. We could do something like below where the low levels are not changed but the
> caller converts:
> 
> sd_pr_command()
> 	/* no changes */
> 
> lio()
> 	result = ops->pr_register()
> 	if (result > 0) {
> 		/* add some stacked helper again that goes through dm and
> 		 * to the low level device
> 		 */
> 		if (bdev == SCSI)
> 			result = scsi_result_to_blk_status(result)
> 		else
> 			result = nvme_error_status(result)
> 	}
> 
> 
> This looks simple, but it felt wrong to make upper layers know the
> device type and call conversion functions.

Has it been considered to introduce a new enumeration type instead of 
choosing (0), (1) or (2)?
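
A minimal sketch of what such an enumeration could look like, to make the
question concrete. Everything here is an assumption for illustration: the
name `enum pr_result`, its values, and the two conversion helpers are
hypothetical, and the status constants are defined locally as stand-ins. The
low-level drivers would each translate their own status, and upper layers
would only ever compare against the enum.

```c
#include <assert.h>

/* Stand-in values for this sketch; the kernel headers define the real ones. */
#define SAM_STAT_GOOD                 0x00
#define SAM_STAT_RESERVATION_CONFLICT 0x18
#define NVME_SC_SUCCESS               0x00
#define NVME_SC_RESERVATION_CONFLICT  0x83

/* Hypothetical device-independent result type for the pr_ops callouts. */
enum pr_result {
	PR_RES_OK,
	PR_RES_CONFLICT,	/* reservation conflict reported by the device */
	PR_RES_DEV_ERROR,	/* any other device/target status */
	PR_RES_NOT_SENT,	/* failed before reaching the device (-Exyz) */
};

/* SCSI side: translate a negative errno or a SAM status byte. */
enum pr_result scsi_to_pr_result(int result)
{
	if (result < 0)
		return PR_RES_NOT_SENT;
	switch (result) {
	case SAM_STAT_GOOD:
		return PR_RES_OK;
	case SAM_STAT_RESERVATION_CONFLICT:
		return PR_RES_CONFLICT;
	default:
		return PR_RES_DEV_ERROR;
	}
}

/* NVMe side: same idea with the NVMe status code. */
enum pr_result nvme_to_pr_result(int result)
{
	if (result < 0)
		return PR_RES_NOT_SENT;
	switch (result) {
	case NVME_SC_SUCCESS:
		return PR_RES_OK;
	case NVME_SC_RESERVATION_CONFLICT:
		return PR_RES_CONFLICT;
	default:
		return PR_RES_DEV_ERROR;
	}
}

/* Upper layer (LIO, dm): no knowledge of the device type is needed. */
int pr_result_is_conflict(enum pr_result res)
{
	return res == PR_RES_CONFLICT;
}
```

This keeps the device-specific knowledge in the drivers, but it does not by
itself settle what the ioctl path should hand back to userspace.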

Thanks,

Bart.

^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: [PATCH v2 09/20] nvme: Add helper to execute Reservation Report
  2022-08-09 16:21         ` [dm-devel] " Mike Christie
@ 2022-08-10  1:45           ` Chaitanya Kulkarni
  -1 siblings, 0 replies; 94+ messages in thread
From: Chaitanya Kulkarni @ 2022-08-10  1:45 UTC (permalink / raw)
  To: Mike Christie, Keith Busch
  Cc: bvanassche, linux-block, dm-devel, snitzer, axboe, hch,
	linux-nvme, martin.petersen, linux-scsi, james.bottomley

On 8/9/22 09:21, Mike Christie wrote:
> On 8/9/22 9:51 AM, Keith Busch wrote:
>> On Tue, Aug 09, 2022 at 10:56:55AM +0000, Chaitanya Kulkarni wrote:
>>> On 8/8/22 17:04, Mike Christie wrote:
>>>> +
>>>> +	c.common.opcode = nvme_cmd_resv_report;
>>>> +	c.common.cdw10 = cpu_to_le32(nvme_bytes_to_numd(data_len));
>>>> +	c.common.cdw11 = 1;
>>>> +	*eds = true;
>>>> +
>>>> +retry:
>>>> +	if (IS_ENABLED(CONFIG_NVME_MULTIPATH) &&
>>>> +	    bdev->bd_disk->fops == &nvme_ns_head_ops)
>>>> +		ret = nvme_send_ns_head_pr_command(bdev, &c, data, data_len);
>>>> +	else
>>>> +		ret = nvme_send_ns_pr_command(bdev->bd_disk->private_data, &c,
>>>> +					      data, data_len);
>>>> +	if (ret == NVME_SC_HOST_ID_INCONSIST && c.common.cdw11) {
>>>> +		c.common.cdw11 = 0;
>>>> +		*eds = false;
>>>> +		goto retry;
>>>
>>> Unconditional retries without any limit can create problems,
>>> perhaps consider adding some soft limits.
>>
>> It's already conditioned on cdw11, which is cleared to 0 on the 2nd try. Not
>> that that's particularly clear. I'd suggest naming an enum value for it so the
>> code tells us what the significance of cdw11 is in this context (it's the
>> Extended Data Structure control flag).
> 

True, my concern is that if the controller goes bad (not a common case, but it
is H/W after all) then we should have some soft limit to avoid infinite
retries.

> Will do that.
> 
> Chaitanya, for your comment, with a bad device we could hit an issue where we
> cleared the Extended Data Structure control flag and it also returned
> NVME_SC_HOST_ID_INCONSIST and we'd be in an infinite loop, so I'll handle that.
> 

that will be great.

-ck



^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: [PATCH v2 09/20] nvme: Add helper to execute Reservation Report
  2022-08-10  1:45           ` [dm-devel] " Chaitanya Kulkarni
@ 2022-08-10  3:17             ` Keith Busch
  -1 siblings, 0 replies; 94+ messages in thread
From: Keith Busch @ 2022-08-10  3:17 UTC (permalink / raw)
  To: Chaitanya Kulkarni
  Cc: Mike Christie, bvanassche, linux-block, dm-devel, snitzer, axboe,
	hch, linux-nvme, martin.petersen, linux-scsi, james.bottomley

On Wed, Aug 10, 2022 at 01:45:48AM +0000, Chaitanya Kulkarni wrote:
> On 8/9/22 09:21, Mike Christie wrote:
> > On 8/9/22 9:51 AM, Keith Busch wrote:
> >> On Tue, Aug 09, 2022 at 10:56:55AM +0000, Chaitanya Kulkarni wrote:
> >>> On 8/8/22 17:04, Mike Christie wrote:
> >>>> +
> >>>> +	c.common.opcode = nvme_cmd_resv_report;
> >>>> +	c.common.cdw10 = cpu_to_le32(nvme_bytes_to_numd(data_len));
> >>>> +	c.common.cdw11 = 1;
> >>>> +	*eds = true;
> >>>> +
> >>>> +retry:
> >>>> +	if (IS_ENABLED(CONFIG_NVME_MULTIPATH) &&
> >>>> +	    bdev->bd_disk->fops == &nvme_ns_head_ops)
> >>>> +		ret = nvme_send_ns_head_pr_command(bdev, &c, data, data_len);
> >>>> +	else
> >>>> +		ret = nvme_send_ns_pr_command(bdev->bd_disk->private_data, &c,
> >>>> +					      data, data_len);
> >>>> +	if (ret == NVME_SC_HOST_ID_INCONSIST && c.common.cdw11) {
> >>>> +		c.common.cdw11 = 0;
> >>>> +		*eds = false;
> >>>> +		goto retry;
> >>>
> >>> Unconditional retries without any limit can create problems,
> >>> perhaps consider adding some soft limits.
> >>
> >> It's already conditioned on cdw11, which is cleared to 0 on the 2nd try. Not
> >> that that's particularly clear. I'd suggest naming an enum value for it so the
> >> code tells us what the significance of cdw11 is in this context (it's the
> >> Extended Data Structure control flag).
> > 
> 
> True, my concern is that if the controller goes bad (not a common case, but it
> is H/W after all) then we should have some soft limit to avoid infinite
> retries.

cdw11 is '0' on the 2nd try, and the 'goto' is conditioned on cdw11 being
non-zero. There's no infinite retry here.
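
The bound can be checked with a small userspace model of the quoted loop.
This is a sketch under assumptions: `resv_report`, the `submit_fn` hook, and
the `RESV_REPORT_EDS` name are hypothetical, and the status constant is
defined locally as a stand-in. The stub controller fails every attempt with
the host-ID-inconsistent status, which is the worst case being discussed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define NVME_SC_HOST_ID_INCONSIST 0x18	/* stand-in value for the sketch */

/* Hypothetical name for the cdw11 bit requesting the extended data structure. */
enum { RESV_REPORT_EDS = 1 };

/* Hypothetical submit hook so the sketch can model a misbehaving controller. */
typedef int (*submit_fn)(uint32_t cdw11, void *ctx);

int resv_report(submit_fn submit, void *ctx, bool *eds)
{
	uint32_t cdw11 = RESV_REPORT_EDS;
	int ret;

	*eds = true;
retry:
	ret = submit(cdw11, ctx);
	if (ret == NVME_SC_HOST_ID_INCONSIST && cdw11) {
		/* Fall back once; cdw11 is now 0, so this branch cannot run again. */
		cdw11 = 0;
		*eds = false;
		goto retry;
	}
	return ret;
}

/* A controller that fails every attempt and counts how often it was asked. */
int always_inconsistent(uint32_t cdw11, void *ctx)
{
	(void)cdw11;
	++*(int *)ctx;
	return NVME_SC_HOST_ID_INCONSIST;
}
```

Calling `resv_report(always_inconsistent, &calls, &eds)` with `calls`
starting at zero leaves `calls` at 2: one attempt with the flag set and one
without, after which the error is returned to the caller.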

^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: [PATCH v2 04/20] scsi: Add support for block PR read keys/reservation.
  2022-08-09 19:26     ` [dm-devel] " Bart Van Assche
@ 2022-08-10  3:28       ` Mike Christie
  -1 siblings, 0 replies; 94+ messages in thread
From: Mike Christie @ 2022-08-10  3:28 UTC (permalink / raw)
  To: Bart Van Assche, linux-block, dm-devel, snitzer, axboe, hch,
	linux-nvme, martin.petersen, linux-scsi, james.bottomley

On 8/9/22 2:26 PM, Bart Van Assche wrote:
> On 8/8/22 17:04, Mike Christie wrote:
>> +static int sd_pr_in_command(struct block_device *bdev, u8 sa,
>> +                unsigned char *data, int data_len)
>> +{
>> +    struct scsi_disk *sdkp = scsi_disk(bdev->bd_disk);
>> +    struct scsi_device *sdev = sdkp->device;
>> +    struct scsi_sense_hdr sshdr;
>> +    u8 cmd[10] = { 0, };
>> +    int result;
> 
> Isn't "{ }" instead of "{ 0, }" the preferred way to zero-initialize a data structure?

The original code used { 0, } and that seems common in sd.c. { } was not used in sd.c.

I didn't see anything in coding-style.rst. It does not make any difference to me
other than it's better to be consistent unless we are supposed to be transitioning
to a new style.

> 
>> +
>> +    cmd[0] = PERSISTENT_RESERVE_IN;
>> +    cmd[1] = sa;
> 
> Can the above two assignments be moved into the initializer of cmd[]?
> 

Yes, but it was like the first comment. The original code didn't do
that and it seemed more common to not do it. Do we want to switch
or are we transitioning? It does not matter to me. Both are simple changes.


^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: [PATCH v2 12/20] block,nvme,scsi,dm: Add blk_status to pr_ops callouts.
  2022-08-09 19:33         ` [dm-devel] [PATCH v2 12/20] block, nvme, scsi, dm: " Bart Van Assche
@ 2022-08-10  3:34           ` Mike Christie
  -1 siblings, 0 replies; 94+ messages in thread
From: Mike Christie @ 2022-08-10  3:34 UTC (permalink / raw)
  To: Bart Van Assche, Christoph Hellwig
  Cc: linux-block, dm-devel, snitzer, axboe, linux-nvme,
	martin.petersen, linux-scsi, james.bottomley

On 8/9/22 2:33 PM, Bart Van Assche wrote:
> On 8/9/22 11:08, Mike Christie wrote:
>> On 8/9/22 2:21 AM, Christoph Hellwig wrote:
>>> On Mon, Aug 08, 2022 at 07:04:11PM -0500, Mike Christie wrote:
>>>> To handle both cases, this patch adds a blk_status_t arg to the pr_ops
>>>> callouts. The lower levels will convert their device specific error to
>>>> the blk_status_t then the upper levels can easily check that code
>>>> without knowing the device type. It also allows us to keep userspace
>>>> compat where it expects a negative -Exyz error code if the command fails
>>>> before it's sent to the device or a device/transport specific value if the
>>>> error is > 0.
>>>
>>> Why do we need two return values here?
>>
>> I know the 2 return values are gross :) I can do it in one, but I wasn't sure
>> what's worse. See below for the other possible solutions. I think they are all
>> bad.
>>
>>
>> 0. Convert device specific conflict error to -EBADE then back:
>>
>> sd_pr_command()
>>
>> .....
>>
>> /* would add similar check for NVME_SC_RESERVATION_CONFLICT in nvme */
>> if (result == SAM_STAT_RESERVATION_CONFLICT)
>>     return -EBADE;
>> else
>>     return result;
>>
>>
>> LIO then just checks for -EBADE but when going to userspace we have to
>> convert:
>>
>>
>> blkdev_pr_register()
>>
>> ...
>>     result = ops->pr_register()
>>     if (result < 0) {
>>         /* For compat we must convert back to the nvme/scsi code */
>>         if (result == -EBADE) {
>>             /* need some helper for this that calls down the stack */
>>             if (bdev == SCSI)
>>                 return SAM_STAT_RESERVATION_CONFLICT
>>             else
>>                 return NVME_SC_RESERVATION_CONFLICT
>>         } else
>>             return result
>>     } else
>>         return result;
>>
>>
>> The conversion is kind of gross and I was thinking in the future it's going
>> to get worse. I'm going to want to have more advanced error handling in LIO
>> and dm-multipath. For example, dm-multipath wants to know if a pr_op failed
>> because of a path failure, so it can retry on another path, or because of a
>> hard device/target error. It would be nice for LIO if, when a PGR had
>> bad/illegal values and the device returned an error, I could detect that.
>>
>>
>> 1. Drop the -Exyz error type and use blk_status_t in the kernel:
>>
>> sd_pr_command()
>>
>> .....
>> if (result < 0)
>>     return -errno_to_blk_status(result);
>> else if (result == SAM_STAT_RESERVATION_CONFLICT)
>>     return -BLK_STS_NEXUS;
>> else
>>     return result;
>>
>> blkdev_pr_register()
>>
>> ...
>>     result = ops->pr_register()
>>     if (result < 0) {
>>         /* For compat we must convert back to the nvme/scsi code */
>>         if (result == -BLK_STS_NEXUS) {
>>             /* need some helper for this that calls down the stack */
>>             if (bdev == SCSI)
>>                 return SAM_STAT_RESERVATION_CONFLICT
>>             else
>>                 return NVME_SC_RESERVATION_CONFLICT
>>         } else
>>             return blk_status_to_errno(-result)
>>     } else
>>         return result;
>>
>> This has similar issues as #0 where we have to convert before returning to
>> userspace.
>>
>>
>> Note: In this case, if the block layer uses an -Exyz error code that there's
>> no BLK_STS for, then we would return -EIO to userspace now. I was thinking
>> that might not be ok, but I could also just add a BLK_STS error code
>> for errors like EINVAL, EWOULDBLOCK, ENOMEM, etc. so that doesn't happen.
>>
>>
>> 2. We could do something like below where the low levels are not changed but the
>> caller converts:
>>
>> sd_pr_command()
>>     /* no changes */
>>
>> lio()
>>     result = ops->pr_register()
>>     if (result > 0) {
>>         /* add some stacked helper again that goes through dm and
>>          * to the low level device
>>          */
>>         if (bdev == SCSI)
>>             result = scsi_result_to_blk_status(result)
>>         else
>>             result = nvme_error_status(result)
>>     }
>>
>>
>> This looks simple, but it felt wrong to make upper layers know the
>> device type and call conversion functions.
> 
> Has it been considered to introduce a new enumeration type instead of choosing (0), (1) or (2)?
> 

The problem is that userspace currently gets the nvme status value or the
scsi_cmnd->result, which can contain host/status byte values like with SG_IO.
So you could just add a new enum, or add every possible error to blk_status_t,
but before passing the result back to userspace you would still have to
convert it to the format userspace is getting today. So for scsi devices, you
have to mimic the host_byte.
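
As a rough illustration of that constraint, here is a userspace sketch of the
two-value approach from the patch description: the raw return value goes back
unchanged for userspace compat, and a separate device-independent status is
filled in for in-kernel callers. All sketch_* names are hypothetical, the enum
only stands in for blk_status_t, and the status matching is simplified (a real
SCSI result also carries host byte bits that would need their own handling):

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Device status values, copied here so the sketch compiles on its own. */
#define SAM_STAT_RESERVATION_CONFLICT 0x18
#define NVME_SC_RESERVATION_CONFLICT  0x83

enum sketch_bus { SKETCH_BUS_SCSI, SKETCH_BUS_NVME };

/* Hypothetical device-independent status, standing in for blk_status_t. */
enum sketch_sts { SKETCH_STS_OK, SKETCH_STS_NEXUS, SKETCH_STS_IOERR };

/*
 * raw < 0: -Exyz, the command failed before it was sent to the device.
 * raw > 0: device/transport specific status.
 * The raw value is returned untouched; *sts is what upper layers check.
 */
static int sketch_pr_result(enum sketch_bus bus, int raw, enum sketch_sts *sts)
{
	assert(sts != NULL);

	if (raw == 0)
		*sts = SKETCH_STS_OK;
	else if (raw > 0 && bus == SKETCH_BUS_SCSI &&
		 (raw & 0xff) == SAM_STAT_RESERVATION_CONFLICT)
		*sts = SKETCH_STS_NEXUS;	/* status byte is the low byte */
	else if (raw > 0 && bus == SKETCH_BUS_NVME &&
		 raw == NVME_SC_RESERVATION_CONFLICT)
		*sts = SKETCH_STS_NEXUS;
	else
		*sts = SKETCH_STS_IOERR;	/* everything else, incl. -Exyz */

	return raw;
}
```

With something like this, a caller such as LIO would only look at *sts and
never needs to know the device type, while the ioctl path keeps returning the
same values it does today.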

* Re: [PATCH v2 09/20] nvme: Add helper to execute Reservation Report
  2022-08-10  3:17             ` [dm-devel] " Keith Busch
@ 2022-08-10  4:54               ` Chaitanya Kulkarni
  -1 siblings, 0 replies; 94+ messages in thread
From: Chaitanya Kulkarni @ 2022-08-10  4:54 UTC (permalink / raw)
  To: Keith Busch
  Cc: Mike Christie, bvanassche, linux-block, dm-devel, snitzer, axboe,
	hch, linux-nvme, martin.petersen, linux-scsi, james.bottomley


> cdw11 is '0' on the 2nd try, and the 'goto' is conditioned on cdw11 being
> non-zero. There's no infinite retry here.

Right I have misread the code, thanks for pointing that out.

-ck

end of thread, other threads:[~2022-08-10  4:54 UTC | newest]

Thread overview: 94+ messages
2022-08-09  0:03 [PATCH 0/20] Use block pr_ops in LIO Mike Christie
2022-08-09  0:03 ` [dm-devel] " Mike Christie
2022-08-09  0:04 ` [PATCH v2 01/20] block: Add PR callouts for read keys and reservation Mike Christie
2022-08-09  0:04   ` [dm-devel] " Mike Christie
2022-08-09  0:04 ` [PATCH v2 02/20] scsi: Rename sd_pr_command Mike Christie
2022-08-09  0:04   ` [dm-devel] " Mike Christie
2022-08-09 19:22   ` Bart Van Assche
2022-08-09 19:22     ` [dm-devel] " Bart Van Assche
2022-08-09  0:04 ` [PATCH v2 03/20] scsi: Move sd_pr_type to header to share Mike Christie
2022-08-09  0:04   ` [dm-devel] " Mike Christie
2022-08-09  0:04 ` [PATCH v2 04/20] scsi: Add support for block PR read keys/reservation Mike Christie
2022-08-09  0:04   ` [dm-devel] " Mike Christie
2022-08-09 19:26   ` Bart Van Assche
2022-08-09 19:26     ` [dm-devel] " Bart Van Assche
2022-08-10  3:28     ` Mike Christie
2022-08-10  3:28       ` [dm-devel] " Mike Christie
2022-08-09  0:04 ` [PATCH v2 05/20] dm: " Mike Christie
2022-08-09  0:04   ` [dm-devel] " Mike Christie
2022-08-09  0:04 ` [PATCH v2 06/20] nvme: Fix reservation status related structs Mike Christie
2022-08-09  0:04   ` [dm-devel] " Mike Christie
2022-08-09  7:19   ` Christoph Hellwig
2022-08-09  7:19     ` [dm-devel] " Christoph Hellwig
2022-08-09 11:09     ` Chaitanya Kulkarni
2022-08-09 11:09       ` [dm-devel] " Chaitanya Kulkarni
2022-08-09  0:04 ` [PATCH v2 07/20] nvme: Don't hardcode the data len for pr commands Mike Christie
2022-08-09  0:04   ` [dm-devel] " Mike Christie
2022-08-09  7:19   ` Christoph Hellwig
2022-08-09  7:19     ` [dm-devel] " Christoph Hellwig
2022-08-09  0:04 ` [PATCH v2 08/20] nvme: Add helper to convert to a pr_ops PR type Mike Christie
2022-08-09  0:04   ` [dm-devel] " Mike Christie
2022-08-09  7:20   ` Christoph Hellwig
2022-08-09  7:20     ` [dm-devel] " Christoph Hellwig
2022-08-09 11:12   ` Chaitanya Kulkarni
2022-08-09 11:12     ` [dm-devel] " Chaitanya Kulkarni
2022-08-09  0:04 ` [PATCH v2 09/20] nvme: Add helper to execute Reservation Report Mike Christie
2022-08-09  0:04   ` [dm-devel] " Mike Christie
2022-08-09 10:55   ` Chaitanya Kulkarni
2022-08-09 10:55     ` [dm-devel] " Chaitanya Kulkarni
2022-08-09 16:18     ` Mike Christie
2022-08-09 16:18       ` [dm-devel] " Mike Christie
2022-08-09 10:56   ` Chaitanya Kulkarni
2022-08-09 10:56     ` [dm-devel] " Chaitanya Kulkarni
2022-08-09 14:51     ` Keith Busch
2022-08-09 14:51       ` [dm-devel] " Keith Busch
2022-08-09 16:21       ` Mike Christie
2022-08-09 16:21         ` [dm-devel] " Mike Christie
2022-08-10  1:45         ` Chaitanya Kulkarni
2022-08-10  1:45           ` [dm-devel] " Chaitanya Kulkarni
2022-08-10  3:17           ` Keith Busch
2022-08-10  3:17             ` [dm-devel] " Keith Busch
2022-08-10  4:54             ` Chaitanya Kulkarni
2022-08-10  4:54               ` [dm-devel] " Chaitanya Kulkarni
2022-08-09  0:04 ` [PATCH v2 10/20] nvme: Add pr_ops read_keys support Mike Christie
2022-08-09  0:04   ` [dm-devel] " Mike Christie
2022-08-09  0:04 ` [PATCH v2 11/20] nvme: Add pr_ops read_reservation support Mike Christie
2022-08-09  0:04   ` [dm-devel] " Mike Christie
2022-08-09  0:04 ` [PATCH v2 12/20] block,nvme,scsi,dm: Add blk_status to pr_ops callouts Mike Christie
2022-08-09  0:04   ` [dm-devel] [PATCH v2 12/20] block, nvme, scsi, dm: " Mike Christie
2022-08-09  7:21   ` [PATCH v2 12/20] block,nvme,scsi,dm: " Christoph Hellwig
2022-08-09  7:21     ` [dm-devel] [PATCH v2 12/20] block, nvme, scsi, dm: " Christoph Hellwig
2022-08-09 18:08     ` [PATCH v2 12/20] block,nvme,scsi,dm: " Mike Christie
2022-08-09 18:08       ` [dm-devel] [PATCH v2 12/20] block, nvme, scsi, dm: " Mike Christie
2022-08-09 19:33       ` [PATCH v2 12/20] block,nvme,scsi,dm: " Bart Van Assche
2022-08-09 19:33         ` [dm-devel] [PATCH v2 12/20] block, nvme, scsi, dm: " Bart Van Assche
2022-08-10  3:34         ` [PATCH v2 12/20] block,nvme,scsi,dm: " Mike Christie
2022-08-10  3:34           ` [dm-devel] [PATCH v2 12/20] block, nvme, scsi, dm: " Mike Christie
2022-08-09  0:04 ` [PATCH v2 13/20] nvme: Have nvme pr_ops return a blk_status_t Mike Christie
2022-08-09  0:04   ` [dm-devel] " Mike Christie
2022-08-09 10:58   ` Chaitanya Kulkarni
2022-08-09 10:58     ` [dm-devel] " Chaitanya Kulkarni
2022-08-09  0:04 ` [PATCH v2 14/20] scsi: Retry pr_ops commands if a UA is returned Mike Christie
2022-08-09  0:04   ` [dm-devel] " Mike Christie
2022-08-09  7:16   ` Christoph Hellwig
2022-08-09  7:16     ` [dm-devel] " Christoph Hellwig
2022-08-09 16:24     ` Mike Christie
2022-08-09 16:24       ` [dm-devel] " Mike Christie
2022-08-09 19:31       ` Bart Van Assche
2022-08-09 19:31         ` [dm-devel] " Bart Van Assche
2022-08-09  0:04 ` [PATCH v2 15/20] scsi: Export scsi_result_to_blk_status Mike Christie
2022-08-09  0:04   ` [dm-devel] " Mike Christie
2022-08-09  0:04 ` [PATCH v2 16/20] scsi: Have sd pr_ops return a blk_status_t Mike Christie
2022-08-09  0:04   ` [dm-devel] " Mike Christie
2022-08-09  7:18   ` Christoph Hellwig
2022-08-09  7:18     ` [dm-devel] " Christoph Hellwig
2022-08-09 16:22     ` Mike Christie
2022-08-09 16:22       ` [dm-devel] " Mike Christie
2022-08-09  0:04 ` [PATCH v2 17/20] scsi: target: Rename sbc_ops to exec_cmd_ops Mike Christie
2022-08-09  0:04   ` [dm-devel] " Mike Christie
2022-08-09  0:04 ` [dm-devel] [PATCH v2 18/20] scsi: target: Allow backends to hook into PR handling Mike Christie
2022-08-09  0:04   ` Mike Christie
2022-08-09  0:04 ` [dm-devel] [PATCH v2 19/20] scsi: target: Don't support SCSI-2 RESERVE/RELEASE Mike Christie
2022-08-09  0:04   ` Mike Christie
2022-08-09  0:04 ` [PATCH v2 20/20] scsi: target: Add block PR support to iblock Mike Christie
2022-08-09  0:04   ` [dm-devel] " Mike Christie