All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH v3 00/15] host managed ZBC + doublestore
@ 2020-02-20 20:08 Douglas Gilbert
  2020-02-20 20:08 ` [PATCH v3 01/15] scsi_debug: randomize command completion time Douglas Gilbert
                   ` (15 more replies)
  0 siblings, 16 replies; 26+ messages in thread
From: Douglas Gilbert @ 2020-02-20 20:08 UTC (permalink / raw)
  To: linux-scsi; +Cc: martin.petersen, jejb, hare, Damien.LeMoal

This patchset is a follow-on from an earlier patchset titled:
    [RFC v2 0/6] scsi_debug: random doublestore verify
posted on the linux-scsi list on 20200109.

The major addition is support for host-managed ZBC devices.
The bulk of the work in this area was done by Damien Le Moal.
It allows ZBC devices with a mix of conventional and
"sequential write required" zones to be specified. These
follow the same model as direct access devices in this
driver. Namely, each device has its own metadata (including
its write pointer(s)) but all (scsi_debug) devices share the
same user data store (see doublestore below).

Significantly, this simulation passes the test/zbc_tests.sh
script in the https://github.com/hgst/libzbc repository.

The lower numbered patches in this set contain various
measures to improve the speed and usefulness of this driver.
It is being used to test the rewrite of the SCSI generic (sg)
driver which is still underway. Disk to disk copies are the
test of choice by the author [DG]. Comparing the results
between (simulated) disks is useless since all scsi_debug
devices share the same user data store. This limitation was
circumvented by adding the doublestore parameter. When set,
doublestore doubles the data store and allocates them in an
alternating pattern to each scsi_debug device. To enhance
the comparisons, simulations of VERIFY(10 and 16) commands
have been added. A further enhancement is to simulate the
PRE-FETCH command (which does nothing as the data is
already cached).

doublestore can also help with ZBC testing. Single threaded
copy commands like dd (and ddpt) can be used to copy one
non-empty zone into another empty zone. Multiple-threaded
copies (e.g. sgp_dd) can do out-of-order WRITEs which
become a WRITE VIOLATION error in a "sequential write
required" zone.

The author [DG] found that precise command duration timing
gave a false impression of how "bulletproof" the sg driver
state machines and locking were. The first patch involves
randomizing the command durations and it did expose various
issues in the driver under test (sg).

Since all scsi_debug memory store accesses are done in the
context of queuecommand() call, the *_irqsave() and
*_irqrestore() variants of the associated locks have been
removed.  That could be a problem if queuecommand() can ever
be called form an interrupt or related context.

Finally to address the discrepancy between command duration
times seen by the sg driver compared to what was set with
this driver's ndelay option, this driver's timekeeping for
short durations was made more accurate.

If and when this patchset is accepted, this page will be
updated:   http://sg.danny.cz/sg/sdebug26.html
This patchset is against Martin Petersen's git repository
and its 5.7/scsi-queue branch.

Changes since v2 (RFC):
  - add support for host-managed zbc devices with
    conventional and "sequential write required"
    zones [DLM]

Changes since v1 (RFC):
  - testing with version 1 caused several strange crashes that
    turned out to be caused by a code trick to read in the
    data-out buffer but _not_ place it in the big fake_storep
    array. This approach failed badly when multiple threads
    were doing verifies at the same time.
  - replace the code trick with a new do_dout_fetch() function
  - since the code trick was borrowed from the COMPARE AND
    WRITE implementation [resp_comp_write()] using
    do_dout_fetch() fixes the same bug in the existing driver
    which hasn't been reported (yet).

Damien Le Moal (3):
  scsi_debug: zone_max_open module parameter
  scsi_debug: zone_nr_conv module parameter
  scsi_debug: zone_size_mb module parameter

Douglas Gilbert (12):
  scsi_debug: randomize command completion time
  scsi_debug: add doublestore option
  scsi_debug: implement verify(10), add verify(16)
  scsi_debug: weaken rwlock around ramdisk access
  scsi_debug: improve command duration calculation
  scsi_debug: implement pre-fetch commands
  scsi_debug: expand zbc support
  scsi_debug: add zone commands
  scsi_debug: zbc module parameter
  scsi_debug: re-arrange parameters alphabetically
  scsi_debug: zbc parameter can be string
  scsi_debug: bump to version 1.89

 drivers/scsi/scsi_debug.c | 1593 ++++++++++++++++++++++++++++++++-----
 1 file changed, 1389 insertions(+), 204 deletions(-)

-- 
2.25.0


^ permalink raw reply	[flat|nested] 26+ messages in thread

* [PATCH v3 01/15] scsi_debug: randomize command completion time
  2020-02-20 20:08 [PATCH v3 00/15] host managed ZBC + doublestore Douglas Gilbert
@ 2020-02-20 20:08 ` Douglas Gilbert
  2020-02-21  9:45   ` Hannes Reinecke
  2020-02-20 20:08 ` [PATCH v3 02/15] scsi_debug: add doublestore option Douglas Gilbert
                   ` (14 subsequent siblings)
  15 siblings, 1 reply; 26+ messages in thread
From: Douglas Gilbert @ 2020-02-20 20:08 UTC (permalink / raw)
  To: linux-scsi; +Cc: martin.petersen, jejb, hare, Damien.LeMoal

Add a new command line option (e.g. random=1) and sysfs attribute
that causes subsequent command completion times to be between the
current command delay setting and 0. A unformily distributed 32
bit, kernel provided integer is used for this purpose.

Since the existing 'delay' whose units are jiffies (typically
milliseconds) and 'ndelay' (units: nanoseconds) options (and sysfs
attributes) span a range greater than 32 bits, some scaling is
required.

The purpose of this patch is to widen the range of testing cases
that are visited in long running tests. Put simply: rarely struct
race conditions are more likely to be found when this facility is
used.

The default is the previous case in which all command completions
were roughly equal to (if not, slightly longer) than the value
given by the 'delay' or 'ndelay' settings (or their defaults).
This option's default is equivalent to setting 'random=0' .

Signed-off-by: Douglas Gilbert <dgilbert@interlog.com>
---
 drivers/scsi/scsi_debug.c | 42 ++++++++++++++++++++++++++++++++++++---
 1 file changed, 39 insertions(+), 3 deletions(-)

diff --git a/drivers/scsi/scsi_debug.c b/drivers/scsi/scsi_debug.c
index 44cb054d5e66..18c07e9ace15 100644
--- a/drivers/scsi/scsi_debug.c
+++ b/drivers/scsi/scsi_debug.c
@@ -38,6 +38,7 @@
 #include <linux/hrtimer.h>
 #include <linux/uuid.h>
 #include <linux/t10-pi.h>
+#include <linux/random.h>
 
 #include <net/checksum.h>
 
@@ -125,6 +126,7 @@ static const char *sdebug_version_date = "20190125";
 #define DEF_PHYSBLK_EXP 0
 #define DEF_OPT_XFERLEN_EXP 0
 #define DEF_PTYPE   TYPE_DISK
+#define DEF_RANDOM false
 #define DEF_REMOVABLE false
 #define DEF_SCSI_LEVEL   7    /* INQUIRY, byte2 [6->SPC-4; 7->SPC-5] */
 #define DEF_SECTOR_SIZE 512
@@ -655,6 +657,7 @@ static unsigned int sdebug_unmap_max_blocks = DEF_UNMAP_MAX_BLOCKS;
 static unsigned int sdebug_unmap_max_desc = DEF_UNMAP_MAX_DESC;
 static unsigned int sdebug_write_same_length = DEF_WRITESAME_LENGTH;
 static int sdebug_uuid_ctl = DEF_UUID_CTL;
+static bool sdebug_random = DEF_RANDOM;
 static bool sdebug_removable = DEF_REMOVABLE;
 static bool sdebug_clustering;
 static bool sdebug_host_lock = DEF_HOST_LOCK;
@@ -4354,9 +4357,21 @@ static int schedule_resp(struct scsi_cmnd *cmnd, struct sdebug_dev_info *devip,
 		ktime_t kt;
 
 		if (delta_jiff > 0) {
-			kt = ns_to_ktime((u64)delta_jiff * (NSEC_PER_SEC / HZ));
-		} else
-			kt = ndelay;
+			u64 ns = jiffies_to_nsecs(delta_jiff);
+
+			if (sdebug_random && ns < U32_MAX) {
+				ns = prandom_u32_max((u32)ns);
+			} else if (sdebug_random) {
+				ns >>= 12;	/* scale to 4 usec precision */
+				if (ns < U32_MAX)	/* over 4 hours max */
+					ns = prandom_u32_max((u32)ns);
+				ns <<= 12;
+			}
+			kt = ns_to_ktime(ns);
+		} else {	/* ndelay has a 4.2 second max */
+			kt = sdebug_random ? prandom_u32_max((u32)ndelay) :
+					     (u32)ndelay;
+		}
 		if (!sd_dp->init_hrt) {
 			sd_dp->init_hrt = true;
 			sqcp->sd_dp = sd_dp;
@@ -4451,6 +4466,7 @@ module_param_named(opts, sdebug_opts, int, S_IRUGO | S_IWUSR);
 module_param_named(physblk_exp, sdebug_physblk_exp, int, S_IRUGO);
 module_param_named(opt_xferlen_exp, sdebug_opt_xferlen_exp, int, S_IRUGO);
 module_param_named(ptype, sdebug_ptype, int, S_IRUGO | S_IWUSR);
+module_param_named(random, sdebug_random, bool, S_IRUGO | S_IWUSR);
 module_param_named(removable, sdebug_removable, bool, S_IRUGO | S_IWUSR);
 module_param_named(scsi_level, sdebug_scsi_level, int, S_IRUGO);
 module_param_named(sector_size, sdebug_sector_size, int, S_IRUGO);
@@ -4511,6 +4527,7 @@ MODULE_PARM_DESC(opts, "1->noise, 2->medium_err, 4->timeout, 8->recovered_err...
 MODULE_PARM_DESC(physblk_exp, "physical block exponent (def=0)");
 MODULE_PARM_DESC(opt_xferlen_exp, "optimal transfer length granularity exponent (def=physblk_exp)");
 MODULE_PARM_DESC(ptype, "SCSI peripheral type(def=0[disk])");
+MODULE_PARM_DESC(random, "If set, uniformly randomize command duration between 0 and delay_in_ns");
 MODULE_PARM_DESC(removable, "claim to have removable media (def=0)");
 MODULE_PARM_DESC(scsi_level, "SCSI level to simulate(def=7[SPC-5])");
 MODULE_PARM_DESC(sector_size, "logical block size in bytes (def=512)");
@@ -5101,6 +5118,24 @@ static ssize_t map_show(struct device_driver *ddp, char *buf)
 }
 static DRIVER_ATTR_RO(map);
 
+static ssize_t random_show(struct device_driver *ddp, char *buf)
+{
+	return scnprintf(buf, PAGE_SIZE, "%d\n", (int)sdebug_random);
+}
+
+static ssize_t random_store(struct device_driver *ddp, const char *buf,
+			    size_t count)
+{
+	int n;
+
+	if (count > 0 && kstrtoint(buf, 10, &n) == 0 && n >= 0) {
+		sdebug_random = (n > 0);
+		return count;
+	}
+	return -EINVAL;
+}
+static DRIVER_ATTR_RW(random);
+
 static ssize_t removable_show(struct device_driver *ddp, char *buf)
 {
 	return scnprintf(buf, PAGE_SIZE, "%d\n", sdebug_removable ? 1 : 0);
@@ -5211,6 +5246,7 @@ static struct attribute *sdebug_drv_attrs[] = {
 	&driver_attr_guard.attr,
 	&driver_attr_ato.attr,
 	&driver_attr_map.attr,
+	&driver_attr_random.attr,
 	&driver_attr_removable.attr,
 	&driver_attr_host_lock.attr,
 	&driver_attr_ndelay.attr,
-- 
2.25.0


^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [PATCH v3 02/15] scsi_debug: add doublestore option
  2020-02-20 20:08 [PATCH v3 00/15] host managed ZBC + doublestore Douglas Gilbert
  2020-02-20 20:08 ` [PATCH v3 01/15] scsi_debug: randomize command completion time Douglas Gilbert
@ 2020-02-20 20:08 ` Douglas Gilbert
  2020-02-21  9:46   ` Hannes Reinecke
  2020-02-20 20:08 ` [PATCH v3 03/15] scsi_debug: implement verify(10), add verify(16) Douglas Gilbert
                   ` (13 subsequent siblings)
  15 siblings, 1 reply; 26+ messages in thread
From: Douglas Gilbert @ 2020-02-20 20:08 UTC (permalink / raw)
  To: linux-scsi; +Cc: martin.petersen, jejb, hare, Damien.LeMoal

The scsi_debug driver has always been restricted to using one
(or none) ramdisk image for its storage. This means that thousands
of scsi_debug devices can be created without exhausting the host
machine's RAM. The downside is that all scsi_debug devices share
the same ramdisk image. This option doubles the amount of ramdisk
storage space with the first, third, fifth (etc) created
scsi_debug devices using the first ramdisk image while the second,
fourth, sixth (etc) created scsi_debug devices using the second
ramdisk image.

The reason for doing this is to check that (partial) disk to disk
copies based on scsi_debug devices have actually worked properly.
As an example: assume /dev/sdb and /dev/sg1 are the same
scsi_debug device, while /dev/sdc and /dev/sg2 are also the
same scsi_debug device. With doublestore=1 they will have
different ramdisk images. Then the following pseudocode could
be executed to check the if sgh_dd copy worked:
    dd if=/dev/urandom of=/dev/sdb
    sgh_dd if=/dev/sg1 of=/dev/sg2 [plus option(s) to test]
    cmp /dev/sdb /dev/sdc

If the cmp fails then the copy has failed (or some other
mechanism wrote to /dev/sdb or /dev/sdc in the interim).

Signed-off-by: Douglas Gilbert <dgilbert@interlog.com>
---
 drivers/scsi/scsi_debug.c | 264 +++++++++++++++++++++++++++-----------
 1 file changed, 186 insertions(+), 78 deletions(-)

diff --git a/drivers/scsi/scsi_debug.c b/drivers/scsi/scsi_debug.c
index 18c07e9ace15..a32be83b22c7 100644
--- a/drivers/scsi/scsi_debug.c
+++ b/drivers/scsi/scsi_debug.c
@@ -108,6 +108,7 @@ static const char *sdebug_version_date = "20190125";
 #define DEF_DEV_SIZE_MB   8
 #define DEF_DIF 0
 #define DEF_DIX 0
+#define DEF_DOUBLESTORE false
 #define DEF_D_SENSE   0
 #define DEF_EVERY_NTH   0
 #define DEF_FAKE_RW	0
@@ -255,6 +256,7 @@ struct sdebug_dev_info {
 	unsigned long uas_bm[1];
 	atomic_t num_in_q;
 	atomic_t stopped;
+	int sdg_devnum;
 	bool used;
 };
 
@@ -633,6 +635,7 @@ static int sdebug_max_queue = SDEBUG_CANQUEUE;	/* per submit queue */
 static unsigned int sdebug_medium_error_start = OPT_MEDIUM_ERR_ADDR;
 static int sdebug_medium_error_count = OPT_MEDIUM_ERR_NUM;
 static atomic_t retired_max_queue;	/* if > 0 then was prior max_queue */
+static atomic_t a_sdg_devnum;
 static int sdebug_ndelay = DEF_NDELAY;	/* if > 0 then unit is nanoseconds */
 static int sdebug_no_lun_0 = DEF_NO_LUN_0;
 static int sdebug_no_uld;
@@ -658,6 +661,7 @@ static unsigned int sdebug_unmap_max_desc = DEF_UNMAP_MAX_DESC;
 static unsigned int sdebug_write_same_length = DEF_WRITESAME_LENGTH;
 static int sdebug_uuid_ctl = DEF_UUID_CTL;
 static bool sdebug_random = DEF_RANDOM;
+static bool sdebug_doublestore = DEF_DOUBLESTORE;
 static bool sdebug_removable = DEF_REMOVABLE;
 static bool sdebug_clustering;
 static bool sdebug_host_lock = DEF_HOST_LOCK;
@@ -681,7 +685,7 @@ static int sdebug_sectors_per;		/* sectors per cylinder */
 static LIST_HEAD(sdebug_host_list);
 static DEFINE_SPINLOCK(sdebug_host_list_lock);
 
-static unsigned char *fake_storep;	/* ramdisk storage */
+static u8 *fake_store_a[2];		/* ramdisk storage */
 static struct t10_pi_tuple *dif_storep;	/* protection info */
 static void *map_storep;		/* provisioning map */
 
@@ -699,6 +703,9 @@ static int submit_queues = DEF_SUBMIT_QUEUES;  /* > 1 for multi-queue (mq) */
 static struct sdebug_queue *sdebug_q_arr;  /* ptr to array of submit queues */
 
 static DEFINE_RWLOCK(atomic_rw);
+static DEFINE_RWLOCK(atomic_rw2);
+
+static rwlock_t *ramdisk_lck_a[2];
 
 static char sdebug_proc_name[] = MY_NAME;
 static const char *my_name = MY_NAME;
@@ -730,11 +737,11 @@ static inline bool scsi_debug_lbp(void)
 		(sdebug_lbpu || sdebug_lbpws || sdebug_lbpws10);
 }
 
-static void *lba2fake_store(unsigned long long lba)
+static void *lba2fake_store(unsigned long long lba, int acc_num)
 {
 	lba = do_div(lba, sdebug_store_sectors);
 
-	return fake_storep + lba * sdebug_sector_size;
+	return fake_store_a[acc_num % 2] + lba * sdebug_sector_size;
 }
 
 static struct t10_pi_tuple *dif_store(sector_t sector)
@@ -1043,7 +1050,7 @@ static int p_fill_from_dev_buffer(struct scsi_cmnd *scp, const void *arr,
 		 __func__, off_dst, scsi_bufflen(scp), act_len,
 		 scsi_get_resid(scp));
 	n = scsi_bufflen(scp) - (off_dst + act_len);
-	scsi_set_resid(scp, min(scsi_get_resid(scp), n));
+	scsi_set_resid(scp, min_t(int, scsi_get_resid(scp), n));
 	return 0;
 }
 
@@ -1535,7 +1542,7 @@ static int resp_inquiry(struct scsi_cmnd *scp, struct sdebug_dev_info *devip)
 	}
 	put_unaligned_be16(0x2100, arr + n);	/* SPL-4 no version claimed */
 	ret = fill_from_dev_buffer(scp, arr,
-			    min(alloc_len, SDEBUG_LONG_INQ_SZ));
+			    min_t(int, alloc_len, SDEBUG_LONG_INQ_SZ));
 	kfree(arr);
 	return ret;
 }
@@ -1690,7 +1697,7 @@ static int resp_readcap16(struct scsi_cmnd *scp,
 	}
 
 	return fill_from_dev_buffer(scp, arr,
-				    min(alloc_len, SDEBUG_READCAP16_ARR_SZ));
+			    min_t(int, alloc_len, SDEBUG_READCAP16_ARR_SZ));
 }
 
 #define SDEBUG_MAX_TGTPGS_ARR_SZ 1412
@@ -1764,9 +1771,9 @@ static int resp_report_tgtpgs(struct scsi_cmnd *scp,
 	 * - The constructed command length
 	 * - The maximum array size
 	 */
-	rlen = min(alen,n);
+	rlen = min_t(int, alen, n);
 	ret = fill_from_dev_buffer(scp, arr,
-				   min(rlen, SDEBUG_MAX_TGTPGS_ARR_SZ));
+			   min_t(int, rlen, SDEBUG_MAX_TGTPGS_ARR_SZ));
 	kfree(arr);
 	return ret;
 }
@@ -2268,7 +2275,7 @@ static int resp_mode_sense(struct scsi_cmnd *scp,
 		arr[0] = offset - 1;
 	else
 		put_unaligned_be16((offset - 2), arr + 0);
-	return fill_from_dev_buffer(scp, arr, min(alloc_len, offset));
+	return fill_from_dev_buffer(scp, arr, min_t(int, alloc_len, offset));
 }
 
 #define SDEBUG_MAX_MSELECT_SZ 512
@@ -2453,9 +2460,9 @@ static int resp_log_sense(struct scsi_cmnd *scp,
 		mk_sense_invalid_fld(scp, SDEB_IN_CDB, 3, -1);
 		return check_condition_result;
 	}
-	len = min(get_unaligned_be16(arr + 2) + 4, alloc_len);
+	len = min_t(int, get_unaligned_be16(arr + 2) + 4, alloc_len);
 	return fill_from_dev_buffer(scp, arr,
-		    min(len, SDEBUG_MAX_INQ_ARR_SZ));
+		    min_t(int, len, SDEBUG_MAX_INQ_ARR_SZ));
 }
 
 static inline int check_device_access_params(struct scsi_cmnd *scp,
@@ -2478,6 +2485,18 @@ static inline int check_device_access_params(struct scsi_cmnd *scp,
 	return 0;
 }
 
+static int scp2acc_num(struct scsi_cmnd *scp)
+{
+	if (sdebug_doublestore) {
+		struct scsi_device *sdp = scp->device;
+		struct sdebug_dev_info *devip =
+				(struct sdebug_dev_info *)sdp->hostdata;
+
+		return devip->sdg_devnum;
+	}
+	return 0;
+}
+
 /* Returns number of bytes copied or -1 if error. */
 static int do_device_access(struct scsi_cmnd *scmd, u32 sg_skip, u64 lba,
 			    u32 num, bool do_write)
@@ -2486,6 +2505,7 @@ static int do_device_access(struct scsi_cmnd *scmd, u32 sg_skip, u64 lba,
 	u64 block, rest = 0;
 	struct scsi_data_buffer *sdb = &scmd->sdb;
 	enum dma_data_direction dir;
+	u8 *fsp;
 
 	if (do_write) {
 		dir = DMA_TO_DEVICE;
@@ -2498,20 +2518,21 @@ static int do_device_access(struct scsi_cmnd *scmd, u32 sg_skip, u64 lba,
 		return 0;
 	if (scmd->sc_data_direction != dir)
 		return -1;
+	fsp = fake_store_a[scp2acc_num(scmd) % 2];
 
 	block = do_div(lba, sdebug_store_sectors);
 	if (block + num > sdebug_store_sectors)
 		rest = block + num - sdebug_store_sectors;
 
 	ret = sg_copy_buffer(sdb->table.sgl, sdb->table.nents,
-		   fake_storep + (block * sdebug_sector_size),
+		   fsp + (block * sdebug_sector_size),
 		   (num - rest) * sdebug_sector_size, sg_skip, do_write);
 	if (ret != (num - rest) * sdebug_sector_size)
 		return ret;
 
 	if (rest) {
 		ret += sg_copy_buffer(sdb->table.sgl, sdb->table.nents,
-			    fake_storep, rest * sdebug_sector_size,
+			    fsp, rest * sdebug_sector_size,
 			    sg_skip + ((num - rest) * sdebug_sector_size),
 			    do_write);
 	}
@@ -2519,34 +2540,48 @@ static int do_device_access(struct scsi_cmnd *scmd, u32 sg_skip, u64 lba,
 	return ret;
 }
 
+/* Returns number of bytes copied or -1 if error. */
+static int do_dout_fetch(struct scsi_cmnd *scmd, u32 num, u8 *doutp)
+{
+	struct scsi_data_buffer *sdb = &scmd->sdb;
+
+	if (!sdb->length)
+		return 0;
+	if (scmd->sc_data_direction != DMA_TO_DEVICE)
+		return -1;
+	return sg_copy_buffer(sdb->table.sgl, sdb->table.nents, doutp,
+			      num * sdebug_sector_size, 0, true);
+}
+
 /* If lba2fake_store(lba,num) compares equal to arr(num), then copy top half of
  * arr into lba2fake_store(lba,num) and return true. If comparison fails then
  * return false. */
-static bool comp_write_worker(u64 lba, u32 num, const u8 *arr)
+static bool comp_write_worker(u64 lba, u32 num, const u8 *arr, int acc_num)
 {
 	bool res;
 	u64 block, rest = 0;
 	u32 store_blks = sdebug_store_sectors;
 	u32 lb_size = sdebug_sector_size;
+	u8 *fsp;
 
 	block = do_div(lba, store_blks);
 	if (block + num > store_blks)
 		rest = block + num - store_blks;
 
-	res = !memcmp(fake_storep + (block * lb_size), arr,
-		      (num - rest) * lb_size);
+	fsp = fake_store_a[acc_num % 2];
+
+	res = !memcmp(fsp + (block * lb_size), arr, (num - rest) * lb_size);
 	if (!res)
 		return res;
 	if (rest)
-		res = memcmp(fake_storep, arr + ((num - rest) * lb_size),
+		res = memcmp(fsp, arr + ((num - rest) * lb_size),
 			     rest * lb_size);
 	if (!res)
 		return res;
 	arr += num * lb_size;
-	memcpy(fake_storep + (block * lb_size), arr, (num - rest) * lb_size);
+	memcpy(fsp + (block * lb_size), arr, (num - rest) * lb_size);
 	if (rest)
-		memcpy(fake_storep, arr + ((num - rest) * lb_size),
-		       rest * lb_size);
+		memcpy(fsp, arr + ((num - rest) * lb_size), rest * lb_size);
 	return res;
 }
 
@@ -2605,7 +2640,7 @@ static void dif_copy_prot(struct scsi_cmnd *SCpnt, sector_t sector,
 			(read ? SG_MITER_TO_SG : SG_MITER_FROM_SG));
 
 	while (sg_miter_next(&miter) && resid > 0) {
-		size_t len = min(miter.length, resid);
+		size_t len = min_t(size_t, miter.length, resid);
 		void *start = dif_store(sector);
 		size_t rest = 0;
 
@@ -2632,12 +2667,12 @@ static void dif_copy_prot(struct scsi_cmnd *SCpnt, sector_t sector,
 	sg_miter_stop(&miter);
 }
 
-static int prot_verify_read(struct scsi_cmnd *SCpnt, sector_t start_sec,
+static int prot_verify_read(struct scsi_cmnd *scp, sector_t start_sec,
 			    unsigned int sectors, u32 ei_lba)
 {
 	unsigned int i;
-	struct t10_pi_tuple *sdt;
 	sector_t sector;
+	struct t10_pi_tuple *sdt;
 
 	for (i = 0; i < sectors; i++, ei_lba++) {
 		int ret;
@@ -2648,14 +2683,15 @@ static int prot_verify_read(struct scsi_cmnd *SCpnt, sector_t start_sec,
 		if (sdt->app_tag == cpu_to_be16(0xffff))
 			continue;
 
-		ret = dif_verify(sdt, lba2fake_store(sector), sector, ei_lba);
+		ret = dif_verify(sdt, lba2fake_store(sector, scp2acc_num(scp)),
+				 sector, ei_lba);
 		if (ret) {
 			dif_errors++;
 			return ret;
 		}
 	}
 
-	dif_copy_prot(SCpnt, start_sec, sectors, true);
+	dif_copy_prot(scp, start_sec, sectors, true);
 	dix_reads++;
 
 	return 0;
@@ -2665,10 +2701,11 @@ static int resp_read_dt0(struct scsi_cmnd *scp, struct sdebug_dev_info *devip)
 {
 	u8 *cmd = scp->cmnd;
 	struct sdebug_queued_cmd *sqcp;
-	u64 lba;
 	u32 num;
 	u32 ei_lba;
+	int acc_num = scp2acc_num(scp);
 	unsigned long iflags;
+	u64 lba;
 	int ret;
 	bool check_prot;
 
@@ -2752,21 +2789,22 @@ static int resp_read_dt0(struct scsi_cmnd *scp, struct sdebug_dev_info *devip)
 		return check_condition_result;
 	}
 
-	read_lock_irqsave(&atomic_rw, iflags);
+	read_lock_irqsave(ramdisk_lck_a[acc_num % 2], iflags);
 
 	/* DIX + T10 DIF */
 	if (unlikely(sdebug_dix && scsi_prot_sg_count(scp))) {
 		int prot_ret = prot_verify_read(scp, lba, num, ei_lba);
 
 		if (prot_ret) {
-			read_unlock_irqrestore(&atomic_rw, iflags);
+			read_unlock_irqrestore(ramdisk_lck_a[acc_num % 2],
+					       iflags);
 			mk_sense_buffer(scp, ABORTED_COMMAND, 0x10, prot_ret);
 			return illegal_condition_result;
 		}
 	}
 
 	ret = do_device_access(scp, 0, lba, num, false);
-	read_unlock_irqrestore(&atomic_rw, iflags);
+	read_unlock_irqrestore(ramdisk_lck_a[acc_num % 2], iflags);
 	if (unlikely(ret == -1))
 		return DID_ERROR << 16;
 
@@ -2938,9 +2976,10 @@ static void map_region(sector_t lba, unsigned int len)
 	}
 }
 
-static void unmap_region(sector_t lba, unsigned int len)
+static void unmap_region(sector_t lba, unsigned int len, int acc_num)
 {
 	sector_t end = lba + len;
+	u8 *fsp = fake_store_a[acc_num % 2];
 
 	while (lba < end) {
 		unsigned long index = lba_to_map_index(lba);
@@ -2950,8 +2989,7 @@ static void unmap_region(sector_t lba, unsigned int len)
 		    index < map_size) {
 			clear_bit(index, map_storep);
 			if (sdebug_lbprz) {  /* for LBPRZ=2 return 0xff_s */
-				memset(fake_storep +
-				       lba * sdebug_sector_size,
+				memset(fsp + lba * sdebug_sector_size,
 				       (sdebug_lbprz & 1) ? 0 : 0xff,
 				       sdebug_sector_size *
 				       sdebug_unmap_granularity);
@@ -2968,13 +3006,14 @@ static void unmap_region(sector_t lba, unsigned int len)
 
 static int resp_write_dt0(struct scsi_cmnd *scp, struct sdebug_dev_info *devip)
 {
-	u8 *cmd = scp->cmnd;
-	u64 lba;
+	bool check_prot;
 	u32 num;
 	u32 ei_lba;
-	unsigned long iflags;
+	int acc_num = scp2acc_num(scp);
 	int ret;
-	bool check_prot;
+	unsigned long iflags;
+	u64 lba;
+	u8 *cmd = scp->cmnd;
 
 	switch (cmd[0]) {
 	case WRITE_16:
@@ -3030,14 +3069,15 @@ static int resp_write_dt0(struct scsi_cmnd *scp, struct sdebug_dev_info *devip)
 	ret = check_device_access_params(scp, lba, num, true);
 	if (ret)
 		return ret;
-	write_lock_irqsave(&atomic_rw, iflags);
+	write_lock_irqsave(ramdisk_lck_a[acc_num % 2], iflags);
 
 	/* DIX + T10 DIF */
 	if (unlikely(sdebug_dix && scsi_prot_sg_count(scp))) {
 		int prot_ret = prot_verify_write(scp, lba, num, ei_lba);
 
 		if (prot_ret) {
-			write_unlock_irqrestore(&atomic_rw, iflags);
+			write_unlock_irqrestore(ramdisk_lck_a[acc_num % 2],
+						iflags);
 			mk_sense_buffer(scp, ILLEGAL_REQUEST, 0x10, prot_ret);
 			return illegal_condition_result;
 		}
@@ -3046,7 +3086,7 @@ static int resp_write_dt0(struct scsi_cmnd *scp, struct sdebug_dev_info *devip)
 	ret = do_device_access(scp, 0, lba, num, true);
 	if (unlikely(scsi_debug_lbp()))
 		map_region(lba, num);
-	write_unlock_irqrestore(&atomic_rw, iflags);
+	write_unlock_irqrestore(ramdisk_lck_a[acc_num % 2], iflags);
 	if (unlikely(-1 == ret))
 		return DID_ERROR << 16;
 	else if (unlikely(sdebug_verbose &&
@@ -3092,6 +3132,7 @@ static int resp_write_scat(struct scsi_cmnd *scp,
 	u32 num, num_by, bt_len, lbdof_blen, sg_off, cum_lb;
 	u32 lb_size = sdebug_sector_size;
 	u32 ei_lba;
+	int acc_num = scp2acc_num(scp);
 	u64 lba;
 	unsigned long iflags;
 	int ret, res;
@@ -3155,7 +3196,7 @@ static int resp_write_scat(struct scsi_cmnd *scp,
 		goto err_out;
 	}
 
-	write_lock_irqsave(&atomic_rw, iflags);
+	write_lock_irqsave(ramdisk_lck_a[acc_num % 2], iflags);
 	sg_off = lbdof_blen;
 	/* Spec says Buffer xfer Length field in number of LBs in dout */
 	cum_lb = 0;
@@ -3238,7 +3279,7 @@ static int resp_write_scat(struct scsi_cmnd *scp,
 	}
 	ret = 0;
 err_out_unlock:
-	write_unlock_irqrestore(&atomic_rw, iflags);
+	write_unlock_irqrestore(ramdisk_lck_a[acc_num % 2], iflags);
 err_out:
 	kfree(lrdp);
 	return ret;
@@ -3247,27 +3288,30 @@ static int resp_write_scat(struct scsi_cmnd *scp,
 static int resp_write_same(struct scsi_cmnd *scp, u64 lba, u32 num,
 			   u32 ei_lba, bool unmap, bool ndob)
 {
-	int ret;
 	unsigned long iflags;
 	unsigned long long i;
-	u32 lb_size = sdebug_sector_size;
 	u64 block, lbaa;
+	u32 lb_size = sdebug_sector_size;
+	int ret;
+	int acc_num = scp2acc_num(scp);
 	u8 *fs1p;
+	u8 *fsp;
 
 	ret = check_device_access_params(scp, lba, num, true);
 	if (ret)
 		return ret;
 
-	write_lock_irqsave(&atomic_rw, iflags);
+	write_lock_irqsave(ramdisk_lck_a[acc_num % 2], iflags);
 
 	if (unmap && scsi_debug_lbp()) {
-		unmap_region(lba, num);
+		unmap_region(lba, num, acc_num);
 		goto out;
 	}
 	lbaa = lba;
 	block = do_div(lbaa, sdebug_store_sectors);
 	/* if ndob then zero 1 logical block, else fetch 1 logical block */
-	fs1p = fake_storep + (block * lb_size);
+	fsp = fake_store_a[acc_num % 2];
+	fs1p = fsp + (block * lb_size);
 	if (ndob) {
 		memset(fs1p, 0, lb_size);
 		ret = 0;
@@ -3275,7 +3319,7 @@ static int resp_write_same(struct scsi_cmnd *scp, u64 lba, u32 num,
 		ret = fetch_to_dev_buffer(scp, fs1p, lb_size);
 
 	if (-1 == ret) {
-		write_unlock_irqrestore(&atomic_rw, iflags);
+		write_unlock_irqrestore(ramdisk_lck_a[acc_num % 2], iflags);
 		return DID_ERROR << 16;
 	} else if (sdebug_verbose && !ndob && (ret < lb_size))
 		sdev_printk(KERN_INFO, scp->device,
@@ -3286,12 +3330,12 @@ static int resp_write_same(struct scsi_cmnd *scp, u64 lba, u32 num,
 	for (i = 1 ; i < num ; i++) {
 		lbaa = lba + i;
 		block = do_div(lbaa, sdebug_store_sectors);
-		memmove(fake_storep + (block * lb_size), fs1p, lb_size);
+		memmove(fsp + (block * lb_size), fs1p, lb_size);
 	}
 	if (scsi_debug_lbp())
 		map_region(lba, num);
 out:
-	write_unlock_irqrestore(&atomic_rw, iflags);
+	write_unlock_irqrestore(ramdisk_lck_a[acc_num % 2], iflags);
 
 	return 0;
 }
@@ -3403,7 +3447,6 @@ static int resp_comp_write(struct scsi_cmnd *scp,
 {
 	u8 *cmd = scp->cmnd;
 	u8 *arr;
-	u8 *fake_storep_hold;
 	u64 lba;
 	u32 dnum;
 	u32 lb_size = sdebug_sector_size;
@@ -3411,6 +3454,7 @@ static int resp_comp_write(struct scsi_cmnd *scp,
 	unsigned long iflags;
 	int ret;
 	int retval = 0;
+	int acc_num = scp2acc_num(scp);
 
 	lba = get_unaligned_be64(cmd + 2);
 	num = cmd[13];		/* 1 to a maximum of 255 logical blocks */
@@ -3437,14 +3481,9 @@ static int resp_comp_write(struct scsi_cmnd *scp,
 		return check_condition_result;
 	}
 
-	write_lock_irqsave(&atomic_rw, iflags);
+	write_lock_irqsave(ramdisk_lck_a[acc_num % 2], iflags);
 
-	/* trick do_device_access() to fetch both compare and write buffers
-	 * from data-in into arr. Safe (atomic) since write_lock held. */
-	fake_storep_hold = fake_storep;
-	fake_storep = arr;
-	ret = do_device_access(scp, 0, 0, dnum, true);
-	fake_storep = fake_storep_hold;
+	ret = do_dout_fetch(scp, dnum, arr);
 	if (ret == -1) {
 		retval = DID_ERROR << 16;
 		goto cleanup;
@@ -3452,7 +3491,7 @@ static int resp_comp_write(struct scsi_cmnd *scp,
 		sdev_printk(KERN_INFO, scp->device, "%s: compare_write: cdb "
 			    "indicated=%u, IO sent=%d bytes\n", my_name,
 			    dnum * lb_size, ret);
-	if (!comp_write_worker(lba, num, arr)) {
+	if (!comp_write_worker(lba, num, arr, scp2acc_num(scp))) {
 		mk_sense_buffer(scp, MISCOMPARE, MISCOMPARE_VERIFY_ASC, 0);
 		retval = check_condition_result;
 		goto cleanup;
@@ -3460,7 +3499,7 @@ static int resp_comp_write(struct scsi_cmnd *scp,
 	if (scsi_debug_lbp())
 		map_region(lba, num);
 cleanup:
-	write_unlock_irqrestore(&atomic_rw, iflags);
+	write_unlock_irqrestore(ramdisk_lck_a[acc_num % 2], iflags);
 	kfree(arr);
 	return retval;
 }
@@ -3477,6 +3516,7 @@ static int resp_unmap(struct scsi_cmnd *scp, struct sdebug_dev_info *devip)
 	struct unmap_block_desc *desc;
 	unsigned int i, payload_len, descriptors;
 	int ret;
+	int acc_num = scp2acc_num(scp);
 	unsigned long iflags;
 
 
@@ -3505,7 +3545,7 @@ static int resp_unmap(struct scsi_cmnd *scp, struct sdebug_dev_info *devip)
 
 	desc = (void *)&buf[8];
 
-	write_lock_irqsave(&atomic_rw, iflags);
+	write_lock_irqsave(ramdisk_lck_a[acc_num % 2], iflags);
 
 	for (i = 0 ; i < descriptors ; i++) {
 		unsigned long long lba = get_unaligned_be64(&desc[i].lba);
@@ -3515,13 +3555,13 @@ static int resp_unmap(struct scsi_cmnd *scp, struct sdebug_dev_info *devip)
 		if (ret)
 			goto out;
 
-		unmap_region(lba, num);
+		unmap_region(lba, num, scp2acc_num(scp));
 	}
 
 	ret = 0;
 
 out:
-	write_unlock_irqrestore(&atomic_rw, iflags);
+	write_unlock_irqrestore(ramdisk_lck_a[acc_num % 2], iflags);
 	kfree(buf);
 
 	return ret;
@@ -3891,6 +3931,7 @@ static int scsi_debug_slave_configure(struct scsi_device *sdp)
 	if (sdebug_no_uld)
 		sdp->no_uld_attach = 1;
 	config_cdb_len(sdp);
+	devip->sdg_devnum = atomic_inc_return(&a_sdg_devnum);
 	return 0;
 }
 
@@ -4146,8 +4187,7 @@ static int scsi_debug_host_reset(struct scsi_cmnd *SCpnt)
 	return SUCCESS;
 }
 
-static void __init sdebug_build_parts(unsigned char *ramp,
-				      unsigned long store_size)
+static void sdebug_build_parts(unsigned char *ramp, unsigned long store_size)
 {
 	struct partition *pp;
 	int starts[SDEBUG_MAX_PARTS + 2];
@@ -4436,6 +4476,7 @@ module_param_named(delay, sdebug_jdelay, int, S_IRUGO | S_IWUSR);
 module_param_named(dev_size_mb, sdebug_dev_size_mb, int, S_IRUGO);
 module_param_named(dif, sdebug_dif, int, S_IRUGO);
 module_param_named(dix, sdebug_dix, int, S_IRUGO);
+module_param_named(doublestore, sdebug_doublestore, bool, S_IRUGO | S_IWUSR);
 module_param_named(dsense, sdebug_dsense, int, S_IRUGO | S_IWUSR);
 module_param_named(every_nth, sdebug_every_nth, int, S_IRUGO | S_IWUSR);
 module_param_named(fake_rw, sdebug_fake_rw, int, S_IRUGO | S_IWUSR);
@@ -4499,6 +4540,7 @@ MODULE_PARM_DESC(dev_size_mb, "size in MiB of ram shared by devs(def=8)");
 MODULE_PARM_DESC(dif, "data integrity field type: 0-3 (def=0)");
 MODULE_PARM_DESC(dix, "data integrity extensions mask (def=0)");
 MODULE_PARM_DESC(dsense, "use descriptor sense format(def=0 -> fixed)");
+MODULE_PARM_DESC(doublestore, "If set, 2 data buffers allocated, devices alternate between the two");
 MODULE_PARM_DESC(every_nth, "timeout every nth command(def=0)");
 MODULE_PARM_DESC(fake_rw, "fake reads/writes instead of copying (def=0)");
 MODULE_PARM_DESC(guard, "protection checksum: 0=crc, 1=ip (def=0)");
@@ -4788,16 +4830,24 @@ static ssize_t fake_rw_store(struct device_driver *ddp, const char *buf,
 		n = (n > 0);
 		sdebug_fake_rw = (sdebug_fake_rw > 0);
 		if (sdebug_fake_rw != n) {
-			if ((0 == n) && (NULL == fake_storep)) {
-				unsigned long sz =
-					(unsigned long)sdebug_dev_size_mb *
-					1048576;
-
-				fake_storep = vzalloc(sz);
-				if (NULL == fake_storep) {
-					pr_err("out of memory, 9\n");
+			unsigned long sz = (unsigned long)sdebug_dev_size_mb *
+					   1048576;
+
+			if (n == 0 && !fake_store_a[0]) {
+				fake_store_a[0] = vzalloc(sz);
+				if (!fake_store_a[0])
+					return -ENOMEM;
+				if (sdebug_num_parts > 0)
+					sdebug_build_parts(fake_store_a[0], sz);
+			}
+			if (sdebug_doublestore && n == 0 && !fake_store_a[1]) {
+				fake_store_a[1] = vzalloc(sz);
+				if (!fake_store_a[1]) {
+					sdebug_doublestore = false;
 					return -ENOMEM;
 				}
+				if (sdebug_num_parts > 0)
+					sdebug_build_parts(fake_store_a[1], sz);
 			}
 			sdebug_fake_rw = n;
 		}
@@ -4848,6 +4898,47 @@ static ssize_t dev_size_mb_show(struct device_driver *ddp, char *buf)
 }
 static DRIVER_ATTR_RO(dev_size_mb);
 
+static ssize_t doublestore_show(struct device_driver *ddp, char *buf)
+{
+	return scnprintf(buf, PAGE_SIZE, "%d\n", (int)sdebug_doublestore);
+}
+
+static ssize_t doublestore_store(struct device_driver *ddp, const char *buf,
+				 size_t count)
+{
+	int n;
+
+	if (count > 0 && kstrtoint(buf, 10, &n) == 0 && n >= 0) {
+		unsigned long iflags;
+
+		if (sdebug_doublestore == (n > 0))
+			return count;	/* no state change */
+		if (n <= 0) {
+			write_lock_irqsave(ramdisk_lck_a[1], iflags);
+			sdebug_doublestore = false;
+			vfree(fake_store_a[1]);
+			fake_store_a[1] = NULL;
+			write_unlock_irqrestore(ramdisk_lck_a[1], iflags);
+		} else {
+			unsigned long sz = (unsigned long)sdebug_dev_size_mb *
+					   1048576;
+			u8 *fsp = vzalloc(sz);
+
+			if (!fsp)
+				return -ENOMEM;
+			if (sdebug_num_parts > 0)
+				sdebug_build_parts(fsp, sz);
+			write_lock_irqsave(ramdisk_lck_a[1], iflags);
+			fake_store_a[1] = fsp;
+			sdebug_doublestore = true;
+			write_unlock_irqrestore(ramdisk_lck_a[1], iflags);
+		}
+		return count;
+	}
+	return -EINVAL;
+}
+static DRIVER_ATTR_RW(doublestore);
+
 static ssize_t num_parts_show(struct device_driver *ddp, char *buf)
 {
 	return scnprintf(buf, PAGE_SIZE, "%d\n", sdebug_num_parts);
@@ -5225,6 +5316,7 @@ static struct attribute *sdebug_drv_attrs[] = {
 	&driver_attr_opts.attr,
 	&driver_attr_ptype.attr,
 	&driver_attr_dsense.attr,
+	&driver_attr_doublestore.attr,
 	&driver_attr_fake_rw.attr,
 	&driver_attr_no_lun_0.attr,
 	&driver_attr_num_tgts.attr,
@@ -5266,7 +5358,10 @@ static int __init scsi_debug_init(void)
 	int k;
 	int ret;
 
+	ramdisk_lck_a[0] = &atomic_rw;
+	ramdisk_lck_a[1] = &atomic_rw2;
 	atomic_set(&retired_max_queue, 0);
+	atomic_set(&a_sdg_devnum, 0);
 
 	if (sdebug_ndelay >= 1000 * 1000 * 1000) {
 		pr_warn("ndelay must be less than 1 second, ignored\n");
@@ -5363,14 +5458,25 @@ static int __init scsi_debug_init(void)
 	}
 
 	if (sdebug_fake_rw == 0) {
-		fake_storep = vzalloc(sz);
-		if (NULL == fake_storep) {
-			pr_err("out of memory, 1\n");
+		fake_store_a[0] = vzalloc(sz);
+		if (!fake_store_a[0]) {
 			ret = -ENOMEM;
 			goto free_q_arr;
 		}
 		if (sdebug_num_parts > 0)
-			sdebug_build_parts(fake_storep, sz);
+			sdebug_build_parts(fake_store_a[0], sz);
+		if (sdebug_doublestore) {
+			fake_store_a[1] = vzalloc(sz);
+			if (!fake_store_a[1]) {
+				sdebug_doublestore = false;
+				vfree(fake_store_a[0]);
+				fake_store_a[0] = NULL;
+				ret = -ENOMEM;
+				goto free_q_arr;
+			}
+			if (sdebug_num_parts > 0)
+				sdebug_build_parts(fake_store_a[1], sz);
+		}
 	}
 
 	if (sdebug_dix) {
@@ -5467,7 +5573,8 @@ static int __init scsi_debug_init(void)
 free_vm:
 	vfree(map_storep);
 	vfree(dif_storep);
-	vfree(fake_storep);
+	vfree(fake_store_a[1]);
+	vfree(fake_store_a[0]);
 free_q_arr:
 	kfree(sdebug_q_arr);
 	return ret;
@@ -5487,7 +5594,8 @@ static void __exit scsi_debug_exit(void)
 
 	vfree(map_storep);
 	vfree(dif_storep);
-	vfree(fake_storep);
+	vfree(fake_store_a[1]);
+	vfree(fake_store_a[0]);
 	kfree(sdebug_q_arr);
 }
 
-- 
2.25.0


^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [PATCH v3 03/15] scsi_debug: implement verify(10), add verify(16)
  2020-02-20 20:08 [PATCH v3 00/15] host managed ZBC + doublestore Douglas Gilbert
  2020-02-20 20:08 ` [PATCH v3 01/15] scsi_debug: randomize command completion time Douglas Gilbert
  2020-02-20 20:08 ` [PATCH v3 02/15] scsi_debug: add doublestore option Douglas Gilbert
@ 2020-02-20 20:08 ` Douglas Gilbert
  2020-02-21  9:49   ` Hannes Reinecke
  2020-02-20 20:08 ` [PATCH v3 04/15] scsi_debug: weaken rwlock around ramdisk access Douglas Gilbert
                   ` (12 subsequent siblings)
  15 siblings, 1 reply; 26+ messages in thread
From: Douglas Gilbert @ 2020-02-20 20:08 UTC (permalink / raw)
  To: linux-scsi; +Cc: martin.petersen, jejb, hare, Damien.LeMoal

With the addition of the doublestore option, the ability to check
whether the two different ramdisk images are the same or not
becomes useful. Prior to this patch VERIFY(10) always returned
true (i.e. the SCSI GOOD status) without checking. This option
adds support for BYTCHK equal to 1 and 3 . If the comparison
fails then a sense key of MISCOMPARE is returned as per the
T10 standards. Add support for the VERIFY(16).

Signed-off-by: Douglas Gilbert <dgilbert@interlog.com>
---
 drivers/scsi/scsi_debug.c | 103 +++++++++++++++++++++++++++++++++++---
 1 file changed, 95 insertions(+), 8 deletions(-)

diff --git a/drivers/scsi/scsi_debug.c b/drivers/scsi/scsi_debug.c
index a32be83b22c7..0cd27eca263a 100644
--- a/drivers/scsi/scsi_debug.c
+++ b/drivers/scsi/scsi_debug.c
@@ -342,7 +342,7 @@ enum sdeb_opcode_index {
 	SDEB_I_SERV_ACT_OUT_16 = 13,	/* add ...SERV_ACT_OUT_12 if needed */
 	SDEB_I_MAINT_IN = 14,
 	SDEB_I_MAINT_OUT = 15,
-	SDEB_I_VERIFY = 16,		/* 10 only */
+	SDEB_I_VERIFY = 16,		/* VERIFY(10), VERIFY(16) */
 	SDEB_I_VARIABLE_LEN = 17,	/* READ(32), WRITE(32), WR_SCAT(32) */
 	SDEB_I_RESERVE = 18,		/* 6, 10 */
 	SDEB_I_RELEASE = 19,		/* 6, 10 */
@@ -385,7 +385,8 @@ static const unsigned char opcode_ind_arr[256] = {
 	0, SDEB_I_VARIABLE_LEN,
 /* 0x80; 0x80->0x9f: 16 byte cdbs */
 	0, 0, 0, 0, 0, SDEB_I_ATA_PT, 0, 0,
-	SDEB_I_READ, SDEB_I_COMP_WRITE, SDEB_I_WRITE, 0, 0, 0, 0, 0,
+	SDEB_I_READ, SDEB_I_COMP_WRITE, SDEB_I_WRITE, 0,
+	0, 0, 0, SDEB_I_VERIFY,
 	0, SDEB_I_SYNC_CACHE, 0, SDEB_I_WRITE_SAME, 0, 0, 0, 0,
 	0, 0, 0, 0, 0, 0, SDEB_I_SERV_ACT_IN_16, SDEB_I_SERV_ACT_OUT_16,
 /* 0xa0; 0xa0->0xbf: 12 byte cdbs */
@@ -427,6 +428,7 @@ static int resp_report_tgtpgs(struct scsi_cmnd *, struct sdebug_dev_info *);
 static int resp_unmap(struct scsi_cmnd *, struct sdebug_dev_info *);
 static int resp_rsup_opcodes(struct scsi_cmnd *, struct sdebug_dev_info *);
 static int resp_rsup_tmfs(struct scsi_cmnd *, struct sdebug_dev_info *);
+static int resp_verify(struct scsi_cmnd *, struct sdebug_dev_info *);
 static int resp_write_same_10(struct scsi_cmnd *, struct sdebug_dev_info *);
 static int resp_write_same_16(struct scsi_cmnd *, struct sdebug_dev_info *);
 static int resp_comp_write(struct scsi_cmnd *, struct sdebug_dev_info *);
@@ -471,6 +473,12 @@ static const struct opcode_info_t write_iarr[] = {
 		   0xbf, 0xc7, 0, 0, 0, 0} },
 };
 
+static const struct opcode_info_t verify_iarr[] = {
+	{0, 0x2f, 0, F_D_OUT_MAYBE | FF_MEDIA_IO, resp_verify,/* VERIFY(10) */
+	    NULL, {10,  0xf7, 0xff, 0xff, 0xff, 0xff, 0xbf, 0xff, 0xff, 0xc7,
+		   0, 0, 0, 0, 0, 0} },
+};
+
 static const struct opcode_info_t sa_in_16_iarr[] = {
 	{0, 0x9e, 0x12, F_SA_LOW | F_D_IN, resp_get_lba_status, NULL,
 	    {16,  0x12, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
@@ -571,9 +579,10 @@ static const struct opcode_info_t opcode_info_arr[SDEB_I_LAST_ELEMENT + 1] = {
 /* 15 */
 	{0, 0, 0, F_INV_OP | FF_RESPOND, NULL, NULL, /* MAINT OUT */
 	    {0,  0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0} },
-	{0, 0x2f, 0, F_D_OUT_MAYBE | FF_MEDIA_IO, NULL, NULL, /* VERIFY(10) */
-	    {10,  0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xc7,
-	     0, 0, 0, 0, 0, 0} },
+	{ARRAY_SIZE(verify_iarr), 0x8f, 0,
+	    F_D_OUT_MAYBE | FF_MEDIA_IO, resp_verify,	/* VERIFY(16) */
+	    verify_iarr, {16,  0xf6, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+			  0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0x3f, 0xc7} },
 	{ARRAY_SIZE(vl_iarr), 0x7f, 0x9, F_SA_HIGH | F_D_IN | FF_MEDIA_IO,
 	    resp_read_dt0, vl_iarr,	/* VARIABLE LENGTH, READ(32) */
 	    {32,  0xc7, 0, 0, 0, 0, 0x3f, 0x18, 0x0, 0x9, 0xfe, 0, 0xff, 0xff,
@@ -2556,7 +2565,8 @@ static int do_dout_fetch(struct scsi_cmnd *scmd, u32 num, u8 *doutp)
 /* If lba2fake_store(lba,num) compares equal to arr(num), then copy top half of
  * arr into lba2fake_store(lba,num) and return true. If comparison fails then
  * return false. */
-static bool comp_write_worker(u64 lba, u32 num, const u8 *arr, int acc_num)
+static bool comp_write_worker(u64 lba, u32 num, const u8 *arr, int acc_num,
+			      bool compare_only)
 {
 	bool res;
 	u64 block, rest = 0;
@@ -2578,6 +2588,8 @@ static bool comp_write_worker(u64 lba, u32 num, const u8 *arr, int acc_num)
 			     rest * lb_size);
 	if (!res)
 		return res;
+	if (compare_only)
+		return true;
 	arr += num * lb_size;
 	memcpy(fsp + (block * lb_size), arr, (num - rest) * lb_size);
 	if (rest)
@@ -3491,7 +3503,7 @@ static int resp_comp_write(struct scsi_cmnd *scp,
 		sdev_printk(KERN_INFO, scp->device, "%s: compare_write: cdb "
 			    "indicated=%u, IO sent=%d bytes\n", my_name,
 			    dnum * lb_size, ret);
-	if (!comp_write_worker(lba, num, arr, scp2acc_num(scp))) {
+	if (!comp_write_worker(lba, num, arr, acc_num, false)) {
 		mk_sense_buffer(scp, MISCOMPARE, MISCOMPARE_VERIFY_ASC, 0);
 		retval = check_condition_result;
 		goto cleanup;
@@ -3555,7 +3567,7 @@ static int resp_unmap(struct scsi_cmnd *scp, struct sdebug_dev_info *devip)
 		if (ret)
 			goto out;
 
-		unmap_region(lba, num, scp2acc_num(scp));
+		unmap_region(lba, num, acc_num);
 	}
 
 	ret = 0;
@@ -3736,6 +3748,81 @@ static int resp_report_luns(struct scsi_cmnd *scp,
 	return res;
 }
 
+static int resp_verify(struct scsi_cmnd *scp, struct sdebug_dev_info *devip)
+{
+	bool is_bytchk3 = false;
+	u8 bytchk;
+	int ret, j;
+	int acc_num = scp2acc_num(scp);
+	u32 vnum, a_num, off;
+	const u32 lb_size = sdebug_sector_size;
+	unsigned long iflags;
+	u64 lba;
+	u8 *arr;
+	u8 *cmd = scp->cmnd;
+
+	bytchk = (cmd[1] >> 1) & 0x3;
+	if (bytchk == 0) {
+		return 0;	/* always claim internal verify okay */
+	} else if (bytchk == 2) {
+		mk_sense_invalid_fld(scp, SDEB_IN_CDB, 2, 2);
+		return check_condition_result;
+	} else if (bytchk == 3) {
+		is_bytchk3 = true;	/* 1 block sent, compared repeatedly */
+	}
+	switch (cmd[0]) {
+	case VERIFY_16:
+		lba = get_unaligned_be64(cmd + 2);
+		vnum = get_unaligned_be32(cmd + 10);
+		break;
+	case VERIFY:		/* is VERIFY(10) */
+		lba = get_unaligned_be32(cmd + 2);
+		vnum = get_unaligned_be16(cmd + 7);
+		break;
+	default:
+		mk_sense_invalid_opcode(scp);
+		return check_condition_result;
+	}
+	a_num = is_bytchk3 ? 1 : vnum;
+	/* Treat following check like one for read (i.e. no write) access */
+	ret = check_device_access_params(scp, lba, a_num, false);
+	if (ret)
+		return ret;
+
+	arr = kcalloc(lb_size, vnum, GFP_ATOMIC);
+	if (!arr) {
+		mk_sense_buffer(scp, ILLEGAL_REQUEST, INSUFF_RES_ASC,
+				INSUFF_RES_ASCQ);
+		return check_condition_result;
+	}
+	/* Not changing store, so only need read access */
+	read_lock_irqsave(ramdisk_lck_a[acc_num % 2], iflags);
+
+	ret = do_dout_fetch(scp, a_num, arr);
+	if (ret == -1) {
+		ret = DID_ERROR << 16;
+		goto cleanup;
+	} else if (sdebug_verbose && (ret < (a_num * lb_size))) {
+		sdev_printk(KERN_INFO, scp->device,
+			    "%s: %s: cdb indicated=%u, IO sent=%d bytes\n",
+			    my_name, __func__, a_num * lb_size, ret);
+	}
+	if (is_bytchk3) {
+		for (j = 1, off = lb_size; j < vnum; ++j, off += lb_size)
+			memcpy(arr + off, arr, lb_size);
+	}
+	ret = 0;
+	if (!comp_write_worker(lba, vnum, arr, acc_num, true)) {
+		mk_sense_buffer(scp, MISCOMPARE, MISCOMPARE_VERIFY_ASC, 0);
+		ret = check_condition_result;
+		goto cleanup;
+	}
+cleanup:
+	read_unlock_irqrestore(ramdisk_lck_a[acc_num % 2], iflags);
+	kfree(arr);
+	return ret;
+}
+
 static struct sdebug_queue *get_queue(struct scsi_cmnd *cmnd)
 {
 	u32 tag = blk_mq_unique_tag(cmnd->request);
-- 
2.25.0


^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [PATCH v3 04/15] scsi_debug: weaken rwlock around ramdisk access
  2020-02-20 20:08 [PATCH v3 00/15] host managed ZBC + doublestore Douglas Gilbert
                   ` (2 preceding siblings ...)
  2020-02-20 20:08 ` [PATCH v3 03/15] scsi_debug: implement verify(10), add verify(16) Douglas Gilbert
@ 2020-02-20 20:08 ` Douglas Gilbert
  2020-02-20 20:08 ` [PATCH v3 05/15] scsi_debug: improve command duration calculation Douglas Gilbert
                   ` (11 subsequent siblings)
  15 siblings, 0 replies; 26+ messages in thread
From: Douglas Gilbert @ 2020-02-20 20:08 UTC (permalink / raw)
  To: linux-scsi; +Cc: martin.petersen, jejb, hare, Damien.LeMoal

The design of this driver is to do any ramdisk access on the
same thread that invoked the queuecommand() call. That is
assumed to be user space context. The command
duration is implemented by setting the delay with a high
resolution timer. The hr timer's callback may well be in
interrupt context, but it doesn't touch the ramdisk. So try
removing the _irqsave()/_irqrestore() portion on the read-
write lock that protects ramdisk access.

Signed-off-by: Douglas Gilbert <dgilbert@interlog.com>
---
 drivers/scsi/scsi_debug.c | 59 +++++++++++++++------------------------
 1 file changed, 22 insertions(+), 37 deletions(-)

diff --git a/drivers/scsi/scsi_debug.c b/drivers/scsi/scsi_debug.c
index 0cd27eca263a..c6b0a7d021cd 100644
--- a/drivers/scsi/scsi_debug.c
+++ b/drivers/scsi/scsi_debug.c
@@ -2716,7 +2716,6 @@ static int resp_read_dt0(struct scsi_cmnd *scp, struct sdebug_dev_info *devip)
 	u32 num;
 	u32 ei_lba;
 	int acc_num = scp2acc_num(scp);
-	unsigned long iflags;
 	u64 lba;
 	int ret;
 	bool check_prot;
@@ -2801,22 +2800,21 @@ static int resp_read_dt0(struct scsi_cmnd *scp, struct sdebug_dev_info *devip)
 		return check_condition_result;
 	}
 
-	read_lock_irqsave(ramdisk_lck_a[acc_num % 2], iflags);
+	read_lock(ramdisk_lck_a[acc_num % 2]);
 
 	/* DIX + T10 DIF */
 	if (unlikely(sdebug_dix && scsi_prot_sg_count(scp))) {
 		int prot_ret = prot_verify_read(scp, lba, num, ei_lba);
 
 		if (prot_ret) {
-			read_unlock_irqrestore(ramdisk_lck_a[acc_num % 2],
-					       iflags);
+			read_unlock(ramdisk_lck_a[acc_num % 2]);
 			mk_sense_buffer(scp, ABORTED_COMMAND, 0x10, prot_ret);
 			return illegal_condition_result;
 		}
 	}
 
 	ret = do_device_access(scp, 0, lba, num, false);
-	read_unlock_irqrestore(ramdisk_lck_a[acc_num % 2], iflags);
+	read_unlock(ramdisk_lck_a[acc_num % 2]);
 	if (unlikely(ret == -1))
 		return DID_ERROR << 16;
 
@@ -3023,7 +3021,6 @@ static int resp_write_dt0(struct scsi_cmnd *scp, struct sdebug_dev_info *devip)
 	u32 ei_lba;
 	int acc_num = scp2acc_num(scp);
 	int ret;
-	unsigned long iflags;
 	u64 lba;
 	u8 *cmd = scp->cmnd;
 
@@ -3081,15 +3078,14 @@ static int resp_write_dt0(struct scsi_cmnd *scp, struct sdebug_dev_info *devip)
 	ret = check_device_access_params(scp, lba, num, true);
 	if (ret)
 		return ret;
-	write_lock_irqsave(ramdisk_lck_a[acc_num % 2], iflags);
+	write_lock(ramdisk_lck_a[acc_num % 2]);
 
 	/* DIX + T10 DIF */
 	if (unlikely(sdebug_dix && scsi_prot_sg_count(scp))) {
 		int prot_ret = prot_verify_write(scp, lba, num, ei_lba);
 
 		if (prot_ret) {
-			write_unlock_irqrestore(ramdisk_lck_a[acc_num % 2],
-						iflags);
+			write_unlock(ramdisk_lck_a[acc_num % 2]);
 			mk_sense_buffer(scp, ILLEGAL_REQUEST, 0x10, prot_ret);
 			return illegal_condition_result;
 		}
@@ -3098,7 +3094,7 @@ static int resp_write_dt0(struct scsi_cmnd *scp, struct sdebug_dev_info *devip)
 	ret = do_device_access(scp, 0, lba, num, true);
 	if (unlikely(scsi_debug_lbp()))
 		map_region(lba, num);
-	write_unlock_irqrestore(ramdisk_lck_a[acc_num % 2], iflags);
+	write_unlock(ramdisk_lck_a[acc_num % 2]);
 	if (unlikely(-1 == ret))
 		return DID_ERROR << 16;
 	else if (unlikely(sdebug_verbose &&
@@ -3146,7 +3142,6 @@ static int resp_write_scat(struct scsi_cmnd *scp,
 	u32 ei_lba;
 	int acc_num = scp2acc_num(scp);
 	u64 lba;
-	unsigned long iflags;
 	int ret, res;
 	bool is_16;
 	static const u32 lrd_size = 32; /* + parameter list header size */
@@ -3208,7 +3203,7 @@ static int resp_write_scat(struct scsi_cmnd *scp,
 		goto err_out;
 	}
 
-	write_lock_irqsave(ramdisk_lck_a[acc_num % 2], iflags);
+	write_lock(ramdisk_lck_a[acc_num % 2]);
 	sg_off = lbdof_blen;
 	/* Spec says Buffer xfer Length field in number of LBs in dout */
 	cum_lb = 0;
@@ -3291,7 +3286,7 @@ static int resp_write_scat(struct scsi_cmnd *scp,
 	}
 	ret = 0;
 err_out_unlock:
-	write_unlock_irqrestore(ramdisk_lck_a[acc_num % 2], iflags);
+	write_unlock(ramdisk_lck_a[acc_num % 2]);
 err_out:
 	kfree(lrdp);
 	return ret;
@@ -3300,7 +3295,6 @@ static int resp_write_scat(struct scsi_cmnd *scp,
 static int resp_write_same(struct scsi_cmnd *scp, u64 lba, u32 num,
 			   u32 ei_lba, bool unmap, bool ndob)
 {
-	unsigned long iflags;
 	unsigned long long i;
 	u64 block, lbaa;
 	u32 lb_size = sdebug_sector_size;
@@ -3313,7 +3307,7 @@ static int resp_write_same(struct scsi_cmnd *scp, u64 lba, u32 num,
 	if (ret)
 		return ret;
 
-	write_lock_irqsave(ramdisk_lck_a[acc_num % 2], iflags);
+	write_lock(ramdisk_lck_a[acc_num % 2]);
 
 	if (unmap && scsi_debug_lbp()) {
 		unmap_region(lba, num, acc_num);
@@ -3331,7 +3325,7 @@ static int resp_write_same(struct scsi_cmnd *scp, u64 lba, u32 num,
 		ret = fetch_to_dev_buffer(scp, fs1p, lb_size);
 
 	if (-1 == ret) {
-		write_unlock_irqrestore(ramdisk_lck_a[acc_num % 2], iflags);
+		write_unlock(ramdisk_lck_a[acc_num % 2]);
 		return DID_ERROR << 16;
 	} else if (sdebug_verbose && !ndob && (ret < lb_size))
 		sdev_printk(KERN_INFO, scp->device,
@@ -3347,7 +3341,7 @@ static int resp_write_same(struct scsi_cmnd *scp, u64 lba, u32 num,
 	if (scsi_debug_lbp())
 		map_region(lba, num);
 out:
-	write_unlock_irqrestore(ramdisk_lck_a[acc_num % 2], iflags);
+	write_unlock(ramdisk_lck_a[acc_num % 2]);
 
 	return 0;
 }
@@ -3463,7 +3457,6 @@ static int resp_comp_write(struct scsi_cmnd *scp,
 	u32 dnum;
 	u32 lb_size = sdebug_sector_size;
 	u8 num;
-	unsigned long iflags;
 	int ret;
 	int retval = 0;
 	int acc_num = scp2acc_num(scp);
@@ -3493,7 +3486,7 @@ static int resp_comp_write(struct scsi_cmnd *scp,
 		return check_condition_result;
 	}
 
-	write_lock_irqsave(ramdisk_lck_a[acc_num % 2], iflags);
+	write_lock(ramdisk_lck_a[acc_num % 2]);
 
 	ret = do_dout_fetch(scp, dnum, arr);
 	if (ret == -1) {
@@ -3511,7 +3504,7 @@ static int resp_comp_write(struct scsi_cmnd *scp,
 	if (scsi_debug_lbp())
 		map_region(lba, num);
 cleanup:
-	write_unlock_irqrestore(ramdisk_lck_a[acc_num % 2], iflags);
+	write_unlock(ramdisk_lck_a[acc_num % 2]);
 	kfree(arr);
 	return retval;
 }
@@ -3529,8 +3522,6 @@ static int resp_unmap(struct scsi_cmnd *scp, struct sdebug_dev_info *devip)
 	unsigned int i, payload_len, descriptors;
 	int ret;
 	int acc_num = scp2acc_num(scp);
-	unsigned long iflags;
-
 
 	if (!scsi_debug_lbp())
 		return 0;	/* fib and say its done */
@@ -3557,7 +3548,7 @@ static int resp_unmap(struct scsi_cmnd *scp, struct sdebug_dev_info *devip)
 
 	desc = (void *)&buf[8];
 
-	write_lock_irqsave(ramdisk_lck_a[acc_num % 2], iflags);
+	write_lock(ramdisk_lck_a[acc_num % 2]);
 
 	for (i = 0 ; i < descriptors ; i++) {
 		unsigned long long lba = get_unaligned_be64(&desc[i].lba);
@@ -3573,7 +3564,7 @@ static int resp_unmap(struct scsi_cmnd *scp, struct sdebug_dev_info *devip)
 	ret = 0;
 
 out:
-	write_unlock_irqrestore(ramdisk_lck_a[acc_num % 2], iflags);
+	write_unlock(ramdisk_lck_a[acc_num % 2]);
 	kfree(buf);
 
 	return ret;
@@ -3756,7 +3747,6 @@ static int resp_verify(struct scsi_cmnd *scp, struct sdebug_dev_info *devip)
 	int acc_num = scp2acc_num(scp);
 	u32 vnum, a_num, off;
 	const u32 lb_size = sdebug_sector_size;
-	unsigned long iflags;
 	u64 lba;
 	u8 *arr;
 	u8 *cmd = scp->cmnd;
@@ -3796,7 +3786,7 @@ static int resp_verify(struct scsi_cmnd *scp, struct sdebug_dev_info *devip)
 		return check_condition_result;
 	}
 	/* Not changing store, so only need read access */
-	read_lock_irqsave(ramdisk_lck_a[acc_num % 2], iflags);
+	read_lock(ramdisk_lck_a[acc_num % 2]);
 
 	ret = do_dout_fetch(scp, a_num, arr);
 	if (ret == -1) {
@@ -3818,7 +3808,7 @@ static int resp_verify(struct scsi_cmnd *scp, struct sdebug_dev_info *devip)
 		goto cleanup;
 	}
 cleanup:
-	read_unlock_irqrestore(ramdisk_lck_a[acc_num % 2], iflags);
+	read_unlock(ramdisk_lck_a[acc_num % 2]);
 	kfree(arr);
 	return ret;
 }
@@ -4467,9 +4457,6 @@ static int schedule_resp(struct scsi_cmnd *cmnd, struct sdebug_dev_info *devip,
 
 	cmnd->result = pfp != NULL ? pfp(cmnd, devip) : 0;
 	if (cmnd->result & SDEG_RES_IMMED_MASK) {
-		/*
-		 * This is the F_DELAY_OVERR case. No delay.
-		 */
 		cmnd->result &= ~SDEG_RES_IMMED_MASK;
 		delta_jiff = ndelay = 0;
 	}
@@ -4996,16 +4983,14 @@ static ssize_t doublestore_store(struct device_driver *ddp, const char *buf,
 	int n;
 
 	if (count > 0 && kstrtoint(buf, 10, &n) == 0 && n >= 0) {
-		unsigned long iflags;
-
 		if (sdebug_doublestore == (n > 0))
 			return count;	/* no state change */
 		if (n <= 0) {
-			write_lock_irqsave(ramdisk_lck_a[1], iflags);
+			write_lock(ramdisk_lck_a[1]);
 			sdebug_doublestore = false;
 			vfree(fake_store_a[1]);
 			fake_store_a[1] = NULL;
-			write_unlock_irqrestore(ramdisk_lck_a[1], iflags);
+			write_unlock(ramdisk_lck_a[1]);
 		} else {
 			unsigned long sz = (unsigned long)sdebug_dev_size_mb *
 					   1048576;
@@ -5015,10 +5000,10 @@ static ssize_t doublestore_store(struct device_driver *ddp, const char *buf,
 				return -ENOMEM;
 			if (sdebug_num_parts > 0)
 				sdebug_build_parts(fsp, sz);
-			write_lock_irqsave(ramdisk_lck_a[1], iflags);
+			write_lock(ramdisk_lck_a[1]);
 			fake_store_a[1] = fsp;
 			sdebug_doublestore = true;
-			write_unlock_irqrestore(ramdisk_lck_a[1], iflags);
+			write_unlock(ramdisk_lck_a[1]);
 		}
 		return count;
 	}
@@ -5954,7 +5939,7 @@ static int scsi_debug_queuecommand(struct Scsi_Host *shost,
 		pfp = r_pfp;    /* if leaf function ptr NULL, try the root's */
 
 fini:
-	if (F_DELAY_OVERR & flags)
+	if (F_DELAY_OVERR & flags)	/* cmds like INQUIRY respond asap */
 		return schedule_resp(scp, devip, errsts, pfp, 0, 0);
 	else if ((flags & F_LONG_DELAY) && (sdebug_jdelay > 0 ||
 					    sdebug_ndelay > 10000)) {
-- 
2.25.0


^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [PATCH v3 05/15] scsi_debug: improve command duration calculation
  2020-02-20 20:08 [PATCH v3 00/15] host managed ZBC + doublestore Douglas Gilbert
                   ` (3 preceding siblings ...)
  2020-02-20 20:08 ` [PATCH v3 04/15] scsi_debug: weaken rwlock around ramdisk access Douglas Gilbert
@ 2020-02-20 20:08 ` Douglas Gilbert
  2020-02-20 20:08 ` [PATCH v3 06/15] scsi_debug: implement pre-fetch commands Douglas Gilbert
                   ` (10 subsequent siblings)
  15 siblings, 0 replies; 26+ messages in thread
From: Douglas Gilbert @ 2020-02-20 20:08 UTC (permalink / raw)
  To: linux-scsi; +Cc: martin.petersen, jejb, hare, Damien.LeMoal

Previously the code did the work implied by the given SCSI
command and after that it waited for a timer based on the user
specified command duration to be exhausted before informing
the mid-level that the command was complete. For short command
durations the time to complete the work implied by the SCSI
command could be significant compared to the user specified
command duration.

For example a WRITE of 128 blocks (say 512 bytes each) on a
machine that can copy from main memory to main memory at a rate
of 10 GB/sec will take around 6.4 microseconds to do that copy.
If the user specified a command duration of 5 microseconds
(ndelay=5000) should the driver do a further delay of 5
microseconds after the copy or return immediately because
6.4 > 5 ?

The action prior to this patch was to always do the timer based
delay. After this patch, for ndelay values less than 1
millisecond, this driver will complete the command immediately.
And in the case where the user specified delay was 7
microseconds, a timer delay of 600 nanoseconds will be set
((7 - 6.4) * 1000).

Signed-off-by: Douglas Gilbert <dgilbert@interlog.com>
---
 drivers/scsi/scsi_debug.c | 31 +++++++++++++++++++++++++++++--
 1 file changed, 29 insertions(+), 2 deletions(-)

diff --git a/drivers/scsi/scsi_debug.c b/drivers/scsi/scsi_debug.c
index c6b0a7d021cd..6193a88f9e24 100644
--- a/drivers/scsi/scsi_debug.c
+++ b/drivers/scsi/scsi_debug.c
@@ -4366,6 +4366,8 @@ static void setup_inject(struct sdebug_queue *sqp,
 	sqcp->inj_cmd_abort = !!(SDEBUG_OPT_CMD_ABORT & sdebug_opts);
 }
 
+#define INCLUSIVE_TIMING_MAX_NS 1000000		/* 1 millisecond */
+
 /* Complete the processing of the thread that queued a SCSI command to this
  * driver. It either completes the command by calling cmnd_done() or
  * schedules a hr timer or work queue then returns 0. Returns
@@ -4377,8 +4379,10 @@ static int schedule_resp(struct scsi_cmnd *cmnd, struct sdebug_dev_info *devip,
 				    struct sdebug_dev_info *),
 			 int delta_jiff, int ndelay)
 {
-	unsigned long iflags;
+	bool new_sd_dp;
 	int k, num_in_q, qdepth, inject;
+	unsigned long iflags;
+	u64 ns_from_boot = 0;
 	struct sdebug_queue *sqp;
 	struct sdebug_queued_cmd *sqcp;
 	struct scsi_device *sdp;
@@ -4394,7 +4398,6 @@ static int schedule_resp(struct scsi_cmnd *cmnd, struct sdebug_dev_info *devip,
 	if (delta_jiff == 0)
 		goto respond_in_thread;
 
-	/* schedule the response at a later time if resources permit */
 	sqp = get_queue(cmnd);
 	spin_lock_irqsave(&sqp->qc_lock, iflags);
 	if (unlikely(atomic_read(&sqp->blocked))) {
@@ -4453,8 +4456,15 @@ static int schedule_resp(struct scsi_cmnd *cmnd, struct sdebug_dev_info *devip,
 		sd_dp = kzalloc(sizeof(*sd_dp), GFP_ATOMIC);
 		if (sd_dp == NULL)
 			return SCSI_MLQUEUE_HOST_BUSY;
+		new_sd_dp = true;
+	} else {
+		new_sd_dp = false;
 	}
 
+	if (ndelay > 0 && ndelay < INCLUSIVE_TIMING_MAX_NS)
+		ns_from_boot = ktime_get_boottime_ns();
+
+	/* one of the resp_*() response functions is called here */
 	cmnd->result = pfp != NULL ? pfp(cmnd, devip) : 0;
 	if (cmnd->result & SDEG_RES_IMMED_MASK) {
 		cmnd->result &= ~SDEG_RES_IMMED_MASK;
@@ -4485,6 +4495,22 @@ static int schedule_resp(struct scsi_cmnd *cmnd, struct sdebug_dev_info *devip,
 		} else {	/* ndelay has a 4.2 second max */
 			kt = sdebug_random ? prandom_u32_max((u32)ndelay) :
 					     (u32)ndelay;
+			if (ndelay < INCLUSIVE_TIMING_MAX_NS) {
+				u64 d = ktime_get_boottime_ns() - ns_from_boot;
+
+				if (kt <= d) {	/* elapsed duration >= kt */
+					sqcp->a_cmnd = NULL;
+					atomic_dec(&devip->num_in_q);
+					clear_bit(k, sqp->in_use_bm);
+					if (new_sd_dp)
+						kfree(sd_dp);
+					/* call scsi_done() from this thread */
+					cmnd->scsi_done(cmnd);
+					return 0;
+				}
+				/* otherwise reduce kt by elapsed time */
+				kt -= d;
+			}
 		}
 		if (!sd_dp->init_hrt) {
 			sd_dp->init_hrt = true;
@@ -4498,6 +4524,7 @@ static int schedule_resp(struct scsi_cmnd *cmnd, struct sdebug_dev_info *devip,
 		if (sdebug_statistics)
 			sd_dp->issuing_cpu = raw_smp_processor_id();
 		sd_dp->defer_t = SDEB_DEFER_HRT;
+		/* schedule the invocation of scsi_done() for a later time */
 		hrtimer_start(&sd_dp->hrt, kt, HRTIMER_MODE_REL_PINNED);
 	} else {	/* jdelay < 0, use work queue */
 		if (!sd_dp->init_wq) {
-- 
2.25.0


^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [PATCH v3 06/15] scsi_debug: implement pre-fetch commands
  2020-02-20 20:08 [PATCH v3 00/15] host managed ZBC + doublestore Douglas Gilbert
                   ` (4 preceding siblings ...)
  2020-02-20 20:08 ` [PATCH v3 05/15] scsi_debug: improve command duration calculation Douglas Gilbert
@ 2020-02-20 20:08 ` Douglas Gilbert
  2020-02-20 20:08 ` [PATCH v3 07/15] scsi_debug: expand zbc support Douglas Gilbert
                   ` (9 subsequent siblings)
  15 siblings, 0 replies; 26+ messages in thread
From: Douglas Gilbert @ 2020-02-20 20:08 UTC (permalink / raw)
  To: linux-scsi; +Cc: martin.petersen, jejb, hare, Damien.LeMoal

Many disks implement the SCSI PRE-FETCH commands. One use case
might be a disk-to-disk compare, say between disks A and B.
Then this sequence of commands might be used:
PRE-FETCH(from B, IMMED), READ(from A), VERIFY (BYTCHK=1 on B
with data returned from READ). The PRE-FETCH (which returns
quickly due to the IMMED) fetches the data from the media into
B's cache which should speed the trailing VERIFY command.
The next chunk of the compare might be done in parallel,
with A and B reversed.

Signed-off-by: Douglas Gilbert <dgilbert@interlog.com>
---
 drivers/scsi/scsi_debug.c | 54 +++++++++++++++++++++++++++++++++++----
 1 file changed, 49 insertions(+), 5 deletions(-)

diff --git a/drivers/scsi/scsi_debug.c b/drivers/scsi/scsi_debug.c
index 6193a88f9e24..6568ad7cfb56 100644
--- a/drivers/scsi/scsi_debug.c
+++ b/drivers/scsi/scsi_debug.c
@@ -355,7 +355,8 @@ enum sdeb_opcode_index {
 	SDEB_I_WRITE_SAME = 26,		/* 10, 16 */
 	SDEB_I_SYNC_CACHE = 27,		/* 10, 16 */
 	SDEB_I_COMP_WRITE = 28,
-	SDEB_I_LAST_ELEMENT = 29,	/* keep this last (previous + 1) */
+	SDEB_I_PRE_FETCH = 29,		/* 10, 16 */
+	SDEB_I_LAST_ELEM_P1 = 30,	/* keep this last (previous + 1) */
 };
 
 
@@ -371,7 +372,7 @@ static const unsigned char opcode_ind_arr[256] = {
 /* 0x20; 0x20->0x3f: 10 byte cdbs */
 	0, 0, 0, 0, 0, SDEB_I_READ_CAPACITY, 0, 0,
 	SDEB_I_READ, 0, SDEB_I_WRITE, 0, 0, 0, 0, SDEB_I_VERIFY,
-	0, 0, 0, 0, 0, SDEB_I_SYNC_CACHE, 0, 0,
+	0, 0, 0, 0, SDEB_I_PRE_FETCH, SDEB_I_SYNC_CACHE, 0, 0,
 	0, 0, 0, SDEB_I_WRITE_BUFFER, 0, 0, 0, 0,
 /* 0x40; 0x40->0x5f: 10 byte cdbs */
 	0, SDEB_I_WRITE_SAME, SDEB_I_UNMAP, 0, 0, 0, 0, 0,
@@ -387,7 +388,7 @@ static const unsigned char opcode_ind_arr[256] = {
 	0, 0, 0, 0, 0, SDEB_I_ATA_PT, 0, 0,
 	SDEB_I_READ, SDEB_I_COMP_WRITE, SDEB_I_WRITE, 0,
 	0, 0, 0, SDEB_I_VERIFY,
-	0, SDEB_I_SYNC_CACHE, 0, SDEB_I_WRITE_SAME, 0, 0, 0, 0,
+	SDEB_I_PRE_FETCH, SDEB_I_SYNC_CACHE, 0, SDEB_I_WRITE_SAME, 0, 0, 0, 0,
 	0, 0, 0, 0, 0, 0, SDEB_I_SERV_ACT_IN_16, SDEB_I_SERV_ACT_OUT_16,
 /* 0xa0; 0xa0->0xbf: 12 byte cdbs */
 	SDEB_I_REPORT_LUNS, SDEB_I_ATA_PT, 0, SDEB_I_MAINT_IN,
@@ -434,6 +435,7 @@ static int resp_write_same_16(struct scsi_cmnd *, struct sdebug_dev_info *);
 static int resp_comp_write(struct scsi_cmnd *, struct sdebug_dev_info *);
 static int resp_write_buffer(struct scsi_cmnd *, struct sdebug_dev_info *);
 static int resp_sync_cache(struct scsi_cmnd *, struct sdebug_dev_info *);
+static int resp_pre_fetch(struct scsi_cmnd *, struct sdebug_dev_info *);
 
 /*
  * The following are overflow arrays for cdbs that "hit" the same index in
@@ -525,11 +527,17 @@ static const struct opcode_info_t sync_cache_iarr[] = {
 	     0xff, 0xff, 0xff, 0xff, 0x3f, 0xc7} },	/* SYNC_CACHE (16) */
 };
 
+static const struct opcode_info_t pre_fetch_iarr[] = {
+	{0, 0x90, 0, F_SYNC_DELAY | F_M_ACCESS, resp_pre_fetch, NULL,
+	    {16,  0x2, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+	     0xff, 0xff, 0xff, 0xff, 0x3f, 0xc7} },	/* PRE-FETCH (16) */
+};
+
 
 /* This array is accessed via SDEB_I_* values. Make sure all are mapped,
  * plus the terminating elements for logic that scans this table such as
  * REPORT SUPPORTED OPERATION CODES. */
-static const struct opcode_info_t opcode_info_arr[SDEB_I_LAST_ELEMENT + 1] = {
+static const struct opcode_info_t opcode_info_arr[SDEB_I_LAST_ELEM_P1 + 1] = {
 /* 0 */
 	{0, 0, 0, F_INV_OP | FF_RESPOND, NULL, NULL,	/* unknown opcodes */
 	    {0,  0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0} },
@@ -621,8 +629,12 @@ static const struct opcode_info_t opcode_info_arr[SDEB_I_LAST_ELEMENT + 1] = {
 	{0, 0x89, 0, F_D_OUT | FF_MEDIA_IO, resp_comp_write, NULL,
 	    {16,  0xf8, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0, 0,
 	     0, 0xff, 0x3f, 0xc7} },		/* COMPARE AND WRITE */
+	{ARRAY_SIZE(pre_fetch_iarr), 0x34, 0, F_SYNC_DELAY | F_M_ACCESS,
+	    resp_pre_fetch, pre_fetch_iarr,
+	    {10,  0x2, 0xff, 0xff, 0xff, 0xff, 0x3f, 0xff, 0xff, 0xc7, 0, 0,
+	     0, 0, 0, 0} },			/* PRE-FETCH (10) */
 
-/* 29 */
+/* 30 */
 	{0xff, 0, 0, 0, NULL, NULL,		/* terminating element */
 	    {0,  0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0} },
 };
@@ -735,6 +747,8 @@ static const int illegal_condition_result =
 static const int device_qfull_result =
 	(DID_OK << 16) | (COMMAND_COMPLETE << 8) | SAM_STAT_TASK_SET_FULL;
 
+static const int condition_met_result = SAM_STAT_CONDITION_MET;
+
 
 /* Only do the extra work involved in logical block provisioning if one or
  * more of the lbpu, lbpws or lbpws10 parameters are given and we are doing
@@ -3638,6 +3652,36 @@ static int resp_sync_cache(struct scsi_cmnd *scp,
 	return res;
 }
 
+/*
+ * Assuming the LBA+num_blocks is not out-of-range, this function will return
+ * CONDITION MET if the specified blocks will/have fitted in the cache, and
+ * a GOOD status otherwise. Model a disk with a big cache and yield
+ * CONDITION MET.
+ */
+static int resp_pre_fetch(struct scsi_cmnd *scp,
+			  struct sdebug_dev_info *devip)
+{
+	int res = 0;
+	u64 lba;
+	u32 num_blocks;
+	u8 *cmd = scp->cmnd;
+
+	if (cmd[0] == PRE_FETCH) {	/* 10 byte cdb */
+		lba = get_unaligned_be32(cmd + 2);
+		num_blocks = get_unaligned_be16(cmd + 7);
+	} else {			/* PRE-FETCH(16) */
+		lba = get_unaligned_be64(cmd + 2);
+		num_blocks = get_unaligned_be32(cmd + 10);
+	}
+	if (lba + num_blocks > sdebug_capacity) {
+		mk_sense_buffer(scp, ILLEGAL_REQUEST, LBA_OUT_OF_RANGE, 0);
+		return check_condition_result;
+	}
+	if (cmd[1] & 0x2)
+		res = SDEG_RES_IMMED_MASK;
+	return res | condition_met_result;
+}
+
 #define RL_BUCKET_ELEMS 8
 
 /* Even though each pseudo target has a REPORT LUNS "well known logical unit"
-- 
2.25.0


^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [PATCH v3 07/15] scsi_debug: expand zbc support
  2020-02-20 20:08 [PATCH v3 00/15] host managed ZBC + doublestore Douglas Gilbert
                   ` (5 preceding siblings ...)
  2020-02-20 20:08 ` [PATCH v3 06/15] scsi_debug: implement pre-fetch commands Douglas Gilbert
@ 2020-02-20 20:08 ` Douglas Gilbert
  2020-02-20 20:08 ` [PATCH v3 08/15] scsi_debug: add zone commands Douglas Gilbert
                   ` (8 subsequent siblings)
  15 siblings, 0 replies; 26+ messages in thread
From: Douglas Gilbert @ 2020-02-20 20:08 UTC (permalink / raw)
  To: linux-scsi; +Cc: martin.petersen, jejb, hare, Damien.LeMoal

The ZBC standard "piggy-backs" on many, but not all, of the
facilities in SBC. Add those ZBC mode pages (plus mode
parameter block descriptors (e.g. "WP")) and VPD pages in
common with SBC. Add ZBC specific VPD page.

Signed-off-by: Douglas Gilbert <dgilbert@interlog.com>
---
 drivers/scsi/scsi_debug.c | 55 +++++++++++++++++++++++++++++++--------
 1 file changed, 44 insertions(+), 11 deletions(-)

diff --git a/drivers/scsi/scsi_debug.c b/drivers/scsi/scsi_debug.c
index 6568ad7cfb56..5a720d2a14c4 100644
--- a/drivers/scsi/scsi_debug.c
+++ b/drivers/scsi/scsi_debug.c
@@ -1416,6 +1416,24 @@ static int inquiry_vpd_b2(unsigned char *arr)
 	return 0x4;
 }
 
+/* Zoned block device characteristics VPD page (ZBC mandatory) */
+static int inquiry_vpd_b6(unsigned char *arr)
+{
+	memset(arr, 0, 0x3c);
+	arr[0] = 0x1; /* set URSWRZ (unrestricted read in seq. wr req zone) */
+	/*
+	 * Set Optimal number of open sequential write preferred zones and
+	 * Optimal number of non-sequentially written sequential write
+	 * preferred zones and Maximum number of open sequential write
+	 * required zones fields to 'not reported' (0xffffffff). Leave other
+	 * fields set to zero.
+	 */
+	put_unaligned_be32(0xffffffff, &arr[4]);
+	put_unaligned_be32(0xffffffff, &arr[8]);
+	put_unaligned_be32(0xffffffff, &arr[12]);
+	return 0x3c;
+}
+
 #define SDEBUG_LONG_INQ_SZ 96
 #define SDEBUG_MAX_INQ_ARR_SZ 584
 
@@ -1425,13 +1443,15 @@ static int resp_inquiry(struct scsi_cmnd *scp, struct sdebug_dev_info *devip)
 	unsigned char *arr;
 	unsigned char *cmd = scp->cmnd;
 	int alloc_len, n, ret;
-	bool have_wlun, is_disk;
+	bool have_wlun, is_disk, is_zbc, is_disk_zbc;
 
 	alloc_len = get_unaligned_be16(cmd + 3);
 	arr = kzalloc(SDEBUG_MAX_INQ_ARR_SZ, GFP_ATOMIC);
 	if (! arr)
 		return DID_REQUEUE << 16;
 	is_disk = (sdebug_ptype == TYPE_DISK);
+	is_zbc = (sdebug_ptype == TYPE_ZBC);
+	is_disk_zbc = (is_disk || is_zbc);
 	have_wlun = scsi_is_wlun(scp->device->lun);
 	if (have_wlun)
 		pq_pdt = TYPE_WLUN;	/* present, wlun */
@@ -1469,11 +1489,14 @@ static int resp_inquiry(struct scsi_cmnd *scp, struct sdebug_dev_info *devip)
 			arr[n++] = 0x86;  /* extended inquiry */
 			arr[n++] = 0x87;  /* mode page policy */
 			arr[n++] = 0x88;  /* SCSI ports */
-			if (is_disk) {	  /* SBC only */
+			if (is_disk_zbc) {	  /* SBC or ZBC */
 				arr[n++] = 0x89;  /* ATA information */
 				arr[n++] = 0xb0;  /* Block limits */
 				arr[n++] = 0xb1;  /* Block characteristics */
-				arr[n++] = 0xb2;  /* Logical Block Prov */
+				if (is_disk)
+					arr[n++] = 0xb2;  /* LB Provisioning */
+				else if (is_zbc)
+					arr[n++] = 0xb6;  /* ZB dev. char. */
 			}
 			arr[3] = n - 4;	  /* number of supported VPD pages */
 		} else if (0x80 == cmd[2]) { /* unit serial number */
@@ -1512,19 +1535,22 @@ static int resp_inquiry(struct scsi_cmnd *scp, struct sdebug_dev_info *devip)
 		} else if (0x88 == cmd[2]) { /* SCSI Ports */
 			arr[1] = cmd[2];	/*sanity */
 			arr[3] = inquiry_vpd_88(&arr[4], target_dev_id);
-		} else if (is_disk && 0x89 == cmd[2]) { /* ATA information */
+		} else if (is_disk_zbc && 0x89 == cmd[2]) { /* ATA info */
 			arr[1] = cmd[2];        /*sanity */
 			n = inquiry_vpd_89(&arr[4]);
 			put_unaligned_be16(n, arr + 2);
-		} else if (is_disk && 0xb0 == cmd[2]) { /* Block limits */
+		} else if (is_disk_zbc && 0xb0 == cmd[2]) { /* Block limits */
 			arr[1] = cmd[2];        /*sanity */
 			arr[3] = inquiry_vpd_b0(&arr[4]);
-		} else if (is_disk && 0xb1 == cmd[2]) { /* Block char. */
+		} else if (is_disk_zbc && 0xb1 == cmd[2]) { /* Block char. */
 			arr[1] = cmd[2];        /*sanity */
 			arr[3] = inquiry_vpd_b1(&arr[4]);
 		} else if (is_disk && 0xb2 == cmd[2]) { /* LB Prov. */
 			arr[1] = cmd[2];        /*sanity */
 			arr[3] = inquiry_vpd_b2(&arr[4]);
+		} else if (is_zbc && cmd[2] == 0xb6) { /* ZB dev. charact. */
+			arr[1] = cmd[2];        /*sanity */
+			arr[3] = inquiry_vpd_b6(&arr[4]);
 		} else {
 			mk_sense_invalid_fld(scp, SDEB_IN_CDB, 2, -1);
 			kfree(arr);
@@ -1562,6 +1588,9 @@ static int resp_inquiry(struct scsi_cmnd *scp, struct sdebug_dev_info *devip)
 	} else if (sdebug_ptype == TYPE_TAPE) {	/* SSC-4 rev 3 */
 		put_unaligned_be16(0x525, arr + n);
 		n += 2;
+	} else if (is_zbc) {	/* ZBC BSR INCITS 536 revision 05 */
+		put_unaligned_be16(0x624, arr + n);
+		n += 2;
 	}
 	put_unaligned_be16(0x2100, arr + n);	/* SPL-4 no version claimed */
 	ret = fill_from_dev_buffer(scp, arr,
@@ -2151,7 +2180,7 @@ static int resp_mode_sense(struct scsi_cmnd *scp,
 	unsigned char *ap;
 	unsigned char arr[SDEBUG_MAX_MSENSE_SZ];
 	unsigned char *cmd = scp->cmnd;
-	bool dbd, llbaa, msense_6, is_disk, bad_pcode;
+	bool dbd, llbaa, msense_6, is_disk, is_zbc, bad_pcode;
 
 	dbd = !!(cmd[1] & 0x8);		/* disable block descriptors */
 	pcontrol = (cmd[2] & 0xc0) >> 6;
@@ -2160,7 +2189,8 @@ static int resp_mode_sense(struct scsi_cmnd *scp,
 	msense_6 = (MODE_SENSE == cmd[0]);
 	llbaa = msense_6 ? false : !!(cmd[1] & 0x10);
 	is_disk = (sdebug_ptype == TYPE_DISK);
-	if (is_disk && !dbd)
+	is_zbc = (sdebug_ptype == TYPE_ZBC);
+	if ((is_disk || is_zbc) && !dbd)
 		bd_len = llbaa ? 16 : 8;
 	else
 		bd_len = 0;
@@ -2172,8 +2202,8 @@ static int resp_mode_sense(struct scsi_cmnd *scp,
 	}
 	target_dev_id = ((devip->sdbg_host->shost->host_no + 1) * 2000) +
 			(devip->target * 1000) - 3;
-	/* for disks set DPOFUA bit and clear write protect (WP) bit */
-	if (is_disk) {
+	/* for disks+zbc set DPOFUA bit and clear write protect (WP) bit */
+	if (is_disk || is_zbc) {
 		dev_spec = 0x10;	/* =0x90 if WP=1 implies read-only */
 		if (sdebug_wp)
 			dev_spec |= 0x80;
@@ -2233,7 +2263,7 @@ static int resp_mode_sense(struct scsi_cmnd *scp,
 			bad_pcode = true;
 		break;
 	case 0x8:	/* Caching page, direct access */
-		if (is_disk) {
+		if (is_disk || is_zbc) {
 			len = resp_caching_pg(ap, pcontrol, target);
 			offset += len;
 		} else
@@ -2271,6 +2301,9 @@ static int resp_mode_sense(struct scsi_cmnd *scp,
 						      target);
 				len += resp_caching_pg(ap + len, pcontrol,
 						       target);
+			} else if (is_zbc) {
+				len += resp_caching_pg(ap + len, pcontrol,
+						       target);
 			}
 			len += resp_ctrl_m_pg(ap + len, pcontrol, target);
 			len += resp_sas_sf_m_pg(ap + len, pcontrol, target);
-- 
2.25.0


^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [PATCH v3 08/15] scsi_debug: add zone commands
  2020-02-20 20:08 [PATCH v3 00/15] host managed ZBC + doublestore Douglas Gilbert
                   ` (6 preceding siblings ...)
  2020-02-20 20:08 ` [PATCH v3 07/15] scsi_debug: expand zbc support Douglas Gilbert
@ 2020-02-20 20:08 ` Douglas Gilbert
  2020-02-21  9:44   ` Johannes Thumshirn
  2020-02-20 20:08 ` [PATCH v3 09/15] scsi_debug: zbc module parameter Douglas Gilbert
                   ` (7 subsequent siblings)
  15 siblings, 1 reply; 26+ messages in thread
From: Douglas Gilbert @ 2020-02-20 20:08 UTC (permalink / raw)
  To: linux-scsi; +Cc: martin.petersen, jejb, hare, Damien.LeMoal, Damien Le Moal

Add support for the 5 ZBC commands and enough functionality to emulate
a host-managed device with one conventional and a set of sequential
write required zones up to the disk capacity.

Signed-off-by: Douglas Gilbert <dgilbert@interlog.com>
Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com>
---
 drivers/scsi/scsi_debug.c | 835 ++++++++++++++++++++++++++++++++++++--
 1 file changed, 793 insertions(+), 42 deletions(-)

diff --git a/drivers/scsi/scsi_debug.c b/drivers/scsi/scsi_debug.c
index 5a720d2a14c4..10f8698031aa 100644
--- a/drivers/scsi/scsi_debug.c
+++ b/drivers/scsi/scsi_debug.c
@@ -91,6 +91,11 @@ static const char *sdebug_version_date = "20190125";
 #define MICROCODE_CHANGED_ASCQ 0x1	/* with TARGET_CHANGED_ASC */
 #define MICROCODE_CHANGED_WO_RESET_ASCQ 0x16
 #define WRITE_ERROR_ASC 0xc
+#define UNALIGNED_WRITE_ASCQ 0x4
+#define WRITE_BOUNDARY_ASCQ 0x5
+#define READ_INVDATA_ASCQ 0x6
+#define READ_BOUNDARY_ASCQ 0x7
+#define INSUFF_ZONE_ASCQ 0xe
 
 /* Additional Sense Code Qualifier (ASCQ) */
 #define ACK_NAK_TO 0x3
@@ -144,6 +149,10 @@ static const char *sdebug_version_date = "20190125";
 #define DEF_UUID_CTL 0
 #define JDELAY_OVERRIDDEN -9999
 
+/* Default parameters for ZBC drives */
+#define DEF_ZBC_ZONE_SIZE_MB	128
+#define DEF_ZBC_MAX_OPEN_ZONES	8
+
 #define SDEBUG_LUN_0_VAL 0
 
 /* bit mask values for sdebug_opts */
@@ -245,6 +254,23 @@ static const char *sdebug_version_date = "20190125";
 
 #define SDEBUG_MAX_CMD_LEN 32
 
+enum sdebug_z_cond {	/* enumeration names taken from table 12, zbc2r04 */
+	zbc_not_write_pointer = 0x0, /* not in table 12; conventional zone */
+	zc1_empty = 0x1,
+	zc2_implicit_open,
+	zc3_explicit_open,
+	zc4_closed,
+	zc5_full = 0xe,	/* values per Zone Condition field in Report Zones */
+	zc6_read_only = 0xd,
+	zc7_offline = 0xf,
+};
+
+struct sdeb_zone_state {	/* ZBC: per zone state */
+	enum sdebug_z_cond z_cond;
+	unsigned int z_size;
+	sector_t z_start;
+	sector_t z_wp;
+};
 
 struct sdebug_dev_info {
 	struct list_head dev_list;
@@ -258,6 +284,16 @@ struct sdebug_dev_info {
 	atomic_t stopped;
 	int sdg_devnum;
 	bool used;
+
+	/* For ZBC devices */
+	sector_t zsize;
+	sector_t zsize_shift;
+	unsigned int nr_zones;
+	unsigned int nr_imp_open;
+	unsigned int nr_exp_open;
+	unsigned int nr_closed;
+	unsigned int max_open;
+	struct sdeb_zone_state *zstate;
 };
 
 struct sdebug_host_info {
@@ -356,10 +392,11 @@ enum sdeb_opcode_index {
 	SDEB_I_SYNC_CACHE = 27,		/* 10, 16 */
 	SDEB_I_COMP_WRITE = 28,
 	SDEB_I_PRE_FETCH = 29,		/* 10, 16 */
-	SDEB_I_LAST_ELEM_P1 = 30,	/* keep this last (previous + 1) */
+	SDEB_I_ZONE_OUT = 30,		/* 0x94+SA; includes no data xfer */
+	SDEB_I_ZONE_IN = 31,		/* 0x95+SA; all have data-in */
+	SDEB_I_LAST_ELEM_P1 = 32,	/* keep this last (previous + 1) */
 };
 
-
 static const unsigned char opcode_ind_arr[256] = {
 /* 0x0; 0x0->0x1f: 6 byte cdbs */
 	SDEB_I_TEST_UNIT_READY, SDEB_I_REZERO_UNIT, 0, SDEB_I_REQUEST_SENSE,
@@ -388,7 +425,8 @@ static const unsigned char opcode_ind_arr[256] = {
 	0, 0, 0, 0, 0, SDEB_I_ATA_PT, 0, 0,
 	SDEB_I_READ, SDEB_I_COMP_WRITE, SDEB_I_WRITE, 0,
 	0, 0, 0, SDEB_I_VERIFY,
-	SDEB_I_PRE_FETCH, SDEB_I_SYNC_CACHE, 0, SDEB_I_WRITE_SAME, 0, 0, 0, 0,
+	SDEB_I_PRE_FETCH, SDEB_I_SYNC_CACHE, 0, SDEB_I_WRITE_SAME,
+	SDEB_I_ZONE_OUT, SDEB_I_ZONE_IN, 0, 0,
 	0, 0, 0, 0, 0, 0, SDEB_I_SERV_ACT_IN_16, SDEB_I_SERV_ACT_OUT_16,
 /* 0xa0; 0xa0->0xbf: 12 byte cdbs */
 	SDEB_I_REPORT_LUNS, SDEB_I_ATA_PT, 0, SDEB_I_MAINT_IN,
@@ -436,6 +474,11 @@ static int resp_comp_write(struct scsi_cmnd *, struct sdebug_dev_info *);
 static int resp_write_buffer(struct scsi_cmnd *, struct sdebug_dev_info *);
 static int resp_sync_cache(struct scsi_cmnd *, struct sdebug_dev_info *);
 static int resp_pre_fetch(struct scsi_cmnd *, struct sdebug_dev_info *);
+static int resp_rep_zones(struct scsi_cmnd *, struct sdebug_dev_info *);
+static int resp_open_zone(struct scsi_cmnd *, struct sdebug_dev_info *);
+static int resp_close_zone(struct scsi_cmnd *, struct sdebug_dev_info *);
+static int resp_finish_zone(struct scsi_cmnd *, struct sdebug_dev_info *);
+static int resp_rwp_zone(struct scsi_cmnd *, struct sdebug_dev_info *);
 
 /*
  * The following are overflow arrays for cdbs that "hit" the same index in
@@ -523,16 +566,34 @@ static const struct opcode_info_t release_iarr[] = {
 
 static const struct opcode_info_t sync_cache_iarr[] = {
 	{0, 0x91, 0, F_SYNC_DELAY | F_M_ACCESS, resp_sync_cache, NULL,
-	    {16,  0x6, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+	    {16, 0x6, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
 	     0xff, 0xff, 0xff, 0xff, 0x3f, 0xc7} },	/* SYNC_CACHE (16) */
 };
 
 static const struct opcode_info_t pre_fetch_iarr[] = {
 	{0, 0x90, 0, F_SYNC_DELAY | F_M_ACCESS, resp_pre_fetch, NULL,
-	    {16,  0x2, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+	    {16, 0x2, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
 	     0xff, 0xff, 0xff, 0xff, 0x3f, 0xc7} },	/* PRE-FETCH (16) */
 };
 
+static const struct opcode_info_t zone_out_iarr[] = {	/* ZONE OUT(16) */
+	{0, 0x94, 0x1, F_SA_LOW, resp_close_zone, NULL,
+	    {16, 0x1, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+	     0xff, 0, 0, 0xff, 0xff, 0x1, 0xc7} },	/* CLOSE ZONE */
+	{0, 0x94, 0x2, F_SA_LOW, resp_finish_zone, NULL,
+	    {16, 0x2, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+	     0xff, 0, 0, 0xff, 0xff, 0x1, 0xc7} },	/* FINISH ZONE */
+	{0, 0x94, 0x4, F_SA_LOW, resp_rwp_zone, NULL,
+	    {16, 0x4, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+	     0xff, 0, 0, 0xff, 0xff, 0x1, 0xc7} },  /* RESET WRITE POINTER */
+};
+
+static const struct opcode_info_t zone_in_iarr[] = {	/* ZONE IN(16) */
+	{0, 0x95, 0x6, F_SA_LOW | F_D_IN, NULL, NULL,
+	    {16, 0x6, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+	     0xff, 0xff, 0xff, 0xff, 0x3f, 0xc7} }, /* REPORT REALMS */
+};
+
 
 /* This array is accessed via SDEB_I_* values. Make sure all are mapped,
  * plus the terminating elements for logic that scans this table such as
@@ -635,6 +696,15 @@ static const struct opcode_info_t opcode_info_arr[SDEB_I_LAST_ELEM_P1 + 1] = {
 	     0, 0, 0, 0} },			/* PRE-FETCH (10) */
 
 /* 30 */
+	{ARRAY_SIZE(zone_out_iarr), 0x94, 0x3, F_SA_LOW,
+	    resp_open_zone, zone_out_iarr, /* ZONE_OUT(16), OPEN ZONE) */
+		{16,  0x3 /* SA */, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+		 0xff, 0xff, 0x0, 0x0, 0xff, 0xff, 0x1, 0xc7} },
+	{ARRAY_SIZE(zone_in_iarr), 0x95, 0x0, F_SA_LOW | F_D_IN,
+	    resp_rep_zones, zone_in_iarr, /* ZONE_IN(16), REPORT ZONES) */
+		{16,  0x0 /* SA */, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+		 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xbf, 0xc7} },
+/* sentinel */
 	{0xff, 0, 0, 0, NULL, NULL,		/* terminating element */
 	    {0,  0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0} },
 };
@@ -720,6 +790,11 @@ static int dix_writes;
 static int dix_reads;
 static int dif_errors;
 
+/* ZBC global data */
+static bool sdeb_zbc_in_use;		/* true when ptype=TYPE_ZBC [0x14] */
+static const int zbc_zone_size_mb;
+static const int zbc_max_open_zones = DEF_ZBC_MAX_OPEN_ZONES;
+
 static int submit_queues = DEF_SUBMIT_QUEUES;  /* > 1 for multi-queue (mq) */
 static struct sdebug_queue *sdebug_q_arr;  /* ptr to array of submit queues */
 
@@ -1417,20 +1492,22 @@ static int inquiry_vpd_b2(unsigned char *arr)
 }
 
 /* Zoned block device characteristics VPD page (ZBC mandatory) */
-static int inquiry_vpd_b6(unsigned char *arr)
+static int inquiry_vpd_b6(struct sdebug_dev_info *devip, unsigned char *arr)
 {
 	memset(arr, 0, 0x3c);
 	arr[0] = 0x1; /* set URSWRZ (unrestricted read in seq. wr req zone) */
 	/*
 	 * Set Optimal number of open sequential write preferred zones and
 	 * Optimal number of non-sequentially written sequential write
-	 * preferred zones and Maximum number of open sequential write
-	 * required zones fields to 'not reported' (0xffffffff). Leave other
-	 * fields set to zero.
+	 * preferred zones fields to 'not reported' (0xffffffff). Leave other
+	 * fields set to zero, apart from Max. number of open swrz_s field.
 	 */
 	put_unaligned_be32(0xffffffff, &arr[4]);
 	put_unaligned_be32(0xffffffff, &arr[8]);
-	put_unaligned_be32(0xffffffff, &arr[12]);
+	if (devip->max_open)
+		put_unaligned_be32(devip->max_open, &arr[12]);
+	else
+		put_unaligned_be32(0xffffffff, &arr[12]);
 	return 0x3c;
 }
 
@@ -1550,7 +1627,7 @@ static int resp_inquiry(struct scsi_cmnd *scp, struct sdebug_dev_info *devip)
 			arr[3] = inquiry_vpd_b2(&arr[4]);
 		} else if (is_zbc && cmd[2] == 0xb6) { /* ZB dev. charact. */
 			arr[1] = cmd[2];        /*sanity */
-			arr[3] = inquiry_vpd_b6(&arr[4]);
+			arr[3] = inquiry_vpd_b6(devip, &arr[4]);
 		} else {
 			mk_sense_invalid_fld(scp, SDEB_IN_CDB, 2, -1);
 			kfree(arr);
@@ -2521,9 +2598,109 @@ static int resp_log_sense(struct scsi_cmnd *scp,
 		    min_t(int, len, SDEBUG_MAX_INQ_ARR_SZ));
 }
 
+/* acc_num is modulo 2 as that is how it is used */
+static int scp2acc_num(struct scsi_cmnd *scp)
+{
+	if (sdebug_doublestore) {
+		struct scsi_device *sdp = scp->device;
+		struct sdebug_dev_info *devip =
+				(struct sdebug_dev_info *)sdp->hostdata;
+
+		return devip->sdg_devnum % 2;
+	}
+	return 0;
+}
+
+static inline bool sdebug_dev_is_zoned(struct sdebug_dev_info *devip)
+{
+	return devip->nr_zones != 0;
+}
+static struct sdeb_zone_state *zbc_zone(unsigned long long lba,
+					struct sdebug_dev_info *devip)
+{
+	return &devip->zstate[lba >> devip->zsize_shift];
+}
+
+static inline bool zbc_zone_is_conv(struct sdeb_zone_state *zsp)
+{
+	return zsp->z_cond == zbc_not_write_pointer;
+}
+
+static void zbc_close_zone(struct sdebug_dev_info *devip,
+			   struct sdeb_zone_state *zsp)
+{
+	enum sdebug_z_cond zc;
+
+	if (zbc_zone_is_conv(zsp))
+		return;
+
+	zc = zsp->z_cond;
+	if (!(zc == zc2_implicit_open || zc == zc3_explicit_open))
+		return;
+
+	if (zc == zc2_implicit_open)
+		devip->nr_imp_open--;
+	else
+		devip->nr_exp_open--;
+
+	if (zsp->z_wp == zsp->z_start) {
+		zsp->z_cond = zc1_empty;
+	} else {
+		zsp->z_cond = zc4_closed;
+		devip->nr_closed++;
+	}
+}
+
+static void zbc_close_imp_open_zone(struct sdebug_dev_info *devip)
+{
+	struct sdeb_zone_state *zsp = &devip->zstate[0];
+	unsigned int i;
+
+	for (i = 0; i < devip->nr_zones; i++, zsp++) {
+		if (zsp->z_cond == zc2_implicit_open) {
+			zbc_close_zone(devip, zsp);
+			return;
+		}
+	}
+}
+
+static void zbc_open_zone(struct sdebug_dev_info *devip,
+			  struct sdeb_zone_state *zsp, bool explicit)
+{
+	enum sdebug_z_cond zc;
+
+	if (zbc_zone_is_conv(zsp))
+		return;
+
+	zc = zsp->z_cond;
+	if ((explicit && zc == zc3_explicit_open) ||
+	    (!explicit && zc == zc2_implicit_open))
+		return;
+
+	/* Close an implicit open zone if necessary */
+	if (explicit && zsp->z_cond == zc2_implicit_open)
+		zbc_close_zone(devip, zsp);
+	else if (devip->max_open &&
+		 devip->nr_imp_open + devip->nr_exp_open >= devip->max_open)
+		zbc_close_imp_open_zone(devip);
+
+	if (zsp->z_cond == zc4_closed)
+		devip->nr_closed--;
+	if (explicit) {
+		zsp->z_cond = zc3_explicit_open;
+		devip->nr_exp_open++;
+	} else {
+		zsp->z_cond = zc2_implicit_open;
+		devip->nr_imp_open++;
+	}
+}
+
 static inline int check_device_access_params(struct scsi_cmnd *scp,
 	unsigned long long lba, unsigned int num, bool write)
 {
+	struct scsi_device *sdp = scp->device;
+	struct sdebug_dev_info *devip = (struct sdebug_dev_info *)sdp->hostdata;
+
 	if (lba + num > sdebug_capacity) {
 		mk_sense_buffer(scp, ILLEGAL_REQUEST, LBA_OUT_OF_RANGE, 0);
 		return check_condition_result;
@@ -2538,17 +2715,57 @@ static inline int check_device_access_params(struct scsi_cmnd *scp,
 		mk_sense_buffer(scp, DATA_PROTECT, WRITE_PROTECTED, 0x2);
 		return check_condition_result;
 	}
-	return 0;
-}
 
-static int scp2acc_num(struct scsi_cmnd *scp)
-{
-	if (sdebug_doublestore) {
-		struct scsi_device *sdp = scp->device;
-		struct sdebug_dev_info *devip =
-				(struct sdebug_dev_info *)sdp->hostdata;
+	if (sdebug_dev_is_zoned(devip)) {
+		struct sdeb_zone_state *zsp = zbc_zone(lba, devip);
+		struct sdeb_zone_state *zsp_end =
+			zbc_zone(lba + num - 1, devip);
 
-		return devip->sdg_devnum;
+		if (write) {
+			/* No restrictions for writes within conv zones */
+			if (zbc_zone_is_conv(zsp)) {
+				if (zbc_zone_is_conv(zsp_end))
+					return 0;
+			}
+			/* Writes cannot cross sequential zone boundaries */
+			if (zsp_end != zsp) {
+				mk_sense_buffer(scp, ILLEGAL_REQUEST,
+					LBA_OUT_OF_RANGE, WRITE_BOUNDARY_ASCQ);
+				return check_condition_result;
+			}
+			/* Cannot write full zones */
+			if (zsp->z_cond == zc5_full) {
+				mk_sense_buffer(scp, ILLEGAL_REQUEST,
+						INVALID_FIELD_IN_CDB, 0);
+				return check_condition_result;
+			}
+			/* Writes must be aligned to the zone WP */
+			if (lba != zsp->z_wp) {
+				mk_sense_buffer(scp, ILLEGAL_REQUEST,
+						LBA_OUT_OF_RANGE,
+						UNALIGNED_WRITE_ASCQ);
+				return check_condition_result;
+			}
+			/* Handle implicit open of closed and empty zones */
+			if (zsp->z_cond == zc1_empty ||
+			    zsp->z_cond == zc4_closed) {
+				if (devip->max_open &&
+				    devip->nr_exp_open >= devip->max_open) {
+					mk_sense_buffer(scp, DATA_PROTECT,
+							INSUFF_RES_ASC,
+							INSUFF_ZONE_ASCQ);
+					return check_condition_result;
+				}
+				zbc_open_zone(devip, zsp, false);
+			}
+		} else if (zsp_end != zsp &&
+			   zbc_zone_is_conv(zsp) &&
+			   !zbc_zone_is_conv(zsp_end)) {
+			/* Reads cannot cross zone types boundaries */
+			mk_sense_buffer(scp, ILLEGAL_REQUEST,
+					LBA_OUT_OF_RANGE, READ_INVDATA_ASCQ);
+			return check_condition_result;
+		}
 	}
 	return 0;
 }
@@ -2574,7 +2791,7 @@ static int do_device_access(struct scsi_cmnd *scmd, u32 sg_skip, u64 lba,
 		return 0;
 	if (scmd->sc_data_direction != dir)
 		return -1;
-	fsp = fake_store_a[scp2acc_num(scmd) % 2];
+	fsp = fake_store_a[scp2acc_num(scmd)];
 
 	block = do_div(lba, sdebug_store_sectors);
 	if (block + num > sdebug_store_sectors)
@@ -2625,7 +2842,7 @@ static bool comp_write_worker(u64 lba, u32 num, const u8 *arr, int acc_num,
 	if (block + num > store_blks)
 		rest = block + num - store_blks;
 
-	fsp = fake_store_a[acc_num % 2];
+	fsp = fake_store_a[acc_num];
 
 	res = !memcmp(fsp + (block * lb_size), arr, (num - rest) * lb_size);
 	if (!res)
@@ -2847,21 +3064,21 @@ static int resp_read_dt0(struct scsi_cmnd *scp, struct sdebug_dev_info *devip)
 		return check_condition_result;
 	}
 
-	read_lock(ramdisk_lck_a[acc_num % 2]);
+	read_lock(ramdisk_lck_a[acc_num]);
 
 	/* DIX + T10 DIF */
 	if (unlikely(sdebug_dix && scsi_prot_sg_count(scp))) {
 		int prot_ret = prot_verify_read(scp, lba, num, ei_lba);
 
 		if (prot_ret) {
-			read_unlock(ramdisk_lck_a[acc_num % 2]);
+			read_unlock(ramdisk_lck_a[acc_num]);
 			mk_sense_buffer(scp, ABORTED_COMMAND, 0x10, prot_ret);
 			return illegal_condition_result;
 		}
 	}
 
 	ret = do_device_access(scp, 0, lba, num, false);
-	read_unlock(ramdisk_lck_a[acc_num % 2]);
+	read_unlock(ramdisk_lck_a[acc_num]);
 	if (unlikely(ret == -1))
 		return DID_ERROR << 16;
 
@@ -3036,7 +3253,7 @@ static void map_region(sector_t lba, unsigned int len)
 static void unmap_region(sector_t lba, unsigned int len, int acc_num)
 {
 	sector_t end = lba + len;
-	u8 *fsp = fake_store_a[acc_num % 2];
+	u8 *fsp = fake_store_a[acc_num];
 
 	while (lba < end) {
 		unsigned long index = lba_to_map_index(lba);
@@ -3125,14 +3342,14 @@ static int resp_write_dt0(struct scsi_cmnd *scp, struct sdebug_dev_info *devip)
 	ret = check_device_access_params(scp, lba, num, true);
 	if (ret)
 		return ret;
-	write_lock(ramdisk_lck_a[acc_num % 2]);
+	write_lock(ramdisk_lck_a[acc_num]);
 
 	/* DIX + T10 DIF */
 	if (unlikely(sdebug_dix && scsi_prot_sg_count(scp))) {
 		int prot_ret = prot_verify_write(scp, lba, num, ei_lba);
 
 		if (prot_ret) {
-			write_unlock(ramdisk_lck_a[acc_num % 2]);
+			write_unlock(ramdisk_lck_a[acc_num]);
 			mk_sense_buffer(scp, ILLEGAL_REQUEST, 0x10, prot_ret);
 			return illegal_condition_result;
 		}
@@ -3141,7 +3358,17 @@ static int resp_write_dt0(struct scsi_cmnd *scp, struct sdebug_dev_info *devip)
 	ret = do_device_access(scp, 0, lba, num, true);
 	if (unlikely(scsi_debug_lbp()))
 		map_region(lba, num);
-	write_unlock(ramdisk_lck_a[acc_num % 2]);
+	/* If ZBC zone then bump its write pointer */
+	if (sdebug_dev_is_zoned(devip)) {
+		struct sdeb_zone_state *zsp = zbc_zone(lba, devip);
+
+		if (!zbc_zone_is_conv(zsp)) {
+			zsp->z_wp += num;
+			if (zsp->z_wp >= zsp->z_start + zsp->z_size)
+				zsp->z_cond = zc5_full;
+		}
+	}
+	write_unlock(ramdisk_lck_a[acc_num]);
 	if (unlikely(-1 == ret))
 		return DID_ERROR << 16;
 	else if (unlikely(sdebug_verbose &&
@@ -3250,7 +3477,7 @@ static int resp_write_scat(struct scsi_cmnd *scp,
 		goto err_out;
 	}
 
-	write_lock(ramdisk_lck_a[acc_num % 2]);
+	write_lock(ramdisk_lck_a[acc_num]);
 	sg_off = lbdof_blen;
 	/* Spec says Buffer xfer Length field in number of LBs in dout */
 	cum_lb = 0;
@@ -3294,6 +3521,16 @@ static int resp_write_scat(struct scsi_cmnd *scp,
 		}
 
 		ret = do_device_access(scp, sg_off, lba, num, true);
+		/* If ZBC zone then bump its write pointer */
+		if (sdebug_dev_is_zoned(devip)) {
+			struct sdeb_zone_state *zsp = zbc_zone(lba, devip);
+
+			if (!zbc_zone_is_conv(zsp)) {
+				zsp->z_wp += num;
+				if (zsp->z_wp >= zsp->z_start + zsp->z_size)
+					zsp->z_cond = zc5_full;
+			}
+		}
 		if (unlikely(scsi_debug_lbp()))
 			map_region(lba, num);
 		if (unlikely(-1 == ret)) {
@@ -3333,7 +3570,7 @@ static int resp_write_scat(struct scsi_cmnd *scp,
 	}
 	ret = 0;
 err_out_unlock:
-	write_unlock(ramdisk_lck_a[acc_num % 2]);
+	write_unlock(ramdisk_lck_a[acc_num]);
 err_out:
 	kfree(lrdp);
 	return ret;
@@ -3342,6 +3579,8 @@ static int resp_write_scat(struct scsi_cmnd *scp,
 static int resp_write_same(struct scsi_cmnd *scp, u64 lba, u32 num,
 			   u32 ei_lba, bool unmap, bool ndob)
 {
+	struct scsi_device *sdp = scp->device;
+	struct sdebug_dev_info *devip = (struct sdebug_dev_info *)sdp->hostdata;
 	unsigned long long i;
 	u64 block, lbaa;
 	u32 lb_size = sdebug_sector_size;
@@ -3354,7 +3593,7 @@ static int resp_write_same(struct scsi_cmnd *scp, u64 lba, u32 num,
 	if (ret)
 		return ret;
 
-	write_lock(ramdisk_lck_a[acc_num % 2]);
+	write_lock(ramdisk_lck_a[acc_num]);
 
 	if (unmap && scsi_debug_lbp()) {
 		unmap_region(lba, num, acc_num);
@@ -3363,7 +3602,7 @@ static int resp_write_same(struct scsi_cmnd *scp, u64 lba, u32 num,
 	lbaa = lba;
 	block = do_div(lbaa, sdebug_store_sectors);
 	/* if ndob then zero 1 logical block, else fetch 1 logical block */
-	fsp = fake_store_a[acc_num % 2];
+	fsp = fake_store_a[acc_num];
 	fs1p = fsp + (block * lb_size);
 	if (ndob) {
 		memset(fs1p, 0, lb_size);
@@ -3372,7 +3611,7 @@ static int resp_write_same(struct scsi_cmnd *scp, u64 lba, u32 num,
 		ret = fetch_to_dev_buffer(scp, fs1p, lb_size);
 
 	if (-1 == ret) {
-		write_unlock(ramdisk_lck_a[acc_num % 2]);
+		write_unlock(ramdisk_lck_a[acc_num]);
 		return DID_ERROR << 16;
 	} else if (sdebug_verbose && !ndob && (ret < lb_size))
 		sdev_printk(KERN_INFO, scp->device,
@@ -3387,8 +3626,18 @@ static int resp_write_same(struct scsi_cmnd *scp, u64 lba, u32 num,
 	}
 	if (scsi_debug_lbp())
 		map_region(lba, num);
+	/* If ZBC zone then bump its write pointer */
+	if (sdebug_dev_is_zoned(devip)) {
+		struct sdeb_zone_state *zsp = zbc_zone(lba, devip);
+
+		if (!zbc_zone_is_conv(zsp)) {
+			zsp->z_wp += num;
+			if (zsp->z_wp >= zsp->z_start + zsp->z_size)
+				zsp->z_cond = zc5_full;
+		}
+	}
 out:
-	write_unlock(ramdisk_lck_a[acc_num % 2]);
+	write_unlock(ramdisk_lck_a[acc_num]);
 
 	return 0;
 }
@@ -3533,7 +3782,7 @@ static int resp_comp_write(struct scsi_cmnd *scp,
 		return check_condition_result;
 	}
 
-	write_lock(ramdisk_lck_a[acc_num % 2]);
+	write_lock(ramdisk_lck_a[acc_num]);
 
 	ret = do_dout_fetch(scp, dnum, arr);
 	if (ret == -1) {
@@ -3551,7 +3800,7 @@ static int resp_comp_write(struct scsi_cmnd *scp,
 	if (scsi_debug_lbp())
 		map_region(lba, num);
 cleanup:
-	write_unlock(ramdisk_lck_a[acc_num % 2]);
+	write_unlock(ramdisk_lck_a[acc_num]);
 	kfree(arr);
 	return retval;
 }
@@ -3595,7 +3844,7 @@ static int resp_unmap(struct scsi_cmnd *scp, struct sdebug_dev_info *devip)
 
 	desc = (void *)&buf[8];
 
-	write_lock(ramdisk_lck_a[acc_num % 2]);
+	write_lock(ramdisk_lck_a[acc_num]);
 
 	for (i = 0 ; i < descriptors ; i++) {
 		unsigned long long lba = get_unaligned_be64(&desc[i].lba);
@@ -3611,7 +3860,7 @@ static int resp_unmap(struct scsi_cmnd *scp, struct sdebug_dev_info *devip)
 	ret = 0;
 
 out:
-	write_unlock(ramdisk_lck_a[acc_num % 2]);
+	write_unlock(ramdisk_lck_a[acc_num]);
 	kfree(buf);
 
 	return ret;
@@ -3863,7 +4112,7 @@ static int resp_verify(struct scsi_cmnd *scp, struct sdebug_dev_info *devip)
 		return check_condition_result;
 	}
 	/* Not changing store, so only need read access */
-	read_lock(ramdisk_lck_a[acc_num % 2]);
+	read_lock(ramdisk_lck_a[acc_num]);
 
 	ret = do_dout_fetch(scp, a_num, arr);
 	if (ret == -1) {
@@ -3885,11 +4134,428 @@ static int resp_verify(struct scsi_cmnd *scp, struct sdebug_dev_info *devip)
 		goto cleanup;
 	}
 cleanup:
-	read_unlock(ramdisk_lck_a[acc_num % 2]);
+	read_unlock(ramdisk_lck_a[acc_num]);
+	kfree(arr);
+	return ret;
+}
+
+#define RZONES_DESC_HD 64
+
+/*
+ * Report two zones, the first: conventional; the second: sequential write
+ * required. The available storage is divided in two for these zones.
+ */
+static int resp_rep_zones(struct scsi_cmnd *scp, struct sdebug_dev_info *devip)
+{
+	unsigned int i, max_zones, rep_max_zones, nrz = 0;
+	int ret = 0;
+	int acc_num = scp2acc_num(scp);
+	u32 alloc_len, rep_opts, rep_len;
+	bool partial;
+	u64 lba, zs_lba;
+	u8 *arr = NULL, *desc;
+	u8 *cmd = scp->cmnd;
+	struct sdeb_zone_state *zsp;
+
+	if (!sdebug_dev_is_zoned(devip)) {
+		mk_sense_invalid_opcode(scp);
+		return check_condition_result;
+	}
+
+	zs_lba = get_unaligned_be64(cmd + 2);
+        alloc_len = get_unaligned_be32(cmd + 10);
+	rep_opts = cmd[14] & 0x3f;
+	partial = cmd[14] & 0x80;
+
+	lba = sdebug_capacity;
+	if (zs_lba >= lba) {
+		mk_sense_buffer(scp, ILLEGAL_REQUEST, LBA_OUT_OF_RANGE, 0);
+		ret = check_condition_result;
+		goto fini;
+	}
+
+	max_zones = devip->nr_zones - (zs_lba >> devip->zsize_shift);
+        rep_max_zones = min((alloc_len - 64) >> ilog2(RZONES_DESC_HD),
+			    max_zones);
+
+	arr = kcalloc(RZONES_DESC_HD, alloc_len, GFP_ATOMIC);
+	if (!arr) {
+		mk_sense_buffer(scp, ILLEGAL_REQUEST, INSUFF_RES_ASC,
+				INSUFF_RES_ASCQ);
+		return check_condition_result;
+	}
+
+	read_lock(ramdisk_lck_a[acc_num]);
+
+	desc = arr + 64;
+	for (i = 0; i < max_zones; i++) {
+
+		zsp = zbc_zone(zs_lba + ((sector_t)i << devip->zsize_shift),
+			       devip);
+		switch (rep_opts) {
+		case 0x00:
+			/* All zones */
+			break;
+		case 0x01:
+			/* Empty zones */
+			if (zsp->z_cond != zc1_empty)
+				continue;
+			break;
+		case 0x02:
+			/* Implicit open zones */
+			if (zsp->z_cond != zc2_implicit_open)
+				continue;
+			break;
+		case 0x03:
+			/* Explicit open zones */
+			if (zsp->z_cond != zc3_explicit_open)
+				continue;
+			break;
+		case 0x04:
+			/* Closed zones */
+			if (zsp->z_cond != zc4_closed)
+				continue;
+			break;
+		case 0x05:
+			/* Full zones */
+			if (zsp->z_cond != zc5_full)
+				continue;
+			break;
+		case 0x06:
+		case 0x07:
+		case 0x10:
+		case 0x11:
+			/*
+			 * Read-only, offline, reset WP recommended and
+			 * non-seq-resource-used are not emulated: no zones
+			 * to report;
+			 */
+			continue;
+		case 0x3f:
+			/* Not write pointer (conventional) zones */
+			if (!zbc_zone_is_conv(zsp))
+				continue;
+			break;
+		default:
+			mk_sense_buffer(scp, ILLEGAL_REQUEST,
+					INVALID_FIELD_IN_CDB, 0);
+			ret = check_condition_result;
+			read_unlock(ramdisk_lck_a[acc_num]);
+			goto fini;
+		}
+
+		if (nrz < rep_max_zones) {
+			/* Fill zone descriptor */
+			if (zbc_zone_is_conv(zsp))
+				desc[0] = 0x1;
+			else
+				desc[0] = 0x2;
+			desc[1] = zsp->z_cond << 4;
+			put_unaligned_be64((u64)zsp->z_size, desc + 8);
+			put_unaligned_be64((u64)zsp->z_start, desc + 16);
+			put_unaligned_be64((u64)zsp->z_wp, desc + 24);
+			desc += 64;
+		}
+
+		if (partial && nrz >= rep_max_zones)
+			break;
+
+		nrz++;
+	}
+	read_unlock(ramdisk_lck_a[acc_num]);
+
+	/* Report header */
+	put_unaligned_be32(nrz * RZONES_DESC_HD, arr + 0);
+	put_unaligned_be64(lba - 1, arr + 8);
+
+	rep_len = (unsigned long)desc - (unsigned long)arr;
+	ret = fill_from_dev_buffer(scp, arr, min_t(int, alloc_len, rep_len));
+
+fini:
 	kfree(arr);
 	return ret;
 }
 
+/* Logic transplanted from tcmu-runner, file_zbc.c __zbc_close_zone() */
+static void zbc_open_all(struct sdebug_dev_info *devip)
+{
+	struct sdeb_zone_state *zsp = &devip->zstate[0];
+	unsigned int i;
+
+	for (i = 0; i < devip->nr_zones; i++, zsp++) {
+		if (zsp->z_cond == zc4_closed)
+			zbc_open_zone(devip, &devip->zstate[i], true);
+	}
+}
+
+static int resp_open_zone(struct scsi_cmnd *scp, struct sdebug_dev_info *devip)
+{
+	int acc_num = scp2acc_num(scp);
+	int res = 0;
+	u64 z_id;
+	u8 *cmd = scp->cmnd;
+	struct sdeb_zone_state *zsp;
+	enum sdebug_z_cond zc;
+	bool all = cmd[14] & 0x01;
+
+	if (!sdebug_dev_is_zoned(devip)) {
+		mk_sense_invalid_opcode(scp);
+		return check_condition_result;
+	}
+
+	write_lock(ramdisk_lck_a[acc_num]);
+	if (all) {
+		/* Check if all closed zones can be open */
+		if (devip->max_open &&
+		    devip->nr_exp_open + devip->nr_closed > devip->max_open) {
+			mk_sense_buffer(scp, DATA_PROTECT, INSUFF_RES_ASC,
+					INSUFF_ZONE_ASCQ);
+			res = check_condition_result;
+			goto fini;
+		}
+		/* Open all closed zones */
+		zbc_open_all(devip);
+		goto fini;
+	}
+
+	/* Open the specified zone */
+	z_id = get_unaligned_be64(cmd + 2);
+	if (z_id >= sdebug_capacity) {
+		mk_sense_buffer(scp, ILLEGAL_REQUEST, LBA_OUT_OF_RANGE, 0);
+		res = check_condition_result;
+		goto fini;
+	}
+
+	zsp = zbc_zone(z_id, devip);
+	if (z_id != zsp->z_start) {
+		mk_sense_buffer(scp, ILLEGAL_REQUEST, INVALID_FIELD_IN_CDB, 0);
+		res = check_condition_result;
+		goto fini;
+	}
+	if (zbc_zone_is_conv(zsp)) {
+		mk_sense_buffer(scp, ILLEGAL_REQUEST, INVALID_FIELD_IN_CDB, 0);
+		res = check_condition_result;
+		goto fini;
+	}
+
+	zc = zsp->z_cond;
+	if (zc == zc3_explicit_open || zc == zc5_full)
+		goto fini;
+
+	if (devip->max_open && devip->nr_exp_open >= devip->max_open) {
+		mk_sense_buffer(scp, DATA_PROTECT, INSUFF_RES_ASC,
+				INSUFF_ZONE_ASCQ);
+		res = check_condition_result;
+		goto fini;
+	}
+
+	if (zc == zc2_implicit_open)
+		zbc_close_zone(devip, zsp);
+	zbc_open_zone(devip, zsp, true);
+fini:
+	write_unlock(ramdisk_lck_a[acc_num]);
+	return res;
+}
+
+static void zbc_close_all(struct sdebug_dev_info *devip)
+{
+	unsigned int i;
+
+	for (i = 0; i < devip->nr_zones; i++)
+		zbc_close_zone(devip, &devip->zstate[i]);
+}
+
+static int resp_close_zone(struct scsi_cmnd *scp,
+			   struct sdebug_dev_info *devip)
+{
+	int acc_num = scp2acc_num(scp);
+	int res = 0;
+	u64 z_id;
+	u8 *cmd = scp->cmnd;
+	struct sdeb_zone_state *zsp;
+	bool all = cmd[14] & 0x01;
+
+	if (!sdebug_dev_is_zoned(devip)) {
+		mk_sense_invalid_opcode(scp);
+		return check_condition_result;
+	}
+
+	write_lock(ramdisk_lck_a[acc_num]);
+
+	if (all) {
+		zbc_close_all(devip);
+		goto fini;
+	}
+
+	/* Close specified zone */
+	z_id = get_unaligned_be64(cmd + 2);
+	if (z_id >= sdebug_capacity) {
+		mk_sense_buffer(scp, ILLEGAL_REQUEST, LBA_OUT_OF_RANGE, 0);
+		res = check_condition_result;
+		goto fini;
+	}
+
+	zsp = zbc_zone(z_id, devip);
+	if (z_id != zsp->z_start) {
+		mk_sense_buffer(scp, ILLEGAL_REQUEST, INVALID_FIELD_IN_CDB, 0);
+		res = check_condition_result;
+		goto fini;
+	}
+	if (zbc_zone_is_conv(zsp)) {
+		mk_sense_buffer(scp, ILLEGAL_REQUEST, INVALID_FIELD_IN_CDB, 0);
+		res = check_condition_result;
+		goto fini;
+	}
+
+	zbc_close_zone(devip, zsp);
+fini:
+	write_unlock(ramdisk_lck_a[acc_num]);
+	return res;
+}
+
+static void zbc_finish_zone(struct sdebug_dev_info *devip,
+			    struct sdeb_zone_state *zsp, bool empty)
+{
+	enum sdebug_z_cond zc = zsp->z_cond;
+
+	if (zc == zc4_closed || zc == zc2_implicit_open ||
+	    zc == zc3_explicit_open || (empty && zc == zc1_empty)) {
+		if (zc == zc2_implicit_open || zc == zc3_explicit_open)
+			zbc_close_zone(devip, zsp);
+		if (zsp->z_cond == zc4_closed)
+			devip->nr_closed--;
+		zsp->z_wp = zsp->z_start + zsp->z_size;
+		zsp->z_cond = zc5_full;
+	}
+}
+
+static void zbc_finish_all(struct sdebug_dev_info *devip)
+{
+	unsigned int i;
+
+	for (i = 0; i < devip->nr_zones; i++)
+		zbc_finish_zone(devip, &devip->zstate[i], false);
+}
+
+static int resp_finish_zone(struct scsi_cmnd *scp,
+			    struct sdebug_dev_info *devip)
+{
+	int acc_num = scp2acc_num(scp);
+	struct sdeb_zone_state *zsp;
+	int res = 0;
+	u64 z_id;
+	u8 *cmd = scp->cmnd;
+	bool all = cmd[14] & 0x01;
+
+	if (!sdebug_dev_is_zoned(devip)) {
+		mk_sense_invalid_opcode(scp);
+		return check_condition_result;
+	}
+
+	write_lock(ramdisk_lck_a[acc_num]);
+	if (all) {
+		zbc_finish_all(devip);
+		goto fini;
+	}
+
+	/* Finish the specified zone */
+	z_id = get_unaligned_be64(cmd + 2);
+	if (z_id >= sdebug_capacity) {
+		mk_sense_buffer(scp, ILLEGAL_REQUEST, LBA_OUT_OF_RANGE, 0);
+		res = check_condition_result;
+		goto fini;
+	}
+
+	zsp = zbc_zone(z_id, devip);
+	if (z_id != zsp->z_start) {
+		mk_sense_buffer(scp, ILLEGAL_REQUEST, INVALID_FIELD_IN_CDB, 0);
+		res = check_condition_result;
+		goto fini;
+	}
+	if (zbc_zone_is_conv(zsp)) {
+		mk_sense_buffer(scp, ILLEGAL_REQUEST, INVALID_FIELD_IN_CDB, 0);
+		res = check_condition_result;
+		goto fini;
+	}
+
+	zbc_finish_zone(devip, zsp, true);
+fini:
+	write_unlock(ramdisk_lck_a[acc_num]);
+	return res;
+}
+
+static void zbc_rwp_zone(struct sdebug_dev_info *devip,
+			 struct sdeb_zone_state *zsp)
+{
+	enum sdebug_z_cond zc;
+
+	if (zbc_zone_is_conv(zsp))
+		return;
+
+	zc = zsp->z_cond;
+	if (zc == zc2_implicit_open || zc == zc3_explicit_open)
+		zbc_close_zone(devip, zsp);
+
+	if (zsp->z_cond == zc4_closed)
+		devip->nr_closed--;
+
+	zsp->z_wp = zsp->z_start;
+	zsp->z_cond = zc1_empty;
+}
+
+static void zbc_rwp_all(struct sdebug_dev_info *devip)
+{
+	unsigned int i;
+
+	for (i = 0; i < devip->nr_zones; i++)
+		zbc_rwp_zone(devip, &devip->zstate[i]);
+}
+
+static int resp_rwp_zone(struct scsi_cmnd *scp, struct sdebug_dev_info *devip)
+{
+	int acc_num = scp2acc_num(scp);
+	struct sdeb_zone_state *zsp;
+	int res = 0;
+	u64 z_id;
+	u8 *cmd = scp->cmnd;
+	bool all = cmd[14] & 0x01;
+
+	if (!sdebug_dev_is_zoned(devip)) {
+		mk_sense_invalid_opcode(scp);
+		return check_condition_result;
+	}
+
+	write_lock(ramdisk_lck_a[acc_num]);
+	if (all) {
+		zbc_rwp_all(devip);
+		goto fini;
+	}
+
+	z_id = get_unaligned_be64(cmd + 2);
+	if (z_id >= sdebug_capacity) {
+		mk_sense_buffer(scp, ILLEGAL_REQUEST, LBA_OUT_OF_RANGE, 0);
+		res = check_condition_result;
+		goto fini;
+	}
+
+	zsp = zbc_zone(z_id, devip);
+	if (z_id != zsp->z_start) {
+		mk_sense_buffer(scp, ILLEGAL_REQUEST, INVALID_FIELD_IN_CDB, 0);
+		res = check_condition_result;
+		goto fini;
+	}
+	if (zbc_zone_is_conv(zsp)) {
+		mk_sense_buffer(scp, ILLEGAL_REQUEST, INVALID_FIELD_IN_CDB, 0);
+		res = check_condition_result;
+		goto fini;
+	}
+
+	zbc_rwp_zone(devip, zsp);
+fini:
+	write_unlock(ramdisk_lck_a[acc_num]);
+	return res;
+}
+
 static struct sdebug_queue *get_queue(struct scsi_cmnd *cmnd)
 {
 	u32 tag = blk_mq_unique_tag(cmnd->request);
@@ -3995,6 +4661,73 @@ static void sdebug_q_cmd_wq_complete(struct work_struct *work)
 static bool got_shared_uuid;
 static uuid_t shared_uuid;
 
+static int sdebug_device_create_zones(struct sdebug_dev_info *devip)
+{
+	struct sdeb_zone_state *zsp;
+	sector_t capacity = get_sdebug_capacity();
+	sector_t zstart = 0;
+	unsigned int i;
+
+	/*
+	 * Set zone size: if zbc_zone_size_mb was not set, figure out a zone
+	 * size allowing for at least 4 zones on the device.
+	 */
+	if (!zbc_zone_size_mb) {
+		devip->zsize = (DEF_ZBC_ZONE_SIZE_MB * SZ_1M)
+			>> ilog2(sdebug_sector_size);
+		while (capacity < devip->zsize * 4 && devip->zsize >= 2)
+			devip->zsize >>= 1;
+		if (devip->zsize < 2) {
+			pr_err("Device capacity too small\n");
+			return -EINVAL;
+		}
+	} else {
+		devip->zsize = (zbc_zone_size_mb * SZ_1M)
+			>> ilog2(sdebug_sector_size);
+		if (devip->zsize >= capacity) {
+			pr_err("Zone size too large for device capacity\n");
+			return -EINVAL;
+		}
+	}
+
+	devip->zsize_shift = ilog2(devip->zsize);
+	devip->nr_zones = (capacity + devip->zsize - 1) >> devip->zsize_shift;
+
+	/* zbc_max_open_zones can be 0, meaning "no limit" */
+	if (zbc_max_open_zones >= devip->nr_zones - 1)
+		devip->max_open = (devip->nr_zones - 1) / 2;
+	else
+		devip->max_open = zbc_max_open_zones;
+
+	devip->zstate = kcalloc(sizeof(struct sdeb_zone_state),
+				devip->nr_zones, GFP_KERNEL);
+	if (!devip->zstate)
+		return -ENOMEM;
+
+	for (i = 0; i < devip->nr_zones; i++) {
+		zsp = &devip->zstate[i];
+
+		zsp->z_start = zstart;
+
+		if (i == 0) {
+			zsp->z_cond = zbc_not_write_pointer;
+			zsp->z_wp = (sector_t)-1;
+		} else {
+			zsp->z_cond = zc1_empty;
+			zsp->z_wp = zsp->z_start;
+		}
+
+		if (zsp->z_start + devip->zsize < capacity)
+			zsp->z_size = devip->zsize;
+		else
+			zsp->z_size = capacity - zsp->z_start;
+
+		zstart += zsp->z_size;
+	}
+
+	return 0;
+}
+
 static struct sdebug_dev_info *sdebug_device_create(
 			struct sdebug_host_info *sdbg_host, gfp_t flags)
 {
@@ -4014,6 +4747,13 @@ static struct sdebug_dev_info *sdebug_device_create(
 			}
 		}
 		devip->sdbg_host = sdbg_host;
+		if (sdeb_zbc_in_use) {
+			if (sdebug_device_create_zones(devip)) {
+				kfree(devip);
+				return NULL;
+			}
+		}
+		devip->sdbg_host = sdbg_host;
 		list_add_tail(&devip->dev_list, &sdbg_host->dev_info_list);
 	}
 	return devip;
@@ -4968,10 +5708,12 @@ static ssize_t ptype_show(struct device_driver *ddp, char *buf)
 static ssize_t ptype_store(struct device_driver *ddp, const char *buf,
 			   size_t count)
 {
-	int n;
+	int n, prev_pdt;
 
 	if ((count > 0) && (1 == sscanf(buf, "%d", &n)) && (n >= 0)) {
+		prev_pdt = sdebug_ptype;
 		sdebug_ptype = n;
+		sdeb_zbc_in_use = (sdebug_ptype == TYPE_ZBC);
 		return count;
 	}
 	return -EINVAL;
@@ -5240,6 +5982,10 @@ static ssize_t virtual_gb_store(struct device_driver *ddp, const char *buf,
 	int n;
 	bool changed;
 
+	/* Ignore capacity change for ZBC drives for now */
+	if (sdeb_zbc_in_use)
+		return -ENOTSUPP;
+
 	if ((count > 0) && (1 == sscanf(buf, "%d", &n)) && (n >= 0)) {
 		changed = (sdebug_virtual_gb != n);
 		sdebug_virtual_gb = n;
@@ -5709,6 +6455,9 @@ static int __init scsi_debug_init(void)
 		if (sdebug_num_parts)
 			map_region(0, 2);
 	}
+	/* check for host managed zoned block device [ptype=0x14] */
+	if (sdebug_ptype == TYPE_ZBC)
+		sdeb_zbc_in_use = true;
 
 	pseudo_primary = root_device_register("pseudo_0");
 	if (IS_ERR(pseudo_primary)) {
@@ -5832,6 +6581,7 @@ static int sdebug_add_adapter(void)
 	list_for_each_entry_safe(sdbg_devinfo, tmp, &sdbg_host->dev_info_list,
 				 dev_list) {
 		list_del(&sdbg_devinfo->dev_list);
+		kfree(sdbg_devinfo->zstate);
 		kfree(sdbg_devinfo);
 	}
 
@@ -6208,6 +6958,7 @@ static int sdebug_driver_remove(struct device *dev)
 	list_for_each_entry_safe(sdbg_devinfo, tmp, &sdbg_host->dev_info_list,
 				 dev_list) {
 		list_del(&sdbg_devinfo->dev_list);
+		kfree(sdbg_devinfo->zstate);
 		kfree(sdbg_devinfo);
 	}
 
-- 
2.25.0


^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [PATCH v3 09/15] scsi_debug: zbc module parameter
  2020-02-20 20:08 [PATCH v3 00/15] host managed ZBC + doublestore Douglas Gilbert
                   ` (7 preceding siblings ...)
  2020-02-20 20:08 ` [PATCH v3 08/15] scsi_debug: add zone commands Douglas Gilbert
@ 2020-02-20 20:08 ` Douglas Gilbert
  2020-02-21  9:49   ` Johannes Thumshirn
  2020-02-20 20:08 ` [PATCH v3 10/15] scsi_debug: re-arrange parameters alphabetically Douglas Gilbert
                   ` (6 subsequent siblings)
  15 siblings, 1 reply; 26+ messages in thread
From: Douglas Gilbert @ 2020-02-20 20:08 UTC (permalink / raw)
  To: linux-scsi; +Cc: martin.petersen, jejb, hare, Damien.LeMoal

Add zbc module parameter (a boolean) as a synonym for the more
obscure ptype=0x14 . The zbc attribute is also available in
/sys/bus/pseudo/drivers/scsi_debug/zbc and is modifiable.

Signed-off-by: Douglas Gilbert <dgilbert@interlog.com>
---
 drivers/scsi/scsi_debug.c | 42 ++++++++++++++++++++++++++++++++++++++-
 1 file changed, 41 insertions(+), 1 deletion(-)

diff --git a/drivers/scsi/scsi_debug.c b/drivers/scsi/scsi_debug.c
index 10f8698031aa..be3597805b08 100644
--- a/drivers/scsi/scsi_debug.c
+++ b/drivers/scsi/scsi_debug.c
@@ -763,6 +763,7 @@ static bool have_dif_prot;
 static bool write_since_sync;
 static bool sdebug_statistics = DEF_STATISTICS;
 static bool sdebug_wp;
+static bool sdebug_zbc;
 
 static unsigned int sdebug_store_sectors;
 static sector_t sdebug_capacity;	/* in sectors */
@@ -5443,6 +5444,7 @@ module_param_named(vpd_use_hostno, sdebug_vpd_use_hostno, int,
 module_param_named(wp, sdebug_wp, bool, S_IRUGO | S_IWUSR);
 module_param_named(write_same_length, sdebug_write_same_length, int,
 		   S_IRUGO | S_IWUSR);
+module_param_named(zbc, sdebug_zbc, bool, S_IRUGO | S_IWUSR);
 
 MODULE_AUTHOR("Eric Youngdale + Douglas Gilbert");
 MODULE_DESCRIPTION("SCSI debug adapter driver");
@@ -5483,9 +5485,9 @@ MODULE_PARM_DESC(no_uld, "stop ULD (e.g. sd driver) attaching (def=0))");
 MODULE_PARM_DESC(num_parts, "number of partitions(def=0)");
 MODULE_PARM_DESC(num_tgts, "number of targets per host to simulate(def=1)");
 MODULE_PARM_DESC(opt_blks, "optimal transfer length in blocks (def=1024)");
+MODULE_PARM_DESC(opt_xferlen_exp, "optimal transfer length granularity exponent (def=physblk_exp)");
 MODULE_PARM_DESC(opts, "1->noise, 2->medium_err, 4->timeout, 8->recovered_err... (def=0)");
 MODULE_PARM_DESC(physblk_exp, "physical block exponent (def=0)");
-MODULE_PARM_DESC(opt_xferlen_exp, "optimal transfer length granularity exponent (def=physblk_exp)");
 MODULE_PARM_DESC(ptype, "SCSI peripheral type(def=0[disk])");
 MODULE_PARM_DESC(random, "If set, uniformly randomize command duration between 0 and delay_in_ns");
 MODULE_PARM_DESC(removable, "claim to have removable media (def=0)");
@@ -5504,6 +5506,7 @@ MODULE_PARM_DESC(virtual_gb, "virtual gigabyte (GiB) size (def=0 -> use dev_size
 MODULE_PARM_DESC(vpd_use_hostno, "0 -> dev ids ignore hostno (def=1 -> unique dev ids)");
 MODULE_PARM_DESC(wp, "Write Protect (def=0)");
 MODULE_PARM_DESC(write_same_length, "Maximum blocks per WRITE SAME cmd (def=0xffff)");
+MODULE_PARM_DESC(zbc, "Emulate ZBC device(s) (def=false) [same action as ptype=0x14]");
 
 #define SDEBUG_INFO_LEN 256
 static char sdebug_info[SDEBUG_INFO_LEN];
@@ -6225,6 +6228,42 @@ static ssize_t cdb_len_store(struct device_driver *ddp, const char *buf,
 }
 static DRIVER_ATTR_RW(cdb_len);
 
+static ssize_t zbc_show(struct device_driver *ddp, char *buf)
+{
+	return scnprintf(buf, PAGE_SIZE, "%d\n", sdebug_ptype == TYPE_ZBC);
+}
+static ssize_t zbc_store(struct device_driver *ddp, const char *buf,
+			 size_t count)
+{
+	bool want_zbc;
+	int ret, n;
+	int prev_pdt = sdebug_ptype;
+
+	ret = kstrtoint(buf, 0, &n);
+	if (ret)
+		return ret;
+	want_zbc = !!n;
+	if (prev_pdt == TYPE_ZBC || want_zbc) {
+		if (prev_pdt == TYPE_ZBC && want_zbc)
+			return count;
+		sdeb_zbc_in_use = want_zbc;
+		if (sdeb_zbc_in_use) {
+			struct sdeb_zone_state *zsp = sdeb_zstate_a;
+
+			sdebug_ptype = TYPE_ZBC;
+			zbc_swrz_start = sdebug_capacity >> 1;
+			zsp->write_pointer = zbc_swrz_start;
+			zsp->z_cond = zc1_empty;
+			++zsp;
+			zsp->write_pointer = zbc_swrz_start;
+			zsp->z_cond = zc1_empty;
+		} else {
+			sdebug_ptype = DEF_PTYPE;	/* a disk ? ? */
+		}
+	}
+	return count;
+}
+static DRIVER_ATTR_RW(zbc);
 
 /* Note: The following array creates attribute files in the
    /sys/bus/pseudo/drivers/scsi_debug directory. The advantage of these
@@ -6267,6 +6306,7 @@ static struct attribute *sdebug_drv_attrs[] = {
 	&driver_attr_strict.attr,
 	&driver_attr_uuid_ctl.attr,
 	&driver_attr_cdb_len.attr,
+	&driver_attr_zbc.attr,
 	NULL,
 };
 ATTRIBUTE_GROUPS(sdebug_drv);
-- 
2.25.0


^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [PATCH v3 10/15] scsi_debug: re-arrange parameters alphabetically
  2020-02-20 20:08 [PATCH v3 00/15] host managed ZBC + doublestore Douglas Gilbert
                   ` (8 preceding siblings ...)
  2020-02-20 20:08 ` [PATCH v3 09/15] scsi_debug: zbc module parameter Douglas Gilbert
@ 2020-02-20 20:08 ` Douglas Gilbert
  2020-02-20 20:08 ` [PATCH v3 11/15] scsi_debug: zbc parameter can be string Douglas Gilbert
                   ` (5 subsequent siblings)
  15 siblings, 0 replies; 26+ messages in thread
From: Douglas Gilbert @ 2020-02-20 20:08 UTC (permalink / raw)
  To: linux-scsi; +Cc: martin.petersen, jejb, hare, Damien.LeMoal

This module has a lot of parameters and when searching for
one, the author prefers them in alphabetical orders. This can
lead to somewhat illogical ordering (e.g. inq_product before
inq_vendor) but it is not clear what another total logical
ordering would be.

Signed-off-by: Douglas Gilbert <dgilbert@interlog.com>
---
 drivers/scsi/scsi_debug.c | 36 +++++++++++++++++++-----------------
 1 file changed, 19 insertions(+), 17 deletions(-)

diff --git a/drivers/scsi/scsi_debug.c b/drivers/scsi/scsi_debug.c
index be3597805b08..1a3cca5f3ec6 100644
--- a/drivers/scsi/scsi_debug.c
+++ b/drivers/scsi/scsi_debug.c
@@ -5401,30 +5401,32 @@ module_param_named(every_nth, sdebug_every_nth, int, S_IRUGO | S_IWUSR);
 module_param_named(fake_rw, sdebug_fake_rw, int, S_IRUGO | S_IWUSR);
 module_param_named(guard, sdebug_guard, uint, S_IRUGO);
 module_param_named(host_lock, sdebug_host_lock, bool, S_IRUGO | S_IWUSR);
-module_param_string(inq_vendor, sdebug_inq_vendor_id,
-		    sizeof(sdebug_inq_vendor_id), S_IRUGO|S_IWUSR);
 module_param_string(inq_product, sdebug_inq_product_id,
-		    sizeof(sdebug_inq_product_id), S_IRUGO|S_IWUSR);
+		    sizeof(sdebug_inq_product_id), S_IRUGO | S_IWUSR);
 module_param_string(inq_rev, sdebug_inq_product_rev,
-		    sizeof(sdebug_inq_product_rev), S_IRUGO|S_IWUSR);
+		    sizeof(sdebug_inq_product_rev), S_IRUGO | S_IWUSR);
+module_param_string(inq_vendor, sdebug_inq_vendor_id,
+		    sizeof(sdebug_inq_vendor_id), S_IRUGO | S_IWUSR);
+module_param_named(lowest_aligned, sdebug_lowest_aligned, int, S_IRUGO);
+module_param_named(lbprz, sdebug_lbprz, int, S_IRUGO);
 module_param_named(lbpu, sdebug_lbpu, int, S_IRUGO);
 module_param_named(lbpws, sdebug_lbpws, int, S_IRUGO);
 module_param_named(lbpws10, sdebug_lbpws10, int, S_IRUGO);
-module_param_named(lbprz, sdebug_lbprz, int, S_IRUGO);
-module_param_named(lowest_aligned, sdebug_lowest_aligned, int, S_IRUGO);
 module_param_named(max_luns, sdebug_max_luns, int, S_IRUGO | S_IWUSR);
 module_param_named(max_queue, sdebug_max_queue, int, S_IRUGO | S_IWUSR);
-module_param_named(medium_error_start, sdebug_medium_error_start, int, S_IRUGO | S_IWUSR);
-module_param_named(medium_error_count, sdebug_medium_error_count, int, S_IRUGO | S_IWUSR);
+module_param_named(medium_error_count, sdebug_medium_error_count, int,
+		   S_IRUGO | S_IWUSR);
+module_param_named(medium_error_start, sdebug_medium_error_start, int,
+		   S_IRUGO | S_IWUSR);
 module_param_named(ndelay, sdebug_ndelay, int, S_IRUGO | S_IWUSR);
 module_param_named(no_lun_0, sdebug_no_lun_0, int, S_IRUGO | S_IWUSR);
 module_param_named(no_uld, sdebug_no_uld, int, S_IRUGO);
 module_param_named(num_parts, sdebug_num_parts, int, S_IRUGO);
 module_param_named(num_tgts, sdebug_num_tgts, int, S_IRUGO | S_IWUSR);
 module_param_named(opt_blks, sdebug_opt_blks, int, S_IRUGO);
+module_param_named(opt_xferlen_exp, sdebug_opt_xferlen_exp, int, S_IRUGO);
 module_param_named(opts, sdebug_opts, int, S_IRUGO | S_IWUSR);
 module_param_named(physblk_exp, sdebug_physblk_exp, int, S_IRUGO);
-module_param_named(opt_xferlen_exp, sdebug_opt_xferlen_exp, int, S_IRUGO);
 module_param_named(ptype, sdebug_ptype, int, S_IRUGO | S_IWUSR);
 module_param_named(random, sdebug_random, bool, S_IRUGO | S_IWUSR);
 module_param_named(removable, sdebug_removable, bool, S_IRUGO | S_IWUSR);
@@ -5437,8 +5439,8 @@ module_param_named(unmap_alignment, sdebug_unmap_alignment, int, S_IRUGO);
 module_param_named(unmap_granularity, sdebug_unmap_granularity, int, S_IRUGO);
 module_param_named(unmap_max_blocks, sdebug_unmap_max_blocks, int, S_IRUGO);
 module_param_named(unmap_max_desc, sdebug_unmap_max_desc, int, S_IRUGO);
-module_param_named(virtual_gb, sdebug_virtual_gb, int, S_IRUGO | S_IWUSR);
 module_param_named(uuid_ctl, sdebug_uuid_ctl, int, S_IRUGO);
+module_param_named(virtual_gb, sdebug_virtual_gb, int, S_IRUGO | S_IWUSR);
 module_param_named(vpd_use_hostno, sdebug_vpd_use_hostno, int,
 		   S_IRUGO | S_IWUSR);
 module_param_named(wp, sdebug_wp, bool, S_IRUGO | S_IWUSR);
@@ -5459,26 +5461,26 @@ MODULE_PARM_DESC(delay, "response delay (def=1 jiffy); 0:imm, -1,-2:tiny");
 MODULE_PARM_DESC(dev_size_mb, "size in MiB of ram shared by devs(def=8)");
 MODULE_PARM_DESC(dif, "data integrity field type: 0-3 (def=0)");
 MODULE_PARM_DESC(dix, "data integrity extensions mask (def=0)");
-MODULE_PARM_DESC(dsense, "use descriptor sense format(def=0 -> fixed)");
 MODULE_PARM_DESC(doublestore, "If set, 2 data buffers allocated, devices alternate between the two");
+MODULE_PARM_DESC(dsense, "use descriptor sense format(def=0 -> fixed)");
 MODULE_PARM_DESC(every_nth, "timeout every nth command(def=0)");
 MODULE_PARM_DESC(fake_rw, "fake reads/writes instead of copying (def=0)");
 MODULE_PARM_DESC(guard, "protection checksum: 0=crc, 1=ip (def=0)");
 MODULE_PARM_DESC(host_lock, "host_lock is ignored (def=0)");
-MODULE_PARM_DESC(inq_vendor, "SCSI INQUIRY vendor string (def=\"Linux\")");
 MODULE_PARM_DESC(inq_product, "SCSI INQUIRY product string (def=\"scsi_debug\")");
 MODULE_PARM_DESC(inq_rev, "SCSI INQUIRY revision string (def=\""
 		 SDEBUG_VERSION "\")");
+MODULE_PARM_DESC(inq_vendor, "SCSI INQUIRY vendor string (def=\"Linux\")");
+MODULE_PARM_DESC(lowest_aligned, "lowest aligned lba (def=0)");
+MODULE_PARM_DESC(lbprz,
+		 "on read unmapped LBs return 0 when 1 (def), return 0xff when 2");
 MODULE_PARM_DESC(lbpu, "enable LBP, support UNMAP command (def=0)");
 MODULE_PARM_DESC(lbpws, "enable LBP, support WRITE SAME(16) with UNMAP bit (def=0)");
 MODULE_PARM_DESC(lbpws10, "enable LBP, support WRITE SAME(10) with UNMAP bit (def=0)");
-MODULE_PARM_DESC(lbprz,
-	"on read unmapped LBs return 0 when 1 (def), return 0xff when 2");
-MODULE_PARM_DESC(lowest_aligned, "lowest aligned lba (def=0)");
 MODULE_PARM_DESC(max_luns, "number of LUNs per target to simulate(def=1)");
 MODULE_PARM_DESC(max_queue, "max number of queued commands (1 to max(def))");
-MODULE_PARM_DESC(medium_error_start, "starting sector number to return MEDIUM error");
 MODULE_PARM_DESC(medium_error_count, "count of sectors to return follow on MEDIUM error");
+MODULE_PARM_DESC(medium_error_start, "starting sector number to return MEDIUM error");
 MODULE_PARM_DESC(ndelay, "response delay in nanoseconds (def=0 -> ignore)");
 MODULE_PARM_DESC(no_lun_0, "no LU number 0 (def=0 -> have lun 0)");
 MODULE_PARM_DESC(no_uld, "stop ULD (e.g. sd driver) attaching (def=0))");
@@ -6276,8 +6278,8 @@ static struct attribute *sdebug_drv_attrs[] = {
 	&driver_attr_delay.attr,
 	&driver_attr_opts.attr,
 	&driver_attr_ptype.attr,
-	&driver_attr_dsense.attr,
 	&driver_attr_doublestore.attr,
+	&driver_attr_dsense.attr,
 	&driver_attr_fake_rw.attr,
 	&driver_attr_no_lun_0.attr,
 	&driver_attr_num_tgts.attr,
-- 
2.25.0


^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [PATCH v3 11/15] scsi_debug: zbc parameter can be string
  2020-02-20 20:08 [PATCH v3 00/15] host managed ZBC + doublestore Douglas Gilbert
                   ` (9 preceding siblings ...)
  2020-02-20 20:08 ` [PATCH v3 10/15] scsi_debug: re-arrange parameters alphabetically Douglas Gilbert
@ 2020-02-20 20:08 ` Douglas Gilbert
  2020-02-20 20:08 ` [PATCH v3 12/15] scsi_debug: zone_max_open module parameter Douglas Gilbert
                   ` (4 subsequent siblings)
  15 siblings, 0 replies; 26+ messages in thread
From: Douglas Gilbert @ 2020-02-20 20:08 UTC (permalink / raw)
  To: linux-scsi; +Cc: martin.petersen, jejb, hare, Damien.LeMoal

Change the zbc module parameter to:
    0: none (probably a convential disk)
    1: host_aware
    2: host_managed

These values are chosen to match 'enum blk_zoned_model' found in
include/linux/blkdev.h . One of the strings 'none', 'aware' or
'managed' can be given instead of a number. Those strings may
be prefixed by 'host_' or 'host-'.

At this time there is no ZBC "host-aware" implementation so that
string (or the value '1') has no effect.

Signed-off-by: Douglas Gilbert <dgilbert@interlog.com>
---
 drivers/scsi/scsi_debug.c | 116 ++++++++++++++++++++++++++++----------
 1 file changed, 86 insertions(+), 30 deletions(-)

diff --git a/drivers/scsi/scsi_debug.c b/drivers/scsi/scsi_debug.c
index 1a3cca5f3ec6..ec8ab3c4380c 100644
--- a/drivers/scsi/scsi_debug.c
+++ b/drivers/scsi/scsi_debug.c
@@ -39,6 +39,7 @@
 #include <linux/uuid.h>
 #include <linux/t10-pi.h>
 #include <linux/random.h>
+#include <linux/ctype.h>
 
 #include <net/checksum.h>
 
@@ -110,7 +111,9 @@ static const char *sdebug_version_date = "20190125";
 #define DEF_ATO 1
 #define DEF_CDB_LEN 10
 #define DEF_JDELAY   1		/* if > 0 unit is a jiffy */
+#define DEF_DEV_SIZE_PRE_INIT   0
 #define DEF_DEV_SIZE_MB   8
+#define DEF_ZBC_DEV_SIZE_MB   128
 #define DEF_DIF 0
 #define DEF_DIX 0
 #define DEF_DOUBLESTORE false
@@ -713,7 +716,7 @@ static int sdebug_add_host = DEF_NUM_HOST;
 static int sdebug_ato = DEF_ATO;
 static int sdebug_cdb_len = DEF_CDB_LEN;
 static int sdebug_jdelay = DEF_JDELAY;	/* if > 0 then unit is jiffies */
-static int sdebug_dev_size_mb = DEF_DEV_SIZE_MB;
+static int sdebug_dev_size_mb = DEF_DEV_SIZE_PRE_INIT;
 static int sdebug_dif = DEF_DIF;
 static int sdebug_dix = DEF_DIX;
 static int sdebug_dsense = DEF_D_SENSE;
@@ -741,6 +744,8 @@ static int sdebug_scsi_level = DEF_SCSI_LEVEL;
 static int sdebug_sector_size = DEF_SECTOR_SIZE;
 static int sdebug_virtual_gb = DEF_VIRTUAL_GB;
 static int sdebug_vpd_use_hostno = DEF_VPD_USE_HOSTNO;
+/* Following enum: 0: no zbc, def; 1: host aware; 2: host managed */
+static enum blk_zoned_model sdeb_zbc_model;
 static unsigned int sdebug_lbpu = DEF_LBPU;
 static unsigned int sdebug_lbpws = DEF_LBPWS;
 static unsigned int sdebug_lbpws10 = DEF_LBPWS10;
@@ -763,7 +768,7 @@ static bool have_dif_prot;
 static bool write_since_sync;
 static bool sdebug_statistics = DEF_STATISTICS;
 static bool sdebug_wp;
-static bool sdebug_zbc;
+static char *sdeb_zbc_model_s;
 
 static unsigned int sdebug_store_sectors;
 static sector_t sdebug_capacity;	/* in sectors */
@@ -5446,7 +5451,7 @@ module_param_named(vpd_use_hostno, sdebug_vpd_use_hostno, int,
 module_param_named(wp, sdebug_wp, bool, S_IRUGO | S_IWUSR);
 module_param_named(write_same_length, sdebug_write_same_length, int,
 		   S_IRUGO | S_IWUSR);
-module_param_named(zbc, sdebug_zbc, bool, S_IRUGO | S_IWUSR);
+module_param_named(zbc, sdeb_zbc_model_s, charp, S_IRUGO | S_IWUSR);
 
 MODULE_AUTHOR("Eric Youngdale + Douglas Gilbert");
 MODULE_DESCRIPTION("SCSI debug adapter driver");
@@ -5508,7 +5513,7 @@ MODULE_PARM_DESC(virtual_gb, "virtual gigabyte (GiB) size (def=0 -> use dev_size
 MODULE_PARM_DESC(vpd_use_hostno, "0 -> dev ids ignore hostno (def=1 -> unique dev ids)");
 MODULE_PARM_DESC(wp, "Write Protect (def=0)");
 MODULE_PARM_DESC(write_same_length, "Maximum blocks per WRITE SAME cmd (def=0xffff)");
-MODULE_PARM_DESC(zbc, "Emulate ZBC device(s) (def=false) [same action as ptype=0x14]");
+MODULE_PARM_DESC(zbc, "'none' [0]; 'aware' [1]; 'managed' [2] (def=0). Can have 'host_' prefix");
 
 #define SDEBUG_INFO_LEN 256
 static char sdebug_info[SDEBUG_INFO_LEN];
@@ -5718,7 +5723,13 @@ static ssize_t ptype_store(struct device_driver *ddp, const char *buf,
 	if ((count > 0) && (1 == sscanf(buf, "%d", &n)) && (n >= 0)) {
 		prev_pdt = sdebug_ptype;
 		sdebug_ptype = n;
-		sdeb_zbc_in_use = (sdebug_ptype == TYPE_ZBC);
+		if (sdebug_ptype == TYPE_ZBC) {
+			sdeb_zbc_in_use = true;
+			sdeb_zbc_model = BLK_ZONED_HM;
+		} else if (prev_pdt == TYPE_ZBC) {
+			sdeb_zbc_in_use = false;
+			sdeb_zbc_model = BLK_ZONED_NONE;
+		}
 		return count;
 	}
 	return -EINVAL;
@@ -6230,38 +6241,63 @@ static ssize_t cdb_len_store(struct device_driver *ddp, const char *buf,
 }
 static DRIVER_ATTR_RW(cdb_len);
 
+static int sdeb_zbc_model_str(const char *cp)
+{
+	int res = -EINVAL;
+
+	if (isalpha(cp[0])) {
+		if (strstr(cp, "none"))
+			res = BLK_ZONED_NONE;
+		else if (strstr(cp, "aware"))
+			res = BLK_ZONED_HA;
+		else if (strstr(cp, "managed"))
+			res = BLK_ZONED_HM;
+	} else {
+		int n, ret;
+
+		ret = kstrtoint(cp, 0, &n);
+		if (ret)
+			return ret;
+		if (n >= 0 || n <= 2)
+			res = n;
+	}
+	return res;
+}
+
 static ssize_t zbc_show(struct device_driver *ddp, char *buf)
 {
-	return scnprintf(buf, PAGE_SIZE, "%d\n", sdebug_ptype == TYPE_ZBC);
+	switch (sdeb_zbc_model) {
+	case BLK_ZONED_NONE:	/* 0 */
+		return scnprintf(buf, PAGE_SIZE, "none\n");
+	case BLK_ZONED_HA:	/* 1, not yet supported */
+		return scnprintf(buf, PAGE_SIZE, "host_aware\n");
+	case BLK_ZONED_HM:	/* 2 */
+		return scnprintf(buf, PAGE_SIZE, "host_managed\n");
+	default:
+		return scnprintf(buf, PAGE_SIZE, "unknown_zbc_model [0x%x]\n",
+				 (unsigned int)sdeb_zbc_model);
+	}
 }
 static ssize_t zbc_store(struct device_driver *ddp, const char *buf,
 			 size_t count)
 {
-	bool want_zbc;
-	int ret, n;
+	bool want_man_zbc;
+	int n;
 	int prev_pdt = sdebug_ptype;
 
-	ret = kstrtoint(buf, 0, &n);
-	if (ret)
-		return ret;
-	want_zbc = !!n;
-	if (prev_pdt == TYPE_ZBC || want_zbc) {
-		if (prev_pdt == TYPE_ZBC && want_zbc)
+	n = sdeb_zbc_model_str(buf);
+	if (n < 0)
+		return n;
+	sdeb_zbc_model = n;
+	want_man_zbc = (sdeb_zbc_model == BLK_ZONED_HM);
+	if (prev_pdt == TYPE_ZBC || want_man_zbc) {
+		if (prev_pdt == TYPE_ZBC && want_man_zbc)
 			return count;
-		sdeb_zbc_in_use = want_zbc;
-		if (sdeb_zbc_in_use) {
-			struct sdeb_zone_state *zsp = sdeb_zstate_a;
-
+		sdeb_zbc_in_use = want_man_zbc;
+		if (sdeb_zbc_in_use)
 			sdebug_ptype = TYPE_ZBC;
-			zbc_swrz_start = sdebug_capacity >> 1;
-			zsp->write_pointer = zbc_swrz_start;
-			zsp->z_cond = zc1_empty;
-			++zsp;
-			zsp->write_pointer = zbc_swrz_start;
-			zsp->z_cond = zc1_empty;
-		} else {
+		else
 			sdebug_ptype = DEF_PTYPE;	/* a disk ? ? */
-		}
 	}
 	return count;
 }
@@ -6398,7 +6434,30 @@ static int __init scsi_debug_init(void)
 	for (k = 0; k < submit_queues; ++k)
 		spin_lock_init(&sdebug_q_arr[k].qc_lock);
 
-	if (sdebug_dev_size_mb < 1)
+	/* check for host managed zoned block device [ptype=0x14] */
+	if (sdebug_ptype == TYPE_ZBC) {
+		sdeb_zbc_in_use = true;
+		sdeb_zbc_model = BLK_ZONED_HM;
+		if (sdebug_dev_size_mb == DEF_DEV_SIZE_PRE_INIT)
+			sdebug_dev_size_mb = DEF_ZBC_DEV_SIZE_MB;
+	}
+	if (sdeb_zbc_model_s && *sdeb_zbc_model_s) { /* e.g. 'zbc=managed' */
+		k = sdeb_zbc_model_str(sdeb_zbc_model_s);
+		if (k < 0) {
+			ret = k;
+			goto free_vm;
+		}
+		sdeb_zbc_model = k;
+		if (sdeb_zbc_model == BLK_ZONED_HM) {
+			sdeb_zbc_in_use = true;
+			sdebug_ptype = TYPE_ZBC;
+			if (sdebug_dev_size_mb == DEF_DEV_SIZE_PRE_INIT)
+				sdebug_dev_size_mb = DEF_ZBC_DEV_SIZE_MB;
+		}
+	}
+	if (sdebug_dev_size_mb == DEF_DEV_SIZE_PRE_INIT)
+		sdebug_dev_size_mb = DEF_DEV_SIZE_MB;
+	if (sdebug_dev_size_mb < 1)	/* could be negative */
 		sdebug_dev_size_mb = 1;  /* force minimum 1 MB ramdisk */
 	sz = (unsigned long)sdebug_dev_size_mb * 1048576;
 	sdebug_store_sectors = sz / sdebug_sector_size;
@@ -6497,9 +6556,6 @@ static int __init scsi_debug_init(void)
 		if (sdebug_num_parts)
 			map_region(0, 2);
 	}
-	/* check for host managed zoned block device [ptype=0x14] */
-	if (sdebug_ptype == TYPE_ZBC)
-		sdeb_zbc_in_use = true;
 
 	pseudo_primary = root_device_register("pseudo_0");
 	if (IS_ERR(pseudo_primary)) {
-- 
2.25.0


^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [PATCH v3 12/15] scsi_debug: zone_max_open module parameter
  2020-02-20 20:08 [PATCH v3 00/15] host managed ZBC + doublestore Douglas Gilbert
                   ` (10 preceding siblings ...)
  2020-02-20 20:08 ` [PATCH v3 11/15] scsi_debug: zbc parameter can be string Douglas Gilbert
@ 2020-02-20 20:08 ` Douglas Gilbert
  2020-02-20 20:08 ` [PATCH v3 13/15] scsi_debug: zone_nr_conv " Douglas Gilbert
                   ` (3 subsequent siblings)
  15 siblings, 0 replies; 26+ messages in thread
From: Douglas Gilbert @ 2020-02-20 20:08 UTC (permalink / raw)
  To: linux-scsi; +Cc: martin.petersen, jejb, hare, Damien.LeMoal, Damien Le Moal

From: Damien Le Moal <damien.lemoal@wdc.com>

Add the zone_max_open module parameters to control the maximum number
of open zones of a ZBC device. This parameter is ignored for device
types other than 0x14 (zbc=2 case).

Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com>
Signed-off-by: Douglas Gilbert <dgilbert@interlog.com>
---
 drivers/scsi/scsi_debug.c | 10 ++++++----
 1 file changed, 6 insertions(+), 4 deletions(-)

diff --git a/drivers/scsi/scsi_debug.c b/drivers/scsi/scsi_debug.c
index ec8ab3c4380c..d4022eafe834 100644
--- a/drivers/scsi/scsi_debug.c
+++ b/drivers/scsi/scsi_debug.c
@@ -799,7 +799,7 @@ static int dif_errors;
 /* ZBC global data */
 static bool sdeb_zbc_in_use;		/* true when ptype=TYPE_ZBC [0x14] */
 static const int zbc_zone_size_mb;
-static const int zbc_max_open_zones = DEF_ZBC_MAX_OPEN_ZONES;
+static int sdeb_zbc_max_open = DEF_ZBC_MAX_OPEN_ZONES;
 
 static int submit_queues = DEF_SUBMIT_QUEUES;  /* > 1 for multi-queue (mq) */
 static struct sdebug_queue *sdebug_q_arr;  /* ptr to array of submit queues */
@@ -4699,11 +4699,11 @@ static int sdebug_device_create_zones(struct sdebug_dev_info *devip)
 	devip->zsize_shift = ilog2(devip->zsize);
 	devip->nr_zones = (capacity + devip->zsize - 1) >> devip->zsize_shift;
 
-	/* zbc_max_open_zones can be 0, meaning "no limit" */
-	if (zbc_max_open_zones >= devip->nr_zones - 1)
+	/* sdeb_zbc_max_open can be 0, meaning "no limit" */
+	if (sdeb_zbc_max_open >= devip->nr_zones - 1)
 		devip->max_open = (devip->nr_zones - 1) / 2;
 	else
-		devip->max_open = zbc_max_open_zones;
+		devip->max_open = sdeb_zbc_max_open;
 
 	devip->zstate = kcalloc(sizeof(struct sdeb_zone_state),
 				devip->nr_zones, GFP_KERNEL);
@@ -5452,6 +5452,7 @@ module_param_named(wp, sdebug_wp, bool, S_IRUGO | S_IWUSR);
 module_param_named(write_same_length, sdebug_write_same_length, int,
 		   S_IRUGO | S_IWUSR);
 module_param_named(zbc, sdeb_zbc_model_s, charp, S_IRUGO | S_IWUSR);
+module_param_named(zone_max_open, sdeb_zbc_max_open, int, S_IRUGO);
 
 MODULE_AUTHOR("Eric Youngdale + Douglas Gilbert");
 MODULE_DESCRIPTION("SCSI debug adapter driver");
@@ -5514,6 +5515,7 @@ MODULE_PARM_DESC(vpd_use_hostno, "0 -> dev ids ignore hostno (def=1 -> unique de
 MODULE_PARM_DESC(wp, "Write Protect (def=0)");
 MODULE_PARM_DESC(write_same_length, "Maximum blocks per WRITE SAME cmd (def=0xffff)");
 MODULE_PARM_DESC(zbc, "'none' [0]; 'aware' [1]; 'managed' [2] (def=0). Can have 'host_' prefix");
+MODULE_PARM_DESC(zone_max_open, "Maximum number of open zones; [0] for no limit (def=auto)");
 
 #define SDEBUG_INFO_LEN 256
 static char sdebug_info[SDEBUG_INFO_LEN];
-- 
2.25.0


^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [PATCH v3 13/15] scsi_debug: zone_nr_conv module parameter
  2020-02-20 20:08 [PATCH v3 00/15] host managed ZBC + doublestore Douglas Gilbert
                   ` (11 preceding siblings ...)
  2020-02-20 20:08 ` [PATCH v3 12/15] scsi_debug: zone_max_open module parameter Douglas Gilbert
@ 2020-02-20 20:08 ` Douglas Gilbert
  2020-02-20 20:08 ` [PATCH v3 14/15] scsi_debug: zone_size_mb " Douglas Gilbert
                   ` (2 subsequent siblings)
  15 siblings, 0 replies; 26+ messages in thread
From: Douglas Gilbert @ 2020-02-20 20:08 UTC (permalink / raw)
  To: linux-scsi; +Cc: martin.petersen, jejb, hare, Damien.LeMoal, Damien Le Moal

From: Damien Le Moal <damien.lemoal@wdc.com>

Allow controlling the number of conventional zones of a zbc device with
the new zone_nr_conv module parameter. The default value is 1 and the
specified value must be less than the total number of zones of the
device. This parameter is ignored for device types other than 0x14
(zbc=2 case).

Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com>
Signed-off-by: Douglas Gilbert <dgilbert@interlog.com>
---
 drivers/scsi/scsi_debug.c | 14 +++++++++++++-
 1 file changed, 13 insertions(+), 1 deletion(-)

diff --git a/drivers/scsi/scsi_debug.c b/drivers/scsi/scsi_debug.c
index d4022eafe834..a88f182af5c5 100644
--- a/drivers/scsi/scsi_debug.c
+++ b/drivers/scsi/scsi_debug.c
@@ -155,6 +155,7 @@ static const char *sdebug_version_date = "20190125";
 /* Default parameters for ZBC drives */
 #define DEF_ZBC_ZONE_SIZE_MB	128
 #define DEF_ZBC_MAX_OPEN_ZONES	8
+#define DEF_ZBC_NR_CONV_ZONES	1
 
 #define SDEBUG_LUN_0_VAL 0
 
@@ -292,6 +293,7 @@ struct sdebug_dev_info {
 	sector_t zsize;
 	sector_t zsize_shift;
 	unsigned int nr_zones;
+	unsigned int nr_conv_zones;
 	unsigned int nr_imp_open;
 	unsigned int nr_exp_open;
 	unsigned int nr_closed;
@@ -800,6 +802,7 @@ static int dif_errors;
 static bool sdeb_zbc_in_use;		/* true when ptype=TYPE_ZBC [0x14] */
 static const int zbc_zone_size_mb;
 static int sdeb_zbc_max_open = DEF_ZBC_MAX_OPEN_ZONES;
+static int sdeb_zbc_nr_conv = DEF_ZBC_NR_CONV_ZONES;
 
 static int submit_queues = DEF_SUBMIT_QUEUES;  /* > 1 for multi-queue (mq) */
 static struct sdebug_queue *sdebug_q_arr;  /* ptr to array of submit queues */
@@ -4699,6 +4702,13 @@ static int sdebug_device_create_zones(struct sdebug_dev_info *devip)
 	devip->zsize_shift = ilog2(devip->zsize);
 	devip->nr_zones = (capacity + devip->zsize - 1) >> devip->zsize_shift;
 
+	if (sdeb_zbc_nr_conv >= devip->nr_zones) {
+		pr_err("Number of conventional zones too large\n");
+		return -EINVAL;
+	}
+
+	devip->nr_conv_zones = sdeb_zbc_nr_conv;
+
 	/* sdeb_zbc_max_open can be 0, meaning "no limit" */
 	if (sdeb_zbc_max_open >= devip->nr_zones - 1)
 		devip->max_open = (devip->nr_zones - 1) / 2;
@@ -4715,7 +4725,7 @@ static int sdebug_device_create_zones(struct sdebug_dev_info *devip)
 
 		zsp->z_start = zstart;
 
-		if (i == 0) {
+		if (i < devip->nr_conv_zones) {
 			zsp->z_cond = zbc_not_write_pointer;
 			zsp->z_wp = (sector_t)-1;
 		} else {
@@ -5453,6 +5463,7 @@ module_param_named(write_same_length, sdebug_write_same_length, int,
 		   S_IRUGO | S_IWUSR);
 module_param_named(zbc, sdeb_zbc_model_s, charp, S_IRUGO | S_IWUSR);
 module_param_named(zone_max_open, sdeb_zbc_max_open, int, S_IRUGO);
+module_param_named(zone_nr_conv, sdeb_zbc_nr_conv, int, S_IRUGO);
 
 MODULE_AUTHOR("Eric Youngdale + Douglas Gilbert");
 MODULE_DESCRIPTION("SCSI debug adapter driver");
@@ -5516,6 +5527,7 @@ MODULE_PARM_DESC(wp, "Write Protect (def=0)");
 MODULE_PARM_DESC(write_same_length, "Maximum blocks per WRITE SAME cmd (def=0xffff)");
 MODULE_PARM_DESC(zbc, "'none' [0]; 'aware' [1]; 'managed' [2] (def=0). Can have 'host_' prefix");
 MODULE_PARM_DESC(zone_max_open, "Maximum number of open zones; [0] for no limit (def=auto)");
+MODULE_PARM_DESC(zone_nr_conv, "Number of conventional zones (def=1)");
 
 #define SDEBUG_INFO_LEN 256
 static char sdebug_info[SDEBUG_INFO_LEN];
-- 
2.25.0


^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [PATCH v3 14/15] scsi_debug: zone_size_mb module parameter
  2020-02-20 20:08 [PATCH v3 00/15] host managed ZBC + doublestore Douglas Gilbert
                   ` (12 preceding siblings ...)
  2020-02-20 20:08 ` [PATCH v3 13/15] scsi_debug: zone_nr_conv " Douglas Gilbert
@ 2020-02-20 20:08 ` Douglas Gilbert
  2020-02-20 20:08 ` [PATCH v3 15/15] scsi_debug: bump to version 1.89 Douglas Gilbert
  2020-02-20 20:11 ` [PATCH v3 00/15] scsi_debug: host managed ZBC + doublestore Douglas Gilbert
  15 siblings, 0 replies; 26+ messages in thread
From: Douglas Gilbert @ 2020-02-20 20:08 UTC (permalink / raw)
  To: linux-scsi; +Cc: martin.petersen, jejb, hare, Damien.LeMoal, Damien Le Moal

From: Damien Le Moal <damien.lemoal@wdc.com>

Add the zone_size_mb module parameters to control the zone size of a
ZBC device. If the zone size specified is not a divisor of the device
capacity, the last zone of the device will be created as a smaller
"runt" zone. This parameter is ignored for device types other than
0x14 (zbc=2 case).

Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com>
Signed-off-by: Douglas Gilbert <dgilbert@interlog.com>
---
 drivers/scsi/scsi_debug.c | 12 +++++++-----
 1 file changed, 7 insertions(+), 5 deletions(-)

diff --git a/drivers/scsi/scsi_debug.c b/drivers/scsi/scsi_debug.c
index a88f182af5c5..b02f98a00d28 100644
--- a/drivers/scsi/scsi_debug.c
+++ b/drivers/scsi/scsi_debug.c
@@ -800,7 +800,7 @@ static int dif_errors;
 
 /* ZBC global data */
 static bool sdeb_zbc_in_use;		/* true when ptype=TYPE_ZBC [0x14] */
-static const int zbc_zone_size_mb;
+static int sdeb_zbc_zone_size_mb;
 static int sdeb_zbc_max_open = DEF_ZBC_MAX_OPEN_ZONES;
 static int sdeb_zbc_nr_conv = DEF_ZBC_NR_CONV_ZONES;
 
@@ -4678,10 +4678,10 @@ static int sdebug_device_create_zones(struct sdebug_dev_info *devip)
 	unsigned int i;
 
 	/*
-	 * Set zone size: if zbc_zone_size_mb was not set, figure out a zone
-	 * size allowing for at least 4 zones on the device.
+	 * Set zone size: if sdeb_zbc_zone_size_mb was not set, figure out
+	 * a zone size allowing for at least 4 zones on the device.
 	 */
-	if (!zbc_zone_size_mb) {
+	if (!sdeb_zbc_zone_size_mb) {
 		devip->zsize = (DEF_ZBC_ZONE_SIZE_MB * SZ_1M)
 			>> ilog2(sdebug_sector_size);
 		while (capacity < devip->zsize * 4 && devip->zsize >= 2)
@@ -4691,7 +4691,7 @@ static int sdebug_device_create_zones(struct sdebug_dev_info *devip)
 			return -EINVAL;
 		}
 	} else {
-		devip->zsize = (zbc_zone_size_mb * SZ_1M)
+		devip->zsize = (sdeb_zbc_zone_size_mb * SZ_1M)
 			>> ilog2(sdebug_sector_size);
 		if (devip->zsize >= capacity) {
 			pr_err("Zone size too large for device capacity\n");
@@ -5464,6 +5464,7 @@ module_param_named(write_same_length, sdebug_write_same_length, int,
 module_param_named(zbc, sdeb_zbc_model_s, charp, S_IRUGO | S_IWUSR);
 module_param_named(zone_max_open, sdeb_zbc_max_open, int, S_IRUGO);
 module_param_named(zone_nr_conv, sdeb_zbc_nr_conv, int, S_IRUGO);
+module_param_named(zone_size_mb, sdeb_zbc_zone_size_mb, int, S_IRUGO);
 
 MODULE_AUTHOR("Eric Youngdale + Douglas Gilbert");
 MODULE_DESCRIPTION("SCSI debug adapter driver");
@@ -5528,6 +5529,7 @@ MODULE_PARM_DESC(write_same_length, "Maximum blocks per WRITE SAME cmd (def=0xff
 MODULE_PARM_DESC(zbc, "'none' [0]; 'aware' [1]; 'managed' [2] (def=0). Can have 'host_' prefix");
 MODULE_PARM_DESC(zone_max_open, "Maximum number of open zones; [0] for no limit (def=auto)");
 MODULE_PARM_DESC(zone_nr_conv, "Number of conventional zones (def=1)");
+MODULE_PARM_DESC(zone_size_mb, "Zone size in MiB (def=auto)");
 
 #define SDEBUG_INFO_LEN 256
 static char sdebug_info[SDEBUG_INFO_LEN];
-- 
2.25.0


^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [PATCH v3 15/15] scsi_debug: bump to version 1.89
  2020-02-20 20:08 [PATCH v3 00/15] host managed ZBC + doublestore Douglas Gilbert
                   ` (13 preceding siblings ...)
  2020-02-20 20:08 ` [PATCH v3 14/15] scsi_debug: zone_size_mb " Douglas Gilbert
@ 2020-02-20 20:08 ` Douglas Gilbert
  2020-02-20 20:11 ` [PATCH v3 00/15] scsi_debug: host managed ZBC + doublestore Douglas Gilbert
  15 siblings, 0 replies; 26+ messages in thread
From: Douglas Gilbert @ 2020-02-20 20:08 UTC (permalink / raw)
  To: linux-scsi; +Cc: martin.petersen, jejb, hare, Damien.LeMoal

The scsi_debug driver version is visible in:
   /sys/modules/scsi_debug/version
and can thus be used by user space programs to alter the
features they try to use. Since the doublestore option is
a significant addition, bump the version number which
will appear as 0189 when the above file is read.

Signed-off-by: Douglas Gilbert <dgilbert@interlog.com>
---
 drivers/scsi/scsi_debug.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/scsi/scsi_debug.c b/drivers/scsi/scsi_debug.c
index b02f98a00d28..bf6727be054b 100644
--- a/drivers/scsi/scsi_debug.c
+++ b/drivers/scsi/scsi_debug.c
@@ -7,7 +7,7 @@
  *  anything out of the ordinary is seen.
  * ^^^^^^^^^^^^^^^^^^^^^^^ Original ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  *
- * Copyright (C) 2001 - 2018 Douglas Gilbert
+ * Copyright (C) 2001 - 2020 Douglas Gilbert
  *
  *  For documentation see http://sg.danny.cz/sg/sdebug26.html
  */
@@ -58,8 +58,8 @@
 #include "scsi_logging.h"
 
 /* make sure inq_product_rev string corresponds to this version */
-#define SDEBUG_VERSION "0188"	/* format to fit INQUIRY revision field */
-static const char *sdebug_version_date = "20190125";
+#define SDEBUG_VERSION "0189"	/* format to fit INQUIRY revision field */
+static const char *sdebug_version_date = "20200218";
 
 #define MY_NAME "scsi_debug"
 
-- 
2.25.0


^ permalink raw reply related	[flat|nested] 26+ messages in thread

* Re: [PATCH v3 00/15] scsi_debug: host managed ZBC + doublestore
  2020-02-20 20:08 [PATCH v3 00/15] host managed ZBC + doublestore Douglas Gilbert
                   ` (14 preceding siblings ...)
  2020-02-20 20:08 ` [PATCH v3 15/15] scsi_debug: bump to version 1.89 Douglas Gilbert
@ 2020-02-20 20:11 ` Douglas Gilbert
  15 siblings, 0 replies; 26+ messages in thread
From: Douglas Gilbert @ 2020-02-20 20:11 UTC (permalink / raw)
  To: linux-scsi; +Cc: martin.petersen, jejb, hare, Damien.LeMoal

Missed putting the driver name in the subject ...

On 2020-02-20 3:08 p.m., Douglas Gilbert wrote:
> This patchset is a follow-on from an earlier patchset titled:
>      [RFC v2 0/6] scsi_debug: random doublestore verify
> posted on the linux-scsi list on 20200109.
> 
> The major addition is support for host-managed ZBC devices.
> The bulk of the work in this area was done by Damien Le Moal.
> It allows ZBC devices with a mix of conventional and
> "sequential write required" zones to be specified. These
> follow the same model as direct access devices in this
> driver. Namely, each device has its own metadata (including
> its write pointer(s)) but all (scsi_debug) devices share the
> same user data store (see doublestore below).
> 
> Significantly, this simulation passes the test/zbc_tests.sh
> script in the https://github.com/hgst/libzbc repository.
> 
> The lower numbered patches in this set contain various
> measures to improve the speed and usefulness of this driver.
> It is being used to test the rewrite of the SCSI generic (sg)
> driver which is still underway. Disk to disk copies are the
> test of choice by the author [DG]. Comparing the results
> between (simulated) disks is useless since all scsi_debug
> devices share the same user data store. This limitation was
> circumvented by adding the doublestore parameter. When set,
> doublestore doubles the data store and allocates them in an
> alternating pattern to each scsi_debug device. To enhance
> the comparisons, simulations of VERIFY(10 and 16) commands
> have been added. A further enhancement is to simulate the
> PRE-FETCH command (which does nothing as the data is
> already cached).
> 
> doublestore can also help with ZBC testing. Single threaded
> copy commands like dd (and ddpt) can be used to copy one
> non-empty zone into another empty zone. Multiple-threaded
> copies (e.g. sgp_dd) can do out-of-order WRITEs which
> become a WRITE VIOLATION error in a "sequential write
> required" zone.
> 
> The author [DG] found that precise command duration timing
> gave a false impression of how "bulletproof" the sg driver
> state machines and locking were. The first patch involves
> randomizing the command durations and it did expose various
> issues in the driver under test (sg).
> 
> Since all scsi_debug memory store accesses are done in the
> context of queuecommand() call, the *_irqsave() and
> *_irqrestore() variants of the associated locks have been
> removed.  That could be a problem if queuecommand() can ever
> be called form an interrupt or related context.
> 
> Finally to address the discrepancy between command duration
> times seen by the sg driver compared to what was set with
> this driver's ndelay option, this driver's timekeeping for
> short durations was made more accurate.
> 
> If and when this patchset is accepted, this page will be
> updated:   http://sg.danny.cz/sg/sdebug26.html
> This patchset is against Martin Petersen's git repository
> and its 5.7/scsi-queue branch.
> 
> Changes since v2 (RFC):
>    - add support for host-managed zbc devices with
>      conventional and "sequential write required"
>      zones [DLM]
> 
> Changes since v1 (RFC):
>    - testing with version 1 caused several strange crashes that
>      turned out to be caused by a code trick to read in the
>      data-out buffer but _not_ place it in the big fake_storep
>      array. This approach failed badly when multiple threads
>      were doing verifies at the same time.
>    - replace the code trick with a new do_dout_fetch() function
>    - since the code trick was borrowed from the COMPARE AND
>      WRITE implementation [resp_comp_write()] using
>      do_dout_fetch() fixes the same bug in the existing driver
>      which hasn't been reported (yet).
> 
> Damien Le Moal (3):
>    scsi_debug: zone_max_open module parameter
>    scsi_debug: zone_nr_conv module parameter
>    scsi_debug: zone_size_mb module parameter
> 
> Douglas Gilbert (12):
>    scsi_debug: randomize command completion time
>    scsi_debug: add doublestore option
>    scsi_debug: implement verify(10), add verify(16)
>    scsi_debug: weaken rwlock around ramdisk access
>    scsi_debug: improve command duration calculation
>    scsi_debug: implement pre-fetch commands
>    scsi_debug: expand zbc support
>    scsi_debug: add zone commands
>    scsi_debug: zbc module parameter
>    scsi_debug: re-arrange parameters alphabetically
>    scsi_debug: zbc parameter can be string
>    scsi_debug: bump to version 1.89
> 
>   drivers/scsi/scsi_debug.c | 1593 ++++++++++++++++++++++++++++++++-----
>   1 file changed, 1389 insertions(+), 204 deletions(-)
> 


^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v3 08/15] scsi_debug: add zone commands
  2020-02-20 20:08 ` [PATCH v3 08/15] scsi_debug: add zone commands Douglas Gilbert
@ 2020-02-21  9:44   ` Johannes Thumshirn
  2020-02-23  5:59     ` Douglas Gilbert
  0 siblings, 1 reply; 26+ messages in thread
From: Johannes Thumshirn @ 2020-02-21  9:44 UTC (permalink / raw)
  To: Douglas Gilbert, linux-scsi; +Cc: martin.petersen, jejb, hare, Damien Le Moal

On 20/02/2020 21:09, Douglas Gilbert wrote:
> +enum sdebug_z_cond {	/* enumeration names taken from table 12, zbc2r04 */
> +	zbc_not_write_pointer = 0x0, /* not in table 12; conventional zone */
> +	zc1_empty = 0x1,
> +	zc2_implicit_open,
> +	zc3_explicit_open,
> +	zc4_closed,
> +	zc5_full = 0xe,	/* values per Zone Condition field in Report Zones */
> +	zc6_read_only = 0xd,
> +	zc7_offline = 0xf,
> +};

Haven't checked the rest of scsi_debug, but I'd write these enum values 
in all caps. Furthermore I'd just call them EMPTY, OPEN, READ_ONLY, 
FULL, OFFLINE, etc..

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v3 01/15] scsi_debug: randomize command completion time
  2020-02-20 20:08 ` [PATCH v3 01/15] scsi_debug: randomize command completion time Douglas Gilbert
@ 2020-02-21  9:45   ` Hannes Reinecke
  0 siblings, 0 replies; 26+ messages in thread
From: Hannes Reinecke @ 2020-02-21  9:45 UTC (permalink / raw)
  To: Douglas Gilbert, linux-scsi; +Cc: martin.petersen, jejb, Damien.LeMoal

On 2/20/20 9:08 PM, Douglas Gilbert wrote:
> Add a new command line option (e.g. random=1) and sysfs attribute
> that causes subsequent command completion times to be between the
> current command delay setting and 0. A unformily distributed 32
> bit, kernel provided integer is used for this purpose.
> 
> Since the existing 'delay' whose units are jiffies (typically
> milliseconds) and 'ndelay' (units: nanoseconds) options (and sysfs
> attributes) span a range greater than 32 bits, some scaling is
> required.
> 
> The purpose of this patch is to widen the range of testing cases
> that are visited in long running tests. Put simply: rarely struct
> race conditions are more likely to be found when this facility is
> used.
> 
> The default is the previous case in which all command completions
> were roughly equal to (if not, slightly longer) than the value
> given by the 'delay' or 'ndelay' settings (or their defaults).
> This option's default is equivalent to setting 'random=0' .
> 
> Signed-off-by: Douglas Gilbert <dgilbert@interlog.com>
> ---
>  drivers/scsi/scsi_debug.c | 42 ++++++++++++++++++++++++++++++++++++---
>  1 file changed, 39 insertions(+), 3 deletions(-)
> 
Reviewed-by: Hannes Reinecke <hare@suse.de>

Cheers,

Hannes
-- 
Dr. Hannes Reinecke		           Kernel Storage Architect
hare@suse.de			                  +49 911 74053 688
SUSE Software Solutions Germany GmbH, Maxfeldstr. 5, 90409 Nürnberg
HRB 36809 (AG Nürnberg), GF: Felix Imendörffer

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v3 02/15] scsi_debug: add doublestore option
  2020-02-20 20:08 ` [PATCH v3 02/15] scsi_debug: add doublestore option Douglas Gilbert
@ 2020-02-21  9:46   ` Hannes Reinecke
  2020-02-21 15:37     ` Douglas Gilbert
  0 siblings, 1 reply; 26+ messages in thread
From: Hannes Reinecke @ 2020-02-21  9:46 UTC (permalink / raw)
  To: Douglas Gilbert, linux-scsi; +Cc: martin.petersen, jejb, Damien.LeMoal

On 2/20/20 9:08 PM, Douglas Gilbert wrote:
> The scsi_debug driver has always been restricted to using one
> (or none) ramdisk image for its storage. This means that thousands
> of scsi_debug devices can be created without exhausting the host
> machine's RAM. The downside is that all scsi_debug devices share
> the same ramdisk image. This option doubles the amount of ramdisk
> storage space with the first, third, fifth (etc) created
> scsi_debug devices using the first ramdisk image while the second,
> fourth, sixth (etc) created scsi_debug devices using the second
> ramdisk image.
> 
> The reason for doing this is to check that (partial) disk to disk
> copies based on scsi_debug devices have actually worked properly.
> As an example: assume /dev/sdb and /dev/sg1 are the same
> scsi_debug device, while /dev/sdc and /dev/sg2 are also the
> same scsi_debug device. With doublestore=1 they will have
> different ramdisk images. Then the following pseudocode could
> be executed to check the if sgh_dd copy worked:
>     dd if=/dev/urandom of=/dev/sdb
>     sgh_dd if=/dev/sg1 of=/dev/sg2 [plus option(s) to test]
>     cmp /dev/sdb /dev/sdc
> 
> If the cmp fails then the copy has failed (or some other
> mechanism wrote to /dev/sdb or /dev/sdc in the interim).
> 
> Signed-off-by: Douglas Gilbert <dgilbert@interlog.com>
> ---
>  drivers/scsi/scsi_debug.c | 264 +++++++++++++++++++++++++++-----------
>  1 file changed, 186 insertions(+), 78 deletions(-)
> 
Nice use-case, but really should be documented somewhere.
Otherwise:

Reviewed-by: Hannes Reinecke <hare@suse.de>

Cheers,

Hannes
-- 
Dr. Hannes Reinecke		           Kernel Storage Architect
hare@suse.de			                  +49 911 74053 688
SUSE Software Solutions Germany GmbH, Maxfeldstr. 5, 90409 Nürnberg
HRB 36809 (AG Nürnberg), GF: Felix Imendörffer

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v3 03/15] scsi_debug: implement verify(10), add verify(16)
  2020-02-20 20:08 ` [PATCH v3 03/15] scsi_debug: implement verify(10), add verify(16) Douglas Gilbert
@ 2020-02-21  9:49   ` Hannes Reinecke
  0 siblings, 0 replies; 26+ messages in thread
From: Hannes Reinecke @ 2020-02-21  9:49 UTC (permalink / raw)
  To: Douglas Gilbert, linux-scsi; +Cc: martin.petersen, jejb, Damien.LeMoal

On 2/20/20 9:08 PM, Douglas Gilbert wrote:
> With the addition of the doublestore option, the ability to check
> whether the two different ramdisk images are the same or not
> becomes useful. Prior to this patch VERIFY(10) always returned
> true (i.e. the SCSI GOOD status) without checking. This option
> adds support for BYTCHK equal to 1 and 3 . If the comparison
> fails then a sense key of MISCOMPARE is returned as per the
> T10 standards. Add support for the VERIFY(16).
> 
> Signed-off-by: Douglas Gilbert <dgilbert@interlog.com>
> ---
>  drivers/scsi/scsi_debug.c | 103 +++++++++++++++++++++++++++++++++++---
>  1 file changed, 95 insertions(+), 8 deletions(-)
> 
Reviewed-by: Hannes Reinecke <hare@suse.de>

Cheers,

Hannes
-- 
Dr. Hannes Reinecke		           Kernel Storage Architect
hare@suse.de			                  +49 911 74053 688
SUSE Software Solutions Germany GmbH, Maxfeldstr. 5, 90409 Nürnberg
HRB 36809 (AG Nürnberg), GF: Felix Imendörffer

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v3 09/15] scsi_debug: zbc module parameter
  2020-02-20 20:08 ` [PATCH v3 09/15] scsi_debug: zbc module parameter Douglas Gilbert
@ 2020-02-21  9:49   ` Johannes Thumshirn
  2020-02-21 15:37     ` Douglas Gilbert
  0 siblings, 1 reply; 26+ messages in thread
From: Johannes Thumshirn @ 2020-02-21  9:49 UTC (permalink / raw)
  To: Douglas Gilbert, linux-scsi; +Cc: martin.petersen, jejb, hare, Damien Le Moal

On 20/02/2020 21:09, Douglas Gilbert wrote:
> +MODULE_PARM_DESC(opt_xferlen_exp, "optimal transfer length granularity exponent (def=physblk_exp)");
>   MODULE_PARM_DESC(opts, "1->noise, 2->medium_err, 4->timeout, 8->recovered_err... (def=0)");
>   MODULE_PARM_DESC(physblk_exp, "physical block exponent (def=0)");
> -MODULE_PARM_DESC(opt_xferlen_exp, "optimal transfer length granularity exponent (def=physblk_exp)");

Unrelated change, isn't it?

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v3 02/15] scsi_debug: add doublestore option
  2020-02-21  9:46   ` Hannes Reinecke
@ 2020-02-21 15:37     ` Douglas Gilbert
  0 siblings, 0 replies; 26+ messages in thread
From: Douglas Gilbert @ 2020-02-21 15:37 UTC (permalink / raw)
  To: Hannes Reinecke, linux-scsi; +Cc: martin.petersen, jejb, Damien.LeMoal

On 2020-02-21 4:46 a.m., Hannes Reinecke wrote:
> On 2/20/20 9:08 PM, Douglas Gilbert wrote:
>> The scsi_debug driver has always been restricted to using one
>> (or none) ramdisk image for its storage. This means that thousands
>> of scsi_debug devices can be created without exhausting the host
>> machine's RAM. The downside is that all scsi_debug devices share
>> the same ramdisk image. This option doubles the amount of ramdisk
>> storage space with the first, third, fifth (etc) created
>> scsi_debug devices using the first ramdisk image while the second,
>> fourth, sixth (etc) created scsi_debug devices using the second
>> ramdisk image.
>>
>> The reason for doing this is to check that (partial) disk to disk
>> copies based on scsi_debug devices have actually worked properly.
>> As an example: assume /dev/sdb and /dev/sg1 are the same
>> scsi_debug device, while /dev/sdc and /dev/sg2 are also the
>> same scsi_debug device. With doublestore=1 they will have
>> different ramdisk images. Then the following pseudocode could
>> be executed to check the if sgh_dd copy worked:
>>      dd if=/dev/urandom of=/dev/sdb
>>      sgh_dd if=/dev/sg1 of=/dev/sg2 [plus option(s) to test]
>>      cmp /dev/sdb /dev/sdc
>>
>> If the cmp fails then the copy has failed (or some other
>> mechanism wrote to /dev/sdb or /dev/sdc in the interim).
>>
>> Signed-off-by: Douglas Gilbert <dgilbert@interlog.com>
>> ---
>>   drivers/scsi/scsi_debug.c | 264 +++++++++++++++++++++++++++-----------
>>   1 file changed, 186 insertions(+), 78 deletions(-)
>>
> Nice use-case, but really should be documented somewhere.
> Otherwise:

As stated in the cover letter:
     "If and when this patchset is accepted, this page will be
      updated:   http://sg.danny.cz/sg/sdebug26.html ".
Doing it before then can lead to confusion if the patchset
is not accepted.

One of my scripts for testing the sg driver uses that pattern
with the cmp replaced by:
     sgh_dd --verify if=/dev/sg1 of=/dev/sg2

The sg_dd utility also has a --verify option and I plan to add
one the the ddpt utility. The pseudo SCSI command sequence that
it performs is:
     PRE-FETCH(sg2, IMMED)
     READ(sg1)
     VERIFY(sg2, BYTCHK=1, data from READ)

Comparing this pattern to the standard READ-READ approach:
   - uses less host CPU and ram resources
   - possibly faster in the inequality case
   - faster if the host does the READ-READ sequentially

Doug Gilbert

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v3 09/15] scsi_debug: zbc module parameter
  2020-02-21  9:49   ` Johannes Thumshirn
@ 2020-02-21 15:37     ` Douglas Gilbert
  2020-02-22  8:27       ` Johannes Thumshirn
  0 siblings, 1 reply; 26+ messages in thread
From: Douglas Gilbert @ 2020-02-21 15:37 UTC (permalink / raw)
  To: Johannes Thumshirn, linux-scsi
  Cc: martin.petersen, jejb, hare, Damien Le Moal

On 2020-02-21 4:49 a.m., Johannes Thumshirn wrote:
> On 20/02/2020 21:09, Douglas Gilbert wrote:
>> +MODULE_PARM_DESC(opt_xferlen_exp, "optimal transfer length granularity exponent (def=physblk_exp)");
>>    MODULE_PARM_DESC(opts, "1->noise, 2->medium_err, 4->timeout, 8->recovered_err... (def=0)");
>>    MODULE_PARM_DESC(physblk_exp, "physical block exponent (def=0)");
>> -MODULE_PARM_DESC(opt_xferlen_exp, "optimal transfer length granularity exponent (def=physblk_exp)");
> 
> Unrelated change, isn't it?

Yes, it should be in the re-arrange in alphabetical order patch. Do you
know a one-liner (or two or ...) for pulling a line out of one patch
and placing it in another with git :-)

Doug Gilbert


^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v3 09/15] scsi_debug: zbc module parameter
  2020-02-21 15:37     ` Douglas Gilbert
@ 2020-02-22  8:27       ` Johannes Thumshirn
  0 siblings, 0 replies; 26+ messages in thread
From: Johannes Thumshirn @ 2020-02-22  8:27 UTC (permalink / raw)
  To: dgilbert, linux-scsi; +Cc: martin.petersen, jejb, hare, Damien Le Moal

On 21/02/2020 16:37, Douglas Gilbert wrote:
> On 2020-02-21 4:49 a.m., Johannes Thumshirn wrote:
>> On 20/02/2020 21:09, Douglas Gilbert wrote:
>>> +MODULE_PARM_DESC(opt_xferlen_exp, "optimal transfer length granularity exponent (def=physblk_exp)");
>>>     MODULE_PARM_DESC(opts, "1->noise, 2->medium_err, 4->timeout, 8->recovered_err... (def=0)");
>>>     MODULE_PARM_DESC(physblk_exp, "physical block exponent (def=0)");
>>> -MODULE_PARM_DESC(opt_xferlen_exp, "optimal transfer length granularity exponent (def=physblk_exp)");
>>
>> Unrelated change, isn't it?
> 
> Yes, it should be in the re-arrange in alphabetical order patch. Do you
> know a one-liner (or two or ...) for pulling a line out of one patch
> and placing it in another with git :-)

 From the top of my head (haven't tested it right now) I'd do

git rebase -i $BASE_BRANCH
<mark this patch as 'edit'>
git reset --soft HEAD^
git add -p
<add hunk #1>
git commit
git add -p
<add other hunks>
git commit
git rebase --continue

This /should/ do the trick, but someone please speak up if there's an 
error in there.

Byte,
	Johannes

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v3 08/15] scsi_debug: add zone commands
  2020-02-21  9:44   ` Johannes Thumshirn
@ 2020-02-23  5:59     ` Douglas Gilbert
  0 siblings, 0 replies; 26+ messages in thread
From: Douglas Gilbert @ 2020-02-23  5:59 UTC (permalink / raw)
  To: Johannes Thumshirn, linux-scsi
  Cc: martin.petersen, jejb, hare, Damien Le Moal

On 2020-02-21 4:44 a.m., Johannes Thumshirn wrote:
> On 20/02/2020 21:09, Douglas Gilbert wrote:
>> +enum sdebug_z_cond {	/* enumeration names taken from table 12, zbc2r04 */
>> +	zbc_not_write_pointer = 0x0, /* not in table 12; conventional zone */
>> +	zc1_empty = 0x1,
>> +	zc2_implicit_open,
>> +	zc3_explicit_open,
>> +	zc4_closed,
>> +	zc5_full = 0xe,	/* values per Zone Condition field in Report Zones */
>> +	zc6_read_only = 0xd,
>> +	zc7_offline = 0xf,
>> +};
> 
> Haven't checked the rest of scsi_debug, but I'd write these enum values
> in all caps. Furthermore I'd just call them EMPTY, OPEN, READ_ONLY,
> FULL, OFFLINE, etc..

Yes, they should be in upper case. The ZC1, ZC2, etc are the names of
states in these T10 documents: zbc-r05.pdf and zbc2r04a.pdf . As for
the final suggestion, the C compiler thinks otherwise, so the
enumeration constants need something that is relatively unique, hence
the chosen scheme.

ZC6 and ZC7 are being dropped as they only appear in ZBC-2 which is still
at the draft level while we are trying to implement the original (and
currently only) ZBC standard.

Doug Gilbert



^ permalink raw reply	[flat|nested] 26+ messages in thread

end of thread, other threads:[~2020-02-23  5:59 UTC | newest]

Thread overview: 26+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-02-20 20:08 [PATCH v3 00/15] host managed ZBC + doublestore Douglas Gilbert
2020-02-20 20:08 ` [PATCH v3 01/15] scsi_debug: randomize command completion time Douglas Gilbert
2020-02-21  9:45   ` Hannes Reinecke
2020-02-20 20:08 ` [PATCH v3 02/15] scsi_debug: add doublestore option Douglas Gilbert
2020-02-21  9:46   ` Hannes Reinecke
2020-02-21 15:37     ` Douglas Gilbert
2020-02-20 20:08 ` [PATCH v3 03/15] scsi_debug: implement verify(10), add verify(16) Douglas Gilbert
2020-02-21  9:49   ` Hannes Reinecke
2020-02-20 20:08 ` [PATCH v3 04/15] scsi_debug: weaken rwlock around ramdisk access Douglas Gilbert
2020-02-20 20:08 ` [PATCH v3 05/15] scsi_debug: improve command duration calculation Douglas Gilbert
2020-02-20 20:08 ` [PATCH v3 06/15] scsi_debug: implement pre-fetch commands Douglas Gilbert
2020-02-20 20:08 ` [PATCH v3 07/15] scsi_debug: expand zbc support Douglas Gilbert
2020-02-20 20:08 ` [PATCH v3 08/15] scsi_debug: add zone commands Douglas Gilbert
2020-02-21  9:44   ` Johannes Thumshirn
2020-02-23  5:59     ` Douglas Gilbert
2020-02-20 20:08 ` [PATCH v3 09/15] scsi_debug: zbc module parameter Douglas Gilbert
2020-02-21  9:49   ` Johannes Thumshirn
2020-02-21 15:37     ` Douglas Gilbert
2020-02-22  8:27       ` Johannes Thumshirn
2020-02-20 20:08 ` [PATCH v3 10/15] scsi_debug: re-arrange parameters alphabetically Douglas Gilbert
2020-02-20 20:08 ` [PATCH v3 11/15] scsi_debug: zbc parameter can be string Douglas Gilbert
2020-02-20 20:08 ` [PATCH v3 12/15] scsi_debug: zone_max_open module parameter Douglas Gilbert
2020-02-20 20:08 ` [PATCH v3 13/15] scsi_debug: zone_nr_conv " Douglas Gilbert
2020-02-20 20:08 ` [PATCH v3 14/15] scsi_debug: zone_size_mb " Douglas Gilbert
2020-02-20 20:08 ` [PATCH v3 15/15] scsi_debug: bump to version 1.89 Douglas Gilbert
2020-02-20 20:11 ` [PATCH v3 00/15] scsi_debug: host managed ZBC + doublestore Douglas Gilbert

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.