All of lore.kernel.org
 help / color / mirror / Atom feed
* SCSI discovery update
@ 2022-03-02  5:35 Martin K. Petersen
  2022-03-02  5:35 ` [PATCH 01/14] scsi: mpt3sas: Use cached ATA Information VPD page Martin K. Petersen
                   ` (14 more replies)
  0 siblings, 15 replies; 75+ messages in thread
From: Martin K. Petersen @ 2022-03-02  5:35 UTC (permalink / raw)
  To: linux-scsi

This series addresses several issues in the SCSI device discovery
code:

 - Fetch the VPD header before getting the full VPD page. This removes
   the guesswork from sizing the VPD buffer and fixes problems with
   RAID controllers that wedge when we try to fetch the IDENTIFY
   DEVICE information trailing the ATA Information VPD page.

 - Cache the VPD pages we need instead of fetching them every
   revalidate iteration.

 - Avoid truncating the INQUIRY length for modern devices. This allows
   us to query the version descriptors reported by most contemporary
   drives. These version descriptors are used as an extra heuristic
   for querying protocol features.

 - Additional sanity checking for the reported minimum and optimal I/O
   sizes.

 - Fix reported discard failures by making the configuration a
   two-stage process. Completing full VPD/RSOC discovery before we
   configure discard prevents a small window of error where the wrong
   command and/or wrong limit would briefly be applied.

 - Make the zeroing configuration a two-stage process as well.

 - Implement support for the NDOB flag for both discards and
   zeroing. The "No Data Out Buffer" flag removes the need for a
   zeroed payload to be included with a WRITE SAME(16) command.

 - Remove the superfluous revalidate operation historically required
   by the integrity profile registration. This further reduces the
   commands we send during device discovery.

 - Add additional heuristics for enabling discards on modern devices.
   Specifically, if a device reports that it supports logical block
   provisioning, attempt to query the LBP VPD page.

 - Also query the device VPD pages if a device reports conformance to
   a recent version of the SCSI Block Commands specification.

Thanks to several bug reporters and volunteers this series has been
extensively tested with a much wider variety of USB/UAS devices than I
have access to.

-- 
Martin K. Petersen	Oracle Linux Engineering




^ permalink raw reply	[flat|nested] 75+ messages in thread

* [PATCH 01/14] scsi: mpt3sas: Use cached ATA Information VPD page
  2022-03-02  5:35 SCSI discovery update Martin K. Petersen
@ 2022-03-02  5:35 ` Martin K. Petersen
  2022-03-02  9:47   ` Christoph Hellwig
                     ` (3 more replies)
  2022-03-02  5:35 ` [PATCH 02/14] scsi: core: Query VPD size before getting full page Martin K. Petersen
                   ` (13 subsequent siblings)
  14 siblings, 4 replies; 75+ messages in thread
From: Martin K. Petersen @ 2022-03-02  5:35 UTC (permalink / raw)
  To: linux-scsi; +Cc: Martin K. Petersen, Sreekanth Reddy

We now cache VPD page 0x89 so there is no need to request it from the
hardware. Make mpt3sas use the cached page.

Cc: Sreekanth Reddy <sreekanth.reddy@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
---
 drivers/scsi/mpt3sas/mpt3sas_scsih.c | 18 ++++++++----------
 1 file changed, 8 insertions(+), 10 deletions(-)

diff --git a/drivers/scsi/mpt3sas/mpt3sas_scsih.c b/drivers/scsi/mpt3sas/mpt3sas_scsih.c
index 00792767c620..69cd64a26141 100644
--- a/drivers/scsi/mpt3sas/mpt3sas_scsih.c
+++ b/drivers/scsi/mpt3sas/mpt3sas_scsih.c
@@ -12585,20 +12585,18 @@ scsih_pci_mmio_enabled(struct pci_dev *pdev)
  */
 bool scsih_ncq_prio_supp(struct scsi_device *sdev)
 {
-	unsigned char *buf;
+	struct scsi_vpd *vpd;
 	bool ncq_prio_supp = false;
 
-	if (!scsi_device_supports_vpd(sdev))
-		return ncq_prio_supp;
-
-	buf = kmalloc(SCSI_VPD_PG_LEN, GFP_KERNEL);
-	if (!buf)
-		return ncq_prio_supp;
+	rcu_read_lock();
+	vpd = rcu_dereference(sdev->vpd_pg89);
+	if (!vpd || vpd->len < 214)
+		goto out;
 
-	if (!scsi_get_vpd_page(sdev, 0x89, buf, SCSI_VPD_PG_LEN))
-		ncq_prio_supp = (buf[213] >> 4) & 1;
+	ncq_prio_supp = (vpd->data[213] >> 4) & 1;
+out:
+	rcu_read_unlock();
 
-	kfree(buf);
 	return ncq_prio_supp;
 }
 /*
-- 
2.32.0


^ permalink raw reply related	[flat|nested] 75+ messages in thread

* [PATCH 02/14] scsi: core: Query VPD size before getting full page
  2022-03-02  5:35 SCSI discovery update Martin K. Petersen
  2022-03-02  5:35 ` [PATCH 01/14] scsi: mpt3sas: Use cached ATA Information VPD page Martin K. Petersen
@ 2022-03-02  5:35 ` Martin K. Petersen
  2022-03-02  9:48   ` Christoph Hellwig
                     ` (2 more replies)
  2022-03-02  5:35 ` [PATCH 03/14] scsi: core: Do not truncate INQUIRY data on modern devices Martin K. Petersen
                   ` (12 subsequent siblings)
  14 siblings, 3 replies; 75+ messages in thread
From: Martin K. Petersen @ 2022-03-02  5:35 UTC (permalink / raw)
  To: linux-scsi; +Cc: Martin K. Petersen, Maciej W . Rozycki

We currently default to 255 bytes when fetching VPD pages during discovery.
However, we have had a few devices that are known to wedge if the requested
buffer exceeds a certain size. See commit af73623f5f10 ("[SCSI] sd: Reduce
buffer size for vpd request") which works around one example of this
problem in the SCSI disk driver.

With commit d188b0675b21 ("scsi: core: Add sysfs attributes for VPD pages
0h and 89h") we now risk triggering the same issue in the generic midlayer
code.

The problem with the ATA VPD page in particular is that the SCSI portion of
the page is trailed by 512 bytes of verbatim ATA Identify Device
information.  However, not all controllers actually provide the additional
512 bytes and will lock up if one asks for more than the 64 bytes
containing the SCSI protocol fields.

Instead of picking a new, somewhat arbitrary, number of bytes for the VPD
buffer size, start fetching the 4-byte header for each page. The header
contains the size of the page as far as the device is concerned. We can use
the reported size to specify the correct allocation length when
subsequently fetching the full page.

The header validation is done by a new helper function scsi_get_vpd_size()
and both scsi_get_vpd_page() and scsi_get_vpd_buf() now rely on this to
query the page size.

In addition, scsi_get_vpd_page() is simplified to mirror the logic in
scsi_get_vpd_page(). This involves removing the Supported VPD Pages lookup
prior to attempting to query a page. There does not appear any evidence,
even in the oldest SCSI specs, that this step is required. We already rely
on scsi_get_vpd_page() throughout the stack and this function never
consulted the Supported VPD Pages. Since this has not caused any problems
it should be safe to remove the precondition from scsi_get_vpd_page().

Instrumented runs also revealed that the Supported VPD Pages lookup had
little effect since the device page index often was larger than the
supplied buffer size. As a result, inquiries frequently bypassed the index
check and went through the "If we ran off the end of the buffer, give us
the benefit of the doubt" code path which assumed the page was present
despite not being listed. The revised code takes both the page size
reported by the device as well as the size of the buffer provided by the
scsi_get_vpd_page() caller into account.

Fixes: d188b0675b21 ("scsi: core: Add sysfs attributes for VPD pages 0h and 89h")
Reported-by: Maciej W. Rozycki <macro@orcam.me.uk>
Tested-by: Maciej W. Rozycki <macro@orcam.me.uk>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
---
 drivers/scsi/scsi.c        | 94 +++++++++++++++++++++++++-------------
 include/scsi/scsi_device.h |  5 +-
 2 files changed, 67 insertions(+), 32 deletions(-)

diff --git a/drivers/scsi/scsi.c b/drivers/scsi/scsi.c
index 211aace69c22..af896d0647a7 100644
--- a/drivers/scsi/scsi.c
+++ b/drivers/scsi/scsi.c
@@ -321,6 +321,32 @@ static int scsi_vpd_inquiry(struct scsi_device *sdev, unsigned char *buffer,
 	return get_unaligned_be16(&buffer[2]) + 4;
 }
 
+static int scsi_get_vpd_size(struct scsi_device *sdev, u8 page)
+{
+	unsigned char vpd_header[SCSI_VPD_HEADER_SIZE];
+	int result;
+
+	/*
+	 * Fetch the VPD page header to find out how big the page
+	 * is. This is done to prevent problems on legacy devices
+	 * which can not handle allocation lengths as large as
+	 * potentially requested by the caller.
+	 */
+	result = scsi_vpd_inquiry(sdev, vpd_header, page, sizeof(vpd_header));
+
+	if (result < 0)
+		return 0;
+
+	if (result < SCSI_VPD_HEADER_SIZE) {
+		dev_warn_once(&sdev->sdev_gendev,
+			      "%s: short VPD page 0x%02x length: %d bytes\n",
+			      __func__, page, result);
+		return 0;
+	}
+
+	return result;
+}
+
 /**
  * scsi_get_vpd_page - Get Vital Product Data from a SCSI device
  * @sdev: The device to ask
@@ -330,47 +356,42 @@ static int scsi_vpd_inquiry(struct scsi_device *sdev, unsigned char *buffer,
  *
  * SCSI devices may optionally supply Vital Product Data.  Each 'page'
  * of VPD is defined in the appropriate SCSI document (eg SPC, SBC).
- * If the device supports this VPD page, this routine returns a pointer
- * to a buffer containing the data from that page.  The caller is
- * responsible for calling kfree() on this pointer when it is no longer
- * needed.  If we cannot retrieve the VPD page this routine returns %NULL.
+ * If the device supports this VPD page, this routine fills @buf
+ * with the data from that page and return 0. If the VPD page is not
+ * supported or its content cannot be retrieved, -EINVAL is returned.
  */
 int scsi_get_vpd_page(struct scsi_device *sdev, u8 page, unsigned char *buf,
 		      int buf_len)
 {
-	int i, result;
+	int result, vpd_len;
 
-	if (sdev->skip_vpd_pages)
-		goto fail;
-
-	/* Ask for all the pages supported by this device */
-	result = scsi_vpd_inquiry(sdev, buf, 0, buf_len);
-	if (result < 4)
-		goto fail;
-
-	/* If the user actually wanted this page, we can skip the rest */
-	if (page == 0)
-		return 0;
+	if (!scsi_device_supports_vpd(sdev))
+		return -EINVAL;
 
-	for (i = 4; i < min(result, buf_len); i++)
-		if (buf[i] == page)
-			goto found;
+	vpd_len = scsi_get_vpd_size(sdev, page);
+	if (vpd_len <= 0)
+		return -EINVAL;
 
-	if (i < result && i >= buf_len)
-		/* ran off the end of the buffer, give us benefit of doubt */
-		goto found;
-	/* The device claims it doesn't support the requested page */
-	goto fail;
+	vpd_len = min(vpd_len, buf_len);
 
- found:
-	result = scsi_vpd_inquiry(sdev, buf, page, buf_len);
+retry_pg:
+	/*
+	 * Fetch the actual page. Since the appropriate size was reported
+	 * by the device it is now safe to ask for something bigger.
+	 */
+	memset(buf, 0, buf_len);
+	result = scsi_vpd_inquiry(sdev, buf, page, vpd_len);
 	if (result < 0)
-		goto fail;
+		return -EINVAL;
+	else if (result > vpd_len) {
+		dev_warn_once(&sdev->sdev_gendev,
+			      "%s: VPD page 0x%02x result %d > %d bytes\n",
+			      __func__, page, result, vpd_len);
+		vpd_len = min(result, buf_len);
+		goto retry_pg;
+	}
 
 	return 0;
-
- fail:
-	return -EINVAL;
 }
 EXPORT_SYMBOL_GPL(scsi_get_vpd_page);
 
@@ -384,9 +405,17 @@ EXPORT_SYMBOL_GPL(scsi_get_vpd_page);
 static struct scsi_vpd *scsi_get_vpd_buf(struct scsi_device *sdev, u8 page)
 {
 	struct scsi_vpd *vpd_buf;
-	int vpd_len = SCSI_VPD_PG_LEN, result;
+	int vpd_len, result;
+
+	vpd_len = scsi_get_vpd_size(sdev, page);
+	if (vpd_len <= 0)
+		return NULL;
 
 retry_pg:
+	/*
+	 * Fetch the actual page. Since the appropriate size was reported
+	 * by the device it is now safe to ask for something bigger.
+	 */
 	vpd_buf = kmalloc(sizeof(*vpd_buf) + vpd_len, GFP_KERNEL);
 	if (!vpd_buf)
 		return NULL;
@@ -397,6 +426,9 @@ static struct scsi_vpd *scsi_get_vpd_buf(struct scsi_device *sdev, u8 page)
 		return NULL;
 	}
 	if (result > vpd_len) {
+		dev_warn_once(&sdev->sdev_gendev,
+			      "%s: VPD page 0x%02x result %d > %d bytes\n",
+			      __func__, page, result, vpd_len);
 		vpd_len = result;
 		kfree(vpd_buf);
 		goto retry_pg;
diff --git a/include/scsi/scsi_device.h b/include/scsi/scsi_device.h
index 647c53b26105..144d3a801c8d 100644
--- a/include/scsi/scsi_device.h
+++ b/include/scsi/scsi_device.h
@@ -100,6 +100,10 @@ struct scsi_vpd {
 	unsigned char	data[];
 };
 
+enum scsi_vpd_parameters {
+	SCSI_VPD_HEADER_SIZE = 4,
+};
+
 struct scsi_device {
 	struct Scsi_Host *host;
 	struct request_queue *request_queue;
@@ -141,7 +145,6 @@ struct scsi_device {
 	const char * model;		/* ... after scan; point to static string */
 	const char * rev;		/* ... "nullnullnullnull" before scan */
 
-#define SCSI_VPD_PG_LEN                255
 	struct scsi_vpd __rcu *vpd_pg0;
 	struct scsi_vpd __rcu *vpd_pg83;
 	struct scsi_vpd __rcu *vpd_pg80;
-- 
2.32.0


^ permalink raw reply related	[flat|nested] 75+ messages in thread

* [PATCH 03/14] scsi: core: Do not truncate INQUIRY data on modern devices
  2022-03-02  5:35 SCSI discovery update Martin K. Petersen
  2022-03-02  5:35 ` [PATCH 01/14] scsi: mpt3sas: Use cached ATA Information VPD page Martin K. Petersen
  2022-03-02  5:35 ` [PATCH 02/14] scsi: core: Query VPD size before getting full page Martin K. Petersen
@ 2022-03-02  5:35 ` Martin K. Petersen
  2022-03-02  9:49   ` Christoph Hellwig
                     ` (2 more replies)
  2022-03-02  5:35 ` [PATCH 04/14] scsi: core: Pick suitable allocation length in scsi_report_opcode() Martin K. Petersen
                   ` (11 subsequent siblings)
  14 siblings, 3 replies; 75+ messages in thread
From: Martin K. Petersen @ 2022-03-02  5:35 UTC (permalink / raw)
  To: linux-scsi; +Cc: Martin K. Petersen

Low-level device drivers have had the ability to limit the size of an
INQUIRY for many years. This made sense for a wide variety of legacy
devices. However, we are unnecessarily truncating the INQUIRY response
for many modern devices. This prevents us from consulting fields
beyond the first 36 bytes.

If a device reports that it supports a larger INQUIRY response, and
the device also reports that it implements SPC-4 or newer, allow the
larger INQUIRY to proceed.

Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
---
 drivers/scsi/scsi_scan.c | 12 +++++++++++-
 1 file changed, 11 insertions(+), 1 deletion(-)

diff --git a/drivers/scsi/scsi_scan.c b/drivers/scsi/scsi_scan.c
index f4e6c68ac99e..95bf9a1f35ce 100644
--- a/drivers/scsi/scsi_scan.c
+++ b/drivers/scsi/scsi_scan.c
@@ -728,7 +728,17 @@ static int scsi_probe_lun(struct scsi_device *sdev, unsigned char *inq_result,
 		if (pass == 1) {
 			if (BLIST_INQUIRY_36 & *bflags)
 				next_inquiry_len = 36;
-			else if (sdev->inquiry_len)
+			/*
+			 * LLD specified a maximum sdev->inquiry_len
+			 * but device claims it has more data. Capping
+			 * the length only makes sense for legacy
+			 * devices. If a device supports SPC-4 (2014)
+			 * or newer, assume that it is safe to ask for
+			 * as much as the device says it supports.
+			 */
+			else if (sdev->inquiry_len &&
+				 response_len > sdev->inquiry_len &&
+				 (inq_result[2] & 0x7) < 6) /* SPC-4 */
 				next_inquiry_len = sdev->inquiry_len;
 			else
 				next_inquiry_len = response_len;
-- 
2.32.0


^ permalink raw reply related	[flat|nested] 75+ messages in thread

* [PATCH 04/14] scsi: core: Pick suitable allocation length in scsi_report_opcode()
  2022-03-02  5:35 SCSI discovery update Martin K. Petersen
                   ` (2 preceding siblings ...)
  2022-03-02  5:35 ` [PATCH 03/14] scsi: core: Do not truncate INQUIRY data on modern devices Martin K. Petersen
@ 2022-03-02  5:35 ` Martin K. Petersen
  2022-03-02  9:49   ` Christoph Hellwig
                     ` (3 more replies)
  2022-03-02  5:35 ` [PATCH 05/14] scsi: core: Cache VPD pages b0, b1, b2 Martin K. Petersen
                   ` (10 subsequent siblings)
  14 siblings, 4 replies; 75+ messages in thread
From: Martin K. Petersen @ 2022-03-02  5:35 UTC (permalink / raw)
  To: linux-scsi; +Cc: Martin K. Petersen

Some devices hang when a buffer size larger than expected is passed in
the ALLOCATION LENGTH field. For REPORT SUPPORTED OPERATION CODES we
currently only request a single command descriptor at a time and
therefore the actual size of the command is known ahead of time. Limit
the ALLOCATION LENGTH to the header size plus the command length of
the opcode we are asking about.

Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
---
 drivers/scsi/scsi.c | 17 +++++++++++++----
 1 file changed, 13 insertions(+), 4 deletions(-)

diff --git a/drivers/scsi/scsi.c b/drivers/scsi/scsi.c
index af896d0647a7..c6221a2ca04e 100644
--- a/drivers/scsi/scsi.c
+++ b/drivers/scsi/scsi.c
@@ -508,21 +508,30 @@ int scsi_report_opcode(struct scsi_device *sdev, unsigned char *buffer,
 {
 	unsigned char cmd[16];
 	struct scsi_sense_hdr sshdr;
-	int result;
+	int result, request_len;
 
 	if (sdev->no_report_opcodes || sdev->scsi_level < SCSI_SPC_3)
 		return -EINVAL;
 
+	/* RSOC header + size of command we are asking about */
+	request_len = 4 + COMMAND_SIZE(opcode);
+	if (request_len > len) {
+		dev_warn_once(&sdev->sdev_gendev,
+			      "%s: len %u bytes, opcode 0x%02x needs %u\n",
+			      __func__, len, opcode, request_len);
+		return -EINVAL;
+	}
+
 	memset(cmd, 0, 16);
 	cmd[0] = MAINTENANCE_IN;
 	cmd[1] = MI_REPORT_SUPPORTED_OPERATION_CODES;
 	cmd[2] = 1;		/* One command format */
 	cmd[3] = opcode;
-	put_unaligned_be32(len, &cmd[6]);
+	put_unaligned_be32(request_len, &cmd[6]);
 	memset(buffer, 0, len);
 
-	result = scsi_execute_req(sdev, cmd, DMA_FROM_DEVICE, buffer, len,
-				  &sshdr, 30 * HZ, 3, NULL);
+	result = scsi_execute_req(sdev, cmd, DMA_FROM_DEVICE, buffer,
+				  request_len, &sshdr, 30 * HZ, 3, NULL);
 
 	if (result < 0)
 		return result;
-- 
2.32.0


^ permalink raw reply related	[flat|nested] 75+ messages in thread

* [PATCH 05/14] scsi: core: Cache VPD pages b0, b1, b2
  2022-03-02  5:35 SCSI discovery update Martin K. Petersen
                   ` (3 preceding siblings ...)
  2022-03-02  5:35 ` [PATCH 04/14] scsi: core: Pick suitable allocation length in scsi_report_opcode() Martin K. Petersen
@ 2022-03-02  5:35 ` Martin K. Petersen
  2022-03-02  9:50   ` Christoph Hellwig
                     ` (3 more replies)
  2022-03-02  5:35 ` [PATCH 06/14] scsi: sd: Use cached ATA Information VPD page Martin K. Petersen
                   ` (9 subsequent siblings)
  14 siblings, 4 replies; 75+ messages in thread
From: Martin K. Petersen @ 2022-03-02  5:35 UTC (permalink / raw)
  To: linux-scsi; +Cc: Martin K. Petersen

Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
---
 drivers/scsi/scsi.c        |  6 ++++++
 drivers/scsi/scsi_sysfs.c  | 28 ++++++++++++++++++++++++++++
 include/scsi/scsi_device.h |  4 ++++
 3 files changed, 38 insertions(+)

diff --git a/drivers/scsi/scsi.c b/drivers/scsi/scsi.c
index c6221a2ca04e..524e03aba9ef 100644
--- a/drivers/scsi/scsi.c
+++ b/drivers/scsi/scsi.c
@@ -488,6 +488,12 @@ void scsi_attach_vpd(struct scsi_device *sdev)
 			scsi_update_vpd_page(sdev, 0x83, &sdev->vpd_pg83);
 		if (vpd_buf->data[i] == 0x89)
 			scsi_update_vpd_page(sdev, 0x89, &sdev->vpd_pg89);
+		if (vpd_buf->data[i] == 0xb0)
+			scsi_update_vpd_page(sdev, 0xb0, &sdev->vpd_pgb0);
+		if (vpd_buf->data[i] == 0xb1)
+			scsi_update_vpd_page(sdev, 0xb1, &sdev->vpd_pgb1);
+		if (vpd_buf->data[i] == 0xb2)
+			scsi_update_vpd_page(sdev, 0xb2, &sdev->vpd_pgb2);
 	}
 	kfree(vpd_buf);
 }
diff --git a/drivers/scsi/scsi_sysfs.c b/drivers/scsi/scsi_sysfs.c
index 226a50944c00..1ed45de9970d 100644
--- a/drivers/scsi/scsi_sysfs.c
+++ b/drivers/scsi/scsi_sysfs.c
@@ -448,6 +448,7 @@ static void scsi_device_dev_release_usercontext(struct work_struct *work)
 	struct list_head *this, *tmp;
 	struct scsi_vpd *vpd_pg80 = NULL, *vpd_pg83 = NULL;
 	struct scsi_vpd *vpd_pg0 = NULL, *vpd_pg89 = NULL;
+	struct scsi_vpd *vpd_pgb0 = NULL, *vpd_pgb1 = NULL, *vpd_pgb2 = NULL;
 	unsigned long flags;
 	struct module *mod;
 
@@ -490,6 +491,12 @@ static void scsi_device_dev_release_usercontext(struct work_struct *work)
 				       lockdep_is_held(&sdev->inquiry_mutex));
 	vpd_pg89 = rcu_replace_pointer(sdev->vpd_pg89, vpd_pg89,
 				       lockdep_is_held(&sdev->inquiry_mutex));
+	vpd_pgb0 = rcu_replace_pointer(sdev->vpd_pgb0, vpd_pgb0,
+				       lockdep_is_held(&sdev->inquiry_mutex));
+	vpd_pgb1 = rcu_replace_pointer(sdev->vpd_pgb1, vpd_pgb1,
+				       lockdep_is_held(&sdev->inquiry_mutex));
+	vpd_pgb2 = rcu_replace_pointer(sdev->vpd_pgb2, vpd_pgb2,
+				       lockdep_is_held(&sdev->inquiry_mutex));
 	mutex_unlock(&sdev->inquiry_mutex);
 
 	if (vpd_pg0)
@@ -500,6 +507,12 @@ static void scsi_device_dev_release_usercontext(struct work_struct *work)
 		kfree_rcu(vpd_pg80, rcu);
 	if (vpd_pg89)
 		kfree_rcu(vpd_pg89, rcu);
+	if (vpd_pgb0)
+		kfree_rcu(vpd_pgb0, rcu);
+	if (vpd_pgb1)
+		kfree_rcu(vpd_pgb1, rcu);
+	if (vpd_pgb2)
+		kfree_rcu(vpd_pgb2, rcu);
 	kfree(sdev->inquiry);
 	kfree(sdev);
 
@@ -913,6 +926,9 @@ static struct bin_attribute dev_attr_vpd_##_page = {		\
 sdev_vpd_pg_attr(pg83);
 sdev_vpd_pg_attr(pg80);
 sdev_vpd_pg_attr(pg89);
+sdev_vpd_pg_attr(pgb0);
+sdev_vpd_pg_attr(pgb1);
+sdev_vpd_pg_attr(pgb2);
 sdev_vpd_pg_attr(pg0);
 
 static ssize_t show_inquiry(struct file *filep, struct kobject *kobj,
@@ -1250,6 +1266,15 @@ static umode_t scsi_sdev_bin_attr_is_visible(struct kobject *kobj,
 	if (attr == &dev_attr_vpd_pg89 && !sdev->vpd_pg89)
 		return 0;
 
+	if (attr == &dev_attr_vpd_pgb0 && !sdev->vpd_pgb0)
+		return 0;
+
+	if (attr == &dev_attr_vpd_pgb1 && !sdev->vpd_pgb1)
+		return 0;
+
+	if (attr == &dev_attr_vpd_pgb2 && !sdev->vpd_pgb2)
+		return 0;
+
 	return S_IRUGO;
 }
 
@@ -1296,6 +1321,9 @@ static struct bin_attribute *scsi_sdev_bin_attrs[] = {
 	&dev_attr_vpd_pg83,
 	&dev_attr_vpd_pg80,
 	&dev_attr_vpd_pg89,
+	&dev_attr_vpd_pgb0,
+	&dev_attr_vpd_pgb1,
+	&dev_attr_vpd_pgb2,
 	&dev_attr_inquiry,
 	NULL
 };
diff --git a/include/scsi/scsi_device.h b/include/scsi/scsi_device.h
index 144d3a801c8d..fd6a803ac8cd 100644
--- a/include/scsi/scsi_device.h
+++ b/include/scsi/scsi_device.h
@@ -149,6 +149,10 @@ struct scsi_device {
 	struct scsi_vpd __rcu *vpd_pg83;
 	struct scsi_vpd __rcu *vpd_pg80;
 	struct scsi_vpd __rcu *vpd_pg89;
+	struct scsi_vpd __rcu *vpd_pgb0;
+	struct scsi_vpd __rcu *vpd_pgb1;
+	struct scsi_vpd __rcu *vpd_pgb2;
+
 	struct scsi_target      *sdev_target;
 
 	blist_flags_t		sdev_bflags; /* black/white flags as also found in
-- 
2.32.0


^ permalink raw reply related	[flat|nested] 75+ messages in thread

* [PATCH 06/14] scsi: sd: Use cached ATA Information VPD page
  2022-03-02  5:35 SCSI discovery update Martin K. Petersen
                   ` (4 preceding siblings ...)
  2022-03-02  5:35 ` [PATCH 05/14] scsi: core: Cache VPD pages b0, b1, b2 Martin K. Petersen
@ 2022-03-02  5:35 ` Martin K. Petersen
  2022-03-02  9:50   ` Christoph Hellwig
                     ` (2 more replies)
  2022-03-02  5:35 ` [PATCH 07/14] scsi: sd: Switch to using scsi_device VPD pages Martin K. Petersen
                   ` (8 subsequent siblings)
  14 siblings, 3 replies; 75+ messages in thread
From: Martin K. Petersen @ 2022-03-02  5:35 UTC (permalink / raw)
  To: linux-scsi; +Cc: Martin K. Petersen

Since the ATA Information VPD is now cached at device discovery time it is
no longer necessary to request this page when we configure WRITE SAME.
Instead use the cached information to determine if this disk sits behind a
SCSI-ATA translation layer.

Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
---
 drivers/scsi/sd.c | 8 +++++---
 1 file changed, 5 insertions(+), 3 deletions(-)

diff --git a/drivers/scsi/sd.c b/drivers/scsi/sd.c
index 2f9d160bc8c2..d6aec12aef76 100644
--- a/drivers/scsi/sd.c
+++ b/drivers/scsi/sd.c
@@ -3024,8 +3024,7 @@ static void sd_read_write_same(struct scsi_disk *sdkp, unsigned char *buffer)
 	}
 
 	if (scsi_report_opcode(sdev, buffer, SD_BUF_SIZE, INQUIRY) < 0) {
-		/* too large values might cause issues with arcmsr */
-		int vpd_buf_len = 64;
+		struct scsi_vpd *vpd;
 
 		sdev->no_report_opcodes = 1;
 
@@ -3033,8 +3032,11 @@ static void sd_read_write_same(struct scsi_disk *sdkp, unsigned char *buffer)
 		 * CODES is unsupported and the device has an ATA
 		 * Information VPD page (SAT).
 		 */
-		if (!scsi_get_vpd_page(sdev, 0x89, buffer, vpd_buf_len))
+		rcu_read_lock();
+		vpd = rcu_dereference(sdev->vpd_pg89);
+		if (vpd)
 			sdev->no_write_same = 1;
+		rcu_read_unlock();
 	}
 
 	if (scsi_report_opcode(sdev, buffer, SD_BUF_SIZE, WRITE_SAME_16) == 1)
-- 
2.32.0


^ permalink raw reply related	[flat|nested] 75+ messages in thread

* [PATCH 07/14] scsi: sd: Switch to using scsi_device VPD pages
  2022-03-02  5:35 SCSI discovery update Martin K. Petersen
                   ` (5 preceding siblings ...)
  2022-03-02  5:35 ` [PATCH 06/14] scsi: sd: Use cached ATA Information VPD page Martin K. Petersen
@ 2022-03-02  5:35 ` Martin K. Petersen
  2022-03-02  9:51   ` Christoph Hellwig
                     ` (2 more replies)
  2022-03-02  5:35 ` [PATCH 08/14] scsi: sd: Optimal I/O size should be a multiple of reported granularity Martin K. Petersen
                   ` (7 subsequent siblings)
  14 siblings, 3 replies; 75+ messages in thread
From: Martin K. Petersen @ 2022-03-02  5:35 UTC (permalink / raw)
  To: linux-scsi; +Cc: Martin K. Petersen

Use the VPD pages already provided by the SCSI midlayer. No need to
request them individually in the SCSI disk driver.

Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
---
 drivers/scsi/sd.c | 80 +++++++++++++++++++++++------------------------
 1 file changed, 40 insertions(+), 40 deletions(-)

diff --git a/drivers/scsi/sd.c b/drivers/scsi/sd.c
index d6aec12aef76..84c04691841f 100644
--- a/drivers/scsi/sd.c
+++ b/drivers/scsi/sd.c
@@ -2872,39 +2872,39 @@ static void sd_read_app_tag_own(struct scsi_disk *sdkp, unsigned char *buffer)
 static void sd_read_block_limits(struct scsi_disk *sdkp)
 {
 	unsigned int sector_sz = sdkp->device->sector_size;
-	const int vpd_len = 64;
-	unsigned char *buffer = kmalloc(vpd_len, GFP_KERNEL);
+	struct scsi_vpd *vpd;
 
-	if (!buffer ||
-	    /* Block Limits VPD */
-	    scsi_get_vpd_page(sdkp->device, 0xb0, buffer, vpd_len))
+	rcu_read_lock();
+
+	vpd = rcu_dereference(sdkp->device->vpd_pgb0);
+	if (!vpd || vpd->len < 16)
 		goto out;
 
 	blk_queue_io_min(sdkp->disk->queue,
-			 get_unaligned_be16(&buffer[6]) * sector_sz);
+			 get_unaligned_be16(&vpd->data[6]) * sector_sz);
 
-	sdkp->max_xfer_blocks = get_unaligned_be32(&buffer[8]);
-	sdkp->opt_xfer_blocks = get_unaligned_be32(&buffer[12]);
+	sdkp->max_xfer_blocks = get_unaligned_be32(&vpd->data[8]);
+	sdkp->opt_xfer_blocks = get_unaligned_be32(&vpd->data[12]);
 
-	if (buffer[3] == 0x3c) {
+	if (vpd->len >= 64) {
 		unsigned int lba_count, desc_count;
 
-		sdkp->max_ws_blocks = (u32)get_unaligned_be64(&buffer[36]);
+		sdkp->max_ws_blocks = (u32)get_unaligned_be64(&vpd->data[36]);
 
 		if (!sdkp->lbpme)
 			goto out;
 
-		lba_count = get_unaligned_be32(&buffer[20]);
-		desc_count = get_unaligned_be32(&buffer[24]);
+		lba_count = get_unaligned_be32(&vpd->data[20]);
+		desc_count = get_unaligned_be32(&vpd->data[24]);
 
 		if (lba_count && desc_count)
 			sdkp->max_unmap_blocks = lba_count;
 
-		sdkp->unmap_granularity = get_unaligned_be32(&buffer[28]);
+		sdkp->unmap_granularity = get_unaligned_be32(&vpd->data[28]);
 
-		if (buffer[32] & 0x80)
+		if (vpd->data[32] & 0x80)
 			sdkp->unmap_alignment =
-				get_unaligned_be32(&buffer[32]) & ~(1 << 31);
+				get_unaligned_be32(&vpd->data[32]) & ~(1 << 31);
 
 		if (!sdkp->lbpvpd) { /* LBP VPD page not provided */
 
@@ -2926,7 +2926,7 @@ static void sd_read_block_limits(struct scsi_disk *sdkp)
 	}
 
  out:
-	kfree(buffer);
+	rcu_read_unlock();
 }
 
 /**
@@ -2936,18 +2936,21 @@ static void sd_read_block_limits(struct scsi_disk *sdkp)
 static void sd_read_block_characteristics(struct scsi_disk *sdkp)
 {
 	struct request_queue *q = sdkp->disk->queue;
-	unsigned char *buffer;
+	struct scsi_vpd *vpd;
 	u16 rot;
-	const int vpd_len = 64;
+	u8 zoned;
 
-	buffer = kmalloc(vpd_len, GFP_KERNEL);
+	rcu_read_lock();
+	vpd = rcu_dereference(sdkp->device->vpd_pgb1);
 
-	if (!buffer ||
-	    /* Block Device Characteristics VPD */
-	    scsi_get_vpd_page(sdkp->device, 0xb1, buffer, vpd_len))
-		goto out;
+	if (!vpd || vpd->len < 8) {
+		rcu_read_unlock();
+	        return;
+	}
 
-	rot = get_unaligned_be16(&buffer[4]);
+	rot = get_unaligned_be16(&vpd->data[4]);
+	zoned = (vpd->data[8] >> 4) & 3;
+	rcu_read_unlock();
 
 	if (rot == 1) {
 		blk_queue_flag_set(QUEUE_FLAG_NONROT, q);
@@ -2958,7 +2961,7 @@ static void sd_read_block_characteristics(struct scsi_disk *sdkp)
 		/* Host-managed */
 		blk_queue_set_zoned(sdkp->disk, BLK_ZONED_HM);
 	} else {
-		sdkp->zoned = (buffer[8] >> 4) & 3;
+		sdkp->zoned = zoned;
 		if (sdkp->zoned == 1) {
 			/* Host-aware */
 			blk_queue_set_zoned(sdkp->disk, BLK_ZONED_HA);
@@ -2969,7 +2972,7 @@ static void sd_read_block_characteristics(struct scsi_disk *sdkp)
 	}
 
 	if (!sdkp->first_scan)
-		goto out;
+		return;
 
 	if (blk_queue_is_zoned(q)) {
 		sd_printk(KERN_NOTICE, sdkp, "Host-%s zoned block device\n",
@@ -2982,9 +2985,6 @@ static void sd_read_block_characteristics(struct scsi_disk *sdkp)
 			sd_printk(KERN_NOTICE, sdkp,
 				  "Drive-managed SMR disk\n");
 	}
-
- out:
-	kfree(buffer);
 }
 
 /**
@@ -2993,24 +2993,24 @@ static void sd_read_block_characteristics(struct scsi_disk *sdkp)
  */
 static void sd_read_block_provisioning(struct scsi_disk *sdkp)
 {
-	unsigned char *buffer;
-	const int vpd_len = 8;
+	struct scsi_vpd *vpd;
 
 	if (sdkp->lbpme == 0)
 		return;
 
-	buffer = kmalloc(vpd_len, GFP_KERNEL);
+	rcu_read_lock();
+	vpd = rcu_dereference(sdkp->device->vpd_pgb2);
 
-	if (!buffer || scsi_get_vpd_page(sdkp->device, 0xb2, buffer, vpd_len))
-		goto out;
+	if (!vpd || vpd->len < 8) {
+		rcu_read_unlock();
+		return;
+	}
 
 	sdkp->lbpvpd	= 1;
-	sdkp->lbpu	= (buffer[5] >> 7) & 1;	/* UNMAP */
-	sdkp->lbpws	= (buffer[5] >> 6) & 1;	/* WRITE SAME(16) with UNMAP */
-	sdkp->lbpws10	= (buffer[5] >> 5) & 1;	/* WRITE SAME(10) with UNMAP */
-
- out:
-	kfree(buffer);
+	sdkp->lbpu	= (vpd->data[5] >> 7) & 1; /* UNMAP */
+	sdkp->lbpws	= (vpd->data[5] >> 6) & 1; /* WRITE SAME(16) w/ UNMAP */
+	sdkp->lbpws10	= (vpd->data[5] >> 5) & 1; /* WRITE SAME(10) w/ UNMAP */
+	rcu_read_unlock();
 }
 
 static void sd_read_write_same(struct scsi_disk *sdkp, unsigned char *buffer)
-- 
2.32.0


^ permalink raw reply related	[flat|nested] 75+ messages in thread

* [PATCH 08/14] scsi: sd: Optimal I/O size should be a multiple of reported granularity
  2022-03-02  5:35 SCSI discovery update Martin K. Petersen
                   ` (6 preceding siblings ...)
  2022-03-02  5:35 ` [PATCH 07/14] scsi: sd: Switch to using scsi_device VPD pages Martin K. Petersen
@ 2022-03-02  5:35 ` Martin K. Petersen
  2022-03-02  9:51   ` Christoph Hellwig
  2022-03-03 20:17   ` Bart Van Assche
  2022-03-02  5:35 ` [PATCH 09/14] scsi: sd: Fix discard errors during revalidate Martin K. Petersen
                   ` (6 subsequent siblings)
  14 siblings, 2 replies; 75+ messages in thread
From: Martin K. Petersen @ 2022-03-02  5:35 UTC (permalink / raw)
  To: linux-scsi; +Cc: Martin K. Petersen, Bernhard Sulzer

Commit a83da8a4509d ("scsi: sd: Optimal I/O size should be a multiple of
physical block size") validated the reported optimal I/O size against the
physical block size to overcome problems with devices reporting nonsensical
transfer sizes.

However, some devices claim conformity to older SCSI versions that predate
the physical block size being reported. Other devices do not report a
physical block size at all. We need to be able to validate the optimal I/O
size on those devices as well.

Many devices report an OPTIMAL TRANSFER LENGTH GRANULARITY in the same VPD
page as the OPTIMAL TRANSFER LENGTH. Use this value to validate the optimal
I/O size. Also check that the reported granularity is a multiple of the
physical block size, if supported.

Link: https://lore.kernel.org/r/33fb522e-4f61-1b76-914f-c9e6a3553c9b@gmail.com
Reported-by: Bernhard Sulzer <micraft.b@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
---
 drivers/scsi/sd.c | 45 +++++++++++++++++++++++++++++++++++++++++----
 drivers/scsi/sd.h |  1 +
 2 files changed, 42 insertions(+), 4 deletions(-)

diff --git a/drivers/scsi/sd.c b/drivers/scsi/sd.c
index 84c04691841f..8595c4f2e915 100644
--- a/drivers/scsi/sd.c
+++ b/drivers/scsi/sd.c
@@ -2871,7 +2871,6 @@ static void sd_read_app_tag_own(struct scsi_disk *sdkp, unsigned char *buffer)
  */
 static void sd_read_block_limits(struct scsi_disk *sdkp)
 {
-	unsigned int sector_sz = sdkp->device->sector_size;
 	struct scsi_vpd *vpd;
 
 	rcu_read_lock();
@@ -2880,9 +2879,7 @@ static void sd_read_block_limits(struct scsi_disk *sdkp)
 	if (!vpd || vpd->len < 16)
 		goto out;
 
-	blk_queue_io_min(sdkp->disk->queue,
-			 get_unaligned_be16(&vpd->data[6]) * sector_sz);
-
+	sdkp->min_xfer_blocks = get_unaligned_be16(&vpd->data[6]);
 	sdkp->max_xfer_blocks = get_unaligned_be32(&vpd->data[8]);
 	sdkp->opt_xfer_blocks = get_unaligned_be32(&vpd->data[12]);
 
@@ -3140,6 +3137,29 @@ static void sd_read_cpr(struct scsi_disk *sdkp)
 	kfree(buffer);
 }
 
+static bool sd_validate_min_xfer_size(struct scsi_disk *sdkp)
+{
+	struct scsi_device *sdp = sdkp->device;
+	unsigned int min_xfer_bytes =
+		logical_to_bytes(sdp, sdkp->min_xfer_blocks);
+
+	if (sdkp->min_xfer_blocks == 0)
+		return false;
+
+	if (min_xfer_bytes & (sdkp->physical_block_size - 1)) {
+		sd_first_printk(KERN_WARNING, sdkp,
+				"Preferred minimum I/O size %u bytes not a " \
+				"multiple of physical block size (%u bytes)\n",
+				min_xfer_bytes, sdkp->physical_block_size);
+		sdkp->min_xfer_blocks = 0;
+		return false;
+	}
+
+	sd_first_printk(KERN_INFO, sdkp, "Preferred minimum I/O size %u bytes\n",
+			min_xfer_bytes);
+	return true;
+}
+
 /*
  * Determine the device's preferred I/O size for reads and writes
  * unless the reported value is unreasonably small, large, not a
@@ -3151,6 +3171,8 @@ static bool sd_validate_opt_xfer_size(struct scsi_disk *sdkp,
 	struct scsi_device *sdp = sdkp->device;
 	unsigned int opt_xfer_bytes =
 		logical_to_bytes(sdp, sdkp->opt_xfer_blocks);
+	unsigned int min_xfer_bytes =
+		logical_to_bytes(sdp, sdkp->min_xfer_blocks);
 
 	if (sdkp->opt_xfer_blocks == 0)
 		return false;
@@ -3179,6 +3201,15 @@ static bool sd_validate_opt_xfer_size(struct scsi_disk *sdkp,
 		return false;
 	}
 
+	if (min_xfer_bytes && opt_xfer_bytes & (min_xfer_bytes - 1)) {
+		sd_first_printk(KERN_WARNING, sdkp,
+				"Optimal transfer size %u bytes not a " \
+				"multiple of preferred minimum block " \
+				"size (%u bytes)\n",
+				opt_xfer_bytes, min_xfer_bytes);
+		return false;
+	}
+
 	if (opt_xfer_bytes & (sdkp->physical_block_size - 1)) {
 		sd_first_printk(KERN_WARNING, sdkp,
 				"Optimal transfer size %u bytes not a " \
@@ -3271,6 +3302,12 @@ static int sd_revalidate_disk(struct gendisk *disk)
 	dev_max = min_not_zero(dev_max, sdkp->max_xfer_blocks);
 	q->limits.max_dev_sectors = logical_to_sectors(sdp, dev_max);
 
+	if (sd_validate_min_xfer_size(sdkp))
+		blk_queue_io_min(sdkp->disk->queue,
+				 logical_to_bytes(sdp, sdkp->min_xfer_blocks));
+	else
+		blk_queue_io_min(sdkp->disk->queue, 0);
+
 	if (sd_validate_opt_xfer_size(sdkp, dev_max)) {
 		q->limits.io_opt = logical_to_bytes(sdp, sdkp->opt_xfer_blocks);
 		rw_max = logical_to_sectors(sdp, sdkp->opt_xfer_blocks);
diff --git a/drivers/scsi/sd.h b/drivers/scsi/sd.h
index 2e5932bde43d..f4fbca90e997 100644
--- a/drivers/scsi/sd.h
+++ b/drivers/scsi/sd.h
@@ -91,6 +91,7 @@ struct scsi_disk {
 	atomic_t	openers;
 	sector_t	capacity;	/* size in logical blocks */
 	int		max_retries;
+	u32		min_xfer_blocks;
 	u32		max_xfer_blocks;
 	u32		opt_xfer_blocks;
 	u32		max_ws_blocks;
-- 
2.32.0


^ permalink raw reply related	[flat|nested] 75+ messages in thread

* [PATCH 09/14] scsi: sd: Fix discard errors during revalidate
  2022-03-02  5:35 SCSI discovery update Martin K. Petersen
                   ` (7 preceding siblings ...)
  2022-03-02  5:35 ` [PATCH 08/14] scsi: sd: Optimal I/O size should be a multiple of reported granularity Martin K. Petersen
@ 2022-03-02  5:35 ` Martin K. Petersen
  2022-03-02  9:52   ` Christoph Hellwig
  2022-03-03 21:06   ` Bart Van Assche
  2022-03-02  5:35 ` [PATCH 10/14] scsi: sd: Move WRITE_ZEROES configuration to a separate function Martin K. Petersen
                   ` (5 subsequent siblings)
  14 siblings, 2 replies; 75+ messages in thread
From: Martin K. Petersen @ 2022-03-02  5:35 UTC (permalink / raw)
  To: linux-scsi; +Cc: Martin K. Petersen

We previously switched to SD_LBP_WS16 mode by default if a device
identified itself as being thinly provisioned. This was done for
compatibility with older devices that predate the Logical Block
Provisioning VPD page and the introduction of the separate UNMAP command.

Since WS16 was originally the only option there was no way to explicitly
signal support for it outside of the device reporting LBPME=1. And thus we
switch it on every time we discover a thinly provisioned device in READ
CAPACITY(16).

Some devices, however, report different values for unmap operations
performed with WRITE SAME and ones performed with the UNMAP command. For
instance a device may report that it can unmap 64KB with WRITE SAME but
only 32KB with UNMAP. If the device then reports a preference for UNMAP in
the LBP VPD, there is a tiny window between the WS16 being enabled and the
UNMAP limit being set. And during that window the block layer can issue
64KB discards which, when being prepped by the sd driver, now violate the
UNMAP limit.

To avoid temporarily setting WS16 during revalidate, relocate all the
provisioning mode setting heuristics to sd_config_discard(). Introduce a
new mode, SD_LBP_DEFAULT, which sd_revalidate() will use to trigger the
heuristic to select a suitable mode based on what the device reports.

SD_LBP_DEFAULT can also be triggered in sysfs via the string "default",
should a user decide to change back to the kernel-chosen provisioning mode
after manually overriding the default.

Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
---
 drivers/scsi/sd.c | 124 +++++++++++++++++++++++++++++++++++-----------
 drivers/scsi/sd.h |   9 ++--
 2 files changed, 101 insertions(+), 32 deletions(-)

diff --git a/drivers/scsi/sd.c b/drivers/scsi/sd.c
index 8595c4f2e915..eae5c81ae515 100644
--- a/drivers/scsi/sd.c
+++ b/drivers/scsi/sd.c
@@ -100,7 +100,7 @@ MODULE_ALIAS_SCSI_DEVICE(TYPE_ZBC);
 
 #define SD_MINORS	16
 
-static void sd_config_discard(struct scsi_disk *, unsigned int);
+static void sd_config_discard(struct scsi_disk *, enum sd_lbp_mode);
 static void sd_config_write_same(struct scsi_disk *);
 static int  sd_revalidate_disk(struct gendisk *);
 static void sd_unlock_native_capacity(struct gendisk *disk);
@@ -374,6 +374,7 @@ static DEVICE_ATTR_RO(thin_provisioning);
 
 /* sysfs_match_string() requires dense arrays */
 static const char *lbp_mode[] = {
+	[SD_LBP_DEFAULT]	= "default",
 	[SD_LBP_FULL]		= "full",
 	[SD_LBP_UNMAP]		= "unmap",
 	[SD_LBP_WS16]		= "writesame_16",
@@ -414,6 +415,11 @@ provisioning_mode_store(struct device *dev, struct device_attribute *attr,
 	if (mode < 0)
 		return -EINVAL;
 
+	if (mode == SD_LBP_DEFAULT)
+		sdkp->provisioning_override = false;
+	else
+		sdkp->provisioning_override = true;
+
 	sd_config_discard(sdkp, mode);
 
 	return count;
@@ -812,23 +818,95 @@ static unsigned char sd_setup_protect_cmnd(struct scsi_cmnd *scmd,
 	return protect;
 }
 
-static void sd_config_discard(struct scsi_disk *sdkp, unsigned int mode)
+/*
+ * It took many iterations in T10 to develop a model for thinly provisioned
+ * devices. Linux was an early adopter of the concept of discards, and as a
+ * result of the SCSI spec being a moving target for several years, we have a
+ * set of heuristics in place that allow us to support a wide variety of
+ * devices that predate the final SBC specification.
+ *
+ * These heuristics are triggered by default during device discovery, but the
+ * user can subsequently override what the kernel decided by writing a
+ * particular mode string to a scsi_disk's provisioning_mode node in sysfs.
+ * For devices that predate any of the provisioning knobs in the spec but rely
+ * on zero-detection, it is possible to enable discard through the
+ * "writesame_zero" override.
+ *
+ * For Linux to automatically identify a SCSI disk as being thinly
+ * provisioned, the device must set the LBPME bit in READ CAPACITY(16).
+ *
+ * In the ratified version of T10 SBC-4, a device must also provide a Logical
+ * Block Provisioning VPD page which has three fields that indicate which
+ * provisioning commands the device supports. The device should also implement
+ * the extended version of the Block Limits VPD which is used to indicate any
+ * limitations on the size of unmap operations as well as alignment and
+ * granularity used inside the device.
+ *
+ * If the device supports the Logical Block Provisioning VPD, and sets the
+ * LBPU flag, and reports a MAXIMUM UNMAP LBA COUNT > 0 and a MAXIMUM UNMAP
+ * BLOCK DESCRIPTOR count > 0 in the extended Block Limits VPD, then we will
+ * use UNMAP for discards. Otherwise, if the device set LBPWS in the LBP VPD,
+ * we will use WRITE SAME(16) with the UNMAP bit set for discards.  Otherwise,
+ * if the device sets LBPWS10 in the LBP VPD, then we will use WRITE SAME(10)
+ * with the UNMAP bit set for discards.
+ *
+ * If the device does *not* support the Logical Block Provisioning VPD, we
+ * rely on the extended version of the Block Limits VPD. If that is supported,
+ * and and the device reports a MAXIMUM UNMAP LBA COUNT > 0 and a MAXIMUM
+ * UNMAP BLOCK DESCRIPTOR count > 0, then we will use UNMAP for discards.
+ * Otherwise we will use WRITE SAME(16) with the UNMAP bit set for discards.
+ *
+ * If a device implements the *short* version of the Block Limits VPD or does
+ * not have a Block Limits VPD at all, we default to using WRITE SAME(16) with
+ * the UNMAP bit set for discards.
+ *
+ * The possible values for provisioning_mode in sysfs are:
+ *
+ *   "default"	      - use heuristics outlined above to decide on command
+ *   "full"           - the device does not support discard
+ *   "unmap"          - use the UNMAP command
+ *   "writesame_16"   - use the WRITE SAME(16) command with the UNMAP bit set
+ *   "writesame_10"   - use the WRITE SAME(10) command with the UNMAP bit set
+ *   "writesame_zero" - use WRITE SAME(16) with a zeroed payload, no UNMAP bit
+ *   "disabled"	      - discards disabled due to command failure
+ */
+static void sd_config_discard(struct scsi_disk *sdkp, enum sd_lbp_mode mode)
 {
 	struct request_queue *q = sdkp->disk->queue;
 	unsigned int logical_block_size = sdkp->device->sector_size;
 	unsigned int max_blocks = 0;
 
-	q->limits.discard_alignment =
-		sdkp->unmap_alignment * logical_block_size;
-	q->limits.discard_granularity =
-		max(sdkp->physical_block_size,
-		    sdkp->unmap_granularity * logical_block_size);
-	sdkp->provisioning_mode = mode;
+	if (mode == SD_LBP_DEFAULT && !sdkp->provisioning_override) {
+		if (sdkp->lbpme) { /* Logical Block Provisioning Enabled */
+			if (sdkp->lbpvpd) { /* Logical Block Provisioning VPD */
+				if (sdkp->lbpu && sdkp->max_unmap_blocks)
+					mode = SD_LBP_UNMAP;
+				else if (sdkp->lbpws)
+					mode = SD_LBP_WS16;
+				else if (sdkp->lbpws10)
+					mode = SD_LBP_WS10;
+				else
+					mode = SD_LBP_FULL;
+			} else if (sdkp->lblvpd) { /* Long Block Limits VPD */
+				if (sdkp->max_unmap_blocks)
+					mode = SD_LBP_UNMAP;
+				else
+					mode = SD_LBP_WS16;
+			} else /* LBPME only, no VPDs supported */
+				mode = SD_LBP_WS16;
+		} else
+			mode = SD_LBP_FULL;
+	}
 
 	switch (mode) {
-
+	case SD_LBP_DEFAULT:
 	case SD_LBP_FULL:
 	case SD_LBP_DISABLE:
+		if (mode == SD_LBP_DISABLE)
+			sdkp->provisioning_override = true;
+		sdkp->provisioning_mode = mode;
+		q->limits.discard_alignment = 0;
+		q->limits.discard_granularity = 0;
 		blk_queue_max_discard_sectors(q, 0);
 		blk_queue_flag_clear(QUEUE_FLAG_DISCARD, q);
 		return;
@@ -862,6 +940,12 @@ static void sd_config_discard(struct scsi_disk *sdkp, unsigned int mode)
 		break;
 	}
 
+	sdkp->provisioning_mode = mode;
+	q->limits.discard_alignment =
+		sdkp->unmap_alignment * logical_block_size;
+	q->limits.discard_granularity =
+		max(sdkp->physical_block_size,
+		    sdkp->unmap_granularity * logical_block_size);
 	blk_queue_max_discard_sectors(q, max_blocks * (logical_block_size >> 9));
 	blk_queue_flag_set(QUEUE_FLAG_DISCARD, q);
 }
@@ -2355,8 +2439,6 @@ static int read_capacity_16(struct scsi_disk *sdkp, struct scsi_device *sdp,
 
 		if (buffer[14] & 0x40) /* LBPRZ */
 			sdkp->lbprz = 1;
-
-		sd_config_discard(sdkp, SD_LBP_WS16);
 	}
 
 	sdkp->capacity = lba + 1;
@@ -2886,6 +2968,7 @@ static void sd_read_block_limits(struct scsi_disk *sdkp)
 	if (vpd->len >= 64) {
 		unsigned int lba_count, desc_count;
 
+		sdkp->lblvpd = 1;
 		sdkp->max_ws_blocks = (u32)get_unaligned_be64(&vpd->data[36]);
 
 		if (!sdkp->lbpme)
@@ -2902,24 +2985,6 @@ static void sd_read_block_limits(struct scsi_disk *sdkp)
 		if (vpd->data[32] & 0x80)
 			sdkp->unmap_alignment =
 				get_unaligned_be32(&vpd->data[32]) & ~(1 << 31);
-
-		if (!sdkp->lbpvpd) { /* LBP VPD page not provided */
-
-			if (sdkp->max_unmap_blocks)
-				sd_config_discard(sdkp, SD_LBP_UNMAP);
-			else
-				sd_config_discard(sdkp, SD_LBP_WS16);
-
-		} else {	/* LBP VPD page tells us what to use */
-			if (sdkp->lbpu && sdkp->max_unmap_blocks)
-				sd_config_discard(sdkp, SD_LBP_UNMAP);
-			else if (sdkp->lbpws)
-				sd_config_discard(sdkp, SD_LBP_WS16);
-			else if (sdkp->lbpws10)
-				sd_config_discard(sdkp, SD_LBP_WS10);
-			else
-				sd_config_discard(sdkp, SD_LBP_DISABLE);
-		}
 	}
 
  out:
@@ -3287,6 +3352,7 @@ static int sd_revalidate_disk(struct gendisk *disk)
 		sd_read_write_same(sdkp, buffer);
 		sd_read_security(sdkp, buffer);
 		sd_read_cpr(sdkp);
+		sd_config_discard(sdkp, SD_LBP_DEFAULT);
 	}
 
 	/*
diff --git a/drivers/scsi/sd.h b/drivers/scsi/sd.h
index f4fbca90e997..57a8241163c5 100644
--- a/drivers/scsi/sd.h
+++ b/drivers/scsi/sd.h
@@ -51,12 +51,13 @@ enum {
 	SD_MAX_WS16_BLOCKS = 0x7fffff,
 };
 
-enum {
-	SD_LBP_FULL = 0,	/* Full logical block provisioning */
+enum sd_lbp_mode {
+	SD_LBP_DEFAULT = 0,	/* Select mode based on what device reports */
+	SD_LBP_FULL,		/* Full logical block provisioning */
 	SD_LBP_UNMAP,		/* Use UNMAP command */
 	SD_LBP_WS16,		/* Use WRITE SAME(16) with UNMAP bit */
 	SD_LBP_WS10,		/* Use WRITE SAME(10) with UNMAP bit */
-	SD_LBP_ZERO,		/* Use WRITE SAME(10) with zero payload */
+	SD_LBP_ZERO,		/* Use WRITE SAME(10) with zeroed payload */
 	SD_LBP_DISABLE,		/* Discard disabled due to failed cmd */
 };
 
@@ -108,6 +109,8 @@ struct scsi_disk {
 	u8		provisioning_mode;
 	u8		zeroing_mode;
 	u8		nr_actuators;		/* Number of actuators */
+	bool		lblvpd;
+	bool		provisioning_override;
 	unsigned	ATO : 1;	/* state of disk ATO bit */
 	unsigned	cache_override : 1; /* temp override of WCE,RCD */
 	unsigned	WCE : 1;	/* state of disk WCE bit */
-- 
2.32.0


^ permalink raw reply related	[flat|nested] 75+ messages in thread

* [PATCH 10/14] scsi: sd: Move WRITE_ZEROES configuration to a separate function
  2022-03-02  5:35 SCSI discovery update Martin K. Petersen
                   ` (8 preceding siblings ...)
  2022-03-02  5:35 ` [PATCH 09/14] scsi: sd: Fix discard errors during revalidate Martin K. Petersen
@ 2022-03-02  5:35 ` Martin K. Petersen
  2022-03-02  9:53   ` Christoph Hellwig
                     ` (2 more replies)
  2022-03-02  5:35 ` [PATCH 11/14] scsi: sd: Implement support for NDOB flag in WRITE SAME(16) Martin K. Petersen
                   ` (4 subsequent siblings)
  14 siblings, 3 replies; 75+ messages in thread
From: Martin K. Petersen @ 2022-03-02  5:35 UTC (permalink / raw)
  To: linux-scsi; +Cc: Martin K. Petersen

In preparation for adding support for the WRITE SAME(16) NDOB flag,
move configuration of the WRITE_ZEROES operation to a separate
function. This is done to facilitate fetching all VPD pages before
choosing the appropriate zeroing method for a given device.

The deferred configuration also allows us to mirror the discard
behavior and permit the user to revert a device to the kernel default
configuration by echoing "default" to the sysfs file.

Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
---
 drivers/scsi/sd.c | 56 +++++++++++++++++++++++++++++++++--------------
 drivers/scsi/sd.h |  7 ++++--
 2 files changed, 44 insertions(+), 19 deletions(-)

diff --git a/drivers/scsi/sd.c b/drivers/scsi/sd.c
index eae5c81ae515..ee4f4aea5f0f 100644
--- a/drivers/scsi/sd.c
+++ b/drivers/scsi/sd.c
@@ -101,6 +101,7 @@ MODULE_ALIAS_SCSI_DEVICE(TYPE_ZBC);
 #define SD_MINORS	16
 
 static void sd_config_discard(struct scsi_disk *, enum sd_lbp_mode);
+static void sd_config_write_zeroes(struct scsi_disk *, enum sd_zeroing_mode);
 static void sd_config_write_same(struct scsi_disk *);
 static int  sd_revalidate_disk(struct gendisk *);
 static void sd_unlock_native_capacity(struct gendisk *disk);
@@ -428,10 +429,12 @@ static DEVICE_ATTR_RW(provisioning_mode);
 
 /* sysfs_match_string() requires dense arrays */
 static const char *zeroing_mode[] = {
+	[SD_ZERO_DEFAULT]	= "default",
 	[SD_ZERO_WRITE]		= "write",
 	[SD_ZERO_WS]		= "writesame",
 	[SD_ZERO_WS16_UNMAP]	= "writesame_16_unmap",
 	[SD_ZERO_WS10_UNMAP]	= "writesame_10_unmap",
+	[SD_ZERO_DISABLE]	= "disabled",
 };
 
 static ssize_t
@@ -457,7 +460,12 @@ zeroing_mode_store(struct device *dev, struct device_attribute *attr,
 	if (mode < 0)
 		return -EINVAL;
 
-	sdkp->zeroing_mode = mode;
+	if (mode == SD_ZERO_DEFAULT)
+		sdkp->zeroing_override = false;
+	else
+		sdkp->zeroing_override = true;
+
+	sd_config_write_zeroes(sdkp, mode);
 
 	return count;
 }
@@ -1049,6 +1057,31 @@ static blk_status_t sd_setup_write_same10_cmnd(struct scsi_cmnd *cmd,
 	return scsi_alloc_sgtables(cmd);
 }
 
+static void sd_config_write_zeroes(struct scsi_disk *sdkp,
+				   enum sd_zeroing_mode mode)
+{
+	struct request_queue *q = sdkp->disk->queue;
+	unsigned int logical_block_size = sdkp->device->sector_size;
+
+	if (mode == SD_ZERO_DEFAULT && !sdkp->zeroing_override) {
+		if (sdkp->lbprz && sdkp->lbpws)
+			mode = SD_ZERO_WS16_UNMAP;
+		else if (sdkp->lbprz && sdkp->lbpws10)
+			mode = SD_ZERO_WS10_UNMAP;
+		else if (sdkp->max_ws_blocks)
+			mode = SD_ZERO_WS;
+		else
+			mode = SD_ZERO_WRITE;
+	}
+
+	if (mode == SD_ZERO_DISABLE)
+		sdkp->zeroing_override = true;
+
+	sdkp->zeroing_mode = mode;
+	blk_queue_max_write_zeroes_sectors(q, sdkp->max_ws_blocks *
+					   (logical_block_size >> 9));
+}
+
 static blk_status_t sd_setup_write_zeroes_cmnd(struct scsi_cmnd *cmd)
 {
 	struct request *rq = scsi_cmd_to_rq(cmd);
@@ -1079,12 +1112,11 @@ static blk_status_t sd_setup_write_zeroes_cmnd(struct scsi_cmnd *cmd)
 
 static void sd_config_write_same(struct scsi_disk *sdkp)
 {
-	struct request_queue *q = sdkp->disk->queue;
 	unsigned int logical_block_size = sdkp->device->sector_size;
 
 	if (sdkp->device->no_write_same) {
 		sdkp->max_ws_blocks = 0;
-		goto out;
+		return;
 	}
 
 	/* Some devices can not handle block counts above 0xffff despite
@@ -1103,15 +1135,6 @@ static void sd_config_write_same(struct scsi_disk *sdkp)
 		sdkp->max_ws_blocks = 0;
 	}
 
-	if (sdkp->lbprz && sdkp->lbpws)
-		sdkp->zeroing_mode = SD_ZERO_WS16_UNMAP;
-	else if (sdkp->lbprz && sdkp->lbpws10)
-		sdkp->zeroing_mode = SD_ZERO_WS10_UNMAP;
-	else if (sdkp->max_ws_blocks)
-		sdkp->zeroing_mode = SD_ZERO_WS;
-	else
-		sdkp->zeroing_mode = SD_ZERO_WRITE;
-
 	if (sdkp->max_ws_blocks &&
 	    sdkp->physical_block_size > logical_block_size) {
 		/*
@@ -1131,10 +1154,6 @@ static void sd_config_write_same(struct scsi_disk *sdkp)
 				   bytes_to_logical(sdkp->device,
 						    sdkp->physical_block_size));
 	}
-
-out:
-	blk_queue_max_write_zeroes_sectors(q, sdkp->max_ws_blocks *
-					 (logical_block_size >> 9));
 }
 
 static blk_status_t sd_setup_flush_cmnd(struct scsi_cmnd *cmd)
@@ -2126,6 +2145,8 @@ static int sd_done(struct scsi_cmnd *SCpnt)
 			case WRITE_SAME:
 				if (SCpnt->cmnd[1] & 8) { /* UNMAP */
 					sd_config_discard(sdkp, SD_LBP_DISABLE);
+					sd_config_write_zeroes(sdkp,
+							       SD_ZERO_DISABLE);
 				} else {
 					sdkp->device->no_write_same = 1;
 					sd_config_write_same(sdkp);
@@ -3352,7 +3373,9 @@ static int sd_revalidate_disk(struct gendisk *disk)
 		sd_read_write_same(sdkp, buffer);
 		sd_read_security(sdkp, buffer);
 		sd_read_cpr(sdkp);
+		sd_config_write_same(sdkp);
 		sd_config_discard(sdkp, SD_LBP_DEFAULT);
+		sd_config_write_zeroes(sdkp, SD_ZERO_DEFAULT);
 	}
 
 	/*
@@ -3398,7 +3421,6 @@ static int sd_revalidate_disk(struct gendisk *disk)
 	sdkp->first_scan = 0;
 
 	set_capacity_and_notify(disk, logical_to_sectors(sdp, sdkp->capacity));
-	sd_config_write_same(sdkp);
 	kfree(buffer);
 
 	/*
diff --git a/drivers/scsi/sd.h b/drivers/scsi/sd.h
index 57a8241163c5..e0ee4215a3b4 100644
--- a/drivers/scsi/sd.h
+++ b/drivers/scsi/sd.h
@@ -61,11 +61,13 @@ enum sd_lbp_mode {
 	SD_LBP_DISABLE,		/* Discard disabled due to failed cmd */
 };
 
-enum {
-	SD_ZERO_WRITE = 0,	/* Use WRITE(10/16) command */
+enum sd_zeroing_mode {
+	SD_ZERO_DEFAULT = 0,	/* Default mode based on what device reports */
+	SD_ZERO_WRITE,		/* Use WRITE(10/16) command */
 	SD_ZERO_WS,		/* Use WRITE SAME(10/16) command */
 	SD_ZERO_WS16_UNMAP,	/* Use WRITE SAME(16) with UNMAP */
 	SD_ZERO_WS10_UNMAP,	/* Use WRITE SAME(10) with UNMAP */
+	SD_ZERO_DISABLE,	/* Write Zeroes disabled due to failed cmd */
 };
 
 struct scsi_disk {
@@ -111,6 +113,7 @@ struct scsi_disk {
 	u8		nr_actuators;		/* Number of actuators */
 	bool		lblvpd;
 	bool		provisioning_override;
+	bool		zeroing_override;
 	unsigned	ATO : 1;	/* state of disk ATO bit */
 	unsigned	cache_override : 1; /* temp override of WCE,RCD */
 	unsigned	WCE : 1;	/* state of disk WCE bit */
-- 
2.32.0


^ permalink raw reply related	[flat|nested] 75+ messages in thread

* [PATCH 11/14] scsi: sd: Implement support for NDOB flag in WRITE SAME(16)
  2022-03-02  5:35 SCSI discovery update Martin K. Petersen
                   ` (9 preceding siblings ...)
  2022-03-02  5:35 ` [PATCH 10/14] scsi: sd: Move WRITE_ZEROES configuration to a separate function Martin K. Petersen
@ 2022-03-02  5:35 ` Martin K. Petersen
  2022-03-02  9:54   ` Christoph Hellwig
                     ` (2 more replies)
  2022-03-02  5:35 ` [PATCH 12/14] scsi: sd: sd_read_cpr() requires VPD pages Martin K. Petersen
                   ` (3 subsequent siblings)
  14 siblings, 3 replies; 75+ messages in thread
From: Martin K. Petersen @ 2022-03-02  5:35 UTC (permalink / raw)
  To: linux-scsi; +Cc: Martin K. Petersen

The NDOB flag removes the need for a zeroed logical block in the data-out
buffer when using WRITE SAME(16) to zero block ranges.  Implement support
for NDOB in the SCSI disk driver to mirror WRITE ZEROES in NVMe.

The only way to detect whether a device supports NDOB is through
REPORT SUPPORTED OPERATION CODES. Since we can't safely send that
command to all devices we only attempt this if the device implements
the Block Provisioning VPD page and sets the LBPWS flag.

If we issue a WRITE SAME(16) we check whether NDOB is set for the
device in question. If so we do not allocate a zeroed page from the
pool and simply issue the command with a zero-length payload.

Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
---
 drivers/scsi/scsi_trace.c |  3 +-
 drivers/scsi/sd.c         | 93 ++++++++++++++++++++++++++-------------
 drivers/scsi/sd.h         |  4 ++
 3 files changed, 69 insertions(+), 31 deletions(-)

diff --git a/drivers/scsi/scsi_trace.c b/drivers/scsi/scsi_trace.c
index 41a950075913..1d1f25f689ef 100644
--- a/drivers/scsi/scsi_trace.c
+++ b/drivers/scsi/scsi_trace.c
@@ -83,7 +83,8 @@ scsi_trace_rw16(struct trace_seq *p, unsigned char *cdb, int len)
 			 cdb[1] >> 5);
 
 	if (cdb[0] == WRITE_SAME_16)
-		trace_seq_printf(p, " unmap=%u", cdb[1] >> 3 & 1);
+		trace_seq_printf(p, " unmap=%u ndob=%u", cdb[1] >> 3 & 1,
+				 cdb[1] & 1);
 
 	trace_seq_putc(p, 0);
 
diff --git a/drivers/scsi/sd.c b/drivers/scsi/sd.c
index ee4f4aea5f0f..2c2e86738578 100644
--- a/drivers/scsi/sd.c
+++ b/drivers/scsi/sd.c
@@ -379,6 +379,7 @@ static const char *lbp_mode[] = {
 	[SD_LBP_FULL]		= "full",
 	[SD_LBP_UNMAP]		= "unmap",
 	[SD_LBP_WS16]		= "writesame_16",
+	[SD_LBP_WS16_NDOB]	= "writesame_16_ndob",
 	[SD_LBP_WS10]		= "writesame_10",
 	[SD_LBP_ZERO]		= "writesame_zero",
 	[SD_LBP_DISABLE]	= "disabled",
@@ -429,12 +430,14 @@ static DEVICE_ATTR_RW(provisioning_mode);
 
 /* sysfs_match_string() requires dense arrays */
 static const char *zeroing_mode[] = {
-	[SD_ZERO_DEFAULT]	= "default",
-	[SD_ZERO_WRITE]		= "write",
-	[SD_ZERO_WS]		= "writesame",
-	[SD_ZERO_WS16_UNMAP]	= "writesame_16_unmap",
-	[SD_ZERO_WS10_UNMAP]	= "writesame_10_unmap",
-	[SD_ZERO_DISABLE]	= "disabled",
+	[SD_ZERO_DEFAULT]		= "default",
+	[SD_ZERO_WRITE]			= "write",
+	[SD_ZERO_WS]			= "writesame",
+	[SD_ZERO_WS16_UNMAP_NDOB]	= "writesame_16_unmap_ndob",
+	[SD_ZERO_WS16_UNMAP]		= "writesame_16_unmap",
+	[SD_ZERO_WS10_UNMAP]		= "writesame_10_unmap",
+	[SD_ZERO_WS16_NDOB]		= "writesame_16_ndob",
+	[SD_ZERO_DISABLE]		= "disabled",
 };
 
 static ssize_t
@@ -870,13 +873,14 @@ static unsigned char sd_setup_protect_cmnd(struct scsi_cmnd *scmd,
  *
  * The possible values for provisioning_mode in sysfs are:
  *
- *   "default"	      - use heuristics outlined above to decide on command
- *   "full"           - the device does not support discard
- *   "unmap"          - use the UNMAP command
- *   "writesame_16"   - use the WRITE SAME(16) command with the UNMAP bit set
- *   "writesame_10"   - use the WRITE SAME(10) command with the UNMAP bit set
- *   "writesame_zero" - use WRITE SAME(16) with a zeroed payload, no UNMAP bit
- *   "disabled"	      - discards disabled due to command failure
+ *   "default"		 - use heuristics outlined above to decide on command
+ *   "full"		 - the device does not support discard
+ *   "unmap"		 - use the UNMAP command
+ *   "writesame_16"	 - use the WRITE SAME(16) command with the UNMAP bit set
+ *   "writesame_16_ndob" - use WRITE SAME(16) with UNMAP and NDOB bits set
+ *   "writesame_10"	 - use the WRITE SAME(10) command with the UNMAP bit set
+ *   "writesame_zero"	 - use WRITE SAME(16) with a zeroed payload, no UNMAP bit
+ *   "disabled"		 - discards disabled due to command failure
  */
 static void sd_config_discard(struct scsi_disk *sdkp, enum sd_lbp_mode mode)
 {
@@ -889,9 +893,12 @@ static void sd_config_discard(struct scsi_disk *sdkp, enum sd_lbp_mode mode)
 			if (sdkp->lbpvpd) { /* Logical Block Provisioning VPD */
 				if (sdkp->lbpu && sdkp->max_unmap_blocks)
 					mode = SD_LBP_UNMAP;
-				else if (sdkp->lbpws)
-					mode = SD_LBP_WS16;
-				else if (sdkp->lbpws10)
+				else if (sdkp->lbpws) {
+					if (sdkp->ndob)
+						mode = SD_LBP_WS16_NDOB;
+					else
+						mode = SD_LBP_WS16;
+				} else if (sdkp->lbpws10)
 					mode = SD_LBP_WS10;
 				else
 					mode = SD_LBP_FULL;
@@ -925,6 +932,7 @@ static void sd_config_discard(struct scsi_disk *sdkp, enum sd_lbp_mode mode)
 		break;
 
 	case SD_LBP_WS16:
+	case SD_LBP_WS16_NDOB:
 		if (sdkp->device->unmap_limit_for_ws)
 			max_blocks = sdkp->max_unmap_blocks;
 		else
@@ -994,7 +1002,7 @@ static blk_status_t sd_setup_unmap_cmnd(struct scsi_cmnd *cmd)
 }
 
 static blk_status_t sd_setup_write_same16_cmnd(struct scsi_cmnd *cmd,
-		bool unmap)
+		bool unmap, bool ndob)
 {
 	struct scsi_device *sdp = cmd->device;
 	struct request *rq = scsi_cmd_to_rq(cmd);
@@ -1003,23 +1011,32 @@ static blk_status_t sd_setup_write_same16_cmnd(struct scsi_cmnd *cmd,
 	u32 nr_blocks = sectors_to_logical(sdp, blk_rq_sectors(rq));
 	u32 data_len = sdp->sector_size;
 
-	rq->special_vec.bv_page = mempool_alloc(sd_page_pool, GFP_ATOMIC);
-	if (!rq->special_vec.bv_page)
-		return BLK_STS_RESOURCE;
-	clear_highpage(rq->special_vec.bv_page);
-	rq->special_vec.bv_offset = 0;
-	rq->special_vec.bv_len = data_len;
+	if (ndob) {
+		rq->special_vec.bv_page = NULL;
+		rq->special_vec.bv_len = 0;
+	} else {
+		rq->special_vec.bv_page =
+			mempool_alloc(sd_page_pool, GFP_ATOMIC);
+		if (!rq->special_vec.bv_page)
+			return BLK_STS_RESOURCE;
+		clear_highpage(rq->special_vec.bv_page);
+		rq->special_vec.bv_len = data_len;
+	}
+
 	rq->rq_flags |= RQF_SPECIAL_PAYLOAD;
+	rq->special_vec.bv_offset = 0;
 
 	cmd->cmd_len = 16;
 	cmd->cmnd[0] = WRITE_SAME_16;
 	if (unmap)
 		cmd->cmnd[1] = 0x8; /* UNMAP */
+	if (ndob)
+		cmd->cmnd[1] |= 0x1; /* NDOB */
 	put_unaligned_be64(lba, &cmd->cmnd[2]);
 	put_unaligned_be32(nr_blocks, &cmd->cmnd[10]);
 
 	cmd->allowed = sdkp->max_retries;
-	cmd->transfersize = data_len;
+	cmd->transfersize = rq->special_vec.bv_len;
 	rq->timeout = unmap ? SD_TIMEOUT : SD_WRITE_SAME_TIMEOUT;
 
 	return scsi_alloc_sgtables(cmd);
@@ -1064,10 +1081,14 @@ static void sd_config_write_zeroes(struct scsi_disk *sdkp,
 	unsigned int logical_block_size = sdkp->device->sector_size;
 
 	if (mode == SD_ZERO_DEFAULT && !sdkp->zeroing_override) {
-		if (sdkp->lbprz && sdkp->lbpws)
+		if (sdkp->lbprz && sdkp->lbpws && sdkp->ndob)
+			mode = SD_ZERO_WS16_UNMAP_NDOB;
+		else if (sdkp->lbprz && sdkp->lbpws)
 			mode = SD_ZERO_WS16_UNMAP;
 		else if (sdkp->lbprz && sdkp->lbpws10)
 			mode = SD_ZERO_WS10_UNMAP;
+		else if (sdkp->max_ws_blocks && sdkp->ndob)
+			mode = SD_ZERO_WS16_NDOB;
 		else if (sdkp->max_ws_blocks)
 			mode = SD_ZERO_WS;
 		else
@@ -1092,8 +1113,10 @@ static blk_status_t sd_setup_write_zeroes_cmnd(struct scsi_cmnd *cmd)
 
 	if (!(rq->cmd_flags & REQ_NOUNMAP)) {
 		switch (sdkp->zeroing_mode) {
+		case SD_ZERO_WS16_UNMAP_NDOB:
+			return sd_setup_write_same16_cmnd(cmd, true, true);
 		case SD_ZERO_WS16_UNMAP:
-			return sd_setup_write_same16_cmnd(cmd, true);
+			return sd_setup_write_same16_cmnd(cmd, true, false);
 		case SD_ZERO_WS10_UNMAP:
 			return sd_setup_write_same10_cmnd(cmd, true);
 		}
@@ -1104,8 +1127,12 @@ static blk_status_t sd_setup_write_zeroes_cmnd(struct scsi_cmnd *cmd)
 		return BLK_STS_TARGET;
 	}
 
-	if (sdkp->ws16 || lba > 0xffffffff || nr_blocks > 0xffff)
-		return sd_setup_write_same16_cmnd(cmd, false);
+	if (sdkp->ws16 || lba > 0xffffffff || nr_blocks > 0xffff) {
+		if (sdkp->zeroing_mode == SD_ZERO_WS16_NDOB)
+			return sd_setup_write_same16_cmnd(cmd, false, true);
+		else
+			return sd_setup_write_same16_cmnd(cmd, false, false);
+	}
 
 	return sd_setup_write_same10_cmnd(cmd, false);
 }
@@ -1372,7 +1399,9 @@ static blk_status_t sd_init_command(struct scsi_cmnd *cmd)
 		case SD_LBP_UNMAP:
 			return sd_setup_unmap_cmnd(cmd);
 		case SD_LBP_WS16:
-			return sd_setup_write_same16_cmnd(cmd, true);
+			return sd_setup_write_same16_cmnd(cmd, true, false);
+		case SD_LBP_WS16_NDOB:
+			return sd_setup_write_same16_cmnd(cmd, true, true);
 		case SD_LBP_WS10:
 			return sd_setup_write_same10_cmnd(cmd, true);
 		case SD_LBP_ZERO:
@@ -3122,9 +3151,13 @@ static void sd_read_write_same(struct scsi_disk *sdkp, unsigned char *buffer)
 		rcu_read_unlock();
 	}
 
-	if (scsi_report_opcode(sdev, buffer, SD_BUF_SIZE, WRITE_SAME_16) == 1)
+	if (scsi_report_opcode(sdev, buffer, SD_BUF_SIZE, WRITE_SAME_16) == 1) {
 		sdkp->ws16 = 1;
 
+		if (get_unaligned_be16(&buffer[2]) >= 2)
+			sdkp->ndob = buffer[5] & 1;
+	}
+
 	if (scsi_report_opcode(sdev, buffer, SD_BUF_SIZE, WRITE_SAME) == 1)
 		sdkp->ws10 = 1;
 }
diff --git a/drivers/scsi/sd.h b/drivers/scsi/sd.h
index e0ee4215a3b4..2cef9e884b2a 100644
--- a/drivers/scsi/sd.h
+++ b/drivers/scsi/sd.h
@@ -56,6 +56,7 @@ enum sd_lbp_mode {
 	SD_LBP_FULL,		/* Full logical block provisioning */
 	SD_LBP_UNMAP,		/* Use UNMAP command */
 	SD_LBP_WS16,		/* Use WRITE SAME(16) with UNMAP bit */
+	SD_LBP_WS16_NDOB,	/* Use WRITE SAME(16) with UNMAP + NDOB bits */
 	SD_LBP_WS10,		/* Use WRITE SAME(10) with UNMAP bit */
 	SD_LBP_ZERO,		/* Use WRITE SAME(10) with zeroed payload */
 	SD_LBP_DISABLE,		/* Discard disabled due to failed cmd */
@@ -65,8 +66,10 @@ enum sd_zeroing_mode {
 	SD_ZERO_DEFAULT = 0,	/* Default mode based on what device reports */
 	SD_ZERO_WRITE,		/* Use WRITE(10/16) command */
 	SD_ZERO_WS,		/* Use WRITE SAME(10/16) command */
+	SD_ZERO_WS16_UNMAP_NDOB,/* Use WRITE SAME(16) with UNMAP + NDOB bits */
 	SD_ZERO_WS16_UNMAP,	/* Use WRITE SAME(16) with UNMAP */
 	SD_ZERO_WS10_UNMAP,	/* Use WRITE SAME(10) with UNMAP */
+	SD_ZERO_WS16_NDOB,	/* Use WRITE SAME(16) with NDOB */
 	SD_ZERO_DISABLE,	/* Write Zeroes disabled due to failed cmd */
 };
 
@@ -114,6 +117,7 @@ struct scsi_disk {
 	bool		lblvpd;
 	bool		provisioning_override;
 	bool		zeroing_override;
+	bool		ndob;
 	unsigned	ATO : 1;	/* state of disk ATO bit */
 	unsigned	cache_override : 1; /* temp override of WCE,RCD */
 	unsigned	WCE : 1;	/* state of disk WCE bit */
-- 
2.32.0


^ permalink raw reply related	[flat|nested] 75+ messages in thread

* [PATCH 12/14] scsi: sd: sd_read_cpr() requires VPD pages
  2022-03-02  5:35 SCSI discovery update Martin K. Petersen
                   ` (10 preceding siblings ...)
  2022-03-02  5:35 ` [PATCH 11/14] scsi: sd: Implement support for NDOB flag in WRITE SAME(16) Martin K. Petersen
@ 2022-03-02  5:35 ` Martin K. Petersen
  2022-03-02  9:54   ` Christoph Hellwig
                     ` (3 more replies)
  2022-03-02  5:35 ` [PATCH 13/14] scsi: sd: Reorganize DIF/DIX code to avoid calling revalidate twice Martin K. Petersen
                   ` (2 subsequent siblings)
  14 siblings, 4 replies; 75+ messages in thread
From: Martin K. Petersen @ 2022-03-02  5:35 UTC (permalink / raw)
  To: linux-scsi; +Cc: Martin K. Petersen, Damien Le Moal

As such it should be called inside the scsi_device_supports_vpd()
conditional.

Cc: Damien Le Moal <damien.lemoal@wdc.com>
Fixes: e815d36548f0 ("scsi: sd: add concurrent positioning ranges support")
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
---
 drivers/scsi/sd.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/scsi/sd.c b/drivers/scsi/sd.c
index 2c2e86738578..9d6b2205339d 100644
--- a/drivers/scsi/sd.c
+++ b/drivers/scsi/sd.c
@@ -3396,6 +3396,7 @@ static int sd_revalidate_disk(struct gendisk *disk)
 			sd_read_block_limits(sdkp);
 			sd_read_block_characteristics(sdkp);
 			sd_zbc_read_zones(sdkp, buffer);
+			sd_read_cpr(sdkp);
 		}
 
 		sd_print_capacity(sdkp, old_capacity);
@@ -3405,7 +3406,6 @@ static int sd_revalidate_disk(struct gendisk *disk)
 		sd_read_app_tag_own(sdkp, buffer);
 		sd_read_write_same(sdkp, buffer);
 		sd_read_security(sdkp, buffer);
-		sd_read_cpr(sdkp);
 		sd_config_write_same(sdkp);
 		sd_config_discard(sdkp, SD_LBP_DEFAULT);
 		sd_config_write_zeroes(sdkp, SD_ZERO_DEFAULT);
-- 
2.32.0


^ permalink raw reply related	[flat|nested] 75+ messages in thread

* [PATCH 13/14] scsi: sd: Reorganize DIF/DIX code to avoid calling revalidate twice
  2022-03-02  5:35 SCSI discovery update Martin K. Petersen
                   ` (11 preceding siblings ...)
  2022-03-02  5:35 ` [PATCH 12/14] scsi: sd: sd_read_cpr() requires VPD pages Martin K. Petersen
@ 2022-03-02  5:35 ` Martin K. Petersen
  2022-03-02  9:57   ` Christoph Hellwig
  2022-03-04  9:36   ` Johannes Thumshirn
  2022-03-02  5:35 ` [PATCH 14/14] scsi: sd: Enable modern protocol features on more devices Martin K. Petersen
  2022-03-03  6:09 ` SCSI discovery update Douglas Gilbert
  14 siblings, 2 replies; 75+ messages in thread
From: Martin K. Petersen @ 2022-03-02  5:35 UTC (permalink / raw)
  To: linux-scsi; +Cc: Martin K. Petersen

During device discovery we ended up calling revalidate twice and thus
requested the same parameters multiple times. This was originally
necessary due to the request_queue and gendisk needing to be
instantiated to configure the block integrity profile.

Since this dependency no longer exists, reorganize the integrity
probing code so it can be run once at the end of discovery and drop
the superfluous revalidate call. Postponing the registration step
involves splitting sd_read_protection() into two functions, one to
read the device protection type and one to configure the mode of
operation.

As part of this cleanup, make the printing code a bit less verbose.

Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
---
 drivers/scsi/sd.c     | 62 +++++++++++++++++++++++--------------------
 drivers/scsi/sd_dif.c |  8 +++---
 2 files changed, 36 insertions(+), 34 deletions(-)

diff --git a/drivers/scsi/sd.c b/drivers/scsi/sd.c
index 9d6b2205339d..163697dd799a 100644
--- a/drivers/scsi/sd.c
+++ b/drivers/scsi/sd.c
@@ -2338,40 +2338,48 @@ static int sd_read_protection_type(struct scsi_disk *sdkp, unsigned char *buffer
 {
 	struct scsi_device *sdp = sdkp->device;
 	u8 type;
-	int ret = 0;
 
 	if (scsi_device_protection(sdp) == 0 || (buffer[12] & 1) == 0) {
 		sdkp->protection_type = 0;
-		return ret;
+		return 0;
 	}
 
 	type = ((buffer[12] >> 1) & 7) + 1; /* P_TYPE 0 = Type 1 */
 
-	if (type > T10_PI_TYPE3_PROTECTION)
-		ret = -ENODEV;
-	else if (scsi_host_dif_capable(sdp->host, type))
-		ret = 1;
-
-	if (sdkp->first_scan || type != sdkp->protection_type)
-		switch (ret) {
-		case -ENODEV:
-			sd_printk(KERN_ERR, sdkp, "formatted with unsupported" \
-				  " protection type %u. Disabling disk!\n",
-				  type);
-			break;
-		case 1:
-			sd_printk(KERN_NOTICE, sdkp,
-				  "Enabling DIF Type %u protection\n", type);
-			break;
-		case 0:
-			sd_printk(KERN_NOTICE, sdkp,
-				  "Disabling DIF Type %u protection\n", type);
-			break;
-		}
+	if (type > T10_PI_TYPE3_PROTECTION) {
+		sd_printk(KERN_ERR, sdkp, "formatted with unsupported"	\
+			  " protection type %u. Disabling disk!\n",
+			  type);
+		sdkp->protection_type = 0;
+		return -ENODEV;
+	}
 
 	sdkp->protection_type = type;
 
-	return ret;
+	return 0;
+}
+
+static void sd_config_protection(struct scsi_disk *sdkp)
+{
+	struct scsi_device *sdp = sdkp->device;
+
+	if (!sdkp->first_scan)
+		return;
+
+	sd_dif_config_host(sdkp);
+
+	if (!sdkp->protection_type)
+		return;
+
+	if (!scsi_host_dif_capable(sdp->host, sdkp->protection_type)) {
+		sd_printk(KERN_NOTICE, sdkp,
+			  "Disabling DIF Type %u protection\n",
+			  sdkp->protection_type);
+		sdkp->protection_type = 0;
+	}
+
+	sd_printk(KERN_NOTICE, sdkp, "Enabling DIF Type %u protection\n",
+		  sdkp->protection_type);
 }
 
 static void read_capacity_error(struct scsi_disk *sdkp, struct scsi_device *sdp,
@@ -3409,6 +3417,7 @@ static int sd_revalidate_disk(struct gendisk *disk)
 		sd_config_write_same(sdkp);
 		sd_config_discard(sdkp, SD_LBP_DEFAULT);
 		sd_config_write_zeroes(sdkp, SD_ZERO_DEFAULT);
+		sd_config_protection(sdkp);
 	}
 
 	/*
@@ -3667,11 +3676,6 @@ static int sd_probe(struct device *dev)
 		goto out;
 	}
 
-	if (sdkp->capacity)
-		sd_dif_config_host(sdkp);
-
-	sd_revalidate_disk(gd);
-
 	if (sdkp->security) {
 		sdkp->opal_dev = init_opal_dev(sdkp, &sd_sec_submit);
 		if (sdkp->opal_dev)
diff --git a/drivers/scsi/sd_dif.c b/drivers/scsi/sd_dif.c
index 349950616adc..968993ee6d5d 100644
--- a/drivers/scsi/sd_dif.c
+++ b/drivers/scsi/sd_dif.c
@@ -59,8 +59,6 @@ void sd_dif_config_host(struct scsi_disk *sdkp)
 			bi.profile = &t10_pi_type1_crc;
 
 	bi.tuple_size = sizeof(struct t10_pi_tuple);
-	sd_printk(KERN_NOTICE, sdkp,
-		  "Enabling DIX %s protection\n", bi.profile->name);
 
 	if (dif && type) {
 		bi.flags |= BLK_INTEGRITY_DEVICE_CAPABLE;
@@ -72,11 +70,11 @@ void sd_dif_config_host(struct scsi_disk *sdkp)
 			bi.tag_size = sizeof(u16) + sizeof(u32);
 		else
 			bi.tag_size = sizeof(u16);
-
-		sd_printk(KERN_NOTICE, sdkp, "DIF application tag size %u\n",
-			  bi.tag_size);
 	}
 
+	sd_printk(KERN_NOTICE, sdkp,
+		  "Enabling DIX %s, application tag size %u bytes\n",
+		  bi.profile->name, bi.tag_size);
 out:
 	blk_integrity_register(disk, &bi);
 }
-- 
2.32.0


^ permalink raw reply related	[flat|nested] 75+ messages in thread

* [PATCH 14/14] scsi: sd: Enable modern protocol features on more devices
  2022-03-02  5:35 SCSI discovery update Martin K. Petersen
                   ` (12 preceding siblings ...)
  2022-03-02  5:35 ` [PATCH 13/14] scsi: sd: Reorganize DIF/DIX code to avoid calling revalidate twice Martin K. Petersen
@ 2022-03-02  5:35 ` Martin K. Petersen
  2022-03-02  9:58   ` Christoph Hellwig
                     ` (2 more replies)
  2022-03-03  6:09 ` SCSI discovery update Douglas Gilbert
  14 siblings, 3 replies; 75+ messages in thread
From: Martin K. Petersen @ 2022-03-02  5:35 UTC (permalink / raw)
  To: linux-scsi; +Cc: Martin K. Petersen, Aman Karmani, David Sebek

Due to legacy USB devices having a tendency to either lock up or
return garbage if one attempts to query device capabilities, the USB
transport disables VPD pages by default. This prevents discard from
being properly configured on most modern USB-attached SSDs.

Introduce two additional heuristics to determine whether VPD pages
should be consulted.

The first heuristic fetches VPD pages if a device reports that Logical
Block Provisioning is enabled. It is very unusual for a device to
support thin provisioning and not provide the associated VPDs.
Consequently, if a device reports that Logical Block Provisioning is
enabled (LBPME) in READ CAPACITY(16) response, the scsi_device has no
VPDs attached, and the reported SPC version is larger than 3, then an
attempt will be made to read the VPD pages during revalidate.

The second heuristic relies on the fact that almost all modern devices
return a set of version descriptors in the INQUIRY response. These
descriptors outline which version of various protocol features are
supported. If a device manufacturer has gone through the effort of
filling out compliance descriptors, it is highly unlikely that VPD
pages are not supported. So if a device provides version descriptors
in the INQUIRY response, the scsi_device has no VPDs attached, and the
reported SBC version is larger than 2, then an attempt will be made to
read the VPD pages. In addition, READ CAPACITY(16) will be preferred
over READ CAPACITY(10) to facilitate accessing the LBPME flag.

The benefit to relying on INQUIRY is that it is data we already
have. We do not have to blindly poke the device for additional
information and risk confusing it.

Extracting the SBC version is done by a new helper, sd_sbc_version().
Another helper is provided to determine whether a scsi_device has VPD
pages attached or not.

Reported-by: Aman Karmani <aman@tmm1.net>
Tested-by: Aman Karmani <aman@tmm1.net>
Reported-by: David Sebek <dasebek@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
---
 drivers/scsi/scsi.c        |  1 +
 drivers/scsi/sd.c          | 77 +++++++++++++++++++++++++++++++++++++-
 drivers/scsi/sd.h          |  2 +
 include/scsi/scsi_device.h | 14 +++++++
 4 files changed, 92 insertions(+), 2 deletions(-)

diff --git a/drivers/scsi/scsi.c b/drivers/scsi/scsi.c
index 524e03aba9ef..7183fd35e48b 100644
--- a/drivers/scsi/scsi.c
+++ b/drivers/scsi/scsi.c
@@ -497,6 +497,7 @@ void scsi_attach_vpd(struct scsi_device *sdev)
 	}
 	kfree(vpd_buf);
 }
+EXPORT_SYMBOL_GPL(scsi_attach_vpd);
 
 /**
  * scsi_report_opcode - Find out if a given command opcode is supported
diff --git a/drivers/scsi/sd.c b/drivers/scsi/sd.c
index 163697dd799a..1f7ca9446949 100644
--- a/drivers/scsi/sd.c
+++ b/drivers/scsi/sd.c
@@ -2497,6 +2497,19 @@ static int read_capacity_16(struct scsi_disk *sdkp, struct scsi_device *sdp,
 
 		if (buffer[14] & 0x40) /* LBPRZ */
 			sdkp->lbprz = 1;
+		/*
+		 * If a device sets LBPME=1 then it should, in theory, support
+		 * the Logical Block Provisioning VPD page. Assume that querying
+		 * VPD pages is safe if logical block provisioning is enabled
+		 * and the device claims conformance to a recent version of the
+		 * spec.
+		 */
+		if (!sdkp->reattach_vpds && !scsi_device_has_vpd(sdp) &&
+		    sdp->scsi_level > SCSI_SPC_3) {
+			sd_first_printk(KERN_NOTICE, sdkp,
+			 "Logical Block Provisioning enabled, fetching VPDs\n");
+			sdkp->reattach_vpds = true;
+		}
 	}
 
 	sdkp->capacity = lba + 1;
@@ -2563,8 +2576,10 @@ static int read_capacity_10(struct scsi_disk *sdkp, struct scsi_device *sdp,
 	return sector_size;
 }
 
-static int sd_try_rc16_first(struct scsi_device *sdp)
+static int sd_try_rc16_first(struct scsi_disk *sdkp)
 {
+	struct scsi_device *sdp = sdkp->device;
+
 	if (sdp->host->max_cmd_len < 16)
 		return 0;
 	if (sdp->try_rc_10_first)
@@ -2585,7 +2600,7 @@ sd_read_capacity(struct scsi_disk *sdkp, unsigned char *buffer)
 	int sector_size;
 	struct scsi_device *sdp = sdkp->device;
 
-	if (sd_try_rc16_first(sdp)) {
+	if (sd_try_rc16_first(sdkp)) {
 		sector_size = read_capacity_16(sdkp, sdp, buffer);
 		if (sector_size == -EOVERFLOW)
 			goto got_data;
@@ -3399,6 +3414,12 @@ static int sd_revalidate_disk(struct gendisk *disk)
 		blk_queue_flag_clear(QUEUE_FLAG_NONROT, q);
 		blk_queue_flag_set(QUEUE_FLAG_ADD_RANDOM, q);
 
+		if (sdkp->reattach_vpds) {
+			sdp->try_vpd_pages = 1;
+			scsi_attach_vpd(sdp);
+			sdkp->reattach_vpds = false;
+		}
+
 		if (scsi_device_supports_vpd(sdp)) {
 			sd_read_block_provisioning(sdkp);
 			sd_read_block_limits(sdkp);
@@ -3543,6 +3564,42 @@ static int sd_format_disk_name(char *prefix, int index, char *buf, int buflen)
 	return 0;
 }
 
+enum {
+	INQUIRY_DESC_START	= 58,
+	INQUIRY_DESC_END	= 74,
+	INQUIRY_DESC_SIZE	= 2,
+};
+
+static unsigned int sd_sbc_version(struct scsi_device *sdp)
+{
+	unsigned int i;
+	unsigned int max;
+
+	if (sdp->inquiry_len < INQUIRY_DESC_START + INQUIRY_DESC_SIZE)
+		return 0;
+
+	max = min_t(unsigned int, sdp->inquiry_len, INQUIRY_DESC_END);
+	max = rounddown(max, INQUIRY_DESC_SIZE);
+
+	for (i = INQUIRY_DESC_START ; i < max ; i += INQUIRY_DESC_SIZE) {
+		u16 desc = get_unaligned_be16(&sdp->inquiry[i]);
+
+		switch (desc) {
+		case 0x0600:
+			return 4;
+		case 0x04c0: case 0x04c3: case 0x04c5: case 0x04c8:
+			return 3;
+		case 0x0320: case 0x0322: case 0x0324: case 0x033B:
+		case 0x033D: case 0x033E:
+			return 2;
+		case 0x0180: case 0x019b: case 0x019c:
+			return 1;
+		}
+	}
+
+	return 0;
+}
+
 /**
  *	sd_probe - called during driver initialization and whenever a
  *	new scsi device is attached to the system. It is called once
@@ -3568,6 +3625,7 @@ static int sd_probe(struct device *dev)
 	struct gendisk *gd;
 	int index;
 	int error;
+	unsigned int sbc_version;
 
 	scsi_autopm_get_device(sdp);
 	error = -ENODEV;
@@ -3656,6 +3714,21 @@ static int sd_probe(struct device *dev)
 	sdkp->first_scan = 1;
 	sdkp->max_medium_access_timeouts = SD_MAX_MEDIUM_TIMEOUTS;
 
+	/*
+	 * If the device explicitly claims support for SBC version 3
+	 * or later, unset the LLD flags which prevent probing for
+	 * modern protocol features and reattach VPD pages.
+	 */
+	sbc_version = sd_sbc_version(sdp);
+	if (!scsi_device_has_vpd(sdp) && sbc_version >= 3) {
+		sdkp->reattach_vpds = true;
+		sdp->try_rc_10_first = 0;
+		sdp->no_read_capacity_16 = 0;
+		sd_first_printk(KERN_NOTICE, sdkp,
+				"Detected SBC version %u, fetching VPDs\n",
+				sbc_version);
+	}
+
 	sd_revalidate_disk(gd);
 
 	if (sdp->removable) {
diff --git a/drivers/scsi/sd.h b/drivers/scsi/sd.h
index 2cef9e884b2a..b0b39e0ea5b3 100644
--- a/drivers/scsi/sd.h
+++ b/drivers/scsi/sd.h
@@ -108,6 +108,7 @@ struct scsi_disk {
 	unsigned int	physical_block_size;
 	unsigned int	max_medium_access_timeouts;
 	unsigned int	medium_access_timed_out;
+	unsigned int	sbc_version;
 	u8		media_present;
 	u8		write_prot;
 	u8		protection_type;/* Data Integrity Field */
@@ -118,6 +119,7 @@ struct scsi_disk {
 	bool		provisioning_override;
 	bool		zeroing_override;
 	bool		ndob;
+	bool		reattach_vpds;
 	unsigned	ATO : 1;	/* state of disk ATO bit */
 	unsigned	cache_override : 1; /* temp override of WCE,RCD */
 	unsigned	WCE : 1;	/* state of disk WCE bit */
diff --git a/include/scsi/scsi_device.h b/include/scsi/scsi_device.h
index fd6a803ac8cd..feca18e5fe28 100644
--- a/include/scsi/scsi_device.h
+++ b/include/scsi/scsi_device.h
@@ -605,6 +605,20 @@ static inline int scsi_device_supports_vpd(struct scsi_device *sdev)
 	return 0;
 }
 
+static inline bool scsi_device_has_vpd(struct scsi_device *sdev)
+{
+	struct scsi_vpd *vpd;
+	bool found = false;
+
+	rcu_read_lock();
+	vpd = rcu_dereference(sdev->vpd_pg0);
+	if (vpd)
+		found = true;
+	rcu_read_unlock();
+
+	return found;
+}
+
 static inline int scsi_device_busy(struct scsi_device *sdev)
 {
 	return sbitmap_weight(&sdev->budget_map);
-- 
2.32.0


^ permalink raw reply related	[flat|nested] 75+ messages in thread

* Re: [PATCH 01/14] scsi: mpt3sas: Use cached ATA Information VPD page
  2022-03-02  5:35 ` [PATCH 01/14] scsi: mpt3sas: Use cached ATA Information VPD page Martin K. Petersen
@ 2022-03-02  9:47   ` Christoph Hellwig
  2022-03-02 14:18   ` Johannes Thumshirn
                     ` (2 subsequent siblings)
  3 siblings, 0 replies; 75+ messages in thread
From: Christoph Hellwig @ 2022-03-02  9:47 UTC (permalink / raw)
  To: Martin K. Petersen; +Cc: linux-scsi, Sreekanth Reddy

On Wed, Mar 02, 2022 at 12:35:46AM -0500, Martin K. Petersen wrote:
> We now cache VPD page 0x89 so there is no need to request it from the
> hardware. Make mpt3sas use the cached page.

Looks good:

Reviewed-by: Christoph Hellwig <hch@lst.de>

^ permalink raw reply	[flat|nested] 75+ messages in thread

* Re: [PATCH 02/14] scsi: core: Query VPD size before getting full page
  2022-03-02  5:35 ` [PATCH 02/14] scsi: core: Query VPD size before getting full page Martin K. Petersen
@ 2022-03-02  9:48   ` Christoph Hellwig
  2022-03-02 14:25   ` Johannes Thumshirn
  2022-03-03  0:30   ` Bart Van Assche
  2 siblings, 0 replies; 75+ messages in thread
From: Christoph Hellwig @ 2022-03-02  9:48 UTC (permalink / raw)
  To: Martin K. Petersen; +Cc: linux-scsi, Maciej W . Rozycki

Looks good:

Reviewed-by: Christoph Hellwig <hch@lst.de>

^ permalink raw reply	[flat|nested] 75+ messages in thread

* Re: [PATCH 03/14] scsi: core: Do not truncate INQUIRY data on modern devices
  2022-03-02  5:35 ` [PATCH 03/14] scsi: core: Do not truncate INQUIRY data on modern devices Martin K. Petersen
@ 2022-03-02  9:49   ` Christoph Hellwig
  2022-03-02 14:27   ` Johannes Thumshirn
  2022-03-03  0:14   ` Bart Van Assche
  2 siblings, 0 replies; 75+ messages in thread
From: Christoph Hellwig @ 2022-03-02  9:49 UTC (permalink / raw)
  To: Martin K. Petersen; +Cc: linux-scsi

Looks good,

Reviewed-by: Christoph Hellwig <hch@lst.de>

^ permalink raw reply	[flat|nested] 75+ messages in thread

* Re: [PATCH 04/14] scsi: core: Pick suitable allocation length in scsi_report_opcode()
  2022-03-02  5:35 ` [PATCH 04/14] scsi: core: Pick suitable allocation length in scsi_report_opcode() Martin K. Petersen
@ 2022-03-02  9:49   ` Christoph Hellwig
  2022-03-02 14:29   ` Johannes Thumshirn
                     ` (2 subsequent siblings)
  3 siblings, 0 replies; 75+ messages in thread
From: Christoph Hellwig @ 2022-03-02  9:49 UTC (permalink / raw)
  To: Martin K. Petersen; +Cc: linux-scsi

On Wed, Mar 02, 2022 at 12:35:49AM -0500, Martin K. Petersen wrote:
> Some devices hang when a buffer size larger than expected is passed in
> the ALLOCATION LENGTH field. For REPORT SUPPORTED OPERATION CODES we
> currently only request a single command descriptor at a time and
> therefore the actual size of the command is known ahead of time. Limit
> the ALLOCATION LENGTH to the header size plus the command length of
> the opcode we are asking about.

Looks good,

Reviewed-by: Christoph Hellwig <hch@lst.de>

^ permalink raw reply	[flat|nested] 75+ messages in thread

* Re: [PATCH 05/14] scsi: core: Cache VPD pages b0, b1, b2
  2022-03-02  5:35 ` [PATCH 05/14] scsi: core: Cache VPD pages b0, b1, b2 Martin K. Petersen
@ 2022-03-02  9:50   ` Christoph Hellwig
  2022-03-02 14:30   ` Johannes Thumshirn
                     ` (2 subsequent siblings)
  3 siblings, 0 replies; 75+ messages in thread
From: Christoph Hellwig @ 2022-03-02  9:50 UTC (permalink / raw)
  To: Martin K. Petersen; +Cc: linux-scsi

Looks good,

Reviewed-by: Christoph Hellwig <hch@lst.de>

^ permalink raw reply	[flat|nested] 75+ messages in thread

* Re: [PATCH 06/14] scsi: sd: Use cached ATA Information VPD page
  2022-03-02  5:35 ` [PATCH 06/14] scsi: sd: Use cached ATA Information VPD page Martin K. Petersen
@ 2022-03-02  9:50   ` Christoph Hellwig
  2022-03-03  0:40   ` Bart Van Assche
  2022-03-04  9:29   ` Johannes Thumshirn
  2 siblings, 0 replies; 75+ messages in thread
From: Christoph Hellwig @ 2022-03-02  9:50 UTC (permalink / raw)
  To: Martin K. Petersen; +Cc: linux-scsi

Looks good,

Reviewed-by: Christoph Hellwig <hch@lst.de>

^ permalink raw reply	[flat|nested] 75+ messages in thread

* Re: [PATCH 07/14] scsi: sd: Switch to using scsi_device VPD pages
  2022-03-02  5:35 ` [PATCH 07/14] scsi: sd: Switch to using scsi_device VPD pages Martin K. Petersen
@ 2022-03-02  9:51   ` Christoph Hellwig
  2022-03-03  0:42   ` Bart Van Assche
  2022-03-04  9:29   ` Johannes Thumshirn
  2 siblings, 0 replies; 75+ messages in thread
From: Christoph Hellwig @ 2022-03-02  9:51 UTC (permalink / raw)
  To: Martin K. Petersen; +Cc: linux-scsi

Looks good,

Reviewed-by: Christoph Hellwig <hch@lst.de>

^ permalink raw reply	[flat|nested] 75+ messages in thread

* Re: [PATCH 08/14] scsi: sd: Optimal I/O size should be a multiple of reported granularity
  2022-03-02  5:35 ` [PATCH 08/14] scsi: sd: Optimal I/O size should be a multiple of reported granularity Martin K. Petersen
@ 2022-03-02  9:51   ` Christoph Hellwig
  2022-03-03 20:17   ` Bart Van Assche
  1 sibling, 0 replies; 75+ messages in thread
From: Christoph Hellwig @ 2022-03-02  9:51 UTC (permalink / raw)
  To: Martin K. Petersen; +Cc: linux-scsi, Bernhard Sulzer

Looks good,

Reviewed-by: Christoph Hellwig <hch@lst.de>

^ permalink raw reply	[flat|nested] 75+ messages in thread

* Re: [PATCH 09/14] scsi: sd: Fix discard errors during revalidate
  2022-03-02  5:35 ` [PATCH 09/14] scsi: sd: Fix discard errors during revalidate Martin K. Petersen
@ 2022-03-02  9:52   ` Christoph Hellwig
  2022-03-03 21:06   ` Bart Van Assche
  1 sibling, 0 replies; 75+ messages in thread
From: Christoph Hellwig @ 2022-03-02  9:52 UTC (permalink / raw)
  To: Martin K. Petersen; +Cc: linux-scsi

Looks good,

Reviewed-by: Christoph Hellwig <hch@lst.de>

^ permalink raw reply	[flat|nested] 75+ messages in thread

* Re: [PATCH 10/14] scsi: sd: Move WRITE_ZEROES configuration to a separate function
  2022-03-02  5:35 ` [PATCH 10/14] scsi: sd: Move WRITE_ZEROES configuration to a separate function Martin K. Petersen
@ 2022-03-02  9:53   ` Christoph Hellwig
  2022-03-03  0:52   ` Bart Van Assche
  2022-07-18 16:51   ` regression next-20220714: mkfs.ext4 on multipath device over scsi disks causes 'lifelock' in block layer Benjamin Block
  2 siblings, 0 replies; 75+ messages in thread
From: Christoph Hellwig @ 2022-03-02  9:53 UTC (permalink / raw)
  To: Martin K. Petersen; +Cc: linux-scsi

Looks good,

Reviewed-by: Christoph Hellwig <hch@lst.de>

^ permalink raw reply	[flat|nested] 75+ messages in thread

* Re: [PATCH 11/14] scsi: sd: Implement support for NDOB flag in WRITE SAME(16)
  2022-03-02  5:35 ` [PATCH 11/14] scsi: sd: Implement support for NDOB flag in WRITE SAME(16) Martin K. Petersen
@ 2022-03-02  9:54   ` Christoph Hellwig
  2022-03-03  1:29   ` Bart Van Assche
  2022-03-04  9:32   ` Johannes Thumshirn
  2 siblings, 0 replies; 75+ messages in thread
From: Christoph Hellwig @ 2022-03-02  9:54 UTC (permalink / raw)
  To: Martin K. Petersen; +Cc: linux-scsi

Looks good,

Reviewed-by: Christoph Hellwig <hch@lst.de>

^ permalink raw reply	[flat|nested] 75+ messages in thread

* Re: [PATCH 12/14] scsi: sd: sd_read_cpr() requires VPD pages
  2022-03-02  5:35 ` [PATCH 12/14] scsi: sd: sd_read_cpr() requires VPD pages Martin K. Petersen
@ 2022-03-02  9:54   ` Christoph Hellwig
  2022-03-02 10:45   ` Damien Le Moal
                     ` (2 subsequent siblings)
  3 siblings, 0 replies; 75+ messages in thread
From: Christoph Hellwig @ 2022-03-02  9:54 UTC (permalink / raw)
  To: Martin K. Petersen; +Cc: linux-scsi, Damien Le Moal

On Wed, Mar 02, 2022 at 12:35:57AM -0500, Martin K. Petersen wrote:
> As such it should be called inside the scsi_device_supports_vpd()
> conditional.

Looks good,

Reviewed-by: Christoph Hellwig <hch@lst.de>

^ permalink raw reply	[flat|nested] 75+ messages in thread

* Re: [PATCH 13/14] scsi: sd: Reorganize DIF/DIX code to avoid calling revalidate twice
  2022-03-02  5:35 ` [PATCH 13/14] scsi: sd: Reorganize DIF/DIX code to avoid calling revalidate twice Martin K. Petersen
@ 2022-03-02  9:57   ` Christoph Hellwig
  2022-03-04  9:36   ` Johannes Thumshirn
  1 sibling, 0 replies; 75+ messages in thread
From: Christoph Hellwig @ 2022-03-02  9:57 UTC (permalink / raw)
  To: Martin K. Petersen; +Cc: linux-scsi

Looks good,

Reviewed-by: Christoph Hellwig <hch@lst.de>

^ permalink raw reply	[flat|nested] 75+ messages in thread

* Re: [PATCH 14/14] scsi: sd: Enable modern protocol features on more devices
  2022-03-02  5:35 ` [PATCH 14/14] scsi: sd: Enable modern protocol features on more devices Martin K. Petersen
@ 2022-03-02  9:58   ` Christoph Hellwig
  2022-03-03  1:25   ` Bart Van Assche
  2022-03-04  9:38   ` Johannes Thumshirn
  2 siblings, 0 replies; 75+ messages in thread
From: Christoph Hellwig @ 2022-03-02  9:58 UTC (permalink / raw)
  To: Martin K. Petersen; +Cc: linux-scsi, Aman Karmani, David Sebek

Looks good,

Reviewed-by: Christoph Hellwig <hch@lst.de>

^ permalink raw reply	[flat|nested] 75+ messages in thread

* Re: [PATCH 12/14] scsi: sd: sd_read_cpr() requires VPD pages
  2022-03-02  5:35 ` [PATCH 12/14] scsi: sd: sd_read_cpr() requires VPD pages Martin K. Petersen
  2022-03-02  9:54   ` Christoph Hellwig
@ 2022-03-02 10:45   ` Damien Le Moal
  2022-04-06  1:29     ` Damien Le Moal
  2022-03-03 20:13   ` Bart Van Assche
  2022-03-04  9:33   ` Johannes Thumshirn
  3 siblings, 1 reply; 75+ messages in thread
From: Damien Le Moal @ 2022-03-02 10:45 UTC (permalink / raw)
  To: Martin K. Petersen, linux-scsi

On 2022/03/02 7:35, Martin K. Petersen wrote:
> As such it should be called inside the scsi_device_supports_vpd()
> conditional.
> 
> Cc: Damien Le Moal <damien.lemoal@wdc.com>
> Fixes: e815d36548f0 ("scsi: sd: add concurrent positioning ranges support")
> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

I thought this was already queued :)

Reviewed-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>

> ---
>  drivers/scsi/sd.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/scsi/sd.c b/drivers/scsi/sd.c
> index 2c2e86738578..9d6b2205339d 100644
> --- a/drivers/scsi/sd.c
> +++ b/drivers/scsi/sd.c
> @@ -3396,6 +3396,7 @@ static int sd_revalidate_disk(struct gendisk *disk)
>  			sd_read_block_limits(sdkp);
>  			sd_read_block_characteristics(sdkp);
>  			sd_zbc_read_zones(sdkp, buffer);
> +			sd_read_cpr(sdkp);
>  		}
>  
>  		sd_print_capacity(sdkp, old_capacity);
> @@ -3405,7 +3406,6 @@ static int sd_revalidate_disk(struct gendisk *disk)
>  		sd_read_app_tag_own(sdkp, buffer);
>  		sd_read_write_same(sdkp, buffer);
>  		sd_read_security(sdkp, buffer);
> -		sd_read_cpr(sdkp);
>  		sd_config_write_same(sdkp);
>  		sd_config_discard(sdkp, SD_LBP_DEFAULT);
>  		sd_config_write_zeroes(sdkp, SD_ZERO_DEFAULT);


-- 
Damien Le Moal
Western Digital Research

^ permalink raw reply	[flat|nested] 75+ messages in thread

* Re: [PATCH 01/14] scsi: mpt3sas: Use cached ATA Information VPD page
  2022-03-02  5:35 ` [PATCH 01/14] scsi: mpt3sas: Use cached ATA Information VPD page Martin K. Petersen
  2022-03-02  9:47   ` Christoph Hellwig
@ 2022-03-02 14:18   ` Johannes Thumshirn
  2022-04-20 12:06   ` Hannes Reinecke
  2022-05-03  0:51   ` Martin K. Petersen
  3 siblings, 0 replies; 75+ messages in thread
From: Johannes Thumshirn @ 2022-03-02 14:18 UTC (permalink / raw)
  To: Martin K. Petersen, linux-scsi; +Cc: Sreekanth Reddy

Looks good,
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>

^ permalink raw reply	[flat|nested] 75+ messages in thread

* Re: [PATCH 02/14] scsi: core: Query VPD size before getting full page
  2022-03-02  5:35 ` [PATCH 02/14] scsi: core: Query VPD size before getting full page Martin K. Petersen
  2022-03-02  9:48   ` Christoph Hellwig
@ 2022-03-02 14:25   ` Johannes Thumshirn
  2022-03-04  3:42     ` Martin K. Petersen
  2022-03-03  0:30   ` Bart Van Assche
  2 siblings, 1 reply; 75+ messages in thread
From: Johannes Thumshirn @ 2022-03-02 14:25 UTC (permalink / raw)
  To: Martin K. Petersen, linux-scsi; +Cc: Maciej W . Rozycki

Looks good,
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>

Two minior nits below:

> +	result = scsi_vpd_inquiry(sdev, vpd_header, page, sizeof(vpd_header));
> +
> +	if (result < 0)
> +		return 0;

can we remove the empty line?

> +	memset(buf, 0, buf_len);
> +	result = scsi_vpd_inquiry(sdev, buf, page, vpd_len);
>  	if (result < 0)
> -		goto fail;
> +		return -EINVAL;
> +	else if (result > vpd_len) {
> +		dev_warn_once(&sdev->sdev_gendev,
> +			      "%s: VPD page 0x%02x result %d > %d bytes\n",
> +			      __func__, page, result, vpd_len);
> +		vpd_len = min(result, buf_len);
> +		goto retry_pg;
> +	}
>  

Adding {} to the if block as well makes it look nicer IMHO

^ permalink raw reply	[flat|nested] 75+ messages in thread

* Re: [PATCH 03/14] scsi: core: Do not truncate INQUIRY data on modern devices
  2022-03-02  5:35 ` [PATCH 03/14] scsi: core: Do not truncate INQUIRY data on modern devices Martin K. Petersen
  2022-03-02  9:49   ` Christoph Hellwig
@ 2022-03-02 14:27   ` Johannes Thumshirn
  2022-03-03  0:14   ` Bart Van Assche
  2 siblings, 0 replies; 75+ messages in thread
From: Johannes Thumshirn @ 2022-03-02 14:27 UTC (permalink / raw)
  To: Martin K. Petersen, linux-scsi

Looks good,
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>

^ permalink raw reply	[flat|nested] 75+ messages in thread

* Re: [PATCH 04/14] scsi: core: Pick suitable allocation length in scsi_report_opcode()
  2022-03-02  5:35 ` [PATCH 04/14] scsi: core: Pick suitable allocation length in scsi_report_opcode() Martin K. Petersen
  2022-03-02  9:49   ` Christoph Hellwig
@ 2022-03-02 14:29   ` Johannes Thumshirn
  2022-03-03  0:39   ` Bart Van Assche
  2022-04-20 12:06   ` Hannes Reinecke
  3 siblings, 0 replies; 75+ messages in thread
From: Johannes Thumshirn @ 2022-03-02 14:29 UTC (permalink / raw)
  To: Martin K. Petersen, linux-scsi

Looks good,
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>

^ permalink raw reply	[flat|nested] 75+ messages in thread

* Re: [PATCH 05/14] scsi: core: Cache VPD pages b0, b1, b2
  2022-03-02  5:35 ` [PATCH 05/14] scsi: core: Cache VPD pages b0, b1, b2 Martin K. Petersen
  2022-03-02  9:50   ` Christoph Hellwig
@ 2022-03-02 14:30   ` Johannes Thumshirn
  2022-03-03  1:30   ` Bart Van Assche
  2022-04-20 12:07   ` Hannes Reinecke
  3 siblings, 0 replies; 75+ messages in thread
From: Johannes Thumshirn @ 2022-03-02 14:30 UTC (permalink / raw)
  To: Martin K. Petersen, linux-scsi

Looks good,
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>

^ permalink raw reply	[flat|nested] 75+ messages in thread

* Re: [PATCH 03/14] scsi: core: Do not truncate INQUIRY data on modern devices
  2022-03-02  5:35 ` [PATCH 03/14] scsi: core: Do not truncate INQUIRY data on modern devices Martin K. Petersen
  2022-03-02  9:49   ` Christoph Hellwig
  2022-03-02 14:27   ` Johannes Thumshirn
@ 2022-03-03  0:14   ` Bart Van Assche
  2022-03-04  3:40     ` Martin K. Petersen
  2 siblings, 1 reply; 75+ messages in thread
From: Bart Van Assche @ 2022-03-03  0:14 UTC (permalink / raw)
  To: Martin K. Petersen, linux-scsi

On 3/1/22 21:35, Martin K. Petersen wrote:
> Low-level device drivers have had the ability to limit the size of an
> INQUIRY for many years. This made sense for a wide variety of legacy
> devices. However, we are unnecessarily truncating the INQUIRY response
> for many modern devices. This prevents us from consulting fields
> beyond the first 36 bytes.
> 
> If a device reports that it supports a larger INQUIRY response, and
> the device also reports that it implements SPC-4 or newer, allow the
> larger INQUIRY to proceed.
> 
> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
> ---
>   drivers/scsi/scsi_scan.c | 12 +++++++++++-
>   1 file changed, 11 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/scsi/scsi_scan.c b/drivers/scsi/scsi_scan.c
> index f4e6c68ac99e..95bf9a1f35ce 100644
> --- a/drivers/scsi/scsi_scan.c
> +++ b/drivers/scsi/scsi_scan.c
> @@ -728,7 +728,17 @@ static int scsi_probe_lun(struct scsi_device *sdev, unsigned char *inq_result,
>   		if (pass == 1) {
>   			if (BLIST_INQUIRY_36 & *bflags)
>   				next_inquiry_len = 36;
> -			else if (sdev->inquiry_len)
> +			/*
> +			 * LLD specified a maximum sdev->inquiry_len
> +			 * but device claims it has more data. Capping
> +			 * the length only makes sense for legacy
> +			 * devices. If a device supports SPC-4 (2014)
> +			 * or newer, assume that it is safe to ask for
> +			 * as much as the device says it supports.
> +			 */
> +			else if (sdev->inquiry_len &&
> +				 response_len > sdev->inquiry_len &&
> +				 (inq_result[2] & 0x7) < 6) /* SPC-4 */
>   				next_inquiry_len = sdev->inquiry_len;
>   			else
>   				next_inquiry_len = response_len;

Hi Martin,

Do the benefits of this change outweigh the additional complexity 
introduced by this code and the risk of breaking support for certain 
devices? I'm asking this because the number of LLDs that sets 
inquiry_len is small:

$ git grep -nH '>inquiry_len[[:blank:]]*=[[:blank:]]'|grep -v scsi_scan
drivers/firewire/sbp2.c:1508:		sdev->inquiry_len = 36;
drivers/staging/rts5208/rtsx.c:67:	sdev->inquiry_len = 36;
drivers/usb/image/microtek.c:323:	s->inquiry_len = 0x24;
drivers/usb/storage/scsiglue.c:77:	sdev->inquiry_len = 36;

Does any of these LLDs support SPC-4 devices? Can this change e.g. break 
support for certain USB sticks?

Thanks,

Bart.

^ permalink raw reply	[flat|nested] 75+ messages in thread

* Re: [PATCH 02/14] scsi: core: Query VPD size before getting full page
  2022-03-02  5:35 ` [PATCH 02/14] scsi: core: Query VPD size before getting full page Martin K. Petersen
  2022-03-02  9:48   ` Christoph Hellwig
  2022-03-02 14:25   ` Johannes Thumshirn
@ 2022-03-03  0:30   ` Bart Van Assche
  2022-03-04  3:28     ` Martin K. Petersen
  2 siblings, 1 reply; 75+ messages in thread
From: Bart Van Assche @ 2022-03-03  0:30 UTC (permalink / raw)
  To: Martin K. Petersen, linux-scsi; +Cc: Maciej W . Rozycki

On 3/1/22 21:35, Martin K. Petersen wrote:
> +static int scsi_get_vpd_size(struct scsi_device *sdev, u8 page)
> +{
> +	unsigned char vpd_header[SCSI_VPD_HEADER_SIZE];
> +	int result;

The SCSI core sets the minimum DMA alignment to 4 bytes. How about 
aligning the vpd_header[] array explicitly to a four-byte boundary such 
that the block layer does not have to allocate a temporary buffer?

> +	vpd_len = min(vpd_len, buf_len);
>   
> - found:
> -	result = scsi_vpd_inquiry(sdev, buf, page, buf_len);
> +retry_pg:
> +	/*
> +	 * Fetch the actual page. Since the appropriate size was reported
> +	 * by the device it is now safe to ask for something bigger.
> +	 */
> +	memset(buf, 0, buf_len);
> +	result = scsi_vpd_inquiry(sdev, buf, page, vpd_len);
>   	if (result < 0)
> -		goto fail;
> +		return -EINVAL;
> +	else if (result > vpd_len) {
> +		dev_warn_once(&sdev->sdev_gendev,
> +			      "%s: VPD page 0x%02x result %d > %d bytes\n",
> +			      __func__, page, result, vpd_len);
> +		vpd_len = min(result, buf_len);
> +		goto retry_pg;
> +	}

Will an endless loop be triggered if the VPD page length is larger than 
'buf_len'?

Thanks,

Bart.

^ permalink raw reply	[flat|nested] 75+ messages in thread

* Re: [PATCH 04/14] scsi: core: Pick suitable allocation length in scsi_report_opcode()
  2022-03-02  5:35 ` [PATCH 04/14] scsi: core: Pick suitable allocation length in scsi_report_opcode() Martin K. Petersen
  2022-03-02  9:49   ` Christoph Hellwig
  2022-03-02 14:29   ` Johannes Thumshirn
@ 2022-03-03  0:39   ` Bart Van Assche
  2022-04-20 12:06   ` Hannes Reinecke
  3 siblings, 0 replies; 75+ messages in thread
From: Bart Van Assche @ 2022-03-03  0:39 UTC (permalink / raw)
  To: Martin K. Petersen, linux-scsi

On 3/1/22 21:35, Martin K. Petersen wrote:
> Some devices hang when a buffer size larger than expected is passed in
> the ALLOCATION LENGTH field. For REPORT SUPPORTED OPERATION CODES we
> currently only request a single command descriptor at a time and
> therefore the actual size of the command is known ahead of time. Limit
> the ALLOCATION LENGTH to the header size plus the command length of
> the opcode we are asking about.

Reviewed-by: Bart Van Assche <bvanassche@acm.org>

^ permalink raw reply	[flat|nested] 75+ messages in thread

* Re: [PATCH 06/14] scsi: sd: Use cached ATA Information VPD page
  2022-03-02  5:35 ` [PATCH 06/14] scsi: sd: Use cached ATA Information VPD page Martin K. Petersen
  2022-03-02  9:50   ` Christoph Hellwig
@ 2022-03-03  0:40   ` Bart Van Assche
  2022-03-04  9:29   ` Johannes Thumshirn
  2 siblings, 0 replies; 75+ messages in thread
From: Bart Van Assche @ 2022-03-03  0:40 UTC (permalink / raw)
  To: Martin K. Petersen, linux-scsi

On 3/1/22 21:35, Martin K. Petersen wrote:
> Since the ATA Information VPD is now cached at device discovery time it is
> no longer necessary to request this page when we configure WRITE SAME.
> Instead use the cached information to determine if this disk sits behind a
> SCSI-ATA translation layer.

Reviewed-by: Bart Van Assche <bvanassche@acm.org>

^ permalink raw reply	[flat|nested] 75+ messages in thread

* Re: [PATCH 07/14] scsi: sd: Switch to using scsi_device VPD pages
  2022-03-02  5:35 ` [PATCH 07/14] scsi: sd: Switch to using scsi_device VPD pages Martin K. Petersen
  2022-03-02  9:51   ` Christoph Hellwig
@ 2022-03-03  0:42   ` Bart Van Assche
  2022-03-04  9:29   ` Johannes Thumshirn
  2 siblings, 0 replies; 75+ messages in thread
From: Bart Van Assche @ 2022-03-03  0:42 UTC (permalink / raw)
  To: Martin K. Petersen, linux-scsi

On 3/1/22 21:35, Martin K. Petersen wrote:
> Use the VPD pages already provided by the SCSI midlayer. No need to
> request them individually in the SCSI disk driver.

Reviewed-by: Bart Van Assche <bvanassche@acm.org>

^ permalink raw reply	[flat|nested] 75+ messages in thread

* Re: [PATCH 10/14] scsi: sd: Move WRITE_ZEROES configuration to a separate function
  2022-03-02  5:35 ` [PATCH 10/14] scsi: sd: Move WRITE_ZEROES configuration to a separate function Martin K. Petersen
  2022-03-02  9:53   ` Christoph Hellwig
@ 2022-03-03  0:52   ` Bart Van Assche
  2022-07-18 16:51   ` regression next-20220714: mkfs.ext4 on multipath device over scsi disks causes 'lifelock' in block layer Benjamin Block
  2 siblings, 0 replies; 75+ messages in thread
From: Bart Van Assche @ 2022-03-03  0:52 UTC (permalink / raw)
  To: Martin K. Petersen, linux-scsi

On 3/1/22 21:35, Martin K. Petersen wrote:
> +	if (mode == SD_ZERO_DEFAULT)
> +		sdkp->zeroing_override = false;
> +	else
> +		sdkp->zeroing_override = true;

Hmm ... since sd_config_write_zeroes() special-cases mode == 
SD_ZERO_DEFAULT, does the value of zeroing_override matter if mode == 
SD_ZERO_DEFAULT?

> +static void sd_config_write_zeroes(struct scsi_disk *sdkp,
> +				   enum sd_zeroing_mode mode)
> +{
> +	struct request_queue *q = sdkp->disk->queue;
> +	unsigned int logical_block_size = sdkp->device->sector_size;
> +
> +	if (mode == SD_ZERO_DEFAULT && !sdkp->zeroing_override) {
> +		if (sdkp->lbprz && sdkp->lbpws)
> +			mode = SD_ZERO_WS16_UNMAP;
> +		else if (sdkp->lbprz && sdkp->lbpws10)
> +			mode = SD_ZERO_WS10_UNMAP;
> +		else if (sdkp->max_ws_blocks)
> +			mode = SD_ZERO_WS;
> +		else
> +			mode = SD_ZERO_WRITE;
> +	}
> +
> +	if (mode == SD_ZERO_DISABLE)
> +		sdkp->zeroing_override = true;

What does "zeroing_override" mean? I would expect if mode == 
SD_ZERO_DISABLE that that choice from the user is not overridden.

Thanks,

Bart.

^ permalink raw reply	[flat|nested] 75+ messages in thread

* Re: [PATCH 14/14] scsi: sd: Enable modern protocol features on more devices
  2022-03-02  5:35 ` [PATCH 14/14] scsi: sd: Enable modern protocol features on more devices Martin K. Petersen
  2022-03-02  9:58   ` Christoph Hellwig
@ 2022-03-03  1:25   ` Bart Van Assche
  2022-03-04  9:38   ` Johannes Thumshirn
  2 siblings, 0 replies; 75+ messages in thread
From: Bart Van Assche @ 2022-03-03  1:25 UTC (permalink / raw)
  To: Martin K. Petersen, linux-scsi; +Cc: Aman Karmani, David Sebek

On 3/1/22 21:35, Martin K. Petersen wrote:
> +static unsigned int sd_sbc_version(struct scsi_device *sdp)
> +{
> +	unsigned int i;
> +	unsigned int max;
> +
> +	if (sdp->inquiry_len < INQUIRY_DESC_START + INQUIRY_DESC_SIZE)
> +		return 0;
> +
> +	max = min_t(unsigned int, sdp->inquiry_len, INQUIRY_DESC_END);
> +	max = rounddown(max, INQUIRY_DESC_SIZE);
> +
> +	for (i = INQUIRY_DESC_START ; i < max ; i += INQUIRY_DESC_SIZE) {
> +		u16 desc = get_unaligned_be16(&sdp->inquiry[i]);
> +
> +		switch (desc) {
> +		case 0x0600:
> +			return 4;
> +		case 0x04c0: case 0x04c3: case 0x04c5: case 0x04c8:
> +			return 3;
> +		case 0x0320: case 0x0322: case 0x0324: case 0x033B:
> +		case 0x033D: case 0x033E:
> +			return 2;
> +		case 0x0180: case 0x019b: case 0x019c:
> +			return 1;
> +		}
> +	}

Please add a comment that explains where these constants come from (the 
SPC Version descriptor values table?). Anyway:

Reviewed-by: Bart Van Assche <bvanassche@acm.org>

^ permalink raw reply	[flat|nested] 75+ messages in thread

* Re: [PATCH 11/14] scsi: sd: Implement support for NDOB flag in WRITE SAME(16)
  2022-03-02  5:35 ` [PATCH 11/14] scsi: sd: Implement support for NDOB flag in WRITE SAME(16) Martin K. Petersen
  2022-03-02  9:54   ` Christoph Hellwig
@ 2022-03-03  1:29   ` Bart Van Assche
  2022-03-04  9:32   ` Johannes Thumshirn
  2 siblings, 0 replies; 75+ messages in thread
From: Bart Van Assche @ 2022-03-03  1:29 UTC (permalink / raw)
  To: Martin K. Petersen, linux-scsi

On 3/1/22 21:35, Martin K. Petersen wrote:
> +		if (get_unaligned_be16(&buffer[2]) >= 2)
> +			sdkp->ndob = buffer[5] & 1;
> +	}

Code like the above is incomprehensible without having access to the 
corresponding specification. Anyway:

Reviewed-by: Bart Van Assche <bvanassche@acm.org>

^ permalink raw reply	[flat|nested] 75+ messages in thread

* Re: [PATCH 05/14] scsi: core: Cache VPD pages b0, b1, b2
  2022-03-02  5:35 ` [PATCH 05/14] scsi: core: Cache VPD pages b0, b1, b2 Martin K. Petersen
  2022-03-02  9:50   ` Christoph Hellwig
  2022-03-02 14:30   ` Johannes Thumshirn
@ 2022-03-03  1:30   ` Bart Van Assche
  2022-04-20 12:07   ` Hannes Reinecke
  3 siblings, 0 replies; 75+ messages in thread
From: Bart Van Assche @ 2022-03-03  1:30 UTC (permalink / raw)
  To: Martin K. Petersen, linux-scsi

On 3/1/22 21:35, Martin K. Petersen wrote:
> [ ... ]

Reviewed-by: Bart Van Assche <bvanassche@acm.org>

^ permalink raw reply	[flat|nested] 75+ messages in thread

* Re: SCSI discovery update
  2022-03-02  5:35 SCSI discovery update Martin K. Petersen
                   ` (13 preceding siblings ...)
  2022-03-02  5:35 ` [PATCH 14/14] scsi: sd: Enable modern protocol features on more devices Martin K. Petersen
@ 2022-03-03  6:09 ` Douglas Gilbert
  2022-03-04  3:26   ` Martin K. Petersen
  14 siblings, 1 reply; 75+ messages in thread
From: Douglas Gilbert @ 2022-03-03  6:09 UTC (permalink / raw)
  To: Martin K. Petersen, linux-scsi; +Cc: Knight, Frederick

On 2022-03-02 00:35, Martin K. Petersen wrote:
> This series addresses several issues in the SCSI device discovery
> code:
> 
>   - Fetch the VPD header before getting the full VPD page. This removes
>     the guesswork from sizing the VPD buffer and fixes problems with
>     RAID controllers that wedge when we try to fetch the IDENTIFY
>     DEVICE information trailing the ATA Information VPD page.
> 
>   - Cache the VPD pages we need instead of fetching them every
>     revalidate iteration.
> 
>   - Avoid truncating the INQUIRY length for modern devices. This allows
>     us to query the version descriptors reported by most contemporary
>     drives. These version descriptors are used as an extra heuristic
>     for querying protocol features.

Version descriptors used to be updated by Ralph Weber (WDC, T10) but
he stopped doing that about 2 years ago. When that was pointed out
he made a proposal to T10 [20-022r0] for dropping version descriptors
henceforth. The proposal was voted down but no-one has stepped up at
T10 to keep the version descriptors up to date.

Version descriptors used to have an entry on this page:
     https://www.t10.org/lists/1spc-lst.htm
That is no longer the case. They are still in spc6r06.pdf (latest draft)
but haven't been updated for the same period.

I believe Seagate never supported version descriptors. Hitachi/WD
used to support them. But I have a WD SAS disk manufactured last
month and it has no version descriptors.

I have no Toshiba, Samsung nor Kioxia disks. Could someone report if their
recent SAS disks support version descriptors?

>   - Additional sanity checking for the reported minimum and optimal I/O
>     sizes.
> 
>   - Fix reported discard failures by making the configuration a
>     two-stage process. Completing full VPD/RSOC discovery before we
>     configure discard prevents a small window of error where the wrong
>     command and/or wrong limit would briefly be applied.
> 
>   - Make the zeroing configuration a two-stage process as well.
> 
>   - Implement support for the NDOB flag for both discards and
>     zeroing. The "No Data Out Buffer" flag removes the need for a
>     zeroed payload to be included with a WRITE SAME(16) command.
> 
>   - Remove the superfluous revalidate operation historically required
>     by the integrity profile registration. This further reduces the
>     commands we send during device discovery.
> 
>   - Add additional heuristics for enabling discards on modern devices.
>     Specifically, if a device reports that it supports logical block
>     provisioning, attempt to query the LBP VPD page.
> 
>   - Also query the device VPD pages if a device reports conformance to
>     a recent version of the SCSI Block Commands specification.

Everything else here looks great.

Ah, one thing. If you cache VPD pages (and the standard INQUIRY response),
then if an INQUIRY DATA CHANGED Unit Attention occurs, all the cached data
should be invalidated and the cached VPD pages re-fetched. The Last n Inquiry
Data Changed log page [0xb,0x1] could help with that, but I haven't seen
it implemented yet.

Doug Gilbert

> Thanks to several bug reporters and volunteers this series has been
> extensively tested with a much wider variety of USB/UAS devices than I
> have access to.
> 


^ permalink raw reply	[flat|nested] 75+ messages in thread

* Re: [PATCH 12/14] scsi: sd: sd_read_cpr() requires VPD pages
  2022-03-02  5:35 ` [PATCH 12/14] scsi: sd: sd_read_cpr() requires VPD pages Martin K. Petersen
  2022-03-02  9:54   ` Christoph Hellwig
  2022-03-02 10:45   ` Damien Le Moal
@ 2022-03-03 20:13   ` Bart Van Assche
  2022-03-04  9:33   ` Johannes Thumshirn
  3 siblings, 0 replies; 75+ messages in thread
From: Bart Van Assche @ 2022-03-03 20:13 UTC (permalink / raw)
  To: Martin K. Petersen, linux-scsi; +Cc: Damien Le Moal

On 3/1/22 21:35, Martin K. Petersen wrote:
> As such it should be called inside the scsi_device_supports_vpd()
> conditional.

Reviewed-by: Bart Van Assche <bvanassche@acm.org>

^ permalink raw reply	[flat|nested] 75+ messages in thread

* Re: [PATCH 08/14] scsi: sd: Optimal I/O size should be a multiple of reported granularity
  2022-03-02  5:35 ` [PATCH 08/14] scsi: sd: Optimal I/O size should be a multiple of reported granularity Martin K. Petersen
  2022-03-02  9:51   ` Christoph Hellwig
@ 2022-03-03 20:17   ` Bart Van Assche
  2022-03-04  3:45     ` Martin K. Petersen
  1 sibling, 1 reply; 75+ messages in thread
From: Bart Van Assche @ 2022-03-03 20:17 UTC (permalink / raw)
  To: Martin K. Petersen, linux-scsi; +Cc: Bernhard Sulzer

On 3/1/22 21:35, Martin K. Petersen wrote:
> +	if (min_xfer_bytes && opt_xfer_bytes & (min_xfer_bytes - 1)) {
> +		sd_first_printk(KERN_WARNING, sdkp,
> +				"Optimal transfer size %u bytes not a " \
> +				"multiple of preferred minimum block " \
> +				"size (%u bytes)\n",
> +				opt_xfer_bytes, min_xfer_bytes);
> +		return false;
> +	}

Hmm ... what guarantees that min_xfer_bytes is a power of two? Did I perhaps 
overlook something?

Thanks,

Bart.

^ permalink raw reply	[flat|nested] 75+ messages in thread

* Re: [PATCH 09/14] scsi: sd: Fix discard errors during revalidate
  2022-03-02  5:35 ` [PATCH 09/14] scsi: sd: Fix discard errors during revalidate Martin K. Petersen
  2022-03-02  9:52   ` Christoph Hellwig
@ 2022-03-03 21:06   ` Bart Van Assche
  2022-03-04  3:55     ` Martin K. Petersen
  1 sibling, 1 reply; 75+ messages in thread
From: Bart Van Assche @ 2022-03-03 21:06 UTC (permalink / raw)
  To: Martin K. Petersen, linux-scsi

On 3/1/22 21:35, Martin K. Petersen wrote:
> +	if (mode == SD_LBP_DEFAULT)
> +		sdkp->provisioning_override = false;
> +	else
> +		sdkp->provisioning_override = true;

This can be changed into the following?

sdkp->provisioning_override = mode != SD_LBP_DEFAULT;

> +	if (mode == SD_LBP_DEFAULT && !sdkp->provisioning_override) {

Hmm ... is provisioning_override ever true for the SD_LBP_DEFAULT mode? If not, 
can "&& !sdkp->provisioning_override" be left out?

> +	bool		lblvpd;

Please add a comment that explains what lblvpd stands for.

Thanks,

Bart.

^ permalink raw reply	[flat|nested] 75+ messages in thread

* Re: SCSI discovery update
  2022-03-03  6:09 ` SCSI discovery update Douglas Gilbert
@ 2022-03-04  3:26   ` Martin K. Petersen
  0 siblings, 0 replies; 75+ messages in thread
From: Martin K. Petersen @ 2022-03-04  3:26 UTC (permalink / raw)
  To: Douglas Gilbert; +Cc: Martin K. Petersen, linux-scsi, Knight, Frederick


Doug,

> I believe Seagate never supported version descriptors. Hitachi/WD
> used to support them. But I have a WD SAS disk manufactured last
> month and it has no version descriptors.

I only have a couple of SAS disk drives in the lab that implement them
but almost all of my SAS SSDs feature version descriptors. As do, much
to my surprise, many USB devices.

The rationale behind adding the descriptor check is that it proved to be
a surprisingly good heuristic for "modern UAS device which handles unmap
and other fancy things correctly".

-- 
Martin K. Petersen	Oracle Linux Engineering

^ permalink raw reply	[flat|nested] 75+ messages in thread

* Re: [PATCH 02/14] scsi: core: Query VPD size before getting full page
  2022-03-03  0:30   ` Bart Van Assche
@ 2022-03-04  3:28     ` Martin K. Petersen
  0 siblings, 0 replies; 75+ messages in thread
From: Martin K. Petersen @ 2022-03-04  3:28 UTC (permalink / raw)
  To: Bart Van Assche; +Cc: Martin K. Petersen, linux-scsi, Maciej W . Rozycki


Bart,

>> +	vpd_len = min(vpd_len, buf_len);
>>   - found:
>> -	result = scsi_vpd_inquiry(sdev, buf, page, buf_len);
>> +retry_pg:
>> +	/*
>> +	 * Fetch the actual page. Since the appropriate size was reported
>> +	 * by the device it is now safe to ask for something bigger.
>> +	 */
>> +	memset(buf, 0, buf_len);
>> +	result = scsi_vpd_inquiry(sdev, buf, page, vpd_len);
>>   	if (result < 0)
>> -		goto fail;
>> +		return -EINVAL;
>> +	else if (result > vpd_len) {
>> +		dev_warn_once(&sdev->sdev_gendev,
>> +			      "%s: VPD page 0x%02x result %d > %d bytes\n",
>> +			      __func__, page, result, vpd_len);
>> +		vpd_len = min(result, buf_len);
>> +		goto retry_pg;
>> +	}
>
> Will an endless loop be triggered if the VPD page length is larger
> than 'buf_len'?

Ah, transplant thinko from scsi_get_vpd_buf() which reallocates the
buffer on mismatch. Will fix.

Thanks!

-- 
Martin K. Petersen	Oracle Linux Engineering

^ permalink raw reply	[flat|nested] 75+ messages in thread

* Re: [PATCH 03/14] scsi: core: Do not truncate INQUIRY data on modern devices
  2022-03-03  0:14   ` Bart Van Assche
@ 2022-03-04  3:40     ` Martin K. Petersen
  0 siblings, 0 replies; 75+ messages in thread
From: Martin K. Petersen @ 2022-03-04  3:40 UTC (permalink / raw)
  To: Bart Van Assche; +Cc: Martin K. Petersen, linux-scsi


Bart,

> Do the benefits of this change outweigh the additional complexity
> introduced by this code and the risk of breaking support for certain 
> devices? I'm asking this because the number of LLDs that sets
> inquiry_len is small:
>
> $ git grep -nH '>inquiry_len[[:blank:]]*=[[:blank:]]'|grep -v scsi_scan
> drivers/firewire/sbp2.c:1508:		sdev->inquiry_len = 36;
> drivers/staging/rts5208/rtsx.c:67:	sdev->inquiry_len = 36;
> drivers/usb/image/microtek.c:323:	s->inquiry_len = 0x24;
> drivers/usb/storage/scsiglue.c:77:	sdev->inquiry_len = 36;
>
> Does any of these LLDs support SPC-4 devices? Can this change
> e.g. break support for certain USB sticks?

The intent is to enable more modern protocol features on devices
attached using USB. I was concerned about blindly increasingly the
inquiry length and risk breaking older devices that we know are likely
to have problems in this department. At the same time the conservative
clamp on the inquiry length prevents several useful features from being
enabled on modern devices.

There is a risk associated with any change. This patch tries to strike a
reasonable balance between avoiding regressions and accommodating modern
USB SSDs.

-- 
Martin K. Petersen	Oracle Linux Engineering

^ permalink raw reply	[flat|nested] 75+ messages in thread

* Re: [PATCH 02/14] scsi: core: Query VPD size before getting full page
  2022-03-02 14:25   ` Johannes Thumshirn
@ 2022-03-04  3:42     ` Martin K. Petersen
  0 siblings, 0 replies; 75+ messages in thread
From: Martin K. Petersen @ 2022-03-04  3:42 UTC (permalink / raw)
  To: Johannes Thumshirn; +Cc: Martin K. Petersen, linux-scsi, Maciej W . Rozycki


Johannes,

> Two minior nits below:
>
>> +	result = scsi_vpd_inquiry(sdev, vpd_header, page, sizeof(vpd_header));
>> +
>> +	if (result < 0)
>> +		return 0;
>
> can we remove the empty line?

Sure!

>> +	memset(buf, 0, buf_len);
>> +	result = scsi_vpd_inquiry(sdev, buf, page, vpd_len);
>>  	if (result < 0)
>> -		goto fail;
>> +		return -EINVAL;
>> +	else if (result > vpd_len) {
>> +		dev_warn_once(&sdev->sdev_gendev,
>> +			      "%s: VPD page 0x%02x result %d > %d bytes\n",
>> +			      __func__, page, result, vpd_len);
>> +		vpd_len = min(result, buf_len);
>> +		goto retry_pg;
>> +	}
>>  
>
> Adding {} to the if block as well makes it look nicer IMHO

Roger.

-- 
Martin K. Petersen	Oracle Linux Engineering

^ permalink raw reply	[flat|nested] 75+ messages in thread

* Re: [PATCH 08/14] scsi: sd: Optimal I/O size should be a multiple of reported granularity
  2022-03-03 20:17   ` Bart Van Assche
@ 2022-03-04  3:45     ` Martin K. Petersen
  2022-03-04  5:06       ` Bart Van Assche
  0 siblings, 1 reply; 75+ messages in thread
From: Martin K. Petersen @ 2022-03-04  3:45 UTC (permalink / raw)
  To: Bart Van Assche; +Cc: Martin K. Petersen, linux-scsi, Bernhard Sulzer


Bart,

> On 3/1/22 21:35, Martin K. Petersen wrote:
>> +	if (min_xfer_bytes && opt_xfer_bytes & (min_xfer_bytes - 1)) {
>> +		sd_first_printk(KERN_WARNING, sdkp,
>> +				"Optimal transfer size %u bytes not a " \
>> +				"multiple of preferred minimum block " \
>> +				"size (%u bytes)\n",
>> +				opt_xfer_bytes, min_xfer_bytes);
>> +		return false;
>> +	}
>
> Hmm ... what guarantees that min_xfer_bytes is a power of two? Did I
> perhaps overlook something?

Nothing in the spec, obviously, but I think we'd have a lot of headaches
further up the stack if that wasn't the case.

Do you know of any devices that report something unusual here?

-- 
Martin K. Petersen	Oracle Linux Engineering

^ permalink raw reply	[flat|nested] 75+ messages in thread

* Re: [PATCH 09/14] scsi: sd: Fix discard errors during revalidate
  2022-03-03 21:06   ` Bart Van Assche
@ 2022-03-04  3:55     ` Martin K. Petersen
  2022-03-06  0:35       ` Bart Van Assche
  0 siblings, 1 reply; 75+ messages in thread
From: Martin K. Petersen @ 2022-03-04  3:55 UTC (permalink / raw)
  To: Bart Van Assche; +Cc: Martin K. Petersen, linux-scsi


Bart,

>> +	if (mode == SD_LBP_DEFAULT && !sdkp->provisioning_override) {
>
> Hmm ... is provisioning_override ever true for the SD_LBP_DEFAULT
> mode? If not, can "&& !sdkp->provisioning_override" be left out?

The two *_override variables are used to prevent subsequent revalidates
from clobbering the mode configured by the user.

I experimented with various approaches for this, including having a
separate SD_LBP_ mode for revalidate, using first_scan, etc. In the end
I felt that the boolean was the best approach to capturing the fact that
the currently active mode was explicitly configured by the user.

Open to suggestions.

-- 
Martin K. Petersen	Oracle Linux Engineering

^ permalink raw reply	[flat|nested] 75+ messages in thread

* Re: [PATCH 08/14] scsi: sd: Optimal I/O size should be a multiple of reported granularity
  2022-03-04  3:45     ` Martin K. Petersen
@ 2022-03-04  5:06       ` Bart Van Assche
  0 siblings, 0 replies; 75+ messages in thread
From: Bart Van Assche @ 2022-03-04  5:06 UTC (permalink / raw)
  To: Martin K. Petersen; +Cc: linux-scsi, Bernhard Sulzer

On 3/3/22 19:45, Martin K. Petersen wrote:
>> On 3/1/22 21:35, Martin K. Petersen wrote:
>>> +	if (min_xfer_bytes && opt_xfer_bytes & (min_xfer_bytes - 1)) {
>>> +		sd_first_printk(KERN_WARNING, sdkp,
>>> +				"Optimal transfer size %u bytes not a " \
>>> +				"multiple of preferred minimum block " \
>>> +				"size (%u bytes)\n",
>>> +				opt_xfer_bytes, min_xfer_bytes);
>>> +		return false;
>>> +	}
>>
>> Hmm ... what guarantees that min_xfer_bytes is a power of two? Did I
>> perhaps overlook something?
> 
> Nothing in the spec, obviously, but I think we'd have a lot of headaches
> further up the stack if that wasn't the case.
> 
> Do you know of any devices that report something unusual here?

Hi Martin,

I'm not aware of any devices that report an unusual OPTIMAL TRANSFER LENGTH 
GRANULARITY value.

Since this code is not in the hot path, how about using the modulo operator 
instead of binary and?

Thanks,

Bart.



^ permalink raw reply	[flat|nested] 75+ messages in thread

* Re: [PATCH 06/14] scsi: sd: Use cached ATA Information VPD page
  2022-03-02  5:35 ` [PATCH 06/14] scsi: sd: Use cached ATA Information VPD page Martin K. Petersen
  2022-03-02  9:50   ` Christoph Hellwig
  2022-03-03  0:40   ` Bart Van Assche
@ 2022-03-04  9:29   ` Johannes Thumshirn
  2 siblings, 0 replies; 75+ messages in thread
From: Johannes Thumshirn @ 2022-03-04  9:29 UTC (permalink / raw)
  To: Martin K. Petersen, linux-scsi

Looks good,
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>

^ permalink raw reply	[flat|nested] 75+ messages in thread

* Re: [PATCH 07/14] scsi: sd: Switch to using scsi_device VPD pages
  2022-03-02  5:35 ` [PATCH 07/14] scsi: sd: Switch to using scsi_device VPD pages Martin K. Petersen
  2022-03-02  9:51   ` Christoph Hellwig
  2022-03-03  0:42   ` Bart Van Assche
@ 2022-03-04  9:29   ` Johannes Thumshirn
  2 siblings, 0 replies; 75+ messages in thread
From: Johannes Thumshirn @ 2022-03-04  9:29 UTC (permalink / raw)
  To: Martin K. Petersen, linux-scsi

Looks good,
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>

^ permalink raw reply	[flat|nested] 75+ messages in thread

* Re: [PATCH 11/14] scsi: sd: Implement support for NDOB flag in WRITE SAME(16)
  2022-03-02  5:35 ` [PATCH 11/14] scsi: sd: Implement support for NDOB flag in WRITE SAME(16) Martin K. Petersen
  2022-03-02  9:54   ` Christoph Hellwig
  2022-03-03  1:29   ` Bart Van Assche
@ 2022-03-04  9:32   ` Johannes Thumshirn
  2 siblings, 0 replies; 75+ messages in thread
From: Johannes Thumshirn @ 2022-03-04  9:32 UTC (permalink / raw)
  To: Martin K. Petersen, linux-scsi

Looks good,
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>

^ permalink raw reply	[flat|nested] 75+ messages in thread

* Re: [PATCH 12/14] scsi: sd: sd_read_cpr() requires VPD pages
  2022-03-02  5:35 ` [PATCH 12/14] scsi: sd: sd_read_cpr() requires VPD pages Martin K. Petersen
                     ` (2 preceding siblings ...)
  2022-03-03 20:13   ` Bart Van Assche
@ 2022-03-04  9:33   ` Johannes Thumshirn
  3 siblings, 0 replies; 75+ messages in thread
From: Johannes Thumshirn @ 2022-03-04  9:33 UTC (permalink / raw)
  To: Martin K. Petersen, linux-scsi; +Cc: Damien Le Moal

Looks good,
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>

^ permalink raw reply	[flat|nested] 75+ messages in thread

* Re: [PATCH 13/14] scsi: sd: Reorganize DIF/DIX code to avoid calling revalidate twice
  2022-03-02  5:35 ` [PATCH 13/14] scsi: sd: Reorganize DIF/DIX code to avoid calling revalidate twice Martin K. Petersen
  2022-03-02  9:57   ` Christoph Hellwig
@ 2022-03-04  9:36   ` Johannes Thumshirn
  1 sibling, 0 replies; 75+ messages in thread
From: Johannes Thumshirn @ 2022-03-04  9:36 UTC (permalink / raw)
  To: Martin K. Petersen, linux-scsi

Looks good,
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>

^ permalink raw reply	[flat|nested] 75+ messages in thread

* Re: [PATCH 14/14] scsi: sd: Enable modern protocol features on more devices
  2022-03-02  5:35 ` [PATCH 14/14] scsi: sd: Enable modern protocol features on more devices Martin K. Petersen
  2022-03-02  9:58   ` Christoph Hellwig
  2022-03-03  1:25   ` Bart Van Assche
@ 2022-03-04  9:38   ` Johannes Thumshirn
  2 siblings, 0 replies; 75+ messages in thread
From: Johannes Thumshirn @ 2022-03-04  9:38 UTC (permalink / raw)
  To: Martin K. Petersen, linux-scsi; +Cc: Aman Karmani, David Sebek

Looks good,
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>

^ permalink raw reply	[flat|nested] 75+ messages in thread

* Re: [PATCH 09/14] scsi: sd: Fix discard errors during revalidate
  2022-03-04  3:55     ` Martin K. Petersen
@ 2022-03-06  0:35       ` Bart Van Assche
  0 siblings, 0 replies; 75+ messages in thread
From: Bart Van Assche @ 2022-03-06  0:35 UTC (permalink / raw)
  To: Martin K. Petersen; +Cc: linux-scsi

On 3/3/22 19:55, Martin K. Petersen wrote:
> 
> Bart,
> 
>>> +	if (mode == SD_LBP_DEFAULT && !sdkp->provisioning_override) {
>>
>> Hmm ... is provisioning_override ever true for the SD_LBP_DEFAULT
>> mode? If not, can "&& !sdkp->provisioning_override" be left out?
> 
> The two *_override variables are used to prevent subsequent revalidates
> from clobbering the mode configured by the user.
> 
> I experimented with various approaches for this, including having a
> separate SD_LBP_ mode for revalidate, using first_scan, etc. In the end
> I felt that the boolean was the best approach to capturing the fact that
> the currently active mode was explicitly configured by the user.
> 
> Open to suggestions.

Hi Martin,

It took me a while before I realized that the purpose of the 
provisioning_override variable is to make the behavior of the 
sd_config_discard(sdkp, SD_LBP_DEFAULT) different depending on whether the 
SD_LBP_DEFAULT comes from userspace (via sysfs) or from the sd_config_discard() 
call in sd_revalidate_disk().

How about storing the mode selected by the user inside the 
provisioning_mode_store() function and using that variable inside 
sd_config_discard() instead of sdkp->provisioning_override? I think that would 
result in easier to comprehend code.

Thanks,

Bart.

^ permalink raw reply	[flat|nested] 75+ messages in thread

* Re: [PATCH 12/14] scsi: sd: sd_read_cpr() requires VPD pages
  2022-03-02 10:45   ` Damien Le Moal
@ 2022-04-06  1:29     ` Damien Le Moal
  2022-04-07  2:19       ` Martin K. Petersen
  0 siblings, 1 reply; 75+ messages in thread
From: Damien Le Moal @ 2022-04-06  1:29 UTC (permalink / raw)
  To: Damien Le Moal, Martin K. Petersen, linux-scsi

On 3/2/22 19:45, Damien Le Moal wrote:
> On 2022/03/02 7:35, Martin K. Petersen wrote:
>> As such it should be called inside the scsi_device_supports_vpd()
>> conditional.
>>
>> Cc: Damien Le Moal <damien.lemoal@wdc.com>
>> Fixes: e815d36548f0 ("scsi: sd: add concurrent positioning ranges support")
>> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
> 
> I thought this was already queued :)
> 
> Reviewed-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>

Martin,

I still do not see this patch in 5.18-rc1... Did it fell through the 
cracks ? It also needs cc stable...
See: https://bugzilla.kernel.org/show_bug.cgi?id=215788

> 
>> ---
>>   drivers/scsi/sd.c | 2 +-
>>   1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/drivers/scsi/sd.c b/drivers/scsi/sd.c
>> index 2c2e86738578..9d6b2205339d 100644
>> --- a/drivers/scsi/sd.c
>> +++ b/drivers/scsi/sd.c
>> @@ -3396,6 +3396,7 @@ static int sd_revalidate_disk(struct gendisk *disk)
>>   			sd_read_block_limits(sdkp);
>>   			sd_read_block_characteristics(sdkp);
>>   			sd_zbc_read_zones(sdkp, buffer);
>> +			sd_read_cpr(sdkp);
>>   		}
>>   
>>   		sd_print_capacity(sdkp, old_capacity);
>> @@ -3405,7 +3406,6 @@ static int sd_revalidate_disk(struct gendisk *disk)
>>   		sd_read_app_tag_own(sdkp, buffer);
>>   		sd_read_write_same(sdkp, buffer);
>>   		sd_read_security(sdkp, buffer);
>> -		sd_read_cpr(sdkp);
>>   		sd_config_write_same(sdkp);
>>   		sd_config_discard(sdkp, SD_LBP_DEFAULT);
>>   		sd_config_write_zeroes(sdkp, SD_ZERO_DEFAULT);
> 
> 


-- 
Damien Le Moal
Western Digital Research

^ permalink raw reply	[flat|nested] 75+ messages in thread

* Re: [PATCH 12/14] scsi: sd: sd_read_cpr() requires VPD pages
  2022-04-06  1:29     ` Damien Le Moal
@ 2022-04-07  2:19       ` Martin K. Petersen
  2022-04-07  2:36         ` Damien Le Moal
  0 siblings, 1 reply; 75+ messages in thread
From: Martin K. Petersen @ 2022-04-07  2:19 UTC (permalink / raw)
  To: Damien Le Moal; +Cc: Damien Le Moal, Martin K. Petersen, linux-scsi


Damien,

> I still do not see this patch in 5.18-rc1... Did it fell through the
> cracks ? It also needs cc stable...

It was part of my discovery series which I ended up holding back from
5.18 due to a regression. But I'll apply this particular fix.

-- 
Martin K. Petersen	Oracle Linux Engineering

^ permalink raw reply	[flat|nested] 75+ messages in thread

* Re: [PATCH 12/14] scsi: sd: sd_read_cpr() requires VPD pages
  2022-04-07  2:19       ` Martin K. Petersen
@ 2022-04-07  2:36         ` Damien Le Moal
  2022-04-07  2:48           ` Martin K. Petersen
  0 siblings, 1 reply; 75+ messages in thread
From: Damien Le Moal @ 2022-04-07  2:36 UTC (permalink / raw)
  To: martin.petersen, damien.lemoal; +Cc: linux-scsi

On Wed, 2022-04-06 at 22:19 -0400, Martin K. Petersen wrote:
> Damien,
> 
> > I still do not see this patch in 5.18-rc1... Did it fell through the
> > cracks ? It also needs cc stable...
> 
> It was part of my discovery series which I ended up holding back from
> 5.18 due to a regression. But I'll apply this particular fix.

Thanks.

Unfortunately, the patch did not help with that Areca RAID controller
problem. It looks like the HBA is crashing when the VPD page 0xb9 command
is in the discovery. No issues when manually doing a sg_vpd of that page,
which fails as the RAID volume does not support it. I am tempted in
putting this problem on a buggy adapter FW... Any other idea what could be
wrong ?


-- 
Damien Le Moal
Western Digital Research


^ permalink raw reply	[flat|nested] 75+ messages in thread

* Re: [PATCH 12/14] scsi: sd: sd_read_cpr() requires VPD pages
  2022-04-07  2:36         ` Damien Le Moal
@ 2022-04-07  2:48           ` Martin K. Petersen
  0 siblings, 0 replies; 75+ messages in thread
From: Martin K. Petersen @ 2022-04-07  2:48 UTC (permalink / raw)
  To: Damien Le Moal; +Cc: martin.petersen, damien.lemoal, linux-scsi


Damien,

> Unfortunately, the patch did not help with that Areca RAID controller
> problem. It looks like the HBA is crashing when the VPD page 0xb9
> command is in the discovery. No issues when manually doing a sg_vpd of
> that page, which fails as the RAID volume does not support it. I am
> tempted in putting this problem on a buggy adapter FW... Any other
> idea what could be wrong ?

Yeah, I saw the bugzilla updates.

It would be good to know if the tweaked VPD code in

	https://git.kernel.org/mkp/h/5.18/discovery

makes a difference...

-- 
Martin K. Petersen	Oracle Linux Engineering

^ permalink raw reply	[flat|nested] 75+ messages in thread

* Re: [PATCH 01/14] scsi: mpt3sas: Use cached ATA Information VPD page
  2022-03-02  5:35 ` [PATCH 01/14] scsi: mpt3sas: Use cached ATA Information VPD page Martin K. Petersen
  2022-03-02  9:47   ` Christoph Hellwig
  2022-03-02 14:18   ` Johannes Thumshirn
@ 2022-04-20 12:06   ` Hannes Reinecke
  2022-05-03  0:51   ` Martin K. Petersen
  3 siblings, 0 replies; 75+ messages in thread
From: Hannes Reinecke @ 2022-04-20 12:06 UTC (permalink / raw)
  To: Martin K. Petersen, linux-scsi; +Cc: Sreekanth Reddy

On 3/2/22 06:35, Martin K. Petersen wrote:
> We now cache VPD page 0x89 so there is no need to request it from the
> hardware. Make mpt3sas use the cached page.
> 
> Cc: Sreekanth Reddy <sreekanth.reddy@broadcom.com>
> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
> ---
>   drivers/scsi/mpt3sas/mpt3sas_scsih.c | 18 ++++++++----------
>   1 file changed, 8 insertions(+), 10 deletions(-)
> 
Reviewed-by: Hannes Reinecke <hare@suse.de>

Cheers,

Hannes
-- 
Dr. Hannes Reinecke		           Kernel Storage Architect
hare@suse.de			                  +49 911 74053 688
SUSE Software Solutions Germany GmbH, Maxfeldstr. 5, 90409 Nürnberg
HRB 36809 (AG Nürnberg), GF: Felix Imendörffer

^ permalink raw reply	[flat|nested] 75+ messages in thread

* Re: [PATCH 04/14] scsi: core: Pick suitable allocation length in scsi_report_opcode()
  2022-03-02  5:35 ` [PATCH 04/14] scsi: core: Pick suitable allocation length in scsi_report_opcode() Martin K. Petersen
                     ` (2 preceding siblings ...)
  2022-03-03  0:39   ` Bart Van Assche
@ 2022-04-20 12:06   ` Hannes Reinecke
  3 siblings, 0 replies; 75+ messages in thread
From: Hannes Reinecke @ 2022-04-20 12:06 UTC (permalink / raw)
  To: Martin K. Petersen, linux-scsi

On 3/2/22 06:35, Martin K. Petersen wrote:
> Some devices hang when a buffer size larger than expected is passed in
> the ALLOCATION LENGTH field. For REPORT SUPPORTED OPERATION CODES we
> currently only request a single command descriptor at a time and
> therefore the actual size of the command is known ahead of time. Limit
> the ALLOCATION LENGTH to the header size plus the command length of
> the opcode we are asking about.
> 
> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
> ---
>   drivers/scsi/scsi.c | 17 +++++++++++++----
>   1 file changed, 13 insertions(+), 4 deletions(-)
> 
Reviewed-by: Hannes Reinecke <hare@suse.de>

Cheers,

Hannes
-- 
Dr. Hannes Reinecke		           Kernel Storage Architect
hare@suse.de			                  +49 911 74053 688
SUSE Software Solutions Germany GmbH, Maxfeldstr. 5, 90409 Nürnberg
HRB 36809 (AG Nürnberg), GF: Felix Imendörffer

^ permalink raw reply	[flat|nested] 75+ messages in thread

* Re: [PATCH 05/14] scsi: core: Cache VPD pages b0, b1, b2
  2022-03-02  5:35 ` [PATCH 05/14] scsi: core: Cache VPD pages b0, b1, b2 Martin K. Petersen
                     ` (2 preceding siblings ...)
  2022-03-03  1:30   ` Bart Van Assche
@ 2022-04-20 12:07   ` Hannes Reinecke
  3 siblings, 0 replies; 75+ messages in thread
From: Hannes Reinecke @ 2022-04-20 12:07 UTC (permalink / raw)
  To: Martin K. Petersen, linux-scsi

On 3/2/22 06:35, Martin K. Petersen wrote:
> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
> ---
>   drivers/scsi/scsi.c        |  6 ++++++
>   drivers/scsi/scsi_sysfs.c  | 28 ++++++++++++++++++++++++++++
>   include/scsi/scsi_device.h |  4 ++++
>   3 files changed, 38 insertions(+)
> 
Maybe some description why these pages are important.
But otherwise:

Reviewed-by: Hannes Reinecke <hare@suse.de>

Cheers,

Hannes
-- 
Dr. Hannes Reinecke		           Kernel Storage Architect
hare@suse.de			                  +49 911 74053 688
SUSE Software Solutions Germany GmbH, Maxfeldstr. 5, 90409 Nürnberg
HRB 36809 (AG Nürnberg), GF: Felix Imendörffer

^ permalink raw reply	[flat|nested] 75+ messages in thread

* Re: [PATCH 01/14] scsi: mpt3sas: Use cached ATA Information VPD page
  2022-03-02  5:35 ` [PATCH 01/14] scsi: mpt3sas: Use cached ATA Information VPD page Martin K. Petersen
                     ` (2 preceding siblings ...)
  2022-04-20 12:06   ` Hannes Reinecke
@ 2022-05-03  0:51   ` Martin K. Petersen
  3 siblings, 0 replies; 75+ messages in thread
From: Martin K. Petersen @ 2022-05-03  0:51 UTC (permalink / raw)
  To: Martin K. Petersen, linux-scsi; +Cc: Sreekanth Reddy

On Wed, 2 Mar 2022 00:35:46 -0500, Martin K. Petersen wrote:

> We now cache VPD page 0x89 so there is no need to request it from the
> hardware. Make mpt3sas use the cached page.
> 
> 

Applied to 5.19/scsi-queue, thanks!

[01/14] scsi: mpt3sas: Use cached ATA Information VPD page
        https://git.kernel.org/mkp/scsi/c/dc1178767cba
[02/14] scsi: core: Query VPD size before getting full page
        https://git.kernel.org/mkp/scsi/c/c92a6b5d6335
[03/14] scsi: core: Do not truncate INQUIRY data on modern devices
        https://git.kernel.org/mkp/scsi/c/d657700ccac7
[04/14] scsi: core: Pick suitable allocation length in scsi_report_opcode()
        https://git.kernel.org/mkp/scsi/c/e17d63403076
[05/14] scsi: core: Cache VPD pages b0, b1, b2
        https://git.kernel.org/mkp/scsi/c/e60ac0b9e445
[06/14] scsi: sd: Use cached ATA Information VPD page
        https://git.kernel.org/mkp/scsi/c/e38d9e83a376
[07/14] scsi: sd: Switch to using scsi_device VPD pages
        https://git.kernel.org/mkp/scsi/c/7fb019c46eee
[08/14] scsi: sd: Optimal I/O size should be a multiple of reported granularity
        https://git.kernel.org/mkp/scsi/c/631669a256f9
[13/14] scsi: sd: Reorganize DIF/DIX code to avoid calling revalidate twice
        https://git.kernel.org/mkp/scsi/c/1e029397d12f

-- 
Martin K. Petersen	Oracle Linux Engineering

^ permalink raw reply	[flat|nested] 75+ messages in thread

* regression next-20220714: mkfs.ext4 on multipath device over scsi disks causes 'lifelock' in block layer
  2022-03-02  5:35 ` [PATCH 10/14] scsi: sd: Move WRITE_ZEROES configuration to a separate function Martin K. Petersen
  2022-03-02  9:53   ` Christoph Hellwig
  2022-03-03  0:52   ` Bart Van Assche
@ 2022-07-18 16:51   ` Benjamin Block
  2022-07-19  2:23     ` Martin K. Petersen
  2 siblings, 1 reply; 75+ messages in thread
From: Benjamin Block @ 2022-07-18 16:51 UTC (permalink / raw)
  To: Martin K. Petersen; +Cc: linux-scsi, Steffen Maier, Alexander Egorenkov

On Wed, Mar 02, 2022 at 12:35:55AM -0500, Martin K. Petersen wrote:
> In preparation for adding support for the WRITE SAME(16) NDOB flag,
> move configuration of the WRITE_ZEROES operation to a separate
> function. This is done to facilitate fetching all VPD pages before
> choosing the appropriate zeroing method for a given device.
> 
> The deferred configuration also allows us to mirror the discard
> behavior and permit the user to revert a device to the kernel default
> configuration by echoing "default" to the sysfs file.
> 
> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
> ---
>  drivers/scsi/sd.c | 56 +++++++++++++++++++++++++++++++++--------------
>  drivers/scsi/sd.h |  7 ++++--
>  2 files changed, 44 insertions(+), 19 deletions(-)
> 

Hello Martin,

somehow this patch triggers a regression on s390x with zFCP in
`next-20220714`.

In our daily regression test suite a simple:

  # mkfs.ext4 -F /dev/mapper/mpathc1

causes the block layer to trip with this when trying to discard blocks
(at least that's my assumption about what its trying to do) from a SCSI
disk:

  [   33.042224] blk_insert_cloned_request: over max size limit. (4194304 > 65535)
  [   33.042228] device-mapper: multipath: 251:0: Failing path 8:0.
  [   33.042239] blk_insert_cloned_request: over max size limit. (4194304 > 65535)
  [   33.042267] device-mapper: multipath: 251:0: Failing path 8:64.
  [   33.197329] device-mapper: multipath: 251:0: Reinstating path 8:0.
  [   33.198850] device-mapper: multipath: 251:0: Reinstating path 8:64.
  [   33.210742] blk_insert_cloned_request: over max size limit. (4194304 > 65535)
  [   33.210752] device-mapper: multipath: 251:0: Failing path 8:0.
  [   33.210771] blk_insert_cloned_request: over max size limit. (4194304 > 65535)
  [   33.210792] device-mapper: multipath: 251:0: Failing path 8:64.
  [   38.200929] device-mapper: multipath: 251:0: Reinstating path 8:0.
  [   38.201489] device-mapper: multipath: 251:0: Reinstating path 8:64.
  [   38.220039] blk_insert_cloned_request: over max size limit. (4194304 > 65535)
  [   38.220045] device-mapper: multipath: 251:0: Failing path 8:0.
  [   38.220056] blk_insert_cloned_request: over max size limit. (4194304 > 65535)
  [   38.220060] device-mapper: multipath: 251:0: Failing path 8:64.
  [   43.202538] device-mapper: multipath: 251:0: Reinstating path 8:0.
  [   43.203015] device-mapper: multipath: 251:0: Reinstating path 8:64.
  [   43.219877] blk_insert_cloned_request: over max size limit. (4194304 > 65535)
  [   43.219881] device-mapper: multipath: 251:0: Failing path 8:0.
  [   43.219889] blk_insert_cloned_request: over max size limit. (4194304 > 65535)
  [   43.219892] device-mapper: multipath: 251:0: Failing path 8:64.
  [   48.204035] device-mapper: multipath: 251:0: Reinstating path 8:0.
  [   48.204526] device-mapper: multipath: 251:0: Reinstating path 8:64.
  [   48.219951] blk_insert_cloned_request: over max size limit. (4194304 > 65535)
  [   48.219964] device-mapper: multipath: 251:0: Failing path 8:0.
  [   48.219990] blk_insert_cloned_request: over max size limit. (4194304 > 65535)
  [   48.219996] device-mapper: multipath: 251:0: Failing path 8:64.
  [   53.205456] device-mapper: multipath: 251:0: Reinstating path 8:0.
  [   53.206950] device-mapper: multipath: 251:0: Reinstating path 8:64.
  [   53.219820] blk_insert_cloned_request: over max size limit. (4194304 > 65535)
  [   53.219824] device-mapper: multipath: 251:0: Failing path 8:0.
  [   53.219834] blk_insert_cloned_request: over max size limit. (4194304 > 65535)
  [   53.219837] device-mapper: multipath: 251:0: Failing path 8:64.
  [   58.209693] device-mapper: multipath: 251:0: Reinstating path 8:0.
  [   58.210143] device-mapper: multipath: 251:0: Reinstating path 8:64.

This continues ad infinitum.

I suspected this patchset as it's new in next-20220714, and next-20220704
didn't have this bug in our regression runs. I didn't see any other 'obvious'
patch in scsi or block that has a diff between those two tags.

I started bisecting between:
  9821106213c8 ("scsi: zfcp: Drop redundant "the" in the comments")
as good, and
  f095c3cd1b69 ("scsi: qla2xxx: Update version to 10.02.07.800-k")
as bad; and ended up at:
  1bd95bb98f83 ("scsi: sd: Move WRITE_ZEROES configuration to a separate function")

I ran this on Fedora 36, with mkfs.ext4 1.46.5 (30-Dec-2021).

The multipath device:

  create: mpathc (36005076307ffc5e3000000000000805c) dm-0 IBM,2107900
  size=20G features='1 queue_if_no_path' hwhandler='1 alua' wp=rw
  `-+- policy='service-time 0' prio=50 status=active
    |- 0:0:0:1079787648 sda 8:0   active ready running
    `- 1:0:0:1079787648 sde 8:64  active ready running

Some information on the block devices and topology:

  # lsblk /dev/sda /dev/sde
  NAME        MAJ:MIN RM SIZE RO TYPE  MOUNTPOINTS
  sda           8:0    0  20G  0 disk
  └─mpathc    251:0    0  20G  0 mpath
    └─mpathc1 251:2    0  20G  0 part
  sde           8:64   0  20G  0 disk
  └─mpathc    251:0    0  20G  0 mpath
    └─mpathc1 251:2    0  20G  0 part
  # lsblk -t /dev/sda /dev/sde
  NAME        ALIGNMENT MIN-IO OPT-IO PHY-SEC LOG-SEC ROTA SCHED       RQ-SIZE  RA WSAME
  sda                 0    512      0     512     512    1 bfq             256 512    0B
  └─mpathc            0    512      0     512     512    1 mq-deadline     256 128    0B
    └─mpathc1         0    512      0     512     512    1                 128 128    0B
  sde                 0    512      0     512     512    1 bfq             256 512    0B
  └─mpathc            0    512      0     512     512    1 mq-deadline     256 128    0B
    └─mpathc1         0    512      0     512     512    1                 128 128    0B
  # lsblk -D /dev/sda /dev/sde
  NAME        DISC-ALN DISC-GRAN DISC-MAX DISC-ZERO
  sda                0        1G       4G         0
  └─mpathc           0        1G       4G         0
    └─mpathc1        0        1G       4G         0
  sde                0        1G       4G         0
  └─mpathc           0        1G       4G         0
    └─mpathc1        0        1G       4G         0
  # lsblk -S /dev/sda /dev/sde
  NAME HCTL             TYPE VENDOR   MODEL    REV SERIAL      TRAN
  sda  0:0:0:1079787648 disk IBM      2107900 1060 75DL241805C fc
  sde  1:0:0:1079787648 disk IBM      2107900 1060 75DL241805C fc

Any idea why this is happening? In case you need more details, this
reproduces very reliably here.

-- 
Best Regards, Benjamin Block  / Linux on IBM Z Kernel Development / IBM Systems
IBM Deutschland Research & Development GmbH    /    https://www.ibm.com/privacy
Vorsitz. AufsR.: Gregor Pillen         /         Geschäftsführung: David Faller
Sitz der Gesellschaft: Böblingen / Registergericht: AmtsG Stuttgart, HRB 243294

^ permalink raw reply	[flat|nested] 75+ messages in thread

* Re: regression next-20220714: mkfs.ext4 on multipath device over scsi disks causes 'lifelock' in block layer
  2022-07-18 16:51   ` regression next-20220714: mkfs.ext4 on multipath device over scsi disks causes 'lifelock' in block layer Benjamin Block
@ 2022-07-19  2:23     ` Martin K. Petersen
  2022-07-19 11:37       ` Benjamin Block
  0 siblings, 1 reply; 75+ messages in thread
From: Martin K. Petersen @ 2022-07-19  2:23 UTC (permalink / raw)
  To: Benjamin Block
  Cc: Martin K. Petersen, linux-scsi, Steffen Maier, Alexander Egorenkov


Benjamin,

>   # lsblk -D /dev/sda /dev/sde
>   NAME        DISC-ALN DISC-GRAN DISC-MAX DISC-ZERO
>   sda                0        1G       4G         0
>   └─mpathc           0        1G       4G         0
>     └─mpathc1        0        1G       4G         0
>   sde                0        1G       4G         0
>   └─mpathc           0        1G       4G         0
>     └─mpathc1        0        1G       4G         0

>   # lsblk -S /dev/sda /dev/sde
>   NAME HCTL             TYPE VENDOR   MODEL    REV SERIAL      TRAN
>   sda  0:0:0:1079787648 disk IBM      2107900 1060 75DL241805C fc
>   sde  1:0:0:1079787648 disk IBM      2107900 1060 75DL241805C fc
>
> Any idea why this is happening? In case you need more details, this
> reproduces very reliably here.

Are the reported discard limits different on kernels predating that
commit?

Please send the output of:

# grep . /sys/block/sdN/queue/discard_* /sys/block/sdN/device/scsi_disk/*/*_mode
# sg_readcap -l /dev/sdN
# sg_vpg -p bl /dev/sdN
# sg_vpg -p lbpv /dev/sdN

Ideally (for the grep) before and after the offending commit.

Thanks!

-- 
Martin K. Petersen	Oracle Linux Engineering

^ permalink raw reply	[flat|nested] 75+ messages in thread

* Re: regression next-20220714: mkfs.ext4 on multipath device over scsi disks causes 'lifelock' in block layer
  2022-07-19  2:23     ` Martin K. Petersen
@ 2022-07-19 11:37       ` Benjamin Block
  2022-07-21  2:29         ` Martin K. Petersen
  0 siblings, 1 reply; 75+ messages in thread
From: Benjamin Block @ 2022-07-19 11:37 UTC (permalink / raw)
  To: Martin K. Petersen; +Cc: linux-scsi, Steffen Maier, Alexander Egorenkov

On Mon, Jul 18, 2022 at 10:23:26PM -0400, Martin K. Petersen wrote:
> Please send the output of:
> 
> # grep . /sys/block/sdN/queue/discard_* /sys/block/sdN/device/scsi_disk/*/*_mode
> # sg_readcap -l /dev/sdN
> # sg_vpg -p bl /dev/sdN
> # sg_vpg -p lbpv /dev/sdN
> 
> Ideally (for the grep) before and after the offending commit.

Sure,

I assume with `sg_vpg` you mean `sg_vpd`.

1bd95bb98f83 ("scsi: sd: Move WRITE_ZEROES configuration to a separate function")
---------------------------------------------------------------------------------

This is the first bad commit 

  # lsblk -s /dev/mapper/mpathe1
  NAME     MAJ:MIN RM SIZE RO TYPE  MOUNTPOINTS
  mpathe1  251:5    0  20G  0 part
  └─mpathe 251:2    0  20G  0 mpath
    ├─sde    8:64   0  20G  0 disk
    └─sdi    8:128  0  20G  0 disk
  
  # ll /dev/mapper/{mpathe1,mpathe}
  lrwxrwxrwx. 1 root root 7 Jul 19 12:52 /dev/mapper/mpathe -> ../dm-2
  lrwxrwxrwx. 1 root root 7 Jul 19 12:52 /dev/mapper/mpathe1 -> ../dm-5
  
  # lsblk -st /dev/mapper/mpathe1
  NAME     ALIGNMENT MIN-IO OPT-IO PHY-SEC LOG-SEC ROTA SCHED       RQ-SIZE  RA WSAME
  mpathe1          0    512      0     512     512    1                 128 128    0B
  └─mpathe         0    512      0     512     512    1 mq-deadline     256 128    0B
    ├─sde          0    512      0     512     512    1 bfq             256 512    0B
    └─sdi          0    512      0     512     512    1 bfq             256 512    0B
  
  # lsblk -sD /dev/mapper/mpathe1
  NAME     DISC-ALN DISC-GRAN DISC-MAX DISC-ZERO
  mpathe1         0        1G      32M         0
  └─mpathe        0        1G      32M         0
    ├─sde         0        1G      32M         0
    └─sdi         0        1G      32M         0
  
  # grep -H . /sys/block/{sde,sdi,dm-2,dm-5}/queue/discard_* /sys/block/{sde,sdi}/device/scsi_disk/*/*_mode
  /sys/block/sde/queue/discard_granularity:1073741824
  /sys/block/sde/queue/discard_max_bytes:33553920
  /sys/block/sde/queue/discard_max_hw_bytes:33553920
  /sys/block/sde/queue/discard_zeroes_data:0
  /sys/block/sdi/queue/discard_granularity:1073741824
  /sys/block/sdi/queue/discard_max_bytes:33553920
  /sys/block/sdi/queue/discard_max_hw_bytes:33553920
  /sys/block/sdi/queue/discard_zeroes_data:0
  /sys/block/dm-2/queue/discard_granularity:1073741824
  /sys/block/dm-2/queue/discard_max_bytes:33553920
  /sys/block/dm-2/queue/discard_max_hw_bytes:33553920
  /sys/block/dm-2/queue/discard_zeroes_data:0
  /sys/block/dm-5/queue/discard_granularity:1073741824
  /sys/block/dm-5/queue/discard_max_bytes:33553920
  /sys/block/dm-5/queue/discard_max_hw_bytes:33553920
  /sys/block/dm-5/queue/discard_zeroes_data:0
  /sys/block/sde/device/scsi_disk/2:0:0:1083719810/protection_mode:none
  /sys/block/sde/device/scsi_disk/2:0:0:1083719810/provisioning_mode:writesame_16
  /sys/block/sde/device/scsi_disk/2:0:0:1083719810/zeroing_mode:writesame_16_unmap
  /sys/block/sdi/device/scsi_disk/3:0:0:1083719810/protection_mode:none
  /sys/block/sdi/device/scsi_disk/3:0:0:1083719810/provisioning_mode:writesame_16
  /sys/block/sdi/device/scsi_disk/3:0:0:1083719810/zeroing_mode:writesame_16_unmap
  
  # sg_readcap -l /dev/sde
  Read Capacity results:
     Protection: prot_en=0, p_type=0, p_i_exponent=0
     Logical block provisioning: lbpme=1, lbprz=1
     Last LBA=41943039 (0x27fffff), Number of logical blocks=41943040
     Logical block length=512 bytes
     Logical blocks per physical block exponent=0
     Lowest aligned LBA=0
  Hence:
     Device size: 21474836480 bytes, 20480.0 MiB, 21.47 GB
  
  # dmesg | tail -n 5
  [  111.308428] sd 2:0:0:1083719810: [sde] tag#2053 Done: SUCCESS Result: hostbyte=DID_TARGET_FAILURE driverbyte=DRIVER_OK cmd_age=0s
  [  111.308438] sd 2:0:0:1083719810: [sde] tag#2053 CDB: Inquiry 12 01 b9 00 04 00
  [  111.308441] sd 2:0:0:1083719810: [sde] tag#2053 Sense Key : Illegal Request [current]
  [  111.308444] sd 2:0:0:1083719810: [sde] tag#2053 Add. Sense: Invalid field in cdb
  [  111.311099]  sde: sde1
  
  # sg_readcap -l /dev/sdi
  Read Capacity results:
     Protection: prot_en=0, p_type=0, p_i_exponent=0
     Logical block provisioning: lbpme=1, lbprz=1
     Last LBA=41943039 (0x27fffff), Number of logical blocks=41943040
     Logical block length=512 bytes
     Logical blocks per physical block exponent=0
     Lowest aligned LBA=0
  Hence:
     Device size: 21474836480 bytes, 20480.0 MiB, 21.47 GB
  
  # dmesg | tail -n 5
  [  125.621343] sd 3:0:0:1083719810: [sdi] tag#2325 Done: SUCCESS Result: hostbyte=DID_TARGET_FAILURE driverbyte=DRIVER_OK cmd_age=0s
  [  125.621352] sd 3:0:0:1083719810: [sdi] tag#2325 CDB: Inquiry 12 01 b9 00 04 00
  [  125.621355] sd 3:0:0:1083719810: [sdi] tag#2325 Sense Key : Illegal Request [current]
  [  125.621358] sd 3:0:0:1083719810: [sdi] tag#2325 Add. Sense: Invalid field in cdb
  [  125.623898]  sdi: sdi1
  
  # sg_vpd -p bl /dev/sde
  Block limits VPD page (SBC):
    Write same non-zero (WSNZ): 0
    Maximum compare and write length: 1 blocks
    Optimal transfer length granularity: 0 blocks [not reported]
    Maximum transfer length: 0 blocks [not reported]
    Optimal transfer length: 0 blocks [not reported]
    Maximum prefetch transfer length: 0 blocks [ignored]
    Maximum unmap LBA count: -1 [unbounded]
    Maximum unmap block descriptor count: 0 [Unmap command not implemented]
    Optimal unmap granularity: 2097152 blocks
    Unmap granularity alignment valid: true
    Unmap granularity alignment: 0
    Maximum write same length: 0 blocks [not reported]
    Maximum atomic transfer length: 0 blocks [not reported]
    Atomic alignment: 0 [unaligned atomic writes permitted]
    Atomic transfer length granularity: 0 [no granularity requirement
    Maximum atomic transfer length with atomic boundary: 0 blocks [not reported]
    Maximum atomic boundary size: 0 blocks [can only write atomic 1 block]
  
  # sg_vpd -p lbpv /dev/sde
  Logical block provisioning VPD page (SBC):
    Unmap command supported (LBPU): 0
    Write same (16) with unmap bit supported (LBPWS): 1
    Write same (10) with unmap bit supported (LBPWS10): 0
    Logical block provisioning read zeros (LBPRZ): 0
    Anchored LBAs supported (ANC_SUP): 0
    Threshold exponent: 21
    Descriptor present (DP): 0
    Minimum percentage: 0 [not reported]
    Provisioning type: 0 (not known or fully provisioned)
    Threshold percentage: 0 [percentages not supported]
  
  # sg_vpd -p bl /dev/sdi
  Block limits VPD page (SBC):
    Write same non-zero (WSNZ): 0
    Maximum compare and write length: 1 blocks
    Optimal transfer length granularity: 0 blocks [not reported]
    Maximum transfer length: 0 blocks [not reported]
    Optimal transfer length: 0 blocks [not reported]
    Maximum prefetch transfer length: 0 blocks [ignored]
    Maximum unmap LBA count: -1 [unbounded]
    Maximum unmap block descriptor count: 0 [Unmap command not implemented]
    Optimal unmap granularity: 2097152 blocks
    Unmap granularity alignment valid: true
    Unmap granularity alignment: 0
    Maximum write same length: 0 blocks [not reported]
    Maximum atomic transfer length: 0 blocks [not reported]
    Atomic alignment: 0 [unaligned atomic writes permitted]
    Atomic transfer length granularity: 0 [no granularity requirement
    Maximum atomic transfer length with atomic boundary: 0 blocks [not reported]
    Maximum atomic boundary size: 0 blocks [can only write atomic 1 block]
  
  # sg_vpd -p lbpv /dev/sdi
  Logical block provisioning VPD page (SBC):
    Unmap command supported (LBPU): 0
    Write same (16) with unmap bit supported (LBPWS): 1
    Write same (10) with unmap bit supported (LBPWS10): 0
    Logical block provisioning read zeros (LBPRZ): 0
    Anchored LBAs supported (ANC_SUP): 0
    Threshold exponent: 21
    Descriptor present (DP): 0
    Minimum percentage: 0 [not reported]
    Provisioning type: 0 (not known or fully provisioned)
    Threshold percentage: 0 [percentages not supported]
  
  # mkfs.ext4 -F /dev/mapper/mpathe1
  ...
  [  307.192885] blk_insert_cloned_request: over max size limit. (4194304 > 65535)
  [  307.192892] device-mapper: multipath: 251:2: Failing path 8:128.
  [  307.192938] blk_insert_cloned_request: over max size limit. (4194304 > 65535)
  [  307.192941] device-mapper: multipath: 251:2: Failing path 8:64.
  [  311.548555] device-mapper: multipath: 251:2: Reinstating path 8:128.
  [  311.548883] device-mapper: multipath: 251:2: Reinstating path 8:64.
  [  311.562499] blk_insert_cloned_request: over max size limit. (4194304 > 65535)
  [  311.562521] device-mapper: multipath: 251:2: Failing path 8:128.
  [  311.562553] blk_insert_cloned_request: over max size limit. (4194304 > 65535)
  [  311.562557] device-mapper: multipath: 251:2: Failing path 8:64.
  ...

5be0f08e9d95 ("scsi: sd: Fix discard errors during revalidate")
---------------------------------------------------------------

This is the last good commit

  # lsblk -s /dev/mapper/mpathe1
  NAME     MAJ:MIN RM SIZE RO TYPE  MOUNTPOINTS
  mpathe1  251:6    0  20G  0 part
  └─mpathe 251:2    0  20G  0 mpath
    ├─sde    8:64   0  20G  0 disk
    └─sdf    8:80   0  20G  0 disk
  
  # ll /dev/mapper/{mpathe1,mpathe}
  lrwxrwxrwx. 1 root root 7 Jul 19 12:29 /dev/mapper/mpathe -> ../dm-2
  lrwxrwxrwx. 1 root root 7 Jul 19 12:37 /dev/mapper/mpathe1 -> ../dm-6
  
  # lsblk -st /dev/mapper/mpathe1
  NAME     ALIGNMENT MIN-IO OPT-IO PHY-SEC LOG-SEC ROTA SCHED       RQ-SIZE  RA WSAME
  mpathe1          0    512      0     512     512    1                 128 128    0B
  └─mpathe         0    512      0     512     512    1 mq-deadline     256 128    0B
    ├─sde          0    512      0     512     512    1 bfq             256 512    0B
    └─sdf          0    512      0     512     512    1 bfq             256 512    0B
  
  # lsblk -sD /dev/mapper/mpathe1
  NAME     DISC-ALN DISC-GRAN DISC-MAX DISC-ZERO
  mpathe1         0        1G       4G         0
  └─mpathe        0        1G       4G         0
    ├─sde         0        1G       4G         0
    └─sdf         0        1G       4G         0
  
  # grep -H . /sys/block/{sde,sdf,dm-2,dm-6}/queue/discard_* /sys/block/{sde,sdf}/device/scsi_disk/*/*_mode
  /sys/block/sde/queue/discard_granularity:1073741824
  /sys/block/sde/queue/discard_max_bytes:4294966784
  /sys/block/sde/queue/discard_max_hw_bytes:4294966784
  /sys/block/sde/queue/discard_zeroes_data:0
  /sys/block/sdf/queue/discard_granularity:1073741824
  /sys/block/sdf/queue/discard_max_bytes:4294966784
  /sys/block/sdf/queue/discard_max_hw_bytes:4294966784
  /sys/block/sdf/queue/discard_zeroes_data:0
  /sys/block/dm-2/queue/discard_granularity:1073741824
  /sys/block/dm-2/queue/discard_max_bytes:4294966784
  /sys/block/dm-2/queue/discard_max_hw_bytes:4294966784
  /sys/block/dm-2/queue/discard_zeroes_data:0
  /sys/block/dm-6/queue/discard_granularity:1073741824
  /sys/block/dm-6/queue/discard_max_bytes:4294966784
  /sys/block/dm-6/queue/discard_max_hw_bytes:4294966784
  /sys/block/dm-6/queue/discard_zeroes_data:0
  /sys/block/sde/device/scsi_disk/2:0:0:1083719810/protection_mode:none
  /sys/block/sde/device/scsi_disk/2:0:0:1083719810/provisioning_mode:writesame_16
  /sys/block/sde/device/scsi_disk/2:0:0:1083719810/zeroing_mode:writesame_16_unmap
  /sys/block/sdf/device/scsi_disk/3:0:0:1083719810/protection_mode:none
  /sys/block/sdf/device/scsi_disk/3:0:0:1083719810/provisioning_mode:writesame_16
  /sys/block/sdf/device/scsi_disk/3:0:0:1083719810/zeroing_mode:writesame_16_unmap
  
  # mkfs.ext4 -F /dev/mapper/mpathe1
  mke2fs 1.46.5 (30-Dec-2021)
  Discarding device blocks: done
  Creating filesystem with 5242368 4k blocks and 1310720 inodes
  Filesystem UUID: 5d0dc4c2-445c-4a90-aaa1-0998459497c5
  Superblock backups stored on blocks:
  	32768, 98304, 163840, 229376, 294912, 819200, 884736, 1605632, 2654208,
  	4096000
  
  Allocating group tables: done
  Writing inode tables: done
  Creating journal (32768 blocks): done
  Writing superblocks and filesystem accounting information: done

This is a IBM DS8870 (first announced in 2012):
https://www.ibm.com/common/ssi/rep_sm/4/877/ENUS2424-_h04/index.html

This is one of the oldest storage boxes we have right now, and this
regression it doesn't seem to happen on newer models as far as I can
see.

-- 
Best Regards, Benjamin Block  / Linux on IBM Z Kernel Development / IBM Systems
IBM Deutschland Research & Development GmbH    /    https://www.ibm.com/privacy
Vorsitz. AufsR.: Gregor Pillen         /         Geschäftsführung: David Faller
Sitz der Gesellschaft: Böblingen / Registergericht: AmtsG Stuttgart, HRB 243294

^ permalink raw reply	[flat|nested] 75+ messages in thread

* Re: regression next-20220714: mkfs.ext4 on multipath device over scsi disks causes 'lifelock' in block layer
  2022-07-19 11:37       ` Benjamin Block
@ 2022-07-21  2:29         ` Martin K. Petersen
  2022-07-21 13:25           ` Benjamin Block
  0 siblings, 1 reply; 75+ messages in thread
From: Martin K. Petersen @ 2022-07-21  2:29 UTC (permalink / raw)
  To: Benjamin Block
  Cc: Martin K. Petersen, linux-scsi, Steffen Maier, Alexander Egorenkov


Benjamin,

> This is one of the oldest storage boxes we have right now, and this
> regression it doesn't seem to happen on newer models as far as I can
> see.

I have not had much luck reproducing your results today despite
reporting the same parameters in the VPD pages as your device.

I would appreciate if you send me the output of:

# grep . /sys/block/{sd,dm}*/queue/write_zeroes_max_bytes /sys/block/sd*/device/scsi_disk/*/max_write_same_blocks

for the failing configuration.

Thanks!

-- 
Martin K. Petersen	Oracle Linux Engineering

^ permalink raw reply	[flat|nested] 75+ messages in thread

* Re: regression next-20220714: mkfs.ext4 on multipath device over scsi disks causes 'lifelock' in block layer
  2022-07-21  2:29         ` Martin K. Petersen
@ 2022-07-21 13:25           ` Benjamin Block
  0 siblings, 0 replies; 75+ messages in thread
From: Benjamin Block @ 2022-07-21 13:25 UTC (permalink / raw)
  To: Martin K. Petersen; +Cc: linux-scsi, Steffen Maier, Alexander Egorenkov

On Wed, Jul 20, 2022 at 10:29:06PM -0400, Martin K. Petersen wrote:
> > This is one of the oldest storage boxes we have right now, and this
> > regression it doesn't seem to happen on newer models as far as I can
> > see.
> 
> I have not had much luck reproducing your results today despite
> reporting the same parameters in the VPD pages as your device.
> 
> I would appreciate if you send me the output of:
> 
> # grep . /sys/block/{sd,dm}*/queue/write_zeroes_max_bytes /sys/block/sd*/device/scsi_disk/*/max_write_same_blocks
> 
> for the failing configuration.

Again at commit
1bd95bb98f83 ("scsi: sd: Move WRITE_ZEROES configuration to a separate function")

The device names keep changing like that because these are CI systems
that are reset and re-installed every night, and I don't get the same
system every day; just in case this looks strange.

  # lsblk -s -o '+KNAME' /dev/mapper/mpathd1
  NAME     MAJ:MIN RM SIZE RO TYPE  MOUNTPOINTS KNAME
  mpathd1  251:2    0  20G  0 part              dm-2
  └─mpathd 251:0    0  20G  0 mpath             dm-0
    ├─sda    8:0    0  20G  0 disk              sda
    └─sde    8:64   0  20G  0 disk              sde
  
  # grep . /sys/block/{sda,sde,dm-0,dm-2}/queue/write_zeroes_max_bytes /sys/block/{sda,sde}/device/scsi_disk/*/max_write_same_blocks
  /sys/block/sda/queue/write_zeroes_max_bytes:33553920
  /sys/block/sde/queue/write_zeroes_max_bytes:33553920
  /sys/block/dm-0/queue/write_zeroes_max_bytes:33553920
  /sys/block/dm-2/queue/write_zeroes_max_bytes:33553920
  /sys/block/sda/device/scsi_disk/0:0:0:1079787648/max_write_same_blocks:65535
  /sys/block/sde/device/scsi_disk/1:0:0:1079787648/max_write_same_blocks:65535
  
  # for fl in /sys/block/sda/device/scsi_disk/*/device/inquiry /sys/block/sda/device/scsi_disk/*/device/vpd_*; do (set -x; xxd "${fl}"); done
  + xxd /sys/block/sda/device/scsi_disk/0:0:0:1079787648/device/inquiry
  00000000: 0000 0532 9f10 1002 4942 4d20 2020 2020  ...2....IBM
  00000010: 3231 3037 3930 3020 2020 2020 2020 2020  2107900
  00000020: 3130 3630 3735 444c 3234 3138 3035 4320  106075DL241805C
  00000030: 2020 2020 2020 2020 0000 0060 0da0 0a00          ...`....
  00000040: 0300 0320 0000 0000 0000 0000 0000 0000  ... ............
  00000050: 0000 0000 0000 0000 0000 0000 0000 0000  ................
  00000060: 0101 3037 3500 3034 3731 3400 0000 8800  ..075.04714.....
  00000070: 0000 0000 0000 0000 0000 0000 0000 0000  ................
  00000080: 0000 0000 0000 0000 0000 0000 0000 0000  ................
  00000090: 0000 0000 0000 0000 0000 0000 0000 0000  ................
  000000a0: 0002 0800                                ....
  + xxd /sys/block/sda/device/scsi_disk/0:0:0:1079787648/device/vpd_pg0
  00000000: 0000 000a 0080 8386 b0b1 b2c0 c1c2       ..............
  + xxd /sys/block/sda/device/scsi_disk/0:0:0:1079787648/device/vpd_pg80
  00000000: 0080 0010 3735 444c 3234 3138 3035 4320  ....75DL241805C
  00000010: 2020 2020
  + xxd /sys/block/sda/device/scsi_disk/0:0:0:1079787648/device/vpd_pg83
  00000000: 0083 0024 0103 0010 6005 0763 07ff c5e3  ...$....`..c....
  00000010: 0000 0000 0000 805c 0114 0004 0000 0101  .......\........
  00000020: 0115 0004 0000 0000                      ........
  + xxd /sys/block/sda/device/scsi_disk/0:0:0:1079787648/device/vpd_pgb0
  00000000: 00b0 003c 0001 0000 0000 0000 0000 0000  ...<............
  00000010: 0000 0000 ffff ffff 0000 0000 0020 0000  ............. ..
  00000020: 8000 0000 0000 0000 0000 0000 0000 0000  ................
  00000030: 0000 0000 0000 0000 0000 0000 0000 0000  ................
  + xxd /sys/block/sda/device/scsi_disk/0:0:0:1079787648/device/vpd_pgb1
  00000000: 00b1 003c 1c20 0000 0000 0000 0000 0000  ...<. ..........
  00000010: 0000 0000 0000 0000 0000 0000 0000 0000  ................
  00000020: 0000 0000 0000 0000 0000 0000 0000 0000  ................
  00000030: 0000 0000 0000 0000 0000 0000 0000 0000  ................
  + xxd /sys/block/sda/device/scsi_disk/0:0:0:1079787648/device/vpd_pgb2
  00000000: 00b2 0004 1540 0000                      .....@..

Maybe this also helps, I took a snapshot of dmesg when the devices are
sensed:

  ...
  [    3.228175] SCSI subsystem initialized
  ...
  [    3.942878] device-mapper: core: CONFIG_IMA_DISABLE_HTABLE is disabled. Duplicate IMA measurements will not be recorded in the IMA log.
  [    3.942912] device-mapper: uevent: version 1.0.3
  [    3.942981] device-mapper: ioctl: 4.46.0-ioctl (2022-02-22) initialised: dm-devel@redhat.com
  ...
  [   31.563828] zfcp 0.0.1900: qdio: ZFCP on SC 15b using AI:1 QEBSM:0 PRI:1 TDD:1 SIGA: W
  [   32.688192] scsi host0: scsi_eh_0: sleeping
  [   32.688238] scsi host0: zfcp
  ...
  [   32.725595] scsi 0:0:0:0: scsi scan: INQUIRY pass 1 length 36
  [   32.725921] scsi 0:0:0:0: scsi scan: INQUIRY successful with code 0x0
  [   32.725931] scsi 0:0:0:0: scsi scan: INQUIRY pass 2 length 164
  [   32.726133] scsi 0:0:0:0: scsi scan: INQUIRY successful with code 0x0
  [   32.726139] scsi 0:0:0:0: scsi scan: peripheral device type of 31, no device added
  [   32.726560] scsi 0:0:0:0: scsi scan: Sending REPORT LUNS to (try 0)
  [   32.727220] scsi 0:0:0:0: scsi scan: REPORT LUNS successful (try 0) result 0x0
  [   32.727222] scsi 0:0:0:0: scsi scan: REPORT LUN scan
  [   32.727475] scsi 0:0:0:1079787648: scsi scan: INQUIRY pass 1 length 36
  [   32.727717] scsi 0:0:0:1079787648: scsi scan: INQUIRY successful with code 0x0
  [   32.727723] scsi 0:0:0:1079787648: scsi scan: INQUIRY pass 2 length 164
  [   32.727927] scsi 0:0:0:1079787648: scsi scan: INQUIRY successful with code 0x0
  [   32.727935] scsi 0:0:0:1079787648: Direct-Access     IBM      2107900          1060 PQ: 0 ANSI: 5
  [   32.729982] scsi 0:0:0:1079787648: sg_alloc: dev=0
  [   32.730027] sd 0:0:0:1079787648: Attached scsi generic sg0 type 0
  ...
  [   32.730502] sd 0:0:0:1079787648: Power-on or device reset occurred
  [   32.730508] sd 0:0:0:1079787648: [sda] tag#2176 Done: SUCCESS Result: hostbyte=DID_OK driverbyte=DRIVER_OK cmd_age=0s
  [   32.730511] sd 0:0:0:1079787648: [sda] tag#2176 CDB: Test Unit Ready 00 00 00 00 00 00
  [   32.730514] sd 0:0:0:1079787648: [sda] tag#2176 Sense Key : Unit Attention [current]
  [   32.730516] sd 0:0:0:1079787648: [sda] tag#2176 Add. Sense: Power on, reset, or bus device reset occurred
  ...
  [   32.730774] sd 0:0:0:1079787648: [sda] 41943040 512-byte logical blocks: (21.5 GB/20.0 GiB)
  [   32.730928] sd 0:0:0:1079787648: [sda] Write Protect is off
  [   32.730930] sd 0:0:0:1079787648: [sda] Mode Sense: ed 00 00 08
  [   32.731243] sd 0:0:0:1079787648: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
  [   32.731376] sd 0:0:0:1079787648: [sda] tag#3781 Done: SUCCESS Result: hostbyte=DID_TARGET_FAILURE driverbyte=DRIVER_OK cmd_age=0s
  [   32.731379] sd 0:0:0:1079787648: [sda] tag#3781 CDB: Report supported operation codes a3 0c 01 12 00 00 00 00 00 0a 00 00
  [   32.731381] sd 0:0:0:1079787648: [sda] tag#3781 Sense Key : Illegal Request [current]
  [   32.731383] sd 0:0:0:1079787648: [sda] tag#3781 Add. Sense: Invalid field in cdb
  ...
  [   32.734915] sd 0:0:0:1079787648: [sda] tag#3784 Done: SUCCESS Result: hostbyte=DID_TARGET_FAILURE driverbyte=DRIVER_OK cmd_age=0s
  [   32.734917] sd 0:0:0:1079787648: [sda] tag#3784 CDB: Inquiry 12 01 b9 00 04 00
  [   32.734919] sd 0:0:0:1079787648: [sda] tag#3784 Sense Key : Illegal Request [current]
  [   32.734921] sd 0:0:0:1079787648: [sda] tag#3784 Add. Sense: Invalid field in cdb
  ...
  [   32.737032]  sda: sda1
  ...
  [   32.737185] sd 0:0:0:1079787648: [sda] Attached SCSI disk
  ...
  [   37.294939] alua: device handler registered
  [   37.296390] emc: device handler registered
  [   37.297819] rdac: device handler registered
  [   37.351199] device-mapper: multipath service-time: version 0.3.0 loaded
  [   37.351424] sd 0:0:0:1079787648: alua: supports implicit TPGS
  [   37.351427] sd 0:0:0:1079787648: alua: device naa.6005076307ffc5e3000000000000805c port group 0 rel port 101
  ...
  [   37.378201] sd 0:0:0:1079787648: alua: transition timeout set to 60 seconds
  [   37.378204] sd 0:0:0:1079787648: alua: port group 00 state A preferred supports tolusnA
  ...

Thanks for the support Martin.

-- 
Best Regards, Benjamin Block  / Linux on IBM Z Kernel Development / IBM Systems
IBM Deutschland Research & Development GmbH    /    https://www.ibm.com/privacy
Vorsitz. AufsR.: Gregor Pillen         /         Geschäftsführung: David Faller
Sitz der Gesellschaft: Böblingen / Registergericht: AmtsG Stuttgart, HRB 243294

^ permalink raw reply	[flat|nested] 75+ messages in thread

end of thread, other threads:[~2022-07-21 13:26 UTC | newest]

Thread overview: 75+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-03-02  5:35 SCSI discovery update Martin K. Petersen
2022-03-02  5:35 ` [PATCH 01/14] scsi: mpt3sas: Use cached ATA Information VPD page Martin K. Petersen
2022-03-02  9:47   ` Christoph Hellwig
2022-03-02 14:18   ` Johannes Thumshirn
2022-04-20 12:06   ` Hannes Reinecke
2022-05-03  0:51   ` Martin K. Petersen
2022-03-02  5:35 ` [PATCH 02/14] scsi: core: Query VPD size before getting full page Martin K. Petersen
2022-03-02  9:48   ` Christoph Hellwig
2022-03-02 14:25   ` Johannes Thumshirn
2022-03-04  3:42     ` Martin K. Petersen
2022-03-03  0:30   ` Bart Van Assche
2022-03-04  3:28     ` Martin K. Petersen
2022-03-02  5:35 ` [PATCH 03/14] scsi: core: Do not truncate INQUIRY data on modern devices Martin K. Petersen
2022-03-02  9:49   ` Christoph Hellwig
2022-03-02 14:27   ` Johannes Thumshirn
2022-03-03  0:14   ` Bart Van Assche
2022-03-04  3:40     ` Martin K. Petersen
2022-03-02  5:35 ` [PATCH 04/14] scsi: core: Pick suitable allocation length in scsi_report_opcode() Martin K. Petersen
2022-03-02  9:49   ` Christoph Hellwig
2022-03-02 14:29   ` Johannes Thumshirn
2022-03-03  0:39   ` Bart Van Assche
2022-04-20 12:06   ` Hannes Reinecke
2022-03-02  5:35 ` [PATCH 05/14] scsi: core: Cache VPD pages b0, b1, b2 Martin K. Petersen
2022-03-02  9:50   ` Christoph Hellwig
2022-03-02 14:30   ` Johannes Thumshirn
2022-03-03  1:30   ` Bart Van Assche
2022-04-20 12:07   ` Hannes Reinecke
2022-03-02  5:35 ` [PATCH 06/14] scsi: sd: Use cached ATA Information VPD page Martin K. Petersen
2022-03-02  9:50   ` Christoph Hellwig
2022-03-03  0:40   ` Bart Van Assche
2022-03-04  9:29   ` Johannes Thumshirn
2022-03-02  5:35 ` [PATCH 07/14] scsi: sd: Switch to using scsi_device VPD pages Martin K. Petersen
2022-03-02  9:51   ` Christoph Hellwig
2022-03-03  0:42   ` Bart Van Assche
2022-03-04  9:29   ` Johannes Thumshirn
2022-03-02  5:35 ` [PATCH 08/14] scsi: sd: Optimal I/O size should be a multiple of reported granularity Martin K. Petersen
2022-03-02  9:51   ` Christoph Hellwig
2022-03-03 20:17   ` Bart Van Assche
2022-03-04  3:45     ` Martin K. Petersen
2022-03-04  5:06       ` Bart Van Assche
2022-03-02  5:35 ` [PATCH 09/14] scsi: sd: Fix discard errors during revalidate Martin K. Petersen
2022-03-02  9:52   ` Christoph Hellwig
2022-03-03 21:06   ` Bart Van Assche
2022-03-04  3:55     ` Martin K. Petersen
2022-03-06  0:35       ` Bart Van Assche
2022-03-02  5:35 ` [PATCH 10/14] scsi: sd: Move WRITE_ZEROES configuration to a separate function Martin K. Petersen
2022-03-02  9:53   ` Christoph Hellwig
2022-03-03  0:52   ` Bart Van Assche
2022-07-18 16:51   ` regression next-20220714: mkfs.ext4 on multipath device over scsi disks causes 'lifelock' in block layer Benjamin Block
2022-07-19  2:23     ` Martin K. Petersen
2022-07-19 11:37       ` Benjamin Block
2022-07-21  2:29         ` Martin K. Petersen
2022-07-21 13:25           ` Benjamin Block
2022-03-02  5:35 ` [PATCH 11/14] scsi: sd: Implement support for NDOB flag in WRITE SAME(16) Martin K. Petersen
2022-03-02  9:54   ` Christoph Hellwig
2022-03-03  1:29   ` Bart Van Assche
2022-03-04  9:32   ` Johannes Thumshirn
2022-03-02  5:35 ` [PATCH 12/14] scsi: sd: sd_read_cpr() requires VPD pages Martin K. Petersen
2022-03-02  9:54   ` Christoph Hellwig
2022-03-02 10:45   ` Damien Le Moal
2022-04-06  1:29     ` Damien Le Moal
2022-04-07  2:19       ` Martin K. Petersen
2022-04-07  2:36         ` Damien Le Moal
2022-04-07  2:48           ` Martin K. Petersen
2022-03-03 20:13   ` Bart Van Assche
2022-03-04  9:33   ` Johannes Thumshirn
2022-03-02  5:35 ` [PATCH 13/14] scsi: sd: Reorganize DIF/DIX code to avoid calling revalidate twice Martin K. Petersen
2022-03-02  9:57   ` Christoph Hellwig
2022-03-04  9:36   ` Johannes Thumshirn
2022-03-02  5:35 ` [PATCH 14/14] scsi: sd: Enable modern protocol features on more devices Martin K. Petersen
2022-03-02  9:58   ` Christoph Hellwig
2022-03-03  1:25   ` Bart Van Assche
2022-03-04  9:38   ` Johannes Thumshirn
2022-03-03  6:09 ` SCSI discovery update Douglas Gilbert
2022-03-04  3:26   ` Martin K. Petersen

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.