All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH][RFC] scsi: Use W_LUN for scanning
@ 2013-03-15  9:46 Hannes Reinecke
  2013-03-15 15:54 ` Steffen Maier
                   ` (2 more replies)
  0 siblings, 3 replies; 12+ messages in thread
From: Hannes Reinecke @ 2013-03-15  9:46 UTC (permalink / raw)
  To: linux-scsi
  Cc: James Bottomley, George Martin, Steffen Maier, Mike Christie,
	Hannes Reinecke

SAM advertises the use of a Well-known LUN (W_LUN) for scanning.
As this avoids exposing LUN 0 (which might be a valid LUN) for
all initiators it is the preferred method for LUN scanning on
some arrays.
So we should be using W_LUN for scanning, too. If the W_LUN is
not supported we'll fall back to use LUN 0.
For broken W_LUN implementations a new blacklist flag
'BLIST_NO_WLUN' is added.

Signed-off-by: Hannes Reinecke <hare@suse.de>

diff --git a/drivers/scsi/scsi_scan.c b/drivers/scsi/scsi_scan.c
index 3e58b22..f4ccdea 100644
--- a/drivers/scsi/scsi_scan.c
+++ b/drivers/scsi/scsi_scan.c
@@ -1312,6 +1312,7 @@ static int scsi_report_lun_scan(struct scsi_target *starget, int bflags,
 	unsigned int num_luns;
 	unsigned int retries;
 	int result;
+	int w_lun = SCSI_W_LUN_REPORT_LUNS;
 	struct scsi_lun *lunp, *lun_data;
 	u8 *data;
 	struct scsi_sense_hdr sshdr;
@@ -1337,11 +1338,20 @@ static int scsi_report_lun_scan(struct scsi_target *starget, int bflags,
 		return 0;
 	if (starget->no_report_luns)
 		return 1;
+	if (bflags & BLIST_NO_WLUN)
+		w_lun = 0;
 
+retry_report_lun_scan:
 	if (!(sdev = scsi_device_lookup_by_target(starget, 0))) {
-		sdev = scsi_alloc_sdev(starget, 0, NULL);
-		if (!sdev)
-			return 0;
+		sdev = scsi_alloc_sdev(starget, w_lun, NULL);
+		if (!sdev) {
+			if (w_lun != 0) {
+				w_lun = 0;
+				sdev = scsi_alloc_sdev(starget, w_lun, NULL);
+			}
+			if (!sdev)
+				return 0;
+		}
 		if (scsi_device_get(sdev)) {
 			__scsi_remove_device(sdev);
 			return 0;
@@ -1418,6 +1428,18 @@ static int scsi_report_lun_scan(struct scsi_target *starget, int bflags,
 	}
 
 	if (result) {
+		if (w_lun != 0 && scsi_device_created(sdev)) {
+			/*
+			 * W_LUN probably not supported, try with LUN 0
+			 */
+			SCSI_LOG_SCAN_BUS(3, printk (KERN_INFO "scsi scan:"
+					"W_LUN not supported, try LUN 0\n"));
+			kfree(lun_data);
+			scsi_device_put(sdev);
+			__scsi_remove_device(sdev);
+			w_lun = 0;
+			goto retry_report_lun_scan;
+		}
 		/*
 		 * The device probably does not support a REPORT LUN command
 		 */
diff --git a/include/scsi/scsi_devinfo.h b/include/scsi/scsi_devinfo.h
index cc1f3e7..ffb42b1 100644
--- a/include/scsi/scsi_devinfo.h
+++ b/include/scsi/scsi_devinfo.h
@@ -31,4 +31,5 @@
 #define BLIST_MAX_512		0x800000 /* maximum 512 sector cdb length */
 #define BLIST_ATTACH_PQ3	0x1000000 /* Scan: Attach to PQ3 devices */
 #define BLIST_NO_DIF		0x2000000 /* Disable T10 PI (DIF) */
+#define BLIST_NO_WLUN		0x4000000 /* Disable W_LUN scanning */
 #endif

^ permalink raw reply related	[flat|nested] 12+ messages in thread

* Re: [PATCH][RFC] scsi: Use W_LUN for scanning
  2013-03-15  9:46 [PATCH][RFC] scsi: Use W_LUN for scanning Hannes Reinecke
@ 2013-03-15 15:54 ` Steffen Maier
  2013-03-17 21:50   ` Steffen Maier
  2013-03-18 15:25   ` Hannes Reinecke
  2013-03-15 21:22 ` Douglas Gilbert
  2013-04-06  9:08 ` James Bottomley
  2 siblings, 2 replies; 12+ messages in thread
From: Steffen Maier @ 2013-03-15 15:54 UTC (permalink / raw)
  To: Hannes Reinecke; +Cc: linux-scsi, James Bottomley, George Martin, Mike Christie

While we're at it: I recently figured that there are targets responding 
to inquiry with PQ=1 && PDT=31 for LUN0 if LUN0 has no backing device 
(e.g. no disk mapped for the initiator host). While this is likely to 
work with in-kernel lun scanning, the kernel does not even allocate an 
sg dev in this case.
If the target (such do exist) now also does not support report_luns on 
WLUN, this effectively prevents any user space lun discovery.
E.g. non-NPIV zfcp-attached LUNs cannot be discovered, because we cannot 
do in-kernel scanning due to the lack of initiator lun masking.

The reason for Linux ignoring LUNs with PDT=31 and PQ=1 is:
http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=84961f28e9d13a4b193d0c8545f3c060c1890ff3
[SCSI] Don't add scsi_device for devices that return PQ=1, PDT=0x1f
from kernel 2.6.19.
Since Linux on System z could no longer perform report luns with IBM 
DS8000, the following workaround was implemented:
http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=01b291bd66564b4bd826326af6bd0b6d17e99439
[SCSI] fix check of PQ and PDT bits for WLUNs
in kernel 2.6.27.
However, this only works for WLUNs reporting PDT=31 and PQ=1 but not for 
LUN0.
What made it worse is that, the attached LUN looks perfect from a zfcp 
status point of view but is still missing any user visible handle for 
the scsi midlayer, so I was puzzled as a user despite the otherwise 
clear scsi log message:
"scsi scan: peripheral device type of 31, ***no device added***".

On 03/15/2013 10:46 AM, Hannes Reinecke wrote:
> SAM advertises the use of a Well-known LUN (W_LUN) for scanning.
> As this avoids exposing LUN 0 (which might be a valid LUN) for
> all initiators it is the preferred method for LUN scanning on
> some arrays.
> So we should be using W_LUN for scanning, too. If the W_LUN is
> not supported we'll fall back to use LUN 0.
> For broken W_LUN implementations a new blacklist flag
> 'BLIST_NO_WLUN' is added.
>
> Signed-off-by: Hannes Reinecke <hare@suse.de>
>
> diff --git a/drivers/scsi/scsi_scan.c b/drivers/scsi/scsi_scan.c
> index 3e58b22..f4ccdea 100644
> --- a/drivers/scsi/scsi_scan.c
> +++ b/drivers/scsi/scsi_scan.c
> @@ -1312,6 +1312,7 @@ static int scsi_report_lun_scan(struct scsi_target *starget, int bflags,
>   	unsigned int num_luns;
>   	unsigned int retries;
>   	int result;
> +	int w_lun = SCSI_W_LUN_REPORT_LUNS;
>   	struct scsi_lun *lunp, *lun_data;
>   	u8 *data;
>   	struct scsi_sense_hdr sshdr;
> @@ -1337,11 +1338,20 @@ static int scsi_report_lun_scan(struct scsi_target *starget, int bflags,
>   		return 0;
>   	if (starget->no_report_luns)
>   		return 1;
> +	if (bflags & BLIST_NO_WLUN)
> +		w_lun = 0;
>
> +retry_report_lun_scan:
>   	if (!(sdev = scsi_device_lookup_by_target(starget, 0))) {
> -		sdev = scsi_alloc_sdev(starget, 0, NULL);
> -		if (!sdev)
> -			return 0;
> +		sdev = scsi_alloc_sdev(starget, w_lun, NULL);
> +		if (!sdev) {
> +			if (w_lun != 0) {
> +				w_lun = 0;
> +				sdev = scsi_alloc_sdev(starget, w_lun, NULL);
> +			}

> +			if (!sdev)
> +				return 0;

Semantically this belongs to the 2nd scsi_alloc_sdev(starget, w_lun, 
NULL) i.e. inside (at the end of) the body of if (w_lun != 0).
Then you can even merge the outer and inner if statement to
if (!sdev && wlun != 0)
Semantics: WLUN did not work and we haven't already tried LUN0.
Maybe it's just me but I found it more difficult to read if the 2nd 
check is on its own inside the outer if statement with exactly the same 
boolean expression.


> +		}
>   		if (scsi_device_get(sdev)) {
>   			__scsi_remove_device(sdev);
>   			return 0;

Now that we not only use LUN0, it would be nice to reflect the currently 
used LUN in the corresponding scsi logging statements for debugging and 
problem determination purposes.
It seems as if devname only contains HCT but not L, which can now be 
either WLUN or LUN0, e.g. here:

> 	for (retries = 0; retries < 3; retries++) {
> 		SCSI_LOG_SCAN_BUS(3, printk (KERN_INFO "scsi scan: Sending"
> 				" REPORT LUNS to %s (try %d)\n", devname,
> 				retries));
>
> 		result = scsi_execute_req(sdev, scsi_cmd, DMA_FROM_DEVICE,
> 					  lun_data, length, &sshdr,
> 					  SCSI_TIMEOUT + 4 * HZ, 3, NULL);
>
> 		SCSI_LOG_SCAN_BUS(3, printk (KERN_INFO "scsi scan: REPORT LUNS"
> 				" %s (try %d) result 0x%x\n", result
> 				?  "failed" : "successful", retries, result));

> @@ -1418,6 +1428,18 @@ static int scsi_report_lun_scan(struct scsi_target *starget, int bflags,
>   	}
>
>   	if (result) {
> +		if (w_lun != 0 && scsi_device_created(sdev)) {

Did you mean _!_scsi_device_created(sdev), i.e. we tried WLUN but could 
not attach it?

> +			/*
> +			 * W_LUN probably not supported, try with LUN 0
> +			 */

Does this only check for an sg dev having been created or does it check 
for the response to the report_luns cmnd? It should be the latter since 
there are targets allowing attachment of WLUN (with PQ=1 && PDT=31) but 
refuse to answer report_luns, e.g. IBM SVC / Storwize V7000.
Remember your changes to zfcp_san_disc reacting on sg_luns return code? 
[Novell bug 742352]
It would be nice if we could avoid start using BLIST_NO_WLUN right away. ;-)

> +			SCSI_LOG_SCAN_BUS(3, printk (KERN_INFO "scsi scan:"
> +					"W_LUN not supported, try LUN 0\n"));
> +			kfree(lun_data);
> +			scsi_device_put(sdev);
> +			__scsi_remove_device(sdev);
> +			w_lun = 0;
> +			goto retry_report_lun_scan;
> +		}
>   		/*
>   		 * The device probably does not support a REPORT LUN command
>   		 */
> diff --git a/include/scsi/scsi_devinfo.h b/include/scsi/scsi_devinfo.h
> index cc1f3e7..ffb42b1 100644
> --- a/include/scsi/scsi_devinfo.h
> +++ b/include/scsi/scsi_devinfo.h
> @@ -31,4 +31,5 @@
>   #define BLIST_MAX_512		0x800000 /* maximum 512 sector cdb length */
>   #define BLIST_ATTACH_PQ3	0x1000000 /* Scan: Attach to PQ3 devices */
>   #define BLIST_NO_DIF		0x2000000 /* Disable T10 PI (DIF) */
> +#define BLIST_NO_WLUN		0x4000000 /* Disable W_LUN scanning */
>   #endif
>

The overall idea seems reasonable to me.

Steffen

Linux on System z Development

IBM Deutschland Research & Development GmbH
Vorsitzende des Aufsichtsrats: Martina Koederitz
Geschäftsführung: Dirk Wittkopp
Sitz der Gesellschaft: Böblingen
Registergericht: Amtsgericht Stuttgart, HRB 243294

--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH][RFC] scsi: Use W_LUN for scanning
  2013-03-15  9:46 [PATCH][RFC] scsi: Use W_LUN for scanning Hannes Reinecke
  2013-03-15 15:54 ` Steffen Maier
@ 2013-03-15 21:22 ` Douglas Gilbert
  2013-04-06  9:08 ` James Bottomley
  2 siblings, 0 replies; 12+ messages in thread
From: Douglas Gilbert @ 2013-03-15 21:22 UTC (permalink / raw)
  To: Hannes Reinecke
  Cc: linux-scsi, James Bottomley, George Martin, Steffen Maier, Mike Christie

On 13-03-15 05:46 AM, Hannes Reinecke wrote:
> SAM advertises the use of a Well-known LUN (W_LUN) for scanning.
> As this avoids exposing LUN 0 (which might be a valid LUN) for
> all initiators it is the preferred method for LUN scanning on
> some arrays.
> So we should be using W_LUN for scanning, too. If the W_LUN is
> not supported we'll fall back to use LUN 0.
> For broken W_LUN implementations a new blacklist flag
> 'BLIST_NO_WLUN' is added.

There are proposals at T10 for feature sets to be added to
SCSI (similar to what ATA devices have). Perhaps we could
have something similar at the OS level: an opaque call
that makes some general decisions based on what we know
about the transport, HBA and possibly target/LU. So if
we are looking at a USB transport not doing UASP, then
prefer to probe LUN 0 rather than the to probe via
REPORT LUNS W_LUN. Many other (somewhat) advanced SCSI
techniques could be filtered in a similar way (e.g. if
it's a USB device assume badly implemented SCSI-2
compliance).

We could still keep the blacklist and, if we don't already
have it, add whitelist logic (e.g. for 0.001% of well-behaved
USB devices).

Doug Gilbert



^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH][RFC] scsi: Use W_LUN for scanning
  2013-03-15 15:54 ` Steffen Maier
@ 2013-03-17 21:50   ` Steffen Maier
  2013-03-18 15:25   ` Hannes Reinecke
  1 sibling, 0 replies; 12+ messages in thread
From: Steffen Maier @ 2013-03-17 21:50 UTC (permalink / raw)
  To: Hannes Reinecke; +Cc: linux-scsi, James Bottomley, George Martin, Mike Christie

just a small addendum regarding the naming of the w_lun variable

On 03/15/2013 04:54 PM, Steffen Maier wrote:
> While we're at it: I recently figured that there are targets responding
> to inquiry with PQ=1 && PDT=31 for LUN0 if LUN0 has no backing device
> (e.g. no disk mapped for the initiator host). While this is likely to
> work with in-kernel lun scanning, the kernel does not even allocate an
> sg dev in this case.
> If the target (such do exist) now also does not support report_luns on
> WLUN, this effectively prevents any user space lun discovery.
> E.g. non-NPIV zfcp-attached LUNs cannot be discovered, because we cannot
> do in-kernel scanning due to the lack of initiator lun masking.
>
> The reason for Linux ignoring LUNs with PDT=31 and PQ=1 is:
> http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=84961f28e9d13a4b193d0c8545f3c060c1890ff3
> [SCSI] Don't add scsi_device for devices that return PQ=1, PDT=0x1f
> from kernel 2.6.19.
> Since Linux on System z could no longer perform report luns with IBM
> DS8000, the following workaround was implemented:
> http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=01b291bd66564b4bd826326af6bd0b6d17e99439
> [SCSI] fix check of PQ and PDT bits for WLUNs
> in kernel 2.6.27.
> However, this only works for WLUNs reporting PDT=31 and PQ=1 but not for
> LUN0.
> What made it worse is that, the attached LUN looks perfect from a zfcp
> status point of view but is still missing any user visible handle for
> the scsi midlayer, so I was puzzled as a user despite the otherwise
> clear scsi log message:
> "scsi scan: peripheral device type of 31, ***no device added***".
>
> On 03/15/2013 10:46 AM, Hannes Reinecke wrote:
>> SAM advertises the use of a Well-known LUN (W_LUN) for scanning.
>> As this avoids exposing LUN 0 (which might be a valid LUN) for
>> all initiators it is the preferred method for LUN scanning on
>> some arrays.
>> So we should be using W_LUN for scanning, too. If the W_LUN is
>> not supported we'll fall back to use LUN 0.
>> For broken W_LUN implementations a new blacklist flag
>> 'BLIST_NO_WLUN' is added.
>>
>> Signed-off-by: Hannes Reinecke <hare@suse.de>
>>
>> diff --git a/drivers/scsi/scsi_scan.c b/drivers/scsi/scsi_scan.c
>> index 3e58b22..f4ccdea 100644
>> --- a/drivers/scsi/scsi_scan.c
>> +++ b/drivers/scsi/scsi_scan.c
>> @@ -1312,6 +1312,7 @@ static int scsi_report_lun_scan(struct
>> scsi_target *starget, int bflags,
>>       unsigned int num_luns;
>>       unsigned int retries;
>>       int result;
>> +    int w_lun = SCSI_W_LUN_REPORT_LUNS;

Meanwhile I realized why I sometimes had to think twice to understand 
below code: The naming of the identifier "w_lun" is misleading to me 
because it seems to imply that the variable (always) contains the value 
of the WLUN while in fact it could also contain LUN0. Maybe the variable 
should just be called "lun"?

>>       struct scsi_lun *lunp, *lun_data;
>>       u8 *data;
>>       struct scsi_sense_hdr sshdr;
>> @@ -1337,11 +1338,20 @@ static int scsi_report_lun_scan(struct
>> scsi_target *starget, int bflags,
>>           return 0;
>>       if (starget->no_report_luns)
>>           return 1;
>> +    if (bflags & BLIST_NO_WLUN)
>> +        w_lun = 0;
>>
>> +retry_report_lun_scan:
>>       if (!(sdev = scsi_device_lookup_by_target(starget, 0))) {
>> -        sdev = scsi_alloc_sdev(starget, 0, NULL);
>> -        if (!sdev)
>> -            return 0;
>> +        sdev = scsi_alloc_sdev(starget, w_lun, NULL);
>> +        if (!sdev) {
>> +            if (w_lun != 0) {
>> +                w_lun = 0;
>> +                sdev = scsi_alloc_sdev(starget, w_lun, NULL);
>> +            }
>
>> +            if (!sdev)
>> +                return 0;
>
> Semantically this belongs to the 2nd scsi_alloc_sdev(starget, w_lun,
> NULL) i.e. inside (at the end of) the body of if (w_lun != 0).
> Then you can even merge the outer and inner if statement to
> if (!sdev && wlun != 0)
> Semantics: WLUN did not work and we haven't already tried LUN0.
> Maybe it's just me but I found it more difficult to read if the 2nd
> check is on its own inside the outer if statement with exactly the same
> boolean expression.
>
>
>> +        }
>>           if (scsi_device_get(sdev)) {
>>               __scsi_remove_device(sdev);
>>               return 0;
>
> Now that we not only use LUN0, it would be nice to reflect the currently
> used LUN in the corresponding scsi logging statements for debugging and
> problem determination purposes.
> It seems as if devname only contains HCT but not L, which can now be
> either WLUN or LUN0, e.g. here:
>
>>     for (retries = 0; retries < 3; retries++) {
>>         SCSI_LOG_SCAN_BUS(3, printk (KERN_INFO "scsi scan: Sending"
>>                 " REPORT LUNS to %s (try %d)\n", devname,
>>                 retries));
>>
>>         result = scsi_execute_req(sdev, scsi_cmd, DMA_FROM_DEVICE,
>>                       lun_data, length, &sshdr,
>>                       SCSI_TIMEOUT + 4 * HZ, 3, NULL);
>>
>>         SCSI_LOG_SCAN_BUS(3, printk (KERN_INFO "scsi scan: REPORT LUNS"
>>                 " %s (try %d) result 0x%x\n", result
>>                 ?  "failed" : "successful", retries, result));
>
>> @@ -1418,6 +1428,18 @@ static int scsi_report_lun_scan(struct
>> scsi_target *starget, int bflags,
>>       }
>>
>>       if (result) {
>> +        if (w_lun != 0 && scsi_device_created(sdev)) {
>
> Did you mean _!_scsi_device_created(sdev), i.e. we tried WLUN but could
> not attach it?
>
>> +            /*
>> +             * W_LUN probably not supported, try with LUN 0
>> +             */
>
> Does this only check for an sg dev having been created or does it check
> for the response to the report_luns cmnd? It should be the latter since
> there are targets allowing attachment of WLUN (with PQ=1 && PDT=31) but
> refuse to answer report_luns, e.g. IBM SVC / Storwize V7000.
> Remember your changes to zfcp_san_disc reacting on sg_luns return code?
> [Novell bug 742352]
> It would be nice if we could avoid start using BLIST_NO_WLUN right away.
> ;-)
>
>> +            SCSI_LOG_SCAN_BUS(3, printk (KERN_INFO "scsi scan:"
>> +                    "W_LUN not supported, try LUN 0\n"));
>> +            kfree(lun_data);
>> +            scsi_device_put(sdev);
>> +            __scsi_remove_device(sdev);
>> +            w_lun = 0;
>> +            goto retry_report_lun_scan;
>> +        }
>>           /*
>>            * The device probably does not support a REPORT LUN command
>>            */
>> diff --git a/include/scsi/scsi_devinfo.h b/include/scsi/scsi_devinfo.h
>> index cc1f3e7..ffb42b1 100644
>> --- a/include/scsi/scsi_devinfo.h
>> +++ b/include/scsi/scsi_devinfo.h
>> @@ -31,4 +31,5 @@
>>   #define BLIST_MAX_512        0x800000 /* maximum 512 sector cdb
>> length */
>>   #define BLIST_ATTACH_PQ3    0x1000000 /* Scan: Attach to PQ3 devices */
>>   #define BLIST_NO_DIF        0x2000000 /* Disable T10 PI (DIF) */
>> +#define BLIST_NO_WLUN        0x4000000 /* Disable W_LUN scanning */
>>   #endif
>>
>
> The overall idea seems reasonable to me.

Steffen

Linux on System z Development

IBM Deutschland Research & Development GmbH
Vorsitzende des Aufsichtsrats: Martina Koederitz
Geschäftsführung: Dirk Wittkopp
Sitz der Gesellschaft: Böblingen
Registergericht: Amtsgericht Stuttgart, HRB 243294

--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH][RFC] scsi: Use W_LUN for scanning
  2013-03-15 15:54 ` Steffen Maier
  2013-03-17 21:50   ` Steffen Maier
@ 2013-03-18 15:25   ` Hannes Reinecke
  1 sibling, 0 replies; 12+ messages in thread
From: Hannes Reinecke @ 2013-03-18 15:25 UTC (permalink / raw)
  To: Steffen Maier; +Cc: linux-scsi, James Bottomley, George Martin, Mike Christie

On 03/15/2013 04:54 PM, Steffen Maier wrote:
> While we're at it: I recently figured that there are targets
> responding to inquiry with PQ=1 && PDT=31 for LUN0 if LUN0 has no
> backing device (e.g. no disk mapped for the initiator host). While
> this is likely to work with in-kernel lun scanning, the kernel does
> not even allocate an sg dev in this case.
Yes.

> If the target (such do exist) now also does not support report_luns
> on WLUN, this effectively prevents any user space lun discovery.
> E.g. non-NPIV zfcp-attached LUNs cannot be discovered, because we
> cannot do in-kernel scanning due to the lack of initiator lun masking.
>
Yes, I know; the NetApp case.
But you can always initiate a user-space discovery by

echo '- - -' > /sys/class/scsi_host/hostX/scan

> The reason for Linux ignoring LUNs with PDT=31 and PQ=1 is:
> http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=84961f28e9d13a4b193d0c8545f3c060c1890ff3
>
> [SCSI] Don't add scsi_device for devices that return PQ=1, PDT=0x1f
> from kernel 2.6.19.
> Since Linux on System z could no longer perform report luns with IBM
> DS8000, the following workaround was implemented:
> http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=01b291bd66564b4bd826326af6bd0b6d17e99439
>
> [SCSI] fix check of PQ and PDT bits for WLUNs
> in kernel 2.6.27.
> However, this only works for WLUNs reporting PDT=31 and PQ=1 but not
> for LUN0.
> What made it worse is that, the attached LUN looks perfect from a
> zfcp status point of view but is still missing any user visible
> handle for the scsi midlayer, so I was puzzled as a user despite the
> otherwise clear scsi log message:
> "scsi scan: peripheral device type of 31, ***no device added***".
>

Hmm. You sure this is still valid today?
Assuming the device supports REPORT LUNS we're having this:

	res = scsi_probe_and_add_lun(starget, 0, &bflags, NULL, rescan, NULL);
	if (res == SCSI_SCAN_LUN_PRESENT || res == SCSI_SCAN_TARGET_PRESENT) {
		if (scsi_report_lun_scan(starget, bflags, rescan) != 0)
			/*
			 * The REPORT LUN did not scan the target,
			 * do a sequential scan.
			 */
			scsi_sequential_lun_scan(starget, bflags,
						 starget->scsi_level, rescan);
	}

  out_reap:
	scsi_autopm_put_target(starget);
	/* now determine if the target has any children at all
	 * and if not, nuke it */
	scsi_target_reap(starget);

So we don't actually need an sdev to invoke a report lun scan;
all we require is a correct return code from scsi_probe_and_add_lun().
Which we'll get irrespective of the PQ setting.
So from what I'm seeing this case should be covered.
Unless I'm missing something ...

> On 03/15/2013 10:46 AM, Hannes Reinecke wrote:
>> SAM advertises the use of a Well-known LUN (W_LUN) for scanning.
>> As this avoids exposing LUN 0 (which might be a valid LUN) for
>> all initiators it is the preferred method for LUN scanning on
>> some arrays.
>> So we should be using W_LUN for scanning, too. If the W_LUN is
>> not supported we'll fall back to use LUN 0.
>> For broken W_LUN implementations a new blacklist flag
>> 'BLIST_NO_WLUN' is added.
>>
>> Signed-off-by: Hannes Reinecke <hare@suse.de>
>>
>> diff --git a/drivers/scsi/scsi_scan.c b/drivers/scsi/scsi_scan.c
>> index 3e58b22..f4ccdea 100644
>> --- a/drivers/scsi/scsi_scan.c
>> +++ b/drivers/scsi/scsi_scan.c
>> @@ -1312,6 +1312,7 @@ static int scsi_report_lun_scan(struct
>> scsi_target *starget, int bflags,
>>       unsigned int num_luns;
>>       unsigned int retries;
>>       int result;
>> +    int w_lun = SCSI_W_LUN_REPORT_LUNS;
>>       struct scsi_lun *lunp, *lun_data;
>>       u8 *data;
>>       struct scsi_sense_hdr sshdr;
>> @@ -1337,11 +1338,20 @@ static int scsi_report_lun_scan(struct
>> scsi_target *starget, int bflags,
>>           return 0;
>>       if (starget->no_report_luns)
>>           return 1;
>> +    if (bflags & BLIST_NO_WLUN)
>> +        w_lun = 0;
>>
>> +retry_report_lun_scan:
>>       if (!(sdev = scsi_device_lookup_by_target(starget, 0))) {
>> -        sdev = scsi_alloc_sdev(starget, 0, NULL);
>> -        if (!sdev)
>> -            return 0;
>> +        sdev = scsi_alloc_sdev(starget, w_lun, NULL);
>> +        if (!sdev) {
>> +            if (w_lun != 0) {
>> +                w_lun = 0;
>> +                sdev = scsi_alloc_sdev(starget, w_lun, NULL);
>> +            }
>
>> +            if (!sdev)
>> +                return 0;
>
> Semantically this belongs to the 2nd scsi_alloc_sdev(starget, w_lun,
> NULL) i.e. inside (at the end of) the body of if (w_lun != 0).
> Then you can even merge the outer and inner if statement to
> if (!sdev && wlun != 0)
> Semantics: WLUN did not work and we haven't already tried LUN0.
> Maybe it's just me but I found it more difficult to read if the 2nd
> check is on its own inside the outer if statement with exactly the
> same boolean expression.
>
>
>> +        }
>>           if (scsi_device_get(sdev)) {
>>               __scsi_remove_device(sdev);
>>               return 0;
>
> Now that we not only use LUN0, it would be nice to reflect the
> currently used LUN in the corresponding scsi logging statements for
> debugging and problem determination purposes.
> It seems as if devname only contains HCT but not L, which can now be
> either WLUN or LUN0, e.g. here:
>
>>     for (retries = 0; retries < 3; retries++) {
>>         SCSI_LOG_SCAN_BUS(3, printk (KERN_INFO "scsi scan: Sending"
>>                 " REPORT LUNS to %s (try %d)\n", devname,
>>                 retries));
>>
>>         result = scsi_execute_req(sdev, scsi_cmd, DMA_FROM_DEVICE,
>>                       lun_data, length, &sshdr,
>>                       SCSI_TIMEOUT + 4 * HZ, 3, NULL);
>>
>>         SCSI_LOG_SCAN_BUS(3, printk (KERN_INFO "scsi scan: REPORT
>> LUNS"
>>                 " %s (try %d) result 0x%x\n", result
>>                 ?  "failed" : "successful", retries, result));
>
>> @@ -1418,6 +1428,18 @@ static int scsi_report_lun_scan(struct
>> scsi_target *starget, int bflags,
>>       }
>>
>>       if (result) {
>> +        if (w_lun != 0 && scsi_device_created(sdev)) {
>
> Did you mean _!_scsi_device_created(sdev), i.e. we tried WLUN but
> could not attach it?
>
>> +            /*
>> +             * W_LUN probably not supported, try with LUN 0
>> +             */
>
> Does this only check for an sg dev having been created or does it
> check for the response to the report_luns cmnd? It should be the
> latter since there are targets allowing attachment of WLUN (with
> PQ=1 && PDT=31) but refuse to answer report_luns, e.g. IBM SVC /
> Storwize V7000.

Hmm?
Targets supporting WLUN but refuse to answer report_luns on them?
This is the 'REPORT_LUNs WLUN' we're talking to.
The _sole_ purpose of which is to _answer_ to REPORT_LUNS.

Not answering to REPORT LUNS seems to be a violation of
SPC, where support for REPORT LUNs is mandatory.

But in either case, that's why I did the blacklist flag.
We would need to add those devices to it.

Anyway, you are correct in that we need to check the response
from report_luns. I'll cross check here.

> Remember your changes to zfcp_san_disc reacting on sg_luns return
> code? [Novell bug 742352]
> It would be nice if we could avoid start using BLIST_NO_WLUN right
> away. ;-)
>
Oh, that was the idea.

>> +            SCSI_LOG_SCAN_BUS(3, printk (KERN_INFO "scsi scan:"
>> +                    "W_LUN not supported, try LUN 0\n"));
>> +            kfree(lun_data);
>> +            scsi_device_put(sdev);
>> +            __scsi_remove_device(sdev);
>> +            w_lun = 0;
>> +            goto retry_report_lun_scan;
>> +        }
>>           /*
>>            * The device probably does not support a REPORT LUN
>> command
>>            */
>> diff --git a/include/scsi/scsi_devinfo.h
>> b/include/scsi/scsi_devinfo.h
>> index cc1f3e7..ffb42b1 100644
>> --- a/include/scsi/scsi_devinfo.h
>> +++ b/include/scsi/scsi_devinfo.h
>> @@ -31,4 +31,5 @@
>>   #define BLIST_MAX_512        0x800000 /* maximum 512 sector cdb
>> length */
>>   #define BLIST_ATTACH_PQ3    0x1000000 /* Scan: Attach to PQ3
>> devices */
>>   #define BLIST_NO_DIF        0x2000000 /* Disable T10 PI (DIF) */
>> +#define BLIST_NO_WLUN        0x4000000 /* Disable W_LUN scanning */
>>   #endif
>>
>
> The overall idea seems reasonable to me.
>
See?

I'll be updating the patch.

Cheers,

Hannes]
-- 
Dr. Hannes Reinecke		      zSeries & Storage
hare@suse.de			      +49 911 74053 688
SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: J. Hawn, J. Guild, F. Imendörffer, HRB 16746 (AG Nürnberg)
--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH][RFC] scsi: Use W_LUN for scanning
  2013-03-15  9:46 [PATCH][RFC] scsi: Use W_LUN for scanning Hannes Reinecke
  2013-03-15 15:54 ` Steffen Maier
  2013-03-15 21:22 ` Douglas Gilbert
@ 2013-04-06  9:08 ` James Bottomley
  2013-04-07 13:31   ` Hannes Reinecke
  2 siblings, 1 reply; 12+ messages in thread
From: James Bottomley @ 2013-04-06  9:08 UTC (permalink / raw)
  To: Hannes Reinecke; +Cc: linux-scsi, George Martin, Steffen Maier, Mike Christie

On Fri, 2013-03-15 at 10:46 +0100, Hannes Reinecke wrote:
> SAM advertises the use of a Well-known LUN (W_LUN) for scanning.
> As this avoids exposing LUN 0 (which might be a valid LUN) for
> all initiators it is the preferred method for LUN scanning on
> some arrays.
> So we should be using W_LUN for scanning, too. If the W_LUN is
> not supported we'll fall back to use LUN 0.
> For broken W_LUN implementations a new blacklist flag
> 'BLIST_NO_WLUN' is added.

Well, we could do this, but I don't really see the point.  By the time
we get into the report lun code, we've already probed LUN 0, so it's as
good as any for a REPORT LUN scan.

What worries me slightly about the W-LUN is that for the first time
we'll  be assuming a device supports a particular address method
(Extended Logical Unit addressing) rather than treating LUNs as opaque
handles we keep and pass back to the target.  I appreciate you now have
a blacklist for failures, but if we didn't use W-LUNs we wouldn't need
that blacklist.

So could you answer two questions clearly:

     1. What does this buy us over the current LUN0 method?  I don't see
        LUN0 might be a valid LUN being a convincing reason.
     2. What devices have you actually tested this on?

James

> Signed-off-by: Hannes Reinecke <hare@suse.de>
> 
> diff --git a/drivers/scsi/scsi_scan.c b/drivers/scsi/scsi_scan.c
> index 3e58b22..f4ccdea 100644
> --- a/drivers/scsi/scsi_scan.c
> +++ b/drivers/scsi/scsi_scan.c
> @@ -1312,6 +1312,7 @@ static int scsi_report_lun_scan(struct scsi_target *starget, int bflags,
>  	unsigned int num_luns;
>  	unsigned int retries;
>  	int result;
> +	int w_lun = SCSI_W_LUN_REPORT_LUNS;
>  	struct scsi_lun *lunp, *lun_data;
>  	u8 *data;
>  	struct scsi_sense_hdr sshdr;
> @@ -1337,11 +1338,20 @@ static int scsi_report_lun_scan(struct scsi_target *starget, int bflags,
>  		return 0;
>  	if (starget->no_report_luns)
>  		return 1;
> +	if (bflags & BLIST_NO_WLUN)
> +		w_lun = 0;
>  
> +retry_report_lun_scan:
>  	if (!(sdev = scsi_device_lookup_by_target(starget, 0))) {
> -		sdev = scsi_alloc_sdev(starget, 0, NULL);
> -		if (!sdev)
> -			return 0;
> +		sdev = scsi_alloc_sdev(starget, w_lun, NULL);
> +		if (!sdev) {
> +			if (w_lun != 0) {
> +				w_lun = 0;
> +				sdev = scsi_alloc_sdev(starget, w_lun, NULL);
> +			}
> +			if (!sdev)
> +				return 0;
> +		}
>  		if (scsi_device_get(sdev)) {
>  			__scsi_remove_device(sdev);
>  			return 0;
> @@ -1418,6 +1428,18 @@ static int scsi_report_lun_scan(struct scsi_target *starget, int bflags,
>  	}
>  
>  	if (result) {
> +		if (w_lun != 0 && scsi_device_created(sdev)) {
> +			/*
> +			 * W_LUN probably not supported, try with LUN 0
> +			 */
> +			SCSI_LOG_SCAN_BUS(3, printk (KERN_INFO "scsi scan:"
> +					"W_LUN not supported, try LUN 0\n"));
> +			kfree(lun_data);
> +			scsi_device_put(sdev);
> +			__scsi_remove_device(sdev);
> +			w_lun = 0;
> +			goto retry_report_lun_scan;
> +		}
>  		/*
>  		 * The device probably does not support a REPORT LUN command
>  		 */
> diff --git a/include/scsi/scsi_devinfo.h b/include/scsi/scsi_devinfo.h
> index cc1f3e7..ffb42b1 100644
> --- a/include/scsi/scsi_devinfo.h
> +++ b/include/scsi/scsi_devinfo.h
> @@ -31,4 +31,5 @@
>  #define BLIST_MAX_512		0x800000 /* maximum 512 sector cdb length */
>  #define BLIST_ATTACH_PQ3	0x1000000 /* Scan: Attach to PQ3 devices */
>  #define BLIST_NO_DIF		0x2000000 /* Disable T10 PI (DIF) */
> +#define BLIST_NO_WLUN		0x4000000 /* Disable W_LUN scanning */
>  #endif
> --
> To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html


^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH][RFC] scsi: Use W_LUN for scanning
  2013-04-06  9:08 ` James Bottomley
@ 2013-04-07 13:31   ` Hannes Reinecke
  2013-04-07 14:49     ` James Bottomley
  0 siblings, 1 reply; 12+ messages in thread
From: Hannes Reinecke @ 2013-04-07 13:31 UTC (permalink / raw)
  To: James Bottomley; +Cc: linux-scsi, George Martin, Steffen Maier, Mike Christie

On 04/06/2013 11:08 AM, James Bottomley wrote:
> On Fri, 2013-03-15 at 10:46 +0100, Hannes Reinecke wrote:
>> SAM advertises the use of a Well-known LUN (W_LUN) for scanning.
>> As this avoids exposing LUN 0 (which might be a valid LUN) for
>> all initiators it is the preferred method for LUN scanning on
>> some arrays.
>> So we should be using W_LUN for scanning, too. If the W_LUN is
>> not supported we'll fall back to use LUN 0.
>> For broken W_LUN implementations a new blacklist flag
>> 'BLIST_NO_WLUN' is added.
>
> Well, we could do this, but I don't really see the point.  By the time
> we get into the report lun code, we've already probed LUN 0, so it's as
> goeod as any for a REPORT LUN scan.
>
Did we? I thought I had avoided that and directly went for probing
W_LUN _first_.
Will be cross-checking.

> What worries me slightly about the W-LUN is that for the first time
> we'll  be assuming a device supports a particular address method
> (Extended Logical Unit addressing) rather than treating LUNs as opaque
> handles we keep and pass back to the target.  I appreciate you now have
> a blacklist for failures, but if we didn't use W-LUNs we wouldn't need
> that blacklist.
>
> So could you answer two questions clearly:
>
>       1. What does this buy us over the current LUN0 method?  I don't see
>          LUN0 might be a valid LUN being a convincing reason.

LUN masking.
Some HBAs / virtualised devices use LUN masking to forward LUNs to the 
virtual machines.
So for LUN0 you have the choice of exposing it to every virtual machine,
meaning you cannot assign a device to LUN0, or have LUN0 as a no-device
LUN which then can be exposed to every virtual machine.

At which point you run into hardware limitations, as not every storage 
array allow for the first option.
And not every LUN masking implementation allows you to expose a single 
LUN to several virtual machines. So you might be screwed either way.

This was the main reason why zfcp could not use the standard LUN 
scanning method like every other HBA LLDD and had to resort to manual 
LUN activation.

>       2. What devices have you actually tested this on?
>
Netapp FAS, HP EVA, HP P2000 / MSA, EMC Clariion.

But as mentioned, I'll be rechecking the patch.
We should _not_ try to probe LUN0 first, but rather send REPORT_LUNS to 
the W_LUN directly. If it responds, good. If not, we'll fall back to LUN0.

Cheers,

Hannes


^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH][RFC] scsi: Use W_LUN for scanning
  2013-04-07 13:31   ` Hannes Reinecke
@ 2013-04-07 14:49     ` James Bottomley
  2013-04-07 15:59       ` Douglas Gilbert
  0 siblings, 1 reply; 12+ messages in thread
From: James Bottomley @ 2013-04-07 14:49 UTC (permalink / raw)
  To: Hannes Reinecke; +Cc: linux-scsi, George Martin, Steffen Maier, Mike Christie

On Sun, 2013-04-07 at 15:31 +0200, Hannes Reinecke wrote:
> On 04/06/2013 11:08 AM, James Bottomley wrote:
> > On Fri, 2013-03-15 at 10:46 +0100, Hannes Reinecke wrote:
> >> SAM advertises the use of a Well-known LUN (W_LUN) for scanning.
> >> As this avoids exposing LUN 0 (which might be a valid LUN) for
> >> all initiators it is the preferred method for LUN scanning on
> >> some arrays.
> >> So we should be using W_LUN for scanning, too. If the W_LUN is
> >> not supported we'll fall back to use LUN 0.
> >> For broken W_LUN implementations a new blacklist flag
> >> 'BLIST_NO_WLUN' is added.
> >
> > Well, we could do this, but I don't really see the point.  By the time
> > we get into the report lun code, we've already probed LUN 0, so it's as
> > goeod as any for a REPORT LUN scan.
> >
> Did we? I thought I had avoided that and directly went for probing
> W_LUN _first_.
> Will be cross-checking.
> 
> > What worries me slightly about the W-LUN is that for the first time
> > we'll  be assuming a device supports a particular address method
> > (Extended Logical Unit addressing) rather than treating LUNs as opaque
> > handles we keep and pass back to the target.  I appreciate you now have
> > a blacklist for failures, but if we didn't use W-LUNs we wouldn't need
> > that blacklist.
> >
> > So could you answer two questions clearly:
> >
> >       1. What does this buy us over the current LUN0 method?  I don't see
> >          LUN0 might be a valid LUN being a convincing reason.
> 
> LUN masking.
> Some HBAs / virtualised devices use LUN masking to forward LUNs to the 
> virtual machines.
> So for LUN0 you have the choice of exposing it to every virtual machine,
> meaning you cannot assign a device to LUN0, or have LUN0 as a no-device
> LUN which then can be exposed to every virtual machine.

That shouldn't matter, should it?  The spec says that even a masked LUN
must respond to an inquiry (with PQ indicating appropriate
inaccessibility).

> At which point you run into hardware limitations, as not every storage 
> array allow for the first option.
> And not every LUN masking implementation allows you to expose a single 
> LUN to several virtual machines. So you might be screwed either way.
> 
> This was the main reason why zfcp could not use the standard LUN 
> scanning method like every other HBA LLDD and had to resort to manual 
> LUN activation.

So this is an out of spec implementation of LUN masking ... as in it
doesn't respond correctly to an INQUIRY?

> >       2. What devices have you actually tested this on?
> >
> Netapp FAS, HP EVA, HP P2000 / MSA, EMC Clariion.
> 
> But as mentioned, I'll be rechecking the patch.
> We should _not_ try to probe LUN0 first, but rather send REPORT_LUNS to 
> the W_LUN directly. If it responds, good. If not, we'll fall back to LUN0.

I don't think we can ever do that ... what about SCSI 2 devices that
don't support REPORT LUNS or USB devices that will crash on it?  We
might be able to try a host type whitelist, where if we were a USB or
traditional bus host (SPI) we never try this, but if we're a modern one
(SAS, FC) we do.

James


^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH][RFC] scsi: Use W_LUN for scanning
  2013-04-07 14:49     ` James Bottomley
@ 2013-04-07 15:59       ` Douglas Gilbert
  2013-04-07 16:15         ` James Bottomley
  0 siblings, 1 reply; 12+ messages in thread
From: Douglas Gilbert @ 2013-04-07 15:59 UTC (permalink / raw)
  To: James Bottomley
  Cc: Hannes Reinecke, linux-scsi, George Martin, Steffen Maier, Mike Christie

On 13-04-07 10:49 AM, James Bottomley wrote:
> On Sun, 2013-04-07 at 15:31 +0200, Hannes Reinecke wrote:
>> On 04/06/2013 11:08 AM, James Bottomley wrote:
>>> On Fri, 2013-03-15 at 10:46 +0100, Hannes Reinecke wrote:
>>>> SAM advertises the use of a Well-known LUN (W_LUN) for scanning.
>>>> As this avoids exposing LUN 0 (which might be a valid LUN) for
>>>> all initiators it is the preferred method for LUN scanning on
>>>> some arrays.
>>>> So we should be using W_LUN for scanning, too. If the W_LUN is
>>>> not supported we'll fall back to use LUN 0.
>>>> For broken W_LUN implementations a new blacklist flag
>>>> 'BLIST_NO_WLUN' is added.
>>>
>>> Well, we could do this, but I don't really see the point.  By the time
>>> we get into the report lun code, we've already probed LUN 0, so it's as
>>> goeod as any for a REPORT LUN scan.
>>>
>> Did we? I thought I had avoided that and directly went for probing
>> W_LUN _first_.
>> Will be cross-checking.
>>
>>> What worries me slightly about the W-LUN is that for the first time
>>> we'll  be assuming a device supports a particular address method
>>> (Extended Logical Unit addressing) rather than treating LUNs as opaque
>>> handles we keep and pass back to the target.  I appreciate you now have
>>> a blacklist for failures, but if we didn't use W-LUNs we wouldn't need
>>> that blacklist.
>>>
>>> So could you answer two questions clearly:
>>>
>>>        1. What does this buy us over the current LUN0 method?  I don't see
>>>           LUN0 might be a valid LUN being a convincing reason.
>>
>> LUN masking.
>> Some HBAs / virtualised devices use LUN masking to forward LUNs to the
>> virtual machines.
>> So for LUN0 you have the choice of exposing it to every virtual machine,
>> meaning you cannot assign a device to LUN0, or have LUN0 as a no-device
>> LUN which then can be exposed to every virtual machine.
>
> That shouldn't matter, should it?  The spec says that even a masked LUN
> must respond to an inquiry (with PQ indicating appropriate
> inaccessibility).

Which spec? I haven't seen a mention of LUN masking
in any SCSI spec and I just rechecked SAM-2,3,4 and 5.
Looked at FCP-4 as well.

>> At which point you run into hardware limitations, as not every storage
>> array allow for the first option.
>> And not every LUN masking implementation allows you to expose a single
>> LUN to several virtual machines. So you might be screwed either way.
>>
>> This was the main reason why zfcp could not use the standard LUN
>> scanning method like every other HBA LLDD and had to resort to manual
>> LUN activation.
>
> So this is an out of spec implementation of LUN masking ... as in it
> doesn't respond correctly to an INQUIRY?

No specs apply that I can see.

>>>        2. What devices have you actually tested this on?
>>>
>> Netapp FAS, HP EVA, HP P2000 / MSA, EMC Clariion.
>>
>> But as mentioned, I'll be rechecking the patch.
>> We should _not_ try to probe LUN0 first, but rather send REPORT_LUNS to
>> the W_LUN directly. If it responds, good. If not, we'll fall back to LUN0.
>
> I don't think we can ever do that ... what about SCSI 2 devices that
> don't support REPORT LUNS or USB devices that will crash on it?  We
> might be able to try a host type whitelist, where if we were a USB or
> traditional bus host (SPI) we never try this, but if we're a modern one
> (SAS, FC) we do.

The VERSION field (byte 2) of an INQUIRY response is always
available, even on USB storage devices which usually claim
SCSI-2 compliance:
    2 == (rsp_buff[2] & 0x7)

No need to try REPORT LUNS on such devices.


Are there any SCSI-1 devices still out there?

Doug Gilbert


^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH][RFC] scsi: Use W_LUN for scanning
  2013-04-07 15:59       ` Douglas Gilbert
@ 2013-04-07 16:15         ` James Bottomley
  2013-04-07 16:34           ` Douglas Gilbert
  0 siblings, 1 reply; 12+ messages in thread
From: James Bottomley @ 2013-04-07 16:15 UTC (permalink / raw)
  To: dgilbert
  Cc: Hannes Reinecke, linux-scsi, George Martin, Steffen Maier, Mike Christie

On Sun, 2013-04-07 at 11:59 -0400, Douglas Gilbert wrote:
> On 13-04-07 10:49 AM, James Bottomley wrote:
> > On Sun, 2013-04-07 at 15:31 +0200, Hannes Reinecke wrote:
> >> On 04/06/2013 11:08 AM, James Bottomley wrote:
> >>> On Fri, 2013-03-15 at 10:46 +0100, Hannes Reinecke wrote:
> >>>> SAM advertises the use of a Well-known LUN (W_LUN) for scanning.
> >>>> As this avoids exposing LUN 0 (which might be a valid LUN) for
> >>>> all initiators it is the preferred method for LUN scanning on
> >>>> some arrays.
> >>>> So we should be using W_LUN for scanning, too. If the W_LUN is
> >>>> not supported we'll fall back to use LUN 0.
> >>>> For broken W_LUN implementations a new blacklist flag
> >>>> 'BLIST_NO_WLUN' is added.
> >>>
> >>> Well, we could do this, but I don't really see the point.  By the time
> >>> we get into the report lun code, we've already probed LUN 0, so it's as
> >>> goeod as any for a REPORT LUN scan.
> >>>
> >> Did we? I thought I had avoided that and directly went for probing
> >> W_LUN _first_.
> >> Will be cross-checking.
> >>
> >>> What worries me slightly about the W-LUN is that for the first time
> >>> we'll  be assuming a device supports a particular address method
> >>> (Extended Logical Unit addressing) rather than treating LUNs as opaque
> >>> handles we keep and pass back to the target.  I appreciate you now have
> >>> a blacklist for failures, but if we didn't use W-LUNs we wouldn't need
> >>> that blacklist.
> >>>
> >>> So could you answer two questions clearly:
> >>>
> >>>        1. What does this buy us over the current LUN0 method?  I don't see
> >>>           LUN0 might be a valid LUN being a convincing reason.
> >>
> >> LUN masking.
> >> Some HBAs / virtualised devices use LUN masking to forward LUNs to the
> >> virtual machines.
> >> So for LUN0 you have the choice of exposing it to every virtual machine,
> >> meaning you cannot assign a device to LUN0, or have LUN0 as a no-device
> >> LUN which then can be exposed to every virtual machine.
> >
> > That shouldn't matter, should it?  The spec says that even a masked LUN
> > must respond to an inquiry (with PQ indicating appropriate
> > inaccessibility).
> 
> Which spec? I haven't seen a mention of LUN masking
> in any SCSI spec and I just rechecked SAM-2,3,4 and 5.
> Looked at FCP-4 as well.

It's not called LUN Masking; it's called access control, although the
ACLs are referred to as LUN masks.

> >> At which point you run into hardware limitations, as not every storage
> >> array allow for the first option.
> >> And not every LUN masking implementation allows you to expose a single
> >> LUN to several virtual machines. So you might be screwed either way.
> >>
> >> This was the main reason why zfcp could not use the standard LUN
> >> scanning method like every other HBA LLDD and had to resort to manual
> >> LUN activation.
> >
> > So this is an out of spec implementation of LUN masking ... as in it
> > doesn't respond correctly to an INQUIRY?
> 
> No specs apply that I can see.

SPC-3 Section 8.3 "Access Controls"

> >>>        2. What devices have you actually tested this on?
> >>>
> >> Netapp FAS, HP EVA, HP P2000 / MSA, EMC Clariion.
> >>
> >> But as mentioned, I'll be rechecking the patch.
> >> We should _not_ try to probe LUN0 first, but rather send REPORT_LUNS to
> >> the W_LUN directly. If it responds, good. If not, we'll fall back to LUN0.
> >
> > I don't think we can ever do that ... what about SCSI 2 devices that
> > don't support REPORT LUNS or USB devices that will crash on it?  We
> > might be able to try a host type whitelist, where if we were a USB or
> > traditional bus host (SPI) we never try this, but if we're a modern one
> > (SAS, FC) we do.
> 
> The VERSION field (byte 2) of an INQUIRY response is always
> available, even on USB storage devices which usually claim
> SCSI-2 compliance:
>     2 == (rsp_buff[2] & 0x7)

So this is the chicken and egg problem: if we haven't probed the target
at all, how do we know it's SCSI-2?  If we do an initial probe, we have
to do it as an INQUIRY to LUN0 and we end up in the same situation we
are now.  That's why I suggested a host bus type parameter if we want to
do REPORT LUNS probes.

> No need to try REPORT LUNS on such devices.
> 
> 
> Are there any SCSI-1 devices still out there?

I don't have any (last one burned out a while ago).

James


^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH][RFC] scsi: Use W_LUN for scanning
  2013-04-07 16:15         ` James Bottomley
@ 2013-04-07 16:34           ` Douglas Gilbert
  2013-04-07 17:37             ` James Bottomley
  0 siblings, 1 reply; 12+ messages in thread
From: Douglas Gilbert @ 2013-04-07 16:34 UTC (permalink / raw)
  To: James Bottomley
  Cc: Hannes Reinecke, linux-scsi, George Martin, Steffen Maier, Mike Christie

On 13-04-07 12:15 PM, James Bottomley wrote:
> On Sun, 2013-04-07 at 11:59 -0400, Douglas Gilbert wrote:
>> On 13-04-07 10:49 AM, James Bottomley wrote:
>>> On Sun, 2013-04-07 at 15:31 +0200, Hannes Reinecke wrote:
>>>> On 04/06/2013 11:08 AM, James Bottomley wrote:
>>>>> On Fri, 2013-03-15 at 10:46 +0100, Hannes Reinecke wrote:
>>>>>> SAM advertises the use of a Well-known LUN (W_LUN) for scanning.
>>>>>> As this avoids exposing LUN 0 (which might be a valid LUN) for
>>>>>> all initiators it is the preferred method for LUN scanning on
>>>>>> some arrays.
>>>>>> So we should be using W_LUN for scanning, too. If the W_LUN is
>>>>>> not supported we'll fall back to use LUN 0.
>>>>>> For broken W_LUN implementations a new blacklist flag
>>>>>> 'BLIST_NO_WLUN' is added.
>>>>>
>>>>> Well, we could do this, but I don't really see the point.  By the time
>>>>> we get into the report lun code, we've already probed LUN 0, so it's as
>>>>> goeod as any for a REPORT LUN scan.
>>>>>
>>>> Did we? I thought I had avoided that and directly went for probing
>>>> W_LUN _first_.
>>>> Will be cross-checking.
>>>>
>>>>> What worries me slightly about the W-LUN is that for the first time
>>>>> we'll  be assuming a device supports a particular address method
>>>>> (Extended Logical Unit addressing) rather than treating LUNs as opaque
>>>>> handles we keep and pass back to the target.  I appreciate you now have
>>>>> a blacklist for failures, but if we didn't use W-LUNs we wouldn't need
>>>>> that blacklist.
>>>>>
>>>>> So could you answer two questions clearly:
>>>>>
>>>>>         1. What does this buy us over the current LUN0 method?  I don't see
>>>>>            LUN0 might be a valid LUN being a convincing reason.
>>>>
>>>> LUN masking.
>>>> Some HBAs / virtualised devices use LUN masking to forward LUNs to the
>>>> virtual machines.
>>>> So for LUN0 you have the choice of exposing it to every virtual machine,
>>>> meaning you cannot assign a device to LUN0, or have LUN0 as a no-device
>>>> LUN which then can be exposed to every virtual machine.
>>>
>>> That shouldn't matter, should it?  The spec says that even a masked LUN
>>> must respond to an inquiry (with PQ indicating appropriate
>>> inaccessibility).
>>
>> Which spec? I haven't seen a mention of LUN masking
>> in any SCSI spec and I just rechecked SAM-2,3,4 and 5.
>> Looked at FCP-4 as well.
>
> It's not called LUN Masking; it's called access control, although the
> ACLs are referred to as LUN masks.
>
>>>> At which point you run into hardware limitations, as not every storage
>>>> array allow for the first option.
>>>> And not every LUN masking implementation allows you to expose a single
>>>> LUN to several virtual machines. So you might be screwed either way.
>>>>
>>>> This was the main reason why zfcp could not use the standard LUN
>>>> scanning method like every other HBA LLDD and had to resort to manual
>>>> LUN activation.
>>>
>>> So this is an out of spec implementation of LUN masking ... as in it
>>> doesn't respond correctly to an INQUIRY?
>>
>> No specs apply that I can see.
>
> SPC-3 Section 8.3 "Access Controls"

spc4r36f.pdf section 8.3.1.2 [Overview]

"Access controls are handled in the SCSI target device by
an access controls coordinator located at the ACCESS CONTROLS
well known logical unit. The access controls coordinator also
may be accessible via LUN 0. The access controls coordinator
associates a specific LUN to a specific logical unit depending
on which initiator port accesses the SCSI target device and
whether the initiator port has access rights to the
logical unit."

That seems to strengthen the argument for going the W_LUN route.

Doug Gilbert

>>>>>         2. What devices have you actually tested this on?
>>>>>
>>>> Netapp FAS, HP EVA, HP P2000 / MSA, EMC Clariion.
>>>>
>>>> But as mentioned, I'll be rechecking the patch.
>>>> We should _not_ try to probe LUN0 first, but rather send REPORT_LUNS to
>>>> the W_LUN directly. If it responds, good. If not, we'll fall back to LUN0.
>>>
>>> I don't think we can ever do that ... what about SCSI 2 devices that
>>> don't support REPORT LUNS or USB devices that will crash on it?  We
>>> might be able to try a host type whitelist, where if we were a USB or
>>> traditional bus host (SPI) we never try this, but if we're a modern one
>>> (SAS, FC) we do.
>>
>> The VERSION field (byte 2) of an INQUIRY response is always
>> available, even on USB storage devices which usually claim
>> SCSI-2 compliance:
>>      2 == (rsp_buff[2] & 0x7)
>
> So this is the chicken and egg problem: if we haven't probed the target
> at all, how do we know it's SCSI-2?  If we do an initial probe, we have
> to do it as an INQUIRY to LUN0 and we end up in the same situation we
> are now.  That's why I suggested a host bus type parameter if we want to
> do REPORT LUNS probes.
>
>> No need to try REPORT LUNS on such devices.
>>
>>
>> Are there any SCSI-1 devices still out there?
>
> I don't have any (last one burned out a while ago).
>
> James
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>


^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH][RFC] scsi: Use W_LUN for scanning
  2013-04-07 16:34           ` Douglas Gilbert
@ 2013-04-07 17:37             ` James Bottomley
  0 siblings, 0 replies; 12+ messages in thread
From: James Bottomley @ 2013-04-07 17:37 UTC (permalink / raw)
  To: dgilbert
  Cc: Hannes Reinecke, linux-scsi, George Martin, Steffen Maier, Mike Christie

On Sun, 2013-04-07 at 12:34 -0400, Douglas Gilbert wrote:
> On 13-04-07 12:15 PM, James Bottomley wrote:
> >> No specs apply that I can see.
> >
> > SPC-3 Section 8.3 "Access Controls"
> 
> spc4r36f.pdf section 8.3.1.2 [Overview]
> 
> "Access controls are handled in the SCSI target device by
> an access controls coordinator located at the ACCESS CONTROLS
> well known logical unit. The access controls coordinator also
> may be accessible via LUN 0. The access controls coordinator
> associates a specific LUN to a specific logical unit depending
> on which initiator port accesses the SCSI target device and
> whether the initiator port has access rights to the
> logical unit."
> 
> That seems to strengthen the argument for going the W_LUN route.

The W-LUN route is only viable if we can find a way of making it work
for everything (and that includes USB).

The spec also says in 8.3.1.7 [Verifying access rights]

"If an INQUIRY command is addressed to a LUN for which there is no
matching LUN value in any LUACD in any ACE allowing the initiator port
logical unit access rights, the standard INQUIRY data (see 6.4.2)
PERIPHERAL DEVICE TYPE field shall be set to 1Fh and the PERIPHERAL
QUALIFIER field shall be set to 011b (i.e., the device server is not
capable of supporting a device at this logical unit)."

That means our current LUN0 probe should work just fine.

James


^ permalink raw reply	[flat|nested] 12+ messages in thread

end of thread, other threads:[~2013-04-07 17:37 UTC | newest]

Thread overview: 12+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2013-03-15  9:46 [PATCH][RFC] scsi: Use W_LUN for scanning Hannes Reinecke
2013-03-15 15:54 ` Steffen Maier
2013-03-17 21:50   ` Steffen Maier
2013-03-18 15:25   ` Hannes Reinecke
2013-03-15 21:22 ` Douglas Gilbert
2013-04-06  9:08 ` James Bottomley
2013-04-07 13:31   ` Hannes Reinecke
2013-04-07 14:49     ` James Bottomley
2013-04-07 15:59       ` Douglas Gilbert
2013-04-07 16:15         ` James Bottomley
2013-04-07 16:34           ` Douglas Gilbert
2013-04-07 17:37             ` James Bottomley

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.