All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH blktests] scsi/004: add regression test for false BLK_STS_OK with non good SAM status
@ 2018-04-17  9:03 Steffen Maier
  2018-04-17  9:41 ` Johannes Thumshirn
  2018-04-19 18:53 ` Omar Sandoval
  0 siblings, 2 replies; 10+ messages in thread
From: Steffen Maier @ 2018-04-17  9:03 UTC (permalink / raw)
  To: linux-block, linux-block
  Cc: Steffen Maier, linux-scsi, Omar Sandoval, Johannes Thumshirn,
	Bart Van Assche, Hannes Reinecke, Douglas Gilbert,
	Damien Le Moal, Christoph Hellwig, Lee Duncan

Signed-off-by: Steffen Maier <maier@linux.ibm.com>
---
 tests/scsi/004     |   59 ++++++++++++++++++++++++++++++++++++++++++++++++++++
 tests/scsi/004.out |    3 ++
 2 files changed, 62 insertions(+), 0 deletions(-)
 create mode 100755 tests/scsi/004
 create mode 100644 tests/scsi/004.out

diff --git a/tests/scsi/004 b/tests/scsi/004
new file mode 100755
index 0000000..4852efc
--- /dev/null
+++ b/tests/scsi/004
@@ -0,0 +1,59 @@
+#!/bin/bash
+#
+# Ensure repeated SAM_STAT_TASK_SET_FULL results in EIO on timing out command.
+#
+# Regression test for commit cbe095e2b584 ("Revert "scsi: core: return
+# BLK_STS_OK for DID_OK in __scsi_error_from_host_byte()"")
+#
+# Found independently of corresponding commit mail threads while
+# experimenting with storage mirroring. This test is a storage-independent
+# reproducer for the error case I ran into.
+#
+# Copyright IBM Corp. 2018
+# Author: Steffen Maier <maier@linux.ibm.com>
+#
+# This program is free software: you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation, either version 3 of the License, or
+# (at your option) any later version.
+#
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program.  If not, see <http://www.gnu.org/licenses/>.
+
+. common/scsi_debug
+
+DESCRIPTION="ensure repeated TASK SET FULL results in EIO on timing out command"
+
+requires() {
+	_have_scsi_debug
+}
+
+test() {
+	echo "Running ${TEST_NAME}"
+
+	if ! _init_scsi_debug add_host=1 max_luns=1 statistics=1 every_nth=1; then
+	    return 1
+	fi
+	# every_nth RW with full queue gets SAM_STAT_TASK_SET_FULL
+	echo 0x800 > /sys/bus/pseudo/drivers/scsi_debug/opts
+	# delay to reduce response repetition: around 1..10sec depending on HZ
+	echo 1000 > /sys/bus/pseudo/drivers/scsi_debug/delay
+	# a single command fills device queue to satisfy 0x800 opts condition
+	echo 1 > "/sys/block/${SCSI_DEBUG_DEVICES[0]}/device/queue_depth"
+	dd if="/dev/${SCSI_DEBUG_DEVICES[0]}" iflag=direct of=/dev/null bs=512 count=1 |& grep -o "Input/output error"
+	# stop injection
+	echo 0 > /sys/bus/pseudo/drivers/scsi_debug/opts
+	# dd closing SCSI disk causes implicit TUR also being delayed once
+	while grep -q -F "in_use_bm BUSY:" "/proc/scsi/scsi_debug/${SCSI_DEBUG_HOSTS[0]}"; do
+		sleep 1
+	done
+	echo 1 > /sys/bus/pseudo/drivers/scsi_debug/delay
+	_exit_scsi_debug
+
+	echo "Test complete"
+}
diff --git a/tests/scsi/004.out b/tests/scsi/004.out
new file mode 100644
index 0000000..b1126fb
--- /dev/null
+++ b/tests/scsi/004.out
@@ -0,0 +1,3 @@
+Running scsi/004
+Input/output error
+Test complete
-- 
1.7.1

^ permalink raw reply related	[flat|nested] 10+ messages in thread

* Re: [PATCH blktests] scsi/004: add regression test for false BLK_STS_OK with non good SAM status
  2018-04-17  9:03 [PATCH blktests] scsi/004: add regression test for false BLK_STS_OK with non good SAM status Steffen Maier
@ 2018-04-17  9:41 ` Johannes Thumshirn
  2018-04-19 18:53 ` Omar Sandoval
  1 sibling, 0 replies; 10+ messages in thread
From: Johannes Thumshirn @ 2018-04-17  9:41 UTC (permalink / raw)
  To: Steffen Maier
  Cc: linux-block, linux-scsi, Omar Sandoval, Bart Van Assche,
	Hannes Reinecke, Douglas Gilbert, Damien Le Moal,
	Christoph Hellwig, Lee Duncan

Looks good,
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
-- 
Johannes Thumshirn                                          Storage
jthumshirn@suse.de                                +49 911 74053 689
SUSE LINUX GmbH, Maxfeldstr. 5, 90409 N�rnberg
GF: Felix Imend�rffer, Jane Smithard, Graham Norton
HRB 21284 (AG N�rnberg)
Key fingerprint = EC38 9CAB C2C4 F25D 8600 D0D0 0393 969D 2D76 0850

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH blktests] scsi/004: add regression test for false BLK_STS_OK with non good SAM status
  2018-04-17  9:03 [PATCH blktests] scsi/004: add regression test for false BLK_STS_OK with non good SAM status Steffen Maier
  2018-04-17  9:41 ` Johannes Thumshirn
@ 2018-04-19 18:53 ` Omar Sandoval
  2018-04-19 19:13   ` Omar Sandoval
  1 sibling, 1 reply; 10+ messages in thread
From: Omar Sandoval @ 2018-04-19 18:53 UTC (permalink / raw)
  To: Steffen Maier
  Cc: linux-block, linux-scsi, Omar Sandoval, Johannes Thumshirn,
	Bart Van Assche, Hannes Reinecke, Douglas Gilbert,
	Damien Le Moal, Christoph Hellwig, Lee Duncan

On Tue, Apr 17, 2018 at 11:03:37AM +0200, Steffen Maier wrote:
> Signed-off-by: Steffen Maier <maier@linux.ibm.com>
> ---
>  tests/scsi/004     |   59 ++++++++++++++++++++++++++++++++++++++++++++++++++++
>  tests/scsi/004.out |    3 ++
>  2 files changed, 62 insertions(+), 0 deletions(-)
>  create mode 100755 tests/scsi/004
>  create mode 100644 tests/scsi/004.out
> 
> diff --git a/tests/scsi/004 b/tests/scsi/004
> new file mode 100755
> index 0000000..4852efc
> --- /dev/null
> +++ b/tests/scsi/004
> @@ -0,0 +1,59 @@
> +#!/bin/bash
> +#
> +# Ensure repeated SAM_STAT_TASK_SET_FULL results in EIO on timing out command.
> +#
> +# Regression test for commit cbe095e2b584 ("Revert "scsi: core: return
> +# BLK_STS_OK for DID_OK in __scsi_error_from_host_byte()"")
> +#
> +# Found independently of corresponding commit mail threads while
> +# experimenting with storage mirroring. This test is a storage-independent
> +# reproducer for the error case I ran into.
> +#
> +# Copyright IBM Corp. 2018
> +# Author: Steffen Maier <maier@linux.ibm.com>
> +#
> +# This program is free software: you can redistribute it and/or modify
> +# it under the terms of the GNU General Public License as published by
> +# the Free Software Foundation, either version 3 of the License, or
> +# (at your option) any later version.
> +#
> +# This program is distributed in the hope that it will be useful,
> +# but WITHOUT ANY WARRANTY; without even the implied warranty of
> +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> +# GNU General Public License for more details.
> +#
> +# You should have received a copy of the GNU General Public License
> +# along with this program.  If not, see <http://www.gnu.org/licenses/>.
> +
> +. common/scsi_debug
> +
> +DESCRIPTION="ensure repeated TASK SET FULL results in EIO on timing out command"
> +
> +requires() {
> +	_have_scsi_debug
> +}
> +
> +test() {
> +	echo "Running ${TEST_NAME}"
> +
> +	if ! _init_scsi_debug add_host=1 max_luns=1 statistics=1 every_nth=1; then
> +	    return 1
> +	fi
> +	# every_nth RW with full queue gets SAM_STAT_TASK_SET_FULL
> +	echo 0x800 > /sys/bus/pseudo/drivers/scsi_debug/opts
> +	# delay to reduce response repetition: around 1..10sec depending on HZ
> +	echo 1000 > /sys/bus/pseudo/drivers/scsi_debug/delay
> +	# a single command fills device queue to satisfy 0x800 opts condition
> +	echo 1 > "/sys/block/${SCSI_DEBUG_DEVICES[0]}/device/queue_depth"
> +	dd if="/dev/${SCSI_DEBUG_DEVICES[0]}" iflag=direct of=/dev/null bs=512 count=1 |& grep -o "Input/output error"
> +	# stop injection
> +	echo 0 > /sys/bus/pseudo/drivers/scsi_debug/opts
> +	# dd closing SCSI disk causes implicit TUR also being delayed once
> +	while grep -q -F "in_use_bm BUSY:" "/proc/scsi/scsi_debug/${SCSI_DEBUG_HOSTS[0]}"; do
> +		sleep 1
> +	done
> +	echo 1 > /sys/bus/pseudo/drivers/scsi_debug/delay
> +	_exit_scsi_debug
> +
> +	echo "Test complete"
> +}
> diff --git a/tests/scsi/004.out b/tests/scsi/004.out
> new file mode 100644
> index 0000000..b1126fb
> --- /dev/null
> +++ b/tests/scsi/004.out
> @@ -0,0 +1,3 @@
> +Running scsi/004
> +Input/output error
> +Test complete
> -- 
> 1.7.1
> 

Thanks for the test! Applied.

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH blktests] scsi/004: add regression test for false BLK_STS_OK with non good SAM status
  2018-04-19 18:53 ` Omar Sandoval
@ 2018-04-19 19:13   ` Omar Sandoval
  2018-04-19 19:41     ` Bart Van Assche
  2018-04-19 20:04     ` Jens Axboe
  0 siblings, 2 replies; 10+ messages in thread
From: Omar Sandoval @ 2018-04-19 19:13 UTC (permalink / raw)
  To: Steffen Maier
  Cc: linux-block, linux-scsi, Omar Sandoval, Johannes Thumshirn,
	Bart Van Assche, Hannes Reinecke, Douglas Gilbert,
	Damien Le Moal, Christoph Hellwig, Lee Duncan

On Thu, Apr 19, 2018 at 11:53:30AM -0700, Omar Sandoval wrote:
> Thanks for the test! Applied.

Side note, it's unfortunate that this test takes 180 seconds to run only
because we have to wait for the command timeout. We should be able to
export request_queue->rq_timeout writeable in sysfs. Would you be
interested in doing that?

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH blktests] scsi/004: add regression test for false BLK_STS_OK with non good SAM status
  2018-04-19 19:13   ` Omar Sandoval
@ 2018-04-19 19:41     ` Bart Van Assche
  2018-04-19 19:44       ` Jens Axboe
  2018-04-19 20:04     ` Jens Axboe
  1 sibling, 1 reply; 10+ messages in thread
From: Bart Van Assche @ 2018-04-19 19:41 UTC (permalink / raw)
  To: maier, osandov
  Cc: linux-block, jthumshirn, Damien Le Moal, dgilbert, hch,
	linux-scsi, hare, lduncan, osandov

T24gVGh1LCAyMDE4LTA0LTE5IGF0IDEyOjEzIC0wNzAwLCBPbWFyIFNhbmRvdmFsIHdyb3RlOg0K
PiBPbiBUaHUsIEFwciAxOSwgMjAxOCBhdCAxMTo1MzozMEFNIC0wNzAwLCBPbWFyIFNhbmRvdmFs
IHdyb3RlOg0KPiA+IFRoYW5rcyBmb3IgdGhlIHRlc3QhIEFwcGxpZWQuDQo+IA0KPiBTaWRlIG5v
dGUsIGl0J3MgdW5mb3J0dW5hdGUgdGhhdCB0aGlzIHRlc3QgdGFrZXMgMTgwIHNlY29uZHMgdG8g
cnVuIG9ubHkNCj4gYmVjYXVzZSB3ZSBoYXZlIHRvIHdhaXQgZm9yIHRoZSBjb21tYW5kIHRpbWVv
dXQuIFdlIHNob3VsZCBiZSBhYmxlIHRvDQo+IGV4cG9ydCByZXF1ZXN0X3F1ZXVlLT5ycV90aW1l
b3V0IHdyaXRlYWJsZSBpbiBzeXNmcy4gV291bGQgeW91IGJlDQo+IGludGVyZXN0ZWQgaW4gZG9p
bmcgdGhhdD8NCg0KSGVsbG8gT21hciwNCg0KSXMgdGhpcyBwZXJoYXBzIHdoYXQgeW91IGFyZSBs
b29raW5nIGZvcj8NCiMgbHMgLWwgL3N5cy9jbGFzcy9zY3NpX2RldmljZS8qLyovdGltZW91dCAg
ICAgICAgICAgICANCi1ydy1yLS1yLS0gMSByb290IHJvb3QgNDA5NiBBcHIgMTkgMDg6NTIgL3N5
cy9jbGFzcy9zY3NpX2RldmljZS8yOjA6MDowL2RldmljZS90aW1lb3V0DQotcnctci0tci0tIDEg
cm9vdCByb290IDQwOTYgQXByIDE5IDEyOjM5IC9zeXMvY2xhc3Mvc2NzaV9kZXZpY2UvODowOjA6
MS9kZXZpY2UvdGltZW91dA0KDQpUaGFua3MsDQoNCkJhcnQuDQoNCg0KDQo=

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH blktests] scsi/004: add regression test for false BLK_STS_OK with non good SAM status
  2018-04-19 19:41     ` Bart Van Assche
@ 2018-04-19 19:44       ` Jens Axboe
  2018-04-19 20:18         ` Omar Sandoval
  0 siblings, 1 reply; 10+ messages in thread
From: Jens Axboe @ 2018-04-19 19:44 UTC (permalink / raw)
  To: Bart Van Assche, maier, osandov
  Cc: linux-block, jthumshirn, Damien Le Moal, dgilbert, hch,
	linux-scsi, hare, lduncan, osandov

On 4/19/18 1:41 PM, Bart Van Assche wrote:
> On Thu, 2018-04-19 at 12:13 -0700, Omar Sandoval wrote:
>> On Thu, Apr 19, 2018 at 11:53:30AM -0700, Omar Sandoval wrote:
>>> Thanks for the test! Applied.
>>
>> Side note, it's unfortunate that this test takes 180 seconds to run only
>> because we have to wait for the command timeout. We should be able to
>> export request_queue->rq_timeout writeable in sysfs. Would you be
>> interested in doing that?
> 
> Hello Omar,
> 
> Is this perhaps what you are looking for?
> # ls -l /sys/class/scsi_device/*/*/timeout             
> -rw-r--r-- 1 root root 4096 Apr 19 08:52 /sys/class/scsi_device/2:0:0:0/device/timeout
> -rw-r--r-- 1 root root 4096 Apr 19 12:39 /sys/class/scsi_device/8:0:0:1/device/timeout

We should have it generically available though, not just for SCSI. In
retrospect, it should have been under queue/ from the start, now we'll
end up with duplicate entries for SCSI.

-- 
Jens Axboe

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH blktests] scsi/004: add regression test for false BLK_STS_OK with non good SAM status
  2018-04-19 19:13   ` Omar Sandoval
  2018-04-19 19:41     ` Bart Van Assche
@ 2018-04-19 20:04     ` Jens Axboe
  1 sibling, 0 replies; 10+ messages in thread
From: Jens Axboe @ 2018-04-19 20:04 UTC (permalink / raw)
  To: Omar Sandoval, Steffen Maier
  Cc: linux-block, linux-scsi, Omar Sandoval, Johannes Thumshirn,
	Bart Van Assche, Hannes Reinecke, Douglas Gilbert,
	Damien Le Moal, Christoph Hellwig, Lee Duncan

On 4/19/18 1:13 PM, Omar Sandoval wrote:
> On Thu, Apr 19, 2018 at 11:53:30AM -0700, Omar Sandoval wrote:
>> Thanks for the test! Applied.
> 
> Side note, it's unfortunate that this test takes 180 seconds to run only
> because we have to wait for the command timeout. We should be able to
> export request_queue->rq_timeout writeable in sysfs. Would you be
> interested in doing that?

Here's one, I even tested it.


diff --git a/block/blk-sysfs.c b/block/blk-sysfs.c
index d00d1b0ec109..d5d628c3c12d 100644
--- a/block/blk-sysfs.c
+++ b/block/blk-sysfs.c
@@ -496,6 +496,27 @@ static ssize_t queue_dax_show(struct request_queue *q, char *page)
 	return queue_var_show(blk_queue_dax(q), page);
 }
 
+static ssize_t queue_timeout_show(struct request_queue *q, char *page)
+{
+	return sprintf(page, "%u\n", q->rq_timeout / HZ);
+}
+
+static ssize_t queue_timeout_store(struct request_queue *q, const char *page,
+				   size_t count)
+{
+	unsigned long timeout;
+	ssize_t ret;
+
+	ret = queue_var_store(&timeout, page, count);
+	if (ret < 0)
+		return ret;
+	if (!timeout)
+		return -EINVAL;
+
+	blk_queue_rq_timeout(q, timeout * HZ);
+	return count;
+}
+
 static struct queue_sysfs_entry queue_requests_entry = {
 	.attr = {.name = "nr_requests", .mode = S_IRUGO | S_IWUSR },
 	.show = queue_requests_show,
@@ -678,6 +699,12 @@ static struct queue_sysfs_entry throtl_sample_time_entry = {
 };
 #endif
 
+static struct queue_sysfs_entry queue_timeout_entry = {
+	.attr = {.name = "timeout", .mode = S_IRUGO | S_IWUSR },
+	.show = queue_timeout_show,
+	.store = queue_timeout_store,
+};
+
 static struct attribute *default_attrs[] = {
 	&queue_requests_entry.attr,
 	&queue_ra_entry.attr,
@@ -714,6 +741,7 @@ static struct attribute *default_attrs[] = {
 #ifdef CONFIG_BLK_DEV_THROTTLING_LOW
 	&throtl_sample_time_entry.attr,
 #endif
+	&queue_timeout_entry.attr,
 	NULL,
 };
 

-- 
Jens Axboe

^ permalink raw reply related	[flat|nested] 10+ messages in thread

* Re: [PATCH blktests] scsi/004: add regression test for false BLK_STS_OK with non good SAM status
  2018-04-19 19:44       ` Jens Axboe
@ 2018-04-19 20:18         ` Omar Sandoval
  2018-04-23 12:25           ` Steffen Maier
  0 siblings, 1 reply; 10+ messages in thread
From: Omar Sandoval @ 2018-04-19 20:18 UTC (permalink / raw)
  To: Jens Axboe
  Cc: Bart Van Assche, maier, linux-block, jthumshirn, Damien Le Moal,
	dgilbert, hch, linux-scsi, hare, lduncan, osandov

On Thu, Apr 19, 2018 at 01:44:41PM -0600, Jens Axboe wrote:
> On 4/19/18 1:41 PM, Bart Van Assche wrote:
> > On Thu, 2018-04-19 at 12:13 -0700, Omar Sandoval wrote:
> >> On Thu, Apr 19, 2018 at 11:53:30AM -0700, Omar Sandoval wrote:
> >>> Thanks for the test! Applied.
> >>
> >> Side note, it's unfortunate that this test takes 180 seconds to run only
> >> because we have to wait for the command timeout. We should be able to
> >> export request_queue->rq_timeout writeable in sysfs. Would you be
> >> interested in doing that?
> > 
> > Hello Omar,
> > 
> > Is this perhaps what you are looking for?
> > # ls -l /sys/class/scsi_device/*/*/timeout             
> > -rw-r--r-- 1 root root 4096 Apr 19 08:52 /sys/class/scsi_device/2:0:0:0/device/timeout
> > -rw-r--r-- 1 root root 4096 Apr 19 12:39 /sys/class/scsi_device/8:0:0:1/device/timeout
> 
> We should have it generically available though, not just for SCSI. In
> retrospect, it should have been under queue/ from the start, now we'll
> end up with duplicate entries for SCSI.

For the sake of this test, I just decreased the timeout through SCSI.

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH blktests] scsi/004: add regression test for false BLK_STS_OK with non good SAM status
  2018-04-19 20:18         ` Omar Sandoval
@ 2018-04-23 12:25           ` Steffen Maier
  2018-04-24 23:14             ` Omar Sandoval
  0 siblings, 1 reply; 10+ messages in thread
From: Steffen Maier @ 2018-04-23 12:25 UTC (permalink / raw)
  To: Omar Sandoval, Jens Axboe
  Cc: Bart Van Assche, linux-block, jthumshirn, Damien Le Moal,
	dgilbert, hch, linux-scsi, hare, lduncan, osandov


On 04/19/2018 10:18 PM, Omar Sandoval wrote:
> On Thu, Apr 19, 2018 at 01:44:41PM -0600, Jens Axboe wrote:
>> On 4/19/18 1:41 PM, Bart Van Assche wrote:
>>> On Thu, 2018-04-19 at 12:13 -0700, Omar Sandoval wrote:
>>>> On Thu, Apr 19, 2018 at 11:53:30AM -0700, Omar Sandoval wrote:
>>>>> Thanks for the test! Applied.
>>>>
>>>> Side note, it's unfortunate that this test takes 180 seconds to run =
only
>>>> because we have to wait for the command timeout. We should be able t=
o
>>>> export request_queue->rq_timeout writeable in sysfs. Would you be
>>>> interested in doing that?
>>>
>>> Hello Omar,
>>>
>>> Is this perhaps what you are looking for?
>>> # ls -l /sys/class/scsi_device/*/*/timeout
>>> -rw-r--r-- 1 root root 4096 Apr 19 08:52 /sys/class/scsi_device/2:0:0=
:0/device/timeout
>>> -rw-r--r-- 1 root root 4096 Apr 19 12:39 /sys/class/scsi_device/8:0:0=
:1/device/timeout
>>
>> We should have it generically available though, not just for SCSI. In
>> retrospect, it should have been under queue/ from the start, now we'll=

>> end up with duplicate entries for SCSI.
>=20
> For the sake of this test, I just decreased the timeout through SCSI.

Great idea.

> 	echo 5 > "/sys/block/${SCSI_DEBUG_DEVICES[0]}/device/timeout"

However, the timeout should be sufficiently larger than scsi_debug/delay,=

in order not to run into the command timeout.
It may be unfortunate that scsi_debug/delay uses jiffies as unit and
can thus differ in a range of an order of magnitude for different kernel =
configs.

> 	# delay to reduce response repetition: around 1..10sec depending on HZ=

> 	echo 1000 > /sys/bus/pseudo/drivers/scsi_debug/delay

On s390, we typically have HZ=3D100, so 1000 jiffies are 10 seconds.

We can increase the sdev cmd timeout or decrease the scsi_debug/delay.
100 instead of 1000 for scsi_debug/delay worked for me;
but for some reason the loop checking for busy did not work (any more?)
causing an unexpected test case error:

> # ./check scsi/004
> scsi/004 (ensure repeated TASK SET FULL results in EIO on timing out co=
mmand) [failed]
> runtime  31.892s  ...  31.720s
>     --- tests/scsi/004.out	2018-04-16 11:47:19.105931872 +0200
>     +++ results/nodev/scsi/004.out.bad	2018-04-23 14:07:33.615445253 +0=
200
>     @@ -1,3 +1,3 @@
>      Running scsi/004
>     -Input/output error
>     +modprobe: FATAL: Module scsi_debug is in use.
>      Test complete

so I added another sleep hack:

         # dd closing SCSI disk causes implicit TUR also being delayed on=
ce
+        # sleep over time window where READ was done and TUR not yet que=
ued
+        sleep 2
         while grep -q -F "in_use_bm BUSY:" "/proc/scsi/scsi_debug/${SCSI=
_DEBUG_HOSTS[0]}"; do

What do you think?

--=20
Mit freundlichen Gr=C3=BC=C3=9Fen / Kind regards
Steffen Maier

Linux on z Systems Development

IBM Deutschland Research & Development GmbH
Vorsitzende des Aufsichtsrats: Martina Koederitz
Geschaeftsfuehrung: Dirk Wittkopp
Sitz der Gesellschaft: Boeblingen
Registergericht: Amtsgericht Stuttgart, HRB 243294

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH blktests] scsi/004: add regression test for false BLK_STS_OK with non good SAM status
  2018-04-23 12:25           ` Steffen Maier
@ 2018-04-24 23:14             ` Omar Sandoval
  0 siblings, 0 replies; 10+ messages in thread
From: Omar Sandoval @ 2018-04-24 23:14 UTC (permalink / raw)
  To: Steffen Maier
  Cc: Jens Axboe, Bart Van Assche, linux-block, jthumshirn,
	Damien Le Moal, dgilbert, hch, linux-scsi, hare, lduncan,
	osandov

On Mon, Apr 23, 2018 at 02:25:03PM +0200, Steffen Maier wrote:
> 
> On 04/19/2018 10:18 PM, Omar Sandoval wrote:
> > On Thu, Apr 19, 2018 at 01:44:41PM -0600, Jens Axboe wrote:
> >> On 4/19/18 1:41 PM, Bart Van Assche wrote:
> >>> On Thu, 2018-04-19 at 12:13 -0700, Omar Sandoval wrote:
> >>>> On Thu, Apr 19, 2018 at 11:53:30AM -0700, Omar Sandoval wrote:
> >>>>> Thanks for the test! Applied.
> >>>>
> >>>> Side note, it's unfortunate that this test takes 180 seconds to run only
> >>>> because we have to wait for the command timeout. We should be able to
> >>>> export request_queue->rq_timeout writeable in sysfs. Would you be
> >>>> interested in doing that?
> >>>
> >>> Hello Omar,
> >>>
> >>> Is this perhaps what you are looking for?
> >>> # ls -l /sys/class/scsi_device/*/*/timeout
> >>> -rw-r--r-- 1 root root 4096 Apr 19 08:52 /sys/class/scsi_device/2:0:0:0/device/timeout
> >>> -rw-r--r-- 1 root root 4096 Apr 19 12:39 /sys/class/scsi_device/8:0:0:1/device/timeout
> >>
> >> We should have it generically available though, not just for SCSI. In
> >> retrospect, it should have been under queue/ from the start, now we'll
> >> end up with duplicate entries for SCSI.
> > 
> > For the sake of this test, I just decreased the timeout through SCSI.
> 
> Great idea.
> 
> > 	echo 5 > "/sys/block/${SCSI_DEBUG_DEVICES[0]}/device/timeout"
> 
> However, the timeout should be sufficiently larger than scsi_debug/delay,
> in order not to run into the command timeout.
> It may be unfortunate that scsi_debug/delay uses jiffies as unit and
> can thus differ in a range of an order of magnitude for different kernel configs.
> 
> > 	# delay to reduce response repetition: around 1..10sec depending on HZ
> > 	echo 1000 > /sys/bus/pseudo/drivers/scsi_debug/delay
> 
> On s390, we typically have HZ=100, so 1000 jiffies are 10 seconds.

Good catch, I just switched this to use ndelay in nanoseconds instead of
delay.

> We can increase the sdev cmd timeout or decrease the scsi_debug/delay.
> 100 instead of 1000 for scsi_debug/delay worked for me;
> but for some reason the loop checking for busy did not work (any more?)
> causing an unexpected test case error:
> 
> > # ./check scsi/004
> > scsi/004 (ensure repeated TASK SET FULL results in EIO on timing out command) [failed]
> > runtime  31.892s  ...  31.720s
> >     --- tests/scsi/004.out	2018-04-16 11:47:19.105931872 +0200
> >     +++ results/nodev/scsi/004.out.bad	2018-04-23 14:07:33.615445253 +0200
> >     @@ -1,3 +1,3 @@
> >      Running scsi/004
> >     -Input/output error
> >     +modprobe: FATAL: Module scsi_debug is in use.
> >      Test complete
> 
> so I added another sleep hack:
> 
>          # dd closing SCSI disk causes implicit TUR also being delayed once
> +        # sleep over time window where READ was done and TUR not yet queued
> +        sleep 2
>          while grep -q -F "in_use_bm BUSY:" "/proc/scsi/scsi_debug/${SCSI_DEBUG_HOSTS[0]}"; do
> 
> What do you think?

I've been hitting this on and off on all of the scsi-debug tests for
awhile, and I can't figure out where the lingering reference comes from.
I don't think it's related, but I'll look into it.

^ permalink raw reply	[flat|nested] 10+ messages in thread

end of thread, other threads:[~2018-04-24 23:14 UTC | newest]

Thread overview: 10+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-04-17  9:03 [PATCH blktests] scsi/004: add regression test for false BLK_STS_OK with non good SAM status Steffen Maier
2018-04-17  9:41 ` Johannes Thumshirn
2018-04-19 18:53 ` Omar Sandoval
2018-04-19 19:13   ` Omar Sandoval
2018-04-19 19:41     ` Bart Van Assche
2018-04-19 19:44       ` Jens Axboe
2018-04-19 20:18         ` Omar Sandoval
2018-04-23 12:25           ` Steffen Maier
2018-04-24 23:14             ` Omar Sandoval
2018-04-19 20:04     ` Jens Axboe

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.