linux-nvme.lists.infradead.org archive mirror
* [PATCH blktests 0/3] improve block/011
@ 2023-05-26  4:58 Shin'ichiro Kawasaki
  2023-05-26  4:58 ` [PATCH blktests 1/3] common/rc: introduce _get_pci_from_dev_sysfs Shin'ichiro Kawasaki
                   ` (3 more replies)
  0 siblings, 4 replies; 5+ messages in thread
From: Shin'ichiro Kawasaki @ 2023-05-26  4:58 UTC
  To: linux-block, linux-nvme; +Cc: Shin'ichiro Kawasaki

As of today, three failure symptoms are observed with block/011 [1]. This series
addresses two of them. The first two patches address the disappearing system
disk (Symptom C in [1]). The third patch addresses the zero-capacity NVME
devices (Symptom B in [1]).

[1] https://lore.kernel.org/linux-block/rsmmxrchy6voi5qhl4irss5sprna3f5owkqtvybxglcv2pnylm@xmrnpfu3tfpe/

Shin'ichiro Kawasaki (3):
  common/rc: introduce _get_pci_from_dev_sysfs
  block/011: skip when mounted block devices are affected
  block/011: recover test target NVME device capacity

 common/rc       |  8 ++++++--
 tests/block/011 | 31 +++++++++++++++++++++++++++++++
 2 files changed, 37 insertions(+), 2 deletions(-)

-- 
2.40.1




* [PATCH blktests 1/3] common/rc: introduce _get_pci_from_dev_sysfs
  2023-05-26  4:58 [PATCH blktests 0/3] improve block/011 Shin'ichiro Kawasaki
@ 2023-05-26  4:58 ` Shin'ichiro Kawasaki
  2023-05-26  4:58 ` [PATCH blktests 2/3] block/011: skip when mounted block devices are affected Shin'ichiro Kawasaki
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 5+ messages in thread
From: Shin'ichiro Kawasaki @ 2023-05-26  4:58 UTC
  To: linux-block, linux-nvme; +Cc: Shin'ichiro Kawasaki

To prepare for the block/011 test case improvement, add a helper function
that gets the PCI device from the given sysfs path of a block device.

Signed-off-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com>
---
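Note (not part of the commit log): a minimal sketch of how the helper
resolves a block device to its PCI address, run against a throwaway
directory tree instead of real sysfs. The tree layout, the device name
nvme0n1 and the PCI addresses 0000:00:1d.0 and 0000:3d:00.0 are made up
for illustration; only the function body is taken from this patch.

```shell
# Helper as introduced by this patch: resolve "<sysfs dir>/device" and
# keep the last PCI address that appears in the resolved path.
_get_pci_from_dev_sysfs() {
	readlink -f "$1/device" | \
		grep -Eo '[0-9a-f]{4,5}:[0-9a-f]{2}:[0-9a-f]{2}\.[0-9a-f]' | \
		tail -1
}

# Hypothetical stand-in for sysfs: a bridge at 0000:00:1d.0 with an NVMe
# controller at 0000:3d:00.0 behind it.
fake_sysfs=$(mktemp -d)
trap 'rm -rf "$fake_sysfs"' EXIT
ctrl="$fake_sysfs/devices/pci0000:00/0000:00:1d.0/0000:3d:00.0/nvme/nvme0"
mkdir -p "$ctrl" "$fake_sysfs/block/nvme0n1"
ln -s "$ctrl" "$fake_sysfs/block/nvme0n1/device"

pci_addr=$(_get_pci_from_dev_sysfs "$fake_sysfs/block/nvme0n1")
echo "$pci_addr"
```

Because of the "tail -1", the address closest to the block device is
returned, not that of an upstream bridge earlier in the path.
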
 common/rc | 8 ++++++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/common/rc b/common/rc
index 57e0f42..90122c0 100644
--- a/common/rc
+++ b/common/rc
@@ -313,12 +313,16 @@ _require_test_dev_is_pci() {
 	return 0
 }
 
-_get_pci_dev_from_blkdev() {
-	readlink -f "$TEST_DEV_SYSFS/device" | \
+_get_pci_from_dev_sysfs() {
+	readlink -f "$1/device" | \
 		grep -Eo '[0-9a-f]{4,5}:[0-9a-f]{2}:[0-9a-f]{2}\.[0-9a-f]' | \
 		tail -1
 }
 
+_get_pci_dev_from_blkdev() {
+	_get_pci_from_dev_sysfs "$TEST_DEV_SYSFS"
+}
+
 _get_pci_parent_from_blkdev() {
 	readlink -f "$TEST_DEV_SYSFS/device" | \
 		grep -Eo '[0-9a-f]{4,5}:[0-9a-f]{2}:[0-9a-f]{2}\.[0-9a-f]' | \
-- 
2.40.1




* [PATCH blktests 2/3] block/011: skip when mounted block devices are affected
  2023-05-26  4:58 [PATCH blktests 0/3] improve block/011 Shin'ichiro Kawasaki
  2023-05-26  4:58 ` [PATCH blktests 1/3] common/rc: introduce _get_pci_from_dev_sysfs Shin'ichiro Kawasaki
@ 2023-05-26  4:58 ` Shin'ichiro Kawasaki
  2023-05-26  4:58 ` [PATCH blktests 3/3] block/011: recover test target NVME device capacity Shin'ichiro Kawasaki
  2023-06-13 11:35 ` [PATCH blktests 0/3] improve block/011 Shinichiro Kawasaki
  3 siblings, 0 replies; 5+ messages in thread
From: Shin'ichiro Kawasaki @ 2023-05-26  4:58 UTC
  To: linux-block, linux-nvme; +Cc: Shin'ichiro Kawasaki

The test case disables the PCI device of the test target block device.
When other block devices on that PCI device are mounted, they are
disabled as well. If a mounted device is the system disk, the test
breaks the system. To avoid this dangerous operation, check whether the
target PCI device has mounted block devices, and skip the test if it does.

Signed-off-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com>
---
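Note (not part of the commit log): a minimal sketch of the mount check
on its own. The function name dev_is_mounted, its second parameter and
the sample mounts content are hypothetical and exist only so that the
check can be tried without touching /proc/mounts. It is written with
portable sh syntax rather than the bash idioms used in the test itself.

```shell
# Hypothetical helper: succeed if any line of the mounts file mentions
# /dev/<name>. The match is a substring match, as in the patch.
dev_is_mounted() {
	dev_name=$1
	mounts_file=${2:-/proc/mounts}
	grep -qe "/dev/$dev_name" "$mounts_file"
}

# Made-up mounts content: two partitions of nvme0n1 are mounted.
mounts_sample=$(mktemp)
trap 'rm -f "$mounts_sample"' EXIT
cat > "$mounts_sample" <<'EOF'
/dev/nvme0n1p2 / ext4 rw,relatime 0 0
/dev/nvme0n1p1 /boot/efi vfat rw,relatime 0 0
tmpfs /run tmpfs rw,nosuid,nodev 0 0
EOF

for name in nvme0n1 nvme1n1; do
	if dev_is_mounted "$name" "$mounts_sample"; then
		echo "$name: mounted"
	else
		echo "$name: not mounted"
	fi
done
```

Since the pattern is not anchored, a whole-disk name such as nvme0n1
also matches its mounted partitions (nvme0n1p1, nvme0n1p2), which is
what the skip condition needs.
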
 tests/block/011 | 17 +++++++++++++++++
 1 file changed, 17 insertions(+)

diff --git a/tests/block/011 b/tests/block/011
index 4f331b4..0699936 100755
--- a/tests/block/011
+++ b/tests/block/011
@@ -10,12 +10,29 @@ DESCRIPTION="disable PCI device while doing I/O"
 TIMED=1
 CAN_BE_ZONED=1
 
+pci_dev_mounted() {
+	local d dev p pdev
+
+	pdev="$(_get_pci_dev_from_blkdev)"
+	for d in /sys/block/*; do
+		dev=${d##*/}
+		p=$(_get_pci_from_dev_sysfs "$d")
+		[[ $p != "$pdev" ]] && continue
+		grep -qe "/dev/$dev" /proc/mounts && return 0
+	done
+	return 1
+}
+
 requires() {
 	_have_fio && _have_program setpci
 }
 
 device_requires() {
 	_require_test_dev_is_pci
+	if pci_dev_mounted; then
+		SKIP_REASONS+=("mounted block device exists on test target PCI device")
+		return 1
+	fi
 }
 
 test_device() {
-- 
2.40.1




* [PATCH blktests 3/3] block/011: recover test target NVME device capacity
  2023-05-26  4:58 [PATCH blktests 0/3] improve block/011 Shin'ichiro Kawasaki
  2023-05-26  4:58 ` [PATCH blktests 1/3] common/rc: introduce _get_pci_from_dev_sysfs Shin'ichiro Kawasaki
  2023-05-26  4:58 ` [PATCH blktests 2/3] block/011: skip when mounted block devices are affected Shin'ichiro Kawasaki
@ 2023-05-26  4:58 ` Shin'ichiro Kawasaki
  2023-06-13 11:35 ` [PATCH blktests 0/3] improve block/011 Shinichiro Kawasaki
  3 siblings, 0 replies; 5+ messages in thread
From: Shin'ichiro Kawasaki @ 2023-05-26  4:58 UTC
  To: linux-block, linux-nvme; +Cc: Shin'ichiro Kawasaki

The test case runs fio while disabling and enabling the PCI device of
the test target block device. When the block device is an NVME PCI
device, the test triggers NVME controller resets. When an error happens
during a reset, the NVME PCI driver sets the device capacity to zero.
This zero-capacity device causes failures in the test cases that follow.

To avoid failures caused by zero device capacity, check the capacity at
the end of the test. If it is zero, remove the device and rescan the PCI
bus to detect the device again and regain the correct capacity.

Signed-off-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com>
---
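Note (not part of the commit log): a minimal sketch of the recovery
step, exercised on a fake directory tree so that no real device is
removed. The function name recover_zero_capacity and its two path
parameters are hypothetical; the test itself uses $TEST_DEV_SYSFS and
/sys/bus/pci/rescan directly. The sketch assumes that device/device
under the block device's sysfs directory is the PCI device, as the
paths in this patch suggest.

```shell
# Hypothetical helper: if the device reports zero sectors, write 1 to its
# PCI "remove" attribute and then to the bus "rescan" attribute.
recover_zero_capacity() {
	dev_sysfs=$1
	rescan_attr=$2
	size=$(cat "$dev_sysfs/size")
	[ "$size" -eq 0 ] || return 0
	if [ -w "$dev_sysfs/device/device/remove" ] && [ -w "$rescan_attr" ]; then
		echo 1 > "$dev_sysfs/device/device/remove"
		echo 1 > "$rescan_attr"
	fi
}

# Fake tree standing in for sysfs: one zero-capacity device, one healthy.
fake_root=$(mktemp -d)
trap 'rm -rf "$fake_root"' EXIT
for name in bad good; do
	mkdir -p "$fake_root/$name/device/device"
	: > "$fake_root/$name/device/device/remove"
	: > "$fake_root/$name.rescan"
done
echo 0 > "$fake_root/bad/size"
echo 1000 > "$fake_root/good/size"

recover_zero_capacity "$fake_root/bad" "$fake_root/bad.rescan"
recover_zero_capacity "$fake_root/good" "$fake_root/good.rescan"
```

Only the "bad" device ends up with 1 written to its remove and rescan
files; the "good" device is left untouched.
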
 tests/block/011 | 14 ++++++++++++++
 1 file changed, 14 insertions(+)

diff --git a/tests/block/011 b/tests/block/011
index 0699936..a4230f4 100755
--- a/tests/block/011
+++ b/tests/block/011
@@ -59,4 +59,18 @@ test_device() {
 	done
 
 	echo "Test complete"
+
+	# This test triggers NVME controller resets. When any failure happens
+	# during the resets, the driver marks the NVME block devices with zero
+	# capacity. Then following tests fail with the zero capacity devices. To
+	# get back the correct capacity, remove and rescan the devices.
+	if ((!$(<"$TEST_DEV_SYSFS/size"))); then
+		echo "$TEST_DEV has zero capacity" >> "$FULL"
+		if [[ -w $TEST_DEV_SYSFS/device/device/remove ]] &&
+			   [[ -w /sys/bus/pci/rescan ]]; then
+			echo "Rescan to regain the correct capacity" >> "$FULL"
+			echo 1 > "$TEST_DEV_SYSFS/device/device/remove"
+			echo 1 > /sys/bus/pci/rescan
+		fi
+	fi
 }
-- 
2.40.1




* Re: [PATCH blktests 0/3] improve block/011
  2023-05-26  4:58 [PATCH blktests 0/3] improve block/011 Shin'ichiro Kawasaki
                   ` (2 preceding siblings ...)
  2023-05-26  4:58 ` [PATCH blktests 3/3] block/011: recover test target NVME device capacity Shin'ichiro Kawasaki
@ 2023-06-13 11:35 ` Shinichiro Kawasaki
  3 siblings, 0 replies; 5+ messages in thread
From: Shinichiro Kawasaki @ 2023-06-13 11:35 UTC
  To: linux-block, linux-nvme

On May 26, 2023 / 13:58, Shin'ichiro Kawasaki wrote:
> As of today, three failure symptoms are observed with block/011 [1]. This series
> addresses two of them. The first two patches address the disappearing system
> disk (Symptom C in [1]). The third patch addresses the zero-capacity NVME
> devices (Symptom B in [1]).
> 
> [1] https://lore.kernel.org/linux-block/rsmmxrchy6voi5qhl4irss5sprna3f5owkqtvybxglcv2pnylm@xmrnpfu3tfpe/

FYI, I've applied the patches.

