All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 0/4] Fix DASD control unit updates
@ 2020-12-17 15:59 Stefan Haberland
  2020-12-17 15:59 ` [PATCH 1/4] s390/dasd: fix hanging device offline processing Stefan Haberland
                   ` (4 more replies)
  0 siblings, 5 replies; 6+ messages in thread
From: Stefan Haberland @ 2020-12-17 15:59 UTC (permalink / raw)
  To: Jens Axboe
  Cc: linux-block, Jan Hoeppner, linux-s390, Heiko Carstens,
	Vasily Gorbik, Christian Borntraeger

Hi Jens,

please apply the following patches that address problems during DASD
control unit updates.
Thanks.

Stefan Haberland (4):
  s390/dasd: fix hanging device offline processing
  s390/dasd: prevent inconsistent LCU device data
  s390/dasd: fix list corruption of pavgroup group list
  s390/dasd: fix list corruption of lcu list

 drivers/s390/block/dasd_alias.c | 22 ++++++++++++++++++++--
 1 file changed, 20 insertions(+), 2 deletions(-)

-- 
2.17.1


^ permalink raw reply	[flat|nested] 6+ messages in thread

* [PATCH 1/4] s390/dasd: fix hanging device offline processing
  2020-12-17 15:59 [PATCH 0/4] Fix DASD control unit updates Stefan Haberland
@ 2020-12-17 15:59 ` Stefan Haberland
  2020-12-17 15:59 ` [PATCH 2/4] s390/dasd: prevent inconsistent LCU device data Stefan Haberland
                   ` (3 subsequent siblings)
  4 siblings, 0 replies; 6+ messages in thread
From: Stefan Haberland @ 2020-12-17 15:59 UTC (permalink / raw)
  To: Jens Axboe
  Cc: linux-block, Jan Hoeppner, linux-s390, Heiko Carstens,
	Vasily Gorbik, Christian Borntraeger

For an LCU update a read unit address configuration IO is required.
This is started using sleep_on(), which has early exit paths in case the
device is not usable for IO. For example when it is in offline processing.

In those cases the LCU update should fail and not be retried.
Therefore lcu_update_work checks if EOPNOTSUPP is returned or not.

Commit 47825e05d93e ("s390/dasd: fix endless loop after read unit address configuration")
accidentally removed the EOPNOTSUPP return code from
read_unit_address_configuration(), which in turn might lead to an endless
loop of the LCU update in offline processing.

Fix by returning EOPNOTSUPP again if the device is not able to perform the
request.

Fixes: 47825e05d93e ("s390/dasd: fix endless loop after read unit address configuration")
Cc: stable@vger.kernel.org #5.3
Signed-off-by: Stefan Haberland <sth@linux.ibm.com>
Reviewed-by: Jan Hoeppner <hoeppner@linux.ibm.com>
---
 drivers/s390/block/dasd_alias.c | 10 +++++++++-
 1 file changed, 9 insertions(+), 1 deletion(-)

diff --git a/drivers/s390/block/dasd_alias.c b/drivers/s390/block/dasd_alias.c
index 99f86612f775..31e8b5d48e86 100644
--- a/drivers/s390/block/dasd_alias.c
+++ b/drivers/s390/block/dasd_alias.c
@@ -462,11 +462,19 @@ static int read_unit_address_configuration(struct dasd_device *device,
 	spin_unlock_irqrestore(&lcu->lock, flags);
 
 	rc = dasd_sleep_on(cqr);
-	if (rc && !suborder_not_supported(cqr)) {
+	if (!rc)
+		goto out;
+
+	if (suborder_not_supported(cqr)) {
+		/* suborder not supported or device unusable for IO */
+		rc = -EOPNOTSUPP;
+	} else {
+		/* IO failed but should be retried */
 		spin_lock_irqsave(&lcu->lock, flags);
 		lcu->flags |= NEED_UAC_UPDATE;
 		spin_unlock_irqrestore(&lcu->lock, flags);
 	}
+out:
 	dasd_sfree_request(cqr, cqr->memdev);
 	return rc;
 }
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 6+ messages in thread

* [PATCH 2/4] s390/dasd: prevent inconsistent LCU device data
  2020-12-17 15:59 [PATCH 0/4] Fix DASD control unit updates Stefan Haberland
  2020-12-17 15:59 ` [PATCH 1/4] s390/dasd: fix hanging device offline processing Stefan Haberland
@ 2020-12-17 15:59 ` Stefan Haberland
  2020-12-17 15:59 ` [PATCH 3/4] s390/dasd: fix list corruption of pavgroup group list Stefan Haberland
                   ` (2 subsequent siblings)
  4 siblings, 0 replies; 6+ messages in thread
From: Stefan Haberland @ 2020-12-17 15:59 UTC (permalink / raw)
  To: Jens Axboe
  Cc: linux-block, Jan Hoeppner, linux-s390, Heiko Carstens,
	Vasily Gorbik, Christian Borntraeger

Prevent _lcu_update from adding a device to a pavgroup if the LCU still
requires an update. The data is not reliable any longer and in parallel
devices might have been moved on the lists already.
This might lead to list corruptions or invalid PAV grouping.
Only add devices to a pavgroup if the LCU is up to date. Additional steps
are taken by the scheduled lcu update.

Fixes: 8e09f21574ea ("[S390] dasd: add hyper PAV support to DASD device driver, part 1")
Cc: stable@vger.kernel.org
Signed-off-by: Stefan Haberland <sth@linux.ibm.com>
Reviewed-by: Jan Hoeppner <hoeppner@linux.ibm.com>
---
 drivers/s390/block/dasd_alias.c | 9 +++++++++
 1 file changed, 9 insertions(+)

diff --git a/drivers/s390/block/dasd_alias.c b/drivers/s390/block/dasd_alias.c
index 31e8b5d48e86..f841518de6c5 100644
--- a/drivers/s390/block/dasd_alias.c
+++ b/drivers/s390/block/dasd_alias.c
@@ -511,6 +511,14 @@ static int _lcu_update(struct dasd_device *refdev, struct alias_lcu *lcu)
 		return rc;
 
 	spin_lock_irqsave(&lcu->lock, flags);
+	/*
+	 * there is another update needed skip the remaining handling
+	 * the data might already be outdated
+	 * but especially do not add the device to an LCU with pending
+	 * update
+	 */
+	if (lcu->flags & NEED_UAC_UPDATE)
+		goto out;
 	lcu->pav = NO_PAV;
 	for (i = 0; i < MAX_DEVICES_PER_LCU; ++i) {
 		switch (lcu->uac->unit[i].ua_type) {
@@ -529,6 +537,7 @@ static int _lcu_update(struct dasd_device *refdev, struct alias_lcu *lcu)
 				 alias_list) {
 		_add_device_to_lcu(lcu, device, refdev);
 	}
+out:
 	spin_unlock_irqrestore(&lcu->lock, flags);
 	return 0;
 }
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 6+ messages in thread

* [PATCH 3/4] s390/dasd: fix list corruption of pavgroup group list
  2020-12-17 15:59 [PATCH 0/4] Fix DASD control unit updates Stefan Haberland
  2020-12-17 15:59 ` [PATCH 1/4] s390/dasd: fix hanging device offline processing Stefan Haberland
  2020-12-17 15:59 ` [PATCH 2/4] s390/dasd: prevent inconsistent LCU device data Stefan Haberland
@ 2020-12-17 15:59 ` Stefan Haberland
  2020-12-17 15:59 ` [PATCH 4/4] s390/dasd: fix list corruption of lcu list Stefan Haberland
  2020-12-17 16:36 ` [PATCH 0/4] Fix DASD control unit updates Jens Axboe
  4 siblings, 0 replies; 6+ messages in thread
From: Stefan Haberland @ 2020-12-17 15:59 UTC (permalink / raw)
  To: Jens Axboe
  Cc: linux-block, Jan Hoeppner, linux-s390, Heiko Carstens,
	Vasily Gorbik, Christian Borntraeger

dasd_alias_add_device() moves devices to the active_devices list in case
of a scheduled LCU update regardless if they have previously been in a
pavgroup or not.

Example: device A and B are in the same pavgroup.

Device A has already been in a pavgroup and the private->pavgroup pointer
is set and points to a valid pavgroup. While going through dasd_add_device
it is moved from the pavgroup to the active_devices list.

In parallel device B might be removed from the same pavgroup in
remove_device_from_lcu() which in turn checks if the group is empty
and deletes it accordingly because device A has already been removed from
there.

When now device A enters remove_device_from_lcu() it is tried to remove it
from the pavgroup again because the pavgroup pointer is still set and again
the empty group will be cleaned up which leads to a list corruption.

Fix by setting private->pavgroup to NULL in dasd_add_device.

If the device has been the last device on the pavgroup an empty pavgroup
remains but this will be cleaned up by the scheduled lcu_update which
iterates over all existing pavgroups.

Fixes: 8e09f21574ea ("[S390] dasd: add hyper PAV support to DASD device driver, part 1")
Cc: stable@vger.kernel.org
Signed-off-by: Stefan Haberland <sth@linux.ibm.com>
Reviewed-by: Jan Hoeppner <hoeppner@linux.ibm.com>
---
 drivers/s390/block/dasd_alias.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/s390/block/dasd_alias.c b/drivers/s390/block/dasd_alias.c
index f841518de6c5..60344e13e87b 100644
--- a/drivers/s390/block/dasd_alias.c
+++ b/drivers/s390/block/dasd_alias.c
@@ -642,6 +642,7 @@ int dasd_alias_add_device(struct dasd_device *device)
 	}
 	if (lcu->flags & UPDATE_PENDING) {
 		list_move(&device->alias_list, &lcu->active_devices);
+		private->pavgroup = NULL;
 		_schedule_lcu_update(lcu, device);
 	}
 	spin_unlock_irqrestore(&lcu->lock, flags);
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 6+ messages in thread

* [PATCH 4/4] s390/dasd: fix list corruption of lcu list
  2020-12-17 15:59 [PATCH 0/4] Fix DASD control unit updates Stefan Haberland
                   ` (2 preceding siblings ...)
  2020-12-17 15:59 ` [PATCH 3/4] s390/dasd: fix list corruption of pavgroup group list Stefan Haberland
@ 2020-12-17 15:59 ` Stefan Haberland
  2020-12-17 16:36 ` [PATCH 0/4] Fix DASD control unit updates Jens Axboe
  4 siblings, 0 replies; 6+ messages in thread
From: Stefan Haberland @ 2020-12-17 15:59 UTC (permalink / raw)
  To: Jens Axboe
  Cc: linux-block, Jan Hoeppner, linux-s390, Heiko Carstens,
	Vasily Gorbik, Christian Borntraeger

In dasd_alias_disconnect_device_from_lcu the device is removed from any
list on the LCU. Afterwards the LCU is removed from the lcu list if it
does not contain devices any longer.

The lcu->lock protects the lcu from parallel updates. But to cancel all
workers and wait for completion the lcu->lock has to be unlocked.

If two devices are removed in parallel and both are removed from the LCU
the first device that takes the lcu->lock again will delete the LCU because
it is already empty but the second device also tries to free the LCU which
leads to a list corruption of the lcu list.

Fix by removing the device right before the lcu is checked without
unlocking the lcu->lock in between.

Fixes: 8e09f21574ea ("[S390] dasd: add hyper PAV support to DASD device driver, part 1")
Cc: stable@vger.kernel.org
Signed-off-by: Stefan Haberland <sth@linux.ibm.com>
Reviewed-by: Jan Hoeppner <hoeppner@linux.ibm.com>
---
 drivers/s390/block/dasd_alias.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/s390/block/dasd_alias.c b/drivers/s390/block/dasd_alias.c
index 60344e13e87b..dc78a523a69f 100644
--- a/drivers/s390/block/dasd_alias.c
+++ b/drivers/s390/block/dasd_alias.c
@@ -256,7 +256,6 @@ void dasd_alias_disconnect_device_from_lcu(struct dasd_device *device)
 		return;
 	device->discipline->get_uid(device, &uid);
 	spin_lock_irqsave(&lcu->lock, flags);
-	list_del_init(&device->alias_list);
 	/* make sure that the workers don't use this device */
 	if (device == lcu->suc_data.device) {
 		spin_unlock_irqrestore(&lcu->lock, flags);
@@ -283,6 +282,7 @@ void dasd_alias_disconnect_device_from_lcu(struct dasd_device *device)
 
 	spin_lock_irqsave(&aliastree.lock, flags);
 	spin_lock(&lcu->lock);
+	list_del_init(&device->alias_list);
 	if (list_empty(&lcu->grouplist) &&
 	    list_empty(&lcu->active_devices) &&
 	    list_empty(&lcu->inactive_devices)) {
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 6+ messages in thread

* Re: [PATCH 0/4] Fix DASD control unit updates
  2020-12-17 15:59 [PATCH 0/4] Fix DASD control unit updates Stefan Haberland
                   ` (3 preceding siblings ...)
  2020-12-17 15:59 ` [PATCH 4/4] s390/dasd: fix list corruption of lcu list Stefan Haberland
@ 2020-12-17 16:36 ` Jens Axboe
  4 siblings, 0 replies; 6+ messages in thread
From: Jens Axboe @ 2020-12-17 16:36 UTC (permalink / raw)
  To: Stefan Haberland
  Cc: linux-block, Jan Hoeppner, linux-s390, Heiko Carstens,
	Vasily Gorbik, Christian Borntraeger

On 12/17/20 8:59 AM, Stefan Haberland wrote:
> Hi Jens,
> 
> please apply the following patches that address problems during DASD
> control unit updates.

Applied, thanks.

-- 
Jens Axboe


^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2020-12-17 16:36 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-12-17 15:59 [PATCH 0/4] Fix DASD control unit updates Stefan Haberland
2020-12-17 15:59 ` [PATCH 1/4] s390/dasd: fix hanging device offline processing Stefan Haberland
2020-12-17 15:59 ` [PATCH 2/4] s390/dasd: prevent inconsistent LCU device data Stefan Haberland
2020-12-17 15:59 ` [PATCH 3/4] s390/dasd: fix list corruption of pavgroup group list Stefan Haberland
2020-12-17 15:59 ` [PATCH 4/4] s390/dasd: fix list corruption of lcu list Stefan Haberland
2020-12-17 16:36 ` [PATCH 0/4] Fix DASD control unit updates Jens Axboe

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.