All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 0/7] improve AP queue reset processing
@ 2022-12-13 15:44 Tony Krowiak
  2022-12-13 15:44 ` [PATCH 1/7] s390/vfio-ap: verify reset complete in separate function Tony Krowiak
                   ` (7 more replies)
  0 siblings, 8 replies; 25+ messages in thread
From: Tony Krowiak @ 2022-12-13 15:44 UTC (permalink / raw)
  To: linux-s390, linux-kernel, kvm
  Cc: jjherne, freude, borntraeger, cohuck, mjrosato, pasic,
	alex.williamson, kwankhede, fiuczy

This series introduces several improvements to the function that performs
AP queue resets:

* Breaks up reset processing into multiple smaller, more concise functions.

* Use TAPQ to verify completion of a reset in progress rather than mulitple
  invocations of ZAPQ.

* Check TAPQ response codes when verifying successful completion of ZAPQ.

* Fix erroneous handling of some error response codes.

* Increase the maximum amount of time to wait for successful completion of
  ZAPQ.

* Always clean up IRQ resources when the ZAPQ response code indicates an
  error.

* Consider reset complete when ZAPQ response code indicates the adapter to
  which a queue is connected is deconfigured. All queues associated with an
  adapter are reset when it is deconfigured. 

Tony Krowiak (7):
  s390/vfio-ap: verify reset complete in separate function
  s390/vfio_ap: check TAPQ response code when waiting for queue reset
  s390/vfio_ap: use TAPQ to verify reset in progress completes
  s390/vfio_ap: verify ZAPQ completion after return of response code
    zero
  s390/vfio_ap: fix handling of error response codes
  s390/vfio_ap: increase max wait time for reset verification
  s390/vfio_ap: always clean up IRQ resources

 drivers/s390/crypto/vfio_ap_ops.c | 106 ++++++++++++++++++++----------
 1 file changed, 73 insertions(+), 33 deletions(-)

-- 
2.31.1


^ permalink raw reply	[flat|nested] 25+ messages in thread

* [PATCH 1/7] s390/vfio-ap: verify reset complete in separate function
  2022-12-13 15:44 [PATCH 0/7] improve AP queue reset processing Tony Krowiak
@ 2022-12-13 15:44 ` Tony Krowiak
  2022-12-15  8:24   ` Harald Freudenberger
  2022-12-13 15:44 ` [PATCH 2/7] s390/vfio_ap: check TAPQ response code when waiting for queue reset Tony Krowiak
                   ` (6 subsequent siblings)
  7 siblings, 1 reply; 25+ messages in thread
From: Tony Krowiak @ 2022-12-13 15:44 UTC (permalink / raw)
  To: linux-s390, linux-kernel, kvm
  Cc: jjherne, freude, borntraeger, cohuck, mjrosato, pasic,
	alex.williamson, kwankhede, fiuczy

The vfio_ap_mdev_reset_queue() function contains a loop to verify that the
reset successfully completes within 40ms. This patch moves that loop into
a separate function.

Signed-off-by: Tony Krowiak <akrowiak@linux.ibm.com>
---
 drivers/s390/crypto/vfio_ap_ops.c | 30 +++++++++++++++++++++---------
 1 file changed, 21 insertions(+), 9 deletions(-)

diff --git a/drivers/s390/crypto/vfio_ap_ops.c b/drivers/s390/crypto/vfio_ap_ops.c
index 0b4cc8c597ae..83ff94a38102 100644
--- a/drivers/s390/crypto/vfio_ap_ops.c
+++ b/drivers/s390/crypto/vfio_ap_ops.c
@@ -1587,12 +1587,30 @@ static struct vfio_ap_queue *vfio_ap_find_queue(int apqn)
 	return q;
 }
 
+static int apq_reset_check(struct vfio_ap_queue *q)
+{
+	int iters = 2;
+	struct ap_queue_status status;
+
+	while (iters--) {
+		msleep(20);
+		status = ap_tapq(q->apqn, NULL);
+		if (status.queue_empty && !status.irq_enabled)
+			return 0;
+	}
+	WARN_ONCE(iters <= 0,
+		  "timeout verifying reset of queue %02x.%04x (%u, %u, %u)",
+		  AP_QID_CARD(q->apqn), AP_QID_QUEUE(q->apqn),
+		  status.queue_empty, status.irq_enabled, status.response_code);
+
+	return -EBUSY;
+}
+
 static int vfio_ap_mdev_reset_queue(struct vfio_ap_queue *q,
 				    unsigned int retry)
 {
 	struct ap_queue_status status;
 	int ret;
-	int retry2 = 2;
 
 	if (!q)
 		return 0;
@@ -1629,14 +1647,8 @@ static int vfio_ap_mdev_reset_queue(struct vfio_ap_queue *q,
 	}
 
 	/* wait for the reset to take effect */
-	while (retry2--) {
-		if (status.queue_empty && !status.irq_enabled)
-			break;
-		msleep(20);
-		status = ap_tapq(q->apqn, NULL);
-	}
-	WARN_ONCE(retry2 <= 0, "unable to verify reset of queue %02x.%04x",
-		  AP_QID_CARD(q->apqn), AP_QID_QUEUE(q->apqn));
+	if (!status.queue_empty && status.irq_enabled)
+		ret = apq_reset_check(q);
 
 free_resources:
 	vfio_ap_free_aqic_resources(q);
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [PATCH 2/7] s390/vfio_ap: check TAPQ response code when waiting for queue reset
  2022-12-13 15:44 [PATCH 0/7] improve AP queue reset processing Tony Krowiak
  2022-12-13 15:44 ` [PATCH 1/7] s390/vfio-ap: verify reset complete in separate function Tony Krowiak
@ 2022-12-13 15:44 ` Tony Krowiak
  2022-12-15 10:48   ` Harald Freudenberger
  2022-12-13 15:44 ` [PATCH 3/7] s390/vfio_ap: use TAPQ to verify reset in progress completes Tony Krowiak
                   ` (5 subsequent siblings)
  7 siblings, 1 reply; 25+ messages in thread
From: Tony Krowiak @ 2022-12-13 15:44 UTC (permalink / raw)
  To: linux-s390, linux-kernel, kvm
  Cc: jjherne, freude, borntraeger, cohuck, mjrosato, pasic,
	alex.williamson, kwankhede, fiuczy

The vfio_ap_mdev_reset_queue() function does not check the status
response code returned form the PQAP(TAPQ) function when verifying the
queue's status; consequently, there is no way of knowing whether
verification failed because the wait time was exceeded, or because the
PQAP(TAPQ) failed.

This patch adds a function to check the status response code from the
PQAP(TAPQ) instruction and logs an appropriate message if it fails.

Signed-off-by: Tony Krowiak <akrowiak@linux.ibm.com>
---
 drivers/s390/crypto/vfio_ap_ops.c | 36 ++++++++++++++++++++++++++-----
 1 file changed, 31 insertions(+), 5 deletions(-)

diff --git a/drivers/s390/crypto/vfio_ap_ops.c b/drivers/s390/crypto/vfio_ap_ops.c
index 83ff94a38102..a5530a46cf31 100644
--- a/drivers/s390/crypto/vfio_ap_ops.c
+++ b/drivers/s390/crypto/vfio_ap_ops.c
@@ -1587,23 +1587,49 @@ static struct vfio_ap_queue *vfio_ap_find_queue(int apqn)
 	return q;
 }
 
+static int apq_status_check(int apqn, struct ap_queue_status *status)
+{
+	switch (status->response_code) {
+	case AP_RESPONSE_NORMAL:
+	case AP_RESPONSE_RESET_IN_PROGRESS:
+		if (status->queue_empty && !status->irq_enabled)
+			return 0;
+		return -EBUSY;
+	case AP_RESPONSE_DECONFIGURED:
+		/*
+		 * If the AP queue is deconfigured, any subsequent AP command
+		 * targeting the queue will fail with the same response code. On the
+		 * other hand, when an AP adapter is deconfigured, the associated
+		 * queues are reset, so let's return a value indicating the reset
+		 * for which we're waiting completed successfully.
+		 */
+		return 0;
+	default:
+		WARN(true,
+		     "failed to verify reset of queue %02x.%04x: TAPQ rc=%u\n",
+		     AP_QID_CARD(apqn), AP_QID_QUEUE(apqn),
+		     status->response_code);
+		return -EIO;
+	}
+}
+
 static int apq_reset_check(struct vfio_ap_queue *q)
 {
-	int iters = 2;
+	int iters = 2, ret;
 	struct ap_queue_status status;
 
 	while (iters--) {
 		msleep(20);
 		status = ap_tapq(q->apqn, NULL);
-		if (status.queue_empty && !status.irq_enabled)
-			return 0;
+		ret = apq_status_check(q->apqn, &status);
+		if (ret != -EBUSY)
+			return ret;
 	}
 	WARN_ONCE(iters <= 0,
 		  "timeout verifying reset of queue %02x.%04x (%u, %u, %u)",
 		  AP_QID_CARD(q->apqn), AP_QID_QUEUE(q->apqn),
 		  status.queue_empty, status.irq_enabled, status.response_code);
-
-	return -EBUSY;
+	return ret;
 }
 
 static int vfio_ap_mdev_reset_queue(struct vfio_ap_queue *q,
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [PATCH 3/7] s390/vfio_ap: use TAPQ to verify reset in progress completes
  2022-12-13 15:44 [PATCH 0/7] improve AP queue reset processing Tony Krowiak
  2022-12-13 15:44 ` [PATCH 1/7] s390/vfio-ap: verify reset complete in separate function Tony Krowiak
  2022-12-13 15:44 ` [PATCH 2/7] s390/vfio_ap: check TAPQ response code when waiting for queue reset Tony Krowiak
@ 2022-12-13 15:44 ` Tony Krowiak
  2022-12-15 10:51   ` Harald Freudenberger
  2022-12-13 15:44 ` [PATCH 4/7] s390/vfio_ap: verify ZAPQ completion after return of response code zero Tony Krowiak
                   ` (4 subsequent siblings)
  7 siblings, 1 reply; 25+ messages in thread
From: Tony Krowiak @ 2022-12-13 15:44 UTC (permalink / raw)
  To: linux-s390, linux-kernel, kvm
  Cc: jjherne, freude, borntraeger, cohuck, mjrosato, pasic,
	alex.williamson, kwankhede, fiuczy

To eliminate the repeated calls to the PQAP(ZAPQ) function to verify that
a reset in progress completed successfully and ensure that error response
codes get appropriately logged, let's call the apq_reset_check() function
when the ZAPQ response code indicates that a reset that is already in
progress.

Signed-off-by: Tony Krowiak <akrowiak@linux.ibm.com>
---
 drivers/s390/crypto/vfio_ap_ops.c | 24 +++++++++++++-----------
 1 file changed, 13 insertions(+), 11 deletions(-)

diff --git a/drivers/s390/crypto/vfio_ap_ops.c b/drivers/s390/crypto/vfio_ap_ops.c
index a5530a46cf31..5bf2d93ae8af 100644
--- a/drivers/s390/crypto/vfio_ap_ops.c
+++ b/drivers/s390/crypto/vfio_ap_ops.c
@@ -33,7 +33,7 @@
 static int vfio_ap_mdev_reset_queues(struct ap_queue_table *qtable);
 static struct vfio_ap_queue *vfio_ap_find_queue(int apqn);
 static const struct vfio_device_ops vfio_ap_matrix_dev_ops;
-static int vfio_ap_mdev_reset_queue(struct vfio_ap_queue *q, unsigned int retry);
+static int vfio_ap_mdev_reset_queue(struct vfio_ap_queue *q);
 
 /**
  * get_update_locks_for_kvm: Acquire the locks required to dynamically update a
@@ -1632,8 +1632,7 @@ static int apq_reset_check(struct vfio_ap_queue *q)
 	return ret;
 }
 
-static int vfio_ap_mdev_reset_queue(struct vfio_ap_queue *q,
-				    unsigned int retry)
+static int vfio_ap_mdev_reset_queue(struct vfio_ap_queue *q)
 {
 	struct ap_queue_status status;
 	int ret;
@@ -1648,12 +1647,15 @@ static int vfio_ap_mdev_reset_queue(struct vfio_ap_queue *q,
 		ret = 0;
 		break;
 	case AP_RESPONSE_RESET_IN_PROGRESS:
-		if (retry--) {
-			msleep(20);
-			goto retry_zapq;
-		}
-		ret = -EBUSY;
-		break;
+		/*
+		 * There is a reset issued by another process in progress. Let's wait
+		 * for that to complete. Since we have no idea whether it was a RAPQ or
+		 * ZAPQ, then if it completes successfully, let's issue the ZAPQ.
+		 */
+		ret = apq_reset_check(q);
+		if (ret)
+			break;
+		goto retry_zapq;
 	case AP_RESPONSE_Q_NOT_AVAIL:
 	case AP_RESPONSE_DECONFIGURED:
 	case AP_RESPONSE_CHECKSTOPPED:
@@ -1688,7 +1690,7 @@ static int vfio_ap_mdev_reset_queues(struct ap_queue_table *qtable)
 	struct vfio_ap_queue *q;
 
 	hash_for_each(qtable->queues, loop_cursor, q, mdev_qnode) {
-		ret = vfio_ap_mdev_reset_queue(q, 1);
+		ret = vfio_ap_mdev_reset_queue(q);
 		/*
 		 * Regardless whether a queue turns out to be busy, or
 		 * is not operational, we need to continue resetting
@@ -1931,7 +1933,7 @@ void vfio_ap_mdev_remove_queue(struct ap_device *apdev)
 		}
 	}
 
-	vfio_ap_mdev_reset_queue(q, 1);
+	vfio_ap_mdev_reset_queue(q);
 	dev_set_drvdata(&apdev->device, NULL);
 	kfree(q);
 	release_update_locks_for_mdev(matrix_mdev);
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [PATCH 4/7] s390/vfio_ap: verify ZAPQ completion after return of response code zero
  2022-12-13 15:44 [PATCH 0/7] improve AP queue reset processing Tony Krowiak
                   ` (2 preceding siblings ...)
  2022-12-13 15:44 ` [PATCH 3/7] s390/vfio_ap: use TAPQ to verify reset in progress completes Tony Krowiak
@ 2022-12-13 15:44 ` Tony Krowiak
  2022-12-15 10:54   ` Harald Freudenberger
  2022-12-13 15:44 ` [PATCH 5/7] s390/vfio_ap: fix handling of error response codes Tony Krowiak
                   ` (3 subsequent siblings)
  7 siblings, 1 reply; 25+ messages in thread
From: Tony Krowiak @ 2022-12-13 15:44 UTC (permalink / raw)
  To: linux-s390, linux-kernel, kvm
  Cc: jjherne, freude, borntraeger, cohuck, mjrosato, pasic,
	alex.williamson, kwankhede, fiuczy

Verification that the asynchronous ZAPQ function has completed only needs
to be done when the response code indicates the function was successfully
initiated; so, let's call the apq_reset_check function immediately after
the response code zero is returned from the ZAPQ.

Signed-off-by: Tony Krowiak <akrowiak@linux.ibm.com>
---
 drivers/s390/crypto/vfio_ap_ops.c | 7 +++----
 1 file changed, 3 insertions(+), 4 deletions(-)

diff --git a/drivers/s390/crypto/vfio_ap_ops.c b/drivers/s390/crypto/vfio_ap_ops.c
index 5bf2d93ae8af..c0cf5050be59 100644
--- a/drivers/s390/crypto/vfio_ap_ops.c
+++ b/drivers/s390/crypto/vfio_ap_ops.c
@@ -1645,6 +1645,9 @@ static int vfio_ap_mdev_reset_queue(struct vfio_ap_queue *q)
 	switch (status.response_code) {
 	case AP_RESPONSE_NORMAL:
 		ret = 0;
+		/* if the reset has not completed, wait for it to take effect */
+		if (!status.queue_empty || status.irq_enabled)
+			ret = apq_reset_check(q);
 		break;
 	case AP_RESPONSE_RESET_IN_PROGRESS:
 		/*
@@ -1674,10 +1677,6 @@ static int vfio_ap_mdev_reset_queue(struct vfio_ap_queue *q)
 		return -EIO;
 	}
 
-	/* wait for the reset to take effect */
-	if (!status.queue_empty && status.irq_enabled)
-		ret = apq_reset_check(q);
-
 free_resources:
 	vfio_ap_free_aqic_resources(q);
 
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [PATCH 5/7] s390/vfio_ap: fix handling of error response codes
  2022-12-13 15:44 [PATCH 0/7] improve AP queue reset processing Tony Krowiak
                   ` (3 preceding siblings ...)
  2022-12-13 15:44 ` [PATCH 4/7] s390/vfio_ap: verify ZAPQ completion after return of response code zero Tony Krowiak
@ 2022-12-13 15:44 ` Tony Krowiak
  2022-12-15 10:56   ` Harald Freudenberger
  2022-12-13 15:44 ` [PATCH 6/7] s390/vfio_ap: increase max wait time for reset verification Tony Krowiak
                   ` (2 subsequent siblings)
  7 siblings, 1 reply; 25+ messages in thread
From: Tony Krowiak @ 2022-12-13 15:44 UTC (permalink / raw)
  To: linux-s390, linux-kernel, kvm
  Cc: jjherne, freude, borntraeger, cohuck, mjrosato, pasic,
	alex.williamson, kwankhede, fiuczy

Some response codes returned from the queue reset function are not being
handled correctly; this patch fixes them:

1. Response code 3, AP queue deconfigured: Deconfiguring an AP adapter
   resets all of its queues, so this is handled by indicating the reset
   verification completed successfully.

2. For all response codes other than 0 (normal reset completion), 2
   (queue reset in progress) and 3 (AP deconfigured), the -EIO error will
   be returned from the vfio_ap_mdev_reset_queue() function. In all cases,
   all fields of the status word other than the response code will be
   set to zero, so it makes no sense to check status bits.

Signed-off-by: Tony Krowiak <akrowiak@linux.ibm.com>
---
 drivers/s390/crypto/vfio_ap_ops.c | 17 +++++++----------
 1 file changed, 7 insertions(+), 10 deletions(-)

diff --git a/drivers/s390/crypto/vfio_ap_ops.c b/drivers/s390/crypto/vfio_ap_ops.c
index c0cf5050be59..dbf681715a6d 100644
--- a/drivers/s390/crypto/vfio_ap_ops.c
+++ b/drivers/s390/crypto/vfio_ap_ops.c
@@ -1659,17 +1659,15 @@ static int vfio_ap_mdev_reset_queue(struct vfio_ap_queue *q)
 		if (ret)
 			break;
 		goto retry_zapq;
-	case AP_RESPONSE_Q_NOT_AVAIL:
 	case AP_RESPONSE_DECONFIGURED:
-	case AP_RESPONSE_CHECKSTOPPED:
-		WARN_ONCE(status.irq_enabled,
-			  "PQAP/ZAPQ for %02x.%04x failed with rc=%u while IRQ enabled",
-			  AP_QID_CARD(q->apqn), AP_QID_QUEUE(q->apqn),
-			  status.response_code);
-		ret = -EBUSY;
-		goto free_resources;
+		/*
+		 * When an AP adapter is deconfigured, the associated
+		 * queues are reset, so let's return a value indicating the reset
+		 * completed successfully.
+		 */
+		ret = 0;
+		break;
 	default:
-		/* things are really broken, give up */
 		WARN(true,
 		     "PQAP/ZAPQ for %02x.%04x failed with invalid rc=%u\n",
 		     AP_QID_CARD(q->apqn), AP_QID_QUEUE(q->apqn),
@@ -1677,7 +1675,6 @@ static int vfio_ap_mdev_reset_queue(struct vfio_ap_queue *q)
 		return -EIO;
 	}
 
-free_resources:
 	vfio_ap_free_aqic_resources(q);
 
 	return ret;
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [PATCH 6/7] s390/vfio_ap: increase max wait time for reset verification
  2022-12-13 15:44 [PATCH 0/7] improve AP queue reset processing Tony Krowiak
                   ` (4 preceding siblings ...)
  2022-12-13 15:44 ` [PATCH 5/7] s390/vfio_ap: fix handling of error response codes Tony Krowiak
@ 2022-12-13 15:44 ` Tony Krowiak
  2022-12-15 10:58   ` Harald Freudenberger
  2022-12-13 15:44 ` [PATCH 7/7] s390/vfio_ap: always clean up IRQ resources Tony Krowiak
  2022-12-14 15:16 ` [PATCH 0/7] improve AP queue reset processing Jason J. Herne
  7 siblings, 1 reply; 25+ messages in thread
From: Tony Krowiak @ 2022-12-13 15:44 UTC (permalink / raw)
  To: linux-s390, linux-kernel, kvm
  Cc: jjherne, freude, borntraeger, cohuck, mjrosato, pasic,
	alex.williamson, kwankhede, fiuczy

Increase the maximum time to wait for verification of a queue reset
operation to 200ms.

Signed-off-by: Tony Krowiak <akrowiak@linux.ibm.com>
---
 drivers/s390/crypto/vfio_ap_ops.c | 10 +++++++---
 1 file changed, 7 insertions(+), 3 deletions(-)

diff --git a/drivers/s390/crypto/vfio_ap_ops.c b/drivers/s390/crypto/vfio_ap_ops.c
index dbf681715a6d..e80c5a6b91be 100644
--- a/drivers/s390/crypto/vfio_ap_ops.c
+++ b/drivers/s390/crypto/vfio_ap_ops.c
@@ -30,6 +30,9 @@
 #define AP_QUEUE_UNASSIGNED "unassigned"
 #define AP_QUEUE_IN_USE "in use"
 
+#define MAX_RESET_CHECK_WAIT	200	/* Sleep max 200ms for reset check	*/
+#define AP_RESET_INTERVAL		20	/* Reset sleep interval (20ms)		*/
+
 static int vfio_ap_mdev_reset_queues(struct ap_queue_table *qtable);
 static struct vfio_ap_queue *vfio_ap_find_queue(int apqn);
 static const struct vfio_device_ops vfio_ap_matrix_dev_ops;
@@ -1615,11 +1618,12 @@ static int apq_status_check(int apqn, struct ap_queue_status *status)
 
 static int apq_reset_check(struct vfio_ap_queue *q)
 {
-	int iters = 2, ret;
+	int ret;
+	int iters = MAX_RESET_CHECK_WAIT / AP_RESET_INTERVAL;
 	struct ap_queue_status status;
 
-	while (iters--) {
-		msleep(20);
+	for (; iters > 0; iters--) {
+		msleep(AP_RESET_INTERVAL);
 		status = ap_tapq(q->apqn, NULL);
 		ret = apq_status_check(q->apqn, &status);
 		if (ret != -EBUSY)
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [PATCH 7/7] s390/vfio_ap: always clean up IRQ resources
  2022-12-13 15:44 [PATCH 0/7] improve AP queue reset processing Tony Krowiak
                   ` (5 preceding siblings ...)
  2022-12-13 15:44 ` [PATCH 6/7] s390/vfio_ap: increase max wait time for reset verification Tony Krowiak
@ 2022-12-13 15:44 ` Tony Krowiak
  2022-12-15 10:58   ` Harald Freudenberger
  2022-12-19 14:10   ` Halil Pasic
  2022-12-14 15:16 ` [PATCH 0/7] improve AP queue reset processing Jason J. Herne
  7 siblings, 2 replies; 25+ messages in thread
From: Tony Krowiak @ 2022-12-13 15:44 UTC (permalink / raw)
  To: linux-s390, linux-kernel, kvm
  Cc: jjherne, freude, borntraeger, cohuck, mjrosato, pasic,
	alex.williamson, kwankhede, fiuczy

Clean up IRQ resources even when a PQAP(ZAPQ) function fails with an error
not handled by a case statement.

Signed-off-by: Tony Krowiak <akrowiak@linux.ibm.com>
---
 drivers/s390/crypto/vfio_ap_ops.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/s390/crypto/vfio_ap_ops.c b/drivers/s390/crypto/vfio_ap_ops.c
index e80c5a6b91be..2dd8db9ddb39 100644
--- a/drivers/s390/crypto/vfio_ap_ops.c
+++ b/drivers/s390/crypto/vfio_ap_ops.c
@@ -1676,7 +1676,7 @@ static int vfio_ap_mdev_reset_queue(struct vfio_ap_queue *q)
 		     "PQAP/ZAPQ for %02x.%04x failed with invalid rc=%u\n",
 		     AP_QID_CARD(q->apqn), AP_QID_QUEUE(q->apqn),
 		     status.response_code);
-		return -EIO;
+		break;
 	}
 
 	vfio_ap_free_aqic_resources(q);
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 25+ messages in thread

* Re: [PATCH 0/7] improve AP queue reset processing
  2022-12-13 15:44 [PATCH 0/7] improve AP queue reset processing Tony Krowiak
                   ` (6 preceding siblings ...)
  2022-12-13 15:44 ` [PATCH 7/7] s390/vfio_ap: always clean up IRQ resources Tony Krowiak
@ 2022-12-14 15:16 ` Jason J. Herne
  2022-12-14 19:10   ` Anthony Krowiak
  7 siblings, 1 reply; 25+ messages in thread
From: Jason J. Herne @ 2022-12-14 15:16 UTC (permalink / raw)
  To: Tony Krowiak, linux-s390, linux-kernel, kvm
  Cc: freude, borntraeger, cohuck, mjrosato, pasic, alex.williamson,
	kwankhede, fiuczy


On 12/13/22 10:44 AM, Tony Krowiak wrote:
> This series introduces several improvements to the function that performs
> AP queue resets:
> 
> * Breaks up reset processing into multiple smaller, more concise functions.
> 
> * Use TAPQ to verify completion of a reset in progress rather than mulitple
>    invocations of ZAPQ.
> 
> * Check TAPQ response codes when verifying successful completion of ZAPQ.
> 
> * Fix erroneous handling of some error response codes.
> 
> * Increase the maximum amount of time to wait for successful completion of
>    ZAPQ.
> 
> * Always clean up IRQ resources when the ZAPQ response code indicates an
>    error.
> 
> * Consider reset complete when ZAPQ response code indicates the adapter to
>    which a queue is connected is deconfigured. All queues associated with an
>    adapter are reset when it is deconfigured.
> 
> Tony Krowiak (7):
>    s390/vfio-ap: verify reset complete in separate function
>    s390/vfio_ap: check TAPQ response code when waiting for queue reset
>    s390/vfio_ap: use TAPQ to verify reset in progress completes
>    s390/vfio_ap: verify ZAPQ completion after return of response code
>      zero
>    s390/vfio_ap: fix handling of error response codes
>    s390/vfio_ap: increase max wait time for reset verification
>    s390/vfio_ap: always clean up IRQ resources
> 
>   drivers/s390/crypto/vfio_ap_ops.c | 106 ++++++++++++++++++++----------
>   1 file changed, 73 insertions(+), 33 deletions(-)


This series largely matches what I've already reviewed. I like the way 
you broke this up, it does a better job telling the story.

Here's my R-b for the entire series.
Reviewed-by: Jason J. Herne <jjherne@linux.ibm.com>

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 0/7] improve AP queue reset processing
  2022-12-14 15:16 ` [PATCH 0/7] improve AP queue reset processing Jason J. Herne
@ 2022-12-14 19:10   ` Anthony Krowiak
  0 siblings, 0 replies; 25+ messages in thread
From: Anthony Krowiak @ 2022-12-14 19:10 UTC (permalink / raw)
  To: Jason J. Herne, linux-s390, linux-kernel, kvm
  Cc: freude, borntraeger, cohuck, mjrosato, pasic, alex.williamson,
	kwankhede, fiuczy


On 12/14/22 10:16 AM, Jason J. Herne wrote:
>
> On 12/13/22 10:44 AM, Tony Krowiak wrote:
>> This series introduces several improvements to the function that 
>> performs
>> AP queue resets:
>>
>> * Breaks up reset processing into multiple smaller, more concise 
>> functions.
>>
>> * Use TAPQ to verify completion of a reset in progress rather than 
>> mulitple
>>    invocations of ZAPQ.
>>
>> * Check TAPQ response codes when verifying successful completion of 
>> ZAPQ.
>>
>> * Fix erroneous handling of some error response codes.
>>
>> * Increase the maximum amount of time to wait for successful 
>> completion of
>>    ZAPQ.
>>
>> * Always clean up IRQ resources when the ZAPQ response code indicates an
>>    error.
>>
>> * Consider reset complete when ZAPQ response code indicates the 
>> adapter to
>>    which a queue is connected is deconfigured. All queues associated 
>> with an
>>    adapter are reset when it is deconfigured.
>>
>> Tony Krowiak (7):
>>    s390/vfio-ap: verify reset complete in separate function
>>    s390/vfio_ap: check TAPQ response code when waiting for queue reset
>>    s390/vfio_ap: use TAPQ to verify reset in progress completes
>>    s390/vfio_ap: verify ZAPQ completion after return of response code
>>      zero
>>    s390/vfio_ap: fix handling of error response codes
>>    s390/vfio_ap: increase max wait time for reset verification
>>    s390/vfio_ap: always clean up IRQ resources
>>
>>   drivers/s390/crypto/vfio_ap_ops.c | 106 ++++++++++++++++++++----------
>>   1 file changed, 73 insertions(+), 33 deletions(-)
>
>
> This series largely matches what I've already reviewed. I like the way 
> you broke this up, it does a better job telling the story.
>
> Here's my R-b for the entire series.
> Reviewed-by: Jason J. Herne <jjherne@linux.ibm.com>


Thanks for the review Jason.



^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 1/7] s390/vfio-ap: verify reset complete in separate function
  2022-12-13 15:44 ` [PATCH 1/7] s390/vfio-ap: verify reset complete in separate function Tony Krowiak
@ 2022-12-15  8:24   ` Harald Freudenberger
  0 siblings, 0 replies; 25+ messages in thread
From: Harald Freudenberger @ 2022-12-15  8:24 UTC (permalink / raw)
  To: Tony Krowiak
  Cc: linux-s390, linux-kernel, kvm, jjherne, borntraeger, cohuck,
	mjrosato, pasic, alex.williamson, kwankhede, fiuczy

On 2022-12-13 16:44, Tony Krowiak wrote:
> The vfio_ap_mdev_reset_queue() function contains a loop to verify that 
> the
> reset successfully completes within 40ms. This patch moves that loop 
> into
> a separate function.
> 
> Signed-off-by: Tony Krowiak <akrowiak@linux.ibm.com>
> ---
>  drivers/s390/crypto/vfio_ap_ops.c | 30 +++++++++++++++++++++---------
>  1 file changed, 21 insertions(+), 9 deletions(-)
> 
> diff --git a/drivers/s390/crypto/vfio_ap_ops.c
> b/drivers/s390/crypto/vfio_ap_ops.c
> index 0b4cc8c597ae..83ff94a38102 100644
> --- a/drivers/s390/crypto/vfio_ap_ops.c
> +++ b/drivers/s390/crypto/vfio_ap_ops.c
> @@ -1587,12 +1587,30 @@ static struct vfio_ap_queue
> *vfio_ap_find_queue(int apqn)
>  	return q;
>  }
> 
> +static int apq_reset_check(struct vfio_ap_queue *q)
> +{
> +	int iters = 2;
> +	struct ap_queue_status status;
> +
> +	while (iters--) {
> +		msleep(20);
> +		status = ap_tapq(q->apqn, NULL);
> +		if (status.queue_empty && !status.irq_enabled)
> +			return 0;
> +	}
> +	WARN_ONCE(iters <= 0,
> +		  "timeout verifying reset of queue %02x.%04x (%u, %u, %u)",
> +		  AP_QID_CARD(q->apqn), AP_QID_QUEUE(q->apqn),
> +		  status.queue_empty, status.irq_enabled, status.response_code);
> +
> +	return -EBUSY;
> +}
> +
>  static int vfio_ap_mdev_reset_queue(struct vfio_ap_queue *q,
>  				    unsigned int retry)
>  {
>  	struct ap_queue_status status;
>  	int ret;
> -	int retry2 = 2;
> 
>  	if (!q)
>  		return 0;
> @@ -1629,14 +1647,8 @@ static int vfio_ap_mdev_reset_queue(struct
> vfio_ap_queue *q,
>  	}
> 
>  	/* wait for the reset to take effect */
> -	while (retry2--) {
> -		if (status.queue_empty && !status.irq_enabled)
> -			break;
> -		msleep(20);
> -		status = ap_tapq(q->apqn, NULL);
> -	}
> -	WARN_ONCE(retry2 <= 0, "unable to verify reset of queue %02x.%04x",
> -		  AP_QID_CARD(q->apqn), AP_QID_QUEUE(q->apqn));
> +	if (!status.queue_empty && status.irq_enabled)

Hm, you check for
   a) status.queue_empty and
   b) !status.irq_enabled
   which is (status.queue_empty && !status.irq_enabled)
   and now let's negate this to:
   !(status.queue_empty && !status.irq_enabled)
   with de-morgan this gives:
   (!status.queue_empty || status.irq_enabled)

> +		ret = apq_reset_check(q);
> 
>  free_resources:
>  	vfio_ap_free_aqic_resources(q);

Reviewed-by: Harald Freudenberger <freude@linux.ibm.com>

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 2/7] s390/vfio_ap: check TAPQ response code when waiting for queue reset
  2022-12-13 15:44 ` [PATCH 2/7] s390/vfio_ap: check TAPQ response code when waiting for queue reset Tony Krowiak
@ 2022-12-15 10:48   ` Harald Freudenberger
  0 siblings, 0 replies; 25+ messages in thread
From: Harald Freudenberger @ 2022-12-15 10:48 UTC (permalink / raw)
  To: Tony Krowiak
  Cc: linux-s390, linux-kernel, kvm, jjherne, borntraeger, cohuck,
	mjrosato, pasic, alex.williamson, kwankhede, fiuczy

On 2022-12-13 16:44, Tony Krowiak wrote:
> The vfio_ap_mdev_reset_queue() function does not check the status
> response code returned form the PQAP(TAPQ) function when verifying the
> queue's status; consequently, there is no way of knowing whether
> verification failed because the wait time was exceeded, or because the
> PQAP(TAPQ) failed.
> 
> This patch adds a function to check the status response code from the
> PQAP(TAPQ) instruction and logs an appropriate message if it fails.
> 
> Signed-off-by: Tony Krowiak <akrowiak@linux.ibm.com>
> ---
>  drivers/s390/crypto/vfio_ap_ops.c | 36 ++++++++++++++++++++++++++-----
>  1 file changed, 31 insertions(+), 5 deletions(-)
> 
> diff --git a/drivers/s390/crypto/vfio_ap_ops.c
> b/drivers/s390/crypto/vfio_ap_ops.c
> index 83ff94a38102..a5530a46cf31 100644
> --- a/drivers/s390/crypto/vfio_ap_ops.c
> +++ b/drivers/s390/crypto/vfio_ap_ops.c
> @@ -1587,23 +1587,49 @@ static struct vfio_ap_queue
> *vfio_ap_find_queue(int apqn)
>  	return q;
>  }
> 
> +static int apq_status_check(int apqn, struct ap_queue_status *status)
> +{
> +	switch (status->response_code) {
> +	case AP_RESPONSE_NORMAL:
> +	case AP_RESPONSE_RESET_IN_PROGRESS:
> +		if (status->queue_empty && !status->irq_enabled)
> +			return 0;
> +		return -EBUSY;
> +	case AP_RESPONSE_DECONFIGURED:
> +		/*
> +		 * If the AP queue is deconfigured, any subsequent AP command
> +		 * targeting the queue will fail with the same response code. On the
> +		 * other hand, when an AP adapter is deconfigured, the associated
> +		 * queues are reset, so let's return a value indicating the reset
> +		 * for which we're waiting completed successfully.
> +		 */
> +		return 0;
> +	default:
> +		WARN(true,
> +		     "failed to verify reset of queue %02x.%04x: TAPQ rc=%u\n",
> +		     AP_QID_CARD(apqn), AP_QID_QUEUE(apqn),
> +		     status->response_code);
> +		return -EIO;
> +	}
> +}
> +
>  static int apq_reset_check(struct vfio_ap_queue *q)
>  {
> -	int iters = 2;
> +	int iters = 2, ret;
>  	struct ap_queue_status status;
> 
>  	while (iters--) {
>  		msleep(20);
>  		status = ap_tapq(q->apqn, NULL);
> -		if (status.queue_empty && !status.irq_enabled)
> -			return 0;
> +		ret = apq_status_check(q->apqn, &status);
> +		if (ret != -EBUSY)
> +			return ret;
>  	}
>  	WARN_ONCE(iters <= 0,
>  		  "timeout verifying reset of queue %02x.%04x (%u, %u, %u)",
>  		  AP_QID_CARD(q->apqn), AP_QID_QUEUE(q->apqn),
>  		  status.queue_empty, status.irq_enabled, status.response_code);
> -
> -	return -EBUSY;
> +	return ret;
>  }
> 
>  static int vfio_ap_mdev_reset_queue(struct vfio_ap_queue *q,

Reviewed-by: Harald Freudenberger <freude@linux.ibm.com>

Just one word here: this function is only called once and it is very 
very special
to just check the status after RAPQ/ZAPQ. I would merge this function 
into the
calling code or rename the function to reflect the special condition 
under which
it is called. However - this is not my code and I don't need to maintain 
it, so
maybe simple ignore my words.

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 3/7] s390/vfio_ap: use TAPQ to verify reset in progress completes
  2022-12-13 15:44 ` [PATCH 3/7] s390/vfio_ap: use TAPQ to verify reset in progress completes Tony Krowiak
@ 2022-12-15 10:51   ` Harald Freudenberger
  2022-12-20 14:22     ` Anthony Krowiak
  0 siblings, 1 reply; 25+ messages in thread
From: Harald Freudenberger @ 2022-12-15 10:51 UTC (permalink / raw)
  To: Tony Krowiak
  Cc: linux-s390, linux-kernel, kvm, jjherne, borntraeger, cohuck,
	mjrosato, pasic, alex.williamson, kwankhede, fiuczy

On 2022-12-13 16:44, Tony Krowiak wrote:
> To eliminate the repeated calls to the PQAP(ZAPQ) function to verify 
> that
> a reset in progress completed successfully and ensure that error 
> response
> codes get appropriately logged, let's call the apq_reset_check() 
> function
> when the ZAPQ response code indicates that a reset that is already in
> progress.
> 
> Signed-off-by: Tony Krowiak <akrowiak@linux.ibm.com>
> ---
>  drivers/s390/crypto/vfio_ap_ops.c | 24 +++++++++++++-----------
>  1 file changed, 13 insertions(+), 11 deletions(-)
> 
> diff --git a/drivers/s390/crypto/vfio_ap_ops.c
> b/drivers/s390/crypto/vfio_ap_ops.c
> index a5530a46cf31..5bf2d93ae8af 100644
> --- a/drivers/s390/crypto/vfio_ap_ops.c
> +++ b/drivers/s390/crypto/vfio_ap_ops.c
> @@ -33,7 +33,7 @@
>  static int vfio_ap_mdev_reset_queues(struct ap_queue_table *qtable);
>  static struct vfio_ap_queue *vfio_ap_find_queue(int apqn);
>  static const struct vfio_device_ops vfio_ap_matrix_dev_ops;
> -static int vfio_ap_mdev_reset_queue(struct vfio_ap_queue *q, unsigned
> int retry);
> +static int vfio_ap_mdev_reset_queue(struct vfio_ap_queue *q);
> 
>  /**
>   * get_update_locks_for_kvm: Acquire the locks required to dynamically 
> update a
> @@ -1632,8 +1632,7 @@ static int apq_reset_check(struct vfio_ap_queue 
> *q)
>  	return ret;
>  }
> 
> -static int vfio_ap_mdev_reset_queue(struct vfio_ap_queue *q,
> -				    unsigned int retry)
> +static int vfio_ap_mdev_reset_queue(struct vfio_ap_queue *q)
>  {
>  	struct ap_queue_status status;
>  	int ret;
> @@ -1648,12 +1647,15 @@ static int vfio_ap_mdev_reset_queue(struct
> vfio_ap_queue *q,
>  		ret = 0;
>  		break;
>  	case AP_RESPONSE_RESET_IN_PROGRESS:
> -		if (retry--) {
> -			msleep(20);
> -			goto retry_zapq;
> -		}
> -		ret = -EBUSY;
> -		break;
> +		/*
> +		 * There is a reset issued by another process in progress. Let's 
> wait
> +		 * for that to complete. Since we have no idea whether it was a RAPQ 
> or
> +		 * ZAPQ, then if it completes successfully, let's issue the ZAPQ.
> +		 */
> +		ret = apq_reset_check(q);
> +		if (ret)
> +			break;
> +		goto retry_zapq;
>  	case AP_RESPONSE_Q_NOT_AVAIL:
>  	case AP_RESPONSE_DECONFIGURED:
>  	case AP_RESPONSE_CHECKSTOPPED:
> @@ -1688,7 +1690,7 @@ static int vfio_ap_mdev_reset_queues(struct
> ap_queue_table *qtable)
>  	struct vfio_ap_queue *q;
> 
>  	hash_for_each(qtable->queues, loop_cursor, q, mdev_qnode) {
> -		ret = vfio_ap_mdev_reset_queue(q, 1);
> +		ret = vfio_ap_mdev_reset_queue(q);
>  		/*
>  		 * Regardless whether a queue turns out to be busy, or
>  		 * is not operational, we need to continue resetting
> @@ -1931,7 +1933,7 @@ void vfio_ap_mdev_remove_queue(struct ap_device 
> *apdev)
>  		}
>  	}
> 
> -	vfio_ap_mdev_reset_queue(q, 1);
> +	vfio_ap_mdev_reset_queue(q);
>  	dev_set_drvdata(&apdev->device, NULL);
>  	kfree(q);
>  	release_update_locks_for_mdev(matrix_mdev);

Reviewed-by: Harald Freudenberger <freude@linux.ibm.com>

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 4/7] s390/vfio_ap: verify ZAPQ completion after return of response code zero
  2022-12-13 15:44 ` [PATCH 4/7] s390/vfio_ap: verify ZAPQ completion after return of response code zero Tony Krowiak
@ 2022-12-15 10:54   ` Harald Freudenberger
  2022-12-20 14:24     ` Anthony Krowiak
  0 siblings, 1 reply; 25+ messages in thread
From: Harald Freudenberger @ 2022-12-15 10:54 UTC (permalink / raw)
  To: Tony Krowiak
  Cc: linux-s390, linux-kernel, kvm, jjherne, borntraeger, cohuck,
	mjrosato, pasic, alex.williamson, kwankhede, fiuczy

On 2022-12-13 16:44, Tony Krowiak wrote:
> Verification that the asynchronous ZAPQ function has completed only 
> needs
> to be done when the response code indicates the function was 
> successfully
> initiated; so, let's call the apq_reset_check function immediately 
> after
> the response code zero is returned from the ZAPQ.
> 
> Signed-off-by: Tony Krowiak <akrowiak@linux.ibm.com>
> ---
>  drivers/s390/crypto/vfio_ap_ops.c | 7 +++----
>  1 file changed, 3 insertions(+), 4 deletions(-)
> 
> diff --git a/drivers/s390/crypto/vfio_ap_ops.c
> b/drivers/s390/crypto/vfio_ap_ops.c
> index 5bf2d93ae8af..c0cf5050be59 100644
> --- a/drivers/s390/crypto/vfio_ap_ops.c
> +++ b/drivers/s390/crypto/vfio_ap_ops.c
> @@ -1645,6 +1645,9 @@ static int vfio_ap_mdev_reset_queue(struct
> vfio_ap_queue *q)
>  	switch (status.response_code) {
>  	case AP_RESPONSE_NORMAL:
>  		ret = 0;
> +		/* if the reset has not completed, wait for it to take effect */
> +		if (!status.queue_empty || status.irq_enabled)
> +			ret = apq_reset_check(q);
>  		break;
>  	case AP_RESPONSE_RESET_IN_PROGRESS:
>  		/*
> @@ -1674,10 +1677,6 @@ static int vfio_ap_mdev_reset_queue(struct
> vfio_ap_queue *q)
>  		return -EIO;
>  	}
> 
> -	/* wait for the reset to take effect */
> -	if (!status.queue_empty && status.irq_enabled)
> -		ret = apq_reset_check(q);
> -
>  free_resources:
>  	vfio_ap_free_aqic_resources(q);

Reviewed-by: Harald Freudenberger <freude@linux.ibm.com>

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 5/7] s390/vfio_ap: fix handling of error response codes
  2022-12-13 15:44 ` [PATCH 5/7] s390/vfio_ap: fix handling of error response codes Tony Krowiak
@ 2022-12-15 10:56   ` Harald Freudenberger
  2022-12-20 14:25     ` Anthony Krowiak
  0 siblings, 1 reply; 25+ messages in thread
From: Harald Freudenberger @ 2022-12-15 10:56 UTC (permalink / raw)
  To: Tony Krowiak
  Cc: linux-s390, linux-kernel, kvm, jjherne, borntraeger, cohuck,
	mjrosato, pasic, alex.williamson, kwankhede, fiuczy

On 2022-12-13 16:44, Tony Krowiak wrote:
> Some response codes returned from the queue reset function are not 
> being
> handled correctly; this patch fixes them:
> 
> 1. Response code 3, AP queue deconfigured: Deconfiguring an AP adapter
>    resets all of its queues, so this is handled by indicating the reset
>    verification completed successfully.
> 
> 2. For all response codes other than 0 (normal reset completion), 2
>    (queue reset in progress) and 3 (AP deconfigured), the -EIO error 
> will
>    be returned from the vfio_ap_mdev_reset_queue() function. In all 
> cases,
>    all fields of the status word other than the response code will be
>    set to zero, so it makes no sense to check status bits.
> 
> Signed-off-by: Tony Krowiak <akrowiak@linux.ibm.com>
> ---
>  drivers/s390/crypto/vfio_ap_ops.c | 17 +++++++----------
>  1 file changed, 7 insertions(+), 10 deletions(-)
> 
> diff --git a/drivers/s390/crypto/vfio_ap_ops.c
> b/drivers/s390/crypto/vfio_ap_ops.c
> index c0cf5050be59..dbf681715a6d 100644
> --- a/drivers/s390/crypto/vfio_ap_ops.c
> +++ b/drivers/s390/crypto/vfio_ap_ops.c
> @@ -1659,17 +1659,15 @@ static int vfio_ap_mdev_reset_queue(struct
> vfio_ap_queue *q)
>  		if (ret)
>  			break;
>  		goto retry_zapq;
> -	case AP_RESPONSE_Q_NOT_AVAIL:
>  	case AP_RESPONSE_DECONFIGURED:
> -	case AP_RESPONSE_CHECKSTOPPED:
> -		WARN_ONCE(status.irq_enabled,
> -			  "PQAP/ZAPQ for %02x.%04x failed with rc=%u while IRQ enabled",
> -			  AP_QID_CARD(q->apqn), AP_QID_QUEUE(q->apqn),
> -			  status.response_code);
> -		ret = -EBUSY;
> -		goto free_resources;
> +		/*
> +		 * When an AP adapter is deconfigured, the associated
> +		 * queues are reset, so let's return a value indicating the reset
> +		 * completed successfully.
> +		 */
> +		ret = 0;
> +		break;
>  	default:
> -		/* things are really broken, give up */
>  		WARN(true,
>  		     "PQAP/ZAPQ for %02x.%04x failed with invalid rc=%u\n",
>  		     AP_QID_CARD(q->apqn), AP_QID_QUEUE(q->apqn),
> @@ -1677,7 +1675,6 @@ static int vfio_ap_mdev_reset_queue(struct
> vfio_ap_queue *q)
>  		return -EIO;
>  	}
> 
> -free_resources:
>  	vfio_ap_free_aqic_resources(q);
> 
>  	return ret;

Reviewed-by: Harald Freudenberger <freude@linux.ibm.com>

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 6/7] s390/vfio_ap: increase max wait time for reset verification
  2022-12-13 15:44 ` [PATCH 6/7] s390/vfio_ap: increase max wait time for reset verification Tony Krowiak
@ 2022-12-15 10:58   ` Harald Freudenberger
  2022-12-20 14:27     ` Anthony Krowiak
  0 siblings, 1 reply; 25+ messages in thread
From: Harald Freudenberger @ 2022-12-15 10:58 UTC (permalink / raw)
  To: Tony Krowiak
  Cc: linux-s390, linux-kernel, kvm, jjherne, borntraeger, cohuck,
	mjrosato, pasic, alex.williamson, kwankhede, fiuczy

On 2022-12-13 16:44, Tony Krowiak wrote:
> Increase the maximum time to wait for verification of a queue reset
> operation to 200ms.
> 
> Signed-off-by: Tony Krowiak <akrowiak@linux.ibm.com>
> ---
>  drivers/s390/crypto/vfio_ap_ops.c | 10 +++++++---
>  1 file changed, 7 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/s390/crypto/vfio_ap_ops.c
> b/drivers/s390/crypto/vfio_ap_ops.c
> index dbf681715a6d..e80c5a6b91be 100644
> --- a/drivers/s390/crypto/vfio_ap_ops.c
> +++ b/drivers/s390/crypto/vfio_ap_ops.c
> @@ -30,6 +30,9 @@
>  #define AP_QUEUE_UNASSIGNED "unassigned"
>  #define AP_QUEUE_IN_USE "in use"
> 
> +#define MAX_RESET_CHECK_WAIT	200	/* Sleep max 200ms for reset check	*/
> +#define AP_RESET_INTERVAL		20	/* Reset sleep interval (20ms)		*/
> +
>  static int vfio_ap_mdev_reset_queues(struct ap_queue_table *qtable);
>  static struct vfio_ap_queue *vfio_ap_find_queue(int apqn);
>  static const struct vfio_device_ops vfio_ap_matrix_dev_ops;
> @@ -1615,11 +1618,12 @@ static int apq_status_check(int apqn, struct
> ap_queue_status *status)
> 
>  static int apq_reset_check(struct vfio_ap_queue *q)
>  {
> -	int iters = 2, ret;
> +	int ret;
> +	int iters = MAX_RESET_CHECK_WAIT / AP_RESET_INTERVAL;
>  	struct ap_queue_status status;
> 
> -	while (iters--) {
> -		msleep(20);
> +	for (; iters > 0; iters--) {
> +		msleep(AP_RESET_INTERVAL);
>  		status = ap_tapq(q->apqn, NULL);
>  		ret = apq_status_check(q->apqn, &status);
>  		if (ret != -EBUSY)

Reviewed-by: Harald Freudenberger <freude@linux.ibm.com>

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 7/7] s390/vfio_ap: always clean up IRQ resources
  2022-12-13 15:44 ` [PATCH 7/7] s390/vfio_ap: always clean up IRQ resources Tony Krowiak
@ 2022-12-15 10:58   ` Harald Freudenberger
  2022-12-19 14:10   ` Halil Pasic
  1 sibling, 0 replies; 25+ messages in thread
From: Harald Freudenberger @ 2022-12-15 10:58 UTC (permalink / raw)
  To: Tony Krowiak
  Cc: linux-s390, linux-kernel, kvm, jjherne, borntraeger, cohuck,
	mjrosato, pasic, alex.williamson, kwankhede, fiuczy

On 2022-12-13 16:44, Tony Krowiak wrote:
> Clean up IRQ resources even when a PQAP(ZAPQ) function fails with an 
> error
> not handled by a case statement.
> 
> Signed-off-by: Tony Krowiak <akrowiak@linux.ibm.com>
> ---
>  drivers/s390/crypto/vfio_ap_ops.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/s390/crypto/vfio_ap_ops.c
> b/drivers/s390/crypto/vfio_ap_ops.c
> index e80c5a6b91be..2dd8db9ddb39 100644
> --- a/drivers/s390/crypto/vfio_ap_ops.c
> +++ b/drivers/s390/crypto/vfio_ap_ops.c
> @@ -1676,7 +1676,7 @@ static int vfio_ap_mdev_reset_queue(struct
> vfio_ap_queue *q)
>  		     "PQAP/ZAPQ for %02x.%04x failed with invalid rc=%u\n",
>  		     AP_QID_CARD(q->apqn), AP_QID_QUEUE(q->apqn),
>  		     status.response_code);
> -		return -EIO;
> +		break;
>  	}
> 
>  	vfio_ap_free_aqic_resources(q);

Reviewed-by: Harald Freudenberger <freude@linux.ibm.com>

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 7/7] s390/vfio_ap: always clean up IRQ resources
  2022-12-13 15:44 ` [PATCH 7/7] s390/vfio_ap: always clean up IRQ resources Tony Krowiak
  2022-12-15 10:58   ` Harald Freudenberger
@ 2022-12-19 14:10   ` Halil Pasic
  2022-12-20 14:33     ` Anthony Krowiak
  1 sibling, 1 reply; 25+ messages in thread
From: Halil Pasic @ 2022-12-19 14:10 UTC (permalink / raw)
  To: Tony Krowiak
  Cc: linux-s390, linux-kernel, kvm, jjherne, freude, borntraeger,
	cohuck, mjrosato, alex.williamson, kwankhede, fiuczy,
	Halil Pasic

On Tue, 13 Dec 2022 10:44:37 -0500
Tony Krowiak <akrowiak@linux.ibm.com> wrote:

> Clean up IRQ resources even when a PQAP(ZAPQ) function fails with an error
> not handled by a case statement.

Why?

I'm afraid this is a step in the wrong direction...

> 
> Signed-off-by: Tony Krowiak <akrowiak@linux.ibm.com>
> ---
>  drivers/s390/crypto/vfio_ap_ops.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/s390/crypto/vfio_ap_ops.c b/drivers/s390/crypto/vfio_ap_ops.c
> index e80c5a6b91be..2dd8db9ddb39 100644
> --- a/drivers/s390/crypto/vfio_ap_ops.c
> +++ b/drivers/s390/crypto/vfio_ap_ops.c
> @@ -1676,7 +1676,7 @@ static int vfio_ap_mdev_reset_queue(struct vfio_ap_queue *q)
>  		     "PQAP/ZAPQ for %02x.%04x failed with invalid rc=%u\n",
>  		     AP_QID_CARD(q->apqn), AP_QID_QUEUE(q->apqn),
>  		     status.response_code);
> -		return -EIO;
> +		break;
>  	}
>  
>  	vfio_ap_free_aqic_resources(q);


^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 3/7] s390/vfio_ap: use TAPQ to verify reset in progress completes
  2022-12-15 10:51   ` Harald Freudenberger
@ 2022-12-20 14:22     ` Anthony Krowiak
  0 siblings, 0 replies; 25+ messages in thread
From: Anthony Krowiak @ 2022-12-20 14:22 UTC (permalink / raw)
  To: freude
  Cc: linux-s390, linux-kernel, kvm, jjherne, borntraeger, cohuck,
	mjrosato, pasic, alex.williamson, kwankhede, fiuczy


On 12/15/22 5:51 AM, Harald Freudenberger wrote:
> On 2022-12-13 16:44, Tony Krowiak wrote:
>> To eliminate the repeated calls to the PQAP(ZAPQ) function to verify 
>> that
>> a reset in progress completed successfully and ensure that error 
>> response
>> codes get appropriately logged, let's call the apq_reset_check() 
>> function
>> when the ZAPQ response code indicates that a reset that is already in
>> progress.
>>
>> Signed-off-by: Tony Krowiak <akrowiak@linux.ibm.com>
>> ---
>>  drivers/s390/crypto/vfio_ap_ops.c | 24 +++++++++++++-----------
>>  1 file changed, 13 insertions(+), 11 deletions(-)
>>
>> diff --git a/drivers/s390/crypto/vfio_ap_ops.c
>> b/drivers/s390/crypto/vfio_ap_ops.c
>> index a5530a46cf31..5bf2d93ae8af 100644
>> --- a/drivers/s390/crypto/vfio_ap_ops.c
>> +++ b/drivers/s390/crypto/vfio_ap_ops.c
>> @@ -33,7 +33,7 @@
>>  static int vfio_ap_mdev_reset_queues(struct ap_queue_table *qtable);
>>  static struct vfio_ap_queue *vfio_ap_find_queue(int apqn);
>>  static const struct vfio_device_ops vfio_ap_matrix_dev_ops;
>> -static int vfio_ap_mdev_reset_queue(struct vfio_ap_queue *q, unsigned
>> int retry);
>> +static int vfio_ap_mdev_reset_queue(struct vfio_ap_queue *q);
>>
>>  /**
>>   * get_update_locks_for_kvm: Acquire the locks required to 
>> dynamically update a
>> @@ -1632,8 +1632,7 @@ static int apq_reset_check(struct vfio_ap_queue 
>> *q)
>>      return ret;
>>  }
>>
>> -static int vfio_ap_mdev_reset_queue(struct vfio_ap_queue *q,
>> -                    unsigned int retry)
>> +static int vfio_ap_mdev_reset_queue(struct vfio_ap_queue *q)
>>  {
>>      struct ap_queue_status status;
>>      int ret;
>> @@ -1648,12 +1647,15 @@ static int vfio_ap_mdev_reset_queue(struct
>> vfio_ap_queue *q,
>>          ret = 0;
>>          break;
>>      case AP_RESPONSE_RESET_IN_PROGRESS:
>> -        if (retry--) {
>> -            msleep(20);
>> -            goto retry_zapq;
>> -        }
>> -        ret = -EBUSY;
>> -        break;
>> +        /*
>> +         * There is a reset issued by another process in progress. 
>> Let's wait
>> +         * for that to complete. Since we have no idea whether it 
>> was a RAPQ or
>> +         * ZAPQ, then if it completes successfully, let's issue the 
>> ZAPQ.
>> +         */
>> +        ret = apq_reset_check(q);
>> +        if (ret)
>> +            break;
>> +        goto retry_zapq;
>>      case AP_RESPONSE_Q_NOT_AVAIL:
>>      case AP_RESPONSE_DECONFIGURED:
>>      case AP_RESPONSE_CHECKSTOPPED:
>> @@ -1688,7 +1690,7 @@ static int vfio_ap_mdev_reset_queues(struct
>> ap_queue_table *qtable)
>>      struct vfio_ap_queue *q;
>>
>>      hash_for_each(qtable->queues, loop_cursor, q, mdev_qnode) {
>> -        ret = vfio_ap_mdev_reset_queue(q, 1);
>> +        ret = vfio_ap_mdev_reset_queue(q);
>>          /*
>>           * Regardless whether a queue turns out to be busy, or
>>           * is not operational, we need to continue resetting
>> @@ -1931,7 +1933,7 @@ void vfio_ap_mdev_remove_queue(struct ap_device 
>> *apdev)
>>          }
>>      }
>>
>> -    vfio_ap_mdev_reset_queue(q, 1);
>> +    vfio_ap_mdev_reset_queue(q);
>>      dev_set_drvdata(&apdev->device, NULL);
>>      kfree(q);
>>      release_update_locks_for_mdev(matrix_mdev);
>
> Reviewed-by: Harald Freudenberger <freude@linux.ibm.com>


Thanks for the review.



^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 4/7] s390/vfio_ap: verify ZAPQ completion after return of response code zero
  2022-12-15 10:54   ` Harald Freudenberger
@ 2022-12-20 14:24     ` Anthony Krowiak
  0 siblings, 0 replies; 25+ messages in thread
From: Anthony Krowiak @ 2022-12-20 14:24 UTC (permalink / raw)
  To: freude
  Cc: linux-s390, linux-kernel, kvm, jjherne, borntraeger, cohuck,
	mjrosato, pasic, alex.williamson, kwankhede, fiuczy


On 12/15/22 5:54 AM, Harald Freudenberger wrote:
> On 2022-12-13 16:44, Tony Krowiak wrote:
>> Verification that the asynchronous ZAPQ function has completed only 
>> needs
>> to be done when the response code indicates the function was 
>> successfully
>> initiated; so, let's call the apq_reset_check function immediately after
>> the response code zero is returned from the ZAPQ.
>>
>> Signed-off-by: Tony Krowiak <akrowiak@linux.ibm.com>
>> ---
>>  drivers/s390/crypto/vfio_ap_ops.c | 7 +++----
>>  1 file changed, 3 insertions(+), 4 deletions(-)
>>
>> diff --git a/drivers/s390/crypto/vfio_ap_ops.c
>> b/drivers/s390/crypto/vfio_ap_ops.c
>> index 5bf2d93ae8af..c0cf5050be59 100644
>> --- a/drivers/s390/crypto/vfio_ap_ops.c
>> +++ b/drivers/s390/crypto/vfio_ap_ops.c
>> @@ -1645,6 +1645,9 @@ static int vfio_ap_mdev_reset_queue(struct
>> vfio_ap_queue *q)
>>      switch (status.response_code) {
>>      case AP_RESPONSE_NORMAL:
>>          ret = 0;
>> +        /* if the reset has not completed, wait for it to take 
>> effect */
>> +        if (!status.queue_empty || status.irq_enabled)
>> +            ret = apq_reset_check(q);
>>          break;
>>      case AP_RESPONSE_RESET_IN_PROGRESS:
>>          /*
>> @@ -1674,10 +1677,6 @@ static int vfio_ap_mdev_reset_queue(struct
>> vfio_ap_queue *q)
>>          return -EIO;
>>      }
>>
>> -    /* wait for the reset to take effect */
>> -    if (!status.queue_empty && status.irq_enabled)
>> -        ret = apq_reset_check(q);
>> -
>>  free_resources:
>>      vfio_ap_free_aqic_resources(q);
>
> Reviewed-by: Harald Freudenberger <freude@linux.ibm.com>


Thanks for the review.



^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 5/7] s390/vfio_ap: fix handling of error response codes
  2022-12-15 10:56   ` Harald Freudenberger
@ 2022-12-20 14:25     ` Anthony Krowiak
  0 siblings, 0 replies; 25+ messages in thread
From: Anthony Krowiak @ 2022-12-20 14:25 UTC (permalink / raw)
  To: freude
  Cc: linux-s390, linux-kernel, kvm, jjherne, borntraeger, cohuck,
	mjrosato, pasic, alex.williamson, kwankhede, fiuczy


On 12/15/22 5:56 AM, Harald Freudenberger wrote:
> On 2022-12-13 16:44, Tony Krowiak wrote:
>> Some response codes returned from the queue reset function are not being
>> handled correctly; this patch fixes them:
>>
>> 1. Response code 3, AP queue deconfigured: Deconfiguring an AP adapter
>>    resets all of its queues, so this is handled by indicating the reset
>>    verification completed successfully.
>>
>> 2. For all response codes other than 0 (normal reset completion), 2
>>    (queue reset in progress) and 3 (AP deconfigured), the -EIO error 
>> will
>>    be returned from the vfio_ap_mdev_reset_queue() function. In all 
>> cases,
>>    all fields of the status word other than the response code will be
>>    set to zero, so it makes no sense to check status bits.
>>
>> Signed-off-by: Tony Krowiak <akrowiak@linux.ibm.com>
>> ---
>>  drivers/s390/crypto/vfio_ap_ops.c | 17 +++++++----------
>>  1 file changed, 7 insertions(+), 10 deletions(-)
>>
>> diff --git a/drivers/s390/crypto/vfio_ap_ops.c
>> b/drivers/s390/crypto/vfio_ap_ops.c
>> index c0cf5050be59..dbf681715a6d 100644
>> --- a/drivers/s390/crypto/vfio_ap_ops.c
>> +++ b/drivers/s390/crypto/vfio_ap_ops.c
>> @@ -1659,17 +1659,15 @@ static int vfio_ap_mdev_reset_queue(struct
>> vfio_ap_queue *q)
>>          if (ret)
>>              break;
>>          goto retry_zapq;
>> -    case AP_RESPONSE_Q_NOT_AVAIL:
>>      case AP_RESPONSE_DECONFIGURED:
>> -    case AP_RESPONSE_CHECKSTOPPED:
>> -        WARN_ONCE(status.irq_enabled,
>> -              "PQAP/ZAPQ for %02x.%04x failed with rc=%u while IRQ 
>> enabled",
>> -              AP_QID_CARD(q->apqn), AP_QID_QUEUE(q->apqn),
>> -              status.response_code);
>> -        ret = -EBUSY;
>> -        goto free_resources;
>> +        /*
>> +         * When an AP adapter is deconfigured, the associated
>> +         * queues are reset, so let's return a value indicating the 
>> reset
>> +         * completed successfully.
>> +         */
>> +        ret = 0;
>> +        break;
>>      default:
>> -        /* things are really broken, give up */
>>          WARN(true,
>>               "PQAP/ZAPQ for %02x.%04x failed with invalid rc=%u\n",
>>               AP_QID_CARD(q->apqn), AP_QID_QUEUE(q->apqn),
>> @@ -1677,7 +1675,6 @@ static int vfio_ap_mdev_reset_queue(struct
>> vfio_ap_queue *q)
>>          return -EIO;
>>      }
>>
>> -free_resources:
>>      vfio_ap_free_aqic_resources(q);
>>
>>      return ret;
>
> Reviewed-by: Harald Freudenberger <freude@linux.ibm.com>


Thanks for the review.



^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 6/7] s390/vfio_ap: increase max wait time for reset verification
  2022-12-15 10:58   ` Harald Freudenberger
@ 2022-12-20 14:27     ` Anthony Krowiak
  0 siblings, 0 replies; 25+ messages in thread
From: Anthony Krowiak @ 2022-12-20 14:27 UTC (permalink / raw)
  To: freude
  Cc: linux-s390, linux-kernel, kvm, jjherne, borntraeger, cohuck,
	mjrosato, pasic, alex.williamson, kwankhede, fiuczy


On 12/15/22 5:58 AM, Harald Freudenberger wrote:
> On 2022-12-13 16:44, Tony Krowiak wrote:
>> Increase the maximum time to wait for verification of a queue reset
>> operation to 200ms.
>>
>> Signed-off-by: Tony Krowiak <akrowiak@linux.ibm.com>
>> ---
>>  drivers/s390/crypto/vfio_ap_ops.c | 10 +++++++---
>>  1 file changed, 7 insertions(+), 3 deletions(-)
>>
>> diff --git a/drivers/s390/crypto/vfio_ap_ops.c
>> b/drivers/s390/crypto/vfio_ap_ops.c
>> index dbf681715a6d..e80c5a6b91be 100644
>> --- a/drivers/s390/crypto/vfio_ap_ops.c
>> +++ b/drivers/s390/crypto/vfio_ap_ops.c
>> @@ -30,6 +30,9 @@
>>  #define AP_QUEUE_UNASSIGNED "unassigned"
>>  #define AP_QUEUE_IN_USE "in use"
>>
>> +#define MAX_RESET_CHECK_WAIT    200    /* Sleep max 200ms for reset 
>> check    */
>> +#define AP_RESET_INTERVAL        20    /* Reset sleep interval 
>> (20ms)        */
>> +
>>  static int vfio_ap_mdev_reset_queues(struct ap_queue_table *qtable);
>>  static struct vfio_ap_queue *vfio_ap_find_queue(int apqn);
>>  static const struct vfio_device_ops vfio_ap_matrix_dev_ops;
>> @@ -1615,11 +1618,12 @@ static int apq_status_check(int apqn, struct
>> ap_queue_status *status)
>>
>>  static int apq_reset_check(struct vfio_ap_queue *q)
>>  {
>> -    int iters = 2, ret;
>> +    int ret;
>> +    int iters = MAX_RESET_CHECK_WAIT / AP_RESET_INTERVAL;
>>      struct ap_queue_status status;
>>
>> -    while (iters--) {
>> -        msleep(20);
>> +    for (; iters > 0; iters--) {
>> +        msleep(AP_RESET_INTERVAL);
>>          status = ap_tapq(q->apqn, NULL);
>>          ret = apq_status_check(q->apqn, &status);
>>          if (ret != -EBUSY)
>
> Reviewed-by: Harald Freudenberger <freude@linux.ibm.com>


Thanks for the review.



^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 7/7] s390/vfio_ap: always clean up IRQ resources
  2022-12-19 14:10   ` Halil Pasic
@ 2022-12-20 14:33     ` Anthony Krowiak
  2022-12-20 17:24       ` Halil Pasic
  0 siblings, 1 reply; 25+ messages in thread
From: Anthony Krowiak @ 2022-12-20 14:33 UTC (permalink / raw)
  To: Halil Pasic
  Cc: linux-s390, linux-kernel, kvm, jjherne, freude, borntraeger,
	cohuck, mjrosato, alex.williamson, kwankhede, fiuczy


On 12/19/22 9:10 AM, Halil Pasic wrote:
> On Tue, 13 Dec 2022 10:44:37 -0500
> Tony Krowiak <akrowiak@linux.ibm.com> wrote:
>
>> Clean up IRQ resources even when a PQAP(ZAPQ) function fails with an error
>> not handled by a case statement.
> Why?


If the ZAPQ failed, then instructions submitted to the same queue will 
likewise fail. Are you saying it's not safe to assume, therefore, that 
interrupts will not be occurring?


>
> I'm afraid this is a step in the wrong direction...


Please explain why.


>
>> Signed-off-by: Tony Krowiak <akrowiak@linux.ibm.com>
>> ---
>>   drivers/s390/crypto/vfio_ap_ops.c | 2 +-
>>   1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/drivers/s390/crypto/vfio_ap_ops.c b/drivers/s390/crypto/vfio_ap_ops.c
>> index e80c5a6b91be..2dd8db9ddb39 100644
>> --- a/drivers/s390/crypto/vfio_ap_ops.c
>> +++ b/drivers/s390/crypto/vfio_ap_ops.c
>> @@ -1676,7 +1676,7 @@ static int vfio_ap_mdev_reset_queue(struct vfio_ap_queue *q)
>>   		     "PQAP/ZAPQ for %02x.%04x failed with invalid rc=%u\n",
>>   		     AP_QID_CARD(q->apqn), AP_QID_QUEUE(q->apqn),
>>   		     status.response_code);
>> -		return -EIO;
>> +		break;
>>   	}
>>   
>>   	vfio_ap_free_aqic_resources(q);

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 7/7] s390/vfio_ap: always clean up IRQ resources
  2022-12-20 14:33     ` Anthony Krowiak
@ 2022-12-20 17:24       ` Halil Pasic
  2023-01-09 19:40         ` Anthony Krowiak
  0 siblings, 1 reply; 25+ messages in thread
From: Halil Pasic @ 2022-12-20 17:24 UTC (permalink / raw)
  To: Anthony Krowiak
  Cc: linux-s390, linux-kernel, kvm, jjherne, freude, borntraeger,
	cohuck, mjrosato, alex.williamson, kwankhede, fiuczy,
	Halil Pasic

On Tue, 20 Dec 2022 09:33:03 -0500
Anthony Krowiak <akrowiak@linux.ibm.com> wrote:

> On 12/19/22 9:10 AM, Halil Pasic wrote:
> > On Tue, 13 Dec 2022 10:44:37 -0500
> > Tony Krowiak <akrowiak@linux.ibm.com> wrote:
> >  
> >> Clean up IRQ resources even when a PQAP(ZAPQ) function fails with an error
> >> not handled by a case statement.  
> > Why?  
> 
> 
> If the ZAPQ failed, then instructions submitted to the same queue will 
> likewise fail. Are you saying it's not safe to assume, therefore, that 
> interrupts will not be occurring?

Right. We are talking about the default branch here, and I suppose, the
codes where we know that it is safe to assume that no reset is needed
handled separately (AP_RESPONSE_DECONFIGURED).

I'm not convinced that if we take the default branch we can safely
assume, that we won't see any interrupts.

For example consider hot-unplug as done by KVM. We modify the
CRYCB/APCB with all vCPUS take out of SIE, but we don't keep
the vCPUs out of SIE until the resets of the unpugged queues
are done, and we don't do any extra interrupt disablement
with all vCPUs keept out of SIE. So I believe currently there
may be a window where the guest can observe a 01 but the
interrupts are still live. That may be a bug, but IMHO it ain't clear
cut.

But it is not just about interrupts. Before we returned an error
code, which gets propagated to the userspace if this reset was
triggered via the ioctl.

With this change, ret seems to be uninitialized when returned 
if we take the code path which you change here. So we would
end up logging a warning and returning garbage?

One could also debate, whether RCs introduced down the road
can affect the logic here (even if the statement "if we
see an RC other that 00 and 02, we don't need to pursue a
reset any further, and interrpts are disabled" were to be
guaranteed to be true now, new RCs could theoretically mess
this up).

 
> 
> 
> >
> > I'm afraid this is a step in the wrong direction...  
> 
> 
> Please explain why.
> 

Sorry, I kept this brief because IMHO it is your job to tell us why
this needs to be changed. But I gave in, as you see.

Regards,
Halil

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 7/7] s390/vfio_ap: always clean up IRQ resources
  2022-12-20 17:24       ` Halil Pasic
@ 2023-01-09 19:40         ` Anthony Krowiak
  0 siblings, 0 replies; 25+ messages in thread
From: Anthony Krowiak @ 2023-01-09 19:40 UTC (permalink / raw)
  To: Halil Pasic
  Cc: linux-s390, linux-kernel, kvm, jjherne, freude, borntraeger,
	cohuck, mjrosato, alex.williamson, kwankhede, fiuczy


On 12/20/22 12:24 PM, Halil Pasic wrote:
> On Tue, 20 Dec 2022 09:33:03 -0500
> Anthony Krowiak <akrowiak@linux.ibm.com> wrote:
>
>> On 12/19/22 9:10 AM, Halil Pasic wrote:
>>> On Tue, 13 Dec 2022 10:44:37 -0500
>>> Tony Krowiak <akrowiak@linux.ibm.com> wrote:
>>>   
>>>> Clean up IRQ resources even when a PQAP(ZAPQ) function fails with an error
>>>> not handled by a case statement.
>>> Why?
>>
>> If the ZAPQ failed, then instructions submitted to the same queue will
>> likewise fail. Are you saying it's not safe to assume, therefore, that
>> interrupts will not be occurring?
> Right. We are talking about the default branch here, and I suppose, the
> codes where we know that it is safe to assume that no reset is needed
> handled separately (AP_RESPONSE_DECONFIGURED).
>
> I'm not convinced that if we take the default branch we can safely
> assume, that we won't see any interrupts.
>
> For example consider hot-unplug as done by KVM. We modify the
> CRYCB/APCB with all vCPUS take out of SIE, but we don't keep
> the vCPUs out of SIE until the resets of the unpugged queues
> are done, and we don't do any extra interrupt disablement
> with all vCPUs keept out of SIE. So I believe currently there
> may be a window where the guest can observe a 01 but the
> interrupts are still live. That may be a bug, but IMHO it ain't clear
> cut.
>
> But it is not just about interrupts. Before we returned an error
> code, which gets propagated to the userspace if this reset was
> triggered via the ioctl.
>
> With this change, ret seems to be uninitialized when returned
> if we take the code path which you change here. So we would
> end up logging a warning and returning garbage?


That was an oversight. The -EIO value was returned previously, so the 
ret = -EIO should be set in the default case.


>
> One could also debate, whether RCs introduced down the road
> can affect the logic here (even if the statement "if we
> see an RC other that 00 and 02, we don't need to pursue a
> reset any further, and interrpts are disabled" were to be
> guaranteed to be true now, new RCs could theoretically mess
> this up).


I think that would be the case regardless of this change. If new RCs are 
introduced, this function ought to be revisited anyway and appropriate 
changes made.


>
>   
>>
>>> I'm afraid this is a step in the wrong direction...
>>
>> Please explain why.
>>
> Sorry, I kept this brief because IMHO it is your job to tell us why
> this needs to be changed. But I gave in, as you see.
>
> Regards,
> Halil

^ permalink raw reply	[flat|nested] 25+ messages in thread

end of thread, other threads:[~2023-01-09 19:41 UTC | newest]

Thread overview: 25+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-12-13 15:44 [PATCH 0/7] improve AP queue reset processing Tony Krowiak
2022-12-13 15:44 ` [PATCH 1/7] s390/vfio-ap: verify reset complete in separate function Tony Krowiak
2022-12-15  8:24   ` Harald Freudenberger
2022-12-13 15:44 ` [PATCH 2/7] s390/vfio_ap: check TAPQ response code when waiting for queue reset Tony Krowiak
2022-12-15 10:48   ` Harald Freudenberger
2022-12-13 15:44 ` [PATCH 3/7] s390/vfio_ap: use TAPQ to verify reset in progress completes Tony Krowiak
2022-12-15 10:51   ` Harald Freudenberger
2022-12-20 14:22     ` Anthony Krowiak
2022-12-13 15:44 ` [PATCH 4/7] s390/vfio_ap: verify ZAPQ completion after return of response code zero Tony Krowiak
2022-12-15 10:54   ` Harald Freudenberger
2022-12-20 14:24     ` Anthony Krowiak
2022-12-13 15:44 ` [PATCH 5/7] s390/vfio_ap: fix handling of error response codes Tony Krowiak
2022-12-15 10:56   ` Harald Freudenberger
2022-12-20 14:25     ` Anthony Krowiak
2022-12-13 15:44 ` [PATCH 6/7] s390/vfio_ap: increase max wait time for reset verification Tony Krowiak
2022-12-15 10:58   ` Harald Freudenberger
2022-12-20 14:27     ` Anthony Krowiak
2022-12-13 15:44 ` [PATCH 7/7] s390/vfio_ap: always clean up IRQ resources Tony Krowiak
2022-12-15 10:58   ` Harald Freudenberger
2022-12-19 14:10   ` Halil Pasic
2022-12-20 14:33     ` Anthony Krowiak
2022-12-20 17:24       ` Halil Pasic
2023-01-09 19:40         ` Anthony Krowiak
2022-12-14 15:16 ` [PATCH 0/7] improve AP queue reset processing Jason J. Herne
2022-12-14 19:10   ` Anthony Krowiak

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.