All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 1/2] drm/msm/dp: Clean up handling of DP AUX interrupts
@ 2023-01-19 22:53 ` Douglas Anderson
  0 siblings, 0 replies; 14+ messages in thread
From: Douglas Anderson @ 2023-01-19 22:53 UTC (permalink / raw)
  To: Rob Clark, Abhinav Kumar, Dmitry Baryshkov
  Cc: freedreno, Sankeerth Billakanti, Thomas Zimmermann, Sean Paul,
	Douglas Anderson, dri-devel, Stephen Boyd, linux-arm-msm,
	Kuogee Hsieh, linux-kernel

The DP AUX interrupt handling was a bit of a mess.
* There were two functions (one for "native" transfers and one for
  "i2c" transfers) that were quite similar. It was hard to say how
  many of the differences between the two functions were on purpose
  and how many of them were just an accident of how they were coded.
* Each function sometimes used "else if" to test for error bits and
  sometimes didn't and again it was hard to say if this was on purpose
  or just an accident.
* The two functions wouldn't notice whether "unknown" bits were
  set. For instance, there seems to be a bit "DP_INTR_PLL_UNLOCKED"
  and if it was set there would be no indication.
* The two functions wouldn't notice if more than one error was set.

Let's fix this by being more consistent / explicit about what we're
doing.

By design this could cause different handling for AUX transfers,
though I'm not actually aware of any bug fixed as a result of
this patch (this patch was created because we simply noticed how odd
the old code was by code inspection). Specific notes here:
1. In the old native transfer case if we got "done + wrong address"
   we'd ignore the "wrong address" (because of the "else if"). Now we
   won't.
2. In the old native transfer case if we got "done + timeout" we'd
   ignore the "timeout" (because of the "else if"). Now we won't.
3. In the old native transfer case we'd see "nack_defer" and translate
   it to the error number for "nack". This differed from the i2c
   transfer case where "nack_defer" was given the error number for
   "nack_defer". This 100% can't matter because the only user of this
   error number treats "nack defer" the same as "nack", so it's clear
   that the difference between the "native" and "i2c" was pointless
   here.
4. In the old i2c transfer case if we got "done" plus any error
   besides "nack" or "defer" then we'd ignore the error. Now we don't.
5. If there is more than one error signaled by the hardware it's
   possible that we'll report a different one than we used to. I don't
   know if this matters. If someone is aware of a case this matters we
   should document it and change the code to make it explicit.
6. One quirk we keep (I don't know if this is important) is that in
   the i2c transfer case if we see "done + defer" we report that as a
   "nack". That seemed too intentional in the old code to just drop.

After this change we will add extra logging, including:
* A warning if we see more than one error bit set.
* A warning if we see an unexpected interrupt.
* A warning if we get an AUX transfer interrupt when shouldn't.

It actually turns out that as a result of this change then at boot we
sometimes see an error:
  [drm:dp_aux_isr] *ERROR* Unexpected DP AUX IRQ 0x01000000 when not busy
That means that, during init, we are seeing DP_INTR_PLL_UNLOCKED. For
now I'm going to say that leaving this error reported in the logs is
OK-ish and hopefully it will encourage someone to track down what's
going on at init time.

One last note here is that this change renames one of the interrupt
bits. The bit named "i2c done" clearly was used for native transfers
being done too, so I renamed it to indicate this.

Signed-off-by: Douglas Anderson <dianders@chromium.org>
---
I don't have good test coverage for this change and it does have the
potential to change behavior. I confirmed that eDP and DP still
continue to work OK on one machine. Hopefully folks can test it more.

 drivers/gpu/drm/msm/dp/dp_aux.c     | 80 ++++++++++++-----------------
 drivers/gpu/drm/msm/dp/dp_catalog.c |  2 +-
 drivers/gpu/drm/msm/dp/dp_catalog.h |  2 +-
 3 files changed, 36 insertions(+), 48 deletions(-)

diff --git a/drivers/gpu/drm/msm/dp/dp_aux.c b/drivers/gpu/drm/msm/dp/dp_aux.c
index cc3efed593aa..34ad08ae6eb9 100644
--- a/drivers/gpu/drm/msm/dp/dp_aux.c
+++ b/drivers/gpu/drm/msm/dp/dp_aux.c
@@ -162,47 +162,6 @@ static ssize_t dp_aux_cmd_fifo_rx(struct dp_aux_private *aux,
 	return i;
 }
 
-static void dp_aux_native_handler(struct dp_aux_private *aux, u32 isr)
-{
-	if (isr & DP_INTR_AUX_I2C_DONE)
-		aux->aux_error_num = DP_AUX_ERR_NONE;
-	else if (isr & DP_INTR_WRONG_ADDR)
-		aux->aux_error_num = DP_AUX_ERR_ADDR;
-	else if (isr & DP_INTR_TIMEOUT)
-		aux->aux_error_num = DP_AUX_ERR_TOUT;
-	if (isr & DP_INTR_NACK_DEFER)
-		aux->aux_error_num = DP_AUX_ERR_NACK;
-	if (isr & DP_INTR_AUX_ERROR) {
-		aux->aux_error_num = DP_AUX_ERR_PHY;
-		dp_catalog_aux_clear_hw_interrupts(aux->catalog);
-	}
-}
-
-static void dp_aux_i2c_handler(struct dp_aux_private *aux, u32 isr)
-{
-	if (isr & DP_INTR_AUX_I2C_DONE) {
-		if (isr & (DP_INTR_I2C_NACK | DP_INTR_I2C_DEFER))
-			aux->aux_error_num = DP_AUX_ERR_NACK;
-		else
-			aux->aux_error_num = DP_AUX_ERR_NONE;
-	} else {
-		if (isr & DP_INTR_WRONG_ADDR)
-			aux->aux_error_num = DP_AUX_ERR_ADDR;
-		else if (isr & DP_INTR_TIMEOUT)
-			aux->aux_error_num = DP_AUX_ERR_TOUT;
-		if (isr & DP_INTR_NACK_DEFER)
-			aux->aux_error_num = DP_AUX_ERR_NACK_DEFER;
-		if (isr & DP_INTR_I2C_NACK)
-			aux->aux_error_num = DP_AUX_ERR_NACK;
-		if (isr & DP_INTR_I2C_DEFER)
-			aux->aux_error_num = DP_AUX_ERR_DEFER;
-		if (isr & DP_INTR_AUX_ERROR) {
-			aux->aux_error_num = DP_AUX_ERR_PHY;
-			dp_catalog_aux_clear_hw_interrupts(aux->catalog);
-		}
-	}
-}
-
 static void dp_aux_update_offset_and_segment(struct dp_aux_private *aux,
 					     struct drm_dp_aux_msg *input_msg)
 {
@@ -427,13 +386,42 @@ void dp_aux_isr(struct drm_dp_aux *dp_aux)
 	if (!isr)
 		return;
 
-	if (!aux->cmd_busy)
+	if (!aux->cmd_busy) {
+		DRM_ERROR("Unexpected DP AUX IRQ %#010x when not busy\n", isr);
 		return;
+	}
 
-	if (aux->native)
-		dp_aux_native_handler(aux, isr);
-	else
-		dp_aux_i2c_handler(aux, isr);
+	/*
+	 * The logic below assumes only one error bit is set (other than "done"
+	 * which can apparently be set at the same time as some of the other
+	 * bits). Warn if more than one get set so we know we need to improve
+	 * the logic.
+	 */
+	if (hweight32(isr & ~DP_INTR_AUX_XFER_DONE) > 1)
+		DRM_WARN("Some DP AUX interrupts unhandled: %#010x\n", isr);
+
+	if (isr & DP_INTR_AUX_ERROR) {
+		aux->aux_error_num = DP_AUX_ERR_PHY;
+		dp_catalog_aux_clear_hw_interrupts(aux->catalog);
+	} else if (isr & DP_INTR_NACK_DEFER) {
+		aux->aux_error_num = DP_AUX_ERR_NACK_DEFER;
+	} else if (isr & DP_INTR_WRONG_ADDR) {
+		aux->aux_error_num = DP_AUX_ERR_ADDR;
+	} else if (isr & DP_INTR_TIMEOUT) {
+		aux->aux_error_num = DP_AUX_ERR_TOUT;
+	} else if (isr & DP_INTR_AUX_XFER_DONE) {
+		aux->aux_error_num = DP_AUX_ERR_NONE;
+	} else if (!aux->native && (isr & DP_INTR_I2C_NACK)) {
+		aux->aux_error_num = DP_AUX_ERR_NACK;
+	} else if (!aux->native && (isr & DP_INTR_I2C_DEFER)) {
+		if (isr & DP_INTR_AUX_XFER_DONE)
+			aux->aux_error_num = DP_AUX_ERR_NACK;
+		else
+			aux->aux_error_num = DP_AUX_ERR_DEFER;
+	} else {
+		DRM_WARN("Unexpected interrupt: %#010x\n", isr);
+		return;
+	}
 
 	complete(&aux->comp);
 }
diff --git a/drivers/gpu/drm/msm/dp/dp_catalog.c b/drivers/gpu/drm/msm/dp/dp_catalog.c
index 676279d0ca8d..421391755427 100644
--- a/drivers/gpu/drm/msm/dp/dp_catalog.c
+++ b/drivers/gpu/drm/msm/dp/dp_catalog.c
@@ -27,7 +27,7 @@
 #define DP_INTF_CONFIG_DATABUS_WIDEN     BIT(4)
 
 #define DP_INTERRUPT_STATUS1 \
-	(DP_INTR_AUX_I2C_DONE| \
+	(DP_INTR_AUX_XFER_DONE| \
 	DP_INTR_WRONG_ADDR | DP_INTR_TIMEOUT | \
 	DP_INTR_NACK_DEFER | DP_INTR_WRONG_DATA_CNT | \
 	DP_INTR_I2C_NACK | DP_INTR_I2C_DEFER | \
diff --git a/drivers/gpu/drm/msm/dp/dp_catalog.h b/drivers/gpu/drm/msm/dp/dp_catalog.h
index 1f717f45c115..f36b7b372a06 100644
--- a/drivers/gpu/drm/msm/dp/dp_catalog.h
+++ b/drivers/gpu/drm/msm/dp/dp_catalog.h
@@ -13,7 +13,7 @@
 
 /* interrupts */
 #define DP_INTR_HPD		BIT(0)
-#define DP_INTR_AUX_I2C_DONE	BIT(3)
+#define DP_INTR_AUX_XFER_DONE	BIT(3)
 #define DP_INTR_WRONG_ADDR	BIT(6)
 #define DP_INTR_TIMEOUT		BIT(9)
 #define DP_INTR_NACK_DEFER	BIT(12)
-- 
2.39.0.246.g2a6d74b583-goog


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH 1/2] drm/msm/dp: Clean up handling of DP AUX interrupts
@ 2023-01-19 22:53 ` Douglas Anderson
  0 siblings, 0 replies; 14+ messages in thread
From: Douglas Anderson @ 2023-01-19 22:53 UTC (permalink / raw)
  To: Rob Clark, Abhinav Kumar, Dmitry Baryshkov
  Cc: Kuogee Hsieh, Stephen Boyd, Douglas Anderson, Daniel Vetter,
	David Airlie, Sankeerth Billakanti, Sean Paul, Thomas Zimmermann,
	dri-devel, freedreno, linux-arm-msm, linux-kernel

The DP AUX interrupt handling was a bit of a mess.
* There were two functions (one for "native" transfers and one for
  "i2c" transfers) that were quite similar. It was hard to say how
  many of the differences between the two functions were on purpose
  and how many of them were just an accident of how they were coded.
* Each function sometimes used "else if" to test for error bits and
  sometimes didn't and again it was hard to say if this was on purpose
  or just an accident.
* The two functions wouldn't notice whether "unknown" bits were
  set. For instance, there seems to be a bit "DP_INTR_PLL_UNLOCKED"
  and if it was set there would be no indication.
* The two functions wouldn't notice if more than one error was set.

Let's fix this by being more consistent / explicit about what we're
doing.

By design this could cause different handling for AUX transfers,
though I'm not actually aware of any bug fixed as a result of
this patch (this patch was created because we simply noticed how odd
the old code was by code inspection). Specific notes here:
1. In the old native transfer case if we got "done + wrong address"
   we'd ignore the "wrong address" (because of the "else if"). Now we
   won't.
2. In the old native transfer case if we got "done + timeout" we'd
   ignore the "timeout" (because of the "else if"). Now we won't.
3. In the old native transfer case we'd see "nack_defer" and translate
   it to the error number for "nack". This differed from the i2c
   transfer case where "nack_defer" was given the error number for
   "nack_defer". This 100% can't matter because the only user of this
   error number treats "nack defer" the same as "nack", so it's clear
   that the difference between the "native" and "i2c" was pointless
   here.
4. In the old i2c transfer case if we got "done" plus any error
   besides "nack" or "defer" then we'd ignore the error. Now we don't.
5. If there is more than one error signaled by the hardware it's
   possible that we'll report a different one than we used to. I don't
   know if this matters. If someone is aware of a case this matters we
   should document it and change the code to make it explicit.
6. One quirk we keep (I don't know if this is important) is that in
   the i2c transfer case if we see "done + defer" we report that as a
   "nack". That seemed too intentional in the old code to just drop.

After this change we will add extra logging, including:
* A warning if we see more than one error bit set.
* A warning if we see an unexpected interrupt.
* A warning if we get an AUX transfer interrupt when shouldn't.

It actually turns out that as a result of this change then at boot we
sometimes see an error:
  [drm:dp_aux_isr] *ERROR* Unexpected DP AUX IRQ 0x01000000 when not busy
That means that, during init, we are seeing DP_INTR_PLL_UNLOCKED. For
now I'm going to say that leaving this error reported in the logs is
OK-ish and hopefully it will encourage someone to track down what's
going on at init time.

One last note here is that this change renames one of the interrupt
bits. The bit named "i2c done" clearly was used for native transfers
being done too, so I renamed it to indicate this.

Signed-off-by: Douglas Anderson <dianders@chromium.org>
---
I don't have good test coverage for this change and it does have the
potential to change behavior. I confirmed that eDP and DP still
continue to work OK on one machine. Hopefully folks can test it more.

 drivers/gpu/drm/msm/dp/dp_aux.c     | 80 ++++++++++++-----------------
 drivers/gpu/drm/msm/dp/dp_catalog.c |  2 +-
 drivers/gpu/drm/msm/dp/dp_catalog.h |  2 +-
 3 files changed, 36 insertions(+), 48 deletions(-)

diff --git a/drivers/gpu/drm/msm/dp/dp_aux.c b/drivers/gpu/drm/msm/dp/dp_aux.c
index cc3efed593aa..34ad08ae6eb9 100644
--- a/drivers/gpu/drm/msm/dp/dp_aux.c
+++ b/drivers/gpu/drm/msm/dp/dp_aux.c
@@ -162,47 +162,6 @@ static ssize_t dp_aux_cmd_fifo_rx(struct dp_aux_private *aux,
 	return i;
 }
 
-static void dp_aux_native_handler(struct dp_aux_private *aux, u32 isr)
-{
-	if (isr & DP_INTR_AUX_I2C_DONE)
-		aux->aux_error_num = DP_AUX_ERR_NONE;
-	else if (isr & DP_INTR_WRONG_ADDR)
-		aux->aux_error_num = DP_AUX_ERR_ADDR;
-	else if (isr & DP_INTR_TIMEOUT)
-		aux->aux_error_num = DP_AUX_ERR_TOUT;
-	if (isr & DP_INTR_NACK_DEFER)
-		aux->aux_error_num = DP_AUX_ERR_NACK;
-	if (isr & DP_INTR_AUX_ERROR) {
-		aux->aux_error_num = DP_AUX_ERR_PHY;
-		dp_catalog_aux_clear_hw_interrupts(aux->catalog);
-	}
-}
-
-static void dp_aux_i2c_handler(struct dp_aux_private *aux, u32 isr)
-{
-	if (isr & DP_INTR_AUX_I2C_DONE) {
-		if (isr & (DP_INTR_I2C_NACK | DP_INTR_I2C_DEFER))
-			aux->aux_error_num = DP_AUX_ERR_NACK;
-		else
-			aux->aux_error_num = DP_AUX_ERR_NONE;
-	} else {
-		if (isr & DP_INTR_WRONG_ADDR)
-			aux->aux_error_num = DP_AUX_ERR_ADDR;
-		else if (isr & DP_INTR_TIMEOUT)
-			aux->aux_error_num = DP_AUX_ERR_TOUT;
-		if (isr & DP_INTR_NACK_DEFER)
-			aux->aux_error_num = DP_AUX_ERR_NACK_DEFER;
-		if (isr & DP_INTR_I2C_NACK)
-			aux->aux_error_num = DP_AUX_ERR_NACK;
-		if (isr & DP_INTR_I2C_DEFER)
-			aux->aux_error_num = DP_AUX_ERR_DEFER;
-		if (isr & DP_INTR_AUX_ERROR) {
-			aux->aux_error_num = DP_AUX_ERR_PHY;
-			dp_catalog_aux_clear_hw_interrupts(aux->catalog);
-		}
-	}
-}
-
 static void dp_aux_update_offset_and_segment(struct dp_aux_private *aux,
 					     struct drm_dp_aux_msg *input_msg)
 {
@@ -427,13 +386,42 @@ void dp_aux_isr(struct drm_dp_aux *dp_aux)
 	if (!isr)
 		return;
 
-	if (!aux->cmd_busy)
+	if (!aux->cmd_busy) {
+		DRM_ERROR("Unexpected DP AUX IRQ %#010x when not busy\n", isr);
 		return;
+	}
 
-	if (aux->native)
-		dp_aux_native_handler(aux, isr);
-	else
-		dp_aux_i2c_handler(aux, isr);
+	/*
+	 * The logic below assumes only one error bit is set (other than "done"
+	 * which can apparently be set at the same time as some of the other
+	 * bits). Warn if more than one get set so we know we need to improve
+	 * the logic.
+	 */
+	if (hweight32(isr & ~DP_INTR_AUX_XFER_DONE) > 1)
+		DRM_WARN("Some DP AUX interrupts unhandled: %#010x\n", isr);
+
+	if (isr & DP_INTR_AUX_ERROR) {
+		aux->aux_error_num = DP_AUX_ERR_PHY;
+		dp_catalog_aux_clear_hw_interrupts(aux->catalog);
+	} else if (isr & DP_INTR_NACK_DEFER) {
+		aux->aux_error_num = DP_AUX_ERR_NACK_DEFER;
+	} else if (isr & DP_INTR_WRONG_ADDR) {
+		aux->aux_error_num = DP_AUX_ERR_ADDR;
+	} else if (isr & DP_INTR_TIMEOUT) {
+		aux->aux_error_num = DP_AUX_ERR_TOUT;
+	} else if (isr & DP_INTR_AUX_XFER_DONE) {
+		aux->aux_error_num = DP_AUX_ERR_NONE;
+	} else if (!aux->native && (isr & DP_INTR_I2C_NACK)) {
+		aux->aux_error_num = DP_AUX_ERR_NACK;
+	} else if (!aux->native && (isr & DP_INTR_I2C_DEFER)) {
+		if (isr & DP_INTR_AUX_XFER_DONE)
+			aux->aux_error_num = DP_AUX_ERR_NACK;
+		else
+			aux->aux_error_num = DP_AUX_ERR_DEFER;
+	} else {
+		DRM_WARN("Unexpected interrupt: %#010x\n", isr);
+		return;
+	}
 
 	complete(&aux->comp);
 }
diff --git a/drivers/gpu/drm/msm/dp/dp_catalog.c b/drivers/gpu/drm/msm/dp/dp_catalog.c
index 676279d0ca8d..421391755427 100644
--- a/drivers/gpu/drm/msm/dp/dp_catalog.c
+++ b/drivers/gpu/drm/msm/dp/dp_catalog.c
@@ -27,7 +27,7 @@
 #define DP_INTF_CONFIG_DATABUS_WIDEN     BIT(4)
 
 #define DP_INTERRUPT_STATUS1 \
-	(DP_INTR_AUX_I2C_DONE| \
+	(DP_INTR_AUX_XFER_DONE| \
 	DP_INTR_WRONG_ADDR | DP_INTR_TIMEOUT | \
 	DP_INTR_NACK_DEFER | DP_INTR_WRONG_DATA_CNT | \
 	DP_INTR_I2C_NACK | DP_INTR_I2C_DEFER | \
diff --git a/drivers/gpu/drm/msm/dp/dp_catalog.h b/drivers/gpu/drm/msm/dp/dp_catalog.h
index 1f717f45c115..f36b7b372a06 100644
--- a/drivers/gpu/drm/msm/dp/dp_catalog.h
+++ b/drivers/gpu/drm/msm/dp/dp_catalog.h
@@ -13,7 +13,7 @@
 
 /* interrupts */
 #define DP_INTR_HPD		BIT(0)
-#define DP_INTR_AUX_I2C_DONE	BIT(3)
+#define DP_INTR_AUX_XFER_DONE	BIT(3)
 #define DP_INTR_WRONG_ADDR	BIT(6)
 #define DP_INTR_TIMEOUT		BIT(9)
 #define DP_INTR_NACK_DEFER	BIT(12)
-- 
2.39.0.246.g2a6d74b583-goog


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH 2/2] drm/msm/dp: Return IRQ_NONE for unhandled interrupts
  2023-01-19 22:53 ` Douglas Anderson
@ 2023-01-19 22:53   ` Douglas Anderson
  -1 siblings, 0 replies; 14+ messages in thread
From: Douglas Anderson @ 2023-01-19 22:53 UTC (permalink / raw)
  To: Rob Clark, Abhinav Kumar, Dmitry Baryshkov
  Cc: freedreno, Sankeerth Billakanti, linux-kernel, Thomas Zimmermann,
	Sean Paul, Bjorn Andersson, Douglas Anderson, dri-devel,
	Stephen Boyd, linux-arm-msm, Javier Martinez Canillas,
	Kuogee Hsieh, Johan Hovold

If our interrupt handler gets called and we don't really handle the
interrupt then we should return IRQ_NONE. The current interrupt
handler didn't do this, so let's fix it.

NOTE: for some of the cases it's clear that we should return IRQ_NONE
and some cases it's clear that we should return IRQ_HANDLED. However,
there are a few that fall somewhere in between. Specifically, the
documentation for when to return IRQ_NONE vs. IRQ_HANDLED is probably
best spelled out in the commit message of commit d9e4ad5badf4
("Document that IRQ_NONE should be returned when IRQ not actually
handled"). That commit makes it clear that we should return
IRQ_HANDLED if we've done something to make the interrupt stop
happening.

The case where it's unclear is, for instance, in dp_aux_isr() after
we've read the interrupt using dp_catalog_aux_get_irq() and confirmed
that "isr" is non-zero. The function dp_catalog_aux_get_irq() not only
reads the interrupts but it also "ack"s all the interrupts that are
returned. For an "unknown" interrupt this has a very good chance of
actually stopping the interrupt from happening. That would mean we've
identified that it's our device and done something to stop them from
happening and should return IRQ_HANDLED. Specifically, it should be
noted that most interrupts that need "ack"ing are ones that are
one-time events and doing an "ack" is enough to clear them. However,
since these interrupts are unknown then, by definition, it's unknown
if "ack"ing them is truly enough to clear them. It's possible that we
also need to remove the original source of the interrupt. In this
case, IRQ_NONE would be a better choice.

Given that returning an occasional IRQ_NONE isn't the absolute end of
the world, however, let's choose that course of action. The IRQ
framework will forgive a few IRQ_NONE returns now and again (and it
won't even log them, which is why we have to log them ourselves). This
means that if we _do_ end hitting an interrupt where "ack"ing isn't
enough the kernel will eventually detect the problem and shut our
device down.

Signed-off-by: Douglas Anderson <dianders@chromium.org>
---

 drivers/gpu/drm/msm/dp/dp_aux.c     | 12 +++++++-----
 drivers/gpu/drm/msm/dp/dp_aux.h     |  2 +-
 drivers/gpu/drm/msm/dp/dp_ctrl.c    | 10 ++++++++--
 drivers/gpu/drm/msm/dp/dp_ctrl.h    |  2 +-
 drivers/gpu/drm/msm/dp/dp_display.c |  8 +++++---
 5 files changed, 22 insertions(+), 12 deletions(-)

diff --git a/drivers/gpu/drm/msm/dp/dp_aux.c b/drivers/gpu/drm/msm/dp/dp_aux.c
index 34ad08ae6eb9..59e323b7499d 100644
--- a/drivers/gpu/drm/msm/dp/dp_aux.c
+++ b/drivers/gpu/drm/msm/dp/dp_aux.c
@@ -368,14 +368,14 @@ static ssize_t dp_aux_transfer(struct drm_dp_aux *dp_aux,
 	return ret;
 }
 
-void dp_aux_isr(struct drm_dp_aux *dp_aux)
+irqreturn_t dp_aux_isr(struct drm_dp_aux *dp_aux)
 {
 	u32 isr;
 	struct dp_aux_private *aux;
 
 	if (!dp_aux) {
 		DRM_ERROR("invalid input\n");
-		return;
+		return IRQ_NONE;
 	}
 
 	aux = container_of(dp_aux, struct dp_aux_private, dp_aux);
@@ -384,11 +384,11 @@ void dp_aux_isr(struct drm_dp_aux *dp_aux)
 
 	/* no interrupts pending, return immediately */
 	if (!isr)
-		return;
+		return IRQ_NONE;
 
 	if (!aux->cmd_busy) {
 		DRM_ERROR("Unexpected DP AUX IRQ %#010x when not busy\n", isr);
-		return;
+		return IRQ_NONE;
 	}
 
 	/*
@@ -420,10 +420,12 @@ void dp_aux_isr(struct drm_dp_aux *dp_aux)
 			aux->aux_error_num = DP_AUX_ERR_DEFER;
 	} else {
 		DRM_WARN("Unexpected interrupt: %#010x\n", isr);
-		return;
+		return IRQ_NONE;
 	}
 
 	complete(&aux->comp);
+
+	return IRQ_HANDLED;
 }
 
 void dp_aux_reconfig(struct drm_dp_aux *dp_aux)
diff --git a/drivers/gpu/drm/msm/dp/dp_aux.h b/drivers/gpu/drm/msm/dp/dp_aux.h
index e930974bcb5b..511305da4f66 100644
--- a/drivers/gpu/drm/msm/dp/dp_aux.h
+++ b/drivers/gpu/drm/msm/dp/dp_aux.h
@@ -11,7 +11,7 @@
 
 int dp_aux_register(struct drm_dp_aux *dp_aux);
 void dp_aux_unregister(struct drm_dp_aux *dp_aux);
-void dp_aux_isr(struct drm_dp_aux *dp_aux);
+irqreturn_t dp_aux_isr(struct drm_dp_aux *dp_aux);
 void dp_aux_init(struct drm_dp_aux *dp_aux);
 void dp_aux_deinit(struct drm_dp_aux *dp_aux);
 void dp_aux_reconfig(struct drm_dp_aux *dp_aux);
diff --git a/drivers/gpu/drm/msm/dp/dp_ctrl.c b/drivers/gpu/drm/msm/dp/dp_ctrl.c
index dd26ca651a05..1a5377ef1967 100644
--- a/drivers/gpu/drm/msm/dp/dp_ctrl.c
+++ b/drivers/gpu/drm/msm/dp/dp_ctrl.c
@@ -1979,27 +1979,33 @@ int dp_ctrl_off(struct dp_ctrl *dp_ctrl)
 	return ret;
 }
 
-void dp_ctrl_isr(struct dp_ctrl *dp_ctrl)
+irqreturn_t dp_ctrl_isr(struct dp_ctrl *dp_ctrl)
 {
 	struct dp_ctrl_private *ctrl;
 	u32 isr;
+	irqreturn_t ret = IRQ_NONE;
 
 	if (!dp_ctrl)
-		return;
+		return IRQ_NONE;
 
 	ctrl = container_of(dp_ctrl, struct dp_ctrl_private, dp_ctrl);
 
 	isr = dp_catalog_ctrl_get_interrupt(ctrl->catalog);
 
+
 	if (isr & DP_CTRL_INTR_READY_FOR_VIDEO) {
 		drm_dbg_dp(ctrl->drm_dev, "dp_video_ready\n");
 		complete(&ctrl->video_comp);
+		ret = IRQ_HANDLED;
 	}
 
 	if (isr & DP_CTRL_INTR_IDLE_PATTERN_SENT) {
 		drm_dbg_dp(ctrl->drm_dev, "idle_patterns_sent\n");
 		complete(&ctrl->idle_comp);
+		ret = IRQ_HANDLED;
 	}
+
+	return ret;
 }
 
 struct dp_ctrl *dp_ctrl_get(struct device *dev, struct dp_link *link,
diff --git a/drivers/gpu/drm/msm/dp/dp_ctrl.h b/drivers/gpu/drm/msm/dp/dp_ctrl.h
index 9f29734af81c..c3af06dc87b1 100644
--- a/drivers/gpu/drm/msm/dp/dp_ctrl.h
+++ b/drivers/gpu/drm/msm/dp/dp_ctrl.h
@@ -25,7 +25,7 @@ int dp_ctrl_off_link_stream(struct dp_ctrl *dp_ctrl);
 int dp_ctrl_off_link(struct dp_ctrl *dp_ctrl);
 int dp_ctrl_off(struct dp_ctrl *dp_ctrl);
 void dp_ctrl_push_idle(struct dp_ctrl *dp_ctrl);
-void dp_ctrl_isr(struct dp_ctrl *dp_ctrl);
+irqreturn_t dp_ctrl_isr(struct dp_ctrl *dp_ctrl);
 void dp_ctrl_handle_sink_request(struct dp_ctrl *dp_ctrl);
 struct dp_ctrl *dp_ctrl_get(struct device *dev, struct dp_link *link,
 			struct dp_panel *panel,	struct drm_dp_aux *aux,
diff --git a/drivers/gpu/drm/msm/dp/dp_display.c b/drivers/gpu/drm/msm/dp/dp_display.c
index 7ff60e5ff325..8996adbc5bd3 100644
--- a/drivers/gpu/drm/msm/dp/dp_display.c
+++ b/drivers/gpu/drm/msm/dp/dp_display.c
@@ -1192,7 +1192,7 @@ static int dp_hpd_event_thread_start(struct dp_display_private *dp_priv)
 static irqreturn_t dp_display_irq_handler(int irq, void *dev_id)
 {
 	struct dp_display_private *dp = dev_id;
-	irqreturn_t ret = IRQ_HANDLED;
+	irqreturn_t ret = IRQ_NONE;
 	u32 hpd_isr_status;
 
 	if (!dp) {
@@ -1220,13 +1220,15 @@ static irqreturn_t dp_display_irq_handler(int irq, void *dev_id)
 
 		if (hpd_isr_status & DP_DP_HPD_UNPLUG_INT_MASK)
 			dp_add_event(dp, EV_HPD_UNPLUG_INT, 0, 0);
+
+		ret = IRQ_HANDLED;
 	}
 
 	/* DP controller isr */
-	dp_ctrl_isr(dp->ctrl);
+	ret |= dp_ctrl_isr(dp->ctrl);
 
 	/* DP aux isr */
-	dp_aux_isr(dp->aux);
+	ret |= dp_aux_isr(dp->aux);
 
 	return ret;
 }
-- 
2.39.0.246.g2a6d74b583-goog


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH 2/2] drm/msm/dp: Return IRQ_NONE for unhandled interrupts
@ 2023-01-19 22:53   ` Douglas Anderson
  0 siblings, 0 replies; 14+ messages in thread
From: Douglas Anderson @ 2023-01-19 22:53 UTC (permalink / raw)
  To: Rob Clark, Abhinav Kumar, Dmitry Baryshkov
  Cc: Kuogee Hsieh, Stephen Boyd, Douglas Anderson, Bjorn Andersson,
	Daniel Vetter, David Airlie, Javier Martinez Canillas,
	Johan Hovold, Sankeerth Billakanti, Sean Paul, Thomas Zimmermann,
	dri-devel, freedreno, linux-arm-msm, linux-kernel

If our interrupt handler gets called and we don't really handle the
interrupt then we should return IRQ_NONE. The current interrupt
handler didn't do this, so let's fix it.

NOTE: for some of the cases it's clear that we should return IRQ_NONE
and some cases it's clear that we should return IRQ_HANDLED. However,
there are a few that fall somewhere in between. Specifically, the
documentation for when to return IRQ_NONE vs. IRQ_HANDLED is probably
best spelled out in the commit message of commit d9e4ad5badf4
("Document that IRQ_NONE should be returned when IRQ not actually
handled"). That commit makes it clear that we should return
IRQ_HANDLED if we've done something to make the interrupt stop
happening.

The case where it's unclear is, for instance, in dp_aux_isr() after
we've read the interrupt using dp_catalog_aux_get_irq() and confirmed
that "isr" is non-zero. The function dp_catalog_aux_get_irq() not only
reads the interrupts but it also "ack"s all the interrupts that are
returned. For an "unknown" interrupt this has a very good chance of
actually stopping the interrupt from happening. That would mean we've
identified that it's our device and done something to stop them from
happening and should return IRQ_HANDLED. Specifically, it should be
noted that most interrupts that need "ack"ing are ones that are
one-time events and doing an "ack" is enough to clear them. However,
since these interrupts are unknown then, by definition, it's unknown
if "ack"ing them is truly enough to clear them. It's possible that we
also need to remove the original source of the interrupt. In this
case, IRQ_NONE would be a better choice.

Given that returning an occasional IRQ_NONE isn't the absolute end of
the world, however, let's choose that course of action. The IRQ
framework will forgive a few IRQ_NONE returns now and again (and it
won't even log them, which is why we have to log them ourselves). This
means that if we _do_ end hitting an interrupt where "ack"ing isn't
enough the kernel will eventually detect the problem and shut our
device down.

Signed-off-by: Douglas Anderson <dianders@chromium.org>
---

 drivers/gpu/drm/msm/dp/dp_aux.c     | 12 +++++++-----
 drivers/gpu/drm/msm/dp/dp_aux.h     |  2 +-
 drivers/gpu/drm/msm/dp/dp_ctrl.c    | 10 ++++++++--
 drivers/gpu/drm/msm/dp/dp_ctrl.h    |  2 +-
 drivers/gpu/drm/msm/dp/dp_display.c |  8 +++++---
 5 files changed, 22 insertions(+), 12 deletions(-)

diff --git a/drivers/gpu/drm/msm/dp/dp_aux.c b/drivers/gpu/drm/msm/dp/dp_aux.c
index 34ad08ae6eb9..59e323b7499d 100644
--- a/drivers/gpu/drm/msm/dp/dp_aux.c
+++ b/drivers/gpu/drm/msm/dp/dp_aux.c
@@ -368,14 +368,14 @@ static ssize_t dp_aux_transfer(struct drm_dp_aux *dp_aux,
 	return ret;
 }
 
-void dp_aux_isr(struct drm_dp_aux *dp_aux)
+irqreturn_t dp_aux_isr(struct drm_dp_aux *dp_aux)
 {
 	u32 isr;
 	struct dp_aux_private *aux;
 
 	if (!dp_aux) {
 		DRM_ERROR("invalid input\n");
-		return;
+		return IRQ_NONE;
 	}
 
 	aux = container_of(dp_aux, struct dp_aux_private, dp_aux);
@@ -384,11 +384,11 @@ void dp_aux_isr(struct drm_dp_aux *dp_aux)
 
 	/* no interrupts pending, return immediately */
 	if (!isr)
-		return;
+		return IRQ_NONE;
 
 	if (!aux->cmd_busy) {
 		DRM_ERROR("Unexpected DP AUX IRQ %#010x when not busy\n", isr);
-		return;
+		return IRQ_NONE;
 	}
 
 	/*
@@ -420,10 +420,12 @@ void dp_aux_isr(struct drm_dp_aux *dp_aux)
 			aux->aux_error_num = DP_AUX_ERR_DEFER;
 	} else {
 		DRM_WARN("Unexpected interrupt: %#010x\n", isr);
-		return;
+		return IRQ_NONE;
 	}
 
 	complete(&aux->comp);
+
+	return IRQ_HANDLED;
 }
 
 void dp_aux_reconfig(struct drm_dp_aux *dp_aux)
diff --git a/drivers/gpu/drm/msm/dp/dp_aux.h b/drivers/gpu/drm/msm/dp/dp_aux.h
index e930974bcb5b..511305da4f66 100644
--- a/drivers/gpu/drm/msm/dp/dp_aux.h
+++ b/drivers/gpu/drm/msm/dp/dp_aux.h
@@ -11,7 +11,7 @@
 
 int dp_aux_register(struct drm_dp_aux *dp_aux);
 void dp_aux_unregister(struct drm_dp_aux *dp_aux);
-void dp_aux_isr(struct drm_dp_aux *dp_aux);
+irqreturn_t dp_aux_isr(struct drm_dp_aux *dp_aux);
 void dp_aux_init(struct drm_dp_aux *dp_aux);
 void dp_aux_deinit(struct drm_dp_aux *dp_aux);
 void dp_aux_reconfig(struct drm_dp_aux *dp_aux);
diff --git a/drivers/gpu/drm/msm/dp/dp_ctrl.c b/drivers/gpu/drm/msm/dp/dp_ctrl.c
index dd26ca651a05..1a5377ef1967 100644
--- a/drivers/gpu/drm/msm/dp/dp_ctrl.c
+++ b/drivers/gpu/drm/msm/dp/dp_ctrl.c
@@ -1979,27 +1979,33 @@ int dp_ctrl_off(struct dp_ctrl *dp_ctrl)
 	return ret;
 }
 
-void dp_ctrl_isr(struct dp_ctrl *dp_ctrl)
+irqreturn_t dp_ctrl_isr(struct dp_ctrl *dp_ctrl)
 {
 	struct dp_ctrl_private *ctrl;
 	u32 isr;
+	irqreturn_t ret = IRQ_NONE;
 
 	if (!dp_ctrl)
-		return;
+		return IRQ_NONE;
 
 	ctrl = container_of(dp_ctrl, struct dp_ctrl_private, dp_ctrl);
 
 	isr = dp_catalog_ctrl_get_interrupt(ctrl->catalog);
 
+
 	if (isr & DP_CTRL_INTR_READY_FOR_VIDEO) {
 		drm_dbg_dp(ctrl->drm_dev, "dp_video_ready\n");
 		complete(&ctrl->video_comp);
+		ret = IRQ_HANDLED;
 	}
 
 	if (isr & DP_CTRL_INTR_IDLE_PATTERN_SENT) {
 		drm_dbg_dp(ctrl->drm_dev, "idle_patterns_sent\n");
 		complete(&ctrl->idle_comp);
+		ret = IRQ_HANDLED;
 	}
+
+	return ret;
 }
 
 struct dp_ctrl *dp_ctrl_get(struct device *dev, struct dp_link *link,
diff --git a/drivers/gpu/drm/msm/dp/dp_ctrl.h b/drivers/gpu/drm/msm/dp/dp_ctrl.h
index 9f29734af81c..c3af06dc87b1 100644
--- a/drivers/gpu/drm/msm/dp/dp_ctrl.h
+++ b/drivers/gpu/drm/msm/dp/dp_ctrl.h
@@ -25,7 +25,7 @@ int dp_ctrl_off_link_stream(struct dp_ctrl *dp_ctrl);
 int dp_ctrl_off_link(struct dp_ctrl *dp_ctrl);
 int dp_ctrl_off(struct dp_ctrl *dp_ctrl);
 void dp_ctrl_push_idle(struct dp_ctrl *dp_ctrl);
-void dp_ctrl_isr(struct dp_ctrl *dp_ctrl);
+irqreturn_t dp_ctrl_isr(struct dp_ctrl *dp_ctrl);
 void dp_ctrl_handle_sink_request(struct dp_ctrl *dp_ctrl);
 struct dp_ctrl *dp_ctrl_get(struct device *dev, struct dp_link *link,
 			struct dp_panel *panel,	struct drm_dp_aux *aux,
diff --git a/drivers/gpu/drm/msm/dp/dp_display.c b/drivers/gpu/drm/msm/dp/dp_display.c
index 7ff60e5ff325..8996adbc5bd3 100644
--- a/drivers/gpu/drm/msm/dp/dp_display.c
+++ b/drivers/gpu/drm/msm/dp/dp_display.c
@@ -1192,7 +1192,7 @@ static int dp_hpd_event_thread_start(struct dp_display_private *dp_priv)
 static irqreturn_t dp_display_irq_handler(int irq, void *dev_id)
 {
 	struct dp_display_private *dp = dev_id;
-	irqreturn_t ret = IRQ_HANDLED;
+	irqreturn_t ret = IRQ_NONE;
 	u32 hpd_isr_status;
 
 	if (!dp) {
@@ -1220,13 +1220,15 @@ static irqreturn_t dp_display_irq_handler(int irq, void *dev_id)
 
 		if (hpd_isr_status & DP_DP_HPD_UNPLUG_INT_MASK)
 			dp_add_event(dp, EV_HPD_UNPLUG_INT, 0, 0);
+
+		ret = IRQ_HANDLED;
 	}
 
 	/* DP controller isr */
-	dp_ctrl_isr(dp->ctrl);
+	ret |= dp_ctrl_isr(dp->ctrl);
 
 	/* DP aux isr */
-	dp_aux_isr(dp->aux);
+	ret |= dp_aux_isr(dp->aux);
 
 	return ret;
 }
-- 
2.39.0.246.g2a6d74b583-goog


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* Re: [PATCH 1/2] drm/msm/dp: Clean up handling of DP AUX interrupts
  2023-01-19 22:53 ` Douglas Anderson
@ 2023-01-25 17:13   ` Kuogee Hsieh
  -1 siblings, 0 replies; 14+ messages in thread
From: Kuogee Hsieh @ 2023-01-25 17:13 UTC (permalink / raw)
  To: Douglas Anderson, Rob Clark, Abhinav Kumar, Dmitry Baryshkov
  Cc: freedreno, Sankeerth Billakanti, linux-kernel, dri-devel,
	Stephen Boyd, Thomas Zimmermann, linux-arm-msm, Sean Paul


On 1/19/2023 2:53 PM, Douglas Anderson wrote:
> The DP AUX interrupt handling was a bit of a mess.
> * There were two functions (one for "native" transfers and one for
>    "i2c" transfers) that were quite similar. It was hard to say how
>    many of the differences between the two functions were on purpose
>    and how many of them were just an accident of how they were coded.
> * Each function sometimes used "else if" to test for error bits and
>    sometimes didn't and again it was hard to say if this was on purpose
>    or just an accident.
> * The two functions wouldn't notice whether "unknown" bits were
>    set. For instance, there seems to be a bit "DP_INTR_PLL_UNLOCKED"
>    and if it was set there would be no indication.
> * The two functions wouldn't notice if more than one error was set.
>
> Let's fix this by being more consistent / explicit about what we're
> doing.
>
> By design this could cause different handling for AUX transfers,
> though I'm not actually aware of any bug fixed as a result of
> this patch (this patch was created because we simply noticed how odd
> the old code was by code inspection). Specific notes here:
> 1. In the old native transfer case if we got "done + wrong address"
>     we'd ignore the "wrong address" (because of the "else if"). Now we
>     won't.
> 2. In the old native transfer case if we got "done + timeout" we'd
>     ignore the "timeout" (because of the "else if"). Now we won't.
> 3. In the old native transfer case we'd see "nack_defer" and translate
>     it to the error number for "nack". This differed from the i2c
>     transfer case where "nack_defer" was given the error number for
>     "nack_defer". This 100% can't matter because the only user of this
>     error number treats "nack defer" the same as "nack", so it's clear
>     that the difference between the "native" and "i2c" was pointless
>     here.
> 4. In the old i2c transfer case if we got "done" plus any error
>     besides "nack" or "defer" then we'd ignore the error. Now we don't.
> 5. If there is more than one error signaled by the hardware it's
>     possible that we'll report a different one than we used to. I don't
>     know if this matters. If someone is aware of a case this matters we
>     should document it and change the code to make it explicit.
> 6. One quirk we keep (I don't know if this is important) is that in
>     the i2c transfer case if we see "done + defer" we report that as a
>     "nack". That seemed too intentional in the old code to just drop.
>
> After this change we will add extra logging, including:
> * A warning if we see more than one error bit set.
> * A warning if we see an unexpected interrupt.
> * A warning if we get an AUX transfer interrupt when shouldn't.
>
> It actually turns out that as a result of this change then at boot we
> sometimes see an error:
>    [drm:dp_aux_isr] *ERROR* Unexpected DP AUX IRQ 0x01000000 when not busy
> That means that, during init, we are seeing DP_INTR_PLL_UNLOCKED. For
> now I'm going to say that leaving this error reported in the logs is
> OK-ish and hopefully it will encourage someone to track down what's
> going on at init time.
>
> One last note here is that this change renames one of the interrupt
> bits. The bit named "i2c done" clearly was used for native transfers
> being done too, so I renamed it to indicate this.
>
> Signed-off-by: Douglas Anderson <dianders@chromium.org>
> ---
> I don't have good test coverage for this change and it does have the
> potential to change behavior. I confirmed that eDP and DP still
> continue to work OK on one machine. Hopefully folks can test it more.
>
>   drivers/gpu/drm/msm/dp/dp_aux.c     | 80 ++++++++++++-----------------
>   drivers/gpu/drm/msm/dp/dp_catalog.c |  2 +-
>   drivers/gpu/drm/msm/dp/dp_catalog.h |  2 +-
>   3 files changed, 36 insertions(+), 48 deletions(-)
>
> diff --git a/drivers/gpu/drm/msm/dp/dp_aux.c b/drivers/gpu/drm/msm/dp/dp_aux.c
> index cc3efed593aa..34ad08ae6eb9 100644
> --- a/drivers/gpu/drm/msm/dp/dp_aux.c
> +++ b/drivers/gpu/drm/msm/dp/dp_aux.c
> @@ -162,47 +162,6 @@ static ssize_t dp_aux_cmd_fifo_rx(struct dp_aux_private *aux,
>   	return i;
>   }
>   
> -static void dp_aux_native_handler(struct dp_aux_private *aux, u32 isr)
> -{
> -	if (isr & DP_INTR_AUX_I2C_DONE)
> -		aux->aux_error_num = DP_AUX_ERR_NONE;
> -	else if (isr & DP_INTR_WRONG_ADDR)
> -		aux->aux_error_num = DP_AUX_ERR_ADDR;
> -	else if (isr & DP_INTR_TIMEOUT)
> -		aux->aux_error_num = DP_AUX_ERR_TOUT;
> -	if (isr & DP_INTR_NACK_DEFER)
> -		aux->aux_error_num = DP_AUX_ERR_NACK;
> -	if (isr & DP_INTR_AUX_ERROR) {
> -		aux->aux_error_num = DP_AUX_ERR_PHY;
> -		dp_catalog_aux_clear_hw_interrupts(aux->catalog);
> -	}
> -}
> -
> -static void dp_aux_i2c_handler(struct dp_aux_private *aux, u32 isr)
> -{
> -	if (isr & DP_INTR_AUX_I2C_DONE) {
> -		if (isr & (DP_INTR_I2C_NACK | DP_INTR_I2C_DEFER))
> -			aux->aux_error_num = DP_AUX_ERR_NACK;
> -		else
> -			aux->aux_error_num = DP_AUX_ERR_NONE;
> -	} else {
> -		if (isr & DP_INTR_WRONG_ADDR)
> -			aux->aux_error_num = DP_AUX_ERR_ADDR;
> -		else if (isr & DP_INTR_TIMEOUT)
> -			aux->aux_error_num = DP_AUX_ERR_TOUT;
> -		if (isr & DP_INTR_NACK_DEFER)
> -			aux->aux_error_num = DP_AUX_ERR_NACK_DEFER;
> -		if (isr & DP_INTR_I2C_NACK)
> -			aux->aux_error_num = DP_AUX_ERR_NACK;
> -		if (isr & DP_INTR_I2C_DEFER)
> -			aux->aux_error_num = DP_AUX_ERR_DEFER;
> -		if (isr & DP_INTR_AUX_ERROR) {
> -			aux->aux_error_num = DP_AUX_ERR_PHY;
> -			dp_catalog_aux_clear_hw_interrupts(aux->catalog);
> -		}
> -	}
> -}
> -
>   static void dp_aux_update_offset_and_segment(struct dp_aux_private *aux,
>   					     struct drm_dp_aux_msg *input_msg)
>   {
> @@ -427,13 +386,42 @@ void dp_aux_isr(struct drm_dp_aux *dp_aux)
>   	if (!isr)
>   		return;
>   
> -	if (!aux->cmd_busy)
> +	if (!aux->cmd_busy) {
> +		DRM_ERROR("Unexpected DP AUX IRQ %#010x when not busy\n", isr);
>   		return;
> +	}
>   
> -	if (aux->native)
> -		dp_aux_native_handler(aux, isr);
> -	else
> -		dp_aux_i2c_handler(aux, isr);
> +	/*
> +	 * The logic below assumes only one error bit is set (other than "done"
> +	 * which can apparently be set at the same time as some of the other
> +	 * bits). Warn if more than one get set so we know we need to improve
> +	 * the logic.
> +	 */
> +	if (hweight32(isr & ~DP_INTR_AUX_XFER_DONE) > 1)
> +		DRM_WARN("Some DP AUX interrupts unhandled: %#010x\n", isr);
> +
> +	if (isr & DP_INTR_AUX_ERROR) {
> +		aux->aux_error_num = DP_AUX_ERR_PHY;
> +		dp_catalog_aux_clear_hw_interrupts(aux->catalog);
> +	} else if (isr & DP_INTR_NACK_DEFER) {
> +		aux->aux_error_num = DP_AUX_ERR_NACK_DEFER;
> +	} else if (isr & DP_INTR_WRONG_ADDR) {
> +		aux->aux_error_num = DP_AUX_ERR_ADDR;
> +	} else if (isr & DP_INTR_TIMEOUT) {
> +		aux->aux_error_num = DP_AUX_ERR_TOUT;
> +	} else if (isr & DP_INTR_AUX_XFER_DONE) {
> +		aux->aux_error_num = DP_AUX_ERR_NONE;


1) both DP_INTR_AUX_XFER_DONE and DP_INTR_I2C_NACK are set

2) both DP_INTR_AUX_XFER_DONE and DP_INTR_I2C_DEFER are set

with above two condition, below two "else if" will not be reached since 
DP_INTR_AUX_XFER_DONE is check with higher priority

> +	} else if (!aux->native && (isr & DP_INTR_I2C_NACK)) {
> +		aux->aux_error_num = DP_AUX_ERR_NACK;
> +	} else if (!aux->native && (isr & DP_INTR_I2C_DEFER)) {
> +		if (isr & DP_INTR_AUX_XFER_DONE)
> +			aux->aux_error_num = DP_AUX_ERR_NACK;
> +		else
> +			aux->aux_error_num = DP_AUX_ERR_DEFER;
> +	} else {
> +		DRM_WARN("Unexpected interrupt: %#010x\n", isr);
> +		return;
> +	}
>   
>   	complete(&aux->comp);
>   }
> diff --git a/drivers/gpu/drm/msm/dp/dp_catalog.c b/drivers/gpu/drm/msm/dp/dp_catalog.c
> index 676279d0ca8d..421391755427 100644
> --- a/drivers/gpu/drm/msm/dp/dp_catalog.c
> +++ b/drivers/gpu/drm/msm/dp/dp_catalog.c
> @@ -27,7 +27,7 @@
>   #define DP_INTF_CONFIG_DATABUS_WIDEN     BIT(4)
>   
>   #define DP_INTERRUPT_STATUS1 \
> -	(DP_INTR_AUX_I2C_DONE| \
> +	(DP_INTR_AUX_XFER_DONE| \
>   	DP_INTR_WRONG_ADDR | DP_INTR_TIMEOUT | \
>   	DP_INTR_NACK_DEFER | DP_INTR_WRONG_DATA_CNT | \
>   	DP_INTR_I2C_NACK | DP_INTR_I2C_DEFER | \
> diff --git a/drivers/gpu/drm/msm/dp/dp_catalog.h b/drivers/gpu/drm/msm/dp/dp_catalog.h
> index 1f717f45c115..f36b7b372a06 100644
> --- a/drivers/gpu/drm/msm/dp/dp_catalog.h
> +++ b/drivers/gpu/drm/msm/dp/dp_catalog.h
> @@ -13,7 +13,7 @@
>   
>   /* interrupts */
>   #define DP_INTR_HPD		BIT(0)
> -#define DP_INTR_AUX_I2C_DONE	BIT(3)
> +#define DP_INTR_AUX_XFER_DONE	BIT(3)
>   #define DP_INTR_WRONG_ADDR	BIT(6)
>   #define DP_INTR_TIMEOUT		BIT(9)
>   #define DP_INTR_NACK_DEFER	BIT(12)

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 1/2] drm/msm/dp: Clean up handling of DP AUX interrupts
@ 2023-01-25 17:13   ` Kuogee Hsieh
  0 siblings, 0 replies; 14+ messages in thread
From: Kuogee Hsieh @ 2023-01-25 17:13 UTC (permalink / raw)
  To: Douglas Anderson, Rob Clark, Abhinav Kumar, Dmitry Baryshkov
  Cc: Stephen Boyd, Daniel Vetter, David Airlie, Sankeerth Billakanti,
	Sean Paul, Thomas Zimmermann, dri-devel, freedreno,
	linux-arm-msm, linux-kernel


On 1/19/2023 2:53 PM, Douglas Anderson wrote:
> The DP AUX interrupt handling was a bit of a mess.
> * There were two functions (one for "native" transfers and one for
>    "i2c" transfers) that were quite similar. It was hard to say how
>    many of the differences between the two functions were on purpose
>    and how many of them were just an accident of how they were coded.
> * Each function sometimes used "else if" to test for error bits and
>    sometimes didn't and again it was hard to say if this was on purpose
>    or just an accident.
> * The two functions wouldn't notice whether "unknown" bits were
>    set. For instance, there seems to be a bit "DP_INTR_PLL_UNLOCKED"
>    and if it was set there would be no indication.
> * The two functions wouldn't notice if more than one error was set.
>
> Let's fix this by being more consistent / explicit about what we're
> doing.
>
> By design this could cause different handling for AUX transfers,
> though I'm not actually aware of any bug fixed as a result of
> this patch (this patch was created because we simply noticed how odd
> the old code was by code inspection). Specific notes here:
> 1. In the old native transfer case if we got "done + wrong address"
>     we'd ignore the "wrong address" (because of the "else if"). Now we
>     won't.
> 2. In the old native transfer case if we got "done + timeout" we'd
>     ignore the "timeout" (because of the "else if"). Now we won't.
> 3. In the old native transfer case we'd see "nack_defer" and translate
>     it to the error number for "nack". This differed from the i2c
>     transfer case where "nack_defer" was given the error number for
>     "nack_defer". This 100% can't matter because the only user of this
>     error number treats "nack defer" the same as "nack", so it's clear
>     that the difference between the "native" and "i2c" was pointless
>     here.
> 4. In the old i2c transfer case if we got "done" plus any error
>     besides "nack" or "defer" then we'd ignore the error. Now we don't.
> 5. If there is more than one error signaled by the hardware it's
>     possible that we'll report a different one than we used to. I don't
>     know if this matters. If someone is aware of a case this matters we
>     should document it and change the code to make it explicit.
> 6. One quirk we keep (I don't know if this is important) is that in
>     the i2c transfer case if we see "done + defer" we report that as a
>     "nack". That seemed too intentional in the old code to just drop.
>
> After this change we will add extra logging, including:
> * A warning if we see more than one error bit set.
> * A warning if we see an unexpected interrupt.
> * A warning if we get an AUX transfer interrupt when shouldn't.
>
> It actually turns out that as a result of this change then at boot we
> sometimes see an error:
>    [drm:dp_aux_isr] *ERROR* Unexpected DP AUX IRQ 0x01000000 when not busy
> That means that, during init, we are seeing DP_INTR_PLL_UNLOCKED. For
> now I'm going to say that leaving this error reported in the logs is
> OK-ish and hopefully it will encourage someone to track down what's
> going on at init time.
>
> One last note here is that this change renames one of the interrupt
> bits. The bit named "i2c done" clearly was used for native transfers
> being done too, so I renamed it to indicate this.
>
> Signed-off-by: Douglas Anderson <dianders@chromium.org>
> ---
> I don't have good test coverage for this change and it does have the
> potential to change behavior. I confirmed that eDP and DP still
> continue to work OK on one machine. Hopefully folks can test it more.
>
>   drivers/gpu/drm/msm/dp/dp_aux.c     | 80 ++++++++++++-----------------
>   drivers/gpu/drm/msm/dp/dp_catalog.c |  2 +-
>   drivers/gpu/drm/msm/dp/dp_catalog.h |  2 +-
>   3 files changed, 36 insertions(+), 48 deletions(-)
>
> diff --git a/drivers/gpu/drm/msm/dp/dp_aux.c b/drivers/gpu/drm/msm/dp/dp_aux.c
> index cc3efed593aa..34ad08ae6eb9 100644
> --- a/drivers/gpu/drm/msm/dp/dp_aux.c
> +++ b/drivers/gpu/drm/msm/dp/dp_aux.c
> @@ -162,47 +162,6 @@ static ssize_t dp_aux_cmd_fifo_rx(struct dp_aux_private *aux,
>   	return i;
>   }
>   
> -static void dp_aux_native_handler(struct dp_aux_private *aux, u32 isr)
> -{
> -	if (isr & DP_INTR_AUX_I2C_DONE)
> -		aux->aux_error_num = DP_AUX_ERR_NONE;
> -	else if (isr & DP_INTR_WRONG_ADDR)
> -		aux->aux_error_num = DP_AUX_ERR_ADDR;
> -	else if (isr & DP_INTR_TIMEOUT)
> -		aux->aux_error_num = DP_AUX_ERR_TOUT;
> -	if (isr & DP_INTR_NACK_DEFER)
> -		aux->aux_error_num = DP_AUX_ERR_NACK;
> -	if (isr & DP_INTR_AUX_ERROR) {
> -		aux->aux_error_num = DP_AUX_ERR_PHY;
> -		dp_catalog_aux_clear_hw_interrupts(aux->catalog);
> -	}
> -}
> -
> -static void dp_aux_i2c_handler(struct dp_aux_private *aux, u32 isr)
> -{
> -	if (isr & DP_INTR_AUX_I2C_DONE) {
> -		if (isr & (DP_INTR_I2C_NACK | DP_INTR_I2C_DEFER))
> -			aux->aux_error_num = DP_AUX_ERR_NACK;
> -		else
> -			aux->aux_error_num = DP_AUX_ERR_NONE;
> -	} else {
> -		if (isr & DP_INTR_WRONG_ADDR)
> -			aux->aux_error_num = DP_AUX_ERR_ADDR;
> -		else if (isr & DP_INTR_TIMEOUT)
> -			aux->aux_error_num = DP_AUX_ERR_TOUT;
> -		if (isr & DP_INTR_NACK_DEFER)
> -			aux->aux_error_num = DP_AUX_ERR_NACK_DEFER;
> -		if (isr & DP_INTR_I2C_NACK)
> -			aux->aux_error_num = DP_AUX_ERR_NACK;
> -		if (isr & DP_INTR_I2C_DEFER)
> -			aux->aux_error_num = DP_AUX_ERR_DEFER;
> -		if (isr & DP_INTR_AUX_ERROR) {
> -			aux->aux_error_num = DP_AUX_ERR_PHY;
> -			dp_catalog_aux_clear_hw_interrupts(aux->catalog);
> -		}
> -	}
> -}
> -
>   static void dp_aux_update_offset_and_segment(struct dp_aux_private *aux,
>   					     struct drm_dp_aux_msg *input_msg)
>   {
> @@ -427,13 +386,42 @@ void dp_aux_isr(struct drm_dp_aux *dp_aux)
>   	if (!isr)
>   		return;
>   
> -	if (!aux->cmd_busy)
> +	if (!aux->cmd_busy) {
> +		DRM_ERROR("Unexpected DP AUX IRQ %#010x when not busy\n", isr);
>   		return;
> +	}
>   
> -	if (aux->native)
> -		dp_aux_native_handler(aux, isr);
> -	else
> -		dp_aux_i2c_handler(aux, isr);
> +	/*
> +	 * The logic below assumes only one error bit is set (other than "done"
> +	 * which can apparently be set at the same time as some of the other
> +	 * bits). Warn if more than one get set so we know we need to improve
> +	 * the logic.
> +	 */
> +	if (hweight32(isr & ~DP_INTR_AUX_XFER_DONE) > 1)
> +		DRM_WARN("Some DP AUX interrupts unhandled: %#010x\n", isr);
> +
> +	if (isr & DP_INTR_AUX_ERROR) {
> +		aux->aux_error_num = DP_AUX_ERR_PHY;
> +		dp_catalog_aux_clear_hw_interrupts(aux->catalog);
> +	} else if (isr & DP_INTR_NACK_DEFER) {
> +		aux->aux_error_num = DP_AUX_ERR_NACK_DEFER;
> +	} else if (isr & DP_INTR_WRONG_ADDR) {
> +		aux->aux_error_num = DP_AUX_ERR_ADDR;
> +	} else if (isr & DP_INTR_TIMEOUT) {
> +		aux->aux_error_num = DP_AUX_ERR_TOUT;
> +	} else if (isr & DP_INTR_AUX_XFER_DONE) {
> +		aux->aux_error_num = DP_AUX_ERR_NONE;


1) both DP_INTR_AUX_XFER_DONE and DP_INTR_I2C_NACK are set

2) both DP_INTR_AUX_XFER_DONE and DP_INTR_I2C_DEFER are set

with above two condition, below two "else if" will not be reached since 
DP_INTR_AUX_XFER_DONE is check with higher priority

> +	} else if (!aux->native && (isr & DP_INTR_I2C_NACK)) {
> +		aux->aux_error_num = DP_AUX_ERR_NACK;
> +	} else if (!aux->native && (isr & DP_INTR_I2C_DEFER)) {
> +		if (isr & DP_INTR_AUX_XFER_DONE)
> +			aux->aux_error_num = DP_AUX_ERR_NACK;
> +		else
> +			aux->aux_error_num = DP_AUX_ERR_DEFER;
> +	} else {
> +		DRM_WARN("Unexpected interrupt: %#010x\n", isr);
> +		return;
> +	}
>   
>   	complete(&aux->comp);
>   }
> diff --git a/drivers/gpu/drm/msm/dp/dp_catalog.c b/drivers/gpu/drm/msm/dp/dp_catalog.c
> index 676279d0ca8d..421391755427 100644
> --- a/drivers/gpu/drm/msm/dp/dp_catalog.c
> +++ b/drivers/gpu/drm/msm/dp/dp_catalog.c
> @@ -27,7 +27,7 @@
>   #define DP_INTF_CONFIG_DATABUS_WIDEN     BIT(4)
>   
>   #define DP_INTERRUPT_STATUS1 \
> -	(DP_INTR_AUX_I2C_DONE| \
> +	(DP_INTR_AUX_XFER_DONE| \
>   	DP_INTR_WRONG_ADDR | DP_INTR_TIMEOUT | \
>   	DP_INTR_NACK_DEFER | DP_INTR_WRONG_DATA_CNT | \
>   	DP_INTR_I2C_NACK | DP_INTR_I2C_DEFER | \
> diff --git a/drivers/gpu/drm/msm/dp/dp_catalog.h b/drivers/gpu/drm/msm/dp/dp_catalog.h
> index 1f717f45c115..f36b7b372a06 100644
> --- a/drivers/gpu/drm/msm/dp/dp_catalog.h
> +++ b/drivers/gpu/drm/msm/dp/dp_catalog.h
> @@ -13,7 +13,7 @@
>   
>   /* interrupts */
>   #define DP_INTR_HPD		BIT(0)
> -#define DP_INTR_AUX_I2C_DONE	BIT(3)
> +#define DP_INTR_AUX_XFER_DONE	BIT(3)
>   #define DP_INTR_WRONG_ADDR	BIT(6)
>   #define DP_INTR_TIMEOUT		BIT(9)
>   #define DP_INTR_NACK_DEFER	BIT(12)

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 2/2] drm/msm/dp: Return IRQ_NONE for unhandled interrupts
  2023-01-19 22:53   ` Douglas Anderson
@ 2023-01-25 17:22     ` Kuogee Hsieh
  -1 siblings, 0 replies; 14+ messages in thread
From: Kuogee Hsieh @ 2023-01-25 17:22 UTC (permalink / raw)
  To: Douglas Anderson, Rob Clark, Abhinav Kumar, Dmitry Baryshkov
  Cc: Stephen Boyd, Bjorn Andersson, Daniel Vetter, David Airlie,
	Javier Martinez Canillas, Johan Hovold, Sankeerth Billakanti,
	Sean Paul, Thomas Zimmermann, dri-devel, freedreno,
	linux-arm-msm, linux-kernel


On 1/19/2023 2:53 PM, Douglas Anderson wrote:
> If our interrupt handler gets called and we don't really handle the
> interrupt then we should return IRQ_NONE. The current interrupt
> handler didn't do this, so let's fix it.
>
> NOTE: for some of the cases it's clear that we should return IRQ_NONE
> and some cases it's clear that we should return IRQ_HANDLED. However,
> there are a few that fall somewhere in between. Specifically, the
> documentation for when to return IRQ_NONE vs. IRQ_HANDLED is probably
> best spelled out in the commit message of commit d9e4ad5badf4
> ("Document that IRQ_NONE should be returned when IRQ not actually
> handled"). That commit makes it clear that we should return
> IRQ_HANDLED if we've done something to make the interrupt stop
> happening.
>
> The case where it's unclear is, for instance, in dp_aux_isr() after
> we've read the interrupt using dp_catalog_aux_get_irq() and confirmed
> that "isr" is non-zero. The function dp_catalog_aux_get_irq() not only
> reads the interrupts but it also "ack"s all the interrupts that are
> returned. For an "unknown" interrupt this has a very good chance of
> actually stopping the interrupt from happening. That would mean we've
> identified that it's our device and done something to stop them from
> happening and should return IRQ_HANDLED. Specifically, it should be
> noted that most interrupts that need "ack"ing are ones that are
> one-time events and doing an "ack" is enough to clear them. However,
> since these interrupts are unknown then, by definition, it's unknown
> if "ack"ing them is truly enough to clear them. It's possible that we
> also need to remove the original source of the interrupt. In this
> case, IRQ_NONE would be a better choice.
>
> Given that returning an occasional IRQ_NONE isn't the absolute end of
> the world, however, let's choose that course of action. The IRQ
> framework will forgive a few IRQ_NONE returns now and again (and it
> won't even log them, which is why we have to log them ourselves). This
> means that if we _do_ end hitting an interrupt where "ack"ing isn't
> enough the kernel will eventually detect the problem and shut our
> device down.
>
> Signed-off-by: Douglas Anderson <dianders@chromium.org>
> ---
>
>   drivers/gpu/drm/msm/dp/dp_aux.c     | 12 +++++++-----
>   drivers/gpu/drm/msm/dp/dp_aux.h     |  2 +-
>   drivers/gpu/drm/msm/dp/dp_ctrl.c    | 10 ++++++++--
>   drivers/gpu/drm/msm/dp/dp_ctrl.h    |  2 +-
>   drivers/gpu/drm/msm/dp/dp_display.c |  8 +++++---
>   5 files changed, 22 insertions(+), 12 deletions(-)
>
> diff --git a/drivers/gpu/drm/msm/dp/dp_aux.c b/drivers/gpu/drm/msm/dp/dp_aux.c
> index 34ad08ae6eb9..59e323b7499d 100644
> --- a/drivers/gpu/drm/msm/dp/dp_aux.c
> +++ b/drivers/gpu/drm/msm/dp/dp_aux.c
> @@ -368,14 +368,14 @@ static ssize_t dp_aux_transfer(struct drm_dp_aux *dp_aux,
>   	return ret;
>   }
>   
> -void dp_aux_isr(struct drm_dp_aux *dp_aux)
> +irqreturn_t dp_aux_isr(struct drm_dp_aux *dp_aux)
>   {
>   	u32 isr;
>   	struct dp_aux_private *aux;
>   
>   	if (!dp_aux) {
>   		DRM_ERROR("invalid input\n");
> -		return;
> +		return IRQ_NONE;
>   	}
>   
>   	aux = container_of(dp_aux, struct dp_aux_private, dp_aux);
> @@ -384,11 +384,11 @@ void dp_aux_isr(struct drm_dp_aux *dp_aux)
>   
>   	/* no interrupts pending, return immediately */
>   	if (!isr)
> -		return;
> +		return IRQ_NONE;
>   
>   	if (!aux->cmd_busy) {
>   		DRM_ERROR("Unexpected DP AUX IRQ %#010x when not busy\n", isr);
> -		return;
> +		return IRQ_NONE;
>   	}
>   
>   	/*
> @@ -420,10 +420,12 @@ void dp_aux_isr(struct drm_dp_aux *dp_aux)
>   			aux->aux_error_num = DP_AUX_ERR_DEFER;
>   	} else {
>   		DRM_WARN("Unexpected interrupt: %#010x\n", isr);
> -		return;
> +		return IRQ_NONE;
>   	}
>   
>   	complete(&aux->comp);
> +
> +	return IRQ_HANDLED;
>   }
>   
>   void dp_aux_reconfig(struct drm_dp_aux *dp_aux)
> diff --git a/drivers/gpu/drm/msm/dp/dp_aux.h b/drivers/gpu/drm/msm/dp/dp_aux.h
> index e930974bcb5b..511305da4f66 100644
> --- a/drivers/gpu/drm/msm/dp/dp_aux.h
> +++ b/drivers/gpu/drm/msm/dp/dp_aux.h
> @@ -11,7 +11,7 @@
>   
>   int dp_aux_register(struct drm_dp_aux *dp_aux);
>   void dp_aux_unregister(struct drm_dp_aux *dp_aux);
> -void dp_aux_isr(struct drm_dp_aux *dp_aux);
> +irqreturn_t dp_aux_isr(struct drm_dp_aux *dp_aux);
>   void dp_aux_init(struct drm_dp_aux *dp_aux);
>   void dp_aux_deinit(struct drm_dp_aux *dp_aux);
>   void dp_aux_reconfig(struct drm_dp_aux *dp_aux);
> diff --git a/drivers/gpu/drm/msm/dp/dp_ctrl.c b/drivers/gpu/drm/msm/dp/dp_ctrl.c
> index dd26ca651a05..1a5377ef1967 100644
> --- a/drivers/gpu/drm/msm/dp/dp_ctrl.c
> +++ b/drivers/gpu/drm/msm/dp/dp_ctrl.c
> @@ -1979,27 +1979,33 @@ int dp_ctrl_off(struct dp_ctrl *dp_ctrl)
>   	return ret;
>   }
>   
> -void dp_ctrl_isr(struct dp_ctrl *dp_ctrl)
> +irqreturn_t dp_ctrl_isr(struct dp_ctrl *dp_ctrl)
>   {
>   	struct dp_ctrl_private *ctrl;
>   	u32 isr;
> +	irqreturn_t ret = IRQ_NONE;
>   
>   	if (!dp_ctrl)
> -		return;
> +		return IRQ_NONE;
>   
>   	ctrl = container_of(dp_ctrl, struct dp_ctrl_private, dp_ctrl);
>   
>   	isr = dp_catalog_ctrl_get_interrupt(ctrl->catalog);
can you add (!isr) check and return IRQ_NONE here to be consistent with 
dp_aux_isr()?
>
> +
>   	if (isr & DP_CTRL_INTR_READY_FOR_VIDEO) {
>   		drm_dbg_dp(ctrl->drm_dev, "dp_video_ready\n");
>   		complete(&ctrl->video_comp);
> +		ret = IRQ_HANDLED;
>   	}
>   
>   	if (isr & DP_CTRL_INTR_IDLE_PATTERN_SENT) {
>   		drm_dbg_dp(ctrl->drm_dev, "idle_patterns_sent\n");
>   		complete(&ctrl->idle_comp);
> +		ret = IRQ_HANDLED;
>   	}
> +
> +	return ret;
>   }
>   
>   struct dp_ctrl *dp_ctrl_get(struct device *dev, struct dp_link *link,
> diff --git a/drivers/gpu/drm/msm/dp/dp_ctrl.h b/drivers/gpu/drm/msm/dp/dp_ctrl.h
> index 9f29734af81c..c3af06dc87b1 100644
> --- a/drivers/gpu/drm/msm/dp/dp_ctrl.h
> +++ b/drivers/gpu/drm/msm/dp/dp_ctrl.h
> @@ -25,7 +25,7 @@ int dp_ctrl_off_link_stream(struct dp_ctrl *dp_ctrl);
>   int dp_ctrl_off_link(struct dp_ctrl *dp_ctrl);
>   int dp_ctrl_off(struct dp_ctrl *dp_ctrl);
>   void dp_ctrl_push_idle(struct dp_ctrl *dp_ctrl);
> -void dp_ctrl_isr(struct dp_ctrl *dp_ctrl);
> +irqreturn_t dp_ctrl_isr(struct dp_ctrl *dp_ctrl);
>   void dp_ctrl_handle_sink_request(struct dp_ctrl *dp_ctrl);
>   struct dp_ctrl *dp_ctrl_get(struct device *dev, struct dp_link *link,
>   			struct dp_panel *panel,	struct drm_dp_aux *aux,
> diff --git a/drivers/gpu/drm/msm/dp/dp_display.c b/drivers/gpu/drm/msm/dp/dp_display.c
> index 7ff60e5ff325..8996adbc5bd3 100644
> --- a/drivers/gpu/drm/msm/dp/dp_display.c
> +++ b/drivers/gpu/drm/msm/dp/dp_display.c
> @@ -1192,7 +1192,7 @@ static int dp_hpd_event_thread_start(struct dp_display_private *dp_priv)
>   static irqreturn_t dp_display_irq_handler(int irq, void *dev_id)
>   {
>   	struct dp_display_private *dp = dev_id;
> -	irqreturn_t ret = IRQ_HANDLED;
> +	irqreturn_t ret = IRQ_NONE;
>   	u32 hpd_isr_status;
>   
>   	if (!dp) {
> @@ -1220,13 +1220,15 @@ static irqreturn_t dp_display_irq_handler(int irq, void *dev_id)
>   
>   		if (hpd_isr_status & DP_DP_HPD_UNPLUG_INT_MASK)
>   			dp_add_event(dp, EV_HPD_UNPLUG_INT, 0, 0);
> +
> +		ret = IRQ_HANDLED;
>   	}
>   
>   	/* DP controller isr */
> -	dp_ctrl_isr(dp->ctrl);
> +	ret |= dp_ctrl_isr(dp->ctrl);
>   
>   	/* DP aux isr */
> -	dp_aux_isr(dp->aux);
> +	ret |= dp_aux_isr(dp->aux);
>   
>   	return ret;
>   }

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 2/2] drm/msm/dp: Return IRQ_NONE for unhandled interrupts
@ 2023-01-25 17:22     ` Kuogee Hsieh
  0 siblings, 0 replies; 14+ messages in thread
From: Kuogee Hsieh @ 2023-01-25 17:22 UTC (permalink / raw)
  To: Douglas Anderson, Rob Clark, Abhinav Kumar, Dmitry Baryshkov
  Cc: freedreno, Sankeerth Billakanti, linux-kernel, Thomas Zimmermann,
	Sean Paul, Bjorn Andersson, Javier Martinez Canillas, dri-devel,
	Stephen Boyd, linux-arm-msm, Johan Hovold


On 1/19/2023 2:53 PM, Douglas Anderson wrote:
> If our interrupt handler gets called and we don't really handle the
> interrupt then we should return IRQ_NONE. The current interrupt
> handler didn't do this, so let's fix it.
>
> NOTE: for some of the cases it's clear that we should return IRQ_NONE
> and some cases it's clear that we should return IRQ_HANDLED. However,
> there are a few that fall somewhere in between. Specifically, the
> documentation for when to return IRQ_NONE vs. IRQ_HANDLED is probably
> best spelled out in the commit message of commit d9e4ad5badf4
> ("Document that IRQ_NONE should be returned when IRQ not actually
> handled"). That commit makes it clear that we should return
> IRQ_HANDLED if we've done something to make the interrupt stop
> happening.
>
> The case where it's unclear is, for instance, in dp_aux_isr() after
> we've read the interrupt using dp_catalog_aux_get_irq() and confirmed
> that "isr" is non-zero. The function dp_catalog_aux_get_irq() not only
> reads the interrupts but it also "ack"s all the interrupts that are
> returned. For an "unknown" interrupt this has a very good chance of
> actually stopping the interrupt from happening. That would mean we've
> identified that it's our device and done something to stop them from
> happening and should return IRQ_HANDLED. Specifically, it should be
> noted that most interrupts that need "ack"ing are ones that are
> one-time events and doing an "ack" is enough to clear them. However,
> since these interrupts are unknown then, by definition, it's unknown
> if "ack"ing them is truly enough to clear them. It's possible that we
> also need to remove the original source of the interrupt. In this
> case, IRQ_NONE would be a better choice.
>
> Given that returning an occasional IRQ_NONE isn't the absolute end of
> the world, however, let's choose that course of action. The IRQ
> framework will forgive a few IRQ_NONE returns now and again (and it
> won't even log them, which is why we have to log them ourselves). This
> means that if we _do_ end hitting an interrupt where "ack"ing isn't
> enough the kernel will eventually detect the problem and shut our
> device down.
>
> Signed-off-by: Douglas Anderson <dianders@chromium.org>
> ---
>
>   drivers/gpu/drm/msm/dp/dp_aux.c     | 12 +++++++-----
>   drivers/gpu/drm/msm/dp/dp_aux.h     |  2 +-
>   drivers/gpu/drm/msm/dp/dp_ctrl.c    | 10 ++++++++--
>   drivers/gpu/drm/msm/dp/dp_ctrl.h    |  2 +-
>   drivers/gpu/drm/msm/dp/dp_display.c |  8 +++++---
>   5 files changed, 22 insertions(+), 12 deletions(-)
>
> diff --git a/drivers/gpu/drm/msm/dp/dp_aux.c b/drivers/gpu/drm/msm/dp/dp_aux.c
> index 34ad08ae6eb9..59e323b7499d 100644
> --- a/drivers/gpu/drm/msm/dp/dp_aux.c
> +++ b/drivers/gpu/drm/msm/dp/dp_aux.c
> @@ -368,14 +368,14 @@ static ssize_t dp_aux_transfer(struct drm_dp_aux *dp_aux,
>   	return ret;
>   }
>   
> -void dp_aux_isr(struct drm_dp_aux *dp_aux)
> +irqreturn_t dp_aux_isr(struct drm_dp_aux *dp_aux)
>   {
>   	u32 isr;
>   	struct dp_aux_private *aux;
>   
>   	if (!dp_aux) {
>   		DRM_ERROR("invalid input\n");
> -		return;
> +		return IRQ_NONE;
>   	}
>   
>   	aux = container_of(dp_aux, struct dp_aux_private, dp_aux);
> @@ -384,11 +384,11 @@ void dp_aux_isr(struct drm_dp_aux *dp_aux)
>   
>   	/* no interrupts pending, return immediately */
>   	if (!isr)
> -		return;
> +		return IRQ_NONE;
>   
>   	if (!aux->cmd_busy) {
>   		DRM_ERROR("Unexpected DP AUX IRQ %#010x when not busy\n", isr);
> -		return;
> +		return IRQ_NONE;
>   	}
>   
>   	/*
> @@ -420,10 +420,12 @@ void dp_aux_isr(struct drm_dp_aux *dp_aux)
>   			aux->aux_error_num = DP_AUX_ERR_DEFER;
>   	} else {
>   		DRM_WARN("Unexpected interrupt: %#010x\n", isr);
> -		return;
> +		return IRQ_NONE;
>   	}
>   
>   	complete(&aux->comp);
> +
> +	return IRQ_HANDLED;
>   }
>   
>   void dp_aux_reconfig(struct drm_dp_aux *dp_aux)
> diff --git a/drivers/gpu/drm/msm/dp/dp_aux.h b/drivers/gpu/drm/msm/dp/dp_aux.h
> index e930974bcb5b..511305da4f66 100644
> --- a/drivers/gpu/drm/msm/dp/dp_aux.h
> +++ b/drivers/gpu/drm/msm/dp/dp_aux.h
> @@ -11,7 +11,7 @@
>   
>   int dp_aux_register(struct drm_dp_aux *dp_aux);
>   void dp_aux_unregister(struct drm_dp_aux *dp_aux);
> -void dp_aux_isr(struct drm_dp_aux *dp_aux);
> +irqreturn_t dp_aux_isr(struct drm_dp_aux *dp_aux);
>   void dp_aux_init(struct drm_dp_aux *dp_aux);
>   void dp_aux_deinit(struct drm_dp_aux *dp_aux);
>   void dp_aux_reconfig(struct drm_dp_aux *dp_aux);
> diff --git a/drivers/gpu/drm/msm/dp/dp_ctrl.c b/drivers/gpu/drm/msm/dp/dp_ctrl.c
> index dd26ca651a05..1a5377ef1967 100644
> --- a/drivers/gpu/drm/msm/dp/dp_ctrl.c
> +++ b/drivers/gpu/drm/msm/dp/dp_ctrl.c
> @@ -1979,27 +1979,33 @@ int dp_ctrl_off(struct dp_ctrl *dp_ctrl)
>   	return ret;
>   }
>   
> -void dp_ctrl_isr(struct dp_ctrl *dp_ctrl)
> +irqreturn_t dp_ctrl_isr(struct dp_ctrl *dp_ctrl)
>   {
>   	struct dp_ctrl_private *ctrl;
>   	u32 isr;
> +	irqreturn_t ret = IRQ_NONE;
>   
>   	if (!dp_ctrl)
> -		return;
> +		return IRQ_NONE;
>   
>   	ctrl = container_of(dp_ctrl, struct dp_ctrl_private, dp_ctrl);
>   
>   	isr = dp_catalog_ctrl_get_interrupt(ctrl->catalog);
can you add (!isr) check and return IRQ_NONE here to be consistent with 
dp_aux_isr()?
>
> +
>   	if (isr & DP_CTRL_INTR_READY_FOR_VIDEO) {
>   		drm_dbg_dp(ctrl->drm_dev, "dp_video_ready\n");
>   		complete(&ctrl->video_comp);
> +		ret = IRQ_HANDLED;
>   	}
>   
>   	if (isr & DP_CTRL_INTR_IDLE_PATTERN_SENT) {
>   		drm_dbg_dp(ctrl->drm_dev, "idle_patterns_sent\n");
>   		complete(&ctrl->idle_comp);
> +		ret = IRQ_HANDLED;
>   	}
> +
> +	return ret;
>   }
>   
>   struct dp_ctrl *dp_ctrl_get(struct device *dev, struct dp_link *link,
> diff --git a/drivers/gpu/drm/msm/dp/dp_ctrl.h b/drivers/gpu/drm/msm/dp/dp_ctrl.h
> index 9f29734af81c..c3af06dc87b1 100644
> --- a/drivers/gpu/drm/msm/dp/dp_ctrl.h
> +++ b/drivers/gpu/drm/msm/dp/dp_ctrl.h
> @@ -25,7 +25,7 @@ int dp_ctrl_off_link_stream(struct dp_ctrl *dp_ctrl);
>   int dp_ctrl_off_link(struct dp_ctrl *dp_ctrl);
>   int dp_ctrl_off(struct dp_ctrl *dp_ctrl);
>   void dp_ctrl_push_idle(struct dp_ctrl *dp_ctrl);
> -void dp_ctrl_isr(struct dp_ctrl *dp_ctrl);
> +irqreturn_t dp_ctrl_isr(struct dp_ctrl *dp_ctrl);
>   void dp_ctrl_handle_sink_request(struct dp_ctrl *dp_ctrl);
>   struct dp_ctrl *dp_ctrl_get(struct device *dev, struct dp_link *link,
>   			struct dp_panel *panel,	struct drm_dp_aux *aux,
> diff --git a/drivers/gpu/drm/msm/dp/dp_display.c b/drivers/gpu/drm/msm/dp/dp_display.c
> index 7ff60e5ff325..8996adbc5bd3 100644
> --- a/drivers/gpu/drm/msm/dp/dp_display.c
> +++ b/drivers/gpu/drm/msm/dp/dp_display.c
> @@ -1192,7 +1192,7 @@ static int dp_hpd_event_thread_start(struct dp_display_private *dp_priv)
>   static irqreturn_t dp_display_irq_handler(int irq, void *dev_id)
>   {
>   	struct dp_display_private *dp = dev_id;
> -	irqreturn_t ret = IRQ_HANDLED;
> +	irqreturn_t ret = IRQ_NONE;
>   	u32 hpd_isr_status;
>   
>   	if (!dp) {
> @@ -1220,13 +1220,15 @@ static irqreturn_t dp_display_irq_handler(int irq, void *dev_id)
>   
>   		if (hpd_isr_status & DP_DP_HPD_UNPLUG_INT_MASK)
>   			dp_add_event(dp, EV_HPD_UNPLUG_INT, 0, 0);
> +
> +		ret = IRQ_HANDLED;
>   	}
>   
>   	/* DP controller isr */
> -	dp_ctrl_isr(dp->ctrl);
> +	ret |= dp_ctrl_isr(dp->ctrl);
>   
>   	/* DP aux isr */
> -	dp_aux_isr(dp->aux);
> +	ret |= dp_aux_isr(dp->aux);
>   
>   	return ret;
>   }

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 2/2] drm/msm/dp: Return IRQ_NONE for unhandled interrupts
  2023-01-25 17:22     ` Kuogee Hsieh
@ 2023-01-25 18:21       ` Doug Anderson
  -1 siblings, 0 replies; 14+ messages in thread
From: Doug Anderson @ 2023-01-25 18:21 UTC (permalink / raw)
  To: Kuogee Hsieh
  Cc: Rob Clark, Abhinav Kumar, Dmitry Baryshkov, Stephen Boyd,
	Bjorn Andersson, Daniel Vetter, David Airlie,
	Javier Martinez Canillas, Johan Hovold, Sankeerth Billakanti,
	Sean Paul, Thomas Zimmermann, dri-devel, freedreno,
	linux-arm-msm, linux-kernel

Hi,

On Wed, Jan 25, 2023 at 9:22 AM Kuogee Hsieh <quic_khsieh@quicinc.com> wrote:
>
> > -void dp_ctrl_isr(struct dp_ctrl *dp_ctrl)
> > +irqreturn_t dp_ctrl_isr(struct dp_ctrl *dp_ctrl)
> >   {
> >       struct dp_ctrl_private *ctrl;
> >       u32 isr;
> > +     irqreturn_t ret = IRQ_NONE;
> >
> >       if (!dp_ctrl)
> > -             return;
> > +             return IRQ_NONE;
> >
> >       ctrl = container_of(dp_ctrl, struct dp_ctrl_private, dp_ctrl);
> >
> >       isr = dp_catalog_ctrl_get_interrupt(ctrl->catalog);
> can you add (!isr) check and return IRQ_NONE here to be consistent with
> dp_aux_isr()?

I could, though it doesn't really buy us a whole lot in this case and
just adds an extra test that's not needed. Here it should be easy for
someone reading the function to see that if "isr == 0" that neither of
the two "if" statements below will fire and we'll return "IRQ_NONE"
anyway.

...that actually made me go back and wonder whether we still needed
the "if" test in dp_aux_isr() or if it too was also redundant. It
turns out that it's not! The previous patch made dp_aux_irq() detect
unexpected interrupts. Thus the "if (!isr)" test earlier is important
because otherwise we'd end up WARNing "Unexpected interrupt:
0x00000000" which would be confusing.

So unless you or others feel strongly that I should add the redundant
test here, I'd rather keep it off. Let me know.

-Doug

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 2/2] drm/msm/dp: Return IRQ_NONE for unhandled interrupts
@ 2023-01-25 18:21       ` Doug Anderson
  0 siblings, 0 replies; 14+ messages in thread
From: Doug Anderson @ 2023-01-25 18:21 UTC (permalink / raw)
  To: Kuogee Hsieh
  Cc: freedreno, Sankeerth Billakanti, linux-kernel, Thomas Zimmermann,
	Sean Paul, Bjorn Andersson, Abhinav Kumar, dri-devel,
	Stephen Boyd, linux-arm-msm, Dmitry Baryshkov,
	Javier Martinez Canillas, Johan Hovold

Hi,

On Wed, Jan 25, 2023 at 9:22 AM Kuogee Hsieh <quic_khsieh@quicinc.com> wrote:
>
> > -void dp_ctrl_isr(struct dp_ctrl *dp_ctrl)
> > +irqreturn_t dp_ctrl_isr(struct dp_ctrl *dp_ctrl)
> >   {
> >       struct dp_ctrl_private *ctrl;
> >       u32 isr;
> > +     irqreturn_t ret = IRQ_NONE;
> >
> >       if (!dp_ctrl)
> > -             return;
> > +             return IRQ_NONE;
> >
> >       ctrl = container_of(dp_ctrl, struct dp_ctrl_private, dp_ctrl);
> >
> >       isr = dp_catalog_ctrl_get_interrupt(ctrl->catalog);
> can you add (!isr) check and return IRQ_NONE here to be consistent with
> dp_aux_isr()?

I could, though it doesn't really buy us a whole lot in this case and
just adds an extra test that's not needed. Here it should be easy for
someone reading the function to see that if "isr == 0" that neither of
the two "if" statements below will fire and we'll return "IRQ_NONE"
anyway.

...that actually made me go back and wonder whether we still needed
the "if" test in dp_aux_isr() or if it too was also redundant. It
turns out that it's not! The previous patch made dp_aux_irq() detect
unexpected interrupts. Thus the "if (!isr)" test earlier is important
because otherwise we'd end up WARNing "Unexpected interrupt:
0x00000000" which would be confusing.

So unless you or others feel strongly that I should add the redundant
test here, I'd rather keep it off. Let me know.

-Doug

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 2/2] drm/msm/dp: Return IRQ_NONE for unhandled interrupts
  2023-01-25 18:21       ` Doug Anderson
@ 2023-01-25 23:36         ` Kuogee Hsieh
  -1 siblings, 0 replies; 14+ messages in thread
From: Kuogee Hsieh @ 2023-01-25 23:36 UTC (permalink / raw)
  To: Doug Anderson
  Cc: freedreno, Sankeerth Billakanti, linux-kernel, Thomas Zimmermann,
	Sean Paul, Bjorn Andersson, Abhinav Kumar, dri-devel,
	Stephen Boyd, linux-arm-msm, Dmitry Baryshkov,
	Javier Martinez Canillas, Johan Hovold


On 1/25/2023 10:21 AM, Doug Anderson wrote:
> Hi,
>
> On Wed, Jan 25, 2023 at 9:22 AM Kuogee Hsieh <quic_khsieh@quicinc.com> wrote:
>>> -void dp_ctrl_isr(struct dp_ctrl *dp_ctrl)
>>> +irqreturn_t dp_ctrl_isr(struct dp_ctrl *dp_ctrl)
>>>    {
>>>        struct dp_ctrl_private *ctrl;
>>>        u32 isr;
>>> +     irqreturn_t ret = IRQ_NONE;
>>>
>>>        if (!dp_ctrl)
>>> -             return;
>>> +             return IRQ_NONE;
>>>
>>>        ctrl = container_of(dp_ctrl, struct dp_ctrl_private, dp_ctrl);
>>>
>>>        isr = dp_catalog_ctrl_get_interrupt(ctrl->catalog);
>> can you add (!isr) check and return IRQ_NONE here to be consistent with
>> dp_aux_isr()?
> I could, though it doesn't really buy us a whole lot in this case and
> just adds an extra test that's not needed. Here it should be easy for
> someone reading the function to see that if "isr == 0" that neither of
> the two "if" statements below will fire and we'll return "IRQ_NONE"
> anyway.
>
> ...that actually made me go back and wonder whether we still needed
> the "if" test in dp_aux_isr() or if it too was also redundant. It
> turns out that it's not! The previous patch made dp_aux_irq() detect
> unexpected interrupts. Thus the "if (!isr)" test earlier is important
> because otherwise we'd end up WARNing "Unexpected interrupt:
> 0x00000000" which would be confusing.
>
> So unless you or others feel strongly that I should add the redundant
> test here, I'd rather keep it off. Let me know.
>
> -Doug
ack

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 2/2] drm/msm/dp: Return IRQ_NONE for unhandled interrupts
@ 2023-01-25 23:36         ` Kuogee Hsieh
  0 siblings, 0 replies; 14+ messages in thread
From: Kuogee Hsieh @ 2023-01-25 23:36 UTC (permalink / raw)
  To: Doug Anderson
  Cc: Rob Clark, Abhinav Kumar, Dmitry Baryshkov, Stephen Boyd,
	Bjorn Andersson, Daniel Vetter, David Airlie,
	Javier Martinez Canillas, Johan Hovold, Sankeerth Billakanti,
	Sean Paul, Thomas Zimmermann, dri-devel, freedreno,
	linux-arm-msm, linux-kernel


On 1/25/2023 10:21 AM, Doug Anderson wrote:
> Hi,
>
> On Wed, Jan 25, 2023 at 9:22 AM Kuogee Hsieh <quic_khsieh@quicinc.com> wrote:
>>> -void dp_ctrl_isr(struct dp_ctrl *dp_ctrl)
>>> +irqreturn_t dp_ctrl_isr(struct dp_ctrl *dp_ctrl)
>>>    {
>>>        struct dp_ctrl_private *ctrl;
>>>        u32 isr;
>>> +     irqreturn_t ret = IRQ_NONE;
>>>
>>>        if (!dp_ctrl)
>>> -             return;
>>> +             return IRQ_NONE;
>>>
>>>        ctrl = container_of(dp_ctrl, struct dp_ctrl_private, dp_ctrl);
>>>
>>>        isr = dp_catalog_ctrl_get_interrupt(ctrl->catalog);
>> can you add (!isr) check and return IRQ_NONE here to be consistent with
>> dp_aux_isr()?
> I could, though it doesn't really buy us a whole lot in this case and
> just adds an extra test that's not needed. Here it should be easy for
> someone reading the function to see that if "isr == 0" that neither of
> the two "if" statements below will fire and we'll return "IRQ_NONE"
> anyway.
>
> ...that actually made me go back and wonder whether we still needed
> the "if" test in dp_aux_isr() or if it too was also redundant. It
> turns out that it's not! The previous patch made dp_aux_irq() detect
> unexpected interrupts. Thus the "if (!isr)" test earlier is important
> because otherwise we'd end up WARNing "Unexpected interrupt:
> 0x00000000" which would be confusing.
>
> So unless you or others feel strongly that I should add the redundant
> test here, I'd rather keep it off. Let me know.
>
> -Doug
ack

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 1/2] drm/msm/dp: Clean up handling of DP AUX interrupts
  2023-01-25 17:13   ` Kuogee Hsieh
@ 2023-01-27  1:09     ` Doug Anderson
  -1 siblings, 0 replies; 14+ messages in thread
From: Doug Anderson @ 2023-01-27  1:09 UTC (permalink / raw)
  To: Kuogee Hsieh
  Cc: Rob Clark, Abhinav Kumar, Dmitry Baryshkov, Stephen Boyd,
	Daniel Vetter, David Airlie, Sankeerth Billakanti, Sean Paul,
	Thomas Zimmermann, dri-devel, freedreno, linux-arm-msm,
	linux-kernel

Hi,

On Wed, Jan 25, 2023 at 9:13 AM Kuogee Hsieh <quic_khsieh@quicinc.com> wrote:
>
>
> On 1/19/2023 2:53 PM, Douglas Anderson wrote:
> > The DP AUX interrupt handling was a bit of a mess.
> > * There were two functions (one for "native" transfers and one for
> >    "i2c" transfers) that were quite similar. It was hard to say how
> >    many of the differences between the two functions were on purpose
> >    and how many of them were just an accident of how they were coded.
> > * Each function sometimes used "else if" to test for error bits and
> >    sometimes didn't and again it was hard to say if this was on purpose
> >    or just an accident.
> > * The two functions wouldn't notice whether "unknown" bits were
> >    set. For instance, there seems to be a bit "DP_INTR_PLL_UNLOCKED"
> >    and if it was set there would be no indication.
> > * The two functions wouldn't notice if more than one error was set.
> >
> > Let's fix this by being more consistent / explicit about what we're
> > doing.
> >
> > By design this could cause different handling for AUX transfers,
> > though I'm not actually aware of any bug fixed as a result of
> > this patch (this patch was created because we simply noticed how odd
> > the old code was by code inspection). Specific notes here:
> > 1. In the old native transfer case if we got "done + wrong address"
> >     we'd ignore the "wrong address" (because of the "else if"). Now we
> >     won't.
> > 2. In the old native transfer case if we got "done + timeout" we'd
> >     ignore the "timeout" (because of the "else if"). Now we won't.
> > 3. In the old native transfer case we'd see "nack_defer" and translate
> >     it to the error number for "nack". This differed from the i2c
> >     transfer case where "nack_defer" was given the error number for
> >     "nack_defer". This 100% can't matter because the only user of this
> >     error number treats "nack defer" the same as "nack", so it's clear
> >     that the difference between the "native" and "i2c" was pointless
> >     here.
> > 4. In the old i2c transfer case if we got "done" plus any error
> >     besides "nack" or "defer" then we'd ignore the error. Now we don't.
> > 5. If there is more than one error signaled by the hardware it's
> >     possible that we'll report a different one than we used to. I don't
> >     know if this matters. If someone is aware of a case this matters we
> >     should document it and change the code to make it explicit.
> > 6. One quirk we keep (I don't know if this is important) is that in
> >     the i2c transfer case if we see "done + defer" we report that as a
> >     "nack". That seemed too intentional in the old code to just drop.
> >
> > After this change we will add extra logging, including:
> > * A warning if we see more than one error bit set.
> > * A warning if we see an unexpected interrupt.
> > * A warning if we get an AUX transfer interrupt when shouldn't.
> >
> > It actually turns out that as a result of this change then at boot we
> > sometimes see an error:
> >    [drm:dp_aux_isr] *ERROR* Unexpected DP AUX IRQ 0x01000000 when not busy
> > That means that, during init, we are seeing DP_INTR_PLL_UNLOCKED. For
> > now I'm going to say that leaving this error reported in the logs is
> > OK-ish and hopefully it will encourage someone to track down what's
> > going on at init time.
> >
> > One last note here is that this change renames one of the interrupt
> > bits. The bit named "i2c done" clearly was used for native transfers
> > being done too, so I renamed it to indicate this.
> >
> > Signed-off-by: Douglas Anderson <dianders@chromium.org>
> > ---
> > I don't have good test coverage for this change and it does have the
> > potential to change behavior. I confirmed that eDP and DP still
> > continue to work OK on one machine. Hopefully folks can test it more.
> >
> >   drivers/gpu/drm/msm/dp/dp_aux.c     | 80 ++++++++++++-----------------
> >   drivers/gpu/drm/msm/dp/dp_catalog.c |  2 +-
> >   drivers/gpu/drm/msm/dp/dp_catalog.h |  2 +-
> >   3 files changed, 36 insertions(+), 48 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/msm/dp/dp_aux.c b/drivers/gpu/drm/msm/dp/dp_aux.c
> > index cc3efed593aa..34ad08ae6eb9 100644
> > --- a/drivers/gpu/drm/msm/dp/dp_aux.c
> > +++ b/drivers/gpu/drm/msm/dp/dp_aux.c
> > @@ -162,47 +162,6 @@ static ssize_t dp_aux_cmd_fifo_rx(struct dp_aux_private *aux,
> >       return i;
> >   }
> >
> > -static void dp_aux_native_handler(struct dp_aux_private *aux, u32 isr)
> > -{
> > -     if (isr & DP_INTR_AUX_I2C_DONE)
> > -             aux->aux_error_num = DP_AUX_ERR_NONE;
> > -     else if (isr & DP_INTR_WRONG_ADDR)
> > -             aux->aux_error_num = DP_AUX_ERR_ADDR;
> > -     else if (isr & DP_INTR_TIMEOUT)
> > -             aux->aux_error_num = DP_AUX_ERR_TOUT;
> > -     if (isr & DP_INTR_NACK_DEFER)
> > -             aux->aux_error_num = DP_AUX_ERR_NACK;
> > -     if (isr & DP_INTR_AUX_ERROR) {
> > -             aux->aux_error_num = DP_AUX_ERR_PHY;
> > -             dp_catalog_aux_clear_hw_interrupts(aux->catalog);
> > -     }
> > -}
> > -
> > -static void dp_aux_i2c_handler(struct dp_aux_private *aux, u32 isr)
> > -{
> > -     if (isr & DP_INTR_AUX_I2C_DONE) {
> > -             if (isr & (DP_INTR_I2C_NACK | DP_INTR_I2C_DEFER))
> > -                     aux->aux_error_num = DP_AUX_ERR_NACK;
> > -             else
> > -                     aux->aux_error_num = DP_AUX_ERR_NONE;
> > -     } else {
> > -             if (isr & DP_INTR_WRONG_ADDR)
> > -                     aux->aux_error_num = DP_AUX_ERR_ADDR;
> > -             else if (isr & DP_INTR_TIMEOUT)
> > -                     aux->aux_error_num = DP_AUX_ERR_TOUT;
> > -             if (isr & DP_INTR_NACK_DEFER)
> > -                     aux->aux_error_num = DP_AUX_ERR_NACK_DEFER;
> > -             if (isr & DP_INTR_I2C_NACK)
> > -                     aux->aux_error_num = DP_AUX_ERR_NACK;
> > -             if (isr & DP_INTR_I2C_DEFER)
> > -                     aux->aux_error_num = DP_AUX_ERR_DEFER;
> > -             if (isr & DP_INTR_AUX_ERROR) {
> > -                     aux->aux_error_num = DP_AUX_ERR_PHY;
> > -                     dp_catalog_aux_clear_hw_interrupts(aux->catalog);
> > -             }
> > -     }
> > -}
> > -
> >   static void dp_aux_update_offset_and_segment(struct dp_aux_private *aux,
> >                                            struct drm_dp_aux_msg *input_msg)
> >   {
> > @@ -427,13 +386,42 @@ void dp_aux_isr(struct drm_dp_aux *dp_aux)
> >       if (!isr)
> >               return;
> >
> > -     if (!aux->cmd_busy)
> > +     if (!aux->cmd_busy) {
> > +             DRM_ERROR("Unexpected DP AUX IRQ %#010x when not busy\n", isr);
> >               return;
> > +     }
> >
> > -     if (aux->native)
> > -             dp_aux_native_handler(aux, isr);
> > -     else
> > -             dp_aux_i2c_handler(aux, isr);
> > +     /*
> > +      * The logic below assumes only one error bit is set (other than "done"
> > +      * which can apparently be set at the same time as some of the other
> > +      * bits). Warn if more than one get set so we know we need to improve
> > +      * the logic.
> > +      */
> > +     if (hweight32(isr & ~DP_INTR_AUX_XFER_DONE) > 1)
> > +             DRM_WARN("Some DP AUX interrupts unhandled: %#010x\n", isr);
> > +
> > +     if (isr & DP_INTR_AUX_ERROR) {
> > +             aux->aux_error_num = DP_AUX_ERR_PHY;
> > +             dp_catalog_aux_clear_hw_interrupts(aux->catalog);
> > +     } else if (isr & DP_INTR_NACK_DEFER) {
> > +             aux->aux_error_num = DP_AUX_ERR_NACK_DEFER;
> > +     } else if (isr & DP_INTR_WRONG_ADDR) {
> > +             aux->aux_error_num = DP_AUX_ERR_ADDR;
> > +     } else if (isr & DP_INTR_TIMEOUT) {
> > +             aux->aux_error_num = DP_AUX_ERR_TOUT;
> > +     } else if (isr & DP_INTR_AUX_XFER_DONE) {
> > +             aux->aux_error_num = DP_AUX_ERR_NONE;
>
>
> 1) both DP_INTR_AUX_XFER_DONE and DP_INTR_I2C_NACK are set
>
> 2) both DP_INTR_AUX_XFER_DONE and DP_INTR_I2C_DEFER are set
>
> with above two condition, below two "else if" will not be reached since
> DP_INTR_AUX_XFER_DONE is check with higher priority

Indeed, that is a bug, good catch! I had the "DONE" at the end at the
beginning but then I remember thinking it looked ugly because the two
I2C cases below had the extra "aux->native". Moved it to the right
place and I'll send a quick v2 since I don't expect more feedback
since it's already been a week.

-Doug

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 1/2] drm/msm/dp: Clean up handling of DP AUX interrupts
@ 2023-01-27  1:09     ` Doug Anderson
  0 siblings, 0 replies; 14+ messages in thread
From: Doug Anderson @ 2023-01-27  1:09 UTC (permalink / raw)
  To: Kuogee Hsieh
  Cc: freedreno, Sankeerth Billakanti, Thomas Zimmermann, Sean Paul,
	Abhinav Kumar, dri-devel, Stephen Boyd, linux-arm-msm,
	Dmitry Baryshkov, linux-kernel

Hi,

On Wed, Jan 25, 2023 at 9:13 AM Kuogee Hsieh <quic_khsieh@quicinc.com> wrote:
>
>
> On 1/19/2023 2:53 PM, Douglas Anderson wrote:
> > The DP AUX interrupt handling was a bit of a mess.
> > * There were two functions (one for "native" transfers and one for
> >    "i2c" transfers) that were quite similar. It was hard to say how
> >    many of the differences between the two functions were on purpose
> >    and how many of them were just an accident of how they were coded.
> > * Each function sometimes used "else if" to test for error bits and
> >    sometimes didn't and again it was hard to say if this was on purpose
> >    or just an accident.
> > * The two functions wouldn't notice whether "unknown" bits were
> >    set. For instance, there seems to be a bit "DP_INTR_PLL_UNLOCKED"
> >    and if it was set there would be no indication.
> > * The two functions wouldn't notice if more than one error was set.
> >
> > Let's fix this by being more consistent / explicit about what we're
> > doing.
> >
> > By design this could cause different handling for AUX transfers,
> > though I'm not actually aware of any bug fixed as a result of
> > this patch (this patch was created because we simply noticed how odd
> > the old code was by code inspection). Specific notes here:
> > 1. In the old native transfer case if we got "done + wrong address"
> >     we'd ignore the "wrong address" (because of the "else if"). Now we
> >     won't.
> > 2. In the old native transfer case if we got "done + timeout" we'd
> >     ignore the "timeout" (because of the "else if"). Now we won't.
> > 3. In the old native transfer case we'd see "nack_defer" and translate
> >     it to the error number for "nack". This differed from the i2c
> >     transfer case where "nack_defer" was given the error number for
> >     "nack_defer". This 100% can't matter because the only user of this
> >     error number treats "nack defer" the same as "nack", so it's clear
> >     that the difference between the "native" and "i2c" was pointless
> >     here.
> > 4. In the old i2c transfer case if we got "done" plus any error
> >     besides "nack" or "defer" then we'd ignore the error. Now we don't.
> > 5. If there is more than one error signaled by the hardware it's
> >     possible that we'll report a different one than we used to. I don't
> >     know if this matters. If someone is aware of a case this matters we
> >     should document it and change the code to make it explicit.
> > 6. One quirk we keep (I don't know if this is important) is that in
> >     the i2c transfer case if we see "done + defer" we report that as a
> >     "nack". That seemed too intentional in the old code to just drop.
> >
> > After this change we will add extra logging, including:
> > * A warning if we see more than one error bit set.
> > * A warning if we see an unexpected interrupt.
> > * A warning if we get an AUX transfer interrupt when shouldn't.
> >
> > It actually turns out that as a result of this change then at boot we
> > sometimes see an error:
> >    [drm:dp_aux_isr] *ERROR* Unexpected DP AUX IRQ 0x01000000 when not busy
> > That means that, during init, we are seeing DP_INTR_PLL_UNLOCKED. For
> > now I'm going to say that leaving this error reported in the logs is
> > OK-ish and hopefully it will encourage someone to track down what's
> > going on at init time.
> >
> > One last note here is that this change renames one of the interrupt
> > bits. The bit named "i2c done" clearly was used for native transfers
> > being done too, so I renamed it to indicate this.
> >
> > Signed-off-by: Douglas Anderson <dianders@chromium.org>
> > ---
> > I don't have good test coverage for this change and it does have the
> > potential to change behavior. I confirmed that eDP and DP still
> > continue to work OK on one machine. Hopefully folks can test it more.
> >
> >   drivers/gpu/drm/msm/dp/dp_aux.c     | 80 ++++++++++++-----------------
> >   drivers/gpu/drm/msm/dp/dp_catalog.c |  2 +-
> >   drivers/gpu/drm/msm/dp/dp_catalog.h |  2 +-
> >   3 files changed, 36 insertions(+), 48 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/msm/dp/dp_aux.c b/drivers/gpu/drm/msm/dp/dp_aux.c
> > index cc3efed593aa..34ad08ae6eb9 100644
> > --- a/drivers/gpu/drm/msm/dp/dp_aux.c
> > +++ b/drivers/gpu/drm/msm/dp/dp_aux.c
> > @@ -162,47 +162,6 @@ static ssize_t dp_aux_cmd_fifo_rx(struct dp_aux_private *aux,
> >       return i;
> >   }
> >
> > -static void dp_aux_native_handler(struct dp_aux_private *aux, u32 isr)
> > -{
> > -     if (isr & DP_INTR_AUX_I2C_DONE)
> > -             aux->aux_error_num = DP_AUX_ERR_NONE;
> > -     else if (isr & DP_INTR_WRONG_ADDR)
> > -             aux->aux_error_num = DP_AUX_ERR_ADDR;
> > -     else if (isr & DP_INTR_TIMEOUT)
> > -             aux->aux_error_num = DP_AUX_ERR_TOUT;
> > -     if (isr & DP_INTR_NACK_DEFER)
> > -             aux->aux_error_num = DP_AUX_ERR_NACK;
> > -     if (isr & DP_INTR_AUX_ERROR) {
> > -             aux->aux_error_num = DP_AUX_ERR_PHY;
> > -             dp_catalog_aux_clear_hw_interrupts(aux->catalog);
> > -     }
> > -}
> > -
> > -static void dp_aux_i2c_handler(struct dp_aux_private *aux, u32 isr)
> > -{
> > -     if (isr & DP_INTR_AUX_I2C_DONE) {
> > -             if (isr & (DP_INTR_I2C_NACK | DP_INTR_I2C_DEFER))
> > -                     aux->aux_error_num = DP_AUX_ERR_NACK;
> > -             else
> > -                     aux->aux_error_num = DP_AUX_ERR_NONE;
> > -     } else {
> > -             if (isr & DP_INTR_WRONG_ADDR)
> > -                     aux->aux_error_num = DP_AUX_ERR_ADDR;
> > -             else if (isr & DP_INTR_TIMEOUT)
> > -                     aux->aux_error_num = DP_AUX_ERR_TOUT;
> > -             if (isr & DP_INTR_NACK_DEFER)
> > -                     aux->aux_error_num = DP_AUX_ERR_NACK_DEFER;
> > -             if (isr & DP_INTR_I2C_NACK)
> > -                     aux->aux_error_num = DP_AUX_ERR_NACK;
> > -             if (isr & DP_INTR_I2C_DEFER)
> > -                     aux->aux_error_num = DP_AUX_ERR_DEFER;
> > -             if (isr & DP_INTR_AUX_ERROR) {
> > -                     aux->aux_error_num = DP_AUX_ERR_PHY;
> > -                     dp_catalog_aux_clear_hw_interrupts(aux->catalog);
> > -             }
> > -     }
> > -}
> > -
> >   static void dp_aux_update_offset_and_segment(struct dp_aux_private *aux,
> >                                            struct drm_dp_aux_msg *input_msg)
> >   {
> > @@ -427,13 +386,42 @@ void dp_aux_isr(struct drm_dp_aux *dp_aux)
> >       if (!isr)
> >               return;
> >
> > -     if (!aux->cmd_busy)
> > +     if (!aux->cmd_busy) {
> > +             DRM_ERROR("Unexpected DP AUX IRQ %#010x when not busy\n", isr);
> >               return;
> > +     }
> >
> > -     if (aux->native)
> > -             dp_aux_native_handler(aux, isr);
> > -     else
> > -             dp_aux_i2c_handler(aux, isr);
> > +     /*
> > +      * The logic below assumes only one error bit is set (other than "done"
> > +      * which can apparently be set at the same time as some of the other
> > +      * bits). Warn if more than one get set so we know we need to improve
> > +      * the logic.
> > +      */
> > +     if (hweight32(isr & ~DP_INTR_AUX_XFER_DONE) > 1)
> > +             DRM_WARN("Some DP AUX interrupts unhandled: %#010x\n", isr);
> > +
> > +     if (isr & DP_INTR_AUX_ERROR) {
> > +             aux->aux_error_num = DP_AUX_ERR_PHY;
> > +             dp_catalog_aux_clear_hw_interrupts(aux->catalog);
> > +     } else if (isr & DP_INTR_NACK_DEFER) {
> > +             aux->aux_error_num = DP_AUX_ERR_NACK_DEFER;
> > +     } else if (isr & DP_INTR_WRONG_ADDR) {
> > +             aux->aux_error_num = DP_AUX_ERR_ADDR;
> > +     } else if (isr & DP_INTR_TIMEOUT) {
> > +             aux->aux_error_num = DP_AUX_ERR_TOUT;
> > +     } else if (isr & DP_INTR_AUX_XFER_DONE) {
> > +             aux->aux_error_num = DP_AUX_ERR_NONE;
>
>
> 1) both DP_INTR_AUX_XFER_DONE and DP_INTR_I2C_NACK are set
>
> 2) both DP_INTR_AUX_XFER_DONE and DP_INTR_I2C_DEFER are set
>
> with above two condition, below two "else if" will not be reached since
> DP_INTR_AUX_XFER_DONE is check with higher priority

Indeed, that is a bug, good catch! I had the "DONE" at the end at the
beginning but then I remember thinking it looked ugly because the two
I2C cases below had the extra "aux->native". Moved it to the right
place and I'll send a quick v2 since I don't expect more feedback
since it's already been a week.

-Doug

^ permalink raw reply	[flat|nested] 14+ messages in thread

end of thread, other threads:[~2023-01-27  1:09 UTC | newest]

Thread overview: 14+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-01-19 22:53 [PATCH 1/2] drm/msm/dp: Clean up handling of DP AUX interrupts Douglas Anderson
2023-01-19 22:53 ` Douglas Anderson
2023-01-19 22:53 ` [PATCH 2/2] drm/msm/dp: Return IRQ_NONE for unhandled interrupts Douglas Anderson
2023-01-19 22:53   ` Douglas Anderson
2023-01-25 17:22   ` Kuogee Hsieh
2023-01-25 17:22     ` Kuogee Hsieh
2023-01-25 18:21     ` Doug Anderson
2023-01-25 18:21       ` Doug Anderson
2023-01-25 23:36       ` Kuogee Hsieh
2023-01-25 23:36         ` Kuogee Hsieh
2023-01-25 17:13 ` [PATCH 1/2] drm/msm/dp: Clean up handling of DP AUX interrupts Kuogee Hsieh
2023-01-25 17:13   ` Kuogee Hsieh
2023-01-27  1:09   ` Doug Anderson
2023-01-27  1:09     ` Doug Anderson

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.