All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 0/8] drm/vc4: Fix frame corruption when moving the cursor
@ 2022-02-21 13:41 Maxime Ripard
  2022-02-21 13:41 ` [PATCH 1/8] drm/vc4: kms: Take old state core clock rate into account Maxime Ripard
                   ` (7 more replies)
  0 siblings, 8 replies; 9+ messages in thread
From: Maxime Ripard @ 2022-02-21 13:41 UTC (permalink / raw)
  To: Daniel Vetter, David Airlie
  Cc: Dom Cobley, Tim Gover, Dave Stevenson, dri-devel, Maxime Ripard,
	Thomas Zimmermann, Phil Elwell

Hi,

This series fixes a long standing issue with the VC4 driver when one was
moving the cursor on X11 along the edges of the monitor, if we had
overscan margins enabled.

The details are in the commit log of the last patch, but the main reason
was that moving along the edges with the overscan margins enabled
triggers a full blown commit, as opposed to an async commit. Since that
commit is on the cursor plane, it's treated as a legacy cursor update,
and won't wait for vblank, so it's possible to queue multiple commit
between vblank.

Now, the composition happens in the HVS, and the HVS has a series of
descriptors stored in an internal RAM, one for each plane. Allocations
in that RAM are tied to the CRTC state, and freed when that state is
destroyed. That internal RAM is also used by the HVS to store some
internal context while it's generating a frame.

If we get multiple commits between vblanks, chances are that the RAM
entries used by one of the first commit is going to be freed and reused
by a later commit, which will then overwrite the content of the earlier
entries, erasing the context in the process and corrupting the frame.

We've tested multiple solutions, but the one that seem to work without
any cons is to defer the deallocation of RAM entries to the next vblank
after the CRTC state has been freed.

Let me know what you think,
Maxime

Maxime Ripard (8):
  drm/vc4: kms: Take old state core clock rate into account
  drm/vc4: hvs: Fix frame count register readout
  drm/vc4: hvs: Store channel in variable
  drm/vc4: hvs: Remove dlist setup duplication
  drm/vc4: hvs: Move the dlist setup to its own function
  drm/vc4: hvs: Ignore atomic_flush if we're disabled
  drm/vc4: hvs: Use pointer to HVS in HVS_READ and HVS_WRITE macros
  drm/vc4: hvs: Defer dlist slots deallocation

 drivers/gpu/drm/vc4/vc4_crtc.c |  24 +--
 drivers/gpu/drm/vc4/vc4_drv.h  |  30 +++-
 drivers/gpu/drm/vc4/vc4_hvs.c  | 309 ++++++++++++++++++++++++++-------
 drivers/gpu/drm/vc4/vc4_kms.c  |  10 +-
 drivers/gpu/drm/vc4/vc4_regs.h |  13 +-
 5 files changed, 299 insertions(+), 87 deletions(-)

-- 
2.35.1


^ permalink raw reply	[flat|nested] 9+ messages in thread

* [PATCH 1/8] drm/vc4: kms: Take old state core clock rate into account
  2022-02-21 13:41 [PATCH 0/8] drm/vc4: Fix frame corruption when moving the cursor Maxime Ripard
@ 2022-02-21 13:41 ` Maxime Ripard
  2022-02-21 13:41 ` [PATCH 2/8] drm/vc4: hvs: Fix frame count register readout Maxime Ripard
                   ` (6 subsequent siblings)
  7 siblings, 0 replies; 9+ messages in thread
From: Maxime Ripard @ 2022-02-21 13:41 UTC (permalink / raw)
  To: Daniel Vetter, David Airlie
  Cc: Dom Cobley, Tim Gover, Dave Stevenson, dri-devel, Maxime Ripard,
	Thomas Zimmermann, Phil Elwell

During a commit, the core clock, which feeds the HVS, needs to run at
a minimum of 500MHz.

While doing that commit, we can also change the mode to one that
requires a higher core clock, so we take the core clock rate associated
to that new state into account for that boost.

However, the old state also needs to be taken into account if it
requires a core clock higher that the new one and our 500MHz limit,
since it's still live in hardware at the beginning of our commit.

Fixes: 16e101051f32 ("drm/vc4: Increase the core clock based on HVS load")
Signed-off-by: Maxime Ripard <maxime@cerno.tech>
---
 drivers/gpu/drm/vc4/vc4_kms.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/vc4/vc4_kms.c b/drivers/gpu/drm/vc4/vc4_kms.c
index 24de29bc1cda..992d6a240002 100644
--- a/drivers/gpu/drm/vc4/vc4_kms.c
+++ b/drivers/gpu/drm/vc4/vc4_kms.c
@@ -385,9 +385,10 @@ static void vc4_atomic_commit_tail(struct drm_atomic_state *state)
 	}
 
 	if (vc4->hvs->hvs5) {
+		unsigned long state_rate = max(old_hvs_state->core_clock_rate,
+					       new_hvs_state->core_clock_rate);
 		unsigned long core_rate = max_t(unsigned long,
-						500000000,
-						new_hvs_state->core_clock_rate);
+						500000000, state_rate);
 
 		clk_set_min_rate(hvs->core_clk, core_rate);
 	}
-- 
2.35.1


^ permalink raw reply related	[flat|nested] 9+ messages in thread

* [PATCH 2/8] drm/vc4: hvs: Fix frame count register readout
  2022-02-21 13:41 [PATCH 0/8] drm/vc4: Fix frame corruption when moving the cursor Maxime Ripard
  2022-02-21 13:41 ` [PATCH 1/8] drm/vc4: kms: Take old state core clock rate into account Maxime Ripard
@ 2022-02-21 13:41 ` Maxime Ripard
  2022-02-21 13:41 ` [PATCH 3/8] drm/vc4: hvs: Store channel in variable Maxime Ripard
                   ` (5 subsequent siblings)
  7 siblings, 0 replies; 9+ messages in thread
From: Maxime Ripard @ 2022-02-21 13:41 UTC (permalink / raw)
  To: Daniel Vetter, David Airlie
  Cc: Dom Cobley, Tim Gover, Dave Stevenson, dri-devel, Maxime Ripard,
	Thomas Zimmermann, Phil Elwell

In order to get the field currently being output, the driver has been
using the display FIFO frame count in the HVS, reading a 6-bit field at
the offset 12 in the DISPSTATx register.

While that field is indeed at that location for the FIFO 1 and 2, the
one for the FIFO0 is actually in the DISPSTAT1 register, at the offset
18.

Fixes: e538092cb15c ("drm/vc4: Enable precise vblank timestamping for interlaced modes.")
Signed-off-by: Maxime Ripard <maxime@cerno.tech>
---
 drivers/gpu/drm/vc4/vc4_crtc.c |  2 +-
 drivers/gpu/drm/vc4/vc4_drv.h  |  1 +
 drivers/gpu/drm/vc4/vc4_hvs.c  | 23 +++++++++++++++++++++++
 drivers/gpu/drm/vc4/vc4_regs.h | 12 ++++++++++--
 4 files changed, 35 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/vc4/vc4_crtc.c b/drivers/gpu/drm/vc4/vc4_crtc.c
index e6cc47470e03..72fadce38d32 100644
--- a/drivers/gpu/drm/vc4/vc4_crtc.c
+++ b/drivers/gpu/drm/vc4/vc4_crtc.c
@@ -123,7 +123,7 @@ static bool vc4_crtc_get_scanout_position(struct drm_crtc *crtc,
 		*vpos /= 2;
 
 		/* Use hpos to correct for field offset in interlaced mode. */
-		if (VC4_GET_FIELD(val, SCALER_DISPSTATX_FRAME_COUNT) % 2)
+		if (vc4_hvs_get_fifo_frame_count(dev, vc4_crtc_state->assigned_channel) % 2)
 			*hpos += mode->crtc_htotal / 2;
 	}
 
diff --git a/drivers/gpu/drm/vc4/vc4_drv.h b/drivers/gpu/drm/vc4/vc4_drv.h
index 4329e09d357c..801da3e8ebdb 100644
--- a/drivers/gpu/drm/vc4/vc4_drv.h
+++ b/drivers/gpu/drm/vc4/vc4_drv.h
@@ -935,6 +935,7 @@ void vc4_irq_reset(struct drm_device *dev);
 extern struct platform_driver vc4_hvs_driver;
 void vc4_hvs_stop_channel(struct drm_device *dev, unsigned int output);
 int vc4_hvs_get_fifo_from_output(struct drm_device *dev, unsigned int output);
+u8 vc4_hvs_get_fifo_frame_count(struct drm_device *dev, unsigned int fifo);
 int vc4_hvs_atomic_check(struct drm_crtc *crtc, struct drm_atomic_state *state);
 void vc4_hvs_atomic_begin(struct drm_crtc *crtc, struct drm_atomic_state *state);
 void vc4_hvs_atomic_enable(struct drm_crtc *crtc, struct drm_atomic_state *state);
diff --git a/drivers/gpu/drm/vc4/vc4_hvs.c b/drivers/gpu/drm/vc4/vc4_hvs.c
index 604933e20e6a..c8cae10500b9 100644
--- a/drivers/gpu/drm/vc4/vc4_hvs.c
+++ b/drivers/gpu/drm/vc4/vc4_hvs.c
@@ -197,6 +197,29 @@ static void vc4_hvs_update_gamma_lut(struct drm_crtc *crtc)
 	vc4_hvs_lut_load(crtc);
 }
 
+u8 vc4_hvs_get_fifo_frame_count(struct drm_device *dev, unsigned int fifo)
+{
+	struct vc4_dev *vc4 = to_vc4_dev(dev);
+	u8 field = 0;
+
+	switch (fifo) {
+	case 0:
+		field = VC4_GET_FIELD(HVS_READ(SCALER_DISPSTAT1),
+				      SCALER_DISPSTAT1_FRCNT0);
+		break;
+	case 1:
+		field = VC4_GET_FIELD(HVS_READ(SCALER_DISPSTAT1),
+				      SCALER_DISPSTAT1_FRCNT1);
+		break;
+	case 2:
+		field = VC4_GET_FIELD(HVS_READ(SCALER_DISPSTAT2),
+				      SCALER_DISPSTAT2_FRCNT2);
+		break;
+	}
+
+	return field;
+}
+
 int vc4_hvs_get_fifo_from_output(struct drm_device *dev, unsigned int output)
 {
 	struct vc4_dev *vc4 = to_vc4_dev(dev);
diff --git a/drivers/gpu/drm/vc4/vc4_regs.h b/drivers/gpu/drm/vc4/vc4_regs.h
index 33410718089e..bae8c9cd6f7c 100644
--- a/drivers/gpu/drm/vc4/vc4_regs.h
+++ b/drivers/gpu/drm/vc4/vc4_regs.h
@@ -379,8 +379,6 @@
 # define SCALER_DISPSTATX_MODE_EOF		3
 # define SCALER_DISPSTATX_FULL			BIT(29)
 # define SCALER_DISPSTATX_EMPTY			BIT(28)
-# define SCALER_DISPSTATX_FRAME_COUNT_MASK	VC4_MASK(17, 12)
-# define SCALER_DISPSTATX_FRAME_COUNT_SHIFT	12
 # define SCALER_DISPSTATX_LINE_MASK		VC4_MASK(11, 0)
 # define SCALER_DISPSTATX_LINE_SHIFT		0
 
@@ -403,9 +401,15 @@
 						 (x) * (SCALER_DISPBKGND1 - \
 							SCALER_DISPBKGND0))
 #define SCALER_DISPSTAT1                        0x00000058
+# define SCALER_DISPSTAT1_FRCNT0_MASK		VC4_MASK(23, 18)
+# define SCALER_DISPSTAT1_FRCNT0_SHIFT		18
+# define SCALER_DISPSTAT1_FRCNT1_MASK		VC4_MASK(17, 12)
+# define SCALER_DISPSTAT1_FRCNT1_SHIFT		12
+
 #define SCALER_DISPSTATX(x)			(SCALER_DISPSTAT0 +        \
 						 (x) * (SCALER_DISPSTAT1 - \
 							SCALER_DISPSTAT0))
+
 #define SCALER_DISPBASE1                        0x0000005c
 #define SCALER_DISPBASEX(x)			(SCALER_DISPBASE0 +        \
 						 (x) * (SCALER_DISPBASE1 - \
@@ -415,7 +419,11 @@
 						 (x) * (SCALER_DISPCTRL1 - \
 							SCALER_DISPCTRL0))
 #define SCALER_DISPBKGND2                       0x00000064
+
 #define SCALER_DISPSTAT2                        0x00000068
+# define SCALER_DISPSTAT2_FRCNT2_MASK		VC4_MASK(17, 12)
+# define SCALER_DISPSTAT2_FRCNT2_SHIFT		12
+
 #define SCALER_DISPBASE2                        0x0000006c
 #define SCALER_DISPALPHA2                       0x00000070
 #define SCALER_GAMADDR                          0x00000078
-- 
2.35.1


^ permalink raw reply related	[flat|nested] 9+ messages in thread

* [PATCH 3/8] drm/vc4: hvs: Store channel in variable
  2022-02-21 13:41 [PATCH 0/8] drm/vc4: Fix frame corruption when moving the cursor Maxime Ripard
  2022-02-21 13:41 ` [PATCH 1/8] drm/vc4: kms: Take old state core clock rate into account Maxime Ripard
  2022-02-21 13:41 ` [PATCH 2/8] drm/vc4: hvs: Fix frame count register readout Maxime Ripard
@ 2022-02-21 13:41 ` Maxime Ripard
  2022-02-21 13:41 ` [PATCH 4/8] drm/vc4: hvs: Remove dlist setup duplication Maxime Ripard
                   ` (4 subsequent siblings)
  7 siblings, 0 replies; 9+ messages in thread
From: Maxime Ripard @ 2022-02-21 13:41 UTC (permalink / raw)
  To: Daniel Vetter, David Airlie
  Cc: Dom Cobley, Tim Gover, Dave Stevenson, dri-devel, Maxime Ripard,
	Thomas Zimmermann, Phil Elwell

The assigned_channel field of our vc4_crtc_state structure is accessed
multiple times in vc4_hvs_atomic_flush, so let's move it to a variable
that can be used in all those places.

Signed-off-by: Maxime Ripard <maxime@cerno.tech>
---
 drivers/gpu/drm/vc4/vc4_hvs.c | 9 +++++----
 1 file changed, 5 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/vc4/vc4_hvs.c b/drivers/gpu/drm/vc4/vc4_hvs.c
index c8cae10500b9..d225eea6e640 100644
--- a/drivers/gpu/drm/vc4/vc4_hvs.c
+++ b/drivers/gpu/drm/vc4/vc4_hvs.c
@@ -460,6 +460,7 @@ void vc4_hvs_atomic_flush(struct drm_crtc *crtc,
 	struct drm_device *dev = crtc->dev;
 	struct vc4_dev *vc4 = to_vc4_dev(dev);
 	struct vc4_crtc_state *vc4_state = to_vc4_crtc_state(crtc->state);
+	unsigned int channel = vc4_state->assigned_channel;
 	struct drm_plane *plane;
 	struct vc4_plane_state *vc4_plane_state;
 	bool debug_dump_regs = false;
@@ -500,8 +501,8 @@ void vc4_hvs_atomic_flush(struct drm_crtc *crtc,
 		/* This sets a black background color fill, as is the case
 		 * with other DRM drivers.
 		 */
-		HVS_WRITE(SCALER_DISPBKGNDX(vc4_state->assigned_channel),
-			  HVS_READ(SCALER_DISPBKGNDX(vc4_state->assigned_channel)) |
+		HVS_WRITE(SCALER_DISPBKGNDX(channel),
+			  HVS_READ(SCALER_DISPBKGNDX(channel)) |
 			  SCALER_DISPBKGND_FILL);
 
 	/* Only update DISPLIST if the CRTC was already running and is not
@@ -515,7 +516,7 @@ void vc4_hvs_atomic_flush(struct drm_crtc *crtc,
 		vc4_hvs_update_dlist(crtc);
 
 	if (crtc->state->color_mgmt_changed) {
-		u32 dispbkgndx = HVS_READ(SCALER_DISPBKGNDX(vc4_state->assigned_channel));
+		u32 dispbkgndx = HVS_READ(SCALER_DISPBKGNDX(channel));
 
 		if (crtc->state->gamma_lut) {
 			vc4_hvs_update_gamma_lut(crtc);
@@ -527,7 +528,7 @@ void vc4_hvs_atomic_flush(struct drm_crtc *crtc,
 			 */
 			dispbkgndx &= ~SCALER_DISPBKGND_GAMMA;
 		}
-		HVS_WRITE(SCALER_DISPBKGNDX(vc4_state->assigned_channel), dispbkgndx);
+		HVS_WRITE(SCALER_DISPBKGNDX(channel), dispbkgndx);
 	}
 
 	if (debug_dump_regs) {
-- 
2.35.1


^ permalink raw reply related	[flat|nested] 9+ messages in thread

* [PATCH 4/8] drm/vc4: hvs: Remove dlist setup duplication
  2022-02-21 13:41 [PATCH 0/8] drm/vc4: Fix frame corruption when moving the cursor Maxime Ripard
                   ` (2 preceding siblings ...)
  2022-02-21 13:41 ` [PATCH 3/8] drm/vc4: hvs: Store channel in variable Maxime Ripard
@ 2022-02-21 13:41 ` Maxime Ripard
  2022-02-21 13:41 ` [PATCH 5/8] drm/vc4: hvs: Move the dlist setup to its own function Maxime Ripard
                   ` (3 subsequent siblings)
  7 siblings, 0 replies; 9+ messages in thread
From: Maxime Ripard @ 2022-02-21 13:41 UTC (permalink / raw)
  To: Daniel Vetter, David Airlie
  Cc: Dom Cobley, Tim Gover, Dave Stevenson, dri-devel, Maxime Ripard,
	Thomas Zimmermann, Phil Elwell

Setting the DISPLISTx register needs to occur in every case, and we
don't need to protect the register using the event_lock, so we can just
move it after the if branches and simplify a bit the function.

Signed-off-by: Maxime Ripard <maxime@cerno.tech>
---
 drivers/gpu/drm/vc4/vc4_hvs.c | 9 +++------
 1 file changed, 3 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/vc4/vc4_hvs.c b/drivers/gpu/drm/vc4/vc4_hvs.c
index d225eea6e640..71aa5081eaa3 100644
--- a/drivers/gpu/drm/vc4/vc4_hvs.c
+++ b/drivers/gpu/drm/vc4/vc4_hvs.c
@@ -402,15 +402,12 @@ static void vc4_hvs_update_dlist(struct drm_crtc *crtc)
 			crtc->state->event = NULL;
 		}
 
-		HVS_WRITE(SCALER_DISPLISTX(vc4_state->assigned_channel),
-			  vc4_state->mm.start);
-
 		spin_unlock_irqrestore(&dev->event_lock, flags);
-	} else {
-		HVS_WRITE(SCALER_DISPLISTX(vc4_state->assigned_channel),
-			  vc4_state->mm.start);
 	}
 
+	HVS_WRITE(SCALER_DISPLISTX(vc4_state->assigned_channel),
+		  vc4_state->mm.start);
+
 	spin_lock_irqsave(&vc4_crtc->irq_lock, flags);
 	vc4_crtc->current_dlist = vc4_state->mm.start;
 	spin_unlock_irqrestore(&vc4_crtc->irq_lock, flags);
-- 
2.35.1


^ permalink raw reply related	[flat|nested] 9+ messages in thread

* [PATCH 5/8] drm/vc4: hvs: Move the dlist setup to its own function
  2022-02-21 13:41 [PATCH 0/8] drm/vc4: Fix frame corruption when moving the cursor Maxime Ripard
                   ` (3 preceding siblings ...)
  2022-02-21 13:41 ` [PATCH 4/8] drm/vc4: hvs: Remove dlist setup duplication Maxime Ripard
@ 2022-02-21 13:41 ` Maxime Ripard
  2022-02-21 13:41 ` [PATCH 6/8] drm/vc4: hvs: Ignore atomic_flush if we're disabled Maxime Ripard
                   ` (2 subsequent siblings)
  7 siblings, 0 replies; 9+ messages in thread
From: Maxime Ripard @ 2022-02-21 13:41 UTC (permalink / raw)
  To: Daniel Vetter, David Airlie
  Cc: Dom Cobley, Tim Gover, Dave Stevenson, dri-devel, Maxime Ripard,
	Thomas Zimmermann, Phil Elwell

The vc4_hvs_update_dlist function mostly deals with setting up the
vblank events and setting up the dlist entry pointer to our current
active one.

We'll want to do the former separately from the vblank handling in later
patches, so let's move it to a function of its own.

Signed-off-by: Maxime Ripard <maxime@cerno.tech>
---
 drivers/gpu/drm/vc4/vc4_hvs.c | 19 ++++++++++++++-----
 1 file changed, 14 insertions(+), 5 deletions(-)

diff --git a/drivers/gpu/drm/vc4/vc4_hvs.c b/drivers/gpu/drm/vc4/vc4_hvs.c
index 71aa5081eaa3..2d540fc11357 100644
--- a/drivers/gpu/drm/vc4/vc4_hvs.c
+++ b/drivers/gpu/drm/vc4/vc4_hvs.c
@@ -382,10 +382,19 @@ int vc4_hvs_atomic_check(struct drm_crtc *crtc, struct drm_atomic_state *state)
 	return 0;
 }
 
+static void vc4_hvs_install_dlist(struct drm_crtc *crtc)
+{
+	struct drm_device *dev = crtc->dev;
+	struct vc4_dev *vc4 = to_vc4_dev(dev);
+	struct vc4_crtc_state *vc4_state = to_vc4_crtc_state(crtc->state);
+
+	HVS_WRITE(SCALER_DISPLISTX(vc4_state->assigned_channel),
+		  vc4_state->mm.start);
+}
+
 static void vc4_hvs_update_dlist(struct drm_crtc *crtc)
 {
 	struct drm_device *dev = crtc->dev;
-	struct vc4_dev *vc4 = to_vc4_dev(dev);
 	struct vc4_crtc *vc4_crtc = to_vc4_crtc(crtc);
 	struct vc4_crtc_state *vc4_state = to_vc4_crtc_state(crtc->state);
 	unsigned long flags;
@@ -405,9 +414,6 @@ static void vc4_hvs_update_dlist(struct drm_crtc *crtc)
 		spin_unlock_irqrestore(&dev->event_lock, flags);
 	}
 
-	HVS_WRITE(SCALER_DISPLISTX(vc4_state->assigned_channel),
-		  vc4_state->mm.start);
-
 	spin_lock_irqsave(&vc4_crtc->irq_lock, flags);
 	vc4_crtc->current_dlist = vc4_state->mm.start;
 	spin_unlock_irqrestore(&vc4_crtc->irq_lock, flags);
@@ -434,6 +440,7 @@ void vc4_hvs_atomic_enable(struct drm_crtc *crtc,
 	struct vc4_crtc *vc4_crtc = to_vc4_crtc(crtc);
 	bool oneshot = vc4_crtc->feeds_txp;
 
+	vc4_hvs_install_dlist(crtc);
 	vc4_hvs_update_dlist(crtc);
 	vc4_hvs_init_channel(vc4, crtc, mode, oneshot);
 }
@@ -509,8 +516,10 @@ void vc4_hvs_atomic_flush(struct drm_crtc *crtc,
 	 * If the CRTC is being disabled, there's no point in updating this
 	 * information.
 	 */
-	if (crtc->state->active && old_state->active)
+	if (crtc->state->active && old_state->active) {
+		vc4_hvs_install_dlist(crtc);
 		vc4_hvs_update_dlist(crtc);
+	}
 
 	if (crtc->state->color_mgmt_changed) {
 		u32 dispbkgndx = HVS_READ(SCALER_DISPBKGNDX(channel));
-- 
2.35.1


^ permalink raw reply related	[flat|nested] 9+ messages in thread

* [PATCH 6/8] drm/vc4: hvs: Ignore atomic_flush if we're disabled
  2022-02-21 13:41 [PATCH 0/8] drm/vc4: Fix frame corruption when moving the cursor Maxime Ripard
                   ` (4 preceding siblings ...)
  2022-02-21 13:41 ` [PATCH 5/8] drm/vc4: hvs: Move the dlist setup to its own function Maxime Ripard
@ 2022-02-21 13:41 ` Maxime Ripard
  2022-02-21 13:41 ` [PATCH 7/8] drm/vc4: hvs: Use pointer to HVS in HVS_READ and HVS_WRITE macros Maxime Ripard
  2022-02-21 13:41 ` [PATCH 8/8] drm/vc4: hvs: Defer dlist slots deallocation Maxime Ripard
  7 siblings, 0 replies; 9+ messages in thread
From: Maxime Ripard @ 2022-02-21 13:41 UTC (permalink / raw)
  To: Daniel Vetter, David Airlie
  Cc: Dom Cobley, Tim Gover, Dave Stevenson, dri-devel, Maxime Ripard,
	Thomas Zimmermann, Phil Elwell

atomic_flush will be called for each CRTC even if they aren't enabled.

The whole code we have there will thus run without a properly affected
channel, which can then result in all sorts of weird behaviour.

Signed-off-by: Maxime Ripard <maxime@cerno.tech>
---
 drivers/gpu/drm/vc4/vc4_hvs.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/drivers/gpu/drm/vc4/vc4_hvs.c b/drivers/gpu/drm/vc4/vc4_hvs.c
index 2d540fc11357..5db125dc2358 100644
--- a/drivers/gpu/drm/vc4/vc4_hvs.c
+++ b/drivers/gpu/drm/vc4/vc4_hvs.c
@@ -472,6 +472,9 @@ void vc4_hvs_atomic_flush(struct drm_crtc *crtc,
 	u32 __iomem *dlist_start = vc4->hvs->dlist + vc4_state->mm.start;
 	u32 __iomem *dlist_next = dlist_start;
 
+	if (vc4_state->assigned_channel == VC4_HVS_CHANNEL_DISABLED)
+		return;
+
 	if (debug_dump_regs) {
 		DRM_INFO("CRTC %d HVS before:\n", drm_crtc_index(crtc));
 		vc4_hvs_dump_state(dev);
-- 
2.35.1


^ permalink raw reply related	[flat|nested] 9+ messages in thread

* [PATCH 7/8] drm/vc4: hvs: Use pointer to HVS in HVS_READ and HVS_WRITE macros
  2022-02-21 13:41 [PATCH 0/8] drm/vc4: Fix frame corruption when moving the cursor Maxime Ripard
                   ` (5 preceding siblings ...)
  2022-02-21 13:41 ` [PATCH 6/8] drm/vc4: hvs: Ignore atomic_flush if we're disabled Maxime Ripard
@ 2022-02-21 13:41 ` Maxime Ripard
  2022-02-21 13:41 ` [PATCH 8/8] drm/vc4: hvs: Defer dlist slots deallocation Maxime Ripard
  7 siblings, 0 replies; 9+ messages in thread
From: Maxime Ripard @ 2022-02-21 13:41 UTC (permalink / raw)
  To: Daniel Vetter, David Airlie
  Cc: Dom Cobley, Tim Gover, Dave Stevenson, dri-devel, Maxime Ripard,
	Thomas Zimmermann, Phil Elwell

Those macros are really about the HVS itself, and thus its associated
structure vc4_hvs, rather than the entire (virtual) vc4 device.

Let's change those macros to use the hvs pointer directly, and change
the calling sites accordingly.

Signed-off-by: Maxime Ripard <maxime@cerno.tech>
---
 drivers/gpu/drm/vc4/vc4_crtc.c | 14 +++++--
 drivers/gpu/drm/vc4/vc4_drv.h  | 16 +++----
 drivers/gpu/drm/vc4/vc4_hvs.c  | 77 +++++++++++++++++-----------------
 drivers/gpu/drm/vc4/vc4_kms.c  |  5 ++-
 4 files changed, 60 insertions(+), 52 deletions(-)

diff --git a/drivers/gpu/drm/vc4/vc4_crtc.c b/drivers/gpu/drm/vc4/vc4_crtc.c
index 72fadce38d32..5bb4027e479e 100644
--- a/drivers/gpu/drm/vc4/vc4_crtc.c
+++ b/drivers/gpu/drm/vc4/vc4_crtc.c
@@ -70,6 +70,7 @@ static const struct debugfs_reg32 crtc_regs[] = {
 static unsigned int
 vc4_crtc_get_cob_allocation(struct vc4_dev *vc4, unsigned int channel)
 {
+	struct vc4_hvs *hvs = vc4->hvs;
 	u32 dispbase = HVS_READ(SCALER_DISPBASEX(channel));
 	/* Top/base are supposed to be 4-pixel aligned, but the
 	 * Raspberry Pi firmware fills the low bits (which are
@@ -89,6 +90,7 @@ static bool vc4_crtc_get_scanout_position(struct drm_crtc *crtc,
 {
 	struct drm_device *dev = crtc->dev;
 	struct vc4_dev *vc4 = to_vc4_dev(dev);
+	struct vc4_hvs *hvs = vc4->hvs;
 	struct vc4_crtc *vc4_crtc = to_vc4_crtc(crtc);
 	struct vc4_crtc_state *vc4_crtc_state = to_vc4_crtc_state(crtc->state);
 	unsigned int cob_size;
@@ -123,7 +125,7 @@ static bool vc4_crtc_get_scanout_position(struct drm_crtc *crtc,
 		*vpos /= 2;
 
 		/* Use hpos to correct for field offset in interlaced mode. */
-		if (vc4_hvs_get_fifo_frame_count(dev, vc4_crtc_state->assigned_channel) % 2)
+		if (vc4_hvs_get_fifo_frame_count(hvs, vc4_crtc_state->assigned_channel) % 2)
 			*hpos += mode->crtc_htotal / 2;
 	}
 
@@ -413,6 +415,7 @@ static void vc4_crtc_config_pv(struct drm_crtc *crtc, struct drm_encoder *encode
 static void require_hvs_enabled(struct drm_device *dev)
 {
 	struct vc4_dev *vc4 = to_vc4_dev(dev);
+	struct vc4_hvs *hvs = vc4->hvs;
 
 	WARN_ON_ONCE((HVS_READ(SCALER_DISPCTRL) & SCALER_DISPCTRL_ENABLE) !=
 		     SCALER_DISPCTRL_ENABLE);
@@ -426,6 +429,7 @@ static int vc4_crtc_disable(struct drm_crtc *crtc,
 	struct vc4_encoder *vc4_encoder = to_vc4_encoder(encoder);
 	struct vc4_crtc *vc4_crtc = to_vc4_crtc(crtc);
 	struct drm_device *dev = crtc->dev;
+	struct vc4_dev *vc4 = to_vc4_dev(dev);
 	int ret;
 
 	CRTC_WRITE(PV_V_CONTROL,
@@ -455,7 +459,7 @@ static int vc4_crtc_disable(struct drm_crtc *crtc,
 		vc4_encoder->post_crtc_disable(encoder, state);
 
 	vc4_crtc_pixelvalve_reset(crtc);
-	vc4_hvs_stop_channel(dev, channel);
+	vc4_hvs_stop_channel(vc4->hvs, channel);
 
 	if (vc4_encoder && vc4_encoder->post_crtc_powerdown)
 		vc4_encoder->post_crtc_powerdown(encoder, state);
@@ -481,6 +485,7 @@ static struct drm_encoder *vc4_crtc_get_encoder_by_type(struct drm_crtc *crtc,
 int vc4_crtc_disable_at_boot(struct drm_crtc *crtc)
 {
 	struct drm_device *drm = crtc->dev;
+	struct vc4_dev *vc4 = to_vc4_dev(drm);
 	struct vc4_crtc *vc4_crtc = to_vc4_crtc(crtc);
 	enum vc4_encoder_type encoder_type;
 	const struct vc4_pv_data *pv_data;
@@ -502,7 +507,7 @@ int vc4_crtc_disable_at_boot(struct drm_crtc *crtc)
 	if (!(CRTC_READ(PV_V_CONTROL) & PV_VCONTROL_VIDEN))
 		return 0;
 
-	channel = vc4_hvs_get_fifo_from_output(drm, vc4_crtc->data->hvs_output);
+	channel = vc4_hvs_get_fifo_from_output(vc4->hvs, vc4_crtc->data->hvs_output);
 	if (channel < 0)
 		return 0;
 
@@ -715,6 +720,7 @@ static void vc4_crtc_handle_page_flip(struct vc4_crtc *vc4_crtc)
 	struct drm_crtc *crtc = &vc4_crtc->base;
 	struct drm_device *dev = crtc->dev;
 	struct vc4_dev *vc4 = to_vc4_dev(dev);
+	struct vc4_hvs *hvs = vc4->hvs;
 	u32 chan = vc4_crtc->current_hvs_channel;
 	unsigned long flags;
 
@@ -733,7 +739,7 @@ static void vc4_crtc_handle_page_flip(struct vc4_crtc *vc4_crtc)
 		 * the CRTC and encoder already reconfigured, leading to
 		 * underruns. This can be seen when reconfiguring the CRTC.
 		 */
-		vc4_hvs_unmask_underrun(dev, chan);
+		vc4_hvs_unmask_underrun(hvs, chan);
 	}
 	spin_unlock(&vc4_crtc->irq_lock);
 	spin_unlock_irqrestore(&dev->event_lock, flags);
diff --git a/drivers/gpu/drm/vc4/vc4_drv.h b/drivers/gpu/drm/vc4/vc4_drv.h
index 801da3e8ebdb..15e0c2ac3940 100644
--- a/drivers/gpu/drm/vc4/vc4_drv.h
+++ b/drivers/gpu/drm/vc4/vc4_drv.h
@@ -574,8 +574,8 @@ to_vc4_crtc_state(struct drm_crtc_state *crtc_state)
 
 #define V3D_READ(offset) readl(vc4->v3d->regs + offset)
 #define V3D_WRITE(offset, val) writel(val, vc4->v3d->regs + offset)
-#define HVS_READ(offset) readl(vc4->hvs->regs + offset)
-#define HVS_WRITE(offset, val) writel(val, vc4->hvs->regs + offset)
+#define HVS_READ(offset) readl(hvs->regs + offset)
+#define HVS_WRITE(offset, val) writel(val, hvs->regs + offset)
 
 #define VC4_REG32(reg) { .name = #reg, .offset = reg }
 
@@ -933,17 +933,17 @@ void vc4_irq_reset(struct drm_device *dev);
 
 /* vc4_hvs.c */
 extern struct platform_driver vc4_hvs_driver;
-void vc4_hvs_stop_channel(struct drm_device *dev, unsigned int output);
-int vc4_hvs_get_fifo_from_output(struct drm_device *dev, unsigned int output);
-u8 vc4_hvs_get_fifo_frame_count(struct drm_device *dev, unsigned int fifo);
+void vc4_hvs_stop_channel(struct vc4_hvs *hvs, unsigned int output);
+int vc4_hvs_get_fifo_from_output(struct vc4_hvs *hvs, unsigned int output);
+u8 vc4_hvs_get_fifo_frame_count(struct vc4_hvs *hvs, unsigned int fifo);
 int vc4_hvs_atomic_check(struct drm_crtc *crtc, struct drm_atomic_state *state);
 void vc4_hvs_atomic_begin(struct drm_crtc *crtc, struct drm_atomic_state *state);
 void vc4_hvs_atomic_enable(struct drm_crtc *crtc, struct drm_atomic_state *state);
 void vc4_hvs_atomic_disable(struct drm_crtc *crtc, struct drm_atomic_state *state);
 void vc4_hvs_atomic_flush(struct drm_crtc *crtc, struct drm_atomic_state *state);
-void vc4_hvs_dump_state(struct drm_device *dev);
-void vc4_hvs_unmask_underrun(struct drm_device *dev, int channel);
-void vc4_hvs_mask_underrun(struct drm_device *dev, int channel);
+void vc4_hvs_dump_state(struct vc4_hvs *hvs);
+void vc4_hvs_unmask_underrun(struct vc4_hvs *hvs, int channel);
+void vc4_hvs_mask_underrun(struct vc4_hvs *hvs, int channel);
 
 /* vc4_kms.c */
 int vc4_kms_load(struct drm_device *dev);
diff --git a/drivers/gpu/drm/vc4/vc4_hvs.c b/drivers/gpu/drm/vc4/vc4_hvs.c
index 5db125dc2358..50faa5227cf1 100644
--- a/drivers/gpu/drm/vc4/vc4_hvs.c
+++ b/drivers/gpu/drm/vc4/vc4_hvs.c
@@ -64,22 +64,21 @@ static const struct debugfs_reg32 hvs_regs[] = {
 	VC4_REG32(SCALER_OLEDCOEF2),
 };
 
-void vc4_hvs_dump_state(struct drm_device *dev)
+void vc4_hvs_dump_state(struct vc4_hvs *hvs)
 {
-	struct vc4_dev *vc4 = to_vc4_dev(dev);
-	struct drm_printer p = drm_info_printer(&vc4->hvs->pdev->dev);
+	struct drm_printer p = drm_info_printer(&hvs->pdev->dev);
 	int i;
 
-	drm_print_regset32(&p, &vc4->hvs->regset);
+	drm_print_regset32(&p, &hvs->regset);
 
 	DRM_INFO("HVS ctx:\n");
 	for (i = 0; i < 64; i += 4) {
 		DRM_INFO("0x%08x (%s): 0x%08x 0x%08x 0x%08x 0x%08x\n",
 			 i * 4, i < HVS_BOOTLOADER_DLIST_END ? "B" : "D",
-			 readl((u32 __iomem *)vc4->hvs->dlist + i + 0),
-			 readl((u32 __iomem *)vc4->hvs->dlist + i + 1),
-			 readl((u32 __iomem *)vc4->hvs->dlist + i + 2),
-			 readl((u32 __iomem *)vc4->hvs->dlist + i + 3));
+			 readl((u32 __iomem *)hvs->dlist + i + 0),
+			 readl((u32 __iomem *)hvs->dlist + i + 1),
+			 readl((u32 __iomem *)hvs->dlist + i + 2),
+			 readl((u32 __iomem *)hvs->dlist + i + 3));
 	}
 }
 
@@ -157,11 +156,10 @@ static int vc4_hvs_upload_linear_kernel(struct vc4_hvs *hvs,
 	return 0;
 }
 
-static void vc4_hvs_lut_load(struct drm_crtc *crtc)
+static void vc4_hvs_lut_load(struct vc4_hvs *hvs,
+			     struct vc4_crtc *vc4_crtc)
 {
-	struct drm_device *dev = crtc->dev;
-	struct vc4_dev *vc4 = to_vc4_dev(dev);
-	struct vc4_crtc *vc4_crtc = to_vc4_crtc(crtc);
+	struct drm_crtc *crtc = &vc4_crtc->base;
 	struct vc4_crtc_state *vc4_state = to_vc4_crtc_state(crtc->state);
 	u32 i;
 
@@ -181,11 +179,12 @@ static void vc4_hvs_lut_load(struct drm_crtc *crtc)
 		HVS_WRITE(SCALER_GAMDATA, vc4_crtc->lut_b[i]);
 }
 
-static void vc4_hvs_update_gamma_lut(struct drm_crtc *crtc)
+static void vc4_hvs_update_gamma_lut(struct vc4_hvs *hvs,
+				     struct vc4_crtc *vc4_crtc)
 {
-	struct vc4_crtc *vc4_crtc = to_vc4_crtc(crtc);
-	struct drm_color_lut *lut = crtc->state->gamma_lut->data;
-	u32 length = drm_color_lut_size(crtc->state->gamma_lut);
+	struct drm_crtc_state *crtc_state = vc4_crtc->base.state;
+	struct drm_color_lut *lut = crtc_state->gamma_lut->data;
+	u32 length = drm_color_lut_size(crtc_state->gamma_lut);
 	u32 i;
 
 	for (i = 0; i < length; i++) {
@@ -194,12 +193,11 @@ static void vc4_hvs_update_gamma_lut(struct drm_crtc *crtc)
 		vc4_crtc->lut_b[i] = drm_color_lut_extract(lut[i].blue, 8);
 	}
 
-	vc4_hvs_lut_load(crtc);
+	vc4_hvs_lut_load(hvs, vc4_crtc);
 }
 
-u8 vc4_hvs_get_fifo_frame_count(struct drm_device *dev, unsigned int fifo)
+u8 vc4_hvs_get_fifo_frame_count(struct vc4_hvs *hvs, unsigned int fifo)
 {
-	struct vc4_dev *vc4 = to_vc4_dev(dev);
 	u8 field = 0;
 
 	switch (fifo) {
@@ -220,13 +218,12 @@ u8 vc4_hvs_get_fifo_frame_count(struct drm_device *dev, unsigned int fifo)
 	return field;
 }
 
-int vc4_hvs_get_fifo_from_output(struct drm_device *dev, unsigned int output)
+int vc4_hvs_get_fifo_from_output(struct vc4_hvs *hvs, unsigned int output)
 {
-	struct vc4_dev *vc4 = to_vc4_dev(dev);
 	u32 reg;
 	int ret;
 
-	if (!vc4->hvs->hvs5)
+	if (!hvs->hvs5)
 		return output;
 
 	switch (output) {
@@ -273,9 +270,10 @@ int vc4_hvs_get_fifo_from_output(struct drm_device *dev, unsigned int output)
 	}
 }
 
-static int vc4_hvs_init_channel(struct vc4_dev *vc4, struct drm_crtc *crtc,
+static int vc4_hvs_init_channel(struct vc4_hvs *hvs, struct drm_crtc *crtc,
 				struct drm_display_mode *mode, bool oneshot)
 {
+	struct vc4_crtc *vc4_crtc = to_vc4_crtc(crtc);
 	struct vc4_crtc_state *vc4_crtc_state = to_vc4_crtc_state(crtc->state);
 	unsigned int chan = vc4_crtc_state->assigned_channel;
 	bool interlace = mode->flags & DRM_MODE_FLAG_INTERLACE;
@@ -293,7 +291,7 @@ static int vc4_hvs_init_channel(struct vc4_dev *vc4, struct drm_crtc *crtc,
 	 */
 	dispctrl = SCALER_DISPCTRLX_ENABLE;
 
-	if (!vc4->hvs->hvs5)
+	if (!hvs->hvs5)
 		dispctrl |= VC4_SET_FIELD(mode->hdisplay,
 					  SCALER_DISPCTRLX_WIDTH) |
 			    VC4_SET_FIELD(mode->vdisplay,
@@ -314,21 +312,19 @@ static int vc4_hvs_init_channel(struct vc4_dev *vc4, struct drm_crtc *crtc,
 
 	HVS_WRITE(SCALER_DISPBKGNDX(chan), dispbkgndx |
 		  SCALER_DISPBKGND_AUTOHS |
-		  ((!vc4->hvs->hvs5) ? SCALER_DISPBKGND_GAMMA : 0) |
+		  ((!hvs->hvs5) ? SCALER_DISPBKGND_GAMMA : 0) |
 		  (interlace ? SCALER_DISPBKGND_INTERLACE : 0));
 
 	/* Reload the LUT, since the SRAMs would have been disabled if
 	 * all CRTCs had SCALER_DISPBKGND_GAMMA unset at once.
 	 */
-	vc4_hvs_lut_load(crtc);
+	vc4_hvs_lut_load(hvs, vc4_crtc);
 
 	return 0;
 }
 
-void vc4_hvs_stop_channel(struct drm_device *dev, unsigned int chan)
+void vc4_hvs_stop_channel(struct vc4_hvs *hvs, unsigned int chan)
 {
-	struct vc4_dev *vc4 = to_vc4_dev(dev);
-
 	if (HVS_READ(SCALER_DISPCTRLX(chan)) & SCALER_DISPCTRLX_ENABLE)
 		return;
 
@@ -386,6 +382,7 @@ static void vc4_hvs_install_dlist(struct drm_crtc *crtc)
 {
 	struct drm_device *dev = crtc->dev;
 	struct vc4_dev *vc4 = to_vc4_dev(dev);
+	struct vc4_hvs *hvs = vc4->hvs;
 	struct vc4_crtc_state *vc4_state = to_vc4_crtc_state(crtc->state);
 
 	HVS_WRITE(SCALER_DISPLISTX(vc4_state->assigned_channel),
@@ -442,18 +439,19 @@ void vc4_hvs_atomic_enable(struct drm_crtc *crtc,
 
 	vc4_hvs_install_dlist(crtc);
 	vc4_hvs_update_dlist(crtc);
-	vc4_hvs_init_channel(vc4, crtc, mode, oneshot);
+	vc4_hvs_init_channel(vc4->hvs, crtc, mode, oneshot);
 }
 
 void vc4_hvs_atomic_disable(struct drm_crtc *crtc,
 			    struct drm_atomic_state *state)
 {
 	struct drm_device *dev = crtc->dev;
+	struct vc4_dev *vc4 = to_vc4_dev(dev);
 	struct drm_crtc_state *old_state = drm_atomic_get_old_crtc_state(state, crtc);
 	struct vc4_crtc_state *vc4_state = to_vc4_crtc_state(old_state);
 	unsigned int chan = vc4_state->assigned_channel;
 
-	vc4_hvs_stop_channel(dev, chan);
+	vc4_hvs_stop_channel(vc4->hvs, chan);
 }
 
 void vc4_hvs_atomic_flush(struct drm_crtc *crtc,
@@ -463,6 +461,8 @@ void vc4_hvs_atomic_flush(struct drm_crtc *crtc,
 									 crtc);
 	struct drm_device *dev = crtc->dev;
 	struct vc4_dev *vc4 = to_vc4_dev(dev);
+	struct vc4_hvs *hvs = vc4->hvs;
+	struct vc4_crtc *vc4_crtc = to_vc4_crtc(crtc);
 	struct vc4_crtc_state *vc4_state = to_vc4_crtc_state(crtc->state);
 	unsigned int channel = vc4_state->assigned_channel;
 	struct drm_plane *plane;
@@ -477,7 +477,7 @@ void vc4_hvs_atomic_flush(struct drm_crtc *crtc,
 
 	if (debug_dump_regs) {
 		DRM_INFO("CRTC %d HVS before:\n", drm_crtc_index(crtc));
-		vc4_hvs_dump_state(dev);
+		vc4_hvs_dump_state(hvs);
 	}
 
 	/* Copy all the active planes' dlist contents to the hardware dlist. */
@@ -528,7 +528,7 @@ void vc4_hvs_atomic_flush(struct drm_crtc *crtc,
 		u32 dispbkgndx = HVS_READ(SCALER_DISPBKGNDX(channel));
 
 		if (crtc->state->gamma_lut) {
-			vc4_hvs_update_gamma_lut(crtc);
+			vc4_hvs_update_gamma_lut(hvs, vc4_crtc);
 			dispbkgndx |= SCALER_DISPBKGND_GAMMA;
 		} else {
 			/* Unsetting DISPBKGND_GAMMA skips the gamma lut step
@@ -542,13 +542,12 @@ void vc4_hvs_atomic_flush(struct drm_crtc *crtc,
 
 	if (debug_dump_regs) {
 		DRM_INFO("CRTC %d HVS after:\n", drm_crtc_index(crtc));
-		vc4_hvs_dump_state(dev);
+		vc4_hvs_dump_state(hvs);
 	}
 }
 
-void vc4_hvs_mask_underrun(struct drm_device *dev, int channel)
+void vc4_hvs_mask_underrun(struct vc4_hvs *hvs, int channel)
 {
-	struct vc4_dev *vc4 = to_vc4_dev(dev);
 	u32 dispctrl = HVS_READ(SCALER_DISPCTRL);
 
 	dispctrl &= ~SCALER_DISPCTRL_DSPEISLUR(channel);
@@ -556,9 +555,8 @@ void vc4_hvs_mask_underrun(struct drm_device *dev, int channel)
 	HVS_WRITE(SCALER_DISPCTRL, dispctrl);
 }
 
-void vc4_hvs_unmask_underrun(struct drm_device *dev, int channel)
+void vc4_hvs_unmask_underrun(struct vc4_hvs *hvs, int channel)
 {
-	struct vc4_dev *vc4 = to_vc4_dev(dev);
 	u32 dispctrl = HVS_READ(SCALER_DISPCTRL);
 
 	dispctrl |= SCALER_DISPCTRL_DSPEISLUR(channel);
@@ -580,6 +578,7 @@ static irqreturn_t vc4_hvs_irq_handler(int irq, void *data)
 {
 	struct drm_device *dev = data;
 	struct vc4_dev *vc4 = to_vc4_dev(dev);
+	struct vc4_hvs *hvs = vc4->hvs;
 	irqreturn_t irqret = IRQ_NONE;
 	int channel;
 	u32 control;
@@ -592,7 +591,7 @@ static irqreturn_t vc4_hvs_irq_handler(int irq, void *data)
 		/* Interrupt masking is not always honored, so check it here. */
 		if (status & SCALER_DISPSTAT_EUFLOW(channel) &&
 		    control & SCALER_DISPCTRL_DSPEISLUR(channel)) {
-			vc4_hvs_mask_underrun(dev, channel);
+			vc4_hvs_mask_underrun(hvs, channel);
 			vc4_hvs_report_underrun(dev);
 
 			irqret = IRQ_HANDLED;
diff --git a/drivers/gpu/drm/vc4/vc4_kms.c b/drivers/gpu/drm/vc4/vc4_kms.c
index 992d6a240002..0cee33464491 100644
--- a/drivers/gpu/drm/vc4/vc4_kms.c
+++ b/drivers/gpu/drm/vc4/vc4_kms.c
@@ -157,6 +157,7 @@ static u16 vc4_ctm_s31_32_to_s0_9(u64 in)
 static void
 vc4_ctm_commit(struct vc4_dev *vc4, struct drm_atomic_state *state)
 {
+	struct vc4_hvs *hvs = vc4->hvs;
 	struct vc4_ctm_state *ctm_state = to_vc4_ctm_state(vc4->ctm_manager.state);
 	struct drm_color_ctm *ctm = ctm_state->ctm;
 
@@ -230,6 +231,7 @@ vc4_hvs_get_global_state(struct drm_atomic_state *state)
 static void vc4_hvs_pv_muxing_commit(struct vc4_dev *vc4,
 				     struct drm_atomic_state *state)
 {
+	struct vc4_hvs *hvs = vc4->hvs;
 	struct drm_crtc_state *crtc_state;
 	struct drm_crtc *crtc;
 	unsigned int i;
@@ -270,6 +272,7 @@ static void vc4_hvs_pv_muxing_commit(struct vc4_dev *vc4,
 static void vc5_hvs_pv_muxing_commit(struct vc4_dev *vc4,
 				     struct drm_atomic_state *state)
 {
+	struct vc4_hvs *hvs = vc4->hvs;
 	struct drm_crtc_state *crtc_state;
 	struct drm_crtc *crtc;
 	unsigned char mux;
@@ -362,7 +365,7 @@ static void vc4_atomic_commit_tail(struct drm_atomic_state *state)
 			continue;
 
 		vc4_crtc_state = to_vc4_crtc_state(new_crtc_state);
-		vc4_hvs_mask_underrun(dev, vc4_crtc_state->assigned_channel);
+		vc4_hvs_mask_underrun(hvs, vc4_crtc_state->assigned_channel);
 	}
 
 	for (channel = 0; channel < HVS_NUM_CHANNELS; channel++) {
-- 
2.35.1


^ permalink raw reply related	[flat|nested] 9+ messages in thread

* [PATCH 8/8] drm/vc4: hvs: Defer dlist slots deallocation
  2022-02-21 13:41 [PATCH 0/8] drm/vc4: Fix frame corruption when moving the cursor Maxime Ripard
                   ` (6 preceding siblings ...)
  2022-02-21 13:41 ` [PATCH 7/8] drm/vc4: hvs: Use pointer to HVS in HVS_READ and HVS_WRITE macros Maxime Ripard
@ 2022-02-21 13:41 ` Maxime Ripard
  7 siblings, 0 replies; 9+ messages in thread
From: Maxime Ripard @ 2022-02-21 13:41 UTC (permalink / raw)
  To: Daniel Vetter, David Airlie
  Cc: Dom Cobley, Tim Gover, Dave Stevenson, dri-devel, Maxime Ripard,
	Thomas Zimmermann, Phil Elwell

During normal operations, the cursor position update is done through an
asynchronous plane update, which on the vc4 driver basically just
modifies the right dlist word to move the plane to the new coordinates.

However, when we have the overscan margins setup, we fall back to a
regular commit when we are next to the edges. And since that commit
happens to be on a cursor plane, it's considered a legacy cursor update
by KMS.

The main difference it makes is that it won't wait for its completion
(ie, next vblank) before returning. This means if we have multiple
commits happening in rapid succession, we can have several of them
happening before the next vblank.

In parallel, our dlist allocation is tied to a CRTC state, and each time
we do a commit we end up with a new CRTC state, with the previous one
being freed. This means that we free our previous dlist entry (but don't
clear it though) every time a new one is being committed.

Now, if we were to have two commits happening before the next vblank, we
could end up freeing reusing the same dlist entries before the next
vblank.

Indeed, we would start from an initial state taking, for example, the
dlist entries 10 to 20, then start a commit taking the entries 20 to 30
and setting the dlist pointer to 20, and freeing the dlist entries 10 to
20. However, since we haven't reach vblank yet, the HVS is still using
the entries 10 to 20.

If we were to make a new commit now, chances are the allocator are going
to give the 10 to 20 entries back, and we would change their content to
match the new state. If vblank hasn't happened yet, we just corrupted
the active dlist entries.

A first attempt to solve this was made by creating an intermediate dlist
buffer to store the current (ie, as of the last commit) dlist content,
that we would update each time the HVS is done with a frame. However, if
the interrupt handler missed the vblank window, we would end up copying
our intermediate dlist to the hardware one during the composition,
essentially creating the same issue.

Since making sure that our interrupt handler runs within a fixed,
constrained, time window would require to make Linux a real-time kernel,
this seems a bit out of scope.

Instead, we can work around our original issue by keeping the dlist
slots allocation longer. That way, we won't reuse a dlist slot while
it's still in flight. In order to achieve this, instead of freeing the
dlist slot when its associated CRTC state is destroyed, we'll queue it
in a list.

A naive implementation would free the buffers in that queue when we get
our end of frame interrupt. However, there's still a race since, just
like in the shadow dlist case, we don't control when the handler for
that interrupt is going to run. Thus, we can end up with a commit adding
an old dlist allocation to our queue during the window between our
actual interrupt and when our handler will run. And since that buffer is
still being used for the composition of the current frame, we can't free
it right away, exposing us to the original bug.

Fortunately for us, the hardware provides a frame counter that is
increased each time the first line of a frame is being generated.
Associating the frame counter the image is supposed to go away to the
allocation, and then only deallocate buffers that have a counter below
or equal to the one we see when the deallocation code should prevent the
above race from occuring.

Signed-off-by: Maxime Ripard <maxime@cerno.tech>
---
 drivers/gpu/drm/vc4/vc4_crtc.c |  10 +-
 drivers/gpu/drm/vc4/vc4_drv.h  |  15 ++-
 drivers/gpu/drm/vc4/vc4_hvs.c  | 181 ++++++++++++++++++++++++++++++---
 drivers/gpu/drm/vc4/vc4_regs.h |   1 +
 4 files changed, 184 insertions(+), 23 deletions(-)

diff --git a/drivers/gpu/drm/vc4/vc4_crtc.c b/drivers/gpu/drm/vc4/vc4_crtc.c
index 5bb4027e479e..ed5fc400b66d 100644
--- a/drivers/gpu/drm/vc4/vc4_crtc.c
+++ b/drivers/gpu/drm/vc4/vc4_crtc.c
@@ -920,14 +920,8 @@ void vc4_crtc_destroy_state(struct drm_crtc *crtc,
 	struct vc4_dev *vc4 = to_vc4_dev(crtc->dev);
 	struct vc4_crtc_state *vc4_state = to_vc4_crtc_state(state);
 
-	if (drm_mm_node_allocated(&vc4_state->mm)) {
-		unsigned long flags;
-
-		spin_lock_irqsave(&vc4->hvs->mm_lock, flags);
-		drm_mm_remove_node(&vc4_state->mm);
-		spin_unlock_irqrestore(&vc4->hvs->mm_lock, flags);
-
-	}
+	vc4_hvs_mark_dlist_entry_stale(vc4->hvs, vc4_state->mm);
+	vc4_state->mm = NULL;
 
 	drm_atomic_helper_crtc_destroy_state(crtc, state);
 }
diff --git a/drivers/gpu/drm/vc4/vc4_drv.h b/drivers/gpu/drm/vc4/vc4_drv.h
index 15e0c2ac3940..4d155af0262d 100644
--- a/drivers/gpu/drm/vc4/vc4_drv.h
+++ b/drivers/gpu/drm/vc4/vc4_drv.h
@@ -330,6 +330,9 @@ struct vc4_hvs {
 	struct drm_mm lbm_mm;
 	spinlock_t mm_lock;
 
+	struct list_head stale_dlist_entries;
+	struct work_struct free_dlist_work;
+
 	struct drm_mm_node mitchell_netravali_filter;
 
 	struct debugfs_regset32 regset;
@@ -544,10 +547,16 @@ vc4_crtc_to_vc4_pv_data(const struct vc4_crtc *crtc)
 struct drm_encoder *vc4_get_crtc_encoder(struct drm_crtc *crtc,
 					 struct drm_crtc_state *state);
 
+struct vc4_hvs_dlist_allocation {
+	struct list_head node;
+	struct drm_mm_node mm_node;
+	unsigned int channel;
+	u8 target_frame_count;
+};
+
 struct vc4_crtc_state {
 	struct drm_crtc_state base;
-	/* Dlist area for this CRTC configuration. */
-	struct drm_mm_node mm;
+	struct vc4_hvs_dlist_allocation *mm;
 	bool txp_armed;
 	unsigned int assigned_channel;
 
@@ -936,6 +945,8 @@ extern struct platform_driver vc4_hvs_driver;
 void vc4_hvs_stop_channel(struct vc4_hvs *hvs, unsigned int output);
 int vc4_hvs_get_fifo_from_output(struct vc4_hvs *hvs, unsigned int output);
 u8 vc4_hvs_get_fifo_frame_count(struct vc4_hvs *hvs, unsigned int fifo);
+void vc4_hvs_mark_dlist_entry_stale(struct vc4_hvs *hvs,
+				    struct vc4_hvs_dlist_allocation *alloc);
 int vc4_hvs_atomic_check(struct drm_crtc *crtc, struct drm_atomic_state *state);
 void vc4_hvs_atomic_begin(struct drm_crtc *crtc, struct drm_atomic_state *state);
 void vc4_hvs_atomic_enable(struct drm_crtc *crtc, struct drm_atomic_state *state);
diff --git a/drivers/gpu/drm/vc4/vc4_hvs.c b/drivers/gpu/drm/vc4/vc4_hvs.c
index 50faa5227cf1..70f8ee916133 100644
--- a/drivers/gpu/drm/vc4/vc4_hvs.c
+++ b/drivers/gpu/drm/vc4/vc4_hvs.c
@@ -196,6 +196,150 @@ static void vc4_hvs_update_gamma_lut(struct vc4_hvs *hvs,
 	vc4_hvs_lut_load(hvs, vc4_crtc);
 }
 
+static void vc4_hvs_irq_enable_eof(const struct vc4_hvs *hvs,
+				   unsigned int channel)
+{
+	u32 irq_mask = hvs->hvs5 ?
+		SCALER5_DISPCTRL_DSPEIEOF(channel) :
+		SCALER_DISPCTRL_DSPEIEOF(channel);
+
+	HVS_WRITE(SCALER_DISPCTRL,
+		  HVS_READ(SCALER_DISPCTRL) | irq_mask);
+}
+
+static void vc4_hvs_irq_clear_eof(const struct vc4_hvs *hvs,
+				  unsigned int channel)
+{
+	u32 irq_mask = hvs->hvs5 ?
+		SCALER5_DISPCTRL_DSPEIEOF(channel) :
+		SCALER_DISPCTRL_DSPEIEOF(channel);
+
+	HVS_WRITE(SCALER_DISPCTRL,
+		  HVS_READ(SCALER_DISPCTRL) & ~irq_mask);
+}
+
+static struct vc4_hvs_dlist_allocation *
+vc4_hvs_alloc_dlist_entry(struct vc4_hvs *hvs,
+			  unsigned int channel,
+			  size_t dlist_count)
+{
+	struct vc4_hvs_dlist_allocation *alloc;
+	unsigned long flags;
+	int ret;
+
+	if (channel == VC4_HVS_CHANNEL_DISABLED)
+		return NULL;
+
+	alloc = kzalloc(sizeof(*alloc), GFP_KERNEL);
+	if (!alloc)
+		return ERR_PTR(-ENOMEM);
+
+	spin_lock_irqsave(&hvs->mm_lock, flags);
+	ret = drm_mm_insert_node(&hvs->dlist_mm, &alloc->mm_node,
+				 dlist_count);
+	spin_unlock_irqrestore(&hvs->mm_lock, flags);
+	if (ret)
+		return ERR_PTR(ret);
+
+	alloc->channel = channel;
+
+	return alloc;
+}
+
+void vc4_hvs_mark_dlist_entry_stale(struct vc4_hvs *hvs,
+				    struct vc4_hvs_dlist_allocation *alloc)
+{
+	unsigned long flags;
+	u8 frcnt;
+
+	if (!alloc)
+		return;
+
+	if (!drm_mm_node_allocated(&alloc->mm_node))
+		return;
+
+	frcnt = vc4_hvs_get_fifo_frame_count(hvs, alloc->channel);
+	alloc->target_frame_count = (frcnt + 1) & ((1 << 6) - 1);
+
+	spin_lock_irqsave(&hvs->mm_lock, flags);
+
+	list_add_tail(&alloc->node, &hvs->stale_dlist_entries);
+
+	HVS_WRITE(SCALER_DISPSTAT, SCALER_DISPSTAT_EOF(alloc->channel));
+	vc4_hvs_irq_enable_eof(hvs, alloc->channel);
+
+	spin_unlock_irqrestore(&hvs->mm_lock, flags);
+}
+
+static void vc4_hvs_schedule_dlist_sweep(struct vc4_hvs *hvs,
+					 unsigned int channel)
+{
+	unsigned long flags;
+
+	spin_lock_irqsave(&hvs->mm_lock, flags);
+
+	if (!list_empty(&hvs->stale_dlist_entries))
+		queue_work(system_unbound_wq, &hvs->free_dlist_work);
+
+	vc4_hvs_irq_clear_eof(hvs, channel);
+
+	spin_unlock_irqrestore(&hvs->mm_lock, flags);
+}
+
+/*
+ * Frame counts are essentially sequence numbers over 6 bits, and we
+ * thus can use sequence number arithmetic and follow the RFC1982 to
+ * implement proper comparison between them.
+ */
+static bool vc4_hvs_frcnt_lte(u8 cnt1, u8 cnt2)
+{
+	return (s8)((cnt1 << 2) - (cnt2 << 2)) <= 0;
+}
+
+/*
+ * Some atomic commits (legacy cursor updates, mostly) will not wait for
+ * the next vblank and will just return once the commit has been pushed
+ * to the hardware.
+ *
+ * On the hardware side, our HVS stores the planes parameters in its
+ * context RAM, and will use part of the RAM to store data during the
+ * frame rendering.
+ *
+ * This interacts badly if we get multiple commits before the next
+ * vblank since we could end up overwriting the DLIST entries used by
+ * previous commits if our dlist allocation reuses that entry. In such a
+ * case, we would overwrite the data currently being used by the
+ * hardware, resulting in a corrupted frame.
+ *
+ * In order to work around this, we'll queue the dlist entries in a list
+ * once the associated CRTC state is destroyed. The HVS only allows us
+ * to know which entry is being active, but not which one are no longer
+ * being used, so in order to avoid freeing entries that are still used
+ * by the hardware we add a guesstimate of the frame count where our
+ * entry will no longer be used, and thus will only free those entries
+ * when we will have reached that frame count.
+ */
+static void vc4_hvs_dlist_free_work(struct work_struct *work)
+{
+	struct vc4_hvs *hvs = container_of(work, struct vc4_hvs, free_dlist_work);
+	struct vc4_hvs_dlist_allocation *cur, *next;
+	unsigned long flags;
+
+	spin_lock_irqsave(&hvs->mm_lock, flags);
+	list_for_each_entry_safe(cur, next, &hvs->stale_dlist_entries, node) {
+		u8 frcnt;
+
+		frcnt = vc4_hvs_get_fifo_frame_count(hvs, cur->channel);
+		if (!vc4_hvs_frcnt_lte(cur->target_frame_count, frcnt))
+			continue;
+
+		list_del(&cur->node);
+		drm_mm_remove_node(&cur->mm_node);
+		kfree(cur);
+	}
+	spin_unlock_irqrestore(&hvs->mm_lock, flags);
+}
+
 u8 vc4_hvs_get_fifo_frame_count(struct vc4_hvs *hvs, unsigned int fifo)
 {
 	u8 field = 0;
@@ -349,13 +493,12 @@ int vc4_hvs_atomic_check(struct drm_crtc *crtc, struct drm_atomic_state *state)
 {
 	struct drm_crtc_state *crtc_state = drm_atomic_get_new_crtc_state(state, crtc);
 	struct vc4_crtc_state *vc4_state = to_vc4_crtc_state(crtc_state);
+	struct vc4_hvs_dlist_allocation *alloc;
 	struct drm_device *dev = crtc->dev;
 	struct vc4_dev *vc4 = to_vc4_dev(dev);
 	struct drm_plane *plane;
-	unsigned long flags;
 	const struct drm_plane_state *plane_state;
 	u32 dlist_count = 0;
-	int ret;
 
 	/* The pixelvalve can only feed one encoder (and encoders are
 	 * 1:1 with connectors.)
@@ -368,12 +511,11 @@ int vc4_hvs_atomic_check(struct drm_crtc *crtc, struct drm_atomic_state *state)
 
 	dlist_count++; /* Account for SCALER_CTL0_END. */
 
-	spin_lock_irqsave(&vc4->hvs->mm_lock, flags);
-	ret = drm_mm_insert_node(&vc4->hvs->dlist_mm, &vc4_state->mm,
-				 dlist_count);
-	spin_unlock_irqrestore(&vc4->hvs->mm_lock, flags);
-	if (ret)
-		return ret;
+	alloc = vc4_hvs_alloc_dlist_entry(vc4->hvs, vc4_state->assigned_channel, dlist_count);
+	if (IS_ERR(alloc))
+		return PTR_ERR(alloc);
+
+	vc4_state->mm = alloc;
 
 	return 0;
 }
@@ -385,8 +527,9 @@ static void vc4_hvs_install_dlist(struct drm_crtc *crtc)
 	struct vc4_hvs *hvs = vc4->hvs;
 	struct vc4_crtc_state *vc4_state = to_vc4_crtc_state(crtc->state);
 
+	WARN_ON(!vc4_state->mm);
 	HVS_WRITE(SCALER_DISPLISTX(vc4_state->assigned_channel),
-		  vc4_state->mm.start);
+		  vc4_state->mm->mm_node.start);
 }
 
 static void vc4_hvs_update_dlist(struct drm_crtc *crtc)
@@ -411,8 +554,10 @@ static void vc4_hvs_update_dlist(struct drm_crtc *crtc)
 		spin_unlock_irqrestore(&dev->event_lock, flags);
 	}
 
+	WARN_ON(!vc4_state->mm);
+
 	spin_lock_irqsave(&vc4_crtc->irq_lock, flags);
-	vc4_crtc->current_dlist = vc4_state->mm.start;
+	vc4_crtc->current_dlist = vc4_state->mm->mm_node.start;
 	spin_unlock_irqrestore(&vc4_crtc->irq_lock, flags);
 }
 
@@ -469,8 +614,7 @@ void vc4_hvs_atomic_flush(struct drm_crtc *crtc,
 	struct vc4_plane_state *vc4_plane_state;
 	bool debug_dump_regs = false;
 	bool enable_bg_fill = false;
-	u32 __iomem *dlist_start = vc4->hvs->dlist + vc4_state->mm.start;
-	u32 __iomem *dlist_next = dlist_start;
+	u32 __iomem *dlist_start, *dlist_next;
 
 	if (vc4_state->assigned_channel == VC4_HVS_CHANNEL_DISABLED)
 		return;
@@ -480,6 +624,9 @@ void vc4_hvs_atomic_flush(struct drm_crtc *crtc,
 		vc4_hvs_dump_state(hvs);
 	}
 
+	dlist_start = vc4->hvs->dlist + vc4_state->mm->mm_node.start;
+	dlist_next = dlist_start;
+
 	/* Copy all the active planes' dlist contents to the hardware dlist. */
 	drm_atomic_crtc_for_each_plane(plane, crtc) {
 		/* Is this the first active plane? */
@@ -502,7 +649,8 @@ void vc4_hvs_atomic_flush(struct drm_crtc *crtc,
 	writel(SCALER_CTL0_END, dlist_next);
 	dlist_next++;
 
-	WARN_ON_ONCE(dlist_next - dlist_start != vc4_state->mm.size);
+	WARN_ON(!vc4_state->mm);
+	WARN_ON_ONCE(dlist_next - dlist_start != vc4_state->mm->mm_node.size);
 
 	if (enable_bg_fill)
 		/* This sets a black background color fill, as is the case
@@ -596,6 +744,11 @@ static irqreturn_t vc4_hvs_irq_handler(int irq, void *data)
 
 			irqret = IRQ_HANDLED;
 		}
+
+		if (status & SCALER_DISPSTAT_EOF(channel)) {
+			vc4_hvs_schedule_dlist_sweep(hvs, channel);
+			irqret = IRQ_HANDLED;
+		}
 	}
 
 	/* Clear every per-channel interrupt flag. */
@@ -652,6 +805,8 @@ static int vc4_hvs_bind(struct device *dev, struct device *master, void *data)
 		hvs->dlist = hvs->regs + SCALER5_DLIST_START;
 
 	spin_lock_init(&hvs->mm_lock);
+	INIT_LIST_HEAD(&hvs->stale_dlist_entries);
+	INIT_WORK(&hvs->free_dlist_work, vc4_hvs_dlist_free_work);
 
 	/* Set up the HVS display list memory manager.  We never
 	 * overwrite the setup from the bootloader (just 128b out of
diff --git a/drivers/gpu/drm/vc4/vc4_regs.h b/drivers/gpu/drm/vc4/vc4_regs.h
index bae8c9cd6f7c..deb0c2f9c97f 100644
--- a/drivers/gpu/drm/vc4/vc4_regs.h
+++ b/drivers/gpu/drm/vc4/vc4_regs.h
@@ -234,6 +234,7 @@
 # define SCALER_DISPCTRL_DSPEIEOLN(x)		BIT(8 + ((x) * 2))
 /* Enables Display 0 EOF contribution to SCALER_DISPSTAT_IRQDISP0 */
 # define SCALER_DISPCTRL_DSPEIEOF(x)		BIT(7 + ((x) * 2))
+# define SCALER5_DISPCTRL_DSPEIEOF(x)		BIT(7 + ((x) * 4))
 
 # define SCALER_DISPCTRL_SLVRDEIRQ		BIT(6)
 # define SCALER_DISPCTRL_SLVWREIRQ		BIT(5)
-- 
2.35.1


^ permalink raw reply related	[flat|nested] 9+ messages in thread

end of thread, other threads:[~2022-02-21 13:42 UTC | newest]

Thread overview: 9+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-02-21 13:41 [PATCH 0/8] drm/vc4: Fix frame corruption when moving the cursor Maxime Ripard
2022-02-21 13:41 ` [PATCH 1/8] drm/vc4: kms: Take old state core clock rate into account Maxime Ripard
2022-02-21 13:41 ` [PATCH 2/8] drm/vc4: hvs: Fix frame count register readout Maxime Ripard
2022-02-21 13:41 ` [PATCH 3/8] drm/vc4: hvs: Store channel in variable Maxime Ripard
2022-02-21 13:41 ` [PATCH 4/8] drm/vc4: hvs: Remove dlist setup duplication Maxime Ripard
2022-02-21 13:41 ` [PATCH 5/8] drm/vc4: hvs: Move the dlist setup to its own function Maxime Ripard
2022-02-21 13:41 ` [PATCH 6/8] drm/vc4: hvs: Ignore atomic_flush if we're disabled Maxime Ripard
2022-02-21 13:41 ` [PATCH 7/8] drm/vc4: hvs: Use pointer to HVS in HVS_READ and HVS_WRITE macros Maxime Ripard
2022-02-21 13:41 ` [PATCH 8/8] drm/vc4: hvs: Defer dlist slots deallocation Maxime Ripard

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.