[PATCH 00/14] drm/vc4: Properly separate v3d on BCM2711, and fix frame ordering

All of lore.kernel.org
 help / color / mirror / Atom feed

* [PATCH 00/14] drm/vc4: Properly separate v3d on BCM2711, and fix frame ordering
@ 2022-05-03 12:13 Maxime Ripard
  2022-05-03 12:13 ` [PATCH 01/14] drm/vc4: plane: Prevent async update if we don't have a dlist Maxime Ripard
                   ` (14 more replies)
  0 siblings, 15 replies; 25+ messages in thread
From: Maxime Ripard @ 2022-05-03 12:13 UTC (permalink / raw)
  To: Daniel Vetter, David Airlie, Maarten Lankhorst,
	Thomas Zimmermann, Maxime Ripard
  Cc: Melissa Wen, dri-devel

Hi,

Here's a series that fixes a significant issue we missed when adding support
for the BCM2711 / RaspberryPi4 in the vc4 driver.

Indeed, before the introduction of the BCM2711 support, the GPU was fairly
intertwined with the display hardware, and was thus supported by the vc4
driver. Among other things, the driver needed to accomodate for a bunch of
hardware limitations (such as a lack of IOMMU) by implementing a custom memory
management backend, tied with the v3d hardware.

On the BCM2711, that GPU got moved into a completely separate hardware block
and thus we gained a new driver for it, v3d.

However, when we introduced the display support for the BCM2711 in vc4, we
didn't properly split out the v3d-related functions and ended up reusing a
significant portion of the code supposed to be backed by the v3d.

This created a bunch of easy to miss issues that would only pop up with IGT
tests, or when heavily testing some features (like asynchronous page-flipping).

This series properly does the split now by creating separate code path where
relevant, adds a loud complain when we use a v3d entry-point on the BCM2711,
and fixes an issue where we would just ignore any fence on an asynchronous
page-flip, resulting in frames appearing out-of-order for some workloads.

Let me know what you think,
Maxime

Maxime Ripard (14):
  drm/vc4: plane: Prevent async update if we don't have a dlist
  drm/vc4: Consolidate Hardware Revision Check
  drm/vc4: bo: Rename vc4_dumb_create
  drm/vc4: bo: Split out Dumb buffers fixup
  drm/vc4: drv: Register a different driver on BCM2711
  drm/vc4: kms: Register a different drm_mode_config_funcs on BCM2711
  drm/vc4: plane: Register a different drm_plane_helper_funcs on BCM2711
  drm/vc4: drv: Skip BO Backend Initialization on BCM2711
  drm/vc4: crtc: Use an union to store the page flip callback
  drm/vc4: crtc: Move the BO handling out of common page-flip callback
  drm/vc4: crtc: Move the BO Handling out of Common Page-Flip Handler
  drm/vc4: crtc: Don't call into BO Handling on Async Page-Flips on
    BCM2711
  drm/vc4: crtc: Fix out of order frames during asynchronous page flips
  drm/vc4: Warn if some v3d code is run on BCM2711

 drivers/gpu/drm/vc4/vc4_bo.c               |  62 ++++++-
 drivers/gpu/drm/vc4/vc4_crtc.c             | 193 +++++++++++++++------
 drivers/gpu/drm/vc4/vc4_drv.c              |  97 +++++++++--
 drivers/gpu/drm/vc4/vc4_drv.h              |  19 +-
 drivers/gpu/drm/vc4/vc4_gem.c              |  40 +++++
 drivers/gpu/drm/vc4/vc4_hvs.c              |  18 +-
 drivers/gpu/drm/vc4/vc4_irq.c              |  16 ++
 drivers/gpu/drm/vc4/vc4_kms.c              |  24 ++-
 drivers/gpu/drm/vc4/vc4_perfmon.c          |  41 +++++
 drivers/gpu/drm/vc4/vc4_plane.c            |  29 +++-
 drivers/gpu/drm/vc4/vc4_render_cl.c        |   4 +
 drivers/gpu/drm/vc4/vc4_v3d.c              |  15 ++
 drivers/gpu/drm/vc4/vc4_validate.c         |  16 ++
 drivers/gpu/drm/vc4/vc4_validate_shaders.c |   4 +
 14 files changed, 470 insertions(+), 108 deletions(-)

-- 
2.35.1

^ permalink raw reply	[flat|nested] 25+ messages in thread

* [PATCH 01/14] drm/vc4: plane: Prevent async update if we don't have a dlist
  2022-05-03 12:13 [PATCH 00/14] drm/vc4: Properly separate v3d on BCM2711, and fix frame ordering Maxime Ripard
@ 2022-05-03 12:13 ` Maxime Ripard
  2022-05-03 12:13 ` [PATCH 02/14] drm/vc4: Consolidate Hardware Revision Check Maxime Ripard
                   ` (13 subsequent siblings)
  14 siblings, 0 replies; 25+ messages in thread
From: Maxime Ripard @ 2022-05-03 12:13 UTC (permalink / raw)
  To: Daniel Vetter, David Airlie, Maarten Lankhorst,
	Thomas Zimmermann, Maxime Ripard
  Cc: Melissa Wen, dri-devel

The vc4 planes are setup in hardware by creating a hardware descriptor
in a dedicated RAM. As part of the process to setup a plane in KMS, we
thus need to allocate some part of that dedicated RAM to store our
descriptor there.

The async update path will just reuse the descriptor already allocated
for that plane and will modify it directly in RAM to match whatever has
been asked for.

In order to do that, it will compare the descriptor for the old plane
state and the new plane state, will make sure they fit in the same size,
and check that only the position or buffer address have changed.

Signed-off-by: Maxime Ripard <maxime@cerno.tech>
---
 drivers/gpu/drm/vc4/vc4_plane.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/drivers/gpu/drm/vc4/vc4_plane.c b/drivers/gpu/drm/vc4/vc4_plane.c
index b3438f4a81ce..811a2d004cc4 100644
--- a/drivers/gpu/drm/vc4/vc4_plane.c
+++ b/drivers/gpu/drm/vc4/vc4_plane.c
@@ -1321,6 +1321,10 @@ static int vc4_plane_atomic_async_check(struct drm_plane *plane,
 
 	old_vc4_state = to_vc4_plane_state(plane->state);
 	new_vc4_state = to_vc4_plane_state(new_plane_state);
+
+	if (!new_vc4_state->hw_dlist)
+		return -EINVAL;
+
 	if (old_vc4_state->dlist_count != new_vc4_state->dlist_count ||
 	    old_vc4_state->pos0_offset != new_vc4_state->pos0_offset ||
 	    old_vc4_state->pos2_offset != new_vc4_state->pos2_offset ||
-- 
2.35.1


^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [PATCH 02/14] drm/vc4: Consolidate Hardware Revision Check
  2022-05-03 12:13 [PATCH 00/14] drm/vc4: Properly separate v3d on BCM2711, and fix frame ordering Maxime Ripard
  2022-05-03 12:13 ` [PATCH 01/14] drm/vc4: plane: Prevent async update if we don't have a dlist Maxime Ripard
@ 2022-05-03 12:13 ` Maxime Ripard
  2022-05-03 12:13 ` [PATCH 03/14] drm/vc4: bo: Rename vc4_dumb_create Maxime Ripard
                   ` (12 subsequent siblings)
  14 siblings, 0 replies; 25+ messages in thread
From: Maxime Ripard @ 2022-05-03 12:13 UTC (permalink / raw)
  To: Daniel Vetter, David Airlie, Maarten Lankhorst,
	Thomas Zimmermann, Maxime Ripard
  Cc: Melissa Wen, dri-devel

A new generation of controller has been introduced with the
BCM2711/RaspberryPi4. This generation needs a bunch of quirks, and over
time we've piled on a number of checks in most parts of the drivers.

All these checks are performed several times, and are not always
consistent. Let's create a single, global, variable to hold it and use
it everywhere.

Signed-off-by: Maxime Ripard <maxime@cerno.tech>
---
 drivers/gpu/drm/vc4/vc4_crtc.c  |  6 +++---
 drivers/gpu/drm/vc4/vc4_drv.c   |  4 ++++
 drivers/gpu/drm/vc4/vc4_drv.h   |  6 +++---
 drivers/gpu/drm/vc4/vc4_hvs.c   | 18 +++++++++---------
 drivers/gpu/drm/vc4/vc4_kms.c   | 12 +++++-------
 drivers/gpu/drm/vc4/vc4_plane.c | 13 ++++++-------
 6 files changed, 30 insertions(+), 29 deletions(-)

diff --git a/drivers/gpu/drm/vc4/vc4_crtc.c b/drivers/gpu/drm/vc4/vc4_crtc.c
index 59b20c8f132b..dd5fb25d0f43 100644
--- a/drivers/gpu/drm/vc4/vc4_crtc.c
+++ b/drivers/gpu/drm/vc4/vc4_crtc.c
@@ -256,7 +256,7 @@ static u32 vc4_get_fifo_full_level(struct vc4_crtc *vc4_crtc, u32 format)
 		 * Removing 1 from the FIFO full level however
 		 * seems to completely remove that issue.
 		 */
-		if (!vc4->hvs->hvs5)
+		if (!vc4->is_vc5)
 			return fifo_len_bytes - 3 * HVS_FIFO_LATENCY_PIX - 1;
 
 		return fifo_len_bytes - 3 * HVS_FIFO_LATENCY_PIX;
@@ -389,7 +389,7 @@ static void vc4_crtc_config_pv(struct drm_crtc *crtc, struct drm_encoder *encode
 	if (is_dsi)
 		CRTC_WRITE(PV_HACT_ACT, mode->hdisplay * pixel_rep);
 
-	if (vc4->hvs->hvs5)
+	if (vc4->is_vc5)
 		CRTC_WRITE(PV_MUX_CFG,
 			   VC4_SET_FIELD(PV_MUX_CFG_RGB_PIXEL_MUX_MODE_NO_SWAP,
 					 PV_MUX_CFG_RGB_PIXEL_MUX_MODE));
@@ -1149,7 +1149,7 @@ int vc4_crtc_init(struct drm_device *drm, struct vc4_crtc *vc4_crtc,
 				  crtc_funcs, NULL);
 	drm_crtc_helper_add(crtc, crtc_helper_funcs);
 
-	if (!vc4->hvs->hvs5) {
+	if (!vc4->is_vc5) {
 		drm_mode_crtc_set_gamma_size(crtc, ARRAY_SIZE(vc4_crtc->lut_r));
 
 		drm_crtc_enable_color_mgmt(crtc, 0, false, crtc->gamma_size);
diff --git a/drivers/gpu/drm/vc4/vc4_drv.c b/drivers/gpu/drm/vc4/vc4_drv.c
index 162bc18e7497..53067525b586 100644
--- a/drivers/gpu/drm/vc4/vc4_drv.c
+++ b/drivers/gpu/drm/vc4/vc4_drv.c
@@ -217,10 +217,13 @@ static int vc4_drm_bind(struct device *dev)
 	struct vc4_dev *vc4;
 	struct device_node *node;
 	struct drm_crtc *crtc;
+	bool is_vc5;
 	int ret = 0;
 
 	dev->coherent_dma_mask = DMA_BIT_MASK(32);
 
+	is_vc5 = of_device_is_compatible(dev->of_node, "brcm,bcm2711-vc5");
+
 	/* If VC4 V3D is missing, don't advertise render nodes. */
 	node = of_find_matching_node_and_match(NULL, vc4_v3d_dt_match, NULL);
 	if (!node || !of_device_is_available(node))
@@ -230,6 +233,7 @@ static int vc4_drm_bind(struct device *dev)
 	vc4 = devm_drm_dev_alloc(dev, &vc4_drm_driver, struct vc4_dev, base);
 	if (IS_ERR(vc4))
 		return PTR_ERR(vc4);
+	vc4->is_vc5 = is_vc5;
 
 	drm = &vc4->base;
 	platform_set_drvdata(pdev, drm);
diff --git a/drivers/gpu/drm/vc4/vc4_drv.h b/drivers/gpu/drm/vc4/vc4_drv.h
index 15e0c2ac3940..82453a3bcffe 100644
--- a/drivers/gpu/drm/vc4/vc4_drv.h
+++ b/drivers/gpu/drm/vc4/vc4_drv.h
@@ -74,6 +74,8 @@ struct vc4_perfmon {
 struct vc4_dev {
 	struct drm_device base;
 
+	bool is_vc5;
+
 	unsigned int irq;
 
 	struct vc4_hvs *hvs;
@@ -316,6 +318,7 @@ struct vc4_v3d {
 };
 
 struct vc4_hvs {
+	struct vc4_dev *vc4;
 	struct platform_device *pdev;
 	void __iomem *regs;
 	u32 __iomem *dlist;
@@ -333,9 +336,6 @@ struct vc4_hvs {
 	struct drm_mm_node mitchell_netravali_filter;
 
 	struct debugfs_regset32 regset;
-
-	/* HVS version 5 flag, therefore requires updated dlist structures */
-	bool hvs5;
 };
 
 struct vc4_plane {
diff --git a/drivers/gpu/drm/vc4/vc4_hvs.c b/drivers/gpu/drm/vc4/vc4_hvs.c
index 2a58fc421cf6..ba2c8e5a9b64 100644
--- a/drivers/gpu/drm/vc4/vc4_hvs.c
+++ b/drivers/gpu/drm/vc4/vc4_hvs.c
@@ -220,10 +220,11 @@ u8 vc4_hvs_get_fifo_frame_count(struct vc4_hvs *hvs, unsigned int fifo)
 
 int vc4_hvs_get_fifo_from_output(struct vc4_hvs *hvs, unsigned int output)
 {
+	struct vc4_dev *vc4 = hvs->vc4;
 	u32 reg;
 	int ret;
 
-	if (!hvs->hvs5)
+	if (!vc4->is_vc5)
 		return output;
 
 	switch (output) {
@@ -273,6 +274,7 @@ int vc4_hvs_get_fifo_from_output(struct vc4_hvs *hvs, unsigned int output)
 static int vc4_hvs_init_channel(struct vc4_hvs *hvs, struct drm_crtc *crtc,
 				struct drm_display_mode *mode, bool oneshot)
 {
+	struct vc4_dev *vc4 = hvs->vc4;
 	struct vc4_crtc *vc4_crtc = to_vc4_crtc(crtc);
 	struct vc4_crtc_state *vc4_crtc_state = to_vc4_crtc_state(crtc->state);
 	unsigned int chan = vc4_crtc_state->assigned_channel;
@@ -291,7 +293,7 @@ static int vc4_hvs_init_channel(struct vc4_hvs *hvs, struct drm_crtc *crtc,
 	 */
 	dispctrl = SCALER_DISPCTRLX_ENABLE;
 
-	if (!hvs->hvs5)
+	if (!vc4->is_vc5)
 		dispctrl |= VC4_SET_FIELD(mode->hdisplay,
 					  SCALER_DISPCTRLX_WIDTH) |
 			    VC4_SET_FIELD(mode->vdisplay,
@@ -312,7 +314,7 @@ static int vc4_hvs_init_channel(struct vc4_hvs *hvs, struct drm_crtc *crtc,
 
 	HVS_WRITE(SCALER_DISPBKGNDX(chan), dispbkgndx |
 		  SCALER_DISPBKGND_AUTOHS |
-		  ((!hvs->hvs5) ? SCALER_DISPBKGND_GAMMA : 0) |
+		  ((!vc4->is_vc5) ? SCALER_DISPBKGND_GAMMA : 0) |
 		  (interlace ? SCALER_DISPBKGND_INTERLACE : 0));
 
 	/* Reload the LUT, since the SRAMs would have been disabled if
@@ -617,11 +619,9 @@ static int vc4_hvs_bind(struct device *dev, struct device *master, void *data)
 	if (!hvs)
 		return -ENOMEM;
 
+	hvs->vc4 = vc4;
 	hvs->pdev = pdev;
 
-	if (of_device_is_compatible(pdev->dev.of_node, "brcm,bcm2711-hvs"))
-		hvs->hvs5 = true;
-
 	hvs->regs = vc4_ioremap_regs(pdev, 0);
 	if (IS_ERR(hvs->regs))
 		return PTR_ERR(hvs->regs);
@@ -630,7 +630,7 @@ static int vc4_hvs_bind(struct device *dev, struct device *master, void *data)
 	hvs->regset.regs = hvs_regs;
 	hvs->regset.nregs = ARRAY_SIZE(hvs_regs);
 
-	if (hvs->hvs5) {
+	if (vc4->is_vc5) {
 		hvs->core_clk = devm_clk_get(&pdev->dev, NULL);
 		if (IS_ERR(hvs->core_clk)) {
 			dev_err(&pdev->dev, "Couldn't get core clock\n");
@@ -644,7 +644,7 @@ static int vc4_hvs_bind(struct device *dev, struct device *master, void *data)
 		}
 	}
 
-	if (!hvs->hvs5)
+	if (!vc4->is_vc5)
 		hvs->dlist = hvs->regs + SCALER_DLIST_START;
 	else
 		hvs->dlist = hvs->regs + SCALER5_DLIST_START;
@@ -665,7 +665,7 @@ static int vc4_hvs_bind(struct device *dev, struct device *master, void *data)
 	 * between planes when they don't overlap on the screen, but
 	 * for now we just allocate globally.
 	 */
-	if (!hvs->hvs5)
+	if (!vc4->is_vc5)
 		/* 48k words of 2x12-bit pixels */
 		drm_mm_init(&hvs->lbm_mm, 0, 48 * 1024);
 	else
diff --git a/drivers/gpu/drm/vc4/vc4_kms.c b/drivers/gpu/drm/vc4/vc4_kms.c
index c169bd72e53b..3c232d85ab85 100644
--- a/drivers/gpu/drm/vc4/vc4_kms.c
+++ b/drivers/gpu/drm/vc4/vc4_kms.c
@@ -393,7 +393,7 @@ static void vc4_atomic_commit_tail(struct drm_atomic_state *state)
 		old_hvs_state->fifo_state[channel].pending_commit = NULL;
 	}
 
-	if (vc4->hvs->hvs5) {
+	if (vc4->is_vc5) {
 		unsigned long state_rate = max(old_hvs_state->core_clock_rate,
 					       new_hvs_state->core_clock_rate);
 		unsigned long core_rate = max_t(unsigned long,
@@ -412,7 +412,7 @@ static void vc4_atomic_commit_tail(struct drm_atomic_state *state)
 
 	vc4_ctm_commit(vc4, state);
 
-	if (vc4->hvs->hvs5)
+	if (vc4->is_vc5)
 		vc5_hvs_pv_muxing_commit(vc4, state);
 	else
 		vc4_hvs_pv_muxing_commit(vc4, state);
@@ -430,7 +430,7 @@ static void vc4_atomic_commit_tail(struct drm_atomic_state *state)
 
 	drm_atomic_helper_cleanup_planes(dev, state);
 
-	if (vc4->hvs->hvs5) {
+	if (vc4->is_vc5) {
 		drm_dbg(dev, "Running the core clock at %lu Hz\n",
 			new_hvs_state->core_clock_rate);
 
@@ -1000,8 +1000,6 @@ static const struct drm_mode_config_funcs vc4_mode_funcs = {
 int vc4_kms_load(struct drm_device *dev)
 {
 	struct vc4_dev *vc4 = to_vc4_dev(dev);
-	bool is_vc5 = of_device_is_compatible(dev->dev->of_node,
-					      "brcm,bcm2711-vc5");
 	int ret;
 
 	/*
@@ -1009,7 +1007,7 @@ int vc4_kms_load(struct drm_device *dev)
 	 * the BCM2711, but the load tracker computations are used for
 	 * the core clock rate calculation.
 	 */
-	if (!is_vc5) {
+	if (!vc4->is_vc5) {
 		/* Start with the load tracker enabled. Can be
 		 * disabled through the debugfs load_tracker file.
 		 */
@@ -1025,7 +1023,7 @@ int vc4_kms_load(struct drm_device *dev)
 		return ret;
 	}
 
-	if (is_vc5) {
+	if (vc4->is_vc5) {
 		dev->mode_config.max_width = 7680;
 		dev->mode_config.max_height = 7680;
 	} else {
diff --git a/drivers/gpu/drm/vc4/vc4_plane.c b/drivers/gpu/drm/vc4/vc4_plane.c
index 811a2d004cc4..ba7359516d75 100644
--- a/drivers/gpu/drm/vc4/vc4_plane.c
+++ b/drivers/gpu/drm/vc4/vc4_plane.c
@@ -489,10 +489,10 @@ static u32 vc4_lbm_size(struct drm_plane_state *state)
 	}
 
 	/* Align it to 64 or 128 (hvs5) bytes */
-	lbm = roundup(lbm, vc4->hvs->hvs5 ? 128 : 64);
+	lbm = roundup(lbm, vc4->is_vc5 ? 128 : 64);
 
 	/* Each "word" of the LBM memory contains 2 or 4 (hvs5) pixels */
-	lbm /= vc4->hvs->hvs5 ? 4 : 2;
+	lbm /= vc4->is_vc5 ? 4 : 2;
 
 	return lbm;
 }
@@ -608,7 +608,7 @@ static int vc4_plane_allocate_lbm(struct drm_plane_state *state)
 		ret = drm_mm_insert_node_generic(&vc4->hvs->lbm_mm,
 						 &vc4_state->lbm,
 						 lbm_size,
-						 vc4->hvs->hvs5 ? 64 : 32,
+						 vc4->is_vc5 ? 64 : 32,
 						 0, 0);
 		spin_unlock_irqrestore(&vc4->hvs->mm_lock, irqflags);
 
@@ -917,7 +917,7 @@ static int vc4_plane_mode_set(struct drm_plane *plane,
 	mix_plane_alpha = state->alpha != DRM_BLEND_ALPHA_OPAQUE &&
 			  fb->format->has_alpha;
 
-	if (!vc4->hvs->hvs5) {
+	if (!vc4->is_vc5) {
 	/* Control word */
 		vc4_dlist_write(vc4_state,
 				SCALER_CTL0_VALID |
@@ -1457,14 +1457,13 @@ static const struct drm_plane_funcs vc4_plane_funcs = {
 struct drm_plane *vc4_plane_init(struct drm_device *dev,
 				 enum drm_plane_type type)
 {
+	struct vc4_dev *vc4 = to_vc4_dev(dev);
 	struct drm_plane *plane = NULL;
 	struct vc4_plane *vc4_plane;
 	u32 formats[ARRAY_SIZE(hvs_formats)];
 	int num_formats = 0;
 	int ret = 0;
 	unsigned i;
-	bool hvs5 = of_device_is_compatible(dev->dev->of_node,
-					    "brcm,bcm2711-vc5");
 	static const uint64_t modifiers[] = {
 		DRM_FORMAT_MOD_BROADCOM_VC4_T_TILED,
 		DRM_FORMAT_MOD_BROADCOM_SAND128,
@@ -1480,7 +1479,7 @@ struct drm_plane *vc4_plane_init(struct drm_device *dev,
 		return ERR_PTR(-ENOMEM);
 
 	for (i = 0; i < ARRAY_SIZE(hvs_formats); i++) {
-		if (!hvs_formats[i].hvs5_only || hvs5) {
+		if (!hvs_formats[i].hvs5_only || vc4->is_vc5) {
 			formats[num_formats] = hvs_formats[i].drm;
 			num_formats++;
 		}
-- 
2.35.1


^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [PATCH 03/14] drm/vc4: bo: Rename vc4_dumb_create
  2022-05-03 12:13 [PATCH 00/14] drm/vc4: Properly separate v3d on BCM2711, and fix frame ordering Maxime Ripard
  2022-05-03 12:13 ` [PATCH 01/14] drm/vc4: plane: Prevent async update if we don't have a dlist Maxime Ripard
  2022-05-03 12:13 ` [PATCH 02/14] drm/vc4: Consolidate Hardware Revision Check Maxime Ripard
@ 2022-05-03 12:13 ` Maxime Ripard
  2022-05-03 12:13 ` [PATCH 04/14] drm/vc4: bo: Split out Dumb buffers fixup Maxime Ripard
                   ` (11 subsequent siblings)
  14 siblings, 0 replies; 25+ messages in thread
From: Maxime Ripard @ 2022-05-03 12:13 UTC (permalink / raw)
  To: Daniel Vetter, David Airlie, Maarten Lankhorst,
	Thomas Zimmermann, Maxime Ripard
  Cc: Melissa Wen, dri-devel

We're going to add a new variant of the dumb BO allocation function, so
let's rename vc4_dumb_create() to something a bit more specific.

Signed-off-by: Maxime Ripard <maxime@cerno.tech>
---
 drivers/gpu/drm/vc4/vc4_bo.c  | 6 +++---
 drivers/gpu/drm/vc4/vc4_drv.c | 2 +-
 drivers/gpu/drm/vc4/vc4_drv.h | 6 +++---
 3 files changed, 7 insertions(+), 7 deletions(-)

diff --git a/drivers/gpu/drm/vc4/vc4_bo.c b/drivers/gpu/drm/vc4/vc4_bo.c
index 49c0f2ac868b..6d505da6b6cf 100644
--- a/drivers/gpu/drm/vc4/vc4_bo.c
+++ b/drivers/gpu/drm/vc4/vc4_bo.c
@@ -471,9 +471,9 @@ struct vc4_bo *vc4_bo_create(struct drm_device *dev, size_t unaligned_size,
 	return bo;
 }
 
-int vc4_dumb_create(struct drm_file *file_priv,
-		    struct drm_device *dev,
-		    struct drm_mode_create_dumb *args)
+int vc4_bo_dumb_create(struct drm_file *file_priv,
+		       struct drm_device *dev,
+		       struct drm_mode_create_dumb *args)
 {
 	int min_pitch = DIV_ROUND_UP(args->width * args->bpp, 8);
 	struct vc4_bo *bo = NULL;
diff --git a/drivers/gpu/drm/vc4/vc4_drv.c b/drivers/gpu/drm/vc4/vc4_drv.c
index 53067525b586..5f39e40ef238 100644
--- a/drivers/gpu/drm/vc4/vc4_drv.c
+++ b/drivers/gpu/drm/vc4/vc4_drv.c
@@ -175,7 +175,7 @@ static struct drm_driver vc4_drm_driver = {
 
 	.gem_create_object = vc4_create_object,
 
-	DRM_GEM_CMA_DRIVER_OPS_WITH_DUMB_CREATE(vc4_dumb_create),
+	DRM_GEM_CMA_DRIVER_OPS_WITH_DUMB_CREATE(vc4_bo_dumb_create),
 
 	.ioctls = vc4_drm_ioctls,
 	.num_ioctls = ARRAY_SIZE(vc4_drm_ioctls),
diff --git a/drivers/gpu/drm/vc4/vc4_drv.h b/drivers/gpu/drm/vc4/vc4_drv.h
index 82453a3bcffe..37c93654480f 100644
--- a/drivers/gpu/drm/vc4/vc4_drv.h
+++ b/drivers/gpu/drm/vc4/vc4_drv.h
@@ -814,9 +814,9 @@ struct vc4_validated_shader_info {
 struct drm_gem_object *vc4_create_object(struct drm_device *dev, size_t size);
 struct vc4_bo *vc4_bo_create(struct drm_device *dev, size_t size,
 			     bool from_cache, enum vc4_kernel_bo_type type);
-int vc4_dumb_create(struct drm_file *file_priv,
-		    struct drm_device *dev,
-		    struct drm_mode_create_dumb *args);
+int vc4_bo_dumb_create(struct drm_file *file_priv,
+		       struct drm_device *dev,
+		       struct drm_mode_create_dumb *args);
 int vc4_create_bo_ioctl(struct drm_device *dev, void *data,
 			struct drm_file *file_priv);
 int vc4_create_shader_bo_ioctl(struct drm_device *dev, void *data,
-- 
2.35.1


^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [PATCH 04/14] drm/vc4: bo: Split out Dumb buffers fixup
  2022-05-03 12:13 [PATCH 00/14] drm/vc4: Properly separate v3d on BCM2711, and fix frame ordering Maxime Ripard
                   ` (2 preceding siblings ...)
  2022-05-03 12:13 ` [PATCH 03/14] drm/vc4: bo: Rename vc4_dumb_create Maxime Ripard
@ 2022-05-03 12:13 ` Maxime Ripard
  2022-05-03 12:13 ` [PATCH 05/14] drm/vc4: drv: Register a different driver on BCM2711 Maxime Ripard
                   ` (10 subsequent siblings)
  14 siblings, 0 replies; 25+ messages in thread
From: Maxime Ripard @ 2022-05-03 12:13 UTC (permalink / raw)
  To: Daniel Vetter, David Airlie, Maarten Lankhorst,
	Thomas Zimmermann, Maxime Ripard
  Cc: Melissa Wen, dri-devel

The vc4_bo_dumb_create() both fixes up the allocation arguments to match
the hardware constraints and actually performs the allocation.

Since we're going to introduce a new function that uses a different
allocator, let's split the arguments fixup to a separate function we
will be able to reuse.

Signed-off-by: Maxime Ripard <maxime@cerno.tech>
---
 drivers/gpu/drm/vc4/vc4_bo.c  |  9 +++------
 drivers/gpu/drm/vc4/vc4_drv.c | 13 +++++++++++++
 drivers/gpu/drm/vc4/vc4_drv.h |  1 +
 3 files changed, 17 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/vc4/vc4_bo.c b/drivers/gpu/drm/vc4/vc4_bo.c
index 6d505da6b6cf..3ca16d682fc0 100644
--- a/drivers/gpu/drm/vc4/vc4_bo.c
+++ b/drivers/gpu/drm/vc4/vc4_bo.c
@@ -475,15 +475,12 @@ int vc4_bo_dumb_create(struct drm_file *file_priv,
 		       struct drm_device *dev,
 		       struct drm_mode_create_dumb *args)
 {
-	int min_pitch = DIV_ROUND_UP(args->width * args->bpp, 8);
 	struct vc4_bo *bo = NULL;
 	int ret;
 
-	if (args->pitch < min_pitch)
-		args->pitch = min_pitch;
-
-	if (args->size < args->pitch * args->height)
-		args->size = args->pitch * args->height;
+	ret = vc4_dumb_fixup_args(args);
+	if (ret)
+		return ret;
 
 	bo = vc4_bo_create(dev, args->size, false, VC4_BO_TYPE_DUMB);
 	if (IS_ERR(bo))
diff --git a/drivers/gpu/drm/vc4/vc4_drv.c b/drivers/gpu/drm/vc4/vc4_drv.c
index 5f39e40ef238..eb08940028d3 100644
--- a/drivers/gpu/drm/vc4/vc4_drv.c
+++ b/drivers/gpu/drm/vc4/vc4_drv.c
@@ -63,6 +63,19 @@ void __iomem *vc4_ioremap_regs(struct platform_device *pdev, int index)
 	return map;
 }
 
+int vc4_dumb_fixup_args(struct drm_mode_create_dumb *args)
+{
+	int min_pitch = DIV_ROUND_UP(args->width * args->bpp, 8);
+
+	if (args->pitch < min_pitch)
+		args->pitch = min_pitch;
+
+	if (args->size < args->pitch * args->height)
+		args->size = args->pitch * args->height;
+
+	return 0;
+}
+
 static int vc4_get_param_ioctl(struct drm_device *dev, void *data,
 			       struct drm_file *file_priv)
 {
diff --git a/drivers/gpu/drm/vc4/vc4_drv.h b/drivers/gpu/drm/vc4/vc4_drv.h
index 37c93654480f..9c324c12c410 100644
--- a/drivers/gpu/drm/vc4/vc4_drv.h
+++ b/drivers/gpu/drm/vc4/vc4_drv.h
@@ -885,6 +885,7 @@ static inline void vc4_debugfs_add_regset32(struct drm_device *drm,
 
 /* vc4_drv.c */
 void __iomem *vc4_ioremap_regs(struct platform_device *dev, int index);
+int vc4_dumb_fixup_args(struct drm_mode_create_dumb *args);
 
 /* vc4_dpi.c */
 extern struct platform_driver vc4_dpi_driver;
-- 
2.35.1


^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [PATCH 05/14] drm/vc4: drv: Register a different driver on BCM2711
  2022-05-03 12:13 [PATCH 00/14] drm/vc4: Properly separate v3d on BCM2711, and fix frame ordering Maxime Ripard
                   ` (3 preceding siblings ...)
  2022-05-03 12:13 ` [PATCH 04/14] drm/vc4: bo: Split out Dumb buffers fixup Maxime Ripard
@ 2022-05-03 12:13 ` Maxime Ripard
  2022-05-09 16:37   ` Melissa Wen
  2022-05-03 12:13 ` [PATCH 06/14] drm/vc4: kms: Register a different drm_mode_config_funcs " Maxime Ripard
                   ` (9 subsequent siblings)
  14 siblings, 1 reply; 25+ messages in thread
From: Maxime Ripard @ 2022-05-03 12:13 UTC (permalink / raw)
  To: Daniel Vetter, David Airlie, Maarten Lankhorst,
	Thomas Zimmermann, Maxime Ripard
  Cc: Melissa Wen, dri-devel

Prior to the BCM2711/RaspberryPi4, the GPU was a part of the display
components of the SoC. It was thus a part of the vc4 driver.

However, with the BCM2711, it got split out and thus the v3d driver was
created. The vc4 driver now only handles the display part.

We didn't properly split out the code when doing the BCM2711 support
though, and most of the code around buffer allocations is still
involved, even though it doesn't have the backing hardware anymore.

Let's start the split out by creating a new drm_driver that only reports
and uses what we support on the BCM2711. The ioctl were properly
filtered already, but we were still exposing a .gem_create_object hook,
as well as having an .open and .postclose hooks which are only relevant
on older generations.

Signed-off-by: Maxime Ripard <maxime@cerno.tech>
---
 drivers/gpu/drm/vc4/vc4_drv.c | 51 ++++++++++++++++++++++++++++-------
 1 file changed, 42 insertions(+), 9 deletions(-)

diff --git a/drivers/gpu/drm/vc4/vc4_drv.c b/drivers/gpu/drm/vc4/vc4_drv.c
index eb08940028d3..4f9e2067dad0 100644
--- a/drivers/gpu/drm/vc4/vc4_drv.c
+++ b/drivers/gpu/drm/vc4/vc4_drv.c
@@ -76,6 +76,19 @@ int vc4_dumb_fixup_args(struct drm_mode_create_dumb *args)
 	return 0;
 }
 
+static int vc4_dumb_create(struct drm_file *file_priv,
+			   struct drm_device *dev,
+			   struct drm_mode_create_dumb *args)
+{
+	int ret;
+
+	ret = vc4_dumb_fixup_args(args);
+	if (ret)
+		return ret;
+
+	return drm_gem_cma_dumb_create_internal(file_priv, dev, args);
+}
+
 static int vc4_get_param_ioctl(struct drm_device *dev, void *data,
 			       struct drm_file *file_priv)
 {
@@ -173,7 +186,7 @@ static const struct drm_ioctl_desc vc4_drm_ioctls[] = {
 	DRM_IOCTL_DEF_DRV(VC4_PERFMON_GET_VALUES, vc4_perfmon_get_values_ioctl, DRM_RENDER_ALLOW),
 };
 
-static struct drm_driver vc4_drm_driver = {
+static const struct drm_driver vc4_drm_driver = {
 	.driver_features = (DRIVER_MODESET |
 			    DRIVER_ATOMIC |
 			    DRIVER_GEM |
@@ -202,6 +215,27 @@ static struct drm_driver vc4_drm_driver = {
 	.patchlevel = DRIVER_PATCHLEVEL,
 };
 
+static const struct drm_driver vc5_drm_driver = {
+	.driver_features = (DRIVER_MODESET |
+			    DRIVER_ATOMIC |
+			    DRIVER_GEM),
+
+#if defined(CONFIG_DEBUG_FS)
+	.debugfs_init = vc4_debugfs_init,
+#endif
+
+	DRM_GEM_CMA_DRIVER_OPS_WITH_DUMB_CREATE(vc4_dumb_create),
+
+	.fops = &vc4_drm_fops,
+
+	.name = DRIVER_NAME,
+	.desc = DRIVER_DESC,
+	.date = DRIVER_DATE,
+	.major = DRIVER_MAJOR,
+	.minor = DRIVER_MINOR,
+	.patchlevel = DRIVER_PATCHLEVEL,
+};
+
 static void vc4_match_add_drivers(struct device *dev,
 				  struct component_match **match,
 				  struct platform_driver *const *drivers,
@@ -225,6 +259,7 @@ static void vc4_match_add_drivers(struct device *dev,
 static int vc4_drm_bind(struct device *dev)
 {
 	struct platform_device *pdev = to_platform_device(dev);
+	const struct drm_driver *driver;
 	struct rpi_firmware *firmware = NULL;
 	struct drm_device *drm;
 	struct vc4_dev *vc4;
@@ -236,14 +271,12 @@ static int vc4_drm_bind(struct device *dev)
 	dev->coherent_dma_mask = DMA_BIT_MASK(32);
 
 	is_vc5 = of_device_is_compatible(dev->of_node, "brcm,bcm2711-vc5");
+	if (is_vc5)
+		driver = &vc5_drm_driver;
+	else
+		driver = &vc4_drm_driver;
 
-	/* If VC4 V3D is missing, don't advertise render nodes. */
-	node = of_find_matching_node_and_match(NULL, vc4_v3d_dt_match, NULL);
-	if (!node || !of_device_is_available(node))
-		vc4_drm_driver.driver_features &= ~DRIVER_RENDER;
-	of_node_put(node);
-
-	vc4 = devm_drm_dev_alloc(dev, &vc4_drm_driver, struct vc4_dev, base);
+	vc4 = devm_drm_dev_alloc(dev, driver, struct vc4_dev, base);
 	if (IS_ERR(vc4))
 		return PTR_ERR(vc4);
 	vc4->is_vc5 = is_vc5;
@@ -275,7 +308,7 @@ static int vc4_drm_bind(struct device *dev)
 			return -EPROBE_DEFER;
 	}
 
-	ret = drm_aperture_remove_framebuffers(false, &vc4_drm_driver);
+	ret = drm_aperture_remove_framebuffers(false, driver);
 	if (ret)
 		return ret;
 
-- 
2.35.1


^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [PATCH 06/14] drm/vc4: kms: Register a different drm_mode_config_funcs on BCM2711
  2022-05-03 12:13 [PATCH 00/14] drm/vc4: Properly separate v3d on BCM2711, and fix frame ordering Maxime Ripard
                   ` (4 preceding siblings ...)
  2022-05-03 12:13 ` [PATCH 05/14] drm/vc4: drv: Register a different driver on BCM2711 Maxime Ripard
@ 2022-05-03 12:13 ` Maxime Ripard
  2022-05-03 12:13 ` [PATCH 07/14] drm/vc4: plane: Register a different drm_plane_helper_funcs " Maxime Ripard
                   ` (8 subsequent siblings)
  14 siblings, 0 replies; 25+ messages in thread
From: Maxime Ripard @ 2022-05-03 12:13 UTC (permalink / raw)
  To: Daniel Vetter, David Airlie, Maarten Lankhorst,
	Thomas Zimmermann, Maxime Ripard
  Cc: Melissa Wen, dri-devel

On the BCM2711, our current definition of drm_mode_config_funcs uses the
custom vc4_fb_create().

However, that function relies on the buffer allocation path that was
relying on the GPU, and is no longer relevant.

Let's create another drm_mode_config_funcs structure that we will
register on the BCM2711.

Signed-off-by: Maxime Ripard <maxime@cerno.tech>
---
 drivers/gpu/drm/vc4/vc4_kms.c | 8 +++++++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/vc4/vc4_kms.c b/drivers/gpu/drm/vc4/vc4_kms.c
index 3c232d85ab85..1d3b31fb71ea 100644
--- a/drivers/gpu/drm/vc4/vc4_kms.c
+++ b/drivers/gpu/drm/vc4/vc4_kms.c
@@ -997,6 +997,12 @@ static const struct drm_mode_config_funcs vc4_mode_funcs = {
 	.fb_create = vc4_fb_create,
 };
 
+static const struct drm_mode_config_funcs vc5_mode_funcs = {
+	.atomic_check = vc4_atomic_check,
+	.atomic_commit = drm_atomic_helper_commit,
+	.fb_create = drm_gem_fb_create,
+};
+
 int vc4_kms_load(struct drm_device *dev)
 {
 	struct vc4_dev *vc4 = to_vc4_dev(dev);
@@ -1031,7 +1037,7 @@ int vc4_kms_load(struct drm_device *dev)
 		dev->mode_config.max_height = 2048;
 	}
 
-	dev->mode_config.funcs = &vc4_mode_funcs;
+	dev->mode_config.funcs = vc4->is_vc5 ? &vc5_mode_funcs : &vc4_mode_funcs;
 	dev->mode_config.helper_private = &vc4_mode_config_helpers;
 	dev->mode_config.preferred_depth = 24;
 	dev->mode_config.async_page_flip = true;
-- 
2.35.1


^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [PATCH 07/14] drm/vc4: plane: Register a different drm_plane_helper_funcs on BCM2711
  2022-05-03 12:13 [PATCH 00/14] drm/vc4: Properly separate v3d on BCM2711, and fix frame ordering Maxime Ripard
                   ` (5 preceding siblings ...)
  2022-05-03 12:13 ` [PATCH 06/14] drm/vc4: kms: Register a different drm_mode_config_funcs " Maxime Ripard
@ 2022-05-03 12:13 ` Maxime Ripard
  2022-05-03 12:13 ` [PATCH 08/14] drm/vc4: drv: Skip BO Backend Initialization " Maxime Ripard
                   ` (7 subsequent siblings)
  14 siblings, 0 replies; 25+ messages in thread
From: Maxime Ripard @ 2022-05-03 12:13 UTC (permalink / raw)
  To: Daniel Vetter, David Airlie, Maarten Lankhorst,
	Thomas Zimmermann, Maxime Ripard
  Cc: Melissa Wen, dri-devel

On the BCM2711, our current definition of drm_plane_helper_funcs uses
the custom vc4_prepare_fb() and vc4_cleanup_fb().

Those functions rely on the buffer allocation path that was relying on
the GPU, and is no longer relevant.

Let's create another drm_plane_helper_funcs structure that we will
register on the BCM2711.

Signed-off-by: Maxime Ripard <maxime@cerno.tech>
---
 drivers/gpu/drm/vc4/vc4_plane.c | 12 +++++++++++-
 1 file changed, 11 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/vc4/vc4_plane.c b/drivers/gpu/drm/vc4/vc4_plane.c
index ba7359516d75..1e866dc00ac3 100644
--- a/drivers/gpu/drm/vc4/vc4_plane.c
+++ b/drivers/gpu/drm/vc4/vc4_plane.c
@@ -1389,6 +1389,13 @@ static const struct drm_plane_helper_funcs vc4_plane_helper_funcs = {
 	.atomic_async_update = vc4_plane_atomic_async_update,
 };
 
+static const struct drm_plane_helper_funcs vc5_plane_helper_funcs = {
+	.atomic_check = vc4_plane_atomic_check,
+	.atomic_update = vc4_plane_atomic_update,
+	.atomic_async_check = vc4_plane_atomic_async_check,
+	.atomic_async_update = vc4_plane_atomic_async_update,
+};
+
 static bool vc4_format_mod_supported(struct drm_plane *plane,
 				     uint32_t format,
 				     uint64_t modifier)
@@ -1493,7 +1500,10 @@ struct drm_plane *vc4_plane_init(struct drm_device *dev,
 	if (ret)
 		return ERR_PTR(ret);
 
-	drm_plane_helper_add(plane, &vc4_plane_helper_funcs);
+	if (vc4->is_vc5)
+		drm_plane_helper_add(plane, &vc5_plane_helper_funcs);
+	else
+		drm_plane_helper_add(plane, &vc4_plane_helper_funcs);
 
 	drm_plane_create_alpha_property(plane);
 	drm_plane_create_rotation_property(plane, DRM_MODE_ROTATE_0,
-- 
2.35.1


^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [PATCH 08/14] drm/vc4: drv: Skip BO Backend Initialization on BCM2711
  2022-05-03 12:13 [PATCH 00/14] drm/vc4: Properly separate v3d on BCM2711, and fix frame ordering Maxime Ripard
                   ` (6 preceding siblings ...)
  2022-05-03 12:13 ` [PATCH 07/14] drm/vc4: plane: Register a different drm_plane_helper_funcs " Maxime Ripard
@ 2022-05-03 12:13 ` Maxime Ripard
  2022-05-03 12:13 ` [PATCH 09/14] drm/vc4: crtc: Use an union to store the page flip callback Maxime Ripard
                   ` (6 subsequent siblings)
  14 siblings, 0 replies; 25+ messages in thread
From: Maxime Ripard @ 2022-05-03 12:13 UTC (permalink / raw)
  To: Daniel Vetter, David Airlie, Maarten Lankhorst,
	Thomas Zimmermann, Maxime Ripard
  Cc: Melissa Wen, dri-devel

On the BCM2711, we currently call the vc4_bo_cache_init() and
vc4_gem_init() functions. These functions initialize the BO and GEM
backends.

However, this code was initially created to accomodate the requirements
of the GPU on the older SoCs, while the BCM2711 has a separate driver
for it. So let's just skip these calls when we're on a newer hardware.

Signed-off-by: Maxime Ripard <maxime@cerno.tech>
---
 drivers/gpu/drm/vc4/vc4_drv.c | 18 +++++++++++-------
 1 file changed, 11 insertions(+), 7 deletions(-)

diff --git a/drivers/gpu/drm/vc4/vc4_drv.c b/drivers/gpu/drm/vc4/vc4_drv.c
index 4f9e2067dad0..64c7696aa9e4 100644
--- a/drivers/gpu/drm/vc4/vc4_drv.c
+++ b/drivers/gpu/drm/vc4/vc4_drv.c
@@ -285,19 +285,23 @@ static int vc4_drm_bind(struct device *dev)
 	platform_set_drvdata(pdev, drm);
 	INIT_LIST_HEAD(&vc4->debugfs_list);
 
-	mutex_init(&vc4->bin_bo_lock);
+	if (!is_vc5) {
+		mutex_init(&vc4->bin_bo_lock);
 
-	ret = vc4_bo_cache_init(drm);
-	if (ret)
-		return ret;
+		ret = vc4_bo_cache_init(drm);
+		if (ret)
+			return ret;
+	}
 
 	ret = drmm_mode_config_init(drm);
 	if (ret)
 		return ret;
 
-	ret = vc4_gem_init(drm);
-	if (ret)
-		return ret;
+	if (!is_vc5) {
+		ret = vc4_gem_init(drm);
+		if (ret)
+			return ret;
+	}
 
 	node = of_find_compatible_node(NULL, NULL, "raspberrypi,bcm2835-firmware");
 	if (node) {
-- 
2.35.1


^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [PATCH 09/14] drm/vc4: crtc: Use an union to store the page flip callback
  2022-05-03 12:13 [PATCH 00/14] drm/vc4: Properly separate v3d on BCM2711, and fix frame ordering Maxime Ripard
                   ` (7 preceding siblings ...)
  2022-05-03 12:13 ` [PATCH 08/14] drm/vc4: drv: Skip BO Backend Initialization " Maxime Ripard
@ 2022-05-03 12:13 ` Maxime Ripard
  2022-05-03 12:13 ` [PATCH 10/14] drm/vc4: crtc: Move the BO handling out of common page-flip callback Maxime Ripard
                   ` (5 subsequent siblings)
  14 siblings, 0 replies; 25+ messages in thread
From: Maxime Ripard @ 2022-05-03 12:13 UTC (permalink / raw)
  To: Daniel Vetter, David Airlie, Maarten Lankhorst,
	Thomas Zimmermann, Maxime Ripard
  Cc: Melissa Wen, dri-devel

We'll need to extend the vc4_async_flip_state structure to rely on
another callback implementation, so let's move the current one into a
union.

Signed-off-by: Maxime Ripard <maxime@cerno.tech>
---
 drivers/gpu/drm/vc4/vc4_crtc.c | 20 ++++++++++++++------
 1 file changed, 14 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/vc4/vc4_crtc.c b/drivers/gpu/drm/vc4/vc4_crtc.c
index dd5fb25d0f43..1f247c037ce0 100644
--- a/drivers/gpu/drm/vc4/vc4_crtc.c
+++ b/drivers/gpu/drm/vc4/vc4_crtc.c
@@ -775,17 +775,17 @@ struct vc4_async_flip_state {
 	struct drm_framebuffer *old_fb;
 	struct drm_pending_vblank_event *event;
 
-	struct vc4_seqno_cb cb;
+	union {
+		struct vc4_seqno_cb seqno;
+	} cb;
 };
 
 /* Called when the V3D execution for the BO being flipped to is done, so that
  * we can actually update the plane's address to point to it.
  */
 static void
-vc4_async_page_flip_complete(struct vc4_seqno_cb *cb)
+vc4_async_page_flip_complete(struct vc4_async_flip_state *flip_state)
 {
-	struct vc4_async_flip_state *flip_state =
-		container_of(cb, struct vc4_async_flip_state, cb);
 	struct drm_crtc *crtc = flip_state->crtc;
 	struct drm_device *dev = crtc->dev;
 	struct drm_plane *plane = crtc->primary;
@@ -821,6 +821,14 @@ vc4_async_page_flip_complete(struct vc4_seqno_cb *cb)
 	kfree(flip_state);
 }
 
+static void vc4_async_page_flip_seqno_complete(struct vc4_seqno_cb *cb)
+{
+	struct vc4_async_flip_state *flip_state =
+		container_of(cb, struct vc4_async_flip_state, cb.seqno);
+
+	vc4_async_page_flip_complete(flip_state);
+}
+
 /* Implements async (non-vblank-synced) page flips.
  *
  * The page flip ioctl needs to return immediately, so we grab the
@@ -881,8 +889,8 @@ static int vc4_async_page_flip(struct drm_crtc *crtc,
 	 */
 	drm_atomic_set_fb_for_plane(plane->state, fb);
 
-	vc4_queue_seqno_cb(dev, &flip_state->cb, bo->seqno,
-			   vc4_async_page_flip_complete);
+	vc4_queue_seqno_cb(dev, &flip_state->cb.seqno, bo->seqno,
+			   vc4_async_page_flip_seqno_complete);
 
 	/* Driver takes ownership of state on successful async commit. */
 	return 0;
-- 
2.35.1


^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [PATCH 10/14] drm/vc4: crtc: Move the BO handling out of common page-flip callback
  2022-05-03 12:13 [PATCH 00/14] drm/vc4: Properly separate v3d on BCM2711, and fix frame ordering Maxime Ripard
                   ` (8 preceding siblings ...)
  2022-05-03 12:13 ` [PATCH 09/14] drm/vc4: crtc: Use an union to store the page flip callback Maxime Ripard
@ 2022-05-03 12:13 ` Maxime Ripard
  2022-05-03 12:13 ` [PATCH 11/14] drm/vc4: crtc: Move the BO Handling out of Common Page-Flip Handler Maxime Ripard
                   ` (4 subsequent siblings)
  14 siblings, 0 replies; 25+ messages in thread
From: Maxime Ripard @ 2022-05-03 12:13 UTC (permalink / raw)
  To: Daniel Vetter, David Airlie, Maarten Lankhorst,
	Thomas Zimmermann, Maxime Ripard
  Cc: Melissa Wen, dri-devel

We'll soon introduce another completion callback source that won't need
to use the BO reference counting, so let's move it around to create a
function we will be able to share between both callbacks.

Signed-off-by: Maxime Ripard <maxime@cerno.tech>
---
 drivers/gpu/drm/vc4/vc4_crtc.c | 34 ++++++++++++++++++++--------------
 1 file changed, 20 insertions(+), 14 deletions(-)

diff --git a/drivers/gpu/drm/vc4/vc4_crtc.c b/drivers/gpu/drm/vc4/vc4_crtc.c
index 1f247c037ce0..0410db97b9d1 100644
--- a/drivers/gpu/drm/vc4/vc4_crtc.c
+++ b/drivers/gpu/drm/vc4/vc4_crtc.c
@@ -802,21 +802,8 @@ vc4_async_page_flip_complete(struct vc4_async_flip_state *flip_state)
 	drm_crtc_vblank_put(crtc);
 	drm_framebuffer_put(flip_state->fb);
 
-	/* Decrement the BO usecnt in order to keep the inc/dec calls balanced
-	 * when the planes are updated through the async update path.
-	 * FIXME: we should move to generic async-page-flip when it's
-	 * available, so that we can get rid of this hand-made cleanup_fb()
-	 * logic.
-	 */
-	if (flip_state->old_fb) {
-		struct drm_gem_cma_object *cma_bo;
-		struct vc4_bo *bo;
-
-		cma_bo = drm_fb_cma_get_gem_obj(flip_state->old_fb, 0);
-		bo = to_vc4_bo(&cma_bo->base);
-		vc4_bo_dec_usecnt(bo);
+	if (flip_state->old_fb)
 		drm_framebuffer_put(flip_state->old_fb);
-	}
 
 	kfree(flip_state);
 }
@@ -825,8 +812,27 @@ static void vc4_async_page_flip_seqno_complete(struct vc4_seqno_cb *cb)
 {
 	struct vc4_async_flip_state *flip_state =
 		container_of(cb, struct vc4_async_flip_state, cb.seqno);
+	struct vc4_bo *bo = NULL;
+
+	if (flip_state->old_fb) {
+		struct drm_gem_cma_object *cma_bo =
+			drm_fb_cma_get_gem_obj(flip_state->old_fb, 0);
+		bo = to_vc4_bo(&cma_bo->base);
+	}
 
 	vc4_async_page_flip_complete(flip_state);
+
+	/*
+	 * Decrement the BO usecnt in order to keep the inc/dec
+	 * calls balanced when the planes are updated through
+	 * the async update path.
+	 *
+	 * FIXME: we should move to generic async-page-flip when
+	 * it's available, so that we can get rid of this
+	 * hand-made cleanup_fb() logic.
+	 */
+	if (bo)
+		vc4_bo_dec_usecnt(bo);
 }
 
 /* Implements async (non-vblank-synced) page flips.
-- 
2.35.1


^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [PATCH 11/14] drm/vc4: crtc: Move the BO Handling out of Common Page-Flip Handler
  2022-05-03 12:13 [PATCH 00/14] drm/vc4: Properly separate v3d on BCM2711, and fix frame ordering Maxime Ripard
                   ` (9 preceding siblings ...)
  2022-05-03 12:13 ` [PATCH 10/14] drm/vc4: crtc: Move the BO handling out of common page-flip callback Maxime Ripard
@ 2022-05-03 12:13 ` Maxime Ripard
  2022-05-03 12:13 ` [PATCH 12/14] drm/vc4: crtc: Don't call into BO Handling on Async Page-Flips on BCM2711 Maxime Ripard
                   ` (3 subsequent siblings)
  14 siblings, 0 replies; 25+ messages in thread
From: Maxime Ripard @ 2022-05-03 12:13 UTC (permalink / raw)
  To: Daniel Vetter, David Airlie, Maarten Lankhorst,
	Thomas Zimmermann, Maxime Ripard
  Cc: Melissa Wen, dri-devel

The function vc4_async_page_flip() handles asynchronous page-flips in
the vc4 driver.

However, it mixes some generic code with code that should only be run on
older generations that have the GPU handled by the vc4 driver.

Let's split the generic part out of vc4_async_page_flip() and into a
common function that we be reusable by an handler made for the BCM2711.

Signed-off-by: Maxime Ripard <maxime@cerno.tech>
---
 drivers/gpu/drm/vc4/vc4_crtc.c | 75 ++++++++++++++++++++++------------
 1 file changed, 48 insertions(+), 27 deletions(-)

diff --git a/drivers/gpu/drm/vc4/vc4_crtc.c b/drivers/gpu/drm/vc4/vc4_crtc.c
index 0410db97b9d1..bd85b1f13cee 100644
--- a/drivers/gpu/drm/vc4/vc4_crtc.c
+++ b/drivers/gpu/drm/vc4/vc4_crtc.c
@@ -835,40 +835,19 @@ static void vc4_async_page_flip_seqno_complete(struct vc4_seqno_cb *cb)
 		vc4_bo_dec_usecnt(bo);
 }
 
-/* Implements async (non-vblank-synced) page flips.
- *
- * The page flip ioctl needs to return immediately, so we grab the
- * modeset semaphore on the pipe, and queue the address update for
- * when V3D is done with the BO being flipped to.
- */
-static int vc4_async_page_flip(struct drm_crtc *crtc,
-			       struct drm_framebuffer *fb,
-			       struct drm_pending_vblank_event *event,
-			       uint32_t flags)
+static int
+vc4_async_page_flip_common(struct drm_crtc *crtc,
+			   struct drm_framebuffer *fb,
+			   struct drm_pending_vblank_event *event,
+			   uint32_t flags)
 {
 	struct drm_device *dev = crtc->dev;
 	struct drm_plane *plane = crtc->primary;
-	int ret = 0;
 	struct vc4_async_flip_state *flip_state;
-	struct drm_gem_cma_object *cma_bo = drm_fb_cma_get_gem_obj(fb, 0);
-	struct vc4_bo *bo = to_vc4_bo(&cma_bo->base);
-
-	/* Increment the BO usecnt here, so that we never end up with an
-	 * unbalanced number of vc4_bo_{dec,inc}_usecnt() calls when the
-	 * plane is later updated through the non-async path.
-	 * FIXME: we should move to generic async-page-flip when it's
-	 * available, so that we can get rid of this hand-made prepare_fb()
-	 * logic.
-	 */
-	ret = vc4_bo_inc_usecnt(bo);
-	if (ret)
-		return ret;
 
 	flip_state = kzalloc(sizeof(*flip_state), GFP_KERNEL);
-	if (!flip_state) {
-		vc4_bo_dec_usecnt(bo);
+	if (!flip_state)
 		return -ENOMEM;
-	}
 
 	drm_framebuffer_get(fb);
 	flip_state->fb = fb;
@@ -902,6 +881,48 @@ static int vc4_async_page_flip(struct drm_crtc *crtc,
 	return 0;
 }
 
+/* Implements async (non-vblank-synced) page flips.
+ *
+ * The page flip ioctl needs to return immediately, so we grab the
+ * modeset semaphore on the pipe, and queue the address update for
+ * when V3D is done with the BO being flipped to.
+ */
+static int vc4_async_page_flip(struct drm_crtc *crtc,
+			       struct drm_framebuffer *fb,
+			       struct drm_pending_vblank_event *event,
+			       uint32_t flags)
+{
+	struct drm_device *dev = crtc->dev;
+	struct vc4_dev *vc4 = to_vc4_dev(dev);
+	struct drm_gem_cma_object *cma_bo = drm_fb_cma_get_gem_obj(fb, 0);
+	struct vc4_bo *bo = to_vc4_bo(&cma_bo->base);
+	int ret;
+
+	if (WARN_ON_ONCE(vc4->is_vc5))
+		return -ENODEV;
+
+	/*
+	 * Increment the BO usecnt here, so that we never end up with an
+	 * unbalanced number of vc4_bo_{dec,inc}_usecnt() calls when the
+	 * plane is later updated through the non-async path.
+	 *
+	 * FIXME: we should move to generic async-page-flip when
+	 * it's available, so that we can get rid of this
+	 * hand-made prepare_fb() logic.
+	 */
+	ret = vc4_bo_inc_usecnt(bo);
+	if (ret)
+		return ret;
+
+	ret = vc4_async_page_flip_common(crtc, fb, event, flags);
+	if (ret) {
+		vc4_bo_dec_usecnt(bo);
+		return ret;
+	}
+
+	return 0;
+}
+
 int vc4_page_flip(struct drm_crtc *crtc,
 		  struct drm_framebuffer *fb,
 		  struct drm_pending_vblank_event *event,
-- 
2.35.1


^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [PATCH 12/14] drm/vc4: crtc: Don't call into BO Handling on Async Page-Flips on BCM2711
  2022-05-03 12:13 [PATCH 00/14] drm/vc4: Properly separate v3d on BCM2711, and fix frame ordering Maxime Ripard
                   ` (10 preceding siblings ...)
  2022-05-03 12:13 ` [PATCH 11/14] drm/vc4: crtc: Move the BO Handling out of Common Page-Flip Handler Maxime Ripard
@ 2022-05-03 12:13 ` Maxime Ripard
  2022-05-03 12:13 ` [PATCH 13/14] drm/vc4: crtc: Fix out of order frames during asynchronous page flips Maxime Ripard
                   ` (2 subsequent siblings)
  14 siblings, 0 replies; 25+ messages in thread
From: Maxime Ripard @ 2022-05-03 12:13 UTC (permalink / raw)
  To: Daniel Vetter, David Airlie, Maarten Lankhorst,
	Thomas Zimmermann, Maxime Ripard
  Cc: Melissa Wen, dri-devel

The BCM2711 doesn't have a v3d GPU so we don't want to call into its BO
management code. Let's create an asynchronous page-flip handler for the
BCM2711 that just calls into the common code.

Signed-off-by: Maxime Ripard <maxime@cerno.tech>
---
 drivers/gpu/drm/vc4/vc4_crtc.c | 21 ++++++++++++++++++---
 1 file changed, 18 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/vc4/vc4_crtc.c b/drivers/gpu/drm/vc4/vc4_crtc.c
index bd85b1f13cee..e0ae7bef08fa 100644
--- a/drivers/gpu/drm/vc4/vc4_crtc.c
+++ b/drivers/gpu/drm/vc4/vc4_crtc.c
@@ -923,16 +923,31 @@ static int vc4_async_page_flip(struct drm_crtc *crtc,
 	return 0;
 }
 
+static int vc5_async_page_flip(struct drm_crtc *crtc,
+			       struct drm_framebuffer *fb,
+			       struct drm_pending_vblank_event *event,
+			       uint32_t flags)
+{
+	return vc4_async_page_flip_common(crtc, fb, event, flags);
+}
+
 int vc4_page_flip(struct drm_crtc *crtc,
 		  struct drm_framebuffer *fb,
 		  struct drm_pending_vblank_event *event,
 		  uint32_t flags,
 		  struct drm_modeset_acquire_ctx *ctx)
 {
-	if (flags & DRM_MODE_PAGE_FLIP_ASYNC)
-		return vc4_async_page_flip(crtc, fb, event, flags);
-	else
+	if (flags & DRM_MODE_PAGE_FLIP_ASYNC) {
+		struct drm_device *dev = crtc->dev;
+		struct vc4_dev *vc4 = to_vc4_dev(dev);
+
+		if (vc4->is_vc5)
+			return vc5_async_page_flip(crtc, fb, event, flags);
+		else
+			return vc4_async_page_flip(crtc, fb, event, flags);
+	} else {
 		return drm_atomic_helper_page_flip(crtc, fb, event, flags, ctx);
+	}
 }
 
 struct drm_crtc_state *vc4_crtc_duplicate_state(struct drm_crtc *crtc)
-- 
2.35.1


^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [PATCH 13/14] drm/vc4: crtc: Fix out of order frames during asynchronous page flips
  2022-05-03 12:13 [PATCH 00/14] drm/vc4: Properly separate v3d on BCM2711, and fix frame ordering Maxime Ripard
                   ` (11 preceding siblings ...)
  2022-05-03 12:13 ` [PATCH 12/14] drm/vc4: crtc: Don't call into BO Handling on Async Page-Flips on BCM2711 Maxime Ripard
@ 2022-05-03 12:13 ` Maxime Ripard
  2022-05-03 13:53   ` kernel test robot
  2022-05-09 17:10   ` Melissa Wen
  2022-05-03 12:13 ` [PATCH 14/14] drm/vc4: Warn if some v3d code is run on BCM2711 Maxime Ripard
  2022-05-09 18:26 ` [PATCH 00/14] drm/vc4: Properly separate v3d on BCM2711, and fix frame ordering Melissa Wen
  14 siblings, 2 replies; 25+ messages in thread
From: Maxime Ripard @ 2022-05-03 12:13 UTC (permalink / raw)
  To: Daniel Vetter, David Airlie, Maarten Lankhorst,
	Thomas Zimmermann, Maxime Ripard
  Cc: Melissa Wen, dri-devel

When doing an asynchronous page flip (PAGE_FLIP ioctl with the
DRM_MODE_PAGE_FLIP_ASYNC flag set), the current code waits for the
possible GPU buffer being rendered through a call to
vc4_queue_seqno_cb().

On the BCM2835-37, the GPU driver is part of the vc4 driver and that
function is defined in vc4_gem.c to wait for the buffer to be rendered,
and once it's done, call a callback.

However, on the BCM2711 used on the RaspberryPi4, the GPU driver is
separate (v3d) and that function won't do anything. This was working
because we were going into a path, due to uninitialized variables, that
was always scheduling the callback.

However, we were never actually waiting for the buffer to be rendered
which was resulting in frames being displayed out of order.

The generic API to signal those kind of completion in the kernel are the
DMA fences, and fortunately the v3d drivers supports them and signal
when its job is done. That API also provides an equivalent function that
allows to have a callback being executed when the fence is signalled as
done.

Let's change our driver a bit to rely on the previous function for the
older SoCs, and on DMA fences for the BCM2711.

Signed-off-by: Maxime Ripard <maxime@cerno.tech>
---
 drivers/gpu/drm/vc4/vc4_crtc.c | 41 ++++++++++++++++++++++++++++++++--
 1 file changed, 39 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/vc4/vc4_crtc.c b/drivers/gpu/drm/vc4/vc4_crtc.c
index e0ae7bef08fa..8e1369fca937 100644
--- a/drivers/gpu/drm/vc4/vc4_crtc.c
+++ b/drivers/gpu/drm/vc4/vc4_crtc.c
@@ -776,6 +776,7 @@ struct vc4_async_flip_state {
 	struct drm_pending_vblank_event *event;
 
 	union {
+		struct dma_fence_cb fence;
 		struct vc4_seqno_cb seqno;
 	} cb;
 };
@@ -835,6 +836,43 @@ static void vc4_async_page_flip_seqno_complete(struct vc4_seqno_cb *cb)
 		vc4_bo_dec_usecnt(bo);
 }
 
+static void vc4_async_page_flip_fence_complete(struct dma_fence *fence,
+					       struct dma_fence_cb *cb)
+{
+	struct vc4_async_flip_state *flip_state =
+		container_of(cb, struct vc4_async_flip_state, cb.fence);
+
+	vc4_async_page_flip_complete(flip_state);
+	dma_fence_put(fence);
+}
+
+static int vc4_async_set_fence_cb(struct drm_device *dev,
+				  struct vc4_async_flip_state *flip_state)
+{
+	struct drm_framebuffer *fb = flip_state->fb;
+	struct drm_gem_cma_object *cma_bo = drm_fb_cma_get_gem_obj(fb, 0);
+	struct vc4_dev *vc4 = to_vc4_dev(dev);
+	struct dma_fence *fence;
+	int ret;
+
+	if (!vc4->is_vc5) {
+		struct vc4_bo *bo = to_vc4_bo(&cma_bo->base);
+
+		return vc4_queue_seqno_cb(dev, &flip_state->cb.seqno, bo->seqno,
+					  vc4_async_page_flip_seqno_complete);
+	}
+
+	ret = dma_resv_get_singleton(cma_bo->base.resv, false, &fence);
+	if (ret)
+		return ret;
+
+	if (dma_fence_add_callback(fence, &flip_state->cb.fence,
+				   vc4_async_page_flip_fence_complete))
+		vc4_async_page_flip_fence_complete(fence, &flip_state->cb.fence);
+
+	return 0;
+}
+
 static int
 vc4_async_page_flip_common(struct drm_crtc *crtc,
 			   struct drm_framebuffer *fb,
@@ -874,8 +912,7 @@ vc4_async_page_flip_common(struct drm_crtc *crtc,
 	 */
 	drm_atomic_set_fb_for_plane(plane->state, fb);
 
-	vc4_queue_seqno_cb(dev, &flip_state->cb.seqno, bo->seqno,
-			   vc4_async_page_flip_seqno_complete);
+	vc4_async_set_fence_cb(dev, flip_state);
 
 	/* Driver takes ownership of state on successful async commit. */
 	return 0;
-- 
2.35.1


^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [PATCH 14/14] drm/vc4: Warn if some v3d code is run on BCM2711
  2022-05-03 12:13 [PATCH 00/14] drm/vc4: Properly separate v3d on BCM2711, and fix frame ordering Maxime Ripard
                   ` (12 preceding siblings ...)
  2022-05-03 12:13 ` [PATCH 13/14] drm/vc4: crtc: Fix out of order frames during asynchronous page flips Maxime Ripard
@ 2022-05-03 12:13 ` Maxime Ripard
  2022-05-09 16:52   ` Melissa Wen
  2022-05-09 18:26 ` [PATCH 00/14] drm/vc4: Properly separate v3d on BCM2711, and fix frame ordering Melissa Wen
  14 siblings, 1 reply; 25+ messages in thread
From: Maxime Ripard @ 2022-05-03 12:13 UTC (permalink / raw)
  To: Daniel Vetter, David Airlie, Maarten Lankhorst,
	Thomas Zimmermann, Maxime Ripard
  Cc: Melissa Wen, dri-devel

The BCM2711 has a separate driver for the v3d, and thus we can't call
into any of the driver entrypoints that rely on the v3d being there.

Let's add a bunch of checks and complain loudly if that ever happen.

Signed-off-by: Maxime Ripard <maxime@cerno.tech>
---
 drivers/gpu/drm/vc4/vc4_bo.c               | 49 ++++++++++++++++++++++
 drivers/gpu/drm/vc4/vc4_drv.c              | 11 +++++
 drivers/gpu/drm/vc4/vc4_drv.h              |  6 +++
 drivers/gpu/drm/vc4/vc4_gem.c              | 40 ++++++++++++++++++
 drivers/gpu/drm/vc4/vc4_irq.c              | 16 +++++++
 drivers/gpu/drm/vc4/vc4_kms.c              |  4 ++
 drivers/gpu/drm/vc4/vc4_perfmon.c          | 41 ++++++++++++++++++
 drivers/gpu/drm/vc4/vc4_render_cl.c        |  4 ++
 drivers/gpu/drm/vc4/vc4_v3d.c              | 15 +++++++
 drivers/gpu/drm/vc4/vc4_validate.c         | 16 +++++++
 drivers/gpu/drm/vc4/vc4_validate_shaders.c |  4 ++
 11 files changed, 206 insertions(+)

diff --git a/drivers/gpu/drm/vc4/vc4_bo.c b/drivers/gpu/drm/vc4/vc4_bo.c
index 3ca16d682fc0..b8d856312846 100644
--- a/drivers/gpu/drm/vc4/vc4_bo.c
+++ b/drivers/gpu/drm/vc4/vc4_bo.c
@@ -248,6 +248,9 @@ void vc4_bo_add_to_purgeable_pool(struct vc4_bo *bo)
 {
 	struct vc4_dev *vc4 = to_vc4_dev(bo->base.base.dev);
 
+	if (WARN_ON_ONCE(vc4->is_vc5))
+		return;
+
 	mutex_lock(&vc4->purgeable.lock);
 	list_add_tail(&bo->size_head, &vc4->purgeable.list);
 	vc4->purgeable.num++;
@@ -259,6 +262,9 @@ static void vc4_bo_remove_from_purgeable_pool_locked(struct vc4_bo *bo)
 {
 	struct vc4_dev *vc4 = to_vc4_dev(bo->base.base.dev);
 
+	if (WARN_ON_ONCE(vc4->is_vc5))
+		return;
+
 	/* list_del_init() is used here because the caller might release
 	 * the purgeable lock in order to acquire the madv one and update the
 	 * madv status.
@@ -387,6 +393,9 @@ struct drm_gem_object *vc4_create_object(struct drm_device *dev, size_t size)
 	struct vc4_dev *vc4 = to_vc4_dev(dev);
 	struct vc4_bo *bo;
 
+	if (WARN_ON_ONCE(vc4->is_vc5))
+		return ERR_PTR(-ENODEV);
+
 	bo = kzalloc(sizeof(*bo), GFP_KERNEL);
 	if (!bo)
 		return ERR_PTR(-ENOMEM);
@@ -413,6 +422,9 @@ struct vc4_bo *vc4_bo_create(struct drm_device *dev, size_t unaligned_size,
 	struct drm_gem_cma_object *cma_obj;
 	struct vc4_bo *bo;
 
+	if (WARN_ON_ONCE(vc4->is_vc5))
+		return ERR_PTR(-ENODEV);
+
 	if (size == 0)
 		return ERR_PTR(-EINVAL);
 
@@ -475,9 +487,13 @@ int vc4_bo_dumb_create(struct drm_file *file_priv,
 		       struct drm_device *dev,
 		       struct drm_mode_create_dumb *args)
 {
+	struct vc4_dev *vc4 = to_vc4_dev(dev);
 	struct vc4_bo *bo = NULL;
 	int ret;
 
+	if (WARN_ON_ONCE(vc4->is_vc5))
+		return -ENODEV;
+
 	ret = vc4_dumb_fixup_args(args);
 	if (ret)
 		return ret;
@@ -598,8 +614,12 @@ static void vc4_bo_cache_time_work(struct work_struct *work)
 
 int vc4_bo_inc_usecnt(struct vc4_bo *bo)
 {
+	struct vc4_dev *vc4 = to_vc4_dev(bo->base.base.dev);
 	int ret;
 
+	if (WARN_ON_ONCE(vc4->is_vc5))
+		return -ENODEV;
+
 	/* Fast path: if the BO is already retained by someone, no need to
 	 * check the madv status.
 	 */
@@ -634,6 +654,11 @@ int vc4_bo_inc_usecnt(struct vc4_bo *bo)
 
 void vc4_bo_dec_usecnt(struct vc4_bo *bo)
 {
+	struct vc4_dev *vc4 = to_vc4_dev(bo->base.base.dev);
+
+	if (WARN_ON_ONCE(vc4->is_vc5))
+		return;
+
 	/* Fast path: if the BO is still retained by someone, no need to test
 	 * the madv value.
 	 */
@@ -753,6 +778,9 @@ int vc4_create_bo_ioctl(struct drm_device *dev, void *data,
 	struct vc4_bo *bo = NULL;
 	int ret;
 
+	if (WARN_ON_ONCE(vc4->is_vc5))
+		return -ENODEV;
+
 	ret = vc4_grab_bin_bo(vc4, vc4file);
 	if (ret)
 		return ret;
@@ -776,9 +804,13 @@ int vc4_create_bo_ioctl(struct drm_device *dev, void *data,
 int vc4_mmap_bo_ioctl(struct drm_device *dev, void *data,
 		      struct drm_file *file_priv)
 {
+	struct vc4_dev *vc4 = to_vc4_dev(dev);
 	struct drm_vc4_mmap_bo *args = data;
 	struct drm_gem_object *gem_obj;
 
+	if (WARN_ON_ONCE(vc4->is_vc5))
+		return -ENODEV;
+
 	gem_obj = drm_gem_object_lookup(file_priv, args->handle);
 	if (!gem_obj) {
 		DRM_DEBUG("Failed to look up GEM BO %d\n", args->handle);
@@ -802,6 +834,9 @@ vc4_create_shader_bo_ioctl(struct drm_device *dev, void *data,
 	struct vc4_bo *bo = NULL;
 	int ret;
 
+	if (WARN_ON_ONCE(vc4->is_vc5))
+		return -ENODEV;
+
 	if (args->size == 0)
 		return -EINVAL;
 
@@ -872,11 +907,15 @@ vc4_create_shader_bo_ioctl(struct drm_device *dev, void *data,
 int vc4_set_tiling_ioctl(struct drm_device *dev, void *data,
 			 struct drm_file *file_priv)
 {
+	struct vc4_dev *vc4 = to_vc4_dev(dev);
 	struct drm_vc4_set_tiling *args = data;
 	struct drm_gem_object *gem_obj;
 	struct vc4_bo *bo;
 	bool t_format;
 
+	if (WARN_ON_ONCE(vc4->is_vc5))
+		return -ENODEV;
+
 	if (args->flags != 0)
 		return -EINVAL;
 
@@ -915,10 +954,14 @@ int vc4_set_tiling_ioctl(struct drm_device *dev, void *data,
 int vc4_get_tiling_ioctl(struct drm_device *dev, void *data,
 			 struct drm_file *file_priv)
 {
+	struct vc4_dev *vc4 = to_vc4_dev(dev);
 	struct drm_vc4_get_tiling *args = data;
 	struct drm_gem_object *gem_obj;
 	struct vc4_bo *bo;
 
+	if (WARN_ON_ONCE(vc4->is_vc5))
+		return -ENODEV;
+
 	if (args->flags != 0 || args->modifier != 0)
 		return -EINVAL;
 
@@ -945,6 +988,9 @@ int vc4_bo_cache_init(struct drm_device *dev)
 	struct vc4_dev *vc4 = to_vc4_dev(dev);
 	int i;
 
+	if (WARN_ON_ONCE(vc4->is_vc5))
+		return -ENODEV;
+
 	/* Create the initial set of BO labels that the kernel will
 	 * use.  This lets us avoid a bunch of string reallocation in
 	 * the kernel's draw and BO allocation paths.
@@ -1004,6 +1050,9 @@ int vc4_label_bo_ioctl(struct drm_device *dev, void *data,
 	struct drm_gem_object *gem_obj;
 	int ret = 0, label;
 
+	if (WARN_ON_ONCE(vc4->is_vc5))
+		return -ENODEV;
+
 	if (!args->len)
 		return -EINVAL;
 
diff --git a/drivers/gpu/drm/vc4/vc4_drv.c b/drivers/gpu/drm/vc4/vc4_drv.c
index 64c7696aa9e4..ba017421736c 100644
--- a/drivers/gpu/drm/vc4/vc4_drv.c
+++ b/drivers/gpu/drm/vc4/vc4_drv.c
@@ -99,6 +99,9 @@ static int vc4_get_param_ioctl(struct drm_device *dev, void *data,
 	if (args->pad != 0)
 		return -EINVAL;
 
+	if (WARN_ON_ONCE(vc4->is_vc5))
+		return -ENODEV;
+
 	if (!vc4->v3d)
 		return -ENODEV;
 
@@ -142,11 +145,16 @@ static int vc4_get_param_ioctl(struct drm_device *dev, void *data,
 
 static int vc4_open(struct drm_device *dev, struct drm_file *file)
 {
+	struct vc4_dev *vc4 = to_vc4_dev(dev);
 	struct vc4_file *vc4file;
 
+	if (WARN_ON_ONCE(vc4->is_vc5))
+		return -ENODEV;
+
 	vc4file = kzalloc(sizeof(*vc4file), GFP_KERNEL);
 	if (!vc4file)
 		return -ENOMEM;
+	vc4file->dev = vc4;
 
 	vc4_perfmon_open_file(vc4file);
 	file->driver_priv = vc4file;
@@ -158,6 +166,9 @@ static void vc4_close(struct drm_device *dev, struct drm_file *file)
 	struct vc4_dev *vc4 = to_vc4_dev(dev);
 	struct vc4_file *vc4file = file->driver_priv;
 
+	if (WARN_ON_ONCE(vc4->is_vc5))
+		return;
+
 	if (vc4file->bin_bo_used)
 		vc4_v3d_bin_bo_put(vc4);
 
diff --git a/drivers/gpu/drm/vc4/vc4_drv.h b/drivers/gpu/drm/vc4/vc4_drv.h
index 9c324c12c410..93fd55b9e99e 100644
--- a/drivers/gpu/drm/vc4/vc4_drv.h
+++ b/drivers/gpu/drm/vc4/vc4_drv.h
@@ -48,6 +48,8 @@ enum vc4_kernel_bo_type {
  * done. This way, only events related to a specific job will be counted.
  */
 struct vc4_perfmon {
+	struct vc4_dev *dev;
+
 	/* Tracks the number of users of the perfmon, when this counter reaches
 	 * zero the perfmon is destroyed.
 	 */
@@ -580,6 +582,8 @@ to_vc4_crtc_state(struct drm_crtc_state *crtc_state)
 #define VC4_REG32(reg) { .name = #reg, .offset = reg }
 
 struct vc4_exec_info {
+	struct vc4_dev *dev;
+
 	/* Sequence number for this bin/render job. */
 	uint64_t seqno;
 
@@ -701,6 +705,8 @@ struct vc4_exec_info {
  * released when the DRM file is closed should be placed here.
  */
 struct vc4_file {
+	struct vc4_dev *dev;
+
 	struct {
 		struct idr idr;
 		struct mutex lock;
diff --git a/drivers/gpu/drm/vc4/vc4_gem.c b/drivers/gpu/drm/vc4/vc4_gem.c
index 9eaf304fc20d..fe10d9c3fff8 100644
--- a/drivers/gpu/drm/vc4/vc4_gem.c
+++ b/drivers/gpu/drm/vc4/vc4_gem.c
@@ -76,6 +76,9 @@ vc4_get_hang_state_ioctl(struct drm_device *dev, void *data,
 	u32 i;
 	int ret = 0;
 
+	if (WARN_ON_ONCE(vc4->is_vc5))
+		return -ENODEV;
+
 	if (!vc4->v3d) {
 		DRM_DEBUG("VC4_GET_HANG_STATE with no VC4 V3D probed\n");
 		return -ENODEV;
@@ -386,6 +389,9 @@ vc4_wait_for_seqno(struct drm_device *dev, uint64_t seqno, uint64_t timeout_ns,
 	unsigned long timeout_expire;
 	DEFINE_WAIT(wait);
 
+	if (WARN_ON_ONCE(vc4->is_vc5))
+		return -ENODEV;
+
 	if (vc4->finished_seqno >= seqno)
 		return 0;
 
@@ -468,6 +474,9 @@ vc4_submit_next_bin_job(struct drm_device *dev)
 	struct vc4_dev *vc4 = to_vc4_dev(dev);
 	struct vc4_exec_info *exec;
 
+	if (WARN_ON_ONCE(vc4->is_vc5))
+		return;
+
 again:
 	exec = vc4_first_bin_job(vc4);
 	if (!exec)
@@ -513,6 +522,9 @@ vc4_submit_next_render_job(struct drm_device *dev)
 	if (!exec)
 		return;
 
+	if (WARN_ON_ONCE(vc4->is_vc5))
+		return;
+
 	/* A previous RCL may have written to one of our textures, and
 	 * our full cache flush at bin time may have occurred before
 	 * that RCL completed.  Flush the texture cache now, but not
@@ -531,6 +543,9 @@ vc4_move_job_to_render(struct drm_device *dev, struct vc4_exec_info *exec)
 	struct vc4_dev *vc4 = to_vc4_dev(dev);
 	bool was_empty = list_empty(&vc4->render_job_list);
 
+	if (WARN_ON_ONCE(vc4->is_vc5))
+		return;
+
 	list_move_tail(&exec->head, &vc4->render_job_list);
 	if (was_empty)
 		vc4_submit_next_render_job(dev);
@@ -997,6 +1012,9 @@ vc4_job_handle_completed(struct vc4_dev *vc4)
 	unsigned long irqflags;
 	struct vc4_seqno_cb *cb, *cb_temp;
 
+	if (WARN_ON_ONCE(vc4->is_vc5))
+		return;
+
 	spin_lock_irqsave(&vc4->job_lock, irqflags);
 	while (!list_empty(&vc4->job_done_list)) {
 		struct vc4_exec_info *exec =
@@ -1033,6 +1051,9 @@ int vc4_queue_seqno_cb(struct drm_device *dev,
 	struct vc4_dev *vc4 = to_vc4_dev(dev);
 	unsigned long irqflags;
 
+	if (WARN_ON_ONCE(vc4->is_vc5))
+		return -ENODEV;
+
 	cb->func = func;
 	INIT_WORK(&cb->work, vc4_seqno_cb_work);
 
@@ -1083,8 +1104,12 @@ int
 vc4_wait_seqno_ioctl(struct drm_device *dev, void *data,
 		     struct drm_file *file_priv)
 {
+	struct vc4_dev *vc4 = to_vc4_dev(dev);
 	struct drm_vc4_wait_seqno *args = data;
 
+	if (WARN_ON_ONCE(vc4->is_vc5))
+		return -ENODEV;
+
 	return vc4_wait_for_seqno_ioctl_helper(dev, args->seqno,
 					       &args->timeout_ns);
 }
@@ -1093,11 +1118,15 @@ int
 vc4_wait_bo_ioctl(struct drm_device *dev, void *data,
 		  struct drm_file *file_priv)
 {
+	struct vc4_dev *vc4 = to_vc4_dev(dev);
 	int ret;
 	struct drm_vc4_wait_bo *args = data;
 	struct drm_gem_object *gem_obj;
 	struct vc4_bo *bo;
 
+	if (WARN_ON_ONCE(vc4->is_vc5))
+		return -ENODEV;
+
 	if (args->pad != 0)
 		return -EINVAL;
 
@@ -1144,6 +1173,9 @@ vc4_submit_cl_ioctl(struct drm_device *dev, void *data,
 				  args->shader_rec_size,
 				  args->bo_handle_count);
 
+	if (WARN_ON_ONCE(vc4->is_vc5))
+		return -ENODEV;
+
 	if (!vc4->v3d) {
 		DRM_DEBUG("VC4_SUBMIT_CL with no VC4 V3D probed\n");
 		return -ENODEV;
@@ -1167,6 +1199,7 @@ vc4_submit_cl_ioctl(struct drm_device *dev, void *data,
 		DRM_ERROR("malloc failure on exec struct\n");
 		return -ENOMEM;
 	}
+	exec->dev = vc4;
 
 	ret = vc4_v3d_pm_get(vc4);
 	if (ret) {
@@ -1276,6 +1309,9 @@ int vc4_gem_init(struct drm_device *dev)
 {
 	struct vc4_dev *vc4 = to_vc4_dev(dev);
 
+	if (WARN_ON_ONCE(vc4->is_vc5))
+		return -ENODEV;
+
 	vc4->dma_fence_context = dma_fence_context_alloc(1);
 
 	INIT_LIST_HEAD(&vc4->bin_job_list);
@@ -1321,11 +1357,15 @@ static void vc4_gem_destroy(struct drm_device *dev, void *unused)
 int vc4_gem_madvise_ioctl(struct drm_device *dev, void *data,
 			  struct drm_file *file_priv)
 {
+	struct vc4_dev *vc4 = to_vc4_dev(dev);
 	struct drm_vc4_gem_madvise *args = data;
 	struct drm_gem_object *gem_obj;
 	struct vc4_bo *bo;
 	int ret;
 
+	if (WARN_ON_ONCE(vc4->is_vc5))
+		return -ENODEV;
+
 	switch (args->madv) {
 	case VC4_MADV_DONTNEED:
 	case VC4_MADV_WILLNEED:
diff --git a/drivers/gpu/drm/vc4/vc4_irq.c b/drivers/gpu/drm/vc4/vc4_irq.c
index 4342fb43e8c1..2eacfb6773d2 100644
--- a/drivers/gpu/drm/vc4/vc4_irq.c
+++ b/drivers/gpu/drm/vc4/vc4_irq.c
@@ -265,6 +265,9 @@ vc4_irq_enable(struct drm_device *dev)
 {
 	struct vc4_dev *vc4 = to_vc4_dev(dev);
 
+	if (WARN_ON_ONCE(vc4->is_vc5))
+		return;
+
 	if (!vc4->v3d)
 		return;
 
@@ -279,6 +282,9 @@ vc4_irq_disable(struct drm_device *dev)
 {
 	struct vc4_dev *vc4 = to_vc4_dev(dev);
 
+	if (WARN_ON_ONCE(vc4->is_vc5))
+		return;
+
 	if (!vc4->v3d)
 		return;
 
@@ -296,8 +302,12 @@ vc4_irq_disable(struct drm_device *dev)
 
 int vc4_irq_install(struct drm_device *dev, int irq)
 {
+	struct vc4_dev *vc4 = to_vc4_dev(dev);
 	int ret;
 
+	if (WARN_ON_ONCE(vc4->is_vc5))
+		return -ENODEV;
+
 	if (irq == IRQ_NOTCONNECTED)
 		return -ENOTCONN;
 
@@ -316,6 +326,9 @@ void vc4_irq_uninstall(struct drm_device *dev)
 {
 	struct vc4_dev *vc4 = to_vc4_dev(dev);
 
+	if (WARN_ON_ONCE(vc4->is_vc5))
+		return;
+
 	vc4_irq_disable(dev);
 	free_irq(vc4->irq, dev);
 }
@@ -326,6 +339,9 @@ void vc4_irq_reset(struct drm_device *dev)
 	struct vc4_dev *vc4 = to_vc4_dev(dev);
 	unsigned long irqflags;
 
+	if (WARN_ON_ONCE(vc4->is_vc5))
+		return;
+
 	/* Acknowledge any stale IRQs. */
 	V3D_WRITE(V3D_INTCTL, V3D_DRIVER_IRQS);
 
diff --git a/drivers/gpu/drm/vc4/vc4_kms.c b/drivers/gpu/drm/vc4/vc4_kms.c
index 1d3b31fb71ea..893d831b24aa 100644
--- a/drivers/gpu/drm/vc4/vc4_kms.c
+++ b/drivers/gpu/drm/vc4/vc4_kms.c
@@ -479,8 +479,12 @@ static struct drm_framebuffer *vc4_fb_create(struct drm_device *dev,
 					     struct drm_file *file_priv,
 					     const struct drm_mode_fb_cmd2 *mode_cmd)
 {
+	struct vc4_dev *vc4 = to_vc4_dev(dev);
 	struct drm_mode_fb_cmd2 mode_cmd_local;
 
+	if (WARN_ON_ONCE(vc4->is_vc5))
+		return ERR_PTR(-ENODEV);
+
 	/* If the user didn't specify a modifier, use the
 	 * vc4_set_tiling_ioctl() state for the BO.
 	 */
diff --git a/drivers/gpu/drm/vc4/vc4_perfmon.c b/drivers/gpu/drm/vc4/vc4_perfmon.c
index 18abc06335c1..d853d647b44d 100644
--- a/drivers/gpu/drm/vc4/vc4_perfmon.c
+++ b/drivers/gpu/drm/vc4/vc4_perfmon.c
@@ -17,12 +17,22 @@
 
 void vc4_perfmon_get(struct vc4_perfmon *perfmon)
 {
+	struct vc4_dev *vc4 = perfmon->dev;
+
+	if (WARN_ON_ONCE(vc4->is_vc5))
+		return;
+
 	if (perfmon)
 		refcount_inc(&perfmon->refcnt);
 }
 
 void vc4_perfmon_put(struct vc4_perfmon *perfmon)
 {
+	struct vc4_dev *vc4 = perfmon->dev;
+
+	if (WARN_ON_ONCE(vc4->is_vc5))
+		return;
+
 	if (perfmon && refcount_dec_and_test(&perfmon->refcnt))
 		kfree(perfmon);
 }
@@ -32,6 +42,9 @@ void vc4_perfmon_start(struct vc4_dev *vc4, struct vc4_perfmon *perfmon)
 	unsigned int i;
 	u32 mask;
 
+	if (WARN_ON_ONCE(vc4->is_vc5))
+		return;
+
 	if (WARN_ON_ONCE(!perfmon || vc4->active_perfmon))
 		return;
 
@@ -49,6 +62,9 @@ void vc4_perfmon_stop(struct vc4_dev *vc4, struct vc4_perfmon *perfmon,
 {
 	unsigned int i;
 
+	if (WARN_ON_ONCE(vc4->is_vc5))
+		return;
+
 	if (WARN_ON_ONCE(!vc4->active_perfmon ||
 			 perfmon != vc4->active_perfmon))
 		return;
@@ -64,8 +80,12 @@ void vc4_perfmon_stop(struct vc4_dev *vc4, struct vc4_perfmon *perfmon,
 
 struct vc4_perfmon *vc4_perfmon_find(struct vc4_file *vc4file, int id)
 {
+	struct vc4_dev *vc4 = vc4file->dev;
 	struct vc4_perfmon *perfmon;
 
+	if (WARN_ON_ONCE(vc4->is_vc5))
+		return NULL;
+
 	mutex_lock(&vc4file->perfmon.lock);
 	perfmon = idr_find(&vc4file->perfmon.idr, id);
 	vc4_perfmon_get(perfmon);
@@ -76,8 +96,14 @@ struct vc4_perfmon *vc4_perfmon_find(struct vc4_file *vc4file, int id)
 
 void vc4_perfmon_open_file(struct vc4_file *vc4file)
 {
+	struct vc4_dev *vc4 = vc4file->dev;
+
+	if (WARN_ON_ONCE(vc4->is_vc5))
+		return;
+
 	mutex_init(&vc4file->perfmon.lock);
 	idr_init_base(&vc4file->perfmon.idr, VC4_PERFMONID_MIN);
+	vc4file->dev = vc4;
 }
 
 static int vc4_perfmon_idr_del(int id, void *elem, void *data)
@@ -91,6 +117,11 @@ static int vc4_perfmon_idr_del(int id, void *elem, void *data)
 
 void vc4_perfmon_close_file(struct vc4_file *vc4file)
 {
+	struct vc4_dev *vc4 = vc4file->dev;
+
+	if (WARN_ON_ONCE(vc4->is_vc5))
+		return;
+
 	mutex_lock(&vc4file->perfmon.lock);
 	idr_for_each(&vc4file->perfmon.idr, vc4_perfmon_idr_del, NULL);
 	idr_destroy(&vc4file->perfmon.idr);
@@ -107,6 +138,9 @@ int vc4_perfmon_create_ioctl(struct drm_device *dev, void *data,
 	unsigned int i;
 	int ret;
 
+	if (WARN_ON_ONCE(vc4->is_vc5))
+		return -ENODEV;
+
 	if (!vc4->v3d) {
 		DRM_DEBUG("Creating perfmon no VC4 V3D probed\n");
 		return -ENODEV;
@@ -127,6 +161,7 @@ int vc4_perfmon_create_ioctl(struct drm_device *dev, void *data,
 			  GFP_KERNEL);
 	if (!perfmon)
 		return -ENOMEM;
+	perfmon->dev = vc4;
 
 	for (i = 0; i < req->ncounters; i++)
 		perfmon->events[i] = req->events[i];
@@ -157,6 +192,9 @@ int vc4_perfmon_destroy_ioctl(struct drm_device *dev, void *data,
 	struct drm_vc4_perfmon_destroy *req = data;
 	struct vc4_perfmon *perfmon;
 
+	if (WARN_ON_ONCE(vc4->is_vc5))
+		return -ENODEV;
+
 	if (!vc4->v3d) {
 		DRM_DEBUG("Destroying perfmon no VC4 V3D probed\n");
 		return -ENODEV;
@@ -182,6 +220,9 @@ int vc4_perfmon_get_values_ioctl(struct drm_device *dev, void *data,
 	struct vc4_perfmon *perfmon;
 	int ret;
 
+	if (WARN_ON_ONCE(vc4->is_vc5))
+		return -ENODEV;
+
 	if (!vc4->v3d) {
 		DRM_DEBUG("Getting perfmon no VC4 V3D probed\n");
 		return -ENODEV;
diff --git a/drivers/gpu/drm/vc4/vc4_render_cl.c b/drivers/gpu/drm/vc4/vc4_render_cl.c
index 3c918eeaf56e..f6b7dc3df08c 100644
--- a/drivers/gpu/drm/vc4/vc4_render_cl.c
+++ b/drivers/gpu/drm/vc4/vc4_render_cl.c
@@ -593,11 +593,15 @@ vc4_rcl_render_config_surface_setup(struct vc4_exec_info *exec,
 
 int vc4_get_rcl(struct drm_device *dev, struct vc4_exec_info *exec)
 {
+	struct vc4_dev *vc4 = to_vc4_dev(dev);
 	struct vc4_rcl_setup setup = {0};
 	struct drm_vc4_submit_cl *args = exec->args;
 	bool has_bin = args->bin_cl_size != 0;
 	int ret;
 
+	if (WARN_ON_ONCE(vc4->is_vc5))
+		return -ENODEV;
+
 	if (args->min_x_tile > args->max_x_tile ||
 	    args->min_y_tile > args->max_y_tile) {
 		DRM_DEBUG("Bad render tile set (%d,%d)-(%d,%d)\n",
diff --git a/drivers/gpu/drm/vc4/vc4_v3d.c b/drivers/gpu/drm/vc4/vc4_v3d.c
index 7bb3067f8425..cc714dcfe1f2 100644
--- a/drivers/gpu/drm/vc4/vc4_v3d.c
+++ b/drivers/gpu/drm/vc4/vc4_v3d.c
@@ -127,6 +127,9 @@ static int vc4_v3d_debugfs_ident(struct seq_file *m, void *unused)
 int
 vc4_v3d_pm_get(struct vc4_dev *vc4)
 {
+	if (WARN_ON_ONCE(vc4->is_vc5))
+		return -ENODEV;
+
 	mutex_lock(&vc4->power_lock);
 	if (vc4->power_refcount++ == 0) {
 		int ret = pm_runtime_get_sync(&vc4->v3d->pdev->dev);
@@ -145,6 +148,9 @@ vc4_v3d_pm_get(struct vc4_dev *vc4)
 void
 vc4_v3d_pm_put(struct vc4_dev *vc4)
 {
+	if (WARN_ON_ONCE(vc4->is_vc5))
+		return;
+
 	mutex_lock(&vc4->power_lock);
 	if (--vc4->power_refcount == 0) {
 		pm_runtime_mark_last_busy(&vc4->v3d->pdev->dev);
@@ -172,6 +178,9 @@ int vc4_v3d_get_bin_slot(struct vc4_dev *vc4)
 	uint64_t seqno = 0;
 	struct vc4_exec_info *exec;
 
+	if (WARN_ON_ONCE(vc4->is_vc5))
+		return -ENODEV;
+
 try_again:
 	spin_lock_irqsave(&vc4->job_lock, irqflags);
 	slot = ffs(~vc4->bin_alloc_used);
@@ -316,6 +325,9 @@ int vc4_v3d_bin_bo_get(struct vc4_dev *vc4, bool *used)
 {
 	int ret = 0;
 
+	if (WARN_ON_ONCE(vc4->is_vc5))
+		return -ENODEV;
+
 	mutex_lock(&vc4->bin_bo_lock);
 
 	if (used && *used)
@@ -348,6 +360,9 @@ static void bin_bo_release(struct kref *ref)
 
 void vc4_v3d_bin_bo_put(struct vc4_dev *vc4)
 {
+	if (WARN_ON_ONCE(vc4->is_vc5))
+		return;
+
 	mutex_lock(&vc4->bin_bo_lock);
 	kref_put(&vc4->bin_bo_kref, bin_bo_release);
 	mutex_unlock(&vc4->bin_bo_lock);
diff --git a/drivers/gpu/drm/vc4/vc4_validate.c b/drivers/gpu/drm/vc4/vc4_validate.c
index eec76af49f04..833eb623d545 100644
--- a/drivers/gpu/drm/vc4/vc4_validate.c
+++ b/drivers/gpu/drm/vc4/vc4_validate.c
@@ -105,9 +105,13 @@ size_is_lt(uint32_t width, uint32_t height, int cpp)
 struct drm_gem_cma_object *
 vc4_use_bo(struct vc4_exec_info *exec, uint32_t hindex)
 {
+	struct vc4_dev *vc4 = exec->dev;
 	struct drm_gem_cma_object *obj;
 	struct vc4_bo *bo;
 
+	if (WARN_ON_ONCE(vc4->is_vc5))
+		return NULL;
+
 	if (hindex >= exec->bo_count) {
 		DRM_DEBUG("BO index %d greater than BO count %d\n",
 			  hindex, exec->bo_count);
@@ -160,10 +164,14 @@ vc4_check_tex_size(struct vc4_exec_info *exec, struct drm_gem_cma_object *fbo,
 		   uint32_t offset, uint8_t tiling_format,
 		   uint32_t width, uint32_t height, uint8_t cpp)
 {
+	struct vc4_dev *vc4 = exec->dev;
 	uint32_t aligned_width, aligned_height, stride, size;
 	uint32_t utile_w = utile_width(cpp);
 	uint32_t utile_h = utile_height(cpp);
 
+	if (WARN_ON_ONCE(vc4->is_vc5))
+		return -ENODEV;
+
 	/* The shaded vertex format stores signed 12.4 fixed point
 	 * (-2048,2047) offsets from the viewport center, so we should
 	 * never have a render target larger than 4096.  The texture
@@ -482,10 +490,14 @@ vc4_validate_bin_cl(struct drm_device *dev,
 		    void *unvalidated,
 		    struct vc4_exec_info *exec)
 {
+	struct vc4_dev *vc4 = to_vc4_dev(dev);
 	uint32_t len = exec->args->bin_cl_size;
 	uint32_t dst_offset = 0;
 	uint32_t src_offset = 0;
 
+	if (WARN_ON_ONCE(vc4->is_vc5))
+		return -ENODEV;
+
 	while (src_offset < len) {
 		void *dst_pkt = validated + dst_offset;
 		void *src_pkt = unvalidated + src_offset;
@@ -926,9 +938,13 @@ int
 vc4_validate_shader_recs(struct drm_device *dev,
 			 struct vc4_exec_info *exec)
 {
+	struct vc4_dev *vc4 = to_vc4_dev(dev);
 	uint32_t i;
 	int ret = 0;
 
+	if (WARN_ON_ONCE(vc4->is_vc5))
+		return -ENODEV;
+
 	for (i = 0; i < exec->shader_state_count; i++) {
 		ret = validate_gl_shader_rec(dev, exec, &exec->shader_state[i]);
 		if (ret)
diff --git a/drivers/gpu/drm/vc4/vc4_validate_shaders.c b/drivers/gpu/drm/vc4/vc4_validate_shaders.c
index 7cf82b071de2..e315aeb5fef5 100644
--- a/drivers/gpu/drm/vc4/vc4_validate_shaders.c
+++ b/drivers/gpu/drm/vc4/vc4_validate_shaders.c
@@ -778,6 +778,7 @@ vc4_handle_branch_target(struct vc4_shader_validation_state *validation_state)
 struct vc4_validated_shader_info *
 vc4_validate_shader(struct drm_gem_cma_object *shader_obj)
 {
+	struct vc4_dev *vc4 = to_vc4_dev(shader_obj->base.dev);
 	bool found_shader_end = false;
 	int shader_end_ip = 0;
 	uint32_t last_thread_switch_ip = -3;
@@ -785,6 +786,9 @@ vc4_validate_shader(struct drm_gem_cma_object *shader_obj)
 	struct vc4_validated_shader_info *validated_shader = NULL;
 	struct vc4_shader_validation_state validation_state;
 
+	if (WARN_ON_ONCE(vc4->is_vc5))
+		return NULL;
+
 	memset(&validation_state, 0, sizeof(validation_state));
 	validation_state.shader = shader_obj->vaddr;
 	validation_state.max_ip = shader_obj->base.size / sizeof(uint64_t);
-- 
2.35.1


^ permalink raw reply related	[flat|nested] 25+ messages in thread

* Re: [PATCH 13/14] drm/vc4: crtc: Fix out of order frames during asynchronous page flips
  2022-05-03 12:13 ` [PATCH 13/14] drm/vc4: crtc: Fix out of order frames during asynchronous page flips Maxime Ripard
@ 2022-05-03 13:53   ` kernel test robot
  2022-05-09 17:10   ` Melissa Wen
  1 sibling, 0 replies; 25+ messages in thread
From: kernel test robot @ 2022-05-03 13:53 UTC (permalink / raw)
  To: Maxime Ripard, Daniel Vetter, David Airlie, Maarten Lankhorst,
	Thomas Zimmermann
  Cc: Melissa Wen, kbuild-all, dri-devel

Hi Maxime,

I love your patch! Perhaps something to improve:

[auto build test WARNING on drm-intel/for-linux-next]
[also build test WARNING on drm-tip/drm-tip next-20220503]
[cannot apply to anholt/for-next v5.18-rc5]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch]

url:    https://github.com/intel-lab-lkp/linux/commits/Maxime-Ripard/drm-vc4-Properly-separate-v3d-on-BCM2711-and-fix-frame-ordering/20220503-201745
base:   git://anongit.freedesktop.org/drm-intel for-linux-next
config: arc-allyesconfig (https://download.01.org/0day-ci/archive/20220503/202205032134.wMMUQ85T-lkp@intel.com/config)
compiler: arceb-elf-gcc (GCC) 11.3.0
reproduce (this is a W=1 build):
        wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
        chmod +x ~/bin/make.cross
        # https://github.com/intel-lab-lkp/linux/commit/0db44bd12236a0a64cf67bbef6b11df5bbe37134
        git remote add linux-review https://github.com/intel-lab-lkp/linux
        git fetch --no-tags linux-review Maxime-Ripard/drm-vc4-Properly-separate-v3d-on-BCM2711-and-fix-frame-ordering/20220503-201745
        git checkout 0db44bd12236a0a64cf67bbef6b11df5bbe37134
        # save the config file
        mkdir build_dir && cp config build_dir/.config
        COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-11.3.0 make.cross W=1 O=build_dir ARCH=arc SHELL=/bin/bash drivers/gpu/drm/vc4/

If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot <lkp@intel.com>

All warnings (new ones prefixed by >>):

   drivers/gpu/drm/vc4/vc4_crtc.c: In function 'vc4_async_set_fence_cb':
>> drivers/gpu/drm/vc4/vc4_crtc.c:865:57: warning: implicit conversion from 'enum <anonymous>' to 'enum dma_resv_usage' [-Wenum-conversion]
     865 |         ret = dma_resv_get_singleton(cma_bo->base.resv, false, &fence);
         |                                                         ^~~~~


vim +865 drivers/gpu/drm/vc4/vc4_crtc.c

   848	
   849	static int vc4_async_set_fence_cb(struct drm_device *dev,
   850					  struct vc4_async_flip_state *flip_state)
   851	{
   852		struct drm_framebuffer *fb = flip_state->fb;
   853		struct drm_gem_cma_object *cma_bo = drm_fb_cma_get_gem_obj(fb, 0);
   854		struct vc4_dev *vc4 = to_vc4_dev(dev);
   855		struct dma_fence *fence;
   856		int ret;
   857	
   858		if (!vc4->is_vc5) {
   859			struct vc4_bo *bo = to_vc4_bo(&cma_bo->base);
   860	
   861			return vc4_queue_seqno_cb(dev, &flip_state->cb.seqno, bo->seqno,
   862						  vc4_async_page_flip_seqno_complete);
   863		}
   864	
 > 865		ret = dma_resv_get_singleton(cma_bo->base.resv, false, &fence);
   866		if (ret)
   867			return ret;
   868	
   869		if (dma_fence_add_callback(fence, &flip_state->cb.fence,
   870					   vc4_async_page_flip_fence_complete))
   871			vc4_async_page_flip_fence_complete(fence, &flip_state->cb.fence);
   872	
   873		return 0;
   874	}
   875	

-- 
0-DAY CI Kernel Test Service
https://01.org/lkp

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 05/14] drm/vc4: drv: Register a different driver on BCM2711
  2022-05-03 12:13 ` [PATCH 05/14] drm/vc4: drv: Register a different driver on BCM2711 Maxime Ripard
@ 2022-05-09 16:37   ` Melissa Wen
  0 siblings, 0 replies; 25+ messages in thread
From: Melissa Wen @ 2022-05-09 16:37 UTC (permalink / raw)
  To: Maxime Ripard; +Cc: David Airlie, Daniel Vetter, dri-devel, Thomas Zimmermann

[-- Attachment #1: Type: text/plain, Size: 4733 bytes --]

On 05/03, Maxime Ripard wrote:
> Prior to the BCM2711/RaspberryPi4, the GPU was a part of the display
> components of the SoC. It was thus a part of the vc4 driver.
> 
> However, with the BCM2711, it got split out and thus the v3d driver was
> created. The vc4 driver now only handles the display part.
> 
> We didn't properly split out the code when doing the BCM2711 support
> though, and most of the code around buffer allocations is still
> involved, even though it doesn't have the backing hardware anymore.
> 
> Let's start the split out by creating a new drm_driver that only reports
> and uses what we support on the BCM2711. The ioctl were properly
> filtered already, but we were still exposing a .gem_create_object hook,
> as well as having an .open and .postclose hooks which are only relevant
> on older generations.
> 
> Signed-off-by: Maxime Ripard <maxime@cerno.tech>
> ---
>  drivers/gpu/drm/vc4/vc4_drv.c | 51 ++++++++++++++++++++++++++++-------
>  1 file changed, 42 insertions(+), 9 deletions(-)
> 
> diff --git a/drivers/gpu/drm/vc4/vc4_drv.c b/drivers/gpu/drm/vc4/vc4_drv.c
> index eb08940028d3..4f9e2067dad0 100644
> --- a/drivers/gpu/drm/vc4/vc4_drv.c
> +++ b/drivers/gpu/drm/vc4/vc4_drv.c
> @@ -76,6 +76,19 @@ int vc4_dumb_fixup_args(struct drm_mode_create_dumb *args)
>  	return 0;
>  }
>  
> +static int vc4_dumb_create(struct drm_file *file_priv,
> +			   struct drm_device *dev,
> +			   struct drm_mode_create_dumb *args)
> +{
> +	int ret;
> +
> +	ret = vc4_dumb_fixup_args(args);
> +	if (ret)
> +		return ret;
> +
> +	return drm_gem_cma_dumb_create_internal(file_priv, dev, args);
> +}
> +
>  static int vc4_get_param_ioctl(struct drm_device *dev, void *data,
>  			       struct drm_file *file_priv)
>  {
> @@ -173,7 +186,7 @@ static const struct drm_ioctl_desc vc4_drm_ioctls[] = {
>  	DRM_IOCTL_DEF_DRV(VC4_PERFMON_GET_VALUES, vc4_perfmon_get_values_ioctl, DRM_RENDER_ALLOW),
>  };
>  
> -static struct drm_driver vc4_drm_driver = {
> +static const struct drm_driver vc4_drm_driver = {
>  	.driver_features = (DRIVER_MODESET |
>  			    DRIVER_ATOMIC |
>  			    DRIVER_GEM |
> @@ -202,6 +215,27 @@ static struct drm_driver vc4_drm_driver = {
>  	.patchlevel = DRIVER_PATCHLEVEL,
>  };
>  
> +static const struct drm_driver vc5_drm_driver = {
> +	.driver_features = (DRIVER_MODESET |
> +			    DRIVER_ATOMIC |
> +			    DRIVER_GEM),
> +
> +#if defined(CONFIG_DEBUG_FS)
> +	.debugfs_init = vc4_debugfs_init,
> +#endif
> +
> +	DRM_GEM_CMA_DRIVER_OPS_WITH_DUMB_CREATE(vc4_dumb_create),

I wonder if we can call it `vc5_dumb_create` to dissociate from vc4.
Instead of a `vc4_dumb_create` only used by vc5_drm_driver.

I mean, mistunderstandings already happen between vc4->v3d (component)
and v3d (driver). Then now we have a vc5 - without v3d (component) and
existing a v3d (driver) - and now I worry that it's going to get even
more confusing...

> +
> +	.fops = &vc4_drm_fops,
> +
> +	.name = DRIVER_NAME,
> +	.desc = DRIVER_DESC,
> +	.date = DRIVER_DATE,
> +	.major = DRIVER_MAJOR,
> +	.minor = DRIVER_MINOR,
> +	.patchlevel = DRIVER_PATCHLEVEL,
> +};
> +
>  static void vc4_match_add_drivers(struct device *dev,
>  				  struct component_match **match,
>  				  struct platform_driver *const *drivers,
> @@ -225,6 +259,7 @@ static void vc4_match_add_drivers(struct device *dev,
>  static int vc4_drm_bind(struct device *dev)
>  {
>  	struct platform_device *pdev = to_platform_device(dev);
> +	const struct drm_driver *driver;
>  	struct rpi_firmware *firmware = NULL;
>  	struct drm_device *drm;
>  	struct vc4_dev *vc4;
> @@ -236,14 +271,12 @@ static int vc4_drm_bind(struct device *dev)
>  	dev->coherent_dma_mask = DMA_BIT_MASK(32);
>  
>  	is_vc5 = of_device_is_compatible(dev->of_node, "brcm,bcm2711-vc5");
> +	if (is_vc5)
> +		driver = &vc5_drm_driver;
> +	else
> +		driver = &vc4_drm_driver;
>  
> -	/* If VC4 V3D is missing, don't advertise render nodes. */
> -	node = of_find_matching_node_and_match(NULL, vc4_v3d_dt_match, NULL);
> -	if (!node || !of_device_is_available(node))
> -		vc4_drm_driver.driver_features &= ~DRIVER_RENDER;
> -	of_node_put(node);
> -
> -	vc4 = devm_drm_dev_alloc(dev, &vc4_drm_driver, struct vc4_dev, base);
> +	vc4 = devm_drm_dev_alloc(dev, driver, struct vc4_dev, base);
>  	if (IS_ERR(vc4))
>  		return PTR_ERR(vc4);
>  	vc4->is_vc5 = is_vc5;
> @@ -275,7 +308,7 @@ static int vc4_drm_bind(struct device *dev)
>  			return -EPROBE_DEFER;
>  	}
>  
> -	ret = drm_aperture_remove_framebuffers(false, &vc4_drm_driver);
> +	ret = drm_aperture_remove_framebuffers(false, driver);
>  	if (ret)
>  		return ret;
>  
> -- 
> 2.35.1
> 

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 14/14] drm/vc4: Warn if some v3d code is run on BCM2711
  2022-05-03 12:13 ` [PATCH 14/14] drm/vc4: Warn if some v3d code is run on BCM2711 Maxime Ripard
@ 2022-05-09 16:52   ` Melissa Wen
  2022-05-31  9:40     ` Maxime Ripard
  0 siblings, 1 reply; 25+ messages in thread
From: Melissa Wen @ 2022-05-09 16:52 UTC (permalink / raw)
  To: Maxime Ripard; +Cc: David Airlie, Daniel Vetter, dri-devel, Thomas Zimmermann

[-- Attachment #1: Type: text/plain, Size: 25167 bytes --]

On 05/03, Maxime Ripard wrote:
> The BCM2711 has a separate driver for the v3d, and thus we can't call
> into any of the driver entrypoints that rely on the v3d being there.
> 
> Let's add a bunch of checks and complain loudly if that ever happen.
> 
> Signed-off-by: Maxime Ripard <maxime@cerno.tech>
> ---
>  drivers/gpu/drm/vc4/vc4_bo.c               | 49 ++++++++++++++++++++++
>  drivers/gpu/drm/vc4/vc4_drv.c              | 11 +++++
>  drivers/gpu/drm/vc4/vc4_drv.h              |  6 +++
>  drivers/gpu/drm/vc4/vc4_gem.c              | 40 ++++++++++++++++++
>  drivers/gpu/drm/vc4/vc4_irq.c              | 16 +++++++
>  drivers/gpu/drm/vc4/vc4_kms.c              |  4 ++
>  drivers/gpu/drm/vc4/vc4_perfmon.c          | 41 ++++++++++++++++++
>  drivers/gpu/drm/vc4/vc4_render_cl.c        |  4 ++
>  drivers/gpu/drm/vc4/vc4_v3d.c              | 15 +++++++
>  drivers/gpu/drm/vc4/vc4_validate.c         | 16 +++++++
>  drivers/gpu/drm/vc4/vc4_validate_shaders.c |  4 ++
>  11 files changed, 206 insertions(+)
> 
> diff --git a/drivers/gpu/drm/vc4/vc4_bo.c b/drivers/gpu/drm/vc4/vc4_bo.c
> index 3ca16d682fc0..b8d856312846 100644
> --- a/drivers/gpu/drm/vc4/vc4_bo.c
> +++ b/drivers/gpu/drm/vc4/vc4_bo.c
> @@ -248,6 +248,9 @@ void vc4_bo_add_to_purgeable_pool(struct vc4_bo *bo)
>  {
>  	struct vc4_dev *vc4 = to_vc4_dev(bo->base.base.dev);
>  
> +	if (WARN_ON_ONCE(vc4->is_vc5))
> +		return;
> +
>  	mutex_lock(&vc4->purgeable.lock);
>  	list_add_tail(&bo->size_head, &vc4->purgeable.list);
>  	vc4->purgeable.num++;
> @@ -259,6 +262,9 @@ static void vc4_bo_remove_from_purgeable_pool_locked(struct vc4_bo *bo)
>  {
>  	struct vc4_dev *vc4 = to_vc4_dev(bo->base.base.dev);
>  
> +	if (WARN_ON_ONCE(vc4->is_vc5))
> +		return;
> +
>  	/* list_del_init() is used here because the caller might release
>  	 * the purgeable lock in order to acquire the madv one and update the
>  	 * madv status.
> @@ -387,6 +393,9 @@ struct drm_gem_object *vc4_create_object(struct drm_device *dev, size_t size)
>  	struct vc4_dev *vc4 = to_vc4_dev(dev);
>  	struct vc4_bo *bo;
>  
> +	if (WARN_ON_ONCE(vc4->is_vc5))
> +		return ERR_PTR(-ENODEV);
> +
>  	bo = kzalloc(sizeof(*bo), GFP_KERNEL);
>  	if (!bo)
>  		return ERR_PTR(-ENOMEM);
> @@ -413,6 +422,9 @@ struct vc4_bo *vc4_bo_create(struct drm_device *dev, size_t unaligned_size,
>  	struct drm_gem_cma_object *cma_obj;
>  	struct vc4_bo *bo;
>  
> +	if (WARN_ON_ONCE(vc4->is_vc5))
> +		return ERR_PTR(-ENODEV);
> +
>  	if (size == 0)
>  		return ERR_PTR(-EINVAL);
>  
> @@ -475,9 +487,13 @@ int vc4_bo_dumb_create(struct drm_file *file_priv,
>  		       struct drm_device *dev,
>  		       struct drm_mode_create_dumb *args)
>  {
> +	struct vc4_dev *vc4 = to_vc4_dev(dev);
>  	struct vc4_bo *bo = NULL;
>  	int ret;
>  
> +	if (WARN_ON_ONCE(vc4->is_vc5))
> +		return -ENODEV;
> +
>  	ret = vc4_dumb_fixup_args(args);
>  	if (ret)
>  		return ret;
> @@ -598,8 +614,12 @@ static void vc4_bo_cache_time_work(struct work_struct *work)
>  
>  int vc4_bo_inc_usecnt(struct vc4_bo *bo)
>  {
> +	struct vc4_dev *vc4 = to_vc4_dev(bo->base.base.dev);
>  	int ret;
>  
> +	if (WARN_ON_ONCE(vc4->is_vc5))
> +		return -ENODEV;
> +
>  	/* Fast path: if the BO is already retained by someone, no need to
>  	 * check the madv status.
>  	 */
> @@ -634,6 +654,11 @@ int vc4_bo_inc_usecnt(struct vc4_bo *bo)
>  
>  void vc4_bo_dec_usecnt(struct vc4_bo *bo)
>  {
> +	struct vc4_dev *vc4 = to_vc4_dev(bo->base.base.dev);
> +
> +	if (WARN_ON_ONCE(vc4->is_vc5))
> +		return;
> +
>  	/* Fast path: if the BO is still retained by someone, no need to test
>  	 * the madv value.
>  	 */
> @@ -753,6 +778,9 @@ int vc4_create_bo_ioctl(struct drm_device *dev, void *data,
>  	struct vc4_bo *bo = NULL;
>  	int ret;
>  
> +	if (WARN_ON_ONCE(vc4->is_vc5))
> +		return -ENODEV;
> +
>  	ret = vc4_grab_bin_bo(vc4, vc4file);
>  	if (ret)
>  		return ret;
> @@ -776,9 +804,13 @@ int vc4_create_bo_ioctl(struct drm_device *dev, void *data,
>  int vc4_mmap_bo_ioctl(struct drm_device *dev, void *data,
>  		      struct drm_file *file_priv)
>  {
> +	struct vc4_dev *vc4 = to_vc4_dev(dev);
>  	struct drm_vc4_mmap_bo *args = data;
>  	struct drm_gem_object *gem_obj;
>  
> +	if (WARN_ON_ONCE(vc4->is_vc5))
> +		return -ENODEV;
> +
>  	gem_obj = drm_gem_object_lookup(file_priv, args->handle);
>  	if (!gem_obj) {
>  		DRM_DEBUG("Failed to look up GEM BO %d\n", args->handle);
> @@ -802,6 +834,9 @@ vc4_create_shader_bo_ioctl(struct drm_device *dev, void *data,
>  	struct vc4_bo *bo = NULL;
>  	int ret;
>  
> +	if (WARN_ON_ONCE(vc4->is_vc5))
> +		return -ENODEV;
> +
>  	if (args->size == 0)
>  		return -EINVAL;
>  
> @@ -872,11 +907,15 @@ vc4_create_shader_bo_ioctl(struct drm_device *dev, void *data,
>  int vc4_set_tiling_ioctl(struct drm_device *dev, void *data,
>  			 struct drm_file *file_priv)
>  {
> +	struct vc4_dev *vc4 = to_vc4_dev(dev);
>  	struct drm_vc4_set_tiling *args = data;
>  	struct drm_gem_object *gem_obj;
>  	struct vc4_bo *bo;
>  	bool t_format;
>  
> +	if (WARN_ON_ONCE(vc4->is_vc5))
> +		return -ENODEV;
> +
>  	if (args->flags != 0)
>  		return -EINVAL;
>  
> @@ -915,10 +954,14 @@ int vc4_set_tiling_ioctl(struct drm_device *dev, void *data,
>  int vc4_get_tiling_ioctl(struct drm_device *dev, void *data,
>  			 struct drm_file *file_priv)
>  {
> +	struct vc4_dev *vc4 = to_vc4_dev(dev);
>  	struct drm_vc4_get_tiling *args = data;
>  	struct drm_gem_object *gem_obj;
>  	struct vc4_bo *bo;
>  
> +	if (WARN_ON_ONCE(vc4->is_vc5))
> +		return -ENODEV;
> +

Just to confirm: as none of these ioctls were wired up in
vc5_drm_driver, is there any situation where this path can be taken
wrongly?

>  	if (args->flags != 0 || args->modifier != 0)
>  		return -EINVAL;
>  
> @@ -945,6 +988,9 @@ int vc4_bo_cache_init(struct drm_device *dev)
>  	struct vc4_dev *vc4 = to_vc4_dev(dev);
>  	int i;
>  
> +	if (WARN_ON_ONCE(vc4->is_vc5))
> +		return -ENODEV;
> +
>  	/* Create the initial set of BO labels that the kernel will
>  	 * use.  This lets us avoid a bunch of string reallocation in
>  	 * the kernel's draw and BO allocation paths.
> @@ -1004,6 +1050,9 @@ int vc4_label_bo_ioctl(struct drm_device *dev, void *data,
>  	struct drm_gem_object *gem_obj;
>  	int ret = 0, label;
>  
> +	if (WARN_ON_ONCE(vc4->is_vc5))
> +		return -ENODEV;
> +
>  	if (!args->len)
>  		return -EINVAL;
>  
> diff --git a/drivers/gpu/drm/vc4/vc4_drv.c b/drivers/gpu/drm/vc4/vc4_drv.c
> index 64c7696aa9e4..ba017421736c 100644
> --- a/drivers/gpu/drm/vc4/vc4_drv.c
> +++ b/drivers/gpu/drm/vc4/vc4_drv.c
> @@ -99,6 +99,9 @@ static int vc4_get_param_ioctl(struct drm_device *dev, void *data,
>  	if (args->pad != 0)
>  		return -EINVAL;
>  
> +	if (WARN_ON_ONCE(vc4->is_vc5))
> +		return -ENODEV;
> +
>  	if (!vc4->v3d)
>  		return -ENODEV;
>  
> @@ -142,11 +145,16 @@ static int vc4_get_param_ioctl(struct drm_device *dev, void *data,
>  
>  static int vc4_open(struct drm_device *dev, struct drm_file *file)
>  {
> +	struct vc4_dev *vc4 = to_vc4_dev(dev);
>  	struct vc4_file *vc4file;
>  
> +	if (WARN_ON_ONCE(vc4->is_vc5))
> +		return -ENODEV;
> +
>  	vc4file = kzalloc(sizeof(*vc4file), GFP_KERNEL);
>  	if (!vc4file)
>  		return -ENOMEM;
> +	vc4file->dev = vc4;
>  
>  	vc4_perfmon_open_file(vc4file);
>  	file->driver_priv = vc4file;
> @@ -158,6 +166,9 @@ static void vc4_close(struct drm_device *dev, struct drm_file *file)
>  	struct vc4_dev *vc4 = to_vc4_dev(dev);
>  	struct vc4_file *vc4file = file->driver_priv;
>  
> +	if (WARN_ON_ONCE(vc4->is_vc5))
> +		return;
> +
>  	if (vc4file->bin_bo_used)
>  		vc4_v3d_bin_bo_put(vc4);
>  
> diff --git a/drivers/gpu/drm/vc4/vc4_drv.h b/drivers/gpu/drm/vc4/vc4_drv.h
> index 9c324c12c410..93fd55b9e99e 100644
> --- a/drivers/gpu/drm/vc4/vc4_drv.h
> +++ b/drivers/gpu/drm/vc4/vc4_drv.h
> @@ -48,6 +48,8 @@ enum vc4_kernel_bo_type {
>   * done. This way, only events related to a specific job will be counted.
>   */
>  struct vc4_perfmon {
> +	struct vc4_dev *dev;
> +
>  	/* Tracks the number of users of the perfmon, when this counter reaches
>  	 * zero the perfmon is destroyed.
>  	 */
> @@ -580,6 +582,8 @@ to_vc4_crtc_state(struct drm_crtc_state *crtc_state)
>  #define VC4_REG32(reg) { .name = #reg, .offset = reg }
>  
>  struct vc4_exec_info {
> +	struct vc4_dev *dev;
> +
>  	/* Sequence number for this bin/render job. */
>  	uint64_t seqno;
>  
> @@ -701,6 +705,8 @@ struct vc4_exec_info {
>   * released when the DRM file is closed should be placed here.
>   */
>  struct vc4_file {
> +	struct vc4_dev *dev;
> +
>  	struct {
>  		struct idr idr;
>  		struct mutex lock;
> diff --git a/drivers/gpu/drm/vc4/vc4_gem.c b/drivers/gpu/drm/vc4/vc4_gem.c
> index 9eaf304fc20d..fe10d9c3fff8 100644
> --- a/drivers/gpu/drm/vc4/vc4_gem.c
> +++ b/drivers/gpu/drm/vc4/vc4_gem.c
> @@ -76,6 +76,9 @@ vc4_get_hang_state_ioctl(struct drm_device *dev, void *data,
>  	u32 i;
>  	int ret = 0;
>  
> +	if (WARN_ON_ONCE(vc4->is_vc5))
> +		return -ENODEV;
> +
>  	if (!vc4->v3d) {
>  		DRM_DEBUG("VC4_GET_HANG_STATE with no VC4 V3D probed\n");
>  		return -ENODEV;
> @@ -386,6 +389,9 @@ vc4_wait_for_seqno(struct drm_device *dev, uint64_t seqno, uint64_t timeout_ns,
>  	unsigned long timeout_expire;
>  	DEFINE_WAIT(wait);
>  
> +	if (WARN_ON_ONCE(vc4->is_vc5))
> +		return -ENODEV;
> +
>  	if (vc4->finished_seqno >= seqno)
>  		return 0;
>  
> @@ -468,6 +474,9 @@ vc4_submit_next_bin_job(struct drm_device *dev)
>  	struct vc4_dev *vc4 = to_vc4_dev(dev);
>  	struct vc4_exec_info *exec;
>  
> +	if (WARN_ON_ONCE(vc4->is_vc5))
> +		return;
> +
>  again:
>  	exec = vc4_first_bin_job(vc4);
>  	if (!exec)
> @@ -513,6 +522,9 @@ vc4_submit_next_render_job(struct drm_device *dev)
>  	if (!exec)
>  		return;
>  
> +	if (WARN_ON_ONCE(vc4->is_vc5))
> +		return;
> +
>  	/* A previous RCL may have written to one of our textures, and
>  	 * our full cache flush at bin time may have occurred before
>  	 * that RCL completed.  Flush the texture cache now, but not
> @@ -531,6 +543,9 @@ vc4_move_job_to_render(struct drm_device *dev, struct vc4_exec_info *exec)
>  	struct vc4_dev *vc4 = to_vc4_dev(dev);
>  	bool was_empty = list_empty(&vc4->render_job_list);
>  
> +	if (WARN_ON_ONCE(vc4->is_vc5))
> +		return;
> +
>  	list_move_tail(&exec->head, &vc4->render_job_list);
>  	if (was_empty)
>  		vc4_submit_next_render_job(dev);
> @@ -997,6 +1012,9 @@ vc4_job_handle_completed(struct vc4_dev *vc4)
>  	unsigned long irqflags;
>  	struct vc4_seqno_cb *cb, *cb_temp;
>  
> +	if (WARN_ON_ONCE(vc4->is_vc5))
> +		return;
> +
>  	spin_lock_irqsave(&vc4->job_lock, irqflags);
>  	while (!list_empty(&vc4->job_done_list)) {
>  		struct vc4_exec_info *exec =
> @@ -1033,6 +1051,9 @@ int vc4_queue_seqno_cb(struct drm_device *dev,
>  	struct vc4_dev *vc4 = to_vc4_dev(dev);
>  	unsigned long irqflags;
>  
> +	if (WARN_ON_ONCE(vc4->is_vc5))
> +		return -ENODEV;
> +
>  	cb->func = func;
>  	INIT_WORK(&cb->work, vc4_seqno_cb_work);
>  
> @@ -1083,8 +1104,12 @@ int
>  vc4_wait_seqno_ioctl(struct drm_device *dev, void *data,
>  		     struct drm_file *file_priv)
>  {
> +	struct vc4_dev *vc4 = to_vc4_dev(dev);
>  	struct drm_vc4_wait_seqno *args = data;
>  
> +	if (WARN_ON_ONCE(vc4->is_vc5))
> +		return -ENODEV;
> +
>  	return vc4_wait_for_seqno_ioctl_helper(dev, args->seqno,
>  					       &args->timeout_ns);
>  }
> @@ -1093,11 +1118,15 @@ int
>  vc4_wait_bo_ioctl(struct drm_device *dev, void *data,
>  		  struct drm_file *file_priv)
>  {
> +	struct vc4_dev *vc4 = to_vc4_dev(dev);
>  	int ret;
>  	struct drm_vc4_wait_bo *args = data;
>  	struct drm_gem_object *gem_obj;
>  	struct vc4_bo *bo;
>  
> +	if (WARN_ON_ONCE(vc4->is_vc5))
> +		return -ENODEV;
> +
>  	if (args->pad != 0)
>  		return -EINVAL;
>  
> @@ -1144,6 +1173,9 @@ vc4_submit_cl_ioctl(struct drm_device *dev, void *data,
>  				  args->shader_rec_size,
>  				  args->bo_handle_count);
>  
> +	if (WARN_ON_ONCE(vc4->is_vc5))
> +		return -ENODEV;
> +
>  	if (!vc4->v3d) {
>  		DRM_DEBUG("VC4_SUBMIT_CL with no VC4 V3D probed\n");
>  		return -ENODEV;
> @@ -1167,6 +1199,7 @@ vc4_submit_cl_ioctl(struct drm_device *dev, void *data,
>  		DRM_ERROR("malloc failure on exec struct\n");
>  		return -ENOMEM;
>  	}
> +	exec->dev = vc4;
>  
>  	ret = vc4_v3d_pm_get(vc4);
>  	if (ret) {
> @@ -1276,6 +1309,9 @@ int vc4_gem_init(struct drm_device *dev)
>  {
>  	struct vc4_dev *vc4 = to_vc4_dev(dev);
>  
> +	if (WARN_ON_ONCE(vc4->is_vc5))
> +		return -ENODEV;
> +
>  	vc4->dma_fence_context = dma_fence_context_alloc(1);
>  
>  	INIT_LIST_HEAD(&vc4->bin_job_list);
> @@ -1321,11 +1357,15 @@ static void vc4_gem_destroy(struct drm_device *dev, void *unused)
>  int vc4_gem_madvise_ioctl(struct drm_device *dev, void *data,
>  			  struct drm_file *file_priv)
>  {
> +	struct vc4_dev *vc4 = to_vc4_dev(dev);
>  	struct drm_vc4_gem_madvise *args = data;
>  	struct drm_gem_object *gem_obj;
>  	struct vc4_bo *bo;
>  	int ret;
>  
> +	if (WARN_ON_ONCE(vc4->is_vc5))
> +		return -ENODEV;
> +
>  	switch (args->madv) {
>  	case VC4_MADV_DONTNEED:
>  	case VC4_MADV_WILLNEED:
> diff --git a/drivers/gpu/drm/vc4/vc4_irq.c b/drivers/gpu/drm/vc4/vc4_irq.c
> index 4342fb43e8c1..2eacfb6773d2 100644
> --- a/drivers/gpu/drm/vc4/vc4_irq.c
> +++ b/drivers/gpu/drm/vc4/vc4_irq.c
> @@ -265,6 +265,9 @@ vc4_irq_enable(struct drm_device *dev)
>  {
>  	struct vc4_dev *vc4 = to_vc4_dev(dev);
>  
> +	if (WARN_ON_ONCE(vc4->is_vc5))
> +		return;
> +
>  	if (!vc4->v3d)
>  		return;
>  
> @@ -279,6 +282,9 @@ vc4_irq_disable(struct drm_device *dev)
>  {
>  	struct vc4_dev *vc4 = to_vc4_dev(dev);
>  
> +	if (WARN_ON_ONCE(vc4->is_vc5))
> +		return;
> +
>  	if (!vc4->v3d)
>  		return;
>  
> @@ -296,8 +302,12 @@ vc4_irq_disable(struct drm_device *dev)
>  
>  int vc4_irq_install(struct drm_device *dev, int irq)
>  {
> +	struct vc4_dev *vc4 = to_vc4_dev(dev);
>  	int ret;
>  
> +	if (WARN_ON_ONCE(vc4->is_vc5))
> +		return -ENODEV;
> +
>  	if (irq == IRQ_NOTCONNECTED)
>  		return -ENOTCONN;
>  
> @@ -316,6 +326,9 @@ void vc4_irq_uninstall(struct drm_device *dev)
>  {
>  	struct vc4_dev *vc4 = to_vc4_dev(dev);
>  
> +	if (WARN_ON_ONCE(vc4->is_vc5))
> +		return;
> +
>  	vc4_irq_disable(dev);
>  	free_irq(vc4->irq, dev);
>  }
> @@ -326,6 +339,9 @@ void vc4_irq_reset(struct drm_device *dev)
>  	struct vc4_dev *vc4 = to_vc4_dev(dev);
>  	unsigned long irqflags;
>  
> +	if (WARN_ON_ONCE(vc4->is_vc5))
> +		return;
> +
>  	/* Acknowledge any stale IRQs. */
>  	V3D_WRITE(V3D_INTCTL, V3D_DRIVER_IRQS);
>  
> diff --git a/drivers/gpu/drm/vc4/vc4_kms.c b/drivers/gpu/drm/vc4/vc4_kms.c
> index 1d3b31fb71ea..893d831b24aa 100644
> --- a/drivers/gpu/drm/vc4/vc4_kms.c
> +++ b/drivers/gpu/drm/vc4/vc4_kms.c
> @@ -479,8 +479,12 @@ static struct drm_framebuffer *vc4_fb_create(struct drm_device *dev,
>  					     struct drm_file *file_priv,
>  					     const struct drm_mode_fb_cmd2 *mode_cmd)
>  {
> +	struct vc4_dev *vc4 = to_vc4_dev(dev);
>  	struct drm_mode_fb_cmd2 mode_cmd_local;
>  
> +	if (WARN_ON_ONCE(vc4->is_vc5))
> +		return ERR_PTR(-ENODEV);
> +
>  	/* If the user didn't specify a modifier, use the
>  	 * vc4_set_tiling_ioctl() state for the BO.
>  	 */
> diff --git a/drivers/gpu/drm/vc4/vc4_perfmon.c b/drivers/gpu/drm/vc4/vc4_perfmon.c
> index 18abc06335c1..d853d647b44d 100644
> --- a/drivers/gpu/drm/vc4/vc4_perfmon.c
> +++ b/drivers/gpu/drm/vc4/vc4_perfmon.c
> @@ -17,12 +17,22 @@
>  
>  void vc4_perfmon_get(struct vc4_perfmon *perfmon)
>  {
> +	struct vc4_dev *vc4 = perfmon->dev;
> +
> +	if (WARN_ON_ONCE(vc4->is_vc5))
> +		return;
> +
>  	if (perfmon)
>  		refcount_inc(&perfmon->refcnt);
>  }
>  
>  void vc4_perfmon_put(struct vc4_perfmon *perfmon)
>  {
> +	struct vc4_dev *vc4 = perfmon->dev;
> +
> +	if (WARN_ON_ONCE(vc4->is_vc5))
> +		return;
> +
>  	if (perfmon && refcount_dec_and_test(&perfmon->refcnt))
>  		kfree(perfmon);
>  }
> @@ -32,6 +42,9 @@ void vc4_perfmon_start(struct vc4_dev *vc4, struct vc4_perfmon *perfmon)
>  	unsigned int i;
>  	u32 mask;
>  
> +	if (WARN_ON_ONCE(vc4->is_vc5))
> +		return;
> +
>  	if (WARN_ON_ONCE(!perfmon || vc4->active_perfmon))
>  		return;
>  
> @@ -49,6 +62,9 @@ void vc4_perfmon_stop(struct vc4_dev *vc4, struct vc4_perfmon *perfmon,
>  {
>  	unsigned int i;
>  
> +	if (WARN_ON_ONCE(vc4->is_vc5))
> +		return;
> +
>  	if (WARN_ON_ONCE(!vc4->active_perfmon ||
>  			 perfmon != vc4->active_perfmon))
>  		return;
> @@ -64,8 +80,12 @@ void vc4_perfmon_stop(struct vc4_dev *vc4, struct vc4_perfmon *perfmon,
>  
>  struct vc4_perfmon *vc4_perfmon_find(struct vc4_file *vc4file, int id)
>  {
> +	struct vc4_dev *vc4 = vc4file->dev;
>  	struct vc4_perfmon *perfmon;
>  
> +	if (WARN_ON_ONCE(vc4->is_vc5))
> +		return NULL;
> +
>  	mutex_lock(&vc4file->perfmon.lock);
>  	perfmon = idr_find(&vc4file->perfmon.idr, id);
>  	vc4_perfmon_get(perfmon);
> @@ -76,8 +96,14 @@ struct vc4_perfmon *vc4_perfmon_find(struct vc4_file *vc4file, int id)
>  
>  void vc4_perfmon_open_file(struct vc4_file *vc4file)
>  {
> +	struct vc4_dev *vc4 = vc4file->dev;
> +
> +	if (WARN_ON_ONCE(vc4->is_vc5))
> +		return;
> +
>  	mutex_init(&vc4file->perfmon.lock);
>  	idr_init_base(&vc4file->perfmon.idr, VC4_PERFMONID_MIN);
> +	vc4file->dev = vc4;
>  }
>  
>  static int vc4_perfmon_idr_del(int id, void *elem, void *data)
> @@ -91,6 +117,11 @@ static int vc4_perfmon_idr_del(int id, void *elem, void *data)
>  
>  void vc4_perfmon_close_file(struct vc4_file *vc4file)
>  {
> +	struct vc4_dev *vc4 = vc4file->dev;
> +
> +	if (WARN_ON_ONCE(vc4->is_vc5))
> +		return;
> +
>  	mutex_lock(&vc4file->perfmon.lock);
>  	idr_for_each(&vc4file->perfmon.idr, vc4_perfmon_idr_del, NULL);
>  	idr_destroy(&vc4file->perfmon.idr);
> @@ -107,6 +138,9 @@ int vc4_perfmon_create_ioctl(struct drm_device *dev, void *data,
>  	unsigned int i;
>  	int ret;
>  
> +	if (WARN_ON_ONCE(vc4->is_vc5))
> +		return -ENODEV;
> +
>  	if (!vc4->v3d) {
>  		DRM_DEBUG("Creating perfmon no VC4 V3D probed\n");
>  		return -ENODEV;
> @@ -127,6 +161,7 @@ int vc4_perfmon_create_ioctl(struct drm_device *dev, void *data,
>  			  GFP_KERNEL);
>  	if (!perfmon)
>  		return -ENOMEM;
> +	perfmon->dev = vc4;
>  
>  	for (i = 0; i < req->ncounters; i++)
>  		perfmon->events[i] = req->events[i];
> @@ -157,6 +192,9 @@ int vc4_perfmon_destroy_ioctl(struct drm_device *dev, void *data,
>  	struct drm_vc4_perfmon_destroy *req = data;
>  	struct vc4_perfmon *perfmon;
>  
> +	if (WARN_ON_ONCE(vc4->is_vc5))
> +		return -ENODEV;
> +
>  	if (!vc4->v3d) {
>  		DRM_DEBUG("Destroying perfmon no VC4 V3D probed\n");
>  		return -ENODEV;
> @@ -182,6 +220,9 @@ int vc4_perfmon_get_values_ioctl(struct drm_device *dev, void *data,
>  	struct vc4_perfmon *perfmon;
>  	int ret;
>  
> +	if (WARN_ON_ONCE(vc4->is_vc5))
> +		return -ENODEV;
> +
>  	if (!vc4->v3d) {
>  		DRM_DEBUG("Getting perfmon no VC4 V3D probed\n");
>  		return -ENODEV;
> diff --git a/drivers/gpu/drm/vc4/vc4_render_cl.c b/drivers/gpu/drm/vc4/vc4_render_cl.c
> index 3c918eeaf56e..f6b7dc3df08c 100644
> --- a/drivers/gpu/drm/vc4/vc4_render_cl.c
> +++ b/drivers/gpu/drm/vc4/vc4_render_cl.c
> @@ -593,11 +593,15 @@ vc4_rcl_render_config_surface_setup(struct vc4_exec_info *exec,
>  
>  int vc4_get_rcl(struct drm_device *dev, struct vc4_exec_info *exec)
>  {
> +	struct vc4_dev *vc4 = to_vc4_dev(dev);
>  	struct vc4_rcl_setup setup = {0};
>  	struct drm_vc4_submit_cl *args = exec->args;
>  	bool has_bin = args->bin_cl_size != 0;
>  	int ret;
>  
> +	if (WARN_ON_ONCE(vc4->is_vc5))
> +		return -ENODEV;
> +
>  	if (args->min_x_tile > args->max_x_tile ||
>  	    args->min_y_tile > args->max_y_tile) {
>  		DRM_DEBUG("Bad render tile set (%d,%d)-(%d,%d)\n",
> diff --git a/drivers/gpu/drm/vc4/vc4_v3d.c b/drivers/gpu/drm/vc4/vc4_v3d.c
> index 7bb3067f8425..cc714dcfe1f2 100644
> --- a/drivers/gpu/drm/vc4/vc4_v3d.c
> +++ b/drivers/gpu/drm/vc4/vc4_v3d.c
> @@ -127,6 +127,9 @@ static int vc4_v3d_debugfs_ident(struct seq_file *m, void *unused)
>  int
>  vc4_v3d_pm_get(struct vc4_dev *vc4)
>  {
> +	if (WARN_ON_ONCE(vc4->is_vc5))
> +		return -ENODEV;
> +
>  	mutex_lock(&vc4->power_lock);
>  	if (vc4->power_refcount++ == 0) {
>  		int ret = pm_runtime_get_sync(&vc4->v3d->pdev->dev);
> @@ -145,6 +148,9 @@ vc4_v3d_pm_get(struct vc4_dev *vc4)
>  void
>  vc4_v3d_pm_put(struct vc4_dev *vc4)
>  {
> +	if (WARN_ON_ONCE(vc4->is_vc5))
> +		return;
> +
>  	mutex_lock(&vc4->power_lock);
>  	if (--vc4->power_refcount == 0) {
>  		pm_runtime_mark_last_busy(&vc4->v3d->pdev->dev);
> @@ -172,6 +178,9 @@ int vc4_v3d_get_bin_slot(struct vc4_dev *vc4)
>  	uint64_t seqno = 0;
>  	struct vc4_exec_info *exec;
>  
> +	if (WARN_ON_ONCE(vc4->is_vc5))
> +		return -ENODEV;
> +
>  try_again:
>  	spin_lock_irqsave(&vc4->job_lock, irqflags);
>  	slot = ffs(~vc4->bin_alloc_used);
> @@ -316,6 +325,9 @@ int vc4_v3d_bin_bo_get(struct vc4_dev *vc4, bool *used)
>  {
>  	int ret = 0;
>  
> +	if (WARN_ON_ONCE(vc4->is_vc5))
> +		return -ENODEV;
> +
>  	mutex_lock(&vc4->bin_bo_lock);
>  
>  	if (used && *used)
> @@ -348,6 +360,9 @@ static void bin_bo_release(struct kref *ref)
>  
>  void vc4_v3d_bin_bo_put(struct vc4_dev *vc4)
>  {
> +	if (WARN_ON_ONCE(vc4->is_vc5))
> +		return;
> +
>  	mutex_lock(&vc4->bin_bo_lock);
>  	kref_put(&vc4->bin_bo_kref, bin_bo_release);
>  	mutex_unlock(&vc4->bin_bo_lock);
> diff --git a/drivers/gpu/drm/vc4/vc4_validate.c b/drivers/gpu/drm/vc4/vc4_validate.c
> index eec76af49f04..833eb623d545 100644
> --- a/drivers/gpu/drm/vc4/vc4_validate.c
> +++ b/drivers/gpu/drm/vc4/vc4_validate.c
> @@ -105,9 +105,13 @@ size_is_lt(uint32_t width, uint32_t height, int cpp)
>  struct drm_gem_cma_object *
>  vc4_use_bo(struct vc4_exec_info *exec, uint32_t hindex)
>  {
> +	struct vc4_dev *vc4 = exec->dev;
>  	struct drm_gem_cma_object *obj;
>  	struct vc4_bo *bo;
>  
> +	if (WARN_ON_ONCE(vc4->is_vc5))
> +		return NULL;
> +
>  	if (hindex >= exec->bo_count) {
>  		DRM_DEBUG("BO index %d greater than BO count %d\n",
>  			  hindex, exec->bo_count);
> @@ -160,10 +164,14 @@ vc4_check_tex_size(struct vc4_exec_info *exec, struct drm_gem_cma_object *fbo,
>  		   uint32_t offset, uint8_t tiling_format,
>  		   uint32_t width, uint32_t height, uint8_t cpp)
>  {
> +	struct vc4_dev *vc4 = exec->dev;
>  	uint32_t aligned_width, aligned_height, stride, size;
>  	uint32_t utile_w = utile_width(cpp);
>  	uint32_t utile_h = utile_height(cpp);
>  
> +	if (WARN_ON_ONCE(vc4->is_vc5))
> +		return -ENODEV;
> +
>  	/* The shaded vertex format stores signed 12.4 fixed point
>  	 * (-2048,2047) offsets from the viewport center, so we should
>  	 * never have a render target larger than 4096.  The texture
> @@ -482,10 +490,14 @@ vc4_validate_bin_cl(struct drm_device *dev,
>  		    void *unvalidated,
>  		    struct vc4_exec_info *exec)
>  {
> +	struct vc4_dev *vc4 = to_vc4_dev(dev);
>  	uint32_t len = exec->args->bin_cl_size;
>  	uint32_t dst_offset = 0;
>  	uint32_t src_offset = 0;
>  
> +	if (WARN_ON_ONCE(vc4->is_vc5))
> +		return -ENODEV;
> +
>  	while (src_offset < len) {
>  		void *dst_pkt = validated + dst_offset;
>  		void *src_pkt = unvalidated + src_offset;
> @@ -926,9 +938,13 @@ int
>  vc4_validate_shader_recs(struct drm_device *dev,
>  			 struct vc4_exec_info *exec)
>  {
> +	struct vc4_dev *vc4 = to_vc4_dev(dev);
>  	uint32_t i;
>  	int ret = 0;
>  
> +	if (WARN_ON_ONCE(vc4->is_vc5))
> +		return -ENODEV;
> +
>  	for (i = 0; i < exec->shader_state_count; i++) {
>  		ret = validate_gl_shader_rec(dev, exec, &exec->shader_state[i]);
>  		if (ret)
> diff --git a/drivers/gpu/drm/vc4/vc4_validate_shaders.c b/drivers/gpu/drm/vc4/vc4_validate_shaders.c
> index 7cf82b071de2..e315aeb5fef5 100644
> --- a/drivers/gpu/drm/vc4/vc4_validate_shaders.c
> +++ b/drivers/gpu/drm/vc4/vc4_validate_shaders.c
> @@ -778,6 +778,7 @@ vc4_handle_branch_target(struct vc4_shader_validation_state *validation_state)
>  struct vc4_validated_shader_info *
>  vc4_validate_shader(struct drm_gem_cma_object *shader_obj)
>  {
> +	struct vc4_dev *vc4 = to_vc4_dev(shader_obj->base.dev);
>  	bool found_shader_end = false;
>  	int shader_end_ip = 0;
>  	uint32_t last_thread_switch_ip = -3;
> @@ -785,6 +786,9 @@ vc4_validate_shader(struct drm_gem_cma_object *shader_obj)
>  	struct vc4_validated_shader_info *validated_shader = NULL;
>  	struct vc4_shader_validation_state validation_state;
>  
> +	if (WARN_ON_ONCE(vc4->is_vc5))
> +		return NULL;
> +
>  	memset(&validation_state, 0, sizeof(validation_state));
>  	validation_state.shader = shader_obj->vaddr;
>  	validation_state.max_ip = shader_obj->base.size / sizeof(uint64_t);
> -- 
> 2.35.1
> 

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 13/14] drm/vc4: crtc: Fix out of order frames during asynchronous page flips
  2022-05-03 12:13 ` [PATCH 13/14] drm/vc4: crtc: Fix out of order frames during asynchronous page flips Maxime Ripard
  2022-05-03 13:53   ` kernel test robot
@ 2022-05-09 17:10   ` Melissa Wen
  2022-05-09 17:15     ` Melissa Wen
  1 sibling, 1 reply; 25+ messages in thread
From: Melissa Wen @ 2022-05-09 17:10 UTC (permalink / raw)
  To: Maxime Ripard; +Cc: David Airlie, Daniel Vetter, dri-devel, Thomas Zimmermann

[-- Attachment #1: Type: text/plain, Size: 3917 bytes --]

On 05/03, Maxime Ripard wrote:
> When doing an asynchronous page flip (PAGE_FLIP ioctl with the
> DRM_MODE_PAGE_FLIP_ASYNC flag set), the current code waits for the
> possible GPU buffer being rendered through a call to
> vc4_queue_seqno_cb().
> 
> On the BCM2835-37, the GPU driver is part of the vc4 driver and that
> function is defined in vc4_gem.c to wait for the buffer to be rendered,
> and once it's done, call a callback.
> 
> However, on the BCM2711 used on the RaspberryPi4, the GPU driver is
> separate (v3d) and that function won't do anything. This was working
> because we were going into a path, due to uninitialized variables, that
> was always scheduling the callback.
> 
> However, we were never actually waiting for the buffer to be rendered
> which was resulting in frames being displayed out of order.
> 
> The generic API to signal those kind of completion in the kernel are the
> DMA fences, and fortunately the v3d drivers supports them and signal
> when its job is done. That API also provides an equivalent function that
> allows to have a callback being executed when the fence is signalled as
> done.
> 
> Let's change our driver a bit to rely on the previous function for the
> older SoCs, and on DMA fences for the BCM2711.
> 
> Signed-off-by: Maxime Ripard <maxime@cerno.tech>
> ---
>  drivers/gpu/drm/vc4/vc4_crtc.c | 41 ++++++++++++++++++++++++++++++++--
>  1 file changed, 39 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/gpu/drm/vc4/vc4_crtc.c b/drivers/gpu/drm/vc4/vc4_crtc.c
> index e0ae7bef08fa..8e1369fca937 100644
> --- a/drivers/gpu/drm/vc4/vc4_crtc.c
> +++ b/drivers/gpu/drm/vc4/vc4_crtc.c
> @@ -776,6 +776,7 @@ struct vc4_async_flip_state {
>  	struct drm_pending_vblank_event *event;
>  
>  	union {
> +		struct dma_fence_cb fence;
>  		struct vc4_seqno_cb seqno;
>  	} cb;
>  };
> @@ -835,6 +836,43 @@ static void vc4_async_page_flip_seqno_complete(struct vc4_seqno_cb *cb)
>  		vc4_bo_dec_usecnt(bo);
>  }
>  
> +static void vc4_async_page_flip_fence_complete(struct dma_fence *fence,
> +					       struct dma_fence_cb *cb)
> +{
> +	struct vc4_async_flip_state *flip_state =
> +		container_of(cb, struct vc4_async_flip_state, cb.fence);
> +
> +	vc4_async_page_flip_complete(flip_state);
> +	dma_fence_put(fence);
> +}
> +
> +static int vc4_async_set_fence_cb(struct drm_device *dev,
> +				  struct vc4_async_flip_state *flip_state)
> +{
> +	struct drm_framebuffer *fb = flip_state->fb;
> +	struct drm_gem_cma_object *cma_bo = drm_fb_cma_get_gem_obj(fb, 0);
> +	struct vc4_dev *vc4 = to_vc4_dev(dev);
> +	struct dma_fence *fence;
> +	int ret;
> +
> +	if (!vc4->is_vc5) {
> +		struct vc4_bo *bo = to_vc4_bo(&cma_bo->base);
> +
> +		return vc4_queue_seqno_cb(dev, &flip_state->cb.seqno, bo->seqno,
> +					  vc4_async_page_flip_seqno_complete);
> +	}
> +
> +	ret = dma_resv_get_singleton(cma_bo->base.resv, false, &fence);
> +	if (ret)
> +		return ret;
> +
> +	if (dma_fence_add_callback(fence, &flip_state->cb.fence,
> +				   vc4_async_page_flip_fence_complete))
> +		vc4_async_page_flip_fence_complete(fence, &flip_state->cb.fence);
> +
> +	return 0;
> +}
> +
>  static int
>  vc4_async_page_flip_common(struct drm_crtc *crtc,
>  			   struct drm_framebuffer *fb,
> @@ -874,8 +912,7 @@ vc4_async_page_flip_common(struct drm_crtc *crtc,
>  	 */
>  	drm_atomic_set_fb_for_plane(plane->state, fb);
>  
> -	vc4_queue_seqno_cb(dev, &flip_state->cb.seqno, bo->seqno,
> -			   vc4_async_page_flip_seqno_complete);
> +	vc4_async_set_fence_cb(dev, flip_state);

I tried to run igt kms_async_flip, but the test still seems very tied to
the i915 driver. So, as far as I could check, I didn't see any red
flags and sync page flip behaviour seems preserved.

>  
>  	/* Driver takes ownership of state on successful async commit. */
>  	return 0;
> -- 
> 2.35.1
> 

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 13/14] drm/vc4: crtc: Fix out of order frames during asynchronous page flips
  2022-05-09 17:10   ` Melissa Wen
@ 2022-05-09 17:15     ` Melissa Wen
  2022-05-12 10:44       ` Melissa Wen
  0 siblings, 1 reply; 25+ messages in thread
From: Melissa Wen @ 2022-05-09 17:15 UTC (permalink / raw)
  To: Maxime Ripard; +Cc: David Airlie, Daniel Vetter, Thomas Zimmermann, dri-devel

[-- Attachment #1: Type: text/plain, Size: 4256 bytes --]

O 05/09, Melissa Wen wrote:
> On 05/03, Maxime Ripard wrote:
> > When doing an asynchronous page flip (PAGE_FLIP ioctl with the
> > DRM_MODE_PAGE_FLIP_ASYNC flag set), the current code waits for the
> > possible GPU buffer being rendered through a call to
> > vc4_queue_seqno_cb().
> > 
> > On the BCM2835-37, the GPU driver is part of the vc4 driver and that
> > function is defined in vc4_gem.c to wait for the buffer to be rendered,
> > and once it's done, call a callback.
> > 
> > However, on the BCM2711 used on the RaspberryPi4, the GPU driver is
> > separate (v3d) and that function won't do anything. This was working
> > because we were going into a path, due to uninitialized variables, that
> > was always scheduling the callback.
> > 
> > However, we were never actually waiting for the buffer to be rendered
> > which was resulting in frames being displayed out of order.
> > 
> > The generic API to signal those kind of completion in the kernel are the
> > DMA fences, and fortunately the v3d drivers supports them and signal
> > when its job is done. That API also provides an equivalent function that
> > allows to have a callback being executed when the fence is signalled as
> > done.
> > 
> > Let's change our driver a bit to rely on the previous function for the
> > older SoCs, and on DMA fences for the BCM2711.
> > 
> > Signed-off-by: Maxime Ripard <maxime@cerno.tech>
> > ---
> >  drivers/gpu/drm/vc4/vc4_crtc.c | 41 ++++++++++++++++++++++++++++++++--
> >  1 file changed, 39 insertions(+), 2 deletions(-)
> > 
> > diff --git a/drivers/gpu/drm/vc4/vc4_crtc.c b/drivers/gpu/drm/vc4/vc4_crtc.c
> > index e0ae7bef08fa..8e1369fca937 100644
> > --- a/drivers/gpu/drm/vc4/vc4_crtc.c
> > +++ b/drivers/gpu/drm/vc4/vc4_crtc.c
> > @@ -776,6 +776,7 @@ struct vc4_async_flip_state {
> >  	struct drm_pending_vblank_event *event;
> >  
> >  	union {
> > +		struct dma_fence_cb fence;
> >  		struct vc4_seqno_cb seqno;
> >  	} cb;
> >  };
> > @@ -835,6 +836,43 @@ static void vc4_async_page_flip_seqno_complete(struct vc4_seqno_cb *cb)
> >  		vc4_bo_dec_usecnt(bo);
> >  }
> >  
> > +static void vc4_async_page_flip_fence_complete(struct dma_fence *fence,
> > +					       struct dma_fence_cb *cb)
> > +{
> > +	struct vc4_async_flip_state *flip_state =
> > +		container_of(cb, struct vc4_async_flip_state, cb.fence);
> > +
> > +	vc4_async_page_flip_complete(flip_state);
> > +	dma_fence_put(fence);
> > +}
> > +
> > +static int vc4_async_set_fence_cb(struct drm_device *dev,
> > +				  struct vc4_async_flip_state *flip_state)
> > +{
> > +	struct drm_framebuffer *fb = flip_state->fb;
> > +	struct drm_gem_cma_object *cma_bo = drm_fb_cma_get_gem_obj(fb, 0);
> > +	struct vc4_dev *vc4 = to_vc4_dev(dev);
> > +	struct dma_fence *fence;
> > +	int ret;
> > +
> > +	if (!vc4->is_vc5) {
> > +		struct vc4_bo *bo = to_vc4_bo(&cma_bo->base);
> > +
> > +		return vc4_queue_seqno_cb(dev, &flip_state->cb.seqno, bo->seqno,
> > +					  vc4_async_page_flip_seqno_complete);
> > +	}
> > +
> > +	ret = dma_resv_get_singleton(cma_bo->base.resv, false, &fence);
+ for kernel bot complaint, I replaced false with `DMA_RESV_USAGE_READ`
to run some tests

> > +	if (ret)
> > +		return ret;
> > +
> > +	if (dma_fence_add_callback(fence, &flip_state->cb.fence,
> > +				   vc4_async_page_flip_fence_complete))
> > +		vc4_async_page_flip_fence_complete(fence, &flip_state->cb.fence);
> > +
> > +	return 0;
> > +}
> > +
> >  static int
> >  vc4_async_page_flip_common(struct drm_crtc *crtc,
> >  			   struct drm_framebuffer *fb,
> > @@ -874,8 +912,7 @@ vc4_async_page_flip_common(struct drm_crtc *crtc,
> >  	 */
> >  	drm_atomic_set_fb_for_plane(plane->state, fb);
> >  
> > -	vc4_queue_seqno_cb(dev, &flip_state->cb.seqno, bo->seqno,
> > -			   vc4_async_page_flip_seqno_complete);
> > +	vc4_async_set_fence_cb(dev, flip_state);
> 
> I tried to run igt kms_async_flip, but the test still seems very tied to
> the i915 driver. So, as far as I could check, I didn't see any red
> flags and sync page flip behaviour seems preserved.
> 
> >  
> >  	/* Driver takes ownership of state on successful async commit. */
> >  	return 0;
> > -- 
> > 2.35.1
> > 



[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 00/14] drm/vc4: Properly separate v3d on BCM2711, and fix frame ordering
  2022-05-03 12:13 [PATCH 00/14] drm/vc4: Properly separate v3d on BCM2711, and fix frame ordering Maxime Ripard
                   ` (13 preceding siblings ...)
  2022-05-03 12:13 ` [PATCH 14/14] drm/vc4: Warn if some v3d code is run on BCM2711 Maxime Ripard
@ 2022-05-09 18:26 ` Melissa Wen
  14 siblings, 0 replies; 25+ messages in thread
From: Melissa Wen @ 2022-05-09 18:26 UTC (permalink / raw)
  To: Maxime Ripard; +Cc: David Airlie, Daniel Vetter, dri-devel, Thomas Zimmermann

[-- Attachment #1: Type: text/plain, Size: 3727 bytes --]

On 05/03, Maxime Ripard wrote:
> Hi,
> 
> Here's a series that fixes a significant issue we missed when adding support
> for the BCM2711 / RaspberryPi4 in the vc4 driver.
> 
> Indeed, before the introduction of the BCM2711 support, the GPU was fairly
> intertwined with the display hardware, and was thus supported by the vc4
> driver. Among other things, the driver needed to accomodate for a bunch of
> hardware limitations (such as a lack of IOMMU) by implementing a custom memory
> management backend, tied with the v3d hardware.
> 
> On the BCM2711, that GPU got moved into a completely separate hardware block
> and thus we gained a new driver for it, v3d.
> 
> However, when we introduced the display support for the BCM2711 in vc4, we
> didn't properly split out the v3d-related functions and ended up reusing a
> significant portion of the code supposed to be backed by the v3d.
> 
> This created a bunch of easy to miss issues that would only pop up with IGT
> tests, or when heavily testing some features (like asynchronous page-flipping).
> 
> This series properly does the split now by creating separate code path where
> relevant, adds a loud complain when we use a v3d entry-point on the BCM2711,
> and fixes an issue where we would just ignore any fence on an asynchronous
> page-flip, resulting in frames appearing out-of-order for some workloads.
> 
> Let me know what you think,

Hi Maxime,

Overall lgtm. I didn't catch anything tricky and I did some tests to
verify the current behaviour is preserved. Unfortunately, I couldn't
test the async mechanisms much, let me know if you have any suggestions
for testing. I only concern about accommodate vc5 naming since vc4->v3d
vs v3d is already a little confusing. After addressing
dma_resv_get_singleton usage, this series is:

Reviewed-by: Melissa Wen <mwen@igalia.com>

> Maxime
> 
> Maxime Ripard (14):
>   drm/vc4: plane: Prevent async update if we don't have a dlist
>   drm/vc4: Consolidate Hardware Revision Check
>   drm/vc4: bo: Rename vc4_dumb_create
>   drm/vc4: bo: Split out Dumb buffers fixup
>   drm/vc4: drv: Register a different driver on BCM2711
>   drm/vc4: kms: Register a different drm_mode_config_funcs on BCM2711
>   drm/vc4: plane: Register a different drm_plane_helper_funcs on BCM2711
>   drm/vc4: drv: Skip BO Backend Initialization on BCM2711
>   drm/vc4: crtc: Use an union to store the page flip callback
>   drm/vc4: crtc: Move the BO handling out of common page-flip callback
>   drm/vc4: crtc: Move the BO Handling out of Common Page-Flip Handler
>   drm/vc4: crtc: Don't call into BO Handling on Async Page-Flips on
>     BCM2711
>   drm/vc4: crtc: Fix out of order frames during asynchronous page flips
>   drm/vc4: Warn if some v3d code is run on BCM2711
> 
>  drivers/gpu/drm/vc4/vc4_bo.c               |  62 ++++++-
>  drivers/gpu/drm/vc4/vc4_crtc.c             | 193 +++++++++++++++------
>  drivers/gpu/drm/vc4/vc4_drv.c              |  97 +++++++++--
>  drivers/gpu/drm/vc4/vc4_drv.h              |  19 +-
>  drivers/gpu/drm/vc4/vc4_gem.c              |  40 +++++
>  drivers/gpu/drm/vc4/vc4_hvs.c              |  18 +-
>  drivers/gpu/drm/vc4/vc4_irq.c              |  16 ++
>  drivers/gpu/drm/vc4/vc4_kms.c              |  24 ++-
>  drivers/gpu/drm/vc4/vc4_perfmon.c          |  41 +++++
>  drivers/gpu/drm/vc4/vc4_plane.c            |  29 +++-
>  drivers/gpu/drm/vc4/vc4_render_cl.c        |   4 +
>  drivers/gpu/drm/vc4/vc4_v3d.c              |  15 ++
>  drivers/gpu/drm/vc4/vc4_validate.c         |  16 ++
>  drivers/gpu/drm/vc4/vc4_validate_shaders.c |   4 +
>  14 files changed, 470 insertions(+), 108 deletions(-)
> 
> -- 
> 2.35.1
> 

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 13/14] drm/vc4: crtc: Fix out of order frames during asynchronous page flips
  2022-05-09 17:15     ` Melissa Wen
@ 2022-05-12 10:44       ` Melissa Wen
  2022-06-06 13:59         ` Maxime Ripard
  0 siblings, 1 reply; 25+ messages in thread
From: Melissa Wen @ 2022-05-12 10:44 UTC (permalink / raw)
  To: Maxime Ripard; +Cc: David Airlie, Daniel Vetter, dri-devel, Thomas Zimmermann

[-- Attachment #1: Type: text/plain, Size: 4800 bytes --]

On 05/09, Melissa Wen wrote:
> O 05/09, Melissa Wen wrote:
> > On 05/03, Maxime Ripard wrote:
> > > When doing an asynchronous page flip (PAGE_FLIP ioctl with the
> > > DRM_MODE_PAGE_FLIP_ASYNC flag set), the current code waits for the
> > > possible GPU buffer being rendered through a call to
> > > vc4_queue_seqno_cb().
> > > 
> > > On the BCM2835-37, the GPU driver is part of the vc4 driver and that
> > > function is defined in vc4_gem.c to wait for the buffer to be rendered,
> > > and once it's done, call a callback.
> > > 
> > > However, on the BCM2711 used on the RaspberryPi4, the GPU driver is
> > > separate (v3d) and that function won't do anything. This was working
> > > because we were going into a path, due to uninitialized variables, that
> > > was always scheduling the callback.
> > > 
> > > However, we were never actually waiting for the buffer to be rendered
> > > which was resulting in frames being displayed out of order.
> > > 
> > > The generic API to signal those kind of completion in the kernel are the
> > > DMA fences, and fortunately the v3d drivers supports them and signal
> > > when its job is done. That API also provides an equivalent function that
> > > allows to have a callback being executed when the fence is signalled as
> > > done.
> > > 
> > > Let's change our driver a bit to rely on the previous function for the
> > > older SoCs, and on DMA fences for the BCM2711.
> > > 
> > > Signed-off-by: Maxime Ripard <maxime@cerno.tech>
> > > ---
> > >  drivers/gpu/drm/vc4/vc4_crtc.c | 41 ++++++++++++++++++++++++++++++++--
> > >  1 file changed, 39 insertions(+), 2 deletions(-)
> > > 
> > > diff --git a/drivers/gpu/drm/vc4/vc4_crtc.c b/drivers/gpu/drm/vc4/vc4_crtc.c
> > > index e0ae7bef08fa..8e1369fca937 100644
> > > --- a/drivers/gpu/drm/vc4/vc4_crtc.c
> > > +++ b/drivers/gpu/drm/vc4/vc4_crtc.c
> > > @@ -776,6 +776,7 @@ struct vc4_async_flip_state {
> > >  	struct drm_pending_vblank_event *event;
> > >  
> > >  	union {
> > > +		struct dma_fence_cb fence;
> > >  		struct vc4_seqno_cb seqno;
> > >  	} cb;
> > >  };
> > > @@ -835,6 +836,43 @@ static void vc4_async_page_flip_seqno_complete(struct vc4_seqno_cb *cb)
> > >  		vc4_bo_dec_usecnt(bo);
> > >  }
> > >  
> > > +static void vc4_async_page_flip_fence_complete(struct dma_fence *fence,
> > > +					       struct dma_fence_cb *cb)
> > > +{
> > > +	struct vc4_async_flip_state *flip_state =
> > > +		container_of(cb, struct vc4_async_flip_state, cb.fence);
> > > +
> > > +	vc4_async_page_flip_complete(flip_state);
> > > +	dma_fence_put(fence);
> > > +}
> > > +
> > > +static int vc4_async_set_fence_cb(struct drm_device *dev,
> > > +				  struct vc4_async_flip_state *flip_state)
> > > +{
> > > +	struct drm_framebuffer *fb = flip_state->fb;
> > > +	struct drm_gem_cma_object *cma_bo = drm_fb_cma_get_gem_obj(fb, 0);
> > > +	struct vc4_dev *vc4 = to_vc4_dev(dev);
> > > +	struct dma_fence *fence;
> > > +	int ret;
> > > +
> > > +	if (!vc4->is_vc5) {
> > > +		struct vc4_bo *bo = to_vc4_bo(&cma_bo->base);
> > > +
> > > +		return vc4_queue_seqno_cb(dev, &flip_state->cb.seqno, bo->seqno,
> > > +					  vc4_async_page_flip_seqno_complete);
> > > +	}
> > > +
> > > +	ret = dma_resv_get_singleton(cma_bo->base.resv, false, &fence);
> + for kernel bot complaint, I replaced false with `DMA_RESV_USAGE_READ`
> to run some tests
> 
> > > +	if (ret)
> > > +		return ret;
> > > +
> > > +	if (dma_fence_add_callback(fence, &flip_state->cb.fence,
me again :)

I was thinking if we should add a check here for !fence and just complete the page flip,
instead of letting `dma_fence_add_callback` warns whenever fence is NULL.
I think there are situation in which fence is NULL and it is not an
issue, right? Does it make sense?

> > > +				   vc4_async_page_flip_fence_complete))
> > > +		vc4_async_page_flip_fence_complete(fence, &flip_state->cb.fence);
> > > +
> > > +	return 0;
> > > +}
> > > +
> > >  static int
> > >  vc4_async_page_flip_common(struct drm_crtc *crtc,
> > >  			   struct drm_framebuffer *fb,
> > > @@ -874,8 +912,7 @@ vc4_async_page_flip_common(struct drm_crtc *crtc,
> > >  	 */
> > >  	drm_atomic_set_fb_for_plane(plane->state, fb);
> > >  
> > > -	vc4_queue_seqno_cb(dev, &flip_state->cb.seqno, bo->seqno,
> > > -			   vc4_async_page_flip_seqno_complete);
> > > +	vc4_async_set_fence_cb(dev, flip_state);
> > 
> > I tried to run igt kms_async_flip, but the test still seems very tied to
> > the i915 driver. So, as far as I could check, I didn't see any red
> > flags and sync page flip behaviour seems preserved.
> > 
> > >  
> > >  	/* Driver takes ownership of state on successful async commit. */
> > >  	return 0;
> > > -- 
> > > 2.35.1
> > > 
> 
> 



[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 14/14] drm/vc4: Warn if some v3d code is run on BCM2711
  2022-05-09 16:52   ` Melissa Wen
@ 2022-05-31  9:40     ` Maxime Ripard
  0 siblings, 0 replies; 25+ messages in thread
From: Maxime Ripard @ 2022-05-31  9:40 UTC (permalink / raw)
  To: Melissa Wen; +Cc: David Airlie, Daniel Vetter, dri-devel, Thomas Zimmermann

Hi Melissa,

On Mon, May 09, 2022 at 03:52:04PM -0100, Melissa Wen wrote:
> > @@ -915,10 +954,14 @@ int vc4_set_tiling_ioctl(struct drm_device *dev, void *data,
> >  int vc4_get_tiling_ioctl(struct drm_device *dev, void *data,
> >  			 struct drm_file *file_priv)
> >  {
> > +	struct vc4_dev *vc4 = to_vc4_dev(dev);
> >  	struct drm_vc4_get_tiling *args = data;
> >  	struct drm_gem_object *gem_obj;
> >  	struct vc4_bo *bo;
> >  
> > +	if (WARN_ON_ONCE(vc4->is_vc5))
> > +		return -ENODEV;
> > +
> 
> Just to confirm: as none of these ioctls were wired up in
> vc5_drm_driver, is there any situation where this path can be taken
> wrongly?

For this particular function, no, it wasn't reachable before since the
ioctl were only exposed for devices with the render capability, and we
were not using it.

Other functions in that patch however are called prior to this series,
especially in the BO/GEM related path.

Maxime

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 13/14] drm/vc4: crtc: Fix out of order frames during asynchronous page flips
  2022-05-12 10:44       ` Melissa Wen
@ 2022-06-06 13:59         ` Maxime Ripard
  2022-06-08 11:24           ` Melissa Wen
  0 siblings, 1 reply; 25+ messages in thread
From: Maxime Ripard @ 2022-06-06 13:59 UTC (permalink / raw)
  To: Melissa Wen; +Cc: David Airlie, Daniel Vetter, dri-devel, Thomas Zimmermann

[-- Attachment #1: Type: text/plain, Size: 4126 bytes --]

Hi,

On Thu, May 12, 2022 at 09:44:42AM -0100, Melissa Wen wrote:
> On 05/09, Melissa Wen wrote:
> > O 05/09, Melissa Wen wrote:
> > > On 05/03, Maxime Ripard wrote:
> > > > When doing an asynchronous page flip (PAGE_FLIP ioctl with the
> > > > DRM_MODE_PAGE_FLIP_ASYNC flag set), the current code waits for the
> > > > possible GPU buffer being rendered through a call to
> > > > vc4_queue_seqno_cb().
> > > > 
> > > > On the BCM2835-37, the GPU driver is part of the vc4 driver and that
> > > > function is defined in vc4_gem.c to wait for the buffer to be rendered,
> > > > and once it's done, call a callback.
> > > > 
> > > > However, on the BCM2711 used on the RaspberryPi4, the GPU driver is
> > > > separate (v3d) and that function won't do anything. This was working
> > > > because we were going into a path, due to uninitialized variables, that
> > > > was always scheduling the callback.
> > > > 
> > > > However, we were never actually waiting for the buffer to be rendered
> > > > which was resulting in frames being displayed out of order.
> > > > 
> > > > The generic API to signal those kind of completion in the kernel are the
> > > > DMA fences, and fortunately the v3d drivers supports them and signal
> > > > when its job is done. That API also provides an equivalent function that
> > > > allows to have a callback being executed when the fence is signalled as
> > > > done.
> > > > 
> > > > Let's change our driver a bit to rely on the previous function for the
> > > > older SoCs, and on DMA fences for the BCM2711.
> > > > 
> > > > Signed-off-by: Maxime Ripard <maxime@cerno.tech>
> > > > ---
> > > >  drivers/gpu/drm/vc4/vc4_crtc.c | 41 ++++++++++++++++++++++++++++++++--
> > > >  1 file changed, 39 insertions(+), 2 deletions(-)
> > > > 
> > > > diff --git a/drivers/gpu/drm/vc4/vc4_crtc.c b/drivers/gpu/drm/vc4/vc4_crtc.c
> > > > index e0ae7bef08fa..8e1369fca937 100644
> > > > --- a/drivers/gpu/drm/vc4/vc4_crtc.c
> > > > +++ b/drivers/gpu/drm/vc4/vc4_crtc.c
> > > > @@ -776,6 +776,7 @@ struct vc4_async_flip_state {
> > > >  	struct drm_pending_vblank_event *event;
> > > >  
> > > >  	union {
> > > > +		struct dma_fence_cb fence;
> > > >  		struct vc4_seqno_cb seqno;
> > > >  	} cb;
> > > >  };
> > > > @@ -835,6 +836,43 @@ static void vc4_async_page_flip_seqno_complete(struct vc4_seqno_cb *cb)
> > > >  		vc4_bo_dec_usecnt(bo);
> > > >  }
> > > >  
> > > > +static void vc4_async_page_flip_fence_complete(struct dma_fence *fence,
> > > > +					       struct dma_fence_cb *cb)
> > > > +{
> > > > +	struct vc4_async_flip_state *flip_state =
> > > > +		container_of(cb, struct vc4_async_flip_state, cb.fence);
> > > > +
> > > > +	vc4_async_page_flip_complete(flip_state);
> > > > +	dma_fence_put(fence);
> > > > +}
> > > > +
> > > > +static int vc4_async_set_fence_cb(struct drm_device *dev,
> > > > +				  struct vc4_async_flip_state *flip_state)
> > > > +{
> > > > +	struct drm_framebuffer *fb = flip_state->fb;
> > > > +	struct drm_gem_cma_object *cma_bo = drm_fb_cma_get_gem_obj(fb, 0);
> > > > +	struct vc4_dev *vc4 = to_vc4_dev(dev);
> > > > +	struct dma_fence *fence;
> > > > +	int ret;
> > > > +
> > > > +	if (!vc4->is_vc5) {
> > > > +		struct vc4_bo *bo = to_vc4_bo(&cma_bo->base);
> > > > +
> > > > +		return vc4_queue_seqno_cb(dev, &flip_state->cb.seqno, bo->seqno,
> > > > +					  vc4_async_page_flip_seqno_complete);
> > > > +	}
> > > > +
> > > > +	ret = dma_resv_get_singleton(cma_bo->base.resv, false, &fence);
> > + for kernel bot complaint, I replaced false with `DMA_RESV_USAGE_READ`
> > to run some tests
> > 
> > > > +	if (ret)
> > > > +		return ret;
> > > > +
> > > > +	if (dma_fence_add_callback(fence, &flip_state->cb.fence,
> me again :)
> 
> I was thinking if we should add a check here for !fence and just complete the page flip,
> instead of letting `dma_fence_add_callback` warns whenever fence is NULL.
> I think there are situation in which fence is NULL and it is not an
> issue, right? Does it make sense?

I'm not sure. What situation do you have in mind?

Maxime

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 228 bytes --]

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 13/14] drm/vc4: crtc: Fix out of order frames during asynchronous page flips
  2022-06-06 13:59         ` Maxime Ripard
@ 2022-06-08 11:24           ` Melissa Wen
  0 siblings, 0 replies; 25+ messages in thread
From: Melissa Wen @ 2022-06-08 11:24 UTC (permalink / raw)
  To: Maxime Ripard; +Cc: David Airlie, Daniel Vetter, Thomas Zimmermann, dri-devel

[-- Attachment #1: Type: text/plain, Size: 4791 bytes --]

On 06/06, Maxime Ripard wrote:
> Hi,
> 
> On Thu, May 12, 2022 at 09:44:42AM -0100, Melissa Wen wrote:
> > On 05/09, Melissa Wen wrote:
> > > O 05/09, Melissa Wen wrote:
> > > > On 05/03, Maxime Ripard wrote:
> > > > > When doing an asynchronous page flip (PAGE_FLIP ioctl with the
> > > > > DRM_MODE_PAGE_FLIP_ASYNC flag set), the current code waits for the
> > > > > possible GPU buffer being rendered through a call to
> > > > > vc4_queue_seqno_cb().
> > > > > 
> > > > > On the BCM2835-37, the GPU driver is part of the vc4 driver and that
> > > > > function is defined in vc4_gem.c to wait for the buffer to be rendered,
> > > > > and once it's done, call a callback.
> > > > > 
> > > > > However, on the BCM2711 used on the RaspberryPi4, the GPU driver is
> > > > > separate (v3d) and that function won't do anything. This was working
> > > > > because we were going into a path, due to uninitialized variables, that
> > > > > was always scheduling the callback.
> > > > > 
> > > > > However, we were never actually waiting for the buffer to be rendered
> > > > > which was resulting in frames being displayed out of order.
> > > > > 
> > > > > The generic API to signal those kind of completion in the kernel are the
> > > > > DMA fences, and fortunately the v3d drivers supports them and signal
> > > > > when its job is done. That API also provides an equivalent function that
> > > > > allows to have a callback being executed when the fence is signalled as
> > > > > done.
> > > > > 
> > > > > Let's change our driver a bit to rely on the previous function for the
> > > > > older SoCs, and on DMA fences for the BCM2711.
> > > > > 
> > > > > Signed-off-by: Maxime Ripard <maxime@cerno.tech>
> > > > > ---
> > > > >  drivers/gpu/drm/vc4/vc4_crtc.c | 41 ++++++++++++++++++++++++++++++++--
> > > > >  1 file changed, 39 insertions(+), 2 deletions(-)
> > > > > 
> > > > > diff --git a/drivers/gpu/drm/vc4/vc4_crtc.c b/drivers/gpu/drm/vc4/vc4_crtc.c
> > > > > index e0ae7bef08fa..8e1369fca937 100644
> > > > > --- a/drivers/gpu/drm/vc4/vc4_crtc.c
> > > > > +++ b/drivers/gpu/drm/vc4/vc4_crtc.c
> > > > > @@ -776,6 +776,7 @@ struct vc4_async_flip_state {
> > > > >  	struct drm_pending_vblank_event *event;
> > > > >  
> > > > >  	union {
> > > > > +		struct dma_fence_cb fence;
> > > > >  		struct vc4_seqno_cb seqno;
> > > > >  	} cb;
> > > > >  };
> > > > > @@ -835,6 +836,43 @@ static void vc4_async_page_flip_seqno_complete(struct vc4_seqno_cb *cb)
> > > > >  		vc4_bo_dec_usecnt(bo);
> > > > >  }
> > > > >  
> > > > > +static void vc4_async_page_flip_fence_complete(struct dma_fence *fence,
> > > > > +					       struct dma_fence_cb *cb)
> > > > > +{
> > > > > +	struct vc4_async_flip_state *flip_state =
> > > > > +		container_of(cb, struct vc4_async_flip_state, cb.fence);
> > > > > +
> > > > > +	vc4_async_page_flip_complete(flip_state);
> > > > > +	dma_fence_put(fence);
> > > > > +}
> > > > > +
> > > > > +static int vc4_async_set_fence_cb(struct drm_device *dev,
> > > > > +				  struct vc4_async_flip_state *flip_state)
> > > > > +{
> > > > > +	struct drm_framebuffer *fb = flip_state->fb;
> > > > > +	struct drm_gem_cma_object *cma_bo = drm_fb_cma_get_gem_obj(fb, 0);
> > > > > +	struct vc4_dev *vc4 = to_vc4_dev(dev);
> > > > > +	struct dma_fence *fence;
> > > > > +	int ret;
> > > > > +
> > > > > +	if (!vc4->is_vc5) {
> > > > > +		struct vc4_bo *bo = to_vc4_bo(&cma_bo->base);
> > > > > +
> > > > > +		return vc4_queue_seqno_cb(dev, &flip_state->cb.seqno, bo->seqno,
> > > > > +					  vc4_async_page_flip_seqno_complete);
> > > > > +	}
> > > > > +
> > > > > +	ret = dma_resv_get_singleton(cma_bo->base.resv, false, &fence);
> > > + for kernel bot complaint, I replaced false with `DMA_RESV_USAGE_READ`
> > > to run some tests
> > > 
> > > > > +	if (ret)
> > > > > +		return ret;
> > > > > +
> > > > > +	if (dma_fence_add_callback(fence, &flip_state->cb.fence,
> > me again :)
> > 
> > I was thinking if we should add a check here for !fence and just complete the page flip,
> > instead of letting `dma_fence_add_callback` warns whenever fence is NULL.
> > I think there are situation in which fence is NULL and it is not an
> > issue, right? Does it make sense?
> 
> I'm not sure. What situation do you have in mind?

I mean, if no implicity fence was attached to this bo, it's safe to just
do the page flip. This behaviour will happen anyway, after
dma_fence_add_callback() checks fence is NULL and return -EINVAL. But
this check will also trigger a warning for something that it's not an
issue here, I think. So, if we just check `fence` before calling
dma_fence_add_callback(), we keep the same behaviour and prevent the
warning.

Melissa
> 
> Maxime



[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 25+ messages in thread

end of thread, other threads:[~2022-06-08 11:25 UTC | newest]

Thread overview: 25+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-05-03 12:13 [PATCH 00/14] drm/vc4: Properly separate v3d on BCM2711, and fix frame ordering Maxime Ripard
2022-05-03 12:13 ` [PATCH 01/14] drm/vc4: plane: Prevent async update if we don't have a dlist Maxime Ripard
2022-05-03 12:13 ` [PATCH 02/14] drm/vc4: Consolidate Hardware Revision Check Maxime Ripard
2022-05-03 12:13 ` [PATCH 03/14] drm/vc4: bo: Rename vc4_dumb_create Maxime Ripard
2022-05-03 12:13 ` [PATCH 04/14] drm/vc4: bo: Split out Dumb buffers fixup Maxime Ripard
2022-05-03 12:13 ` [PATCH 05/14] drm/vc4: drv: Register a different driver on BCM2711 Maxime Ripard
2022-05-09 16:37   ` Melissa Wen
2022-05-03 12:13 ` [PATCH 06/14] drm/vc4: kms: Register a different drm_mode_config_funcs " Maxime Ripard
2022-05-03 12:13 ` [PATCH 07/14] drm/vc4: plane: Register a different drm_plane_helper_funcs " Maxime Ripard
2022-05-03 12:13 ` [PATCH 08/14] drm/vc4: drv: Skip BO Backend Initialization " Maxime Ripard
2022-05-03 12:13 ` [PATCH 09/14] drm/vc4: crtc: Use an union to store the page flip callback Maxime Ripard
2022-05-03 12:13 ` [PATCH 10/14] drm/vc4: crtc: Move the BO handling out of common page-flip callback Maxime Ripard
2022-05-03 12:13 ` [PATCH 11/14] drm/vc4: crtc: Move the BO Handling out of Common Page-Flip Handler Maxime Ripard
2022-05-03 12:13 ` [PATCH 12/14] drm/vc4: crtc: Don't call into BO Handling on Async Page-Flips on BCM2711 Maxime Ripard
2022-05-03 12:13 ` [PATCH 13/14] drm/vc4: crtc: Fix out of order frames during asynchronous page flips Maxime Ripard
2022-05-03 13:53   ` kernel test robot
2022-05-09 17:10   ` Melissa Wen
2022-05-09 17:15     ` Melissa Wen
2022-05-12 10:44       ` Melissa Wen
2022-06-06 13:59         ` Maxime Ripard
2022-06-08 11:24           ` Melissa Wen
2022-05-03 12:13 ` [PATCH 14/14] drm/vc4: Warn if some v3d code is run on BCM2711 Maxime Ripard
2022-05-09 16:52   ` Melissa Wen
2022-05-31  9:40     ` Maxime Ripard
2022-05-09 18:26 ` [PATCH 00/14] drm/vc4: Properly separate v3d on BCM2711, and fix frame ordering Melissa Wen

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.