All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH v4 0/4] drm/vc4: Add a load tracker
@ 2019-02-06 14:49 Paul Kocialkowski
  2019-02-06 14:49 ` [PATCH v4 1/4] drm/vc4: Report HVS underrun errors Paul Kocialkowski
                   ` (3 more replies)
  0 siblings, 4 replies; 11+ messages in thread
From: Paul Kocialkowski @ 2019-02-06 14:49 UTC (permalink / raw)
  To: dri-devel, linux-kernel
  Cc: Eric Anholt, David Airlie, Daniel Vetter, Maxime Ripard,
	Thomas Petazzoni, Eben Upton, Paul Kocialkowski

Hi,

Here is a fourth iteration of the VC4 load tracking series, which was
initially developed by Boris Brezillon and that I have now taken over.

This new iteration takes in account comments from v3 and comes with a
new approach for avoiding underrun reports when reconfiguring the
pipeline. It is now based on detection instead of delaying the underrun
interrupt unmasking.

It can be tested with a dedicated IGT GPU Tools series:
  VC4 load tracker testing

Changes since v3:
* Returned IRQ_NONE for short line interrupt to avoid storm;
* Made register definitions with per-channel offsets more direct;
* cleaned all IRQ fields instead of writing back the status bit as
  advised by docs;
* Reworked first commit's message;
* Removed in-commit changelogs with inconsistent revision numbers;
* Removed explicit wait for sync before unmasking underrun interrupt;
* Checked that the display lists are synced before reporting underrun;

Cheers,

Paul

Boris Brezillon (2):
  drm/vc4: Report HVS underrun errors
  drm/vc4: Add a load tracker to prevent HVS underflow errors

Paul Kocialkowski (2):
  drm/vc4: Detect and ignore underruns caused by out-of-sync dlists
  drm/vc4: Add a debugfs entry to disable/enable the load tracker

 drivers/gpu/drm/vc4/vc4_debugfs.c |  10 +++
 drivers/gpu/drm/vc4/vc4_drv.c     |   1 +
 drivers/gpu/drm/vc4/vc4_drv.h     |  32 ++++++++
 drivers/gpu/drm/vc4/vc4_hvs.c     | 123 ++++++++++++++++++++++++++++++
 drivers/gpu/drm/vc4/vc4_kms.c     | 116 +++++++++++++++++++++++++++-
 drivers/gpu/drm/vc4/vc4_plane.c   |  57 ++++++++++++++
 drivers/gpu/drm/vc4/vc4_regs.h    |  51 ++++---------
 7 files changed, 354 insertions(+), 36 deletions(-)

-- 
2.20.1


^ permalink raw reply	[flat|nested] 11+ messages in thread

* [PATCH v4 1/4] drm/vc4: Report HVS underrun errors
  2019-02-06 14:49 [PATCH v4 0/4] drm/vc4: Add a load tracker Paul Kocialkowski
@ 2019-02-06 14:49 ` Paul Kocialkowski
  2019-02-06 23:51     ` Eric Anholt
  2019-02-06 14:49 ` [PATCH v4 2/4] drm/vc4: Add a load tracker to prevent HVS underflow errors Paul Kocialkowski
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 11+ messages in thread
From: Paul Kocialkowski @ 2019-02-06 14:49 UTC (permalink / raw)
  To: dri-devel, linux-kernel
  Cc: Eric Anholt, David Airlie, Daniel Vetter, Maxime Ripard,
	Thomas Petazzoni, Eben Upton, Boris Brezillon, Paul Kocialkowski

From: Boris Brezillon <boris.brezillon@bootlin.com>

Add a debugfs entry and helper for reporting HVS underrun errors as
well as helpers for masking and unmasking the underrun interrupts.
Add an IRQ handler and initial IRQ configuration.
Rework related register definitions to take the channel number.

Signed-off-by: Boris Brezillon <boris.brezillon@bootlin.com>
Signed-off-by: Paul Kocialkowski <paul.kocialkowski@bootlin.com>
---
 drivers/gpu/drm/vc4/vc4_debugfs.c |  1 +
 drivers/gpu/drm/vc4/vc4_drv.h     | 10 ++++
 drivers/gpu/drm/vc4/vc4_hvs.c     | 92 +++++++++++++++++++++++++++++++
 drivers/gpu/drm/vc4/vc4_kms.c     |  4 ++
 drivers/gpu/drm/vc4/vc4_regs.h    | 49 +++++-----------
 5 files changed, 121 insertions(+), 35 deletions(-)

diff --git a/drivers/gpu/drm/vc4/vc4_debugfs.c b/drivers/gpu/drm/vc4/vc4_debugfs.c
index 7a0003de71ab..64021bcba041 100644
--- a/drivers/gpu/drm/vc4/vc4_debugfs.c
+++ b/drivers/gpu/drm/vc4/vc4_debugfs.c
@@ -23,6 +23,7 @@ static const struct drm_info_list vc4_debugfs_list[] = {
 	{"vec_regs", vc4_vec_debugfs_regs, 0},
 	{"txp_regs", vc4_txp_debugfs_regs, 0},
 	{"hvs_regs", vc4_hvs_debugfs_regs, 0},
+	{"hvs_underrun",  vc4_hvs_debugfs_underrun, 0},
 	{"crtc0_regs", vc4_crtc_debugfs_regs, 0, (void *)(uintptr_t)0},
 	{"crtc1_regs", vc4_crtc_debugfs_regs, 0, (void *)(uintptr_t)1},
 	{"crtc2_regs", vc4_crtc_debugfs_regs, 0, (void *)(uintptr_t)2},
diff --git a/drivers/gpu/drm/vc4/vc4_drv.h b/drivers/gpu/drm/vc4/vc4_drv.h
index 2c635f001c71..b34da7b6ee6b 100644
--- a/drivers/gpu/drm/vc4/vc4_drv.h
+++ b/drivers/gpu/drm/vc4/vc4_drv.h
@@ -185,6 +185,13 @@ struct vc4_dev {
 	/* Bitmask of the current bin_alloc used for overflow memory. */
 	uint32_t bin_alloc_overflow;
 
+	/* Incremented when an underrun error happened after an atomic commit.
+	 * This is particularly useful to detect when a specific modeset is too
+	 * demanding in term of memory or HVS bandwidth which is hard to guess
+	 * at atomic check time.
+	 */
+	atomic_t underrun;
+
 	struct work_struct overflow_mem_work;
 
 	int power_refcount;
@@ -773,6 +780,9 @@ void vc4_irq_reset(struct drm_device *dev);
 extern struct platform_driver vc4_hvs_driver;
 void vc4_hvs_dump_state(struct drm_device *dev);
 int vc4_hvs_debugfs_regs(struct seq_file *m, void *unused);
+int vc4_hvs_debugfs_underrun(struct seq_file *m, void *unused);
+void vc4_hvs_unmask_underrun(struct drm_device *dev);
+void vc4_hvs_mask_underrun(struct drm_device *dev);
 
 /* vc4_kms.c */
 int vc4_kms_load(struct drm_device *dev);
diff --git a/drivers/gpu/drm/vc4/vc4_hvs.c b/drivers/gpu/drm/vc4/vc4_hvs.c
index 5d8c749c9749..d5bc3bcd3e51 100644
--- a/drivers/gpu/drm/vc4/vc4_hvs.c
+++ b/drivers/gpu/drm/vc4/vc4_hvs.c
@@ -22,6 +22,7 @@
  * each CRTC.
  */
 
+#include <drm/drm_atomic_helper.h>
 #include <linux/component.h>
 #include "vc4_drv.h"
 #include "vc4_regs.h"
@@ -102,6 +103,18 @@ int vc4_hvs_debugfs_regs(struct seq_file *m, void *unused)
 
 	return 0;
 }
+
+int vc4_hvs_debugfs_underrun(struct seq_file *m, void *data)
+{
+	struct drm_info_node *node = m->private;
+	struct drm_device *dev = node->minor->dev;
+	struct vc4_dev *vc4 = to_vc4_dev(dev);
+	struct drm_printer p = drm_seq_file_printer(m);
+
+	drm_printf(&p, "%d\n", atomic_read(&vc4->underrun));
+
+	return 0;
+}
 #endif
 
 /* The filter kernel is composed of dwords each containing 3 9-bit
@@ -166,6 +179,67 @@ static int vc4_hvs_upload_linear_kernel(struct vc4_hvs *hvs,
 	return 0;
 }
 
+void vc4_hvs_mask_underrun(struct drm_device *dev)
+{
+	struct vc4_dev *vc4 = to_vc4_dev(dev);
+	u32 dispctrl = HVS_READ(SCALER_DISPCTRL);
+
+	dispctrl &= ~(SCALER_DISPCTRL_DSPEISLUR(0) |
+		      SCALER_DISPCTRL_DSPEISLUR(1) |
+		      SCALER_DISPCTRL_DSPEISLUR(2));
+
+	HVS_WRITE(SCALER_DISPCTRL, dispctrl);
+}
+
+void vc4_hvs_unmask_underrun(struct drm_device *dev)
+{
+	struct vc4_dev *vc4 = to_vc4_dev(dev);
+	u32 dispctrl = HVS_READ(SCALER_DISPCTRL);
+
+	dispctrl |= SCALER_DISPCTRL_DSPEISLUR(0) |
+		    SCALER_DISPCTRL_DSPEISLUR(1) |
+		    SCALER_DISPCTRL_DSPEISLUR(2);
+
+	HVS_WRITE(SCALER_DISPSTAT,
+		  SCALER_DISPSTAT_EUFLOW(0) |
+		  SCALER_DISPSTAT_EUFLOW(1) |
+		  SCALER_DISPSTAT_EUFLOW(2));
+	HVS_WRITE(SCALER_DISPCTRL, dispctrl);
+}
+
+static void vc4_hvs_report_underrun(struct drm_device *dev)
+{
+	struct vc4_dev *vc4 = to_vc4_dev(dev);
+
+	atomic_inc(&vc4->underrun);
+	DRM_DEV_ERROR(dev->dev, "HVS underrun\n");
+}
+
+static irqreturn_t vc4_hvs_irq_handler(int irq, void *data)
+{
+	struct drm_device *dev = data;
+	struct vc4_dev *vc4 = to_vc4_dev(dev);
+	irqreturn_t irqret = IRQ_NONE;
+	u32 status;
+
+	status = HVS_READ(SCALER_DISPSTAT);
+
+	if (status & (SCALER_DISPSTAT_EUFLOW(0) |
+		      SCALER_DISPSTAT_EUFLOW(1) |
+		      SCALER_DISPSTAT_EUFLOW(2))) {
+		vc4_hvs_mask_underrun(dev);
+		vc4_hvs_report_underrun(dev);
+
+		irqret = IRQ_HANDLED;
+	}
+
+	HVS_WRITE(SCALER_DISPSTAT, SCALER_DISPSTAT_IRQMASK(0) |
+				   SCALER_DISPSTAT_IRQMASK(1) |
+				   SCALER_DISPSTAT_IRQMASK(2));
+
+	return irqret;
+}
+
 static int vc4_hvs_bind(struct device *dev, struct device *master, void *data)
 {
 	struct platform_device *pdev = to_platform_device(dev);
@@ -219,15 +293,33 @@ static int vc4_hvs_bind(struct device *dev, struct device *master, void *data)
 	dispctrl = HVS_READ(SCALER_DISPCTRL);
 
 	dispctrl |= SCALER_DISPCTRL_ENABLE;
+	dispctrl |= SCALER_DISPCTRL_DSPEISLUR(0) | SCALER_DISPCTRL_DISPEIRQ(0) |
+		    SCALER_DISPCTRL_DSPEISLUR(1) | SCALER_DISPCTRL_DISPEIRQ(1) |
+		    SCALER_DISPCTRL_DSPEISLUR(2) | SCALER_DISPCTRL_DISPEIRQ(2);
 
 	/* Set DSP3 (PV1) to use HVS channel 2, which would otherwise
 	 * be unused.
 	 */
 	dispctrl &= ~SCALER_DISPCTRL_DSP3_MUX_MASK;
+	dispctrl &= ~(SCALER_DISPCTRL_DMAEIRQ |
+		      SCALER_DISPCTRL_SLVWREIRQ |
+		      SCALER_DISPCTRL_SLVRDEIRQ |
+		      SCALER_DISPCTRL_DSPEIEOF(0) |
+		      SCALER_DISPCTRL_DSPEIEOF(1) |
+		      SCALER_DISPCTRL_DSPEIEOF(2) |
+		      SCALER_DISPCTRL_DSPEIEOLN(0) |
+		      SCALER_DISPCTRL_DSPEIEOLN(1) |
+		      SCALER_DISPCTRL_DSPEIEOLN(2) |
+		      SCALER_DISPCTRL_SCLEIRQ);
 	dispctrl |= VC4_SET_FIELD(2, SCALER_DISPCTRL_DSP3_MUX);
 
 	HVS_WRITE(SCALER_DISPCTRL, dispctrl);
 
+	ret = devm_request_irq(dev, platform_get_irq(pdev, 0),
+			       vc4_hvs_irq_handler, 0, "vc4 hvs", drm);
+	if (ret)
+		return ret;
+
 	return 0;
 }
 
diff --git a/drivers/gpu/drm/vc4/vc4_kms.c b/drivers/gpu/drm/vc4/vc4_kms.c
index 91b8c72ff361..3c87fbcd0b17 100644
--- a/drivers/gpu/drm/vc4/vc4_kms.c
+++ b/drivers/gpu/drm/vc4/vc4_kms.c
@@ -139,6 +139,8 @@ vc4_atomic_complete_commit(struct drm_atomic_state *state)
 	struct drm_device *dev = state->dev;
 	struct vc4_dev *vc4 = to_vc4_dev(dev);
 
+	vc4_hvs_mask_underrun(dev);
+
 	drm_atomic_helper_wait_for_fences(dev, state, false);
 
 	drm_atomic_helper_wait_for_dependencies(state);
@@ -155,6 +157,8 @@ vc4_atomic_complete_commit(struct drm_atomic_state *state)
 
 	drm_atomic_helper_commit_hw_done(state);
 
+	vc4_hvs_unmask_underrun(dev);
+
 	drm_atomic_helper_wait_for_flip_done(dev, state);
 
 	drm_atomic_helper_cleanup_planes(dev, state);
diff --git a/drivers/gpu/drm/vc4/vc4_regs.h b/drivers/gpu/drm/vc4/vc4_regs.h
index 931088014272..e0834de8410f 100644
--- a/drivers/gpu/drm/vc4/vc4_regs.h
+++ b/drivers/gpu/drm/vc4/vc4_regs.h
@@ -215,8 +215,6 @@
 #define SCALER_DISPCTRL                         0x00000000
 /* Global register for clock gating the HVS */
 # define SCALER_DISPCTRL_ENABLE			BIT(31)
-# define SCALER_DISPCTRL_DSP2EISLUR		BIT(15)
-# define SCALER_DISPCTRL_DSP1EISLUR		BIT(14)
 # define SCALER_DISPCTRL_DSP3_MUX_MASK		VC4_MASK(19, 18)
 # define SCALER_DISPCTRL_DSP3_MUX_SHIFT		18
 
@@ -224,45 +222,25 @@
  * SCALER_DISPSTAT_IRQDISP0.  Note that short frame contributions are
  * always enabled.
  */
-# define SCALER_DISPCTRL_DSP0EISLUR		BIT(13)
-# define SCALER_DISPCTRL_DSP2EIEOLN		BIT(12)
-# define SCALER_DISPCTRL_DSP2EIEOF		BIT(11)
-# define SCALER_DISPCTRL_DSP1EIEOLN		BIT(10)
-# define SCALER_DISPCTRL_DSP1EIEOF		BIT(9)
+# define SCALER_DISPCTRL_DSPEISLUR(x)		BIT(13 + (x))
 /* Enables Display 0 end-of-line-N contribution to
  * SCALER_DISPSTAT_IRQDISP0
  */
-# define SCALER_DISPCTRL_DSP0EIEOLN		BIT(8)
+# define SCALER_DISPCTRL_DSPEIEOLN(x)		BIT(8 + ((x) * 2))
 /* Enables Display 0 EOF contribution to SCALER_DISPSTAT_IRQDISP0 */
-# define SCALER_DISPCTRL_DSP0EIEOF		BIT(7)
+# define SCALER_DISPCTRL_DSPEIEOF(x)		BIT(7 + ((x) * 2))
 
 # define SCALER_DISPCTRL_SLVRDEIRQ		BIT(6)
 # define SCALER_DISPCTRL_SLVWREIRQ		BIT(5)
 # define SCALER_DISPCTRL_DMAEIRQ		BIT(4)
-# define SCALER_DISPCTRL_DISP2EIRQ		BIT(3)
-# define SCALER_DISPCTRL_DISP1EIRQ		BIT(2)
 /* Enables interrupt generation on the enabled EOF/EOLN/EISLUR
  * bits and short frames..
  */
-# define SCALER_DISPCTRL_DISP0EIRQ		BIT(1)
+# define SCALER_DISPCTRL_DISPEIRQ(x)		BIT(1 + (x))
 /* Enables interrupt generation on scaler profiler interrupt. */
 # define SCALER_DISPCTRL_SCLEIRQ		BIT(0)
 
 #define SCALER_DISPSTAT                         0x00000004
-# define SCALER_DISPSTAT_COBLOW2		BIT(29)
-# define SCALER_DISPSTAT_EOLN2			BIT(28)
-# define SCALER_DISPSTAT_ESFRAME2		BIT(27)
-# define SCALER_DISPSTAT_ESLINE2		BIT(26)
-# define SCALER_DISPSTAT_EUFLOW2		BIT(25)
-# define SCALER_DISPSTAT_EOF2			BIT(24)
-
-# define SCALER_DISPSTAT_COBLOW1		BIT(21)
-# define SCALER_DISPSTAT_EOLN1			BIT(20)
-# define SCALER_DISPSTAT_ESFRAME1		BIT(19)
-# define SCALER_DISPSTAT_ESLINE1		BIT(18)
-# define SCALER_DISPSTAT_EUFLOW1		BIT(17)
-# define SCALER_DISPSTAT_EOF1			BIT(16)
-
 # define SCALER_DISPSTAT_RESP_MASK		VC4_MASK(15, 14)
 # define SCALER_DISPSTAT_RESP_SHIFT		14
 # define SCALER_DISPSTAT_RESP_OKAY		0
@@ -270,23 +248,26 @@
 # define SCALER_DISPSTAT_RESP_SLVERR		2
 # define SCALER_DISPSTAT_RESP_DECERR		3
 
-# define SCALER_DISPSTAT_COBLOW0		BIT(13)
+# define SCALER_DISPSTAT_COBLOW(x)		BIT(13 + ((x) * 8))
 /* Set when the DISPEOLN line is done compositing. */
-# define SCALER_DISPSTAT_EOLN0			BIT(12)
+# define SCALER_DISPSTAT_EOLN(x)		BIT(12 + ((x) * 8))
 /* Set when VSTART is seen but there are still pixels in the current
  * output line.
  */
-# define SCALER_DISPSTAT_ESFRAME0		BIT(11)
+# define SCALER_DISPSTAT_ESFRAME(x)		BIT(11 + ((x) * 8))
 /* Set when HSTART is seen but there are still pixels in the current
  * output line.
  */
-# define SCALER_DISPSTAT_ESLINE0		BIT(10)
+# define SCALER_DISPSTAT_ESLINE(x)		BIT(10 + ((x) * 8))
 /* Set when the the downstream tries to read from the display FIFO
  * while it's empty.
  */
-# define SCALER_DISPSTAT_EUFLOW0		BIT(9)
+# define SCALER_DISPSTAT_EUFLOW(x)		BIT(9 + ((x) * 8))
 /* Set when the display mode changes from RUN to EOF */
-# define SCALER_DISPSTAT_EOF0			BIT(8)
+# define SCALER_DISPSTAT_EOF(x)			BIT(8 + ((x) * 8))
+
+# define SCALER_DISPSTAT_IRQMASK(x)		VC4_MASK(13 + ((x) * 8), \
+							 8 + ((x) * 8))
 
 /* Set on AXI invalid DMA ID error. */
 # define SCALER_DISPSTAT_DMA_ERROR		BIT(7)
@@ -298,12 +279,10 @@
  * SCALER_DISPSTAT_RESP_ERROR is not SCALER_DISPSTAT_RESP_OKAY.
  */
 # define SCALER_DISPSTAT_IRQDMA			BIT(4)
-# define SCALER_DISPSTAT_IRQDISP2		BIT(3)
-# define SCALER_DISPSTAT_IRQDISP1		BIT(2)
 /* Set when any of the EOF/EOLN/ESFRAME/ESLINE bits are set and their
  * corresponding interrupt bit is enabled in DISPCTRL.
  */
-# define SCALER_DISPSTAT_IRQDISP0		BIT(1)
+# define SCALER_DISPSTAT_IRQDISP(x)		BIT(1 + (x))
 /* On read, the profiler interrupt.  On write, clear *all* interrupt bits. */
 # define SCALER_DISPSTAT_IRQSCL			BIT(0)
 
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [PATCH v4 2/4] drm/vc4: Add a load tracker to prevent HVS underflow errors
  2019-02-06 14:49 [PATCH v4 0/4] drm/vc4: Add a load tracker Paul Kocialkowski
  2019-02-06 14:49 ` [PATCH v4 1/4] drm/vc4: Report HVS underrun errors Paul Kocialkowski
@ 2019-02-06 14:49 ` Paul Kocialkowski
  2019-02-06 14:49   ` Paul Kocialkowski
  2019-02-06 14:49 ` [PATCH v4 4/4] drm/vc4: Add a debugfs entry to disable/enable the load tracker Paul Kocialkowski
  3 siblings, 0 replies; 11+ messages in thread
From: Paul Kocialkowski @ 2019-02-06 14:49 UTC (permalink / raw)
  To: dri-devel, linux-kernel
  Cc: Eric Anholt, David Airlie, Daniel Vetter, Maxime Ripard,
	Thomas Petazzoni, Eben Upton, Boris Brezillon, Paul Kocialkowski

From: Boris Brezillon <boris.brezillon@bootlin.com>

The HVS block is supposed to fill the pixelvalve FIFOs fast enough to
meet the requested framerate. The problem is, the HVS and memory bus
bandwidths are limited, and if we don't take these limitations into
account we might end up with HVS underflow errors.

This patch is trying to model the per-plane HVS and memory bus bandwidth
consumption and take a decision at atomic_check() time whether the
estimated load will fit in the HVS and membus budget.

Note that we take an extra margin on the memory bus consumption to let
the system run smoothly when other blocks are doing heavy use of the
memory bus. Same goes for the HVS limit, except the margin is smaller in
this case, since the HVS is not used by external components.

Signed-off-by: Boris Brezillon <boris.brezillon@bootlin.com>
Signed-off-by: Paul Kocialkowski <paul.kocialkowski@bootlin.com>
---
 drivers/gpu/drm/vc4/vc4_drv.c   |   1 +
 drivers/gpu/drm/vc4/vc4_drv.h   |  11 ++++
 drivers/gpu/drm/vc4/vc4_kms.c   | 103 +++++++++++++++++++++++++++++++-
 drivers/gpu/drm/vc4/vc4_plane.c |  57 ++++++++++++++++++
 4 files changed, 171 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/vc4/vc4_drv.c b/drivers/gpu/drm/vc4/vc4_drv.c
index 5fcd2f0da7f7..82a688192201 100644
--- a/drivers/gpu/drm/vc4/vc4_drv.c
+++ b/drivers/gpu/drm/vc4/vc4_drv.c
@@ -312,6 +312,7 @@ static void vc4_drm_unbind(struct device *dev)
 
 	drm_mode_config_cleanup(drm);
 
+	drm_atomic_private_obj_fini(&vc4->load_tracker);
 	drm_atomic_private_obj_fini(&vc4->ctm_manager);
 
 	drm_dev_put(drm);
diff --git a/drivers/gpu/drm/vc4/vc4_drv.h b/drivers/gpu/drm/vc4/vc4_drv.h
index b34da7b6ee6b..ea596791231d 100644
--- a/drivers/gpu/drm/vc4/vc4_drv.h
+++ b/drivers/gpu/drm/vc4/vc4_drv.h
@@ -208,6 +208,7 @@ struct vc4_dev {
 
 	struct drm_modeset_lock ctm_state_lock;
 	struct drm_private_obj ctm_manager;
+	struct drm_private_obj load_tracker;
 };
 
 static inline struct vc4_dev *
@@ -383,6 +384,16 @@ struct vc4_plane_state {
 	 * when async update is not possible.
 	 */
 	bool dlist_initialized;
+
+	/* Load of this plane on the HVS block. The load is expressed in HVS
+	 * cycles/sec.
+	 */
+	u64 hvs_load;
+
+	/* Memory bandwidth needed for this plane. This is expressed in
+	 * bytes/sec.
+	 */
+	u64 membus_load;
 };
 
 static inline struct vc4_plane_state *
diff --git a/drivers/gpu/drm/vc4/vc4_kms.c b/drivers/gpu/drm/vc4/vc4_kms.c
index 3c87fbcd0b17..8c9341bee5ce 100644
--- a/drivers/gpu/drm/vc4/vc4_kms.c
+++ b/drivers/gpu/drm/vc4/vc4_kms.c
@@ -34,6 +34,18 @@ static struct vc4_ctm_state *to_vc4_ctm_state(struct drm_private_state *priv)
 	return container_of(priv, struct vc4_ctm_state, base);
 }
 
+struct vc4_load_tracker_state {
+	struct drm_private_state base;
+	u64 hvs_load;
+	u64 membus_load;
+};
+
+static struct vc4_load_tracker_state *
+to_vc4_load_tracker_state(struct drm_private_state *priv)
+{
+	return container_of(priv, struct vc4_load_tracker_state, base);
+}
+
 static struct vc4_ctm_state *vc4_get_ctm_state(struct drm_atomic_state *state,
 					       struct drm_private_obj *manager)
 {
@@ -389,6 +401,81 @@ vc4_ctm_atomic_check(struct drm_device *dev, struct drm_atomic_state *state)
 	return 0;
 }
 
+static int vc4_load_tracker_atomic_check(struct drm_atomic_state *state)
+{
+	struct drm_plane_state *old_plane_state, *new_plane_state;
+	struct vc4_dev *vc4 = to_vc4_dev(state->dev);
+	struct vc4_load_tracker_state *load_state;
+	struct drm_private_state *priv_state;
+	struct drm_plane *plane;
+	int i;
+
+	priv_state = drm_atomic_get_private_obj_state(state,
+						      &vc4->load_tracker);
+	if (IS_ERR(priv_state))
+		return PTR_ERR(priv_state);
+
+	load_state = to_vc4_load_tracker_state(priv_state);
+	for_each_oldnew_plane_in_state(state, plane, old_plane_state,
+				       new_plane_state, i) {
+		struct vc4_plane_state *vc4_plane_state;
+
+		if (old_plane_state->fb && old_plane_state->crtc) {
+			vc4_plane_state = to_vc4_plane_state(old_plane_state);
+			load_state->membus_load -= vc4_plane_state->membus_load;
+			load_state->hvs_load -= vc4_plane_state->hvs_load;
+		}
+
+		if (new_plane_state->fb && new_plane_state->crtc) {
+			vc4_plane_state = to_vc4_plane_state(new_plane_state);
+			load_state->membus_load += vc4_plane_state->membus_load;
+			load_state->hvs_load += vc4_plane_state->hvs_load;
+		}
+	}
+
+	/* The absolute limit is 2Gbyte/sec, but let's take a margin to let
+	 * the system work when other blocks are accessing the memory.
+	 */
+	if (load_state->membus_load > SZ_1G + SZ_512M)
+		return -ENOSPC;
+
+	/* HVS clock is supposed to run @ 250Mhz, let's take a margin and
+	 * consider the maximum number of cycles is 240M.
+	 */
+	if (load_state->hvs_load > 240000000ULL)
+		return -ENOSPC;
+
+	return 0;
+}
+
+static struct drm_private_state *
+vc4_load_tracker_duplicate_state(struct drm_private_obj *obj)
+{
+	struct vc4_load_tracker_state *state;
+
+	state = kmemdup(obj->state, sizeof(*state), GFP_KERNEL);
+	if (!state)
+		return NULL;
+
+	__drm_atomic_helper_private_obj_duplicate_state(obj, &state->base);
+
+	return &state->base;
+}
+
+static void vc4_load_tracker_destroy_state(struct drm_private_obj *obj,
+					   struct drm_private_state *state)
+{
+	struct vc4_load_tracker_state *load_state;
+
+	load_state = to_vc4_load_tracker_state(state);
+	kfree(load_state);
+}
+
+static const struct drm_private_state_funcs vc4_load_tracker_state_funcs = {
+	.atomic_duplicate_state = vc4_load_tracker_duplicate_state,
+	.atomic_destroy_state = vc4_load_tracker_destroy_state,
+};
+
 static int
 vc4_atomic_check(struct drm_device *dev, struct drm_atomic_state *state)
 {
@@ -398,7 +485,11 @@ vc4_atomic_check(struct drm_device *dev, struct drm_atomic_state *state)
 	if (ret < 0)
 		return ret;
 
-	return drm_atomic_helper_check(dev, state);
+	ret = drm_atomic_helper_check(dev, state);
+	if (ret)
+		return ret;
+
+	return vc4_load_tracker_atomic_check(state);
 }
 
 static const struct drm_mode_config_funcs vc4_mode_funcs = {
@@ -411,6 +502,7 @@ int vc4_kms_load(struct drm_device *dev)
 {
 	struct vc4_dev *vc4 = to_vc4_dev(dev);
 	struct vc4_ctm_state *ctm_state;
+	struct vc4_load_tracker_state *load_state;
 	int ret;
 
 	sema_init(&vc4->async_modeset, 1);
@@ -440,6 +532,15 @@ int vc4_kms_load(struct drm_device *dev)
 	drm_atomic_private_obj_init(dev, &vc4->ctm_manager, &ctm_state->base,
 				    &vc4_ctm_state_funcs);
 
+	load_state = kzalloc(sizeof(*load_state), GFP_KERNEL);
+	if (!load_state) {
+		drm_atomic_private_obj_fini(&vc4->ctm_manager);
+		return -ENOMEM;
+	}
+
+	drm_atomic_private_obj_init(dev, &vc4->load_tracker, &load_state->base,
+				    &vc4_load_tracker_state_funcs);
+
 	drm_mode_config_reset(dev);
 
 	drm_kms_helper_poll_init(dev);
diff --git a/drivers/gpu/drm/vc4/vc4_plane.c b/drivers/gpu/drm/vc4/vc4_plane.c
index d098337c10e9..88288793862a 100644
--- a/drivers/gpu/drm/vc4/vc4_plane.c
+++ b/drivers/gpu/drm/vc4/vc4_plane.c
@@ -488,6 +488,61 @@ static void vc4_write_scaling_parameters(struct drm_plane_state *state,
 	}
 }
 
+static void vc4_plane_calc_load(struct drm_plane_state *state)
+{
+	unsigned int hvs_load_shift, vrefresh, i;
+	struct drm_framebuffer *fb = state->fb;
+	struct vc4_plane_state *vc4_state;
+	struct drm_crtc_state *crtc_state;
+	unsigned int vscale_factor;
+
+	vc4_state = to_vc4_plane_state(state);
+	crtc_state = drm_atomic_get_existing_crtc_state(state->state,
+							state->crtc);
+	vrefresh = drm_mode_vrefresh(&crtc_state->adjusted_mode);
+
+	/* The HVS is able to process 2 pixels/cycle when scaling the source,
+	 * 4 pixels/cycle otherwise.
+	 * Alpha blending step seems to be pipelined and it's always operating
+	 * at 4 pixels/cycle, so the limiting aspect here seems to be the
+	 * scaler block.
+	 * HVS load is expressed in clk-cycles/sec (AKA Hz).
+	 */
+	if (vc4_state->x_scaling[0] != VC4_SCALING_NONE ||
+	    vc4_state->x_scaling[1] != VC4_SCALING_NONE ||
+	    vc4_state->y_scaling[0] != VC4_SCALING_NONE ||
+	    vc4_state->y_scaling[1] != VC4_SCALING_NONE)
+		hvs_load_shift = 1;
+	else
+		hvs_load_shift = 2;
+
+	vc4_state->membus_load = 0;
+	vc4_state->hvs_load = 0;
+	for (i = 0; i < fb->format->num_planes; i++) {
+		/* Even if the bandwidth/plane required for a single frame is
+		 *
+		 * vc4_state->src_w[i] * vc4_state->src_h[i] * cpp * vrefresh
+		 *
+		 * when downscaling, we have to read more pixels per line in
+		 * the time frame reserved for a single line, so the bandwidth
+		 * demand can be punctually higher. To account for that, we
+		 * calculate the down-scaling factor and multiply the plane
+		 * load by this number. We're likely over-estimating the read
+		 * demand, but that's better than under-estimating it.
+		 */
+		vscale_factor = DIV_ROUND_UP(vc4_state->src_h[i],
+					     vc4_state->crtc_h);
+		vc4_state->membus_load += vc4_state->src_w[i] *
+					  vc4_state->src_h[i] * vscale_factor *
+					  fb->format->cpp[i];
+		vc4_state->hvs_load += vc4_state->crtc_h * vc4_state->crtc_w;
+	}
+
+	vc4_state->hvs_load *= vrefresh;
+	vc4_state->hvs_load >>= hvs_load_shift;
+	vc4_state->membus_load *= vrefresh;
+}
+
 static int vc4_plane_allocate_lbm(struct drm_plane_state *state)
 {
 	struct vc4_dev *vc4 = to_vc4_dev(state->plane->dev);
@@ -875,6 +930,8 @@ static int vc4_plane_mode_set(struct drm_plane *plane,
 	 */
 	vc4_state->dlist_initialized = 1;
 
+	vc4_plane_calc_load(state);
+
 	return 0;
 }
 
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [PATCH v4 3/4] drm/vc4: Detect and ignore underruns caused by out-of-sync dlists
  2019-02-06 14:49 [PATCH v4 0/4] drm/vc4: Add a load tracker Paul Kocialkowski
@ 2019-02-06 14:49   ` Paul Kocialkowski
  2019-02-06 14:49 ` [PATCH v4 2/4] drm/vc4: Add a load tracker to prevent HVS underflow errors Paul Kocialkowski
                     ` (2 subsequent siblings)
  3 siblings, 0 replies; 11+ messages in thread
From: Paul Kocialkowski @ 2019-02-06 14:49 UTC (permalink / raw)
  To: dri-devel, linux-kernel
  Cc: Eric Anholt, David Airlie, Daniel Vetter, Maxime Ripard,
	Thomas Petazzoni, Eben Upton, Paul Kocialkowski

When the pipeline is reconfigured with a different mode, changes take
effect immediately for the CRTC and encoder while the HVS takes some
time to switch the active display list. This results in a period of
time where the pipeline is out of sync, that is very likely to cause
an underrun to be reported. Because the underrun is not related to the
new configuration, reporting it to userspace is a false positive.

Since this only concerns the first frame and we have no immediate fix
to avoid this situation, detect and work around it.

A helper is introduced to return whether the enabled display channels
have their display list in sync. This hint is set when unmasking the
underrun interrupt and it is updated in the interrupt itself.

With that, underrun reports that happen when the display list is out of
sync are ignored. The interrupt is kept enabled so that proper underrun
indications can be pick up as soon as the new display list takes over.

Signed-off-by: Paul Kocialkowski <paul.kocialkowski@bootlin.com>
---
 drivers/gpu/drm/vc4/vc4_drv.h  |  8 ++++++++
 drivers/gpu/drm/vc4/vc4_hvs.c  | 35 ++++++++++++++++++++++++++++++++--
 drivers/gpu/drm/vc4/vc4_regs.h |  2 ++
 3 files changed, 43 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/vc4/vc4_drv.h b/drivers/gpu/drm/vc4/vc4_drv.h
index ea596791231d..a6ed55da0d10 100644
--- a/drivers/gpu/drm/vc4/vc4_drv.h
+++ b/drivers/gpu/drm/vc4/vc4_drv.h
@@ -192,6 +192,13 @@ struct vc4_dev {
 	 */
 	atomic_t underrun;
 
+	/* Stores whether the display lists were syncronized when unmasking the
+	 * underrun IRQ. This is used to skip underruns reported when the
+	 * pipeline was reconfigured but the previous display list is still
+	 * active.
+	 */
+	bool underrun_dlist_sync;
+
 	struct work_struct overflow_mem_work;
 
 	int power_refcount;
@@ -792,6 +799,7 @@ extern struct platform_driver vc4_hvs_driver;
 void vc4_hvs_dump_state(struct drm_device *dev);
 int vc4_hvs_debugfs_regs(struct seq_file *m, void *unused);
 int vc4_hvs_debugfs_underrun(struct seq_file *m, void *unused);
+bool vc4_hvs_check_dlist_sync(struct drm_device *dev);
 void vc4_hvs_unmask_underrun(struct drm_device *dev);
 void vc4_hvs_mask_underrun(struct drm_device *dev);
 
diff --git a/drivers/gpu/drm/vc4/vc4_hvs.c b/drivers/gpu/drm/vc4/vc4_hvs.c
index d5bc3bcd3e51..53ba24aed8fd 100644
--- a/drivers/gpu/drm/vc4/vc4_hvs.c
+++ b/drivers/gpu/drm/vc4/vc4_hvs.c
@@ -179,6 +179,23 @@ static int vc4_hvs_upload_linear_kernel(struct vc4_hvs *hvs,
 	return 0;
 }
 
+bool vc4_hvs_check_dlist_sync(struct drm_device *dev)
+{
+	struct vc4_dev *vc4 = to_vc4_dev(dev);
+	unsigned int i;
+
+	for (i = 0; i < SCALER_CHANNELS_COUNT; i++) {
+		if (!(HVS_READ(SCALER_DISPCTRLX(i)) & SCALER_DISPCTRLX_ENABLE))
+			continue;
+
+		if (HVS_READ(SCALER_DISPLACTX(i)) !=
+		    HVS_READ(SCALER_DISPLISTX(i)))
+			return false;
+	}
+
+	return true;
+}
+
 void vc4_hvs_mask_underrun(struct drm_device *dev)
 {
 	struct vc4_dev *vc4 = to_vc4_dev(dev);
@@ -196,6 +213,9 @@ void vc4_hvs_unmask_underrun(struct drm_device *dev)
 	struct vc4_dev *vc4 = to_vc4_dev(dev);
 	u32 dispctrl = HVS_READ(SCALER_DISPCTRL);
 
+	/* An underrun will occur when the display lists are out of sync. */
+	vc4->underrun_dlist_sync = vc4_hvs_check_dlist_sync(dev);
+
 	dispctrl |= SCALER_DISPCTRL_DSPEISLUR(0) |
 		    SCALER_DISPCTRL_DSPEISLUR(1) |
 		    SCALER_DISPCTRL_DSPEISLUR(2);
@@ -220,19 +240,30 @@ static irqreturn_t vc4_hvs_irq_handler(int irq, void *data)
 	struct drm_device *dev = data;
 	struct vc4_dev *vc4 = to_vc4_dev(dev);
 	irqreturn_t irqret = IRQ_NONE;
+	bool dlist_sync;
 	u32 status;
 
+	dlist_sync = vc4_hvs_check_dlist_sync(dev);
 	status = HVS_READ(SCALER_DISPSTAT);
 
 	if (status & (SCALER_DISPSTAT_EUFLOW(0) |
 		      SCALER_DISPSTAT_EUFLOW(1) |
 		      SCALER_DISPSTAT_EUFLOW(2))) {
-		vc4_hvs_mask_underrun(dev);
-		vc4_hvs_report_underrun(dev);
+		/* An underrun will be reported when the current display list
+		 * was not yet updated by the hardware to the requested one and
+		 * the other elements of the pipeline were already reconfigured.
+		 * Ignore it in that case.
+		 */
+		if (vc4->underrun_dlist_sync && dlist_sync) {
+			vc4_hvs_mask_underrun(dev);
+			vc4_hvs_report_underrun(dev);
+		}
 
 		irqret = IRQ_HANDLED;
 	}
 
+	vc4->underrun_dlist_sync = dlist_sync;
+
 	HVS_WRITE(SCALER_DISPSTAT, SCALER_DISPSTAT_IRQMASK(0) |
 				   SCALER_DISPSTAT_IRQMASK(1) |
 				   SCALER_DISPSTAT_IRQMASK(2));
diff --git a/drivers/gpu/drm/vc4/vc4_regs.h b/drivers/gpu/drm/vc4/vc4_regs.h
index e0834de8410f..c0c5fadaf7e3 100644
--- a/drivers/gpu/drm/vc4/vc4_regs.h
+++ b/drivers/gpu/drm/vc4/vc4_regs.h
@@ -212,6 +212,8 @@
 
 #define PV_HACT_ACT				0x30
 
+#define SCALER_CHANNELS_COUNT			3
+
 #define SCALER_DISPCTRL                         0x00000000
 /* Global register for clock gating the HVS */
 # define SCALER_DISPCTRL_ENABLE			BIT(31)
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [PATCH v4 3/4] drm/vc4: Detect and ignore underruns caused by out-of-sync dlists
@ 2019-02-06 14:49   ` Paul Kocialkowski
  0 siblings, 0 replies; 11+ messages in thread
From: Paul Kocialkowski @ 2019-02-06 14:49 UTC (permalink / raw)
  To: dri-devel, linux-kernel
  Cc: Maxime Ripard, Eben Upton, David Airlie, Paul Kocialkowski,
	Thomas Petazzoni

When the pipeline is reconfigured with a different mode, changes take
effect immediately for the CRTC and encoder while the HVS takes some
time to switch the active display list. This results in a period of
time where the pipeline is out of sync, that is very likely to cause
an underrun to be reported. Because the underrun is not related to the
new configuration, reporting it to userspace is a false positive.

Since this only concerns the first frame and we have no immediate fix
to avoid this situation, detect and work around it.

A helper is introduced to return whether the enabled display channels
have their display list in sync. This hint is set when unmasking the
underrun interrupt and it is updated in the interrupt itself.

With that, underrun reports that happen when the display list is out of
sync are ignored. The interrupt is kept enabled so that proper underrun
indications can be pick up as soon as the new display list takes over.

Signed-off-by: Paul Kocialkowski <paul.kocialkowski@bootlin.com>
---
 drivers/gpu/drm/vc4/vc4_drv.h  |  8 ++++++++
 drivers/gpu/drm/vc4/vc4_hvs.c  | 35 ++++++++++++++++++++++++++++++++--
 drivers/gpu/drm/vc4/vc4_regs.h |  2 ++
 3 files changed, 43 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/vc4/vc4_drv.h b/drivers/gpu/drm/vc4/vc4_drv.h
index ea596791231d..a6ed55da0d10 100644
--- a/drivers/gpu/drm/vc4/vc4_drv.h
+++ b/drivers/gpu/drm/vc4/vc4_drv.h
@@ -192,6 +192,13 @@ struct vc4_dev {
 	 */
 	atomic_t underrun;
 
+	/* Stores whether the display lists were syncronized when unmasking the
+	 * underrun IRQ. This is used to skip underruns reported when the
+	 * pipeline was reconfigured but the previous display list is still
+	 * active.
+	 */
+	bool underrun_dlist_sync;
+
 	struct work_struct overflow_mem_work;
 
 	int power_refcount;
@@ -792,6 +799,7 @@ extern struct platform_driver vc4_hvs_driver;
 void vc4_hvs_dump_state(struct drm_device *dev);
 int vc4_hvs_debugfs_regs(struct seq_file *m, void *unused);
 int vc4_hvs_debugfs_underrun(struct seq_file *m, void *unused);
+bool vc4_hvs_check_dlist_sync(struct drm_device *dev);
 void vc4_hvs_unmask_underrun(struct drm_device *dev);
 void vc4_hvs_mask_underrun(struct drm_device *dev);
 
diff --git a/drivers/gpu/drm/vc4/vc4_hvs.c b/drivers/gpu/drm/vc4/vc4_hvs.c
index d5bc3bcd3e51..53ba24aed8fd 100644
--- a/drivers/gpu/drm/vc4/vc4_hvs.c
+++ b/drivers/gpu/drm/vc4/vc4_hvs.c
@@ -179,6 +179,23 @@ static int vc4_hvs_upload_linear_kernel(struct vc4_hvs *hvs,
 	return 0;
 }
 
+bool vc4_hvs_check_dlist_sync(struct drm_device *dev)
+{
+	struct vc4_dev *vc4 = to_vc4_dev(dev);
+	unsigned int i;
+
+	for (i = 0; i < SCALER_CHANNELS_COUNT; i++) {
+		if (!(HVS_READ(SCALER_DISPCTRLX(i)) & SCALER_DISPCTRLX_ENABLE))
+			continue;
+
+		if (HVS_READ(SCALER_DISPLACTX(i)) !=
+		    HVS_READ(SCALER_DISPLISTX(i)))
+			return false;
+	}
+
+	return true;
+}
+
 void vc4_hvs_mask_underrun(struct drm_device *dev)
 {
 	struct vc4_dev *vc4 = to_vc4_dev(dev);
@@ -196,6 +213,9 @@ void vc4_hvs_unmask_underrun(struct drm_device *dev)
 	struct vc4_dev *vc4 = to_vc4_dev(dev);
 	u32 dispctrl = HVS_READ(SCALER_DISPCTRL);
 
+	/* An underrun will occur when the display lists are out of sync. */
+	vc4->underrun_dlist_sync = vc4_hvs_check_dlist_sync(dev);
+
 	dispctrl |= SCALER_DISPCTRL_DSPEISLUR(0) |
 		    SCALER_DISPCTRL_DSPEISLUR(1) |
 		    SCALER_DISPCTRL_DSPEISLUR(2);
@@ -220,19 +240,30 @@ static irqreturn_t vc4_hvs_irq_handler(int irq, void *data)
 	struct drm_device *dev = data;
 	struct vc4_dev *vc4 = to_vc4_dev(dev);
 	irqreturn_t irqret = IRQ_NONE;
+	bool dlist_sync;
 	u32 status;
 
+	dlist_sync = vc4_hvs_check_dlist_sync(dev);
 	status = HVS_READ(SCALER_DISPSTAT);
 
 	if (status & (SCALER_DISPSTAT_EUFLOW(0) |
 		      SCALER_DISPSTAT_EUFLOW(1) |
 		      SCALER_DISPSTAT_EUFLOW(2))) {
-		vc4_hvs_mask_underrun(dev);
-		vc4_hvs_report_underrun(dev);
+		/* An underrun will be reported when the current display list
+		 * was not yet updated by the hardware to the requested one and
+		 * the other elements of the pipeline were already reconfigured.
+		 * Ignore it in that case.
+		 */
+		if (vc4->underrun_dlist_sync && dlist_sync) {
+			vc4_hvs_mask_underrun(dev);
+			vc4_hvs_report_underrun(dev);
+		}
 
 		irqret = IRQ_HANDLED;
 	}
 
+	vc4->underrun_dlist_sync = dlist_sync;
+
 	HVS_WRITE(SCALER_DISPSTAT, SCALER_DISPSTAT_IRQMASK(0) |
 				   SCALER_DISPSTAT_IRQMASK(1) |
 				   SCALER_DISPSTAT_IRQMASK(2));
diff --git a/drivers/gpu/drm/vc4/vc4_regs.h b/drivers/gpu/drm/vc4/vc4_regs.h
index e0834de8410f..c0c5fadaf7e3 100644
--- a/drivers/gpu/drm/vc4/vc4_regs.h
+++ b/drivers/gpu/drm/vc4/vc4_regs.h
@@ -212,6 +212,8 @@
 
 #define PV_HACT_ACT				0x30
 
+#define SCALER_CHANNELS_COUNT			3
+
 #define SCALER_DISPCTRL                         0x00000000
 /* Global register for clock gating the HVS */
 # define SCALER_DISPCTRL_ENABLE			BIT(31)
-- 
2.20.1

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [PATCH v4 4/4] drm/vc4: Add a debugfs entry to disable/enable the load tracker
  2019-02-06 14:49 [PATCH v4 0/4] drm/vc4: Add a load tracker Paul Kocialkowski
                   ` (2 preceding siblings ...)
  2019-02-06 14:49   ` Paul Kocialkowski
@ 2019-02-06 14:49 ` Paul Kocialkowski
  3 siblings, 0 replies; 11+ messages in thread
From: Paul Kocialkowski @ 2019-02-06 14:49 UTC (permalink / raw)
  To: dri-devel, linux-kernel
  Cc: Eric Anholt, David Airlie, Daniel Vetter, Maxime Ripard,
	Thomas Petazzoni, Eben Upton, Paul Kocialkowski

In order to test whether the load tracker is working as expected, we
need the ability to compare the commit result with the underrun
indication. With the load tracker always enabled, commits that are
expected to trigger an underrun are always rejected, so userspace
cannot get the actual underrun indication from the hardware.

Add a debugfs entry to disable/enable the load tracker, so that a DRM
commit expected to trigger an underrun can go through with the load
tracker disabled. The underrun indication is then available to
userspace and can be checked against the commit result with the load
tracker enabled.

Signed-off-by: Paul Kocialkowski <paul.kocialkowski@bootlin.com>
---
 drivers/gpu/drm/vc4/vc4_debugfs.c | 9 +++++++++
 drivers/gpu/drm/vc4/vc4_drv.h     | 3 +++
 drivers/gpu/drm/vc4/vc4_kms.c     | 9 +++++++++
 3 files changed, 21 insertions(+)

diff --git a/drivers/gpu/drm/vc4/vc4_debugfs.c b/drivers/gpu/drm/vc4/vc4_debugfs.c
index 64021bcba041..59cdad89f844 100644
--- a/drivers/gpu/drm/vc4/vc4_debugfs.c
+++ b/drivers/gpu/drm/vc4/vc4_debugfs.c
@@ -36,6 +36,15 @@ static const struct drm_info_list vc4_debugfs_list[] = {
 int
 vc4_debugfs_init(struct drm_minor *minor)
 {
+	struct vc4_dev *vc4 = to_vc4_dev(minor->dev);
+	struct dentry *dentry;
+
+	dentry = debugfs_create_bool("hvs_load_tracker", S_IRUGO | S_IWUSR,
+				     minor->debugfs_root,
+				     &vc4->load_tracker_enabled);
+	if (!dentry)
+		return -ENOMEM;
+
 	return drm_debugfs_create_files(vc4_debugfs_list, VC4_DEBUGFS_ENTRIES,
 					minor->debugfs_root, minor);
 }
diff --git a/drivers/gpu/drm/vc4/vc4_drv.h b/drivers/gpu/drm/vc4/vc4_drv.h
index a6ed55da0d10..a7a50c4a62bc 100644
--- a/drivers/gpu/drm/vc4/vc4_drv.h
+++ b/drivers/gpu/drm/vc4/vc4_drv.h
@@ -203,6 +203,9 @@ struct vc4_dev {
 
 	int power_refcount;
 
+	/* Set to true when the load tracker is active. */
+	bool load_tracker_enabled;
+
 	/* Mutex controlling the power refcount. */
 	struct mutex power_lock;
 
diff --git a/drivers/gpu/drm/vc4/vc4_kms.c b/drivers/gpu/drm/vc4/vc4_kms.c
index 8c9341bee5ce..476bf746b014 100644
--- a/drivers/gpu/drm/vc4/vc4_kms.c
+++ b/drivers/gpu/drm/vc4/vc4_kms.c
@@ -433,6 +433,10 @@ static int vc4_load_tracker_atomic_check(struct drm_atomic_state *state)
 		}
 	}
 
+	/* Don't check the load when the tracker is disabled. */
+	if (!vc4->load_tracker_enabled)
+		return 0;
+
 	/* The absolute limit is 2Gbyte/sec, but let's take a margin to let
 	 * the system work when other blocks are accessing the memory.
 	 */
@@ -505,6 +509,11 @@ int vc4_kms_load(struct drm_device *dev)
 	struct vc4_load_tracker_state *load_state;
 	int ret;
 
+	/* Start with the load tracker enabled. Can be disabled through the
+	 * debugfs load_tracker file.
+	 */
+	vc4->load_tracker_enabled = true;
+
 	sema_init(&vc4->async_modeset, 1);
 
 	/* Set support for vblank irq fast disable, before drm_vblank_init() */
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 11+ messages in thread

* Re: [PATCH v4 3/4] drm/vc4: Detect and ignore underruns caused by out-of-sync dlists
  2019-02-06 14:49   ` Paul Kocialkowski
@ 2019-02-06 23:51     ` Eric Anholt
  -1 siblings, 0 replies; 11+ messages in thread
From: Eric Anholt @ 2019-02-06 23:51 UTC (permalink / raw)
  To: Paul Kocialkowski, dri-devel, linux-kernel
  Cc: David Airlie, Daniel Vetter, Maxime Ripard, Thomas Petazzoni,
	Eben Upton, Paul Kocialkowski

[-- Attachment #1: Type: text/plain, Size: 1094 bytes --]

Paul Kocialkowski <paul.kocialkowski@bootlin.com> writes:

> When the pipeline is reconfigured with a different mode, changes take
> effect immediately for the CRTC and encoder while the HVS takes some
> time to switch the active display list. This results in a period of
> time where the pipeline is out of sync, that is very likely to cause
> an underrun to be reported. Because the underrun is not related to the
> new configuration, reporting it to userspace is a false positive.

This seems like a serious issue.  How are we enabling a CRTC with the
corresponding HVS still scanning out old contents?  Did we need to wait
for HVS to finish its old frame when we turned off the CRTC, so it's
ready to receive the START when it's been set up with the new dlist and
the CRTC is turned back on?  Or maybe do some sort of reset of that
dlist when a crtc is being enabled?

If we can't sort that out, it feels to me like we should be enabling the
interrupts from the flip_done path (when we know that the HVS is
scanning out the new frame) instead of trying to mitigate enabling them
too early.

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 832 bytes --]

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH v4 3/4] drm/vc4: Detect and ignore underruns caused by out-of-sync dlists
@ 2019-02-06 23:51     ` Eric Anholt
  0 siblings, 0 replies; 11+ messages in thread
From: Eric Anholt @ 2019-02-06 23:51 UTC (permalink / raw)
  To: dri-devel, linux-kernel
  Cc: David Airlie, Daniel Vetter, Maxime Ripard, Thomas Petazzoni,
	Eben Upton, Paul Kocialkowski

[-- Attachment #1: Type: text/plain, Size: 1094 bytes --]

Paul Kocialkowski <paul.kocialkowski@bootlin.com> writes:

> When the pipeline is reconfigured with a different mode, changes take
> effect immediately for the CRTC and encoder while the HVS takes some
> time to switch the active display list. This results in a period of
> time where the pipeline is out of sync, that is very likely to cause
> an underrun to be reported. Because the underrun is not related to the
> new configuration, reporting it to userspace is a false positive.

This seems like a serious issue.  How are we enabling a CRTC with the
corresponding HVS still scanning out old contents?  Did we need to wait
for HVS to finish its old frame when we turned off the CRTC, so it's
ready to receive the START when it's been set up with the new dlist and
the CRTC is turned back on?  Or maybe do some sort of reset of that
dlist when a crtc is being enabled?

If we can't sort that out, it feels to me like we should be enabling the
interrupts from the flip_done path (when we know that the HVS is
scanning out the new frame) instead of trying to mitigate enabling them
too early.

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 832 bytes --]

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH v4 1/4] drm/vc4: Report HVS underrun errors
  2019-02-06 14:49 ` [PATCH v4 1/4] drm/vc4: Report HVS underrun errors Paul Kocialkowski
@ 2019-02-06 23:51     ` Eric Anholt
  0 siblings, 0 replies; 11+ messages in thread
From: Eric Anholt @ 2019-02-06 23:51 UTC (permalink / raw)
  To: Paul Kocialkowski, dri-devel, linux-kernel
  Cc: David Airlie, Daniel Vetter, Maxime Ripard, Thomas Petazzoni,
	Eben Upton, Boris Brezillon, Paul Kocialkowski

[-- Attachment #1: Type: text/plain, Size: 8167 bytes --]

Paul Kocialkowski <paul.kocialkowski@bootlin.com> writes:

> From: Boris Brezillon <boris.brezillon@bootlin.com>
>
> Add a debugfs entry and helper for reporting HVS underrun errors as
> well as helpers for masking and unmasking the underrun interrupts.
> Add an IRQ handler and initial IRQ configuration.
> Rework related register definitions to take the channel number.
>
> Signed-off-by: Boris Brezillon <boris.brezillon@bootlin.com>
> Signed-off-by: Paul Kocialkowski <paul.kocialkowski@bootlin.com>
> ---
>  drivers/gpu/drm/vc4/vc4_debugfs.c |  1 +
>  drivers/gpu/drm/vc4/vc4_drv.h     | 10 ++++
>  drivers/gpu/drm/vc4/vc4_hvs.c     | 92 +++++++++++++++++++++++++++++++
>  drivers/gpu/drm/vc4/vc4_kms.c     |  4 ++
>  drivers/gpu/drm/vc4/vc4_regs.h    | 49 +++++-----------
>  5 files changed, 121 insertions(+), 35 deletions(-)
>
> diff --git a/drivers/gpu/drm/vc4/vc4_debugfs.c b/drivers/gpu/drm/vc4/vc4_debugfs.c
> index 7a0003de71ab..64021bcba041 100644
> --- a/drivers/gpu/drm/vc4/vc4_debugfs.c
> +++ b/drivers/gpu/drm/vc4/vc4_debugfs.c
> @@ -23,6 +23,7 @@ static const struct drm_info_list vc4_debugfs_list[] = {
>  	{"vec_regs", vc4_vec_debugfs_regs, 0},
>  	{"txp_regs", vc4_txp_debugfs_regs, 0},
>  	{"hvs_regs", vc4_hvs_debugfs_regs, 0},
> +	{"hvs_underrun",  vc4_hvs_debugfs_underrun, 0},
>  	{"crtc0_regs", vc4_crtc_debugfs_regs, 0, (void *)(uintptr_t)0},
>  	{"crtc1_regs", vc4_crtc_debugfs_regs, 0, (void *)(uintptr_t)1},
>  	{"crtc2_regs", vc4_crtc_debugfs_regs, 0, (void *)(uintptr_t)2},
> diff --git a/drivers/gpu/drm/vc4/vc4_drv.h b/drivers/gpu/drm/vc4/vc4_drv.h
> index 2c635f001c71..b34da7b6ee6b 100644
> --- a/drivers/gpu/drm/vc4/vc4_drv.h
> +++ b/drivers/gpu/drm/vc4/vc4_drv.h
> @@ -185,6 +185,13 @@ struct vc4_dev {
>  	/* Bitmask of the current bin_alloc used for overflow memory. */
>  	uint32_t bin_alloc_overflow;
>  
> +	/* Incremented when an underrun error happened after an atomic commit.
> +	 * This is particularly useful to detect when a specific modeset is too
> +	 * demanding in term of memory or HVS bandwidth which is hard to guess
> +	 * at atomic check time.
> +	 */
> +	atomic_t underrun;
> +
>  	struct work_struct overflow_mem_work;
>  
>  	int power_refcount;
> @@ -773,6 +780,9 @@ void vc4_irq_reset(struct drm_device *dev);
>  extern struct platform_driver vc4_hvs_driver;
>  void vc4_hvs_dump_state(struct drm_device *dev);
>  int vc4_hvs_debugfs_regs(struct seq_file *m, void *unused);
> +int vc4_hvs_debugfs_underrun(struct seq_file *m, void *unused);
> +void vc4_hvs_unmask_underrun(struct drm_device *dev);
> +void vc4_hvs_mask_underrun(struct drm_device *dev);
>  
>  /* vc4_kms.c */
>  int vc4_kms_load(struct drm_device *dev);
> diff --git a/drivers/gpu/drm/vc4/vc4_hvs.c b/drivers/gpu/drm/vc4/vc4_hvs.c
> index 5d8c749c9749..d5bc3bcd3e51 100644
> --- a/drivers/gpu/drm/vc4/vc4_hvs.c
> +++ b/drivers/gpu/drm/vc4/vc4_hvs.c
> @@ -22,6 +22,7 @@
>   * each CRTC.
>   */
>  
> +#include <drm/drm_atomic_helper.h>
>  #include <linux/component.h>
>  #include "vc4_drv.h"
>  #include "vc4_regs.h"
> @@ -102,6 +103,18 @@ int vc4_hvs_debugfs_regs(struct seq_file *m, void *unused)
>  
>  	return 0;
>  }
> +
> +int vc4_hvs_debugfs_underrun(struct seq_file *m, void *data)
> +{
> +	struct drm_info_node *node = m->private;
> +	struct drm_device *dev = node->minor->dev;
> +	struct vc4_dev *vc4 = to_vc4_dev(dev);
> +	struct drm_printer p = drm_seq_file_printer(m);
> +
> +	drm_printf(&p, "%d\n", atomic_read(&vc4->underrun));
> +
> +	return 0;
> +}
>  #endif
>  
>  /* The filter kernel is composed of dwords each containing 3 9-bit
> @@ -166,6 +179,67 @@ static int vc4_hvs_upload_linear_kernel(struct vc4_hvs *hvs,
>  	return 0;
>  }
>  
> +void vc4_hvs_mask_underrun(struct drm_device *dev)
> +{
> +	struct vc4_dev *vc4 = to_vc4_dev(dev);
> +	u32 dispctrl = HVS_READ(SCALER_DISPCTRL);
> +
> +	dispctrl &= ~(SCALER_DISPCTRL_DSPEISLUR(0) |
> +		      SCALER_DISPCTRL_DSPEISLUR(1) |
> +		      SCALER_DISPCTRL_DSPEISLUR(2));
> +
> +	HVS_WRITE(SCALER_DISPCTRL, dispctrl);
> +}
> +
> +void vc4_hvs_unmask_underrun(struct drm_device *dev)
> +{
> +	struct vc4_dev *vc4 = to_vc4_dev(dev);
> +	u32 dispctrl = HVS_READ(SCALER_DISPCTRL);
> +
> +	dispctrl |= SCALER_DISPCTRL_DSPEISLUR(0) |
> +		    SCALER_DISPCTRL_DSPEISLUR(1) |
> +		    SCALER_DISPCTRL_DSPEISLUR(2);
> +
> +	HVS_WRITE(SCALER_DISPSTAT,
> +		  SCALER_DISPSTAT_EUFLOW(0) |
> +		  SCALER_DISPSTAT_EUFLOW(1) |
> +		  SCALER_DISPSTAT_EUFLOW(2));
> +	HVS_WRITE(SCALER_DISPCTRL, dispctrl);
> +}
> +
> +static void vc4_hvs_report_underrun(struct drm_device *dev)
> +{
> +	struct vc4_dev *vc4 = to_vc4_dev(dev);
> +
> +	atomic_inc(&vc4->underrun);
> +	DRM_DEV_ERROR(dev->dev, "HVS underrun\n");
> +}
> +
> +static irqreturn_t vc4_hvs_irq_handler(int irq, void *data)
> +{
> +	struct drm_device *dev = data;
> +	struct vc4_dev *vc4 = to_vc4_dev(dev);
> +	irqreturn_t irqret = IRQ_NONE;
> +	u32 status;
> +
> +	status = HVS_READ(SCALER_DISPSTAT);
> +
> +	if (status & (SCALER_DISPSTAT_EUFLOW(0) |
> +		      SCALER_DISPSTAT_EUFLOW(1) |
> +		      SCALER_DISPSTAT_EUFLOW(2))) {
> +		vc4_hvs_mask_underrun(dev);
> +		vc4_hvs_report_underrun(dev);
> +
> +		irqret = IRQ_HANDLED;
> +	}
> +
> +	HVS_WRITE(SCALER_DISPSTAT, SCALER_DISPSTAT_IRQMASK(0) |
> +				   SCALER_DISPSTAT_IRQMASK(1) |
> +				   SCALER_DISPSTAT_IRQMASK(2));
> +
> +	return irqret;
> +}
> +
>  static int vc4_hvs_bind(struct device *dev, struct device *master, void *data)
>  {
>  	struct platform_device *pdev = to_platform_device(dev);
> @@ -219,15 +293,33 @@ static int vc4_hvs_bind(struct device *dev, struct device *master, void *data)
>  	dispctrl = HVS_READ(SCALER_DISPCTRL);
>  
>  	dispctrl |= SCALER_DISPCTRL_ENABLE;
> +	dispctrl |= SCALER_DISPCTRL_DSPEISLUR(0) | SCALER_DISPCTRL_DISPEIRQ(0) |
> +		    SCALER_DISPCTRL_DSPEISLUR(1) | SCALER_DISPCTRL_DISPEIRQ(1) |
> +		    SCALER_DISPCTRL_DSPEISLUR(2) | SCALER_DISPCTRL_DISPEIRQ(2);
>  
>  	/* Set DSP3 (PV1) to use HVS channel 2, which would otherwise
>  	 * be unused.
>  	 */
>  	dispctrl &= ~SCALER_DISPCTRL_DSP3_MUX_MASK;
> +	dispctrl &= ~(SCALER_DISPCTRL_DMAEIRQ |
> +		      SCALER_DISPCTRL_SLVWREIRQ |
> +		      SCALER_DISPCTRL_SLVRDEIRQ |
> +		      SCALER_DISPCTRL_DSPEIEOF(0) |
> +		      SCALER_DISPCTRL_DSPEIEOF(1) |
> +		      SCALER_DISPCTRL_DSPEIEOF(2) |
> +		      SCALER_DISPCTRL_DSPEIEOLN(0) |
> +		      SCALER_DISPCTRL_DSPEIEOLN(1) |
> +		      SCALER_DISPCTRL_DSPEIEOLN(2) |
> +		      SCALER_DISPCTRL_SCLEIRQ);
>  	dispctrl |= VC4_SET_FIELD(2, SCALER_DISPCTRL_DSP3_MUX);
>  
>  	HVS_WRITE(SCALER_DISPCTRL, dispctrl);
>  
> +	ret = devm_request_irq(dev, platform_get_irq(pdev, 0),
> +			       vc4_hvs_irq_handler, 0, "vc4 hvs", drm);
> +	if (ret)
> +		return ret;
> +
>  	return 0;
>  }
>  
> diff --git a/drivers/gpu/drm/vc4/vc4_kms.c b/drivers/gpu/drm/vc4/vc4_kms.c
> index 91b8c72ff361..3c87fbcd0b17 100644
> --- a/drivers/gpu/drm/vc4/vc4_kms.c
> +++ b/drivers/gpu/drm/vc4/vc4_kms.c
> @@ -139,6 +139,8 @@ vc4_atomic_complete_commit(struct drm_atomic_state *state)
>  	struct drm_device *dev = state->dev;
>  	struct vc4_dev *vc4 = to_vc4_dev(dev);
>  
> +	vc4_hvs_mask_underrun(dev);
> +
>  	drm_atomic_helper_wait_for_fences(dev, state, false);
>  
>  	drm_atomic_helper_wait_for_dependencies(state);
> @@ -155,6 +157,8 @@ vc4_atomic_complete_commit(struct drm_atomic_state *state)
>  
>  	drm_atomic_helper_commit_hw_done(state);
>  
> +	vc4_hvs_unmask_underrun(dev);
> +
>  	drm_atomic_helper_wait_for_flip_done(dev, state);
>  
>  	drm_atomic_helper_cleanup_planes(dev, state);

I think the mask/unmask in here could race against another atomic_commit
happening on another CRTC in parallel.  I guess maybe we should mask off
interrupts on the specific channel being modified.

However, given that we're just talking about a debugfs entry and
user-space testing tool, I'm fine with this as-is.

Other than my concern for patch #4, patch 1-3 are:

Reviewed-by: Eric Anholt <eric@anholt.net>

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 832 bytes --]

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH v4 1/4] drm/vc4: Report HVS underrun errors
@ 2019-02-06 23:51     ` Eric Anholt
  0 siblings, 0 replies; 11+ messages in thread
From: Eric Anholt @ 2019-02-06 23:51 UTC (permalink / raw)
  To: dri-devel, linux-kernel
  Cc: David Airlie, Daniel Vetter, Maxime Ripard, Thomas Petazzoni,
	Eben Upton, Boris Brezillon, Paul Kocialkowski

[-- Attachment #1: Type: text/plain, Size: 8167 bytes --]

Paul Kocialkowski <paul.kocialkowski@bootlin.com> writes:

> From: Boris Brezillon <boris.brezillon@bootlin.com>
>
> Add a debugfs entry and helper for reporting HVS underrun errors as
> well as helpers for masking and unmasking the underrun interrupts.
> Add an IRQ handler and initial IRQ configuration.
> Rework related register definitions to take the channel number.
>
> Signed-off-by: Boris Brezillon <boris.brezillon@bootlin.com>
> Signed-off-by: Paul Kocialkowski <paul.kocialkowski@bootlin.com>
> ---
>  drivers/gpu/drm/vc4/vc4_debugfs.c |  1 +
>  drivers/gpu/drm/vc4/vc4_drv.h     | 10 ++++
>  drivers/gpu/drm/vc4/vc4_hvs.c     | 92 +++++++++++++++++++++++++++++++
>  drivers/gpu/drm/vc4/vc4_kms.c     |  4 ++
>  drivers/gpu/drm/vc4/vc4_regs.h    | 49 +++++-----------
>  5 files changed, 121 insertions(+), 35 deletions(-)
>
> diff --git a/drivers/gpu/drm/vc4/vc4_debugfs.c b/drivers/gpu/drm/vc4/vc4_debugfs.c
> index 7a0003de71ab..64021bcba041 100644
> --- a/drivers/gpu/drm/vc4/vc4_debugfs.c
> +++ b/drivers/gpu/drm/vc4/vc4_debugfs.c
> @@ -23,6 +23,7 @@ static const struct drm_info_list vc4_debugfs_list[] = {
>  	{"vec_regs", vc4_vec_debugfs_regs, 0},
>  	{"txp_regs", vc4_txp_debugfs_regs, 0},
>  	{"hvs_regs", vc4_hvs_debugfs_regs, 0},
> +	{"hvs_underrun",  vc4_hvs_debugfs_underrun, 0},
>  	{"crtc0_regs", vc4_crtc_debugfs_regs, 0, (void *)(uintptr_t)0},
>  	{"crtc1_regs", vc4_crtc_debugfs_regs, 0, (void *)(uintptr_t)1},
>  	{"crtc2_regs", vc4_crtc_debugfs_regs, 0, (void *)(uintptr_t)2},
> diff --git a/drivers/gpu/drm/vc4/vc4_drv.h b/drivers/gpu/drm/vc4/vc4_drv.h
> index 2c635f001c71..b34da7b6ee6b 100644
> --- a/drivers/gpu/drm/vc4/vc4_drv.h
> +++ b/drivers/gpu/drm/vc4/vc4_drv.h
> @@ -185,6 +185,13 @@ struct vc4_dev {
>  	/* Bitmask of the current bin_alloc used for overflow memory. */
>  	uint32_t bin_alloc_overflow;
>  
> +	/* Incremented when an underrun error happened after an atomic commit.
> +	 * This is particularly useful to detect when a specific modeset is too
> +	 * demanding in term of memory or HVS bandwidth which is hard to guess
> +	 * at atomic check time.
> +	 */
> +	atomic_t underrun;
> +
>  	struct work_struct overflow_mem_work;
>  
>  	int power_refcount;
> @@ -773,6 +780,9 @@ void vc4_irq_reset(struct drm_device *dev);
>  extern struct platform_driver vc4_hvs_driver;
>  void vc4_hvs_dump_state(struct drm_device *dev);
>  int vc4_hvs_debugfs_regs(struct seq_file *m, void *unused);
> +int vc4_hvs_debugfs_underrun(struct seq_file *m, void *unused);
> +void vc4_hvs_unmask_underrun(struct drm_device *dev);
> +void vc4_hvs_mask_underrun(struct drm_device *dev);
>  
>  /* vc4_kms.c */
>  int vc4_kms_load(struct drm_device *dev);
> diff --git a/drivers/gpu/drm/vc4/vc4_hvs.c b/drivers/gpu/drm/vc4/vc4_hvs.c
> index 5d8c749c9749..d5bc3bcd3e51 100644
> --- a/drivers/gpu/drm/vc4/vc4_hvs.c
> +++ b/drivers/gpu/drm/vc4/vc4_hvs.c
> @@ -22,6 +22,7 @@
>   * each CRTC.
>   */
>  
> +#include <drm/drm_atomic_helper.h>
>  #include <linux/component.h>
>  #include "vc4_drv.h"
>  #include "vc4_regs.h"
> @@ -102,6 +103,18 @@ int vc4_hvs_debugfs_regs(struct seq_file *m, void *unused)
>  
>  	return 0;
>  }
> +
> +int vc4_hvs_debugfs_underrun(struct seq_file *m, void *data)
> +{
> +	struct drm_info_node *node = m->private;
> +	struct drm_device *dev = node->minor->dev;
> +	struct vc4_dev *vc4 = to_vc4_dev(dev);
> +	struct drm_printer p = drm_seq_file_printer(m);
> +
> +	drm_printf(&p, "%d\n", atomic_read(&vc4->underrun));
> +
> +	return 0;
> +}
>  #endif
>  
>  /* The filter kernel is composed of dwords each containing 3 9-bit
> @@ -166,6 +179,67 @@ static int vc4_hvs_upload_linear_kernel(struct vc4_hvs *hvs,
>  	return 0;
>  }
>  
> +void vc4_hvs_mask_underrun(struct drm_device *dev)
> +{
> +	struct vc4_dev *vc4 = to_vc4_dev(dev);
> +	u32 dispctrl = HVS_READ(SCALER_DISPCTRL);
> +
> +	dispctrl &= ~(SCALER_DISPCTRL_DSPEISLUR(0) |
> +		      SCALER_DISPCTRL_DSPEISLUR(1) |
> +		      SCALER_DISPCTRL_DSPEISLUR(2));
> +
> +	HVS_WRITE(SCALER_DISPCTRL, dispctrl);
> +}
> +
> +void vc4_hvs_unmask_underrun(struct drm_device *dev)
> +{
> +	struct vc4_dev *vc4 = to_vc4_dev(dev);
> +	u32 dispctrl = HVS_READ(SCALER_DISPCTRL);
> +
> +	dispctrl |= SCALER_DISPCTRL_DSPEISLUR(0) |
> +		    SCALER_DISPCTRL_DSPEISLUR(1) |
> +		    SCALER_DISPCTRL_DSPEISLUR(2);
> +
> +	HVS_WRITE(SCALER_DISPSTAT,
> +		  SCALER_DISPSTAT_EUFLOW(0) |
> +		  SCALER_DISPSTAT_EUFLOW(1) |
> +		  SCALER_DISPSTAT_EUFLOW(2));
> +	HVS_WRITE(SCALER_DISPCTRL, dispctrl);
> +}
> +
> +static void vc4_hvs_report_underrun(struct drm_device *dev)
> +{
> +	struct vc4_dev *vc4 = to_vc4_dev(dev);
> +
> +	atomic_inc(&vc4->underrun);
> +	DRM_DEV_ERROR(dev->dev, "HVS underrun\n");
> +}
> +
> +static irqreturn_t vc4_hvs_irq_handler(int irq, void *data)
> +{
> +	struct drm_device *dev = data;
> +	struct vc4_dev *vc4 = to_vc4_dev(dev);
> +	irqreturn_t irqret = IRQ_NONE;
> +	u32 status;
> +
> +	status = HVS_READ(SCALER_DISPSTAT);
> +
> +	if (status & (SCALER_DISPSTAT_EUFLOW(0) |
> +		      SCALER_DISPSTAT_EUFLOW(1) |
> +		      SCALER_DISPSTAT_EUFLOW(2))) {
> +		vc4_hvs_mask_underrun(dev);
> +		vc4_hvs_report_underrun(dev);
> +
> +		irqret = IRQ_HANDLED;
> +	}
> +
> +	HVS_WRITE(SCALER_DISPSTAT, SCALER_DISPSTAT_IRQMASK(0) |
> +				   SCALER_DISPSTAT_IRQMASK(1) |
> +				   SCALER_DISPSTAT_IRQMASK(2));
> +
> +	return irqret;
> +}
> +
>  static int vc4_hvs_bind(struct device *dev, struct device *master, void *data)
>  {
>  	struct platform_device *pdev = to_platform_device(dev);
> @@ -219,15 +293,33 @@ static int vc4_hvs_bind(struct device *dev, struct device *master, void *data)
>  	dispctrl = HVS_READ(SCALER_DISPCTRL);
>  
>  	dispctrl |= SCALER_DISPCTRL_ENABLE;
> +	dispctrl |= SCALER_DISPCTRL_DSPEISLUR(0) | SCALER_DISPCTRL_DISPEIRQ(0) |
> +		    SCALER_DISPCTRL_DSPEISLUR(1) | SCALER_DISPCTRL_DISPEIRQ(1) |
> +		    SCALER_DISPCTRL_DSPEISLUR(2) | SCALER_DISPCTRL_DISPEIRQ(2);
>  
>  	/* Set DSP3 (PV1) to use HVS channel 2, which would otherwise
>  	 * be unused.
>  	 */
>  	dispctrl &= ~SCALER_DISPCTRL_DSP3_MUX_MASK;
> +	dispctrl &= ~(SCALER_DISPCTRL_DMAEIRQ |
> +		      SCALER_DISPCTRL_SLVWREIRQ |
> +		      SCALER_DISPCTRL_SLVRDEIRQ |
> +		      SCALER_DISPCTRL_DSPEIEOF(0) |
> +		      SCALER_DISPCTRL_DSPEIEOF(1) |
> +		      SCALER_DISPCTRL_DSPEIEOF(2) |
> +		      SCALER_DISPCTRL_DSPEIEOLN(0) |
> +		      SCALER_DISPCTRL_DSPEIEOLN(1) |
> +		      SCALER_DISPCTRL_DSPEIEOLN(2) |
> +		      SCALER_DISPCTRL_SCLEIRQ);
>  	dispctrl |= VC4_SET_FIELD(2, SCALER_DISPCTRL_DSP3_MUX);
>  
>  	HVS_WRITE(SCALER_DISPCTRL, dispctrl);
>  
> +	ret = devm_request_irq(dev, platform_get_irq(pdev, 0),
> +			       vc4_hvs_irq_handler, 0, "vc4 hvs", drm);
> +	if (ret)
> +		return ret;
> +
>  	return 0;
>  }
>  
> diff --git a/drivers/gpu/drm/vc4/vc4_kms.c b/drivers/gpu/drm/vc4/vc4_kms.c
> index 91b8c72ff361..3c87fbcd0b17 100644
> --- a/drivers/gpu/drm/vc4/vc4_kms.c
> +++ b/drivers/gpu/drm/vc4/vc4_kms.c
> @@ -139,6 +139,8 @@ vc4_atomic_complete_commit(struct drm_atomic_state *state)
>  	struct drm_device *dev = state->dev;
>  	struct vc4_dev *vc4 = to_vc4_dev(dev);
>  
> +	vc4_hvs_mask_underrun(dev);
> +
>  	drm_atomic_helper_wait_for_fences(dev, state, false);
>  
>  	drm_atomic_helper_wait_for_dependencies(state);
> @@ -155,6 +157,8 @@ vc4_atomic_complete_commit(struct drm_atomic_state *state)
>  
>  	drm_atomic_helper_commit_hw_done(state);
>  
> +	vc4_hvs_unmask_underrun(dev);
> +
>  	drm_atomic_helper_wait_for_flip_done(dev, state);
>  
>  	drm_atomic_helper_cleanup_planes(dev, state);

I think the mask/unmask in here could race against another atomic_commit
happening on another CRTC in parallel.  I guess maybe we should mask off
interrupts on the specific channel being modified.

However, given that we're just talking about a debugfs entry and
user-space testing tool, I'm fine with this as-is.

Other than my concern for patch #4, patch 1-3 are:

Reviewed-by: Eric Anholt <eric@anholt.net>

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 832 bytes --]

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH v4 3/4] drm/vc4: Detect and ignore underruns caused by out-of-sync dlists
  2019-02-06 23:51     ` Eric Anholt
  (?)
@ 2019-02-20 14:30     ` Paul Kocialkowski
  -1 siblings, 0 replies; 11+ messages in thread
From: Paul Kocialkowski @ 2019-02-20 14:30 UTC (permalink / raw)
  To: Eric Anholt, dri-devel, linux-kernel
  Cc: David Airlie, Daniel Vetter, Maxime Ripard, Thomas Petazzoni, Eben Upton

Hi,

On Wed, 2019-02-06 at 15:51 -0800, Eric Anholt wrote:
> Paul Kocialkowski <paul.kocialkowski@bootlin.com> writes:
> 
> > When the pipeline is reconfigured with a different mode, changes take
> > effect immediately for the CRTC and encoder while the HVS takes some
> > time to switch the active display list. This results in a period of
> > time where the pipeline is out of sync, that is very likely to cause
> > an underrun to be reported. Because the underrun is not related to the
> > new configuration, reporting it to userspace is a false positive.
> 
> This seems like a serious issue.  How are we enabling a CRTC with the
> corresponding HVS still scanning out old contents?  Did we need to wait
> for HVS to finish its old frame when we turned off the CRTC, so it's
> ready to receive the START when it's been set up with the new dlist and
> the CRTC is turned back on?  Or maybe do some sort of reset of that
> dlist when a crtc is being enabled?

Yes this has definitely been quite a burden. I have already tried
waiting for the end of frame before disabling the crtc, as well as
playing with various other bits and FIFO reset (see the "HDMI mode
reconfiguration issue mitigation" thread). Alas, nothing helped and I
moved on to implementing a workaround after you said not to block on
this particular point.

> If we can't sort that out, it feels to me like we should be enabling the
> interrupts from the flip_done path (when we know that the HVS is
> scanning out the new frame) instead of trying to mitigate enabling them
> too early.

Thanks for the hint! I looked into this solution and it seems viable,
although I found out that interrupt masking is not always honored by
the hardware. With some extra care, it seems to be working reliably.

I'll send a new iteration implementing the workaround this way.

Cheers,

Paul

-- 
Paul Kocialkowski, Bootlin
Embedded Linux and kernel engineering
https://bootlin.com


^ permalink raw reply	[flat|nested] 11+ messages in thread

end of thread, other threads:[~2019-02-20 14:30 UTC | newest]

Thread overview: 11+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-02-06 14:49 [PATCH v4 0/4] drm/vc4: Add a load tracker Paul Kocialkowski
2019-02-06 14:49 ` [PATCH v4 1/4] drm/vc4: Report HVS underrun errors Paul Kocialkowski
2019-02-06 23:51   ` Eric Anholt
2019-02-06 23:51     ` Eric Anholt
2019-02-06 14:49 ` [PATCH v4 2/4] drm/vc4: Add a load tracker to prevent HVS underflow errors Paul Kocialkowski
2019-02-06 14:49 ` [PATCH v4 3/4] drm/vc4: Detect and ignore underruns caused by out-of-sync dlists Paul Kocialkowski
2019-02-06 14:49   ` Paul Kocialkowski
2019-02-06 23:51   ` Eric Anholt
2019-02-06 23:51     ` Eric Anholt
2019-02-20 14:30     ` Paul Kocialkowski
2019-02-06 14:49 ` [PATCH v4 4/4] drm/vc4: Add a debugfs entry to disable/enable the load tracker Paul Kocialkowski

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.