All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH v2 0/4] Add support for GPU load values
@ 2022-06-21  7:20 ` Christian Gmeiner
  0 siblings, 0 replies; 22+ messages in thread
From: Christian Gmeiner @ 2022-06-21  7:20 UTC (permalink / raw)
  To: linux-kernel
  Cc: Christian Gmeiner, Lucas Stach, Russell King, David Airlie,
	Daniel Vetter, moderated list:DRM DRIVERS FOR VIVANTE GPU IP,
	open list:DRM DRIVERS FOR VIVANTE GPU IP

This patch series add support for loadavg values for GPU
sub-components. I am adding a SMA algorithm as I was not
really sure if EWMA would be a good fit for this use case.

Changes v2:
 - Addressed feedback from Lucas

Christian Gmeiner (4):
  drm/etnaviv: add simple moving average (SMA)
  drm/etnaviv: add loadavg accounting
  drm/etnaviv: show loadavg in debugfs
  drm/etnaviv: export loadavg via perfmon

 drivers/gpu/drm/etnaviv/etnaviv_drv.c     | 14 ++++
 drivers/gpu/drm/etnaviv/etnaviv_gpu.c     | 76 +++++++++++++++++++++-
 drivers/gpu/drm/etnaviv/etnaviv_gpu.h     | 37 +++++++++++
 drivers/gpu/drm/etnaviv/etnaviv_perfmon.c | 79 +++++++++++++++++++++++
 drivers/gpu/drm/etnaviv/etnaviv_sma.h     | 53 +++++++++++++++
 5 files changed, 257 insertions(+), 2 deletions(-)
 create mode 100644 drivers/gpu/drm/etnaviv/etnaviv_sma.h

-- 
2.36.1


^ permalink raw reply	[flat|nested] 22+ messages in thread

* [PATCH v2 0/4] Add support for GPU load values
@ 2022-06-21  7:20 ` Christian Gmeiner
  0 siblings, 0 replies; 22+ messages in thread
From: Christian Gmeiner @ 2022-06-21  7:20 UTC (permalink / raw)
  To: linux-kernel
  Cc: David Airlie, moderated list:DRM DRIVERS FOR VIVANTE GPU IP,
	open list:DRM DRIVERS FOR VIVANTE GPU IP, Russell King

This patch series add support for loadavg values for GPU
sub-components. I am adding a SMA algorithm as I was not
really sure if EWMA would be a good fit for this use case.

Changes v2:
 - Addressed feedback from Lucas

Christian Gmeiner (4):
  drm/etnaviv: add simple moving average (SMA)
  drm/etnaviv: add loadavg accounting
  drm/etnaviv: show loadavg in debugfs
  drm/etnaviv: export loadavg via perfmon

 drivers/gpu/drm/etnaviv/etnaviv_drv.c     | 14 ++++
 drivers/gpu/drm/etnaviv/etnaviv_gpu.c     | 76 +++++++++++++++++++++-
 drivers/gpu/drm/etnaviv/etnaviv_gpu.h     | 37 +++++++++++
 drivers/gpu/drm/etnaviv/etnaviv_perfmon.c | 79 +++++++++++++++++++++++
 drivers/gpu/drm/etnaviv/etnaviv_sma.h     | 53 +++++++++++++++
 5 files changed, 257 insertions(+), 2 deletions(-)
 create mode 100644 drivers/gpu/drm/etnaviv/etnaviv_sma.h

-- 
2.36.1


^ permalink raw reply	[flat|nested] 22+ messages in thread

* [PATCH v2 1/4] drm/etnaviv: add simple moving average (SMA)
  2022-06-21  7:20 ` Christian Gmeiner
@ 2022-06-21  7:20   ` Christian Gmeiner
  -1 siblings, 0 replies; 22+ messages in thread
From: Christian Gmeiner @ 2022-06-21  7:20 UTC (permalink / raw)
  To: linux-kernel
  Cc: Christian Gmeiner, Lucas Stach, Russell King, David Airlie,
	Daniel Vetter, moderated list:DRM DRIVERS FOR VIVANTE GPU IP,
	open list:DRM DRIVERS FOR VIVANTE GPU IP

This adds a SMA algorithm inspired by Exponentially weighted moving
average (EWMA) algorithm found in the kernel.

Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
---
 drivers/gpu/drm/etnaviv/etnaviv_sma.h | 53 +++++++++++++++++++++++++++
 1 file changed, 53 insertions(+)
 create mode 100644 drivers/gpu/drm/etnaviv/etnaviv_sma.h

diff --git a/drivers/gpu/drm/etnaviv/etnaviv_sma.h b/drivers/gpu/drm/etnaviv/etnaviv_sma.h
new file mode 100644
index 000000000000..81564d5cbdc3
--- /dev/null
+++ b/drivers/gpu/drm/etnaviv/etnaviv_sma.h
@@ -0,0 +1,53 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * Copyright (C) 2020 Etnaviv Project
+ */
+
+#ifndef __ETNAVIV_SMA_H__
+#define __ETNAVIV_SMA_H__
+
+#include <linux/bug.h>
+#include <linux/compiler.h>
+
+/*
+ * Simple moving average (SMA)
+ *
+ * This implements a fixed-size SMA algorithm.
+ *
+ * The first argument to the macro is the name that will be used
+ * for the struct and helper functions.
+ *
+ * The second argument, the samples, expresses how many samples are
+ * used for the SMA algorithm.
+ */
+
+#define DECLARE_SMA(name, _samples) \
+    struct sma_##name { \
+        unsigned long pos; \
+        unsigned long sum; \
+        unsigned long samples[_samples]; \
+    }; \
+    static inline void sma_##name##_init(struct sma_##name *s) \
+    { \
+        BUILD_BUG_ON(!__builtin_constant_p(_samples));	\
+        memset(s, 0, sizeof(struct sma_##name)); \
+    } \
+    static inline unsigned long sma_##name##_read(struct sma_##name *s) \
+    { \
+        BUILD_BUG_ON(!__builtin_constant_p(_samples));	\
+        return s->sum / _samples; \
+    } \
+    static inline void sma_##name##_add(struct sma_##name *s, unsigned long val) \
+    { \
+        unsigned long pos = READ_ONCE(s->pos); \
+        unsigned long sum = READ_ONCE(s->sum); \
+        unsigned long sample = READ_ONCE(s->samples[pos]); \
+      \
+        BUILD_BUG_ON(!__builtin_constant_p(_samples));	\
+      \
+       WRITE_ONCE(s->sum, sum - sample + val); \
+       WRITE_ONCE(s->samples[pos], val); \
+       WRITE_ONCE(s->pos, pos + 1 == _samples ? 0 : pos + 1); \
+    }
+
+#endif /* __ETNAVIV_SMA_H__ */
-- 
2.36.1


^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [PATCH v2 1/4] drm/etnaviv: add simple moving average (SMA)
@ 2022-06-21  7:20   ` Christian Gmeiner
  0 siblings, 0 replies; 22+ messages in thread
From: Christian Gmeiner @ 2022-06-21  7:20 UTC (permalink / raw)
  To: linux-kernel
  Cc: David Airlie, moderated list:DRM DRIVERS FOR VIVANTE GPU IP,
	open list:DRM DRIVERS FOR VIVANTE GPU IP, Russell King

This adds a SMA algorithm inspired by Exponentially weighted moving
average (EWMA) algorithm found in the kernel.

Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
---
 drivers/gpu/drm/etnaviv/etnaviv_sma.h | 53 +++++++++++++++++++++++++++
 1 file changed, 53 insertions(+)
 create mode 100644 drivers/gpu/drm/etnaviv/etnaviv_sma.h

diff --git a/drivers/gpu/drm/etnaviv/etnaviv_sma.h b/drivers/gpu/drm/etnaviv/etnaviv_sma.h
new file mode 100644
index 000000000000..81564d5cbdc3
--- /dev/null
+++ b/drivers/gpu/drm/etnaviv/etnaviv_sma.h
@@ -0,0 +1,53 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * Copyright (C) 2020 Etnaviv Project
+ */
+
+#ifndef __ETNAVIV_SMA_H__
+#define __ETNAVIV_SMA_H__
+
+#include <linux/bug.h>
+#include <linux/compiler.h>
+
+/*
+ * Simple moving average (SMA)
+ *
+ * This implements a fixed-size SMA algorithm.
+ *
+ * The first argument to the macro is the name that will be used
+ * for the struct and helper functions.
+ *
+ * The second argument, the samples, expresses how many samples are
+ * used for the SMA algorithm.
+ */
+
+#define DECLARE_SMA(name, _samples) \
+    struct sma_##name { \
+        unsigned long pos; \
+        unsigned long sum; \
+        unsigned long samples[_samples]; \
+    }; \
+    static inline void sma_##name##_init(struct sma_##name *s) \
+    { \
+        BUILD_BUG_ON(!__builtin_constant_p(_samples));	\
+        memset(s, 0, sizeof(struct sma_##name)); \
+    } \
+    static inline unsigned long sma_##name##_read(struct sma_##name *s) \
+    { \
+        BUILD_BUG_ON(!__builtin_constant_p(_samples));	\
+        return s->sum / _samples; \
+    } \
+    static inline void sma_##name##_add(struct sma_##name *s, unsigned long val) \
+    { \
+        unsigned long pos = READ_ONCE(s->pos); \
+        unsigned long sum = READ_ONCE(s->sum); \
+        unsigned long sample = READ_ONCE(s->samples[pos]); \
+      \
+        BUILD_BUG_ON(!__builtin_constant_p(_samples));	\
+      \
+       WRITE_ONCE(s->sum, sum - sample + val); \
+       WRITE_ONCE(s->samples[pos], val); \
+       WRITE_ONCE(s->pos, pos + 1 == _samples ? 0 : pos + 1); \
+    }
+
+#endif /* __ETNAVIV_SMA_H__ */
-- 
2.36.1


^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [PATCH v2 2/4] drm/etnaviv: add loadavg accounting
  2022-06-21  7:20 ` Christian Gmeiner
@ 2022-06-21  7:20   ` Christian Gmeiner
  -1 siblings, 0 replies; 22+ messages in thread
From: Christian Gmeiner @ 2022-06-21  7:20 UTC (permalink / raw)
  To: linux-kernel
  Cc: Christian Gmeiner, Lucas Stach, Russell King, David Airlie,
	Daniel Vetter, moderated list:DRM DRIVERS FOR VIVANTE GPU IP,
	open list:DRM DRIVERS FOR VIVANTE GPU IP

The GPU has an idle state register where each bit represents the idle
state of a sub-GPU component like FE or TX. Sample this register
every 10ms and calculate a simple moving average over the sub-GPU
component idle states with a total observation time frame of 1s.

This provides us with a percentage based load of each sub-GPU
component.

Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
---
 drivers/gpu/drm/etnaviv/etnaviv_drv.c | 14 ++++++
 drivers/gpu/drm/etnaviv/etnaviv_gpu.c | 64 ++++++++++++++++++++++++++-
 drivers/gpu/drm/etnaviv/etnaviv_gpu.h | 37 ++++++++++++++++
 3 files changed, 114 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/etnaviv/etnaviv_drv.c b/drivers/gpu/drm/etnaviv/etnaviv_drv.c
index 1d2b4fb4bcf8..d5c6115e56bd 100644
--- a/drivers/gpu/drm/etnaviv/etnaviv_drv.c
+++ b/drivers/gpu/drm/etnaviv/etnaviv_drv.c
@@ -46,6 +46,19 @@ static void load_gpu(struct drm_device *dev)
 	}
 }
 
+static void unload_gpu(struct drm_device *dev)
+{
+	struct etnaviv_drm_private *priv = dev->dev_private;
+	unsigned int i;
+
+	for (i = 0; i < ETNA_MAX_PIPES; i++) {
+		struct etnaviv_gpu *g = priv->gpu[i];
+
+		if (g)
+			etnaviv_gpu_shutdown(g);
+	}
+}
+
 static int etnaviv_open(struct drm_device *dev, struct drm_file *file)
 {
 	struct etnaviv_drm_private *priv = dev->dev_private;
@@ -557,6 +570,7 @@ static void etnaviv_unbind(struct device *dev)
 	struct drm_device *drm = dev_get_drvdata(dev);
 	struct etnaviv_drm_private *priv = drm->dev_private;
 
+	unload_gpu(drm);
 	drm_dev_unregister(drm);
 
 	component_unbind_all(dev, drm);
diff --git a/drivers/gpu/drm/etnaviv/etnaviv_gpu.c b/drivers/gpu/drm/etnaviv/etnaviv_gpu.c
index 37018bc55810..202002ae75ee 100644
--- a/drivers/gpu/drm/etnaviv/etnaviv_gpu.c
+++ b/drivers/gpu/drm/etnaviv/etnaviv_gpu.c
@@ -27,6 +27,8 @@
 #include "state_hi.xml.h"
 #include "cmdstream.xml.h"
 
+static const ktime_t loadavg_polling_frequency = 10 * NSEC_PER_MSEC;
+
 static const struct platform_device_id gpu_ids[] = {
 	{ .name = "etnaviv-gpu,2d" },
 	{ },
@@ -745,6 +747,32 @@ static void etnaviv_gpu_hw_init(struct etnaviv_gpu *gpu)
 	gpu_write(gpu, VIVS_HI_INTR_ENBL, ~0U);
 }
 
+static enum hrtimer_restart etnaviv_loadavg_function(struct hrtimer *t)
+{
+	struct etnaviv_gpu *gpu = container_of(t, struct etnaviv_gpu, loadavg_timer);
+	const u32 idle = gpu_read(gpu, VIVS_HI_IDLE_STATE);
+	int i;
+
+	gpu->loadavg_last_sample_time = ktime_get();
+
+	for (i = 0; i < ARRAY_SIZE(etna_idle_module_names); i++)
+		if ((idle & etna_idle_module_names[i].bit))
+			sma_loadavg_add(&gpu->loadavg_value[i], 0);
+		else
+			sma_loadavg_add(&gpu->loadavg_value[i], 100);
+
+	spin_lock(&gpu->loadavg_spinlock);
+
+	for (i = 0; i < ARRAY_SIZE(etna_idle_module_names); i++)
+		gpu->loadavg_percentage[i] = sma_loadavg_read(&gpu->loadavg_value[i]);
+
+	spin_unlock(&gpu->loadavg_spinlock);
+
+	hrtimer_forward_now(t, loadavg_polling_frequency);
+
+	return HRTIMER_RESTART;
+}
+
 int etnaviv_gpu_init(struct etnaviv_gpu *gpu)
 {
 	struct etnaviv_drm_private *priv = gpu->drm->dev_private;
@@ -839,6 +867,11 @@ int etnaviv_gpu_init(struct etnaviv_gpu *gpu)
 	for (i = 0; i < ARRAY_SIZE(gpu->event); i++)
 		complete(&gpu->event_free);
 
+	/* Setup loadavg timer */
+	hrtimer_init(&gpu->loadavg_timer, CLOCK_MONOTONIC, HRTIMER_MODE_ABS_SOFT);
+	gpu->loadavg_timer.function = etnaviv_loadavg_function;
+	hrtimer_start(&gpu->loadavg_timer, loadavg_polling_frequency, HRTIMER_MODE_ABS_SOFT);
+
 	/* Now program the hardware */
 	mutex_lock(&gpu->lock);
 	etnaviv_gpu_hw_init(gpu);
@@ -859,6 +892,11 @@ int etnaviv_gpu_init(struct etnaviv_gpu *gpu)
 	return ret;
 }
 
+void etnaviv_gpu_shutdown(struct etnaviv_gpu *gpu)
+{
+	hrtimer_cancel(&gpu->loadavg_timer);
+}
+
 #ifdef CONFIG_DEBUG_FS
 struct dma_debug {
 	u32 address[2];
@@ -1585,6 +1623,8 @@ int etnaviv_gpu_wait_idle(struct etnaviv_gpu *gpu, unsigned int timeout_ms)
 static int etnaviv_gpu_hw_suspend(struct etnaviv_gpu *gpu)
 {
 	if (gpu->initialized && gpu->fe_running) {
+		hrtimer_cancel(&gpu->loadavg_timer);
+
 		/* Replace the last WAIT with END */
 		mutex_lock(&gpu->lock);
 		etnaviv_buffer_end(gpu);
@@ -1608,7 +1648,8 @@ static int etnaviv_gpu_hw_suspend(struct etnaviv_gpu *gpu)
 #ifdef CONFIG_PM
 static int etnaviv_gpu_hw_resume(struct etnaviv_gpu *gpu)
 {
-	int ret;
+	s64 missing_samples;
+	int ret, i, j;
 
 	ret = mutex_lock_killable(&gpu->lock);
 	if (ret)
@@ -1617,7 +1658,27 @@ static int etnaviv_gpu_hw_resume(struct etnaviv_gpu *gpu)
 	etnaviv_gpu_update_clock(gpu);
 	etnaviv_gpu_hw_init(gpu);
 
+	/* Update loadavg based on delta of suspend and resume ktime.
+	 *
+	 * Our SMA algorithm uses a fixed size of 100 items to be able
+	 * to calculate the mean over one second as we sample every 10ms.
+	 */
+	missing_samples = div_s64(ktime_ms_delta(ktime_get(), gpu->loadavg_last_sample_time), 10);
+	missing_samples = min(missing_samples, (s64)100);
+
+	spin_lock_bh(&gpu->loadavg_spinlock);
+
+	for (i = 0; i < ARRAY_SIZE(etna_idle_module_names); i++) {
+		struct sma_loadavg *loadavg = &gpu->loadavg_value[i];
+
+		for (j = 0; j < missing_samples; j++)
+			sma_loadavg_add(loadavg, 0);
+	}
+
+	spin_unlock_bh(&gpu->loadavg_spinlock);
+
 	mutex_unlock(&gpu->lock);
+	hrtimer_start(&gpu->loadavg_timer, loadavg_polling_frequency, HRTIMER_MODE_ABS_SOFT);
 
 	return 0;
 }
@@ -1787,6 +1848,7 @@ static int etnaviv_gpu_platform_probe(struct platform_device *pdev)
 	gpu->dev = &pdev->dev;
 	mutex_init(&gpu->lock);
 	mutex_init(&gpu->fence_lock);
+	spin_lock_init(&gpu->loadavg_spinlock);
 
 	/* Map registers: */
 	gpu->mmio = devm_platform_ioremap_resource(pdev, 0);
diff --git a/drivers/gpu/drm/etnaviv/etnaviv_gpu.h b/drivers/gpu/drm/etnaviv/etnaviv_gpu.h
index 85eddd492774..881f071f640e 100644
--- a/drivers/gpu/drm/etnaviv/etnaviv_gpu.h
+++ b/drivers/gpu/drm/etnaviv/etnaviv_gpu.h
@@ -10,6 +10,8 @@
 #include "etnaviv_gem.h"
 #include "etnaviv_mmu.h"
 #include "etnaviv_drv.h"
+#include "etnaviv_sma.h"
+#include "state_hi.xml.h"
 
 struct etnaviv_gem_submit;
 struct etnaviv_vram_mapping;
@@ -91,6 +93,33 @@ struct clk;
 
 #define ETNA_NR_EVENTS 30
 
+DECLARE_SMA(loadavg, 100)
+
+static const struct {
+    const char *name;
+    u32 bit;
+} etna_idle_module_names[] = {
+    { "FE", VIVS_HI_IDLE_STATE_FE },
+    { "DE", VIVS_HI_IDLE_STATE_DE },
+    { "PE", VIVS_HI_IDLE_STATE_PE },
+    { "SH", VIVS_HI_IDLE_STATE_SH },
+    { "PA", VIVS_HI_IDLE_STATE_PA },
+    { "SE", VIVS_HI_IDLE_STATE_SE },
+    { "RA", VIVS_HI_IDLE_STATE_RA },
+    { "TX", VIVS_HI_IDLE_STATE_TX },
+    { "VG", VIVS_HI_IDLE_STATE_VG },
+    { "IM", VIVS_HI_IDLE_STATE_IM },
+    { "FP", VIVS_HI_IDLE_STATE_FP },
+    { "TS", VIVS_HI_IDLE_STATE_TS },
+    { "BL", VIVS_HI_IDLE_STATE_BL },
+    { "ASYNCFE", VIVS_HI_IDLE_STATE_ASYNCFE },
+    { "MC", VIVS_HI_IDLE_STATE_MC },
+    { "PPA", VIVS_HI_IDLE_STATE_PPA },
+    { "WD", VIVS_HI_IDLE_STATE_WD },
+    { "NN", VIVS_HI_IDLE_STATE_NN },
+    { "TP", VIVS_HI_IDLE_STATE_TP },
+};
+
 struct etnaviv_gpu {
 	struct drm_device *drm;
 	struct thermal_cooling_device *cooling;
@@ -147,6 +176,13 @@ struct etnaviv_gpu {
 	unsigned int freq_scale;
 	unsigned long base_rate_core;
 	unsigned long base_rate_shader;
+
+	/* Loadavg: */
+	struct hrtimer loadavg_timer;
+	spinlock_t loadavg_spinlock;
+	ktime_t loadavg_last_sample_time;
+	struct sma_loadavg loadavg_value[ARRAY_SIZE(etna_idle_module_names)];
+	unsigned int loadavg_percentage[ARRAY_SIZE(etna_idle_module_names)];
 };
 
 static inline void gpu_write(struct etnaviv_gpu *gpu, u32 reg, u32 data)
@@ -162,6 +198,7 @@ static inline u32 gpu_read(struct etnaviv_gpu *gpu, u32 reg)
 int etnaviv_gpu_get_param(struct etnaviv_gpu *gpu, u32 param, u64 *value);
 
 int etnaviv_gpu_init(struct etnaviv_gpu *gpu);
+void etnaviv_gpu_shutdown(struct etnaviv_gpu *gpu);
 bool etnaviv_fill_identity_from_hwdb(struct etnaviv_gpu *gpu);
 
 #ifdef CONFIG_DEBUG_FS
-- 
2.36.1


^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [PATCH v2 2/4] drm/etnaviv: add loadavg accounting
@ 2022-06-21  7:20   ` Christian Gmeiner
  0 siblings, 0 replies; 22+ messages in thread
From: Christian Gmeiner @ 2022-06-21  7:20 UTC (permalink / raw)
  To: linux-kernel
  Cc: David Airlie, moderated list:DRM DRIVERS FOR VIVANTE GPU IP,
	open list:DRM DRIVERS FOR VIVANTE GPU IP, Russell King

The GPU has an idle state register where each bit represents the idle
state of a sub-GPU component like FE or TX. Sample this register
every 10ms and calculate a simple moving average over the sub-GPU
component idle states with a total observation time frame of 1s.

This provides us with a percentage based load of each sub-GPU
component.

Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
---
 drivers/gpu/drm/etnaviv/etnaviv_drv.c | 14 ++++++
 drivers/gpu/drm/etnaviv/etnaviv_gpu.c | 64 ++++++++++++++++++++++++++-
 drivers/gpu/drm/etnaviv/etnaviv_gpu.h | 37 ++++++++++++++++
 3 files changed, 114 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/etnaviv/etnaviv_drv.c b/drivers/gpu/drm/etnaviv/etnaviv_drv.c
index 1d2b4fb4bcf8..d5c6115e56bd 100644
--- a/drivers/gpu/drm/etnaviv/etnaviv_drv.c
+++ b/drivers/gpu/drm/etnaviv/etnaviv_drv.c
@@ -46,6 +46,19 @@ static void load_gpu(struct drm_device *dev)
 	}
 }
 
+static void unload_gpu(struct drm_device *dev)
+{
+	struct etnaviv_drm_private *priv = dev->dev_private;
+	unsigned int i;
+
+	for (i = 0; i < ETNA_MAX_PIPES; i++) {
+		struct etnaviv_gpu *g = priv->gpu[i];
+
+		if (g)
+			etnaviv_gpu_shutdown(g);
+	}
+}
+
 static int etnaviv_open(struct drm_device *dev, struct drm_file *file)
 {
 	struct etnaviv_drm_private *priv = dev->dev_private;
@@ -557,6 +570,7 @@ static void etnaviv_unbind(struct device *dev)
 	struct drm_device *drm = dev_get_drvdata(dev);
 	struct etnaviv_drm_private *priv = drm->dev_private;
 
+	unload_gpu(drm);
 	drm_dev_unregister(drm);
 
 	component_unbind_all(dev, drm);
diff --git a/drivers/gpu/drm/etnaviv/etnaviv_gpu.c b/drivers/gpu/drm/etnaviv/etnaviv_gpu.c
index 37018bc55810..202002ae75ee 100644
--- a/drivers/gpu/drm/etnaviv/etnaviv_gpu.c
+++ b/drivers/gpu/drm/etnaviv/etnaviv_gpu.c
@@ -27,6 +27,8 @@
 #include "state_hi.xml.h"
 #include "cmdstream.xml.h"
 
+static const ktime_t loadavg_polling_frequency = 10 * NSEC_PER_MSEC;
+
 static const struct platform_device_id gpu_ids[] = {
 	{ .name = "etnaviv-gpu,2d" },
 	{ },
@@ -745,6 +747,32 @@ static void etnaviv_gpu_hw_init(struct etnaviv_gpu *gpu)
 	gpu_write(gpu, VIVS_HI_INTR_ENBL, ~0U);
 }
 
+static enum hrtimer_restart etnaviv_loadavg_function(struct hrtimer *t)
+{
+	struct etnaviv_gpu *gpu = container_of(t, struct etnaviv_gpu, loadavg_timer);
+	const u32 idle = gpu_read(gpu, VIVS_HI_IDLE_STATE);
+	int i;
+
+	gpu->loadavg_last_sample_time = ktime_get();
+
+	for (i = 0; i < ARRAY_SIZE(etna_idle_module_names); i++)
+		if ((idle & etna_idle_module_names[i].bit))
+			sma_loadavg_add(&gpu->loadavg_value[i], 0);
+		else
+			sma_loadavg_add(&gpu->loadavg_value[i], 100);
+
+	spin_lock(&gpu->loadavg_spinlock);
+
+	for (i = 0; i < ARRAY_SIZE(etna_idle_module_names); i++)
+		gpu->loadavg_percentage[i] = sma_loadavg_read(&gpu->loadavg_value[i]);
+
+	spin_unlock(&gpu->loadavg_spinlock);
+
+	hrtimer_forward_now(t, loadavg_polling_frequency);
+
+	return HRTIMER_RESTART;
+}
+
 int etnaviv_gpu_init(struct etnaviv_gpu *gpu)
 {
 	struct etnaviv_drm_private *priv = gpu->drm->dev_private;
@@ -839,6 +867,11 @@ int etnaviv_gpu_init(struct etnaviv_gpu *gpu)
 	for (i = 0; i < ARRAY_SIZE(gpu->event); i++)
 		complete(&gpu->event_free);
 
+	/* Setup loadavg timer */
+	hrtimer_init(&gpu->loadavg_timer, CLOCK_MONOTONIC, HRTIMER_MODE_ABS_SOFT);
+	gpu->loadavg_timer.function = etnaviv_loadavg_function;
+	hrtimer_start(&gpu->loadavg_timer, loadavg_polling_frequency, HRTIMER_MODE_ABS_SOFT);
+
 	/* Now program the hardware */
 	mutex_lock(&gpu->lock);
 	etnaviv_gpu_hw_init(gpu);
@@ -859,6 +892,11 @@ int etnaviv_gpu_init(struct etnaviv_gpu *gpu)
 	return ret;
 }
 
+void etnaviv_gpu_shutdown(struct etnaviv_gpu *gpu)
+{
+	hrtimer_cancel(&gpu->loadavg_timer);
+}
+
 #ifdef CONFIG_DEBUG_FS
 struct dma_debug {
 	u32 address[2];
@@ -1585,6 +1623,8 @@ int etnaviv_gpu_wait_idle(struct etnaviv_gpu *gpu, unsigned int timeout_ms)
 static int etnaviv_gpu_hw_suspend(struct etnaviv_gpu *gpu)
 {
 	if (gpu->initialized && gpu->fe_running) {
+		hrtimer_cancel(&gpu->loadavg_timer);
+
 		/* Replace the last WAIT with END */
 		mutex_lock(&gpu->lock);
 		etnaviv_buffer_end(gpu);
@@ -1608,7 +1648,8 @@ static int etnaviv_gpu_hw_suspend(struct etnaviv_gpu *gpu)
 #ifdef CONFIG_PM
 static int etnaviv_gpu_hw_resume(struct etnaviv_gpu *gpu)
 {
-	int ret;
+	s64 missing_samples;
+	int ret, i, j;
 
 	ret = mutex_lock_killable(&gpu->lock);
 	if (ret)
@@ -1617,7 +1658,27 @@ static int etnaviv_gpu_hw_resume(struct etnaviv_gpu *gpu)
 	etnaviv_gpu_update_clock(gpu);
 	etnaviv_gpu_hw_init(gpu);
 
+	/* Update loadavg based on delta of suspend and resume ktime.
+	 *
+	 * Our SMA algorithm uses a fixed size of 100 items to be able
+	 * to calculate the mean over one second as we sample every 10ms.
+	 */
+	missing_samples = div_s64(ktime_ms_delta(ktime_get(), gpu->loadavg_last_sample_time), 10);
+	missing_samples = min(missing_samples, (s64)100);
+
+	spin_lock_bh(&gpu->loadavg_spinlock);
+
+	for (i = 0; i < ARRAY_SIZE(etna_idle_module_names); i++) {
+		struct sma_loadavg *loadavg = &gpu->loadavg_value[i];
+
+		for (j = 0; j < missing_samples; j++)
+			sma_loadavg_add(loadavg, 0);
+	}
+
+	spin_unlock_bh(&gpu->loadavg_spinlock);
+
 	mutex_unlock(&gpu->lock);
+	hrtimer_start(&gpu->loadavg_timer, loadavg_polling_frequency, HRTIMER_MODE_ABS_SOFT);
 
 	return 0;
 }
@@ -1787,6 +1848,7 @@ static int etnaviv_gpu_platform_probe(struct platform_device *pdev)
 	gpu->dev = &pdev->dev;
 	mutex_init(&gpu->lock);
 	mutex_init(&gpu->fence_lock);
+	spin_lock_init(&gpu->loadavg_spinlock);
 
 	/* Map registers: */
 	gpu->mmio = devm_platform_ioremap_resource(pdev, 0);
diff --git a/drivers/gpu/drm/etnaviv/etnaviv_gpu.h b/drivers/gpu/drm/etnaviv/etnaviv_gpu.h
index 85eddd492774..881f071f640e 100644
--- a/drivers/gpu/drm/etnaviv/etnaviv_gpu.h
+++ b/drivers/gpu/drm/etnaviv/etnaviv_gpu.h
@@ -10,6 +10,8 @@
 #include "etnaviv_gem.h"
 #include "etnaviv_mmu.h"
 #include "etnaviv_drv.h"
+#include "etnaviv_sma.h"
+#include "state_hi.xml.h"
 
 struct etnaviv_gem_submit;
 struct etnaviv_vram_mapping;
@@ -91,6 +93,33 @@ struct clk;
 
 #define ETNA_NR_EVENTS 30
 
+DECLARE_SMA(loadavg, 100)
+
+static const struct {
+    const char *name;
+    u32 bit;
+} etna_idle_module_names[] = {
+    { "FE", VIVS_HI_IDLE_STATE_FE },
+    { "DE", VIVS_HI_IDLE_STATE_DE },
+    { "PE", VIVS_HI_IDLE_STATE_PE },
+    { "SH", VIVS_HI_IDLE_STATE_SH },
+    { "PA", VIVS_HI_IDLE_STATE_PA },
+    { "SE", VIVS_HI_IDLE_STATE_SE },
+    { "RA", VIVS_HI_IDLE_STATE_RA },
+    { "TX", VIVS_HI_IDLE_STATE_TX },
+    { "VG", VIVS_HI_IDLE_STATE_VG },
+    { "IM", VIVS_HI_IDLE_STATE_IM },
+    { "FP", VIVS_HI_IDLE_STATE_FP },
+    { "TS", VIVS_HI_IDLE_STATE_TS },
+    { "BL", VIVS_HI_IDLE_STATE_BL },
+    { "ASYNCFE", VIVS_HI_IDLE_STATE_ASYNCFE },
+    { "MC", VIVS_HI_IDLE_STATE_MC },
+    { "PPA", VIVS_HI_IDLE_STATE_PPA },
+    { "WD", VIVS_HI_IDLE_STATE_WD },
+    { "NN", VIVS_HI_IDLE_STATE_NN },
+    { "TP", VIVS_HI_IDLE_STATE_TP },
+};
+
 struct etnaviv_gpu {
 	struct drm_device *drm;
 	struct thermal_cooling_device *cooling;
@@ -147,6 +176,13 @@ struct etnaviv_gpu {
 	unsigned int freq_scale;
 	unsigned long base_rate_core;
 	unsigned long base_rate_shader;
+
+	/* Loadavg: */
+	struct hrtimer loadavg_timer;
+	spinlock_t loadavg_spinlock;
+	ktime_t loadavg_last_sample_time;
+	struct sma_loadavg loadavg_value[ARRAY_SIZE(etna_idle_module_names)];
+	unsigned int loadavg_percentage[ARRAY_SIZE(etna_idle_module_names)];
 };
 
 static inline void gpu_write(struct etnaviv_gpu *gpu, u32 reg, u32 data)
@@ -162,6 +198,7 @@ static inline u32 gpu_read(struct etnaviv_gpu *gpu, u32 reg)
 int etnaviv_gpu_get_param(struct etnaviv_gpu *gpu, u32 param, u64 *value);
 
 int etnaviv_gpu_init(struct etnaviv_gpu *gpu);
+void etnaviv_gpu_shutdown(struct etnaviv_gpu *gpu);
 bool etnaviv_fill_identity_from_hwdb(struct etnaviv_gpu *gpu);
 
 #ifdef CONFIG_DEBUG_FS
-- 
2.36.1


^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [PATCH v2 3/4] drm/etnaviv: show loadavg in debugfs
  2022-06-21  7:20 ` Christian Gmeiner
@ 2022-06-21  7:20   ` Christian Gmeiner
  -1 siblings, 0 replies; 22+ messages in thread
From: Christian Gmeiner @ 2022-06-21  7:20 UTC (permalink / raw)
  To: linux-kernel
  Cc: Christian Gmeiner, Lucas Stach, Russell King, David Airlie,
	Daniel Vetter, moderated list:DRM DRIVERS FOR VIVANTE GPU IP,
	open list:DRM DRIVERS FOR VIVANTE GPU IP

Might be helpful to see the loadavg in debugfs.

Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
---
 drivers/gpu/drm/etnaviv/etnaviv_gpu.c | 12 +++++++++++-
 1 file changed, 11 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/etnaviv/etnaviv_gpu.c b/drivers/gpu/drm/etnaviv/etnaviv_gpu.c
index 202002ae75ee..113faff9b02e 100644
--- a/drivers/gpu/drm/etnaviv/etnaviv_gpu.c
+++ b/drivers/gpu/drm/etnaviv/etnaviv_gpu.c
@@ -926,7 +926,7 @@ int etnaviv_gpu_debugfs(struct etnaviv_gpu *gpu, struct seq_file *m)
 {
 	struct dma_debug debug;
 	u32 dma_lo, dma_hi, axi, idle;
-	int ret;
+	int ret, i;
 
 	seq_printf(m, "%s Status:\n", dev_name(gpu->dev));
 
@@ -1044,6 +1044,16 @@ int etnaviv_gpu_debugfs(struct etnaviv_gpu *gpu, struct seq_file *m)
 	if (idle & VIVS_HI_IDLE_STATE_AXI_LP)
 		seq_puts(m, "\t AXI low power mode\n");
 
+	seq_printf(m, "\tload:\n");
+	spin_lock_bh(&gpu->loadavg_spinlock);
+
+	for (i = 0; i < ARRAY_SIZE(etna_idle_module_names); i++)
+		seq_printf(m, "\t %s: %u%%\n",
+				  etna_idle_module_names[i].name,
+				  gpu->loadavg_percentage[i]);
+
+	spin_unlock_bh(&gpu->loadavg_spinlock);
+
 	if (gpu->identity.features & chipFeatures_DEBUG_MODE) {
 		u32 read0 = gpu_read(gpu, VIVS_MC_DEBUG_READ0);
 		u32 read1 = gpu_read(gpu, VIVS_MC_DEBUG_READ1);
-- 
2.36.1


^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [PATCH v2 3/4] drm/etnaviv: show loadavg in debugfs
@ 2022-06-21  7:20   ` Christian Gmeiner
  0 siblings, 0 replies; 22+ messages in thread
From: Christian Gmeiner @ 2022-06-21  7:20 UTC (permalink / raw)
  To: linux-kernel
  Cc: David Airlie, moderated list:DRM DRIVERS FOR VIVANTE GPU IP,
	open list:DRM DRIVERS FOR VIVANTE GPU IP, Russell King

Might be helpful to see the loadavg in debugfs.

Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
---
 drivers/gpu/drm/etnaviv/etnaviv_gpu.c | 12 +++++++++++-
 1 file changed, 11 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/etnaviv/etnaviv_gpu.c b/drivers/gpu/drm/etnaviv/etnaviv_gpu.c
index 202002ae75ee..113faff9b02e 100644
--- a/drivers/gpu/drm/etnaviv/etnaviv_gpu.c
+++ b/drivers/gpu/drm/etnaviv/etnaviv_gpu.c
@@ -926,7 +926,7 @@ int etnaviv_gpu_debugfs(struct etnaviv_gpu *gpu, struct seq_file *m)
 {
 	struct dma_debug debug;
 	u32 dma_lo, dma_hi, axi, idle;
-	int ret;
+	int ret, i;
 
 	seq_printf(m, "%s Status:\n", dev_name(gpu->dev));
 
@@ -1044,6 +1044,16 @@ int etnaviv_gpu_debugfs(struct etnaviv_gpu *gpu, struct seq_file *m)
 	if (idle & VIVS_HI_IDLE_STATE_AXI_LP)
 		seq_puts(m, "\t AXI low power mode\n");
 
+	seq_printf(m, "\tload:\n");
+	spin_lock_bh(&gpu->loadavg_spinlock);
+
+	for (i = 0; i < ARRAY_SIZE(etna_idle_module_names); i++)
+		seq_printf(m, "\t %s: %u%%\n",
+				  etna_idle_module_names[i].name,
+				  gpu->loadavg_percentage[i]);
+
+	spin_unlock_bh(&gpu->loadavg_spinlock);
+
 	if (gpu->identity.features & chipFeatures_DEBUG_MODE) {
 		u32 read0 = gpu_read(gpu, VIVS_MC_DEBUG_READ0);
 		u32 read1 = gpu_read(gpu, VIVS_MC_DEBUG_READ1);
-- 
2.36.1


^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [PATCH v2 4/4] drm/etnaviv: export loadavg via perfmon
  2022-06-21  7:20 ` Christian Gmeiner
@ 2022-06-21  7:20   ` Christian Gmeiner
  -1 siblings, 0 replies; 22+ messages in thread
From: Christian Gmeiner @ 2022-06-21  7:20 UTC (permalink / raw)
  To: linux-kernel
  Cc: Christian Gmeiner, Lucas Stach, Russell King, David Airlie,
	Daniel Vetter, moderated list:DRM DRIVERS FOR VIVANTE GPU IP,
	open list:DRM DRIVERS FOR VIVANTE GPU IP

Make it possible to access the sub-GPU component load value from
user space with the perfmon infrastructure.

Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
---
 drivers/gpu/drm/etnaviv/etnaviv_perfmon.c | 79 +++++++++++++++++++++++
 1 file changed, 79 insertions(+)

diff --git a/drivers/gpu/drm/etnaviv/etnaviv_perfmon.c b/drivers/gpu/drm/etnaviv/etnaviv_perfmon.c
index bafdfe49c1d8..d65d9af3a74a 100644
--- a/drivers/gpu/drm/etnaviv/etnaviv_perfmon.c
+++ b/drivers/gpu/drm/etnaviv/etnaviv_perfmon.c
@@ -120,6 +120,19 @@ static u32 hi_total_idle_cycle_read(struct etnaviv_gpu *gpu,
 	return gpu_read(gpu, reg);
 }
 
+static u32 load_read(struct etnaviv_gpu *gpu,
+	const struct etnaviv_pm_domain *domain,
+	const struct etnaviv_pm_signal *signal)
+{
+	u32 load;
+
+	spin_lock_bh(&gpu->loadavg_spinlock);
+	load = gpu->loadavg_percentage[signal->data];
+	spin_unlock_bh(&gpu->loadavg_spinlock);
+
+	return load;
+}
+
 static const struct etnaviv_pm_domain doms_3d[] = {
 	{
 		.name = "HI",
@@ -419,6 +432,72 @@ static const struct etnaviv_pm_domain doms_3d[] = {
 				&perf_reg_read
 			}
 		}
+	},
+	{
+		.name = "LOAD",
+		.nr_signals = 12,
+		.signal = (const struct etnaviv_pm_signal[]) {
+			{
+				"FE",
+				0,
+				&load_read
+			},
+			{
+				"DE",
+				1,
+				&load_read
+			},
+			{
+				"PE",
+				2,
+				&load_read
+			},
+			{
+				"SH",
+				3,
+				&load_read
+			},
+			{
+				"PA",
+				4,
+				&load_read
+			},
+			{
+				"SE",
+				5,
+				&load_read
+			},
+			{
+				"RA",
+				6,
+				&load_read
+			},
+			{
+				"TX",
+				7,
+				&load_read
+			},
+			{
+				"VG",
+				8,
+				&load_read
+			},
+			{
+				"IM",
+				9,
+				&load_read
+			},
+			{
+				"FP",
+				10,
+				&load_read
+			},
+			{
+				"TS",
+				11,
+				&load_read
+			}
+		}
 	}
 };
 
-- 
2.36.1


^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [PATCH v2 4/4] drm/etnaviv: export loadavg via perfmon
@ 2022-06-21  7:20   ` Christian Gmeiner
  0 siblings, 0 replies; 22+ messages in thread
From: Christian Gmeiner @ 2022-06-21  7:20 UTC (permalink / raw)
  To: linux-kernel
  Cc: David Airlie, moderated list:DRM DRIVERS FOR VIVANTE GPU IP,
	open list:DRM DRIVERS FOR VIVANTE GPU IP, Russell King

Make it possible to access the sub-GPU component load value from
user space with the perfmon infrastructure.

Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
---
 drivers/gpu/drm/etnaviv/etnaviv_perfmon.c | 79 +++++++++++++++++++++++
 1 file changed, 79 insertions(+)

diff --git a/drivers/gpu/drm/etnaviv/etnaviv_perfmon.c b/drivers/gpu/drm/etnaviv/etnaviv_perfmon.c
index bafdfe49c1d8..d65d9af3a74a 100644
--- a/drivers/gpu/drm/etnaviv/etnaviv_perfmon.c
+++ b/drivers/gpu/drm/etnaviv/etnaviv_perfmon.c
@@ -120,6 +120,19 @@ static u32 hi_total_idle_cycle_read(struct etnaviv_gpu *gpu,
 	return gpu_read(gpu, reg);
 }
 
+static u32 load_read(struct etnaviv_gpu *gpu,
+	const struct etnaviv_pm_domain *domain,
+	const struct etnaviv_pm_signal *signal)
+{
+	u32 load;
+
+	spin_lock_bh(&gpu->loadavg_spinlock);
+	load = gpu->loadavg_percentage[signal->data];
+	spin_unlock_bh(&gpu->loadavg_spinlock);
+
+	return load;
+}
+
 static const struct etnaviv_pm_domain doms_3d[] = {
 	{
 		.name = "HI",
@@ -419,6 +432,72 @@ static const struct etnaviv_pm_domain doms_3d[] = {
 				&perf_reg_read
 			}
 		}
+	},
+	{
+		.name = "LOAD",
+		.nr_signals = 12,
+		.signal = (const struct etnaviv_pm_signal[]) {
+			{
+				"FE",
+				0,
+				&load_read
+			},
+			{
+				"DE",
+				1,
+				&load_read
+			},
+			{
+				"PE",
+				2,
+				&load_read
+			},
+			{
+				"SH",
+				3,
+				&load_read
+			},
+			{
+				"PA",
+				4,
+				&load_read
+			},
+			{
+				"SE",
+				5,
+				&load_read
+			},
+			{
+				"RA",
+				6,
+				&load_read
+			},
+			{
+				"TX",
+				7,
+				&load_read
+			},
+			{
+				"VG",
+				8,
+				&load_read
+			},
+			{
+				"IM",
+				9,
+				&load_read
+			},
+			{
+				"FP",
+				10,
+				&load_read
+			},
+			{
+				"TS",
+				11,
+				&load_read
+			}
+		}
 	}
 };
 
-- 
2.36.1


^ permalink raw reply related	[flat|nested] 22+ messages in thread

* Re: [PATCH v2 1/4] drm/etnaviv: add simple moving average (SMA)
  2022-06-21  7:20   ` Christian Gmeiner
@ 2022-06-24  9:22     ` Lucas Stach
  -1 siblings, 0 replies; 22+ messages in thread
From: Lucas Stach @ 2022-06-24  9:22 UTC (permalink / raw)
  To: Christian Gmeiner, linux-kernel
  Cc: David Airlie, moderated list:DRM DRIVERS FOR VIVANTE GPU IP,
	open list:DRM DRIVERS FOR VIVANTE GPU IP, Daniel Vetter,
	Russell King

Hi Christian,

Am Dienstag, dem 21.06.2022 um 09:20 +0200 schrieb Christian Gmeiner:
> This adds a SMA algorithm inspired by Exponentially weighted moving
> average (EWMA) algorithm found in the kernel.
> 
Still not sure about this one. I _feel_ that a simple moving average
over a period of one second does not do a good job of reflecting the
real GPU load for a bursty workload, where EWMA might be better suited.
But then I also don't have a real informed opinion to offer on this.

Regards,
Lucas

> Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
> ---
>  drivers/gpu/drm/etnaviv/etnaviv_sma.h | 53 +++++++++++++++++++++++++++
>  1 file changed, 53 insertions(+)
>  create mode 100644 drivers/gpu/drm/etnaviv/etnaviv_sma.h
> 
> diff --git a/drivers/gpu/drm/etnaviv/etnaviv_sma.h b/drivers/gpu/drm/etnaviv/etnaviv_sma.h
> new file mode 100644
> index 000000000000..81564d5cbdc3
> --- /dev/null
> +++ b/drivers/gpu/drm/etnaviv/etnaviv_sma.h
> @@ -0,0 +1,53 @@
> +/* SPDX-License-Identifier: GPL-2.0 */
> +/*
> + * Copyright (C) 2020 Etnaviv Project
> + */
> +
> +#ifndef __ETNAVIV_SMA_H__
> +#define __ETNAVIV_SMA_H__
> +
> +#include <linux/bug.h>
> +#include <linux/compiler.h>
> +
> +/*
> + * Simple moving average (SMA)
> + *
> + * This implements a fixed-size SMA algorithm.
> + *
> + * The first argument to the macro is the name that will be used
> + * for the struct and helper functions.
> + *
> + * The second argument, the samples, expresses how many samples are
> + * used for the SMA algorithm.
> + */
> +
> +#define DECLARE_SMA(name, _samples) \
> +    struct sma_##name { \
> +        unsigned long pos; \
> +        unsigned long sum; \
> +        unsigned long samples[_samples]; \
> +    }; \
> +    static inline void sma_##name##_init(struct sma_##name *s) \
> +    { \
> +        BUILD_BUG_ON(!__builtin_constant_p(_samples));	\
> +        memset(s, 0, sizeof(struct sma_##name)); \
> +    } \
> +    static inline unsigned long sma_##name##_read(struct sma_##name *s) \
> +    { \
> +        BUILD_BUG_ON(!__builtin_constant_p(_samples));	\
> +        return s->sum / _samples; \
> +    } \
> +    static inline void sma_##name##_add(struct sma_##name *s, unsigned long val) \
> +    { \
> +        unsigned long pos = READ_ONCE(s->pos); \
> +        unsigned long sum = READ_ONCE(s->sum); \
> +        unsigned long sample = READ_ONCE(s->samples[pos]); \
> +      \
> +        BUILD_BUG_ON(!__builtin_constant_p(_samples));	\
> +      \
> +       WRITE_ONCE(s->sum, sum - sample + val); \
> +       WRITE_ONCE(s->samples[pos], val); \
> +       WRITE_ONCE(s->pos, pos + 1 == _samples ? 0 : pos + 1); \
> +    }
> +
> +#endif /* __ETNAVIV_SMA_H__ */



^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH v2 1/4] drm/etnaviv: add simple moving average (SMA)
@ 2022-06-24  9:22     ` Lucas Stach
  0 siblings, 0 replies; 22+ messages in thread
From: Lucas Stach @ 2022-06-24  9:22 UTC (permalink / raw)
  To: Christian Gmeiner, linux-kernel
  Cc: David Airlie, moderated list:DRM DRIVERS FOR VIVANTE GPU IP,
	open list:DRM DRIVERS FOR VIVANTE GPU IP, Russell King

Hi Christian,

Am Dienstag, dem 21.06.2022 um 09:20 +0200 schrieb Christian Gmeiner:
> This adds a SMA algorithm inspired by Exponentially weighted moving
> average (EWMA) algorithm found in the kernel.
> 
Still not sure about this one. I _feel_ that a simple moving average
over a period of one second does not do a good job of reflecting the
real GPU load for a bursty workload, where EWMA might be better suited.
But then I also don't have a real informed opinion to offer on this.

Regards,
Lucas

> Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
> ---
>  drivers/gpu/drm/etnaviv/etnaviv_sma.h | 53 +++++++++++++++++++++++++++
>  1 file changed, 53 insertions(+)
>  create mode 100644 drivers/gpu/drm/etnaviv/etnaviv_sma.h
> 
> diff --git a/drivers/gpu/drm/etnaviv/etnaviv_sma.h b/drivers/gpu/drm/etnaviv/etnaviv_sma.h
> new file mode 100644
> index 000000000000..81564d5cbdc3
> --- /dev/null
> +++ b/drivers/gpu/drm/etnaviv/etnaviv_sma.h
> @@ -0,0 +1,53 @@
> +/* SPDX-License-Identifier: GPL-2.0 */
> +/*
> + * Copyright (C) 2020 Etnaviv Project
> + */
> +
> +#ifndef __ETNAVIV_SMA_H__
> +#define __ETNAVIV_SMA_H__
> +
> +#include <linux/bug.h>
> +#include <linux/compiler.h>
> +
> +/*
> + * Simple moving average (SMA)
> + *
> + * This implements a fixed-size SMA algorithm.
> + *
> + * The first argument to the macro is the name that will be used
> + * for the struct and helper functions.
> + *
> + * The second argument, the samples, expresses how many samples are
> + * used for the SMA algorithm.
> + */
> +
> +#define DECLARE_SMA(name, _samples) \
> +    struct sma_##name { \
> +        unsigned long pos; \
> +        unsigned long sum; \
> +        unsigned long samples[_samples]; \
> +    }; \
> +    static inline void sma_##name##_init(struct sma_##name *s) \
> +    { \
> +        BUILD_BUG_ON(!__builtin_constant_p(_samples));	\
> +        memset(s, 0, sizeof(struct sma_##name)); \
> +    } \
> +    static inline unsigned long sma_##name##_read(struct sma_##name *s) \
> +    { \
> +        BUILD_BUG_ON(!__builtin_constant_p(_samples));	\
> +        return s->sum / _samples; \
> +    } \
> +    static inline void sma_##name##_add(struct sma_##name *s, unsigned long val) \
> +    { \
> +        unsigned long pos = READ_ONCE(s->pos); \
> +        unsigned long sum = READ_ONCE(s->sum); \
> +        unsigned long sample = READ_ONCE(s->samples[pos]); \
> +      \
> +        BUILD_BUG_ON(!__builtin_constant_p(_samples));	\
> +      \
> +       WRITE_ONCE(s->sum, sum - sample + val); \
> +       WRITE_ONCE(s->samples[pos], val); \
> +       WRITE_ONCE(s->pos, pos + 1 == _samples ? 0 : pos + 1); \
> +    }
> +
> +#endif /* __ETNAVIV_SMA_H__ */



^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH v2 2/4] drm/etnaviv: add loadavg accounting
  2022-06-21  7:20   ` Christian Gmeiner
@ 2022-06-24  9:38     ` Lucas Stach
  -1 siblings, 0 replies; 22+ messages in thread
From: Lucas Stach @ 2022-06-24  9:38 UTC (permalink / raw)
  To: Christian Gmeiner, linux-kernel
  Cc: David Airlie, moderated list:DRM DRIVERS FOR VIVANTE GPU IP,
	open list:DRM DRIVERS FOR VIVANTE GPU IP, Daniel Vetter,
	Russell King

Am Dienstag, dem 21.06.2022 um 09:20 +0200 schrieb Christian Gmeiner:
> The GPU has an idle state register where each bit represents the idle
> state of a sub-GPU component like FE or TX. Sample this register
> every 10ms and calculate a simple moving average over the sub-GPU
> component idle states with a total observation time frame of 1s.
> 
> This provides us with a percentage based load of each sub-GPU
> component.
> 
> Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
> ---
>  drivers/gpu/drm/etnaviv/etnaviv_drv.c | 14 ++++++
>  drivers/gpu/drm/etnaviv/etnaviv_gpu.c | 64 ++++++++++++++++++++++++++-
>  drivers/gpu/drm/etnaviv/etnaviv_gpu.h | 37 ++++++++++++++++
>  3 files changed, 114 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/gpu/drm/etnaviv/etnaviv_drv.c b/drivers/gpu/drm/etnaviv/etnaviv_drv.c
> index 1d2b4fb4bcf8..d5c6115e56bd 100644
> --- a/drivers/gpu/drm/etnaviv/etnaviv_drv.c
> +++ b/drivers/gpu/drm/etnaviv/etnaviv_drv.c
> @@ -46,6 +46,19 @@ static void load_gpu(struct drm_device *dev)
>  	}
>  }
>  
> +static void unload_gpu(struct drm_device *dev)
> +{
> +	struct etnaviv_drm_private *priv = dev->dev_private;
> +	unsigned int i;
> +
> +	for (i = 0; i < ETNA_MAX_PIPES; i++) {
> +		struct etnaviv_gpu *g = priv->gpu[i];
> +
> +		if (g)
> +			etnaviv_gpu_shutdown(g);
> +	}
> +}
> +
>  static int etnaviv_open(struct drm_device *dev, struct drm_file *file)
>  {
>  	struct etnaviv_drm_private *priv = dev->dev_private;
> @@ -557,6 +570,7 @@ static void etnaviv_unbind(struct device *dev)
>  	struct drm_device *drm = dev_get_drvdata(dev);
>  	struct etnaviv_drm_private *priv = drm->dev_private;
>  
> +	unload_gpu(drm);
>  	drm_dev_unregister(drm);
>  
>  	component_unbind_all(dev, drm);
> diff --git a/drivers/gpu/drm/etnaviv/etnaviv_gpu.c b/drivers/gpu/drm/etnaviv/etnaviv_gpu.c
> index 37018bc55810..202002ae75ee 100644
> --- a/drivers/gpu/drm/etnaviv/etnaviv_gpu.c
> +++ b/drivers/gpu/drm/etnaviv/etnaviv_gpu.c
> @@ -27,6 +27,8 @@
>  #include "state_hi.xml.h"
>  #include "cmdstream.xml.h"
>  
> +static const ktime_t loadavg_polling_frequency = 10 * NSEC_PER_MSEC;
> +
Feeling like a nitpicker, but the thing defined here isn't a frequency,
but a time delta/interval.

>  static const struct platform_device_id gpu_ids[] = {
>  	{ .name = "etnaviv-gpu,2d" },
>  	{ },
> @@ -745,6 +747,32 @@ static void etnaviv_gpu_hw_init(struct etnaviv_gpu *gpu)
>  	gpu_write(gpu, VIVS_HI_INTR_ENBL, ~0U);
>  }
>  
> +static enum hrtimer_restart etnaviv_loadavg_function(struct hrtimer *t)
> +{
> +	struct etnaviv_gpu *gpu = container_of(t, struct etnaviv_gpu, loadavg_timer);
> +	const u32 idle = gpu_read(gpu, VIVS_HI_IDLE_STATE);
> +	int i;
> +
> +	gpu->loadavg_last_sample_time = ktime_get();
> +
> +	for (i = 0; i < ARRAY_SIZE(etna_idle_module_names); i++)
> +		if ((idle & etna_idle_module_names[i].bit))
> +			sma_loadavg_add(&gpu->loadavg_value[i], 0);
> +		else
> +			sma_loadavg_add(&gpu->loadavg_value[i], 100);
> +
> +	spin_lock(&gpu->loadavg_spinlock);
> +
> +	for (i = 0; i < ARRAY_SIZE(etna_idle_module_names); i++)
> +		gpu->loadavg_percentage[i] = sma_loadavg_read(&gpu->loadavg_value[i]);
> +
> +	spin_unlock(&gpu->loadavg_spinlock);

After pondering this for a bit, I don't think we need this spinlock.
The percentage is a single value per engine, so they are already single
write atomic. The worst thing that can happen without this spinlock is
that on read of the loadavg some engines already have the value of
sample period n+1 integrated, while another set is still at n, which I
don't think we care much about, as those load values are already quite
coarse.

> +
> +	hrtimer_forward_now(t, loadavg_polling_frequency);
> +
> +	return HRTIMER_RESTART;
> +}
> +
>  int etnaviv_gpu_init(struct etnaviv_gpu *gpu)
>  {
>  	struct etnaviv_drm_private *priv = gpu->drm->dev_private;
> @@ -839,6 +867,11 @@ int etnaviv_gpu_init(struct etnaviv_gpu *gpu)
>  	for (i = 0; i < ARRAY_SIZE(gpu->event); i++)
>  		complete(&gpu->event_free);
>  
> +	/* Setup loadavg timer */
> +	hrtimer_init(&gpu->loadavg_timer, CLOCK_MONOTONIC, HRTIMER_MODE_ABS_SOFT);
> +	gpu->loadavg_timer.function = etnaviv_loadavg_function;
> +	hrtimer_start(&gpu->loadavg_timer, loadavg_polling_frequency, HRTIMER_MODE_ABS_SOFT);
> +
>  	/* Now program the hardware */
>  	mutex_lock(&gpu->lock);
>  	etnaviv_gpu_hw_init(gpu);
> @@ -859,6 +892,11 @@ int etnaviv_gpu_init(struct etnaviv_gpu *gpu)
>  	return ret;
>  }
>  
> +void etnaviv_gpu_shutdown(struct etnaviv_gpu *gpu)
> +{
> +	hrtimer_cancel(&gpu->loadavg_timer);
> +}
> +
>  #ifdef CONFIG_DEBUG_FS
>  struct dma_debug {
>  	u32 address[2];
> @@ -1585,6 +1623,8 @@ int etnaviv_gpu_wait_idle(struct etnaviv_gpu *gpu, unsigned int timeout_ms)
>  static int etnaviv_gpu_hw_suspend(struct etnaviv_gpu *gpu)
>  {
>  	if (gpu->initialized && gpu->fe_running) {
> +		hrtimer_cancel(&gpu->loadavg_timer);
> +
This isn't symmetric. Here you only cancel the timer when FE was
running, but in the resume function you unconditionally start the
timer.

Moving the timer start into etnaviv_gpu_start_fe() seems to be a good
idea. Sampling the idle state of a GPU with the FE not running doesn't
make much sense in the first place, as it will unsurprisingly be fully
idle. Doing this would also allow you to drop the
etnaviv_gpu_shutdown() and unload_gpu() functions, as the timer doesn't
need to be started when initializing the GPU.


>  		/* Replace the last WAIT with END */
>  		mutex_lock(&gpu->lock);
>  		etnaviv_buffer_end(gpu);
> @@ -1608,7 +1648,8 @@ static int etnaviv_gpu_hw_suspend(struct etnaviv_gpu *gpu)
>  #ifdef CONFIG_PM
>  static int etnaviv_gpu_hw_resume(struct etnaviv_gpu *gpu)
>  {
> -	int ret;
> +	s64 missing_samples;
> +	int ret, i, j;
>  
>  	ret = mutex_lock_killable(&gpu->lock);
>  	if (ret)
> @@ -1617,7 +1658,27 @@ static int etnaviv_gpu_hw_resume(struct etnaviv_gpu *gpu)
>  	etnaviv_gpu_update_clock(gpu);
>  	etnaviv_gpu_hw_init(gpu);
>  
> +	/* Update loadavg based on delta of suspend and resume ktime.
> +	 *
> +	 * Our SMA algorithm uses a fixed size of 100 items to be able
> +	 * to calculate the mean over one second as we sample every 10ms.
> +	 */
> +	missing_samples = div_s64(ktime_ms_delta(ktime_get(), gpu->loadavg_last_sample_time), 10);

In the timer function you use the loadavg_polling_frequency const for
this value. It would be good to be consistent here. Probably just
#define the polling interval and use this both here and in the timer
function.

> +	missing_samples = min(missing_samples, (s64)100);
> +
> +	spin_lock_bh(&gpu->loadavg_spinlock);
> +
> +	for (i = 0; i < ARRAY_SIZE(etna_idle_module_names); i++) {
> +		struct sma_loadavg *loadavg = &gpu->loadavg_value[i];
> +
> +		for (j = 0; j < missing_samples; j++)
> +			sma_loadavg_add(loadavg, 0);
> +	}
> +
> +	spin_unlock_bh(&gpu->loadavg_spinlock);
> +
>  	mutex_unlock(&gpu->lock);
> +	hrtimer_start(&gpu->loadavg_timer, loadavg_polling_frequency, HRTIMER_MODE_ABS_SOFT);
>  
>  	return 0;
>  }
> @@ -1787,6 +1848,7 @@ static int etnaviv_gpu_platform_probe(struct platform_device *pdev)
>  	gpu->dev = &pdev->dev;
>  	mutex_init(&gpu->lock);
>  	mutex_init(&gpu->fence_lock);
> +	spin_lock_init(&gpu->loadavg_spinlock);
>  
>  	/* Map registers: */
>  	gpu->mmio = devm_platform_ioremap_resource(pdev, 0);
> diff --git a/drivers/gpu/drm/etnaviv/etnaviv_gpu.h b/drivers/gpu/drm/etnaviv/etnaviv_gpu.h
> index 85eddd492774..881f071f640e 100644
> --- a/drivers/gpu/drm/etnaviv/etnaviv_gpu.h
> +++ b/drivers/gpu/drm/etnaviv/etnaviv_gpu.h
> @@ -10,6 +10,8 @@
>  #include "etnaviv_gem.h"
>  #include "etnaviv_mmu.h"
>  #include "etnaviv_drv.h"
> +#include "etnaviv_sma.h"
> +#include "state_hi.xml.h"
>  
>  struct etnaviv_gem_submit;
>  struct etnaviv_vram_mapping;
> @@ -91,6 +93,33 @@ struct clk;
>  
>  #define ETNA_NR_EVENTS 30
>  
> +DECLARE_SMA(loadavg, 100)
> +
> +static const struct {
> +    const char *name;
> +    u32 bit;
> +} etna_idle_module_names[] = {

Drop the _names prefix. This isn't just enumerating names, but also the
bit positions in the state register.

Regards,
Lucas

> +    { "FE", VIVS_HI_IDLE_STATE_FE },
> +    { "DE", VIVS_HI_IDLE_STATE_DE },
> +    { "PE", VIVS_HI_IDLE_STATE_PE },
> +    { "SH", VIVS_HI_IDLE_STATE_SH },
> +    { "PA", VIVS_HI_IDLE_STATE_PA },
> +    { "SE", VIVS_HI_IDLE_STATE_SE },
> +    { "RA", VIVS_HI_IDLE_STATE_RA },
> +    { "TX", VIVS_HI_IDLE_STATE_TX },
> +    { "VG", VIVS_HI_IDLE_STATE_VG },
> +    { "IM", VIVS_HI_IDLE_STATE_IM },
> +    { "FP", VIVS_HI_IDLE_STATE_FP },
> +    { "TS", VIVS_HI_IDLE_STATE_TS },
> +    { "BL", VIVS_HI_IDLE_STATE_BL },
> +    { "ASYNCFE", VIVS_HI_IDLE_STATE_ASYNCFE },
> +    { "MC", VIVS_HI_IDLE_STATE_MC },
> +    { "PPA", VIVS_HI_IDLE_STATE_PPA },
> +    { "WD", VIVS_HI_IDLE_STATE_WD },
> +    { "NN", VIVS_HI_IDLE_STATE_NN },
> +    { "TP", VIVS_HI_IDLE_STATE_TP },
> +};
> +
>  struct etnaviv_gpu {
>  	struct drm_device *drm;
>  	struct thermal_cooling_device *cooling;
> @@ -147,6 +176,13 @@ struct etnaviv_gpu {
>  	unsigned int freq_scale;
>  	unsigned long base_rate_core;
>  	unsigned long base_rate_shader;
> +
> +	/* Loadavg: */
> +	struct hrtimer loadavg_timer;
> +	spinlock_t loadavg_spinlock;
> +	ktime_t loadavg_last_sample_time;
> +	struct sma_loadavg loadavg_value[ARRAY_SIZE(etna_idle_module_names)];
> +	unsigned int loadavg_percentage[ARRAY_SIZE(etna_idle_module_names)];
>  };
>  
>  static inline void gpu_write(struct etnaviv_gpu *gpu, u32 reg, u32 data)
> @@ -162,6 +198,7 @@ static inline u32 gpu_read(struct etnaviv_gpu *gpu, u32 reg)
>  int etnaviv_gpu_get_param(struct etnaviv_gpu *gpu, u32 param, u64 *value);
>  
>  int etnaviv_gpu_init(struct etnaviv_gpu *gpu);
> +void etnaviv_gpu_shutdown(struct etnaviv_gpu *gpu);
>  bool etnaviv_fill_identity_from_hwdb(struct etnaviv_gpu *gpu);
>  
>  #ifdef CONFIG_DEBUG_FS



^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH v2 2/4] drm/etnaviv: add loadavg accounting
@ 2022-06-24  9:38     ` Lucas Stach
  0 siblings, 0 replies; 22+ messages in thread
From: Lucas Stach @ 2022-06-24  9:38 UTC (permalink / raw)
  To: Christian Gmeiner, linux-kernel
  Cc: David Airlie, moderated list:DRM DRIVERS FOR VIVANTE GPU IP,
	open list:DRM DRIVERS FOR VIVANTE GPU IP, Russell King

Am Dienstag, dem 21.06.2022 um 09:20 +0200 schrieb Christian Gmeiner:
> The GPU has an idle state register where each bit represents the idle
> state of a sub-GPU component like FE or TX. Sample this register
> every 10ms and calculate a simple moving average over the sub-GPU
> component idle states with a total observation time frame of 1s.
> 
> This provides us with a percentage based load of each sub-GPU
> component.
> 
> Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
> ---
>  drivers/gpu/drm/etnaviv/etnaviv_drv.c | 14 ++++++
>  drivers/gpu/drm/etnaviv/etnaviv_gpu.c | 64 ++++++++++++++++++++++++++-
>  drivers/gpu/drm/etnaviv/etnaviv_gpu.h | 37 ++++++++++++++++
>  3 files changed, 114 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/gpu/drm/etnaviv/etnaviv_drv.c b/drivers/gpu/drm/etnaviv/etnaviv_drv.c
> index 1d2b4fb4bcf8..d5c6115e56bd 100644
> --- a/drivers/gpu/drm/etnaviv/etnaviv_drv.c
> +++ b/drivers/gpu/drm/etnaviv/etnaviv_drv.c
> @@ -46,6 +46,19 @@ static void load_gpu(struct drm_device *dev)
>  	}
>  }
>  
> +static void unload_gpu(struct drm_device *dev)
> +{
> +	struct etnaviv_drm_private *priv = dev->dev_private;
> +	unsigned int i;
> +
> +	for (i = 0; i < ETNA_MAX_PIPES; i++) {
> +		struct etnaviv_gpu *g = priv->gpu[i];
> +
> +		if (g)
> +			etnaviv_gpu_shutdown(g);
> +	}
> +}
> +
>  static int etnaviv_open(struct drm_device *dev, struct drm_file *file)
>  {
>  	struct etnaviv_drm_private *priv = dev->dev_private;
> @@ -557,6 +570,7 @@ static void etnaviv_unbind(struct device *dev)
>  	struct drm_device *drm = dev_get_drvdata(dev);
>  	struct etnaviv_drm_private *priv = drm->dev_private;
>  
> +	unload_gpu(drm);
>  	drm_dev_unregister(drm);
>  
>  	component_unbind_all(dev, drm);
> diff --git a/drivers/gpu/drm/etnaviv/etnaviv_gpu.c b/drivers/gpu/drm/etnaviv/etnaviv_gpu.c
> index 37018bc55810..202002ae75ee 100644
> --- a/drivers/gpu/drm/etnaviv/etnaviv_gpu.c
> +++ b/drivers/gpu/drm/etnaviv/etnaviv_gpu.c
> @@ -27,6 +27,8 @@
>  #include "state_hi.xml.h"
>  #include "cmdstream.xml.h"
>  
> +static const ktime_t loadavg_polling_frequency = 10 * NSEC_PER_MSEC;
> +
Feeling like a nitpicker, but the thing defined here isn't a frequency,
but a time delta/interval.

>  static const struct platform_device_id gpu_ids[] = {
>  	{ .name = "etnaviv-gpu,2d" },
>  	{ },
> @@ -745,6 +747,32 @@ static void etnaviv_gpu_hw_init(struct etnaviv_gpu *gpu)
>  	gpu_write(gpu, VIVS_HI_INTR_ENBL, ~0U);
>  }
>  
> +static enum hrtimer_restart etnaviv_loadavg_function(struct hrtimer *t)
> +{
> +	struct etnaviv_gpu *gpu = container_of(t, struct etnaviv_gpu, loadavg_timer);
> +	const u32 idle = gpu_read(gpu, VIVS_HI_IDLE_STATE);
> +	int i;
> +
> +	gpu->loadavg_last_sample_time = ktime_get();
> +
> +	for (i = 0; i < ARRAY_SIZE(etna_idle_module_names); i++)
> +		if ((idle & etna_idle_module_names[i].bit))
> +			sma_loadavg_add(&gpu->loadavg_value[i], 0);
> +		else
> +			sma_loadavg_add(&gpu->loadavg_value[i], 100);
> +
> +	spin_lock(&gpu->loadavg_spinlock);
> +
> +	for (i = 0; i < ARRAY_SIZE(etna_idle_module_names); i++)
> +		gpu->loadavg_percentage[i] = sma_loadavg_read(&gpu->loadavg_value[i]);
> +
> +	spin_unlock(&gpu->loadavg_spinlock);

After pondering this for a bit, I don't think we need this spinlock.
The percentage is a single value per engine, so they are already single
write atomic. The worst thing that can happen without this spinlock is
that on read of the loadavg some engines already have the value of
sample period n+1 integrated, while another set is still at n, which I
don't think we care much about, as those load values are already quite
coarse.

> +
> +	hrtimer_forward_now(t, loadavg_polling_frequency);
> +
> +	return HRTIMER_RESTART;
> +}
> +
>  int etnaviv_gpu_init(struct etnaviv_gpu *gpu)
>  {
>  	struct etnaviv_drm_private *priv = gpu->drm->dev_private;
> @@ -839,6 +867,11 @@ int etnaviv_gpu_init(struct etnaviv_gpu *gpu)
>  	for (i = 0; i < ARRAY_SIZE(gpu->event); i++)
>  		complete(&gpu->event_free);
>  
> +	/* Setup loadavg timer */
> +	hrtimer_init(&gpu->loadavg_timer, CLOCK_MONOTONIC, HRTIMER_MODE_ABS_SOFT);
> +	gpu->loadavg_timer.function = etnaviv_loadavg_function;
> +	hrtimer_start(&gpu->loadavg_timer, loadavg_polling_frequency, HRTIMER_MODE_ABS_SOFT);
> +
>  	/* Now program the hardware */
>  	mutex_lock(&gpu->lock);
>  	etnaviv_gpu_hw_init(gpu);
> @@ -859,6 +892,11 @@ int etnaviv_gpu_init(struct etnaviv_gpu *gpu)
>  	return ret;
>  }
>  
> +void etnaviv_gpu_shutdown(struct etnaviv_gpu *gpu)
> +{
> +	hrtimer_cancel(&gpu->loadavg_timer);
> +}
> +
>  #ifdef CONFIG_DEBUG_FS
>  struct dma_debug {
>  	u32 address[2];
> @@ -1585,6 +1623,8 @@ int etnaviv_gpu_wait_idle(struct etnaviv_gpu *gpu, unsigned int timeout_ms)
>  static int etnaviv_gpu_hw_suspend(struct etnaviv_gpu *gpu)
>  {
>  	if (gpu->initialized && gpu->fe_running) {
> +		hrtimer_cancel(&gpu->loadavg_timer);
> +
This isn't symmetric. Here you only cancel the timer when FE was
running, but in the resume function you unconditionally start the
timer.

Moving the timer start into etnaviv_gpu_start_fe() seems to be a good
idea. Sampling the idle state of a GPU with the FE not running doesn't
make much sense in the first place, as it will unsurprisingly be fully
idle. Doing this would also allow you to drop the
etnaviv_gpu_shutdown() and unload_gpu() functions, as the timer doesn't
need to be started when initializing the GPU.


>  		/* Replace the last WAIT with END */
>  		mutex_lock(&gpu->lock);
>  		etnaviv_buffer_end(gpu);
> @@ -1608,7 +1648,8 @@ static int etnaviv_gpu_hw_suspend(struct etnaviv_gpu *gpu)
>  #ifdef CONFIG_PM
>  static int etnaviv_gpu_hw_resume(struct etnaviv_gpu *gpu)
>  {
> -	int ret;
> +	s64 missing_samples;
> +	int ret, i, j;
>  
>  	ret = mutex_lock_killable(&gpu->lock);
>  	if (ret)
> @@ -1617,7 +1658,27 @@ static int etnaviv_gpu_hw_resume(struct etnaviv_gpu *gpu)
>  	etnaviv_gpu_update_clock(gpu);
>  	etnaviv_gpu_hw_init(gpu);
>  
> +	/* Update loadavg based on delta of suspend and resume ktime.
> +	 *
> +	 * Our SMA algorithm uses a fixed size of 100 items to be able
> +	 * to calculate the mean over one second as we sample every 10ms.
> +	 */
> +	missing_samples = div_s64(ktime_ms_delta(ktime_get(), gpu->loadavg_last_sample_time), 10);

In the timer function you use the loadavg_polling_frequency const for
this value. It would be good to be consistent here. Probably just
#define the polling interval and use this both here and in the timer
function.

> +	missing_samples = min(missing_samples, (s64)100);
> +
> +	spin_lock_bh(&gpu->loadavg_spinlock);
> +
> +	for (i = 0; i < ARRAY_SIZE(etna_idle_module_names); i++) {
> +		struct sma_loadavg *loadavg = &gpu->loadavg_value[i];
> +
> +		for (j = 0; j < missing_samples; j++)
> +			sma_loadavg_add(loadavg, 0);
> +	}
> +
> +	spin_unlock_bh(&gpu->loadavg_spinlock);
> +
>  	mutex_unlock(&gpu->lock);
> +	hrtimer_start(&gpu->loadavg_timer, loadavg_polling_frequency, HRTIMER_MODE_ABS_SOFT);
>  
>  	return 0;
>  }
> @@ -1787,6 +1848,7 @@ static int etnaviv_gpu_platform_probe(struct platform_device *pdev)
>  	gpu->dev = &pdev->dev;
>  	mutex_init(&gpu->lock);
>  	mutex_init(&gpu->fence_lock);
> +	spin_lock_init(&gpu->loadavg_spinlock);
>  
>  	/* Map registers: */
>  	gpu->mmio = devm_platform_ioremap_resource(pdev, 0);
> diff --git a/drivers/gpu/drm/etnaviv/etnaviv_gpu.h b/drivers/gpu/drm/etnaviv/etnaviv_gpu.h
> index 85eddd492774..881f071f640e 100644
> --- a/drivers/gpu/drm/etnaviv/etnaviv_gpu.h
> +++ b/drivers/gpu/drm/etnaviv/etnaviv_gpu.h
> @@ -10,6 +10,8 @@
>  #include "etnaviv_gem.h"
>  #include "etnaviv_mmu.h"
>  #include "etnaviv_drv.h"
> +#include "etnaviv_sma.h"
> +#include "state_hi.xml.h"
>  
>  struct etnaviv_gem_submit;
>  struct etnaviv_vram_mapping;
> @@ -91,6 +93,33 @@ struct clk;
>  
>  #define ETNA_NR_EVENTS 30
>  
> +DECLARE_SMA(loadavg, 100)
> +
> +static const struct {
> +    const char *name;
> +    u32 bit;
> +} etna_idle_module_names[] = {

Drop the _names prefix. This isn't just enumerating names, but also the
bit positions in the state register.

Regards,
Lucas

> +    { "FE", VIVS_HI_IDLE_STATE_FE },
> +    { "DE", VIVS_HI_IDLE_STATE_DE },
> +    { "PE", VIVS_HI_IDLE_STATE_PE },
> +    { "SH", VIVS_HI_IDLE_STATE_SH },
> +    { "PA", VIVS_HI_IDLE_STATE_PA },
> +    { "SE", VIVS_HI_IDLE_STATE_SE },
> +    { "RA", VIVS_HI_IDLE_STATE_RA },
> +    { "TX", VIVS_HI_IDLE_STATE_TX },
> +    { "VG", VIVS_HI_IDLE_STATE_VG },
> +    { "IM", VIVS_HI_IDLE_STATE_IM },
> +    { "FP", VIVS_HI_IDLE_STATE_FP },
> +    { "TS", VIVS_HI_IDLE_STATE_TS },
> +    { "BL", VIVS_HI_IDLE_STATE_BL },
> +    { "ASYNCFE", VIVS_HI_IDLE_STATE_ASYNCFE },
> +    { "MC", VIVS_HI_IDLE_STATE_MC },
> +    { "PPA", VIVS_HI_IDLE_STATE_PPA },
> +    { "WD", VIVS_HI_IDLE_STATE_WD },
> +    { "NN", VIVS_HI_IDLE_STATE_NN },
> +    { "TP", VIVS_HI_IDLE_STATE_TP },
> +};
> +
>  struct etnaviv_gpu {
>  	struct drm_device *drm;
>  	struct thermal_cooling_device *cooling;
> @@ -147,6 +176,13 @@ struct etnaviv_gpu {
>  	unsigned int freq_scale;
>  	unsigned long base_rate_core;
>  	unsigned long base_rate_shader;
> +
> +	/* Loadavg: */
> +	struct hrtimer loadavg_timer;
> +	spinlock_t loadavg_spinlock;
> +	ktime_t loadavg_last_sample_time;
> +	struct sma_loadavg loadavg_value[ARRAY_SIZE(etna_idle_module_names)];
> +	unsigned int loadavg_percentage[ARRAY_SIZE(etna_idle_module_names)];
>  };
>  
>  static inline void gpu_write(struct etnaviv_gpu *gpu, u32 reg, u32 data)
> @@ -162,6 +198,7 @@ static inline u32 gpu_read(struct etnaviv_gpu *gpu, u32 reg)
>  int etnaviv_gpu_get_param(struct etnaviv_gpu *gpu, u32 param, u64 *value);
>  
>  int etnaviv_gpu_init(struct etnaviv_gpu *gpu);
> +void etnaviv_gpu_shutdown(struct etnaviv_gpu *gpu);
>  bool etnaviv_fill_identity_from_hwdb(struct etnaviv_gpu *gpu);
>  
>  #ifdef CONFIG_DEBUG_FS



^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH v2 4/4] drm/etnaviv: export loadavg via perfmon
  2022-06-21  7:20   ` Christian Gmeiner
@ 2022-06-24  9:44     ` Lucas Stach
  -1 siblings, 0 replies; 22+ messages in thread
From: Lucas Stach @ 2022-06-24  9:44 UTC (permalink / raw)
  To: Christian Gmeiner, linux-kernel
  Cc: David Airlie, moderated list:DRM DRIVERS FOR VIVANTE GPU IP,
	open list:DRM DRIVERS FOR VIVANTE GPU IP, Daniel Vetter,
	Russell King

Am Dienstag, dem 21.06.2022 um 09:20 +0200 schrieb Christian Gmeiner:
> Make it possible to access the sub-GPU component load value from
> user space with the perfmon infrastructure.
> 
You need to explain a bit more how you intend to use those.

Contrary to all other perfmon values, where we go to great lengths to
only count the load put onto the GPU by the specific process requesting
the perfmon, the loadavg values also include the load caused by other
submits. Due to this difference in behavior I fear that those new
counters might be confusing to use. But maybe you have a use-case in
mind that I don't see right now.

Regards,
Lucas

> Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
> ---
>  drivers/gpu/drm/etnaviv/etnaviv_perfmon.c | 79 +++++++++++++++++++++++
>  1 file changed, 79 insertions(+)
> 
> diff --git a/drivers/gpu/drm/etnaviv/etnaviv_perfmon.c b/drivers/gpu/drm/etnaviv/etnaviv_perfmon.c
> index bafdfe49c1d8..d65d9af3a74a 100644
> --- a/drivers/gpu/drm/etnaviv/etnaviv_perfmon.c
> +++ b/drivers/gpu/drm/etnaviv/etnaviv_perfmon.c
> @@ -120,6 +120,19 @@ static u32 hi_total_idle_cycle_read(struct etnaviv_gpu *gpu,
>  	return gpu_read(gpu, reg);
>  }
>  
> +static u32 load_read(struct etnaviv_gpu *gpu,
> +	const struct etnaviv_pm_domain *domain,
> +	const struct etnaviv_pm_signal *signal)
> +{
> +	u32 load;
> +
> +	spin_lock_bh(&gpu->loadavg_spinlock);
> +	load = gpu->loadavg_percentage[signal->data];
> +	spin_unlock_bh(&gpu->loadavg_spinlock);
> +
> +	return load;
> +}
> +
>  static const struct etnaviv_pm_domain doms_3d[] = {
>  	{
>  		.name = "HI",
> @@ -419,6 +432,72 @@ static const struct etnaviv_pm_domain doms_3d[] = {
>  				&perf_reg_read
>  			}
>  		}
> +	},
> +	{
> +		.name = "LOAD",
> +		.nr_signals = 12,
> +		.signal = (const struct etnaviv_pm_signal[]) {
> +			{
> +				"FE",
> +				0,
> +				&load_read
> +			},
> +			{
> +				"DE",
> +				1,
> +				&load_read
> +			},
> +			{
> +				"PE",
> +				2,
> +				&load_read
> +			},
> +			{
> +				"SH",
> +				3,
> +				&load_read
> +			},
> +			{
> +				"PA",
> +				4,
> +				&load_read
> +			},
> +			{
> +				"SE",
> +				5,
> +				&load_read
> +			},
> +			{
> +				"RA",
> +				6,
> +				&load_read
> +			},
> +			{
> +				"TX",
> +				7,
> +				&load_read
> +			},
> +			{
> +				"VG",
> +				8,
> +				&load_read
> +			},
> +			{
> +				"IM",
> +				9,
> +				&load_read
> +			},
> +			{
> +				"FP",
> +				10,
> +				&load_read
> +			},
> +			{
> +				"TS",
> +				11,
> +				&load_read
> +			}
> +		}
>  	}
>  };
>  



^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH v2 4/4] drm/etnaviv: export loadavg via perfmon
@ 2022-06-24  9:44     ` Lucas Stach
  0 siblings, 0 replies; 22+ messages in thread
From: Lucas Stach @ 2022-06-24  9:44 UTC (permalink / raw)
  To: Christian Gmeiner, linux-kernel
  Cc: David Airlie, moderated list:DRM DRIVERS FOR VIVANTE GPU IP,
	open list:DRM DRIVERS FOR VIVANTE GPU IP, Russell King

Am Dienstag, dem 21.06.2022 um 09:20 +0200 schrieb Christian Gmeiner:
> Make it possible to access the sub-GPU component load value from
> user space with the perfmon infrastructure.
> 
You need to explain a bit more how you intend to use those.

Contrary to all other perfmon values, where we go to great lengths to
only count the load put onto the GPU by the specific process requesting
the perfmon, the loadavg values also include the load caused by other
submits. Due to this difference in behavior I fear that those new
counters might be confusing to use. But maybe you have a use-case in
mind that I don't see right now.

Regards,
Lucas

> Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
> ---
>  drivers/gpu/drm/etnaviv/etnaviv_perfmon.c | 79 +++++++++++++++++++++++
>  1 file changed, 79 insertions(+)
> 
> diff --git a/drivers/gpu/drm/etnaviv/etnaviv_perfmon.c b/drivers/gpu/drm/etnaviv/etnaviv_perfmon.c
> index bafdfe49c1d8..d65d9af3a74a 100644
> --- a/drivers/gpu/drm/etnaviv/etnaviv_perfmon.c
> +++ b/drivers/gpu/drm/etnaviv/etnaviv_perfmon.c
> @@ -120,6 +120,19 @@ static u32 hi_total_idle_cycle_read(struct etnaviv_gpu *gpu,
>  	return gpu_read(gpu, reg);
>  }
>  
> +static u32 load_read(struct etnaviv_gpu *gpu,
> +	const struct etnaviv_pm_domain *domain,
> +	const struct etnaviv_pm_signal *signal)
> +{
> +	u32 load;
> +
> +	spin_lock_bh(&gpu->loadavg_spinlock);
> +	load = gpu->loadavg_percentage[signal->data];
> +	spin_unlock_bh(&gpu->loadavg_spinlock);
> +
> +	return load;
> +}
> +
>  static const struct etnaviv_pm_domain doms_3d[] = {
>  	{
>  		.name = "HI",
> @@ -419,6 +432,72 @@ static const struct etnaviv_pm_domain doms_3d[] = {
>  				&perf_reg_read
>  			}
>  		}
> +	},
> +	{
> +		.name = "LOAD",
> +		.nr_signals = 12,
> +		.signal = (const struct etnaviv_pm_signal[]) {
> +			{
> +				"FE",
> +				0,
> +				&load_read
> +			},
> +			{
> +				"DE",
> +				1,
> +				&load_read
> +			},
> +			{
> +				"PE",
> +				2,
> +				&load_read
> +			},
> +			{
> +				"SH",
> +				3,
> +				&load_read
> +			},
> +			{
> +				"PA",
> +				4,
> +				&load_read
> +			},
> +			{
> +				"SE",
> +				5,
> +				&load_read
> +			},
> +			{
> +				"RA",
> +				6,
> +				&load_read
> +			},
> +			{
> +				"TX",
> +				7,
> +				&load_read
> +			},
> +			{
> +				"VG",
> +				8,
> +				&load_read
> +			},
> +			{
> +				"IM",
> +				9,
> +				&load_read
> +			},
> +			{
> +				"FP",
> +				10,
> +				&load_read
> +			},
> +			{
> +				"TS",
> +				11,
> +				&load_read
> +			}
> +		}
>  	}
>  };
>  



^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH v2 1/4] drm/etnaviv: add simple moving average (SMA)
  2022-06-24  9:22     ` Lucas Stach
@ 2022-07-02 11:41       ` Christian Gmeiner
  -1 siblings, 0 replies; 22+ messages in thread
From: Christian Gmeiner @ 2022-07-02 11:41 UTC (permalink / raw)
  To: Lucas Stach
  Cc: David Airlie, LKML, open list:DRM DRIVERS FOR VIVANTE GPU IP,
	moderated list:DRM DRIVERS FOR VIVANTE GPU IP, Russell King

Hi Lucas

>
> Am Dienstag, dem 21.06.2022 um 09:20 +0200 schrieb Christian Gmeiner:
> > This adds a SMA algorithm inspired by Exponentially weighted moving
> > average (EWMA) algorithm found in the kernel.
> >
> Still not sure about this one. I _feel_ that a simple moving average
> over a period of one second does not do a good job of reflecting the
> real GPU load for a bursty workload, where EWMA might be better suited.
> But then I also don't have a real informed opinion to offer on this.
>

I will play with EWMA and see what happens.

-- 
greets
--
Christian Gmeiner, MSc

https://christian-gmeiner.info/privacypolicy

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH v2 1/4] drm/etnaviv: add simple moving average (SMA)
@ 2022-07-02 11:41       ` Christian Gmeiner
  0 siblings, 0 replies; 22+ messages in thread
From: Christian Gmeiner @ 2022-07-02 11:41 UTC (permalink / raw)
  To: Lucas Stach
  Cc: LKML, David Airlie,
	moderated list:DRM DRIVERS FOR VIVANTE GPU IP,
	open list:DRM DRIVERS FOR VIVANTE GPU IP, Daniel Vetter,
	Russell King

Hi Lucas

>
> Am Dienstag, dem 21.06.2022 um 09:20 +0200 schrieb Christian Gmeiner:
> > This adds a SMA algorithm inspired by Exponentially weighted moving
> > average (EWMA) algorithm found in the kernel.
> >
> Still not sure about this one. I _feel_ that a simple moving average
> over a period of one second does not do a good job of reflecting the
> real GPU load for a bursty workload, where EWMA might be better suited.
> But then I also don't have a real informed opinion to offer on this.
>

I will play with EWMA and see what happens.

-- 
greets
--
Christian Gmeiner, MSc

https://christian-gmeiner.info/privacypolicy

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH v2 2/4] drm/etnaviv: add loadavg accounting
  2022-06-24  9:38     ` Lucas Stach
@ 2022-07-02 11:53       ` Christian Gmeiner
  -1 siblings, 0 replies; 22+ messages in thread
From: Christian Gmeiner @ 2022-07-02 11:53 UTC (permalink / raw)
  To: Lucas Stach
  Cc: LKML, David Airlie,
	moderated list:DRM DRIVERS FOR VIVANTE GPU IP,
	open list:DRM DRIVERS FOR VIVANTE GPU IP, Daniel Vetter,
	Russell King

Am Fr., 24. Juni 2022 um 11:38 Uhr schrieb Lucas Stach <l.stach@pengutronix.de>:
>
> Am Dienstag, dem 21.06.2022 um 09:20 +0200 schrieb Christian Gmeiner:
> > The GPU has an idle state register where each bit represents the idle
> > state of a sub-GPU component like FE or TX. Sample this register
> > every 10ms and calculate a simple moving average over the sub-GPU
> > component idle states with a total observation time frame of 1s.
> >
> > This provides us with a percentage based load of each sub-GPU
> > component.
> >
> > Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
> > ---
> >  drivers/gpu/drm/etnaviv/etnaviv_drv.c | 14 ++++++
> >  drivers/gpu/drm/etnaviv/etnaviv_gpu.c | 64 ++++++++++++++++++++++++++-
> >  drivers/gpu/drm/etnaviv/etnaviv_gpu.h | 37 ++++++++++++++++
> >  3 files changed, 114 insertions(+), 1 deletion(-)
> >
> > diff --git a/drivers/gpu/drm/etnaviv/etnaviv_drv.c b/drivers/gpu/drm/etnaviv/etnaviv_drv.c
> > index 1d2b4fb4bcf8..d5c6115e56bd 100644
> > --- a/drivers/gpu/drm/etnaviv/etnaviv_drv.c
> > +++ b/drivers/gpu/drm/etnaviv/etnaviv_drv.c
> > @@ -46,6 +46,19 @@ static void load_gpu(struct drm_device *dev)
> >       }
> >  }
> >
> > +static void unload_gpu(struct drm_device *dev)
> > +{
> > +     struct etnaviv_drm_private *priv = dev->dev_private;
> > +     unsigned int i;
> > +
> > +     for (i = 0; i < ETNA_MAX_PIPES; i++) {
> > +             struct etnaviv_gpu *g = priv->gpu[i];
> > +
> > +             if (g)
> > +                     etnaviv_gpu_shutdown(g);
> > +     }
> > +}
> > +
> >  static int etnaviv_open(struct drm_device *dev, struct drm_file *file)
> >  {
> >       struct etnaviv_drm_private *priv = dev->dev_private;
> > @@ -557,6 +570,7 @@ static void etnaviv_unbind(struct device *dev)
> >       struct drm_device *drm = dev_get_drvdata(dev);
> >       struct etnaviv_drm_private *priv = drm->dev_private;
> >
> > +     unload_gpu(drm);
> >       drm_dev_unregister(drm);
> >
> >       component_unbind_all(dev, drm);
> > diff --git a/drivers/gpu/drm/etnaviv/etnaviv_gpu.c b/drivers/gpu/drm/etnaviv/etnaviv_gpu.c
> > index 37018bc55810..202002ae75ee 100644
> > --- a/drivers/gpu/drm/etnaviv/etnaviv_gpu.c
> > +++ b/drivers/gpu/drm/etnaviv/etnaviv_gpu.c
> > @@ -27,6 +27,8 @@
> >  #include "state_hi.xml.h"
> >  #include "cmdstream.xml.h"
> >
> > +static const ktime_t loadavg_polling_frequency = 10 * NSEC_PER_MSEC;
> > +
> Feeling like a nitpicker, but the thing defined here isn't a frequency,
> but a time delta/interval.
>

Will rename it in the next version of the patch series.

>
> >  static const struct platform_device_id gpu_ids[] = {
> >       { .name = "etnaviv-gpu,2d" },
> >       { },
> > @@ -745,6 +747,32 @@ static void etnaviv_gpu_hw_init(struct etnaviv_gpu *gpu)
> >       gpu_write(gpu, VIVS_HI_INTR_ENBL, ~0U);
> >  }
> >
> > +static enum hrtimer_restart etnaviv_loadavg_function(struct hrtimer *t)
> > +{
> > +     struct etnaviv_gpu *gpu = container_of(t, struct etnaviv_gpu, loadavg_timer);
> > +     const u32 idle = gpu_read(gpu, VIVS_HI_IDLE_STATE);
> > +     int i;
> > +
> > +     gpu->loadavg_last_sample_time = ktime_get();
> > +
> > +     for (i = 0; i < ARRAY_SIZE(etna_idle_module_names); i++)
> > +             if ((idle & etna_idle_module_names[i].bit))
> > +                     sma_loadavg_add(&gpu->loadavg_value[i], 0);
> > +             else
> > +                     sma_loadavg_add(&gpu->loadavg_value[i], 100);
> > +
> > +     spin_lock(&gpu->loadavg_spinlock);
> > +
> > +     for (i = 0; i < ARRAY_SIZE(etna_idle_module_names); i++)
> > +             gpu->loadavg_percentage[i] = sma_loadavg_read(&gpu->loadavg_value[i]);
> > +
> > +     spin_unlock(&gpu->loadavg_spinlock);
>
> After pondering this for a bit, I don't think we need this spinlock.
> The percentage is a single value per engine, so they are already single
> write atomic. The worst thing that can happen without this spinlock is
> that on read of the loadavg some engines already have the value of
> sample period n+1 integrated, while another set is still at n, which I
> don't think we care much about, as those load values are already quite
> coarse.
>

Okay.. will remove the spinlock.

> > +
> > +     hrtimer_forward_now(t, loadavg_polling_frequency);
> > +
> > +     return HRTIMER_RESTART;
> > +}
> > +
> >  int etnaviv_gpu_init(struct etnaviv_gpu *gpu)
> >  {
> >       struct etnaviv_drm_private *priv = gpu->drm->dev_private;
> > @@ -839,6 +867,11 @@ int etnaviv_gpu_init(struct etnaviv_gpu *gpu)
> >       for (i = 0; i < ARRAY_SIZE(gpu->event); i++)
> >               complete(&gpu->event_free);
> >
> > +     /* Setup loadavg timer */
> > +     hrtimer_init(&gpu->loadavg_timer, CLOCK_MONOTONIC, HRTIMER_MODE_ABS_SOFT);
> > +     gpu->loadavg_timer.function = etnaviv_loadavg_function;
> > +     hrtimer_start(&gpu->loadavg_timer, loadavg_polling_frequency, HRTIMER_MODE_ABS_SOFT);
> > +
> >       /* Now program the hardware */
> >       mutex_lock(&gpu->lock);
> >       etnaviv_gpu_hw_init(gpu);
> > @@ -859,6 +892,11 @@ int etnaviv_gpu_init(struct etnaviv_gpu *gpu)
> >       return ret;
> >  }
> >
> > +void etnaviv_gpu_shutdown(struct etnaviv_gpu *gpu)
> > +{
> > +     hrtimer_cancel(&gpu->loadavg_timer);
> > +}
> > +
> >  #ifdef CONFIG_DEBUG_FS
> >  struct dma_debug {
> >       u32 address[2];
> > @@ -1585,6 +1623,8 @@ int etnaviv_gpu_wait_idle(struct etnaviv_gpu *gpu, unsigned int timeout_ms)
> >  static int etnaviv_gpu_hw_suspend(struct etnaviv_gpu *gpu)
> >  {
> >       if (gpu->initialized && gpu->fe_running) {
> > +             hrtimer_cancel(&gpu->loadavg_timer);
> > +
> This isn't symmetric. Here you only cancel the timer when FE was
> running, but in the resume function you unconditionally start the
> timer.
>
>
> Moving the timer start into etnaviv_gpu_start_fe() seems to be a good
> idea. Sampling the idle state of a GPU with the FE not running doesn't
> make much sense in the first place, as it will unsurprisingly be fully
> idle. Doing this would also allow you to drop the
> etnaviv_gpu_shutdown() and unload_gpu() functions, as the timer doesn't
> need to be started when initializing the GPU.
>

Will try that.

>
> >               /* Replace the last WAIT with END */
> >               mutex_lock(&gpu->lock);
> >               etnaviv_buffer_end(gpu);
> > @@ -1608,7 +1648,8 @@ static int etnaviv_gpu_hw_suspend(struct etnaviv_gpu *gpu)
> >  #ifdef CONFIG_PM
> >  static int etnaviv_gpu_hw_resume(struct etnaviv_gpu *gpu)
> >  {
> > -     int ret;
> > +     s64 missing_samples;
> > +     int ret, i, j;
> >
> >       ret = mutex_lock_killable(&gpu->lock);
> >       if (ret)
> > @@ -1617,7 +1658,27 @@ static int etnaviv_gpu_hw_resume(struct etnaviv_gpu *gpu)
> >       etnaviv_gpu_update_clock(gpu);
> >       etnaviv_gpu_hw_init(gpu);
> >
> > +     /* Update loadavg based on delta of suspend and resume ktime.
> > +      *
> > +      * Our SMA algorithm uses a fixed size of 100 items to be able
> > +      * to calculate the mean over one second as we sample every 10ms.
> > +      */
> > +     missing_samples = div_s64(ktime_ms_delta(ktime_get(), gpu->loadavg_last_sample_time), 10);
>
> In the timer function you use the loadavg_polling_frequency const for
> this value. It would be good to be consistent here. Probably just
> #define the polling interval and use this both here and in the timer
> function.
>

Okay.

> > +     missing_samples = min(missing_samples, (s64)100);
> > +
> > +     spin_lock_bh(&gpu->loadavg_spinlock);
> > +
> > +     for (i = 0; i < ARRAY_SIZE(etna_idle_module_names); i++) {
> > +             struct sma_loadavg *loadavg = &gpu->loadavg_value[i];
> > +
> > +             for (j = 0; j < missing_samples; j++)
> > +                     sma_loadavg_add(loadavg, 0);
> > +     }
> > +
> > +     spin_unlock_bh(&gpu->loadavg_spinlock);
> > +
> >       mutex_unlock(&gpu->lock);
> > +     hrtimer_start(&gpu->loadavg_timer, loadavg_polling_frequency, HRTIMER_MODE_ABS_SOFT);
> >
> >       return 0;
> >  }
> > @@ -1787,6 +1848,7 @@ static int etnaviv_gpu_platform_probe(struct platform_device *pdev)
> >       gpu->dev = &pdev->dev;
> >       mutex_init(&gpu->lock);
> >       mutex_init(&gpu->fence_lock);
> > +     spin_lock_init(&gpu->loadavg_spinlock);
> >
> >       /* Map registers: */
> >       gpu->mmio = devm_platform_ioremap_resource(pdev, 0);
> > diff --git a/drivers/gpu/drm/etnaviv/etnaviv_gpu.h b/drivers/gpu/drm/etnaviv/etnaviv_gpu.h
> > index 85eddd492774..881f071f640e 100644
> > --- a/drivers/gpu/drm/etnaviv/etnaviv_gpu.h
> > +++ b/drivers/gpu/drm/etnaviv/etnaviv_gpu.h
> > @@ -10,6 +10,8 @@
> >  #include "etnaviv_gem.h"
> >  #include "etnaviv_mmu.h"
> >  #include "etnaviv_drv.h"
> > +#include "etnaviv_sma.h"
> > +#include "state_hi.xml.h"
> >
> >  struct etnaviv_gem_submit;
> >  struct etnaviv_vram_mapping;
> > @@ -91,6 +93,33 @@ struct clk;
> >
> >  #define ETNA_NR_EVENTS 30
> >
> > +DECLARE_SMA(loadavg, 100)
> > +
> > +static const struct {
> > +    const char *name;
> > +    u32 bit;
> > +} etna_idle_module_names[] = {
>
> Drop the _names prefix. This isn't just enumerating names, but also the
> bit positions in the state register.
>

Okay.

-- 
greets
--
Christian Gmeiner, MSc

https://christian-gmeiner.info/privacypolicy

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH v2 2/4] drm/etnaviv: add loadavg accounting
@ 2022-07-02 11:53       ` Christian Gmeiner
  0 siblings, 0 replies; 22+ messages in thread
From: Christian Gmeiner @ 2022-07-02 11:53 UTC (permalink / raw)
  To: Lucas Stach
  Cc: David Airlie, LKML, open list:DRM DRIVERS FOR VIVANTE GPU IP,
	moderated list:DRM DRIVERS FOR VIVANTE GPU IP, Russell King

Am Fr., 24. Juni 2022 um 11:38 Uhr schrieb Lucas Stach <l.stach@pengutronix.de>:
>
> Am Dienstag, dem 21.06.2022 um 09:20 +0200 schrieb Christian Gmeiner:
> > The GPU has an idle state register where each bit represents the idle
> > state of a sub-GPU component like FE or TX. Sample this register
> > every 10ms and calculate a simple moving average over the sub-GPU
> > component idle states with a total observation time frame of 1s.
> >
> > This provides us with a percentage based load of each sub-GPU
> > component.
> >
> > Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
> > ---
> >  drivers/gpu/drm/etnaviv/etnaviv_drv.c | 14 ++++++
> >  drivers/gpu/drm/etnaviv/etnaviv_gpu.c | 64 ++++++++++++++++++++++++++-
> >  drivers/gpu/drm/etnaviv/etnaviv_gpu.h | 37 ++++++++++++++++
> >  3 files changed, 114 insertions(+), 1 deletion(-)
> >
> > diff --git a/drivers/gpu/drm/etnaviv/etnaviv_drv.c b/drivers/gpu/drm/etnaviv/etnaviv_drv.c
> > index 1d2b4fb4bcf8..d5c6115e56bd 100644
> > --- a/drivers/gpu/drm/etnaviv/etnaviv_drv.c
> > +++ b/drivers/gpu/drm/etnaviv/etnaviv_drv.c
> > @@ -46,6 +46,19 @@ static void load_gpu(struct drm_device *dev)
> >       }
> >  }
> >
> > +static void unload_gpu(struct drm_device *dev)
> > +{
> > +     struct etnaviv_drm_private *priv = dev->dev_private;
> > +     unsigned int i;
> > +
> > +     for (i = 0; i < ETNA_MAX_PIPES; i++) {
> > +             struct etnaviv_gpu *g = priv->gpu[i];
> > +
> > +             if (g)
> > +                     etnaviv_gpu_shutdown(g);
> > +     }
> > +}
> > +
> >  static int etnaviv_open(struct drm_device *dev, struct drm_file *file)
> >  {
> >       struct etnaviv_drm_private *priv = dev->dev_private;
> > @@ -557,6 +570,7 @@ static void etnaviv_unbind(struct device *dev)
> >       struct drm_device *drm = dev_get_drvdata(dev);
> >       struct etnaviv_drm_private *priv = drm->dev_private;
> >
> > +     unload_gpu(drm);
> >       drm_dev_unregister(drm);
> >
> >       component_unbind_all(dev, drm);
> > diff --git a/drivers/gpu/drm/etnaviv/etnaviv_gpu.c b/drivers/gpu/drm/etnaviv/etnaviv_gpu.c
> > index 37018bc55810..202002ae75ee 100644
> > --- a/drivers/gpu/drm/etnaviv/etnaviv_gpu.c
> > +++ b/drivers/gpu/drm/etnaviv/etnaviv_gpu.c
> > @@ -27,6 +27,8 @@
> >  #include "state_hi.xml.h"
> >  #include "cmdstream.xml.h"
> >
> > +static const ktime_t loadavg_polling_frequency = 10 * NSEC_PER_MSEC;
> > +
> Feeling like a nitpicker, but the thing defined here isn't a frequency,
> but a time delta/interval.
>

Will rename it in the next version of the patch series.

>
> >  static const struct platform_device_id gpu_ids[] = {
> >       { .name = "etnaviv-gpu,2d" },
> >       { },
> > @@ -745,6 +747,32 @@ static void etnaviv_gpu_hw_init(struct etnaviv_gpu *gpu)
> >       gpu_write(gpu, VIVS_HI_INTR_ENBL, ~0U);
> >  }
> >
> > +static enum hrtimer_restart etnaviv_loadavg_function(struct hrtimer *t)
> > +{
> > +     struct etnaviv_gpu *gpu = container_of(t, struct etnaviv_gpu, loadavg_timer);
> > +     const u32 idle = gpu_read(gpu, VIVS_HI_IDLE_STATE);
> > +     int i;
> > +
> > +     gpu->loadavg_last_sample_time = ktime_get();
> > +
> > +     for (i = 0; i < ARRAY_SIZE(etna_idle_module_names); i++)
> > +             if ((idle & etna_idle_module_names[i].bit))
> > +                     sma_loadavg_add(&gpu->loadavg_value[i], 0);
> > +             else
> > +                     sma_loadavg_add(&gpu->loadavg_value[i], 100);
> > +
> > +     spin_lock(&gpu->loadavg_spinlock);
> > +
> > +     for (i = 0; i < ARRAY_SIZE(etna_idle_module_names); i++)
> > +             gpu->loadavg_percentage[i] = sma_loadavg_read(&gpu->loadavg_value[i]);
> > +
> > +     spin_unlock(&gpu->loadavg_spinlock);
>
> After pondering this for a bit, I don't think we need this spinlock.
> The percentage is a single value per engine, so they are already single
> write atomic. The worst thing that can happen without this spinlock is
> that on read of the loadavg some engines already have the value of
> sample period n+1 integrated, while another set is still at n, which I
> don't think we care much about, as those load values are already quite
> coarse.
>

Okay.. will remove the spinlock.

> > +
> > +     hrtimer_forward_now(t, loadavg_polling_frequency);
> > +
> > +     return HRTIMER_RESTART;
> > +}
> > +
> >  int etnaviv_gpu_init(struct etnaviv_gpu *gpu)
> >  {
> >       struct etnaviv_drm_private *priv = gpu->drm->dev_private;
> > @@ -839,6 +867,11 @@ int etnaviv_gpu_init(struct etnaviv_gpu *gpu)
> >       for (i = 0; i < ARRAY_SIZE(gpu->event); i++)
> >               complete(&gpu->event_free);
> >
> > +     /* Setup loadavg timer */
> > +     hrtimer_init(&gpu->loadavg_timer, CLOCK_MONOTONIC, HRTIMER_MODE_ABS_SOFT);
> > +     gpu->loadavg_timer.function = etnaviv_loadavg_function;
> > +     hrtimer_start(&gpu->loadavg_timer, loadavg_polling_frequency, HRTIMER_MODE_ABS_SOFT);
> > +
> >       /* Now program the hardware */
> >       mutex_lock(&gpu->lock);
> >       etnaviv_gpu_hw_init(gpu);
> > @@ -859,6 +892,11 @@ int etnaviv_gpu_init(struct etnaviv_gpu *gpu)
> >       return ret;
> >  }
> >
> > +void etnaviv_gpu_shutdown(struct etnaviv_gpu *gpu)
> > +{
> > +     hrtimer_cancel(&gpu->loadavg_timer);
> > +}
> > +
> >  #ifdef CONFIG_DEBUG_FS
> >  struct dma_debug {
> >       u32 address[2];
> > @@ -1585,6 +1623,8 @@ int etnaviv_gpu_wait_idle(struct etnaviv_gpu *gpu, unsigned int timeout_ms)
> >  static int etnaviv_gpu_hw_suspend(struct etnaviv_gpu *gpu)
> >  {
> >       if (gpu->initialized && gpu->fe_running) {
> > +             hrtimer_cancel(&gpu->loadavg_timer);
> > +
> This isn't symmetric. Here you only cancel the timer when FE was
> running, but in the resume function you unconditionally start the
> timer.
>
>
> Moving the timer start into etnaviv_gpu_start_fe() seems to be a good
> idea. Sampling the idle state of a GPU with the FE not running doesn't
> make much sense in the first place, as it will unsurprisingly be fully
> idle. Doing this would also allow you to drop the
> etnaviv_gpu_shutdown() and unload_gpu() functions, as the timer doesn't
> need to be started when initializing the GPU.
>

Will try that.

>
> >               /* Replace the last WAIT with END */
> >               mutex_lock(&gpu->lock);
> >               etnaviv_buffer_end(gpu);
> > @@ -1608,7 +1648,8 @@ static int etnaviv_gpu_hw_suspend(struct etnaviv_gpu *gpu)
> >  #ifdef CONFIG_PM
> >  static int etnaviv_gpu_hw_resume(struct etnaviv_gpu *gpu)
> >  {
> > -     int ret;
> > +     s64 missing_samples;
> > +     int ret, i, j;
> >
> >       ret = mutex_lock_killable(&gpu->lock);
> >       if (ret)
> > @@ -1617,7 +1658,27 @@ static int etnaviv_gpu_hw_resume(struct etnaviv_gpu *gpu)
> >       etnaviv_gpu_update_clock(gpu);
> >       etnaviv_gpu_hw_init(gpu);
> >
> > +     /* Update loadavg based on delta of suspend and resume ktime.
> > +      *
> > +      * Our SMA algorithm uses a fixed size of 100 items to be able
> > +      * to calculate the mean over one second as we sample every 10ms.
> > +      */
> > +     missing_samples = div_s64(ktime_ms_delta(ktime_get(), gpu->loadavg_last_sample_time), 10);
>
> In the timer function you use the loadavg_polling_frequency const for
> this value. It would be good to be consistent here. Probably just
> #define the polling interval and use this both here and in the timer
> function.
>

Okay.

> > +     missing_samples = min(missing_samples, (s64)100);
> > +
> > +     spin_lock_bh(&gpu->loadavg_spinlock);
> > +
> > +     for (i = 0; i < ARRAY_SIZE(etna_idle_module_names); i++) {
> > +             struct sma_loadavg *loadavg = &gpu->loadavg_value[i];
> > +
> > +             for (j = 0; j < missing_samples; j++)
> > +                     sma_loadavg_add(loadavg, 0);
> > +     }
> > +
> > +     spin_unlock_bh(&gpu->loadavg_spinlock);
> > +
> >       mutex_unlock(&gpu->lock);
> > +     hrtimer_start(&gpu->loadavg_timer, loadavg_polling_frequency, HRTIMER_MODE_ABS_SOFT);
> >
> >       return 0;
> >  }
> > @@ -1787,6 +1848,7 @@ static int etnaviv_gpu_platform_probe(struct platform_device *pdev)
> >       gpu->dev = &pdev->dev;
> >       mutex_init(&gpu->lock);
> >       mutex_init(&gpu->fence_lock);
> > +     spin_lock_init(&gpu->loadavg_spinlock);
> >
> >       /* Map registers: */
> >       gpu->mmio = devm_platform_ioremap_resource(pdev, 0);
> > diff --git a/drivers/gpu/drm/etnaviv/etnaviv_gpu.h b/drivers/gpu/drm/etnaviv/etnaviv_gpu.h
> > index 85eddd492774..881f071f640e 100644
> > --- a/drivers/gpu/drm/etnaviv/etnaviv_gpu.h
> > +++ b/drivers/gpu/drm/etnaviv/etnaviv_gpu.h
> > @@ -10,6 +10,8 @@
> >  #include "etnaviv_gem.h"
> >  #include "etnaviv_mmu.h"
> >  #include "etnaviv_drv.h"
> > +#include "etnaviv_sma.h"
> > +#include "state_hi.xml.h"
> >
> >  struct etnaviv_gem_submit;
> >  struct etnaviv_vram_mapping;
> > @@ -91,6 +93,33 @@ struct clk;
> >
> >  #define ETNA_NR_EVENTS 30
> >
> > +DECLARE_SMA(loadavg, 100)
> > +
> > +static const struct {
> > +    const char *name;
> > +    u32 bit;
> > +} etna_idle_module_names[] = {
>
> Drop the _names prefix. This isn't just enumerating names, but also the
> bit positions in the state register.
>

Okay.

-- 
greets
--
Christian Gmeiner, MSc

https://christian-gmeiner.info/privacypolicy

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH v2 4/4] drm/etnaviv: export loadavg via perfmon
  2022-06-24  9:44     ` Lucas Stach
@ 2022-07-02 11:57       ` Christian Gmeiner
  -1 siblings, 0 replies; 22+ messages in thread
From: Christian Gmeiner @ 2022-07-02 11:57 UTC (permalink / raw)
  To: Lucas Stach
  Cc: LKML, David Airlie,
	moderated list:DRM DRIVERS FOR VIVANTE GPU IP,
	open list:DRM DRIVERS FOR VIVANTE GPU IP, Daniel Vetter,
	Russell King

Hi Lucas

> You need to explain a bit more how you intend to use those.
>
> Contrary to all other perfmon values, where we go to great lengths to
> only count the load put onto the GPU by the specific process requesting
> the perfmon, the loadavg values also include the load caused by other
> submits. Due to this difference in behavior I fear that those new
> counters might be confusing to use. But maybe you have a use-case in
> mind that I don't see right now.

I see these values as total load avg of a sub-GPU component. Sure it is not per
process but it is a starting point. I have no problem introducing per
process load avg
values .. lets see how the next version of this series will look like,

-- 
greets
--
Christian Gmeiner, MSc

https://christian-gmeiner.info/privacypolicy

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH v2 4/4] drm/etnaviv: export loadavg via perfmon
@ 2022-07-02 11:57       ` Christian Gmeiner
  0 siblings, 0 replies; 22+ messages in thread
From: Christian Gmeiner @ 2022-07-02 11:57 UTC (permalink / raw)
  To: Lucas Stach
  Cc: David Airlie, LKML, open list:DRM DRIVERS FOR VIVANTE GPU IP,
	moderated list:DRM DRIVERS FOR VIVANTE GPU IP, Russell King

Hi Lucas

> You need to explain a bit more how you intend to use those.
>
> Contrary to all other perfmon values, where we go to great lengths to
> only count the load put onto the GPU by the specific process requesting
> the perfmon, the loadavg values also include the load caused by other
> submits. Due to this difference in behavior I fear that those new
> counters might be confusing to use. But maybe you have a use-case in
> mind that I don't see right now.

I see these values as total load avg of a sub-GPU component. Sure it is not per
process but it is a starting point. I have no problem introducing per
process load avg
values .. lets see how the next version of this series will look like,

-- 
greets
--
Christian Gmeiner, MSc

https://christian-gmeiner.info/privacypolicy

^ permalink raw reply	[flat|nested] 22+ messages in thread

end of thread, other threads:[~2022-07-02 11:58 UTC | newest]

Thread overview: 22+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-06-21  7:20 [PATCH v2 0/4] Add support for GPU load values Christian Gmeiner
2022-06-21  7:20 ` Christian Gmeiner
2022-06-21  7:20 ` [PATCH v2 1/4] drm/etnaviv: add simple moving average (SMA) Christian Gmeiner
2022-06-21  7:20   ` Christian Gmeiner
2022-06-24  9:22   ` Lucas Stach
2022-06-24  9:22     ` Lucas Stach
2022-07-02 11:41     ` Christian Gmeiner
2022-07-02 11:41       ` Christian Gmeiner
2022-06-21  7:20 ` [PATCH v2 2/4] drm/etnaviv: add loadavg accounting Christian Gmeiner
2022-06-21  7:20   ` Christian Gmeiner
2022-06-24  9:38   ` Lucas Stach
2022-06-24  9:38     ` Lucas Stach
2022-07-02 11:53     ` Christian Gmeiner
2022-07-02 11:53       ` Christian Gmeiner
2022-06-21  7:20 ` [PATCH v2 3/4] drm/etnaviv: show loadavg in debugfs Christian Gmeiner
2022-06-21  7:20   ` Christian Gmeiner
2022-06-21  7:20 ` [PATCH v2 4/4] drm/etnaviv: export loadavg via perfmon Christian Gmeiner
2022-06-21  7:20   ` Christian Gmeiner
2022-06-24  9:44   ` Lucas Stach
2022-06-24  9:44     ` Lucas Stach
2022-07-02 11:57     ` Christian Gmeiner
2022-07-02 11:57       ` Christian Gmeiner

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.