* [Intel-xe] [PATCH v2 0/2] drm/xe/pmu: Enable PMU interface
@ 2023-06-27 12:21 Aravind Iddamsetty
  2023-06-27 12:21 ` [Intel-xe] [PATCH v2 1/2] drm/xe: Get GT clock to nanosecs Aravind Iddamsetty
                   ` (6 more replies)
  0 siblings, 7 replies; 59+ messages in thread
From: Aravind Iddamsetty @ 2023-06-27 12:21 UTC (permalink / raw)
  To: intel-xe; +Cc: Bommu Krishnaiah, Tvrtko Ursulin

There is a set of engine group busyness counters provided by HW which
are a perfect fit to be exposed via PMU perf events.

BSPEC: 46559, 46560, 46722, 46729

Events can be listed using:
perf list
  xe_0000_03_00.0/any-engine-group-busy-gt0/         [Kernel PMU event]
  xe_0000_03_00.0/copy-group-busy-gt0/               [Kernel PMU event]
  xe_0000_03_00.0/interrupts/                        [Kernel PMU event]
  xe_0000_03_00.0/media-group-busy-gt0/              [Kernel PMU event]
  xe_0000_03_00.0/render-group-busy-gt0/             [Kernel PMU event]

and can be read using:

perf stat -e "xe_0000_8c_00.0/render-group-busy-gt0/" -I 1000
           time             counts unit events
     1.001139062                  0 ns  xe_0000_8c_00.0/render-group-busy-gt0/
     2.003294678                  0 ns  xe_0000_8c_00.0/render-group-busy-gt0/
     3.005199582                  0 ns  xe_0000_8c_00.0/render-group-busy-gt0/
     4.007076497                  0 ns  xe_0000_8c_00.0/render-group-busy-gt0/
     5.008553068                  0 ns  xe_0000_8c_00.0/render-group-busy-gt0/
     6.010531563              43520 ns  xe_0000_8c_00.0/render-group-busy-gt0/
     7.012468029              44800 ns  xe_0000_8c_00.0/render-group-busy-gt0/
     8.013463515                  0 ns  xe_0000_8c_00.0/render-group-busy-gt0/
     9.015300183                  0 ns  xe_0000_8c_00.0/render-group-busy-gt0/
    10.017233010                  0 ns  xe_0000_8c_00.0/render-group-busy-gt0/
    10.971934120                  0 ns  xe_0000_8c_00.0/render-group-busy-gt0/
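
The counters can also be read directly via perf_event_open(2). A minimal
userspace sketch, for illustration only (the device name is an example;
the sysfs path follows the standard dynamic-PMU layout, and config=1
corresponds to XE_PMU_RENDER_GROUP_BUSY(0) from the uapi header in this
series):

#include <linux/perf_event.h>
#include <sys/syscall.h>
#include <unistd.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

int main(void)
{
	struct perf_event_attr attr;
	uint64_t count;
	int fd, type;
	FILE *f;

	/* PMU type id assigned by perf core at registration time */
	f = fopen("/sys/bus/event_source/devices/xe_0000_03_00.0/type", "r");
	if (!f || fscanf(f, "%d", &type) != 1)
		return 1;
	fclose(f);

	memset(&attr, 0, sizeof(attr));
	attr.type = type;
	attr.size = sizeof(attr);
	attr.config = 1;	/* XE_PMU_RENDER_GROUP_BUSY(0) */

	/* device-wide (uncore-style) event: pid = -1, cpu from the cpumask */
	fd = syscall(__NR_perf_event_open, &attr, -1, 0, -1, 0);
	if (fd < 0)
		return 1;

	if (read(fd, &count, sizeof(count)) == sizeof(count))
		printf("render group busy: %llu ns\n",
		       (unsigned long long)count);
	close(fd);
	return 0;
}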

The PMU base implementation is taken from i915.

v2:
Store the last known value when the device is awake, return that value
while the GT is suspended, and update the driver copy when read while
awake.
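
A minimal userspace model of that scheme, for illustration only (all
names here are made up, not the driver's):

#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

static uint64_t hw_counter;	/* stands in for the OAG busyness register */
static uint64_t cached_sample;	/* driver copy, pmu->sample[...].cur */
static bool device_awake = true;

static uint64_t busyness_read(void)
{
	if (device_awake)
		cached_sample = hw_counter;	/* refresh the driver copy */
	return cached_sample;	/* last known value while suspended */
}

int main(void)
{
	hw_counter = 43520;
	printf("awake:     %llu\n", (unsigned long long)busyness_read());

	device_awake = false;	/* GT suspends; register must not be read */
	printf("suspended: %llu\n", (unsigned long long)busyness_read());
	return 0;
}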

Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Bommu Krishnaiah <krishnaiah.bommu@intel.com>
Cc: Venkata Ramana Nayana <venkata.ramana.nayana@intel.com>

Aravind Iddamsetty (2):
  drm/xe: Get GT clock to nanosecs
  drm/xe/pmu: Enable PMU interface

 drivers/gpu/drm/xe/Makefile          |   2 +
 drivers/gpu/drm/xe/regs/xe_gt_regs.h |   5 +
 drivers/gpu/drm/xe/xe_device.c       |   2 +
 drivers/gpu/drm/xe/xe_device_types.h |   4 +
 drivers/gpu/drm/xe/xe_gt.c           |   2 +
 drivers/gpu/drm/xe/xe_gt_clock.c     |  10 +
 drivers/gpu/drm/xe/xe_gt_clock.h     |   4 +-
 drivers/gpu/drm/xe/xe_irq.c          |  22 +
 drivers/gpu/drm/xe/xe_module.c       |   5 +
 drivers/gpu/drm/xe/xe_pmu.c          | 739 +++++++++++++++++++++++++++
 drivers/gpu/drm/xe/xe_pmu.h          |  25 +
 drivers/gpu/drm/xe/xe_pmu_types.h    |  80 +++
 include/uapi/drm/xe_drm.h            |  16 +
 13 files changed, 915 insertions(+), 1 deletion(-)
 create mode 100644 drivers/gpu/drm/xe/xe_pmu.c
 create mode 100644 drivers/gpu/drm/xe/xe_pmu.h
 create mode 100644 drivers/gpu/drm/xe/xe_pmu_types.h

-- 
2.25.1



* [Intel-xe] [PATCH v2 1/2] drm/xe: Get GT clock to nanosecs
  2023-06-27 12:21 [Intel-xe] [PATCH v2 0/2] drm/xe/pmu: Enable PMU interface Aravind Iddamsetty
@ 2023-06-27 12:21 ` Aravind Iddamsetty
  2023-07-04  9:29   ` Upadhyay, Tejas
  2023-07-06  0:55   ` Dixit, Ashutosh
  2023-06-27 12:21 ` [Intel-xe] [PATCH v2 2/2] drm/xe/pmu: Enable PMU interface Aravind Iddamsetty
                   ` (5 subsequent siblings)
  6 siblings, 2 replies; 59+ messages in thread
From: Aravind Iddamsetty @ 2023-06-27 12:21 UTC (permalink / raw)
  To: intel-xe

Add helpers to convert GT clock ticks to nanoseconds.
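
A standalone userspace sketch of the same round-up division, with a
made-up 19.2 MHz clock frequency for illustration:

#include <stdint.h>
#include <stdio.h>

#define NSEC_PER_SEC 1000000000ULL

static uint64_t div_u64_roundup(uint64_t nom, uint32_t den)
{
	return (nom + den - 1) / den;
}

int main(void)
{
	uint32_t clock_freq = 19200000;	/* hypothetical 19.2 MHz GT clock */
	uint64_t count = 192;		/* GT clock ticks */

	/* 192 * 1e9 / 19.2e6 = 10000 ns */
	printf("%llu ns\n", (unsigned long long)
	       div_u64_roundup(count * NSEC_PER_SEC, clock_freq));
	return 0;
}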

Signed-off-by: Aravind Iddamsetty <aravind.iddamsetty@intel.com>
---
 drivers/gpu/drm/xe/xe_gt_clock.c | 10 ++++++++++
 drivers/gpu/drm/xe/xe_gt_clock.h |  4 +++-
 2 files changed, 13 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/xe/xe_gt_clock.c b/drivers/gpu/drm/xe/xe_gt_clock.c
index 7cf11078ff57..3689c7d5cf53 100644
--- a/drivers/gpu/drm/xe/xe_gt_clock.c
+++ b/drivers/gpu/drm/xe/xe_gt_clock.c
@@ -78,3 +78,13 @@ int xe_gt_clock_init(struct xe_gt *gt)
 	gt->info.clock_freq = freq;
 	return 0;
 }
+
+static u64 div_u64_roundup(u64 nom, u32 den)
+{
+	return div_u64(nom + den - 1, den);
+}
+
+u64 xe_gt_clock_interval_to_ns(const struct xe_gt *gt, u64 count)
+{
+	return div_u64_roundup(count * NSEC_PER_SEC, gt->info.clock_freq);
+}
diff --git a/drivers/gpu/drm/xe/xe_gt_clock.h b/drivers/gpu/drm/xe/xe_gt_clock.h
index 511923afd224..91fc9b7e83f5 100644
--- a/drivers/gpu/drm/xe/xe_gt_clock.h
+++ b/drivers/gpu/drm/xe/xe_gt_clock.h
@@ -6,8 +6,10 @@
 #ifndef _XE_GT_CLOCK_H_
 #define _XE_GT_CLOCK_H_
 
+#include <linux/types.h>
+
 struct xe_gt;
 
 int xe_gt_clock_init(struct xe_gt *gt);
-
+u64 xe_gt_clock_interval_to_ns(const struct xe_gt *gt, u64 count);
 #endif
-- 
2.25.1



* [Intel-xe] [PATCH v2 2/2] drm/xe/pmu: Enable PMU interface
  2023-06-27 12:21 [Intel-xe] [PATCH v2 0/2] drm/xe/pmu: Enable PMU interface Aravind Iddamsetty
  2023-06-27 12:21 ` [Intel-xe] [PATCH v2 1/2] drm/xe: Get GT clock to nanosecs Aravind Iddamsetty
@ 2023-06-27 12:21 ` Aravind Iddamsetty
  2023-06-30 13:53   ` Upadhyay, Tejas
                     ` (6 more replies)
  2023-06-27 13:04 ` [Intel-xe] ✓ CI.Patch_applied: success for drm/xe/pmu: Enable PMU interface (rev2) Patchwork
                   ` (4 subsequent siblings)
  6 siblings, 7 replies; 59+ messages in thread
From: Aravind Iddamsetty @ 2023-06-27 12:21 UTC (permalink / raw)
  To: intel-xe; +Cc: Bommu Krishnaiah, Tvrtko Ursulin

There is a set of engine group busyness counters provided by HW which
are a perfect fit to be exposed via PMU perf events.

BSPEC: 46559, 46560, 46722, 46729

Events can be listed using:
perf list
  xe_0000_03_00.0/any-engine-group-busy-gt0/         [Kernel PMU event]
  xe_0000_03_00.0/copy-group-busy-gt0/               [Kernel PMU event]
  xe_0000_03_00.0/interrupts/                        [Kernel PMU event]
  xe_0000_03_00.0/media-group-busy-gt0/              [Kernel PMU event]
  xe_0000_03_00.0/render-group-busy-gt0/             [Kernel PMU event]

and can be read using:

perf stat -e "xe_0000_8c_00.0/render-group-busy-gt0/" -I 1000
           time             counts unit events
     1.001139062                  0 ns  xe_0000_8c_00.0/render-group-busy-gt0/
     2.003294678                  0 ns  xe_0000_8c_00.0/render-group-busy-gt0/
     3.005199582                  0 ns  xe_0000_8c_00.0/render-group-busy-gt0/
     4.007076497                  0 ns  xe_0000_8c_00.0/render-group-busy-gt0/
     5.008553068                  0 ns  xe_0000_8c_00.0/render-group-busy-gt0/
     6.010531563              43520 ns  xe_0000_8c_00.0/render-group-busy-gt0/
     7.012468029              44800 ns  xe_0000_8c_00.0/render-group-busy-gt0/
     8.013463515                  0 ns  xe_0000_8c_00.0/render-group-busy-gt0/
     9.015300183                  0 ns  xe_0000_8c_00.0/render-group-busy-gt0/
    10.017233010                  0 ns  xe_0000_8c_00.0/render-group-busy-gt0/
    10.971934120                  0 ns  xe_0000_8c_00.0/render-group-busy-gt0/

The PMU base implementation is taken from i915.

v2:
Store the last known value when the device is awake, return that value
while the GT is suspended, and update the driver copy when read while
awake.
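
The event config encoding places the GT id in the top 4 bits (see the
uapi macros below in include/uapi/drm/xe_drm.h). A userspace decode of
such a config value, for illustration:

#include <stdint.h>
#include <stdio.h>

#define __XE_PMU_GT_SHIFT (60)
#define ___XE_PMU_OTHER(gt, x) \
	(((uint64_t)(x)) | ((uint64_t)(gt) << __XE_PMU_GT_SHIFT))
#define XE_PMU_RENDER_GROUP_BUSY(gt)	___XE_PMU_OTHER(gt, 1)

int main(void)
{
	uint64_t config = XE_PMU_RENDER_GROUP_BUSY(1);	/* render, GT1 */
	unsigned int gt_id = config >> __XE_PMU_GT_SHIFT;
	uint64_t counter = config & ~(~0ULL << __XE_PMU_GT_SHIFT);

	/* prints: gt=1 counter=1 config=0x1000000000000001 */
	printf("gt=%u counter=%llu config=0x%llx\n", gt_id,
	       (unsigned long long)counter, (unsigned long long)config);
	return 0;
}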

Co-developed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Co-developed-by: Bommu Krishnaiah <krishnaiah.bommu@intel.com>
Signed-off-by: Aravind Iddamsetty <aravind.iddamsetty@intel.com>
---
 drivers/gpu/drm/xe/Makefile          |   2 +
 drivers/gpu/drm/xe/regs/xe_gt_regs.h |   5 +
 drivers/gpu/drm/xe/xe_device.c       |   2 +
 drivers/gpu/drm/xe/xe_device_types.h |   4 +
 drivers/gpu/drm/xe/xe_gt.c           |   2 +
 drivers/gpu/drm/xe/xe_irq.c          |  22 +
 drivers/gpu/drm/xe/xe_module.c       |   5 +
 drivers/gpu/drm/xe/xe_pmu.c          | 739 +++++++++++++++++++++++++++
 drivers/gpu/drm/xe/xe_pmu.h          |  25 +
 drivers/gpu/drm/xe/xe_pmu_types.h    |  80 +++
 include/uapi/drm/xe_drm.h            |  16 +
 11 files changed, 902 insertions(+)
 create mode 100644 drivers/gpu/drm/xe/xe_pmu.c
 create mode 100644 drivers/gpu/drm/xe/xe_pmu.h
 create mode 100644 drivers/gpu/drm/xe/xe_pmu_types.h

diff --git a/drivers/gpu/drm/xe/Makefile b/drivers/gpu/drm/xe/Makefile
index 081c57fd8632..e52ab795c566 100644
--- a/drivers/gpu/drm/xe/Makefile
+++ b/drivers/gpu/drm/xe/Makefile
@@ -217,6 +217,8 @@ xe-$(CONFIG_DRM_XE_DISPLAY) += \
 	i915-display/skl_universal_plane.o \
 	i915-display/skl_watermark.o
 
+xe-$(CONFIG_PERF_EVENTS) += xe_pmu.o
+
 ifeq ($(CONFIG_ACPI),y)
 	xe-$(CONFIG_DRM_XE_DISPLAY) += \
 		i915-display/intel_acpi.o \
diff --git a/drivers/gpu/drm/xe/regs/xe_gt_regs.h b/drivers/gpu/drm/xe/regs/xe_gt_regs.h
index 3f664011eaea..c7d9e4634745 100644
--- a/drivers/gpu/drm/xe/regs/xe_gt_regs.h
+++ b/drivers/gpu/drm/xe/regs/xe_gt_regs.h
@@ -285,6 +285,11 @@
 #define   INVALIDATION_BROADCAST_MODE_DIS	REG_BIT(12)
 #define   GLOBAL_INVALIDATION_MODE		REG_BIT(2)
 
+#define XE_OAG_RC0_ANY_ENGINE_BUSY_FREE		XE_REG(0xdb80)
+#define XE_OAG_ANY_MEDIA_FF_BUSY_FREE		XE_REG(0xdba0)
+#define XE_OAG_BLT_BUSY_FREE			XE_REG(0xdbbc)
+#define XE_OAG_RENDER_BUSY_FREE			XE_REG(0xdbdc)
+
 #define SAMPLER_MODE				XE_REG_MCR(0xe18c, XE_REG_OPTION_MASKED)
 #define   ENABLE_SMALLPL			REG_BIT(15)
 #define   SC_DISABLE_POWER_OPTIMIZATION_EBB	REG_BIT(9)
diff --git a/drivers/gpu/drm/xe/xe_device.c b/drivers/gpu/drm/xe/xe_device.c
index c7985af85a53..b2c7bd4a97d9 100644
--- a/drivers/gpu/drm/xe/xe_device.c
+++ b/drivers/gpu/drm/xe/xe_device.c
@@ -328,6 +328,8 @@ int xe_device_probe(struct xe_device *xe)
 
 	xe_debugfs_register(xe);
 
+	xe_pmu_register(&xe->pmu);
+
 	err = drmm_add_action_or_reset(&xe->drm, xe_device_sanitize, xe);
 	if (err)
 		return err;
diff --git a/drivers/gpu/drm/xe/xe_device_types.h b/drivers/gpu/drm/xe/xe_device_types.h
index 0226d44a6af2..3ba99aae92b9 100644
--- a/drivers/gpu/drm/xe/xe_device_types.h
+++ b/drivers/gpu/drm/xe/xe_device_types.h
@@ -15,6 +15,7 @@
 #include "xe_devcoredump_types.h"
 #include "xe_gt_types.h"
 #include "xe_platform_types.h"
+#include "xe_pmu.h"
 #include "xe_step_types.h"
 
 #if IS_ENABLED(CONFIG_DRM_XE_DISPLAY)
@@ -332,6 +333,9 @@ struct xe_device {
 	/** @d3cold_allowed: Indicates if d3cold is a valid device state */
 	bool d3cold_allowed;
 
+	/** @pmu: performance monitoring unit */
+	struct xe_pmu pmu;
+
 	/* private: */
 
 #if IS_ENABLED(CONFIG_DRM_XE_DISPLAY)
diff --git a/drivers/gpu/drm/xe/xe_gt.c b/drivers/gpu/drm/xe/xe_gt.c
index 2458397ce8af..96e3720923d4 100644
--- a/drivers/gpu/drm/xe/xe_gt.c
+++ b/drivers/gpu/drm/xe/xe_gt.c
@@ -593,6 +593,8 @@ int xe_gt_suspend(struct xe_gt *gt)
 	if (err)
 		goto err_msg;
 
+	engine_group_busyness_store(gt);
+
 	err = xe_uc_suspend(&gt->uc);
 	if (err)
 		goto err_force_wake;
diff --git a/drivers/gpu/drm/xe/xe_irq.c b/drivers/gpu/drm/xe/xe_irq.c
index b4ed1e4a3388..cb943fb94ec7 100644
--- a/drivers/gpu/drm/xe/xe_irq.c
+++ b/drivers/gpu/drm/xe/xe_irq.c
@@ -27,6 +27,24 @@
 #define IIR(offset)				XE_REG(offset + 0x8)
 #define IER(offset)				XE_REG(offset + 0xc)
 
+/*
+ * Interrupt statistics for PMU. Increments the counter only if the
+ * interrupt originated from the GPU, so that interrupts from a device
+ * which shares the interrupt line are not accounted.
+ */
+static inline void xe_pmu_irq_stats(struct xe_device *xe,
+				    irqreturn_t res)
+{
+	if (unlikely(res != IRQ_HANDLED))
+		return;
+
+	/*
+	 * A clever compiler translates that into INC. A not so clever one
+	 * should at least prevent store tearing.
+	 */
+	WRITE_ONCE(xe->pmu.irq_count, xe->pmu.irq_count + 1);
+}
+
 static void assert_iir_is_zero(struct xe_gt *mmio, struct xe_reg reg)
 {
 	u32 val = xe_mmio_read32(mmio, reg);
@@ -325,6 +343,8 @@ static irqreturn_t xelp_irq_handler(int irq, void *arg)
 
 	xe_display_irq_enable(xe, gu_misc_iir);
 
+	xe_pmu_irq_stats(xe, IRQ_HANDLED);
+
 	return IRQ_HANDLED;
 }
 
@@ -414,6 +434,8 @@ static irqreturn_t dg1_irq_handler(int irq, void *arg)
 	dg1_intr_enable(xe, false);
 	xe_display_irq_enable(xe, gu_misc_iir);
 
+	xe_pmu_irq_stats(xe, IRQ_HANDLED);
+
 	return IRQ_HANDLED;
 }
 
diff --git a/drivers/gpu/drm/xe/xe_module.c b/drivers/gpu/drm/xe/xe_module.c
index 75e5be939f53..f6fe89748525 100644
--- a/drivers/gpu/drm/xe/xe_module.c
+++ b/drivers/gpu/drm/xe/xe_module.c
@@ -12,6 +12,7 @@
 #include "xe_hw_fence.h"
 #include "xe_module.h"
 #include "xe_pci.h"
+#include "xe_pmu.h"
 #include "xe_sched_job.h"
 
 bool enable_guc = true;
@@ -49,6 +50,10 @@ static const struct init_funcs init_funcs[] = {
 		.init = xe_sched_job_module_init,
 		.exit = xe_sched_job_module_exit,
 	},
+	{
+		.init = xe_pmu_init,
+		.exit = xe_pmu_exit,
+	},
 	{
 		.init = xe_register_pci_driver,
 		.exit = xe_unregister_pci_driver,
diff --git a/drivers/gpu/drm/xe/xe_pmu.c b/drivers/gpu/drm/xe/xe_pmu.c
new file mode 100644
index 000000000000..bef1895be9f7
--- /dev/null
+++ b/drivers/gpu/drm/xe/xe_pmu.c
@@ -0,0 +1,739 @@
+/*
+ * SPDX-License-Identifier: MIT
+ *
+ * Copyright © 2023 Intel Corporation
+ */
+
+#include <drm/drm_drv.h>
+#include <drm/drm_managed.h>
+#include <drm/xe_drm.h>
+
+#include "regs/xe_gt_regs.h"
+#include "xe_device.h"
+#include "xe_gt_clock.h"
+#include "xe_mmio.h"
+
+static cpumask_t xe_pmu_cpumask;
+static unsigned int xe_pmu_target_cpu = -1;
+
+static unsigned int config_gt_id(const u64 config)
+{
+	return config >> __XE_PMU_GT_SHIFT;
+}
+
+static u64 config_counter(const u64 config)
+{
+	return config & ~(~0ULL << __XE_PMU_GT_SHIFT);
+}
+
+static unsigned int
+__sample_idx(struct xe_pmu *pmu, unsigned int gt_id, int sample)
+{
+	unsigned int idx = gt_id * __XE_NUM_PMU_SAMPLERS + sample;
+
+	XE_BUG_ON(idx >= ARRAY_SIZE(pmu->sample));
+
+	return idx;
+}
+
+static u64 read_sample(struct xe_pmu *pmu, unsigned int gt_id, int sample)
+{
+	return pmu->sample[__sample_idx(pmu, gt_id, sample)].cur;
+}
+
+static void
+store_sample(struct xe_pmu *pmu, unsigned int gt_id, int sample, u64 val)
+{
+	pmu->sample[__sample_idx(pmu, gt_id, sample)].cur = val;
+}
+
+static int engine_busyness_sample_type(u64 config)
+{
+	int type = 0;
+
+	switch (config) {
+	case XE_PMU_RENDER_GROUP_BUSY(0):
+		type =  __XE_SAMPLE_RENDER_GROUP_BUSY;
+		break;
+	case XE_PMU_COPY_GROUP_BUSY(0):
+		type = __XE_SAMPLE_COPY_GROUP_BUSY;
+		break;
+	case XE_PMU_MEDIA_GROUP_BUSY(0):
+		type = __XE_SAMPLE_MEDIA_GROUP_BUSY;
+		break;
+	case XE_PMU_ANY_ENGINE_GROUP_BUSY(0):
+		type = __XE_SAMPLE_ANY_ENGINE_GROUP_BUSY;
+		break;
+	}
+
+	return type;
+}
+
+static void xe_pmu_event_destroy(struct perf_event *event)
+{
+	struct xe_device *xe =
+		container_of(event->pmu, typeof(*xe), pmu.base);
+
+	drm_WARN_ON(&xe->drm, event->parent);
+
+	drm_dev_put(&xe->drm);
+}
+
+static u64 __engine_group_busyness_read(struct xe_gt *gt, u64 config)
+{
+	u64 val = 0;
+
+	switch (config) {
+	case XE_PMU_RENDER_GROUP_BUSY(0):
+		val = xe_mmio_read32(gt, XE_OAG_RENDER_BUSY_FREE);
+		break;
+	case XE_PMU_COPY_GROUP_BUSY(0):
+		val = xe_mmio_read32(gt, XE_OAG_BLT_BUSY_FREE);
+		break;
+	case XE_PMU_MEDIA_GROUP_BUSY(0):
+		val = xe_mmio_read32(gt, XE_OAG_ANY_MEDIA_FF_BUSY_FREE);
+		break;
+	case XE_PMU_ANY_ENGINE_GROUP_BUSY(0):
+		val = xe_mmio_read32(gt, XE_OAG_RC0_ANY_ENGINE_BUSY_FREE);
+		break;
+	default:
+		drm_warn(&gt->tile->xe->drm, "unknown pmu event\n");
+	}
+
+	return xe_gt_clock_interval_to_ns(gt, val * 16);
+}
+
+static u64 engine_group_busyness_read(struct xe_gt *gt, u64 config)
+{
+	int sample_type = engine_busyness_sample_type(config);
+	struct xe_device *xe = gt->tile->xe;
+	const unsigned int gt_id = gt->info.id;
+	struct xe_pmu *pmu = &xe->pmu;
+	bool device_awake;
+	unsigned long flags;
+	u64 val;
+
+	/*
+	 * We found no better way to check whether the device is awake. Before
+	 * suspending we set submission_state.enabled to false.
+	 */
+	device_awake = gt->uc.guc.submission_state.enabled ? true : false;
+	if (device_awake)
+		val = __engine_group_busyness_read(gt, config);
+
+	spin_lock_irqsave(&pmu->lock, flags);
+
+	if (device_awake)
+		store_sample(pmu, gt_id, sample_type, val);
+	else
+		val = read_sample(pmu, gt_id, sample_type);
+
+	spin_unlock_irqrestore(&pmu->lock, flags);
+
+	return val;
+}
+
+void engine_group_busyness_store(struct xe_gt *gt)
+{
+	struct xe_pmu *pmu = &gt->tile->xe->pmu;
+	unsigned int gt_id = gt->info.id;
+	unsigned long flags;
+
+	spin_lock_irqsave(&pmu->lock, flags);
+
+	store_sample(pmu, gt_id, __XE_SAMPLE_RENDER_GROUP_BUSY,
+		     __engine_group_busyness_read(gt, XE_PMU_RENDER_GROUP_BUSY(0)));
+	store_sample(pmu, gt_id, __XE_SAMPLE_COPY_GROUP_BUSY,
+		     __engine_group_busyness_read(gt, XE_PMU_COPY_GROUP_BUSY(0)));
+	store_sample(pmu, gt_id, __XE_SAMPLE_MEDIA_GROUP_BUSY,
+		     __engine_group_busyness_read(gt, XE_PMU_MEDIA_GROUP_BUSY(0)));
+	store_sample(pmu, gt_id, __XE_SAMPLE_ANY_ENGINE_GROUP_BUSY,
+		     __engine_group_busyness_read(gt, XE_PMU_ANY_ENGINE_GROUP_BUSY(0)));
+
+	spin_unlock_irqrestore(&pmu->lock, flags);
+}
+
+static int
+config_status(struct xe_device *xe, u64 config)
+{
+	unsigned int max_gt_id = xe->info.gt_count > 1 ? 1 : 0;
+	unsigned int gt_id = config_gt_id(config);
+	struct xe_gt *gt = xe_device_get_gt(xe, gt_id);
+
+	if (gt_id > max_gt_id)
+		return -ENOENT;
+
+	switch (config_counter(config)) {
+	case XE_PMU_INTERRUPTS(0):
+		if (gt_id)
+			return -ENOENT;
+		break;
+	case XE_PMU_RENDER_GROUP_BUSY(0):
+	case XE_PMU_COPY_GROUP_BUSY(0):
+	case XE_PMU_ANY_ENGINE_GROUP_BUSY(0):
+		if (GRAPHICS_VER(xe) < 12)
+			return -ENOENT;
+		break;
+	case XE_PMU_MEDIA_GROUP_BUSY(0):
+		if (MEDIA_VER(xe) >= 13 && gt->info.type != XE_GT_TYPE_MEDIA)
+			return -ENOENT;
+		break;
+	default:
+		return -ENOENT;
+	}
+
+	return 0;
+}
+
+static int xe_pmu_event_init(struct perf_event *event)
+{
+	struct xe_device *xe =
+		container_of(event->pmu, typeof(*xe), pmu.base);
+	struct xe_pmu *pmu = &xe->pmu;
+	int ret;
+
+	if (pmu->closed)
+		return -ENODEV;
+
+	if (event->attr.type != event->pmu->type)
+		return -ENOENT;
+
+	/* unsupported modes and filters */
+	if (event->attr.sample_period) /* no sampling */
+		return -EINVAL;
+
+	if (has_branch_stack(event))
+		return -EOPNOTSUPP;
+
+	if (event->cpu < 0)
+		return -EINVAL;
+
+	/* only allow running on one cpu at a time */
+	if (!cpumask_test_cpu(event->cpu, &xe_pmu_cpumask))
+		return -EINVAL;
+
+	ret = config_status(xe, event->attr.config);
+	if (ret)
+		return ret;
+
+	if (!event->parent) {
+		drm_dev_get(&xe->drm);
+		event->destroy = xe_pmu_event_destroy;
+	}
+
+	return 0;
+}
+
+static u64 __xe_pmu_event_read(struct perf_event *event)
+{
+	struct xe_device *xe =
+		container_of(event->pmu, typeof(*xe), pmu.base);
+	const unsigned int gt_id = config_gt_id(event->attr.config);
+	const u64 config = config_counter(event->attr.config);
+	struct xe_gt *gt = xe_device_get_gt(xe, gt_id);
+	struct xe_pmu *pmu = &xe->pmu;
+	u64 val = 0;
+
+	switch (config) {
+	case XE_PMU_INTERRUPTS(0):
+		val = READ_ONCE(pmu->irq_count);
+		break;
+	case XE_PMU_RENDER_GROUP_BUSY(0):
+	case XE_PMU_COPY_GROUP_BUSY(0):
+	case XE_PMU_ANY_ENGINE_GROUP_BUSY(0):
+	case XE_PMU_MEDIA_GROUP_BUSY(0):
+		val = engine_group_busyness_read(gt, config);
+	}
+
+	return val;
+}
+
+static void xe_pmu_event_read(struct perf_event *event)
+{
+	struct xe_device *xe =
+		container_of(event->pmu, typeof(*xe), pmu.base);
+	struct hw_perf_event *hwc = &event->hw;
+	struct xe_pmu *pmu = &xe->pmu;
+	u64 prev, new;
+
+	if (pmu->closed) {
+		event->hw.state = PERF_HES_STOPPED;
+		return;
+	}
+again:
+	prev = local64_read(&hwc->prev_count);
+	new = __xe_pmu_event_read(event);
+
+	if (local64_cmpxchg(&hwc->prev_count, prev, new) != prev)
+		goto again;
+
+	local64_add(new - prev, &event->count);
+}
+
+static void xe_pmu_enable(struct perf_event *event)
+{
+	/*
+	 * Store the current counter value so we can report the correct delta
+	 * for all listeners, even when the event was already enabled and has
+	 * an existing non-zero value.
+	 */
+	local64_set(&event->hw.prev_count, __xe_pmu_event_read(event));
+}
+
+static void xe_pmu_event_start(struct perf_event *event, int flags)
+{
+	struct xe_device *xe =
+		container_of(event->pmu, typeof(*xe), pmu.base);
+	struct xe_pmu *pmu = &xe->pmu;
+
+	if (pmu->closed)
+		return;
+
+	xe_pmu_enable(event);
+	event->hw.state = 0;
+}
+
+static void xe_pmu_event_stop(struct perf_event *event, int flags)
+{
+	if (flags & PERF_EF_UPDATE)
+		xe_pmu_event_read(event);
+
+	event->hw.state = PERF_HES_STOPPED;
+}
+
+static int xe_pmu_event_add(struct perf_event *event, int flags)
+{
+	struct xe_device *xe =
+		container_of(event->pmu, typeof(*xe), pmu.base);
+	struct xe_pmu *pmu = &xe->pmu;
+
+	if (pmu->closed)
+		return -ENODEV;
+
+	if (flags & PERF_EF_START)
+		xe_pmu_event_start(event, flags);
+
+	return 0;
+}
+
+static void xe_pmu_event_del(struct perf_event *event, int flags)
+{
+	xe_pmu_event_stop(event, PERF_EF_UPDATE);
+}
+
+static int xe_pmu_event_event_idx(struct perf_event *event)
+{
+	return 0;
+}
+
+struct xe_str_attribute {
+	struct device_attribute attr;
+	const char *str;
+};
+
+static ssize_t xe_pmu_format_show(struct device *dev,
+				  struct device_attribute *attr, char *buf)
+{
+	struct xe_str_attribute *eattr;
+
+	eattr = container_of(attr, struct xe_str_attribute, attr);
+	return sprintf(buf, "%s\n", eattr->str);
+}
+
+#define XE_PMU_FORMAT_ATTR(_name, _config) \
+	(&((struct xe_str_attribute[]) { \
+		{ .attr = __ATTR(_name, 0444, xe_pmu_format_show, NULL), \
+		  .str = _config, } \
+	})[0].attr.attr)
+
+static struct attribute *xe_pmu_format_attrs[] = {
+	XE_PMU_FORMAT_ATTR(xe_eventid, "config:0-20"),
+	NULL,
+};
+
+static const struct attribute_group xe_pmu_format_attr_group = {
+	.name = "format",
+	.attrs = xe_pmu_format_attrs,
+};
+
+struct xe_ext_attribute {
+	struct device_attribute attr;
+	unsigned long val;
+};
+
+static ssize_t xe_pmu_event_show(struct device *dev,
+				 struct device_attribute *attr, char *buf)
+{
+	struct xe_ext_attribute *eattr;
+
+	eattr = container_of(attr, struct xe_ext_attribute, attr);
+	return sprintf(buf, "config=0x%lx\n", eattr->val);
+}
+
+static ssize_t cpumask_show(struct device *dev,
+			    struct device_attribute *attr, char *buf)
+{
+	return cpumap_print_to_pagebuf(true, buf, &xe_pmu_cpumask);
+}
+
+static DEVICE_ATTR_RO(cpumask);
+
+static struct attribute *xe_cpumask_attrs[] = {
+	&dev_attr_cpumask.attr,
+	NULL,
+};
+
+static const struct attribute_group xe_pmu_cpumask_attr_group = {
+	.attrs = xe_cpumask_attrs,
+};
+
+#define __event(__counter, __name, __unit) \
+{ \
+	.counter = (__counter), \
+	.name = (__name), \
+	.unit = (__unit), \
+	.global = false, \
+}
+
+#define __global_event(__counter, __name, __unit) \
+{ \
+	.counter = (__counter), \
+	.name = (__name), \
+	.unit = (__unit), \
+	.global = true, \
+}
+
+static struct xe_ext_attribute *
+add_xe_attr(struct xe_ext_attribute *attr, const char *name, u64 config)
+{
+	sysfs_attr_init(&attr->attr.attr);
+	attr->attr.attr.name = name;
+	attr->attr.attr.mode = 0444;
+	attr->attr.show = xe_pmu_event_show;
+	attr->val = config;
+
+	return ++attr;
+}
+
+static struct perf_pmu_events_attr *
+add_pmu_attr(struct perf_pmu_events_attr *attr, const char *name,
+	     const char *str)
+{
+	sysfs_attr_init(&attr->attr.attr);
+	attr->attr.attr.name = name;
+	attr->attr.attr.mode = 0444;
+	attr->attr.show = perf_event_sysfs_show;
+	attr->event_str = str;
+
+	return ++attr;
+}
+
+static struct attribute **
+create_event_attributes(struct xe_pmu *pmu)
+{
+	struct xe_device *xe = container_of(pmu, typeof(*xe), pmu);
+	static const struct {
+		unsigned int counter;
+		const char *name;
+		const char *unit;
+		bool global;
+	} events[] = {
+		__global_event(0, "interrupts", NULL),
+		__event(1, "render-group-busy", "ns"),
+		__event(2, "copy-group-busy", "ns"),
+		__event(3, "media-group-busy", "ns"),
+		__event(4, "any-engine-group-busy", "ns"),
+	};
+
+	unsigned int count = 0;
+	struct perf_pmu_events_attr *pmu_attr = NULL, *pmu_iter;
+	struct xe_ext_attribute *xe_attr = NULL, *xe_iter;
+	struct attribute **attr = NULL, **attr_iter;
+	struct xe_gt *gt;
+	unsigned int i, j;
+
+	/* Count how many counters we will be exposing. */
+	for_each_gt(gt, xe, j) {
+		for (i = 0; i < ARRAY_SIZE(events); i++) {
+			u64 config = ___XE_PMU_OTHER(j, events[i].counter);
+
+			if (!config_status(xe, config))
+				count++;
+		}
+	}
+
+	/* Allocate attribute objects and table. */
+	xe_attr = kcalloc(count, sizeof(*xe_attr), GFP_KERNEL);
+	if (!xe_attr)
+		goto err_alloc;
+
+	pmu_attr = kcalloc(count, sizeof(*pmu_attr), GFP_KERNEL);
+	if (!pmu_attr)
+		goto err_alloc;
+
+	/* Max one pointer of each attribute type plus a termination entry. */
+	attr = kcalloc(count * 2 + 1, sizeof(*attr), GFP_KERNEL);
+	if (!attr)
+		goto err_alloc;
+
+	xe_iter = xe_attr;
+	pmu_iter = pmu_attr;
+	attr_iter = attr;
+
+	for_each_gt(gt, xe, j) {
+		for (i = 0; i < ARRAY_SIZE(events); i++) {
+			u64 config = ___XE_PMU_OTHER(j, events[i].counter);
+			char *str;
+
+			if (config_status(xe, config))
+				continue;
+
+			if (events[i].global)
+				str = kstrdup(events[i].name, GFP_KERNEL);
+			else
+				str = kasprintf(GFP_KERNEL, "%s-gt%u",
+						events[i].name, j);
+			if (!str)
+				goto err;
+
+			*attr_iter++ = &xe_iter->attr.attr;
+			xe_iter = add_xe_attr(xe_iter, str, config);
+
+			if (events[i].unit) {
+				if (events[i].global)
+					str = kasprintf(GFP_KERNEL, "%s.unit",
+							events[i].name);
+				else
+					str = kasprintf(GFP_KERNEL, "%s-gt%u.unit",
+							events[i].name, j);
+				if (!str)
+					goto err;
+
+				*attr_iter++ = &pmu_iter->attr.attr;
+				pmu_iter = add_pmu_attr(pmu_iter, str,
+							events[i].unit);
+			}
+		}
+	}
+
+	pmu->xe_attr = xe_attr;
+	pmu->pmu_attr = pmu_attr;
+
+	return attr;
+
+err:
+	for (attr_iter = attr; *attr_iter; attr_iter++)
+		kfree((*attr_iter)->name);
+
+err_alloc:
+	kfree(attr);
+	kfree(xe_attr);
+	kfree(pmu_attr);
+
+	return NULL;
+}
+
+static void free_event_attributes(struct xe_pmu *pmu)
+{
+	struct attribute **attr_iter = pmu->events_attr_group.attrs;
+
+	for (; *attr_iter; attr_iter++)
+		kfree((*attr_iter)->name);
+
+	kfree(pmu->events_attr_group.attrs);
+	kfree(pmu->xe_attr);
+	kfree(pmu->pmu_attr);
+
+	pmu->events_attr_group.attrs = NULL;
+	pmu->xe_attr = NULL;
+	pmu->pmu_attr = NULL;
+}
+
+static int xe_pmu_cpu_online(unsigned int cpu, struct hlist_node *node)
+{
+	struct xe_pmu *pmu = hlist_entry_safe(node, typeof(*pmu), cpuhp.node);
+
+	XE_BUG_ON(!pmu->base.event_init);
+
+	/* Select the first online CPU as a designated reader. */
+	if (cpumask_empty(&xe_pmu_cpumask))
+		cpumask_set_cpu(cpu, &xe_pmu_cpumask);
+
+	return 0;
+}
+
+static int xe_pmu_cpu_offline(unsigned int cpu, struct hlist_node *node)
+{
+	struct xe_pmu *pmu = hlist_entry_safe(node, typeof(*pmu), cpuhp.node);
+	unsigned int target = xe_pmu_target_cpu;
+
+	XE_BUG_ON(!pmu->base.event_init);
+
+	/*
+	 * Unregistering an instance generates a CPU offline event which we must
+	 * ignore to avoid incorrectly modifying the shared xe_pmu_cpumask.
+	 */
+	if (pmu->closed)
+		return 0;
+
+	if (cpumask_test_and_clear_cpu(cpu, &xe_pmu_cpumask)) {
+		target = cpumask_any_but(topology_sibling_cpumask(cpu), cpu);
+
+		/* Migrate events if there is a valid target */
+		if (target < nr_cpu_ids) {
+			cpumask_set_cpu(target, &xe_pmu_cpumask);
+			xe_pmu_target_cpu = target;
+		}
+	}
+
+	if (target < nr_cpu_ids && target != pmu->cpuhp.cpu) {
+		perf_pmu_migrate_context(&pmu->base, cpu, target);
+		pmu->cpuhp.cpu = target;
+	}
+
+	return 0;
+}
+
+static enum cpuhp_state cpuhp_slot = CPUHP_INVALID;
+
+int xe_pmu_init(void)
+{
+	int ret;
+
+	ret = cpuhp_setup_state_multi(CPUHP_AP_ONLINE_DYN,
+				      "perf/x86/intel/xe:online",
+				      xe_pmu_cpu_online,
+				      xe_pmu_cpu_offline);
+	if (ret < 0)
+		pr_notice("Failed to setup cpuhp state for xe PMU! (%d)\n",
+			  ret);
+	else
+		cpuhp_slot = ret;
+
+	return 0;
+}
+
+void xe_pmu_exit(void)
+{
+	if (cpuhp_slot != CPUHP_INVALID)
+		cpuhp_remove_multi_state(cpuhp_slot);
+}
+
+static int xe_pmu_register_cpuhp_state(struct xe_pmu *pmu)
+{
+	if (cpuhp_slot == CPUHP_INVALID)
+		return -EINVAL;
+
+	return cpuhp_state_add_instance(cpuhp_slot, &pmu->cpuhp.node);
+}
+
+static void xe_pmu_unregister_cpuhp_state(struct xe_pmu *pmu)
+{
+	cpuhp_state_remove_instance(cpuhp_slot, &pmu->cpuhp.node);
+}
+
+static void xe_pmu_unregister(struct drm_device *device, void *arg)
+{
+	struct xe_pmu *pmu = arg;
+
+	if (!pmu->base.event_init)
+		return;
+
+	/*
+	 * "Disconnect" the PMU callbacks - since all are atomic,
+	 * synchronize_rcu() ensures all currently executing ones will have
+	 * exited before we proceed with unregistration.
+	 */
+	pmu->closed = true;
+	synchronize_rcu();
+
+	xe_pmu_unregister_cpuhp_state(pmu);
+
+	perf_pmu_unregister(&pmu->base);
+	pmu->base.event_init = NULL;
+	kfree(pmu->base.attr_groups);
+	kfree(pmu->name);
+	free_event_attributes(pmu);
+}
+
+static void init_samples(struct xe_pmu *pmu)
+{
+	struct xe_device *xe = container_of(pmu, typeof(*xe), pmu);
+	struct xe_gt *gt;
+	unsigned int i;
+
+	for_each_gt(gt, xe, i)
+		engine_group_busyness_store(gt);
+}
+
+void xe_pmu_register(struct xe_pmu *pmu)
+{
+	struct xe_device *xe = container_of(pmu, typeof(*xe), pmu);
+	const struct attribute_group *attr_groups[] = {
+		&xe_pmu_format_attr_group,
+		&pmu->events_attr_group,
+		&xe_pmu_cpumask_attr_group,
+		NULL
+	};
+
+	int ret = -ENOMEM;
+
+	spin_lock_init(&pmu->lock);
+	pmu->cpuhp.cpu = -1;
+	init_samples(pmu);
+
+	pmu->name = kasprintf(GFP_KERNEL,
+			      "xe_%s",
+			      dev_name(xe->drm.dev));
+	if (pmu->name)
+		/* tools/perf reserves colons as special. */
+		strreplace((char *)pmu->name, ':', '_');
+
+	if (!pmu->name)
+		goto err;
+
+	pmu->events_attr_group.name = "events";
+	pmu->events_attr_group.attrs = create_event_attributes(pmu);
+	if (!pmu->events_attr_group.attrs)
+		goto err_name;
+
+	pmu->base.attr_groups = kmemdup(attr_groups, sizeof(attr_groups),
+					GFP_KERNEL);
+	if (!pmu->base.attr_groups)
+		goto err_attr;
+
+	pmu->base.module	= THIS_MODULE;
+	pmu->base.task_ctx_nr	= perf_invalid_context;
+	pmu->base.event_init	= xe_pmu_event_init;
+	pmu->base.add		= xe_pmu_event_add;
+	pmu->base.del		= xe_pmu_event_del;
+	pmu->base.start		= xe_pmu_event_start;
+	pmu->base.stop		= xe_pmu_event_stop;
+	pmu->base.read		= xe_pmu_event_read;
+	pmu->base.event_idx	= xe_pmu_event_event_idx;
+
+	ret = perf_pmu_register(&pmu->base, pmu->name, -1);
+	if (ret)
+		goto err_groups;
+
+	ret = xe_pmu_register_cpuhp_state(pmu);
+	if (ret)
+		goto err_unreg;
+
+	ret = drmm_add_action_or_reset(&xe->drm, xe_pmu_unregister, pmu);
+	XE_WARN_ON(ret);
+
+	return;
+
+err_unreg:
+	perf_pmu_unregister(&pmu->base);
+err_groups:
+	kfree(pmu->base.attr_groups);
+err_attr:
+	pmu->base.event_init = NULL;
+	free_event_attributes(pmu);
+err_name:
+	kfree(pmu->name);
+err:
+	drm_notice(&xe->drm, "Failed to register PMU!\n");
+}
diff --git a/drivers/gpu/drm/xe/xe_pmu.h b/drivers/gpu/drm/xe/xe_pmu.h
new file mode 100644
index 000000000000..d3f47f4ab343
--- /dev/null
+++ b/drivers/gpu/drm/xe/xe_pmu.h
@@ -0,0 +1,25 @@
+/* SPDX-License-Identifier: MIT */
+/*
+ * Copyright © 2023 Intel Corporation
+ */
+
+#ifndef _XE_PMU_H_
+#define _XE_PMU_H_
+
+#include "xe_gt_types.h"
+#include "xe_pmu_types.h"
+
+#ifdef CONFIG_PERF_EVENTS
+int xe_pmu_init(void);
+void xe_pmu_exit(void);
+void xe_pmu_register(struct xe_pmu *pmu);
+void engine_group_busyness_store(struct xe_gt *gt);
+#else
+static inline int xe_pmu_init(void) { return 0; }
+static inline void xe_pmu_exit(void) {}
+static inline void xe_pmu_register(struct xe_pmu *pmu) {}
+static inline void engine_group_busyness_store(struct xe_gt *gt) {}
+#endif
+
+#endif
+
diff --git a/drivers/gpu/drm/xe/xe_pmu_types.h b/drivers/gpu/drm/xe/xe_pmu_types.h
new file mode 100644
index 000000000000..e87edd4d6a87
--- /dev/null
+++ b/drivers/gpu/drm/xe/xe_pmu_types.h
@@ -0,0 +1,80 @@
+/* SPDX-License-Identifier: MIT */
+/*
+ * Copyright © 2023 Intel Corporation
+ */
+
+#ifndef _XE_PMU_TYPES_H_
+#define _XE_PMU_TYPES_H_
+
+#include <linux/perf_event.h>
+#include <linux/spinlock_types.h>
+#include <uapi/drm/xe_drm.h>
+
+enum {
+	__XE_SAMPLE_RENDER_GROUP_BUSY,
+	__XE_SAMPLE_COPY_GROUP_BUSY,
+	__XE_SAMPLE_MEDIA_GROUP_BUSY,
+	__XE_SAMPLE_ANY_ENGINE_GROUP_BUSY,
+	__XE_NUM_PMU_SAMPLERS
+};
+
+struct xe_pmu_sample {
+	u64 cur;
+};
+
+#define XE_MAX_GT_PER_TILE 2
+
+struct xe_pmu {
+	/**
+	 * @cpuhp: Struct used for CPU hotplug handling.
+	 */
+	struct {
+		struct hlist_node node;
+		unsigned int cpu;
+	} cpuhp;
+	/**
+	 * @base: PMU base.
+	 */
+	struct pmu base;
+	/**
+	 * @closed: xe is unregistering.
+	 */
+	bool closed;
+	/**
+	 * @name: Name as registered with perf core.
+	 */
+	const char *name;
+	/**
+	 * @lock: Lock protecting enable mask and ref count handling.
+	 */
+	spinlock_t lock;
+	/**
+	 * @sample: Current and previous (raw) counters.
+	 *
+	 * These counters are updated when the device is awake.
+	 *
+	 */
+	struct xe_pmu_sample sample[XE_MAX_GT_PER_TILE * __XE_NUM_PMU_SAMPLERS];
+	/**
+	 * @irq_count: Number of interrupts
+	 *
+	 * Intentionally unsigned long to avoid atomics or heuristics on 32-bit.
+	 * 4e9 interrupts are a lot, and postprocessing can easily deal with an
+	 * occasional wraparound. It's 32-bit after all.
+	 */
+	unsigned long irq_count;
+	/**
+	 * @events_attr_group: Device events attribute group.
+	 */
+	struct attribute_group events_attr_group;
+	/**
+	 * @xe_attr: Memory block holding device attributes.
+	 */
+	void *xe_attr;
+	/**
+	 * @pmu_attr: Memory block holding device attributes.
+	 */
+	void *pmu_attr;
+};
+
+#endif
diff --git a/include/uapi/drm/xe_drm.h b/include/uapi/drm/xe_drm.h
index 965cd9527ff1..ed097056f944 100644
--- a/include/uapi/drm/xe_drm.h
+++ b/include/uapi/drm/xe_drm.h
@@ -990,6 +990,22 @@ struct drm_xe_vm_madvise {
 	__u64 reserved[2];
 };
 
+/* PMU event config IDs */
+
+/*
+ * Top 4 bits of every counter are GT id.
+ */
+#define __XE_PMU_GT_SHIFT (60)
+
+#define ___XE_PMU_OTHER(gt, x) \
+	(((__u64)(x)) | ((__u64)(gt) << __XE_PMU_GT_SHIFT))
+
+#define XE_PMU_INTERRUPTS(gt)			___XE_PMU_OTHER(gt, 0)
+#define XE_PMU_RENDER_GROUP_BUSY(gt)		___XE_PMU_OTHER(gt, 1)
+#define XE_PMU_COPY_GROUP_BUSY(gt)		___XE_PMU_OTHER(gt, 2)
+#define XE_PMU_MEDIA_GROUP_BUSY(gt)		___XE_PMU_OTHER(gt, 3)
+#define XE_PMU_ANY_ENGINE_GROUP_BUSY(gt)	___XE_PMU_OTHER(gt, 4)
+
 #if defined(__cplusplus)
 }
 #endif
-- 
2.25.1



* [Intel-xe] ✓ CI.Patch_applied: success for drm/xe/pmu: Enable PMU interface (rev2)
  2023-06-27 12:21 [Intel-xe] [PATCH v2 0/2] drm/xe/pmu: Enable PMU interface Aravind Iddamsetty
  2023-06-27 12:21 ` [Intel-xe] [PATCH v2 1/2] drm/xe: Get GT clock to nanosecs Aravind Iddamsetty
  2023-06-27 12:21 ` [Intel-xe] [PATCH v2 2/2] drm/xe/pmu: Enable PMU interface Aravind Iddamsetty
@ 2023-06-27 13:04 ` Patchwork
  2023-06-27 13:05 ` [Intel-xe] ✗ CI.checkpatch: warning " Patchwork
                   ` (3 subsequent siblings)
  6 siblings, 0 replies; 59+ messages in thread
From: Patchwork @ 2023-06-27 13:04 UTC (permalink / raw)
  To: Aravind Iddamsetty; +Cc: intel-xe

== Series Details ==

Series: drm/xe/pmu: Enable PMU interface (rev2)
URL   : https://patchwork.freedesktop.org/series/119504/
State : success

== Summary ==

=== Applying kernel patches on branch 'drm-xe-next' with base: ===
Base commit: f135580b5 drm/xe: Fix vm refcount races
=== git am output follows ===
.git/rebase-apply/patch:940: new blank line at EOF.
+
warning: 1 line adds whitespace errors.
Applying: drm/xe: Get GT clock to nanosecs
Applying: drm/xe/pmu: Enable PMU interface




* [Intel-xe] ✗ CI.checkpatch: warning for drm/xe/pmu: Enable PMU interface (rev2)
  2023-06-27 12:21 [Intel-xe] [PATCH v2 0/2] drm/xe/pmu: Enable PMU interface Aravind Iddamsetty
                   ` (2 preceding siblings ...)
  2023-06-27 13:04 ` [Intel-xe] ✓ CI.Patch_applied: success for drm/xe/pmu: Enable PMU interface (rev2) Patchwork
@ 2023-06-27 13:05 ` Patchwork
  2023-06-27 13:06 ` [Intel-xe] ✓ CI.KUnit: success " Patchwork
                   ` (2 subsequent siblings)
  6 siblings, 0 replies; 59+ messages in thread
From: Patchwork @ 2023-06-27 13:05 UTC (permalink / raw)
  To: Aravind Iddamsetty; +Cc: intel-xe

== Series Details ==

Series: drm/xe/pmu: Enable PMU interface (rev2)
URL   : https://patchwork.freedesktop.org/series/119504/
State : warning

== Summary ==

+ KERNEL=/kernel
+ git clone https://gitlab.freedesktop.org/drm/maintainer-tools mt
Cloning into 'mt'...
warning: redirecting to https://gitlab.freedesktop.org/drm/maintainer-tools.git/
+ git -C mt rev-list -n1 origin/master
c7d32770e3cd31d9fc134ce41f329b10aa33ee15
+ cd /kernel
+ git config --global --add safe.directory /kernel
+ git log -n1
commit 599be1fe6c1e4819af3f217d845c675722ea0b9f
Author: Aravind Iddamsetty <aravind.iddamsetty@intel.com>
Date:   Tue Jun 27 17:51:13 2023 +0530

    drm/xe/pmu: Enable PMU interface
    
    There is a set of engine group busyness counters provided by HW which
    are a perfect fit to be exposed via PMU perf events.
    
    BSPEC: 46559, 46560, 46722, 46729
    
    Events can be listed using:
    perf list
      xe_0000_03_00.0/any-engine-group-busy-gt0/         [Kernel PMU event]
      xe_0000_03_00.0/copy-group-busy-gt0/               [Kernel PMU event]
      xe_0000_03_00.0/interrupts/                        [Kernel PMU event]
      xe_0000_03_00.0/media-group-busy-gt0/              [Kernel PMU event]
      xe_0000_03_00.0/render-group-busy-gt0/             [Kernel PMU event]
    
    and can be read using:
    
    perf stat -e "xe_0000_8c_00.0/render-group-busy-gt0/" -I 1000
               time             counts unit events
         1.001139062                  0 ns  xe_0000_8c_00.0/render-group-busy-gt0/
         2.003294678                  0 ns  xe_0000_8c_00.0/render-group-busy-gt0/
         3.005199582                  0 ns  xe_0000_8c_00.0/render-group-busy-gt0/
         4.007076497                  0 ns  xe_0000_8c_00.0/render-group-busy-gt0/
         5.008553068                  0 ns  xe_0000_8c_00.0/render-group-busy-gt0/
         6.010531563              43520 ns  xe_0000_8c_00.0/render-group-busy-gt0/
         7.012468029              44800 ns  xe_0000_8c_00.0/render-group-busy-gt0/
         8.013463515                  0 ns  xe_0000_8c_00.0/render-group-busy-gt0/
         9.015300183                  0 ns  xe_0000_8c_00.0/render-group-busy-gt0/
        10.017233010                  0 ns  xe_0000_8c_00.0/render-group-busy-gt0/
        10.971934120                  0 ns  xe_0000_8c_00.0/render-group-busy-gt0/
    
    The PMU base implementation is taken from i915.
    
    v2:
    Store the last known value when the device is awake, return that value
    while the GT is suspended, and update the driver copy when read while
    awake.
    
    Co-developed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
    Co-developed-by: Bommu Krishnaiah <krishnaiah.bommu@intel.com>
    Signed-off-by: Aravind Iddamsetty <aravind.iddamsetty@intel.com>
+ /mt/dim checkpatch f135580b567158b23349db5544259730dfb91e78 drm-intel
da50f9f65 drm/xe: Get GT clock to nanosecs
599be1fe6 drm/xe/pmu: Enable PMU interface
Traceback (most recent call last):
  File "scripts/spdxcheck.py", line 6, in <module>
    from ply import lex, yacc
ModuleNotFoundError: No module named 'ply'
Traceback (most recent call last):
  File "scripts/spdxcheck.py", line 6, in <module>
    from ply import lex, yacc
ModuleNotFoundError: No module named 'ply'
-:23: WARNING:COMMIT_LOG_LONG_LINE: Possible unwrapped commit description (prefer a maximum 75 chars per line)
#23: 
     1.001139062                  0 ns  xe_0000_8c_00.0/render-group-busy-gt0/

-:41: WARNING:BAD_SIGN_OFF: Co-developed-by: must be immediately followed by Signed-off-by:
#41: 
Co-developed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Co-developed-by: Bommu Krishnaiah <krishnaiah.bommu@intel.com>

-:42: WARNING:BAD_SIGN_OFF: Co-developed-by and Signed-off-by: name/email do not match
#42: 
Co-developed-by: Bommu Krishnaiah <krishnaiah.bommu@intel.com>
Signed-off-by: Aravind Iddamsetty <aravind.iddamsetty@intel.com>

-:193: WARNING:FILE_PATH_CHANGES: added, moved or deleted file(s), does MAINTAINERS need updating?
#193: 
new file mode 100644

-:198: WARNING:SPDX_LICENSE_TAG: Missing or malformed SPDX-License-Identifier tag in line 1
#198: FILE: drivers/gpu/drm/xe/xe_pmu.c:1:
+/*

-:199: WARNING:SPDX_LICENSE_TAG: Misplaced SPDX-License-Identifier tag - use line 1 instead
#199: FILE: drivers/gpu/drm/xe/xe_pmu.c:2:
+ * SPDX-License-Identifier: MIT

-:231: WARNING:AVOID_BUG: Do not crash the kernel unless it is absolutely unavoidable--use WARN_ON_ONCE() plus recovery code (if feasible) instead of BUG() or variants
#231: FILE: drivers/gpu/drm/xe/xe_pmu.c:34:
+	XE_BUG_ON(idx >= ARRAY_SIZE(pmu->sample));

-:753: WARNING:AVOID_BUG: Do not crash the kernel unless it is absolutely unavoidable--use WARN_ON_ONCE() plus recovery code (if feasible) instead of BUG() or variants
#753: FILE: drivers/gpu/drm/xe/xe_pmu.c:556:
+	XE_BUG_ON(!pmu->base.event_init);

-:767: WARNING:AVOID_BUG: Do not crash the kernel unless it is absolutely unavoidable--use WARN_ON_ONCE() plus recovery code (if feasible) instead of BUG() or variants
#767: FILE: drivers/gpu/drm/xe/xe_pmu.c:570:
+	XE_BUG_ON(!pmu->base.event_init);

total: 0 errors, 9 warnings, 0 checks, 974 lines checked




* [Intel-xe] ✓ CI.KUnit: success for drm/xe/pmu: Enable PMU interface (rev2)
  2023-06-27 12:21 [Intel-xe] [PATCH v2 0/2] drm/xe/pmu: Enable PMU interface Aravind Iddamsetty
                   ` (3 preceding siblings ...)
  2023-06-27 13:05 ` [Intel-xe] ✗ CI.checkpatch: warning " Patchwork
@ 2023-06-27 13:06 ` Patchwork
  2023-06-27 13:10 ` [Intel-xe] ✓ CI.Build: " Patchwork
  2023-06-27 13:10 ` [Intel-xe] ✗ CI.Hooks: failure " Patchwork
  6 siblings, 0 replies; 59+ messages in thread
From: Patchwork @ 2023-06-27 13:06 UTC (permalink / raw)
  To: Aravind Iddamsetty; +Cc: intel-xe

== Series Details ==

Series: drm/xe/pmu: Enable PMU interface (rev2)
URL   : https://patchwork.freedesktop.org/series/119504/
State : success

== Summary ==

+ trap cleanup EXIT
+ /kernel/tools/testing/kunit/kunit.py run --kunitconfig /kernel/drivers/gpu/drm/xe/.kunitconfig
stty: 'standard input': Inappropriate ioctl for device
[13:05:01] Configuring KUnit Kernel ...
Generating .config ...
Populating config with:
$ make ARCH=um O=.kunit olddefconfig
[13:05:05] Building KUnit Kernel ...
Populating config with:
$ make ARCH=um O=.kunit olddefconfig
Building with:
$ make ARCH=um O=.kunit --jobs=48
[13:05:27] Starting KUnit Kernel (1/1)...
[13:05:27] ============================================================
[13:05:27] ==================== xe_bo (2 subtests) ====================
[13:05:27] [SKIPPED] xe_ccs_migrate_kunit
[13:05:27] [SKIPPED] xe_bo_evict_kunit
[13:05:27] ===================== [SKIPPED] xe_bo ======================
[13:05:27] ================== xe_dma_buf (1 subtest) ==================
[13:05:27] [SKIPPED] xe_dma_buf_kunit
[13:05:27] =================== [SKIPPED] xe_dma_buf ===================
[13:05:27] ================== xe_migrate (1 subtest) ==================
[13:05:27] [SKIPPED] xe_migrate_sanity_kunit
[13:05:27] =================== [SKIPPED] xe_migrate ===================
[13:05:27] =================== xe_pci (2 subtests) ====================
[13:05:27] [PASSED] xe_gmdid_graphics_ip
[13:05:27] [PASSED] xe_gmdid_media_ip
[13:05:27] ===================== [PASSED] xe_pci ======================
[13:05:27] ==================== xe_rtp (1 subtest) ====================
[13:05:27] ================== xe_rtp_process_tests  ===================
[13:05:27] [PASSED] coalesce-same-reg
[13:05:27] [PASSED] no-match-no-add
[13:05:27] [PASSED] no-match-no-add-multiple-rules
[13:05:27] [PASSED] two-regs-two-entries
[13:05:27] [PASSED] clr-one-set-other
[13:05:27] [PASSED] set-field
[13:05:27] [PASSED] conflict-duplicate
[13:05:27] [PASSED] conflict-not-disjoint
[13:05:27] [PASSED] conflict-reg-type
[13:05:27] ============== [PASSED] xe_rtp_process_tests ===============
[13:05:27] ===================== [PASSED] xe_rtp ======================
[13:05:27] ==================== xe_wa (1 subtest) =====================
[13:05:27] ======================== xe_wa_gt  =========================
[13:05:27] [PASSED] TIGERLAKE (B0)
[13:05:27] [PASSED] DG1 (A0)
[13:05:27] [PASSED] DG1 (B0)
[13:05:27] [PASSED] ALDERLAKE_S (A0)
[13:05:27] [PASSED] ALDERLAKE_S (B0)
[13:05:27] [PASSED] ALDERLAKE_S (C0)
[13:05:27] [PASSED] ALDERLAKE_S (D0)
[13:05:27] [PASSED] DG2_G10 (A0)
[13:05:27] [PASSED] DG2_G10 (A1)
[13:05:27] [PASSED] DG2_G10 (B0)
[13:05:27] [PASSED] DG2_G10 (C0)
[13:05:27] [PASSED] DG2_G11 (A0)
[13:05:27] [PASSED] DG2_G11 (B0)
[13:05:27] [PASSED] DG2_G11 (B1)
[13:05:27] [PASSED] DG2_G12 (A0)
[13:05:27] [PASSED] DG2_G12 (A1)
[13:05:27] [PASSED] PVC (B0)
[13:05:27] [PASSED] PVC (B1)
[13:05:27] [PASSED] PVC (C0)
[13:05:27] ==================== [PASSED] xe_wa_gt =====================
[13:05:27] ====================== [PASSED] xe_wa ======================
[13:05:27] ============================================================
[13:05:27] Testing complete. Ran 34 tests: passed: 30, skipped: 4
[13:05:27] Elapsed time: 26.537s total, 4.194s configuring, 22.174s building, 0.146s running

+ /kernel/tools/testing/kunit/kunit.py run --kunitconfig /kernel/drivers/gpu/drm/tests/.kunitconfig
[13:05:27] Configuring KUnit Kernel ...
Regenerating .config ...
Populating config with:
$ make ARCH=um O=.kunit olddefconfig
[13:05:29] Building KUnit Kernel ...
Populating config with:
$ make ARCH=um O=.kunit olddefconfig
Building with:
$ make ARCH=um O=.kunit --jobs=48
[13:05:47] Starting KUnit Kernel (1/1)...
[13:05:47] ============================================================
[13:05:47] ============ drm_test_pick_cmdline (2 subtests) ============
[13:05:47] [PASSED] drm_test_pick_cmdline_res_1920_1080_60
[13:05:47] =============== drm_test_pick_cmdline_named  ===============
[13:05:47] [PASSED] NTSC
[13:05:47] [PASSED] NTSC-J
[13:05:47] [PASSED] PAL
[13:05:47] [PASSED] PAL-M
[13:05:47] =========== [PASSED] drm_test_pick_cmdline_named ===========
[13:05:47] ============== [PASSED] drm_test_pick_cmdline ==============
[13:05:47] ================== drm_buddy (6 subtests) ==================
[13:05:47] [PASSED] drm_test_buddy_alloc_limit
[13:05:47] [PASSED] drm_test_buddy_alloc_range
[13:05:47] [PASSED] drm_test_buddy_alloc_optimistic
[13:05:47] [PASSED] drm_test_buddy_alloc_pessimistic
[13:05:47] [PASSED] drm_test_buddy_alloc_smoke
[13:05:47] [PASSED] drm_test_buddy_alloc_pathological
[13:05:47] ==================== [PASSED] drm_buddy ====================
[13:05:47] ============= drm_cmdline_parser (40 subtests) =============
[13:05:47] [PASSED] drm_test_cmdline_force_d_only
[13:05:47] [PASSED] drm_test_cmdline_force_D_only_dvi
[13:05:47] [PASSED] drm_test_cmdline_force_D_only_hdmi
[13:05:47] [PASSED] drm_test_cmdline_force_D_only_not_digital
[13:05:47] [PASSED] drm_test_cmdline_force_e_only
[13:05:47] [PASSED] drm_test_cmdline_res
[13:05:47] [PASSED] drm_test_cmdline_res_vesa
[13:05:47] [PASSED] drm_test_cmdline_res_vesa_rblank
[13:05:47] [PASSED] drm_test_cmdline_res_rblank
[13:05:47] [PASSED] drm_test_cmdline_res_bpp
[13:05:47] [PASSED] drm_test_cmdline_res_refresh
[13:05:47] [PASSED] drm_test_cmdline_res_bpp_refresh
[13:05:47] [PASSED] drm_test_cmdline_res_bpp_refresh_interlaced
[13:05:47] [PASSED] drm_test_cmdline_res_bpp_refresh_margins
[13:05:47] [PASSED] drm_test_cmdline_res_bpp_refresh_force_off
[13:05:47] [PASSED] drm_test_cmdline_res_bpp_refresh_force_on
[13:05:47] [PASSED] drm_test_cmdline_res_bpp_refresh_force_on_analog
[13:05:47] [PASSED] drm_test_cmdline_res_bpp_refresh_force_on_digital
[13:05:47] [PASSED] drm_test_cmdline_res_bpp_refresh_interlaced_margins_force_on
[13:05:47] [PASSED] drm_test_cmdline_res_margins_force_on
[13:05:47] [PASSED] drm_test_cmdline_res_vesa_margins
[13:05:47] [PASSED] drm_test_cmdline_name
[13:05:47] [PASSED] drm_test_cmdline_name_bpp
[13:05:47] [PASSED] drm_test_cmdline_name_option
[13:05:47] [PASSED] drm_test_cmdline_name_bpp_option
[13:05:47] [PASSED] drm_test_cmdline_rotate_0
[13:05:47] [PASSED] drm_test_cmdline_rotate_90
[13:05:47] [PASSED] drm_test_cmdline_rotate_180
[13:05:47] [PASSED] drm_test_cmdline_rotate_270
[13:05:47] [PASSED] drm_test_cmdline_hmirror
[13:05:47] [PASSED] drm_test_cmdline_vmirror
[13:05:47] [PASSED] drm_test_cmdline_margin_options
[13:05:47] [PASSED] drm_test_cmdline_multiple_options
[13:05:47] [PASSED] drm_test_cmdline_bpp_extra_and_option
[13:05:47] [PASSED] drm_test_cmdline_extra_and_option
[13:05:47] [PASSED] drm_test_cmdline_freestanding_options
[13:05:47] [PASSED] drm_test_cmdline_freestanding_force_e_and_options
[13:05:47] [PASSED] drm_test_cmdline_panel_orientation
[13:05:47] ================ drm_test_cmdline_invalid  =================
[13:05:47] [PASSED] margin_only
[13:05:47] [PASSED] interlace_only
[13:05:47] [PASSED] res_missing_x
[13:05:47] [PASSED] res_missing_y
[13:05:47] [PASSED] res_bad_y
[13:05:47] [PASSED] res_missing_y_bpp
[13:05:47] [PASSED] res_bad_bpp
[13:05:47] [PASSED] res_bad_refresh
[13:05:47] [PASSED] res_bpp_refresh_force_on_off
[13:05:47] [PASSED] res_invalid_mode
[13:05:47] [PASSED] res_bpp_wrong_place_mode
[13:05:47] [PASSED] name_bpp_refresh
[13:05:47] [PASSED] name_refresh
[13:05:47] [PASSED] name_refresh_wrong_mode
[13:05:47] [PASSED] name_refresh_invalid_mode
[13:05:47] [PASSED] rotate_multiple
[13:05:47] [PASSED] rotate_invalid_val
[13:05:47] [PASSED] rotate_truncated
[13:05:47] [PASSED] invalid_option
[13:05:47] [PASSED] invalid_tv_option
[13:05:47] [PASSED] truncated_tv_option
[13:05:47] ============ [PASSED] drm_test_cmdline_invalid =============
[13:05:47] =============== drm_test_cmdline_tv_options  ===============
[13:05:47] [PASSED] NTSC
[13:05:47] [PASSED] NTSC_443
[13:05:47] [PASSED] NTSC_J
[13:05:47] [PASSED] PAL
[13:05:47] [PASSED] PAL_M
[13:05:47] [PASSED] PAL_N
[13:05:47] [PASSED] SECAM
[13:05:47] =========== [PASSED] drm_test_cmdline_tv_options ===========
[13:05:47] =============== [PASSED] drm_cmdline_parser ================
[13:05:47] ========== drm_get_tv_mode_from_name (2 subtests) ==========
[13:05:47] ========== drm_test_get_tv_mode_from_name_valid  ===========
[13:05:47] [PASSED] NTSC
[13:05:47] [PASSED] NTSC-443
[13:05:47] [PASSED] NTSC-J
[13:05:47] [PASSED] PAL
[13:05:47] [PASSED] PAL-M
[13:05:47] [PASSED] PAL-N
[13:05:47] [PASSED] SECAM
[13:05:47] ====== [PASSED] drm_test_get_tv_mode_from_name_valid =======
[13:05:47] [PASSED] drm_test_get_tv_mode_from_name_truncated
[13:05:47] ============ [PASSED] drm_get_tv_mode_from_name ============
[13:05:47] ============= drm_damage_helper (21 subtests) ==============
[13:05:47] [PASSED] drm_test_damage_iter_no_damage
[13:05:47] [PASSED] drm_test_damage_iter_no_damage_fractional_src
[13:05:47] [PASSED] drm_test_damage_iter_no_damage_src_moved
[13:05:47] [PASSED] drm_test_damage_iter_no_damage_fractional_src_moved
[13:05:47] [PASSED] drm_test_damage_iter_no_damage_not_visible
[13:05:47] [PASSED] drm_test_damage_iter_no_damage_no_crtc
[13:05:47] [PASSED] drm_test_damage_iter_no_damage_no_fb
[13:05:47] [PASSED] drm_test_damage_iter_simple_damage
[13:05:47] [PASSED] drm_test_damage_iter_single_damage
[13:05:47] [PASSED] drm_test_damage_iter_single_damage_intersect_src
[13:05:47] [PASSED] drm_test_damage_iter_single_damage_outside_src
[13:05:47] [PASSED] drm_test_damage_iter_single_damage_fractional_src
[13:05:47] [PASSED] drm_test_damage_iter_single_damage_intersect_fractional_src
[13:05:47] [PASSED] drm_test_damage_iter_single_damage_outside_fractional_src
[13:05:47] [PASSED] drm_test_damage_iter_single_damage_src_moved
[13:05:47] [PASSED] drm_test_damage_iter_single_damage_fractional_src_moved
[13:05:47] [PASSED] drm_test_damage_iter_damage
[13:05:47] [PASSED] drm_test_damage_iter_damage_one_intersect
[13:05:47] [PASSED] drm_test_damage_iter_damage_one_outside
[13:05:47] [PASSED] drm_test_damage_iter_damage_src_moved
[13:05:47] [PASSED] drm_test_damage_iter_damage_not_visible
[13:05:47] ================ [PASSED] drm_damage_helper ================
[13:05:47] ============== drm_dp_mst_helper (2 subtests) ==============
[13:05:47] ============== drm_test_dp_mst_calc_pbn_mode  ==============
[13:05:47] [PASSED] Clock 154000 BPP 30 DSC disabled
[13:05:47] [PASSED] Clock 234000 BPP 30 DSC disabled
[13:05:47] [PASSED] Clock 297000 BPP 24 DSC disabled
[13:05:47] [PASSED] Clock 332880 BPP 24 DSC enabled
[13:05:47] [PASSED] Clock 324540 BPP 24 DSC enabled
[13:05:47] ========== [PASSED] drm_test_dp_mst_calc_pbn_mode ==========
[13:05:47] ========= drm_test_dp_mst_sideband_msg_req_decode  =========
[13:05:47] [PASSED] DP_ENUM_PATH_RESOURCES with port number
[13:05:47] [PASSED] DP_POWER_UP_PHY with port number
[13:05:47] [PASSED] DP_POWER_DOWN_PHY with port number
[13:05:47] [PASSED] DP_ALLOCATE_PAYLOAD with SDP stream sinks
[13:05:47] [PASSED] DP_ALLOCATE_PAYLOAD with port number
[13:05:47] [PASSED] DP_ALLOCATE_PAYLOAD with VCPI
[13:05:47] [PASSED] DP_ALLOCATE_PAYLOAD with PBN
[13:05:47] [PASSED] DP_QUERY_PAYLOAD with port number
[13:05:47] [PASSED] DP_QUERY_PAYLOAD with VCPI
[13:05:47] [PASSED] DP_REMOTE_DPCD_READ with port number
[13:05:47] [PASSED] DP_REMOTE_DPCD_READ with DPCD address
[13:05:47] [PASSED] DP_REMOTE_DPCD_READ with max number of bytes
[13:05:47] [PASSED] DP_REMOTE_DPCD_WRITE with port number
[13:05:47] [PASSED] DP_REMOTE_DPCD_WRITE with DPCD address
[13:05:47] [PASSED] DP_REMOTE_DPCD_WRITE with data array
[13:05:47] [PASSED] DP_REMOTE_I2C_READ with port number
[13:05:47] [PASSED] DP_REMOTE_I2C_READ with I2C device ID
[13:05:47] [PASSED] DP_REMOTE_I2C_READ with transactions array
[13:05:47] [PASSED] DP_REMOTE_I2C_WRITE with port number
[13:05:47] [PASSED] DP_REMOTE_I2C_WRITE with I2C device ID
[13:05:47] [PASSED] DP_REMOTE_I2C_WRITE with data array
[13:05:47] [PASSED] DP_QUERY_STREAM_ENC_STATUS with stream ID
[13:05:47] [PASSED] DP_QUERY_STREAM_ENC_STATUS with client ID
[13:05:47] [PASSED] DP_QUERY_STREAM_ENC_STATUS with stream event
[13:05:47] [PASSED] DP_QUERY_STREAM_ENC_STATUS with valid stream event
[13:05:47] [PASSED] DP_QUERY_STREAM_ENC_STATUS with stream behavior
[13:05:47] [PASSED] DP_QUERY_STREAM_ENC_STATUS with a valid stream behavior
[13:05:47] ===== [PASSED] drm_test_dp_mst_sideband_msg_req_decode =====
[13:05:47] ================ [PASSED] drm_dp_mst_helper ================
[13:05:47] =========== drm_format_helper_test (11 subtests) ===========
[13:05:47] ============== drm_test_fb_xrgb8888_to_gray8  ==============
[13:05:47] [PASSED] single_pixel_source_buffer
[13:05:47] [PASSED] single_pixel_clip_rectangle
[13:05:47] [PASSED] well_known_colors
[13:05:47] [PASSED] destination_pitch
[13:05:47] ========== [PASSED] drm_test_fb_xrgb8888_to_gray8 ==========
[13:05:47] ============= drm_test_fb_xrgb8888_to_rgb332  ==============
[13:05:47] [PASSED] single_pixel_source_buffer
[13:05:47] [PASSED] single_pixel_clip_rectangle
[13:05:47] [PASSED] well_known_colors
[13:05:47] [PASSED] destination_pitch
[13:05:47] ========= [PASSED] drm_test_fb_xrgb8888_to_rgb332 ==========
[13:05:47] ============= drm_test_fb_xrgb8888_to_rgb565  ==============
[13:05:47] [PASSED] single_pixel_source_buffer
[13:05:47] [PASSED] single_pixel_clip_rectangle
[13:05:47] [PASSED] well_known_colors
[13:05:47] [PASSED] destination_pitch
[13:05:47] ========= [PASSED] drm_test_fb_xrgb8888_to_rgb565 ==========
[13:05:47] ============ drm_test_fb_xrgb8888_to_xrgb1555  =============
[13:05:47] [PASSED] single_pixel_source_buffer
[13:05:47] [PASSED] single_pixel_clip_rectangle
[13:05:47] [PASSED] well_known_colors
[13:05:47] [PASSED] destination_pitch
[13:05:47] ======== [PASSED] drm_test_fb_xrgb8888_to_xrgb1555 =========
[13:05:47] ============ drm_test_fb_xrgb8888_to_argb1555  =============
[13:05:47] [PASSED] single_pixel_source_buffer
[13:05:47] [PASSED] single_pixel_clip_rectangle
[13:05:47] [PASSED] well_known_colors
[13:05:47] [PASSED] destination_pitch
[13:05:47] ======== [PASSED] drm_test_fb_xrgb8888_to_argb1555 =========
[13:05:47] ============ drm_test_fb_xrgb8888_to_rgba5551  =============
[13:05:47] [PASSED] single_pixel_source_buffer
[13:05:47] [PASSED] single_pixel_clip_rectangle
[13:05:47] [PASSED] well_known_colors
[13:05:47] [PASSED] destination_pitch
[13:05:47] ======== [PASSED] drm_test_fb_xrgb8888_to_rgba5551 =========
[13:05:47] ============= drm_test_fb_xrgb8888_to_rgb888  ==============
[13:05:47] [PASSED] single_pixel_source_buffer
[13:05:47] [PASSED] single_pixel_clip_rectangle
[13:05:47] [PASSED] well_known_colors
[13:05:47] [PASSED] destination_pitch
[13:05:47] ========= [PASSED] drm_test_fb_xrgb8888_to_rgb888 ==========
[13:05:47] ============ drm_test_fb_xrgb8888_to_argb8888  =============
[13:05:47] [PASSED] single_pixel_source_buffer
[13:05:47] [PASSED] single_pixel_clip_rectangle
[13:05:47] [PASSED] well_known_colors
[13:05:47] [PASSED] destination_pitch
[13:05:47] ======== [PASSED] drm_test_fb_xrgb8888_to_argb8888 =========
[13:05:47] =========== drm_test_fb_xrgb8888_to_xrgb2101010  ===========
[13:05:47] [PASSED] single_pixel_source_buffer
[13:05:47] [PASSED] single_pixel_clip_rectangle
[13:05:47] [PASSED] well_known_colors
[13:05:47] [PASSED] destination_pitch
[13:05:47] ======= [PASSED] drm_test_fb_xrgb8888_to_xrgb2101010 =======
[13:05:47] =========== drm_test_fb_xrgb8888_to_argb2101010  ===========
[13:05:47] [PASSED] single_pixel_source_buffer
[13:05:47] [PASSED] single_pixel_clip_rectangle
[13:05:47] [PASSED] well_known_colors
[13:05:47] [PASSED] destination_pitch
[13:05:47] ======= [PASSED] drm_test_fb_xrgb8888_to_argb2101010 =======
[13:05:47] ============== drm_test_fb_xrgb8888_to_mono  ===============
[13:05:47] [PASSED] single_pixel_source_buffer
[13:05:47] [PASSED] single_pixel_clip_rectangle
[13:05:47] [PASSED] well_known_colors
[13:05:47] [PASSED] destination_pitch
[13:05:47] ========== [PASSED] drm_test_fb_xrgb8888_to_mono ===========
[13:05:47] ============= [PASSED] drm_format_helper_test ==============
[13:05:47] ================= drm_format (18 subtests) =================
[13:05:47] [PASSED] drm_test_format_block_width_invalid
[13:05:47] [PASSED] drm_test_format_block_width_one_plane
[13:05:47] [PASSED] drm_test_format_block_width_two_plane
[13:05:47] [PASSED] drm_test_format_block_width_three_plane
[13:05:47] [PASSED] drm_test_format_block_width_tiled
[13:05:47] [PASSED] drm_test_format_block_height_invalid
[13:05:47] [PASSED] drm_test_format_block_height_one_plane
[13:05:47] [PASSED] drm_test_format_block_height_two_plane
[13:05:47] [PASSED] drm_test_format_block_height_three_plane
[13:05:47] [PASSED] drm_test_format_block_height_tiled
[13:05:47] [PASSED] drm_test_format_min_pitch_invalid
[13:05:47] [PASSED] drm_test_format_min_pitch_one_plane_8bpp
[13:05:47] [PASSED] drm_test_format_min_pitch_one_plane_16bpp
[13:05:47] [PASSED] drm_test_format_min_pitch_one_plane_24bpp
[13:05:47] [PASSED] drm_test_format_min_pitch_one_plane_32bpp
[13:05:47] [PASSED] drm_test_format_min_pitch_two_plane
[13:05:47] [PASSED] drm_test_format_min_pitch_three_plane_8bpp
[13:05:47] [PASSED] drm_test_format_min_pitch_tiled
[13:05:47] =================== [PASSED] drm_format ====================
[13:05:47] =============== drm_framebuffer (1 subtest) ================
[13:05:47] =============== drm_test_framebuffer_create  ===============
[13:05:47] [PASSED] ABGR8888 normal sizes
[13:05:47] [PASSED] ABGR8888 max sizes
[13:05:47] [PASSED] ABGR8888 pitch greater than min required
[13:05:47] [PASSED] ABGR8888 pitch less than min required
[13:05:47] [PASSED] ABGR8888 Invalid width
[13:05:47] [PASSED] ABGR8888 Invalid buffer handle
[13:05:47] [PASSED] No pixel format
[13:05:47] [PASSED] ABGR8888 Width 0
[13:05:47] [PASSED] ABGR8888 Height 0
[13:05:47] [PASSED] ABGR8888 Out of bound height * pitch combination
[13:05:47] [PASSED] ABGR8888 Large buffer offset
[13:05:47] [PASSED] ABGR8888 Set DRM_MODE_FB_MODIFIERS without modifiers
[13:05:47] [PASSED] ABGR8888 Valid buffer modifier
[13:05:47] [PASSED] ABGR8888 Invalid buffer modifier(DRM_FORMAT_MOD_SAMSUNG_64_32_TILE)
[13:05:47] [PASSED] ABGR8888 Extra pitches without DRM_MODE_FB_MODIFIERS
[13:05:47] [PASSED] ABGR8888 Extra pitches with DRM_MODE_FB_MODIFIERS
[13:05:47] [PASSED] NV12 Normal sizes
[13:05:47] [PASSED] NV12 Max sizes
[13:05:47] [PASSED] NV12 Invalid pitch
[13:05:47] [PASSED] NV12 Invalid modifier/missing DRM_MODE_FB_MODIFIERS flag
[13:05:47] [PASSED] NV12 different modifier per-plane
[13:05:47] [PASSED] NV12 with DRM_FORMAT_MOD_SAMSUNG_64_32_TILE
[13:05:47] [PASSED] NV12 Valid modifiers without DRM_MODE_FB_MODIFIERS
[13:05:47] [PASSED] NV12 Modifier for inexistent plane
[13:05:47] [PASSED] NV12 Handle for inexistent plane
[13:05:47] [PASSED] NV12 Handle for inexistent plane without DRM_MODE_FB_MODIFIERS
[13:05:47] [PASSED] YVU420 Normal sizes
[13:05:47] [PASSED] YVU420 DRM_MODE_FB_MODIFIERS set without modifier
[13:05:47] [PASSED] YVU420 Max sizes
[13:05:47] [PASSED] YVU420 Invalid pitch
[13:05:47] [PASSED] YVU420 Different pitches
[13:05:47] [PASSED] YVU420 Different buffer offsets/pitches
[13:05:47] [PASSED] YVU420 Modifier set just for plane 0, without DRM_MODE_FB_MODIFIERS
[13:05:47] [PASSED] YVU420 Modifier set just for planes 0, 1, without DRM_MODE_FB_MODIFIERS
[13:05:47] [PASSED] YVU420 Modifier set just for plane 0, 1, with DRM_MODE_FB_MODIFIERS
[13:05:47] [PASSED] YVU420 Valid modifier
[13:05:47] [PASSED] YVU420 Different modifiers per plane
[13:05:47] [PASSED] YVU420 Modifier for inexistent plane
[13:05:47] [PASSED] X0L2 Normal sizes
[13:05:47] [PASSED] X0L2 Max sizes
[13:05:47] [PASSED] X0L2 Invalid pitch
[13:05:47] [PASSED] X0L2 Pitch greater than minimum required
[13:05:47] [PASSED] X0L2 Handle for inexistent plane
[13:05:47] [PASSED] X0L2 Offset for inexistent plane, without DRM_MODE_FB_MODIFIERS set
[13:05:47] [PASSED] X0L2 Modifier without DRM_MODE_FB_MODIFIERS set
[13:05:47] [PASSED] X0L2 Valid modifier
[13:05:47] [PASSED] X0L2 Modifier for inexistent plane
[13:05:47] =========== [PASSED] drm_test_framebuffer_create ===========
[13:05:47] ================= [PASSED] drm_framebuffer =================
[13:05:47] =============== drm-test-managed (1 subtest) ===============
[13:05:47] [PASSED] drm_test_managed_run_action
[13:05:47] ================ [PASSED] drm-test-managed =================
[13:05:47] =================== drm_mm (19 subtests) ===================
[13:05:47] [PASSED] drm_test_mm_init
[13:05:48] [PASSED] drm_test_mm_debug
[13:05:58] [PASSED] drm_test_mm_reserve
[13:06:08] [PASSED] drm_test_mm_insert
[13:06:09] [PASSED] drm_test_mm_replace
[13:06:09] [PASSED] drm_test_mm_insert_range
[13:06:09] [PASSED] drm_test_mm_frag
[13:06:09] [PASSED] drm_test_mm_align
[13:06:09] [PASSED] drm_test_mm_align32
[13:06:09] [PASSED] drm_test_mm_align64
[13:06:09] [PASSED] drm_test_mm_evict
[13:06:09] [PASSED] drm_test_mm_evict_range
[13:06:09] [PASSED] drm_test_mm_topdown
[13:06:09] [PASSED] drm_test_mm_bottomup
[13:06:09] [PASSED] drm_test_mm_lowest
[13:06:09] [PASSED] drm_test_mm_highest
[13:06:10] [PASSED] drm_test_mm_color
[13:06:11] [PASSED] drm_test_mm_color_evict
[13:06:11] [PASSED] drm_test_mm_color_evict_range
[13:06:11] ===================== [PASSED] drm_mm ======================
[13:06:11] ============= drm_modes_analog_tv (4 subtests) =============
[13:06:11] [PASSED] drm_test_modes_analog_tv_ntsc_480i
[13:06:11] [PASSED] drm_test_modes_analog_tv_ntsc_480i_inlined
[13:06:11] [PASSED] drm_test_modes_analog_tv_pal_576i
[13:06:11] [PASSED] drm_test_modes_analog_tv_pal_576i_inlined
[13:06:11] =============== [PASSED] drm_modes_analog_tv ===============
[13:06:11] ============== drm_plane_helper (2 subtests) ===============
[13:06:11] =============== drm_test_check_plane_state  ================
[13:06:11] [PASSED] clipping_simple
[13:06:11] [PASSED] clipping_rotate_reflect
[13:06:11] [PASSED] positioning_simple
[13:06:11] [PASSED] upscaling
[13:06:11] [PASSED] downscaling
[13:06:11] [PASSED] rounding1
[13:06:11] [PASSED] rounding2
[13:06:11] [PASSED] rounding3
[13:06:11] [PASSED] rounding4
[13:06:11] =========== [PASSED] drm_test_check_plane_state ============
[13:06:11] =========== drm_test_check_invalid_plane_state  ============
[13:06:11] [PASSED] positioning_invalid
[13:06:11] [PASSED] upscaling_invalid
[13:06:11] [PASSED] downscaling_invalid
[13:06:11] ======= [PASSED] drm_test_check_invalid_plane_state ========
[13:06:11] ================ [PASSED] drm_plane_helper =================
[13:06:11] ====== drm_connector_helper_tv_get_modes (1 subtest) =======
[13:06:11] ====== drm_test_connector_helper_tv_get_modes_check  =======
[13:06:11] [PASSED] None
[13:06:11] [PASSED] PAL
[13:06:11] [PASSED] NTSC
[13:06:11] [PASSED] Both, NTSC Default
[13:06:11] [PASSED] Both, PAL Default
[13:06:11] [PASSED] Both, NTSC Default, with PAL on command-line
[13:06:11] [PASSED] Both, PAL Default, with NTSC on command-line
[13:06:11] == [PASSED] drm_test_connector_helper_tv_get_modes_check ===
[13:06:11] ======== [PASSED] drm_connector_helper_tv_get_modes ========
[13:06:11] ================== drm_rect (9 subtests) ===================
[13:06:11] [PASSED] drm_test_rect_clip_scaled_div_by_zero
[13:06:11] [PASSED] drm_test_rect_clip_scaled_not_clipped
[13:06:11] [PASSED] drm_test_rect_clip_scaled_clipped
[13:06:11] [PASSED] drm_test_rect_clip_scaled_signed_vs_unsigned
[13:06:11] ================= drm_test_rect_intersect  =================
[13:06:11] [PASSED] top-left x bottom-right: 2x2+1+1 x 2x2+0+0
[13:06:11] [PASSED] top-right x bottom-left: 2x2+0+0 x 2x2+1-1
[13:06:11] [PASSED] bottom-left x top-right: 2x2+1-1 x 2x2+0+0
[13:06:11] [PASSED] bottom-right x top-left: 2x2+0+0 x 2x2+1+1
[13:06:11] [PASSED] right x left: 2x1+0+0 x 3x1+1+0
[13:06:11] [PASSED] left x right: 3x1+1+0 x 2x1+0+0
[13:06:11] [PASSED] up x bottom: 1x2+0+0 x 1x3+0-1
[13:06:11] [PASSED] bottom x up: 1x3+0-1 x 1x2+0+0
[13:06:11] [PASSED] touching corner: 1x1+0+0 x 2x2+1+1
[13:06:11] [PASSED] touching side: 1x1+0+0 x 1x1+1+0
[13:06:11] [PASSED] equal rects: 2x2+0+0 x 2x2+0+0
[13:06:11] [PASSED] inside another: 2x2+0+0 x 1x1+1+1
[13:06:11] [PASSED] far away: 1x1+0+0 x 1x1+3+6
[13:06:11] [PASSED] points intersecting: 0x0+5+10 x 0x0+5+10
[13:06:11] [PASSED] points not intersecting: 0x0+0+0 x 0x0+5+10
[13:06:11] ============= [PASSED] drm_test_rect_intersect =============
[13:06:11] ================ drm_test_rect_calc_hscale  ================
[13:06:11] [PASSED] normal use
[13:06:11] [PASSED] out of max range
[13:06:11] [PASSED] out of min range
[13:06:11] [PASSED] zero dst
[13:06:11] [PASSED] negative src
[13:06:11] [PASSED] negative dst
[13:06:11] ============ [PASSED] drm_test_rect_calc_hscale ============
[13:06:11] ================ drm_test_rect_calc_vscale  ================
[13:06:11] [PASSED] normal use
[13:06:11] [PASSED] out of max range
[13:06:11] [PASSED] out of min range
[13:06:11] [PASSED] zero dst
[13:06:11] [PASSED] negative src
[13:06:11] [PASSED] negative dst
[13:06:11] ============ [PASSED] drm_test_rect_calc_vscale ============
[13:06:11] ================== drm_test_rect_rotate  ===================
[13:06:11] [PASSED] reflect-x
[13:06:11] [PASSED] reflect-y
[13:06:11] [PASSED] rotate-0
[13:06:11] [PASSED] rotate-90
[13:06:11] [PASSED] rotate-180
[13:06:11] [PASSED] rotate-270
[13:06:11] ============== [PASSED] drm_test_rect_rotate ===============
[13:06:11] ================ drm_test_rect_rotate_inv  =================
[13:06:11] [PASSED] reflect-x
[13:06:11] [PASSED] reflect-y
[13:06:11] [PASSED] rotate-0
[13:06:11] [PASSED] rotate-90
[13:06:11] [PASSED] rotate-180
[13:06:11] [PASSED] rotate-270
[13:06:11] ============ [PASSED] drm_test_rect_rotate_inv =============
[13:06:11] ==================== [PASSED] drm_rect =====================
[13:06:11] ============================================================
[13:06:11] Testing complete. Ran 333 tests: passed: 333
[13:06:11] Elapsed time: 43.492s total, 1.701s configuring, 18.310s building, 23.463s running

+ cleanup
++ stat -c %u:%g /kernel
+ chown -R 1003:1003 /kernel



^ permalink raw reply	[flat|nested] 59+ messages in thread

* [Intel-xe] ✓ CI.Build: success for drm/xe/pmu: Enable PMU interface (rev2)
  2023-06-27 12:21 [Intel-xe] [PATCH v2 0/2] drm/xe/pmu: Enable PMU interface Aravind Iddamsetty
                   ` (4 preceding siblings ...)
  2023-06-27 13:06 ` [Intel-xe] ✓ CI.KUnit: success " Patchwork
@ 2023-06-27 13:10 ` Patchwork
  2023-06-27 13:10 ` [Intel-xe] ✗ CI.Hooks: failure " Patchwork
  6 siblings, 0 replies; 59+ messages in thread
From: Patchwork @ 2023-06-27 13:10 UTC (permalink / raw)
  To: Aravind Iddamsetty; +Cc: intel-xe

== Series Details ==

Series: drm/xe/pmu: Enable PMU interface (rev2)
URL   : https://patchwork.freedesktop.org/series/119504/
State : success

== Summary ==

+ trap cleanup EXIT
+ cd /kernel
+ git clone https://gitlab.freedesktop.org/drm/xe/ci.git .ci
Cloning into '.ci'...
++ date +%s
+ echo -e '\e[0Ksection_start:1687871180:build_x86_64[collapsed=true]\r\e[0KBuild x86-64'
+ mkdir -p build64
section_start:1687871180:build_x86_64[collapsed=true]
Build x86-64
+ cat .ci/kernel/kconfig
+ [[ '' != '' ]]
+ make O=build64 olddefconfig
make[1]: Entering directory '/kernel/build64'
  GEN     Makefile
  HOSTCC  scripts/basic/fixdep
  HOSTCC  scripts/kconfig/conf.o
  HOSTCC  scripts/kconfig/confdata.o
  HOSTCC  scripts/kconfig/expr.o
  LEX     scripts/kconfig/lexer.lex.c
  YACC    scripts/kconfig/parser.tab.[ch]
  HOSTCC  scripts/kconfig/lexer.lex.o
  HOSTCC  scripts/kconfig/menu.o
  HOSTCC  scripts/kconfig/parser.tab.o
  HOSTCC  scripts/kconfig/preprocess.o
  HOSTCC  scripts/kconfig/symbol.o
  HOSTCC  scripts/kconfig/util.o
  HOSTLD  scripts/kconfig/conf
#
# configuration written to .config
#
make[1]: Leaving directory '/kernel/build64'
++ nproc
+ make O=build64 -j48
make[1]: Entering directory '/kernel/build64'
  GEN     Makefile
  WRAP    arch/x86/include/generated/uapi/asm/bpf_perf_event.h
  SYSHDR  arch/x86/include/generated/uapi/asm/unistd_32.h
  SYSHDR  arch/x86/include/generated/uapi/asm/unistd_64.h
  WRAP    arch/x86/include/generated/uapi/asm/errno.h
  SYSTBL  arch/x86/include/generated/asm/syscalls_32.h
  SYSHDR  arch/x86/include/generated/uapi/asm/unistd_x32.h
  WRAP    arch/x86/include/generated/uapi/asm/fcntl.h
  WRAP    arch/x86/include/generated/uapi/asm/ioctl.h
  SYSHDR  arch/x86/include/generated/asm/unistd_32_ia32.h
  SYSHDR  arch/x86/include/generated/asm/unistd_64_x32.h
  WRAP    arch/x86/include/generated/uapi/asm/ioctls.h
  SYSTBL  arch/x86/include/generated/asm/syscalls_64.h
  WRAP    arch/x86/include/generated/uapi/asm/ipcbuf.h
  WRAP    arch/x86/include/generated/uapi/asm/poll.h
  WRAP    arch/x86/include/generated/uapi/asm/param.h
  WRAP    arch/x86/include/generated/uapi/asm/resource.h
  WRAP    arch/x86/include/generated/uapi/asm/socket.h
  WRAP    arch/x86/include/generated/uapi/asm/sockios.h
  WRAP    arch/x86/include/generated/uapi/asm/termios.h
  WRAP    arch/x86/include/generated/uapi/asm/termbits.h
  WRAP    arch/x86/include/generated/uapi/asm/types.h
  UPD     include/generated/uapi/linux/version.h
  UPD     include/config/kernel.release
  HOSTCC  arch/x86/tools/relocs_32.o
  HOSTCC  arch/x86/tools/relocs_64.o
  HOSTCC  arch/x86/tools/relocs_common.o
  UPD     include/generated/compile.h
  WRAP    arch/x86/include/generated/asm/early_ioremap.h
  WRAP    arch/x86/include/generated/asm/export.h
  WRAP    arch/x86/include/generated/asm/mcs_spinlock.h
  WRAP    arch/x86/include/generated/asm/irq_regs.h
  WRAP    arch/x86/include/generated/asm/kmap_size.h
  WRAP    arch/x86/include/generated/asm/local64.h
  WRAP    arch/x86/include/generated/asm/mmiowb.h
  WRAP    arch/x86/include/generated/asm/module.lds.h
  WRAP    arch/x86/include/generated/asm/rwonce.h
  WRAP    arch/x86/include/generated/asm/unaligned.h
  UPD     include/generated/utsrelease.h
  HOSTCC  scripts/unifdef
  HOSTCC  scripts/kallsyms
  HOSTCC  scripts/asn1_compiler
  HOSTCC  scripts/sorttable
  DESCEND objtool
  HOSTCC  /kernel/build64/tools/objtool/fixdep.o
  HOSTLD  /kernel/build64/tools/objtool/fixdep-in.o
  LINK    /kernel/build64/tools/objtool/fixdep
  INSTALL /kernel/build64/tools/objtool/libsubcmd/include/subcmd/help.h
  INSTALL /kernel/build64/tools/objtool/libsubcmd/include/subcmd/exec-cmd.h
  INSTALL /kernel/build64/tools/objtool/libsubcmd/include/subcmd/pager.h
  INSTALL /kernel/build64/tools/objtool/libsubcmd/include/subcmd/parse-options.h
  INSTALL /kernel/build64/tools/objtool/libsubcmd/include/subcmd/run-command.h
  CC      /kernel/build64/tools/objtool/libsubcmd/exec-cmd.o
  CC      /kernel/build64/tools/objtool/libsubcmd/help.o
  INSTALL libsubcmd_headers
  CC      /kernel/build64/tools/objtool/libsubcmd/pager.o
  CC      /kernel/build64/tools/objtool/libsubcmd/parse-options.o
  CC      /kernel/build64/tools/objtool/libsubcmd/run-command.o
  CC      /kernel/build64/tools/objtool/libsubcmd/sigchain.o
  HOSTLD  arch/x86/tools/relocs
  CC      /kernel/build64/tools/objtool/libsubcmd/subcmd-config.o
  CC      scripts/mod/empty.o
  HOSTCC  scripts/mod/mk_elfconfig
  CC      scripts/mod/devicetable-offsets.s
  HDRINST usr/include/video/edid.h
  HDRINST usr/include/video/sisfb.h
  HDRINST usr/include/video/uvesafb.h
  HDRINST usr/include/drm/amdgpu_drm.h
  HDRINST usr/include/drm/qaic_accel.h
  HDRINST usr/include/drm/i915_drm.h
  HDRINST usr/include/drm/vgem_drm.h
  HDRINST usr/include/drm/virtgpu_drm.h
  HDRINST usr/include/drm/xe_drm.h
  HDRINST usr/include/drm/omap_drm.h
  HDRINST usr/include/drm/radeon_drm.h
  HDRINST usr/include/drm/tegra_drm.h
  HDRINST usr/include/drm/drm_mode.h
  HDRINST usr/include/drm/ivpu_accel.h
  HDRINST usr/include/drm/exynos_drm.h
  HDRINST usr/include/drm/drm_sarea.h
  HDRINST usr/include/drm/v3d_drm.h
  HDRINST usr/include/drm/drm_fourcc.h
  HDRINST usr/include/drm/qxl_drm.h
  HDRINST usr/include/drm/nouveau_drm.h
  HDRINST usr/include/drm/habanalabs_accel.h
  HDRINST usr/include/drm/vmwgfx_drm.h
  HDRINST usr/include/drm/msm_drm.h
  HDRINST usr/include/drm/etnaviv_drm.h
  HDRINST usr/include/drm/vc4_drm.h
  HDRINST usr/include/drm/lima_drm.h
  HDRINST usr/include/drm/panfrost_drm.h
  HDRINST usr/include/drm/drm.h
  HDRINST usr/include/drm/armada_drm.h
  HDRINST usr/include/mtd/inftl-user.h
  HDRINST usr/include/mtd/nftl-user.h
  HDRINST usr/include/mtd/mtd-user.h
  HDRINST usr/include/mtd/ubi-user.h
  HDRINST usr/include/mtd/mtd-abi.h
  HDRINST usr/include/xen/gntdev.h
  HDRINST usr/include/xen/gntalloc.h
  HDRINST usr/include/xen/evtchn.h
  HDRINST usr/include/xen/privcmd.h
  HDRINST usr/include/asm-generic/auxvec.h
  HDRINST usr/include/asm-generic/bitsperlong.h
  HDRINST usr/include/asm-generic/posix_types.h
  HDRINST usr/include/asm-generic/ioctls.h
  HDRINST usr/include/asm-generic/mman.h
  HDRINST usr/include/asm-generic/shmbuf.h
  HDRINST usr/include/asm-generic/bpf_perf_event.h
  HDRINST usr/include/asm-generic/types.h
  HDRINST usr/include/asm-generic/poll.h
  HDRINST usr/include/asm-generic/msgbuf.h
  HDRINST usr/include/asm-generic/swab.h
  HDRINST usr/include/asm-generic/statfs.h
  HDRINST usr/include/asm-generic/unistd.h
  HDRINST usr/include/asm-generic/hugetlb_encode.h
  HDRINST usr/include/asm-generic/resource.h
  HDRINST usr/include/asm-generic/param.h
  HDRINST usr/include/asm-generic/termbits-common.h
  HDRINST usr/include/asm-generic/sockios.h
  HDRINST usr/include/asm-generic/kvm_para.h
  HDRINST usr/include/asm-generic/errno.h
  HDRINST usr/include/asm-generic/termios.h
  HDRINST usr/include/asm-generic/mman-common.h
  HDRINST usr/include/asm-generic/ioctl.h
  HDRINST usr/include/asm-generic/socket.h
  HDRINST usr/include/asm-generic/signal-defs.h
  HDRINST usr/include/asm-generic/termbits.h
  HDRINST usr/include/asm-generic/signal.h
  HDRINST usr/include/asm-generic/int-ll64.h
  HDRINST usr/include/asm-generic/siginfo.h
  HDRINST usr/include/asm-generic/stat.h
  HDRINST usr/include/asm-generic/int-l64.h
  HDRINST usr/include/asm-generic/errno-base.h
  HDRINST usr/include/asm-generic/fcntl.h
  HDRINST usr/include/asm-generic/setup.h
  HDRINST usr/include/asm-generic/ipcbuf.h
  HDRINST usr/include/asm-generic/sembuf.h
  HDRINST usr/include/asm-generic/ucontext.h
  HDRINST usr/include/rdma/mlx5_user_ioctl_cmds.h
  HDRINST usr/include/rdma/irdma-abi.h
  HDRINST usr/include/rdma/mana-abi.h
  HDRINST usr/include/rdma/hfi/hfi1_user.h
  HDRINST usr/include/rdma/hfi/hfi1_ioctl.h
  HDRINST usr/include/rdma/rdma_user_rxe.h
  HDRINST usr/include/rdma/rdma_user_ioctl.h
  HDRINST usr/include/rdma/mlx5_user_ioctl_verbs.h
  UPD     scripts/mod/devicetable-offsets.h
  HDRINST usr/include/rdma/bnxt_re-abi.h
  HDRINST usr/include/rdma/hns-abi.h
  HDRINST usr/include/rdma/qedr-abi.h
  HDRINST usr/include/rdma/ib_user_ioctl_cmds.h
  HDRINST usr/include/rdma/vmw_pvrdma-abi.h
  HDRINST usr/include/rdma/ib_user_sa.h
  HDRINST usr/include/rdma/ib_user_ioctl_verbs.h
  HDRINST usr/include/rdma/rvt-abi.h
  HDRINST usr/include/rdma/mlx5-abi.h
  HDRINST usr/include/rdma/rdma_netlink.h
  HDRINST usr/include/rdma/erdma-abi.h
  HDRINST usr/include/rdma/rdma_user_ioctl_cmds.h
  HDRINST usr/include/rdma/rdma_user_cm.h
  HDRINST usr/include/rdma/ib_user_verbs.h
  HDRINST usr/include/rdma/efa-abi.h
  HDRINST usr/include/rdma/siw-abi.h
  HDRINST usr/include/rdma/mlx4-abi.h
  HDRINST usr/include/rdma/mthca-abi.h
  HDRINST usr/include/rdma/ib_user_mad.h
  HDRINST usr/include/rdma/ocrdma-abi.h
  HDRINST usr/include/rdma/cxgb4-abi.h
  HDRINST usr/include/misc/xilinx_sdfec.h
  HDRINST usr/include/misc/uacce/hisi_qm.h
  HDRINST usr/include/misc/uacce/uacce.h
  HDRINST usr/include/misc/cxl.h
  HDRINST usr/include/misc/ocxl.h
  HDRINST usr/include/misc/pvpanic.h
  HDRINST usr/include/misc/fastrpc.h
  HDRINST usr/include/linux/i8k.h
  HDRINST usr/include/linux/acct.h
  HDRINST usr/include/linux/atmmpc.h
  HDRINST usr/include/linux/fs.h
  HDRINST usr/include/linux/cifs/cifs_mount.h
  HDRINST usr/include/linux/cifs/cifs_netlink.h
  HDRINST usr/include/linux/if_packet.h
  HDRINST usr/include/linux/route.h
  HDRINST usr/include/linux/patchkey.h
  HDRINST usr/include/linux/tc_ematch/tc_em_cmp.h
  HDRINST usr/include/linux/tc_ematch/tc_em_ipt.h
  HDRINST usr/include/linux/tc_ematch/tc_em_meta.h
  HDRINST usr/include/linux/tc_ematch/tc_em_nbyte.h
  HDRINST usr/include/linux/tc_ematch/tc_em_text.h
  HDRINST usr/include/linux/virtio_pmem.h
  HDRINST usr/include/linux/rkisp1-config.h
  HDRINST usr/include/linux/vhost.h
  HDRINST usr/include/linux/cec-funcs.h
  HDRINST usr/include/linux/ppdev.h
  HDRINST usr/include/linux/isdn/capicmd.h
  HDRINST usr/include/linux/virtio_fs.h
  HDRINST usr/include/linux/netfilter_ipv6.h
  HDRINST usr/include/linux/lirc.h
  HDRINST usr/include/linux/mroute6.h
  HDRINST usr/include/linux/nl80211-vnd-intel.h
  HDRINST usr/include/linux/ivtvfb.h
  HDRINST usr/include/linux/auxvec.h
  HDRINST usr/include/linux/dm-log-userspace.h
  HDRINST usr/include/linux/dccp.h
  HDRINST usr/include/linux/virtio_scmi.h
  HDRINST usr/include/linux/atmarp.h
  HDRINST usr/include/linux/arcfb.h
  HDRINST usr/include/linux/nbd-netlink.h
  HDRINST usr/include/linux/sched/types.h
  HDRINST usr/include/linux/tcp.h
  HDRINST usr/include/linux/neighbour.h
  HDRINST usr/include/linux/dlm_device.h
  HDRINST usr/include/linux/wmi.h
  HDRINST usr/include/linux/btrfs_tree.h
  HDRINST usr/include/linux/virtio_crypto.h
  HDRINST usr/include/linux/vbox_err.h
  HDRINST usr/include/linux/edd.h
  HDRINST usr/include/linux/loop.h
  HDRINST usr/include/linux/nvme_ioctl.h
  HDRINST usr/include/linux/mmtimer.h
  MKELF   scripts/mod/elfconfig.h
  HDRINST usr/include/linux/if_pppol2tp.h
  HDRINST usr/include/linux/mtio.h
  HDRINST usr/include/linux/if_arcnet.h
  HDRINST usr/include/linux/romfs_fs.h
  HDRINST usr/include/linux/posix_types.h
  HOSTCC  scripts/mod/modpost.o
  HDRINST usr/include/linux/rtc.h
  HOSTCC  scripts/mod/file2alias.o
  HDRINST usr/include/linux/landlock.h
  HOSTCC  scripts/mod/sumversion.o
  HDRINST usr/include/linux/gpio.h
  HDRINST usr/include/linux/selinux_netlink.h
  HDRINST usr/include/linux/pps.h
  HDRINST usr/include/linux/ndctl.h
  HDRINST usr/include/linux/virtio_gpu.h
  HDRINST usr/include/linux/android/binderfs.h
  HDRINST usr/include/linux/android/binder.h
  HDRINST usr/include/linux/virtio_vsock.h
  HDRINST usr/include/linux/sound.h
  HDRINST usr/include/linux/vtpm_proxy.h
  HDRINST usr/include/linux/nfs_fs.h
  HDRINST usr/include/linux/elf-fdpic.h
  HDRINST usr/include/linux/adfs_fs.h
  HDRINST usr/include/linux/target_core_user.h
  HDRINST usr/include/linux/netlink_diag.h
  HDRINST usr/include/linux/const.h
  HDRINST usr/include/linux/firewire-cdev.h
  HDRINST usr/include/linux/vdpa.h
  HDRINST usr/include/linux/if_infiniband.h
  HDRINST usr/include/linux/serial.h
  HDRINST usr/include/linux/iio/types.h
  HDRINST usr/include/linux/iio/buffer.h
  HDRINST usr/include/linux/iio/events.h
  HDRINST usr/include/linux/baycom.h
  HDRINST usr/include/linux/major.h
  HDRINST usr/include/linux/atmppp.h
  HDRINST usr/include/linux/ipv6_route.h
  HDRINST usr/include/linux/spi/spidev.h
  HDRINST usr/include/linux/spi/spi.h
  HDRINST usr/include/linux/virtio_ring.h
  HDRINST usr/include/linux/hdlc/ioctl.h
  HDRINST usr/include/linux/remoteproc_cdev.h
  HDRINST usr/include/linux/hyperv.h
  HDRINST usr/include/linux/rpl_iptunnel.h
  HDRINST usr/include/linux/sync_file.h
  HDRINST usr/include/linux/igmp.h
  HDRINST usr/include/linux/v4l2-dv-timings.h
  HDRINST usr/include/linux/virtio_i2c.h
  HDRINST usr/include/linux/xfrm.h
  HDRINST usr/include/linux/capability.h
  HDRINST usr/include/linux/gtp.h
  HDRINST usr/include/linux/xdp_diag.h
  HDRINST usr/include/linux/pkt_cls.h
  HDRINST usr/include/linux/suspend_ioctls.h
  HDRINST usr/include/linux/vt.h
  HDRINST usr/include/linux/loadpin.h
  HDRINST usr/include/linux/dlm_plock.h
  HDRINST usr/include/linux/fb.h
  HDRINST usr/include/linux/max2175.h
  HDRINST usr/include/linux/sunrpc/debug.h
  HDRINST usr/include/linux/gsmmux.h
  HDRINST usr/include/linux/watchdog.h
  HDRINST usr/include/linux/vhost_types.h
  HDRINST usr/include/linux/vduse.h
  HDRINST usr/include/linux/ila.h
  HDRINST usr/include/linux/tdx-guest.h
  HDRINST usr/include/linux/close_range.h
  HDRINST usr/include/linux/cryptouser.h
  HDRINST usr/include/linux/ivtv.h
  HDRINST usr/include/linux/netfilter/xt_string.h
  HDRINST usr/include/linux/netfilter/nfnetlink_compat.h
  HDRINST usr/include/linux/netfilter/nf_nat.h
  HDRINST usr/include/linux/netfilter/xt_recent.h
  HDRINST usr/include/linux/netfilter/xt_addrtype.h
  HDRINST usr/include/linux/netfilter/nf_conntrack_tcp.h
  HDRINST usr/include/linux/netfilter/xt_MARK.h
  HDRINST usr/include/linux/netfilter/xt_SYNPROXY.h
  HDRINST usr/include/linux/netfilter/xt_multiport.h
  HDRINST usr/include/linux/netfilter/nfnetlink.h
  HDRINST usr/include/linux/netfilter/xt_cgroup.h
  HDRINST usr/include/linux/netfilter/nf_synproxy.h
  HDRINST usr/include/linux/netfilter/xt_TCPOPTSTRIP.h
  HDRINST usr/include/linux/netfilter/nfnetlink_log.h
  HDRINST usr/include/linux/netfilter/xt_TPROXY.h
  HDRINST usr/include/linux/netfilter/xt_u32.h
  HDRINST usr/include/linux/netfilter/nfnetlink_osf.h
  HDRINST usr/include/linux/netfilter/xt_ecn.h
  HDRINST usr/include/linux/netfilter/xt_esp.h
  HDRINST usr/include/linux/netfilter/nfnetlink_hook.h
  HDRINST usr/include/linux/netfilter/xt_mac.h
  HDRINST usr/include/linux/netfilter/xt_comment.h
  HDRINST usr/include/linux/netfilter/xt_NFQUEUE.h
  HDRINST usr/include/linux/netfilter/xt_osf.h
  HDRINST usr/include/linux/netfilter/xt_hashlimit.h
  HDRINST usr/include/linux/netfilter/nf_conntrack_sctp.h
  HDRINST usr/include/linux/netfilter/xt_socket.h
  HDRINST usr/include/linux/netfilter/xt_connmark.h
  HDRINST usr/include/linux/netfilter/xt_sctp.h
  HDRINST usr/include/linux/netfilter/xt_tcpudp.h
  HDRINST usr/include/linux/netfilter/xt_DSCP.h
  HDRINST usr/include/linux/netfilter/xt_time.h
  HDRINST usr/include/linux/netfilter/xt_IDLETIMER.h
  HDRINST usr/include/linux/netfilter/xt_policy.h
  HDRINST usr/include/linux/netfilter/xt_rpfilter.h
  HDRINST usr/include/linux/netfilter/xt_nfacct.h
  HDRINST usr/include/linux/netfilter/xt_SECMARK.h
  HDRINST usr/include/linux/netfilter/xt_length.h
  HDRINST usr/include/linux/netfilter/nfnetlink_cthelper.h
  HDRINST usr/include/linux/netfilter/xt_quota.h
  HDRINST usr/include/linux/netfilter/xt_CLASSIFY.h
  HDRINST usr/include/linux/netfilter/xt_iprange.h
  HDRINST usr/include/linux/netfilter/xt_ipcomp.h
  HDRINST usr/include/linux/netfilter/xt_bpf.h
  HDRINST usr/include/linux/netfilter/xt_LOG.h
  HDRINST usr/include/linux/netfilter/xt_rateest.h
  HDRINST usr/include/linux/netfilter/xt_CONNSECMARK.h
  HDRINST usr/include/linux/netfilter/xt_HMARK.h
  HDRINST usr/include/linux/netfilter/xt_CONNMARK.h
  HDRINST usr/include/linux/netfilter/xt_pkttype.h
  HDRINST usr/include/linux/netfilter/xt_ipvs.h
  HDRINST usr/include/linux/netfilter/xt_devgroup.h
  HDRINST usr/include/linux/netfilter/xt_AUDIT.h
  HDRINST usr/include/linux/netfilter/xt_realm.h
  HDRINST usr/include/linux/netfilter/nf_conntrack_common.h
  HDRINST usr/include/linux/netfilter/xt_set.h
  HDRINST usr/include/linux/netfilter/xt_LED.h
  HDRINST usr/include/linux/netfilter/xt_connlabel.h
  HDRINST usr/include/linux/netfilter/xt_owner.h
  HDRINST usr/include/linux/netfilter/xt_dccp.h
  HDRINST usr/include/linux/netfilter/xt_limit.h
  HDRINST usr/include/linux/netfilter/xt_conntrack.h
  HDRINST usr/include/linux/netfilter/xt_TEE.h
  HDRINST usr/include/linux/netfilter/xt_RATEEST.h
  HDRINST usr/include/linux/netfilter/xt_connlimit.h
  HDRINST usr/include/linux/netfilter/ipset/ip_set.h
  HDRINST usr/include/linux/netfilter/ipset/ip_set_list.h
  HDRINST usr/include/linux/netfilter/ipset/ip_set_hash.h
  HDRINST usr/include/linux/netfilter/ipset/ip_set_bitmap.h
  HDRINST usr/include/linux/netfilter/x_tables.h
  HDRINST usr/include/linux/netfilter/xt_dscp.h
  HDRINST usr/include/linux/netfilter/nf_conntrack_ftp.h
  HDRINST usr/include/linux/netfilter/xt_cluster.h
  HDRINST usr/include/linux/netfilter/nf_log.h
  HDRINST usr/include/linux/netfilter/nf_conntrack_tuple_common.h
  HDRINST usr/include/linux/netfilter/xt_tcpmss.h
  HDRINST usr/include/linux/netfilter/xt_NFLOG.h
  HDRINST usr/include/linux/netfilter/xt_l2tp.h
  HDRINST usr/include/linux/netfilter/xt_helper.h
  HDRINST usr/include/linux/netfilter/xt_statistic.h
  HDRINST usr/include/linux/netfilter/nfnetlink_queue.h
  HDRINST usr/include/linux/netfilter/nfnetlink_cttimeout.h
  HDRINST usr/include/linux/netfilter/xt_CT.h
  HDRINST usr/include/linux/netfilter/xt_CHECKSUM.h
  HDRINST usr/include/linux/netfilter/xt_connbytes.h
  HDRINST usr/include/linux/netfilter/xt_state.h
  HDRINST usr/include/linux/netfilter/nf_tables.h
  HDRINST usr/include/linux/netfilter/xt_mark.h
  HDRINST usr/include/linux/netfilter/xt_cpu.h
  HDRINST usr/include/linux/netfilter/nf_tables_compat.h
  HDRINST usr/include/linux/netfilter/xt_physdev.h
  HDRINST usr/include/linux/netfilter/nfnetlink_conntrack.h
  HDRINST usr/include/linux/netfilter/nfnetlink_acct.h
  HDRINST usr/include/linux/netfilter/xt_TCPMSS.h
  HDRINST usr/include/linux/tty_flags.h
  HDRINST usr/include/linux/if_phonet.h
  HDRINST usr/include/linux/elf-em.h
  HDRINST usr/include/linux/vm_sockets.h
  HDRINST usr/include/linux/dlmconstants.h
  HDRINST usr/include/linux/bsg.h
  HDRINST usr/include/linux/matroxfb.h
  HDRINST usr/include/linux/sysctl.h
  HDRINST usr/include/linux/unix_diag.h
  HDRINST usr/include/linux/pcitest.h
  HDRINST usr/include/linux/mman.h
  HDRINST usr/include/linux/if_plip.h
  HDRINST usr/include/linux/virtio_balloon.h
  HDRINST usr/include/linux/pidfd.h
  HDRINST usr/include/linux/f2fs.h
  HDRINST usr/include/linux/x25.h
  HDRINST usr/include/linux/if_cablemodem.h
  HDRINST usr/include/linux/utsname.h
  HDRINST usr/include/linux/counter.h
  HDRINST usr/include/linux/atm_tcp.h
  HDRINST usr/include/linux/atalk.h
  HDRINST usr/include/linux/virtio_rng.h
  HDRINST usr/include/linux/vboxguest.h
  HDRINST usr/include/linux/bpf_perf_event.h
  HDRINST usr/include/linux/ipmi_ssif_bmc.h
  HDRINST usr/include/linux/nfs_mount.h
  HDRINST usr/include/linux/sonet.h
  HDRINST usr/include/linux/netfilter.h
  HDRINST usr/include/linux/keyctl.h
  HDRINST usr/include/linux/nl80211.h
  HDRINST usr/include/linux/misc/bcm_vk.h
  HDRINST usr/include/linux/audit.h
  HDRINST usr/include/linux/tipc_config.h
  HDRINST usr/include/linux/tipc_sockets_diag.h
  HDRINST usr/include/linux/futex.h
  HDRINST usr/include/linux/sev-guest.h
  HDRINST usr/include/linux/ublk_cmd.h
  HDRINST usr/include/linux/types.h
  HDRINST usr/include/linux/virtio_input.h
  HDRINST usr/include/linux/if_slip.h
  HDRINST usr/include/linux/personality.h
  HDRINST usr/include/linux/openat2.h
  HDRINST usr/include/linux/poll.h
  HDRINST usr/include/linux/posix_acl.h
  HDRINST usr/include/linux/smc_diag.h
  HDRINST usr/include/linux/snmp.h
  HDRINST usr/include/linux/errqueue.h
  HDRINST usr/include/linux/if_tunnel.h
  HDRINST usr/include/linux/fanotify.h
  HDRINST usr/include/linux/kernel.h
  HDRINST usr/include/linux/rtnetlink.h
  HDRINST usr/include/linux/rpl.h
  HDRINST usr/include/linux/memfd.h
  HDRINST usr/include/linux/serial_core.h
  HDRINST usr/include/linux/dns_resolver.h
  HDRINST usr/include/linux/pr.h
  HDRINST usr/include/linux/atm_eni.h
  HDRINST usr/include/linux/lp.h
  HDRINST usr/include/linux/virtio_mem.h
  HDRINST usr/include/linux/ultrasound.h
  HDRINST usr/include/linux/sctp.h
  HDRINST usr/include/linux/uio.h
  HDRINST usr/include/linux/tcp_metrics.h
  HDRINST usr/include/linux/wwan.h
  HDRINST usr/include/linux/atmbr2684.h
  HDRINST usr/include/linux/in_route.h
  HDRINST usr/include/linux/qemu_fw_cfg.h
  HDRINST usr/include/linux/if_macsec.h
  HDRINST usr/include/linux/usb/charger.h
  HDRINST usr/include/linux/usb/g_uvc.h
  HDRINST usr/include/linux/usb/gadgetfs.h
  HDRINST usr/include/linux/usb/raw_gadget.h
  HDRINST usr/include/linux/usb/cdc-wdm.h
  HDRINST usr/include/linux/usb/g_printer.h
  HDRINST usr/include/linux/usb/midi.h
  HDRINST usr/include/linux/usb/tmc.h
  HDRINST usr/include/linux/usb/video.h
  HDRINST usr/include/linux/usb/functionfs.h
  HDRINST usr/include/linux/usb/audio.h
  HDRINST usr/include/linux/usb/ch11.h
  HDRINST usr/include/linux/usb/ch9.h
  HDRINST usr/include/linux/usb/cdc.h
  HDRINST usr/include/linux/jffs2.h
  HDRINST usr/include/linux/ax25.h
  HDRINST usr/include/linux/auto_fs.h
  HDRINST usr/include/linux/tiocl.h
  HDRINST usr/include/linux/scc.h
  HDRINST usr/include/linux/psci.h
  HDRINST usr/include/linux/swab.h
  HDRINST usr/include/linux/cec.h
  HDRINST usr/include/linux/kfd_ioctl.h
  HDRINST usr/include/linux/smc.h
  HDRINST usr/include/linux/qrtr.h
  HDRINST usr/include/linux/screen_info.h
  HDRINST usr/include/linux/nfsacl.h
  HDRINST usr/include/linux/seg6_hmac.h
  HDRINST usr/include/linux/gameport.h
  HDRINST usr/include/linux/wireless.h
  HDRINST usr/include/linux/fdreg.h
  HDRINST usr/include/linux/cciss_defs.h
  HDRINST usr/include/linux/serial_reg.h
  HDRINST usr/include/linux/perf_event.h
  HDRINST usr/include/linux/in6.h
  HDRINST usr/include/linux/hid.h
  HDRINST usr/include/linux/netlink.h
  HDRINST usr/include/linux/fuse.h
  HDRINST usr/include/linux/magic.h
  HDRINST usr/include/linux/ioam6_iptunnel.h
  HDRINST usr/include/linux/stm.h
  HDRINST usr/include/linux/vsockmon.h
  HDRINST usr/include/linux/seg6.h
  HDRINST usr/include/linux/idxd.h
  HDRINST usr/include/linux/nitro_enclaves.h
  HDRINST usr/include/linux/ptrace.h
  HDRINST usr/include/linux/ioam6_genl.h
  HDRINST usr/include/linux/qnx4_fs.h
  HDRINST usr/include/linux/fsl_mc.h
  HDRINST usr/include/linux/net_tstamp.h
  HDRINST usr/include/linux/msg.h
  HDRINST usr/include/linux/netfilter_ipv4/ipt_TTL.h
  HDRINST usr/include/linux/netfilter_ipv4/ipt_ttl.h
  HDRINST usr/include/linux/netfilter_ipv4/ipt_ah.h
  HDRINST usr/include/linux/netfilter_ipv4/ipt_ECN.h
  HDRINST usr/include/linux/netfilter_ipv4/ip_tables.h
  HDRINST usr/include/linux/netfilter_ipv4/ipt_ecn.h
  HDRINST usr/include/linux/netfilter_ipv4/ipt_CLUSTERIP.h
  HDRINST usr/include/linux/netfilter_ipv4/ipt_REJECT.h
  HDRINST usr/include/linux/netfilter_ipv4/ipt_LOG.h
  HDRINST usr/include/linux/sem.h
  HDRINST usr/include/linux/net_namespace.h
  HDRINST usr/include/linux/radeonfb.h
  HDRINST usr/include/linux/tee.h
  HDRINST usr/include/linux/udp.h
  HDRINST usr/include/linux/virtio_bt.h
  HDRINST usr/include/linux/v4l2-subdev.h
  HDRINST usr/include/linux/posix_acl_xattr.h
  HDRINST usr/include/linux/v4l2-mediabus.h
  HDRINST usr/include/linux/atmapi.h
  HDRINST usr/include/linux/raid/md_p.h
  HDRINST usr/include/linux/raid/md_u.h
  HDRINST usr/include/linux/zorro_ids.h
  HDRINST usr/include/linux/nbd.h
  HDRINST usr/include/linux/isst_if.h
  HDRINST usr/include/linux/rxrpc.h
  HDRINST usr/include/linux/unistd.h
  HDRINST usr/include/linux/if_arp.h
  HDRINST usr/include/linux/atm_zatm.h
  HDRINST usr/include/linux/io_uring.h
  HDRINST usr/include/linux/if_fddi.h
  HDRINST usr/include/linux/bpqether.h
  HDRINST usr/include/linux/sysinfo.h
  HDRINST usr/include/linux/auto_dev-ioctl.h
  HDRINST usr/include/linux/nfs4_mount.h
  HDRINST usr/include/linux/keyboard.h
  HDRINST usr/include/linux/virtio_mmio.h
  HDRINST usr/include/linux/input.h
  HDRINST usr/include/linux/qnxtypes.h
  HDRINST usr/include/linux/mdio.h
  HDRINST usr/include/linux/lwtunnel.h
  HDRINST usr/include/linux/gfs2_ondisk.h
  HDRINST usr/include/linux/nfs4.h
  HDRINST usr/include/linux/ptp_clock.h
  HDRINST usr/include/linux/nubus.h
  HDRINST usr/include/linux/if_bonding.h
  HDRINST usr/include/linux/kcov.h
  HDRINST usr/include/linux/fadvise.h
  HDRINST usr/include/linux/taskstats.h
  HDRINST usr/include/linux/veth.h
  HDRINST usr/include/linux/atm.h
  HDRINST usr/include/linux/ipmi.h
  HDRINST usr/include/linux/kdev_t.h
  HDRINST usr/include/linux/mount.h
  HDRINST usr/include/linux/shm.h
  HDRINST usr/include/linux/resource.h
  HDRINST usr/include/linux/prctl.h
  HDRINST usr/include/linux/watch_queue.h
  HDRINST usr/include/linux/sched.h
  HDRINST usr/include/linux/phonet.h
  HDRINST usr/include/linux/random.h
  HDRINST usr/include/linux/tty.h
  HDRINST usr/include/linux/apm_bios.h
  HDRINST usr/include/linux/fd.h
  HDRINST usr/include/linux/um_timetravel.h
  HDRINST usr/include/linux/tls.h
  HDRINST usr/include/linux/rpmsg_types.h
  HDRINST usr/include/linux/pfrut.h
  HDRINST usr/include/linux/mei.h
  HDRINST usr/include/linux/fsi.h
  HDRINST usr/include/linux/rds.h
  HDRINST usr/include/linux/if_x25.h
  HDRINST usr/include/linux/param.h
  HDRINST usr/include/linux/netdevice.h
  HDRINST usr/include/linux/binfmts.h
  HDRINST usr/include/linux/if_pppox.h
  HDRINST usr/include/linux/sockios.h
  HDRINST usr/include/linux/kcm.h
  HDRINST usr/include/linux/virtio_9p.h
  HDRINST usr/include/linux/genwqe/genwqe_card.h
  HDRINST usr/include/linux/if_tun.h
  HDRINST usr/include/linux/if_ether.h
  HDRINST usr/include/linux/kvm_para.h
  HDRINST usr/include/linux/kernel-page-flags.h
  HDRINST usr/include/linux/cdrom.h
  HDRINST usr/include/linux/un.h
  HDRINST usr/include/linux/module.h
  HDRINST usr/include/linux/a.out.h
  HDRINST usr/include/linux/mqueue.h
  HDRINST usr/include/linux/input-event-codes.h
  HDRINST usr/include/linux/coda.h
  HDRINST usr/include/linux/rio_mport_cdev.h
  HDRINST usr/include/linux/ipsec.h
  HDRINST usr/include/linux/blkpg.h
  HDRINST usr/include/linux/blkzoned.h
  HDRINST usr/include/linux/netfilter_bridge/ebt_arpreply.h
  HDRINST usr/include/linux/netfilter_bridge/ebt_redirect.h
  HDRINST usr/include/linux/netfilter_bridge/ebt_nflog.h
  HDRINST usr/include/linux/netfilter_bridge/ebt_802_3.h
  HDRINST usr/include/linux/netfilter_bridge/ebt_nat.h
  HDRINST usr/include/linux/netfilter_bridge/ebt_mark_m.h
  HDRINST usr/include/linux/netfilter_bridge/ebtables.h
  HDRINST usr/include/linux/netfilter_bridge/ebt_vlan.h
  HDRINST usr/include/linux/netfilter_bridge/ebt_limit.h
  HDRINST usr/include/linux/netfilter_bridge/ebt_log.h
  HDRINST usr/include/linux/netfilter_bridge/ebt_stp.h
  HDRINST usr/include/linux/netfilter_bridge/ebt_pkttype.h
  HDRINST usr/include/linux/netfilter_bridge/ebt_ip.h
  HDRINST usr/include/linux/netfilter_bridge/ebt_ip6.h
  HDRINST usr/include/linux/netfilter_bridge/ebt_arp.h
  HDRINST usr/include/linux/netfilter_bridge/ebt_mark_t.h
  HDRINST usr/include/linux/netfilter_bridge/ebt_among.h
  HDRINST usr/include/linux/reiserfs_fs.h
  HDRINST usr/include/linux/cciss_ioctl.h
  HDRINST usr/include/linux/fsmap.h
  HDRINST usr/include/linux/smiapp.h
  HDRINST usr/include/linux/switchtec_ioctl.h
  HDRINST usr/include/linux/atmdev.h
  HDRINST usr/include/linux/hpet.h
  HDRINST usr/include/linux/virtio_config.h
  HDRINST usr/include/linux/string.h
  HDRINST usr/include/linux/kfd_sysfs.h
  HDRINST usr/include/linux/inet_diag.h
  HDRINST usr/include/linux/netdev.h
  HDRINST usr/include/linux/xattr.h
  HDRINST usr/include/linux/iommufd.h
  HDRINST usr/include/linux/errno.h
  HDRINST usr/include/linux/icmp.h
  HDRINST usr/include/linux/i2o-dev.h
  HDRINST usr/include/linux/pg.h
  HDRINST usr/include/linux/if_bridge.h
  HDRINST usr/include/linux/thermal.h
  HDRINST usr/include/linux/uinput.h
  HDRINST usr/include/linux/dqblk_xfs.h
  HDRINST usr/include/linux/v4l2-common.h
  HDRINST usr/include/linux/nvram.h
  HDRINST usr/include/linux/if_vlan.h
  HDRINST usr/include/linux/uhid.h
  HDRINST usr/include/linux/omap3isp.h
  HDRINST usr/include/linux/rose.h
  HDRINST usr/include/linux/phantom.h
  HDRINST usr/include/linux/ipmi_msgdefs.h
  HDRINST usr/include/linux/bcm933xx_hcs.h
  HDRINST usr/include/linux/bpf.h
  HDRINST usr/include/linux/mempolicy.h
  HDRINST usr/include/linux/efs_fs_sb.h
  HDRINST usr/include/linux/nexthop.h
  HDRINST usr/include/linux/net_dropmon.h
  HDRINST usr/include/linux/surface_aggregator/cdev.h
  HDRINST usr/include/linux/surface_aggregator/dtx.h
  HDRINST usr/include/linux/net.h
  HDRINST usr/include/linux/mii.h
  HDRINST usr/include/linux/cm4000_cs.h
  HDRINST usr/include/linux/virtio_pcidev.h
  HDRINST usr/include/linux/termios.h
  HDRINST usr/include/linux/cgroupstats.h
  HDRINST usr/include/linux/mpls.h
  HDRINST usr/include/linux/iommu.h
  HDRINST usr/include/linux/toshiba.h
  HDRINST usr/include/linux/virtio_scsi.h
  HDRINST usr/include/linux/zorro.h
  HDRINST usr/include/linux/chio.h
  HDRINST usr/include/linux/pkt_sched.h
  HDRINST usr/include/linux/cramfs_fs.h
  HDRINST usr/include/linux/nfs3.h
  HDRINST usr/include/linux/vfio_ccw.h
  HDRINST usr/include/linux/atm_nicstar.h
  HDRINST usr/include/linux/ncsi.h
  HDRINST usr/include/linux/virtio_net.h
  HDRINST usr/include/linux/ioctl.h
  HDRINST usr/include/linux/stddef.h
  HDRINST usr/include/linux/limits.h
  HDRINST usr/include/linux/ipmi_bmc.h
  HDRINST usr/include/linux/netfilter_arp.h
  HDRINST usr/include/linux/if_addr.h
  HDRINST usr/include/linux/rpmsg.h
  HDRINST usr/include/linux/media-bus-format.h
  HDRINST usr/include/linux/kernelcapi.h
  HDRINST usr/include/linux/ppp_defs.h
  HDRINST usr/include/linux/ethtool.h
  HDRINST usr/include/linux/aspeed-video.h
  HDRINST usr/include/linux/hdlc.h
  HDRINST usr/include/linux/fscrypt.h
  HDRINST usr/include/linux/batadv_packet.h
  HDRINST usr/include/linux/uuid.h
  HDRINST usr/include/linux/capi.h
  HDRINST usr/include/linux/mptcp.h
  HDRINST usr/include/linux/hidraw.h
  HDRINST usr/include/linux/virtio_console.h
  HDRINST usr/include/linux/irqnr.h
  HDRINST usr/include/linux/coresight-stm.h
  HDRINST usr/include/linux/cxl_mem.h
  HDRINST usr/include/linux/iso_fs.h
  HDRINST usr/include/linux/virtio_blk.h
  HDRINST usr/include/linux/udf_fs_i.h
  HDRINST usr/include/linux/coff.h
  HDRINST usr/include/linux/dma-buf.h
  HDRINST usr/include/linux/ife.h
  HDRINST usr/include/linux/agpgart.h
  HDRINST usr/include/linux/socket.h
  HDRINST usr/include/linux/nilfs2_ondisk.h
  HDRINST usr/include/linux/connector.h
  HDRINST usr/include/linux/auto_fs4.h
  HDRINST usr/include/linux/bt-bmc.h
  HDRINST usr/include/linux/map_to_7segment.h
  HDRINST usr/include/linux/tc_act/tc_skbedit.h
  HDRINST usr/include/linux/tc_act/tc_ctinfo.h
  HDRINST usr/include/linux/tc_act/tc_defact.h
  HDRINST usr/include/linux/tc_act/tc_gact.h
  HDRINST usr/include/linux/tc_act/tc_vlan.h
  HDRINST usr/include/linux/tc_act/tc_skbmod.h
  HDRINST usr/include/linux/tc_act/tc_sample.h
  HDRINST usr/include/linux/tc_act/tc_tunnel_key.h
  HDRINST usr/include/linux/tc_act/tc_gate.h
  HDRINST usr/include/linux/tc_act/tc_mirred.h
  HDRINST usr/include/linux/tc_act/tc_nat.h
  HDRINST usr/include/linux/tc_act/tc_csum.h
  HDRINST usr/include/linux/tc_act/tc_connmark.h
  LD      /kernel/build64/tools/objtool/libsubcmd/libsubcmd-in.o
  HDRINST usr/include/linux/tc_act/tc_ife.h
  HDRINST usr/include/linux/tc_act/tc_mpls.h
  HDRINST usr/include/linux/tc_act/tc_ct.h
  HDRINST usr/include/linux/tc_act/tc_pedit.h
  HDRINST usr/include/linux/tc_act/tc_bpf.h
  HDRINST usr/include/linux/tc_act/tc_ipt.h
  HDRINST usr/include/linux/netrom.h
  HDRINST usr/include/linux/joystick.h
  HDRINST usr/include/linux/falloc.h
  HDRINST usr/include/linux/cycx_cfm.h
  HDRINST usr/include/linux/omapfb.h
  HDRINST usr/include/linux/msdos_fs.h
  HDRINST usr/include/linux/virtio_types.h
  HDRINST usr/include/linux/mroute.h
  HDRINST usr/include/linux/psample.h
  HDRINST usr/include/linux/ipv6.h
  HDRINST usr/include/linux/dw100.h
  HDRINST usr/include/linux/psp-sev.h
  HDRINST usr/include/linux/vfio.h
  HDRINST usr/include/linux/if_ppp.h
  HDRINST usr/include/linux/byteorder/big_endian.h
  HDRINST usr/include/linux/byteorder/little_endian.h
  HDRINST usr/include/linux/comedi.h
  HDRINST usr/include/linux/scif_ioctl.h
  HDRINST usr/include/linux/timerfd.h
  HDRINST usr/include/linux/time_types.h
  HDRINST usr/include/linux/firewire-constants.h
  HDRINST usr/include/linux/virtio_snd.h
  HDRINST usr/include/linux/ppp-ioctl.h
  HDRINST usr/include/linux/fib_rules.h
  HDRINST usr/include/linux/gen_stats.h
  HDRINST usr/include/linux/virtio_iommu.h
  HDRINST usr/include/linux/genetlink.h
  HDRINST usr/include/linux/uvcvideo.h
  AR      /kernel/build64/tools/objtool/libsubcmd/libsubcmd.a
  HDRINST usr/include/linux/pfkeyv2.h
  HDRINST usr/include/linux/soundcard.h
  HDRINST usr/include/linux/times.h
  HDRINST usr/include/linux/nfc.h
  HDRINST usr/include/linux/affs_hardblocks.h
  HDRINST usr/include/linux/nilfs2_api.h
  HDRINST usr/include/linux/rseq.h
  HDRINST usr/include/linux/caif/caif_socket.h
  HDRINST usr/include/linux/caif/if_caif.h
  HDRINST usr/include/linux/i2c-dev.h
  HDRINST usr/include/linux/cuda.h
  HDRINST usr/include/linux/cn_proc.h
  HDRINST usr/include/linux/parport.h
  HDRINST usr/include/linux/v4l2-controls.h
  HDRINST usr/include/linux/hsi/cs-protocol.h
  HDRINST usr/include/linux/hsi/hsi_char.h
  HDRINST usr/include/linux/seg6_genl.h
  HDRINST usr/include/linux/am437x-vpfe.h
  HDRINST usr/include/linux/amt.h
  HDRINST usr/include/linux/netconf.h
  HDRINST usr/include/linux/erspan.h
  HDRINST usr/include/linux/nsfs.h
  HDRINST usr/include/linux/xilinx-v4l2-controls.h
  HDRINST usr/include/linux/aspeed-p2a-ctrl.h
  HDRINST usr/include/linux/vfio_zdev.h
  HDRINST usr/include/linux/serio.h
  HDRINST usr/include/linux/acrn.h
  HDRINST usr/include/linux/nfs2.h
  HDRINST usr/include/linux/virtio_pci.h
  HDRINST usr/include/linux/ipc.h
  HDRINST usr/include/linux/ethtool_netlink.h
  HDRINST usr/include/linux/kd.h
  HDRINST usr/include/linux/elf.h
  HDRINST usr/include/linux/videodev2.h
  HDRINST usr/include/linux/if_alg.h
  HDRINST usr/include/linux/sonypi.h
  HDRINST usr/include/linux/fsverity.h
  HDRINST usr/include/linux/if.h
  HDRINST usr/include/linux/btrfs.h
  HDRINST usr/include/linux/vm_sockets_diag.h
  HDRINST usr/include/linux/netfilter_bridge.h
  HDRINST usr/include/linux/packet_diag.h
  HDRINST usr/include/linux/netfilter_ipv4.h
  HDRINST usr/include/linux/kvm.h
  HDRINST usr/include/linux/pci.h
  HDRINST usr/include/linux/if_addrlabel.h
  HDRINST usr/include/linux/hdlcdrv.h
  HDRINST usr/include/linux/cfm_bridge.h
  HDRINST usr/include/linux/fiemap.h
  HDRINST usr/include/linux/dm-ioctl.h
  HDRINST usr/include/linux/aspeed-lpc-ctrl.h
  HDRINST usr/include/linux/atmioc.h
  HDRINST usr/include/linux/dlm.h
  HDRINST usr/include/linux/pci_regs.h
  HDRINST usr/include/linux/cachefiles.h
  HDRINST usr/include/linux/membarrier.h
  HDRINST usr/include/linux/nfs_idmap.h
  HDRINST usr/include/linux/ip.h
  HDRINST usr/include/linux/atm_he.h
  HDRINST usr/include/linux/nfsd/export.h
  HDRINST usr/include/linux/nfsd/stats.h
  HDRINST usr/include/linux/nfsd/debug.h
  HDRINST usr/include/linux/nfsd/cld.h
  HDRINST usr/include/linux/ip_vs.h
  HDRINST usr/include/linux/vmcore.h
  HDRINST usr/include/linux/vbox_vmmdev_types.h
  HDRINST usr/include/linux/dvb/osd.h
  CC      /kernel/build64/tools/objtool/weak.o
  HDRINST usr/include/linux/dvb/dmx.h
  HDRINST usr/include/linux/dvb/net.h
  HDRINST usr/include/linux/dvb/frontend.h
  HDRINST usr/include/linux/dvb/ca.h
  HDRINST usr/include/linux/dvb/version.h
  CC      /kernel/build64/tools/objtool/check.o
  HDRINST usr/include/linux/dvb/video.h
  HDRINST usr/include/linux/dvb/audio.h
  CC      /kernel/build64/tools/objtool/special.o
  HDRINST usr/include/linux/nfs.h
  CC      /kernel/build64/tools/objtool/builtin-check.o
  MKDIR   /kernel/build64/tools/objtool/arch/x86/
  HDRINST usr/include/linux/if_link.h
  HDRINST usr/include/linux/wait.h
  CC      /kernel/build64/tools/objtool/elf.o
  HDRINST usr/include/linux/icmpv6.h
  HDRINST usr/include/linux/media.h
  MKDIR   /kernel/build64/tools/objtool/arch/x86/lib/
  CC      /kernel/build64/tools/objtool/objtool.o
  HDRINST usr/include/linux/seg6_local.h
  CC      /kernel/build64/tools/objtool/arch/x86/special.o
  CC      /kernel/build64/tools/objtool/orc_gen.o
  HDRINST usr/include/linux/openvswitch.h
  GEN     /kernel/build64/tools/objtool/arch/x86/lib/inat-tables.c
  HDRINST usr/include/linux/atmsap.h
  CC      /kernel/build64/tools/objtool/orc_dump.o
  HDRINST usr/include/linux/bpfilter.h
  HDRINST usr/include/linux/fpga-dfl.h
  HDRINST usr/include/linux/userio.h
  HDRINST usr/include/linux/signal.h
  CC      /kernel/build64/tools/objtool/libstring.o
  HDRINST usr/include/linux/map_to_14segment.h
  CC      /kernel/build64/tools/objtool/libctype.o
  HDRINST usr/include/linux/hdreg.h
  HDRINST usr/include/linux/utime.h
  CC      /kernel/build64/tools/objtool/str_error_r.o
  CC      /kernel/build64/tools/objtool/librbtree.o
  HDRINST usr/include/linux/usbdevice_fs.h
  HDRINST usr/include/linux/timex.h
  HDRINST usr/include/linux/if_fc.h
  HDRINST usr/include/linux/reiserfs_xattr.h
  HDRINST usr/include/linux/hw_breakpoint.h
  HDRINST usr/include/linux/quota.h
  HDRINST usr/include/linux/ioprio.h
  HDRINST usr/include/linux/eventpoll.h
  HDRINST usr/include/linux/atmclip.h
  HDRINST usr/include/linux/can.h
  HDRINST usr/include/linux/if_team.h
  HDRINST usr/include/linux/usbip.h
  HDRINST usr/include/linux/stat.h
  HDRINST usr/include/linux/fou.h
  HDRINST usr/include/linux/hash_info.h
  HDRINST usr/include/linux/ppp-comp.h
  HDRINST usr/include/linux/ip6_tunnel.h
  HDRINST usr/include/linux/tipc_netlink.h
  HDRINST usr/include/linux/in.h
  HDRINST usr/include/linux/wireguard.h
  HDRINST usr/include/linux/btf.h
  HDRINST usr/include/linux/batman_adv.h
  HDRINST usr/include/linux/fcntl.h
  HDRINST usr/include/linux/if_ltalk.h
  HDRINST usr/include/linux/i2c.h
  HDRINST usr/include/linux/atm_idt77105.h
  HDRINST usr/include/linux/kexec.h
  HDRINST usr/include/linux/arm_sdei.h
  HDRINST usr/include/linux/netfilter_ipv6/ip6_tables.h
  HDRINST usr/include/linux/netfilter_ipv6/ip6t_ah.h
  HDRINST usr/include/linux/netfilter_ipv6/ip6t_NPT.h
  HDRINST usr/include/linux/netfilter_ipv6/ip6t_rt.h
  HDRINST usr/include/linux/netfilter_ipv6/ip6t_REJECT.h
  HDRINST usr/include/linux/netfilter_ipv6/ip6t_opts.h
  HDRINST usr/include/linux/netfilter_ipv6/ip6t_srh.h
  HDRINST usr/include/linux/netfilter_ipv6/ip6t_LOG.h
  HDRINST usr/include/linux/netfilter_ipv6/ip6t_mh.h
  HDRINST usr/include/linux/netfilter_ipv6/ip6t_HL.h
  HDRINST usr/include/linux/netfilter_ipv6/ip6t_hl.h
  HDRINST usr/include/linux/netfilter_ipv6/ip6t_frag.h
  HDRINST usr/include/linux/netfilter_ipv6/ip6t_ipv6header.h
  HDRINST usr/include/linux/minix_fs.h
  CC      /kernel/build64/tools/objtool/arch/x86/decode.o
  HDRINST usr/include/linux/aio_abi.h
  HDRINST usr/include/linux/pktcdvd.h
  HDRINST usr/include/linux/libc-compat.h
  HDRINST usr/include/linux/atmlec.h
  HDRINST usr/include/linux/signalfd.h
  HDRINST usr/include/linux/bpf_common.h
  HDRINST usr/include/linux/seg6_iptunnel.h
  HDRINST usr/include/linux/synclink.h
  HDRINST usr/include/linux/mpls_iptunnel.h
  HDRINST usr/include/linux/mctp.h
  HDRINST usr/include/linux/if_xdp.h
  HDRINST usr/include/linux/llc.h
  HDRINST usr/include/linux/atmsvc.h
  HDRINST usr/include/linux/sed-opal.h
  HDRINST usr/include/linux/sock_diag.h
  HDRINST usr/include/linux/time.h
  HDRINST usr/include/linux/securebits.h
  HDRINST usr/include/linux/fsl_hypervisor.h
  HDRINST usr/include/linux/if_hippi.h
  HDRINST usr/include/linux/dlm_netlink.h
  HDRINST usr/include/linux/seccomp.h
  HDRINST usr/include/linux/oom.h
  HDRINST usr/include/linux/filter.h
  HDRINST usr/include/linux/inotify.h
  HDRINST usr/include/linux/rfkill.h
  HDRINST usr/include/linux/reboot.h
  HDRINST usr/include/linux/can/vxcan.h
  HDRINST usr/include/linux/can/j1939.h
  HDRINST usr/include/linux/can/netlink.h
  HDRINST usr/include/linux/can/bcm.h
  HDRINST usr/include/linux/can/raw.h
  HDRINST usr/include/linux/can/gw.h
  HDRINST usr/include/linux/can/error.h
  HDRINST usr/include/linux/can/isotp.h
  HDRINST usr/include/linux/if_eql.h
  HDRINST usr/include/linux/hiddev.h
  HDRINST usr/include/linux/blktrace_api.h
  HDRINST usr/include/linux/ccs.h
  HDRINST usr/include/linux/ioam6.h
  HDRINST usr/include/linux/hsr_netlink.h
  HDRINST usr/include/linux/mmc/ioctl.h
  HDRINST usr/include/linux/bfs_fs.h
  HDRINST usr/include/linux/rio_cm_cdev.h
  HDRINST usr/include/linux/uleds.h
  HDRINST usr/include/linux/mrp_bridge.h
  HDRINST usr/include/linux/adb.h
  HDRINST usr/include/linux/pmu.h
  HDRINST usr/include/linux/udmabuf.h
  HDRINST usr/include/linux/kcmp.h
  HDRINST usr/include/linux/dma-heap.h
  HDRINST usr/include/linux/userfaultfd.h
  HDRINST usr/include/linux/netfilter_arp/arpt_mangle.h
  HDRINST usr/include/linux/netfilter_arp/arp_tables.h
  HDRINST usr/include/linux/tipc.h
  HDRINST usr/include/linux/virtio_ids.h
  HDRINST usr/include/linux/l2tp.h
  HDRINST usr/include/linux/devlink.h
  HDRINST usr/include/linux/virtio_gpio.h
  HDRINST usr/include/linux/dcbnl.h
  HDRINST usr/include/linux/cyclades.h
  HDRINST usr/include/sound/intel/avs/tokens.h
  HDRINST usr/include/sound/sof/fw.h
  HDRINST usr/include/sound/sof/abi.h
  HDRINST usr/include/sound/sof/tokens.h
  HDRINST usr/include/sound/sof/header.h
  HDRINST usr/include/sound/usb_stream.h
  HDRINST usr/include/sound/sfnt_info.h
  HDRINST usr/include/sound/asequencer.h
  HDRINST usr/include/sound/tlv.h
  HDRINST usr/include/sound/asound.h
  HDRINST usr/include/sound/asoc.h
  HDRINST usr/include/sound/sb16_csp.h
  HDRINST usr/include/sound/compress_offload.h
  HDRINST usr/include/sound/hdsp.h
  HDRINST usr/include/sound/emu10k1.h
  HDRINST usr/include/sound/snd_ar_tokens.h
  HDRINST usr/include/sound/snd_sst_tokens.h
  HDRINST usr/include/sound/asound_fm.h
  HDRINST usr/include/sound/hdspm.h
  HDRINST usr/include/sound/compress_params.h
  HDRINST usr/include/sound/firewire.h
  HDRINST usr/include/sound/skl-tplg-interface.h
  HDRINST usr/include/scsi/scsi_bsg_ufs.h
  HDRINST usr/include/scsi/scsi_netlink_fc.h
  HDRINST usr/include/scsi/scsi_bsg_mpi3mr.h
  HDRINST usr/include/scsi/fc/fc_ns.h
  HDRINST usr/include/scsi/fc/fc_fs.h
  HDRINST usr/include/scsi/fc/fc_els.h
  HDRINST usr/include/scsi/fc/fc_gs.h
  HDRINST usr/include/scsi/scsi_bsg_fc.h
  HDRINST usr/include/scsi/cxlflash_ioctl.h
  HDRINST usr/include/scsi/scsi_netlink.h
  HDRINST usr/include/linux/version.h
  HDRINST usr/include/asm/processor-flags.h
  HDRINST usr/include/asm/auxvec.h
  HDRINST usr/include/asm/svm.h
  HDRINST usr/include/asm/bitsperlong.h
  HDRINST usr/include/asm/kvm_perf.h
  HDRINST usr/include/asm/mce.h
  HDRINST usr/include/asm/posix_types.h
  HDRINST usr/include/asm/msr.h
  HDRINST usr/include/asm/sigcontext32.h
  HDRINST usr/include/asm/mman.h
  HDRINST usr/include/asm/shmbuf.h
  HDRINST usr/include/asm/e820.h
  HDRINST usr/include/asm/posix_types_64.h
  HDRINST usr/include/asm/vsyscall.h
  HDRINST usr/include/asm/msgbuf.h
  HDRINST usr/include/asm/swab.h
  HDRINST usr/include/asm/statfs.h
  HDRINST usr/include/asm/posix_types_x32.h
  HDRINST usr/include/asm/ptrace.h
  HDRINST usr/include/asm/unistd.h
  HDRINST usr/include/asm/ist.h
  HDRINST usr/include/asm/prctl.h
  HDRINST usr/include/asm/boot.h
  HDRINST usr/include/asm/sigcontext.h
  HDRINST usr/include/asm/posix_types_32.h
  HDRINST usr/include/asm/kvm_para.h
  HDRINST usr/include/asm/a.out.h
  HDRINST usr/include/asm/mtrr.h
  HDRINST usr/include/asm/amd_hsmp.h
  HDRINST usr/include/asm/hwcap2.h
  HDRINST usr/include/asm/ptrace-abi.h
  HDRINST usr/include/asm/vm86.h
  HDRINST usr/include/asm/vmx.h
  HDRINST usr/include/asm/ldt.h
  HDRINST usr/include/asm/perf_regs.h
  HDRINST usr/include/asm/kvm.h
  HDRINST usr/include/asm/debugreg.h
  HDRINST usr/include/asm/signal.h
  HDRINST usr/include/asm/bootparam.h
  HDRINST usr/include/asm/siginfo.h
  HDRINST usr/include/asm/hw_breakpoint.h
  HDRINST usr/include/asm/stat.h
  HDRINST usr/include/asm/setup.h
  HDRINST usr/include/asm/sembuf.h
  HDRINST usr/include/asm/sgx.h
  HDRINST usr/include/asm/ucontext.h
  HDRINST usr/include/asm/byteorder.h
  HDRINST usr/include/asm/unistd_64.h
  HDRINST usr/include/asm/ioctls.h
  HDRINST usr/include/asm/bpf_perf_event.h
  HDRINST usr/include/asm/types.h
  HDRINST usr/include/asm/poll.h
  HDRINST usr/include/asm/resource.h
  HDRINST usr/include/asm/param.h
  HDRINST usr/include/asm/sockios.h
  HDRINST usr/include/asm/errno.h
  HDRINST usr/include/asm/unistd_x32.h
  HDRINST usr/include/asm/termios.h
  HDRINST usr/include/asm/ioctl.h
  HDRINST usr/include/asm/socket.h
  HDRINST usr/include/asm/unistd_32.h
  HDRINST usr/include/asm/termbits.h
  HDRINST usr/include/asm/fcntl.h
  HDRINST usr/include/asm/ipcbuf.h
  HOSTLD  scripts/mod/modpost
  CC      kernel/bounds.s
  CHKSHA1 ../include/linux/atomic/atomic-arch-fallback.h
  CHKSHA1 ../include/linux/atomic/atomic-instrumented.h
  CHKSHA1 ../include/linux/atomic/atomic-long.h
  UPD     include/generated/timeconst.h
  UPD     include/generated/bounds.h
  CC      arch/x86/kernel/asm-offsets.s
  LD      /kernel/build64/tools/objtool/arch/x86/objtool-in.o
  UPD     include/generated/asm-offsets.h
  CALL    ../scripts/checksyscalls.sh
  LD      /kernel/build64/tools/objtool/objtool-in.o
  LINK    /kernel/build64/tools/objtool/objtool
  LDS     scripts/module.lds
  CC      ipc/compat.o
  CC      ipc/util.o
  AR      certs/built-in.a
  CC      ipc/msgutil.o
  CC      ipc/msg.o
  HOSTCC  usr/gen_init_cpio
  CC      ipc/sem.o
  CC      init/main.o
  CC      ipc/shm.o
  CC      ipc/syscall.o
  AS      arch/x86/lib/clear_page_64.o
  CC      security/commoncap.o
  CC      ipc/ipc_sysctl.o
  CC      io_uring/io_uring.o
  CC      arch/x86/lib/cmdline.o
  CC      init/do_mounts.o
  AR      arch/x86/video/built-in.a
  CC      ipc/mqueue.o
  CC      arch/x86/pci/i386.o
  CC      arch/x86/power/cpu.o
  UPD     init/utsversion-tmp.h
  AR      arch/x86/ia32/built-in.a
  AR      virt/lib/built-in.a
  CC      security/keys/gc.o
  CC [M]  arch/x86/video/fbdev.o
  CC      arch/x86/realmode/init.o
  CC      net/llc/llc_core.o
  CC      block/partitions/core.o
  CC      net/ethernet/eth.o
  CC [M]  virt/lib/irqbypass.o
  CC      net/802/p8022.o
  AR      arch/x86/net/built-in.a
  CC      net/netlink/af_netlink.o
  CC      net/sched/sch_generic.o
  AS      arch/x86/crypto/aesni-intel_asm.o
  CC      net/core/sock.o
  AR      drivers/irqchip/built-in.a
  CC      block/partitions/ldm.o
  CC      sound/core/seq/seq.o
  AR      arch/x86/platform/atom/built-in.a
  AR      sound/i2c/other/built-in.a
  CC      net/netlink/genetlink.o
  CC [M]  arch/x86/kvm/../../../virt/kvm/kvm_main.o
  AR      sound/drivers/opl3/built-in.a
  CC      arch/x86/mm/pat/set_memory.o
  CC      arch/x86/events/amd/core.o
  CC      arch/x86/kernel/fpu/init.o
  AR      sound/i2c/built-in.a
  CC      fs/notify/dnotify/dnotify.o
  AR      arch/x86/platform/ce4100/built-in.a
  AR      sound/drivers/opl4/built-in.a
  CC [M]  arch/x86/kvm/../../../virt/kvm/eventfd.o
  CC      arch/x86/entry/vdso/vma.o
  AR      drivers/phy/allwinner/built-in.a
  AR      sound/drivers/mpu401/built-in.a
  AR      drivers/bus/mhi/built-in.a
  CC      lib/kunit/test.o
  AR      drivers/bus/built-in.a
  CC      lib/math/div64.o
  CC      arch/x86/platform/efi/memmap.o
  AR      sound/drivers/vx/built-in.a
  CC      mm/kasan/common.o
  AR      drivers/phy/amlogic/built-in.a
  CC      lib/math/gcd.o
  CC      kernel/sched/core.o
  CC      crypto/api.o
  CC      arch/x86/crypto/aesni-intel_glue.o
  AR      sound/drivers/pcsp/built-in.a
  AR      sound/drivers/built-in.a
  AR      drivers/phy/broadcom/built-in.a
  AR      drivers/phy/cadence/built-in.a
  CC      kernel/sched/fair.o
  AR      drivers/phy/freescale/built-in.a
  AS      arch/x86/crypto/aesni-intel_avx-x86_64.o
  AR      drivers/phy/hisilicon/built-in.a
  AS      arch/x86/lib/cmpxchg16b_emu.o
  CC      lib/math/lcm.o
  AR      drivers/phy/ingenic/built-in.a
  AR      drivers/phy/intel/built-in.a
  CC      arch/x86/lib/copy_mc.o
  AR      drivers/phy/lantiq/built-in.a
  AR      drivers/phy/marvell/built-in.a
  CC      lib/math/int_pow.o
  AR      drivers/phy/mediatek/built-in.a
  AR      drivers/phy/microchip/built-in.a
  AR      drivers/phy/motorola/built-in.a
  AR      drivers/phy/mscc/built-in.a
  AR      drivers/phy/qualcomm/built-in.a
  CC      lib/math/int_sqrt.o
  AR      drivers/phy/ralink/built-in.a
  AR      drivers/phy/renesas/built-in.a
  AR      drivers/phy/rockchip/built-in.a
  AR      drivers/phy/samsung/built-in.a
  GEN     usr/initramfs_data.cpio
  AR      drivers/phy/socionext/built-in.a
  COPY    usr/initramfs_inc_data
  AS      usr/initramfs_data.o
  CC      lib/math/reciprocal_div.o
  AR      drivers/phy/st/built-in.a
  AR      drivers/phy/sunplus/built-in.a
  AR      usr/built-in.a
  AR      drivers/phy/tegra/built-in.a
  AS      arch/x86/lib/copy_mc_64.o
  AR      drivers/phy/ti/built-in.a
  AR      drivers/phy/xilinx/built-in.a
  CC      drivers/phy/phy-core.o
  CC      ipc/namespace.o
  CC      lib/math/rational.o
  CC      fs/nfs_common/grace.o
  AS      arch/x86/lib/copy_page_64.o
  AR      virt/built-in.a
  AS      arch/x86/lib/copy_user_64.o
  CC      kernel/locking/mutex.o
  CC      arch/x86/lib/cpu.o
  CC      sound/core/seq/seq_lock.o
  CC      kernel/locking/semaphore.o
  CC      arch/x86/lib/delay.o
  AS      arch/x86/realmode/rm/header.o
  CC      arch/x86/kernel/fpu/bugs.o
  AS      arch/x86/realmode/rm/trampoline_64.o
  CC      security/keys/key.o
  CC      kernel/power/qos.o
  AS      arch/x86/realmode/rm/stack.o
  CC      security/keys/keyring.o
  AS      arch/x86/realmode/rm/reboot.o
  AR      fs/notify/dnotify/built-in.a
  CC      fs/notify/inotify/inotify_fsnotify.o
  AS      arch/x86/realmode/rm/wakeup_asm.o
  CC      arch/x86/pci/init.o
  CC      arch/x86/realmode/rm/wakemain.o
  CC      arch/x86/kernel/fpu/core.o
  CC      net/802/psnap.o
  CC      security/keys/keyctl.o
  CC      lib/kunit/resource.o
  CC      arch/x86/platform/efi/quirks.o
  CC      crypto/cipher.o
  CC [M]  lib/math/prime_numbers.o
  CC      net/llc/llc_input.o
  CC      arch/x86/realmode/rm/video-mode.o
  CC      arch/x86/entry/vdso/extable.o
  CC      arch/x86/power/hibernate_64.o
  CC      mm/kasan/report.o
  CC      net/llc/llc_output.o
  CC      arch/x86/kernel/fpu/regset.o
  AS      arch/x86/lib/getuser.o
  CC      sound/core/seq/seq_clientmgr.o
  GEN     arch/x86/lib/inat-tables.c
  AS      arch/x86/realmode/rm/copy.o
  CC      sound/core/seq/seq_memory.o
  CC      arch/x86/lib/insn-eval.o
  AS      arch/x86/realmode/rm/bioscall.o
  CC      arch/x86/realmode/rm/regs.o
  AS      arch/x86/crypto/aes_ctrby8_avx-x86_64.o
  CC      arch/x86/events/amd/lbr.o
  CC      arch/x86/realmode/rm/video-vga.o
  CC      block/partitions/msdos.o
  AS [M]  arch/x86/crypto/ghash-clmulni-intel_asm.o
  CC      arch/x86/events/amd/ibs.o
  AR      fs/nfs_common/built-in.a
  CC      ipc/mq_sysctl.o
  CC [M]  arch/x86/crypto/ghash-clmulni-intel_glue.o
  CC      block/partitions/efi.o
  CC      arch/x86/realmode/rm/video-vesa.o
  CC      kernel/power/main.o
  CC      security/min_addr.o
  CC      fs/notify/inotify/inotify_user.o
  AR      net/ethernet/built-in.a
  CC      security/inode.o
  CC      lib/kunit/static_stub.o
  CC      crypto/compress.o
  CC      lib/kunit/string-stream.o
  CC      arch/x86/realmode/rm/video-bios.o
  CC [M]  arch/x86/kvm/../../../virt/kvm/binary_stats.o
  CC      arch/x86/pci/mmconfig_64.o
  AR      drivers/phy/built-in.a
  AR      lib/math/built-in.a
  PASYMS  arch/x86/realmode/rm/pasyms.h
  AS [M]  arch/x86/crypto/crc32-pclmul_asm.o
  LDS     arch/x86/realmode/rm/realmode.lds
  AR      drivers/pinctrl/actions/built-in.a
  LD      arch/x86/realmode/rm/realmode.elf
  AR      drivers/pinctrl/bcm/built-in.a
  RELOCS  arch/x86/realmode/rm/realmode.relocs
  OBJCOPY arch/x86/realmode/rm/realmode.bin
  AS      arch/x86/realmode/rmpiggy.o
  AR      drivers/pinctrl/cirrus/built-in.a
  CC      lib/crypto/memneq.o
  AR      drivers/pinctrl/freescale/built-in.a
  AR      arch/x86/realmode/built-in.a
  CC      drivers/pinctrl/intel/pinctrl-baytrail.o
  AR      drivers/pinctrl/mediatek/built-in.a
  AR      drivers/pinctrl/mvebu/built-in.a
  AS      arch/x86/power/hibernate_asm_64.o
  CC      lib/zlib_inflate/inffast.o
  CC      init/do_mounts_initrd.o
  CC      lib/zlib_inflate/inflate.o
  CC      arch/x86/entry/vdso/vdso32-setup.o
  CC      arch/x86/power/hibernate.o
  CC      net/802/stp.o
  CC      lib/zlib_inflate/infutil.o
  CC      security/keys/permission.o
  CC      security/keys/process_keys.o
  CC      arch/x86/mm/pat/memtype.o
  CC      kernel/power/console.o
  CC      arch/x86/kernel/fpu/signal.o
  AR      net/llc/built-in.a
  CC      kernel/locking/rwsem.o
  CC      drivers/pinctrl/intel/pinctrl-intel.o
  CC      arch/x86/platform/efi/efi.o
  CC      crypto/algapi.o
  CC      mm/kasan/init.o
  CC [M]  arch/x86/crypto/crc32-pclmul_glue.o
  CC      drivers/gpio/gpiolib.o
  LDS     arch/x86/entry/vdso/vdso.lds
  CC      kernel/locking/percpu-rwsem.o
  AR      drivers/pwm/built-in.a
  CC      kernel/locking/irqflag-debug.o
  AS      arch/x86/entry/vdso/vdso-note.o
  CC      lib/kunit/assert.o
  CC      lib/kunit/try-catch.o
  CC      arch/x86/entry/vdso/vclock_gettime.o
  CC      security/keys/request_key.o
  CC      kernel/locking/mutex-debug.o
  AR      arch/x86/platform/geode/built-in.a
  CC      kernel/locking/lockdep.o
  CC      lib/crypto/utils.o
  AR      sound/isa/ad1816a/built-in.a
  AR      sound/pci/ac97/built-in.a
  AR      sound/isa/ad1848/built-in.a
  AR      sound/pci/ali5451/built-in.a
  CC      io_uring/xattr.o
  CC      arch/x86/lib/insn.o
  AR      sound/isa/cs423x/built-in.a
  AR      sound/pci/asihpi/built-in.a
  AR      sound/isa/es1688/built-in.a
  AR      sound/pci/au88x0/built-in.a
  AR      sound/isa/galaxy/built-in.a
  AR      sound/pci/aw2/built-in.a
  AR      sound/isa/gus/built-in.a
  AR      sound/isa/msnd/built-in.a
  AR      sound/pci/ctxfi/built-in.a
  AR      sound/pci/ca0106/built-in.a
  AR      sound/isa/opti9xx/built-in.a
  CC      net/sched/sch_mq.o
  AR      sound/pci/cs46xx/built-in.a
  AR      sound/isa/sb/built-in.a
  AR      sound/pci/cs5535audio/built-in.a
  AR      sound/isa/wavefront/built-in.a
  CC      net/sched/sch_frag.o
  CC      arch/x86/pci/direct.o
  AR      sound/pci/lola/built-in.a
  AR      sound/isa/wss/built-in.a
  AR      sound/isa/built-in.a
  AR      sound/pci/lx6464es/built-in.a
  AR      sound/pci/echoaudio/built-in.a
  CC      arch/x86/kernel/fpu/xstate.o
  CC      init/initramfs.o
  AR      sound/pci/emu10k1/built-in.a
  AR      sound/pci/hda/built-in.a
  CC      init/calibrate.o
  CC [M]  sound/pci/hda/hda_bind.o
  CC      arch/x86/pci/mmconfig-shared.o
  AR      block/partitions/built-in.a
  CC      block/bdev.o
  CC      lib/zlib_inflate/inftrees.o
  AR      sound/pci/ice1712/built-in.a
  CC      security/keys/request_key_auth.o
  AR      ipc/built-in.a
  AS [M]  arch/x86/crypto/crct10dif-pcl-asm_64.o
  AR      drivers/pinctrl/nomadik/built-in.a
  CC      fs/iomap/trace.o
  CC      block/fops.o
  CC      init/init_task.o
  CC [M]  arch/x86/crypto/crct10dif-pclmul_glue.o
  AR      fs/notify/inotify/built-in.a
  LD [M]  arch/x86/crypto/ghash-clmulni-intel.o
  CC      fs/notify/fanotify/fanotify.o
  AR      arch/x86/power/built-in.a
  CC      lib/kunit/executor.o
  CC      lib/kunit/hooks.o
  CC      arch/x86/pci/fixup.o
  CC      arch/x86/events/amd/uncore.o
  AS      arch/x86/lib/memcpy_64.o
  CC      kernel/power/process.o
  CC      arch/x86/entry/vdso/vgetcpu.o
  CC      sound/core/seq/seq_queue.o
  CC      lib/zlib_inflate/inflate_syms.o
  AS      arch/x86/lib/memmove_64.o
  CC      lib/crypto/chacha.o
  AS      arch/x86/lib/memset_64.o
  AR      net/802/built-in.a
  CC      fs/notify/fsnotify.o
  HOSTCC  arch/x86/entry/vdso/vdso2c
  CC      fs/notify/notification.o
  CC      arch/x86/lib/misc.o
  AR      net/bpf/built-in.a
  CC      kernel/locking/lockdep_proc.o
  CC      arch/x86/lib/pc-conf-reg.o
  AR      sound/ppc/built-in.a
  CC      arch/x86/pci/acpi.o
  CC      lib/crypto/aes.o
  CC      sound/core/seq/seq_fifo.o
  CC      arch/x86/mm/pat/memtype_interval.o
  CC      mm/kasan/generic.o
  CC      arch/x86/events/intel/core.o
  CC      net/sched/sch_api.o
  CC      arch/x86/events/intel/bts.o
  CC      arch/x86/platform/efi/efi_64.o
  CC      arch/x86/events/intel/ds.o
  AS      arch/x86/lib/putuser.o
  LD [M]  arch/x86/crypto/crc32-pclmul.o
  AR      lib/zlib_inflate/built-in.a
  AS      arch/x86/lib/retpoline.o
  LD [M]  arch/x86/crypto/crct10dif-pclmul.o
  CC      mm/kasan/report_generic.o
  CC      security/keys/user_defined.o
  AR      arch/x86/crypto/built-in.a
  AS      arch/x86/entry/entry.o
  CC      arch/x86/entry/vsyscall/vsyscall_64.o
  CC      arch/x86/lib/usercopy.o
  AS      arch/x86/entry/vsyscall/vsyscall_emu_64.o
  CC      arch/x86/pci/legacy.o
  LDS     arch/x86/entry/vdso/vdso32/vdso32.lds
  AS      arch/x86/entry/vdso/vdso32/note.o
  CC      mm/kasan/shadow.o
  AR      lib/kunit/built-in.a
  CC      lib/zlib_deflate/deflate.o
  AS      arch/x86/entry/vdso/vdso32/system_call.o
  CC [M]  sound/pci/hda/hda_codec.o
  AS      arch/x86/entry/vdso/vdso32/sigreturn.o
  CC      lib/zlib_deflate/deftree.o
  CC      arch/x86/entry/vdso/vdso32/vclock_gettime.o
  CC      net/netlink/policy.o
  CC      arch/x86/pci/irq.o
  CC      crypto/scatterwalk.o
  CC      lib/crypto/gf128mul.o
  CC [M]  drivers/pinctrl/intel/pinctrl-cherryview.o
  CC      init/version.o
  AS      arch/x86/entry/entry_64.o
  CC      lib/crypto/blake2s.o
  AR      sound/arm/built-in.a
  CC      drivers/gpio/gpiolib-devres.o
  CC      arch/x86/entry/syscall_64.o
  CC [M]  drivers/pinctrl/intel/pinctrl-broxton.o
  AR      arch/x86/mm/pat/built-in.a
  CC      mm/filemap.o
  CC      arch/x86/mm/init.o
  CC      sound/core/seq/seq_prioq.o
  CC      mm/mempool.o
  CC      net/sched/sch_blackhole.o
  CC      fs/iomap/iter.o
  CC      drivers/gpio/gpiolib-legacy.o
  AR      init/built-in.a
  AR      arch/x86/events/amd/built-in.a
  CC      block/bio.o
  CC      block/elevator.o
  CC      fs/notify/fanotify/fanotify_user.o
  CC      lib/lzo/lzo1x_compress.o
  CC      arch/x86/pci/common.o
  CC      lib/crypto/blake2s-generic.o
  CC      kernel/power/suspend.o
  CC      arch/x86/lib/usercopy_64.o
  CC      security/keys/compat.o
  CC      arch/x86/entry/vdso/vdso32/vgetcpu.o
  CC      fs/iomap/buffered-io.o
  AR      arch/x86/kernel/fpu/built-in.a
  CC      arch/x86/kernel/cpu/mce/core.o
  CC      arch/x86/kernel/cpu/mce/severity.o
  CC      arch/x86/kernel/acpi/boot.o
  CC      arch/x86/events/intel/knc.o
  CC      arch/x86/events/intel/lbr.o
  AS      arch/x86/platform/efi/efi_stub_64.o
  CC      arch/x86/events/intel/p4.o
  CC      mm/kasan/quarantine.o
  AR      arch/x86/platform/efi/built-in.a
  AR      arch/x86/platform/iris/built-in.a
  CC      arch/x86/platform/intel/iosf_mbi.o
  CC      crypto/proc.o
  VDSO    arch/x86/entry/vdso/vdso64.so.dbg
  AR      arch/x86/entry/vsyscall/built-in.a
  CC      crypto/aead.o
  CC      crypto/geniv.o
  CC      net/core/request_sock.o
  VDSO    arch/x86/entry/vdso/vdso32.so.dbg
  OBJCOPY arch/x86/entry/vdso/vdso64.so
  OBJCOPY arch/x86/entry/vdso/vdso32.so
  VDSO2C  arch/x86/entry/vdso/vdso-image-64.c
  VDSO2C  arch/x86/entry/vdso/vdso-image-32.c
  CC      arch/x86/entry/vdso/vdso-image-64.o
  AR      arch/x86/platform/intel-mid/built-in.a
  CC      arch/x86/kernel/acpi/sleep.o
  AR      arch/x86/platform/intel-quark/built-in.a
  CC      net/sched/sch_fifo.o
  AR      sound/pci/korg1212/built-in.a
  CC      arch/x86/kernel/apic/apic.o
  CC      net/netlink/diag.o
  CC      arch/x86/kernel/kprobes/core.o
  AR      sound/pci/mixart/built-in.a
  AR      sound/pci/nm256/built-in.a
  CC      lib/lzo/lzo1x_decompress_safe.o
  AR      sound/pci/oxygen/built-in.a
  CC      arch/x86/kernel/apic/apic_common.o
  CC      lib/zlib_deflate/deflate_syms.o
  CC      sound/core/seq/seq_timer.o
  CC      arch/x86/entry/vdso/vdso-image-32.o
  CC      arch/x86/lib/msr-smp.o
  CC      arch/x86/pci/early.o
  CC      security/keys/proc.o
  CC      security/keys/sysctl.o
  CC      lib/crypto/blake2s-selftest.o
  CC      crypto/skcipher.o
  CC      crypto/seqiv.o
  AS      arch/x86/kernel/acpi/wakeup_64.o
  AR      arch/x86/entry/vdso/built-in.a
  CC      arch/x86/entry/common.o
  CC      lib/crypto/des.o
  CC      fs/iomap/direct-io.o
  CC      crypto/echainiv.o
  AR      lib/zlib_deflate/built-in.a
  CC [M]  drivers/pinctrl/intel/pinctrl-geminilake.o
  CC      fs/iomap/fiemap.o
  AR      sound/pci/pcxhr/built-in.a
  CC      arch/x86/mm/init_64.o
  CC      arch/x86/lib/cache-smp.o
  CC      arch/x86/mm/fault.o
  AR      mm/kasan/built-in.a
  AR      sound/pci/riptide/built-in.a
  CC      mm/oom_kill.o
  CC      sound/core/seq/seq_system.o
  AR      arch/x86/platform/intel/built-in.a
  AR      arch/x86/platform/olpc/built-in.a
  CC      arch/x86/kernel/apic/apic_noop.o
  AR      lib/lzo/built-in.a
  AR      arch/x86/platform/scx200/built-in.a
  CC      arch/x86/lib/msr.o
  AR      arch/x86/platform/ts5500/built-in.a
  AR      arch/x86/platform/uv/built-in.a
  AR      arch/x86/platform/built-in.a
  CC      arch/x86/kernel/acpi/apei.o
  CC      drivers/gpio/gpiolib-cdev.o
  CC      drivers/gpio/gpiolib-sysfs.o
  CC      crypto/ahash.o
  CC      drivers/gpio/gpiolib-acpi.o
  CC      block/blk-core.o
  AR      drivers/pinctrl/nuvoton/built-in.a
  CC      kernel/power/hibernate.o
  CC      drivers/gpio/gpiolib-swnode.o
  CC      arch/x86/kernel/apic/ipi.o
  CC      net/core/skbuff.o
  CC      arch/x86/pci/bus_numa.o
  CC      arch/x86/pci/amd_bus.o
  CC      arch/x86/kernel/apic/vector.o
  AR      drivers/pinctrl/sprd/built-in.a
  CC      arch/x86/kernel/acpi/cppc.o
  CC      arch/x86/kernel/apic/hw_nmi.o
  CC      lib/lz4/lz4_compress.o
  CC      lib/lz4/lz4hc_compress.o
  CC      arch/x86/kernel/kprobes/opt.o
  AR      fs/notify/fanotify/built-in.a
  CC [M]  drivers/pinctrl/intel/pinctrl-sunrisepoint.o
  AR      security/keys/built-in.a
  CC      fs/notify/group.o
  CC      security/device_cgroup.o
  CC      arch/x86/events/intel/p6.o
  CC      arch/x86/kernel/kprobes/ftrace.o
  AR      net/sched/built-in.a
  AR      net/netlink/built-in.a
  CC      fs/notify/mark.o
  CC [M]  arch/x86/kvm/../../../virt/kvm/vfio.o
  CC [M]  arch/x86/kvm/../../../virt/kvm/coalesced_mmio.o
  AS      arch/x86/entry/thunk_64.o
  AR      sound/pci/rme9652/built-in.a
  CC      sound/core/seq/seq_ports.o
  CC      arch/x86/kernel/acpi/cstate.o
  AR      sound/pci/trident/built-in.a
  CC      sound/core/seq/seq_info.o
  CC      arch/x86/kernel/cpu/mce/genpool.o
  AS      arch/x86/entry/entry_64_compat.o
  CC      arch/x86/entry/syscall_32.o
  CC      io_uring/nop.o
  CC      kernel/printk/printk.o
  AR      sound/sh/built-in.a
  CC      kernel/power/snapshot.o
  CC      kernel/printk/printk_safe.o
  AS      arch/x86/lib/msr-reg.o
  CC [M]  sound/pci/hda/hda_jack.o
  CC [M]  sound/pci/hda/hda_auto_parser.o
  CC      arch/x86/lib/msr-reg-export.o
  CC      fs/iomap/seek.o
  CC      arch/x86/kernel/apic/io_apic.o
  CC      kernel/irq/irqdesc.o
  CC      net/ethtool/ioctl.o
  CC      arch/x86/kernel/apic/msi.o
  CC      lib/crypto/sha1.o
  AR      drivers/pinctrl/intel/built-in.a
  AS      arch/x86/lib/hweight.o
  AR      drivers/pinctrl/sunplus/built-in.a
  CC      arch/x86/kernel/apic/x2apic_phys.o
  AR      drivers/pinctrl/ti/built-in.a
  CC      drivers/pinctrl/core.o
  CC      arch/x86/lib/iomem.o
  CC      crypto/shash.o
  AR      arch/x86/pci/built-in.a
  CC      arch/x86/kernel/apic/x2apic_cluster.o
  CC      kernel/irq/handle.o
  CC      io_uring/fs.o
  CC      io_uring/splice.o
  CC      fs/iomap/swapfile.o
  CC      arch/x86/events/intel/pt.o
  AR      arch/x86/kernel/acpi/built-in.a
  AS      arch/x86/lib/iomap_copy_64.o
  CC [M]  sound/pci/hda/hda_sysfs.o
  CC      arch/x86/kernel/cpu/mce/intel.o
  CC [M]  sound/pci/hda/hda_controller.o
  CC      kernel/sched/build_policy.o
  CC      block/blk-sysfs.o
  LDS     arch/x86/kernel/vmlinux.lds
  AR      arch/x86/entry/built-in.a
  CC      arch/x86/mm/ioremap.o
  CC      arch/x86/events/intel/uncore.o
  AR      arch/x86/kernel/kprobes/built-in.a
  AS      arch/x86/kernel/head_64.o
  CC      arch/x86/kernel/apic/apic_flat_64.o
  CC [M]  arch/x86/kvm/../../../virt/kvm/async_pf.o
  CC      arch/x86/lib/inat.o
  CC      block/blk-flush.o
  CC      arch/x86/kernel/cpu/mtrr/mtrr.o
  CC      arch/x86/kernel/cpu/cacheinfo.o
  AR      sound/core/seq/built-in.a
  CC      lib/crypto/sha256.o
  CC      arch/x86/mm/extable.o
  CC      sound/core/sound.o
  CC      arch/x86/kernel/cpu/mtrr/if.o
  CC      arch/x86/mm/mmap.o
  CC      fs/notify/fdinfo.o
  AR      arch/x86/lib/built-in.a
  AR      arch/x86/lib/lib.a
  CC      kernel/power/swap.o
  AR      sound/synth/emux/built-in.a
  AR      sound/synth/built-in.a
  AR      sound/usb/misc/built-in.a
  CC      kernel/locking/spinlock.o
  AR      sound/usb/usx2y/built-in.a
  AR      sound/usb/caiaq/built-in.a
  AR      sound/usb/6fire/built-in.a
  AR      sound/usb/hiface/built-in.a
  AR      sound/firewire/built-in.a
  CC      sound/core/init.o
  AR      sound/usb/bcd2000/built-in.a
  CC      kernel/sched/build_utility.o
  AR      sound/usb/built-in.a
  CC      arch/x86/kernel/cpu/mtrr/generic.o
  CC      kernel/printk/printk_ringbuffer.o
  AR      security/built-in.a
  CC      arch/x86/kernel/cpu/scattered.o
  CC      arch/x86/kernel/apic/probe_64.o
  CC      lib/lz4/lz4_decompress.o
  CC      kernel/irq/manage.o
  AR      drivers/gpio/built-in.a
  AR      fs/iomap/built-in.a
  CC      drivers/pci/msi/pcidev_msi.o
  AR      fs/quota/built-in.a
  CC      arch/x86/kernel/cpu/mce/threshold.o
  CC [M]  sound/pci/hda/hda_proc.o
  CC      mm/fadvise.o
  CC      drivers/pci/pcie/portdrv.o
  CC      drivers/pci/hotplug/pci_hotplug_core.o
  CC      drivers/pci/hotplug/acpi_pcihp.o
  CC [M]  sound/pci/hda/hda_hwdep.o
  AR      drivers/pci/controller/dwc/built-in.a
  AR      drivers/pci/controller/mobiveil/built-in.a
  CC      drivers/pci/controller/vmd.o
  CC      io_uring/sync.o
  CC      block/blk-settings.o
  CC      kernel/power/user.o
  CC [M]  lib/crypto/arc4.o
  CC      kernel/locking/osq_lock.o
  CC      crypto/akcipher.o
  CC      drivers/pci/msi/api.o
  CC      crypto/kpp.o
  AR      sound/pci/ymfpci/built-in.a
  CC      arch/x86/kernel/cpu/mtrr/cleanup.o
  AR      fs/notify/built-in.a
  CC      kernel/locking/qspinlock.o
  CC      drivers/pci/pcie/rcec.o
  CC      kernel/power/poweroff.o
  CC      sound/core/memory.o
  CC      kernel/locking/rtmutex_api.o
  CC      net/core/datagram.o
  CC      fs/proc/task_mmu.o
  CC      arch/x86/mm/pgtable.o
  CC [M]  arch/x86/kvm/../../../virt/kvm/irqchip.o
  CC      drivers/pci/pcie/aspm.o
  CC      kernel/locking/spinlock_debug.o
  CC [M]  arch/x86/kvm/../../../virt/kvm/dirty_ring.o
  AR      drivers/pci/switch/built-in.a
  CC      drivers/pci/msi/msi.o
  CC      drivers/pinctrl/pinctrl-utils.o
  CC      block/blk-ioc.o
  CC      mm/maccess.o
  CC      drivers/pinctrl/pinmux.o
  AR      lib/crypto/built-in.a
  LD [M]  lib/crypto/libarc4.o
  CC      drivers/pinctrl/pinconf.o
  CC      arch/x86/kernel/cpu/topology.o
  CC      arch/x86/events/intel/uncore_nhmex.o
  CC      lib/zstd/zstd_compress_module.o
  CC      kernel/printk/sysctl.o
  CC      block/blk-map.o
  CC      block/blk-merge.o
  CC      drivers/pci/pcie/aer.o
  CC      sound/core/control.o
  CC      sound/core/misc.o
  CC      net/core/stream.o
  CC      net/core/scm.o
  CC      arch/x86/kernel/cpu/mce/apei.o
  CC      lib/xz/xz_dec_syms.o
  CC      io_uring/advise.o
  CC      lib/raid6/algos.o
  CC      drivers/pci/hotplug/pciehp_core.o
  AR      arch/x86/kernel/apic/built-in.a
  CC      crypto/acompress.o
  CC      arch/x86/events/intel/uncore_snb.o
  CC      lib/xz/xz_dec_stream.o
  CC      kernel/locking/qrwlock.o
  CC      arch/x86/kernel/head64.o
  CC      arch/x86/events/intel/uncore_snbep.o
  AR      kernel/printk/built-in.a
  CC      drivers/pinctrl/pinconf-generic.o
  CC [M]  sound/pci/hda/hda_generic.o
  AR      kernel/power/built-in.a
  CC      drivers/pci/msi/irqdomain.o
  AR      lib/lz4/built-in.a
  CC      lib/zstd/compress/fse_compress.o
  CC      kernel/rcu/update.o
  AR      kernel/livepatch/built-in.a
  CC      block/blk-timeout.o
  CC      lib/zstd/compress/hist.o
  CC      arch/x86/kernel/cpu/common.o
  CC      arch/x86/kernel/cpu/rdrand.o
  AR      drivers/pci/controller/built-in.a
  CC      lib/raid6/recov.o
  CC      drivers/pci/access.o
  CC      mm/page-writeback.o
  CC [M]  sound/pci/hda/patch_realtek.o
  AR      arch/x86/kernel/cpu/mtrr/built-in.a
  CC      net/ethtool/common.o
  AR      sound/pci/vx222/built-in.a
  CC      arch/x86/kernel/cpu/match.o
  CC      io_uring/filetable.o
  CC [M]  net/netfilter/ipvs/ip_vs_conn.o
  CC      arch/x86/mm/physaddr.o
  CC      arch/x86/mm/tlb.o
  CC      kernel/irq/spurious.o
  CC      net/ethtool/netlink.o
  CC [M]  arch/x86/kvm/../../../virt/kvm/pfncache.o
  CC [M]  sound/pci/hda/patch_analog.o
  CC      lib/zstd/compress/huf_compress.o
  CC      arch/x86/kernel/cpu/bugs.o
  CC      arch/x86/kernel/cpu/aperfmperf.o
  AR      kernel/locking/built-in.a
  CC [M]  arch/x86/kvm/x86.o
  CC      lib/xz/xz_dec_lzma2.o
  AR      arch/x86/kernel/cpu/mce/built-in.a
  CC      net/ethtool/bitset.o
  CC      crypto/scompress.o
  AR      drivers/pinctrl/built-in.a
  CC      lib/zstd/compress/zstd_compress.o
  CC      kernel/dma/mapping.o
  CC      drivers/video/console/dummycon.o
  CC      drivers/video/logo/logo.o
  HOSTCC  drivers/video/logo/pnmtologo
  CC      kernel/dma/direct.o
  CC      kernel/dma/ops_helpers.o
  CC      drivers/pci/hotplug/pciehp_ctrl.o
  CC      drivers/video/console/vgacon.o
  CC      arch/x86/kernel/cpu/cpuid-deps.o
  CC      arch/x86/mm/cpu_entry_area.o
  CC      lib/zstd/compress/zstd_compress_literals.o
  CC      net/ethtool/strset.o
  HOSTCC  lib/raid6/mktables
  AR      drivers/pci/msi/built-in.a
  CC      block/blk-lib.o
  CC      net/core/gen_stats.o
  CC      drivers/pci/hotplug/pciehp_pci.o
  CC      arch/x86/mm/maccess.o
  CC      arch/x86/mm/pgprot.o
  CC      io_uring/openclose.o
  UNROLL  lib/raid6/int1.c
  CC      kernel/irq/resend.o
  CC      arch/x86/kernel/cpu/umwait.o
  UNROLL  lib/raid6/int2.c
  UNROLL  lib/raid6/int4.c
  UNROLL  lib/raid6/int8.c
  UNROLL  lib/raid6/int16.c
  UNROLL  lib/raid6/int32.c
  CC      lib/raid6/recov_ssse3.o
  LOGO    drivers/video/logo/logo_linux_clut224.c
  CC      drivers/video/logo/logo_linux_clut224.o
  CC      drivers/pci/pcie/err.o
  AR      drivers/video/logo/built-in.a
  CC      kernel/irq/chip.o
  CC      crypto/algboss.o
  CC      mm/folio-compat.o
  CC      mm/readahead.o
  CC      mm/swap.o
  CC      mm/truncate.o
  CC      fs/proc/inode.o
  CC      drivers/video/backlight/backlight.o
  CC      lib/xz/xz_dec_bcj.o
  CC      net/ethtool/linkinfo.o
  CC      net/ethtool/linkmodes.o
  CC      drivers/video/fbdev/core/fb_notify.o
  AR      drivers/video/fbdev/omap/built-in.a
  AR      drivers/video/fbdev/omap2/omapfb/dss/built-in.a
  CC [M]  arch/x86/kvm/emulate.o
  CC      lib/zstd/compress/zstd_compress_sequences.o
  AR      drivers/video/fbdev/omap2/omapfb/displays/built-in.a
  CC      crypto/testmgr.o
  AR      drivers/video/fbdev/omap2/omapfb/built-in.a
  AR      drivers/video/fbdev/omap2/built-in.a
  CC      drivers/idle/intel_idle.o
  CC      crypto/cmac.o
  CC      arch/x86/mm/hugetlbpage.o
  CC      crypto/hmac.o
  CC      arch/x86/kernel/cpu/proc.o
  CC      sound/core/device.o
  CC      kernel/rcu/sync.o
  CC      drivers/pci/hotplug/pciehp_hpc.o
  CC      drivers/pci/bus.o
  MKCAP   arch/x86/kernel/cpu/capflags.c
  CC      block/blk-mq.o
  CC      kernel/dma/dummy.o
  CC      lib/raid6/recov_avx2.o
  CC      block/blk-mq-tag.o
  CC      drivers/pci/pcie/aer_inject.o
  CC      lib/fonts/fonts.o
  CC      lib/argv_split.o
  AR      drivers/video/console/built-in.a
  AR      lib/xz/built-in.a
  CC      lib/fonts/font_8x8.o
  CC      fs/proc/root.o
  CC [M]  net/netfilter/ipvs/ip_vs_core.o
  CC      net/netfilter/core.o
  CC      net/netfilter/nf_log.o
  CC      kernel/rcu/srcutree.o
  CC      drivers/pci/pcie/pme.o
  CC      drivers/pci/hotplug/acpiphp_core.o
  CC      lib/zstd/compress/zstd_compress_superblock.o
  CC      crypto/vmac.o
  CC      arch/x86/kernel/cpu/powerflags.o
  CC      net/ethtool/rss.o
  CC      kernel/rcu/tree.o
  CC [M]  drivers/video/fbdev/core/fbmem.o
  CC      sound/core/info.o
  CC      io_uring/uring_cmd.o
  CC      io_uring/epoll.o
  CC      sound/core/isadma.o
  CC      sound/core/vmaster.o
  CC      lib/fonts/font_8x16.o
  AR      drivers/video/backlight/built-in.a
  CC      kernel/entry/common.o
  CC      kernel/dma/contiguous.o
  CC      arch/x86/mm/kasan_init_64.o
  CC      kernel/irq/dummychip.o
  CC      arch/x86/mm/pkeys.o
  CC      kernel/irq/devres.o
  CC      arch/x86/events/intel/uncore_discovery.o
  CC      net/ethtool/linkstate.o
  CC      mm/vmscan.o
  CC      kernel/dma/swiotlb.o
  CC [M]  drivers/video/fbdev/core/fbmon.o
  CC      lib/raid6/mmx.o
  CC      lib/raid6/sse1.o
  CC      lib/raid6/sse2.o
  CC [M]  drivers/video/fbdev/core/fbcmap.o
  CC      arch/x86/events/intel/cstate.o
  CC      kernel/rcu/rcu_segcblist.o
  CC      fs/proc/base.o
  CC      mm/shmem.o
  AR      lib/fonts/built-in.a
  AR      drivers/char/ipmi/built-in.a
  CC      kernel/irq/autoprobe.o
  CC      net/core/gen_estimator.o
  CC      block/blk-stat.o
  CC      drivers/pci/pcie/dpc.o
  AR      drivers/idle/built-in.a
  CC      drivers/pci/hotplug/acpiphp_glue.o
  CC [M]  net/netfilter/ipvs/ip_vs_ctl.o
  CC      drivers/acpi/acpica/dsargs.o
  CC      drivers/acpi/apei/apei-base.o
  CC      block/blk-mq-sysfs.o
  CC      drivers/acpi/apei/hest.o
  CC      block/blk-mq-cpumap.o
  CC      drivers/pnp/core.o
  CC      drivers/pnp/pnpacpi/core.o
  AR      drivers/amba/built-in.a
  CC      drivers/pnp/pnpacpi/rsparser.o
  AR      drivers/clk/actions/built-in.a
  AR      drivers/clk/analogbits/built-in.a
  CC [M]  sound/pci/hda/patch_hdmi.o
  AR      drivers/clk/bcm/built-in.a
  CC      arch/x86/mm/pti.o
  AR      drivers/clk/imgtec/built-in.a
  AR      drivers/clk/imx/built-in.a
  AR      drivers/clk/ingenic/built-in.a
  AR      drivers/clk/mediatek/built-in.a
  AR      drivers/clk/microchip/built-in.a
  AR      drivers/clk/mstar/built-in.a
  CC      sound/core/ctljack.o
  AR      drivers/clk/mvebu/built-in.a
  AR      drivers/clk/ralink/built-in.a
  AR      drivers/clk/renesas/built-in.a
  AR      drivers/clk/socfpga/built-in.a
  CC      sound/core/jack.o
  AR      drivers/clk/sprd/built-in.a
  CC      net/ethtool/debug.o
  AR      drivers/clk/sunxi-ng/built-in.a
  AR      drivers/clk/ti/built-in.a
  CC      crypto/xcbc.o
  AR      drivers/clk/versatile/built-in.a
  CC      lib/raid6/avx2.o
  CC      drivers/clk/x86/clk-lpss-atom.o
  CC      kernel/entry/syscall_user_dispatch.o
  CC      kernel/irq/irqdomain.o
  CC      io_uring/statx.o
  CC      kernel/irq/proc.o
  CC      drivers/acpi/acpica/dscontrol.o
  CC      io_uring/net.o
  CC      drivers/dma/dw/core.o
  CC      io_uring/msg_ring.o
  AR      arch/x86/events/intel/built-in.a
  CC      arch/x86/events/zhaoxin/core.o
  CC      lib/zstd/compress/zstd_double_fast.o
  CC      net/netfilter/nf_queue.o
  CC      drivers/dma/dw/dw.o
  CC      drivers/dma/dw/idma32.o
  CC      drivers/acpi/acpica/dsdebug.o
  AR      drivers/pci/pcie/built-in.a
  CC      kernel/dma/remap.o
  CC      kernel/entry/kvm.o
  CC      drivers/pci/probe.o
  CC      arch/x86/events/core.o
  CC      sound/core/timer.o
  CC      drivers/acpi/acpica/dsfield.o
  AR      sound/sparc/built-in.a
  CC      drivers/clk/x86/clk-pmc-atom.o
  CC      fs/kernfs/mount.o
  CC      fs/sysfs/file.o
  CC      drivers/dma/hsu/hsu.o
  CC      net/core/net_namespace.o
  CC      drivers/acpi/apei/erst.o
  CC      drivers/acpi/apei/bert.o
  CC      net/core/secure_seq.o
  CC [M]  drivers/video/fbdev/core/fbsysfs.o
  AR      arch/x86/mm/built-in.a
  CC      drivers/pnp/card.o
  CC      block/blk-mq-sched.o
  CC      sound/core/hrtimer.o
  CC      crypto/crypto_null.o
  AR      drivers/pci/hotplug/built-in.a
  CC      arch/x86/kernel/cpu/feat_ctl.o
  CC [M]  arch/x86/kvm/i8259.o
  CC      lib/raid6/avx512.o
  AR      drivers/pnp/pnpacpi/built-in.a
  CC [M]  drivers/video/fbdev/core/modedb.o
  CC      sound/core/seq_device.o
  CC      crypto/md5.o
  CC [M]  sound/pci/hda/hda_eld.o
  CC      fs/kernfs/inode.o
  CC      net/ethtool/wol.o
  AR      kernel/dma/built-in.a
  CC      net/ethtool/features.o
  CC [M]  sound/core/control_led.o
  AR      arch/x86/events/zhaoxin/built-in.a
  CC [M]  net/netfilter/ipvs/ip_vs_sched.o
  CC      drivers/acpi/acpica/dsinit.o
  CC      io_uring/timeout.o
  AR      kernel/sched/built-in.a
  CC      fs/kernfs/dir.o
  AR      drivers/clk/x86/built-in.a
  AR      drivers/clk/xilinx/built-in.a
  CC      drivers/clk/clk-devres.o
  CC [M]  net/netfilter/ipvs/ip_vs_xmit.o
  CC [M]  net/netfilter/ipvs/ip_vs_app.o
  AR      kernel/entry/built-in.a
  CC      arch/x86/events/probe.o
  CC      drivers/clk/clk-bulk.o
  CC      lib/zstd/compress/zstd_fast.o
  CC      drivers/acpi/acpica/dsmethod.o
  CC      fs/sysfs/dir.o
  CC      kernel/irq/migration.o
  CC      mm/util.o
  CC      crypto/sha1_generic.o
  AR      drivers/dma/hsu/built-in.a
  CC      drivers/pnp/driver.o
  CC      drivers/dma/dw/acpi.o
  CC [M]  sound/core/hwdep.o
  CC      arch/x86/kernel/ebda.o
  AR      drivers/dma/idxd/built-in.a
  CC      drivers/acpi/acpica/dsmthdat.o
  CC      lib/raid6/recov_avx512.o
  CC      drivers/clk/clkdev.o
  CC      drivers/acpi/apei/ghes.o
  CC      net/netfilter/nf_sockopt.o
  CC      arch/x86/kernel/platform-quirks.o
  CC      arch/x86/kernel/process_64.o
  CC      net/netfilter/utils.o
  CC [M]  sound/pci/hda/hda_intel.o
  CC      net/core/flow_dissector.o
  CC      mm/mmzone.o
  CC      mm/vmstat.o
  CC      lib/zstd/compress/zstd_lazy.o
  CC      fs/proc/generic.o
  CC [M]  drivers/video/fbdev/core/fbcvt.o
  CC      drivers/pnp/resource.o
  CC      kernel/irq/cpuhotplug.o
  CC      drivers/acpi/acpica/dsobject.o
  CC      fs/sysfs/symlink.o
  CC      block/ioctl.o
  CC      arch/x86/kernel/cpu/intel.o
  CC      fs/kernfs/file.o
  CC      fs/kernfs/symlink.o
  CC      net/ethtool/privflags.o
  CC      fs/sysfs/mount.o
  CC      fs/proc/array.o
  CC      drivers/pnp/manager.o
  CC      crypto/sha256_generic.o
  CC      io_uring/sqpoll.o
  CC      drivers/dma/dw/pci.o
  CC      io_uring/fdinfo.o
  CC      io_uring/tctx.o
  CC [M]  arch/x86/kvm/irq.o
  TABLE   lib/raid6/tables.c
  CC      drivers/pnp/support.o
  CC      lib/raid6/int1.o
  CC      drivers/clk/clk.o
  CC      drivers/pci/host-bridge.o
  CC [M]  sound/core/pcm.o
  CC [M]  sound/core/pcm_native.o
  CC [M]  sound/core/pcm_lib.o
  CC      mm/backing-dev.o
  CC      mm/mm_init.o
  CC      drivers/acpi/acpica/dsopcode.o
  CC      crypto/sha512_generic.o
  CC      drivers/pci/remove.o
  CC      kernel/irq/pm.o
  CC      net/core/sysctl_net_core.o
  CC      arch/x86/events/utils.o
  CC [M]  net/netfilter/ipvs/ip_vs_sync.o
  CC      block/genhd.o
  CC [M]  net/netfilter/nfnetlink.o
  CC      fs/sysfs/group.o
  CC      mm/percpu.o
  CC      drivers/pci/pci.o
  CC [M]  drivers/video/fbdev/core/fb_cmdline.o
  CC      fs/proc/fd.o
  CC      io_uring/poll.o
  CC      lib/raid6/int2.o
  AR      drivers/dma/dw/built-in.a
  AR      drivers/dma/mediatek/built-in.a
  AR      drivers/dma/qcom/built-in.a
  AR      drivers/dma/ti/built-in.a
  AR      drivers/dma/xilinx/built-in.a
  AR      sound/spi/built-in.a
  CC      drivers/dma/dmaengine.o
  CC [M]  drivers/dma/ioat/init.o
  AR      drivers/acpi/apei/built-in.a
  CC      drivers/dma/virt-dma.o
  CC      drivers/pci/pci-driver.o
  CC      net/core/dev.o
  AR      drivers/acpi/pmic/built-in.a
  CC      block/ioprio.o
  CC      drivers/pnp/interface.o
  CC      drivers/acpi/acpica/dspkginit.o
  CC      drivers/acpi/dptf/int340x_thermal.o
  CC      drivers/acpi/tables.o
  CC      net/ethtool/rings.o
  AR      fs/kernfs/built-in.a
  CC      net/ethtool/channels.o
  CC      drivers/video/aperture.o
  CC      fs/proc/proc_tty.o
  CC      fs/proc/cmdline.o
  CC      drivers/pnp/quirks.o
  CC      arch/x86/kernel/cpu/intel_pconfig.o
  CC [M]  drivers/dma/ioat/dma.o
  CC      crypto/blake2b_generic.o
  CC      drivers/pci/search.o
  CC      arch/x86/events/rapl.o
  CC      kernel/irq/msi.o
  CC      fs/configfs/inode.o
  CC      kernel/irq/affinity.o
  CC      arch/x86/kernel/cpu/tsx.o
  CC      fs/proc/consoles.o
  AR      fs/sysfs/built-in.a
  CC      crypto/ecb.o
  CC      lib/raid6/int4.o
  CC [M]  drivers/video/fbdev/core/fb_defio.o
  LD [M]  sound/pci/hda/snd-hda-codec.o
  AR      drivers/acpi/dptf/built-in.a
  CC      drivers/acpi/acpica/dsutils.o
  CC      fs/configfs/file.o
  AR      drivers/soc/apple/built-in.a
  CC      fs/configfs/dir.o
  AR      drivers/soc/aspeed/built-in.a
  LD [M]  sound/pci/hda/snd-hda-codec-generic.o
  AR      drivers/soc/bcm/bcm63xx/built-in.a
  AR      drivers/soc/bcm/built-in.a
  AR      drivers/soc/fsl/built-in.a
  LD [M]  sound/pci/hda/snd-hda-codec-realtek.o
  AR      drivers/soc/fujitsu/built-in.a
  AR      drivers/soc/imx/built-in.a
  AR      drivers/soc/ixp4xx/built-in.a
  LD [M]  sound/pci/hda/snd-hda-codec-analog.o
  AR      drivers/soc/loongson/built-in.a
  LD [M]  sound/pci/hda/snd-hda-codec-hdmi.o
  CC      drivers/pci/pci-sysfs.o
  CC [M]  arch/x86/kvm/lapic.o
  AR      drivers/soc/mediatek/built-in.a
  CC      lib/raid6/int8.o
  LD [M]  sound/pci/hda/snd-hda-intel.o
  AR      drivers/soc/microchip/built-in.a
  AR      drivers/soc/nuvoton/built-in.a
  AR      sound/pci/built-in.a
  AR      drivers/soc/pxa/built-in.a
  AR      drivers/soc/amlogic/built-in.a
  CC      crypto/cbc.o
  AR      drivers/soc/qcom/built-in.a
  AR      drivers/soc/renesas/built-in.a
  AR      drivers/soc/rockchip/built-in.a
  CC [M]  sound/core/pcm_misc.o
  AR      drivers/soc/sifive/built-in.a
  CC [M]  sound/core/pcm_memory.o
  AR      drivers/soc/sunxi/built-in.a
  CC [M]  drivers/video/fbdev/uvesafb.o
  AR      drivers/soc/ti/built-in.a
  CC [M]  arch/x86/kvm/i8254.o
  AR      drivers/soc/xilinx/built-in.a
  CC      arch/x86/kernel/cpu/intel_epb.o
  AR      drivers/soc/built-in.a
  CC      drivers/dma/acpi-dma.o
  CC      drivers/acpi/acpica/dswexec.o
  CC      block/badblocks.o
  CC      drivers/pnp/system.o
  AR      kernel/rcu/built-in.a
  CC      fs/proc/cpuinfo.o
  CC      drivers/acpi/blacklist.o
  CC [M]  arch/x86/kvm/ioapic.o
  CC      fs/proc/devices.o
  CC      block/blk-rq-qos.o
  CC      drivers/pci/rom.o
  CC      lib/raid6/int16.o
  CC      arch/x86/events/msr.o
  CC      drivers/acpi/acpica/dswload.o
  CC      drivers/virtio/virtio.o
  CC      io_uring/cancel.o
  CC      drivers/tty/vt/vt_ioctl.o
  CC      net/ethtool/coalesce.o
  CC      drivers/tty/vt/vc_screen.o
  CC      lib/zstd/compress/zstd_ldm.o
  CC      drivers/virtio/virtio_ring.o
  CC      drivers/tty/vt/selection.o
  CC      drivers/virtio/virtio_anchor.o
  CC      crypto/pcbc.o
  AR      sound/parisc/built-in.a
  CC      kernel/irq/matrix.o
  CC      crypto/cts.o
  CC      arch/x86/kernel/cpu/amd.o
  CC [M]  sound/core/memalloc.o
  CC [M]  drivers/video/fbdev/core/fbcon.o
  CC      lib/raid6/int32.o
  AR      drivers/pnp/built-in.a
  CC      drivers/acpi/acpica/dswload2.o
  CC      io_uring/kbuf.o
  CC [M]  drivers/dma/ioat/prep.o
  CC      io_uring/rsrc.o
  CC [M]  net/netfilter/ipvs/ip_vs_est.o
  CC      io_uring/rw.o
  CC      drivers/acpi/osi.o
  CC [M]  net/netfilter/ipvs/ip_vs_proto.o
  CC [M]  sound/core/pcm_timer.o
  CC [M]  drivers/video/fbdev/core/bitblit.o
  CC      fs/proc/interrupts.o
  CC      lib/raid6/tables.o
  CC      drivers/pci/setup-res.o
  CC      block/disk-events.o
  CC      block/blk-ia-ranges.o
  CC [M]  drivers/dma/ioat/dca.o
  CC      lib/zstd/compress/zstd_opt.o
  AR      arch/x86/events/built-in.a
  CC      crypto/lrw.o
  CC      lib/zstd/zstd_decompress_module.o
  AR      sound/pcmcia/vx/built-in.a
  AR      sound/pcmcia/pdaudiocf/built-in.a
  CC      crypto/xts.o
  AR      sound/pcmcia/built-in.a
  CC [M]  drivers/dma/ioat/sysfs.o
  CC      fs/configfs/symlink.o
  CC      fs/devpts/inode.o
  AR      sound/mips/built-in.a
  CC      fs/configfs/mount.o
  CC      arch/x86/kernel/cpu/hygon.o
  CC      arch/x86/kernel/cpu/centaur.o
  CC      net/core/dev_addr_lists.o
  CC      drivers/acpi/acpica/dswscope.o
  CC      net/core/dst.o
  CC      drivers/tty/vt/keyboard.o
  CC      io_uring/opdef.o
  CC      lib/zstd/decompress/huf_decompress.o
  CC      fs/proc/loadavg.o
  CC [M]  arch/x86/kvm/irq_comm.o
  CC [M]  drivers/video/fbdev/simplefb.o
  AR      sound/soc/built-in.a
  LD [M]  sound/core/snd-ctl-led.o
  CC      drivers/tty/vt/consolemap.o
  CC      fs/ext4/balloc.o
  CC      fs/jbd2/transaction.o
  CC      drivers/video/cmdline.o
  CC      net/ethtool/pause.o
  CC      fs/jbd2/commit.o
  AR      lib/raid6/built-in.a
  CC      lib/bug.o
  CC      mm/slab_common.o
  CC      drivers/virtio/virtio_pci_modern_dev.o
  CC      mm/compaction.o
  CC      fs/ext4/bitmap.o
  LD [M]  sound/core/snd-hwdep.o
  CC      drivers/acpi/acpica/dswstate.o
  LD [M]  sound/core/snd-pcm.o
  CC      block/bsg.o
  CC [M]  arch/x86/kvm/cpuid.o
  CC      fs/configfs/item.o
  AR      sound/core/built-in.a
  CC      drivers/virtio/virtio_pci_legacy_dev.o
  AR      sound/atmel/built-in.a
  CC      lib/buildid.o
  CC      arch/x86/kernel/cpu/zhaoxin.o
  AR      sound/hda/built-in.a
  CC      arch/x86/kernel/cpu/perfctr-watchdog.o
  CC      drivers/video/nomodeset.o
  CC [M]  sound/hda/hda_bus_type.o
  AR      kernel/irq/built-in.a
  CC      crypto/ctr.o
  CC      arch/x86/kernel/cpu/vmware.o
  CC      kernel/module/main.o
  LD [M]  drivers/dma/ioat/ioatdma.o
  CC      crypto/gcm.o
  AR      drivers/dma/built-in.a
  CC      fs/proc/meminfo.o
  CC      drivers/pci/irq.o
  CC [M]  sound/hda/hdac_bus.o
  CC      drivers/clk/clk-divider.o
  AR      fs/devpts/built-in.a
  CC      drivers/char/hw_random/core.o
  CC      drivers/char/agp/backend.o
  CC      drivers/char/tpm/tpm-chip.o
  CC      drivers/char/agp/generic.o
  CC      arch/x86/kernel/signal.o
  CC      drivers/acpi/acpica/evevent.o
  CC      drivers/virtio/virtio_mmio.o
  HOSTCC  drivers/tty/vt/conmakehash
  CC      drivers/char/mem.o
  CC      io_uring/notif.o
  AR      fs/configfs/built-in.a
  CC      drivers/tty/vt/vt.o
  CC [M]  net/netfilter/ipvs/ip_vs_pe.o
  CC      drivers/char/hw_random/intel-rng.o
  CC [M]  net/netfilter/ipvs/ip_vs_proto_tcp.o
  CC      crypto/pcrypt.o
  CC      fs/ext4/block_validity.o
  CC      drivers/virtio/virtio_pci_modern.o
  AR      net/ipv4/netfilter/built-in.a
  CC      block/bsg-lib.o
  CC [M]  net/ipv4/netfilter/nf_defrag_ipv4.o
  CC [M]  net/ipv4/netfilter/nf_reject_ipv4.o
  CC      net/xfrm/xfrm_policy.o
  CC [M]  sound/hda/hdac_device.o
  CC      net/xfrm/xfrm_state.o
  CC [M]  sound/hda/hdac_sysfs.o
  CC      net/xfrm/xfrm_hash.o
  CC      lib/zstd/decompress/zstd_ddict.o
  CC      arch/x86/kernel/cpu/hypervisor.o
  CC      net/ethtool/eee.o
  CC      drivers/pci/vpd.o
  CC [M]  drivers/video/fbdev/core/softcursor.o
  CC      drivers/tty/hvc/hvc_console.o
  CC      net/xfrm/xfrm_input.o
  CC      fs/ramfs/inode.o
  CC [M]  arch/x86/kvm/pmu.o
  CC      drivers/clk/clk-fixed-factor.o
  CC      fs/proc/stat.o
  CC      drivers/acpi/acpica/evgpe.o
  CC      arch/x86/kernel/cpu/mshyperv.o
  COPY    drivers/tty/vt/defkeymap.c
  CONMK   drivers/tty/vt/consolemap_deftbl.c
  CC      drivers/tty/vt/defkeymap.o
  CC      fs/hugetlbfs/inode.o
  CC      drivers/char/tpm/tpm-dev-common.o
  CC      fs/fat/cache.o
  CC      drivers/char/tpm/tpm-dev.o
  CC      drivers/clk/clk-fixed-rate.o
  CC      arch/x86/kernel/signal_64.o
  CC      drivers/char/random.o
  CC      crypto/cryptd.o
  AR      drivers/char/hw_random/built-in.a
  CC      fs/fat/dir.o
  CC      drivers/tty/vt/consolemap_deftbl.o
  CC [M]  sound/hda/hdac_regmap.o
  CC      drivers/clk/clk-gate.o
  CC [M]  arch/x86/kvm/mtrr.o
  CC      drivers/virtio/virtio_pci_common.o
  CC      crypto/des_generic.o
  CC      block/blk-cgroup.o
  CC      fs/jbd2/recovery.o
  CC      lib/cmdline.o
  CC      lib/cpumask.o
  CC      drivers/acpi/acpica/evgpeblk.o
  CC      io_uring/io-wq.o
  CC      fs/ext4/dir.o
  CC      fs/fat/fatent.o
  CC      drivers/pci/setup-bus.o
  CC      fs/proc/uptime.o
  CC [M]  drivers/video/fbdev/core/tileblit.o
  CC      drivers/char/agp/isoch.o
  CC [M]  arch/x86/kvm/hyperv.o
  CC      drivers/clk/clk-multiplier.o
  CC      fs/ramfs/file-mmu.o
  CC      net/ethtool/tsinfo.o
  CC      drivers/acpi/acpica/evgpeinit.o
  CC      drivers/clk/clk-mux.o
  CC      fs/jbd2/checkpoint.o
  CC      arch/x86/kernel/cpu/capflags.o
  AR      drivers/tty/hvc/built-in.a
  CC [M]  arch/x86/kvm/debugfs.o
  CC      drivers/char/tpm/tpm-interface.o
  CC      drivers/tty/serial/8250/8250_core.o
  CC      drivers/tty/serial/8250/8250_pnp.o
  AR      arch/x86/kernel/cpu/built-in.a
  CC      drivers/tty/serial/serial_core.o
  CC      drivers/tty/serial/earlycon.o
  AR      drivers/tty/ipwireless/built-in.a
  CC      arch/x86/kernel/traps.o
  CC [M]  net/ipv4/netfilter/ip_tables.o
  CC [M]  net/netfilter/ipvs/ip_vs_proto_udp.o
  CC      drivers/tty/serial/8250/8250_port.o
  CC      drivers/tty/tty_io.o
  CC      net/ethtool/cabletest.o
  CC      fs/proc/util.o
  CC [M]  net/ipv4/netfilter/iptable_filter.o
  CC      kernel/module/strict_rwx.o
  CC [M]  sound/hda/hdac_controller.o
  CC      drivers/acpi/acpica/evgpeutil.o
  AR      fs/ramfs/built-in.a
  CC      fs/ext4/ext4_jbd2.o
  CC      drivers/clk/clk-composite.o
  CC      crypto/aes_generic.o
  CC      fs/nfs/client.o
  CC      drivers/virtio/virtio_pci_legacy.o
  CC      drivers/char/agp/intel-agp.o
  CC [M]  drivers/video/fbdev/core/cfbfillrect.o
  CC      fs/nfs/dir.o
  CC      kernel/module/tree_lookup.o
  AR      fs/hugetlbfs/built-in.a
  CC      lib/ctype.o
  CC      fs/ext4/extents.o
  CC      fs/ext4/extents_status.o
  CC      fs/exportfs/expfs.o
  CC      fs/proc/version.o
  CC      fs/proc/softirqs.o
  CC      net/ethtool/tunnels.o
  CC      fs/lockd/clntlock.o
  CC      drivers/acpi/acpica/evglock.o
  CC      fs/lockd/clntproc.o
  CC      fs/proc/namespaces.o
  CC      net/ethtool/fec.o
  CC      net/ethtool/eeprom.o
  CC      drivers/char/tpm/tpm1-cmd.o
  CC      fs/jbd2/revoke.o
  CC      drivers/tty/serial/serial_mctrl_gpio.o
  CC      drivers/char/misc.o
  CC      fs/fat/file.o
  CC      lib/zstd/decompress/zstd_decompress.o
  CC      mm/interval_tree.o
  CC [M]  drivers/virtio/virtio_mem.o
  CC      arch/x86/kernel/idt.o
  CC      drivers/clk/clk-fractional-divider.o
  CC [M]  sound/hda/hdac_stream.o
  CC      fs/proc/self.o
  CC      kernel/module/debug_kmemleak.o
  CC [M]  net/ipv4/netfilter/iptable_mangle.o
  CC [M]  net/ipv4/netfilter/iptable_nat.o
  CC      fs/jbd2/journal.o
  CC      drivers/acpi/acpica/evhandler.o
  AR      io_uring/built-in.a
  CC      lib/zstd/decompress/zstd_decompress_block.o
  CC      lib/dec_and_lock.o
  CC      drivers/char/agp/intel-gtt.o
  AR      fs/exportfs/built-in.a
  CC      fs/nls/nls_base.o
  AR      drivers/tty/vt/built-in.a
  CC [M]  net/netfilter/ipvs/ip_vs_nfct.o
  CC [M]  net/netfilter/ipvs/ip_vs_rr.o
  CC      drivers/pci/vc.o
  CC [M]  drivers/video/fbdev/core/cfbcopyarea.o
  CC      fs/proc/thread_self.o
  CC      net/xfrm/xfrm_output.o
  CC      fs/ext4/file.o
  CC      block/blk-cgroup-rwstat.o
  CC      crypto/deflate.o
  CC      fs/lockd/clntxdr.o
  CC      fs/lockd/host.o
  CC      kernel/module/kallsyms.o
  CC [M]  drivers/video/fbdev/core/cfbimgblt.o
  CC      drivers/acpi/acpica/evmisc.o
  CC      drivers/char/tpm/tpm2-cmd.o
  CC      drivers/char/virtio_console.o
  CC      drivers/clk/clk-gpio.o
  AR      fs/unicode/built-in.a
  CC      block/blk-throttle.o
  CC      fs/nls/nls_cp437.o
  CC      fs/ntfs/aops.o
  CC      fs/nls/nls_ascii.o
  CC      fs/lockd/svc.o
  CC      net/ethtool/stats.o
  CC      fs/fat/inode.o
  CC      net/core/netevent.o
  CC      arch/x86/kernel/irq.o
  CC      mm/list_lru.o
  CC      arch/x86/kernel/irq_64.o
  CC      drivers/tty/n_tty.o
  CC [M]  sound/hda/array.o
  CC      fs/proc/proc_sysctl.o
  CC      net/xfrm/xfrm_sysctl.o
  CC      drivers/tty/serial/8250/8250_dma.o
  CC      fs/lockd/svclock.o
  CC      drivers/pci/mmap.o
  CC      drivers/pci/setup-irq.o
  CC      drivers/acpi/acpica/evregion.o
  CC      fs/fat/misc.o
  CC      crypto/crc32c_generic.o
  CC [M]  sound/hda/hdmi_chmap.o
  CC [M]  arch/x86/kvm/mmu/mmu.o
  CC      drivers/acpi/acpica/evrgnini.o
  CC [M]  net/ipv4/netfilter/ipt_REJECT.o
  CC [M]  arch/x86/kvm/mmu/page_track.o
  AR      drivers/clk/built-in.a
  CC      fs/nls/nls_iso8859-1.o
  CC [M]  arch/x86/kvm/mmu/spte.o
  CC [M]  sound/hda/trace.o
  AR      drivers/char/agp/built-in.a
  CC      drivers/char/hpet.o
  CC      net/core/neighbour.o
  LD [M]  net/netfilter/ipvs/ip_vs.o
  CC      kernel/module/procfs.o
  CC      fs/ext4/fsmap.o
  CC      fs/nls/nls_utf8.o
  CC [M]  net/netfilter/nf_conntrack_core.o
  CC [M]  drivers/video/fbdev/core/sysfillrect.o
  CC [M]  net/netfilter/nf_conntrack_standalone.o
  CC      block/mq-deadline.o
  CC      crypto/crct10dif_common.o
  CC      drivers/char/nvram.o
  CC      fs/lockd/svcshare.o
  CC      fs/fat/nfs.o
  CC      fs/fat/namei_vfat.o
  CC      drivers/char/tpm/tpmrm-dev.o
  CC      lib/zstd/zstd_common_module.o
  CC      drivers/acpi/acpica/evsci.o
  CC      drivers/pci/proc.o
  CC      fs/ntfs/attrib.o
  CC      mm/workingset.o
  AR      drivers/virtio/built-in.a
  CC      drivers/tty/serial/8250/8250_dwlib.o
  CC      kernel/time/time.o
  CC      fs/lockd/svcproc.o
  CC      drivers/tty/serial/8250/8250_pcilib.o
  AR      fs/nls/built-in.a
  CC      kernel/time/timer.o
  CC      mm/debug.o
  CC      net/ethtool/phc_vclocks.o
  CC      crypto/crct10dif_generic.o
  CC      net/ethtool/mm.o
  CC      fs/autofs/init.o
  CC      fs/autofs/inode.o
  CC      drivers/video/hdmi.o
  CC      kernel/module/sysfs.o
  CC      lib/zstd/common/debug.o
  CC [M]  sound/hda/hdac_component.o
  CC      lib/zstd/common/entropy_common.o
  CC      net/ethtool/module.o
  CC      drivers/acpi/acpica/evxface.o
  CC      fs/debugfs/inode.o
  CC      net/ipv4/route.o
  CC [M]  net/netfilter/nf_conntrack_expect.o
  CC      block/kyber-iosched.o
  CC      net/xfrm/xfrm_replay.o
  CC      block/bfq-iosched.o
  CC      fs/debugfs/file.o
  CC [M]  sound/hda/hdac_i915.o
  CC      drivers/char/tpm/tpm2-space.o
  CC      fs/fat/namei_msdos.o
  CC [M]  arch/x86/kvm/mmu/tdp_iter.o
  CC      drivers/acpi/acpica/evxfevnt.o
  CC [M]  drivers/video/fbdev/core/syscopyarea.o
  CC      crypto/authenc.o
  CC      block/bfq-wf2q.o
  CC      kernel/futex/core.o
  CC      drivers/pci/slot.o
  CC      kernel/futex/syscalls.o
  CC      arch/x86/kernel/dumpstack_64.o
  CC      drivers/tty/serial/8250/8250_pci.o
  CC      kernel/futex/pi.o
  CC      fs/nfs/file.o
  CC      fs/proc/proc_net.o
  CC      mm/gup.o
  CC      drivers/acpi/acpica/evxfgpe.o
  CC      drivers/tty/serial/8250/8250_exar.o
  AR      kernel/module/built-in.a
  CC      fs/autofs/root.o
  CC      fs/autofs/symlink.o
  CC      fs/proc/kcore.o
  CC [M]  net/netfilter/nf_conntrack_helper.o
  CC      mm/mmap_lock.o
  CC      mm/highmem.o
  CC      fs/ntfs/collate.o
  CC      fs/autofs/waitq.o
  CC      lib/zstd/common/error_private.o
  CC [M]  sound/hda/intel-dsp-config.o
  AR      fs/jbd2/built-in.a
  CC      drivers/acpi/osl.o
  CC      fs/nfs/getroot.o
  CC [M]  sound/hda/intel-nhlt.o
  CC      fs/nfs/inode.o
  CC      fs/lockd/svcsubs.o
  CC      fs/ntfs/compress.o
  CC      fs/lockd/mon.o
  CC      fs/lockd/xdr.o
  CC      net/ethtool/pse-pd.o
  CC      arch/x86/kernel/time.o
  CC [M]  arch/x86/kvm/mmu/tdp_mmu.o
  CC      drivers/char/tpm/tpm-sysfs.o
  AR      fs/fat/built-in.a
  CC      drivers/char/tpm/eventlog/common.o
  CC      drivers/acpi/utils.o
  CC [M]  drivers/video/fbdev/core/sysimgblt.o
  CC      drivers/pci/pci-acpi.o
  CC      kernel/time/hrtimer.o
  CC      drivers/acpi/acpica/evxfregn.o
  AR      fs/debugfs/built-in.a
  CC      drivers/acpi/acpica/exconcat.o
  CC      crypto/authencesn.o
  CC      lib/decompress.o
  CC      lib/zstd/common/fse_decompress.o
  CC      drivers/char/tpm/eventlog/tpm1.o
  CC      lib/zstd/common/zstd_common.o
  CC      fs/proc/kmsg.o
  CC      kernel/futex/requeue.o
  CC      lib/decompress_bunzip2.o
  CC      fs/ext4/fsync.o
  CC      crypto/lzo.o
  CC [M]  sound/hda/intel-sdw-acpi.o
  CC      crypto/lzo-rle.o
  CC      fs/ext4/hash.o
  CC      fs/proc/page.o
  CC      net/xfrm/xfrm_device.o
  CC      fs/autofs/expire.o
  CC      arch/x86/kernel/ioport.o
  CC      crypto/lz4.o
  CC      kernel/futex/waitwake.o
  CC      fs/lockd/clnt4xdr.o
  CC      drivers/acpi/acpica/exconfig.o
  CC [M]  net/netfilter/nf_conntrack_proto.o
  CC      fs/ntfs/debug.o
  CC      fs/ext4/ialloc.o
  CC      fs/ext4/indirect.o
  CC      drivers/tty/serial/8250/8250_early.o
  CC [M]  arch/x86/kvm/smm.o
  CC      drivers/char/tpm/eventlog/tpm2.o
  CC      drivers/acpi/acpica/exconvrt.o
  CC      block/bfq-cgroup.o
  CC      net/ethtool/plca.o
  CC      fs/lockd/xdr4.o
  CC      drivers/acpi/reboot.o
  CC      fs/ntfs/dir.o
  CC      drivers/acpi/nvs.o
  LD [M]  sound/hda/snd-hda-core.o
  CC      kernel/time/timekeeping.o
  CC      arch/x86/kernel/dumpstack.o
  CC      drivers/acpi/wakeup.o
  CC      fs/ext4/inline.o
  CC      kernel/cgroup/cgroup.o
  CC [M]  net/netfilter/nf_conntrack_proto_generic.o
  LD [M]  sound/hda/snd-intel-dspcfg.o
  CC      fs/ext4/inode.o
  CC [M]  drivers/video/fbdev/core/fb_sys_fops.o
  LD [M]  sound/hda/snd-intel-sdw-acpi.o
  CC      kernel/time/ntp.o
  AR      sound/x86/built-in.a
  AR      sound/xen/built-in.a
  AR      sound/virtio/built-in.a
  CC      sound/sound_core.o
  CC      drivers/acpi/sleep.o
  CC      kernel/cgroup/rstat.o
  CC      drivers/char/tpm/tpm_ppi.o
  AR      kernel/futex/built-in.a
  CC      drivers/tty/tty_ioctl.o
  CC      drivers/acpi/device_sysfs.o
  CC      net/xfrm/xfrm_algo.o
  CC      fs/autofs/dev-ioctl.o
  CC      crypto/lz4hc.o
  CC      kernel/time/clocksource.o
  CC      kernel/time/jiffies.o
  CC      drivers/pci/quirks.o
  CC      kernel/trace/trace_clock.o
  CC      drivers/acpi/acpica/excreate.o
  CC      drivers/tty/serial/8250/8250_dw.o
  AR      fs/proc/built-in.a
  CC      kernel/trace/ftrace.o
  CC      sound/last.o
  CC      drivers/char/tpm/eventlog/acpi.o
  CC      fs/ext4/ioctl.o
  CC      drivers/char/tpm/eventlog/efi.o
  CC      kernel/time/timer_list.o
  CC      drivers/char/tpm/tpm_crb.o
  CC      drivers/pci/ats.o
  CC      crypto/xxhash_generic.o
  CC      drivers/acpi/device_pm.o
  CC      net/unix/af_unix.o
  CC      drivers/acpi/proc.o
  CC      arch/x86/kernel/nmi.o
  CC      net/core/rtnetlink.o
  CC      net/unix/garbage.o
  CC      drivers/acpi/acpica/exdebug.o
  LD [M]  drivers/video/fbdev/core/fb.o
  CC      drivers/pci/iov.o
  AR      net/ethtool/built-in.a
  CC      drivers/pci/pci-label.o
  CC      drivers/acpi/bus.o
  AR      drivers/video/fbdev/core/built-in.a
  CC      lib/decompress_inflate.o
  CC      lib/decompress_unlz4.o
  AR      sound/built-in.a
  AR      drivers/video/fbdev/built-in.a
  CC      kernel/time/timeconv.o
  CC      mm/memory.o
  AR      drivers/video/built-in.a
  AR      fs/autofs/built-in.a
  CC      fs/nfs/super.o
  CC      fs/tracefs/inode.o
  AR      drivers/iommu/amd/built-in.a
  AR      drivers/gpu/host1x/built-in.a
  CC [M]  net/netfilter/nf_conntrack_proto_tcp.o
  CC      drivers/iommu/intel/dmar.o
  CC      fs/lockd/svc4proc.o
  CC      fs/lockd/procfs.o
  AR      drivers/gpu/drm/tests/built-in.a
  CC [M]  drivers/gpu/drm/tests/drm_kunit_helpers.o
  AR      drivers/gpu/vga/built-in.a
  CC [M]  drivers/gpu/drm/tests/drm_buddy_test.o
  CC      fs/ntfs/file.o
  CC      crypto/rng.o
  AR      drivers/gpu/drm/arm/built-in.a
  CC [M]  drivers/gpu/drm/tests/drm_cmdline_parser_test.o
  CC      drivers/acpi/glue.o
  AR      drivers/gpu/drm/display/built-in.a
  CC [M]  drivers/gpu/drm/display/drm_display_helper_mod.o
  AR      drivers/gpu/drm/rcar-du/built-in.a
  CC      net/xfrm/xfrm_user.o
  AR      drivers/gpu/drm/omapdrm/built-in.a
  CC      drivers/acpi/scan.o
  CC      crypto/drbg.o
  CC      drivers/acpi/acpica/exdump.o
  CC [M]  drivers/gpu/drm/tests/drm_connector_test.o
  CC [M]  drivers/gpu/drm/display/drm_dp_dual_mode_helper.o
  CC      drivers/tty/serial/8250/8250_lpss.o
  CC      net/ipv4/inetpeer.o
  CC      block/blk-mq-pci.o
  AR      drivers/gpu/drm/tilcdc/built-in.a
  CC [M]  net/netfilter/nf_conntrack_proto_udp.o
  CC [M]  drivers/gpu/drm/display/drm_dp_helper.o
  CC      kernel/time/timecounter.o
  CC [M]  arch/x86/kvm/vmx/vmx.o
  CC      mm/mincore.o
  CC      drivers/iommu/intel/iommu.o
  CC      drivers/iommu/intel/pasid.o
  AR      drivers/char/tpm/built-in.a
  CC      kernel/time/alarmtimer.o
  AR      drivers/char/built-in.a
  CC [M]  drivers/gpu/drm/display/drm_dp_mst_topology.o
  CC [M]  drivers/gpu/drm/tests/drm_damage_helper_test.o
  CC      lib/decompress_unlzma.o
  CC      fs/nfs/io.o
  CC      arch/x86/kernel/ldt.o
  CC      drivers/acpi/acpica/exfield.o
  AR      fs/tracefs/built-in.a
  CC      mm/mlock.o
  CC      fs/ntfs/index.o
  CC      kernel/trace/ring_buffer.o
  CC      drivers/acpi/resource.o
  CC      fs/btrfs/super.o
  CC      drivers/connector/cn_queue.o
  CC      kernel/events/core.o
  CC      drivers/connector/connector.o
  CC      kernel/fork.o
  CC      kernel/events/ring_buffer.o
  CC      drivers/connector/cn_proc.o
  CC      kernel/bpf/core.o
  CC      block/blk-mq-virtio.o
  CC      drivers/tty/serial/8250/8250_mid.o
  CC [M]  drivers/gpu/drm/tests/drm_dp_mst_helper_test.o
  CC      drivers/acpi/acpica/exfldio.o
  CC [M]  drivers/gpu/drm/tests/drm_format_helper_test.o
  AR      fs/lockd/built-in.a
  CC      kernel/events/callchain.o
  CC      lib/decompress_unlzo.o
  CC      net/ipv4/protocol.o
  CC      fs/nfs/direct.o
  AR      net/ipv6/netfilter/built-in.a
  CC [M]  net/ipv6/netfilter/nf_defrag_ipv6_hooks.o
  CC      drivers/pci/pci-stub.o
  CC      net/ipv6/af_inet6.o
  CC      fs/nfs/pagelist.o
  CC      fs/ntfs/inode.o
  CC      fs/ext4/mballoc.o
  CC      drivers/pci/vgaarb.o
  CC      drivers/acpi/acpica/exmisc.o
  CC      arch/x86/kernel/setup.o
  CC      fs/ntfs/mft.o
  CC      kernel/time/posix-timers.o
  CC      block/blk-mq-debugfs.o
  CC      crypto/jitterentropy.o
  CC      block/blk-pm.o
  CC      crypto/jitterentropy-kcapi.o
  CC [M]  net/netfilter/nf_conntrack_proto_icmp.o
  CC      drivers/tty/serial/8250/8250_pericom.o
  CC [M]  arch/x86/kvm/kvm-asm-offsets.s
  AR      drivers/gpu/drm/imx/built-in.a
  CC      kernel/cgroup/namespace.o
  CC      net/ipv6/anycast.o
  CC [M]  net/netfilter/nf_conntrack_extend.o
  CC      net/core/utils.o
  CC      drivers/acpi/acpi_processor.o
  AR      drivers/gpu/drm/i2c/built-in.a
  CC      kernel/events/hw_breakpoint.o
  AR      drivers/iommu/arm/arm-smmu/built-in.a
  AR      drivers/iommu/arm/arm-smmu-v3/built-in.a
  AR      drivers/iommu/arm/built-in.a
  CC      fs/pstore/inode.o
  CC      drivers/acpi/acpica/exmutex.o
  CC      fs/pstore/platform.o
  CC [M]  drivers/gpu/drm/tests/drm_format_test.o
  CC      fs/pstore/pmsg.o
  AR      drivers/connector/built-in.a
  CC      net/ipv6/ip6_output.o
  CC      crypto/ghash-generic.o
  CC      mm/mmap.o
  AR      drivers/iommu/iommufd/built-in.a
  CC [M]  drivers/gpu/drm/display/drm_dsc_helper.o
  CC      net/ipv4/ip_input.o
  CC      net/unix/sysctl_net_unix.o
  CC      block/holder.o
  CC      drivers/acpi/acpica/exnames.o
  AR      lib/zstd/built-in.a
  CC      lib/decompress_unxz.o
  AR      drivers/tty/serial/8250/built-in.a
  AR      drivers/tty/serial/built-in.a
  CC      drivers/tty/tty_ldisc.o
  CC      fs/efivarfs/inode.o
  CC      drivers/tty/tty_buffer.o
  CC      kernel/cgroup/cgroup-v1.o
  CC [M]  net/ipv6/netfilter/nf_conntrack_reasm.o
  CC      kernel/exec_domain.o
  CC      lib/decompress_unzstd.o
  CC      arch/x86/kernel/x86_init.o
  CC      net/ipv4/ip_fragment.o
  CC      crypto/af_alg.o
  AR      drivers/pci/built-in.a
  CC [M]  arch/x86/kvm/vmx/pmu_intel.o
  CC      drivers/iommu/iommu.o
  CC      drivers/acpi/processor_core.o
  CC      drivers/acpi/acpica/exoparg1.o
  CC      net/packet/af_packet.o
  AR      net/xfrm/built-in.a
  CC [M]  net/netfilter/nf_conntrack_acct.o
  AR      fs/pstore/built-in.a
  CC      net/packet/diag.o
  CC [M]  net/netfilter/nf_conntrack_seqadj.o
  CC      net/ipv4/ip_forward.o
  CC      net/core/link_watch.o
  AR      block/built-in.a
  CC      fs/ntfs/mst.o
  CC [M]  drivers/gpu/drm/tests/drm_framebuffer_test.o
  CC      crypto/algif_hash.o
  CC      kernel/cgroup/freezer.o
  CC      drivers/acpi/acpica/exoparg2.o
  CC      fs/efivarfs/file.o
  CC      kernel/time/posix-cpu-timers.o
  CC      drivers/iommu/iommu-traces.o
  CC      net/ipv4/ip_options.o
  CC      lib/dump_stack.o
  CC      kernel/panic.o
  CC      net/unix/diag.o
  CC      fs/efivarfs/super.o
  CC      fs/efivarfs/vars.o
  CC      drivers/tty/tty_port.o
  CC      fs/nfs/read.o
  CC      arch/x86/kernel/i8259.o
  CC      drivers/iommu/intel/trace.o
  CC      kernel/trace/trace.o
  CC      drivers/iommu/intel/cap_audit.o
  CC      drivers/iommu/intel/irq_remapping.o
  AR      kernel/bpf/built-in.a
  CC      lib/earlycpio.o
  CC      lib/extable.o
  CC      drivers/acpi/acpica/exoparg3.o
  CC [M]  drivers/gpu/drm/tests/drm_managed_test.o
  CC      fs/ntfs/namei.o
  CC      kernel/trace/trace_output.o
  CC [M]  drivers/gpu/drm/display/drm_hdcp_helper.o
  CC      arch/x86/kernel/irqinit.o
  CC      drivers/iommu/intel/perfmon.o
  CC      lib/flex_proportions.o
  CC      fs/ext4/migrate.o
  CC      net/unix/scm.o
  CC [M]  fs/netfs/buffered_read.o
  CC [M]  net/netfilter/nf_conntrack_proto_icmpv6.o
  CC      kernel/cgroup/legacy_freezer.o
  CC [M]  drivers/gpu/drm/tests/drm_mm_test.o
  CC      crypto/algif_skcipher.o
  CC [M]  drivers/gpu/drm/tests/drm_modes_test.o
  CC      net/ipv4/ip_output.o
  CC      drivers/iommu/iommu-sysfs.o
  CC      kernel/trace/trace_seq.o
  CC      arch/x86/kernel/jump_label.o
  LD [M]  net/ipv6/netfilter/nf_defrag_ipv6.o
  CC      net/core/filter.o
  CC      drivers/acpi/acpica/exoparg6.o
  CC [M]  fs/fscache/cache.o
  AR      fs/efivarfs/built-in.a
  CC [M]  fs/smbfs_common/cifs_arc4.o
  CC [M]  drivers/gpu/drm/display/drm_hdmi_helper.o
  CC [M]  arch/x86/kvm/vmx/vmcs12.o
  CC [M]  fs/smbfs_common/cifs_md4.o
  CC      fs/btrfs/ctree.o
  CC      net/core/sock_diag.o
  CC      drivers/tty/tty_mutex.o
  CC [M]  net/netfilter/nf_conntrack_proto_dccp.o
  CC      lib/idr.o
  CC      fs/ntfs/runlist.o
  CC      kernel/time/posix-clock.o
  CC [M]  arch/x86/kvm/vmx/hyperv.o
  AR      drivers/gpu/drm/panel/built-in.a
  CC      drivers/iommu/dma-iommu.o
  CC      net/ipv4/ip_sockglue.o
  CC      crypto/xor.o
  CC      lib/irq_regs.o
  CC      fs/btrfs/extent-tree.o
  CC      fs/ntfs/super.o
  CC      fs/ntfs/sysctl.o
  CC      drivers/acpi/acpica/exprep.o
  CC      fs/ntfs/unistr.o
  CC      fs/btrfs/print-tree.o
  CC [M]  fs/cifs/trace.o
  CC [M]  net/netfilter/nf_conntrack_proto_sctp.o
  AR      drivers/gpu/drm/bridge/analogix/built-in.a
  AR      drivers/gpu/drm/hisilicon/built-in.a
  CC      fs/nfs/symlink.o
  AR      drivers/gpu/drm/bridge/cadence/built-in.a
  AR      drivers/gpu/drm/bridge/imx/built-in.a
  AR      drivers/gpu/drm/bridge/synopsys/built-in.a
  CC      arch/x86/kernel/irq_work.o
  AR      drivers/gpu/drm/bridge/built-in.a
  CC      fs/nfs/unlink.o
  CC      crypto/hash_info.o
  CC      drivers/iommu/ioasid.o
  CC      kernel/cgroup/pids.o
  CC      fs/ext4/mmp.o
  CC      kernel/cgroup/cpuset.o
  CC [M]  drivers/gpu/drm/display/drm_scdc_helper.o
  CC      drivers/tty/tty_ldsem.o
  AR      net/unix/built-in.a
  CC [M]  fs/netfs/io.o
  CC      net/ipv4/inet_hashtables.o
  CC      lib/is_single_threaded.o
  CC [M]  arch/x86/kvm/vmx/nested.o
  CC      crypto/simd.o
  CC [M]  drivers/gpu/drm/display/drm_dp_aux_dev.o
  AR      drivers/gpu/drm/mxsfb/built-in.a
  CC [M]  arch/x86/kvm/vmx/posted_intr.o
  AR      drivers/gpu/drm/tiny/built-in.a
  CC      drivers/iommu/iova.o
  AR      drivers/gpu/drm/xlnx/built-in.a
  CC      drivers/iommu/irq_remapping.o
  CC [M]  fs/fscache/cookie.o
  CC      drivers/acpi/acpica/exregion.o
  AR      drivers/iommu/intel/built-in.a
  CC      kernel/time/itimer.o
  CC      net/ipv4/inet_timewait_sock.o
  CC      mm/mmu_gather.o
  CC      net/ipv4/inet_connection_sock.o
  CC      drivers/acpi/acpica/exresnte.o
  CC      fs/ntfs/upcase.o
  CC      lib/klist.o
  CC      lib/kobject.o
  CC      fs/nfs/write.o
  CC      drivers/acpi/processor_pdc.o
  CC [M]  net/netfilter/nf_conntrack_netlink.o
  CC      drivers/base/power/sysfs.o
  CC      net/ipv6/ip6_input.o
  CC      drivers/base/power/generic_ops.o
  CC      arch/x86/kernel/probe_roms.o
  CC      drivers/tty/tty_baudrate.o
  CC [M]  crypto/md4.o
  CC      drivers/base/power/common.o
  CC      net/ipv6/addrconf.o
  CC      drivers/acpi/acpica/exresolv.o
  CC      drivers/base/firmware_loader/builtin/main.o
  CC      drivers/base/regmap/regmap.o
  AR      drivers/base/test/built-in.a
  CC      drivers/base/firmware_loader/main.o
  CC      fs/nfs/namespace.o
  CC      drivers/base/regmap/regcache.o
  CC      net/ipv6/addrlabel.o
  CC      drivers/block/loop.o
  AR      fs/ntfs/built-in.a
  AR      drivers/misc/eeprom/built-in.a
  AR      drivers/misc/cb710/built-in.a
  CC      net/ipv6/route.o
  AR      drivers/iommu/built-in.a
  LD [M]  drivers/gpu/drm/display/drm_display_helper.o
  AR      drivers/misc/ti-st/built-in.a
  CC      drivers/base/regmap/regcache-rbtree.o
  CC      drivers/base/regmap/regcache-flat.o
  AR      drivers/misc/lis3lv02d/built-in.a
  AR      drivers/misc/cardreader/built-in.a
  CC      mm/mprotect.o
  CC [M]  drivers/gpu/drm/tests/drm_plane_helper_test.o
  CC [M]  drivers/misc/mei/hdcp/mei_hdcp.o
  CC      mm/mremap.o
  CC [M]  fs/netfs/iterator.o
  AR      drivers/base/firmware_loader/builtin/built-in.a
  CC [M]  fs/fscache/io.o
  LD [M]  arch/x86/kvm/kvm.o
  CC      kernel/time/clockevents.o
  CC      drivers/base/power/qos.o
  CC      lib/kobject_uevent.o
  CC [M]  crypto/ccm.o
  CC      drivers/acpi/acpica/exresop.o
  AR      drivers/misc/built-in.a
  CC      drivers/base/power/runtime.o
  CC      drivers/tty/tty_jobctrl.o
  CC      drivers/tty/n_null.o
  CC      arch/x86/kernel/sys_ia32.o
  CC      fs/nfs/mount_clnt.o
  CC      net/ipv4/tcp.o
  CC      net/ipv4/tcp_input.o
  CC [M]  fs/netfs/main.o
  AR      net/packet/built-in.a
  CC [M]  crypto/arc4.o
  CC      drivers/base/power/wakeirq.o
  CC      drivers/base/regmap/regmap-debugfs.o
  CC      fs/btrfs/root-tree.o
  CC      drivers/acpi/acpica/exserial.o
  CC      drivers/tty/pty.o
  CC      drivers/block/virtio_blk.o
  CC [M]  fs/cifs/cifsfs.o
  CC [M]  drivers/gpu/drm/tests/drm_probe_helper_test.o
  CC      drivers/base/regmap/regmap-i2c.o
  CC      kernel/time/tick-common.o
  AR      drivers/base/firmware_loader/built-in.a
  CC      drivers/base/component.o
  CC      drivers/acpi/acpica/exstore.o
  CC      fs/ext4/move_extent.o
  CC [M]  crypto/ecc.o
  CC [M]  fs/fscache/main.o
  UPD     arch/x86/kvm/kvm-asm-offsets.h
  CC [M]  drivers/misc/mei/pxp/mei_pxp.o
  CC      drivers/acpi/acpica/exstoren.o
  CC      drivers/acpi/acpica/exstorob.o
  CC      kernel/time/tick-broadcast.o
  CC      drivers/tty/sysrq.o
  CC      fs/nfs/nfstrace.o
  CC [M]  crypto/essiv.o
  AR      drivers/gpu/drm/gud/built-in.a
  CC      drivers/base/core.o
  CC [M]  fs/fscache/volume.o
  CC [M]  crypto/ecdh.o
  CC [M]  crypto/ecdh_helper.o
  CC      arch/x86/kernel/signal_32.o
  CC      drivers/base/power/main.o
  CC      lib/logic_pio.o
  CC      fs/ext4/namei.o
  CC      mm/msync.o
  CC      drivers/acpi/ec.o
  AS [M]  arch/x86/kvm/vmx/vmenter.o
  CC      arch/x86/kernel/sys_x86_64.o
  CC [M]  fs/fuse/dev.o
  CC      drivers/acpi/acpica/exsystem.o
  CC      arch/x86/kernel/espfix_64.o
  CC [M]  fs/fuse/dir.o
  CC [M]  drivers/gpu/drm/tests/drm_rect_test.o
  CC [M]  fs/netfs/objects.o
  CC      fs/btrfs/dir-item.o
  CC      mm/page_vma_mapped.o
  CC      drivers/base/regmap/regmap-irq.o
  AR      kernel/cgroup/built-in.a
  CC      drivers/base/bus.o
  CC      mm/pagewalk.o
  CC [M]  drivers/misc/mei/init.o
  CC      kernel/trace/trace_stat.o
  CC      kernel/cpu.o
  CC      drivers/base/dd.o
  AR      drivers/gpu/drm/solomon/built-in.a
  CC [M]  drivers/gpu/drm/ttm/ttm_tt.o
  CC      kernel/time/tick-broadcast-hrtimer.o
  CC      mm/pgtable-generic.o
  CC      drivers/acpi/acpica/extrace.o
  CC      kernel/time/tick-oneshot.o
  CC      drivers/base/power/wakeup.o
  CC      lib/maple_tree.o
  CC      kernel/trace/trace_printk.o
  CC [M]  fs/fscache/proc.o
  CC [M]  drivers/block/nbd.o
  CC [M]  net/netfilter/nf_nat_core.o
  CC      kernel/time/tick-sched.o
  CC      drivers/acpi/acpica/exutils.o
  AR      drivers/tty/built-in.a
  CC [M]  fs/fuse/file.o
  CC      kernel/exit.o
  CC      drivers/acpi/acpica/hwacpi.o
  CC      drivers/acpi/dock.o
  CC      kernel/events/uprobes.o
  CC      arch/x86/kernel/ksysfs.o
  CC      kernel/time/vsyscall.o
  LD [M]  fs/netfs/netfs.o
  CC      drivers/base/power/wakeup_stats.o
  CC      kernel/time/timekeeping_debug.o
  CC [M]  drivers/misc/mei/hbm.o
  CC      drivers/mfd/mfd-core.o
  AR      drivers/nfc/built-in.a
  AR      drivers/dax/hmem/built-in.a
  CC      drivers/dax/super.o
  CC      kernel/time/namespace.o
  CC      kernel/trace/pid_list.o
  CC      drivers/mfd/intel-lpss.o
  CC      drivers/dma-buf/dma-buf.o
  CC      mm/rmap.o
  CC      kernel/trace/trace_sched_switch.o
  CC [M]  drivers/gpu/drm/ttm/ttm_bo.o
  CC      drivers/acpi/acpica/hwesleep.o
  CC [M]  net/netfilter/nf_nat_proto.o
  CC      mm/vmalloc.o
  LD [M]  fs/fscache/fscache.o
  CC      drivers/dma-buf/dma-fence.o
  CC      drivers/acpi/acpica/hwgpe.o
  CC      fs/btrfs/file-item.o
  CC [M]  net/netfilter/nf_nat_helper.o
  CC      drivers/mfd/intel-lpss-pci.o
  CC [M]  fs/cifs/cifs_debug.o
  LD [M]  crypto/ecdh_generic.o
  CC      drivers/base/power/domain.o
  AR      crypto/built-in.a
  CC      kernel/trace/trace_functions.o
  CC      net/key/af_key.o
  CC      arch/x86/kernel/bootflag.o
  AR      drivers/base/regmap/built-in.a
  CC [M]  net/netfilter/nf_nat_redirect.o
  CC      mm/page_alloc.o
  CC      fs/btrfs/inode-item.o
  CC [M]  net/netfilter/nf_nat_masquerade.o
  CC      arch/x86/kernel/e820.o
  CC [M]  net/netfilter/x_tables.o
  CC      drivers/acpi/acpica/hwregs.o
  AR      kernel/time/built-in.a
  CC      drivers/dax/bus.o
  CC      drivers/base/power/domain_governor.o
  CC      kernel/softirq.o
  AR      drivers/macintosh/built-in.a
  AR      drivers/cxl/core/built-in.a
  AR      drivers/cxl/built-in.a
  CC      kernel/resource.o
  AR      net/bridge/netfilter/built-in.a
  CC      drivers/mfd/intel-lpss-acpi.o
  CC      net/bridge/br.o
  CC      kernel/trace/trace_preemptirq.o
  CC      drivers/scsi/scsi.o
  CC      drivers/scsi/hosts.o
  CC      drivers/scsi/scsi_ioctl.o
  CC [M]  net/netfilter/xt_tcpudp.o
  CC      lib/memcat_p.o
  CC [M]  drivers/misc/mei/interrupt.o
  CC [M]  net/netfilter/xt_mark.o
  CC      mm/init-mm.o
  CC      drivers/acpi/acpica/hwsleep.o
  CC [M]  drivers/gpu/drm/ttm/ttm_bo_util.o
  CC      fs/btrfs/disk-io.o
  CC      drivers/mfd/intel_soc_pmic_crc.o
  CC      drivers/acpi/acpica/hwvalid.o
  CC      net/bridge/br_device.o
  CC      lib/nmi_backtrace.o
  CC [M]  drivers/gpu/drm/ttm/ttm_bo_vm.o
  CC [M]  drivers/mfd/lpc_sch.o
  CC      drivers/dma-buf/dma-fence-array.o
  AR      kernel/events/built-in.a
  CC      kernel/sysctl.o
  CC      kernel/trace/trace_nop.o
  CC [M]  net/netfilter/xt_nat.o
  CC      net/bridge/br_fdb.o
  CC      drivers/base/power/clock_ops.o
  CC      drivers/acpi/acpica/hwxface.o
  CC      drivers/base/syscore.o
  CC      drivers/dma-buf/dma-fence-chain.o
  CC      arch/x86/kernel/pci-dma.o
  CC      drivers/dma-buf/dma-fence-unwrap.o
  CC      drivers/dma-buf/dma-resv.o
  CC [M]  fs/cifs/connect.o
  CC      lib/plist.o
  CC      kernel/trace/trace_functions_graph.o
  CC      fs/nfs/export.o
  CC [M]  drivers/misc/mei/client.o
  CC      kernel/trace/fgraph.o
  AR      drivers/dax/built-in.a
  CC      drivers/base/driver.o
  CC [M]  fs/cifs/dir.o
  CC [M]  net/netfilter/xt_REDIRECT.o
  CC      drivers/nvme/host/core.o
  CC      kernel/capability.o
  CC      drivers/nvme/host/ioctl.o
  CC      drivers/dma-buf/sync_file.o
  CC      arch/x86/kernel/quirks.o
  LD [M]  arch/x86/kvm/kvm-intel.o
  AR      drivers/nvme/target/built-in.a
  CC      drivers/ata/libata-core.o
  CC      drivers/ata/libata-scsi.o
  CC      fs/ext4/page-io.o
  AR      drivers/block/built-in.a
  CC      drivers/scsi/scsicam.o
  CC      fs/ext4/readpage.o
  CC [M]  net/netfilter/xt_MASQUERADE.o
  CC      fs/ext4/resize.o
  CC [M]  drivers/mfd/lpc_ich.o
  CC      lib/radix-tree.o
  CC [M]  fs/fuse/inode.o
  CC      fs/ext4/super.o
  CC [M]  drivers/gpu/drm/ttm/ttm_module.o
  CC      drivers/acpi/acpica/hwxfsleep.o
  CC      drivers/dma-buf/sw_sync.o
  CC      drivers/dma-buf/sync_debug.o
  CC      kernel/ptrace.o
  AR      drivers/base/power/built-in.a
  CC      drivers/scsi/scsi_error.o
  CC      drivers/ata/libata-eh.o
  CC      drivers/base/class.o
  CC      fs/nfs/sysfs.o
  CC      net/bridge/br_forward.o
  CC      kernel/trace/blktrace.o
  CC      kernel/trace/trace_events.o
  CC      drivers/acpi/acpica/hwpci.o
  CC [M]  drivers/gpu/drm/ttm/ttm_execbuf_util.o
  CC      lib/ratelimit.o
  CC [M]  net/netfilter/xt_addrtype.o
  CC      net/bridge/br_if.o
  CC      arch/x86/kernel/topology.o
  CC [M]  drivers/dma-buf/selftest.o
  CC      kernel/user.o
  CC [M]  drivers/dma-buf/st-dma-fence.o
  CC [M]  drivers/gpu/drm/ttm/ttm_range_manager.o
  AR      net/key/built-in.a
  CC      fs/nfs/fs_context.o
  AR      drivers/mfd/built-in.a
  CC      net/ipv6/ip6_fib.o
  CC [M]  fs/cifs/file.o
  CC      lib/rbtree.o
  CC      kernel/signal.o
  CC      fs/nfs/sysctl.o
  CC      kernel/trace/trace_export.o
  CC      net/ipv4/tcp_output.o
  CC [M]  drivers/dma-buf/st-dma-fence-chain.o
  CC      drivers/acpi/acpica/nsaccess.o
  CC      arch/x86/kernel/kdebugfs.o
  CC [M]  drivers/gpu/drm/scheduler/sched_main.o
  CC      drivers/nvme/host/trace.o
  CC [M]  fs/fuse/control.o
  CC      arch/x86/kernel/alternative.o
  CC      kernel/sys.o
  CC      drivers/base/platform.o
  CC [M]  drivers/dma-buf/st-dma-fence-unwrap.o
  CC [M]  drivers/gpu/drm/amd/amdgpu/amdgpu_drv.o
  CC      drivers/base/cpu.o
  CC [M]  drivers/dma-buf/st-dma-resv.o
  CC [M]  drivers/gpu/drm/scheduler/sched_fence.o
  CC      net/core/dev_ioctl.o
  CC [M]  fs/cifs/inode.o
  CC      drivers/spi/spi.o
  CC [M]  drivers/gpu/drm/i915/i915_driver.o
  CC [M]  drivers/gpu/drm/ttm/ttm_resource.o
  CC [M]  drivers/gpu/drm/i915/i915_drm_client.o
  CC      kernel/umh.o
  AR      drivers/dma-buf/built-in.a
  CC      drivers/acpi/acpica/nsalloc.o
  CC      drivers/acpi/acpica/nsarguments.o
  CC      net/ipv4/tcp_timer.o
  CC [M]  fs/fuse/xattr.o
  CC [M]  drivers/misc/mei/main.o
  CC [M]  drivers/misc/mei/dma-ring.o
  CC [M]  drivers/gpu/drm/xe/tests/xe_bo_test.o
  CC [M]  net/netfilter/xt_conntrack.o
  CC      kernel/trace/trace_event_perf.o
  CC [M]  drivers/gpu/drm/xe/tests/xe_dma_buf_test.o
  CC      kernel/trace/trace_events_filter.o
  CC      kernel/trace/trace_events_trigger.o
  CC      net/bridge/br_input.o
  CC [M]  drivers/misc/mei/bus.o
  CC      fs/btrfs/transaction.o
  CC      drivers/scsi/scsi_lib.o
  CC      arch/x86/kernel/i8253.o
  CC [M]  fs/cifs/link.o
  CC      drivers/acpi/pci_root.o
  CC [M]  drivers/gpu/drm/scheduler/sched_entity.o
  LD [M]  drivers/dma-buf/dmabuf_selftests.o
  CC      lib/seq_buf.o
  CC      drivers/acpi/acpica/nsconvert.o
  CC      drivers/net/phy/mdio-boardinfo.o
  AR      drivers/net/pse-pd/built-in.a
  CC      drivers/base/firmware.o
  CC      drivers/net/mdio/acpi_mdio.o
  CC [M]  drivers/gpu/drm/xe/tests/xe_migrate_test.o
  CC      drivers/net/mdio/fwnode_mdio.o
  CC [M]  fs/fuse/acl.o
  CC      net/bridge/br_ioctl.o
  CC      drivers/acpi/acpica/nsdump.o
  CC      kernel/workqueue.o
  CC [M]  drivers/gpu/drm/ttm/ttm_pool.o
  CC [M]  drivers/gpu/drm/ttm/ttm_device.o
  CC [M]  drivers/gpu/drm/amd/amdgpu/amdgpu_device.o
  CC      net/core/tso.o
  CC [M]  drivers/gpu/drm/i915/i915_config.o
  CC      arch/x86/kernel/hw_breakpoint.o
  CC      arch/x86/kernel/tsc.o
  CC      drivers/base/init.o
  CC [M]  drivers/gpu/drm/xe/tests/xe_pci_test.o
  CC      drivers/base/map.o
  CC      drivers/acpi/acpica/nseval.o
  CC      fs/nfs/nfs2super.o
  CC [M]  drivers/misc/mei/bus-fixup.o
  CC      drivers/net/phy/mdio_devres.o
  CC [M]  fs/overlayfs/super.o
  CC      fs/open.o
  CC [M]  net/netfilter/xt_ipvs.o
  CC      net/ipv6/ipv6_sockglue.o
  LD [M]  drivers/gpu/drm/scheduler/gpu-sched.o
  CC      drivers/ata/libata-transport.o
  CC [M]  drivers/gpu/drm/vgem/vgem_drv.o
  CC      drivers/acpi/pci_link.o
  CC      drivers/acpi/pci_irq.o
  CC [M]  drivers/gpu/drm/vgem/vgem_fence.o
  CC [M]  fs/fuse/readdir.o
  CC      lib/show_mem.o
  CC      fs/nfs/proc.o
  AR      drivers/net/mdio/built-in.a
  CC      net/ipv4/tcp_ipv4.o
  CC [M]  net/sunrpc/auth_gss/auth_gss.o
  CC      drivers/acpi/acpica/nsinit.o
  CC      drivers/acpi/acpica/nsload.o
  CC      drivers/base/devres.o
  CC      drivers/base/attribute_container.o
  CC [M]  drivers/gpu/drm/i915/i915_getparam.o
  CC      drivers/base/transport_class.o
  CC      drivers/ata/libata-trace.o
  CC      net/bridge/br_stp.o
  CC      kernel/trace/trace_eprobe.o
  CC [M]  drivers/gpu/drm/ttm/ttm_sys_manager.o
  CC [M]  drivers/gpu/drm/xe/tests/xe_rtp_test.o
  CC      net/core/sock_reuseport.o
  CC      drivers/net/phy/phy.o
  CC [M]  drivers/misc/mei/debugfs.o
  CC [M]  drivers/gpu/drm/i915/i915_ioctl.o
  CC      mm/memblock.o
  CC      net/sunrpc/clnt.o
  CC      net/sunrpc/xprt.o
  CC      arch/x86/kernel/tsc_msr.o
  CC      lib/siphash.o
  CC      drivers/nvme/host/pci.o
  CC [M]  drivers/misc/mei/mei-trace.o
  CC      fs/read_write.o
  CC      fs/file_table.o
  CC      drivers/acpi/acpica/nsnames.o
  CC [M]  fs/cifs/misc.o
  CC      kernel/trace/trace_kprobe.o
  LD [M]  drivers/gpu/drm/vgem/vgem.o
  CC      kernel/trace/error_report-traces.o
  CC      net/core/fib_notifier.o
  CC [M]  drivers/gpu/drm/ttm/ttm_agp_backend.o
  CC      arch/x86/kernel/io_delay.o
  CC      drivers/scsi/scsi_lib_dma.o
  CC      drivers/scsi/scsi_scan.o
  CC [M]  drivers/misc/mei/pci-me.o
  CC [M]  fs/fuse/ioctl.o
  CC      drivers/acpi/acpi_lpss.o
  GEN     drivers/scsi/scsi_devinfo_tbl.c
  CC      drivers/acpi/acpica/nsobject.o
  LD [M]  net/netfilter/nf_conntrack.o
  LD [M]  net/netfilter/nf_nat.o
  CC      drivers/acpi/acpica/nsparse.o
  CC      lib/string.o
  CC      drivers/base/topology.o
  AR      net/netfilter/built-in.a
  CC      fs/super.o
  CC      fs/ext4/symlink.o
  CC      net/core/xdp.o
  CC      arch/x86/kernel/rtc.o
  CC      lib/timerqueue.o
  CC [M]  drivers/gpu/drm/xe/tests/xe_wa_test.o
  CC [M]  fs/overlayfs/namei.o
  CC      drivers/acpi/acpi_apd.o
  CC      fs/btrfs/inode.o
  CC      fs/nfs/nfs2xdr.o
  CC      fs/btrfs/file.o
  LD [M]  drivers/gpu/drm/ttm/ttm.o
  CC      drivers/acpi/acpi_platform.o
  CC      lib/vsprintf.o
  CC      drivers/ata/libata-sata.o
  CC      fs/char_dev.o
  CC      kernel/pid.o
  CC      fs/btrfs/defrag.o
  CC [M]  drivers/gpu/drm/i915/i915_irq.o
  CC      drivers/acpi/acpica/nspredef.o
  CC [M]  drivers/gpu/drm/nouveau/nvif/object.o
  CC      drivers/acpi/acpica/nsprepkg.o
  CC      fs/btrfs/extent_map.o
  CC      net/ipv4/tcp_minisocks.o
  CC [M]  drivers/gpu/drm/ast/ast_drv.o
  CC      net/bridge/br_stp_bpdu.o
  CC [M]  drivers/misc/mei/hw-me.o
  AR      drivers/spi/built-in.a
  CC [M]  drivers/gpu/drm/ast/ast_i2c.o
  CC      net/ipv4/tcp_cong.o
  CC      net/ipv4/tcp_metrics.o
  LD [M]  fs/fuse/fuse.o
  CC      drivers/base/container.o
  CC      fs/btrfs/sysfs.o
  CC      kernel/trace/power-traces.o
  CC      arch/x86/kernel/resource.o
  CC [M]  drivers/gpu/drm/i915/i915_mitigations.o
  CC      mm/memory_hotplug.o
  CC      drivers/gpu/drm/drm_mipi_dsi.o
  CC      net/ipv6/ndisc.o
  AR      drivers/net/pcs/built-in.a
  CC      drivers/net/phy/phy-c45.o
  CC [M]  fs/cifs/netmisc.o
  CC [M]  drivers/gpu/drm/xe/xe_bb.o
  AS      arch/x86/kernel/irqflags.o
  CC      drivers/acpi/acpica/nsrepair.o
  CC      drivers/ata/libata-sff.o
  CC      arch/x86/kernel/static_call.o
  CC      drivers/scsi/scsi_devinfo.o
  CC      net/8021q/vlan_core.o
  CC      drivers/base/property.o
  CC [M]  fs/overlayfs/util.o
  CC [M]  net/8021q/vlan.o
  CC [M]  fs/overlayfs/inode.o
  CC [M]  drivers/gpu/drm/nouveau/nvif/client.o
  CC [M]  fs/overlayfs/file.o
  CC [M]  fs/overlayfs/dir.o
  CC [M]  drivers/gpu/drm/ast/ast_main.o
  CC [M]  net/sunrpc/auth_gss/gss_generic_token.o
  CC [M]  drivers/gpu/drm/nouveau/nvif/conn.o
  CC      arch/x86/kernel/process.o
  CC      drivers/acpi/acpica/nsrepair2.o
  CC      fs/btrfs/accessors.o
  CC      kernel/task_work.o
  CC [M]  drivers/gpu/drm/i915/i915_module.o
  CC      net/core/flow_offload.o
  AR      drivers/net/ethernet/adi/built-in.a
  CC      kernel/extable.o
  CC [M]  drivers/gpu/drm/i915/i915_params.o
  AR      drivers/net/ethernet/alacritech/built-in.a
  CC      net/bridge/br_stp_if.o
  AR      drivers/net/ethernet/amazon/built-in.a
  CC [M]  drivers/gpu/drm/xe/xe_bo.o
  AR      drivers/net/ethernet/aquantia/built-in.a
  AR      drivers/net/ethernet/asix/built-in.a
  AR      drivers/net/ethernet/cadence/built-in.a
  CC      drivers/scsi/scsi_sysctl.o
  AR      drivers/net/ethernet/broadcom/built-in.a
  CC [M]  drivers/net/ethernet/broadcom/b44.o
  CC      fs/nfs/nfs3super.o
  CC [M]  drivers/gpu/drm/amd/amdgpu/amdgpu_kms.o
  AR      drivers/nvme/host/built-in.a
  CC [M]  drivers/net/ethernet/broadcom/bnx2.o
  CC [M]  drivers/net/ethernet/broadcom/cnic.o
  AR      drivers/nvme/built-in.a
  CC      net/sunrpc/socklib.o
  CC      drivers/scsi/scsi_debugfs.o
  CC      fs/nfs/nfs3client.o
  CC      drivers/acpi/acpica/nssearch.o
  CC      drivers/base/cacheinfo.o
  CC [M]  drivers/gpu/drm/i915/i915_pci.o
  CC      drivers/net/phy/phy-core.o
  CC [M]  net/sunrpc/auth_gss/gss_mech_switch.o
  CC [M]  drivers/gpu/drm/xe/xe_bo_evict.o
  CC      kernel/trace/rpm-traces.o
  CC      fs/nfs/nfs3proc.o
  CC [M]  drivers/gpu/drm/nouveau/nvif/device.o
  CC [M]  drivers/gpu/drm/nouveau/nvif/disp.o
  CC      drivers/acpi/acpica/nsutils.o
  CC      drivers/acpi/acpica/nswalk.o
  CC [M]  net/sunrpc/auth_gss/svcauth_gss.o
  CC      fs/nfs/nfs3xdr.o
  CC [M]  drivers/gpu/drm/ast/ast_mm.o
  CC      net/ipv4/tcp_fastopen.o
  CC      net/ipv4/tcp_rate.o
  CC      mm/madvise.o
  CC [M]  net/8021q/vlan_dev.o
  CC [M]  drivers/misc/mei/gsc-me.o
  CC      net/ipv4/tcp_recovery.o
  CC [M]  fs/cifs/smbencrypt.o
  CC [M]  fs/cifs/transport.o
  CC      mm/page_io.o
  CC [M]  fs/overlayfs/readdir.o
  CC      drivers/scsi/scsi_trace.o
  CC      fs/btrfs/xattr.o
  CC [M]  drivers/gpu/drm/i915/i915_scatterlist.o
  CC      drivers/base/swnode.o
  CC      kernel/trace/trace_dynevent.o
  CC      net/core/gro.o
  CC      net/core/netdev-genl.o
  CC      arch/x86/kernel/ptrace.o
  CC      net/core/netdev-genl-gen.o
  CC      lib/win_minmax.o
  CC      drivers/acpi/acpica/nsxfeval.o
  CC      net/bridge/br_stp_timer.o
  CC      net/ipv6/udp.o
  CC      drivers/base/auxiliary.o
  CC [M]  drivers/gpu/drm/i915/i915_suspend.o
  CC [M]  drivers/net/ethernet/broadcom/tg3.o
  CC [M]  drivers/gpu/drm/i915/i915_switcheroo.o
  CC [M]  fs/overlayfs/copy_up.o
  CC      kernel/trace/trace_probe.o
  CC      drivers/acpi/acpica/nsxfname.o
  CC [M]  drivers/gpu/drm/nouveau/nvif/driver.o
  CC      drivers/net/phy/phy_device.o
  LD [M]  drivers/misc/mei/mei.o
  CC [M]  drivers/gpu/drm/ast/ast_mode.o
  CC      drivers/ata/libata-pmp.o
  LD [M]  drivers/misc/mei/mei-me.o
  CC      net/ipv4/tcp_ulp.o
  LD [M]  drivers/misc/mei/mei-gsc.o
  CC      drivers/scsi/scsi_logging.o
  CC      kernel/trace/trace_uprobe.o
  CC      lib/xarray.o
  CC      kernel/trace/rethook.o
  CC      drivers/scsi/scsi_pm.o
  CC [M]  net/sunrpc/auth_gss/gss_rpc_upcall.o
  CC      net/bridge/br_netlink.o
  CC      drivers/acpi/acpica/nsxfobj.o
  CC      drivers/acpi/acpica/psargs.o
  CC [M]  drivers/gpu/drm/i915/i915_sysfs.o
  AR      drivers/net/ethernet/cavium/common/built-in.a
  AR      drivers/net/ethernet/cavium/thunder/built-in.a
  AR      drivers/net/ethernet/cavium/liquidio/built-in.a
  AR      drivers/net/ethernet/cavium/octeon/built-in.a
  AR      drivers/net/ethernet/cavium/built-in.a
  CC      drivers/base/devtmpfs.o
  AR      drivers/net/usb/built-in.a
  CC [M]  drivers/net/usb/pegasus.o
  CC [M]  drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.o
  CC      net/core/net-sysfs.o
  CC      arch/x86/kernel/tls.o
  CC [M]  drivers/gpu/drm/ast/ast_post.o
  CC [M]  net/8021q/vlan_netlink.o
  CC      drivers/ata/libata-acpi.o
  CC [M]  drivers/gpu/drm/i915/i915_utils.o
  AR      drivers/firewire/built-in.a
  CC      drivers/acpi/acpica/psloop.o
  CC      drivers/net/phy/linkmode.o
  CC      arch/x86/kernel/step.o
  CC      net/ipv6/udplite.o
  AR      drivers/net/ethernet/cortina/built-in.a
  AR      drivers/net/ethernet/engleder/built-in.a
  CC      fs/stat.o
  CC [M]  drivers/gpu/drm/nouveau/nvif/event.o
  CC [M]  drivers/gpu/drm/ast/ast_dp501.o
  CC      net/bridge/br_netlink_tunnel.o
  CC      net/bridge/br_arp_nd_proxy.o
  CC      net/bridge/br_sysfs_if.o
  CC [M]  drivers/gpu/drm/xe/xe_debugfs.o
  CC      net/bridge/br_sysfs_br.o
  AR      drivers/cdrom/built-in.a
  CC [M]  fs/cifs/cached_dir.o
  CC      drivers/scsi/scsi_bsg.o
  CC      net/bridge/br_nf_core.o
  CC      mm/swap_state.o
  CC      net/core/net-procfs.o
  CC      fs/btrfs/ordered-data.o
  CC [M]  fs/overlayfs/export.o
  CC      drivers/acpi/acpica/psobject.o
  CC      drivers/scsi/scsi_common.o
  AR      fs/nfs/built-in.a
  CC      fs/exec.o
  CC      net/ipv4/tcp_offload.o
  CC [M]  fs/cifs/cifs_unicode.o
  AR      drivers/auxdisplay/built-in.a
  CC      drivers/input/serio/serio.o
  CC      drivers/base/memory.o
  CC      drivers/usb/common/common.o
  CC      arch/x86/kernel/i8237.o
  CC      drivers/usb/common/debug.o
  CC      drivers/input/keyboard/atkbd.o
  CC [M]  net/8021q/vlanproc.o
  AR      drivers/input/mouse/built-in.a
  CC [M]  net/sunrpc/auth_gss/gss_rpc_xdr.o
  CC [M]  net/sunrpc/auth_gss/trace.o
  CC      arch/x86/kernel/stacktrace.o
  CC      drivers/usb/core/usb.o
  CC [M]  drivers/gpu/drm/i915/intel_clock_gating.o
  CC [M]  drivers/gpu/drm/nouveau/nvif/fifo.o
  CC      drivers/usb/core/hub.o
  CC      drivers/scsi/sd.o
  CC      drivers/ata/libata-pata-timings.o
  CC      arch/x86/kernel/reboot.o
  CC      drivers/acpi/acpica/psopcode.o
  CC      net/ipv6/raw.o
  CC      drivers/net/phy/mdio_bus.o
  CC      lib/lockref.o
  CC      mm/swapfile.o
  LD [M]  fs/overlayfs/overlay.o
  CC [M]  drivers/gpu/drm/xe/xe_devcoredump.o
  CC      mm/swap_slots.o
  CC      drivers/base/module.o
  CC [M]  drivers/gpu/drm/ast/ast_dp.o
  CC [M]  drivers/net/usb/rtl8150.o
  CC      drivers/scsi/sg.o
  CC      net/bridge/br_multicast.o
  CC [M]  drivers/gpu/drm/xe/xe_device.o
  CC [M]  drivers/gpu/drm/xe/xe_dma_buf.o
  CC      lib/bcd.o
  CC      lib/sort.o
  AR      drivers/usb/phy/built-in.a
  CC      mm/dmapool.o
  AR      drivers/usb/common/built-in.a
  AR      kernel/trace/built-in.a
  CC      net/core/netpoll.o
  CC      kernel/params.o
  CC [M]  drivers/net/usb/r8152.o
  CC      drivers/acpi/acpica/psopinfo.o
  CC      kernel/kthread.o
  CC [M]  drivers/gpu/drm/amd/amdgpu/atombios_crtc.o
  CC      drivers/input/serio/i8042.o
  CC      fs/ext4/sysfs.o
  CC      lib/parser.o
  CC      net/sunrpc/xprtsock.o
  CC [M]  drivers/net/ipvlan/ipvlan_core.o
  CC [M]  drivers/net/vxlan/vxlan_core.o
  CC [M]  drivers/net/ipvlan/ipvlan_main.o
  CC [M]  fs/cifs/nterr.o
  CC      drivers/base/pinctrl.o
  AR      net/8021q/built-in.a
  LD [M]  net/8021q/8021q.o
  CC [M]  drivers/gpu/drm/nouveau/nvif/head.o
  CC      drivers/rtc/lib.o
  CC [M]  fs/cifs/cifsencrypt.o
  CC      drivers/ata/ahci.o
  CC [M]  fs/cifs/readdir.o
  CC      drivers/rtc/class.o
  AR      drivers/input/keyboard/built-in.a
  CC      drivers/input/input.o
  CC      drivers/input/serio/libps2.o
  CC      drivers/acpi/acpica/psparse.o
  CC      lib/debug_locks.o
  CC      net/ipv4/tcp_plb.o
  CC      drivers/acpi/acpica/psscope.o
  CC [M]  drivers/gpu/drm/nouveau/nvif/mem.o
  CC      arch/x86/kernel/msr.o
  CC      drivers/base/devcoredump.o
  LD [M]  drivers/gpu/drm/ast/ast.o
  CC [M]  drivers/gpu/drm/drm_aperture.o
  CC      fs/btrfs/extent_io.o
  CC      lib/random32.o
  CC      mm/hugetlb.o
  CC      fs/btrfs/volumes.o
  CC      fs/ext4/xattr.o
  CC [M]  drivers/gpu/drm/xe/xe_engine.o
  CC      drivers/rtc/interface.o
  CC      fs/btrfs/async-thread.o
  CC      arch/x86/kernel/cpuid.o
  CC      fs/btrfs/ioctl.o
  CC      drivers/base/platform-msi.o
  CC      drivers/acpi/acpica/pstree.o
  CC      drivers/net/phy/mdio_device.o
  CC      kernel/sys_ni.o
  CC      lib/bust_spinlocks.o
  CC      drivers/rtc/nvmem.o
  CC [M]  fs/cifs/ioctl.o
  CC      fs/ext4/xattr_hurd.o
  CC [M]  drivers/gpu/drm/i915/intel_device_info.o
  CC [M]  drivers/gpu/drm/drm_atomic.o
  CC [M]  drivers/net/usb/asix_devices.o
  CC      fs/pipe.o
  CC [M]  drivers/gpu/drm/amd/amdgpu/amdgpu_connectors.o
  CC      lib/kasprintf.o
  CC      fs/btrfs/locking.o
  AR      drivers/input/serio/built-in.a
  CC      kernel/nsproxy.o
  CC      drivers/rtc/dev.o
  CC [M]  drivers/gpu/drm/nouveau/nvif/mmu.o
  CC      drivers/acpi/acpica/psutils.o
  CC      kernel/notifier.o
  CC [M]  net/sunrpc/auth_gss/gss_krb5_mech.o
  CC      arch/x86/kernel/early-quirks.o
  AR      drivers/i2c/algos/built-in.a
  CC      drivers/ata/libahci.o
  CC [M]  drivers/i2c/algos/i2c-algo-bit.o
  CC      drivers/base/physical_location.o
  CC [M]  drivers/net/ipvlan/ipvlan_l3s.o
  CC      drivers/i2c/busses/i2c-designware-common.o
  CC      drivers/i2c/busses/i2c-designware-master.o
  CC      net/ipv4/datagram.o
  CC      net/core/fib_rules.o
  CC      net/ipv6/icmp.o
  CC      lib/bitmap.o
  CC      drivers/i2c/busses/i2c-designware-platdrv.o
  CC      kernel/ksysfs.o
  AR      drivers/net/ethernet/ezchip/built-in.a
  CC      drivers/usb/host/pci-quirks.o
  CC      fs/namei.o
  CC      drivers/net/phy/swphy.o
  CC      drivers/acpi/acpica/pswalk.o
  CC      drivers/net/phy/fixed_phy.o
  CC      drivers/scsi/scsi_sysfs.o
  CC      drivers/rtc/proc.o
  CC      drivers/base/trace.o
  CC      drivers/ata/ata_piix.o
  CC      drivers/input/input-compat.o
  CC [M]  drivers/gpu/drm/drm_atomic_uapi.o
  CC      drivers/usb/host/ehci-hcd.o
  CC [M]  net/sunrpc/auth_gss/gss_krb5_seal.o
  CC [M]  drivers/gpu/drm/nouveau/nvif/outp.o
  CC      drivers/acpi/acpica/psxface.o
  CC [M]  drivers/gpu/drm/i915/intel_memory_region.o
  CC      arch/x86/kernel/smp.o
  CC [M]  drivers/gpu/drm/i915/intel_pcode.o
  CC [M]  drivers/gpu/drm/xe/xe_exec.o
  CC      drivers/rtc/sysfs.o
  CC      kernel/cred.o
  CC      kernel/reboot.o
  CC [M]  drivers/gpu/drm/nouveau/nvif/timer.o
  CC      drivers/net/loopback.o
  CC [M]  net/sunrpc/auth_gss/gss_krb5_unseal.o
  CC [M]  fs/cifs/sess.o
  CC      drivers/net/netconsole.o
  CC      mm/hugetlb_vmemmap.o
  CC [M]  drivers/gpu/drm/i915/intel_region_ttm.o
  CC [M]  drivers/gpu/drm/amd/amdgpu/atom.o
  CC      net/dcb/dcbnl.o
  LD [M]  drivers/net/ipvlan/ipvlan.o
  CC [M]  drivers/gpu/drm/drm_auth.o
  CC      arch/x86/kernel/smpboot.o
  CC      drivers/input/input-mt.o
  CC      drivers/i2c/busses/i2c-designware-baytrail.o
  CC      drivers/acpi/acpica/rsaddr.o
  CC      drivers/net/virtio_net.o
  CC      lib/scatterlist.o
  CC      net/ipv4/raw.o
  AR      drivers/base/built-in.a
  CC      drivers/input/input-poller.o
  CC [M]  drivers/net/phy/phylink.o
  CC      drivers/usb/core/hcd.o
  CC [M]  drivers/gpu/drm/amd/amdgpu/amdgpu_fence.o
  CC      drivers/usb/core/urb.o
  CC      fs/ext4/xattr_trusted.o
  CC [M]  net/sunrpc/auth_gss/gss_krb5_seqnum.o
  CC      drivers/rtc/rtc-mc146818-lib.o
  CC      drivers/acpi/acpica/rscalc.o
  CC [M]  drivers/gpu/drm/nouveau/nvif/vmm.o
  CC      lib/list_sort.o
  CC      lib/uuid.o
  CC      drivers/usb/core/message.o
  CC [M]  drivers/gpu/drm/xe/xe_execlist.o
  AR      drivers/ata/built-in.a
  CC      net/ipv6/mcast.o
  AR      drivers/scsi/built-in.a
  CC      drivers/rtc/rtc-cmos.o
  CC      lib/iov_iter.o
  CC [M]  drivers/gpu/drm/nouveau/nvif/user.o
  AR      drivers/i3c/built-in.a
  CC [M]  drivers/gpu/drm/i915/intel_runtime_pm.o
  CC      net/ipv4/udp.o
  CC      net/core/net-traces.o
  CC [M]  drivers/i2c/busses/i2c-scmi.o
  CC      drivers/input/ff-core.o
  CC      net/core/selftests.o
  CC      lib/clz_ctz.o
  CC      net/core/ptp_classifier.o
  CC      lib/bsearch.o
  CC      fs/btrfs/orphan.o
  CC      kernel/async.o
  CC      drivers/acpi/acpica/rscreate.o
  CC [M]  drivers/i2c/busses/i2c-ccgx-ucsi.o
  CC [M]  drivers/gpu/drm/i915/intel_sbi.o
  CC      fs/ext4/xattr_user.o
  CC [M]  net/sunrpc/auth_gss/gss_krb5_wrap.o
  CC [M]  drivers/gpu/drm/nouveau/nvif/userc361.o
  CC      fs/btrfs/export.o
  CC      kernel/range.o
  CC      net/dcb/dcbevent.o
  CC [M]  drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.o
  CC      drivers/usb/host/ehci-pci.o
  CC      drivers/usb/core/driver.o
  CC      drivers/acpi/acpica/rsdumpinfo.o
  CC      kernel/smpboot.o
  CC [M]  drivers/gpu/drm/nouveau/nvkm/core/client.o
  CC      arch/x86/kernel/tsc_sync.o
  CC      drivers/input/touchscreen.o
  CC [M]  drivers/gpu/drm/nouveau/nvkm/core/engine.o
  CC      net/sunrpc/sched.o
  CC      net/ipv4/udplite.o
  CC      fs/btrfs/tree-log.o
  CC [M]  drivers/i2c/busses/i2c-i801.o
  CC      fs/ext4/fast_commit.o
  CC      kernel/ucount.o
  CC [M]  drivers/gpu/drm/xe/xe_force_wake.o
  CC      net/bridge/br_mdb.o
  AR      drivers/rtc/built-in.a
  CC [M]  drivers/net/vxlan/vxlan_multicast.o
  CC      net/bridge/br_multicast_eht.o
  CC [M]  net/sunrpc/auth_gss/gss_krb5_crypto.o
  CC      fs/fcntl.o
  CC      drivers/acpi/acpica/rsinfo.o
  CC      lib/find_bit.o
  AR      drivers/media/i2c/built-in.a
  CC [M]  drivers/gpu/drm/nouveau/nvkm/core/enum.o
  AR      drivers/media/tuners/built-in.a
  AR      drivers/media/rc/keymaps/built-in.a
  AR      drivers/ptp/built-in.a
  CC      fs/btrfs/free-space-cache.o
  AR      drivers/media/rc/built-in.a
  CC [M]  drivers/gpu/drm/i915/intel_step.o
  CC [M]  drivers/ptp/ptp_clock.o
  AR      drivers/media/common/b2c2/built-in.a
  AR      drivers/media/common/saa7146/built-in.a
  AR      drivers/media/platform/allegro-dvt/built-in.a
  AR      drivers/media/common/siano/built-in.a
  AR      drivers/media/platform/amlogic/meson-ge2d/built-in.a
  AR      drivers/media/common/v4l2-tpg/built-in.a
  AR      drivers/media/platform/amlogic/built-in.a
  AR      drivers/media/common/videobuf2/built-in.a
  AR      drivers/media/pci/ttpci/built-in.a
  AR      drivers/media/platform/amphion/built-in.a
  AR      drivers/media/pci/b2c2/built-in.a
  AR      drivers/media/common/built-in.a
  CC      lib/llist.o
  AR      drivers/media/platform/aspeed/built-in.a
  AR      drivers/media/pci/pluto2/built-in.a
  AR      net/dcb/built-in.a
  CC      fs/ioctl.o
  AR      drivers/media/platform/atmel/built-in.a
  AR      drivers/media/pci/dm1105/built-in.a
  AR      drivers/media/pci/pt1/built-in.a
  AR      drivers/media/platform/cadence/built-in.a
  CC      net/ipv4/udp_offload.o
  AR      drivers/media/pci/pt3/built-in.a
  CC      arch/x86/kernel/setup_percpu.o
  CC      drivers/usb/storage/scsiglue.o
  AR      drivers/media/pci/mantis/built-in.a
  AR      drivers/media/platform/chips-media/built-in.a
  AR      drivers/media/platform/intel/built-in.a
  AR      drivers/media/pci/ngene/built-in.a
  AR      drivers/media/platform/marvell/built-in.a
  CC      drivers/usb/serial/usb-serial.o
  AR      drivers/media/pci/ddbridge/built-in.a
  AR      drivers/media/platform/mediatek/jpeg/built-in.a
  AR      drivers/media/pci/saa7146/built-in.a
  CC      drivers/input/ff-memless.o
  AR      drivers/media/platform/mediatek/mdp/built-in.a
  AR      drivers/media/pci/smipcie/built-in.a
  AR      drivers/media/platform/mediatek/vcodec/built-in.a
  CC      drivers/usb/serial/generic.o
  AR      drivers/media/pci/netup_unidvb/built-in.a
  AR      drivers/media/platform/mediatek/vpu/built-in.a
  CC [M]  fs/cifs/export.o
  AR      drivers/media/platform/mediatek/mdp3/built-in.a
  AR      drivers/media/pci/intel/ipu3/built-in.a
  CC [M]  drivers/i2c/busses/i2c-isch.o
  AR      drivers/media/platform/mediatek/built-in.a
  AR      drivers/media/pci/intel/built-in.a
  AR      drivers/media/pci/built-in.a
  CC [M]  drivers/i2c/busses/i2c-ismt.o
  AR      drivers/media/platform/microchip/built-in.a
  CC      mm/sparse.o
  AR      drivers/media/platform/nvidia/tegra-vde/built-in.a
  AR      drivers/usb/misc/built-in.a
  AR      drivers/media/platform/nvidia/built-in.a
  CC [M]  drivers/usb/misc/ftdi-elan.o
  AR      drivers/media/platform/nxp/dw100/built-in.a
  AR      drivers/media/platform/nxp/imx-jpeg/built-in.a
  AR      drivers/media/platform/nxp/built-in.a
  AR      drivers/media/platform/qcom/camss/built-in.a
  AR      drivers/media/platform/qcom/venus/built-in.a
  CC      net/bridge/br_vlan.o
  AR      drivers/power/reset/built-in.a
  AR      drivers/media/platform/qcom/built-in.a
  CC      drivers/acpi/acpica/rsio.o
  CC      drivers/power/supply/power_supply_core.o
  AR      drivers/media/platform/renesas/rcar-vin/built-in.a
  AR      drivers/media/platform/renesas/rzg2l-cru/built-in.a
  AR      drivers/media/platform/renesas/vsp1/built-in.a
  CC      kernel/regset.o
  AR      drivers/media/platform/renesas/built-in.a
  CC      drivers/usb/gadget/udc/core.o
  AR      drivers/media/platform/rockchip/rga/built-in.a
  AR      drivers/media/platform/rockchip/rkisp1/built-in.a
  AR      drivers/media/platform/rockchip/built-in.a
  CC      drivers/usb/gadget/udc/trace.o
  AR      drivers/media/platform/samsung/exynos-gsc/built-in.a
  AR      drivers/media/platform/samsung/exynos4-is/built-in.a
  AR      drivers/media/platform/samsung/s3c-camif/built-in.a
  AR      drivers/media/platform/samsung/s5p-g2d/built-in.a
  AR      drivers/media/platform/samsung/s5p-jpeg/built-in.a
  AR      drivers/media/platform/samsung/s5p-mfc/built-in.a
  AR      drivers/media/platform/samsung/built-in.a
  AR      drivers/media/platform/st/sti/bdisp/built-in.a
  CC [M]  drivers/net/phy/aquantia_main.o
  AR      drivers/media/platform/st/sti/c8sectpfe/built-in.a
  CC      drivers/acpi/acpica/rsirq.o
  AR      drivers/media/platform/st/sti/delta/built-in.a
  AR      drivers/media/platform/st/sti/hva/built-in.a
  CC [M]  drivers/gpu/drm/drm_blend.o
  AR      drivers/media/platform/st/stm32/built-in.a
  CC      drivers/usb/core/config.o
  AR      drivers/media/platform/st/built-in.a
  AR      drivers/media/platform/sunxi/sun4i-csi/built-in.a
  CC [M]  drivers/gpu/drm/xe/xe_ggtt.o
  AR      drivers/media/platform/sunxi/sun6i-csi/built-in.a
  CC [M]  drivers/net/usb/asix_common.o
  CC [M]  drivers/gpu/drm/nouveau/nvkm/core/event.o
  AR      drivers/media/platform/sunxi/sun6i-mipi-csi2/built-in.a
  CC      arch/x86/kernel/ftrace.o
  AR      drivers/media/platform/sunxi/sun8i-a83t-mipi-csi2/built-in.a
  AR      drivers/media/platform/sunxi/sun8i-di/built-in.a
  AR      drivers/media/platform/sunxi/sun8i-rotate/built-in.a
  AR      drivers/media/platform/sunxi/built-in.a
  AS      arch/x86/kernel/ftrace_64.o
  AR      drivers/media/platform/ti/am437x/built-in.a
  AR      drivers/media/platform/ti/cal/built-in.a
  CC [M]  drivers/gpu/drm/nouveau/nvkm/core/firmware.o
  AR      drivers/media/platform/ti/vpe/built-in.a
  AR      drivers/media/platform/ti/davinci/built-in.a
  AR      drivers/media/platform/ti/omap/built-in.a
  CC      kernel/kmod.o
  AR      drivers/media/platform/ti/omap3isp/built-in.a
  AR      drivers/media/platform/ti/built-in.a
  AR      drivers/media/platform/verisilicon/built-in.a
  AR      drivers/media/platform/via/built-in.a
  AR      drivers/media/platform/xilinx/built-in.a
  AR      drivers/media/platform/built-in.a
  CC [M]  drivers/ptp/ptp_chardev.o
  CC [M]  drivers/gpu/drm/nouveau/nvkm/core/gpuobj.o
  CC      drivers/usb/storage/protocol.o
  AR      drivers/media/usb/b2c2/built-in.a
  CC [M]  drivers/gpu/drm/amd/amdgpu/amdgpu_object.o
  AR      drivers/media/usb/dvb-usb/built-in.a
  AR      drivers/media/usb/dvb-usb-v2/built-in.a
  CC      drivers/input/vivaldi-fmap.o
  AR      drivers/media/usb/s2255/built-in.a
  CC [M]  drivers/gpu/drm/i915/intel_uncore.o
  AR      drivers/media/usb/siano/built-in.a
  CC      net/ipv4/arp.o
  CC      drivers/acpi/acpica/rslist.o
  AR      drivers/media/usb/ttusb-budget/built-in.a
  AR      drivers/media/usb/ttusb-dec/built-in.a
  AR      drivers/media/usb/built-in.a
  AR      drivers/media/mmc/siano/built-in.a
  AR      drivers/media/mmc/built-in.a
  CC [M]  net/sunrpc/auth_gss/gss_krb5_keys.o
  CC [M]  drivers/net/vxlan/vxlan_vnifilter.o
  AR      drivers/media/firewire/built-in.a
  CC [M]  drivers/gpu/drm/i915/intel_wakeref.o
  CC      drivers/usb/host/ohci-hcd.o
  AR      drivers/media/spi/built-in.a
  CC [M]  fs/cifs/unc.o
  AR      drivers/media/test-drivers/built-in.a
  AR      drivers/media/built-in.a
  CC      lib/memweight.o
  CC [M]  drivers/gpu/drm/amd/amdgpu/amdgpu_gart.o
  CC [M]  drivers/gpu/drm/nouveau/nvkm/core/intr.o
  CC [M]  drivers/gpu/drm/nouveau/nvkm/core/ioctl.o
  CC      mm/sparse-vmemmap.o
  CC      drivers/power/supply/power_supply_sysfs.o
  CC [M]  drivers/gpu/drm/nouveau/nvkm/core/memory.o
  CC [M]  drivers/i2c/busses/i2c-piix4.o
  CC [M]  drivers/gpu/drm/nouveau/nvkm/core/mm.o
  CC      drivers/input/input-leds.o
  CC      drivers/hwmon/hwmon.o
  CC      lib/kfifo.o
  CC      drivers/acpi/acpica/rsmemory.o
  CC      drivers/usb/serial/bus.o
  CC [M]  drivers/net/phy/aquantia_hwmon.o
  CC      arch/x86/kernel/trace_clock.o
  CC      drivers/usb/serial/console.o
  CC      net/ipv6/reassembly.o
  CC [M]  drivers/gpu/drm/xe/xe_gt.o
  CC      drivers/usb/storage/transport.o
  CC      net/ipv4/icmp.o
  CC      arch/x86/kernel/trace.o
  AR      drivers/thermal/broadcom/built-in.a
  CC      kernel/groups.o
  AR      drivers/thermal/samsung/built-in.a
  CC      kernel/kcmp.o
  CC      drivers/thermal/intel/intel_tcc.o
  CC [M]  drivers/gpu/drm/i915/vlv_sideband.o
  CC [M]  drivers/ptp/ptp_sysfs.o
  CC      net/core/netprio_cgroup.o
  CC      net/core/dst_cache.o
  CC      drivers/power/supply/power_supply_leds.o
  AR      drivers/usb/gadget/udc/built-in.a
  CC      drivers/acpi/acpica/rsmisc.o
  CC [M]  drivers/net/usb/ax88172a.o
  AR      drivers/usb/gadget/function/built-in.a
  CC      drivers/usb/core/file.o
  AR      drivers/usb/gadget/legacy/built-in.a
  CC      drivers/usb/gadget/usbstring.o
  CC      drivers/thermal/intel/therm_throt.o
  LD [M]  net/sunrpc/auth_gss/auth_rpcgss.o
  CC      drivers/input/mousedev.o
  CC      mm/mmu_notifier.o
  LD [M]  net/sunrpc/auth_gss/rpcsec_gss_krb5.o
  CC      drivers/net/net_failover.o
  CC      mm/ksm.o
  CC      fs/ext4/orphan.o
  CC [M]  drivers/net/usb/ax88179_178a.o
  CC      net/sunrpc/auth.o
  CC      arch/x86/kernel/rethook.o
  CC      net/sunrpc/auth_null.o
  CC [M]  drivers/gpu/drm/amd/amdgpu/amdgpu_encoders.o
  CC      lib/percpu-refcount.o
  CC      drivers/usb/host/ohci-pci.o
  CC      drivers/usb/serial/ftdi_sio.o
  CC      drivers/usb/serial/pl2303.o
  CC [M]  drivers/net/phy/ax88796b.o
  CC [M]  drivers/gpu/drm/nouveau/nvkm/core/object.o
  CC [M]  fs/cifs/winucase.o
  CC      drivers/watchdog/watchdog_core.o
  CC      drivers/power/supply/power_supply_hwmon.o
  CC [M]  fs/cifs/smb2ops.o
  CC [M]  drivers/md/persistent-data/dm-array.o
  CC [M]  drivers/i2c/busses/i2c-designware-pcidrv.o
  CC      drivers/acpi/acpica/rsserial.o
  CC [M]  drivers/md/persistent-data/dm-bitset.o
  CC      drivers/usb/gadget/config.o
  CC      kernel/freezer.o
  CC [M]  drivers/hwmon/acpi_power_meter.o
  CC [M]  drivers/net/dummy.o
  CC      drivers/usb/core/buffer.o
  CC [M]  drivers/ptp/ptp_vclock.o
  CC [M]  drivers/gpu/drm/xe/xe_gt_clock.o
  CC      drivers/usb/storage/usb.o
  CC      arch/x86/kernel/crash_core_64.o
  CC [M]  drivers/net/usb/cdc_ether.o
  CC      drivers/input/evdev.o
  CC [M]  drivers/gpu/drm/amd/amdgpu/amdgpu_display.o
  CC      drivers/acpi/acpica/rsutils.o
  AR      drivers/power/supply/built-in.a
  CC      net/sunrpc/auth_unix.o
  AR      drivers/power/built-in.a
  CC      drivers/usb/core/sysfs.o
  CC [M]  drivers/gpu/drm/i915/vlv_suspend.o
  CC      drivers/usb/storage/initializers.o
  CC      lib/rhashtable.o
  CC      lib/base64.o
  CC [M]  drivers/thermal/intel/x86_pkg_temp_thermal.o
  CC [M]  drivers/net/phy/bcm7xxx.o
  CC      drivers/usb/core/endpoint.o
  CC      net/core/gro_cells.o
  CC      net/bridge/br_vlan_tunnel.o
  CC      lib/once.o
  CC      fs/btrfs/zlib.o
  LD [M]  drivers/net/vxlan/vxlan.o
  CC      net/ipv6/tcp_ipv6.o
  AR      fs/ext4/built-in.a
  CC      drivers/usb/host/uhci-hcd.o
  CC [M]  fs/cifs/smb2maperror.o
  CC [M]  drivers/gpu/drm/i915/soc/intel_dram.o
  CC [M]  drivers/md/persistent-data/dm-block-manager.o
  CC [M]  drivers/md/persistent-data/dm-space-map-common.o
  CC      fs/readdir.o
  CC      drivers/usb/gadget/epautoconf.o
  CC [M]  drivers/gpu/drm/nouveau/nvkm/core/oproxy.o
  CC [M]  drivers/gpu/drm/nouveau/nvkm/core/option.o
  CC      drivers/watchdog/watchdog_dev.o
  CC      arch/x86/kernel/module.o
  LD [M]  drivers/i2c/busses/i2c-designware-pci.o
  CC [M]  fs/cifs/smb2transport.o
  AR      drivers/i2c/busses/built-in.a
  AR      drivers/i2c/muxes/built-in.a
  CC [M]  drivers/i2c/muxes/i2c-mux-gpio.o
  CC      drivers/usb/core/devio.o
  CC      kernel/stacktrace.o
  CC [M]  drivers/hwmon/coretemp.o
  CC      drivers/acpi/acpica/rsxface.o
  CC [M]  drivers/gpu/drm/xe/xe_gt_debugfs.o
  CC [M]  drivers/net/phy/bcm87xx.o
  CC [M]  drivers/ptp/ptp_kvm_x86.o
  CC      net/ipv6/ping.o
  CC      drivers/usb/gadget/composite.o
  AR      drivers/usb/serial/built-in.a
  CC      net/ipv4/devinet.o
  CC      lib/refcount.o
  CC [M]  fs/cifs/smb2misc.o
  CC [M]  drivers/gpu/drm/xe/xe_gt_idle_sysfs.o
  CC [M]  drivers/net/macvlan.o
  CC      drivers/usb/storage/sierra_ms.o
  CC [M]  drivers/thermal/intel/intel_menlow.o
  CC      arch/x86/kernel/early_printk.o
  CC [M]  drivers/net/usb/cdc_eem.o
  CC [M]  drivers/gpu/drm/drm_bridge.o
  CC      lib/usercopy.o
  CC      net/sunrpc/svc.o
  CC      mm/slub.o
  CC      net/core/failover.o
  CC      fs/btrfs/lzo.o
  AR      drivers/net/ethernet/fungible/built-in.a
  CC      drivers/usb/core/notify.o
  CC      drivers/acpi/acpica/tbdata.o
  CC      fs/select.o
  AR      drivers/net/ethernet/huawei/built-in.a
  CC      fs/dcache.o
  AR      drivers/input/built-in.a
  CC [M]  drivers/net/ethernet/intel/e1000/e1000_main.o
  CC      drivers/usb/core/generic.o
  CC      drivers/acpi/acpica/tbfadt.o
  CC      kernel/dma.o
  CC      drivers/opp/core.o
  CC [M]  drivers/gpu/drm/nouveau/nvkm/core/ramht.o
  CC      drivers/opp/cpu.o
  CC [M]  drivers/ptp/ptp_kvm_common.o
  CC [M]  drivers/gpu/drm/nouveau/nvkm/core/subdev.o
  CC      arch/x86/kernel/hpet.o
  CC      drivers/opp/debugfs.o
  CC      drivers/i2c/i2c-boardinfo.o
  CC [M]  drivers/net/phy/bcm-phy-lib.o
  CC      drivers/acpi/acpica/tbfind.o
  CC      lib/errseq.o
  CC [M]  drivers/md/persistent-data/dm-space-map-disk.o
  CC      drivers/watchdog/softdog.o
  AR      drivers/hwmon/built-in.a
  LD [M]  drivers/ptp/ptp.o
  CC      drivers/cpufreq/cpufreq.o
  CC      lib/bucket_locks.o
  CC [M]  drivers/gpu/drm/xe/xe_gt_mcr.o
  CC [M]  drivers/gpu/drm/i915/soc/intel_gmch.o
  CC      lib/generic-radix-tree.o
  CC      drivers/usb/storage/option_ms.o
  CC      drivers/usb/storage/usual-tables.o
  CC      mm/migrate.o
  CC      drivers/cpufreq/freq_table.o
  CC      net/bridge/br_vlan_options.o
  AR      drivers/thermal/intel/built-in.a
  CC      drivers/cpufreq/cpufreq_performance.o
  CC      kernel/smp.o
  AR      drivers/thermal/st/built-in.a
  CC      mm/migrate_device.o
  AR      drivers/thermal/qcom/built-in.a
  CC [M]  drivers/gpu/drm/amd/amdgpu/amdgpu_i2c.o
  AR      drivers/thermal/tegra/built-in.a
  AR      drivers/thermal/mediatek/built-in.a
  CC      drivers/thermal/thermal_core.o
  CC      drivers/usb/host/xhci.o
  CC      drivers/cpuidle/governors/menu.o
  CC      drivers/cpuidle/cpuidle.o
  CC      drivers/acpi/acpica/tbinstal.o
  CC [M]  drivers/net/usb/smsc75xx.o
  AR      drivers/watchdog/built-in.a
  CC [M]  drivers/gpu/drm/drm_cache.o
  CC      drivers/cpuidle/driver.o
  CC      drivers/usb/host/xhci-mem.o
  AR      net/core/built-in.a
  CC      drivers/usb/core/quirks.o
  CC      net/ipv4/af_inet.o
  LD [M]  drivers/ptp/ptp_kvm.o
  CC      fs/inode.o
  CC      drivers/i2c/i2c-core-base.o
  CC      lib/string_helpers.o
  CC [M]  drivers/gpu/drm/i915/soc/intel_pch.o
  CC [M]  fs/cifs/smb2pdu.o
  CC [M]  drivers/md/persistent-data/dm-space-map-metadata.o
  CC      drivers/i2c/i2c-core-smbus.o
  CC      drivers/usb/core/devices.o
  CC [M]  drivers/gpu/drm/xe/xe_gt_pagefault.o
  AR      drivers/usb/storage/built-in.a
  CC      drivers/cpuidle/governors/haltpoll.o
  CC      fs/btrfs/zstd.o
  CC [M]  drivers/gpu/drm/nouveau/nvkm/core/uevent.o
  CC      drivers/acpi/acpica/tbprint.o
  CC [M]  drivers/gpu/drm/nouveau/nvkm/nvfw/fw.o
  CC      arch/x86/kernel/amd_nb.o
  CC      drivers/cpuidle/governor.o
  CC [M]  drivers/net/phy/broadcom.o
  CC      drivers/mmc/core/core.o
  CC      kernel/uid16.o
  CC      drivers/mmc/core/bus.o
  CC      drivers/mmc/host/sdhci.o
  AR      drivers/ufs/built-in.a
  CC      net/l3mdev/l3mdev.o
  CC      drivers/mmc/host/sdhci-pci-core.o
  CC      drivers/usb/gadget/functions.o
  CC      kernel/kallsyms.o
  CC      drivers/usb/core/phy.o
  CC      drivers/acpi/acpica/tbutils.o
  CC      fs/btrfs/compression.o
  AR      drivers/opp/built-in.a
  CC [M]  drivers/gpu/drm/nouveau/nvkm/nvfw/hs.o
  CC [M]  drivers/gpu/drm/amd/amdgpu/amdgpu_gem.o
  CC      net/ipv6/exthdrs.o
  CC      drivers/cpuidle/sysfs.o
  CC      drivers/cpuidle/poll_state.o
  CC      lib/hexdump.o
  CC [M]  drivers/md/persistent-data/dm-transaction-manager.o
  CC      drivers/cpuidle/cpuidle-haltpoll.o
  CC [M]  drivers/gpu/drm/nouveau/nvkm/nvfw/ls.o
  AR      drivers/cpuidle/governors/built-in.a
  CC      net/ipv4/igmp.o
  CC      net/bridge/br_mst.o
  CC      drivers/acpi/acpica/tbxface.o
  CC [M]  drivers/gpu/drm/i915/i915_memcpy.o
  CC      net/sunrpc/svcsock.o
  CC      net/sunrpc/svcauth.o
  CC      lib/kstrtox.o
  CC      arch/x86/kernel/kvm.o
  CC      drivers/acpi/acpica/tbxfload.o
  CC      net/sunrpc/svcauth_unix.o
  CC      drivers/mmc/core/host.o
  CC      drivers/cpufreq/cpufreq_ondemand.o
  CC      drivers/usb/gadget/configfs.o
  CC      drivers/thermal/thermal_sysfs.o
  CC      fs/attr.o
  CC      net/sunrpc/addr.o
  CC [M]  drivers/net/phy/lxt.o
  CC      net/ipv4/fib_frontend.o
  CC      drivers/usb/core/port.o
  CC [M]  drivers/gpu/drm/i915/i915_mm.o
  AR      net/l3mdev/built-in.a
  CC [M]  drivers/md/persistent-data/dm-btree.o
  CC      drivers/cpufreq/cpufreq_governor.o
  AR      drivers/cpuidle/built-in.a
  CC [M]  drivers/net/mii.o
  CC      fs/bad_inode.o
  CC      drivers/i2c/i2c-core-acpi.o
  CC      drivers/i2c/i2c-core-slave.o
  CC [M]  drivers/gpu/drm/xe/xe_gt_sysfs.o
  CC      drivers/thermal/thermal_trip.o
  CC      lib/debug_info.o
  CC      drivers/cpufreq/cpufreq_governor_attr_set.o
  CC [M]  drivers/net/mdio.o
  CC [M]  drivers/net/usb/smsc95xx.o
  CC      drivers/acpi/acpica/tbxfroot.o
  CC [M]  fs/cifs/smb2inode.o
  AR      drivers/leds/trigger/built-in.a
  CC [M]  drivers/leds/trigger/ledtrig-audio.o
  CC [M]  drivers/net/usb/mcs7830.o
  CC [M]  drivers/gpu/drm/nouveau/nvkm/nvfw/acr.o
  CC      kernel/acct.o
  CC      fs/file.o
  CC      kernel/crash_core.o
  CC      net/sunrpc/rpcb_clnt.o
  CC      net/sunrpc/timer.o
  CC      net/sunrpc/xdr.o
  CC      drivers/thermal/thermal_helpers.o
  AR      drivers/leds/blink/built-in.a
  CC      drivers/thermal/thermal_hwmon.o
  CC      drivers/i2c/i2c-dev.o
  AR      drivers/leds/simple/built-in.a
  CC [M]  drivers/net/usb/usbnet.o
  CC      drivers/acpi/acpica/utaddress.o
  CC      drivers/leds/led-core.o
  CC [M]  drivers/net/ethernet/intel/e1000e/82571.o
  CC      drivers/leds/led-class.o
  CC [M]  drivers/gpu/drm/amd/amdgpu/amdgpu_ring.o
  CC      drivers/cpufreq/acpi-cpufreq.o
  CC      net/ipv4/fib_semantics.o
  CC [M]  drivers/gpu/drm/xe/xe_gt_tlb_invalidation.o
  CC      drivers/usb/core/hcd-pci.o
  CC [M]  drivers/net/phy/realtek.o
  CC [M]  net/bridge/br_netfilter_hooks.o
  CC [M]  drivers/gpu/drm/nouveau/nvkm/nvfw/flcn.o
  CC [M]  drivers/i2c/i2c-smbus.o
  CC      arch/x86/kernel/kvmclock.o
  CC [M]  drivers/gpu/drm/i915/i915_sw_fence.o
  CC      fs/btrfs/delayed-ref.o
  CC [M]  drivers/net/ethernet/intel/e1000/e1000_hw.o
  CC [M]  drivers/gpu/drm/i915/i915_sw_fence_work.o
  CC      drivers/acpi/acpica/utalloc.o
  CC      drivers/mmc/core/mmc.o
  CC [M]  drivers/net/usb/cdc_ncm.o
  CC [M]  drivers/net/tun.o
  CC [M]  drivers/usb/class/usbtmc.o
  CC [M]  net/bridge/br_netfilter_ipv6.o
  CC      drivers/thermal/gov_fair_share.o
  CC [M]  drivers/md/persistent-data/dm-btree-remove.o
  CC [M]  drivers/gpu/drm/xe/xe_gt_topology.o
  AR      drivers/net/ethernet/i825xx/built-in.a
  CC      kernel/compat.o
  CC      drivers/usb/gadget/u_f.o
  CC      mm/huge_memory.o
  CC [M]  drivers/net/usb/r8153_ecm.o
  CC      lib/iomap.o
  CC      drivers/leds/led-triggers.o
  CC      drivers/mmc/core/mmc_ops.o
  CC [M]  fs/cifs/smb2file.o
  CC      net/ipv6/datagram.o
  CC      drivers/acpi/acpica/utascii.o
  CC      net/sunrpc/sunrpc_syms.o
  CC      net/sunrpc/cache.o
  CC [M]  drivers/gpu/drm/i915/i915_syncmap.o
  AR      drivers/net/ethernet/microsoft/built-in.a
  CC [M]  drivers/i2c/i2c-mux.o
  AR      drivers/net/ethernet/litex/built-in.a
  CC      drivers/usb/host/xhci-ext-caps.o
  CC      arch/x86/kernel/paravirt.o
  CC      fs/btrfs/relocation.o
  CC      drivers/thermal/gov_step_wise.o
  CC      drivers/cpufreq/intel_pstate.o
  CC      drivers/usb/core/usb-acpi.o
  CC [M]  drivers/gpu/drm/drm_client.o
  CC [M]  drivers/gpu/drm/nouveau/nvkm/falcon/base.o
  AR      drivers/usb/gadget/built-in.a
  CC [M]  drivers/net/phy/smsc.o
  CC [M]  drivers/gpu/drm/amd/amdgpu/amdgpu_cs.o
  CC [M]  drivers/gpu/drm/nouveau/nvkm/falcon/cmdq.o
  CC [M]  drivers/gpu/drm/nouveau/nvkm/falcon/fw.o
  CC [M]  drivers/gpu/drm/amd/amdgpu/amdgpu_bios.o
  CC      drivers/acpi/acpica/utbuffer.o
  CC      fs/btrfs/delayed-inode.o
  HOSTCC  drivers/gpu/drm/xe/xe_gen_wa_oob
  CC      fs/filesystems.o
  CC [M]  drivers/net/ethernet/intel/e1000e/ich8lan.o
  CC [M]  drivers/md/persistent-data/dm-btree-spine.o
  AR      drivers/leds/built-in.a
  CC      kernel/utsname.o
  CC [M]  drivers/gpu/drm/drm_client_modeset.o
  CC      lib/pci_iomap.o
  CC [M]  drivers/gpu/drm/drm_color_mgmt.o
  CC      drivers/acpi/acpica/utcksum.o
  CC [M]  drivers/gpu/drm/xe/xe_guc_ads.o
  CC      lib/iomap_copy.o
  CC      drivers/thermal/gov_user_space.o
  CC [M]  fs/cifs/cifsacl.o
  CC      net/sunrpc/rpc_pipe.o
  CC      drivers/usb/host/xhci-ring.o
  CC      drivers/mmc/host/sdhci-pci-o2micro.o
  CC      net/ipv4/fib_trie.o
  CC [M]  drivers/gpu/drm/xe/xe_guc_ct.o
  CC      arch/x86/kernel/pvclock.o
  CC      mm/khugepaged.o
  AR      drivers/usb/core/built-in.a
  CC      arch/x86/kernel/pcspeaker.o
  CC      fs/btrfs/scrub.o
  CC      arch/x86/kernel/check.o
  AR      drivers/i2c/built-in.a
  CC      drivers/usb/host/xhci-hub.o
  CC      drivers/acpi/acpica/utcopy.o
  CC [M]  drivers/gpu/drm/amd/amdgpu/amdgpu_benchmark.o
  CC      drivers/mmc/core/sd.o
  CC [M]  drivers/gpu/drm/i915/i915_user_extensions.o
  CC      kernel/user_namespace.o
  AR      drivers/thermal/built-in.a
  CC      fs/btrfs/backref.o
  AR      drivers/firmware/arm_ffa/built-in.a
  AR      drivers/firmware/arm_scmi/built-in.a
  AR      drivers/firmware/broadcom/built-in.a
  AR      drivers/firmware/cirrus/built-in.a
  AR      drivers/firmware/meson/built-in.a
  CC      lib/devres.o
  CC      drivers/firmware/efi/efi-bgrt.o
  LD [M]  drivers/net/phy/aquantia.o
  AR      drivers/net/phy/built-in.a
  LD [M]  drivers/md/persistent-data/dm-persistent-data.o
  CC      drivers/md/md.o
  CC      kernel/pid_namespace.o
  AR      drivers/firmware/imx/built-in.a
  CC      drivers/usb/host/xhci-dbg.o
  UPD     kernel/config_data
  CC      drivers/firmware/efi/libstub/efi-stub-helper.o
  CC      drivers/firmware/efi/libstub/gop.o
  CC [M]  drivers/gpu/drm/nouveau/nvkm/falcon/msgq.o
  CC      drivers/md/md-bitmap.o
  CC      drivers/usb/host/xhci-trace.o
  LD [M]  drivers/net/usb/asix.o
  CC [M]  drivers/net/veth.o
  CC      drivers/mmc/core/sd_ops.o
  CC [M]  drivers/gpu/drm/nouveau/nvkm/falcon/qmgr.o
  CC [M]  drivers/gpu/drm/nouveau/nvkm/falcon/v1.o
  CC [M]  drivers/gpu/drm/i915/i915_ioc32.o
  CC      kernel/stop_machine.o
  AR      net/bridge/built-in.a
  LD [M]  net/bridge/br_netfilter.o
  CC [M]  drivers/net/ethernet/intel/e1000/e1000_ethtool.o
  CC [M]  drivers/net/ethernet/intel/e1000/e1000_param.o
  CC      drivers/mmc/core/sdio.o
  CC      drivers/mmc/core/sdio_ops.o
  CC      arch/x86/kernel/uprobes.o
  CC      net/ipv6/ip6_flowlabel.o
  AR      drivers/crypto/stm32/built-in.a
  CC      drivers/acpi/acpica/utexcep.o
  AR      drivers/crypto/xilinx/built-in.a
  AR      drivers/crypto/hisilicon/built-in.a
  AR      drivers/crypto/keembay/built-in.a
  AR      drivers/crypto/built-in.a
  CC      drivers/acpi/acpica/utdebug.o
  CC      drivers/mmc/host/sdhci-pci-arasan.o
  CC      drivers/firmware/efi/efi.o
  CC [M]  net/bluetooth/af_bluetooth.o
  CC [M]  net/dns_resolver/dns_key.o
  CC [M]  drivers/gpu/drm/xe/xe_guc_debugfs.o
  CC      lib/check_signature.o
  CC      drivers/md/md-autodetect.o
  CC      drivers/firmware/efi/libstub/secureboot.o
  CC      net/devres.o
  CC      drivers/firmware/efi/libstub/tpm.o
  CC      drivers/acpi/acpica/utdecode.o
  CC [M]  net/bluetooth/hci_core.o
  CC [M]  net/bluetooth/hci_conn.o
  CC      drivers/usb/host/xhci-debugfs.o
  CC      lib/interval_tree.o
  CC [M]  drivers/gpu/drm/drm_connector.o
  CC [M]  drivers/gpu/drm/nouveau/nvkm/falcon/gm200.o
  AR      drivers/cpufreq/built-in.a
  CC [M]  drivers/gpu/drm/nouveau/nvkm/falcon/gp102.o
  CC      drivers/usb/host/xhci-pci.o
  CC [M]  drivers/gpu/drm/nouveau/nvkm/falcon/ga100.o
  CC      fs/btrfs/ulist.o
  CC      kernel/kprobes.o
  CC [M]  drivers/gpu/drm/amd/amdgpu/atombios_dp.o
  CC      arch/x86/kernel/perf_regs.o
  CC [M]  drivers/gpu/drm/i915/i915_debugfs.o
  CC      kernel/hung_task.o
  CC [M]  fs/cifs/fs_context.o
  CC      lib/assoc_array.o
  CC      net/sunrpc/sysfs.o
  CC      drivers/mmc/host/sdhci-pci-dwc-mshc.o
  CC [M]  net/dns_resolver/dns_query.o
  CC      drivers/acpi/acpica/utdelete.o
  CC      drivers/mmc/host/sdhci-pci-gli.o
  CC [M]  drivers/gpu/drm/nouveau/nvkm/falcon/ga102.o
  CC [M]  drivers/gpu/drm/xe/xe_guc_hwconfig.o
  CC      fs/namespace.o
  CC      drivers/mmc/core/sdio_bus.o
  CC      drivers/mmc/core/sdio_cis.o
  CC      fs/seq_file.o
  CC      drivers/md/dm-uevent.o
  CC      drivers/firmware/efi/libstub/file.o
  CC      net/sunrpc/svc_xprt.o
  CC      kernel/watchdog.o
  CC      kernel/watchdog_hld.o
  CC      arch/x86/kernel/tracepoint.o
  CC      drivers/acpi/acpica/uterror.o
  CC [M]  drivers/net/ethernet/intel/igb/igb_main.o
  CC      drivers/acpi/acpica/uteval.o
  CC      net/ipv6/inet6_connection_sock.o
  CC [M]  drivers/net/ethernet/intel/igb/igb_ethtool.o
  CC [M]  drivers/net/ethernet/intel/e1000e/80003es2lan.o
  CC      net/ipv6/udp_offload.o
  CC [M]  drivers/net/ethernet/intel/e1000e/mac.o
  CC      drivers/acpi/acpica/utglobal.o
  LD [M]  net/dns_resolver/dns_resolver.o
  CC      drivers/mmc/core/sdio_io.o
  CC      mm/page_counter.o
  CC [M]  drivers/gpu/drm/nouveau/nvkm/subdev/acr/base.o
  CC      fs/btrfs/qgroup.o
  CC      kernel/seccomp.o
  LD [M]  drivers/net/ethernet/intel/e1000/e1000.o
  CC      drivers/md/dm.o
  CC      mm/memcontrol.o
  CC      lib/list_debug.o
  CC      drivers/firmware/efi/vars.o
  CC [M]  drivers/gpu/drm/nouveau/nvkm/subdev/acr/lsfw.o
  CC      arch/x86/kernel/itmt.o
  CC      net/ipv4/fib_notifier.o
  CC [M]  drivers/gpu/drm/nouveau/nvkm/subdev/acr/gm200.o
  CC [M]  drivers/gpu/drm/xe/xe_guc_log.o
  CC      drivers/firmware/efi/libstub/mem.o
  CC [M]  drivers/gpu/drm/nouveau/nvkm/subdev/acr/gm20b.o
  CC      drivers/mmc/core/sdio_irq.o
  CC [M]  drivers/gpu/drm/amd/amdgpu/amdgpu_afmt.o
  CC [M]  drivers/gpu/drm/amd/amdgpu/amdgpu_trace_points.o
  CC [M]  net/bluetooth/hci_event.o
  CC      drivers/mmc/core/slot-gpio.o
  CC      drivers/acpi/acpica/uthex.o
  CC      lib/debugobjects.o
  AR      drivers/net/ethernet/microchip/built-in.a
  AR      drivers/net/ethernet/mscc/built-in.a
  CC      lib/bitrev.o
  CC      net/socket.o
  CC      net/sunrpc/xprtmultipath.o
  AR      drivers/firmware/psci/built-in.a
  CC      net/compat.o
  AR      drivers/firmware/smccc/built-in.a
  AR      drivers/net/ethernet/neterion/built-in.a
  CC [M]  drivers/gpu/drm/nouveau/nvkm/subdev/acr/gp102.o
  CC [M]  net/bluetooth/mgmt.o
  CC      drivers/firmware/efi/reboot.o
  CC      drivers/firmware/efi/libstub/random.o
  CC [M]  drivers/gpu/drm/i915/i915_debugfs_params.o
  CC      fs/btrfs/send.o
  CC      drivers/mmc/host/sdhci-acpi.o
  AR      drivers/usb/host/built-in.a
  CC      drivers/firmware/efi/memattr.o
  AR      drivers/usb/built-in.a
  CC [M]  drivers/gpu/drm/nouveau/nvkm/subdev/acr/gp108.o
  CC      drivers/acpi/acpica/utids.o
  CC      lib/crc16.o
  CC      arch/x86/kernel/umip.o
  CC [M]  drivers/gpu/drm/i915/display/intel_display_debugfs.o
  CC      kernel/relay.o
  CC      drivers/firmware/efi/tpm.o
  CC      drivers/mmc/host/cqhci-core.o
  CC [M]  drivers/net/ethernet/intel/igb/e1000_82575.o
  CC [M]  net/bluetooth/hci_sock.o
  CC [M]  drivers/gpu/drm/xe/xe_guc_pc.o
  CC      drivers/mmc/core/regulator.o
  CC      drivers/clocksource/acpi_pm.o
  CC      net/ipv6/seg6.o
  CC      drivers/hid/usbhid/hid-core.o
  CC      net/ipv6/fib6_notifier.o
  AR      drivers/staging/media/built-in.a
  CC      net/ipv4/inet_fragment.o
  AR      drivers/staging/built-in.a
  CC      drivers/hid/hid-core.o
  CC      drivers/firmware/efi/libstub/randomalloc.o
  CC      drivers/mmc/core/debugfs.o
  CC      net/ipv6/rpl.o
  CC      kernel/utsname_sysctl.o
  CC [M]  net/bluetooth/hci_sysfs.o
  CC [M]  drivers/gpu/drm/nouveau/nvkm/subdev/acr/gv100.o
  CC [M]  fs/cifs/dns_resolve.o
  CC      drivers/acpi/acpica/utinit.o
  CC [M]  drivers/net/ethernet/intel/e1000e/manage.o
  CC [M]  drivers/gpu/drm/nouveau/nvkm/subdev/acr/gp10b.o
  CC      net/ipv4/ping.o
  CC      lib/crc-t10dif.o
  CC      kernel/delayacct.o
  CC      arch/x86/kernel/unwind_orc.o
  CC      kernel/taskstats.o
  CC [M]  drivers/net/ethernet/intel/e1000e/nvm.o
  AR      drivers/platform/x86/amd/built-in.a
  AR      drivers/platform/surface/built-in.a
  HOSTCC  lib/gen_crc32table
  CC      drivers/platform/x86/p2sb.o
  CC      drivers/platform/x86/intel/pmc/core.o
  CC      drivers/clocksource/i8253.o
  CC      drivers/platform/x86/intel/pmc/spt.o
  CC      drivers/platform/x86/pmc_atom.o
  CC      drivers/platform/x86/intel/pmc/cnp.o
  CC      drivers/acpi/acpica/utlock.o
  CC      drivers/platform/x86/intel/pmc/icl.o
  CC      drivers/firmware/efi/libstub/pci.o
  CC      kernel/tsacct.o
  CC      drivers/mmc/core/block.o
  CC      drivers/firmware/efi/libstub/skip_spaces.o
  CC      lib/libcrc32c.o
  CC      net/sunrpc/stats.o
  CC [M]  drivers/gpu/drm/i915/display/intel_pipe_crc.o
  CC [M]  drivers/gpu/drm/nouveau/nvkm/subdev/acr/tu102.o
  CC      net/ipv6/ioam6.o
  CC      kernel/tracepoint.o
  AR      drivers/clocksource/built-in.a
  CC [M]  drivers/platform/x86/wmi.o
  CC      net/ipv6/sysctl_net_ipv6.o
  CC [M]  drivers/gpu/drm/i915/i915_pmu.o
  CC      drivers/mailbox/mailbox.o
  CC      drivers/acpi/acpica/utmath.o
  CC      drivers/mailbox/pcc.o
  CC [M]  drivers/gpu/drm/i915/gt/gen2_engine_cs.o
  CC      net/ipv4/ip_tunnel_core.o
  CC [M]  drivers/gpu/drm/xe/xe_guc_submit.o
  CC      drivers/acpi/acpica/utmisc.o
  CC      drivers/firmware/efi/libstub/lib-cmdline.o
  AR      drivers/net/ethernet/netronome/built-in.a
  CC [M]  drivers/gpu/drm/xe/xe_hw_engine.o
  CC [M]  drivers/mmc/host/sdhci-pltfm.o
  CC [M]  drivers/gpu/drm/i915/gt/gen6_engine_cs.o
  CC      lib/xxhash.o
  CC [M]  drivers/net/ethernet/intel/igb/e1000_mac.o
  CC [M]  drivers/net/ethernet/intel/e1000e/phy.o
  CC      arch/x86/kernel/callthunks.o
  ASN.1   fs/cifs/cifs_spnego_negtokeninit.asn1.[ch]
  CC [M]  fs/cifs/smb1ops.o
  CC [M]  drivers/net/ethernet/intel/e1000e/param.o
  CC      drivers/firmware/efi/libstub/lib-ctype.o
  CC [M]  drivers/gpu/drm/amd/amdgpu/atombios_encoders.o
  CC [M]  drivers/net/ethernet/intel/e1000e/ethtool.o
  CC [M]  drivers/net/ethernet/intel/igb/e1000_nvm.o
  CC      drivers/firmware/efi/libstub/alignedmem.o
  CC [M]  drivers/gpu/drm/xe/xe_hw_fence.o
  CC [M]  drivers/gpu/drm/xe/xe_huc.o
  CC      drivers/hid/usbhid/hiddev.o
  CC      drivers/acpi/acpica/utmutex.o
  AR      drivers/firmware/tegra/built-in.a
  AR      drivers/firmware/xilinx/built-in.a
  CC      drivers/firmware/dmi_scan.o
  AR      drivers/net/ethernet/ni/built-in.a
  CC [M]  net/bluetooth/l2cap_core.o
  CC      net/ipv4/gre_offload.o
  AR      drivers/net/ethernet/packetengines/built-in.a
  CC      net/ipv4/metrics.o
  AR      drivers/net/ethernet/realtek/built-in.a
  AR      drivers/net/ethernet/renesas/built-in.a
  CC [M]  drivers/net/ethernet/realtek/8139cp.o
  CC [M]  drivers/net/ethernet/intel/igb/e1000_phy.o
  CC      drivers/acpi/acpica/utnonansi.o
  CC      kernel/latencytop.o
  CC [M]  drivers/gpu/drm/nouveau/nvkm/subdev/acr/ga100.o
  AR      drivers/mailbox/built-in.a
  CC [M]  drivers/net/ethernet/intel/igb/e1000_mbx.o
  CC      lib/genalloc.o
  CC      drivers/platform/x86/intel/pmc/tgl.o
  AR      drivers/mmc/host/built-in.a
  CC      drivers/firmware/dmi-sysfs.o
  CC [M]  net/bluetooth/l2cap_sock.o
  CC      drivers/acpi/acpica/utobject.o
  CC [M]  drivers/gpu/drm/xe/xe_huc_debugfs.o
  CC      net/ipv4/netlink.o
  CC      drivers/acpi/acpica/utosi.o
  CC      drivers/firmware/efi/libstub/relocate.o
  CC      arch/x86/kernel/mmconf-fam10h_64.o
  CC      drivers/devfreq/devfreq.o
  CC      arch/x86/kernel/vsmp_64.o
  CC      fs/btrfs/dev-replace.o
  CC      fs/btrfs/raid56.o
  CC      net/ipv4/nexthop.o
  CC [M]  drivers/gpu/drm/i915/gt/gen6_ppgtt.o
  CC [M]  net/bluetooth/smp.o
  CC      kernel/irq_work.o
  CC      fs/xattr.o
  CC [M]  drivers/platform/x86/wmi-bmof.o
  CC      drivers/platform/x86/intel/pmc/adl.o
  CC [M]  drivers/devfreq/governor_simpleondemand.o
  CC      net/sunrpc/sysctl.o
  CC      drivers/acpi/acpi_pnp.o
  CC [M]  drivers/gpu/drm/xe/xe_irq.o
  CC      drivers/acpi/acpica/utownerid.o
  CC [M]  drivers/gpu/drm/xe/xe_lrc.o
  CC [M]  drivers/net/ethernet/intel/igb/e1000_i210.o
  CC [M]  drivers/gpu/drm/nouveau/nvkm/subdev/acr/ga102.o
  CC      drivers/hid/hid-input.o
  CC      drivers/firmware/dmi-id.o
  CC      drivers/firmware/memmap.o
  CC [M]  drivers/gpu/drm/i915/gt/gen7_renderclear.o
  AR      drivers/hid/usbhid/built-in.a
  CC      drivers/hid/hid-quirks.o
  CC      lib/percpu_counter.o
  CC      drivers/mmc/core/queue.o
  CC [M]  drivers/platform/x86/mxm-wmi.o
  CC      drivers/acpi/acpica/utpredef.o
  CC      drivers/firmware/efi/libstub/printk.o
  CC      lib/fault-inject.o
  CC      net/ipv6/xfrm6_policy.o
  AR      arch/x86/kernel/built-in.a
  AR      arch/x86/built-in.a
  CC      net/ipv4/udp_tunnel_stub.o
  CC      drivers/powercap/powercap_sys.o
  CC [M]  drivers/devfreq/governor_performance.o
  CC      drivers/powercap/intel_rapl_common.o
  CC      drivers/firmware/efi/libstub/vsprintf.o
  CC [M]  drivers/platform/x86/intel_ips.o
  CC      kernel/static_call.o
  CC      drivers/platform/x86/intel/pmc/mtl.o
  CC      kernel/static_call_inline.o
  CC [M]  drivers/gpu/drm/amd/amdgpu/amdgpu_sa.o
  CC [M]  fs/cifs/cifssmb.o
  AR      net/sunrpc/built-in.a
  CC      net/sysctl_net.o
  CC      drivers/acpi/acpica/utresdecode.o
  CC [M]  fs/cifs/cifs_spnego_negtokeninit.asn1.o
  CC [M]  drivers/gpu/drm/i915/gt/gen8_engine_cs.o
  CC      drivers/platform/x86/intel/pmc/pltdrv.o
  CC      drivers/powercap/intel_rapl_msr.o
  AR      drivers/perf/built-in.a
  CC      drivers/firmware/efi/libstub/x86-stub.o
  CC      drivers/ras/ras.o
  AR      drivers/hwtracing/intel_th/built-in.a
  STUBCPY drivers/firmware/efi/libstub/alignedmem.stub.o
  CC      drivers/android/binderfs.o
  CC      drivers/android/binder.o
  CC      lib/syscall.o
  CC      drivers/android/binder_alloc.o
  CC [M]  drivers/net/ethernet/intel/e1000e/netdev.o
  CC      drivers/acpi/power.o
  CC      drivers/acpi/event.o
  CC      drivers/firmware/efi/memmap.o
  CC [M]  drivers/platform/x86/intel/pmt/class.o
  CC [M]  drivers/net/ethernet/realtek/8139too.o
  CC      drivers/acpi/acpica/utresrc.o
  CC [M]  drivers/gpu/drm/i915/gt/gen8_ppgtt.o
  AR      drivers/mmc/core/built-in.a
  AR      drivers/mmc/built-in.a
  CC [M]  drivers/gpu/drm/nouveau/nvkm/subdev/bar/base.o
  CC [M]  drivers/gpu/drm/nouveau/nvkm/subdev/bar/nv50.o
  CC      lib/dynamic_debug.o
  CC      mm/vmpressure.o
  AR      drivers/net/ethernet/sfc/built-in.a
  CC      drivers/md/dm-table.o
  CC      lib/errname.o
  AR      drivers/net/ethernet/smsc/built-in.a
  CC [M]  drivers/gpu/drm/xe/xe_migrate.o
  CC [M]  drivers/net/ethernet/smsc/smsc9420.o
  CC [M]  drivers/net/ethernet/intel/igb/igb_ptp.o
  CC      kernel/user-return-notifier.o
  AR      drivers/devfreq/built-in.a
  AR      drivers/platform/x86/intel/pmc/built-in.a
  CC      drivers/firmware/efi/esrt.o
  CC [M]  drivers/net/ethernet/intel/e1000e/ptp.o
  CC      drivers/platform/x86/intel/turbo_max_3.o
  CC      drivers/hid/hid-debug.o
  CC      drivers/nvmem/core.o
  CC      mm/swap_cgroup.o
  CC      kernel/padata.o
  AR      drivers/net/ethernet/socionext/built-in.a
  CC      kernel/jump_label.o
  CC      net/ipv6/xfrm6_state.o
  CC [M]  drivers/gpu/drm/amd/amdgpu/atombios_i2c.o
  CC      net/ipv4/sysctl_net_ipv4.o
  CC      fs/btrfs/uuid-tree.o
  CC      drivers/acpi/acpica/utstate.o
  CC [M]  net/bluetooth/lib.o
  CC      drivers/hid/hidraw.o
  STUBCPY drivers/firmware/efi/libstub/efi-stub-helper.stub.o
  AR      drivers/powercap/built-in.a
  CC [M]  drivers/uio/uio.o
  CC [M]  drivers/net/ethernet/intel/igb/igb_hwmon.o
  STUBCPY drivers/firmware/efi/libstub/file.stub.o
  CC [M]  drivers/mtd/chips/chipreg.o
  STUBCPY drivers/firmware/efi/libstub/gop.stub.o
  CC      drivers/acpi/acpica/utstring.o
  STUBCPY drivers/firmware/efi/libstub/lib-cmdline.stub.o
  STUBCPY drivers/firmware/efi/libstub/lib-ctype.stub.o
  STUBCPY drivers/firmware/efi/libstub/mem.stub.o
  CC      drivers/acpi/acpica/utstrsuppt.o
  STUBCPY drivers/firmware/efi/libstub/pci.stub.o
  STUBCPY drivers/firmware/efi/libstub/printk.stub.o
  STUBCPY drivers/firmware/efi/libstub/random.stub.o
  STUBCPY drivers/firmware/efi/libstub/randomalloc.stub.o
  CC      lib/nlattr.o
  STUBCPY drivers/firmware/efi/libstub/relocate.stub.o
  CC      drivers/firmware/efi/efi-pstore.o
  CC [M]  drivers/platform/x86/intel/pmt/telemetry.o
  STUBCPY drivers/firmware/efi/libstub/secureboot.stub.o
  STUBCPY drivers/firmware/efi/libstub/skip_spaces.stub.o
  STUBCPY drivers/firmware/efi/libstub/tpm.stub.o
  STUBCPY drivers/firmware/efi/libstub/vsprintf.stub.o
  STUBCPY drivers/firmware/efi/libstub/x86-stub.stub.o
  AR      drivers/firmware/efi/libstub/lib.a
  CC      drivers/ras/debugfs.o
  CC      fs/libfs.o
  CC [M]  drivers/gpu/drm/amd/amdgpu/amdgpu_dma_buf.o
  CC [M]  drivers/gpu/drm/nouveau/nvkm/subdev/bar/g84.o
  CC      drivers/acpi/evged.o
  CC      drivers/hid/hid-generic.o
  CC [M]  drivers/gpu/drm/nouveau/nvkm/subdev/bar/gf100.o
  CC      drivers/hid/hid-a4tech.o
  CC [M]  drivers/gpu/drm/nouveau/nvkm/subdev/bar/gk20a.o
  CC [M]  drivers/vfio/pci/vfio_pci_core.o
  CC [M]  drivers/vfio/pci/vfio_pci_intrs.o
  CC      mm/hugetlb_cgroup.o
  CC      drivers/acpi/acpica/utstrtoul64.o
  CC [M]  drivers/gpu/drm/i915/gt/intel_breadcrumbs.o
  CC      fs/fs-writeback.o
  CC [M]  drivers/mtd/mtdcore.o
  CC      fs/pnode.o
  AR      drivers/ras/built-in.a
  CC [M]  net/bluetooth/ecdh_helper.o
  CC [M]  net/bluetooth/hci_request.o
  CC      drivers/acpi/sysfs.o
  CC      kernel/context_tracking.o
  CC      drivers/acpi/acpica/utxface.o
  CC      drivers/firmware/efi/cper.o
  CC [M]  drivers/platform/x86/intel/pmt/crashlog.o
  CC [M]  drivers/vfio/pci/vfio_pci_rdwr.o
  AR      drivers/nvmem/built-in.a
  CC [M]  drivers/mtd/mtdsuper.o
  CC      net/ipv6/xfrm6_input.o
  CC      fs/splice.o
  CC [M]  drivers/gpu/drm/i915/gt/intel_context.o
  CC [M]  drivers/vfio/vfio_main.o
  CC      mm/kmemleak.o
  CC      net/ipv4/proc.o
  CC [M]  net/bluetooth/mgmt_util.o
  CC [M]  drivers/net/ethernet/realtek/r8169_main.o
  CC [M]  drivers/pps/pps.o
  CC [M]  drivers/gpu/drm/nouveau/nvkm/subdev/bar/gm107.o
  CC      mm/page_isolation.o
  CC      drivers/md/dm-target.o
  CC      drivers/hid/hid-apple.o
  CC      drivers/hid/hid-belkin.o
  CC [M]  drivers/gpu/drm/nouveau/nvkm/subdev/bar/gm20b.o
  CC      fs/btrfs/props.o
  CC      drivers/md/dm-linear.o
  LD [M]  drivers/net/ethernet/intel/igb/igb.o
  CC      lib/checksum.o
  CC [M]  drivers/gpu/drm/i915/gt/intel_context_sseu.o
  CC      fs/sync.o
  CC      fs/btrfs/free-space-tree.o
  CC      drivers/acpi/acpica/utxfinit.o
  CC [M]  drivers/gpu/drm/i915/gt/intel_engine_cs.o
  CC [M]  drivers/net/ethernet/intel/igc/igc_main.o
  CC [M]  drivers/gpu/drm/amd/amdgpu/amdgpu_vm.o
  CC [M]  fs/cifs/asn1.o
  CC [M]  drivers/gpu/drm/amd/amdgpu/amdgpu_vm_pt.o
  CC [M]  drivers/vfio/group.o
  CC      drivers/acpi/property.o
  CC      lib/cpu_rmap.o
  CC      kernel/iomem.o
  LD [M]  drivers/platform/x86/intel/pmt/pmt_class.o
  CC      net/ipv4/syncookies.o
  LD [M]  drivers/platform/x86/intel/pmt/pmt_telemetry.o
  LD [M]  drivers/platform/x86/intel/pmt/pmt_crashlog.o
  CC [M]  drivers/platform/x86/intel/vsec.o
  CC      drivers/acpi/acpi_cmos_rtc.o
  CC [M]  net/bluetooth/mgmt_config.o
  CC      drivers/acpi/acpica/utxferror.o
  CC [M]  drivers/vfio/pci/vfio_pci_config.o
  CC      drivers/acpi/x86/apple.o
  CC [M]  drivers/pps/kapi.o
  CC [M]  drivers/gpu/drm/i915/gt/intel_engine_heartbeat.o
  CC      drivers/firmware/efi/cper_cxl.o
  CC      drivers/md/dm-stripe.o
  CC [M]  net/bluetooth/hci_codec.o
  CC [M]  drivers/vfio/pci/vfio_pci.o
  CC [M]  drivers/gpu/drm/xe/xe_mmio.o
  CC [M]  drivers/gpu/drm/nouveau/nvkm/subdev/bar/tu102.o
  CC [M]  drivers/mtd/mtdconcat.o
  CC      lib/dynamic_queue_limits.o
  CC      drivers/acpi/acpica/utxfmutex.o
  CC [M]  drivers/gpu/drm/i915/gt/intel_engine_pm.o
  CC [M]  net/bluetooth/eir.o
  CC      kernel/rseq.o
  CC      drivers/hid/hid-cherry.o
  CC      fs/utimes.o
  CC      net/ipv6/xfrm6_output.o
  CC      lib/glob.o
  CC [M]  drivers/gpu/drm/xe/xe_mocs.o
  CC      drivers/firmware/efi/runtime-wrappers.o
  CC [M]  drivers/vfio/iova_bitmap.o
  CC [M]  drivers/vfio/container.o
  CC [M]  drivers/pps/sysfs.o
  AR      drivers/net/ethernet/vertexcom/built-in.a
  CC [M]  drivers/mtd/mtdpart.o
  CC [M]  drivers/gpu/drm/amd/amdgpu/amdgpu_ib.o
  CC [M]  drivers/mtd/mtdchar.o
  CC [M]  drivers/gpu/drm/amd/amdgpu/amdgpu_pll.o
  CC [M]  drivers/platform/x86/intel/rst.o
  CC      drivers/acpi/x86/utils.o
  CC [M]  drivers/vfio/virqfd.o
  CC      mm/early_ioremap.o
  AR      drivers/net/ethernet/wangxun/built-in.a
  CC      mm/cma.o
  AR      drivers/acpi/acpica/built-in.a
  CC      fs/d_path.o
  CC [M]  drivers/gpu/drm/xe/xe_module.o
  LD [M]  drivers/platform/x86/intel/intel_vsec.o
  CC [M]  drivers/vfio/vfio_iommu_type1.o
  CC      drivers/hid/hid-chicony.o
  CC      lib/strncpy_from_user.o
  CC      mm/secretmem.o
  CC      net/ipv6/xfrm6_protocol.o
  CC [M]  drivers/gpu/drm/nouveau/nvkm/subdev/bios/base.o
  LD [M]  drivers/pps/pps_core.o
  CC [M]  drivers/bluetooth/btusb.o
  CC      drivers/md/dm-ioctl.o
  CC [M]  drivers/gpu/drm/i915/gt/intel_engine_user.o
  CC [M]  drivers/gpu/drm/nouveau/nvkm/subdev/bios/bit.o
  CC [M]  drivers/bluetooth/btintel.o
  CC [M]  drivers/gpu/drm/amd/amdgpu/amdgpu_ucode.o
  CC [M]  drivers/gpu/drm/amd/amdgpu/amdgpu_bo_list.o
  CC      net/ipv4/esp4.o
  CC [M]  drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.o
  CC      drivers/acpi/x86/s2idle.o
  AR      drivers/platform/x86/intel/built-in.a
  CC [M]  drivers/gpu/drm/xe/xe_pat.o
  LD [M]  drivers/platform/x86/intel/intel-rst.o
  AR      drivers/platform/x86/built-in.a
  AR      drivers/platform/built-in.a
  CC      drivers/md/dm-io.o
  CC      mm/userfaultfd.o
  CC [M]  drivers/bluetooth/btbcm.o
  CC      fs/stack.o
  CC      fs/btrfs/tree-checker.o
  LD [M]  drivers/vfio/pci/vfio-pci.o
  CC [M]  drivers/net/ethernet/intel/igc/igc_mac.o
  GZIP    kernel/config_data.gz
  LD [M]  drivers/vfio/pci/vfio-pci-core.o
  CC [M]  drivers/net/ethernet/intel/igbvf/vf.o
  CC      kernel/configs.o
  CC [M]  net/bluetooth/hci_sync.o
  CC [M]  drivers/net/ethernet/intel/igc/igc_i225.o
  LD [M]  fs/cifs/cifs.o
  CC [M]  drivers/net/ethernet/intel/ixgbe/ixgbe_main.o
  CC      lib/strnlen_user.o
  CC [M]  drivers/net/ethernet/intel/igc/igc_base.o
  CC [M]  drivers/net/ethernet/intel/ixgbe/ixgbe_common.o
  CC [M]  drivers/net/ethernet/intel/igc/igc_nvm.o
  CC      drivers/firmware/efi/dev-path-parser.o
  CC      drivers/hid/hid-cypress.o
  CC [M]  drivers/gpu/drm/amd/amdgpu/amdgpu_sync.o
  CC [M]  drivers/net/ethernet/intel/ixgbevf/vf.o
  CC      mm/memremap.o
  CC [M]  drivers/net/ethernet/intel/ixgbevf/mbx.o
  CC      mm/hmm.o
  CC      fs/fs_struct.o
  CC [M]  drivers/net/ethernet/intel/ixgbevf/ethtool.o
  LD [M]  drivers/vfio/vfio.o
  LD [M]  drivers/mtd/mtd.o
  CC [M]  drivers/gpu/drm/xe/xe_pci.o
  AR      kernel/built-in.a
  CC      mm/memfd.o
  CC      fs/statfs.o
  CC [M]  drivers/dca/dca-core.o
  CC [M]  drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.o
  CC [M]  drivers/gpu/drm/nouveau/nvkm/subdev/bios/boost.o
  CC      mm/bootmem_info.o
  CC      lib/net_utils.o
  CC [M]  drivers/dca/dca-sysfs.o
  CC      net/ipv6/netfilter.o
  CC [M]  drivers/gpu/drm/amd/amdgpu/amdgpu_gtt_mgr.o
  CC      drivers/firmware/efi/apple-properties.o
  CC      drivers/acpi/debugfs.o
  CC [M]  drivers/net/ethernet/intel/ixgbe/ixgbe_ethtool.o
  CC [M]  drivers/net/ethernet/intel/igbvf/mbx.o
  LD [M]  drivers/net/ethernet/intel/e1000e/e1000e.o
  CC [M]  drivers/net/ethernet/realtek/r8169_firmware.o
  CC [M]  drivers/net/ethernet/realtek/r8169_phy_config.o
  CC      drivers/md/dm-kcopyd.o
  CC [M]  drivers/net/ethernet/intel/igbvf/ethtool.o
  CC      drivers/acpi/acpi_lpat.o
  CC [M]  drivers/gpu/drm/amd/amdgpu/amdgpu_preempt_mgr.o
  CC [M]  drivers/net/ethernet/intel/ixgbevf/ipsec.o
  CC [M]  drivers/gpu/drm/i915/gt/intel_execlists_submission.o
  CC [M]  drivers/net/ethernet/intel/ixgbe/ixgbe_82599.o
  CC      drivers/hid/hid-ezkey.o
  AR      drivers/android/built-in.a
  CC [M]  drivers/bluetooth/btrtl.o
  CC [M]  drivers/ssb/main.o
  CC [M]  drivers/ssb/scan.o
  CC      drivers/md/dm-sysfs.o
  CC [M]  drivers/vhost/net.o
  CC      net/ipv6/fib6_rules.o
  CC      lib/sg_pool.o
  CC      net/ipv6/proc.o
  CC      lib/stackdepot.o
  LD [M]  drivers/dca/dca.o
  CC      drivers/firmware/efi/earlycon.o
  CC      lib/ucs2_string.o
  CC      fs/fs_pin.o
  CC      drivers/firmware/efi/cper-x86.o
  CC [M]  drivers/gpu/drm/nouveau/nvkm/subdev/bios/conn.o
  CC [M]  drivers/gpu/drm/xe/xe_pcode.o
  CC      fs/nsfs.o
  CC [M]  drivers/gpu/drm/xe/xe_pm.o
  CC      fs/fs_types.o
  CC      drivers/md/dm-stats.o
  CC      drivers/md/dm-rq.o
  CC [M]  drivers/net/ethernet/intel/igbvf/netdev.o
  CC [M]  drivers/ssb/sprom.o
  CC      drivers/acpi/acpi_lpit.o
  AR      mm/built-in.a
  CC      net/ipv4/esp4_offload.o
  CC [M]  drivers/gpu/drm/i915/gt/intel_ggtt.o
  CC [M]  net/bluetooth/sco.o
  CC [M]  drivers/gpu/drm/i915/gt/intel_ggtt_fencing.o
  CC      net/ipv4/netfilter.o
../drivers/gpu/drm/i915/gt/intel_engine_cs.c:1525: warning: expecting prototype for intel_engines_cleanup_common(). Prototype was for intel_engine_cleanup_common() instead
  CC [M]  drivers/gpu/drm/i915/gt/intel_gt.o
  CC      net/ipv6/syncookies.o
  CC      drivers/hid/hid-kensington.o
  CC [M]  drivers/ssb/pci.o
  CC [M]  drivers/net/ethernet/intel/ixgbe/ixgbe_82598.o
  CC [M]  drivers/net/ethernet/intel/ixgbe/ixgbe_phy.o
  CC      fs/btrfs/space-info.o
  LD [M]  drivers/net/ethernet/realtek/r8169.o
  CC [M]  drivers/ssb/pcihost_wrapper.o
  CC      net/ipv4/inet_diag.o
  AR      drivers/net/ethernet/xilinx/built-in.a
  CC [M]  drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.o
  CC      fs/btrfs/block-rsv.o
  CC [M]  net/bluetooth/iso.o
  CC [M]  drivers/gpu/drm/amd/amdgpu/amdgpu_virt.o
  CC      drivers/acpi/prmt.o
  CC      lib/sbitmap.o
  CC      lib/group_cpus.o
  CC [M]  drivers/gpu/drm/i915/gt/intel_gt_buffer_pool.o
  AR      drivers/firmware/efi/built-in.a
  CC      drivers/md/dm-io-rewind.o
  CC      fs/fs_context.o
  AR      drivers/firmware/built-in.a
  CC      drivers/acpi/acpi_pcc.o
  CC      fs/fs_parser.o
  CC [M]  drivers/gpu/drm/amd/amdgpu/amdgpu_atomfirmware.o
  CC [M]  drivers/gpu/drm/nouveau/nvkm/subdev/bios/cstep.o
  CC [M]  net/bluetooth/a2mp.o
  CC [M]  drivers/gpu/drm/xe/xe_preempt_fence.o
  CC [M]  drivers/net/ethernet/intel/igc/igc_phy.o
  CC [M]  drivers/net/ethernet/intel/igc/igc_diag.o
  CC [M]  drivers/gpu/drm/i915/gt/intel_gt_clock_utils.o
  CC [M]  drivers/gpu/drm/xe/xe_pt.o
  CC      drivers/hid/hid-lg.o
  CC      net/ipv6/mip6.o
  CC      drivers/md/dm-builtin.o
  CC      drivers/hid/hid-lg-g15.o
  CC      net/ipv4/tcp_diag.o
  CC      net/ipv6/addrconf_core.o
  CC [M]  drivers/gpu/drm/amd/amdgpu/amdgpu_vf_error.o
  CC [M]  drivers/ssb/driver_chipcommon.o
  CC [M]  drivers/ssb/driver_chipcommon_pmu.o
  CC [M]  drivers/gpu/drm/drm_crtc.o
  CC      fs/fsopen.o
  CC      net/ipv4/udp_diag.o
  CC      fs/init.o
  CC [M]  drivers/md/dm-bufio.o
  CC [M]  drivers/net/ethernet/intel/ixgbe/ixgbe_sriov.o
  CC      drivers/acpi/ac.o
  CC [M]  lib/asn1_decoder.o
  CC      net/ipv6/exthdrs_core.o
  CC [M]  drivers/gpu/drm/xe/xe_pt_walk.o
  CC [M]  drivers/gpu/drm/nouveau/nvkm/subdev/bios/dcb.o
  CC [M]  drivers/gpu/drm/xe/xe_query.o
  CC [M]  drivers/vhost/vhost.o
  CC [M]  drivers/ssb/driver_pcicore.o
  CC      fs/kernel_read_file.o
  CC      fs/mnt_idmapping.o
  CC [M]  drivers/gpu/drm/i915/gt/intel_gt_debugfs.o
  GEN     lib/oid_registry_data.c
  CC [M]  lib/oid_registry.o
  CC [M]  drivers/gpu/drm/i915/gt/intel_gt_engines_debugfs.o
  CC      net/ipv6/ip6_checksum.o
  CC [M]  drivers/net/ethernet/intel/igc/igc_ethtool.o
  CC [M]  drivers/gpu/drm/nouveau/nvkm/subdev/bios/disp.o
  CC      net/ipv4/tcp_cubic.o
  CC [M]  drivers/net/ethernet/intel/ixgbe/ixgbe_mbx.o
  CC [M]  drivers/vhost/iotlb.o
  CC [M]  drivers/gpu/drm/amd/amdgpu/amdgpu_sched.o
  CC [M]  drivers/gpu/drm/xe/xe_reg_sr.o
  CC [M]  drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.o
  CC [M]  drivers/gpu/drm/i915/gt/intel_gt_irq.o
  CC      drivers/acpi/button.o
  AR      drivers/net/ethernet/synopsys/built-in.a
  CC [M]  drivers/gpu/drm/nouveau/nvkm/subdev/bios/dp.o
  CC      fs/remap_range.o
  CC      drivers/hid/hid-microsoft.o
  CC [M]  drivers/net/ethernet/intel/igc/igc_ptp.o
  CC [M]  drivers/gpu/drm/nouveau/nvkm/subdev/bios/extdev.o
  CC      net/ipv6/ip6_icmp.o
  CC [M]  drivers/gpu/drm/xe/xe_reg_whitelist.o
  AR      lib/lib.a
  GEN     lib/crc32table.h
  CC [M]  drivers/gpu/drm/nouveau/nvkm/subdev/bios/fan.o
  CC      lib/crc32.o
  CC [M]  drivers/gpu/drm/nouveau/nvkm/subdev/bios/gpio.o
  CC [M]  drivers/gpu/drm/nouveau/nvkm/subdev/bios/i2c.o
  CC [M]  net/bluetooth/amp.o
  AR      drivers/net/ethernet/pensando/built-in.a
  CC [M]  net/bluetooth/hci_debugfs.o
  CC [M]  drivers/net/ethernet/intel/ixgbe/ixgbe_x540.o
  CC      net/ipv4/xfrm4_policy.o
  CC [M]  drivers/net/ethernet/intel/ixgbe/ixgbe_x550.o
  LD [M]  drivers/net/ethernet/intel/igbvf/igbvf.o
  CC      net/ipv4/xfrm4_state.o
  CC      drivers/hid/hid-monterey.o
  CC [M]  drivers/gpu/drm/i915/gt/intel_gt_mcr.o
  CC [M]  drivers/net/ethernet/intel/ixgb/ixgb_main.o
  CC      fs/btrfs/delalloc-space.o
  CC [M]  drivers/gpu/drm/amd/amdgpu/amdgpu_ids.o
  LD [M]  drivers/ssb/ssb.o
  CC      net/ipv4/xfrm4_input.o
  CC [M]  drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.o
  CC [M]  drivers/net/ethernet/intel/ixgbe/ixgbe_lib.o
  LD [M]  drivers/net/ethernet/intel/ixgbevf/ixgbevf.o
  CC      net/ipv6/output_core.o
  LD [M]  drivers/vhost/vhost_net.o
  LD [M]  drivers/vhost/vhost_iotlb.o
  CC [M]  drivers/net/ethernet/intel/ixgbe/ixgbe_ptp.o
  CC [M]  drivers/net/ethernet/intel/ixgbe/ixgbe_xsk.o
  CC [M]  drivers/net/ethernet/intel/ixgb/ixgb_hw.o
  CC [M]  drivers/gpu/drm/xe/xe_rtp.o
  CC [M]  drivers/gpu/drm/nouveau/nvkm/subdev/bios/iccsense.o
  CC [M]  drivers/gpu/drm/i915/gt/intel_gt_pm.o
  AR      lib/built-in.a
  CC [M]  drivers/gpu/drm/i915/gt/intel_gt_pm_debugfs.o
  GEN     xe_wa_oob.c xe_wa_oob.h
  CC [M]  drivers/gpu/drm/i915/gt/intel_gt_pm_irq.o
  AR      drivers/net/ethernet/intel/built-in.a
  CC      net/ipv4/xfrm4_output.o
  CC [M]  drivers/net/ethernet/intel/ixgb/ixgb_ee.o
  CC      drivers/acpi/fan_core.o
  CC [M]  drivers/net/ethernet/intel/ixgbe/ixgbe_dcb.o
  CC [M]  drivers/gpu/drm/amd/amdgpu/amdgpu_mmhub.o
  CC      fs/btrfs/block-group.o
  CC      net/ipv4/xfrm4_protocol.o
  CC [M]  drivers/gpu/drm/nouveau/nvkm/subdev/bios/image.o
  CC [M]  drivers/md/dm-bio-prison-v1.o
  CC [M]  drivers/net/ethernet/intel/igc/igc_dump.o
  CC [M]  drivers/net/ethernet/intel/ixgbe/ixgbe_dcb_82598.o
  CC [M]  drivers/net/ethernet/intel/ixgbe/ixgbe_dcb_82599.o
  AR      drivers/hid/built-in.a
  GEN     xe_wa_oob.c xe_wa_oob.h
  CC [M]  drivers/net/ethernet/intel/ixgbe/ixgbe_dcb_nl.o
  CC      drivers/acpi/fan_attr.o
  CC [M]  net/ipv4/ip_tunnel.o
  CC      fs/buffer.o
  CC [M]  drivers/net/ethernet/intel/ixgb/ixgb_ethtool.o
  CC [M]  drivers/gpu/drm/xe/xe_sa.o
  CC [M]  drivers/gpu/drm/xe/xe_sched_job.o
  CC [M]  drivers/gpu/drm/i915/gt/intel_gt_requests.o
  CC [M]  drivers/gpu/drm/i915/gt/intel_gt_sysfs.o
  CC [M]  net/ipv4/udp_tunnel_core.o
  CC [M]  drivers/gpu/drm/xe/xe_step.o
  CC [M]  drivers/net/ethernet/intel/igc/igc_tsn.o
  CC [M]  drivers/gpu/drm/i915/gt/intel_gt_sysfs_pm.o
  CC      drivers/acpi/processor_driver.o
  CC [M]  drivers/gpu/drm/i915/gt/intel_gtt.o
  CC [M]  drivers/net/ethernet/intel/igc/igc_xdp.o
  CC [M]  net/ipv4/udp_tunnel_nic.o
  CC [M]  drivers/net/ethernet/intel/ixgb/ixgb_param.o
  CC      drivers/acpi/processor_thermal.o
  CC [M]  drivers/gpu/drm/amd/amdgpu/amdgpu_hdp.o
  CC [M]  drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.o
  CC [M]  drivers/gpu/drm/amd/amdgpu/amdgpu_csa.o
  CC      net/ipv6/protocol.o
  CC [M]  drivers/net/ethernet/intel/e100.o
  CC      fs/btrfs/discard.o
  CC [M]  drivers/gpu/drm/i915/gt/intel_llc.o
  CC [M]  drivers/gpu/drm/xe/xe_sync.o
  CC [M]  drivers/md/dm-bio-prison-v2.o
  CC [M]  drivers/gpu/drm/nouveau/nvkm/subdev/bios/init.o
  CC [M]  drivers/gpu/drm/drm_displayid.o
  CC [M]  drivers/gpu/drm/nouveau/nvkm/subdev/bios/mxm.o
  CC [M]  drivers/gpu/drm/amd/amdgpu/amdgpu_ras.o
  CC [M]  drivers/net/ethernet/intel/ixgbe/ixgbe_sysfs.o
  CC      drivers/acpi/processor_idle.o
  CC [M]  drivers/net/ethernet/intel/ixgbe/ixgbe_debugfs.o
  CC      fs/mpage.o
  CC [M]  drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.o
  LD [M]  net/bluetooth/bluetooth.o
  CC [M]  drivers/gpu/drm/nouveau/nvkm/subdev/bios/npde.o
  CC [M]  drivers/gpu/drm/drm_drv.o
  CC      net/ipv6/ip6_offload.o
  CC [M]  drivers/gpu/drm/xe/xe_tile.o
  AR      net/ipv4/built-in.a
  CC [M]  drivers/gpu/drm/xe/xe_trace.o
  CC [M]  drivers/md/dm-crypt.o
  CC [M]  drivers/gpu/drm/xe/xe_ttm_sys_mgr.o
  CC      drivers/acpi/processor_throttling.o
  CC      drivers/acpi/processor_perflib.o
  CC [M]  drivers/gpu/drm/nouveau/nvkm/subdev/bios/pcir.o
  CC [M]  drivers/gpu/drm/amd/amdgpu/amdgpu_vm_cpu.o
  CC [M]  drivers/gpu/drm/i915/gt/intel_lrc.o
  CC      fs/proc_namespace.o
  CC [M]  drivers/gpu/drm/i915/gt/intel_migrate.o
  CC [M]  drivers/gpu/drm/i915/gt/intel_mocs.o
  CC [M]  drivers/gpu/drm/drm_dumb_buffers.o
  CC [M]  drivers/gpu/drm/nouveau/nvkm/subdev/bios/perf.o
  CC [M]  drivers/md/dm-thin.o
  CC [M]  drivers/md/dm-thin-metadata.o
  CC [M]  drivers/gpu/drm/nouveau/nvkm/subdev/bios/pll.o
  CC      fs/btrfs/reflink.o
  CC [M]  drivers/gpu/drm/i915/gt/intel_ppgtt.o
  CC [M]  drivers/gpu/drm/amd/amdgpu/amdgpu_vm_sdma.o
  CC      fs/btrfs/subpage.o
  LD [M]  drivers/net/ethernet/intel/igc/igc.o
  CC [M]  drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.o
  LD [M]  drivers/md/dm-bio-prison.o
  CC      fs/direct-io.o
  AR      drivers/md/built-in.a
  CC [M]  drivers/gpu/drm/xe/xe_ttm_stolen_mgr.o
  LD [M]  drivers/net/ethernet/intel/ixgb/ixgb.o
  CC [M]  drivers/gpu/drm/xe/xe_ttm_vram_mgr.o
  CC [M]  drivers/gpu/drm/i915/gt/intel_rc6.o
  CC      drivers/acpi/container.o
  CC      drivers/acpi/thermal.o
  CC [M]  drivers/gpu/drm/i915/gt/intel_region_lmem.o
  CC      fs/btrfs/tree-mod-log.o
  CC      net/ipv6/tcpv6_offload.o
  CC [M]  drivers/gpu/drm/i915/gt/intel_renderstate.o
  CC [M]  drivers/gpu/drm/drm_edid.o
  CC      fs/eventpoll.o
  CC [M]  drivers/gpu/drm/i915/gt/intel_reset.o
  CC      fs/btrfs/extent-io-tree.o
  CC [M]  drivers/gpu/drm/i915/gt/intel_ring.o
  CC [M]  drivers/gpu/drm/drm_encoder.o
  CC [M]  drivers/gpu/drm/xe/xe_tuning.o
  CC [M]  drivers/gpu/drm/xe/xe_uc.o
  CC      drivers/acpi/acpi_memhotplug.o
  CC      fs/anon_inodes.o
  CC      drivers/acpi/ioapic.o
  CC      fs/btrfs/fs.o
  LD [M]  net/ipv4/udp_tunnel.o
  CC      fs/btrfs/messages.o
  CC      drivers/acpi/battery.o
  CC [M]  drivers/gpu/drm/drm_file.o
  CC [M]  drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.o
  CC      drivers/acpi/hed.o
  CC      drivers/acpi/bgrt.o
  CC      net/ipv6/exthdrs_offload.o
  CC [M]  drivers/gpu/drm/nouveau/nvkm/subdev/bios/pmu.o
  CC [M]  drivers/gpu/drm/nouveau/nvkm/subdev/bios/power_budget.o
  LD [M]  drivers/net/ethernet/intel/ixgbe/ixgbe.o
  CC [M]  drivers/gpu/drm/xe/xe_uc_debugfs.o
  CC [M]  drivers/gpu/drm/i915/gt/intel_ring_submission.o
  CC [M]  drivers/gpu/drm/i915/gt/intel_rps.o
  CC [M]  drivers/gpu/drm/i915/gt/intel_sa_media.o
  CC      drivers/acpi/cppc_acpi.o
  CC [M]  drivers/gpu/drm/i915/gt/intel_sseu.o
  AR      drivers/net/ethernet/built-in.a
  CC [M]  drivers/gpu/drm/amd/amdgpu/amdgpu_nbio.o
  CC [M]  drivers/gpu/drm/xe/xe_uc_fw.o
  CC [M]  drivers/gpu/drm/amd/amdgpu/amdgpu_umc.o
  AR      drivers/net/built-in.a
  CC [M]  drivers/gpu/drm/xe/xe_vm.o
  CC      net/ipv6/inet6_hashtables.o
  CC      drivers/acpi/spcr.o
  CC      fs/signalfd.o
  CC [M]  drivers/gpu/drm/amd/amdgpu/smu_v11_0_i2c.o
  CC [M]  drivers/gpu/drm/drm_fourcc.o
  CC [M]  drivers/gpu/drm/drm_framebuffer.o
  CC [M]  drivers/gpu/drm/drm_gem.o
  CC [M]  drivers/gpu/drm/xe/xe_vm_madvise.o
  CC      drivers/acpi/acpi_pad.o
  CC      net/ipv6/mcast_snoop.o
  CC      fs/btrfs/bio.o
  CC [M]  drivers/gpu/drm/amd/amdgpu/amdgpu_fru_eeprom.o
  CC      fs/btrfs/lru_cache.o
  CC      fs/timerfd.o
  CC [M]  drivers/gpu/drm/i915/gt/intel_sseu_debugfs.o
  CC [M]  drivers/gpu/drm/drm_ioctl.o
  CC [M]  drivers/gpu/drm/drm_lease.o
  CC      fs/btrfs/acl.o
  CC [M]  drivers/gpu/drm/drm_managed.o
  CC [M]  drivers/gpu/drm/i915/gt/intel_timeline.o
  CC [M]  drivers/gpu/drm/xe/xe_wait_user_fence.o
  CC [M]  drivers/gpu/drm/i915/gt/intel_wopcm.o
  CC [M]  drivers/gpu/drm/xe/xe_wa.o
  CC [M]  drivers/gpu/drm/amd/amdgpu/amdgpu_rap.o
  CC [M]  drivers/gpu/drm/drm_mm.o
  CC [M]  drivers/gpu/drm/i915/gt/intel_workarounds.o
  CC [M]  drivers/gpu/drm/i915/gt/shmem_utils.o
  CC [M]  drivers/gpu/drm/nouveau/nvkm/subdev/bios/ramcfg.o
  CC [M]  net/ipv6/ip6_udp_tunnel.o
  CC [M]  drivers/acpi/acpi_video.o
  CC [M]  drivers/gpu/drm/i915/gt/sysfs_engines.o
  CC [M]  drivers/gpu/drm/amd/amdgpu/amdgpu_fw_attestation.o
  CC [M]  drivers/gpu/drm/i915/gt/intel_ggtt_gmch.o
  CC [M]  drivers/gpu/drm/i915/gt/gen6_renderstate.o
  CC [M]  drivers/gpu/drm/amd/amdgpu/amdgpu_securedisplay.o
  CC [M]  drivers/gpu/drm/i915/gt/gen7_renderstate.o
  CC [M]  drivers/gpu/drm/i915/gt/gen8_renderstate.o
  CC [M]  drivers/gpu/drm/i915/gt/gen9_renderstate.o
  CC [M]  drivers/acpi/video_detect.o
  CC [M]  drivers/gpu/drm/i915/gem/i915_gem_busy.o
  CC [M]  drivers/gpu/drm/amd/amdgpu/amdgpu_eeprom.o
  LD [M]  drivers/md/dm-thin-pool.o
  CC [M]  drivers/gpu/drm/nouveau/nvkm/subdev/bios/rammap.o
  CC [M]  drivers/gpu/drm/xe/xe_wopcm.o
  CC [M]  drivers/gpu/drm/drm_mode_config.o
  CC [M]  drivers/gpu/drm/drm_mode_object.o
  AR      drivers/acpi/built-in.a
  CC [M]  drivers/gpu/drm/drm_modes.o
  CC [M]  drivers/gpu/drm/xe/xe_display.o
  CC      fs/eventfd.o
  CC [M]  drivers/gpu/drm/i915/gem/i915_gem_clflush.o
  CC [M]  drivers/gpu/drm/drm_modeset_lock.o
  CC [M]  drivers/gpu/drm/i915/gem/i915_gem_context.o
  CC [M]  drivers/gpu/drm/drm_plane.o
  CC      fs/userfaultfd.o
  CC [M]  drivers/gpu/drm/nouveau/nvkm/subdev/bios/shadow.o
  CC [M]  drivers/gpu/drm/i915/gem/i915_gem_create.o
  CC [M]  drivers/gpu/drm/i915/gem/i915_gem_dmabuf.o
  CC [M]  drivers/gpu/drm/i915/gem/i915_gem_domain.o
  AR      net/ipv6/built-in.a
  CC [M]  drivers/gpu/drm/i915/gem/i915_gem_execbuffer.o
  CC [M]  drivers/gpu/drm/i915/gem/i915_gem_internal.o
  CC [M]  drivers/gpu/drm/nouveau/nvkm/subdev/bios/shadowacpi.o
  CC [M]  drivers/gpu/drm/drm_prime.o
  CC [M]  drivers/gpu/drm/i915/gem/i915_gem_object.o
  CC [M]  drivers/gpu/drm/nouveau/nvkm/subdev/bios/shadowof.o
  CC [M]  drivers/gpu/drm/nouveau/nvkm/subdev/bios/shadowpci.o
  CC [M]  drivers/gpu/drm/i915/gem/i915_gem_lmem.o
  CC [M]  drivers/gpu/drm/amd/amdgpu/amdgpu_mca.o
  CC [M]  drivers/gpu/drm/i915/gem/i915_gem_mman.o
  CC [M]  drivers/gpu/drm/i915/gem/i915_gem_pages.o
  CC [M]  drivers/gpu/drm/i915/gem/i915_gem_phys.o
  CC [M]  drivers/gpu/drm/drm_print.o
  AR      fs/btrfs/built-in.a
  CC      fs/aio.o
  CC [M]  drivers/gpu/drm/amd/amdgpu/amdgpu_psp_ta.o
  CC [M]  drivers/gpu/drm/i915/gem/i915_gem_pm.o
  AR      net/built-in.a
  CC [M]  drivers/gpu/drm/i915/gem/i915_gem_region.o
  CC [M]  drivers/gpu/drm/drm_property.o
  CC [M]  drivers/gpu/drm/drm_syncobj.o
  CC [M]  drivers/gpu/drm/i915/gem/i915_gem_shmem.o
  CC [M]  drivers/gpu/drm/nouveau/nvkm/subdev/bios/shadowramin.o
  CC [M]  drivers/gpu/drm/xe/display/xe_fb_pin.o
  CC [M]  drivers/gpu/drm/i915/gem/i915_gem_shrinker.o
  CC [M]  drivers/gpu/drm/amd/amdgpu/amdgpu_lsdma.o
  CC [M]  drivers/gpu/drm/drm_sysfs.o
  CC [M]  drivers/gpu/drm/nouveau/nvkm/subdev/bios/shadowrom.o
  CC [M]  drivers/gpu/drm/nouveau/nvkm/subdev/bios/timing.o
  CC [M]  drivers/gpu/drm/drm_trace_points.o
  CC [M]  drivers/gpu/drm/i915/gem/i915_gem_stolen.o
  CC      fs/locks.o
  CC [M]  drivers/gpu/drm/i915/gem/i915_gem_throttle.o
  CC [M]  drivers/gpu/drm/nouveau/nvkm/subdev/bios/therm.o
  LD [M]  drivers/acpi/video.o
  CC [M]  drivers/gpu/drm/i915/gem/i915_gem_tiling.o
  CC [M]  drivers/gpu/drm/nouveau/nvkm/subdev/bios/vmap.o
  CC [M]  drivers/gpu/drm/i915/gem/i915_gem_ttm.o
  CC [M]  drivers/gpu/drm/nouveau/nvkm/subdev/bios/volt.o
  CC [M]  drivers/gpu/drm/nouveau/nvkm/subdev/bios/vpstate.o
  CC [M]  drivers/gpu/drm/drm_vblank.o
  CC [M]  drivers/gpu/drm/i915/gem/i915_gem_ttm_move.o
  CC [M]  drivers/gpu/drm/drm_vblank_work.o
  CC [M]  drivers/gpu/drm/drm_vma_manager.o
  CC [M]  drivers/gpu/drm/i915/gem/i915_gem_ttm_pm.o
  CC [M]  drivers/gpu/drm/drm_writeback.o
  CC [M]  drivers/gpu/drm/amd/amdgpu/amdgpu_ring_mux.o
  CC [M]  drivers/gpu/drm/lib/drm_random.o
  CC [M]  drivers/gpu/drm/i915/gem/i915_gem_userptr.o
  CC [M]  drivers/gpu/drm/i915/gem/i915_gem_wait.o
  CC [M]  drivers/gpu/drm/drm_ioc32.o
  CC [M]  drivers/gpu/drm/amd/amdgpu/amdgpu_fdinfo.o
  CC [M]  drivers/gpu/drm/amd/amdgpu/amdgpu_pmu.o
  CC [M]  drivers/gpu/drm/amd/amdgpu/cik.o
  CC      fs/binfmt_script.o
  CC [M]  drivers/gpu/drm/i915/gem/i915_gemfs.o
  CC [M]  drivers/gpu/drm/i915/i915_active.o
  CC [M]  drivers/gpu/drm/drm_panel.o
  CC [M]  drivers/gpu/drm/amd/amdgpu/cik_ih.o
  CC [M]  drivers/gpu/drm/nouveau/nvkm/subdev/bios/xpio.o
  CC [M]  drivers/gpu/drm/drm_pci.o
  CC [M]  drivers/gpu/drm/drm_debugfs.o
  CC [M]  drivers/gpu/drm/drm_debugfs_crc.o
  CC [M]  drivers/gpu/drm/amd/amdgpu/dce_v8_0.o
  CC [M]  drivers/gpu/drm/i915/i915_cmd_parser.o
  CC [M]  drivers/gpu/drm/drm_edid_load.o
  CC [M]  drivers/gpu/drm/drm_panel_orientation_quirks.o
  CC [M]  drivers/gpu/drm/nouveau/nvkm/subdev/bios/M0203.o
  CC [M]  drivers/gpu/drm/i915/i915_deps.o
  CC [M]  drivers/gpu/drm/nouveau/nvkm/subdev/bios/M0205.o
  CC [M]  drivers/gpu/drm/amd/amdgpu/gfx_v7_0.o
  CC [M]  drivers/gpu/drm/nouveau/nvkm/subdev/bios/M0209.o
  CC [M]  drivers/gpu/drm/i915/i915_gem_evict.o
  CC [M]  drivers/gpu/drm/nouveau/nvkm/subdev/bios/P0260.o
  CC [M]  drivers/gpu/drm/i915/i915_gem_gtt.o
  CC [M]  drivers/gpu/drm/drm_buddy.o
  CC [M]  drivers/gpu/drm/i915/i915_gem_ww.o
  CC [M]  drivers/gpu/drm/i915/i915_gem.o
  CC [M]  drivers/gpu/drm/i915/i915_query.o
  CC [M]  drivers/gpu/drm/drm_gem_shmem_helper.o
  CC [M]  drivers/gpu/drm/i915/i915_request.o
  CC [M]  drivers/gpu/drm/i915/i915_scheduler.o
  CC [M]  drivers/gpu/drm/drm_suballoc.o
  CC [M]  drivers/gpu/drm/i915/i915_trace_points.o
  CC [M]  drivers/gpu/drm/i915/i915_ttm_buddy_manager.o
  CC [M]  drivers/gpu/drm/i915/i915_vma.o
  CC [M]  drivers/gpu/drm/drm_gem_ttm_helper.o
  CC [M]  drivers/gpu/drm/drm_atomic_helper.o
  CC      fs/binfmt_elf.o
  CC [M]  drivers/gpu/drm/nouveau/nvkm/subdev/bus/base.o
  CC [M]  drivers/gpu/drm/xe/display/xe_hdcp_gsc.o
  CC [M]  drivers/gpu/drm/drm_atomic_state_helper.o
  CC [M]  drivers/gpu/drm/i915/i915_vma_resource.o
  CC [M]  drivers/gpu/drm/xe/display/xe_plane_initial.o
  CC [M]  drivers/gpu/drm/nouveau/nvkm/subdev/bus/hwsq.o
  CC [M]  drivers/gpu/drm/amd/amdgpu/cik_sdma.o
  CC [M]  drivers/gpu/drm/xe/display/xe_display_rps.o
  CC [M]  drivers/gpu/drm/drm_bridge_connector.o
  CC [M]  drivers/gpu/drm/drm_crtc_helper.o
  CC [M]  drivers/gpu/drm/amd/amdgpu/uvd_v4_2.o
  CC [M]  drivers/gpu/drm/nouveau/nvkm/subdev/bus/nv04.o
  CC [M]  drivers/gpu/drm/i915/gt/uc/intel_gsc_fw.o
  CC [M]  drivers/gpu/drm/nouveau/nvkm/subdev/bus/nv31.o
  CC [M]  drivers/gpu/drm/drm_damage_helper.o
  CC [M]  drivers/gpu/drm/drm_encoder_slave.o
  CC [M]  drivers/gpu/drm/amd/amdgpu/vce_v2_0.o
  CC [M]  drivers/gpu/drm/xe/display/ext/i915_irq.o
  CC [M]  drivers/gpu/drm/i915/gt/uc/intel_gsc_uc.o
  CC [M]  drivers/gpu/drm/nouveau/nvkm/subdev/bus/nv50.o
  CC [M]  drivers/gpu/drm/nouveau/nvkm/subdev/bus/g94.o
  CC [M]  drivers/gpu/drm/i915/gt/uc/intel_gsc_uc_heci_cmd_submit.o
  CC [M]  drivers/gpu/drm/drm_flip_work.o
  CC [M]  drivers/gpu/drm/i915/gt/uc/intel_guc.o
  CC [M]  drivers/gpu/drm/drm_format_helper.o
  CC [M]  drivers/gpu/drm/i915/gt/uc/intel_guc_ads.o
  CC [M]  drivers/gpu/drm/amd/amdgpu/si.o
  CC [M]  drivers/gpu/drm/amd/amdgpu/gmc_v6_0.o
  CC [M]  drivers/gpu/drm/i915/gt/uc/intel_guc_capture.o
  CC [M]  drivers/gpu/drm/i915/gt/uc/intel_guc_ct.o
  CC [M]  drivers/gpu/drm/xe/display/ext/intel_clock_gating.o
  CC [M]  drivers/gpu/drm/nouveau/nvkm/subdev/bus/gf100.o
  CC [M]  drivers/gpu/drm/drm_gem_atomic_helper.o
  CC [M]  drivers/gpu/drm/drm_gem_framebuffer_helper.o
  CC [M]  drivers/gpu/drm/drm_kms_helper_common.o
  CC      fs/compat_binfmt_elf.o
  CC [M]  drivers/gpu/drm/i915/gt/uc/intel_guc_debugfs.o
  CC [M]  drivers/gpu/drm/i915/gt/uc/intel_guc_fw.o
  CC [M]  drivers/gpu/drm/drm_modeset_helper.o
  CC [M]  drivers/gpu/drm/nouveau/nvkm/subdev/clk/base.o
  CC [M]  drivers/gpu/drm/nouveau/nvkm/subdev/clk/nv04.o
  CC [M]  drivers/gpu/drm/nouveau/nvkm/subdev/clk/nv40.o
  CC [M]  drivers/gpu/drm/nouveau/nvkm/subdev/clk/nv50.o
  CC [M]  drivers/gpu/drm/nouveau/nvkm/subdev/clk/g84.o
  CC [M]  drivers/gpu/drm/i915/gt/uc/intel_guc_hwconfig.o
  CC [M]  drivers/gpu/drm/i915/gt/uc/intel_guc_log.o
  CC [M]  drivers/gpu/drm/i915/gt/uc/intel_guc_log_debugfs.o
  CC [M]  drivers/gpu/drm/xe/display/ext/intel_device_info.o
  CC [M]  drivers/gpu/drm/nouveau/nvkm/subdev/clk/gt215.o
  CC [M]  drivers/gpu/drm/drm_plane_helper.o
  CC [M]  drivers/gpu/drm/drm_probe_helper.o
  CC [M]  drivers/gpu/drm/i915/gt/uc/intel_guc_rc.o
  CC [M]  drivers/gpu/drm/i915/gt/uc/intel_guc_slpc.o
  CC [M]  drivers/gpu/drm/i915/gt/uc/intel_guc_submission.o
  CC [M]  drivers/gpu/drm/i915/gt/uc/intel_huc.o
  CC [M]  drivers/gpu/drm/amd/amdgpu/gfx_v6_0.o
  CC [M]  drivers/gpu/drm/amd/amdgpu/si_ih.o
  CC [M]  drivers/gpu/drm/amd/amdgpu/si_dma.o
  CC [M]  drivers/gpu/drm/drm_rect.o
  CC [M]  drivers/gpu/drm/drm_self_refresh_helper.o
  CC [M]  drivers/gpu/drm/i915/gt/uc/intel_huc_debugfs.o
  CC [M]  drivers/gpu/drm/drm_simple_kms_helper.o
  CC      fs/mbcache.o
  CC [M]  drivers/gpu/drm/bridge/panel.o
  CC [M]  drivers/gpu/drm/i915/gt/uc/intel_huc_fw.o
  CC [M]  drivers/gpu/drm/drm_fbdev_generic.o
  CC [M]  drivers/gpu/drm/i915/gt/uc/intel_uc.o
  CC [M]  drivers/gpu/drm/drm_fb_helper.o
  CC [M]  drivers/gpu/drm/i915/gt/uc/intel_uc_debugfs.o
  CC [M]  drivers/gpu/drm/nouveau/nvkm/subdev/clk/mcp77.o
  CC [M]  drivers/gpu/drm/xe/display/ext/intel_dram.o
  CC [M]  drivers/gpu/drm/xe/display/ext/intel_pch.o
  CC [M]  drivers/gpu/drm/xe/i915-display/icl_dsi.o
  CC [M]  drivers/gpu/drm/i915/gt/uc/intel_uc_fw.o
  CC [M]  drivers/gpu/drm/nouveau/nvkm/subdev/clk/gf100.o
  CC [M]  drivers/gpu/drm/xe/i915-display/intel_atomic.o
  CC      fs/posix_acl.o
  CC [M]  drivers/gpu/drm/i915/gt/intel_gsc.o
  CC [M]  drivers/gpu/drm/nouveau/nvkm/subdev/clk/gk104.o
  CC [M]  drivers/gpu/drm/amd/amdgpu/dce_v6_0.o
  LD [M]  drivers/gpu/drm/drm.o
  CC [M]  drivers/gpu/drm/i915/i915_hwmon.o
  CC [M]  drivers/gpu/drm/i915/display/hsw_ips.o
  CC [M]  drivers/gpu/drm/i915/display/intel_atomic.o
  CC [M]  drivers/gpu/drm/nouveau/nvkm/subdev/clk/gk20a.o
  CC [M]  drivers/gpu/drm/nouveau/nvkm/subdev/clk/gm20b.o
  CC [M]  drivers/gpu/drm/nouveau/nvkm/subdev/clk/pllnv04.o
  CC [M]  drivers/gpu/drm/i915/display/intel_atomic_plane.o
  CC [M]  drivers/gpu/drm/i915/display/intel_audio.o
  LD [M]  drivers/gpu/drm/drm_shmem_helper.o
  CC [M]  drivers/gpu/drm/i915/display/intel_bios.o
  CC [M]  drivers/gpu/drm/i915/display/intel_bw.o
  LD [M]  drivers/gpu/drm/drm_suballoc_helper.o
  CC [M]  drivers/gpu/drm/i915/display/intel_cdclk.o
  CC [M]  drivers/gpu/drm/xe/i915-display/intel_atomic_plane.o
  LD [M]  drivers/gpu/drm/drm_ttm_helper.o
  CC [M]  drivers/gpu/drm/amd/amdgpu/uvd_v3_1.o
  CC [M]  drivers/gpu/drm/amd/amdgpu/vi.o
  CC [M]  drivers/gpu/drm/xe/i915-display/intel_audio.o
  CC [M]  drivers/gpu/drm/nouveau/nvkm/subdev/clk/pllgt215.o
  CC      fs/coredump.o
  CC      fs/drop_caches.o
  CC [M]  drivers/gpu/drm/amd/amdgpu/mxgpu_vi.o
  CC [M]  drivers/gpu/drm/amd/amdgpu/nbio_v6_1.o
  CC [M]  drivers/gpu/drm/amd/amdgpu/soc15.o
  CC [M]  drivers/gpu/drm/i915/display/intel_color.o
  CC [M]  drivers/gpu/drm/i915/display/intel_combo_phy.o
  CC      fs/fhandle.o
  CC [M]  drivers/gpu/drm/i915/display/intel_connector.o
  CC [M]  drivers/gpu/drm/i915/display/intel_crtc.o
  AR      drivers/gpu/drm/built-in.a
  CC [M]  drivers/gpu/drm/i915/display/intel_crtc_state_dump.o
  CC [M]  drivers/gpu/drm/i915/display/intel_cursor.o
  CC [M]  drivers/gpu/drm/xe/i915-display/intel_backlight.o
  CC [M]  drivers/gpu/drm/i915/display/intel_display.o
  CC [M]  drivers/gpu/drm/i915/display/intel_display_driver.o
  CC [M]  drivers/gpu/drm/xe/i915-display/intel_bios.o
  CC [M]  drivers/gpu/drm/nouveau/nvkm/subdev/devinit/base.o
  CC [M]  drivers/gpu/drm/amd/amdgpu/emu_soc.o
  CC [M]  drivers/gpu/drm/nouveau/nvkm/subdev/devinit/nv04.o
  CC [M]  drivers/gpu/drm/nouveau/nvkm/subdev/devinit/nv05.o
  CC [M]  drivers/gpu/drm/nouveau/nvkm/subdev/devinit/nv10.o
  CC [M]  drivers/gpu/drm/i915/display/intel_display_irq.o
  CC [M]  drivers/gpu/drm/xe/i915-display/intel_bw.o
  CC [M]  drivers/gpu/drm/xe/i915-display/intel_cdclk.o
  CC [M]  drivers/gpu/drm/i915/display/intel_display_power.o
  CC [M]  drivers/gpu/drm/nouveau/nvkm/subdev/devinit/nv1a.o
  CC [M]  drivers/gpu/drm/nouveau/nvkm/subdev/devinit/nv20.o
  CC [M]  drivers/gpu/drm/i915/display/intel_display_power_map.o
  CC [M]  drivers/gpu/drm/xe/i915-display/intel_color.o
  CC [M]  drivers/gpu/drm/nouveau/nvkm/subdev/devinit/nv50.o
  LD [M]  drivers/gpu/drm/drm_kms_helper.o
  CC [M]  drivers/gpu/drm/i915/display/intel_display_power_well.o
  CC [M]  drivers/gpu/drm/i915/display/intel_display_reset.o
  CC [M]  drivers/gpu/drm/i915/display/intel_display_rps.o
  CC [M]  drivers/gpu/drm/i915/display/intel_dmc.o
  CC [M]  drivers/gpu/drm/i915/display/intel_dpio_phy.o
  CC [M]  drivers/gpu/drm/amd/amdgpu/mxgpu_ai.o
  CC [M]  drivers/gpu/drm/amd/amdgpu/nbio_v7_0.o
  CC [M]  drivers/gpu/drm/i915/display/intel_dpll.o
  CC [M]  drivers/gpu/drm/nouveau/nvkm/subdev/devinit/g84.o
  CC [M]  drivers/gpu/drm/i915/display/intel_dpll_mgr.o
  CC [M]  drivers/gpu/drm/amd/amdgpu/vega10_reg_init.o
  CC [M]  drivers/gpu/drm/amd/amdgpu/vega20_reg_init.o
  AR      fs/built-in.a
  CC [M]  drivers/gpu/drm/xe/i915-display/intel_combo_phy.o
  CC [M]  drivers/gpu/drm/xe/i915-display/intel_connector.o
  CC [M]  drivers/gpu/drm/xe/i915-display/intel_crtc.o
  CC [M]  drivers/gpu/drm/nouveau/nvkm/subdev/devinit/g98.o
  CC [M]  drivers/gpu/drm/i915/display/intel_dpt.o
  CC [M]  drivers/gpu/drm/amd/amdgpu/nbio_v7_4.o
  CC [M]  drivers/gpu/drm/i915/display/intel_drrs.o
  CC [M]  drivers/gpu/drm/nouveau/nvkm/subdev/devinit/gt215.o
  CC [M]  drivers/gpu/drm/amd/amdgpu/nbio_v2_3.o
  CC [M]  drivers/gpu/drm/nouveau/nvkm/subdev/devinit/mcp89.o
  CC [M]  drivers/gpu/drm/i915/display/intel_dsb.o
  CC [M]  drivers/gpu/drm/i915/display/intel_fb.o
  CC [M]  drivers/gpu/drm/nouveau/nvkm/subdev/devinit/gf100.o
  CC [M]  drivers/gpu/drm/i915/display/intel_fb_pin.o
  CC [M]  drivers/gpu/drm/i915/display/intel_fbc.o
  CC [M]  drivers/gpu/drm/nouveau/nvkm/subdev/devinit/gm107.o
  CC [M]  drivers/gpu/drm/i915/display/intel_fdi.o
  CC [M]  drivers/gpu/drm/nouveau/nvkm/subdev/devinit/gm200.o
  CC [M]  drivers/gpu/drm/amd/amdgpu/nv.o
  CC [M]  drivers/gpu/drm/amd/amdgpu/arct_reg_init.o
  CC [M]  drivers/gpu/drm/i915/display/intel_fifo_underrun.o
  CC [M]  drivers/gpu/drm/nouveau/nvkm/subdev/devinit/gv100.o
  CC [M]  drivers/gpu/drm/i915/display/intel_frontbuffer.o
  CC [M]  drivers/gpu/drm/i915/display/intel_global_state.o
  CC [M]  drivers/gpu/drm/i915/display/intel_hdcp.o
  CC [M]  drivers/gpu/drm/i915/display/intel_hdcp_gsc.o
  CC [M]  drivers/gpu/drm/xe/i915-display/intel_crtc_state_dump.o
  CC [M]  drivers/gpu/drm/xe/i915-display/intel_cursor.o
  CC [M]  drivers/gpu/drm/nouveau/nvkm/subdev/devinit/tu102.o
  CC [M]  drivers/gpu/drm/xe/i915-display/intel_cx0_phy.o
  CC [M]  drivers/gpu/drm/i915/display/intel_hotplug.o
  CC [M]  drivers/gpu/drm/nouveau/nvkm/subdev/devinit/ga100.o
  CC [M]  drivers/gpu/drm/i915/display/intel_hotplug_irq.o
  CC [M]  drivers/gpu/drm/i915/display/intel_hti.o
  CC [M]  drivers/gpu/drm/nouveau/nvkm/subdev/fault/base.o
  CC [M]  drivers/gpu/drm/nouveau/nvkm/subdev/fault/user.o
  CC [M]  drivers/gpu/drm/xe/i915-display/intel_ddi.o
  CC [M]  drivers/gpu/drm/i915/display/intel_load_detect.o
  CC [M]  drivers/gpu/drm/i915/display/intel_lpe_audio.o
  CC [M]  drivers/gpu/drm/nouveau/nvkm/subdev/fault/gp100.o
  CC [M]  drivers/gpu/drm/amd/amdgpu/mxgpu_nv.o
  CC [M]  drivers/gpu/drm/i915/display/intel_modeset_verify.o
  CC [M]  drivers/gpu/drm/xe/i915-display/intel_ddi_buf_trans.o
  CC [M]  drivers/gpu/drm/i915/display/intel_modeset_setup.o
  CC [M]  drivers/gpu/drm/amd/amdgpu/nbio_v7_2.o
  CC [M]  drivers/gpu/drm/amd/amdgpu/hdp_v4_0.o
  CC [M]  drivers/gpu/drm/xe/i915-display/intel_display.o
  CC [M]  drivers/gpu/drm/amd/amdgpu/hdp_v5_0.o
  CC [M]  drivers/gpu/drm/i915/display/intel_overlay.o
  CC [M]  drivers/gpu/drm/i915/display/intel_pch_display.o
  CC [M]  drivers/gpu/drm/i915/display/intel_pch_refclk.o
  CC [M]  drivers/gpu/drm/i915/display/intel_plane_initial.o
  CC [M]  drivers/gpu/drm/xe/i915-display/intel_display_device.o
  CC [M]  drivers/gpu/drm/i915/display/intel_psr.o
  CC [M]  drivers/gpu/drm/i915/display/intel_quirks.o
  CC [M]  drivers/gpu/drm/xe/i915-display/intel_display_driver.o
  CC [M]  drivers/gpu/drm/amd/amdgpu/aldebaran_reg_init.o
  CC [M]  drivers/gpu/drm/nouveau/nvkm/subdev/fault/gp10b.o
  CC [M]  drivers/gpu/drm/nouveau/nvkm/subdev/fault/gv100.o
  CC [M]  drivers/gpu/drm/nouveau/nvkm/subdev/fault/tu102.o
  CC [M]  drivers/gpu/drm/i915/display/intel_sprite.o
  CC [M]  drivers/gpu/drm/i915/display/intel_sprite_uapi.o
  CC [M]  drivers/gpu/drm/nouveau/nvkm/subdev/fb/base.o
  CC [M]  drivers/gpu/drm/nouveau/nvkm/subdev/fb/nv04.o
  CC [M]  drivers/gpu/drm/i915/display/intel_tc.o
  CC [M]  drivers/gpu/drm/i915/display/intel_vblank.o
  CC [M]  drivers/gpu/drm/nouveau/nvkm/subdev/fb/nv10.o
  CC [M]  drivers/gpu/drm/i915/display/intel_vga.o
  CC [M]  drivers/gpu/drm/i915/display/intel_wm.o
  CC [M]  drivers/gpu/drm/i915/display/i9xx_plane.o
  CC [M]  drivers/gpu/drm/i915/display/i9xx_wm.o
  CC [M]  drivers/gpu/drm/xe/i915-display/intel_display_debugfs.o
  CC [M]  drivers/gpu/drm/i915/display/skl_scaler.o
  CC [M]  drivers/gpu/drm/xe/i915-display/intel_display_power.o
  CC [M]  drivers/gpu/drm/i915/display/skl_universal_plane.o
  CC [M]  drivers/gpu/drm/i915/display/skl_watermark.o
  CC [M]  drivers/gpu/drm/amd/amdgpu/aldebaran.o
  CC [M]  drivers/gpu/drm/i915/display/intel_acpi.o
  CC [M]  drivers/gpu/drm/xe/i915-display/intel_display_power_map.o
  CC [M]  drivers/gpu/drm/i915/display/intel_opregion.o
  CC [M]  drivers/gpu/drm/nouveau/nvkm/subdev/fb/nv1a.o
  CC [M]  drivers/gpu/drm/nouveau/nvkm/subdev/fb/nv20.o
  CC [M]  drivers/gpu/drm/xe/i915-display/intel_display_power_well.o
  CC [M]  drivers/gpu/drm/xe/i915-display/intel_display_trace.o
  CC [M]  drivers/gpu/drm/nouveau/nvkm/subdev/fb/nv25.o
  CC [M]  drivers/gpu/drm/nouveau/nvkm/subdev/fb/nv30.o
  CC [M]  drivers/gpu/drm/xe/i915-display/intel_dkl_phy.o
  CC [M]  drivers/gpu/drm/nouveau/nvkm/subdev/fb/nv35.o
  CC [M]  drivers/gpu/drm/amd/amdgpu/soc21.o
  CC [M]  drivers/gpu/drm/amd/amdgpu/sienna_cichlid.o
  CC [M]  drivers/gpu/drm/xe/i915-display/intel_dmc.o
  CC [M]  drivers/gpu/drm/nouveau/nvkm/subdev/fb/nv36.o
  CC [M]  drivers/gpu/drm/i915/display/intel_fbdev.o
  CC [M]  drivers/gpu/drm/nouveau/nvkm/subdev/fb/nv40.o
  CC [M]  drivers/gpu/drm/nouveau/nvkm/subdev/fb/nv41.o
  CC [M]  drivers/gpu/drm/nouveau/nvkm/subdev/fb/nv44.o
  CC [M]  drivers/gpu/drm/nouveau/nvkm/subdev/fb/nv46.o
  CC [M]  drivers/gpu/drm/i915/display/dvo_ch7017.o
  CC [M]  drivers/gpu/drm/i915/display/dvo_ch7xxx.o
  CC [M]  drivers/gpu/drm/nouveau/nvkm/subdev/fb/nv47.o
  CC [M]  drivers/gpu/drm/i915/display/dvo_ivch.o
  CC [M]  drivers/gpu/drm/i915/display/dvo_ns2501.o
  CC [M]  drivers/gpu/drm/amd/amdgpu/smu_v13_0_10.o
  CC [M]  drivers/gpu/drm/i915/display/dvo_sil164.o
  CC [M]  drivers/gpu/drm/i915/display/dvo_tfp410.o
  CC [M]  drivers/gpu/drm/amd/amdgpu/nbio_v4_3.o
  CC [M]  drivers/gpu/drm/nouveau/nvkm/subdev/fb/nv49.o
  CC [M]  drivers/gpu/drm/i915/display/g4x_dp.o
  CC [M]  drivers/gpu/drm/i915/display/g4x_hdmi.o
  CC [M]  drivers/gpu/drm/i915/display/icl_dsi.o
  CC [M]  drivers/gpu/drm/nouveau/nvkm/subdev/fb/nv4e.o
  CC [M]  drivers/gpu/drm/i915/display/intel_backlight.o
  CC [M]  drivers/gpu/drm/i915/display/intel_crt.o
  CC [M]  drivers/gpu/drm/i915/display/intel_cx0_phy.o
  CC [M]  drivers/gpu/drm/i915/display/intel_ddi.o
  CC [M]  drivers/gpu/drm/nouveau/nvkm/subdev/fb/nv50.o
  CC [M]  drivers/gpu/drm/xe/i915-display/intel_dp.o
  CC [M]  drivers/gpu/drm/amd/amdgpu/hdp_v6_0.o
  CC [M]  drivers/gpu/drm/nouveau/nvkm/subdev/fb/g84.o
  CC [M]  drivers/gpu/drm/nouveau/nvkm/subdev/fb/gt215.o
  CC [M]  drivers/gpu/drm/i915/display/intel_ddi_buf_trans.o
  CC [M]  drivers/gpu/drm/xe/i915-display/intel_dp_aux.o
  CC [M]  drivers/gpu/drm/xe/i915-display/intel_dp_aux_backlight.o
  CC [M]  drivers/gpu/drm/nouveau/nvkm/subdev/fb/mcp77.o
  CC [M]  drivers/gpu/drm/xe/i915-display/intel_dp_hdcp.o
  CC [M]  drivers/gpu/drm/nouveau/nvkm/subdev/fb/mcp89.o
  CC [M]  drivers/gpu/drm/xe/i915-display/intel_dp_link_training.o
  CC [M]  drivers/gpu/drm/xe/i915-display/intel_dp_mst.o
  CC [M]  drivers/gpu/drm/i915/display/intel_display_device.o
  CC [M]  drivers/gpu/drm/i915/display/intel_display_trace.o
  CC [M]  drivers/gpu/drm/amd/amdgpu/nbio_v7_7.o
  CC [M]  drivers/gpu/drm/amd/amdgpu/hdp_v5_2.o
  CC [M]  drivers/gpu/drm/i915/display/intel_dkl_phy.o
  CC [M]  drivers/gpu/drm/i915/display/intel_dp.o
  CC [M]  drivers/gpu/drm/xe/i915-display/intel_dpll.o
  CC [M]  drivers/gpu/drm/i915/display/intel_dp_aux.o
  CC [M]  drivers/gpu/drm/i915/display/intel_dp_aux_backlight.o
  CC [M]  drivers/gpu/drm/i915/display/intel_dp_hdcp.o
  CC [M]  drivers/gpu/drm/nouveau/nvkm/subdev/fb/gf100.o
  CC [M]  drivers/gpu/drm/xe/i915-display/intel_dpll_mgr.o
  CC [M]  drivers/gpu/drm/i915/display/intel_dp_link_training.o
  CC [M]  drivers/gpu/drm/i915/display/intel_dp_mst.o
  CC [M]  drivers/gpu/drm/xe/i915-display/intel_dpt.o
  CC [M]  drivers/gpu/drm/i915/display/intel_dsi.o
  CC [M]  drivers/gpu/drm/nouveau/nvkm/subdev/fb/gf108.o
  CC [M]  drivers/gpu/drm/nouveau/nvkm/subdev/fb/gk104.o
  CC [M]  drivers/gpu/drm/i915/display/intel_dsi_dcs_backlight.o
  CC [M]  drivers/gpu/drm/nouveau/nvkm/subdev/fb/gk110.o
  CC [M]  drivers/gpu/drm/nouveau/nvkm/subdev/fb/gk20a.o
  CC [M]  drivers/gpu/drm/amd/amdgpu/lsdma_v6_0.o
  CC [M]  drivers/gpu/drm/xe/i915-display/intel_drrs.o
  CC [M]  drivers/gpu/drm/nouveau/nvkm/subdev/fb/gm107.o
  CC [M]  drivers/gpu/drm/i915/display/intel_dsi_vbt.o
  CC [M]  drivers/gpu/drm/i915/display/intel_dvo.o
  CC [M]  drivers/gpu/drm/amd/amdgpu/nbio_v7_9.o
  CC [M]  drivers/gpu/drm/xe/i915-display/intel_dsb.o
  CC [M]  drivers/gpu/drm/i915/display/intel_gmbus.o
  CC [M]  drivers/gpu/drm/xe/i915-display/intel_dsi.o
  CC [M]  drivers/gpu/drm/i915/display/intel_hdmi.o
  CC [M]  drivers/gpu/drm/i915/display/intel_lspcon.o
  CC [M]  drivers/gpu/drm/amd/amdgpu/df_v1_7.o
  CC [M]  drivers/gpu/drm/amd/amdgpu/df_v3_6.o
  CC [M]  drivers/gpu/drm/i915/display/intel_lvds.o
  CC [M]  drivers/gpu/drm/amd/amdgpu/df_v4_3.o
  CC [M]  drivers/gpu/drm/i915/display/intel_panel.o
  CC [M]  drivers/gpu/drm/i915/display/intel_pps.o
  CC [M]  drivers/gpu/drm/nouveau/nvkm/subdev/fb/gm200.o
  CC [M]  drivers/gpu/drm/nouveau/nvkm/subdev/fb/gm20b.o
  CC [M]  drivers/gpu/drm/xe/i915-display/intel_dsi_dcs_backlight.o
  CC [M]  drivers/gpu/drm/xe/i915-display/intel_dsi_vbt.o
  CC [M]  drivers/gpu/drm/xe/i915-display/intel_fb.o
  CC [M]  drivers/gpu/drm/i915/display/intel_qp_tables.o
  CC [M]  drivers/gpu/drm/xe/i915-display/intel_fbc.o
  CC [M]  drivers/gpu/drm/i915/display/intel_sdvo.o
  CC [M]  drivers/gpu/drm/nouveau/nvkm/subdev/fb/gp100.o
  CC [M]  drivers/gpu/drm/nouveau/nvkm/subdev/fb/gp102.o
  CC [M]  drivers/gpu/drm/i915/display/intel_snps_phy.o
  CC [M]  drivers/gpu/drm/i915/display/intel_tv.o
  CC [M]  drivers/gpu/drm/amd/amdgpu/gmc_v7_0.o
  CC [M]  drivers/gpu/drm/amd/amdgpu/gmc_v8_0.o
  CC [M]  drivers/gpu/drm/i915/display/intel_vdsc.o
  CC [M]  drivers/gpu/drm/i915/display/intel_vrr.o
  CC [M]  drivers/gpu/drm/i915/display/vlv_dsi.o
  CC [M]  drivers/gpu/drm/i915/display/vlv_dsi_pll.o
  CC [M]  drivers/gpu/drm/i915/i915_perf.o
  CC [M]  drivers/gpu/drm/i915/pxp/intel_pxp.o
  CC [M]  drivers/gpu/drm/i915/pxp/intel_pxp_tee.o
  CC [M]  drivers/gpu/drm/xe/i915-display/intel_fdi.o
  CC [M]  drivers/gpu/drm/i915/pxp/intel_pxp_huc.o
  CC [M]  drivers/gpu/drm/xe/i915-display/intel_fifo_underrun.o
  CC [M]  drivers/gpu/drm/i915/pxp/intel_pxp_cmd.o
  CC [M]  drivers/gpu/drm/i915/pxp/intel_pxp_debugfs.o
  CC [M]  drivers/gpu/drm/i915/pxp/intel_pxp_irq.o
  CC [M]  drivers/gpu/drm/i915/pxp/intel_pxp_pm.o
  CC [M]  drivers/gpu/drm/nouveau/nvkm/subdev/fb/gp10b.o
  CC [M]  drivers/gpu/drm/xe/i915-display/intel_frontbuffer.o
  CC [M]  drivers/gpu/drm/xe/i915-display/intel_global_state.o
  CC [M]  drivers/gpu/drm/amd/amdgpu/gfxhub_v1_0.o
  CC [M]  drivers/gpu/drm/amd/amdgpu/mmhub_v1_0.o
  CC [M]  drivers/gpu/drm/amd/amdgpu/gmc_v9_0.o
  CC [M]  drivers/gpu/drm/i915/pxp/intel_pxp_session.o
  CC [M]  drivers/gpu/drm/i915/i915_gpu_error.o
  CC [M]  drivers/gpu/drm/nouveau/nvkm/subdev/fb/gv100.o
  CC [M]  drivers/gpu/drm/i915/gem/selftests/i915_gem_client_blt.o
  CC [M]  drivers/gpu/drm/nouveau/nvkm/subdev/fb/tu102.o
  CC [M]  drivers/gpu/drm/amd/amdgpu/gfxhub_v1_1.o
  CC [M]  drivers/gpu/drm/xe/i915-display/intel_gmbus.o
  CC [M]  drivers/gpu/drm/i915/gem/selftests/igt_gem_utils.o
  CC [M]  drivers/gpu/drm/i915/selftests/intel_scheduler_helpers.o
  CC [M]  drivers/gpu/drm/xe/i915-display/intel_hdcp.o
  CC [M]  drivers/gpu/drm/i915/selftests/i915_random.o
  CC [M]  drivers/gpu/drm/i915/selftests/i915_selftest.o
  CC [M]  drivers/gpu/drm/i915/selftests/igt_atomic.o
  CC [M]  drivers/gpu/drm/i915/selftests/igt_flush_test.o
  CC [M]  drivers/gpu/drm/i915/selftests/igt_live_test.o
  CC [M]  drivers/gpu/drm/i915/selftests/igt_mmap.o
  CC [M]  drivers/gpu/drm/nouveau/nvkm/subdev/fb/ga100.o
  CC [M]  drivers/gpu/drm/i915/selftests/igt_reset.o
  CC [M]  drivers/gpu/drm/i915/selftests/igt_spinner.o
  CC [M]  drivers/gpu/drm/amd/amdgpu/mmhub_v9_4.o
  CC [M]  drivers/gpu/drm/xe/i915-display/intel_hdmi.o
  CC [M]  drivers/gpu/drm/i915/selftests/librapl.o
  CC [M]  drivers/gpu/drm/i915/i915_vgpu.o
  CC [M]  drivers/gpu/drm/nouveau/nvkm/subdev/fb/ga102.o
  HDRTEST drivers/gpu/drm/i915/display/intel_dkl_phy_regs.h
  HDRTEST drivers/gpu/drm/i915/display/intel_crtc_state_dump.h
  CC [M]  drivers/gpu/drm/nouveau/nvkm/subdev/fb/ram.o
  HDRTEST drivers/gpu/drm/i915/display/hsw_ips.h
  HDRTEST drivers/gpu/drm/i915/display/g4x_hdmi.h
  CC [M]  drivers/gpu/drm/nouveau/nvkm/subdev/fb/ramnv04.o
  HDRTEST drivers/gpu/drm/i915/display/intel_hdcp_regs.h
  HDRTEST drivers/gpu/drm/i915/display/intel_overlay.h
  CC [M]  drivers/gpu/drm/xe/i915-display/intel_hotplug.o
  CC [M]  drivers/gpu/drm/xe/i915-display/intel_hotplug_irq.o
  HDRTEST drivers/gpu/drm/i915/display/intel_display.h
  CC [M]  drivers/gpu/drm/xe/i915-display/intel_hti.o
  HDRTEST drivers/gpu/drm/i915/display/skl_watermark_regs.h
  HDRTEST drivers/gpu/drm/i915/display/intel_dmc.h
  CC [M]  drivers/gpu/drm/xe/i915-display/intel_lspcon.o
  HDRTEST drivers/gpu/drm/i915/display/intel_vga.h
  HDRTEST drivers/gpu/drm/i915/display/intel_audio.h
  CC [M]  drivers/gpu/drm/xe/i915-display/intel_modeset_setup.o
  HDRTEST drivers/gpu/drm/i915/display/intel_lvds.h
  HDRTEST drivers/gpu/drm/i915/display/intel_modeset_setup.h
  CC [M]  drivers/gpu/drm/xe/i915-display/intel_modeset_verify.o
  CC [M]  drivers/gpu/drm/amd/amdgpu/gfxhub_v2_0.o
  HDRTEST drivers/gpu/drm/i915/display/intel_cdclk.h
  HDRTEST drivers/gpu/drm/i915/display/intel_display_limits.h
  HDRTEST drivers/gpu/drm/i915/display/intel_hotplug.h
  HDRTEST drivers/gpu/drm/i915/display/intel_dkl_phy.h
  CC [M]  drivers/gpu/drm/nouveau/nvkm/subdev/fb/ramnv10.o
  HDRTEST drivers/gpu/drm/i915/display/intel_atomic.h
  CC [M]  drivers/gpu/drm/xe/i915-display/intel_panel.o
  HDRTEST drivers/gpu/drm/i915/display/intel_display_driver.h
  HDRTEST drivers/gpu/drm/i915/display/intel_dpll.h
  HDRTEST drivers/gpu/drm/i915/display/vlv_dsi_pll_regs.h
  HDRTEST drivers/gpu/drm/i915/display/intel_dp_mst.h
  HDRTEST drivers/gpu/drm/i915/display/intel_fdi_regs.h
  HDRTEST drivers/gpu/drm/i915/display/g4x_dp.h
  HDRTEST drivers/gpu/drm/i915/display/intel_tc.h
  HDRTEST drivers/gpu/drm/i915/display/intel_frontbuffer.h
  HDRTEST drivers/gpu/drm/i915/display/intel_dsi_vbt.h
  HDRTEST drivers/gpu/drm/i915/display/intel_psr.h
  HDRTEST drivers/gpu/drm/i915/display/intel_crt.h
  HDRTEST drivers/gpu/drm/i915/display/intel_opregion.h
  CC [M]  drivers/gpu/drm/amd/amdgpu/mmhub_v2_0.o
  CC [M]  drivers/gpu/drm/xe/i915-display/intel_pipe_crc.o
  HDRTEST drivers/gpu/drm/i915/display/intel_snps_phy_regs.h
  HDRTEST drivers/gpu/drm/i915/display/i9xx_wm.h
  HDRTEST drivers/gpu/drm/i915/display/intel_cx0_phy_regs.h
  HDRTEST drivers/gpu/drm/i915/display/intel_global_state.h
  HDRTEST drivers/gpu/drm/i915/display/intel_lpe_audio.h
  CC [M]  drivers/gpu/drm/nouveau/nvkm/subdev/fb/ramnv1a.o
  HDRTEST drivers/gpu/drm/i915/display/intel_drrs.h
  CC [M]  drivers/gpu/drm/nouveau/nvkm/subdev/fb/ramnv20.o
  HDRTEST drivers/gpu/drm/i915/display/intel_display_rps.h
  CC [M]  drivers/gpu/drm/xe/i915-display/intel_pps.o
  CC [M]  drivers/gpu/drm/xe/i915-display/intel_psr.o
  CC [M]  drivers/gpu/drm/amd/amdgpu/gmc_v10_0.o
  CC [M]  drivers/gpu/drm/nouveau/nvkm/subdev/fb/ramnv40.o
  CC [M]  drivers/gpu/drm/nouveau/nvkm/subdev/fb/ramnv41.o
  CC [M]  drivers/gpu/drm/amd/amdgpu/gfxhub_v2_1.o
  CC [M]  drivers/gpu/drm/amd/amdgpu/mmhub_v2_3.o
  CC [M]  drivers/gpu/drm/xe/i915-display/intel_qp_tables.o
  HDRTEST drivers/gpu/drm/i915/display/intel_fbdev.h
  CC [M]  drivers/gpu/drm/amd/amdgpu/mmhub_v1_7.o
  HDRTEST drivers/gpu/drm/i915/display/intel_pps_regs.h
  CC [M]  drivers/gpu/drm/nouveau/nvkm/subdev/fb/ramnv44.o
  HDRTEST drivers/gpu/drm/i915/display/intel_hdmi.h
  HDRTEST drivers/gpu/drm/i915/display/intel_fdi.h
  HDRTEST drivers/gpu/drm/i915/display/intel_fb.h
  CC [M]  drivers/gpu/drm/nouveau/nvkm/subdev/fb/ramnv49.o
  CC [M]  drivers/gpu/drm/nouveau/nvkm/subdev/fb/ramnv4e.o
  HDRTEST drivers/gpu/drm/i915/display/intel_qp_tables.h
  CC [M]  drivers/gpu/drm/nouveau/nvkm/subdev/fb/ramnv50.o
  HDRTEST drivers/gpu/drm/i915/display/intel_dsb_regs.h
  HDRTEST drivers/gpu/drm/i915/display/intel_vdsc.h
  CC [M]  drivers/gpu/drm/nouveau/nvkm/subdev/fb/ramgt215.o
  HDRTEST drivers/gpu/drm/i915/display/intel_snps_phy.h
  HDRTEST drivers/gpu/drm/i915/display/intel_display_core.h
  CC [M]  drivers/gpu/drm/xe/i915-display/intel_quirks.o
  HDRTEST drivers/gpu/drm/i915/display/vlv_dsi_pll.h
  CC [M]  drivers/gpu/drm/nouveau/nvkm/subdev/fb/rammcp77.o
  HDRTEST drivers/gpu/drm/i915/display/intel_dvo_dev.h
  HDRTEST drivers/gpu/drm/i915/display/intel_hdcp.h
  CC [M]  drivers/gpu/drm/xe/i915-display/intel_snps_phy.o
  HDRTEST drivers/gpu/drm/i915/display/intel_sdvo_regs.h
  CC [M]  drivers/gpu/drm/amd/amdgpu/gfxhub_v3_0.o
  HDRTEST drivers/gpu/drm/i915/display/intel_pch_refclk.h
  CC [M]  drivers/gpu/drm/amd/amdgpu/mmhub_v3_0.o
  CC [M]  drivers/gpu/drm/xe/i915-display/intel_tc.o
  HDRTEST drivers/gpu/drm/i915/display/intel_display_trace.h
  HDRTEST drivers/gpu/drm/i915/display/intel_display_power.h
  HDRTEST drivers/gpu/drm/i915/display/intel_dp_aux_regs.h
  HDRTEST drivers/gpu/drm/i915/display/i9xx_plane.h
  CC [M]  drivers/gpu/drm/xe/i915-display/intel_vblank.o
  CC [M]  drivers/gpu/drm/xe/i915-display/intel_vdsc.o
  CC [M]  drivers/gpu/drm/nouveau/nvkm/subdev/fb/ramgf100.o
  HDRTEST drivers/gpu/drm/i915/display/intel_dp_aux_backlight.h
  CC [M]  drivers/gpu/drm/nouveau/nvkm/subdev/fb/ramgf108.o
  HDRTEST drivers/gpu/drm/i915/display/intel_dpll_mgr.h
  CC [M]  drivers/gpu/drm/nouveau/nvkm/subdev/fb/ramgk104.o
  HDRTEST drivers/gpu/drm/i915/display/vlv_dsi.h
  HDRTEST drivers/gpu/drm/i915/display/intel_plane_initial.h
  CC [M]  drivers/gpu/drm/nouveau/nvkm/subdev/fb/ramgm107.o
  CC [M]  drivers/gpu/drm/xe/i915-display/intel_vga.o
  HDRTEST drivers/gpu/drm/i915/display/intel_display_device.h
  CC [M]  drivers/gpu/drm/nouveau/nvkm/subdev/fb/ramgm200.o
  HDRTEST drivers/gpu/drm/i915/display/intel_fifo_underrun.h
  CC [M]  drivers/gpu/drm/xe/i915-display/intel_vrr.o
  CC [M]  drivers/gpu/drm/nouveau/nvkm/subdev/fb/ramgp100.o
  CC [M]  drivers/gpu/drm/nouveau/nvkm/subdev/fb/ramga102.o
  HDRTEST drivers/gpu/drm/i915/display/intel_cursor.h
  HDRTEST drivers/gpu/drm/i915/display/vlv_dsi_regs.h
  HDRTEST drivers/gpu/drm/i915/display/intel_cx0_phy.h
  HDRTEST drivers/gpu/drm/i915/display/skl_scaler.h
  CC [M]  drivers/gpu/drm/amd/amdgpu/mmhub_v3_0_2.o
  HDRTEST drivers/gpu/drm/i915/display/intel_hti.h
  CC [M]  drivers/gpu/drm/nouveau/nvkm/subdev/fb/sddr2.o
  HDRTEST drivers/gpu/drm/i915/display/icl_dsi_regs.h
  CC [M]  drivers/gpu/drm/amd/amdgpu/gmc_v11_0.o
  HDRTEST drivers/gpu/drm/i915/display/intel_atomic_plane.h
  CC [M]  drivers/gpu/drm/nouveau/nvkm/subdev/fb/sddr3.o
  HDRTEST drivers/gpu/drm/i915/display/skl_watermark.h
  HDRTEST drivers/gpu/drm/i915/display/intel_fbc.h
  HDRTEST drivers/gpu/drm/i915/display/intel_display_reg_defs.h
  CC [M]  drivers/gpu/drm/nouveau/nvkm/subdev/fb/gddr3.o
../drivers/gpu/drm/i915/i915_gpu_error.c:2174: warning: Function parameter or member 'dump_flags' not described in 'i915_capture_error_state'
  HDRTEST drivers/gpu/drm/i915/display/intel_acpi.h
  CC [M]  drivers/gpu/drm/amd/amdgpu/mmhub_v3_0_1.o
  HDRTEST drivers/gpu/drm/i915/display/intel_connector.h
  HDRTEST drivers/gpu/drm/i915/display/intel_dpt.h
  CC [M]  drivers/gpu/drm/amd/amdgpu/gfxhub_v3_0_3.o
  HDRTEST drivers/gpu/drm/i915/display/intel_quirks.h
  HDRTEST drivers/gpu/drm/i915/display/intel_dp_link_training.h
  CC [M]  drivers/gpu/drm/amd/amdgpu/gfxhub_v1_2.o
  HDRTEST drivers/gpu/drm/i915/display/intel_color.h
  HDRTEST drivers/gpu/drm/i915/display/intel_crtc.h
  CC [M]  drivers/gpu/drm/amd/amdgpu/mmhub_v1_8.o
  HDRTEST drivers/gpu/drm/i915/display/intel_display_debugfs.h
  HDRTEST drivers/gpu/drm/i915/display/intel_modeset_verify.h
  HDRTEST drivers/gpu/drm/i915/display/intel_display_power_well.h
  HDRTEST drivers/gpu/drm/i915/display/intel_psr_regs.h
  CC [M]  drivers/gpu/drm/amd/amdgpu/umc_v6_0.o
  CC [M]  drivers/gpu/drm/nouveau/nvkm/subdev/fb/gddr5.o
  CC [M]  drivers/gpu/drm/nouveau/nvkm/subdev/fuse/base.o
  CC [M]  drivers/gpu/drm/xe/i915-display/intel_wm.o
  HDRTEST drivers/gpu/drm/i915/display/intel_wm.h
  HDRTEST drivers/gpu/drm/i915/display/intel_pipe_crc.h
  CC [M]  drivers/gpu/drm/nouveau/nvkm/subdev/fuse/nv50.o
  CC [M]  drivers/gpu/drm/amd/amdgpu/umc_v6_1.o
  HDRTEST drivers/gpu/drm/i915/display/intel_audio_regs.h
  HDRTEST drivers/gpu/drm/i915/display/intel_panel.h
  HDRTEST drivers/gpu/drm/i915/display/intel_sprite.h
  HDRTEST drivers/gpu/drm/i915/display/intel_wm_types.h
  HDRTEST drivers/gpu/drm/i915/display/intel_tv.h
  HDRTEST drivers/gpu/drm/i915/display/intel_hti_regs.h
  HDRTEST drivers/gpu/drm/i915/display/intel_vrr.h
  HDRTEST drivers/gpu/drm/i915/display/intel_load_detect.h
  CC [M]  drivers/gpu/drm/nouveau/nvkm/subdev/fuse/gf100.o
  CC [M]  drivers/gpu/drm/nouveau/nvkm/subdev/fuse/gm107.o
  CC [M]  drivers/gpu/drm/amd/amdgpu/umc_v6_7.o
  CC [M]  drivers/gpu/drm/amd/amdgpu/umc_v8_7.o
  CC [M]  drivers/gpu/drm/xe/i915-display/skl_scaler.o
  HDRTEST drivers/gpu/drm/i915/display/skl_universal_plane.h
  CC [M]  drivers/gpu/drm/xe/i915-display/skl_universal_plane.o
  CC [M]  drivers/gpu/drm/xe/i915-display/skl_watermark.o
  CC [M]  drivers/gpu/drm/nouveau/nvkm/subdev/gpio/base.o
  HDRTEST drivers/gpu/drm/i915/display/intel_mg_phy_regs.h
  HDRTEST drivers/gpu/drm/i915/display/intel_bw.h
  CC [M]  drivers/gpu/drm/xe/xe_pmu.o
  HDRTEST drivers/gpu/drm/i915/display/intel_display_irq.h
  CC [M]  drivers/gpu/drm/nouveau/nvkm/subdev/gpio/nv10.o
  CC [M]  drivers/gpu/drm/amd/amdgpu/umc_v8_10.o
  HDRTEST drivers/gpu/drm/i915/display/intel_de.h
  CC [M]  drivers/gpu/drm/amd/amdgpu/amdgpu_irq.o
  CC [M]  drivers/gpu/drm/nouveau/nvkm/subdev/gpio/nv50.o
../drivers/gpu/drm/i915/i915_perf.c:5307: warning: Function parameter or member 'i915' not described in 'i915_perf_ioctl_version'
  CC [M]  drivers/gpu/drm/amd/amdgpu/amdgpu_ih.o
  CC [M]  drivers/gpu/drm/amd/amdgpu/iceland_ih.o
  CC [M]  drivers/gpu/drm/nouveau/nvkm/subdev/gpio/g94.o
  HDRTEST drivers/gpu/drm/i915/display/intel_lvds_regs.h
  CC [M]  drivers/gpu/drm/nouveau/nvkm/subdev/gpio/gf119.o
  CC [M]  drivers/gpu/drm/amd/amdgpu/tonga_ih.o
  CC [M]  drivers/gpu/drm/nouveau/nvkm/subdev/gpio/gk104.o
  HDRTEST drivers/gpu/drm/i915/display/intel_gmbus_regs.h
  HDRTEST drivers/gpu/drm/i915/display/intel_dsi_dcs_backlight.h
  HDRTEST drivers/gpu/drm/i915/display/intel_dvo.h
  HDRTEST drivers/gpu/drm/i915/display/intel_sdvo.h
  CC [M]  drivers/gpu/drm/amd/amdgpu/cz_ih.o
  HDRTEST drivers/gpu/drm/i915/display/intel_dp_aux.h
  HDRTEST drivers/gpu/drm/i915/display/intel_vdsc_regs.h
  CC [M]  drivers/gpu/drm/xe/i915-display/intel_acpi.o
  CC [M]  drivers/gpu/drm/xe/i915-display/intel_opregion.o
  HDRTEST drivers/gpu/drm/i915/display/intel_combo_phy.h
  CC [M]  drivers/gpu/drm/nouveau/nvkm/subdev/gpio/ga102.o
  CC [M]  drivers/gpu/drm/nouveau/nvkm/subdev/gsp/base.o
  CC [M]  drivers/gpu/drm/nouveau/nvkm/subdev/gsp/gv100.o
  CC [M]  drivers/gpu/drm/nouveau/nvkm/subdev/gsp/ga102.o
  HDRTEST drivers/gpu/drm/i915/display/intel_dvo_regs.h
  HDRTEST drivers/gpu/drm/i915/display/intel_gmbus.h
  CC [M]  drivers/gpu/drm/amd/amdgpu/vega10_ih.o
  HDRTEST drivers/gpu/drm/i915/display/intel_hdcp_gsc.h
  CC [M]  drivers/gpu/drm/xe/i915-display/intel_fbdev.o
  CC [M]  drivers/gpu/drm/amd/amdgpu/vega20_ih.o
  CC [M]  drivers/gpu/drm/nouveau/nvkm/subdev/i2c/base.o
  HDRTEST drivers/gpu/drm/i915/display/intel_dsi.h
  HDRTEST drivers/gpu/drm/i915/display/intel_dmc_regs.h
  CC [M]  drivers/gpu/drm/amd/amdgpu/navi10_ih.o
  CC [M]  drivers/gpu/drm/nouveau/nvkm/subdev/i2c/nv04.o
  CC [M]  drivers/gpu/drm/nouveau/nvkm/subdev/i2c/nv4e.o
  HDRTEST drivers/gpu/drm/i915/display/intel_ddi.h
  HDRTEST drivers/gpu/drm/i915/display/intel_hotplug_irq.h
  CC [M]  drivers/gpu/drm/xe/xe_guc.o
  CC [M]  drivers/gpu/drm/amd/amdgpu/ih_v6_0.o
  CC [M]  drivers/gpu/drm/amd/amdgpu/amdgpu_psp.o
  HDRTEST drivers/gpu/drm/i915/display/intel_tv_regs.h
  CC [M]  drivers/gpu/drm/xe/xe_ring_ops.o
  CC [M]  drivers/gpu/drm/amd/amdgpu/psp_v3_1.o
  HDRTEST drivers/gpu/drm/xe/abi/guc_klvs_abi.h
  CC [M]  drivers/gpu/drm/nouveau/nvkm/subdev/i2c/nv50.o
  HDRTEST drivers/gpu/drm/xe/abi/guc_errors_abi.h
  CC [M]  drivers/gpu/drm/amd/amdgpu/psp_v10_0.o
  CC [M]  drivers/gpu/drm/amd/amdgpu/psp_v11_0.o
  CC [M]  drivers/gpu/drm/amd/amdgpu/psp_v11_0_8.o
  HDRTEST drivers/gpu/drm/xe/abi/guc_actions_slpc_abi.h
  CC [M]  drivers/gpu/drm/amd/amdgpu/psp_v12_0.o
  CC [M]  drivers/gpu/drm/nouveau/nvkm/subdev/i2c/g94.o
  CC [M]  drivers/gpu/drm/amd/amdgpu/psp_v13_0.o
  HDRTEST drivers/gpu/drm/xe/abi/guc_communication_mmio_abi.h
  CC [M]  drivers/gpu/drm/amd/amdgpu/psp_v13_0_4.o
  CC [M]  drivers/gpu/drm/amd/amdgpu/dce_v10_0.o
  CC [M]  drivers/gpu/drm/nouveau/nvkm/subdev/i2c/gf117.o
  CC [M]  drivers/gpu/drm/amd/amdgpu/dce_v11_0.o
  CC [M]  drivers/gpu/drm/nouveau/nvkm/subdev/i2c/gf119.o
  HDRTEST drivers/gpu/drm/xe/abi/guc_actions_abi.h
  HDRTEST drivers/gpu/drm/i915/display/intel_dsb.h
  HDRTEST drivers/gpu/drm/xe/abi/guc_communication_ctb_abi.h
  CC [M]  drivers/gpu/drm/amd/amdgpu/amdgpu_vkms.o
  CC [M]  drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.o
  CC [M]  drivers/gpu/drm/amd/amdgpu/amdgpu_rlc.o
  HDRTEST drivers/gpu/drm/xe/abi/guc_messages_abi.h
  HDRTEST drivers/gpu/drm/i915/display/intel_bios.h
  HDRTEST drivers/gpu/drm/xe/compat-i915-headers/i915_vma_types.h
  HDRTEST drivers/gpu/drm/xe/compat-i915-headers/vlv_sideband_reg.h
  CC [M]  drivers/gpu/drm/amd/amdgpu/gfx_v8_0.o
  HDRTEST drivers/gpu/drm/i915/display/intel_pch_display.h
  CC [M]  drivers/gpu/drm/amd/amdgpu/gfx_v9_0.o
  HDRTEST drivers/gpu/drm/xe/compat-i915-headers/intel_wakeref.h
  HDRTEST drivers/gpu/drm/xe/compat-i915-headers/intel_pcode.h
  HDRTEST drivers/gpu/drm/i915/display/intel_display_types.h
  HDRTEST drivers/gpu/drm/i915/display/intel_backlight.h
  HDRTEST drivers/gpu/drm/i915/display/intel_vblank.h
  CC [M]  drivers/gpu/drm/amd/amdgpu/gfx_v9_4.o
  HDRTEST drivers/gpu/drm/xe/compat-i915-headers/i915_drv.h
  HDRTEST drivers/gpu/drm/i915/display/intel_dp.h
  HDRTEST drivers/gpu/drm/xe/compat-i915-headers/i915_reg_defs.h
  CC [M]  drivers/gpu/drm/nouveau/nvkm/subdev/i2c/gk104.o
  HDRTEST drivers/gpu/drm/i915/display/intel_backlight_regs.h
  HDRTEST drivers/gpu/drm/i915/display/intel_combo_phy_regs.h
  HDRTEST drivers/gpu/drm/xe/compat-i915-headers/i915_trace.h
  CC [M]  drivers/gpu/drm/nouveau/nvkm/subdev/i2c/gk110.o
  CC [M]  drivers/gpu/drm/amd/amdgpu/gfx_v9_4_2.o
  CC [M]  drivers/gpu/drm/nouveau/nvkm/subdev/i2c/gm200.o
  HDRTEST drivers/gpu/drm/xe/compat-i915-headers/i915_reg.h
  HDRTEST drivers/gpu/drm/i915/display/intel_display_reset.h
  HDRTEST drivers/gpu/drm/xe/compat-i915-headers/i915_active_types.h
  HDRTEST drivers/gpu/drm/xe/compat-i915-headers/i915_utils.h
  CC [M]  drivers/gpu/drm/nouveau/nvkm/subdev/i2c/pad.o
  CC [M]  drivers/gpu/drm/nouveau/nvkm/subdev/i2c/padnv04.o
  CC [M]  drivers/gpu/drm/nouveau/nvkm/subdev/i2c/padnv4e.o
  HDRTEST drivers/gpu/drm/i915/display/intel_display_power_map.h
  HDRTEST drivers/gpu/drm/i915/display/intel_ddi_buf_trans.h
  CC [M]  drivers/gpu/drm/nouveau/nvkm/subdev/i2c/padnv50.o
  HDRTEST drivers/gpu/drm/i915/display/icl_dsi.h
  CC [M]  drivers/gpu/drm/amd/amdgpu/gfx_v9_4_3.o
  HDRTEST drivers/gpu/drm/i915/display/intel_lspcon.h
  HDRTEST drivers/gpu/drm/i915/display/intel_dpio_phy.h
  CC [M]  drivers/gpu/drm/nouveau/nvkm/subdev/i2c/padg94.o
  HDRTEST drivers/gpu/drm/i915/display/intel_dp_hdcp.h
  HDRTEST drivers/gpu/drm/xe/compat-i915-headers/i915_config.h
  HDRTEST drivers/gpu/drm/xe/compat-i915-headers/i915_vma.h
  CC [M]  drivers/gpu/drm/nouveau/nvkm/subdev/i2c/padgf119.o
  CC [M]  drivers/gpu/drm/amd/amdgpu/gfx_v10_0.o
  CC [M]  drivers/gpu/drm/nouveau/nvkm/subdev/i2c/padgm200.o
  HDRTEST drivers/gpu/drm/xe/compat-i915-headers/vlv_sideband.h
  HDRTEST drivers/gpu/drm/i915/display/intel_fb_pin.h
  HDRTEST drivers/gpu/drm/xe/compat-i915-headers/intel_mchbar_regs.h
  CC [M]  drivers/gpu/drm/nouveau/nvkm/subdev/i2c/bus.o
  HDRTEST drivers/gpu/drm/i915/display/intel_pps.h
  HDRTEST drivers/gpu/drm/xe/compat-i915-headers/i915_debugfs.h
  HDRTEST drivers/gpu/drm/i915/display/intel_sprite_uapi.h
  HDRTEST drivers/gpu/drm/i915/gem/i915_gem_ttm.h
  CC [M]  drivers/gpu/drm/amd/amdgpu/imu_v11_0.o
  HDRTEST drivers/gpu/drm/xe/compat-i915-headers/soc/intel_gmch.h
  CC [M]  drivers/gpu/drm/amd/amdgpu/gfx_v11_0.o
  HDRTEST drivers/gpu/drm/xe/compat-i915-headers/i915_vgpu.h
  HDRTEST drivers/gpu/drm/xe/compat-i915-headers/i915_fixed.h
  HDRTEST drivers/gpu/drm/i915/gem/i915_gem_region.h
  HDRTEST drivers/gpu/drm/i915/gem/i915_gem_context_types.h
  HDRTEST drivers/gpu/drm/xe/compat-i915-headers/intel_runtime_pm.h
  HDRTEST drivers/gpu/drm/i915/gem/i915_gem_lmem.h
  CC [M]  drivers/gpu/drm/amd/amdgpu/gfx_v11_0_3.o
  HDRTEST drivers/gpu/drm/i915/gem/i915_gem_mman.h
  HDRTEST drivers/gpu/drm/i915/gem/i915_gem_object_types.h
  HDRTEST drivers/gpu/drm/xe/compat-i915-headers/intel_pm_types.h
  HDRTEST drivers/gpu/drm/i915/gem/i915_gem_context.h
  HDRTEST drivers/gpu/drm/xe/compat-i915-headers/intel_uncore.h
../drivers/gpu/drm/i915/gem/i915_gem_region.h:25: warning: Incorrect use of kernel-doc format:          * process_obj - Process the current object
../drivers/gpu/drm/i915/gem/i915_gem_region.h:35: warning: Function parameter or member 'process_obj' not described in 'i915_gem_apply_to_region_ops'
  HDRTEST drivers/gpu/drm/i915/gem/i915_gem_clflush.h
  CC [M]  drivers/gpu/drm/amd/amdgpu/imu_v11_0_3.o
  CC [M]  drivers/gpu/drm/amd/amdgpu/amdgpu_sdma.o
  HDRTEST drivers/gpu/drm/i915/gem/i915_gem_tiling.h
  HDRTEST drivers/gpu/drm/xe/compat-i915-headers/intel_pci_config.h
  CC [M]  drivers/gpu/drm/amd/amdgpu/sdma_v2_4.o
  HDRTEST drivers/gpu/drm/xe/compat-i915-headers/intel_clock_gating.h
  HDRTEST drivers/gpu/drm/xe/display/ext/i915_irq.h
  HDRTEST drivers/gpu/drm/i915/gem/i915_gem_stolen.h
  CC [M]  drivers/gpu/drm/amd/amdgpu/sdma_v3_0.o
  HDRTEST drivers/gpu/drm/i915/gem/i915_gem_ttm_pm.h
  HDRTEST drivers/gpu/drm/i915/gem/i915_gem_create.h
  HDRTEST drivers/gpu/drm/i915/gem/i915_gem_ttm_move.h
  HDRTEST drivers/gpu/drm/xe/display/ext/intel_pch.h
  HDRTEST drivers/gpu/drm/xe/display/ext/intel_dram.h
  HDRTEST drivers/gpu/drm/i915/gem/i915_gem_ioctls.h
  CC [M]  drivers/gpu/drm/nouveau/nvkm/subdev/i2c/busnv04.o
  HDRTEST drivers/gpu/drm/xe/display/ext/intel_device_info.h
  CC [M]  drivers/gpu/drm/amd/amdgpu/sdma_v4_0.o
  HDRTEST drivers/gpu/drm/xe/regs/xe_reg_defs.h
  HDRTEST drivers/gpu/drm/i915/gem/i915_gem_domain.h
  HDRTEST drivers/gpu/drm/xe/regs/xe_guc_regs.h
  HDRTEST drivers/gpu/drm/xe/regs/xe_gt_regs.h
  HDRTEST drivers/gpu/drm/xe/regs/xe_regs.h
  CC [M]  drivers/gpu/drm/amd/amdgpu/sdma_v4_4.o
  CC [M]  drivers/gpu/drm/nouveau/nvkm/subdev/i2c/busnv4e.o
  CC [M]  drivers/gpu/drm/amd/amdgpu/sdma_v4_4_2.o
  HDRTEST drivers/gpu/drm/i915/gem/i915_gem_internal.h
  HDRTEST drivers/gpu/drm/xe/regs/xe_gpu_commands.h
  CC [M]  drivers/gpu/drm/amd/amdgpu/sdma_v5_0.o
  HDRTEST drivers/gpu/drm/xe/regs/xe_lrc_layout.h
  CC [M]  drivers/gpu/drm/nouveau/nvkm/subdev/i2c/busnv50.o
  CC [M]  drivers/gpu/drm/amd/amdgpu/sdma_v5_2.o
  HDRTEST drivers/gpu/drm/xe/regs/xe_engine_regs.h
  CC [M]  drivers/gpu/drm/nouveau/nvkm/subdev/i2c/busgf119.o
  CC [M]  drivers/gpu/drm/amd/amdgpu/sdma_v6_0.o
  HDRTEST drivers/gpu/drm/xe/tests/xe_test.h
  HDRTEST drivers/gpu/drm/i915/gem/i915_gem_dmabuf.h
  HDRTEST drivers/gpu/drm/xe/tests/xe_pci_test.h
  HDRTEST drivers/gpu/drm/i915/gem/selftests/mock_context.h
  HDRTEST drivers/gpu/drm/xe/tests/xe_migrate_test.h
  HDRTEST drivers/gpu/drm/i915/gem/selftests/huge_gem_object.h
  HDRTEST drivers/gpu/drm/xe/tests/xe_dma_buf_test.h
  HDRTEST drivers/gpu/drm/xe/tests/xe_bo_test.h
  HDRTEST drivers/gpu/drm/i915/gem/selftests/mock_gem_object.h
  HDRTEST drivers/gpu/drm/xe/xe_bb.h
  HDRTEST drivers/gpu/drm/xe/xe_bb_types.h
  HDRTEST drivers/gpu/drm/xe/xe_bo.h
  CC [M]  drivers/gpu/drm/amd/amdgpu/amdgpu_mes.o
  CC [M]  drivers/gpu/drm/nouveau/nvkm/subdev/i2c/bit.o
  HDRTEST drivers/gpu/drm/i915/gem/selftests/mock_dmabuf.h
  HDRTEST drivers/gpu/drm/i915/gem/selftests/igt_gem_utils.h
  HDRTEST drivers/gpu/drm/i915/gem/i915_gem_userptr.h
  CC [M]  drivers/gpu/drm/amd/amdgpu/mes_v10_1.o
  CC [M]  drivers/gpu/drm/nouveau/nvkm/subdev/i2c/aux.o
../drivers/gpu/drm/i915/gem/i915_gem_ttm.h:50: warning: Function parameter or member 'bo' not described in 'i915_ttm_to_gem'
  HDRTEST drivers/gpu/drm/i915/gem/i915_gem_pm.h
  HDRTEST drivers/gpu/drm/i915/gem/i915_gem_shrinker.h
  CC [M]  drivers/gpu/drm/amd/amdgpu/mes_v11_0.o
  HDRTEST drivers/gpu/drm/xe/xe_bo_doc.h
  HDRTEST drivers/gpu/drm/i915/gem/i915_gemfs.h
  HDRTEST drivers/gpu/drm/xe/xe_bo_evict.h
  HDRTEST drivers/gpu/drm/i915/gem/i915_gem_object.h
  HDRTEST drivers/gpu/drm/i915/gt/intel_timeline_types.h
  HDRTEST drivers/gpu/drm/i915/gt/selftest_engine.h
  HDRTEST drivers/gpu/drm/xe/xe_bo_types.h
  CC [M]  drivers/gpu/drm/nouveau/nvkm/subdev/i2c/auxg94.o
  HDRTEST drivers/gpu/drm/i915/gt/intel_breadcrumbs.h
  HDRTEST drivers/gpu/drm/xe/xe_debugfs.h
  HDRTEST drivers/gpu/drm/i915/gt/intel_engine_heartbeat.h
  HDRTEST drivers/gpu/drm/xe/xe_devcoredump.h
  HDRTEST drivers/gpu/drm/xe/xe_devcoredump_types.h
  HDRTEST drivers/gpu/drm/i915/gt/intel_context_types.h
  HDRTEST drivers/gpu/drm/i915/gt/intel_execlists_submission.h
  HDRTEST drivers/gpu/drm/i915/gt/intel_gt_pm.h
  HDRTEST drivers/gpu/drm/xe/xe_device.h
  CC [M]  drivers/gpu/drm/amd/amdgpu/amdgpu_uvd.o
  HDRTEST drivers/gpu/drm/i915/gt/selftest_rc6.h
  HDRTEST drivers/gpu/drm/i915/gt/intel_llc_types.h
  CC [M]  drivers/gpu/drm/amd/amdgpu/uvd_v5_0.o
  CC [M]  drivers/gpu/drm/nouveau/nvkm/subdev/i2c/auxgf119.o
  CC [M]  drivers/gpu/drm/nouveau/nvkm/subdev/i2c/auxgm200.o
  HDRTEST drivers/gpu/drm/i915/gt/intel_gt.h
  CC [M]  drivers/gpu/drm/nouveau/nvkm/subdev/i2c/anx9805.o
  HDRTEST drivers/gpu/drm/i915/gt/intel_region_lmem.h
  HDRTEST drivers/gpu/drm/xe/xe_device_types.h
  CC [M]  drivers/gpu/drm/amd/amdgpu/uvd_v6_0.o
  HDRTEST drivers/gpu/drm/i915/gt/intel_gt_requests.h
  CC [M]  drivers/gpu/drm/amd/amdgpu/uvd_v7_0.o
  CC [M]  drivers/gpu/drm/nouveau/nvkm/subdev/iccsense/base.o
  HDRTEST drivers/gpu/drm/i915/gt/intel_ggtt_gmch.h
  HDRTEST drivers/gpu/drm/xe/xe_display.h
  CC [M]  drivers/gpu/drm/amd/amdgpu/amdgpu_vce.o
  HDRTEST drivers/gpu/drm/xe/xe_dma_buf.h
  HDRTEST drivers/gpu/drm/xe/xe_drv.h
  HDRTEST drivers/gpu/drm/i915/gt/intel_gt_print.h
  HDRTEST drivers/gpu/drm/i915/gt/intel_gt_mcr.h
  HDRTEST drivers/gpu/drm/i915/gt/gen8_ppgtt.h
  HDRTEST drivers/gpu/drm/xe/xe_engine.h
  HDRTEST drivers/gpu/drm/i915/gt/intel_timeline.h
  HDRTEST drivers/gpu/drm/i915/gt/gen6_engine_cs.h
  CC [M]  drivers/gpu/drm/amd/amdgpu/vce_v3_0.o
  HDRTEST drivers/gpu/drm/xe/xe_engine_types.h
  CC [M]  drivers/gpu/drm/amd/amdgpu/vce_v4_0.o
  HDRTEST drivers/gpu/drm/i915/gt/intel_gt_pm_debugfs.h
  HDRTEST drivers/gpu/drm/i915/gt/intel_workarounds_types.h
  CC [M]  drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.o
  HDRTEST drivers/gpu/drm/i915/gt/selftest_rps.h
  CC [M]  drivers/gpu/drm/nouveau/nvkm/subdev/iccsense/gf100.o
  HDRTEST drivers/gpu/drm/i915/gt/intel_sa_media.h
  HDRTEST drivers/gpu/drm/i915/gt/intel_gt_debugfs.h
  HDRTEST drivers/gpu/drm/xe/xe_exec.h
../drivers/gpu/drm/i915/gem/i915_gem_object.h:94: warning: Function parameter or member 'file' not described in 'i915_gem_object_lookup_rcu'
../drivers/gpu/drm/i915/gem/i915_gem_object.h:94: warning: Excess function parameter 'filp' description in 'i915_gem_object_lookup_rcu'
  HDRTEST drivers/gpu/drm/xe/xe_execlist.h
  HDRTEST drivers/gpu/drm/i915/gt/intel_gt_clock_utils.h
  HDRTEST drivers/gpu/drm/i915/gt/intel_rps_types.h
  HDRTEST drivers/gpu/drm/i915/gt/selftest_engine_heartbeat.h
  HDRTEST drivers/gpu/drm/i915/gt/sysfs_engines.h
  HDRTEST drivers/gpu/drm/xe/xe_execlist_types.h
  HDRTEST drivers/gpu/drm/i915/gt/gen7_renderclear.h
  HDRTEST drivers/gpu/drm/i915/gt/intel_context.h
  HDRTEST drivers/gpu/drm/i915/gt/intel_wopcm.h
  CC [M]  drivers/gpu/drm/nouveau/nvkm/subdev/instmem/base.o
  HDRTEST drivers/gpu/drm/i915/gt/intel_mocs.h
  HDRTEST drivers/gpu/drm/i915/gt/intel_engine_pm.h
  HDRTEST drivers/gpu/drm/xe/xe_force_wake.h
  HDRTEST drivers/gpu/drm/xe/xe_force_wake_types.h
  HDRTEST drivers/gpu/drm/i915/gt/intel_gt_sysfs.h
  HDRTEST drivers/gpu/drm/i915/gt/intel_rc6.h
  CC [M]  drivers/gpu/drm/amd/amdgpu/vcn_sw_ring.o
  HDRTEST drivers/gpu/drm/xe/xe_ggtt.h
  HDRTEST drivers/gpu/drm/i915/gt/intel_ring_types.h
  HDRTEST drivers/gpu/drm/xe/xe_ggtt_types.h
  HDRTEST drivers/gpu/drm/xe/xe_gt.h
  HDRTEST drivers/gpu/drm/i915/gt/intel_workarounds.h
  HDRTEST drivers/gpu/drm/xe/xe_gt_clock.h
  HDRTEST drivers/gpu/drm/i915/gt/intel_engine_regs.h
  HDRTEST drivers/gpu/drm/xe/xe_gt_debugfs.h
  HDRTEST drivers/gpu/drm/xe/xe_gt_idle_sysfs.h
  CC [M]  drivers/gpu/drm/nouveau/nvkm/subdev/instmem/nv04.o
  HDRTEST drivers/gpu/drm/xe/xe_gt_idle_sysfs_types.h
  HDRTEST drivers/gpu/drm/xe/xe_gt_mcr.h
  CC [M]  drivers/gpu/drm/amd/amdgpu/vcn_v1_0.o
  HDRTEST drivers/gpu/drm/xe/xe_gt_pagefault.h
  HDRTEST drivers/gpu/drm/i915/gt/intel_gt_pm_irq.h
  CC [M]  drivers/gpu/drm/amd/amdgpu/vcn_v2_0.o
  CC [M]  drivers/gpu/drm/nouveau/nvkm/subdev/instmem/nv40.o
  HDRTEST drivers/gpu/drm/xe/xe_gt_printk.h
  CC [M]  drivers/gpu/drm/nouveau/nvkm/subdev/instmem/nv50.o
  CC [M]  drivers/gpu/drm/amd/amdgpu/vcn_v2_5.o
  HDRTEST drivers/gpu/drm/xe/xe_gt_sysfs.h
  HDRTEST drivers/gpu/drm/i915/gt/shmem_utils.h
  HDRTEST drivers/gpu/drm/i915/gt/intel_engine.h
  CC [M]  drivers/gpu/drm/nouveau/nvkm/subdev/instmem/gk20a.o
  HDRTEST drivers/gpu/drm/i915/gt/intel_reset_types.h
  HDRTEST drivers/gpu/drm/i915/gt/intel_gt_regs.h
  HDRTEST drivers/gpu/drm/i915/gt/intel_reset.h
  HDRTEST drivers/gpu/drm/i915/gt/uc/intel_gsc_fw.h
  CC [M]  drivers/gpu/drm/nouveau/nvkm/subdev/ltc/base.o
  HDRTEST drivers/gpu/drm/i915/gt/uc/guc_capture_fwif.h
  HDRTEST drivers/gpu/drm/xe/xe_gt_sysfs_types.h
  HDRTEST drivers/gpu/drm/xe/xe_gt_tlb_invalidation.h
  HDRTEST drivers/gpu/drm/i915/gt/uc/intel_uc.h
  CC [M]  drivers/gpu/drm/amd/amdgpu/vcn_v3_0.o
  HDRTEST drivers/gpu/drm/i915/gt/uc/intel_uc_fw_abi.h
  HDRTEST drivers/gpu/drm/i915/gt/uc/intel_guc_print.h
  CC [M]  drivers/gpu/drm/amd/amdgpu/vcn_v4_0.o
  HDRTEST drivers/gpu/drm/i915/gt/uc/intel_guc_fw.h
  HDRTEST drivers/gpu/drm/xe/xe_gt_tlb_invalidation_types.h
  CC [M]  drivers/gpu/drm/nouveau/nvkm/subdev/ltc/gf100.o
  CC [M]  drivers/gpu/drm/nouveau/nvkm/subdev/ltc/gk104.o
  HDRTEST drivers/gpu/drm/i915/gt/uc/intel_guc_debugfs.h
  HDRTEST drivers/gpu/drm/i915/gt/uc/abi/guc_klvs_abi.h
  CC [M]  drivers/gpu/drm/amd/amdgpu/amdgpu_jpeg.o
  HDRTEST drivers/gpu/drm/xe/xe_gt_topology.h
  HDRTEST drivers/gpu/drm/xe/xe_gt_types.h
  HDRTEST drivers/gpu/drm/xe/xe_guc.h
  HDRTEST drivers/gpu/drm/i915/gt/uc/abi/guc_errors_abi.h
  HDRTEST drivers/gpu/drm/i915/gt/uc/abi/guc_actions_slpc_abi.h
  HDRTEST drivers/gpu/drm/xe/xe_guc_ads.h
  HDRTEST drivers/gpu/drm/i915/gt/uc/abi/guc_communication_mmio_abi.h
  HDRTEST drivers/gpu/drm/xe/xe_guc_ads_types.h
  CC [M]  drivers/gpu/drm/amd/amdgpu/jpeg_v1_0.o
  HDRTEST drivers/gpu/drm/i915/gt/uc/abi/guc_actions_abi.h
  HDRTEST drivers/gpu/drm/xe/xe_guc_ct.h
  CC [M]  drivers/gpu/drm/amd/amdgpu/jpeg_v2_0.o
  HDRTEST drivers/gpu/drm/xe/xe_guc_ct_types.h
  HDRTEST drivers/gpu/drm/xe/xe_guc_debugfs.h
  HDRTEST drivers/gpu/drm/i915/gt/uc/abi/guc_communication_ctb_abi.h
  HDRTEST drivers/gpu/drm/i915/gt/uc/abi/guc_messages_abi.h
  HDRTEST drivers/gpu/drm/xe/xe_guc_engine_types.h
  CC [M]  drivers/gpu/drm/amd/amdgpu/jpeg_v2_5.o
  HDRTEST drivers/gpu/drm/i915/gt/uc/intel_gsc_uc_heci_cmd_submit.h
  HDRTEST drivers/gpu/drm/i915/gt/uc/intel_guc_reg.h
  CC [M]  drivers/gpu/drm/amd/amdgpu/jpeg_v3_0.o
../drivers/gpu/drm/i915/gt/intel_context.h:108: warning: Function parameter or member 'ce' not described in 'intel_context_lock_pinned'
../drivers/gpu/drm/i915/gt/intel_context.h:123: warning: Function parameter or member 'ce' not described in 'intel_context_is_pinned'
../drivers/gpu/drm/i915/gt/intel_context.h:142: warning: Function parameter or member 'ce' not described in 'intel_context_unlock_pinned'
  CC [M]  drivers/gpu/drm/nouveau/nvkm/subdev/ltc/gm107.o
  CC [M]  drivers/gpu/drm/nouveau/nvkm/subdev/ltc/gm200.o
  HDRTEST drivers/gpu/drm/i915/gt/uc/intel_gsc_uc.h
  HDRTEST drivers/gpu/drm/i915/gt/uc/intel_huc.h
  CC [M]  drivers/gpu/drm/nouveau/nvkm/subdev/ltc/gp100.o
  HDRTEST drivers/gpu/drm/i915/gt/uc/intel_guc.h
  CC [M]  drivers/gpu/drm/nouveau/nvkm/subdev/ltc/gp102.o
  HDRTEST drivers/gpu/drm/xe/xe_guc_fwif.h
  HDRTEST drivers/gpu/drm/xe/xe_guc_hwconfig.h
  HDRTEST drivers/gpu/drm/i915/gt/uc/intel_huc_fw.h
  HDRTEST drivers/gpu/drm/i915/gt/uc/intel_guc_fwif.h
  HDRTEST drivers/gpu/drm/xe/xe_guc_log.h
  CC [M]  drivers/gpu/drm/nouveau/nvkm/subdev/ltc/gp10b.o
  HDRTEST drivers/gpu/drm/xe/xe_guc_log_types.h
  CC [M]  drivers/gpu/drm/nouveau/nvkm/subdev/ltc/ga102.o
  HDRTEST drivers/gpu/drm/xe/xe_guc_pc.h
  HDRTEST drivers/gpu/drm/i915/gt/uc/intel_guc_capture.h
  CC [M]  drivers/gpu/drm/amd/amdgpu/jpeg_v4_0.o
  HDRTEST drivers/gpu/drm/i915/gt/uc/intel_guc_log_debugfs.h
  HDRTEST drivers/gpu/drm/i915/gt/uc/intel_guc_submission.h
  HDRTEST drivers/gpu/drm/i915/gt/uc/intel_guc_slpc_types.h
../drivers/gpu/drm/i915/gt/uc/guc_capture_fwif.h:27: warning: Function parameter or member 'size' not described in '__guc_capture_bufstate'
../drivers/gpu/drm/i915/gt/uc/guc_capture_fwif.h:27: warning: Function parameter or member 'data' not described in '__guc_capture_bufstate'
../drivers/gpu/drm/i915/gt/uc/guc_capture_fwif.h:27: warning: Function parameter or member 'rd' not described in '__guc_capture_bufstate'
../drivers/gpu/drm/i915/gt/uc/guc_capture_fwif.h:27: warning: Function parameter or member 'wr' not described in '__guc_capture_bufstate'
../drivers/gpu/drm/i915/gt/uc/guc_capture_fwif.h:59: warning: Function parameter or member 'link' not described in '__guc_capture_parsed_output'
../drivers/gpu/drm/i915/gt/uc/guc_capture_fwif.h:59: warning: Function parameter or member 'is_partial' not described in '__guc_capture_parsed_output'
../drivers/gpu/drm/i915/gt/uc/guc_capture_fwif.h:59: warning: Function parameter or member 'eng_class' not described in '__guc_capture_parsed_output'
../drivers/gpu/drm/i915/gt/uc/guc_capture_fwif.h:59: warning: Function parameter or member 'eng_inst' not described in '__guc_capture_parsed_output'
../drivers/gpu/drm/i915/gt/uc/guc_capture_fwif.h:59: warning: Function parameter or member 'guc_id' not described in '__guc_capture_parsed_output'
../drivers/gpu/drm/i915/gt/uc/guc_capture_fwif.h:59: warning: Function parameter or member 'lrca' not described in '__guc_capture_parsed_output'
../drivers/gpu/drm/i915/gt/uc/guc_capture_fwif.h:59: warning: Function parameter or member 'reginfo' not described in '__guc_capture_parsed_output'
../drivers/gpu/drm/i915/gt/uc/guc_capture_fwif.h:62: warning: wrong kernel-doc identifier on line:
 * struct guc_debug_capture_list_header / struct guc_debug_capture_list
../drivers/gpu/drm/i915/gt/uc/guc_capture_fwif.h:80: warning: wrong kernel-doc identifier on line:
 * struct __guc_mmio_reg_descr / struct __guc_mmio_reg_descr_group
../drivers/gpu/drm/i915/gt/uc/guc_capture_fwif.h:105: warning: wrong kernel-doc identifier on line:
 * struct guc_state_capture_header_t / struct guc_state_capture_t /
../drivers/gpu/drm/i915/gt/uc/guc_capture_fwif.h:163: warning: Function parameter or member 'is_valid' not described in '__guc_capture_ads_cache'
../drivers/gpu/drm/i915/gt/uc/guc_capture_fwif.h:163: warning: Function parameter or member 'ptr' not described in '__guc_capture_ads_cache'
../drivers/gpu/drm/i915/gt/uc/guc_capture_fwif.h:163: warning: Function parameter or member 'size' not described in '__guc_capture_ads_cache'
../drivers/gpu/drm/i915/gt/uc/guc_capture_fwif.h:163: warning: Function parameter or member 'status' not described in '__guc_capture_ads_cache'
../drivers/gpu/drm/i915/gt/uc/guc_capture_fwif.h:216: warning: Function parameter or member 'ads_null_cache' not described in 'intel_guc_state_capture'
../drivers/gpu/drm/i915/gt/uc/guc_capture_fwif.h:216: warning: Function parameter or member 'max_mmio_per_node' not described in 'intel_guc_state_capture'
  HDRTEST drivers/gpu/drm/xe/xe_guc_pc_types.h
  HDRTEST drivers/gpu/drm/i915/gt/uc/intel_guc_log.h
  HDRTEST drivers/gpu/drm/i915/gt/uc/intel_guc_ct.h
  HDRTEST drivers/gpu/drm/xe/xe_guc_submit.h
  HDRTEST drivers/gpu/drm/xe/xe_guc_submit_types.h
  CC [M]  drivers/gpu/drm/amd/amdgpu/athub_v1_0.o
  CC [M]  drivers/gpu/drm/nouveau/nvkm/subdev/mc/base.o
  HDRTEST drivers/gpu/drm/xe/xe_guc_types.h
  HDRTEST drivers/gpu/drm/i915/gt/uc/intel_guc_slpc.h
  HDRTEST drivers/gpu/drm/xe/xe_huc.h
  HDRTEST drivers/gpu/drm/xe/xe_huc_debugfs.h
  HDRTEST drivers/gpu/drm/i915/gt/uc/intel_uc_fw.h
  HDRTEST drivers/gpu/drm/xe/xe_huc_types.h
  HDRTEST drivers/gpu/drm/xe/xe_hw_engine.h
  HDRTEST drivers/gpu/drm/i915/gt/uc/intel_guc_ads.h
  CC [M]  drivers/gpu/drm/amd/amdgpu/athub_v2_0.o
  HDRTEST drivers/gpu/drm/xe/xe_hw_engine_types.h
  CC [M]  drivers/gpu/drm/amd/amdgpu/athub_v2_1.o
  HDRTEST drivers/gpu/drm/i915/gt/uc/intel_uc_debugfs.h
  HDRTEST drivers/gpu/drm/xe/xe_hw_fence.h
  HDRTEST drivers/gpu/drm/i915/gt/uc/intel_guc_rc.h
  CC [M]  drivers/gpu/drm/nouveau/nvkm/subdev/mc/nv04.o
  HDRTEST drivers/gpu/drm/i915/gt/uc/intel_huc_debugfs.h
  HDRTEST drivers/gpu/drm/i915/gt/intel_hwconfig.h
  CC [M]  drivers/gpu/drm/nouveau/nvkm/subdev/mc/nv11.o
  CC [M]  drivers/gpu/drm/nouveau/nvkm/subdev/mc/nv17.o
  HDRTEST drivers/gpu/drm/i915/gt/intel_llc.h
  HDRTEST drivers/gpu/drm/i915/gt/gen8_engine_cs.h
  HDRTEST drivers/gpu/drm/i915/gt/intel_sseu_debugfs.h
  HDRTEST drivers/gpu/drm/i915/gt/intel_rc6_types.h
  CC [M]  drivers/gpu/drm/nouveau/nvkm/subdev/mc/nv44.o
  HDRTEST drivers/gpu/drm/xe/xe_hw_fence_types.h
  CC [M]  drivers/gpu/drm/nouveau/nvkm/subdev/mc/nv50.o
  HDRTEST drivers/gpu/drm/i915/gt/intel_context_param.h
  HDRTEST drivers/gpu/drm/i915/gt/intel_gpu_commands.h
  HDRTEST drivers/gpu/drm/i915/gt/intel_engine_user.h
  HDRTEST drivers/gpu/drm/xe/xe_irq.h
  CC [M]  drivers/gpu/drm/amd/amdgpu/athub_v3_0.o
  HDRTEST drivers/gpu/drm/xe/xe_lrc.h
  CC [M]  drivers/gpu/drm/amd/amdgpu/smuio_v9_0.o
  HDRTEST drivers/gpu/drm/xe/xe_lrc_types.h
  HDRTEST drivers/gpu/drm/i915/gt/intel_gt_irq.h
  HDRTEST drivers/gpu/drm/i915/gt/intel_gsc.h
  CC [M]  drivers/gpu/drm/nouveau/nvkm/subdev/mc/g84.o
../drivers/gpu/drm/i915/gt/uc/intel_guc_fwif.h:491: warning: Function parameter or member 'marker' not described in 'guc_log_buffer_state'
../drivers/gpu/drm/i915/gt/uc/intel_guc_fwif.h:491: warning: Function parameter or member 'read_ptr' not described in 'guc_log_buffer_state'
../drivers/gpu/drm/i915/gt/uc/intel_guc_fwif.h:491: warning: Function parameter or member 'write_ptr' not described in 'guc_log_buffer_state'
../drivers/gpu/drm/i915/gt/uc/intel_guc_fwif.h:491: warning: Function parameter or member 'size' not described in 'guc_log_buffer_state'
../drivers/gpu/drm/i915/gt/uc/intel_guc_fwif.h:491: warning: Function parameter or member 'sampled_write_ptr' not described in 'guc_log_buffer_state'
../drivers/gpu/drm/i915/gt/uc/intel_guc_fwif.h:491: warning: Function parameter or member 'wrap_offset' not described in 'guc_log_buffer_state'
../drivers/gpu/drm/i915/gt/uc/intel_guc_fwif.h:491: warning: Function parameter or member 'flush_to_file' not described in 'guc_log_buffer_state'
../drivers/gpu/drm/i915/gt/uc/intel_guc_fwif.h:491: warning: Function parameter or member 'buffer_full_cnt' not described in 'guc_log_buffer_state'
../drivers/gpu/drm/i915/gt/uc/intel_guc_fwif.h:491: warning: Function parameter or member 'reserved' not described in 'guc_log_buffer_state'
../drivers/gpu/drm/i915/gt/uc/intel_guc_fwif.h:491: warning: Function parameter or member 'flags' not described in 'guc_log_buffer_state'
../drivers/gpu/drm/i915/gt/uc/intel_guc_fwif.h:491: warning: Function parameter or member 'version' not described in 'guc_log_buffer_state'
  HDRTEST drivers/gpu/drm/i915/gt/intel_rps.h
  HDRTEST drivers/gpu/drm/i915/gt/selftest_llc.h
../drivers/gpu/drm/i915/gt/uc/intel_guc.h:274: warning: Function parameter or member 'dbgfs_node' not described in 'intel_guc'
  HDRTEST drivers/gpu/drm/i915/gt/gen6_ppgtt.h
  CC [M]  drivers/gpu/drm/nouveau/nvkm/subdev/mc/g98.o
  CC [M]  drivers/gpu/drm/nouveau/nvkm/subdev/mc/gt215.o
  HDRTEST drivers/gpu/drm/i915/gt/intel_ggtt_fencing.h
  CC [M]  drivers/gpu/drm/nouveau/nvkm/subdev/mc/gf100.o
  HDRTEST drivers/gpu/drm/i915/gt/intel_migrate_types.h
  HDRTEST drivers/gpu/drm/xe/xe_macros.h
  HDRTEST drivers/gpu/drm/i915/gt/selftests/mock_timeline.h
  HDRTEST drivers/gpu/drm/xe/xe_map.h
  HDRTEST drivers/gpu/drm/i915/gt/intel_lrc.h
  HDRTEST drivers/gpu/drm/i915/gt/intel_lrc_reg.h
  HDRTEST drivers/gpu/drm/i915/gt/intel_migrate.h
  HDRTEST drivers/gpu/drm/xe/xe_migrate.h
  HDRTEST drivers/gpu/drm/i915/gt/intel_gt_sysfs_pm.h
  HDRTEST drivers/gpu/drm/i915/gt/intel_breadcrumbs_types.h
  HDRTEST drivers/gpu/drm/i915/gt/intel_gt_buffer_pool.h
  HDRTEST drivers/gpu/drm/i915/gt/mock_engine.h
  CC [M]  drivers/gpu/drm/nouveau/nvkm/subdev/mc/gk104.o
  HDRTEST drivers/gpu/drm/i915/gt/intel_engine_stats.h
  CC [M]  drivers/gpu/drm/amd/amdgpu/smuio_v11_0.o
  HDRTEST drivers/gpu/drm/i915/gt/intel_gtt.h
  CC [M]  drivers/gpu/drm/nouveau/nvkm/subdev/mc/gk20a.o
  HDRTEST drivers/gpu/drm/i915/gt/intel_gt_buffer_pool_types.h
  CC [M]  drivers/gpu/drm/amd/amdgpu/smuio_v11_0_6.o
  CC [M]  drivers/gpu/drm/amd/amdgpu/smuio_v13_0.o
  CC [M]  drivers/gpu/drm/amd/amdgpu/smuio_v13_0_6.o
  CC [M]  drivers/gpu/drm/nouveau/nvkm/subdev/mc/gp100.o
  CC [M]  drivers/gpu/drm/amd/amdgpu/amdgpu_reset.o
  HDRTEST drivers/gpu/drm/i915/gt/intel_ring.h
  CC [M]  drivers/gpu/drm/amd/amdgpu/mca_v3_0.o
  HDRTEST drivers/gpu/drm/i915/gt/intel_gt_types.h
  CC [M]  drivers/gpu/drm/nouveau/nvkm/subdev/mc/gp10b.o
  HDRTEST drivers/gpu/drm/i915/gt/intel_renderstate.h
  CC [M]  drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.o
  CC [M]  drivers/gpu/drm/amd/amdgpu/../amdkfd/kfd_module.o
  HDRTEST drivers/gpu/drm/i915/gt/intel_sseu.h
  CC [M]  drivers/gpu/drm/nouveau/nvkm/subdev/mc/ga100.o
  CC [M]  drivers/gpu/drm/amd/amdgpu/../amdkfd/kfd_device.o
  CC [M]  drivers/gpu/drm/nouveau/nvkm/subdev/mmu/base.o
  CC [M]  drivers/gpu/drm/nouveau/nvkm/subdev/mmu/nv04.o
  HDRTEST drivers/gpu/drm/xe/xe_migrate_doc.h
  HDRTEST drivers/gpu/drm/xe/xe_mmio.h
  HDRTEST drivers/gpu/drm/i915/gt/intel_engine_types.h
  CC [M]  drivers/gpu/drm/nouveau/nvkm/subdev/mmu/nv41.o
  CC [M]  drivers/gpu/drm/nouveau/nvkm/subdev/mmu/nv44.o
  HDRTEST drivers/gpu/drm/xe/xe_mocs.h
  CC [M]  drivers/gpu/drm/nouveau/nvkm/subdev/mmu/nv50.o
  CC [M]  drivers/gpu/drm/amd/amdgpu/../amdkfd/kfd_chardev.o
  HDRTEST drivers/gpu/drm/i915/gt/intel_gt_engines_debugfs.h
  HDRTEST drivers/gpu/drm/i915/gt/gen2_engine_cs.h
  CC [M]  drivers/gpu/drm/nouveau/nvkm/subdev/mmu/g84.o
  HDRTEST drivers/gpu/drm/xe/xe_module.h
  CC [M]  drivers/gpu/drm/nouveau/nvkm/subdev/mmu/mcp77.o
  CC [M]  drivers/gpu/drm/amd/amdgpu/../amdkfd/kfd_topology.o
  CC [M]  drivers/gpu/drm/amd/amdgpu/../amdkfd/kfd_pasid.o
  HDRTEST drivers/gpu/drm/i915/gvt/gvt.h
  HDRTEST drivers/gpu/drm/i915/gvt/trace.h
  CC [M]  drivers/gpu/drm/amd/amdgpu/../amdkfd/kfd_doorbell.o
  HDRTEST drivers/gpu/drm/i915/gvt/debug.h
  HDRTEST drivers/gpu/drm/i915/gvt/edid.h
  HDRTEST drivers/gpu/drm/i915/gvt/page_track.h
  CC [M]  drivers/gpu/drm/nouveau/nvkm/subdev/mmu/gf100.o
  HDRTEST drivers/gpu/drm/i915/gvt/mmio.h
  HDRTEST drivers/gpu/drm/i915/gvt/sched_policy.h
../drivers/gpu/drm/i915/gt/intel_gtt.h:515: warning: Function parameter or member 'vm' not described in 'i915_vm_resv_put'
../drivers/gpu/drm/i915/gt/intel_gtt.h:515: warning: Excess function parameter 'resv' description in 'i915_vm_resv_put'
  HDRTEST drivers/gpu/drm/i915/gvt/fb_decoder.h
  HDRTEST drivers/gpu/drm/i915/gvt/cmd_parser.h
  HDRTEST drivers/gpu/drm/i915/gvt/dmabuf.h
  HDRTEST drivers/gpu/drm/i915/gvt/mmio_context.h
  CC [M]  drivers/gpu/drm/nouveau/nvkm/subdev/mmu/gk104.o
  HDRTEST drivers/gpu/drm/xe/xe_pat.h
  HDRTEST drivers/gpu/drm/i915/gvt/display.h
  HDRTEST drivers/gpu/drm/i915/gvt/gtt.h
  HDRTEST drivers/gpu/drm/xe/xe_pci.h
  CC [M]  drivers/gpu/drm/nouveau/nvkm/subdev/mmu/gk20a.o
  CC [M]  drivers/gpu/drm/nouveau/nvkm/subdev/mmu/gm200.o
  HDRTEST drivers/gpu/drm/i915/gvt/scheduler.h
  HDRTEST drivers/gpu/drm/i915/gvt/reg.h
  HDRTEST drivers/gpu/drm/i915/gvt/execlist.h
  CC [M]  drivers/gpu/drm/amd/amdgpu/../amdkfd/kfd_flat_memory.o
  HDRTEST drivers/gpu/drm/i915/gvt/interrupt.h
  CC [M]  drivers/gpu/drm/nouveau/nvkm/subdev/mmu/gm20b.o
  HDRTEST drivers/gpu/drm/i915/i915_active.h
  HDRTEST drivers/gpu/drm/i915/i915_active_types.h
  HDRTEST drivers/gpu/drm/i915/i915_cmd_parser.h
  CC [M]  drivers/gpu/drm/amd/amdgpu/../amdkfd/kfd_process.o
  HDRTEST drivers/gpu/drm/i915/i915_config.h
  HDRTEST drivers/gpu/drm/i915/i915_debugfs.h
  CC [M]  drivers/gpu/drm/amd/amdgpu/../amdkfd/kfd_queue.o
  HDRTEST drivers/gpu/drm/xe/xe_pci_types.h
  CC [M]  drivers/gpu/drm/amd/amdgpu/../amdkfd/kfd_mqd_manager.o
  CC [M]  drivers/gpu/drm/nouveau/nvkm/subdev/mmu/gp100.o
  HDRTEST drivers/gpu/drm/i915/i915_debugfs_params.h
  HDRTEST drivers/gpu/drm/i915/i915_deps.h
  HDRTEST drivers/gpu/drm/xe/xe_pcode.h
  HDRTEST drivers/gpu/drm/xe/xe_pcode_api.h
  CC [M]  drivers/gpu/drm/nouveau/nvkm/subdev/mmu/gp10b.o
  HDRTEST drivers/gpu/drm/i915/i915_driver.h
  HDRTEST drivers/gpu/drm/i915/i915_drm_client.h
  HDRTEST drivers/gpu/drm/xe/xe_platform_types.h
../drivers/gpu/drm/i915/gt/intel_engine_types.h:293: warning: Function parameter or member 'preempt_hang' not described in 'intel_engine_execlists'
  CC [M]  drivers/gpu/drm/nouveau/nvkm/subdev/mmu/gv100.o
  HDRTEST drivers/gpu/drm/xe/xe_pm.h
  HDRTEST drivers/gpu/drm/i915/i915_drv.h
  HDRTEST drivers/gpu/drm/i915/i915_file_private.h
  HDRTEST drivers/gpu/drm/i915/i915_fixed.h
  HDRTEST drivers/gpu/drm/i915/i915_gem.h
  CC [M]  drivers/gpu/drm/amd/amdgpu/../amdkfd/kfd_mqd_manager_cik.o
  HDRTEST drivers/gpu/drm/i915/i915_gem_evict.h
  CC [M]  drivers/gpu/drm/nouveau/nvkm/subdev/mmu/tu102.o
  HDRTEST drivers/gpu/drm/i915/i915_gem_gtt.h
  CC [M]  drivers/gpu/drm/amd/amdgpu/../amdkfd/kfd_mqd_manager_vi.o
  CC [M]  drivers/gpu/drm/nouveau/nvkm/subdev/mmu/mem.o
  CC [M]  drivers/gpu/drm/amd/amdgpu/../amdkfd/kfd_mqd_manager_v9.o
  CC [M]  drivers/gpu/drm/nouveau/nvkm/subdev/mmu/memnv04.o
  CC [M]  drivers/gpu/drm/nouveau/nvkm/subdev/mmu/memnv50.o
  HDRTEST drivers/gpu/drm/i915/i915_gem_ww.h
  CC [M]  drivers/gpu/drm/nouveau/nvkm/subdev/mmu/memgf100.o
  CC [M]  drivers/gpu/drm/amd/amdgpu/../amdkfd/kfd_mqd_manager_v10.o
  CC [M]  drivers/gpu/drm/nouveau/nvkm/subdev/mmu/vmm.o
  HDRTEST drivers/gpu/drm/i915/i915_getparam.h
  HDRTEST drivers/gpu/drm/i915/i915_gpu_error.h
  HDRTEST drivers/gpu/drm/i915/i915_hwmon.h
  CC [M]  drivers/gpu/drm/nouveau/nvkm/subdev/mmu/vmmnv04.o
  HDRTEST drivers/gpu/drm/i915/i915_ioc32.h
  HDRTEST drivers/gpu/drm/xe/xe_pmu.h
  CC [M]  drivers/gpu/drm/amd/amdgpu/../amdkfd/kfd_mqd_manager_v11.o
  HDRTEST drivers/gpu/drm/xe/xe_pmu_types.h
  CC [M]  drivers/gpu/drm/amd/amdgpu/../amdkfd/kfd_kernel_queue.o
  HDRTEST drivers/gpu/drm/i915/i915_ioctl.h
  HDRTEST drivers/gpu/drm/i915/i915_iosf_mbi.h
  CC [M]  drivers/gpu/drm/amd/amdgpu/../amdkfd/kfd_packet_manager.o
  HDRTEST drivers/gpu/drm/i915/i915_irq.h
  HDRTEST drivers/gpu/drm/i915/i915_memcpy.h
  HDRTEST drivers/gpu/drm/xe/xe_preempt_fence.h
  HDRTEST drivers/gpu/drm/i915/i915_mitigations.h
  CC [M]  drivers/gpu/drm/amd/amdgpu/../amdkfd/kfd_packet_manager_vi.o
  HDRTEST drivers/gpu/drm/xe/xe_preempt_fence_types.h
  CC [M]  drivers/gpu/drm/nouveau/nvkm/subdev/mmu/vmmnv41.o
  HDRTEST drivers/gpu/drm/i915/i915_mm.h
  HDRTEST drivers/gpu/drm/i915/i915_params.h
  CC [M]  drivers/gpu/drm/amd/amdgpu/../amdkfd/kfd_packet_manager_v9.o
../drivers/gpu/drm/i915/i915_active.h:66: warning: Function parameter or member 'active' not described in '__i915_active_fence_init'
../drivers/gpu/drm/i915/i915_active.h:66: warning: Function parameter or member 'fence' not described in '__i915_active_fence_init'
../drivers/gpu/drm/i915/i915_active.h:66: warning: Function parameter or member 'fn' not described in '__i915_active_fence_init'
../drivers/gpu/drm/i915/i915_active.h:89: warning: Function parameter or member 'active' not described in 'i915_active_fence_set'
../drivers/gpu/drm/i915/i915_active.h:89: warning: Function parameter or member 'rq' not described in 'i915_active_fence_set'
../drivers/gpu/drm/i915/i915_active.h:102: warning: Function parameter or member 'active' not described in 'i915_active_fence_get'
../drivers/gpu/drm/i915/i915_active.h:122: warning: Function parameter or member 'active' not described in 'i915_active_fence_isset'
  CC [M]  drivers/gpu/drm/amd/amdgpu/../amdkfd/kfd_process_queue_manager.o
  CC [M]  drivers/gpu/drm/amd/amdgpu/../amdkfd/kfd_device_queue_manager.o
  CC [M]  drivers/gpu/drm/amd/amdgpu/../amdkfd/kfd_device_queue_manager_cik.o
  HDRTEST drivers/gpu/drm/xe/xe_pt.h
  HDRTEST drivers/gpu/drm/xe/xe_pt_types.h
  HDRTEST drivers/gpu/drm/i915/i915_pci.h
  CC [M]  drivers/gpu/drm/nouveau/nvkm/subdev/mmu/vmmnv44.o
  CC [M]  drivers/gpu/drm/nouveau/nvkm/subdev/mmu/vmmnv50.o
  HDRTEST drivers/gpu/drm/xe/xe_pt_walk.h
  CC [M]  drivers/gpu/drm/nouveau/nvkm/subdev/mmu/vmmmcp77.o
  HDRTEST drivers/gpu/drm/i915/i915_perf.h
  HDRTEST drivers/gpu/drm/i915/i915_perf_oa_regs.h
  HDRTEST drivers/gpu/drm/i915/i915_perf_types.h
  HDRTEST drivers/gpu/drm/i915/i915_pmu.h
  HDRTEST drivers/gpu/drm/xe/xe_query.h
  HDRTEST drivers/gpu/drm/xe/xe_reg_sr.h
  HDRTEST drivers/gpu/drm/i915/i915_priolist_types.h
  CC [M]  drivers/gpu/drm/nouveau/nvkm/subdev/mmu/vmmgf100.o
  HDRTEST drivers/gpu/drm/i915/i915_pvinfo.h
  CC [M]  drivers/gpu/drm/amd/amdgpu/../amdkfd/kfd_device_queue_manager_vi.o
  CC [M]  drivers/gpu/drm/nouveau/nvkm/subdev/mmu/vmmgk104.o
  HDRTEST drivers/gpu/drm/i915/i915_query.h
  HDRTEST drivers/gpu/drm/i915/i915_reg.h
  CC [M]  drivers/gpu/drm/amd/amdgpu/../amdkfd/kfd_device_queue_manager_v9.o
  CC [M]  drivers/gpu/drm/nouveau/nvkm/subdev/mmu/vmmgk20a.o
  HDRTEST drivers/gpu/drm/xe/xe_reg_sr_types.h
  HDRTEST drivers/gpu/drm/i915/i915_reg_defs.h
  HDRTEST drivers/gpu/drm/xe/xe_reg_whitelist.h
  HDRTEST drivers/gpu/drm/xe/xe_res_cursor.h
  CC [M]  drivers/gpu/drm/nouveau/nvkm/subdev/mmu/vmmgm200.o
  HDRTEST drivers/gpu/drm/i915/i915_request.h
  HDRTEST drivers/gpu/drm/i915/i915_scatterlist.h
  HDRTEST drivers/gpu/drm/xe/xe_ring_ops.h
  CC [M]  drivers/gpu/drm/nouveau/nvkm/subdev/mmu/vmmgm20b.o
  HDRTEST drivers/gpu/drm/i915/i915_scheduler.h
  HDRTEST drivers/gpu/drm/i915/i915_scheduler_types.h
  CC [M]  drivers/gpu/drm/amd/amdgpu/../amdkfd/kfd_device_queue_manager_v10.o
  HDRTEST drivers/gpu/drm/i915/i915_selftest.h
  HDRTEST drivers/gpu/drm/xe/xe_ring_ops_types.h
  CC [M]  drivers/gpu/drm/amd/amdgpu/../amdkfd/kfd_device_queue_manager_v11.o
  CC [M]  drivers/gpu/drm/amd/amdgpu/../amdkfd/kfd_interrupt.o
  HDRTEST drivers/gpu/drm/i915/i915_suspend.h
  HDRTEST drivers/gpu/drm/i915/i915_sw_fence.h
  CC [M]  drivers/gpu/drm/amd/amdgpu/../amdkfd/kfd_events.o
  HDRTEST drivers/gpu/drm/xe/xe_rtp.h
  HDRTEST drivers/gpu/drm/xe/xe_rtp_types.h
  CC [M]  drivers/gpu/drm/amd/amdgpu/../amdkfd/cik_event_interrupt.o
  HDRTEST drivers/gpu/drm/xe/xe_sa.h
  CC [M]  drivers/gpu/drm/nouveau/nvkm/subdev/mmu/vmmgp100.o
  HDRTEST drivers/gpu/drm/i915/i915_sw_fence_work.h
  CC [M]  drivers/gpu/drm/nouveau/nvkm/subdev/mmu/vmmgp10b.o
  HDRTEST drivers/gpu/drm/xe/xe_sa_types.h
  CC [M]  drivers/gpu/drm/amd/amdgpu/../amdkfd/kfd_int_process_v9.o
  CC [M]  drivers/gpu/drm/amd/amdgpu/../amdkfd/kfd_int_process_v11.o
  HDRTEST drivers/gpu/drm/xe/xe_sched_job.h
  CC [M]  drivers/gpu/drm/nouveau/nvkm/subdev/mmu/vmmgv100.o
  CC [M]  drivers/gpu/drm/nouveau/nvkm/subdev/mmu/vmmtu102.o
../drivers/gpu/drm/i915/i915_pmu.h:21: warning: cannot understand function prototype: 'enum i915_pmu_tracked_events '
../drivers/gpu/drm/i915/i915_pmu.h:32: warning: cannot understand function prototype: 'enum '
../drivers/gpu/drm/i915/i915_pmu.h:41: warning: This comment starts with '/**', but isn't a kernel-doc comment. Refer Documentation/doc-guide/kernel-doc.rst
 * How many different events we track in the global PMU mask.
  HDRTEST drivers/gpu/drm/i915/i915_switcheroo.h
  CC [M]  drivers/gpu/drm/amd/amdgpu/../amdkfd/kfd_smi_events.o
  HDRTEST drivers/gpu/drm/i915/i915_syncmap.h
  CC [M]  drivers/gpu/drm/nouveau/nvkm/subdev/mmu/umem.o
  HDRTEST drivers/gpu/drm/i915/i915_sysfs.h
  CC [M]  drivers/gpu/drm/amd/amdgpu/../amdkfd/kfd_crat.o
  HDRTEST drivers/gpu/drm/xe/xe_sched_job_types.h
  HDRTEST drivers/gpu/drm/i915/i915_tasklet.h
  HDRTEST drivers/gpu/drm/i915/i915_trace.h
  HDRTEST drivers/gpu/drm/i915/i915_ttm_buddy_manager.h
  CC [M]  drivers/gpu/drm/nouveau/nvkm/subdev/mmu/ummu.o
  HDRTEST drivers/gpu/drm/i915/i915_user_extensions.h
  CC [M]  drivers/gpu/drm/amd/amdgpu/../amdkfd/kfd_debugfs.o
  HDRTEST drivers/gpu/drm/i915/i915_utils.h
  HDRTEST drivers/gpu/drm/i915/i915_vgpu.h
../drivers/gpu/drm/i915/i915_scatterlist.h:160: warning: Incorrect use of kernel-doc format:          * release() - Free the memory of the struct i915_refct_sgt
../drivers/gpu/drm/i915/i915_scatterlist.h:164: warning: Function parameter or member 'release' not described in 'i915_refct_sgt_ops'
../drivers/gpu/drm/i915/i915_scatterlist.h:187: warning: Function parameter or member 'rsgt' not described in 'i915_refct_sgt_put'
../drivers/gpu/drm/i915/i915_scatterlist.h:198: warning: Function parameter or member 'rsgt' not described in 'i915_refct_sgt_get'
../drivers/gpu/drm/i915/i915_scatterlist.h:214: warning: Function parameter or member 'rsgt' not described in '__i915_refct_sgt_init'
  HDRTEST drivers/gpu/drm/i915/i915_vma.h
  CC [M]  drivers/gpu/drm/amd/amdgpu/../amdkfd/kfd_svm.o
  CC [M]  drivers/gpu/drm/nouveau/nvkm/subdev/mmu/uvmm.o
  HDRTEST drivers/gpu/drm/i915/i915_vma_resource.h
  CC [M]  drivers/gpu/drm/amd/amdgpu/../amdkfd/kfd_migrate.o
  CC [M]  drivers/gpu/drm/nouveau/nvkm/subdev/mxm/base.o
  CC [M]  drivers/gpu/drm/nouveau/nvkm/subdev/mxm/mxms.o
  HDRTEST drivers/gpu/drm/xe/xe_step.h
  HDRTEST drivers/gpu/drm/i915/i915_vma_types.h
  HDRTEST drivers/gpu/drm/xe/xe_step_types.h
  HDRTEST drivers/gpu/drm/i915/intel_clock_gating.h
  HDRTEST drivers/gpu/drm/xe/xe_sync.h
  HDRTEST drivers/gpu/drm/xe/xe_sync_types.h
../drivers/gpu/drm/i915/i915_request.h:176: warning: This comment starts with '/**', but isn't a kernel-doc comment. Refer Documentation/doc-guide/kernel-doc.rst
 * Request queue structure.
../drivers/gpu/drm/i915/i915_request.h:477: warning: This comment starts with '/**', but isn't a kernel-doc comment. Refer Documentation/doc-guide/kernel-doc.rst
 * Returns true if seq1 is later than seq2.
  CC [M]  drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_fence.o
  HDRTEST drivers/gpu/drm/xe/xe_tile.h
  HDRTEST drivers/gpu/drm/xe/xe_trace.h
  HDRTEST drivers/gpu/drm/i915/intel_device_info.h
  CC [M]  drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.o
  HDRTEST drivers/gpu/drm/xe/xe_ttm_stolen_mgr.h
  HDRTEST drivers/gpu/drm/i915/intel_gvt.h
  CC [M]  drivers/gpu/drm/nouveau/nvkm/subdev/mxm/nv50.o
  HDRTEST drivers/gpu/drm/xe/xe_ttm_sys_mgr.h
  HDRTEST drivers/gpu/drm/i915/intel_mchbar_regs.h
  CC [M]  drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v8.o
  CC [M]  drivers/gpu/drm/nouveau/nvkm/subdev/pci/agp.o
  CC [M]  drivers/gpu/drm/nouveau/nvkm/subdev/pci/base.o
  CC [M]  drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v9.o
  HDRTEST drivers/gpu/drm/xe/xe_ttm_vram_mgr.h
  CC [M]  drivers/gpu/drm/nouveau/nvkm/subdev/pci/pcie.o
  HDRTEST drivers/gpu/drm/i915/intel_memory_region.h
  CC [M]  drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_arcturus.o
  HDRTEST drivers/gpu/drm/xe/xe_ttm_vram_mgr_types.h
  HDRTEST drivers/gpu/drm/i915/intel_pci_config.h
  HDRTEST drivers/gpu/drm/i915/intel_pcode.h
../drivers/gpu/drm/i915/i915_utils.h:284: warning: Function parameter or member 'OP' not described in '__wait_for'
../drivers/gpu/drm/i915/i915_utils.h:284: warning: Function parameter or member 'COND' not described in '__wait_for'
../drivers/gpu/drm/i915/i915_utils.h:284: warning: Function parameter or member 'US' not described in '__wait_for'
../drivers/gpu/drm/i915/i915_utils.h:284: warning: Function parameter or member 'Wmin' not described in '__wait_for'
../drivers/gpu/drm/i915/i915_utils.h:284: warning: Function parameter or member 'Wmax' not described in '__wait_for'
  CC [M]  drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_aldebaran.o
  HDRTEST drivers/gpu/drm/i915/intel_region_ttm.h
  CC [M]  drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v10.o
  CC [M]  drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v10_3.o
  HDRTEST drivers/gpu/drm/i915/intel_runtime_pm.h
  CC [M]  drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v11.o
  HDRTEST drivers/gpu/drm/i915/intel_sbi.h
  CC [M]  drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v7.o
  CC [M]  drivers/gpu/drm/nouveau/nvkm/subdev/pci/nv04.o
  HDRTEST drivers/gpu/drm/i915/intel_step.h
  CC [M]  drivers/gpu/drm/amd/amdgpu/amdgpu_cgs.o
  HDRTEST drivers/gpu/drm/i915/intel_uncore.h
  HDRTEST drivers/gpu/drm/i915/intel_wakeref.h
  CC [M]  drivers/gpu/drm/nouveau/nvkm/subdev/pci/nv40.o
  HDRTEST drivers/gpu/drm/i915/pxp/intel_pxp_tee.h
  CC [M]  drivers/gpu/drm/nouveau/nvkm/subdev/pci/nv46.o
  HDRTEST drivers/gpu/drm/i915/pxp/intel_pxp_irq.h
  CC [M]  drivers/gpu/drm/amd/amdgpu/amdgpu_job.o
  CC [M]  drivers/gpu/drm/nouveau/nvkm/subdev/pci/nv4c.o
../drivers/gpu/drm/i915/i915_vma_resource.h:91: warning: Incorrect use of kernel-doc format:          * struct i915_vma_bindinfo - Information needed for async bind
  HDRTEST drivers/gpu/drm/xe/xe_tuning.h
../drivers/gpu/drm/i915/i915_vma_resource.h:129: warning: Function parameter or member 'wakeref' not described in 'i915_vma_resource'
../drivers/gpu/drm/i915/i915_vma_resource.h:129: warning: Function parameter or member 'bi' not described in 'i915_vma_resource'
  HDRTEST drivers/gpu/drm/i915/pxp/intel_pxp_session.h
  HDRTEST drivers/gpu/drm/xe/xe_uc.h
  HDRTEST drivers/gpu/drm/i915/pxp/intel_pxp_cmd_interface_43.h
  CC [M]  drivers/gpu/drm/amd/amdgpu/amdgpu_acp.o
  HDRTEST drivers/gpu/drm/i915/pxp/intel_pxp_cmd.h
  CC [M]  drivers/gpu/drm/amd/amdgpu/../acp/acp_hw.o
  CC [M]  drivers/gpu/drm/nouveau/nvkm/subdev/pci/g84.o
  HDRTEST drivers/gpu/drm/i915/pxp/intel_pxp.h
  HDRTEST drivers/gpu/drm/i915/pxp/intel_pxp_types.h
  HDRTEST drivers/gpu/drm/i915/pxp/intel_pxp_debugfs.h
  HDRTEST drivers/gpu/drm/i915/pxp/intel_pxp_cmd_interface_cmn.h
  HDRTEST drivers/gpu/drm/i915/pxp/intel_pxp_huc.h
  HDRTEST drivers/gpu/drm/i915/pxp/intel_pxp_pm.h
  HDRTEST drivers/gpu/drm/i915/pxp/intel_pxp_cmd_interface_42.h
  HDRTEST drivers/gpu/drm/i915/selftests/igt_live_test.h
  HDRTEST drivers/gpu/drm/i915/selftests/igt_atomic.h
../drivers/gpu/drm/i915/i915_vma.h:145: warning: expecting prototype for i915_vma_offset(). Prototype was for i915_vma_size() instead
  HDRTEST drivers/gpu/drm/i915/selftests/mock_gem_device.h
  HDRTEST drivers/gpu/drm/i915/selftests/mock_drm.h
  CC [M]  drivers/gpu/drm/nouveau/nvkm/subdev/pci/g92.o
  HDRTEST drivers/gpu/drm/i915/selftests/igt_reset.h
  CC [M]  drivers/gpu/drm/nouveau/nvkm/subdev/pci/g94.o
  CC [M]  drivers/gpu/drm/nouveau/nvkm/subdev/pci/gf100.o
  HDRTEST drivers/gpu/drm/i915/selftests/intel_scheduler_helpers.h
  CC [M]  drivers/gpu/drm/nouveau/nvkm/subdev/pci/gf106.o
  HDRTEST drivers/gpu/drm/i915/selftests/lib_sw_fence.h
  HDRTEST drivers/gpu/drm/i915/selftests/i915_perf_selftests.h
  HDRTEST drivers/gpu/drm/i915/selftests/mock_uncore.h
  HDRTEST drivers/gpu/drm/xe/xe_uc_debugfs.h
  HDRTEST drivers/gpu/drm/i915/selftests/mock_gtt.h
  CC [M]  drivers/gpu/drm/nouveau/nvkm/subdev/pci/gk104.o
  HDRTEST drivers/gpu/drm/xe/xe_uc_fw.h
  HDRTEST drivers/gpu/drm/i915/selftests/i915_mock_selftests.h
  HDRTEST drivers/gpu/drm/xe/xe_uc_fw_abi.h
  CC [M]  drivers/gpu/drm/nouveau/nvkm/subdev/pci/gp100.o
  HDRTEST drivers/gpu/drm/i915/selftests/mock_request.h
  HDRTEST drivers/gpu/drm/xe/xe_uc_fw_types.h
  CC [M]  drivers/gpu/drm/nouveau/nvkm/subdev/pmu/base.o
  CC [M]  drivers/gpu/drm/amd/amdgpu/amdgpu_ioc32.o
  HDRTEST drivers/gpu/drm/xe/xe_uc_types.h
  CC [M]  drivers/gpu/drm/amd/amdgpu/amdgpu_acpi.o
  CC [M]  drivers/gpu/drm/nouveau/nvkm/subdev/pmu/memx.o
  CC [M]  drivers/gpu/drm/amd/amdgpu/amdgpu_hmm.o
  HDRTEST drivers/gpu/drm/i915/selftests/i915_random.h
  CC [M]  drivers/gpu/drm/nouveau/nvkm/subdev/pmu/gt215.o
  HDRTEST drivers/gpu/drm/i915/selftests/igt_spinner.h
  HDRTEST drivers/gpu/drm/xe/xe_vm.h
  HDRTEST drivers/gpu/drm/i915/selftests/librapl.h
  CC [M]  drivers/gpu/drm/amd/amdgpu/../pm/swsmu/smu11/arcturus_ppt.o
  HDRTEST drivers/gpu/drm/xe/xe_vm_doc.h
../drivers/gpu/drm/i915/pxp/intel_pxp_types.h:96: warning: Function parameter or member 'dev_link' not described in 'intel_pxp'
  HDRTEST drivers/gpu/drm/i915/selftests/mock_region.h
  CC [M]  drivers/gpu/drm/nouveau/nvkm/subdev/pmu/gf100.o
  HDRTEST drivers/gpu/drm/i915/selftests/i915_live_selftests.h
  CC [M]  drivers/gpu/drm/nouveau/nvkm/subdev/pmu/gf119.o
  CC [M]  drivers/gpu/drm/amd/amdgpu/../pm/swsmu/smu11/navi10_ppt.o
  HDRTEST drivers/gpu/drm/i915/selftests/igt_mmap.h
  HDRTEST drivers/gpu/drm/i915/selftests/igt_flush_test.h
  HDRTEST drivers/gpu/drm/xe/xe_vm_madvise.h
  CC [M]  drivers/gpu/drm/amd/amdgpu/../pm/swsmu/smu11/sienna_cichlid_ppt.o
  HDRTEST drivers/gpu/drm/xe/xe_vm_types.h
  HDRTEST drivers/gpu/drm/xe/xe_wa.h
  HDRTEST drivers/gpu/drm/i915/soc/intel_pch.h
  HDRTEST drivers/gpu/drm/i915/soc/intel_dram.h
  CC [M]  drivers/gpu/drm/nouveau/nvkm/subdev/pmu/gk104.o
  HDRTEST drivers/gpu/drm/i915/soc/intel_gmch.h
  CC [M]  drivers/gpu/drm/amd/amdgpu/../pm/swsmu/smu11/vangogh_ppt.o
  HDRTEST drivers/gpu/drm/i915/vlv_sideband.h
  CC [M]  drivers/gpu/drm/nouveau/nvkm/subdev/pmu/gk110.o
  CC [M]  drivers/gpu/drm/amd/amdgpu/../pm/swsmu/smu11/cyan_skillfish_ppt.o
  HDRTEST drivers/gpu/drm/xe/xe_wait_user_fence.h
  CC [M]  drivers/gpu/drm/nouveau/nvkm/subdev/pmu/gk208.o
  HDRTEST drivers/gpu/drm/xe/xe_wopcm.h
  HDRTEST drivers/gpu/drm/xe/xe_wopcm_types.h
  HDRTEST drivers/gpu/drm/i915/vlv_sideband_reg.h
  LD [M]  drivers/gpu/drm/xe/xe.o
  HDRTEST drivers/gpu/drm/i915/vlv_suspend.h
  CC [M]  drivers/gpu/drm/amd/amdgpu/../pm/swsmu/smu11/smu_v11_0.o
  CC [M]  drivers/gpu/drm/amd/amdgpu/../pm/swsmu/smu12/renoir_ppt.o
  CC [M]  drivers/gpu/drm/nouveau/nvkm/subdev/pmu/gk20a.o
  LD [M]  drivers/gpu/drm/i915/i915.o
  CC [M]  drivers/gpu/drm/amd/amdgpu/../pm/swsmu/smu12/smu_v12_0.o
  CC [M]  drivers/gpu/drm/amd/amdgpu/../pm/swsmu/smu13/smu_v13_0.o
  CC [M]  drivers/gpu/drm/amd/amdgpu/../pm/swsmu/smu13/aldebaran_ppt.o
  CC [M]  drivers/gpu/drm/amd/amdgpu/../pm/swsmu/smu13/yellow_carp_ppt.o
  CC [M]  drivers/gpu/drm/amd/amdgpu/../pm/swsmu/smu13/smu_v13_0_0_ppt.o
  CC [M]  drivers/gpu/drm/nouveau/nvkm/subdev/pmu/gm107.o
  CC [M]  drivers/gpu/drm/amd/amdgpu/../pm/swsmu/smu13/smu_v13_0_4_ppt.o
  CC [M]  drivers/gpu/drm/amd/amdgpu/../pm/swsmu/smu13/smu_v13_0_5_ppt.o
  CC [M]  drivers/gpu/drm/amd/amdgpu/../pm/swsmu/smu13/smu_v13_0_7_ppt.o
  CC [M]  drivers/gpu/drm/nouveau/nvkm/subdev/pmu/gm200.o
  CC [M]  drivers/gpu/drm/nouveau/nvkm/subdev/pmu/gm20b.o
  CC [M]  drivers/gpu/drm/amd/amdgpu/../pm/swsmu/smu13/smu_v13_0_6_ppt.o
  CC [M]  drivers/gpu/drm/amd/amdgpu/../pm/swsmu/amdgpu_smu.o
  CC [M]  drivers/gpu/drm/nouveau/nvkm/subdev/pmu/gp102.o
  CC [M]  drivers/gpu/drm/amd/amdgpu/../pm/swsmu/smu_cmn.o
  CC [M]  drivers/gpu/drm/amd/amdgpu/../pm/powerplay/smumgr/smumgr.o
  CC [M]  drivers/gpu/drm/amd/amdgpu/../pm/powerplay/smumgr/smu8_smumgr.o
  CC [M]  drivers/gpu/drm/nouveau/nvkm/subdev/pmu/gp10b.o
  CC [M]  drivers/gpu/drm/nouveau/nvkm/subdev/privring/gf100.o
  CC [M]  drivers/gpu/drm/nouveau/nvkm/subdev/privring/gf117.o
  CC [M]  drivers/gpu/drm/amd/amdgpu/../pm/powerplay/smumgr/tonga_smumgr.o
  CC [M]  drivers/gpu/drm/amd/amdgpu/../pm/powerplay/smumgr/fiji_smumgr.o
  CC [M]  drivers/gpu/drm/amd/amdgpu/../pm/powerplay/smumgr/polaris10_smumgr.o
  CC [M]  drivers/gpu/drm/amd/amdgpu/../pm/powerplay/smumgr/iceland_smumgr.o
  CC [M]  drivers/gpu/drm/amd/amdgpu/../pm/powerplay/smumgr/smu7_smumgr.o
  CC [M]  drivers/gpu/drm/nouveau/nvkm/subdev/privring/gk104.o
  CC [M]  drivers/gpu/drm/nouveau/nvkm/subdev/privring/gk20a.o
  CC [M]  drivers/gpu/drm/nouveau/nvkm/subdev/privring/gm200.o
  CC [M]  drivers/gpu/drm/nouveau/nvkm/subdev/privring/gp10b.o
  CC [M]  drivers/gpu/drm/nouveau/nvkm/subdev/therm/base.o
  CC [M]  drivers/gpu/drm/nouveau/nvkm/subdev/therm/fan.o
  CC [M]  drivers/gpu/drm/amd/amdgpu/../pm/powerplay/smumgr/vega10_smumgr.o
drivers/gpu/drm/xe/xe.o: warning: objtool: intel_set_cpu_fifo_underrun_reporting+0x385: unreachable instruction
  CC [M]  drivers/gpu/drm/nouveau/nvkm/subdev/therm/fannil.o
  CC [M]  drivers/gpu/drm/nouveau/nvkm/subdev/therm/fanpwm.o
  CC [M]  drivers/gpu/drm/nouveau/nvkm/subdev/therm/fantog.o
  CC [M]  drivers/gpu/drm/amd/amdgpu/../pm/powerplay/smumgr/smu10_smumgr.o
  CC [M]  drivers/gpu/drm/amd/amdgpu/../pm/powerplay/smumgr/ci_smumgr.o
  CC [M]  drivers/gpu/drm/nouveau/nvkm/subdev/therm/ic.o
  CC [M]  drivers/gpu/drm/nouveau/nvkm/subdev/therm/temp.o
  CC [M]  drivers/gpu/drm/amd/amdgpu/../pm/powerplay/smumgr/vega12_smumgr.o
  CC [M]  drivers/gpu/drm/amd/amdgpu/../pm/powerplay/smumgr/vegam_smumgr.o
  CC [M]  drivers/gpu/drm/nouveau/nvkm/subdev/therm/nv40.o
  CC [M]  drivers/gpu/drm/nouveau/nvkm/subdev/therm/nv50.o
  CC [M]  drivers/gpu/drm/amd/amdgpu/../pm/powerplay/smumgr/smu9_smumgr.o
  CC [M]  drivers/gpu/drm/nouveau/nvkm/subdev/therm/g84.o
  CC [M]  drivers/gpu/drm/amd/amdgpu/../pm/powerplay/smumgr/vega20_smumgr.o
  CC [M]  drivers/gpu/drm/amd/amdgpu/../pm/powerplay/hwmgr/hwmgr.o
  CC [M]  drivers/gpu/drm/nouveau/nvkm/subdev/therm/gt215.o
  CC [M]  drivers/gpu/drm/nouveau/nvkm/subdev/therm/gf100.o
  CC [M]  drivers/gpu/drm/nouveau/nvkm/subdev/therm/gf119.o
  CC [M]  drivers/gpu/drm/amd/amdgpu/../pm/powerplay/hwmgr/processpptables.o
  CC [M]  drivers/gpu/drm/nouveau/nvkm/subdev/therm/gk104.o
  CC [M]  drivers/gpu/drm/amd/amdgpu/../pm/powerplay/hwmgr/hardwaremanager.o
  CC [M]  drivers/gpu/drm/amd/amdgpu/../pm/powerplay/hwmgr/smu8_hwmgr.o
  CC [M]  drivers/gpu/drm/nouveau/nvkm/subdev/therm/gm107.o
  CC [M]  drivers/gpu/drm/amd/amdgpu/../pm/powerplay/hwmgr/pppcielanes.o
  CC [M]  drivers/gpu/drm/nouveau/nvkm/subdev/therm/gm200.o
  CC [M]  drivers/gpu/drm/nouveau/nvkm/subdev/therm/gp100.o
  CC [M]  drivers/gpu/drm/amd/amdgpu/../pm/powerplay/hwmgr/process_pptables_v1_0.o
  CC [M]  drivers/gpu/drm/nouveau/nvkm/subdev/timer/base.o
  CC [M]  drivers/gpu/drm/amd/amdgpu/../pm/powerplay/hwmgr/ppatomctrl.o
  CC [M]  drivers/gpu/drm/nouveau/nvkm/subdev/timer/nv04.o
  CC [M]  drivers/gpu/drm/nouveau/nvkm/subdev/timer/nv40.o
  CC [M]  drivers/gpu/drm/nouveau/nvkm/subdev/timer/nv41.o
  CC [M]  drivers/gpu/drm/amd/amdgpu/../pm/powerplay/hwmgr/ppatomfwctrl.o
  CC [M]  drivers/gpu/drm/amd/amdgpu/../pm/powerplay/hwmgr/smu7_hwmgr.o
  CC [M]  drivers/gpu/drm/nouveau/nvkm/subdev/timer/gk20a.o
  CC [M]  drivers/gpu/drm/amd/amdgpu/../pm/powerplay/hwmgr/smu7_powertune.o
  CC [M]  drivers/gpu/drm/amd/amdgpu/../pm/powerplay/hwmgr/smu7_thermal.o
  CC [M]  drivers/gpu/drm/amd/amdgpu/../pm/powerplay/hwmgr/smu7_clockpowergating.o
  CC [M]  drivers/gpu/drm/amd/amdgpu/../pm/powerplay/hwmgr/vega10_processpptables.o
  CC [M]  drivers/gpu/drm/nouveau/nvkm/subdev/top/base.o
  CC [M]  drivers/gpu/drm/nouveau/nvkm/subdev/top/gk104.o
  CC [M]  drivers/gpu/drm/amd/amdgpu/../pm/powerplay/hwmgr/vega10_hwmgr.o
  CC [M]  drivers/gpu/drm/amd/amdgpu/../pm/powerplay/hwmgr/vega10_powertune.o
  CC [M]  drivers/gpu/drm/amd/amdgpu/../pm/powerplay/hwmgr/vega10_thermal.o
  CC [M]  drivers/gpu/drm/nouveau/nvkm/subdev/top/ga100.o
  CC [M]  drivers/gpu/drm/nouveau/nvkm/subdev/vfn/base.o
  CC [M]  drivers/gpu/drm/nouveau/nvkm/subdev/vfn/uvfn.o
  CC [M]  drivers/gpu/drm/nouveau/nvkm/subdev/vfn/gv100.o
  CC [M]  drivers/gpu/drm/nouveau/nvkm/subdev/vfn/tu102.o
  CC [M]  drivers/gpu/drm/nouveau/nvkm/subdev/vfn/ga100.o
  CC [M]  drivers/gpu/drm/amd/amdgpu/../pm/powerplay/hwmgr/smu10_hwmgr.o
  CC [M]  drivers/gpu/drm/nouveau/nvkm/subdev/volt/base.o
  CC [M]  drivers/gpu/drm/nouveau/nvkm/subdev/volt/gpio.o
  CC [M]  drivers/gpu/drm/nouveau/nvkm/subdev/volt/nv40.o
  CC [M]  drivers/gpu/drm/nouveau/nvkm/subdev/volt/gf100.o
  CC [M]  drivers/gpu/drm/nouveau/nvkm/subdev/volt/gf117.o
  CC [M]  drivers/gpu/drm/nouveau/nvkm/subdev/volt/gk104.o
  CC [M]  drivers/gpu/drm/nouveau/nvkm/subdev/volt/gk20a.o
  CC [M]  drivers/gpu/drm/nouveau/nvkm/subdev/volt/gm20b.o
  CC [M]  drivers/gpu/drm/amd/amdgpu/../pm/powerplay/hwmgr/pp_psm.o
  CC [M]  drivers/gpu/drm/nouveau/nvkm/engine/falcon.o
  CC [M]  drivers/gpu/drm/nouveau/nvkm/engine/xtensa.o
  CC [M]  drivers/gpu/drm/nouveau/nvkm/engine/bsp/g84.o
  CC [M]  drivers/gpu/drm/amd/amdgpu/../pm/powerplay/hwmgr/vega12_processpptables.o
  CC [M]  drivers/gpu/drm/amd/amdgpu/../pm/powerplay/hwmgr/vega12_hwmgr.o
  CC [M]  drivers/gpu/drm/amd/amdgpu/../pm/powerplay/hwmgr/vega12_thermal.o
  CC [M]  drivers/gpu/drm/nouveau/nvkm/engine/ce/gt215.o
  CC [M]  drivers/gpu/drm/amd/amdgpu/../pm/powerplay/hwmgr/pp_overdriver.o
  CC [M]  drivers/gpu/drm/nouveau/nvkm/engine/ce/gf100.o
  CC [M]  drivers/gpu/drm/amd/amdgpu/../pm/powerplay/hwmgr/smu_helper.o
  CC [M]  drivers/gpu/drm/amd/amdgpu/../pm/powerplay/hwmgr/vega20_processpptables.o
  CC [M]  drivers/gpu/drm/amd/amdgpu/../pm/powerplay/hwmgr/vega20_hwmgr.o
  CC [M]  drivers/gpu/drm/amd/amdgpu/../pm/powerplay/hwmgr/vega20_powertune.o
  CC [M]  drivers/gpu/drm/amd/amdgpu/../pm/powerplay/hwmgr/vega20_thermal.o
  CC [M]  drivers/gpu/drm/nouveau/nvkm/engine/ce/gk104.o
  CC [M]  drivers/gpu/drm/amd/amdgpu/../pm/powerplay/hwmgr/common_baco.o
  CC [M]  drivers/gpu/drm/nouveau/nvkm/engine/ce/gm107.o
  CC [M]  drivers/gpu/drm/amd/amdgpu/../pm/powerplay/hwmgr/vega10_baco.o
  CC [M]  drivers/gpu/drm/amd/amdgpu/../pm/powerplay/hwmgr/vega20_baco.o
  CC [M]  drivers/gpu/drm/amd/amdgpu/../pm/powerplay/hwmgr/vega12_baco.o
  CC [M]  drivers/gpu/drm/nouveau/nvkm/engine/ce/gm200.o
  CC [M]  drivers/gpu/drm/amd/amdgpu/../pm/powerplay/hwmgr/smu9_baco.o
  CC [M]  drivers/gpu/drm/nouveau/nvkm/engine/ce/gp100.o
  CC [M]  drivers/gpu/drm/nouveau/nvkm/engine/ce/gp102.o
  CC [M]  drivers/gpu/drm/amd/amdgpu/../pm/powerplay/hwmgr/tonga_baco.o
  CC [M]  drivers/gpu/drm/nouveau/nvkm/engine/ce/gv100.o
  CC [M]  drivers/gpu/drm/amd/amdgpu/../pm/powerplay/hwmgr/polaris_baco.o
  CC [M]  drivers/gpu/drm/amd/amdgpu/../pm/powerplay/hwmgr/fiji_baco.o
  CC [M]  drivers/gpu/drm/nouveau/nvkm/engine/ce/tu102.o
  CC [M]  drivers/gpu/drm/amd/amdgpu/../pm/powerplay/hwmgr/ci_baco.o
  CC [M]  drivers/gpu/drm/amd/amdgpu/../pm/powerplay/hwmgr/smu7_baco.o
  CC [M]  drivers/gpu/drm/nouveau/nvkm/engine/ce/ga100.o
  CC [M]  drivers/gpu/drm/nouveau/nvkm/engine/ce/ga102.o
  CC [M]  drivers/gpu/drm/amd/amdgpu/../pm/powerplay/amd_powerplay.o
  CC [M]  drivers/gpu/drm/amd/amdgpu/../pm/legacy-dpm/legacy_dpm.o
  CC [M]  drivers/gpu/drm/amd/amdgpu/../pm/legacy-dpm/kv_dpm.o
  CC [M]  drivers/gpu/drm/nouveau/nvkm/engine/cipher/g84.o
  CC [M]  drivers/gpu/drm/nouveau/nvkm/engine/device/acpi.o
  CC [M]  drivers/gpu/drm/amd/amdgpu/../pm/legacy-dpm/kv_smc.o
  CC [M]  drivers/gpu/drm/amd/amdgpu/../pm/legacy-dpm/si_dpm.o
  CC [M]  drivers/gpu/drm/amd/amdgpu/../pm/legacy-dpm/si_smc.o
  CC [M]  drivers/gpu/drm/amd/amdgpu/../pm/amdgpu_dpm.o
  CC [M]  drivers/gpu/drm/nouveau/nvkm/engine/device/base.o
  CC [M]  drivers/gpu/drm/nouveau/nvkm/engine/device/ctrl.o
  CC [M]  drivers/gpu/drm/amd/amdgpu/../pm/amdgpu_pm.o
  CC [M]  drivers/gpu/drm/amd/amdgpu/../pm/amdgpu_dpm_internal.o
  CC [M]  drivers/gpu/drm/nouveau/nvkm/engine/device/pci.o
  CC [M]  drivers/gpu/drm/amd/amdgpu/../display/amdgpu_dm/amdgpu_dm.o
  CC [M]  drivers/gpu/drm/amd/amdgpu/../display/amdgpu_dm/amdgpu_dm_plane.o
  CC [M]  drivers/gpu/drm/amd/amdgpu/../display/amdgpu_dm/amdgpu_dm_crtc.o
  CC [M]  drivers/gpu/drm/amd/amdgpu/../display/amdgpu_dm/amdgpu_dm_irq.o
  CC [M]  drivers/gpu/drm/nouveau/nvkm/engine/device/tegra.o
  CC [M]  drivers/gpu/drm/nouveau/nvkm/engine/device/user.o
  CC [M]  drivers/gpu/drm/nouveau/nvkm/engine/disp/base.o
  CC [M]  drivers/gpu/drm/amd/amdgpu/../display/amdgpu_dm/amdgpu_dm_mst_types.o
  CC [M]  drivers/gpu/drm/amd/amdgpu/../display/amdgpu_dm/amdgpu_dm_color.o
  CC [M]  drivers/gpu/drm/amd/amdgpu/../display/amdgpu_dm/dc_fpu.o
  CC [M]  drivers/gpu/drm/nouveau/nvkm/engine/disp/chan.o
  CC [M]  drivers/gpu/drm/nouveau/nvkm/engine/disp/conn.o
  CC [M]  drivers/gpu/drm/nouveau/nvkm/engine/disp/dp.o
  CC [M]  drivers/gpu/drm/nouveau/nvkm/engine/disp/hdmi.o
  CC [M]  drivers/gpu/drm/amd/amdgpu/../display/amdgpu_dm/amdgpu_dm_services.o
  CC [M]  drivers/gpu/drm/nouveau/nvkm/engine/disp/head.o
  CC [M]  drivers/gpu/drm/nouveau/nvkm/engine/disp/ior.o
  CC [M]  drivers/gpu/drm/amd/amdgpu/../display/amdgpu_dm/amdgpu_dm_helpers.o
  CC [M]  drivers/gpu/drm/nouveau/nvkm/engine/disp/outp.o
  CC [M]  drivers/gpu/drm/amd/amdgpu/../display/amdgpu_dm/amdgpu_dm_pp_smu.o
  CC [M]  drivers/gpu/drm/amd/amdgpu/../display/amdgpu_dm/amdgpu_dm_psr.o
  CC [M]  drivers/gpu/drm/nouveau/nvkm/engine/disp/vga.o
  CC [M]  drivers/gpu/drm/nouveau/nvkm/engine/disp/nv04.o
  CC [M]  drivers/gpu/drm/amd/amdgpu/../display/amdgpu_dm/amdgpu_dm_hdcp.o
  CC [M]  drivers/gpu/drm/amd/amdgpu/../display/amdgpu_dm/amdgpu_dm_crc.o
  CC [M]  drivers/gpu/drm/amd/amdgpu/../display/amdgpu_dm/amdgpu_dm_debugfs.o
  CC [M]  drivers/gpu/drm/amd/amdgpu/../display/dc/basics/conversion.o
  CC [M]  drivers/gpu/drm/amd/amdgpu/../display/dc/basics/fixpt31_32.o
  CC [M]  drivers/gpu/drm/amd/amdgpu/../display/dc/basics/vector.o
  CC [M]  drivers/gpu/drm/amd/amdgpu/../display/dc/basics/dc_common.o
  CC [M]  drivers/gpu/drm/amd/amdgpu/../display/dc/bios/bios_parser.o
  CC [M]  drivers/gpu/drm/amd/amdgpu/../display/dc/bios/bios_parser_interface.o
  CC [M]  drivers/gpu/drm/amd/amdgpu/../display/dc/bios/bios_parser_helper.o
  CC [M]  drivers/gpu/drm/amd/amdgpu/../display/dc/bios/command_table.o
  CC [M]  drivers/gpu/drm/amd/amdgpu/../display/dc/bios/command_table_helper.o
  CC [M]  drivers/gpu/drm/nouveau/nvkm/engine/disp/nv50.o
  CC [M]  drivers/gpu/drm/nouveau/nvkm/engine/disp/g84.o
  CC [M]  drivers/gpu/drm/amd/amdgpu/../display/dc/bios/bios_parser_common.o
  CC [M]  drivers/gpu/drm/amd/amdgpu/../display/dc/bios/command_table2.o
  CC [M]  drivers/gpu/drm/nouveau/nvkm/engine/disp/g94.o
  CC [M]  drivers/gpu/drm/nouveau/nvkm/engine/disp/gt200.o
  CC [M]  drivers/gpu/drm/amd/amdgpu/../display/dc/bios/command_table_helper2.o
  CC [M]  drivers/gpu/drm/nouveau/nvkm/engine/disp/mcp77.o
  CC [M]  drivers/gpu/drm/amd/amdgpu/../display/dc/bios/bios_parser2.o
  CC [M]  drivers/gpu/drm/nouveau/nvkm/engine/disp/gt215.o
  CC [M]  drivers/gpu/drm/amd/amdgpu/../display/dc/bios/dce60/command_table_helper_dce60.o
  CC [M]  drivers/gpu/drm/nouveau/nvkm/engine/disp/mcp89.o
  CC [M]  drivers/gpu/drm/amd/amdgpu/../display/dc/bios/dce80/command_table_helper_dce80.o
  CC [M]  drivers/gpu/drm/amd/amdgpu/../display/dc/bios/dce110/command_table_helper_dce110.o
  CC [M]  drivers/gpu/drm/nouveau/nvkm/engine/disp/gf119.o
  CC [M]  drivers/gpu/drm/nouveau/nvkm/engine/disp/gk104.o
  CC [M]  drivers/gpu/drm/amd/amdgpu/../display/dc/bios/dce112/command_table_helper_dce112.o
  CC [M]  drivers/gpu/drm/nouveau/nvkm/engine/disp/gk110.o
  CC [M]  drivers/gpu/drm/amd/amdgpu/../display/dc/bios/dce112/command_table_helper2_dce112.o
  CC [M]  drivers/gpu/drm/amd/amdgpu/../display/dc/dml/calcs/dce_calcs.o
  CC [M]  drivers/gpu/drm/nouveau/nvkm/engine/disp/gm107.o
  CC [M]  drivers/gpu/drm/amd/amdgpu/../display/dc/dml/calcs/custom_float.o
  CC [M]  drivers/gpu/drm/amd/amdgpu/../display/dc/dml/calcs/bw_fixed.o
  CC [M]  drivers/gpu/drm/nouveau/nvkm/engine/disp/gm200.o
  CC [M]  drivers/gpu/drm/amd/amdgpu/../display/dc/dml/display_mode_lib.o
  CC [M]  drivers/gpu/drm/nouveau/nvkm/engine/disp/gp100.o
  CC [M]  drivers/gpu/drm/amd/amdgpu/../display/dc/dml/display_rq_dlg_helpers.o
  CC [M]  drivers/gpu/drm/amd/amdgpu/../display/dc/dml/dml1_display_rq_dlg_calc.o
  CC [M]  drivers/gpu/drm/amd/amdgpu/../display/dc/dml/dcn10/dcn10_fpu.o
  CC [M]  drivers/gpu/drm/amd/amdgpu/../display/dc/dml/dcn20/dcn20_fpu.o
  CC [M]  drivers/gpu/drm/amd/amdgpu/../display/dc/dml/display_mode_vba.o
  CC [M]  drivers/gpu/drm/amd/amdgpu/../display/dc/dml/dcn20/display_rq_dlg_calc_20.o
  CC [M]  drivers/gpu/drm/nouveau/nvkm/engine/disp/gp102.o
  CC [M]  drivers/gpu/drm/amd/amdgpu/../display/dc/dml/dcn20/display_mode_vba_20.o
  CC [M]  drivers/gpu/drm/nouveau/nvkm/engine/disp/gv100.o
  CC [M]  drivers/gpu/drm/amd/amdgpu/../display/dc/dml/dcn20/display_rq_dlg_calc_20v2.o
  CC [M]  drivers/gpu/drm/amd/amdgpu/../display/dc/dml/dcn20/display_mode_vba_20v2.o
  CC [M]  drivers/gpu/drm/amd/amdgpu/../display/dc/dml/dcn21/display_rq_dlg_calc_21.o
  CC [M]  drivers/gpu/drm/amd/amdgpu/../display/dc/dml/dcn21/display_mode_vba_21.o
  CC [M]  drivers/gpu/drm/amd/amdgpu/../display/dc/dml/dcn30/dcn30_fpu.o
  CC [M]  drivers/gpu/drm/nouveau/nvkm/engine/disp/tu102.o
  CC [M]  drivers/gpu/drm/amd/amdgpu/../display/dc/dml/dcn30/display_mode_vba_30.o
  CC [M]  drivers/gpu/drm/amd/amdgpu/../display/dc/dml/dcn30/display_rq_dlg_calc_30.o
  CC [M]  drivers/gpu/drm/nouveau/nvkm/engine/disp/ga102.o
  CC [M]  drivers/gpu/drm/nouveau/nvkm/engine/disp/udisp.o
  CC [M]  drivers/gpu/drm/amd/amdgpu/../display/dc/dml/dcn31/display_mode_vba_31.o
  CC [M]  drivers/gpu/drm/nouveau/nvkm/engine/disp/uconn.o
  CC [M]  drivers/gpu/drm/amd/amdgpu/../display/dc/dml/dcn31/display_rq_dlg_calc_31.o
  CC [M]  drivers/gpu/drm/amd/amdgpu/../display/dc/dml/dcn314/display_mode_vba_314.o
  CC [M]  drivers/gpu/drm/nouveau/nvkm/engine/disp/uoutp.o
  CC [M]  drivers/gpu/drm/nouveau/nvkm/engine/disp/uhead.o
  CC [M]  drivers/gpu/drm/amd/amdgpu/../display/dc/dml/dcn314/display_rq_dlg_calc_314.o
  CC [M]  drivers/gpu/drm/amd/amdgpu/../display/dc/dml/dcn32/display_mode_vba_32.o
  CC [M]  drivers/gpu/drm/nouveau/nvkm/engine/dma/base.o
  CC [M]  drivers/gpu/drm/amd/amdgpu/../display/dc/dml/dcn32/display_rq_dlg_calc_32.o
  CC [M]  drivers/gpu/drm/amd/amdgpu/../display/dc/dml/dcn32/display_mode_vba_util_32.o
  CC [M]  drivers/gpu/drm/nouveau/nvkm/engine/dma/nv04.o
  CC [M]  drivers/gpu/drm/amd/amdgpu/../display/dc/dml/dcn31/dcn31_fpu.o
  CC [M]  drivers/gpu/drm/nouveau/nvkm/engine/dma/nv50.o
  CC [M]  drivers/gpu/drm/nouveau/nvkm/engine/dma/gf100.o
  CC [M]  drivers/gpu/drm/nouveau/nvkm/engine/dma/gf119.o
  CC [M]  drivers/gpu/drm/amd/amdgpu/../display/dc/dml/dcn32/dcn32_fpu.o
  CC [M]  drivers/gpu/drm/amd/amdgpu/../display/dc/dml/dcn321/dcn321_fpu.o
  CC [M]  drivers/gpu/drm/amd/amdgpu/../display/dc/dml/dcn301/dcn301_fpu.o
  CC [M]  drivers/gpu/drm/amd/amdgpu/../display/dc/dml/dcn302/dcn302_fpu.o
  CC [M]  drivers/gpu/drm/nouveau/nvkm/engine/dma/gv100.o
  CC [M]  drivers/gpu/drm/nouveau/nvkm/engine/dma/user.o
  CC [M]  drivers/gpu/drm/amd/amdgpu/../display/dc/dml/dcn303/dcn303_fpu.o
  CC [M]  drivers/gpu/drm/nouveau/nvkm/engine/dma/usernv04.o
  CC [M]  drivers/gpu/drm/amd/amdgpu/../display/dc/dml/dcn314/dcn314_fpu.o
  CC [M]  drivers/gpu/drm/nouveau/nvkm/engine/dma/usernv50.o
  CC [M]  drivers/gpu/drm/nouveau/nvkm/engine/dma/usergf100.o
  CC [M]  drivers/gpu/drm/amd/amdgpu/../display/dc/dml/dsc/rc_calc_fpu.o
  CC [M]  drivers/gpu/drm/nouveau/nvkm/engine/dma/usergf119.o
  CC [M]  drivers/gpu/drm/amd/amdgpu/../display/dc/dml/calcs/dcn_calcs.o
  CC [M]  drivers/gpu/drm/nouveau/nvkm/engine/dma/usergv100.o
  CC [M]  drivers/gpu/drm/nouveau/nvkm/engine/fifo/base.o
  CC [M]  drivers/gpu/drm/nouveau/nvkm/engine/fifo/cgrp.o
  CC [M]  drivers/gpu/drm/nouveau/nvkm/engine/fifo/chan.o
  CC [M]  drivers/gpu/drm/amd/amdgpu/../display/dc/dml/calcs/dcn_calc_math.o
  CC [M]  drivers/gpu/drm/nouveau/nvkm/engine/fifo/chid.o
  CC [M]  drivers/gpu/drm/amd/amdgpu/../display/dc/dml/calcs/dcn_calc_auto.o
  CC [M]  drivers/gpu/drm/nouveau/nvkm/engine/fifo/runl.o
  CC [M]  drivers/gpu/drm/amd/amdgpu/../display/dc/clk_mgr/clk_mgr.o
  CC [M]  drivers/gpu/drm/nouveau/nvkm/engine/fifo/runq.o
  CC [M]  drivers/gpu/drm/nouveau/nvkm/engine/fifo/nv04.o
  CC [M]  drivers/gpu/drm/amd/amdgpu/../display/dc/clk_mgr/dce60/dce60_clk_mgr.o
  CC [M]  drivers/gpu/drm/amd/amdgpu/../display/dc/clk_mgr/dce100/dce_clk_mgr.o
  CC [M]  drivers/gpu/drm/amd/amdgpu/../display/dc/clk_mgr/dce110/dce110_clk_mgr.o
  CC [M]  drivers/gpu/drm/amd/amdgpu/../display/dc/clk_mgr/dce112/dce112_clk_mgr.o
  CC [M]  drivers/gpu/drm/nouveau/nvkm/engine/fifo/nv10.o
  CC [M]  drivers/gpu/drm/nouveau/nvkm/engine/fifo/nv17.o
  CC [M]  drivers/gpu/drm/amd/amdgpu/../display/dc/clk_mgr/dce120/dce120_clk_mgr.o
  CC [M]  drivers/gpu/drm/nouveau/nvkm/engine/fifo/nv40.o
  CC [M]  drivers/gpu/drm/nouveau/nvkm/engine/fifo/nv50.o
  CC [M]  drivers/gpu/drm/amd/amdgpu/../display/dc/clk_mgr/dcn10/rv1_clk_mgr.o
  CC [M]  drivers/gpu/drm/nouveau/nvkm/engine/fifo/g84.o
  CC [M]  drivers/gpu/drm/amd/amdgpu/../display/dc/clk_mgr/dcn10/rv1_clk_mgr_vbios_smu.o
  CC [M]  drivers/gpu/drm/amd/amdgpu/../display/dc/clk_mgr/dcn10/rv2_clk_mgr.o
  CC [M]  drivers/gpu/drm/amd/amdgpu/../display/dc/clk_mgr/dcn20/dcn20_clk_mgr.o
  CC [M]  drivers/gpu/drm/nouveau/nvkm/engine/fifo/g98.o
  CC [M]  drivers/gpu/drm/amd/amdgpu/../display/dc/clk_mgr/dcn201/dcn201_clk_mgr.o
  CC [M]  drivers/gpu/drm/nouveau/nvkm/engin



^ permalink raw reply	[flat|nested] 59+ messages in thread

* [Intel-xe] ✗ CI.Hooks: failure for drm/xe/pmu: Enable PMU interface (rev2)
  2023-06-27 12:21 [Intel-xe] [PATCH v2 0/2] drm/xe/pmu: Enable PMU interface Aravind Iddamsetty
                   ` (5 preceding siblings ...)
  2023-06-27 13:10 ` [Intel-xe] ✓ CI.Build: " Patchwork
@ 2023-06-27 13:10 ` Patchwork
  6 siblings, 0 replies; 59+ messages in thread
From: Patchwork @ 2023-06-27 13:10 UTC (permalink / raw)
  To: Aravind Iddamsetty; +Cc: intel-xe

== Series Details ==

Series: drm/xe/pmu: Enable PMU interface (rev2)
URL   : https://patchwork.freedesktop.org/series/119504/
State : failure

== Summary ==

run-parts: executing /workspace/ci/hooks/00-showenv
+ pwd
+ ls -la
/workspace
total 516
drwxrwxr-x 10 1003 1003   4096 Jun 27 13:10 .
drwxr-xr-x  1 root root   4096 Jun 27 13:10 ..
-rw-rw-r--  1 1003 1003 397466 Jun 27 13:09 build.log
-rw-rw-r--  1 1003 1003   4991 Jun 27 13:04 checkpatch.log
drwxrwxr-x  5 1003 1003   4096 Jun 27 13:03 ci
drwxrwxr-x 10 1003 1003   4096 Jun 27 13:03 docker
drwxrwxr-x  8 1003 1003   4096 Jun 27 13:03 .git
-rw-rw-r--  1 1003 1003    330 Jun 27 13:04 git_apply.log
drwxrwxr-x  3 1003 1003   4096 Jun 27 13:03 .github
-rw-rw-r--  1 1003 1003    233 Jun 27 13:03 .groovylintrc.json
-rw-rw-r--  1 1003 1003     78 Jun 27 13:10 hooks.log
drwxrwxr-x 31 1003 1003   4096 Jun 27 13:09 kernel
-rw-rw-r--  1 1003 1003  30955 Jun 27 13:04 kernel.mbox
-rw-rw-r--  1 1003 1003  25980 Jun 27 13:06 kunit.log
drwxrwxr-x 42 1003 1003   4096 Jun 27 13:03 pipelines
-rw-rw-r--  1 1003 1003    793 Jun 27 13:03 README.adoc
drwxrwxr-x  3 1003 1003   4096 Jun 27 13:03 scripts
drwxrwxr-x  2 1003 1003   4096 Jun 27 13:03 .vscode
+ uname -a
Linux 5311b1888aa7 5.4.0-149-generic #166-Ubuntu SMP Tue Apr 18 16:51:45 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux
+ export
+ grep -Ei '(^|\W)CI_'
declare -x CI_KERNEL_BUILD_DIR="/workspace/kernel/build64"
declare -x CI_KERNEL_IMAGES_DIR="/workspace/kernel/archive/boot"
declare -x CI_KERNEL_MODULES_DIR="/workspace/kernel/archive"
declare -x CI_KERNEL_SRC_DIR="/workspace/kernel"
declare -x CI_SRC_DIR="/workspace/kernel"
declare -x CI_TOOLS_SRC_DIR="/workspace/ci"
declare -x CI_WORKSPACE_DIR="/workspace"
+ '[' -n /workspace ']'
+ git_args='-C /workspace/kernel'
+ git_log_args=
+ git --no-pager -C /workspace/kernel log --format=oneline --abbrev-commit
599be1fe6 drm/xe/pmu: Enable PMU interface
da50f9f65 drm/xe: Get GT clock to nanosecs
f135580b5 drm/xe: Fix vm refcount races
run-parts: executing /workspace/ci/hooks/10-build-W1
+ SRC_DIR=/workspace/kernel
+ RESTORE_DISPLAY_CONFIG=0
+ '[' -n /workspace/kernel/build64 ']'
+ BUILD_DIR=/workspace/kernel/build64
+ cd /workspace/kernel
+ grep -q -e '^CONFIG_DRM_XE_DISPLAY=[yY]' /workspace/kernel/build64/.config
+ RESTORE_DISPLAY_CONFIG=1
+ trap cleanup EXIT
+ ./scripts/config --file /workspace/kernel/build64/.config --disable CONFIG_DRM_XE_DISPLAY
++ nproc
+ make -j48 O=/workspace/kernel/build64 modules_prepare
make[1]: Entering directory '/workspace/kernel/build64'
  SYNC    include/config/auto.conf.cmd
  GEN     Makefile
  GEN     Makefile
  UPD     include/config/kernel.release
  UPD     include/generated/compile.h
  UPD     include/generated/utsrelease.h
  DESCEND objtool
  CALL    ../scripts/checksyscalls.sh
  HOSTCC  /workspace/kernel/build64/tools/objtool/fixdep.o
  HOSTLD  /workspace/kernel/build64/tools/objtool/fixdep-in.o
  LINK    /workspace/kernel/build64/tools/objtool/fixdep
  INSTALL libsubcmd_headers
  CC      /workspace/kernel/build64/tools/objtool/libsubcmd/exec-cmd.o
  CC      /workspace/kernel/build64/tools/objtool/libsubcmd/help.o
  CC      /workspace/kernel/build64/tools/objtool/libsubcmd/pager.o
  CC      /workspace/kernel/build64/tools/objtool/libsubcmd/parse-options.o
  CC      /workspace/kernel/build64/tools/objtool/libsubcmd/run-command.o
  CC      /workspace/kernel/build64/tools/objtool/libsubcmd/subcmd-config.o
  CC      /workspace/kernel/build64/tools/objtool/libsubcmd/sigchain.o
  LD      /workspace/kernel/build64/tools/objtool/libsubcmd/libsubcmd-in.o
  AR      /workspace/kernel/build64/tools/objtool/libsubcmd/libsubcmd.a
  CC      /workspace/kernel/build64/tools/objtool/weak.o
  CC      /workspace/kernel/build64/tools/objtool/check.o
  CC      /workspace/kernel/build64/tools/objtool/special.o
  CC      /workspace/kernel/build64/tools/objtool/builtin-check.o
  CC      /workspace/kernel/build64/tools/objtool/elf.o
  CC      /workspace/kernel/build64/tools/objtool/objtool.o
  CC      /workspace/kernel/build64/tools/objtool/orc_gen.o
  CC      /workspace/kernel/build64/tools/objtool/orc_dump.o
  CC      /workspace/kernel/build64/tools/objtool/libstring.o
  CC      /workspace/kernel/build64/tools/objtool/libctype.o
  CC      /workspace/kernel/build64/tools/objtool/str_error_r.o
  CC      /workspace/kernel/build64/tools/objtool/librbtree.o
  CC      /workspace/kernel/build64/tools/objtool/arch/x86/special.o
  CC      /workspace/kernel/build64/tools/objtool/arch/x86/decode.o
  LD      /workspace/kernel/build64/tools/objtool/arch/x86/objtool-in.o
  LD      /workspace/kernel/build64/tools/objtool/objtool-in.o
  LINK    /workspace/kernel/build64/tools/objtool/objtool
make[1]: Leaving directory '/workspace/kernel/build64'
++ nproc
+ make -j48 O=/workspace/kernel/build64 M=drivers/gpu/drm/xe W=1
make[1]: Entering directory '/workspace/kernel/build64'
  CC [M]  drivers/gpu/drm/xe/xe_bb.o
  CC [M]  drivers/gpu/drm/xe/xe_bo.o
  CC [M]  drivers/gpu/drm/xe/xe_bo_evict.o
  CC [M]  drivers/gpu/drm/xe/xe_debugfs.o
  CC [M]  drivers/gpu/drm/xe/xe_devcoredump.o
  CC [M]  drivers/gpu/drm/xe/xe_device.o
  CC [M]  drivers/gpu/drm/xe/xe_dma_buf.o
  CC [M]  drivers/gpu/drm/xe/xe_engine.o
  CC [M]  drivers/gpu/drm/xe/xe_exec.o
  CC [M]  drivers/gpu/drm/xe/xe_execlist.o
  CC [M]  drivers/gpu/drm/xe/xe_force_wake.o
  CC [M]  drivers/gpu/drm/xe/xe_ggtt.o
  CC [M]  drivers/gpu/drm/xe/xe_gt.o
  CC [M]  drivers/gpu/drm/xe/xe_gt_clock.o
  CC [M]  drivers/gpu/drm/xe/xe_gt_debugfs.o
  CC [M]  drivers/gpu/drm/xe/xe_gt_idle_sysfs.o
  CC [M]  drivers/gpu/drm/xe/xe_gt_mcr.o
  CC [M]  drivers/gpu/drm/xe/xe_gt_pagefault.o
  CC [M]  drivers/gpu/drm/xe/xe_gt_sysfs.o
  CC [M]  drivers/gpu/drm/xe/xe_gt_tlb_invalidation.o
  HOSTCC  drivers/gpu/drm/xe/xe_gen_wa_oob
  CC [M]  drivers/gpu/drm/xe/xe_gt_topology.o
  CC [M]  drivers/gpu/drm/xe/xe_guc_ads.o
  CC [M]  drivers/gpu/drm/xe/xe_guc_ct.o
  CC [M]  drivers/gpu/drm/xe/xe_guc_debugfs.o
  CC [M]  drivers/gpu/drm/xe/xe_guc_hwconfig.o
  CC [M]  drivers/gpu/drm/xe/xe_guc_log.o
  CC [M]  drivers/gpu/drm/xe/xe_guc_pc.o
  CC [M]  drivers/gpu/drm/xe/xe_guc_submit.o
  CC [M]  drivers/gpu/drm/xe/xe_hw_engine.o
  CC [M]  drivers/gpu/drm/xe/xe_hw_fence.o
  CC [M]  drivers/gpu/drm/xe/xe_huc.o
  CC [M]  drivers/gpu/drm/xe/xe_huc_debugfs.o
  CC [M]  drivers/gpu/drm/xe/xe_irq.o
  CC [M]  drivers/gpu/drm/xe/xe_lrc.o
  CC [M]  drivers/gpu/drm/xe/xe_migrate.o
  CC [M]  drivers/gpu/drm/xe/xe_mmio.o
  CC [M]  drivers/gpu/drm/xe/xe_mocs.o
  CC [M]  drivers/gpu/drm/xe/xe_module.o
  CC [M]  drivers/gpu/drm/xe/xe_pat.o
  CC [M]  drivers/gpu/drm/xe/xe_pci.o
  CC [M]  drivers/gpu/drm/xe/xe_pcode.o
  CC [M]  drivers/gpu/drm/xe/xe_pm.o
  CC [M]  drivers/gpu/drm/xe/xe_preempt_fence.o
  CC [M]  drivers/gpu/drm/xe/xe_pt.o
  CC [M]  drivers/gpu/drm/xe/xe_pt_walk.o
  CC [M]  drivers/gpu/drm/xe/xe_query.o
  CC [M]  drivers/gpu/drm/xe/xe_reg_sr.o
  CC [M]  drivers/gpu/drm/xe/xe_reg_whitelist.o
  CC [M]  drivers/gpu/drm/xe/xe_rtp.o
  CC [M]  drivers/gpu/drm/xe/xe_sa.o
  CC [M]  drivers/gpu/drm/xe/xe_sched_job.o
  CC [M]  drivers/gpu/drm/xe/xe_step.o
  CC [M]  drivers/gpu/drm/xe/xe_sync.o
  CC [M]  drivers/gpu/drm/xe/xe_tile.o
  CC [M]  drivers/gpu/drm/xe/xe_trace.o
  CC [M]  drivers/gpu/drm/xe/xe_ttm_sys_mgr.o
  CC [M]  drivers/gpu/drm/xe/xe_ttm_stolen_mgr.o
  CC [M]  drivers/gpu/drm/xe/xe_ttm_vram_mgr.o
  CC [M]  drivers/gpu/drm/xe/xe_tuning.o
  CC [M]  drivers/gpu/drm/xe/xe_uc.o
  CC [M]  drivers/gpu/drm/xe/xe_uc_debugfs.o
  CC [M]  drivers/gpu/drm/xe/xe_uc_fw.o
  CC [M]  drivers/gpu/drm/xe/xe_vm.o
  CC [M]  drivers/gpu/drm/xe/xe_vm_madvise.o
  CC [M]  drivers/gpu/drm/xe/xe_wait_user_fence.o
  CC [M]  drivers/gpu/drm/xe/xe_wopcm.o
  CC [M]  drivers/gpu/drm/xe/xe_pmu.o
  HDRTEST drivers/gpu/drm/xe/abi/guc_klvs_abi.h
  HDRTEST drivers/gpu/drm/xe/abi/guc_errors_abi.h
  HDRTEST drivers/gpu/drm/xe/abi/guc_actions_slpc_abi.h
  CC [M]  drivers/gpu/drm/xe/tests/xe_bo_test.o
  HDRTEST drivers/gpu/drm/xe/abi/guc_communication_mmio_abi.h
  HDRTEST drivers/gpu/drm/xe/abi/guc_actions_abi.h
  CC [M]  drivers/gpu/drm/xe/tests/xe_dma_buf_test.o
  HDRTEST drivers/gpu/drm/xe/abi/guc_communication_ctb_abi.h
  HDRTEST drivers/gpu/drm/xe/abi/guc_messages_abi.h
  CC [M]  drivers/gpu/drm/xe/tests/xe_migrate_test.o
  HDRTEST drivers/gpu/drm/xe/regs/xe_reg_defs.h
  CC [M]  drivers/gpu/drm/xe/tests/xe_pci_test.o
  HDRTEST drivers/gpu/drm/xe/regs/xe_guc_regs.h
  CC [M]  drivers/gpu/drm/xe/tests/xe_rtp_test.o
  CC [M]  drivers/gpu/drm/xe/tests/xe_wa_test.o
  HDRTEST drivers/gpu/drm/xe/regs/xe_gt_regs.h
  HDRTEST drivers/gpu/drm/xe/regs/xe_regs.h
  HDRTEST drivers/gpu/drm/xe/regs/xe_gpu_commands.h
  HDRTEST drivers/gpu/drm/xe/regs/xe_lrc_layout.h
  HDRTEST drivers/gpu/drm/xe/regs/xe_engine_regs.h
  HDRTEST drivers/gpu/drm/xe/tests/xe_test.h
  HDRTEST drivers/gpu/drm/xe/tests/xe_pci_test.h
  HDRTEST drivers/gpu/drm/xe/tests/xe_migrate_test.h
  HDRTEST drivers/gpu/drm/xe/tests/xe_dma_buf_test.h
  HDRTEST drivers/gpu/drm/xe/tests/xe_bo_test.h
  HDRTEST drivers/gpu/drm/xe/xe_bb.h
  HDRTEST drivers/gpu/drm/xe/xe_bb_types.h
  HDRTEST drivers/gpu/drm/xe/xe_bo.h
  HDRTEST drivers/gpu/drm/xe/xe_bo_doc.h
  HDRTEST drivers/gpu/drm/xe/xe_bo_evict.h
  HDRTEST drivers/gpu/drm/xe/xe_bo_types.h
  HDRTEST drivers/gpu/drm/xe/xe_debugfs.h
  HDRTEST drivers/gpu/drm/xe/xe_devcoredump.h
  HDRTEST drivers/gpu/drm/xe/xe_devcoredump_types.h
  HDRTEST drivers/gpu/drm/xe/xe_device.h
  HDRTEST drivers/gpu/drm/xe/xe_device_types.h
  HDRTEST drivers/gpu/drm/xe/xe_dma_buf.h
  HDRTEST drivers/gpu/drm/xe/xe_drv.h
  HDRTEST drivers/gpu/drm/xe/xe_engine.h
  HDRTEST drivers/gpu/drm/xe/xe_engine_types.h
  HDRTEST drivers/gpu/drm/xe/xe_exec.h
  HDRTEST drivers/gpu/drm/xe/xe_execlist.h
  HDRTEST drivers/gpu/drm/xe/xe_execlist_types.h
  HDRTEST drivers/gpu/drm/xe/xe_force_wake.h
  HDRTEST drivers/gpu/drm/xe/xe_force_wake_types.h
  HDRTEST drivers/gpu/drm/xe/xe_ggtt.h
  HDRTEST drivers/gpu/drm/xe/xe_ggtt_types.h
  HDRTEST drivers/gpu/drm/xe/xe_gt.h
  HDRTEST drivers/gpu/drm/xe/xe_gt_clock.h
  HDRTEST drivers/gpu/drm/xe/xe_gt_debugfs.h
  HDRTEST drivers/gpu/drm/xe/xe_gt_idle_sysfs.h
  HDRTEST drivers/gpu/drm/xe/xe_gt_idle_sysfs_types.h
  HDRTEST drivers/gpu/drm/xe/xe_gt_mcr.h
  HDRTEST drivers/gpu/drm/xe/xe_gt_pagefault.h
  HDRTEST drivers/gpu/drm/xe/xe_gt_printk.h
  HDRTEST drivers/gpu/drm/xe/xe_gt_sysfs.h
  HDRTEST drivers/gpu/drm/xe/xe_gt_sysfs_types.h
  HDRTEST drivers/gpu/drm/xe/xe_gt_tlb_invalidation.h
  HDRTEST drivers/gpu/drm/xe/xe_gt_tlb_invalidation_types.h
  HDRTEST drivers/gpu/drm/xe/xe_gt_topology.h
  HDRTEST drivers/gpu/drm/xe/xe_gt_types.h
  HDRTEST drivers/gpu/drm/xe/xe_guc.h
  HDRTEST drivers/gpu/drm/xe/xe_guc_ads.h
  HDRTEST drivers/gpu/drm/xe/xe_guc_ads_types.h
  HDRTEST drivers/gpu/drm/xe/xe_guc_ct.h
  HDRTEST drivers/gpu/drm/xe/xe_guc_ct_types.h
  HDRTEST drivers/gpu/drm/xe/xe_guc_debugfs.h
  HDRTEST drivers/gpu/drm/xe/xe_guc_engine_types.h
  HDRTEST drivers/gpu/drm/xe/xe_guc_fwif.h
  HDRTEST drivers/gpu/drm/xe/xe_guc_hwconfig.h
  HDRTEST drivers/gpu/drm/xe/xe_guc_log.h
  HDRTEST drivers/gpu/drm/xe/xe_guc_log_types.h
  HDRTEST drivers/gpu/drm/xe/xe_guc_pc.h
  HDRTEST drivers/gpu/drm/xe/xe_guc_pc_types.h
  HDRTEST drivers/gpu/drm/xe/xe_guc_submit.h
  HDRTEST drivers/gpu/drm/xe/xe_guc_submit_types.h
  HDRTEST drivers/gpu/drm/xe/xe_guc_types.h
  HDRTEST drivers/gpu/drm/xe/xe_huc.h
  HDRTEST drivers/gpu/drm/xe/xe_huc_debugfs.h
  HDRTEST drivers/gpu/drm/xe/xe_huc_types.h
  HDRTEST drivers/gpu/drm/xe/xe_hw_engine.h
  HDRTEST drivers/gpu/drm/xe/xe_hw_engine_types.h
  HDRTEST drivers/gpu/drm/xe/xe_hw_fence.h
  HDRTEST drivers/gpu/drm/xe/xe_hw_fence_types.h
  HDRTEST drivers/gpu/drm/xe/xe_irq.h
  HDRTEST drivers/gpu/drm/xe/xe_lrc.h
  HDRTEST drivers/gpu/drm/xe/xe_lrc_types.h
  HDRTEST drivers/gpu/drm/xe/xe_macros.h
  HDRTEST drivers/gpu/drm/xe/xe_map.h
  HDRTEST drivers/gpu/drm/xe/xe_migrate.h
  HDRTEST drivers/gpu/drm/xe/xe_migrate_doc.h
  HDRTEST drivers/gpu/drm/xe/xe_mmio.h
  HDRTEST drivers/gpu/drm/xe/xe_mocs.h
  HDRTEST drivers/gpu/drm/xe/xe_module.h
  HDRTEST drivers/gpu/drm/xe/xe_pat.h
  HDRTEST drivers/gpu/drm/xe/xe_pci.h
  HDRTEST drivers/gpu/drm/xe/xe_pci_types.h
  HDRTEST drivers/gpu/drm/xe/xe_pcode.h
  HDRTEST drivers/gpu/drm/xe/xe_pcode_api.h
  HDRTEST drivers/gpu/drm/xe/xe_platform_types.h
  HDRTEST drivers/gpu/drm/xe/xe_pm.h
  HDRTEST drivers/gpu/drm/xe/xe_pmu.h
  HDRTEST drivers/gpu/drm/xe/xe_pmu_types.h
  HDRTEST drivers/gpu/drm/xe/xe_preempt_fence.h
  HDRTEST drivers/gpu/drm/xe/xe_preempt_fence_types.h
  HDRTEST drivers/gpu/drm/xe/xe_pt.h
  HDRTEST drivers/gpu/drm/xe/xe_pt_types.h
  HDRTEST drivers/gpu/drm/xe/xe_pt_walk.h
  HDRTEST drivers/gpu/drm/xe/xe_query.h
  HDRTEST drivers/gpu/drm/xe/xe_reg_sr.h
  HDRTEST drivers/gpu/drm/xe/xe_reg_sr_types.h
  HDRTEST drivers/gpu/drm/xe/xe_reg_whitelist.h
  HDRTEST drivers/gpu/drm/xe/xe_res_cursor.h
  HDRTEST drivers/gpu/drm/xe/xe_ring_ops.h
  HDRTEST drivers/gpu/drm/xe/xe_ring_ops_types.h
  HDRTEST drivers/gpu/drm/xe/xe_rtp.h
  HDRTEST drivers/gpu/drm/xe/xe_rtp_types.h
  HDRTEST drivers/gpu/drm/xe/xe_sa.h
  HDRTEST drivers/gpu/drm/xe/xe_sa_types.h
  HDRTEST drivers/gpu/drm/xe/xe_sched_job.h
  HDRTEST drivers/gpu/drm/xe/xe_sched_job_types.h
  HDRTEST drivers/gpu/drm/xe/xe_step.h
  HDRTEST drivers/gpu/drm/xe/xe_step_types.h
  HDRTEST drivers/gpu/drm/xe/xe_sync.h
  HDRTEST drivers/gpu/drm/xe/xe_sync_types.h
  HDRTEST drivers/gpu/drm/xe/xe_tile.h
  HDRTEST drivers/gpu/drm/xe/xe_trace.h
  HDRTEST drivers/gpu/drm/xe/xe_ttm_stolen_mgr.h
  HDRTEST drivers/gpu/drm/xe/xe_ttm_sys_mgr.h
  HDRTEST drivers/gpu/drm/xe/xe_ttm_vram_mgr.h
  HDRTEST drivers/gpu/drm/xe/xe_ttm_vram_mgr_types.h
  HDRTEST drivers/gpu/drm/xe/xe_tuning.h
  HDRTEST drivers/gpu/drm/xe/xe_uc.h
  HDRTEST drivers/gpu/drm/xe/xe_uc_debugfs.h
  HDRTEST drivers/gpu/drm/xe/xe_uc_fw.h
  HDRTEST drivers/gpu/drm/xe/xe_uc_fw_abi.h
  HDRTEST drivers/gpu/drm/xe/xe_uc_fw_types.h
  HDRTEST drivers/gpu/drm/xe/xe_uc_types.h
  HDRTEST drivers/gpu/drm/xe/xe_vm.h
  HDRTEST drivers/gpu/drm/xe/xe_vm_doc.h
  HDRTEST drivers/gpu/drm/xe/xe_vm_madvise.h
  HDRTEST drivers/gpu/drm/xe/xe_vm_types.h
  HDRTEST drivers/gpu/drm/xe/xe_wa.h
  HDRTEST drivers/gpu/drm/xe/xe_wait_user_fence.h
  HDRTEST drivers/gpu/drm/xe/xe_wopcm.h
  HDRTEST drivers/gpu/drm/xe/xe_wopcm_types.h
  GEN     xe_wa_oob.c xe_wa_oob.h
  GEN     xe_wa_oob.c xe_wa_oob.h
  CC [M]  drivers/gpu/drm/xe/xe_guc.o
  CC [M]  drivers/gpu/drm/xe/xe_ring_ops.o
  CC [M]  drivers/gpu/drm/xe/xe_wa.o
  LD [M]  drivers/gpu/drm/xe/xe.o
  MODPOST drivers/gpu/drm/xe/Module.symvers
  CC [M]  drivers/gpu/drm/xe/xe.mod.o
  CC [M]  drivers/gpu/drm/xe/tests/xe_bo_test.mod.o
  CC [M]  drivers/gpu/drm/xe/tests/xe_dma_buf_test.mod.o
  CC [M]  drivers/gpu/drm/xe/tests/xe_migrate_test.mod.o
  CC [M]  drivers/gpu/drm/xe/tests/xe_pci_test.mod.o
  CC [M]  drivers/gpu/drm/xe/tests/xe_rtp_test.mod.o
  CC [M]  drivers/gpu/drm/xe/tests/xe_wa_test.mod.o
  LD [M]  drivers/gpu/drm/xe/tests/xe_bo_test.ko
  LD [M]  drivers/gpu/drm/xe/tests/xe_dma_buf_test.ko
  LD [M]  drivers/gpu/drm/xe/tests/xe_pci_test.ko
  LD [M]  drivers/gpu/drm/xe/tests/xe_wa_test.ko
  LD [M]  drivers/gpu/drm/xe/xe.ko
  LD [M]  drivers/gpu/drm/xe/tests/xe_migrate_test.ko
  LD [M]  drivers/gpu/drm/xe/tests/xe_rtp_test.ko
make[1]: Leaving directory '/workspace/kernel/build64'
+ cleanup
+ '[' 1 -eq 1 ']'
+ ./scripts/config --file /workspace/kernel/build64/.config --enable CONFIG_DRM_XE_DISPLAY
run-parts: executing /workspace/ci/hooks/20-kernel-doc
+ SRC_DIR=/workspace/kernel
+ cd /workspace/kernel
+ find drivers/gpu/drm/xe/ -name '*.[ch]' -not -path 'drivers/gpu/drm/xe/display/*'
+ xargs ./scripts/kernel-doc -Werror -none include/uapi/drm/xe_drm.h
drivers/gpu/drm/xe/xe_device_types.h:425: warning: Function parameter or member 'pmu' not described in 'xe_device'
1 warnings as Errors
run-parts: /workspace/ci/hooks/20-kernel-doc exited with return code 123
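The warning comes from the new @pmu member of struct xe_device being documented with a plain /* */ comment; a minimal sketch of the likely fix, assuming the kernel-doc style already used for neighbouring members such as @d3cold_allowed, would be:

	/** @pmu: performance monitoring unit */
	struct xe_pmu pmu;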



^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [Intel-xe] [PATCH v2 2/2] drm/xe/pmu: Enable PMU interface
  2023-06-27 12:21 ` [Intel-xe] [PATCH v2 2/2] drm/xe/pmu: Enable PMU interface Aravind Iddamsetty
@ 2023-06-30 13:53   ` Upadhyay, Tejas
  2023-07-03  5:11     ` Iddamsetty, Aravind
  2023-07-04  3:34   ` Ghimiray, Himal Prasad
                     ` (5 subsequent siblings)
  6 siblings, 1 reply; 59+ messages in thread
From: Upadhyay, Tejas @ 2023-06-30 13:53 UTC (permalink / raw)
  To: Iddamsetty, Aravind, intel-xe; +Cc: Bommu, Krishnaiah, Ursulin, Tvrtko

Review is in progress; as it is a large patch, it is taking time. Some initial comments follow:

> -----Original Message-----
> From: Intel-xe <intel-xe-bounces@lists.freedesktop.org> On Behalf Of
> Aravind Iddamsetty
> Sent: Tuesday, June 27, 2023 5:51 PM
> To: intel-xe@lists.freedesktop.org
> Cc: Bommu, Krishnaiah <krishnaiah.bommu@intel.com>; Ursulin, Tvrtko
> <tvrtko.ursulin@intel.com>
> Subject: [Intel-xe] [PATCH v2 2/2] drm/xe/pmu: Enable PMU interface
> 
> There are a set of engine group busyness counters provided by HW which
> are perfect fit to be exposed via PMU perf events.
> 
> BSPEC: 46559, 46560, 46722, 46729
> 
> events can be listed using:
> perf list
>   xe_0000_03_00.0/any-engine-group-busy-gt0/         [Kernel PMU event]
>   xe_0000_03_00.0/copy-group-busy-gt0/               [Kernel PMU event]
>   xe_0000_03_00.0/interrupts/                        [Kernel PMU event]
>   xe_0000_03_00.0/media-group-busy-gt0/              [Kernel PMU event]
>   xe_0000_03_00.0/render-group-busy-gt0/             [Kernel PMU event]
> 
> and can be read using:
> 
> perf stat -e "xe_0000_8c_00.0/render-group-busy-gt0/" -I 1000
>            time             counts unit events
>      1.001139062                  0 ns  xe_0000_8c_00.0/render-group-busy-gt0/
>      2.003294678                  0 ns  xe_0000_8c_00.0/render-group-busy-gt0/
>      3.005199582                  0 ns  xe_0000_8c_00.0/render-group-busy-gt0/
>      4.007076497                  0 ns  xe_0000_8c_00.0/render-group-busy-gt0/
>      5.008553068                  0 ns  xe_0000_8c_00.0/render-group-busy-gt0/
>      6.010531563              43520 ns  xe_0000_8c_00.0/render-group-busy-gt0/
>      7.012468029              44800 ns  xe_0000_8c_00.0/render-group-busy-gt0/
>      8.013463515                  0 ns  xe_0000_8c_00.0/render-group-busy-gt0/
>      9.015300183                  0 ns  xe_0000_8c_00.0/render-group-busy-gt0/
>     10.017233010                  0 ns  xe_0000_8c_00.0/render-group-busy-gt0/
>     10.971934120                  0 ns  xe_0000_8c_00.0/render-group-busy-gt0/
> 
> The pmu base implementation is taken from i915.
> 
> v2:
> Store last known value when device is awake return that while the GT is
> suspended and then update the driver copy when read during awake.
> 
> Co-developed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> Co-developed-by: Bommu Krishnaiah <krishnaiah.bommu@intel.com>
> Signed-off-by: Aravind Iddamsetty <aravind.iddamsetty@intel.com>
> ---
>  drivers/gpu/drm/xe/Makefile          |   2 +
>  drivers/gpu/drm/xe/regs/xe_gt_regs.h |   5 +
>  drivers/gpu/drm/xe/xe_device.c       |   2 +
>  drivers/gpu/drm/xe/xe_device_types.h |   4 +
>  drivers/gpu/drm/xe/xe_gt.c           |   2 +
>  drivers/gpu/drm/xe/xe_irq.c          |  22 +
>  drivers/gpu/drm/xe/xe_module.c       |   5 +
>  drivers/gpu/drm/xe/xe_pmu.c          | 739 +++++++++++++++++++++++++++
>  drivers/gpu/drm/xe/xe_pmu.h          |  25 +
>  drivers/gpu/drm/xe/xe_pmu_types.h    |  80 +++
>  include/uapi/drm/xe_drm.h            |  16 +
>  11 files changed, 902 insertions(+)
>  create mode 100644 drivers/gpu/drm/xe/xe_pmu.c
>  create mode 100644 drivers/gpu/drm/xe/xe_pmu.h
>  create mode 100644 drivers/gpu/drm/xe/xe_pmu_types.h
> 
> diff --git a/drivers/gpu/drm/xe/Makefile b/drivers/gpu/drm/xe/Makefile
> index 081c57fd8632..e52ab795c566 100644
> --- a/drivers/gpu/drm/xe/Makefile
> +++ b/drivers/gpu/drm/xe/Makefile
> @@ -217,6 +217,8 @@ xe-$(CONFIG_DRM_XE_DISPLAY) += \
>  	i915-display/skl_universal_plane.o \
>  	i915-display/skl_watermark.o
> 
> +xe-$(CONFIG_PERF_EVENTS) += xe_pmu.o
> +
>  ifeq ($(CONFIG_ACPI),y)
>  	xe-$(CONFIG_DRM_XE_DISPLAY) += \
>  		i915-display/intel_acpi.o \
> diff --git a/drivers/gpu/drm/xe/regs/xe_gt_regs.h b/drivers/gpu/drm/xe/regs/xe_gt_regs.h
> index 3f664011eaea..c7d9e4634745 100644
> --- a/drivers/gpu/drm/xe/regs/xe_gt_regs.h
> +++ b/drivers/gpu/drm/xe/regs/xe_gt_regs.h
> @@ -285,6 +285,11 @@
>  #define   INVALIDATION_BROADCAST_MODE_DIS	REG_BIT(12)
>  #define   GLOBAL_INVALIDATION_MODE		REG_BIT(2)
> 
> +#define XE_OAG_RC0_ANY_ENGINE_BUSY_FREE		XE_REG(0xdb80)
> +#define XE_OAG_ANY_MEDIA_FF_BUSY_FREE		XE_REG(0xdba0)
> +#define XE_OAG_BLT_BUSY_FREE			XE_REG(0xdbbc)
> +#define XE_OAG_RENDER_BUSY_FREE			XE_REG(0xdbdc)
> +
>  #define SAMPLER_MODE				XE_REG_MCR(0xe18c, XE_REG_OPTION_MASKED)
>  #define   ENABLE_SMALLPL			REG_BIT(15)
>  #define   SC_DISABLE_POWER_OPTIMIZATION_EBB	REG_BIT(9)
> diff --git a/drivers/gpu/drm/xe/xe_device.c b/drivers/gpu/drm/xe/xe_device.c
> index c7985af85a53..b2c7bd4a97d9 100644
> --- a/drivers/gpu/drm/xe/xe_device.c
> +++ b/drivers/gpu/drm/xe/xe_device.c
> @@ -328,6 +328,8 @@ int xe_device_probe(struct xe_device *xe)
> 
>  	xe_debugfs_register(xe);
> 
> +	xe_pmu_register(&xe->pmu);
> +
>  	err = drmm_add_action_or_reset(&xe->drm, xe_device_sanitize, xe);
>  	if (err)
>  		return err;
> diff --git a/drivers/gpu/drm/xe/xe_device_types.h b/drivers/gpu/drm/xe/xe_device_types.h
> index 0226d44a6af2..3ba99aae92b9 100644
> --- a/drivers/gpu/drm/xe/xe_device_types.h
> +++ b/drivers/gpu/drm/xe/xe_device_types.h
> @@ -15,6 +15,7 @@
>  #include "xe_devcoredump_types.h"
>  #include "xe_gt_types.h"
>  #include "xe_platform_types.h"
> +#include "xe_pmu.h"
>  #include "xe_step_types.h"
> 
>  #if IS_ENABLED(CONFIG_DRM_XE_DISPLAY)
> @@ -332,6 +333,9 @@ struct xe_device {
>  	/** @d3cold_allowed: Indicates if d3cold is a valid device state */
>  	bool d3cold_allowed;
> 
> +	/* @pmu: performance monitoring unit */
> +	struct xe_pmu pmu;
> +
>  	/* private: */
> 
>  #if IS_ENABLED(CONFIG_DRM_XE_DISPLAY)
> diff --git a/drivers/gpu/drm/xe/xe_gt.c b/drivers/gpu/drm/xe/xe_gt.c
> index 2458397ce8af..96e3720923d4 100644
> --- a/drivers/gpu/drm/xe/xe_gt.c
> +++ b/drivers/gpu/drm/xe/xe_gt.c
> @@ -593,6 +593,8 @@ int xe_gt_suspend(struct xe_gt *gt)
>  	if (err)
>  		goto err_msg;
> 
> +	engine_group_busyness_store(gt);
> +
>  	err = xe_uc_suspend(&gt->uc);
>  	if (err)
>  		goto err_force_wake;
> diff --git a/drivers/gpu/drm/xe/xe_irq.c b/drivers/gpu/drm/xe/xe_irq.c
> index b4ed1e4a3388..cb943fb94ec7 100644
> --- a/drivers/gpu/drm/xe/xe_irq.c
> +++ b/drivers/gpu/drm/xe/xe_irq.c
> @@ -27,6 +27,24 @@
>  #define IIR(offset)				XE_REG(offset + 0x8)
>  #define IER(offset)				XE_REG(offset + 0xc)
> 
> +/*
> + * Interrupt statistic for PMU. Increments the counter only if the
> + * interrupt originated from the GPU so interrupts from a device which
> + * shares the interrupt line are not accounted.
> + */
> +static inline void xe_pmu_irq_stats(struct xe_device *xe,
> +				    irqreturn_t res)
> +{
> +	if (unlikely(res != IRQ_HANDLED))
> +		return;
> +
> +	/*
> +	 * A clever compiler translates that into INC. A not so clever one
> +	 * should at least prevent store tearing.
> +	 */
> +	WRITE_ONCE(xe->pmu.irq_count, xe->pmu.irq_count + 1);
> +}
> +
>  static void assert_iir_is_zero(struct xe_gt *mmio, struct xe_reg reg)
>  {
>  	u32 val = xe_mmio_read32(mmio, reg);
> @@ -325,6 +343,8 @@ static irqreturn_t xelp_irq_handler(int irq, void *arg)
> 
>  	xe_display_irq_enable(xe, gu_misc_iir);
> 
> +	xe_pmu_irq_stats(xe, IRQ_HANDLED);
> +
>  	return IRQ_HANDLED;
>  }
> 
> @@ -414,6 +434,8 @@ static irqreturn_t dg1_irq_handler(int irq, void *arg)
>  	dg1_intr_enable(xe, false);
>  	xe_display_irq_enable(xe, gu_misc_iir);
> 
> +	xe_pmu_irq_stats(xe, IRQ_HANDLED);
> +
>  	return IRQ_HANDLED;
>  }
> 
> diff --git a/drivers/gpu/drm/xe/xe_module.c b/drivers/gpu/drm/xe/xe_module.c
> index 75e5be939f53..f6fe89748525 100644
> --- a/drivers/gpu/drm/xe/xe_module.c
> +++ b/drivers/gpu/drm/xe/xe_module.c
> @@ -12,6 +12,7 @@
>  #include "xe_hw_fence.h"
>  #include "xe_module.h"
>  #include "xe_pci.h"
> +#include "xe_pmu.h"
>  #include "xe_sched_job.h"
> 
>  bool enable_guc = true;
> @@ -49,6 +50,10 @@ static const struct init_funcs init_funcs[] = {
>  		.init = xe_sched_job_module_init,
>  		.exit = xe_sched_job_module_exit,
>  	},
> +	{
> +		.init = xe_pmu_init,
> +		.exit = xe_pmu_exit,
> +	},
>  	{
>  		.init = xe_register_pci_driver,
>  		.exit = xe_unregister_pci_driver,
> diff --git a/drivers/gpu/drm/xe/xe_pmu.c b/drivers/gpu/drm/xe/xe_pmu.c
> new file mode 100644
> index 000000000000..bef1895be9f7
> --- /dev/null
> +++ b/drivers/gpu/drm/xe/xe_pmu.c
> @@ -0,0 +1,739 @@
> +/*
> + * SPDX-License-Identifier: MIT
> + *
> + * Copyright © 2023 Intel Corporation
> + */
> +
> +#include <drm/drm_drv.h>
> +#include <drm/drm_managed.h>
> +#include <drm/xe_drm.h>
> +
> +#include "regs/xe_gt_regs.h"
> +#include "xe_device.h"
> +#include "xe_gt_clock.h"
> +#include "xe_mmio.h"
> +
> +static cpumask_t xe_pmu_cpumask;
> +static unsigned int xe_pmu_target_cpu = -1;
> +
> +static unsigned int config_gt_id(const u64 config)
> +{
> +	return config >> __XE_PMU_GT_SHIFT;
> +}
> +
> +static u64 config_counter(const u64 config)
> +{
> +	return config & ~(~0ULL << __XE_PMU_GT_SHIFT);
> +}
> +
> +static unsigned int
> +__sample_idx(struct xe_pmu *pmu, unsigned int gt_id, int sample)
> +{
> +	unsigned int idx = gt_id * __XE_NUM_PMU_SAMPLERS + sample;
> +
> +	XE_BUG_ON(idx >= ARRAY_SIZE(pmu->sample));
> +
> +	return idx;
> +}
> +
> +static u64 read_sample(struct xe_pmu *pmu, unsigned int gt_id, int sample)
> +{
> +	return pmu->sample[__sample_idx(pmu, gt_id, sample)].cur;
> +}
> +
> +static void
> +store_sample(struct xe_pmu *pmu, unsigned int gt_id, int sample, u64 val)
> +{
> +	pmu->sample[__sample_idx(pmu, gt_id, sample)].cur = val;
> +}
> +
> +static int engine_busyness_sample_type(u64 config)
> +{
> +	int type = 0;
> +
> +	switch (config) {
> +	case XE_PMU_RENDER_GROUP_BUSY(0):
> +		type =  __XE_SAMPLE_RENDER_GROUP_BUSY;
> +		break;
> +	case XE_PMU_COPY_GROUP_BUSY(0):
> +		type = __XE_SAMPLE_COPY_GROUP_BUSY;
> +		break;
> +	case XE_PMU_MEDIA_GROUP_BUSY(0):
> +		type = __XE_SAMPLE_MEDIA_GROUP_BUSY;
> +		break;
> +	case XE_PMU_ANY_ENGINE_GROUP_BUSY(0):
> +		type = __XE_SAMPLE_ANY_ENGINE_GROUP_BUSY;
> +		break;
> +	}
> +
> +	return type;
> +}
> +
> +static void xe_pmu_event_destroy(struct perf_event *event)
> +{
> +	struct xe_device *xe =
> +		container_of(event->pmu, typeof(*xe), pmu.base);
> +
> +	drm_WARN_ON(&xe->drm, event->parent);
> +
> +	drm_dev_put(&xe->drm);
> +}
> +
> +static u64 __engine_group_busyness_read(struct xe_gt *gt, u64 config)
> +{
> +	u64 val = 0;
> +
> +	switch (config) {
> +	case XE_PMU_RENDER_GROUP_BUSY(0):
> +		val = xe_mmio_read32(gt, XE_OAG_RENDER_BUSY_FREE);
> +		break;
> +	case XE_PMU_COPY_GROUP_BUSY(0):
> +		val = xe_mmio_read32(gt, XE_OAG_BLT_BUSY_FREE);
> +		break;
> +	case XE_PMU_MEDIA_GROUP_BUSY(0):
> +		val = xe_mmio_read32(gt, XE_OAG_ANY_MEDIA_FF_BUSY_FREE);
> +		break;
> +	case XE_PMU_ANY_ENGINE_GROUP_BUSY(0):
> +		val = xe_mmio_read32(gt, XE_OAG_RC0_ANY_ENGINE_BUSY_FREE);
> +		break;
> +	default:
> +		drm_warn(&gt->tile->xe->drm, "unknown pmu event\n");
> +	}
> +
> +	return xe_gt_clock_interval_to_ns(gt, val * 16);
> +}
> +
> +static u64 engine_group_busyness_read(struct xe_gt *gt, u64 config)
> +{
> +	int sample_type = engine_busyness_sample_type(config);
> +	struct xe_device *xe = gt->tile->xe;
> +	const unsigned int gt_id = gt->info.id;
> +	struct xe_pmu *pmu = &xe->pmu;
> +	bool device_awake;
> +	unsigned long flags;
> +	u64 val;
> +
> +	/*
> +	 * found no better way to check if device is awake or not. Before
> +	 * we suspend we set the submission_state.enabled to false.
> +	 */
> +	device_awake = gt->uc.guc.submission_state.enabled ? true : false;
> +	if (device_awake)
> +		val = __engine_group_busyness_read(gt, config);
> +
> +	spin_lock_irqsave(&pmu->lock, flags);
> +
> +	if (device_awake)
> +		store_sample(pmu, gt_id, sample_type, val);
> +	else
> +		val = read_sample(pmu, gt_id, sample_type);
> +
> +	spin_unlock_irqrestore(&pmu->lock, flags);
> +
> +	return val;
> +}
> +
> +void engine_group_busyness_store(struct xe_gt *gt)
> +{
> +	struct xe_pmu *pmu = &gt->tile->xe->pmu;
> +	unsigned int gt_id = gt->info.id;
> +	unsigned long flags;
> +
> +	spin_lock_irqsave(&pmu->lock, flags);
> +
> +	store_sample(pmu, gt_id, __XE_SAMPLE_RENDER_GROUP_BUSY,
> +		     __engine_group_busyness_read(gt, XE_PMU_RENDER_GROUP_BUSY(0)));

Shouldn't you pass gt_id here, i.e. XE_PMU_RENDER_GROUP_BUSY(gt_id)? I know it makes no difference right now, since a GT id is at most 0 or 1, but passing gt_id would keep this compatible in the future.
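
For illustration, something like this (an untested sketch using the macros from this patch):

	store_sample(pmu, gt_id, __XE_SAMPLE_RENDER_GROUP_BUSY,
		     __engine_group_busyness_read(gt,
						  XE_PMU_RENDER_GROUP_BUSY(gt_id)));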

> +	store_sample(pmu, gt_id, __XE_SAMPLE_COPY_GROUP_BUSY,
> +		     __engine_group_busyness_read(gt, XE_PMU_COPY_GROUP_BUSY(0)));
> +	store_sample(pmu, gt_id, __XE_SAMPLE_MEDIA_GROUP_BUSY,
> +		     __engine_group_busyness_read(gt, XE_PMU_MEDIA_GROUP_BUSY(0)));
> +	store_sample(pmu, gt_id, __XE_SAMPLE_ANY_ENGINE_GROUP_BUSY,
> +		     __engine_group_busyness_read(gt, XE_PMU_ANY_ENGINE_GROUP_BUSY(0)));
> +
> +	spin_unlock_irqrestore(&pmu->lock, flags);
> +}
> +
> +static int
> +config_status(struct xe_device *xe, u64 config)
> +{
> +	unsigned int max_gt_id = xe->info.gt_count > 1 ? 1 : 0;
> +	unsigned int gt_id = config_gt_id(config);
> +	struct xe_gt *gt = xe_device_get_gt(xe, gt_id);
> +
> +	if (gt_id > max_gt_id)
> +		return -ENOENT;
> +
> +	switch (config_counter(config)) {
> +	case XE_PMU_INTERRUPTS(0):
> +		if (gt_id)
> +			return -ENOENT;
> +		break;
> +	case XE_PMU_RENDER_GROUP_BUSY(0):
> +	case XE_PMU_COPY_GROUP_BUSY(0):
> +	case XE_PMU_ANY_ENGINE_GROUP_BUSY(0):
> +		if (GRAPHICS_VER(xe) < 12)
> +			return -ENOENT;
> +		break;
> +	case XE_PMU_MEDIA_GROUP_BUSY(0):
> +		if (MEDIA_VER(xe) >= 13 && gt->info.type != XE_GT_TYPE_MEDIA)
> +			return -ENOENT;
> +		break;
> +	default:
> +		return -ENOENT;
> +	}
> +
> +	return 0;
> +}
> +
> +static int xe_pmu_event_init(struct perf_event *event)
> +{
> +	struct xe_device *xe =
> +		container_of(event->pmu, typeof(*xe), pmu.base);
> +	struct xe_pmu *pmu = &xe->pmu;
> +	int ret;
> +
> +	if (pmu->closed)
> +		return -ENODEV;
> +
> +	if (event->attr.type != event->pmu->type)
> +		return -ENOENT;
> +
> +	/* unsupported modes and filters */
> +	if (event->attr.sample_period) /* no sampling */
> +		return -EINVAL;
> +
> +	if (has_branch_stack(event))
> +		return -EOPNOTSUPP;
> +
> +	if (event->cpu < 0)
> +		return -EINVAL;
> +
> +	/* only allow running on one cpu at a time */
> +	if (!cpumask_test_cpu(event->cpu, &xe_pmu_cpumask))
> +		return -EINVAL;
> +
> +	ret = config_status(xe, event->attr.config);
> +	if (ret)
> +		return ret;
> +
> +	if (!event->parent) {
> +		drm_dev_get(&xe->drm);
> +		event->destroy = xe_pmu_event_destroy;
> +	}
> +
> +	return 0;
> +}
> +
> +static u64 __xe_pmu_event_read(struct perf_event *event)
> +{
> +	struct xe_device *xe =
> +		container_of(event->pmu, typeof(*xe), pmu.base);
> +	const unsigned int gt_id = config_gt_id(event->attr.config);
> +	const u64 config = config_counter(event->attr.config);
> +	struct xe_gt *gt = xe_device_get_gt(xe, gt_id);
> +	struct xe_pmu *pmu = &xe->pmu;
> +	u64 val = 0;
> +
> +	switch (config) {
> +	case XE_PMU_INTERRUPTS(0):
> +		val = READ_ONCE(pmu->irq_count);
> +		break;
> +	case XE_PMU_RENDER_GROUP_BUSY(0):
> +	case XE_PMU_COPY_GROUP_BUSY(0):
> +	case XE_PMU_ANY_ENGINE_GROUP_BUSY(0):
> +	case XE_PMU_MEDIA_GROUP_BUSY(0):
> +		val = engine_group_busyness_read(gt, config);
> +	}
> +
> +	return val;
> +}
> +
> +static void xe_pmu_event_read(struct perf_event *event)
> +{
> +	struct xe_device *xe =
> +		container_of(event->pmu, typeof(*xe), pmu.base);
> +	struct hw_perf_event *hwc = &event->hw;
> +	struct xe_pmu *pmu = &xe->pmu;
> +	u64 prev, new;
> +
> +	if (pmu->closed) {
> +		event->hw.state = PERF_HES_STOPPED;
> +		return;
> +	}
> +again:
> +	prev = local64_read(&hwc->prev_count);
> +	new = __xe_pmu_event_read(event);
> +
> +	if (local64_cmpxchg(&hwc->prev_count, prev, new) != prev)
> +		goto again;
> +
> +	local64_add(new - prev, &event->count);
> +}
> +
> +static void xe_pmu_enable(struct perf_event *event)
> +{
> +	/*
> +	 * Store the current counter value so we can report the correct delta
> +	 * for all listeners. Even when the event was already enabled and has
> +	 * an existing non-zero value.
> +	 */
> +	local64_set(&event->hw.prev_count, __xe_pmu_event_read(event));
> +}
> +
> +static void xe_pmu_event_start(struct perf_event *event, int flags)
> +{
> +	struct xe_device *xe =
> +		container_of(event->pmu, typeof(*xe), pmu.base);
> +	struct xe_pmu *pmu = &xe->pmu;
> +
> +	if (pmu->closed)
> +		return;
> +
> +	xe_pmu_enable(event);
> +	event->hw.state = 0;
> +}
> +
> +static void xe_pmu_event_stop(struct perf_event *event, int flags)
> +{
> +	if (flags & PERF_EF_UPDATE)
> +		xe_pmu_event_read(event);
> +
> +	event->hw.state = PERF_HES_STOPPED;
> +}
> +
> +static int xe_pmu_event_add(struct perf_event *event, int flags)
> +{
> +	struct xe_device *xe =
> +		container_of(event->pmu, typeof(*xe), pmu.base);
> +	struct xe_pmu *pmu = &xe->pmu;
> +
> +	if (pmu->closed)
> +		return -ENODEV;
> +
> +	if (flags & PERF_EF_START)
> +		xe_pmu_event_start(event, flags);
> +
> +	return 0;
> +}
> +
> +static void xe_pmu_event_del(struct perf_event *event, int flags)
> +{
> +	xe_pmu_event_stop(event, PERF_EF_UPDATE);
> +}
> +
> +static int xe_pmu_event_event_idx(struct perf_event *event)
> +{
> +	return 0;
> +}
> +
> +struct xe_str_attribute {
> +	struct device_attribute attr;
> +	const char *str;
> +};
> +
> +static ssize_t xe_pmu_format_show(struct device *dev,
> +				  struct device_attribute *attr, char *buf)
> +{
> +	struct xe_str_attribute *eattr;
> +
> +	eattr = container_of(attr, struct xe_str_attribute, attr);
> +	return sprintf(buf, "%s\n", eattr->str);
> +}
> +
> +#define XE_PMU_FORMAT_ATTR(_name, _config) \
> +	(&((struct xe_str_attribute[]) { \
> +		{ .attr = __ATTR(_name, 0444, xe_pmu_format_show, NULL), \
> +		  .str = _config, } \
> +	})[0].attr.attr)
> +
> +static struct attribute *xe_pmu_format_attrs[] = {
> +	XE_PMU_FORMAT_ATTR(xe_eventid, "config:0-20"),
> +	NULL,
> +};
> +
> +static const struct attribute_group xe_pmu_format_attr_group = {
> +	.name = "format",
> +	.attrs = xe_pmu_format_attrs,
> +};
> +
> +struct xe_ext_attribute {
> +	struct device_attribute attr;
> +	unsigned long val;
> +};
> +
> +static ssize_t xe_pmu_event_show(struct device *dev,
> +				 struct device_attribute *attr, char *buf)
> +{
> +	struct xe_ext_attribute *eattr;
> +
> +	eattr = container_of(attr, struct xe_ext_attribute, attr);
> +	return sprintf(buf, "config=0x%lx\n", eattr->val);
> +}
> +
> +static ssize_t cpumask_show(struct device *dev,
> +			    struct device_attribute *attr, char *buf)
> +{
> +	return cpumap_print_to_pagebuf(true, buf, &xe_pmu_cpumask);
> +}
> +
> +static DEVICE_ATTR_RO(cpumask);
> +
> +static struct attribute *xe_cpumask_attrs[] = {
> +	&dev_attr_cpumask.attr,
> +	NULL,
> +};
> +
> +static const struct attribute_group xe_pmu_cpumask_attr_group = {
> +	.attrs = xe_cpumask_attrs,
> +};
> +
> +#define __event(__counter, __name, __unit) \
> +{ \
> +	.counter = (__counter), \
> +	.name = (__name), \
> +	.unit = (__unit), \
> +	.global = false, \
> +}
> +
> +#define __global_event(__counter, __name, __unit) \
> +{ \
> +	.counter = (__counter), \
> +	.name = (__name), \
> +	.unit = (__unit), \
> +	.global = true, \
> +}
> +
> +static struct xe_ext_attribute *
> +add_xe_attr(struct xe_ext_attribute *attr, const char *name, u64 config)
> +{
> +	sysfs_attr_init(&attr->attr.attr);
> +	attr->attr.attr.name = name;
> +	attr->attr.attr.mode = 0444;
> +	attr->attr.show = xe_pmu_event_show;
> +	attr->val = config;
> +
> +	return ++attr;
> +}
> +
> +static struct perf_pmu_events_attr *
> +add_pmu_attr(struct perf_pmu_events_attr *attr, const char *name,
> +	     const char *str)
> +{
> +	sysfs_attr_init(&attr->attr.attr);
> +	attr->attr.attr.name = name;
> +	attr->attr.attr.mode = 0444;
> +	attr->attr.show = perf_event_sysfs_show;
> +	attr->event_str = str;
> +
> +	return ++attr;
> +}
> +
> +static struct attribute **
> +create_event_attributes(struct xe_pmu *pmu)
> +{
> +	struct xe_device *xe = container_of(pmu, typeof(*xe), pmu);
> +	static const struct {
> +		unsigned int counter;
> +		const char *name;
> +		const char *unit;
> +		bool global;
> +	} events[] = {
> +		__global_event(0, "interrupts", NULL),
> +		__event(1, "render-group-busy", "ns"),
> +		__event(2, "copy-group-busy", "ns"),
> +		__event(3, "media-group-busy", "ns"),
> +		__event(4, "any-engine-group-busy", "ns"),
> +	};
> +
> +	unsigned int count = 0;
> +	struct perf_pmu_events_attr *pmu_attr = NULL, *pmu_iter;
> +	struct xe_ext_attribute *xe_attr = NULL, *xe_iter;
> +	struct attribute **attr = NULL, **attr_iter;
> +	struct xe_gt *gt;
> +	unsigned int i, j;
> +
> +	/* Count how many counters we will be exposing. */
> +	for_each_gt(gt, xe, j) {
> +		for (i = 0; i < ARRAY_SIZE(events); i++) {
> +			u64 config = ___XE_PMU_OTHER(j, events[i].counter);
> +
> +			if (!config_status(xe, config))
> +				count++;
> +		}
> +	}
> +
> +	/* Allocate attribute objects and table. */
> +	xe_attr = kcalloc(count, sizeof(*xe_attr), GFP_KERNEL);
> +	if (!xe_attr)
> +		goto err_alloc;
> +
> +	pmu_attr = kcalloc(count, sizeof(*pmu_attr), GFP_KERNEL);
> +	if (!pmu_attr)
> +		goto err_alloc;
> +
> +	/* Max one pointer of each attribute type plus a termination entry. */
> +	attr = kcalloc(count * 2 + 1, sizeof(*attr), GFP_KERNEL);
> +	if (!attr)
> +		goto err_alloc;
> +
> +	xe_iter = xe_attr;
> +	pmu_iter = pmu_attr;
> +	attr_iter = attr;
> +
> +	for_each_gt(gt, xe, j) {
> +		for (i = 0; i < ARRAY_SIZE(events); i++) {
> +			u64 config = ___XE_PMU_OTHER(j, events[i].counter);
> +			char *str;
> +
> +			if (config_status(xe, config))
> +				continue;
> +
> +			if (events[i].global)
> +				str = kstrdup(events[i].name, GFP_KERNEL);
> +			else
> +				str = kasprintf(GFP_KERNEL, "%s-gt%u",
> +						events[i].name, j);
> +			if (!str)
> +				goto err;
> +
> +			*attr_iter++ = &xe_iter->attr.attr;
> +			xe_iter = add_xe_attr(xe_iter, str, config);
> +
> +			if (events[i].unit) {
> +				if (events[i].global)
> +					str = kasprintf(GFP_KERNEL, "%s.unit",
> +							events[i].name);
> +				else
> +					str = kasprintf(GFP_KERNEL, "%s-gt%u.unit",
> +							events[i].name, j);
> +				if (!str)
> +					goto err;
> +
> +				*attr_iter++ = &pmu_iter->attr.attr;
> +				pmu_iter = add_pmu_attr(pmu_iter, str,
> +							events[i].unit);
> +			}
> +		}
> +	}
> +
> +	pmu->xe_attr = xe_attr;
> +	pmu->pmu_attr = pmu_attr;
> +
> +	return attr;
> +
> +err:
> +	for (attr_iter = attr; *attr_iter; attr_iter++)
> +		kfree((*attr_iter)->name);
> +
> +err_alloc:
> +	kfree(attr);
> +	kfree(xe_attr);
> +	kfree(pmu_attr);
> +
> +	return NULL;
> +}
> +
> +static void free_event_attributes(struct xe_pmu *pmu)
> +{
> +	struct attribute **attr_iter = pmu->events_attr_group.attrs;
> +
> +	for (; *attr_iter; attr_iter++)
> +		kfree((*attr_iter)->name);
> +
> +	kfree(pmu->events_attr_group.attrs);
> +	kfree(pmu->xe_attr);
> +	kfree(pmu->pmu_attr);
> +
> +	pmu->events_attr_group.attrs = NULL;
> +	pmu->xe_attr = NULL;
> +	pmu->pmu_attr = NULL;
> +}
> +
> +static int xe_pmu_cpu_online(unsigned int cpu, struct hlist_node *node)
> +{
> +	struct xe_pmu *pmu = hlist_entry_safe(node, typeof(*pmu), cpuhp.node);
> +
> +	XE_BUG_ON(!pmu->base.event_init);
> +
> +	/* Select the first online CPU as a designated reader. */
> +	if (cpumask_empty(&xe_pmu_cpumask))
> +		cpumask_set_cpu(cpu, &xe_pmu_cpumask);
> +
> +	return 0;
> +}
> +
> +static int xe_pmu_cpu_offline(unsigned int cpu, struct hlist_node *node)
> +{
> +	struct xe_pmu *pmu = hlist_entry_safe(node, typeof(*pmu), cpuhp.node);
> +	unsigned int target = xe_pmu_target_cpu;
> +
> +	XE_BUG_ON(!pmu->base.event_init);
> +
> +	/*
> +	 * Unregistering an instance generates a CPU offline event which we must
> +	 * ignore to avoid incorrectly modifying the shared xe_pmu_cpumask.
> +	 */
> +	if (pmu->closed)
> +		return 0;
> +
> +	if (cpumask_test_and_clear_cpu(cpu, &xe_pmu_cpumask)) {
> +		target = cpumask_any_but(topology_sibling_cpumask(cpu), cpu);
> +
> +		/* Migrate events if there is a valid target */
> +		if (target < nr_cpu_ids) {
> +			cpumask_set_cpu(target, &xe_pmu_cpumask);
> +			xe_pmu_target_cpu = target;
> +		}
> +	}
> +
> +	if (target < nr_cpu_ids && target != pmu->cpuhp.cpu) {
> +		perf_pmu_migrate_context(&pmu->base, cpu, target);
> +		pmu->cpuhp.cpu = target;
> +	}
> +
> +	return 0;
> +}
> +
> +static enum cpuhp_state cpuhp_slot = CPUHP_INVALID;
> +
> +int xe_pmu_init(void)
> +{
> +	int ret;
> +
> +	ret = cpuhp_setup_state_multi(CPUHP_AP_ONLINE_DYN,
> +				      "perf/x86/intel/xe:online",
> +				      xe_pmu_cpu_online,
> +				      xe_pmu_cpu_offline);
> +	if (ret < 0)
> +		pr_notice("Failed to setup cpuhp state for xe PMU! (%d)\n",
> +			  ret);
> +	else
> +		cpuhp_slot = ret;
> +
> +	return 0;
> +}
> +
> +void xe_pmu_exit(void)
> +{
> +	if (cpuhp_slot != CPUHP_INVALID)
> +		cpuhp_remove_multi_state(cpuhp_slot);
> +}
> +
> +static int xe_pmu_register_cpuhp_state(struct xe_pmu *pmu)
> +{
> +	if (cpuhp_slot == CPUHP_INVALID)
> +		return -EINVAL;
> +
> +	return cpuhp_state_add_instance(cpuhp_slot, &pmu->cpuhp.node);
> +}
> +
> +static void xe_pmu_unregister_cpuhp_state(struct xe_pmu *pmu)
> +{
> +	cpuhp_state_remove_instance(cpuhp_slot, &pmu->cpuhp.node);
> +}
> +
> +static void xe_pmu_unregister(struct drm_device *device, void *arg)
> +{
> +	struct xe_pmu *pmu = arg;
> +
> +	if (!pmu->base.event_init)
> +		return;
> +
> +	/*
> +	 * "Disconnect" the PMU callbacks - since all are atomic
> synchronize_rcu
> +	 * ensures all currently executing ones will have exited before we
> +	 * proceed with unregistration.
> +	 */
> +	pmu->closed = true;
> +	synchronize_rcu();
> +
> +	xe_pmu_unregister_cpuhp_state(pmu);
> +
> +	perf_pmu_unregister(&pmu->base);
> +	pmu->base.event_init = NULL;
> +	kfree(pmu->base.attr_groups);
> +	kfree(pmu->name);
> +	free_event_attributes(pmu);
> +}
> +
> +static void init_samples(struct xe_pmu *pmu)
> +{
> +	struct xe_device *xe = container_of(pmu, typeof(*xe), pmu);
> +	struct xe_gt *gt;
> +	unsigned int i;
> +
> +	for_each_gt(gt, xe, i)
> +		engine_group_busyness_store(gt);
> +}
> +
> +void xe_pmu_register(struct xe_pmu *pmu)
> +{
> +	struct xe_device *xe = container_of(pmu, typeof(*xe), pmu);
> +	const struct attribute_group *attr_groups[] = {
> +		&xe_pmu_format_attr_group,
> +		&pmu->events_attr_group,
> +		&xe_pmu_cpumask_attr_group,
> +		NULL
> +	};
> +

No newline is needed here, I think.

> +	int ret = -ENOMEM;
> +
> +	spin_lock_init(&pmu->lock);
> +	pmu->cpuhp.cpu = -1;
> +	init_samples(pmu);
> +
> +	pmu->name = kasprintf(GFP_KERNEL,
> +			      "xe_%s",
> +			      dev_name(xe->drm.dev));
> +	if (pmu->name)
> +		/* tools/perf reserves colons as special. */
> +		strreplace((char *)pmu->name, ':', '_');
> +

We can skip checking pmu->name a second time here.
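
For instance, the error check could be folded into the allocation path directly (an untested sketch):

	pmu->name = kasprintf(GFP_KERNEL, "xe_%s", dev_name(xe->drm.dev));
	if (!pmu->name)
		goto err;

	/* tools/perf reserves colons as special. */
	strreplace((char *)pmu->name, ':', '_');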

Thanks,
Tejas
> +	if (!pmu->name)
> +		goto err;
> +
> +	pmu->events_attr_group.name = "events";
> +	pmu->events_attr_group.attrs = create_event_attributes(pmu);
> +	if (!pmu->events_attr_group.attrs)
> +		goto err_name;
> +
> +	pmu->base.attr_groups = kmemdup(attr_groups, sizeof(attr_groups),
> +					GFP_KERNEL);
> +	if (!pmu->base.attr_groups)
> +		goto err_attr;
> +
> +	pmu->base.module	= THIS_MODULE;
> +	pmu->base.task_ctx_nr	= perf_invalid_context;
> +	pmu->base.event_init	= xe_pmu_event_init;
> +	pmu->base.add		= xe_pmu_event_add;
> +	pmu->base.del		= xe_pmu_event_del;
> +	pmu->base.start		= xe_pmu_event_start;
> +	pmu->base.stop		= xe_pmu_event_stop;
> +	pmu->base.read		= xe_pmu_event_read;
> +	pmu->base.event_idx	= xe_pmu_event_event_idx;
> +
> +	ret = perf_pmu_register(&pmu->base, pmu->name, -1);
> +	if (ret)
> +		goto err_groups;
> +
> +	ret = xe_pmu_register_cpuhp_state(pmu);
> +	if (ret)
> +		goto err_unreg;
> +
> +	ret = drmm_add_action_or_reset(&xe->drm, xe_pmu_unregister, pmu);
> +	XE_WARN_ON(ret);
> +
> +	return;
> +
> +err_unreg:
> +	perf_pmu_unregister(&pmu->base);
> +err_groups:
> +	kfree(pmu->base.attr_groups);
> +err_attr:
> +	pmu->base.event_init = NULL;
> +	free_event_attributes(pmu);
> +err_name:
> +	kfree(pmu->name);
> +err:
> +	drm_notice(&xe->drm, "Failed to register PMU!\n");
> +}
> diff --git a/drivers/gpu/drm/xe/xe_pmu.h b/drivers/gpu/drm/xe/xe_pmu.h
> new file mode 100644
> index 000000000000..d3f47f4ab343
> --- /dev/null
> +++ b/drivers/gpu/drm/xe/xe_pmu.h
> @@ -0,0 +1,25 @@
> +/* SPDX-License-Identifier: MIT */
> +/*
> + * Copyright © 2023 Intel Corporation
> + */
> +
> +#ifndef _XE_PMU_H_
> +#define _XE_PMU_H_
> +
> +#include "xe_gt_types.h"
> +#include "xe_pmu_types.h"
> +
> +#ifdef CONFIG_PERF_EVENTS
> +int xe_pmu_init(void);
> +void xe_pmu_exit(void);
> +void xe_pmu_register(struct xe_pmu *pmu);
> +void engine_group_busyness_store(struct xe_gt *gt);
> +#else
> +static inline int xe_pmu_init(void) { return 0; }
> +static inline void xe_pmu_exit(void) {}
> +static inline void xe_pmu_register(struct xe_pmu *pmu) {}
> +static inline void engine_group_busyness_store(struct xe_gt *gt) {}
> +#endif
> +
> +#endif
> +
> diff --git a/drivers/gpu/drm/xe/xe_pmu_types.h b/drivers/gpu/drm/xe/xe_pmu_types.h
> new file mode 100644
> index 000000000000..e87edd4d6a87
> --- /dev/null
> +++ b/drivers/gpu/drm/xe/xe_pmu_types.h
> @@ -0,0 +1,80 @@
> +/* SPDX-License-Identifier: MIT */
> +/*
> + * Copyright © 2023 Intel Corporation
> + */
> +
> +#ifndef _XE_PMU_TYPES_H_
> +#define _XE_PMU_TYPES_H_
> +
> +#include <linux/perf_event.h>
> +#include <linux/spinlock_types.h>
> +#include <uapi/drm/xe_drm.h>
> +
> +enum {
> +	__XE_SAMPLE_RENDER_GROUP_BUSY,
> +	__XE_SAMPLE_COPY_GROUP_BUSY,
> +	__XE_SAMPLE_MEDIA_GROUP_BUSY,
> +	__XE_SAMPLE_ANY_ENGINE_GROUP_BUSY,
> +	__XE_NUM_PMU_SAMPLERS
> +};
> +
> +struct xe_pmu_sample {
> +	u64 cur;
> +};
> +
> +#define XE_MAX_GT_PER_TILE 2
> +
> +struct xe_pmu {
> +	/**
> +	 * @cpuhp: Struct used for CPU hotplug handling.
> +	 */
> +	struct {
> +		struct hlist_node node;
> +		unsigned int cpu;
> +	} cpuhp;
> +	/**
> +	 * @base: PMU base.
> +	 */
> +	struct pmu base;
> +	/**
> +	 * @closed: xe is unregistering.
> +	 */
> +	bool closed;
> +	/**
> +	 * @name: Name as registered with perf core.
> +	 */
> +	const char *name;
> +	/**
> +	 * @lock: Lock protecting enable mask and ref count handling.
> +	 */
> +	spinlock_t lock;
> +	/**
> +	 * @sample: Current and previous (raw) counters.
> +	 *
> +	 * These counters are updated when the device is awake.
> +	 *
> +	 */
> +	struct xe_pmu_sample sample[XE_MAX_GT_PER_TILE * __XE_NUM_PMU_SAMPLERS];
> +	/**
> +	 * @irq_count: Number of interrupts
> +	 *
> +	 * Intentionally unsigned long to avoid atomics or heuristics on 32bit.
> +	 * 4e9 interrupts are a lot and postprocessing can really deal with an
> +	 * occasional wraparound easily. It's 32bit after all.
> +	 */
> +	unsigned long irq_count;
> +	/**
> +	 * @events_attr_group: Device events attribute group.
> +	 */
> +	struct attribute_group events_attr_group;
> +	/**
> +	 * @xe_attr: Memory block holding device attributes.
> +	 */
> +	void *xe_attr;
> +	/**
> +	 * @pmu_attr: Memory block holding device attributes.
> +	 */
> +	void *pmu_attr;
> +};
> +
> +#endif
> diff --git a/include/uapi/drm/xe_drm.h b/include/uapi/drm/xe_drm.h
> index 965cd9527ff1..ed097056f944 100644
> --- a/include/uapi/drm/xe_drm.h
> +++ b/include/uapi/drm/xe_drm.h
> @@ -990,6 +990,22 @@ struct drm_xe_vm_madvise {
>  	__u64 reserved[2];
>  };
> 
> +/* PMU event config IDs */
> +
> +/*
> + * Top 4 bits of every counter are GT id.
> + */
> +#define __XE_PMU_GT_SHIFT (60)
> +
> +#define ___XE_PMU_OTHER(gt, x) \
> +	(((__u64)(x)) | ((__u64)(gt) << __XE_PMU_GT_SHIFT))
> +
> +#define XE_PMU_INTERRUPTS(gt)			___XE_PMU_OTHER(gt, 0)
> +#define XE_PMU_RENDER_GROUP_BUSY(gt)		___XE_PMU_OTHER(gt, 1)
> +#define XE_PMU_COPY_GROUP_BUSY(gt)		___XE_PMU_OTHER(gt, 2)
> +#define XE_PMU_MEDIA_GROUP_BUSY(gt)		___XE_PMU_OTHER(gt, 3)
> +#define XE_PMU_ANY_ENGINE_GROUP_BUSY(gt)	___XE_PMU_OTHER(gt, 4)
> +
>  #if defined(__cplusplus)
>  }
>  #endif
> --
> 2.25.1


^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [Intel-xe] [PATCH v2 2/2] drm/xe/pmu: Enable PMU interface
  2023-06-30 13:53   ` Upadhyay, Tejas
@ 2023-07-03  5:11     ` Iddamsetty, Aravind
  0 siblings, 0 replies; 59+ messages in thread
From: Iddamsetty, Aravind @ 2023-07-03  5:11 UTC (permalink / raw)
  To: Upadhyay, Tejas, intel-xe; +Cc: Bommu, Krishnaiah, Ursulin, Tvrtko



On 30-06-2023 19:23, Upadhyay, Tejas wrote:
> Review is in progress; as it is a large patch, it is taking time. Some initial comments follow:
> 
>> [snip]
>>
>> +void engine_group_busyness_store(struct xe_gt *gt)
>> +{
>> +	struct xe_pmu *pmu = &gt->tile->xe->pmu;
>> +	unsigned int gt_id = gt->info.id;
>> +	unsigned long flags;
>> +
>> +	spin_lock_irqsave(&pmu->lock, flags);
>> +
>> +	store_sample(pmu, gt_id, __XE_SAMPLE_RENDER_GROUP_BUSY,
>> +		     __engine_group_busyness_read(gt, XE_PMU_RENDER_GROUP_BUSY(0)));
> 
> Shouldn't you pass gt_id here, i.e. XE_PMU_RENDER_GROUP_BUSY(gt_id)? I know it makes no difference right now, since a GT id is at most 0 or 1, but passing gt_id would keep this compatible in the future.

__engine_group_busyness_read() already takes the gt as an argument, so there is no need to encode the gt id in the config again; the switch there matches on the bare counter id, i.e. XE_PMU_*_GROUP_BUSY(0) with the GT bits cleared.
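
A sketch of the existing flow, for illustration only:

	/* gt selects which hardware to read; config only selects the counter */
	switch (config) {
	case XE_PMU_RENDER_GROUP_BUSY(0):
		val = xe_mmio_read32(gt, XE_OAG_RENDER_BUSY_FREE);
		break;
	/* ... */
	}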

> 
>> [snip]
>>
>> +void xe_pmu_register(struct xe_pmu *pmu)
>> +{
>> +	struct xe_device *xe = container_of(pmu, typeof(*xe), pmu);
>> +	const struct attribute_group *attr_groups[] = {
>> +		&xe_pmu_format_attr_group,
>> +		&pmu->events_attr_group,
>> +		&xe_pmu_cpumask_attr_group,
>> +		NULL
>> +	};
>> +
> 
> No newline is needed here, I think.
> 
>> +	int ret = -ENOMEM;
>> +
>> +	spin_lock_init(&pmu->lock);
>> +	pmu->cpuhp.cpu = -1;
>> +	init_samples(pmu);
>> +
>> +	pmu->name = kasprintf(GFP_KERNEL,
>> +			      "xe_%s",
>> +			      dev_name(xe->drm.dev));
>> +	if (pmu->name)
>> +		/* tools/perf reserves colons as special. */
>> +		strreplace((char *)pmu->name, ':', '_');
>> +
> 
> We can skip checking pmu->name a second time here.
Skipping the check below would mean we do not catch the allocation failure. I can combine them into an if/else.
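
E.g. (an untested sketch of what I have in mind):

	if (!pmu->name)
		goto err;
	else
		/* tools/perf reserves colons as special. */
		strreplace((char *)pmu->name, ':', '_');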

Thanks,
Aravind.
> 
> Thanks,
> Tejas
>> +	if (!pmu->name)
>> +		goto err;
>> [snip]
> 

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [Intel-xe] [PATCH v2 2/2] drm/xe/pmu: Enable PMU interface
  2023-06-27 12:21 ` [Intel-xe] [PATCH v2 2/2] drm/xe/pmu: Enable PMU interface Aravind Iddamsetty
  2023-06-30 13:53   ` Upadhyay, Tejas
@ 2023-07-04  3:34   ` Ghimiray, Himal Prasad
  2023-07-05  4:52     ` Iddamsetty, Aravind
  2023-07-04  9:10   ` Upadhyay, Tejas
                     ` (4 subsequent siblings)
  6 siblings, 1 reply; 59+ messages in thread
From: Ghimiray, Himal Prasad @ 2023-07-04  3:34 UTC (permalink / raw)
  To: Iddamsetty, Aravind, intel-xe; +Cc: Bommu, Krishnaiah, Ursulin, Tvrtko

Hi Aravind,

> -----Original Message-----
> From: Intel-xe <intel-xe-bounces@lists.freedesktop.org> On Behalf Of
> Aravind Iddamsetty
> Sent: 27 June 2023 17:51
> To: intel-xe@lists.freedesktop.org
> Cc: Bommu, Krishnaiah <krishnaiah.bommu@intel.com>; Ursulin, Tvrtko
> <tvrtko.ursulin@intel.com>
> Subject: [Intel-xe] [PATCH v2 2/2] drm/xe/pmu: Enable PMU interface
> 
> There are a set of engine group busyness counters provided by HW which
> are perfect fit to be exposed via PMU perf events.
> 
> BSPEC: 46559, 46560, 46722, 46729
> 
> events can be listed using:
> perf list
>   xe_0000_03_00.0/any-engine-group-busy-gt0/         [Kernel PMU event]
>   xe_0000_03_00.0/copy-group-busy-gt0/               [Kernel PMU event]
>   xe_0000_03_00.0/interrupts/                        [Kernel PMU event]
>   xe_0000_03_00.0/media-group-busy-gt0/              [Kernel PMU event]
>   xe_0000_03_00.0/render-group-busy-gt0/             [Kernel PMU event]
> 
> and can be read using:
> 
> perf stat -e "xe_0000_8c_00.0/render-group-busy-gt0/" -I 1000
>            time             counts unit events
>      1.001139062                  0 ns  xe_0000_8c_00.0/render-group-busy-gt0/
>      2.003294678                  0 ns  xe_0000_8c_00.0/render-group-busy-gt0/
>      3.005199582                  0 ns  xe_0000_8c_00.0/render-group-busy-gt0/
>      4.007076497                  0 ns  xe_0000_8c_00.0/render-group-busy-gt0/
>      5.008553068                  0 ns  xe_0000_8c_00.0/render-group-busy-gt0/
>      6.010531563              43520 ns  xe_0000_8c_00.0/render-group-busy-gt0/
>      7.012468029              44800 ns  xe_0000_8c_00.0/render-group-busy-gt0/
>      8.013463515                  0 ns  xe_0000_8c_00.0/render-group-busy-gt0/
>      9.015300183                  0 ns  xe_0000_8c_00.0/render-group-busy-gt0/
>     10.017233010                  0 ns  xe_0000_8c_00.0/render-group-busy-gt0/
>     10.971934120                  0 ns  xe_0000_8c_00.0/render-group-busy-gt0/
> 
> The pmu base implementation is taken from i915.
> 
> v2:
> Store last known value when device is awake return that while the GT is
> suspended and then update the driver copy when read during awake.
> 
> Co-developed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> Co-developed-by: Bommu Krishnaiah <krishnaiah.bommu@intel.com>
> Signed-off-by: Aravind Iddamsetty <aravind.iddamsetty@intel.com>
> ---
>  drivers/gpu/drm/xe/Makefile          |   2 +
>  drivers/gpu/drm/xe/regs/xe_gt_regs.h |   5 +
>  drivers/gpu/drm/xe/xe_device.c       |   2 +
>  drivers/gpu/drm/xe/xe_device_types.h |   4 +
>  drivers/gpu/drm/xe/xe_gt.c           |   2 +
>  drivers/gpu/drm/xe/xe_irq.c          |  22 +
>  drivers/gpu/drm/xe/xe_module.c       |   5 +
>  drivers/gpu/drm/xe/xe_pmu.c          | 739
> +++++++++++++++++++++++++++
>  drivers/gpu/drm/xe/xe_pmu.h          |  25 +
>  drivers/gpu/drm/xe/xe_pmu_types.h    |  80 +++
>  include/uapi/drm/xe_drm.h            |  16 +
>  11 files changed, 902 insertions(+)
>  create mode 100644 drivers/gpu/drm/xe/xe_pmu.c  create mode 100644
> drivers/gpu/drm/xe/xe_pmu.h  create mode 100644
> drivers/gpu/drm/xe/xe_pmu_types.h
> 
> diff --git a/drivers/gpu/drm/xe/Makefile b/drivers/gpu/drm/xe/Makefile
> index 081c57fd8632..e52ab795c566 100644
> --- a/drivers/gpu/drm/xe/Makefile
> +++ b/drivers/gpu/drm/xe/Makefile
> @@ -217,6 +217,8 @@ xe-$(CONFIG_DRM_XE_DISPLAY) += \
>  	i915-display/skl_universal_plane.o \
>  	i915-display/skl_watermark.o
> 
> +xe-$(CONFIG_PERF_EVENTS) += xe_pmu.o
> +
>  ifeq ($(CONFIG_ACPI),y)
>  	xe-$(CONFIG_DRM_XE_DISPLAY) += \
>  		i915-display/intel_acpi.o \
> diff --git a/drivers/gpu/drm/xe/regs/xe_gt_regs.h
> b/drivers/gpu/drm/xe/regs/xe_gt_regs.h
> index 3f664011eaea..c7d9e4634745 100644
> --- a/drivers/gpu/drm/xe/regs/xe_gt_regs.h
> +++ b/drivers/gpu/drm/xe/regs/xe_gt_regs.h
> @@ -285,6 +285,11 @@
>  #define   INVALIDATION_BROADCAST_MODE_DIS	REG_BIT(12)
>  #define   GLOBAL_INVALIDATION_MODE		REG_BIT(2)
> 
> +#define XE_OAG_RC0_ANY_ENGINE_BUSY_FREE		XE_REG(0xdb80)
> +#define XE_OAG_ANY_MEDIA_FF_BUSY_FREE		XE_REG(0xdba0)
> +#define XE_OAG_BLT_BUSY_FREE			XE_REG(0xdbbc)
> +#define XE_OAG_RENDER_BUSY_FREE			XE_REG(0xdbdc)
> +
>  #define SAMPLER_MODE				XE_REG_MCR(0xe18c, XE_REG_OPTION_MASKED)
>  #define   ENABLE_SMALLPL			REG_BIT(15)
>  #define   SC_DISABLE_POWER_OPTIMIZATION_EBB	REG_BIT(9)
> diff --git a/drivers/gpu/drm/xe/xe_device.c
> b/drivers/gpu/drm/xe/xe_device.c index c7985af85a53..b2c7bd4a97d9
> 100644
> --- a/drivers/gpu/drm/xe/xe_device.c
> +++ b/drivers/gpu/drm/xe/xe_device.c
> @@ -328,6 +328,8 @@ int xe_device_probe(struct xe_device *xe)
> 
>  	xe_debugfs_register(xe);
> 
> +	xe_pmu_register(&xe->pmu);
> +
>  	err = drmm_add_action_or_reset(&xe->drm, xe_device_sanitize,
> xe);
>  	if (err)
>  		return err;
> diff --git a/drivers/gpu/drm/xe/xe_device_types.h
> b/drivers/gpu/drm/xe/xe_device_types.h
> index 0226d44a6af2..3ba99aae92b9 100644
> --- a/drivers/gpu/drm/xe/xe_device_types.h
> +++ b/drivers/gpu/drm/xe/xe_device_types.h
> @@ -15,6 +15,7 @@
>  #include "xe_devcoredump_types.h"
>  #include "xe_gt_types.h"
>  #include "xe_platform_types.h"
> +#include "xe_pmu.h"
>  #include "xe_step_types.h"
> 
>  #if IS_ENABLED(CONFIG_DRM_XE_DISPLAY)
> @@ -332,6 +333,9 @@ struct xe_device {
>  	/** @d3cold_allowed: Indicates if d3cold is a valid device state */
>  	bool d3cold_allowed;
> 
> +	/* @pmu: performance monitoring unit */
> +	struct xe_pmu pmu;
> +
>  	/* private: */
> 
>  #if IS_ENABLED(CONFIG_DRM_XE_DISPLAY)
> diff --git a/drivers/gpu/drm/xe/xe_gt.c b/drivers/gpu/drm/xe/xe_gt.c index
> 2458397ce8af..96e3720923d4 100644
> --- a/drivers/gpu/drm/xe/xe_gt.c
> +++ b/drivers/gpu/drm/xe/xe_gt.c
> @@ -593,6 +593,8 @@ int xe_gt_suspend(struct xe_gt *gt)
>  	if (err)
>  		goto err_msg;
> 
> +	engine_group_busyness_store(gt);
> +
>  	err = xe_uc_suspend(&gt->uc);
>  	if (err)
>  		goto err_force_wake;
> diff --git a/drivers/gpu/drm/xe/xe_irq.c b/drivers/gpu/drm/xe/xe_irq.c
> index b4ed1e4a3388..cb943fb94ec7 100644
> --- a/drivers/gpu/drm/xe/xe_irq.c
> +++ b/drivers/gpu/drm/xe/xe_irq.c
> @@ -27,6 +27,24 @@
>  #define IIR(offset)				XE_REG(offset + 0x8)
>  #define IER(offset)				XE_REG(offset + 0xc)
> 
> +/*
> + * Interrupt statistic for PMU. Increments the counter only if the
> + * interrupt originated from the GPU so interrupts from a device which
> + * shares the interrupt line are not accounted.
> + */
> +static inline void xe_pmu_irq_stats(struct xe_device *xe,
> +				    irqreturn_t res)
The res parameter seems redundant: the caller should invoke xe_pmu_irq_stats
only in the IRQ_HANDLED case anyway. Do we really need to pass it in as an
argument and re-check it inside this function?
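
For example, something like this (untested sketch) would drop the redundant
check:

static inline void xe_pmu_irq_stats(struct xe_device *xe)
{
	/*
	 * Callers only invoke this once they have decided to return
	 * IRQ_HANDLED, so just count the interrupt; WRITE_ONCE() still
	 * prevents store tearing on the counter.
	 */
	WRITE_ONCE(xe->pmu.irq_count, xe->pmu.irq_count + 1);
}

with the handlers calling xe_pmu_irq_stats(xe) just before returning
IRQ_HANDLED.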

BR
Himal 
> +{
> +	if (unlikely(res != IRQ_HANDLED))
> +		return;
> +
> +	/*
> +	 * A clever compiler translates that into INC. A not so clever one
> +	 * should at least prevent store tearing.
> +	 */
> +	WRITE_ONCE(xe->pmu.irq_count, xe->pmu.irq_count + 1);
> +}
> +
>  static void assert_iir_is_zero(struct xe_gt *mmio, struct xe_reg reg)  {
>  	u32 val = xe_mmio_read32(mmio, reg);
> @@ -325,6 +343,8 @@ static irqreturn_t xelp_irq_handler(int irq, void *arg)
> 
>  	xe_display_irq_enable(xe, gu_misc_iir);
> 
> +	xe_pmu_irq_stats(xe, IRQ_HANDLED);
> +
>  	return IRQ_HANDLED;
>  }
> 
> @@ -414,6 +434,8 @@ static irqreturn_t dg1_irq_handler(int irq, void *arg)
>  	dg1_intr_enable(xe, false);
>  	xe_display_irq_enable(xe, gu_misc_iir);
> 
> +	xe_pmu_irq_stats(xe, IRQ_HANDLED);
> +
>  	return IRQ_HANDLED;
>  }
> 
> diff --git a/drivers/gpu/drm/xe/xe_module.c
> b/drivers/gpu/drm/xe/xe_module.c index 75e5be939f53..f6fe89748525
> 100644
> --- a/drivers/gpu/drm/xe/xe_module.c
> +++ b/drivers/gpu/drm/xe/xe_module.c
> @@ -12,6 +12,7 @@
>  #include "xe_hw_fence.h"
>  #include "xe_module.h"
>  #include "xe_pci.h"
> +#include "xe_pmu.h"
>  #include "xe_sched_job.h"
> 
>  bool enable_guc = true;
> @@ -49,6 +50,10 @@ static const struct init_funcs init_funcs[] = {
>  		.init = xe_sched_job_module_init,
>  		.exit = xe_sched_job_module_exit,
>  	},
> +	{
> +		.init = xe_pmu_init,
> +		.exit = xe_pmu_exit,
> +	},
>  	{
>  		.init = xe_register_pci_driver,
>  		.exit = xe_unregister_pci_driver,
> diff --git a/drivers/gpu/drm/xe/xe_pmu.c b/drivers/gpu/drm/xe/xe_pmu.c
> new file mode 100644 index 000000000000..bef1895be9f7
> --- /dev/null
> +++ b/drivers/gpu/drm/xe/xe_pmu.c
> @@ -0,0 +1,739 @@
> +/*
> + * SPDX-License-Identifier: MIT
> + *
> + * Copyright © 2023 Intel Corporation
> + */
> +
> +#include <drm/drm_drv.h>
> +#include <drm/drm_managed.h>
> +#include <drm/xe_drm.h>
> +
> +#include "regs/xe_gt_regs.h"
> +#include "xe_device.h"
> +#include "xe_gt_clock.h"
> +#include "xe_mmio.h"
> +
> +static cpumask_t xe_pmu_cpumask;
> +static unsigned int xe_pmu_target_cpu = -1;
> +
> +static unsigned int config_gt_id(const u64 config) {
> +	return config >> __XE_PMU_GT_SHIFT;
> +}
> +
> +static u64 config_counter(const u64 config) {
> +	return config & ~(~0ULL << __XE_PMU_GT_SHIFT); }
> +
> +static unsigned int
> +__sample_idx(struct xe_pmu *pmu, unsigned int gt_id, int sample) {
> +	unsigned int idx = gt_id * __XE_NUM_PMU_SAMPLERS + sample;
> +
> +	XE_BUG_ON(idx >= ARRAY_SIZE(pmu->sample));
> +
> +	return idx;
> +}
> +
> +static u64 read_sample(struct xe_pmu *pmu, unsigned int gt_id, int sample)
> +{
> +	return pmu->sample[__sample_idx(pmu, gt_id, sample)].cur;
> +}
> +
> +static void
> +store_sample(struct xe_pmu *pmu, unsigned int gt_id, int sample, u64 val)
> +{
> +	pmu->sample[__sample_idx(pmu, gt_id, sample)].cur = val;
> +}
> +
> +static int engine_busyness_sample_type(u64 config) {
> +	int type = 0;
> +
> +	switch (config) {
> +	case XE_PMU_RENDER_GROUP_BUSY(0):
> +		type =  __XE_SAMPLE_RENDER_GROUP_BUSY;
> +		break;
> +	case XE_PMU_COPY_GROUP_BUSY(0):
> +		type = __XE_SAMPLE_COPY_GROUP_BUSY;
> +		break;
> +	case XE_PMU_MEDIA_GROUP_BUSY(0):
> +		type = __XE_SAMPLE_MEDIA_GROUP_BUSY;
> +		break;
> +	case XE_PMU_ANY_ENGINE_GROUP_BUSY(0):
> +		type = __XE_SAMPLE_ANY_ENGINE_GROUP_BUSY;
> +		break;
> +	}
> +
> +	return type;
> +}
> +
> +static void xe_pmu_event_destroy(struct perf_event *event) {
> +	struct xe_device *xe =
> +		container_of(event->pmu, typeof(*xe), pmu.base);
> +
> +	drm_WARN_ON(&xe->drm, event->parent);
> +
> +	drm_dev_put(&xe->drm);
> +}
> +
> +static u64 __engine_group_busyness_read(struct xe_gt *gt, u64 config) {
> +	u64 val = 0;
> +
> +	switch (config) {
> +	case XE_PMU_RENDER_GROUP_BUSY(0):
> +		val = xe_mmio_read32(gt, XE_OAG_RENDER_BUSY_FREE);
> +		break;
> +	case XE_PMU_COPY_GROUP_BUSY(0):
> +		val = xe_mmio_read32(gt, XE_OAG_BLT_BUSY_FREE);
> +		break;
> +	case XE_PMU_MEDIA_GROUP_BUSY(0):
> +		val = xe_mmio_read32(gt, XE_OAG_ANY_MEDIA_FF_BUSY_FREE);
> +		break;
> +	case XE_PMU_ANY_ENGINE_GROUP_BUSY(0):
> +		val = xe_mmio_read32(gt, XE_OAG_RC0_ANY_ENGINE_BUSY_FREE);
> +		break;
> +	default:
> +		drm_warn(&gt->tile->xe->drm, "unknown pmu event\n");
> +	}
> +
> +	return xe_gt_clock_interval_to_ns(gt, val * 16);
> +}
> +
> +static u64 engine_group_busyness_read(struct xe_gt *gt, u64 config) {
> +	int sample_type = engine_busyness_sample_type(config);
> +	struct xe_device *xe = gt->tile->xe;
> +	const unsigned int gt_id = gt->info.id;
> +	struct xe_pmu *pmu = &xe->pmu;
> +	bool device_awake;
> +	unsigned long flags;
> +	u64 val;
> +
> +	/*
> +	 * found no better way to check if device is awake or not. Before
> +	 * we suspend we set the submission_state.enabled to false.
> +	 */
> +	device_awake = gt->uc.guc.submission_state.enabled ? true : false;
> +	if (device_awake)
> +		val = __engine_group_busyness_read(gt, config);
> +
> +	spin_lock_irqsave(&pmu->lock, flags);
> +
> +	if (device_awake)
> +		store_sample(pmu, gt_id, sample_type, val);
> +	else
> +		val = read_sample(pmu, gt_id, sample_type);
> +
> +	spin_unlock_irqrestore(&pmu->lock, flags);
> +
> +	return val;
> +}
> +
> +void engine_group_busyness_store(struct xe_gt *gt) {
> +	struct xe_pmu *pmu = &gt->tile->xe->pmu;
> +	unsigned int gt_id = gt->info.id;
> +	unsigned long flags;
> +
> +	spin_lock_irqsave(&pmu->lock, flags);
> +
> +	store_sample(pmu, gt_id, __XE_SAMPLE_RENDER_GROUP_BUSY,
> +		     __engine_group_busyness_read(gt, XE_PMU_RENDER_GROUP_BUSY(0)));
> +	store_sample(pmu, gt_id, __XE_SAMPLE_COPY_GROUP_BUSY,
> +		     __engine_group_busyness_read(gt, XE_PMU_COPY_GROUP_BUSY(0)));
> +	store_sample(pmu, gt_id, __XE_SAMPLE_MEDIA_GROUP_BUSY,
> +		     __engine_group_busyness_read(gt, XE_PMU_MEDIA_GROUP_BUSY(0)));
> +	store_sample(pmu, gt_id, __XE_SAMPLE_ANY_ENGINE_GROUP_BUSY,
> +		     __engine_group_busyness_read(gt, XE_PMU_ANY_ENGINE_GROUP_BUSY(0)));
> +
> +	spin_unlock_irqrestore(&pmu->lock, flags);
> +}
> +
> +static int
> +config_status(struct xe_device *xe, u64 config) {
> +	unsigned int max_gt_id = xe->info.gt_count > 1 ? 1 : 0;
> +	unsigned int gt_id = config_gt_id(config);
> +	struct xe_gt *gt = xe_device_get_gt(xe, gt_id);
> +
> +	if (gt_id > max_gt_id)
> +		return -ENOENT;
> +
> +	switch (config_counter(config)) {
> +	case XE_PMU_INTERRUPTS(0):
> +		if (gt_id)
> +			return -ENOENT;
> +		break;
> +	case XE_PMU_RENDER_GROUP_BUSY(0):
> +	case XE_PMU_COPY_GROUP_BUSY(0):
> +	case XE_PMU_ANY_ENGINE_GROUP_BUSY(0):
> +		if (GRAPHICS_VER(xe) < 12)
> +			return -ENOENT;
> +		break;
> +	case XE_PMU_MEDIA_GROUP_BUSY(0):
> +		if (MEDIA_VER(xe) >= 13 && gt->info.type !=
> XE_GT_TYPE_MEDIA)
> +			return -ENOENT;
> +		break;
> +	default:
> +		return -ENOENT;
> +	}
> +
> +	return 0;
> +}
> +
> +static int xe_pmu_event_init(struct perf_event *event) {
> +	struct xe_device *xe =
> +		container_of(event->pmu, typeof(*xe), pmu.base);
> +	struct xe_pmu *pmu = &xe->pmu;
> +	int ret;
> +
> +	if (pmu->closed)
> +		return -ENODEV;
> +
> +	if (event->attr.type != event->pmu->type)
> +		return -ENOENT;
> +
> +	/* unsupported modes and filters */
> +	if (event->attr.sample_period) /* no sampling */
> +		return -EINVAL;
> +
> +	if (has_branch_stack(event))
> +		return -EOPNOTSUPP;
> +
> +	if (event->cpu < 0)
> +		return -EINVAL;
> +
> +	/* only allow running on one cpu at a time */
> +	if (!cpumask_test_cpu(event->cpu, &xe_pmu_cpumask))
> +		return -EINVAL;
> +
> +	ret = config_status(xe, event->attr.config);
> +	if (ret)
> +		return ret;
> +
> +	if (!event->parent) {
> +		drm_dev_get(&xe->drm);
> +		event->destroy = xe_pmu_event_destroy;
> +	}
> +
> +	return 0;
> +}
> +
> +static u64 __xe_pmu_event_read(struct perf_event *event) {
> +	struct xe_device *xe =
> +		container_of(event->pmu, typeof(*xe), pmu.base);
> +	const unsigned int gt_id = config_gt_id(event->attr.config);
> +	const u64 config = config_counter(event->attr.config);
> +	struct xe_gt *gt = xe_device_get_gt(xe, gt_id);
> +	struct xe_pmu *pmu = &xe->pmu;
> +	u64 val = 0;
> +
> +	switch (config) {
> +	case XE_PMU_INTERRUPTS(0):
> +		val = READ_ONCE(pmu->irq_count);
> +		break;
> +	case XE_PMU_RENDER_GROUP_BUSY(0):
> +	case XE_PMU_COPY_GROUP_BUSY(0):
> +	case XE_PMU_ANY_ENGINE_GROUP_BUSY(0):
> +	case XE_PMU_MEDIA_GROUP_BUSY(0):
> +		val = engine_group_busyness_read(gt, config);
> +	}
> +
> +	return val;
> +}
> +
> +static void xe_pmu_event_read(struct perf_event *event) {
> +	struct xe_device *xe =
> +		container_of(event->pmu, typeof(*xe), pmu.base);
> +	struct hw_perf_event *hwc = &event->hw;
> +	struct xe_pmu *pmu = &xe->pmu;
> +	u64 prev, new;
> +
> +	if (pmu->closed) {
> +		event->hw.state = PERF_HES_STOPPED;
> +		return;
> +	}
> +again:
> +	prev = local64_read(&hwc->prev_count);
> +	new = __xe_pmu_event_read(event);
> +
> +	if (local64_cmpxchg(&hwc->prev_count, prev, new) != prev)
> +		goto again;
> +
> +	local64_add(new - prev, &event->count); }
> +
> +static void xe_pmu_enable(struct perf_event *event) {
> +	/*
> +	 * Store the current counter value so we can report the correct delta
> +	 * for all listeners. Even when the event was already enabled and has
> +	 * an existing non-zero value.
> +	 */
> +	local64_set(&event->hw.prev_count, __xe_pmu_event_read(event));
> +}
> +
> +static void xe_pmu_event_start(struct perf_event *event, int flags) {
> +	struct xe_device *xe =
> +		container_of(event->pmu, typeof(*xe), pmu.base);
> +	struct xe_pmu *pmu = &xe->pmu;
> +
> +	if (pmu->closed)
> +		return;
> +
> +	xe_pmu_enable(event);
> +	event->hw.state = 0;
> +}
> +
> +static void xe_pmu_event_stop(struct perf_event *event, int flags) {
> +	if (flags & PERF_EF_UPDATE)
> +		xe_pmu_event_read(event);
> +
> +	event->hw.state = PERF_HES_STOPPED;
> +}
> +
> +static int xe_pmu_event_add(struct perf_event *event, int flags) {
> +	struct xe_device *xe =
> +		container_of(event->pmu, typeof(*xe), pmu.base);
> +	struct xe_pmu *pmu = &xe->pmu;
> +
> +	if (pmu->closed)
> +		return -ENODEV;
> +
> +	if (flags & PERF_EF_START)
> +		xe_pmu_event_start(event, flags);
> +
> +	return 0;
> +}
> +
> +static void xe_pmu_event_del(struct perf_event *event, int flags) {
> +	xe_pmu_event_stop(event, PERF_EF_UPDATE); }
> +
> +static int xe_pmu_event_event_idx(struct perf_event *event) {
> +	return 0;
> +}
> +
> +struct xe_str_attribute {
> +	struct device_attribute attr;
> +	const char *str;
> +};
> +
> +static ssize_t xe_pmu_format_show(struct device *dev,
> +				  struct device_attribute *attr, char *buf) {
> +	struct xe_str_attribute *eattr;
> +
> +	eattr = container_of(attr, struct xe_str_attribute, attr);
> +	return sprintf(buf, "%s\n", eattr->str); }
> +
> +#define XE_PMU_FORMAT_ATTR(_name, _config) \
> +	(&((struct xe_str_attribute[]) { \
> +		{ .attr = __ATTR(_name, 0444, xe_pmu_format_show,
> NULL), \
> +		  .str = _config, } \
> +	})[0].attr.attr)
> +
> +static struct attribute *xe_pmu_format_attrs[] = {
> +	XE_PMU_FORMAT_ATTR(xe_eventid, "config:0-20"),
> +	NULL,
> +};
> +
> +static const struct attribute_group xe_pmu_format_attr_group = {
> +	.name = "format",
> +	.attrs = xe_pmu_format_attrs,
> +};
> +
> +struct xe_ext_attribute {
> +	struct device_attribute attr;
> +	unsigned long val;
> +};
> +
> +static ssize_t xe_pmu_event_show(struct device *dev,
> +				 struct device_attribute *attr, char *buf) {
> +	struct xe_ext_attribute *eattr;
> +
> +	eattr = container_of(attr, struct xe_ext_attribute, attr);
> +	return sprintf(buf, "config=0x%lx\n", eattr->val); }
> +
> +static ssize_t cpumask_show(struct device *dev,
> +			    struct device_attribute *attr, char *buf) {
> +	return cpumap_print_to_pagebuf(true, buf, &xe_pmu_cpumask); }
> +
> +static DEVICE_ATTR_RO(cpumask);
> +
> +static struct attribute *xe_cpumask_attrs[] = {
> +	&dev_attr_cpumask.attr,
> +	NULL,
> +};
> +
> +static const struct attribute_group xe_pmu_cpumask_attr_group = {
> +	.attrs = xe_cpumask_attrs,
> +};
> +
> +#define __event(__counter, __name, __unit) \ { \
> +	.counter = (__counter), \
> +	.name = (__name), \
> +	.unit = (__unit), \
> +	.global = false, \
> +}
> +
> +#define __global_event(__counter, __name, __unit) \ { \
> +	.counter = (__counter), \
> +	.name = (__name), \
> +	.unit = (__unit), \
> +	.global = true, \
> +}
> +
> +static struct xe_ext_attribute *
> +add_xe_attr(struct xe_ext_attribute *attr, const char *name, u64
> +config) {
> +	sysfs_attr_init(&attr->attr.attr);
> +	attr->attr.attr.name = name;
> +	attr->attr.attr.mode = 0444;
> +	attr->attr.show = xe_pmu_event_show;
> +	attr->val = config;
> +
> +	return ++attr;
> +}
> +
> +static struct perf_pmu_events_attr *
> +add_pmu_attr(struct perf_pmu_events_attr *attr, const char *name,
> +	     const char *str)
> +{
> +	sysfs_attr_init(&attr->attr.attr);
> +	attr->attr.attr.name = name;
> +	attr->attr.attr.mode = 0444;
> +	attr->attr.show = perf_event_sysfs_show;
> +	attr->event_str = str;
> +
> +	return ++attr;
> +}
> +
> +static struct attribute **
> +create_event_attributes(struct xe_pmu *pmu) {
> +	struct xe_device *xe = container_of(pmu, typeof(*xe), pmu);
> +	static const struct {
> +		unsigned int counter;
> +		const char *name;
> +		const char *unit;
> +		bool global;
> +	} events[] = {
> +		__global_event(0, "interrupts", NULL),
> +		__event(1, "render-group-busy", "ns"),
> +		__event(2, "copy-group-busy", "ns"),
> +		__event(3, "media-group-busy", "ns"),
> +		__event(4, "any-engine-group-busy", "ns"),
> +	};
> +
> +	unsigned int count = 0;
> +	struct perf_pmu_events_attr *pmu_attr = NULL, *pmu_iter;
> +	struct xe_ext_attribute *xe_attr = NULL, *xe_iter;
> +	struct attribute **attr = NULL, **attr_iter;
> +	struct xe_gt *gt;
> +	unsigned int i, j;
> +
> +	/* Count how many counters we will be exposing. */
> +	for_each_gt(gt, xe, j) {
> +		for (i = 0; i < ARRAY_SIZE(events); i++) {
> +			u64 config = ___XE_PMU_OTHER(j,
> events[i].counter);
> +
> +			if (!config_status(xe, config))
> +				count++;
> +		}
> +	}
> +
> +	/* Allocate attribute objects and table. */
> +	xe_attr = kcalloc(count, sizeof(*xe_attr), GFP_KERNEL);
> +	if (!xe_attr)
> +		goto err_alloc;
> +
> +	pmu_attr = kcalloc(count, sizeof(*pmu_attr), GFP_KERNEL);
> +	if (!pmu_attr)
> +		goto err_alloc;
> +
> +	/* Max one pointer of each attribute type plus a termination entry.
> */
> +	attr = kcalloc(count * 2 + 1, sizeof(*attr), GFP_KERNEL);
> +	if (!attr)
> +		goto err_alloc;
> +
> +	xe_iter = xe_attr;
> +	pmu_iter = pmu_attr;
> +	attr_iter = attr;
> +
> +	for_each_gt(gt, xe, j) {
> +		for (i = 0; i < ARRAY_SIZE(events); i++) {
> +			u64 config = ___XE_PMU_OTHER(j,
> events[i].counter);
> +			char *str;
> +
> +			if (config_status(xe, config))
> +				continue;
> +
> +			if (events[i].global)
> +				str = kstrdup(events[i].name, GFP_KERNEL);
> +			else
> +				str = kasprintf(GFP_KERNEL, "%s-gt%u",
> +						events[i].name, j);
> +			if (!str)
> +				goto err;
> +
> +			*attr_iter++ = &xe_iter->attr.attr;
> +			xe_iter = add_xe_attr(xe_iter, str, config);
> +
> +			if (events[i].unit) {
> +				if (events[i].global)
> +					str = kasprintf(GFP_KERNEL, "%s.unit",
> +							events[i].name);
> +				else
> +					str = kasprintf(GFP_KERNEL, "%s-gt%u.unit",
> +							events[i].name, j);
> +				if (!str)
> +					goto err;
> +
> +				*attr_iter++ = &pmu_iter->attr.attr;
> +				pmu_iter = add_pmu_attr(pmu_iter, str,
> +							events[i].unit);
> +			}
> +		}
> +	}
> +
> +	pmu->xe_attr = xe_attr;
> +	pmu->pmu_attr = pmu_attr;
> +
> +	return attr;
> +
> +err:
> +	for (attr_iter = attr; *attr_iter; attr_iter++)
> +		kfree((*attr_iter)->name);
> +
> +err_alloc:
> +	kfree(attr);
> +	kfree(xe_attr);
> +	kfree(pmu_attr);
> +
> +	return NULL;
> +}
> +
> +static void free_event_attributes(struct xe_pmu *pmu) {
> +	struct attribute **attr_iter = pmu->events_attr_group.attrs;
> +
> +	for (; *attr_iter; attr_iter++)
> +		kfree((*attr_iter)->name);
> +
> +	kfree(pmu->events_attr_group.attrs);
> +	kfree(pmu->xe_attr);
> +	kfree(pmu->pmu_attr);
> +
> +	pmu->events_attr_group.attrs = NULL;
> +	pmu->xe_attr = NULL;
> +	pmu->pmu_attr = NULL;
> +}
> +
> +static int xe_pmu_cpu_online(unsigned int cpu, struct hlist_node *node)
> +{
> +	struct xe_pmu *pmu = hlist_entry_safe(node, typeof(*pmu),
> cpuhp.node);
> +
> +	XE_BUG_ON(!pmu->base.event_init);
> +
> +	/* Select the first online CPU as a designated reader. */
> +	if (cpumask_empty(&xe_pmu_cpumask))
> +		cpumask_set_cpu(cpu, &xe_pmu_cpumask);
> +
> +	return 0;
> +}
> +
> +static int xe_pmu_cpu_offline(unsigned int cpu, struct hlist_node
> +*node) {
> +	struct xe_pmu *pmu = hlist_entry_safe(node, typeof(*pmu),
> cpuhp.node);
> +	unsigned int target = xe_pmu_target_cpu;
> +
> +	XE_BUG_ON(!pmu->base.event_init);
> +
> +	/*
> +	 * Unregistering an instance generates a CPU offline event which we
> must
> +	 * ignore to avoid incorrectly modifying the shared
> xe_pmu_cpumask.
> +	 */
> +	if (pmu->closed)
> +		return 0;
> +
> +	if (cpumask_test_and_clear_cpu(cpu, &xe_pmu_cpumask)) {
> +		target = cpumask_any_but(topology_sibling_cpumask(cpu),
> cpu);
> +
> +		/* Migrate events if there is a valid target */
> +		if (target < nr_cpu_ids) {
> +			cpumask_set_cpu(target, &xe_pmu_cpumask);
> +			xe_pmu_target_cpu = target;
> +		}
> +	}
> +
> +	if (target < nr_cpu_ids && target != pmu->cpuhp.cpu) {
> +		perf_pmu_migrate_context(&pmu->base, cpu, target);
> +		pmu->cpuhp.cpu = target;
> +	}
> +
> +	return 0;
> +}
> +
> +static enum cpuhp_state cpuhp_slot = CPUHP_INVALID;
> +
> +int xe_pmu_init(void)
> +{
> +	int ret;
> +
> +	ret = cpuhp_setup_state_multi(CPUHP_AP_ONLINE_DYN,
> +				      "perf/x86/intel/xe:online",
> +				      xe_pmu_cpu_online,
> +				      xe_pmu_cpu_offline);
> +	if (ret < 0)
> +		pr_notice("Failed to setup cpuhp state for xe PMU! (%d)\n",
> +			  ret);
> +	else
> +		cpuhp_slot = ret;
> +
> +	return 0;
> +}
> +
> +void xe_pmu_exit(void)
> +{
> +	if (cpuhp_slot != CPUHP_INVALID)
> +		cpuhp_remove_multi_state(cpuhp_slot);
> +}
> +
> +static int xe_pmu_register_cpuhp_state(struct xe_pmu *pmu) {
> +	if (cpuhp_slot == CPUHP_INVALID)
> +		return -EINVAL;
> +
> +	return cpuhp_state_add_instance(cpuhp_slot, &pmu->cpuhp.node);
> +}
> +
> +static void xe_pmu_unregister_cpuhp_state(struct xe_pmu *pmu) {
> +	cpuhp_state_remove_instance(cpuhp_slot, &pmu->cpuhp.node); }
> +
> +static void xe_pmu_unregister(struct drm_device *device, void *arg) {
> +	struct xe_pmu *pmu = arg;
> +
> +	if (!pmu->base.event_init)
> +		return;
> +
> +	/*
> +	 * "Disconnect" the PMU callbacks - since all are atomic
> synchronize_rcu
> +	 * ensures all currently executing ones will have exited before we
> +	 * proceed with unregistration.
> +	 */
> +	pmu->closed = true;
> +	synchronize_rcu();
> +
> +	xe_pmu_unregister_cpuhp_state(pmu);
> +
> +	perf_pmu_unregister(&pmu->base);
> +	pmu->base.event_init = NULL;
> +	kfree(pmu->base.attr_groups);
> +	kfree(pmu->name);
> +	free_event_attributes(pmu);
> +}
> +
> +static void init_samples(struct xe_pmu *pmu) {
> +	struct xe_device *xe = container_of(pmu, typeof(*xe), pmu);
> +	struct xe_gt *gt;
> +	unsigned int i;
> +
> +	for_each_gt(gt, xe, i)
> +		engine_group_busyness_store(gt);
> +}
> +
> +void xe_pmu_register(struct xe_pmu *pmu) {
> +	struct xe_device *xe = container_of(pmu, typeof(*xe), pmu);
> +	const struct attribute_group *attr_groups[] = {
> +		&xe_pmu_format_attr_group,
> +		&pmu->events_attr_group,
> +		&xe_pmu_cpumask_attr_group,
> +		NULL
> +	};
> +
> +	int ret = -ENOMEM;
> +
> +	spin_lock_init(&pmu->lock);
> +	pmu->cpuhp.cpu = -1;
> +	init_samples(pmu);
> +
> +	pmu->name = kasprintf(GFP_KERNEL,
> +			      "xe_%s",
> +			      dev_name(xe->drm.dev));
> +	if (pmu->name)
> +		/* tools/perf reserves colons as special. */
> +		strreplace((char *)pmu->name, ':', '_');
> +
> +	if (!pmu->name)
> +		goto err;
> +
> +	pmu->events_attr_group.name = "events";
> +	pmu->events_attr_group.attrs = create_event_attributes(pmu);
> +	if (!pmu->events_attr_group.attrs)
> +		goto err_name;
> +
> +	pmu->base.attr_groups = kmemdup(attr_groups,
> sizeof(attr_groups),
> +					GFP_KERNEL);
> +	if (!pmu->base.attr_groups)
> +		goto err_attr;
> +
> +	pmu->base.module	= THIS_MODULE;
> +	pmu->base.task_ctx_nr	= perf_invalid_context;
> +	pmu->base.event_init	= xe_pmu_event_init;
> +	pmu->base.add		= xe_pmu_event_add;
> +	pmu->base.del		= xe_pmu_event_del;
> +	pmu->base.start		= xe_pmu_event_start;
> +	pmu->base.stop		= xe_pmu_event_stop;
> +	pmu->base.read		= xe_pmu_event_read;
> +	pmu->base.event_idx	= xe_pmu_event_event_idx;
> +
> +	ret = perf_pmu_register(&pmu->base, pmu->name, -1);
> +	if (ret)
> +		goto err_groups;
> +
> +	ret = xe_pmu_register_cpuhp_state(pmu);
> +	if (ret)
> +		goto err_unreg;
> +
> +	ret = drmm_add_action_or_reset(&xe->drm, xe_pmu_unregister,
> pmu);
> +	XE_WARN_ON(ret);
> +
> +	return;
> +
> +err_unreg:
> +	perf_pmu_unregister(&pmu->base);
> +err_groups:
> +	kfree(pmu->base.attr_groups);
> +err_attr:
> +	pmu->base.event_init = NULL;
> +	free_event_attributes(pmu);
> +err_name:
> +	kfree(pmu->name);
> +err:
> +	drm_notice(&xe->drm, "Failed to register PMU!\n"); }
> diff --git a/drivers/gpu/drm/xe/xe_pmu.h b/drivers/gpu/drm/xe/xe_pmu.h
> new file mode 100644 index 000000000000..d3f47f4ab343
> --- /dev/null
> +++ b/drivers/gpu/drm/xe/xe_pmu.h
> @@ -0,0 +1,25 @@
> +/* SPDX-License-Identifier: MIT */
> +/*
> + * Copyright © 2023 Intel Corporation
> + */
> +
> +#ifndef _XE_PMU_H_
> +#define _XE_PMU_H_
> +
> +#include "xe_gt_types.h"
> +#include "xe_pmu_types.h"
> +
> +#ifdef CONFIG_PERF_EVENTS
> +int xe_pmu_init(void);
> +void xe_pmu_exit(void);
> +void xe_pmu_register(struct xe_pmu *pmu);
> +void engine_group_busyness_store(struct xe_gt *gt);
> +#else
> +static inline int xe_pmu_init(void) { return 0; }
> +static inline void xe_pmu_exit(void) {}
> +static inline void xe_pmu_register(struct xe_pmu *pmu) {}
> +static inline void engine_group_busyness_store(struct xe_gt *gt) {}
> +#endif
> +
> +#endif
> +
> diff --git a/drivers/gpu/drm/xe/xe_pmu_types.h
> b/drivers/gpu/drm/xe/xe_pmu_types.h
> new file mode 100644
> index 000000000000..e87edd4d6a87
> --- /dev/null
> +++ b/drivers/gpu/drm/xe/xe_pmu_types.h
> @@ -0,0 +1,80 @@
> +/* SPDX-License-Identifier: MIT */
> +/*
> + * Copyright © 2023 Intel Corporation
> + */
> +
> +#ifndef _XE_PMU_TYPES_H_
> +#define _XE_PMU_TYPES_H_
> +
> +#include <linux/perf_event.h>
> +#include <linux/spinlock_types.h>
> +#include <uapi/drm/xe_drm.h>
> +
> +enum {
> +	__XE_SAMPLE_RENDER_GROUP_BUSY,
> +	__XE_SAMPLE_COPY_GROUP_BUSY,
> +	__XE_SAMPLE_MEDIA_GROUP_BUSY,
> +	__XE_SAMPLE_ANY_ENGINE_GROUP_BUSY,
> +	__XE_NUM_PMU_SAMPLERS
> +};
> +
> +struct xe_pmu_sample {
> +	u64 cur;
> +};
> +
> +#define XE_MAX_GT_PER_TILE 2
> +
> +struct xe_pmu {
> +	/**
> +	 * @cpuhp: Struct used for CPU hotplug handling.
> +	 */
> +	struct {
> +		struct hlist_node node;
> +		unsigned int cpu;
> +	} cpuhp;
> +	/**
> +	 * @base: PMU base.
> +	 */
> +	struct pmu base;
> +	/**
> +	 * @closed: xe is unregistering.
> +	 */
> +	bool closed;
> +	/**
> +	 * @name: Name as registered with perf core.
> +	 */
> +	const char *name;
> +	/**
> +	 * @lock: Lock protecting enable mask and ref count handling.
> +	 */
> +	spinlock_t lock;
> +	/**
> +	 * @sample: Current and previous (raw) counters.
> +	 *
> +	 * These counters are updated when the device is awake.
> +	 *
> +	 */
> +	struct xe_pmu_sample sample[XE_MAX_GT_PER_TILE *
> __XE_NUM_PMU_SAMPLERS];
> +	/**
> +	 * @irq_count: Number of interrupts
> +	 *
> +	 * Intentionally unsigned long to avoid atomics or heuristics on 32bit.
> +	 * 4e9 interrupts are a lot and postprocessing can really deal with an
> +	 * occasional wraparound easily. It's 32bit after all.
> +	 */
> +	unsigned long irq_count;
> +	/**
> +	 * @events_attr_group: Device events attribute group.
> +	 */
> +	struct attribute_group events_attr_group;
> +	/**
> +	 * @xe_attr: Memory block holding device attributes.
> +	 */
> +	void *xe_attr;
> +	/**
> +	 * @pmu_attr: Memory block holding device attributes.
> +	 */
> +	void *pmu_attr;
> +};
> +
> +#endif
> diff --git a/include/uapi/drm/xe_drm.h b/include/uapi/drm/xe_drm.h index
> 965cd9527ff1..ed097056f944 100644
> --- a/include/uapi/drm/xe_drm.h
> +++ b/include/uapi/drm/xe_drm.h
> @@ -990,6 +990,22 @@ struct drm_xe_vm_madvise {
>  	__u64 reserved[2];
>  };
> 
> +/* PMU event config IDs */
> +
> +/*
> + * Top 4 bits of every counter are GT id.
> + */
> +#define __XE_PMU_GT_SHIFT (60)
> +
> +#define ___XE_PMU_OTHER(gt, x) \
> +	(((__u64)(x)) | ((__u64)(gt) << __XE_PMU_GT_SHIFT))
> +
> +#define XE_PMU_INTERRUPTS(gt)			___XE_PMU_OTHER(gt, 0)
> +#define XE_PMU_RENDER_GROUP_BUSY(gt)		___XE_PMU_OTHER(gt, 1)
> +#define XE_PMU_COPY_GROUP_BUSY(gt)		___XE_PMU_OTHER(gt, 2)
> +#define XE_PMU_MEDIA_GROUP_BUSY(gt)		___XE_PMU_OTHER(gt, 3)
> +#define XE_PMU_ANY_ENGINE_GROUP_BUSY(gt)	___XE_PMU_OTHER(gt, 4)
> +
>  #if defined(__cplusplus)
>  }
>  #endif
> --
> 2.25.1


^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [Intel-xe] [PATCH v2 2/2] drm/xe/pmu: Enable PMU interface
  2023-06-27 12:21 ` [Intel-xe] [PATCH v2 2/2] drm/xe/pmu: Enable PMU interface Aravind Iddamsetty
  2023-06-30 13:53   ` Upadhyay, Tejas
  2023-07-04  3:34   ` Ghimiray, Himal Prasad
@ 2023-07-04  9:10   ` Upadhyay, Tejas
  2023-07-05  4:42     ` Iddamsetty, Aravind
  2023-07-06  2:39   ` Dixit, Ashutosh
                     ` (3 subsequent siblings)
  6 siblings, 1 reply; 59+ messages in thread
From: Upadhyay, Tejas @ 2023-07-04  9:10 UTC (permalink / raw)
  To: Iddamsetty, Aravind, intel-xe; +Cc: Bommu, Krishnaiah, Ursulin, Tvrtko



> -----Original Message-----
> From: Intel-xe <intel-xe-bounces@lists.freedesktop.org> On Behalf Of
> Aravind Iddamsetty
> Sent: Tuesday, June 27, 2023 5:51 PM
> To: intel-xe@lists.freedesktop.org
> Cc: Bommu, Krishnaiah <krishnaiah.bommu@intel.com>; Ursulin, Tvrtko
> <tvrtko.ursulin@intel.com>
> Subject: [Intel-xe] [PATCH v2 2/2] drm/xe/pmu: Enable PMU interface
> 
> There are a set of engine group busyness counters provided by HW which
> are perfect fit to be exposed via PMU perf events.
> 
> BSPEC: 46559, 46560, 46722, 46729
> 
> events can be listed using:
> perf list
>   xe_0000_03_00.0/any-engine-group-busy-gt0/         [Kernel PMU event]
>   xe_0000_03_00.0/copy-group-busy-gt0/               [Kernel PMU event]
>   xe_0000_03_00.0/interrupts/                        [Kernel PMU event]
>   xe_0000_03_00.0/media-group-busy-gt0/              [Kernel PMU event]
>   xe_0000_03_00.0/render-group-busy-gt0/             [Kernel PMU event]
> 
> and can be read using:
> 
> perf stat -e "xe_0000_8c_00.0/render-group-busy-gt0/" -I 1000
>            time             counts unit events
>      1.001139062                  0 ns  xe_0000_8c_00.0/render-group-busy-gt0/
>      2.003294678                  0 ns  xe_0000_8c_00.0/render-group-busy-gt0/
>      3.005199582                  0 ns  xe_0000_8c_00.0/render-group-busy-gt0/
>      4.007076497                  0 ns  xe_0000_8c_00.0/render-group-busy-gt0/
>      5.008553068                  0 ns  xe_0000_8c_00.0/render-group-busy-gt0/
>      6.010531563              43520 ns  xe_0000_8c_00.0/render-group-busy-gt0/
>      7.012468029              44800 ns  xe_0000_8c_00.0/render-group-busy-gt0/
>      8.013463515                  0 ns  xe_0000_8c_00.0/render-group-busy-gt0/
>      9.015300183                  0 ns  xe_0000_8c_00.0/render-group-busy-gt0/
>     10.017233010                  0 ns  xe_0000_8c_00.0/render-group-busy-gt0/
>     10.971934120                  0 ns  xe_0000_8c_00.0/render-group-busy-gt0/
> 
> The pmu base implementation is taken from i915.
> 
> v2:
> Store last known value when device is awake return that while the GT is
> suspended and then update the driver copy when read during awake.
> 
> Co-developed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> Co-developed-by: Bommu Krishnaiah <krishnaiah.bommu@intel.com>
> Signed-off-by: Aravind Iddamsetty <aravind.iddamsetty@intel.com>
> ---
>  drivers/gpu/drm/xe/Makefile          |   2 +
>  drivers/gpu/drm/xe/regs/xe_gt_regs.h |   5 +
>  drivers/gpu/drm/xe/xe_device.c       |   2 +
>  drivers/gpu/drm/xe/xe_device_types.h |   4 +
>  drivers/gpu/drm/xe/xe_gt.c           |   2 +
>  drivers/gpu/drm/xe/xe_irq.c          |  22 +
>  drivers/gpu/drm/xe/xe_module.c       |   5 +
>  drivers/gpu/drm/xe/xe_pmu.c          | 739 +++++++++++++++++++++++++++
>  drivers/gpu/drm/xe/xe_pmu.h          |  25 +
>  drivers/gpu/drm/xe/xe_pmu_types.h    |  80 +++
>  include/uapi/drm/xe_drm.h            |  16 +
>  11 files changed, 902 insertions(+)
>  create mode 100644 drivers/gpu/drm/xe/xe_pmu.c  create mode 100644
> drivers/gpu/drm/xe/xe_pmu.h  create mode 100644
> drivers/gpu/drm/xe/xe_pmu_types.h
> 
> diff --git a/drivers/gpu/drm/xe/Makefile b/drivers/gpu/drm/xe/Makefile
> index 081c57fd8632..e52ab795c566 100644
> --- a/drivers/gpu/drm/xe/Makefile
> +++ b/drivers/gpu/drm/xe/Makefile
> @@ -217,6 +217,8 @@ xe-$(CONFIG_DRM_XE_DISPLAY) += \
>  	i915-display/skl_universal_plane.o \
>  	i915-display/skl_watermark.o
> 
> +xe-$(CONFIG_PERF_EVENTS) += xe_pmu.o
> +
>  ifeq ($(CONFIG_ACPI),y)
>  	xe-$(CONFIG_DRM_XE_DISPLAY) += \
>  		i915-display/intel_acpi.o \
> diff --git a/drivers/gpu/drm/xe/regs/xe_gt_regs.h
> b/drivers/gpu/drm/xe/regs/xe_gt_regs.h
> index 3f664011eaea..c7d9e4634745 100644
> --- a/drivers/gpu/drm/xe/regs/xe_gt_regs.h
> +++ b/drivers/gpu/drm/xe/regs/xe_gt_regs.h
> @@ -285,6 +285,11 @@
>  #define   INVALIDATION_BROADCAST_MODE_DIS	REG_BIT(12)
>  #define   GLOBAL_INVALIDATION_MODE		REG_BIT(2)
> 
> +#define XE_OAG_RC0_ANY_ENGINE_BUSY_FREE		XE_REG(0xdb80)
> +#define XE_OAG_ANY_MEDIA_FF_BUSY_FREE		XE_REG(0xdba0)
> +#define XE_OAG_BLT_BUSY_FREE			XE_REG(0xdbbc)
> +#define XE_OAG_RENDER_BUSY_FREE			XE_REG(0xdbdc)
> +
>  #define SAMPLER_MODE				XE_REG_MCR(0xe18c, XE_REG_OPTION_MASKED)
>  #define   ENABLE_SMALLPL			REG_BIT(15)
>  #define   SC_DISABLE_POWER_OPTIMIZATION_EBB	REG_BIT(9)
> diff --git a/drivers/gpu/drm/xe/xe_device.c
> b/drivers/gpu/drm/xe/xe_device.c index c7985af85a53..b2c7bd4a97d9
> 100644
> --- a/drivers/gpu/drm/xe/xe_device.c
> +++ b/drivers/gpu/drm/xe/xe_device.c
> @@ -328,6 +328,8 @@ int xe_device_probe(struct xe_device *xe)
> 
>  	xe_debugfs_register(xe);
> 
> +	xe_pmu_register(&xe->pmu);
> +
>  	err = drmm_add_action_or_reset(&xe->drm, xe_device_sanitize, xe);
>  	if (err)
>  		return err;
> diff --git a/drivers/gpu/drm/xe/xe_device_types.h
> b/drivers/gpu/drm/xe/xe_device_types.h
> index 0226d44a6af2..3ba99aae92b9 100644
> --- a/drivers/gpu/drm/xe/xe_device_types.h
> +++ b/drivers/gpu/drm/xe/xe_device_types.h
> @@ -15,6 +15,7 @@
>  #include "xe_devcoredump_types.h"
>  #include "xe_gt_types.h"
>  #include "xe_platform_types.h"
> +#include "xe_pmu.h"
>  #include "xe_step_types.h"
> 
>  #if IS_ENABLED(CONFIG_DRM_XE_DISPLAY)
> @@ -332,6 +333,9 @@ struct xe_device {
>  	/** @d3cold_allowed: Indicates if d3cold is a valid device state */
>  	bool d3cold_allowed;
> 
> +	/* @pmu: performance monitoring unit */

This should be a kernel-doc comment, i.e. with the /** opener:

/** @pmu: performance monitoring unit */

> +	struct xe_pmu pmu;
> +
>  	/* private: */
> 
>  #if IS_ENABLED(CONFIG_DRM_XE_DISPLAY)
> diff --git a/drivers/gpu/drm/xe/xe_gt.c b/drivers/gpu/drm/xe/xe_gt.c index
> 2458397ce8af..96e3720923d4 100644
> --- a/drivers/gpu/drm/xe/xe_gt.c
> +++ b/drivers/gpu/drm/xe/xe_gt.c
> @@ -593,6 +593,8 @@ int xe_gt_suspend(struct xe_gt *gt)
>  	if (err)
>  		goto err_msg;
> 
> +	engine_group_busyness_store(gt);
> +
>  	err = xe_uc_suspend(&gt->uc);
>  	if (err)
>  		goto err_force_wake;
> diff --git a/drivers/gpu/drm/xe/xe_irq.c b/drivers/gpu/drm/xe/xe_irq.c
> index b4ed1e4a3388..cb943fb94ec7 100644
> --- a/drivers/gpu/drm/xe/xe_irq.c
> +++ b/drivers/gpu/drm/xe/xe_irq.c
> @@ -27,6 +27,24 @@
>  #define IIR(offset)				XE_REG(offset + 0x8)
>  #define IER(offset)				XE_REG(offset + 0xc)
> 
> +/*
> + * Interrupt statistic for PMU. Increments the counter only if the
> + * interrupt originated from the GPU so interrupts from a device which
> + * shares the interrupt line are not accounted.
> + */
> +static inline void xe_pmu_irq_stats(struct xe_device *xe,
> +				    irqreturn_t res)
> +{
> +	if (unlikely(res != IRQ_HANDLED))
> +		return;
> +
> +	/*
> +	 * A clever compiler translates that into INC. A not so clever one
> +	 * should at least prevent store tearing.
> +	 */
> +	WRITE_ONCE(xe->pmu.irq_count, xe->pmu.irq_count + 1);
> +}
> +
>  static void assert_iir_is_zero(struct xe_gt *mmio, struct xe_reg reg)  {
>  	u32 val = xe_mmio_read32(mmio, reg);
> @@ -325,6 +343,8 @@ static irqreturn_t xelp_irq_handler(int irq, void *arg)
> 
>  	xe_display_irq_enable(xe, gu_misc_iir);
> 
> +	xe_pmu_irq_stats(xe, IRQ_HANDLED);
> +
>  	return IRQ_HANDLED;
>  }
> 
> @@ -414,6 +434,8 @@ static irqreturn_t dg1_irq_handler(int irq, void *arg)
>  	dg1_intr_enable(xe, false);
>  	xe_display_irq_enable(xe, gu_misc_iir);
> 
> +	xe_pmu_irq_stats(xe, IRQ_HANDLED);
> +
>  	return IRQ_HANDLED;
>  }
> 
> diff --git a/drivers/gpu/drm/xe/xe_module.c
> b/drivers/gpu/drm/xe/xe_module.c index 75e5be939f53..f6fe89748525
> 100644
> --- a/drivers/gpu/drm/xe/xe_module.c
> +++ b/drivers/gpu/drm/xe/xe_module.c
> @@ -12,6 +12,7 @@
>  #include "xe_hw_fence.h"
>  #include "xe_module.h"
>  #include "xe_pci.h"
> +#include "xe_pmu.h"
>  #include "xe_sched_job.h"
> 
>  bool enable_guc = true;
> @@ -49,6 +50,10 @@ static const struct init_funcs init_funcs[] = {
>  		.init = xe_sched_job_module_init,
>  		.exit = xe_sched_job_module_exit,
>  	},
> +	{
> +		.init = xe_pmu_init,
> +		.exit = xe_pmu_exit,
> +	},
>  	{
>  		.init = xe_register_pci_driver,
>  		.exit = xe_unregister_pci_driver,
> diff --git a/drivers/gpu/drm/xe/xe_pmu.c b/drivers/gpu/drm/xe/xe_pmu.c
> new file mode 100644 index 000000000000..bef1895be9f7
> --- /dev/null
> +++ b/drivers/gpu/drm/xe/xe_pmu.c
> @@ -0,0 +1,739 @@
> +/*
> + * SPDX-License-Identifier: MIT
> + *
> + * Copyright © 2023 Intel Corporation
> + */
> +
> +#include <drm/drm_drv.h>
> +#include <drm/drm_managed.h>
> +#include <drm/xe_drm.h>
> +
> +#include "regs/xe_gt_regs.h"
> +#include "xe_device.h"
> +#include "xe_gt_clock.h"
> +#include "xe_mmio.h"
> +
> +static cpumask_t xe_pmu_cpumask;
> +static unsigned int xe_pmu_target_cpu = -1;
> +
> +static unsigned int config_gt_id(const u64 config) {
> +	return config >> __XE_PMU_GT_SHIFT;
> +}
> +
> +static u64 config_counter(const u64 config) {
> +	return config & ~(~0ULL << __XE_PMU_GT_SHIFT); }
> +
> +static unsigned int
> +__sample_idx(struct xe_pmu *pmu, unsigned int gt_id, int sample) {
> +	unsigned int idx = gt_id * __XE_NUM_PMU_SAMPLERS + sample;
> +
> +	XE_BUG_ON(idx >= ARRAY_SIZE(pmu->sample));
> +
> +	return idx;
> +}
> +
> +static u64 read_sample(struct xe_pmu *pmu, unsigned int gt_id, int
> +sample) {
> +	return pmu->sample[__sample_idx(pmu, gt_id, sample)].cur; }
> +
> +static void
> +store_sample(struct xe_pmu *pmu, unsigned int gt_id, int sample, u64
> +val) {
> +	pmu->sample[__sample_idx(pmu, gt_id, sample)].cur = val; }
> +
> +static int engine_busyness_sample_type(u64 config) {
> +	int type = 0;
> +
> +	switch (config) {
> +	case XE_PMU_RENDER_GROUP_BUSY(0):
> +		type =  __XE_SAMPLE_RENDER_GROUP_BUSY;
> +		break;
> +	case XE_PMU_COPY_GROUP_BUSY(0):
> +		type = __XE_SAMPLE_COPY_GROUP_BUSY;
> +		break;
> +	case XE_PMU_MEDIA_GROUP_BUSY(0):
> +		type = __XE_SAMPLE_MEDIA_GROUP_BUSY;
> +		break;
> +	case XE_PMU_ANY_ENGINE_GROUP_BUSY(0):
> +		type = __XE_SAMPLE_ANY_ENGINE_GROUP_BUSY;
> +		break;
> +	}
> +
> +	return type;
> +}
> +
> +static void xe_pmu_event_destroy(struct perf_event *event) {
> +	struct xe_device *xe =
> +		container_of(event->pmu, typeof(*xe), pmu.base);
> +
> +	drm_WARN_ON(&xe->drm, event->parent);
> +
> +	drm_dev_put(&xe->drm);
> +}
> +
> +static u64 __engine_group_busyness_read(struct xe_gt *gt, u64 config) {
> +	u64 val = 0;
> +
> +	switch (config) {
> +	case XE_PMU_RENDER_GROUP_BUSY(0):
> +		val = xe_mmio_read32(gt, XE_OAG_RENDER_BUSY_FREE);
> +		break;
> +	case XE_PMU_COPY_GROUP_BUSY(0):
> +		val = xe_mmio_read32(gt, XE_OAG_BLT_BUSY_FREE);
> +		break;
> +	case XE_PMU_MEDIA_GROUP_BUSY(0):
> +		val = xe_mmio_read32(gt, XE_OAG_ANY_MEDIA_FF_BUSY_FREE);
> +		break;
> +	case XE_PMU_ANY_ENGINE_GROUP_BUSY(0):
> +		val = xe_mmio_read32(gt, XE_OAG_RC0_ANY_ENGINE_BUSY_FREE);
> +		break;
> +	default:
> +		drm_warn(&gt->tile->xe->drm, "unknown pmu event\n");
> +	}
> +
> +	return xe_gt_clock_interval_to_ns(gt, val * 16);
> +}
> +
> +static u64 engine_group_busyness_read(struct xe_gt *gt, u64 config) {
> +	int sample_type = engine_busyness_sample_type(config);
> +	struct xe_device *xe = gt->tile->xe;
> +	const unsigned int gt_id = gt->info.id;
> +	struct xe_pmu *pmu = &xe->pmu;
> +	bool device_awake;
> +	unsigned long flags;
> +	u64 val;

In many places the reverse Christmas tree order (longest declaration line first) is not followed for the local variable blocks. Please apply it wherever applicable.
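
For example, in engine_group_busyness_read() above, the declaration block
would become roughly (sketch only, untested):

	int sample_type = engine_busyness_sample_type(config);
	const unsigned int gt_id = gt->info.id;
	struct xe_device *xe = gt->tile->xe;
	struct xe_pmu *pmu = &xe->pmu;
	unsigned long flags;
	bool device_awake;
	u64 val;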

> +
> +	/*
> +	 * found no better way to check if device is awake or not. Before
> +	 * we suspend we set the submission_state.enabled to false.
> +	 */
> +	device_awake = gt->uc.guc.submission_state.enabled ? true : false;
> +	if (device_awake)
> +		val = __engine_group_busyness_read(gt, config);
> +
> +	spin_lock_irqsave(&pmu->lock, flags);
> +
> +	if (device_awake)
> +		store_sample(pmu, gt_id, sample_type, val);
> +	else
> +		val = read_sample(pmu, gt_id, sample_type);
> +
> +	spin_unlock_irqrestore(&pmu->lock, flags);
> +
> +	return val;
> +}
> +
> +void engine_group_busyness_store(struct xe_gt *gt) {
> +	struct xe_pmu *pmu = &gt->tile->xe->pmu;
> +	unsigned int gt_id = gt->info.id;
> +	unsigned long flags;
> +
> +	spin_lock_irqsave(&pmu->lock, flags);
> +
> +	store_sample(pmu, gt_id, __XE_SAMPLE_RENDER_GROUP_BUSY,
> +		     __engine_group_busyness_read(gt, XE_PMU_RENDER_GROUP_BUSY(0)));
> +	store_sample(pmu, gt_id, __XE_SAMPLE_COPY_GROUP_BUSY,
> +		     __engine_group_busyness_read(gt, XE_PMU_COPY_GROUP_BUSY(0)));
> +	store_sample(pmu, gt_id, __XE_SAMPLE_MEDIA_GROUP_BUSY,
> +		     __engine_group_busyness_read(gt, XE_PMU_MEDIA_GROUP_BUSY(0)));
> +	store_sample(pmu, gt_id, __XE_SAMPLE_ANY_ENGINE_GROUP_BUSY,
> +		     __engine_group_busyness_read(gt, XE_PMU_ANY_ENGINE_GROUP_BUSY(0)));
> +
> +	spin_unlock_irqrestore(&pmu->lock, flags);
> +}
> +
> +static int
> +config_status(struct xe_device *xe, u64 config) {
> +	unsigned int max_gt_id = xe->info.gt_count > 1 ? 1 : 0;
> +	unsigned int gt_id = config_gt_id(config);
> +	struct xe_gt *gt = xe_device_get_gt(xe, gt_id);
> +
> +	if (gt_id > max_gt_id)
> +		return -ENOENT;
> +
> +	switch (config_counter(config)) {
> +	case XE_PMU_INTERRUPTS(0):
> +		if (gt_id)
> +			return -ENOENT;
> +		break;
> +	case XE_PMU_RENDER_GROUP_BUSY(0):
> +	case XE_PMU_COPY_GROUP_BUSY(0):
> +	case XE_PMU_ANY_ENGINE_GROUP_BUSY(0):
> +		if (GRAPHICS_VER(xe) < 12)
> +			return -ENOENT;
> +		break;
> +	case XE_PMU_MEDIA_GROUP_BUSY(0):
> +		if (MEDIA_VER(xe) >= 13 && gt->info.type !=
> XE_GT_TYPE_MEDIA)
> +			return -ENOENT;
> +		break;
> +	default:
> +		return -ENOENT;
> +	}
> +
> +	return 0;
> +}
> +
> +static int xe_pmu_event_init(struct perf_event *event) {
> +	struct xe_device *xe =
> +		container_of(event->pmu, typeof(*xe), pmu.base);
> +	struct xe_pmu *pmu = &xe->pmu;
> +	int ret;
> +
> +	if (pmu->closed)
> +		return -ENODEV;
> +
> +	if (event->attr.type != event->pmu->type)
> +		return -ENOENT;
> +
> +	/* unsupported modes and filters */
> +	if (event->attr.sample_period) /* no sampling */
> +		return -EINVAL;
> +
> +	if (has_branch_stack(event))
> +		return -EOPNOTSUPP;
> +
> +	if (event->cpu < 0)
> +		return -EINVAL;
> +
> +	/* only allow running on one cpu at a time */
> +	if (!cpumask_test_cpu(event->cpu, &xe_pmu_cpumask))
> +		return -EINVAL;
> +
> +	ret = config_status(xe, event->attr.config);
> +	if (ret)
> +		return ret;
> +
> +	if (!event->parent) {
> +		drm_dev_get(&xe->drm);
> +		event->destroy = xe_pmu_event_destroy;
> +	}
> +
> +	return 0;
> +}
> +
> +static u64 __xe_pmu_event_read(struct perf_event *event) {
> +	struct xe_device *xe =
> +		container_of(event->pmu, typeof(*xe), pmu.base);
> +	const unsigned int gt_id = config_gt_id(event->attr.config);
> +	const u64 config = config_counter(event->attr.config);
> +	struct xe_gt *gt = xe_device_get_gt(xe, gt_id);
> +	struct xe_pmu *pmu = &xe->pmu;
> +	u64 val = 0;
> +
> +	switch (config) {
> +	case XE_PMU_INTERRUPTS(0):
> +		val = READ_ONCE(pmu->irq_count);
> +		break;
> +	case XE_PMU_RENDER_GROUP_BUSY(0):
> +	case XE_PMU_COPY_GROUP_BUSY(0):
> +	case XE_PMU_ANY_ENGINE_GROUP_BUSY(0):
> +	case XE_PMU_MEDIA_GROUP_BUSY(0):
> +		val = engine_group_busyness_read(gt, config);
> +	}

Do you need to handle a default case here?

Also, if possible, please consider breaking this up into smaller patches; it is a huge patch to review. Split it only if it makes sense.
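
Something along these lines (sketch only, untested), mirroring what
__engine_group_busyness_read() already does for unknown events:

	switch (config) {
	case XE_PMU_INTERRUPTS(0):
		val = READ_ONCE(pmu->irq_count);
		break;
	case XE_PMU_RENDER_GROUP_BUSY(0):
	case XE_PMU_COPY_GROUP_BUSY(0):
	case XE_PMU_ANY_ENGINE_GROUP_BUSY(0):
	case XE_PMU_MEDIA_GROUP_BUSY(0):
		val = engine_group_busyness_read(gt, config);
		break;
	default:
		drm_warn(&xe->drm, "unknown pmu event\n");
	}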

Thanks,
Tejas
> +
> +	return val;
> +}
> +
> +static void xe_pmu_event_read(struct perf_event *event) {
> +	struct xe_device *xe =
> +		container_of(event->pmu, typeof(*xe), pmu.base);
> +	struct hw_perf_event *hwc = &event->hw;
> +	struct xe_pmu *pmu = &xe->pmu;
> +	u64 prev, new;
> +
> +	if (pmu->closed) {
> +		event->hw.state = PERF_HES_STOPPED;
> +		return;
> +	}
> +again:
> +	prev = local64_read(&hwc->prev_count);
> +	new = __xe_pmu_event_read(event);
> +
> +	if (local64_cmpxchg(&hwc->prev_count, prev, new) != prev)
> +		goto again;
> +
> +	local64_add(new - prev, &event->count); }
> +
> +static void xe_pmu_enable(struct perf_event *event) {
> +	/*
> +	 * Store the current counter value so we can report the correct delta
> +	 * for all listeners. Even when the event was already enabled and has
> +	 * an existing non-zero value.
> +	 */
> +	local64_set(&event->hw.prev_count, __xe_pmu_event_read(event));
> +}
> +
> +static void xe_pmu_event_start(struct perf_event *event, int flags) {
> +	struct xe_device *xe =
> +		container_of(event->pmu, typeof(*xe), pmu.base);
> +	struct xe_pmu *pmu = &xe->pmu;
> +
> +	if (pmu->closed)
> +		return;
> +
> +	xe_pmu_enable(event);
> +	event->hw.state = 0;
> +}
> +
> +static void xe_pmu_event_stop(struct perf_event *event, int flags) {
> +	if (flags & PERF_EF_UPDATE)
> +		xe_pmu_event_read(event);
> +
> +	event->hw.state = PERF_HES_STOPPED;
> +}
> +
> +static int xe_pmu_event_add(struct perf_event *event, int flags) {
> +	struct xe_device *xe =
> +		container_of(event->pmu, typeof(*xe), pmu.base);
> +	struct xe_pmu *pmu = &xe->pmu;
> +
> +	if (pmu->closed)
> +		return -ENODEV;
> +
> +	if (flags & PERF_EF_START)
> +		xe_pmu_event_start(event, flags);
> +
> +	return 0;
> +}
> +
> +static void xe_pmu_event_del(struct perf_event *event, int flags) {
> +	xe_pmu_event_stop(event, PERF_EF_UPDATE); }
> +
> +static int xe_pmu_event_event_idx(struct perf_event *event) {
> +	return 0;
> +}
> +
> +struct xe_str_attribute {
> +	struct device_attribute attr;
> +	const char *str;
> +};
> +
> +static ssize_t xe_pmu_format_show(struct device *dev,
> +				  struct device_attribute *attr, char *buf) {
> +	struct xe_str_attribute *eattr;
> +
> +	eattr = container_of(attr, struct xe_str_attribute, attr);
> +	return sprintf(buf, "%s\n", eattr->str); }
> +
> +#define XE_PMU_FORMAT_ATTR(_name, _config) \
> +	(&((struct xe_str_attribute[]) { \
> +		{ .attr = __ATTR(_name, 0444, xe_pmu_format_show, NULL),
> \
> +		  .str = _config, } \
> +	})[0].attr.attr)
> +
> +static struct attribute *xe_pmu_format_attrs[] = {
> +	XE_PMU_FORMAT_ATTR(xe_eventid, "config:0-20"),
> +	NULL,
> +};
> +
> +static const struct attribute_group xe_pmu_format_attr_group = {
> +	.name = "format",
> +	.attrs = xe_pmu_format_attrs,
> +};
> +
> +struct xe_ext_attribute {
> +	struct device_attribute attr;
> +	unsigned long val;
> +};
> +
> +static ssize_t xe_pmu_event_show(struct device *dev,
> +				 struct device_attribute *attr, char *buf) {
> +	struct xe_ext_attribute *eattr;
> +
> +	eattr = container_of(attr, struct xe_ext_attribute, attr);
> +	return sprintf(buf, "config=0x%lx\n", eattr->val); }
> +
> +static ssize_t cpumask_show(struct device *dev,
> +			    struct device_attribute *attr, char *buf) {
> +	return cpumap_print_to_pagebuf(true, buf, &xe_pmu_cpumask); }
> +
> +static DEVICE_ATTR_RO(cpumask);
> +
> +static struct attribute *xe_cpumask_attrs[] = {
> +	&dev_attr_cpumask.attr,
> +	NULL,
> +};
> +
> +static const struct attribute_group xe_pmu_cpumask_attr_group = {
> +	.attrs = xe_cpumask_attrs,
> +};
> +
> +#define __event(__counter, __name, __unit) \ { \
> +	.counter = (__counter), \
> +	.name = (__name), \
> +	.unit = (__unit), \
> +	.global = false, \
> +}
> +
> +#define __global_event(__counter, __name, __unit) \ { \
> +	.counter = (__counter), \
> +	.name = (__name), \
> +	.unit = (__unit), \
> +	.global = true, \
> +}
> +
> +static struct xe_ext_attribute *
> +add_xe_attr(struct xe_ext_attribute *attr, const char *name, u64
> +config) {
> +	sysfs_attr_init(&attr->attr.attr);
> +	attr->attr.attr.name = name;
> +	attr->attr.attr.mode = 0444;
> +	attr->attr.show = xe_pmu_event_show;
> +	attr->val = config;
> +
> +	return ++attr;
> +}
> +
> +static struct perf_pmu_events_attr *
> +add_pmu_attr(struct perf_pmu_events_attr *attr, const char *name,
> +	     const char *str)
> +{
> +	sysfs_attr_init(&attr->attr.attr);
> +	attr->attr.attr.name = name;
> +	attr->attr.attr.mode = 0444;
> +	attr->attr.show = perf_event_sysfs_show;
> +	attr->event_str = str;
> +
> +	return ++attr;
> +}
> +
> +static struct attribute **
> +create_event_attributes(struct xe_pmu *pmu) {
> +	struct xe_device *xe = container_of(pmu, typeof(*xe), pmu);
> +	static const struct {
> +		unsigned int counter;
> +		const char *name;
> +		const char *unit;
> +		bool global;
> +	} events[] = {
> +		__global_event(0, "interrupts", NULL),
> +		__event(1, "render-group-busy", "ns"),
> +		__event(2, "copy-group-busy", "ns"),
> +		__event(3, "media-group-busy", "ns"),
> +		__event(4, "any-engine-group-busy", "ns"),
> +	};
> +
> +	unsigned int count = 0;
> +	struct perf_pmu_events_attr *pmu_attr = NULL, *pmu_iter;
> +	struct xe_ext_attribute *xe_attr = NULL, *xe_iter;
> +	struct attribute **attr = NULL, **attr_iter;
> +	struct xe_gt *gt;
> +	unsigned int i, j;
> +
> +	/* Count how many counters we will be exposing. */
> +	for_each_gt(gt, xe, j) {
> +		for (i = 0; i < ARRAY_SIZE(events); i++) {
> +			u64 config = ___XE_PMU_OTHER(j,
> events[i].counter);
> +
> +			if (!config_status(xe, config))
> +				count++;
> +		}
> +	}
> +
> +	/* Allocate attribute objects and table. */
> +	xe_attr = kcalloc(count, sizeof(*xe_attr), GFP_KERNEL);
> +	if (!xe_attr)
> +		goto err_alloc;
> +
> +	pmu_attr = kcalloc(count, sizeof(*pmu_attr), GFP_KERNEL);
> +	if (!pmu_attr)
> +		goto err_alloc;
> +
> +	/* Max one pointer of each attribute type plus a termination entry.
> */
> +	attr = kcalloc(count * 2 + 1, sizeof(*attr), GFP_KERNEL);
> +	if (!attr)
> +		goto err_alloc;
> +
> +	xe_iter = xe_attr;
> +	pmu_iter = pmu_attr;
> +	attr_iter = attr;
> +
> +	for_each_gt(gt, xe, j) {
> +		for (i = 0; i < ARRAY_SIZE(events); i++) {
> +			u64 config = ___XE_PMU_OTHER(j,
> events[i].counter);
> +			char *str;
> +
> +			if (config_status(xe, config))
> +				continue;
> +
> +			if (events[i].global)
> +				str = kstrdup(events[i].name, GFP_KERNEL);
> +			else
> +				str = kasprintf(GFP_KERNEL, "%s-gt%u",
> +						events[i].name, j);
> +			if (!str)
> +				goto err;
> +
> +			*attr_iter++ = &xe_iter->attr.attr;
> +			xe_iter = add_xe_attr(xe_iter, str, config);
> +
> +			if (events[i].unit) {
> +				if (events[i].global)
> +					str = kasprintf(GFP_KERNEL,
> "%s.unit",
> +							events[i].name);
> +				else
> +					str = kasprintf(GFP_KERNEL, "%s-
> gt%u.unit",
> +							events[i].name, j);
> +				if (!str)
> +					goto err;
> +
> +				*attr_iter++ = &pmu_iter->attr.attr;
> +				pmu_iter = add_pmu_attr(pmu_iter, str,
> +							events[i].unit);
> +			}
> +		}
> +	}
> +
> +	pmu->xe_attr = xe_attr;
> +	pmu->pmu_attr = pmu_attr;
> +
> +	return attr;
> +
> +err:
> +	for (attr_iter = attr; *attr_iter; attr_iter++)
> +		kfree((*attr_iter)->name);
> +
> +err_alloc:
> +	kfree(attr);
> +	kfree(xe_attr);
> +	kfree(pmu_attr);
> +
> +	return NULL;
> +}
> +
> +static void free_event_attributes(struct xe_pmu *pmu) {
> +	struct attribute **attr_iter = pmu->events_attr_group.attrs;
> +
> +	for (; *attr_iter; attr_iter++)
> +		kfree((*attr_iter)->name);
> +
> +	kfree(pmu->events_attr_group.attrs);
> +	kfree(pmu->xe_attr);
> +	kfree(pmu->pmu_attr);
> +
> +	pmu->events_attr_group.attrs = NULL;
> +	pmu->xe_attr = NULL;
> +	pmu->pmu_attr = NULL;
> +}
> +
> +static int xe_pmu_cpu_online(unsigned int cpu, struct hlist_node *node)
> +{
> +	struct xe_pmu *pmu = hlist_entry_safe(node, typeof(*pmu),
> cpuhp.node);
> +
> +	XE_BUG_ON(!pmu->base.event_init);
> +
> +	/* Select the first online CPU as a designated reader. */
> +	if (cpumask_empty(&xe_pmu_cpumask))
> +		cpumask_set_cpu(cpu, &xe_pmu_cpumask);
> +
> +	return 0;
> +}
> +
> +static int xe_pmu_cpu_offline(unsigned int cpu, struct hlist_node
> +*node) {
> +	struct xe_pmu *pmu = hlist_entry_safe(node, typeof(*pmu),
> cpuhp.node);
> +	unsigned int target = xe_pmu_target_cpu;
> +
> +	XE_BUG_ON(!pmu->base.event_init);
> +
> +	/*
> +	 * Unregistering an instance generates a CPU offline event which we
> must
> +	 * ignore to avoid incorrectly modifying the shared
> xe_pmu_cpumask.
> +	 */
> +	if (pmu->closed)
> +		return 0;
> +
> +	if (cpumask_test_and_clear_cpu(cpu, &xe_pmu_cpumask)) {
> +		target = cpumask_any_but(topology_sibling_cpumask(cpu),
> cpu);
> +
> +		/* Migrate events if there is a valid target */
> +		if (target < nr_cpu_ids) {
> +			cpumask_set_cpu(target, &xe_pmu_cpumask);
> +			xe_pmu_target_cpu = target;
> +		}
> +	}
> +
> +	if (target < nr_cpu_ids && target != pmu->cpuhp.cpu) {
> +		perf_pmu_migrate_context(&pmu->base, cpu, target);
> +		pmu->cpuhp.cpu = target;
> +	}
> +
> +	return 0;
> +}
> +
> +static enum cpuhp_state cpuhp_slot = CPUHP_INVALID;
> +
> +int xe_pmu_init(void)
> +{
> +	int ret;
> +
> +	ret = cpuhp_setup_state_multi(CPUHP_AP_ONLINE_DYN,
> +				      "perf/x86/intel/xe:online",
> +				      xe_pmu_cpu_online,
> +				      xe_pmu_cpu_offline);
> +	if (ret < 0)
> +		pr_notice("Failed to setup cpuhp state for xe PMU! (%d)\n",
> +			  ret);
> +	else
> +		cpuhp_slot = ret;
> +
> +	return 0;
> +}
> +
> +void xe_pmu_exit(void)
> +{
> +	if (cpuhp_slot != CPUHP_INVALID)
> +		cpuhp_remove_multi_state(cpuhp_slot);
> +}
> +
> +static int xe_pmu_register_cpuhp_state(struct xe_pmu *pmu) {
> +	if (cpuhp_slot == CPUHP_INVALID)
> +		return -EINVAL;
> +
> +	return cpuhp_state_add_instance(cpuhp_slot, &pmu->cpuhp.node);
> +}
> +
> +static void xe_pmu_unregister_cpuhp_state(struct xe_pmu *pmu) {
> +	cpuhp_state_remove_instance(cpuhp_slot, &pmu->cpuhp.node); }
> +
> +static void xe_pmu_unregister(struct drm_device *device, void *arg) {
> +	struct xe_pmu *pmu = arg;
> +
> +	if (!pmu->base.event_init)
> +		return;
> +
> +	/*
> +	 * "Disconnect" the PMU callbacks - since all are atomic
> synchronize_rcu
> +	 * ensures all currently executing ones will have exited before we
> +	 * proceed with unregistration.
> +	 */
> +	pmu->closed = true;
> +	synchronize_rcu();
> +
> +	xe_pmu_unregister_cpuhp_state(pmu);
> +
> +	perf_pmu_unregister(&pmu->base);
> +	pmu->base.event_init = NULL;
> +	kfree(pmu->base.attr_groups);
> +	kfree(pmu->name);
> +	free_event_attributes(pmu);
> +}
> +
> +static void init_samples(struct xe_pmu *pmu) {
> +	struct xe_device *xe = container_of(pmu, typeof(*xe), pmu);
> +	struct xe_gt *gt;
> +	unsigned int i;
> +
> +	for_each_gt(gt, xe, i)
> +		engine_group_busyness_store(gt);
> +}
> +
> +void xe_pmu_register(struct xe_pmu *pmu) {
> +	struct xe_device *xe = container_of(pmu, typeof(*xe), pmu);
> +	const struct attribute_group *attr_groups[] = {
> +		&xe_pmu_format_attr_group,
> +		&pmu->events_attr_group,
> +		&xe_pmu_cpumask_attr_group,
> +		NULL
> +	};
> +
> +	int ret = -ENOMEM;
> +
> +	spin_lock_init(&pmu->lock);
> +	pmu->cpuhp.cpu = -1;
> +	init_samples(pmu);
> +
> +	pmu->name = kasprintf(GFP_KERNEL,
> +			      "xe_%s",
> +			      dev_name(xe->drm.dev));
> +	if (pmu->name)
> +		/* tools/perf reserves colons as special. */
> +		strreplace((char *)pmu->name, ':', '_');
> +
> +	if (!pmu->name)
> +		goto err;
> +
> +	pmu->events_attr_group.name = "events";
> +	pmu->events_attr_group.attrs = create_event_attributes(pmu);
> +	if (!pmu->events_attr_group.attrs)
> +		goto err_name;
> +
> +	pmu->base.attr_groups = kmemdup(attr_groups,
> sizeof(attr_groups),
> +					GFP_KERNEL);
> +	if (!pmu->base.attr_groups)
> +		goto err_attr;
> +
> +	pmu->base.module	= THIS_MODULE;
> +	pmu->base.task_ctx_nr	= perf_invalid_context;
> +	pmu->base.event_init	= xe_pmu_event_init;
> +	pmu->base.add		= xe_pmu_event_add;
> +	pmu->base.del		= xe_pmu_event_del;
> +	pmu->base.start		= xe_pmu_event_start;
> +	pmu->base.stop		= xe_pmu_event_stop;
> +	pmu->base.read		= xe_pmu_event_read;
> +	pmu->base.event_idx	= xe_pmu_event_event_idx;
> +
> +	ret = perf_pmu_register(&pmu->base, pmu->name, -1);
> +	if (ret)
> +		goto err_groups;
> +
> +	ret = xe_pmu_register_cpuhp_state(pmu);
> +	if (ret)
> +		goto err_unreg;
> +
> +	ret = drmm_add_action_or_reset(&xe->drm, xe_pmu_unregister,
> pmu);
> +	XE_WARN_ON(ret);
> +
> +	return;
> +
> +err_unreg:
> +	perf_pmu_unregister(&pmu->base);
> +err_groups:
> +	kfree(pmu->base.attr_groups);
> +err_attr:
> +	pmu->base.event_init = NULL;
> +	free_event_attributes(pmu);
> +err_name:
> +	kfree(pmu->name);
> +err:
> +	drm_notice(&xe->drm, "Failed to register PMU!\n"); }
> diff --git a/drivers/gpu/drm/xe/xe_pmu.h b/drivers/gpu/drm/xe/xe_pmu.h
> new file mode 100644
> index 000000000000..d3f47f4ab343
> --- /dev/null
> +++ b/drivers/gpu/drm/xe/xe_pmu.h
> @@ -0,0 +1,25 @@
> +/* SPDX-License-Identifier: MIT */
> +/*
> + * Copyright © 2023 Intel Corporation
> + */
> +
> +#ifndef _XE_PMU_H_
> +#define _XE_PMU_H_
> +
> +#include "xe_gt_types.h"
> +#include "xe_pmu_types.h"
> +
> +#ifdef CONFIG_PERF_EVENTS
> +int xe_pmu_init(void);
> +void xe_pmu_exit(void);
> +void xe_pmu_register(struct xe_pmu *pmu);
> +void engine_group_busyness_store(struct xe_gt *gt);
> +#else
> +static inline int xe_pmu_init(void) { return 0; }
> +static inline void xe_pmu_exit(void) {}
> +static inline void xe_pmu_register(struct xe_pmu *pmu) {}
> +static inline void engine_group_busyness_store(struct xe_gt *gt) {}
> +#endif
> +
> +#endif
> +
> diff --git a/drivers/gpu/drm/xe/xe_pmu_types.h b/drivers/gpu/drm/xe/xe_pmu_types.h
> new file mode 100644
> index 000000000000..e87edd4d6a87
> --- /dev/null
> +++ b/drivers/gpu/drm/xe/xe_pmu_types.h
> @@ -0,0 +1,80 @@
> +/* SPDX-License-Identifier: MIT */
> +/*
> + * Copyright © 2023 Intel Corporation
> + */
> +
> +#ifndef _XE_PMU_TYPES_H_
> +#define _XE_PMU_TYPES_H_
> +
> +#include <linux/perf_event.h>
> +#include <linux/spinlock_types.h>
> +#include <uapi/drm/xe_drm.h>
> +
> +enum {
> +	__XE_SAMPLE_RENDER_GROUP_BUSY,
> +	__XE_SAMPLE_COPY_GROUP_BUSY,
> +	__XE_SAMPLE_MEDIA_GROUP_BUSY,
> +	__XE_SAMPLE_ANY_ENGINE_GROUP_BUSY,
> +	__XE_NUM_PMU_SAMPLERS
> +};
> +
> +struct xe_pmu_sample {
> +	u64 cur;
> +};
> +
> +#define XE_MAX_GT_PER_TILE 2
> +
> +struct xe_pmu {
> +	/**
> +	 * @cpuhp: Struct used for CPU hotplug handling.
> +	 */
> +	struct {
> +		struct hlist_node node;
> +		unsigned int cpu;
> +	} cpuhp;
> +	/**
> +	 * @base: PMU base.
> +	 */
> +	struct pmu base;
> +	/**
> +	 * @closed: xe is unregistering.
> +	 */
> +	bool closed;
> +	/**
> +	 * @name: Name as registered with perf core.
> +	 */
> +	const char *name;
> +	/**
> +	 * @lock: Lock protecting enable mask and ref count handling.
> +	 */
> +	spinlock_t lock;
> +	/**
> +	 * @sample: Current and previous (raw) counters.
> +	 *
> +	 * These counters are updated when the device is awake.
> +	 *
> +	 */
> +	struct xe_pmu_sample sample[XE_MAX_GT_PER_TILE * __XE_NUM_PMU_SAMPLERS];
> +	/**
> +	 * @irq_count: Number of interrupts
> +	 *
> +	 * Intentionally unsigned long to avoid atomics or heuristics on 32bit.
> +	 * 4e9 interrupts are a lot and postprocessing can really deal with an
> +	 * occasional wraparound easily. It's 32bit after all.
> +	 */
> +	unsigned long irq_count;
> +	/**
> +	 * @events_attr_group: Device events attribute group.
> +	 */
> +	struct attribute_group events_attr_group;
> +	/**
> +	 * @xe_attr: Memory block holding device attributes.
> +	 */
> +	void *xe_attr;
> +	/**
> +	 * @pmu_attr: Memory block holding device attributes.
> +	 */
> +	void *pmu_attr;
> +};
> +
> +#endif
> diff --git a/include/uapi/drm/xe_drm.h b/include/uapi/drm/xe_drm.h
> index 965cd9527ff1..ed097056f944 100644
> --- a/include/uapi/drm/xe_drm.h
> +++ b/include/uapi/drm/xe_drm.h
> @@ -990,6 +990,22 @@ struct drm_xe_vm_madvise {
>  	__u64 reserved[2];
>  };
> 
> +/* PMU event config IDs */
> +
> +/*
> + * Top 4 bits of every counter are GT id.
> + */
> +#define __XE_PMU_GT_SHIFT (60)
> +
> +#define ___XE_PMU_OTHER(gt, x) \
> +	(((__u64)(x)) | ((__u64)(gt) << __XE_PMU_GT_SHIFT))
> +
> +#define XE_PMU_INTERRUPTS(gt)			___XE_PMU_OTHER(gt, 0)
> +#define XE_PMU_RENDER_GROUP_BUSY(gt)		___XE_PMU_OTHER(gt, 1)
> +#define XE_PMU_COPY_GROUP_BUSY(gt)		___XE_PMU_OTHER(gt, 2)
> +#define XE_PMU_MEDIA_GROUP_BUSY(gt)		___XE_PMU_OTHER(gt, 3)
> +#define XE_PMU_ANY_ENGINE_GROUP_BUSY(gt)	___XE_PMU_OTHER(gt, 4)
> +
>  #if defined(__cplusplus)
>  }
>  #endif
> --
> 2.25.1
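
For readers following the uapi above: the config encoding packs the GT id into
the top four bits of the 64-bit perf config and the counter id into the low
bits, so encode and decode are a shift and a mask. A minimal userspace-style
sketch (illustrative only, mirroring config_gt_id()/config_counter() from
xe_pmu.c):

	#include <stdint.h>
	#include <stdio.h>

	#define __XE_PMU_GT_SHIFT (60)
	#define ___XE_PMU_OTHER(gt, x) \
		(((uint64_t)(x)) | ((uint64_t)(gt) << __XE_PMU_GT_SHIFT))

	int main(void)
	{
		/* render-group-busy (counter id 1) on GT 1 */
		uint64_t config = ___XE_PMU_OTHER(1, 1);

		/* decode: top four bits are the GT id, the rest the counter */
		unsigned int gt_id = config >> __XE_PMU_GT_SHIFT;
		uint64_t counter = config & ~(~0ULL << __XE_PMU_GT_SHIFT);

		printf("config=0x%llx gt=%u counter=%llu\n",
		       (unsigned long long)config, gt_id,
		       (unsigned long long)counter);
		return 0;
	}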


^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [Intel-xe] [PATCH v2 1/2] drm/xe: Get GT clock to nanosecs
  2023-06-27 12:21 ` [Intel-xe] [PATCH v2 1/2] drm/xe: Get GT clock to nanosecs Aravind Iddamsetty
@ 2023-07-04  9:29   ` Upadhyay, Tejas
  2023-07-04 10:14     ` Upadhyay, Tejas
  2023-07-06  0:55   ` Dixit, Ashutosh
  1 sibling, 1 reply; 59+ messages in thread
From: Upadhyay, Tejas @ 2023-07-04  9:29 UTC (permalink / raw)
  To: Iddamsetty, Aravind, intel-xe



> -----Original Message-----
> From: Intel-xe <intel-xe-bounces@lists.freedesktop.org> On Behalf Of
> Aravind Iddamsetty
> Sent: Tuesday, June 27, 2023 5:51 PM
> To: intel-xe@lists.freedesktop.org
> Subject: [Intel-xe] [PATCH v2 1/2] drm/xe: Get GT clock to nanosecs
> 
> Helpers to get GT clock to nanosecs
> 
> Signed-off-by: Aravind Iddamsetty <aravind.iddamsetty@intel.com>
> ---
>  drivers/gpu/drm/xe/xe_gt_clock.c | 10 ++++++++++
> drivers/gpu/drm/xe/xe_gt_clock.h |  4 +++-
>  2 files changed, 13 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/gpu/drm/xe/xe_gt_clock.c b/drivers/gpu/drm/xe/xe_gt_clock.c
> index 7cf11078ff57..3689c7d5cf53 100644
> --- a/drivers/gpu/drm/xe/xe_gt_clock.c
> +++ b/drivers/gpu/drm/xe/xe_gt_clock.c
> @@ -78,3 +78,13 @@ int xe_gt_clock_init(struct xe_gt *gt)
>  	gt->info.clock_freq = freq;
>  	return 0;
>  }
> +
> +static u64 div_u64_roundup(u64 nom, u32 den)
> +{
> +	return div_u64(nom + den - 1, den);
> +}
> +
> +u64 xe_gt_clock_interval_to_ns(const struct xe_gt *gt, u64 count)
> +{
> +	return div_u64_roundup(count * NSEC_PER_SEC, gt->info.clock_freq);
> +}
> diff --git a/drivers/gpu/drm/xe/xe_gt_clock.h b/drivers/gpu/drm/xe/xe_gt_clock.h
> index 511923afd224..91fc9b7e83f5 100644
> --- a/drivers/gpu/drm/xe/xe_gt_clock.h
> +++ b/drivers/gpu/drm/xe/xe_gt_clock.h
> @@ -6,8 +6,10 @@
>  #ifndef _XE_GT_CLOCK_H_
>  #define _XE_GT_CLOCK_H_
> 
> +#include <linux/types.h>
> +
>  struct xe_gt;
> 
>  int xe_gt_clock_init(struct xe_gt *gt);
> -
> +u64 xe_gt_clock_interval_to_ns(const struct xe_gt *gt, u64 count);
>  #endif

Looks ok to me,
Reviewed-by: Tejas Upadhyay <tejas.upadhyay@intel.com>

> --
> 2.25.1
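
To make the rounding behaviour concrete: div_u64_roundup() implements
ceil(nom / den), so a GT cycle count converts to nanoseconds rounded up rather
than truncated. A worked example (assuming a 19.2 MHz GT reference clock,
which is only an illustrative value, not taken from the patch):

	/* one GT cycle at 19.2 MHz is 52.08 ns; rounding up gives 53 */
	u64 count = 1;
	u32 clock_freq = 19200000;
	u64 ns = div_u64(count * NSEC_PER_SEC + clock_freq - 1, clock_freq);
	/* ns == 53 */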


^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [Intel-xe] [PATCH v2 1/2] drm/xe: Get GT clock to nanosecs
  2023-07-04  9:29   ` Upadhyay, Tejas
@ 2023-07-04 10:14     ` Upadhyay, Tejas
  2023-07-05  4:46       ` Iddamsetty, Aravind
  0 siblings, 1 reply; 59+ messages in thread
From: Upadhyay, Tejas @ 2023-07-04 10:14 UTC (permalink / raw)
  To: Iddamsetty, Aravind, intel-xe



> -----Original Message-----
> From: Upadhyay, Tejas
> Sent: Tuesday, July 4, 2023 2:59 PM
> To: Aravind Iddamsetty <aravind.iddamsetty@intel.com>; intel-
> xe@lists.freedesktop.org
> Subject: RE: [Intel-xe] [PATCH v2 1/2] drm/xe: Get GT clock to nanosecs
> 
> 
> 
> > -----Original Message-----
> > From: Intel-xe <intel-xe-bounces@lists.freedesktop.org> On Behalf Of
> > Aravind Iddamsetty
> > Sent: Tuesday, June 27, 2023 5:51 PM
> > To: intel-xe@lists.freedesktop.org
> > Subject: [Intel-xe] [PATCH v2 1/2] drm/xe: Get GT clock to nanosecs
> >
> > Helpers to get GT clock to nanosecs
> >
> > Signed-off-by: Aravind Iddamsetty <aravind.iddamsetty@intel.com>
> > ---
> >  drivers/gpu/drm/xe/xe_gt_clock.c | 10 ++++++++++
> > drivers/gpu/drm/xe/xe_gt_clock.h |  4 +++-
> >  2 files changed, 13 insertions(+), 1 deletion(-)
> >
> > diff --git a/drivers/gpu/drm/xe/xe_gt_clock.c b/drivers/gpu/drm/xe/xe_gt_clock.c
> > index 7cf11078ff57..3689c7d5cf53 100644
> > --- a/drivers/gpu/drm/xe/xe_gt_clock.c
> > +++ b/drivers/gpu/drm/xe/xe_gt_clock.c
> > @@ -78,3 +78,13 @@ int xe_gt_clock_init(struct xe_gt *gt)
> >  	gt->info.clock_freq = freq;
> >  	return 0;
> >  }
> > +
> > +static u64 div_u64_roundup(u64 nom, u32 den)
> > +{
> > +	return div_u64(nom + den - 1, den);
> > +}

Also, could this API be moved to a more common place, like xe_drv.h, so others can use it when needed?

> > +
> > +u64 xe_gt_clock_interval_to_ns(const struct xe_gt *gt, u64 count)
> > +{
> > +	return div_u64_roundup(count * NSEC_PER_SEC, gt->info.clock_freq);
> > +}
> > diff --git a/drivers/gpu/drm/xe/xe_gt_clock.h b/drivers/gpu/drm/xe/xe_gt_clock.h
> > index 511923afd224..91fc9b7e83f5 100644
> > --- a/drivers/gpu/drm/xe/xe_gt_clock.h
> > +++ b/drivers/gpu/drm/xe/xe_gt_clock.h
> > @@ -6,8 +6,10 @@
> >  #ifndef _XE_GT_CLOCK_H_
> >  #define _XE_GT_CLOCK_H_
> >
> > +#include <linux/types.h>
> > +
> >  struct xe_gt;
> >
> >  int xe_gt_clock_init(struct xe_gt *gt);
> > -
> > +u64 xe_gt_clock_interval_to_ns(const struct xe_gt *gt, u64 count);
> >  #endif
> 
> Looks ok to me,
> Reviewed-by: Tejas Upadhyay <tejas.upadhyay@intel.com>
> 
> > --
> > 2.25.1


^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [Intel-xe] [PATCH v2 2/2] drm/xe/pmu: Enable PMU interface
  2023-07-04  9:10   ` Upadhyay, Tejas
@ 2023-07-05  4:42     ` Iddamsetty, Aravind
  0 siblings, 0 replies; 59+ messages in thread
From: Iddamsetty, Aravind @ 2023-07-05  4:42 UTC (permalink / raw)
  To: Upadhyay, Tejas, intel-xe; +Cc: Bommu, Krishnaiah, Ursulin, Tvrtko



On 04-07-2023 14:40, Upadhyay, Tejas wrote:
> 
> 
>> -----Original Message-----
>> From: Intel-xe <intel-xe-bounces@lists.freedesktop.org> On Behalf Of
>> Aravind Iddamsetty
>> Sent: Tuesday, June 27, 2023 5:51 PM
>> To: intel-xe@lists.freedesktop.org
>> Cc: Bommu, Krishnaiah <krishnaiah.bommu@intel.com>; Ursulin, Tvrtko
>> <tvrtko.ursulin@intel.com>
>> Subject: [Intel-xe] [PATCH v2 2/2] drm/xe/pmu: Enable PMU interface
>>
>> [snip]
>> diff --git a/drivers/gpu/drm/xe/xe_device_types.h b/drivers/gpu/drm/xe/xe_device_types.h
>> index 0226d44a6af2..3ba99aae92b9 100644
>> --- a/drivers/gpu/drm/xe/xe_device_types.h
>> +++ b/drivers/gpu/drm/xe/xe_device_types.h
>> @@ -15,6 +15,7 @@
>>  #include "xe_devcoredump_types.h"
>>  #include "xe_gt_types.h"
>>  #include "xe_platform_types.h"
>> +#include "xe_pmu.h"
>>  #include "xe_step_types.h"
>>
>>  #if IS_ENABLED(CONFIG_DRM_XE_DISPLAY)
>> @@ -332,6 +333,9 @@ struct xe_device {
>>  	/** @d3cold_allowed: Indicates if d3cold is a valid device state */
>>  	bool d3cold_allowed;
>>
>> +	/* @pmu: performance monitoring unit */
> 
> /** @pmu: performance monitoring unit */

Right, I noticed this; CI complained about it.
> 
>> +	struct xe_pmu pmu;
>> +
>>  	/* private: */
>>
>> [snip]
>> +static u64 engine_group_busyness_read(struct xe_gt *gt, u64 config)
>> +{
>> +	int sample_type = engine_busyness_sample_type(config);
>> +	struct xe_device *xe = gt->tile->xe;
>> +	const unsigned int gt_id = gt->info.id;
>> +	struct xe_pmu *pmu = &xe->pmu;
>> +	bool device_awake;
>> +	unsigned long flags;
>> +	u64 val;
> 
> In many places the Christmas tree order for variable declaration groups is not followed. Please apply it wherever applicable.

Sure, I will take care of it.
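
For reference, reverse Christmas tree ordering (longest declaration first,
dependencies permitting) would arrange the locals in
engine_group_busyness_read() roughly as below; this is only a sketch of the
convention, not the final code:

	int sample_type = engine_busyness_sample_type(config);
	const unsigned int gt_id = gt->info.id;
	struct xe_device *xe = gt->tile->xe;
	struct xe_pmu *pmu = &xe->pmu;
	unsigned long flags;
	bool device_awake;
	u64 val;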
> 
>> [snip]
>> +static u64 __xe_pmu_event_read(struct perf_event *event)
>> +{
>> +	struct xe_device *xe =
>> +		container_of(event->pmu, typeof(*xe), pmu.base);
>> +	const unsigned int gt_id = config_gt_id(event->attr.config);
>> +	const u64 config = config_counter(event->attr.config);
>> +	struct xe_gt *gt = xe_device_get_gt(xe, gt_id);
>> +	struct xe_pmu *pmu = &xe->pmu;
>> +	u64 val = 0;
>> +
>> +	switch (config) {
>> +	case XE_PMU_INTERRUPTS(0):
>> +		val = READ_ONCE(pmu->irq_count);
>> +		break;
>> +	case XE_PMU_RENDER_GROUP_BUSY(0):
>> +	case XE_PMU_COPY_GROUP_BUSY(0):
>> +	case XE_PMU_ANY_ENGINE_GROUP_BUSY(0):
>> +	case XE_PMU_MEDIA_GROUP_BUSY(0):
>> +		val = engine_group_busyness_read(gt, config);
>> +	}
> 
> Do you need to handle the default case here?
Ideally I could, but an unknown config will not even reach this far; still, to
satisfy the semantics of the switch I'll put in a default.
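
A minimal sketch of what the switch could look like with a default arm
(illustrative only; the exact warning text is an assumption):

	switch (config) {
	case XE_PMU_INTERRUPTS(0):
		val = READ_ONCE(pmu->irq_count);
		break;
	case XE_PMU_RENDER_GROUP_BUSY(0):
	case XE_PMU_COPY_GROUP_BUSY(0):
	case XE_PMU_ANY_ENGINE_GROUP_BUSY(0):
	case XE_PMU_MEDIA_GROUP_BUSY(0):
		val = engine_group_busyness_read(gt, config);
		break;
	default:
		drm_warn(&xe->drm, "unknown pmu event\n");
	}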
> 
> Also, if possible, can you please think about breaking this up into smaller patches? It's a huge patch to review. Split it only if it makes sense.

Well, if I split it up, most of the routines will be dummies in one of the
patches. But let me see what I can do when I spin the next version.

Thanks,
Aravind.
> 
> Thanks,
> Tejas
>> [snip]

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [Intel-xe] [PATCH v2 1/2] drm/xe: Get GT clock to nanosecs
  2023-07-04 10:14     ` Upadhyay, Tejas
@ 2023-07-05  4:46       ` Iddamsetty, Aravind
  0 siblings, 0 replies; 59+ messages in thread
From: Iddamsetty, Aravind @ 2023-07-05  4:46 UTC (permalink / raw)
  To: Upadhyay, Tejas, intel-xe



On 04-07-2023 15:44, Upadhyay, Tejas wrote:
> 
> 
>> -----Original Message-----
>> From: Upadhyay, Tejas
>> Sent: Tuesday, July 4, 2023 2:59 PM
>> To: Aravind Iddamsetty <aravind.iddamsetty@intel.com>; intel-
>> xe@lists.freedesktop.org
>> Subject: RE: [Intel-xe] [PATCH v2 1/2] drm/xe: Get GT clock to nanosecs
>>
>>
>>
>>> -----Original Message-----
>>> From: Intel-xe <intel-xe-bounces@lists.freedesktop.org> On Behalf Of
>>> Aravind Iddamsetty
>>> Sent: Tuesday, June 27, 2023 5:51 PM
>>> To: intel-xe@lists.freedesktop.org
>>> Subject: [Intel-xe] [PATCH v2 1/2] drm/xe: Get GT clock to nanosecs
>>>
>>> Helpers to get GT clock to nanosecs
>>>
>>> Signed-off-by: Aravind Iddamsetty <aravind.iddamsetty@intel.com>
>>> ---
>>>  drivers/gpu/drm/xe/xe_gt_clock.c | 10 ++++++++++
>>> drivers/gpu/drm/xe/xe_gt_clock.h |  4 +++-
>>>  2 files changed, 13 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/drivers/gpu/drm/xe/xe_gt_clock.c b/drivers/gpu/drm/xe/xe_gt_clock.c
>>> index 7cf11078ff57..3689c7d5cf53 100644
>>> --- a/drivers/gpu/drm/xe/xe_gt_clock.c
>>> +++ b/drivers/gpu/drm/xe/xe_gt_clock.c
>>> @@ -78,3 +78,13 @@ int xe_gt_clock_init(struct xe_gt *gt)
>>>  	gt->info.clock_freq = freq;
>>>  	return 0;
>>>  }
>>> +
>>> +static u64 div_u64_roundup(u64 nom, u32 den)
>>> +{
>>> +	return div_u64(nom + den - 1, den);
>>> +}
> 
> Also this API can be moved to more common place like, xe_drv.h for others to use when needed?

xe_drv.h is not the right place to move it to; at least for now I do not see
any use other than here, so I'll let it stay where it is.

Thanks,
Aravind.
> 
>>> +
>>> +u64 xe_gt_clock_interval_to_ns(const struct xe_gt *gt, u64 count) {
>>> +	return div_u64_roundup(count * NSEC_PER_SEC, gt-
>>>> info.clock_freq); }
>>> diff --git a/drivers/gpu/drm/xe/xe_gt_clock.h
>>> b/drivers/gpu/drm/xe/xe_gt_clock.h
>>> index 511923afd224..91fc9b7e83f5 100644
>>> --- a/drivers/gpu/drm/xe/xe_gt_clock.h
>>> +++ b/drivers/gpu/drm/xe/xe_gt_clock.h
>>> @@ -6,8 +6,10 @@
>>>  #ifndef _XE_GT_CLOCK_H_
>>>  #define _XE_GT_CLOCK_H_
>>>
>>> +#include <linux/types.h>
>>> +
>>>  struct xe_gt;
>>>
>>>  int xe_gt_clock_init(struct xe_gt *gt);
>>> -
>>> +u64 xe_gt_clock_interval_to_ns(const struct xe_gt *gt, u64 count);
>>>  #endif
>>
>> Looks ok to me,
>> Reviewed-by: Tejas Upadhyay <tejas.upadhyay@intel.com>
>>
>>> --
>>> 2.25.1
> 

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [Intel-xe] [PATCH v2 2/2] drm/xe/pmu: Enable PMU interface
  2023-07-04  3:34   ` Ghimiray, Himal Prasad
@ 2023-07-05  4:52     ` Iddamsetty, Aravind
  0 siblings, 0 replies; 59+ messages in thread
From: Iddamsetty, Aravind @ 2023-07-05  4:52 UTC (permalink / raw)
  To: Ghimiray, Himal Prasad, intel-xe; +Cc: Bommu, Krishnaiah, Ursulin, Tvrtko



On 04-07-2023 09:04, Ghimiray, Himal Prasad wrote:
> Hi Aravind,
> 
>> -----Original Message-----
>> From: Intel-xe <intel-xe-bounces@lists.freedesktop.org> On Behalf Of
>> Aravind Iddamsetty
>> Sent: 27 June 2023 17:51
>> To: intel-xe@lists.freedesktop.org
>> Cc: Bommu, Krishnaiah <krishnaiah.bommu@intel.com>; Ursulin, Tvrtko
>> <tvrtko.ursulin@intel.com>
>> Subject: [Intel-xe] [PATCH v2 2/2] drm/xe/pmu: Enable PMU interface
>>
>> [snip]
>> diff --git a/drivers/gpu/drm/xe/xe_irq.c b/drivers/gpu/drm/xe/xe_irq.c
>> index b4ed1e4a3388..cb943fb94ec7 100644
>> --- a/drivers/gpu/drm/xe/xe_irq.c
>> +++ b/drivers/gpu/drm/xe/xe_irq.c
>> @@ -27,6 +27,24 @@
>>  #define IIR(offset)				XE_REG(offset + 0x8)
>>  #define IER(offset)				XE_REG(offset + 0xc)
>>
>> +/*
>> + * Interrupt statistic for PMU. Increments the counter only if the
>> + * interrupt originated from the GPU so interrupts from a device which
>> + * shares the interrupt line are not accounted.
>> + */
>> +static inline void xe_pmu_irq_stats(struct xe_device *xe,
>> +				    irqreturn_t res)
> The res parameter seems redundant; the caller should call xe_pmu_irq_stats()
> only in the IRQ_HANDLED case. Do we really need to pass it as an argument
> from the caller and check it in this function?
Yeah, makes sense, as it is invoked just before returning IRQ_HANDLED; see the sketch below.

Thanks,
Aravind.
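
A sketch of the simplified helper with the res parameter dropped, as
suggested above (illustrative; callers would then invoke it only on the
IRQ_HANDLED return path):

	static inline void xe_pmu_irq_stats(struct xe_device *xe)
	{
		/* Count only interrupts this device actually handled. */
		WRITE_ONCE(xe->pmu.irq_count, xe->pmu.irq_count + 1);
	}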
> 
> BR
> Himal 
>> +{
>> +	if (unlikely(res != IRQ_HANDLED))
>> +		return;
>> +
>> +	/*
>> +	 * A clever compiler translates that into INC. A not so clever one
>> +	 * should at least prevent store tearing.
>> +	 */
>> +	WRITE_ONCE(xe->pmu.irq_count, xe->pmu.irq_count + 1);
>> +}
>> +
>>  static void assert_iir_is_zero(struct xe_gt *mmio, struct xe_reg reg)
>>  {
>>  	u32 val = xe_mmio_read32(mmio, reg);
>> @@ -325,6 +343,8 @@ static irqreturn_t xelp_irq_handler(int irq, void *arg)
>>
>>  	xe_display_irq_enable(xe, gu_misc_iir);
>>
>> +	xe_pmu_irq_stats(xe, IRQ_HANDLED);
>> +
>>  	return IRQ_HANDLED;
>>  }
>>
>> @@ -414,6 +434,8 @@ static irqreturn_t dg1_irq_handler(int irq, void *arg)
>>  	dg1_intr_enable(xe, false);
>>  	xe_display_irq_enable(xe, gu_misc_iir);
>>
>> +	xe_pmu_irq_stats(xe, IRQ_HANDLED);
>> +
>>  	return IRQ_HANDLED;
>>  }
>>
>> diff --git a/drivers/gpu/drm/xe/xe_module.c b/drivers/gpu/drm/xe/xe_module.c
>> index 75e5be939f53..f6fe89748525 100644
>> --- a/drivers/gpu/drm/xe/xe_module.c
>> +++ b/drivers/gpu/drm/xe/xe_module.c
>> @@ -12,6 +12,7 @@
>>  #include "xe_hw_fence.h"
>>  #include "xe_module.h"
>>  #include "xe_pci.h"
>> +#include "xe_pmu.h"
>>  #include "xe_sched_job.h"
>>
>>  bool enable_guc = true;
>> @@ -49,6 +50,10 @@ static const struct init_funcs init_funcs[] = {
>>  		.init = xe_sched_job_module_init,
>>  		.exit = xe_sched_job_module_exit,
>>  	},
>> +	{
>> +		.init = xe_pmu_init,
>> +		.exit = xe_pmu_exit,
>> +	},
>>  	{
>>  		.init = xe_register_pci_driver,
>>  		.exit = xe_unregister_pci_driver,
>> diff --git a/drivers/gpu/drm/xe/xe_pmu.c b/drivers/gpu/drm/xe/xe_pmu.c
>> new file mode 100644
>> index 000000000000..bef1895be9f7
>> --- /dev/null
>> +++ b/drivers/gpu/drm/xe/xe_pmu.c
>> @@ -0,0 +1,739 @@
>> +/*
>> + * SPDX-License-Identifier: MIT
>> + *
>> + * Copyright © 2023 Intel Corporation
>> + */
>> +
>> +#include <drm/drm_drv.h>
>> +#include <drm/drm_managed.h>
>> +#include <drm/xe_drm.h>
>> +
>> +#include "regs/xe_gt_regs.h"
>> +#include "xe_device.h"
>> +#include "xe_gt_clock.h"
>> +#include "xe_mmio.h"
>> +
>> +static cpumask_t xe_pmu_cpumask;
>> +static unsigned int xe_pmu_target_cpu = -1;
>> +
>> +static unsigned int config_gt_id(const u64 config)
>> +{
>> +	return config >> __XE_PMU_GT_SHIFT;
>> +}
>> +
>> +static u64 config_counter(const u64 config)
>> +{
>> +	return config & ~(~0ULL << __XE_PMU_GT_SHIFT);
>> +}
>> +
>> +static unsigned int
>> +__sample_idx(struct xe_pmu *pmu, unsigned int gt_id, int sample)
>> +{
>> +	unsigned int idx = gt_id * __XE_NUM_PMU_SAMPLERS + sample;
>> +
>> +	XE_BUG_ON(idx >= ARRAY_SIZE(pmu->sample));
>> +
>> +	return idx;
>> +}
>> +
>> +static u64 read_sample(struct xe_pmu *pmu, unsigned int gt_id, int sample)
>> +{
>> +	return pmu->sample[__sample_idx(pmu, gt_id, sample)].cur;
>> +}
>> +
>> +static void
>> +store_sample(struct xe_pmu *pmu, unsigned int gt_id, int sample, u64 val)
>> +{
>> +	pmu->sample[__sample_idx(pmu, gt_id, sample)].cur = val;
>> +}
>> +
>> +static int engine_busyness_sample_type(u64 config)
>> +{
>> +	int type = 0;
>> +
>> +	switch (config) {
>> +	case XE_PMU_RENDER_GROUP_BUSY(0):
>> +		type =  __XE_SAMPLE_RENDER_GROUP_BUSY;
>> +		break;
>> +	case XE_PMU_COPY_GROUP_BUSY(0):
>> +		type = __XE_SAMPLE_COPY_GROUP_BUSY;
>> +		break;
>> +	case XE_PMU_MEDIA_GROUP_BUSY(0):
>> +		type = __XE_SAMPLE_MEDIA_GROUP_BUSY;
>> +		break;
>> +	case XE_PMU_ANY_ENGINE_GROUP_BUSY(0):
>> +		type = __XE_SAMPLE_ANY_ENGINE_GROUP_BUSY;
>> +		break;
>> +	}
>> +
>> +	return type;
>> +}
>> +
>> +static void xe_pmu_event_destroy(struct perf_event *event)
>> +{
>> +	struct xe_device *xe =
>> +		container_of(event->pmu, typeof(*xe), pmu.base);
>> +
>> +	drm_WARN_ON(&xe->drm, event->parent);
>> +
>> +	drm_dev_put(&xe->drm);
>> +}
>> +
>> +static u64 __engine_group_busyness_read(struct xe_gt *gt, u64 config)
>> +{
>> +	u64 val = 0;
>> +
>> +	switch (config) {
>> +	case XE_PMU_RENDER_GROUP_BUSY(0):
>> +		val = xe_mmio_read32(gt, XE_OAG_RENDER_BUSY_FREE);
>> +		break;
>> +	case XE_PMU_COPY_GROUP_BUSY(0):
>> +		val = xe_mmio_read32(gt, XE_OAG_BLT_BUSY_FREE);
>> +		break;
>> +	case XE_PMU_MEDIA_GROUP_BUSY(0):
>> +		val = xe_mmio_read32(gt, XE_OAG_ANY_MEDIA_FF_BUSY_FREE);
>> +		break;
>> +	case XE_PMU_ANY_ENGINE_GROUP_BUSY(0):
>> +		val = xe_mmio_read32(gt, XE_OAG_RC0_ANY_ENGINE_BUSY_FREE);
>> +		break;
>> +	default:
>> +		drm_warn(&gt->tile->xe->drm, "unknown pmu event\n");
>> +	}
>> +
>> +	return xe_gt_clock_interval_to_ns(gt, val * 16);
>> +}
>> +
>> +static u64 engine_group_busyness_read(struct xe_gt *gt, u64 config)
>> +{
>> +	int sample_type = engine_busyness_sample_type(config);
>> +	struct xe_device *xe = gt->tile->xe;
>> +	const unsigned int gt_id = gt->info.id;
>> +	struct xe_pmu *pmu = &xe->pmu;
>> +	bool device_awake;
>> +	unsigned long flags;
>> +	u64 val;
>> +
>> +	/*
>> +	 * found no better way to check if device is awake or not. Before
>> +	 * we suspend we set the submission_state.enabled to false.
>> +	 */
>> +	device_awake = gt->uc.guc.submission_state.enabled ? true : false;
>> +	if (device_awake)
>> +		val = __engine_group_busyness_read(gt, config);
>> +
>> +	spin_lock_irqsave(&pmu->lock, flags);
>> +
>> +	if (device_awake)
>> +		store_sample(pmu, gt_id, sample_type, val);
>> +	else
>> +		val = read_sample(pmu, gt_id, sample_type);
>> +
>> +	spin_unlock_irqrestore(&pmu->lock, flags);
>> +
>> +	return val;
>> +}
>> +
>> +void engine_group_busyness_store(struct xe_gt *gt) {
>> +	struct xe_pmu *pmu = &gt->tile->xe->pmu;
>> +	unsigned int gt_id = gt->info.id;
>> +	unsigned long flags;
>> +
>> +	spin_lock_irqsave(&pmu->lock, flags);
>> +
>> +	store_sample(pmu, gt_id, __XE_SAMPLE_RENDER_GROUP_BUSY,
>> +		     __engine_group_busyness_read(gt,
>> XE_PMU_RENDER_GROUP_BUSY(0)));
>> +	store_sample(pmu, gt_id, __XE_SAMPLE_COPY_GROUP_BUSY,
>> +		     __engine_group_busyness_read(gt,
>> XE_PMU_COPY_GROUP_BUSY(0)));
>> +	store_sample(pmu, gt_id, __XE_SAMPLE_MEDIA_GROUP_BUSY,
>> +		     __engine_group_busyness_read(gt,
>> XE_PMU_MEDIA_GROUP_BUSY(0)));
>> +	store_sample(pmu, gt_id,
>> __XE_SAMPLE_ANY_ENGINE_GROUP_BUSY,
>> +		     __engine_group_busyness_read(gt,
>> +XE_PMU_ANY_ENGINE_GROUP_BUSY(0)));
>> +
>> +	spin_unlock_irqrestore(&pmu->lock, flags); }
>> +
>> +static int
>> +config_status(struct xe_device *xe, u64 config) {
>> +	unsigned int max_gt_id = xe->info.gt_count > 1 ? 1 : 0;
>> +	unsigned int gt_id = config_gt_id(config);
>> +	struct xe_gt *gt = xe_device_get_gt(xe, gt_id);
>> +
>> +	if (gt_id > max_gt_id)
>> +		return -ENOENT;
>> +
>> +	switch (config_counter(config)) {
>> +	case XE_PMU_INTERRUPTS(0):
>> +		if (gt_id)
>> +			return -ENOENT;
>> +		break;
>> +	case XE_PMU_RENDER_GROUP_BUSY(0):
>> +	case XE_PMU_COPY_GROUP_BUSY(0):
>> +	case XE_PMU_ANY_ENGINE_GROUP_BUSY(0):
>> +		if (GRAPHICS_VER(xe) < 12)
>> +			return -ENOENT;
>> +		break;
>> +	case XE_PMU_MEDIA_GROUP_BUSY(0):
>> +		if (MEDIA_VER(xe) >= 13 && gt->info.type !=
>> XE_GT_TYPE_MEDIA)
>> +			return -ENOENT;
>> +		break;
>> +	default:
>> +		return -ENOENT;
>> +	}
>> +
>> +	return 0;
>> +}
>> +
>> +static int xe_pmu_event_init(struct perf_event *event) {
>> +	struct xe_device *xe =
>> +		container_of(event->pmu, typeof(*xe), pmu.base);
>> +	struct xe_pmu *pmu = &xe->pmu;
>> +	int ret;
>> +
>> +	if (pmu->closed)
>> +		return -ENODEV;
>> +
>> +	if (event->attr.type != event->pmu->type)
>> +		return -ENOENT;
>> +
>> +	/* unsupported modes and filters */
>> +	if (event->attr.sample_period) /* no sampling */
>> +		return -EINVAL;
>> +
>> +	if (has_branch_stack(event))
>> +		return -EOPNOTSUPP;
>> +
>> +	if (event->cpu < 0)
>> +		return -EINVAL;
>> +
>> +	/* only allow running on one cpu at a time */
>> +	if (!cpumask_test_cpu(event->cpu, &xe_pmu_cpumask))
>> +		return -EINVAL;
>> +
>> +	ret = config_status(xe, event->attr.config);
>> +	if (ret)
>> +		return ret;
>> +
>> +	if (!event->parent) {
>> +		drm_dev_get(&xe->drm);
>> +		event->destroy = xe_pmu_event_destroy;
>> +	}
>> +
>> +	return 0;
>> +}
>> +
>> +static u64 __xe_pmu_event_read(struct perf_event *event) {
>> +	struct xe_device *xe =
>> +		container_of(event->pmu, typeof(*xe), pmu.base);
>> +	const unsigned int gt_id = config_gt_id(event->attr.config);
>> +	const u64 config = config_counter(event->attr.config);
>> +	struct xe_gt *gt = xe_device_get_gt(xe, gt_id);
>> +	struct xe_pmu *pmu = &xe->pmu;
>> +	u64 val = 0;
>> +
>> +	switch (config) {
>> +	case XE_PMU_INTERRUPTS(0):
>> +		val = READ_ONCE(pmu->irq_count);
>> +		break;
>> +	case XE_PMU_RENDER_GROUP_BUSY(0):
>> +	case XE_PMU_COPY_GROUP_BUSY(0):
>> +	case XE_PMU_ANY_ENGINE_GROUP_BUSY(0):
>> +	case XE_PMU_MEDIA_GROUP_BUSY(0):
>> +		val = engine_group_busyness_read(gt, config);
>> +	}
>> +
>> +	return val;
>> +}
>> +
>> +static void xe_pmu_event_read(struct perf_event *event) {
>> +	struct xe_device *xe =
>> +		container_of(event->pmu, typeof(*xe), pmu.base);
>> +	struct hw_perf_event *hwc = &event->hw;
>> +	struct xe_pmu *pmu = &xe->pmu;
>> +	u64 prev, new;
>> +
>> +	if (pmu->closed) {
>> +		event->hw.state = PERF_HES_STOPPED;
>> +		return;
>> +	}
>> +again:
>> +	prev = local64_read(&hwc->prev_count);
>> +	new = __xe_pmu_event_read(event);
>> +
>> +	if (local64_cmpxchg(&hwc->prev_count, prev, new) != prev)
>> +		goto again;
>> +
>> +	local64_add(new - prev, &event->count); }
>> +
>> +static void xe_pmu_enable(struct perf_event *event) {
>> +	/*
>> +	 * Store the current counter value so we can report the correct delta
>> +	 * for all listeners. Even when the event was already enabled and has
>> +	 * an existing non-zero value.
>> +	 */
>> +	local64_set(&event->hw.prev_count,
>> __xe_pmu_event_read(event)); }
>> +
>> +static void xe_pmu_event_start(struct perf_event *event, int flags) {
>> +	struct xe_device *xe =
>> +		container_of(event->pmu, typeof(*xe), pmu.base);
>> +	struct xe_pmu *pmu = &xe->pmu;
>> +
>> +	if (pmu->closed)
>> +		return;
>> +
>> +	xe_pmu_enable(event);
>> +	event->hw.state = 0;
>> +}
>> +
>> +static void xe_pmu_event_stop(struct perf_event *event, int flags) {
>> +	if (flags & PERF_EF_UPDATE)
>> +		xe_pmu_event_read(event);
>> +
>> +	event->hw.state = PERF_HES_STOPPED;
>> +}
>> +
>> +static int xe_pmu_event_add(struct perf_event *event, int flags) {
>> +	struct xe_device *xe =
>> +		container_of(event->pmu, typeof(*xe), pmu.base);
>> +	struct xe_pmu *pmu = &xe->pmu;
>> +
>> +	if (pmu->closed)
>> +		return -ENODEV;
>> +
>> +	if (flags & PERF_EF_START)
>> +		xe_pmu_event_start(event, flags);
>> +
>> +	return 0;
>> +}
>> +
>> +static void xe_pmu_event_del(struct perf_event *event, int flags) {
>> +	xe_pmu_event_stop(event, PERF_EF_UPDATE); }
>> +
>> +static int xe_pmu_event_event_idx(struct perf_event *event) {
>> +	return 0;
>> +}
>> +
>> +struct xe_str_attribute {
>> +	struct device_attribute attr;
>> +	const char *str;
>> +};
>> +
>> +static ssize_t xe_pmu_format_show(struct device *dev,
>> +				  struct device_attribute *attr, char *buf) {
>> +	struct xe_str_attribute *eattr;
>> +
>> +	eattr = container_of(attr, struct xe_str_attribute, attr);
>> +	return sprintf(buf, "%s\n", eattr->str); }
>> +
>> +#define XE_PMU_FORMAT_ATTR(_name, _config) \
>> +	(&((struct xe_str_attribute[]) { \
>> +		{ .attr = __ATTR(_name, 0444, xe_pmu_format_show,
>> NULL), \
>> +		  .str = _config, } \
>> +	})[0].attr.attr)
>> +
>> +static struct attribute *xe_pmu_format_attrs[] = {
>> +	XE_PMU_FORMAT_ATTR(xe_eventid, "config:0-20"),
>> +	NULL,
>> +};
>> +
>> +static const struct attribute_group xe_pmu_format_attr_group = {
>> +	.name = "format",
>> +	.attrs = xe_pmu_format_attrs,
>> +};
>> +
>> +struct xe_ext_attribute {
>> +	struct device_attribute attr;
>> +	unsigned long val;
>> +};
>> +
>> +static ssize_t xe_pmu_event_show(struct device *dev,
>> +				 struct device_attribute *attr, char *buf) {
>> +	struct xe_ext_attribute *eattr;
>> +
>> +	eattr = container_of(attr, struct xe_ext_attribute, attr);
>> +	return sprintf(buf, "config=0x%lx\n", eattr->val); }
>> +
>> +static ssize_t cpumask_show(struct device *dev,
>> +			    struct device_attribute *attr, char *buf) {
>> +	return cpumap_print_to_pagebuf(true, buf, &xe_pmu_cpumask); }
>> +
>> +static DEVICE_ATTR_RO(cpumask);
>> +
>> +static struct attribute *xe_cpumask_attrs[] = {
>> +	&dev_attr_cpumask.attr,
>> +	NULL,
>> +};
>> +
>> +static const struct attribute_group xe_pmu_cpumask_attr_group = {
>> +	.attrs = xe_cpumask_attrs,
>> +};
>> +
>> +#define __event(__counter, __name, __unit) \ { \
>> +	.counter = (__counter), \
>> +	.name = (__name), \
>> +	.unit = (__unit), \
>> +	.global = false, \
>> +}
>> +
>> +#define __global_event(__counter, __name, __unit) \ { \
>> +	.counter = (__counter), \
>> +	.name = (__name), \
>> +	.unit = (__unit), \
>> +	.global = true, \
>> +}
>> +
>> +static struct xe_ext_attribute *
>> +add_xe_attr(struct xe_ext_attribute *attr, const char *name, u64
>> +config) {
>> +	sysfs_attr_init(&attr->attr.attr);
>> +	attr->attr.attr.name = name;
>> +	attr->attr.attr.mode = 0444;
>> +	attr->attr.show = xe_pmu_event_show;
>> +	attr->val = config;
>> +
>> +	return ++attr;
>> +}
>> +
>> +static struct perf_pmu_events_attr *
>> +add_pmu_attr(struct perf_pmu_events_attr *attr, const char *name,
>> +	     const char *str)
>> +{
>> +	sysfs_attr_init(&attr->attr.attr);
>> +	attr->attr.attr.name = name;
>> +	attr->attr.attr.mode = 0444;
>> +	attr->attr.show = perf_event_sysfs_show;
>> +	attr->event_str = str;
>> +
>> +	return ++attr;
>> +}
>> +
>> +static struct attribute **
>> +create_event_attributes(struct xe_pmu *pmu) {
>> +	struct xe_device *xe = container_of(pmu, typeof(*xe), pmu);
>> +	static const struct {
>> +		unsigned int counter;
>> +		const char *name;
>> +		const char *unit;
>> +		bool global;
>> +	} events[] = {
>> +		__global_event(0, "interrupts", NULL),
>> +		__event(1, "render-group-busy", "ns"),
>> +		__event(2, "copy-group-busy", "ns"),
>> +		__event(3, "media-group-busy", "ns"),
>> +		__event(4, "any-engine-group-busy", "ns"),
>> +	};
>> +
>> +	unsigned int count = 0;
>> +	struct perf_pmu_events_attr *pmu_attr = NULL, *pmu_iter;
>> +	struct xe_ext_attribute *xe_attr = NULL, *xe_iter;
>> +	struct attribute **attr = NULL, **attr_iter;
>> +	struct xe_gt *gt;
>> +	unsigned int i, j;
>> +
>> +	/* Count how many counters we will be exposing. */
>> +	for_each_gt(gt, xe, j) {
>> +		for (i = 0; i < ARRAY_SIZE(events); i++) {
>> +			u64 config = ___XE_PMU_OTHER(j,
>> events[i].counter);
>> +
>> +			if (!config_status(xe, config))
>> +				count++;
>> +		}
>> +	}
>> +
>> +	/* Allocate attribute objects and table. */
>> +	xe_attr = kcalloc(count, sizeof(*xe_attr), GFP_KERNEL);
>> +	if (!xe_attr)
>> +		goto err_alloc;
>> +
>> +	pmu_attr = kcalloc(count, sizeof(*pmu_attr), GFP_KERNEL);
>> +	if (!pmu_attr)
>> +		goto err_alloc;
>> +
>> +	/* Max one pointer of each attribute type plus a termination entry.
>> */
>> +	attr = kcalloc(count * 2 + 1, sizeof(*attr), GFP_KERNEL);
>> +	if (!attr)
>> +		goto err_alloc;
>> +
>> +	xe_iter = xe_attr;
>> +	pmu_iter = pmu_attr;
>> +	attr_iter = attr;
>> +
>> +	for_each_gt(gt, xe, j) {
>> +		for (i = 0; i < ARRAY_SIZE(events); i++) {
>> +			u64 config = ___XE_PMU_OTHER(j,
>> events[i].counter);
>> +			char *str;
>> +
>> +			if (config_status(xe, config))
>> +				continue;
>> +
>> +			if (events[i].global)
>> +				str = kstrdup(events[i].name, GFP_KERNEL);
>> +			else
>> +				str = kasprintf(GFP_KERNEL, "%s-gt%u",
>> +						events[i].name, j);
>> +			if (!str)
>> +				goto err;
>> +
>> +			*attr_iter++ = &xe_iter->attr.attr;
>> +			xe_iter = add_xe_attr(xe_iter, str, config);
>> +
>> +			if (events[i].unit) {
>> +				if (events[i].global)
>> +					str = kasprintf(GFP_KERNEL,
>> "%s.unit",
>> +							events[i].name);
>> +				else
>> +					str = kasprintf(GFP_KERNEL, "%s-
>> gt%u.unit",
>> +							events[i].name, j);
>> +				if (!str)
>> +					goto err;
>> +
>> +				*attr_iter++ = &pmu_iter->attr.attr;
>> +				pmu_iter = add_pmu_attr(pmu_iter, str,
>> +							events[i].unit);
>> +			}
>> +		}
>> +	}
>> +
>> +	pmu->xe_attr = xe_attr;
>> +	pmu->pmu_attr = pmu_attr;
>> +
>> +	return attr;
>> +
>> +err:
>> +	for (attr_iter = attr; *attr_iter; attr_iter++)
>> +		kfree((*attr_iter)->name);
>> +
>> +err_alloc:
>> +	kfree(attr);
>> +	kfree(xe_attr);
>> +	kfree(pmu_attr);
>> +
>> +	return NULL;
>> +}
>> +
>> +static void free_event_attributes(struct xe_pmu *pmu) {
>> +	struct attribute **attr_iter = pmu->events_attr_group.attrs;
>> +
>> +	for (; *attr_iter; attr_iter++)
>> +		kfree((*attr_iter)->name);
>> +
>> +	kfree(pmu->events_attr_group.attrs);
>> +	kfree(pmu->xe_attr);
>> +	kfree(pmu->pmu_attr);
>> +
>> +	pmu->events_attr_group.attrs = NULL;
>> +	pmu->xe_attr = NULL;
>> +	pmu->pmu_attr = NULL;
>> +}
>> +
>> +static int xe_pmu_cpu_online(unsigned int cpu, struct hlist_node *node)
>> +{
>> +	struct xe_pmu *pmu = hlist_entry_safe(node, typeof(*pmu),
>> cpuhp.node);
>> +
>> +	XE_BUG_ON(!pmu->base.event_init);
>> +
>> +	/* Select the first online CPU as a designated reader. */
>> +	if (cpumask_empty(&xe_pmu_cpumask))
>> +		cpumask_set_cpu(cpu, &xe_pmu_cpumask);
>> +
>> +	return 0;
>> +}
>> +
>> +static int xe_pmu_cpu_offline(unsigned int cpu, struct hlist_node
>> +*node) {
>> +	struct xe_pmu *pmu = hlist_entry_safe(node, typeof(*pmu),
>> cpuhp.node);
>> +	unsigned int target = xe_pmu_target_cpu;
>> +
>> +	XE_BUG_ON(!pmu->base.event_init);
>> +
>> +	/*
>> +	 * Unregistering an instance generates a CPU offline event which we
>> must
>> +	 * ignore to avoid incorrectly modifying the shared
>> xe_pmu_cpumask.
>> +	 */
>> +	if (pmu->closed)
>> +		return 0;
>> +
>> +	if (cpumask_test_and_clear_cpu(cpu, &xe_pmu_cpumask)) {
>> +		target = cpumask_any_but(topology_sibling_cpumask(cpu),
>> cpu);
>> +
>> +		/* Migrate events if there is a valid target */
>> +		if (target < nr_cpu_ids) {
>> +			cpumask_set_cpu(target, &xe_pmu_cpumask);
>> +			xe_pmu_target_cpu = target;
>> +		}
>> +	}
>> +
>> +	if (target < nr_cpu_ids && target != pmu->cpuhp.cpu) {
>> +		perf_pmu_migrate_context(&pmu->base, cpu, target);
>> +		pmu->cpuhp.cpu = target;
>> +	}
>> +
>> +	return 0;
>> +}
>> +
>> +static enum cpuhp_state cpuhp_slot = CPUHP_INVALID;
>> +
>> +int xe_pmu_init(void)
>> +{
>> +	int ret;
>> +
>> +	ret = cpuhp_setup_state_multi(CPUHP_AP_ONLINE_DYN,
>> +				      "perf/x86/intel/xe:online",
>> +				      xe_pmu_cpu_online,
>> +				      xe_pmu_cpu_offline);
>> +	if (ret < 0)
>> +		pr_notice("Failed to setup cpuhp state for xe PMU! (%d)\n",
>> +			  ret);
>> +	else
>> +		cpuhp_slot = ret;
>> +
>> +	return 0;
>> +}
>> +
>> +void xe_pmu_exit(void)
>> +{
>> +	if (cpuhp_slot != CPUHP_INVALID)
>> +		cpuhp_remove_multi_state(cpuhp_slot);
>> +}
>> +
>> +static int xe_pmu_register_cpuhp_state(struct xe_pmu *pmu) {
>> +	if (cpuhp_slot == CPUHP_INVALID)
>> +		return -EINVAL;
>> +
>> +	return cpuhp_state_add_instance(cpuhp_slot, &pmu-
>>> cpuhp.node); }
>> +
>> +static void xe_pmu_unregister_cpuhp_state(struct xe_pmu *pmu) {
>> +	cpuhp_state_remove_instance(cpuhp_slot, &pmu->cpuhp.node); }
>> +
>> +static void xe_pmu_unregister(struct drm_device *device, void *arg) {
>> +	struct xe_pmu *pmu = arg;
>> +
>> +	if (!pmu->base.event_init)
>> +		return;
>> +
>> +	/*
>> +	 * "Disconnect" the PMU callbacks - since all are atomic
>> synchronize_rcu
>> +	 * ensures all currently executing ones will have exited before we
>> +	 * proceed with unregistration.
>> +	 */
>> +	pmu->closed = true;
>> +	synchronize_rcu();
>> +
>> +	xe_pmu_unregister_cpuhp_state(pmu);
>> +
>> +	perf_pmu_unregister(&pmu->base);
>> +	pmu->base.event_init = NULL;
>> +	kfree(pmu->base.attr_groups);
>> +	kfree(pmu->name);
>> +	free_event_attributes(pmu);
>> +}
>> +
>> +static void init_samples(struct xe_pmu *pmu) {
>> +	struct xe_device *xe = container_of(pmu, typeof(*xe), pmu);
>> +	struct xe_gt *gt;
>> +	unsigned int i;
>> +
>> +	for_each_gt(gt, xe, i)
>> +		engine_group_busyness_store(gt);
>> +}
>> +
>> +void xe_pmu_register(struct xe_pmu *pmu) {
>> +	struct xe_device *xe = container_of(pmu, typeof(*xe), pmu);
>> +	const struct attribute_group *attr_groups[] = {
>> +		&xe_pmu_format_attr_group,
>> +		&pmu->events_attr_group,
>> +		&xe_pmu_cpumask_attr_group,
>> +		NULL
>> +	};
>> +
>> +	int ret = -ENOMEM;
>> +
>> +	spin_lock_init(&pmu->lock);
>> +	pmu->cpuhp.cpu = -1;
>> +	init_samples(pmu);
>> +
>> +	pmu->name = kasprintf(GFP_KERNEL,
>> +			      "xe_%s",
>> +			      dev_name(xe->drm.dev));
>> +	if (pmu->name)
>> +		/* tools/perf reserves colons as special. */
>> +		strreplace((char *)pmu->name, ':', '_');
>> +
>> +	if (!pmu->name)
>> +		goto err;
>> +
>> +	pmu->events_attr_group.name = "events";
>> +	pmu->events_attr_group.attrs = create_event_attributes(pmu);
>> +	if (!pmu->events_attr_group.attrs)
>> +		goto err_name;
>> +
>> +	pmu->base.attr_groups = kmemdup(attr_groups,
>> sizeof(attr_groups),
>> +					GFP_KERNEL);
>> +	if (!pmu->base.attr_groups)
>> +		goto err_attr;
>> +
>> +	pmu->base.module	= THIS_MODULE;
>> +	pmu->base.task_ctx_nr	= perf_invalid_context;
>> +	pmu->base.event_init	= xe_pmu_event_init;
>> +	pmu->base.add		= xe_pmu_event_add;
>> +	pmu->base.del		= xe_pmu_event_del;
>> +	pmu->base.start		= xe_pmu_event_start;
>> +	pmu->base.stop		= xe_pmu_event_stop;
>> +	pmu->base.read		= xe_pmu_event_read;
>> +	pmu->base.event_idx	= xe_pmu_event_event_idx;
>> +
>> +	ret = perf_pmu_register(&pmu->base, pmu->name, -1);
>> +	if (ret)
>> +		goto err_groups;
>> +
>> +	ret = xe_pmu_register_cpuhp_state(pmu);
>> +	if (ret)
>> +		goto err_unreg;
>> +
>> +	ret = drmm_add_action_or_reset(&xe->drm, xe_pmu_unregister,
>> pmu);
>> +	XE_WARN_ON(ret);
>> +
>> +	return;
>> +
>> +err_unreg:
>> +	perf_pmu_unregister(&pmu->base);
>> +err_groups:
>> +	kfree(pmu->base.attr_groups);
>> +err_attr:
>> +	pmu->base.event_init = NULL;
>> +	free_event_attributes(pmu);
>> +err_name:
>> +	kfree(pmu->name);
>> +err:
>> +	drm_notice(&xe->drm, "Failed to register PMU!\n"); }
>> diff --git a/drivers/gpu/drm/xe/xe_pmu.h b/drivers/gpu/drm/xe/xe_pmu.h
>> new file mode 100644 index 000000000000..d3f47f4ab343
>> --- /dev/null
>> +++ b/drivers/gpu/drm/xe/xe_pmu.h
>> @@ -0,0 +1,25 @@
>> +/* SPDX-License-Identifier: MIT */
>> +/*
>> + * Copyright © 2023 Intel Corporation
>> + */
>> +
>> +#ifndef _XE_PMU_H_
>> +#define _XE_PMU_H_
>> +
>> +#include "xe_gt_types.h"
>> +#include "xe_pmu_types.h"
>> +
>> +#ifdef CONFIG_PERF_EVENTS
>> +int xe_pmu_init(void);
>> +void xe_pmu_exit(void);
>> +void xe_pmu_register(struct xe_pmu *pmu); void
>> +engine_group_busyness_store(struct xe_gt *gt); #else static inline int
>> +xe_pmu_init(void) { return 0; } static inline void xe_pmu_exit(void) {}
>> +static inline void xe_pmu_register(struct xe_pmu *pmu) {} static inline
>> +void engine_group_busyness_store(struct xe_gt *gt) {} #endif
>> +
>> +#endif
>> +
>> diff --git a/drivers/gpu/drm/xe/xe_pmu_types.h
>> b/drivers/gpu/drm/xe/xe_pmu_types.h
>> new file mode 100644
>> index 000000000000..e87edd4d6a87
>> --- /dev/null
>> +++ b/drivers/gpu/drm/xe/xe_pmu_types.h
>> @@ -0,0 +1,80 @@
>> +/* SPDX-License-Identifier: MIT */
>> +/*
>> + * Copyright © 2023 Intel Corporation
>> + */
>> +
>> +#ifndef _XE_PMU_TYPES_H_
>> +#define _XE_PMU_TYPES_H_
>> +
>> +#include <linux/perf_event.h>
>> +#include <linux/spinlock_types.h>
>> +#include <uapi/drm/xe_drm.h>
>> +
>> +enum {
>> +	__XE_SAMPLE_RENDER_GROUP_BUSY,
>> +	__XE_SAMPLE_COPY_GROUP_BUSY,
>> +	__XE_SAMPLE_MEDIA_GROUP_BUSY,
>> +	__XE_SAMPLE_ANY_ENGINE_GROUP_BUSY,
>> +	__XE_NUM_PMU_SAMPLERS
>> +};
>> +
>> +struct xe_pmu_sample {
>> +	u64 cur;
>> +};
>> +
>> +#define XE_MAX_GT_PER_TILE 2
>> +
>> +struct xe_pmu {
>> +	/**
>> +	 * @cpuhp: Struct used for CPU hotplug handling.
>> +	 */
>> +	struct {
>> +		struct hlist_node node;
>> +		unsigned int cpu;
>> +	} cpuhp;
>> +	/**
>> +	 * @base: PMU base.
>> +	 */
>> +	struct pmu base;
>> +	/**
>> +	 * @closed: xe is unregistering.
>> +	 */
>> +	bool closed;
>> +	/**
>> +	 * @name: Name as registered with perf core.
>> +	 */
>> +	const char *name;
>> +	/**
>> +	 * @lock: Lock protecting enable mask and ref count handling.
>> +	 */
>> +	spinlock_t lock;
>> +	/**
>> +	 * @sample: Current and previous (raw) counters.
>> +	 *
>> +	 * These counters are updated when the device is awake.
>> +	 *
>> +	 */
>> +	struct xe_pmu_sample sample[XE_MAX_GT_PER_TILE *
>> __XE_NUM_PMU_SAMPLERS];
>> +	/**
>> +	 * @irq_count: Number of interrupts
>> +	 *
>> +	 * Intentionally unsigned long to avoid atomics or heuristics on 32bit.
>> +	 * 4e9 interrupts are a lot and postprocessing can really deal with an
>> +	 * occasional wraparound easily. It's 32bit after all.
>> +	 */
>> +	unsigned long irq_count;
>> +	/**
>> +	 * @events_attr_group: Device events attribute group.
>> +	 */
>> +	struct attribute_group events_attr_group;
>> +	/**
>> +	 * @xe_attr: Memory block holding device attributes.
>> +	 */
>> +	void *xe_attr;
>> +	/**
>> +	 * @pmu_attr: Memory block holding device attributes.
>> +	 */
>> +	void *pmu_attr;
>> +};
>> +
>> +#endif
>> diff --git a/include/uapi/drm/xe_drm.h b/include/uapi/drm/xe_drm.h index
>> 965cd9527ff1..ed097056f944 100644
>> --- a/include/uapi/drm/xe_drm.h
>> +++ b/include/uapi/drm/xe_drm.h
>> @@ -990,6 +990,22 @@ struct drm_xe_vm_madvise {
>>  	__u64 reserved[2];
>>  };
>>
>> +/* PMU event config IDs */
>> +
>> +/*
>> + * Top 4 bits of every counter are GT id.
>> + */
>> +#define __XE_PMU_GT_SHIFT (60)
>> +
>> +#define ___XE_PMU_OTHER(gt, x) \
>> +	(((__u64)(x)) | ((__u64)(gt) << __XE_PMU_GT_SHIFT))
>> +
>> +#define XE_PMU_INTERRUPTS(gt)
>> 	___XE_PMU_OTHER(gt, 0)
>> +#define XE_PMU_RENDER_GROUP_BUSY(gt)
>> 	___XE_PMU_OTHER(gt, 1)
>> +#define XE_PMU_COPY_GROUP_BUSY(gt)
>> 	___XE_PMU_OTHER(gt, 2)
>> +#define XE_PMU_MEDIA_GROUP_BUSY(gt)
>> 	___XE_PMU_OTHER(gt, 3)
>> +#define XE_PMU_ANY_ENGINE_GROUP_BUSY(gt)
>> 	___XE_PMU_OTHER(gt, 4)
>> +
>>  #if defined(__cplusplus)
>>  }
>>  #endif
>> --
>> 2.25.1
> 

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [Intel-xe] [PATCH v2 1/2] drm/xe: Get GT clock to nanosecs
  2023-06-27 12:21 ` [Intel-xe] [PATCH v2 1/2] drm/xe: Get GT clock to nanosecs Aravind Iddamsetty
  2023-07-04  9:29   ` Upadhyay, Tejas
@ 2023-07-06  0:55   ` Dixit, Ashutosh
  1 sibling, 0 replies; 59+ messages in thread
From: Dixit, Ashutosh @ 2023-07-06  0:55 UTC (permalink / raw)
  To: Aravind Iddamsetty; +Cc: intel-xe

On Tue, 27 Jun 2023 05:21:12 -0700, Aravind Iddamsetty wrote:
>

Hi Aravind,

> Helpers to get GT clock to nanosecs
>
> Signed-off-by: Aravind Iddamsetty <aravind.iddamsetty@intel.com>
> ---
>  drivers/gpu/drm/xe/xe_gt_clock.c | 10 ++++++++++
>  drivers/gpu/drm/xe/xe_gt_clock.h |  4 +++-
>  2 files changed, 13 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/xe/xe_gt_clock.c b/drivers/gpu/drm/xe/xe_gt_clock.c
> index 7cf11078ff57..3689c7d5cf53 100644
> --- a/drivers/gpu/drm/xe/xe_gt_clock.c
> +++ b/drivers/gpu/drm/xe/xe_gt_clock.c
> @@ -78,3 +78,13 @@ int xe_gt_clock_init(struct xe_gt *gt)
>	gt->info.clock_freq = freq;
>	return 0;
>  }
> +
> +static u64 div_u64_roundup(u64 nom, u32 den)

s/nom/num/

> +{
> +	return div_u64(nom + den - 1, den);
> +}
> +

Shouldn't need this function. Look at DIV_ROUND_CLOSEST_ULL in math.h or
DIV64_U64_ROUND_CLOSEST or DIV64_U64_ROUND_UP in math64.h.

> +u64 xe_gt_clock_interval_to_ns(const struct xe_gt *gt, u64 count)
> +{
> +	return div_u64_roundup(count * NSEC_PER_SEC, gt->info.clock_freq);
> +}
> diff --git a/drivers/gpu/drm/xe/xe_gt_clock.h b/drivers/gpu/drm/xe/xe_gt_clock.h
> index 511923afd224..91fc9b7e83f5 100644
> --- a/drivers/gpu/drm/xe/xe_gt_clock.h
> +++ b/drivers/gpu/drm/xe/xe_gt_clock.h
> @@ -6,8 +6,10 @@
>  #ifndef _XE_GT_CLOCK_H_
>  #define _XE_GT_CLOCK_H_
>
> +#include <linux/types.h>
> +
>  struct xe_gt;
>
>  int xe_gt_clock_init(struct xe_gt *gt);
> -
> +u64 xe_gt_clock_interval_to_ns(const struct xe_gt *gt, u64 count);
>  #endif
> --
> 2.25.1
>

Ashutosh

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [Intel-xe] [PATCH v2 2/2] drm/xe/pmu: Enable PMU interface
  2023-06-27 12:21 ` [Intel-xe] [PATCH v2 2/2] drm/xe/pmu: Enable PMU interface Aravind Iddamsetty
                     ` (2 preceding siblings ...)
  2023-07-04  9:10   ` Upadhyay, Tejas
@ 2023-07-06  2:39   ` Dixit, Ashutosh
  2023-07-06 13:42     ` Iddamsetty, Aravind
  2023-07-06  2:40   ` Belgaumkar, Vinay
                     ` (2 subsequent siblings)
  6 siblings, 1 reply; 59+ messages in thread
From: Dixit, Ashutosh @ 2023-07-06  2:39 UTC (permalink / raw)
  To: Aravind Iddamsetty; +Cc: Bommu Krishnaiah, intel-xe, Tvrtko Ursulin

On Tue, 27 Jun 2023 05:21:13 -0700, Aravind Iddamsetty wrote:
>

Hi Aravind,

> +static u64 __engine_group_busyness_read(struct xe_gt *gt, u64 config)
> +{
> +	u64 val = 0;
> +
> +	switch (config) {
> +	case XE_PMU_RENDER_GROUP_BUSY(0):
> +		val = xe_mmio_read32(gt, XE_OAG_RENDER_BUSY_FREE);
> +		break;
> +	case XE_PMU_COPY_GROUP_BUSY(0):
> +		val = xe_mmio_read32(gt, XE_OAG_BLT_BUSY_FREE);
> +		break;
> +	case XE_PMU_MEDIA_GROUP_BUSY(0):
> +		val = xe_mmio_read32(gt, XE_OAG_ANY_MEDIA_FF_BUSY_FREE);
> +		break;
> +	case XE_PMU_ANY_ENGINE_GROUP_BUSY(0):
> +		val = xe_mmio_read32(gt, XE_OAG_RC0_ANY_ENGINE_BUSY_FREE);
> +		break;
> +	default:
> +		drm_warn(&gt->tile->xe->drm, "unknown pmu event\n");
> +	}
> +
> +	return xe_gt_clock_interval_to_ns(gt, val * 16);
> +}

A few questions on just the above function first:

1. OK, so these registers won't be available to VFs, but any idea what
   these counts are when VFs are active?

2. When would these 32-bit registers overflow? Say a group is
   continuously busy and we are running at 1 GHz; wouldn't the registers
   overflow in about 4 seconds (max value 4G)?

3. What is the multiplication by 16 (not factored above in 2.)? I don't see
   that in Bspec.
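
(On 2., a rough back-of-the-envelope check; whether the raw value ticks
every clock or once per 16 clocks is an assumption here, and is really
what 3. is asking:

	2^32 / 1e9 Hz       ~=  4.3 s to wrap, ticking every clock
	2^32 * 16 / 1e9 Hz  ~= 68.7 s to wrap, ticking once per 16 clocks,
	                       which would also explain the val * 16)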

Thanks.
--
Ashutosh

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [Intel-xe] [PATCH v2 2/2] drm/xe/pmu: Enable PMU interface
  2023-06-27 12:21 ` [Intel-xe] [PATCH v2 2/2] drm/xe/pmu: Enable PMU interface Aravind Iddamsetty
                     ` (3 preceding siblings ...)
  2023-07-06  2:39   ` Dixit, Ashutosh
@ 2023-07-06  2:40   ` Belgaumkar, Vinay
  2023-07-06 13:06     ` Iddamsetty, Aravind
  2023-07-21  1:02   ` Dixit, Ashutosh
  2023-07-22 14:39   ` Dixit, Ashutosh
  6 siblings, 1 reply; 59+ messages in thread
From: Belgaumkar, Vinay @ 2023-07-06  2:40 UTC (permalink / raw)
  To: intel-xe


On 6/27/2023 5:21 AM, aravind.iddamsetty@intel.com (Aravind 
Iddamsetty) wrote:
> There are a set of engine group busyness counters provided by HW which are
> perfect fit to be exposed via PMU perf events.
>
> BSPEC: 46559, 46560, 46722, 46729
>
> events can be listed using:
> perf list
>    xe_0000_03_00.0/any-engine-group-busy-gt0/         [Kernel PMU event]
>    xe_0000_03_00.0/copy-group-busy-gt0/               [Kernel PMU event]
>    xe_0000_03_00.0/interrupts/                        [Kernel PMU event]
>    xe_0000_03_00.0/media-group-busy-gt0/              [Kernel PMU event]
>    xe_0000_03_00.0/render-group-busy-gt0/             [Kernel PMU event]
>
> and can be read using:
>
> perf stat -e "xe_0000_8c_00.0/render-group-busy-gt0/" -I 1000
>             time             counts unit events
>       1.001139062                  0 ns  xe_0000_8c_00.0/render-group-busy-gt0/
>       2.003294678                  0 ns  xe_0000_8c_00.0/render-group-busy-gt0/
>       3.005199582                  0 ns  xe_0000_8c_00.0/render-group-busy-gt0/
>       4.007076497                  0 ns  xe_0000_8c_00.0/render-group-busy-gt0/
>       5.008553068                  0 ns  xe_0000_8c_00.0/render-group-busy-gt0/
>       6.010531563              43520 ns  xe_0000_8c_00.0/render-group-busy-gt0/
>       7.012468029              44800 ns  xe_0000_8c_00.0/render-group-busy-gt0/
>       8.013463515                  0 ns  xe_0000_8c_00.0/render-group-busy-gt0/
>       9.015300183                  0 ns  xe_0000_8c_00.0/render-group-busy-gt0/
>      10.017233010                  0 ns  xe_0000_8c_00.0/render-group-busy-gt0/
>      10.971934120                  0 ns  xe_0000_8c_00.0/render-group-busy-gt0/
>
> The pmu base implementation is taken from i915.
>
> v2:
> Store last known value when device is awake return that while the GT is
> suspended and then update the driver copy when read during awake.
>
> Co-developed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> Co-developed-by: Bommu Krishnaiah <krishnaiah.bommu@intel.com>
> Signed-off-by: Aravind Iddamsetty <aravind.iddamsetty@intel.com>
> ---
>   drivers/gpu/drm/xe/Makefile          |   2 +
>   drivers/gpu/drm/xe/regs/xe_gt_regs.h |   5 +
>   drivers/gpu/drm/xe/xe_device.c       |   2 +
>   drivers/gpu/drm/xe/xe_device_types.h |   4 +
>   drivers/gpu/drm/xe/xe_gt.c           |   2 +
>   drivers/gpu/drm/xe/xe_irq.c          |  22 +
>   drivers/gpu/drm/xe/xe_module.c       |   5 +
>   drivers/gpu/drm/xe/xe_pmu.c          | 739 +++++++++++++++++++++++++++
>   drivers/gpu/drm/xe/xe_pmu.h          |  25 +
>   drivers/gpu/drm/xe/xe_pmu_types.h    |  80 +++
>   include/uapi/drm/xe_drm.h            |  16 +
>   11 files changed, 902 insertions(+)
>   create mode 100644 drivers/gpu/drm/xe/xe_pmu.c
>   create mode 100644 drivers/gpu/drm/xe/xe_pmu.h
>   create mode 100644 drivers/gpu/drm/xe/xe_pmu_types.h
>
> diff --git a/drivers/gpu/drm/xe/Makefile b/drivers/gpu/drm/xe/Makefile
> index 081c57fd8632..e52ab795c566 100644
> --- a/drivers/gpu/drm/xe/Makefile
> +++ b/drivers/gpu/drm/xe/Makefile
> @@ -217,6 +217,8 @@ xe-$(CONFIG_DRM_XE_DISPLAY) += \
>   	i915-display/skl_universal_plane.o \
>   	i915-display/skl_watermark.o
>   
> +xe-$(CONFIG_PERF_EVENTS) += xe_pmu.o
> +
>   ifeq ($(CONFIG_ACPI),y)
>   	xe-$(CONFIG_DRM_XE_DISPLAY) += \
>   		i915-display/intel_acpi.o \
> diff --git a/drivers/gpu/drm/xe/regs/xe_gt_regs.h b/drivers/gpu/drm/xe/regs/xe_gt_regs.h
> index 3f664011eaea..c7d9e4634745 100644
> --- a/drivers/gpu/drm/xe/regs/xe_gt_regs.h
> +++ b/drivers/gpu/drm/xe/regs/xe_gt_regs.h
> @@ -285,6 +285,11 @@
>   #define   INVALIDATION_BROADCAST_MODE_DIS	REG_BIT(12)
>   #define   GLOBAL_INVALIDATION_MODE		REG_BIT(2)
>   
> +#define XE_OAG_RC0_ANY_ENGINE_BUSY_FREE		XE_REG(0xdb80)
> +#define XE_OAG_ANY_MEDIA_FF_BUSY_FREE		XE_REG(0xdba0)
> +#define XE_OAG_BLT_BUSY_FREE			XE_REG(0xdbbc)
> +#define XE_OAG_RENDER_BUSY_FREE			XE_REG(0xdbdc)
> +
>   #define SAMPLER_MODE				XE_REG_MCR(0xe18c, XE_REG_OPTION_MASKED)
>   #define   ENABLE_SMALLPL			REG_BIT(15)
>   #define   SC_DISABLE_POWER_OPTIMIZATION_EBB	REG_BIT(9)
> diff --git a/drivers/gpu/drm/xe/xe_device.c b/drivers/gpu/drm/xe/xe_device.c
> index c7985af85a53..b2c7bd4a97d9 100644
> --- a/drivers/gpu/drm/xe/xe_device.c
> +++ b/drivers/gpu/drm/xe/xe_device.c
> @@ -328,6 +328,8 @@ int xe_device_probe(struct xe_device *xe)
>   
>   	xe_debugfs_register(xe);
>   
> +	xe_pmu_register(&xe->pmu);
> +
>   	err = drmm_add_action_or_reset(&xe->drm, xe_device_sanitize, xe);
>   	if (err)
>   		return err;
> diff --git a/drivers/gpu/drm/xe/xe_device_types.h b/drivers/gpu/drm/xe/xe_device_types.h
> index 0226d44a6af2..3ba99aae92b9 100644
> --- a/drivers/gpu/drm/xe/xe_device_types.h
> +++ b/drivers/gpu/drm/xe/xe_device_types.h
> @@ -15,6 +15,7 @@
>   #include "xe_devcoredump_types.h"
>   #include "xe_gt_types.h"
>   #include "xe_platform_types.h"
> +#include "xe_pmu.h"
>   #include "xe_step_types.h"
>   
>   #if IS_ENABLED(CONFIG_DRM_XE_DISPLAY)
> @@ -332,6 +333,9 @@ struct xe_device {
>   	/** @d3cold_allowed: Indicates if d3cold is a valid device state */
>   	bool d3cold_allowed;
>   
> +	/* @pmu: performance monitoring unit */
> +	struct xe_pmu pmu;
> +
>   	/* private: */
>   
>   #if IS_ENABLED(CONFIG_DRM_XE_DISPLAY)
> diff --git a/drivers/gpu/drm/xe/xe_gt.c b/drivers/gpu/drm/xe/xe_gt.c
> index 2458397ce8af..96e3720923d4 100644
> --- a/drivers/gpu/drm/xe/xe_gt.c
> +++ b/drivers/gpu/drm/xe/xe_gt.c
> @@ -593,6 +593,8 @@ int xe_gt_suspend(struct xe_gt *gt)
>   	if (err)
>   		goto err_msg;
>   
> +	engine_group_busyness_store(gt);
> +
>   	err = xe_uc_suspend(&gt->uc);
>   	if (err)
>   		goto err_force_wake;
> diff --git a/drivers/gpu/drm/xe/xe_irq.c b/drivers/gpu/drm/xe/xe_irq.c
> index b4ed1e4a3388..cb943fb94ec7 100644
> --- a/drivers/gpu/drm/xe/xe_irq.c
> +++ b/drivers/gpu/drm/xe/xe_irq.c
> @@ -27,6 +27,24 @@
>   #define IIR(offset)				XE_REG(offset + 0x8)
>   #define IER(offset)				XE_REG(offset + 0xc)
>   
> +/*
> + * Interrupt statistic for PMU. Increments the counter only if the
> + * interrupt originated from the GPU so interrupts from a device which
> + * shares the interrupt line are not accounted.
> + */
> +static inline void xe_pmu_irq_stats(struct xe_device *xe,
> +				    irqreturn_t res)
> +{
> +	if (unlikely(res != IRQ_HANDLED))
> +		return;
> +
> +	/*
> +	 * A clever compiler translates that into INC. A not so clever one
> +	 * should at least prevent store tearing.
> +	 */
> +	WRITE_ONCE(xe->pmu.irq_count, xe->pmu.irq_count + 1);
> +}
> +
>   static void assert_iir_is_zero(struct xe_gt *mmio, struct xe_reg reg)
>   {
>   	u32 val = xe_mmio_read32(mmio, reg);
> @@ -325,6 +343,8 @@ static irqreturn_t xelp_irq_handler(int irq, void *arg)
>   
>   	xe_display_irq_enable(xe, gu_misc_iir);
>   
> +	xe_pmu_irq_stats(xe, IRQ_HANDLED);
> +
>   	return IRQ_HANDLED;
>   }
>   
> @@ -414,6 +434,8 @@ static irqreturn_t dg1_irq_handler(int irq, void *arg)
>   	dg1_intr_enable(xe, false);
>   	xe_display_irq_enable(xe, gu_misc_iir);
>   
> +	xe_pmu_irq_stats(xe, IRQ_HANDLED);
> +
>   	return IRQ_HANDLED;
>   }
>   
> diff --git a/drivers/gpu/drm/xe/xe_module.c b/drivers/gpu/drm/xe/xe_module.c
> index 75e5be939f53..f6fe89748525 100644
> --- a/drivers/gpu/drm/xe/xe_module.c
> +++ b/drivers/gpu/drm/xe/xe_module.c
> @@ -12,6 +12,7 @@
>   #include "xe_hw_fence.h"
>   #include "xe_module.h"
>   #include "xe_pci.h"
> +#include "xe_pmu.h"
>   #include "xe_sched_job.h"
>   
>   bool enable_guc = true;
> @@ -49,6 +50,10 @@ static const struct init_funcs init_funcs[] = {
>   		.init = xe_sched_job_module_init,
>   		.exit = xe_sched_job_module_exit,
>   	},
> +	{
> +		.init = xe_pmu_init,
> +		.exit = xe_pmu_exit,
> +	},
>   	{
>   		.init = xe_register_pci_driver,
>   		.exit = xe_unregister_pci_driver,
> diff --git a/drivers/gpu/drm/xe/xe_pmu.c b/drivers/gpu/drm/xe/xe_pmu.c
> new file mode 100644
> index 000000000000..bef1895be9f7
> --- /dev/null
> +++ b/drivers/gpu/drm/xe/xe_pmu.c
> @@ -0,0 +1,739 @@
> +/*
> + * SPDX-License-Identifier: MIT
> + *
> + * Copyright © 2023 Intel Corporation
> + */
> +
> +#include <drm/drm_drv.h>
> +#include <drm/drm_managed.h>
> +#include <drm/xe_drm.h>
> +
> +#include "regs/xe_gt_regs.h"
> +#include "xe_device.h"
> +#include "xe_gt_clock.h"
> +#include "xe_mmio.h"
> +
> +static cpumask_t xe_pmu_cpumask;
> +static unsigned int xe_pmu_target_cpu = -1;
> +
> +static unsigned int config_gt_id(const u64 config)
> +{
> +	return config >> __XE_PMU_GT_SHIFT;
> +}
> +
> +static u64 config_counter(const u64 config)
> +{
> +	return config & ~(~0ULL << __XE_PMU_GT_SHIFT);
> +}
> +
> +static unsigned int
> +__sample_idx(struct xe_pmu *pmu, unsigned int gt_id, int sample)
> +{
> +	unsigned int idx = gt_id * __XE_NUM_PMU_SAMPLERS + sample;
> +
> +	XE_BUG_ON(idx >= ARRAY_SIZE(pmu->sample));
> +
> +	return idx;
> +}
> +
> +static u64 read_sample(struct xe_pmu *pmu, unsigned int gt_id, int sample)
> +{
> +	return pmu->sample[__sample_idx(pmu, gt_id, sample)].cur;
> +}
> +
> +static void
> +store_sample(struct xe_pmu *pmu, unsigned int gt_id, int sample, u64 val)
> +{
> +	pmu->sample[__sample_idx(pmu, gt_id, sample)].cur = val;
> +}
> +
> +static int engine_busyness_sample_type(u64 config)
> +{
> +	int type = 0;
> +
> +	switch (config) {
> +	case XE_PMU_RENDER_GROUP_BUSY(0):
> +		type =  __XE_SAMPLE_RENDER_GROUP_BUSY;
> +		break;
> +	case XE_PMU_COPY_GROUP_BUSY(0):
> +		type = __XE_SAMPLE_COPY_GROUP_BUSY;
> +		break;
> +	case XE_PMU_MEDIA_GROUP_BUSY(0):
> +		type = __XE_SAMPLE_MEDIA_GROUP_BUSY;
> +		break;
> +	case XE_PMU_ANY_ENGINE_GROUP_BUSY(0):
> +		type = __XE_SAMPLE_ANY_ENGINE_GROUP_BUSY;
> +		break;
> +	}
> +
> +	return type;
> +}
> +
> +static void xe_pmu_event_destroy(struct perf_event *event)
> +{
> +	struct xe_device *xe =
> +		container_of(event->pmu, typeof(*xe), pmu.base);
> +
> +	drm_WARN_ON(&xe->drm, event->parent);
> +
> +	drm_dev_put(&xe->drm);
> +}
> +
> +static u64 __engine_group_busyness_read(struct xe_gt *gt, u64 config)
> +{
> +	u64 val = 0;
> +
> +	switch (config) {
> +	case XE_PMU_RENDER_GROUP_BUSY(0):
> +		val = xe_mmio_read32(gt, XE_OAG_RENDER_BUSY_FREE);
> +		break;
> +	case XE_PMU_COPY_GROUP_BUSY(0):
> +		val = xe_mmio_read32(gt, XE_OAG_BLT_BUSY_FREE);
> +		break;
> +	case XE_PMU_MEDIA_GROUP_BUSY(0):
> +		val = xe_mmio_read32(gt, XE_OAG_ANY_MEDIA_FF_BUSY_FREE);
> +		break;
> +	case XE_PMU_ANY_ENGINE_GROUP_BUSY(0):
> +		val = xe_mmio_read32(gt, XE_OAG_RC0_ANY_ENGINE_BUSY_FREE);
> +		break;
> +	default:
> +		drm_warn(&gt->tile->xe->drm, "unknown pmu event\n");
> +	}
> +
> +	return xe_gt_clock_interval_to_ns(gt, val * 16);
> +}
> +
> +static u64 engine_group_busyness_read(struct xe_gt *gt, u64 config)
> +{
> +	int sample_type = engine_busyness_sample_type(config);
> +	struct xe_device *xe = gt->tile->xe;
> +	const unsigned int gt_id = gt->info.id;
> +	struct xe_pmu *pmu = &xe->pmu;
> +	bool device_awake;
> +	unsigned long flags;
> +	u64 val;
> +
> +	/*
> +	 * found no better way to check if device is awake or not. Before
> +	 * we suspend we set the submission_state.enabled to false.
> +	 */
> +	device_awake = gt->uc.guc.submission_state.enabled ? true : false;
> +	if (device_awake)
> +		val = __engine_group_busyness_read(gt, config);
> +
> +	spin_lock_irqsave(&pmu->lock, flags);
> +
> +	if (device_awake)
> +		store_sample(pmu, gt_id, sample_type, val);
> +	else
> +		val = read_sample(pmu, gt_id, sample_type);
> +
> +	spin_unlock_irqrestore(&pmu->lock, flags);
> +
> +	return val;
> +}
> +
> +void engine_group_busyness_store(struct xe_gt *gt)
> +{
> +	struct xe_pmu *pmu = &gt->tile->xe->pmu;
> +	unsigned int gt_id = gt->info.id;
> +	unsigned long flags;
> +
> +	spin_lock_irqsave(&pmu->lock, flags);
> +
> +	store_sample(pmu, gt_id, __XE_SAMPLE_RENDER_GROUP_BUSY,
> +		     __engine_group_busyness_read(gt, XE_PMU_RENDER_GROUP_BUSY(0)));
> +	store_sample(pmu, gt_id, __XE_SAMPLE_COPY_GROUP_BUSY,
> +		     __engine_group_busyness_read(gt, XE_PMU_COPY_GROUP_BUSY(0)));
> +	store_sample(pmu, gt_id, __XE_SAMPLE_MEDIA_GROUP_BUSY,
> +		     __engine_group_busyness_read(gt, XE_PMU_MEDIA_GROUP_BUSY(0)));
> +	store_sample(pmu, gt_id, __XE_SAMPLE_ANY_ENGINE_GROUP_BUSY,
> +		     __engine_group_busyness_read(gt, XE_PMU_ANY_ENGINE_GROUP_BUSY(0)));
> +
> +	spin_unlock_irqrestore(&pmu->lock, flags);
> +}
> +
> +static int
> +config_status(struct xe_device *xe, u64 config)
> +{
> +	unsigned int max_gt_id = xe->info.gt_count > 1 ? 1 : 0;
> +	unsigned int gt_id = config_gt_id(config);
> +	struct xe_gt *gt = xe_device_get_gt(xe, gt_id);
> +
> +	if (gt_id > max_gt_id)
> +		return -ENOENT;
> +
> +	switch (config_counter(config)) {
> +	case XE_PMU_INTERRUPTS(0):
> +		if (gt_id)
> +			return -ENOENT;
> +		break;
> +	case XE_PMU_RENDER_GROUP_BUSY(0):
> +	case XE_PMU_COPY_GROUP_BUSY(0):
> +	case XE_PMU_ANY_ENGINE_GROUP_BUSY(0):
> +		if (GRAPHICS_VER(xe) < 12)
> +			return -ENOENT;
> +		break;
> +	case XE_PMU_MEDIA_GROUP_BUSY(0):
> +		if (MEDIA_VER(xe) >= 13 && gt->info.type != XE_GT_TYPE_MEDIA)
> +			return -ENOENT;
> +		break;
> +	default:
> +		return -ENOENT;
> +	}
> +
> +	return 0;
> +}
> +
> +static int xe_pmu_event_init(struct perf_event *event)
> +{
> +	struct xe_device *xe =
> +		container_of(event->pmu, typeof(*xe), pmu.base);
> +	struct xe_pmu *pmu = &xe->pmu;
> +	int ret;
> +
> +	if (pmu->closed)
> +		return -ENODEV;
> +
> +	if (event->attr.type != event->pmu->type)
> +		return -ENOENT;
> +
> +	/* unsupported modes and filters */
> +	if (event->attr.sample_period) /* no sampling */
> +		return -EINVAL;
> +
> +	if (has_branch_stack(event))
> +		return -EOPNOTSUPP;
> +
> +	if (event->cpu < 0)
> +		return -EINVAL;
> +
> +	/* only allow running on one cpu at a time */
> +	if (!cpumask_test_cpu(event->cpu, &xe_pmu_cpumask))
> +		return -EINVAL;
> +
> +	ret = config_status(xe, event->attr.config);
> +	if (ret)
> +		return ret;
> +
> +	if (!event->parent) {
> +		drm_dev_get(&xe->drm);
> +		event->destroy = xe_pmu_event_destroy;
> +	}
> +
> +	return 0;
> +}
> +
> +static u64 __xe_pmu_event_read(struct perf_event *event)
> +{
> +	struct xe_device *xe =
> +		container_of(event->pmu, typeof(*xe), pmu.base);
> +	const unsigned int gt_id = config_gt_id(event->attr.config);
> +	const u64 config = config_counter(event->attr.config);
> +	struct xe_gt *gt = xe_device_get_gt(xe, gt_id);
> +	struct xe_pmu *pmu = &xe->pmu;
> +	u64 val = 0;
> +
> +	switch (config) {
> +	case XE_PMU_INTERRUPTS(0):
> +		val = READ_ONCE(pmu->irq_count);
> +		break;
> +	case XE_PMU_RENDER_GROUP_BUSY(0):
> +	case XE_PMU_COPY_GROUP_BUSY(0):
> +	case XE_PMU_ANY_ENGINE_GROUP_BUSY(0):
> +	case XE_PMU_MEDIA_GROUP_BUSY(0):
> +		val = engine_group_busyness_read(gt, config);
> +	}
> +
> +	return val;
> +}
> +
> +static void xe_pmu_event_read(struct perf_event *event)
> +{
> +	struct xe_device *xe =
> +		container_of(event->pmu, typeof(*xe), pmu.base);
> +	struct hw_perf_event *hwc = &event->hw;
> +	struct xe_pmu *pmu = &xe->pmu;
> +	u64 prev, new;
> +
> +	if (pmu->closed) {
> +		event->hw.state = PERF_HES_STOPPED;
> +		return;
> +	}
> +again:
> +	prev = local64_read(&hwc->prev_count);
> +	new = __xe_pmu_event_read(event);
> +
> +	if (local64_cmpxchg(&hwc->prev_count, prev, new) != prev)
> +		goto again;
> +
> +	local64_add(new - prev, &event->count);
> +}
> +
> +static void xe_pmu_enable(struct perf_event *event)
> +{
> +	/*
> +	 * Store the current counter value so we can report the correct delta
> +	 * for all listeners. Even when the event was already enabled and has
> +	 * an existing non-zero value.
> +	 */
> +	local64_set(&event->hw.prev_count, __xe_pmu_event_read(event));
> +}
> +
> +static void xe_pmu_event_start(struct perf_event *event, int flags)
> +{
> +	struct xe_device *xe =
> +		container_of(event->pmu, typeof(*xe), pmu.base);
> +	struct xe_pmu *pmu = &xe->pmu;
> +
> +	if (pmu->closed)
> +		return;
> +
> +	xe_pmu_enable(event);
> +	event->hw.state = 0;
> +}
> +
> +static void xe_pmu_event_stop(struct perf_event *event, int flags)
> +{
> +	if (flags & PERF_EF_UPDATE)
> +		xe_pmu_event_read(event);
> +
> +	event->hw.state = PERF_HES_STOPPED;
> +}
> +
> +static int xe_pmu_event_add(struct perf_event *event, int flags)
> +{
> +	struct xe_device *xe =
> +		container_of(event->pmu, typeof(*xe), pmu.base);
> +	struct xe_pmu *pmu = &xe->pmu;
> +
> +	if (pmu->closed)
> +		return -ENODEV;
> +
> +	if (flags & PERF_EF_START)
> +		xe_pmu_event_start(event, flags);
> +
> +	return 0;
> +}
> +
> +static void xe_pmu_event_del(struct perf_event *event, int flags)
> +{
> +	xe_pmu_event_stop(event, PERF_EF_UPDATE);
> +}
> +
> +static int xe_pmu_event_event_idx(struct perf_event *event)
> +{
> +	return 0;
> +}
> +
> +struct xe_str_attribute {
> +	struct device_attribute attr;
> +	const char *str;
> +};
> +
> +static ssize_t xe_pmu_format_show(struct device *dev,
> +				  struct device_attribute *attr, char *buf)
> +{
> +	struct xe_str_attribute *eattr;
> +
> +	eattr = container_of(attr, struct xe_str_attribute, attr);
> +	return sprintf(buf, "%s\n", eattr->str);
> +}
> +
> +#define XE_PMU_FORMAT_ATTR(_name, _config) \
> +	(&((struct xe_str_attribute[]) { \
> +		{ .attr = __ATTR(_name, 0444, xe_pmu_format_show, NULL), \
> +		  .str = _config, } \
> +	})[0].attr.attr)
> +
> +static struct attribute *xe_pmu_format_attrs[] = {
> +	XE_PMU_FORMAT_ATTR(xe_eventid, "config:0-20"),
> +	NULL,
> +};
> +
> +static const struct attribute_group xe_pmu_format_attr_group = {
> +	.name = "format",
> +	.attrs = xe_pmu_format_attrs,
> +};
> +
> +struct xe_ext_attribute {
> +	struct device_attribute attr;
> +	unsigned long val;
> +};
> +
> +static ssize_t xe_pmu_event_show(struct device *dev,
> +				 struct device_attribute *attr, char *buf)
> +{
> +	struct xe_ext_attribute *eattr;
> +
> +	eattr = container_of(attr, struct xe_ext_attribute, attr);
> +	return sprintf(buf, "config=0x%lx\n", eattr->val);
> +}
> +
> +static ssize_t cpumask_show(struct device *dev,
> +			    struct device_attribute *attr, char *buf)
> +{
> +	return cpumap_print_to_pagebuf(true, buf, &xe_pmu_cpumask);
> +}
> +
> +static DEVICE_ATTR_RO(cpumask);
> +
> +static struct attribute *xe_cpumask_attrs[] = {
> +	&dev_attr_cpumask.attr,
> +	NULL,
> +};
> +
> +static const struct attribute_group xe_pmu_cpumask_attr_group = {
> +	.attrs = xe_cpumask_attrs,
> +};
> +
> +#define __event(__counter, __name, __unit) \
> +{ \
> +	.counter = (__counter), \
> +	.name = (__name), \
> +	.unit = (__unit), \
> +	.global = false, \
> +}
> +
> +#define __global_event(__counter, __name, __unit) \
> +{ \
> +	.counter = (__counter), \
> +	.name = (__name), \
> +	.unit = (__unit), \
> +	.global = true, \
> +}
> +
> +static struct xe_ext_attribute *
> +add_xe_attr(struct xe_ext_attribute *attr, const char *name, u64 config)
> +{
> +	sysfs_attr_init(&attr->attr.attr);
> +	attr->attr.attr.name = name;
> +	attr->attr.attr.mode = 0444;
> +	attr->attr.show = xe_pmu_event_show;
> +	attr->val = config;
> +
> +	return ++attr;
> +}
> +
> +static struct perf_pmu_events_attr *
> +add_pmu_attr(struct perf_pmu_events_attr *attr, const char *name,
> +	     const char *str)
> +{
> +	sysfs_attr_init(&attr->attr.attr);
> +	attr->attr.attr.name = name;
> +	attr->attr.attr.mode = 0444;
> +	attr->attr.show = perf_event_sysfs_show;
> +	attr->event_str = str;
> +
> +	return ++attr;
> +}
> +
> +static struct attribute **
> +create_event_attributes(struct xe_pmu *pmu)
> +{
> +	struct xe_device *xe = container_of(pmu, typeof(*xe), pmu);
> +	static const struct {
> +		unsigned int counter;
> +		const char *name;
> +		const char *unit;
> +		bool global;
> +	} events[] = {
> +		__global_event(0, "interrupts", NULL),
> +		__event(1, "render-group-busy", "ns"),
> +		__event(2, "copy-group-busy", "ns"),
> +		__event(3, "media-group-busy", "ns"),
> +		__event(4, "any-engine-group-busy", "ns"),
> +	};
> +
> +	unsigned int count = 0;
> +	struct perf_pmu_events_attr *pmu_attr = NULL, *pmu_iter;
> +	struct xe_ext_attribute *xe_attr = NULL, *xe_iter;
> +	struct attribute **attr = NULL, **attr_iter;
> +	struct xe_gt *gt;
> +	unsigned int i, j;
> +
> +	/* Count how many counters we will be exposing. */
> +	for_each_gt(gt, xe, j) {
> +		for (i = 0; i < ARRAY_SIZE(events); i++) {
> +			u64 config = ___XE_PMU_OTHER(j, events[i].counter);
> +
> +			if (!config_status(xe, config))
> +				count++;
> +		}
> +	}
> +
> +	/* Allocate attribute objects and table. */
> +	xe_attr = kcalloc(count, sizeof(*xe_attr), GFP_KERNEL);
> +	if (!xe_attr)
> +		goto err_alloc;
> +
> +	pmu_attr = kcalloc(count, sizeof(*pmu_attr), GFP_KERNEL);
> +	if (!pmu_attr)
> +		goto err_alloc;
> +
> +	/* Max one pointer of each attribute type plus a termination entry. */
> +	attr = kcalloc(count * 2 + 1, sizeof(*attr), GFP_KERNEL);
> +	if (!attr)
> +		goto err_alloc;
> +
> +	xe_iter = xe_attr;
> +	pmu_iter = pmu_attr;
> +	attr_iter = attr;
> +
> +	for_each_gt(gt, xe, j) {
> +		for (i = 0; i < ARRAY_SIZE(events); i++) {
> +			u64 config = ___XE_PMU_OTHER(j, events[i].counter);
> +			char *str;
> +
> +			if (config_status(xe, config))
> +				continue;
> +
> +			if (events[i].global)
> +				str = kstrdup(events[i].name, GFP_KERNEL);
> +			else
> +				str = kasprintf(GFP_KERNEL, "%s-gt%u",
> +						events[i].name, j);
> +			if (!str)
> +				goto err;
> +
> +			*attr_iter++ = &xe_iter->attr.attr;
> +			xe_iter = add_xe_attr(xe_iter, str, config);
> +
> +			if (events[i].unit) {
> +				if (events[i].global)
> +					str = kasprintf(GFP_KERNEL, "%s.unit",
> +							events[i].name);
> +				else
> +					str = kasprintf(GFP_KERNEL, "%s-gt%u.unit",
> +							events[i].name, j);
> +				if (!str)
> +					goto err;
> +
> +				*attr_iter++ = &pmu_iter->attr.attr;
> +				pmu_iter = add_pmu_attr(pmu_iter, str,
> +							events[i].unit);
> +			}
> +		}
> +	}
> +
> +	pmu->xe_attr = xe_attr;
> +	pmu->pmu_attr = pmu_attr;
> +
> +	return attr;
> +
> +err:
> +	for (attr_iter = attr; *attr_iter; attr_iter++)
> +		kfree((*attr_iter)->name);
> +
> +err_alloc:
> +	kfree(attr);
> +	kfree(xe_attr);
> +	kfree(pmu_attr);
> +
> +	return NULL;
> +}
> +
> +static void free_event_attributes(struct xe_pmu *pmu)
> +{
> +	struct attribute **attr_iter = pmu->events_attr_group.attrs;
> +
> +	for (; *attr_iter; attr_iter++)
> +		kfree((*attr_iter)->name);
> +
> +	kfree(pmu->events_attr_group.attrs);
> +	kfree(pmu->xe_attr);
> +	kfree(pmu->pmu_attr);
> +
> +	pmu->events_attr_group.attrs = NULL;
> +	pmu->xe_attr = NULL;
> +	pmu->pmu_attr = NULL;
> +}
> +
> +static int xe_pmu_cpu_online(unsigned int cpu, struct hlist_node *node)
> +{
> +	struct xe_pmu *pmu = hlist_entry_safe(node, typeof(*pmu), cpuhp.node);
> +
> +	XE_BUG_ON(!pmu->base.event_init);
> +
> +	/* Select the first online CPU as a designated reader. */
> +	if (cpumask_empty(&xe_pmu_cpumask))
> +		cpumask_set_cpu(cpu, &xe_pmu_cpumask);
> +
> +	return 0;
> +}
> +
> +static int xe_pmu_cpu_offline(unsigned int cpu, struct hlist_node *node)
> +{
> +	struct xe_pmu *pmu = hlist_entry_safe(node, typeof(*pmu), cpuhp.node);
> +	unsigned int target = xe_pmu_target_cpu;
> +
> +	XE_BUG_ON(!pmu->base.event_init);
> +
> +	/*
> +	 * Unregistering an instance generates a CPU offline event which we must
> +	 * ignore to avoid incorrectly modifying the shared xe_pmu_cpumask.
> +	 */
> +	if (pmu->closed)
> +		return 0;
> +
> +	if (cpumask_test_and_clear_cpu(cpu, &xe_pmu_cpumask)) {
> +		target = cpumask_any_but(topology_sibling_cpumask(cpu), cpu);
> +
> +		/* Migrate events if there is a valid target */
> +		if (target < nr_cpu_ids) {
> +			cpumask_set_cpu(target, &xe_pmu_cpumask);
> +			xe_pmu_target_cpu = target;
> +		}
> +	}
> +
> +	if (target < nr_cpu_ids && target != pmu->cpuhp.cpu) {
> +		perf_pmu_migrate_context(&pmu->base, cpu, target);
> +		pmu->cpuhp.cpu = target;
> +	}
> +
> +	return 0;
> +}
> +
> +static enum cpuhp_state cpuhp_slot = CPUHP_INVALID;
> +
> +int xe_pmu_init(void)
> +{
> +	int ret;
> +
> +	ret = cpuhp_setup_state_multi(CPUHP_AP_ONLINE_DYN,
> +				      "perf/x86/intel/xe:online",
> +				      xe_pmu_cpu_online,
> +				      xe_pmu_cpu_offline);
> +	if (ret < 0)
> +		pr_notice("Failed to setup cpuhp state for xe PMU! (%d)\n",
> +			  ret);
> +	else
> +		cpuhp_slot = ret;
> +
> +	return 0;
> +}
> +
> +void xe_pmu_exit(void)
> +{
> +	if (cpuhp_slot != CPUHP_INVALID)
> +		cpuhp_remove_multi_state(cpuhp_slot);
> +}
> +
> +static int xe_pmu_register_cpuhp_state(struct xe_pmu *pmu)
> +{
> +	if (cpuhp_slot == CPUHP_INVALID)
> +		return -EINVAL;
> +
> +	return cpuhp_state_add_instance(cpuhp_slot, &pmu->cpuhp.node);
> +}
> +
> +static void xe_pmu_unregister_cpuhp_state(struct xe_pmu *pmu)
> +{
> +	cpuhp_state_remove_instance(cpuhp_slot, &pmu->cpuhp.node);
> +}
> +
> +static void xe_pmu_unregister(struct drm_device *device, void *arg)
> +{
> +	struct xe_pmu *pmu = arg;
> +
> +	if (!pmu->base.event_init)
> +		return;
> +
> +	/*
> +	 * "Disconnect" the PMU callbacks - since all are atomic synchronize_rcu
> +	 * ensures all currently executing ones will have exited before we
> +	 * proceed with unregistration.
> +	 */
> +	pmu->closed = true;
> +	synchronize_rcu();
> +
> +	xe_pmu_unregister_cpuhp_state(pmu);
> +
> +	perf_pmu_unregister(&pmu->base);
> +	pmu->base.event_init = NULL;
> +	kfree(pmu->base.attr_groups);
> +	kfree(pmu->name);
> +	free_event_attributes(pmu);
> +}
> +
> +static void init_samples(struct xe_pmu *pmu)
> +{
> +	struct xe_device *xe = container_of(pmu, typeof(*xe), pmu);
> +	struct xe_gt *gt;
> +	unsigned int i;
> +
> +	for_each_gt(gt, xe, i)
> +		engine_group_busyness_store(gt);
> +}
> +
> +void xe_pmu_register(struct xe_pmu *pmu)
> +{
> +	struct xe_device *xe = container_of(pmu, typeof(*xe), pmu);
> +	const struct attribute_group *attr_groups[] = {
> +		&xe_pmu_format_attr_group,
> +		&pmu->events_attr_group,
> +		&xe_pmu_cpumask_attr_group,
> +		NULL
> +	};
> +
> +	int ret = -ENOMEM;
> +
> +	spin_lock_init(&pmu->lock);
> +	pmu->cpuhp.cpu = -1;
> +	init_samples(pmu);
> +
> +	pmu->name = kasprintf(GFP_KERNEL,
> +			      "xe_%s",
> +			      dev_name(xe->drm.dev));
> +	if (pmu->name)
> +		/* tools/perf reserves colons as special. */
> +		strreplace((char *)pmu->name, ':', '_');
> +
> +	if (!pmu->name)
> +		goto err;
> +
> +	pmu->events_attr_group.name = "events";
> +	pmu->events_attr_group.attrs = create_event_attributes(pmu);
> +	if (!pmu->events_attr_group.attrs)
> +		goto err_name;
> +
> +	pmu->base.attr_groups = kmemdup(attr_groups, sizeof(attr_groups),
> +					GFP_KERNEL);
> +	if (!pmu->base.attr_groups)
> +		goto err_attr;
> +
> +	pmu->base.module	= THIS_MODULE;
> +	pmu->base.task_ctx_nr	= perf_invalid_context;
> +	pmu->base.event_init	= xe_pmu_event_init;
> +	pmu->base.add		= xe_pmu_event_add;
> +	pmu->base.del		= xe_pmu_event_del;
> +	pmu->base.start		= xe_pmu_event_start;
> +	pmu->base.stop		= xe_pmu_event_stop;
> +	pmu->base.read		= xe_pmu_event_read;
> +	pmu->base.event_idx	= xe_pmu_event_event_idx;
> +
> +	ret = perf_pmu_register(&pmu->base, pmu->name, -1);
> +	if (ret)
> +		goto err_groups;
> +
> +	ret = xe_pmu_register_cpuhp_state(pmu);
> +	if (ret)
> +		goto err_unreg;
> +
> +	ret = drmm_add_action_or_reset(&xe->drm, xe_pmu_unregister, pmu);
> +	XE_WARN_ON(ret);
> +
> +	return;
> +
> +err_unreg:
> +	perf_pmu_unregister(&pmu->base);
> +err_groups:
> +	kfree(pmu->base.attr_groups);
> +err_attr:
> +	pmu->base.event_init = NULL;
> +	free_event_attributes(pmu);
> +err_name:
> +	kfree(pmu->name);
> +err:
> +	drm_notice(&xe->drm, "Failed to register PMU!\n");
> +}
> diff --git a/drivers/gpu/drm/xe/xe_pmu.h b/drivers/gpu/drm/xe/xe_pmu.h
> new file mode 100644
> index 000000000000..d3f47f4ab343
> --- /dev/null
> +++ b/drivers/gpu/drm/xe/xe_pmu.h
> @@ -0,0 +1,25 @@
> +/* SPDX-License-Identifier: MIT */
> +/*
> + * Copyright © 2023 Intel Corporation
> + */
> +
> +#ifndef _XE_PMU_H_
> +#define _XE_PMU_H_
> +
> +#include "xe_gt_types.h"
> +#include "xe_pmu_types.h"
> +
> +#ifdef CONFIG_PERF_EVENTS
> +int xe_pmu_init(void);
> +void xe_pmu_exit(void);
> +void xe_pmu_register(struct xe_pmu *pmu);
> +void engine_group_busyness_store(struct xe_gt *gt);
> +#else
> +static inline int xe_pmu_init(void) { return 0; }
> +static inline void xe_pmu_exit(void) {}
> +static inline void xe_pmu_register(struct xe_pmu *pmu) {}
> +static inline void engine_group_busyness_store(struct xe_gt *gt) {}
> +#endif
> +
> +#endif
> +
> diff --git a/drivers/gpu/drm/xe/xe_pmu_types.h b/drivers/gpu/drm/xe/xe_pmu_types.h
> new file mode 100644
> index 000000000000..e87edd4d6a87
> --- /dev/null
> +++ b/drivers/gpu/drm/xe/xe_pmu_types.h
> @@ -0,0 +1,80 @@
> +/* SPDX-License-Identifier: MIT */
> +/*
> + * Copyright © 2023 Intel Corporation
> + */
> +
> +#ifndef _XE_PMU_TYPES_H_
> +#define _XE_PMU_TYPES_H_
> +
> +#include <linux/perf_event.h>
> +#include <linux/spinlock_types.h>
> +#include <uapi/drm/xe_drm.h>
> +
> +enum {
> +	__XE_SAMPLE_RENDER_GROUP_BUSY,
> +	__XE_SAMPLE_COPY_GROUP_BUSY,
> +	__XE_SAMPLE_MEDIA_GROUP_BUSY,
> +	__XE_SAMPLE_ANY_ENGINE_GROUP_BUSY,
> +	__XE_NUM_PMU_SAMPLERS
> +};
> +
> +struct xe_pmu_sample {
> +	u64 cur;
> +};
> +
> +#define XE_MAX_GT_PER_TILE 2
> +
> +struct xe_pmu {
> +	/**
> +	 * @cpuhp: Struct used for CPU hotplug handling.
> +	 */
> +	struct {
> +		struct hlist_node node;
> +		unsigned int cpu;
> +	} cpuhp;
> +	/**
> +	 * @base: PMU base.
> +	 */
> +	struct pmu base;

Are we not adding this timer for xe (it exists in i915)?

/**
 * @timer: Timer for internal i915 PMU sampling.
 */
struct hrtimer timer;
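
For context, i915 drives that sampling from a periodic hrtimer. A generic
sketch of such a sampler (not the actual i915 code; the 200 Hz period and
the xe_pmu_sample name are made up for illustration):

	static enum hrtimer_restart xe_pmu_sample(struct hrtimer *hrtimer)
	{
		/* read the HW counters and accumulate into pmu->sample[] */
		hrtimer_forward_now(hrtimer, ns_to_ktime(NSEC_PER_SEC / 200));
		return HRTIMER_RESTART;
	}

	/* at init time: */
	hrtimer_init(&pmu->timer, CLOCK_MONOTONIC, HRTIMER_MODE_REL);
	pmu->timer.function = xe_pmu_sample;
	hrtimer_start(&pmu->timer, ns_to_ktime(NSEC_PER_SEC / 200),
		      HRTIMER_MODE_REL);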

Thanks,

Vinay.

> +	/**
> +	 * @closed: xe is unregistering.
> +	 */
> +	bool closed;
> +	/**
> +	 * @name: Name as registered with perf core.
> +	 */
> +	const char *name;
> +	/**
> +	 * @lock: Lock protecting enable mask and ref count handling.
> +	 */
> +	spinlock_t lock;
> +	/**
> +	 * @sample: Current and previous (raw) counters.
> +	 *
> +	 * These counters are updated when the device is awake.
> +	 *
> +	 */
> +	struct xe_pmu_sample sample[XE_MAX_GT_PER_TILE * __XE_NUM_PMU_SAMPLERS];
> +	/**
> +	 * @irq_count: Number of interrupts
> +	 *
> +	 * Intentionally unsigned long to avoid atomics or heuristics on 32bit.
> +	 * 4e9 interrupts are a lot and postprocessing can really deal with an
> +	 * occasional wraparound easily. It's 32bit after all.
> +	 */
> +	unsigned long irq_count;
> +	/**
> +	 * @events_attr_group: Device events attribute group.
> +	 */
> +	struct attribute_group events_attr_group;
> +	/**
> +	 * @xe_attr: Memory block holding device attributes.
> +	 */
> +	void *xe_attr;
> +	/**
> +	 * @pmu_attr: Memory block holding device attributes.
> +	 */
> +	void *pmu_attr;
> +};
> +
> +#endif
> diff --git a/include/uapi/drm/xe_drm.h b/include/uapi/drm/xe_drm.h
> index 965cd9527ff1..ed097056f944 100644
> --- a/include/uapi/drm/xe_drm.h
> +++ b/include/uapi/drm/xe_drm.h
> @@ -990,6 +990,22 @@ struct drm_xe_vm_madvise {
>   	__u64 reserved[2];
>   };
>   
> +/* PMU event config IDs */
> +
> +/*
> + * Top 4 bits of every counter are GT id.
> + */
> +#define __XE_PMU_GT_SHIFT (60)
> +
> +#define ___XE_PMU_OTHER(gt, x) \
> +	(((__u64)(x)) | ((__u64)(gt) << __XE_PMU_GT_SHIFT))
> +
> +#define XE_PMU_INTERRUPTS(gt)			___XE_PMU_OTHER(gt, 0)
> +#define XE_PMU_RENDER_GROUP_BUSY(gt)		___XE_PMU_OTHER(gt, 1)
> +#define XE_PMU_COPY_GROUP_BUSY(gt)		___XE_PMU_OTHER(gt, 2)
> +#define XE_PMU_MEDIA_GROUP_BUSY(gt)		___XE_PMU_OTHER(gt, 3)
> +#define XE_PMU_ANY_ENGINE_GROUP_BUSY(gt)	___XE_PMU_OTHER(gt, 4)
> +
>   #if defined(__cplusplus)
>   }
>   #endif

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [Intel-xe] [PATCH v2 2/2] drm/xe/pmu: Enable PMU interface
  2023-07-06  2:40   ` Belgaumkar, Vinay
@ 2023-07-06 13:06     ` Iddamsetty, Aravind
  0 siblings, 0 replies; 59+ messages in thread
From: Iddamsetty, Aravind @ 2023-07-06 13:06 UTC (permalink / raw)
  To: Belgaumkar, Vinay, intel-xe



On 06-07-2023 08:10, Belgaumkar, Vinay wrote:
> 
> On 6/27/2023 5:21 AM, aravind.iddamsetty at intel.com (Aravind
> Iddamsetty) wrote:
>> There are a set of engine group busyness counters provided by HW which
>> are
>> perfect fit to be exposed via PMU perf events.
>>
>> BSPEC: 46559, 46560, 46722, 46729
>>
>> events can be listed using:
>> perf list
>>    xe_0000_03_00.0/any-engine-group-busy-gt0/         [Kernel PMU event]
>>    xe_0000_03_00.0/copy-group-busy-gt0/               [Kernel PMU event]
>>    xe_0000_03_00.0/interrupts/                        [Kernel PMU event]
>>    xe_0000_03_00.0/media-group-busy-gt0/              [Kernel PMU event]
>>    xe_0000_03_00.0/render-group-busy-gt0/             [Kernel PMU event]
>>
>> and can be read using:
>>
>> perf stat -e "xe_0000_8c_00.0/render-group-busy-gt0/" -I 1000
>>             time             counts unit events
>>       1.001139062                  0 ns 
>> xe_0000_8c_00.0/render-group-busy-gt0/
>>       2.003294678                  0 ns 
>> xe_0000_8c_00.0/render-group-busy-gt0/
>>       3.005199582                  0 ns 
>> xe_0000_8c_00.0/render-group-busy-gt0/
>>       4.007076497                  0 ns 
>> xe_0000_8c_00.0/render-group-busy-gt0/
>>       5.008553068                  0 ns 
>> xe_0000_8c_00.0/render-group-busy-gt0/
>>       6.010531563              43520 ns 
>> xe_0000_8c_00.0/render-group-busy-gt0/
>>       7.012468029              44800 ns 
>> xe_0000_8c_00.0/render-group-busy-gt0/
>>       8.013463515                  0 ns 
>> xe_0000_8c_00.0/render-group-busy-gt0/
>>       9.015300183                  0 ns 
>> xe_0000_8c_00.0/render-group-busy-gt0/
>>      10.017233010                  0 ns 
>> xe_0000_8c_00.0/render-group-busy-gt0/
>>      10.971934120                  0 ns 
>> xe_0000_8c_00.0/render-group-busy-gt0/
>>
>> The pmu base implementation is taken from i915.
>>
>> v2:
>> Store last known value when device is awake return that while the GT is
>> suspended and then update the driver copy when read during awake.
>>
>> Co-developed-by: Tvrtko Ursulin <tvrtko.ursulin at intel.com>
>> Co-developed-by: Bommu Krishnaiah <krishnaiah.bommu at intel.com>
>> Signed-off-by: Aravind Iddamsetty <aravind.iddamsetty at intel.com>
>> ---
>>   drivers/gpu/drm/xe/Makefile          |   2 +
>>   drivers/gpu/drm/xe/regs/xe_gt_regs.h |   5 +
>>   drivers/gpu/drm/xe/xe_device.c       |   2 +
>>   drivers/gpu/drm/xe/xe_device_types.h |   4 +
>>   drivers/gpu/drm/xe/xe_gt.c           |   2 +
>>   drivers/gpu/drm/xe/xe_irq.c          |  22 +
>>   drivers/gpu/drm/xe/xe_module.c       |   5 +
>>   drivers/gpu/drm/xe/xe_pmu.c          | 739 +++++++++++++++++++++++++++
>>   drivers/gpu/drm/xe/xe_pmu.h          |  25 +
>>   drivers/gpu/drm/xe/xe_pmu_types.h    |  80 +++
>>   include/uapi/drm/xe_drm.h            |  16 +
>>   11 files changed, 902 insertions(+)
>>   create mode 100644 drivers/gpu/drm/xe/xe_pmu.c
>>   create mode 100644 drivers/gpu/drm/xe/xe_pmu.h
>>   create mode 100644 drivers/gpu/drm/xe/xe_pmu_types.h
>>
>> diff --git a/drivers/gpu/drm/xe/Makefile b/drivers/gpu/drm/xe/Makefile
>> index 081c57fd8632..e52ab795c566 100644
>> --- a/drivers/gpu/drm/xe/Makefile
>> +++ b/drivers/gpu/drm/xe/Makefile
>> @@ -217,6 +217,8 @@ xe-$(CONFIG_DRM_XE_DISPLAY) += \
>>       i915-display/skl_universal_plane.o \
>>       i915-display/skl_watermark.o
>>   +xe-$(CONFIG_PERF_EVENTS) += xe_pmu.o
>> +
>>   ifeq ($(CONFIG_ACPI),y)
>>       xe-$(CONFIG_DRM_XE_DISPLAY) += \
>>           i915-display/intel_acpi.o \
>> diff --git a/drivers/gpu/drm/xe/regs/xe_gt_regs.h
>> b/drivers/gpu/drm/xe/regs/xe_gt_regs.h
>> index 3f664011eaea..c7d9e4634745 100644
>> --- a/drivers/gpu/drm/xe/regs/xe_gt_regs.h
>> +++ b/drivers/gpu/drm/xe/regs/xe_gt_regs.h
>> @@ -285,6 +285,11 @@
>>   #define   INVALIDATION_BROADCAST_MODE_DIS    REG_BIT(12)
>>   #define   GLOBAL_INVALIDATION_MODE        REG_BIT(2)
>>   +#define XE_OAG_RC0_ANY_ENGINE_BUSY_FREE        XE_REG(0xdb80)
>> +#define XE_OAG_ANY_MEDIA_FF_BUSY_FREE        XE_REG(0xdba0)
>> +#define XE_OAG_BLT_BUSY_FREE            XE_REG(0xdbbc)
>> +#define XE_OAG_RENDER_BUSY_FREE            XE_REG(0xdbdc)
>> +
>>   #define SAMPLER_MODE                XE_REG_MCR(0xe18c,
>> XE_REG_OPTION_MASKED)
>>   #define   ENABLE_SMALLPL            REG_BIT(15)
>>   #define   SC_DISABLE_POWER_OPTIMIZATION_EBB    REG_BIT(9)
>> diff --git a/drivers/gpu/drm/xe/xe_device.c
>> b/drivers/gpu/drm/xe/xe_device.c
>> index c7985af85a53..b2c7bd4a97d9 100644
>> --- a/drivers/gpu/drm/xe/xe_device.c
>> +++ b/drivers/gpu/drm/xe/xe_device.c
>> @@ -328,6 +328,8 @@ int xe_device_probe(struct xe_device *xe)
>>         xe_debugfs_register(xe);
>>   +    xe_pmu_register(&xe->pmu);
>> +
>>       err = drmm_add_action_or_reset(&xe->drm, xe_device_sanitize, xe);
>>       if (err)
>>           return err;
>> diff --git a/drivers/gpu/drm/xe/xe_device_types.h
>> b/drivers/gpu/drm/xe/xe_device_types.h
>> index 0226d44a6af2..3ba99aae92b9 100644
>> --- a/drivers/gpu/drm/xe/xe_device_types.h
>> +++ b/drivers/gpu/drm/xe/xe_device_types.h
>> @@ -15,6 +15,7 @@
>>   #include "xe_devcoredump_types.h"
>>   #include "xe_gt_types.h"
>>   #include "xe_platform_types.h"
>> +#include "xe_pmu.h"
>>   #include "xe_step_types.h"
>>     #if IS_ENABLED(CONFIG_DRM_XE_DISPLAY)
>> @@ -332,6 +333,9 @@ struct xe_device {
>>       /** @d3cold_allowed: Indicates if d3cold is a valid device state */
>>       bool d3cold_allowed;
>>   +    /* @pmu: performance monitoring unit */
>> +    struct xe_pmu pmu;
>> +
>>       /* private: */
>>     #if IS_ENABLED(CONFIG_DRM_XE_DISPLAY)
>> diff --git a/drivers/gpu/drm/xe/xe_gt.c b/drivers/gpu/drm/xe/xe_gt.c
>> index 2458397ce8af..96e3720923d4 100644
>> --- a/drivers/gpu/drm/xe/xe_gt.c
>> +++ b/drivers/gpu/drm/xe/xe_gt.c
>> @@ -593,6 +593,8 @@ int xe_gt_suspend(struct xe_gt *gt)
>>       if (err)
>>           goto err_msg;
>>   +    engine_group_busyness_store(gt);
>> +
>>       err = xe_uc_suspend(&gt->uc);
>>       if (err)
>>           goto err_force_wake;
>> diff --git a/drivers/gpu/drm/xe/xe_irq.c b/drivers/gpu/drm/xe/xe_irq.c
>> index b4ed1e4a3388..cb943fb94ec7 100644
>> --- a/drivers/gpu/drm/xe/xe_irq.c
>> +++ b/drivers/gpu/drm/xe/xe_irq.c
>> @@ -27,6 +27,24 @@
>>   #define IIR(offset)                XE_REG(offset + 0x8)
>>   #define IER(offset)                XE_REG(offset + 0xc)
>>   +/*
>> + * Interrupt statistic for PMU. Increments the counter only if the
>> + * interrupt originated from the GPU so interrupts from a device which
>> + * shares the interrupt line are not accounted.
>> + */
>> +static inline void xe_pmu_irq_stats(struct xe_device *xe,
>> +                    irqreturn_t res)
>> +{
>> +    if (unlikely(res != IRQ_HANDLED))
>> +        return;
>> +
>> +    /*
>> +     * A clever compiler translates that into INC. A not so clever one
>> +     * should at least prevent store tearing.
>> +     */
>> +    WRITE_ONCE(xe->pmu.irq_count, xe->pmu.irq_count + 1);
>> +}
>> +
>>   static void assert_iir_is_zero(struct xe_gt *mmio, struct xe_reg reg)
>>   {
>>       u32 val = xe_mmio_read32(mmio, reg);
>> @@ -325,6 +343,8 @@ static irqreturn_t xelp_irq_handler(int irq, void
>> *arg)
>>         xe_display_irq_enable(xe, gu_misc_iir);
>>   +    xe_pmu_irq_stats(xe, IRQ_HANDLED);
>> +
>>       return IRQ_HANDLED;
>>   }
>>   @@ -414,6 +434,8 @@ static irqreturn_t dg1_irq_handler(int irq, void
>> *arg)
>>       dg1_intr_enable(xe, false);
>>       xe_display_irq_enable(xe, gu_misc_iir);
>>   +    xe_pmu_irq_stats(xe, IRQ_HANDLED);
>> +
>>       return IRQ_HANDLED;
>>   }
>>   diff --git a/drivers/gpu/drm/xe/xe_module.c
>> b/drivers/gpu/drm/xe/xe_module.c
>> index 75e5be939f53..f6fe89748525 100644
>> --- a/drivers/gpu/drm/xe/xe_module.c
>> +++ b/drivers/gpu/drm/xe/xe_module.c
>> @@ -12,6 +12,7 @@
>>   #include "xe_hw_fence.h"
>>   #include "xe_module.h"
>>   #include "xe_pci.h"
>> +#include "xe_pmu.h"
>>   #include "xe_sched_job.h"
>>     bool enable_guc = true;
>> @@ -49,6 +50,10 @@ static const struct init_funcs init_funcs[] = {
>>           .init = xe_sched_job_module_init,
>>           .exit = xe_sched_job_module_exit,
>>       },
>> +    {
>> +        .init = xe_pmu_init,
>> +        .exit = xe_pmu_exit,
>> +    },
>>       {
>>           .init = xe_register_pci_driver,
>>           .exit = xe_unregister_pci_driver,
>> diff --git a/drivers/gpu/drm/xe/xe_pmu.c b/drivers/gpu/drm/xe/xe_pmu.c
>> new file mode 100644
>> index 000000000000..bef1895be9f7
>> --- /dev/null
>> +++ b/drivers/gpu/drm/xe/xe_pmu.c
>> @@ -0,0 +1,739 @@
>> +/*
>> + * SPDX-License-Identifier: MIT
>> + *
>> + * Copyright © 2023 Intel Corporation
>> + */
>> +
>> +#include <drm/drm_drv.h>
>> +#include <drm/drm_managed.h>
>> +#include <drm/xe_drm.h>
>> +
>> +#include "regs/xe_gt_regs.h"
>> +#include "xe_device.h"
>> +#include "xe_gt_clock.h"
>> +#include "xe_mmio.h"
>> +
>> +static cpumask_t xe_pmu_cpumask;
>> +static unsigned int xe_pmu_target_cpu = -1;
>> +
>> +static unsigned int config_gt_id(const u64 config)
>> +{
>> +    return config >> __XE_PMU_GT_SHIFT;
>> +}
>> +
>> +static u64 config_counter(const u64 config)
>> +{
>> +    return config & ~(~0ULL << __XE_PMU_GT_SHIFT);
>> +}
>> +
>> +static unsigned int
>> +__sample_idx(struct xe_pmu *pmu, unsigned int gt_id, int sample)
>> +{
>> +    unsigned int idx = gt_id * __XE_NUM_PMU_SAMPLERS + sample;
>> +
>> +    XE_BUG_ON(idx >= ARRAY_SIZE(pmu->sample));
>> +
>> +    return idx;
>> +}
>> +
>> +static u64 read_sample(struct xe_pmu *pmu, unsigned int gt_id, int
>> sample)
>> +{
>> +    return pmu->sample[__sample_idx(pmu, gt_id, sample)].cur;
>> +}
>> +
>> +static void
>> +store_sample(struct xe_pmu *pmu, unsigned int gt_id, int sample, u64
>> val)
>> +{
>> +    pmu->sample[__sample_idx(pmu, gt_id, sample)].cur = val;
>> +}
>> +
>> +static int engine_busyness_sample_type(u64 config)
>> +{
>> +    int type = 0;
>> +
>> +    switch (config) {
>> +    case XE_PMU_RENDER_GROUP_BUSY(0):
>> +        type =  __XE_SAMPLE_RENDER_GROUP_BUSY;
>> +        break;
>> +    case XE_PMU_COPY_GROUP_BUSY(0):
>> +        type = __XE_SAMPLE_COPY_GROUP_BUSY;
>> +        break;
>> +    case XE_PMU_MEDIA_GROUP_BUSY(0):
>> +        type = __XE_SAMPLE_MEDIA_GROUP_BUSY;
>> +        break;
>> +    case XE_PMU_ANY_ENGINE_GROUP_BUSY(0):
>> +        type = __XE_SAMPLE_ANY_ENGINE_GROUP_BUSY;
>> +        break;
>> +    }
>> +
>> +    return type;
>> +}
>> +
>> +static void xe_pmu_event_destroy(struct perf_event *event)
>> +{
>> +    struct xe_device *xe =
>> +        container_of(event->pmu, typeof(*xe), pmu.base);
>> +
>> +    drm_WARN_ON(&xe->drm, event->parent);
>> +
>> +    drm_dev_put(&xe->drm);
>> +}
>> +
>> +static u64 __engine_group_busyness_read(struct xe_gt *gt, u64 config)
>> +{
>> +    u64 val = 0;
>> +
>> +    switch (config) {
>> +    case XE_PMU_RENDER_GROUP_BUSY(0):
>> +        val = xe_mmio_read32(gt, XE_OAG_RENDER_BUSY_FREE);
>> +        break;
>> +    case XE_PMU_COPY_GROUP_BUSY(0):
>> +        val = xe_mmio_read32(gt, XE_OAG_BLT_BUSY_FREE);
>> +        break;
>> +    case XE_PMU_MEDIA_GROUP_BUSY(0):
>> +        val = xe_mmio_read32(gt, XE_OAG_ANY_MEDIA_FF_BUSY_FREE);
>> +        break;
>> +    case XE_PMU_ANY_ENGINE_GROUP_BUSY(0):
>> +        val = xe_mmio_read32(gt, XE_OAG_RC0_ANY_ENGINE_BUSY_FREE);
>> +        break;
>> +    default:
>> +        drm_warn(&gt->tile->xe->drm, "unknown pmu event\n");
>> +    }
>> +
>> +    return xe_gt_clock_interval_to_ns(gt, val * 16);
>> +}
>> +
>> +static u64 engine_group_busyness_read(struct xe_gt *gt, u64 config)
>> +{
>> +    int sample_type = engine_busyness_sample_type(config);
>> +    struct xe_device *xe = gt->tile->xe;
>> +    const unsigned int gt_id = gt->info.id;
>> +    struct xe_pmu *pmu = &xe->pmu;
>> +    bool device_awake;
>> +    unsigned long flags;
>> +    u64 val;
>> +
>> +    /*
>> +     * found no better way to check if device is awake or not. Before
>> +     * we suspend we set the submission_state.enabled to false.
>> +     */
>> +    device_awake = gt->uc.guc.submission_state.enabled ? true : false;
>> +    if (device_awake)
>> +        val = __engine_group_busyness_read(gt, config);
>> +
>> +    spin_lock_irqsave(&pmu->lock, flags);
>> +
>> +    if (device_awake)
>> +        store_sample(pmu, gt_id, sample_type, val);
>> +    else
>> +        val = read_sample(pmu, gt_id, sample_type);
>> +
>> +    spin_unlock_irqrestore(&pmu->lock, flags);
>> +
>> +    return val;
>> +}
>> +
>> +void engine_group_busyness_store(struct xe_gt *gt)
>> +{
>> +    struct xe_pmu *pmu = &gt->tile->xe->pmu;
>> +    unsigned int gt_id = gt->info.id;
>> +    unsigned long flags;
>> +
>> +    spin_lock_irqsave(&pmu->lock, flags);
>> +
>> +    store_sample(pmu, gt_id, __XE_SAMPLE_RENDER_GROUP_BUSY,
>> +             __engine_group_busyness_read(gt,
>> XE_PMU_RENDER_GROUP_BUSY(0)));
>> +    store_sample(pmu, gt_id, __XE_SAMPLE_COPY_GROUP_BUSY,
>> +             __engine_group_busyness_read(gt,
>> XE_PMU_COPY_GROUP_BUSY(0)));
>> +    store_sample(pmu, gt_id, __XE_SAMPLE_MEDIA_GROUP_BUSY,
>> +             __engine_group_busyness_read(gt,
>> XE_PMU_MEDIA_GROUP_BUSY(0)));
>> +    store_sample(pmu, gt_id, __XE_SAMPLE_ANY_ENGINE_GROUP_BUSY,
>> +             __engine_group_busyness_read(gt,
>> XE_PMU_ANY_ENGINE_GROUP_BUSY(0)));
>> +
>> +    spin_unlock_irqrestore(&pmu->lock, flags);
>> +}
>> +
>> +static int
>> +config_status(struct xe_device *xe, u64 config)
>> +{
>> +    unsigned int max_gt_id = xe->info.gt_count > 1 ? 1 : 0;
>> +    unsigned int gt_id = config_gt_id(config);
>> +    struct xe_gt *gt = xe_device_get_gt(xe, gt_id);
>> +
>> +    if (gt_id > max_gt_id)
>> +        return -ENOENT;
>> +
>> +    switch (config_counter(config)) {
>> +    case XE_PMU_INTERRUPTS(0):
>> +        if (gt_id)
>> +            return -ENOENT;
>> +        break;
>> +    case XE_PMU_RENDER_GROUP_BUSY(0):
>> +    case XE_PMU_COPY_GROUP_BUSY(0):
>> +    case XE_PMU_ANY_ENGINE_GROUP_BUSY(0):
>> +        if (GRAPHICS_VER(xe) < 12)
>> +            return -ENOENT;
>> +        break;
>> +    case XE_PMU_MEDIA_GROUP_BUSY(0):
>> +        if (MEDIA_VER(xe) >= 13 && gt->info.type != XE_GT_TYPE_MEDIA)
>> +            return -ENOENT;
>> +        break;
>> +    default:
>> +        return -ENOENT;
>> +    }
>> +
>> +    return 0;
>> +}
>> +
>> +static int xe_pmu_event_init(struct perf_event *event)
>> +{
>> +    struct xe_device *xe =
>> +        container_of(event->pmu, typeof(*xe), pmu.base);
>> +    struct xe_pmu *pmu = &xe->pmu;
>> +    int ret;
>> +
>> +    if (pmu->closed)
>> +        return -ENODEV;
>> +
>> +    if (event->attr.type != event->pmu->type)
>> +        return -ENOENT;
>> +
>> +    /* unsupported modes and filters */
>> +    if (event->attr.sample_period) /* no sampling */
>> +        return -EINVAL;
>> +
>> +    if (has_branch_stack(event))
>> +        return -EOPNOTSUPP;
>> +
>> +    if (event->cpu < 0)
>> +        return -EINVAL;
>> +
>> +    /* only allow running on one cpu at a time */
>> +    if (!cpumask_test_cpu(event->cpu, &xe_pmu_cpumask))
>> +        return -EINVAL;
>> +
>> +    ret = config_status(xe, event->attr.config);
>> +    if (ret)
>> +        return ret;
>> +
>> +    if (!event->parent) {
>> +        drm_dev_get(&xe->drm);
>> +        event->destroy = xe_pmu_event_destroy;
>> +    }
>> +
>> +    return 0;
>> +}
>> +
>> +static u64 __xe_pmu_event_read(struct perf_event *event)
>> +{
>> +    struct xe_device *xe =
>> +        container_of(event->pmu, typeof(*xe), pmu.base);
>> +    const unsigned int gt_id = config_gt_id(event->attr.config);
>> +    const u64 config = config_counter(event->attr.config);
>> +    struct xe_gt *gt = xe_device_get_gt(xe, gt_id);
>> +    struct xe_pmu *pmu = &xe->pmu;
>> +    u64 val = 0;
>> +
>> +    switch (config) {
>> +    case XE_PMU_INTERRUPTS(0):
>> +        val = READ_ONCE(pmu->irq_count);
>> +        break;
>> +    case XE_PMU_RENDER_GROUP_BUSY(0):
>> +    case XE_PMU_COPY_GROUP_BUSY(0):
>> +    case XE_PMU_ANY_ENGINE_GROUP_BUSY(0):
>> +    case XE_PMU_MEDIA_GROUP_BUSY(0):
>> +        val = engine_group_busyness_read(gt, config);
>> +    }
>> +
>> +    return val;
>> +}
>> +
>> +static void xe_pmu_event_read(struct perf_event *event)
>> +{
>> +    struct xe_device *xe =
>> +        container_of(event->pmu, typeof(*xe), pmu.base);
>> +    struct hw_perf_event *hwc = &event->hw;
>> +    struct xe_pmu *pmu = &xe->pmu;
>> +    u64 prev, new;
>> +
>> +    if (pmu->closed) {
>> +        event->hw.state = PERF_HES_STOPPED;
>> +        return;
>> +    }
>> +again:
>> +    prev = local64_read(&hwc->prev_count);
>> +    new = __xe_pmu_event_read(event);
>> +
>> +    if (local64_cmpxchg(&hwc->prev_count, prev, new) != prev)
>> +        goto again;
>> +
>> +    local64_add(new - prev, &event->count);
>> +}
>> +
>> +static void xe_pmu_enable(struct perf_event *event)
>> +{
>> +    /*
>> +     * Store the current counter value so we can report the correct
>> delta
>> +     * for all listeners. Even when the event was already enabled and
>> has
>> +     * an existing non-zero value.
>> +     */
>> +    local64_set(&event->hw.prev_count, __xe_pmu_event_read(event));
>> +}
>> +
>> +static void xe_pmu_event_start(struct perf_event *event, int flags)
>> +{
>> +    struct xe_device *xe =
>> +        container_of(event->pmu, typeof(*xe), pmu.base);
>> +    struct xe_pmu *pmu = &xe->pmu;
>> +
>> +    if (pmu->closed)
>> +        return;
>> +
>> +    xe_pmu_enable(event);
>> +    event->hw.state = 0;
>> +}
>> +
>> +static void xe_pmu_event_stop(struct perf_event *event, int flags)
>> +{
>> +    if (flags & PERF_EF_UPDATE)
>> +        xe_pmu_event_read(event);
>> +
>> +    event->hw.state = PERF_HES_STOPPED;
>> +}
>> +
>> +static int xe_pmu_event_add(struct perf_event *event, int flags)
>> +{
>> +    struct xe_device *xe =
>> +        container_of(event->pmu, typeof(*xe), pmu.base);
>> +    struct xe_pmu *pmu = &xe->pmu;
>> +
>> +    if (pmu->closed)
>> +        return -ENODEV;
>> +
>> +    if (flags & PERF_EF_START)
>> +        xe_pmu_event_start(event, flags);
>> +
>> +    return 0;
>> +}
>> +
>> +static void xe_pmu_event_del(struct perf_event *event, int flags)
>> +{
>> +    xe_pmu_event_stop(event, PERF_EF_UPDATE);
>> +}
>> +
>> +static int xe_pmu_event_event_idx(struct perf_event *event)
>> +{
>> +    return 0;
>> +}
>> +
>> +struct xe_str_attribute {
>> +    struct device_attribute attr;
>> +    const char *str;
>> +};
>> +
>> +static ssize_t xe_pmu_format_show(struct device *dev,
>> +                  struct device_attribute *attr, char *buf)
>> +{
>> +    struct xe_str_attribute *eattr;
>> +
>> +    eattr = container_of(attr, struct xe_str_attribute, attr);
>> +    return sprintf(buf, "%s\n", eattr->str);
>> +}
>> +
>> +#define XE_PMU_FORMAT_ATTR(_name, _config) \
>> +    (&((struct xe_str_attribute[]) { \
>> +        { .attr = __ATTR(_name, 0444, xe_pmu_format_show, NULL), \
>> +          .str = _config, } \
>> +    })[0].attr.attr)
>> +
>> +static struct attribute *xe_pmu_format_attrs[] = {
>> +    XE_PMU_FORMAT_ATTR(xe_eventid, "config:0-20"),
>> +    NULL,
>> +};
>> +
>> +static const struct attribute_group xe_pmu_format_attr_group = {
>> +    .name = "format",
>> +    .attrs = xe_pmu_format_attrs,
>> +};
>> +
>> +struct xe_ext_attribute {
>> +    struct device_attribute attr;
>> +    unsigned long val;
>> +};
>> +
>> +static ssize_t xe_pmu_event_show(struct device *dev,
>> +                 struct device_attribute *attr, char *buf)
>> +{
>> +    struct xe_ext_attribute *eattr;
>> +
>> +    eattr = container_of(attr, struct xe_ext_attribute, attr);
>> +    return sprintf(buf, "config=0x%lx\n", eattr->val);
>> +}
>> +
>> +static ssize_t cpumask_show(struct device *dev,
>> +                struct device_attribute *attr, char *buf)
>> +{
>> +    return cpumap_print_to_pagebuf(true, buf, &xe_pmu_cpumask);
>> +}
>> +
>> +static DEVICE_ATTR_RO(cpumask);
>> +
>> +static struct attribute *xe_cpumask_attrs[] = {
>> +    &dev_attr_cpumask.attr,
>> +    NULL,
>> +};
>> +
>> +static const struct attribute_group xe_pmu_cpumask_attr_group = {
>> +    .attrs = xe_cpumask_attrs,
>> +};
>> +
>> +#define __event(__counter, __name, __unit) \
>> +{ \
>> +    .counter = (__counter), \
>> +    .name = (__name), \
>> +    .unit = (__unit), \
>> +    .global = false, \
>> +}
>> +
>> +#define __global_event(__counter, __name, __unit) \
>> +{ \
>> +    .counter = (__counter), \
>> +    .name = (__name), \
>> +    .unit = (__unit), \
>> +    .global = true, \
>> +}
>> +
>> +static struct xe_ext_attribute *
>> +add_xe_attr(struct xe_ext_attribute *attr, const char *name, u64 config)
>> +{
>> +    sysfs_attr_init(&attr->attr.attr);
>> +    attr->attr.attr.name = name;
>> +    attr->attr.attr.mode = 0444;
>> +    attr->attr.show = xe_pmu_event_show;
>> +    attr->val = config;
>> +
>> +    return ++attr;
>> +}
>> +
>> +static struct perf_pmu_events_attr *
>> +add_pmu_attr(struct perf_pmu_events_attr *attr, const char *name,
>> +         const char *str)
>> +{
>> +    sysfs_attr_init(&attr->attr.attr);
>> +    attr->attr.attr.name = name;
>> +    attr->attr.attr.mode = 0444;
>> +    attr->attr.show = perf_event_sysfs_show;
>> +    attr->event_str = str;
>> +
>> +    return ++attr;
>> +}
>> +
>> +static struct attribute **
>> +create_event_attributes(struct xe_pmu *pmu)
>> +{
>> +    struct xe_device *xe = container_of(pmu, typeof(*xe), pmu);
>> +    static const struct {
>> +        unsigned int counter;
>> +        const char *name;
>> +        const char *unit;
>> +        bool global;
>> +    } events[] = {
>> +        __global_event(0, "interrupts", NULL),
>> +        __event(1, "render-group-busy", "ns"),
>> +        __event(2, "copy-group-busy", "ns"),
>> +        __event(3, "media-group-busy", "ns"),
>> +        __event(4, "any-engine-group-busy", "ns"),
>> +    };
>> +
>> +    unsigned int count = 0;
>> +    struct perf_pmu_events_attr *pmu_attr = NULL, *pmu_iter;
>> +    struct xe_ext_attribute *xe_attr = NULL, *xe_iter;
>> +    struct attribute **attr = NULL, **attr_iter;
>> +    struct xe_gt *gt;
>> +    unsigned int i, j;
>> +
>> +    /* Count how many counters we will be exposing. */
>> +    for_each_gt(gt, xe, j) {
>> +        for (i = 0; i < ARRAY_SIZE(events); i++) {
>> +            u64 config = ___XE_PMU_OTHER(j, events[i].counter);
>> +
>> +            if (!config_status(xe, config))
>> +                count++;
>> +        }
>> +    }
>> +
>> +    /* Allocate attribute objects and table. */
>> +    xe_attr = kcalloc(count, sizeof(*xe_attr), GFP_KERNEL);
>> +    if (!xe_attr)
>> +        goto err_alloc;
>> +
>> +    pmu_attr = kcalloc(count, sizeof(*pmu_attr), GFP_KERNEL);
>> +    if (!pmu_attr)
>> +        goto err_alloc;
>> +
>> +    /* Max one pointer of each attribute type plus a termination
>> entry. */
>> +    attr = kcalloc(count * 2 + 1, sizeof(*attr), GFP_KERNEL);
>> +    if (!attr)
>> +        goto err_alloc;
>> +
>> +    xe_iter = xe_attr;
>> +    pmu_iter = pmu_attr;
>> +    attr_iter = attr;
>> +
>> +    for_each_gt(gt, xe, j) {
>> +        for (i = 0; i < ARRAY_SIZE(events); i++) {
>> +            u64 config = ___XE_PMU_OTHER(j, events[i].counter);
>> +            char *str;
>> +
>> +            if (config_status(xe, config))
>> +                continue;
>> +
>> +            if (events[i].global)
>> +                str = kstrdup(events[i].name, GFP_KERNEL);
>> +            else
>> +                str = kasprintf(GFP_KERNEL, "%s-gt%u",
>> +                        events[i].name, j);
>> +            if (!str)
>> +                goto err;
>> +
>> +            *attr_iter++ = &xe_iter->attr.attr;
>> +            xe_iter = add_xe_attr(xe_iter, str, config);
>> +
>> +            if (events[i].unit) {
>> +                if (events[i].global)
>> +                    str = kasprintf(GFP_KERNEL, "%s.unit",
>> +                            events[i].name);
>> +                else
>> +                    str = kasprintf(GFP_KERNEL, "%s-gt%u.unit",
>> +                            events[i].name, j);
>> +                if (!str)
>> +                    goto err;
>> +
>> +                *attr_iter++ = &pmu_iter->attr.attr;
>> +                pmu_iter = add_pmu_attr(pmu_iter, str,
>> +                            events[i].unit);
>> +            }
>> +        }
>> +    }
>> +
>> +    pmu->xe_attr = xe_attr;
>> +    pmu->pmu_attr = pmu_attr;
>> +
>> +    return attr;
>> +
>> +err:
>> +    for (attr_iter = attr; *attr_iter; attr_iter++)
>> +        kfree((*attr_iter)->name);
>> +
>> +err_alloc:
>> +    kfree(attr);
>> +    kfree(xe_attr);
>> +    kfree(pmu_attr);
>> +
>> +    return NULL;
>> +}
>> +
>> +static void free_event_attributes(struct xe_pmu *pmu)
>> +{
>> +    struct attribute **attr_iter = pmu->events_attr_group.attrs;
>> +
>> +    for (; *attr_iter; attr_iter++)
>> +        kfree((*attr_iter)->name);
>> +
>> +    kfree(pmu->events_attr_group.attrs);
>> +    kfree(pmu->xe_attr);
>> +    kfree(pmu->pmu_attr);
>> +
>> +    pmu->events_attr_group.attrs = NULL;
>> +    pmu->xe_attr = NULL;
>> +    pmu->pmu_attr = NULL;
>> +}
>> +
>> +static int xe_pmu_cpu_online(unsigned int cpu, struct hlist_node *node)
>> +{
>> +    struct xe_pmu *pmu = hlist_entry_safe(node, typeof(*pmu),
>> cpuhp.node);
>> +
>> +    XE_BUG_ON(!pmu->base.event_init);
>> +
>> +    /* Select the first online CPU as a designated reader. */
>> +    if (cpumask_empty(&xe_pmu_cpumask))
>> +        cpumask_set_cpu(cpu, &xe_pmu_cpumask);
>> +
>> +    return 0;
>> +}
>> +
>> +static int xe_pmu_cpu_offline(unsigned int cpu, struct hlist_node *node)
>> +{
>> +    struct xe_pmu *pmu = hlist_entry_safe(node, typeof(*pmu),
>> cpuhp.node);
>> +    unsigned int target = xe_pmu_target_cpu;
>> +
>> +    XE_BUG_ON(!pmu->base.event_init);
>> +
>> +    /*
>> +     * Unregistering an instance generates a CPU offline event which
>> we must
>> +     * ignore to avoid incorrectly modifying the shared xe_pmu_cpumask.
>> +     */
>> +    if (pmu->closed)
>> +        return 0;
>> +
>> +    if (cpumask_test_and_clear_cpu(cpu, &xe_pmu_cpumask)) {
>> +        target = cpumask_any_but(topology_sibling_cpumask(cpu), cpu);
>> +
>> +        /* Migrate events if there is a valid target */
>> +        if (target < nr_cpu_ids) {
>> +            cpumask_set_cpu(target, &xe_pmu_cpumask);
>> +            xe_pmu_target_cpu = target;
>> +        }
>> +    }
>> +
>> +    if (target < nr_cpu_ids && target != pmu->cpuhp.cpu) {
>> +        perf_pmu_migrate_context(&pmu->base, cpu, target);
>> +        pmu->cpuhp.cpu = target;
>> +    }
>> +
>> +    return 0;
>> +}
>> +
>> +static enum cpuhp_state cpuhp_slot = CPUHP_INVALID;
>> +
>> +int xe_pmu_init(void)
>> +{
>> +    int ret;
>> +
>> +    ret = cpuhp_setup_state_multi(CPUHP_AP_ONLINE_DYN,
>> +                      "perf/x86/intel/xe:online",
>> +                      xe_pmu_cpu_online,
>> +                      xe_pmu_cpu_offline);
>> +    if (ret < 0)
>> +        pr_notice("Failed to setup cpuhp state for xe PMU! (%d)\n",
>> +              ret);
>> +    else
>> +        cpuhp_slot = ret;
>> +
>> +    return 0;
>> +}
>> +
>> +void xe_pmu_exit(void)
>> +{
>> +    if (cpuhp_slot != CPUHP_INVALID)
>> +        cpuhp_remove_multi_state(cpuhp_slot);
>> +}
>> +
>> +static int xe_pmu_register_cpuhp_state(struct xe_pmu *pmu)
>> +{
>> +    if (cpuhp_slot == CPUHP_INVALID)
>> +        return -EINVAL;
>> +
>> +    return cpuhp_state_add_instance(cpuhp_slot, &pmu->cpuhp.node);
>> +}
>> +
>> +static void xe_pmu_unregister_cpuhp_state(struct xe_pmu *pmu)
>> +{
>> +    cpuhp_state_remove_instance(cpuhp_slot, &pmu->cpuhp.node);
>> +}
>> +
>> +static void xe_pmu_unregister(struct drm_device *device, void *arg)
>> +{
>> +    struct xe_pmu *pmu = arg;
>> +
>> +    if (!pmu->base.event_init)
>> +        return;
>> +
>> +    /*
>> +     * "Disconnect" the PMU callbacks - since all are atomic
>> synchronize_rcu
>> +     * ensures all currently executing ones will have exited before we
>> +     * proceed with unregistration.
>> +     */
>> +    pmu->closed = true;
>> +    synchronize_rcu();
>> +
>> +    xe_pmu_unregister_cpuhp_state(pmu);
>> +
>> +    perf_pmu_unregister(&pmu->base);
>> +    pmu->base.event_init = NULL;
>> +    kfree(pmu->base.attr_groups);
>> +    kfree(pmu->name);
>> +    free_event_attributes(pmu);
>> +}
>> +
>> +static void init_samples(struct xe_pmu *pmu)
>> +{
>> +    struct xe_device *xe = container_of(pmu, typeof(*xe), pmu);
>> +    struct xe_gt *gt;
>> +    unsigned int i;
>> +
>> +    for_each_gt(gt, xe, i)
>> +        engine_group_busyness_store(gt);
>> +}
>> +
>> +void xe_pmu_register(struct xe_pmu *pmu)
>> +{
>> +    struct xe_device *xe = container_of(pmu, typeof(*xe), pmu);
>> +    const struct attribute_group *attr_groups[] = {
>> +        &xe_pmu_format_attr_group,
>> +        &pmu->events_attr_group,
>> +        &xe_pmu_cpumask_attr_group,
>> +        NULL
>> +    };
>> +
>> +    int ret = -ENOMEM;
>> +
>> +    spin_lock_init(&pmu->lock);
>> +    pmu->cpuhp.cpu = -1;
>> +    init_samples(pmu);
>> +
>> +    pmu->name = kasprintf(GFP_KERNEL,
>> +                  "xe_%s",
>> +                  dev_name(xe->drm.dev));
>> +    if (pmu->name)
>> +        /* tools/perf reserves colons as special. */
>> +        strreplace((char *)pmu->name, ':', '_');
>> +
>> +    if (!pmu->name)
>> +        goto err;
>> +
>> +    pmu->events_attr_group.name = "events";
>> +    pmu->events_attr_group.attrs = create_event_attributes(pmu);
>> +    if (!pmu->events_attr_group.attrs)
>> +        goto err_name;
>> +
>> +    pmu->base.attr_groups = kmemdup(attr_groups, sizeof(attr_groups),
>> +                    GFP_KERNEL);
>> +    if (!pmu->base.attr_groups)
>> +        goto err_attr;
>> +
>> +    pmu->base.module    = THIS_MODULE;
>> +    pmu->base.task_ctx_nr    = perf_invalid_context;
>> +    pmu->base.event_init    = xe_pmu_event_init;
>> +    pmu->base.add        = xe_pmu_event_add;
>> +    pmu->base.del        = xe_pmu_event_del;
>> +    pmu->base.start        = xe_pmu_event_start;
>> +    pmu->base.stop        = xe_pmu_event_stop;
>> +    pmu->base.read        = xe_pmu_event_read;
>> +    pmu->base.event_idx    = xe_pmu_event_event_idx;
>> +
>> +    ret = perf_pmu_register(&pmu->base, pmu->name, -1);
>> +    if (ret)
>> +        goto err_groups;
>> +
>> +    ret = xe_pmu_register_cpuhp_state(pmu);
>> +    if (ret)
>> +        goto err_unreg;
>> +
>> +    ret = drmm_add_action_or_reset(&xe->drm, xe_pmu_unregister, pmu);
>> +    XE_WARN_ON(ret);
>> +
>> +    return;
>> +
>> +err_unreg:
>> +    perf_pmu_unregister(&pmu->base);
>> +err_groups:
>> +    kfree(pmu->base.attr_groups);
>> +err_attr:
>> +    pmu->base.event_init = NULL;
>> +    free_event_attributes(pmu);
>> +err_name:
>> +    kfree(pmu->name);
>> +err:
>> +    drm_notice(&xe->drm, "Failed to register PMU!\n");
>> +}
>> diff --git a/drivers/gpu/drm/xe/xe_pmu.h b/drivers/gpu/drm/xe/xe_pmu.h
>> new file mode 100644
>> index 000000000000..d3f47f4ab343
>> --- /dev/null
>> +++ b/drivers/gpu/drm/xe/xe_pmu.h
>> @@ -0,0 +1,25 @@
>> +/* SPDX-License-Identifier: MIT */
>> +/*
>> + * Copyright © 2023 Intel Corporation
>> + */
>> +
>> +#ifndef _XE_PMU_H_
>> +#define _XE_PMU_H_
>> +
>> +#include "xe_gt_types.h"
>> +#include "xe_pmu_types.h"
>> +
>> +#ifdef CONFIG_PERF_EVENTS
>> +int xe_pmu_init(void);
>> +void xe_pmu_exit(void);
>> +void xe_pmu_register(struct xe_pmu *pmu);
>> +void engine_group_busyness_store(struct xe_gt *gt);
>> +#else
>> +static inline int xe_pmu_init(void) { return 0; }
>> +static inline void xe_pmu_exit(void) {}
>> +static inline void xe_pmu_register(struct xe_pmu *pmu) {}
>> +static inline void engine_group_busyness_store(struct xe_gt *gt) {}
>> +#endif
>> +
>> +#endif
>> +
>> diff --git a/drivers/gpu/drm/xe/xe_pmu_types.h
>> b/drivers/gpu/drm/xe/xe_pmu_types.h
>> new file mode 100644
>> index 000000000000..e87edd4d6a87
>> --- /dev/null
>> +++ b/drivers/gpu/drm/xe/xe_pmu_types.h
>> @@ -0,0 +1,80 @@
>> +/* SPDX-License-Identifier: MIT */
>> +/*
>> + * Copyright © 2023 Intel Corporation
>> + */
>> +
>> +#ifndef _XE_PMU_TYPES_H_
>> +#define _XE_PMU_TYPES_H_
>> +
>> +#include <linux/perf_event.h>
>> +#include <linux/spinlock_types.h>
>> +#include <uapi/drm/xe_drm.h>
>> +
>> +enum {
>> +    __XE_SAMPLE_RENDER_GROUP_BUSY,
>> +    __XE_SAMPLE_COPY_GROUP_BUSY,
>> +    __XE_SAMPLE_MEDIA_GROUP_BUSY,
>> +    __XE_SAMPLE_ANY_ENGINE_GROUP_BUSY,
>> +    __XE_NUM_PMU_SAMPLERS
>> +};
>> +
>> +struct xe_pmu_sample {
>> +    u64 cur;
>> +};
>> +
>> +#define XE_MAX_GT_PER_TILE 2
>> +
>> +struct xe_pmu {
>> +    /**
>> +     * @cpuhp: Struct used for CPU hotplug handling.
>> +     */
>> +    struct {
>> +        struct hlist_node node;
>> +        unsigned int cpu;
>> +    } cpuhp;
>> +    /**
>> +     * @base: PMU base.
>> +     */
>> +    struct pmu base;
> 
> Are we not adding this timer for xe (it exists in i915)?
> 
> /**
>          * @timer: Timer for internal i915 PMU sampling.
>          */
>         struct hrtimer timer;
In my current implementation I have limited this to exposing only the OA
engine counters, which don't need an internal sampling timer since we
read the OA HW registers directly.

Thanks,
Aravind.
> 
> Thanks,
> 
> Vinay.
> 
>> +    /**
>> +     * @closed: xe is unregistering.
>> +     */
>> +    bool closed;
>> +    /**
>> +     * @name: Name as registered with perf core.
>> +     */
>> +    const char *name;
>> +    /**
>> +     * @lock: Lock protecting enable mask and ref count handling.
>> +     */
>> +    spinlock_t lock;
>> +    /**
>> +     * @sample: Current and previous (raw) counters.
>> +     *
>> +     * These counters are updated when the device is awake.
>> +     *
>> +     */
>> +    struct xe_pmu_sample sample[XE_MAX_GT_PER_TILE *
>> __XE_NUM_PMU_SAMPLERS];
>> +    /**
>> +     * @irq_count: Number of interrupts
>> +     *
>> +     * Intentionally unsigned long to avoid atomics or heuristics on
>> 32bit.
>> +     * 4e9 interrupts are a lot and postprocessing can really deal
>> with an
>> +     * occasional wraparound easily. It's 32bit after all.
>> +     */
>> +    unsigned long irq_count;
>> +    /**
>> +     * @events_attr_group: Device events attribute group.
>> +     */
>> +    struct attribute_group events_attr_group;
>> +    /**
>> +     * @xe_attr: Memory block holding device attributes.
>> +     */
>> +    void *xe_attr;
>> +    /**
>> +     * @pmu_attr: Memory block holding device attributes.
>> +     */
>> +    void *pmu_attr;
>> +};
>> +
>> +#endif
>> diff --git a/include/uapi/drm/xe_drm.h b/include/uapi/drm/xe_drm.h
>> index 965cd9527ff1..ed097056f944 100644
>> --- a/include/uapi/drm/xe_drm.h
>> +++ b/include/uapi/drm/xe_drm.h
>> @@ -990,6 +990,22 @@ struct drm_xe_vm_madvise {
>>       __u64 reserved[2];
>>   };
>>   +/* PMU event config IDs */
>> +
>> +/*
>> + * Top 4 bits of every counter are GT id.
>> + */
>> +#define __XE_PMU_GT_SHIFT (60)
>> +
>> +#define ___XE_PMU_OTHER(gt, x) \
>> +    (((__u64)(x)) | ((__u64)(gt) << __XE_PMU_GT_SHIFT))
>> +
>> +#define XE_PMU_INTERRUPTS(gt)            ___XE_PMU_OTHER(gt, 0)
>> +#define XE_PMU_RENDER_GROUP_BUSY(gt)        ___XE_PMU_OTHER(gt, 1)
>> +#define XE_PMU_COPY_GROUP_BUSY(gt)        ___XE_PMU_OTHER(gt, 2)
>> +#define XE_PMU_MEDIA_GROUP_BUSY(gt)        ___XE_PMU_OTHER(gt, 3)
>> +#define XE_PMU_ANY_ENGINE_GROUP_BUSY(gt)    ___XE_PMU_OTHER(gt, 4)
>> +
>>   #if defined(__cplusplus)
>>   }
>>   #endif

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [Intel-xe] [PATCH v2 2/2] drm/xe/pmu: Enable PMU interface
  2023-07-06  2:39   ` Dixit, Ashutosh
@ 2023-07-06 13:42     ` Iddamsetty, Aravind
  2023-07-07  2:18       ` Dixit, Ashutosh
  0 siblings, 1 reply; 59+ messages in thread
From: Iddamsetty, Aravind @ 2023-07-06 13:42 UTC (permalink / raw)
  To: Dixit, Ashutosh; +Cc: Bommu Krishnaiah, intel-xe, Tvrtko Ursulin



On 06-07-2023 08:09, Dixit, Ashutosh wrote:
> On Tue, 27 Jun 2023 05:21:13 -0700, Aravind Iddamsetty wrote:
>>
> 
> Hi Aravind,
> 
Hi Ashutosh,
>> +static u64 __engine_group_busyness_read(struct xe_gt *gt, u64 config)
>> +{
>> +	u64 val = 0;
>> +
>> +	switch (config) {
>> +	case XE_PMU_RENDER_GROUP_BUSY(0):
>> +		val = xe_mmio_read32(gt, XE_OAG_RENDER_BUSY_FREE);
>> +		break;
>> +	case XE_PMU_COPY_GROUP_BUSY(0):
>> +		val = xe_mmio_read32(gt, XE_OAG_BLT_BUSY_FREE);
>> +		break;
>> +	case XE_PMU_MEDIA_GROUP_BUSY(0):
>> +		val = xe_mmio_read32(gt, XE_OAG_ANY_MEDIA_FF_BUSY_FREE);
>> +		break;
>> +	case XE_PMU_ANY_ENGINE_GROUP_BUSY(0):
>> +		val = xe_mmio_read32(gt, XE_OAG_RC0_ANY_ENGINE_BUSY_FREE);
>> +		break;
>> +	default:
>> +		drm_warn(&gt->tile->xe->drm, "unknown pmu event\n");
>> +	}
>> +
>> +	return xe_gt_clock_interval_to_ns(gt, val * 16);
>> +}
> 
> A few questions on just the above function first:
> 
> 1. OK so these registers won't be available to VFs, but any idea what
>    these counts are when VFs are active?

VFs cannot access the registers, but the counters keep incrementing
while the respective engines are busy; they can be monitored from the
PF, and the counts span all VFs.

> 
> 2. When would these 32 bit registers overflow? Let us say a group is
>    continuously busy and we are running at 1 GHz, the registers would
>    overflow in 4 seconds (max value 4G)?

Based on BSPEC:52071 they use a MHz clock; assuming the default 24 MHz,
it would take around 5726 secs to overflow (the counter ticks once every
1333.333 ns at that frequency, and 2^32 ticks * 1333.333 ns ≈ 5726 s).

> 
> 3. What is the multiplication by 16 (not factored above in 2.)? I don't see
>    that in Bspec.

These counters are incremented based on the crystal clock frequency, and
we need to convert to the CS time base, hence the 16x mul. BSPEC:52071

Thanks,
Aravind.
> 
> Thanks.
> --
> Ashutosh

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [Intel-xe] [PATCH v2 2/2] drm/xe/pmu: Enable PMU interface
  2023-07-06 13:42     ` Iddamsetty, Aravind
@ 2023-07-07  2:18       ` Dixit, Ashutosh
  2023-07-07  3:53         ` Iddamsetty, Aravind
  0 siblings, 1 reply; 59+ messages in thread
From: Dixit, Ashutosh @ 2023-07-07  2:18 UTC (permalink / raw)
  To: Iddamsetty, Aravind; +Cc: Bommu Krishnaiah, intel-xe, Tvrtko Ursulin

On Thu, 06 Jul 2023 06:42:29 -0700, Iddamsetty, Aravind wrote:
>

Hi Aravind,

> On 06-07-2023 08:09, Dixit, Ashutosh wrote:
> > On Tue, 27 Jun 2023 05:21:13 -0700, Aravind Iddamsetty wrote:
>
> >> +static u64 __engine_group_busyness_read(struct xe_gt *gt, u64 config)
> >> +{
> >> +	u64 val = 0;
> >> +
> >> +	switch (config) {
> >> +	case XE_PMU_RENDER_GROUP_BUSY(0):
> >> +		val = xe_mmio_read32(gt, XE_OAG_RENDER_BUSY_FREE);
> >> +		break;
> >> +	case XE_PMU_COPY_GROUP_BUSY(0):
> >> +		val = xe_mmio_read32(gt, XE_OAG_BLT_BUSY_FREE);
> >> +		break;
> >> +	case XE_PMU_MEDIA_GROUP_BUSY(0):
> >> +		val = xe_mmio_read32(gt, XE_OAG_ANY_MEDIA_FF_BUSY_FREE);
> >> +		break;
> >> +	case XE_PMU_ANY_ENGINE_GROUP_BUSY(0):
> >> +		val = xe_mmio_read32(gt, XE_OAG_RC0_ANY_ENGINE_BUSY_FREE);
> >> +		break;
> >> +	default:
> >> +		drm_warn(&gt->tile->xe->drm, "unknown pmu event\n");
> >> +	}
> >> +
> >> +	return xe_gt_clock_interval_to_ns(gt, val * 16);
> >> +}
> >
> > A few questions on just the above function first:
> >
> > 1. OK so these registers won't be available to VFs, but any idea what
> >    these counts are when VFs are active?
>
> VFs cannot access the registers, but the counters keep incrementing
> while the respective engines are busy; they can be monitored from the
> PF, and the counts span all VFs.

Ok, good.

>
> >
> > 2. When would these 32 bit registers overflow? Let us say a group is
> >    continuously busy and we are running at 1 GHz, the registers would
> >    overflow in 4 seconds (max value 4G)?
>
> Based on BSPEC:52071 they use a MHz clock; assuming the default 24 MHz,
> it would take around 5726 secs to overflow.

OK, overflow should not be an issue then. Though I have seen 19.2 and 38.4
MHz in OA. Also, if these are OAG registers, OA timestamp freq can be
different from CS timestamp freq, so not sure if that needs to be
handled. See i915_perf_oa_timestamp_frequency() in i915.

>
> >
> > 3. What is the multiplication by 16 (not factored above in 2.)? I don't see
> >    that in Bspec.
>
> These counters are incremented based on the crystal clock frequency, and
> we need to convert to the CS time base, hence the 16x mul. BSPEC:52071

Hmm still don't see the 16x mul in BSPEC:52071. Anyway.

Also, could you please explain where the requirement to expose these OAG
group busy/free registers via the PMU is coming from? Since these are OA
registers presumably they can be collected using the OA subsystem.

The i915 PMU I believe deduces busyness by sampling the RING_CTL register
using a timer. So these registers look better since you can get these
busyness values directly. On the other hand you can only get busyness for
an engine group and things like compute seem to be missing?

Also, would you know about plans to expose other kinds of busyness-es? I
think we may be exposing per-VF and also per-client busyness via PMU. Not
sure what else GuC can expose. Knowing all this we can better understand
how these particular busyness values will be used.

Thanks.
--
Ashutosh

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [Intel-xe] [PATCH v2 2/2] drm/xe/pmu: Enable PMU interface
  2023-07-07  2:18       ` Dixit, Ashutosh
@ 2023-07-07  3:53         ` Iddamsetty, Aravind
  2023-07-07  6:08           ` Dixit, Ashutosh
                             ` (2 more replies)
  0 siblings, 3 replies; 59+ messages in thread
From: Iddamsetty, Aravind @ 2023-07-07  3:53 UTC (permalink / raw)
  To: Dixit, Ashutosh; +Cc: Bommu Krishnaiah, intel-xe, Tvrtko Ursulin



On 07-07-2023 07:48, Dixit, Ashutosh wrote:
> On Thu, 06 Jul 2023 06:42:29 -0700, Iddamsetty, Aravind wrote:
>>
> 
> Hi Aravind,

Hi Ashutosh,
> 
>> On 06-07-2023 08:09, Dixit, Ashutosh wrote:
>>> On Tue, 27 Jun 2023 05:21:13 -0700, Aravind Iddamsetty wrote:
>>
>>>> +static u64 __engine_group_busyness_read(struct xe_gt *gt, u64 config)
>>>> +{
>>>> +	u64 val = 0;
>>>> +
>>>> +	switch (config) {
>>>> +	case XE_PMU_RENDER_GROUP_BUSY(0):
>>>> +		val = xe_mmio_read32(gt, XE_OAG_RENDER_BUSY_FREE);
>>>> +		break;
>>>> +	case XE_PMU_COPY_GROUP_BUSY(0):
>>>> +		val = xe_mmio_read32(gt, XE_OAG_BLT_BUSY_FREE);
>>>> +		break;
>>>> +	case XE_PMU_MEDIA_GROUP_BUSY(0):
>>>> +		val = xe_mmio_read32(gt, XE_OAG_ANY_MEDIA_FF_BUSY_FREE);
>>>> +		break;
>>>> +	case XE_PMU_ANY_ENGINE_GROUP_BUSY(0):
>>>> +		val = xe_mmio_read32(gt, XE_OAG_RC0_ANY_ENGINE_BUSY_FREE);
>>>> +		break;
>>>> +	default:
>>>> +		drm_warn(&gt->tile->xe->drm, "unknown pmu event\n");
>>>> +	}
>>>> +
>>>> +	return xe_gt_clock_interval_to_ns(gt, val * 16);
>>>> +}
>>>
>>> A few questions on just the above function first:
>>>
>>> 1. OK so these registers won't be available to VFs, but any idea what
>>>    these counts are when VFs are active?
>>
>> VFs cannot access the registers, but the counters keep incrementing
>> while the respective engines are busy; they can be monitored from the
>> PF, and the counts span all VFs.
> 
> Ok, good.
> 
>>
>>>
>>> 2. When would these 32 bit registers overflow? Let us say a group is
>>>    continuously busy and we are running at 1 GHz, the registers would
>>>    overflow in 4 seconds (max value 4G)?
>>
>> Based on BSPEC:52071 they use a MHz clock; assuming the default 24 MHz,
>> it would take around 5726 secs to overflow.
> 
> OK, overflow should not be an issue then. Though I have seen 19.2 and 38.4
> MHz in OA. Also, if these are OAG registers, OA timestamp freq can be
> different from CS timestamp freq, so not sure if that needs to be
> handled. See i915_perf_oa_timestamp_frequency() in i915.

So that is handled by the calculation below:
> 
>>
>>>
>>> 3. What is the multiplication by 16 (not factored above in 2.)? I don't see
>>>    that in Bspec.
>>
>> These counters are incremented based on the crystal clock frequency, and
>> we need to convert to the CS time base, hence the 16x mul. BSPEC:52071
> 
> Hmm still don't see the 16x mul in BSPEC:52071. Anyway.

Let's say the frequency is 24 MHz: the counter then increments every
1333.333 ns (its granularity), while the corresponding CS timestamp base
for this frequency is 83.333 ns. 1333.333 / 83.333 = 16, and this holds
for the rest of the frequency selections as well; hence we multiply the
counter by 16.
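For illustration only, here is a minimal sketch of that conversion
(xe_gt_clock_interval_to_ns is from patch 1/2; busyness_ticks_to_ns is a
hypothetical helper name, and the constants in the comment are the 24 MHz
example above):

	/*
	 * OAG busyness counters tick every 1333.333 ns at the default
	 * 24 MHz crystal, while the CS timestamp base at that frequency
	 * is 83.333 ns, so scale the raw count by 16 to express it in
	 * CS timestamp units before converting to nanoseconds.
	 */
	static u64 busyness_ticks_to_ns(struct xe_gt *gt, u32 ticks)
	{
		return xe_gt_clock_interval_to_ns(gt, (u64)ticks * 16);
	}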
> 
> Also, could you please explain where the requirement to expose these OAG
> group busy/free registers via the PMU is coming from? Since these are OA
> registers presumably they can be collected using the OA subsystem.

L0 sysman needs this
https://spec.oneapi.io/level-zero/latest/sysman/api.html#zes-engine-properties-t
and xpumanager uses this
https://github.com/intel/xpumanager/blob/master/core/src/device/gpu/gpu_device.cpp
> 
> The i915 PMU I believe deduces busyness by sampling the RING_CTL register
> using a timer. So these registers look better since you can get these
> busyness values directly. On the other hand you can only get busyness for
> an engine group and things like compute seem to be missing?

The per-engine busyness is a different thing; we still need that, and it
has a different implementation with GuC enabled. I believe Umesh is
looking into that.

The compute group will still be accounted for in XE_OAG_RENDER_BUSY_FREE
and also under XE_OAG_RC0_ANY_ENGINE_BUSY_FREE.
> 
> Also, would you know about plans to expose other kinds of busyness-es? I
> think we may be exposing per-VF and also per-client busyness via PMU. Not
> sure what else GuC can expose. Knowing all this we can better understand
> how these particular busyness values will be used.

Yes, that should be coming next, probably from Umesh, but per-client
busyness is exposed through fdinfo.
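(For reference, the fdinfo route exposes plain-text per-client keys per
the drm-usage-stats conventions; a hypothetical xe example, since the
exact xe keys are not settled yet:

	$ cat /proc/<pid>/fdinfo/<drm-fd>
	drm-driver:	xe
	drm-engine-render:	1234567 ns
)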

Thanks,
Aravind.
> 
> Thanks.
> --
> Ashutosh

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [Intel-xe] [PATCH v2 2/2] drm/xe/pmu: Enable PMU interface
  2023-07-07  3:53         ` Iddamsetty, Aravind
@ 2023-07-07  6:08           ` Dixit, Ashutosh
  2023-07-07 10:42             ` Iddamsetty, Aravind
  2023-07-09  0:32           ` Dixit, Ashutosh
  2023-07-18  5:07           ` Dixit, Ashutosh
  2 siblings, 1 reply; 59+ messages in thread
From: Dixit, Ashutosh @ 2023-07-07  6:08 UTC (permalink / raw)
  To: Iddamsetty, Aravind; +Cc: Bommu Krishnaiah, intel-xe, Tvrtko Ursulin

On Thu, 06 Jul 2023 20:53:47 -0700, Iddamsetty, Aravind wrote:
>

Hi Aravind,

I will look at the timing stuff later but one further question about the
requirement:

> > Also, could you please explain where the requirement to expose these OAG
> > group busy/free registers via the PMU is coming from? Since these are OA
> > registers presumably they can be collected using the OA subsystem.
>
> L0 sysman needs this
> https://spec.oneapi.io/level-zero/latest/sysman/api.html#zes-engine-properties-t
> and xpumanager uses this
> https://github.com/intel/xpumanager/blob/master/core/src/device/gpu/gpu_device.cpp

So fine, these are UMD requirements, but why do these quantities
(everything in this patch) have to be exposed via PMU? I could just
create sysfs or an ioctl to provide these to userland, right?

I had this same question about the i915 PMU, which was never answered.
The i915 PMU IMO does truly strange things, like sampling freqs every
5 ms and providing software averages which I thought userspace could
easily do.

I don't think it's the timestamps, maybe there is some convention related
to the cpu pmu (which I am not familiar with).

Let's see, maybe Tvrtko can also answer why these things were exposed via
i915 PMU.

Thanks.
--
Ashutosh


> >
> > The i915 PMU I believe deduces busyness by sampling the RING_CTL register
> > using a timer. So these registers look better since you can get these
> > busyness values directly. On the other hand you can only get busyness for
> > an engine group and things like compute seem to be missing?
>
> The per-engine busyness is a different thing; we still need that, and it
> has a different implementation with GuC enabled. I believe Umesh is
> looking into that.
>
> The compute group will still be accounted for in XE_OAG_RENDER_BUSY_FREE
> and also under XE_OAG_RC0_ANY_ENGINE_BUSY_FREE.
> >
> > Also, would you know about plans to expose other kinds of busyness-es? I
> > think we may be exposing per-VF and also per-client busyness via PMU. Not
> > sure what else GuC can expose. Knowing all this we can better understand
> > how these particular busyness values will be used.
>
> Yes, that should be coming next, probably from Umesh, but per-client
> busyness is exposed through fdinfo.

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [Intel-xe] [PATCH v2 2/2] drm/xe/pmu: Enable PMU interface
  2023-07-07  6:08           ` Dixit, Ashutosh
@ 2023-07-07 10:42             ` Iddamsetty, Aravind
  2023-07-07 21:25               ` Dixit, Ashutosh
  0 siblings, 1 reply; 59+ messages in thread
From: Iddamsetty, Aravind @ 2023-07-07 10:42 UTC (permalink / raw)
  To: Dixit, Ashutosh; +Cc: Bommu Krishnaiah, intel-xe, Tvrtko Ursulin



On 07-07-2023 11:38, Dixit, Ashutosh wrote:
> On Thu, 06 Jul 2023 20:53:47 -0700, Iddamsetty, Aravind wrote:
>>
> 
> Hi Aravind,
> 
Hi Ashutosh,

> I will look at the timing stuff later but one further question about the
> requirement:
> 
>>> Also, could you please explain where the requirement to expose these OAG
>>> group busy/free registers via the PMU is coming from? Since these are OA
>>> registers presumably they can be collected using the OA subsystem.
>>
>> L0 sysman needs this
>> https://spec.oneapi.io/level-zero/latest/sysman/api.html#zes-engine-properties-t
>> and xpumanager uses this
>> https://github.com/intel/xpumanager/blob/master/core/src/device/gpu/gpu_device.cpp
> 
> So fine, these are UMD requirements, but why do these quantities
> (everything in this patch) have to be exposed via PMU? I could just
> create sysfs or an ioctl to provide these to userland, right?

PMU is an enhanced interface for presenting metrics: it provides
lower-latency reads than sysfs, one can read multiple events in a single
shot, and it returns timestamps as well, which sysfs cannot provide and
which is one of the UMD requirements. Also, UMDs/observability tools do
not want to keep any open handles to get this info, so ioctl is ruled
out.

The other motivation to use PMU in xe is that existing tools like
intel_gpu_top will work with just a minor change.
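For illustration, a minimal userspace sketch of such a single-shot group
read (assumptions: the dynamic PMU type id has to be read from
/sys/bus/event_source/devices/xe_<bdf>/type, the config values 1 and 2
are the gt0 render/copy IDs defined by this patch, and error handling is
omitted):

	#include <linux/perf_event.h>
	#include <stdint.h>
	#include <stdio.h>
	#include <string.h>
	#include <sys/syscall.h>
	#include <unistd.h>

	static int open_counter(uint32_t pmu_type, uint64_t config, int group_fd)
	{
		struct perf_event_attr attr;

		memset(&attr, 0, sizeof(attr));
		attr.size = sizeof(attr);
		attr.type = pmu_type;	/* dynamic type id from sysfs */
		attr.config = config;	/* e.g. XE_PMU_RENDER_GROUP_BUSY(0) */
		attr.read_format = PERF_FORMAT_GROUP |
				   PERF_FORMAT_TOTAL_TIME_ENABLED;
		/* uncore-style PMU: pid == -1 and a valid cpu */
		return syscall(__NR_perf_event_open, &attr, -1, 0, group_fd, 0);
	}

	int main(void)
	{
		uint32_t type = 0;	/* placeholder: read from sysfs */
		struct { uint64_t nr, time_enabled, val[2]; } buf;
		int fd;

		fd = open_counter(type, 1, -1);	/* render-group-busy-gt0 */
		open_counter(type, 2, fd);	/* copy-group-busy-gt0 */
		/* one read() returns both counters plus a time base */
		read(fd, &buf, sizeof(buf));
		printf("t=%llu render=%llu copy=%llu\n",
		       (unsigned long long)buf.time_enabled,
		       (unsigned long long)buf.val[0],
		       (unsigned long long)buf.val[1]);
		return 0;
	}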
> 
> I had this same question about i915 PMU which was never answered. i915 PMU
> IMO does truly strange things like sample freq's every 5 ms and provides
> software averages which I thought userspace can easily do.

That is a different thing, nothing to do with the PMU interface.

Thanks,
Aravind.
> 
> I don't think it's the timestamps, maybe there is some convention related
> to the cpu pmu (which I am not familiar with).
> 
> Let's see, maybe Tvrtko can also answer why these things were exposed via
> i915 PMU.
> 
> Thanks.
> --
> Ashutosh
> 
> 
>>>
>>> The i915 PMU I believe deduces busyness by sampling the RING_CTL register
>>> using a timer. So these registers look better since you can get these
>>> busyness values directly. On the other hand you can only get busyness for
>>> an engine group and things like compute seem to be missing?
>>
>> The per-engine busyness is a different thing; we still need that, and it
>> has a different implementation with GuC enabled. I believe Umesh is
>> looking into that.
>>
>> The compute group will still be accounted for in XE_OAG_RENDER_BUSY_FREE
>> and also under XE_OAG_RC0_ANY_ENGINE_BUSY_FREE.
>>>
>>> Also, would you know about plans to expose other kinds of busyness-es? I
>>> think we may be exposing per-VF and also per-client busyness via PMU. Not
>>> sure what else GuC can expose. Knowing all this we can better understand
>>> how these particular busyness values will be used.
>>
>> Yes, that should be coming next, probably from Umesh, but per-client
>> busyness is exposed through fdinfo.

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [Intel-xe] [PATCH v2 2/2] drm/xe/pmu: Enable PMU interface
  2023-07-07 10:42             ` Iddamsetty, Aravind
@ 2023-07-07 21:25               ` Dixit, Ashutosh
  2023-07-10  6:05                 ` Iddamsetty, Aravind
  0 siblings, 1 reply; 59+ messages in thread
From: Dixit, Ashutosh @ 2023-07-07 21:25 UTC (permalink / raw)
  To: Iddamsetty, Aravind; +Cc: Bommu Krishnaiah, intel-xe, Tvrtko Ursulin

On Fri, 07 Jul 2023 03:42:36 -0700, Iddamsetty, Aravind wrote:
>

Hi Aravind,

> On 07-07-2023 11:38, Dixit, Ashutosh wrote:
> > On Thu, 06 Jul 2023 20:53:47 -0700, Iddamsetty, Aravind wrote:
> > I will look at the timing stuff later but one further question about the
> > requirement:
> >
> >>> Also, could you please explain where the requirement to expose these OAG
> >>> group busy/free registers via the PMU is coming from? Since these are OA
> >>> registers presumably they can be collected using the OA subsystem.
> >>
> >> L0 sysman needs this
> >> https://spec.oneapi.io/level-zero/latest/sysman/api.html#zes-engine-properties-t
> >> and xpumanager uses this
> >> https://github.com/intel/xpumanager/blob/master/core/src/device/gpu/gpu_device.cpp
> >
> > So fine, these are UMD requirements, but why do these quantities
> > (everything in this patch) have to be exposed via PMU? I could just
> > create sysfs or an ioctl to provide these to userland, right?
>
> PMU is an enhanced interface for presenting metrics: it provides
> lower-latency reads than sysfs

Why lower latency compared to sysfs? In both cases we have user-to-kernel
transitions and then register reads, etc.

> one can read multiple events in a single shot

Yes, the PMU can do this and sysfs can't, though ioctls can do it too.

> and it returns timestamps as well, which sysfs cannot provide and
> which is one of the UMD requirements.

Ioctls can do this if we implement (counter, timestamp) pairs, but I agree
this may look strange, so PMU does seem to have an advantage here.

But are these timestamps needed? The spec talks about different timestamp
bases, but in this case we have already converted to ns, and I am wondering
if the UMD can use its own timestamps (maybe the average of the ioctl call
and return times) if it needs timestamps.

> Also, UMDs/observability tools do not want to keep any open handles to
> get this info, so ioctl is ruled out.

Why? I don't follow this either. And the UMD has a perf PMU fd open. See
e.g. igt@perf_pmu@module-unload, which tests that module unload should fail
while the perf PMU fd is open (it takes a ref count on the module).

> The other motivation to use PMU in xe is that existing tools like
> intel_gpu_top will work with just a minor change.

Not too concerned about userspace tools. They can be changed to use a
different interface.

So I am still not convinced xe needs to expose a PMU interface with this
sort of "software events/counters". So my question is: why can't we just
have an ioctl to expose these things, why PMU?

Incidentally, if you look at amdgpu_pmu.c, they seem to be exposing
hardware sorts of events through the PMU, not our kind of software stuff.

Another interesting thing is that if we have ftrace tracepoints, they seem
to be automatically exposed via the PMU
(https://perf.wiki.kernel.org/index.php/Tutorial), e.g.:

  i915:i915_request_add                              [Tracepoint event]
  i915:i915_request_queue                            [Tracepoint event]
  i915:i915_request_retire                           [Tracepoint event]
  i915:i915_request_wait_begin                       [Tracepoint event]
  i915:i915_request_wait_end                         [Tracepoint event]

So I am wondering if this might be an option?
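For instance, counting one of these tracepoints system-wide for a second
would be as simple as (illustrative invocation only):

	perf stat -e i915:i915_request_add -a sleep 1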

So anyway, let's try to understand the need for the PMU interface a bit
more before deciding on this. Once we introduce the interface, (a) people
will willy-nilly start exposing random stuff through that interface, (b)
the same stuff will get exposed via multiple interfaces (e.g. frequency and
rc6 residency in i915), etc. I am speaking on the basis of what I saw in
i915.

Let's see if Tvrtko responds; otherwise I will try to get him on IRC or
something. It would be good to have some input from one of the architects
about this too.

Thanks.
--
Ashutosh

> > I had this same question about the i915 PMU, which was never answered.
> > The i915 PMU IMO does truly strange things, like sampling freqs every
> > 5 ms and providing software averages which I thought userspace could
> > easily do.
>
> That is a different thing, nothing to do with the PMU interface.
>
> Thanks,
> Aravind.
> >
> > I don't think it's the timestamps, maybe there is some convention related
> > to the cpu pmu (which I am not familiar with).
> >
> > Let's see, maybe Tvrtko can also answer why these things were exposed via
> > i915 PMU.
> >
> > Thanks.
> > --
> > Ashutosh
> >
> >
> >>>
> >>> The i915 PMU I believe deduces busyness by sampling the RING_CTL register
> >>> using a timer. So these registers look better since you can get these
> >>> busyness values directly. On the other hand you can only get busyness for
> >>> an engine group and things like compute seem to be missing?
> >>
> >> The per engine busyness is a different thing we still need that and it
> >> has different implementation with GuC enabled, I believe Umesh is
> >> looking into that.
> >>
> >> compute group will still be accounted in XE_OAG_RENDER_BUSY_FREE and
> >> also under XE_OAG_RC0_ANY_ENGINE_BUSY_FREE.
> >>>
> >>> Also, would you know about plans to expose other kinds of busyness-es? I
> >>> think we may be exposing per-VF and also per-client busyness via PMU. Not
> >>> sure what else GuC can expose. Knowing all this we can better understand
> >>> how these particular busyness values will be used.
> >>
> >> ya, that shall be coming next probably from Umesh but per client
> >> busyness is through fdinfo.

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [Intel-xe] [PATCH v2 2/2] drm/xe/pmu: Enable PMU interface
  2023-07-07  3:53         ` Iddamsetty, Aravind
  2023-07-07  6:08           ` Dixit, Ashutosh
@ 2023-07-09  0:32           ` Dixit, Ashutosh
  2023-07-10  4:13             ` Iddamsetty, Aravind
  2023-07-18  5:07           ` Dixit, Ashutosh
  2 siblings, 1 reply; 59+ messages in thread
From: Dixit, Ashutosh @ 2023-07-09  0:32 UTC (permalink / raw)
  To: Iddamsetty, Aravind; +Cc: Bommu Krishnaiah, intel-xe, Tvrtko Ursulin

On Thu, 06 Jul 2023 20:53:47 -0700, Iddamsetty, Aravind wrote:
> On 07-07-2023 07:48, Dixit, Ashutosh wrote:
> > On Thu, 06 Jul 2023 06:42:29 -0700, Iddamsetty, Aravind wrote:
> > Also, could you please explain where the requirement to expose these OAG
> > group busy/free registers via the PMU is coming from? Since these are OA
> > registers presumably they can be collected using the OA subsystem.
>
> L0 sysman needs this
> https://spec.oneapi.io/level-zero/latest/sysman/api.html#zes-engine-properties-t
> and xpumanager uses this
> https://github.com/intel/xpumanager/blob/master/core/src/device/gpu/gpu_device.cpp

Also there is the above mentioned open regarding this: "Since these are OA
registers presumably they can be collected using the OA subsystem". L0 now
seems to be supporting OA and we are going to provide an OA subsystem for
xe. This probably also needs arch input.

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [Intel-xe] [PATCH v2 2/2] drm/xe/pmu: Enable PMU interface
  2023-07-09  0:32           ` Dixit, Ashutosh
@ 2023-07-10  4:13             ` Iddamsetty, Aravind
  2023-07-10  5:57               ` Dixit, Ashutosh
  0 siblings, 1 reply; 59+ messages in thread
From: Iddamsetty, Aravind @ 2023-07-10  4:13 UTC (permalink / raw)
  To: Dixit, Ashutosh; +Cc: Bommu Krishnaiah, intel-xe, Tvrtko Ursulin



On 09-07-2023 06:02, Dixit, Ashutosh wrote:
> On Thu, 06 Jul 2023 20:53:47 -0700, Iddamsetty, Aravind wrote:
>> On 07-07-2023 07:48, Dixit, Ashutosh wrote:
>>> On Thu, 06 Jul 2023 06:42:29 -0700, Iddamsetty, Aravind wrote:
>>> Also, could you please explain where the requirement to expose these OAG
>>> group busy/free registers via the PMU is coming from? Since these are OA
>>> registers presumably they can be collected using the OA subsystem.
>>
>> L0 sysman needs this
>> https://spec.oneapi.io/level-zero/latest/sysman/api.html#zes-engine-properties-t
>> and xpumanager uses this
>> https://github.com/intel/xpumanager/blob/master/core/src/device/gpu/gpu_device.cpp
> 
> Also there is the above mentioned open regarding this: "Since these are OA
> registers presumably they can be collected using the OA subsystem". L0 now
> seems to be supporting OA and we are going to provide an OA subsystem for
> xe. This probably also needs arch input.

While I try to check on this, by any chance do you know why we have to
go with a custom OA subsystem implementation and cannot use the PMU for
reporting OA?

Thanks,
Aravind.

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [Intel-xe] [PATCH v2 2/2] drm/xe/pmu: Enable PMU interface
  2023-07-10  4:13             ` Iddamsetty, Aravind
@ 2023-07-10  5:57               ` Dixit, Ashutosh
  0 siblings, 0 replies; 59+ messages in thread
From: Dixit, Ashutosh @ 2023-07-10  5:57 UTC (permalink / raw)
  To: Iddamsetty, Aravind; +Cc: Bommu Krishnaiah, intel-xe, Tvrtko Ursulin

On Sun, 09 Jul 2023 21:13:59 -0700, Iddamsetty, Aravind wrote:
>
> On 09-07-2023 06:02, Dixit, Ashutosh wrote:
> > On Thu, 06 Jul 2023 20:53:47 -0700, Iddamsetty, Aravind wrote:
> >> On 07-07-2023 07:48, Dixit, Ashutosh wrote:
> >>> On Thu, 06 Jul 2023 06:42:29 -0700, Iddamsetty, Aravind wrote:
> >>> Also, could you please explain where the requirement to expose these OAG
> >>> group busy/free registers via the PMU is coming from? Since these are OA
> >>> registers presumably they can be collected using the OA subsystem.
> >>
> >> L0 sysman needs this
> >> https://spec.oneapi.io/level-zero/latest/sysman/api.html#zes-engine-properties-t
> >> and xpumanager uses this
> >> https://github.com/intel/xpumanager/blob/master/core/src/device/gpu/gpu_device.cpp
> >
> > Also there is the above mentioned open regarding this: "Since these are OA
> > registers presumably they can be collected using the OA subsystem". L0 now
> > seems to be supporting OA and we are going to provide an OA subsystem for
> > xe. This probably also needs arch input.
>
> While I try to check on this, by any chance do you know why we have to
> go with a custom OA subsystem implementation and cannot use the PMU for
> reporting OA?

Because we already had a big OA subsystem in i915 (i915_perf.c) which we
are bringing over to xe. And OA has many more metrics than just the few we
are exposing (or had exposed via the PMU). You can see these metrics in the
build/lib directory of the IGT build (after you compile IGT). Grep for
busyness. Some of the metrics which come from GuC will not be available
from OA (since OA data is HW-generated), so we will have to expose those
separately.

Separately, I think it would be good to understand how other drm drivers
expose such performance data (such as what we were doing in i915 with OA
and PMU) for userland to digest.

Let's see. I am not against PMU per se, just asking questions so we can get
some opinion about these things before we commit to supporting the PMU
interface.

Thanks.
--
Ashutosh

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [Intel-xe] [PATCH v2 2/2] drm/xe/pmu: Enable PMU interface
  2023-07-07 21:25               ` Dixit, Ashutosh
@ 2023-07-10  6:05                 ` Iddamsetty, Aravind
  2023-07-10  8:12                   ` Ursulin, Tvrtko
  0 siblings, 1 reply; 59+ messages in thread
From: Iddamsetty, Aravind @ 2023-07-10  6:05 UTC (permalink / raw)
  To: Dixit, Ashutosh; +Cc: Bommu Krishnaiah, intel-xe, Tvrtko Ursulin



On 08-07-2023 02:55, Dixit, Ashutosh wrote:
> On Fri, 07 Jul 2023 03:42:36 -0700, Iddamsetty, Aravind wrote:
>>
> 
> Hi Aravind,
> 
>> On 07-07-2023 11:38, Dixit, Ashutosh wrote:
>>> On Thu, 06 Jul 2023 20:53:47 -0700, Iddamsetty, Aravind wrote:
>>> I will look at the timing stuff later but one further question about the
>>> requirement:
>>>
>>>>> Also, could you please explain where the requirement to expose these OAG
>>>>> group busy/free registers via the PMU is coming from? Since these are OA
>>>>> registers presumably they can be collected using the OA subsystem.
>>>>
>>>> L0 sysman needs this
>>>> https://spec.oneapi.io/level-zero/latest/sysman/api.html#zes-engine-properties-t
>>>> and xpumanager uses this
>>>> https://github.com/intel/xpumanager/blob/master/core/src/device/gpu/gpu_device.cpp
>>>
>>> So fine these are UMD requirements, but why do these quantities (everything
>>> in this patch) have to exposed via PMU? I could just create sysfs or an
>>> ioctl to provide these to userland, right?
>>
>> PMU is enhanced interface to present the metrics, it provides low
>> latency reads compared to sysfs
> 
> Why lower latency compared to sysfs? In both cases we have user to kernel
> transitions and then register reads etc.

The sysfs read will have to go through the filesystem, which adds latency,
but here I think the most important aspect is the requirement of read
timestamps.

> 
>> and one can read multiple events in a single shot
> 
> Yes, the PMU can do this and sysfs can't, though ioctls can do it too.
> 
>> and it will give timestamps as well which sysfs cannot provide and which
>> is one of the requirements of UMD.
> 
> Ioctls can do this if we implement (counter, timestamp) pairs, but I agree
> this may look strange, so the PMU does seem to have an advantage here.
> 
> But are these timestamps needed? The spec talks about different timestamp
> bases, but in this case we have already converted to ns, and I am wondering
> if the UMD can use its own timestamps (maybe the average of the ioctl call
> and return times) if it needs timestamps.

Here I'm talking about read timestamps, not the counter itself, and when
we already have an interface (PMU) which can give these details, why
duplicate the effort in an ioctl?
> 
>> Also UMDs/ observability tools do not want to have any open handles to
>> get these info so ioctl is dropped out.
> 
> Why? This I also don't follow. And the UMD has a perf PMU fd open. See
> igt@perf_pmu@module-unload e.g., which tests that module unload should fail
> if the perf PMU fd is open (since it takes a ref count on the module).

Here I'm referring to the drm fd; one need not open a drm fd to read via
the PMU, and typically UMDs do not want to open a drm fd as it takes a
device reference and might toggle the device state (e.g. wake the device)
when we are trying to read some stats, which is not needed.

> 
>> the other motivation to use PMU in xe is the existing tools like
>> intel_gpu_top will work with just a minor change.
> 
> Not too concerned about userspace tools. They can be changed to use a
> different interface.
> 
> So I am still not convinced xe needs to expose a PMU interface with these
> sorts of "software events/counters". So my question is why can't we just
> have an ioctl to expose these things, why PMU?

Firstly, the PMU satisfies all the requirements of the UMDs: read
timestamps and multiple-event reads. So as we already have a time-tested
interface in the kernel, why should we try to duplicate it? Secondly,
using an ioctl one has to open a drm fd, which UMDs do not want.
> 
> Incidentally if you look at amdgpu_pmu.c, they seem to be exposing some
> hardware sort of events through the PMU, not our kind of software stuff.

The counters that I'm exposing in this series are themselves hardware
counters.

> 
> Another interesting thing is that if we have ftrace statements they seem
> to be automatically exposed by the PMU
> (https://perf.wiki.kernel.org/index.php/Tutorial), e.g.:
> 
>   i915:i915_request_add                              [Tracepoint event]
>   i915:i915_request_queue                            [Tracepoint event]
>   i915:i915_request_retire                           [Tracepoint event]
>   i915:i915_request_wait_begin                       [Tracepoint event]
>   i915:i915_request_wait_end                         [Tracepoint event]
> 
> So I am wondering if this might be an option?

I'm a little confused here; how would ftrace expose any counters, as it is
mostly for profiling?

Thanks,
Aravind.
> 
> So anyway let's try to understand the need for the PMU interface a bit more
> before deciding on this. Once we introduce the interface (a) people will
> willy-nilly start exposing random stuff through that interface (b) the same
> stuff will get exposed via multiple interfaces (e.g. frequency and rc6
> residency in i915) etc. I am speaking on the basis of what I saw in i915.
> 
> Let's see if Tvrtko responds, otherwise I will try to get him on irc or
> something. It will be good to have some input from maybe one of the
> architects too about this.
> 
> Thanks.
> --
> Ashutosh
> 
>>> I had this same question about i915 PMU which was never answered. i915 PMU
>>> IMO does truly strange things like sample freq's every 5 ms and provides
>>> software averages which I thought userspace can easily do.
>>
>> that is a different thing nothing to do with PMU interface
>>
>> Thanks,
>> Aravind.
>>>
>>> I don't think it's the timestamps, maybe there is some convention related
>>> to the cpu pmu (which I am not familiar with).
>>>
>>> Let's see, maybe Tvrtko can also answer why these things were exposed via
>>> i915 PMU.
>>>
>>> Thanks.
>>> --
>>> Ashutosh
>>>
>>>
>>>>>
>>>>> The i915 PMU I believe deduces busyness by sampling the RING_CTL register
>>>>> using a timer. So these registers look better since you can get these
>>>>> busyness values directly. On the other hand you can only get busyness for
>>>>> an engine group and things like compute seem to be missing?
>>>>
>>>> The per engine busyness is a different thing we still need that and it
>>>> has different implementation with GuC enabled, I believe Umesh is
>>>> looking into that.
>>>>
>>>> compute group will still be accounted in XE_OAG_RENDER_BUSY_FREE and
>>>> also under XE_OAG_RC0_ANY_ENGINE_BUSY_FREE.
>>>>>
>>>>> Also, would you know about plans to expose other kinds of busyness-es? I
>>>>> think we may be exposing per-VF and also per-client busyness via PMU. Not
>>>>> sure what else GuC can expose. Knowing all this we can better understand
>>>>> how these particular busyness values will be used.
>>>>
>>>> ya, that shall be coming next probably from Umesh but per client
>>>> busyness is through fdinfo.

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [Intel-xe] [PATCH v2 2/2] drm/xe/pmu: Enable PMU interface
  2023-07-10  6:05                 ` Iddamsetty, Aravind
@ 2023-07-10  8:12                   ` Ursulin, Tvrtko
  2023-07-11 16:19                     ` Iddamsetty, Aravind
  2023-07-11 22:58                     ` Dixit, Ashutosh
  0 siblings, 2 replies; 59+ messages in thread
From: Ursulin, Tvrtko @ 2023-07-10  8:12 UTC (permalink / raw)
  To: Iddamsetty, Aravind, Dixit, Ashutosh; +Cc: Bommu, Krishnaiah, intel-xe


Hi,

A few random comments:

* For question about why OA is not i915_perf.c and not Perf/PMU read comment on top of i915_perf.c.

* In terms of efficiency - don't forget the fact that sysfs is one value per file and in text format - so reading multiple counters means multiple system calls (two per value at least - unless that new ioctl which opens and reads in one is used) and binary->text->binary conversion. While PMU is one read() to fetch as many counters as wanted, straight in machine-usable format.

* In terms of why not ioctl - my memory is hazy but I am pretty sure people requesting this interface at the time had a strong requirement to not have it. Could be what Aravind listed, or maybe even more to it.

* For sysfs there definitely was something about sysfs not being wanted with containers but I can't remember the details. Possibly people were saying they wouldn't want to mount sysfs inside them for some reason.

* The fact that tracepoint names are shown with perf list does not make them PMU events. 😊

* I also wouldn't discount so easily aligning with the same interface in terms of tools like intel_gpu_top. The tool has its users and it would be a non-trivial cost to refactor into wholly different backends. Most importantly someone would need to commit to that, which looking at the past record of involvement I estimate is not very likely to happen.

* Also in terms of software counters, the story is not as simple as i915 having invented them. There was an option to actually use OA registers to implement engine busyness stats but unfortunately the capability wasn't consistent across the hw generations we needed to support. Not all engines had corresponding OA counters and there was also one other problem with OA which currently escapes me. Needing to keep the device awake maybe? But anyway, that is one reason for sw counters, including sampling on ringbuffer platforms, and accurate sw stats on execlists.

Sampled frequency (more granular than 1 Hz snapshots) was also AFAIR a customer requirement and i915 can do it much more cheaply than userspace hammering on sysfs can.

Of course the user/customer requirements might have changed so I am not saying that all past decisions still apply. Just providing context.

Regards,

Tvrtko

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [Intel-xe] [PATCH v2 2/2] drm/xe/pmu: Enable PMU interface
  2023-07-10  8:12                   ` Ursulin, Tvrtko
@ 2023-07-11 16:19                     ` Iddamsetty, Aravind
  2023-07-11 23:10                       ` Dixit, Ashutosh
  2023-07-11 22:58                     ` Dixit, Ashutosh
  1 sibling, 1 reply; 59+ messages in thread
From: Iddamsetty, Aravind @ 2023-07-11 16:19 UTC (permalink / raw)
  To: Ursulin, Tvrtko, Dixit, Ashutosh; +Cc: Bommu, Krishnaiah, intel-xe

Hi Ashutosh,

Please let me know if you have any more questions around this.

Thanks,
Aravind.

On 10-07-2023 13:42, Ursulin, Tvrtko wrote:
> 
> Hi,
> 
> A few random comments:
> 
> * For question about why OA is not i915_perf.c and not Perf/PMU read comment on top of i915_perf.c.
> 
> * In terms of efficiency - don't forget the fact that sysfs is one value per file and in text format - so reading multiple counters means multiple system calls (two per value at least - unless that new ioctl which opens and reads in one is used) and binary->text->binary conversion. While PMU is one read() to fetch as many counters as wanted, straight in machine-usable format.
> 
> * In terms of why not ioctl - my memory is hazy but I am pretty sure people requesting this interface at the time had a strong requirement to not have it. Could be what Aravind listed, or maybe even more to it.
> 
> * For sysfs there definitely was something about sysfs not being wanted with containers but I can't remember the details. Possibly people were saying they wouldn't want to mount sysfs inside them for some reason.
> 
> * The fact that tracepoint names are shown with perf list does not make them PMU events. 😊
> 
> * I also wouldn't discount so easily aligning with the same interface in terms of tools like intel_gpu_top. The tool has its users and it would be a non-trivial cost to refactor into wholly different backends. Most importantly someone would need to commit to that, which looking at the past record of involvement I estimate is not very likely to happen.
> 
> * Also in terms of software counters, the story is not as simple as i915 having invented them. There was an option to actually use OA registers to implement engine busyness stats but unfortunately the capability wasn't consistent across the hw generations we needed to support. Not all engines had corresponding OA counters and there was also one other problem with OA which currently escapes me. Needing to keep the device awake maybe? But anyway, that is one reason for sw counters, including sampling on ringbuffer platforms, and accurate sw stats on execlists.
> 
> Sampled frequency (more granular than 1 Hz snapshots) was also AFAIR a customer requirement and i915 can do it much more cheaply than userspace hammering on sysfs can.
> 
> Of course the user/customer requirements might have changed so I am not saying that all past decisions still apply. Just providing context.
> 
> Regards,
> 
> Tvrtko

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [Intel-xe] [PATCH v2 2/2] drm/xe/pmu: Enable PMU interface
  2023-07-10  8:12                   ` Ursulin, Tvrtko
  2023-07-11 16:19                     ` Iddamsetty, Aravind
@ 2023-07-11 22:58                     ` Dixit, Ashutosh
  1 sibling, 0 replies; 59+ messages in thread
From: Dixit, Ashutosh @ 2023-07-11 22:58 UTC (permalink / raw)
  To: Ursulin, Tvrtko; +Cc: Bommu, Krishnaiah, intel-xe

On Mon, 10 Jul 2023 01:12:03 -0700, Ursulin, Tvrtko wrote:
>
>

Hi Tvrtko,

Thanks for providing the context. I have a couple of further questions:

* For client busyness it seems we managed to invent a drm-wide fdinfo
  based method (non-PMU). Are any such efforts underway for the kinds of
  things we are exposing via the i915 PMU, or do you see a possibility for
  this? It would appear that all drm drivers would want to expose such perf
  info, and a common method across drm is desirable.

* Would you or anyone else have an idea of what other drm drivers (say AMD)
  are exposing with Perf/PMU? And why they didn't see the need to do what
  was done in i915 (it seems they are not exposing the same sort of stuff
  which i915 is exposing)?

And a few comments below.

> A few random comments:
>
> * For question about why OA is not i915_perf.c and not Perf/PMU read
>   comment on top of i915_perf.c.

... why OA is i915_perf.c and not Perf/PMU ... The question Aravind asked
earlier.

> * In terms of efficiency - don't forget the fact that sysfs is one value
>   per file and in text format - so reading multiple counters means
>   multiple system calls (two per value at least - unless that new ioctl
>   which opens and reads in one is used) and binary->text->binary
>   conversion. While PMU is one read() to fetch as many counters as
>   wanted, straight in machine-usable format.
>
> * In terms of why not ioctl - my memory is hazy but I am pretty sure
>   people requesting this interface at the time had a strong requirement
>   to not have it. Could be what Aravind listed, or maybe even more to it.

Another advantage PMU has over ioctl is "discoverability": at run time you
can discover what is available. Maybe this can be done via ioctl extensions
or versioning such as is used in OA.

So overall, for exposing perf stats, I have also come round to thinking
that PMU seems to be a better interface than ioctl or sysfs.
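
As a rough illustration of that discoverability (assuming the standard
sysfs layout perf uses, where each advertised event is a file under the
PMU's events/ directory; the device name below is just an example):

  #include <dirent.h>
  #include <stdio.h>

  /* Sketch: list the events a PMU advertises at run time. Real tools
   * would glob /sys/bus/event_source/devices/ for the xe PMU name. */
  int main(void)
  {
          const char *path =
                  "/sys/bus/event_source/devices/xe_0000_03_00.0/events";
          DIR *d = opendir(path);
          struct dirent *e;

          if (!d)
                  return 1;
          while ((e = readdir(d)))
                  if (e->d_name[0] != '.')
                          printf("%s\n", e->d_name);
          closedir(d);
          return 0;
  }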

> * For sysfs there definitely was something about sysfs not being wanted
>   with containers but I can't remember the details. Possibly people were
>   saying they wouldn't want to mount sysfs inside them for some reason.

Heard the other day that using sysfs with containers needs some kernel
changes.

> * The fact that tracepoint names are shown with perf list does not make
>   them PMU events. 😊

Yup I tried this out and there was no real information there.

> * I also wouldn't discount so easily aligning with the same interface in
>   terms of tools like intel_gpu_top. The tool has its users and it would
>   be a non-trivial cost to refactor into wholly different backends. Most
>   importantly someone would need to commit to that, which looking at the
>   past record of involvement I estimate is not very likely to happen.

Agreed, doing it say via OA would be more complicated too. I think tools
around OA exist, but doing it in IGT is probably not going to happen.

And even if we did do something with OA, we would need a way to expose
busyness data which GuC emits (as opposed to what HW emits).

> * Also in terms of software counters, the story is not as simple as i915
>   having invented them. There was an option to actually use OA registers
>   to implement engine busyness stats but unfortunately the capability
>   wasn't consistent across the hw generations we needed to support. Not
>   all engines had corresponding OA counters and there was also one other
>   problem with OA which currently escapes me. Needing to keep the device
>   awake maybe? But anyway, that is one reason for sw counters, including
>   sampling on ringbuffer platforms, and accurate sw stats on execlists.
>
> Sampled frequency (more granular than 1 Hz snapshots) was also AFAIR a
> customer requirement and i915 can do it much more cheaply than userspace
> hammering on sysfs can.

Agreed the kernel can do it more cheaply, but maybe doing it at 20 Hz in
userspace (rather than 200 Hz in the kernel) is sufficient? And userspace
won't have to deal with i915 quirks such as freq being measured only when
the GPU is unparked.

> Of course the user/customer requirements might have changed so I am not
> saying that all past decisions still apply. Just providing context.

Yes thanks for that, much appreciated.

Ashutosh

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [Intel-xe] [PATCH v2 2/2] drm/xe/pmu: Enable PMU interface
  2023-07-11 16:19                     ` Iddamsetty, Aravind
@ 2023-07-11 23:10                       ` Dixit, Ashutosh
  2023-07-12  3:11                         ` Iddamsetty, Aravind
  0 siblings, 1 reply; 59+ messages in thread
From: Dixit, Ashutosh @ 2023-07-11 23:10 UTC (permalink / raw)
  To: Iddamsetty, Aravind; +Cc: Bommu, Krishnaiah, intel-xe, Ursulin, Tvrtko

On Tue, 11 Jul 2023 09:19:24 -0700, Iddamsetty, Aravind wrote:
>

Hi Aravind,

> Please let me know if you have any more questions around this.

I've asked. But you probably know that since this is uapi, to merge this we
would need a fully reviewed UMD (non-IGT) merge request which consumes what
we are exposing. So while we are reviewing this, maybe let's ask L0 etc. for
the merge request. I looked at xpumanager but didn't see it consuming what
we are exposing via PMU here.

Maybe we'll need an ack from someone like Joonas too.

Thanks.
--
Ashutosh

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [Intel-xe] [PATCH v2 2/2] drm/xe/pmu: Enable PMU interface
  2023-07-11 23:10                       ` Dixit, Ashutosh
@ 2023-07-12  3:11                         ` Iddamsetty, Aravind
  2023-07-12  5:24                           ` Dixit, Ashutosh
  0 siblings, 1 reply; 59+ messages in thread
From: Iddamsetty, Aravind @ 2023-07-12  3:11 UTC (permalink / raw)
  To: Dixit, Ashutosh; +Cc: Bommu, Krishnaiah, intel-xe, Ursulin, Tvrtko



On 12-07-2023 04:40, Dixit, Ashutosh wrote:
> On Tue, 11 Jul 2023 09:19:24 -0700, Iddamsetty, Aravind wrote:
>>
> 
> Hi Aravind,
> 
>> Please let me know if you have any more questions around this.
> 
> I've asked. But you probably know that since this is uapi, to merge this we
> would need a fully reviewed UMD (non-IGT) merge request which consumes what
> we are exposing. So while we are reviewing this, maybe let's ask L0 etc. for
> the merge request. I looked at xpumanager but didn't see it consuming what
> we are exposing via PMU here.

Hi Ashutosh,

For UAPI consumption we can use the generic Linux perf tool; won't that be
agreeable?

Thanks,
Aravind.
> 
> Maybe we'll need an ack from someone like Joonas too.
> 
> Thanks.
> --
> Ashutosh

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [Intel-xe] [PATCH v2 2/2] drm/xe/pmu: Enable PMU interface
  2023-07-12  3:11                         ` Iddamsetty, Aravind
@ 2023-07-12  5:24                           ` Dixit, Ashutosh
  0 siblings, 0 replies; 59+ messages in thread
From: Dixit, Ashutosh @ 2023-07-12  5:24 UTC (permalink / raw)
  To: Iddamsetty, Aravind; +Cc: Bommu, Krishnaiah, intel-xe, Ursulin, Tvrtko

On Tue, 11 Jul 2023 20:11:12 -0700, Iddamsetty, Aravind wrote:
>
> On 12-07-2023 04:40, Dixit, Ashutosh wrote:
> > On Tue, 11 Jul 2023 09:19:24 -0700, Iddamsetty, Aravind wrote:
> >>
> >
> >> Please let me know if you have any more questions around this.
> >
> > I've asked. But you probably know that since this is uapi, to merge this we
> > would need a fully reviewed UMD (non-IGT) merge request which consumes what
> > we are exposing. So while we are reviewing this, maybe let's ask L0 etc. for
> > the merge request. I looked at xpumanager but didn't see it consuming what
> > we are exposing via PMU here.
> Hi Ashutosh,
>
> For UAPI consumption we can use the generic Linux perf tool; won't that
> be agreeable?

We can check with Joonas or Matt Roper, but AFAIU, because the generic
Linux perf tool knows nothing specific about what we are exposing, it will
not be enough; we will need an L0 or another "approved" UMD merge request
(I believe L0 is approved).

> >
> > Maybe we'll need an ack from someone like Joonas too.
> >
> > Thanks.
> > --
> > Ashutosh

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [Intel-xe] [PATCH v2 2/2] drm/xe/pmu: Enable PMU interface
  2023-07-07  3:53         ` Iddamsetty, Aravind
  2023-07-07  6:08           ` Dixit, Ashutosh
  2023-07-09  0:32           ` Dixit, Ashutosh
@ 2023-07-18  5:07           ` Dixit, Ashutosh
  2023-07-19  6:59             ` Iddamsetty, Aravind
  2 siblings, 1 reply; 59+ messages in thread
From: Dixit, Ashutosh @ 2023-07-18  5:07 UTC (permalink / raw)
  To: Iddamsetty, Aravind; +Cc: Bommu Krishnaiah, intel-xe, Tvrtko Ursulin

On Thu, 06 Jul 2023 20:53:47 -0700, Iddamsetty, Aravind wrote:
>

Hi Aravind,

Back to this review.

> >> On 06-07-2023 08:09, Dixit, Ashutosh wrote:
> >>> On Tue, 27 Jun 2023 05:21:13 -0700, Aravind Iddamsetty wrote:
> >>
> >>>> +static u64 __engine_group_busyness_read(struct xe_gt *gt, u64 config)
> >>>> +{
> >>>> +	u64 val = 0;
> >>>> +
> >>>> +	switch (config) {
> >>>> +	case XE_PMU_RENDER_GROUP_BUSY(0):
> >>>> +		val = xe_mmio_read32(gt, XE_OAG_RENDER_BUSY_FREE);
> >>>> +		break;
> >>>> +	case XE_PMU_COPY_GROUP_BUSY(0):
> >>>> +		val = xe_mmio_read32(gt, XE_OAG_BLT_BUSY_FREE);
> >>>> +		break;
> >>>> +	case XE_PMU_MEDIA_GROUP_BUSY(0):
> >>>> +		val = xe_mmio_read32(gt, XE_OAG_ANY_MEDIA_FF_BUSY_FREE);
> >>>> +		break;
> >>>> +	case XE_PMU_ANY_ENGINE_GROUP_BUSY(0):
> >>>> +		val = xe_mmio_read32(gt, XE_OAG_RC0_ANY_ENGINE_BUSY_FREE);
> >>>> +		break;
> >>>> +	default:
> >>>> +		drm_warn(&gt->tile->xe->drm, "unknown pmu event\n");
> >>>> +	}
> >>>> +
> >>>> +	return xe_gt_clock_interval_to_ns(gt, val * 16);
> >>>> +}
> >>>
> >>> A few questions on just the above function first:
> >>>
> >>> 1. OK so these registers won't be available to VF's, but any idea what
> >>>    these counts are when VF's are active?
> >>
> >> VF's cannot access the registers but the counters will be incrementing
> >> if respective engines are busy and can be monitored from PF and that
> >> will be across all VFs.
> >
> > Ok, good.
> >
> >>
> >>>
> >>> 2. When would these 32 bit registers overflow? Let us say a group is
> >>>    continuously busy and we are running at 1 GHz, the registers would
> >>>    overflow in 4 seconds (max value 4G)?
> >>
> >> Based on BSPEC:52071 they use MHZ clock assuming the default 24MHz, it
> >> would take around 5726 secs to overflow.
> >
> > OK, overflow should not be an issue then. Though I have seen 19.2 and 38.4
> > MHz in OA. Also, if these are OAG registers, OA timestamp freq can be
> > different from CS timestamp freq, so not sure if that needs to be
> > handled. See i915_perf_oa_timestamp_frequency() in i915.
>
> so that is handled by below calculation

> >>
> >>>
> >>> 3. What is the multiplication by 16 (not factored above in 2.)? I don't see
> >>>    that in Bspec.
> >>
> >> These counters are incremented based on crystal clock frequency and we
> >> need to convert to CS time base hence a 16x mul. BSPEC:52071
> >
> > Hmm still don't see the 16x mul in BSPEC:52071. Anyway.
>
> lets say the frequency is 24MHz so the counter increments every
> 1333.333ns(granularity) and corresponding cs timestamp base for this
> frequency is 83.333ns, 1333.333/83.333 = 16 and this true for rest of
> the frequency selections as well. hence we multiply the counter x 16.

OK, I finally see in Bspec:71028: "Unit in GPMusec. Refer to Timestamp
Bases for details" which is a pointer to Bspec: 52071. So the 16x multiply
looks correct (and these registers are not running at OA freq otherwise
we'd have to use i915_perf_oa_timestamp_frequency()). Hopefully it's
correct. Thanks.
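
(Pulling together the numbers in this sub-thread, assuming the default
24 MHz crystal: the counter granularity is 1333.333 ns and the CS timestamp
base is 83.333 ns, so 1333.333 / 83.333 = 16 CS ticks per counter tick,
hence the val * 16. A continuously busy 32-bit counter then overflows after
2^32 * 1333.333 ns, i.e. around 5726 s, or roughly 95 minutes.)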

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [Intel-xe] [PATCH v2 2/2] drm/xe/pmu: Enable PMU interface
  2023-07-18  5:07           ` Dixit, Ashutosh
@ 2023-07-19  6:59             ` Iddamsetty, Aravind
  0 siblings, 0 replies; 59+ messages in thread
From: Iddamsetty, Aravind @ 2023-07-19  6:59 UTC (permalink / raw)
  To: Dixit, Ashutosh; +Cc: Bommu Krishnaiah, intel-xe, Tvrtko Ursulin



On 18-07-2023 10:37, Dixit, Ashutosh wrote:
> On Thu, 06 Jul 2023 20:53:47 -0700, Iddamsetty, Aravind wrote:
>>
> 
> Hi Aravind,
> 
> Back to this review.
> 
>>>> On 06-07-2023 08:09, Dixit, Ashutosh wrote:
>>>>> On Tue, 27 Jun 2023 05:21:13 -0700, Aravind Iddamsetty wrote:
>>>>
>>>>>> +static u64 __engine_group_busyness_read(struct xe_gt *gt, u64 config)
>>>>>> +{
>>>>>> +	u64 val = 0;
>>>>>> +
>>>>>> +	switch (config) {
>>>>>> +	case XE_PMU_RENDER_GROUP_BUSY(0):
>>>>>> +		val = xe_mmio_read32(gt, XE_OAG_RENDER_BUSY_FREE);
>>>>>> +		break;
>>>>>> +	case XE_PMU_COPY_GROUP_BUSY(0):
>>>>>> +		val = xe_mmio_read32(gt, XE_OAG_BLT_BUSY_FREE);
>>>>>> +		break;
>>>>>> +	case XE_PMU_MEDIA_GROUP_BUSY(0):
>>>>>> +		val = xe_mmio_read32(gt, XE_OAG_ANY_MEDIA_FF_BUSY_FREE);
>>>>>> +		break;
>>>>>> +	case XE_PMU_ANY_ENGINE_GROUP_BUSY(0):
>>>>>> +		val = xe_mmio_read32(gt, XE_OAG_RC0_ANY_ENGINE_BUSY_FREE);
>>>>>> +		break;
>>>>>> +	default:
>>>>>> +		drm_warn(&gt->tile->xe->drm, "unknown pmu event\n");
>>>>>> +	}
>>>>>> +
>>>>>> +	return xe_gt_clock_interval_to_ns(gt, val * 16);
>>>>>> +}
>>>>>
>>>>> A few questions on just the above function first:
>>>>>
>>>>> 1. OK so these registers won't be available to VF's, but any idea what
>>>>>    these counts are when VF's are active?
>>>>
>>>> VF's cannot access the registers but the counters will be incrementing
>>>> if respective engines are busy and can be monitored from PF and that
>>>> will be across all VFs.
>>>
>>> Ok, good.
>>>
>>>>
>>>>>
>>>>> 2. When would these 32 bit registers overflow? Let us say a group is
>>>>>    continuously busy and we are running at 1 GHz, the registers would
>>>>>    overflow in 4 seconds (max value 4G)?
>>>>
>>>> Based on BSPEC:52071 they use MHZ clock assuming the default 24MHz, it
>>>> would take around 5726 secs to overflow.
>>>
>>> OK, overflow should not be an issue then. Though I have seen 19.2 and 38.4
>>> MHz in OA. Also, if these are OAG registers, OA timestamp freq can be
>>> different from CS timestamp freq, so not sure if that needs to be
>>> handled. See i915_perf_oa_timestamp_frequency() in i915.
>>
>> so that is handled by below calculation
> 
>>>>
>>>>>
>>>>> 3. What is the multiplication by 16 (not factored above in 2.)? I don't see
>>>>>    that in Bspec.
>>>>
>>>> These counters are incremented based on crystal clock frequency and we
>>>> need to convert to CS time base hence a 16x mul. BSPEC:52071
>>>
>>> Hmm still don't see the 16x mul in BSPEC:52071. Anyway.
>>
>> lets say the frequency is 24MHz so the counter increments every
>> 1333.333ns(granularity) and corresponding cs timestamp base for this
>> frequency is 83.333ns, 1333.333/83.333 = 16 and this true for rest of
>> the frequency selections as well. hence we multiply the counter x 16.
> 
> OK, I finally see in Bspec:71028: "Unit in GPMusec. Refer to Timestamp
> Bases for details" which is a pointer to Bspec: 52071. So the 16x multiply
> looks correct (and these registers are not running at OA freq otherwise
> we'd have to use i915_perf_oa_timestamp_frequency()). Hopefully it's
> correct. Thanks.

Thanks,
Aravind.

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [Intel-xe] [PATCH v2 2/2] drm/xe/pmu: Enable PMU interface
  2023-06-27 12:21 ` [Intel-xe] [PATCH v2 2/2] drm/xe/pmu: Enable PMU interface Aravind Iddamsetty
                     ` (4 preceding siblings ...)
  2023-07-06  2:40   ` Belgaumkar, Vinay
@ 2023-07-21  1:02   ` Dixit, Ashutosh
  2023-07-21 11:51     ` Iddamsetty, Aravind
  2023-07-22 14:39   ` Dixit, Ashutosh
  6 siblings, 1 reply; 59+ messages in thread
From: Dixit, Ashutosh @ 2023-07-21  1:02 UTC (permalink / raw)
  To: Aravind Iddamsetty; +Cc: Bommu Krishnaiah, intel-xe, Tvrtko Ursulin

On Tue, 27 Jun 2023 05:21:13 -0700, Aravind Iddamsetty wrote:
>

Hi Aravind,

More stuff to mull over. You can ignore comments starting with "OK", those
are just notes to myself.

Also, maybe at some point we can add a basic IGT which reads these
exposed counters and verifies that we can read them and that they are
monotonically increasing?

> There are a set of engine group busyness counters provided by HW which are
> perfect fit to be exposed via PMU perf events.
>
> BSPEC: 46559, 46560, 46722, 46729

Also add these Bspec entries: 71028, 52071

>
> events can be listed using:
> perf list
>   xe_0000_03_00.0/any-engine-group-busy-gt0/         [Kernel PMU event]
>   xe_0000_03_00.0/copy-group-busy-gt0/               [Kernel PMU event]
>   xe_0000_03_00.0/interrupts/                        [Kernel PMU event]
>   xe_0000_03_00.0/media-group-busy-gt0/              [Kernel PMU event]
>   xe_0000_03_00.0/render-group-busy-gt0/             [Kernel PMU event]
>
> and can be read using:
>
> perf stat -e "xe_0000_8c_00.0/render-group-busy-gt0/" -I 1000
>            time             counts unit events
>      1.001139062                  0 ns  xe_0000_8c_00.0/render-group-busy-gt0/
>      2.003294678                  0 ns  xe_0000_8c_00.0/render-group-busy-gt0/
>      3.005199582                  0 ns  xe_0000_8c_00.0/render-group-busy-gt0/
>      4.007076497                  0 ns  xe_0000_8c_00.0/render-group-busy-gt0/
>      5.008553068                  0 ns  xe_0000_8c_00.0/render-group-busy-gt0/
>      6.010531563              43520 ns  xe_0000_8c_00.0/render-group-busy-gt0/
>      7.012468029              44800 ns  xe_0000_8c_00.0/render-group-busy-gt0/
>      8.013463515                  0 ns  xe_0000_8c_00.0/render-group-busy-gt0/
>      9.015300183                  0 ns  xe_0000_8c_00.0/render-group-busy-gt0/
>     10.017233010                  0 ns  xe_0000_8c_00.0/render-group-busy-gt0/
>     10.971934120                  0 ns  xe_0000_8c_00.0/render-group-busy-gt0/
>
> The pmu base implementation is taken from i915.
>
> v2:
> Store last known value when device is awake return that while the GT is
> suspended and then update the driver copy when read during awake.
>
> Co-developed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> Co-developed-by: Bommu Krishnaiah <krishnaiah.bommu@intel.com>
> Signed-off-by: Aravind Iddamsetty <aravind.iddamsetty@intel.com>
> ---
>  drivers/gpu/drm/xe/Makefile          |   2 +
>  drivers/gpu/drm/xe/regs/xe_gt_regs.h |   5 +
>  drivers/gpu/drm/xe/xe_device.c       |   2 +
>  drivers/gpu/drm/xe/xe_device_types.h |   4 +
>  drivers/gpu/drm/xe/xe_gt.c           |   2 +
>  drivers/gpu/drm/xe/xe_irq.c          |  22 +
>  drivers/gpu/drm/xe/xe_module.c       |   5 +
>  drivers/gpu/drm/xe/xe_pmu.c          | 739 +++++++++++++++++++++++++++
>  drivers/gpu/drm/xe/xe_pmu.h          |  25 +
>  drivers/gpu/drm/xe/xe_pmu_types.h    |  80 +++
>  include/uapi/drm/xe_drm.h            |  16 +
>  11 files changed, 902 insertions(+)
>  create mode 100644 drivers/gpu/drm/xe/xe_pmu.c
>  create mode 100644 drivers/gpu/drm/xe/xe_pmu.h
>  create mode 100644 drivers/gpu/drm/xe/xe_pmu_types.h
>
> diff --git a/drivers/gpu/drm/xe/Makefile b/drivers/gpu/drm/xe/Makefile
> index 081c57fd8632..e52ab795c566 100644
> --- a/drivers/gpu/drm/xe/Makefile
> +++ b/drivers/gpu/drm/xe/Makefile
> @@ -217,6 +217,8 @@ xe-$(CONFIG_DRM_XE_DISPLAY) += \
>	i915-display/skl_universal_plane.o \
>	i915-display/skl_watermark.o
>
> +xe-$(CONFIG_PERF_EVENTS) += xe_pmu.o
> +
>  ifeq ($(CONFIG_ACPI),y)
>	xe-$(CONFIG_DRM_XE_DISPLAY) += \
>		i915-display/intel_acpi.o \
> diff --git a/drivers/gpu/drm/xe/regs/xe_gt_regs.h b/drivers/gpu/drm/xe/regs/xe_gt_regs.h
> index 3f664011eaea..c7d9e4634745 100644
> --- a/drivers/gpu/drm/xe/regs/xe_gt_regs.h
> +++ b/drivers/gpu/drm/xe/regs/xe_gt_regs.h
> @@ -285,6 +285,11 @@
>  #define   INVALIDATION_BROADCAST_MODE_DIS	REG_BIT(12)
>  #define   GLOBAL_INVALIDATION_MODE		REG_BIT(2)
>
> +#define XE_OAG_RC0_ANY_ENGINE_BUSY_FREE		XE_REG(0xdb80)
> +#define XE_OAG_ANY_MEDIA_FF_BUSY_FREE		XE_REG(0xdba0)
> +#define XE_OAG_BLT_BUSY_FREE			XE_REG(0xdbbc)
> +#define XE_OAG_RENDER_BUSY_FREE			XE_REG(0xdbdc)
> +
>  #define SAMPLER_MODE				XE_REG_MCR(0xe18c, XE_REG_OPTION_MASKED)
>  #define   ENABLE_SMALLPL			REG_BIT(15)
>  #define   SC_DISABLE_POWER_OPTIMIZATION_EBB	REG_BIT(9)
> diff --git a/drivers/gpu/drm/xe/xe_device.c b/drivers/gpu/drm/xe/xe_device.c
> index c7985af85a53..b2c7bd4a97d9 100644
> --- a/drivers/gpu/drm/xe/xe_device.c
> +++ b/drivers/gpu/drm/xe/xe_device.c
> @@ -328,6 +328,8 @@ int xe_device_probe(struct xe_device *xe)
>
>	xe_debugfs_register(xe);
>
> +	xe_pmu_register(&xe->pmu);
> +
>	err = drmm_add_action_or_reset(&xe->drm, xe_device_sanitize, xe);
>	if (err)
>		return err;
> diff --git a/drivers/gpu/drm/xe/xe_device_types.h b/drivers/gpu/drm/xe/xe_device_types.h
> index 0226d44a6af2..3ba99aae92b9 100644
> --- a/drivers/gpu/drm/xe/xe_device_types.h
> +++ b/drivers/gpu/drm/xe/xe_device_types.h
> @@ -15,6 +15,7 @@
>  #include "xe_devcoredump_types.h"
>  #include "xe_gt_types.h"
>  #include "xe_platform_types.h"
> +#include "xe_pmu.h"
>  #include "xe_step_types.h"
>
>  #if IS_ENABLED(CONFIG_DRM_XE_DISPLAY)
> @@ -332,6 +333,9 @@ struct xe_device {
>	/** @d3cold_allowed: Indicates if d3cold is a valid device state */
>	bool d3cold_allowed;
>
> +	/* @pmu: performance monitoring unit */
> +	struct xe_pmu pmu;
> +

Now I am wondering why we don't make the PMU per-gt (or per-tile)? Per-gt
would work for these busyness registers and I am not sure how the
interrupts are hooked up.

In i915 the PMU being device level helped in having a single timer (rather
than a per gt timer).

Anyway, probably not much practical benefit in making it per-gt/per-tile,
so maybe leave as is. Just thinking out loud.

>	/* private: */
>
>  #if IS_ENABLED(CONFIG_DRM_XE_DISPLAY)
> diff --git a/drivers/gpu/drm/xe/xe_gt.c b/drivers/gpu/drm/xe/xe_gt.c
> index 2458397ce8af..96e3720923d4 100644
> --- a/drivers/gpu/drm/xe/xe_gt.c
> +++ b/drivers/gpu/drm/xe/xe_gt.c
> @@ -593,6 +593,8 @@ int xe_gt_suspend(struct xe_gt *gt)
>	if (err)
>		goto err_msg;
>
> +	engine_group_busyness_store(gt);

Not sure I follow the reason for doing this at suspend time? If the PMU
was active there should be a previous value. Why must we take a new
sample explicitly here?

To me looks like engine_group_busyness_store should not be needed, see
comment below for init_samples too.

> +
>	err = xe_uc_suspend(&gt->uc);
>	if (err)
>		goto err_force_wake;
> diff --git a/drivers/gpu/drm/xe/xe_irq.c b/drivers/gpu/drm/xe/xe_irq.c
> index b4ed1e4a3388..cb943fb94ec7 100644
> --- a/drivers/gpu/drm/xe/xe_irq.c
> +++ b/drivers/gpu/drm/xe/xe_irq.c
> @@ -27,6 +27,24 @@
>  #define IIR(offset)				XE_REG(offset + 0x8)
>  #define IER(offset)				XE_REG(offset + 0xc)
>
> +/*
> + * Interrupt statistic for PMU. Increments the counter only if the
> + * interrupt originated from the GPU so interrupts from a device which
> + * shares the interrupt line are not accounted.
> + */
> +static inline void xe_pmu_irq_stats(struct xe_device *xe,

No inline, the compiler will do it, though it looks like we may want to
always_inline this. Also, should this function really be in xe_pmu.h?
Anyway, probably leave as is.

> +				    irqreturn_t res)
> +{
> +	if (unlikely(res != IRQ_HANDLED))
> +		return;
> +
> +	/*
> +	 * A clever compiler translates that into INC. A not so clever one
> +	 * should at least prevent store tearing.
> +	 */
> +	WRITE_ONCE(xe->pmu.irq_count, xe->pmu.irq_count + 1);
> +}
> +
>  static void assert_iir_is_zero(struct xe_gt *mmio, struct xe_reg reg)
>  {
>	u32 val = xe_mmio_read32(mmio, reg);
> @@ -325,6 +343,8 @@ static irqreturn_t xelp_irq_handler(int irq, void *arg)
>
>	xe_display_irq_enable(xe, gu_misc_iir);
>
> +	xe_pmu_irq_stats(xe, IRQ_HANDLED);
> +
>	return IRQ_HANDLED;
>  }
>
> @@ -414,6 +434,8 @@ static irqreturn_t dg1_irq_handler(int irq, void *arg)
>	dg1_intr_enable(xe, false);
>	xe_display_irq_enable(xe, gu_misc_iir);
>
> +	xe_pmu_irq_stats(xe, IRQ_HANDLED);
> +
>	return IRQ_HANDLED;
>  }
>
> diff --git a/drivers/gpu/drm/xe/xe_module.c b/drivers/gpu/drm/xe/xe_module.c
> index 75e5be939f53..f6fe89748525 100644
> --- a/drivers/gpu/drm/xe/xe_module.c
> +++ b/drivers/gpu/drm/xe/xe_module.c
> @@ -12,6 +12,7 @@
>  #include "xe_hw_fence.h"
>  #include "xe_module.h"
>  #include "xe_pci.h"
> +#include "xe_pmu.h"
>  #include "xe_sched_job.h"
>
>  bool enable_guc = true;
> @@ -49,6 +50,10 @@ static const struct init_funcs init_funcs[] = {
>		.init = xe_sched_job_module_init,
>		.exit = xe_sched_job_module_exit,
>	},
> +	{
> +		.init = xe_pmu_init,
> +		.exit = xe_pmu_exit,
> +	},
>	{
>		.init = xe_register_pci_driver,
>		.exit = xe_unregister_pci_driver,
> diff --git a/drivers/gpu/drm/xe/xe_pmu.c b/drivers/gpu/drm/xe/xe_pmu.c
> new file mode 100644
> index 000000000000..bef1895be9f7
> --- /dev/null
> +++ b/drivers/gpu/drm/xe/xe_pmu.c
> @@ -0,0 +1,739 @@
> +/*
> + * SPDX-License-Identifier: MIT
> + *
> + * Copyright © 2023 Intel Corporation
> + */

This SPDX header is for .h files not .c files. Actually, it is for neither :/

> +
> +#include <drm/drm_drv.h>
> +#include <drm/drm_managed.h>
> +#include <drm/xe_drm.h>
> +
> +#include "regs/xe_gt_regs.h"
> +#include "xe_device.h"
> +#include "xe_gt_clock.h"
> +#include "xe_mmio.h"
> +
> +static cpumask_t xe_pmu_cpumask;
> +static unsigned int xe_pmu_target_cpu = -1;
> +
> +static unsigned int config_gt_id(const u64 config)
> +{
> +	return config >> __XE_PMU_GT_SHIFT;
> +}
> +
> +static u64 config_counter(const u64 config)
> +{
> +	return config & ~(~0ULL << __XE_PMU_GT_SHIFT);
> +}
> +
> +static unsigned int
> +__sample_idx(struct xe_pmu *pmu, unsigned int gt_id, int sample)
> +{
> +	unsigned int idx = gt_id * __XE_NUM_PMU_SAMPLERS + sample;
> +
> +	XE_BUG_ON(idx >= ARRAY_SIZE(pmu->sample));
> +
> +	return idx;
> +}

The compiler does this for us if we make the sample array 2-d.

> +
> +static u64 read_sample(struct xe_pmu *pmu, unsigned int gt_id, int sample)
> +{
> +	return pmu->sample[__sample_idx(pmu, gt_id, sample)].cur;
> +}
> +
> +static void
> +store_sample(struct xe_pmu *pmu, unsigned int gt_id, int sample, u64 val)
> +{
> +	pmu->sample[__sample_idx(pmu, gt_id, sample)].cur = val;
> +}

The three functions above are not needed if we make the sample array
2-d. See here:

https://patchwork.freedesktop.org/patch/538887/?series=118225&rev=1

Only a part of the patch above was merged (see 8ed0753b527dc) to keep the
patch size small, but since for xe we are starting from scratch we can
implement the whole approach above (get rid of the read/store helpers
entirely, direct assignment without the helpers works).
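
Roughly what I have in mind (an untested sketch, using the XE_MAX_GT name
suggested further below):

	/* xe_pmu_types.h */
	u64 sample[XE_MAX_GT][__XE_NUM_PMU_SAMPLERS];

	/* xe_pmu.c: reads/writes become direct accesses */
	pmu->sample[gt_id][sample_type] = val;	/* was store_sample() */
	val = pmu->sample[gt_id][sample_type];	/* was read_sample() */

The index arithmetic which __sample_idx() does by hand is then generated
by the compiler from the array dimensions.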

> +
> +static int engine_busyness_sample_type(u64 config)
> +{
> +	int type = 0;
> +
> +	switch (config) {
> +	case XE_PMU_RENDER_GROUP_BUSY(0):
> +		type =  __XE_SAMPLE_RENDER_GROUP_BUSY;
> +		break;
> +	case XE_PMU_COPY_GROUP_BUSY(0):
> +		type = __XE_SAMPLE_COPY_GROUP_BUSY;
> +		break;
> +	case XE_PMU_MEDIA_GROUP_BUSY(0):
> +		type = __XE_SAMPLE_MEDIA_GROUP_BUSY;
> +		break;
> +	case XE_PMU_ANY_ENGINE_GROUP_BUSY(0):
> +		type = __XE_SAMPLE_ANY_ENGINE_GROUP_BUSY;
> +		break;
> +	}
> +
> +	return type;
> +}
> +
> +static void xe_pmu_event_destroy(struct perf_event *event)
> +{
> +	struct xe_device *xe =
> +		container_of(event->pmu, typeof(*xe), pmu.base);
> +
> +	drm_WARN_ON(&xe->drm, event->parent);
> +
> +	drm_dev_put(&xe->drm);
> +}
> +
> +static u64 __engine_group_busyness_read(struct xe_gt *gt, u64 config)
> +{
> +	u64 val = 0;
> +
> +	switch (config) {
> +	case XE_PMU_RENDER_GROUP_BUSY(0):
> +		val = xe_mmio_read32(gt, XE_OAG_RENDER_BUSY_FREE);
> +		break;
> +	case XE_PMU_COPY_GROUP_BUSY(0):
> +		val = xe_mmio_read32(gt, XE_OAG_BLT_BUSY_FREE);
> +		break;
> +	case XE_PMU_MEDIA_GROUP_BUSY(0):
> +		val = xe_mmio_read32(gt, XE_OAG_ANY_MEDIA_FF_BUSY_FREE);
> +		break;
> +	case XE_PMU_ANY_ENGINE_GROUP_BUSY(0):
> +		val = xe_mmio_read32(gt, XE_OAG_RC0_ANY_ENGINE_BUSY_FREE);
> +		break;
> +	default:
> +		drm_warn(&gt->tile->xe->drm, "unknown pmu event\n");
> +	}

We need xe_device_mem_access_get, also xe_force_wake_get if applicable,
somewhere before reading these registers.
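
Something along these lines (just a sketch; I haven't checked whether these
OAG registers need forcewake at all, or which domain if they do):

	xe_device_mem_access_get(gt_to_xe(gt));
	XE_WARN_ON(xe_force_wake_get(gt_to_fw(gt), XE_FW_GT));

	val = xe_mmio_read32(gt, XE_OAG_RENDER_BUSY_FREE);

	XE_WARN_ON(xe_force_wake_put(gt_to_fw(gt), XE_FW_GT));
	xe_device_mem_access_put(gt_to_xe(gt));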

> +
> +	return xe_gt_clock_interval_to_ns(gt, val * 16);
> +}
> +
> +static u64 engine_group_busyness_read(struct xe_gt *gt, u64 config)
> +{
> +	int sample_type = engine_busyness_sample_type(config);
> +	struct xe_device *xe = gt->tile->xe;
> +	const unsigned int gt_id = gt->info.id;
> +	struct xe_pmu *pmu = &xe->pmu;
> +	bool device_awake;
> +	unsigned long flags;
> +	u64 val;
> +
> +	/*
> +	 * found no better way to check if device is awake or not. Before

xe_device_mem_access_get_if_ongoing (hard to find name).
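
I.e. something like the below (a sketch, assuming
xe_device_mem_access_get_if_ongoing() takes a reference and returns true
only when the device is already awake):

	device_awake = xe_device_mem_access_get_if_ongoing(xe);
	if (device_awake) {
		val = __engine_group_busyness_read(gt, config);
		xe_device_mem_access_put(xe);
	}

The store/read under pmu->lock below then stays as is.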

> +	 * we suspend we set the submission_state.enabled to false.
> +	 */
> +	device_awake = gt->uc.guc.submission_state.enabled ? true : false;
> +	if (device_awake)
> +		val = __engine_group_busyness_read(gt, config);
> +
> +	spin_lock_irqsave(&pmu->lock, flags);
> +
> +	if (device_awake)
> +		store_sample(pmu, gt_id, sample_type, val);
> +	else
> +		val = read_sample(pmu, gt_id, sample_type);
> +
> +	spin_unlock_irqrestore(&pmu->lock, flags);
> +
> +	return val;
> +}
> +
> +void engine_group_busyness_store(struct xe_gt *gt)
> +{
> +	struct xe_pmu *pmu = &gt->tile->xe->pmu;
> +	unsigned int gt_id = gt->info.id;
> +	unsigned long flags;
> +
> +	spin_lock_irqsave(&pmu->lock, flags);
> +
> +	store_sample(pmu, gt_id, __XE_SAMPLE_RENDER_GROUP_BUSY,
> +		     __engine_group_busyness_read(gt, XE_PMU_RENDER_GROUP_BUSY(0)));
> +	store_sample(pmu, gt_id, __XE_SAMPLE_COPY_GROUP_BUSY,
> +		     __engine_group_busyness_read(gt, XE_PMU_COPY_GROUP_BUSY(0)));
> +	store_sample(pmu, gt_id, __XE_SAMPLE_MEDIA_GROUP_BUSY,
> +		     __engine_group_busyness_read(gt, XE_PMU_MEDIA_GROUP_BUSY(0)));
> +	store_sample(pmu, gt_id, __XE_SAMPLE_ANY_ENGINE_GROUP_BUSY,
> +		     __engine_group_busyness_read(gt, XE_PMU_ANY_ENGINE_GROUP_BUSY(0)));
> +
> +	spin_unlock_irqrestore(&pmu->lock, flags);
> +}
> +
> +static int
> +config_status(struct xe_device *xe, u64 config)
> +{
> +	unsigned int max_gt_id = xe->info.gt_count > 1 ? 1 : 0;
> +	unsigned int gt_id = config_gt_id(config);
> +	struct xe_gt *gt = xe_device_get_gt(xe, gt_id);
> +
> +	if (gt_id > max_gt_id)
> +		return -ENOENT;
> +
> +	switch (config_counter(config)) {
> +	case XE_PMU_INTERRUPTS(0):
> +		if (gt_id)
> +			return -ENOENT;

OK: this is a global event so we say this is enabled only for gt0.

> +		break;
> +	case XE_PMU_RENDER_GROUP_BUSY(0):
> +	case XE_PMU_COPY_GROUP_BUSY(0):
> +	case XE_PMU_ANY_ENGINE_GROUP_BUSY(0):
> +		if (GRAPHICS_VER(xe) < 12)

Any point checking? xe only supports Gen12+.

> +			return -ENOENT;
> +		break;
> +	case XE_PMU_MEDIA_GROUP_BUSY(0):
> +		if (MEDIA_VER(xe) >= 13 && gt->info.type != XE_GT_TYPE_MEDIA)
> +			return -ENOENT;

OK: this makes sense, so we will expose this event only for media gt's.

> +		break;
> +	default:
> +		return -ENOENT;
> +	}
> +
> +	return 0;
> +}
> +
> +static int xe_pmu_event_init(struct perf_event *event)
> +{
> +	struct xe_device *xe =
> +		container_of(event->pmu, typeof(*xe), pmu.base);
> +	struct xe_pmu *pmu = &xe->pmu;
> +	int ret;
> +
> +	if (pmu->closed)
> +		return -ENODEV;
> +
> +	if (event->attr.type != event->pmu->type)
> +		return -ENOENT;
> +
> +	/* unsupported modes and filters */
> +	if (event->attr.sample_period) /* no sampling */
> +		return -EINVAL;
> +
> +	if (has_branch_stack(event))
> +		return -EOPNOTSUPP;
> +
> +	if (event->cpu < 0)
> +		return -EINVAL;
> +
> +	/* only allow running on one cpu at a time */
> +	if (!cpumask_test_cpu(event->cpu, &xe_pmu_cpumask))
> +		return -EINVAL;
> +
> +	ret = config_status(xe, event->attr.config);
> +	if (ret)
> +		return ret;
> +
> +	if (!event->parent) {
> +		drm_dev_get(&xe->drm);
> +		event->destroy = xe_pmu_event_destroy;
> +	}
> +
> +	return 0;
> +}
> +
> +static u64 __xe_pmu_event_read(struct perf_event *event)
> +{
> +	struct xe_device *xe =
> +		container_of(event->pmu, typeof(*xe), pmu.base);
> +	const unsigned int gt_id = config_gt_id(event->attr.config);
> +	const u64 config = config_counter(event->attr.config);
> +	struct xe_gt *gt = xe_device_get_gt(xe, gt_id);
> +	struct xe_pmu *pmu = &xe->pmu;
> +	u64 val = 0;
> +
> +	switch (config) {
> +	case XE_PMU_INTERRUPTS(0):
> +		val = READ_ONCE(pmu->irq_count);

OK: The correct way to do this READ_ONCE/WRITE_ONCE irq_count stuff would
be to take pmu->lock (both while reading and writing irq_count) but that
would be expensive in the interrupt handler (as the .h hints). So all we
can do is what is done here.

> +		break;
> +	case XE_PMU_RENDER_GROUP_BUSY(0):
> +	case XE_PMU_COPY_GROUP_BUSY(0):
> +	case XE_PMU_ANY_ENGINE_GROUP_BUSY(0):
> +	case XE_PMU_MEDIA_GROUP_BUSY(0):
> +		val = engine_group_busyness_read(gt, config);
> +	}
> +
> +	return val;
> +}
> +
> +static void xe_pmu_event_read(struct perf_event *event)
> +{
> +	struct xe_device *xe =
> +		container_of(event->pmu, typeof(*xe), pmu.base);
> +	struct hw_perf_event *hwc = &event->hw;
> +	struct xe_pmu *pmu = &xe->pmu;
> +	u64 prev, new;
> +
> +	if (pmu->closed) {
> +		event->hw.state = PERF_HES_STOPPED;
> +		return;
> +	}
> +again:
> +	prev = local64_read(&hwc->prev_count);
> +	new = __xe_pmu_event_read(event);
> +
> +	if (local64_cmpxchg(&hwc->prev_count, prev, new) != prev)
> +		goto again;
> +
> +	local64_add(new - prev, &event->count);
> +}
> +
> +static void xe_pmu_enable(struct perf_event *event)
> +{
> +	/*
> +	 * Store the current counter value so we can report the correct delta
> +	 * for all listeners. Even when the event was already enabled and has
> +	 * an existing non-zero value.
> +	 */
> +	local64_set(&event->hw.prev_count, __xe_pmu_event_read(event));
> +}
> +
> +static void xe_pmu_event_start(struct perf_event *event, int flags)
> +{
> +	struct xe_device *xe =
> +		container_of(event->pmu, typeof(*xe), pmu.base);
> +	struct xe_pmu *pmu = &xe->pmu;
> +
> +	if (pmu->closed)
> +		return;
> +
> +	xe_pmu_enable(event);
> +	event->hw.state = 0;
> +}
> +
> +static void xe_pmu_event_stop(struct perf_event *event, int flags)
> +{
> +	if (flags & PERF_EF_UPDATE)
> +		xe_pmu_event_read(event);
> +
> +	event->hw.state = PERF_HES_STOPPED;
> +}
> +
> +static int xe_pmu_event_add(struct perf_event *event, int flags)
> +{
> +	struct xe_device *xe =
> +		container_of(event->pmu, typeof(*xe), pmu.base);
> +	struct xe_pmu *pmu = &xe->pmu;
> +
> +	if (pmu->closed)
> +		return -ENODEV;
> +
> +	if (flags & PERF_EF_START)
> +		xe_pmu_event_start(event, flags);
> +
> +	return 0;
> +}
> +
> +static void xe_pmu_event_del(struct perf_event *event, int flags)
> +{
> +	xe_pmu_event_stop(event, PERF_EF_UPDATE);
> +}
> +
> +static int xe_pmu_event_event_idx(struct perf_event *event)
> +{
> +	return 0;
> +}
> +
> +struct xe_str_attribute {
> +	struct device_attribute attr;
> +	const char *str;
> +};
> +
> +static ssize_t xe_pmu_format_show(struct device *dev,
> +				  struct device_attribute *attr, char *buf)
> +{
> +	struct xe_str_attribute *eattr;
> +
> +	eattr = container_of(attr, struct xe_str_attribute, attr);
> +	return sprintf(buf, "%s\n", eattr->str);
> +}
> +
> +#define XE_PMU_FORMAT_ATTR(_name, _config) \
> +	(&((struct xe_str_attribute[]) { \
> +		{ .attr = __ATTR(_name, 0444, xe_pmu_format_show, NULL), \
> +		  .str = _config, } \
> +	})[0].attr.attr)
> +
> +static struct attribute *xe_pmu_format_attrs[] = {
> +	XE_PMU_FORMAT_ATTR(xe_eventid, "config:0-20"),

0-20 means bits 0-20? Though here we probably have a different number of
config bits? Probably doesn't matter?

The string will show up with:

cat /sys/devices/xe/format/xe_eventid

> +	NULL,
> +};
> +
> +static const struct attribute_group xe_pmu_format_attr_group = {
> +	.name = "format",
> +	.attrs = xe_pmu_format_attrs,
> +};
> +
> +struct xe_ext_attribute {
> +	struct device_attribute attr;
> +	unsigned long val;
> +};
> +
> +static ssize_t xe_pmu_event_show(struct device *dev,
> +				 struct device_attribute *attr, char *buf)
> +{
> +	struct xe_ext_attribute *eattr;
> +
> +	eattr = container_of(attr, struct xe_ext_attribute, attr);
> +	return sprintf(buf, "config=0x%lx\n", eattr->val);
> +}
> +
> +static ssize_t cpumask_show(struct device *dev,
> +			    struct device_attribute *attr, char *buf)
> +{
> +	return cpumap_print_to_pagebuf(true, buf, &xe_pmu_cpumask);
> +}
> +
> +static DEVICE_ATTR_RO(cpumask);
> +
> +static struct attribute *xe_cpumask_attrs[] = {
> +	&dev_attr_cpumask.attr,
> +	NULL,
> +};
> +
> +static const struct attribute_group xe_pmu_cpumask_attr_group = {
> +	.attrs = xe_cpumask_attrs,
> +};
> +
> +#define __event(__counter, __name, __unit) \
> +{ \
> +	.counter = (__counter), \
> +	.name = (__name), \
> +	.unit = (__unit), \
> +	.global = false, \
> +}
> +
> +#define __global_event(__counter, __name, __unit) \
> +{ \
> +	.counter = (__counter), \
> +	.name = (__name), \
> +	.unit = (__unit), \
> +	.global = true, \
> +}
> +
> +static struct xe_ext_attribute *
> +add_xe_attr(struct xe_ext_attribute *attr, const char *name, u64 config)
> +{
> +	sysfs_attr_init(&attr->attr.attr);
> +	attr->attr.attr.name = name;
> +	attr->attr.attr.mode = 0444;
> +	attr->attr.show = xe_pmu_event_show;
> +	attr->val = config;
> +
> +	return ++attr;
> +}
> +
> +static struct perf_pmu_events_attr *
> +add_pmu_attr(struct perf_pmu_events_attr *attr, const char *name,
> +	     const char *str)
> +{
> +	sysfs_attr_init(&attr->attr.attr);
> +	attr->attr.attr.name = name;
> +	attr->attr.attr.mode = 0444;
> +	attr->attr.show = perf_event_sysfs_show;
> +	attr->event_str = str;
> +
> +	return ++attr;
> +}
> +
> +static struct attribute **
> +create_event_attributes(struct xe_pmu *pmu)
> +{
> +	struct xe_device *xe = container_of(pmu, typeof(*xe), pmu);
> +	static const struct {
> +		unsigned int counter;
> +		const char *name;
> +		const char *unit;
> +		bool global;
> +	} events[] = {
> +		__global_event(0, "interrupts", NULL),
> +		__event(1, "render-group-busy", "ns"),
> +		__event(2, "copy-group-busy", "ns"),
> +		__event(3, "media-group-busy", "ns"),
> +		__event(4, "any-engine-group-busy", "ns"),
> +	};

OK: this function is some black magic to expose stuff through
PMU. Identical to i915 and seems to be working from the commit message so
should be fine.

> +
> +	unsigned int count = 0;
> +	struct perf_pmu_events_attr *pmu_attr = NULL, *pmu_iter;
> +	struct xe_ext_attribute *xe_attr = NULL, *xe_iter;
> +	struct attribute **attr = NULL, **attr_iter;
> +	struct xe_gt *gt;
> +	unsigned int i, j;
> +
> +	/* Count how many counters we will be exposing. */
> +	for_each_gt(gt, xe, j) {
> +		for (i = 0; i < ARRAY_SIZE(events); i++) {
> +			u64 config = ___XE_PMU_OTHER(j, events[i].counter);
> +
> +			if (!config_status(xe, config))
> +				count++;
> +		}
> +	}
> +
> +	/* Allocate attribute objects and table. */
> +	xe_attr = kcalloc(count, sizeof(*xe_attr), GFP_KERNEL);
> +	if (!xe_attr)
> +		goto err_alloc;
> +
> +	pmu_attr = kcalloc(count, sizeof(*pmu_attr), GFP_KERNEL);
> +	if (!pmu_attr)
> +		goto err_alloc;
> +
> +	/* Max one pointer of each attribute type plus a termination entry. */
> +	attr = kcalloc(count * 2 + 1, sizeof(*attr), GFP_KERNEL);
> +	if (!attr)
> +		goto err_alloc;
> +
> +	xe_iter = xe_attr;
> +	pmu_iter = pmu_attr;
> +	attr_iter = attr;
> +
> +	for_each_gt(gt, xe, j) {
> +		for (i = 0; i < ARRAY_SIZE(events); i++) {
> +			u64 config = ___XE_PMU_OTHER(j, events[i].counter);
> +			char *str;
> +
> +			if (config_status(xe, config))
> +				continue;
> +
> +			if (events[i].global)
> +				str = kstrdup(events[i].name, GFP_KERNEL);
> +			else
> +				str = kasprintf(GFP_KERNEL, "%s-gt%u",
> +						events[i].name, j);
> +			if (!str)
> +				goto err;
> +
> +			*attr_iter++ = &xe_iter->attr.attr;
> +			xe_iter = add_xe_attr(xe_iter, str, config);
> +
> +			if (events[i].unit) {
> +				if (events[i].global)
> +					str = kasprintf(GFP_KERNEL, "%s.unit",
> +							events[i].name);
> +				else
> +					str = kasprintf(GFP_KERNEL, "%s-gt%u.unit",
> +							events[i].name, j);
> +				if (!str)
> +					goto err;
> +
> +				*attr_iter++ = &pmu_iter->attr.attr;
> +				pmu_iter = add_pmu_attr(pmu_iter, str,
> +							events[i].unit);
> +			}
> +		}
> +	}
> +
> +	pmu->xe_attr = xe_attr;
> +	pmu->pmu_attr = pmu_attr;
> +
> +	return attr;
> +
> +err:
> +	for (attr_iter = attr; *attr_iter; attr_iter++)
> +		kfree((*attr_iter)->name);
> +
> +err_alloc:
> +	kfree(attr);
> +	kfree(xe_attr);
> +	kfree(pmu_attr);
> +
> +	return NULL;
> +}
> +
> +static void free_event_attributes(struct xe_pmu *pmu)
> +{
> +	struct attribute **attr_iter = pmu->events_attr_group.attrs;
> +
> +	for (; *attr_iter; attr_iter++)
> +		kfree((*attr_iter)->name);
> +
> +	kfree(pmu->events_attr_group.attrs);
> +	kfree(pmu->xe_attr);
> +	kfree(pmu->pmu_attr);
> +
> +	pmu->events_attr_group.attrs = NULL;
> +	pmu->xe_attr = NULL;
> +	pmu->pmu_attr = NULL;
> +}
> +
> +static int xe_pmu_cpu_online(unsigned int cpu, struct hlist_node *node)
> +{
> +	struct xe_pmu *pmu = hlist_entry_safe(node, typeof(*pmu), cpuhp.node);
> +
> +	XE_BUG_ON(!pmu->base.event_init);
> +
> +	/* Select the first online CPU as a designated reader. */
> +	if (cpumask_empty(&xe_pmu_cpumask))
> +		cpumask_set_cpu(cpu, &xe_pmu_cpumask);
> +
> +	return 0;
> +}
> +
> +static int xe_pmu_cpu_offline(unsigned int cpu, struct hlist_node *node)
> +{
> +	struct xe_pmu *pmu = hlist_entry_safe(node, typeof(*pmu), cpuhp.node);
> +	unsigned int target = xe_pmu_target_cpu;
> +
> +	XE_BUG_ON(!pmu->base.event_init);
> +
> +	/*
> +	 * Unregistering an instance generates a CPU offline event which we must
> +	 * ignore to avoid incorrectly modifying the shared xe_pmu_cpumask.
> +	 */
> +	if (pmu->closed)
> +		return 0;
> +
> +	if (cpumask_test_and_clear_cpu(cpu, &xe_pmu_cpumask)) {
> +		target = cpumask_any_but(topology_sibling_cpumask(cpu), cpu);
> +
> +		/* Migrate events if there is a valid target */
> +		if (target < nr_cpu_ids) {
> +			cpumask_set_cpu(target, &xe_pmu_cpumask);
> +			xe_pmu_target_cpu = target;
> +		}
> +	}
> +
> +	if (target < nr_cpu_ids && target != pmu->cpuhp.cpu) {
> +		perf_pmu_migrate_context(&pmu->base, cpu, target);
> +		pmu->cpuhp.cpu = target;
> +	}
> +
> +	return 0;
> +}
> +
> +static enum cpuhp_state cpuhp_slot = CPUHP_INVALID;
> +
> +int xe_pmu_init(void)
> +{
> +	int ret;
> +
> +	ret = cpuhp_setup_state_multi(CPUHP_AP_ONLINE_DYN,
> +				      "perf/x86/intel/xe:online",
> +				      xe_pmu_cpu_online,
> +				      xe_pmu_cpu_offline);
> +	if (ret < 0)
> +		pr_notice("Failed to setup cpuhp state for xe PMU! (%d)\n",
> +			  ret);
> +	else
> +		cpuhp_slot = ret;
> +
> +	return 0;
> +}
> +
> +void xe_pmu_exit(void)
> +{
> +	if (cpuhp_slot != CPUHP_INVALID)
> +		cpuhp_remove_multi_state(cpuhp_slot);
> +}
> +
> +static int xe_pmu_register_cpuhp_state(struct xe_pmu *pmu)
> +{
> +	if (cpuhp_slot == CPUHP_INVALID)
> +		return -EINVAL;
> +
> +	return cpuhp_state_add_instance(cpuhp_slot, &pmu->cpuhp.node);
> +}
> +
> +static void xe_pmu_unregister_cpuhp_state(struct xe_pmu *pmu)
> +{
> +	cpuhp_state_remove_instance(cpuhp_slot, &pmu->cpuhp.node);
> +}
> +
> +static void xe_pmu_unregister(struct drm_device *device, void *arg)
> +{
> +	struct xe_pmu *pmu = arg;
> +
> +	if (!pmu->base.event_init)
> +		return;
> +
> +	/*
> +	 * "Disconnect" the PMU callbacks - since all are atomic synchronize_rcu
> +	 * ensures all currently executing ones will have exited before we
> +	 * proceed with unregistration.
> +	 */
> +	pmu->closed = true;
> +	synchronize_rcu();
> +
> +	xe_pmu_unregister_cpuhp_state(pmu);
> +
> +	perf_pmu_unregister(&pmu->base);
> +	pmu->base.event_init = NULL;
> +	kfree(pmu->base.attr_groups);
> +	kfree(pmu->name);
> +	free_event_attributes(pmu);
> +}
> +
> +static void init_samples(struct xe_pmu *pmu)
> +{
> +	struct xe_device *xe = container_of(pmu, typeof(*xe), pmu);
> +	struct xe_gt *gt;
> +	unsigned int i;
> +
> +	for_each_gt(gt, xe, i)
> +		engine_group_busyness_store(gt);
> +}
> +
> +void xe_pmu_register(struct xe_pmu *pmu)

Why void, why not int? Is PMU failure a non-fatal error?
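
If the intent is that PMU failure is non-fatal, returning an int still lets
the probe path decide what to do with it, e.g. (sketch):

	ret = xe_pmu_register(&xe->pmu);
	if (ret)
		drm_warn(&xe->drm, "PMU registration failed (%d)\n", ret);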

> +{
> +	struct xe_device *xe = container_of(pmu, typeof(*xe), pmu);
> +	const struct attribute_group *attr_groups[] = {
> +		&xe_pmu_format_attr_group,
> +		&pmu->events_attr_group,
> +		&xe_pmu_cpumask_attr_group,

Can someone please explain what this cpumask/cpuhotplug stuff does and
whether it needs to be in this patch? There's something here:

b46a33e271ed ("drm/i915/pmu: Expose a PMU interface for perf queries")

I'd rather just have the basic PMU infra and the events in this patch and
punt this cpumask/cpuhotplug stuff to a later patch, unless someone can say
what it does.

Though perf_pmu_register seems to be doing some per-cpu stuff, so likely
this is needed. But amdgpu_pmu only has event and format attributes.

Mostly leave as is I guess.

> +		NULL
> +	};
> +
> +	int ret = -ENOMEM;
> +
> +	spin_lock_init(&pmu->lock);
> +	pmu->cpuhp.cpu = -1;
> +	init_samples(pmu);

Why init_samples here? Can't we init the particular sample in
xe_pmu_event_init or even xe_pmu_event_start?

Init'ing here may be too soon since the event might not be enabled for a
long time. So really this needs to move to xe_pmu_event_init or
xe_pmu_event_start.

Actually this is already happening in xe_pmu_enable. We just need to decide
when we want to wake the device up and when we don't. So maybe wake the
device up at start (use xe_device_mem_access_get) and not wake up during
read (xe_device_mem_access_get_if_ongoing etc.)?
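
Roughly (a sketch; the matching xe_device_mem_access_put would presumably
go in xe_pmu_event_stop, and holding the reference for the lifetime of the
event blocks runtime suspend, which may or may not be what we want):

	static void xe_pmu_event_start(struct perf_event *event, int flags)
	{
		...
		xe_device_mem_access_get(xe);	/* wake the device */
		xe_pmu_enable(event);		/* samples the current counter */
		event->hw.state = 0;
	}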

> +
> +	pmu->name = kasprintf(GFP_KERNEL,
> +			      "xe_%s",
> +			      dev_name(xe->drm.dev));
> +	if (pmu->name)
> +		/* tools/perf reserves colons as special. */
> +		strreplace((char *)pmu->name, ':', '_');
> +
> +	if (!pmu->name)
> +		goto err;
> +
> +	pmu->events_attr_group.name = "events";
> +	pmu->events_attr_group.attrs = create_event_attributes(pmu);
> +	if (!pmu->events_attr_group.attrs)
> +		goto err_name;
> +
> +	pmu->base.attr_groups = kmemdup(attr_groups, sizeof(attr_groups),
> +					GFP_KERNEL);
> +	if (!pmu->base.attr_groups)
> +		goto err_attr;
> +
> +	pmu->base.module	= THIS_MODULE;
> +	pmu->base.task_ctx_nr	= perf_invalid_context;
> +	pmu->base.event_init	= xe_pmu_event_init;
> +	pmu->base.add		= xe_pmu_event_add;
> +	pmu->base.del		= xe_pmu_event_del;
> +	pmu->base.start		= xe_pmu_event_start;
> +	pmu->base.stop		= xe_pmu_event_stop;
> +	pmu->base.read		= xe_pmu_event_read;
> +	pmu->base.event_idx	= xe_pmu_event_event_idx;
> +
> +	ret = perf_pmu_register(&pmu->base, pmu->name, -1);
> +	if (ret)
> +		goto err_groups;
> +
> +	ret = xe_pmu_register_cpuhp_state(pmu);
> +	if (ret)
> +		goto err_unreg;
> +
> +	ret = drmm_add_action_or_reset(&xe->drm, xe_pmu_unregister, pmu);
> +	XE_WARN_ON(ret);

We should just follow the regular error rewind here too and let
drm_notice() at the end print the error. This is what other drivers calling
drmm_add_action_or_reset seem to be doing.
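
I.e. something like (a sketch; note that on failure drmm_add_action_or_reset()
already invokes the release action, here xe_pmu_unregister(), so no further
unwinding should be needed before the report):

	ret = drmm_add_action_or_reset(&xe->drm, xe_pmu_unregister, pmu);
	if (ret)
		goto err;	/* the existing label which prints drm_notice() */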

> +
> +	return;
> +
> +err_unreg:
> +	perf_pmu_unregister(&pmu->base);
> +err_groups:
> +	kfree(pmu->base.attr_groups);
> +err_attr:
> +	pmu->base.event_init = NULL;
> +	free_event_attributes(pmu);
> +err_name:
> +	kfree(pmu->name);
> +err:
> +	drm_notice(&xe->drm, "Failed to register PMU!\n");
> +}
> diff --git a/drivers/gpu/drm/xe/xe_pmu.h b/drivers/gpu/drm/xe/xe_pmu.h
> new file mode 100644
> index 000000000000..d3f47f4ab343
> --- /dev/null
> +++ b/drivers/gpu/drm/xe/xe_pmu.h
> @@ -0,0 +1,25 @@
> +/* SPDX-License-Identifier: MIT */
> +/*
> + * Copyright © 2023 Intel Corporation
> + */
> +
> +#ifndef _XE_PMU_H_
> +#define _XE_PMU_H_
> +
> +#include "xe_gt_types.h"
> +#include "xe_pmu_types.h"
> +
> +#ifdef CONFIG_PERF_EVENTS

nit but maybe this should be:

#if IS_ENABLED(CONFIG_PERF_EVENTS)

or,

#if IS_BUILTIN(CONFIG_PERF_EVENTS)

Note CONFIG_PERF_EVENTS is a boolean kconfig option.

See similar macro IS_REACHABLE() in i915_hwmon.h.

> +int xe_pmu_init(void);
> +void xe_pmu_exit(void);
> +void xe_pmu_register(struct xe_pmu *pmu);
> +void engine_group_busyness_store(struct xe_gt *gt);

Add an xe_pmu_ prefix if the function is needed (hopefully not).

> +#else
> +static inline int xe_pmu_init(void) { return 0; }
> +static inline void xe_pmu_exit(void) {}
> +static inline void xe_pmu_register(struct xe_pmu *pmu) {}
> +static inline void engine_group_busyness_store(struct xe_gt *gt) {}
> +#endif
> +
> +#endif
> +
> diff --git a/drivers/gpu/drm/xe/xe_pmu_types.h b/drivers/gpu/drm/xe/xe_pmu_types.h
> new file mode 100644
> index 000000000000..e87edd4d6a87
> --- /dev/null
> +++ b/drivers/gpu/drm/xe/xe_pmu_types.h
> @@ -0,0 +1,80 @@
> +/* SPDX-License-Identifier: MIT */
> +/*
> + * Copyright © 2023 Intel Corporation
> + */
> +
> +#ifndef _XE_PMU_TYPES_H_
> +#define _XE_PMU_TYPES_H_
> +
> +#include <linux/perf_event.h>
> +#include <linux/spinlock_types.h>
> +#include <uapi/drm/xe_drm.h>
> +
> +enum {
> +	__XE_SAMPLE_RENDER_GROUP_BUSY,
> +	__XE_SAMPLE_COPY_GROUP_BUSY,
> +	__XE_SAMPLE_MEDIA_GROUP_BUSY,
> +	__XE_SAMPLE_ANY_ENGINE_GROUP_BUSY,
> +	__XE_NUM_PMU_SAMPLERS

OK: irq_count is missing here since these are read from the device...

> +};
> +
> +struct xe_pmu_sample {
> +	u64 cur;
> +};

This was also discussed for the i915 PMU: no point having a struct with a
single u64 member. Might as well just use u64 wherever we are using
struct xe_pmu_sample.

> +
> +#define XE_MAX_GT_PER_TILE 2

Why per tile? The array size should be max_gt_per_device. Just call it
XE_MAX_GT?

> +
> +struct xe_pmu {
> +	/**
> +	 * @cpuhp: Struct used for CPU hotplug handling.
> +	 */
> +	struct {
> +		struct hlist_node node;
> +		unsigned int cpu;
> +	} cpuhp;
> +	/**
> +	 * @base: PMU base.
> +	 */
> +	struct pmu base;
> +	/**
> +	 * @closed: xe is unregistering.
> +	 */
> +	bool closed;
> +	/**
> +	 * @name: Name as registered with perf core.
> +	 */
> +	const char *name;
> +	/**
> +	 * @lock: Lock protecting enable mask and ref count handling.
> +	 */
> +	spinlock_t lock;
> +	/**
> +	 * @sample: Current and previous (raw) counters.
> +	 *
> +	 * These counters are updated when the device is awake.
> +	 *
> +	 */
> +	struct xe_pmu_sample sample[XE_MAX_GT_PER_TILE * __XE_NUM_PMU_SAMPLERS];

Change to 2-d array. See above.

> +	/**
> +	 * @irq_count: Number of interrupts
> +	 *
> +	 * Intentionally unsigned long to avoid atomics or heuristics on 32bit.
> +	 * 4e9 interrupts are a lot and postprocessing can really deal with an
> +	 * occasional wraparound easily. It's 32bit after all.
> +	 */
> +	unsigned long irq_count;
> +	/**
> +	 * @events_attr_group: Device events attribute group.
> +	 */
> +	struct attribute_group events_attr_group;
> +	/**
> +	 * @xe_attr: Memory block holding device attributes.
> +	 */
> +	void *xe_attr;
> +	/**
> +	 * @pmu_attr: Memory block holding device attributes.
> +	 */
> +	void *pmu_attr;
> +};
> +
> +#endif
> diff --git a/include/uapi/drm/xe_drm.h b/include/uapi/drm/xe_drm.h
> index 965cd9527ff1..ed097056f944 100644
> --- a/include/uapi/drm/xe_drm.h
> +++ b/include/uapi/drm/xe_drm.h
> @@ -990,6 +990,22 @@ struct drm_xe_vm_madvise {
>	__u64 reserved[2];
>  };
>
> +/* PMU event config IDs */
> +
> +/*
> + * Top 4 bits of every counter are GT id.
> + */
> +#define __XE_PMU_GT_SHIFT (60)

To future-proof this, and also because we seem to have plenty of bits
available, I think we should change this to 56 (instead of 60).
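
I.e. (sketch):

	/* Top 8 bits of every counter are the GT id. */
	#define __XE_PMU_GT_SHIFT (56)

which leaves 8 bits (256 values) for the GT id and still plenty of config
space below it; config_gt_id()/config_counter() work unchanged.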

> +
> +#define ___XE_PMU_OTHER(gt, x) \
> +	(((__u64)(x)) | ((__u64)(gt) << __XE_PMU_GT_SHIFT))
> +
> +#define XE_PMU_INTERRUPTS(gt)			___XE_PMU_OTHER(gt, 0)
> +#define XE_PMU_RENDER_GROUP_BUSY(gt)		___XE_PMU_OTHER(gt, 1)
> +#define XE_PMU_COPY_GROUP_BUSY(gt)		___XE_PMU_OTHER(gt, 2)
> +#define XE_PMU_MEDIA_GROUP_BUSY(gt)		___XE_PMU_OTHER(gt, 3)
> +#define XE_PMU_ANY_ENGINE_GROUP_BUSY(gt)	___XE_PMU_OTHER(gt, 4)
> +
>  #if defined(__cplusplus)
>  }
>  #endif
> --
> 2.25.1
>

Thanks.
--
Ashutosh

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [Intel-xe] [PATCH v2 2/2] drm/xe/pmu: Enable PMU interface
  2023-07-21  1:02   ` Dixit, Ashutosh
@ 2023-07-21 11:51     ` Iddamsetty, Aravind
  2023-07-21 23:36       ` Dixit, Ashutosh
  0 siblings, 1 reply; 59+ messages in thread
From: Iddamsetty, Aravind @ 2023-07-21 11:51 UTC (permalink / raw)
  To: Dixit, Ashutosh; +Cc: Bommu Krishnaiah, intel-xe, Tvrtko Ursulin



On 21-07-2023 06:32, Dixit, Ashutosh wrote:
> On Tue, 27 Jun 2023 05:21:13 -0700, Aravind Iddamsetty wrote:
>>
> 
> Hi Aravind,

Hi Ashutosh,
> 
> More stuff to mull over. You can ignore comments starting with "OK", those
> are just notes to myself.
> 
> Also, maybe at some point we can add a basic IGT which reads these
> exposed counters and verifies that we can read them and that they are
> monotonically increasing?

This is the IGT series exercising these counters, posted by Venkat:
https://patchwork.freedesktop.org/series/119936/

> 
>> There are a set of engine group busyness counters provided by HW which are
>> perfect fit to be exposed via PMU perf events.
>>
>> BSPEC: 46559, 46560, 46722, 46729
> 
> Also add these Bspec entries: 71028, 52071

OK.

> 
>>
>> events can be listed using:
>> perf list
>>   xe_0000_03_00.0/any-engine-group-busy-gt0/         [Kernel PMU event]
>>   xe_0000_03_00.0/copy-group-busy-gt0/               [Kernel PMU event]
>>   xe_0000_03_00.0/interrupts/                        [Kernel PMU event]
>>   xe_0000_03_00.0/media-group-busy-gt0/              [Kernel PMU event]
>>   xe_0000_03_00.0/render-group-busy-gt0/             [Kernel PMU event]
>>
>> and can be read using:
>>
>> perf stat -e "xe_0000_8c_00.0/render-group-busy-gt0/" -I 1000
>>            time             counts unit events
>>      1.001139062                  0 ns  xe_0000_8c_00.0/render-group-busy-gt0/
>>      2.003294678                  0 ns  xe_0000_8c_00.0/render-group-busy-gt0/
>>      3.005199582                  0 ns  xe_0000_8c_00.0/render-group-busy-gt0/
>>      4.007076497                  0 ns  xe_0000_8c_00.0/render-group-busy-gt0/
>>      5.008553068                  0 ns  xe_0000_8c_00.0/render-group-busy-gt0/
>>      6.010531563              43520 ns  xe_0000_8c_00.0/render-group-busy-gt0/
>>      7.012468029              44800 ns  xe_0000_8c_00.0/render-group-busy-gt0/
>>      8.013463515                  0 ns  xe_0000_8c_00.0/render-group-busy-gt0/
>>      9.015300183                  0 ns  xe_0000_8c_00.0/render-group-busy-gt0/
>>     10.017233010                  0 ns  xe_0000_8c_00.0/render-group-busy-gt0/
>>     10.971934120                  0 ns  xe_0000_8c_00.0/render-group-busy-gt0/
>>
>> The pmu base implementation is taken from i915.
>>
>> v2:
>> Store last known value when device is awake return that while the GT is
>> suspended and then update the driver copy when read during awake.
>>
>> Co-developed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
>> Co-developed-by: Bommu Krishnaiah <krishnaiah.bommu@intel.com>
>> Signed-off-by: Aravind Iddamsetty <aravind.iddamsetty@intel.com>
>> ---
>>  drivers/gpu/drm/xe/Makefile          |   2 +
>>  drivers/gpu/drm/xe/regs/xe_gt_regs.h |   5 +
>>  drivers/gpu/drm/xe/xe_device.c       |   2 +
>>  drivers/gpu/drm/xe/xe_device_types.h |   4 +
>>  drivers/gpu/drm/xe/xe_gt.c           |   2 +
>>  drivers/gpu/drm/xe/xe_irq.c          |  22 +
>>  drivers/gpu/drm/xe/xe_module.c       |   5 +
>>  drivers/gpu/drm/xe/xe_pmu.c          | 739 +++++++++++++++++++++++++++
>>  drivers/gpu/drm/xe/xe_pmu.h          |  25 +
>>  drivers/gpu/drm/xe/xe_pmu_types.h    |  80 +++
>>  include/uapi/drm/xe_drm.h            |  16 +
>>  11 files changed, 902 insertions(+)
>>  create mode 100644 drivers/gpu/drm/xe/xe_pmu.c
>>  create mode 100644 drivers/gpu/drm/xe/xe_pmu.h
>>  create mode 100644 drivers/gpu/drm/xe/xe_pmu_types.h
>>
>> diff --git a/drivers/gpu/drm/xe/Makefile b/drivers/gpu/drm/xe/Makefile
>> index 081c57fd8632..e52ab795c566 100644
>> --- a/drivers/gpu/drm/xe/Makefile
>> +++ b/drivers/gpu/drm/xe/Makefile
>> @@ -217,6 +217,8 @@ xe-$(CONFIG_DRM_XE_DISPLAY) += \
>> 	i915-display/skl_universal_plane.o \
>> 	i915-display/skl_watermark.o
>>
>> +xe-$(CONFIG_PERF_EVENTS) += xe_pmu.o
>> +
>>  ifeq ($(CONFIG_ACPI),y)
>> 	xe-$(CONFIG_DRM_XE_DISPLAY) += \
>> 		i915-display/intel_acpi.o \
>> diff --git a/drivers/gpu/drm/xe/regs/xe_gt_regs.h b/drivers/gpu/drm/xe/regs/xe_gt_regs.h
>> index 3f664011eaea..c7d9e4634745 100644
>> --- a/drivers/gpu/drm/xe/regs/xe_gt_regs.h
>> +++ b/drivers/gpu/drm/xe/regs/xe_gt_regs.h
>> @@ -285,6 +285,11 @@
>>  #define   INVALIDATION_BROADCAST_MODE_DIS	REG_BIT(12)
>>  #define   GLOBAL_INVALIDATION_MODE		REG_BIT(2)
>>
>> +#define XE_OAG_RC0_ANY_ENGINE_BUSY_FREE		XE_REG(0xdb80)
>> +#define XE_OAG_ANY_MEDIA_FF_BUSY_FREE		XE_REG(0xdba0)
>> +#define XE_OAG_BLT_BUSY_FREE			XE_REG(0xdbbc)
>> +#define XE_OAG_RENDER_BUSY_FREE			XE_REG(0xdbdc)
>> +
>>  #define SAMPLER_MODE				XE_REG_MCR(0xe18c, XE_REG_OPTION_MASKED)
>>  #define   ENABLE_SMALLPL			REG_BIT(15)
>>  #define   SC_DISABLE_POWER_OPTIMIZATION_EBB	REG_BIT(9)
>> diff --git a/drivers/gpu/drm/xe/xe_device.c b/drivers/gpu/drm/xe/xe_device.c
>> index c7985af85a53..b2c7bd4a97d9 100644
>> --- a/drivers/gpu/drm/xe/xe_device.c
>> +++ b/drivers/gpu/drm/xe/xe_device.c
>> @@ -328,6 +328,8 @@ int xe_device_probe(struct xe_device *xe)
>>
>> 	xe_debugfs_register(xe);
>>
>> +	xe_pmu_register(&xe->pmu);
>> +
>> 	err = drmm_add_action_or_reset(&xe->drm, xe_device_sanitize, xe);
>> 	if (err)
>> 		return err;
>> diff --git a/drivers/gpu/drm/xe/xe_device_types.h b/drivers/gpu/drm/xe/xe_device_types.h
>> index 0226d44a6af2..3ba99aae92b9 100644
>> --- a/drivers/gpu/drm/xe/xe_device_types.h
>> +++ b/drivers/gpu/drm/xe/xe_device_types.h
>> @@ -15,6 +15,7 @@
>>  #include "xe_devcoredump_types.h"
>>  #include "xe_gt_types.h"
>>  #include "xe_platform_types.h"
>> +#include "xe_pmu.h"
>>  #include "xe_step_types.h"
>>
>>  #if IS_ENABLED(CONFIG_DRM_XE_DISPLAY)
>> @@ -332,6 +333,9 @@ struct xe_device {
>> 	/** @d3cold_allowed: Indicates if d3cold is a valid device state */
>> 	bool d3cold_allowed;
>>
>> +	/* @pmu: performance monitoring unit */
>> +	struct xe_pmu pmu;
>> +
> 
> Now I am wondering why we don't make the PMU per-gt (or per-tile)? Per-gt
> would work for these busyness registers and I am not sure how the
> interrupts are hooked up.
> 
> In i915 the PMU being device level helped in having a single timer (rather
> than a per gt timer).
> 
> Anyway, probably not much practical benefit in making it per-gt/per-tile,
> so maybe leave as is. Just thinking out loud.

We are able to expose per-gt counters, so I do not see any benefit in
making the PMU per-gt. Also, I believe struct pmu is per device; it has
an associated type which is unique per device.

> 
>> 	/* private: */
>>
>>  #if IS_ENABLED(CONFIG_DRM_XE_DISPLAY)
>> diff --git a/drivers/gpu/drm/xe/xe_gt.c b/drivers/gpu/drm/xe/xe_gt.c
>> index 2458397ce8af..96e3720923d4 100644
>> --- a/drivers/gpu/drm/xe/xe_gt.c
>> +++ b/drivers/gpu/drm/xe/xe_gt.c
>> @@ -593,6 +593,8 @@ int xe_gt_suspend(struct xe_gt *gt)
>> 	if (err)
>> 		goto err_msg;
>>
>> +	engine_group_busyness_store(gt);
> 
> Not sure I follow the reason for doing this at suspend time? If the PMU
> was active there should be a previous value. Why must we take a new
> sample explicitly here?

The PMU interface can be read even when the device is suspended, and in
such cases we should not wake up the device. Reading the registers while
the device is suspended gives spurious counter values; you can see in
version #1 of this series that we were getting huge values, as we put the
device to suspend immediately after probe. So we store the last known
good value before suspend.

> 
> To me looks like engine_group_busyness_store should not be needed, see
> comment below for init_samples too.
> 
>> +
>> 	err = xe_uc_suspend(&gt->uc);
>> 	if (err)
>> 		goto err_force_wake;
>> diff --git a/drivers/gpu/drm/xe/xe_irq.c b/drivers/gpu/drm/xe/xe_irq.c
>> index b4ed1e4a3388..cb943fb94ec7 100644
>> --- a/drivers/gpu/drm/xe/xe_irq.c
>> +++ b/drivers/gpu/drm/xe/xe_irq.c
>> @@ -27,6 +27,24 @@
>>  #define IIR(offset)				XE_REG(offset + 0x8)
>>  #define IER(offset)				XE_REG(offset + 0xc)
>>
>> +/*
>> + * Interrupt statistic for PMU. Increments the counter only if the
>> + * interrupt originated from the GPU so interrupts from a device which
>> + * shares the interrupt line are not accounted.
>> + */
>> +static inline void xe_pmu_irq_stats(struct xe_device *xe,
> 
> No inline, the compiler will do it, though it looks like we may want to
> always_inline this. Also, should this function really be in xe_pmu.h?
> Anyway, probably leave as is.
> 
>> +				    irqreturn_t res)
>> +{
>> +	if (unlikely(res != IRQ_HANDLED))
>> +		return;
>> +
>> +	/*
>> +	 * A clever compiler translates that into INC. A not so clever one
>> +	 * should at least prevent store tearing.
>> +	 */
>> +	WRITE_ONCE(xe->pmu.irq_count, xe->pmu.irq_count + 1);
>> +}
>> +
>>  static void assert_iir_is_zero(struct xe_gt *mmio, struct xe_reg reg)
>>  {
>> 	u32 val = xe_mmio_read32(mmio, reg);
>> @@ -325,6 +343,8 @@ static irqreturn_t xelp_irq_handler(int irq, void *arg)
>>
>> 	xe_display_irq_enable(xe, gu_misc_iir);
>>
>> +	xe_pmu_irq_stats(xe, IRQ_HANDLED);
>> +
>> 	return IRQ_HANDLED;
>>  }
>>
>> @@ -414,6 +434,8 @@ static irqreturn_t dg1_irq_handler(int irq, void *arg)
>> 	dg1_intr_enable(xe, false);
>> 	xe_display_irq_enable(xe, gu_misc_iir);
>>
>> +	xe_pmu_irq_stats(xe, IRQ_HANDLED);
>> +
>> 	return IRQ_HANDLED;
>>  }
>>
>> diff --git a/drivers/gpu/drm/xe/xe_module.c b/drivers/gpu/drm/xe/xe_module.c
>> index 75e5be939f53..f6fe89748525 100644
>> --- a/drivers/gpu/drm/xe/xe_module.c
>> +++ b/drivers/gpu/drm/xe/xe_module.c
>> @@ -12,6 +12,7 @@
>>  #include "xe_hw_fence.h"
>>  #include "xe_module.h"
>>  #include "xe_pci.h"
>> +#include "xe_pmu.h"
>>  #include "xe_sched_job.h"
>>
>>  bool enable_guc = true;
>> @@ -49,6 +50,10 @@ static const struct init_funcs init_funcs[] = {
>> 		.init = xe_sched_job_module_init,
>> 		.exit = xe_sched_job_module_exit,
>> 	},
>> +	{
>> +		.init = xe_pmu_init,
>> +		.exit = xe_pmu_exit,
>> +	},
>> 	{
>> 		.init = xe_register_pci_driver,
>> 		.exit = xe_unregister_pci_driver,
>> diff --git a/drivers/gpu/drm/xe/xe_pmu.c b/drivers/gpu/drm/xe/xe_pmu.c
>> new file mode 100644
>> index 000000000000..bef1895be9f7
>> --- /dev/null
>> +++ b/drivers/gpu/drm/xe/xe_pmu.c
>> @@ -0,0 +1,739 @@
>> +/*
>> + * SPDX-License-Identifier: MIT
>> + *
>> + * Copyright © 2023 Intel Corporation
>> + */
> 
> This SPDX header is for .h files not .c files. Actually, it is for neither :/

But I see this in almost all the files in xe.
> 
>> +
>> +#include <drm/drm_drv.h>
>> +#include <drm/drm_managed.h>
>> +#include <drm/xe_drm.h>
>> +
>> +#include "regs/xe_gt_regs.h"
>> +#include "xe_device.h"
>> +#include "xe_gt_clock.h"
>> +#include "xe_mmio.h"
>> +
>> +static cpumask_t xe_pmu_cpumask;
>> +static unsigned int xe_pmu_target_cpu = -1;
>> +
>> +static unsigned int config_gt_id(const u64 config)
>> +{
>> +	return config >> __XE_PMU_GT_SHIFT;
>> +}
>> +
>> +static u64 config_counter(const u64 config)
>> +{
>> +	return config & ~(~0ULL << __XE_PMU_GT_SHIFT);
>> +}
>> +
>> +static unsigned int
>> +__sample_idx(struct xe_pmu *pmu, unsigned int gt_id, int sample)
>> +{
>> +	unsigned int idx = gt_id * __XE_NUM_PMU_SAMPLERS + sample;
>> +
>> +	XE_BUG_ON(idx >= ARRAY_SIZE(pmu->sample));
>> +
>> +	return idx;
>> +}
> 
> The compiler does this for us if we make the sample array 2-d.
> 
>> +
>> +static u64 read_sample(struct xe_pmu *pmu, unsigned int gt_id, int sample)
>> +{
>> +	return pmu->sample[__sample_idx(pmu, gt_id, sample)].cur;
>> +}
>> +
>> +static void
>> +store_sample(struct xe_pmu *pmu, unsigned int gt_id, int sample, u64 val)
>> +{
>> +	pmu->sample[__sample_idx(pmu, gt_id, sample)].cur = val;
>> +}
> 
> The three functions above are not needed if we make the sample array
> 2-d. See here:
> 
> https://patchwork.freedesktop.org/patch/538887/?series=118225&rev=1
> 
> Only a part of the patch above was merged (see 8ed0753b527dc) to keep the
> patch size small, but since for xe we are starting from scratch we can
> implement the whole approach above (get rid of the read/store helpers
> entirely, direct assignment without the helpers works).

OK, I'll go over it.

> 
>> +
>> +static int engine_busyness_sample_type(u64 config)
>> +{
>> +	int type = 0;
>> +
>> +	switch (config) {
>> +	case XE_PMU_RENDER_GROUP_BUSY(0):
>> +		type =  __XE_SAMPLE_RENDER_GROUP_BUSY;
>> +		break;
>> +	case XE_PMU_COPY_GROUP_BUSY(0):
>> +		type = __XE_SAMPLE_COPY_GROUP_BUSY;
>> +		break;
>> +	case XE_PMU_MEDIA_GROUP_BUSY(0):
>> +		type = __XE_SAMPLE_MEDIA_GROUP_BUSY;
>> +		break;
>> +	case XE_PMU_ANY_ENGINE_GROUP_BUSY(0):
>> +		type = __XE_SAMPLE_ANY_ENGINE_GROUP_BUSY;
>> +		break;
>> +	}
>> +
>> +	return type;
>> +}
>> +
>> +static void xe_pmu_event_destroy(struct perf_event *event)
>> +{
>> +	struct xe_device *xe =
>> +		container_of(event->pmu, typeof(*xe), pmu.base);
>> +
>> +	drm_WARN_ON(&xe->drm, event->parent);
>> +
>> +	drm_dev_put(&xe->drm);
>> +}
>> +
>> +static u64 __engine_group_busyness_read(struct xe_gt *gt, u64 config)
>> +{
>> +	u64 val = 0;
>> +
>> +	switch (config) {
>> +	case XE_PMU_RENDER_GROUP_BUSY(0):
>> +		val = xe_mmio_read32(gt, XE_OAG_RENDER_BUSY_FREE);
>> +		break;
>> +	case XE_PMU_COPY_GROUP_BUSY(0):
>> +		val = xe_mmio_read32(gt, XE_OAG_BLT_BUSY_FREE);
>> +		break;
>> +	case XE_PMU_MEDIA_GROUP_BUSY(0):
>> +		val = xe_mmio_read32(gt, XE_OAG_ANY_MEDIA_FF_BUSY_FREE);
>> +		break;
>> +	case XE_PMU_ANY_ENGINE_GROUP_BUSY(0):
>> +		val = xe_mmio_read32(gt, XE_OAG_RC0_ANY_ENGINE_BUSY_FREE);
>> +		break;
>> +	default:
>> +		drm_warn(&gt->tile->xe->drm, "unknown pmu event\n");
>> +	}
> 
> We need xe_device_mem_access_get, also xe_force_wake_get if applicable,
> somewhere before reading these registers.
> 
>> +
>> +	return xe_gt_clock_interval_to_ns(gt, val * 16);
>> +}
>> +
>> +static u64 engine_group_busyness_read(struct xe_gt *gt, u64 config)
>> +{
>> +	int sample_type = engine_busyness_sample_type(config);
>> +	struct xe_device *xe = gt->tile->xe;
>> +	const unsigned int gt_id = gt->info.id;
>> +	struct xe_pmu *pmu = &xe->pmu;
>> +	bool device_awake;
>> +	unsigned long flags;
>> +	u64 val;
>> +
>> +	/*
>> +	 * found no better way to check if device is awake or not. Before
> 
> xe_device_mem_access_get_if_ongoing (hard to find name).

Thanks for sharing; it looks to have been added recently. If we use this,
we needn't use xe_device_mem_access_get.

> 
>> +	 * we suspend we set the submission_state.enabled to false.
>> +	 */
>> +	device_awake = gt->uc.guc.submission_state.enabled ? true : false;
>> +	if (device_awake)
>> +		val = __engine_group_busyness_read(gt, config);
>> +
>> +	spin_lock_irqsave(&pmu->lock, flags);
>> +
>> +	if (device_awake)
>> +		store_sample(pmu, gt_id, sample_type, val);
>> +	else
>> +		val = read_sample(pmu, gt_id, sample_type);
>> +
>> +	spin_unlock_irqrestore(&pmu->lock, flags);
>> +
>> +	return val;
>> +}
>> +
>> +void engine_group_busyness_store(struct xe_gt *gt)
>> +{
>> +	struct xe_pmu *pmu = &gt->tile->xe->pmu;
>> +	unsigned int gt_id = gt->info.id;
>> +	unsigned long flags;
>> +
>> +	spin_lock_irqsave(&pmu->lock, flags);
>> +
>> +	store_sample(pmu, gt_id, __XE_SAMPLE_RENDER_GROUP_BUSY,
>> +		     __engine_group_busyness_read(gt, XE_PMU_RENDER_GROUP_BUSY(0)));
>> +	store_sample(pmu, gt_id, __XE_SAMPLE_COPY_GROUP_BUSY,
>> +		     __engine_group_busyness_read(gt, XE_PMU_COPY_GROUP_BUSY(0)));
>> +	store_sample(pmu, gt_id, __XE_SAMPLE_MEDIA_GROUP_BUSY,
>> +		     __engine_group_busyness_read(gt, XE_PMU_MEDIA_GROUP_BUSY(0)));
>> +	store_sample(pmu, gt_id, __XE_SAMPLE_ANY_ENGINE_GROUP_BUSY,
>> +		     __engine_group_busyness_read(gt, XE_PMU_ANY_ENGINE_GROUP_BUSY(0)));
>> +
>> +	spin_unlock_irqrestore(&pmu->lock, flags);
>> +}
>> +
>> +static int
>> +config_status(struct xe_device *xe, u64 config)
>> +{
>> +	unsigned int max_gt_id = xe->info.gt_count > 1 ? 1 : 0;
>> +	unsigned int gt_id = config_gt_id(config);
>> +	struct xe_gt *gt = xe_device_get_gt(xe, gt_id);
>> +
>> +	if (gt_id > max_gt_id)
>> +		return -ENOENT;
>> +
>> +	switch (config_counter(config)) {
>> +	case XE_PMU_INTERRUPTS(0):
>> +		if (gt_id)
>> +			return -ENOENT;
> 
> OK: this is a global event so we say this is enabled only for gt0.
> 
>> +		break;
>> +	case XE_PMU_RENDER_GROUP_BUSY(0):
>> +	case XE_PMU_COPY_GROUP_BUSY(0):
>> +	case XE_PMU_ANY_ENGINE_GROUP_BUSY(0):
>> +		if (GRAPHICS_VER(xe) < 12)
> 
> Any point checking? xe only supports Gen12+.

Hmm, yeah, good point; I will remove this.
> 
>> +			return -ENOENT;
>> +		break;
>> +	case XE_PMU_MEDIA_GROUP_BUSY(0):
>> +		if (MEDIA_VER(xe) >= 13 && gt->info.type != XE_GT_TYPE_MEDIA)
>> +			return -ENOENT;
> 
> OK: this makes sense, so we will expose this event only for media gt's.
> 
>> +		break;
>> +	default:
>> +		return -ENOENT;
>> +	}
>> +
>> +	return 0;
>> +}
>> +
>> +static int xe_pmu_event_init(struct perf_event *event)
>> +{
>> +	struct xe_device *xe =
>> +		container_of(event->pmu, typeof(*xe), pmu.base);
>> +	struct xe_pmu *pmu = &xe->pmu;
>> +	int ret;
>> +
>> +	if (pmu->closed)
>> +		return -ENODEV;
>> +
>> +	if (event->attr.type != event->pmu->type)
>> +		return -ENOENT;
>> +
>> +	/* unsupported modes and filters */
>> +	if (event->attr.sample_period) /* no sampling */
>> +		return -EINVAL;
>> +
>> +	if (has_branch_stack(event))
>> +		return -EOPNOTSUPP;
>> +
>> +	if (event->cpu < 0)
>> +		return -EINVAL;
>> +
>> +	/* only allow running on one cpu at a time */
>> +	if (!cpumask_test_cpu(event->cpu, &xe_pmu_cpumask))
>> +		return -EINVAL;
>> +
>> +	ret = config_status(xe, event->attr.config);
>> +	if (ret)
>> +		return ret;
>> +
>> +	if (!event->parent) {
>> +		drm_dev_get(&xe->drm);
>> +		event->destroy = xe_pmu_event_destroy;
>> +	}
>> +
>> +	return 0;
>> +}
>> +
>> +static u64 __xe_pmu_event_read(struct perf_event *event)
>> +{
>> +	struct xe_device *xe =
>> +		container_of(event->pmu, typeof(*xe), pmu.base);
>> +	const unsigned int gt_id = config_gt_id(event->attr.config);
>> +	const u64 config = config_counter(event->attr.config);
>> +	struct xe_gt *gt = xe_device_get_gt(xe, gt_id);
>> +	struct xe_pmu *pmu = &xe->pmu;
>> +	u64 val = 0;
>> +
>> +	switch (config) {
>> +	case XE_PMU_INTERRUPTS(0):
>> +		val = READ_ONCE(pmu->irq_count);
> 
> OK: The correct way to do this READ_ONCE/WRITE_ONCE irq_count stuff would
> be to take pmu->lock (both while reading and writing irq_count) but that
> would be expensive in the interrupt handler (as the .h hints). So all we
> can do is what is done here.
> 
>> +		break;
>> +	case XE_PMU_RENDER_GROUP_BUSY(0):
>> +	case XE_PMU_COPY_GROUP_BUSY(0):
>> +	case XE_PMU_ANY_ENGINE_GROUP_BUSY(0):
>> +	case XE_PMU_MEDIA_GROUP_BUSY(0):
>> +		val = engine_group_busyness_read(gt, config);
>> +	}
>> +
>> +	return val;
>> +}
>> +
>> +static void xe_pmu_event_read(struct perf_event *event)
>> +{
>> +	struct xe_device *xe =
>> +		container_of(event->pmu, typeof(*xe), pmu.base);
>> +	struct hw_perf_event *hwc = &event->hw;
>> +	struct xe_pmu *pmu = &xe->pmu;
>> +	u64 prev, new;
>> +
>> +	if (pmu->closed) {
>> +		event->hw.state = PERF_HES_STOPPED;
>> +		return;
>> +	}
>> +again:
>> +	prev = local64_read(&hwc->prev_count);
>> +	new = __xe_pmu_event_read(event);
>> +
>> +	if (local64_cmpxchg(&hwc->prev_count, prev, new) != prev)
>> +		goto again;
>> +
>> +	local64_add(new - prev, &event->count);
>> +}
>> +
>> +static void xe_pmu_enable(struct perf_event *event)
>> +{
>> +	/*
>> +	 * Store the current counter value so we can report the correct delta
>> +	 * for all listeners. Even when the event was already enabled and has
>> +	 * an existing non-zero value.
>> +	 */
>> +	local64_set(&event->hw.prev_count, __xe_pmu_event_read(event));
>> +}
>> +
>> +static void xe_pmu_event_start(struct perf_event *event, int flags)
>> +{
>> +	struct xe_device *xe =
>> +		container_of(event->pmu, typeof(*xe), pmu.base);
>> +	struct xe_pmu *pmu = &xe->pmu;
>> +
>> +	if (pmu->closed)
>> +		return;
>> +
>> +	xe_pmu_enable(event);
>> +	event->hw.state = 0;
>> +}
>> +
>> +static void xe_pmu_event_stop(struct perf_event *event, int flags)
>> +{
>> +	if (flags & PERF_EF_UPDATE)
>> +		xe_pmu_event_read(event);
>> +
>> +	event->hw.state = PERF_HES_STOPPED;
>> +}
>> +
>> +static int xe_pmu_event_add(struct perf_event *event, int flags)
>> +{
>> +	struct xe_device *xe =
>> +		container_of(event->pmu, typeof(*xe), pmu.base);
>> +	struct xe_pmu *pmu = &xe->pmu;
>> +
>> +	if (pmu->closed)
>> +		return -ENODEV;
>> +
>> +	if (flags & PERF_EF_START)
>> +		xe_pmu_event_start(event, flags);
>> +
>> +	return 0;
>> +}
>> +
>> +static void xe_pmu_event_del(struct perf_event *event, int flags)
>> +{
>> +	xe_pmu_event_stop(event, PERF_EF_UPDATE);
>> +}
>> +
>> +static int xe_pmu_event_event_idx(struct perf_event *event)
>> +{
>> +	return 0;
>> +}
>> +
>> +struct xe_str_attribute {
>> +	struct device_attribute attr;
>> +	const char *str;
>> +};
>> +
>> +static ssize_t xe_pmu_format_show(struct device *dev,
>> +				  struct device_attribute *attr, char *buf)
>> +{
>> +	struct xe_str_attribute *eattr;
>> +
>> +	eattr = container_of(attr, struct xe_str_attribute, attr);
>> +	return sprintf(buf, "%s\n", eattr->str);
>> +}
>> +
>> +#define XE_PMU_FORMAT_ATTR(_name, _config) \
>> +	(&((struct xe_str_attribute[]) { \
>> +		{ .attr = __ATTR(_name, 0444, xe_pmu_format_show, NULL), \
>> +		  .str = _config, } \
>> +	})[0].attr.attr)
>> +
>> +static struct attribute *xe_pmu_format_attrs[] = {
>> +	XE_PMU_FORMAT_ATTR(xe_eventid, "config:0-20"),
> 
> 0-20 means bits 0-20? Though here we probably have a different number of
> config bits? Probably doesn't matter?

As I understand it, this is not used anymore, so I will remove it.

> 
> The string will show up with:
> 
> cat /sys/devices/xe/format/xe_eventid
> 
>> +	NULL,
>> +};
>> +
>> +static const struct attribute_group xe_pmu_format_attr_group = {
>> +	.name = "format",
>> +	.attrs = xe_pmu_format_attrs,
>> +};
>> +
>> +struct xe_ext_attribute {
>> +	struct device_attribute attr;
>> +	unsigned long val;
>> +};
>> +
>> +static ssize_t xe_pmu_event_show(struct device *dev,
>> +				 struct device_attribute *attr, char *buf)
>> +{
>> +	struct xe_ext_attribute *eattr;
>> +
>> +	eattr = container_of(attr, struct xe_ext_attribute, attr);
>> +	return sprintf(buf, "config=0x%lx\n", eattr->val);
>> +}
>> +
>> +static ssize_t cpumask_show(struct device *dev,
>> +			    struct device_attribute *attr, char *buf)
>> +{
>> +	return cpumap_print_to_pagebuf(true, buf, &xe_pmu_cpumask);
>> +}
>> +
>> +static DEVICE_ATTR_RO(cpumask);
>> +
>> +static struct attribute *xe_cpumask_attrs[] = {
>> +	&dev_attr_cpumask.attr,
>> +	NULL,
>> +};
>> +
>> +static const struct attribute_group xe_pmu_cpumask_attr_group = {
>> +	.attrs = xe_cpumask_attrs,
>> +};
>> +
>> +#define __event(__counter, __name, __unit) \
>> +{ \
>> +	.counter = (__counter), \
>> +	.name = (__name), \
>> +	.unit = (__unit), \
>> +	.global = false, \
>> +}
>> +
>> +#define __global_event(__counter, __name, __unit) \
>> +{ \
>> +	.counter = (__counter), \
>> +	.name = (__name), \
>> +	.unit = (__unit), \
>> +	.global = true, \
>> +}
>> +
>> +static struct xe_ext_attribute *
>> +add_xe_attr(struct xe_ext_attribute *attr, const char *name, u64 config)
>> +{
>> +	sysfs_attr_init(&attr->attr.attr);
>> +	attr->attr.attr.name = name;
>> +	attr->attr.attr.mode = 0444;
>> +	attr->attr.show = xe_pmu_event_show;
>> +	attr->val = config;
>> +
>> +	return ++attr;
>> +}
>> +
>> +static struct perf_pmu_events_attr *
>> +add_pmu_attr(struct perf_pmu_events_attr *attr, const char *name,
>> +	     const char *str)
>> +{
>> +	sysfs_attr_init(&attr->attr.attr);
>> +	attr->attr.attr.name = name;
>> +	attr->attr.attr.mode = 0444;
>> +	attr->attr.show = perf_event_sysfs_show;
>> +	attr->event_str = str;
>> +
>> +	return ++attr;
>> +}
>> +
>> +static struct attribute **
>> +create_event_attributes(struct xe_pmu *pmu)
>> +{
>> +	struct xe_device *xe = container_of(pmu, typeof(*xe), pmu);
>> +	static const struct {
>> +		unsigned int counter;
>> +		const char *name;
>> +		const char *unit;
>> +		bool global;
>> +	} events[] = {
>> +		__global_event(0, "interrupts", NULL),
>> +		__event(1, "render-group-busy", "ns"),
>> +		__event(2, "copy-group-busy", "ns"),
>> +		__event(3, "media-group-busy", "ns"),
>> +		__event(4, "any-engine-group-busy", "ns"),
>> +	};
> 
> OK: this function is some black magic to expose stuff through
> PMU. Identical to i915 and seems to be working from the commit message so
> should be fine.
> 
>> +
>> +	unsigned int count = 0;
>> +	struct perf_pmu_events_attr *pmu_attr = NULL, *pmu_iter;
>> +	struct xe_ext_attribute *xe_attr = NULL, *xe_iter;
>> +	struct attribute **attr = NULL, **attr_iter;
>> +	struct xe_gt *gt;
>> +	unsigned int i, j;
>> +
>> +	/* Count how many counters we will be exposing. */
>> +	for_each_gt(gt, xe, j) {
>> +		for (i = 0; i < ARRAY_SIZE(events); i++) {
>> +			u64 config = ___XE_PMU_OTHER(j, events[i].counter);
>> +
>> +			if (!config_status(xe, config))
>> +				count++;
>> +		}
>> +	}
>> +
>> +	/* Allocate attribute objects and table. */
>> +	xe_attr = kcalloc(count, sizeof(*xe_attr), GFP_KERNEL);
>> +	if (!xe_attr)
>> +		goto err_alloc;
>> +
>> +	pmu_attr = kcalloc(count, sizeof(*pmu_attr), GFP_KERNEL);
>> +	if (!pmu_attr)
>> +		goto err_alloc;
>> +
>> +	/* Max one pointer of each attribute type plus a termination entry. */
>> +	attr = kcalloc(count * 2 + 1, sizeof(*attr), GFP_KERNEL);
>> +	if (!attr)
>> +		goto err_alloc;
>> +
>> +	xe_iter = xe_attr;
>> +	pmu_iter = pmu_attr;
>> +	attr_iter = attr;
>> +
>> +	for_each_gt(gt, xe, j) {
>> +		for (i = 0; i < ARRAY_SIZE(events); i++) {
>> +			u64 config = ___XE_PMU_OTHER(j, events[i].counter);
>> +			char *str;
>> +
>> +			if (config_status(xe, config))
>> +				continue;
>> +
>> +			if (events[i].global)
>> +				str = kstrdup(events[i].name, GFP_KERNEL);
>> +			else
>> +				str = kasprintf(GFP_KERNEL, "%s-gt%u",
>> +						events[i].name, j);
>> +			if (!str)
>> +				goto err;
>> +
>> +			*attr_iter++ = &xe_iter->attr.attr;
>> +			xe_iter = add_xe_attr(xe_iter, str, config);
>> +
>> +			if (events[i].unit) {
>> +				if (events[i].global)
>> +					str = kasprintf(GFP_KERNEL, "%s.unit",
>> +							events[i].name);
>> +				else
>> +					str = kasprintf(GFP_KERNEL, "%s-gt%u.unit",
>> +							events[i].name, j);
>> +				if (!str)
>> +					goto err;
>> +
>> +				*attr_iter++ = &pmu_iter->attr.attr;
>> +				pmu_iter = add_pmu_attr(pmu_iter, str,
>> +							events[i].unit);
>> +			}
>> +		}
>> +	}
>> +
>> +	pmu->xe_attr = xe_attr;
>> +	pmu->pmu_attr = pmu_attr;
>> +
>> +	return attr;
>> +
>> +err:
>> +	for (attr_iter = attr; *attr_iter; attr_iter++)
>> +		kfree((*attr_iter)->name);
>> +
>> +err_alloc:
>> +	kfree(attr);
>> +	kfree(xe_attr);
>> +	kfree(pmu_attr);
>> +
>> +	return NULL;
>> +}
>> +
>> +static void free_event_attributes(struct xe_pmu *pmu)
>> +{
>> +	struct attribute **attr_iter = pmu->events_attr_group.attrs;
>> +
>> +	for (; *attr_iter; attr_iter++)
>> +		kfree((*attr_iter)->name);
>> +
>> +	kfree(pmu->events_attr_group.attrs);
>> +	kfree(pmu->xe_attr);
>> +	kfree(pmu->pmu_attr);
>> +
>> +	pmu->events_attr_group.attrs = NULL;
>> +	pmu->xe_attr = NULL;
>> +	pmu->pmu_attr = NULL;
>> +}
>> +
>> +static int xe_pmu_cpu_online(unsigned int cpu, struct hlist_node *node)
>> +{
>> +	struct xe_pmu *pmu = hlist_entry_safe(node, typeof(*pmu), cpuhp.node);
>> +
>> +	XE_BUG_ON(!pmu->base.event_init);
>> +
>> +	/* Select the first online CPU as a designated reader. */
>> +	if (cpumask_empty(&xe_pmu_cpumask))
>> +		cpumask_set_cpu(cpu, &xe_pmu_cpumask);
>> +
>> +	return 0;
>> +}
>> +
>> +static int xe_pmu_cpu_offline(unsigned int cpu, struct hlist_node *node)
>> +{
>> +	struct xe_pmu *pmu = hlist_entry_safe(node, typeof(*pmu), cpuhp.node);
>> +	unsigned int target = xe_pmu_target_cpu;
>> +
>> +	XE_BUG_ON(!pmu->base.event_init);
>> +
>> +	/*
>> +	 * Unregistering an instance generates a CPU offline event which we must
>> +	 * ignore to avoid incorrectly modifying the shared xe_pmu_cpumask.
>> +	 */
>> +	if (pmu->closed)
>> +		return 0;
>> +
>> +	if (cpumask_test_and_clear_cpu(cpu, &xe_pmu_cpumask)) {
>> +		target = cpumask_any_but(topology_sibling_cpumask(cpu), cpu);
>> +
>> +		/* Migrate events if there is a valid target */
>> +		if (target < nr_cpu_ids) {
>> +			cpumask_set_cpu(target, &xe_pmu_cpumask);
>> +			xe_pmu_target_cpu = target;
>> +		}
>> +	}
>> +
>> +	if (target < nr_cpu_ids && target != pmu->cpuhp.cpu) {
>> +		perf_pmu_migrate_context(&pmu->base, cpu, target);
>> +		pmu->cpuhp.cpu = target;
>> +	}
>> +
>> +	return 0;
>> +}
>> +
>> +static enum cpuhp_state cpuhp_slot = CPUHP_INVALID;
>> +
>> +int xe_pmu_init(void)
>> +{
>> +	int ret;
>> +
>> +	ret = cpuhp_setup_state_multi(CPUHP_AP_ONLINE_DYN,
>> +				      "perf/x86/intel/xe:online",
>> +				      xe_pmu_cpu_online,
>> +				      xe_pmu_cpu_offline);
>> +	if (ret < 0)
>> +		pr_notice("Failed to setup cpuhp state for xe PMU! (%d)\n",
>> +			  ret);
>> +	else
>> +		cpuhp_slot = ret;
>> +
>> +	return 0;
>> +}
>> +
>> +void xe_pmu_exit(void)
>> +{
>> +	if (cpuhp_slot != CPUHP_INVALID)
>> +		cpuhp_remove_multi_state(cpuhp_slot);
>> +}
>> +
>> +static int xe_pmu_register_cpuhp_state(struct xe_pmu *pmu)
>> +{
>> +	if (cpuhp_slot == CPUHP_INVALID)
>> +		return -EINVAL;
>> +
>> +	return cpuhp_state_add_instance(cpuhp_slot, &pmu->cpuhp.node);
>> +}
>> +
>> +static void xe_pmu_unregister_cpuhp_state(struct xe_pmu *pmu)
>> +{
>> +	cpuhp_state_remove_instance(cpuhp_slot, &pmu->cpuhp.node);
>> +}
>> +
>> +static void xe_pmu_unregister(struct drm_device *device, void *arg)
>> +{
>> +	struct xe_pmu *pmu = arg;
>> +
>> +	if (!pmu->base.event_init)
>> +		return;
>> +
>> +	/*
>> +	 * "Disconnect" the PMU callbacks - since all are atomic synchronize_rcu
>> +	 * ensures all currently executing ones will have exited before we
>> +	 * proceed with unregistration.
>> +	 */
>> +	pmu->closed = true;
>> +	synchronize_rcu();
>> +
>> +	xe_pmu_unregister_cpuhp_state(pmu);
>> +
>> +	perf_pmu_unregister(&pmu->base);
>> +	pmu->base.event_init = NULL;
>> +	kfree(pmu->base.attr_groups);
>> +	kfree(pmu->name);
>> +	free_event_attributes(pmu);
>> +}
>> +
>> +static void init_samples(struct xe_pmu *pmu)
>> +{
>> +	struct xe_device *xe = container_of(pmu, typeof(*xe), pmu);
>> +	struct xe_gt *gt;
>> +	unsigned int i;
>> +
>> +	for_each_gt(gt, xe, i)
>> +		engine_group_busyness_store(gt);
>> +}
>> +
>> +void xe_pmu_register(struct xe_pmu *pmu)
> 
> Why void, why not int? Is PMU failure a non-fatal error?

Ya, the device is functional; it is only that these counters are not
available. Hence I didn't want to fail the driver load.
> 
>> +{
>> +	struct xe_device *xe = container_of(pmu, typeof(*xe), pmu);
>> +	const struct attribute_group *attr_groups[] = {
>> +		&xe_pmu_format_attr_group,
>> +		&pmu->events_attr_group,
>> +		&xe_pmu_cpumask_attr_group,
> 
> Can someone please explain what this cpumask/cpuhotplug stuff does and
> whether it needs to be in this patch? There's something here:

Comments from the original patch series in i915:
https://patchwork.kernel.org/project/intel-gfx/patch/20170802123249.14194-5-tvrtko.ursulin@linux.intel.com/

"IIRC an uncore PMU should expose a cpumask through sysfs, and then perf
tools will read that mask and auto-magically limit the number of CPUs it
instantiates the counter on."

And ours are global counters, not per-CPU, so we limit them to just a
single CPU.

And as I understand it, we use the cpuhotplug support to migrate to a
new CPU in case the earlier one goes offline.

> 
> b46a33e271ed ("drm/i915/pmu: Expose a PMU interface for perf queries")
> 
> I'd rather just have the basic PMU infra and the events in this patch and
> punt this cpumask/cpuhotplug stuff to a later patch, unless someone can say
> what it does.
> 
> Though perf_pmu_register seems to be doing some per cpu stuff so likely
> this is needed. But amdgpu_pmu only has event and format attributes.
> 
> Mostly leave as is I guess.
> 
>> +		NULL
>> +	};
>> +
>> +	int ret = -ENOMEM;
>> +
>> +	spin_lock_init(&pmu->lock);
>> +	pmu->cpuhp.cpu = -1;
>> +	init_samples(pmu);
> 
> Why init_samples here? Can't we init the particular sample in
> xe_pmu_event_init or even xe_pmu_event_start?
> 
> Init'ing here may be too soon since the event might not be enabled for a
> long time. So really this needs to move to xe_pmu_event_init or
> xe_pmu_event_start.

The device is put to suspend immediately after driver probe, and
typically the PMU is opened even before any workload is run, so
essentially the device is still in the suspended state and we cannot
access the registers; hence we store the last known value in
init_samples. Otherwise we see the bug in v#1 of the series.

> 
> Actually this is already happening in xe_pmu_enable. We just need to decide
> when we want to wake the device up and when we don't. So maybe wake the
> device up at start (use xe_device_mem_access_get) and not wake up during
> read (xe_device_mem_access_get_if_ongoing etc.)?

> 
>> +
>> +	pmu->name = kasprintf(GFP_KERNEL,
>> +			      "xe_%s",
>> +			      dev_name(xe->drm.dev));
>> +	if (pmu->name)
>> +		/* tools/perf reserves colons as special. */
>> +		strreplace((char *)pmu->name, ':', '_');
>> +
>> +	if (!pmu->name)
>> +		goto err;
>> +
>> +	pmu->events_attr_group.name = "events";
>> +	pmu->events_attr_group.attrs = create_event_attributes(pmu);
>> +	if (!pmu->events_attr_group.attrs)
>> +		goto err_name;
>> +
>> +	pmu->base.attr_groups = kmemdup(attr_groups, sizeof(attr_groups),
>> +					GFP_KERNEL);
>> +	if (!pmu->base.attr_groups)
>> +		goto err_attr;
>> +
>> +	pmu->base.module	= THIS_MODULE;
>> +	pmu->base.task_ctx_nr	= perf_invalid_context;
>> +	pmu->base.event_init	= xe_pmu_event_init;
>> +	pmu->base.add		= xe_pmu_event_add;
>> +	pmu->base.del		= xe_pmu_event_del;
>> +	pmu->base.start		= xe_pmu_event_start;
>> +	pmu->base.stop		= xe_pmu_event_stop;
>> +	pmu->base.read		= xe_pmu_event_read;
>> +	pmu->base.event_idx	= xe_pmu_event_event_idx;
>> +
>> +	ret = perf_pmu_register(&pmu->base, pmu->name, -1);
>> +	if (ret)
>> +		goto err_groups;
>> +
>> +	ret = xe_pmu_register_cpuhp_state(pmu);
>> +	if (ret)
>> +		goto err_unreg;
>> +
>> +	ret = drmm_add_action_or_reset(&xe->drm, xe_pmu_unregister, pmu);
>> +	XE_WARN_ON(ret);
> 
> We should just follow the regular error rewind here too and let
> drm_notice() at the end print the error. This is what other drivers calling
> drmm_add_action_or_reset seem to be doing.

Ok ok.
> 
>> +
>> +	return;
>> +
>> +err_unreg:
>> +	perf_pmu_unregister(&pmu->base);
>> +err_groups:
>> +	kfree(pmu->base.attr_groups);
>> +err_attr:
>> +	pmu->base.event_init = NULL;
>> +	free_event_attributes(pmu);
>> +err_name:
>> +	kfree(pmu->name);
>> +err:
>> +	drm_notice(&xe->drm, "Failed to register PMU!\n");
>> +}
>> diff --git a/drivers/gpu/drm/xe/xe_pmu.h b/drivers/gpu/drm/xe/xe_pmu.h
>> new file mode 100644
>> index 000000000000..d3f47f4ab343
>> --- /dev/null
>> +++ b/drivers/gpu/drm/xe/xe_pmu.h
>> @@ -0,0 +1,25 @@
>> +/* SPDX-License-Identifier: MIT */
>> +/*
>> + * Copyright © 2023 Intel Corporation
>> + */
>> +
>> +#ifndef _XE_PMU_H_
>> +#define _XE_PMU_H_
>> +
>> +#include "xe_gt_types.h"
>> +#include "xe_pmu_types.h"
>> +
>> +#ifdef CONFIG_PERF_EVENTS
> 
> nit but maybe this should be:
> 
> #if IS_ENABLED(CONFIG_PERF_EVENTS)
> 
> or,
> 
> #if IS_BUILTIN(CONFIG_PERF_EVENTS)
> 
> Note CONFIG_PERF_EVENTS is a boolean kconfig option.
> 
> See similar macro IS_REACHABLE() in i915_hwmon.h.
> 
>> +int xe_pmu_init(void);
>> +void xe_pmu_exit(void);
>> +void xe_pmu_register(struct xe_pmu *pmu);
>> +void engine_group_busyness_store(struct xe_gt *gt);
> 
> Add xe_pmu_ prefix if function is needed (hopefully not).

OK
> 
>> +#else
>> +static inline int xe_pmu_init(void) { return 0; }
>> +static inline void xe_pmu_exit(void) {}
>> +static inline void xe_pmu_register(struct xe_pmu *pmu) {}
>> +static inline void engine_group_busyness_store(struct xe_gt *gt) {}
>> +#endif
>> +
>> +#endif
>> +
>> diff --git a/drivers/gpu/drm/xe/xe_pmu_types.h b/drivers/gpu/drm/xe/xe_pmu_types.h
>> new file mode 100644
>> index 000000000000..e87edd4d6a87
>> --- /dev/null
>> +++ b/drivers/gpu/drm/xe/xe_pmu_types.h
>> @@ -0,0 +1,80 @@
>> +/* SPDX-License-Identifier: MIT */
>> +/*
>> + * Copyright © 2023 Intel Corporation
>> + */
>> +
>> +#ifndef _XE_PMU_TYPES_H_
>> +#define _XE_PMU_TYPES_H_
>> +
>> +#include <linux/perf_event.h>
>> +#include <linux/spinlock_types.h>
>> +#include <uapi/drm/xe_drm.h>
>> +
>> +enum {
>> +	__XE_SAMPLE_RENDER_GROUP_BUSY,
>> +	__XE_SAMPLE_COPY_GROUP_BUSY,
>> +	__XE_SAMPLE_MEDIA_GROUP_BUSY,
>> +	__XE_SAMPLE_ANY_ENGINE_GROUP_BUSY,
>> +	__XE_NUM_PMU_SAMPLERS
> 
> OK: irq_count is missing here since these are read from device...
> 
>> +};
>> +
>> +struct xe_pmu_sample {
>> +	u64 cur;
>> +};
> 
> This was also discussed for i915 PMU, no point having a struct with a
> single u64 member. Might as well just use u64 wherever we are using
> struct xe_pmu_sample.

OK.
> 
>> +
>> +#define XE_MAX_GT_PER_TILE 2
> 
> Why per tile? The array size should be max_gt_per_device. Just call it
> XE_MAX_GT?

I declared it similar to what we have in drivers/gpu/drm/xe/xe_device.h.
> 
>> +
>> +struct xe_pmu {
>> +	/**
>> +	 * @cpuhp: Struct used for CPU hotplug handling.
>> +	 */
>> +	struct {
>> +		struct hlist_node node;
>> +		unsigned int cpu;
>> +	} cpuhp;
>> +	/**
>> +	 * @base: PMU base.
>> +	 */
>> +	struct pmu base;
>> +	/**
>> +	 * @closed: xe is unregistering.
>> +	 */
>> +	bool closed;
>> +	/**
>> +	 * @name: Name as registered with perf core.
>> +	 */
>> +	const char *name;
>> +	/**
>> +	 * @lock: Lock protecting enable mask and ref count handling.
>> +	 */
>> +	spinlock_t lock;
>> +	/**
>> +	 * @sample: Current and previous (raw) counters.
>> +	 *
>> +	 * These counters are updated when the device is awake.
>> +	 *
>> +	 */
>> +	struct xe_pmu_sample sample[XE_MAX_GT_PER_TILE * __XE_NUM_PMU_SAMPLERS];
> 
> Change to 2-d array. See above.
> 
>> +	/**
>> +	 * @irq_count: Number of interrupts
>> +	 *
>> +	 * Intentionally unsigned long to avoid atomics or heuristics on 32bit.
>> +	 * 4e9 interrupts are a lot and postprocessing can really deal with an
>> +	 * occasional wraparound easily. It's 32bit after all.
>> +	 */
>> +	unsigned long irq_count;
>> +	/**
>> +	 * @events_attr_group: Device events attribute group.
>> +	 */
>> +	struct attribute_group events_attr_group;
>> +	/**
>> +	 * @xe_attr: Memory block holding device attributes.
>> +	 */
>> +	void *xe_attr;
>> +	/**
>> +	 * @pmu_attr: Memory block holding device attributes.
>> +	 */
>> +	void *pmu_attr;
>> +};
>> +
>> +#endif
>> diff --git a/include/uapi/drm/xe_drm.h b/include/uapi/drm/xe_drm.h
>> index 965cd9527ff1..ed097056f944 100644
>> --- a/include/uapi/drm/xe_drm.h
>> +++ b/include/uapi/drm/xe_drm.h
>> @@ -990,6 +990,22 @@ struct drm_xe_vm_madvise {
>> 	__u64 reserved[2];
>>  };
>>
>> +/* PMU event config IDs */
>> +
>> +/*
>> + * Top 4 bits of every counter are GT id.
>> + */
>> +#define __XE_PMU_GT_SHIFT (60)
> 
> To future-proof this, and also because we seem to have plenty of bits
> available, I think we should change this to 56 (instead of 60).

OK

Thanks,
Aravind.
> 
>> +
>> +#define ___XE_PMU_OTHER(gt, x) \
>> +	(((__u64)(x)) | ((__u64)(gt) << __XE_PMU_GT_SHIFT))
>> +
>> +#define XE_PMU_INTERRUPTS(gt)			___XE_PMU_OTHER(gt, 0)
>> +#define XE_PMU_RENDER_GROUP_BUSY(gt)		___XE_PMU_OTHER(gt, 1)
>> +#define XE_PMU_COPY_GROUP_BUSY(gt)		___XE_PMU_OTHER(gt, 2)
>> +#define XE_PMU_MEDIA_GROUP_BUSY(gt)		___XE_PMU_OTHER(gt, 3)
>> +#define XE_PMU_ANY_ENGINE_GROUP_BUSY(gt)	___XE_PMU_OTHER(gt, 4)
>> +
>>  #if defined(__cplusplus)
>>  }
>>  #endif
>> --
>> 2.25.1
>>
> 
> Thanks.
> --
> Ashutosh

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [Intel-xe] [PATCH v2 2/2] drm/xe/pmu: Enable PMU interface
  2023-07-21 11:51     ` Iddamsetty, Aravind
@ 2023-07-21 23:36       ` Dixit, Ashutosh
  2023-07-22  6:04         ` Dixit, Ashutosh
  0 siblings, 1 reply; 59+ messages in thread
From: Dixit, Ashutosh @ 2023-07-21 23:36 UTC (permalink / raw)
  To: Iddamsetty, Aravind; +Cc: Bommu Krishnaiah, intel-xe, Tvrtko Ursulin

On Fri, 21 Jul 2023 04:51:09 -0700, Iddamsetty, Aravind wrote:
>
Hi Aravind,

> On 21-07-2023 06:32, Dixit, Ashutosh wrote:
> > On Tue, 27 Jun 2023 05:21:13 -0700, Aravind Iddamsetty wrote:
> >>
> > More stuff to mull over. You can ignore comments starting with "OK", those
> > are just notes to myself.
> >
> > Also, maybe some time we can add a basic IGT which reads these exposed
> > counters and verifies that we can read them and they are monotonically
> > increasing?
>
> This is the IGT series using these counters, posted by Venkat:
> https://patchwork.freedesktop.org/series/119936/
>
> >
> >> There are a set of engine group busyness counters provided by HW which are
> >> perfect fit to be exposed via PMU perf events.
> >>
> >> BSPEC: 46559, 46560, 46722, 46729
> >
> > Also add these Bspec entries: 71028, 52071
>
> OK.
>
> >
> >>
> >> events can be listed using:
> >> perf list
> >>   xe_0000_03_00.0/any-engine-group-busy-gt0/         [Kernel PMU event]
> >>   xe_0000_03_00.0/copy-group-busy-gt0/               [Kernel PMU event]
> >>   xe_0000_03_00.0/interrupts/                        [Kernel PMU event]
> >>   xe_0000_03_00.0/media-group-busy-gt0/              [Kernel PMU event]
> >>   xe_0000_03_00.0/render-group-busy-gt0/             [Kernel PMU event]
> >>
> >> and can be read using:
> >>
> >> perf stat -e "xe_0000_8c_00.0/render-group-busy-gt0/" -I 1000
> >>            time             counts unit events
> >>      1.001139062                  0 ns  xe_0000_8c_00.0/render-group-busy-gt0/
> >>      2.003294678                  0 ns  xe_0000_8c_00.0/render-group-busy-gt0/
> >>      3.005199582                  0 ns  xe_0000_8c_00.0/render-group-busy-gt0/
> >>      4.007076497                  0 ns  xe_0000_8c_00.0/render-group-busy-gt0/
> >>      5.008553068                  0 ns  xe_0000_8c_00.0/render-group-busy-gt0/
> >>      6.010531563              43520 ns  xe_0000_8c_00.0/render-group-busy-gt0/
> >>      7.012468029              44800 ns  xe_0000_8c_00.0/render-group-busy-gt0/
> >>      8.013463515                  0 ns  xe_0000_8c_00.0/render-group-busy-gt0/
> >>      9.015300183                  0 ns  xe_0000_8c_00.0/render-group-busy-gt0/
> >>     10.017233010                  0 ns  xe_0000_8c_00.0/render-group-busy-gt0/
> >>     10.971934120                  0 ns  xe_0000_8c_00.0/render-group-busy-gt0/
> >>
> >> The pmu base implementation is taken from i915.
> >>
> >> v2:
> >> Store last known value when device is awake return that while the GT is
> >> suspended and then update the driver copy when read during awake.
> >>
> >> Co-developed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> >> Co-developed-by: Bommu Krishnaiah <krishnaiah.bommu@intel.com>
> >> Signed-off-by: Aravind Iddamsetty <aravind.iddamsetty@intel.com>
> >> ---
> >>  drivers/gpu/drm/xe/Makefile          |   2 +
> >>  drivers/gpu/drm/xe/regs/xe_gt_regs.h |   5 +
> >>  drivers/gpu/drm/xe/xe_device.c       |   2 +
> >>  drivers/gpu/drm/xe/xe_device_types.h |   4 +
> >>  drivers/gpu/drm/xe/xe_gt.c           |   2 +
> >>  drivers/gpu/drm/xe/xe_irq.c          |  22 +
> >>  drivers/gpu/drm/xe/xe_module.c       |   5 +
> >>  drivers/gpu/drm/xe/xe_pmu.c          | 739 +++++++++++++++++++++++++++
> >>  drivers/gpu/drm/xe/xe_pmu.h          |  25 +
> >>  drivers/gpu/drm/xe/xe_pmu_types.h    |  80 +++
> >>  include/uapi/drm/xe_drm.h            |  16 +
> >>  11 files changed, 902 insertions(+)
> >>  create mode 100644 drivers/gpu/drm/xe/xe_pmu.c
> >>  create mode 100644 drivers/gpu/drm/xe/xe_pmu.h
> >>  create mode 100644 drivers/gpu/drm/xe/xe_pmu_types.h
> >>
> >> diff --git a/drivers/gpu/drm/xe/Makefile b/drivers/gpu/drm/xe/Makefile
> >> index 081c57fd8632..e52ab795c566 100644
> >> --- a/drivers/gpu/drm/xe/Makefile
> >> +++ b/drivers/gpu/drm/xe/Makefile
> >> @@ -217,6 +217,8 @@ xe-$(CONFIG_DRM_XE_DISPLAY) += \
> >>	i915-display/skl_universal_plane.o \
> >>	i915-display/skl_watermark.o
> >>
> >> +xe-$(CONFIG_PERF_EVENTS) += xe_pmu.o
> >> +
> >>  ifeq ($(CONFIG_ACPI),y)
> >>	xe-$(CONFIG_DRM_XE_DISPLAY) += \
> >>		i915-display/intel_acpi.o \
> >> diff --git a/drivers/gpu/drm/xe/regs/xe_gt_regs.h b/drivers/gpu/drm/xe/regs/xe_gt_regs.h
> >> index 3f664011eaea..c7d9e4634745 100644
> >> --- a/drivers/gpu/drm/xe/regs/xe_gt_regs.h
> >> +++ b/drivers/gpu/drm/xe/regs/xe_gt_regs.h
> >> @@ -285,6 +285,11 @@
> >>  #define   INVALIDATION_BROADCAST_MODE_DIS	REG_BIT(12)
> >>  #define   GLOBAL_INVALIDATION_MODE		REG_BIT(2)
> >>
> >> +#define XE_OAG_RC0_ANY_ENGINE_BUSY_FREE		XE_REG(0xdb80)
> >> +#define XE_OAG_ANY_MEDIA_FF_BUSY_FREE		XE_REG(0xdba0)
> >> +#define XE_OAG_BLT_BUSY_FREE			XE_REG(0xdbbc)
> >> +#define XE_OAG_RENDER_BUSY_FREE			XE_REG(0xdbdc)
> >> +
> >>  #define SAMPLER_MODE				XE_REG_MCR(0xe18c, XE_REG_OPTION_MASKED)
> >>  #define   ENABLE_SMALLPL			REG_BIT(15)
> >>  #define   SC_DISABLE_POWER_OPTIMIZATION_EBB	REG_BIT(9)
> >> diff --git a/drivers/gpu/drm/xe/xe_device.c b/drivers/gpu/drm/xe/xe_device.c
> >> index c7985af85a53..b2c7bd4a97d9 100644
> >> --- a/drivers/gpu/drm/xe/xe_device.c
> >> +++ b/drivers/gpu/drm/xe/xe_device.c
> >> @@ -328,6 +328,8 @@ int xe_device_probe(struct xe_device *xe)
> >>
> >>	xe_debugfs_register(xe);
> >>
> >> +	xe_pmu_register(&xe->pmu);
> >> +
> >>	err = drmm_add_action_or_reset(&xe->drm, xe_device_sanitize, xe);
> >>	if (err)
> >>		return err;
> >> diff --git a/drivers/gpu/drm/xe/xe_device_types.h b/drivers/gpu/drm/xe/xe_device_types.h
> >> index 0226d44a6af2..3ba99aae92b9 100644
> >> --- a/drivers/gpu/drm/xe/xe_device_types.h
> >> +++ b/drivers/gpu/drm/xe/xe_device_types.h
> >> @@ -15,6 +15,7 @@
> >>  #include "xe_devcoredump_types.h"
> >>  #include "xe_gt_types.h"
> >>  #include "xe_platform_types.h"
> >> +#include "xe_pmu.h"
> >>  #include "xe_step_types.h"
> >>
> >>  #if IS_ENABLED(CONFIG_DRM_XE_DISPLAY)
> >> @@ -332,6 +333,9 @@ struct xe_device {
> >>	/** @d3cold_allowed: Indicates if d3cold is a valid device state */
> >>	bool d3cold_allowed;
> >>
> >> +	/* @pmu: performance monitoring unit */
> >> +	struct xe_pmu pmu;
> >> +
> >
> > Now I am wondering why we don't make the PMU per-gt (or per-tile)? Per-gt
> > would work for these busyness registers and I am not sure how the
> > interrupts are hooked up.
> >
> > In i915 the PMU being device level helped in having a single timer (rather
> > than a per gt timer).
> >
> > Anyway probably not much practical benefit by make it per-gt/per-tile, so
> > maybe leave as is. Just thinking out loud.
>
> We are able to expose per-GT counters, so I do not see any benefit in
> making the PMU per GT; also, I believe struct pmu is per device, as it
> has a type associated with it which is unique for a device.

PMU can be made per gt and named xe-gt0, xe-gt1 etc. if we want. But anyway
leave as is.
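
(If we ever did go per-GT, a purely hypothetical sketch might look like the
below; note the per-GT xe_pmu instance does not exist today:

	/* Hypothetical: one PMU instance per GT instead of per device. */
	for_each_gt(gt, xe, id) {
		gt->pmu.name = kasprintf(GFP_KERNEL, "xe_%s-gt%u",
					 dev_name(xe->drm.dev), id);
		perf_pmu_register(&gt->pmu.base, gt->pmu.name, -1);
	}

i.e. one perf_pmu_register call per GT instead of one for the device.)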

>
> >
> >>	/* private: */
> >>
> >>  #if IS_ENABLED(CONFIG_DRM_XE_DISPLAY)
> >> diff --git a/drivers/gpu/drm/xe/xe_gt.c b/drivers/gpu/drm/xe/xe_gt.c
> >> index 2458397ce8af..96e3720923d4 100644
> >> --- a/drivers/gpu/drm/xe/xe_gt.c
> >> +++ b/drivers/gpu/drm/xe/xe_gt.c
> >> @@ -593,6 +593,8 @@ int xe_gt_suspend(struct xe_gt *gt)
> >>	if (err)
> >>		goto err_msg;
> >>
> >> +	engine_group_busyness_store(gt);
> >
> > Not sure I follow the reason for doing this at suspend time? If PMU was
> > active there should be a previous value. Why must we take a new sample
> > explicitly here?
>
> The PMU interface can be read even when the device is suspended, and in
> such cases we should not wake up the device. If we try to read the
> register while the device is suspended it gives a spurious counter; you
> can check in version#1 of this series that we were getting huge values,
> as we put the device to suspend immediately after probe. So we store the
> last known good value before suspend.

No need, see comment at init_samples below.

>
> >
> > To me looks like engine_group_busyness_store should not be needed, see
> > comment below for init_samples too.
> >
> >> +
> >>	err = xe_uc_suspend(&gt->uc);
> >>	if (err)
> >>		goto err_force_wake;
> >> diff --git a/drivers/gpu/drm/xe/xe_irq.c b/drivers/gpu/drm/xe/xe_irq.c
> >> index b4ed1e4a3388..cb943fb94ec7 100644
> >> --- a/drivers/gpu/drm/xe/xe_irq.c
> >> +++ b/drivers/gpu/drm/xe/xe_irq.c
> >> @@ -27,6 +27,24 @@
> >>  #define IIR(offset)				XE_REG(offset + 0x8)
> >>  #define IER(offset)				XE_REG(offset + 0xc)
> >>
> >> +/*
> >> + * Interrupt statistic for PMU. Increments the counter only if the
> >> + * interrupt originated from the GPU so interrupts from a device which
> >> + * shares the interrupt line are not accounted.
> >> + */
> >> +static inline void xe_pmu_irq_stats(struct xe_device *xe,
> >
> > No inline, compiler will do it, but looks like we may want to always_inline
> > this. Also this function should really be in xe_pmu.h? Anyway probably
> > leave as is.
> >
> >> +				    irqreturn_t res)
> >> +{
> >> +	if (unlikely(res != IRQ_HANDLED))
> >> +		return;
> >> +
> >> +	/*
> >> +	 * A clever compiler translates that into INC. A not so clever one
> >> +	 * should at least prevent store tearing.
> >> +	 */
> >> +	WRITE_ONCE(xe->pmu.irq_count, xe->pmu.irq_count + 1);
> >> +}
> >> +
> >>  static void assert_iir_is_zero(struct xe_gt *mmio, struct xe_reg reg)
> >>  {
> >>	u32 val = xe_mmio_read32(mmio, reg);
> >> @@ -325,6 +343,8 @@ static irqreturn_t xelp_irq_handler(int irq, void *arg)
> >>
> >>	xe_display_irq_enable(xe, gu_misc_iir);
> >>
> >> +	xe_pmu_irq_stats(xe, IRQ_HANDLED);
> >> +
> >>	return IRQ_HANDLED;
> >>  }
> >>
> >> @@ -414,6 +434,8 @@ static irqreturn_t dg1_irq_handler(int irq, void *arg)
> >>	dg1_intr_enable(xe, false);
> >>	xe_display_irq_enable(xe, gu_misc_iir);
> >>
> >> +	xe_pmu_irq_stats(xe, IRQ_HANDLED);
> >> +
> >>	return IRQ_HANDLED;
> >>  }
> >>
> >> diff --git a/drivers/gpu/drm/xe/xe_module.c b/drivers/gpu/drm/xe/xe_module.c
> >> index 75e5be939f53..f6fe89748525 100644
> >> --- a/drivers/gpu/drm/xe/xe_module.c
> >> +++ b/drivers/gpu/drm/xe/xe_module.c
> >> @@ -12,6 +12,7 @@
> >>  #include "xe_hw_fence.h"
> >>  #include "xe_module.h"
> >>  #include "xe_pci.h"
> >> +#include "xe_pmu.h"
> >>  #include "xe_sched_job.h"
> >>
> >>  bool enable_guc = true;
> >> @@ -49,6 +50,10 @@ static const struct init_funcs init_funcs[] = {
> >>		.init = xe_sched_job_module_init,
> >>		.exit = xe_sched_job_module_exit,
> >>	},
> >> +	{
> >> +		.init = xe_pmu_init,
> >> +		.exit = xe_pmu_exit,
> >> +	},
> >>	{
> >>		.init = xe_register_pci_driver,
> >>		.exit = xe_unregister_pci_driver,
> >> diff --git a/drivers/gpu/drm/xe/xe_pmu.c b/drivers/gpu/drm/xe/xe_pmu.c
> >> new file mode 100644
> >> index 000000000000..bef1895be9f7
> >> --- /dev/null
> >> +++ b/drivers/gpu/drm/xe/xe_pmu.c
> >> @@ -0,0 +1,739 @@
> >> +/*
> >> + * SPDX-License-Identifier: MIT
> >> + *
> >> + * Copyright © 2023 Intel Corporation
> >> + */
> >
> > This SPDX header is for .h files not .c files. Actually, it is for neither :/
>
> But I see this in almost all the files in xe.

Look closely.
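
(For reference, the convention in Documentation/process/license-rules.rst is:

	// SPDX-License-Identifier: MIT

as the first line of .c files, and:

	/* SPDX-License-Identifier: MIT */

as the first line of .h files, with the copyright notice in a separate
comment block below it.)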

> >
> >> +
> >> +#include <drm/drm_drv.h>
> >> +#include <drm/drm_managed.h>
> >> +#include <drm/xe_drm.h>
> >> +
> >> +#include "regs/xe_gt_regs.h"
> >> +#include "xe_device.h"
> >> +#include "xe_gt_clock.h"
> >> +#include "xe_mmio.h"
> >> +
> >> +static cpumask_t xe_pmu_cpumask;
> >> +static unsigned int xe_pmu_target_cpu = -1;
> >> +
> >> +static unsigned int config_gt_id(const u64 config)
> >> +{
> >> +	return config >> __XE_PMU_GT_SHIFT;
> >> +}
> >> +
> >> +static u64 config_counter(const u64 config)
> >> +{
> >> +	return config & ~(~0ULL << __XE_PMU_GT_SHIFT);
> >> +}
> >> +
> >> +static unsigned int
> >> +__sample_idx(struct xe_pmu *pmu, unsigned int gt_id, int sample)
> >> +{
> >> +	unsigned int idx = gt_id * __XE_NUM_PMU_SAMPLERS + sample;
> >> +
> >> +	XE_BUG_ON(idx >= ARRAY_SIZE(pmu->sample));
> >> +
> >> +	return idx;
> >> +}
> >
> > The compiler does this for us if we make sample array 2-d.
> >
> >> +
> >> +static u64 read_sample(struct xe_pmu *pmu, unsigned int gt_id, int sample)
> >> +{
> >> +	return pmu->sample[__sample_idx(pmu, gt_id, sample)].cur;
> >> +}
> >> +
> >> +static void
> >> +store_sample(struct xe_pmu *pmu, unsigned int gt_id, int sample, u64 val)
> >> +{
> >> +	pmu->sample[__sample_idx(pmu, gt_id, sample)].cur = val;
> >> +}
> >
> > The three functions above are not needed if we make the sample array
> > 2-d. See here:
> >
> > https://patchwork.freedesktop.org/patch/538887/?series=118225&rev=1
> >
> > Only a part of the patch above was merged (see 8ed0753b527dc) to keep the
> > patch size small, but since for xe we are starting from scratch we can
> > implement the whole approach above (get rid of the read/store helpers
> > entirely, direct assignment without the helpers works).
>
> OK, I'll go over it.
>
> >
> >> +
> >> +static int engine_busyness_sample_type(u64 config)
> >> +{
> >> +	int type = 0;
> >> +
> >> +	switch (config) {
> >> +	case XE_PMU_RENDER_GROUP_BUSY(0):
> >> +		type =  __XE_SAMPLE_RENDER_GROUP_BUSY;
> >> +		break;
> >> +	case XE_PMU_COPY_GROUP_BUSY(0):
> >> +		type = __XE_SAMPLE_COPY_GROUP_BUSY;
> >> +		break;
> >> +	case XE_PMU_MEDIA_GROUP_BUSY(0):
> >> +		type = __XE_SAMPLE_MEDIA_GROUP_BUSY;
> >> +		break;
> >> +	case XE_PMU_ANY_ENGINE_GROUP_BUSY(0):
> >> +		type = __XE_SAMPLE_ANY_ENGINE_GROUP_BUSY;
> >> +		break;
> >> +	}
> >> +
> >> +	return type;
> >> +}
> >> +
> >> +static void xe_pmu_event_destroy(struct perf_event *event)
> >> +{
> >> +	struct xe_device *xe =
> >> +		container_of(event->pmu, typeof(*xe), pmu.base);
> >> +
> >> +	drm_WARN_ON(&xe->drm, event->parent);
> >> +
> >> +	drm_dev_put(&xe->drm);
> >> +}
> >> +
> >> +static u64 __engine_group_busyness_read(struct xe_gt *gt, u64 config)
> >> +{
> >> +	u64 val = 0;
> >> +
> >> +	switch (config) {
> >> +	case XE_PMU_RENDER_GROUP_BUSY(0):
> >> +		val = xe_mmio_read32(gt, XE_OAG_RENDER_BUSY_FREE);
> >> +		break;
> >> +	case XE_PMU_COPY_GROUP_BUSY(0):
> >> +		val = xe_mmio_read32(gt, XE_OAG_BLT_BUSY_FREE);
> >> +		break;
> >> +	case XE_PMU_MEDIA_GROUP_BUSY(0):
> >> +		val = xe_mmio_read32(gt, XE_OAG_ANY_MEDIA_FF_BUSY_FREE);
> >> +		break;
> >> +	case XE_PMU_ANY_ENGINE_GROUP_BUSY(0):
> >> +		val = xe_mmio_read32(gt, XE_OAG_RC0_ANY_ENGINE_BUSY_FREE);
> >> +		break;
> >> +	default:
> >> +		drm_warn(&gt->tile->xe->drm, "unknown pmu event\n");
> >> +	}
> >
> > We need xe_device_mem_access_get, also xe_force_wake_get if applicable,
> > somewhere before reading these registers.
> >
> >> +
> >> +	return xe_gt_clock_interval_to_ns(gt, val * 16);
> >> +}
> >> +
> >> +static u64 engine_group_busyness_read(struct xe_gt *gt, u64 config)
> >> +{
> >> +	int sample_type = engine_busyness_sample_type(config);
> >> +	struct xe_device *xe = gt->tile->xe;
> >> +	const unsigned int gt_id = gt->info.id;
> >> +	struct xe_pmu *pmu = &xe->pmu;
> >> +	bool device_awake;
> >> +	unsigned long flags;
> >> +	u64 val;
> >> +
> >> +	/*
> >> +	 * found no better way to check if device is awake or not. Before
> >
> > xe_device_mem_access_get_if_ongoing (hard to find name).
>
> Thanks for sharing; it looks to have been added recently. If we use this
> we needn't use xe_device_mem_access_get.

See comment at init_samples.

>
> >
> >> +	 * we suspend we set the submission_state.enabled to false.
> >> +	 */
> >> +	device_awake = gt->uc.guc.submission_state.enabled ? true : false;
> >> +	if (device_awake)
> >> +		val = __engine_group_busyness_read(gt, config);
> >> +
> >> +	spin_lock_irqsave(&pmu->lock, flags);
> >> +
> >> +	if (device_awake)
> >> +		store_sample(pmu, gt_id, sample_type, val);
> >> +	else
> >> +		val = read_sample(pmu, gt_id, sample_type);
> >> +
> >> +	spin_unlock_irqrestore(&pmu->lock, flags);
> >> +
> >> +	return val;
> >> +}
> >> +
> >> +void engine_group_busyness_store(struct xe_gt *gt)
> >> +{
> >> +	struct xe_pmu *pmu = &gt->tile->xe->pmu;
> >> +	unsigned int gt_id = gt->info.id;
> >> +	unsigned long flags;
> >> +
> >> +	spin_lock_irqsave(&pmu->lock, flags);
> >> +
> >> +	store_sample(pmu, gt_id, __XE_SAMPLE_RENDER_GROUP_BUSY,
> >> +		     __engine_group_busyness_read(gt, XE_PMU_RENDER_GROUP_BUSY(0)));
> >> +	store_sample(pmu, gt_id, __XE_SAMPLE_COPY_GROUP_BUSY,
> >> +		     __engine_group_busyness_read(gt, XE_PMU_COPY_GROUP_BUSY(0)));
> >> +	store_sample(pmu, gt_id, __XE_SAMPLE_MEDIA_GROUP_BUSY,
> >> +		     __engine_group_busyness_read(gt, XE_PMU_MEDIA_GROUP_BUSY(0)));
> >> +	store_sample(pmu, gt_id, __XE_SAMPLE_ANY_ENGINE_GROUP_BUSY,
> >> +		     __engine_group_busyness_read(gt, XE_PMU_ANY_ENGINE_GROUP_BUSY(0)));
> >> +
> >> +	spin_unlock_irqrestore(&pmu->lock, flags);
> >> +}
> >> +
> >> +static int
> >> +config_status(struct xe_device *xe, u64 config)
> >> +{
> >> +	unsigned int max_gt_id = xe->info.gt_count > 1 ? 1 : 0;
> >> +	unsigned int gt_id = config_gt_id(config);
> >> +	struct xe_gt *gt = xe_device_get_gt(xe, gt_id);
> >> +
> >> +	if (gt_id > max_gt_id)
> >> +		return -ENOENT;
> >> +
> >> +	switch (config_counter(config)) {
> >> +	case XE_PMU_INTERRUPTS(0):
> >> +		if (gt_id)
> >> +			return -ENOENT;
> >
> > OK: this is a global event so we say this is enabled only for gt0.
> >
> >> +		break;
> >> +	case XE_PMU_RENDER_GROUP_BUSY(0):
> >> +	case XE_PMU_COPY_GROUP_BUSY(0):
> >> +	case XE_PMU_ANY_ENGINE_GROUP_BUSY(0):
> >> +		if (GRAPHICS_VER(xe) < 12)
> >
> > Any point checking? xe only supports Gen12+.
>
> Hmmm, ya, good point; will remove this.
> >
> >> +			return -ENOENT;
> >> +		break;
> >> +	case XE_PMU_MEDIA_GROUP_BUSY(0):
> >> +		if (MEDIA_VER(xe) >= 13 && gt->info.type != XE_GT_TYPE_MEDIA)
> >> +			return -ENOENT;
> >
> > OK: this makes sense, so we will expose this event only for media gt's.
> >
> >> +		break;
> >> +	default:
> >> +		return -ENOENT;
> >> +	}
> >> +
> >> +	return 0;
> >> +}
> >> +
> >> +static int xe_pmu_event_init(struct perf_event *event)
> >> +{
> >> +	struct xe_device *xe =
> >> +		container_of(event->pmu, typeof(*xe), pmu.base);
> >> +	struct xe_pmu *pmu = &xe->pmu;
> >> +	int ret;
> >> +
> >> +	if (pmu->closed)
> >> +		return -ENODEV;
> >> +
> >> +	if (event->attr.type != event->pmu->type)
> >> +		return -ENOENT;
> >> +
> >> +	/* unsupported modes and filters */
> >> +	if (event->attr.sample_period) /* no sampling */
> >> +		return -EINVAL;
> >> +
> >> +	if (has_branch_stack(event))
> >> +		return -EOPNOTSUPP;
> >> +
> >> +	if (event->cpu < 0)
> >> +		return -EINVAL;
> >> +
> >> +	/* only allow running on one cpu at a time */
> >> +	if (!cpumask_test_cpu(event->cpu, &xe_pmu_cpumask))
> >> +		return -EINVAL;
> >> +
> >> +	ret = config_status(xe, event->attr.config);
> >> +	if (ret)
> >> +		return ret;
> >> +
> >> +	if (!event->parent) {
> >> +		drm_dev_get(&xe->drm);
> >> +		event->destroy = xe_pmu_event_destroy;
> >> +	}
> >> +
> >> +	return 0;
> >> +}
> >> +
> >> +static u64 __xe_pmu_event_read(struct perf_event *event)
> >> +{
> >> +	struct xe_device *xe =
> >> +		container_of(event->pmu, typeof(*xe), pmu.base);
> >> +	const unsigned int gt_id = config_gt_id(event->attr.config);
> >> +	const u64 config = config_counter(event->attr.config);
> >> +	struct xe_gt *gt = xe_device_get_gt(xe, gt_id);
> >> +	struct xe_pmu *pmu = &xe->pmu;
> >> +	u64 val = 0;
> >> +
> >> +	switch (config) {
> >> +	case XE_PMU_INTERRUPTS(0):
> >> +		val = READ_ONCE(pmu->irq_count);
> >
> > OK: The correct way to do this READ_ONCE/WRITE_ONCE irq_count stuff would
> > be to take pmu->lock (both while reading and writing irq_count) but that
> > would be expensive in the interrupt handler (as the .h hints). So all we
> > can do is what is done here.
> >
> >> +		break;
> >> +	case XE_PMU_RENDER_GROUP_BUSY(0):
> >> +	case XE_PMU_COPY_GROUP_BUSY(0):
> >> +	case XE_PMU_ANY_ENGINE_GROUP_BUSY(0):
> >> +	case XE_PMU_MEDIA_GROUP_BUSY(0):
> >> +		val = engine_group_busyness_read(gt, config);
> >> +	}
> >> +
> >> +	return val;
> >> +}
> >> +
> >> +static void xe_pmu_event_read(struct perf_event *event)
> >> +{
> >> +	struct xe_device *xe =
> >> +		container_of(event->pmu, typeof(*xe), pmu.base);
> >> +	struct hw_perf_event *hwc = &event->hw;
> >> +	struct xe_pmu *pmu = &xe->pmu;
> >> +	u64 prev, new;
> >> +
> >> +	if (pmu->closed) {
> >> +		event->hw.state = PERF_HES_STOPPED;
> >> +		return;
> >> +	}
> >> +again:
> >> +	prev = local64_read(&hwc->prev_count);
> >> +	new = __xe_pmu_event_read(event);
> >> +
> >> +	if (local64_cmpxchg(&hwc->prev_count, prev, new) != prev)
> >> +		goto again;
> >> +
> >> +	local64_add(new - prev, &event->count);
> >> +}
> >> +
> >> +static void xe_pmu_enable(struct perf_event *event)
> >> +{
> >> +	/*
> >> +	 * Store the current counter value so we can report the correct delta
> >> +	 * for all listeners. Even when the event was already enabled and has
> >> +	 * an existing non-zero value.
> >> +	 */
> >> +	local64_set(&event->hw.prev_count, __xe_pmu_event_read(event));
> >> +}
> >> +
> >> +static void xe_pmu_event_start(struct perf_event *event, int flags)
> >> +{
> >> +	struct xe_device *xe =
> >> +		container_of(event->pmu, typeof(*xe), pmu.base);
> >> +	struct xe_pmu *pmu = &xe->pmu;
> >> +
> >> +	if (pmu->closed)
> >> +		return;
> >> +
> >> +	xe_pmu_enable(event);
> >> +	event->hw.state = 0;
> >> +}
> >> +
> >> +static void xe_pmu_event_stop(struct perf_event *event, int flags)
> >> +{
> >> +	if (flags & PERF_EF_UPDATE)
> >> +		xe_pmu_event_read(event);
> >> +
> >> +	event->hw.state = PERF_HES_STOPPED;
> >> +}
> >> +
> >> +static int xe_pmu_event_add(struct perf_event *event, int flags)
> >> +{
> >> +	struct xe_device *xe =
> >> +		container_of(event->pmu, typeof(*xe), pmu.base);
> >> +	struct xe_pmu *pmu = &xe->pmu;
> >> +
> >> +	if (pmu->closed)
> >> +		return -ENODEV;
> >> +
> >> +	if (flags & PERF_EF_START)
> >> +		xe_pmu_event_start(event, flags);
> >> +
> >> +	return 0;
> >> +}
> >> +
> >> +static void xe_pmu_event_del(struct perf_event *event, int flags)
> >> +{
> >> +	xe_pmu_event_stop(event, PERF_EF_UPDATE);
> >> +}
> >> +
> >> +static int xe_pmu_event_event_idx(struct perf_event *event)
> >> +{
> >> +	return 0;
> >> +}
> >> +
> >> +struct xe_str_attribute {
> >> +	struct device_attribute attr;
> >> +	const char *str;
> >> +};
> >> +
> >> +static ssize_t xe_pmu_format_show(struct device *dev,
> >> +				  struct device_attribute *attr, char *buf)
> >> +{
> >> +	struct xe_str_attribute *eattr;
> >> +
> >> +	eattr = container_of(attr, struct xe_str_attribute, attr);
> >> +	return sprintf(buf, "%s\n", eattr->str);
> >> +}
> >> +
> >> +#define XE_PMU_FORMAT_ATTR(_name, _config) \
> >> +	(&((struct xe_str_attribute[]) { \
> >> +		{ .attr = __ATTR(_name, 0444, xe_pmu_format_show, NULL), \
> >> +		  .str = _config, } \
> >> +	})[0].attr.attr)
> >> +
> >> +static struct attribute *xe_pmu_format_attrs[] = {
> >> +	XE_PMU_FORMAT_ATTR(xe_eventid, "config:0-20"),
> >
> > 0-20 means bits 0-20? Though here we probably have a different number of
> > config bits? Probably doesn't matter?
>
> As I understand it, this is not used anymore, so I will remove it.
>
> >
> > The string will show up with:
> >
> > cat /sys/devices/xe/format/xe_eventid
> >
> >> +	NULL,
> >> +};
> >> +
> >> +static const struct attribute_group xe_pmu_format_attr_group = {
> >> +	.name = "format",
> >> +	.attrs = xe_pmu_format_attrs,
> >> +};
> >> +
> >> +struct xe_ext_attribute {
> >> +	struct device_attribute attr;
> >> +	unsigned long val;
> >> +};
> >> +
> >> +static ssize_t xe_pmu_event_show(struct device *dev,
> >> +				 struct device_attribute *attr, char *buf)
> >> +{
> >> +	struct xe_ext_attribute *eattr;
> >> +
> >> +	eattr = container_of(attr, struct xe_ext_attribute, attr);
> >> +	return sprintf(buf, "config=0x%lx\n", eattr->val);
> >> +}
> >> +
> >> +static ssize_t cpumask_show(struct device *dev,
> >> +			    struct device_attribute *attr, char *buf)
> >> +{
> >> +	return cpumap_print_to_pagebuf(true, buf, &xe_pmu_cpumask);
> >> +}
> >> +
> >> +static DEVICE_ATTR_RO(cpumask);
> >> +
> >> +static struct attribute *xe_cpumask_attrs[] = {
> >> +	&dev_attr_cpumask.attr,
> >> +	NULL,
> >> +};
> >> +
> >> +static const struct attribute_group xe_pmu_cpumask_attr_group = {
> >> +	.attrs = xe_cpumask_attrs,
> >> +};
> >> +
> >> +#define __event(__counter, __name, __unit) \
> >> +{ \
> >> +	.counter = (__counter), \
> >> +	.name = (__name), \
> >> +	.unit = (__unit), \
> >> +	.global = false, \
> >> +}
> >> +
> >> +#define __global_event(__counter, __name, __unit) \
> >> +{ \
> >> +	.counter = (__counter), \
> >> +	.name = (__name), \
> >> +	.unit = (__unit), \
> >> +	.global = true, \
> >> +}
> >> +
> >> +static struct xe_ext_attribute *
> >> +add_xe_attr(struct xe_ext_attribute *attr, const char *name, u64 config)
> >> +{
> >> +	sysfs_attr_init(&attr->attr.attr);
> >> +	attr->attr.attr.name = name;
> >> +	attr->attr.attr.mode = 0444;
> >> +	attr->attr.show = xe_pmu_event_show;
> >> +	attr->val = config;
> >> +
> >> +	return ++attr;
> >> +}
> >> +
> >> +static struct perf_pmu_events_attr *
> >> +add_pmu_attr(struct perf_pmu_events_attr *attr, const char *name,
> >> +	     const char *str)
> >> +{
> >> +	sysfs_attr_init(&attr->attr.attr);
> >> +	attr->attr.attr.name = name;
> >> +	attr->attr.attr.mode = 0444;
> >> +	attr->attr.show = perf_event_sysfs_show;
> >> +	attr->event_str = str;
> >> +
> >> +	return ++attr;
> >> +}
> >> +
> >> +static struct attribute **
> >> +create_event_attributes(struct xe_pmu *pmu)
> >> +{
> >> +	struct xe_device *xe = container_of(pmu, typeof(*xe), pmu);
> >> +	static const struct {
> >> +		unsigned int counter;
> >> +		const char *name;
> >> +		const char *unit;
> >> +		bool global;
> >> +	} events[] = {
> >> +		__global_event(0, "interrupts", NULL),
> >> +		__event(1, "render-group-busy", "ns"),
> >> +		__event(2, "copy-group-busy", "ns"),
> >> +		__event(3, "media-group-busy", "ns"),
> >> +		__event(4, "any-engine-group-busy", "ns"),
> >> +	};
> >
> > OK: this function is some black magic to expose stuff through
> > PMU. Identical to i915 and seems to be working from the commit message so
> > should be fine.
> >
> >> +
> >> +	unsigned int count = 0;
> >> +	struct perf_pmu_events_attr *pmu_attr = NULL, *pmu_iter;
> >> +	struct xe_ext_attribute *xe_attr = NULL, *xe_iter;
> >> +	struct attribute **attr = NULL, **attr_iter;
> >> +	struct xe_gt *gt;
> >> +	unsigned int i, j;
> >> +
> >> +	/* Count how many counters we will be exposing. */
> >> +	for_each_gt(gt, xe, j) {
> >> +		for (i = 0; i < ARRAY_SIZE(events); i++) {
> >> +			u64 config = ___XE_PMU_OTHER(j, events[i].counter);
> >> +
> >> +			if (!config_status(xe, config))
> >> +				count++;
> >> +		}
> >> +	}
> >> +
> >> +	/* Allocate attribute objects and table. */
> >> +	xe_attr = kcalloc(count, sizeof(*xe_attr), GFP_KERNEL);
> >> +	if (!xe_attr)
> >> +		goto err_alloc;
> >> +
> >> +	pmu_attr = kcalloc(count, sizeof(*pmu_attr), GFP_KERNEL);
> >> +	if (!pmu_attr)
> >> +		goto err_alloc;
> >> +
> >> +	/* Max one pointer of each attribute type plus a termination entry. */
> >> +	attr = kcalloc(count * 2 + 1, sizeof(*attr), GFP_KERNEL);
> >> +	if (!attr)
> >> +		goto err_alloc;
> >> +
> >> +	xe_iter = xe_attr;
> >> +	pmu_iter = pmu_attr;
> >> +	attr_iter = attr;
> >> +
> >> +	for_each_gt(gt, xe, j) {
> >> +		for (i = 0; i < ARRAY_SIZE(events); i++) {
> >> +			u64 config = ___XE_PMU_OTHER(j, events[i].counter);
> >> +			char *str;
> >> +
> >> +			if (config_status(xe, config))
> >> +				continue;
> >> +
> >> +			if (events[i].global)
> >> +				str = kstrdup(events[i].name, GFP_KERNEL);
> >> +			else
> >> +				str = kasprintf(GFP_KERNEL, "%s-gt%u",
> >> +						events[i].name, j);
> >> +			if (!str)
> >> +				goto err;
> >> +
> >> +			*attr_iter++ = &xe_iter->attr.attr;
> >> +			xe_iter = add_xe_attr(xe_iter, str, config);
> >> +
> >> +			if (events[i].unit) {
> >> +				if (events[i].global)
> >> +					str = kasprintf(GFP_KERNEL, "%s.unit",
> >> +							events[i].name);
> >> +				else
> >> +					str = kasprintf(GFP_KERNEL, "%s-gt%u.unit",
> >> +							events[i].name, j);
> >> +				if (!str)
> >> +					goto err;
> >> +
> >> +				*attr_iter++ = &pmu_iter->attr.attr;
> >> +				pmu_iter = add_pmu_attr(pmu_iter, str,
> >> +							events[i].unit);
> >> +			}
> >> +		}
> >> +	}
> >> +
> >> +	pmu->xe_attr = xe_attr;
> >> +	pmu->pmu_attr = pmu_attr;
> >> +
> >> +	return attr;
> >> +
> >> +err:
> >> +	for (attr_iter = attr; *attr_iter; attr_iter++)
> >> +		kfree((*attr_iter)->name);
> >> +
> >> +err_alloc:
> >> +	kfree(attr);
> >> +	kfree(xe_attr);
> >> +	kfree(pmu_attr);
> >> +
> >> +	return NULL;
> >> +}
> >> +
> >> +static void free_event_attributes(struct xe_pmu *pmu)
> >> +{
> >> +	struct attribute **attr_iter = pmu->events_attr_group.attrs;
> >> +
> >> +	for (; *attr_iter; attr_iter++)
> >> +		kfree((*attr_iter)->name);
> >> +
> >> +	kfree(pmu->events_attr_group.attrs);
> >> +	kfree(pmu->xe_attr);
> >> +	kfree(pmu->pmu_attr);
> >> +
> >> +	pmu->events_attr_group.attrs = NULL;
> >> +	pmu->xe_attr = NULL;
> >> +	pmu->pmu_attr = NULL;
> >> +}
> >> +
> >> +static int xe_pmu_cpu_online(unsigned int cpu, struct hlist_node *node)
> >> +{
> >> +	struct xe_pmu *pmu = hlist_entry_safe(node, typeof(*pmu), cpuhp.node);
> >> +
> >> +	XE_BUG_ON(!pmu->base.event_init);
> >> +
> >> +	/* Select the first online CPU as a designated reader. */
> >> +	if (cpumask_empty(&xe_pmu_cpumask))
> >> +		cpumask_set_cpu(cpu, &xe_pmu_cpumask);
> >> +
> >> +	return 0;
> >> +}
> >> +
> >> +static int xe_pmu_cpu_offline(unsigned int cpu, struct hlist_node *node)
> >> +{
> >> +	struct xe_pmu *pmu = hlist_entry_safe(node, typeof(*pmu), cpuhp.node);
> >> +	unsigned int target = xe_pmu_target_cpu;
> >> +
> >> +	XE_BUG_ON(!pmu->base.event_init);
> >> +
> >> +	/*
> >> +	 * Unregistering an instance generates a CPU offline event which we must
> >> +	 * ignore to avoid incorrectly modifying the shared xe_pmu_cpumask.
> >> +	 */
> >> +	if (pmu->closed)
> >> +		return 0;
> >> +
> >> +	if (cpumask_test_and_clear_cpu(cpu, &xe_pmu_cpumask)) {
> >> +		target = cpumask_any_but(topology_sibling_cpumask(cpu), cpu);
> >> +
> >> +		/* Migrate events if there is a valid target */
> >> +		if (target < nr_cpu_ids) {
> >> +			cpumask_set_cpu(target, &xe_pmu_cpumask);
> >> +			xe_pmu_target_cpu = target;
> >> +		}
> >> +	}
> >> +
> >> +	if (target < nr_cpu_ids && target != pmu->cpuhp.cpu) {
> >> +		perf_pmu_migrate_context(&pmu->base, cpu, target);
> >> +		pmu->cpuhp.cpu = target;
> >> +	}
> >> +
> >> +	return 0;
> >> +}
> >> +
> >> +static enum cpuhp_state cpuhp_slot = CPUHP_INVALID;
> >> +
> >> +int xe_pmu_init(void)
> >> +{
> >> +	int ret;
> >> +
> >> +	ret = cpuhp_setup_state_multi(CPUHP_AP_ONLINE_DYN,
> >> +				      "perf/x86/intel/xe:online",
> >> +				      xe_pmu_cpu_online,
> >> +				      xe_pmu_cpu_offline);
> >> +	if (ret < 0)
> >> +		pr_notice("Failed to setup cpuhp state for xe PMU! (%d)\n",
> >> +			  ret);
> >> +	else
> >> +		cpuhp_slot = ret;
> >> +
> >> +	return 0;
> >> +}
> >> +
> >> +void xe_pmu_exit(void)
> >> +{
> >> +	if (cpuhp_slot != CPUHP_INVALID)
> >> +		cpuhp_remove_multi_state(cpuhp_slot);
> >> +}
> >> +
> >> +static int xe_pmu_register_cpuhp_state(struct xe_pmu *pmu)
> >> +{
> >> +	if (cpuhp_slot == CPUHP_INVALID)
> >> +		return -EINVAL;
> >> +
> >> +	return cpuhp_state_add_instance(cpuhp_slot, &pmu->cpuhp.node);
> >> +}
> >> +
> >> +static void xe_pmu_unregister_cpuhp_state(struct xe_pmu *pmu)
> >> +{
> >> +	cpuhp_state_remove_instance(cpuhp_slot, &pmu->cpuhp.node);
> >> +}
> >> +
> >> +static void xe_pmu_unregister(struct drm_device *device, void *arg)
> >> +{
> >> +	struct xe_pmu *pmu = arg;
> >> +
> >> +	if (!pmu->base.event_init)
> >> +		return;
> >> +
> >> +	/*
> >> +	 * "Disconnect" the PMU callbacks - since all are atomic synchronize_rcu
> >> +	 * ensures all currently executing ones will have exited before we
> >> +	 * proceed with unregistration.
> >> +	 */
> >> +	pmu->closed = true;
> >> +	synchronize_rcu();
> >> +
> >> +	xe_pmu_unregister_cpuhp_state(pmu);
> >> +
> >> +	perf_pmu_unregister(&pmu->base);
> >> +	pmu->base.event_init = NULL;
> >> +	kfree(pmu->base.attr_groups);
> >> +	kfree(pmu->name);
> >> +	free_event_attributes(pmu);
> >> +}
> >> +
> >> +static void init_samples(struct xe_pmu *pmu)
> >> +{
> >> +	struct xe_device *xe = container_of(pmu, typeof(*xe), pmu);
> >> +	struct xe_gt *gt;
> >> +	unsigned int i;
> >> +
> >> +	for_each_gt(gt, xe, i)
> >> +		engine_group_busyness_store(gt);
> >> +}
> >> +
> >> +void xe_pmu_register(struct xe_pmu *pmu)
> >
> > Why void, why not int? Is PMU failure a non-fatal error?
>
> Ya, the device is functional; it is only that these counters are not
> available. Hence I didn't want to fail the driver load.
> >
> >> +{
> >> +	struct xe_device *xe = container_of(pmu, typeof(*xe), pmu);
> >> +	const struct attribute_group *attr_groups[] = {
> >> +		&xe_pmu_format_attr_group,
> >> +		&pmu->events_attr_group,
> >> +		&xe_pmu_cpumask_attr_group,
> >
> > Can someone please explain what this cpumask/cpuhotplug stuff does and
> > whether it needs to be in this patch? There's something here:
>
> Comments from the original patch series in i915:
> https://patchwork.kernel.org/project/intel-gfx/patch/20170802123249.14194-5-tvrtko.ursulin@linux.intel.com/
>
> "IIRC an uncore PMU should expose a cpumask through sysfs, and then perf
> tools will read that mask and auto-magically limit the number of CPUs it
> instantiates the counter on."
>
> And ours are global counters, not per-CPU, so we limit them to just a
> single CPU.
>
> And as I understand it, we use the cpuhotplug support to migrate to a
> new CPU in case the earlier one goes offline.

OK, leave as is.

>
> >
> > b46a33e271ed ("drm/i915/pmu: Expose a PMU interface for perf queries")
> >
> > I'd rather just have the basic PMU infra and the events in this patch and
> > punt this cpumask/cpuhotplug stuff to a later patch, unless someone can say
> > what it does.
> >
> > Though perf_pmu_register seems to be doing some per cpu stuff so likely
> > this is needed. But amdgpu_pmu only has event and format attributes.
> >
> > Mostly leave as is I guess.
> >
> >> +		NULL
> >> +	};
> >> +
> >> +	int ret = -ENOMEM;
> >> +
> >> +	spin_lock_init(&pmu->lock);
> >> +	pmu->cpuhp.cpu = -1;
> >> +	init_samples(pmu);
> >
> > Why init_samples here? Can't we init the particular sample in
> > xe_pmu_event_init or even xe_pmu_event_start?
> >
> > Init'ing here may be too soon since the event might not be enabled for a
> > long time. So really this needs to move to xe_pmu_event_init or
> > xe_pmu_event_start.
>
> The device is put to suspend immediately after driver probe, and
> typically the PMU is opened even before any workload is run, so
> essentially the device is still in the suspended state and we cannot
> access the registers; hence we store the last known value in
> init_samples. Otherwise we see the bug in v#1 of the series.
>
> >
> > Actually this is already happening in xe_pmu_enable. We just need to decide
> > when we want to wake the device up and when we don't. So maybe wake the
> > device up at start (use xe_device_mem_access_get) and not wake up during
> > read (xe_device_mem_access_get_if_ongoing etc.)?

Just going to repeat this again:

xe_pmu_event_start calls xe_pmu_enable. In xe_pmu_enable use
xe_device_mem_access_get before calling __xe_pmu_event_read. This will wake
the device up and get a valid value in event->hw.prev_count.

In xe_pmu_event_read, use xe_device_mem_access_get_if_ongoing to read the
event without waking the device up (and return the previous value etc.).

Or, pass a flag in to __xe_pmu_event_read and to engine_group_busyness_read
and __engine_group_busyness_read. The flag will say whether or not to wake
up the device. If the flag says wake the device up, call
xe_device_mem_access_get and xe_force_wake_get, maybe in
__engine_group_busyness_read, before reading device registers. If the flag
says don't wake up the device, call xe_device_mem_access_get_if_ongoing.

This way we:
* don't need to call init_samples in xe_pmu_register
* we don't need engine_group_busyness_store
* we don't need to specifically call engine_group_busyness_store in xe_gt_suspend

The correct sample is read by waking up the device in xe_pmu_event_start.

Hopefully this is clear now.
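
As a rough, untested sketch of the first option (assuming
xe_device_mem_access_get()/xe_device_mem_access_put() pin the device awake
and that xe_device_mem_access_get_if_ongoing() returns true only when a
reference was obtained without waking the device):

	static void xe_pmu_enable(struct perf_event *event)
	{
		struct xe_device *xe =
			container_of(event->pmu, typeof(*xe), pmu.base);

		/* Wake the device so the initial sample comes from HW. */
		xe_device_mem_access_get(xe);
		local64_set(&event->hw.prev_count, __xe_pmu_event_read(event));
		xe_device_mem_access_put(xe);
	}

	static u64 engine_group_busyness_read(struct xe_gt *gt, u64 config)
	{
		int sample_type = engine_busyness_sample_type(config);
		struct xe_device *xe = gt->tile->xe;
		struct xe_pmu *pmu = &xe->pmu;
		unsigned long flags;
		bool awake;
		u64 val = 0;

		/* Don't wake a suspended device just to read a counter. */
		awake = xe_device_mem_access_get_if_ongoing(xe);
		if (awake)
			val = __engine_group_busyness_read(gt, config);

		spin_lock_irqsave(&pmu->lock, flags);
		if (awake)
			store_sample(pmu, gt->info.id, sample_type, val);
		else
			val = read_sample(pmu, gt->info.id, sample_type);
		spin_unlock_irqrestore(&pmu->lock, flags);

		if (awake)
			xe_device_mem_access_put(xe);

		return val;
	}

(xe_force_wake_get()/xe_force_wake_put() around the register reads may also
be needed, as noted earlier.)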

>
> >
> >> +
> >> +	pmu->name = kasprintf(GFP_KERNEL,
> >> +			      "xe_%s",
> >> +			      dev_name(xe->drm.dev));
> >> +	if (pmu->name)
> >> +		/* tools/perf reserves colons as special. */
> >> +		strreplace((char *)pmu->name, ':', '_');
> >> +
> >> +	if (!pmu->name)
> >> +		goto err;
> >> +
> >> +	pmu->events_attr_group.name = "events";
> >> +	pmu->events_attr_group.attrs = create_event_attributes(pmu);
> >> +	if (!pmu->events_attr_group.attrs)
> >> +		goto err_name;
> >> +
> >> +	pmu->base.attr_groups = kmemdup(attr_groups, sizeof(attr_groups),
> >> +					GFP_KERNEL);
> >> +	if (!pmu->base.attr_groups)
> >> +		goto err_attr;
> >> +
> >> +	pmu->base.module	= THIS_MODULE;
> >> +	pmu->base.task_ctx_nr	= perf_invalid_context;
> >> +	pmu->base.event_init	= xe_pmu_event_init;
> >> +	pmu->base.add		= xe_pmu_event_add;
> >> +	pmu->base.del		= xe_pmu_event_del;
> >> +	pmu->base.start		= xe_pmu_event_start;
> >> +	pmu->base.stop		= xe_pmu_event_stop;
> >> +	pmu->base.read		= xe_pmu_event_read;
> >> +	pmu->base.event_idx	= xe_pmu_event_event_idx;
> >> +
> >> +	ret = perf_pmu_register(&pmu->base, pmu->name, -1);
> >> +	if (ret)
> >> +		goto err_groups;
> >> +
> >> +	ret = xe_pmu_register_cpuhp_state(pmu);
> >> +	if (ret)
> >> +		goto err_unreg;
> >> +
> >> +	ret = drmm_add_action_or_reset(&xe->drm, xe_pmu_unregister, pmu);
> >> +	XE_WARN_ON(ret);
> >
> > We should just follow the regular error rewind here too and let
> > drm_notice() at the end print the error. This is what other drivers calling
> > drmm_add_action_or_reset seem to be doing.
>
> Ok ok.
> >
> >> +
> >> +	return;
> >> +
> >> +err_unreg:
> >> +	perf_pmu_unregister(&pmu->base);
> >> +err_groups:
> >> +	kfree(pmu->base.attr_groups);
> >> +err_attr:
> >> +	pmu->base.event_init = NULL;
> >> +	free_event_attributes(pmu);
> >> +err_name:
> >> +	kfree(pmu->name);
> >> +err:
> >> +	drm_notice(&xe->drm, "Failed to register PMU!\n");
> >> +}
> >> diff --git a/drivers/gpu/drm/xe/xe_pmu.h b/drivers/gpu/drm/xe/xe_pmu.h
> >> new file mode 100644
> >> index 000000000000..d3f47f4ab343
> >> --- /dev/null
> >> +++ b/drivers/gpu/drm/xe/xe_pmu.h
> >> @@ -0,0 +1,25 @@
> >> +/* SPDX-License-Identifier: MIT */
> >> +/*
> >> + * Copyright © 2023 Intel Corporation
> >> + */
> >> +
> >> +#ifndef _XE_PMU_H_
> >> +#define _XE_PMU_H_
> >> +
> >> +#include "xe_gt_types.h"
> >> +#include "xe_pmu_types.h"
> >> +
> >> +#ifdef CONFIG_PERF_EVENTS
> >
> > nit but maybe this should be:
> >
> > #if IS_ENABLED(CONFIG_PERF_EVENTS)
> >
> > or,
> >
> > #if IS_BUILTIN(CONFIG_PERF_EVENTS)
> >
> > Note CONFIG_PERF_EVENTS is a boolean kconfig option.
> >
> > See similar macro IS_REACHABLE() in i915_hwmon.h.
> >
> >> +int xe_pmu_init(void);
> >> +void xe_pmu_exit(void);
> >> +void xe_pmu_register(struct xe_pmu *pmu);
> >> +void engine_group_busyness_store(struct xe_gt *gt);
> >
> > Add xe_pmu_ prefix if function is needed (hopefully not).
>
> OK
> >
> >> +#else
> >> +static inline int xe_pmu_init(void) { return 0; }
> >> +static inline void xe_pmu_exit(void) {}
> >> +static inline void xe_pmu_register(struct xe_pmu *pmu) {}
> >> +static inline void engine_group_busyness_store(struct xe_gt *gt) {}
> >> +#endif
> >> +
> >> +#endif
> >> +
> >> diff --git a/drivers/gpu/drm/xe/xe_pmu_types.h b/drivers/gpu/drm/xe/xe_pmu_types.h
> >> new file mode 100644
> >> index 000000000000..e87edd4d6a87
> >> --- /dev/null
> >> +++ b/drivers/gpu/drm/xe/xe_pmu_types.h
> >> @@ -0,0 +1,80 @@
> >> +/* SPDX-License-Identifier: MIT */
> >> +/*
> >> + * Copyright © 2023 Intel Corporation
> >> + */
> >> +
> >> +#ifndef _XE_PMU_TYPES_H_
> >> +#define _XE_PMU_TYPES_H_
> >> +
> >> +#include <linux/perf_event.h>
> >> +#include <linux/spinlock_types.h>
> >> +#include <uapi/drm/xe_drm.h>
> >> +
> >> +enum {
> >> +	__XE_SAMPLE_RENDER_GROUP_BUSY,
> >> +	__XE_SAMPLE_COPY_GROUP_BUSY,
> >> +	__XE_SAMPLE_MEDIA_GROUP_BUSY,
> >> +	__XE_SAMPLE_ANY_ENGINE_GROUP_BUSY,
> >> +	__XE_NUM_PMU_SAMPLERS
> >
> > OK: irq_count is missing here since these are read from device...
> >
> >> +};
> >> +
> >> +struct xe_pmu_sample {
> >> +	u64 cur;
> >> +};
> >
> > This was also discussed for i915 PMU, no point having a struct with a
> > single u64 member. Might as well just use u64 wherever we are using
> > struct xe_pmu_sample.
>
> OK.
> >
> >> +
> >> +#define XE_MAX_GT_PER_TILE 2
> >
> > Why per tile? The array size should be max_gt_per_device. Just call it
> > XE_MAX_GT?
>
> I declared it similarly to what we have in drivers/gpu/drm/xe/xe_device.h

Our 2-d array size is for the device, not per tile. So let's use XE_MAX_GT.
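
I.e. something like this in xe_pmu_types.h (a sketch, also assuming we drop
the single-member struct xe_pmu_sample as discussed above):

#define XE_MAX_GT 2

	/**
	 * @sample: Current and previous (raw) counters.
	 *
	 * These counters are updated when the device is awake.
	 */
	u64 sample[XE_MAX_GT][__XE_NUM_PMU_SAMPLERS];

and then the __sample_idx/read_sample/store_sample helpers go away since
direct indexing works:

	pmu->sample[gt_id][sample_type] = val;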

> >
> >> +
> >> +struct xe_pmu {
> >> +	/**
> >> +	 * @cpuhp: Struct used for CPU hotplug handling.
> >> +	 */
> >> +	struct {
> >> +		struct hlist_node node;
> >> +		unsigned int cpu;
> >> +	} cpuhp;
> >> +	/**
> >> +	 * @base: PMU base.
> >> +	 */
> >> +	struct pmu base;
> >> +	/**
> >> +	 * @closed: xe is unregistering.
> >> +	 */
> >> +	bool closed;
> >> +	/**
> >> +	 * @name: Name as registered with perf core.
> >> +	 */
> >> +	const char *name;
> >> +	/**
> >> +	 * @lock: Lock protecting enable mask and ref count handling.
> >> +	 */
> >> +	spinlock_t lock;
> >> +	/**
> >> +	 * @sample: Current and previous (raw) counters.
> >> +	 *
> >> +	 * These counters are updated when the device is awake.
> >> +	 *
> >> +	 */
> >> +	struct xe_pmu_sample sample[XE_MAX_GT_PER_TILE * __XE_NUM_PMU_SAMPLERS];
> >
> > Change to 2-d array. See above.
> >
> >> +	/**
> >> +	 * @irq_count: Number of interrupts
> >> +	 *
> >> +	 * Intentionally unsigned long to avoid atomics or heuristics on 32bit.
> >> +	 * 4e9 interrupts are a lot and postprocessing can really deal with an
> >> +	 * occasional wraparound easily. It's 32bit after all.
> >> +	 */
> >> +	unsigned long irq_count;
> >> +	/**
> >> +	 * @events_attr_group: Device events attribute group.
> >> +	 */
> >> +	struct attribute_group events_attr_group;
> >> +	/**
> >> +	 * @xe_attr: Memory block holding device attributes.
> >> +	 */
> >> +	void *xe_attr;
> >> +	/**
> >> +	 * @pmu_attr: Memory block holding device attributes.
> >> +	 */
> >> +	void *pmu_attr;
> >> +};
> >> +
> >> +#endif
> >> diff --git a/include/uapi/drm/xe_drm.h b/include/uapi/drm/xe_drm.h
> >> index 965cd9527ff1..ed097056f944 100644
> >> --- a/include/uapi/drm/xe_drm.h
> >> +++ b/include/uapi/drm/xe_drm.h
> >> @@ -990,6 +990,22 @@ struct drm_xe_vm_madvise {
> >>	__u64 reserved[2];
> >>  };
> >>
> >> +/* PMU event config IDs */
> >> +
> >> +/*
> >> + * Top 4 bits of every counter are GT id.
> >> + */
> >> +#define __XE_PMU_GT_SHIFT (60)
> >
> > To future-proof this, and also because we seem to have plenty of bits
> > available, I think we should change this to 56 (instead of 60).
>
> OK
>
> Thanks,
> Aravind.
> >
> >> +
> >> +#define ___XE_PMU_OTHER(gt, x) \
> >> +	(((__u64)(x)) | ((__u64)(gt) << __XE_PMU_GT_SHIFT))
> >> +
> >> +#define XE_PMU_INTERRUPTS(gt)			___XE_PMU_OTHER(gt, 0)
> >> +#define XE_PMU_RENDER_GROUP_BUSY(gt)		___XE_PMU_OTHER(gt, 1)
> >> +#define XE_PMU_COPY_GROUP_BUSY(gt)		___XE_PMU_OTHER(gt, 2)
> >> +#define XE_PMU_MEDIA_GROUP_BUSY(gt)		___XE_PMU_OTHER(gt, 3)
> >> +#define XE_PMU_ANY_ENGINE_GROUP_BUSY(gt)	___XE_PMU_OTHER(gt, 4)
> >> +
> >>  #if defined(__cplusplus)
> >>  }
> >>  #endif
> >> --
> >> 2.25.1

Thanks.
--
Ashutosh

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [Intel-xe] [PATCH v2 2/2] drm/xe/pmu: Enable PMU interface
  2023-07-21 23:36       ` Dixit, Ashutosh
@ 2023-07-22  6:04         ` Dixit, Ashutosh
  2023-07-24  8:03           ` Iddamsetty, Aravind
  2023-07-24  9:38           ` Iddamsetty, Aravind
  0 siblings, 2 replies; 59+ messages in thread
From: Dixit, Ashutosh @ 2023-07-22  6:04 UTC (permalink / raw)
  To: Iddamsetty, Aravind; +Cc: Bommu Krishnaiah, intel-xe, Tvrtko Ursulin

On Fri, 21 Jul 2023 16:36:02 -0700, Dixit, Ashutosh wrote:
>
> On Fri, 21 Jul 2023 04:51:09 -0700, Iddamsetty, Aravind wrote:
> >
> Hi Aravind,
>
> > On 21-07-2023 06:32, Dixit, Ashutosh wrote:
> > > On Tue, 27 Jun 2023 05:21:13 -0700, Aravind Iddamsetty wrote:
> > >>
> > > More stuff to mull over. You can ignore comments starting with "OK", those
> > > are just notes to myself.
> > >
> > > Also, maybe some time we can add a basic IGT which reads these exposed
> > > counters and verifies that we can read them and they are monotonically
> > > increasing?
> >
> > this is the IGT https://patchwork.freedesktop.org/series/119936/ series
> > using these counters posted by Venkat.
> >
> > >
> > >> There are a set of engine group busyness counters provided by HW which are
> > >> perfect fit to be exposed via PMU perf events.
> > >>
> > >> BSPEC: 46559, 46560, 46722, 46729
> > >
> > > Also add these Bspec entries: 71028, 52071
> >
> > OK.
> >
> > >
> > >>
> > >> events can be listed using:
> > >> perf list
> > >>   xe_0000_03_00.0/any-engine-group-busy-gt0/         [Kernel PMU event]
> > >>   xe_0000_03_00.0/copy-group-busy-gt0/               [Kernel PMU event]
> > >>   xe_0000_03_00.0/interrupts/                        [Kernel PMU event]
> > >>   xe_0000_03_00.0/media-group-busy-gt0/              [Kernel PMU event]
> > >>   xe_0000_03_00.0/render-group-busy-gt0/             [Kernel PMU event]
> > >>
> > >> and can be read using:
> > >>
> > >> perf stat -e "xe_0000_8c_00.0/render-group-busy-gt0/" -I 1000
> > >>            time             counts unit events
> > >>      1.001139062                  0 ns  xe_0000_8c_00.0/render-group-busy-gt0/
> > >>      2.003294678                  0 ns  xe_0000_8c_00.0/render-group-busy-gt0/
> > >>      3.005199582                  0 ns  xe_0000_8c_00.0/render-group-busy-gt0/
> > >>      4.007076497                  0 ns  xe_0000_8c_00.0/render-group-busy-gt0/
> > >>      5.008553068                  0 ns  xe_0000_8c_00.0/render-group-busy-gt0/
> > >>      6.010531563              43520 ns  xe_0000_8c_00.0/render-group-busy-gt0/
> > >>      7.012468029              44800 ns  xe_0000_8c_00.0/render-group-busy-gt0/
> > >>      8.013463515                  0 ns  xe_0000_8c_00.0/render-group-busy-gt0/
> > >>      9.015300183                  0 ns  xe_0000_8c_00.0/render-group-busy-gt0/
> > >>     10.017233010                  0 ns  xe_0000_8c_00.0/render-group-busy-gt0/
> > >>     10.971934120                  0 ns  xe_0000_8c_00.0/render-group-busy-gt0/
> > >>
> > >> The pmu base implementation is taken from i915.
> > >>
> > >> v2:
> > >> Store last known value when device is awake return that while the GT is
> > >> suspended and then update the driver copy when read during awake.
> > >>
> > >> Co-developed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> > >> Co-developed-by: Bommu Krishnaiah <krishnaiah.bommu@intel.com>
> > >> Signed-off-by: Aravind Iddamsetty <aravind.iddamsetty@intel.com>
> > >> ---
> > >>  drivers/gpu/drm/xe/Makefile          |   2 +
> > >>  drivers/gpu/drm/xe/regs/xe_gt_regs.h |   5 +
> > >>  drivers/gpu/drm/xe/xe_device.c       |   2 +
> > >>  drivers/gpu/drm/xe/xe_device_types.h |   4 +
> > >>  drivers/gpu/drm/xe/xe_gt.c           |   2 +
> > >>  drivers/gpu/drm/xe/xe_irq.c          |  22 +
> > >>  drivers/gpu/drm/xe/xe_module.c       |   5 +
> > >>  drivers/gpu/drm/xe/xe_pmu.c          | 739 +++++++++++++++++++++++++++
> > >>  drivers/gpu/drm/xe/xe_pmu.h          |  25 +
> > >>  drivers/gpu/drm/xe/xe_pmu_types.h    |  80 +++
> > >>  include/uapi/drm/xe_drm.h            |  16 +
> > >>  11 files changed, 902 insertions(+)
> > >>  create mode 100644 drivers/gpu/drm/xe/xe_pmu.c
> > >>  create mode 100644 drivers/gpu/drm/xe/xe_pmu.h
> > >>  create mode 100644 drivers/gpu/drm/xe/xe_pmu_types.h
> > >>
> > >> diff --git a/drivers/gpu/drm/xe/Makefile b/drivers/gpu/drm/xe/Makefile
> > >> index 081c57fd8632..e52ab795c566 100644
> > >> --- a/drivers/gpu/drm/xe/Makefile
> > >> +++ b/drivers/gpu/drm/xe/Makefile
> > >> @@ -217,6 +217,8 @@ xe-$(CONFIG_DRM_XE_DISPLAY) += \
> > >>	i915-display/skl_universal_plane.o \
> > >>	i915-display/skl_watermark.o
> > >>
> > >> +xe-$(CONFIG_PERF_EVENTS) += xe_pmu.o
> > >> +
> > >>  ifeq ($(CONFIG_ACPI),y)
> > >>	xe-$(CONFIG_DRM_XE_DISPLAY) += \
> > >>		i915-display/intel_acpi.o \
> > >> diff --git a/drivers/gpu/drm/xe/regs/xe_gt_regs.h b/drivers/gpu/drm/xe/regs/xe_gt_regs.h
> > >> index 3f664011eaea..c7d9e4634745 100644
> > >> --- a/drivers/gpu/drm/xe/regs/xe_gt_regs.h
> > >> +++ b/drivers/gpu/drm/xe/regs/xe_gt_regs.h
> > >> @@ -285,6 +285,11 @@
> > >>  #define   INVALIDATION_BROADCAST_MODE_DIS	REG_BIT(12)
> > >>  #define   GLOBAL_INVALIDATION_MODE		REG_BIT(2)
> > >>
> > >> +#define XE_OAG_RC0_ANY_ENGINE_BUSY_FREE		XE_REG(0xdb80)
> > >> +#define XE_OAG_ANY_MEDIA_FF_BUSY_FREE		XE_REG(0xdba0)
> > >> +#define XE_OAG_BLT_BUSY_FREE			XE_REG(0xdbbc)
> > >> +#define XE_OAG_RENDER_BUSY_FREE			XE_REG(0xdbdc)
> > >> +
> > >>  #define SAMPLER_MODE				XE_REG_MCR(0xe18c, XE_REG_OPTION_MASKED)
> > >>  #define   ENABLE_SMALLPL			REG_BIT(15)
> > >>  #define   SC_DISABLE_POWER_OPTIMIZATION_EBB	REG_BIT(9)
> > >> diff --git a/drivers/gpu/drm/xe/xe_device.c b/drivers/gpu/drm/xe/xe_device.c
> > >> index c7985af85a53..b2c7bd4a97d9 100644
> > >> --- a/drivers/gpu/drm/xe/xe_device.c
> > >> +++ b/drivers/gpu/drm/xe/xe_device.c
> > >> @@ -328,6 +328,8 @@ int xe_device_probe(struct xe_device *xe)
> > >>
> > >>	xe_debugfs_register(xe);
> > >>
> > >> +	xe_pmu_register(&xe->pmu);
> > >> +
> > >>	err = drmm_add_action_or_reset(&xe->drm, xe_device_sanitize, xe);
> > >>	if (err)
> > >>		return err;
> > >> diff --git a/drivers/gpu/drm/xe/xe_device_types.h b/drivers/gpu/drm/xe/xe_device_types.h
> > >> index 0226d44a6af2..3ba99aae92b9 100644
> > >> --- a/drivers/gpu/drm/xe/xe_device_types.h
> > >> +++ b/drivers/gpu/drm/xe/xe_device_types.h
> > >> @@ -15,6 +15,7 @@
> > >>  #include "xe_devcoredump_types.h"
> > >>  #include "xe_gt_types.h"
> > >>  #include "xe_platform_types.h"
> > >> +#include "xe_pmu.h"
> > >>  #include "xe_step_types.h"
> > >>
> > >>  #if IS_ENABLED(CONFIG_DRM_XE_DISPLAY)
> > >> @@ -332,6 +333,9 @@ struct xe_device {
> > >>	/** @d3cold_allowed: Indicates if d3cold is a valid device state */
> > >>	bool d3cold_allowed;
> > >>
> > >> +	/* @pmu: performance monitoring unit */
> > >> +	struct xe_pmu pmu;
> > >> +
> > >
> > > Now I am wondering why we don't make the PMU per-gt (or per-tile)? Per-gt
> > > would work for these busyness registers and I am not sure how the
> > > interrupts are hooked up.
> > >
> > > In i915 the PMU being device level helped in having a single timer (rather
> > > than a per gt timer).
> > >
> > > Anyway probably not much practical benefit in making it per-gt/per-tile, so
> > > maybe leave as is. Just thinking out loud.
> >
> > we are able to expose per-gt counters, so I do not see any benefit in
> > making the PMU per-gt. Also, I believe struct pmu is per device; it has
> > a type associated with it which is unique for a device.
>
> PMU can be made per gt and named xe-gt0, xe-gt1 etc. if we want. But anyway
> leave as is.
>
> >
> > >
> > >>	/* private: */
> > >>
> > >>  #if IS_ENABLED(CONFIG_DRM_XE_DISPLAY)
> > >> diff --git a/drivers/gpu/drm/xe/xe_gt.c b/drivers/gpu/drm/xe/xe_gt.c
> > >> index 2458397ce8af..96e3720923d4 100644
> > >> --- a/drivers/gpu/drm/xe/xe_gt.c
> > >> +++ b/drivers/gpu/drm/xe/xe_gt.c
> > >> @@ -593,6 +593,8 @@ int xe_gt_suspend(struct xe_gt *gt)
> > >>	if (err)
> > >>		goto err_msg;
> > >>
> > >> +	engine_group_busyness_store(gt);
> > >
> > > Not sure I follow the reason for doing this at suspend time? If PMU was
> > > active there should be a previous value. Why must we take a new sample
> > > explicitly here?
> >
> > the PMU interface can be read even when the device is suspended, and in
> > such cases we should not wake up the device. If we try to read the
> > register while the device is suspended it gives a spurious counter; you
> > can check in version #1 of this series that we were getting huge values,
> > as we put the device to suspend immediately after probe. So we store the
> > last known good value before suspend.
>
> No need, see comment at init_samples below.

Sorry, you are right; I changed my mind about this and I see your point. See
more discussion on this at init_samples below.

>
> >
> > >
> > > To me looks like engine_group_busyness_store should not be needed, see
> > > comment below for init_samples too.
> > >
> > >> +
> > >>	err = xe_uc_suspend(&gt->uc);
> > >>	if (err)
> > >>		goto err_force_wake;
> > >> diff --git a/drivers/gpu/drm/xe/xe_irq.c b/drivers/gpu/drm/xe/xe_irq.c
> > >> index b4ed1e4a3388..cb943fb94ec7 100644
> > >> --- a/drivers/gpu/drm/xe/xe_irq.c
> > >> +++ b/drivers/gpu/drm/xe/xe_irq.c
> > >> @@ -27,6 +27,24 @@
> > >>  #define IIR(offset)				XE_REG(offset + 0x8)
> > >>  #define IER(offset)				XE_REG(offset + 0xc)
> > >>
> > >> +/*
> > >> + * Interrupt statistic for PMU. Increments the counter only if the
> > >> + * interrupt originated from the GPU so interrupts from a device which
> > >> + * shares the interrupt line are not accounted.
> > >> + */
> > >> +static inline void xe_pmu_irq_stats(struct xe_device *xe,
> > >
> > > No inline, compiler will do it, but looks like we may want to always_inline
> > > this. Also this function should really be in xe_pmu.h? Anyway probably
> > > leave as is.
> > >
> > >> +				    irqreturn_t res)
> > >> +{
> > >> +	if (unlikely(res != IRQ_HANDLED))
> > >> +		return;
> > >> +
> > >> +	/*
> > >> +	 * A clever compiler translates that into INC. A not so clever one
> > >> +	 * should at least prevent store tearing.
> > >> +	 */
> > >> +	WRITE_ONCE(xe->pmu.irq_count, xe->pmu.irq_count + 1);
> > >> +}
> > >> +
> > >>  static void assert_iir_is_zero(struct xe_gt *mmio, struct xe_reg reg)
> > >>  {
> > >>	u32 val = xe_mmio_read32(mmio, reg);
> > >> @@ -325,6 +343,8 @@ static irqreturn_t xelp_irq_handler(int irq, void *arg)
> > >>
> > >>	xe_display_irq_enable(xe, gu_misc_iir);
> > >>
> > >> +	xe_pmu_irq_stats(xe, IRQ_HANDLED);
> > >> +
> > >>	return IRQ_HANDLED;
> > >>  }
> > >>
> > >> @@ -414,6 +434,8 @@ static irqreturn_t dg1_irq_handler(int irq, void *arg)
> > >>	dg1_intr_enable(xe, false);
> > >>	xe_display_irq_enable(xe, gu_misc_iir);
> > >>
> > >> +	xe_pmu_irq_stats(xe, IRQ_HANDLED);
> > >> +
> > >>	return IRQ_HANDLED;
> > >>  }
> > >>
> > >> diff --git a/drivers/gpu/drm/xe/xe_module.c b/drivers/gpu/drm/xe/xe_module.c
> > >> index 75e5be939f53..f6fe89748525 100644
> > >> --- a/drivers/gpu/drm/xe/xe_module.c
> > >> +++ b/drivers/gpu/drm/xe/xe_module.c
> > >> @@ -12,6 +12,7 @@
> > >>  #include "xe_hw_fence.h"
> > >>  #include "xe_module.h"
> > >>  #include "xe_pci.h"
> > >> +#include "xe_pmu.h"
> > >>  #include "xe_sched_job.h"
> > >>
> > >>  bool enable_guc = true;
> > >> @@ -49,6 +50,10 @@ static const struct init_funcs init_funcs[] = {
> > >>		.init = xe_sched_job_module_init,
> > >>		.exit = xe_sched_job_module_exit,
> > >>	},
> > >> +	{
> > >> +		.init = xe_pmu_init,
> > >> +		.exit = xe_pmu_exit,
> > >> +	},
> > >>	{
> > >>		.init = xe_register_pci_driver,
> > >>		.exit = xe_unregister_pci_driver,
> > >> diff --git a/drivers/gpu/drm/xe/xe_pmu.c b/drivers/gpu/drm/xe/xe_pmu.c
> > >> new file mode 100644
> > >> index 000000000000..bef1895be9f7
> > >> --- /dev/null
> > >> +++ b/drivers/gpu/drm/xe/xe_pmu.c
> > >> @@ -0,0 +1,739 @@
> > >> +/*
> > >> + * SPDX-License-Identifier: MIT
> > >> + *
> > >> + * Copyright © 2023 Intel Corporation
> > >> + */
> > >
> > > This SPDX header is for .h files not .c files. Actually, it is for neither :/
> >
> > But I see this in almost all the files in xe.
>
> Look closely.
>
> > >
> > >> +
> > >> +#include <drm/drm_drv.h>
> > >> +#include <drm/drm_managed.h>
> > >> +#include <drm/xe_drm.h>
> > >> +
> > >> +#include "regs/xe_gt_regs.h"
> > >> +#include "xe_device.h"
> > >> +#include "xe_gt_clock.h"
> > >> +#include "xe_mmio.h"
> > >> +
> > >> +static cpumask_t xe_pmu_cpumask;
> > >> +static unsigned int xe_pmu_target_cpu = -1;
> > >> +
> > >> +static unsigned int config_gt_id(const u64 config)
> > >> +{
> > >> +	return config >> __XE_PMU_GT_SHIFT;
> > >> +}
> > >> +
> > >> +static u64 config_counter(const u64 config)
> > >> +{
> > >> +	return config & ~(~0ULL << __XE_PMU_GT_SHIFT);
> > >> +}
> > >> +
> > >> +static unsigned int
> > >> +__sample_idx(struct xe_pmu *pmu, unsigned int gt_id, int sample)
> > >> +{
> > >> +	unsigned int idx = gt_id * __XE_NUM_PMU_SAMPLERS + sample;
> > >> +
> > >> +	XE_BUG_ON(idx >= ARRAY_SIZE(pmu->sample));
> > >> +
> > >> +	return idx;
> > >> +}
> > >
> > > The compiler does this for us if we make sample array 2-d.
> > >
> > >> +
> > >> +static u64 read_sample(struct xe_pmu *pmu, unsigned int gt_id, int sample)
> > >> +{
> > >> +	return pmu->sample[__sample_idx(pmu, gt_id, sample)].cur;
> > >> +}
> > >> +
> > >> +static void
> > >> +store_sample(struct xe_pmu *pmu, unsigned int gt_id, int sample, u64 val)
> > >> +{
> > >> +	pmu->sample[__sample_idx(pmu, gt_id, sample)].cur = val;
> > >> +}
> > >
> > > The three functions above are not needed if we make the sample array
> > > 2-d. See here:
> > >
> > > https://patchwork.freedesktop.org/patch/538887/?series=118225&rev=1
> > >
> > > Only a part of the patch above was merged (see 8ed0753b527dc) to keep the
> > > patch size small, but since for xe we are starting from scratch we can
> > > implement the whole approach above (get rid of the read/store helpers
> > > entirely, direct assignment without the helpers works).
> >
> > Ok I'll go over it.
> >
> > >
> > >> +
> > >> +static int engine_busyness_sample_type(u64 config)
> > >> +{
> > >> +	int type = 0;
> > >> +
> > >> +	switch (config) {
> > >> +	case XE_PMU_RENDER_GROUP_BUSY(0):
> > >> +		type =  __XE_SAMPLE_RENDER_GROUP_BUSY;
> > >> +		break;
> > >> +	case XE_PMU_COPY_GROUP_BUSY(0):
> > >> +		type = __XE_SAMPLE_COPY_GROUP_BUSY;
> > >> +		break;
> > >> +	case XE_PMU_MEDIA_GROUP_BUSY(0):
> > >> +		type = __XE_SAMPLE_MEDIA_GROUP_BUSY;
> > >> +		break;
> > >> +	case XE_PMU_ANY_ENGINE_GROUP_BUSY(0):
> > >> +		type = __XE_SAMPLE_ANY_ENGINE_GROUP_BUSY;
> > >> +		break;
> > >> +	}
> > >> +
> > >> +	return type;
> > >> +}
> > >> +
> > >> +static void xe_pmu_event_destroy(struct perf_event *event)
> > >> +{
> > >> +	struct xe_device *xe =
> > >> +		container_of(event->pmu, typeof(*xe), pmu.base);
> > >> +
> > >> +	drm_WARN_ON(&xe->drm, event->parent);
> > >> +
> > >> +	drm_dev_put(&xe->drm);
> > >> +}
> > >> +
> > >> +static u64 __engine_group_busyness_read(struct xe_gt *gt, u64 config)
> > >> +{
> > >> +	u64 val = 0;
> > >> +
> > >> +	switch (config) {
> > >> +	case XE_PMU_RENDER_GROUP_BUSY(0):
> > >> +		val = xe_mmio_read32(gt, XE_OAG_RENDER_BUSY_FREE);
> > >> +		break;
> > >> +	case XE_PMU_COPY_GROUP_BUSY(0):
> > >> +		val = xe_mmio_read32(gt, XE_OAG_BLT_BUSY_FREE);
> > >> +		break;
> > >> +	case XE_PMU_MEDIA_GROUP_BUSY(0):
> > >> +		val = xe_mmio_read32(gt, XE_OAG_ANY_MEDIA_FF_BUSY_FREE);
> > >> +		break;
> > >> +	case XE_PMU_ANY_ENGINE_GROUP_BUSY(0):
> > >> +		val = xe_mmio_read32(gt, XE_OAG_RC0_ANY_ENGINE_BUSY_FREE);
> > >> +		break;
> > >> +	default:
> > >> +		drm_warn(&gt->tile->xe->drm, "unknown pmu event\n");
> > >> +	}
> > >
> > > We need xe_device_mem_access_get, also xe_force_wake_get if applicable,
> > > somewhere before reading these registers.
> > >
> > >> +
> > >> +	return xe_gt_clock_interval_to_ns(gt, val * 16);
> > >> +}
> > >> +
> > >> +static u64 engine_group_busyness_read(struct xe_gt *gt, u64 config)
> > >> +{
> > >> +	int sample_type = engine_busyness_sample_type(config);
> > >> +	struct xe_device *xe = gt->tile->xe;
> > >> +	const unsigned int gt_id = gt->info.id;
> > >> +	struct xe_pmu *pmu = &xe->pmu;
> > >> +	bool device_awake;
> > >> +	unsigned long flags;
> > >> +	u64 val;
> > >> +
> > >> +	/*
> > >> +	 * found no better way to check if device is awake or not. Before
> > >
> > > xe_device_mem_access_get_if_ongoing (hard to find name).
> >
> > thanks for sharing looks to be added recently. If we use this we needn't
> > use xe_device_mem_access_get.
>
> See comment at init_samples.
>
> >
> > >
> > >> +	 * we suspend we set the submission_state.enabled to false.
> > >> +	 */
> > >> +	device_awake = gt->uc.guc.submission_state.enabled ? true : false;
> > >> +	if (device_awake)
> > >> +		val = __engine_group_busyness_read(gt, config);
> > >> +
> > >> +	spin_lock_irqsave(&pmu->lock, flags);
> > >> +
> > >> +	if (device_awake)
> > >> +		store_sample(pmu, gt_id, sample_type, val);
> > >> +	else
> > >> +		val = read_sample(pmu, gt_id, sample_type);
> > >> +
> > >> +	spin_unlock_irqrestore(&pmu->lock, flags);
> > >> +
> > >> +	return val;
> > >> +}
> > >> +
> > >> +void engine_group_busyness_store(struct xe_gt *gt)
> > >> +{
> > >> +	struct xe_pmu *pmu = &gt->tile->xe->pmu;
> > >> +	unsigned int gt_id = gt->info.id;
> > >> +	unsigned long flags;
> > >> +
> > >> +	spin_lock_irqsave(&pmu->lock, flags);
> > >> +
> > >> +	store_sample(pmu, gt_id, __XE_SAMPLE_RENDER_GROUP_BUSY,
> > >> +		     __engine_group_busyness_read(gt, XE_PMU_RENDER_GROUP_BUSY(0)));
> > >> +	store_sample(pmu, gt_id, __XE_SAMPLE_COPY_GROUP_BUSY,
> > >> +		     __engine_group_busyness_read(gt, XE_PMU_COPY_GROUP_BUSY(0)));
> > >> +	store_sample(pmu, gt_id, __XE_SAMPLE_MEDIA_GROUP_BUSY,
> > >> +		     __engine_group_busyness_read(gt, XE_PMU_MEDIA_GROUP_BUSY(0)));
> > >> +	store_sample(pmu, gt_id, __XE_SAMPLE_ANY_ENGINE_GROUP_BUSY,
> > >> +		     __engine_group_busyness_read(gt, XE_PMU_ANY_ENGINE_GROUP_BUSY(0)));

Why should we store everything here? We should store only those events
which are enabled.

Also, it would be good if the above could be done in a loop somehow. 4 is
fine, but if we add events later a loop will be nice, if possible.
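
Something along these lines maybe (just a sketch; it relies on the
__XE_SAMPLE_* enum values mapping 1:1 onto the busyness config IDs, which
start at 1 because 0 is interrupts; if that assumption changes, use a
small table instead):

void engine_group_busyness_store(struct xe_gt *gt)
{
	struct xe_pmu *pmu = &gt->tile->xe->pmu;
	unsigned int gt_id = gt->info.id;
	unsigned long flags;
	int i;

	spin_lock_irqsave(&pmu->lock, flags);

	/* Could also skip sample types whose events are not enabled. */
	for (i = 0; i < __XE_NUM_PMU_SAMPLERS; i++)
		store_sample(pmu, gt_id, i,
			     __engine_group_busyness_read(gt, ___XE_PMU_OTHER(0, i + 1)));

	spin_unlock_irqrestore(&pmu->lock, flags);
}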

> > >> +
> > >> +	spin_unlock_irqrestore(&pmu->lock, flags);
> > >> +}
> > >> +
> > >> +static int
> > >> +config_status(struct xe_device *xe, u64 config)
> > >> +{
> > >> +	unsigned int max_gt_id = xe->info.gt_count > 1 ? 1 : 0;
> > >> +	unsigned int gt_id = config_gt_id(config);
> > >> +	struct xe_gt *gt = xe_device_get_gt(xe, gt_id);
> > >> +
> > >> +	if (gt_id > max_gt_id)
> > >> +		return -ENOENT;
> > >> +
> > >> +	switch (config_counter(config)) {
> > >> +	case XE_PMU_INTERRUPTS(0):
> > >> +		if (gt_id)
> > >> +			return -ENOENT;
> > >
> > > OK: this is a global event so we say this is enabled only for gt0.
> > >
> > >> +		break;
> > >> +	case XE_PMU_RENDER_GROUP_BUSY(0):
> > >> +	case XE_PMU_COPY_GROUP_BUSY(0):
> > >> +	case XE_PMU_ANY_ENGINE_GROUP_BUSY(0):
> > >> +		if (GRAPHICS_VER(xe) < 12)
> > >
> > > Any point checking? xe only supports Gen12+.
> >
> > Hmm ya, good point, will remove this.
> > >
> > >> +			return -ENOENT;
> > >> +		break;
> > >> +	case XE_PMU_MEDIA_GROUP_BUSY(0):
> > >> +		if (MEDIA_VER(xe) >= 13 && gt->info.type != XE_GT_TYPE_MEDIA)
> > >> +			return -ENOENT;
> > >
> > > OK: this makes sense, so we will expose this event only for media gt's.
> > >
> > >> +		break;
> > >> +	default:
> > >> +		return -ENOENT;
> > >> +	}
> > >> +
> > >> +	return 0;
> > >> +}
> > >> +
> > >> +static int xe_pmu_event_init(struct perf_event *event)
> > >> +{
> > >> +	struct xe_device *xe =
> > >> +		container_of(event->pmu, typeof(*xe), pmu.base);
> > >> +	struct xe_pmu *pmu = &xe->pmu;
> > >> +	int ret;
> > >> +
> > >> +	if (pmu->closed)
> > >> +		return -ENODEV;
> > >> +
> > >> +	if (event->attr.type != event->pmu->type)
> > >> +		return -ENOENT;
> > >> +
> > >> +	/* unsupported modes and filters */
> > >> +	if (event->attr.sample_period) /* no sampling */
> > >> +		return -EINVAL;
> > >> +
> > >> +	if (has_branch_stack(event))
> > >> +		return -EOPNOTSUPP;
> > >> +
> > >> +	if (event->cpu < 0)
> > >> +		return -EINVAL;
> > >> +
> > >> +	/* only allow running on one cpu at a time */
> > >> +	if (!cpumask_test_cpu(event->cpu, &xe_pmu_cpumask))
> > >> +		return -EINVAL;
> > >> +
> > >> +	ret = config_status(xe, event->attr.config);
> > >> +	if (ret)
> > >> +		return ret;
> > >> +
> > >> +	if (!event->parent) {
> > >> +		drm_dev_get(&xe->drm);
> > >> +		event->destroy = xe_pmu_event_destroy;
> > >> +	}
> > >> +
> > >> +	return 0;
> > >> +}
> > >> +
> > >> +static u64 __xe_pmu_event_read(struct perf_event *event)
> > >> +{
> > >> +	struct xe_device *xe =
> > >> +		container_of(event->pmu, typeof(*xe), pmu.base);
> > >> +	const unsigned int gt_id = config_gt_id(event->attr.config);
> > >> +	const u64 config = config_counter(event->attr.config);
> > >> +	struct xe_gt *gt = xe_device_get_gt(xe, gt_id);
> > >> +	struct xe_pmu *pmu = &xe->pmu;
> > >> +	u64 val = 0;
> > >> +
> > >> +	switch (config) {
> > >> +	case XE_PMU_INTERRUPTS(0):
> > >> +		val = READ_ONCE(pmu->irq_count);
> > >
> > > OK: The correct way to do this READ_ONCE/WRITE_ONCE irq_count stuff would
> > > be to take pmu->lock (both while reading and writing irq_count) but that
> > > would be expensive in the interrupt handler (as the .h hints). So all we
> > > can do is what is done here.
> > >
> > >> +		break;
> > >> +	case XE_PMU_RENDER_GROUP_BUSY(0):
> > >> +	case XE_PMU_COPY_GROUP_BUSY(0):
> > >> +	case XE_PMU_ANY_ENGINE_GROUP_BUSY(0):
> > >> +	case XE_PMU_MEDIA_GROUP_BUSY(0):
> > >> +		val = engine_group_busyness_read(gt, config);
> > >> +	}
> > >> +
> > >> +	return val;
> > >> +}
> > >> +
> > >> +static void xe_pmu_event_read(struct perf_event *event)
> > >> +{
> > >> +	struct xe_device *xe =
> > >> +		container_of(event->pmu, typeof(*xe), pmu.base);
> > >> +	struct hw_perf_event *hwc = &event->hw;
> > >> +	struct xe_pmu *pmu = &xe->pmu;
> > >> +	u64 prev, new;
> > >> +
> > >> +	if (pmu->closed) {
> > >> +		event->hw.state = PERF_HES_STOPPED;
> > >> +		return;
> > >> +	}
> > >> +again:
> > >> +	prev = local64_read(&hwc->prev_count);
> > >> +	new = __xe_pmu_event_read(event);
> > >> +
> > >> +	if (local64_cmpxchg(&hwc->prev_count, prev, new) != prev)
> > >> +		goto again;
> > >> +
> > >> +	local64_add(new - prev, &event->count);
> > >> +}
> > >> +
> > >> +static void xe_pmu_enable(struct perf_event *event)
> > >> +{
> > >> +	/*
> > >> +	 * Store the current counter value so we can report the correct delta
> > >> +	 * for all listeners. Even when the event was already enabled and has
> > >> +	 * an existing non-zero value.
> > >> +	 */
> > >> +	local64_set(&event->hw.prev_count, __xe_pmu_event_read(event));
> > >> +}
> > >> +
> > >> +static void xe_pmu_event_start(struct perf_event *event, int flags)
> > >> +{
> > >> +	struct xe_device *xe =
> > >> +		container_of(event->pmu, typeof(*xe), pmu.base);
> > >> +	struct xe_pmu *pmu = &xe->pmu;
> > >> +
> > >> +	if (pmu->closed)
> > >> +		return;
> > >> +
> > >> +	xe_pmu_enable(event);
> > >> +	event->hw.state = 0;
> > >> +}
> > >> +
> > >> +static void xe_pmu_event_stop(struct perf_event *event, int flags)
> > >> +{
> > >> +	if (flags & PERF_EF_UPDATE)
> > >> +		xe_pmu_event_read(event);
> > >> +
> > >> +	event->hw.state = PERF_HES_STOPPED;
> > >> +}
> > >> +
> > >> +static int xe_pmu_event_add(struct perf_event *event, int flags)
> > >> +{
> > >> +	struct xe_device *xe =
> > >> +		container_of(event->pmu, typeof(*xe), pmu.base);
> > >> +	struct xe_pmu *pmu = &xe->pmu;
> > >> +
> > >> +	if (pmu->closed)
> > >> +		return -ENODEV;
> > >> +
> > >> +	if (flags & PERF_EF_START)
> > >> +		xe_pmu_event_start(event, flags);
> > >> +
> > >> +	return 0;
> > >> +}
> > >> +
> > >> +static void xe_pmu_event_del(struct perf_event *event, int flags)
> > >> +{
> > >> +	xe_pmu_event_stop(event, PERF_EF_UPDATE);
> > >> +}
> > >> +
> > >> +static int xe_pmu_event_event_idx(struct perf_event *event)
> > >> +{
> > >> +	return 0;
> > >> +}
> > >> +
> > >> +struct xe_str_attribute {
> > >> +	struct device_attribute attr;
> > >> +	const char *str;
> > >> +};
> > >> +
> > >> +static ssize_t xe_pmu_format_show(struct device *dev,
> > >> +				  struct device_attribute *attr, char *buf)
> > >> +{
> > >> +	struct xe_str_attribute *eattr;
> > >> +
> > >> +	eattr = container_of(attr, struct xe_str_attribute, attr);
> > >> +	return sprintf(buf, "%s\n", eattr->str);
> > >> +}
> > >> +
> > >> +#define XE_PMU_FORMAT_ATTR(_name, _config) \
> > >> +	(&((struct xe_str_attribute[]) { \
> > >> +		{ .attr = __ATTR(_name, 0444, xe_pmu_format_show, NULL), \
> > >> +		  .str = _config, } \
> > >> +	})[0].attr.attr)
> > >> +
> > >> +static struct attribute *xe_pmu_format_attrs[] = {
> > >> +	XE_PMU_FORMAT_ATTR(xe_eventid, "config:0-20"),
> > >
> > > 0-20 means 0-20 bits? Though here we probably have different number of
> > > config bits? Probably doesn't matter?
> >
> > as I understand this is not used anymore, so I will remove it.
> >
> > >
> > > The string will show up with:
> > >
> > > cat /sys/devices/xe/format/xe_eventid
> > >
> > >> +	NULL,
> > >> +};
> > >> +
> > >> +static const struct attribute_group xe_pmu_format_attr_group = {
> > >> +	.name = "format",
> > >> +	.attrs = xe_pmu_format_attrs,
> > >> +};
> > >> +
> > >> +struct xe_ext_attribute {
> > >> +	struct device_attribute attr;
> > >> +	unsigned long val;
> > >> +};
> > >> +
> > >> +static ssize_t xe_pmu_event_show(struct device *dev,
> > >> +				 struct device_attribute *attr, char *buf)
> > >> +{
> > >> +	struct xe_ext_attribute *eattr;
> > >> +
> > >> +	eattr = container_of(attr, struct xe_ext_attribute, attr);
> > >> +	return sprintf(buf, "config=0x%lx\n", eattr->val);
> > >> +}
> > >> +
> > >> +static ssize_t cpumask_show(struct device *dev,
> > >> +			    struct device_attribute *attr, char *buf)
> > >> +{
> > >> +	return cpumap_print_to_pagebuf(true, buf, &xe_pmu_cpumask);
> > >> +}
> > >> +
> > >> +static DEVICE_ATTR_RO(cpumask);
> > >> +
> > >> +static struct attribute *xe_cpumask_attrs[] = {
> > >> +	&dev_attr_cpumask.attr,
> > >> +	NULL,
> > >> +};
> > >> +
> > >> +static const struct attribute_group xe_pmu_cpumask_attr_group = {
> > >> +	.attrs = xe_cpumask_attrs,
> > >> +};
> > >> +
> > >> +#define __event(__counter, __name, __unit) \
> > >> +{ \
> > >> +	.counter = (__counter), \
> > >> +	.name = (__name), \
> > >> +	.unit = (__unit), \
> > >> +	.global = false, \
> > >> +}
> > >> +
> > >> +#define __global_event(__counter, __name, __unit) \
> > >> +{ \
> > >> +	.counter = (__counter), \
> > >> +	.name = (__name), \
> > >> +	.unit = (__unit), \
> > >> +	.global = true, \
> > >> +}
> > >> +
> > >> +static struct xe_ext_attribute *
> > >> +add_xe_attr(struct xe_ext_attribute *attr, const char *name, u64 config)
> > >> +{
> > >> +	sysfs_attr_init(&attr->attr.attr);
> > >> +	attr->attr.attr.name = name;
> > >> +	attr->attr.attr.mode = 0444;
> > >> +	attr->attr.show = xe_pmu_event_show;
> > >> +	attr->val = config;
> > >> +
> > >> +	return ++attr;
> > >> +}
> > >> +
> > >> +static struct perf_pmu_events_attr *
> > >> +add_pmu_attr(struct perf_pmu_events_attr *attr, const char *name,
> > >> +	     const char *str)
> > >> +{
> > >> +	sysfs_attr_init(&attr->attr.attr);
> > >> +	attr->attr.attr.name = name;
> > >> +	attr->attr.attr.mode = 0444;
> > >> +	attr->attr.show = perf_event_sysfs_show;
> > >> +	attr->event_str = str;
> > >> +
> > >> +	return ++attr;
> > >> +}
> > >> +
> > >> +static struct attribute **
> > >> +create_event_attributes(struct xe_pmu *pmu)
> > >> +{
> > >> +	struct xe_device *xe = container_of(pmu, typeof(*xe), pmu);
> > >> +	static const struct {
> > >> +		unsigned int counter;
> > >> +		const char *name;
> > >> +		const char *unit;
> > >> +		bool global;
> > >> +	} events[] = {
> > >> +		__global_event(0, "interrupts", NULL),
> > >> +		__event(1, "render-group-busy", "ns"),
> > >> +		__event(2, "copy-group-busy", "ns"),
> > >> +		__event(3, "media-group-busy", "ns"),
> > >> +		__event(4, "any-engine-group-busy", "ns"),
> > >> +	};
> > >
> > > OK: this function is some black magic to expose stuff through
> > > PMU. Identical to i915 and seems to be working from the commit message so
> > > should be fine.
> > >
> > >> +
> > >> +	unsigned int count = 0;
> > >> +	struct perf_pmu_events_attr *pmu_attr = NULL, *pmu_iter;
> > >> +	struct xe_ext_attribute *xe_attr = NULL, *xe_iter;
> > >> +	struct attribute **attr = NULL, **attr_iter;
> > >> +	struct xe_gt *gt;
> > >> +	unsigned int i, j;
> > >> +
> > >> +	/* Count how many counters we will be exposing. */
> > >> +	for_each_gt(gt, xe, j) {
> > >> +		for (i = 0; i < ARRAY_SIZE(events); i++) {
> > >> +			u64 config = ___XE_PMU_OTHER(j, events[i].counter);
> > >> +
> > >> +			if (!config_status(xe, config))
> > >> +				count++;
> > >> +		}
> > >> +	}
> > >> +
> > >> +	/* Allocate attribute objects and table. */
> > >> +	xe_attr = kcalloc(count, sizeof(*xe_attr), GFP_KERNEL);
> > >> +	if (!xe_attr)
> > >> +		goto err_alloc;
> > >> +
> > >> +	pmu_attr = kcalloc(count, sizeof(*pmu_attr), GFP_KERNEL);
> > >> +	if (!pmu_attr)
> > >> +		goto err_alloc;
> > >> +
> > >> +	/* Max one pointer of each attribute type plus a termination entry. */
> > >> +	attr = kcalloc(count * 2 + 1, sizeof(*attr), GFP_KERNEL);
> > >> +	if (!attr)
> > >> +		goto err_alloc;
> > >> +
> > >> +	xe_iter = xe_attr;
> > >> +	pmu_iter = pmu_attr;
> > >> +	attr_iter = attr;
> > >> +
> > >> +	for_each_gt(gt, xe, j) {
> > >> +		for (i = 0; i < ARRAY_SIZE(events); i++) {
> > >> +			u64 config = ___XE_PMU_OTHER(j, events[i].counter);
> > >> +			char *str;
> > >> +
> > >> +			if (config_status(xe, config))
> > >> +				continue;
> > >> +
> > >> +			if (events[i].global)
> > >> +				str = kstrdup(events[i].name, GFP_KERNEL);
> > >> +			else
> > >> +				str = kasprintf(GFP_KERNEL, "%s-gt%u",
> > >> +						events[i].name, j);
> > >> +			if (!str)
> > >> +				goto err;
> > >> +
> > >> +			*attr_iter++ = &xe_iter->attr.attr;
> > >> +			xe_iter = add_xe_attr(xe_iter, str, config);
> > >> +
> > >> +			if (events[i].unit) {
> > >> +				if (events[i].global)
> > >> +					str = kasprintf(GFP_KERNEL, "%s.unit",
> > >> +							events[i].name);
> > >> +				else
> > >> +					str = kasprintf(GFP_KERNEL, "%s-gt%u.unit",
> > >> +							events[i].name, j);
> > >> +				if (!str)
> > >> +					goto err;
> > >> +
> > >> +				*attr_iter++ = &pmu_iter->attr.attr;
> > >> +				pmu_iter = add_pmu_attr(pmu_iter, str,
> > >> +							events[i].unit);
> > >> +			}
> > >> +		}
> > >> +	}
> > >> +
> > >> +	pmu->xe_attr = xe_attr;
> > >> +	pmu->pmu_attr = pmu_attr;
> > >> +
> > >> +	return attr;
> > >> +
> > >> +err:
> > >> +	for (attr_iter = attr; *attr_iter; attr_iter++)
> > >> +		kfree((*attr_iter)->name);
> > >> +
> > >> +err_alloc:
> > >> +	kfree(attr);
> > >> +	kfree(xe_attr);
> > >> +	kfree(pmu_attr);
> > >> +
> > >> +	return NULL;
> > >> +}
> > >> +
> > >> +static void free_event_attributes(struct xe_pmu *pmu)
> > >> +{
> > >> +	struct attribute **attr_iter = pmu->events_attr_group.attrs;
> > >> +
> > >> +	for (; *attr_iter; attr_iter++)
> > >> +		kfree((*attr_iter)->name);
> > >> +
> > >> +	kfree(pmu->events_attr_group.attrs);
> > >> +	kfree(pmu->xe_attr);
> > >> +	kfree(pmu->pmu_attr);
> > >> +
> > >> +	pmu->events_attr_group.attrs = NULL;
> > >> +	pmu->xe_attr = NULL;
> > >> +	pmu->pmu_attr = NULL;
> > >> +}
> > >> +
> > >> +static int xe_pmu_cpu_online(unsigned int cpu, struct hlist_node *node)
> > >> +{
> > >> +	struct xe_pmu *pmu = hlist_entry_safe(node, typeof(*pmu), cpuhp.node);
> > >> +
> > >> +	XE_BUG_ON(!pmu->base.event_init);
> > >> +
> > >> +	/* Select the first online CPU as a designated reader. */
> > >> +	if (cpumask_empty(&xe_pmu_cpumask))
> > >> +		cpumask_set_cpu(cpu, &xe_pmu_cpumask);
> > >> +
> > >> +	return 0;
> > >> +}
> > >> +
> > >> +static int xe_pmu_cpu_offline(unsigned int cpu, struct hlist_node *node)
> > >> +{
> > >> +	struct xe_pmu *pmu = hlist_entry_safe(node, typeof(*pmu), cpuhp.node);
> > >> +	unsigned int target = xe_pmu_target_cpu;
> > >> +
> > >> +	XE_BUG_ON(!pmu->base.event_init);
> > >> +
> > >> +	/*
> > >> +	 * Unregistering an instance generates a CPU offline event which we must
> > >> +	 * ignore to avoid incorrectly modifying the shared xe_pmu_cpumask.
> > >> +	 */
> > >> +	if (pmu->closed)
> > >> +		return 0;
> > >> +
> > >> +	if (cpumask_test_and_clear_cpu(cpu, &xe_pmu_cpumask)) {
> > >> +		target = cpumask_any_but(topology_sibling_cpumask(cpu), cpu);
> > >> +
> > >> +		/* Migrate events if there is a valid target */
> > >> +		if (target < nr_cpu_ids) {
> > >> +			cpumask_set_cpu(target, &xe_pmu_cpumask);
> > >> +			xe_pmu_target_cpu = target;
> > >> +		}
> > >> +	}
> > >> +
> > >> +	if (target < nr_cpu_ids && target != pmu->cpuhp.cpu) {
> > >> +		perf_pmu_migrate_context(&pmu->base, cpu, target);
> > >> +		pmu->cpuhp.cpu = target;
> > >> +	}
> > >> +
> > >> +	return 0;
> > >> +}
> > >> +
> > >> +static enum cpuhp_state cpuhp_slot = CPUHP_INVALID;
> > >> +
> > >> +int xe_pmu_init(void)
> > >> +{
> > >> +	int ret;
> > >> +
> > >> +	ret = cpuhp_setup_state_multi(CPUHP_AP_ONLINE_DYN,
> > >> +				      "perf/x86/intel/xe:online",
> > >> +				      xe_pmu_cpu_online,
> > >> +				      xe_pmu_cpu_offline);
> > >> +	if (ret < 0)
> > >> +		pr_notice("Failed to setup cpuhp state for xe PMU! (%d)\n",
> > >> +			  ret);
> > >> +	else
> > >> +		cpuhp_slot = ret;
> > >> +
> > >> +	return 0;
> > >> +}
> > >> +
> > >> +void xe_pmu_exit(void)
> > >> +{
> > >> +	if (cpuhp_slot != CPUHP_INVALID)
> > >> +		cpuhp_remove_multi_state(cpuhp_slot);
> > >> +}
> > >> +
> > >> +static int xe_pmu_register_cpuhp_state(struct xe_pmu *pmu)
> > >> +{
> > >> +	if (cpuhp_slot == CPUHP_INVALID)
> > >> +		return -EINVAL;
> > >> +
> > >> +	return cpuhp_state_add_instance(cpuhp_slot, &pmu->cpuhp.node);
> > >> +}
> > >> +
> > >> +static void xe_pmu_unregister_cpuhp_state(struct xe_pmu *pmu)
> > >> +{
> > >> +	cpuhp_state_remove_instance(cpuhp_slot, &pmu->cpuhp.node);
> > >> +}
> > >> +
> > >> +static void xe_pmu_unregister(struct drm_device *device, void *arg)
> > >> +{
> > >> +	struct xe_pmu *pmu = arg;
> > >> +
> > >> +	if (!pmu->base.event_init)
> > >> +		return;
> > >> +
> > >> +	/*
> > >> +	 * "Disconnect" the PMU callbacks - since all are atomic synchronize_rcu
> > >> +	 * ensures all currently executing ones will have exited before we
> > >> +	 * proceed with unregistration.
> > >> +	 */
> > >> +	pmu->closed = true;
> > >> +	synchronize_rcu();
> > >> +
> > >> +	xe_pmu_unregister_cpuhp_state(pmu);
> > >> +
> > >> +	perf_pmu_unregister(&pmu->base);
> > >> +	pmu->base.event_init = NULL;
> > >> +	kfree(pmu->base.attr_groups);
> > >> +	kfree(pmu->name);
> > >> +	free_event_attributes(pmu);
> > >> +}
> > >> +
> > >> +static void init_samples(struct xe_pmu *pmu)
> > >> +{
> > >> +	struct xe_device *xe = container_of(pmu, typeof(*xe), pmu);
> > >> +	struct xe_gt *gt;
> > >> +	unsigned int i;
> > >> +
> > >> +	for_each_gt(gt, xe, i)
> > >> +		engine_group_busyness_store(gt);
> > >> +}
> > >> +
> > >> +void xe_pmu_register(struct xe_pmu *pmu)
> > >
> > > Why void, why not int? PMU failure is non fatal error?
> >
> > Ya, the device is functional, it is only that these counters are not
> > available. Hence I didn't want to fail the driver load.
> > >
> > >> +{
> > >> +	struct xe_device *xe = container_of(pmu, typeof(*xe), pmu);
> > >> +	const struct attribute_group *attr_groups[] = {
> > >> +		&xe_pmu_format_attr_group,
> > >> +		&pmu->events_attr_group,
> > >> +		&xe_pmu_cpumask_attr_group,
> > >
> > > Can someone please explain what this cpumask/cpuhotplug stuff does and
> > > whether it needs to be in this patch? There's something here:
> >
> > comments from original patch series in
> > i915: https://patchwork.kernel.org/project/intel-gfx/patch/20170802123249.14194-5-tvrtko.ursulin@linux.intel.com/
> >
> > "IIRC an uncore PMU should expose a cpumask through sysfs, and then perf
> > tools will read that mask and auto-magically limit the number of CPUs it
> > instantiates the counter on."
> >
> > and ours are global counters, not per-CPU, so we limit to just a single CPU.
> >
> > And as I understand it, we use the cpuhotplug support to migrate to a new
> > CPU in case the earlier one goes offline.
>
> OK, leave as is.
>
> >
> > >
> > > b46a33e271ed ("drm/i915/pmu: Expose a PMU interface for perf queries")
> > >
> > > I'd rather just have the basic PMU infra and the events in this patch and
> > > punt this cpumask/cpuhotplug stuff to a later patch, unless someone can say
> > > what it does.
> > >
> > > Though perf_pmu_register seems to be doing some per cpu stuff so likely
> > > this is needed. But amdgpu_pmu only has event and format attributes.
> > >
> > > Mostly leave as is I guess.
> > >
> > >> +		NULL
> > >> +	};
> > >> +
> > >> +	int ret = -ENOMEM;
> > >> +
> > >> +	spin_lock_init(&pmu->lock);
> > >> +	pmu->cpuhp.cpu = -1;
> > >> +	init_samples(pmu);
> > >
> > > Why init_samples here? Can't we init the particular sample in
> > > xe_pmu_event_init or even xe_pmu_event_start?
> > >
> > > Init'ing here may be too soon since the event might not be enabled for a
> > > long time. So really this needs to move to xe_pmu_event_init or
> > > xe_pmu_event_start.
> >
> > The device is put to suspend immediately after driver probe, and the PMU
> > is typically opened even before any workload is run, so the device is
> > essentially still suspended and we cannot access the registers. Hence we
> > store the last known value in init_samples; otherwise we see the bug from
> > v1 of the series.
> >
> > >
> > > Actually this is already happening in xe_pmu_enable. We just need to decide
> > > when we want to wake the device up and when we don't. So maybe wake the
> > > device up at start (use xe_device_mem_access_get) and not wake up during
> > > read (xe_device_mem_access_get_if_ongoing etc.)?
>
> Just going to repeat this again:
>
> xe_pmu_event_start calls xe_pmu_enable. In xe_pmu_enable use
> xe_device_mem_access_get before calling __xe_pmu_event_read. This will wake
> the device up and get a valid value in event->hw.prev_count.
>
> In xe_pmu_event_read, use xe_device_mem_access_get_if_ongoing, to read the
> event without waking the device up (and return previous value etc.).
>
> Or, pass a flag in to __xe_pmu_event_read and to engine_group_busyness_read
> and __engine_group_busyness_read. The flag will say whether or not to wake
> up the device. If the flag says wake the device up, call
> xe_device_mem_access_get and xe_force_wake_get, maybe in
> __engine_group_busyness_read, before reading device registers. If the flag
> says don't wake up the device call xe_device_mem_access_get_if_ongoing.
>
> This way we:
> * don't need to call init_samples in xe_pmu_register
> * don't need engine_group_busyness_store
> * don't need to specifically call engine_group_busyness_store in xe_gt_suspend
>
> The correct sample is read by waking up the device in xe_pmu_event_start.
>
> Hopefully this is clear now.

Actually I think it is not necessary to do anything in xe_pmu_event_start,
as long as we keep the engine_group_busyness_store in xe_gt_suspend (so we
can just use xe_device_mem_access_get_if_ongoing and don't need
xe_device_mem_access_get, as you said).

Afaics, the problem of huge values in v1 was due to reading the device
registers when the device was not awake. That problem we've already solved
in v2, where engine_group_busyness_read() only reads the registers if the
device is awake and returns the previous value when it is not. So that
fixes the problem of huge values.

The problem in this patch (I thought) is that we are effectively sampling
the registers each time perf calls xe_pmu_event_read, say every 1 sec. When
we sample, if the device is awake we will return the correct ns value; if
the device is asleep we will return the old value. But if the device wakes
up and does some work within that 1 sec period and then suspends again, we
will miss that activity. That is the problem that is being solved by
storing the samples in xe_gt_suspend().

i915 solved this problem using a 5 ms timer but I like the solution of
sampling in xe_gt_suspend better, so let's keep it.

But init_samples is not needed; afaics we don't need to do anything in
xe_pmu_register or in xe_pmu_event_start (and we don't need to use
xe_device_mem_access_get). xe_gt_suspend will take care of storing the
register values before the device suspends.
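
So the read path can stay essentially as in v2, just using the proper
helper instead of peeking at guc submission_state. Roughly (sketch only,
assuming xe_device_mem_access_get_if_ongoing() returns true, holding a
reference, only when the device is already awake):

	device_awake = xe_device_mem_access_get_if_ongoing(xe);
	if (device_awake) {
		val = __engine_group_busyness_read(gt, config);
		xe_device_mem_access_put(xe);
	}

	spin_lock_irqsave(&pmu->lock, flags);

	if (device_awake)
		store_sample(pmu, gt_id, sample_type, val);
	else
		val = read_sample(pmu, gt_id, sample_type);

	spin_unlock_irqrestore(&pmu->lock, flags);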

Hopefully it makes sense now, sorry for the confusion.

Ashutosh

>
> >
> > >
> > >> +
> > >> +	pmu->name = kasprintf(GFP_KERNEL,
> > >> +			      "xe_%s",
> > >> +			      dev_name(xe->drm.dev));
> > >> +	if (pmu->name)
> > >> +		/* tools/perf reserves colons as special. */
> > >> +		strreplace((char *)pmu->name, ':', '_');
> > >> +
> > >> +	if (!pmu->name)
> > >> +		goto err;
> > >> +
> > >> +	pmu->events_attr_group.name = "events";
> > >> +	pmu->events_attr_group.attrs = create_event_attributes(pmu);
> > >> +	if (!pmu->events_attr_group.attrs)
> > >> +		goto err_name;
> > >> +
> > >> +	pmu->base.attr_groups = kmemdup(attr_groups, sizeof(attr_groups),
> > >> +					GFP_KERNEL);
> > >> +	if (!pmu->base.attr_groups)
> > >> +		goto err_attr;
> > >> +
> > >> +	pmu->base.module	= THIS_MODULE;
> > >> +	pmu->base.task_ctx_nr	= perf_invalid_context;
> > >> +	pmu->base.event_init	= xe_pmu_event_init;
> > >> +	pmu->base.add		= xe_pmu_event_add;
> > >> +	pmu->base.del		= xe_pmu_event_del;
> > >> +	pmu->base.start		= xe_pmu_event_start;
> > >> +	pmu->base.stop		= xe_pmu_event_stop;
> > >> +	pmu->base.read		= xe_pmu_event_read;
> > >> +	pmu->base.event_idx	= xe_pmu_event_event_idx;
> > >> +
> > >> +	ret = perf_pmu_register(&pmu->base, pmu->name, -1);
> > >> +	if (ret)
> > >> +		goto err_groups;
> > >> +
> > >> +	ret = xe_pmu_register_cpuhp_state(pmu);
> > >> +	if (ret)
> > >> +		goto err_unreg;
> > >> +
> > >> +	ret = drmm_add_action_or_reset(&xe->drm, xe_pmu_unregister, pmu);
> > >> +	XE_WARN_ON(ret);
> > >
> > > We should just follow the regular error rewind here too and let
> > > drm_notice() at the end print the error. This is what other drivers calling
> > > drmm_add_action_or_reset seem to be doing.
> >
> > Ok ok.
> > >
> > >> +
> > >> +	return;
> > >> +
> > >> +err_unreg:
> > >> +	perf_pmu_unregister(&pmu->base);
> > >> +err_groups:
> > >> +	kfree(pmu->base.attr_groups);
> > >> +err_attr:
> > >> +	pmu->base.event_init = NULL;
> > >> +	free_event_attributes(pmu);
> > >> +err_name:
> > >> +	kfree(pmu->name);
> > >> +err:
> > >> +	drm_notice(&xe->drm, "Failed to register PMU!\n");
> > >> +}
> > >> diff --git a/drivers/gpu/drm/xe/xe_pmu.h b/drivers/gpu/drm/xe/xe_pmu.h
> > >> new file mode 100644
> > >> index 000000000000..d3f47f4ab343
> > >> --- /dev/null
> > >> +++ b/drivers/gpu/drm/xe/xe_pmu.h
> > >> @@ -0,0 +1,25 @@
> > >> +/* SPDX-License-Identifier: MIT */
> > >> +/*
> > >> + * Copyright © 2023 Intel Corporation
> > >> + */
> > >> +
> > >> +#ifndef _XE_PMU_H_
> > >> +#define _XE_PMU_H_
> > >> +
> > >> +#include "xe_gt_types.h"
> > >> +#include "xe_pmu_types.h"
> > >> +
> > >> +#ifdef CONFIG_PERF_EVENTS
> > >
> > > nit but maybe this should be:
> > >
> > > #if IS_ENABLED(CONFIG_PERF_EVENTS)
> > >
> > > or,
> > >
> > > #if IS_BUILTIN(CONFIG_PERF_EVENTS)
> > >
> > > Note CONFIG_PERF_EVENTS is a boolean kconfig option.
> > >
> > > See similar macro IS_REACHABLE() in i915_hwmon.h.
> > >
> > >> +int xe_pmu_init(void);
> > >> +void xe_pmu_exit(void);
> > >> +void xe_pmu_register(struct xe_pmu *pmu);
> > >> +void engine_group_busyness_store(struct xe_gt *gt);
> > >
> > > Add xe_pmu_ prefix if function is needed (hopefully not).
> >
> > OK
> > >
> > >> +#else
> > >> +static inline int xe_pmu_init(void) { return 0; }
> > >> +static inline void xe_pmu_exit(void) {}
> > >> +static inline void xe_pmu_register(struct xe_pmu *pmu) {}
> > >> +static inline void engine_group_busyness_store(struct xe_gt *gt) {}
> > >> +#endif
> > >> +
> > >> +#endif
> > >> +
> > >> diff --git a/drivers/gpu/drm/xe/xe_pmu_types.h b/drivers/gpu/drm/xe/xe_pmu_types.h
> > >> new file mode 100644
> > >> index 000000000000..e87edd4d6a87
> > >> --- /dev/null
> > >> +++ b/drivers/gpu/drm/xe/xe_pmu_types.h
> > >> @@ -0,0 +1,80 @@
> > >> +/* SPDX-License-Identifier: MIT */
> > >> +/*
> > >> + * Copyright © 2023 Intel Corporation
> > >> + */
> > >> +
> > >> +#ifndef _XE_PMU_TYPES_H_
> > >> +#define _XE_PMU_TYPES_H_
> > >> +
> > >> +#include <linux/perf_event.h>
> > >> +#include <linux/spinlock_types.h>
> > >> +#include <uapi/drm/xe_drm.h>
> > >> +
> > >> +enum {
> > >> +	__XE_SAMPLE_RENDER_GROUP_BUSY,
> > >> +	__XE_SAMPLE_COPY_GROUP_BUSY,
> > >> +	__XE_SAMPLE_MEDIA_GROUP_BUSY,
> > >> +	__XE_SAMPLE_ANY_ENGINE_GROUP_BUSY,
> > >> +	__XE_NUM_PMU_SAMPLERS
> > >
> > > OK: irq_count is missing here since these are read from device...
> > >
> > >> +};
> > >> +
> > >> +struct xe_pmu_sample {
> > >> +	u64 cur;
> > >> +};
> > >
> > > This was also discussed for i915 PMU, no point having a struct with a
> > > single u64 member. Might as well just use u64 wherever we are using
> > > struct xe_pmu_sample.
> >
> > OK.
> > >
> > >> +
> > >> +#define XE_MAX_GT_PER_TILE 2
> > >
> > > Why per tile? The array size should be max_gt_per_device. Just call it
> > > XE_MAX_GT?
> >
> > I declared it similarly to what we have in drivers/gpu/drm/xe/xe_device.h
>
> Our 2-d array size is for the device, not per tile. So let's use XE_MAX_GT.
>
> > >
> > >> +
> > >> +struct xe_pmu {
> > >> +	/**
> > >> +	 * @cpuhp: Struct used for CPU hotplug handling.
> > >> +	 */
> > >> +	struct {
> > >> +		struct hlist_node node;
> > >> +		unsigned int cpu;
> > >> +	} cpuhp;
> > >> +	/**
> > >> +	 * @base: PMU base.
> > >> +	 */
> > >> +	struct pmu base;
> > >> +	/**
> > >> +	 * @closed: xe is unregistering.
> > >> +	 */
> > >> +	bool closed;
> > >> +	/**
> > >> +	 * @name: Name as registered with perf core.
> > >> +	 */
> > >> +	const char *name;
> > >> +	/**
> > >> +	 * @lock: Lock protecting enable mask and ref count handling.
> > >> +	 */
> > >> +	spinlock_t lock;
> > >> +	/**
> > >> +	 * @sample: Current and previous (raw) counters.
> > >> +	 *
> > >> +	 * These counters are updated when the device is awake.
> > >> +	 *
> > >> +	 */
> > >> +	struct xe_pmu_sample sample[XE_MAX_GT_PER_TILE * __XE_NUM_PMU_SAMPLERS];
> > >
> > > Change to 2-d array. See above.
> > >
> > >> +	/**
> > >> +	 * @irq_count: Number of interrupts
> > >> +	 *
> > >> +	 * Intentionally unsigned long to avoid atomics or heuristics on 32bit.
> > >> +	 * 4e9 interrupts are a lot and postprocessing can really deal with an
> > >> +	 * occasional wraparound easily. It's 32bit after all.
> > >> +	 */
> > >> +	unsigned long irq_count;
> > >> +	/**
> > >> +	 * @events_attr_group: Device events attribute group.
> > >> +	 */
> > >> +	struct attribute_group events_attr_group;
> > >> +	/**
> > >> +	 * @xe_attr: Memory block holding device attributes.
> > >> +	 */
> > >> +	void *xe_attr;
> > >> +	/**
> > >> +	 * @pmu_attr: Memory block holding device attributes.
> > >> +	 */
> > >> +	void *pmu_attr;
> > >> +};
> > >> +
> > >> +#endif
> > >> diff --git a/include/uapi/drm/xe_drm.h b/include/uapi/drm/xe_drm.h
> > >> index 965cd9527ff1..ed097056f944 100644
> > >> --- a/include/uapi/drm/xe_drm.h
> > >> +++ b/include/uapi/drm/xe_drm.h
> > >> @@ -990,6 +990,22 @@ struct drm_xe_vm_madvise {
> > >>	__u64 reserved[2];
> > >>  };
> > >>
> > >> +/* PMU event config IDs */
> > >> +
> > >> +/*
> > >> + * Top 4 bits of every counter are GT id.
> > >> + */
> > >> +#define __XE_PMU_GT_SHIFT (60)
> > >
> > > To future-proof this, and also because we seem to have plenty of bits
> > > available, I think we should change this to 56 (instead of 60).
> >
> > OK
> >
> > Thanks,
> > Aravind.
> > >
> > >> +
> > >> +#define ___XE_PMU_OTHER(gt, x) \
> > >> +	(((__u64)(x)) | ((__u64)(gt) << __XE_PMU_GT_SHIFT))
> > >> +
> > >> +#define XE_PMU_INTERRUPTS(gt)			___XE_PMU_OTHER(gt, 0)
> > >> +#define XE_PMU_RENDER_GROUP_BUSY(gt)		___XE_PMU_OTHER(gt, 1)
> > >> +#define XE_PMU_COPY_GROUP_BUSY(gt)		___XE_PMU_OTHER(gt, 2)
> > >> +#define XE_PMU_MEDIA_GROUP_BUSY(gt)		___XE_PMU_OTHER(gt, 3)
> > >> +#define XE_PMU_ANY_ENGINE_GROUP_BUSY(gt)	___XE_PMU_OTHER(gt, 4)
> > >> +
> > >>  #if defined(__cplusplus)
> > >>  }
> > >>  #endif
> > >> --
> > >> 2.25.1
>
> Thanks.
> --
> Ashutosh

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [Intel-xe] [PATCH v2 2/2] drm/xe/pmu: Enable PMU interface
  2023-06-27 12:21 ` [Intel-xe] [PATCH v2 2/2] drm/xe/pmu: Enable PMU interface Aravind Iddamsetty
                     ` (5 preceding siblings ...)
  2023-07-21  1:02   ` Dixit, Ashutosh
@ 2023-07-22 14:39   ` Dixit, Ashutosh
  2023-07-24  8:02     ` Iddamsetty, Aravind
  6 siblings, 1 reply; 59+ messages in thread
From: Dixit, Ashutosh @ 2023-07-22 14:39 UTC (permalink / raw)
  To: Aravind Iddamsetty; +Cc: Bommu Krishnaiah, intel-xe, Tvrtko Ursulin

On Tue, 27 Jun 2023 05:21:13 -0700, Aravind Iddamsetty wrote:
>
> diff --git a/drivers/gpu/drm/xe/xe_gt.c b/drivers/gpu/drm/xe/xe_gt.c
> index 2458397ce8af..96e3720923d4 100644
> --- a/drivers/gpu/drm/xe/xe_gt.c
> +++ b/drivers/gpu/drm/xe/xe_gt.c
> @@ -593,6 +593,8 @@ int xe_gt_suspend(struct xe_gt *gt)
>	if (err)
>		goto err_msg;
>
> +	engine_group_busyness_store(gt);
> +
>	err = xe_uc_suspend(&gt->uc);
>	if (err)
>		goto err_force_wake;

Also, another tiny point, let's not call engine_group_busyness_store
directly from here. Let's expose a xe_pmu_suspend() in xe_pmu.h and call
that from here and have xe_pmu_suspend call engine_group_busyness_store.
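
I.e. (sketch):

/* xe_pmu.c */
void xe_pmu_suspend(struct xe_gt *gt)
{
	engine_group_busyness_store(gt);
}

with a declaration in xe_pmu.h (plus a static inline no-op stub for the
!CONFIG_PERF_EVENTS case), so engine_group_busyness_store can then become
static to xe_pmu.c.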

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [Intel-xe] [PATCH v2 2/2] drm/xe/pmu: Enable PMU interface
  2023-07-22 14:39   ` Dixit, Ashutosh
@ 2023-07-24  8:02     ` Iddamsetty, Aravind
  0 siblings, 0 replies; 59+ messages in thread
From: Iddamsetty, Aravind @ 2023-07-24  8:02 UTC (permalink / raw)
  To: Dixit, Ashutosh; +Cc: Bommu Krishnaiah, intel-xe, Tvrtko Ursulin



On 22-07-2023 20:09, Dixit, Ashutosh wrote:
> On Tue, 27 Jun 2023 05:21:13 -0700, Aravind Iddamsetty wrote:
>>
>> diff --git a/drivers/gpu/drm/xe/xe_gt.c b/drivers/gpu/drm/xe/xe_gt.c
>> index 2458397ce8af..96e3720923d4 100644
>> --- a/drivers/gpu/drm/xe/xe_gt.c
>> +++ b/drivers/gpu/drm/xe/xe_gt.c
>> @@ -593,6 +593,8 @@ int xe_gt_suspend(struct xe_gt *gt)
>> 	if (err)
>> 		goto err_msg;
>>
>> +	engine_group_busyness_store(gt);
>> +
>> 	err = xe_uc_suspend(&gt->uc);
>> 	if (err)
>> 		goto err_force_wake;
> 
> Also, another tiny point, let's not call engine_group_busyness_store
> directly from here. Let's expose a xe_pmu_suspend() in xe_pmu.h and call
> that from here and have xe_pmu_suspend call engine_group_busyness_store.

Ok, makes sense.

Thanks,
Aravind.

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [Intel-xe] [PATCH v2 2/2] drm/xe/pmu: Enable PMU interface
  2023-07-22  6:04         ` Dixit, Ashutosh
@ 2023-07-24  8:03           ` Iddamsetty, Aravind
  2023-07-24  9:00             ` Ursulin, Tvrtko
  2023-07-24 15:52             ` Dixit, Ashutosh
  2023-07-24  9:38           ` Iddamsetty, Aravind
  1 sibling, 2 replies; 59+ messages in thread
From: Iddamsetty, Aravind @ 2023-07-24  8:03 UTC (permalink / raw)
  To: Dixit, Ashutosh; +Cc: Bommu Krishnaiah, intel-xe, Tvrtko Ursulin



On 22-07-2023 11:34, Dixit, Ashutosh wrote:

Hi Ashutosh,

> On Fri, 21 Jul 2023 16:36:02 -0700, Dixit, Ashutosh wrote:
>>
>> On Fri, 21 Jul 2023 04:51:09 -0700, Iddamsetty, Aravind wrote:
>>>
>> Hi Aravind,
>>
>>> On 21-07-2023 06:32, Dixit, Ashutosh wrote:
>>>> On Tue, 27 Jun 2023 05:21:13 -0700, Aravind Iddamsetty wrote:
>>>>>
>>>> More stuff to mull over. You can ignore comments starting with "OK", those
>>>> are just notes to myself.
>>>>
>>>> Also, maybe some time we can add a basic IGT which reads these exposed
>>>> counters and verifies that we can read them and they are monotonically
>>>> increasing?
>>>
>>> This is the IGT series using these counters, posted by Venkat:
>>> https://patchwork.freedesktop.org/series/119936/
>>>
>>>>
>>>>> There are a set of engine group busyness counters provided by HW which are
>>>>> perfect fit to be exposed via PMU perf events.
>>>>>
>>>>> BSPEC: 46559, 46560, 46722, 46729
>>>>
>>>> Also add these Bspec entries: 71028, 52071
>>>
>>> OK.
>>>
>>>>
>>>>>
>>>>> events can be listed using:
>>>>> perf list
>>>>>   xe_0000_03_00.0/any-engine-group-busy-gt0/         [Kernel PMU event]
>>>>>   xe_0000_03_00.0/copy-group-busy-gt0/               [Kernel PMU event]
>>>>>   xe_0000_03_00.0/interrupts/                        [Kernel PMU event]
>>>>>   xe_0000_03_00.0/media-group-busy-gt0/              [Kernel PMU event]
>>>>>   xe_0000_03_00.0/render-group-busy-gt0/             [Kernel PMU event]
>>>>>
>>>>> and can be read using:
>>>>>
>>>>> perf stat -e "xe_0000_8c_00.0/render-group-busy-gt0/" -I 1000
>>>>>            time             counts unit events
>>>>>      1.001139062                  0 ns  xe_0000_8c_00.0/render-group-busy-gt0/
>>>>>      2.003294678                  0 ns  xe_0000_8c_00.0/render-group-busy-gt0/
>>>>>      3.005199582                  0 ns  xe_0000_8c_00.0/render-group-busy-gt0/
>>>>>      4.007076497                  0 ns  xe_0000_8c_00.0/render-group-busy-gt0/
>>>>>      5.008553068                  0 ns  xe_0000_8c_00.0/render-group-busy-gt0/
>>>>>      6.010531563              43520 ns  xe_0000_8c_00.0/render-group-busy-gt0/
>>>>>      7.012468029              44800 ns  xe_0000_8c_00.0/render-group-busy-gt0/
>>>>>      8.013463515                  0 ns  xe_0000_8c_00.0/render-group-busy-gt0/
>>>>>      9.015300183                  0 ns  xe_0000_8c_00.0/render-group-busy-gt0/
>>>>>     10.017233010                  0 ns  xe_0000_8c_00.0/render-group-busy-gt0/
>>>>>     10.971934120                  0 ns  xe_0000_8c_00.0/render-group-busy-gt0/
>>>>>
>>>>> The pmu base implementation is taken from i915.
>>>>>
>>>>> v2:
>>>>> Store last known value when device is awake return that while the GT is
>>>>> suspended and then update the driver copy when read during awake.
>>>>>
>>>>> Co-developed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
>>>>> Co-developed-by: Bommu Krishnaiah <krishnaiah.bommu@intel.com>
>>>>> Signed-off-by: Aravind Iddamsetty <aravind.iddamsetty@intel.com>
>>>>> ---
>>>>>  drivers/gpu/drm/xe/Makefile          |   2 +
>>>>>  drivers/gpu/drm/xe/regs/xe_gt_regs.h |   5 +
>>>>>  drivers/gpu/drm/xe/xe_device.c       |   2 +
>>>>>  drivers/gpu/drm/xe/xe_device_types.h |   4 +
>>>>>  drivers/gpu/drm/xe/xe_gt.c           |   2 +
>>>>>  drivers/gpu/drm/xe/xe_irq.c          |  22 +
>>>>>  drivers/gpu/drm/xe/xe_module.c       |   5 +
>>>>>  drivers/gpu/drm/xe/xe_pmu.c          | 739 +++++++++++++++++++++++++++
>>>>>  drivers/gpu/drm/xe/xe_pmu.h          |  25 +
>>>>>  drivers/gpu/drm/xe/xe_pmu_types.h    |  80 +++
>>>>>  include/uapi/drm/xe_drm.h            |  16 +
>>>>>  11 files changed, 902 insertions(+)
>>>>>  create mode 100644 drivers/gpu/drm/xe/xe_pmu.c
>>>>>  create mode 100644 drivers/gpu/drm/xe/xe_pmu.h
>>>>>  create mode 100644 drivers/gpu/drm/xe/xe_pmu_types.h
>>>>>
>>>>> diff --git a/drivers/gpu/drm/xe/Makefile b/drivers/gpu/drm/xe/Makefile
>>>>> index 081c57fd8632..e52ab795c566 100644
>>>>> --- a/drivers/gpu/drm/xe/Makefile
>>>>> +++ b/drivers/gpu/drm/xe/Makefile
>>>>> @@ -217,6 +217,8 @@ xe-$(CONFIG_DRM_XE_DISPLAY) += \
>>>>> 	i915-display/skl_universal_plane.o \
>>>>> 	i915-display/skl_watermark.o
>>>>>
>>>>> +xe-$(CONFIG_PERF_EVENTS) += xe_pmu.o
>>>>> +
>>>>>  ifeq ($(CONFIG_ACPI),y)
>>>>> 	xe-$(CONFIG_DRM_XE_DISPLAY) += \
>>>>> 		i915-display/intel_acpi.o \
>>>>> diff --git a/drivers/gpu/drm/xe/regs/xe_gt_regs.h b/drivers/gpu/drm/xe/regs/xe_gt_regs.h
>>>>> index 3f664011eaea..c7d9e4634745 100644
>>>>> --- a/drivers/gpu/drm/xe/regs/xe_gt_regs.h
>>>>> +++ b/drivers/gpu/drm/xe/regs/xe_gt_regs.h
>>>>> @@ -285,6 +285,11 @@
>>>>>  #define   INVALIDATION_BROADCAST_MODE_DIS	REG_BIT(12)
>>>>>  #define   GLOBAL_INVALIDATION_MODE		REG_BIT(2)
>>>>>
>>>>> +#define XE_OAG_RC0_ANY_ENGINE_BUSY_FREE		XE_REG(0xdb80)
>>>>> +#define XE_OAG_ANY_MEDIA_FF_BUSY_FREE		XE_REG(0xdba0)
>>>>> +#define XE_OAG_BLT_BUSY_FREE			XE_REG(0xdbbc)
>>>>> +#define XE_OAG_RENDER_BUSY_FREE			XE_REG(0xdbdc)
>>>>> +
>>>>>  #define SAMPLER_MODE				XE_REG_MCR(0xe18c, XE_REG_OPTION_MASKED)
>>>>>  #define   ENABLE_SMALLPL			REG_BIT(15)
>>>>>  #define   SC_DISABLE_POWER_OPTIMIZATION_EBB	REG_BIT(9)
>>>>> diff --git a/drivers/gpu/drm/xe/xe_device.c b/drivers/gpu/drm/xe/xe_device.c
>>>>> index c7985af85a53..b2c7bd4a97d9 100644
>>>>> --- a/drivers/gpu/drm/xe/xe_device.c
>>>>> +++ b/drivers/gpu/drm/xe/xe_device.c
>>>>> @@ -328,6 +328,8 @@ int xe_device_probe(struct xe_device *xe)
>>>>>
>>>>> 	xe_debugfs_register(xe);
>>>>>
>>>>> +	xe_pmu_register(&xe->pmu);
>>>>> +
>>>>> 	err = drmm_add_action_or_reset(&xe->drm, xe_device_sanitize, xe);
>>>>> 	if (err)
>>>>> 		return err;
>>>>> diff --git a/drivers/gpu/drm/xe/xe_device_types.h b/drivers/gpu/drm/xe/xe_device_types.h
>>>>> index 0226d44a6af2..3ba99aae92b9 100644
>>>>> --- a/drivers/gpu/drm/xe/xe_device_types.h
>>>>> +++ b/drivers/gpu/drm/xe/xe_device_types.h
>>>>> @@ -15,6 +15,7 @@
>>>>>  #include "xe_devcoredump_types.h"
>>>>>  #include "xe_gt_types.h"
>>>>>  #include "xe_platform_types.h"
>>>>> +#include "xe_pmu.h"
>>>>>  #include "xe_step_types.h"
>>>>>
>>>>>  #if IS_ENABLED(CONFIG_DRM_XE_DISPLAY)
>>>>> @@ -332,6 +333,9 @@ struct xe_device {
>>>>> 	/** @d3cold_allowed: Indicates if d3cold is a valid device state */
>>>>> 	bool d3cold_allowed;
>>>>>
>>>>> +	/* @pmu: performance monitoring unit */
>>>>> +	struct xe_pmu pmu;
>>>>> +
>>>>
>>>> Now I am wondering why we don't make the PMU per-gt (or per-tile)? Per-gt
>>>> would work for these busyness registers and I am not sure how the
>>>> interrupts are hooked up.
>>>>
>>>> In i915 the PMU being device level helped in having a single timer (rather
>>>> than a per gt timer).
>>>>
>>>> Anyway, probably not much practical benefit in making it per-gt/per-tile,
>>>> so maybe leave as is. Just thinking out loud.
>>>
>>> We are able to expose per-gt counters, so I do not see any benefit in
>>> making the PMU per-gt. Also, I believe struct pmu is per device; it has
>>> an associated type which is unique for a device.
>>
>> PMU can be made per gt and named xe-gt0, xe-gt1 etc. if we want. But anyway
>> leave as is.
>>
>>>
>>>>
>>>>> 	/* private: */
>>>>>
>>>>>  #if IS_ENABLED(CONFIG_DRM_XE_DISPLAY)
>>>>> diff --git a/drivers/gpu/drm/xe/xe_gt.c b/drivers/gpu/drm/xe/xe_gt.c
>>>>> index 2458397ce8af..96e3720923d4 100644
>>>>> --- a/drivers/gpu/drm/xe/xe_gt.c
>>>>> +++ b/drivers/gpu/drm/xe/xe_gt.c
>>>>> @@ -593,6 +593,8 @@ int xe_gt_suspend(struct xe_gt *gt)
>>>>> 	if (err)
>>>>> 		goto err_msg;
>>>>>
>>>>> +	engine_group_busyness_store(gt);
>>>>
>>>> Not sure I follow the reason for doing this at suspend time? If PMU was
>>>> active there should be a previous value. Why must we take a new sample
>>>> explicitly here?
>>>
>>> The PMU interface can be read even when the device is suspended, and in
>>> such cases we should not wake up the device. If we try to read the
>>> registers while the device is suspended we get spurious counter values;
>>> you can check in version #1 of this series that we were getting huge
>>> values, as we put the device to suspend immediately after probe. So we
>>> store the last known good value before suspend.
>>
>> No need, see comment at init_samples below.
> 
> Sorry, you are right, I changed my mind about this, I see your point. See
> more discussion on this at init_samples below.
> 
>>
>>>
>>>>
>>>> To me it looks like engine_group_busyness_store should not be needed,
>>>> see comment below for init_samples too.
>>>>
>>>>> +
>>>>> 	err = xe_uc_suspend(&gt->uc);
>>>>> 	if (err)
>>>>> 		goto err_force_wake;
>>>>> diff --git a/drivers/gpu/drm/xe/xe_irq.c b/drivers/gpu/drm/xe/xe_irq.c
>>>>> index b4ed1e4a3388..cb943fb94ec7 100644
>>>>> --- a/drivers/gpu/drm/xe/xe_irq.c
>>>>> +++ b/drivers/gpu/drm/xe/xe_irq.c
>>>>> @@ -27,6 +27,24 @@
>>>>>  #define IIR(offset)				XE_REG(offset + 0x8)
>>>>>  #define IER(offset)				XE_REG(offset + 0xc)
>>>>>
>>>>> +/*
>>>>> + * Interrupt statistic for PMU. Increments the counter only if the
>>>>> + * interrupt originated from the GPU so interrupts from a device which
>>>>> + * shares the interrupt line are not accounted.
>>>>> + */
>>>>> +static inline void xe_pmu_irq_stats(struct xe_device *xe,
>>>>
>>>> No inline, compiler will do it, but looks like we may want to always_inline
>>>> this. Also this function should really be in xe_pmu.h? Anyway probably
>>>> leave as is.
>>>>
>>>>> +				    irqreturn_t res)
>>>>> +{
>>>>> +	if (unlikely(res != IRQ_HANDLED))
>>>>> +		return;
>>>>> +
>>>>> +	/*
>>>>> +	 * A clever compiler translates that into INC. A not so clever one
>>>>> +	 * should at least prevent store tearing.
>>>>> +	 */
>>>>> +	WRITE_ONCE(xe->pmu.irq_count, xe->pmu.irq_count + 1);
>>>>> +}
>>>>> +
>>>>>  static void assert_iir_is_zero(struct xe_gt *mmio, struct xe_reg reg)
>>>>>  {
>>>>> 	u32 val = xe_mmio_read32(mmio, reg);
>>>>> @@ -325,6 +343,8 @@ static irqreturn_t xelp_irq_handler(int irq, void *arg)
>>>>>
>>>>> 	xe_display_irq_enable(xe, gu_misc_iir);
>>>>>
>>>>> +	xe_pmu_irq_stats(xe, IRQ_HANDLED);
>>>>> +
>>>>> 	return IRQ_HANDLED;
>>>>>  }
>>>>>
>>>>> @@ -414,6 +434,8 @@ static irqreturn_t dg1_irq_handler(int irq, void *arg)
>>>>> 	dg1_intr_enable(xe, false);
>>>>> 	xe_display_irq_enable(xe, gu_misc_iir);
>>>>>
>>>>> +	xe_pmu_irq_stats(xe, IRQ_HANDLED);
>>>>> +
>>>>> 	return IRQ_HANDLED;
>>>>>  }
>>>>>
>>>>> diff --git a/drivers/gpu/drm/xe/xe_module.c b/drivers/gpu/drm/xe/xe_module.c
>>>>> index 75e5be939f53..f6fe89748525 100644
>>>>> --- a/drivers/gpu/drm/xe/xe_module.c
>>>>> +++ b/drivers/gpu/drm/xe/xe_module.c
>>>>> @@ -12,6 +12,7 @@
>>>>>  #include "xe_hw_fence.h"
>>>>>  #include "xe_module.h"
>>>>>  #include "xe_pci.h"
>>>>> +#include "xe_pmu.h"
>>>>>  #include "xe_sched_job.h"
>>>>>
>>>>>  bool enable_guc = true;
>>>>> @@ -49,6 +50,10 @@ static const struct init_funcs init_funcs[] = {
>>>>> 		.init = xe_sched_job_module_init,
>>>>> 		.exit = xe_sched_job_module_exit,
>>>>> 	},
>>>>> +	{
>>>>> +		.init = xe_pmu_init,
>>>>> +		.exit = xe_pmu_exit,
>>>>> +	},
>>>>> 	{
>>>>> 		.init = xe_register_pci_driver,
>>>>> 		.exit = xe_unregister_pci_driver,
>>>>> diff --git a/drivers/gpu/drm/xe/xe_pmu.c b/drivers/gpu/drm/xe/xe_pmu.c
>>>>> new file mode 100644
>>>>> index 000000000000..bef1895be9f7
>>>>> --- /dev/null
>>>>> +++ b/drivers/gpu/drm/xe/xe_pmu.c
>>>>> @@ -0,0 +1,739 @@
>>>>> +/*
>>>>> + * SPDX-License-Identifier: MIT
>>>>> + *
>>>>> + * Copyright © 2023 Intel Corporation
>>>>> + */
>>>>
>>>> This SPDX header is for .h files not .c files. Actually, it is for neither :/
>>>
>>> But I see this in almost all the files in xe.
>>
>> Look closely.
>>
>>>>
>>>>> +
>>>>> +#include <drm/drm_drv.h>
>>>>> +#include <drm/drm_managed.h>
>>>>> +#include <drm/xe_drm.h>
>>>>> +
>>>>> +#include "regs/xe_gt_regs.h"
>>>>> +#include "xe_device.h"
>>>>> +#include "xe_gt_clock.h"
>>>>> +#include "xe_mmio.h"
>>>>> +
>>>>> +static cpumask_t xe_pmu_cpumask;
>>>>> +static unsigned int xe_pmu_target_cpu = -1;
>>>>> +
>>>>> +static unsigned int config_gt_id(const u64 config)
>>>>> +{
>>>>> +	return config >> __XE_PMU_GT_SHIFT;
>>>>> +}
>>>>> +
>>>>> +static u64 config_counter(const u64 config)
>>>>> +{
>>>>> +	return config & ~(~0ULL << __XE_PMU_GT_SHIFT);
>>>>> +}
>>>>> +
>>>>> +static unsigned int
>>>>> +__sample_idx(struct xe_pmu *pmu, unsigned int gt_id, int sample)
>>>>> +{
>>>>> +	unsigned int idx = gt_id * __XE_NUM_PMU_SAMPLERS + sample;
>>>>> +
>>>>> +	XE_BUG_ON(idx >= ARRAY_SIZE(pmu->sample));
>>>>> +
>>>>> +	return idx;
>>>>> +}
>>>>
>>>> The compiler does this for us if we make the sample array 2-d.
>>>>
>>>>> +
>>>>> +static u64 read_sample(struct xe_pmu *pmu, unsigned int gt_id, int sample)
>>>>> +{
>>>>> +	return pmu->sample[__sample_idx(pmu, gt_id, sample)].cur;
>>>>> +}
>>>>> +
>>>>> +static void
>>>>> +store_sample(struct xe_pmu *pmu, unsigned int gt_id, int sample, u64 val)
>>>>> +{
>>>>> +	pmu->sample[__sample_idx(pmu, gt_id, sample)].cur = val;
>>>>> +}
>>>>
>>>> The three functions above are not needed if we make the sample array
>>>> 2-d. See here:
>>>>
>>>> https://patchwork.freedesktop.org/patch/538887/?series=118225&rev=1
>>>>
>>>> Only a part of the patch above was merged (see 8ed0753b527dc) to keep the
>>>> patch size small, but since for xe we are starting from scratch we can
>>>> implement the whole approach above (get rid of the read/store helpers
>>>> entirely, direct assignment without the helpers works).
>>>
>>> OK, I'll go over it.
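
For reference, the 2-d version would then be roughly (just a sketch,
assuming the XE_MAX_GT name discussed further below):

	/* xe_pmu_types.h */
	u64 sample[XE_MAX_GT][__XE_NUM_PMU_SAMPLERS];

	/* the read/store helpers go away, accesses become direct: */
	pmu->sample[gt_id][sample_type] = val;
	val = pmu->sample[gt_id][sample_type];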
>>>
>>>>
>>>>> +
>>>>> +static int engine_busyness_sample_type(u64 config)
>>>>> +{
>>>>> +	int type = 0;
>>>>> +
>>>>> +	switch (config) {
>>>>> +	case XE_PMU_RENDER_GROUP_BUSY(0):
>>>>> +		type =  __XE_SAMPLE_RENDER_GROUP_BUSY;
>>>>> +		break;
>>>>> +	case XE_PMU_COPY_GROUP_BUSY(0):
>>>>> +		type = __XE_SAMPLE_COPY_GROUP_BUSY;
>>>>> +		break;
>>>>> +	case XE_PMU_MEDIA_GROUP_BUSY(0):
>>>>> +		type = __XE_SAMPLE_MEDIA_GROUP_BUSY;
>>>>> +		break;
>>>>> +	case XE_PMU_ANY_ENGINE_GROUP_BUSY(0):
>>>>> +		type = __XE_SAMPLE_ANY_ENGINE_GROUP_BUSY;
>>>>> +		break;
>>>>> +	}
>>>>> +
>>>>> +	return type;
>>>>> +}
>>>>> +
>>>>> +static void xe_pmu_event_destroy(struct perf_event *event)
>>>>> +{
>>>>> +	struct xe_device *xe =
>>>>> +		container_of(event->pmu, typeof(*xe), pmu.base);
>>>>> +
>>>>> +	drm_WARN_ON(&xe->drm, event->parent);
>>>>> +
>>>>> +	drm_dev_put(&xe->drm);
>>>>> +}
>>>>> +
>>>>> +static u64 __engine_group_busyness_read(struct xe_gt *gt, u64 config)
>>>>> +{
>>>>> +	u64 val = 0;
>>>>> +
>>>>> +	switch (config) {
>>>>> +	case XE_PMU_RENDER_GROUP_BUSY(0):
>>>>> +		val = xe_mmio_read32(gt, XE_OAG_RENDER_BUSY_FREE);
>>>>> +		break;
>>>>> +	case XE_PMU_COPY_GROUP_BUSY(0):
>>>>> +		val = xe_mmio_read32(gt, XE_OAG_BLT_BUSY_FREE);
>>>>> +		break;
>>>>> +	case XE_PMU_MEDIA_GROUP_BUSY(0):
>>>>> +		val = xe_mmio_read32(gt, XE_OAG_ANY_MEDIA_FF_BUSY_FREE);
>>>>> +		break;
>>>>> +	case XE_PMU_ANY_ENGINE_GROUP_BUSY(0):
>>>>> +		val = xe_mmio_read32(gt, XE_OAG_RC0_ANY_ENGINE_BUSY_FREE);
>>>>> +		break;
>>>>> +	default:
>>>>> +		drm_warn(&gt->tile->xe->drm, "unknown pmu event\n");
>>>>> +	}
>>>>
>>>> We need xe_device_mem_access_get, also xe_force_wake_get if applicable,
>>>> somewhere before reading these registers.
>>>>
>>>>> +
>>>>> +	return xe_gt_clock_interval_to_ns(gt, val * 16);
>>>>> +}
>>>>> +
>>>>> +static u64 engine_group_busyness_read(struct xe_gt *gt, u64 config)
>>>>> +{
>>>>> +	int sample_type = engine_busyness_sample_type(config);
>>>>> +	struct xe_device *xe = gt->tile->xe;
>>>>> +	const unsigned int gt_id = gt->info.id;
>>>>> +	struct xe_pmu *pmu = &xe->pmu;
>>>>> +	bool device_awake;
>>>>> +	unsigned long flags;
>>>>> +	u64 val;
>>>>> +
>>>>> +	/*
>>>>> +	 * found no better way to check if device is awake or not. Before
>>>>
>>>> xe_device_mem_access_get_if_ongoing (hard to find name).
>>>
>>> Thanks for sharing; it looks to have been added recently. If we use this
>>> we needn't use xe_device_mem_access_get.
>>
>> See comment at init_samples.
>>
>>>
>>>>
>>>>> +	 * we suspend we set the submission_state.enabled to false.
>>>>> +	 */
>>>>> +	device_awake = gt->uc.guc.submission_state.enabled ? true : false;
>>>>> +	if (device_awake)
>>>>> +		val = __engine_group_busyness_read(gt, config);
>>>>> +
>>>>> +	spin_lock_irqsave(&pmu->lock, flags);
>>>>> +
>>>>> +	if (device_awake)
>>>>> +		store_sample(pmu, gt_id, sample_type, val);
>>>>> +	else
>>>>> +		val = read_sample(pmu, gt_id, sample_type);
>>>>> +
>>>>> +	spin_unlock_irqrestore(&pmu->lock, flags);
>>>>> +
>>>>> +	return val;
>>>>> +}
>>>>> +
>>>>> +void engine_group_busyness_store(struct xe_gt *gt)
>>>>> +{
>>>>> +	struct xe_pmu *pmu = &gt->tile->xe->pmu;
>>>>> +	unsigned int gt_id = gt->info.id;
>>>>> +	unsigned long flags;
>>>>> +
>>>>> +	spin_lock_irqsave(&pmu->lock, flags);
>>>>> +
>>>>> +	store_sample(pmu, gt_id, __XE_SAMPLE_RENDER_GROUP_BUSY,
>>>>> +		     __engine_group_busyness_read(gt, XE_PMU_RENDER_GROUP_BUSY(0)));
>>>>> +	store_sample(pmu, gt_id, __XE_SAMPLE_COPY_GROUP_BUSY,
>>>>> +		     __engine_group_busyness_read(gt, XE_PMU_COPY_GROUP_BUSY(0)));
>>>>> +	store_sample(pmu, gt_id, __XE_SAMPLE_MEDIA_GROUP_BUSY,
>>>>> +		     __engine_group_busyness_read(gt, XE_PMU_MEDIA_GROUP_BUSY(0)));
>>>>> +	store_sample(pmu, gt_id, __XE_SAMPLE_ANY_ENGINE_GROUP_BUSY,
>>>>> +		     __engine_group_busyness_read(gt, XE_PMU_ANY_ENGINE_GROUP_BUSY(0)));
> 
> Here, why should we store everything? We should store only those events
> which are enabled.

The events are enabled only when they are opened, which can happen after
the device is suspended, hence we need to store all of them. In the
present case the device is put to suspend immediately after probe, and
the event is opened after driver load is done.

> 
> Also it would be good if the above can be done in a loop somehow. 4 is fine
> but if we add events later, a loop will be nice, if possible.

I do not think a loop would solve this for future events either, as
storing can be selective, i.e. only a few need to be stored; pure
counters like interrupts need not be stored.

> 
>>>>> +
>>>>> +	spin_unlock_irqrestore(&pmu->lock, flags);
>>>>> +}
>>>>> +
>>>>> +static int
>>>>> +config_status(struct xe_device *xe, u64 config)
>>>>> +{
>>>>> +	unsigned int max_gt_id = xe->info.gt_count > 1 ? 1 : 0;
>>>>> +	unsigned int gt_id = config_gt_id(config);
>>>>> +	struct xe_gt *gt = xe_device_get_gt(xe, gt_id);
>>>>> +
>>>>> +	if (gt_id > max_gt_id)
>>>>> +		return -ENOENT;
>>>>> +
>>>>> +	switch (config_counter(config)) {
>>>>> +	case XE_PMU_INTERRUPTS(0):
>>>>> +		if (gt_id)
>>>>> +			return -ENOENT;
>>>>
>>>> OK: this is a global event so we say this is enabled only for gt0.
>>>>
>>>>> +		break;
>>>>> +	case XE_PMU_RENDER_GROUP_BUSY(0):
>>>>> +	case XE_PMU_COPY_GROUP_BUSY(0):
>>>>> +	case XE_PMU_ANY_ENGINE_GROUP_BUSY(0):
>>>>> +		if (GRAPHICS_VER(xe) < 12)
>>>>
>>>> Any point checking? xe only supports Gen12+.
>>>
>>> Hmm, ya, good point, will remove this.
>>>>
>>>>> +			return -ENOENT;
>>>>> +		break;
>>>>> +	case XE_PMU_MEDIA_GROUP_BUSY(0):
>>>>> +		if (MEDIA_VER(xe) >= 13 && gt->info.type != XE_GT_TYPE_MEDIA)
>>>>> +			return -ENOENT;
>>>>
>>>> OK: this makes sense, so we will expose this event only for media gt's.
>>>>
>>>>> +		break;
>>>>> +	default:
>>>>> +		return -ENOENT;
>>>>> +	}
>>>>> +
>>>>> +	return 0;
>>>>> +}
>>>>> +
>>>>> +static int xe_pmu_event_init(struct perf_event *event)
>>>>> +{
>>>>> +	struct xe_device *xe =
>>>>> +		container_of(event->pmu, typeof(*xe), pmu.base);
>>>>> +	struct xe_pmu *pmu = &xe->pmu;
>>>>> +	int ret;
>>>>> +
>>>>> +	if (pmu->closed)
>>>>> +		return -ENODEV;
>>>>> +
>>>>> +	if (event->attr.type != event->pmu->type)
>>>>> +		return -ENOENT;
>>>>> +
>>>>> +	/* unsupported modes and filters */
>>>>> +	if (event->attr.sample_period) /* no sampling */
>>>>> +		return -EINVAL;
>>>>> +
>>>>> +	if (has_branch_stack(event))
>>>>> +		return -EOPNOTSUPP;
>>>>> +
>>>>> +	if (event->cpu < 0)
>>>>> +		return -EINVAL;
>>>>> +
>>>>> +	/* only allow running on one cpu at a time */
>>>>> +	if (!cpumask_test_cpu(event->cpu, &xe_pmu_cpumask))
>>>>> +		return -EINVAL;
>>>>> +
>>>>> +	ret = config_status(xe, event->attr.config);
>>>>> +	if (ret)
>>>>> +		return ret;
>>>>> +
>>>>> +	if (!event->parent) {
>>>>> +		drm_dev_get(&xe->drm);
>>>>> +		event->destroy = xe_pmu_event_destroy;
>>>>> +	}
>>>>> +
>>>>> +	return 0;
>>>>> +}
>>>>> +
>>>>> +static u64 __xe_pmu_event_read(struct perf_event *event)
>>>>> +{
>>>>> +	struct xe_device *xe =
>>>>> +		container_of(event->pmu, typeof(*xe), pmu.base);
>>>>> +	const unsigned int gt_id = config_gt_id(event->attr.config);
>>>>> +	const u64 config = config_counter(event->attr.config);
>>>>> +	struct xe_gt *gt = xe_device_get_gt(xe, gt_id);
>>>>> +	struct xe_pmu *pmu = &xe->pmu;
>>>>> +	u64 val = 0;
>>>>> +
>>>>> +	switch (config) {
>>>>> +	case XE_PMU_INTERRUPTS(0):
>>>>> +		val = READ_ONCE(pmu->irq_count);
>>>>
>>>> OK: The correct way to do this READ_ONCE/WRITE_ONCE irq_count stuff would
>>>> be to take pmu->lock (both while reading and writing irq_count) but that
>>>> would be expensive in the interrupt handler (as the .h hints). So all we
>>>> can do is what is done here.
>>>>
>>>>> +		break;
>>>>> +	case XE_PMU_RENDER_GROUP_BUSY(0):
>>>>> +	case XE_PMU_COPY_GROUP_BUSY(0):
>>>>> +	case XE_PMU_ANY_ENGINE_GROUP_BUSY(0):
>>>>> +	case XE_PMU_MEDIA_GROUP_BUSY(0):
>>>>> +		val = engine_group_busyness_read(gt, config);
>>>>> +	}
>>>>> +
>>>>> +	return val;
>>>>> +}
>>>>> +
>>>>> +static void xe_pmu_event_read(struct perf_event *event)
>>>>> +{
>>>>> +	struct xe_device *xe =
>>>>> +		container_of(event->pmu, typeof(*xe), pmu.base);
>>>>> +	struct hw_perf_event *hwc = &event->hw;
>>>>> +	struct xe_pmu *pmu = &xe->pmu;
>>>>> +	u64 prev, new;
>>>>> +
>>>>> +	if (pmu->closed) {
>>>>> +		event->hw.state = PERF_HES_STOPPED;
>>>>> +		return;
>>>>> +	}
>>>>> +again:
>>>>> +	prev = local64_read(&hwc->prev_count);
>>>>> +	new = __xe_pmu_event_read(event);
>>>>> +
>>>>> +	if (local64_cmpxchg(&hwc->prev_count, prev, new) != prev)
>>>>> +		goto again;
>>>>> +
>>>>> +	local64_add(new - prev, &event->count);
>>>>> +}
>>>>> +
>>>>> +static void xe_pmu_enable(struct perf_event *event)
>>>>> +{
>>>>> +	/*
>>>>> +	 * Store the current counter value so we can report the correct delta
>>>>> +	 * for all listeners. Even when the event was already enabled and has
>>>>> +	 * an existing non-zero value.
>>>>> +	 */
>>>>> +	local64_set(&event->hw.prev_count, __xe_pmu_event_read(event));
>>>>> +}
>>>>> +
>>>>> +static void xe_pmu_event_start(struct perf_event *event, int flags)
>>>>> +{
>>>>> +	struct xe_device *xe =
>>>>> +		container_of(event->pmu, typeof(*xe), pmu.base);
>>>>> +	struct xe_pmu *pmu = &xe->pmu;
>>>>> +
>>>>> +	if (pmu->closed)
>>>>> +		return;
>>>>> +
>>>>> +	xe_pmu_enable(event);
>>>>> +	event->hw.state = 0;
>>>>> +}
>>>>> +
>>>>> +static void xe_pmu_event_stop(struct perf_event *event, int flags)
>>>>> +{
>>>>> +	if (flags & PERF_EF_UPDATE)
>>>>> +		xe_pmu_event_read(event);
>>>>> +
>>>>> +	event->hw.state = PERF_HES_STOPPED;
>>>>> +}
>>>>> +
>>>>> +static int xe_pmu_event_add(struct perf_event *event, int flags)
>>>>> +{
>>>>> +	struct xe_device *xe =
>>>>> +		container_of(event->pmu, typeof(*xe), pmu.base);
>>>>> +	struct xe_pmu *pmu = &xe->pmu;
>>>>> +
>>>>> +	if (pmu->closed)
>>>>> +		return -ENODEV;
>>>>> +
>>>>> +	if (flags & PERF_EF_START)
>>>>> +		xe_pmu_event_start(event, flags);
>>>>> +
>>>>> +	return 0;
>>>>> +}
>>>>> +
>>>>> +static void xe_pmu_event_del(struct perf_event *event, int flags)
>>>>> +{
>>>>> +	xe_pmu_event_stop(event, PERF_EF_UPDATE);
>>>>> +}
>>>>> +
>>>>> +static int xe_pmu_event_event_idx(struct perf_event *event)
>>>>> +{
>>>>> +	return 0;
>>>>> +}
>>>>> +
>>>>> +struct xe_str_attribute {
>>>>> +	struct device_attribute attr;
>>>>> +	const char *str;
>>>>> +};
>>>>> +
>>>>> +static ssize_t xe_pmu_format_show(struct device *dev,
>>>>> +				  struct device_attribute *attr, char *buf)
>>>>> +{
>>>>> +	struct xe_str_attribute *eattr;
>>>>> +
>>>>> +	eattr = container_of(attr, struct xe_str_attribute, attr);
>>>>> +	return sprintf(buf, "%s\n", eattr->str);
>>>>> +}
>>>>> +
>>>>> +#define XE_PMU_FORMAT_ATTR(_name, _config) \
>>>>> +	(&((struct xe_str_attribute[]) { \
>>>>> +		{ .attr = __ATTR(_name, 0444, xe_pmu_format_show, NULL), \
>>>>> +		  .str = _config, } \
>>>>> +	})[0].attr.attr)
>>>>> +
>>>>> +static struct attribute *xe_pmu_format_attrs[] = {
>>>>> +	XE_PMU_FORMAT_ATTR(xe_eventid, "config:0-20"),
>>>>
>>>> 0-20 means bits 0-20? Though here we probably have a different number of
>>>> config bits? Probably doesn't matter?
>>>
>>> As I understand it, this is not used anymore, so I will remove it.
>>>
>>>>
>>>> The string will show up with:
>>>>
>>>> cat /sys/devices/xe/format/xe_eventid
>>>>
>>>>> +	NULL,
>>>>> +};
>>>>> +
>>>>> +static const struct attribute_group xe_pmu_format_attr_group = {
>>>>> +	.name = "format",
>>>>> +	.attrs = xe_pmu_format_attrs,
>>>>> +};
>>>>> +
>>>>> +struct xe_ext_attribute {
>>>>> +	struct device_attribute attr;
>>>>> +	unsigned long val;
>>>>> +};
>>>>> +
>>>>> +static ssize_t xe_pmu_event_show(struct device *dev,
>>>>> +				 struct device_attribute *attr, char *buf)
>>>>> +{
>>>>> +	struct xe_ext_attribute *eattr;
>>>>> +
>>>>> +	eattr = container_of(attr, struct xe_ext_attribute, attr);
>>>>> +	return sprintf(buf, "config=0x%lx\n", eattr->val);
>>>>> +}
>>>>> +
>>>>> +static ssize_t cpumask_show(struct device *dev,
>>>>> +			    struct device_attribute *attr, char *buf)
>>>>> +{
>>>>> +	return cpumap_print_to_pagebuf(true, buf, &xe_pmu_cpumask);
>>>>> +}
>>>>> +
>>>>> +static DEVICE_ATTR_RO(cpumask);
>>>>> +
>>>>> +static struct attribute *xe_cpumask_attrs[] = {
>>>>> +	&dev_attr_cpumask.attr,
>>>>> +	NULL,
>>>>> +};
>>>>> +
>>>>> +static const struct attribute_group xe_pmu_cpumask_attr_group = {
>>>>> +	.attrs = xe_cpumask_attrs,
>>>>> +};
>>>>> +
>>>>> +#define __event(__counter, __name, __unit) \
>>>>> +{ \
>>>>> +	.counter = (__counter), \
>>>>> +	.name = (__name), \
>>>>> +	.unit = (__unit), \
>>>>> +	.global = false, \
>>>>> +}
>>>>> +
>>>>> +#define __global_event(__counter, __name, __unit) \
>>>>> +{ \
>>>>> +	.counter = (__counter), \
>>>>> +	.name = (__name), \
>>>>> +	.unit = (__unit), \
>>>>> +	.global = true, \
>>>>> +}
>>>>> +
>>>>> +static struct xe_ext_attribute *
>>>>> +add_xe_attr(struct xe_ext_attribute *attr, const char *name, u64 config)
>>>>> +{
>>>>> +	sysfs_attr_init(&attr->attr.attr);
>>>>> +	attr->attr.attr.name = name;
>>>>> +	attr->attr.attr.mode = 0444;
>>>>> +	attr->attr.show = xe_pmu_event_show;
>>>>> +	attr->val = config;
>>>>> +
>>>>> +	return ++attr;
>>>>> +}
>>>>> +
>>>>> +static struct perf_pmu_events_attr *
>>>>> +add_pmu_attr(struct perf_pmu_events_attr *attr, const char *name,
>>>>> +	     const char *str)
>>>>> +{
>>>>> +	sysfs_attr_init(&attr->attr.attr);
>>>>> +	attr->attr.attr.name = name;
>>>>> +	attr->attr.attr.mode = 0444;
>>>>> +	attr->attr.show = perf_event_sysfs_show;
>>>>> +	attr->event_str = str;
>>>>> +
>>>>> +	return ++attr;
>>>>> +}
>>>>> +
>>>>> +static struct attribute **
>>>>> +create_event_attributes(struct xe_pmu *pmu)
>>>>> +{
>>>>> +	struct xe_device *xe = container_of(pmu, typeof(*xe), pmu);
>>>>> +	static const struct {
>>>>> +		unsigned int counter;
>>>>> +		const char *name;
>>>>> +		const char *unit;
>>>>> +		bool global;
>>>>> +	} events[] = {
>>>>> +		__global_event(0, "interrupts", NULL),
>>>>> +		__event(1, "render-group-busy", "ns"),
>>>>> +		__event(2, "copy-group-busy", "ns"),
>>>>> +		__event(3, "media-group-busy", "ns"),
>>>>> +		__event(4, "any-engine-group-busy", "ns"),
>>>>> +	};
>>>>
>>>> OK: this function is some black magic to expose stuff through
>>>> PMU. Identical to i915 and seems to be working from the commit message so
>>>> should be fine.
>>>>
>>>>> +
>>>>> +	unsigned int count = 0;
>>>>> +	struct perf_pmu_events_attr *pmu_attr = NULL, *pmu_iter;
>>>>> +	struct xe_ext_attribute *xe_attr = NULL, *xe_iter;
>>>>> +	struct attribute **attr = NULL, **attr_iter;
>>>>> +	struct xe_gt *gt;
>>>>> +	unsigned int i, j;
>>>>> +
>>>>> +	/* Count how many counters we will be exposing. */
>>>>> +	for_each_gt(gt, xe, j) {
>>>>> +		for (i = 0; i < ARRAY_SIZE(events); i++) {
>>>>> +			u64 config = ___XE_PMU_OTHER(j, events[i].counter);
>>>>> +
>>>>> +			if (!config_status(xe, config))
>>>>> +				count++;
>>>>> +		}
>>>>> +	}
>>>>> +
>>>>> +	/* Allocate attribute objects and table. */
>>>>> +	xe_attr = kcalloc(count, sizeof(*xe_attr), GFP_KERNEL);
>>>>> +	if (!xe_attr)
>>>>> +		goto err_alloc;
>>>>> +
>>>>> +	pmu_attr = kcalloc(count, sizeof(*pmu_attr), GFP_KERNEL);
>>>>> +	if (!pmu_attr)
>>>>> +		goto err_alloc;
>>>>> +
>>>>> +	/* Max one pointer of each attribute type plus a termination entry. */
>>>>> +	attr = kcalloc(count * 2 + 1, sizeof(*attr), GFP_KERNEL);
>>>>> +	if (!attr)
>>>>> +		goto err_alloc;
>>>>> +
>>>>> +	xe_iter = xe_attr;
>>>>> +	pmu_iter = pmu_attr;
>>>>> +	attr_iter = attr;
>>>>> +
>>>>> +	for_each_gt(gt, xe, j) {
>>>>> +		for (i = 0; i < ARRAY_SIZE(events); i++) {
>>>>> +			u64 config = ___XE_PMU_OTHER(j, events[i].counter);
>>>>> +			char *str;
>>>>> +
>>>>> +			if (config_status(xe, config))
>>>>> +				continue;
>>>>> +
>>>>> +			if (events[i].global)
>>>>> +				str = kstrdup(events[i].name, GFP_KERNEL);
>>>>> +			else
>>>>> +				str = kasprintf(GFP_KERNEL, "%s-gt%u",
>>>>> +						events[i].name, j);
>>>>> +			if (!str)
>>>>> +				goto err;
>>>>> +
>>>>> +			*attr_iter++ = &xe_iter->attr.attr;
>>>>> +			xe_iter = add_xe_attr(xe_iter, str, config);
>>>>> +
>>>>> +			if (events[i].unit) {
>>>>> +				if (events[i].global)
>>>>> +					str = kasprintf(GFP_KERNEL, "%s.unit",
>>>>> +							events[i].name);
>>>>> +				else
>>>>> +					str = kasprintf(GFP_KERNEL, "%s-gt%u.unit",
>>>>> +							events[i].name, j);
>>>>> +				if (!str)
>>>>> +					goto err;
>>>>> +
>>>>> +				*attr_iter++ = &pmu_iter->attr.attr;
>>>>> +				pmu_iter = add_pmu_attr(pmu_iter, str,
>>>>> +							events[i].unit);
>>>>> +			}
>>>>> +		}
>>>>> +	}
>>>>> +
>>>>> +	pmu->xe_attr = xe_attr;
>>>>> +	pmu->pmu_attr = pmu_attr;
>>>>> +
>>>>> +	return attr;
>>>>> +
>>>>> +err:
>>>>> +	for (attr_iter = attr; *attr_iter; attr_iter++)
>>>>> +		kfree((*attr_iter)->name);
>>>>> +
>>>>> +err_alloc:
>>>>> +	kfree(attr);
>>>>> +	kfree(xe_attr);
>>>>> +	kfree(pmu_attr);
>>>>> +
>>>>> +	return NULL;
>>>>> +}
>>>>> +
>>>>> +static void free_event_attributes(struct xe_pmu *pmu)
>>>>> +{
>>>>> +	struct attribute **attr_iter = pmu->events_attr_group.attrs;
>>>>> +
>>>>> +	for (; *attr_iter; attr_iter++)
>>>>> +		kfree((*attr_iter)->name);
>>>>> +
>>>>> +	kfree(pmu->events_attr_group.attrs);
>>>>> +	kfree(pmu->xe_attr);
>>>>> +	kfree(pmu->pmu_attr);
>>>>> +
>>>>> +	pmu->events_attr_group.attrs = NULL;
>>>>> +	pmu->xe_attr = NULL;
>>>>> +	pmu->pmu_attr = NULL;
>>>>> +}
>>>>> +
>>>>> +static int xe_pmu_cpu_online(unsigned int cpu, struct hlist_node *node)
>>>>> +{
>>>>> +	struct xe_pmu *pmu = hlist_entry_safe(node, typeof(*pmu), cpuhp.node);
>>>>> +
>>>>> +	XE_BUG_ON(!pmu->base.event_init);
>>>>> +
>>>>> +	/* Select the first online CPU as a designated reader. */
>>>>> +	if (cpumask_empty(&xe_pmu_cpumask))
>>>>> +		cpumask_set_cpu(cpu, &xe_pmu_cpumask);
>>>>> +
>>>>> +	return 0;
>>>>> +}
>>>>> +
>>>>> +static int xe_pmu_cpu_offline(unsigned int cpu, struct hlist_node *node)
>>>>> +{
>>>>> +	struct xe_pmu *pmu = hlist_entry_safe(node, typeof(*pmu), cpuhp.node);
>>>>> +	unsigned int target = xe_pmu_target_cpu;
>>>>> +
>>>>> +	XE_BUG_ON(!pmu->base.event_init);
>>>>> +
>>>>> +	/*
>>>>> +	 * Unregistering an instance generates a CPU offline event which we must
>>>>> +	 * ignore to avoid incorrectly modifying the shared xe_pmu_cpumask.
>>>>> +	 */
>>>>> +	if (pmu->closed)
>>>>> +		return 0;
>>>>> +
>>>>> +	if (cpumask_test_and_clear_cpu(cpu, &xe_pmu_cpumask)) {
>>>>> +		target = cpumask_any_but(topology_sibling_cpumask(cpu), cpu);
>>>>> +
>>>>> +		/* Migrate events if there is a valid target */
>>>>> +		if (target < nr_cpu_ids) {
>>>>> +			cpumask_set_cpu(target, &xe_pmu_cpumask);
>>>>> +			xe_pmu_target_cpu = target;
>>>>> +		}
>>>>> +	}
>>>>> +
>>>>> +	if (target < nr_cpu_ids && target != pmu->cpuhp.cpu) {
>>>>> +		perf_pmu_migrate_context(&pmu->base, cpu, target);
>>>>> +		pmu->cpuhp.cpu = target;
>>>>> +	}
>>>>> +
>>>>> +	return 0;
>>>>> +}
>>>>> +
>>>>> +static enum cpuhp_state cpuhp_slot = CPUHP_INVALID;
>>>>> +
>>>>> +int xe_pmu_init(void)
>>>>> +{
>>>>> +	int ret;
>>>>> +
>>>>> +	ret = cpuhp_setup_state_multi(CPUHP_AP_ONLINE_DYN,
>>>>> +				      "perf/x86/intel/xe:online",
>>>>> +				      xe_pmu_cpu_online,
>>>>> +				      xe_pmu_cpu_offline);
>>>>> +	if (ret < 0)
>>>>> +		pr_notice("Failed to setup cpuhp state for xe PMU! (%d)\n",
>>>>> +			  ret);
>>>>> +	else
>>>>> +		cpuhp_slot = ret;
>>>>> +
>>>>> +	return 0;
>>>>> +}
>>>>> +
>>>>> +void xe_pmu_exit(void)
>>>>> +{
>>>>> +	if (cpuhp_slot != CPUHP_INVALID)
>>>>> +		cpuhp_remove_multi_state(cpuhp_slot);
>>>>> +}
>>>>> +
>>>>> +static int xe_pmu_register_cpuhp_state(struct xe_pmu *pmu)
>>>>> +{
>>>>> +	if (cpuhp_slot == CPUHP_INVALID)
>>>>> +		return -EINVAL;
>>>>> +
>>>>> +	return cpuhp_state_add_instance(cpuhp_slot, &pmu->cpuhp.node);
>>>>> +}
>>>>> +
>>>>> +static void xe_pmu_unregister_cpuhp_state(struct xe_pmu *pmu)
>>>>> +{
>>>>> +	cpuhp_state_remove_instance(cpuhp_slot, &pmu->cpuhp.node);
>>>>> +}
>>>>> +
>>>>> +static void xe_pmu_unregister(struct drm_device *device, void *arg)
>>>>> +{
>>>>> +	struct xe_pmu *pmu = arg;
>>>>> +
>>>>> +	if (!pmu->base.event_init)
>>>>> +		return;
>>>>> +
>>>>> +	/*
>>>>> +	 * "Disconnect" the PMU callbacks - since all are atomic synchronize_rcu
>>>>> +	 * ensures all currently executing ones will have exited before we
>>>>> +	 * proceed with unregistration.
>>>>> +	 */
>>>>> +	pmu->closed = true;
>>>>> +	synchronize_rcu();
>>>>> +
>>>>> +	xe_pmu_unregister_cpuhp_state(pmu);
>>>>> +
>>>>> +	perf_pmu_unregister(&pmu->base);
>>>>> +	pmu->base.event_init = NULL;
>>>>> +	kfree(pmu->base.attr_groups);
>>>>> +	kfree(pmu->name);
>>>>> +	free_event_attributes(pmu);
>>>>> +}
>>>>> +
>>>>> +static void init_samples(struct xe_pmu *pmu)
>>>>> +{
>>>>> +	struct xe_device *xe = container_of(pmu, typeof(*xe), pmu);
>>>>> +	struct xe_gt *gt;
>>>>> +	unsigned int i;
>>>>> +
>>>>> +	for_each_gt(gt, xe, i)
>>>>> +		engine_group_busyness_store(gt);
>>>>> +}
>>>>> +
>>>>> +void xe_pmu_register(struct xe_pmu *pmu)
>>>>
>>>> Why void, why not int? Is a PMU failure a non-fatal error?
>>>
>>> Ya, the device is functional; it is only that these counters are not
>>> available. Hence I didn't want to fail the driver load.
>>>>
>>>>> +{
>>>>> +	struct xe_device *xe = container_of(pmu, typeof(*xe), pmu);
>>>>> +	const struct attribute_group *attr_groups[] = {
>>>>> +		&xe_pmu_format_attr_group,
>>>>> +		&pmu->events_attr_group,
>>>>> +		&xe_pmu_cpumask_attr_group,
>>>>
>>>> Can someone please explain what this cpumask/cpuhotplug stuff does and
>>>> whether it needs to be in this patch? There's something here:
>>>
>>> Comments from the original patch series in i915:
>>> https://patchwork.kernel.org/project/intel-gfx/patch/20170802123249.14194-5-tvrtko.ursulin@linux.intel.com/
>>>
>>> "IIRC an uncore PMU should expose a cpumask through sysfs, and then perf
>>> tools will read that mask and auto-magically limit the number of CPUs it
>>> instantiates the counter on."
>>>
>>> And ours are global counters, not per-CPU, so we limit to just a single CPU.
>>>
>>> And as I understand it, we use the cpuhotplug support to migrate to a new
>>> CPU in case the earlier one goes offline.
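
Concretely, perf reads something like this (hypothetical output):

	$ cat /sys/bus/event_source/devices/xe_0000_03_00.0/cpumask
	0

and then opens the events only on that CPU; on hotplug
xe_pmu_cpu_offline() migrates the perf context to a new CPU and updates
the mask.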
>>
>> OK, leave as is.
>>
>>>
>>>>
>>>> b46a33e271ed ("drm/i915/pmu: Expose a PMU interface for perf queries")
>>>>
>>>> I'd rather just have the basic PMU infra and the events in this patch and
>>>> punt this cpumask/cpuhotplug stuff to a later patch, unless someone can say
>>>> what it does.
>>>>
>>>> Though perf_pmu_register seems to be doing some per cpu stuff so likely
>>>> this is needed. But amdgpu_pmu only has event and format attributes.
>>>>
>>>> Mostly leave as is I guess.
>>>>
>>>>> +		NULL
>>>>> +	};
>>>>> +
>>>>> +	int ret = -ENOMEM;
>>>>> +
>>>>> +	spin_lock_init(&pmu->lock);
>>>>> +	pmu->cpuhp.cpu = -1;
>>>>> +	init_samples(pmu);
>>>>
>>>> Why init_samples here? Can't we init the particular sample in
>>>> xe_pmu_event_init or even xe_pmu_event_start?
>>>>
>>>> Init'ing here may be too soon since the event might not be enabled for a
>>>> long time. So really this needs to move to xe_pmu_event_init or
>>>> xe_pmu_event_start.
>>>
>>> The device is put to suspend immediately after driver probe, and the PMU
>>> is typically opened even before any workload is run, so essentially the
>>> device is still suspended and we cannot access the registers. Hence we
>>> store the last known value in init_samples; otherwise we see the bug from
>>> v#1 of the series.
>>>
>>>>
>>>> Actually this is already happening in xe_pmu_enable. We just need to decide
>>>> when we want to wake the device up and when we don't. So maybe wake the
>>>> device up at start (use xe_device_mem_access_get) and not wake up during
>>>> read (xe_device_mem_access_get_if_ongoing etc.)?
>>
>> Just going to repeat this again:
>>
>> xe_pmu_event_start calls xe_pmu_enable. In xe_pmu_enable use
>> xe_device_mem_access_get before calling __xe_pmu_event_read. This will wake
>> the device up and get a valid value in event->hw.prev_count.
>>
>> In xe_pmu_event_read, use xe_device_mem_access_get_if_ongoing, to read the
>> event without waking the device up (and return previous value etc.).
>>
>> Or, pass a flag in to __xe_pmu_event_read and to engine_group_busyness_read
>> and __engine_group_busyness_read. The flag will say whether or not to wake
>> up the device. If the flag says wake the device up, call
>> xe_device_mem_access_get and xe_force_wake_get, maybe in
>> __engine_group_busyness_read, before reading device registers. If the flag
>> says don't wake up the device call xe_device_mem_access_get_if_ongoing.
>>
>> This way we:
>> * don't need to call init_samples in xe_pmu_register
>> * we don't need engine_group_busyness_store
>> * we don't need to specifically call engine_group_busyness_store in xe_gt_suspend
>>
>> The correct sample is read by waking up the device in xe_pmu_event_start.
>>
>> Hopefully this is clear now.
> 
> Actually I think it is not necessary to do anything in xe_pmu_event_start,
> as long as we keep the engine_group_busyness_store in xe_gt_suspend (so we
> can just use xe_device_mem_access_get_if_ongoing, don't need
> xe_device_mem_access_get as you said).
> 
> AFAICS, the problem of huge values in v1 was due to reading the device
> registers when device was not awake. That problem we've already solved in
> v2 where in engine_group_busyness_read() we only read if the device is
> awake and skip and return the previous value when the device is not
> awake. So that fixes the problem of huge values.
> 
> The problem in this patch (I thought) is that we are effectively sampling
> the registers each time perf calls xe_pmu_event_read, say every 1 sec.
> When we sample, if the device is awake we will return the correct ns
> value; if the device is asleep we will return the old value. If the
> device wakes up and does some work in the 1 sec period but then suspends
> again, we will miss that activity. That is the problem that is being
> solved by storing the samples in xe_gt_suspend().
> 
> i915 solved this problem using a 5 ms timer but I like the solution of
> sampling in xe_gt_suspend better, so let's keep it.
> 
> But init_samples is not needed, AFAICS we don't need to do anything in
> xe_pmu_register or in xe_pmu_event_start (and we don't need to use
> xe_device_mem_access_get). xe_gt_suspend will take care of storing the
> register values before the device suspends.
> 
> Hopefully it makes sense now, sorry for the confusion.

Yes, you are right, we do not need init_samples.
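
So the read path ends up roughly like this (sketch only; assuming
xe_device_mem_access_get_if_ongoing() returns whether a reference was
actually taken):

	device_awake = xe_device_mem_access_get_if_ongoing(xe);
	if (device_awake)
		val = __engine_group_busyness_read(gt, config);

	spin_lock_irqsave(&pmu->lock, flags);
	if (device_awake)
		pmu->sample[gt_id][sample_type] = val;	/* refresh driver copy */
	else
		val = pmu->sample[gt_id][sample_type];	/* stored at gt suspend */
	spin_unlock_irqrestore(&pmu->lock, flags);

	if (device_awake)
		xe_device_mem_access_put(xe);

	return val;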

Thanks,
Aravind.
> 
> Ashutosh
> 
>>
>>>
>>>>
>>>>> +
>>>>> +	pmu->name = kasprintf(GFP_KERNEL,
>>>>> +			      "xe_%s",
>>>>> +			      dev_name(xe->drm.dev));
>>>>> +	if (pmu->name)
>>>>> +		/* tools/perf reserves colons as special. */
>>>>> +		strreplace((char *)pmu->name, ':', '_');
>>>>> +
>>>>> +	if (!pmu->name)
>>>>> +		goto err;
>>>>> +
>>>>> +	pmu->events_attr_group.name = "events";
>>>>> +	pmu->events_attr_group.attrs = create_event_attributes(pmu);
>>>>> +	if (!pmu->events_attr_group.attrs)
>>>>> +		goto err_name;
>>>>> +
>>>>> +	pmu->base.attr_groups = kmemdup(attr_groups, sizeof(attr_groups),
>>>>> +					GFP_KERNEL);
>>>>> +	if (!pmu->base.attr_groups)
>>>>> +		goto err_attr;
>>>>> +
>>>>> +	pmu->base.module	= THIS_MODULE;
>>>>> +	pmu->base.task_ctx_nr	= perf_invalid_context;
>>>>> +	pmu->base.event_init	= xe_pmu_event_init;
>>>>> +	pmu->base.add		= xe_pmu_event_add;
>>>>> +	pmu->base.del		= xe_pmu_event_del;
>>>>> +	pmu->base.start		= xe_pmu_event_start;
>>>>> +	pmu->base.stop		= xe_pmu_event_stop;
>>>>> +	pmu->base.read		= xe_pmu_event_read;
>>>>> +	pmu->base.event_idx	= xe_pmu_event_event_idx;
>>>>> +
>>>>> +	ret = perf_pmu_register(&pmu->base, pmu->name, -1);
>>>>> +	if (ret)
>>>>> +		goto err_groups;
>>>>> +
>>>>> +	ret = xe_pmu_register_cpuhp_state(pmu);
>>>>> +	if (ret)
>>>>> +		goto err_unreg;
>>>>> +
>>>>> +	ret = drmm_add_action_or_reset(&xe->drm, xe_pmu_unregister, pmu);
>>>>> +	XE_WARN_ON(ret);
>>>>
>>>> We should just follow the regular error rewind here too and let
>>>> drm_notice() at the end print the error. This is what other drivers calling
>>>> drmm_add_action_or_reset seem to be doing.
>>>
>>> Ok ok.
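
i.e. something like this (sketch; relying on drmm_add_action_or_reset()
already invoking xe_pmu_unregister() itself when it fails, so no extra
unwinding is needed for that step):

	ret = drmm_add_action_or_reset(&xe->drm, xe_pmu_unregister, pmu);
	if (ret)
		goto err;	/* cleanup already ran via the _or_reset() path */

	return;

with drm_notice() at the err label reporting the failure as before.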
>>>>
>>>>> +
>>>>> +	return;
>>>>> +
>>>>> +err_unreg:
>>>>> +	perf_pmu_unregister(&pmu->base);
>>>>> +err_groups:
>>>>> +	kfree(pmu->base.attr_groups);
>>>>> +err_attr:
>>>>> +	pmu->base.event_init = NULL;
>>>>> +	free_event_attributes(pmu);
>>>>> +err_name:
>>>>> +	kfree(pmu->name);
>>>>> +err:
>>>>> +	drm_notice(&xe->drm, "Failed to register PMU!\n");
>>>>> +}
>>>>> diff --git a/drivers/gpu/drm/xe/xe_pmu.h b/drivers/gpu/drm/xe/xe_pmu.h
>>>>> new file mode 100644
>>>>> index 000000000000..d3f47f4ab343
>>>>> --- /dev/null
>>>>> +++ b/drivers/gpu/drm/xe/xe_pmu.h
>>>>> @@ -0,0 +1,25 @@
>>>>> +/* SPDX-License-Identifier: MIT */
>>>>> +/*
>>>>> + * Copyright © 2023 Intel Corporation
>>>>> + */
>>>>> +
>>>>> +#ifndef _XE_PMU_H_
>>>>> +#define _XE_PMU_H_
>>>>> +
>>>>> +#include "xe_gt_types.h"
>>>>> +#include "xe_pmu_types.h"
>>>>> +
>>>>> +#ifdef CONFIG_PERF_EVENTS
>>>>
>>>> nit but maybe this should be:
>>>>
>>>> #if IS_ENABLED(CONFIG_PERF_EVENTS)
>>>>
>>>> or,
>>>>
>>>> #if IS_BUILTIN(CONFIG_PERF_EVENTS)
>>>>
>>>> Note CONFIG_PERF_EVENTS is a boolean kconfig option.
>>>>
>>>> See similar macro IS_REACHABLE() in i915_hwmon.h.
>>>>
>>>>> +int xe_pmu_init(void);
>>>>> +void xe_pmu_exit(void);
>>>>> +void xe_pmu_register(struct xe_pmu *pmu);
>>>>> +void engine_group_busyness_store(struct xe_gt *gt);
>>>>
>>>> Add xe_pmu_ prefix if function is needed (hopefully not).
>>>
>>> OK
>>>>
>>>>> +#else
>>>>> +static inline int xe_pmu_init(void) { return 0; }
>>>>> +static inline void xe_pmu_exit(void) {}
>>>>> +static inline void xe_pmu_register(struct xe_pmu *pmu) {}
>>>>> +static inline void engine_group_busyness_store(struct xe_gt *gt) {}
>>>>> +#endif
>>>>> +
>>>>> +#endif
>>>>> +
>>>>> diff --git a/drivers/gpu/drm/xe/xe_pmu_types.h b/drivers/gpu/drm/xe/xe_pmu_types.h
>>>>> new file mode 100644
>>>>> index 000000000000..e87edd4d6a87
>>>>> --- /dev/null
>>>>> +++ b/drivers/gpu/drm/xe/xe_pmu_types.h
>>>>> @@ -0,0 +1,80 @@
>>>>> +/* SPDX-License-Identifier: MIT */
>>>>> +/*
>>>>> + * Copyright © 2023 Intel Corporation
>>>>> + */
>>>>> +
>>>>> +#ifndef _XE_PMU_TYPES_H_
>>>>> +#define _XE_PMU_TYPES_H_
>>>>> +
>>>>> +#include <linux/perf_event.h>
>>>>> +#include <linux/spinlock_types.h>
>>>>> +#include <uapi/drm/xe_drm.h>
>>>>> +
>>>>> +enum {
>>>>> +	__XE_SAMPLE_RENDER_GROUP_BUSY,
>>>>> +	__XE_SAMPLE_COPY_GROUP_BUSY,
>>>>> +	__XE_SAMPLE_MEDIA_GROUP_BUSY,
>>>>> +	__XE_SAMPLE_ANY_ENGINE_GROUP_BUSY,
>>>>> +	__XE_NUM_PMU_SAMPLERS
>>>>
>>>> OK: irq_count is missing here since these are read from device...
>>>>
>>>>> +};
>>>>> +
>>>>> +struct xe_pmu_sample {
>>>>> +	u64 cur;
>>>>> +};
>>>>
>>>> This was also discussed for i915 PMU, no point having a struct with a
>>>> single u64 member. Might as well just use u64 wherever we are using
>>>> struct xe_pmu_sample.
>>>
>>> OK.
>>>>
>>>>> +
>>>>> +#define XE_MAX_GT_PER_TILE 2
>>>>
>>>> Why per tile? The array size should be max_gt_per_device. Just call it
>>>> XE_MAX_GT?
>>>
>>> I declared it similar to what we have in drivers/gpu/drm/xe/xe_device.h.
>>
>> Our 2-d array size is for the device, not per tile. So let's use XE_MAX_GT.
>>
>>>>
>>>>> +
>>>>> +struct xe_pmu {
>>>>> +	/**
>>>>> +	 * @cpuhp: Struct used for CPU hotplug handling.
>>>>> +	 */
>>>>> +	struct {
>>>>> +		struct hlist_node node;
>>>>> +		unsigned int cpu;
>>>>> +	} cpuhp;
>>>>> +	/**
>>>>> +	 * @base: PMU base.
>>>>> +	 */
>>>>> +	struct pmu base;
>>>>> +	/**
>>>>> +	 * @closed: xe is unregistering.
>>>>> +	 */
>>>>> +	bool closed;
>>>>> +	/**
>>>>> +	 * @name: Name as registered with perf core.
>>>>> +	 */
>>>>> +	const char *name;
>>>>> +	/**
>>>>> +	 * @lock: Lock protecting enable mask and ref count handling.
>>>>> +	 */
>>>>> +	spinlock_t lock;
>>>>> +	/**
>>>>> +	 * @sample: Current and previous (raw) counters.
>>>>> +	 *
>>>>> +	 * These counters are updated when the device is awake.
>>>>> +	 *
>>>>> +	 */
>>>>> +	struct xe_pmu_sample sample[XE_MAX_GT_PER_TILE * __XE_NUM_PMU_SAMPLERS];
>>>>
>>>> Change to 2-d array. See above.
>>>>
>>>>> +	/**
>>>>> +	 * @irq_count: Number of interrupts
>>>>> +	 *
>>>>> +	 * Intentionally unsigned long to avoid atomics or heuristics on 32bit.
>>>>> +	 * 4e9 interrupts are a lot and postprocessing can really deal with an
>>>>> +	 * occasional wraparound easily. It's 32bit after all.
>>>>> +	 */
>>>>> +	unsigned long irq_count;
>>>>> +	/**
>>>>> +	 * @events_attr_group: Device events attribute group.
>>>>> +	 */
>>>>> +	struct attribute_group events_attr_group;
>>>>> +	/**
>>>>> +	 * @xe_attr: Memory block holding device attributes.
>>>>> +	 */
>>>>> +	void *xe_attr;
>>>>> +	/**
>>>>> +	 * @pmu_attr: Memory block holding device attributes.
>>>>> +	 */
>>>>> +	void *pmu_attr;
>>>>> +};
>>>>> +
>>>>> +#endif
>>>>> diff --git a/include/uapi/drm/xe_drm.h b/include/uapi/drm/xe_drm.h
>>>>> index 965cd9527ff1..ed097056f944 100644
>>>>> --- a/include/uapi/drm/xe_drm.h
>>>>> +++ b/include/uapi/drm/xe_drm.h
>>>>> @@ -990,6 +990,22 @@ struct drm_xe_vm_madvise {
>>>>> 	__u64 reserved[2];
>>>>>  };
>>>>>
>>>>> +/* PMU event config IDs */
>>>>> +
>>>>> +/*
>>>>> + * Top 4 bits of every counter are GT id.
>>>>> + */
>>>>> +#define __XE_PMU_GT_SHIFT (60)
>>>>
>>>> To future-proof this, and also because we seem to have plenty of bits
>>>> available, I think we should change this to 56 (instead of 60).
>>>
>>> OK
>>>
>>> Thanks,
>>> Aravind.
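
Just to sanity-check the decode with the new shift, a quick hypothetical
example:

	/* assuming __XE_PMU_GT_SHIFT becomes 56 */
	XE_PMU_RENDER_GROUP_BUSY(1) == (1ULL << 56) | 1 == 0x0100000000000001
	/* config_gt_id() -> 1, config_counter() -> 1 */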
>>>>
>>>>> +
>>>>> +#define ___XE_PMU_OTHER(gt, x) \
>>>>> +	(((__u64)(x)) | ((__u64)(gt) << __XE_PMU_GT_SHIFT))
>>>>> +
>>>>> +#define XE_PMU_INTERRUPTS(gt)			___XE_PMU_OTHER(gt, 0)
>>>>> +#define XE_PMU_RENDER_GROUP_BUSY(gt)		___XE_PMU_OTHER(gt, 1)
>>>>> +#define XE_PMU_COPY_GROUP_BUSY(gt)		___XE_PMU_OTHER(gt, 2)
>>>>> +#define XE_PMU_MEDIA_GROUP_BUSY(gt)		___XE_PMU_OTHER(gt, 3)
>>>>> +#define XE_PMU_ANY_ENGINE_GROUP_BUSY(gt)	___XE_PMU_OTHER(gt, 4)
>>>>> +
>>>>>  #if defined(__cplusplus)
>>>>>  }
>>>>>  #endif
>>>>> --
>>>>> 2.25.1
>>
>> Thanks.
>> --
>> Ashutosh

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [Intel-xe] [PATCH v2 2/2] drm/xe/pmu: Enable PMU interface
  2023-07-24  8:03           ` Iddamsetty, Aravind
@ 2023-07-24  9:00             ` Ursulin, Tvrtko
  2023-07-24 15:52               ` Dixit, Ashutosh
  2023-07-24 15:52             ` Dixit, Ashutosh
  1 sibling, 1 reply; 59+ messages in thread
From: Ursulin, Tvrtko @ 2023-07-24  9:00 UTC (permalink / raw)
  To: Iddamsetty, Aravind, Dixit, Ashutosh; +Cc: Bommu, Krishnaiah, intel-xe


[Top-post since I have a dual email setup and this is the wrong one, sorry.]

Glancing over the discussion - small correction - i915 did not solve the problem of hardware counters and sleeping device with the timer but with the park/unpark hooks.

More similar to these group busyness counters would be the RC6, and you will notice there is nothing in the i915 sampling timer about RC6. There is just some complicated code in park/unpark to estimate RC6 while parked. But the estimation is beside the point for engine group busyness since it is the opposite metric (grows on busy vs grows idle).

And I think you converged to the same solution already, so I just wanted to correct the 5ms timer inaccuracy.

Regards,

Tvrtko

-----Original Message-----
From: Iddamsetty, Aravind <aravind.iddamsetty@intel.com> 
Sent: Monday, July 24, 2023 9:03 AM
To: Dixit, Ashutosh <ashutosh.dixit@intel.com>
Cc: intel-xe@lists.freedesktop.org; Bommu, Krishnaiah <krishnaiah.bommu@intel.com>; Ursulin, Tvrtko <tvrtko.ursulin@intel.com>; Upadhyay, Tejas <tejas.upadhyay@intel.com>
Subject: Re: [Intel-xe] [PATCH v2 2/2] drm/xe/pmu: Enable PMU interface



On 22-07-2023 11:34, Dixit, Ashutosh wrote:

Hi Ashutosh,

> On Fri, 21 Jul 2023 16:36:02 -0700, Dixit, Ashutosh wrote:
>>
>> On Fri, 21 Jul 2023 04:51:09 -0700, Iddamsetty, Aravind wrote:
>>>
>> Hi Aravind,
>>
>>> On 21-07-2023 06:32, Dixit, Ashutosh wrote:
>>>> On Tue, 27 Jun 2023 05:21:13 -0700, Aravind Iddamsetty wrote:
>>>>>
>>>> More stuff to mull over. You can ignore comments starting with 
>>>> "OK", those are just notes to myself.
>>>>
>>>> Also, maybe some time we can add a basic IGT which reads these 
>>>> exposed counters and verifies that we can read them and they are 
>>>> monotonically increasing?
>>>
>>> This is the IGT series using these counters, posted by Venkat:
>>> https://patchwork.freedesktop.org/series/119936/
>>>
>>>>
>>>>> There are a set of engine group busyness counters provided by HW 
>>>>> which are perfect fit to be exposed via PMU perf events.
>>>>>
>>>>> BSPEC: 46559, 46560, 46722, 46729
>>>>
>>>> Also add these Bspec entries: 71028, 52071
>>>
>>> OK.
>>>
>>>>
>>>>>
>>>>> events can be listed using:
>>>>> perf list
>>>>>   xe_0000_03_00.0/any-engine-group-busy-gt0/         [Kernel PMU event]
>>>>>   xe_0000_03_00.0/copy-group-busy-gt0/               [Kernel PMU event]
>>>>>   xe_0000_03_00.0/interrupts/                        [Kernel PMU event]
>>>>>   xe_0000_03_00.0/media-group-busy-gt0/              [Kernel PMU event]
>>>>>   xe_0000_03_00.0/render-group-busy-gt0/             [Kernel PMU event]
>>>>>
>>>>> and can be read using:
>>>>>
>>>>> perf stat -e "xe_0000_8c_00.0/render-group-busy-gt0/" -I 1000
>>>>>            time             counts unit events
>>>>>      1.001139062                  0 ns  xe_0000_8c_00.0/render-group-busy-gt0/
>>>>>      2.003294678                  0 ns  xe_0000_8c_00.0/render-group-busy-gt0/
>>>>>      3.005199582                  0 ns  xe_0000_8c_00.0/render-group-busy-gt0/
>>>>>      4.007076497                  0 ns  xe_0000_8c_00.0/render-group-busy-gt0/
>>>>>      5.008553068                  0 ns  xe_0000_8c_00.0/render-group-busy-gt0/
>>>>>      6.010531563              43520 ns  xe_0000_8c_00.0/render-group-busy-gt0/
>>>>>      7.012468029              44800 ns  xe_0000_8c_00.0/render-group-busy-gt0/
>>>>>      8.013463515                  0 ns  xe_0000_8c_00.0/render-group-busy-gt0/
>>>>>      9.015300183                  0 ns  xe_0000_8c_00.0/render-group-busy-gt0/
>>>>>     10.017233010                  0 ns  xe_0000_8c_00.0/render-group-busy-gt0/
>>>>>     10.971934120                  0 ns  xe_0000_8c_00.0/render-group-busy-gt0/
>>>>>
>>>>> The pmu base implementation is taken from i915.
>>>>>
>>>>> v2:
>>>>> Store last known value when device is awake return that while the 
>>>>> GT is suspended and then update the driver copy when read during awake.
>>>>>
>>>>> Co-developed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
>>>>> Co-developed-by: Bommu Krishnaiah <krishnaiah.bommu@intel.com>
>>>>> Signed-off-by: Aravind Iddamsetty <aravind.iddamsetty@intel.com>
>>>>> ---
>>>>>  drivers/gpu/drm/xe/Makefile          |   2 +
>>>>>  drivers/gpu/drm/xe/regs/xe_gt_regs.h |   5 +
>>>>>  drivers/gpu/drm/xe/xe_device.c       |   2 +
>>>>>  drivers/gpu/drm/xe/xe_device_types.h |   4 +
>>>>>  drivers/gpu/drm/xe/xe_gt.c           |   2 +
>>>>>  drivers/gpu/drm/xe/xe_irq.c          |  22 +
>>>>>  drivers/gpu/drm/xe/xe_module.c       |   5 +
>>>>>  drivers/gpu/drm/xe/xe_pmu.c          | 739 +++++++++++++++++++++++++++
>>>>>  drivers/gpu/drm/xe/xe_pmu.h          |  25 +
>>>>>  drivers/gpu/drm/xe/xe_pmu_types.h    |  80 +++
>>>>>  include/uapi/drm/xe_drm.h            |  16 +
>>>>>  11 files changed, 902 insertions(+)
>>>>>  create mode 100644 drivers/gpu/drm/xe/xe_pmu.c
>>>>>  create mode 100644 drivers/gpu/drm/xe/xe_pmu.h
>>>>>  create mode 100644 drivers/gpu/drm/xe/xe_pmu_types.h
>>>>>
>>>>> diff --git a/drivers/gpu/drm/xe/Makefile b/drivers/gpu/drm/xe/Makefile
>>>>> index 081c57fd8632..e52ab795c566 100644
>>>>> --- a/drivers/gpu/drm/xe/Makefile
>>>>> +++ b/drivers/gpu/drm/xe/Makefile
>>>>> @@ -217,6 +217,8 @@ xe-$(CONFIG_DRM_XE_DISPLAY) += \
>>>>> 	i915-display/skl_universal_plane.o \
>>>>> 	i915-display/skl_watermark.o
>>>>>
>>>>> +xe-$(CONFIG_PERF_EVENTS) += xe_pmu.o
>>>>> +
>>>>>  ifeq ($(CONFIG_ACPI),y)
>>>>> 	xe-$(CONFIG_DRM_XE_DISPLAY) += \
>>>>> 		i915-display/intel_acpi.o \
>>>>> diff --git a/drivers/gpu/drm/xe/regs/xe_gt_regs.h b/drivers/gpu/drm/xe/regs/xe_gt_regs.h
>>>>> index 3f664011eaea..c7d9e4634745 100644
>>>>> --- a/drivers/gpu/drm/xe/regs/xe_gt_regs.h
>>>>> +++ b/drivers/gpu/drm/xe/regs/xe_gt_regs.h
>>>>> @@ -285,6 +285,11 @@
>>>>>  #define   INVALIDATION_BROADCAST_MODE_DIS	REG_BIT(12)
>>>>>  #define   GLOBAL_INVALIDATION_MODE		REG_BIT(2)
>>>>>
>>>>> +#define XE_OAG_RC0_ANY_ENGINE_BUSY_FREE		XE_REG(0xdb80)
>>>>> +#define XE_OAG_ANY_MEDIA_FF_BUSY_FREE		XE_REG(0xdba0)
>>>>> +#define XE_OAG_BLT_BUSY_FREE			XE_REG(0xdbbc)
>>>>> +#define XE_OAG_RENDER_BUSY_FREE			XE_REG(0xdbdc)
>>>>> +
>>>>>  #define SAMPLER_MODE				XE_REG_MCR(0xe18c, XE_REG_OPTION_MASKED)
>>>>>  #define   ENABLE_SMALLPL			REG_BIT(15)
>>>>>  #define   SC_DISABLE_POWER_OPTIMIZATION_EBB	REG_BIT(9)
>>>>> diff --git a/drivers/gpu/drm/xe/xe_device.c b/drivers/gpu/drm/xe/xe_device.c
>>>>> index c7985af85a53..b2c7bd4a97d9 100644
>>>>> --- a/drivers/gpu/drm/xe/xe_device.c
>>>>> +++ b/drivers/gpu/drm/xe/xe_device.c
>>>>> @@ -328,6 +328,8 @@ int xe_device_probe(struct xe_device *xe)
>>>>>
>>>>> 	xe_debugfs_register(xe);
>>>>>
>>>>> +	xe_pmu_register(&xe->pmu);
>>>>> +
>>>>> 	err = drmm_add_action_or_reset(&xe->drm, xe_device_sanitize, xe);
>>>>> 	if (err)
>>>>> 		return err;
>>>>> diff --git a/drivers/gpu/drm/xe/xe_device_types.h b/drivers/gpu/drm/xe/xe_device_types.h
>>>>> index 0226d44a6af2..3ba99aae92b9 100644
>>>>> --- a/drivers/gpu/drm/xe/xe_device_types.h
>>>>> +++ b/drivers/gpu/drm/xe/xe_device_types.h
>>>>> @@ -15,6 +15,7 @@
>>>>>  #include "xe_devcoredump_types.h"
>>>>>  #include "xe_gt_types.h"
>>>>>  #include "xe_platform_types.h"
>>>>> +#include "xe_pmu.h"
>>>>>  #include "xe_step_types.h"
>>>>>
>>>>>  #if IS_ENABLED(CONFIG_DRM_XE_DISPLAY)
>>>>> @@ -332,6 +333,9 @@ struct xe_device {
>>>>> 	/** @d3cold_allowed: Indicates if d3cold is a valid device state */
>>>>> 	bool d3cold_allowed;
>>>>>
>>>>> +	/* @pmu: performance monitoring unit */
>>>>> +	struct xe_pmu pmu;
>>>>> +
>>>>
>>>> Now I am wondering why we don't make the PMU per-gt (or per-tile)? 
>>>> Per-gt would work for these busyness registers and I am not sure 
>>>> how the interrupts are hooked up.
>>>>
>>>> In i915 the PMU being device level helped in having a single timer 
>>>> (rather than a per gt timer).
>>>>
>>>> Anyway, probably not much practical benefit in making it per-gt/per-tile,
>>>> so maybe leave as is. Just thinking out loud.
>>>
>>> We are able to expose per-gt counters, so I do not see any benefit in
>>> making the PMU per-gt. Also, I believe struct pmu is per device; it has
>>> an associated type which is unique for a device.
>>
>> PMU can be made per gt and named xe-gt0, xe-gt1 etc. if we want. But 
>> anyway leave as is.
>>
>>>
>>>>
>>>>> 	/* private: */
>>>>>
>>>>>  #if IS_ENABLED(CONFIG_DRM_XE_DISPLAY)
>>>>> diff --git a/drivers/gpu/drm/xe/xe_gt.c b/drivers/gpu/drm/xe/xe_gt.c
>>>>> index 2458397ce8af..96e3720923d4 100644
>>>>> --- a/drivers/gpu/drm/xe/xe_gt.c
>>>>> +++ b/drivers/gpu/drm/xe/xe_gt.c
>>>>> @@ -593,6 +593,8 @@ int xe_gt_suspend(struct xe_gt *gt)
>>>>> 	if (err)
>>>>> 		goto err_msg;
>>>>>
>>>>> +	engine_group_busyness_store(gt);
>>>>
>>>> Not sure I follow the reason for doing this at suspend time? If PMU 
>>>> was active there should be a previous value. Why must we take a new 
>>>> sample explicitly here?
>>>
>>> The PMU interface can be read even when the device is suspended, and in
>>> such cases we should not wake up the device. If we try to read the
>>> registers while the device is suspended we get spurious counter values;
>>> you can check in version #1 of this series that we were getting huge
>>> values, as we put the device to suspend immediately after probe. So we
>>> store the last known good value before suspend.
>>
>> No need, see comment at init_samples below.
> 
> Sorry, you are right, I changed my mind about this, I see your point. 
> See more discussion on this at init_samples below.
> 
>>
>>>
>>>>
>>>> To me looks like engine_group_busyness_store should not be needed, 
>>>> see comment below for init_samples too.
>>>>
>>>>> +
>>>>> 	err = xe_uc_suspend(&gt->uc);
>>>>> 	if (err)
>>>>> 		goto err_force_wake;
>>>>> diff --git a/drivers/gpu/drm/xe/xe_irq.c b/drivers/gpu/drm/xe/xe_irq.c
>>>>> index b4ed1e4a3388..cb943fb94ec7 100644
>>>>> --- a/drivers/gpu/drm/xe/xe_irq.c
>>>>> +++ b/drivers/gpu/drm/xe/xe_irq.c
>>>>> @@ -27,6 +27,24 @@
>>>>>  #define IIR(offset)				XE_REG(offset + 0x8)
>>>>>  #define IER(offset)				XE_REG(offset + 0xc)
>>>>>
>>>>> +/*
>>>>> + * Interrupt statistic for PMU. Increments the counter only if the
>>>>> + * interrupt originated from the GPU so interrupts from a device which
>>>>> + * shares the interrupt line are not accounted.
>>>>> + */
>>>>> +static inline void xe_pmu_irq_stats(struct xe_device *xe,
>>>>
>>>> No inline, compiler will do it, but looks like we may want to 
>>>> always_inline this. Also this function should really be in 
>>>> xe_pmu.h? Anyway probably leave as is.
>>>>
>>>>> +				    irqreturn_t res)
>>>>> +{
>>>>> +	if (unlikely(res != IRQ_HANDLED))
>>>>> +		return;
>>>>> +
>>>>> +	/*
>>>>> +	 * A clever compiler translates that into INC. A not so clever one
>>>>> +	 * should at least prevent store tearing.
>>>>> +	 */
>>>>> +	WRITE_ONCE(xe->pmu.irq_count, xe->pmu.irq_count + 1);
>>>>> +}
>>>>> +
>>>>>  static void assert_iir_is_zero(struct xe_gt *mmio, struct xe_reg reg)
>>>>>  {
>>>>> 	u32 val = xe_mmio_read32(mmio, reg);
>>>>> @@ -325,6 +343,8 @@ static irqreturn_t xelp_irq_handler(int irq, void *arg)
>>>>>
>>>>> 	xe_display_irq_enable(xe, gu_misc_iir);
>>>>>
>>>>> +	xe_pmu_irq_stats(xe, IRQ_HANDLED);
>>>>> +
>>>>> 	return IRQ_HANDLED;
>>>>>  }
>>>>>
>>>>> @@ -414,6 +434,8 @@ static irqreturn_t dg1_irq_handler(int irq, void *arg)
>>>>> 	dg1_intr_enable(xe, false);
>>>>> 	xe_display_irq_enable(xe, gu_misc_iir);
>>>>>
>>>>> +	xe_pmu_irq_stats(xe, IRQ_HANDLED);
>>>>> +
>>>>> 	return IRQ_HANDLED;
>>>>>  }
>>>>>
>>>>> diff --git a/drivers/gpu/drm/xe/xe_module.c b/drivers/gpu/drm/xe/xe_module.c
>>>>> index 75e5be939f53..f6fe89748525 100644
>>>>> --- a/drivers/gpu/drm/xe/xe_module.c
>>>>> +++ b/drivers/gpu/drm/xe/xe_module.c
>>>>> @@ -12,6 +12,7 @@
>>>>>  #include "xe_hw_fence.h"
>>>>>  #include "xe_module.h"
>>>>>  #include "xe_pci.h"
>>>>> +#include "xe_pmu.h"
>>>>>  #include "xe_sched_job.h"
>>>>>
>>>>>  bool enable_guc = true;
>>>>> @@ -49,6 +50,10 @@ static const struct init_funcs init_funcs[] = {
>>>>> 		.init = xe_sched_job_module_init,
>>>>> 		.exit = xe_sched_job_module_exit,
>>>>> 	},
>>>>> +	{
>>>>> +		.init = xe_pmu_init,
>>>>> +		.exit = xe_pmu_exit,
>>>>> +	},
>>>>> 	{
>>>>> 		.init = xe_register_pci_driver,
>>>>> 		.exit = xe_unregister_pci_driver,
>>>>> diff --git a/drivers/gpu/drm/xe/xe_pmu.c b/drivers/gpu/drm/xe/xe_pmu.c
>>>>> new file mode 100644
>>>>> index 000000000000..bef1895be9f7
>>>>> --- /dev/null
>>>>> +++ b/drivers/gpu/drm/xe/xe_pmu.c
>>>>> @@ -0,0 +1,739 @@
>>>>> +/*
>>>>> + * SPDX-License-Identifier: MIT
>>>>> + *
>>>>> + * Copyright © 2023 Intel Corporation
>>>>> + */
>>>>
>>>> This SPDX header is for .h files not .c files. Actually, it is for 
>>>> neither :/
>>>
>>> But I see this in almost all the files in xe.
>>
>> Look closely.
>>
>>>>
>>>>> +
>>>>> +#include <drm/drm_drv.h>
>>>>> +#include <drm/drm_managed.h>
>>>>> +#include <drm/xe_drm.h>
>>>>> +
>>>>> +#include "regs/xe_gt_regs.h"
>>>>> +#include "xe_device.h"
>>>>> +#include "xe_gt_clock.h"
>>>>> +#include "xe_mmio.h"
>>>>> +
>>>>> +static cpumask_t xe_pmu_cpumask;
>>>>> +static unsigned int xe_pmu_target_cpu = -1;
>>>>> +
>>>>> +static unsigned int config_gt_id(const u64 config)
>>>>> +{
>>>>> +	return config >> __XE_PMU_GT_SHIFT;
>>>>> +}
>>>>> +
>>>>> +static u64 config_counter(const u64 config)
>>>>> +{
>>>>> +	return config & ~(~0ULL << __XE_PMU_GT_SHIFT);
>>>>> +}
>>>>> +
>>>>> +static unsigned int
>>>>> +__sample_idx(struct xe_pmu *pmu, unsigned int gt_id, int sample) 
>>>>> +{
>>>>> +	unsigned int idx = gt_id * __XE_NUM_PMU_SAMPLERS + sample;
>>>>> +
>>>>> +	XE_BUG_ON(idx >= ARRAY_SIZE(pmu->sample));
>>>>> +
>>>>> +	return idx;
>>>>> +}
>>>>
>>>> The compiler does this for us if we make sample array 2-d.
>>>>
>>>>> +
>>>>> +static u64 read_sample(struct xe_pmu *pmu, unsigned int gt_id, int sample)
>>>>> +{
>>>>> +	return pmu->sample[__sample_idx(pmu, gt_id, sample)].cur;
>>>>> +}
>>>>> +
>>>>> +static void
>>>>> +store_sample(struct xe_pmu *pmu, unsigned int gt_id, int sample, u64 val)
>>>>> +{
>>>>> +	pmu->sample[__sample_idx(pmu, gt_id, sample)].cur = val;
>>>>> +}
>>>>
>>>> The three functions above are not needed if we make the sample 
>>>> array 2-d. See here:
>>>>
>>>> https://patchwork.freedesktop.org/patch/538887/?series=118225&rev=1
>>>>
>>>> Only a part of the patch above was merged (see 8ed0753b527dc) to 
>>>> keep the patch size small, but since for xe we are starting from 
>>>> scratch we can implement the whole approach above (get rid of the 
>>>> read/store helpers entirely, direct assignment without the helpers works).
>>>
>>> Ok I'll go over it.
>>>
>>>>
>>>>> +
>>>>> +static int engine_busyness_sample_type(u64 config)
>>>>> +{
>>>>> +	int type = 0;
>>>>> +
>>>>> +	switch (config) {
>>>>> +	case XE_PMU_RENDER_GROUP_BUSY(0):
>>>>> +		type =  __XE_SAMPLE_RENDER_GROUP_BUSY;
>>>>> +		break;
>>>>> +	case XE_PMU_COPY_GROUP_BUSY(0):
>>>>> +		type = __XE_SAMPLE_COPY_GROUP_BUSY;
>>>>> +		break;
>>>>> +	case XE_PMU_MEDIA_GROUP_BUSY(0):
>>>>> +		type = __XE_SAMPLE_MEDIA_GROUP_BUSY;
>>>>> +		break;
>>>>> +	case XE_PMU_ANY_ENGINE_GROUP_BUSY(0):
>>>>> +		type = __XE_SAMPLE_ANY_ENGINE_GROUP_BUSY;
>>>>> +		break;
>>>>> +	}
>>>>> +
>>>>> +	return type;
>>>>> +}
>>>>> +
>>>>> +static void xe_pmu_event_destroy(struct perf_event *event)
>>>>> +{
>>>>> +	struct xe_device *xe =
>>>>> +		container_of(event->pmu, typeof(*xe), pmu.base);
>>>>> +
>>>>> +	drm_WARN_ON(&xe->drm, event->parent);
>>>>> +
>>>>> +	drm_dev_put(&xe->drm);
>>>>> +}
>>>>> +
>>>>> +static u64 __engine_group_busyness_read(struct xe_gt *gt, u64 config)
>>>>> +{
>>>>> +	u64 val = 0;
>>>>> +
>>>>> +	switch (config) {
>>>>> +	case XE_PMU_RENDER_GROUP_BUSY(0):
>>>>> +		val = xe_mmio_read32(gt, XE_OAG_RENDER_BUSY_FREE);
>>>>> +		break;
>>>>> +	case XE_PMU_COPY_GROUP_BUSY(0):
>>>>> +		val = xe_mmio_read32(gt, XE_OAG_BLT_BUSY_FREE);
>>>>> +		break;
>>>>> +	case XE_PMU_MEDIA_GROUP_BUSY(0):
>>>>> +		val = xe_mmio_read32(gt, XE_OAG_ANY_MEDIA_FF_BUSY_FREE);
>>>>> +		break;
>>>>> +	case XE_PMU_ANY_ENGINE_GROUP_BUSY(0):
>>>>> +		val = xe_mmio_read32(gt, XE_OAG_RC0_ANY_ENGINE_BUSY_FREE);
>>>>> +		break;
>>>>> +	default:
>>>>> +		drm_warn(&gt->tile->xe->drm, "unknown pmu event\n");
>>>>> +	}
>>>>
>>>> We need xe_device_mem_access_get, also xe_force_wake_get if 
>>>> applicable, somewhere before reading these registers.
>>>>
>>>>> +
>>>>> +	return xe_gt_clock_interval_to_ns(gt, val * 16);
>>>>> +}
>>>>> +
>>>>> +static u64 engine_group_busyness_read(struct xe_gt *gt, u64 config)
>>>>> +{
>>>>> +	int sample_type = engine_busyness_sample_type(config);
>>>>> +	struct xe_device *xe = gt->tile->xe;
>>>>> +	const unsigned int gt_id = gt->info.id;
>>>>> +	struct xe_pmu *pmu = &xe->pmu;
>>>>> +	bool device_awake;
>>>>> +	unsigned long flags;
>>>>> +	u64 val;
>>>>> +
>>>>> +	/*
>>>>> +	 * found no better way to check if device is awake or not. Before
>>>>
>>>> xe_device_mem_access_get_if_ongoing (hard to find name).
>>>
>>> Thanks for sharing; it looks to have been added recently. If we use
>>> this we needn't use xe_device_mem_access_get.
>>
>> See comment at init_samples.
>>
>>>
>>>>
>>>>> +	 * we suspend we set the submission_state.enabled to false.
>>>>> +	 */
>>>>> +	device_awake = gt->uc.guc.submission_state.enabled ? true : false;
>>>>> +	if (device_awake)
>>>>> +		val = __engine_group_busyness_read(gt, config);
>>>>> +
>>>>> +	spin_lock_irqsave(&pmu->lock, flags);
>>>>> +
>>>>> +	if (device_awake)
>>>>> +		store_sample(pmu, gt_id, sample_type, val);
>>>>> +	else
>>>>> +		val = read_sample(pmu, gt_id, sample_type);
>>>>> +
>>>>> +	spin_unlock_irqrestore(&pmu->lock, flags);
>>>>> +
>>>>> +	return val;
>>>>> +}
>>>>> +
>>>>> +void engine_group_busyness_store(struct xe_gt *gt)
>>>>> +{
>>>>> +	struct xe_pmu *pmu = &gt->tile->xe->pmu;
>>>>> +	unsigned int gt_id = gt->info.id;
>>>>> +	unsigned long flags;
>>>>> +
>>>>> +	spin_lock_irqsave(&pmu->lock, flags);
>>>>> +
>>>>> +	store_sample(pmu, gt_id, __XE_SAMPLE_RENDER_GROUP_BUSY,
>>>>> +		     __engine_group_busyness_read(gt, XE_PMU_RENDER_GROUP_BUSY(0)));
>>>>> +	store_sample(pmu, gt_id, __XE_SAMPLE_COPY_GROUP_BUSY,
>>>>> +		     __engine_group_busyness_read(gt, XE_PMU_COPY_GROUP_BUSY(0)));
>>>>> +	store_sample(pmu, gt_id, __XE_SAMPLE_MEDIA_GROUP_BUSY,
>>>>> +		     __engine_group_busyness_read(gt, XE_PMU_MEDIA_GROUP_BUSY(0)));
>>>>> +	store_sample(pmu, gt_id, __XE_SAMPLE_ANY_ENGINE_GROUP_BUSY,
>>>>> +		     __engine_group_busyness_read(gt, XE_PMU_ANY_ENGINE_GROUP_BUSY(0)));
> 
> Here why should we store everything, we should store only those events 
> which are enabled?

The events are enabled only when they are opened, which can happen after the device is suspended, hence we need to store all of them. In the present case the device is put to suspend immediately after probe, and the event is opened after driver load is done.
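
To make the ordering concrete, the sequence being described is roughly
(a sketch of the flow, not code from the patch):

  probe
    -> device runtime-suspends right away
       -> xe_gt_suspend() -> engine_group_busyness_store()  /* caches all counters */
  perf_event_open() (device still asleep)
    -> xe_pmu_event_start() -> __xe_pmu_event_read()
       -> engine_group_busyness_read() returns the cached values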

> 
> Also it would good if the above can be done in a loop somehow. 4 is 
> fine but if we add events later, a loop will be nice, if possible.

I do not think a loop would solve it for future events, as the storing can be selective, i.e. only a few need to be stored; pure counters like interrupts need not be stored.

> 
>>>>> +
>>>>> +	spin_unlock_irqrestore(&pmu->lock, flags);
>>>>> +}
>>>>> +
>>>>> +static int
>>>>> +config_status(struct xe_device *xe, u64 config)
>>>>> +{
>>>>> +	unsigned int max_gt_id = xe->info.gt_count > 1 ? 1 : 0;
>>>>> +	unsigned int gt_id = config_gt_id(config);
>>>>> +	struct xe_gt *gt = xe_device_get_gt(xe, gt_id);
>>>>> +
>>>>> +	if (gt_id > max_gt_id)
>>>>> +		return -ENOENT;
>>>>> +
>>>>> +	switch (config_counter(config)) {
>>>>> +	case XE_PMU_INTERRUPTS(0):
>>>>> +		if (gt_id)
>>>>> +			return -ENOENT;
>>>>
>>>> OK: this is a global event so we say this is enabled only for gt0.
>>>>
>>>>> +		break;
>>>>> +	case XE_PMU_RENDER_GROUP_BUSY(0):
>>>>> +	case XE_PMU_COPY_GROUP_BUSY(0):
>>>>> +	case XE_PMU_ANY_ENGINE_GROUP_BUSY(0):
>>>>> +		if (GRAPHICS_VER(xe) < 12)
>>>>
>>>> Any point checking? xe only supports Gen12+.
>>>
>>> hmmm ya good point will remove this.
>>>>
>>>>> +			return -ENOENT;
>>>>> +		break;
>>>>> +	case XE_PMU_MEDIA_GROUP_BUSY(0):
>>>>> +		if (MEDIA_VER(xe) >= 13 && gt->info.type != XE_GT_TYPE_MEDIA)
>>>>> +			return -ENOENT;
>>>>
>>>> OK: this makes sense, so we will expose this event only for media gt's.
>>>>
>>>>> +		break;
>>>>> +	default:
>>>>> +		return -ENOENT;
>>>>> +	}
>>>>> +
>>>>> +	return 0;
>>>>> +}
>>>>> +
>>>>> +static int xe_pmu_event_init(struct perf_event *event)
>>>>> +{
>>>>> +	struct xe_device *xe =
>>>>> +		container_of(event->pmu, typeof(*xe), pmu.base);
>>>>> +	struct xe_pmu *pmu = &xe->pmu;
>>>>> +	int ret;
>>>>> +
>>>>> +	if (pmu->closed)
>>>>> +		return -ENODEV;
>>>>> +
>>>>> +	if (event->attr.type != event->pmu->type)
>>>>> +		return -ENOENT;
>>>>> +
>>>>> +	/* unsupported modes and filters */
>>>>> +	if (event->attr.sample_period) /* no sampling */
>>>>> +		return -EINVAL;
>>>>> +
>>>>> +	if (has_branch_stack(event))
>>>>> +		return -EOPNOTSUPP;
>>>>> +
>>>>> +	if (event->cpu < 0)
>>>>> +		return -EINVAL;
>>>>> +
>>>>> +	/* only allow running on one cpu at a time */
>>>>> +	if (!cpumask_test_cpu(event->cpu, &xe_pmu_cpumask))
>>>>> +		return -EINVAL;
>>>>> +
>>>>> +	ret = config_status(xe, event->attr.config);
>>>>> +	if (ret)
>>>>> +		return ret;
>>>>> +
>>>>> +	if (!event->parent) {
>>>>> +		drm_dev_get(&xe->drm);
>>>>> +		event->destroy = xe_pmu_event_destroy;
>>>>> +	}
>>>>> +
>>>>> +	return 0;
>>>>> +}
>>>>> +
>>>>> +static u64 __xe_pmu_event_read(struct perf_event *event)
>>>>> +{
>>>>> +	struct xe_device *xe =
>>>>> +		container_of(event->pmu, typeof(*xe), pmu.base);
>>>>> +	const unsigned int gt_id = config_gt_id(event->attr.config);
>>>>> +	const u64 config = config_counter(event->attr.config);
>>>>> +	struct xe_gt *gt = xe_device_get_gt(xe, gt_id);
>>>>> +	struct xe_pmu *pmu = &xe->pmu;
>>>>> +	u64 val = 0;
>>>>> +
>>>>> +	switch (config) {
>>>>> +	case XE_PMU_INTERRUPTS(0):
>>>>> +		val = READ_ONCE(pmu->irq_count);
>>>>
>>>> OK: The correct way to do this READ_ONCE/WRITE_ONCE irq_count stuff 
>>>> would be to take pmu->lock (both while reading and writing 
>>>> irq_count) but that would be expensive in the interrupt handler (as 
>>>> the .h hints). So all we can do is what is done here.
>>>>
>>>>> +		break;
>>>>> +	case XE_PMU_RENDER_GROUP_BUSY(0):
>>>>> +	case XE_PMU_COPY_GROUP_BUSY(0):
>>>>> +	case XE_PMU_ANY_ENGINE_GROUP_BUSY(0):
>>>>> +	case XE_PMU_MEDIA_GROUP_BUSY(0):
>>>>> +		val = engine_group_busyness_read(gt, config);
>>>>> +	}
>>>>> +
>>>>> +	return val;
>>>>> +}
>>>>> +
>>>>> +static void xe_pmu_event_read(struct perf_event *event)
>>>>> +{
>>>>> +	struct xe_device *xe =
>>>>> +		container_of(event->pmu, typeof(*xe), pmu.base);
>>>>> +	struct hw_perf_event *hwc = &event->hw;
>>>>> +	struct xe_pmu *pmu = &xe->pmu;
>>>>> +	u64 prev, new;
>>>>> +
>>>>> +	if (pmu->closed) {
>>>>> +		event->hw.state = PERF_HES_STOPPED;
>>>>> +		return;
>>>>> +	}
>>>>> +again:
>>>>> +	prev = local64_read(&hwc->prev_count);
>>>>> +	new = __xe_pmu_event_read(event);
>>>>> +
>>>>> +	if (local64_cmpxchg(&hwc->prev_count, prev, new) != prev)
>>>>> +		goto again;
>>>>> +
>>>>> +	local64_add(new - prev, &event->count);
>>>>> +}
>>>>> +
>>>>> +static void xe_pmu_enable(struct perf_event *event)
>>>>> +{
>>>>> +	/*
>>>>> +	 * Store the current counter value so we can report the correct delta
>>>>> +	 * for all listeners. Even when the event was already enabled and has
>>>>> +	 * an existing non-zero value.
>>>>> +	 */
>>>>> +	local64_set(&event->hw.prev_count, __xe_pmu_event_read(event)); 
>>>>> +}
>>>>> +
>>>>> +static void xe_pmu_event_start(struct perf_event *event, int flags)
>>>>> +{
>>>>> +	struct xe_device *xe =
>>>>> +		container_of(event->pmu, typeof(*xe), pmu.base);
>>>>> +	struct xe_pmu *pmu = &xe->pmu;
>>>>> +
>>>>> +	if (pmu->closed)
>>>>> +		return;
>>>>> +
>>>>> +	xe_pmu_enable(event);
>>>>> +	event->hw.state = 0;
>>>>> +}
>>>>> +
>>>>> +static void xe_pmu_event_stop(struct perf_event *event, int flags)
>>>>> +{
>>>>> +	if (flags & PERF_EF_UPDATE)
>>>>> +		xe_pmu_event_read(event);
>>>>> +
>>>>> +	event->hw.state = PERF_HES_STOPPED;
>>>>> +}
>>>>> +
>>>>> +static int xe_pmu_event_add(struct perf_event *event, int flags) 
>>>>> +{
>>>>> +	struct xe_device *xe =
>>>>> +		container_of(event->pmu, typeof(*xe), pmu.base);
>>>>> +	struct xe_pmu *pmu = &xe->pmu;
>>>>> +
>>>>> +	if (pmu->closed)
>>>>> +		return -ENODEV;
>>>>> +
>>>>> +	if (flags & PERF_EF_START)
>>>>> +		xe_pmu_event_start(event, flags);
>>>>> +
>>>>> +	return 0;
>>>>> +}
>>>>> +
>>>>> +static void xe_pmu_event_del(struct perf_event *event, int flags) 
>>>>> +{
>>>>> +	xe_pmu_event_stop(event, PERF_EF_UPDATE);
>>>>> +}
>>>>> +
>>>>> +static int xe_pmu_event_event_idx(struct perf_event *event)
>>>>> +{
>>>>> +	return 0;
>>>>> +}
>>>>> +
>>>>> +struct xe_str_attribute {
>>>>> +	struct device_attribute attr;
>>>>> +	const char *str;
>>>>> +};
>>>>> +
>>>>> +static ssize_t xe_pmu_format_show(struct device *dev,
>>>>> +				  struct device_attribute *attr, char *buf)
>>>>> +{
>>>>> +	struct xe_str_attribute *eattr;
>>>>> +
>>>>> +	eattr = container_of(attr, struct xe_str_attribute, attr);
>>>>> +	return sprintf(buf, "%s\n", eattr->str);
>>>>> +}
>>>>> +
>>>>> +#define XE_PMU_FORMAT_ATTR(_name, _config) \
>>>>> +	(&((struct xe_str_attribute[]) { \
>>>>> +		{ .attr = __ATTR(_name, 0444, xe_pmu_format_show, NULL), \
>>>>> +		  .str = _config, } \
>>>>> +	})[0].attr.attr)
>>>>> +
>>>>> +static struct attribute *xe_pmu_format_attrs[] = {
>>>>> +	XE_PMU_FORMAT_ATTR(xe_eventid, "config:0-20"),
>>>>
>>>> 0-20 means 0-20 bits? Though here we probably have different number 
>>>> of config bits? Probably doesn't matter?
>>>
>>> As I understand it, this is not used anymore, so I will remove it.
>>>
>>>>
>>>> The string will show up with:
>>>>
>>>> cat /sys/devices/xe/format/xe_eventid
>>>>
>>>>> +	NULL,
>>>>> +};
>>>>> +
>>>>> +static const struct attribute_group xe_pmu_format_attr_group = {
>>>>> +	.name = "format",
>>>>> +	.attrs = xe_pmu_format_attrs,
>>>>> +};
>>>>> +
>>>>> +struct xe_ext_attribute {
>>>>> +	struct device_attribute attr;
>>>>> +	unsigned long val;
>>>>> +};
>>>>> +
>>>>> +static ssize_t xe_pmu_event_show(struct device *dev,
>>>>> +				 struct device_attribute *attr, char *buf)
>>>>> +{
>>>>> +	struct xe_ext_attribute *eattr;
>>>>> +
>>>>> +	eattr = container_of(attr, struct xe_ext_attribute, attr);
>>>>> +	return sprintf(buf, "config=0x%lx\n", eattr->val);
>>>>> +}
>>>>> +
>>>>> +static ssize_t cpumask_show(struct device *dev,
>>>>> +			    struct device_attribute *attr, char *buf)
>>>>> +{
>>>>> +	return cpumap_print_to_pagebuf(true, buf, &xe_pmu_cpumask);
>>>>> +}
>>>>> +
>>>>> +static DEVICE_ATTR_RO(cpumask);
>>>>> +
>>>>> +static struct attribute *xe_cpumask_attrs[] = {
>>>>> +	&dev_attr_cpumask.attr,
>>>>> +	NULL,
>>>>> +};
>>>>> +
>>>>> +static const struct attribute_group xe_pmu_cpumask_attr_group = {
>>>>> +	.attrs = xe_cpumask_attrs,
>>>>> +};
>>>>> +
>>>>> +#define __event(__counter, __name, __unit) \
>>>>> +{ \
>>>>> +	.counter = (__counter), \
>>>>> +	.name = (__name), \
>>>>> +	.unit = (__unit), \
>>>>> +	.global = false, \
>>>>> +}
>>>>> +
>>>>> +#define __global_event(__counter, __name, __unit) \
>>>>> +{ \
>>>>> +	.counter = (__counter), \
>>>>> +	.name = (__name), \
>>>>> +	.unit = (__unit), \
>>>>> +	.global = true, \
>>>>> +}
>>>>> +
>>>>> +static struct xe_ext_attribute *
>>>>> +add_xe_attr(struct xe_ext_attribute *attr, const char *name, u64 config)
>>>>> +{
>>>>> +	sysfs_attr_init(&attr->attr.attr);
>>>>> +	attr->attr.attr.name = name;
>>>>> +	attr->attr.attr.mode = 0444;
>>>>> +	attr->attr.show = xe_pmu_event_show;
>>>>> +	attr->val = config;
>>>>> +
>>>>> +	return ++attr;
>>>>> +}
>>>>> +
>>>>> +static struct perf_pmu_events_attr *
>>>>> +add_pmu_attr(struct perf_pmu_events_attr *attr, const char *name,
>>>>> +	     const char *str)
>>>>> +{
>>>>> +	sysfs_attr_init(&attr->attr.attr);
>>>>> +	attr->attr.attr.name = name;
>>>>> +	attr->attr.attr.mode = 0444;
>>>>> +	attr->attr.show = perf_event_sysfs_show;
>>>>> +	attr->event_str = str;
>>>>> +
>>>>> +	return ++attr;
>>>>> +}
>>>>> +
>>>>> +static struct attribute **
>>>>> +create_event_attributes(struct xe_pmu *pmu)
>>>>> +{
>>>>> +	struct xe_device *xe = container_of(pmu, typeof(*xe), pmu);
>>>>> +	static const struct {
>>>>> +		unsigned int counter;
>>>>> +		const char *name;
>>>>> +		const char *unit;
>>>>> +		bool global;
>>>>> +	} events[] = {
>>>>> +		__global_event(0, "interrupts", NULL),
>>>>> +		__event(1, "render-group-busy", "ns"),
>>>>> +		__event(2, "copy-group-busy", "ns"),
>>>>> +		__event(3, "media-group-busy", "ns"),
>>>>> +		__event(4, "any-engine-group-busy", "ns"),
>>>>> +	};
>>>>
>>>> OK: this function is some black magic to expose stuff through PMU. 
>>>> Identical to i915 and seems to be working from the commit message 
>>>> so should be fine.
>>>>
>>>>> +
>>>>> +	unsigned int count = 0;
>>>>> +	struct perf_pmu_events_attr *pmu_attr = NULL, *pmu_iter;
>>>>> +	struct xe_ext_attribute *xe_attr = NULL, *xe_iter;
>>>>> +	struct attribute **attr = NULL, **attr_iter;
>>>>> +	struct xe_gt *gt;
>>>>> +	unsigned int i, j;
>>>>> +
>>>>> +	/* Count how many counters we will be exposing. */
>>>>> +	for_each_gt(gt, xe, j) {
>>>>> +		for (i = 0; i < ARRAY_SIZE(events); i++) {
>>>>> +			u64 config = ___XE_PMU_OTHER(j, events[i].counter);
>>>>> +
>>>>> +			if (!config_status(xe, config))
>>>>> +				count++;
>>>>> +		}
>>>>> +	}
>>>>> +
>>>>> +	/* Allocate attribute objects and table. */
>>>>> +	xe_attr = kcalloc(count, sizeof(*xe_attr), GFP_KERNEL);
>>>>> +	if (!xe_attr)
>>>>> +		goto err_alloc;
>>>>> +
>>>>> +	pmu_attr = kcalloc(count, sizeof(*pmu_attr), GFP_KERNEL);
>>>>> +	if (!pmu_attr)
>>>>> +		goto err_alloc;
>>>>> +
>>>>> +	/* Max one pointer of each attribute type plus a termination entry. */
>>>>> +	attr = kcalloc(count * 2 + 1, sizeof(*attr), GFP_KERNEL);
>>>>> +	if (!attr)
>>>>> +		goto err_alloc;
>>>>> +
>>>>> +	xe_iter = xe_attr;
>>>>> +	pmu_iter = pmu_attr;
>>>>> +	attr_iter = attr;
>>>>> +
>>>>> +	for_each_gt(gt, xe, j) {
>>>>> +		for (i = 0; i < ARRAY_SIZE(events); i++) {
>>>>> +			u64 config = ___XE_PMU_OTHER(j, events[i].counter);
>>>>> +			char *str;
>>>>> +
>>>>> +			if (config_status(xe, config))
>>>>> +				continue;
>>>>> +
>>>>> +			if (events[i].global)
>>>>> +				str = kstrdup(events[i].name, GFP_KERNEL);
>>>>> +			else
>>>>> +				str = kasprintf(GFP_KERNEL, "%s-gt%u",
>>>>> +						events[i].name, j);
>>>>> +			if (!str)
>>>>> +				goto err;
>>>>> +
>>>>> +			*attr_iter++ = &xe_iter->attr.attr;
>>>>> +			xe_iter = add_xe_attr(xe_iter, str, config);
>>>>> +
>>>>> +			if (events[i].unit) {
>>>>> +				if (events[i].global)
>>>>> +					str = kasprintf(GFP_KERNEL, "%s.unit",
>>>>> +							events[i].name);
>>>>> +				else
>>>>> +					str = kasprintf(GFP_KERNEL, "%s-gt%u.unit",
>>>>> +							events[i].name, j);
>>>>> +				if (!str)
>>>>> +					goto err;
>>>>> +
>>>>> +				*attr_iter++ = &pmu_iter->attr.attr;
>>>>> +				pmu_iter = add_pmu_attr(pmu_iter, str,
>>>>> +							events[i].unit);
>>>>> +			}
>>>>> +		}
>>>>> +	}
>>>>> +
>>>>> +	pmu->xe_attr = xe_attr;
>>>>> +	pmu->pmu_attr = pmu_attr;
>>>>> +
>>>>> +	return attr;
>>>>> +
>>>>> +err:
>>>>> +	for (attr_iter = attr; *attr_iter; attr_iter++)
>>>>> +		kfree((*attr_iter)->name);
>>>>> +
>>>>> +err_alloc:
>>>>> +	kfree(attr);
>>>>> +	kfree(xe_attr);
>>>>> +	kfree(pmu_attr);
>>>>> +
>>>>> +	return NULL;
>>>>> +}
>>>>> +
>>>>> +static void free_event_attributes(struct xe_pmu *pmu)
>>>>> +{
>>>>> +	struct attribute **attr_iter = pmu->events_attr_group.attrs;
>>>>> +
>>>>> +	for (; *attr_iter; attr_iter++)
>>>>> +		kfree((*attr_iter)->name);
>>>>> +
>>>>> +	kfree(pmu->events_attr_group.attrs);
>>>>> +	kfree(pmu->xe_attr);
>>>>> +	kfree(pmu->pmu_attr);
>>>>> +
>>>>> +	pmu->events_attr_group.attrs = NULL;
>>>>> +	pmu->xe_attr = NULL;
>>>>> +	pmu->pmu_attr = NULL;
>>>>> +}
>>>>> +
>>>>> +static int xe_pmu_cpu_online(unsigned int cpu, struct hlist_node *node)
>>>>> +{
>>>>> +	struct xe_pmu *pmu = hlist_entry_safe(node, typeof(*pmu), cpuhp.node);
>>>>> +
>>>>> +	XE_BUG_ON(!pmu->base.event_init);
>>>>> +
>>>>> +	/* Select the first online CPU as a designated reader. */
>>>>> +	if (cpumask_empty(&xe_pmu_cpumask))
>>>>> +		cpumask_set_cpu(cpu, &xe_pmu_cpumask);
>>>>> +
>>>>> +	return 0;
>>>>> +}
>>>>> +
>>>>> +static int xe_pmu_cpu_offline(unsigned int cpu, struct hlist_node *node)
>>>>> +{
>>>>> +	struct xe_pmu *pmu = hlist_entry_safe(node, typeof(*pmu), cpuhp.node);
>>>>> +	unsigned int target = xe_pmu_target_cpu;
>>>>> +
>>>>> +	XE_BUG_ON(!pmu->base.event_init);
>>>>> +
>>>>> +	/*
>>>>> +	 * Unregistering an instance generates a CPU offline event which we must
>>>>> +	 * ignore to avoid incorrectly modifying the shared xe_pmu_cpumask.
>>>>> +	 */
>>>>> +	if (pmu->closed)
>>>>> +		return 0;
>>>>> +
>>>>> +	if (cpumask_test_and_clear_cpu(cpu, &xe_pmu_cpumask)) {
>>>>> +		target = cpumask_any_but(topology_sibling_cpumask(cpu), cpu);
>>>>> +
>>>>> +		/* Migrate events if there is a valid target */
>>>>> +		if (target < nr_cpu_ids) {
>>>>> +			cpumask_set_cpu(target, &xe_pmu_cpumask);
>>>>> +			xe_pmu_target_cpu = target;
>>>>> +		}
>>>>> +	}
>>>>> +
>>>>> +	if (target < nr_cpu_ids && target != pmu->cpuhp.cpu) {
>>>>> +		perf_pmu_migrate_context(&pmu->base, cpu, target);
>>>>> +		pmu->cpuhp.cpu = target;
>>>>> +	}
>>>>> +
>>>>> +	return 0;
>>>>> +}
>>>>> +
>>>>> +static enum cpuhp_state cpuhp_slot = CPUHP_INVALID;
>>>>> +
>>>>> +int xe_pmu_init(void)
>>>>> +{
>>>>> +	int ret;
>>>>> +
>>>>> +	ret = cpuhp_setup_state_multi(CPUHP_AP_ONLINE_DYN,
>>>>> +				      "perf/x86/intel/xe:online",
>>>>> +				      xe_pmu_cpu_online,
>>>>> +				      xe_pmu_cpu_offline);
>>>>> +	if (ret < 0)
>>>>> +		pr_notice("Failed to setup cpuhp state for xe PMU! (%d)\n",
>>>>> +			  ret);
>>>>> +	else
>>>>> +		cpuhp_slot = ret;
>>>>> +
>>>>> +	return 0;
>>>>> +}
>>>>> +
>>>>> +void xe_pmu_exit(void)
>>>>> +{
>>>>> +	if (cpuhp_slot != CPUHP_INVALID)
>>>>> +		cpuhp_remove_multi_state(cpuhp_slot);
>>>>> +}
>>>>> +
>>>>> +static int xe_pmu_register_cpuhp_state(struct xe_pmu *pmu)
>>>>> +{
>>>>> +	if (cpuhp_slot == CPUHP_INVALID)
>>>>> +		return -EINVAL;
>>>>> +
>>>>> +	return cpuhp_state_add_instance(cpuhp_slot, &pmu->cpuhp.node);
>>>>> +}
>>>>> +
>>>>> +static void xe_pmu_unregister_cpuhp_state(struct xe_pmu *pmu)
>>>>> +{
>>>>> +	cpuhp_state_remove_instance(cpuhp_slot, &pmu->cpuhp.node);
>>>>> +}
>>>>> +
>>>>> +static void xe_pmu_unregister(struct drm_device *device, void *arg)
>>>>> +{
>>>>> +	struct xe_pmu *pmu = arg;
>>>>> +
>>>>> +	if (!pmu->base.event_init)
>>>>> +		return;
>>>>> +
>>>>> +	/*
>>>>> +	 * "Disconnect" the PMU callbacks - since all are atomic synchronize_rcu
>>>>> +	 * ensures all currently executing ones will have exited before we
>>>>> +	 * proceed with unregistration.
>>>>> +	 */
>>>>> +	pmu->closed = true;
>>>>> +	synchronize_rcu();
>>>>> +
>>>>> +	xe_pmu_unregister_cpuhp_state(pmu);
>>>>> +
>>>>> +	perf_pmu_unregister(&pmu->base);
>>>>> +	pmu->base.event_init = NULL;
>>>>> +	kfree(pmu->base.attr_groups);
>>>>> +	kfree(pmu->name);
>>>>> +	free_event_attributes(pmu);
>>>>> +}
>>>>> +
>>>>> +static void init_samples(struct xe_pmu *pmu)
>>>>> +{
>>>>> +	struct xe_device *xe = container_of(pmu, typeof(*xe), pmu);
>>>>> +	struct xe_gt *gt;
>>>>> +	unsigned int i;
>>>>> +
>>>>> +	for_each_gt(gt, xe, i)
>>>>> +		engine_group_busyness_store(gt);
>>>>> +}
>>>>> +
>>>>> +void xe_pmu_register(struct xe_pmu *pmu)
>>>>
>>>> Why void, why not int? PMU failure is non fatal error?
>>>
>>> Ya, the device is functional, it is only that these counters are not
>>> available. Hence I didn't want to fail the driver load.
>>>>
>>>>> +{
>>>>> +	struct xe_device *xe = container_of(pmu, typeof(*xe), pmu);
>>>>> +	const struct attribute_group *attr_groups[] = {
>>>>> +		&xe_pmu_format_attr_group,
>>>>> +		&pmu->events_attr_group,
>>>>> +		&xe_pmu_cpumask_attr_group,
>>>>
>>>> Can someone please explain what this cpumask/cpuhotplug stuff does 
>>>> and whether it needs to be in this patch? There's something here:
>>>
>>> comments from original patch series in 
>>> i915:https://patchwork.kernel.org/project/intel-gfx/patch/2017080212
>>> 3249.14194-5-tvrtko.ursulin@linux.intel.com/
>>>
>>> "IIRC an uncore PMU should expose a cpumask through sysfs, and then 
>>> perf tools will read that mask and auto-magically limit the number 
>>> of CPUs it instantiates the counter on."
>>>
>>> and ours are global counters, not per-CPU, so we limit to just a single CPU.
>>>
>>> And as I understand it, we use the cpuhotplug support to migrate to a
>>> new CPU in case the earlier one goes offline.
>>
>> OK, leave as is.
>>
>>>
>>>>
>>>> b46a33e271ed ("drm/i915/pmu: Expose a PMU interface for perf 
>>>> queries")
>>>>
>>>> I'd rather just have the basic PMU infra and the events in this 
>>>> patch and punt this cpumask/cpuhotplug stuff to a later patch, 
>>>> unless someone can say what it does.
>>>>
>>>> Though perf_pmu_register seems to be doing some per cpu stuff so 
>>>> likely this is needed. But amdgpu_pmu only has event and format attributes.
>>>>
>>>> Mostly leave as is I guess.
>>>>
>>>>> +		NULL
>>>>> +	};
>>>>> +
>>>>> +	int ret = -ENOMEM;
>>>>> +
>>>>> +	spin_lock_init(&pmu->lock);
>>>>> +	pmu->cpuhp.cpu = -1;
>>>>> +	init_samples(pmu);
>>>>
>>>> Why init_samples here? Can't we init the particular sample in 
>>>> xe_pmu_event_init or even xe_pmu_event_start?
>>>>
>>>> Init'ing here may be too soon since the event might not be enabled 
>>>> for a long time. So really this needs to move to xe_pmu_event_init 
>>>> or xe_pmu_event_start.
>>>
>>> The device is put to suspend immediately after driver probe, 
>>> typically pmu is opened even before any workload is run so 
>>> essentially device is still in suspend state hence we cannot access 
>>> registers so storing the last known value in init_samples. otherwise we see the bug in v#1 of series.
>>>
>>>>
>>>> Actually this is already happening in xe_pmu_enable. We just need 
>>>> to decide when we want to wake the device up and when we don't. So 
>>>> maybe wake the device up at start (use xe_device_mem_access_get) 
>>>> and not wake up during read (xe_device_mem_access_get_if_ongoing etc.)?
>>
>> Just going to repeat this again:
>>
>> xe_pmu_event_start calls xe_pmu_enable. In xe_pmu_enable use 
>> xe_device_mem_access_get before calling __xe_pmu_event_read. This 
>> will wake the device up and get a valid value in event->hw.prev_count.
>>
>> In xe_pmu_event_read, use xe_device_mem_access_get_if_ongoing, to 
>> read the event without waking the device up (and return previous value etc.).
>>
>> Or, pass a flag in to __xe_pmu_event_read and to 
>> engine_group_busyness_read and __engine_group_busyness_read. The flag 
>> will say whether or not to wake up the device. If the flag says wake 
>> the device up, call xe_device_mem_access_get and xe_force_wake_get, 
>> maybe in __engine_group_busyness_read, before reading device 
>> registers. If the flag says don't wake up the device call xe_device_mem_access_get_if_ongoing.
>>
>> This way we:
>> * don't need to call init_samples in xe_pmu_register
>> * we don't need engine_group_busyness_store
>> * we don't need to specifically call engine_group_busyness_store in 
>> xe_gt_suspend
>>
>> The correct sample is read by waking up the device in xe_pmu_event_start.
>>
>> Hopefully this is clear now.
> 
> Actually I think it is not necessary to do anything in 
> xe_pmu_event_start, as long as we keep the engine_group_busyness_store 
> in xe_gt_suspend (so we can just use 
> xe_device_mem_access_get_if_ongoing, don't need xe_device_mem_access_get as you said).
> 
> Afais, the problem of huge values in v1 was due to reading the device 
> registers when device was not awake. That problem we've already solved 
> in
> v2 where in engine_group_busyness_read() we only read if the device is 
> awake and skip and return the previous value when the device is not 
> awake. So that fixes the problem of huge values.
> 
> The problem in this patch (I thought) is that we are effectively 
> sampling the registers each time perf calls xe_pmu_event_read, say 
> every 1 sec. So we are sampling the registers every 1 sec. When we 
> sample (every 1 sec) if the device is awake we will return the correct 
> ns value, if device is asleep we will return the old value. If the 
> device wakes up and does some work in the 1 sec period but then again 
> suspends will miss that activity. That is the problem that is being 
> solved by storing the samples in xe_gt_suspend().
> 
> i915 solved this problem using a 5 ms timer but I like the solution of 
> sampling in xe_gt_suspend better, so let's keep it.
> 
> But init_samples is not needed, afais we don't need to do anything in 
> xe_pmu_register or in xe_pmu_event_start (and we don't need to use 
> xe_device_mem_access_get). xe_gt_suspend will take care of storing the 
> register values before the device suspends.
> 
> Hopefully it makes sense now, sorry for the confusion.

Yes, you are right, we do not need init_samples.

Thanks,
Aravind.
> 
> Ashutosh
> 
>>
>>>
>>>>
>>>>> +
>>>>> +	pmu->name = kasprintf(GFP_KERNEL,
>>>>> +			      "xe_%s",
>>>>> +			      dev_name(xe->drm.dev));
>>>>> +	if (pmu->name)
>>>>> +		/* tools/perf reserves colons as special. */
>>>>> +		strreplace((char *)pmu->name, ':', '_');
>>>>> +
>>>>> +	if (!pmu->name)
>>>>> +		goto err;
>>>>> +
>>>>> +	pmu->events_attr_group.name = "events";
>>>>> +	pmu->events_attr_group.attrs = create_event_attributes(pmu);
>>>>> +	if (!pmu->events_attr_group.attrs)
>>>>> +		goto err_name;
>>>>> +
>>>>> +	pmu->base.attr_groups = kmemdup(attr_groups, sizeof(attr_groups),
>>>>> +					GFP_KERNEL);
>>>>> +	if (!pmu->base.attr_groups)
>>>>> +		goto err_attr;
>>>>> +
>>>>> +	pmu->base.module	= THIS_MODULE;
>>>>> +	pmu->base.task_ctx_nr	= perf_invalid_context;
>>>>> +	pmu->base.event_init	= xe_pmu_event_init;
>>>>> +	pmu->base.add		= xe_pmu_event_add;
>>>>> +	pmu->base.del		= xe_pmu_event_del;
>>>>> +	pmu->base.start		= xe_pmu_event_start;
>>>>> +	pmu->base.stop		= xe_pmu_event_stop;
>>>>> +	pmu->base.read		= xe_pmu_event_read;
>>>>> +	pmu->base.event_idx	= xe_pmu_event_event_idx;
>>>>> +
>>>>> +	ret = perf_pmu_register(&pmu->base, pmu->name, -1);
>>>>> +	if (ret)
>>>>> +		goto err_groups;
>>>>> +
>>>>> +	ret = xe_pmu_register_cpuhp_state(pmu);
>>>>> +	if (ret)
>>>>> +		goto err_unreg;
>>>>> +
>>>>> +	ret = drmm_add_action_or_reset(&xe->drm, xe_pmu_unregister, pmu);
>>>>> +	XE_WARN_ON(ret);
>>>>
>>>> We should just follow the regular error rewind here too and let
>>>> drm_notice() at the end print the error. This is what other drivers 
>>>> calling drmm_add_action_or_reset seem to be doing.
>>>
>>> Ok ok.
>>>>
>>>>> +
>>>>> +	return;
>>>>> +
>>>>> +err_unreg:
>>>>> +	perf_pmu_unregister(&pmu->base);
>>>>> +err_groups:
>>>>> +	kfree(pmu->base.attr_groups);
>>>>> +err_attr:
>>>>> +	pmu->base.event_init = NULL;
>>>>> +	free_event_attributes(pmu);
>>>>> +err_name:
>>>>> +	kfree(pmu->name);
>>>>> +err:
>>>>> +	drm_notice(&xe->drm, "Failed to register PMU!\n");
>>>>> +}
>>>>> diff --git a/drivers/gpu/drm/xe/xe_pmu.h b/drivers/gpu/drm/xe/xe_pmu.h
>>>>> new file mode 100644
>>>>> index 000000000000..d3f47f4ab343
>>>>> --- /dev/null
>>>>> +++ b/drivers/gpu/drm/xe/xe_pmu.h
>>>>> @@ -0,0 +1,25 @@
>>>>> +/* SPDX-License-Identifier: MIT */
>>>>> +/*
>>>>> + * Copyright © 2023 Intel Corporation
>>>>> + */
>>>>> +
>>>>> +#ifndef _XE_PMU_H_
>>>>> +#define _XE_PMU_H_
>>>>> +
>>>>> +#include "xe_gt_types.h"
>>>>> +#include "xe_pmu_types.h"
>>>>> +
>>>>> +#ifdef CONFIG_PERF_EVENTS
>>>>
>>>> nit but maybe this should be:
>>>>
>>>> #if IS_ENABLED(CONFIG_PERF_EVENTS)
>>>>
>>>> or,
>>>>
>>>> #if IS_BUILTIN(CONFIG_PERF_EVENTS)
>>>>
>>>> Note CONFIG_PERF_EVENTS is a boolean kconfig option.
>>>>
>>>> See similar macro IS_REACHABLE() in i915_hwmon.h.
>>>>
>>>>> +int xe_pmu_init(void);
>>>>> +void xe_pmu_exit(void);
>>>>> +void xe_pmu_register(struct xe_pmu *pmu);
>>>>> +void engine_group_busyness_store(struct xe_gt *gt);
>>>>
>>>> Add xe_pmu_ prefix if function is needed (hopefully not).
>>>
>>> OK
>>>>
>>>>> +#else
>>>>> +static inline int xe_pmu_init(void) { return 0; }
>>>>> +static inline void xe_pmu_exit(void) {}
>>>>> +static inline void xe_pmu_register(struct xe_pmu *pmu) {}
>>>>> +static inline void engine_group_busyness_store(struct xe_gt *gt) {}
>>>>> +#endif
>>>>> +
>>>>> +#endif
>>>>> +
>>>>> diff --git a/drivers/gpu/drm/xe/xe_pmu_types.h b/drivers/gpu/drm/xe/xe_pmu_types.h
>>>>> new file mode 100644
>>>>> index 000000000000..e87edd4d6a87
>>>>> --- /dev/null
>>>>> +++ b/drivers/gpu/drm/xe/xe_pmu_types.h
>>>>> @@ -0,0 +1,80 @@
>>>>> +/* SPDX-License-Identifier: MIT */
>>>>> +/*
>>>>> + * Copyright © 2023 Intel Corporation
>>>>> + */
>>>>> +
>>>>> +#ifndef _XE_PMU_TYPES_H_
>>>>> +#define _XE_PMU_TYPES_H_
>>>>> +
>>>>> +#include <linux/perf_event.h>
>>>>> +#include <linux/spinlock_types.h>
>>>>> +#include <uapi/drm/xe_drm.h>
>>>>> +
>>>>> +enum {
>>>>> +	__XE_SAMPLE_RENDER_GROUP_BUSY,
>>>>> +	__XE_SAMPLE_COPY_GROUP_BUSY,
>>>>> +	__XE_SAMPLE_MEDIA_GROUP_BUSY,
>>>>> +	__XE_SAMPLE_ANY_ENGINE_GROUP_BUSY,
>>>>> +	__XE_NUM_PMU_SAMPLERS
>>>>
>>>> OK: irq_count is missing here since these are read from device...
>>>>
>>>>> +};
>>>>> +
>>>>> +struct xe_pmu_sample {
>>>>> +	u64 cur;
>>>>> +};
>>>>
>>>> This was also discussed for i915 PMU, no point having a struct with 
>>>> a single u64 member. Might as well just use u64 wherever we are 
>>>> using struct xe_pmu_sample.
>>>
>>> OK.
>>>>
>>>>> +
>>>>> +#define XE_MAX_GT_PER_TILE 2
>>>>
>>>> Why per tile? The array size should be max_gt_per_device. Just call 
>>>> it XE_MAX_GT?
>>>
>>> I declared similar to what we have in drivers/gpu/drm/xe/xe_device.h
>>
>> Our 2-d array size is for the device, not per tile. So let's use XE_MAX_GT.
>>
>>>>
>>>>> +
>>>>> +struct xe_pmu {
>>>>> +	/**
>>>>> +	 * @cpuhp: Struct used for CPU hotplug handling.
>>>>> +	 */
>>>>> +	struct {
>>>>> +		struct hlist_node node;
>>>>> +		unsigned int cpu;
>>>>> +	} cpuhp;
>>>>> +	/**
>>>>> +	 * @base: PMU base.
>>>>> +	 */
>>>>> +	struct pmu base;
>>>>> +	/**
>>>>> +	 * @closed: xe is unregistering.
>>>>> +	 */
>>>>> +	bool closed;
>>>>> +	/**
>>>>> +	 * @name: Name as registered with perf core.
>>>>> +	 */
>>>>> +	const char *name;
>>>>> +	/**
>>>>> +	 * @lock: Lock protecting enable mask and ref count handling.
>>>>> +	 */
>>>>> +	spinlock_t lock;
>>>>> +	/**
>>>>> +	 * @sample: Current and previous (raw) counters.
>>>>> +	 *
>>>>> +	 * These counters are updated when the device is awake.
>>>>> +	 *
>>>>> +	 */
>>>>> +	struct xe_pmu_sample sample[XE_MAX_GT_PER_TILE * __XE_NUM_PMU_SAMPLERS];
>>>>
>>>> Change to 2-d array. See above.
>>>>
>>>>> +	/**
>>>>> +	 * @irq_count: Number of interrupts
>>>>> +	 *
>>>>> +	 * Intentionally unsigned long to avoid atomics or heuristics on 32bit.
>>>>> +	 * 4e9 interrupts are a lot and postprocessing can really deal with an
>>>>> +	 * occasional wraparound easily. It's 32bit after all.
>>>>> +	 */
>>>>> +	unsigned long irq_count;
>>>>> +	/**
>>>>> +	 * @events_attr_group: Device events attribute group.
>>>>> +	 */
>>>>> +	struct attribute_group events_attr_group;
>>>>> +	/**
>>>>> +	 * @xe_attr: Memory block holding device attributes.
>>>>> +	 */
>>>>> +	void *xe_attr;
>>>>> +	/**
>>>>> +	 * @pmu_attr: Memory block holding device attributes.
>>>>> +	 */
>>>>> +	void *pmu_attr;
>>>>> +};
>>>>> +
>>>>> +#endif
>>>>> diff --git a/include/uapi/drm/xe_drm.h b/include/uapi/drm/xe_drm.h 
>>>>> index 965cd9527ff1..ed097056f944 100644
>>>>> --- a/include/uapi/drm/xe_drm.h
>>>>> +++ b/include/uapi/drm/xe_drm.h
>>>>> @@ -990,6 +990,22 @@ struct drm_xe_vm_madvise {
>>>>> 	__u64 reserved[2];
>>>>>  };
>>>>>
>>>>> +/* PMU event config IDs */
>>>>> +
>>>>> +/*
>>>>> + * Top 4 bits of every counter are GT id.
>>>>> + */
>>>>> +#define __XE_PMU_GT_SHIFT (60)
>>>>
>>>> To future-proof this, and also because we seem to have plenty of 
>>>> bits available, I think we should change this to 56 (instead of 60).
>>>
>>> OK
>>>
>>> Thanks,
>>> Aravind.
>>>>
>>>>> +
>>>>> +#define ___XE_PMU_OTHER(gt, x) \
>>>>> +	(((__u64)(x)) | ((__u64)(gt) << __XE_PMU_GT_SHIFT))
>>>>> +
>>>>> +#define XE_PMU_INTERRUPTS(gt)			___XE_PMU_OTHER(gt, 0)
>>>>> +#define XE_PMU_RENDER_GROUP_BUSY(gt)		___XE_PMU_OTHER(gt, 1)
>>>>> +#define XE_PMU_COPY_GROUP_BUSY(gt)		___XE_PMU_OTHER(gt, 2)
>>>>> +#define XE_PMU_MEDIA_GROUP_BUSY(gt)		___XE_PMU_OTHER(gt, 3)
>>>>> +#define XE_PMU_ANY_ENGINE_GROUP_BUSY(gt)	___XE_PMU_OTHER(gt, 4)
>>>>> +
>>>>>  #if defined(__cplusplus)
>>>>>  }
>>>>>  #endif
>>>>> --
>>>>> 2.25.1
>>
>> Thanks.
>> --
>> Ashutosh

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [Intel-xe] [PATCH v2 2/2] drm/xe/pmu: Enable PMU interface
  2023-07-22  6:04         ` Dixit, Ashutosh
  2023-07-24  8:03           ` Iddamsetty, Aravind
@ 2023-07-24  9:38           ` Iddamsetty, Aravind
  1 sibling, 0 replies; 59+ messages in thread
From: Iddamsetty, Aravind @ 2023-07-24  9:38 UTC (permalink / raw)
  To: Dixit, Ashutosh; +Cc: Bommu Krishnaiah, intel-xe, Tvrtko Ursulin



On 22-07-2023 11:34, Dixit, Ashutosh wrote:
> On Fri, 21 Jul 2023 16:36:02 -0700, Dixit, Ashutosh wrote:
>>
>> On Fri, 21 Jul 2023 04:51:09 -0700, Iddamsetty, Aravind wrote:
>>>
>> Hi Aravind,
>>
>>> On 21-07-2023 06:32, Dixit, Ashutosh wrote:
>>>> On Tue, 27 Jun 2023 05:21:13 -0700, Aravind Iddamsetty wrote:
>>>>>
>>>> More stuff to mull over. You can ignore comments starting with "OK", those
>>>> are just notes to myself.
>>>>
>>>> Also, maybe some time we can add a basic IGT which reads these exposed
>>>> counters and verifies that we can read them and they are monotonically
>>>> increasing?
>>>
>>> this is the IGT https://patchwork.freedesktop.org/series/119936/ series
>>> using these counters posted by Venkat.
>>>
>>>>
>>>>> There are a set of engine group busyness counters provided by HW which are
>>>>> perfect fit to be exposed via PMU perf events.
>>>>>
>>>>> BSPEC: 46559, 46560, 46722, 46729
>>>>
>>>> Also add these Bspec entries: 71028, 52071
>>>
>>> OK.
>>>
>>>>
>>>>>
>>>>> events can be listed using:
>>>>> perf list
>>>>>   xe_0000_03_00.0/any-engine-group-busy-gt0/         [Kernel PMU event]
>>>>>   xe_0000_03_00.0/copy-group-busy-gt0/               [Kernel PMU event]
>>>>>   xe_0000_03_00.0/interrupts/                        [Kernel PMU event]
>>>>>   xe_0000_03_00.0/media-group-busy-gt0/              [Kernel PMU event]
>>>>>   xe_0000_03_00.0/render-group-busy-gt0/             [Kernel PMU event]
>>>>>
>>>>> and can be read using:
>>>>>
>>>>> perf stat -e "xe_0000_8c_00.0/render-group-busy-gt0/" -I 1000
>>>>>            time             counts unit events
>>>>>      1.001139062                  0 ns  xe_0000_8c_00.0/render-group-busy-gt0/
>>>>>      2.003294678                  0 ns  xe_0000_8c_00.0/render-group-busy-gt0/
>>>>>      3.005199582                  0 ns  xe_0000_8c_00.0/render-group-busy-gt0/
>>>>>      4.007076497                  0 ns  xe_0000_8c_00.0/render-group-busy-gt0/
>>>>>      5.008553068                  0 ns  xe_0000_8c_00.0/render-group-busy-gt0/
>>>>>      6.010531563              43520 ns  xe_0000_8c_00.0/render-group-busy-gt0/
>>>>>      7.012468029              44800 ns  xe_0000_8c_00.0/render-group-busy-gt0/
>>>>>      8.013463515                  0 ns  xe_0000_8c_00.0/render-group-busy-gt0/
>>>>>      9.015300183                  0 ns  xe_0000_8c_00.0/render-group-busy-gt0/
>>>>>     10.017233010                  0 ns  xe_0000_8c_00.0/render-group-busy-gt0/
>>>>>     10.971934120                  0 ns  xe_0000_8c_00.0/render-group-busy-gt0/
>>>>>
>>>>> The pmu base implementation is taken from i915.
>>>>>
>>>>> v2:
>>>>> Store last known value when device is awake return that while the GT is
>>>>> suspended and then update the driver copy when read during awake.
>>>>>
>>>>> Co-developed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
>>>>> Co-developed-by: Bommu Krishnaiah <krishnaiah.bommu@intel.com>
>>>>> Signed-off-by: Aravind Iddamsetty <aravind.iddamsetty@intel.com>
>>>>> ---
>>>>>  drivers/gpu/drm/xe/Makefile          |   2 +
>>>>>  drivers/gpu/drm/xe/regs/xe_gt_regs.h |   5 +
>>>>>  drivers/gpu/drm/xe/xe_device.c       |   2 +
>>>>>  drivers/gpu/drm/xe/xe_device_types.h |   4 +
>>>>>  drivers/gpu/drm/xe/xe_gt.c           |   2 +
>>>>>  drivers/gpu/drm/xe/xe_irq.c          |  22 +
>>>>>  drivers/gpu/drm/xe/xe_module.c       |   5 +
>>>>>  drivers/gpu/drm/xe/xe_pmu.c          | 739 +++++++++++++++++++++++++++
>>>>>  drivers/gpu/drm/xe/xe_pmu.h          |  25 +
>>>>>  drivers/gpu/drm/xe/xe_pmu_types.h    |  80 +++
>>>>>  include/uapi/drm/xe_drm.h            |  16 +
>>>>>  11 files changed, 902 insertions(+)
>>>>>  create mode 100644 drivers/gpu/drm/xe/xe_pmu.c
>>>>>  create mode 100644 drivers/gpu/drm/xe/xe_pmu.h
>>>>>  create mode 100644 drivers/gpu/drm/xe/xe_pmu_types.h
>>>>>
<snip>
>>>>> +
>>>>> +void engine_group_busyness_store(struct xe_gt *gt)
>>>>> +{
>>>>> +	struct xe_pmu *pmu = &gt->tile->xe->pmu;
>>>>> +	unsigned int gt_id = gt->info.id;
>>>>> +	unsigned long flags;
>>>>> +
>>>>> +	spin_lock_irqsave(&pmu->lock, flags);
>>>>> +
>>>>> +	store_sample(pmu, gt_id, __XE_SAMPLE_RENDER_GROUP_BUSY,
>>>>> +		     __engine_group_busyness_read(gt, XE_PMU_RENDER_GROUP_BUSY(0)));
>>>>> +	store_sample(pmu, gt_id, __XE_SAMPLE_COPY_GROUP_BUSY,
>>>>> +		     __engine_group_busyness_read(gt, XE_PMU_COPY_GROUP_BUSY(0)));
>>>>> +	store_sample(pmu, gt_id, __XE_SAMPLE_MEDIA_GROUP_BUSY,
>>>>> +		     __engine_group_busyness_read(gt, XE_PMU_MEDIA_GROUP_BUSY(0)));
>>>>> +	store_sample(pmu, gt_id, __XE_SAMPLE_ANY_ENGINE_GROUP_BUSY,
>>>>> +		     __engine_group_busyness_read(gt, XE_PMU_ANY_ENGINE_GROUP_BUSY(0)));
> 
> Here why should we store everything, we should store only those events
> which are enabled?
> 
> Also it would good if the above can be done in a loop somehow. 4 is fine
> but if we add events later, a loop will be nice, if possible.

I got your point. I could do something like this:

int i;
u64 val;

for (i = __XE_SAMPLE_RENDER_GROUP_BUSY; i < __XE_NUM_PMU_SAMPLERS; i++) {
	/* the busyness config counter ids are sample id + 1 */
	val = __engine_group_busyness_read(gt, ___XE_PMU_OTHER(0, i + 1));
	pmu->sample[gt_id][i] = val;
}

Thanks,
Aravind.

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [Intel-xe] [PATCH v2 2/2] drm/xe/pmu: Enable PMU interface
  2023-07-24  8:03           ` Iddamsetty, Aravind
  2023-07-24  9:00             ` Ursulin, Tvrtko
@ 2023-07-24 15:52             ` Dixit, Ashutosh
  2023-07-24 16:05               ` Iddamsetty, Aravind
  1 sibling, 1 reply; 59+ messages in thread
From: Dixit, Ashutosh @ 2023-07-24 15:52 UTC (permalink / raw)
  To: Iddamsetty, Aravind; +Cc: Bommu Krishnaiah, intel-xe, Tvrtko Ursulin

On Mon, 24 Jul 2023 01:03:23 -0700, Iddamsetty, Aravind wrote:

Hi Aravind,

> On 22-07-2023 11:34, Dixit, Ashutosh wrote:
>> > On Fri, 21 Jul 2023 16:36:02 -0700, Dixit, Ashutosh wrote:
> >> On Fri, 21 Jul 2023 04:51:09 -0700, Iddamsetty, Aravind wrote:
> >>>>> +void engine_group_busyness_store(struct xe_gt *gt)
> >>>>> +{
> >>>>> +	struct xe_pmu *pmu = &gt->tile->xe->pmu;
> >>>>> +	unsigned int gt_id = gt->info.id;
> >>>>> +	unsigned long flags;
> >>>>> +
> >>>>> +	spin_lock_irqsave(&pmu->lock, flags);
> >>>>> +
> >>>>> +	store_sample(pmu, gt_id, __XE_SAMPLE_RENDER_GROUP_BUSY,
> >>>>> +		     __engine_group_busyness_read(gt, XE_PMU_RENDER_GROUP_BUSY(0)));
> >>>>> +	store_sample(pmu, gt_id, __XE_SAMPLE_COPY_GROUP_BUSY,
> >>>>> +		     __engine_group_busyness_read(gt, XE_PMU_COPY_GROUP_BUSY(0)));
> >>>>> +	store_sample(pmu, gt_id, __XE_SAMPLE_MEDIA_GROUP_BUSY,
> >>>>> +		     __engine_group_busyness_read(gt, XE_PMU_MEDIA_GROUP_BUSY(0)));
> >>>>> +	store_sample(pmu, gt_id, __XE_SAMPLE_ANY_ENGINE_GROUP_BUSY,
> >>>>> +		     __engine_group_busyness_read(gt, XE_PMU_ANY_ENGINE_GROUP_BUSY(0)));
> >
> > Here why should we store everything, we should store only those events
> > which are enabled?
>
> The events are enabled only when they are opened which can happen after
> the device is suspended hence we need to store all. As in the present
> case device is put to suspend immediately after probe and event is
> opened post driver load is done.

I don't think we can justify doing expensive PCIe reads and increasing the
time to go into runtime suspend, when the PMU might not be used at all.

If we store only enabled samples, and start storing them only after they are
enabled, what would be the consequence? The first non-zero sample seen by
the perf tool would be wrong, while later samples would be fine?

If there is a consequence, we might have to go back to what I was saying
earlier about waking the device up and reading the enabled counter when
xe_pmu_event_start happens, to initialize the counter values. I am assuming
this will work?
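
Something like this minimal sketch is what I have in mind (untested, and
assuming xe_device_mem_access_get()/xe_device_mem_access_put() can be
called from here; the pmu->closed handling is elided for brevity):

static void xe_pmu_event_start(struct perf_event *event, int flags)
{
	struct xe_device *xe =
		container_of(event->pmu, typeof(*xe), pmu.base);

	/* Wake the device so the initial sample comes from live registers */
	xe_device_mem_access_get(xe);
	local64_set(&event->hw.prev_count, __xe_pmu_event_read(event));
	xe_device_mem_access_put(xe);

	event->hw.state = 0;
}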

Doing this IMO would be better than always doing these PCIe reads on
runtime suspend even when PMU is not being used.

Thanks.
--
Ashutosh

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [Intel-xe] [PATCH v2 2/2] drm/xe/pmu: Enable PMU interface
  2023-07-24  9:00             ` Ursulin, Tvrtko
@ 2023-07-24 15:52               ` Dixit, Ashutosh
  0 siblings, 0 replies; 59+ messages in thread
From: Dixit, Ashutosh @ 2023-07-24 15:52 UTC (permalink / raw)
  To: Ursulin, Tvrtko; +Cc: Bommu, Krishnaiah, intel-xe

On Mon, 24 Jul 2023 02:00:13 -0700, Ursulin, Tvrtko wrote:
>
>
> [Top-post since my dual email setup and this is the wrong one, sorry.]
>
> Glancing over the discussion - small correction - i915 did not solve the
> problem of hardware counters and sleeping device with the timer but with
> the park/unpark hooks.
>
> More similar to these group busyness counters would be the RC6, and you
> will notice there is nothing in the i915 sampling timer about RC6. There
> is just some complicated code in park/unpark to estimate RC6 while
> parked. But the estimation is beside the point for engine group busyness
> since it is the opposite metric (grows on busy vs grows idle).
>
> And I think you converged to the same solution already, so I just wanted
> to correct the 5ms timer inaccuracy.

Agreed. For a moment I was fantasizing that the timer would not be needed
in i915 either and we could just sample the values when parking, but that's
not true.

Ashutosh

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [Intel-xe] [PATCH v2 2/2] drm/xe/pmu: Enable PMU interface
  2023-07-24 15:52             ` Dixit, Ashutosh
@ 2023-07-24 16:05               ` Iddamsetty, Aravind
  2023-07-24 16:31                 ` Dixit, Ashutosh
  0 siblings, 1 reply; 59+ messages in thread
From: Iddamsetty, Aravind @ 2023-07-24 16:05 UTC (permalink / raw)
  To: Dixit, Ashutosh; +Cc: Bommu Krishnaiah, intel-xe, Tvrtko Ursulin



On 24-07-2023 21:22, Dixit, Ashutosh wrote:
Hi Ashutosh,

> On Mon, 24 Jul 2023 01:03:23 -0700, Iddamsetty, Aravind wrote:
> 
> Hi Aravind,
> 
>> On 22-07-2023 11:34, Dixit, Ashutosh wrote:
>>>> On Fri, 21 Jul 2023 16:36:02 -0700, Dixit, Ashutosh wrote:
>>>> On Fri, 21 Jul 2023 04:51:09 -0700, Iddamsetty, Aravind wrote:
>>>>>>> +void engine_group_busyness_store(struct xe_gt *gt)
>>>>>>> +{
>>>>>>> +	struct xe_pmu *pmu = &gt->tile->xe->pmu;
>>>>>>> +	unsigned int gt_id = gt->info.id;
>>>>>>> +	unsigned long flags;
>>>>>>> +
>>>>>>> +	spin_lock_irqsave(&pmu->lock, flags);
>>>>>>> +
>>>>>>> +	store_sample(pmu, gt_id, __XE_SAMPLE_RENDER_GROUP_BUSY,
>>>>>>> +		     __engine_group_busyness_read(gt, XE_PMU_RENDER_GROUP_BUSY(0)));
>>>>>>> +	store_sample(pmu, gt_id, __XE_SAMPLE_COPY_GROUP_BUSY,
>>>>>>> +		     __engine_group_busyness_read(gt, XE_PMU_COPY_GROUP_BUSY(0)));
>>>>>>> +	store_sample(pmu, gt_id, __XE_SAMPLE_MEDIA_GROUP_BUSY,
>>>>>>> +		     __engine_group_busyness_read(gt, XE_PMU_MEDIA_GROUP_BUSY(0)));
>>>>>>> +	store_sample(pmu, gt_id, __XE_SAMPLE_ANY_ENGINE_GROUP_BUSY,
>>>>>>> +		     __engine_group_busyness_read(gt, XE_PMU_ANY_ENGINE_GROUP_BUSY(0)));
>>>
>>> Here why should we store everything, we should store only those events
>>> which are enabled?
>>
>> The events are enabled only when they are opened which can happen after
>> the device is suspended hence we need to store all. As in the present
>> case device is put to suspend immediately after probe and event is
>> opened post driver load is done.
> 
> I don't think we can justify doing expensive PCIe reads and increasing the
> time to go into runtime suspend, when PMU might not being used at all.
> 
> If we store only enabled samples and start storing them only after they are
> enabled, what would be the consequence of this? The first non-zero sample
> seen by the perf tool would be wrong and later samples will be fine?

Why do you say it is wrong? perf reports values relative to the time an
event is opened.

> 
> If there is a consequence, we might have to go back to what I was saying
> earlier about waking the device up and reading the enabled counter when
> xe_pmu_event_start happens, to initialize the counter values. I am assuming
> this will work?

xe_pmu_event_start can be called while the device is suspended, and we shall
not wake up the device there, i.e. an event can be enabled while in suspend.
So if we do not store the counters while going into suspend, we will not
have any value to work from when an event is enabled after suspend, and we
need to present a relative value.
> 
> Doing this IMO would be better than always doing these PCIe reads on
> runtime suspend even when PMU is not being used

We have been doing these in i915; I am not sure whether they affected any
timing requirements for runtime suspend.

Thanks,
Aravind.
> 
> Thanks.
> --
> Ashutosh

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [Intel-xe] [PATCH v2 2/2] drm/xe/pmu: Enable PMU interface
  2023-07-24 16:05               ` Iddamsetty, Aravind
@ 2023-07-24 16:31                 ` Dixit, Ashutosh
  2023-07-25 11:38                   ` Iddamsetty, Aravind
  0 siblings, 1 reply; 59+ messages in thread
From: Dixit, Ashutosh @ 2023-07-24 16:31 UTC (permalink / raw)
  To: Iddamsetty, Aravind; +Cc: Bommu Krishnaiah, intel-xe, Tvrtko Ursulin

On Mon, 24 Jul 2023 09:05:53 -0700, Iddamsetty, Aravind wrote:
>
> On 24-07-2023 21:22, Dixit, Ashutosh wrote:
> Hi Ashutosh,
>
> > On Mon, 24 Jul 2023 01:03:23 -0700, Iddamsetty, Aravind wrote:
> >
> > Hi Aravind,
> >
> >> On 22-07-2023 11:34, Dixit, Ashutosh wrote:
> >>>> On Fri, 21 Jul 2023 16:36:02 -0700, Dixit, Ashutosh wrote:
> >>>> On Fri, 21 Jul 2023 04:51:09 -0700, Iddamsetty, Aravind wrote:
> >>>>>>> +void engine_group_busyness_store(struct xe_gt *gt)
> >>>>>>> +{
> >>>>>>> +	struct xe_pmu *pmu = &gt->tile->xe->pmu;
> >>>>>>> +	unsigned int gt_id = gt->info.id;
> >>>>>>> +	unsigned long flags;
> >>>>>>> +
> >>>>>>> +	spin_lock_irqsave(&pmu->lock, flags);
> >>>>>>> +
> >>>>>>> +	store_sample(pmu, gt_id, __XE_SAMPLE_RENDER_GROUP_BUSY,
> >>>>>>> +		     __engine_group_busyness_read(gt, XE_PMU_RENDER_GROUP_BUSY(0)));
> >>>>>>> +	store_sample(pmu, gt_id, __XE_SAMPLE_COPY_GROUP_BUSY,
> >>>>>>> +		     __engine_group_busyness_read(gt, XE_PMU_COPY_GROUP_BUSY(0)));
> >>>>>>> +	store_sample(pmu, gt_id, __XE_SAMPLE_MEDIA_GROUP_BUSY,
> >>>>>>> +		     __engine_group_busyness_read(gt, XE_PMU_MEDIA_GROUP_BUSY(0)));
> >>>>>>> +	store_sample(pmu, gt_id, __XE_SAMPLE_ANY_ENGINE_GROUP_BUSY,
> >>>>>>> +		     __engine_group_busyness_read(gt, XE_PMU_ANY_ENGINE_GROUP_BUSY(0)));
> >>>
> >>> Here why should we store everything, we should store only those events
> >>> which are enabled?
> >>
> >> The events are enabled only when they are opened which can happen after
> >> the device is suspended hence we need to store all. As in the present
> >> case device is put to suspend immediately after probe and event is
> >> opened post driver load is done.
> >
> > I don't think we can justify doing expensive PCIe reads and increasing the
> > time to go into runtime suspend, when PMU might not being used at all.
> >
> > If we store only enabled samples and start storing them only after they are
> > enabled, what would be the consequence of this? The first non-zero sample
> > seen by the perf tool would be wrong and later samples will be fine?
>
> Why do you say it is wrong perf reports relative from the time an event
> is opened.

I am asking you what the consequence is. Initial values will all be zero;
then there is some activity and we get a non-zero value, but that value will
include all the previous activity, so the first difference we send to perf
will be large/wrong, I think.
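
With made-up numbers: say the counter has already accumulated 5,000,000,000
ns of busyness by the time the event is opened. If prev_count starts at 0,
the first read computes new - prev = 5,000,000,000 and credits all of that
history to the first sampling interval, even if the engine was idle during
it; only the subsequent intervals report correct deltas.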

>
> >
> > If there is a consequence, we might have to go back to what I was saying
> > earlier about waking the device up and reading the enabled counter when
> > xe_pmu_event_start happens, to initialize the counter values. I am assuming
> > this will work?
>
> xe_pmu_event_start can be called when device is in suspend so we shall
> not wake up the device i.e event being enabled when in suspend, so if we
> do not store while going to suspend we will not have any value to
> consider when event is enabled after suspend as we need to present
> relative value.

That is why I am saying wake up the device and initialize the counters in
xe_pmu_event_start.

> >
> > Doing this IMO would be better than always doing these PCIe reads on
> > runtime suspend even when PMU is not being used
>
> we have been doing these in i915 not sure if it affected any timing
> requirements for runtime suspend.

Hmm, i915 indeed seems to be reading the RC6 residency in __gt_park even
when the RC6 event is not enabled or the PMU is not used at all.

@Tvrtko, any comments here?

Thanks.
--
Ashutosh

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [Intel-xe] [PATCH v2 2/2] drm/xe/pmu: Enable PMU interface
  2023-07-24 16:31                 ` Dixit, Ashutosh
@ 2023-07-25 11:38                   ` Iddamsetty, Aravind
  2023-08-07 21:16                     ` Dixit, Ashutosh
  0 siblings, 1 reply; 59+ messages in thread
From: Iddamsetty, Aravind @ 2023-07-25 11:38 UTC (permalink / raw)
  To: Dixit, Ashutosh; +Cc: Bommu Krishnaiah, intel-xe, Tvrtko Ursulin



On 24-07-2023 22:01, Dixit, Ashutosh wrote:
> On Mon, 24 Jul 2023 09:05:53 -0700, Iddamsetty, Aravind wrote:
>>
>> On 24-07-2023 21:22, Dixit, Ashutosh wrote:
>> Hi Ashutosh,
>>
>>> On Mon, 24 Jul 2023 01:03:23 -0700, Iddamsetty, Aravind wrote:
>>>
>>> Hi Aravind,
>>>
>>>> On 22-07-2023 11:34, Dixit, Ashutosh wrote:
>>>>>> On Fri, 21 Jul 2023 16:36:02 -0700, Dixit, Ashutosh wrote:
>>>>>> On Fri, 21 Jul 2023 04:51:09 -0700, Iddamsetty, Aravind wrote:
>>>>>>>>> +void engine_group_busyness_store(struct xe_gt *gt)
>>>>>>>>> +{
>>>>>>>>> +	struct xe_pmu *pmu = &gt->tile->xe->pmu;
>>>>>>>>> +	unsigned int gt_id = gt->info.id;
>>>>>>>>> +	unsigned long flags;
>>>>>>>>> +
>>>>>>>>> +	spin_lock_irqsave(&pmu->lock, flags);
>>>>>>>>> +
>>>>>>>>> +	store_sample(pmu, gt_id, __XE_SAMPLE_RENDER_GROUP_BUSY,
>>>>>>>>> +		     __engine_group_busyness_read(gt, XE_PMU_RENDER_GROUP_BUSY(0)));
>>>>>>>>> +	store_sample(pmu, gt_id, __XE_SAMPLE_COPY_GROUP_BUSY,
>>>>>>>>> +		     __engine_group_busyness_read(gt, XE_PMU_COPY_GROUP_BUSY(0)));
>>>>>>>>> +	store_sample(pmu, gt_id, __XE_SAMPLE_MEDIA_GROUP_BUSY,
>>>>>>>>> +		     __engine_group_busyness_read(gt, XE_PMU_MEDIA_GROUP_BUSY(0)));
>>>>>>>>> +	store_sample(pmu, gt_id, __XE_SAMPLE_ANY_ENGINE_GROUP_BUSY,
>>>>>>>>> +		     __engine_group_busyness_read(gt, XE_PMU_ANY_ENGINE_GROUP_BUSY(0)));
>>>>>
>>>>> Why should we store everything here? We should store only those events
>>>>> which are enabled.
>>>>
>>>> The events are enabled only when they are opened, which can happen after
>>>> the device is suspended; hence we need to store all of them. In the
>>>> present case the device is put to suspend immediately after probe, and
>>>> the event is opened only after driver load is done.
>>>
>>> I don't think we can justify doing expensive PCIe reads and increasing the
>>> time to go into runtime suspend when the PMU might not be used at all.
>>>
>>> If we store only enabled samples and start storing them only after they
>>> are enabled, what would be the consequence? The first non-zero sample
>>> seen by the perf tool would be wrong and later samples would be fine?
>>
>> Why do you say it is wrong? perf reports values relative to the time an
>> event is opened.
>
> I am asking you what the consequence is. The initial values will all be
> zero; then there is some activity and we get a non-zero value, but it
> will include all the previous activity, so the first difference we send
> to perf will be large/wrong, I think.

Correct. If we store only the enabled events at suspend, any other event
will have a 0 initial value; when we read the register later it will
hold all the accumulated activity, and since the past value we have is
0 we would end up reporting the entire value, which is wrong.

> 
>>
>>>
>>> If there is a consequence, we might have to go back to what I was saying
>>> earlier about waking the device up and reading the enabled counter when
>>> xe_pmu_event_start happens, to initialize the counter values. I am assuming
>>> this will work?
>>
>> xe_pmu_event_start can be called while the device is suspended, and we
>> shall not wake the device just because an event is enabled during
>> suspend. So if we do not store the counters while going into suspend,
>> we will have no value to use as a baseline when an event is enabled
>> after suspend, since we need to present relative values.
> 
> That is why I am saying wake up the device and initialize the counters in
> xe_pmu_event_start.

AFAIK, since the PMU doesn't take a DRM reference we shall not wake up
the device. If we were allowed to wake up the device, why would we even
need to store during suspend? Whenever a PMU event is opened we could
wake up the device and read the register directly.

Thanks,
Aravind.
> 
>>>
>>> Doing this IMO would be better than always doing these PCIe reads on
>>> runtime suspend even when PMU is not being used
>>
>> We have been doing these in i915; not sure if it affected any timing
>> requirements for runtime suspend.
>
> Hmm, i915 indeed seems to be reading the RC6 residency in __gt_park even
> when the RC6 event is not enabled or the PMU might not be used.
> 
> @Tvrtko, any comments here?
> 
> Thanks.
> --
> Ashutosh

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [Intel-xe] [PATCH v2 2/2] drm/xe/pmu: Enable PMU interface
  2023-07-25 11:38                   ` Iddamsetty, Aravind
@ 2023-08-07 21:16                     ` Dixit, Ashutosh
  2023-08-07 22:22                       ` Dixit, Ashutosh
  0 siblings, 1 reply; 59+ messages in thread
From: Dixit, Ashutosh @ 2023-08-07 21:16 UTC (permalink / raw)
  To: Iddamsetty, Aravind; +Cc: Bommu Krishnaiah, intel-xe, Tvrtko Ursulin

On Tue, 25 Jul 2023 04:38:45 -0700, Iddamsetty, Aravind wrote:
>

Hi Aravind,

> On 24-07-2023 22:01, Dixit, Ashutosh wrote:
> > On Mon, 24 Jul 2023 09:05:53 -0700, Iddamsetty, Aravind wrote:
> >>
> >>>> On 22-07-2023 11:34, Dixit, Ashutosh wrote:
> >>>>>> On Fri, 21 Jul 2023 16:36:02 -0700, Dixit, Ashutosh wrote:
> >>>>>> On Fri, 21 Jul 2023 04:51:09 -0700, Iddamsetty, Aravind wrote:
> >>>>>>>>> +void engine_group_busyness_store(struct xe_gt *gt)
> >>>>>>>>> +{
> >>>>>>>>> +	struct xe_pmu *pmu = &gt->tile->xe->pmu;
> >>>>>>>>> +	unsigned int gt_id = gt->info.id;
> >>>>>>>>> +	unsigned long flags;
> >>>>>>>>> +
> >>>>>>>>> +	spin_lock_irqsave(&pmu->lock, flags);
> >>>>>>>>> +
> >>>>>>>>> +	store_sample(pmu, gt_id, __XE_SAMPLE_RENDER_GROUP_BUSY,
> >>>>>>>>> +		     __engine_group_busyness_read(gt, XE_PMU_RENDER_GROUP_BUSY(0)));
> >>>>>>>>> +	store_sample(pmu, gt_id, __XE_SAMPLE_COPY_GROUP_BUSY,
> >>>>>>>>> +		     __engine_group_busyness_read(gt, XE_PMU_COPY_GROUP_BUSY(0)));
> >>>>>>>>> +	store_sample(pmu, gt_id, __XE_SAMPLE_MEDIA_GROUP_BUSY,
> >>>>>>>>> +		     __engine_group_busyness_read(gt, XE_PMU_MEDIA_GROUP_BUSY(0)));
> >>>>>>>>> +	store_sample(pmu, gt_id, __XE_SAMPLE_ANY_ENGINE_GROUP_BUSY,
> >>>>>>>>> +		     __engine_group_busyness_read(gt, XE_PMU_ANY_ENGINE_GROUP_BUSY(0)));
> >>>>>
> >>>>> Why should we store everything here? We should store only those events
> >>>>> which are enabled.
> >>>>
> >>>> The events are enabled only when they are opened, which can happen after
> >>>> the device is suspended; hence we need to store all of them. In the
> >>>> present case the device is put to suspend immediately after probe, and
> >>>> the event is opened only after driver load is done.
> >>>
> >>> I don't think we can justify doing expensive PCIe reads and increasing the
> >>> time to go into runtime suspend when the PMU might not be used at all.
> >>>
> >>> If we store only enabled samples and start storing them only after they
> >>> are enabled, what would be the consequence? The first non-zero sample
> >>> seen by the perf tool would be wrong and later samples would be fine?
> >>
> >> Why do you say it is wrong? perf reports values relative to the time an
> >> event is opened.
> >
> > I am asking you what the consequence is. The initial values will all be
> > zero; then there is some activity and we get a non-zero value, but it
> > will include all the previous activity, so the first difference we send
> > to perf will be large/wrong, I think.
>
> Correct. If we store only the enabled events at suspend, any other event
> will have a 0 initial value; when we read the register later it will
> hold all the accumulated activity, and since the past value we have is
> 0 we would end up reporting the entire value, which is wrong.

Ok, agreed, so we need to do "something".

>
> >
> >>
> >>>
> >>> If there is a consequence, we might have to go back to what I was saying
> >>> earlier about waking the device up and reading the enabled counter when
> >>> xe_pmu_event_start happens, to initialize the counter values. I am assuming
> >>> this will work?
> >>
> >> xe_pmu_event_start can be called while the device is suspended, and we
> >> shall not wake the device just because an event is enabled during
> >> suspend. So if we do not store the counters while going into suspend,
> >> we will have no value to use as a baseline when an event is enabled
> >> after suspend, since we need to present relative values.
> >
> > That is why I am saying wake up the device and initialize the counters in
> > xe_pmu_event_start.
>
> AFAIK, since the PMU doesn't take a DRM reference we shall not wake up
> the device.

Not sure what you mean because PMU does do this:

	drm_dev_get(&xe->drm);

Anyway I don't think it has anything to do with waking up the device since
that is done via xe_device_mem_access_get.

> If we were allowed to wake up the device, why would we even need to store
> during suspend? Whenever a PMU event is opened we could wake up the
> device and read the register directly.

No. That is why we are saving the counters during suspend so we don't have
to wake up the device just to read the counters. So the issue is only how
to *initialize* the counters.

You are saying we initialize by saving all counters during suspend, whether
or not they are enabled, which I don't agree with. I am saying we should
only read and store the counters which are enabled during normal
operation. And to initialize we wake the device up during
xe_pmu_event_start and store the counter value. Alternatively, we can zero
out the enabled counters during xe_pmu_event_start (the counters are RW)
but in any case that will also need waking up the device.

So this way we only wake up the device for initialization but not
afterwards.
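
Roughly, what I have in mind (an untested sketch only: store_sample() and
__engine_group_busyness_read() are from your patch, gt_to_xe() and
xe_device_mem_access_get()/_put() exist in xe, while event_to_gt() and
config_to_sample() are made-up helpers):

static void xe_pmu_event_start_sketch(struct perf_event *event, int flags)
{
        struct xe_gt *gt = event_to_gt(event);  /* made-up helper */
        struct xe_device *xe = gt_to_xe(gt);

        /*
         * Wake the device once, snapshot only the counter backing this
         * event, then let runtime PM drop the wakeref again.
         */
        xe_device_mem_access_get(xe);
        store_sample(&xe->pmu, gt->info.id,
                     config_to_sample(event->attr.config),
                     __engine_group_busyness_read(gt, event->attr.config));
        xe_device_mem_access_put(xe);
}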

Since this is the "base" patch we should try to set up a good
infrastructure in this patch so that other stuff which is exposed via PMU
can be easily added later.

> >>>
> >>> Doing this IMO would be better than always doing these PCIe reads on
> >>> runtime suspend even when PMU is not being used
> >>
> >> We have been doing these in i915; not sure if it affected any timing
> >> requirements for runtime suspend.
> >
> > Hmm, i915 indeed seems to be reading the RC6 residency in __gt_park even
> > when the RC6 event is not enabled or the PMU might not be used.
> >
> > @Tvrtko, any comments here?
> >
> > Thanks.
> > --
> > Ashutosh

Thanks.
--
Ashutosh

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [Intel-xe] [PATCH v2 2/2] drm/xe/pmu: Enable PMU interface
  2023-08-07 21:16                     ` Dixit, Ashutosh
@ 2023-08-07 22:22                       ` Dixit, Ashutosh
  2023-08-08 13:45                         ` Iddamsetty, Aravind
  0 siblings, 1 reply; 59+ messages in thread
From: Dixit, Ashutosh @ 2023-08-07 22:22 UTC (permalink / raw)
  To: Iddamsetty, Aravind; +Cc: Bommu Krishnaiah, intel-xe, Tvrtko Ursulin

On Mon, 07 Aug 2023 14:16:59 -0700, Dixit, Ashutosh wrote:
>

Hi Aravind,

> On Tue, 25 Jul 2023 04:38:45 -0700, Iddamsetty, Aravind wrote:
> > On 24-07-2023 22:01, Dixit, Ashutosh wrote:
> > > On Mon, 24 Jul 2023 09:05:53 -0700, Iddamsetty, Aravind wrote:
> > >>
> > >>>> On 22-07-2023 11:34, Dixit, Ashutosh wrote:
> > >>>>>> On Fri, 21 Jul 2023 16:36:02 -0700, Dixit, Ashutosh wrote:
> > >>>>>> On Fri, 21 Jul 2023 04:51:09 -0700, Iddamsetty, Aravind wrote:
> > >>>>>>>>> +void engine_group_busyness_store(struct xe_gt *gt)
> > >>>>>>>>> +{
> > >>>>>>>>> +	struct xe_pmu *pmu = &gt->tile->xe->pmu;
> > >>>>>>>>> +	unsigned int gt_id = gt->info.id;
> > >>>>>>>>> +	unsigned long flags;
> > >>>>>>>>> +
> > >>>>>>>>> +	spin_lock_irqsave(&pmu->lock, flags);
> > >>>>>>>>> +
> > >>>>>>>>> +	store_sample(pmu, gt_id, __XE_SAMPLE_RENDER_GROUP_BUSY,
> > >>>>>>>>> +		     __engine_group_busyness_read(gt, XE_PMU_RENDER_GROUP_BUSY(0)));
> > >>>>>>>>> +	store_sample(pmu, gt_id, __XE_SAMPLE_COPY_GROUP_BUSY,
> > >>>>>>>>> +		     __engine_group_busyness_read(gt, XE_PMU_COPY_GROUP_BUSY(0)));
> > >>>>>>>>> +	store_sample(pmu, gt_id, __XE_SAMPLE_MEDIA_GROUP_BUSY,
> > >>>>>>>>> +		     __engine_group_busyness_read(gt, XE_PMU_MEDIA_GROUP_BUSY(0)));
> > >>>>>>>>> +	store_sample(pmu, gt_id, __XE_SAMPLE_ANY_ENGINE_GROUP_BUSY,
> > >>>>>>>>> +		     __engine_group_busyness_read(gt, XE_PMU_ANY_ENGINE_GROUP_BUSY(0)));
> > >>>>>
> > >>>>> Why should we store everything here? We should store only those events
> > >>>>> which are enabled.
> > >>>>
> > >>>> The events are enabled only when they are opened, which can happen after
> > >>>> the device is suspended; hence we need to store all of them. In the
> > >>>> present case the device is put to suspend immediately after probe, and
> > >>>> the event is opened only after driver load is done.
> > >>>
> > >>> I don't think we can justify doing expensive PCIe reads and increasing the
> > >>> time to go into runtime suspend when the PMU might not be used at all.
> > >>>
> > >>> If we store only enabled samples and start storing them only after they
> > >>> are enabled, what would be the consequence? The first non-zero sample
> > >>> seen by the perf tool would be wrong and later samples would be fine?
> > >>
> > >> Why do you say it is wrong? perf reports values relative to the time an
> > >> event is opened.
> > >
> > > I am asking you what the consequence is. The initial values will all be
> > > zero; then there is some activity and we get a non-zero value, but it
> > > will include all the previous activity, so the first difference we send
> > > to perf will be large/wrong, I think.
> >
> > Correct. If we store only the enabled events at suspend, any other event
> > will have a 0 initial value; when we read the register later it will
> > hold all the accumulated activity, and since the past value we have is
> > 0 we would end up reporting the entire value, which is wrong.
>
> Ok, agreed, so we need to do "something".
>
> >
> > >
> > >>
> > >>>
> > >>> If there is a consequence, we might have to go back to what I was saying
> > >>> earlier about waking the device up and reading the enabled counter when
> > >>> xe_pmu_event_start happens, to initialize the counter values. I am assuming
> > >>> this will work?
> > >>
> > >> xe_pmu_event_start can be called while the device is suspended, and we
> > >> shall not wake the device just because an event is enabled during
> > >> suspend. So if we do not store the counters while going into suspend,
> > >> we will have no value to use as a baseline when an event is enabled
> > >> after suspend, since we need to present relative values.
> > >
> > > That is why I am saying wake up the device and initialize the counters in
> > > xe_pmu_event_start.
> >
> > AFAIK, since the PMU doesn't take a DRM reference we shall not wake up
> > the device.
>
> Not sure what you mean because PMU does do this:
>
>	drm_dev_get(&xe->drm);
>
> Anyway I don't think it has anything to do with waking up the device since
> that is done via xe_device_mem_access_get.
>
> > If we were allowed to wake up the device, why would we even need to store
> > during suspend? Whenever a PMU event is opened we could wake up the
> > device and read the register directly.
>
> No. That is why we are saving the counters during suspend so we don't have
> to wake up the device just to read the counters. So the issue is only how
> to *initialize* the counters.
>
> You are saying we initialize by saving all counters during suspend, whether
> or not they are enabled, which I don't agree with. I am saying we should
> only read and store the counters which are enabled during normal
> operation. And to initialize we wake the device up during
> xe_pmu_event_start and store the counter value. Alternatively, we can zero
> out the enabled counters during xe_pmu_event_start (the counters are RW)
> but in any case that will also need waking up the device.
>
> So this way we only wake up the device for initialization but not
> afterwards.
>
> Since this is the "base" patch we should try to set up a good
> infrastructure in this patch so that other stuff which is exposed via PMU
> can be easily added later.

After thinking a bit more about this: though I think this needs to be
done, I won't insist that we do it in this patch; we can review and do
it in a subsequent patch (if no one else objects).

So let's skip this for now. If you can generate a new version of the
patch addressing all of the other review comments, we can review that
again and try to get it merged.

Thanks.
--
Ashutosh

> > >>>
> > >>> Doing this IMO would be better than always doing these PCIe reads on
> > >>> runtime suspend even when PMU is not being used
> > >>
> > >> We have been doing these in i915; not sure if it affected any timing
> > >> requirements for runtime suspend.
> > >
> > > Hmm, i915 indeed seems to be reading the RC6 residency in __gt_park even
> > > when the RC6 event is not enabled or the PMU might not be used.
> > >
> > > @Tvrtko, any comments here?

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [Intel-xe] [PATCH v2 2/2] drm/xe/pmu: Enable PMU interface
  2023-08-07 22:22                       ` Dixit, Ashutosh
@ 2023-08-08 13:45                         ` Iddamsetty, Aravind
  2023-08-08 15:18                           ` Dixit, Ashutosh
  0 siblings, 1 reply; 59+ messages in thread
From: Iddamsetty, Aravind @ 2023-08-08 13:45 UTC (permalink / raw)
  To: Dixit, Ashutosh; +Cc: Bommu Krishnaiah, intel-xe, Tvrtko Ursulin



On 08-08-2023 03:52, Dixit, Ashutosh wrote:
> On Mon, 07 Aug 2023 14:16:59 -0700, Dixit, Ashutosh wrote:
>>
> 
> Hi Aravind,

Hi Ashutosh,

I have sent a new revision, but commenting here on a few of the comments.
> 
>> On Tue, 25 Jul 2023 04:38:45 -0700, Iddamsetty, Aravind wrote:
>>> On 24-07-2023 22:01, Dixit, Ashutosh wrote:
>>>> On Mon, 24 Jul 2023 09:05:53 -0700, Iddamsetty, Aravind wrote:
>>>>>
>>>>>>> On 22-07-2023 11:34, Dixit, Ashutosh wrote:
>>>>>>>>> On Fri, 21 Jul 2023 16:36:02 -0700, Dixit, Ashutosh wrote:
>>>>>>>>> On Fri, 21 Jul 2023 04:51:09 -0700, Iddamsetty, Aravind wrote:
>>>>>>>>>>>> +void engine_group_busyness_store(struct xe_gt *gt)
>>>>>>>>>>>> +{
>>>>>>>>>>>> +	struct xe_pmu *pmu = &gt->tile->xe->pmu;
>>>>>>>>>>>> +	unsigned int gt_id = gt->info.id;
>>>>>>>>>>>> +	unsigned long flags;
>>>>>>>>>>>> +
>>>>>>>>>>>> +	spin_lock_irqsave(&pmu->lock, flags);
>>>>>>>>>>>> +
>>>>>>>>>>>> +	store_sample(pmu, gt_id, __XE_SAMPLE_RENDER_GROUP_BUSY,
>>>>>>>>>>>> +		     __engine_group_busyness_read(gt, XE_PMU_RENDER_GROUP_BUSY(0)));
>>>>>>>>>>>> +	store_sample(pmu, gt_id, __XE_SAMPLE_COPY_GROUP_BUSY,
>>>>>>>>>>>> +		     __engine_group_busyness_read(gt, XE_PMU_COPY_GROUP_BUSY(0)));
>>>>>>>>>>>> +	store_sample(pmu, gt_id, __XE_SAMPLE_MEDIA_GROUP_BUSY,
>>>>>>>>>>>> +		     __engine_group_busyness_read(gt, XE_PMU_MEDIA_GROUP_BUSY(0)));
>>>>>>>>>>>> +	store_sample(pmu, gt_id, __XE_SAMPLE_ANY_ENGINE_GROUP_BUSY,
>>>>>>>>>>>> +		     __engine_group_busyness_read(gt, XE_PMU_ANY_ENGINE_GROUP_BUSY(0)));
>>>>>>>>
>>>>>>>> Why should we store everything here? We should store only those events
>>>>>>>> which are enabled.
>>>>>>>
>>>>>>> The events are enabled only when they are opened, which can happen after
>>>>>>> the device is suspended; hence we need to store all of them. In the
>>>>>>> present case the device is put to suspend immediately after probe, and
>>>>>>> the event is opened only after driver load is done.
>>>>>>
>>>>>> I don't think we can justify doing expensive PCIe reads and increasing the
>>>>>> time to go into runtime suspend when the PMU might not be used at all.
>>>>>>
>>>>>> If we store only enabled samples and start storing them only after they
>>>>>> are enabled, what would be the consequence? The first non-zero sample
>>>>>> seen by the perf tool would be wrong and later samples would be fine?
>>>>>
>>>>> Why do you say it is wrong? perf reports values relative to the time an
>>>>> event is opened.
>>>>
>>>> I am asking you what the consequence is. The initial values will all be
>>>> zero; then there is some activity and we get a non-zero value, but it
>>>> will include all the previous activity, so the first difference we send
>>>> to perf will be large/wrong, I think.
>>>
>>> Correct. If we store only the enabled events at suspend, any other event
>>> will have a 0 initial value; when we read the register later it will
>>> hold all the accumulated activity, and since the past value we have is
>>> 0 we would end up reporting the entire value, which is wrong.
>>
>> Ok, agreed, so we need to do "something".
>>
>>>
>>>>
>>>>>
>>>>>>
>>>>>> If there is a consequence, we might have to go back to what I was saying
>>>>>> earlier about waking the device up and reading the enabled counter when
>>>>>> xe_pmu_event_start happens, to initialize the counter values. I am assuming
>>>>>> this will work?
>>>>>
>>>>> xe_pmu_event_start can be called while the device is suspended, and we
>>>>> shall not wake the device just because an event is enabled during
>>>>> suspend. So if we do not store the counters while going into suspend,
>>>>> we will have no value to use as a baseline when an event is enabled
>>>>> after suspend, since we need to present relative values.
>>>>
>>>> That is why I am saying wake up the device and initialize the counters in
>>>> xe_pmu_event_start.
>>>
>>> AFAIK, since the PMU doesn't take a DRM reference we shall not wake up
>>> the device.
>>
>> Not sure what you mean because PMU does do this:
>>
>> 	drm_dev_get(&xe->drm);

Sorry, it was my misunderstanding here; please ignore.

>>
>> Anyway I don't think it has anything to do with waking up the device since
>> that is done via xe_device_mem_access_get.
>>
>>> If we were allowed to wake up the device, why would we even need to store
>>> during suspend? Whenever a PMU event is opened we could wake up the
>>> device and read the register directly.
>>
>> No. That is why we are saving the counters during suspend so we don't have
>> to wake up the device just to read the counters. So the issue is only how
>> to *initialize* the counters.
>>
>> You are saying we initialize by saving all counters during suspend, whether
>> or not they are enabled, which I don't agree with. I am saying we should
>> only read and store the counters which are enabled during normal
>> operation. And to initialize we wake the device up during
>> xe_pmu_event_start and store the counter value. Alternatively, we can zero
>> out the enabled counters during xe_pmu_event_start (the counters are RW)
>> but in any case that will also need waking up the device.

When the driver is initially loaded there might not be any users of the
device and it might enter suspend immediately, so at the time of suspend
no event is enabled. But the PMU can be opened just after suspend
without any actual work on the device, so the device stays suspended;
we should not be waking the device just to read a register in
event_start or any of the other callbacks without any real workload or
user of the device.

So ideally, if the device didn't enter suspend, the counter is
initialized on the first read while the device is still awake.
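
That is, something along these lines (sketch only: device_awake() stands
in for whatever runtime-PM check the driver does, config_to_sample() is a
made-up helper, and read_sample() is assumed to be the counterpart of
store_sample() from the patch):

static u64 busyness_read_sketch(struct xe_gt *gt, u64 config)
{
        struct xe_pmu *pmu = &gt->tile->xe->pmu;
        u64 val;

        /* While suspended, return the last stored sample instead of
         * waking the device. */
        if (!device_awake(gt))
                return read_sample(pmu, gt->info.id,
                                   config_to_sample(config));

        /* Awake: read the register and refresh the stored copy; the
         * first such read also seeds the baseline. */
        val = __engine_group_busyness_read(gt, config);
        store_sample(pmu, gt->info.id, config_to_sample(config), val);
        return val;
}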

Thanks,
Aravind.
>>
>> So this way we only wake up the device for initialization but not
>> afterwards.
>>
>> Since this is the "base" patch we should try to set up a good
>> infrastructure in this patch so that other stuff which is exposed via PMU
>> can be easily added later.
> 
> After thinking a bit more about this: though I think this needs to be
> done, I won't insist that we do it in this patch; we can review and do
> it in a subsequent patch (if no one else objects).
> 
> So let's skip this for now. If you can generate a new version of the
> patch addressing all of the other review comments, we can review that
> again and try to get it merged.
> 
> Thanks.
> --
> Ashutosh
> 
>>>>>>
>>>>>> Doing this IMO would be better than always doing these PCIe reads on
>>>>>> runtime suspend even when PMU is not being used
>>>>>
>>>>> We have been doing these in i915; not sure if it affected any timing
>>>>> requirements for runtime suspend.
>>>>
>>>> Hmm, i915 indeed seems to be reading the RC6 residency in __gt_park even
>>>> when the RC6 event is not enabled or the PMU might not be used.
>>>>
>>>> @Tvrtko, any comments here?

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [Intel-xe] [PATCH v2 2/2] drm/xe/pmu: Enable PMU interface
  2023-08-08 13:45                         ` Iddamsetty, Aravind
@ 2023-08-08 15:18                           ` Dixit, Ashutosh
  2023-08-09  4:26                             ` Iddamsetty, Aravind
  0 siblings, 1 reply; 59+ messages in thread
From: Dixit, Ashutosh @ 2023-08-08 15:18 UTC (permalink / raw)
  To: Iddamsetty, Aravind; +Cc: Bommu Krishnaiah, intel-xe, Tvrtko Ursulin

On Tue, 08 Aug 2023 06:45:36 -0700, Iddamsetty, Aravind wrote:
>

Hi Aravind,

> On 08-08-2023 03:52, Dixit, Ashutosh wrote:
> > On Mon, 07 Aug 2023 14:16:59 -0700, Dixit, Ashutosh wrote:
> I have sent a new revision, but commenting here on a few of the comments.
> >
> >> On Tue, 25 Jul 2023 04:38:45 -0700, Iddamsetty, Aravind wrote:
> >>> On 24-07-2023 22:01, Dixit, Ashutosh wrote:
> >>>> On Mon, 24 Jul 2023 09:05:53 -0700, Iddamsetty, Aravind wrote:
> >>>>>
> >>>>>>> On 22-07-2023 11:34, Dixit, Ashutosh wrote:
> >>>>>>>>> On Fri, 21 Jul 2023 16:36:02 -0700, Dixit, Ashutosh wrote:
> >>>>>>>>> On Fri, 21 Jul 2023 04:51:09 -0700, Iddamsetty, Aravind wrote:
> >>>>>>>>>>>> +void engine_group_busyness_store(struct xe_gt *gt)
> >>>>>>>>>>>> +{
> >>>>>>>>>>>> +	struct xe_pmu *pmu = &gt->tile->xe->pmu;
> >>>>>>>>>>>> +	unsigned int gt_id = gt->info.id;
> >>>>>>>>>>>> +	unsigned long flags;
> >>>>>>>>>>>> +
> >>>>>>>>>>>> +	spin_lock_irqsave(&pmu->lock, flags);
> >>>>>>>>>>>> +
> >>>>>>>>>>>> +	store_sample(pmu, gt_id, __XE_SAMPLE_RENDER_GROUP_BUSY,
> >>>>>>>>>>>> +		     __engine_group_busyness_read(gt, XE_PMU_RENDER_GROUP_BUSY(0)));
> >>>>>>>>>>>> +	store_sample(pmu, gt_id, __XE_SAMPLE_COPY_GROUP_BUSY,
> >>>>>>>>>>>> +		     __engine_group_busyness_read(gt, XE_PMU_COPY_GROUP_BUSY(0)));
> >>>>>>>>>>>> +	store_sample(pmu, gt_id, __XE_SAMPLE_MEDIA_GROUP_BUSY,
> >>>>>>>>>>>> +		     __engine_group_busyness_read(gt, XE_PMU_MEDIA_GROUP_BUSY(0)));
> >>>>>>>>>>>> +	store_sample(pmu, gt_id, __XE_SAMPLE_ANY_ENGINE_GROUP_BUSY,
> >>>>>>>>>>>> +		     __engine_group_busyness_read(gt, XE_PMU_ANY_ENGINE_GROUP_BUSY(0)));
> >>>>>>>>
> >>>>>>>> Why should we store everything here? We should store only those events
> >>>>>>>> which are enabled.
> >>>>>>>
> >>>>>>> The events are enabled only when they are opened, which can happen after
> >>>>>>> the device is suspended; hence we need to store all of them. In the
> >>>>>>> present case the device is put to suspend immediately after probe, and
> >>>>>>> the event is opened only after driver load is done.
> >>>>>>
> >>>>>> I don't think we can justify doing expensive PCIe reads and increasing the
> >>>>>> time to go into runtime suspend when the PMU might not be used at all.
> >>>>>>
> >>>>>> If we store only enabled samples and start storing them only after they
> >>>>>> are enabled, what would be the consequence? The first non-zero sample
> >>>>>> seen by the perf tool would be wrong and later samples would be fine?
> >>>>>
> >>>>> Why do you say it is wrong? perf reports values relative to the time an
> >>>>> event is opened.
> >>>>
> >>>> I am asking you what the consequence is. The initial values will all be
> >>>> zero; then there is some activity and we get a non-zero value, but it
> >>>> will include all the previous activity, so the first difference we send
> >>>> to perf will be large/wrong, I think.
> >>>
> >>> Correct. If we store only the enabled events at suspend, any other event
> >>> will have a 0 initial value; when we read the register later it will
> >>> hold all the accumulated activity, and since the past value we have is
> >>> 0 we would end up reporting the entire value, which is wrong.
> >>
> >> Ok, agreed, so we need to do "something".
> >>
> >>>
> >>>>
> >>>>>
> >>>>>>
> >>>>>> If there is a consequence, we might have to go back to what I was saying
> >>>>>> earlier about waking the device up and reading the enabled counter when
> >>>>>> xe_pmu_event_start happens, to initialize the counter values. I am assuming
> >>>>>> this will work?
> >>>>>
> >>>>> xe_pmu_event_start can be called while the device is suspended, and we
> >>>>> shall not wake the device just because an event is enabled during
> >>>>> suspend. So if we do not store the counters while going into suspend,
> >>>>> we will have no value to use as a baseline when an event is enabled
> >>>>> after suspend, since we need to present relative values.
> >>>>
> >>>> That is why I am saying wake up the device and initialize the counters in
> >>>> xe_pmu_event_start.
> >>>
> >>> AFAIK, since the PMU doesn't take a DRM reference we shall not wake up
> >>> the device.
> >>
> >> Not sure what you mean because PMU does do this:
> >>
> >>	drm_dev_get(&xe->drm);
>
> Sorry, it was my misunderstanding here; please ignore.
>
> >>
> >> Anyway I don't think it has anything to do with waking up the device since
> >> that is done via xe_device_mem_access_get.
> >>
> >>> If we were allowed to wake up the device, why would we even need to store
> >>> during suspend? Whenever a PMU event is opened we could wake up the
> >>> device and read the register directly.
> >>
> >> No. That is why we are saving the counters during suspend so we don't have
> >> to wake up the device just to read the counters. So the issue is only how
> >> to *initialize* the counters.
> >>
> >> You are saying we initialize by saving all counters during suspend, whether
> >> or not they are enabled, which I don't agree with. I am saying we should
> >> only read and store the counters which are enabled during normal
> >> operation. And to initialize we wake the device up during
> >> xe_pmu_event_start and store the counter value. Alternatively, we can zero
> >> out the enabled counters during xe_pmu_event_start (the counters are RW)
> >> but in any case that will also need waking up the device.
>
> >> When the driver is initially loaded there might not be any users of the
> >> device and it might enter suspend immediately, so at the time of suspend
> >> no event is enabled. But the PMU can be opened just after suspend
> >> without any actual work on the device, so the device stays suspended;
> >> we should not be waking the device just to read a register in
> >> event_start or any of the other callbacks without any real workload or
> >> user of the device.
> >>
> >> So ideally, if the device didn't enter suspend, the counter is
> >> initialized on the first read while the device is still awake.

Yes, I understand. But the issue is: why are we reading (doing expensive
reads across PCIe) and saving all these registers for the PMU when it's
possible the PMU might not be used at all and none of these events might
be enabled at all?

So to me the lesser evil is to wake up the device at xe_pmu_event_start
time and initialize the counters. We only wake the device up once at
init time, not during normal operation. Whereas in your case, you read
and save these registers on every suspend, whether or not the PMU is or
will be used.

> >> So this way we only wake up the device for initialization but not
> >> afterwards.
> >>
> >> Since this is the "base" patch we should try to set up a good
> >> infrastructure in this patch so that other stuff which is exposed via PMU
> >> can be easily added later.
> >
> > After thinking a bit more about this: though I think this needs to be
> > done, I won't insist that we do it in this patch; we can review and do
> > it in a subsequent patch (if no one else objects).
> >
> > So let's skip this for now. If you can generate a new version of the
> > patch addressing all of the other review comments, we can review that
> > again and try to get it merged.
> >
> > Thanks.
> > --
> > Ashutosh
> >
> >>>>>>
> >>>>>> Doing this IMO would be better than always doing these PCIe reads on
> >>>>>> runtime suspend even when PMU is not being used
> >>>>>
> >>>>> We have been doing these in i915; not sure if it affected any timing
> >>>>> requirements for runtime suspend.
> >>>>
> >>>> Hmm, i915 indeed seems to be reading the RC6 residency in __gt_park even
> >>>> when the RC6 event is not enabled or the PMU might not be used.
> >>>>
> >>>> @Tvrtko, any comments here?

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [Intel-xe] [PATCH v2 2/2] drm/xe/pmu: Enable PMU interface
  2023-08-08 15:18                           ` Dixit, Ashutosh
@ 2023-08-09  4:26                             ` Iddamsetty, Aravind
  2023-08-09  5:02                               ` Dixit, Ashutosh
  0 siblings, 1 reply; 59+ messages in thread
From: Iddamsetty, Aravind @ 2023-08-09  4:26 UTC (permalink / raw)
  To: Dixit, Ashutosh; +Cc: Bommu Krishnaiah, intel-xe, Tvrtko Ursulin



On 08-08-2023 20:48, Dixit, Ashutosh wrote:

Hi Ashutosh,
> On Tue, 08 Aug 2023 06:45:36 -0700, Iddamsetty, Aravind wrote:
>>
> 
> Hi Aravind,
> 
>> On 08-08-2023 03:52, Dixit, Ashutosh wrote:
>>> On Mon, 07 Aug 2023 14:16:59 -0700, Dixit, Ashutosh wrote:
>> I have sent a new revision, but commenting here on a few of the comments.
>>>
>>>> On Tue, 25 Jul 2023 04:38:45 -0700, Iddamsetty, Aravind wrote:
>>>>> On 24-07-2023 22:01, Dixit, Ashutosh wrote:
>>>>>> On Mon, 24 Jul 2023 09:05:53 -0700, Iddamsetty, Aravind wrote:
>>>>>>>
>>>>>>>>> On 22-07-2023 11:34, Dixit, Ashutosh wrote:
>>>>>>>>>>> On Fri, 21 Jul 2023 16:36:02 -0700, Dixit, Ashutosh wrote:
>>>>>>>>>>> On Fri, 21 Jul 2023 04:51:09 -0700, Iddamsetty, Aravind wrote:
>>>>>>>>>>>>>> +void engine_group_busyness_store(struct xe_gt *gt)
>>>>>>>>>>>>>> +{
>>>>>>>>>>>>>> +	struct xe_pmu *pmu = &gt->tile->xe->pmu;
>>>>>>>>>>>>>> +	unsigned int gt_id = gt->info.id;
>>>>>>>>>>>>>> +	unsigned long flags;
>>>>>>>>>>>>>> +
>>>>>>>>>>>>>> +	spin_lock_irqsave(&pmu->lock, flags);
>>>>>>>>>>>>>> +
>>>>>>>>>>>>>> +	store_sample(pmu, gt_id, __XE_SAMPLE_RENDER_GROUP_BUSY,
>>>>>>>>>>>>>> +		     __engine_group_busyness_read(gt, XE_PMU_RENDER_GROUP_BUSY(0)));
>>>>>>>>>>>>>> +	store_sample(pmu, gt_id, __XE_SAMPLE_COPY_GROUP_BUSY,
>>>>>>>>>>>>>> +		     __engine_group_busyness_read(gt, XE_PMU_COPY_GROUP_BUSY(0)));
>>>>>>>>>>>>>> +	store_sample(pmu, gt_id, __XE_SAMPLE_MEDIA_GROUP_BUSY,
>>>>>>>>>>>>>> +		     __engine_group_busyness_read(gt, XE_PMU_MEDIA_GROUP_BUSY(0)));
>>>>>>>>>>>>>> +	store_sample(pmu, gt_id, __XE_SAMPLE_ANY_ENGINE_GROUP_BUSY,
>>>>>>>>>>>>>> +		     __engine_group_busyness_read(gt, XE_PMU_ANY_ENGINE_GROUP_BUSY(0)));
>>>>>>>>>>
>>>>>>>>>> Why should we store everything here? We should store only those events
>>>>>>>>>> which are enabled.
>>>>>>>>>
>>>>>>>>> The events are enabled only when they are opened, which can happen after
>>>>>>>>> the device is suspended; hence we need to store all of them. In the
>>>>>>>>> present case the device is put to suspend immediately after probe, and
>>>>>>>>> the event is opened only after driver load is done.
>>>>>>>>
>>>>>>>> I don't think we can justify doing expensive PCIe reads and increasing the
>>>>>>>> time to go into runtime suspend when the PMU might not be used at all.
>>>>>>>>
>>>>>>>> If we store only enabled samples and start storing them only after they
>>>>>>>> are enabled, what would be the consequence? The first non-zero sample
>>>>>>>> seen by the perf tool would be wrong and later samples would be fine?
>>>>>>>
>>>>>>> Why do you say it is wrong? perf reports values relative to the time an
>>>>>>> event is opened.
>>>>>>
>>>>>> I am asking you what the consequence is. The initial values will all be
>>>>>> zero; then there is some activity and we get a non-zero value, but it
>>>>>> will include all the previous activity, so the first difference we send
>>>>>> to perf will be large/wrong, I think.
>>>>>
>>>>> Correct. If we store only the enabled events at suspend, any other event
>>>>> will have a 0 initial value; when we read the register later it will
>>>>> hold all the accumulated activity, and since the past value we have is
>>>>> 0 we would end up reporting the entire value, which is wrong.
>>>>
>>>> Ok, agreed, so we need to do "something".
>>>>
>>>>>
>>>>>>
>>>>>>>
>>>>>>>>
>>>>>>>> If there is a consequence, we might have to go back to what I was saying
>>>>>>>> earlier about waking the device up and reading the enabled counter when
>>>>>>>> xe_pmu_event_start happens, to initialize the counter values. I am assuming
>>>>>>>> this will work?
>>>>>>>
>>>>>>> xe_pmu_event_start can be called while the device is suspended, and we
>>>>>>> shall not wake the device just because an event is enabled during
>>>>>>> suspend. So if we do not store the counters while going into suspend,
>>>>>>> we will have no value to use as a baseline when an event is enabled
>>>>>>> after suspend, since we need to present relative values.
>>>>>>
>>>>>> That is why I am saying wake up the device and initialize the counters in
>>>>>> xe_pmu_event_start.
>>>>>
>>>>> AFAIK, since the PMU doesn't take a DRM reference we shall not wake up
>>>>> the device.
>>>>
>>>> Not sure what you mean because PMU does do this:
>>>>
>>>> 	drm_dev_get(&xe->drm);
>>
>> Sorry, it was my misunderstanding here; please ignore.
>>
>>>>
>>>> Anyway I don't think it has anything to do with waking up the device since
>>>> that is done via xe_device_mem_access_get.
>>>>
>>>>> If we were allowed to wake up the device, why would we even need to store
>>>>> during suspend? Whenever a PMU event is opened we could wake up the
>>>>> device and read the register directly.
>>>>
>>>> No. That is why we are saving the counters during suspend so we don't have
>>>> to wake up the device just to read the counters. So the issue is only how
>>>> to *initialize* the counters.
>>>>
>>>> You are saying we initialize by saving all counters during suspend, whether
>>>> or not they are enabled, which I don't agree with. I am saying we should
>>>> only read and store the counters which are enabled during normal
>>>> operation. And to initialize we wake the device up during
>>>> xe_pmu_event_start and store the counter value. Alternatively, we can zero
>>>> out the enabled counters during xe_pmu_event_start (the counters are RW)
>>>> but in any case that will also need waking up the device.
>>
>> When the driver is initially loaded there might not be any users of the
>> device and it might enter suspend immediately, so at the time of suspend
>> no event is enabled. But the PMU can be opened just after suspend
>> without any actual work on the device, so the device stays suspended;
>> we should not be waking the device just to read a register in
>> event_start or any of the other callbacks without any real workload or
>> user of the device.
>>
>> So ideally, if the device didn't enter suspend, the counter is
>> initialized on the first read while the device is still awake.
> 
> Yes, I understand. But the issue is: why are we reading (doing expensive
> reads across PCIe) and saving all these registers for the PMU when it's
> possible the PMU might not be used at all and none of these events might
> be enabled at all?
> 
> So to me the lesser evil is to wake up the device at xe_pmu_event_start
> time and initialize the counters. We only wake the device up once at
> init time, not during normal operation. Whereas in your case, you read
> and save these registers on every suspend, whether or not the PMU is or
> will be used.

I'm not sure which is more costly: saving the registers during suspend,
or waking the device on event_init. We should remember that the same
event can be opened by multiple listeners, so the device would wake up
multiple times if it got suspended in between opening those events.

And if we do multiple PCIe reads, are you suspecting it will affect any
timing requirements that suspend may have? I doubt PCIe reads would
take long enough to miss any timings, at least in this case. But
anyway, as you said, we can take this up later if we know we will be
adding more such counters in the future.

Thanks,
Aravind.
> 
>>>> So this way we only wake up the device for initialization but not
>>>> afterwards.
>>>>
>>>> Since this is the "base" patch we should try to set up a good
>>>> infrastructure in this patch so that other stuff which is exposed via PMU
>>>> can be easily added later.
>>>
>>> After thinking a bit more about this: though I think this needs to be
>>> done, I won't insist that we do it in this patch; we can review and do
>>> it in a subsequent patch (if no one else objects).
>>>
>>> So let's skip this for now. If you can generate a new version of the
>>> patch addressing all of the other review comments, we can review that
>>> again and try to get it merged.
>>>
>>> Thanks.
>>> --
>>> Ashutosh
>>>
>>>>>>>>
>>>>>>>> Doing this IMO would be better than always doing these PCIe reads on
>>>>>>>> runtime suspend even when PMU is not being used
>>>>>>>
>>>>>>> We have been doing these in i915; not sure if it affected any timing
>>>>>>> requirements for runtime suspend.
>>>>>>
>>>>>> Hmm, i915 indeed seems to be reading the RC6 residency in __gt_park even
>>>>>> when the RC6 event is not enabled or the PMU might not be used.
>>>>>>
>>>>>> @Tvrtko, any comments here?

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [Intel-xe] [PATCH v2 2/2] drm/xe/pmu: Enable PMU interface
  2023-08-09  4:26                             ` Iddamsetty, Aravind
@ 2023-08-09  5:02                               ` Dixit, Ashutosh
  0 siblings, 0 replies; 59+ messages in thread
From: Dixit, Ashutosh @ 2023-08-09  5:02 UTC (permalink / raw)
  To: Iddamsetty, Aravind; +Cc: Bommu Krishnaiah, intel-xe, Tvrtko Ursulin

On Tue, 08 Aug 2023 21:26:20 -0700, Iddamsetty, Aravind wrote:
>
> On 08-08-2023 20:48, Dixit, Ashutosh wrote:
> > On Tue, 08 Aug 2023 06:45:36 -0700, Iddamsetty, Aravind wrote:
> >> On 08-08-2023 03:52, Dixit, Ashutosh wrote:
> >>> On Mon, 07 Aug 2023 14:16:59 -0700, Dixit, Ashutosh wrote:
> >> I have sent a new revision, but commenting here on a few of the comments.
> >>>
> >>>> On Tue, 25 Jul 2023 04:38:45 -0700, Iddamsetty, Aravind wrote:
> >>>>> On 24-07-2023 22:01, Dixit, Ashutosh wrote:
> >>>>>> On Mon, 24 Jul 2023 09:05:53 -0700, Iddamsetty, Aravind wrote:
> >>>>>>>
> >>>>>>>>> On 22-07-2023 11:34, Dixit, Ashutosh wrote:
> >>>>>>>>>>> On Fri, 21 Jul 2023 16:36:02 -0700, Dixit, Ashutosh wrote:
> >>>>>>>>>>> On Fri, 21 Jul 2023 04:51:09 -0700, Iddamsetty, Aravind wrote:
> >>>>>>>>>>>>>> +void engine_group_busyness_store(struct xe_gt *gt)
> >>>>>>>>>>>>>> +{
> >>>>>>>>>>>>>> +	struct xe_pmu *pmu = &gt->tile->xe->pmu;
> >>>>>>>>>>>>>> +	unsigned int gt_id = gt->info.id;
> >>>>>>>>>>>>>> +	unsigned long flags;
> >>>>>>>>>>>>>> +
> >>>>>>>>>>>>>> +	spin_lock_irqsave(&pmu->lock, flags);
> >>>>>>>>>>>>>> +
> >>>>>>>>>>>>>> +	store_sample(pmu, gt_id, __XE_SAMPLE_RENDER_GROUP_BUSY,
> >>>>>>>>>>>>>> +		     __engine_group_busyness_read(gt, XE_PMU_RENDER_GROUP_BUSY(0)));
> >>>>>>>>>>>>>> +	store_sample(pmu, gt_id, __XE_SAMPLE_COPY_GROUP_BUSY,
> >>>>>>>>>>>>>> +		     __engine_group_busyness_read(gt, XE_PMU_COPY_GROUP_BUSY(0)));
> >>>>>>>>>>>>>> +	store_sample(pmu, gt_id, __XE_SAMPLE_MEDIA_GROUP_BUSY,
> >>>>>>>>>>>>>> +		     __engine_group_busyness_read(gt, XE_PMU_MEDIA_GROUP_BUSY(0)));
> >>>>>>>>>>>>>> +	store_sample(pmu, gt_id, __XE_SAMPLE_ANY_ENGINE_GROUP_BUSY,
> >>>>>>>>>>>>>> +		     __engine_group_busyness_read(gt, XE_PMU_ANY_ENGINE_GROUP_BUSY(0)));
> >>>>>>>>>>
> >>>>>>>>>> Why should we store everything here? We should store only those events
> >>>>>>>>>> which are enabled.
> >>>>>>>>>
> >>>>>>>>> The events are enabled only when they are opened, which can happen after
> >>>>>>>>> the device is suspended; hence we need to store all of them. In the
> >>>>>>>>> present case the device is put to suspend immediately after probe, and
> >>>>>>>>> the event is opened only after driver load is done.
> >>>>>>>>
> >>>>>>>> I don't think we can justify doing expensive PCIe reads and increasing the
> >>>>>>>> time to go into runtime suspend when the PMU might not be used at all.
> >>>>>>>>
> >>>>>>>> If we store only enabled samples and start storing them only after they
> >>>>>>>> are enabled, what would be the consequence? The first non-zero sample
> >>>>>>>> seen by the perf tool would be wrong and later samples would be fine?
> >>>>>>>
> >>>>>>> Why do you say it is wrong? perf reports values relative to the time an
> >>>>>>> event is opened.
> >>>>>>
> >>>>>> I am asking you what the consequence is. The initial values will all be
> >>>>>> zero; then there is some activity and we get a non-zero value, but it
> >>>>>> will include all the previous activity, so the first difference we send
> >>>>>> to perf will be large/wrong, I think.
> >>>>>
> >>>>> Correct. If we store only the enabled events at suspend, any other event
> >>>>> will have a 0 initial value; when we read the register later it will
> >>>>> hold all the accumulated activity, and since the past value we have is
> >>>>> 0 we would end up reporting the entire value, which is wrong.
> >>>>
> >>>> Ok, agreed, so we need to do "something".
> >>>>
> >>>>>
> >>>>>>
> >>>>>>>
> >>>>>>>>
> >>>>>>>> If there is a consequence, we might have to go back to what I was saying
> >>>>>>>> earlier about waking the device up and reading the enabled counter when
> >>>>>>>> xe_pmu_event_start happens, to initialize the counter values. I am assuming
> >>>>>>>> this will work?
> >>>>>>>
> >>>>>>> xe_pmu_event_start can be called while the device is suspended, and we
> >>>>>>> shall not wake the device just because an event is enabled during
> >>>>>>> suspend. So if we do not store the counters while going into suspend,
> >>>>>>> we will have no value to use as a baseline when an event is enabled
> >>>>>>> after suspend, since we need to present relative values.
> >>>>>>
> >>>>>> That is why I am saying wake up the device and initialize the counters in
> >>>>>> xe_pmu_event_start.
> >>>>>
> >>>>> AFAIK, since the PMU doesn't take a DRM reference we shall not wake up
> >>>>> the device.
> >>>>
> >>>> Not sure what you mean because PMU does do this:
> >>>>
> >>>>	drm_dev_get(&xe->drm);
> >>
> >> Sorry, it was my misunderstanding here; please ignore.
> >>
> >>>>
> >>>> Anyway I don't think it has anything to do with waking up the device since
> >>>> that is done via xe_device_mem_access_get.
> >>>>
> >>>>> If we were allowed to wake up the device, why would we even need to store
> >>>>> during suspend? Whenever a PMU event is opened we could wake up the
> >>>>> device and read the register directly.
> >>>>
> >>>> No. That is why we are saving the counters during suspend so we don't have
> >>>> to wake up the device just to read the counters. So the issue is only how
> >>>> to *initialize* the counters.
> >>>>
> >>>> You are saying we initialize by saving all counters during suspend, whether
> >>>> or not they are enabled, which I don't agree with. I am saying we should
> >>>> only read and store the counters which are enabled during normal
> >>>> operation. And to initialize we wake the device up during
> >>>> xe_pmu_event_start and store the counter value. Alternatively, we can zero
> >>>> out the enabled counters during xe_pmu_event_start (the counters are RW)
> >>>> but in any case that will also need waking up the device.
> >>
> >> When the driver is initially loaded there might not be any users of the
> >> device and it might enter suspend immediately, so at the time of suspend
> >> no event is enabled. But the PMU can be opened just after suspend
> >> without any actual work on the device, so the device stays suspended;
> >> we should not be waking the device just to read a register in
> >> event_start or any of the other callbacks without any real workload or
> >> user of the device.
> >>
> >> So ideally, if the device didn't enter suspend, the counter is
> >> initialized on the first read while the device is still awake.
> >
> > Yes, I understand. But the issue is: why are we reading (doing expensive
> > reads across PCIe) and saving all these registers for the PMU when it's
> > possible the PMU might not be used at all and none of these events might
> > be enabled at all?
> >
> > So to me the lesser evil is to wake up the device at xe_pmu_event_start
> > time and initialize the counters. We only wake the device up once at
> > init time, not during normal operation. Whereas in your case, you read
> > and save these registers on every suspend, whether or not the PMU is or
> > will be used.
>
> I'm not sure which is more costly: saving the registers during suspend,
> or waking the device on event_init. We should remember that the same
> event can be opened by multiple listeners, so the device would wake up
> multiple times if it got suspended in between opening those events.

Such optimizations should be possible; that is, if we already have the
counter values initialized (say, non-zero), don't wake up the device.
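
Sketch of what I mean (the per-gt 'sample_initialized' mask is
illustrative bookkeeping, not existing xe code, and locking against
concurrent event_start is omitted for brevity):

static void pmu_init_sample_once(struct xe_gt *gt, int sample, u64 config)
{
        struct xe_pmu *pmu = &gt->tile->xe->pmu;

        /* Only the first listener pays the wakeup cost. */
        if (pmu->sample_initialized[gt->info.id] & BIT(sample))
                return;

        xe_device_mem_access_get(gt_to_xe(gt));
        store_sample(pmu, gt->info.id, sample,
                     __engine_group_busyness_read(gt, config));
        xe_device_mem_access_put(gt_to_xe(gt));

        pmu->sample_initialized[gt->info.id] |= BIT(sample);
}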
>
> And if we do multiple PCIe reads, are you suspecting it will affect any
> timing requirements that suspend may have? I doubt PCIe reads would
> take long enough to miss any timings, at least in this case. But
> anyway, as you said, we can take this up later if we know we will be
> adding more such counters in the future.

Yes, let's take this up later.

Thanks.
--
Ashutosh


> >
> >>>> So this way we only wake up the device for initialization but not
> >>>> afterwards.
> >>>>
> >>>> Since this is the "base" patch we should try to set up a good
> >>>> infrastructure in this patch so that other stuff which is exposed via PMU
> >>>> can be easily added later.
> >>>
> >>> After thinking a bit more about this: though I think this needs to be
> >>> done, I won't insist that we do it in this patch; we can review and do
> >>> it in a subsequent patch (if no one else objects).
> >>>
> >>> So let's skip this for now. If you can generate a new version of the
> >>> patch addressing all of the other review comments, we can review that
> >>> again and try to get it merged.
> >>>
> >>> Thanks.
> >>> --
> >>> Ashutosh
> >>>
> >>>>>>>>
> >>>>>>>> Doing this IMO would be better than always doing these PCIe reads on
> >>>>>>>> runtime suspend even when PMU is not being used
> >>>>>>>
> >>>>>>> We have been doing these in i915; not sure if it affected any timing
> >>>>>>> requirements for runtime suspend.
> >>>>>>
> >>>>>> Hmm, i915 indeed seems to be reading the RC6 residency in __gt_park even
> >>>>>> when the RC6 event is not enabled or the PMU might not be used.
> >>>>>>
> >>>>>> @Tvrtko, any comments here?

^ permalink raw reply	[flat|nested] 59+ messages in thread

end of thread

Thread overview: 59+ messages
2023-06-27 12:21 [Intel-xe] [PATCH v2 0/2] drm/xe/pmu: Enable PMU interface Aravind Iddamsetty
2023-06-27 12:21 ` [Intel-xe] [PATCH v2 1/2] drm/xe: Get GT clock to nanosecs Aravind Iddamsetty
2023-07-04  9:29   ` Upadhyay, Tejas
2023-07-04 10:14     ` Upadhyay, Tejas
2023-07-05  4:46       ` Iddamsetty, Aravind
2023-07-06  0:55   ` Dixit, Ashutosh
2023-06-27 12:21 ` [Intel-xe] [PATCH v2 2/2] drm/xe/pmu: Enable PMU interface Aravind Iddamsetty
2023-06-30 13:53   ` Upadhyay, Tejas
2023-07-03  5:11     ` Iddamsetty, Aravind
2023-07-04  3:34   ` Ghimiray, Himal Prasad
2023-07-05  4:52     ` Iddamsetty, Aravind
2023-07-04  9:10   ` Upadhyay, Tejas
2023-07-05  4:42     ` Iddamsetty, Aravind
2023-07-06  2:39   ` Dixit, Ashutosh
2023-07-06 13:42     ` Iddamsetty, Aravind
2023-07-07  2:18       ` Dixit, Ashutosh
2023-07-07  3:53         ` Iddamsetty, Aravind
2023-07-07  6:08           ` Dixit, Ashutosh
2023-07-07 10:42             ` Iddamsetty, Aravind
2023-07-07 21:25               ` Dixit, Ashutosh
2023-07-10  6:05                 ` Iddamsetty, Aravind
2023-07-10  8:12                   ` Ursulin, Tvrtko
2023-07-11 16:19                     ` Iddamsetty, Aravind
2023-07-11 23:10                       ` Dixit, Ashutosh
2023-07-12  3:11                         ` Iddamsetty, Aravind
2023-07-12  5:24                           ` Dixit, Ashutosh
2023-07-11 22:58                     ` Dixit, Ashutosh
2023-07-09  0:32           ` Dixit, Ashutosh
2023-07-10  4:13             ` Iddamsetty, Aravind
2023-07-10  5:57               ` Dixit, Ashutosh
2023-07-18  5:07           ` Dixit, Ashutosh
2023-07-19  6:59             ` Iddamsetty, Aravind
2023-07-06  2:40   ` Belgaumkar, Vinay
2023-07-06 13:06     ` Iddamsetty, Aravind
2023-07-21  1:02   ` Dixit, Ashutosh
2023-07-21 11:51     ` Iddamsetty, Aravind
2023-07-21 23:36       ` Dixit, Ashutosh
2023-07-22  6:04         ` Dixit, Ashutosh
2023-07-24  8:03           ` Iddamsetty, Aravind
2023-07-24  9:00             ` Ursulin, Tvrtko
2023-07-24 15:52               ` Dixit, Ashutosh
2023-07-24 15:52             ` Dixit, Ashutosh
2023-07-24 16:05               ` Iddamsetty, Aravind
2023-07-24 16:31                 ` Dixit, Ashutosh
2023-07-25 11:38                   ` Iddamsetty, Aravind
2023-08-07 21:16                     ` Dixit, Ashutosh
2023-08-07 22:22                       ` Dixit, Ashutosh
2023-08-08 13:45                         ` Iddamsetty, Aravind
2023-08-08 15:18                           ` Dixit, Ashutosh
2023-08-09  4:26                             ` Iddamsetty, Aravind
2023-08-09  5:02                               ` Dixit, Ashutosh
2023-07-24  9:38           ` Iddamsetty, Aravind
2023-07-22 14:39   ` Dixit, Ashutosh
2023-07-24  8:02     ` Iddamsetty, Aravind
2023-06-27 13:04 ` [Intel-xe] ✓ CI.Patch_applied: success for drm/xe/pmu: Enable PMU interface (rev2) Patchwork
2023-06-27 13:05 ` [Intel-xe] ✗ CI.checkpatch: warning " Patchwork
2023-06-27 13:06 ` [Intel-xe] ✓ CI.KUnit: success " Patchwork
2023-06-27 13:10 ` [Intel-xe] ✓ CI.Build: " Patchwork
2023-06-27 13:10 ` [Intel-xe] ✗ CI.Hooks: failure " Patchwork
