* [PATCH 00/73] MES support
@ 2022-04-29 17:45 Alex Deucher
  2022-04-29 17:45 ` [PATCH 01/73] drm/amdgpu: define MQD abstract layer for hw ip Alex Deucher
                   ` (72 more replies)
  0 siblings, 73 replies; 74+ messages in thread
From: Alex Deucher @ 2022-04-29 17:45 UTC (permalink / raw)
  To: amd-gfx; +Cc: Alex Deucher

This patch set enables support for MES (Micro Engine Scheduler).  This
is similar to the HWS (HardWare Scheduler) on the MEC currently used
for KFD.  It's a scheduling microcontroller that schedules engine
queues onto hardware slots.  This series adds the basic infrastructure
needed to enable MES in amdgpu.

Jack Xiao (70):
  drm/amdgpu: define MQD abstract layer for hw ip
  drm/amdgpu: add helper function to initialize mqd from ring v4
  drm/amdgpu: add the per-context meta data v3
  drm/amdgpu: add mes ctx data in amdgpu_ring
  drm/amdgpu: define ring structure to access rptr/wptr/fence
  drm/amdgpu: use ring structure to access rptr/wptr v2
  drm/amdgpu: initialize/finalize the ring for mes queue
  drm/amdgpu: assign the cpu/gpu address of fence from ring
  drm/amdgpu/gfx10: implement mqd functions of gfx/compute eng v2
  drm/amdgpu/gfx10: use per ctx CSA for ce metadata
  drm/amdgpu/gfx10: use per ctx CSA for de metadata
  drm/amdgpu/gfx10: associate mes queue id with fence v2
  drm/amdgpu/gfx10: inherit vmid from mqd
  drm/amdgpu/gfx10: use INVALIDATE_TLBS to invalidate TLBs v2
  drm/amdgpu/gmc10: skip emitting pasid mapping packet
  drm/amdgpu: use the whole doorbell space for mes
  drm/amdgpu: update mes process/gang/queue definitions
  drm/amdgpu: add mes_kiq module parameter v2
  drm/amdgpu: allocate doorbell index for mes kiq
  drm/amdgpu/mes: extend mes framework to support multiple mes pipes
  drm/amdgpu/gfx10: add mes queue fence handling
  drm/amdgpu/gfx10: add mes support for gfx ib test
  drm/amdgpu: don't use kiq to flush gpu tlb if mes enabled
  drm/amdgpu/sdma: use per-ctx sdma csa address for mes sdma queue
  drm/amdgpu/sdma5.2: initialize sdma mqd
  drm/amdgpu/sdma5.2: associate mes queue id with fence
  drm/amdgpu/sdma5.2: add mes queue fence handling
  drm/amdgpu/sdma5.2: add mes support for sdma ring test
  drm/amdgpu/sdma5.2: add mes support for sdma ib test
  drm/amdgpu/sdma5: initialize sdma mqd
  drm/amdgpu/sdma5: associate mes queue id with fence
  drm/amdgpu/sdma5: add mes queue fence handling
  drm/amdgpu/sdma5: add mes support for sdma ring test
  drm/amdgpu/sdma5: add mes support for sdma ib test
  drm/amdgpu/mes: add mes kiq callback
  drm/amdgpu: add mes kiq frontdoor loading support
  drm/amdgpu: enable mes kiq N-1 test on sienna cichlid
  drm/amdgpu/mes: manage mes doorbell allocation
  drm/amdgpu: add mes queue id mask v2
  drm/amdgpu/mes: initialize/finalize common mes structure v2
  drm/amdgpu/mes: relocate status_fence slot allocation
  drm/amdgpu/mes10.1: call general mes initialization
  drm/amdgpu/mes10.1: add delay after mes engine enable
  drm/amdgpu/mes10.1: implement the suspend/resume routine
  drm/amdgpu/mes: implement creating mes process v2
  drm/amdgpu/mes: implement destroying mes process
  drm/amdgpu/mes: implement adding mes gang
  drm/amdgpu/mes: implement removing mes gang
  drm/amdgpu/mes: implement suspending all gangs
  drm/amdgpu/mes: implement resuming all gangs
  drm/amdgpu/mes: initialize mqd from queue properties
  drm/amdgpu/mes: implement adding mes queue
  drm/amdgpu/mes: implement removing mes queue
  drm/amdgpu/mes: add helper function to convert ring to queue property
  drm/amdgpu/mes: add helper function to get the ctx meta data offset
  drm/amdgpu/mes: use ring for kernel queue submission
  drm/amdgpu/mes: implement removing mes ring
  drm/amdgpu/mes: add helper functions to alloc/free ctx metadata
  drm/amdgpu: skip kfd routines when mes enabled
  drm/amdgpu: skip some checking for mes queue ib submission
  drm/amdgpu: skip kiq ib tests if mes enabled
  drm/amdgpu: skip gds switch for mes queue
  drm/amdgpu: kiq takes charge of all queues
  drm/amdgpu/mes: map ctx metadata for mes self test
  drm/amdgpu/mes: create gang and queues for mes self test
  drm/amdgpu/mes: add ring/ib test for mes self test
  drm/amdgpu/mes: implement mes self test
  drm/amdgpu/mes10.1: add mes self test in late init
  drm/amdgpu/mes: fix vm csa update issue
  drm/amdgpu/mes: disable mes sdma queue test

Likun Gao (1):
  drm/amdgpu: add mes kiq PSP GFX FW type

Mukul Joshi (2):
  drm/amdgpu: Enable KFD with MES enabled
  drm/amdgpu/mes: Update the doorbell function signatures

 drivers/gpu/drm/amd/amdgpu/Makefile          |    1 +
 drivers/gpu/drm/amd/amdgpu/amdgpu.h          |   24 +
 drivers/gpu/drm/amd/amdgpu/amdgpu_device.c   |   42 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_doorbell.h |    6 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c      |   10 +
 drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c    |    4 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c      |    3 +
 drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c       |    8 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c      | 1138 ++++++++++++++++++
 drivers/gpu/drm/amd/amdgpu/amdgpu_mes.h      |  168 ++-
 drivers/gpu/drm/amd/amdgpu/amdgpu_mes_ctx.h  |  121 ++
 drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c      |    6 +
 drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c     |  193 ++-
 drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h     |   22 +
 drivers/gpu/drm/amd/amdgpu/amdgpu_sdma.c     |   24 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_ucode.c    |   10 +
 drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c       |    3 +-
 drivers/gpu/drm/amd/amdgpu/cik_sdma.c        |    8 +-
 drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c       |  383 +++---
 drivers/gpu/drm/amd/amdgpu/gfx_v6_0.c        |    8 +-
 drivers/gpu/drm/amd/amdgpu/gfx_v7_0.c        |   16 +-
 drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c        |   20 +-
 drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c        |   25 +-
 drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c       |    6 +-
 drivers/gpu/drm/amd/amdgpu/jpeg_v2_0.c       |    4 +-
 drivers/gpu/drm/amd/amdgpu/jpeg_v2_5.c       |    4 +-
 drivers/gpu/drm/amd/amdgpu/jpeg_v3_0.c       |    4 +-
 drivers/gpu/drm/amd/amdgpu/mes_v10_1.c       |  461 ++++---
 drivers/gpu/drm/amd/amdgpu/nv.c              |    3 +-
 drivers/gpu/drm/amd/amdgpu/sdma_v2_4.c       |    8 +-
 drivers/gpu/drm/amd/amdgpu/sdma_v3_0.c       |   16 +-
 drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c       |   28 +-
 drivers/gpu/drm/amd/amdgpu/sdma_v5_0.c       |  169 ++-
 drivers/gpu/drm/amd/amdgpu/sdma_v5_2.c       |  171 ++-
 drivers/gpu/drm/amd/amdgpu/si_dma.c          |    4 +-
 drivers/gpu/drm/amd/amdgpu/soc21.c           |    3 +-
 drivers/gpu/drm/amd/amdgpu/uvd_v7_0.c        |    6 +-
 drivers/gpu/drm/amd/amdgpu/vce_v4_0.c        |    6 +-
 drivers/gpu/drm/amd/amdgpu/vcn_v2_0.c        |   12 +-
 drivers/gpu/drm/amd/amdgpu/vcn_v2_5.c        |   12 +-
 drivers/gpu/drm/amd/amdgpu/vcn_v3_0.c        |   12 +-
 41 files changed, 2619 insertions(+), 553 deletions(-)
 create mode 100644 drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c
 create mode 100644 drivers/gpu/drm/amd/amdgpu/amdgpu_mes_ctx.h

-- 
2.35.1



* [PATCH 01/73] drm/amdgpu: define MQD abstract layer for hw ip
  2022-04-29 17:45 [PATCH 00/73] MES support Alex Deucher
@ 2022-04-29 17:45 ` Alex Deucher
  2022-04-29 17:45 ` [PATCH 02/73] drm/amdgpu: add helper function to initialize mqd from ring v4 Alex Deucher
                   ` (71 subsequent siblings)
  72 siblings, 0 replies; 74+ messages in thread
From: Alex Deucher @ 2022-04-29 17:45 UTC (permalink / raw)
  To: amd-gfx; +Cc: Alex Deucher, Jack Xiao, Christian König, Hawking Zhang

From: Jack Xiao <Jack.Xiao@amd.com>

Define an MQD abstraction layer for hw IP, so that MQD configuration
can be passed not only from a ring but also from other sources,
such as a user queue.

Signed-off-by: Jack Xiao <Jack.Xiao@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
---
 drivers/gpu/drm/amd/amdgpu/amdgpu.h | 21 +++++++++++++++++++++
 1 file changed, 21 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
index 1a598e3247ca..2eed9479e854 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
@@ -720,6 +720,26 @@ struct ip_discovery_top;
 					  (rid == 0x01) || \
 					  (rid == 0x10))))
 
+struct amdgpu_mqd_prop {
+	uint64_t mqd_gpu_addr;
+	uint64_t hqd_base_gpu_addr;
+	uint64_t rptr_gpu_addr;
+	uint64_t wptr_gpu_addr;
+	uint32_t queue_size;
+	bool use_doorbell;
+	uint32_t doorbell_index;
+	uint64_t eop_gpu_addr;
+	uint32_t hqd_pipe_priority;
+	uint32_t hqd_queue_priority;
+	bool hqd_active;
+};
+
+struct amdgpu_mqd {
+	unsigned mqd_size;
+	int (*init_mqd)(struct amdgpu_device *adev, void *mqd,
+			struct amdgpu_mqd_prop *p);
+};
+
 #define AMDGPU_RESET_MAGIC_NUM 64
 #define AMDGPU_MAX_DF_PERFMONS 4
 #define AMDGPU_PRODUCT_NAME_LEN 64
@@ -920,6 +940,7 @@ struct amdgpu_device {
 	/* mes */
 	bool                            enable_mes;
 	struct amdgpu_mes               mes;
+	struct amdgpu_mqd               mqds[AMDGPU_HW_IP_NUM];
 
 	/* df */
 	struct amdgpu_df                df;
-- 
2.35.1
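
As a reading aid, the callback table introduced above can be sketched in
user space roughly as follows.  This is only an illustration under stated
assumptions: struct amdgpu_device is mocked down to the mqds[] table, the
property struct is a subset of the one in the patch, and struct demo_mqd,
demo_init_mqd(), demo_init_queue() and the DEMO_* constants are
hypothetical names that do not exist in the driver.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define DEMO_HW_IP_COMPUTE 1	/* hypothetical stand-in for AMDGPU_HW_IP_COMPUTE */
#define DEMO_HW_IP_NUM     2	/* hypothetical stand-in for AMDGPU_HW_IP_NUM */

struct amdgpu_device;

/* Subset of the queue properties defined in the patch. */
struct amdgpu_mqd_prop {
	uint64_t mqd_gpu_addr;
	uint64_t hqd_base_gpu_addr;
	uint32_t queue_size;
	bool use_doorbell;
	uint32_t doorbell_index;
	bool hqd_active;
};

/* Same shape as in the patch: one MQD size and one init callback per hw IP. */
struct amdgpu_mqd {
	unsigned mqd_size;
	int (*init_mqd)(struct amdgpu_device *adev, void *mqd,
			struct amdgpu_mqd_prop *p);
};

/* Mock device holding only the per-IP MQD table. */
struct amdgpu_device {
	struct amdgpu_mqd mqds[DEMO_HW_IP_NUM];
};

/* Hypothetical MQD image; real layouts are IP specific. */
struct demo_mqd {
	uint32_t base_lo;
	uint32_t base_hi;
	uint32_t queue_size;
	uint32_t active;
};

/* Hypothetical per-IP callback: fills the MQD purely from the properties. */
int demo_init_mqd(struct amdgpu_device *adev, void *mqd,
		  struct amdgpu_mqd_prop *p)
{
	struct demo_mqd *m = mqd;

	(void)adev;
	memset(m, 0, sizeof(*m));
	m->base_lo = (uint32_t)p->hqd_base_gpu_addr;
	m->base_hi = (uint32_t)(p->hqd_base_gpu_addr >> 32);
	m->queue_size = p->queue_size;
	m->active = p->hqd_active ? 1 : 0;
	return 0;
}

/* Caller side: look up the IP's entry and dispatch through the callback. */
int demo_init_queue(struct amdgpu_device *adev, unsigned ip, void *mqd,
		    struct amdgpu_mqd_prop *prop)
{
	if (ip >= DEMO_HW_IP_NUM || !adev->mqds[ip].init_mqd)
		return -1;
	return adev->mqds[ip].init_mqd(adev, mqd, prop);
}
```

The point of the indirection is that the callback sees only the property
struct, so the same init code could serve a kernel ring or any other
source able to fill in those properties.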



* [PATCH 02/73] drm/amdgpu: add helper function to initialize mqd from ring v4
  2022-04-29 17:45 [PATCH 00/73] MES support Alex Deucher
  2022-04-29 17:45 ` [PATCH 01/73] drm/amdgpu: define MQD abstract layer for hw ip Alex Deucher
@ 2022-04-29 17:45 ` Alex Deucher
  2022-04-29 17:45 ` [PATCH 03/73] drm/amdgpu: add the per-context meta data v3 Alex Deucher
                   ` (70 subsequent siblings)
  72 siblings, 0 replies; 74+ messages in thread
From: Alex Deucher @ 2022-04-29 17:45 UTC (permalink / raw)
  To: amd-gfx; +Cc: Alex Deucher, Jack Xiao, Christian König, Hawking Zhang

From: Jack Xiao <Jack.Xiao@amd.com>

Add the helper function to initialize mqd from ring configuration.

v2: use if/else pair instead of ?/: pair
v3: use simpler way to judge hqd_active
v4: fix parameters to amdgpu_gfx_is_high_priority_compute_queue

Signed-off-by: Jack Xiao <Jack.Xiao@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c | 48 ++++++++++++++++++++++++
 drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h |  2 +
 2 files changed, 50 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c
index 7f33ae87cb41..773954318216 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c
@@ -458,3 +458,51 @@ int amdgpu_ring_test_helper(struct amdgpu_ring *ring)
 	ring->sched.ready = !r;
 	return r;
 }
+
+static void amdgpu_ring_to_mqd_prop(struct amdgpu_ring *ring,
+				    struct amdgpu_mqd_prop *prop)
+{
+	struct amdgpu_device *adev = ring->adev;
+
+	memset(prop, 0, sizeof(*prop));
+
+	prop->mqd_gpu_addr = ring->mqd_gpu_addr;
+	prop->hqd_base_gpu_addr = ring->gpu_addr;
+	prop->rptr_gpu_addr = ring->rptr_gpu_addr;
+	prop->wptr_gpu_addr = ring->wptr_gpu_addr;
+	prop->queue_size = ring->ring_size;
+	prop->eop_gpu_addr = ring->eop_gpu_addr;
+	prop->use_doorbell = ring->use_doorbell;
+	prop->doorbell_index = ring->doorbell_index;
+
+	/* The map_queues packet does not need to activate the queue,
+	 * so only the KIQ needs to set this field.
+	 */
+	prop->hqd_active = ring->funcs->type == AMDGPU_RING_TYPE_KIQ;
+
+	if (ring->funcs->type == AMDGPU_RING_TYPE_COMPUTE) {
+		if (amdgpu_gfx_is_high_priority_compute_queue(adev, ring)) {
+			prop->hqd_pipe_priority = AMDGPU_GFX_PIPE_PRIO_HIGH;
+			prop->hqd_queue_priority =
+				AMDGPU_GFX_QUEUE_PRIORITY_MAXIMUM;
+		}
+	}
+}
+
+int amdgpu_ring_init_mqd(struct amdgpu_ring *ring)
+{
+	struct amdgpu_device *adev = ring->adev;
+	struct amdgpu_mqd *mqd_mgr;
+	struct amdgpu_mqd_prop prop;
+
+	amdgpu_ring_to_mqd_prop(ring, &prop);
+
+	ring->wptr = 0;
+
+	if (ring->funcs->type == AMDGPU_RING_TYPE_KIQ)
+		mqd_mgr = &adev->mqds[AMDGPU_HW_IP_COMPUTE];
+	else
+		mqd_mgr = &adev->mqds[ring->funcs->type];
+
+	return mqd_mgr->init_mqd(adev, ring->mqd_ptr, &prop);
+}
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
index 317d80209e95..20dfe5a19a81 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
@@ -369,6 +369,8 @@ int amdgpu_ring_test_helper(struct amdgpu_ring *ring);
 void amdgpu_debugfs_ring_init(struct amdgpu_device *adev,
 			      struct amdgpu_ring *ring);
 
+int amdgpu_ring_init_mqd(struct amdgpu_ring *ring);
+
 static inline u32 amdgpu_ib_get_value(struct amdgpu_ib *ib, int idx)
 {
 	return ib->ptr[idx];
-- 
2.35.1
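
The selection logic in amdgpu_ring_init_mqd() can be restated as a small
stand-alone sketch.  The enums below are hypothetical stand-ins; the
sketch assumes, as the direct indexing by ring->funcs->type in the patch
appears to, that the GFX/COMPUTE/SDMA ring type values coincide with the
corresponding hw IP indices, while KIQ has no table slot of its own.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the AMDGPU_HW_IP_* indices. */
enum demo_hw_ip {
	DEMO_HW_IP_GFX,
	DEMO_HW_IP_COMPUTE,
	DEMO_HW_IP_DMA,
	DEMO_HW_IP_NUM,
};

/* Hypothetical stand-ins for the AMDGPU_RING_TYPE_* values. */
enum demo_ring_type {
	DEMO_RING_TYPE_GFX = DEMO_HW_IP_GFX,
	DEMO_RING_TYPE_COMPUTE = DEMO_HW_IP_COMPUTE,
	DEMO_RING_TYPE_SDMA = DEMO_HW_IP_DMA,
	DEMO_RING_TYPE_KIQ = DEMO_HW_IP_NUM,	/* no MQD table slot of its own */
};

/* KIQ borrows the compute entry; other rings index the table by type. */
int demo_mqd_index(enum demo_ring_type type)
{
	if (type == DEMO_RING_TYPE_KIQ)
		return DEMO_HW_IP_COMPUTE;
	return (int)type;
}

/* Only the KIQ marks its HQD active at MQD init time. */
bool demo_hqd_active(enum demo_ring_type type)
{
	return type == DEMO_RING_TYPE_KIQ;
}
```

For example, demo_mqd_index(DEMO_RING_TYPE_KIQ) yields the compute index,
mirroring the if/else at the end of amdgpu_ring_init_mqd().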



* [PATCH 03/73] drm/amdgpu: add the per-context meta data v3
  2022-04-29 17:45 [PATCH 00/73] MES support Alex Deucher
  2022-04-29 17:45 ` [PATCH 01/73] drm/amdgpu: define MQD abstract layer for hw ip Alex Deucher
  2022-04-29 17:45 ` [PATCH 02/73] drm/amdgpu: add helper function to initialize mqd from ring v4 Alex Deucher
@ 2022-04-29 17:45 ` Alex Deucher
  2022-04-29 17:45 ` [PATCH 04/73] drm/amdgpu: add mes ctx data in amdgpu_ring Alex Deucher
                   ` (69 subsequent siblings)
  72 siblings, 0 replies; 74+ messages in thread
From: Alex Deucher @ 2022-04-29 17:45 UTC (permalink / raw)
  To: amd-gfx; +Cc: Alex Deucher, Jack Xiao, Christian König, Hawking Zhang

From: Jack Xiao <Jack.Xiao@amd.com>

The per-context meta data is a per-context data structure associated
with a mes-managed hardware ring. It includes the MCBP CSA, the ring
buffer, etc.

v2: fix typo
v3: a. use structure instead of typedef
    b. move amdgpu_mes_ctx_get_offs_* to amdgpu_ring.h
    c. use __aligned for alignment

Signed-off-by: Jack Xiao <Jack.Xiao@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
---
 drivers/gpu/drm/amd/amdgpu/amdgpu.h         |   1 +
 drivers/gpu/drm/amd/amdgpu/amdgpu_mes_ctx.h | 118 ++++++++++++++++++++
 drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h    |   9 ++
 3 files changed, 128 insertions(+)
 create mode 100644 drivers/gpu/drm/amd/amdgpu/amdgpu_mes_ctx.h

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
index 2eed9479e854..24bce7e691a8 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
@@ -91,6 +91,7 @@
 #include "amdgpu_dm.h"
 #include "amdgpu_virt.h"
 #include "amdgpu_csa.h"
+#include "amdgpu_mes_ctx.h"
 #include "amdgpu_gart.h"
 #include "amdgpu_debugfs.h"
 #include "amdgpu_job.h"
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_mes_ctx.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_mes_ctx.h
new file mode 100644
index 000000000000..f3e1ba1a889f
--- /dev/null
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_mes_ctx.h
@@ -0,0 +1,118 @@
+/*
+ * Copyright 2019 Advanced Micro Devices, Inc.
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
+ * THE COPYRIGHT HOLDER(S) OR AUTHOR(S) BE LIABLE FOR ANY CLAIM, DAMAGES OR
+ * OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE,
+ * ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR
+ * OTHER DEALINGS IN THE SOFTWARE.
+ *
+ */
+
+#ifndef __AMDGPU_MES_CTX_H__
+#define __AMDGPU_MES_CTX_H__
+
+#include "v10_structs.h"
+
+enum {
+	AMDGPU_MES_CTX_RPTR_OFFS = 0,
+	AMDGPU_MES_CTX_WPTR_OFFS,
+	AMDGPU_MES_CTX_FENCE_OFFS,
+	AMDGPU_MES_CTX_COND_EXE_OFFS,
+	AMDGPU_MES_CTX_TRAIL_FENCE_OFFS,
+	AMDGPU_MES_CTX_MAX_OFFS,
+};
+
+enum {
+	AMDGPU_MES_CTX_RING_OFFS = AMDGPU_MES_CTX_MAX_OFFS,
+	AMDGPU_MES_CTX_IB_OFFS,
+	AMDGPU_MES_CTX_PADDING_OFFS,
+};
+
+#define AMDGPU_MES_CTX_MAX_GFX_RINGS            1
+#define AMDGPU_MES_CTX_MAX_COMPUTE_RINGS        4
+#define AMDGPU_MES_CTX_MAX_SDMA_RINGS           2
+#define AMDGPU_MES_CTX_MAX_RINGS					\
+	(AMDGPU_MES_CTX_MAX_GFX_RINGS +					\
+	 AMDGPU_MES_CTX_MAX_COMPUTE_RINGS +				\
+	 AMDGPU_MES_CTX_MAX_SDMA_RINGS)
+
+#define AMDGPU_CSA_SDMA_SIZE    64
+#define GFX10_MEC_HPD_SIZE	2048
+
+struct amdgpu_wb_slot {
+	uint32_t data[8];
+};
+
+struct amdgpu_mes_ctx_meta_data {
+	struct {
+		uint8_t ring[PAGE_SIZE * 4];
+
+		/* gfx csa */
+		struct v10_gfx_meta_data gfx_meta_data;
+
+		uint8_t gds_backup[64 * 1024];
+
+		struct amdgpu_wb_slot slots[AMDGPU_MES_CTX_MAX_OFFS];
+
+		/* only for ib test */
+		uint32_t ib[256] __aligned(256);
+
+		uint32_t padding[64];
+
+	} __aligned(PAGE_SIZE) gfx[AMDGPU_MES_CTX_MAX_GFX_RINGS];
+
+	struct {
+		uint8_t ring[PAGE_SIZE * 4];
+
+		uint8_t mec_hpd[GFX10_MEC_HPD_SIZE];
+
+		struct amdgpu_wb_slot slots[AMDGPU_MES_CTX_MAX_OFFS];
+
+		/* only for ib test */
+		uint32_t ib[256] __aligned(256);
+
+		uint32_t padding[64];
+
+	} __aligned(PAGE_SIZE) compute[AMDGPU_MES_CTX_MAX_COMPUTE_RINGS];
+
+	struct {
+		uint8_t ring[PAGE_SIZE * 4];
+
+		/* sdma csa for mcbp */
+		uint8_t sdma_meta_data[AMDGPU_CSA_SDMA_SIZE];
+
+		struct amdgpu_wb_slot slots[AMDGPU_MES_CTX_MAX_OFFS];
+
+		/* only for ib test */
+		uint32_t ib[256] __aligned(256);
+
+		uint32_t padding[64];
+
+	} __aligned(PAGE_SIZE) sdma[AMDGPU_MES_CTX_MAX_SDMA_RINGS];
+};
+
+struct amdgpu_mes_ctx_data {
+	struct amdgpu_bo	*meta_data_obj;
+	uint64_t                meta_data_gpu_addr;
+	struct amdgpu_bo_va	*meta_data_va;
+	void                    *meta_data_ptr;
+	uint32_t                gang_ids[AMDGPU_HW_IP_DMA+1];
+};
+
+#define AMDGPU_FENCE_MES_QUEUE_FLAG     0x1000000u
+#define AMDGPU_FENCE_MES_QUEUE_ID_MASK  (AMDGPU_FENCE_MES_QUEUE_FLAG - 1)
+
+#endif
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
index 20dfe5a19a81..112c2b0ef0b1 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
@@ -364,6 +364,15 @@ static inline void amdgpu_ring_write_multiple(struct amdgpu_ring *ring,
 	ring->count_dw -= count_dw;
 }
 
+#define amdgpu_mes_ctx_get_offs_gpu_addr(ring, offset)			\
+	(ring->is_mes_queue && ring->mes_ctx ?				\
+	 (ring->mes_ctx->meta_data_gpu_addr + offset) : 0)
+
+#define amdgpu_mes_ctx_get_offs_cpu_addr(ring, offset)			\
+	(ring->is_mes_queue && ring->mes_ctx ?				\
+	 (void *)((uint8_t *)(ring->mes_ctx->meta_data_ptr) + offset) : \
+	 NULL)
+
 int amdgpu_ring_test_helper(struct amdgpu_ring *ring);
 
 void amdgpu_debugfs_ring_init(struct amdgpu_device *adev,
-- 
2.35.1
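
To illustrate how the amdgpu_mes_ctx_get_offs_* helpers are meant to be
used, here is a minimal user-space sketch.  It assumes callers derive the
offset with offsetof() on the meta data structure; the demo_* types are
shrunk, hypothetical stand-ins for the structures in the patch, and the
helpers are written as functions rather than macros.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Shrunk, hypothetical stand-in for struct amdgpu_mes_ctx_meta_data. */
struct demo_wb_slot {
	uint32_t data[8];
};

struct demo_meta_data {
	struct {
		uint8_t ring[64];
		struct demo_wb_slot slots[5];
	} compute[4];
};

/* Stand-in for struct amdgpu_mes_ctx_data: one buffer, two views of it. */
struct demo_ctx_data {
	uint64_t meta_data_gpu_addr;
	void *meta_data_ptr;
};

/* Only the ring fields the helpers look at. */
struct demo_ring {
	bool is_mes_queue;
	struct demo_ctx_data *mes_ctx;
};

/* GPU address of a field inside the per-context meta data, or 0. */
uint64_t demo_offs_gpu_addr(const struct demo_ring *ring, size_t offset)
{
	if (ring->is_mes_queue && ring->mes_ctx)
		return ring->mes_ctx->meta_data_gpu_addr + offset;
	return 0;
}

/* CPU address of the same field, or NULL. */
void *demo_offs_cpu_addr(const struct demo_ring *ring, size_t offset)
{
	if (ring->is_mes_queue && ring->mes_ctx)
		return (uint8_t *)ring->mes_ctx->meta_data_ptr + offset;
	return NULL;
}
```

One offset therefore yields both the GPU and the CPU view of the same
slot.  Note that the macros in the patch expand ring and offset without
parentheses, so they are best passed simple expressions.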



* [PATCH 04/73] drm/amdgpu: add mes ctx data in amdgpu_ring
  2022-04-29 17:45 [PATCH 00/73] MES support Alex Deucher
                   ` (2 preceding siblings ...)
  2022-04-29 17:45 ` [PATCH 03/73] drm/amdgpu: add the per-context meta data v3 Alex Deucher
@ 2022-04-29 17:45 ` Alex Deucher
  2022-04-29 17:45 ` [PATCH 05/73] drm/amdgpu: define ring structure to access rptr/wptr/fence Alex Deucher
                   ` (68 subsequent siblings)
  72 siblings, 0 replies; 74+ messages in thread
From: Alex Deucher @ 2022-04-29 17:45 UTC (permalink / raw)
  To: amd-gfx; +Cc: Alex Deucher, Jack Xiao, Christian König, Hawking Zhang

From: Jack Xiao <Jack.Xiao@amd.com>

Add mes context data structure in amdgpu_ring.

Signed-off-by: Jack Xiao <Jack.Xiao@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
index 112c2b0ef0b1..317a66bcd258 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
@@ -267,6 +267,11 @@ struct amdgpu_ring {
 	int			hw_prio;
 	unsigned 		num_hw_submission;
 	atomic_t		*sched_score;
+
+	/* used for mes */
+	bool			is_mes_queue;
+	uint32_t		hw_queue_id;
+	struct amdgpu_mes_ctx_data *mes_ctx;
 };
 
 #define amdgpu_ring_parse_cs(r, p, job, ib) ((r)->funcs->parse_cs((p), (job), (ib)))
-- 
2.35.1



* [PATCH 05/73] drm/amdgpu: define ring structure to access rptr/wptr/fence
  2022-04-29 17:45 [PATCH 00/73] MES support Alex Deucher
                   ` (3 preceding siblings ...)
  2022-04-29 17:45 ` [PATCH 04/73] drm/amdgpu: add mes ctx data in amdgpu_ring Alex Deucher
@ 2022-04-29 17:45 ` Alex Deucher
  2022-04-29 17:45 ` [PATCH 06/73] drm/amdgpu: use ring structure to access rptr/wptr v2 Alex Deucher
                   ` (67 subsequent siblings)
  72 siblings, 0 replies; 74+ messages in thread
From: Alex Deucher @ 2022-04-29 17:45 UTC (permalink / raw)
  To: amd-gfx; +Cc: Alex Deucher, Jack Xiao, Christian König, Hawking Zhang

From: Jack Xiao <Jack.Xiao@amd.com>

Define ring structure to access the cpu/gpu address of rptr/wptr/fence
instead of dynamic calculation.

Signed-off-by: Jack Xiao <Jack.Xiao@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
index 317a66bcd258..7d89a52091c0 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
@@ -230,6 +230,8 @@ struct amdgpu_ring {
 	struct amdgpu_bo	*ring_obj;
 	volatile uint32_t	*ring;
 	unsigned		rptr_offs;
+	u64			rptr_gpu_addr;
+	volatile u32		*rptr_cpu_addr;
 	u64			wptr;
 	u64			wptr_old;
 	unsigned		ring_size;
@@ -250,7 +252,11 @@ struct amdgpu_ring {
 	bool			use_doorbell;
 	bool			use_pollmem;
 	unsigned		wptr_offs;
+	u64			wptr_gpu_addr;
+	volatile u32		*wptr_cpu_addr;
 	unsigned		fence_offs;
+	u64			fence_gpu_addr;
+	volatile u32		*fence_cpu_addr;
 	uint64_t		current_ctx;
 	char			name[16];
 	u32                     trail_seq;
-- 
2.35.1
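
The "dynamic calculation" being replaced is the writeback-slot arithmetic
that the callers currently open-code, i.e. wb.gpu_addr + offs * 4 for the
GPU view and &wb.wb[offs] for the CPU view.  A minimal sketch of caching
both once in the ring follows.  The demo_* names are hypothetical, and
doing the caching in a single init-time helper is an assumption of this
sketch; this patch only adds the fields.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the device writeback buffer (adev->wb). */
struct demo_wb {
	uint64_t gpu_addr;		/* GPU address of slot 0 */
	volatile uint32_t *wb;		/* CPU mapping of the same buffer */
};

/* Only the ring fields relevant to the rptr writeback slot. */
struct demo_ring {
	unsigned rptr_offs;		/* slot index, in dwords */
	uint64_t rptr_gpu_addr;
	volatile uint32_t *rptr_cpu_addr;
};

/* Compute both views of the slot once instead of at every use. */
void demo_ring_cache_rptr(struct demo_ring *ring, struct demo_wb *wb)
{
	ring->rptr_gpu_addr = wb->gpu_addr + (uint64_t)ring->rptr_offs * 4;
	ring->rptr_cpu_addr = &wb->wb[ring->rptr_offs];
}

/* Readers then just dereference the cached CPU pointer. */
uint32_t demo_ring_get_rptr(const struct demo_ring *ring)
{
	return *ring->rptr_cpu_addr;
}
```

Because the accessors only see a pointer and an address, they no longer
care whether the slot lives in the device writeback buffer or somewhere
else, such as per-context meta data.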



* [PATCH 06/73] drm/amdgpu: use ring structure to access rptr/wptr v2
  2022-04-29 17:45 [PATCH 00/73] MES support Alex Deucher
                   ` (4 preceding siblings ...)
  2022-04-29 17:45 ` [PATCH 05/73] drm/amdgpu: define ring structure to access rptr/wptr/fence Alex Deucher
@ 2022-04-29 17:45 ` Alex Deucher
  2022-04-29 17:45 ` [PATCH 07/73] drm/amdgpu: initialize/finalize the ring for mes queue Alex Deucher
                   ` (66 subsequent siblings)
  72 siblings, 0 replies; 74+ messages in thread
From: Alex Deucher @ 2022-04-29 17:45 UTC (permalink / raw)
  To: amd-gfx; +Cc: Alex Deucher, Jack Xiao, Christian König, Hawking Zhang

From: Jack Xiao <Jack.Xiao@amd.com>

Use ring structure to access the cpu/gpu address of rptr/wptr.

v2: merge gfx10/sdma5/sdma5.2 patches

Signed-off-by: Jack Xiao <Jack.Xiao@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
---
 drivers/gpu/drm/amd/amdgpu/cik_sdma.c  |  8 +++---
 drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c | 37 +++++++++++++-------------
 drivers/gpu/drm/amd/amdgpu/gfx_v6_0.c  |  8 +++---
 drivers/gpu/drm/amd/amdgpu/gfx_v7_0.c  | 16 +++++------
 drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c  | 20 +++++++-------
 drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c  | 25 +++++++++--------
 drivers/gpu/drm/amd/amdgpu/jpeg_v2_0.c |  4 +--
 drivers/gpu/drm/amd/amdgpu/jpeg_v2_5.c |  4 +--
 drivers/gpu/drm/amd/amdgpu/jpeg_v3_0.c |  4 +--
 drivers/gpu/drm/amd/amdgpu/mes_v10_1.c | 11 ++++----
 drivers/gpu/drm/amd/amdgpu/sdma_v2_4.c |  8 +++---
 drivers/gpu/drm/amd/amdgpu/sdma_v3_0.c | 16 +++++------
 drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c | 28 ++++++++-----------
 drivers/gpu/drm/amd/amdgpu/sdma_v5_0.c | 16 +++++------
 drivers/gpu/drm/amd/amdgpu/sdma_v5_2.c | 16 +++++------
 drivers/gpu/drm/amd/amdgpu/si_dma.c    |  4 +--
 drivers/gpu/drm/amd/amdgpu/uvd_v7_0.c  |  6 ++---
 drivers/gpu/drm/amd/amdgpu/vce_v4_0.c  |  6 ++---
 drivers/gpu/drm/amd/amdgpu/vcn_v2_0.c  | 12 ++++-----
 drivers/gpu/drm/amd/amdgpu/vcn_v2_5.c  | 12 ++++-----
 drivers/gpu/drm/amd/amdgpu/vcn_v3_0.c  | 12 ++++-----
 21 files changed, 128 insertions(+), 145 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/cik_sdma.c b/drivers/gpu/drm/amd/amdgpu/cik_sdma.c
index 6c01199e9112..5647f13b98d4 100644
--- a/drivers/gpu/drm/amd/amdgpu/cik_sdma.c
+++ b/drivers/gpu/drm/amd/amdgpu/cik_sdma.c
@@ -164,7 +164,7 @@ static uint64_t cik_sdma_ring_get_rptr(struct amdgpu_ring *ring)
 {
 	u32 rptr;
 
-	rptr = ring->adev->wb.wb[ring->rptr_offs];
+	rptr = *ring->rptr_cpu_addr;
 
 	return (rptr & 0x3fffc) >> 2;
 }
@@ -436,12 +436,10 @@ static int cik_sdma_gfx_resume(struct amdgpu_device *adev)
 	struct amdgpu_ring *ring;
 	u32 rb_cntl, ib_cntl;
 	u32 rb_bufsz;
-	u32 wb_offset;
 	int i, j, r;
 
 	for (i = 0; i < adev->sdma.num_instances; i++) {
 		ring = &adev->sdma.instance[i].ring;
-		wb_offset = (ring->rptr_offs * 4);
 
 		mutex_lock(&adev->srbm_mutex);
 		for (j = 0; j < 16; j++) {
@@ -477,9 +475,9 @@ static int cik_sdma_gfx_resume(struct amdgpu_device *adev)
 
 		/* set the wb address whether it's enabled or not */
 		WREG32(mmSDMA0_GFX_RB_RPTR_ADDR_HI + sdma_offsets[i],
-		       upper_32_bits(adev->wb.gpu_addr + wb_offset) & 0xFFFFFFFF);
+		       upper_32_bits(ring->rptr_gpu_addr) & 0xFFFFFFFF);
 		WREG32(mmSDMA0_GFX_RB_RPTR_ADDR_LO + sdma_offsets[i],
-		       ((adev->wb.gpu_addr + wb_offset) & 0xFFFFFFFC));
+		       ((ring->rptr_gpu_addr) & 0xFFFFFFFC));
 
 		rb_cntl |= SDMA0_GFX_RB_CNTL__RPTR_WRITEBACK_ENABLE_MASK;
 
diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c b/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c
index 407074f958f4..f2dd53f2af61 100644
--- a/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c
@@ -3519,9 +3519,8 @@ static void gfx10_kiq_set_resources(struct amdgpu_ring *kiq_ring, uint64_t queue
 static void gfx10_kiq_map_queues(struct amdgpu_ring *kiq_ring,
 				 struct amdgpu_ring *ring)
 {
-	struct amdgpu_device *adev = kiq_ring->adev;
 	uint64_t mqd_addr = amdgpu_bo_gpu_offset(ring->mqd_obj);
-	uint64_t wptr_addr = adev->wb.gpu_addr + (ring->wptr_offs * 4);
+	uint64_t wptr_addr = ring->wptr_gpu_addr;
 	uint32_t eng_sel = ring->funcs->type == AMDGPU_RING_TYPE_GFX ? 4 : 0;
 
 	amdgpu_ring_write(kiq_ring, PACKET3(PACKET3_MAP_QUEUES, 5));
@@ -6344,12 +6343,12 @@ static int gfx_v10_0_cp_gfx_resume(struct amdgpu_device *adev)
 	WREG32_SOC15(GC, 0, mmCP_RB0_WPTR_HI, upper_32_bits(ring->wptr));
 
 	/* set the wb address wether it's enabled or not */
-	rptr_addr = adev->wb.gpu_addr + (ring->rptr_offs * 4);
+	rptr_addr = ring->rptr_gpu_addr;
 	WREG32_SOC15(GC, 0, mmCP_RB0_RPTR_ADDR, lower_32_bits(rptr_addr));
 	WREG32_SOC15(GC, 0, mmCP_RB0_RPTR_ADDR_HI, upper_32_bits(rptr_addr) &
 		     CP_RB_RPTR_ADDR_HI__RB_RPTR_ADDR_HI_MASK);
 
-	wptr_gpu_addr = adev->wb.gpu_addr + (ring->wptr_offs * 4);
+	wptr_gpu_addr = ring->wptr_gpu_addr;
 	WREG32_SOC15(GC, 0, mmCP_RB_WPTR_POLL_ADDR_LO,
 		     lower_32_bits(wptr_gpu_addr));
 	WREG32_SOC15(GC, 0, mmCP_RB_WPTR_POLL_ADDR_HI,
@@ -6382,11 +6381,11 @@ static int gfx_v10_0_cp_gfx_resume(struct amdgpu_device *adev)
 		WREG32_SOC15(GC, 0, mmCP_RB1_WPTR, lower_32_bits(ring->wptr));
 		WREG32_SOC15(GC, 0, mmCP_RB1_WPTR_HI, upper_32_bits(ring->wptr));
 		/* Set the wb address wether it's enabled or not */
-		rptr_addr = adev->wb.gpu_addr + (ring->rptr_offs * 4);
+		rptr_addr = ring->rptr_gpu_addr;
 		WREG32_SOC15(GC, 0, mmCP_RB1_RPTR_ADDR, lower_32_bits(rptr_addr));
 		WREG32_SOC15(GC, 0, mmCP_RB1_RPTR_ADDR_HI, upper_32_bits(rptr_addr) &
 			     CP_RB1_RPTR_ADDR_HI__RB_RPTR_ADDR_HI_MASK);
-		wptr_gpu_addr = adev->wb.gpu_addr + (ring->wptr_offs * 4);
+		wptr_gpu_addr = ring->wptr_gpu_addr;
 		WREG32_SOC15(GC, 0, mmCP_RB_WPTR_POLL_ADDR_LO,
 			     lower_32_bits(wptr_gpu_addr));
 		WREG32_SOC15(GC, 0, mmCP_RB_WPTR_POLL_ADDR_HI,
@@ -6610,13 +6609,13 @@ static int gfx_v10_0_gfx_mqd_init(struct amdgpu_ring *ring)
 	mqd->cp_gfx_hqd_base_hi = upper_32_bits(hqd_gpu_addr);
 
 	/* set up hqd_rptr_addr/_hi, similar as CP_RB_RPTR */
-	wb_gpu_addr = adev->wb.gpu_addr + (ring->rptr_offs * 4);
+	wb_gpu_addr = ring->rptr_gpu_addr;
 	mqd->cp_gfx_hqd_rptr_addr = wb_gpu_addr & 0xfffffffc;
 	mqd->cp_gfx_hqd_rptr_addr_hi =
 		upper_32_bits(wb_gpu_addr) & 0xffff;
 
 	/* set up rb_wptr_poll addr */
-	wb_gpu_addr = adev->wb.gpu_addr + (ring->wptr_offs * 4);
+	wb_gpu_addr = ring->wptr_gpu_addr;
 	mqd->cp_rb_wptr_poll_addr_lo = wb_gpu_addr & 0xfffffffc;
 	mqd->cp_rb_wptr_poll_addr_hi = upper_32_bits(wb_gpu_addr) & 0xffff;
 
@@ -6730,7 +6729,7 @@ static int gfx_v10_0_gfx_init_queue(struct amdgpu_ring *ring)
 			memcpy(mqd, adev->gfx.me.mqd_backup[mqd_idx], sizeof(*mqd));
 		/* reset the ring */
 		ring->wptr = 0;
-		adev->wb.wb[ring->wptr_offs] = 0;
+		*ring->wptr_cpu_addr = 0;
 		amdgpu_ring_clear_ring(ring);
 #ifdef BRING_UP_DEBUG
 		mutex_lock(&adev->srbm_mutex);
@@ -6904,13 +6903,13 @@ static int gfx_v10_0_compute_mqd_init(struct amdgpu_ring *ring)
 	mqd->cp_hqd_pq_control = tmp;
 
 	/* set the wb address whether it's enabled or not */
-	wb_gpu_addr = adev->wb.gpu_addr + (ring->rptr_offs * 4);
+	wb_gpu_addr = ring->rptr_gpu_addr;
 	mqd->cp_hqd_pq_rptr_report_addr_lo = wb_gpu_addr & 0xfffffffc;
 	mqd->cp_hqd_pq_rptr_report_addr_hi =
 		upper_32_bits(wb_gpu_addr) & 0xffff;
 
 	/* only used if CP_PQ_WPTR_POLL_CNTL.CP_PQ_WPTR_POLL_CNTL__EN_MASK=1 */
-	wb_gpu_addr = adev->wb.gpu_addr + (ring->wptr_offs * 4);
+	wb_gpu_addr = ring->wptr_gpu_addr;
 	mqd->cp_hqd_pq_wptr_poll_addr_lo = wb_gpu_addr & 0xfffffffc;
 	mqd->cp_hqd_pq_wptr_poll_addr_hi = upper_32_bits(wb_gpu_addr) & 0xffff;
 
@@ -7130,7 +7129,7 @@ static int gfx_v10_0_kcq_init_queue(struct amdgpu_ring *ring)
 
 		/* reset ring buffer */
 		ring->wptr = 0;
-		atomic64_set((atomic64_t *)&adev->wb.wb[ring->wptr_offs], 0);
+		atomic64_set((atomic64_t *)ring->wptr_cpu_addr, 0);
 		amdgpu_ring_clear_ring(ring);
 	} else {
 		amdgpu_ring_clear_ring(ring);
@@ -8496,7 +8495,8 @@ static void gfx_v10_0_get_clockgating_state(void *handle, u64 *flags)
 
 static u64 gfx_v10_0_ring_get_rptr_gfx(struct amdgpu_ring *ring)
 {
-	return ring->adev->wb.wb[ring->rptr_offs]; /* gfx10 is 32bit rptr*/
+	/* gfx10 is 32bit rptr*/
+	return *(uint32_t *)ring->rptr_cpu_addr;
 }
 
 static u64 gfx_v10_0_ring_get_wptr_gfx(struct amdgpu_ring *ring)
@@ -8506,7 +8506,7 @@ static u64 gfx_v10_0_ring_get_wptr_gfx(struct amdgpu_ring *ring)
 
 	/* XXX check if swapping is necessary on BE */
 	if (ring->use_doorbell) {
-		wptr = atomic64_read((atomic64_t *)&adev->wb.wb[ring->wptr_offs]);
+		wptr = atomic64_read((atomic64_t *)ring->wptr_cpu_addr);
 	} else {
 		wptr = RREG32_SOC15(GC, 0, mmCP_RB0_WPTR);
 		wptr += (u64)RREG32_SOC15(GC, 0, mmCP_RB0_WPTR_HI) << 32;
@@ -8521,7 +8521,7 @@ static void gfx_v10_0_ring_set_wptr_gfx(struct amdgpu_ring *ring)
 
 	if (ring->use_doorbell) {
 		/* XXX check if swapping is necessary on BE */
-		atomic64_set((atomic64_t *)&adev->wb.wb[ring->wptr_offs], ring->wptr);
+		atomic64_set((atomic64_t *)ring->wptr_cpu_addr, ring->wptr);
 		WDOORBELL64(ring->doorbell_index, ring->wptr);
 	} else {
 		WREG32_SOC15(GC, 0, mmCP_RB0_WPTR, lower_32_bits(ring->wptr));
@@ -8531,7 +8531,8 @@ static void gfx_v10_0_ring_set_wptr_gfx(struct amdgpu_ring *ring)
 
 static u64 gfx_v10_0_ring_get_rptr_compute(struct amdgpu_ring *ring)
 {
-	return ring->adev->wb.wb[ring->rptr_offs]; /* gfx10 hardware is 32bit rptr */
+	/* gfx10 hardware is 32bit rptr */
+	return *(uint32_t *)ring->rptr_cpu_addr;
 }
 
 static u64 gfx_v10_0_ring_get_wptr_compute(struct amdgpu_ring *ring)
@@ -8540,7 +8541,7 @@ static u64 gfx_v10_0_ring_get_wptr_compute(struct amdgpu_ring *ring)
 
 	/* XXX check if swapping is necessary on BE */
 	if (ring->use_doorbell)
-		wptr = atomic64_read((atomic64_t *)&ring->adev->wb.wb[ring->wptr_offs]);
+		wptr = atomic64_read((atomic64_t *)ring->wptr_cpu_addr);
 	else
 		BUG();
 	return wptr;
@@ -8552,7 +8553,7 @@ static void gfx_v10_0_ring_set_wptr_compute(struct amdgpu_ring *ring)
 
 	/* XXX check if swapping is necessary on BE */
 	if (ring->use_doorbell) {
-		atomic64_set((atomic64_t *)&adev->wb.wb[ring->wptr_offs], ring->wptr);
+		atomic64_set((atomic64_t *)ring->wptr_cpu_addr, ring->wptr);
 		WDOORBELL64(ring->doorbell_index, ring->wptr);
 	} else {
 		BUG(); /* only DOORBELL method supported on gfx10 now */
diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v6_0.c b/drivers/gpu/drm/amd/amdgpu/gfx_v6_0.c
index 6a8dadea40f9..29a91b320d4f 100644
--- a/drivers/gpu/drm/amd/amdgpu/gfx_v6_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gfx_v6_0.c
@@ -2117,7 +2117,7 @@ static int gfx_v6_0_cp_gfx_resume(struct amdgpu_device *adev)
 	WREG32(mmCP_RB0_WPTR, ring->wptr);
 
 	/* set the wb address whether it's enabled or not */
-	rptr_addr = adev->wb.gpu_addr + (ring->rptr_offs * 4);
+	rptr_addr = ring->rptr_gpu_addr;
 	WREG32(mmCP_RB0_RPTR_ADDR, lower_32_bits(rptr_addr));
 	WREG32(mmCP_RB0_RPTR_ADDR_HI, upper_32_bits(rptr_addr) & 0xFF);
 
@@ -2139,7 +2139,7 @@ static int gfx_v6_0_cp_gfx_resume(struct amdgpu_device *adev)
 
 static u64 gfx_v6_0_ring_get_rptr(struct amdgpu_ring *ring)
 {
-	return ring->adev->wb.wb[ring->rptr_offs];
+	return *ring->rptr_cpu_addr;
 }
 
 static u64 gfx_v6_0_ring_get_wptr(struct amdgpu_ring *ring)
@@ -2203,7 +2203,7 @@ static int gfx_v6_0_cp_compute_resume(struct amdgpu_device *adev)
 	ring->wptr = 0;
 	WREG32(mmCP_RB1_WPTR, ring->wptr);
 
-	rptr_addr = adev->wb.gpu_addr + (ring->rptr_offs * 4);
+	rptr_addr = ring->rptr_gpu_addr;
 	WREG32(mmCP_RB1_RPTR_ADDR, lower_32_bits(rptr_addr));
 	WREG32(mmCP_RB1_RPTR_ADDR_HI, upper_32_bits(rptr_addr) & 0xFF);
 
@@ -2222,7 +2222,7 @@ static int gfx_v6_0_cp_compute_resume(struct amdgpu_device *adev)
 	WREG32(mmCP_RB2_CNTL, tmp | CP_RB2_CNTL__RB_RPTR_WR_ENA_MASK);
 	ring->wptr = 0;
 	WREG32(mmCP_RB2_WPTR, ring->wptr);
-	rptr_addr = adev->wb.gpu_addr + (ring->rptr_offs * 4);
+	rptr_addr = ring->rptr_gpu_addr;
 	WREG32(mmCP_RB2_RPTR_ADDR, lower_32_bits(rptr_addr));
 	WREG32(mmCP_RB2_RPTR_ADDR_HI, upper_32_bits(rptr_addr) & 0xFF);
 
diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v7_0.c b/drivers/gpu/drm/amd/amdgpu/gfx_v7_0.c
index d17a6f399347..ac3f2dbba726 100644
--- a/drivers/gpu/drm/amd/amdgpu/gfx_v7_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gfx_v7_0.c
@@ -2630,8 +2630,8 @@ static int gfx_v7_0_cp_gfx_resume(struct amdgpu_device *adev)
 	ring->wptr = 0;
 	WREG32(mmCP_RB0_WPTR, lower_32_bits(ring->wptr));
 
 	/* set the wb address whether it's enabled or not */
-	rptr_addr = adev->wb.gpu_addr + (ring->rptr_offs * 4);
+	rptr_addr = ring->rptr_gpu_addr;
 	WREG32(mmCP_RB0_RPTR_ADDR, lower_32_bits(rptr_addr));
 	WREG32(mmCP_RB0_RPTR_ADDR_HI, upper_32_bits(rptr_addr) & 0xFF);
 
@@ -2656,7 +2656,7 @@ static int gfx_v7_0_cp_gfx_resume(struct amdgpu_device *adev)
 
 static u64 gfx_v7_0_ring_get_rptr(struct amdgpu_ring *ring)
 {
-	return ring->adev->wb.wb[ring->rptr_offs];
+	return *ring->rptr_cpu_addr;
 }
 
 static u64 gfx_v7_0_ring_get_wptr_gfx(struct amdgpu_ring *ring)
@@ -2677,7 +2677,7 @@ static void gfx_v7_0_ring_set_wptr_gfx(struct amdgpu_ring *ring)
 static u64 gfx_v7_0_ring_get_wptr_compute(struct amdgpu_ring *ring)
 {
 	/* XXX check if swapping is necessary on BE */
-	return ring->adev->wb.wb[ring->wptr_offs];
+	return *ring->wptr_cpu_addr;
 }
 
 static void gfx_v7_0_ring_set_wptr_compute(struct amdgpu_ring *ring)
@@ -2685,7 +2685,7 @@ static void gfx_v7_0_ring_set_wptr_compute(struct amdgpu_ring *ring)
 	struct amdgpu_device *adev = ring->adev;
 
 	/* XXX check if swapping is necessary on BE */
-	adev->wb.wb[ring->wptr_offs] = lower_32_bits(ring->wptr);
+	*ring->wptr_cpu_addr = lower_32_bits(ring->wptr);
 	WDOORBELL32(ring->doorbell_index, lower_32_bits(ring->wptr));
 }
 
@@ -2981,12 +2981,12 @@ static void gfx_v7_0_mqd_init(struct amdgpu_device *adev,
 		CP_HQD_PQ_CONTROL__KMD_QUEUE_MASK; /* assuming kernel queue control */
 
 	/* only used if CP_PQ_WPTR_POLL_CNTL.CP_PQ_WPTR_POLL_CNTL__EN_MASK=1 */
-	wb_gpu_addr = adev->wb.gpu_addr + (ring->wptr_offs * 4);
+	wb_gpu_addr = ring->wptr_gpu_addr;
 	mqd->cp_hqd_pq_wptr_poll_addr_lo = wb_gpu_addr & 0xfffffffc;
 	mqd->cp_hqd_pq_wptr_poll_addr_hi = upper_32_bits(wb_gpu_addr) & 0xffff;
 
 	/* set the wb address whether it's enabled or not */
-	wb_gpu_addr = adev->wb.gpu_addr + (ring->rptr_offs * 4);
+	wb_gpu_addr = ring->rptr_gpu_addr;
 	mqd->cp_hqd_pq_rptr_report_addr_lo = wb_gpu_addr & 0xfffffffc;
 	mqd->cp_hqd_pq_rptr_report_addr_hi =
 		upper_32_bits(wb_gpu_addr) & 0xffff;
diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c b/drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c
index 25dc729d0ec2..e4e779a19c20 100644
--- a/drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c
@@ -4306,11 +4306,11 @@ static int gfx_v8_0_cp_gfx_resume(struct amdgpu_device *adev)
 	WREG32(mmCP_RB0_WPTR, lower_32_bits(ring->wptr));
 
 	/* set the wb address wether it's enabled or not */
-	rptr_addr = adev->wb.gpu_addr + (ring->rptr_offs * 4);
+	rptr_addr = ring->rptr_gpu_addr;
 	WREG32(mmCP_RB0_RPTR_ADDR, lower_32_bits(rptr_addr));
 	WREG32(mmCP_RB0_RPTR_ADDR_HI, upper_32_bits(rptr_addr) & 0xFF);
 
-	wptr_gpu_addr = adev->wb.gpu_addr + (ring->wptr_offs * 4);
+	wptr_gpu_addr = ring->wptr_gpu_addr;
 	WREG32(mmCP_RB_WPTR_POLL_ADDR_LO, lower_32_bits(wptr_gpu_addr));
 	WREG32(mmCP_RB_WPTR_POLL_ADDR_HI, upper_32_bits(wptr_gpu_addr));
 	mdelay(1);
@@ -4393,7 +4393,7 @@ static int gfx_v8_0_kiq_kcq_enable(struct amdgpu_device *adev)
 	for (i = 0; i < adev->gfx.num_compute_rings; i++) {
 		struct amdgpu_ring *ring = &adev->gfx.compute_ring[i];
 		uint64_t mqd_addr = amdgpu_bo_gpu_offset(ring->mqd_obj);
-		uint64_t wptr_addr = adev->wb.gpu_addr + (ring->wptr_offs * 4);
+		uint64_t wptr_addr = ring->wptr_gpu_addr;
 
 		/* map queues */
 		amdgpu_ring_write(kiq_ring, PACKET3(PACKET3_MAP_QUEUES, 5));
@@ -4517,13 +4517,13 @@ static int gfx_v8_0_mqd_init(struct amdgpu_ring *ring)
 	mqd->cp_hqd_pq_control = tmp;
 
 	/* set the wb address whether it's enabled or not */
-	wb_gpu_addr = adev->wb.gpu_addr + (ring->rptr_offs * 4);
+	wb_gpu_addr = ring->rptr_gpu_addr;
 	mqd->cp_hqd_pq_rptr_report_addr_lo = wb_gpu_addr & 0xfffffffc;
 	mqd->cp_hqd_pq_rptr_report_addr_hi =
 		upper_32_bits(wb_gpu_addr) & 0xffff;
 
 	/* only used if CP_PQ_WPTR_POLL_CNTL.CP_PQ_WPTR_POLL_CNTL__EN_MASK=1 */
-	wb_gpu_addr = adev->wb.gpu_addr + (ring->wptr_offs * 4);
+	wb_gpu_addr = ring->wptr_gpu_addr;
 	mqd->cp_hqd_pq_wptr_poll_addr_lo = wb_gpu_addr & 0xfffffffc;
 	mqd->cp_hqd_pq_wptr_poll_addr_hi = upper_32_bits(wb_gpu_addr) & 0xffff;
 
@@ -6051,7 +6051,7 @@ static int gfx_v8_0_set_clockgating_state(void *handle,
 
 static u64 gfx_v8_0_ring_get_rptr(struct amdgpu_ring *ring)
 {
-	return ring->adev->wb.wb[ring->rptr_offs];
+	return *ring->rptr_cpu_addr;
 }
 
 static u64 gfx_v8_0_ring_get_wptr_gfx(struct amdgpu_ring *ring)
@@ -6060,7 +6060,7 @@ static u64 gfx_v8_0_ring_get_wptr_gfx(struct amdgpu_ring *ring)
 
 	if (ring->use_doorbell)
 		/* XXX check if swapping is necessary on BE */
-		return ring->adev->wb.wb[ring->wptr_offs];
+		return *ring->wptr_cpu_addr;
 	else
 		return RREG32(mmCP_RB0_WPTR);
 }
@@ -6071,7 +6071,7 @@ static void gfx_v8_0_ring_set_wptr_gfx(struct amdgpu_ring *ring)
 
 	if (ring->use_doorbell) {
 		/* XXX check if swapping is necessary on BE */
-		adev->wb.wb[ring->wptr_offs] = lower_32_bits(ring->wptr);
+		*ring->wptr_cpu_addr = lower_32_bits(ring->wptr);
 		WDOORBELL32(ring->doorbell_index, lower_32_bits(ring->wptr));
 	} else {
 		WREG32(mmCP_RB0_WPTR, lower_32_bits(ring->wptr));
@@ -6271,7 +6271,7 @@ static void gfx_v8_0_ring_emit_vm_flush(struct amdgpu_ring *ring,
 
 static u64 gfx_v8_0_ring_get_wptr_compute(struct amdgpu_ring *ring)
 {
-	return ring->adev->wb.wb[ring->wptr_offs];
+	return *ring->wptr_cpu_addr;
 }
 
 static void gfx_v8_0_ring_set_wptr_compute(struct amdgpu_ring *ring)
@@ -6279,7 +6279,7 @@ static void gfx_v8_0_ring_set_wptr_compute(struct amdgpu_ring *ring)
 	struct amdgpu_device *adev = ring->adev;
 
 	/* XXX check if swapping is necessary on BE */
-	adev->wb.wb[ring->wptr_offs] = lower_32_bits(ring->wptr);
+	*ring->wptr_cpu_addr = lower_32_bits(ring->wptr);
 	WDOORBELL32(ring->doorbell_index, lower_32_bits(ring->wptr));
 }
 
diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c b/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c
index d58fd83524ac..06182b7e4351 100644
--- a/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c
@@ -780,9 +780,8 @@ static void gfx_v9_0_kiq_set_resources(struct amdgpu_ring *kiq_ring,
 static void gfx_v9_0_kiq_map_queues(struct amdgpu_ring *kiq_ring,
 				 struct amdgpu_ring *ring)
 {
-	struct amdgpu_device *adev = kiq_ring->adev;
 	uint64_t mqd_addr = amdgpu_bo_gpu_offset(ring->mqd_obj);
-	uint64_t wptr_addr = adev->wb.gpu_addr + (ring->wptr_offs * 4);
+	uint64_t wptr_addr = ring->wptr_gpu_addr;
 	uint32_t eng_sel = ring->funcs->type == AMDGPU_RING_TYPE_GFX ? 4 : 0;
 
 	amdgpu_ring_write(kiq_ring, PACKET3(PACKET3_MAP_QUEUES, 5));
@@ -3326,11 +3325,11 @@ static int gfx_v9_0_cp_gfx_resume(struct amdgpu_device *adev)
 	WREG32_SOC15(GC, 0, mmCP_RB0_WPTR_HI, upper_32_bits(ring->wptr));
 
 	/* set the wb address wether it's enabled or not */
-	rptr_addr = adev->wb.gpu_addr + (ring->rptr_offs * 4);
+	rptr_addr = ring->rptr_gpu_addr;
 	WREG32_SOC15(GC, 0, mmCP_RB0_RPTR_ADDR, lower_32_bits(rptr_addr));
 	WREG32_SOC15(GC, 0, mmCP_RB0_RPTR_ADDR_HI, upper_32_bits(rptr_addr) & CP_RB_RPTR_ADDR_HI__RB_RPTR_ADDR_HI_MASK);
 
-	wptr_gpu_addr = adev->wb.gpu_addr + (ring->wptr_offs * 4);
+	wptr_gpu_addr = ring->wptr_gpu_addr;
 	WREG32_SOC15(GC, 0, mmCP_RB_WPTR_POLL_ADDR_LO, lower_32_bits(wptr_gpu_addr));
 	WREG32_SOC15(GC, 0, mmCP_RB_WPTR_POLL_ADDR_HI, upper_32_bits(wptr_gpu_addr));
 
@@ -3542,13 +3541,13 @@ static int gfx_v9_0_mqd_init(struct amdgpu_ring *ring)
 	mqd->cp_hqd_pq_control = tmp;
 
 	/* set the wb address whether it's enabled or not */
-	wb_gpu_addr = adev->wb.gpu_addr + (ring->rptr_offs * 4);
+	wb_gpu_addr = ring->rptr_gpu_addr;
 	mqd->cp_hqd_pq_rptr_report_addr_lo = wb_gpu_addr & 0xfffffffc;
 	mqd->cp_hqd_pq_rptr_report_addr_hi =
 		upper_32_bits(wb_gpu_addr) & 0xffff;
 
 	/* only used if CP_PQ_WPTR_POLL_CNTL.CP_PQ_WPTR_POLL_CNTL__EN_MASK=1 */
-	wb_gpu_addr = adev->wb.gpu_addr + (ring->wptr_offs * 4);
+	wb_gpu_addr = ring->wptr_gpu_addr;
 	mqd->cp_hqd_pq_wptr_poll_addr_lo = wb_gpu_addr & 0xfffffffc;
 	mqd->cp_hqd_pq_wptr_poll_addr_hi = upper_32_bits(wb_gpu_addr) & 0xffff;
 
@@ -3830,7 +3829,7 @@ static int gfx_v9_0_kcq_init_queue(struct amdgpu_ring *ring)
 
 		/* reset ring buffer */
 		ring->wptr = 0;
-		atomic64_set((atomic64_t *)&adev->wb.wb[ring->wptr_offs], 0);
+		atomic64_set((atomic64_t *)ring->wptr_cpu_addr, 0);
 		amdgpu_ring_clear_ring(ring);
 	} else {
 		amdgpu_ring_clear_ring(ring);
@@ -5279,7 +5278,7 @@ static void gfx_v9_0_get_clockgating_state(void *handle, u64 *flags)
 
 static u64 gfx_v9_0_ring_get_rptr_gfx(struct amdgpu_ring *ring)
 {
-	return ring->adev->wb.wb[ring->rptr_offs]; /* gfx9 is 32bit rptr*/
+	return *ring->rptr_cpu_addr; /* gfx9 is 32bit rptr*/
 }
 
 static u64 gfx_v9_0_ring_get_wptr_gfx(struct amdgpu_ring *ring)
@@ -5289,7 +5288,7 @@ static u64 gfx_v9_0_ring_get_wptr_gfx(struct amdgpu_ring *ring)
 
 	/* XXX check if swapping is necessary on BE */
 	if (ring->use_doorbell) {
-		wptr = atomic64_read((atomic64_t *)&adev->wb.wb[ring->wptr_offs]);
+		wptr = atomic64_read((atomic64_t *)ring->wptr_cpu_addr);
 	} else {
 		wptr = RREG32_SOC15(GC, 0, mmCP_RB0_WPTR);
 		wptr += (u64)RREG32_SOC15(GC, 0, mmCP_RB0_WPTR_HI) << 32;
@@ -5304,7 +5303,7 @@ static void gfx_v9_0_ring_set_wptr_gfx(struct amdgpu_ring *ring)
 
 	if (ring->use_doorbell) {
 		/* XXX check if swapping is necessary on BE */
-		atomic64_set((atomic64_t *)&adev->wb.wb[ring->wptr_offs], ring->wptr);
+		atomic64_set((atomic64_t *)ring->wptr_cpu_addr, ring->wptr);
 		WDOORBELL64(ring->doorbell_index, ring->wptr);
 	} else {
 		WREG32_SOC15(GC, 0, mmCP_RB0_WPTR, lower_32_bits(ring->wptr));
@@ -5469,7 +5468,7 @@ static void gfx_v9_0_ring_emit_vm_flush(struct amdgpu_ring *ring,
 
 static u64 gfx_v9_0_ring_get_rptr_compute(struct amdgpu_ring *ring)
 {
-	return ring->adev->wb.wb[ring->rptr_offs]; /* gfx9 hardware is 32bit rptr */
+	return *ring->rptr_cpu_addr; /* gfx9 hardware is 32bit rptr */
 }
 
 static u64 gfx_v9_0_ring_get_wptr_compute(struct amdgpu_ring *ring)
@@ -5478,7 +5477,7 @@ static u64 gfx_v9_0_ring_get_wptr_compute(struct amdgpu_ring *ring)
 
 	/* XXX check if swapping is necessary on BE */
 	if (ring->use_doorbell)
-		wptr = atomic64_read((atomic64_t *)&ring->adev->wb.wb[ring->wptr_offs]);
+		wptr = atomic64_read((atomic64_t *)ring->wptr_cpu_addr);
 	else
 		BUG();
 	return wptr;
@@ -5490,7 +5489,7 @@ static void gfx_v9_0_ring_set_wptr_compute(struct amdgpu_ring *ring)
 
 	/* XXX check if swapping is necessary on BE */
 	if (ring->use_doorbell) {
-		atomic64_set((atomic64_t *)&adev->wb.wb[ring->wptr_offs], ring->wptr);
+		atomic64_set((atomic64_t *)ring->wptr_cpu_addr, ring->wptr);
 		WDOORBELL64(ring->doorbell_index, ring->wptr);
 	} else{
 		BUG(); /* only DOORBELL method supported on gfx9 now */
diff --git a/drivers/gpu/drm/amd/amdgpu/jpeg_v2_0.c b/drivers/gpu/drm/amd/amdgpu/jpeg_v2_0.c
index 299de1d131d8..d2722adabd1b 100644
--- a/drivers/gpu/drm/amd/amdgpu/jpeg_v2_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/jpeg_v2_0.c
@@ -407,7 +407,7 @@ static uint64_t jpeg_v2_0_dec_ring_get_wptr(struct amdgpu_ring *ring)
 	struct amdgpu_device *adev = ring->adev;
 
 	if (ring->use_doorbell)
-		return adev->wb.wb[ring->wptr_offs];
+		return *ring->wptr_cpu_addr;
 	else
 		return RREG32_SOC15(JPEG, 0, mmUVD_JRBC_RB_WPTR);
 }
@@ -424,7 +424,7 @@ static void jpeg_v2_0_dec_ring_set_wptr(struct amdgpu_ring *ring)
 	struct amdgpu_device *adev = ring->adev;
 
 	if (ring->use_doorbell) {
-		adev->wb.wb[ring->wptr_offs] = lower_32_bits(ring->wptr);
+		*ring->wptr_cpu_addr = lower_32_bits(ring->wptr);
 		WDOORBELL32(ring->doorbell_index, lower_32_bits(ring->wptr));
 	} else {
 		WREG32_SOC15(JPEG, 0, mmUVD_JRBC_RB_WPTR, lower_32_bits(ring->wptr));
diff --git a/drivers/gpu/drm/amd/amdgpu/jpeg_v2_5.c b/drivers/gpu/drm/amd/amdgpu/jpeg_v2_5.c
index 8c3227d0b8b4..c2bf036a7330 100644
--- a/drivers/gpu/drm/amd/amdgpu/jpeg_v2_5.c
+++ b/drivers/gpu/drm/amd/amdgpu/jpeg_v2_5.c
@@ -402,7 +402,7 @@ static uint64_t jpeg_v2_5_dec_ring_get_wptr(struct amdgpu_ring *ring)
 	struct amdgpu_device *adev = ring->adev;
 
 	if (ring->use_doorbell)
-		return adev->wb.wb[ring->wptr_offs];
+		return *ring->wptr_cpu_addr;
 	else
 		return RREG32_SOC15(JPEG, ring->me, mmUVD_JRBC_RB_WPTR);
 }
@@ -419,7 +419,7 @@ static void jpeg_v2_5_dec_ring_set_wptr(struct amdgpu_ring *ring)
 	struct amdgpu_device *adev = ring->adev;
 
 	if (ring->use_doorbell) {
-		adev->wb.wb[ring->wptr_offs] = lower_32_bits(ring->wptr);
+		*ring->wptr_cpu_addr = lower_32_bits(ring->wptr);
 		WDOORBELL32(ring->doorbell_index, lower_32_bits(ring->wptr));
 	} else {
 		WREG32_SOC15(JPEG, ring->me, mmUVD_JRBC_RB_WPTR, lower_32_bits(ring->wptr));
diff --git a/drivers/gpu/drm/amd/amdgpu/jpeg_v3_0.c b/drivers/gpu/drm/amd/amdgpu/jpeg_v3_0.c
index 41a00851b6c5..a1b751d9ac06 100644
--- a/drivers/gpu/drm/amd/amdgpu/jpeg_v3_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/jpeg_v3_0.c
@@ -427,7 +427,7 @@ static uint64_t jpeg_v3_0_dec_ring_get_wptr(struct amdgpu_ring *ring)
 	struct amdgpu_device *adev = ring->adev;
 
 	if (ring->use_doorbell)
-		return adev->wb.wb[ring->wptr_offs];
+		return *ring->wptr_cpu_addr;
 	else
 		return RREG32_SOC15(JPEG, 0, mmUVD_JRBC_RB_WPTR);
 }
@@ -444,7 +444,7 @@ static void jpeg_v3_0_dec_ring_set_wptr(struct amdgpu_ring *ring)
 	struct amdgpu_device *adev = ring->adev;
 
 	if (ring->use_doorbell) {
-		adev->wb.wb[ring->wptr_offs] = lower_32_bits(ring->wptr);
+		*ring->wptr_cpu_addr = lower_32_bits(ring->wptr);
 		WDOORBELL32(ring->doorbell_index, lower_32_bits(ring->wptr));
 	} else {
 		WREG32_SOC15(JPEG, 0, mmUVD_JRBC_RB_WPTR, lower_32_bits(ring->wptr));
diff --git a/drivers/gpu/drm/amd/amdgpu/mes_v10_1.c b/drivers/gpu/drm/amd/amdgpu/mes_v10_1.c
index a7ec4ac89da5..0819ffe8e759 100644
--- a/drivers/gpu/drm/amd/amdgpu/mes_v10_1.c
+++ b/drivers/gpu/drm/amd/amdgpu/mes_v10_1.c
@@ -46,7 +46,7 @@ static void mes_v10_1_ring_set_wptr(struct amdgpu_ring *ring)
 	struct amdgpu_device *adev = ring->adev;
 
 	if (ring->use_doorbell) {
-		atomic64_set((atomic64_t *)&adev->wb.wb[ring->wptr_offs],
+		atomic64_set((atomic64_t *)ring->wptr_cpu_addr,
 			     ring->wptr);
 		WDOORBELL64(ring->doorbell_index, ring->wptr);
 	} else {
@@ -56,7 +56,7 @@ static void mes_v10_1_ring_set_wptr(struct amdgpu_ring *ring)
 
 static u64 mes_v10_1_ring_get_rptr(struct amdgpu_ring *ring)
 {
-	return ring->adev->wb.wb[ring->rptr_offs];
+	return *ring->rptr_cpu_addr;
 }
 
 static u64 mes_v10_1_ring_get_wptr(struct amdgpu_ring *ring)
@@ -64,8 +64,7 @@ static u64 mes_v10_1_ring_get_wptr(struct amdgpu_ring *ring)
 	u64 wptr;
 
 	if (ring->use_doorbell)
-		wptr = atomic64_read((atomic64_t *)
-				     &ring->adev->wb.wb[ring->wptr_offs]);
+		wptr = atomic64_read((atomic64_t *)ring->wptr_cpu_addr);
 	else
 		BUG();
 	return wptr;
@@ -673,13 +672,13 @@ static int mes_v10_1_mqd_init(struct amdgpu_ring *ring)
 	mqd->cp_hqd_pq_control = tmp;
 
 	/* set the wb address whether it's enabled or not */
-	wb_gpu_addr = adev->wb.gpu_addr + (ring->rptr_offs * 4);
+	wb_gpu_addr = ring->rptr_gpu_addr;
 	mqd->cp_hqd_pq_rptr_report_addr_lo = wb_gpu_addr & 0xfffffffc;
 	mqd->cp_hqd_pq_rptr_report_addr_hi =
 		upper_32_bits(wb_gpu_addr) & 0xffff;
 
 	/* only used if CP_PQ_WPTR_POLL_CNTL.CP_PQ_WPTR_POLL_CNTL__EN_MASK=1 */
-	wb_gpu_addr = adev->wb.gpu_addr + (ring->wptr_offs * 4);
+	wb_gpu_addr = ring->wptr_gpu_addr;
 	mqd->cp_hqd_pq_wptr_poll_addr_lo = wb_gpu_addr & 0xfffffff8;
 	mqd->cp_hqd_pq_wptr_poll_addr_hi = upper_32_bits(wb_gpu_addr) & 0xffff;
 
diff --git a/drivers/gpu/drm/amd/amdgpu/sdma_v2_4.c b/drivers/gpu/drm/amd/amdgpu/sdma_v2_4.c
index 84b57b06b20c..6bdffdc1c0b9 100644
--- a/drivers/gpu/drm/amd/amdgpu/sdma_v2_4.c
+++ b/drivers/gpu/drm/amd/amdgpu/sdma_v2_4.c
@@ -194,7 +194,7 @@ static int sdma_v2_4_init_microcode(struct amdgpu_device *adev)
 static uint64_t sdma_v2_4_ring_get_rptr(struct amdgpu_ring *ring)
 {
 	/* XXX check if swapping is necessary on BE */
-	return ring->adev->wb.wb[ring->rptr_offs] >> 2;
+	return *ring->rptr_cpu_addr >> 2;
 }
 
 /**
@@ -414,12 +414,10 @@ static int sdma_v2_4_gfx_resume(struct amdgpu_device *adev)
 	struct amdgpu_ring *ring;
 	u32 rb_cntl, ib_cntl;
 	u32 rb_bufsz;
-	u32 wb_offset;
 	int i, j, r;
 
 	for (i = 0; i < adev->sdma.num_instances; i++) {
 		ring = &adev->sdma.instance[i].ring;
-		wb_offset = (ring->rptr_offs * 4);
 
 		mutex_lock(&adev->srbm_mutex);
 		for (j = 0; j < 16; j++) {
@@ -455,9 +453,9 @@ static int sdma_v2_4_gfx_resume(struct amdgpu_device *adev)
 
 		/* set the wb address whether it's enabled or not */
 		WREG32(mmSDMA0_GFX_RB_RPTR_ADDR_HI + sdma_offsets[i],
-		       upper_32_bits(adev->wb.gpu_addr + wb_offset) & 0xFFFFFFFF);
+		       upper_32_bits(ring->rptr_gpu_addr) & 0xFFFFFFFF);
 		WREG32(mmSDMA0_GFX_RB_RPTR_ADDR_LO + sdma_offsets[i],
-		       lower_32_bits(adev->wb.gpu_addr + wb_offset) & 0xFFFFFFFC);
+		       lower_32_bits(ring->rptr_gpu_addr) & 0xFFFFFFFC);
 
 		rb_cntl = REG_SET_FIELD(rb_cntl, SDMA0_GFX_RB_CNTL, RPTR_WRITEBACK_ENABLE, 1);
 
diff --git a/drivers/gpu/drm/amd/amdgpu/sdma_v3_0.c b/drivers/gpu/drm/amd/amdgpu/sdma_v3_0.c
index 8af5c94d526a..2584fa3cb13e 100644
--- a/drivers/gpu/drm/amd/amdgpu/sdma_v3_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/sdma_v3_0.c
@@ -350,7 +350,7 @@ static int sdma_v3_0_init_microcode(struct amdgpu_device *adev)
 static uint64_t sdma_v3_0_ring_get_rptr(struct amdgpu_ring *ring)
 {
 	/* XXX check if swapping is necessary on BE */
-	return ring->adev->wb.wb[ring->rptr_offs] >> 2;
+	return *ring->rptr_cpu_addr >> 2;
 }
 
 /**
@@ -367,7 +367,7 @@ static uint64_t sdma_v3_0_ring_get_wptr(struct amdgpu_ring *ring)
 
 	if (ring->use_doorbell || ring->use_pollmem) {
 		/* XXX check if swapping is necessary on BE */
-		wptr = ring->adev->wb.wb[ring->wptr_offs] >> 2;
+		wptr = *ring->wptr_cpu_addr >> 2;
 	} else {
 		wptr = RREG32(mmSDMA0_GFX_RB_WPTR + sdma_offsets[ring->me]) >> 2;
 	}
@@ -387,12 +387,12 @@ static void sdma_v3_0_ring_set_wptr(struct amdgpu_ring *ring)
 	struct amdgpu_device *adev = ring->adev;
 
 	if (ring->use_doorbell) {
-		u32 *wb = (u32 *)&adev->wb.wb[ring->wptr_offs];
+		u32 *wb = (u32 *)ring->wptr_cpu_addr;
 		/* XXX check if swapping is necessary on BE */
 		WRITE_ONCE(*wb, ring->wptr << 2);
 		WDOORBELL32(ring->doorbell_index, ring->wptr << 2);
 	} else if (ring->use_pollmem) {
-		u32 *wb = (u32 *)&adev->wb.wb[ring->wptr_offs];
+		u32 *wb = (u32 *)ring->wptr_cpu_addr;
 
 		WRITE_ONCE(*wb, ring->wptr << 2);
 	} else {
@@ -649,7 +649,6 @@ static int sdma_v3_0_gfx_resume(struct amdgpu_device *adev)
 	struct amdgpu_ring *ring;
 	u32 rb_cntl, ib_cntl, wptr_poll_cntl;
 	u32 rb_bufsz;
-	u32 wb_offset;
 	u32 doorbell;
 	u64 wptr_gpu_addr;
 	int i, j, r;
@@ -657,7 +656,6 @@ static int sdma_v3_0_gfx_resume(struct amdgpu_device *adev)
 	for (i = 0; i < adev->sdma.num_instances; i++) {
 		ring = &adev->sdma.instance[i].ring;
 		amdgpu_ring_clear_ring(ring);
-		wb_offset = (ring->rptr_offs * 4);
 
 		mutex_lock(&adev->srbm_mutex);
 		for (j = 0; j < 16; j++) {
@@ -694,9 +692,9 @@ static int sdma_v3_0_gfx_resume(struct amdgpu_device *adev)
 
 		/* set the wb address whether it's enabled or not */
 		WREG32(mmSDMA0_GFX_RB_RPTR_ADDR_HI + sdma_offsets[i],
-		       upper_32_bits(adev->wb.gpu_addr + wb_offset) & 0xFFFFFFFF);
+		       upper_32_bits(ring->rptr_gpu_addr) & 0xFFFFFFFF);
 		WREG32(mmSDMA0_GFX_RB_RPTR_ADDR_LO + sdma_offsets[i],
-		       lower_32_bits(adev->wb.gpu_addr + wb_offset) & 0xFFFFFFFC);
+		       lower_32_bits(ring->rptr_gpu_addr) & 0xFFFFFFFC);
 
 		rb_cntl = REG_SET_FIELD(rb_cntl, SDMA0_GFX_RB_CNTL, RPTR_WRITEBACK_ENABLE, 1);
 
@@ -715,7 +713,7 @@ static int sdma_v3_0_gfx_resume(struct amdgpu_device *adev)
 		WREG32(mmSDMA0_GFX_DOORBELL + sdma_offsets[i], doorbell);
 
 		/* setup the wptr shadow polling */
-		wptr_gpu_addr = adev->wb.gpu_addr + (ring->wptr_offs * 4);
+		wptr_gpu_addr = ring->wptr_gpu_addr;
 
 		WREG32(mmSDMA0_GFX_RB_WPTR_POLL_ADDR_LO + sdma_offsets[i],
 		       lower_32_bits(wptr_gpu_addr));
diff --git a/drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c b/drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c
index 80de85847712..65181efba50e 100644
--- a/drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c
@@ -722,7 +722,7 @@ static uint64_t sdma_v4_0_ring_get_rptr(struct amdgpu_ring *ring)
 	u64 *rptr;
 
 	/* XXX check if swapping is necessary on BE */
-	rptr = ((u64 *)&ring->adev->wb.wb[ring->rptr_offs]);
+	rptr = ((u64 *)ring->rptr_cpu_addr);
 
 	DRM_DEBUG("rptr before shift == 0x%016llx\n", *rptr);
 	return ((*rptr) >> 2);
@@ -742,7 +742,7 @@ static uint64_t sdma_v4_0_ring_get_wptr(struct amdgpu_ring *ring)
 
 	if (ring->use_doorbell) {
 		/* XXX check if swapping is necessary on BE */
-		wptr = READ_ONCE(*((u64 *)&adev->wb.wb[ring->wptr_offs]));
+		wptr = READ_ONCE(*((u64 *)ring->wptr_cpu_addr));
 		DRM_DEBUG("wptr/doorbell before shift == 0x%016llx\n", wptr);
 	} else {
 		wptr = RREG32_SDMA(ring->me, mmSDMA0_GFX_RB_WPTR_HI);
@@ -768,7 +768,7 @@ static void sdma_v4_0_ring_set_wptr(struct amdgpu_ring *ring)
 
 	DRM_DEBUG("Setting write pointer\n");
 	if (ring->use_doorbell) {
-		u64 *wb = (u64 *)&adev->wb.wb[ring->wptr_offs];
+		u64 *wb = (u64 *)ring->wptr_cpu_addr;
 
 		DRM_DEBUG("Using doorbell -- "
 				"wptr_offs == 0x%08x "
@@ -811,7 +811,7 @@ static uint64_t sdma_v4_0_page_ring_get_wptr(struct amdgpu_ring *ring)
 
 	if (ring->use_doorbell) {
 		/* XXX check if swapping is necessary on BE */
-		wptr = READ_ONCE(*((u64 *)&adev->wb.wb[ring->wptr_offs]));
+		wptr = READ_ONCE(*((u64 *)ring->wptr_cpu_addr));
 	} else {
 		wptr = RREG32_SDMA(ring->me, mmSDMA0_PAGE_RB_WPTR_HI);
 		wptr = wptr << 32;
@@ -833,7 +833,7 @@ static void sdma_v4_0_page_ring_set_wptr(struct amdgpu_ring *ring)
 	struct amdgpu_device *adev = ring->adev;
 
 	if (ring->use_doorbell) {
-		u64 *wb = (u64 *)&adev->wb.wb[ring->wptr_offs];
+		u64 *wb = (u64 *)ring->wptr_cpu_addr;
 
 		/* XXX check if swapping is necessary on BE */
 		WRITE_ONCE(*wb, (ring->wptr << 2));
@@ -1174,13 +1174,10 @@ static void sdma_v4_0_gfx_resume(struct amdgpu_device *adev, unsigned int i)
 {
 	struct amdgpu_ring *ring = &adev->sdma.instance[i].ring;
 	u32 rb_cntl, ib_cntl, wptr_poll_cntl;
-	u32 wb_offset;
 	u32 doorbell;
 	u32 doorbell_offset;
 	u64 wptr_gpu_addr;
 
-	wb_offset = (ring->rptr_offs * 4);
-
 	rb_cntl = RREG32_SDMA(i, mmSDMA0_GFX_RB_CNTL);
 	rb_cntl = sdma_v4_0_rb_cntl(ring, rb_cntl);
 	WREG32_SDMA(i, mmSDMA0_GFX_RB_CNTL, rb_cntl);
@@ -1193,9 +1190,9 @@ static void sdma_v4_0_gfx_resume(struct amdgpu_device *adev, unsigned int i)
 
 	/* set the wb address whether it's enabled or not */
 	WREG32_SDMA(i, mmSDMA0_GFX_RB_RPTR_ADDR_HI,
-	       upper_32_bits(adev->wb.gpu_addr + wb_offset) & 0xFFFFFFFF);
+	       upper_32_bits(ring->rptr_gpu_addr) & 0xFFFFFFFF);
 	WREG32_SDMA(i, mmSDMA0_GFX_RB_RPTR_ADDR_LO,
-	       lower_32_bits(adev->wb.gpu_addr + wb_offset) & 0xFFFFFFFC);
+	       lower_32_bits(ring->rptr_gpu_addr) & 0xFFFFFFFC);
 
 	rb_cntl = REG_SET_FIELD(rb_cntl, SDMA0_GFX_RB_CNTL,
 				RPTR_WRITEBACK_ENABLE, 1);
@@ -1225,7 +1222,7 @@ static void sdma_v4_0_gfx_resume(struct amdgpu_device *adev, unsigned int i)
 	WREG32_SDMA(i, mmSDMA0_GFX_MINOR_PTR_UPDATE, 0);
 
 	/* setup the wptr shadow polling */
-	wptr_gpu_addr = adev->wb.gpu_addr + (ring->wptr_offs * 4);
+	wptr_gpu_addr = ring->wptr_gpu_addr;
 	WREG32_SDMA(i, mmSDMA0_GFX_RB_WPTR_POLL_ADDR_LO,
 		    lower_32_bits(wptr_gpu_addr));
 	WREG32_SDMA(i, mmSDMA0_GFX_RB_WPTR_POLL_ADDR_HI,
@@ -1264,13 +1261,10 @@ static void sdma_v4_0_page_resume(struct amdgpu_device *adev, unsigned int i)
 {
 	struct amdgpu_ring *ring = &adev->sdma.instance[i].page;
 	u32 rb_cntl, ib_cntl, wptr_poll_cntl;
-	u32 wb_offset;
 	u32 doorbell;
 	u32 doorbell_offset;
 	u64 wptr_gpu_addr;
 
-	wb_offset = (ring->rptr_offs * 4);
-
 	rb_cntl = RREG32_SDMA(i, mmSDMA0_PAGE_RB_CNTL);
 	rb_cntl = sdma_v4_0_rb_cntl(ring, rb_cntl);
 	WREG32_SDMA(i, mmSDMA0_PAGE_RB_CNTL, rb_cntl);
@@ -1283,9 +1277,9 @@ static void sdma_v4_0_page_resume(struct amdgpu_device *adev, unsigned int i)
 
 	/* set the wb address whether it's enabled or not */
 	WREG32_SDMA(i, mmSDMA0_PAGE_RB_RPTR_ADDR_HI,
-	       upper_32_bits(adev->wb.gpu_addr + wb_offset) & 0xFFFFFFFF);
+	       upper_32_bits(ring->rptr_gpu_addr) & 0xFFFFFFFF);
 	WREG32_SDMA(i, mmSDMA0_PAGE_RB_RPTR_ADDR_LO,
-	       lower_32_bits(adev->wb.gpu_addr + wb_offset) & 0xFFFFFFFC);
+	       lower_32_bits(ring->rptr_gpu_addr) & 0xFFFFFFFC);
 
 	rb_cntl = REG_SET_FIELD(rb_cntl, SDMA0_PAGE_RB_CNTL,
 				RPTR_WRITEBACK_ENABLE, 1);
@@ -1316,7 +1310,7 @@ static void sdma_v4_0_page_resume(struct amdgpu_device *adev, unsigned int i)
 	WREG32_SDMA(i, mmSDMA0_PAGE_MINOR_PTR_UPDATE, 0);
 
 	/* setup the wptr shadow polling */
-	wptr_gpu_addr = adev->wb.gpu_addr + (ring->wptr_offs * 4);
+	wptr_gpu_addr = ring->wptr_gpu_addr;
 	WREG32_SDMA(i, mmSDMA0_PAGE_RB_WPTR_POLL_ADDR_LO,
 		    lower_32_bits(wptr_gpu_addr));
 	WREG32_SDMA(i, mmSDMA0_PAGE_RB_WPTR_POLL_ADDR_HI,
diff --git a/drivers/gpu/drm/amd/amdgpu/sdma_v5_0.c b/drivers/gpu/drm/amd/amdgpu/sdma_v5_0.c
index d3939c5f531d..ff359e7f1eb8 100644
--- a/drivers/gpu/drm/amd/amdgpu/sdma_v5_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/sdma_v5_0.c
@@ -347,7 +347,7 @@ static uint64_t sdma_v5_0_ring_get_rptr(struct amdgpu_ring *ring)
 	u64 *rptr;
 
 	/* XXX check if swapping is necessary on BE */
-	rptr = ((u64 *)&ring->adev->wb.wb[ring->rptr_offs]);
+	rptr = (u64 *)ring->rptr_cpu_addr;
 
 	DRM_DEBUG("rptr before shift == 0x%016llx\n", *rptr);
 	return ((*rptr) >> 2);
@@ -367,7 +367,7 @@ static uint64_t sdma_v5_0_ring_get_wptr(struct amdgpu_ring *ring)
 
 	if (ring->use_doorbell) {
 		/* XXX check if swapping is necessary on BE */
-		wptr = READ_ONCE(*((u64 *)&adev->wb.wb[ring->wptr_offs]));
+		wptr = READ_ONCE(*((u64 *)ring->wptr_cpu_addr));
 		DRM_DEBUG("wptr/doorbell before shift == 0x%016llx\n", wptr);
 	} else {
 		wptr = RREG32_SOC15_IP(GC, sdma_v5_0_get_reg_offset(adev, ring->me, mmSDMA0_GFX_RB_WPTR_HI));
@@ -400,8 +400,8 @@ static void sdma_v5_0_ring_set_wptr(struct amdgpu_ring *ring)
 				lower_32_bits(ring->wptr << 2),
 				upper_32_bits(ring->wptr << 2));
 		/* XXX check if swapping is necessary on BE */
-		adev->wb.wb[ring->wptr_offs] = lower_32_bits(ring->wptr << 2);
-		adev->wb.wb[ring->wptr_offs + 1] = upper_32_bits(ring->wptr << 2);
+		atomic64_set((atomic64_t *)ring->wptr_cpu_addr,
+			     ring->wptr << 2);
 		DRM_DEBUG("calling WDOORBELL64(0x%08x, 0x%016llx)\n",
 				ring->doorbell_index, ring->wptr << 2);
 		WDOORBELL64(ring->doorbell_index, ring->wptr << 2);
@@ -708,7 +708,6 @@ static int sdma_v5_0_gfx_resume(struct amdgpu_device *adev)
 	struct amdgpu_ring *ring;
 	u32 rb_cntl, ib_cntl;
 	u32 rb_bufsz;
-	u32 wb_offset;
 	u32 doorbell;
 	u32 doorbell_offset;
 	u32 temp;
@@ -718,7 +717,6 @@ static int sdma_v5_0_gfx_resume(struct amdgpu_device *adev)
 
 	for (i = 0; i < adev->sdma.num_instances; i++) {
 		ring = &adev->sdma.instance[i].ring;
-		wb_offset = (ring->rptr_offs * 4);
 
 		if (!amdgpu_sriov_vf(adev))
 			WREG32(sdma_v5_0_get_reg_offset(adev, i, mmSDMA0_SEM_WAIT_FAIL_TIMER_CNTL), 0);
@@ -741,7 +739,7 @@ static int sdma_v5_0_gfx_resume(struct amdgpu_device *adev)
 		WREG32_SOC15_IP(GC, sdma_v5_0_get_reg_offset(adev, i, mmSDMA0_GFX_RB_WPTR_HI), 0);
 
 		/* setup the wptr shadow polling */
-		wptr_gpu_addr = adev->wb.gpu_addr + (ring->wptr_offs * 4);
+		wptr_gpu_addr = ring->wptr_gpu_addr;
 		WREG32_SOC15_IP(GC, sdma_v5_0_get_reg_offset(adev, i, mmSDMA0_GFX_RB_WPTR_POLL_ADDR_LO),
 		       lower_32_bits(wptr_gpu_addr));
 		WREG32_SOC15_IP(GC, sdma_v5_0_get_reg_offset(adev, i, mmSDMA0_GFX_RB_WPTR_POLL_ADDR_HI),
@@ -756,9 +754,9 @@ static int sdma_v5_0_gfx_resume(struct amdgpu_device *adev)
 
 		/* set the wb address whether it's enabled or not */
 		WREG32_SOC15_IP(GC, sdma_v5_0_get_reg_offset(adev, i, mmSDMA0_GFX_RB_RPTR_ADDR_HI),
-		       upper_32_bits(adev->wb.gpu_addr + wb_offset) & 0xFFFFFFFF);
+		       upper_32_bits(ring->rptr_gpu_addr) & 0xFFFFFFFF);
 		WREG32_SOC15_IP(GC, sdma_v5_0_get_reg_offset(adev, i, mmSDMA0_GFX_RB_RPTR_ADDR_LO),
-		       lower_32_bits(adev->wb.gpu_addr + wb_offset) & 0xFFFFFFFC);
+		       lower_32_bits(ring->rptr_gpu_addr) & 0xFFFFFFFC);
 
 		rb_cntl = REG_SET_FIELD(rb_cntl, SDMA0_GFX_RB_CNTL, RPTR_WRITEBACK_ENABLE, 1);
 
diff --git a/drivers/gpu/drm/amd/amdgpu/sdma_v5_2.c b/drivers/gpu/drm/amd/amdgpu/sdma_v5_2.c
index 8298926f8502..bf2cf95cbf8f 100644
--- a/drivers/gpu/drm/amd/amdgpu/sdma_v5_2.c
+++ b/drivers/gpu/drm/amd/amdgpu/sdma_v5_2.c
@@ -248,7 +248,7 @@ static uint64_t sdma_v5_2_ring_get_rptr(struct amdgpu_ring *ring)
 	u64 *rptr;
 
 	/* XXX check if swapping is necessary on BE */
-	rptr = ((u64 *)&ring->adev->wb.wb[ring->rptr_offs]);
+	rptr = (u64 *)ring->rptr_cpu_addr;
 
 	DRM_DEBUG("rptr before shift == 0x%016llx\n", *rptr);
 	return ((*rptr) >> 2);
@@ -268,7 +268,7 @@ static uint64_t sdma_v5_2_ring_get_wptr(struct amdgpu_ring *ring)
 
 	if (ring->use_doorbell) {
 		/* XXX check if swapping is necessary on BE */
-		wptr = READ_ONCE(*((u64 *)&adev->wb.wb[ring->wptr_offs]));
+		wptr = READ_ONCE(*((u64 *)ring->wptr_cpu_addr));
 		DRM_DEBUG("wptr/doorbell before shift == 0x%016llx\n", wptr);
 	} else {
 		wptr = RREG32(sdma_v5_2_get_reg_offset(adev, ring->me, mmSDMA0_GFX_RB_WPTR_HI));
@@ -301,8 +301,8 @@ static void sdma_v5_2_ring_set_wptr(struct amdgpu_ring *ring)
 				lower_32_bits(ring->wptr << 2),
 				upper_32_bits(ring->wptr << 2));
 		/* XXX check if swapping is necessary on BE */
-		adev->wb.wb[ring->wptr_offs] = lower_32_bits(ring->wptr << 2);
-		adev->wb.wb[ring->wptr_offs + 1] = upper_32_bits(ring->wptr << 2);
+		atomic64_set((atomic64_t *)ring->wptr_cpu_addr,
+			     ring->wptr << 2);
 		DRM_DEBUG("calling WDOORBELL64(0x%08x, 0x%016llx)\n",
 				ring->doorbell_index, ring->wptr << 2);
 		WDOORBELL64(ring->doorbell_index, ring->wptr << 2);
@@ -609,7 +609,6 @@ static int sdma_v5_2_gfx_resume(struct amdgpu_device *adev)
 	struct amdgpu_ring *ring;
 	u32 rb_cntl, ib_cntl;
 	u32 rb_bufsz;
-	u32 wb_offset;
 	u32 doorbell;
 	u32 doorbell_offset;
 	u32 temp;
@@ -619,7 +618,6 @@ static int sdma_v5_2_gfx_resume(struct amdgpu_device *adev)
 
 	for (i = 0; i < adev->sdma.num_instances; i++) {
 		ring = &adev->sdma.instance[i].ring;
-		wb_offset = (ring->rptr_offs * 4);
 
 		if (!amdgpu_sriov_vf(adev))
 			WREG32_SOC15_IP(GC, sdma_v5_2_get_reg_offset(adev, i, mmSDMA0_SEM_WAIT_FAIL_TIMER_CNTL), 0);
@@ -642,7 +640,7 @@ static int sdma_v5_2_gfx_resume(struct amdgpu_device *adev)
 		WREG32_SOC15_IP(GC, sdma_v5_2_get_reg_offset(adev, i, mmSDMA0_GFX_RB_WPTR_HI), 0);
 
 		/* setup the wptr shadow polling */
-		wptr_gpu_addr = adev->wb.gpu_addr + (ring->wptr_offs * 4);
+		wptr_gpu_addr = ring->wptr_gpu_addr;
 		WREG32_SOC15_IP(GC, sdma_v5_2_get_reg_offset(adev, i, mmSDMA0_GFX_RB_WPTR_POLL_ADDR_LO),
 		       lower_32_bits(wptr_gpu_addr));
 		WREG32_SOC15_IP(GC, sdma_v5_2_get_reg_offset(adev, i, mmSDMA0_GFX_RB_WPTR_POLL_ADDR_HI),
@@ -657,9 +655,9 @@ static int sdma_v5_2_gfx_resume(struct amdgpu_device *adev)
 
 		/* set the wb address whether it's enabled or not */
 		WREG32_SOC15_IP(GC, sdma_v5_2_get_reg_offset(adev, i, mmSDMA0_GFX_RB_RPTR_ADDR_HI),
-		       upper_32_bits(adev->wb.gpu_addr + wb_offset) & 0xFFFFFFFF);
+		       upper_32_bits(ring->rptr_gpu_addr) & 0xFFFFFFFF);
 		WREG32_SOC15_IP(GC, sdma_v5_2_get_reg_offset(adev, i, mmSDMA0_GFX_RB_RPTR_ADDR_LO),
-		       lower_32_bits(adev->wb.gpu_addr + wb_offset) & 0xFFFFFFFC);
+		       lower_32_bits(ring->rptr_gpu_addr) & 0xFFFFFFFC);
 
 		rb_cntl = REG_SET_FIELD(rb_cntl, SDMA0_GFX_RB_CNTL, RPTR_WRITEBACK_ENABLE, 1);
 
diff --git a/drivers/gpu/drm/amd/amdgpu/si_dma.c b/drivers/gpu/drm/amd/amdgpu/si_dma.c
index 2f95235bbfb3..f675111ace20 100644
--- a/drivers/gpu/drm/amd/amdgpu/si_dma.c
+++ b/drivers/gpu/drm/amd/amdgpu/si_dma.c
@@ -40,7 +40,7 @@ static void si_dma_set_irq_funcs(struct amdgpu_device *adev);
 
 static uint64_t si_dma_ring_get_rptr(struct amdgpu_ring *ring)
 {
-	return ring->adev->wb.wb[ring->rptr_offs>>2];
+	return *ring->rptr_cpu_addr;
 }
 
 static uint64_t si_dma_ring_get_wptr(struct amdgpu_ring *ring)
@@ -153,7 +153,7 @@ static int si_dma_start(struct amdgpu_device *adev)
 		WREG32(DMA_RB_RPTR + sdma_offsets[i], 0);
 		WREG32(DMA_RB_WPTR + sdma_offsets[i], 0);
 
-		rptr_addr = adev->wb.gpu_addr + (ring->rptr_offs * 4);
+		rptr_addr = ring->rptr_gpu_addr;
 
 		WREG32(DMA_RB_RPTR_ADDR_LO + sdma_offsets[i], lower_32_bits(rptr_addr));
 		WREG32(DMA_RB_RPTR_ADDR_HI + sdma_offsets[i], upper_32_bits(rptr_addr) & 0xFF);
diff --git a/drivers/gpu/drm/amd/amdgpu/uvd_v7_0.c b/drivers/gpu/drm/amd/amdgpu/uvd_v7_0.c
index 2f15b8e0f7d7..e668b3baa8c6 100644
--- a/drivers/gpu/drm/amd/amdgpu/uvd_v7_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/uvd_v7_0.c
@@ -118,7 +118,7 @@ static uint64_t uvd_v7_0_enc_ring_get_wptr(struct amdgpu_ring *ring)
 	struct amdgpu_device *adev = ring->adev;
 
 	if (ring->use_doorbell)
-		return adev->wb.wb[ring->wptr_offs];
+		return *ring->wptr_cpu_addr;
 
 	if (ring == &adev->uvd.inst[ring->me].ring_enc[0])
 		return RREG32_SOC15(UVD, ring->me, mmUVD_RB_WPTR);
@@ -153,7 +153,7 @@ static void uvd_v7_0_enc_ring_set_wptr(struct amdgpu_ring *ring)
 
 	if (ring->use_doorbell) {
 		/* XXX check if swapping is necessary on BE */
-		adev->wb.wb[ring->wptr_offs] = lower_32_bits(ring->wptr);
+		*ring->wptr_cpu_addr = lower_32_bits(ring->wptr);
 		WDOORBELL32(ring->doorbell_index, lower_32_bits(ring->wptr));
 		return;
 	}
@@ -754,7 +754,7 @@ static int uvd_v7_0_mmsch_start(struct amdgpu_device *adev,
 		if (adev->uvd.harvest_config & (1 << i))
 			continue;
 		WDOORBELL32(adev->uvd.inst[i].ring_enc[0].doorbell_index, 0);
-		adev->wb.wb[adev->uvd.inst[i].ring_enc[0].wptr_offs] = 0;
+		*adev->uvd.inst[i].ring_enc[0].wptr_cpu_addr = 0;
 		adev->uvd.inst[i].ring_enc[0].wptr = 0;
 		adev->uvd.inst[i].ring_enc[0].wptr_old = 0;
 	}
diff --git a/drivers/gpu/drm/amd/amdgpu/vce_v4_0.c b/drivers/gpu/drm/amd/amdgpu/vce_v4_0.c
index d1fc4e0b8265..66cd3d11aa4b 100644
--- a/drivers/gpu/drm/amd/amdgpu/vce_v4_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/vce_v4_0.c
@@ -83,7 +83,7 @@ static uint64_t vce_v4_0_ring_get_wptr(struct amdgpu_ring *ring)
 	struct amdgpu_device *adev = ring->adev;
 
 	if (ring->use_doorbell)
-		return adev->wb.wb[ring->wptr_offs];
+		return *ring->wptr_cpu_addr;
 
 	if (ring->me == 0)
 		return RREG32(SOC15_REG_OFFSET(VCE, 0, mmVCE_RB_WPTR));
@@ -106,7 +106,7 @@ static void vce_v4_0_ring_set_wptr(struct amdgpu_ring *ring)
 
 	if (ring->use_doorbell) {
 		/* XXX check if swapping is necessary on BE */
-		adev->wb.wb[ring->wptr_offs] = lower_32_bits(ring->wptr);
+		*ring->wptr_cpu_addr = lower_32_bits(ring->wptr);
 		WDOORBELL32(ring->doorbell_index, lower_32_bits(ring->wptr));
 		return;
 	}
@@ -177,7 +177,7 @@ static int vce_v4_0_mmsch_start(struct amdgpu_device *adev,
 	WREG32(SOC15_REG_OFFSET(VCE, 0, mmVCE_MMSCH_VF_MAILBOX_RESP), 0);
 
 	WDOORBELL32(adev->vce.ring[0].doorbell_index, 0);
-	adev->wb.wb[adev->vce.ring[0].wptr_offs] = 0;
+	*adev->vce.ring[0].wptr_cpu_addr = 0;
 	adev->vce.ring[0].wptr = 0;
 	adev->vce.ring[0].wptr_old = 0;
 
diff --git a/drivers/gpu/drm/amd/amdgpu/vcn_v2_0.c b/drivers/gpu/drm/amd/amdgpu/vcn_v2_0.c
index 7a7f35e83dd5..8421044d5629 100644
--- a/drivers/gpu/drm/amd/amdgpu/vcn_v2_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/vcn_v2_0.c
@@ -1336,7 +1336,7 @@ static uint64_t vcn_v2_0_dec_ring_get_wptr(struct amdgpu_ring *ring)
 	struct amdgpu_device *adev = ring->adev;
 
 	if (ring->use_doorbell)
-		return adev->wb.wb[ring->wptr_offs];
+		return *ring->wptr_cpu_addr;
 	else
 		return RREG32_SOC15(UVD, 0, mmUVD_RBC_RB_WPTR);
 }
@@ -1357,7 +1357,7 @@ static void vcn_v2_0_dec_ring_set_wptr(struct amdgpu_ring *ring)
 			lower_32_bits(ring->wptr) | 0x80000000);
 
 	if (ring->use_doorbell) {
-		adev->wb.wb[ring->wptr_offs] = lower_32_bits(ring->wptr);
+		*ring->wptr_cpu_addr = lower_32_bits(ring->wptr);
 		WDOORBELL32(ring->doorbell_index, lower_32_bits(ring->wptr));
 	} else {
 		WREG32_SOC15(UVD, 0, mmUVD_RBC_RB_WPTR, lower_32_bits(ring->wptr));
@@ -1565,12 +1565,12 @@ static uint64_t vcn_v2_0_enc_ring_get_wptr(struct amdgpu_ring *ring)
 
 	if (ring == &adev->vcn.inst->ring_enc[0]) {
 		if (ring->use_doorbell)
-			return adev->wb.wb[ring->wptr_offs];
+			return *ring->wptr_cpu_addr;
 		else
 			return RREG32_SOC15(UVD, 0, mmUVD_RB_WPTR);
 	} else {
 		if (ring->use_doorbell)
-			return adev->wb.wb[ring->wptr_offs];
+			return *ring->wptr_cpu_addr;
 		else
 			return RREG32_SOC15(UVD, 0, mmUVD_RB_WPTR2);
 	}
@@ -1589,14 +1589,14 @@ static void vcn_v2_0_enc_ring_set_wptr(struct amdgpu_ring *ring)
 
 	if (ring == &adev->vcn.inst->ring_enc[0]) {
 		if (ring->use_doorbell) {
-			adev->wb.wb[ring->wptr_offs] = lower_32_bits(ring->wptr);
+			*ring->wptr_cpu_addr = lower_32_bits(ring->wptr);
 			WDOORBELL32(ring->doorbell_index, lower_32_bits(ring->wptr));
 		} else {
 			WREG32_SOC15(UVD, 0, mmUVD_RB_WPTR, lower_32_bits(ring->wptr));
 		}
 	} else {
 		if (ring->use_doorbell) {
-			adev->wb.wb[ring->wptr_offs] = lower_32_bits(ring->wptr);
+			*ring->wptr_cpu_addr = lower_32_bits(ring->wptr);
 			WDOORBELL32(ring->doorbell_index, lower_32_bits(ring->wptr));
 		} else {
 			WREG32_SOC15(UVD, 0, mmUVD_RB_WPTR2, lower_32_bits(ring->wptr));
diff --git a/drivers/gpu/drm/amd/amdgpu/vcn_v2_5.c b/drivers/gpu/drm/amd/amdgpu/vcn_v2_5.c
index 17d44be58877..9352d07539b9 100644
--- a/drivers/gpu/drm/amd/amdgpu/vcn_v2_5.c
+++ b/drivers/gpu/drm/amd/amdgpu/vcn_v2_5.c
@@ -1491,7 +1491,7 @@ static uint64_t vcn_v2_5_dec_ring_get_wptr(struct amdgpu_ring *ring)
 	struct amdgpu_device *adev = ring->adev;
 
 	if (ring->use_doorbell)
-		return adev->wb.wb[ring->wptr_offs];
+		return *ring->wptr_cpu_addr;
 	else
 		return RREG32_SOC15(VCN, ring->me, mmUVD_RBC_RB_WPTR);
 }
@@ -1508,7 +1508,7 @@ static void vcn_v2_5_dec_ring_set_wptr(struct amdgpu_ring *ring)
 	struct amdgpu_device *adev = ring->adev;
 
 	if (ring->use_doorbell) {
-		adev->wb.wb[ring->wptr_offs] = lower_32_bits(ring->wptr);
+		*ring->wptr_cpu_addr = lower_32_bits(ring->wptr);
 		WDOORBELL32(ring->doorbell_index, lower_32_bits(ring->wptr));
 	} else {
 		WREG32_SOC15(VCN, ring->me, mmUVD_RBC_RB_WPTR, lower_32_bits(ring->wptr));
@@ -1607,12 +1607,12 @@ static uint64_t vcn_v2_5_enc_ring_get_wptr(struct amdgpu_ring *ring)
 
 	if (ring == &adev->vcn.inst[ring->me].ring_enc[0]) {
 		if (ring->use_doorbell)
-			return adev->wb.wb[ring->wptr_offs];
+			return *ring->wptr_cpu_addr;
 		else
 			return RREG32_SOC15(VCN, ring->me, mmUVD_RB_WPTR);
 	} else {
 		if (ring->use_doorbell)
-			return adev->wb.wb[ring->wptr_offs];
+			return *ring->wptr_cpu_addr;
 		else
 			return RREG32_SOC15(VCN, ring->me, mmUVD_RB_WPTR2);
 	}
@@ -1631,14 +1631,14 @@ static void vcn_v2_5_enc_ring_set_wptr(struct amdgpu_ring *ring)
 
 	if (ring == &adev->vcn.inst[ring->me].ring_enc[0]) {
 		if (ring->use_doorbell) {
-			adev->wb.wb[ring->wptr_offs] = lower_32_bits(ring->wptr);
+			*ring->wptr_cpu_addr = lower_32_bits(ring->wptr);
 			WDOORBELL32(ring->doorbell_index, lower_32_bits(ring->wptr));
 		} else {
 			WREG32_SOC15(VCN, ring->me, mmUVD_RB_WPTR, lower_32_bits(ring->wptr));
 		}
 	} else {
 		if (ring->use_doorbell) {
-			adev->wb.wb[ring->wptr_offs] = lower_32_bits(ring->wptr);
+			*ring->wptr_cpu_addr = lower_32_bits(ring->wptr);
 			WDOORBELL32(ring->doorbell_index, lower_32_bits(ring->wptr));
 		} else {
 			WREG32_SOC15(VCN, ring->me, mmUVD_RB_WPTR2, lower_32_bits(ring->wptr));
diff --git a/drivers/gpu/drm/amd/amdgpu/vcn_v3_0.c b/drivers/gpu/drm/amd/amdgpu/vcn_v3_0.c
index cb5f0a12333f..19cdad38d134 100644
--- a/drivers/gpu/drm/amd/amdgpu/vcn_v3_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/vcn_v3_0.c
@@ -1695,7 +1695,7 @@ static uint64_t vcn_v3_0_dec_ring_get_wptr(struct amdgpu_ring *ring)
 	struct amdgpu_device *adev = ring->adev;
 
 	if (ring->use_doorbell)
-		return adev->wb.wb[ring->wptr_offs];
+		return *ring->wptr_cpu_addr;
 	else
 		return RREG32_SOC15(VCN, ring->me, mmUVD_RBC_RB_WPTR);
 }
@@ -1721,7 +1721,7 @@ static void vcn_v3_0_dec_ring_set_wptr(struct amdgpu_ring *ring)
 	}
 
 	if (ring->use_doorbell) {
-		adev->wb.wb[ring->wptr_offs] = lower_32_bits(ring->wptr);
+		*ring->wptr_cpu_addr = lower_32_bits(ring->wptr);
 		WDOORBELL32(ring->doorbell_index, lower_32_bits(ring->wptr));
 	} else {
 		WREG32_SOC15(VCN, ring->me, mmUVD_RBC_RB_WPTR, lower_32_bits(ring->wptr));
@@ -2012,12 +2012,12 @@ static uint64_t vcn_v3_0_enc_ring_get_wptr(struct amdgpu_ring *ring)
 
 	if (ring == &adev->vcn.inst[ring->me].ring_enc[0]) {
 		if (ring->use_doorbell)
-			return adev->wb.wb[ring->wptr_offs];
+			return *ring->wptr_cpu_addr;
 		else
 			return RREG32_SOC15(VCN, ring->me, mmUVD_RB_WPTR);
 	} else {
 		if (ring->use_doorbell)
-			return adev->wb.wb[ring->wptr_offs];
+			return *ring->wptr_cpu_addr;
 		else
 			return RREG32_SOC15(VCN, ring->me, mmUVD_RB_WPTR2);
 	}
@@ -2036,14 +2036,14 @@ static void vcn_v3_0_enc_ring_set_wptr(struct amdgpu_ring *ring)
 
 	if (ring == &adev->vcn.inst[ring->me].ring_enc[0]) {
 		if (ring->use_doorbell) {
-			adev->wb.wb[ring->wptr_offs] = lower_32_bits(ring->wptr);
+			*ring->wptr_cpu_addr = lower_32_bits(ring->wptr);
 			WDOORBELL32(ring->doorbell_index, lower_32_bits(ring->wptr));
 		} else {
 			WREG32_SOC15(VCN, ring->me, mmUVD_RB_WPTR, lower_32_bits(ring->wptr));
 		}
 	} else {
 		if (ring->use_doorbell) {
-			adev->wb.wb[ring->wptr_offs] = lower_32_bits(ring->wptr);
+			*ring->wptr_cpu_addr = lower_32_bits(ring->wptr);
 			WDOORBELL32(ring->doorbell_index, lower_32_bits(ring->wptr));
 		} else {
 			WREG32_SOC15(VCN, ring->me, mmUVD_RB_WPTR2, lower_32_bits(ring->wptr));
-- 
2.35.1



* [PATCH 07/73] drm/amdgpu: initialize/finalize the ring for mes queue
  2022-04-29 17:45 [PATCH 00/73] MES support Alex Deucher
                   ` (5 preceding siblings ...)
  2022-04-29 17:45 ` [PATCH 06/73] drm/amdgpu: use ring structure to access rptr/wptr v2 Alex Deucher
@ 2022-04-29 17:45 ` Alex Deucher
  2022-04-29 17:45 ` [PATCH 08/73] drm/amdgpu: assign the cpu/gpu address of fence from ring Alex Deucher
                   ` (65 subsequent siblings)
  72 siblings, 0 replies; 74+ messages in thread
From: Alex Deucher @ 2022-04-29 17:45 UTC (permalink / raw)
  To: amd-gfx; +Cc: Alex Deucher, Jack Xiao, Christian König, Hawking Zhang

From: Jack Xiao <Jack.Xiao@amd.com>

Initialize/finalize the ring for a MES queue, which submits the command
stream to the MES-managed hardware queue.
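
For illustration only, a minimal user-space sketch of the address selection
that the amdgpu_ring_get_gpu_addr() macro in this patch performs. The
demo_ring struct and demo_ring_get_gpu_addr() are hypothetical stand-ins,
not driver code. It assumes, as the macro suggests, that a MES queue's
offset is a byte offset into the per-context metadata, while a normal
ring's offset is a dword index into the shared writeback buffer.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the few ring fields the macro reads. */
struct demo_ring {
	bool is_mes_queue;
	uint64_t wb_gpu_addr;   /* base of the shared writeback buffer */
	uint64_t meta_gpu_addr; /* base of the per-context metadata */
};

/*
 * MES queue: offset is in bytes, relative to the metadata base.
 * Normal ring: offset is a dword index, so scale it by 4.
 */
static uint64_t demo_ring_get_gpu_addr(const struct demo_ring *ring,
				       uint32_t offset)
{
	assert(ring != NULL);

	return ring->is_mes_queue ?
		ring->meta_gpu_addr + offset :
		ring->wb_gpu_addr + (uint64_t)offset * 4;
}
```

With the same numeric offset, the two ring kinds resolve to different
bases and different scaling, which is why the driver caches the resolved
rptr/wptr/fence addresses in the ring instead of recomputing them at each
call site.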

Signed-off-by: Jack Xiao <Jack.Xiao@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c | 145 ++++++++++++++++-------
 1 file changed, 104 insertions(+), 41 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c
index 773954318216..13db99d653bd 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c
@@ -149,6 +149,16 @@ void amdgpu_ring_undo(struct amdgpu_ring *ring)
 		ring->funcs->end_use(ring);
 }
 
+#define amdgpu_ring_get_gpu_addr(ring, offset)				\
+	(ring->is_mes_queue ?						\
+	 (ring->mes_ctx->meta_data_gpu_addr + offset) :			\
+	 (ring->adev->wb.gpu_addr + offset * 4))
+
+#define amdgpu_ring_get_cpu_addr(ring, offset)				\
+	(ring->is_mes_queue ?						\
+	 (void *)((uint8_t *)(ring->mes_ctx->meta_data_ptr) + offset) : \
+	 (&ring->adev->wb.wb[offset]))
+
 /**
  * amdgpu_ring_init - init driver ring struct.
  *
@@ -189,51 +199,88 @@ int amdgpu_ring_init(struct amdgpu_device *adev, struct amdgpu_ring *ring,
 			return -EINVAL;
 
 		ring->adev = adev;
-		ring->idx = adev->num_rings++;
-		adev->rings[ring->idx] = ring;
 		ring->num_hw_submission = sched_hw_submission;
 		ring->sched_score = sched_score;
 		ring->vmid_wait = dma_fence_get_stub();
+
+		if (!ring->is_mes_queue) {
+			ring->idx = adev->num_rings++;
+			adev->rings[ring->idx] = ring;
+		}
+
 		r = amdgpu_fence_driver_init_ring(ring);
 		if (r)
 			return r;
 	}
 
-	r = amdgpu_device_wb_get(adev, &ring->rptr_offs);
-	if (r) {
-		dev_err(adev->dev, "(%d) ring rptr_offs wb alloc failed\n", r);
-		return r;
-	}
+	if (ring->is_mes_queue) {
+		ring->rptr_offs = amdgpu_mes_ctx_get_offs(ring,
+				AMDGPU_MES_CTX_RPTR_OFFS);
+		ring->wptr_offs = amdgpu_mes_ctx_get_offs(ring,
+				AMDGPU_MES_CTX_WPTR_OFFS);
+		ring->fence_offs = amdgpu_mes_ctx_get_offs(ring,
+				AMDGPU_MES_CTX_FENCE_OFFS);
+		ring->trail_fence_offs = amdgpu_mes_ctx_get_offs(ring,
+				AMDGPU_MES_CTX_TRAIL_FENCE_OFFS);
+		ring->cond_exe_offs = amdgpu_mes_ctx_get_offs(ring,
+				AMDGPU_MES_CTX_COND_EXE_OFFS);
+	} else {
+		r = amdgpu_device_wb_get(adev, &ring->rptr_offs);
+		if (r) {
+			dev_err(adev->dev, "(%d) ring rptr_offs wb alloc failed\n", r);
+			return r;
+		}
 
-	r = amdgpu_device_wb_get(adev, &ring->wptr_offs);
-	if (r) {
-		dev_err(adev->dev, "(%d) ring wptr_offs wb alloc failed\n", r);
-		return r;
-	}
+		r = amdgpu_device_wb_get(adev, &ring->wptr_offs);
+		if (r) {
+			dev_err(adev->dev, "(%d) ring wptr_offs wb alloc failed\n", r);
+			return r;
+		}
 
-	r = amdgpu_device_wb_get(adev, &ring->fence_offs);
-	if (r) {
-		dev_err(adev->dev, "(%d) ring fence_offs wb alloc failed\n", r);
-		return r;
-	}
+		r = amdgpu_device_wb_get(adev, &ring->fence_offs);
+		if (r) {
+			dev_err(adev->dev, "(%d) ring fence_offs wb alloc failed\n", r);
+			return r;
+		}
 
-	r = amdgpu_device_wb_get(adev, &ring->trail_fence_offs);
-	if (r) {
-		dev_err(adev->dev,
-			"(%d) ring trail_fence_offs wb alloc failed\n", r);
-		return r;
+		r = amdgpu_device_wb_get(adev, &ring->trail_fence_offs);
+		if (r) {
+			dev_err(adev->dev, "(%d) ring trail_fence_offs wb alloc failed\n", r);
+			return r;
+		}
+
+		r = amdgpu_device_wb_get(adev, &ring->cond_exe_offs);
+		if (r) {
+			dev_err(adev->dev, "(%d) ring cond_exec_polling wb alloc failed\n", r);
+			return r;
+		}
 	}
+
+	ring->fence_gpu_addr =
+		amdgpu_ring_get_gpu_addr(ring, ring->fence_offs);
+	ring->fence_cpu_addr =
+		amdgpu_ring_get_cpu_addr(ring, ring->fence_offs);
+
+	ring->rptr_gpu_addr =
+		amdgpu_ring_get_gpu_addr(ring, ring->rptr_offs);
+	ring->rptr_cpu_addr =
+		amdgpu_ring_get_cpu_addr(ring, ring->rptr_offs);
+
+	ring->wptr_gpu_addr =
+		amdgpu_ring_get_gpu_addr(ring, ring->wptr_offs);
+	ring->wptr_cpu_addr =
+		amdgpu_ring_get_cpu_addr(ring, ring->wptr_offs);
+
 	ring->trail_fence_gpu_addr =
-		adev->wb.gpu_addr + (ring->trail_fence_offs * 4);
-	ring->trail_fence_cpu_addr = &adev->wb.wb[ring->trail_fence_offs];
+		amdgpu_ring_get_gpu_addr(ring, ring->trail_fence_offs);
+	ring->trail_fence_cpu_addr =
+		amdgpu_ring_get_cpu_addr(ring, ring->trail_fence_offs);
+
+	ring->cond_exe_gpu_addr =
+		amdgpu_ring_get_gpu_addr(ring, ring->cond_exe_offs);
+	ring->cond_exe_cpu_addr =
+		amdgpu_ring_get_cpu_addr(ring, ring->cond_exe_offs);
 
-	r = amdgpu_device_wb_get(adev, &ring->cond_exe_offs);
-	if (r) {
-		dev_err(adev->dev, "(%d) ring cond_exec_polling wb alloc failed\n", r);
-		return r;
-	}
-	ring->cond_exe_gpu_addr = adev->wb.gpu_addr + (ring->cond_exe_offs * 4);
-	ring->cond_exe_cpu_addr = &adev->wb.wb[ring->cond_exe_offs];
 	/* always set cond_exec_polling to CONTINUE */
 	*ring->cond_exe_cpu_addr = 1;
 
@@ -248,8 +295,20 @@ int amdgpu_ring_init(struct amdgpu_device *adev, struct amdgpu_ring *ring,
 	ring->buf_mask = (ring->ring_size / 4) - 1;
 	ring->ptr_mask = ring->funcs->support_64bit_ptrs ?
 		0xffffffffffffffff : ring->buf_mask;
+
 	/* Allocate ring buffer */
-	if (ring->ring_obj == NULL) {
+	if (ring->is_mes_queue) {
+		int offset = 0;
+
+		BUG_ON(ring->ring_size > PAGE_SIZE*4);
+
+		offset = amdgpu_mes_ctx_get_offs(ring,
+					 AMDGPU_MES_CTX_RING_OFFS);
+		ring->gpu_addr = amdgpu_mes_ctx_get_offs_gpu_addr(ring, offset);
+		ring->ring = amdgpu_mes_ctx_get_offs_cpu_addr(ring, offset);
+		amdgpu_ring_clear_ring(ring);
+
+	} else if (ring->ring_obj == NULL) {
 		r = amdgpu_bo_create_kernel(adev, ring->ring_size + ring->funcs->extra_dw, PAGE_SIZE,
 					    AMDGPU_GEM_DOMAIN_GTT,
 					    &ring->ring_obj,
@@ -286,26 +345,30 @@ void amdgpu_ring_fini(struct amdgpu_ring *ring)
 {
 
 	/* Not to finish a ring which is not initialized */
-	if (!(ring->adev) || !(ring->adev->rings[ring->idx]))
+	if (!(ring->adev) ||
+	    (!ring->is_mes_queue && !(ring->adev->rings[ring->idx])))
 		return;
 
 	ring->sched.ready = false;
 
-	amdgpu_device_wb_free(ring->adev, ring->rptr_offs);
-	amdgpu_device_wb_free(ring->adev, ring->wptr_offs);
+	if (!ring->is_mes_queue) {
+		amdgpu_device_wb_free(ring->adev, ring->rptr_offs);
+		amdgpu_device_wb_free(ring->adev, ring->wptr_offs);
 
-	amdgpu_device_wb_free(ring->adev, ring->cond_exe_offs);
-	amdgpu_device_wb_free(ring->adev, ring->fence_offs);
+		amdgpu_device_wb_free(ring->adev, ring->cond_exe_offs);
+		amdgpu_device_wb_free(ring->adev, ring->fence_offs);
 
-	amdgpu_bo_free_kernel(&ring->ring_obj,
-			      &ring->gpu_addr,
-			      (void **)&ring->ring);
+		amdgpu_bo_free_kernel(&ring->ring_obj,
+				      &ring->gpu_addr,
+				      (void **)&ring->ring);
+	}
 
 	dma_fence_put(ring->vmid_wait);
 	ring->vmid_wait = NULL;
 	ring->me = 0;
 
-	ring->adev->rings[ring->idx] = NULL;
+	if (!ring->is_mes_queue)
+		ring->adev->rings[ring->idx] = NULL;
 }
 
 /**
-- 
2.35.1



* [PATCH 08/73] drm/amdgpu: assign the cpu/gpu address of fence from ring
  2022-04-29 17:45 [PATCH 00/73] MES support Alex Deucher
                   ` (6 preceding siblings ...)
  2022-04-29 17:45 ` [PATCH 07/73] drm/amdgpu: initialize/finalize the ring for mes queue Alex Deucher
@ 2022-04-29 17:45 ` Alex Deucher
  2022-04-29 17:45 ` [PATCH 09/73] drm/amdgpu/gfx10: implement mqd functions of gfx/compute eng v2 Alex Deucher
                   ` (64 subsequent siblings)
  72 siblings, 0 replies; 74+ messages in thread
From: Alex Deucher @ 2022-04-29 17:45 UTC (permalink / raw)
  To: amd-gfx; +Cc: Alex Deucher, Jack Xiao, Christian König, Hawking Zhang

From: Jack Xiao <Jack.Xiao@amd.com>

Assign the cpu/gpu address of the fence for a normal or mes ring
from the ring structure.

Signed-off-by: Jack Xiao <Jack.Xiao@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c
index 5d13ed376ab4..d16c8c1f72db 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c
@@ -422,8 +422,8 @@ int amdgpu_fence_driver_start_ring(struct amdgpu_ring *ring,
 	uint64_t index;
 
 	if (ring->funcs->type != AMDGPU_RING_TYPE_UVD) {
-		ring->fence_drv.cpu_addr = &adev->wb.wb[ring->fence_offs];
-		ring->fence_drv.gpu_addr = adev->wb.gpu_addr + (ring->fence_offs * 4);
+		ring->fence_drv.cpu_addr = ring->fence_cpu_addr;
+		ring->fence_drv.gpu_addr = ring->fence_gpu_addr;
 	} else {
 		/* put fence directly behind firmware */
 		index = ALIGN(adev->uvd.fw->size, 8);
-- 
2.35.1



* [PATCH 09/73] drm/amdgpu/gfx10: implement mqd functions of gfx/compute eng v2
  2022-04-29 17:45 [PATCH 00/73] MES support Alex Deucher
                   ` (7 preceding siblings ...)
  2022-04-29 17:45 ` [PATCH 08/73] drm/amdgpu: assign the cpu/gpu address of fence from ring Alex Deucher
@ 2022-04-29 17:45 ` Alex Deucher
  2022-04-29 17:45 ` [PATCH 10/73] drm/amdgpu/gfx10: use per ctx CSA for ce metadata Alex Deucher
                   ` (63 subsequent siblings)
  72 siblings, 0 replies; 74+ messages in thread
From: Alex Deucher @ 2022-04-29 17:45 UTC (permalink / raw)
  To: amd-gfx; +Cc: Alex Deucher, Jack Xiao, Christian König, Hawking Zhang

From: Jack Xiao <Jack.Xiao@amd.com>

Refine the existing gfx/compute mqd functions so that they take a
property struct instead of a ring, and add them to the engine mqd layer.

v2: rebase fix.
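
For illustration only, a small user-space sketch of the shape of this
refactor: the init routine fills an MQD purely from a property struct and
no longer reads the ring. The demo_mqd_prop and demo_mqd structs, their
field names and demo_order_base_2() are hypothetical stand-ins, not the
driver's types. The masks and the buffer-size formula mirror the ones
visible in the diff below.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical subset of the queue properties passed to the init hook. */
struct demo_mqd_prop {
	uint64_t rptr_gpu_addr; /* where the hardware reports the read pointer */
	uint32_t queue_size;    /* ring size in bytes */
};

/* Hypothetical subset of an MQD image. */
struct demo_mqd {
	uint32_t rptr_addr_lo;
	uint32_t rptr_addr_hi;
	uint32_t rb_bufsz;
};

/* Smallest order such that (1 << order) >= n; returns 0 for n == 0. */
static uint32_t demo_order_base_2(uint32_t n)
{
	uint32_t order = 0;

	assert(n <= (1u << 31));
	while ((1u << order) < n)
		order++;
	return order;
}

/* Fill the MQD from the properties alone; no ring pointer is needed. */
static void demo_mqd_init(struct demo_mqd *mqd,
			  const struct demo_mqd_prop *prop)
{
	assert(mqd != NULL && prop != NULL);

	/* low word is dword aligned, high word keeps 16 address bits */
	mqd->rptr_addr_lo = (uint32_t)prop->rptr_gpu_addr & 0xfffffffc;
	mqd->rptr_addr_hi = (uint32_t)(prop->rptr_gpu_addr >> 32) & 0xffff;
	/* size field is log2 of the dword count, minus one */
	mqd->rb_bufsz = demo_order_base_2(prop->queue_size / 4) - 1;
}
```

Because the routine depends only on the property struct, the same code
path can presumably serve both driver-owned rings and MES-managed queues,
provided the caller fills in the properties for each.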

Signed-off-by: Jack Xiao <Jack.Xiao@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
---
 drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c | 111 +++++++++++++------------
 1 file changed, 56 insertions(+), 55 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c b/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c
index f2dd53f2af61..cc70594d7e4c 100644
--- a/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c
@@ -3485,6 +3485,7 @@ static void gfx_v10_0_set_ring_funcs(struct amdgpu_device *adev);
 static void gfx_v10_0_set_irq_funcs(struct amdgpu_device *adev);
 static void gfx_v10_0_set_gds_init(struct amdgpu_device *adev);
 static void gfx_v10_0_set_rlc_funcs(struct amdgpu_device *adev);
+static void gfx_v10_0_set_mqd_funcs(struct amdgpu_device *adev);
 static int gfx_v10_0_get_cu_info(struct amdgpu_device *adev,
 				 struct amdgpu_cu_info *cu_info);
 static uint64_t gfx_v10_0_get_gpu_clock_counter(struct amdgpu_device *adev);
@@ -6564,10 +6565,10 @@ static void gfx_v10_0_kiq_setting(struct amdgpu_ring *ring)
 	}
 }
 
-static int gfx_v10_0_gfx_mqd_init(struct amdgpu_ring *ring)
+static int gfx_v10_0_gfx_mqd_init(struct amdgpu_device *adev, void *m,
+				  struct amdgpu_mqd_prop *prop)
 {
-	struct amdgpu_device *adev = ring->adev;
-	struct v10_gfx_mqd *mqd = ring->mqd_ptr;
+	struct v10_gfx_mqd *mqd = m;
 	uint64_t hqd_gpu_addr, wb_gpu_addr;
 	uint32_t tmp;
 	uint32_t rb_bufsz;
@@ -6577,8 +6578,8 @@ static int gfx_v10_0_gfx_mqd_init(struct amdgpu_ring *ring)
 	mqd->cp_gfx_hqd_wptr_hi = 0;
 
 	/* set the pointer to the MQD */
-	mqd->cp_mqd_base_addr = ring->mqd_gpu_addr & 0xfffffffc;
-	mqd->cp_mqd_base_addr_hi = upper_32_bits(ring->mqd_gpu_addr);
+	mqd->cp_mqd_base_addr = prop->mqd_gpu_addr & 0xfffffffc;
+	mqd->cp_mqd_base_addr_hi = upper_32_bits(prop->mqd_gpu_addr);
 
 	/* set up mqd control */
 	tmp = RREG32_SOC15(GC, 0, mmCP_GFX_MQD_CONTROL);
@@ -6604,23 +6605,23 @@ static int gfx_v10_0_gfx_mqd_init(struct amdgpu_ring *ring)
 	mqd->cp_gfx_hqd_quantum = tmp;
 
 	/* set up gfx hqd base. this is similar as CP_RB_BASE */
-	hqd_gpu_addr = ring->gpu_addr >> 8;
+	hqd_gpu_addr = prop->hqd_base_gpu_addr >> 8;
 	mqd->cp_gfx_hqd_base = hqd_gpu_addr;
 	mqd->cp_gfx_hqd_base_hi = upper_32_bits(hqd_gpu_addr);
 
 	/* set up hqd_rptr_addr/_hi, similar as CP_RB_RPTR */
-	wb_gpu_addr = ring->rptr_gpu_addr;
+	wb_gpu_addr = prop->rptr_gpu_addr;
 	mqd->cp_gfx_hqd_rptr_addr = wb_gpu_addr & 0xfffffffc;
 	mqd->cp_gfx_hqd_rptr_addr_hi =
 		upper_32_bits(wb_gpu_addr) & 0xffff;
 
 	/* set up rb_wptr_poll addr */
-	wb_gpu_addr = ring->wptr_gpu_addr;
+	wb_gpu_addr = prop->wptr_gpu_addr;
 	mqd->cp_rb_wptr_poll_addr_lo = wb_gpu_addr & 0xfffffffc;
 	mqd->cp_rb_wptr_poll_addr_hi = upper_32_bits(wb_gpu_addr) & 0xffff;
 
 	/* set up the gfx_hqd_control, similar as CP_RB0_CNTL */
-	rb_bufsz = order_base_2(ring->ring_size / 4) - 1;
+	rb_bufsz = order_base_2(prop->queue_size / 4) - 1;
 	tmp = RREG32_SOC15(GC, 0, mmCP_GFX_HQD_CNTL);
 	tmp = REG_SET_FIELD(tmp, CP_GFX_HQD_CNTL, RB_BUFSZ, rb_bufsz);
 	tmp = REG_SET_FIELD(tmp, CP_GFX_HQD_CNTL, RB_BLKSZ, rb_bufsz - 2);
@@ -6631,9 +6632,9 @@ static int gfx_v10_0_gfx_mqd_init(struct amdgpu_ring *ring)
 
 	/* set up cp_doorbell_control */
 	tmp = RREG32_SOC15(GC, 0, mmCP_RB_DOORBELL_CONTROL);
-	if (ring->use_doorbell) {
+	if (prop->use_doorbell) {
 		tmp = REG_SET_FIELD(tmp, CP_RB_DOORBELL_CONTROL,
-				    DOORBELL_OFFSET, ring->doorbell_index);
+				    DOORBELL_OFFSET, prop->doorbell_index);
 		tmp = REG_SET_FIELD(tmp, CP_RB_DOORBELL_CONTROL,
 				    DOORBELL_EN, 1);
 	} else
@@ -6641,13 +6642,7 @@ static int gfx_v10_0_gfx_mqd_init(struct amdgpu_ring *ring)
 				    DOORBELL_EN, 0);
 	mqd->cp_rb_doorbell_control = tmp;
 
-	/*if there are 2 gfx rings, set the lower doorbell range of the first ring,
-	 *otherwise the range of the second ring will override the first ring */
-	if (ring->doorbell_index == adev->doorbell_index.gfx_ring0 << 1)
-		gfx_v10_0_cp_gfx_set_doorbell(adev, ring);
-
 	/* reset read and write pointers, similar to CP_RB0_WPTR/_RPTR */
-	ring->wptr = 0;
 	mqd->cp_gfx_hqd_rptr = RREG32_SOC15(GC, 0, mmCP_GFX_HQD_RPTR);
 
 	/* active the queue */
@@ -6715,7 +6710,16 @@ static int gfx_v10_0_gfx_init_queue(struct amdgpu_ring *ring)
 		memset((void *)mqd, 0, sizeof(*mqd));
 		mutex_lock(&adev->srbm_mutex);
 		nv_grbm_select(adev, ring->me, ring->pipe, ring->queue, 0);
-		gfx_v10_0_gfx_mqd_init(ring);
+		amdgpu_ring_init_mqd(ring);
+
+		/*
+		 * if there are 2 gfx rings, set the lower doorbell
+		 * range of the first ring, otherwise the range of
+		 * the second ring will override the first ring
+		 */
+		if (ring->doorbell_index == adev->doorbell_index.gfx_ring0 << 1)
+			gfx_v10_0_cp_gfx_set_doorbell(adev, ring);
+
 #ifdef BRING_UP_DEBUG
 		gfx_v10_0_gfx_queue_init_register(ring);
 #endif
@@ -6808,23 +6812,10 @@ static int gfx_v10_0_cp_async_gfx_ring_resume(struct amdgpu_device *adev)
 	return r;
 }
 
-static void gfx_v10_0_compute_mqd_set_priority(struct amdgpu_ring *ring, struct v10_compute_mqd *mqd)
+static int gfx_v10_0_compute_mqd_init(struct amdgpu_device *adev, void *m,
+				      struct amdgpu_mqd_prop *prop)
 {
-	struct amdgpu_device *adev = ring->adev;
-
-	if (ring->funcs->type == AMDGPU_RING_TYPE_COMPUTE) {
-		if (amdgpu_gfx_is_high_priority_compute_queue(adev, ring)) {
-			mqd->cp_hqd_pipe_priority = AMDGPU_GFX_PIPE_PRIO_HIGH;
-			mqd->cp_hqd_queue_priority =
-				AMDGPU_GFX_QUEUE_PRIORITY_MAXIMUM;
-		}
-	}
-}
-
-static int gfx_v10_0_compute_mqd_init(struct amdgpu_ring *ring)
-{
-	struct amdgpu_device *adev = ring->adev;
-	struct v10_compute_mqd *mqd = ring->mqd_ptr;
+	struct v10_compute_mqd *mqd = m;
 	uint64_t hqd_gpu_addr, wb_gpu_addr, eop_base_addr;
 	uint32_t tmp;
 
@@ -6836,7 +6827,7 @@ static int gfx_v10_0_compute_mqd_init(struct amdgpu_ring *ring)
 	mqd->compute_static_thread_mgmt_se3 = 0xffffffff;
 	mqd->compute_misc_reserved = 0x00000003;
 
-	eop_base_addr = ring->eop_gpu_addr >> 8;
+	eop_base_addr = prop->eop_gpu_addr >> 8;
 	mqd->cp_hqd_eop_base_addr_lo = eop_base_addr;
 	mqd->cp_hqd_eop_base_addr_hi = upper_32_bits(eop_base_addr);
 
@@ -6850,9 +6841,9 @@ static int gfx_v10_0_compute_mqd_init(struct amdgpu_ring *ring)
 	/* enable doorbell? */
 	tmp = RREG32_SOC15(GC, 0, mmCP_HQD_PQ_DOORBELL_CONTROL);
 
-	if (ring->use_doorbell) {
+	if (prop->use_doorbell) {
 		tmp = REG_SET_FIELD(tmp, CP_HQD_PQ_DOORBELL_CONTROL,
-				    DOORBELL_OFFSET, ring->doorbell_index);
+				    DOORBELL_OFFSET, prop->doorbell_index);
 		tmp = REG_SET_FIELD(tmp, CP_HQD_PQ_DOORBELL_CONTROL,
 				    DOORBELL_EN, 1);
 		tmp = REG_SET_FIELD(tmp, CP_HQD_PQ_DOORBELL_CONTROL,
@@ -6867,15 +6858,14 @@ static int gfx_v10_0_compute_mqd_init(struct amdgpu_ring *ring)
 	mqd->cp_hqd_pq_doorbell_control = tmp;
 
 	/* disable the queue if it's active */
-	ring->wptr = 0;
 	mqd->cp_hqd_dequeue_request = 0;
 	mqd->cp_hqd_pq_rptr = 0;
 	mqd->cp_hqd_pq_wptr_lo = 0;
 	mqd->cp_hqd_pq_wptr_hi = 0;
 
 	/* set the pointer to the MQD */
-	mqd->cp_mqd_base_addr_lo = ring->mqd_gpu_addr & 0xfffffffc;
-	mqd->cp_mqd_base_addr_hi = upper_32_bits(ring->mqd_gpu_addr);
+	mqd->cp_mqd_base_addr_lo = prop->mqd_gpu_addr & 0xfffffffc;
+	mqd->cp_mqd_base_addr_hi = upper_32_bits(prop->mqd_gpu_addr);
 
 	/* set MQD vmid to 0 */
 	tmp = RREG32_SOC15(GC, 0, mmCP_MQD_CONTROL);
@@ -6883,14 +6873,14 @@ static int gfx_v10_0_compute_mqd_init(struct amdgpu_ring *ring)
 	mqd->cp_mqd_control = tmp;
 
 	/* set the pointer to the HQD, this is similar CP_RB0_BASE/_HI */
-	hqd_gpu_addr = ring->gpu_addr >> 8;
+	hqd_gpu_addr = prop->hqd_base_gpu_addr >> 8;
 	mqd->cp_hqd_pq_base_lo = hqd_gpu_addr;
 	mqd->cp_hqd_pq_base_hi = upper_32_bits(hqd_gpu_addr);
 
 	/* set up the HQD, this is similar to CP_RB0_CNTL */
 	tmp = RREG32_SOC15(GC, 0, mmCP_HQD_PQ_CONTROL);
 	tmp = REG_SET_FIELD(tmp, CP_HQD_PQ_CONTROL, QUEUE_SIZE,
-			    (order_base_2(ring->ring_size / 4) - 1));
+			    (order_base_2(prop->queue_size / 4) - 1));
 	tmp = REG_SET_FIELD(tmp, CP_HQD_PQ_CONTROL, RPTR_BLOCK_SIZE,
 			    ((order_base_2(AMDGPU_GPU_PAGE_SIZE / 4) - 1) << 8));
 #ifdef __BIG_ENDIAN
@@ -6903,22 +6893,22 @@ static int gfx_v10_0_compute_mqd_init(struct amdgpu_ring *ring)
 	mqd->cp_hqd_pq_control = tmp;
 
 	/* set the wb address whether it's enabled or not */
-	wb_gpu_addr = ring->rptr_gpu_addr;
+	wb_gpu_addr = prop->rptr_gpu_addr;
 	mqd->cp_hqd_pq_rptr_report_addr_lo = wb_gpu_addr & 0xfffffffc;
 	mqd->cp_hqd_pq_rptr_report_addr_hi =
 		upper_32_bits(wb_gpu_addr) & 0xffff;
 
 	/* only used if CP_PQ_WPTR_POLL_CNTL.CP_PQ_WPTR_POLL_CNTL__EN_MASK=1 */
-	wb_gpu_addr = ring->wptr_gpu_addr;
+	wb_gpu_addr = prop->wptr_gpu_addr;
 	mqd->cp_hqd_pq_wptr_poll_addr_lo = wb_gpu_addr & 0xfffffffc;
 	mqd->cp_hqd_pq_wptr_poll_addr_hi = upper_32_bits(wb_gpu_addr) & 0xffff;
 
 	tmp = 0;
 	/* enable the doorbell if requested */
-	if (ring->use_doorbell) {
+	if (prop->use_doorbell) {
 		tmp = RREG32_SOC15(GC, 0, mmCP_HQD_PQ_DOORBELL_CONTROL);
 		tmp = REG_SET_FIELD(tmp, CP_HQD_PQ_DOORBELL_CONTROL,
-				DOORBELL_OFFSET, ring->doorbell_index);
+				DOORBELL_OFFSET, prop->doorbell_index);
 
 		tmp = REG_SET_FIELD(tmp, CP_HQD_PQ_DOORBELL_CONTROL,
 				    DOORBELL_EN, 1);
@@ -6931,7 +6921,6 @@ static int gfx_v10_0_compute_mqd_init(struct amdgpu_ring *ring)
 	mqd->cp_hqd_pq_doorbell_control = tmp;
 
 	/* reset read and write pointers, similar to CP_RB0_WPTR/_RPTR */
-	ring->wptr = 0;
 	mqd->cp_hqd_pq_rptr = RREG32_SOC15(GC, 0, mmCP_HQD_PQ_RPTR);
 
 	/* set the vmid for the queue */
@@ -6947,13 +6936,10 @@ static int gfx_v10_0_compute_mqd_init(struct amdgpu_ring *ring)
 	mqd->cp_hqd_ib_control = tmp;
 
 	/* set static priority for a compute queue/ring */
-	gfx_v10_0_compute_mqd_set_priority(ring, mqd);
+	mqd->cp_hqd_pipe_priority = prop->hqd_pipe_priority;
+	mqd->cp_hqd_queue_priority = prop->hqd_queue_priority;
 
-	/* map_queues packet doesn't need activate the queue,
-	 * so only kiq need set this field.
-	 */
-	if (ring->funcs->type == AMDGPU_RING_TYPE_KIQ)
-		mqd->cp_hqd_active = 1;
+	mqd->cp_hqd_active = prop->hqd_active;
 
 	return 0;
 }
@@ -7094,7 +7080,7 @@ static int gfx_v10_0_kiq_init_queue(struct amdgpu_ring *ring)
 		memset((void *)mqd, 0, sizeof(*mqd));
 		mutex_lock(&adev->srbm_mutex);
 		nv_grbm_select(adev, ring->me, ring->pipe, ring->queue, 0);
-		gfx_v10_0_compute_mqd_init(ring);
+		amdgpu_ring_init_mqd(ring);
 		gfx_v10_0_kiq_init_register(ring);
 		nv_grbm_select(adev, 0, 0, 0, 0);
 		mutex_unlock(&adev->srbm_mutex);
@@ -7116,7 +7102,7 @@ static int gfx_v10_0_kcq_init_queue(struct amdgpu_ring *ring)
 		memset((void *)mqd, 0, sizeof(*mqd));
 		mutex_lock(&adev->srbm_mutex);
 		nv_grbm_select(adev, ring->me, ring->pipe, ring->queue, 0);
-		gfx_v10_0_compute_mqd_init(ring);
+		amdgpu_ring_init_mqd(ring);
 		nv_grbm_select(adev, 0, 0, 0, 0);
 		mutex_unlock(&adev->srbm_mutex);
 
@@ -7799,6 +7785,7 @@ static int gfx_v10_0_early_init(void *handle)
 	gfx_v10_0_set_irq_funcs(adev);
 	gfx_v10_0_set_gds_init(adev);
 	gfx_v10_0_set_rlc_funcs(adev);
+	gfx_v10_0_set_mqd_funcs(adev);
 
 	/* init rlcg reg access ctrl */
 	gfx_v10_0_init_rlcg_reg_access_ctrl(adev);
@@ -9581,6 +9568,20 @@ static void gfx_v10_0_set_gds_init(struct amdgpu_device *adev)
 	adev->gds.oa_size = 16;
 }
 
+static void gfx_v10_0_set_mqd_funcs(struct amdgpu_device *adev)
+{
+	/* set gfx eng mqd */
+	adev->mqds[AMDGPU_HW_IP_GFX].mqd_size =
+		sizeof(struct v10_gfx_mqd);
+	adev->mqds[AMDGPU_HW_IP_GFX].init_mqd =
+		gfx_v10_0_gfx_mqd_init;
+	/* set compute eng mqd */
+	adev->mqds[AMDGPU_HW_IP_COMPUTE].mqd_size =
+		sizeof(struct v10_compute_mqd);
+	adev->mqds[AMDGPU_HW_IP_COMPUTE].init_mqd =
+		gfx_v10_0_compute_mqd_init;
+}
+
 static void gfx_v10_0_set_user_wgp_inactive_bitmap_per_sh(struct amdgpu_device *adev,
 							  u32 bitmap)
 {
-- 
2.35.1
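
For readers following along, the per-IP table filled in by
gfx_v10_0_set_mqd_funcs() in the patch above can be pictured with a
reduced, stand-alone sketch. Everything named demo_* below is
hypothetical (the struct layouts, the two-entry enum and the helper
names are illustrative assumptions, not the driver's definitions). It
only shows the pattern: each HW IP registers an MQD size and an init
callback, and a generic helper dispatches through the table. The queue
size encoding mirrors the order_base_2(queue_size / 4) - 1 expression
from the diff.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the driver's HW IP indices. */
enum demo_hw_ip { DEMO_HW_IP_GFX, DEMO_HW_IP_COMPUTE, DEMO_HW_IP_NUM };

/* Hypothetical queue properties handed to the init callback. */
struct demo_mqd_prop {
	uint64_t mqd_gpu_addr;
	uint32_t queue_size;	/* ring size in bytes */
};

/* Hypothetical MQD images; the real ones are much larger. */
struct demo_gfx_mqd     { uint32_t base_lo, base_hi; };
struct demo_compute_mqd { uint32_t base_lo, base_hi, pq_size_log2; };

/* One table entry per HW IP: MQD size plus init callback. */
struct demo_mqd {
	size_t mqd_size;
	int (*init_mqd)(void *mqd, const struct demo_mqd_prop *prop);
};

static struct demo_mqd demo_mqds[DEMO_HW_IP_NUM];

/* Smallest order such that (1 << order) >= n, like order_base_2(). */
static uint32_t demo_order_base_2(uint32_t n)
{
	uint32_t order = 0;

	while (order < 31 && (1u << order) < n)
		order++;
	return order;
}

static int demo_gfx_mqd_init(void *m, const struct demo_mqd_prop *prop)
{
	struct demo_gfx_mqd *mqd = m;

	mqd->base_lo = (uint32_t)prop->mqd_gpu_addr & 0xfffffffc;
	mqd->base_hi = (uint32_t)(prop->mqd_gpu_addr >> 32);
	return 0;
}

static int demo_compute_mqd_init(void *m, const struct demo_mqd_prop *prop)
{
	struct demo_compute_mqd *mqd = m;

	mqd->base_lo = (uint32_t)prop->mqd_gpu_addr & 0xfffffffc;
	mqd->base_hi = (uint32_t)(prop->mqd_gpu_addr >> 32);
	/* queue size is stored as log2 of the dword count, minus one */
	mqd->pq_size_log2 = demo_order_base_2(prop->queue_size / 4) - 1;
	return 0;
}

/* Register both engines, as the early-init hook does in the patch. */
static void demo_set_mqd_funcs(void)
{
	demo_mqds[DEMO_HW_IP_GFX].mqd_size = sizeof(struct demo_gfx_mqd);
	demo_mqds[DEMO_HW_IP_GFX].init_mqd = demo_gfx_mqd_init;
	demo_mqds[DEMO_HW_IP_COMPUTE].mqd_size =
		sizeof(struct demo_compute_mqd);
	demo_mqds[DEMO_HW_IP_COMPUTE].init_mqd = demo_compute_mqd_init;
}

/* Generic entry point: callers no longer name the engine-specific init. */
static int demo_init_mqd(enum demo_hw_ip ip, void *mqd,
			 const struct demo_mqd_prop *prop)
{
	if (ip >= DEMO_HW_IP_NUM || !demo_mqds[ip].init_mqd)
		return -1;
	return demo_mqds[ip].init_mqd(mqd, prop);
}
```

The point of the indirection is that the init callback depends only on
the property struct, not on struct amdgpu_ring, so the same callback
can serve queues that are not backed by a kernel ring.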


* [PATCH 10/73] drm/amdgpu/gfx10: use per ctx CSA for ce metadata
  2022-04-29 17:45 [PATCH 00/73] MES support Alex Deucher
                   ` (8 preceding siblings ...)
  2022-04-29 17:45 ` [PATCH 09/73] drm/amdgpu/gfx10: implement mqd functions of gfx/compute eng v2 Alex Deucher
@ 2022-04-29 17:45 ` Alex Deucher
  2022-04-29 17:45 ` [PATCH 11/73] drm/amdgpu/gfx10: use per ctx CSA for de metadata Alex Deucher
                   ` (62 subsequent siblings)
  72 siblings, 0 replies; 74+ messages in thread
From: Alex Deucher @ 2022-04-29 17:45 UTC (permalink / raw)
  To: amd-gfx; +Cc: Alex Deucher, Jack Xiao, Christian König, Hawking Zhang

From: Jack Xiao <Jack.Xiao@amd.com>

As MES requires per context preemption, use per context CSA address
for CE metadata to correctly enable context MCBP preemption.

Signed-off-by: Jack Xiao <Jack.Xiao@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
---
 drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c | 28 +++++++++++++++++---------
 1 file changed, 19 insertions(+), 9 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c b/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c
index cc70594d7e4c..56a7153474c6 100644
--- a/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c
@@ -8849,26 +8849,36 @@ static void gfx_v10_0_ring_emit_ce_meta(struct amdgpu_ring *ring, bool resume)
 {
 	struct amdgpu_device *adev = ring->adev;
 	struct v10_ce_ib_state ce_payload = {0};
-	uint64_t csa_addr;
+	uint64_t offset, ce_payload_gpu_addr;
+	void *ce_payload_cpu_addr;
 	int cnt;
 
 	cnt = (sizeof(ce_payload) >> 2) + 4 - 2;
-	csa_addr = amdgpu_csa_vaddr(ring->adev);
+
+	if (ring->is_mes_queue) {
+		offset = offsetof(struct amdgpu_mes_ctx_meta_data,
+				  gfx[0].gfx_meta_data) +
+			offsetof(struct v10_gfx_meta_data, ce_payload);
+		ce_payload_gpu_addr =
+			amdgpu_mes_ctx_get_offs_gpu_addr(ring, offset);
+		ce_payload_cpu_addr =
+			amdgpu_mes_ctx_get_offs_cpu_addr(ring, offset);
+	} else {
+		offset = offsetof(struct v10_gfx_meta_data, ce_payload);
+		ce_payload_gpu_addr = amdgpu_csa_vaddr(ring->adev) + offset;
+		ce_payload_cpu_addr = adev->virt.csa_cpu_addr + offset;
+	}
 
 	amdgpu_ring_write(ring, PACKET3(PACKET3_WRITE_DATA, cnt));
 	amdgpu_ring_write(ring, (WRITE_DATA_ENGINE_SEL(2) |
 				 WRITE_DATA_DST_SEL(8) |
 				 WR_CONFIRM) |
 				 WRITE_DATA_CACHE_POLICY(0));
-	amdgpu_ring_write(ring, lower_32_bits(csa_addr +
-			      offsetof(struct v10_gfx_meta_data, ce_payload)));
-	amdgpu_ring_write(ring, upper_32_bits(csa_addr +
-			      offsetof(struct v10_gfx_meta_data, ce_payload)));
+	amdgpu_ring_write(ring, lower_32_bits(ce_payload_gpu_addr));
+	amdgpu_ring_write(ring, upper_32_bits(ce_payload_gpu_addr));
 
 	if (resume)
-		amdgpu_ring_write_multiple(ring, adev->virt.csa_cpu_addr +
-					   offsetof(struct v10_gfx_meta_data,
-						    ce_payload),
+		amdgpu_ring_write_multiple(ring, ce_payload_cpu_addr,
 					   sizeof(ce_payload) >> 2);
 	else
 		amdgpu_ring_write_multiple(ring, (void *)&ce_payload,
-- 
2.35.1
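
The address selection in the patch above (and in the DE variant that
follows) comes down to adding two offsetof() terms to a per-context
base for MES queues, versus one term to the global CSA base otherwise.
A stand-alone sketch of that arithmetic follows. All demo_* structs
and sizes are made up for illustration; the real
amdgpu_mes_ctx_meta_data and v10_gfx_meta_data layouts are not
reproduced here. Note that offsetof() with an array subscript, as used
in the patch, relies on a GCC/Clang extension.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical payloads; sizes are arbitrary. */
struct demo_ce_state { uint32_t dw[4]; };
struct demo_de_state { uint32_t dw[8]; };

/* Hypothetical gfx metadata: CE payload followed by DE payload. */
struct demo_gfx_meta_data {
	struct demo_ce_state ce_payload;
	struct demo_de_state de_payload;
};

/* Hypothetical per-context block: ring bookkeeping, then metadata. */
struct demo_ctx_meta_data {
	struct {
		uint8_t ring[256];
		struct demo_gfx_meta_data gfx_meta_data;
	} gfx[2];
};

struct demo_ring {
	int is_mes_queue;
	uint64_t ctx_gpu_base;		/* per-context area (MES path) */
	uint64_t global_csa_base;	/* single shared CSA (legacy path) */
};

/*
 * member_off is the offset of the payload inside the gfx metadata,
 * e.g. offsetof(struct demo_gfx_meta_data, ce_payload).
 */
static uint64_t demo_meta_gpu_addr(const struct demo_ring *ring,
				   size_t member_off)
{
	if (ring->is_mes_queue)
		return ring->ctx_gpu_base +
		       offsetof(struct demo_ctx_meta_data,
				gfx[0].gfx_meta_data) +
		       member_off;

	return ring->global_csa_base + member_off;
}
```

With this shape each MES context gets its own CE/DE save area, so one
context being preempted does not overwrite the state saved for
another.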


* [PATCH 11/73] drm/amdgpu/gfx10: use per ctx CSA for de metadata
  2022-04-29 17:45 [PATCH 00/73] MES support Alex Deucher
                   ` (9 preceding siblings ...)
  2022-04-29 17:45 ` [PATCH 10/73] drm/amdgpu/gfx10: use per ctx CSA for ce metadata Alex Deucher
@ 2022-04-29 17:45 ` Alex Deucher
  2022-04-29 17:45 ` [PATCH 12/73] drm/amdgpu/gfx10: associate mes queue id with fence v2 Alex Deucher
                   ` (61 subsequent siblings)
  72 siblings, 0 replies; 74+ messages in thread
From: Alex Deucher @ 2022-04-29 17:45 UTC (permalink / raw)
  To: amd-gfx; +Cc: Alex Deucher, Jack Xiao, Christian König, Hawking Zhang

From: Jack Xiao <Jack.Xiao@amd.com>

As MES requires per context preemption, use per context CSA address
for DE metadata to correctly enable context MCBP preemption.

Signed-off-by: Jack Xiao <Jack.Xiao@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
---
 drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c | 39 ++++++++++++++++++--------
 1 file changed, 28 insertions(+), 11 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c b/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c
index 56a7153474c6..d06807355f5f 100644
--- a/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c
@@ -8889,12 +8889,33 @@ static void gfx_v10_0_ring_emit_de_meta(struct amdgpu_ring *ring, bool resume)
 {
 	struct amdgpu_device *adev = ring->adev;
 	struct v10_de_ib_state de_payload = {0};
-	uint64_t csa_addr, gds_addr;
+	uint64_t offset, gds_addr, de_payload_gpu_addr;
+	void *de_payload_cpu_addr;
 	int cnt;
 
-	csa_addr = amdgpu_csa_vaddr(ring->adev);
-	gds_addr = ALIGN(csa_addr + AMDGPU_CSA_SIZE - adev->gds.gds_size,
-			 PAGE_SIZE);
+	if (ring->is_mes_queue) {
+		offset = offsetof(struct amdgpu_mes_ctx_meta_data,
+				  gfx[0].gfx_meta_data) +
+			offsetof(struct v10_gfx_meta_data, de_payload);
+		de_payload_gpu_addr =
+			amdgpu_mes_ctx_get_offs_gpu_addr(ring, offset);
+		de_payload_cpu_addr =
+			amdgpu_mes_ctx_get_offs_cpu_addr(ring, offset);
+
+		offset = offsetof(struct amdgpu_mes_ctx_meta_data,
+				  gfx[0].gds_backup) +
+			offsetof(struct v10_gfx_meta_data, de_payload);
+		gds_addr = amdgpu_mes_ctx_get_offs_gpu_addr(ring, offset);
+	} else {
+		offset = offsetof(struct v10_gfx_meta_data, de_payload);
+		de_payload_gpu_addr = amdgpu_csa_vaddr(ring->adev) + offset;
+		de_payload_cpu_addr = adev->virt.csa_cpu_addr + offset;
+
+		gds_addr = ALIGN(amdgpu_csa_vaddr(ring->adev) +
+				 AMDGPU_CSA_SIZE - adev->gds.gds_size,
+				 PAGE_SIZE);
+	}
+
 	de_payload.gds_backup_addrlo = lower_32_bits(gds_addr);
 	de_payload.gds_backup_addrhi = upper_32_bits(gds_addr);
 
@@ -8904,15 +8925,11 @@ static void gfx_v10_0_ring_emit_de_meta(struct amdgpu_ring *ring, bool resume)
 				 WRITE_DATA_DST_SEL(8) |
 				 WR_CONFIRM) |
 				 WRITE_DATA_CACHE_POLICY(0));
-	amdgpu_ring_write(ring, lower_32_bits(csa_addr +
-			      offsetof(struct v10_gfx_meta_data, de_payload)));
-	amdgpu_ring_write(ring, upper_32_bits(csa_addr +
-			      offsetof(struct v10_gfx_meta_data, de_payload)));
+	amdgpu_ring_write(ring, lower_32_bits(de_payload_gpu_addr));
+	amdgpu_ring_write(ring, upper_32_bits(de_payload_gpu_addr));
 
 	if (resume)
-		amdgpu_ring_write_multiple(ring, adev->virt.csa_cpu_addr +
-					   offsetof(struct v10_gfx_meta_data,
-						    de_payload),
+		amdgpu_ring_write_multiple(ring, de_payload_cpu_addr,
 					   sizeof(de_payload) >> 2);
 	else
 		amdgpu_ring_write_multiple(ring, (void *)&de_payload,
-- 
2.35.1


* [PATCH 12/73] drm/amdgpu/gfx10: associate mes queue id with fence v2
  2022-04-29 17:45 [PATCH 00/73] MES support Alex Deucher
                   ` (10 preceding siblings ...)
  2022-04-29 17:45 ` [PATCH 11/73] drm/amdgpu/gfx10: use per ctx CSA for de metadata Alex Deucher
@ 2022-04-29 17:45 ` Alex Deucher
  2022-04-29 17:45 ` [PATCH 13/73] drm/amdgpu/gfx10: inherit vmid from mqd Alex Deucher
                   ` (60 subsequent siblings)
  72 siblings, 0 replies; 74+ messages in thread
From: Alex Deucher @ 2022-04-29 17:45 UTC (permalink / raw)
  To: amd-gfx; +Cc: Alex Deucher, Jack Xiao, Christian König, Hawking Zhang

From: Jack Xiao <Jack.Xiao@amd.com>

Associate the MES queue id with the fence, so that the EOP trap handler
can look up which queue issued the fence.

v2: move mes queue flag to amdgpu_mes_ctx.h

Signed-off-by: Jack Xiao <Jack.Xiao@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_mes_ctx.h | 2 ++
 drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c      | 3 ++-
 2 files changed, 4 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_mes_ctx.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_mes_ctx.h
index f3e1ba1a889f..544f1aa86edf 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_mes_ctx.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_mes_ctx.h
@@ -115,4 +115,6 @@ struct amdgpu_mes_ctx_data {
 #define AMDGPU_FENCE_MES_QUEUE_FLAG     0x1000000u
 #define AMDGPU_FENCE_MES_QUEUE_ID_MASK  (AMDGPU_FENCE_MES_QUEUE_FLAG - 1)
 
+#define AMDGPU_FENCE_MES_QUEUE_FLAG     0x1000000u
+
 #endif
diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c b/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c
index d06807355f5f..e6e601296097 100644
--- a/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c
@@ -8678,7 +8678,8 @@ static void gfx_v10_0_ring_emit_fence(struct amdgpu_ring *ring, u64 addr,
 	amdgpu_ring_write(ring, upper_32_bits(addr));
 	amdgpu_ring_write(ring, lower_32_bits(seq));
 	amdgpu_ring_write(ring, upper_32_bits(seq));
-	amdgpu_ring_write(ring, 0);
+	amdgpu_ring_write(ring, ring->is_mes_queue ?
+			 (ring->hw_queue_id | AMDGPU_FENCE_MES_QUEUE_FLAG) : 0);
 }
 
 static void gfx_v10_0_ring_emit_pipeline_sync(struct amdgpu_ring *ring)
-- 
2.35.1
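
As a reading aid, the encoding written by gfx_v10_0_ring_emit_fence()
above can be exercised in isolation. The two macros are copied from
the amdgpu_mes_ctx.h hunk; the demo_* helpers, and in particular the
decode side, are hypothetical and only sketch how a handler might
recover the queue id from the dword.

```c
#include <stdbool.h>
#include <stdint.h>

/* Values as shown in the amdgpu_mes_ctx.h hunk above. */
#define AMDGPU_FENCE_MES_QUEUE_FLAG     0x1000000u
#define AMDGPU_FENCE_MES_QUEUE_ID_MASK  (AMDGPU_FENCE_MES_QUEUE_FLAG - 1)

/* Mirrors the last dword emitted by the fence packet in the patch. */
static uint32_t demo_fence_encode(bool is_mes_queue, uint32_t hw_queue_id)
{
	return is_mes_queue ?
		(hw_queue_id | AMDGPU_FENCE_MES_QUEUE_FLAG) : 0;
}

/*
 * Hypothetical decode: returns true and stores the queue id when the
 * MES flag is set, false for a fence from a non-MES ring.
 */
static bool demo_fence_decode(uint32_t dw, uint32_t *queue_id)
{
	if (!(dw & AMDGPU_FENCE_MES_QUEUE_FLAG))
		return false;

	*queue_id = dw & AMDGPU_FENCE_MES_QUEUE_ID_MASK;
	return true;
}
```

The flag occupies bit 24, which leaves the low 24 bits for the queue
id.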


* [PATCH 13/73] drm/amdgpu/gfx10: inherit vmid from mqd
  2022-04-29 17:45 [PATCH 00/73] MES support Alex Deucher
                   ` (11 preceding siblings ...)
  2022-04-29 17:45 ` [PATCH 12/73] drm/amdgpu/gfx10: associate mes queue id with fence v2 Alex Deucher
@ 2022-04-29 17:45 ` Alex Deucher
  2022-04-29 17:45 ` [PATCH 14/73] drm/amdgpu/gfx10: use INVALIDATE_TLBS to invalidate TLBs v2 Alex Deucher
                   ` (59 subsequent siblings)
  72 siblings, 0 replies; 74+ messages in thread
From: Alex Deucher @ 2022-04-29 17:45 UTC (permalink / raw)
  To: amd-gfx; +Cc: Alex Deucher, Jack Xiao, Christian König, Hawking Zhang

From: Jack Xiao <Jack.Xiao@amd.com>

Since MES manages vmid assignment, let the vmid be inherited from the
mqd instead of being set in the ib packet.

Signed-off-by: Jack Xiao <Jack.Xiao@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
---
 drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c b/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c
index e6e601296097..0d91632f563d 100644
--- a/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c
@@ -8602,6 +8602,10 @@ static void gfx_v10_0_ring_emit_ib_gfx(struct amdgpu_ring *ring,
 				    (!amdgpu_sriov_vf(ring->adev) && flags & AMDGPU_IB_PREEMPTED) ? true : false);
 	}
 
+	if (ring->is_mes_queue)
+		/* inherit vmid from mqd */
+		control |= 0x400000;
+
 	amdgpu_ring_write(ring, header);
 	BUG_ON(ib->gpu_addr & 0x3); /* Dword align */
 	amdgpu_ring_write(ring,
@@ -8621,6 +8625,10 @@ static void gfx_v10_0_ring_emit_ib_compute(struct amdgpu_ring *ring,
 	unsigned vmid = AMDGPU_JOB_GET_VMID(job);
 	u32 control = INDIRECT_BUFFER_VALID | ib->length_dw | (vmid << 24);
 
+	if (ring->is_mes_queue)
+		/* inherit vmid from mqd */
+		control |= 0x40000000;
+
 	/* Currently, there is a high possibility to get wave ID mismatch
 	 * between ME and GDS, leading to a hw deadlock, because ME generates
 	 * different wave IDs than the GDS expects. This situation happens
-- 
2.35.1


* [PATCH 14/73] drm/amdgpu/gfx10: use INVALIDATE_TLBS to invalidate TLBs v2
  2022-04-29 17:45 [PATCH 00/73] MES support Alex Deucher
                   ` (12 preceding siblings ...)
  2022-04-29 17:45 ` [PATCH 13/73] drm/amdgpu/gfx10: inherit vmid from mqd Alex Deucher
@ 2022-04-29 17:45 ` Alex Deucher
  2022-04-29 17:45 ` [PATCH 15/73] drm/amdgpu/gmc10: skip emitting pasid mapping packet Alex Deucher
                   ` (58 subsequent siblings)
  72 siblings, 0 replies; 74+ messages in thread
From: Alex Deucher @ 2022-04-29 17:45 UTC (permalink / raw)
  To: amd-gfx
  Cc: Alex Deucher, Le Ma, Jack Xiao, Christian König, Hawking Zhang

From: Jack Xiao <Jack.Xiao@amd.com>

For MES queue VM flush, use INVALIDATE_TLBS to invalidate TLBs.
This packet lets the CP firmware determine the current vmid and
the invalidation engine to use.

v2: unify invalidate_tlbs functions

Cc: Le Ma <le.ma@amd.com>
Signed-off-by: Jack Xiao <Jack.Xiao@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
---
 drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c | 27 +++++++++++++++++++-------
 1 file changed, 20 insertions(+), 7 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c b/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c
index 0d91632f563d..2ab5259c7305 100644
--- a/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c
@@ -3503,6 +3503,9 @@ static void gfx_v10_0_ring_emit_frame_cntl(struct amdgpu_ring *ring, bool start,
 static u32 gfx_v10_3_get_disabled_sa(struct amdgpu_device *adev);
 static void gfx_v10_3_program_pbb_mode(struct amdgpu_device *adev);
 static void gfx_v10_3_set_power_brake_sequence(struct amdgpu_device *adev);
+static void gfx_v10_0_ring_invalidate_tlbs(struct amdgpu_ring *ring,
+					   uint16_t pasid, uint32_t flush_type,
+					   bool all_hub, uint8_t dst_sel);
 
 static void gfx10_kiq_set_resources(struct amdgpu_ring *kiq_ring, uint64_t queue_mask)
 {
@@ -3595,12 +3598,7 @@ static void gfx10_kiq_invalidate_tlbs(struct amdgpu_ring *kiq_ring,
 				uint16_t pasid, uint32_t flush_type,
 				bool all_hub)
 {
-	amdgpu_ring_write(kiq_ring, PACKET3(PACKET3_INVALIDATE_TLBS, 0));
-	amdgpu_ring_write(kiq_ring,
-			PACKET3_INVALIDATE_TLBS_DST_SEL(1) |
-			PACKET3_INVALIDATE_TLBS_ALL_HUB(all_hub) |
-			PACKET3_INVALIDATE_TLBS_PASID(pasid) |
-			PACKET3_INVALIDATE_TLBS_FLUSH_TYPE(flush_type));
+	gfx_v10_0_ring_invalidate_tlbs(kiq_ring, pasid, flush_type, all_hub, 1);
 }
 
 static const struct kiq_pm4_funcs gfx_v10_0_kiq_pm4_funcs = {
@@ -8700,10 +8698,25 @@ static void gfx_v10_0_ring_emit_pipeline_sync(struct amdgpu_ring *ring)
 			       upper_32_bits(addr), seq, 0xffffffff, 4);
 }
 
+static void gfx_v10_0_ring_invalidate_tlbs(struct amdgpu_ring *ring,
+				   uint16_t pasid, uint32_t flush_type,
+				   bool all_hub, uint8_t dst_sel)
+{
+	amdgpu_ring_write(ring, PACKET3(PACKET3_INVALIDATE_TLBS, 0));
+	amdgpu_ring_write(ring,
+			  PACKET3_INVALIDATE_TLBS_DST_SEL(dst_sel) |
+			  PACKET3_INVALIDATE_TLBS_ALL_HUB(all_hub) |
+			  PACKET3_INVALIDATE_TLBS_PASID(pasid) |
+			  PACKET3_INVALIDATE_TLBS_FLUSH_TYPE(flush_type));
+}
+
 static void gfx_v10_0_ring_emit_vm_flush(struct amdgpu_ring *ring,
 					 unsigned vmid, uint64_t pd_addr)
 {
-	amdgpu_gmc_emit_flush_gpu_tlb(ring, vmid, pd_addr);
+	if (ring->is_mes_queue)
+		gfx_v10_0_ring_invalidate_tlbs(ring, 0, 0, false, 0);
+	else
+		amdgpu_gmc_emit_flush_gpu_tlb(ring, vmid, pd_addr);
 
 	/* compute doesn't have PFP */
 	if (ring->funcs->type == AMDGPU_RING_TYPE_GFX) {
-- 
2.35.1
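
The unification in the patch above is easier to see with the packet
body reduced to a single packed dword. In the sketch below the field
positions in the DEMO_* macros are assumptions made for illustration
and should be checked against the PACKET3_INVALIDATE_TLBS_*
definitions in the driver headers; the demo_* function names are
hypothetical as well. What it shows is that the KIQ wrapper passes
dst_sel = 1, while the MES VM flush path passes zeros and leaves the
vmid and invalidation engine to the CP firmware.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed field positions, for illustration only. */
#define DEMO_INV_TLBS_DST_SEL(x)	((uint32_t)(x) << 0)
#define DEMO_INV_TLBS_ALL_HUB(x)	((uint32_t)(x) << 4)
#define DEMO_INV_TLBS_PASID(x)		((uint32_t)(x) << 5)
#define DEMO_INV_TLBS_FLUSH_TYPE(x)	((uint32_t)(x) << 29)

/* Common helper: builds the body dword for any ring. */
static uint32_t demo_invalidate_tlbs_dw(uint16_t pasid, uint32_t flush_type,
					bool all_hub, uint8_t dst_sel)
{
	return DEMO_INV_TLBS_DST_SEL(dst_sel) |
	       DEMO_INV_TLBS_ALL_HUB(all_hub) |
	       DEMO_INV_TLBS_PASID(pasid) |
	       DEMO_INV_TLBS_FLUSH_TYPE(flush_type);
}

/* KIQ wrapper: same arguments as before, dst_sel fixed to 1. */
static uint32_t demo_kiq_invalidate_tlbs_dw(uint16_t pasid,
					    uint32_t flush_type,
					    bool all_hub)
{
	return demo_invalidate_tlbs_dw(pasid, flush_type, all_hub, 1);
}

/* MES VM flush: all fields zero, as in the emit_vm_flush hunk. */
static uint32_t demo_mes_vm_flush_dw(void)
{
	return demo_invalidate_tlbs_dw(0, 0, false, 0);
}
```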


* [PATCH 15/73] drm/amdgpu/gmc10: skip emitting pasid mapping packet
  2022-04-29 17:45 [PATCH 00/73] MES support Alex Deucher
                   ` (13 preceding siblings ...)
  2022-04-29 17:45 ` [PATCH 14/73] drm/amdgpu/gfx10: use INVALIDATE_TLBS to invalidate TLBs v2 Alex Deucher
@ 2022-04-29 17:45 ` Alex Deucher
  2022-04-29 17:45 ` [PATCH 16/73] drm/amdgpu: use the whole doorbell space for mes Alex Deucher
                   ` (57 subsequent siblings)
  72 siblings, 0 replies; 74+ messages in thread
From: Alex Deucher @ 2022-04-29 17:45 UTC (permalink / raw)
  To: amd-gfx; +Cc: Alex Deucher, Jack Xiao, Christian König, Hawking Zhang

From: Jack Xiao <Jack.Xiao@amd.com>

Since MES FW manages IH_VMID_x_LUT updating, skip emitting the pasid
mapping packet.

Signed-off-by: Jack Xiao <Jack.Xiao@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
---
 drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c b/drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c
index 487c33937a87..9b4a035a5bf1 100644
--- a/drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c
@@ -517,6 +517,10 @@ static void gmc_v10_0_emit_pasid_mapping(struct amdgpu_ring *ring, unsigned vmid
 	struct amdgpu_device *adev = ring->adev;
 	uint32_t reg;
 
+	/* MES fw manages IH_VMID_x_LUT updating */
+	if (ring->is_mes_queue)
+		return;
+
 	if (ring->funcs->vmhub == AMDGPU_GFXHUB_0)
 		reg = SOC15_REG_OFFSET(OSSSYS, 0, mmIH_VMID_0_LUT) + vmid;
 	else
-- 
2.35.1


* [PATCH 16/73] drm/amdgpu: use the whole doorbell space for mes
  2022-04-29 17:45 [PATCH 00/73] MES support Alex Deucher
                   ` (14 preceding siblings ...)
  2022-04-29 17:45 ` [PATCH 15/73] drm/amdgpu/gmc10: skip emitting pasid mapping packet Alex Deucher
@ 2022-04-29 17:45 ` Alex Deucher
  2022-04-29 17:45 ` [PATCH 17/73] drm/amdgpu: update mes process/gang/queue definitions Alex Deucher
                   ` (56 subsequent siblings)
  72 siblings, 0 replies; 74+ messages in thread
From: Alex Deucher @ 2022-04-29 17:45 UTC (permalink / raw)
  To: amd-gfx; +Cc: Alex Deucher, Jack Xiao, Christian König, Hawking Zhang

From: Jack Xiao <Jack.Xiao@amd.com>

Use the whole doorbell space for mes. Each queue in a process occupies
one doorbell slot, which is rung to signal a queue submission.

Signed-off-by: Jack Xiao <Jack.Xiao@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 32 +++++++++++++---------
 1 file changed, 19 insertions(+), 13 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
index f33e3c341f8f..b9844249d464 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@@ -1044,19 +1044,25 @@ static int amdgpu_device_doorbell_init(struct amdgpu_device *adev)
 	adev->doorbell.base = pci_resource_start(adev->pdev, 2);
 	adev->doorbell.size = pci_resource_len(adev->pdev, 2);
 
-	adev->doorbell.num_doorbells = min_t(u32, adev->doorbell.size / sizeof(u32),
-					     adev->doorbell_index.max_assignment+1);
-	if (adev->doorbell.num_doorbells == 0)
-		return -EINVAL;
-
-	/* For Vega, reserve and map two pages on doorbell BAR since SDMA
-	 * paging queue doorbell use the second page. The
-	 * AMDGPU_DOORBELL64_MAX_ASSIGNMENT definition assumes all the
-	 * doorbells are in the first page. So with paging queue enabled,
-	 * the max num_doorbells should + 1 page (0x400 in dword)
-	 */
-	if (adev->asic_type >= CHIP_VEGA10)
-		adev->doorbell.num_doorbells += 0x400;
+	if (adev->enable_mes) {
+		adev->doorbell.num_doorbells =
+			adev->doorbell.size / sizeof(u32);
+	} else {
+		adev->doorbell.num_doorbells =
+			min_t(u32, adev->doorbell.size / sizeof(u32),
+			      adev->doorbell_index.max_assignment+1);
+		if (adev->doorbell.num_doorbells == 0)
+			return -EINVAL;
+
+		/* For Vega, reserve and map two pages on doorbell BAR since SDMA
+		 * paging queue doorbell use the second page. The
+		 * AMDGPU_DOORBELL64_MAX_ASSIGNMENT definition assumes all the
+		 * doorbells are in the first page. So with paging queue enabled,
+		 * the max num_doorbells should + 1 page (0x400 in dword)
+		 */
+		if (adev->asic_type >= CHIP_VEGA10)
+			adev->doorbell.num_doorbells += 0x400;
+	}
 
 	adev->doorbell.ptr = ioremap(adev->doorbell.base,
 				     adev->doorbell.num_doorbells *
-- 
2.35.1
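
The doorbell sizing rule from the hunk above, pulled out into a pure
function for illustration. The demo_* name and the parameters are
hypothetical stand-ins for the adev fields; the arithmetic follows the
diff, including the fact that the MES branch does not perform the
zero-count check.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * bar_size:        length of the doorbell BAR in bytes
 * max_assignment:  highest doorbell index assigned by the ASIC setup
 * Returns 0 on success and stores the count, -1 for the -EINVAL case.
 */
static int demo_num_doorbells(bool enable_mes, bool vega10_or_newer,
			      uint64_t bar_size, uint32_t max_assignment,
			      uint32_t *num_doorbells)
{
	uint32_t bar_slots = (uint32_t)(bar_size / sizeof(uint32_t));

	if (enable_mes) {
		/* MES: every dword in the BAR is a usable doorbell slot. */
		*num_doorbells = bar_slots;
		return 0;
	}

	/* Legacy: cap at the highest statically assigned index. */
	*num_doorbells = bar_slots < max_assignment + 1 ?
			 bar_slots : max_assignment + 1;
	if (*num_doorbells == 0)
		return -1;

	/* Vega and newer: one extra page (0x400 dwords) for SDMA paging. */
	if (vega10_or_newer)
		*num_doorbells += 0x400;

	return 0;
}
```

For a 2 MiB BAR, for example, the MES branch yields 524288 slots,
while the legacy branch stays bounded by max_assignment + 1 plus the
extra page.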


* [PATCH 17/73] drm/amdgpu: update mes process/gang/queue definitions
  2022-04-29 17:45 [PATCH 00/73] MES support Alex Deucher
                   ` (15 preceding siblings ...)
  2022-04-29 17:45 ` [PATCH 16/73] drm/amdgpu: use the whole doorbell space for mes Alex Deucher
@ 2022-04-29 17:45 ` Alex Deucher
  2022-04-29 17:45 ` [PATCH 18/73] drm/amdgpu: add mes_kiq module parameter v2 Alex Deucher
                   ` (55 subsequent siblings)
  72 siblings, 0 replies; 74+ messages in thread
From: Alex Deucher @ 2022-04-29 17:45 UTC (permalink / raw)
  To: amd-gfx; +Cc: Alex Deucher, Jack Xiao, Christian König, Hawking Zhang

From: Jack Xiao <Jack.Xiao@amd.com>

Update the definitions of MES process/gang/queue.

v2: add missing includes

Signed-off-by: Jack Xiao <Jack.Xiao@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_mes.h | 58 +++++++++++++++++++++++++
 1 file changed, 58 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.h
index 7334982ea702..52483d7ce843 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.h
@@ -24,6 +24,10 @@
 #ifndef __AMDGPU_MES_H__
 #define __AMDGPU_MES_H__
 
+#include "amdgpu_irq.h"
+#include "kgd_kfd_interface.h"
+#include "amdgpu_gfx.h"
+
 #define AMDGPU_MES_MAX_COMPUTE_PIPES        8
 #define AMDGPU_MES_MAX_GFX_PIPES            2
 #define AMDGPU_MES_MAX_SDMA_PIPES           2
@@ -37,11 +41,23 @@ enum amdgpu_mes_priority_level {
 	AMDGPU_MES_PRIORITY_NUM_LEVELS
 };
 
+#define AMDGPU_MES_PROC_CTX_SIZE 0x1000 /* one page area */
+#define AMDGPU_MES_GANG_CTX_SIZE 0x1000 /* one page area */
+
 struct amdgpu_mes_funcs;
 
 struct amdgpu_mes {
 	struct amdgpu_device            *adev;
 
+	struct mutex                    mutex;
+
+	struct idr                      pasid_idr;
+	struct idr                      gang_id_idr;
+	struct idr                      queue_id_idr;
+	struct ida                      doorbell_ida;
+
+	spinlock_t                      queue_id_lock;
+
 	uint32_t                        total_max_queue;
 	uint32_t                        doorbell_id_offset;
 	uint32_t                        max_doorbell_slices;
@@ -90,6 +106,48 @@ struct amdgpu_mes {
 	const struct amdgpu_mes_funcs   *funcs;
 };
 
+struct amdgpu_mes_process {
+	int			pasid;
+	struct			amdgpu_vm *vm;
+	uint64_t		pd_gpu_addr;
+	struct amdgpu_bo 	*proc_ctx_bo;
+	uint64_t 		proc_ctx_gpu_addr;
+	void 			*proc_ctx_cpu_ptr;
+	uint64_t 		process_quantum;
+	struct 			list_head gang_list;
+	uint32_t 		doorbell_index;
+	unsigned long 		*doorbell_bitmap;
+	struct mutex		doorbell_lock;
+};
+
+struct amdgpu_mes_gang {
+	int 				gang_id;
+	int 				priority;
+	int 				inprocess_gang_priority;
+	int 				global_priority_level;
+	struct list_head 		list;
+	struct amdgpu_mes_process 	*process;
+	struct amdgpu_bo 		*gang_ctx_bo;
+	uint64_t 			gang_ctx_gpu_addr;
+	void 				*gang_ctx_cpu_ptr;
+	uint64_t 			gang_quantum;
+	struct list_head 		queue_list;
+};
+
+struct amdgpu_mes_queue {
+	struct list_head 		list;
+	struct amdgpu_mes_gang 		*gang;
+	int 				queue_id;
+	uint64_t 			doorbell_off;
+	struct amdgpu_bo		*mqd_obj;
+	void				*mqd_cpu_ptr;
+	uint64_t 			mqd_gpu_addr;
+	uint64_t 			wptr_gpu_addr;
+	int 				queue_type;
+	int 				paging;
+	struct amdgpu_ring 		*ring;
+};
+
 struct mes_add_queue_input {
 	uint32_t	process_id;
 	uint64_t	page_table_base_addr;
-- 
2.35.1


* [PATCH 18/73] drm/amdgpu: add mes_kiq module parameter v2
  2022-04-29 17:45 [PATCH 00/73] MES support Alex Deucher
                   ` (16 preceding siblings ...)
  2022-04-29 17:45 ` [PATCH 17/73] drm/amdgpu: update mes process/gang/queue definitions Alex Deucher
@ 2022-04-29 17:45 ` Alex Deucher
  2022-04-29 17:45 ` [PATCH 19/73] drm/amdgpu: allocate doorbell index for mes kiq Alex Deucher
                   ` (54 subsequent siblings)
  72 siblings, 0 replies; 74+ messages in thread
From: Alex Deucher @ 2022-04-29 17:45 UTC (permalink / raw)
  To: amd-gfx; +Cc: Alex Deucher, Jack Xiao, Christian König, Hawking Zhang

From: Jack Xiao <Jack.Xiao@amd.com>

The mes_kiq parameter is used to enable the mes kiq pipe.
This module parameter will be unnecessary, or enabled by default,
in the final version.

v2: reword commit message.

Signed-off-by: Jack Xiao <Jack.Xiao@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
---
 drivers/gpu/drm/amd/amdgpu/amdgpu.h        |  2 ++
 drivers/gpu/drm/amd/amdgpu/amdgpu_device.c |  9 +++++++--
 drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c    | 10 ++++++++++
 3 files changed, 19 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
index 24bce7e691a8..4264abc5604d 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
@@ -208,6 +208,7 @@ extern int amdgpu_async_gfx_ring;
 extern int amdgpu_mcbp;
 extern int amdgpu_discovery;
 extern int amdgpu_mes;
+extern int amdgpu_mes_kiq;
 extern int amdgpu_noretry;
 extern int amdgpu_force_asic_type;
 extern int amdgpu_smartshift_bias;
@@ -940,6 +941,7 @@ struct amdgpu_device {
 
 	/* mes */
 	bool                            enable_mes;
+	bool                            enable_mes_kiq;
 	struct amdgpu_mes               mes;
 	struct amdgpu_mqd               mqds[AMDGPU_HW_IP_NUM];
 
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
index b9844249d464..b2366d0d3047 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@@ -3669,8 +3669,13 @@ int amdgpu_device_init(struct amdgpu_device *adev,
 	if (amdgpu_mcbp)
 		DRM_INFO("MCBP is enabled\n");
 
-	if (amdgpu_mes && adev->asic_type >= CHIP_NAVI10)
-		adev->enable_mes = true;
+	if (adev->asic_type >= CHIP_NAVI10) {
+		if (amdgpu_mes || amdgpu_mes_kiq)
+			adev->enable_mes = true;
+
+		if (amdgpu_mes_kiq)
+			adev->enable_mes_kiq = true;
+	}
 
 	/*
 	 * Reset domain needs to be present early, before XGMI hive discovered
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
index 80690cbac89f..9e72b3ec5d4d 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
@@ -171,6 +171,7 @@ int amdgpu_async_gfx_ring = 1;
 int amdgpu_mcbp;
 int amdgpu_discovery = -1;
 int amdgpu_mes;
+int amdgpu_mes_kiq;
 int amdgpu_noretry = -1;
 int amdgpu_force_asic_type = -1;
 int amdgpu_tmz = -1; /* auto */
@@ -636,6 +637,15 @@ MODULE_PARM_DESC(mes,
 	"Enable Micro Engine Scheduler (0 = disabled (default), 1 = enabled)");
 module_param_named(mes, amdgpu_mes, int, 0444);
 
+/**
+ * DOC: mes_kiq (int)
+ * Enable Micro Engine Scheduler KIQ. This is a new engine pipe for kiq.
+ * (0 = disabled (default), 1 = enabled)
+ */
+MODULE_PARM_DESC(mes_kiq,
+	"Enable Micro Engine Scheduler KIQ (0 = disabled (default), 1 = enabled)");
+module_param_named(mes_kiq, amdgpu_mes_kiq, int, 0444);
+
 /**
  * DOC: noretry (int)
  * Disable XNACK retry in the SQ by default on GFXv9 hardware. On ASICs that
-- 
2.35.1
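
The interaction between the two module parameters in the
amdgpu_device_init() hunk above, written as a small pure function.
The demo_* names are hypothetical; the logic follows the diff:
mes_kiq implies mes, and neither flag takes effect on ASICs older
than Navi10.

```c
#include <stdbool.h>

struct demo_mes_flags {
	bool enable_mes;
	bool enable_mes_kiq;
};

/* mes / mes_kiq are the raw module parameter values (0 or 1). */
static struct demo_mes_flags demo_resolve_mes_flags(bool navi10_or_newer,
						    int mes, int mes_kiq)
{
	struct demo_mes_flags flags = { false, false };

	if (navi10_or_newer) {
		/* Requesting the KIQ pipe also turns on MES itself. */
		if (mes || mes_kiq)
			flags.enable_mes = true;

		if (mes_kiq)
			flags.enable_mes_kiq = true;
	}

	return flags;
}
```

So booting with amdgpu.mes_kiq=1 alone is enough to get both flags
set on a supported ASIC.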


* [PATCH 19/73] drm/amdgpu: allocate doorbell index for mes kiq
  2022-04-29 17:45 [PATCH 00/73] MES support Alex Deucher
                   ` (17 preceding siblings ...)
  2022-04-29 17:45 ` [PATCH 18/73] drm/amdgpu: add mes_kiq module parameter v2 Alex Deucher
@ 2022-04-29 17:45 ` Alex Deucher
  2022-04-29 17:45 ` [PATCH 20/73] drm/amdgpu/mes: extend mes framework to support multiple mes pipes Alex Deucher
                   ` (53 subsequent siblings)
  72 siblings, 0 replies; 74+ messages in thread
From: Alex Deucher @ 2022-04-29 17:45 UTC (permalink / raw)
  To: amd-gfx; +Cc: Alex Deucher, Jack Xiao, Hawking Zhang

From: Jack Xiao <Jack.Xiao@amd.com>

Allocate a doorbell index for mes kiq queue.

Signed-off-by: Jack Xiao <Jack.Xiao@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_doorbell.h | 6 ++++--
 drivers/gpu/drm/amd/amdgpu/nv.c              | 3 ++-
 drivers/gpu/drm/amd/amdgpu/soc21.c           | 3 ++-
 3 files changed, 8 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_doorbell.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_doorbell.h
index 89e6ad30396f..2d9485e67125 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_doorbell.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_doorbell.h
@@ -53,7 +53,8 @@ struct amdgpu_doorbell_index {
 	uint32_t gfx_ring0;
 	uint32_t gfx_ring1;
 	uint32_t sdma_engine[8];
-	uint32_t mes_ring;
+	uint32_t mes_ring0;
+	uint32_t mes_ring1;
 	uint32_t ih;
 	union {
 		struct {
@@ -178,7 +179,8 @@ typedef enum _AMDGPU_NAVI10_DOORBELL_ASSIGNMENT
 	AMDGPU_NAVI10_DOORBELL_USERQUEUE_END		= 0x08A,
 	AMDGPU_NAVI10_DOORBELL_GFX_RING0		= 0x08B,
 	AMDGPU_NAVI10_DOORBELL_GFX_RING1		= 0x08C,
-	AMDGPU_NAVI10_DOORBELL_MES_RING		        = 0x090,
+	AMDGPU_NAVI10_DOORBELL_MES_RING0	        = 0x090,
+	AMDGPU_NAVI10_DOORBELL_MES_RING1		= 0x091,
 	/* SDMA:256~335*/
 	AMDGPU_NAVI10_DOORBELL_sDMA_ENGINE0		= 0x100,
 	AMDGPU_NAVI10_DOORBELL_sDMA_ENGINE1		= 0x10A,
diff --git a/drivers/gpu/drm/amd/amdgpu/nv.c b/drivers/gpu/drm/amd/amdgpu/nv.c
index 0a7946c59a42..8cf1a7f8a632 100644
--- a/drivers/gpu/drm/amd/amdgpu/nv.c
+++ b/drivers/gpu/drm/amd/amdgpu/nv.c
@@ -607,7 +607,8 @@ static void nv_init_doorbell_index(struct amdgpu_device *adev)
 	adev->doorbell_index.userqueue_end = AMDGPU_NAVI10_DOORBELL_USERQUEUE_END;
 	adev->doorbell_index.gfx_ring0 = AMDGPU_NAVI10_DOORBELL_GFX_RING0;
 	adev->doorbell_index.gfx_ring1 = AMDGPU_NAVI10_DOORBELL_GFX_RING1;
-	adev->doorbell_index.mes_ring = AMDGPU_NAVI10_DOORBELL_MES_RING;
+	adev->doorbell_index.mes_ring0 = AMDGPU_NAVI10_DOORBELL_MES_RING0;
+	adev->doorbell_index.mes_ring1 = AMDGPU_NAVI10_DOORBELL_MES_RING1;
 	adev->doorbell_index.sdma_engine[0] = AMDGPU_NAVI10_DOORBELL_sDMA_ENGINE0;
 	adev->doorbell_index.sdma_engine[1] = AMDGPU_NAVI10_DOORBELL_sDMA_ENGINE1;
 	adev->doorbell_index.sdma_engine[2] = AMDGPU_NAVI10_DOORBELL_sDMA_ENGINE2;
diff --git a/drivers/gpu/drm/amd/amdgpu/soc21.c b/drivers/gpu/drm/amd/amdgpu/soc21.c
index c3069f5e299a..68985a59a6a6 100644
--- a/drivers/gpu/drm/amd/amdgpu/soc21.c
+++ b/drivers/gpu/drm/amd/amdgpu/soc21.c
@@ -409,7 +409,8 @@ static void soc21_init_doorbell_index(struct amdgpu_device *adev)
 	adev->doorbell_index.userqueue_end = AMDGPU_NAVI10_DOORBELL_USERQUEUE_END;
 	adev->doorbell_index.gfx_ring0 = AMDGPU_NAVI10_DOORBELL_GFX_RING0;
 	adev->doorbell_index.gfx_ring1 = AMDGPU_NAVI10_DOORBELL_GFX_RING1;
-	adev->doorbell_index.mes_ring = AMDGPU_NAVI10_DOORBELL_MES_RING;
+	adev->doorbell_index.mes_ring0 = AMDGPU_NAVI10_DOORBELL_MES_RING0;
+	adev->doorbell_index.mes_ring1 = AMDGPU_NAVI10_DOORBELL_MES_RING1;
 	adev->doorbell_index.sdma_engine[0] = AMDGPU_NAVI10_DOORBELL_sDMA_ENGINE0;
 	adev->doorbell_index.sdma_engine[1] = AMDGPU_NAVI10_DOORBELL_sDMA_ENGINE1;
 	adev->doorbell_index.ih = AMDGPU_NAVI10_DOORBELL_IH;
-- 
2.35.1
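
The doorbell split in this patch can be sketched in isolation. The block below is only an illustration: the two assignment values come from the enum in the diff, and the shift by one is copied from mes_v10_1_ring_init() in the next patch of the series. The struct and function names are hypothetical stand-ins, and applying the same shift to mes_ring1 for the KIQ queue is an assumption.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Doorbell assignments as listed in the patch. */
enum example_doorbell {
	EXAMPLE_DOORBELL_MES_RING0 = 0x090, /* MES scheduler queue */
	EXAMPLE_DOORBELL_MES_RING1 = 0x091, /* MES KIQ queue */
};

/* Hypothetical, trimmed-down stand-in for struct amdgpu_doorbell_index. */
struct example_doorbell_index {
	uint32_t mes_ring0;
	uint32_t mes_ring1;
};

/* Mirrors the two assignments added to the *_init_doorbell_index() hooks. */
static void example_init_doorbell_index(struct example_doorbell_index *db)
{
	assert(db != NULL);
	db->mes_ring0 = EXAMPLE_DOORBELL_MES_RING0;
	db->mes_ring1 = EXAMPLE_DOORBELL_MES_RING1;
}

/* Mirrors "adev->doorbell_index.mes_ring0 << 1" from the ring init code. */
static uint32_t example_ring_doorbell(uint32_t assignment)
{
	return assignment << 1;
}
```

With these values the scheduler queue ends up at doorbell 0x120 and, under the assumption above, the KIQ queue at 0x122, so the two queues do not share a doorbell.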



* [PATCH 20/73] drm/amdgpu/mes: extend mes framework to support multiple mes pipes
  2022-04-29 17:45 [PATCH 00/73] MES support Alex Deucher
                   ` (18 preceding siblings ...)
  2022-04-29 17:45 ` [PATCH 19/73] drm/amdgpu: allocate doorbell index for mes kiq Alex Deucher
@ 2022-04-29 17:45 ` Alex Deucher
  2022-04-29 17:45 ` [PATCH 21/73] drm/amdgpu/gfx10: add mes queue fence handling Alex Deucher
                   ` (52 subsequent siblings)
  72 siblings, 0 replies; 74+ messages in thread
From: Alex Deucher @ 2022-04-29 17:45 UTC (permalink / raw)
  To: amd-gfx; +Cc: Alex Deucher, Jack Xiao, Hawking Zhang

From: Jack Xiao <Jack.Xiao@amd.com>

Add support for multiple MES pipes so that the existing code can be
reused to initialize additional MES pipes and queues.

Signed-off-by: Jack Xiao <Jack.Xiao@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_mes.h |  35 ++--
 drivers/gpu/drm/amd/amdgpu/mes_v10_1.c  | 215 ++++++++++++++----------
 2 files changed, 149 insertions(+), 101 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.h
index 52483d7ce843..91b020842eb0 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.h
@@ -46,6 +46,12 @@ enum amdgpu_mes_priority_level {
 
 struct amdgpu_mes_funcs;
 
+enum admgpu_mes_pipe {
+	AMDGPU_MES_SCHED_PIPE = 0,
+	AMDGPU_MES_KIQ_PIPE,
+	AMDGPU_MAX_MES_PIPES = 2,
+};
+
 struct amdgpu_mes {
 	struct amdgpu_device            *adev;
 
@@ -67,27 +73,28 @@ struct amdgpu_mes {
 
 	struct amdgpu_ring              ring;
 
-	const struct firmware           *fw;
+	const struct firmware           *fw[AMDGPU_MAX_MES_PIPES];
 
 	/* mes ucode */
-	struct amdgpu_bo		*ucode_fw_obj;
-	uint64_t			ucode_fw_gpu_addr;
-	uint32_t			*ucode_fw_ptr;
-	uint32_t                        ucode_fw_version;
-	uint64_t                        uc_start_addr;
+	struct amdgpu_bo		*ucode_fw_obj[AMDGPU_MAX_MES_PIPES];
+	uint64_t			ucode_fw_gpu_addr[AMDGPU_MAX_MES_PIPES];
+	uint32_t			*ucode_fw_ptr[AMDGPU_MAX_MES_PIPES];
+	uint32_t                        ucode_fw_version[AMDGPU_MAX_MES_PIPES];
+	uint64_t                        uc_start_addr[AMDGPU_MAX_MES_PIPES];
 
 	/* mes ucode data */
-	struct amdgpu_bo		*data_fw_obj;
-	uint64_t			data_fw_gpu_addr;
-	uint32_t			*data_fw_ptr;
-	uint32_t                        data_fw_version;
-	uint64_t                        data_start_addr;
+	struct amdgpu_bo		*data_fw_obj[AMDGPU_MAX_MES_PIPES];
+	uint64_t			data_fw_gpu_addr[AMDGPU_MAX_MES_PIPES];
+	uint32_t			*data_fw_ptr[AMDGPU_MAX_MES_PIPES];
+	uint32_t                        data_fw_version[AMDGPU_MAX_MES_PIPES];
+	uint64_t                        data_start_addr[AMDGPU_MAX_MES_PIPES];
 
 	/* eop gpu obj */
-	struct amdgpu_bo		*eop_gpu_obj;
-	uint64_t                        eop_gpu_addr;
+	struct amdgpu_bo		*eop_gpu_obj[AMDGPU_MAX_MES_PIPES];
+	uint64_t                        eop_gpu_addr[AMDGPU_MAX_MES_PIPES];
 
-	void                            *mqd_backup;
+	void                            *mqd_backup[AMDGPU_MAX_MES_PIPES];
+	struct amdgpu_irq_src	        irq[AMDGPU_MAX_MES_PIPES];
 
 	uint32_t                        vmid_mask_gfxhub;
 	uint32_t                        vmid_mask_mmhub;
diff --git a/drivers/gpu/drm/amd/amdgpu/mes_v10_1.c b/drivers/gpu/drm/amd/amdgpu/mes_v10_1.c
index 0819ffe8e759..f82a6f981629 100644
--- a/drivers/gpu/drm/amd/amdgpu/mes_v10_1.c
+++ b/drivers/gpu/drm/amd/amdgpu/mes_v10_1.c
@@ -269,7 +269,8 @@ static const struct amdgpu_mes_funcs mes_v10_1_funcs = {
 	.resume_gang = mes_v10_1_resume_gang,
 };
 
-static int mes_v10_1_init_microcode(struct amdgpu_device *adev)
+static int mes_v10_1_init_microcode(struct amdgpu_device *adev,
+				    enum admgpu_mes_pipe pipe)
 {
 	const char *chip_name;
 	char fw_name[30];
@@ -288,40 +289,56 @@ static int mes_v10_1_init_microcode(struct amdgpu_device *adev)
 		BUG();
 	}
 
-	snprintf(fw_name, sizeof(fw_name), "amdgpu/%s_mes.bin", chip_name);
-	err = request_firmware(&adev->mes.fw, fw_name, adev->dev);
+	if (pipe == AMDGPU_MES_SCHED_PIPE)
+		snprintf(fw_name, sizeof(fw_name), "amdgpu/%s_mes.bin",
+			 chip_name);
+	else
+		BUG();
+
+	err = request_firmware(&adev->mes.fw[pipe], fw_name, adev->dev);
 	if (err)
 		return err;
 
-	err = amdgpu_ucode_validate(adev->mes.fw);
+	err = amdgpu_ucode_validate(adev->mes.fw[pipe]);
 	if (err) {
-		release_firmware(adev->mes.fw);
-		adev->mes.fw = NULL;
+		release_firmware(adev->mes.fw[pipe]);
+		adev->mes.fw[pipe] = NULL;
 		return err;
 	}
 
-	mes_hdr = (const struct mes_firmware_header_v1_0 *)adev->mes.fw->data;
-	adev->mes.ucode_fw_version = le32_to_cpu(mes_hdr->mes_ucode_version);
-	adev->mes.ucode_fw_version =
+	mes_hdr = (const struct mes_firmware_header_v1_0 *)
+		adev->mes.fw[pipe]->data;
+	adev->mes.ucode_fw_version[pipe] =
+		le32_to_cpu(mes_hdr->mes_ucode_version);
+	adev->mes.data_fw_version[pipe] =
 		le32_to_cpu(mes_hdr->mes_ucode_data_version);
-	adev->mes.uc_start_addr =
+	adev->mes.uc_start_addr[pipe] =
 		le32_to_cpu(mes_hdr->mes_uc_start_addr_lo) |
 		((uint64_t)(le32_to_cpu(mes_hdr->mes_uc_start_addr_hi)) << 32);
-	adev->mes.data_start_addr =
+	adev->mes.data_start_addr[pipe] =
 		le32_to_cpu(mes_hdr->mes_data_start_addr_lo) |
 		((uint64_t)(le32_to_cpu(mes_hdr->mes_data_start_addr_hi)) << 32);
 
 	if (adev->firmware.load_type == AMDGPU_FW_LOAD_PSP) {
-		info = &adev->firmware.ucode[AMDGPU_UCODE_ID_CP_MES];
-		info->ucode_id = AMDGPU_UCODE_ID_CP_MES;
-		info->fw = adev->mes.fw;
+		int ucode, ucode_data;
+
+		if (pipe == AMDGPU_MES_SCHED_PIPE) {
+			ucode = AMDGPU_UCODE_ID_CP_MES;
+			ucode_data = AMDGPU_UCODE_ID_CP_MES_DATA;
+		} else {
+			BUG();
+		}
+
+		info = &adev->firmware.ucode[ucode];
+		info->ucode_id = ucode;
+		info->fw = adev->mes.fw[pipe];
 		adev->firmware.fw_size +=
 			ALIGN(le32_to_cpu(mes_hdr->mes_ucode_size_bytes),
 			      PAGE_SIZE);
 
-		info = &adev->firmware.ucode[AMDGPU_UCODE_ID_CP_MES_DATA];
-		info->ucode_id = AMDGPU_UCODE_ID_CP_MES_DATA;
-		info->fw = adev->mes.fw;
+		info = &adev->firmware.ucode[ucode_data];
+		info->ucode_id = ucode_data;
+		info->fw = adev->mes.fw[pipe];
 		adev->firmware.fw_size +=
 			ALIGN(le32_to_cpu(mes_hdr->mes_ucode_data_size_bytes),
 			      PAGE_SIZE);
@@ -330,13 +347,15 @@ static int mes_v10_1_init_microcode(struct amdgpu_device *adev)
 	return 0;
 }
 
-static void mes_v10_1_free_microcode(struct amdgpu_device *adev)
+static void mes_v10_1_free_microcode(struct amdgpu_device *adev,
+				     enum admgpu_mes_pipe pipe)
 {
-	release_firmware(adev->mes.fw);
-	adev->mes.fw = NULL;
+	release_firmware(adev->mes.fw[pipe]);
+	adev->mes.fw[pipe] = NULL;
 }
 
-static int mes_v10_1_allocate_ucode_buffer(struct amdgpu_device *adev)
+static int mes_v10_1_allocate_ucode_buffer(struct amdgpu_device *adev,
+					   enum admgpu_mes_pipe pipe)
 {
 	int r;
 	const struct mes_firmware_header_v1_0 *mes_hdr;
@@ -344,31 +363,32 @@ static int mes_v10_1_allocate_ucode_buffer(struct amdgpu_device *adev)
 	unsigned fw_size;
 
 	mes_hdr = (const struct mes_firmware_header_v1_0 *)
-		adev->mes.fw->data;
+		adev->mes.fw[pipe]->data;
 
-	fw_data = (const __le32 *)(adev->mes.fw->data +
+	fw_data = (const __le32 *)(adev->mes.fw[pipe]->data +
 		   le32_to_cpu(mes_hdr->mes_ucode_offset_bytes));
 	fw_size = le32_to_cpu(mes_hdr->mes_ucode_size_bytes);
 
 	r = amdgpu_bo_create_reserved(adev, fw_size,
 				      PAGE_SIZE, AMDGPU_GEM_DOMAIN_GTT,
-				      &adev->mes.ucode_fw_obj,
-				      &adev->mes.ucode_fw_gpu_addr,
-				      (void **)&adev->mes.ucode_fw_ptr);
+				      &adev->mes.ucode_fw_obj[pipe],
+				      &adev->mes.ucode_fw_gpu_addr[pipe],
+				      (void **)&adev->mes.ucode_fw_ptr[pipe]);
 	if (r) {
 		dev_err(adev->dev, "(%d) failed to create mes fw bo\n", r);
 		return r;
 	}
 
-	memcpy(adev->mes.ucode_fw_ptr, fw_data, fw_size);
+	memcpy(adev->mes.ucode_fw_ptr[pipe], fw_data, fw_size);
 
-	amdgpu_bo_kunmap(adev->mes.ucode_fw_obj);
-	amdgpu_bo_unreserve(adev->mes.ucode_fw_obj);
+	amdgpu_bo_kunmap(adev->mes.ucode_fw_obj[pipe]);
+	amdgpu_bo_unreserve(adev->mes.ucode_fw_obj[pipe]);
 
 	return 0;
 }
 
-static int mes_v10_1_allocate_ucode_data_buffer(struct amdgpu_device *adev)
+static int mes_v10_1_allocate_ucode_data_buffer(struct amdgpu_device *adev,
+						enum admgpu_mes_pipe pipe)
 {
 	int r;
 	const struct mes_firmware_header_v1_0 *mes_hdr;
@@ -376,53 +396,63 @@ static int mes_v10_1_allocate_ucode_data_buffer(struct amdgpu_device *adev)
 	unsigned fw_size;
 
 	mes_hdr = (const struct mes_firmware_header_v1_0 *)
-		adev->mes.fw->data;
+		adev->mes.fw[pipe]->data;
 
-	fw_data = (const __le32 *)(adev->mes.fw->data +
+	fw_data = (const __le32 *)(adev->mes.fw[pipe]->data +
 		   le32_to_cpu(mes_hdr->mes_ucode_data_offset_bytes));
 	fw_size = le32_to_cpu(mes_hdr->mes_ucode_data_size_bytes);
 
 	r = amdgpu_bo_create_reserved(adev, fw_size,
 				      64 * 1024, AMDGPU_GEM_DOMAIN_GTT,
-				      &adev->mes.data_fw_obj,
-				      &adev->mes.data_fw_gpu_addr,
-				      (void **)&adev->mes.data_fw_ptr);
+				      &adev->mes.data_fw_obj[pipe],
+				      &adev->mes.data_fw_gpu_addr[pipe],
+				      (void **)&adev->mes.data_fw_ptr[pipe]);
 	if (r) {
 		dev_err(adev->dev, "(%d) failed to create mes data fw bo\n", r);
 		return r;
 	}
 
-	memcpy(adev->mes.data_fw_ptr, fw_data, fw_size);
+	memcpy(adev->mes.data_fw_ptr[pipe], fw_data, fw_size);
 
-	amdgpu_bo_kunmap(adev->mes.data_fw_obj);
-	amdgpu_bo_unreserve(adev->mes.data_fw_obj);
+	amdgpu_bo_kunmap(adev->mes.data_fw_obj[pipe]);
+	amdgpu_bo_unreserve(adev->mes.data_fw_obj[pipe]);
 
 	return 0;
 }
 
-static void mes_v10_1_free_ucode_buffers(struct amdgpu_device *adev)
+static void mes_v10_1_free_ucode_buffers(struct amdgpu_device *adev,
+					 enum admgpu_mes_pipe pipe)
 {
-	amdgpu_bo_free_kernel(&adev->mes.data_fw_obj,
-			      &adev->mes.data_fw_gpu_addr,
-			      (void **)&adev->mes.data_fw_ptr);
+	amdgpu_bo_free_kernel(&adev->mes.data_fw_obj[pipe],
+			      &adev->mes.data_fw_gpu_addr[pipe],
+			      (void **)&adev->mes.data_fw_ptr[pipe]);
 
-	amdgpu_bo_free_kernel(&adev->mes.ucode_fw_obj,
-			      &adev->mes.ucode_fw_gpu_addr,
-			      (void **)&adev->mes.ucode_fw_ptr);
+	amdgpu_bo_free_kernel(&adev->mes.ucode_fw_obj[pipe],
+			      &adev->mes.ucode_fw_gpu_addr[pipe],
+			      (void **)&adev->mes.ucode_fw_ptr[pipe]);
 }
 
 static void mes_v10_1_enable(struct amdgpu_device *adev, bool enable)
 {
-	uint32_t data = 0;
+	uint32_t pipe, data = 0;
 
 	if (enable) {
 		data = RREG32_SOC15(GC, 0, mmCP_MES_CNTL);
 		data = REG_SET_FIELD(data, CP_MES_CNTL, MES_PIPE0_RESET, 1);
 		WREG32_SOC15(GC, 0, mmCP_MES_CNTL, data);
 
-		/* set ucode start address */
-		WREG32_SOC15(GC, 0, mmCP_MES_PRGRM_CNTR_START,
-			     (uint32_t)(adev->mes.uc_start_addr) >> 2);
+		mutex_lock(&adev->srbm_mutex);
+		for (pipe = 0; pipe < AMDGPU_MAX_MES_PIPES; pipe++) {
+			if (!adev->enable_mes_kiq &&
+			    pipe == AMDGPU_MES_KIQ_PIPE)
+				continue;
+
+			nv_grbm_select(adev, 3, pipe, 0, 0);
+			WREG32_SOC15(GC, 0, mmCP_MES_PRGRM_CNTR_START,
+			     (uint32_t)(adev->mes.uc_start_addr[pipe]) >> 2);
+		}
+		nv_grbm_select(adev, 0, 0, 0, 0);
+		mutex_unlock(&adev->srbm_mutex);
 
 		/* clear BYPASS_UNCACHED to avoid hangs after interrupt. */
 		data = RREG32_SOC15(GC, 0, mmCP_MES_DC_OP_CNTL);
@@ -445,50 +475,51 @@ static void mes_v10_1_enable(struct amdgpu_device *adev, bool enable)
 }
 
 /* This function is for backdoor MES firmware */
-static int mes_v10_1_load_microcode(struct amdgpu_device *adev)
+static int mes_v10_1_load_microcode(struct amdgpu_device *adev,
+				    enum admgpu_mes_pipe pipe)
 {
 	int r;
 	uint32_t data;
 
-	if (!adev->mes.fw)
+	mes_v10_1_enable(adev, false);
+
+	if (!adev->mes.fw[pipe])
 		return -EINVAL;
 
-	r = mes_v10_1_allocate_ucode_buffer(adev);
+	r = mes_v10_1_allocate_ucode_buffer(adev, pipe);
 	if (r)
 		return r;
 
-	r = mes_v10_1_allocate_ucode_data_buffer(adev);
+	r = mes_v10_1_allocate_ucode_data_buffer(adev, pipe);
 	if (r) {
-		mes_v10_1_free_ucode_buffers(adev);
+		mes_v10_1_free_ucode_buffers(adev, pipe);
 		return r;
 	}
 
-	mes_v10_1_enable(adev, false);
-
 	WREG32_SOC15(GC, 0, mmCP_MES_IC_BASE_CNTL, 0);
 
 	mutex_lock(&adev->srbm_mutex);
 	/* me=3, pipe=0, queue=0 */
-	nv_grbm_select(adev, 3, 0, 0, 0);
+	nv_grbm_select(adev, 3, pipe, 0, 0);
 
 	/* set ucode start address */
 	WREG32_SOC15(GC, 0, mmCP_MES_PRGRM_CNTR_START,
-		     (uint32_t)(adev->mes.uc_start_addr) >> 2);
+		     (uint32_t)(adev->mes.uc_start_addr[pipe]) >> 2);
 
 	/* set ucode fimrware address */
 	WREG32_SOC15(GC, 0, mmCP_MES_IC_BASE_LO,
-		     lower_32_bits(adev->mes.ucode_fw_gpu_addr));
+		     lower_32_bits(adev->mes.ucode_fw_gpu_addr[pipe]));
 	WREG32_SOC15(GC, 0, mmCP_MES_IC_BASE_HI,
-		     upper_32_bits(adev->mes.ucode_fw_gpu_addr));
+		     upper_32_bits(adev->mes.ucode_fw_gpu_addr[pipe]));
 
 	/* set ucode instruction cache boundary to 2M-1 */
 	WREG32_SOC15(GC, 0, mmCP_MES_MIBOUND_LO, 0x1FFFFF);
 
 	/* set ucode data firmware address */
 	WREG32_SOC15(GC, 0, mmCP_MES_MDBASE_LO,
-		     lower_32_bits(adev->mes.data_fw_gpu_addr));
+		     lower_32_bits(adev->mes.data_fw_gpu_addr[pipe]));
 	WREG32_SOC15(GC, 0, mmCP_MES_MDBASE_HI,
-		     upper_32_bits(adev->mes.data_fw_gpu_addr));
+		     upper_32_bits(adev->mes.data_fw_gpu_addr[pipe]));
 
 	/* Set 0x3FFFF (256K-1) to CP_MES_MDBOUND_LO */
 	WREG32_SOC15(GC, 0, mmCP_MES_MDBOUND_LO, 0x3FFFF);
@@ -538,25 +569,26 @@ static int mes_v10_1_load_microcode(struct amdgpu_device *adev)
 	return 0;
 }
 
-static int mes_v10_1_allocate_eop_buf(struct amdgpu_device *adev)
+static int mes_v10_1_allocate_eop_buf(struct amdgpu_device *adev,
+				      enum admgpu_mes_pipe pipe)
 {
 	int r;
 	u32 *eop;
 
 	r = amdgpu_bo_create_reserved(adev, MES_EOP_SIZE, PAGE_SIZE,
-				      AMDGPU_GEM_DOMAIN_GTT,
-				      &adev->mes.eop_gpu_obj,
-				      &adev->mes.eop_gpu_addr,
-				      (void **)&eop);
+			      AMDGPU_GEM_DOMAIN_GTT,
+			      &adev->mes.eop_gpu_obj[pipe],
+			      &adev->mes.eop_gpu_addr[pipe],
+			      (void **)&eop);
 	if (r) {
 		dev_warn(adev->dev, "(%d) create EOP bo failed\n", r);
 		return r;
 	}
 
-	memset(eop, 0, adev->mes.eop_gpu_obj->tbo.base.size);
+	memset(eop, 0, adev->mes.eop_gpu_obj[pipe]->tbo.base.size);
 
-	amdgpu_bo_kunmap(adev->mes.eop_gpu_obj);
-	amdgpu_bo_unreserve(adev->mes.eop_gpu_obj);
+	amdgpu_bo_kunmap(adev->mes.eop_gpu_obj[pipe]);
+	amdgpu_bo_unreserve(adev->mes.eop_gpu_obj[pipe]);
 
 	return 0;
 }
@@ -727,7 +759,7 @@ static void mes_v10_1_queue_init_register(struct amdgpu_ring *ring)
 	uint32_t data = 0;
 
 	mutex_lock(&adev->srbm_mutex);
-	nv_grbm_select(adev, 3, 0, 0, 0);
+	nv_grbm_select(adev, 3, ring->pipe, 0, 0);
 
 	/* set CP_HQD_VMID.VMID = 0. */
 	data = RREG32_SOC15(GC, 0, mmCP_HQD_VMID);
@@ -842,8 +874,8 @@ static int mes_v10_1_ring_init(struct amdgpu_device *adev)
 
 	ring->ring_obj = NULL;
 	ring->use_doorbell = true;
-	ring->doorbell_index = adev->doorbell_index.mes_ring << 1;
-	ring->eop_gpu_addr = adev->mes.eop_gpu_addr;
+	ring->doorbell_index = adev->doorbell_index.mes_ring0 << 1;
+	ring->eop_gpu_addr = adev->mes.eop_gpu_addr[AMDGPU_MES_SCHED_PIPE];
 	ring->no_scheduler = true;
 	sprintf(ring->name, "mes_%d.%d.%d", ring->me, ring->pipe, ring->queue);
 
@@ -851,10 +883,16 @@ static int mes_v10_1_ring_init(struct amdgpu_device *adev)
 				AMDGPU_RING_PRIO_DEFAULT, NULL);
 }
 
-static int mes_v10_1_mqd_sw_init(struct amdgpu_device *adev)
+static int mes_v10_1_mqd_sw_init(struct amdgpu_device *adev,
+				 enum admgpu_mes_pipe pipe)
 {
 	int r, mqd_size = sizeof(struct v10_compute_mqd);
-	struct amdgpu_ring *ring = &adev->mes.ring;
+	struct amdgpu_ring *ring;
+
+	if (pipe == AMDGPU_MES_SCHED_PIPE)
+		ring = &adev->mes.ring;
+	else
+		BUG();
 
 	if (ring->mqd_obj)
 		return 0;
@@ -868,8 +906,8 @@ static int mes_v10_1_mqd_sw_init(struct amdgpu_device *adev)
 	}
 
 	/* prepare MQD backup */
-	adev->mes.mqd_backup = kmalloc(mqd_size, GFP_KERNEL);
-	if (!adev->mes.mqd_backup)
+	adev->mes.mqd_backup[pipe] = kmalloc(mqd_size, GFP_KERNEL);
+	if (!adev->mes.mqd_backup[pipe])
 		dev_warn(adev->dev,
 			 "no memory to create MQD backup for ring %s\n",
 			 ring->name);
@@ -879,21 +917,21 @@ static int mes_v10_1_mqd_sw_init(struct amdgpu_device *adev)
 
 static int mes_v10_1_sw_init(void *handle)
 {
-	int r;
 	struct amdgpu_device *adev = (struct amdgpu_device *)handle;
+	int r, pipe = AMDGPU_MES_SCHED_PIPE;
 
 	adev->mes.adev = adev;
 	adev->mes.funcs = &mes_v10_1_funcs;
 
-	r = mes_v10_1_init_microcode(adev);
+	r = mes_v10_1_init_microcode(adev, pipe);
 	if (r)
 		return r;
 
-	r = mes_v10_1_allocate_eop_buf(adev);
+	r = mes_v10_1_allocate_eop_buf(adev, pipe);
 	if (r)
 		return r;
 
-	r = mes_v10_1_mqd_sw_init(adev);
+	r = mes_v10_1_mqd_sw_init(adev, pipe);
 	if (r)
 		return r;
 
@@ -911,21 +949,23 @@ static int mes_v10_1_sw_init(void *handle)
 static int mes_v10_1_sw_fini(void *handle)
 {
 	struct amdgpu_device *adev = (struct amdgpu_device *)handle;
+	int pipe = AMDGPU_MES_SCHED_PIPE;
 
 	amdgpu_device_wb_free(adev, adev->mes.sch_ctx_offs);
 	amdgpu_device_wb_free(adev, adev->mes.query_status_fence_offs);
 
-	kfree(adev->mes.mqd_backup);
+	kfree(adev->mes.mqd_backup[pipe]);
 
 	amdgpu_bo_free_kernel(&adev->mes.ring.mqd_obj,
 			      &adev->mes.ring.mqd_gpu_addr,
 			      &adev->mes.ring.mqd_ptr);
 
-	amdgpu_bo_free_kernel(&adev->mes.eop_gpu_obj,
-			      &adev->mes.eop_gpu_addr,
+	amdgpu_bo_free_kernel(&adev->mes.eop_gpu_obj[pipe],
+			      &adev->mes.eop_gpu_addr[pipe],
 			      NULL);
 
-	mes_v10_1_free_microcode(adev);
+	mes_v10_1_free_microcode(adev, pipe);
+	amdgpu_ring_fini(&adev->mes.ring);
 
 	return 0;
 }
@@ -936,7 +976,8 @@ static int mes_v10_1_hw_init(void *handle)
 	struct amdgpu_device *adev = (struct amdgpu_device *)handle;
 
 	if (adev->firmware.load_type == AMDGPU_FW_LOAD_DIRECT) {
-		r = mes_v10_1_load_microcode(adev);
+		r = mes_v10_1_load_microcode(adev,
+					     AMDGPU_MES_SCHED_PIPE);
 		if (r) {
 			DRM_ERROR("failed to MES fw, r=%d\n", r);
 			return r;
@@ -973,7 +1014,7 @@ static int mes_v10_1_hw_fini(void *handle)
 	mes_v10_1_enable(adev, false);
 
 	if (adev->firmware.load_type == AMDGPU_FW_LOAD_DIRECT)
-		mes_v10_1_free_ucode_buffers(adev);
+		mes_v10_1_free_ucode_buffers(adev, AMDGPU_MES_SCHED_PIPE);
 
 	return 0;
 }
-- 
2.35.1
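
The per-pipe layout introduced here can be sketched outside the driver. The block below is a simplified model, not the driver code: the enum values and the skip condition follow the diff, while the struct, the register array and all example_ names are hypothetical. It shows how the 64-bit start address is assembled from the two header words and how the enable loop skips the KIQ pipe when it is not in use.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

enum example_mes_pipe {
	EXAMPLE_MES_SCHED_PIPE = 0,
	EXAMPLE_MES_KIQ_PIPE,
	EXAMPLE_MAX_MES_PIPES = 2,
};

/* Hypothetical reduction of struct amdgpu_mes: one slot per pipe. */
struct example_mes {
	uint64_t uc_start_addr[EXAMPLE_MAX_MES_PIPES];
	bool enable_kiq; /* stands in for adev->enable_mes_kiq */
};

/* Combine the lo/hi header words the way init_microcode() does. */
static uint64_t example_start_addr(uint32_t lo, uint32_t hi)
{
	return (uint64_t)lo | ((uint64_t)hi << 32);
}

/*
 * Walk the pipes as the enable path does and record the value that
 * would be written to the program counter start register of each
 * pipe. Returns the number of pipes that were programmed.
 */
static int example_program_pipes(const struct example_mes *mes,
				 uint32_t regs[EXAMPLE_MAX_MES_PIPES])
{
	int pipe, programmed = 0;

	assert(mes != NULL && regs != NULL);
	for (pipe = 0; pipe < EXAMPLE_MAX_MES_PIPES; pipe++) {
		if (!mes->enable_kiq && pipe == EXAMPLE_MES_KIQ_PIPE)
			continue;
		/* The cast truncates to 32 bits before the shift. */
		regs[pipe] = (uint32_t)(mes->uc_start_addr[pipe]) >> 2;
		programmed++;
	}
	return programmed;
}
```

Indexing every firmware field by the pipe enum lets one set of helpers serve both pipes. The real code additionally selects the pipe through nv_grbm_select() under srbm_mutex before each register write, which this sketch leaves out.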



* [PATCH 21/73] drm/amdgpu/gfx10: add mes queue fence handling
  2022-04-29 17:45 [PATCH 00/73] MES support Alex Deucher
                   ` (19 preceding siblings ...)
  2022-04-29 17:45 ` [PATCH 20/73] drm/amdgpu/mes: extend mes framework to support multiple mes pipes Alex Deucher
@ 2022-04-29 17:45 ` Alex Deucher
  2022-04-29 17:45 ` [PATCH 22/73] drm/amdgpu/gfx10: add mes support for gfx ib test Alex Deucher
                   ` (51 subsequent siblings)
  72 siblings, 0 replies; 74+ messages in thread
From: Alex Deucher @ 2022-04-29 17:45 UTC (permalink / raw)
  To: amd-gfx; +Cc: Alex Deucher, Jack Xiao, Christian König, Hawking Zhang

From: Jack Xiao <Jack.Xiao@amd.com>

Use the queue id carried in the IH ring buffer entry to look up the
corresponding MES kernel queue and process its fences.

Signed-off-by: Jack Xiao <Jack.Xiao@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
---
 drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c | 60 +++++++++++++++++---------
 1 file changed, 40 insertions(+), 20 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c b/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c
index 2ab5259c7305..0e009bd69a9b 100644
--- a/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c
@@ -9188,31 +9188,51 @@ static int gfx_v10_0_eop_irq(struct amdgpu_device *adev,
 	int i;
 	u8 me_id, pipe_id, queue_id;
 	struct amdgpu_ring *ring;
+	uint32_t mes_queue_id = entry->src_data[0];
 
 	DRM_DEBUG("IH: CP EOP\n");
-	me_id = (entry->ring_id & 0x0c) >> 2;
-	pipe_id = (entry->ring_id & 0x03) >> 0;
-	queue_id = (entry->ring_id & 0x70) >> 4;
 
-	switch (me_id) {
-	case 0:
-		if (pipe_id == 0)
-			amdgpu_fence_process(&adev->gfx.gfx_ring[0]);
-		else
-			amdgpu_fence_process(&adev->gfx.gfx_ring[1]);
-		break;
-	case 1:
-	case 2:
-		for (i = 0; i < adev->gfx.num_compute_rings; i++) {
-			ring = &adev->gfx.compute_ring[i];
-			/* Per-queue interrupt is supported for MEC starting from VI.
-			  * The interrupt can only be enabled/disabled per pipe instead of per queue.
-			  */
-			if ((ring->me == me_id) && (ring->pipe == pipe_id) && (ring->queue == queue_id))
-				amdgpu_fence_process(ring);
+	if (adev->enable_mes && (mes_queue_id & AMDGPU_FENCE_MES_QUEUE_FLAG)) {
+		struct amdgpu_mes_queue *queue;
+
+		mes_queue_id &= AMDGPU_FENCE_MES_QUEUE_ID_MASK;
+
+		spin_lock(&adev->mes.queue_id_lock);
+		queue = idr_find(&adev->mes.queue_id_idr, mes_queue_id);
+		if (queue) {
+			DRM_DEBUG("process mes queue id = %d\n", mes_queue_id);
+			amdgpu_fence_process(queue->ring);
+		}
+		spin_unlock(&adev->mes.queue_id_lock);
+	} else {
+		me_id = (entry->ring_id & 0x0c) >> 2;
+		pipe_id = (entry->ring_id & 0x03) >> 0;
+		queue_id = (entry->ring_id & 0x70) >> 4;
+
+		switch (me_id) {
+		case 0:
+			if (pipe_id == 0)
+				amdgpu_fence_process(&adev->gfx.gfx_ring[0]);
+			else
+				amdgpu_fence_process(&adev->gfx.gfx_ring[1]);
+			break;
+		case 1:
+		case 2:
+			for (i = 0; i < adev->gfx.num_compute_rings; i++) {
+				ring = &adev->gfx.compute_ring[i];
+				/* Per-queue interrupt is supported for MEC starting from VI.
+				 * The interrupt can only be enabled/disabled per pipe instead
+				 * of per queue.
+				 */
+				if ((ring->me == me_id) &&
+				    (ring->pipe == pipe_id) &&
+				    (ring->queue == queue_id))
+					amdgpu_fence_process(ring);
+			}
+			break;
 		}
-		break;
 	}
+
 	return 0;
 }
 
-- 
2.35.1
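
The two dispatch paths in the interrupt handler can be illustrated with a small standalone sketch. The ring_id masks are taken from the diff. The flag and mask values for the MES queue id are not part of this patch, so the EXAMPLE_ constants below are placeholders chosen for illustration, and every example_ name is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct example_ring_id {
	uint8_t me;
	uint8_t pipe;
	uint8_t queue;
};

/* Legacy path: split ring_id into me/pipe/queue with the masks from the diff. */
static void example_decode_ring_id(uint32_t ring_id, struct example_ring_id *id)
{
	assert(id != NULL);
	id->me = (ring_id & 0x0c) >> 2;
	id->pipe = (ring_id & 0x03) >> 0;
	id->queue = (ring_id & 0x70) >> 4;
}

/* Placeholder values; the real definitions live elsewhere in the series. */
#define EXAMPLE_MES_QUEUE_FLAG    0x1000000u
#define EXAMPLE_MES_QUEUE_ID_MASK (EXAMPLE_MES_QUEUE_FLAG - 1)

/*
 * MES path: returns the queue id to look up when MES is enabled and the
 * flag is set in src_data[0], or -1 when the legacy decode applies.
 */
static int example_mes_queue_id(int enable_mes, uint32_t src_data0)
{
	if (enable_mes && (src_data0 & EXAMPLE_MES_QUEUE_FLAG))
		return (int)(src_data0 & EXAMPLE_MES_QUEUE_ID_MASK);
	return -1;
}
```

In the driver the returned id is resolved through idr_find() under queue_id_lock, and a missing entry is simply ignored rather than treated as an error.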



* [PATCH 22/73] drm/amdgpu/gfx10: add mes support for gfx ib test
  2022-04-29 17:45 [PATCH 00/73] MES support Alex Deucher
                   ` (20 preceding siblings ...)
  2022-04-29 17:45 ` [PATCH 21/73] drm/amdgpu/gfx10: add mes queue fence handling Alex Deucher
@ 2022-04-29 17:45 ` Alex Deucher
  2022-04-29 17:45 ` [PATCH 23/73] drm/amdgpu: don't use kiq to flush gpu tlb if mes enabled Alex Deucher
                   ` (50 subsequent siblings)
  72 siblings, 0 replies; 74+ messages in thread
From: Alex Deucher @ 2022-04-29 17:45 UTC (permalink / raw)
  To: amd-gfx; +Cc: Alex Deucher, Jack Xiao, Christian König, Hawking Zhang

From: Jack Xiao <Jack.Xiao@amd.com>

Add mes support for gfx ib test.

Signed-off-by: Jack Xiao <Jack.Xiao@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
---
 drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c | 46 ++++++++++++++++++--------
 1 file changed, 33 insertions(+), 13 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c b/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c
index 0e009bd69a9b..1208d01cc936 100644
--- a/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c
@@ -3818,19 +3818,39 @@ static int gfx_v10_0_ring_test_ib(struct amdgpu_ring *ring, long timeout)
 	struct dma_fence *f = NULL;
 	unsigned index;
 	uint64_t gpu_addr;
-	uint32_t tmp;
+	volatile uint32_t *cpu_ptr;
 	long r;
 
-	r = amdgpu_device_wb_get(adev, &index);
-	if (r)
-		return r;
-
-	gpu_addr = adev->wb.gpu_addr + (index * 4);
-	adev->wb.wb[index] = cpu_to_le32(0xCAFEDEAD);
 	memset(&ib, 0, sizeof(ib));
-	r = amdgpu_ib_get(adev, NULL, 20, AMDGPU_IB_POOL_DIRECT, &ib);
-	if (r)
-		goto err1;
+
+	if (ring->is_mes_queue) {
+		uint32_t padding, offset;
+
+		offset = amdgpu_mes_ctx_get_offs(ring, AMDGPU_MES_CTX_IB_OFFS);
+		padding = amdgpu_mes_ctx_get_offs(ring,
+						  AMDGPU_MES_CTX_PADDING_OFFS);
+
+		ib.gpu_addr = amdgpu_mes_ctx_get_offs_gpu_addr(ring, offset);
+		ib.ptr = amdgpu_mes_ctx_get_offs_cpu_addr(ring, offset);
+
+		gpu_addr = amdgpu_mes_ctx_get_offs_gpu_addr(ring, padding);
+		cpu_ptr = amdgpu_mes_ctx_get_offs_cpu_addr(ring, padding);
+		*cpu_ptr = cpu_to_le32(0xCAFEDEAD);
+	} else {
+		r = amdgpu_device_wb_get(adev, &index);
+		if (r)
+			return r;
+
+		gpu_addr = adev->wb.gpu_addr + (index * 4);
+		adev->wb.wb[index] = cpu_to_le32(0xCAFEDEAD);
+		cpu_ptr = &adev->wb.wb[index];
+
+		r = amdgpu_ib_get(adev, NULL, 20, AMDGPU_IB_POOL_DIRECT, &ib);
+		if (r) {
+			DRM_ERROR("amdgpu: failed to get ib (%ld).\n", r);
+			goto err1;
+		}
+	}
 
 	ib.ptr[0] = PACKET3(PACKET3_WRITE_DATA, 3);
 	ib.ptr[1] = WRITE_DATA_DST_SEL(5) | WR_CONFIRM;
@@ -3851,13 +3871,13 @@ static int gfx_v10_0_ring_test_ib(struct amdgpu_ring *ring, long timeout)
 		goto err2;
 	}
 
-	tmp = adev->wb.wb[index];
-	if (tmp == 0xDEADBEEF)
+	if (le32_to_cpu(*cpu_ptr) == 0xDEADBEEF)
 		r = 0;
 	else
 		r = -EINVAL;
 err2:
-	amdgpu_ib_free(adev, &ib, NULL);
+	if (!ring->is_mes_queue)
+		amdgpu_ib_free(adev, &ib, NULL);
 	dma_fence_put(f);
 err1:
 	amdgpu_device_wb_free(adev, index);
-- 
2.35.1
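
The pass/fail check of the IB test reduces to a sentinel pattern, sketched below under stated assumptions. The two magic values are the ones visible in the diff. The byte-wise little-endian helpers are a portable stand-in for cpu_to_le32()/le32_to_cpu(), and the write of 0xDEADBEEF that the GPU would perform is simulated by a plain store. All example_ names are hypothetical.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define EXAMPLE_SENTINEL 0xCAFEDEADu /* written by the CPU before the test */
#define EXAMPLE_EXPECTED 0xDEADBEEFu /* expected once the IB has run */

/* Store a 32-bit value in little-endian byte order. */
static void example_store_le32(volatile uint8_t slot[4], uint32_t v)
{
	assert(slot != NULL);
	slot[0] = (uint8_t)(v & 0xff);
	slot[1] = (uint8_t)((v >> 8) & 0xff);
	slot[2] = (uint8_t)((v >> 16) & 0xff);
	slot[3] = (uint8_t)((v >> 24) & 0xff);
}

/* Load a 32-bit little-endian value. */
static uint32_t example_load_le32(const volatile uint8_t slot[4])
{
	assert(slot != NULL);
	return (uint32_t)slot[0] | ((uint32_t)slot[1] << 8) |
	       ((uint32_t)slot[2] << 16) | ((uint32_t)slot[3] << 24);
}

/* Arm the slot with the sentinel before submitting the IB. */
static void example_arm(volatile uint8_t slot[4])
{
	example_store_le32(slot, EXAMPLE_SENTINEL);
}

/* After the fence signals: 0 if the slot was overwritten, -EINVAL if not. */
static int example_check(const volatile uint8_t slot[4])
{
	return example_load_le32(slot) == EXAMPLE_EXPECTED ? 0 : -EINVAL;
}
```

The patch keeps this check identical for both paths and only changes where the slot lives: a writeback entry for ordinary rings, or the padding area of the per-context metadata for MES queues.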



* [PATCH 23/73] drm/amdgpu: don't use kiq to flush gpu tlb if mes enabled
  2022-04-29 17:45 [PATCH 00/73] MES support Alex Deucher
                   ` (21 preceding siblings ...)
  2022-04-29 17:45 ` [PATCH 22/73] drm/amdgpu/gfx10: add mes support for gfx ib test Alex Deucher
@ 2022-04-29 17:45 ` Alex Deucher
  2022-04-29 17:45 ` [PATCH 24/73] drm/amdgpu/sdma: use per-ctx sdma csa address for mes sdma queue Alex Deucher
                   ` (49 subsequent siblings)
  72 siblings, 0 replies; 74+ messages in thread
From: Alex Deucher @ 2022-04-29 17:45 UTC (permalink / raw)
  To: amd-gfx; +Cc: Alex Deucher, Jack Xiao, Christian König, Hawking Zhang

From: Jack Xiao <Jack.Xiao@amd.com>

If MES is enabled, don't use the KIQ to flush the GPU TLB, because
doing so would conflict with the MES firmware.

Signed-off-by: Jack Xiao <Jack.Xiao@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
---
 drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c b/drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c
index 9b4a035a5bf1..b8c79789e1e4 100644
--- a/drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c
@@ -328,7 +328,7 @@ static void gmc_v10_0_flush_gpu_tlb(struct amdgpu_device *adev, uint32_t vmid,
 	/* For SRIOV run time, driver shouldn't access the register through MMIO
 	 * Directly use kiq to do the vm invalidation instead
 	 */
-	if (adev->gfx.kiq.ring.sched.ready &&
+	if (adev->gfx.kiq.ring.sched.ready && !adev->enable_mes &&
 	    (amdgpu_sriov_runtime(adev) || !amdgpu_sriov_vf(adev)) &&
 	    down_read_trylock(&adev->reset_domain->sem)) {
 		struct amdgpu_vmhub *hub = &adev->vmhub[vmhub];
-- 
2.35.1
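
The changed condition is easier to reason about as a plain predicate. The sketch below models the four inputs visible in the diff as booleans; the struct and function names are hypothetical, and the down_read_trylock() on the reset domain semaphore is deliberately left out because it has a side effect.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical snapshot of the state the condition reads. */
struct example_flush_state {
	bool kiq_ready;     /* adev->gfx.kiq.ring.sched.ready */
	bool mes_enabled;   /* adev->enable_mes */
	bool sriov_runtime; /* amdgpu_sriov_runtime(adev) */
	bool sriov_vf;      /* amdgpu_sriov_vf(adev) */
};

/* True when the KIQ path would be taken (ignoring the semaphore). */
static bool example_use_kiq_flush(const struct example_flush_state *s)
{
	assert(s != NULL);
	return s->kiq_ready && !s->mes_enabled &&
	       (s->sriov_runtime || !s->sriov_vf);
}
```

With mes_enabled set the predicate is false for every combination of the other inputs, so the function always falls through to the non-KIQ path.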



* [PATCH 24/73] drm/amdgpu/sdma: use per-ctx sdma csa address for mes sdma queue
  2022-04-29 17:45 [PATCH 00/73] MES support Alex Deucher
                   ` (22 preceding siblings ...)
  2022-04-29 17:45 ` [PATCH 23/73] drm/amdgpu: don't use kiq to flush gpu tlb if mes enabled Alex Deucher
@ 2022-04-29 17:45 ` Alex Deucher
  2022-04-29 17:45 ` [PATCH 25/73] drm/amdgpu/sdma5.2: initialize sdma mqd Alex Deucher
                   ` (48 subsequent siblings)
  72 siblings, 0 replies; 74+ messages in thread
From: Alex Deucher @ 2022-04-29 17:45 UTC (permalink / raw)
  To: amd-gfx; +Cc: Alex Deucher, Jack Xiao, Christian König, Hawking Zhang

From: Jack Xiao <Jack.Xiao@amd.com>

Use per context sdma csa address for mes sdma queue.

Signed-off-by: Jack Xiao <Jack.Xiao@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_sdma.c | 24 ++++++++++++++++--------
 1 file changed, 16 insertions(+), 8 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_sdma.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_sdma.c
index e1835fd4b237..8e221a1ba937 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_sdma.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_sdma.c
@@ -74,14 +74,22 @@ uint64_t amdgpu_sdma_get_csa_mc_addr(struct amdgpu_ring *ring,
 	if (amdgpu_sriov_vf(adev) || vmid == 0 || !amdgpu_mcbp)
 		return 0;
 
-	r = amdgpu_sdma_get_index_from_ring(ring, &index);
-
-	if (r || index > 31)
-		csa_mc_addr = 0;
-	else
-		csa_mc_addr = amdgpu_csa_vaddr(adev) +
-			AMDGPU_CSA_SDMA_OFFSET +
-			index * AMDGPU_CSA_SDMA_SIZE;
+	if (ring->is_mes_queue) {
+		uint32_t offset = 0;
+
+		offset = offsetof(struct amdgpu_mes_ctx_meta_data,
+				  sdma[ring->idx].sdma_meta_data);
+		csa_mc_addr = amdgpu_mes_ctx_get_offs_gpu_addr(ring, offset);
+	} else {
+		r = amdgpu_sdma_get_index_from_ring(ring, &index);
+
+		if (r || index > 31)
+			csa_mc_addr = 0;
+		else
+			csa_mc_addr = amdgpu_csa_vaddr(adev) +
+				AMDGPU_CSA_SDMA_OFFSET +
+				index * AMDGPU_CSA_SDMA_SIZE;
+	}
 
 	return csa_mc_addr;
 }
-- 
2.35.1



* [PATCH 25/73] drm/amdgpu/sdma5.2: initialize sdma mqd
  2022-04-29 17:45 [PATCH 00/73] MES support Alex Deucher
                   ` (23 preceding siblings ...)
  2022-04-29 17:45 ` [PATCH 24/73] drm/amdgpu/sdma: use per-ctx sdma csa address for mes sdma queue Alex Deucher
@ 2022-04-29 17:45 ` Alex Deucher
  2022-04-29 17:45 ` [PATCH 26/73] drm/amdgpu/sdma5.2: associate mes queue id with fence Alex Deucher
                   ` (47 subsequent siblings)
  72 siblings, 0 replies; 74+ messages in thread
From: Alex Deucher @ 2022-04-29 17:45 UTC (permalink / raw)
  To: amd-gfx; +Cc: Alex Deucher, Jack Xiao, Christian König, Hawking Zhang

From: Jack Xiao <Jack.Xiao@amd.com>

Initialize sdma mqd according to ring settings.

Signed-off-by: Jack Xiao <Jack.Xiao@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
---
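Illustration only, not part of the patch: a standalone sketch of how the
ring-buffer control word and base address fields in the MQD are packed.
The DEMO_ shift values and all demo_ names are assumptions for the
example (the real SDMA0_RLC0_RB_CNTL__*__SHIFT values live in the
register headers, which this patch does not show). The ring size is
encoded as log2 of the size in dwords, and the base address is stored
shifted right by 8 and split into two 32-bit halves, as in the hunk
below.

```c
#include <stdint.h>

/* Hypothetical bit positions, chosen for illustration only. */
#define DEMO_RB_SIZE_SHIFT              1
#define DEMO_RPTR_WRITEBACK_EN_SHIFT    12
#define DEMO_RPTR_WRITEBACK_TIMER_SHIFT 16
#define DEMO_RB_PRIV_SHIFT              23

/* Smallest order such that (1 << order) >= n, like the kernel's order_base_2(). */
unsigned int demo_order_base_2(uint32_t n)
{
	unsigned int order = 0;

	while ((UINT64_C(1) << order) < n)
		order++;
	return order;
}

/* Pack the control word: size field, rptr writeback enable/timer, priv bit. */
uint32_t demo_rb_cntl(uint32_t queue_size_bytes)
{
	return ((uint32_t)demo_order_base_2(queue_size_bytes / 4) << DEMO_RB_SIZE_SHIFT) |
	       (1u << DEMO_RPTR_WRITEBACK_EN_SHIFT) |
	       (6u << DEMO_RPTR_WRITEBACK_TIMER_SHIFT) |
	       (1u << DEMO_RB_PRIV_SHIFT);
}

/* The base is stored shifted right by 8, split across two 32-bit fields. */
void demo_split_rb_base(uint64_t gpu_addr, uint32_t *lo, uint32_t *hi)
{
	uint64_t base = gpu_addr >> 8;

	*lo = (uint32_t)base;
	*hi = (uint32_t)(base >> 32);
}
```
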
 drivers/gpu/drm/amd/amdgpu/sdma_v5_2.c | 44 ++++++++++++++++++++++++++
 1 file changed, 44 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/sdma_v5_2.c b/drivers/gpu/drm/amd/amdgpu/sdma_v5_2.c
index bf2cf95cbf8f..f67801c5a6c1 100644
--- a/drivers/gpu/drm/amd/amdgpu/sdma_v5_2.c
+++ b/drivers/gpu/drm/amd/amdgpu/sdma_v5_2.c
@@ -903,6 +903,49 @@ static int sdma_v5_2_start(struct amdgpu_device *adev)
 	return r;
 }
 
+static int sdma_v5_2_mqd_init(struct amdgpu_device *adev, void *mqd,
+			      struct amdgpu_mqd_prop *prop)
+{
+	struct v10_sdma_mqd *m = mqd;
+	uint64_t wb_gpu_addr;
+
+	m->sdmax_rlcx_rb_cntl =
+		order_base_2(prop->queue_size / 4) << SDMA0_RLC0_RB_CNTL__RB_SIZE__SHIFT |
+		1 << SDMA0_RLC0_RB_CNTL__RPTR_WRITEBACK_ENABLE__SHIFT |
+		6 << SDMA0_RLC0_RB_CNTL__RPTR_WRITEBACK_TIMER__SHIFT |
+		1 << SDMA0_RLC0_RB_CNTL__RB_PRIV__SHIFT;
+
+	m->sdmax_rlcx_rb_base = lower_32_bits(prop->hqd_base_gpu_addr >> 8);
+	m->sdmax_rlcx_rb_base_hi = upper_32_bits(prop->hqd_base_gpu_addr >> 8);
+
+	m->sdmax_rlcx_rb_wptr_poll_cntl = RREG32(sdma_v5_2_get_reg_offset(adev, 0,
+						  mmSDMA0_GFX_RB_WPTR_POLL_CNTL));
+
+	wb_gpu_addr = prop->wptr_gpu_addr;
+	m->sdmax_rlcx_rb_wptr_poll_addr_lo = lower_32_bits(wb_gpu_addr);
+	m->sdmax_rlcx_rb_wptr_poll_addr_hi = upper_32_bits(wb_gpu_addr);
+
+	wb_gpu_addr = prop->rptr_gpu_addr;
+	m->sdmax_rlcx_rb_rptr_addr_lo = lower_32_bits(wb_gpu_addr);
+	m->sdmax_rlcx_rb_rptr_addr_hi = upper_32_bits(wb_gpu_addr);
+
+	m->sdmax_rlcx_ib_cntl = RREG32(sdma_v5_2_get_reg_offset(adev, 0,
+							mmSDMA0_GFX_IB_CNTL));
+
+	m->sdmax_rlcx_doorbell_offset =
+		prop->doorbell_index << SDMA0_RLC0_DOORBELL_OFFSET__OFFSET__SHIFT;
+
+	m->sdmax_rlcx_doorbell = REG_SET_FIELD(0, SDMA0_RLC0_DOORBELL, ENABLE, 1);
+
+	return 0;
+}
+
+static void sdma_v5_2_set_mqd_funcs(struct amdgpu_device *adev)
+{
+	adev->mqds[AMDGPU_HW_IP_DMA].mqd_size = sizeof(struct v10_sdma_mqd);
+	adev->mqds[AMDGPU_HW_IP_DMA].init_mqd = sdma_v5_2_mqd_init;
+}
+
 /**
  * sdma_v5_2_ring_test_ring - simple async dma engine test
  *
@@ -1233,6 +1276,7 @@ static int sdma_v5_2_early_init(void *handle)
 	sdma_v5_2_set_buffer_funcs(adev);
 	sdma_v5_2_set_vm_pte_funcs(adev);
 	sdma_v5_2_set_irq_funcs(adev);
+	sdma_v5_2_set_mqd_funcs(adev);
 
 	return 0;
 }
-- 
2.35.1



* [PATCH 26/73] drm/amdgpu/sdma5.2: associate mes queue id with fence
  2022-04-29 17:45 [PATCH 00/73] MES support Alex Deucher
                   ` (24 preceding siblings ...)
  2022-04-29 17:45 ` [PATCH 25/73] drm/amdgpu/sdma5.2: initialize sdma mqd Alex Deucher
@ 2022-04-29 17:45 ` Alex Deucher
  2022-04-29 17:45 ` [PATCH 27/73] drm/amdgpu/sdma5.2: add mes queue fence handling Alex Deucher
                   ` (46 subsequent siblings)
  72 siblings, 0 replies; 74+ messages in thread
From: Alex Deucher @ 2022-04-29 17:45 UTC (permalink / raw)
  To: amd-gfx; +Cc: Alex Deucher, Jack Xiao, Christian König, Hawking Zhang

From: Jack Xiao <Jack.Xiao@amd.com>

Associate the MES queue id with the fence, so that the EOP trap handler
can look up which queue issued the fence.

Signed-off-by: Jack Xiao <Jack.Xiao@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
---
 drivers/gpu/drm/amd/amdgpu/sdma_v5_2.c | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/sdma_v5_2.c b/drivers/gpu/drm/amd/amdgpu/sdma_v5_2.c
index f67801c5a6c1..0b7de18df5f4 100644
--- a/drivers/gpu/drm/amd/amdgpu/sdma_v5_2.c
+++ b/drivers/gpu/drm/amd/amdgpu/sdma_v5_2.c
@@ -460,10 +460,12 @@ static void sdma_v5_2_ring_emit_fence(struct amdgpu_ring *ring, u64 addr, u64 se
 		amdgpu_ring_write(ring, upper_32_bits(seq));
 	}
 
-	if (flags & AMDGPU_FENCE_FLAG_INT) {
+	if ((flags & AMDGPU_FENCE_FLAG_INT)) {
+		uint32_t ctx = ring->is_mes_queue ?
+			(ring->hw_queue_id | AMDGPU_FENCE_MES_QUEUE_FLAG) : 0;
 		/* generate an interrupt */
 		amdgpu_ring_write(ring, SDMA_PKT_HEADER_OP(SDMA_OP_TRAP));
-		amdgpu_ring_write(ring, SDMA_PKT_TRAP_INT_CONTEXT_INT_CONTEXT(0));
+		amdgpu_ring_write(ring, SDMA_PKT_TRAP_INT_CONTEXT_INT_CONTEXT(ctx));
 	}
 }
 
-- 
2.35.1



* [PATCH 27/73] drm/amdgpu/sdma5.2: add mes queue fence handling
  2022-04-29 17:45 [PATCH 00/73] MES support Alex Deucher
                   ` (25 preceding siblings ...)
  2022-04-29 17:45 ` [PATCH 26/73] drm/amdgpu/sdma5.2: associate mes queue id with fence Alex Deucher
@ 2022-04-29 17:45 ` Alex Deucher
  2022-04-29 17:45 ` [PATCH 28/73] drm/amdgpu/sdma5.2: add mes support for sdma ring test Alex Deucher
                   ` (45 subsequent siblings)
  72 siblings, 0 replies; 74+ messages in thread
From: Alex Deucher @ 2022-04-29 17:45 UTC (permalink / raw)
  To: amd-gfx; +Cc: Alex Deucher, Jack Xiao, Christian König, Hawking Zhang

From: Jack Xiao <Jack.Xiao@amd.com>

From the IH ring buffer entry, look up the corresponding kernel queue and
process its fences.

Signed-off-by: Jack Xiao <Jack.Xiao@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
---
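Illustration only, not part of the patch: a standalone sketch of the
round trip between the fence emit side (previous patch) and the trap
handler added here. The DEMO_ flag and mask values, the flat lookup
table standing in for the idr, and all demo_ names are assumptions for
the example. Locking is omitted, and the return value is used here to
signal "handled as MES" versus "fall through", whereas the handler in
the hunk below returns 0 after the MES path.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumed values; the real AMDGPU_FENCE_MES_QUEUE_* macros are defined elsewhere. */
#define DEMO_MES_QUEUE_FLAG    0x1000000u
#define DEMO_MES_QUEUE_ID_MASK (DEMO_MES_QUEUE_FLAG - 1)
#define DEMO_MAX_QUEUES        16u

struct demo_queue {
	unsigned int fences_processed;
};

/* Stand-in for the idr: a flat table indexed by queue id. */
static struct demo_queue *demo_queue_table[DEMO_MAX_QUEUES];

void demo_register_queue(uint32_t id, struct demo_queue *q)
{
	if (id < DEMO_MAX_QUEUES)
		demo_queue_table[id] = q;
}

/* Emit side: value placed in the trap packet's INT_CONTEXT field. */
uint32_t demo_trap_context(int is_mes_queue, uint32_t hw_queue_id)
{
	return is_mes_queue ? (hw_queue_id | DEMO_MES_QUEUE_FLAG) : 0;
}

/* IRQ side: 1 if treated as a MES queue interrupt, 0 to fall through. */
int demo_process_trap(int enable_mes, uint32_t src_data0)
{
	uint32_t id;

	if (!enable_mes || !(src_data0 & DEMO_MES_QUEUE_FLAG))
		return 0;

	id = src_data0 & DEMO_MES_QUEUE_ID_MASK;
	if (id < DEMO_MAX_QUEUES && demo_queue_table[id])
		demo_queue_table[id]->fences_processed++;
	return 1;
}
```
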
 drivers/gpu/drm/amd/amdgpu/sdma_v5_2.c | 18 ++++++++++++++++++
 1 file changed, 18 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/sdma_v5_2.c b/drivers/gpu/drm/amd/amdgpu/sdma_v5_2.c
index 0b7de18df5f4..9f246ab942f9 100644
--- a/drivers/gpu/drm/amd/amdgpu/sdma_v5_2.c
+++ b/drivers/gpu/drm/amd/amdgpu/sdma_v5_2.c
@@ -1512,7 +1512,25 @@ static int sdma_v5_2_process_trap_irq(struct amdgpu_device *adev,
 				      struct amdgpu_irq_src *source,
 				      struct amdgpu_iv_entry *entry)
 {
+	uint32_t mes_queue_id = entry->src_data[0];
+
 	DRM_DEBUG("IH: SDMA trap\n");
+
+	if (adev->enable_mes && (mes_queue_id & AMDGPU_FENCE_MES_QUEUE_FLAG)) {
+		struct amdgpu_mes_queue *queue;
+
+		mes_queue_id &= AMDGPU_FENCE_MES_QUEUE_ID_MASK;
+
+		spin_lock(&adev->mes.queue_id_lock);
+		queue = idr_find(&adev->mes.queue_id_idr, mes_queue_id);
+		if (queue) {
+			DRM_DEBUG("process sdma queue id = %d\n", mes_queue_id);
+			amdgpu_fence_process(queue->ring);
+		}
+		spin_unlock(&adev->mes.queue_id_lock);
+		return 0;
+	}
+
 	switch (entry->client_id) {
 	case SOC15_IH_CLIENTID_SDMA0:
 		switch (entry->ring_id) {
-- 
2.35.1



* [PATCH 28/73] drm/amdgpu/sdma5.2: add mes support for sdma ring test
  2022-04-29 17:45 [PATCH 00/73] MES support Alex Deucher
                   ` (26 preceding siblings ...)
  2022-04-29 17:45 ` [PATCH 27/73] drm/amdgpu/sdma5.2: add mes queue fence handling Alex Deucher
@ 2022-04-29 17:45 ` Alex Deucher
  2022-04-29 17:45 ` [PATCH 29/73] drm/amdgpu/sdma5.2: add mes support for sdma ib test Alex Deucher
                   ` (44 subsequent siblings)
  72 siblings, 0 replies; 74+ messages in thread
From: Alex Deucher @ 2022-04-29 17:45 UTC (permalink / raw)
  To: amd-gfx; +Cc: Alex Deucher, Jack Xiao, Christian König, Hawking Zhang

From: Jack Xiao <Jack.Xiao@amd.com>

Add MES support for sdma ring test.

Signed-off-by: Jack Xiao <Jack.Xiao@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
---
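Illustration only, not part of the patch: the ring test seeds a memory
slot with 0xCAFEDEAD, asks the engine to overwrite it with 0xDEADBEEF,
and polls until the value changes or a timeout expires. The sketch below
shows that seed-and-poll pattern on its own. The demo_ names and the
fake engine callback are assumptions for the example; in the driver the
slot is either a writeback entry or, for MES queues, a location inside
the per-context metadata.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_SEED     0xCAFEDEADu
#define DEMO_EXPECTED 0xDEADBEEFu

/* Hypothetical stand-in for the engine: writes the expected value on step 3. */
void demo_fake_engine_step(volatile uint32_t *slot, unsigned int step)
{
	if (step == 3)
		*slot = DEMO_EXPECTED;
}

/* Seed the slot, then poll until the engine overwrites it or we time out. */
int demo_ring_test(volatile uint32_t *slot, unsigned int max_steps,
		   void (*engine_step)(volatile uint32_t *, unsigned int))
{
	unsigned int i;

	*slot = DEMO_SEED;
	for (i = 0; i < max_steps; i++) {
		if (engine_step)
			engine_step(slot, i);
		if (*slot == DEMO_EXPECTED)
			break;
	}
	return i >= max_steps ? -ETIMEDOUT : 0;
}
```
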
 drivers/gpu/drm/amd/amdgpu/sdma_v5_2.c | 37 ++++++++++++++++++--------
 1 file changed, 26 insertions(+), 11 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/sdma_v5_2.c b/drivers/gpu/drm/amd/amdgpu/sdma_v5_2.c
index 9f246ab942f9..7c9c70382591 100644
--- a/drivers/gpu/drm/amd/amdgpu/sdma_v5_2.c
+++ b/drivers/gpu/drm/amd/amdgpu/sdma_v5_2.c
@@ -965,18 +965,29 @@ static int sdma_v5_2_ring_test_ring(struct amdgpu_ring *ring)
 	int r;
 	u32 tmp;
 	u64 gpu_addr;
+	volatile uint32_t *cpu_ptr = NULL;
 
-	r = amdgpu_device_wb_get(adev, &index);
-	if (r) {
-		dev_err(adev->dev, "(%d) failed to allocate wb slot\n", r);
-		return r;
-	}
-
-	gpu_addr = adev->wb.gpu_addr + (index * 4);
 	tmp = 0xCAFEDEAD;
-	adev->wb.wb[index] = cpu_to_le32(tmp);
 
-	r = amdgpu_ring_alloc(ring, 5);
+	if (ring->is_mes_queue) {
+		uint32_t offset = 0;
+		offset = amdgpu_mes_ctx_get_offs(ring,
+					 AMDGPU_MES_CTX_PADDING_OFFS);
+		gpu_addr = amdgpu_mes_ctx_get_offs_gpu_addr(ring, offset);
+		cpu_ptr = amdgpu_mes_ctx_get_offs_cpu_addr(ring, offset);
+		*cpu_ptr = tmp;
+	} else {
+		r = amdgpu_device_wb_get(adev, &index);
+		if (r) {
+			dev_err(adev->dev, "(%d) failed to allocate wb slot\n", r);
+			return r;
+		}
+
+		gpu_addr = adev->wb.gpu_addr + (index * 4);
+		adev->wb.wb[index] = cpu_to_le32(tmp);
+	}
+
+	r = amdgpu_ring_alloc(ring, 20);
 	if (r) {
 		DRM_ERROR("amdgpu: dma failed to lock ring %d (%d).\n", ring->idx, r);
 		amdgpu_device_wb_free(adev, index);
@@ -992,7 +1003,10 @@ static int sdma_v5_2_ring_test_ring(struct amdgpu_ring *ring)
 	amdgpu_ring_commit(ring);
 
 	for (i = 0; i < adev->usec_timeout; i++) {
-		tmp = le32_to_cpu(adev->wb.wb[index]);
+		if (ring->is_mes_queue)
+			tmp = le32_to_cpu(*cpu_ptr);
+		else
+			tmp = le32_to_cpu(adev->wb.wb[index]);
 		if (tmp == 0xDEADBEEF)
 			break;
 		if (amdgpu_emu_mode == 1)
@@ -1004,7 +1018,8 @@ static int sdma_v5_2_ring_test_ring(struct amdgpu_ring *ring)
 	if (i >= adev->usec_timeout)
 		r = -ETIMEDOUT;
 
-	amdgpu_device_wb_free(adev, index);
+	if (!ring->is_mes_queue)
+		amdgpu_device_wb_free(adev, index);
 
 	return r;
 }
-- 
2.35.1



* [PATCH 29/73] drm/amdgpu/sdma5.2: add mes support for sdma ib test
  2022-04-29 17:45 [PATCH 00/73] MES support Alex Deucher
                   ` (27 preceding siblings ...)
  2022-04-29 17:45 ` [PATCH 28/73] drm/amdgpu/sdma5.2: add mes support for sdma ring test Alex Deucher
@ 2022-04-29 17:45 ` Alex Deucher
  2022-04-29 17:45 ` [PATCH 30/73] drm/amdgpu/sdma5: initialize sdma mqd Alex Deucher
                   ` (43 subsequent siblings)
  72 siblings, 0 replies; 74+ messages in thread
From: Alex Deucher @ 2022-04-29 17:45 UTC (permalink / raw)
  To: amd-gfx; +Cc: Alex Deucher, Jack Xiao, Christian König, Hawking Zhang

From: Jack Xiao <Jack.Xiao@amd.com>

Add MES support for sdma ib test.

Signed-off-by: Jack Xiao <Jack.Xiao@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
---
 drivers/gpu/drm/amd/amdgpu/sdma_v5_2.c | 50 ++++++++++++++++++--------
 1 file changed, 36 insertions(+), 14 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/sdma_v5_2.c b/drivers/gpu/drm/amd/amdgpu/sdma_v5_2.c
index 7c9c70382591..83c6ccaaa9e4 100644
--- a/drivers/gpu/drm/amd/amdgpu/sdma_v5_2.c
+++ b/drivers/gpu/drm/amd/amdgpu/sdma_v5_2.c
@@ -1042,21 +1042,37 @@ static int sdma_v5_2_ring_test_ib(struct amdgpu_ring *ring, long timeout)
 	long r;
 	u32 tmp = 0;
 	u64 gpu_addr;
+	volatile uint32_t *cpu_ptr = NULL;
 
-	r = amdgpu_device_wb_get(adev, &index);
-	if (r) {
-		dev_err(adev->dev, "(%ld) failed to allocate wb slot\n", r);
-		return r;
-	}
-
-	gpu_addr = adev->wb.gpu_addr + (index * 4);
 	tmp = 0xCAFEDEAD;
-	adev->wb.wb[index] = cpu_to_le32(tmp);
 	memset(&ib, 0, sizeof(ib));
-	r = amdgpu_ib_get(adev, NULL, 256, AMDGPU_IB_POOL_DIRECT, &ib);
-	if (r) {
-		DRM_ERROR("amdgpu: failed to get ib (%ld).\n", r);
-		goto err0;
+
+	if (ring->is_mes_queue) {
+		uint32_t offset = 0;
+		offset = amdgpu_mes_ctx_get_offs(ring, AMDGPU_MES_CTX_IB_OFFS);
+		ib.gpu_addr = amdgpu_mes_ctx_get_offs_gpu_addr(ring, offset);
+		ib.ptr = (void *)amdgpu_mes_ctx_get_offs_cpu_addr(ring, offset);
+
+		offset = amdgpu_mes_ctx_get_offs(ring,
+					 AMDGPU_MES_CTX_PADDING_OFFS);
+		gpu_addr = amdgpu_mes_ctx_get_offs_gpu_addr(ring, offset);
+		cpu_ptr = amdgpu_mes_ctx_get_offs_cpu_addr(ring, offset);
+		*cpu_ptr = tmp;
+	} else {
+		r = amdgpu_device_wb_get(adev, &index);
+		if (r) {
+			dev_err(adev->dev, "(%ld) failed to allocate wb slot\n", r);
+			return r;
+		}
+
+		gpu_addr = adev->wb.gpu_addr + (index * 4);
+		adev->wb.wb[index] = cpu_to_le32(tmp);
+
+		r = amdgpu_ib_get(adev, NULL, 256, AMDGPU_IB_POOL_DIRECT, &ib);
+		if (r) {
+			DRM_ERROR("amdgpu: failed to get ib (%ld).\n", r);
+			goto err0;
+		}
 	}
 
 	ib.ptr[0] = SDMA_PKT_HEADER_OP(SDMA_OP_WRITE) |
@@ -1083,7 +1099,12 @@ static int sdma_v5_2_ring_test_ib(struct amdgpu_ring *ring, long timeout)
 		DRM_ERROR("amdgpu: fence wait failed (%ld).\n", r);
 		goto err1;
 	}
-	tmp = le32_to_cpu(adev->wb.wb[index]);
+
+	if (ring->is_mes_queue)
+		tmp = le32_to_cpu(*cpu_ptr);
+	else
+		tmp = le32_to_cpu(adev->wb.wb[index]);
+
 	if (tmp == 0xDEADBEEF)
 		r = 0;
 	else
@@ -1093,7 +1114,8 @@ static int sdma_v5_2_ring_test_ib(struct amdgpu_ring *ring, long timeout)
 	amdgpu_ib_free(adev, &ib, NULL);
 	dma_fence_put(f);
 err0:
-	amdgpu_device_wb_free(adev, index);
+	if (!ring->is_mes_queue)
+		amdgpu_device_wb_free(adev, index);
 	return r;
 }
 
-- 
2.35.1



* [PATCH 30/73] drm/amdgpu/sdma5: initialize sdma mqd
  2022-04-29 17:45 [PATCH 00/73] MES support Alex Deucher
                   ` (28 preceding siblings ...)
  2022-04-29 17:45 ` [PATCH 29/73] drm/amdgpu/sdma5.2: add mes support for sdma ib test Alex Deucher
@ 2022-04-29 17:45 ` Alex Deucher
  2022-04-29 17:45 ` [PATCH 31/73] drm/amdgpu/sdma5: associate mes queue id with fence Alex Deucher
                   ` (42 subsequent siblings)
  72 siblings, 0 replies; 74+ messages in thread
From: Alex Deucher @ 2022-04-29 17:45 UTC (permalink / raw)
  To: amd-gfx; +Cc: Alex Deucher, Jack Xiao, Christian König, Hawking Zhang

From: Jack Xiao <Jack.Xiao@amd.com>

Initialize sdma mqd according to ring settings.

Signed-off-by: Jack Xiao <Jack.Xiao@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
---
 drivers/gpu/drm/amd/amdgpu/sdma_v5_0.c | 44 ++++++++++++++++++++++++++
 1 file changed, 44 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/sdma_v5_0.c b/drivers/gpu/drm/amd/amdgpu/sdma_v5_0.c
index ff359e7f1eb8..30d12c9df911 100644
--- a/drivers/gpu/drm/amd/amdgpu/sdma_v5_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/sdma_v5_0.c
@@ -959,6 +959,49 @@ static int sdma_v5_0_start(struct amdgpu_device *adev)
 	return r;
 }
 
+static int sdma_v5_0_mqd_init(struct amdgpu_device *adev, void *mqd,
+			      struct amdgpu_mqd_prop *prop)
+{
+	struct v10_sdma_mqd *m = mqd;
+	uint64_t wb_gpu_addr;
+
+	m->sdmax_rlcx_rb_cntl =
+		order_base_2(prop->queue_size / 4) << SDMA0_RLC0_RB_CNTL__RB_SIZE__SHIFT |
+		1 << SDMA0_RLC0_RB_CNTL__RPTR_WRITEBACK_ENABLE__SHIFT |
+		6 << SDMA0_RLC0_RB_CNTL__RPTR_WRITEBACK_TIMER__SHIFT |
+		1 << SDMA0_RLC0_RB_CNTL__RB_PRIV__SHIFT;
+
+	m->sdmax_rlcx_rb_base = lower_32_bits(prop->hqd_base_gpu_addr >> 8);
+	m->sdmax_rlcx_rb_base_hi = upper_32_bits(prop->hqd_base_gpu_addr >> 8);
+
+	m->sdmax_rlcx_rb_wptr_poll_cntl = RREG32(sdma_v5_0_get_reg_offset(adev, 0,
+						  mmSDMA0_GFX_RB_WPTR_POLL_CNTL));
+
+	wb_gpu_addr = prop->wptr_gpu_addr;
+	m->sdmax_rlcx_rb_wptr_poll_addr_lo = lower_32_bits(wb_gpu_addr);
+	m->sdmax_rlcx_rb_wptr_poll_addr_hi = upper_32_bits(wb_gpu_addr);
+
+	wb_gpu_addr = prop->rptr_gpu_addr;
+	m->sdmax_rlcx_rb_rptr_addr_lo = lower_32_bits(wb_gpu_addr);
+	m->sdmax_rlcx_rb_rptr_addr_hi = upper_32_bits(wb_gpu_addr);
+
+	m->sdmax_rlcx_ib_cntl = RREG32(sdma_v5_0_get_reg_offset(adev, 0,
+							mmSDMA0_GFX_IB_CNTL));
+
+	m->sdmax_rlcx_doorbell_offset =
+		prop->doorbell_index << SDMA0_RLC0_DOORBELL_OFFSET__OFFSET__SHIFT;
+
+	m->sdmax_rlcx_doorbell = REG_SET_FIELD(0, SDMA0_RLC0_DOORBELL, ENABLE, 1);
+
+	return 0;
+}
+
+static void sdma_v5_0_set_mqd_funcs(struct amdgpu_device *adev)
+{
+	adev->mqds[AMDGPU_HW_IP_DMA].mqd_size = sizeof(struct v10_sdma_mqd);
+	adev->mqds[AMDGPU_HW_IP_DMA].init_mqd = sdma_v5_0_mqd_init;
+}
+
 /**
  * sdma_v5_0_ring_test_ring - simple async dma engine test
  *
@@ -1289,6 +1332,7 @@ static int sdma_v5_0_early_init(void *handle)
 	sdma_v5_0_set_buffer_funcs(adev);
 	sdma_v5_0_set_vm_pte_funcs(adev);
 	sdma_v5_0_set_irq_funcs(adev);
+	sdma_v5_0_set_mqd_funcs(adev);
 
 	return 0;
 }
-- 
2.35.1



* [PATCH 31/73] drm/amdgpu/sdma5: associate mes queue id with fence
  2022-04-29 17:45 [PATCH 00/73] MES support Alex Deucher
                   ` (29 preceding siblings ...)
  2022-04-29 17:45 ` [PATCH 30/73] drm/amdgpu/sdma5: initialize sdma mqd Alex Deucher
@ 2022-04-29 17:45 ` Alex Deucher
  2022-04-29 17:45 ` [PATCH 32/73] drm/amdgpu/sdma5: add mes queue fence handling Alex Deucher
                   ` (41 subsequent siblings)
  72 siblings, 0 replies; 74+ messages in thread
From: Alex Deucher @ 2022-04-29 17:45 UTC (permalink / raw)
  To: amd-gfx; +Cc: Alex Deucher, Jack Xiao, Christian König, Hawking Zhang

From: Jack Xiao <Jack.Xiao@amd.com>

Associate the MES queue id with the fence, so that the EOP trap handler
can look up which queue issued the fence.

Signed-off-by: Jack Xiao <Jack.Xiao@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
---
 drivers/gpu/drm/amd/amdgpu/sdma_v5_0.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/sdma_v5_0.c b/drivers/gpu/drm/amd/amdgpu/sdma_v5_0.c
index 30d12c9df911..b73e45597031 100644
--- a/drivers/gpu/drm/amd/amdgpu/sdma_v5_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/sdma_v5_0.c
@@ -562,9 +562,11 @@ static void sdma_v5_0_ring_emit_fence(struct amdgpu_ring *ring, u64 addr, u64 se
 	}
 
 	if (flags & AMDGPU_FENCE_FLAG_INT) {
+		uint32_t ctx = ring->is_mes_queue ?
+			(ring->hw_queue_id | AMDGPU_FENCE_MES_QUEUE_FLAG) : 0;
 		/* generate an interrupt */
 		amdgpu_ring_write(ring, SDMA_PKT_HEADER_OP(SDMA_OP_TRAP));
-		amdgpu_ring_write(ring, SDMA_PKT_TRAP_INT_CONTEXT_INT_CONTEXT(0));
+		amdgpu_ring_write(ring, SDMA_PKT_TRAP_INT_CONTEXT_INT_CONTEXT(ctx));
 	}
 }
 
-- 
2.35.1



* [PATCH 32/73] drm/amdgpu/sdma5: add mes queue fence handling
  2022-04-29 17:45 [PATCH 00/73] MES support Alex Deucher
                   ` (30 preceding siblings ...)
  2022-04-29 17:45 ` [PATCH 31/73] drm/amdgpu/sdma5: associate mes queue id with fence Alex Deucher
@ 2022-04-29 17:45 ` Alex Deucher
  2022-04-29 17:45 ` [PATCH 33/73] drm/amdgpu/sdma5: add mes support for sdma ring test Alex Deucher
                   ` (40 subsequent siblings)
  72 siblings, 0 replies; 74+ messages in thread
From: Alex Deucher @ 2022-04-29 17:45 UTC (permalink / raw)
  To: amd-gfx; +Cc: Alex Deucher, Jack Xiao, Christian König, Hawking Zhang

From: Jack Xiao <Jack.Xiao@amd.com>

From the IH ring buffer entry, look up the corresponding kernel queue and
process its fences.

Signed-off-by: Jack Xiao <Jack.Xiao@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
---
 drivers/gpu/drm/amd/amdgpu/sdma_v5_0.c | 18 ++++++++++++++++++
 1 file changed, 18 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/sdma_v5_0.c b/drivers/gpu/drm/amd/amdgpu/sdma_v5_0.c
index b73e45597031..564adc7b010c 100644
--- a/drivers/gpu/drm/amd/amdgpu/sdma_v5_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/sdma_v5_0.c
@@ -1555,7 +1555,25 @@ static int sdma_v5_0_process_trap_irq(struct amdgpu_device *adev,
 				      struct amdgpu_irq_src *source,
 				      struct amdgpu_iv_entry *entry)
 {
+	uint32_t mes_queue_id = entry->src_data[0];
+
 	DRM_DEBUG("IH: SDMA trap\n");
+
+	if (adev->enable_mes && (mes_queue_id & AMDGPU_FENCE_MES_QUEUE_FLAG)) {
+		struct amdgpu_mes_queue *queue;
+
+		mes_queue_id &= AMDGPU_FENCE_MES_QUEUE_ID_MASK;
+
+		spin_lock(&adev->mes.queue_id_lock);
+		queue = idr_find(&adev->mes.queue_id_idr, mes_queue_id);
+		if (queue) {
+			DRM_DEBUG("process sdma queue id = %d\n", mes_queue_id);
+			amdgpu_fence_process(queue->ring);
+		}
+		spin_unlock(&adev->mes.queue_id_lock);
+		return 0;
+	}
+
 	switch (entry->client_id) {
 	case SOC15_IH_CLIENTID_SDMA0:
 		switch (entry->ring_id) {
-- 
2.35.1



* [PATCH 33/73] drm/amdgpu/sdma5: add mes support for sdma ring test
  2022-04-29 17:45 [PATCH 00/73] MES support Alex Deucher
                   ` (31 preceding siblings ...)
  2022-04-29 17:45 ` [PATCH 32/73] drm/amdgpu/sdma5: add mes queue fence handling Alex Deucher
@ 2022-04-29 17:45 ` Alex Deucher
  2022-04-29 17:45 ` [PATCH 34/73] drm/amdgpu/sdma5: add mes support for sdma ib test Alex Deucher
                   ` (39 subsequent siblings)
  72 siblings, 0 replies; 74+ messages in thread
From: Alex Deucher @ 2022-04-29 17:45 UTC (permalink / raw)
  To: amd-gfx; +Cc: Alex Deucher, Jack Xiao, Christian König, Hawking Zhang

From: Jack Xiao <Jack.Xiao@amd.com>

Add MES support for sdma ring test.

Signed-off-by: Jack Xiao <Jack.Xiao@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
---
 drivers/gpu/drm/amd/amdgpu/sdma_v5_0.c | 37 ++++++++++++++++++--------
 1 file changed, 26 insertions(+), 11 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/sdma_v5_0.c b/drivers/gpu/drm/amd/amdgpu/sdma_v5_0.c
index 564adc7b010c..1f0b19e18494 100644
--- a/drivers/gpu/drm/amd/amdgpu/sdma_v5_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/sdma_v5_0.c
@@ -1021,18 +1021,29 @@ static int sdma_v5_0_ring_test_ring(struct amdgpu_ring *ring)
 	int r;
 	u32 tmp;
 	u64 gpu_addr;
+	volatile uint32_t *cpu_ptr = NULL;
 
-	r = amdgpu_device_wb_get(adev, &index);
-	if (r) {
-		dev_err(adev->dev, "(%d) failed to allocate wb slot\n", r);
-		return r;
-	}
-
-	gpu_addr = adev->wb.gpu_addr + (index * 4);
 	tmp = 0xCAFEDEAD;
-	adev->wb.wb[index] = cpu_to_le32(tmp);
 
-	r = amdgpu_ring_alloc(ring, 5);
+	if (ring->is_mes_queue) {
+		uint32_t offset = 0;
+		offset = amdgpu_mes_ctx_get_offs(ring,
+					 AMDGPU_MES_CTX_PADDING_OFFS);
+		gpu_addr = amdgpu_mes_ctx_get_offs_gpu_addr(ring, offset);
+		cpu_ptr = amdgpu_mes_ctx_get_offs_cpu_addr(ring, offset);
+		*cpu_ptr = tmp;
+	} else {
+		r = amdgpu_device_wb_get(adev, &index);
+		if (r) {
+			dev_err(adev->dev, "(%d) failed to allocate wb slot\n", r);
+			return r;
+		}
+
+		gpu_addr = adev->wb.gpu_addr + (index * 4);
+		adev->wb.wb[index] = cpu_to_le32(tmp);
+	}
+
+	r = amdgpu_ring_alloc(ring, 20);
 	if (r) {
 		DRM_ERROR("amdgpu: dma failed to lock ring %d (%d).\n", ring->idx, r);
 		amdgpu_device_wb_free(adev, index);
@@ -1048,7 +1059,10 @@ static int sdma_v5_0_ring_test_ring(struct amdgpu_ring *ring)
 	amdgpu_ring_commit(ring);
 
 	for (i = 0; i < adev->usec_timeout; i++) {
-		tmp = le32_to_cpu(adev->wb.wb[index]);
+		if (ring->is_mes_queue)
+			tmp = le32_to_cpu(*cpu_ptr);
+		else
+			tmp = le32_to_cpu(adev->wb.wb[index]);
 		if (tmp == 0xDEADBEEF)
 			break;
 		if (amdgpu_emu_mode == 1)
@@ -1060,7 +1074,8 @@ static int sdma_v5_0_ring_test_ring(struct amdgpu_ring *ring)
 	if (i >= adev->usec_timeout)
 		r = -ETIMEDOUT;
 
-	amdgpu_device_wb_free(adev, index);
+	if (!ring->is_mes_queue)
+		amdgpu_device_wb_free(adev, index);
 
 	return r;
 }
-- 
2.35.1



* [PATCH 34/73] drm/amdgpu/sdma5: add mes support for sdma ib test
  2022-04-29 17:45 [PATCH 00/73] MES support Alex Deucher
                   ` (32 preceding siblings ...)
  2022-04-29 17:45 ` [PATCH 33/73] drm/amdgpu/sdma5: add mes support for sdma ring test Alex Deucher
@ 2022-04-29 17:45 ` Alex Deucher
  2022-04-29 17:45 ` [PATCH 35/73] drm/amdgpu: add mes kiq PSP GFX FW type Alex Deucher
                   ` (38 subsequent siblings)
  72 siblings, 0 replies; 74+ messages in thread
From: Alex Deucher @ 2022-04-29 17:45 UTC (permalink / raw)
  To: amd-gfx; +Cc: Alex Deucher, Jack Xiao, Christian König, Hawking Zhang

From: Jack Xiao <Jack.Xiao@amd.com>

Add MES support for sdma ib test.

Signed-off-by: Jack Xiao <Jack.Xiao@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
---
 drivers/gpu/drm/amd/amdgpu/sdma_v5_0.c | 50 ++++++++++++++++++--------
 1 file changed, 36 insertions(+), 14 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/sdma_v5_0.c b/drivers/gpu/drm/amd/amdgpu/sdma_v5_0.c
index 1f0b19e18494..1f9021f896a1 100644
--- a/drivers/gpu/drm/amd/amdgpu/sdma_v5_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/sdma_v5_0.c
@@ -1098,22 +1098,38 @@ static int sdma_v5_0_ring_test_ib(struct amdgpu_ring *ring, long timeout)
 	long r;
 	u32 tmp = 0;
 	u64 gpu_addr;
+	volatile uint32_t *cpu_ptr = NULL;
 
-	r = amdgpu_device_wb_get(adev, &index);
-	if (r) {
-		dev_err(adev->dev, "(%ld) failed to allocate wb slot\n", r);
-		return r;
-	}
-
-	gpu_addr = adev->wb.gpu_addr + (index * 4);
 	tmp = 0xCAFEDEAD;
-	adev->wb.wb[index] = cpu_to_le32(tmp);
 	memset(&ib, 0, sizeof(ib));
-	r = amdgpu_ib_get(adev, NULL, 256,
+
+	if (ring->is_mes_queue) {
+		uint32_t offset = 0;
+		offset = amdgpu_mes_ctx_get_offs(ring, AMDGPU_MES_CTX_IB_OFFS);
+		ib.gpu_addr = amdgpu_mes_ctx_get_offs_gpu_addr(ring, offset);
+		ib.ptr = (void *)amdgpu_mes_ctx_get_offs_cpu_addr(ring, offset);
+
+		offset = amdgpu_mes_ctx_get_offs(ring,
+					 AMDGPU_MES_CTX_PADDING_OFFS);
+		gpu_addr = amdgpu_mes_ctx_get_offs_gpu_addr(ring, offset);
+		cpu_ptr = amdgpu_mes_ctx_get_offs_cpu_addr(ring, offset);
+		*cpu_ptr = tmp;
+	} else {
+		r = amdgpu_device_wb_get(adev, &index);
+		if (r) {
+			dev_err(adev->dev, "(%ld) failed to allocate wb slot\n", r);
+			return r;
+		}
+
+		gpu_addr = adev->wb.gpu_addr + (index * 4);
+		adev->wb.wb[index] = cpu_to_le32(tmp);
+
+		r = amdgpu_ib_get(adev, NULL, 256,
 					AMDGPU_IB_POOL_DIRECT, &ib);
-	if (r) {
-		DRM_ERROR("amdgpu: failed to get ib (%ld).\n", r);
-		goto err0;
+		if (r) {
+			DRM_ERROR("amdgpu: failed to get ib (%ld).\n", r);
+			goto err0;
+		}
 	}
 
 	ib.ptr[0] = SDMA_PKT_HEADER_OP(SDMA_OP_WRITE) |
@@ -1140,7 +1156,12 @@ static int sdma_v5_0_ring_test_ib(struct amdgpu_ring *ring, long timeout)
 		DRM_ERROR("amdgpu: fence wait failed (%ld).\n", r);
 		goto err1;
 	}
-	tmp = le32_to_cpu(adev->wb.wb[index]);
+
+	if (ring->is_mes_queue)
+		tmp = le32_to_cpu(*cpu_ptr);
+	else
+		tmp = le32_to_cpu(adev->wb.wb[index]);
+
 	if (tmp == 0xDEADBEEF)
 		r = 0;
 	else
@@ -1150,7 +1171,8 @@ static int sdma_v5_0_ring_test_ib(struct amdgpu_ring *ring, long timeout)
 	amdgpu_ib_free(adev, &ib, NULL);
 	dma_fence_put(f);
 err0:
-	amdgpu_device_wb_free(adev, index);
+	if (!ring->is_mes_queue)
+		amdgpu_device_wb_free(adev, index);
 	return r;
 }
 
-- 
2.35.1



* [PATCH 35/73] drm/amdgpu: add mes kiq PSP GFX FW type
  2022-04-29 17:45 [PATCH 00/73] MES support Alex Deucher
                   ` (33 preceding siblings ...)
  2022-04-29 17:45 ` [PATCH 34/73] drm/amdgpu/sdma5: add mes support for sdma ib test Alex Deucher
@ 2022-04-29 17:45 ` Alex Deucher
  2022-04-29 17:45 ` [PATCH 36/73] drm/amdgpu/mes: add mes kiq callback Alex Deucher
                   ` (37 subsequent siblings)
  72 siblings, 0 replies; 74+ messages in thread
From: Alex Deucher @ 2022-04-29 17:45 UTC (permalink / raw)
  To: amd-gfx; +Cc: Alex Deucher, Likun Gao, Hawking Zhang

From: Likun Gao <Likun.Gao@amd.com>

Add the MES KIQ PSP GFX FW types and map the MES1 ucode IDs to them.

Signed-off-by: Likun Gao <Likun.Gao@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c
index aabb208bebde..d0fb14ef645c 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c
@@ -2190,6 +2190,12 @@ static int psp_get_fw_type(struct amdgpu_firmware_info *ucode,
 	case AMDGPU_UCODE_ID_CP_MES_DATA:
 		*type = GFX_FW_TYPE_MES_STACK;
 		break;
+	case AMDGPU_UCODE_ID_CP_MES1:
+		*type = GFX_FW_TYPE_CP_MES_KIQ;
+		break;
+	case AMDGPU_UCODE_ID_CP_MES1_DATA:
+		*type = GFX_FW_TYPE_MES_KIQ_STACK;
+		break;
 	case AMDGPU_UCODE_ID_CP_CE:
 		*type = GFX_FW_TYPE_CP_CE;
 		break;
-- 
2.35.1



* [PATCH 36/73] drm/amdgpu/mes: add mes kiq callback
  2022-04-29 17:45 [PATCH 00/73] MES support Alex Deucher
                   ` (34 preceding siblings ...)
  2022-04-29 17:45 ` [PATCH 35/73] drm/amdgpu: add mes kiq PSP GFX FW type Alex Deucher
@ 2022-04-29 17:45 ` Alex Deucher
  2022-04-29 17:45 ` [PATCH 37/73] drm/amdgpu: add mes kiq frontdoor loading support Alex Deucher
                   ` (36 subsequent siblings)
  72 siblings, 0 replies; 74+ messages in thread
From: Alex Deucher @ 2022-04-29 17:45 UTC (permalink / raw)
  To: amd-gfx; +Cc: Alex Deucher, Jack Xiao, Hawking Zhang

From: Jack Xiao <Jack.Xiao@amd.com>

Add a kiq_hw_init callback to struct amdgpu_mes. It is needed to properly
initialize the MES KIQ.

Signed-off-by: Jack Xiao <Jack.Xiao@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
---
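Illustration only, not part of the patch: the callback-plus-macro
pattern added below, reduced to a standalone sketch. All demo_ names are
assumptions for the example. The macro has the same shape as
amdgpu_mes_kiq_hw_init() in the hunk: it calls through the function
pointer without checking it, so the pointer must be set before the
first call. The _checked helper is a hypothetical variant, not
something this series adds.

```c
#include <stddef.h>

struct demo_device;

struct demo_mes {
	/* initialize kiq pipe */
	int (*kiq_hw_init)(struct demo_device *dev);
};

struct demo_device {
	struct demo_mes mes;
	int kiq_ready;
};

/* Same shape as the macro in the patch: no NULL check on the callback. */
#define demo_mes_kiq_hw_init(dev) (dev)->mes.kiq_hw_init((dev))

/* Hypothetical IP-specific implementation that an IP block would install. */
int demo_v10_kiq_hw_init(struct demo_device *dev)
{
	dev->kiq_ready = 1;
	return 0;
}

/* Hypothetical guarded variant for callers that treat the callback as optional. */
int demo_mes_kiq_hw_init_checked(struct demo_device *dev)
{
	if (!dev->mes.kiq_hw_init)
		return -1;
	return demo_mes_kiq_hw_init(dev);
}
```
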
 drivers/gpu/drm/amd/amdgpu/amdgpu_mes.h | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.h
index 91b020842eb0..117c95acfd48 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.h
@@ -109,6 +109,9 @@ struct amdgpu_mes {
 	uint64_t			query_status_fence_gpu_addr;
 	uint64_t			*query_status_fence_ptr;
 
+	/* initialize kiq pipe */
+	int                             (*kiq_hw_init)(struct amdgpu_device *adev);
+
 	/* ip specific functions */
 	const struct amdgpu_mes_funcs   *funcs;
 };
@@ -204,4 +207,6 @@ struct amdgpu_mes_funcs {
 			   struct mes_resume_gang_input *input);
 };
 
+#define amdgpu_mes_kiq_hw_init(adev) (adev)->mes.kiq_hw_init((adev))
+
 #endif /* __AMDGPU_MES_H__ */
-- 
2.35.1



* [PATCH 37/73] drm/amdgpu: add mes kiq frontdoor loading support
  2022-04-29 17:45 [PATCH 00/73] MES support Alex Deucher
                   ` (35 preceding siblings ...)
  2022-04-29 17:45 ` [PATCH 36/73] drm/amdgpu/mes: add mes kiq callback Alex Deucher
@ 2022-04-29 17:45 ` Alex Deucher
  2022-04-29 17:45 ` [PATCH 38/73] drm/amdgpu: enable mes kiq N-1 test on sienna cichlid Alex Deucher
                   ` (35 subsequent siblings)
  72 siblings, 0 replies; 74+ messages in thread
From: Alex Deucher @ 2022-04-29 17:45 UTC (permalink / raw)
  To: amd-gfx; +Cc: Alex Deucher, Jack Xiao, Hawking Zhang

From: Jack Xiao <Jack.Xiao@amd.com>

Add mes kiq frontdoor loading support.

Signed-off-by: Jack Xiao <Jack.Xiao@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
---
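Illustration only, not part of the patch: a minimal userspace sketch of
how one firmware image can yield separate code and data sections from
little-endian size/offset fields in its header. The header layout, the
demo_* names and the bounds check are assumptions made for the sketch;
the real mes_firmware_header_v1_0 has more fields.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, trimmed-down header: four little-endian 32-bit fields. */
struct demo_mes_hdr {
	uint8_t ucode_size_bytes[4];
	uint8_t ucode_offset_bytes[4];
	uint8_t ucode_data_size_bytes[4];
	uint8_t ucode_data_offset_bytes[4];
};

enum demo_ucode_id { DEMO_UCODE_MES1, DEMO_UCODE_MES1_DATA };

/* Portable little-endian load, standing in for le32_to_cpu(). */
static uint32_t demo_le32(const uint8_t b[4])
{
	return (uint32_t)b[0] | ((uint32_t)b[1] << 8) |
	       ((uint32_t)b[2] << 16) | ((uint32_t)b[3] << 24);
}

/* Return the start of the code or data section and store its size. */
static const uint8_t *demo_ucode_section(const uint8_t *fw, size_t fw_len,
					 enum demo_ucode_id id, uint32_t *size)
{
	const struct demo_mes_hdr *hdr = (const struct demo_mes_hdr *)fw;
	uint32_t off;

	assert(fw_len >= sizeof(*hdr));
	if (id == DEMO_UCODE_MES1) {
		*size = demo_le32(hdr->ucode_size_bytes);
		off = demo_le32(hdr->ucode_offset_bytes);
	} else {
		*size = demo_le32(hdr->ucode_data_size_bytes);
		off = demo_le32(hdr->ucode_data_offset_bytes);
	}
	/* Bounds check added for the sketch only. */
	if (off > fw_len || *size > fw_len - off)
		return NULL;
	return fw + off;
}
```

The two new cases in the patch follow the same pattern as the existing
MES and MES_DATA cases, just keyed on the second pipe's ucode IDs.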
 drivers/gpu/drm/amd/amdgpu/amdgpu_ucode.c | 10 ++++++++++
 1 file changed, 10 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ucode.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ucode.c
index b7d575c7bcdc..a67f41465337 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ucode.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ucode.c
@@ -711,6 +711,16 @@ static int amdgpu_ucode_init_single_fw(struct amdgpu_device *adev,
 			ucode_addr = (u8 *)ucode->fw->data +
 				le32_to_cpu(mes_hdr->mes_ucode_data_offset_bytes);
 			break;
+		case AMDGPU_UCODE_ID_CP_MES1:
+			ucode->ucode_size = le32_to_cpu(mes_hdr->mes_ucode_size_bytes);
+			ucode_addr = (u8 *)ucode->fw->data +
+				le32_to_cpu(mes_hdr->mes_ucode_offset_bytes);
+			break;
+		case AMDGPU_UCODE_ID_CP_MES1_DATA:
+			ucode->ucode_size = le32_to_cpu(mes_hdr->mes_ucode_data_size_bytes);
+			ucode_addr = (u8 *)ucode->fw->data +
+				le32_to_cpu(mes_hdr->mes_ucode_data_offset_bytes);
+			break;
 		case AMDGPU_UCODE_ID_DMCU_ERAM:
 			ucode->ucode_size = le32_to_cpu(header->ucode_size_bytes) -
 				le32_to_cpu(dmcu_hdr->intv_size_bytes);
-- 
2.35.1



* [PATCH 38/73] drm/amdgpu: enable mes kiq N-1 test on sienna cichlid
  2022-04-29 17:45 [PATCH 00/73] MES support Alex Deucher
                   ` (36 preceding siblings ...)
  2022-04-29 17:45 ` [PATCH 37/73] drm/amdgpu: add mes kiq frontdoor loading support Alex Deucher
@ 2022-04-29 17:45 ` Alex Deucher
  2022-04-29 17:45 ` [PATCH 39/73] drm/amdgpu/mes: manage mes doorbell allocation Alex Deucher
                   ` (34 subsequent siblings)
  72 siblings, 0 replies; 74+ messages in thread
From: Alex Deucher @ 2022-04-29 17:45 UTC (permalink / raw)
  To: amd-gfx; +Cc: Alex Deucher, Jack Xiao, Hawking Zhang

From: Jack Xiao <Jack.Xiao@amd.com>

Enable KIQ support on gfx10.3 and enable the MES KIQ (N-1) test on
Sienna Cichlid so that MES KIQ can be tested there. This patch can be
dropped once MES KIQ is functional.

Signed-off-by: Jack Xiao <Jack.Xiao@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
---
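Illustration only, not part of the patch: a minimal userspace sketch of
the value mes_v10_1_kiq_setting() builds for RLC_CP_SCHEDULERS. The
demo_* helpers are hypothetical; the shifts and the 0x80 bit are taken
from the diff below, and the range limits in the assert are an
assumption derived from those shift widths.

```c
#include <assert.h>
#include <stdint.h>

/*
 * First write: clear the low byte and place me, pipe and queue in it
 * using the same shifts as the patch (me << 5, pipe << 3, queue).
 */
static uint32_t demo_kiq_select(uint32_t old, uint32_t me, uint32_t pipe,
				uint32_t queue)
{
	assert(me < 4 && pipe < 4 && queue < 8);
	old &= 0xffffff00u;
	return old | (me << 5) | (pipe << 3) | queue;
}

/* Second write: the same value with bit 7 (0x80) set on top. */
static uint32_t demo_kiq_activate(uint32_t selected)
{
	return selected | 0x80u;
}
```

With the ring values set in mes_v10_1_kiq_ring_init() (me = 3, pipe = 1,
queue = 0) the low byte is 0x68 after the first write and 0xe8 after the
second; the upper 24 bits of the register are preserved.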
 drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c |  32 ++--
 drivers/gpu/drm/amd/amdgpu/mes_v10_1.c | 202 ++++++++++++++++++++-----
 2 files changed, 184 insertions(+), 50 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c b/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c
index 1208d01cc936..9042e0b480ce 100644
--- a/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c
@@ -4897,16 +4897,18 @@ static int gfx_v10_0_sw_init(void *handle)
 		}
 	}
 
-	r = amdgpu_gfx_kiq_init(adev, GFX10_MEC_HPD_SIZE);
-	if (r) {
-		DRM_ERROR("Failed to init KIQ BOs!\n");
-		return r;
-	}
+	if (!adev->enable_mes_kiq) {
+		r = amdgpu_gfx_kiq_init(adev, GFX10_MEC_HPD_SIZE);
+		if (r) {
+			DRM_ERROR("Failed to init KIQ BOs!\n");
+			return r;
+		}
 
-	kiq = &adev->gfx.kiq;
-	r = amdgpu_gfx_kiq_init_ring(adev, &kiq->ring, &kiq->irq);
-	if (r)
-		return r;
+		kiq = &adev->gfx.kiq;
+		r = amdgpu_gfx_kiq_init_ring(adev, &kiq->ring, &kiq->irq);
+		if (r)
+			return r;
+	}
 
 	r = amdgpu_gfx_mqd_sw_init(adev, sizeof(struct v10_compute_mqd));
 	if (r)
@@ -4958,8 +4960,11 @@ static int gfx_v10_0_sw_fini(void *handle)
 		amdgpu_ring_fini(&adev->gfx.compute_ring[i]);
 
 	amdgpu_gfx_mqd_sw_fini(adev);
-	amdgpu_gfx_kiq_free_ring(&adev->gfx.kiq.ring);
-	amdgpu_gfx_kiq_fini(adev);
+
+	if (!adev->enable_mes_kiq) {
+		amdgpu_gfx_kiq_free_ring(&adev->gfx.kiq.ring);
+		amdgpu_gfx_kiq_fini(adev);
+	}
 
 	gfx_v10_0_pfp_fini(adev);
 	gfx_v10_0_ce_fini(adev);
@@ -7213,7 +7218,10 @@ static int gfx_v10_0_cp_resume(struct amdgpu_device *adev)
 			return r;
 	}
 
-	r = gfx_v10_0_kiq_resume(adev);
+	if (adev->enable_mes_kiq && adev->mes.kiq_hw_init)
+		r = amdgpu_mes_kiq_hw_init(adev);
+	else
+		r = gfx_v10_0_kiq_resume(adev);
 	if (r)
 		return r;
 
diff --git a/drivers/gpu/drm/amd/amdgpu/mes_v10_1.c b/drivers/gpu/drm/amd/amdgpu/mes_v10_1.c
index f82a6f981629..fecf3f26bf7c 100644
--- a/drivers/gpu/drm/amd/amdgpu/mes_v10_1.c
+++ b/drivers/gpu/drm/amd/amdgpu/mes_v10_1.c
@@ -33,11 +33,15 @@
 
 #define mmCP_MES_IC_OP_CNTL_Sienna_Cichlid               0x2820
 #define mmCP_MES_IC_OP_CNTL_Sienna_Cichlid_BASE_IDX      1
+#define mmRLC_CP_SCHEDULERS_Sienna_Cichlid		0x4ca1
+#define mmRLC_CP_SCHEDULERS_Sienna_Cichlid_BASE_IDX	1
 
 MODULE_FIRMWARE("amdgpu/navi10_mes.bin");
 MODULE_FIRMWARE("amdgpu/sienna_cichlid_mes.bin");
+MODULE_FIRMWARE("amdgpu/sienna_cichlid_mes1.bin");
 
 static int mes_v10_1_hw_fini(void *handle);
+static int mes_v10_1_kiq_hw_init(struct amdgpu_device *adev);
 
 #define MES_EOP_SIZE   2048
 
@@ -278,11 +282,11 @@ static int mes_v10_1_init_microcode(struct amdgpu_device *adev,
 	const struct mes_firmware_header_v1_0 *mes_hdr;
 	struct amdgpu_firmware_info *info;
 
-	switch (adev->asic_type) {
-	case CHIP_NAVI10:
+	switch (adev->ip_versions[GC_HWIP][0]) {
+	case IP_VERSION(10, 1, 10):
 		chip_name = "navi10";
 		break;
-	case CHIP_SIENNA_CICHLID:
+	case IP_VERSION(10, 3, 0):
 		chip_name = "sienna_cichlid";
 		break;
 	default:
@@ -293,7 +297,8 @@ static int mes_v10_1_init_microcode(struct amdgpu_device *adev,
 		snprintf(fw_name, sizeof(fw_name), "amdgpu/%s_mes.bin",
 			 chip_name);
 	else
-		BUG();
+		snprintf(fw_name, sizeof(fw_name), "amdgpu/%s_mes1.bin",
+			 chip_name);
 
 	err = request_firmware(&adev->mes.fw[pipe], fw_name, adev->dev);
 	if (err)
@@ -326,7 +331,8 @@ static int mes_v10_1_init_microcode(struct amdgpu_device *adev,
 			ucode = AMDGPU_UCODE_ID_CP_MES;
 			ucode_data = AMDGPU_UCODE_ID_CP_MES_DATA;
 		} else {
-			BUG();
+			ucode = AMDGPU_UCODE_ID_CP_MES1;
+			ucode_data = AMDGPU_UCODE_ID_CP_MES1_DATA;
 		}
 
 		info = &adev->firmware.ucode[ucode];
@@ -439,6 +445,8 @@ static void mes_v10_1_enable(struct amdgpu_device *adev, bool enable)
 	if (enable) {
 		data = RREG32_SOC15(GC, 0, mmCP_MES_CNTL);
 		data = REG_SET_FIELD(data, CP_MES_CNTL, MES_PIPE0_RESET, 1);
+		data = REG_SET_FIELD(data, CP_MES_CNTL,
+			     MES_PIPE1_RESET, adev->enable_mes_kiq ? 1 : 0);
 		WREG32_SOC15(GC, 0, mmCP_MES_CNTL, data);
 
 		mutex_lock(&adev->srbm_mutex);
@@ -462,13 +470,18 @@ static void mes_v10_1_enable(struct amdgpu_device *adev, bool enable)
 
 		/* unhalt MES and activate pipe0 */
 		data = REG_SET_FIELD(0, CP_MES_CNTL, MES_PIPE0_ACTIVE, 1);
+		data = REG_SET_FIELD(data, CP_MES_CNTL, MES_PIPE1_ACTIVE,
+				     adev->enable_mes_kiq ? 1 : 0);
 		WREG32_SOC15(GC, 0, mmCP_MES_CNTL, data);
 	} else {
 		data = RREG32_SOC15(GC, 0, mmCP_MES_CNTL);
 		data = REG_SET_FIELD(data, CP_MES_CNTL, MES_PIPE0_ACTIVE, 0);
+		data = REG_SET_FIELD(data, CP_MES_CNTL, MES_PIPE1_ACTIVE, 0);
 		data = REG_SET_FIELD(data, CP_MES_CNTL,
 				     MES_INVALIDATE_ICACHE, 1);
 		data = REG_SET_FIELD(data, CP_MES_CNTL, MES_PIPE0_RESET, 1);
+		data = REG_SET_FIELD(data, CP_MES_CNTL, MES_PIPE1_RESET,
+				     adev->enable_mes_kiq ? 1 : 0);
 		data = REG_SET_FIELD(data, CP_MES_CNTL, MES_HALT, 1);
 		WREG32_SOC15(GC, 0, mmCP_MES_CNTL, data);
 	}
@@ -525,8 +538,8 @@ static int mes_v10_1_load_microcode(struct amdgpu_device *adev,
 	WREG32_SOC15(GC, 0, mmCP_MES_MDBOUND_LO, 0x3FFFF);
 
 	/* invalidate ICACHE */
-	switch (adev->asic_type) {
-	case CHIP_SIENNA_CICHLID:
+	switch (adev->ip_versions[GC_HWIP][0]) {
+	case IP_VERSION(10, 3, 0):
 		data = RREG32_SOC15(GC, 0, mmCP_MES_IC_OP_CNTL_Sienna_Cichlid);
 		break;
 	default:
@@ -535,8 +548,8 @@ static int mes_v10_1_load_microcode(struct amdgpu_device *adev,
 	}
 	data = REG_SET_FIELD(data, CP_MES_IC_OP_CNTL, PRIME_ICACHE, 0);
 	data = REG_SET_FIELD(data, CP_MES_IC_OP_CNTL, INVALIDATE_CACHE, 1);
-	switch (adev->asic_type) {
-	case CHIP_SIENNA_CICHLID:
+	switch (adev->ip_versions[GC_HWIP][0]) {
+	case IP_VERSION(10, 3, 0):
 		WREG32_SOC15(GC, 0, mmCP_MES_IC_OP_CNTL_Sienna_Cichlid, data);
 		break;
 	default:
@@ -545,8 +558,8 @@ static int mes_v10_1_load_microcode(struct amdgpu_device *adev,
 	}
 
 	/* prime the ICACHE. */
-	switch (adev->asic_type) {
-	case CHIP_SIENNA_CICHLID:
+	switch (adev->ip_versions[GC_HWIP][0]) {
+	case IP_VERSION(10, 3, 0):
 		data = RREG32_SOC15(GC, 0, mmCP_MES_IC_OP_CNTL_Sienna_Cichlid);
 		break;
 	default:
@@ -554,8 +567,8 @@ static int mes_v10_1_load_microcode(struct amdgpu_device *adev,
 		break;
 	}
 	data = REG_SET_FIELD(data, CP_MES_IC_OP_CNTL, PRIME_ICACHE, 1);
-	switch (adev->asic_type) {
-	case CHIP_SIENNA_CICHLID:
+	switch (adev->ip_versions[GC_HWIP][0]) {
+	case IP_VERSION(10, 3, 0):
 		WREG32_SOC15(GC, 0, mmCP_MES_IC_OP_CNTL_Sienna_Cichlid, data);
 		break;
 	default:
@@ -883,13 +896,40 @@ static int mes_v10_1_ring_init(struct amdgpu_device *adev)
 				AMDGPU_RING_PRIO_DEFAULT, NULL);
 }
 
+static int mes_v10_1_kiq_ring_init(struct amdgpu_device *adev)
+{
+	struct amdgpu_ring *ring;
+
+	spin_lock_init(&adev->gfx.kiq.ring_lock);
+
+	ring = &adev->gfx.kiq.ring;
+
+	ring->me = 3;
+	ring->pipe = 1;
+	ring->queue = 0;
+
+	ring->adev = NULL;
+	ring->ring_obj = NULL;
+	ring->use_doorbell = true;
+	ring->doorbell_index = adev->doorbell_index.mes_ring1 << 1;
+	ring->eop_gpu_addr = adev->mes.eop_gpu_addr[AMDGPU_MES_KIQ_PIPE];
+	ring->no_scheduler = true;
+	sprintf(ring->name, "mes_kiq_%d.%d.%d",
+		ring->me, ring->pipe, ring->queue);
+
+	return amdgpu_ring_init(adev, ring, 1024, NULL, 0,
+				AMDGPU_RING_PRIO_DEFAULT, NULL);
+}
+
 static int mes_v10_1_mqd_sw_init(struct amdgpu_device *adev,
 				 enum admgpu_mes_pipe pipe)
 {
 	int r, mqd_size = sizeof(struct v10_compute_mqd);
 	struct amdgpu_ring *ring;
 
-	if (pipe == AMDGPU_MES_SCHED_PIPE)
+	if (pipe == AMDGPU_MES_KIQ_PIPE)
+		ring = &adev->gfx.kiq.ring;
+	else if (pipe == AMDGPU_MES_SCHED_PIPE)
 		ring = &adev->mes.ring;
 	else
 		BUG();
@@ -918,22 +958,34 @@ static int mes_v10_1_mqd_sw_init(struct amdgpu_device *adev,
 static int mes_v10_1_sw_init(void *handle)
 {
 	struct amdgpu_device *adev = (struct amdgpu_device *)handle;
-	int r, pipe = AMDGPU_MES_SCHED_PIPE;
+	int pipe, r;
 
 	adev->mes.adev = adev;
 	adev->mes.funcs = &mes_v10_1_funcs;
+	adev->mes.kiq_hw_init = &mes_v10_1_kiq_hw_init;
 
-	r = mes_v10_1_init_microcode(adev, pipe);
-	if (r)
-		return r;
+	for (pipe = 0; pipe < AMDGPU_MAX_MES_PIPES; pipe++) {
+		if (!adev->enable_mes_kiq && pipe == AMDGPU_MES_KIQ_PIPE)
+			continue;
 
-	r = mes_v10_1_allocate_eop_buf(adev, pipe);
-	if (r)
-		return r;
+		r = mes_v10_1_init_microcode(adev, pipe);
+		if (r)
+			return r;
 
-	r = mes_v10_1_mqd_sw_init(adev, pipe);
-	if (r)
-		return r;
+		r = mes_v10_1_allocate_eop_buf(adev, pipe);
+		if (r)
+			return r;
+
+		r = mes_v10_1_mqd_sw_init(adev, pipe);
+		if (r)
+			return r;
+	}
+
+	if (adev->enable_mes_kiq) {
+		r = mes_v10_1_kiq_ring_init(adev);
+		if (r)
+			return r;
+	}
 
 	r = mes_v10_1_ring_init(adev);
 	if (r)
@@ -949,43 +1001,115 @@ static int mes_v10_1_sw_init(void *handle)
 static int mes_v10_1_sw_fini(void *handle)
 {
 	struct amdgpu_device *adev = (struct amdgpu_device *)handle;
-	int pipe = AMDGPU_MES_SCHED_PIPE;
+	int pipe;
 
 	amdgpu_device_wb_free(adev, adev->mes.sch_ctx_offs);
 	amdgpu_device_wb_free(adev, adev->mes.query_status_fence_offs);
 
-	kfree(adev->mes.mqd_backup[pipe]);
+	for (pipe = 0; pipe < AMDGPU_MAX_MES_PIPES; pipe++) {
+		kfree(adev->mes.mqd_backup[pipe]);
+
+		amdgpu_bo_free_kernel(&adev->mes.eop_gpu_obj[pipe],
+				      &adev->mes.eop_gpu_addr[pipe],
+				      NULL);
+
+		mes_v10_1_free_microcode(adev, pipe);
+	}
+
+	amdgpu_bo_free_kernel(&adev->gfx.kiq.ring.mqd_obj,
+			      &adev->gfx.kiq.ring.mqd_gpu_addr,
+			      &adev->gfx.kiq.ring.mqd_ptr);
 
 	amdgpu_bo_free_kernel(&adev->mes.ring.mqd_obj,
 			      &adev->mes.ring.mqd_gpu_addr,
 			      &adev->mes.ring.mqd_ptr);
 
-	amdgpu_bo_free_kernel(&adev->mes.eop_gpu_obj[pipe],
-			      &adev->mes.eop_gpu_addr[pipe],
-			      NULL);
-
-	mes_v10_1_free_microcode(adev, pipe);
+	amdgpu_ring_fini(&adev->gfx.kiq.ring);
 	amdgpu_ring_fini(&adev->mes.ring);
 
 	return 0;
 }
 
-static int mes_v10_1_hw_init(void *handle)
+static void mes_v10_1_kiq_setting(struct amdgpu_ring *ring)
 {
-	int r;
-	struct amdgpu_device *adev = (struct amdgpu_device *)handle;
+	uint32_t tmp;
+	struct amdgpu_device *adev = ring->adev;
+
+	/* tell RLC which is KIQ queue */
+	switch (adev->ip_versions[GC_HWIP][0]) {
+	case IP_VERSION(10, 3, 0):
+	case IP_VERSION(10, 3, 2):
+	case IP_VERSION(10, 3, 1):
+	case IP_VERSION(10, 3, 4):
+		tmp = RREG32_SOC15(GC, 0, mmRLC_CP_SCHEDULERS_Sienna_Cichlid);
+		tmp &= 0xffffff00;
+		tmp |= (ring->me << 5) | (ring->pipe << 3) | (ring->queue);
+		WREG32_SOC15(GC, 0, mmRLC_CP_SCHEDULERS_Sienna_Cichlid, tmp);
+		tmp |= 0x80;
+		WREG32_SOC15(GC, 0, mmRLC_CP_SCHEDULERS_Sienna_Cichlid, tmp);
+		break;
+	default:
+		tmp = RREG32_SOC15(GC, 0, mmRLC_CP_SCHEDULERS);
+		tmp &= 0xffffff00;
+		tmp |= (ring->me << 5) | (ring->pipe << 3) | (ring->queue);
+		WREG32_SOC15(GC, 0, mmRLC_CP_SCHEDULERS, tmp);
+		tmp |= 0x80;
+		WREG32_SOC15(GC, 0, mmRLC_CP_SCHEDULERS, tmp);
+		break;
+	}
+}
+
+static int mes_v10_1_kiq_hw_init(struct amdgpu_device *adev)
+{
+	int r = 0;
 
 	if (adev->firmware.load_type == AMDGPU_FW_LOAD_DIRECT) {
-		r = mes_v10_1_load_microcode(adev,
-					     AMDGPU_MES_SCHED_PIPE);
+		r = mes_v10_1_load_microcode(adev, AMDGPU_MES_KIQ_PIPE);
+		if (r) {
+			DRM_ERROR("failed to load MES kiq fw, r=%d\n", r);
+			return r;
+		}
+
+		r = mes_v10_1_load_microcode(adev, AMDGPU_MES_SCHED_PIPE);
 		if (r) {
-			DRM_ERROR("failed to MES fw, r=%d\n", r);
+			DRM_ERROR("failed to load MES fw, r=%d\n", r);
 			return r;
 		}
 	}
 
 	mes_v10_1_enable(adev, true);
 
+	mes_v10_1_kiq_setting(&adev->gfx.kiq.ring);
+
+	r = mes_v10_1_queue_init(adev);
+	if (r)
+		goto failure;
+
+	return r;
+
+failure:
+	mes_v10_1_hw_fini(adev);
+	return r;
+}
+
+static int mes_v10_1_hw_init(void *handle)
+{
+	int r;
+	struct amdgpu_device *adev = (struct amdgpu_device *)handle;
+
+	if (!adev->enable_mes_kiq) {
+		if (adev->firmware.load_type == AMDGPU_FW_LOAD_DIRECT) {
+			r = mes_v10_1_load_microcode(adev,
+					     AMDGPU_MES_SCHED_PIPE);
+			if (r) {
+				DRM_ERROR("failed to load MES fw, r=%d\n", r);
+				return r;
+			}
+		}
+
+		mes_v10_1_enable(adev, true);
+	}
+
 	r = mes_v10_1_queue_init(adev);
 	if (r)
 		goto failure;
@@ -1013,8 +1137,10 @@ static int mes_v10_1_hw_fini(void *handle)
 
 	mes_v10_1_enable(adev, false);
 
-	if (adev->firmware.load_type == AMDGPU_FW_LOAD_DIRECT)
+	if (adev->firmware.load_type == AMDGPU_FW_LOAD_DIRECT) {
+		mes_v10_1_free_ucode_buffers(adev, AMDGPU_MES_KIQ_PIPE);
 		mes_v10_1_free_ucode_buffers(adev, AMDGPU_MES_SCHED_PIPE);
+	}
 
 	return 0;
 }
-- 
2.35.1



* [PATCH 39/73] drm/amdgpu/mes: manage mes doorbell allocation
  2022-04-29 17:45 [PATCH 00/73] MES support Alex Deucher
                   ` (37 preceding siblings ...)
  2022-04-29 17:45 ` [PATCH 38/73] drm/amdgpu: enable mes kiq N-1 test on sienna cichlid Alex Deucher
@ 2022-04-29 17:45 ` Alex Deucher
  2022-04-29 17:45 ` [PATCH 40/73] drm/amdgpu: add mes queue id mask v2 Alex Deucher
                   ` (33 subsequent siblings)
  72 siblings, 0 replies; 74+ messages in thread
From: Alex Deucher @ 2022-04-29 17:45 UTC (permalink / raw)
  To: amd-gfx; +Cc: Alex Deucher, Jack Xiao, Christian König, Hawking Zhang

From: Jack Xiao <Jack.Xiao@amd.com>

Manage doorbell allocation for MES processes and queues. The driver
first allocates a doorbell slice for each process, then allocates the
doorbell for each queue from that process's slice.

Signed-off-by: Jack Xiao <Jack.Xiao@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
---
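Illustration only, not part of the patch: a minimal userspace sketch of
the slice and index arithmetic used below. The demo_* names are
hypothetical and a 4 KiB PAGE_SIZE is assumed; the constants and the
formula mirror amdgpu_mes_queue_doorbell_get() and
amdgpu_mes_queue_doorbell_free().

```c
#include <assert.h>
#include <stdint.h>

#define DEMO_MAX_QUEUES_PER_PROCESS 1024u
#define DEMO_ONE_DOORBELL_SIZE      8u
#define DEMO_PAGE_SIZE              4096u /* assumption: 4 KiB pages */

/* Bytes of doorbell space reserved per process, rounded up to a page. */
static uint32_t demo_process_slice(void)
{
	uint32_t bytes = DEMO_ONE_DOORBELL_SIZE * DEMO_MAX_QUEUES_PER_PROCESS;

	return (bytes + DEMO_PAGE_SIZE - 1) / DEMO_PAGE_SIZE * DEMO_PAGE_SIZE;
}

/* Slice number plus slot within the slice -> global dword doorbell index. */
static uint64_t demo_doorbell_index(uint32_t slice, uint32_t slot)
{
	assert(slot < DEMO_MAX_QUEUES_PER_PROCESS);
	return (uint64_t)slice * demo_process_slice() / sizeof(uint32_t) +
	       slot * 2u;
}

/* Inverse mapping, as needed when a queue doorbell is freed. */
static uint32_t demo_doorbell_slot(uint32_t slice, uint64_t index)
{
	uint64_t base = (uint64_t)slice * demo_process_slice() / sizeof(uint32_t);

	return (uint32_t)((index - base) / 2u);
}
```

Under these assumptions a slice is 8192 bytes, so slice 2, slot 3 maps
to dword index 4102, and the inverse returns slot 3 again. Each slot
advances the index by 2 because a doorbell is 8 bytes while the index
counts 32-bit words.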
 drivers/gpu/drm/amd/amdgpu/Makefile     |   1 +
 drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c | 133 ++++++++++++++++++++++++
 2 files changed, 134 insertions(+)
 create mode 100644 drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c

diff --git a/drivers/gpu/drm/amd/amdgpu/Makefile b/drivers/gpu/drm/amd/amdgpu/Makefile
index 7a1b13fabebb..803e7f5dc458 100644
--- a/drivers/gpu/drm/amd/amdgpu/Makefile
+++ b/drivers/gpu/drm/amd/amdgpu/Makefile
@@ -144,6 +144,7 @@ amdgpu-y += \
 
 # add MES block
 amdgpu-y += \
+	amdgpu_mes.o \
 	mes_v10_1.o
 
 # add UVD block
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c
new file mode 100644
index 000000000000..1c591cb45fd9
--- /dev/null
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c
@@ -0,0 +1,133 @@
+/*
+ * Copyright 2019 Advanced Micro Devices, Inc.
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
+ * THE COPYRIGHT HOLDER(S) OR AUTHOR(S) BE LIABLE FOR ANY CLAIM, DAMAGES OR
+ * OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE,
+ * ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR
+ * OTHER DEALINGS IN THE SOFTWARE.
+ *
+ */
+
+#include "amdgpu_mes.h"
+#include "amdgpu.h"
+#include "soc15_common.h"
+#include "amdgpu_mes_ctx.h"
+
+#define AMDGPU_MES_MAX_NUM_OF_QUEUES_PER_PROCESS 1024
+#define AMDGPU_ONE_DOORBELL_SIZE 8
+
+static int amdgpu_mes_doorbell_process_slice(struct amdgpu_device *adev)
+{
+	return roundup(AMDGPU_ONE_DOORBELL_SIZE *
+		       AMDGPU_MES_MAX_NUM_OF_QUEUES_PER_PROCESS,
+		       PAGE_SIZE);
+}
+
+static int amdgpu_mes_alloc_process_doorbells(struct amdgpu_device *adev,
+				      struct amdgpu_mes_process *process)
+{
+	int r = ida_simple_get(&adev->mes.doorbell_ida, 2,
+			       adev->mes.max_doorbell_slices,
+			       GFP_KERNEL);
+	if (r > 0)
+		process->doorbell_index = r;
+
+	return r;
+}
+
+static void amdgpu_mes_free_process_doorbells(struct amdgpu_device *adev,
+				      struct amdgpu_mes_process *process)
+{
+	if (process->doorbell_index)
+		ida_simple_remove(&adev->mes.doorbell_ida,
+				  process->doorbell_index);
+}
+
+static int amdgpu_mes_queue_doorbell_get(struct amdgpu_device *adev,
+					 struct amdgpu_mes_process *process,
+					 int ip_type, uint64_t *doorbell_index)
+{
+	unsigned int offset, found;
+
+	if (ip_type == AMDGPU_RING_TYPE_SDMA) {
+		offset = adev->doorbell_index.sdma_engine[0];
+		found = find_next_zero_bit(process->doorbell_bitmap,
+					   AMDGPU_MES_MAX_NUM_OF_QUEUES_PER_PROCESS,
+					   offset);
+	} else {
+		found = find_first_zero_bit(process->doorbell_bitmap,
+					    AMDGPU_MES_MAX_NUM_OF_QUEUES_PER_PROCESS);
+	}
+
+	if (found >= AMDGPU_MES_MAX_NUM_OF_QUEUES_PER_PROCESS) {
+		DRM_WARN("No doorbell available\n");
+		return -ENOSPC;
+	}
+
+	set_bit(found, process->doorbell_bitmap);
+
+	*doorbell_index =
+		(process->doorbell_index *
+		 amdgpu_mes_doorbell_process_slice(adev)) / sizeof(u32) +
+		found * 2;
+
+	return 0;
+}
+
+static void amdgpu_mes_queue_doorbell_free(struct amdgpu_device *adev,
+					   struct amdgpu_mes_process *process,
+					   uint32_t doorbell_index)
+{
+	unsigned int old, doorbell_id;
+
+	doorbell_id = doorbell_index -
+		(process->doorbell_index *
+		 amdgpu_mes_doorbell_process_slice(adev)) / sizeof(u32);
+	doorbell_id /= 2;
+
+	old = test_and_clear_bit(doorbell_id, process->doorbell_bitmap);
+	WARN_ON(!old);
+}
+
+static int amdgpu_mes_doorbell_init(struct amdgpu_device *adev)
+{
+	size_t doorbell_start_offset;
+	size_t doorbell_aperture_size;
+	size_t doorbell_process_limit;
+
+	doorbell_start_offset = (adev->doorbell_index.max_assignment+1) * sizeof(u32);
+	doorbell_start_offset =
+		roundup(doorbell_start_offset,
+			amdgpu_mes_doorbell_process_slice(adev));
+
+	doorbell_aperture_size = adev->doorbell.size;
+	doorbell_aperture_size =
+			rounddown(doorbell_aperture_size,
+				  amdgpu_mes_doorbell_process_slice(adev));
+
+	if (doorbell_aperture_size > doorbell_start_offset)
+		doorbell_process_limit =
+			(doorbell_aperture_size - doorbell_start_offset) /
+			amdgpu_mes_doorbell_process_slice(adev);
+	else
+		return -ENOSPC;
+
+	adev->mes.doorbell_id_offset = doorbell_start_offset / sizeof(u32);
+	adev->mes.max_doorbell_slices = doorbell_process_limit;
+
+	DRM_INFO("max_doorbell_slices=%ld\n", doorbell_process_limit);
+	return 0;
+}
-- 
2.35.1



* [PATCH 40/73] drm/amdgpu: add mes queue id mask v2
  2022-04-29 17:45 [PATCH 00/73] MES support Alex Deucher
                   ` (38 preceding siblings ...)
  2022-04-29 17:45 ` [PATCH 39/73] drm/amdgpu/mes: manage mes doorbell allocation Alex Deucher
@ 2022-04-29 17:45 ` Alex Deucher
  2022-04-29 17:45 ` [PATCH 41/73] drm/amdgpu/mes: initialize/finalize common mes structure v2 Alex Deucher
                   ` (32 subsequent siblings)
  72 siblings, 0 replies; 74+ messages in thread
From: Alex Deucher @ 2022-04-29 17:45 UTC (permalink / raw)
  To: amd-gfx; +Cc: Alex Deucher, Jack Xiao, Christian König, Hawking Zhang

From: Jack Xiao <Jack.Xiao@amd.com>

Add MES queue id mask.

v2: move queue id mask to amdgpu_mes_ctx.h

Signed-off-by: Jack Xiao <Jack.Xiao@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
---
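Illustration only, not part of the patch: a minimal userspace sketch of
how a flag bit and the derived mask can tag and recover a queue id. The
two constants match the header; the demo_* helpers are hypothetical and
do not claim to reproduce how the driver's fence code uses them.

```c
#include <assert.h>
#include <stdint.h>

#define DEMO_QUEUE_FLAG    0x1000000u
#define DEMO_QUEUE_ID_MASK (DEMO_QUEUE_FLAG - 1)

/* Mark a value as carrying a MES queue id. */
static uint32_t demo_tag_queue_id(uint32_t queue_id)
{
	assert(queue_id <= DEMO_QUEUE_ID_MASK);
	return queue_id | DEMO_QUEUE_FLAG;
}

/* Nonzero when the flag bit is present. */
static int demo_is_mes_queue(uint32_t v)
{
	return (v & DEMO_QUEUE_FLAG) != 0;
}

/* Strip the flag and return the queue id. */
static uint32_t demo_queue_id(uint32_t v)
{
	return v & DEMO_QUEUE_ID_MASK;
}
```

Because the flag is a single bit at position 24, the mask is 0xffffff
and leaves room for 24 bits of queue id below the flag.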
 drivers/gpu/drm/amd/amdgpu/amdgpu_mes_ctx.h | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_mes_ctx.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_mes_ctx.h
index 544f1aa86edf..c000f656aae5 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_mes_ctx.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_mes_ctx.h
@@ -116,5 +116,6 @@ struct amdgpu_mes_ctx_data {
 };
 
 #define AMDGPU_FENCE_MES_QUEUE_FLAG     0x1000000u
+#define AMDGPU_FENCE_MES_QUEUE_ID_MASK  (AMDGPU_FENCE_MES_QUEUE_FLAG - 1)
 
 #endif
-- 
2.35.1



* [PATCH 41/73] drm/amdgpu/mes: initialize/finalize common mes structure v2
  2022-04-29 17:45 [PATCH 00/73] MES support Alex Deucher
                   ` (39 preceding siblings ...)
  2022-04-29 17:45 ` [PATCH 40/73] drm/amdgpu: add mes queue id mask v2 Alex Deucher
@ 2022-04-29 17:45 ` Alex Deucher
  2022-04-29 17:45 ` [PATCH 42/73] drm/amdgpu/mes: relocate status_fence slot allocation Alex Deucher
                   ` (31 subsequent siblings)
  72 siblings, 0 replies; 74+ messages in thread
From: Alex Deucher @ 2022-04-29 17:45 UTC (permalink / raw)
  To: amd-gfx
  Cc: Alex Deucher, Le Ma, Jack Xiao, Christian König, Hawking Zhang

From: Jack Xiao <Jack.Xiao@amd.com>

Initialize/finalize common mes structure.

v2: add mutex_init for adev->mes.mutex

Cc: Le Ma <le.ma@amd.com>
Signed-off-by: Jack Xiao <Jack.Xiao@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
---
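Illustration only, not part of the patch: a minimal userspace sketch for
inspecting the bitmasks that amdgpu_mes_init() fills in. Reading each
set bit as one slot made available is an assumption of this note, and
the demo_* helpers are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Number of set bits, i.e. how many slots a mask covers. */
static unsigned int demo_slot_count(uint32_t mask)
{
	unsigned int n = 0;

	for (; mask; mask &= mask - 1)
		n++;
	return n;
}

/* Index of the lowest set bit in a non-empty mask. */
static unsigned int demo_first_slot(uint32_t mask)
{
	unsigned int i = 0;

	assert(mask != 0);
	while (!(mask & 1u)) {
		mask >>= 1;
		i++;
	}
	return i;
}
```

Under that reading, 0xc covers 2 slots starting at bit 2, 0x3fc covers
8 slots starting at bit 2, 0xfffffffe covers 31 slots starting at bit 1,
and the VMID mask 0xffffff00 covers 24 entries starting at bit 8.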
 drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c | 72 +++++++++++++++++++++++++
 drivers/gpu/drm/amd/amdgpu/amdgpu_mes.h |  4 ++
 2 files changed, 76 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c
index 1c591cb45fd9..90c400564540 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c
@@ -131,3 +131,75 @@ static int amdgpu_mes_doorbell_init(struct amdgpu_device *adev)
 	DRM_INFO("max_doorbell_slices=%ld\n", doorbell_process_limit);
 	return 0;
 }
+
+int amdgpu_mes_init(struct amdgpu_device *adev)
+{
+	int i, r;
+
+	adev->mes.adev = adev;
+
+	idr_init(&adev->mes.pasid_idr);
+	idr_init(&adev->mes.gang_id_idr);
+	idr_init(&adev->mes.queue_id_idr);
+	ida_init(&adev->mes.doorbell_ida);
+	spin_lock_init(&adev->mes.queue_id_lock);
+	mutex_init(&adev->mes.mutex);
+
+	adev->mes.total_max_queue = AMDGPU_FENCE_MES_QUEUE_ID_MASK;
+	adev->mes.vmid_mask_mmhub = 0xffffff00;
+	adev->mes.vmid_mask_gfxhub = 0xffffff00;
+
+	for (i = 0; i < AMDGPU_MES_MAX_COMPUTE_PIPES; i++) {
+		/* use only 1st MEC pipes */
+		if (i >= 4)
+			continue;
+		adev->mes.compute_hqd_mask[i] = 0xc;
+	}
+
+	for (i = 0; i < AMDGPU_MES_MAX_GFX_PIPES; i++)
+		adev->mes.gfx_hqd_mask[i] = i ? 0 : 0xfffffffe;
+
+	for (i = 0; i < AMDGPU_MES_MAX_SDMA_PIPES; i++)
+		adev->mes.sdma_hqd_mask[i] = i ? 0 : 0x3fc;
+
+	for (i = 0; i < AMDGPU_MES_PRIORITY_NUM_LEVELS; i++)
+		adev->mes.agreegated_doorbells[i] = 0xffffffff;
+
+	r = amdgpu_device_wb_get(adev, &adev->mes.sch_ctx_offs);
+	if (r) {
+		dev_err(adev->dev,
+			"(%d) mes sch_ctx_offs wb alloc failed\n", r);
+		goto error_ids;
+	}
+	adev->mes.sch_ctx_gpu_addr =
+		adev->wb.gpu_addr + (adev->mes.sch_ctx_offs * 4);
+	adev->mes.sch_ctx_ptr =
+		(uint64_t *)&adev->wb.wb[adev->mes.sch_ctx_offs];
+
+	r = amdgpu_mes_doorbell_init(adev);
+	if (r)
+		goto error;
+
+	return 0;
+
+error:
+	amdgpu_device_wb_free(adev, adev->mes.sch_ctx_offs);
+error_ids:
+	idr_destroy(&adev->mes.pasid_idr);
+	idr_destroy(&adev->mes.gang_id_idr);
+	idr_destroy(&adev->mes.queue_id_idr);
+	ida_destroy(&adev->mes.doorbell_ida);
+	mutex_destroy(&adev->mes.mutex);
+	return r;
+}
+
+void amdgpu_mes_fini(struct amdgpu_device *adev)
+{
+	amdgpu_device_wb_free(adev, adev->mes.sch_ctx_offs);
+
+	idr_destroy(&adev->mes.pasid_idr);
+	idr_destroy(&adev->mes.gang_id_idr);
+	idr_destroy(&adev->mes.queue_id_idr);
+	ida_destroy(&adev->mes.doorbell_ida);
+	mutex_destroy(&adev->mes.mutex);
+}
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.h
index 117c95acfd48..e64b2114c7ba 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.h
@@ -207,6 +207,10 @@ struct amdgpu_mes_funcs {
 			   struct mes_resume_gang_input *input);
 };
 
+
 #define amdgpu_mes_kiq_hw_init(adev) (adev)->mes.kiq_hw_init((adev))
 
+int amdgpu_mes_init(struct amdgpu_device *adev);
+void amdgpu_mes_fini(struct amdgpu_device *adev);
+
 #endif /* __AMDGPU_MES_H__ */
-- 
2.35.1



* [PATCH 42/73] drm/amdgpu/mes: relocate status_fence slot allocation
  2022-04-29 17:45 [PATCH 00/73] MES support Alex Deucher
                   ` (40 preceding siblings ...)
  2022-04-29 17:45 ` [PATCH 41/73] drm/amdgpu/mes: initialize/finalize common mes structure v2 Alex Deucher
@ 2022-04-29 17:45 ` Alex Deucher
  2022-04-29 17:45 ` [PATCH 43/73] drm/amdgpu/mes10.1: call general mes initialization Alex Deucher
                   ` (30 subsequent siblings)
  72 siblings, 0 replies; 74+ messages in thread
From: Alex Deucher @ 2022-04-29 17:45 UTC (permalink / raw)
  To: amd-gfx; +Cc: Alex Deucher, Jack Xiao, Christian König, Hawking Zhang

From: Jack Xiao <Jack.Xiao@amd.com>

Move the status_fence slot allocation from the IP-specific function
to the general MES function.

Signed-off-by: Jack Xiao <Jack.Xiao@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
---
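Illustration only, not part of the patch: a minimal userspace sketch of
how a writeback slot offset turns into a GPU address and a CPU pointer.
The demo_wb structure is a hypothetical stand-in; the offset-times-4
step mirrors the code below, where offsets count 32-bit words.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct demo_wb {
	uint64_t gpu_addr;     /* GPU address of wb[0] */
	volatile uint32_t *wb; /* CPU mapping of the same buffer */
};

/* GPU address of a slot: the base plus the offset in 32-bit words. */
static uint64_t demo_wb_gpu_addr(const struct demo_wb *wb, uint32_t offs)
{
	assert(wb != NULL);
	return wb->gpu_addr + (uint64_t)offs * 4;
}

/* CPU pointer to the same slot. */
static volatile uint32_t *demo_wb_cpu_ptr(const struct demo_wb *wb,
					  uint32_t offs)
{
	assert(wb != NULL);
	return &wb->wb[offs];
}
```

Both views address the same memory, so a fence value written by the
engine at the GPU address can be polled through the CPU pointer.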
 drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c | 11 +++++++++
 drivers/gpu/drm/amd/amdgpu/mes_v10_1.c  | 33 -------------------------
 2 files changed, 11 insertions(+), 33 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c
index 90c400564540..a988c232b4a3 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c
@@ -176,6 +176,17 @@ int amdgpu_mes_init(struct amdgpu_device *adev)
 	adev->mes.sch_ctx_ptr =
 		(uint64_t *)&adev->wb.wb[adev->mes.sch_ctx_offs];
 
+	r = amdgpu_device_wb_get(adev, &adev->mes.query_status_fence_offs);
+	if (r) {
+		dev_err(adev->dev,
+			"(%d) query_status_fence_offs wb alloc failed\n", r);
+		goto error;
+	}
+	adev->mes.query_status_fence_gpu_addr =
+		adev->wb.gpu_addr + (adev->mes.query_status_fence_offs * 4);
+	adev->mes.query_status_fence_ptr =
+		(uint64_t *)&adev->wb.wb[adev->mes.query_status_fence_offs];
+
 	r = amdgpu_mes_doorbell_init(adev);
 	if (r)
 		goto error;
diff --git a/drivers/gpu/drm/amd/amdgpu/mes_v10_1.c b/drivers/gpu/drm/amd/amdgpu/mes_v10_1.c
index fecf3f26bf7c..d77242e0360e 100644
--- a/drivers/gpu/drm/amd/amdgpu/mes_v10_1.c
+++ b/drivers/gpu/drm/amd/amdgpu/mes_v10_1.c
@@ -606,35 +606,6 @@ static int mes_v10_1_allocate_eop_buf(struct amdgpu_device *adev,
 	return 0;
 }
 
-static int mes_v10_1_allocate_mem_slots(struct amdgpu_device *adev)
-{
-	int r;
-
-	r = amdgpu_device_wb_get(adev, &adev->mes.sch_ctx_offs);
-	if (r) {
-		dev_err(adev->dev,
-			"(%d) mes sch_ctx_offs wb alloc failed\n", r);
-		return r;
-	}
-	adev->mes.sch_ctx_gpu_addr =
-		adev->wb.gpu_addr + (adev->mes.sch_ctx_offs * 4);
-	adev->mes.sch_ctx_ptr =
-		(uint64_t *)&adev->wb.wb[adev->mes.sch_ctx_offs];
-
-	r = amdgpu_device_wb_get(adev, &adev->mes.query_status_fence_offs);
-	if (r) {
-		dev_err(adev->dev,
-			"(%d) query_status_fence_offs wb alloc failed\n", r);
-		return r;
-	}
-	adev->mes.query_status_fence_gpu_addr =
-		adev->wb.gpu_addr + (adev->mes.query_status_fence_offs * 4);
-	adev->mes.query_status_fence_ptr =
-		(uint64_t *)&adev->wb.wb[adev->mes.query_status_fence_offs];
-
-	return 0;
-}
-
 static int mes_v10_1_mqd_init(struct amdgpu_ring *ring)
 {
 	struct amdgpu_device *adev = ring->adev;
@@ -991,10 +962,6 @@ static int mes_v10_1_sw_init(void *handle)
 	if (r)
 		return r;
 
-	r = mes_v10_1_allocate_mem_slots(adev);
-	if (r)
-		return r;
-
 	return 0;
 }
 
-- 
2.35.1



* [PATCH 43/73] drm/amdgpu/mes10.1: call general mes initialization
  2022-04-29 17:45 [PATCH 00/73] MES support Alex Deucher
                   ` (41 preceding siblings ...)
  2022-04-29 17:45 ` [PATCH 42/73] drm/amdgpu/mes: relocate status_fence slot allocation Alex Deucher
@ 2022-04-29 17:45 ` Alex Deucher
  2022-04-29 17:45 ` [PATCH 44/73] drm/amdgpu/mes10.1: add delay after mes engine enable Alex Deucher
                   ` (29 subsequent siblings)
  72 siblings, 0 replies; 74+ messages in thread
From: Alex Deucher @ 2022-04-29 17:45 UTC (permalink / raw)
  To: amd-gfx; +Cc: Alex Deucher, Jack Xiao, Christian König, Hawking Zhang

From: Jack Xiao <Jack.Xiao@amd.com>

Call general mes initialization/finalization.

Signed-off-by: Jack Xiao <Jack.Xiao@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
---
 drivers/gpu/drm/amd/amdgpu/mes_v10_1.c | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/mes_v10_1.c b/drivers/gpu/drm/amd/amdgpu/mes_v10_1.c
index d77242e0360e..94812164998a 100644
--- a/drivers/gpu/drm/amd/amdgpu/mes_v10_1.c
+++ b/drivers/gpu/drm/amd/amdgpu/mes_v10_1.c
@@ -935,6 +935,10 @@ static int mes_v10_1_sw_init(void *handle)
 	adev->mes.funcs = &mes_v10_1_funcs;
 	adev->mes.kiq_hw_init = &mes_v10_1_kiq_hw_init;
 
+	r = amdgpu_mes_init(adev);
+	if (r)
+		return r;
+
 	for (pipe = 0; pipe < AMDGPU_MAX_MES_PIPES; pipe++) {
 		if (!adev->enable_mes_kiq && pipe == AMDGPU_MES_KIQ_PIPE)
 			continue;
@@ -994,6 +998,7 @@ static int mes_v10_1_sw_fini(void *handle)
 	amdgpu_ring_fini(&adev->gfx.kiq.ring);
 	amdgpu_ring_fini(&adev->mes.ring);
 
+	amdgpu_mes_fini(adev);
 	return 0;
 }
 
-- 
2.35.1



* [PATCH 44/73] drm/amdgpu/mes10.1: add delay after mes engine enable
From: Alex Deucher @ 2022-04-29 17:45 UTC (permalink / raw)
  To: amd-gfx; +Cc: Alex Deucher, Jack Xiao, Christian König, Hawking Zhang

From: Jack Xiao <Jack.Xiao@amd.com>

Add a delay after enabling the mes engine, because it needs more time
to complete engine initialization.

Signed-off-by: Jack Xiao <Jack.Xiao@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
---
 drivers/gpu/drm/amd/amdgpu/mes_v10_1.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/mes_v10_1.c b/drivers/gpu/drm/amd/amdgpu/mes_v10_1.c
index 94812164998a..d4e64c5a3215 100644
--- a/drivers/gpu/drm/amd/amdgpu/mes_v10_1.c
+++ b/drivers/gpu/drm/amd/amdgpu/mes_v10_1.c
@@ -473,6 +473,7 @@ static void mes_v10_1_enable(struct amdgpu_device *adev, bool enable)
 		data = REG_SET_FIELD(data, CP_MES_CNTL, MES_PIPE1_ACTIVE,
 				     adev->enable_mes_kiq ? 1 : 0);
 		WREG32_SOC15(GC, 0, mmCP_MES_CNTL, data);
+		udelay(50);
 	} else {
 		data = RREG32_SOC15(GC, 0, mmCP_MES_CNTL);
 		data = REG_SET_FIELD(data, CP_MES_CNTL, MES_PIPE0_ACTIVE, 0);
-- 
2.35.1



* [PATCH 45/73] drm/amdgpu/mes10.1: implement the suspend/resume routine
From: Alex Deucher @ 2022-04-29 17:45 UTC (permalink / raw)
  To: amd-gfx; +Cc: Alex Deucher, Jack Xiao, Christian König, Hawking Zhang

From: Jack Xiao <Jack.Xiao@amd.com>

Implement the suspend/resume routine of mes.

Signed-off-by: Jack Xiao <Jack.Xiao@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
---
 drivers/gpu/drm/amd/amdgpu/mes_v10_1.c | 18 ++++++++++++++++--
 1 file changed, 16 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/mes_v10_1.c b/drivers/gpu/drm/amd/amdgpu/mes_v10_1.c
index d4e64c5a3215..d468cb5a8854 100644
--- a/drivers/gpu/drm/amd/amdgpu/mes_v10_1.c
+++ b/drivers/gpu/drm/amd/amdgpu/mes_v10_1.c
@@ -1120,12 +1120,26 @@ static int mes_v10_1_hw_fini(void *handle)
 
 static int mes_v10_1_suspend(void *handle)
 {
-	return 0;
+	int r;
+	struct amdgpu_device *adev = (struct amdgpu_device *)handle;
+
+	r = amdgpu_mes_suspend(adev);
+	if (r)
+		return r;
+
+	return mes_v10_1_hw_fini(adev);
 }
 
 static int mes_v10_1_resume(void *handle)
 {
-	return 0;
+	int r;
+	struct amdgpu_device *adev = (struct amdgpu_device *)handle;
+
+	r = mes_v10_1_hw_init(adev);
+	if (r)
+		return r;
+
+	return amdgpu_mes_resume(adev);
 }
 
 static const struct amd_ip_funcs mes_v10_1_ip_funcs = {
-- 
2.35.1
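
Reader's note on the ordering in this patch: suspend quiesces the gangs
before tearing the hardware down, and resume runs the mirror image. The
following is a minimal userspace sketch of that ordering only, not the
driver code. The enum, struct trace and the model_* functions are
hypothetical names introduced here for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the four driver steps named in the patch. */
enum step {
	STEP_SUSPEND_GANGS,	/* models amdgpu_mes_suspend() */
	STEP_HW_FINI,		/* models mes_v10_1_hw_fini() */
	STEP_HW_INIT,		/* models mes_v10_1_hw_init() */
	STEP_RESUME_GANGS,	/* models amdgpu_mes_resume() */
};

#define MAX_STEPS 4

struct trace {
	enum step steps[MAX_STEPS];	/* steps that completed, in order */
	size_t count;
	int fail_at;			/* position of the step to fail, or -1 */
};

/* Record one step, or fail without recording it if it is the chosen one. */
static int run_step(struct trace *t, enum step s)
{
	assert(t->count < MAX_STEPS);
	if (t->fail_at >= 0 && (size_t)t->fail_at == t->count)
		return -1;
	t->steps[t->count++] = s;
	return 0;
}

/* Quiesce the gangs first, then tear the hardware down. */
int model_suspend(struct trace *t)
{
	int r = run_step(t, STEP_SUSPEND_GANGS);

	if (r)
		return r;
	return run_step(t, STEP_HW_FINI);
}

/* Mirror image: bring the hardware up first, then resume the gangs. */
int model_resume(struct trace *t)
{
	int r = run_step(t, STEP_HW_INIT);

	if (r)
		return r;
	return run_step(t, STEP_RESUME_GANGS);
}
```

Calling model_suspend() and then model_resume() on a trace with
fail_at = -1 records the four steps in the order listed in the enum. If
the first step fails, the second is never attempted, which matches the
early returns in the patch.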



* [PATCH 46/73] drm/amdgpu/mes: implement creating mes process v2
From: Alex Deucher @ 2022-04-29 17:45 UTC (permalink / raw)
  To: amd-gfx; +Cc: Alex Deucher, Jack Xiao, Christian König, Hawking Zhang

From: Jack Xiao <Jack.Xiao@amd.com>

Create a mes process which contains process-related resources,
such as the vm, doorbell bitmap, process ctx bo, etc.

v2: move the simple variable to the end

Signed-off-by: Jack Xiao <Jack.Xiao@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c | 77 +++++++++++++++++++++++++
 drivers/gpu/drm/amd/amdgpu/amdgpu_mes.h |  3 +
 2 files changed, 80 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c
index a988c232b4a3..55005a594be1 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c
@@ -214,3 +214,80 @@ void amdgpu_mes_fini(struct amdgpu_device *adev)
 	ida_destroy(&adev->mes.doorbell_ida);
 	mutex_destroy(&adev->mes.mutex);
 }
+
+int amdgpu_mes_create_process(struct amdgpu_device *adev, int pasid,
+			      struct amdgpu_vm *vm)
+{
+	struct amdgpu_mes_process *process;
+	int r;
+
+	mutex_lock(&adev->mes.mutex);
+
+	/* allocate the mes process buffer */
+	process = kzalloc(sizeof(struct amdgpu_mes_process), GFP_KERNEL);
+	if (!process) {
+		DRM_ERROR("no more memory to create mes process\n");
+		mutex_unlock(&adev->mes.mutex);
+		return -ENOMEM;
+	}
+
+	process->doorbell_bitmap =
+		kzalloc(DIV_ROUND_UP(AMDGPU_MES_MAX_NUM_OF_QUEUES_PER_PROCESS,
+				     BITS_PER_BYTE), GFP_KERNEL);
+	if (!process->doorbell_bitmap) {
+		DRM_ERROR("failed to allocate doorbell bitmap\n");
+		kfree(process);
+		mutex_unlock(&adev->mes.mutex);
+		return -ENOMEM;
+	}
+
+	/* add the mes process to idr list */
+	r = idr_alloc(&adev->mes.pasid_idr, process, pasid, pasid + 1,
+		      GFP_KERNEL);
+	if (r < 0) {
+		DRM_ERROR("failed to lock pasid=%d\n", pasid);
+		goto clean_up_memory;
+	}
+
+	/* allocate the process context bo and map it */
+	r = amdgpu_bo_create_kernel(adev, AMDGPU_MES_PROC_CTX_SIZE, PAGE_SIZE,
+				    AMDGPU_GEM_DOMAIN_GTT,
+				    &process->proc_ctx_bo,
+				    &process->proc_ctx_gpu_addr,
+				    &process->proc_ctx_cpu_ptr);
+	if (r) {
+		DRM_ERROR("failed to allocate process context bo\n");
+		goto clean_up_pasid;
+	}
+	memset(process->proc_ctx_cpu_ptr, 0, AMDGPU_MES_PROC_CTX_SIZE);
+
+	/* allocate the starting doorbell index of the process */
+	r = amdgpu_mes_alloc_process_doorbells(adev, process);
+	if (r < 0) {
+		DRM_ERROR("failed to allocate doorbell for process\n");
+		goto clean_up_ctx;
+	}
+
+	DRM_DEBUG("process doorbell index = %d\n", process->doorbell_index);
+
+	INIT_LIST_HEAD(&process->gang_list);
+	process->vm = vm;
+	process->pasid = pasid;
+	process->process_quantum = adev->mes.default_process_quantum;
+	process->pd_gpu_addr = amdgpu_bo_gpu_offset(vm->root.bo);
+
+	mutex_unlock(&adev->mes.mutex);
+	return 0;
+
+clean_up_ctx:
+	amdgpu_bo_free_kernel(&process->proc_ctx_bo,
+			      &process->proc_ctx_gpu_addr,
+			      &process->proc_ctx_cpu_ptr);
+clean_up_pasid:
+	idr_remove(&adev->mes.pasid_idr, pasid);
+clean_up_memory:
+	kfree(process->doorbell_bitmap);
+	kfree(process);
+	mutex_unlock(&adev->mes.mutex);
+	return r;
+}
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.h
index e64b2114c7ba..010a9727cbec 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.h
@@ -213,4 +213,7 @@ struct amdgpu_mes_funcs {
 int amdgpu_mes_init(struct amdgpu_device *adev);
 void amdgpu_mes_fini(struct amdgpu_device *adev);
 
+int amdgpu_mes_create_process(struct amdgpu_device *adev, int pasid,
+			      struct amdgpu_vm *vm);
+
 #endif /* __AMDGPU_MES_H__ */
-- 
2.35.1
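
Reader's note on the error handling in amdgpu_mes_create_process(): each
resource is acquired in turn, and a failure jumps to a label that releases
only what was already acquired, in reverse order. The sketch below shows
the same goto-unwind pattern and the bitmap sizing in plain userspace C.
It is an illustration under assumptions, not the driver code: struct proc,
proc_create(), proc_destroy(), bitmap_bytes() and the value 128 for the
per-process queue limit are all hypothetical, and malloc()/calloc() stand
in for the kernel allocators.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>

#define DIV_ROUND_UP(n, d)	(((n) + (d) - 1) / (d))
#define BITS_PER_BYTE		8
#define MAX_QUEUES_PER_PROCESS	128	/* hypothetical limit */
#define PROC_CTX_SIZE		4096	/* hypothetical context size */

struct proc {
	unsigned char *doorbell_bitmap;	/* one bit per queue doorbell */
	void *ctx;			/* stands in for the process ctx bo */
};

/* Bytes needed to hold nbits bits, rounded up to a whole byte. */
size_t bitmap_bytes(size_t nbits)
{
	return DIV_ROUND_UP(nbits, BITS_PER_BYTE);
}

/* fail_ctx != 0 simulates a failure of the last allocation. */
int proc_create(struct proc **out, int fail_ctx)
{
	struct proc *p;
	int r;

	assert(out != NULL);

	p = calloc(1, sizeof(*p));
	if (!p)
		return -ENOMEM;

	p->doorbell_bitmap = calloc(bitmap_bytes(MAX_QUEUES_PER_PROCESS), 1);
	if (!p->doorbell_bitmap) {
		r = -ENOMEM;
		goto free_proc;
	}

	p->ctx = fail_ctx ? NULL : calloc(1, PROC_CTX_SIZE);
	if (!p->ctx) {
		r = -ENOMEM;
		goto free_bitmap;
	}

	*out = p;
	return 0;

	/* Unwind in reverse order of acquisition. */
free_bitmap:
	free(p->doorbell_bitmap);
free_proc:
	free(p);
	return r;
}

void proc_destroy(struct proc *p)
{
	if (!p)
		return;
	free(p->ctx);
	free(p->doorbell_bitmap);
	free(p);
}
```

On failure the caller's pointer is left untouched, so a caller that
initialised it to NULL can tell that nothing was created.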



* [PATCH 47/73] drm/amdgpu/mes: implement destroying mes process
From: Alex Deucher @ 2022-04-29 17:45 UTC (permalink / raw)
  To: amd-gfx; +Cc: Alex Deucher, Jack Xiao, Christian König, Hawking Zhang

From: Jack Xiao <Jack.Xiao@amd.com>

Destroy the mes process, which frees the resources of the process.

Signed-off-by: Jack Xiao <Jack.Xiao@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c | 58 +++++++++++++++++++++++++
 drivers/gpu/drm/amd/amdgpu/amdgpu_mes.h |  1 +
 2 files changed, 59 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c
index 55005a594be1..05e27636ce20 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c
@@ -291,3 +291,61 @@ int amdgpu_mes_create_process(struct amdgpu_device *adev, int pasid,
 	mutex_unlock(&adev->mes.mutex);
 	return r;
 }
+
+void amdgpu_mes_destroy_process(struct amdgpu_device *adev, int pasid)
+{
+	struct amdgpu_mes_process *process;
+	struct amdgpu_mes_gang *gang, *tmp1;
+	struct amdgpu_mes_queue *queue, *tmp2;
+	struct mes_remove_queue_input queue_input;
+	unsigned long flags;
+	int r;
+
+	mutex_lock(&adev->mes.mutex);
+
+	process = idr_find(&adev->mes.pasid_idr, pasid);
+	if (!process) {
+		DRM_WARN("pasid %d doesn't exist\n", pasid);
+		mutex_unlock(&adev->mes.mutex);
+		return;
+	}
+
+	/* free all gangs in the process */
+	list_for_each_entry_safe(gang, tmp1, &process->gang_list, list) {
+		/* free all queues in the gang */
+		list_for_each_entry_safe(queue, tmp2, &gang->queue_list, list) {
+			spin_lock_irqsave(&adev->mes.queue_id_lock, flags);
+			idr_remove(&adev->mes.queue_id_idr, queue->queue_id);
+			spin_unlock_irqrestore(&adev->mes.queue_id_lock, flags);
+
+			queue_input.doorbell_offset = queue->doorbell_off;
+			queue_input.gang_context_addr = gang->gang_ctx_gpu_addr;
+
+			r = adev->mes.funcs->remove_hw_queue(&adev->mes,
+							     &queue_input);
+			if (r)
+				DRM_WARN("failed to remove hardware queue\n");
+
+			list_del(&queue->list);
+			kfree(queue);
+		}
+
+		idr_remove(&adev->mes.gang_id_idr, gang->gang_id);
+		amdgpu_bo_free_kernel(&gang->gang_ctx_bo,
+				      &gang->gang_ctx_gpu_addr,
+				      &gang->gang_ctx_cpu_ptr);
+		list_del(&gang->list);
+		kfree(gang);
+	}
+
+	amdgpu_mes_free_process_doorbells(adev, process);
+
+	idr_remove(&adev->mes.pasid_idr, pasid);
+	amdgpu_bo_free_kernel(&process->proc_ctx_bo,
+			      &process->proc_ctx_gpu_addr,
+			      &process->proc_ctx_cpu_ptr);
+	kfree(process->doorbell_bitmap);
+	kfree(process);
+
+	mutex_unlock(&adev->mes.mutex);
+}
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.h
index 010a9727cbec..fa2f47e4cd5a 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.h
@@ -215,5 +215,6 @@ void amdgpu_mes_fini(struct amdgpu_device *adev);
 
 int amdgpu_mes_create_process(struct amdgpu_device *adev, int pasid,
 			      struct amdgpu_vm *vm);
+void amdgpu_mes_destroy_process(struct amdgpu_device *adev, int pasid);
 
 #endif /* __AMDGPU_MES_H__ */
-- 
2.35.1
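
Reader's note on the teardown loops in amdgpu_mes_destroy_process(): the
patch uses list_for_each_entry_safe() because every gang and queue is
freed while the list is being walked, so the next pointer has to be saved
before the current entry goes away. The sketch below shows the same idea
on a plain singly linked list in userspace C. struct node, push() and
free_all() are hypothetical names used only for this illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

struct node {
	struct node *next;
	int id;
};

/* Prepend a node; returns the new head, or the old head on failure. */
struct node *push(struct node *head, int id)
{
	struct node *n = malloc(sizeof(*n));

	if (!n)
		return head;
	n->id = id;
	n->next = head;
	return n;
}

/* Free every node and return how many were freed. */
int free_all(struct node **head)
{
	struct node *cur, *tmp;
	int freed = 0;

	assert(head != NULL);

	for (cur = *head; cur; cur = tmp) {
		tmp = cur->next;	/* save the successor before freeing */
		free(cur);
		freed++;
	}
	*head = NULL;
	return freed;
}
```

Reading cur->next after free(cur) would be a use-after-free, which is
what the temporary pointer avoids.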



* [PATCH 48/73] drm/amdgpu/mes: implement adding mes gang
From: Alex Deucher @ 2022-04-29 17:45 UTC (permalink / raw)
  To: amd-gfx; +Cc: Alex Deucher, Jack Xiao, Christian König, Hawking Zhang

From: Jack Xiao <Jack.Xiao@amd.com>

A gang is a group of queues of the same type, and is the scheduling
unit of the mes hardware scheduler.

Signed-off-by: Jack Xiao <Jack.Xiao@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c | 67 +++++++++++++++++++++++++
 drivers/gpu/drm/amd/amdgpu/amdgpu_mes.h | 12 +++++
 2 files changed, 79 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c
index 05e27636ce20..74385e4b45c4 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c
@@ -349,3 +349,70 @@ void amdgpu_mes_destroy_process(struct amdgpu_device *adev, int pasid)
 
 	mutex_unlock(&adev->mes.mutex);
 }
+
+int amdgpu_mes_add_gang(struct amdgpu_device *adev, int pasid,
+			struct amdgpu_mes_gang_properties *gprops,
+			int *gang_id)
+{
+	struct amdgpu_mes_process *process;
+	struct amdgpu_mes_gang *gang;
+	int r;
+
+	mutex_lock(&adev->mes.mutex);
+
+	process = idr_find(&adev->mes.pasid_idr, pasid);
+	if (!process) {
+		DRM_ERROR("pasid %d doesn't exist\n", pasid);
+		mutex_unlock(&adev->mes.mutex);
+		return -EINVAL;
+	}
+
+	/* allocate the mes gang buffer */
+	gang = kzalloc(sizeof(struct amdgpu_mes_gang), GFP_KERNEL);
+	if (!gang) {
+		mutex_unlock(&adev->mes.mutex);
+		return -ENOMEM;
+	}
+
+	/* add the mes gang to idr list */
+	r = idr_alloc(&adev->mes.gang_id_idr, gang, 1, 0,
+		      GFP_KERNEL);
+	if (r < 0) {
+		kfree(gang);
+		mutex_unlock(&adev->mes.mutex);
+		return r;
+	}
+
+	gang->gang_id = r;
+	*gang_id = r;
+
+	/* allocate the gang context bo and map it to cpu space */
+	r = amdgpu_bo_create_kernel(adev, AMDGPU_MES_GANG_CTX_SIZE, PAGE_SIZE,
+				    AMDGPU_GEM_DOMAIN_GTT,
+				    &gang->gang_ctx_bo,
+				    &gang->gang_ctx_gpu_addr,
+				    &gang->gang_ctx_cpu_ptr);
+	if (r) {
+		DRM_ERROR("failed to allocate gang context bo\n");
+		goto clean_up;
+	}
+	memset(gang->gang_ctx_cpu_ptr, 0, AMDGPU_MES_GANG_CTX_SIZE);
+
+	INIT_LIST_HEAD(&gang->queue_list);
+	gang->process = process;
+	gang->priority = gprops->priority;
+	gang->gang_quantum = gprops->gang_quantum ?
+		gprops->gang_quantum : adev->mes.default_gang_quantum;
+	gang->global_priority_level = gprops->global_priority_level;
+	gang->inprocess_gang_priority = gprops->inprocess_gang_priority;
+	list_add_tail(&gang->list, &process->gang_list);
+
+	mutex_unlock(&adev->mes.mutex);
+	return 0;
+
+clean_up:
+	idr_remove(&adev->mes.gang_id_idr, gang->gang_id);
+	kfree(gang);
+	mutex_unlock(&adev->mes.mutex);
+	return r;
+}
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.h
index fa2f47e4cd5a..3109bd1db6bc 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.h
@@ -158,6 +158,14 @@ struct amdgpu_mes_queue {
 	struct amdgpu_ring 		*ring;
 };
 
+struct amdgpu_mes_gang_properties {
+	uint32_t 	priority;
+	uint32_t 	gang_quantum;
+	uint32_t 	inprocess_gang_priority;
+	uint32_t 	priority_level;
+	int 		global_priority_level;
+};
+
 struct mes_add_queue_input {
 	uint32_t	process_id;
 	uint64_t	page_table_base_addr;
@@ -217,4 +225,8 @@ int amdgpu_mes_create_process(struct amdgpu_device *adev, int pasid,
 			      struct amdgpu_vm *vm);
 void amdgpu_mes_destroy_process(struct amdgpu_device *adev, int pasid);
 
+int amdgpu_mes_add_gang(struct amdgpu_device *adev, int pasid,
+			struct amdgpu_mes_gang_properties *gprops,
+			int *gang_id);
+
 #endif /* __AMDGPU_MES_H__ */
-- 
2.35.1



* [PATCH 49/73] drm/amdgpu/mes: implement removing mes gang
From: Alex Deucher @ 2022-04-29 17:46 UTC (permalink / raw)
  To: amd-gfx; +Cc: Alex Deucher, Jack Xiao, Christian König, Hawking Zhang

From: Jack Xiao <Jack.Xiao@amd.com>

Free the mes gang and its resources.

Signed-off-by: Jack Xiao <Jack.Xiao@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c | 30 +++++++++++++++++++++++++
 drivers/gpu/drm/amd/amdgpu/amdgpu_mes.h |  1 +
 2 files changed, 31 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c
index 74385e4b45c4..07ddf7bf6a3b 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c
@@ -416,3 +416,33 @@ int amdgpu_mes_add_gang(struct amdgpu_device *adev, int pasid,
 	mutex_unlock(&adev->mes.mutex);
 	return r;
 }
+
+int amdgpu_mes_remove_gang(struct amdgpu_device *adev, int gang_id)
+{
+	struct amdgpu_mes_gang *gang;
+
+	mutex_lock(&adev->mes.mutex);
+
+	gang = idr_find(&adev->mes.gang_id_idr, gang_id);
+	if (!gang) {
+		DRM_ERROR("gang id %d doesn't exist\n", gang_id);
+		mutex_unlock(&adev->mes.mutex);
+		return -EINVAL;
+	}
+
+	if (!list_empty(&gang->queue_list)) {
+		DRM_ERROR("queue list is not empty\n");
+		mutex_unlock(&adev->mes.mutex);
+		return -EBUSY;
+	}
+
+	idr_remove(&adev->mes.gang_id_idr, gang->gang_id);
+	amdgpu_bo_free_kernel(&gang->gang_ctx_bo,
+			      &gang->gang_ctx_gpu_addr,
+			      &gang->gang_ctx_cpu_ptr);
+	list_del(&gang->list);
+	kfree(gang);
+
+	mutex_unlock(&adev->mes.mutex);
+	return 0;
+}
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.h
index 3109bd1db6bc..f401a0a3eebd 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.h
@@ -228,5 +228,6 @@ void amdgpu_mes_destroy_process(struct amdgpu_device *adev, int pasid);
 int amdgpu_mes_add_gang(struct amdgpu_device *adev, int pasid,
 			struct amdgpu_mes_gang_properties *gprops,
 			int *gang_id);
+int amdgpu_mes_remove_gang(struct amdgpu_device *adev, int gang_id);
 
 #endif /* __AMDGPU_MES_H__ */
-- 
2.35.1
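
Reader's note on the checks in amdgpu_mes_remove_gang(): an unknown gang
id is rejected with -EINVAL, and a gang that still owns queues is refused
with -EBUSY so that queues are always removed before their gang. The
sketch below models only those two checks in userspace C. The fixed-size
table, struct gang, gang_add() and gang_remove() are hypothetical; the
driver uses an idr and a list instead. Ids start at 1, as they do with
the idr_alloc() call in the add path.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define MAX_GANGS 8	/* hypothetical table size */

struct gang {
	int in_use;
	int num_queues;	/* stands in for the gang's queue list */
};

struct gang_table {
	struct gang slots[MAX_GANGS];
};

/* Look up a gang by id; ids are 1-based. */
static struct gang *gang_find(struct gang_table *t, int id)
{
	if (id < 1 || id > MAX_GANGS || !t->slots[id - 1].in_use)
		return NULL;
	return &t->slots[id - 1];
}

/* Claim the lowest free slot; returns its id or -ENOSPC. */
int gang_add(struct gang_table *t)
{
	int i;

	assert(t != NULL);

	for (i = 0; i < MAX_GANGS; i++) {
		if (!t->slots[i].in_use) {
			t->slots[i].in_use = 1;
			t->slots[i].num_queues = 0;
			return i + 1;
		}
	}
	return -ENOSPC;
}

/* Remove a gang only if it exists and owns no queues. */
int gang_remove(struct gang_table *t, int id)
{
	struct gang *g = gang_find(t, id);

	if (!g)
		return -EINVAL;
	if (g->num_queues > 0)
		return -EBUSY;
	g->in_use = 0;
	return 0;
}
```

A caller that gets -EBUSY is expected to remove the remaining queues and
try again; a second removal of the same id returns -EINVAL.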



* [PATCH 50/73] drm/amdgpu/mes: implement suspending all gangs
From: Alex Deucher @ 2022-04-29 17:46 UTC (permalink / raw)
  To: amd-gfx; +Cc: Alex Deucher, Jack Xiao, Christian König, Hawking Zhang

From: Jack Xiao <Jack.Xiao@amd.com>

Implement suspending all gangs.

Signed-off-by: Jack Xiao <Jack.Xiao@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c | 25 +++++++++++++++++++++++++
 drivers/gpu/drm/amd/amdgpu/amdgpu_mes.h |  2 ++
 2 files changed, 27 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c
index 07ddf7bf6a3b..e64f2a4b5a3b 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c
@@ -446,3 +446,28 @@ int amdgpu_mes_remove_gang(struct amdgpu_device *adev, int gang_id)
 	mutex_unlock(&adev->mes.mutex);
 	return 0;
 }
+
+int amdgpu_mes_suspend(struct amdgpu_device *adev)
+{
+	struct idr *idp;
+	struct amdgpu_mes_process *process;
+	struct amdgpu_mes_gang *gang;
+	struct mes_suspend_gang_input input;
+	int r, pasid;
+
+	mutex_lock(&adev->mes.mutex);
+
+	idp = &adev->mes.pasid_idr;
+
+	idr_for_each_entry(idp, process, pasid) {
+		list_for_each_entry(gang, &process->gang_list, list) {
+			r = adev->mes.funcs->suspend_gang(&adev->mes, &input);
+			if (r)
+				DRM_ERROR("failed to suspend pasid %d gangid %d\n",
+					 pasid, gang->gang_id);
+		}
+	}
+
+	mutex_unlock(&adev->mes.mutex);
+	return 0;
+}
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.h
index f401a0a3eebd..667fc9f9b21b 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.h
@@ -230,4 +230,6 @@ int amdgpu_mes_add_gang(struct amdgpu_device *adev, int pasid,
 			int *gang_id);
 int amdgpu_mes_remove_gang(struct amdgpu_device *adev, int gang_id);
 
+int amdgpu_mes_suspend(struct amdgpu_device *adev);
+
 #endif /* __AMDGPU_MES_H__ */
-- 
2.35.1



* [PATCH 51/73] drm/amdgpu/mes: implement resuming all gangs
From: Alex Deucher @ 2022-04-29 17:46 UTC (permalink / raw)
  To: amd-gfx; +Cc: Alex Deucher, Jack Xiao, Christian König, Hawking Zhang

From: Jack Xiao <Jack.Xiao@amd.com>

Implement resuming all gangs.

Signed-off-by: Jack Xiao <Jack.Xiao@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c | 25 +++++++++++++++++++++++++
 drivers/gpu/drm/amd/amdgpu/amdgpu_mes.h |  1 +
 2 files changed, 26 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c
index e64f2a4b5a3b..b58af81f04a3 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c
@@ -471,3 +471,28 @@ int amdgpu_mes_suspend(struct amdgpu_device *adev)
 	mutex_unlock(&adev->mes.mutex);
 	return 0;
 }
+
+int amdgpu_mes_resume(struct amdgpu_device *adev)
+{
+	struct idr *idp;
+	struct amdgpu_mes_process *process;
+	struct amdgpu_mes_gang *gang;
+	struct mes_resume_gang_input input;
+	int r, pasid;
+
+	mutex_lock(&adev->mes.mutex);
+
+	idp = &adev->mes.pasid_idr;
+
+	idr_for_each_entry(idp, process, pasid) {
+		list_for_each_entry(gang, &process->gang_list, list) {
+			r = adev->mes.funcs->resume_gang(&adev->mes, &input);
+			if (r)
+				DRM_ERROR("failed to resume pasid %d gangid %d\n",
+					 pasid, gang->gang_id);
+		}
+	}
+
+	mutex_unlock(&adev->mes.mutex);
+	return 0;
+}
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.h
index 667fc9f9b21b..43d3a689732a 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.h
@@ -231,5 +231,6 @@ int amdgpu_mes_add_gang(struct amdgpu_device *adev, int pasid,
 int amdgpu_mes_remove_gang(struct amdgpu_device *adev, int gang_id);
 
 int amdgpu_mes_suspend(struct amdgpu_device *adev);
+int amdgpu_mes_resume(struct amdgpu_device *adev);
 
 #endif /* __AMDGPU_MES_H__ */
-- 
2.35.1



* [PATCH 52/73] drm/amdgpu/mes: initialize mqd from queue properties
From: Alex Deucher @ 2022-04-29 17:46 UTC (permalink / raw)
  To: amd-gfx; +Cc: Alex Deucher, Jack Xiao, Christian König, Hawking Zhang

From: Jack Xiao <Jack.Xiao@amd.com>

Add helper function to initialize mqd from queue properties.

Signed-off-by: Jack Xiao <Jack.Xiao@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c | 54 +++++++++++++++++++++++++
 1 file changed, 54 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c
index b58af81f04a3..2cd2fa76b5c8 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c
@@ -496,3 +496,57 @@ int amdgpu_mes_resume(struct amdgpu_device *adev)
 	mutex_unlock(&adev->mes.mutex);
 	return 0;
 }
+
+static int amdgpu_mes_queue_init_mqd(struct amdgpu_device *adev,
+				     struct amdgpu_mes_queue *q,
+				     struct amdgpu_mes_queue_properties *p)
+{
+	struct amdgpu_mqd *mqd_mgr = &adev->mqds[p->queue_type];
+	u32 mqd_size = mqd_mgr->mqd_size;
+	struct amdgpu_mqd_prop mqd_prop = {0};
+	int r;
+
+	r = amdgpu_bo_create_kernel(adev, mqd_size, PAGE_SIZE,
+				    AMDGPU_GEM_DOMAIN_GTT,
+				    &q->mqd_obj,
+				    &q->mqd_gpu_addr, &q->mqd_cpu_ptr);
+	if (r) {
+		dev_warn(adev->dev, "failed to create queue mqd bo (%d)", r);
+		return r;
+	}
+	memset(q->mqd_cpu_ptr, 0, mqd_size);
+
+	mqd_prop.mqd_gpu_addr = q->mqd_gpu_addr;
+	mqd_prop.hqd_base_gpu_addr = p->hqd_base_gpu_addr;
+	mqd_prop.rptr_gpu_addr = p->rptr_gpu_addr;
+	mqd_prop.wptr_gpu_addr = p->wptr_gpu_addr;
+	mqd_prop.queue_size = p->queue_size;
+	mqd_prop.use_doorbell = true;
+	mqd_prop.doorbell_index = p->doorbell_off;
+	mqd_prop.eop_gpu_addr = p->eop_gpu_addr;
+	mqd_prop.hqd_pipe_priority = p->hqd_pipe_priority;
+	mqd_prop.hqd_queue_priority = p->hqd_queue_priority;
+	mqd_prop.hqd_active = false;
+
+	r = amdgpu_bo_reserve(q->mqd_obj, false);
+	if (unlikely(r != 0))
+		goto clean_up;
+
+	mqd_mgr->init_mqd(adev, q->mqd_cpu_ptr, &mqd_prop);
+
+	amdgpu_bo_unreserve(q->mqd_obj);
+	return 0;
+
+clean_up:
+	amdgpu_bo_free_kernel(&q->mqd_obj,
+			      &q->mqd_gpu_addr,
+			      &q->mqd_cpu_ptr);
+	return r;
+}
+
+static void amdgpu_mes_queue_free_mqd(struct amdgpu_mes_queue *q)
+{
+	amdgpu_bo_free_kernel(&q->mqd_obj,
+			      &q->mqd_gpu_addr,
+			      &q->mqd_cpu_ptr);
+}
-- 
2.35.1



* [PATCH 53/73] drm/amdgpu/mes: implement adding mes queue
From: Alex Deucher @ 2022-04-29 17:46 UTC (permalink / raw)
  To: amd-gfx; +Cc: Alex Deucher, Jack Xiao, Christian König, Hawking Zhang

From: Jack Xiao <Jack.Xiao@amd.com>

Allocate related resources for the queue and add it to mes
for scheduling.

Signed-off-by: Jack Xiao <Jack.Xiao@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c | 105 ++++++++++++++++++++++++
 drivers/gpu/drm/amd/amdgpu/amdgpu_mes.h |  19 +++++
 2 files changed, 124 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c
index 2cd2fa76b5c8..9f059c32c6c2 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c
@@ -550,3 +550,108 @@ static void amdgpu_mes_queue_free_mqd(struct amdgpu_mes_queue *q)
 			      &q->mqd_gpu_addr,
 			      &q->mqd_cpu_ptr);
 }
+
+int amdgpu_mes_add_hw_queue(struct amdgpu_device *adev, int gang_id,
+			    struct amdgpu_mes_queue_properties *qprops,
+			    int *queue_id)
+{
+	struct amdgpu_mes_queue *queue;
+	struct amdgpu_mes_gang *gang;
+	struct mes_add_queue_input queue_input;
+	unsigned long flags;
+	int r;
+
+	mutex_lock(&adev->mes.mutex);
+
+	gang = idr_find(&adev->mes.gang_id_idr, gang_id);
+	if (!gang) {
+		DRM_ERROR("gang id %d doesn't exist\n", gang_id);
+		mutex_unlock(&adev->mes.mutex);
+		return -EINVAL;
+	}
+
+	/* allocate the mes queue buffer */
+	queue = kzalloc(sizeof(struct amdgpu_mes_queue), GFP_KERNEL);
+	if (!queue) {
+		mutex_unlock(&adev->mes.mutex);
+		return -ENOMEM;
+	}
+
+	/* add the mes queue to idr list */
+	spin_lock_irqsave(&adev->mes.queue_id_lock, flags);
+	r = idr_alloc(&adev->mes.queue_id_idr, queue, 1, 0,
+		      GFP_ATOMIC);
+	if (r < 0) {
+		spin_unlock_irqrestore(&adev->mes.queue_id_lock, flags);
+		goto clean_up_memory;
+	}
+	spin_unlock_irqrestore(&adev->mes.queue_id_lock, flags);
+	*queue_id = queue->queue_id = r;
+
+	/* allocate a doorbell index for the queue */
+	r = amdgpu_mes_queue_doorbell_get(adev, gang->process,
+					  qprops->queue_type,
+					  &qprops->doorbell_off);
+	if (r)
+		goto clean_up_queue_id;
+
+	/* initialize the queue mqd */
+	r = amdgpu_mes_queue_init_mqd(adev, queue, qprops);
+	if (r)
+		goto clean_up_doorbell;
+
+	/* add hw queue to mes */
+	queue_input.process_id = gang->process->pasid;
+	queue_input.page_table_base_addr = gang->process->pd_gpu_addr;
+	queue_input.process_va_start = 0;
+	queue_input.process_va_end =
+		(adev->vm_manager.max_pfn - 1) << AMDGPU_GPU_PAGE_SHIFT;
+	queue_input.process_quantum = gang->process->process_quantum;
+	queue_input.process_context_addr = gang->process->proc_ctx_gpu_addr;
+	queue_input.gang_quantum = gang->gang_quantum;
+	queue_input.gang_context_addr = gang->gang_ctx_gpu_addr;
+	queue_input.inprocess_gang_priority = gang->inprocess_gang_priority;
+	queue_input.gang_global_priority_level = gang->global_priority_level;
+	queue_input.doorbell_offset = qprops->doorbell_off;
+	queue_input.mqd_addr = queue->mqd_gpu_addr;
+	queue_input.wptr_addr = qprops->wptr_gpu_addr;
+	queue_input.queue_type = qprops->queue_type;
+	queue_input.paging = qprops->paging;
+
+	r = adev->mes.funcs->add_hw_queue(&adev->mes, &queue_input);
+	if (r) {
+		DRM_ERROR("failed to add hardware queue to MES, doorbell=0x%llx\n",
+			  qprops->doorbell_off);
+		goto clean_up_mqd;
+	}
+
+	DRM_DEBUG("MES hw queue was added, pasid=%d, gang id=%d, "
+		  "queue type=%d, doorbell=0x%llx\n",
+		  gang->process->pasid, gang_id, qprops->queue_type,
+		  qprops->doorbell_off);
+
+	queue->ring = qprops->ring;
+	queue->doorbell_off = qprops->doorbell_off;
+	queue->wptr_gpu_addr = qprops->wptr_gpu_addr;
+	queue->queue_type = qprops->queue_type;
+	queue->paging = qprops->paging;
+	queue->gang = gang;
+	list_add_tail(&queue->list, &gang->queue_list);
+
+	mutex_unlock(&adev->mes.mutex);
+	return 0;
+
+clean_up_mqd:
+	amdgpu_mes_queue_free_mqd(queue);
+clean_up_doorbell:
+	amdgpu_mes_queue_doorbell_free(adev, gang->process,
+				       qprops->doorbell_off);
+clean_up_queue_id:
+	spin_lock_irqsave(&adev->mes.queue_id_lock, flags);
+	idr_remove(&adev->mes.queue_id_idr, queue->queue_id);
+	spin_unlock_irqrestore(&adev->mes.queue_id_lock, flags);
+clean_up_memory:
+	kfree(queue);
+	mutex_unlock(&adev->mes.mutex);
+	return r;
+}
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.h
index 43d3a689732a..ec727c2109bc 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.h
@@ -158,6 +158,21 @@ struct amdgpu_mes_queue {
 	struct amdgpu_ring 		*ring;
 };
 
+struct amdgpu_mes_queue_properties {
+	int 			queue_type;
+	uint64_t                hqd_base_gpu_addr;
+	uint64_t                rptr_gpu_addr;
+	uint64_t                wptr_gpu_addr;
+	uint32_t                queue_size;
+	uint64_t                eop_gpu_addr;
+	uint32_t                hqd_pipe_priority;
+	uint32_t                hqd_queue_priority;
+	bool 			paging;
+	struct amdgpu_ring 	*ring;
+	/* out */
+	uint64_t       		doorbell_off;
+};
+
 struct amdgpu_mes_gang_properties {
 	uint32_t 	priority;
 	uint32_t 	gang_quantum;
@@ -233,4 +248,8 @@ int amdgpu_mes_remove_gang(struct amdgpu_device *adev, int gang_id);
 int amdgpu_mes_suspend(struct amdgpu_device *adev);
 int amdgpu_mes_resume(struct amdgpu_device *adev);
 
+int amdgpu_mes_add_hw_queue(struct amdgpu_device *adev, int gang_id,
+			    struct amdgpu_mes_queue_properties *qprops,
+			    int *queue_id);
+
 #endif /* __AMDGPU_MES_H__ */
-- 
2.35.1
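
The error path in amdgpu_mes_add_hw_queue() above releases resources in
the reverse order of acquisition by falling through a chain of goto
labels. A minimal user-space sketch of that pattern follows. Everything
named demo_* is hypothetical and only stands in for the driver objects;
the fail_at parameter is an assumption added to make each stage testable.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* Hypothetical stand-in for a queue object; not the amdgpu structure. */
struct demo_queue {
	int id;		/* 0 means "no id reserved" */
	void *mqd;	/* stand-in for the MQD buffer */
	int in_hw;	/* 1 once the queue is handed to the scheduler */
};

/* fail_at selects the failing stage: 0 = none, 1 = id, 2 = mqd, 3 = hw add. */
int demo_add_queue(struct demo_queue **out, int fail_at)
{
	struct demo_queue *q;
	int r;

	assert(out != NULL);

	q = calloc(1, sizeof(*q));
	if (!q)
		return -ENOMEM;

	if (fail_at == 1) {
		r = -ENOSPC;
		goto clean_up_memory;
	}
	q->id = 1;

	q->mqd = (fail_at == 2) ? NULL : malloc(64);
	if (!q->mqd) {
		r = -ENOMEM;
		goto clean_up_id;
	}

	if (fail_at == 3) {
		r = -EIO;
		goto clean_up_mqd;
	}
	q->in_hw = 1;

	*out = q;
	return 0;

	/* Each label undoes one stage, then falls through to the earlier ones. */
clean_up_mqd:
	free(q->mqd);
clean_up_id:
	q->id = 0;
clean_up_memory:
	free(q);
	return r;
}

/* Tear down a queue that demo_add_queue() returned successfully. */
void demo_remove_queue(struct demo_queue *q)
{
	if (!q)
		return;
	free(q->mqd);
	free(q);
}
```

Because every label falls through to the ones below it, a failure at a
later stage undoes all earlier stages, and *out is only written on
success.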


* [PATCH 54/73] drm/amdgpu/mes: implement removing mes queue
  2022-04-29 17:45 [PATCH 00/73] MES support Alex Deucher
                   ` (52 preceding siblings ...)
  2022-04-29 17:46 ` [PATCH 53/73] drm/amdgpu/mes: implement adding mes queue Alex Deucher
@ 2022-04-29 17:46 ` Alex Deucher
  2022-04-29 17:46 ` [PATCH 55/73] drm/amdgpu/mes: add helper function to convert ring to queue property Alex Deucher
                   ` (18 subsequent siblings)
  72 siblings, 0 replies; 74+ messages in thread
From: Alex Deucher @ 2022-04-29 17:46 UTC (permalink / raw)
  To: amd-gfx; +Cc: Alex Deucher, Jack Xiao, Christian König, Hawking Zhang

From: Jack Xiao <Jack.Xiao@amd.com>

Remove the MES queue from MES scheduling and free its resources.

Signed-off-by: Jack Xiao <Jack.Xiao@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c | 45 +++++++++++++++++++++++++
 drivers/gpu/drm/amd/amdgpu/amdgpu_mes.h |  1 +
 2 files changed, 46 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c
index 9f059c32c6c2..df0e542bd687 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c
@@ -655,3 +655,48 @@ int amdgpu_mes_add_hw_queue(struct amdgpu_device *adev, int gang_id,
 	mutex_unlock(&adev->mes.mutex);
 	return r;
 }
+
+int amdgpu_mes_remove_hw_queue(struct amdgpu_device *adev, int queue_id)
+{
+	unsigned long flags;
+	struct amdgpu_mes_queue *queue;
+	struct amdgpu_mes_gang *gang;
+	struct mes_remove_queue_input queue_input;
+	int r;
+
+	mutex_lock(&adev->mes.mutex);
+
+	/* remove the mes queue from idr list */
+	spin_lock_irqsave(&adev->mes.queue_id_lock, flags);
+
+	queue = idr_find(&adev->mes.queue_id_idr, queue_id);
+	if (!queue) {
+		spin_unlock_irqrestore(&adev->mes.queue_id_lock, flags);
+		mutex_unlock(&adev->mes.mutex);
+		DRM_ERROR("queue id %d doesn't exist\n", queue_id);
+		return -EINVAL;
+	}
+
+	idr_remove(&adev->mes.queue_id_idr, queue_id);
+	spin_unlock_irqrestore(&adev->mes.queue_id_lock, flags);
+
+	DRM_DEBUG("try to remove queue, doorbell off = 0x%llx\n",
+		  queue->doorbell_off);
+
+	gang = queue->gang;
+	queue_input.doorbell_offset = queue->doorbell_off;
+	queue_input.gang_context_addr = gang->gang_ctx_gpu_addr;
+
+	r = adev->mes.funcs->remove_hw_queue(&adev->mes, &queue_input);
+	if (r)
+		DRM_ERROR("failed to remove hardware queue, queue id = %d\n",
+			  queue_id);
+
+	amdgpu_mes_queue_free_mqd(queue);
+	list_del(&queue->list);
+	amdgpu_mes_queue_doorbell_free(adev, gang->process,
+				       queue->doorbell_off);
+	kfree(queue);
+	mutex_unlock(&adev->mes.mutex);
+	return 0;
+}
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.h
index ec727c2109bc..bf90863852a7 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.h
@@ -251,5 +251,6 @@ int amdgpu_mes_resume(struct amdgpu_device *adev);
 int amdgpu_mes_add_hw_queue(struct amdgpu_device *adev, int gang_id,
 			    struct amdgpu_mes_queue_properties *qprops,
 			    int *queue_id);
+int amdgpu_mes_remove_hw_queue(struct amdgpu_device *adev, int queue_id);
 
 #endif /* __AMDGPU_MES_H__ */
-- 
2.35.1


* [PATCH 55/73] drm/amdgpu/mes: add helper function to convert ring to queue property
  2022-04-29 17:45 [PATCH 00/73] MES support Alex Deucher
                   ` (53 preceding siblings ...)
  2022-04-29 17:46 ` [PATCH 54/73] drm/amdgpu/mes: implement removing " Alex Deucher
@ 2022-04-29 17:46 ` Alex Deucher
  2022-04-29 17:46 ` [PATCH 56/73] drm/amdgpu/mes: add helper function to get the ctx meta data offset Alex Deucher
                   ` (17 subsequent siblings)
  72 siblings, 0 replies; 74+ messages in thread
From: Alex Deucher @ 2022-04-29 17:46 UTC (permalink / raw)
  To: amd-gfx; +Cc: Alex Deucher, Jack Xiao, Christian König, Hawking Zhang

From: Jack Xiao <Jack.Xiao@amd.com>

Add a helper function to convert a ring into queue properties.

Signed-off-by: Jack Xiao <Jack.Xiao@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c | 17 +++++++++++++++++
 1 file changed, 17 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c
index df0e542bd687..8cb74d0d0a1f 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c
@@ -700,3 +700,20 @@ int amdgpu_mes_remove_hw_queue(struct amdgpu_device *adev, int queue_id)
 	mutex_unlock(&adev->mes.mutex);
 	return 0;
 }
+
+static void
+amdgpu_mes_ring_to_queue_props(struct amdgpu_device *adev,
+			       struct amdgpu_ring *ring,
+			       struct amdgpu_mes_queue_properties *props)
+{
+	props->queue_type = ring->funcs->type;
+	props->hqd_base_gpu_addr = ring->gpu_addr;
+	props->rptr_gpu_addr = ring->rptr_gpu_addr;
+	props->wptr_gpu_addr = ring->wptr_gpu_addr;
+	props->queue_size = ring->ring_size;
+	props->eop_gpu_addr = ring->eop_gpu_addr;
+	props->hqd_pipe_priority = AMDGPU_GFX_PIPE_PRIO_NORMAL;
+	props->hqd_queue_priority = AMDGPU_GFX_QUEUE_PRIORITY_MINIMUM;
+	props->paging = false;
+	props->ring = ring;
+}
-- 
2.35.1


* [PATCH 56/73] drm/amdgpu/mes: add helper function to get the ctx meta data offset
  2022-04-29 17:45 [PATCH 00/73] MES support Alex Deucher
                   ` (54 preceding siblings ...)
  2022-04-29 17:46 ` [PATCH 55/73] drm/amdgpu/mes: add helper function to convert ring to queue property Alex Deucher
@ 2022-04-29 17:46 ` Alex Deucher
  2022-04-29 17:46 ` [PATCH 57/73] drm/amdgpu/mes: use ring for kernel queue submission Alex Deucher
                   ` (16 subsequent siblings)
  72 siblings, 0 replies; 74+ messages in thread
From: Alex Deucher @ 2022-04-29 17:46 UTC (permalink / raw)
  To: amd-gfx; +Cc: Alex Deucher, Jack Xiao, Christian König, Hawking Zhang

From: Jack Xiao <Jack.Xiao@amd.com>

Add the helper function to get the corresponding ctx meta data offset.

Signed-off-by: Jack Xiao <Jack.Xiao@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c | 36 +++++++++++++++++++++++++
 drivers/gpu/drm/amd/amdgpu/amdgpu_mes.h |  3 ++-
 2 files changed, 38 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c
index 8cb74d0d0a1f..4e99adcfbb0e 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c
@@ -717,3 +717,39 @@ amdgpu_mes_ring_to_queue_props(struct amdgpu_device *adev,
 	props->paging = false;
 	props->ring = ring;
 }
+
+#define DEFINE_AMDGPU_MES_CTX_GET_OFFS_ENG(_eng)			\
+do {									\
+       if (id_offs < AMDGPU_MES_CTX_MAX_OFFS)				\
+		return offsetof(struct amdgpu_mes_ctx_meta_data,	\
+				_eng[ring->idx].slots[id_offs]);        \
+       else if (id_offs == AMDGPU_MES_CTX_RING_OFFS)			\
+		return offsetof(struct amdgpu_mes_ctx_meta_data,        \
+				_eng[ring->idx].ring);                  \
+       else if (id_offs == AMDGPU_MES_CTX_IB_OFFS)			\
+		return offsetof(struct amdgpu_mes_ctx_meta_data,        \
+				_eng[ring->idx].ib);                    \
+       else if (id_offs == AMDGPU_MES_CTX_PADDING_OFFS)			\
+		return offsetof(struct amdgpu_mes_ctx_meta_data,        \
+				_eng[ring->idx].padding);               \
+} while(0)
+
+int amdgpu_mes_ctx_get_offs(struct amdgpu_ring *ring, unsigned int id_offs)
+{
+	switch (ring->funcs->type) {
+	case AMDGPU_RING_TYPE_GFX:
+		DEFINE_AMDGPU_MES_CTX_GET_OFFS_ENG(gfx);
+		break;
+	case AMDGPU_RING_TYPE_COMPUTE:
+		DEFINE_AMDGPU_MES_CTX_GET_OFFS_ENG(compute);
+		break;
+	case AMDGPU_RING_TYPE_SDMA:
+		DEFINE_AMDGPU_MES_CTX_GET_OFFS_ENG(sdma);
+		break;
+	default:
+		break;
+	}
+
+	WARN_ON(1);
+	return -EINVAL;
+}
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.h
index bf90863852a7..36684416f277 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.h
@@ -230,9 +230,10 @@ struct amdgpu_mes_funcs {
 			   struct mes_resume_gang_input *input);
 };
 
-
 #define amdgpu_mes_kiq_hw_init(adev) (adev)->mes.kiq_hw_init((adev))
 
+int amdgpu_mes_ctx_get_offs(struct amdgpu_ring *ring, unsigned int id_offs);
+
 int amdgpu_mes_init(struct amdgpu_device *adev);
 void amdgpu_mes_fini(struct amdgpu_device *adev);
 
-- 
2.35.1
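
The offset helper in this patch maps a (ring type, ring index, id) triple
to a byte offset inside one shared metadata structure. The sketch below
shows the same idea in portable user-space C. The demo_* types, sizes and
constants are hypothetical and do not reflect the real layout of struct
amdgpu_mes_ctx_meta_data. It also computes the array element offset by
hand rather than passing a runtime index to offsetof(), which standard C
does not guarantee to support.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_MAX_SLOTS	4			/* hypothetical slot count */
#define DEMO_RING_OFFS	DEMO_MAX_SLOTS		/* id selecting the ring area */
#define DEMO_IB_OFFS	(DEMO_MAX_SLOTS + 1)	/* id selecting the IB area */
#define DEMO_MAX_RINGS	2

/* Hypothetical per-ring metadata block. */
struct demo_eng_meta {
	unsigned int slots[DEMO_MAX_SLOTS];
	unsigned char ring[64];
	unsigned char ib[32];
};

/* Hypothetical per-context metadata: one array of blocks per engine. */
struct demo_meta_data {
	struct demo_eng_meta gfx[DEMO_MAX_RINGS];
	struct demo_eng_meta compute[DEMO_MAX_RINGS];
};

enum demo_eng_type { DEMO_ENG_GFX, DEMO_ENG_COMPUTE };

/* Return the byte offset of the selected field, or -1 for invalid input. */
long demo_ctx_get_offs(enum demo_eng_type type, unsigned int idx,
		       unsigned int id_offs)
{
	size_t base, member;

	if (idx >= DEMO_MAX_RINGS)
		return -1;

	switch (type) {
	case DEMO_ENG_GFX:
		base = offsetof(struct demo_meta_data, gfx);
		break;
	case DEMO_ENG_COMPUTE:
		base = offsetof(struct demo_meta_data, compute);
		break;
	default:
		return -1;
	}

	if (id_offs < DEMO_MAX_SLOTS)
		member = offsetof(struct demo_eng_meta, slots) +
			 id_offs * sizeof(unsigned int);
	else if (id_offs == DEMO_RING_OFFS)
		member = offsetof(struct demo_eng_meta, ring);
	else if (id_offs == DEMO_IB_OFFS)
		member = offsetof(struct demo_eng_meta, ib);
	else
		return -1;

	return (long)(base + idx * sizeof(struct demo_eng_meta) + member);
}
```

Adding the returned offset to the base CPU or GPU address of the metadata
buffer would give the address of the selected field.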


* [PATCH 57/73] drm/amdgpu/mes: use ring for kernel queue submission
  2022-04-29 17:45 [PATCH 00/73] MES support Alex Deucher
                   ` (55 preceding siblings ...)
  2022-04-29 17:46 ` [PATCH 56/73] drm/amdgpu/mes: add helper function to get the ctx meta data offset Alex Deucher
@ 2022-04-29 17:46 ` Alex Deucher
  2022-04-29 17:46 ` [PATCH 58/73] drm/amdgpu/mes: implement removing mes ring Alex Deucher
                   ` (15 subsequent siblings)
  72 siblings, 0 replies; 74+ messages in thread
From: Alex Deucher @ 2022-04-29 17:46 UTC (permalink / raw)
  To: amd-gfx; +Cc: Alex Deucher, Jack Xiao, Christian König, Hawking Zhang

From: Jack Xiao <Jack.Xiao@amd.com>

Use ring as the front end for kernel queue submission.

Signed-off-by: Jack Xiao <Jack.Xiao@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c | 93 +++++++++++++++++++++++++
 drivers/gpu/drm/amd/amdgpu/amdgpu_mes.h |  5 ++
 2 files changed, 98 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c
index 4e99adcfbb0e..827391fcb2a3 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c
@@ -753,3 +753,96 @@ int amdgpu_mes_ctx_get_offs(struct amdgpu_ring *ring, unsigned int id_offs)
 	WARN_ON(1);
 	return -EINVAL;
 }
+
+int amdgpu_mes_add_ring(struct amdgpu_device *adev, int gang_id,
+			int queue_type, int idx,
+			struct amdgpu_mes_ctx_data *ctx_data,
+			struct amdgpu_ring **out)
+{
+	struct amdgpu_ring *ring;
+	struct amdgpu_mes_gang *gang;
+	struct amdgpu_mes_queue_properties qprops = {0};
+	int r, queue_id, pasid;
+
+	mutex_lock(&adev->mes.mutex);
+	gang = idr_find(&adev->mes.gang_id_idr, gang_id);
+	if (!gang) {
+		DRM_ERROR("gang id %d doesn't exist\n", gang_id);
+		mutex_unlock(&adev->mes.mutex);
+		return -EINVAL;
+	}
+	pasid = gang->process->pasid;
+
+	ring = kzalloc(sizeof(struct amdgpu_ring), GFP_KERNEL);
+	if (!ring) {
+		mutex_unlock(&adev->mes.mutex);
+		return -ENOMEM;
+	}
+
+	ring->ring_obj = NULL;
+	ring->use_doorbell = true;
+	ring->is_mes_queue = true;
+	ring->mes_ctx = ctx_data;
+	ring->idx = idx;
+	ring->no_scheduler = true;
+
+	if (queue_type == AMDGPU_RING_TYPE_COMPUTE) {
+		int offset = offsetof(struct amdgpu_mes_ctx_meta_data,
+				      compute[ring->idx].mec_hpd);
+		ring->eop_gpu_addr =
+			amdgpu_mes_ctx_get_offs_gpu_addr(ring, offset);
+	}
+
+	switch (queue_type) {
+	case AMDGPU_RING_TYPE_GFX:
+		ring->funcs = adev->gfx.gfx_ring[0].funcs;
+		break;
+	case AMDGPU_RING_TYPE_COMPUTE:
+		ring->funcs = adev->gfx.compute_ring[0].funcs;
+		break;
+	case AMDGPU_RING_TYPE_SDMA:
+		ring->funcs = adev->sdma.instance[0].ring.funcs;
+		break;
+	default:
+		BUG();
+	}
+
+	r = amdgpu_ring_init(adev, ring, 1024, NULL, 0,
+			     AMDGPU_RING_PRIO_DEFAULT, NULL);
+	if (r)
+		goto clean_up_memory;
+
+	amdgpu_mes_ring_to_queue_props(adev, ring, &qprops);
+
+	dma_fence_wait(gang->process->vm->last_update, false);
+	dma_fence_wait(ctx_data->meta_data_va->last_pt_update, false);
+	mutex_unlock(&adev->mes.mutex);
+
+	r = amdgpu_mes_add_hw_queue(adev, gang_id, &qprops, &queue_id);
+	if (r)
+		goto clean_up_ring;
+
+	ring->hw_queue_id = queue_id;
+	ring->doorbell_index = qprops.doorbell_off;
+
+	if (queue_type == AMDGPU_RING_TYPE_GFX)
+		sprintf(ring->name, "gfx_%d.%d.%d", pasid, gang_id, queue_id);
+	else if (queue_type == AMDGPU_RING_TYPE_COMPUTE)
+		sprintf(ring->name, "compute_%d.%d.%d", pasid, gang_id,
+			queue_id);
+	else if (queue_type == AMDGPU_RING_TYPE_SDMA)
+		sprintf(ring->name, "sdma_%d.%d.%d", pasid, gang_id,
+			queue_id);
+	else
+		BUG();
+
+	*out = ring;
+	return 0;
+
+clean_up_ring:
+	amdgpu_ring_fini(ring);
+clean_up_memory:
+	kfree(ring);
+	mutex_unlock(&adev->mes.mutex);
+	return r;
+}
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.h
index 36684416f277..1fe5c869f37e 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.h
@@ -254,4 +254,9 @@ int amdgpu_mes_add_hw_queue(struct amdgpu_device *adev, int gang_id,
 			    int *queue_id);
 int amdgpu_mes_remove_hw_queue(struct amdgpu_device *adev, int queue_id);
 
+int amdgpu_mes_add_ring(struct amdgpu_device *adev, int gang_id,
+			int queue_type, int idx,
+			struct amdgpu_mes_ctx_data *ctx_data,
+			struct amdgpu_ring **out);
+
 #endif /* __AMDGPU_MES_H__ */
-- 
2.35.1
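
amdgpu_mes_add_ring() above names each ring "<engine>_<pasid>.<gang>.<queue>".
The sketch below reproduces that naming scheme in user space, using
snprintf() so that an undersized buffer is reported instead of overrun.
The demo_* names and the -1 error convention are assumptions made for
illustration; the patch itself uses sprintf() into ring->name.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

enum demo_queue_type { DEMO_QUEUE_GFX, DEMO_QUEUE_COMPUTE, DEMO_QUEUE_SDMA };

/*
 * Format "<engine>_<pasid>.<gang_id>.<queue_id>" into buf.
 * Returns 0 on success, -1 on an unknown type or if the name is truncated.
 */
int demo_ring_name(char *buf, size_t len, enum demo_queue_type type,
		   int pasid, int gang_id, int queue_id)
{
	const char *prefix;
	int n;

	assert(buf != NULL);

	switch (type) {
	case DEMO_QUEUE_GFX:
		prefix = "gfx";
		break;
	case DEMO_QUEUE_COMPUTE:
		prefix = "compute";
		break;
	case DEMO_QUEUE_SDMA:
		prefix = "sdma";
		break;
	default:
		return -1;
	}

	n = snprintf(buf, len, "%s_%d.%d.%d", prefix, pasid, gang_id, queue_id);
	if (n < 0 || (size_t)n >= len)
		return -1;	/* encoding error or truncated output */

	return 0;
}
```

For example, demo_ring_name(name, sizeof(name), DEMO_QUEUE_COMPUTE, 32768, 1, 2)
fills name with "compute_32768.1.2".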


* [PATCH 58/73] drm/amdgpu/mes: implement removing mes ring
  2022-04-29 17:45 [PATCH 00/73] MES support Alex Deucher
                   ` (56 preceding siblings ...)
  2022-04-29 17:46 ` [PATCH 57/73] drm/amdgpu/mes: use ring for kernel queue submission Alex Deucher
@ 2022-04-29 17:46 ` Alex Deucher
  2022-04-29 17:46 ` [PATCH 59/73] drm/amdgpu/mes: add helper functions to alloc/free ctx metadata Alex Deucher
                   ` (14 subsequent siblings)
  72 siblings, 0 replies; 74+ messages in thread
From: Alex Deucher @ 2022-04-29 17:46 UTC (permalink / raw)
  To: amd-gfx; +Cc: Alex Deucher, Jack Xiao, Christian König, Hawking Zhang

From: Jack Xiao <Jack.Xiao@amd.com>

Remove the mes ring and its resources.

Signed-off-by: Jack Xiao <Jack.Xiao@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c | 11 +++++++++++
 drivers/gpu/drm/amd/amdgpu/amdgpu_mes.h |  2 ++
 2 files changed, 13 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c
index 827391fcb2a3..fa43a7e3c9ab 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c
@@ -846,3 +846,14 @@ int amdgpu_mes_add_ring(struct amdgpu_device *adev, int gang_id,
 	mutex_unlock(&adev->mes.mutex);
 	return r;
 }
+
+void amdgpu_mes_remove_ring(struct amdgpu_device *adev,
+			    struct amdgpu_ring *ring)
+{
+	if (!ring)
+		return;
+
+	amdgpu_mes_remove_hw_queue(adev, ring->hw_queue_id);
+	amdgpu_ring_fini(ring);
+	kfree(ring);
+}
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.h
index 1fe5c869f37e..37232b396b06 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.h
@@ -258,5 +258,7 @@ int amdgpu_mes_add_ring(struct amdgpu_device *adev, int gang_id,
 			int queue_type, int idx,
 			struct amdgpu_mes_ctx_data *ctx_data,
 			struct amdgpu_ring **out);
+void amdgpu_mes_remove_ring(struct amdgpu_device *adev,
+			    struct amdgpu_ring *ring);
 
 #endif /* __AMDGPU_MES_H__ */
-- 
2.35.1


* [PATCH 59/73] drm/amdgpu/mes: add helper functions to alloc/free ctx metadata
  2022-04-29 17:45 [PATCH 00/73] MES support Alex Deucher
                   ` (57 preceding siblings ...)
  2022-04-29 17:46 ` [PATCH 58/73] drm/amdgpu/mes: implement removing mes ring Alex Deucher
@ 2022-04-29 17:46 ` Alex Deucher
  2022-04-29 17:46 ` [PATCH 60/73] drm/amdgpu: skip kfd routines when mes enabled Alex Deucher
                   ` (13 subsequent siblings)
  72 siblings, 0 replies; 74+ messages in thread
From: Alex Deucher @ 2022-04-29 17:46 UTC (permalink / raw)
  To: amd-gfx; +Cc: Alex Deucher, Jack Xiao, Christian König, Hawking Zhang

From: Jack Xiao <Jack.Xiao@amd.com>

Add the helper functions to allocate/free context metadata.

Signed-off-by: Jack Xiao <Jack.Xiao@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c | 25 +++++++++++++++++++++++++
 drivers/gpu/drm/amd/amdgpu/amdgpu_mes.h |  4 ++++
 2 files changed, 29 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c
index fa43a7e3c9ab..6c01581e3a7b 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c
@@ -857,3 +857,28 @@ void amdgpu_mes_remove_ring(struct amdgpu_device *adev,
 	amdgpu_ring_fini(ring);
 	kfree(ring);
 }
+
+int amdgpu_mes_ctx_alloc_meta_data(struct amdgpu_device *adev,
+				   struct amdgpu_mes_ctx_data *ctx_data)
+{
+	int r;
+
+	r = amdgpu_bo_create_kernel(adev,
+			    sizeof(struct amdgpu_mes_ctx_meta_data),
+			    PAGE_SIZE, AMDGPU_GEM_DOMAIN_GTT,
+			    &ctx_data->meta_data_obj, NULL,
+			    &ctx_data->meta_data_ptr);
+	if (r)
+		return r;
+
+	memset(ctx_data->meta_data_ptr, 0,
+	       sizeof(struct amdgpu_mes_ctx_meta_data));
+
+	return 0;
+}
+
+void amdgpu_mes_ctx_free_meta_data(struct amdgpu_mes_ctx_data *ctx_data)
+{
+	if (ctx_data->meta_data_obj)
+		amdgpu_bo_free_kernel(&ctx_data->meta_data_obj, NULL, NULL);
+}
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.h
index 37232b396b06..50d490e69cb7 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.h
@@ -261,4 +261,8 @@ int amdgpu_mes_add_ring(struct amdgpu_device *adev, int gang_id,
 void amdgpu_mes_remove_ring(struct amdgpu_device *adev,
 			    struct amdgpu_ring *ring);
 
+int amdgpu_mes_ctx_alloc_meta_data(struct amdgpu_device *adev,
+				   struct amdgpu_mes_ctx_data *ctx_data);
+void amdgpu_mes_ctx_free_meta_data(struct amdgpu_mes_ctx_data *ctx_data);
+
 #endif /* __AMDGPU_MES_H__ */
-- 
2.35.1


* [PATCH 60/73] drm/amdgpu: skip kfd routines when mes enabled
  2022-04-29 17:45 [PATCH 00/73] MES support Alex Deucher
                   ` (58 preceding siblings ...)
  2022-04-29 17:46 ` [PATCH 59/73] drm/amdgpu/mes: add helper functions to alloc/free ctx metadata Alex Deucher
@ 2022-04-29 17:46 ` Alex Deucher
  2022-04-29 17:46 ` [PATCH 61/73] drm/amdgpu: Enable KFD with MES enabled Alex Deucher
                   ` (12 subsequent siblings)
  72 siblings, 0 replies; 74+ messages in thread
From: Alex Deucher @ 2022-04-29 17:46 UTC (permalink / raw)
  To: amd-gfx; +Cc: Alex Deucher, Jack Xiao, Christian König, Hawking Zhang

From: Jack Xiao <Jack.Xiao@amd.com>

KFD does not support MES yet, so skip the KFD routines when MES is enabled.

Signed-off-by: Jack Xiao <Jack.Xiao@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 25 ++++++++++++++--------
 1 file changed, 16 insertions(+), 9 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
index b2366d0d3047..728e7be54a59 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@@ -2171,7 +2171,8 @@ static int amdgpu_device_ip_early_init(struct amdgpu_device *adev)
 		adev->has_pr3 = parent ? pci_pr3_present(parent) : false;
 	}
 
-	amdgpu_amdkfd_device_probe(adev);
+	if (!adev->enable_mes)
+		amdgpu_amdkfd_device_probe(adev);
 
 	adev->pm.pp_feature = amdgpu_pp_feature_mask;
 	if (amdgpu_sriov_vf(adev) || sched_policy == KFD_SCHED_POLICY_NO_HWS)
@@ -2498,7 +2499,8 @@ static int amdgpu_device_ip_init(struct amdgpu_device *adev)
 		goto init_failed;
 
 	/* Don't init kfd if whole hive need to be reset during init */
-	if (!adev->gmc.xgmi.pending_reset)
+	if (!adev->gmc.xgmi.pending_reset &&
+	    !adev->enable_mes)
 		amdgpu_amdkfd_device_init(adev);
 
 	amdgpu_fru_get_product_info(adev);
@@ -2861,7 +2863,8 @@ static int amdgpu_device_ip_fini(struct amdgpu_device *adev)
 	if (adev->gmc.xgmi.num_physical_nodes > 1)
 		amdgpu_xgmi_remove_device(adev);
 
-	amdgpu_amdkfd_device_fini_sw(adev);
+	if (!adev->enable_mes)
+		amdgpu_amdkfd_device_fini_sw(adev);
 
 	for (i = adev->num_ip_blocks - 1; i >= 0; i--) {
 		if (!adev->ip_blocks[i].status.sw)
@@ -4122,7 +4125,7 @@ int amdgpu_device_suspend(struct drm_device *dev, bool fbcon)
 
 	amdgpu_device_ip_suspend_phase1(adev);
 
-	if (!adev->in_s0ix)
+	if (!adev->in_s0ix && !adev->enable_mes)
 		amdgpu_amdkfd_suspend(adev, adev->in_runpm);
 
 	amdgpu_device_evict_resources(adev);
@@ -4176,7 +4179,7 @@ int amdgpu_device_resume(struct drm_device *dev, bool fbcon)
 	queue_delayed_work(system_wq, &adev->delayed_init_work,
 			   msecs_to_jiffies(AMDGPU_RESUME_MS));
 
-	if (!adev->in_s0ix) {
+	if (!adev->in_s0ix && !adev->enable_mes) {
 		r = amdgpu_amdkfd_resume(adev, adev->in_runpm);
 		if (r)
 			return r;
@@ -4459,7 +4462,8 @@ static int amdgpu_device_reset_sriov(struct amdgpu_device *adev,
 	int retry_limit = 0;
 
 retry:
-	amdgpu_amdkfd_pre_reset(adev);
+	if (!adev->enable_mes)
+		amdgpu_amdkfd_pre_reset(adev);
 
 	amdgpu_amdkfd_pre_reset(adev);
 
@@ -4497,7 +4501,9 @@ static int amdgpu_device_reset_sriov(struct amdgpu_device *adev,
 	if (!r) {
 		amdgpu_irq_gpu_reset_resume_helper(adev);
 		r = amdgpu_ib_ring_tests(adev);
-		amdgpu_amdkfd_post_reset(adev);
+
+		if (!adev->enable_mes)
+			amdgpu_amdkfd_post_reset(adev);
 	}
 
 error:
@@ -5142,7 +5148,7 @@ int amdgpu_device_gpu_recover_imp(struct amdgpu_device *adev,
 
 		cancel_delayed_work_sync(&tmp_adev->delayed_init_work);
 
-		if (!amdgpu_sriov_vf(tmp_adev))
+		if (!amdgpu_sriov_vf(tmp_adev) && !adev->enable_mes)
 			amdgpu_amdkfd_pre_reset(tmp_adev);
 
 		/*
@@ -5265,7 +5271,8 @@ int amdgpu_device_gpu_recover_imp(struct amdgpu_device *adev,
 skip_sched_resume:
 	list_for_each_entry(tmp_adev, device_list_handle, reset_list) {
 		/* unlock kfd: SRIOV would do it separately */
-		if (!need_emergency_restart && !amdgpu_sriov_vf(tmp_adev))
+		if (!need_emergency_restart && !amdgpu_sriov_vf(tmp_adev) &&
+		    !adev->enable_mes)
 			amdgpu_amdkfd_post_reset(tmp_adev);
 
 		/* kfd_post_reset will do nothing if kfd device is not initialized,
-- 
2.35.1


* [PATCH 61/73] drm/amdgpu: Enable KFD with MES enabled
  2022-04-29 17:45 [PATCH 00/73] MES support Alex Deucher
                   ` (59 preceding siblings ...)
  2022-04-29 17:46 ` [PATCH 60/73] drm/amdgpu: skip kfd routines when mes enabled Alex Deucher
@ 2022-04-29 17:46 ` Alex Deucher
  2022-04-29 17:46 ` [PATCH 62/73] drm/amdgpu: skip some checking for mes queue ib submission Alex Deucher
                   ` (11 subsequent siblings)
  72 siblings, 0 replies; 74+ messages in thread
From: Alex Deucher @ 2022-04-29 17:46 UTC (permalink / raw)
  To: amd-gfx
  Cc: Alex Deucher, Mukul Joshi, Oak Zeng, Harish Kasiviswanathan,
	Felix Kuehling

From: Mukul Joshi <mukul.joshi@amd.com>

Enable KFD initialization with MES enabled.

Signed-off-by: Mukul Joshi <mukul.joshi@amd.com>
Acked-by: Oak Zeng <Oak.Zeng@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 24 ++++++++--------------
 1 file changed, 9 insertions(+), 15 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
index 728e7be54a59..e582f1044c0f 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@@ -2171,8 +2171,7 @@ static int amdgpu_device_ip_early_init(struct amdgpu_device *adev)
 		adev->has_pr3 = parent ? pci_pr3_present(parent) : false;
 	}
 
-	if (!adev->enable_mes)
-		amdgpu_amdkfd_device_probe(adev);
+	amdgpu_amdkfd_device_probe(adev);
 
 	adev->pm.pp_feature = amdgpu_pp_feature_mask;
 	if (amdgpu_sriov_vf(adev) || sched_policy == KFD_SCHED_POLICY_NO_HWS)
@@ -2499,8 +2498,7 @@ static int amdgpu_device_ip_init(struct amdgpu_device *adev)
 		goto init_failed;
 
 	/* Don't init kfd if whole hive need to be reset during init */
-	if (!adev->gmc.xgmi.pending_reset &&
-	    !adev->enable_mes)
+	if (!adev->gmc.xgmi.pending_reset)
 		amdgpu_amdkfd_device_init(adev);
 
 	amdgpu_fru_get_product_info(adev);
@@ -2863,8 +2861,7 @@ static int amdgpu_device_ip_fini(struct amdgpu_device *adev)
 	if (adev->gmc.xgmi.num_physical_nodes > 1)
 		amdgpu_xgmi_remove_device(adev);
 
-	if (!adev->enable_mes)
-		amdgpu_amdkfd_device_fini_sw(adev);
+	amdgpu_amdkfd_device_fini_sw(adev);
 
 	for (i = adev->num_ip_blocks - 1; i >= 0; i--) {
 		if (!adev->ip_blocks[i].status.sw)
@@ -4125,7 +4122,7 @@ int amdgpu_device_suspend(struct drm_device *dev, bool fbcon)
 
 	amdgpu_device_ip_suspend_phase1(adev);
 
-	if (!adev->in_s0ix && !adev->enable_mes)
+	if (!adev->in_s0ix)
 		amdgpu_amdkfd_suspend(adev, adev->in_runpm);
 
 	amdgpu_device_evict_resources(adev);
@@ -4179,7 +4176,7 @@ int amdgpu_device_resume(struct drm_device *dev, bool fbcon)
 	queue_delayed_work(system_wq, &adev->delayed_init_work,
 			   msecs_to_jiffies(AMDGPU_RESUME_MS));
 
-	if (!adev->in_s0ix && !adev->enable_mes) {
+	if (!adev->in_s0ix) {
 		r = amdgpu_amdkfd_resume(adev, adev->in_runpm);
 		if (r)
 			return r;
@@ -4462,8 +4459,7 @@ static int amdgpu_device_reset_sriov(struct amdgpu_device *adev,
 	int retry_limit = 0;
 
 retry:
-	if (!adev->enable_mes)
-		amdgpu_amdkfd_pre_reset(adev);
+	amdgpu_amdkfd_pre_reset(adev);
 
 	amdgpu_amdkfd_pre_reset(adev);
 
@@ -4502,8 +4498,7 @@ static int amdgpu_device_reset_sriov(struct amdgpu_device *adev,
 		amdgpu_irq_gpu_reset_resume_helper(adev);
 		r = amdgpu_ib_ring_tests(adev);
 
-		if (!adev->enable_mes)
-			amdgpu_amdkfd_post_reset(adev);
+		amdgpu_amdkfd_post_reset(adev);
 	}
 
 error:
@@ -5148,7 +5143,7 @@ int amdgpu_device_gpu_recover_imp(struct amdgpu_device *adev,
 
 		cancel_delayed_work_sync(&tmp_adev->delayed_init_work);
 
-		if (!amdgpu_sriov_vf(tmp_adev) && !adev->enable_mes)
+		if (!amdgpu_sriov_vf(tmp_adev))
 			amdgpu_amdkfd_pre_reset(tmp_adev);
 
 		/*
@@ -5271,8 +5266,7 @@ int amdgpu_device_gpu_recover_imp(struct amdgpu_device *adev,
 skip_sched_resume:
 	list_for_each_entry(tmp_adev, device_list_handle, reset_list) {
 		/* unlock kfd: SRIOV would do it separately */
-		if (!need_emergency_restart && !amdgpu_sriov_vf(tmp_adev) &&
-		    !adev->enable_mes)
+		if (!need_emergency_restart && !amdgpu_sriov_vf(tmp_adev))
 			amdgpu_amdkfd_post_reset(tmp_adev);
 
 		/* kfd_post_reset will do nothing if kfd device is not initialized,
-- 
2.35.1


* [PATCH 62/73] drm/amdgpu: skip some checking for mes queue ib submission
  2022-04-29 17:45 [PATCH 00/73] MES support Alex Deucher
                   ` (60 preceding siblings ...)
  2022-04-29 17:46 ` [PATCH 61/73] drm/amdgpu: Enable KFD with MES enabled Alex Deucher
@ 2022-04-29 17:46 ` Alex Deucher
  2022-04-29 17:46 ` [PATCH 63/73] drm/amdgpu: skip kiq ib tests if mes enabled Alex Deucher
                   ` (10 subsequent siblings)
  72 siblings, 0 replies; 74+ messages in thread
From: Alex Deucher @ 2022-04-29 17:46 UTC (permalink / raw)
  To: amd-gfx; +Cc: Alex Deucher, Jack Xiao, Christian König, Hawking Zhang

From: Jack Xiao <Jack.Xiao@amd.com>

Skip the scheduler-ready check and the VMID check in amdgpu_ib_schedule()
for MES queue IB submission.

Signed-off-by: Jack Xiao <Jack.Xiao@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c
index d583766ea392..d8354453cc29 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c
@@ -155,12 +155,12 @@ int amdgpu_ib_schedule(struct amdgpu_ring *ring, unsigned num_ibs,
 		fence_ctx = 0;
 	}
 
-	if (!ring->sched.ready) {
+	if (!ring->sched.ready && !ring->is_mes_queue) {
 		dev_err(adev->dev, "couldn't schedule ib on ring <%s>\n", ring->name);
 		return -EINVAL;
 	}
 
-	if (vm && !job->vmid) {
+	if (vm && !job->vmid && !ring->is_mes_queue) {
 		dev_err(adev->dev, "VM IB without ID\n");
 		return -EINVAL;
 	}
-- 
2.35.1


* [PATCH 63/73] drm/amdgpu: skip kiq ib tests if mes enabled
  2022-04-29 17:45 [PATCH 00/73] MES support Alex Deucher
                   ` (61 preceding siblings ...)
  2022-04-29 17:46 ` [PATCH 62/73] drm/amdgpu: skip some checking for mes queue ib submission Alex Deucher
@ 2022-04-29 17:46 ` Alex Deucher
  2022-04-29 17:46 ` [PATCH 64/73] drm/amdgpu: skip gds switch for mes queue Alex Deucher
                   ` (9 subsequent siblings)
  72 siblings, 0 replies; 74+ messages in thread
From: Alex Deucher @ 2022-04-29 17:46 UTC (permalink / raw)
  To: amd-gfx; +Cc: Alex Deucher, Jack Xiao, Christian König, Hawking Zhang

From: Jack Xiao <Jack.Xiao@amd.com>

The KIQ conflicts with MES, so skip the KIQ IB tests when MES is enabled.

Signed-off-by: Jack Xiao <Jack.Xiao@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c
index d8354453cc29..258cffe3c06a 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c
@@ -390,6 +390,10 @@ int amdgpu_ib_ring_tests(struct amdgpu_device *adev)
 		if (!ring->sched.ready || !ring->funcs->test_ib)
 			continue;
 
+		if (adev->enable_mes &&
+		    ring->funcs->type == AMDGPU_RING_TYPE_KIQ)
+			continue;
+
 		/* MM engine need more time */
 		if (ring->funcs->type == AMDGPU_RING_TYPE_UVD ||
 			ring->funcs->type == AMDGPU_RING_TYPE_VCE ||
-- 
2.35.1


* [PATCH 64/73] drm/amdgpu: skip gds switch for mes queue
  2022-04-29 17:45 [PATCH 00/73] MES support Alex Deucher
                   ` (62 preceding siblings ...)
  2022-04-29 17:46 ` [PATCH 63/73] drm/amdgpu: skip kiq ib tests if mes enabled Alex Deucher
@ 2022-04-29 17:46 ` Alex Deucher
  2022-04-29 17:46 ` [PATCH 65/73] drm/amdgpu: kiq takes charge of all queues Alex Deucher
                   ` (8 subsequent siblings)
  72 siblings, 0 replies; 74+ messages in thread
From: Alex Deucher @ 2022-04-29 17:46 UTC (permalink / raw)
  To: amd-gfx; +Cc: Alex Deucher, Jack Xiao, Christian König, Hawking Zhang

From: Jack Xiao <Jack.Xiao@amd.com>

MES manages GDS allocation, so skip the GDS switch for MES queues.

Signed-off-by: Jack Xiao <Jack.Xiao@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
index 4736174f5e4d..7276b03ef970 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
@@ -639,7 +639,8 @@ int amdgpu_vm_flush(struct amdgpu_ring *ring, struct amdgpu_job *job,
 	}
 	dma_fence_put(fence);
 
-	if (ring->funcs->emit_gds_switch && gds_switch_needed) {
+	if (!ring->is_mes_queue && ring->funcs->emit_gds_switch &&
+	    gds_switch_needed) {
 		id->gds_base = job->gds_base;
 		id->gds_size = job->gds_size;
 		id->gws_base = job->gws_base;
-- 
2.35.1


* [PATCH 65/73] drm/amdgpu: kiq takes charge of all queues
  2022-04-29 17:45 [PATCH 00/73] MES support Alex Deucher
                   ` (63 preceding siblings ...)
  2022-04-29 17:46 ` [PATCH 64/73] drm/amdgpu: skip gds switch for mes queue Alex Deucher
@ 2022-04-29 17:46 ` Alex Deucher
  2022-04-29 17:46 ` [PATCH 66/73] drm/amdgpu/mes: map ctx metadata for mes self test Alex Deucher
                   ` (7 subsequent siblings)
  72 siblings, 0 replies; 74+ messages in thread
From: Alex Deucher @ 2022-04-29 17:46 UTC (permalink / raw)
  To: amd-gfx; +Cc: Alex Deucher, Jack Xiao, Christian König, Hawking Zhang

From: Jack Xiao <Jack.Xiao@amd.com>

To let kgq/kcq and MES queues co-exist, KIQ needs to take charge
of all queues.

Signed-off-by: Jack Xiao <Jack.Xiao@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c
index 28a736c507bb..40df1e04d682 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c
@@ -535,6 +535,9 @@ int amdgpu_gfx_enable_kcq(struct amdgpu_device *adev)
 		return r;
 	}
 
+	if (adev->enable_mes)
+		queue_mask = ~0ULL;
+
 	kiq->pmf->kiq_set_resources(kiq_ring, queue_mask);
 	for (i = 0; i < adev->gfx.num_compute_rings; i++)
 		kiq->pmf->kiq_map_queues(kiq_ring, &adev->gfx.compute_ring[i]);
-- 
2.35.1


* [PATCH 66/73] drm/amdgpu/mes: map ctx metadata for mes self test
  2022-04-29 17:45 [PATCH 00/73] MES support Alex Deucher
                   ` (64 preceding siblings ...)
  2022-04-29 17:46 ` [PATCH 65/73] drm/amdgpu: kiq takes charge of all queues Alex Deucher
@ 2022-04-29 17:46 ` Alex Deucher
  2022-04-29 17:46 ` [PATCH 67/73] drm/amdgpu/mes: create gang and queues " Alex Deucher
                   ` (6 subsequent siblings)
  72 siblings, 0 replies; 74+ messages in thread
From: Alex Deucher @ 2022-04-29 17:46 UTC (permalink / raw)
  To: amd-gfx; +Cc: Alex Deucher, Jack Xiao, Christian König, Hawking Zhang

From: Jack Xiao <Jack.Xiao@amd.com>

Map ctx metadata for mes self test.

Signed-off-by: Jack Xiao <Jack.Xiao@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c | 37 +++++++++++++++++++++++++
 1 file changed, 37 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c
index 6c01581e3a7b..b440b36dd98a 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c
@@ -882,3 +882,40 @@ void amdgpu_mes_ctx_free_meta_data(struct amdgpu_mes_ctx_data *ctx_data)
 	if (ctx_data->meta_data_obj)
 		amdgpu_bo_free_kernel(&ctx_data->meta_data_obj, NULL, NULL);
 }
+
+static int amdgpu_mes_test_map_ctx_meta_data(struct amdgpu_device *adev,
+				     struct amdgpu_vm *vm,
+				     struct amdgpu_mes_ctx_data *ctx_data)
+{
+	struct amdgpu_bo_va *meta_data_va = NULL;
+	uint64_t meta_data_addr = AMDGPU_VA_RESERVED_SIZE;
+	int r;
+
+	r = amdgpu_map_static_csa(adev, vm, ctx_data->meta_data_obj,
+				  &meta_data_va, meta_data_addr,
+				  sizeof(struct amdgpu_mes_ctx_meta_data));
+	if (r)
+		return r;
+
+	r = amdgpu_vm_bo_update(adev, meta_data_va, false);
+	if (r)
+		goto error;
+
+	r = amdgpu_vm_update_pdes(adev, vm, false);
+	if (r)
+		goto error;
+
+	dma_fence_wait(vm->last_update, false);
+	dma_fence_wait(meta_data_va->last_pt_update, false);
+
+	ctx_data->meta_data_gpu_addr = meta_data_addr;
+	ctx_data->meta_data_va = meta_data_va;
+
+	return 0;
+
+error:
+	BUG_ON(amdgpu_bo_reserve(ctx_data->meta_data_obj, true));
+	amdgpu_vm_bo_rmv(adev, meta_data_va);
+	amdgpu_bo_unreserve(ctx_data->meta_data_obj);
+	return r;
+}
-- 
2.35.1


* [PATCH 67/73] drm/amdgpu/mes: create gang and queues for mes self test
  2022-04-29 17:45 [PATCH 00/73] MES support Alex Deucher
                   ` (65 preceding siblings ...)
  2022-04-29 17:46 ` [PATCH 66/73] drm/amdgpu/mes: map ctx metadata for mes self test Alex Deucher
@ 2022-04-29 17:46 ` Alex Deucher
  2022-04-29 17:46 ` [PATCH 68/73] drm/amdgpu/mes: add ring/ib test " Alex Deucher
                   ` (5 subsequent siblings)
  72 siblings, 0 replies; 74+ messages in thread
From: Alex Deucher @ 2022-04-29 17:46 UTC (permalink / raw)
  To: amd-gfx; +Cc: Alex Deucher, Jack Xiao, Christian König, Hawking Zhang

From: Jack Xiao <Jack.Xiao@amd.com>

Create gang and queues for mes self test.

Signed-off-by: Jack Xiao <Jack.Xiao@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c | 39 +++++++++++++++++++++++++
 1 file changed, 39 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c
index b440b36dd98a..027f3aae6025 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c
@@ -919,3 +919,42 @@ static int amdgpu_mes_test_map_ctx_meta_data(struct amdgpu_device *adev,
 	amdgpu_bo_unreserve(ctx_data->meta_data_obj);
 	return r;
 }
+
+static int amdgpu_mes_test_create_gang_and_queues(struct amdgpu_device *adev,
+					  int pasid, int *gang_id,
+					  int queue_type, int num_queue,
+					  struct amdgpu_ring **added_rings,
+					  struct amdgpu_mes_ctx_data *ctx_data)
+{
+	struct amdgpu_ring *ring;
+	struct amdgpu_mes_gang_properties gprops = {0};
+	int r, j;
+
+	/* create a gang for the process */
+	gprops.priority = AMDGPU_MES_PRIORITY_LEVEL_NORMAL;
+	gprops.gang_quantum = adev->mes.default_gang_quantum;
+	gprops.inprocess_gang_priority = AMDGPU_MES_PRIORITY_LEVEL_NORMAL;
+	gprops.priority_level = AMDGPU_MES_PRIORITY_LEVEL_NORMAL;
+	gprops.global_priority_level = AMDGPU_MES_PRIORITY_LEVEL_NORMAL;
+
+	r = amdgpu_mes_add_gang(adev, pasid, &gprops, gang_id);
+	if (r) {
+		DRM_ERROR("failed to add gang\n");
+		return r;
+	}
+
+	/* create queues for the gang */
+	for (j = 0; j < num_queue; j++) {
+		r = amdgpu_mes_add_ring(adev, *gang_id, queue_type, j,
+					ctx_data, &ring);
+		if (r) {
+			DRM_ERROR("failed to add ring\n");
+			break;
+		}
+
+		DRM_INFO("ring %s was added\n", ring->name);
+		added_rings[j] = ring;
+	}
+
+	return 0;
+}
-- 
2.35.1


* [PATCH 68/73] drm/amdgpu/mes: add ring/ib test for mes self test
  2022-04-29 17:45 [PATCH 00/73] MES support Alex Deucher
                   ` (66 preceding siblings ...)
  2022-04-29 17:46 ` [PATCH 67/73] drm/amdgpu/mes: create gang and queues " Alex Deucher
@ 2022-04-29 17:46 ` Alex Deucher
  2022-04-29 17:46 ` [PATCH 69/73] drm/amdgpu/mes: implement " Alex Deucher
                   ` (4 subsequent siblings)
  72 siblings, 0 replies; 74+ messages in thread
From: Alex Deucher @ 2022-04-29 17:46 UTC (permalink / raw)
  To: amd-gfx; +Cc: Alex Deucher, Jack Xiao, Christian König, Hawking Zhang

From: Jack Xiao <Jack.Xiao@amd.com>

Run the ring test and ib test for mes self test.

Signed-off-by: Jack Xiao <Jack.Xiao@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c | 32 +++++++++++++++++++++++++
 1 file changed, 32 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c
index 027f3aae6025..e2b1da08ab64 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c
@@ -958,3 +958,35 @@ static int amdgpu_mes_test_create_gang_and_queues(struct amdgpu_device *adev,
 
 	return 0;
 }
+
+static int amdgpu_mes_test_queues(struct amdgpu_ring **added_rings)
+{
+	struct amdgpu_ring *ring;
+	int i, r;
+
+	for (i = 0; i < AMDGPU_MES_CTX_MAX_RINGS; i++) {
+		ring = added_rings[i];
+		if (!ring)
+			continue;
+
+		r = amdgpu_ring_test_ring(ring);
+		if (r) {
+			DRM_DEV_ERROR(ring->adev->dev,
+				      "ring %s test failed (%d)\n",
+				      ring->name, r);
+			return r;
+		} else
+			DRM_INFO("ring %s test pass\n", ring->name);
+
+		r = amdgpu_ring_test_ib(ring, 1000 * 10);
+		if (r) {
+			DRM_DEV_ERROR(ring->adev->dev,
+				      "ring %s ib test failed (%d)\n",
+				      ring->name, r);
+			return r;
+		} else
+			DRM_INFO("ring %s ib test pass\n", ring->name);
+	}
+
+	return 0;
+}
-- 
2.35.1


* [PATCH 69/73] drm/amdgpu/mes: implement mes self test
  2022-04-29 17:45 [PATCH 00/73] MES support Alex Deucher
                   ` (67 preceding siblings ...)
  2022-04-29 17:46 ` [PATCH 68/73] drm/amdgpu/mes: add ring/ib test " Alex Deucher
@ 2022-04-29 17:46 ` Alex Deucher
  2022-04-29 17:46 ` [PATCH 70/73] drm/amdgpu/mes10.1: add mes self test in late init Alex Deucher
                   ` (3 subsequent siblings)
  72 siblings, 0 replies; 74+ messages in thread
From: Alex Deucher @ 2022-04-29 17:46 UTC (permalink / raw)
  To: amd-gfx; +Cc: Alex Deucher, Jack Xiao, Christian König, Hawking Zhang

From: Jack Xiao <Jack.Xiao@amd.com>

Add an MES self test that verifies basic MES functionality by
running the ring test and IB test on MES kernel queues.

Signed-off-by: Jack Xiao <Jack.Xiao@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c | 97 +++++++++++++++++++++++++
 drivers/gpu/drm/amd/amdgpu/amdgpu_mes.h |  2 +
 2 files changed, 99 insertions(+)
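
Note on the flat ring array: the self test in this patch passes
&added_rings[k] to the helper for each queue type and then advances k by
that type's queue count, so each type fills its own contiguous range of
added_rings. A minimal standalone sketch of that indexing follows. It is
not driver code: the sketch_ names and the per-type counts (1, 4, 2) are
hypothetical stand-ins for the AMDGPU_MES_CTX_MAX_*_RINGS constants, whose
values are not shown in this patch.

```c
#include <assert.h>

/* Hypothetical mirror of the three queue types used by the self test. */
enum sketch_ring_type {
	SKETCH_GFX,
	SKETCH_COMPUTE,
	SKETCH_SDMA,
	SKETCH_NUM_TYPES
};

/* Assumed per-type queue counts; the real values come from the driver. */
static const int sketch_queue_counts[SKETCH_NUM_TYPES] = { 1, 4, 2 };

/*
 * Return the first slot in the flat ring array that belongs to the given
 * type, i.e. the sum of the queue counts of all earlier types. Passing
 * SKETCH_NUM_TYPES returns the total number of slots in use.
 */
static int sketch_first_slot(int type_index)
{
	int i, k = 0;

	assert(type_index >= 0 && type_index <= SKETCH_NUM_TYPES);

	for (i = 0; i < type_index; i++)
		k += sketch_queue_counts[i];

	return k;
}
```

Under these assumed counts, gfx would occupy slot 0, compute slots 1-4
and sdma slots 5-6, which is why the cleanup loop can walk the whole
array and simply skip NULL entries.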

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c
index e2b1da08ab64..c9516b3aa6d9 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c
@@ -990,3 +990,100 @@ static int amdgpu_mes_test_queues(struct amdgpu_ring **added_rings)
 
 	return 0;
 }
+
+int amdgpu_mes_self_test(struct amdgpu_device *adev)
+{
+	struct amdgpu_vm *vm = NULL;
+	struct amdgpu_mes_ctx_data ctx_data = {0};
+	struct amdgpu_ring *added_rings[AMDGPU_MES_CTX_MAX_RINGS] = { NULL };
+	int gang_ids[3] = {0};
+	int queue_types[][2] = { { AMDGPU_RING_TYPE_GFX,
+				   AMDGPU_MES_CTX_MAX_GFX_RINGS},
+				 { AMDGPU_RING_TYPE_COMPUTE,
+				   AMDGPU_MES_CTX_MAX_COMPUTE_RINGS},
+				 { AMDGPU_RING_TYPE_SDMA,
+				   AMDGPU_MES_CTX_MAX_SDMA_RINGS } };
+	int i, r, pasid, k = 0;
+
+	pasid = amdgpu_pasid_alloc(16);
+	if (pasid < 0) {
+		dev_warn(adev->dev, "No more PASIDs available!");
+		pasid = 0;
+	}
+
+	vm = kzalloc(sizeof(*vm), GFP_KERNEL);
+	if (!vm) {
+		r = -ENOMEM;
+		goto error_pasid;
+	}
+
+	r = amdgpu_vm_init(adev, vm);
+	if (r) {
+		DRM_ERROR("failed to initialize vm\n");
+		goto error_pasid;
+	}
+
+	r = amdgpu_mes_ctx_alloc_meta_data(adev, &ctx_data);
+	if (r) {
+		DRM_ERROR("failed to alloc ctx meta data\n");
+		goto error_pasid;
+	}
+
+	r = amdgpu_mes_test_map_ctx_meta_data(adev, vm, &ctx_data);
+	if (r) {
+		DRM_ERROR("failed to map ctx meta data\n");
+		goto error_vm;
+	}
+
+	r = amdgpu_mes_create_process(adev, pasid, vm);
+	if (r) {
+		DRM_ERROR("failed to create MES process\n");
+		goto error_vm;
+	}
+
+	for (i = 0; i < ARRAY_SIZE(queue_types); i++) {
+		r = amdgpu_mes_test_create_gang_and_queues(adev, pasid,
+							   &gang_ids[i],
+							   queue_types[i][0],
+							   queue_types[i][1],
+							   &added_rings[k],
+							   &ctx_data);
+		if (r)
+			goto error_queues;
+
+		k += queue_types[i][1];
+	}
+
+	/* start ring test and ib test for MES queues */
+	amdgpu_mes_test_queues(added_rings);
+
+error_queues:
+	/* remove all queues */
+	for (i = 0; i < ARRAY_SIZE(added_rings); i++) {
+		if (!added_rings[i])
+			continue;
+		amdgpu_mes_remove_ring(adev, added_rings[i]);
+	}
+
+	for (i = 0; i < ARRAY_SIZE(gang_ids); i++) {
+		if (!gang_ids[i])
+			continue;
+		amdgpu_mes_remove_gang(adev, gang_ids[i]);
+	}
+
+	amdgpu_mes_destroy_process(adev, pasid);
+
+error_vm:
+	BUG_ON(amdgpu_bo_reserve(ctx_data.meta_data_obj, true));
+	amdgpu_vm_bo_rmv(adev, ctx_data.meta_data_va);
+	amdgpu_bo_unreserve(ctx_data.meta_data_obj);
+	amdgpu_vm_fini(adev, vm);
+
+error_pasid:
+	if (pasid)
+		amdgpu_pasid_free(pasid);
+
+	amdgpu_mes_ctx_free_meta_data(&ctx_data);
+	kfree(vm);
+	return 0;
+}
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.h
index 50d490e69cb7..5c9e7932c7a9 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.h
@@ -265,4 +265,6 @@ int amdgpu_mes_ctx_alloc_meta_data(struct amdgpu_device *adev,
 				   struct amdgpu_mes_ctx_data *ctx_data);
 void amdgpu_mes_ctx_free_meta_data(struct amdgpu_mes_ctx_data *ctx_data);
 
+int amdgpu_mes_self_test(struct amdgpu_device *adev);
+
 #endif /* __AMDGPU_MES_H__ */
-- 
2.35.1


* [PATCH 70/73] drm/amdgpu/mes10.1: add mes self test in late init
  2022-04-29 17:45 [PATCH 00/73] MES support Alex Deucher
                   ` (68 preceding siblings ...)
  2022-04-29 17:46 ` [PATCH 69/73] drm/amdgpu/mes: implement " Alex Deucher
@ 2022-04-29 17:46 ` Alex Deucher
  2022-04-29 17:46 ` [PATCH 71/73] drm/amdgpu/mes: fix vm csa update issue Alex Deucher
                   ` (2 subsequent siblings)
  72 siblings, 0 replies; 74+ messages in thread
From: Alex Deucher @ 2022-04-29 17:46 UTC (permalink / raw)
  To: amd-gfx; +Cc: Alex Deucher, Jack Xiao, Christian König, Hawking Zhang

From: Jack Xiao <Jack.Xiao@amd.com>

Add MES self test in late init.

Signed-off-by: Jack Xiao <Jack.Xiao@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
---
 drivers/gpu/drm/amd/amdgpu/mes_v10_1.c | 10 ++++++++++
 1 file changed, 10 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/mes_v10_1.c b/drivers/gpu/drm/amd/amdgpu/mes_v10_1.c
index d468cb5a8854..622aa17b18e7 100644
--- a/drivers/gpu/drm/amd/amdgpu/mes_v10_1.c
+++ b/drivers/gpu/drm/amd/amdgpu/mes_v10_1.c
@@ -1142,8 +1142,18 @@ static int mes_v10_1_resume(void *handle)
 	return amdgpu_mes_resume(adev);
 }
 
+static int mes_v10_0_late_init(void *handle)
+{
+	struct amdgpu_device *adev = (struct amdgpu_device *)handle;
+
+	amdgpu_mes_self_test(adev);
+
+	return 0;
+}
+
 static const struct amd_ip_funcs mes_v10_1_ip_funcs = {
 	.name = "mes_v10_1",
+	.late_init = mes_v10_0_late_init,
 	.sw_init = mes_v10_1_sw_init,
 	.sw_fini = mes_v10_1_sw_fini,
 	.hw_init = mes_v10_1_hw_init,
-- 
2.35.1


* [PATCH 71/73] drm/amdgpu/mes: fix vm csa update issue
  2022-04-29 17:45 [PATCH 00/73] MES support Alex Deucher
                   ` (69 preceding siblings ...)
  2022-04-29 17:46 ` [PATCH 70/73] drm/amdgpu/mes10.1: add mes self test in late init Alex Deucher
@ 2022-04-29 17:46 ` Alex Deucher
  2022-04-29 17:46 ` [PATCH 72/73] drm/amdgpu/mes: disable mes sdma queue test Alex Deucher
  2022-04-29 17:46 ` [PATCH 73/73] drm/amdgpu/mes: Update the doorbell function signatures Alex Deucher
  72 siblings, 0 replies; 74+ messages in thread
From: Alex Deucher @ 2022-04-29 17:46 UTC (permalink / raw)
  To: amd-gfx; +Cc: Alex Deucher, Jack Xiao, Hawking Zhang

From: Jack Xiao <Jack.Xiao@amd.com>

The VM buffers need to be reserved before updating the VM CSA.

v2: rebase fixes

Signed-off-by: Jack Xiao <Jack.Xiao@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c | 81 ++++++++++++++++++-------
 drivers/gpu/drm/amd/amdgpu/amdgpu_mes.h |  3 +
 2 files changed, 62 insertions(+), 22 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c
index c9516b3aa6d9..51a6f309ef22 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c
@@ -883,40 +883,76 @@ void amdgpu_mes_ctx_free_meta_data(struct amdgpu_mes_ctx_data *ctx_data)
 		amdgpu_bo_free_kernel(&ctx_data->meta_data_obj, NULL, NULL);
 }
 
-static int amdgpu_mes_test_map_ctx_meta_data(struct amdgpu_device *adev,
-				     struct amdgpu_vm *vm,
-				     struct amdgpu_mes_ctx_data *ctx_data)
+int amdgpu_mes_ctx_map_meta_data(struct amdgpu_device *adev,
+				 struct amdgpu_vm *vm,
+				 struct amdgpu_mes_ctx_data *ctx_data)
 {
-	struct amdgpu_bo_va *meta_data_va = NULL;
-	uint64_t meta_data_addr = AMDGPU_VA_RESERVED_SIZE;
+	struct amdgpu_bo_va *bo_va;
+	struct ww_acquire_ctx ticket;
+	struct list_head list;
+	struct amdgpu_bo_list_entry pd;
+	struct ttm_validate_buffer csa_tv;
+	struct amdgpu_sync sync;
 	int r;
 
-	r = amdgpu_map_static_csa(adev, vm, ctx_data->meta_data_obj,
-				  &meta_data_va, meta_data_addr,
-				  sizeof(struct amdgpu_mes_ctx_meta_data));
-	if (r)
+	amdgpu_sync_create(&sync);
+	INIT_LIST_HEAD(&list);
+	INIT_LIST_HEAD(&csa_tv.head);
+
+	csa_tv.bo = &ctx_data->meta_data_obj->tbo;
+	csa_tv.num_shared = 1;
+
+	list_add(&csa_tv.head, &list);
+	amdgpu_vm_get_pd_bo(vm, &list, &pd);
+
+	r = ttm_eu_reserve_buffers(&ticket, &list, true, NULL);
+	if (r) {
+		DRM_ERROR("failed to reserve meta data BO: err=%d\n", r);
 		return r;
+	}
 
-	r = amdgpu_vm_bo_update(adev, meta_data_va, false);
-	if (r)
+	bo_va = amdgpu_vm_bo_add(adev, vm, ctx_data->meta_data_obj);
+	if (!bo_va) {
+		ttm_eu_backoff_reservation(&ticket, &list);
+		DRM_ERROR("failed to create bo_va for meta data BO\n");
+		return -ENOMEM;
+	}
+
+	r = amdgpu_vm_bo_map(adev, bo_va, ctx_data->meta_data_gpu_addr, 0,
+			     sizeof(struct amdgpu_mes_ctx_meta_data),
+			     AMDGPU_PTE_READABLE | AMDGPU_PTE_WRITEABLE |
+			     AMDGPU_PTE_EXECUTABLE);
+
+	if (r) {
+		DRM_ERROR("failed to do bo_map on meta data, err=%d\n", r);
 		goto error;
+	}
 
-	r = amdgpu_vm_update_pdes(adev, vm, false);
-	if (r)
+	r = amdgpu_vm_bo_update(adev, bo_va, false);
+	if (r) {
+		DRM_ERROR("failed to do vm_bo_update on meta data\n");
 		goto error;
+	}
+	amdgpu_sync_fence(&sync, bo_va->last_pt_update);
 
-	dma_fence_wait(vm->last_update, false);
-	dma_fence_wait(meta_data_va->last_pt_update, false);
+	r = amdgpu_vm_update_pdes(adev, vm, false);
+	if (r) {
+		DRM_ERROR("failed to update pdes on meta data\n");
+		goto error;
+	}
+	amdgpu_sync_fence(&sync, vm->last_update);
 
-	ctx_data->meta_data_gpu_addr = meta_data_addr;
-	ctx_data->meta_data_va = meta_data_va;
+	amdgpu_sync_wait(&sync, false);
+	ttm_eu_backoff_reservation(&ticket, &list);
 
+	amdgpu_sync_free(&sync);
+	ctx_data->meta_data_va = bo_va;
 	return 0;
 
 error:
-	BUG_ON(amdgpu_bo_reserve(ctx_data->meta_data_obj, true));
-	amdgpu_vm_bo_rmv(adev, meta_data_va);
-	amdgpu_bo_unreserve(ctx_data->meta_data_obj);
+	amdgpu_vm_bo_del(adev, bo_va);
+	ttm_eu_backoff_reservation(&ticket, &list);
+	amdgpu_sync_free(&sync);
 	return r;
 }
 
@@ -1029,7 +1065,8 @@ int amdgpu_mes_self_test(struct amdgpu_device *adev)
 		goto error_pasid;
 	}
 
-	r = amdgpu_mes_test_map_ctx_meta_data(adev, vm, &ctx_data);
+	ctx_data.meta_data_gpu_addr = AMDGPU_VA_RESERVED_SIZE;
+	r = amdgpu_mes_ctx_map_meta_data(adev, vm, &ctx_data);
 	if (r) {
 		DRM_ERROR("failed to map ctx meta data\n");
 		goto error_vm;
@@ -1075,7 +1112,7 @@ int amdgpu_mes_self_test(struct amdgpu_device *adev)
 
 error_vm:
 	BUG_ON(amdgpu_bo_reserve(ctx_data.meta_data_obj, true));
-	amdgpu_vm_bo_rmv(adev, ctx_data.meta_data_va);
+	amdgpu_vm_bo_del(adev, ctx_data.meta_data_va);
 	amdgpu_bo_unreserve(ctx_data.meta_data_obj);
 	amdgpu_vm_fini(adev, vm);
 
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.h
index 5c9e7932c7a9..a965ace0fd0e 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.h
@@ -264,6 +264,9 @@ void amdgpu_mes_remove_ring(struct amdgpu_device *adev,
 int amdgpu_mes_ctx_alloc_meta_data(struct amdgpu_device *adev,
 				   struct amdgpu_mes_ctx_data *ctx_data);
 void amdgpu_mes_ctx_free_meta_data(struct amdgpu_mes_ctx_data *ctx_data);
+int amdgpu_mes_ctx_map_meta_data(struct amdgpu_device *adev,
+				 struct amdgpu_vm *vm,
+				 struct amdgpu_mes_ctx_data *ctx_data);
 
 int amdgpu_mes_self_test(struct amdgpu_device *adev);
 
-- 
2.35.1


* [PATCH 72/73] drm/amdgpu/mes: disable mes sdma queue test
  2022-04-29 17:45 [PATCH 00/73] MES support Alex Deucher
                   ` (70 preceding siblings ...)
  2022-04-29 17:46 ` [PATCH 71/73] drm/amdgpu/mes: fix vm csa update issue Alex Deucher
@ 2022-04-29 17:46 ` Alex Deucher
  2022-04-29 17:46 ` [PATCH 73/73] drm/amdgpu/mes: Update the doorbell function signatures Alex Deucher
  72 siblings, 0 replies; 74+ messages in thread
From: Alex Deucher @ 2022-04-29 17:46 UTC (permalink / raw)
  To: amd-gfx; +Cc: Alex Deucher, Jack Xiao, Hawking Zhang

From: Jack Xiao <Jack.Xiao@amd.com>

Disable the MES SDMA queue test on Sienna Cichlid and newer,
because the firmware does not yet support mapping SDMA queues.
The test can be re-enabled once the firmware supports it.

Signed-off-by: Jack Xiao <Jack.Xiao@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c
index 51a6f309ef22..e23c864aca11 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c
@@ -1079,6 +1079,11 @@ int amdgpu_mes_self_test(struct amdgpu_device *adev)
 	}
 
 	for (i = 0; i < ARRAY_SIZE(queue_types); i++) {
+		/* On sienna cichlid+, fw hasn't supported to map sdma queue. */
+		if (adev->asic_type >= CHIP_SIENNA_CICHLID &&
+		    i == AMDGPU_RING_TYPE_SDMA)
+			continue;
+
 		r = amdgpu_mes_test_create_gang_and_queues(adev, pasid,
 							   &gang_ids[i],
 							   queue_types[i][0],
-- 
2.35.1


* [PATCH 73/73] drm/amdgpu/mes: Update the doorbell function signatures
  2022-04-29 17:45 [PATCH 00/73] MES support Alex Deucher
                   ` (71 preceding siblings ...)
  2022-04-29 17:46 ` [PATCH 72/73] drm/amdgpu/mes: disable mes sdma queue test Alex Deucher
@ 2022-04-29 17:46 ` Alex Deucher
  72 siblings, 0 replies; 74+ messages in thread
From: Alex Deucher @ 2022-04-29 17:46 UTC (permalink / raw)
  To: amd-gfx
  Cc: Mukul Joshi, Jack Xiao, Oak Zeng, Harish Kasiviswanathan,
	Alex Deucher, Felix Kuehling

From: Mukul Joshi <mukul.joshi@amd.com>

Update the function signatures for process doorbell allocations
with MES enabled to make them more generic. KFD would need to
access these functions to allocate/free doorbells when MES is
enabled.

Signed-off-by: Mukul Joshi <mukul.joshi@amd.com>
Acked-by: Oak Zeng <Oak.Zeng@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Jack Xiao <Jack.Xiao@amd.com>
Reviewed-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c | 37 +++++++++++++++----------
 drivers/gpu/drm/amd/amdgpu/amdgpu_mes.h |  9 ++++++
 2 files changed, 31 insertions(+), 15 deletions(-)
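
Note on the offset arithmetic: the new helper
amdgpu_mes_get_doorbell_dw_offset_in_bar() converts a per-process slice
index plus a per-queue doorbell id into a dword offset within the
doorbell BAR. The standalone sketch below reproduces that arithmetic
outside the kernel so it can be checked by hand. The two size constants
are copied from this patch; SKETCH_PAGE_SIZE (4 KiB) and all sketch_
names are assumptions made for illustration, and the kernel uses the
real PAGE_SIZE and roundup() instead.

```c
#include <assert.h>
#include <stdint.h>

/* Values copied from the defines in amdgpu_mes.c shown in this patch. */
#define SKETCH_ONE_DOORBELL_SIZE		8u	/* bytes */
#define SKETCH_MAX_QUEUES_PER_PROCESS	1024u
/* Assumption: 4 KiB pages. */
#define SKETCH_PAGE_SIZE			4096u

/* Round x up to the next multiple of y, like the kernel's roundup(). */
static unsigned int sketch_roundup(unsigned int x, unsigned int y)
{
	return ((x + y - 1) / y) * y;
}

/* Bytes of doorbell space reserved for one process. */
static unsigned int sketch_doorbell_process_slice(void)
{
	return sketch_roundup(SKETCH_ONE_DOORBELL_SIZE *
			      SKETCH_MAX_QUEUES_PER_PROCESS,
			      SKETCH_PAGE_SIZE);
}

/*
 * Dword offset of one doorbell: start of the process slice in dwords,
 * plus two dwords for each 8-byte doorbell that precedes it.
 */
static unsigned int sketch_doorbell_dw_offset(uint32_t doorbell_index,
					      unsigned int doorbell_id)
{
	unsigned int slice = sketch_doorbell_process_slice();

	assert(slice % sizeof(uint32_t) == 0);

	return (doorbell_index * slice) / (unsigned int)sizeof(uint32_t) +
	       doorbell_id * 2;
}
```

With these assumed values a process slice is 8192 bytes, so slice index 2
with doorbell id 3 lands at dword offset 4102. The allocator in this patch
hands out slice indices starting at 2, which is why index 2 is used here.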

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c
index e23c864aca11..5be30bf68b0c 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c
@@ -29,31 +29,40 @@
 #define AMDGPU_MES_MAX_NUM_OF_QUEUES_PER_PROCESS 1024
 #define AMDGPU_ONE_DOORBELL_SIZE 8
 
-static int amdgpu_mes_doorbell_process_slice(struct amdgpu_device *adev)
+int amdgpu_mes_doorbell_process_slice(struct amdgpu_device *adev)
 {
 	return roundup(AMDGPU_ONE_DOORBELL_SIZE *
 		       AMDGPU_MES_MAX_NUM_OF_QUEUES_PER_PROCESS,
 		       PAGE_SIZE);
 }
 
-static int amdgpu_mes_alloc_process_doorbells(struct amdgpu_device *adev,
-				      struct amdgpu_mes_process *process)
+int amdgpu_mes_alloc_process_doorbells(struct amdgpu_device *adev,
+				      unsigned int *doorbell_index)
 {
 	int r = ida_simple_get(&adev->mes.doorbell_ida, 2,
 			       adev->mes.max_doorbell_slices,
 			       GFP_KERNEL);
 	if (r > 0)
-		process->doorbell_index = r;
+		*doorbell_index = r;
 
 	return r;
 }
 
-static void amdgpu_mes_free_process_doorbells(struct amdgpu_device *adev,
-				      struct amdgpu_mes_process *process)
+void amdgpu_mes_free_process_doorbells(struct amdgpu_device *adev,
+				      unsigned int doorbell_index)
 {
-	if (process->doorbell_index)
-		ida_simple_remove(&adev->mes.doorbell_ida,
-				  process->doorbell_index);
+	if (doorbell_index)
+		ida_simple_remove(&adev->mes.doorbell_ida, doorbell_index);
+}
+
+unsigned int amdgpu_mes_get_doorbell_dw_offset_in_bar(
+					struct amdgpu_device *adev,
+					uint32_t doorbell_index,
+					unsigned int doorbell_id)
+{
+	return ((doorbell_index *
+		amdgpu_mes_doorbell_process_slice(adev)) / sizeof(u32) +
+		doorbell_id * 2);
 }
 
 static int amdgpu_mes_queue_doorbell_get(struct amdgpu_device *adev,
@@ -79,10 +88,8 @@ static int amdgpu_mes_queue_doorbell_get(struct amdgpu_device *adev,
 
 	set_bit(found, process->doorbell_bitmap);
 
-	*doorbell_index =
-		(process->doorbell_index *
-		 amdgpu_mes_doorbell_process_slice(adev)) / sizeof(u32) +
-		found * 2;
+	*doorbell_index = amdgpu_mes_get_doorbell_dw_offset_in_bar(adev,
+				process->doorbell_index, found);
 
 	return 0;
 }
@@ -262,7 +269,7 @@ int amdgpu_mes_create_process(struct amdgpu_device *adev, int pasid,
 	memset(process->proc_ctx_cpu_ptr, 0, AMDGPU_MES_PROC_CTX_SIZE);
 
 	/* allocate the starting doorbell index of the process */
-	r = amdgpu_mes_alloc_process_doorbells(adev, process);
+	r = amdgpu_mes_alloc_process_doorbells(adev, &process->doorbell_index);
 	if (r < 0) {
 		DRM_ERROR("failed to allocate doorbell for process\n");
 		goto clean_up_ctx;
@@ -338,7 +345,7 @@ void amdgpu_mes_destroy_process(struct amdgpu_device *adev, int pasid)
 		kfree(gang);
 	}
 
-	amdgpu_mes_free_process_doorbells(adev, process);
+	amdgpu_mes_free_process_doorbells(adev, process->doorbell_index);
 
 	idr_remove(&adev->mes.pasid_idr, pasid);
 	amdgpu_bo_free_kernel(&process->proc_ctx_bo,
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.h
index a965ace0fd0e..ba039984e431 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.h
@@ -270,4 +270,13 @@ int amdgpu_mes_ctx_map_meta_data(struct amdgpu_device *adev,
 
 int amdgpu_mes_self_test(struct amdgpu_device *adev);
 
+int amdgpu_mes_alloc_process_doorbells(struct amdgpu_device *adev,
+					unsigned int *doorbell_index);
+void amdgpu_mes_free_process_doorbells(struct amdgpu_device *adev,
+					unsigned int doorbell_index);
+unsigned int amdgpu_mes_get_doorbell_dw_offset_in_bar(
+					struct amdgpu_device *adev,
+					uint32_t doorbell_index,
+					unsigned int doorbell_id);
+int amdgpu_mes_doorbell_process_slice(struct amdgpu_device *adev);
 #endif /* __AMDGPU_MES_H__ */
-- 
2.35.1


end of thread, other threads:[~2022-04-29 17:48 UTC | newest]

Thread overview: 74+ messages
2022-04-29 17:45 [PATCH 00/73] MES support Alex Deucher
2022-04-29 17:45 ` [PATCH 01/73] drm/amdgpu: define MQD abstract layer for hw ip Alex Deucher
2022-04-29 17:45 ` [PATCH 02/73] drm/amdgpu: add helper function to initialize mqd from ring v4 Alex Deucher
2022-04-29 17:45 ` [PATCH 03/73] drm/amdgpu: add the per-context meta data v3 Alex Deucher
2022-04-29 17:45 ` [PATCH 04/73] drm/amdgpu: add mes ctx data in amdgpu_ring Alex Deucher
2022-04-29 17:45 ` [PATCH 05/73] drm/amdgpu: define ring structure to access rptr/wptr/fence Alex Deucher
2022-04-29 17:45 ` [PATCH 06/73] drm/amdgpu: use ring structure to access rptr/wptr v2 Alex Deucher
2022-04-29 17:45 ` [PATCH 07/73] drm/amdgpu: initialize/finalize the ring for mes queue Alex Deucher
2022-04-29 17:45 ` [PATCH 08/73] drm/amdgpu: assign the cpu/gpu address of fence from ring Alex Deucher
2022-04-29 17:45 ` [PATCH 09/73] drm/amdgpu/gfx10: implement mqd functions of gfx/compute eng v2 Alex Deucher
2022-04-29 17:45 ` [PATCH 10/73] drm/amdgpu/gfx10: use per ctx CSA for ce metadata Alex Deucher
2022-04-29 17:45 ` [PATCH 11/73] drm/amdgpu/gfx10: use per ctx CSA for de metadata Alex Deucher
2022-04-29 17:45 ` [PATCH 12/73] drm/amdgpu/gfx10: associate mes queue id with fence v2 Alex Deucher
2022-04-29 17:45 ` [PATCH 13/73] drm/amdgpu/gfx10: inherit vmid from mqd Alex Deucher
2022-04-29 17:45 ` [PATCH 14/73] drm/amdgpu/gfx10: use INVALIDATE_TLBS to invalidate TLBs v2 Alex Deucher
2022-04-29 17:45 ` [PATCH 15/73] drm/amdgpu/gmc10: skip emitting pasid mapping packet Alex Deucher
2022-04-29 17:45 ` [PATCH 16/73] drm/amdgpu: use the whole doorbell space for mes Alex Deucher
2022-04-29 17:45 ` [PATCH 17/73] drm/amdgpu: update mes process/gang/queue definitions Alex Deucher
2022-04-29 17:45 ` [PATCH 18/73] drm/amdgpu: add mes_kiq module parameter v2 Alex Deucher
2022-04-29 17:45 ` [PATCH 19/73] drm/amdgpu: allocate doorbell index for mes kiq Alex Deucher
2022-04-29 17:45 ` [PATCH 20/73] drm/amdgpu/mes: extend mes framework to support multiple mes pipes Alex Deucher
2022-04-29 17:45 ` [PATCH 21/73] drm/amdgpu/gfx10: add mes queue fence handling Alex Deucher
2022-04-29 17:45 ` [PATCH 22/73] drm/amdgpu/gfx10: add mes support for gfx ib test Alex Deucher
2022-04-29 17:45 ` [PATCH 23/73] drm/amdgpu: don't use kiq to flush gpu tlb if mes enabled Alex Deucher
2022-04-29 17:45 ` [PATCH 24/73] drm/amdgpu/sdma: use per-ctx sdma csa address for mes sdma queue Alex Deucher
2022-04-29 17:45 ` [PATCH 25/73] drm/amdgpu/sdma5.2: initialize sdma mqd Alex Deucher
2022-04-29 17:45 ` [PATCH 26/73] drm/amdgpu/sdma5.2: associate mes queue id with fence Alex Deucher
2022-04-29 17:45 ` [PATCH 27/73] drm/amdgpu/sdma5.2: add mes queue fence handling Alex Deucher
2022-04-29 17:45 ` [PATCH 28/73] drm/amdgpu/sdma5.2: add mes support for sdma ring test Alex Deucher
2022-04-29 17:45 ` [PATCH 29/73] drm/amdgpu/sdma5.2: add mes support for sdma ib test Alex Deucher
2022-04-29 17:45 ` [PATCH 30/73] drm/amdgpu/sdma5: initialize sdma mqd Alex Deucher
2022-04-29 17:45 ` [PATCH 31/73] drm/amdgpu/sdma5: associate mes queue id with fence Alex Deucher
2022-04-29 17:45 ` [PATCH 32/73] drm/amdgpu/sdma5: add mes queue fence handling Alex Deucher
2022-04-29 17:45 ` [PATCH 33/73] drm/amdgpu/sdma5: add mes support for sdma ring test Alex Deucher
2022-04-29 17:45 ` [PATCH 34/73] drm/amdgpu/sdma5: add mes support for sdma ib test Alex Deucher
2022-04-29 17:45 ` [PATCH 35/73] drm/amdgpu: add mes kiq PSP GFX FW type Alex Deucher
2022-04-29 17:45 ` [PATCH 36/73] drm/amdgpu/mes: add mes kiq callback Alex Deucher
2022-04-29 17:45 ` [PATCH 37/73] drm/amdgpu: add mes kiq frontdoor loading support Alex Deucher
2022-04-29 17:45 ` [PATCH 38/73] drm/amdgpu: enable mes kiq N-1 test on sienna cichlid Alex Deucher
2022-04-29 17:45 ` [PATCH 39/73] drm/amdgpu/mes: manage mes doorbell allocation Alex Deucher
2022-04-29 17:45 ` [PATCH 40/73] drm/amdgpu: add mes queue id mask v2 Alex Deucher
2022-04-29 17:45 ` [PATCH 41/73] drm/amdgpu/mes: initialize/finalize common mes structure v2 Alex Deucher
2022-04-29 17:45 ` [PATCH 42/73] drm/amdgpu/mes: relocate status_fence slot allocation Alex Deucher
2022-04-29 17:45 ` [PATCH 43/73] drm/amdgpu/mes10.1: call general mes initialization Alex Deucher
2022-04-29 17:45 ` [PATCH 44/73] drm/amdgpu/mes10.1: add delay after mes engine enable Alex Deucher
2022-04-29 17:45 ` [PATCH 45/73] drm/amdgpu/mes10.1: implement the suspend/resume routine Alex Deucher
2022-04-29 17:45 ` [PATCH 46/73] drm/amdgpu/mes: implement creating mes process v2 Alex Deucher
2022-04-29 17:45 ` [PATCH 47/73] drm/amdgpu/mes: implement destroying mes process Alex Deucher
2022-04-29 17:45 ` [PATCH 48/73] drm/amdgpu/mes: implement adding mes gang Alex Deucher
2022-04-29 17:46 ` [PATCH 49/73] drm/amdgpu/mes: implement removing " Alex Deucher
2022-04-29 17:46 ` [PATCH 50/73] drm/amdgpu/mes: implement suspending all gangs Alex Deucher
2022-04-29 17:46 ` [PATCH 51/73] drm/amdgpu/mes: implement resuming " Alex Deucher
2022-04-29 17:46 ` [PATCH 52/73] drm/amdgpu/mes: initialize mqd from queue properties Alex Deucher
2022-04-29 17:46 ` [PATCH 53/73] drm/amdgpu/mes: implement adding mes queue Alex Deucher
2022-04-29 17:46 ` [PATCH 54/73] drm/amdgpu/mes: implement removing " Alex Deucher
2022-04-29 17:46 ` [PATCH 55/73] drm/amdgpu/mes: add helper function to convert ring to queue property Alex Deucher
2022-04-29 17:46 ` [PATCH 56/73] drm/amdgpu/mes: add helper function to get the ctx meta data offset Alex Deucher
2022-04-29 17:46 ` [PATCH 57/73] drm/amdgpu/mes: use ring for kernel queue submission Alex Deucher
2022-04-29 17:46 ` [PATCH 58/73] drm/amdgpu/mes: implement removing mes ring Alex Deucher
2022-04-29 17:46 ` [PATCH 59/73] drm/amdgpu/mes: add helper functions to alloc/free ctx metadata Alex Deucher
2022-04-29 17:46 ` [PATCH 60/73] drm/amdgpu: skip kfd routines when mes enabled Alex Deucher
2022-04-29 17:46 ` [PATCH 61/73] drm/amdgpu: Enable KFD with MES enabled Alex Deucher
2022-04-29 17:46 ` [PATCH 62/73] drm/amdgpu: skip some checking for mes queue ib submission Alex Deucher
2022-04-29 17:46 ` [PATCH 63/73] drm/amdgpu: skip kiq ib tests if mes enabled Alex Deucher
2022-04-29 17:46 ` [PATCH 64/73] drm/amdgpu: skip gds switch for mes queue Alex Deucher
2022-04-29 17:46 ` [PATCH 65/73] drm/amdgpu: kiq takes charge of all queues Alex Deucher
2022-04-29 17:46 ` [PATCH 66/73] drm/amdgpu/mes: map ctx metadata for mes self test Alex Deucher
2022-04-29 17:46 ` [PATCH 67/73] drm/amdgpu/mes: create gang and queues " Alex Deucher
2022-04-29 17:46 ` [PATCH 68/73] drm/amdgpu/mes: add ring/ib test " Alex Deucher
2022-04-29 17:46 ` [PATCH 69/73] drm/amdgpu/mes: implement " Alex Deucher
2022-04-29 17:46 ` [PATCH 70/73] drm/amdgpu/mes10.1: add mes self test in late init Alex Deucher
2022-04-29 17:46 ` [PATCH 71/73] drm/amdgpu/mes: fix vm csa update issue Alex Deucher
2022-04-29 17:46 ` [PATCH 72/73] drm/amdgpu/mes: disable mes sdma queue test Alex Deucher
2022-04-29 17:46 ` [PATCH 73/73] drm/amdgpu/mes: Update the doorbell function signatures Alex Deucher
