* [PATCH 1/4] drm/amdgpu: Improve error checking in amdgpu_virt_rlcg_reg_rw
From: Victor Lu @ 2024-01-02 17:30 UTC
To: amd-gfx; +Cc: Vignesh.Chander, Victor Lu
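The current error detection only looks for a timeout.
Also check the error field of scratch_reg1 after polling, so that errors
returned from RLCG are reported even when the poll does not time out.
Also add a new error value, AMDGPU_RLCG_INVALID_XCD_ACCESS, and widen the
error field to SCRATCH_REG1[27:24].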
Signed-off-by: Victor Lu <victorchengchi.lu@amd.com>
---
drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c | 8 ++++++--
drivers/gpu/drm/amd/amdgpu/amdgpu_virt.h | 2 ++
2 files changed, 8 insertions(+), 2 deletions(-)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c
index 0dcff2889e25..3cd085569515 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c
@@ -1022,7 +1022,7 @@ u32 amdgpu_virt_rlcg_reg_rw(struct amdgpu_device *adev, u32 offset, u32 v, u32 f
* SCRATCH_REG0 = read/write value
* SCRATCH_REG1[30:28] = command
* SCRATCH_REG1[19:0] = address in dword
- * SCRATCH_REG1[26:24] = Error reporting
+ * SCRATCH_REG1[27:24] = Error reporting
*/
writel(v, scratch_reg0);
writel((offset | flag), scratch_reg1);
@@ -1036,7 +1036,8 @@ u32 amdgpu_virt_rlcg_reg_rw(struct amdgpu_device *adev, u32 offset, u32 v, u32 f
udelay(10);
}
- if (i >= timeout) {
+ tmp = readl(scratch_reg1);
+ if (i >= timeout || (tmp & AMDGPU_RLCG_SCRATCH1_ERROR_MASK) != 0) {
if (amdgpu_sriov_rlcg_error_report_enabled(adev)) {
if (tmp & AMDGPU_RLCG_VFGATE_DISABLED) {
dev_err(adev->dev,
@@ -1047,6 +1048,9 @@ u32 amdgpu_virt_rlcg_reg_rw(struct amdgpu_device *adev, u32 offset, u32 v, u32 f
} else if (tmp & AMDGPU_RLCG_REG_NOT_IN_RANGE) {
dev_err(adev->dev,
"register is not in range, rlcg failed to program reg: 0x%05x\n", offset);
+ } else if (tmp & AMDGPU_RLCG_INVALID_XCD_ACCESS) {
+ dev_err(adev->dev,
+ "invalid xcd access, rlcg failed to program reg: 0x%05x\n", offset);
} else {
dev_err(adev->dev,
"unknown error type, rlcg failed to program reg: 0x%05x\n", offset);
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_virt.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_virt.h
index d4207e44141f..447af2e4aef0 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_virt.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_virt.h
@@ -40,11 +40,13 @@
#define AMDGPU_RLCG_MMHUB_WRITE (0x2 << 28)
/* error code for indirect register access path supported by rlcg for sriov */
+#define AMDGPU_RLCG_INVALID_XCD_ACCESS 0x8000000
#define AMDGPU_RLCG_VFGATE_DISABLED 0x4000000
#define AMDGPU_RLCG_WRONG_OPERATION_TYPE 0x2000000
#define AMDGPU_RLCG_REG_NOT_IN_RANGE 0x1000000
#define AMDGPU_RLCG_SCRATCH1_ADDRESS_MASK 0xFFFFF
+#define AMDGPU_RLCG_SCRATCH1_ERROR_MASK 0xF000000
/* all asic after AI use this offset */
#define mmRCC_IOV_FUNC_IDENTIFIER 0xDE5
--
2.34.1
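For illustration only, the widened check in the patch above can be modelled outside the driver. The sketch below copies the bit values from the amdgpu_virt.h hunk; the enum, the function name rlcg_classify() and the timed_out parameter are hypothetical and not part of amdgpu. The branch order follows the visible hunk, with the position of the wrong-operation-type test being an assumption because that part of the function is not shown.

```c
#include <assert.h>
#include <stdint.h>

/* Bit values copied from the amdgpu_virt.h hunk of this patch. */
#define RLCG_INVALID_XCD_ACCESS   0x8000000u
#define RLCG_VFGATE_DISABLED      0x4000000u
#define RLCG_WRONG_OPERATION_TYPE 0x2000000u
#define RLCG_REG_NOT_IN_RANGE     0x1000000u
#define RLCG_SCRATCH1_ERROR_MASK  0xF000000u

/* The four error codes together fill SCRATCH_REG1[27:24]. */
static_assert((RLCG_INVALID_XCD_ACCESS | RLCG_VFGATE_DISABLED |
	       RLCG_WRONG_OPERATION_TYPE | RLCG_REG_NOT_IN_RANGE) ==
	      RLCG_SCRATCH1_ERROR_MASK,
	      "error codes must cover the error mask");

/* Hypothetical result type, for illustration only. */
enum rlcg_err {
	RLCG_OK,
	RLCG_ERR_VFGATE,
	RLCG_ERR_OP_TYPE,
	RLCG_ERR_RANGE,
	RLCG_ERR_XCD,
	RLCG_ERR_UNKNOWN,	/* timeout with no error bit set lands here */
};

/*
 * Classify the final SCRATCH_REG1 value. Failure is a timeout OR any bit
 * set in the error field, mirroring the condition added by the patch.
 */
static enum rlcg_err rlcg_classify(uint32_t scratch1, int timed_out)
{
	if (!timed_out && (scratch1 & RLCG_SCRATCH1_ERROR_MASK) == 0)
		return RLCG_OK;
	if (scratch1 & RLCG_VFGATE_DISABLED)
		return RLCG_ERR_VFGATE;
	if (scratch1 & RLCG_WRONG_OPERATION_TYPE)
		return RLCG_ERR_OP_TYPE;
	if (scratch1 & RLCG_REG_NOT_IN_RANGE)
		return RLCG_ERR_RANGE;
	if (scratch1 & RLCG_INVALID_XCD_ACCESS)
		return RLCG_ERR_XCD;
	return RLCG_ERR_UNKNOWN;
}
```

With this model, rlcg_classify(0x8000000, 0) is reported as an invalid XCD access, whereas a check on the timeout alone would have let it pass silently.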
* [PATCH 2/4] drm/amdgpu: Do not program SQ_TIMEOUT_CONFIG in SRIOV
From: Victor Lu @ 2024-01-02 17:30 UTC
To: amd-gfx; +Cc: Vignesh.Chander, Victor Lu
A VF should not program SQ_TIMEOUT_CONFIG, so return early from
gfx_v9_4_3_inst_enable_watchdog_timer() under SRIOV.
Signed-off-by: Victor Lu <victorchengchi.lu@amd.com>
---
drivers/gpu/drm/amd/amdgpu/gfx_v9_4_3.c | 3 +++
1 file changed, 3 insertions(+)
diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v9_4_3.c b/drivers/gpu/drm/amd/amdgpu/gfx_v9_4_3.c
index 00b21ece081f..30cc155f20d4 100644
--- a/drivers/gpu/drm/amd/amdgpu/gfx_v9_4_3.c
+++ b/drivers/gpu/drm/amd/amdgpu/gfx_v9_4_3.c
@@ -3888,6 +3888,9 @@ static void gfx_v9_4_3_inst_enable_watchdog_timer(struct amdgpu_device *adev,
uint32_t i;
uint32_t data;
+ if (amdgpu_sriov_vf(adev))
+ return;
+
data = RREG32_SOC15(GC, GET_INST(GC, 0), regSQ_TIMEOUT_CONFIG);
data = REG_SET_FIELD(data, SQ_TIMEOUT_CONFIG, TIMEOUT_FATAL_DISABLE,
amdgpu_watchdog_timer.timeout_fatal_disable ? 1 : 0);
--
2.34.1
* [PATCH 3/4] drm/amdgpu: Use correct SRIOV macro for gmc_v9_0_vm_fault_interrupt_state
From: Victor Lu @ 2024-01-02 17:30 UTC
To: amd-gfx; +Cc: Vignesh.Chander, Victor Lu
Under SRIOV, programming the VM_CONTEXT*_CNTL registers fails because the
RREG32_SOC15_IP/WREG32_SOC15_IP macros do not pass through the correct
XCC instance.
Use the RREG32_XCC/WREG32_XCC macros for the GC hub instead.
The behaviour without SRIOV is unchanged.
Signed-off-by: Victor Lu <victorchengchi.lu@amd.com>
---
drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c | 8 ++++----
1 file changed, 4 insertions(+), 4 deletions(-)
diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c b/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c
index 473a774294ce..e2e14d40109c 100644
--- a/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c
@@ -496,14 +496,14 @@ static int gmc_v9_0_vm_fault_interrupt_state(struct amdgpu_device *adev,
if (j >= AMDGPU_MMHUB0(0))
tmp = RREG32_SOC15_IP(MMHUB, reg);
else
- tmp = RREG32_SOC15_IP(GC, reg);
+ tmp = RREG32_XCC(reg, j);
tmp &= ~bits;
if (j >= AMDGPU_MMHUB0(0))
WREG32_SOC15_IP(MMHUB, reg, tmp);
else
- WREG32_SOC15_IP(GC, reg, tmp);
+ WREG32_XCC(reg, tmp, j);
}
}
break;
@@ -524,14 +524,14 @@ static int gmc_v9_0_vm_fault_interrupt_state(struct amdgpu_device *adev,
if (j >= AMDGPU_MMHUB0(0))
tmp = RREG32_SOC15_IP(MMHUB, reg);
else
- tmp = RREG32_SOC15_IP(GC, reg);
+ tmp = RREG32_XCC(reg, j);
tmp |= bits;
if (j >= AMDGPU_MMHUB0(0))
WREG32_SOC15_IP(MMHUB, reg, tmp);
else
- WREG32_SOC15_IP(GC, reg, tmp);
+ WREG32_XCC(reg, tmp, j);
}
}
break;
--
2.34.1
* [PATCH 4/4] drm/amdgpu: Do not program VM_L2_CNTL under SRIOV
From: Victor Lu @ 2024-01-02 17:30 UTC
To: amd-gfx; +Cc: Vignesh.Chander, Victor Lu
The VM_L2_CNTL* registers should not be programmed on driver unload under
SRIOV. They are already skipped during SRIOV driver init, so skip them in
gfxhub_v1_2_xcc_gart_disable() as well.
Signed-off-by: Victor Lu <victorchengchi.lu@amd.com>
---
drivers/gpu/drm/amd/amdgpu/gfxhub_v1_2.c | 10 ++++++----
1 file changed, 6 insertions(+), 4 deletions(-)
diff --git a/drivers/gpu/drm/amd/amdgpu/gfxhub_v1_2.c b/drivers/gpu/drm/amd/amdgpu/gfxhub_v1_2.c
index 55423ff1bb49..20e800bc0b68 100644
--- a/drivers/gpu/drm/amd/amdgpu/gfxhub_v1_2.c
+++ b/drivers/gpu/drm/amd/amdgpu/gfxhub_v1_2.c
@@ -454,10 +454,12 @@ static void gfxhub_v1_2_xcc_gart_disable(struct amdgpu_device *adev,
WREG32_SOC15_RLC(GC, GET_INST(GC, j), regMC_VM_MX_L1_TLB_CNTL, tmp);
/* Setup L2 cache */
- tmp = RREG32_SOC15(GC, GET_INST(GC, j), regVM_L2_CNTL);
- tmp = REG_SET_FIELD(tmp, VM_L2_CNTL, ENABLE_L2_CACHE, 0);
- WREG32_SOC15(GC, GET_INST(GC, j), regVM_L2_CNTL, tmp);
- WREG32_SOC15(GC, GET_INST(GC, j), regVM_L2_CNTL3, 0);
+ if (!amdgpu_sriov_vf(adev)) {
+ tmp = RREG32_SOC15(GC, GET_INST(GC, j), regVM_L2_CNTL);
+ tmp = REG_SET_FIELD(tmp, VM_L2_CNTL, ENABLE_L2_CACHE, 0);
+ WREG32_SOC15(GC, GET_INST(GC, j), regVM_L2_CNTL, tmp);
+ WREG32_SOC15(GC, GET_INST(GC, j), regVM_L2_CNTL3, 0);
+ }
}
}
--
2.34.1
* RE: [PATCH 4/4] drm/amdgpu: Do not program VM_L2_CNTL under SRIOV
From: Chander, Vignesh @ 2024-01-08 23:36 UTC
To: Lu, Victor Cheng Chi (Victor), amd-gfx; +Cc: Lu, Victor Cheng Chi (Victor)
[AMD Official Use Only - General]
Reviewed-by: Vignesh Chander <Vignesh.Chander@amd.com>
* [PATCH] drm/amdgpu: Improve error checking in amdgpu_virt_rlcg_reg_rw (v2)
From: Victor Lu @ 2024-02-13 18:55 UTC
To: amd-gfx; +Cc: samir.dhume, Victor Lu
The current error detection only looks for a timeout.
This should be changed to also check scratch_reg1 for any errors
returned from RLCG.
v2: remove new error value
Signed-off-by: Victor Lu <victorchengchi.lu@amd.com>
---
drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c | 5 +++--
drivers/gpu/drm/amd/amdgpu/amdgpu_virt.h | 1 +
2 files changed, 4 insertions(+), 2 deletions(-)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c
index 6ff7d3fb2008..7a4eae36778a 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c
@@ -979,7 +979,7 @@ u32 amdgpu_virt_rlcg_reg_rw(struct amdgpu_device *adev, u32 offset, u32 v, u32 f
* SCRATCH_REG0 = read/write value
* SCRATCH_REG1[30:28] = command
* SCRATCH_REG1[19:0] = address in dword
- * SCRATCH_REG1[26:24] = Error reporting
+ * SCRATCH_REG1[27:24] = Error reporting
*/
writel(v, scratch_reg0);
writel((offset | flag), scratch_reg1);
@@ -993,7 +993,8 @@ u32 amdgpu_virt_rlcg_reg_rw(struct amdgpu_device *adev, u32 offset, u32 v, u32 f
udelay(10);
}
- if (i >= timeout) {
+ tmp = readl(scratch_reg1);
+ if (i >= timeout || (tmp & AMDGPU_RLCG_SCRATCH1_ERROR_MASK) != 0) {
if (amdgpu_sriov_rlcg_error_report_enabled(adev)) {
if (tmp & AMDGPU_RLCG_VFGATE_DISABLED) {
dev_err(adev->dev,
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_virt.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_virt.h
index fa7be5f277b9..3f59b7b5523f 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_virt.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_virt.h
@@ -45,6 +45,7 @@
#define AMDGPU_RLCG_REG_NOT_IN_RANGE 0x1000000
#define AMDGPU_RLCG_SCRATCH1_ADDRESS_MASK 0xFFFFF
+#define AMDGPU_RLCG_SCRATCH1_ERROR_MASK 0xF000000
/* all asic after AI use this offset */
#define mmRCC_IOV_FUNC_IDENTIFIER 0xDE5
--
2.34.1
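As a rough model of the v2 behaviour, the poll-then-check sequence can be sketched with a fake register. The mask values are taken from the header hunk; rlcg_wait(), struct fake_reg and fake_read() are hypothetical helpers. The loop exit condition (address field reading back as zero) is an assumption, since the polling loop body is not shown in the hunks, and the udelay() between reads is omitted.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SCRATCH1_ADDRESS_MASK 0xFFFFFu    /* SCRATCH_REG1[19:0] */
#define SCRATCH1_ERROR_MASK   0xF000000u  /* SCRATCH_REG1[27:24] */

/* Hypothetical register accessor so the sketch runs without hardware. */
typedef uint32_t (*reg_read_fn)(void *ctx);

/*
 * Poll until the request looks complete or the timeout expires, then
 * re-read the register and fail on a timeout OR any error bit, as the
 * patched condition does. Returns 0 on success, -1 on failure.
 */
static int rlcg_wait(reg_read_fn rd, void *ctx, unsigned int timeout,
		     uint32_t *last)
{
	unsigned int i;
	uint32_t tmp;

	assert(rd != NULL && last != NULL);

	for (i = 0; i < timeout; i++) {
		tmp = rd(ctx);
		/* Assumed completion signal: address field cleared. */
		if (!(tmp & SCRATCH1_ADDRESS_MASK))
			break;
	}

	tmp = rd(ctx);
	*last = tmp;
	if (i >= timeout || (tmp & SCRATCH1_ERROR_MASK) != 0)
		return -1;
	return 0;
}

/* Fake register: reports 'busy' for busy_reads reads, then 'done'. */
struct fake_reg {
	uint32_t busy;
	uint32_t done;
	unsigned int busy_reads;
};

static uint32_t fake_read(void *ctx)
{
	struct fake_reg *r = ctx;

	if (r->busy_reads > 0) {
		r->busy_reads--;
		return r->busy;
	}
	return r->done;
}
```

Because v2 keeps the 0xF000000 mask but drops the named bit-27 code, a completion value of 0x8000000 is still caught by rlcg_wait() here; in the driver it would presumably fall through to the existing "unknown error type" message.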
* Re: [PATCH] drm/amdgpu: Improve error checking in amdgpu_virt_rlcg_reg_rw (v2)
From: Alex Deucher @ 2024-02-16 15:28 UTC
To: Victor Lu; +Cc: amd-gfx, samir.dhume
On Tue, Feb 13, 2024 at 2:03 PM Victor Lu <victorchengchi.lu@amd.com> wrote:
>
> The current error detection only looks for a timeout.
> This should be changed to also check scratch_reg1 for any errors
> returned from RLCG.
>
> v2: remove new error value
>
> Signed-off-by: Victor Lu <victorchengchi.lu@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
> ---
> drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c | 5 +++--
> drivers/gpu/drm/amd/amdgpu/amdgpu_virt.h | 1 +
> 2 files changed, 4 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c
> index 6ff7d3fb2008..7a4eae36778a 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c
> @@ -979,7 +979,7 @@ u32 amdgpu_virt_rlcg_reg_rw(struct amdgpu_device *adev, u32 offset, u32 v, u32 f
> * SCRATCH_REG0 = read/write value
> * SCRATCH_REG1[30:28] = command
> * SCRATCH_REG1[19:0] = address in dword
> - * SCRATCH_REG1[26:24] = Error reporting
> + * SCRATCH_REG1[27:24] = Error reporting
> */
> writel(v, scratch_reg0);
> writel((offset | flag), scratch_reg1);
> @@ -993,7 +993,8 @@ u32 amdgpu_virt_rlcg_reg_rw(struct amdgpu_device *adev, u32 offset, u32 v, u32 f
> udelay(10);
> }
>
> - if (i >= timeout) {
> + tmp = readl(scratch_reg1);
> + if (i >= timeout || (tmp & AMDGPU_RLCG_SCRATCH1_ERROR_MASK) != 0) {
> if (amdgpu_sriov_rlcg_error_report_enabled(adev)) {
> if (tmp & AMDGPU_RLCG_VFGATE_DISABLED) {
> dev_err(adev->dev,
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_virt.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_virt.h
> index fa7be5f277b9..3f59b7b5523f 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_virt.h
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_virt.h
> @@ -45,6 +45,7 @@
> #define AMDGPU_RLCG_REG_NOT_IN_RANGE 0x1000000
>
> #define AMDGPU_RLCG_SCRATCH1_ADDRESS_MASK 0xFFFFF
> +#define AMDGPU_RLCG_SCRATCH1_ERROR_MASK 0xF000000
>
> /* all asic after AI use this offset */
> #define mmRCC_IOV_FUNC_IDENTIFIER 0xDE5
> --
> 2.34.1
>
* RE: [PATCH 3/4] drm/amdgpu: Use correct SRIOV macro for gmc_v9_0_vm_fault_interrupt_state
From: Luo, Zhigang @ 2024-02-16 21:10 UTC
To: amd-gfx; +Cc: Lu, Victor Cheng Chi (Victor)
Reviewed-by: Zhigang Luo <Zhigang.Luo@amd.com>
* RE: [PATCH 2/4] drm/amdgpu: Do not program SQ_TIMEOUT_CONFIG in SRIOV
From: Dhume, Samir @ 2024-02-20 15:51 UTC
To: Lu, Victor Cheng Chi (Victor), amd-gfx
Cc: Chander, Vignesh, Lu, Victor Cheng Chi (Victor)
Reviewed-by: Samir Dhume <samir.dhume@amd.com>
* RE: [PATCH 4/4] drm/amdgpu: Do not program VM_L2_CNTL under SRIOV
From: Dhume, Samir @ 2024-02-20 15:52 UTC
To: Lu, Victor Cheng Chi (Victor), amd-gfx
Cc: Chander, Vignesh, Lu, Victor Cheng Chi (Victor)
Reviewed-by: Samir Dhume <samir.dhume@amd.com>
* RE: [PATCH 2/4] drm/amdgpu: Do not program SQ_TIMEOUT_CONFIG in SRIOV
From: Luo, Zhigang @ 2024-02-23 17:11 UTC
To: Lu, Victor Cheng Chi (Victor), amd-gfx; +Cc: Zhang, Hawking
Reviewed-by: Zhigang Luo <Zhigang.Luo@amd.com>