From mboxrd@z Thu Jan 1 00:00:00 1970
From: "Panariti, David"
Subject: RE: CZ EDC param and support
Date: Fri, 28 Apr 2017 14:18:11 +0000
Message-ID:
References: <40321e5a-36c5-6e07-8176-bcd596e7e982@amd.com>
Mime-Version: 1.0
Content-Type: multipart/mixed; boundary="===============0841706889=="
Return-path:
In-Reply-To: <40321e5a-36c5-6e07-8176-bcd596e7e982-5C7GfCeVMHo@public.gmane.org>
Content-Language: en-US
List-Id: Discussion list for AMD gfx
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
Errors-To: amd-gfx-bounces-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW@public.gmane.org
Sender: "amd-gfx"
To: "Koenig, Christian" , gpudriverdevsupport , "amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW@public.gmane.org"

--===============0841706889==
Content-Language: en-US
Content-Type: multipart/alternative; boundary="_000_BN6PR12MB1889C8C94B10AA6E3DF6BD6495130BN6PR12MB1889namp_"

--_000_BN6PR12MB1889C8C94B10AA6E3DF6BD6495130BN6PR12MB1889namp_
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: quoted-printable

Actually, the attachment was an oversight.
It's easier for me to attach, open the attachment and then delete the attachment.
I got only 2/3 this time.
I've gotten a comment that inline patches are preferred.

Sorry for the inconvenience.

davep

From: Koenig, Christian
Sent: Friday, April 28, 2017 4:06 AM
To: Panariti, David ; gpudriverdevsupport ; amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW@public.gmane.org
Subject: Re: CZ EDC param and support

You somehow messed up the attachment.
Instead of individual files everything is squashed together as all-edc.patch.
Please fix that otherwise proper review won't be possible.

Christian.

On 28.04.2017 at 00:13, Panariti, David wrote:
The changes in the workarounds function use DRM_INFO rather than DRM_DEBUG because CZs with EDC are often used in embedded environments and any info can be useful especially in the case of an intermittent problem.

>>From e1ce383592c275b58ad95bd80b5479af8c1f9dae Mon Sep 17 00:00:00 2001
From: David Panariti
Date: Fri, 14 Apr 2017 13:41:52 -0400
Subject: [PATCH 1/3] drm/amdgpu: Moved gfx_v8_0_select_se_sh() in lieu of
 re-redundant prototype.

Will be needed for the rest of the EDC workarounds patch.

Change-Id: Ie586ab38a69e98a91c6cb5747e285ce8bfdd1c86
Signed-off-by: David Panariti
---
 drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c | 46 +++++++++++++++++-----------------
 1 file changed, 23 insertions(+), 23 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c b/drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c
index 2ff5f19..27b57cb 100644
--- a/drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c
@@ -1500,6 +1500,29 @@ static int gfx_v8_0_kiq_init(struct amdgpu_device *adev)
 	return 0;
 }
 
+static void gfx_v8_0_select_se_sh(struct amdgpu_device *adev,
+				  u32 se_num, u32 sh_num, u32 instance)
+{
+	u32 data;
+
+	if (instance == 0xffffffff)
+		data = REG_SET_FIELD(0, GRBM_GFX_INDEX, INSTANCE_BROADCAST_WRITES, 1);
+	else
+		data = REG_SET_FIELD(0, GRBM_GFX_INDEX, INSTANCE_INDEX, instance);
+
+	if (se_num == 0xffffffff)
+		data = REG_SET_FIELD(data, GRBM_GFX_INDEX, SE_BROADCAST_WRITES, 1);
+	else
+		data = REG_SET_FIELD(data, GRBM_GFX_INDEX, SE_INDEX, se_num);
+
+	if (sh_num == 0xffffffff)
+		data = REG_SET_FIELD(data, GRBM_GFX_INDEX, SH_BROADCAST_WRITES, 1);
+	else
+		data = REG_SET_FIELD(data, GRBM_GFX_INDEX, SH_INDEX, sh_num);
+
+	WREG32(mmGRBM_GFX_INDEX, data);
+}
+
 static const u32 vgpr_init_compute_shader[] =
 {
 	0x7e000209, 0x7e020208,
@@ -3556,29 +3579,6 @@ static void gfx_v8_0_tiling_mode_table_init(struct amdgpu_device *adev)
 	}
 }
 
-static void gfx_v8_0_select_se_sh(struct amdgpu_device *adev,
-				  u32 se_num, u32 sh_num, u32 instance)
-{
-	u32 data;
-
-	if (instance == 0xffffffff)
-		data = REG_SET_FIELD(0, GRBM_GFX_INDEX, INSTANCE_BROADCAST_WRITES, 1);
-	else
-		data = REG_SET_FIELD(0, GRBM_GFX_INDEX, INSTANCE_INDEX, instance);
-
-	if (se_num == 0xffffffff)
-		data = REG_SET_FIELD(data, GRBM_GFX_INDEX, SE_BROADCAST_WRITES, 1);
-	else
-		data = REG_SET_FIELD(data, GRBM_GFX_INDEX, SE_INDEX, se_num);
-
-	if (sh_num == 0xffffffff)
-		data = REG_SET_FIELD(data, GRBM_GFX_INDEX, SH_BROADCAST_WRITES, 1);
-	else
-		data = REG_SET_FIELD(data, GRBM_GFX_INDEX, SH_INDEX, sh_num);
-
-	WREG32(mmGRBM_GFX_INDEX, data);
-}
-
 static u32 gfx_v8_0_create_bitmask(u32 bit_width)
 {
 	return (u32)((1ULL << bit_width) - 1);
-- 
2.7.4

>>From 38fac8cab73dbc07e0ee7599b52106bc09dd32ea Mon Sep 17 00:00:00 2001
From: David Panariti
Date: Mon, 24 Apr 2017 11:05:45 -0400
Subject: [PATCH 2/3] drm/amdgpu: Complete Carrizo EDC (Error Detection and
 Correction) workarounds.

The workarounds are unconditionally performed on CZs with EDC enabled.
EDC detects uncorrected ECC errors and uses data poisoning to prevent
corrupted compute results from being used (read).
EDC enabled CZs are often used in embedded environments.
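
Editorial sketch, not part of the patch: the counter table that this patch introduces pairs each register address with a printable name, so a failed check can say which counter was nonzero. The standalone C below illustrates only that pattern. The register offsets, fake_rreg32() and first_nonzero_counter() are hypothetical stand-ins invented for this example; they are not amdgpu definitions, and the real driver reads hardware through RREG32() under grbm_idx_mutex.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical register offsets, for illustration only. */
#define mmFAKE_EDC_A_CNT 0x10
#define mmFAKE_EDC_B_CNT 0x20

struct counter_name_map {
	uint32_t addr;		/* counter register address */
	const char *name;	/* counter name, used in log messages */
	size_t num_instances;	/* number of block instances */
};

/* mm##reg pastes the offset macro name, #reg yields the printable name. */
#define NAME_MAP_ELEMENT(reg, n) {	\
	.addr = mm##reg,		\
	.name = #reg,			\
	.num_instances = (n)		\
}

static const struct counter_name_map counters[] = {
	NAME_MAP_ELEMENT(FAKE_EDC_A_CNT, 2),
	NAME_MAP_ELEMENT(FAKE_EDC_B_CNT, 1),
};

/*
 * Stand-in for a register read: instance 0 of the register at stuck_addr
 * reports a nonzero count, everything else reads back as zero.
 */
static uint32_t fake_rreg32(uint32_t addr, size_t instance, uint32_t stuck_addr)
{
	return (addr == stuck_addr && instance == 0) ? 1 : 0;
}

/* Returns the name of the first nonzero counter, or NULL if all are zero. */
static const char *first_nonzero_counter(uint32_t stuck_addr)
{
	size_t ci, i;

	for (ci = 0; ci < sizeof(counters) / sizeof(counters[0]); ++ci)
		for (i = 0; i < counters[ci].num_instances; ++i)
			if (fake_rreg32(counters[ci].addr, i, stuck_addr) != 0)
				return counters[ci].name;
	return NULL;
}
```

Keeping the name next to the address is what lets the warning identify the offending counter without a second lookup table.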

Change-Id: I84c261785329beeb797f11efbe0ec35790f2996c
Signed-off-by: David Panariti
---
 drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c | 148 ++++++++++++++++++++++++----------
 1 file changed, 106 insertions(+), 42 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c b/drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c
index 27b57cb..2f5bf5f 100644
--- a/drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c
@@ -1645,35 +1645,92 @@ static const u32 sgpr2_init_regs[] =
 	mmCOMPUTE_USER_DATA_9, 0xedcedc09,
 };
 
-static const u32 sec_ded_counter_registers[] =
-{
-	mmCPC_EDC_ATC_CNT,
-	mmCPC_EDC_SCRATCH_CNT,
-	mmCPC_EDC_UCODE_CNT,
-	mmCPF_EDC_ATC_CNT,
-	mmCPF_EDC_ROQ_CNT,
-	mmCPF_EDC_TAG_CNT,
-	mmCPG_EDC_ATC_CNT,
-	mmCPG_EDC_DMA_CNT,
-	mmCPG_EDC_TAG_CNT,
-	mmDC_EDC_CSINVOC_CNT,
-	mmDC_EDC_RESTORE_CNT,
-	mmDC_EDC_STATE_CNT,
-	mmGDS_EDC_CNT,
-	mmGDS_EDC_GRBM_CNT,
-	mmGDS_EDC_OA_DED,
-	mmSPI_EDC_CNT,
-	mmSQC_ATC_EDC_GATCL1_CNT,
-	mmSQC_EDC_CNT,
-	mmSQ_EDC_DED_CNT,
-	mmSQ_EDC_INFO,
-	mmSQ_EDC_SEC_CNT,
-	mmTCC_EDC_CNT,
-	mmTCP_ATC_EDC_GATCL1_CNT,
-	mmTCP_EDC_CNT,
-	mmTD_EDC_CNT
+struct reg32_counter_name_map {
+	uint32_t rnmap_addr;		/* Counter register address */
+	const char *rnmap_name;		/* Name of the counter */
+	size_t rnmap_num_instances;	/* Number of block instances */
 };
 
+#define DEF_mmREG32_NAME_MAP_ELEMENT(reg, num_instances) {	\
+	.rnmap_addr = mm##reg,					\
+	.rnmap_name = #reg,					\
+	.rnmap_num_instances = num_instances			\
+}
+
+/* See GRBM_GFX_INDEX, et. al. registers. */
+static const struct reg32_counter_name_map sec_ded_counter_registers[] = {
+	DEF_mmREG32_NAME_MAP_ELEMENT(SQC_EDC_CNT, 2),
+	DEF_mmREG32_NAME_MAP_ELEMENT(SQC_ATC_EDC_GATCL1_CNT, 2),
+
+	DEF_mmREG32_NAME_MAP_ELEMENT(SQ_EDC_SEC_CNT, 8),
+	DEF_mmREG32_NAME_MAP_ELEMENT(SQ_EDC_DED_CNT, 8),
+	DEF_mmREG32_NAME_MAP_ELEMENT(TCP_EDC_CNT, 8),
+	DEF_mmREG32_NAME_MAP_ELEMENT(TCP_ATC_EDC_GATCL1_CNT, 8),
+	DEF_mmREG32_NAME_MAP_ELEMENT(TD_EDC_CNT, 8),
+
+	DEF_mmREG32_NAME_MAP_ELEMENT(TCC_EDC_CNT, 4),
+
+	DEF_mmREG32_NAME_MAP_ELEMENT(CPC_EDC_ATC_CNT, 1),
+	DEF_mmREG32_NAME_MAP_ELEMENT(CPC_EDC_SCRATCH_CNT, 1),
+	DEF_mmREG32_NAME_MAP_ELEMENT(CPC_EDC_UCODE_CNT, 1),
+	DEF_mmREG32_NAME_MAP_ELEMENT(CPF_EDC_ATC_CNT, 1),
+	DEF_mmREG32_NAME_MAP_ELEMENT(CPF_EDC_ROQ_CNT, 1),
+	DEF_mmREG32_NAME_MAP_ELEMENT(CPF_EDC_TAG_CNT, 1),
+	DEF_mmREG32_NAME_MAP_ELEMENT(CPG_EDC_ATC_CNT, 1),
+	DEF_mmREG32_NAME_MAP_ELEMENT(CPG_EDC_DMA_CNT, 1),
+	DEF_mmREG32_NAME_MAP_ELEMENT(CPG_EDC_TAG_CNT, 1),
+	DEF_mmREG32_NAME_MAP_ELEMENT(DC_EDC_CSINVOC_CNT, 1),
+	DEF_mmREG32_NAME_MAP_ELEMENT(DC_EDC_STATE_CNT, 1),
+	DEF_mmREG32_NAME_MAP_ELEMENT(DC_EDC_RESTORE_CNT, 1),
+	DEF_mmREG32_NAME_MAP_ELEMENT(GDS_EDC_CNT, 1),
+	DEF_mmREG32_NAME_MAP_ELEMENT(GDS_EDC_GRBM_CNT, 1),
+	DEF_mmREG32_NAME_MAP_ELEMENT(SPI_EDC_CNT, 1),
+};
+
+static int gfx_v8_0_edc_clear_counters(struct amdgpu_device *adev)
+{
+	int ci, se, sh, i;
+	uint32_t count;
+	int r = 0;
+
+	mutex_lock(&adev->grbm_idx_mutex);
+
+	for (ci = 0; ci < ARRAY_SIZE(sec_ded_counter_registers); ++ci) {
+		const struct reg32_counter_name_map *cp =
+			sec_ded_counter_registers + ci;
+		const char *name = cp->rnmap_name;
+
+		for (se = 0; se < adev->gfx.config.max_shader_engines; ++se) {
+			for (sh = 0; sh < adev->gfx.config.max_sh_per_se; ++sh) {
+				for (i = 0; i < cp->rnmap_num_instances; ++i) {
+					gfx_v8_0_select_se_sh(adev, se, sh, i);
+					count = RREG32(cp->rnmap_addr);
+					count = RREG32(cp->rnmap_addr);
+					if (count != 0) {
+						/*
+						 * Workaround failed.
+						 * If people are interested
+						 * in EDC at all, they will
+						 * want to know which
+						 * counters had problems.
+						 */
+						DRM_WARN("EDC counter %s is 0x%08x, but should be 0\n.",
+							 name, count);
+						r = -EINVAL;
+						goto ret;
+					}
+				}
+			}
+		}
+	}
+
+ret:
+	gfx_v8_0_select_se_sh(adev, 0xffffffff, 0xffffffff, 0xffffffff);
+	mutex_unlock(&adev->grbm_idx_mutex);
+
+	return r;
+}
+
 static int gfx_v8_0_do_edc_gpr_workarounds(struct amdgpu_device *adev)
 {
 	struct amdgpu_ring *ring = &adev->gfx.compute_ring[0];
@@ -1681,18 +1738,36 @@ static int gfx_v8_0_do_edc_gpr_workarounds(struct amdgpu_device *adev)
 	struct fence *f = NULL;
 	int r, i;
 	u32 tmp;
+	u32 dis_bit;
 	unsigned total_size, vgpr_offset, sgpr_offset;
 	u64 gpu_addr;
 
-	/* only supported on CZ */
-	if (adev->asic_type != CHIP_CARRIZO)
+	if (adev->asic_type != CHIP_CARRIZO) {
+		/* EDC is only supported on CZ */
+		return 0;
+	}
+
+        DRM_INFO("Detected Carrizo.\n");
+
+	tmp = RREG32(mmCC_GC_EDC_CONFIG);
+	dis_bit = REG_GET_FIELD(tmp, CC_GC_EDC_CONFIG, DIS_EDC);
+	if (dis_bit) {
+		/* On Carrizo, EDC may be disabled by a fuse. */
+		DRM_INFO("EDC hardware is disabled, GC_EDC_CONFIG: 0x%08x.\n",
+			tmp);
 		return 0;
+	}
 
 	/* bail if the compute ring is not ready */
 	if (!ring->ready)
 		return 0;
 
-	tmp = RREG32(mmGB_EDC_MODE);
+	DRM_INFO("Applying EDC workarounds.\n");
+
+	/*
+	 * Interested parties can enable EDC using debugfs register reads and
+	 * writes.
+	 */
 	WREG32(mmGB_EDC_MODE, 0);
 
 	total_size =
@@ -1817,18 +1892,7 @@ static int gfx_v8_0_do_edc_gpr_workarounds(struct amdgpu_device *adev)
 		goto fail;
 	}
 
-	tmp = REG_SET_FIELD(tmp, GB_EDC_MODE, DED_MODE, 2);
-	tmp = REG_SET_FIELD(tmp, GB_EDC_MODE, PROP_FED, 1);
-	WREG32(mmGB_EDC_MODE, tmp);
-
-	tmp = RREG32(mmCC_GC_EDC_CONFIG);
-	tmp = REG_SET_FIELD(tmp, CC_GC_EDC_CONFIG, DIS_EDC, 0) | 1;
-	WREG32(mmCC_GC_EDC_CONFIG, tmp);
-
-
-	/* read back registers to clear the counters */
-	for (i = 0; i < ARRAY_SIZE(sec_ded_counter_registers); i++)
-		RREG32(sec_ded_counter_registers[i]);
+	gfx_v8_0_edc_clear_counters(adev);
 
 fail:
 	amdgpu_ib_free(adev, &ib, NULL);
-- 
2.7.4

>>From ec4803205582c1011f5ced1ead70ee244268b4b8 Mon Sep 17 00:00:00 2001
From: David Panariti
Date: Wed, 26 Apr 2017 10:13:06 -0400
Subject: [PATCH 3/3] drm/amdgpu: Add kernel parameter to control use of
 ECC/EDC.

Allow various kinds of memory integrity methods (e.g. ECC/EDC) to be enabled
or disabled.  By default, all features are disabled.

EDC is Error Detection and Correction.  It can detect ECC errors and do 0 or
more of: count SEC (single error corrected) and DED (double error detected,
i.e. uncorrected ECC error), halt the affected block, interrupt the CPU.
Currently, only counting errors is supported.
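
Editorial sketch, not part of the patch: the gating described above reduces to a bitmask test on the ecc_flags module parameter. The two flag values below are the ones this patch adds to amd_shared.h. The helper edc_requested() is a hypothetical name used only for this illustration; the patch performs the same test inline in gfx_v8_0_do_edc_gpr_workarounds().

```c
#include <stdbool.h>

/* Flag values as defined by this patch in amd_shared.h. */
#define AMD_ECC_SUPPORT_BEST	(1 << 0)
#define AMD_ECC_SUPPORT_EDC	(1 << 1)

/*
 * Hypothetical helper (not in the patch): true when the ecc_flags value
 * asks for EDC, either explicitly or through the "best" bit.
 */
static bool edc_requested(unsigned int ecc_flags)
{
	return (ecc_flags & (AMD_ECC_SUPPORT_BEST | AMD_ECC_SUPPORT_EDC)) != 0;
}
```

Under this reading, ecc_flags values 1, 2 and 3 would all request EDC on Carrizo, the default of 0 leaves it off, and bits the patch does not define are ignored by the check.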

Signed-off-by: David Panariti
---
 drivers/gpu/drm/amd/amdgpu/amdgpu.h      |  1 +
 drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c  |  4 ++++
 drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c    | 34 +++++++++++++++++++++++++++++-----
 drivers/gpu/drm/amd/include/amd_shared.h | 14 ++++++++++++++
 4 files changed, 48 insertions(+), 5 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
index 4a16e3c..0322392 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
@@ -111,6 +111,7 @@ extern int amdgpu_prim_buf_per_se;
 extern int amdgpu_pos_buf_per_se;
 extern int amdgpu_cntl_sb_buf_per_se;
 extern int amdgpu_param_buf_per_se;
+extern unsigned amdgpu_ecc_flags;
 
 #define AMDGPU_DEFAULT_GTT_SIZE_MB		3072ULL /* 3GB by default */
 #define AMDGPU_WAIT_IDLE_TIMEOUT_IN_MS		3000
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
index ead00d7..00e16ac 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
@@ -110,6 +110,7 @@ int amdgpu_prim_buf_per_se = 0;
 int amdgpu_pos_buf_per_se = 0;
 int amdgpu_cntl_sb_buf_per_se = 0;
 int amdgpu_param_buf_per_se = 0;
+unsigned amdgpu_ecc_flags = 0;
 
 MODULE_PARM_DESC(vramlimit, "Restrict VRAM for testing, in megabytes");
 module_param_named(vramlimit, amdgpu_vram_limit, int, 0600);
@@ -235,6 +236,9 @@ module_param_named(cntl_sb_buf_per_se, amdgpu_cntl_sb_buf_per_se, int, 0444);
 MODULE_PARM_DESC(param_buf_per_se, "the size of Off-Chip Pramater Cache per Shader Engine (default depending on gfx)");
 module_param_named(param_buf_per_se, amdgpu_param_buf_per_se, int, 0444);
 
+MODULE_PARM_DESC(ecc_flags, "ECC/EDC enable flags (0 = disable ECC/EDC (default))");
+module_param_named(ecc_flags, amdgpu_ecc_flags, uint, 0444);
+
 
 static const struct pci_device_id pciidlist[] = {
 #ifdef CONFIG_DRM_AMDGPU_SI
diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c b/drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c
index 2f5bf5f..05cab7e 100644
--- a/drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c
@@ -1708,7 +1708,7 @@ static int gfx_v8_0_edc_clear_counters(struct amdgpu_device *adev)
 					count = RREG32(cp->rnmap_addr);
 					if (count != 0) {
 						/*
-						 * Workaround failed.
+						 * EDC workaround failed.
 						 * If people are interested
 						 * in EDC at all, they will
 						 * want to know which
@@ -1747,14 +1747,24 @@ static int gfx_v8_0_do_edc_gpr_workarounds(struct amdgpu_device *adev)
 		return 0;
 	}
 
-        DRM_INFO("Detected Carrizo.\n");
+	DRM_INFO("Detected Carrizo.\n");
 
 	tmp = RREG32(mmCC_GC_EDC_CONFIG);
 	dis_bit = REG_GET_FIELD(tmp, CC_GC_EDC_CONFIG, DIS_EDC);
 	if (dis_bit) {
-		/* On Carrizo, EDC may be disabled by a fuse. */
-		DRM_INFO("EDC hardware is disabled, GC_EDC_CONFIG: 0x%08x.\n",
-			tmp);
+		/* On Carrizo, EDC may be disabled permanently by a fuse. */
+		DRM_INFO("Carrizo EDC hardware is disabled, GC_EDC_CONFIG: 0x%08x.\n",
+			 tmp);
+		return 0;
+	}
+
+	/*
+	 * Check if EDC has been requested by a kernel parameter.
+	 * For Carrizo, EDC is the best/safest mode WRT error handling.
+	 */
+	if (!(amdgpu_ecc_flags
+	      & (AMD_ECC_SUPPORT_BEST | AMD_ECC_SUPPORT_EDC))) {
+		DRM_INFO("EDC support has not been requested.\n");
 		return 0;
 	}
 
@@ -1892,6 +1902,20 @@ static int gfx_v8_0_do_edc_gpr_workarounds(struct amdgpu_device *adev)
 		goto fail;
 	}
 
+	/* 00 - GB_EDC_DED_MODE_LOG: Count DED errors but do not halt */
+	tmp = REG_SET_FIELD(tmp, GB_EDC_MODE, DED_MODE, 0);
+	/* Do not propagate the errors to the next block. */
+	tmp = REG_SET_FIELD(tmp, GB_EDC_MODE, PROP_FED, 0);
+	WREG32(mmGB_EDC_MODE, tmp);
+
+	tmp = RREG32(mmCC_GC_EDC_CONFIG);
+
+	/*
+	 * Clear EDC_DISABLE bit so the counters are available.
+	 */
+	tmp = REG_SET_FIELD(tmp, CC_GC_EDC_CONFIG, DIS_EDC, 0);
+	WREG32(mmCC_GC_EDC_CONFIG, tmp);
+
 	gfx_v8_0_edc_clear_counters(adev);
 
 fail:
diff --git a/drivers/gpu/drm/amd/include/amd_shared.h b/drivers/gpu/drm/amd/include/amd_shared.h
index 2ccf44e..c4fd013 100644
--- a/drivers/gpu/drm/amd/include/amd_shared.h
+++ b/drivers/gpu/drm/amd/include/amd_shared.h
@@ -179,6 +179,20 @@ struct amd_pp_profile {
 #define AMD_PG_SUPPORT_GFX_QUICK_MG		(1 << 11)
 #define AMD_PG_SUPPORT_GFX_PIPELINE		(1 << 12)
 
+/*
+ * ECC flags
+ * Allows the user to choose what kind of error detection/correction is used.
+ * Currently, EDC is supported on Carrizo.
+ *
+ * The AMD_ECC_SUPPORT_BEST bit is used to allow a user to have the driver
+ * set what it thinks is best/safest mode.  This may not be the same as the
+ * default, depending on the GPU and the application.
+ * Using a single bit makes it easy to request the best support without
+ * needing to know all currently supported modes.
+ */
+#define AMD_ECC_SUPPORT_BEST			(1 << 0)
+#define AMD_ECC_SUPPORT_EDC			(1 << 1)
+
 enum amd_pm_state_type {
 	/* not used for dpm */
 	POWER_STATE_TYPE_DEFAULT,
-- 
2.7.4

--_000_BN6PR12MB1889C8C94B10AA6E3DF6BD6495130BN6PR12MB1889namp_
Content-Type: text/html; charset="us-ascii"
Content-Transfer-Encoding: quoted-printable

Actually, the attachme= nt was an oversight.

It’s easier for = me to attach, open the attachment and then delete the attachment.

I got only 2/3 this ti= me.

I’ve gotten a co= mment that inline patches are preferred.

 

Sorry for the inconven= ience.

 

davep

 

From:= Koenig, Christian
Sent: Friday, April 28, 2017 4:06 AM
To: Panariti, David <David.Panariti-5C7GfCeVMHo@public.gmane.org>; gpudriverdevsupp= ort <gpudriverdevsupport-5C7GfCeVMHo@public.gmane.org>; amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW@public.gmane.org
Subject: Re: CZ EDC param and support

 

You somehow messed up the attachment.

Instead of individual files everything is squashed together as all-edc.patc= h.

Please fix that otherwise proper review won't be possible.

Christian.

Am 28.04.2017 um 00:13 schrieb Panariti, David:

The changes in the workarounds function use DRM_INFO= rather than DRM_DEBUG because CZs with EDC are often used in embedded envi= ronments and any info can be useful especially in the case of an intermitte= nt problem.

 

From e1ce383592c275b58ad95bd80b5479af8c1f9dae Mon Se= p 17 00:00:00 2001

From: David Panariti <David.Panariti-5C7GfCeVMHo@public.gmane.org>

Date: Fri, 14 Apr 2017 13:41:52 -0400

Subject: [PATCH 1/3] drm/amdgpu: Moved gfx_v8_0_sele= ct_se_sh() in lieu of

re-redundant prototype.

 

Will be needed for the rest of the EDC workarounds p= atch.

 

Change-Id: Ie586ab38a69e98a91c6cb5747e285ce8bfdd1c86=

Signed-off-by: David Panariti <David.Panariti-5C7GfCeVMHo@public.gmane.org>

---

drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c | 46 ++= ;+++++++++++++++= ;------------------

1 file changed, 23 insertions(+), 23 deletions(-= )

 

diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c b= /drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c

index 2ff5f19..27b57cb 100644

--- a/drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c

+++ b/drivers/gpu/drm/amd/amdgpu/gfx_v8_= 0.c

@@ -1500,6 +1500,29 @@ static int gfx_v8_0_kiq_i= nit(struct amdgpu_device *adev)

        &nbs= p;      return 0;

}

+static void gfx_v8_0_select_se_sh(struct amdgpu= _device *adev,

+        = ;            &n= bsp;            = ;            &n= bsp;            = ;     u32 se_num, u32 sh_num, u32 instance)

+{

+        = ;     u32 data;

+

+        = ;     if (instance =3D=3D 0xffffffff)

+        = ;            &n= bsp;        data =3D REG_SET_FIELD(0, GR= BM_GFX_INDEX, INSTANCE_BROADCAST_WRITES, 1);

+        = ;     else

+        = ;            &n= bsp;        data =3D REG_SET_FIELD(0, GR= BM_GFX_INDEX, INSTANCE_INDEX, instance);

+

+        = ;     if (se_num =3D=3D 0xffffffff)

+        = ;            &n= bsp;        data =3D REG_SET_FIELD(data,= GRBM_GFX_INDEX, SE_BROADCAST_WRITES, 1);

+        = ;     else

+        = ;            &n= bsp;        data =3D REG_SET_FIELD(data,= GRBM_GFX_INDEX, SE_INDEX, se_num);

+

+        = ;     if (sh_num =3D=3D 0xffffffff)

+        = ;            &n= bsp;        data =3D REG_SET_FIELD(data,= GRBM_GFX_INDEX, SH_BROADCAST_WRITES, 1);

+        = ;     else

+        = ;            &n= bsp;        data =3D REG_SET_FIELD(data,= GRBM_GFX_INDEX, SH_INDEX, sh_num);

+

+        = ;     WREG32(mmGRBM_GFX_INDEX, data);

+}

+

static const u32 vgpr_init_compute_shader[] =3D=

{

        &nbs= p;      0x7e000209, 0x7e020208,

@@ -3556,29 +3579,6 @@ static void gfx_v8_0_tili= ng_mode_table_init(struct amdgpu_device *adev)

        &nbs= p;      }

}

-static void gfx_v8_0_select_se_sh(struct amdgpu_dev= ice *adev,

-        &nb= sp;            =             &nb= sp;            =             &nb= sp;     u32 se_num, u32 sh_num, u32 instance)

-{

-        &nb= sp;     u32 data;

-

-        &nb= sp;     if (instance =3D=3D 0xffffffff)

-        &nb= sp;            =          data =3D REG_SET_FIELD(0, = GRBM_GFX_INDEX, INSTANCE_BROADCAST_WRITES, 1);

-        &nb= sp;     else

-        &nb= sp;            =          data =3D REG_SET_FIELD(0, = GRBM_GFX_INDEX, INSTANCE_INDEX, instance);

-

-        &nb= sp;     if (se_num =3D=3D 0xffffffff)

-        &nb= sp;            =          data =3D REG_SET_FIELD(dat= a, GRBM_GFX_INDEX, SE_BROADCAST_WRITES, 1);

-        &nb= sp;     else

-        &nb= sp;            =          data =3D REG_SET_FIELD(dat= a, GRBM_GFX_INDEX, SE_INDEX, se_num);

-

-        &nb= sp;     if (sh_num =3D=3D 0xffffffff)

-        &nb= sp;            =          data =3D REG_SET_FIELD(dat= a, GRBM_GFX_INDEX, SH_BROADCAST_WRITES, 1);

-        &nb= sp;     else

-        &nb= sp;            =          data =3D REG_SET_FIELD(dat= a, GRBM_GFX_INDEX, SH_INDEX, sh_num);

-

-        &nb= sp;     WREG32(mmGRBM_GFX_INDEX, data);

-}

-

static u32 gfx_v8_0_create_bitmask(u32 bit_width)

{

        &nbs= p;      return (u32)((1ULL << bit_width) - 1= );

--

2.7.4

 

 

From 38fac8cab73dbc07e0ee7599b52106bc09dd32ea Mon Se= p 17 00:00:00 2001

From: David Panariti <David.Panariti-5C7GfCeVMHo@public.gmane.org>

Date: Mon, 24 Apr 2017 11:05:45 -0400

Subject: [PATCH 2/3] drm/amdgpu: Complete Carrizo ED= C (Error Detection and

Correction) workarounds.

 

The workarounds are unconditionally performed on CZs= with EDC enabled.

EDC detects uncorrected ECC errors and uses data poi= soning to prevent

corrupted compute results from being used (read).

EDC enabled CZs are often used in embedded environme= nts.

 

Change-Id: I84c261785329beeb797f11efbe0ec35790f2996c=

Signed-off-by: David Panariti <David.Panariti-5C7GfCeVMHo@public.gmane.org>

---

drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c | 148 += 3;++++++++++++++= 3;+++++++----------

1 file changed, 106 insertions(+), 42 deletions(= -)

 

diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c b= /drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c

index 27b57cb..2f5bf5f 100644

--- a/drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c

+++ b/drivers/gpu/drm/amd/amdgpu/gfx_v8_= 0.c

@@ -1645,35 +1645,92 @@ static const u32 sgpr2_i= nit_regs[] =3D

        &nbs= p;      mmCOMPUTE_USER_DATA_9, 0xedcedc09,

};

-static const u32 sec_ded_counter_registers[] =3D

-{

-        &nb= sp;     mmCPC_EDC_ATC_CNT,

-        &nb= sp;     mmCPC_EDC_SCRATCH_CNT,

-        &nb= sp;     mmCPC_EDC_UCODE_CNT,

-        &nb= sp;     mmCPF_EDC_ATC_CNT,

-        &nb= sp;     mmCPF_EDC_ROQ_CNT,

-        &nb= sp;     mmCPF_EDC_TAG_CNT,

-        &nb= sp;     mmCPG_EDC_ATC_CNT,

-        &nb= sp;     mmCPG_EDC_DMA_CNT,

-        &nb= sp;     mmCPG_EDC_TAG_CNT,

-        &nb= sp;     mmDC_EDC_CSINVOC_CNT,

-        &nb= sp;     mmDC_EDC_RESTORE_CNT,

-        &nb= sp;     mmDC_EDC_STATE_CNT,

-        &nb= sp;     mmGDS_EDC_CNT,

-        &nb= sp;     mmGDS_EDC_GRBM_CNT,

-        &nb= sp;     mmGDS_EDC_OA_DED,

-        &nb= sp;     mmSPI_EDC_CNT,

-        &nb= sp;     mmSQC_ATC_EDC_GATCL1_CNT,

-        &nb= sp;     mmSQC_EDC_CNT,

-        &nb= sp;     mmSQ_EDC_DED_CNT,

-        &nb= sp;     mmSQ_EDC_INFO,

-        &nb= sp;     mmSQ_EDC_SEC_CNT,

-        &nb= sp;     mmTCC_EDC_CNT,

-        &nb= sp;     mmTCP_ATC_EDC_GATCL1_CNT,

-        &nb= sp;     mmTCP_EDC_CNT,

-        &nb= sp;     mmTD_EDC_CNT

+struct reg32_counter_name_map {

+        = ;     uint32_t rnmap_addr;   /* Counter regis= ter address */

+        = ;     const char *rnmap_name;    &n= bsp;      /* Name of the counter */

+        = ;     size_t rnmap_num_instances;  /* Number of bl= ock instances */

};

+#define DEF_mmREG32_NAME_MAP_ELEMENT(reg, num_i= nstances) {          &nbs= p;  \

+        = ;     .rnmap_addr =3D mm##reg,    &= nbsp;           &nbs= p;            &= nbsp;           &nbs= p;            &= nbsp;           &nbs= p;        \

+        = ;     .rnmap_name =3D #reg,    &nbs= p;            &= nbsp;           &nbs= p;            &= nbsp;           &nbs= p;             = \

+        = ;     .rnmap_num_instances =3D num_instances  = ;            &n= bsp;            = ;            &n= bsp;    \

+}

+

+/* See GRBM_GFX_INDEX, et. al. registers. */

+static const struct reg32_counter_name_map sec_= ded_counter_registers[] =3D {

+        = ;     DEF_mmREG32_NAME_MAP_ELEMENT(SQC_EDC_CNT, 2),

+        = ;     DEF_mmREG32_NAME_MAP_ELEMENT(SQC_ATC_EDC_GATCL1_C= NT, 2),

+

+        = ;     DEF_mmREG32_NAME_MAP_ELEMENT(SQ_EDC_SEC_CNT, 8),<= o:p>

+        = ;     DEF_mmREG32_NAME_MAP_ELEMENT(SQ_EDC_DED_CNT, 8),<= o:p>

+        = ;     DEF_mmREG32_NAME_MAP_ELEMENT(TCP_EDC_CNT, 8),

+        = ;     DEF_mmREG32_NAME_MAP_ELEMENT(TCP_ATC_EDC_GATCL1_C= NT, 8),

+        = ;     DEF_mmREG32_NAME_MAP_ELEMENT(TD_EDC_CNT, 8),=

+

+        = ;     DEF_mmREG32_NAME_MAP_ELEMENT(TCC_EDC_CNT, 4),

+

+        = ;     DEF_mmREG32_NAME_MAP_ELEMENT(CPC_EDC_ATC_CNT, 1),=

+        = ;     DEF_mmREG32_NAME_MAP_ELEMENT(CPC_EDC_SCRATCH_CNT,= 1),

+        = ;     DEF_mmREG32_NAME_MAP_ELEMENT(CPC_EDC_UCODE_CNT, 1= ),

+        = ;     DEF_mmREG32_NAME_MAP_ELEMENT(CPF_EDC_ATC_CNT, 1),=

+        = ;     DEF_mmREG32_NAME_MAP_ELEMENT(CPF_EDC_ROQ_CNT, 1),=

+        = ;     DEF_mmREG32_NAME_MAP_ELEMENT(CPF_EDC_TAG_CNT, 1),=

+        = ;     DEF_mmREG32_NAME_MAP_ELEMENT(CPG_EDC_ATC_CNT, 1),=

+        = ;     DEF_mmREG32_NAME_MAP_ELEMENT(CPG_EDC_DMA_CNT, 1),=

+        = ;     DEF_mmREG32_NAME_MAP_ELEMENT(CPG_EDC_TAG_CNT, 1),=

+        = ;     DEF_mmREG32_NAME_MAP_ELEMENT(DC_EDC_CSINVOC_CNT, = 1),

+        = ;     DEF_mmREG32_NAME_MAP_ELEMENT(DC_EDC_STATE_CNT, 1)= ,

+        = ;     DEF_mmREG32_NAME_MAP_ELEMENT(DC_EDC_RESTORE_CNT, = 1),

+        = ;     DEF_mmREG32_NAME_MAP_ELEMENT(GDS_EDC_CNT, 1),

+        = ;     DEF_mmREG32_NAME_MAP_ELEMENT(GDS_EDC_GRBM_CNT, 1)= ,

+        = ;     DEF_mmREG32_NAME_MAP_ELEMENT(SPI_EDC_CNT, 1),

+};

+

+static int gfx_v8_0_edc_clear_counters(struct a= mdgpu_device *adev)

+{

+        = ;     int ci, se, sh, i;

+        = ;     uint32_t count;

+        = ;     int r =3D 0;

+

+        = ;     mutex_lock(&adev->grbm_idx_mutex);

+

+        = ;     for (ci =3D 0; ci < ARRAY_SIZE(sec_ded_counter= _registers); ++ci) {

+        = ;            &n= bsp;        const struct reg32_counter_n= ame_map *cp =3D

+        = ;            &n= bsp;            = ;            sec_ded= _counter_registers + ci;

+        = ;            &n= bsp;        const char *name =3D cp->= rnmap_name;

+

+        = ;            &n= bsp;        for (se =3D 0; se < adev-= >gfx.config.max_shader_engines; ++se) {

+        = ;            &n= bsp;            = ;            for (sh= =3D 0; sh < adev->gfx.config.max_sh_per_se; ++sh) {

+        = ;            &n= bsp;            = ;            &n= bsp;            = ;   for (i =3D 0; i < cp->rnmap_num_instances; ++i)= {

+        = ;            &n= bsp;            = ;            &n= bsp;            = ;            &n= bsp;      gfx_v8_0_select_se_sh(adev, se, sh, i);<= o:p>

+        = ;            &n= bsp;            = ;            &n= bsp;            = ;            &n= bsp;      count =3D RREG32(cp->rnmap_addr);

+        = ;            &n= bsp;            = ;            &n= bsp;            = ;            &n= bsp;      count =3D RREG32(cp->rnmap_addr);

+        = ;            &n= bsp;            = ;            &n= bsp;            = ;            &n= bsp;      if (count !=3D 0) {

+        = ;            &n= bsp;            = ;            &n= bsp;            = ;            &n= bsp;            = ;          /*

+        = ;            &n= bsp;            = ;            &n= bsp;            = ;            &n= bsp;            = ;          * Workaround failed= .

+        = ;            &n= bsp;            = ;            &n= bsp;            = ;            &n= bsp;            = ;          * If people are int= erested

+        = ;            &n= bsp;            = ;            &n= bsp;            = ;            &n= bsp;            = ;          * in EDC at all, th= ey will

+        = ;            &n= bsp;            = ;            &n= bsp;            = ;            &n= bsp;            = ;          * want to know whic= h

+        = ;            &n= bsp;            = ;            &n= bsp;            = ;            &n= bsp;            = ;          * counters had prob= lems.

+        = ;            &n= bsp;            = ;            &n= bsp;            = ;            &n= bsp;            = ;          */

+        = ;            &n= bsp;            = ;            &n= bsp;            = ;            &n= bsp;            = ;          DRM_WARN("EDC = counter %s is 0x%08x, but should be 0\n.",

+        = ;            &n= bsp;            = ;            &n= bsp;            = ;            &n= bsp;            = ;            &n= bsp;            = ; name, count);

+        = ;            &n= bsp;            = ;            &n= bsp;            = ;            &n= bsp;            = ;          r =3D -EINVAL;=

+        = ;            &n= bsp;            = ;            &n= bsp;            = ;            &n= bsp;            = ;          goto ret;

+        = ;            &n= bsp;            = ;            &n= bsp;            = ;            &n= bsp;      }

+        = ;            &n= bsp;            = ;            &n= bsp;            = ;   }

+        = ;            &n= bsp;            = ;            }<= /o:p>

+        = ;            &n= bsp;        }

+        = ;     }

+

+ret:

+        = ;     gfx_v8_0_select_se_sh(adev, 0xffffffff, 0xfffffff= f, 0xffffffff);

+        = ;     mutex_unlock(&adev->grbm_idx_mutex);<= /o:p>

+

+        = ;     return r;

+}

+

static int gfx_v8_0_do_edc_gpr_workarounds(struct am= dgpu_device *adev)

{

        &nbs= p;      struct amdgpu_ring *ring =3D &adev->= ;gfx.compute_ring[0];

@@ -1681,18 +1738,36 @@ static int gfx_v8_0_do_e= dc_gpr_workarounds(struct amdgpu_device *adev)

        &nbs= p;      struct fence *f =3D NULL;

        &nbs= p;      int r, i;

        &nbs= p;      u32 tmp;

+        = ;     u32 dis_bit;

        &nbs= p;      unsigned total_size, vgpr_offset, sgpr_off= set;

        &nbs= p;      u64 gpu_addr;

-        &nb= sp;     /* only supported on CZ */

-        &nb= sp;     if (adev->asic_type !=3D CHIP_CARRIZO)<= /o:p>

+        = ;     if (adev->asic_type !=3D CHIP_CARRIZO) {<= /o:p>

+        = ;            &n= bsp;        /* EDC is only supported on = CZ */

+        = ;            &n= bsp;        return 0;

+        = ;     }

+

+        DRM_= INFO("Detected Carrizo.\n");

+

+        = ;     tmp =3D RREG32(mmCC_GC_EDC_CONFIG);

+        = ;     dis_bit =3D REG_GET_FIELD(tmp, CC_GC_EDC_CONFIG, = DIS_EDC);

+        = ;     if (dis_bit) {

+        = ;            &n= bsp;        /* On Carrizo, EDC may be di= sabled by a fuse. */

+        = ;            &n= bsp;        DRM_INFO("EDC hardware = is disabled, GC_EDC_CONFIG: 0x%08x.\n",

+        = ;            &n= bsp;            = ;            tmp);

        &nbs= p;            &= nbsp;         return 0;<= /p>

+        = ;     }

        &nbs= p;       /* bail if the compute ring is not r= eady */

        &nbs= p;      if (!ring->ready)

        &nbs= p;            &= nbsp;         return 0;<= /p>

-        &nb= sp;     tmp =3D RREG32(mmGB_EDC_MODE);

+        = ;     DRM_INFO("Applying EDC workarounds.\n")= ;

+

+        = ;     /*

+        = ;     * Interested parties can enable EDC using debugfs= register reads and

+        = ;     * writes.

+        = ;     */

        &nbs= p;      WREG32(mmGB_EDC_MODE, 0);

        &nbs= p;       total_size =3D

@@ -1817,18 +1892,7 @@ static int gfx_v8_0_do_edc_gpr_workarounds(struct amdgpu_device *adev)
 		goto fail;
 	}
 
-	tmp = REG_SET_FIELD(tmp, GB_EDC_MODE, DED_MODE, 2);
-	tmp = REG_SET_FIELD(tmp, GB_EDC_MODE, PROP_FED, 1);
-	WREG32(mmGB_EDC_MODE, tmp);
-
-	tmp = RREG32(mmCC_GC_EDC_CONFIG);
-	tmp = REG_SET_FIELD(tmp, CC_GC_EDC_CONFIG, DIS_EDC, 0) | 1;
-	WREG32(mmCC_GC_EDC_CONFIG, tmp);
-
-
-	/* read back registers to clear the counters */
-	for (i = 0; i < ARRAY_SIZE(sec_ded_counter_registers); i++)
-		RREG32(sec_ded_counter_registers[i]);
+	gfx_v8_0_edc_clear_counters(adev);
 
 fail:
 	amdgpu_ib_free(adev, &ib, NULL);
--
2.7.4
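
For readers following the DIS_EDC check in patch 2/3, here is a minimal userspace sketch of the field get/set pattern that REG_GET_FIELD and REG_SET_FIELD express. The demo_* names and the bit position chosen for DIS_EDC are assumptions made for illustration only; the real mask and shift come from the driver's register headers and are not shown in this thread.

```c
#include <stdint.h>

/* Hypothetical layout: a one-bit DIS_EDC field at bit 1 (assumption). */
#define DEMO_DIS_EDC_SHIFT 1u
#define DEMO_DIS_EDC_MASK  (1u << DEMO_DIS_EDC_SHIFT)

/* Extract a field: mask the register value, then shift it down. */
static inline uint32_t demo_get_field(uint32_t reg, uint32_t mask,
				      uint32_t shift)
{
	return (reg & mask) >> shift;
}

/* Replace a field: clear the old bits, then OR in the new value. */
static inline uint32_t demo_set_field(uint32_t reg, uint32_t mask,
				      uint32_t shift, uint32_t val)
{
	return (reg & ~mask) | ((val << shift) & mask);
}

/* Returns 1 when the (hypothetical) DIS_EDC bit is set, else 0. */
static int demo_edc_disabled(uint32_t cc_gc_edc_config)
{
	return demo_get_field(cc_gc_edc_config, DEMO_DIS_EDC_MASK,
			      DEMO_DIS_EDC_SHIFT) != 0;
}
```

In the patch, a non-zero DIS_EDC field makes the workaround function log the register value and return early, which corresponds to demo_edc_disabled() returning 1 here.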

 

 

From ec4803205582c1011f5ced1ead70ee244268b4b8 Mon Sep 17 00:00:00 2001
From: David Panariti <David.Panariti-5C7GfCeVMHo@public.gmane.org>
Date: Wed, 26 Apr 2017 10:13:06 -0400
Subject: [PATCH 3/3] drm/amdgpu: Add kernel parameter to control use of
 ECC/EDC.

Allow various kinds of memory integrity methods (e.g. ECC/EDC) to be enabled
or disabled.  By default, all features are disabled.

EDC is Error Detection and Correction.  It can detect ECC errors and do 0 or
more of: count SEC (single error corrected) and DED (double error detected,
i.e. uncorrected ECC error), halt the affected block, interrupt the CPU.
Currently, only counting errors is supported.

Signed-off-by: David Panariti <David.Panariti-5C7GfCeVMHo@public.gmane.org>
---
 drivers/gpu/drm/amd/amdgpu/amdgpu.h      |  1 +
 drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c  |  4 ++++
 drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c    | 34 +++++++++++++++++++++++++++++-----
 drivers/gpu/drm/amd/include/amd_shared.h | 14 ++++++++++++++
 4 files changed, 48 insertions(+), 5 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
index 4a16e3c..0322392 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
@@ -111,6 +111,7 @@ extern int amdgpu_prim_buf_per_se;
 extern int amdgpu_pos_buf_per_se;
 extern int amdgpu_cntl_sb_buf_per_se;
 extern int amdgpu_param_buf_per_se;
+extern unsigned amdgpu_ecc_flags;
 
 #define AMDGPU_DEFAULT_GTT_SIZE_MB		3072ULL /* 3GB by default */
 #define AMDGPU_WAIT_IDLE_TIMEOUT_IN_MS		3000
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
index ead00d7..00e16ac 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
@@ -110,6 +110,7 @@ int amdgpu_prim_buf_per_se = 0;
 int amdgpu_pos_buf_per_se = 0;
 int amdgpu_cntl_sb_buf_per_se = 0;
 int amdgpu_param_buf_per_se = 0;
+unsigned amdgpu_ecc_flags = 0;
 
 MODULE_PARM_DESC(vramlimit, "Restrict VRAM for testing, in megabytes");
 module_param_named(vramlimit, amdgpu_vram_limit, int, 0600);
@@ -235,6 +236,9 @@ module_param_named(cntl_sb_buf_per_se, amdgpu_cntl_sb_buf_per_se, int, 0444);
 
 MODULE_PARM_DESC(param_buf_per_se, "the size of Off-Chip Pramater Cache per Shader Engine (default depending on gfx)");
 module_param_named(param_buf_per_se, amdgpu_param_buf_per_se, int, 0444);
+MODULE_PARM_DESC(ecc_flags, "ECC/EDC enable flags (0 = disable ECC/EDC (default))");
+module_param_named(ecc_flags, amdgpu_ecc_flags, uint, 0444);
+
 
 static const struct pci_device_id pciidlist[] = {
 #ifdef  CONFIG_DRM_AMDGPU_SI
diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c b/drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c
index 2f5bf5f..05cab7e 100644
--- a/drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c
@@ -1708,7 +1708,7 @@ static int gfx_v8_0_edc_clear_counters(struct amdgpu_device *adev)
 					count = RREG32(cp->rnmap_addr);
 					if (count != 0) {
 						/*
-						 * Workaround failed.
+						 * EDC workaround failed.
 						 * If people are interested
 						 * in EDC at all, they will
 						 * want to know which
@@ -1747,14 +1747,24 @@ static int gfx_v8_0_do_edc_gpr_workarounds(struct amdgpu_device *adev)
 		return 0;
 	}
 
-        DRM_INFO("Detected Carrizo.\n");
+	DRM_INFO("Detected Carrizo.\n");
 
 	tmp = RREG32(mmCC_GC_EDC_CONFIG);
 	dis_bit = REG_GET_FIELD(tmp, CC_GC_EDC_CONFIG, DIS_EDC);
 	if (dis_bit) {
-		/* On Carrizo, EDC may be disabled by a fuse. */
-		DRM_INFO("EDC hardware is disabled, GC_EDC_CONFIG: 0x%08x.\n",
-			tmp);
+		/* On Carrizo, EDC may be disabled permanently by a fuse. */
+		DRM_INFO("Carrizo EDC hardware is disabled, GC_EDC_CONFIG: 0x%08x.\n",
+			tmp);
+		return 0;
+	}
+
+	/*
+	 * Check if EDC has been requested by a kernel parameter.
+	 * For Carrizo, EDC is the best/safest mode WRT error handling.
+	 */
+	if (!(amdgpu_ecc_flags
+	      & (AMD_ECC_SUPPORT_BEST | AMD_ECC_SUPPORT_EDC))) {
+		DRM_INFO("EDC support has not been requested.\n");
 		return 0;
 	}
 
@@ -1892,6 +1902,20 @@ static int gfx_v8_0_do_edc_gpr_workarounds(struct amdgpu_device *adev)
 		goto fail;
 	}
 
+	/* 00 - GB_EDC_DED_MODE_LOG: Count DED errors but do not halt */
+	tmp = REG_SET_FIELD(tmp, GB_EDC_MODE, DED_MODE, 0);
+	/* Do not propagate the errors to the next block. */
+	tmp = REG_SET_FIELD(tmp, GB_EDC_MODE, PROP_FED, 0);
+	WREG32(mmGB_EDC_MODE, tmp);
+
+	tmp = RREG32(mmCC_GC_EDC_CONFIG);
+
+	/*
+	 * Clear EDC_DISABLE bit so the counters are available.
+	 */
+	tmp = REG_SET_FIELD(tmp, CC_GC_EDC_CONFIG, DIS_EDC, 0);
+	WREG32(mmCC_GC_EDC_CONFIG, tmp);
+
 	gfx_v8_0_edc_clear_counters(adev);
 
 fail:
diff --git a/drivers/gpu/drm/amd/include/amd_shared.h b/drivers/gpu/drm/amd/include/amd_shared.h
index 2ccf44e..c4fd013 100644
--- a/drivers/gpu/drm/amd/include/amd_shared.h
+++ b/drivers/gpu/drm/amd/include/amd_shared.h
@@ -179,6 +179,20 @@ struct amd_pp_profile {
 #define AMD_PG_SUPPORT_GFX_QUICK_MG		(1 << 11)
 #define AMD_PG_SUPPORT_GFX_PIPELINE		(1 << 12)
 
+/*
+ * ECC flags
+ * Allows the user to choose what kind of error detection/correction is used.
+ * Currently, EDC is supported on Carrizo.
+ *
+ * The AMD_ECC_SUPPORT_BEST bit is used to allow a user to have the driver
+ * set what it thinks is best/safest mode.  This may not be the same as the
+ * default, depending on the GPU and the application.
+ * Using a single bit makes it easy to request the best support without
+ * needing to know all currently supported modes.
+ */
+#define AMD_ECC_SUPPORT_BEST			(1 << 0)
+#define AMD_ECC_SUPPORT_EDC			(1 << 1)
+
 enum amd_pm_state_type {
 	/* not used for dpm */
 	POWER_STATE_TYPE_DEFAULT,
--
2.7.4
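
As a rough illustration of how the ecc_flags gating in patch 3/3 behaves, the check can be sketched in standalone C. The two flag values are copied from the amd_shared.h hunk; the helper name demo_edc_requested is hypothetical and does not exist in the driver.

```c
#include <stdbool.h>

/* Flag values as defined in the amd_shared.h hunk of patch 3/3. */
#define AMD_ECC_SUPPORT_BEST	(1 << 0)
#define AMD_ECC_SUPPORT_EDC	(1 << 1)

/*
 * Hypothetical helper: true when the ecc_flags value asks for EDC,
 * either explicitly or through the "best/safest mode" bit.
 */
static bool demo_edc_requested(unsigned int ecc_flags)
{
	return (ecc_flags & (AMD_ECC_SUPPORT_BEST | AMD_ECC_SUPPORT_EDC)) != 0;
}
```

Under this sketch, ecc_flags values of 1, 2, or 3 all select EDC on Carrizo, while the default of 0 leaves it off. Because the parameter is registered with module_param_named(), it would be set at load time, for example as amdgpu.ecc_flags=1 on the kernel command line.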

 

 

--_000_BN6PR12MB1889C8C94B10AA6E3DF6BD6495130BN6PR12MB1889namp_-- --===============0841706889== Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: base64 Content-Disposition: inline X19fX19fX19fX19fX19fX19fX19fX19fX19fX19fX19fX19fX19fX19fX19fX18KYW1kLWdmeCBt YWlsaW5nIGxpc3QKYW1kLWdmeEBsaXN0cy5mcmVlZGVza3RvcC5vcmcKaHR0cHM6Ly9saXN0cy5m cmVlZGVza3RvcC5vcmcvbWFpbG1hbi9saXN0aW5mby9hbWQtZ2Z4Cg== --===============0841706889==--