Re: [PATCH v3] drm/amdgpu: add workarounds for VCN TMZ issue on CHIP_RAVEN

From: Lang Yu <Lang.Yu@amd.com>
To: Paul Menzel <pmenzel@molgen.mpg.de>
Cc: "Alex Deucher" <alexander.deucher@amd.com>,
	"Huang Rui" <ray.huang@amd.com>,
	"Christian König" <christian.koenig@amd.com>,
	amd-gfx@lists.freedesktop.org
Subject: Re: [PATCH v3] drm/amdgpu: add workarounds for VCN TMZ issue on CHIP_RAVEN
Date: Wed, 16 Mar 2022 14:21:44 +0800	[thread overview]
Message-ID: <YjGB+Dx4nUIOyuMf@lang-desktop> (raw)
In-Reply-To: <c502a443-0987-6d88-1ef3-603723f04634@molgen.mpg.de>

On 03/16/ , Paul Menzel wrote:
> Dear Lang,
> 
> 
> Am 16.03.22 um 02:27 schrieb Lang Yu:
> > On 03/15/ , Paul Menzel wrote:
> 
> > > Am 14.03.22 um 03:45 schrieb Lang Yu:
> > > 
> > > Thank you for your patch. A shorter commit message summary would be:
> > > 
> > > > drm/amdgpu: Work around VNC TMZ issue on CHIP_RAVEN
> > > 
> > > > It is a hardware issue that VCN can't handle a GTT
> > > > backing stored TMZ buffer on CHIP_RAVEN series ASIC.
> > > 
> > > Where is that documented, and how can this be reproduced?
> > 
> > It is documented in AMD internal Confluence and JIRA.
> > Secure playback with a low VRAM config(thus TMZ buffer
> > will be allocted in GTT domain) may reproduce this issue.
> 
> It’d be great if as much of the details from this non-publicly accessible
> information could be added to the commit message, and a way to reproduce
> this as there does not seem to be a test for this. (Also I guess a tag with
> a reference to the internal issue would be acceptable, so in case more
> question surface in the future.)

Thanks. I will add an internal link.

Regards,
Lang

> > > > Move such a TMZ buffer to VRAM domain before command
> > > > submission as a wrokaround.
> > > 
> > > workaround
> > > 
> > > Please use a text width of 75 characters per line.
> > 
> > Thanks for your comments. I will take care about this.
> > 
> > > > v2:
> > > >    - Use patch_cs_in_place callback.
> > > > 
> > > > v3:
> > > >    - Bail out early if unsecure IBs.
> > > > 
> > > > Suggested-by: Christian König <christian.koenig@amd.com>
> > > > Signed-off-by: Lang Yu <Lang.Yu@amd.com>
> > > > ---
> > > >    drivers/gpu/drm/amd/amdgpu/vcn_v1_0.c | 71 +++++++++++++++++++++++++++
> > > >    1 file changed, 71 insertions(+)
> > > > 
> > > > diff --git a/drivers/gpu/drm/amd/amdgpu/vcn_v1_0.c b/drivers/gpu/drm/amd/amdgpu/vcn_v1_0.c
> > > > index 7bbb9ba6b80b..1ac9e06d3a4d 100644
> > > > --- a/drivers/gpu/drm/amd/amdgpu/vcn_v1_0.c
> > > > +++ b/drivers/gpu/drm/amd/amdgpu/vcn_v1_0.c
> > > > @@ -24,6 +24,7 @@
> > > >    #include <linux/firmware.h>
> > > >    #include "amdgpu.h"
> > > > +#include "amdgpu_cs.h"
> > > >    #include "amdgpu_vcn.h"
> > > >    #include "amdgpu_pm.h"
> > > >    #include "soc15.h"
> > > > @@ -1905,6 +1906,75 @@ static const struct amd_ip_funcs vcn_v1_0_ip_funcs = {
> > > >    	.set_powergating_state = vcn_v1_0_set_powergating_state,
> > > >    };
> > > > +/*
> > > > + * It is a hardware issue that VCN can't handle a GTT TMZ buffer on
> > > > + * CHIP_RAVEN series ASIC. Move such a GTT TMZ buffer to VRAM domain
> > > > + * before command submission as a workaround.
> > > > + */
> > > > +static int vcn_v1_0_validate_bo(struct amdgpu_cs_parser *parser,
> > > > +				struct amdgpu_job *job,
> > > > +				uint64_t addr)
> > > > +{
> > > > +	struct ttm_operation_ctx ctx = { false, false };
> > > > +	struct amdgpu_fpriv *fpriv = parser->filp->driver_priv;
> > > > +	struct amdgpu_vm *vm = &fpriv->vm;
> > > > +	struct amdgpu_bo_va_mapping *mapping;
> > > > +	struct amdgpu_bo *bo;
> > > > +	int r;
> > > > +
> > > > +	addr &= AMDGPU_GMC_HOLE_MASK;
> > > > +	if (addr & 0x7) {
> > > > +		DRM_ERROR("VCN messages must be 8 byte aligned!\n");
> > > > +		return -EINVAL;
> > > > +	}
> > > > +
> > > > +	mapping = amdgpu_vm_bo_lookup_mapping(vm, addr/AMDGPU_GPU_PAGE_SIZE);
> > > > +	if (!mapping || !mapping->bo_va || !mapping->bo_va->base.bo)
> > > > +		return -EINVAL;
> > > > +
> > > > +	bo = mapping->bo_va->base.bo;
> > > > +	if (!(bo->flags & AMDGPU_GEM_CREATE_ENCRYPTED))
> > > > +		return 0;
> > > > +
> > > > +	amdgpu_bo_placement_from_domain(bo, AMDGPU_GEM_DOMAIN_VRAM);
> > > > +	r = ttm_bo_validate(&bo->tbo, &bo->placement, &ctx);
> > > > +	if (r) {
> > > > +		DRM_ERROR("Failed validating the VCN message BO (%d)!\n", r);
> > > 
> > > to validate
> > > 
> > > > +		return r;
> > > > +	}
> > > > +
> > > > +	return r;
> > > > +}
> > > > +
> > > > +static int vcn_v1_0_ring_patch_cs_in_place(struct amdgpu_cs_parser *p,
> > > > +					   struct amdgpu_job *job,
> > > > +					   struct amdgpu_ib *ib)
> > > > +{
> > > > +	uint32_t msg_lo = 0, msg_hi = 0;
> > > > +	int i, r;
> > > 
> > > Use unsigned int for the counter variable?
> > 
> > You can see both signed and unsigned are used in kernel.
> > (e.g., mm/page_alloc.c).
> > 
> > For this case, I think it makes no difference.
> > 
> > In a word, we should consider the specific context.
> 
> Although it makes technically no difference, I prefer to use the best
> matching type to convey the intention of the variable to the reader. But you
> are right, there is no hard rule for it.
> 
> 
> Kind regards,
> 
> Paul
> 
> > > > +
> > > > +	if (!(ib->flags & AMDGPU_IB_FLAGS_SECURE))
> > > > +		return 0;
> > > > +
> > > > +	for (i = 0; i < ib->length_dw; i += 2) {
> > > > +		uint32_t reg = amdgpu_ib_get_value(ib, i);
> > > > +		uint32_t val = amdgpu_ib_get_value(ib, i + 1);
> > > > +
> > > > +		if (reg == PACKET0(p->adev->vcn.internal.data0, 0)) {
> > > > +			msg_lo = val;
> > > > +		} else if (reg == PACKET0(p->adev->vcn.internal.data1, 0)) {
> > > > +			msg_hi = val;
> > > > +		} else if (reg == PACKET0(p->adev->vcn.internal.cmd, 0)) {
> > > > +			r = vcn_v1_0_validate_bo(p, job,
> > > > +						 ((u64)msg_hi) << 32 | msg_lo);
> > > > +			if (r)
> > > > +				return r;
> > > > +		}
> > > > +	}
> > > > +
> > > > +	return 0;
> > > > +}
> > > > +
> > > >    static const struct amdgpu_ring_funcs vcn_v1_0_dec_ring_vm_funcs = {
> > > >    	.type = AMDGPU_RING_TYPE_VCN_DEC,
> > > >    	.align_mask = 0xf,
> > > > @@ -1914,6 +1984,7 @@ static const struct amdgpu_ring_funcs vcn_v1_0_dec_ring_vm_funcs = {
> > > >    	.get_rptr = vcn_v1_0_dec_ring_get_rptr,
> > > >    	.get_wptr = vcn_v1_0_dec_ring_get_wptr,
> > > >    	.set_wptr = vcn_v1_0_dec_ring_set_wptr,
> > > > +	.patch_cs_in_place = vcn_v1_0_ring_patch_cs_in_place,
> > > >    	.emit_frame_size =
> > > >    		6 + 6 + /* hdp invalidate / flush */
> > > >    		SOC15_FLUSH_GPU_TLB_NUM_WREG * 6 +
> > > 
> > > 
> > > Kind regards,
> > > 
> > > Paul