From: Felix Kuehling <felix.kuehling@amd.com>
To: Alex Sierra <alex.sierra@amd.com>, amd-gfx@lists.freedesktop.org
Subject: Re: [PATCH 2/2] amd/amdgpu: force to trigger a no-retry-fault after a retry-fault
Date: Mon, 18 Nov 2019 17:46:45 -0500
Message-ID: <f60eeb60-712f-6aa4-2660-86970b92c637@amd.com>
In-Reply-To: <20191118222435.93134-2-alex.sierra@amd.com>

On 2019-11-18 17:24, Alex Sierra wrote:
> Only for the debugger use case.
>
> [why]
> Avoid endless translation retries, after an invalid address access has
> been issued to the GPU. Instead, the trap handler is forced to enter by
> generating a no-retry-fault.
> A s_trap instruction is inserted in the debugger case to let the wave to
> enter trap handler to save context.
>
> [how]
> Intentionally using an invalid flag combination (F and P set at the same
> time) to trigger a no-retry-fault, after a retry-fault happens. This is
> only valid under compute context.
>
> Change-Id: I4180c30e2631dc0401cbd6171f8a6776e4733c9a
> Signed-off-by: Alex Sierra <alex.sierra@amd.com>
> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 6 ++++++
>  1 file changed, 6 insertions(+)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
> index d51ac8771ae0..358a4f50fcfb 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
> @@ -3207,6 +3207,12 @@ bool amdgpu_vm_handle_fault(struct amdgpu_device *adev, unsigned int pasid,
>  		value = adev->dummy_page_addr;
>  		flags |= AMDGPU_PTE_EXECUTABLE | AMDGPU_PTE_READABLE |
>  			AMDGPU_PTE_WRITEABLE;
> +
> +		if (vm->is_compute_context) {
> +			/* Setting PTE flags to trigger a no-retry-fault */
> +			flags = AMDGPU_PTE_EXECUTABLE | AMDGPU_PDE_PTE |
> +				AMDGPU_PTE_TF;

Hmm, this looks like you're setting flags twice in the compute-case. I
was also expecting something more like this:

    if (vm->is_compute_context) {
        ...
    } else if (amdgpu_vm_fault_stop == AMDGPU_VM_FAULT_STOP_NEVER) {
        ...
    } else {
        ...
    }

I.e. for compute contexts, we do our compute-specific thing, otherwise
the behaviour depends on the amdgpu_vm_fault_stop setting.

Regards,
  Felix

> +		}
>  	} else {
>  		/* Let the hw retry silently on the PTE */
>  		value = 0;

_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx
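
For readers following the review, the three-way branch suggested above can be sketched as a small standalone C function. This is only an illustration under stated assumptions, not the driver code: every `sketch_`/`SKETCH_` name is hypothetical, the bit values are placeholders rather than the hardware's PTE layout, and the `value` written in the compute branch is an assumption because the review leaves that branch body elided. The flag combinations themselves are taken from the quoted patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Placeholder bit values for illustration only; the real definitions
 * live in the amdgpu driver headers and may differ. */
#define SKETCH_PTE_EXECUTABLE (1ULL << 0)
#define SKETCH_PTE_READABLE   (1ULL << 1)
#define SKETCH_PTE_WRITEABLE  (1ULL << 2)
#define SKETCH_PDE_PTE        (1ULL << 3)
#define SKETCH_PTE_TF         (1ULL << 4)

/* Hypothetical stand-in for the amdgpu_vm_fault_stop module setting. */
enum sketch_fault_stop {
    SKETCH_FAULT_STOP_NEVER,
    SKETCH_FAULT_STOP_OTHER
};

/* Hypothetical result type: the address and flags to write into the PTE. */
struct sketch_pte {
    uint64_t value;
    uint64_t flags;
};

/* Pick the PTE contents for a faulting address, with the compute case
 * checked first so that flags are assigned exactly once per path. */
void sketch_pick_fault_pte(bool is_compute_context,
                           enum sketch_fault_stop fault_stop,
                           uint64_t dummy_page_addr,
                           uint64_t base_flags,
                           struct sketch_pte *out)
{
    assert(out != NULL);

    out->flags = base_flags;

    if (is_compute_context) {
        /* Compute: replace the flags with the intentionally invalid
         * combination from the patch to force a no-retry-fault. */
        out->flags = SKETCH_PTE_EXECUTABLE | SKETCH_PDE_PTE | SKETCH_PTE_TF;
        out->value = 0; /* assumption: not specified in the review */
    } else if (fault_stop == SKETCH_FAULT_STOP_NEVER) {
        /* Redirect the access to the dummy page. */
        out->value = dummy_page_addr;
        out->flags |= SKETCH_PTE_EXECUTABLE | SKETCH_PTE_READABLE |
                      SKETCH_PTE_WRITEABLE;
    } else {
        /* Let the hw retry silently on the PTE. */
        out->value = 0;
    }
}
```

Ordering the compute check first means the dummy-page flags are never computed and then overwritten, which is the "setting flags twice" concern raised in the reply.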
Thread overview:

2019-11-18 22:24 [PATCH 1/2] drm/amdgpu: add flag to indicate amdgpu vm context Alex Sierra
2019-11-18 22:24 ` [PATCH 2/2] amd/amdgpu: force to trigger a no-retry-fault after a retry-fault Alex Sierra
2019-11-18 22:46   ` Felix Kuehling [this message]
2019-11-19 16:37 [PATCH 1/2] drm/amdgpu: add flag to indicate amdgpu vm context Alex Sierra
2019-11-19 16:37 ` [PATCH 2/2] amd/amdgpu: force to trigger a no-retry-fault after a retry-fault Alex Sierra
2019-11-19 16:45   ` Felix Kuehling
2019-11-19 20:06     ` Christian König