* Graceful page fault handling for Vega/Navi
@ 2019-09-04 15:02 Christian König
       [not found] ` <20190904150230.13885-1-christian.koenig-5C7GfCeVMHo@public.gmane.org>
  0 siblings, 1 reply; 25+ messages in thread
From: Christian König @ 2019-09-04 15:02 UTC (permalink / raw)
  To: amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW

Hi everyone,

this series is the next puzzle piece for recoverable page fault handling on Vega and Navi.

It adds a new direct scheduler entity for VM updates which is then used to update page tables during a fault.

In other words: previously, an application doing an invalid memory access would just hang and/or repeat the invalid access over and over again. Now the handling is modified so that the invalid memory access is redirected to the dummy page.
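
To make that concrete, here is a condensed, non-authoritative sketch of the
retry-fault path this series ends up with (names are taken from patch 9;
locking, reference counting and error handling are elided, so treat it as an
illustration rather than the literal implementation):

	/* sketch only: simplified from patch 9 */
	bool amdgpu_vm_handle_fault(struct amdgpu_device *adev, unsigned int pasid,
				    uint64_t addr)
	{
		struct amdgpu_vm *vm;
		uint64_t value, flags;

		vm = idr_find(&adev->vm_manager.pasid_idr, pasid);
		if (!vm)
			return false;

		/* redirect the faulting page to the dummy page */
		value = adev->dummy_page_addr;
		flags = AMDGPU_PTE_VALID | AMDGPU_PTE_SNOOPED | AMDGPU_PTE_SYSTEM |
			AMDGPU_PTE_EXECUTABLE | AMDGPU_PTE_READABLE |
			AMDGPU_PTE_WRITEABLE;

		addr /= AMDGPU_GPU_PAGE_SIZE;
		amdgpu_vm_bo_update_mapping(adev, vm, true /* direct */, NULL,
					    addr, addr + 1, flags, value,
					    NULL, NULL);
		amdgpu_vm_update_pdes(adev, vm, true /* direct */);

		/* the fault is silenced; nothing is reported as handled yet */
		return false;
	}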

This needs the following prerequisites:
a) The firmware must be new enough to allow re-routing of page faults.
b) Fault retry must be enabled using the amdgpu.noretry=0 parameter.
c) Enough free VRAM to allocate page tables to point to the dummy page.

The re-routing of page faults currently only works on Vega10, so Vega20 and Navi will still need some more time.

Please review and/or comment,
Christian.



* [PATCH 1/9] drm/ttm: return -EBUSY on pipelining with no_gpu_wait
       [not found] ` <20190904150230.13885-1-christian.koenig-5C7GfCeVMHo@public.gmane.org>
@ 2019-09-04 15:02   ` Christian König
  2019-09-04 15:02   ` [PATCH 2/9] drm/amdgpu: split the VM entity into direct and delayed Christian König
                     ` (9 subsequent siblings)
  10 siblings, 0 replies; 25+ messages in thread
From: Christian König @ 2019-09-04 15:02 UTC (permalink / raw)
  To: amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW

Setting the no_gpu_wait flag means that the allocated BO must be available
immediately and we can't wait for any GPU operation to finish.

Signed-off-by: Christian König <christian.koenig@amd.com>
---
 drivers/gpu/drm/ttm/ttm_bo.c | 43 +++++++++++++++++++++---------------
 1 file changed, 25 insertions(+), 18 deletions(-)

diff --git a/drivers/gpu/drm/ttm/ttm_bo.c b/drivers/gpu/drm/ttm/ttm_bo.c
index 2070e8a57ed8..2899702139fb 100644
--- a/drivers/gpu/drm/ttm/ttm_bo.c
+++ b/drivers/gpu/drm/ttm/ttm_bo.c
@@ -924,7 +924,8 @@ EXPORT_SYMBOL(ttm_bo_mem_put);
  */
 static int ttm_bo_add_move_fence(struct ttm_buffer_object *bo,
 				 struct ttm_mem_type_manager *man,
-				 struct ttm_mem_reg *mem)
+				 struct ttm_mem_reg *mem,
+				 bool no_wait_gpu)
 {
 	struct dma_fence *fence;
 	int ret;
@@ -933,19 +934,22 @@ static int ttm_bo_add_move_fence(struct ttm_buffer_object *bo,
 	fence = dma_fence_get(man->move);
 	spin_unlock(&man->move_lock);
 
-	if (fence) {
-		reservation_object_add_shared_fence(bo->resv, fence);
+	if (!fence)
+		return 0;
 
-		ret = reservation_object_reserve_shared(bo->resv, 1);
-		if (unlikely(ret)) {
-			dma_fence_put(fence);
-			return ret;
-		}
+	if (no_wait_gpu)
+		return -EBUSY;
+
+	reservation_object_add_shared_fence(bo->resv, fence);
 
-		dma_fence_put(bo->moving);
-		bo->moving = fence;
+	ret = reservation_object_reserve_shared(bo->resv, 1);
+	if (unlikely(ret)) {
+		dma_fence_put(fence);
+		return ret;
 	}
 
+	dma_fence_put(bo->moving);
+	bo->moving = fence;
 	return 0;
 }
 
@@ -974,7 +978,7 @@ static int ttm_bo_mem_force_space(struct ttm_buffer_object *bo,
 			return ret;
 	} while (1);
 
-	return ttm_bo_add_move_fence(bo, man, mem);
+	return ttm_bo_add_move_fence(bo, man, mem, ctx->no_wait_gpu);
 }
 
 static uint32_t ttm_bo_select_caching(struct ttm_mem_type_manager *man,
@@ -1116,13 +1120,16 @@ int ttm_bo_mem_space(struct ttm_buffer_object *bo,
 		if (unlikely(ret))
 			goto error;
 
-		if (mem->mm_node) {
-			ret = ttm_bo_add_move_fence(bo, man, mem);
-			if (unlikely(ret)) {
-				(*man->func->put_node)(man, mem);
-				goto error;
-			}
-			return 0;
+		if (!mem->mm_node)
+			continue;
+
+		ret = ttm_bo_add_move_fence(bo, man, mem, ctx->no_wait_gpu);
+		if (unlikely(ret)) {
+			(*man->func->put_node)(man, mem);
+			if (ret == -EBUSY)
+				continue;
+
+			goto error;
 		}
 	}
 
-- 
2.17.1
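
A caller-side illustration of the new behaviour (a rough sketch, not part of
the patch; it assumes the caller already has a valid struct ttm_placement set
up):

	struct ttm_operation_ctx ctx = {
		.interruptible = false,
		.no_wait_gpu = true,	/* fail fast instead of pipelining behind a move */
	};
	int r;

	r = ttm_bo_validate(bo, &placement, &ctx);
	if (r) {
		/* with no_wait_gpu set, the validation now fails quickly (e.g.
		 * -EBUSY or -ENOMEM depending on the path) rather than waiting
		 * for a GPU move to finish; the caller can retry later or fall
		 * back to another strategy */
	}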


* [PATCH 2/9] drm/amdgpu: split the VM entity into direct and delayed
       [not found] ` <20190904150230.13885-1-christian.koenig-5C7GfCeVMHo@public.gmane.org>
  2019-09-04 15:02   ` [PATCH 1/9] drm/ttm: return -EBUSY on pipelining with no_gpu_wait Christian König
@ 2019-09-04 15:02   ` Christian König
  2019-09-04 15:02   ` [PATCH 3/9] drm/amdgpu: allow direct submission in the VM backends v2 Christian König
                     ` (8 subsequent siblings)
  10 siblings, 0 replies; 25+ messages in thread
From: Christian König @ 2019-09-04 15:02 UTC (permalink / raw)
  To: amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW

For page fault handling we need to use a direct update which can't be
blocked by ongoing user CS.

Signed-off-by: Christian König <christian.koenig@amd.com>
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_ids.c     |  6 +++---
 drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c      | 21 +++++++++++++++------
 drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h      |  5 +++--
 drivers/gpu/drm/amd/amdgpu/amdgpu_vm_sdma.c |  5 +++--
 4 files changed, 24 insertions(+), 13 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ids.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ids.c
index cd15540c5622..dfe155566571 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ids.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ids.c
@@ -282,7 +282,7 @@ static int amdgpu_vmid_grab_reserved(struct amdgpu_vm *vm,
 	    !dma_fence_is_later(updates, (*id)->flushed_updates))
 	    updates = NULL;
 
-	if ((*id)->owner != vm->entity.fence_context ||
+	if ((*id)->owner != vm->direct.fence_context ||
 	    job->vm_pd_addr != (*id)->pd_gpu_addr ||
 	    updates || !(*id)->last_flush ||
 	    ((*id)->last_flush->context != fence_context &&
@@ -349,7 +349,7 @@ static int amdgpu_vmid_grab_used(struct amdgpu_vm *vm,
 		struct dma_fence *flushed;
 
 		/* Check all the prerequisites to using this VMID */
-		if ((*id)->owner != vm->entity.fence_context)
+		if ((*id)->owner != vm->direct.fence_context)
 			continue;
 
 		if ((*id)->pd_gpu_addr != job->vm_pd_addr)
@@ -449,7 +449,7 @@ int amdgpu_vmid_grab(struct amdgpu_vm *vm, struct amdgpu_ring *ring,
 	}
 
 	id->pd_gpu_addr = job->vm_pd_addr;
-	id->owner = vm->entity.fence_context;
+	id->owner = vm->direct.fence_context;
 
 	if (job->vm_needs_flush) {
 		dma_fence_put(id->last_flush);
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
index 501e13420786..fc103a9f20c5 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
@@ -2687,12 +2687,17 @@ int amdgpu_vm_init(struct amdgpu_device *adev, struct amdgpu_vm *vm,
 	spin_lock_init(&vm->invalidated_lock);
 	INIT_LIST_HEAD(&vm->freed);
 
-	/* create scheduler entity for page table updates */
-	r = drm_sched_entity_init(&vm->entity, adev->vm_manager.vm_pte_rqs,
+	/* create scheduler entities for page table updates */
+	r = drm_sched_entity_init(&vm->direct, adev->vm_manager.vm_pte_rqs,
 				  adev->vm_manager.vm_pte_num_rqs, NULL);
 	if (r)
 		return r;
 
+	r = drm_sched_entity_init(&vm->delayed, adev->vm_manager.vm_pte_rqs,
+				  adev->vm_manager.vm_pte_num_rqs, NULL);
+	if (r)
+		goto error_free_direct;
+
 	vm->pte_support_ats = false;
 
 	if (vm_context == AMDGPU_VM_CONTEXT_COMPUTE) {
@@ -2721,7 +2726,7 @@ int amdgpu_vm_init(struct amdgpu_device *adev, struct amdgpu_vm *vm,
 		bp.flags &= ~AMDGPU_GEM_CREATE_SHADOW;
 	r = amdgpu_bo_create(adev, &bp, &root);
 	if (r)
-		goto error_free_sched_entity;
+		goto error_free_delayed;
 
 	r = amdgpu_bo_reserve(root, true);
 	if (r)
@@ -2764,8 +2769,11 @@ int amdgpu_vm_init(struct amdgpu_device *adev, struct amdgpu_vm *vm,
 	amdgpu_bo_unref(&vm->root.base.bo);
 	vm->root.base.bo = NULL;
 
-error_free_sched_entity:
-	drm_sched_entity_destroy(&vm->entity);
+error_free_delayed:
+	drm_sched_entity_destroy(&vm->delayed);
+
+error_free_direct:
+	drm_sched_entity_destroy(&vm->direct);
 
 	return r;
 }
@@ -2954,7 +2962,8 @@ void amdgpu_vm_fini(struct amdgpu_device *adev, struct amdgpu_vm *vm)
 		spin_unlock_irqrestore(&adev->vm_manager.pasid_lock, flags);
 	}
 
-	drm_sched_entity_destroy(&vm->entity);
+	drm_sched_entity_destroy(&vm->direct);
+	drm_sched_entity_destroy(&vm->delayed);
 
 	if (!RB_EMPTY_ROOT(&vm->va.rb_root)) {
 		dev_err(adev->dev, "still active bo inside vm\n");
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h
index 3352a87b822e..7138722ee55f 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h
@@ -257,8 +257,9 @@ struct amdgpu_vm {
 	struct amdgpu_vm_pt     root;
 	struct dma_fence	*last_update;
 
-	/* Scheduler entity for page table updates */
-	struct drm_sched_entity	entity;
+	/* Scheduler entities for page table updates */
+	struct drm_sched_entity	direct;
+	struct drm_sched_entity	delayed;
 
 	unsigned int		pasid;
 	/* dedicated to vm */
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm_sdma.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm_sdma.c
index ddd181f5ed37..d087d6650d79 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm_sdma.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm_sdma.c
@@ -99,12 +99,13 @@ static int amdgpu_vm_sdma_commit(struct amdgpu_vm_update_params *p,
 	struct dma_fence *f;
 	int r;
 
-	ring = container_of(p->vm->entity.rq->sched, struct amdgpu_ring, sched);
+	ring = container_of(p->vm->delayed.rq->sched, struct amdgpu_ring,
+			    sched);
 
 	WARN_ON(ib->length_dw == 0);
 	amdgpu_ring_pad_ib(ring, ib);
 	WARN_ON(ib->length_dw > p->num_dw_left);
-	r = amdgpu_job_submit(p->job, &p->vm->entity,
+	r = amdgpu_job_submit(p->job, &p->vm->delayed,
 			      AMDGPU_FENCE_OWNER_VM, &f);
 	if (r)
 		goto error;
-- 
2.17.1


* [PATCH 3/9] drm/amdgpu: allow direct submission in the VM backends v2
       [not found] ` <20190904150230.13885-1-christian.koenig-5C7GfCeVMHo@public.gmane.org>
  2019-09-04 15:02   ` [PATCH 1/9] drm/ttm: return -EBUSY on pipelining with no_gpu_wait Christian König
  2019-09-04 15:02   ` [PATCH 2/9] drm/amdgpu: split the VM entity into direct and delayed Christian König
@ 2019-09-04 15:02   ` Christian König
  2019-09-04 15:02   ` [PATCH 4/9] drm/amdgpu: allow direct submission of PDE updates Christian König
                     ` (7 subsequent siblings)
  10 siblings, 0 replies; 25+ messages in thread
From: Christian König @ 2019-09-04 15:02 UTC (permalink / raw)
  To: amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW

This allows us to update page tables directly while in a page fault.

v2: use direct/delayed entities and still wait for moves

Signed-off-by: Christian König <christian.koenig@amd.com>
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h      |  5 +++++
 drivers/gpu/drm/amd/amdgpu/amdgpu_vm_cpu.c  | 16 ++++++-------
 drivers/gpu/drm/amd/amdgpu/amdgpu_vm_sdma.c | 25 +++++++++++----------
 3 files changed, 26 insertions(+), 20 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h
index 7138722ee55f..54dcd0bcce1a 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h
@@ -201,6 +201,11 @@ struct amdgpu_vm_update_params {
 	 */
 	struct amdgpu_vm *vm;
 
+	/**
+	 * @direct: if changes should be made directly
+	 */
+	bool direct;
+
 	/**
 	 * @pages_addr:
 	 *
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm_cpu.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm_cpu.c
index 5222d165abfc..a2daeadd770f 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm_cpu.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm_cpu.c
@@ -49,13 +49,6 @@ static int amdgpu_vm_cpu_prepare(struct amdgpu_vm_update_params *p, void *owner,
 {
 	int r;
 
-	/* Wait for PT BOs to be idle. PTs share the same resv. object
-	 * as the root PD BO
-	 */
-	r = amdgpu_bo_sync_wait(p->vm->root.base.bo, owner, true);
-	if (unlikely(r))
-		return r;
-
 	/* Wait for any BO move to be completed */
 	if (exclusive) {
 		r = dma_fence_wait(exclusive, true);
@@ -63,7 +56,14 @@ static int amdgpu_vm_cpu_prepare(struct amdgpu_vm_update_params *p, void *owner,
 			return r;
 	}
 
-	return 0;
+	/* Don't wait for submissions during page fault */
+	if (p->direct)
+		return 0;
+
+	/* Wait for PT BOs to be idle. PTs share the same resv. object
+	 * as the root PD BO
+	 */
+	return amdgpu_bo_sync_wait(p->vm->root.base.bo, owner, true);
 }
 
 /**
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm_sdma.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm_sdma.c
index d087d6650d79..38c966cedc26 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm_sdma.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm_sdma.c
@@ -68,17 +68,19 @@ static int amdgpu_vm_sdma_prepare(struct amdgpu_vm_update_params *p,
 	if (r)
 		return r;
 
+	p->num_dw_left = ndw;
+
+	/* Wait for moves to be completed */
 	r = amdgpu_sync_fence(p->adev, &p->job->sync, exclusive, false);
 	if (r)
 		return r;
 
-	r = amdgpu_sync_resv(p->adev, &p->job->sync, root->tbo.resv,
-			     owner, false);
-	if (r)
-		return r;
+	/* Don't wait for any submissions during page fault handling */
+	if (p->direct)
+		return 0;
 
-	p->num_dw_left = ndw;
-	return 0;
+	return amdgpu_sync_resv(p->adev, &p->job->sync, root->tbo.resv,
+				owner, false);
 }
 
 /**
@@ -95,23 +97,23 @@ static int amdgpu_vm_sdma_commit(struct amdgpu_vm_update_params *p,
 {
 	struct amdgpu_bo *root = p->vm->root.base.bo;
 	struct amdgpu_ib *ib = p->job->ibs;
+	struct drm_sched_entity *entity;
 	struct amdgpu_ring *ring;
 	struct dma_fence *f;
 	int r;
 
-	ring = container_of(p->vm->delayed.rq->sched, struct amdgpu_ring,
-			    sched);
+	entity = p->direct ? &p->vm->direct : &p->vm->delayed;
+	ring = container_of(entity->rq->sched, struct amdgpu_ring, sched);
 
 	WARN_ON(ib->length_dw == 0);
 	amdgpu_ring_pad_ib(ring, ib);
 	WARN_ON(ib->length_dw > p->num_dw_left);
-	r = amdgpu_job_submit(p->job, &p->vm->delayed,
-			      AMDGPU_FENCE_OWNER_VM, &f);
+	r = amdgpu_job_submit(p->job, entity, AMDGPU_FENCE_OWNER_VM, &f);
 	if (r)
 		goto error;
 
 	amdgpu_bo_fence(root, f, true);
-	if (fence)
+	if (fence && !p->direct)
 		swap(*fence, f);
 	dma_fence_put(f);
 	return 0;
@@ -121,7 +123,6 @@ static int amdgpu_vm_sdma_commit(struct amdgpu_vm_update_params *p,
 	return r;
 }
 
-
 /**
  * amdgpu_vm_sdma_copy_ptes - copy the PTEs from mapping
  *
-- 
2.17.1


* [PATCH 4/9] drm/amdgpu: allow direct submission of PDE updates.
       [not found] ` <20190904150230.13885-1-christian.koenig-5C7GfCeVMHo@public.gmane.org>
                     ` (2 preceding siblings ...)
  2019-09-04 15:02   ` [PATCH 3/9] drm/amdgpu: allow direct submission in the VM backends v2 Christian König
@ 2019-09-04 15:02   ` Christian König
       [not found]     ` <20190904150230.13885-5-christian.koenig-5C7GfCeVMHo@public.gmane.org>
  2019-09-04 15:02   ` [PATCH 5/9] drm/amdgpu: allow direct submission of PTE updates Christian König
                     ` (6 subsequent siblings)
  10 siblings, 1 reply; 25+ messages in thread
From: Christian König @ 2019-09-04 15:02 UTC (permalink / raw)
  To: amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW

For handling PDE updates directly in the fault handler.

Signed-off-by: Christian König <christian.koenig@amd.com>
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c | 2 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c           | 2 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c          | 2 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c           | 8 +++++---
 drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h           | 4 ++--
 5 files changed, 10 insertions(+), 8 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
index b0f0e060ded6..d3942d9306c4 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
@@ -343,7 +343,7 @@ static int vm_update_pds(struct amdgpu_vm *vm, struct amdgpu_sync *sync)
 	struct amdgpu_device *adev = amdgpu_ttm_adev(pd->tbo.bdev);
 	int ret;
 
-	ret = amdgpu_vm_update_directories(adev, vm);
+	ret = amdgpu_vm_update_pdes(adev, vm, false);
 	if (ret)
 		return ret;
 
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
index 51f3db08b8eb..bd6b88827447 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
@@ -845,7 +845,7 @@ static int amdgpu_cs_vm_handling(struct amdgpu_cs_parser *p)
 	if (r)
 		return r;
 
-	r = amdgpu_vm_update_directories(adev, vm);
+	r = amdgpu_vm_update_pdes(adev, vm, false);
 	if (r)
 		return r;
 
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
index e7af35c7080d..a621e629d876 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
@@ -521,7 +521,7 @@ static void amdgpu_gem_va_update_vm(struct amdgpu_device *adev,
 			goto error;
 	}
 
-	r = amdgpu_vm_update_directories(adev, vm);
+	r = amdgpu_vm_update_pdes(adev, vm, false);
 
 error:
 	if (r && r != -ERESTARTSYS)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
index fc103a9f20c5..b6c89ba9281c 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
@@ -1221,18 +1221,19 @@ static void amdgpu_vm_invalidate_pds(struct amdgpu_device *adev,
 }
 
 /*
- * amdgpu_vm_update_directories - make sure that all directories are valid
+ * amdgpu_vm_update_ - make sure that all directories are valid
  *
  * @adev: amdgpu_device pointer
  * @vm: requested vm
+ * @direct: submit directly to the paging queue
  *
  * Makes sure all directories are up to date.
  *
  * Returns:
  * 0 for success, error for failure.
  */
-int amdgpu_vm_update_directories(struct amdgpu_device *adev,
-				 struct amdgpu_vm *vm)
+int amdgpu_vm_update_pdes(struct amdgpu_device *adev,
+			  struct amdgpu_vm *vm, bool direct)
 {
 	struct amdgpu_vm_update_params params;
 	int r;
@@ -1243,6 +1244,7 @@ int amdgpu_vm_update_directories(struct amdgpu_device *adev,
 	memset(&params, 0, sizeof(params));
 	params.adev = adev;
 	params.vm = vm;
+	params.direct = direct;
 
 	r = vm->update_funcs->prepare(&params, AMDGPU_FENCE_OWNER_VM, NULL);
 	if (r)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h
index 54dcd0bcce1a..0a97dc839f3b 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h
@@ -366,8 +366,8 @@ int amdgpu_vm_validate_pt_bos(struct amdgpu_device *adev, struct amdgpu_vm *vm,
 			      int (*callback)(void *p, struct amdgpu_bo *bo),
 			      void *param);
 int amdgpu_vm_flush(struct amdgpu_ring *ring, struct amdgpu_job *job, bool need_pipe_sync);
-int amdgpu_vm_update_directories(struct amdgpu_device *adev,
-				 struct amdgpu_vm *vm);
+int amdgpu_vm_update_pdes(struct amdgpu_device *adev,
+			  struct amdgpu_vm *vm, bool direct);
 int amdgpu_vm_clear_freed(struct amdgpu_device *adev,
 			  struct amdgpu_vm *vm,
 			  struct dma_fence **fence);
-- 
2.17.1
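
For illustration, the intended split between the two submission modes looks
roughly like this (a sketch assembled from the call sites in this patch and in
patch 9, not new code):

	/* CS/ioctl paths: may be ordered behind user command submissions */
	r = amdgpu_vm_update_pdes(adev, vm, false);

	/* page fault path (patch 9): goes through the direct entity and must
	 * not be blocked by ongoing user CS */
	r = amdgpu_vm_update_pdes(adev, vm, true);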


* [PATCH 5/9] drm/amdgpu: allow direct submission of PTE updates
       [not found] ` <20190904150230.13885-1-christian.koenig-5C7GfCeVMHo@public.gmane.org>
                     ` (3 preceding siblings ...)
  2019-09-04 15:02   ` [PATCH 4/9] drm/amdgpu: allow direct submission of PDE updates Christian König
@ 2019-09-04 15:02   ` Christian König
  2019-09-04 15:02   ` [PATCH 6/9] drm/amdgpu: allow direct submission of clears Christian König
                     ` (5 subsequent siblings)
  10 siblings, 0 replies; 25+ messages in thread
From: Christian König @ 2019-09-04 15:02 UTC (permalink / raw)
  To: amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW

For handling PTE updates directly in the fault handler.

Signed-off-by: Christian König <christian.koenig@amd.com>
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 18 ++++++++++--------
 1 file changed, 10 insertions(+), 8 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
index b6c89ba9281c..c096903370aa 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
@@ -1486,13 +1486,14 @@ static int amdgpu_vm_update_ptes(struct amdgpu_vm_update_params *params,
  * amdgpu_vm_bo_update_mapping - update a mapping in the vm page table
  *
  * @adev: amdgpu_device pointer
- * @exclusive: fence we need to sync to
- * @pages_addr: DMA addresses to use for mapping
  * @vm: requested vm
+ * @direct: direct submission in a page fault
+ * @exclusive: fence we need to sync to
  * @start: start of mapped range
  * @last: last mapped entry
  * @flags: flags for the entries
  * @addr: addr to set the area to
+ * @pages_addr: DMA addresses to use for mapping
  * @fence: optional resulting fence
  *
  * Fill in the page table entries between @start and @last.
@@ -1501,11 +1502,11 @@ static int amdgpu_vm_update_ptes(struct amdgpu_vm_update_params *params,
  * 0 for success, -EINVAL for failure.
  */
 static int amdgpu_vm_bo_update_mapping(struct amdgpu_device *adev,
+				       struct amdgpu_vm *vm, bool direct,
 				       struct dma_fence *exclusive,
-				       dma_addr_t *pages_addr,
-				       struct amdgpu_vm *vm,
 				       uint64_t start, uint64_t last,
 				       uint64_t flags, uint64_t addr,
+				       dma_addr_t *pages_addr,
 				       struct dma_fence **fence)
 {
 	struct amdgpu_vm_update_params params;
@@ -1515,6 +1516,7 @@ static int amdgpu_vm_bo_update_mapping(struct amdgpu_device *adev,
 	memset(&params, 0, sizeof(params));
 	params.adev = adev;
 	params.vm = vm;
+	params.direct = direct;
 	params.pages_addr = pages_addr;
 
 	/* sync to everything except eviction fences on unmapping */
@@ -1650,9 +1652,9 @@ static int amdgpu_vm_bo_split_mapping(struct amdgpu_device *adev,
 		}
 
 		last = min((uint64_t)mapping->last, start + max_entries - 1);
-		r = amdgpu_vm_bo_update_mapping(adev, exclusive, dma_addr, vm,
+		r = amdgpu_vm_bo_update_mapping(adev, vm, false, exclusive,
 						start, last, flags, addr,
-						fence);
+						dma_addr, fence);
 		if (r)
 			return r;
 
@@ -1946,9 +1948,9 @@ int amdgpu_vm_clear_freed(struct amdgpu_device *adev,
 		    mapping->start < AMDGPU_GMC_HOLE_START)
 			init_pte_value = AMDGPU_PTE_DEFAULT_ATC;
 
-		r = amdgpu_vm_bo_update_mapping(adev, NULL, NULL, vm,
+		r = amdgpu_vm_bo_update_mapping(adev, vm, false, NULL,
 						mapping->start, mapping->last,
-						init_pte_value, 0, &f);
+						init_pte_value, 0, NULL, &f);
 		amdgpu_vm_free_mapping(adev, vm, mapping, f);
 		if (r) {
 			dma_fence_put(f);
-- 
2.17.1


* [PATCH 6/9] drm/amdgpu: allow direct submission of clears
       [not found] ` <20190904150230.13885-1-christian.koenig-5C7GfCeVMHo@public.gmane.org>
                     ` (4 preceding siblings ...)
  2019-09-04 15:02   ` [PATCH 5/9] drm/amdgpu: allow direct submission of PTE updates Christian König
@ 2019-09-04 15:02   ` Christian König
  2019-09-04 15:02   ` [PATCH 7/9] drm/amdgpu: allocate PDs/PTs with no_gpu_wait in a page fault Christian König
                     ` (4 subsequent siblings)
  10 siblings, 0 replies; 25+ messages in thread
From: Christian König @ 2019-09-04 15:02 UTC (permalink / raw)
  To: amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW

For handling PD/PT clears directly in the fault handler.

Signed-off-by: Christian König <christian.koenig@amd.com>
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 17 +++++++++++------
 1 file changed, 11 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
index c096903370aa..e3c11bd1ccee 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
@@ -695,6 +695,7 @@ bool amdgpu_vm_ready(struct amdgpu_vm *vm)
  * @adev: amdgpu_device pointer
  * @vm: VM to clear BO from
  * @bo: BO to clear
+ * @direct: use a direct update
  *
  * Root PD needs to be reserved when calling this.
  *
@@ -703,7 +704,8 @@ bool amdgpu_vm_ready(struct amdgpu_vm *vm)
  */
 static int amdgpu_vm_clear_bo(struct amdgpu_device *adev,
 			      struct amdgpu_vm *vm,
-			      struct amdgpu_bo *bo)
+			      struct amdgpu_bo *bo,
+			      bool direct)
 {
 	struct ttm_operation_ctx ctx = { true, false };
 	unsigned level = adev->vm_manager.root_level;
@@ -762,6 +764,7 @@ static int amdgpu_vm_clear_bo(struct amdgpu_device *adev,
 	memset(&params, 0, sizeof(params));
 	params.adev = adev;
 	params.vm = vm;
+	params.direct = direct;
 
 	r = vm->update_funcs->prepare(&params, AMDGPU_FENCE_OWNER_KFD, NULL);
 	if (r)
@@ -852,7 +855,8 @@ static void amdgpu_vm_bo_param(struct amdgpu_device *adev, struct amdgpu_vm *vm,
  */
 static int amdgpu_vm_alloc_pts(struct amdgpu_device *adev,
 			       struct amdgpu_vm *vm,
-			       struct amdgpu_vm_pt_cursor *cursor)
+			       struct amdgpu_vm_pt_cursor *cursor,
+			       bool direct)
 {
 	struct amdgpu_vm_pt *entry = cursor->entry;
 	struct amdgpu_bo_param bp;
@@ -885,7 +889,7 @@ static int amdgpu_vm_alloc_pts(struct amdgpu_device *adev,
 	pt->parent = amdgpu_bo_ref(cursor->parent->base.bo);
 	amdgpu_vm_bo_base_init(&entry->base, vm, pt);
 
-	r = amdgpu_vm_clear_bo(adev, vm, pt);
+	r = amdgpu_vm_clear_bo(adev, vm, pt, direct);
 	if (r)
 		goto error_free_pt;
 
@@ -1395,7 +1399,8 @@ static int amdgpu_vm_update_ptes(struct amdgpu_vm_update_params *params,
 		uint64_t incr, entry_end, pe_start;
 		struct amdgpu_bo *pt;
 
-		r = amdgpu_vm_alloc_pts(params->adev, params->vm, &cursor);
+		r = amdgpu_vm_alloc_pts(params->adev, params->vm, &cursor,
+					params->direct);
 		if (r)
 			return r;
 
@@ -2742,7 +2747,7 @@ int amdgpu_vm_init(struct amdgpu_device *adev, struct amdgpu_vm *vm,
 
 	amdgpu_vm_bo_base_init(&vm->root.base, vm, root);
 
-	r = amdgpu_vm_clear_bo(adev, vm, root);
+	r = amdgpu_vm_clear_bo(adev, vm, root, false);
 	if (r)
 		goto error_unreserve;
 
@@ -2865,7 +2870,7 @@ int amdgpu_vm_make_compute(struct amdgpu_device *adev, struct amdgpu_vm *vm, uns
 	 */
 	if (pte_support_ats != vm->pte_support_ats) {
 		vm->pte_support_ats = pte_support_ats;
-		r = amdgpu_vm_clear_bo(adev, vm, vm->root.base.bo);
+		r = amdgpu_vm_clear_bo(adev, vm, vm->root.base.bo, false);
 		if (r)
 			goto free_idr;
 	}
-- 
2.17.1


* [PATCH 7/9] drm/amdgpu: allocate PDs/PTs with no_gpu_wait in a page fault
       [not found] ` <20190904150230.13885-1-christian.koenig-5C7GfCeVMHo@public.gmane.org>
                     ` (5 preceding siblings ...)
  2019-09-04 15:02   ` [PATCH 6/9] drm/amdgpu: allow direct submission of clears Christian König
@ 2019-09-04 15:02   ` Christian König
  2019-09-04 15:02   ` [PATCH 8/9] drm/amdgpu: reserve the root PD while freeing PASIDs Christian König
                     ` (3 subsequent siblings)
  10 siblings, 0 replies; 25+ messages in thread
From: Christian König @ 2019-09-04 15:02 UTC (permalink / raw)
  To: amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW

While handling a page fault we can't wait for other ongoing GPU
operations or we can potentially run into deadlocks.

Signed-off-by: Christian König <christian.koenig@amd.com>
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_object.c | 2 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_object.h | 1 +
 drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c     | 8 +++++---
 3 files changed, 7 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
index 04767905f004..510d04fd6e5f 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
@@ -451,7 +451,7 @@ static int amdgpu_bo_do_create(struct amdgpu_device *adev,
 {
 	struct ttm_operation_ctx ctx = {
 		.interruptible = (bp->type != ttm_bo_type_kernel),
-		.no_wait_gpu = false,
+		.no_wait_gpu = bp->no_wait_gpu,
 		.resv = bp->resv,
 		.flags = TTM_OPT_FLAG_ALLOW_RES_EVICT
 	};
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.h
index 5a3c1779e200..e6ddd048a984 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.h
@@ -41,6 +41,7 @@ struct amdgpu_bo_param {
 	u32				preferred_domain;
 	u64				flags;
 	enum ttm_bo_type		type;
+	bool				no_wait_gpu;
 	struct reservation_object	*resv;
 };
 
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
index e3c11bd1ccee..4c4c348b2752 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
@@ -821,7 +821,8 @@ static int amdgpu_vm_clear_bo(struct amdgpu_device *adev,
  * @bp: resulting BO allocation parameters
  */
 static void amdgpu_vm_bo_param(struct amdgpu_device *adev, struct amdgpu_vm *vm,
-			       int level, struct amdgpu_bo_param *bp)
+			       int level, bool direct,
+			       struct amdgpu_bo_param *bp)
 {
 	memset(bp, 0, sizeof(*bp));
 
@@ -836,6 +837,7 @@ static void amdgpu_vm_bo_param(struct amdgpu_device *adev, struct amdgpu_vm *vm,
 	else if (!vm->root.base.bo || vm->root.base.bo->shadow)
 		bp->flags |= AMDGPU_GEM_CREATE_SHADOW;
 	bp->type = ttm_bo_type_kernel;
+	bp->no_wait_gpu = direct;
 	if (vm->root.base.bo)
 		bp->resv = vm->root.base.bo->tbo.resv;
 }
@@ -877,7 +879,7 @@ static int amdgpu_vm_alloc_pts(struct amdgpu_device *adev,
 	if (entry->base.bo)
 		return 0;
 
-	amdgpu_vm_bo_param(adev, vm, cursor->level, &bp);
+	amdgpu_vm_bo_param(adev, vm, cursor->level, direct, &bp);
 
 	r = amdgpu_bo_create(adev, &bp, &pt);
 	if (r)
@@ -2730,7 +2732,7 @@ int amdgpu_vm_init(struct amdgpu_device *adev, struct amdgpu_vm *vm,
 		vm->update_funcs = &amdgpu_vm_sdma_funcs;
 	vm->last_update = NULL;
 
-	amdgpu_vm_bo_param(adev, vm, adev->vm_manager.root_level, &bp);
+	amdgpu_vm_bo_param(adev, vm, adev->vm_manager.root_level, false, &bp);
 	if (vm_context == AMDGPU_VM_CONTEXT_COMPUTE)
 		bp.flags &= ~AMDGPU_GEM_CREATE_SHADOW;
 	r = amdgpu_bo_create(adev, &bp, &root);
-- 
2.17.1


* [PATCH 8/9] drm/amdgpu: reserve the root PD while freeing PASIDs
       [not found] ` <20190904150230.13885-1-christian.koenig-5C7GfCeVMHo@public.gmane.org>
                     ` (6 preceding siblings ...)
  2019-09-04 15:02   ` [PATCH 7/9] drm/amdgpu: allocate PDs/PTs with no_gpu_wait in a page fault Christian König
@ 2019-09-04 15:02   ` Christian König
  2019-09-04 15:02   ` [PATCH 9/9] drm/amdgpu: add graceful VM fault handling v2 Christian König
                     ` (2 subsequent siblings)
  10 siblings, 0 replies; 25+ messages in thread
From: Christian König @ 2019-09-04 15:02 UTC (permalink / raw)
  To: amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW

Free the pasid only while the root PD is reserved. This prevents use after
free in the page fault handling.

Signed-off-by: Christian König <christian.koenig@amd.com>
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 20 +++++++++-----------
 1 file changed, 9 insertions(+), 11 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
index 4c4c348b2752..951608fc1925 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
@@ -2961,18 +2961,26 @@ void amdgpu_vm_fini(struct amdgpu_device *adev, struct amdgpu_vm *vm)
 	struct amdgpu_bo_va_mapping *mapping, *tmp;
 	bool prt_fini_needed = !!adev->gmc.gmc_funcs->set_prt;
 	struct amdgpu_bo *root;
-	int i, r;
+	int i;
 
 	amdgpu_amdkfd_gpuvm_destroy_cb(adev, vm);
 
+	root = amdgpu_bo_ref(vm->root.base.bo);
+	amdgpu_bo_reserve(root, true);
 	if (vm->pasid) {
 		unsigned long flags;
 
 		spin_lock_irqsave(&adev->vm_manager.pasid_lock, flags);
 		idr_remove(&adev->vm_manager.pasid_idr, vm->pasid);
 		spin_unlock_irqrestore(&adev->vm_manager.pasid_lock, flags);
+		vm->pasid = 0;
 	}
 
+	amdgpu_vm_free_pts(adev, vm, NULL);
+	amdgpu_bo_unreserve(root);
+	amdgpu_bo_unref(&root);
+	WARN_ON(vm->root.base.bo);
+
 	drm_sched_entity_destroy(&vm->direct);
 	drm_sched_entity_destroy(&vm->delayed);
 
@@ -2997,16 +3005,6 @@ void amdgpu_vm_fini(struct amdgpu_device *adev, struct amdgpu_vm *vm)
 		amdgpu_vm_free_mapping(adev, vm, mapping, NULL);
 	}
 
-	root = amdgpu_bo_ref(vm->root.base.bo);
-	r = amdgpu_bo_reserve(root, true);
-	if (r) {
-		dev_err(adev->dev, "Leaking page tables because BO reservation failed\n");
-	} else {
-		amdgpu_vm_free_pts(adev, vm, NULL);
-		amdgpu_bo_unreserve(root);
-	}
-	amdgpu_bo_unref(&root);
-	WARN_ON(vm->root.base.bo);
 	dma_fence_put(vm->last_update);
 	for (i = 0; i < AMDGPU_MAX_VMHUBS; i++)
 		amdgpu_vmid_free_reserved(adev, vm, i);
-- 
2.17.1


* [PATCH 9/9] drm/amdgpu: add graceful VM fault handling v2
       [not found] ` <20190904150230.13885-1-christian.koenig-5C7GfCeVMHo@public.gmane.org>
                     ` (7 preceding siblings ...)
  2019-09-04 15:02   ` [PATCH 8/9] drm/amdgpu: reserve the root PD while freeing PASIDs Christian König
@ 2019-09-04 15:02   ` Christian König
       [not found]     ` <20190904150230.13885-10-christian.koenig-5C7GfCeVMHo@public.gmane.org>
  2019-09-04 22:52   ` Graceful page fault handling for Vega/Navi Kuehling, Felix
  2019-09-04 23:03   ` Huang, Ray
  10 siblings, 1 reply; 25+ messages in thread
From: Christian König @ 2019-09-04 15:02 UTC (permalink / raw)
  To: amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW

Next step towards HMM support. For now just silence the retry fault and
optionally redirect the request to the dummy page.

v2: make sure the VM is not destroyed while we handle the fault.

Signed-off-by: Christian König <christian.koenig@amd.com>
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 74 ++++++++++++++++++++++++++
 drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h |  2 +
 drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c  |  4 ++
 3 files changed, 80 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
index 951608fc1925..410d89966a66 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
@@ -3142,3 +3142,77 @@ void amdgpu_vm_set_task_info(struct amdgpu_vm *vm)
 		}
 	}
 }
+
+/**
+ * amdgpu_vm_handle_fault - graceful handling of VM faults.
+ * @adev: amdgpu device pointer
+ * @pasid: PASID of the VM
+ * @addr: Address of the fault
+ *
+ * Try to gracefully handle a VM fault. Return true if the fault was handled and
+ * shouldn't be reported any more.
+ */
+bool amdgpu_vm_handle_fault(struct amdgpu_device *adev, unsigned int pasid,
+			    uint64_t addr)
+{
+	struct amdgpu_ring *ring = &adev->sdma.instance[0].page;
+	struct amdgpu_bo *root;
+	uint64_t value, flags;
+	struct amdgpu_vm *vm;
+	long r;
+
+	if (!ring->sched.ready)
+		return false;
+
+	spin_lock(&adev->vm_manager.pasid_lock);
+	vm = idr_find(&adev->vm_manager.pasid_idr, pasid);
+	if (vm)
+		root = amdgpu_bo_ref(vm->root.base.bo);
+	else
+		root = NULL;
+	spin_unlock(&adev->vm_manager.pasid_lock);
+
+	if (!root)
+		return false;
+
+	r = amdgpu_bo_reserve(root, true);
+	if (r)
+		goto error_unref;
+
+	spin_lock(&adev->vm_manager.pasid_lock);
+	vm = idr_find(&adev->vm_manager.pasid_idr, pasid);
+	spin_unlock(&adev->vm_manager.pasid_lock);
+
+	if (!vm || vm->root.base.bo != root)
+		goto error_unlock;
+
+	addr /= AMDGPU_GPU_PAGE_SIZE;
+	flags = AMDGPU_PTE_VALID | AMDGPU_PTE_SNOOPED |
+		AMDGPU_PTE_SYSTEM;
+
+	if (amdgpu_vm_fault_stop == AMDGPU_VM_FAULT_STOP_NEVER) {
+		/* Redirect the access to the dummy page */
+		value = adev->dummy_page_addr;
+		flags |= AMDGPU_PTE_EXECUTABLE | AMDGPU_PTE_READABLE |
+			AMDGPU_PTE_WRITEABLE;
+	} else {
+		value = 0;
+	}
+
+	r = amdgpu_vm_bo_update_mapping(adev, vm, true, NULL, addr, addr + 1,
+					flags, value, NULL, NULL);
+	if (r)
+		goto error_unlock;
+
+	r = amdgpu_vm_update_pdes(adev, vm, true);
+
+error_unlock:
+	amdgpu_bo_unreserve(root);
+	if (r < 0)
+		DRM_ERROR("Can't handle page fault (%ld)\n", r);
+
+error_unref:
+	amdgpu_bo_unref(&root);
+
+	return false;
+}
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h
index 0a97dc839f3b..4dbbe1b6b413 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h
@@ -413,6 +413,8 @@ void amdgpu_vm_check_compute_bug(struct amdgpu_device *adev);
 
 void amdgpu_vm_get_task_info(struct amdgpu_device *adev, unsigned int pasid,
 			     struct amdgpu_task_info *task_info);
+bool amdgpu_vm_handle_fault(struct amdgpu_device *adev, unsigned int pasid,
+			    uint64_t addr);
 
 void amdgpu_vm_set_task_info(struct amdgpu_vm *vm);
 
diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c b/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c
index 9d15679df6e0..15a1ce51befa 100644
--- a/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c
@@ -353,6 +353,10 @@ static int gmc_v9_0_process_interrupt(struct amdgpu_device *adev,
 	}
 
 	/* If it's the first fault for this address, process it normally */
+	if (retry_fault && !in_interrupt() &&
+	    amdgpu_vm_handle_fault(adev, entry->pasid, addr))
+		return 1; /* This also prevents sending it to KFD */
+
 	if (!amdgpu_sriov_vf(adev)) {
 		/*
 		 * Issue a dummy read to wait for the status register to
-- 
2.17.1


* Re: [PATCH 9/9] drm/amdgpu: add graceful VM fault handling v2
       [not found]     ` <20190904150230.13885-10-christian.koenig-5C7GfCeVMHo@public.gmane.org>
@ 2019-09-04 20:12       ` Yang, Philip
       [not found]         ` <844c2c90-8be4-df05-9df9-c9c9cde9b186-5C7GfCeVMHo@public.gmane.org>
  2019-09-04 22:47       ` Kuehling, Felix
  1 sibling, 1 reply; 25+ messages in thread
From: Yang, Philip @ 2019-09-04 20:12 UTC (permalink / raw)
  To: Christian König, amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW

This series looks nice and clear to me; two questions are embedded below.

Are we going to use a dedicated sdma page queue for the direct VM update path 
during a fault?

Thanks,
Philip

On 2019-09-04 11:02 a.m., Christian König wrote:
> Next step towards HMM support. For now just silence the retry fault and
> optionally redirect the request to the dummy page.
> 
> v2: make sure the VM is not destroyed while we handle the fault.
> 
> Signed-off-by: Christian König <christian.koenig@amd.com>
> ---
>   drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 74 ++++++++++++++++++++++++++
>   drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h |  2 +
>   drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c  |  4 ++
>   3 files changed, 80 insertions(+)
> 
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
> index 951608fc1925..410d89966a66 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
> @@ -3142,3 +3142,77 @@ void amdgpu_vm_set_task_info(struct amdgpu_vm *vm)
>   		}
>   	}
>   }
> +
> +/**
> + * amdgpu_vm_handle_fault - graceful handling of VM faults.
> + * @adev: amdgpu device pointer
> + * @pasid: PASID of the VM
> + * @addr: Address of the fault
> + *
> + * Try to gracefully handle a VM fault. Return true if the fault was handled and
> + * shouldn't be reported any more.
> + */
> +bool amdgpu_vm_handle_fault(struct amdgpu_device *adev, unsigned int pasid,
> +			    uint64_t addr)
> +{
> +	struct amdgpu_ring *ring = &adev->sdma.instance[0].page;
> +	struct amdgpu_bo *root;
> +	uint64_t value, flags;
> +	struct amdgpu_vm *vm;
> +	long r;
> +
> +	if (!ring->sched.ready)
> +		return false;
> +
> +	spin_lock(&adev->vm_manager.pasid_lock);
> +	vm = idr_find(&adev->vm_manager.pasid_idr, pasid);
> +	if (vm)
> +		root = amdgpu_bo_ref(vm->root.base.bo);
> +	else
> +		root = NULL;
> +	spin_unlock(&adev->vm_manager.pasid_lock);
> +
> +	if (!root)
> +		return false;
> +
> +	r = amdgpu_bo_reserve(root, true);
> +	if (r)
> +		goto error_unref;
> +
> +	spin_lock(&adev->vm_manager.pasid_lock);
> +	vm = idr_find(&adev->vm_manager.pasid_idr, pasid);
> +	spin_unlock(&adev->vm_manager.pasid_lock);
> +
Here we get the vm from the pasid a second time and check whether the PD bo 
has changed. Is this to handle a vm fault racing with vm destroy?

> +	if (!vm || vm->root.base.bo != root)
> +		goto error_unlock;
> +
> +	addr /= AMDGPU_GPU_PAGE_SIZE;
> +	flags = AMDGPU_PTE_VALID | AMDGPU_PTE_SNOOPED |
> +		AMDGPU_PTE_SYSTEM;
> +
> +	if (amdgpu_vm_fault_stop == AMDGPU_VM_FAULT_STOP_NEVER) {
> +		/* Redirect the access to the dummy page */
> +		value = adev->dummy_page_addr;
> +		flags |= AMDGPU_PTE_EXECUTABLE | AMDGPU_PTE_READABLE |
> +			AMDGPU_PTE_WRITEABLE;
> +	} else {
> +		value = 0;
> +	}
> +
> +	r = amdgpu_vm_bo_update_mapping(adev, vm, true, NULL, addr, addr + 1,
> +					flags, value, NULL, NULL);
> +	if (r)
> +		goto error_unlock;
> +
After the faulting address is redirected to the dummy page, will the fault 
recover and the retried access continue to execute? Is it dangerous to update 
the PTE to point at system memory address 0?

> +	r = amdgpu_vm_update_pdes(adev, vm, true);
> +
> +error_unlock:
> +	amdgpu_bo_unreserve(root);
> +	if (r < 0)
> +		DRM_ERROR("Can't handle page fault (%ld)\n", r);
> +
> +error_unref:
> +	amdgpu_bo_unref(&root);
> +
> +	return false;
> +}
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h
> index 0a97dc839f3b..4dbbe1b6b413 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h
> @@ -413,6 +413,8 @@ void amdgpu_vm_check_compute_bug(struct amdgpu_device *adev);
>   
>   void amdgpu_vm_get_task_info(struct amdgpu_device *adev, unsigned int pasid,
>   			     struct amdgpu_task_info *task_info);
> +bool amdgpu_vm_handle_fault(struct amdgpu_device *adev, unsigned int pasid,
> +			    uint64_t addr);
>   
>   void amdgpu_vm_set_task_info(struct amdgpu_vm *vm);
>   
> diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c b/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c
> index 9d15679df6e0..15a1ce51befa 100644
> --- a/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c
> +++ b/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c
> @@ -353,6 +353,10 @@ static int gmc_v9_0_process_interrupt(struct amdgpu_device *adev,
>   	}
>   
>   	/* If it's the first fault for this address, process it normally */
> +	if (retry_fault && !in_interrupt() &&
> +	    amdgpu_vm_handle_fault(adev, entry->pasid, addr))
> +		return 1; /* This also prevents sending it to KFD */
> +
>   	if (!amdgpu_sriov_vf(adev)) {
>   		/*
>   		 * Issue a dummy read to wait for the status register to
> 

* Re: [PATCH 9/9] drm/amdgpu: add graceful VM fault handling v2
       [not found]     ` <20190904150230.13885-10-christian.koenig-5C7GfCeVMHo@public.gmane.org>
  2019-09-04 20:12       ` Yang, Philip
@ 2019-09-04 22:47       ` Kuehling, Felix
       [not found]         ` <4b5365d3-cc8c-1c2a-2675-f74baa4b9e8b-5C7GfCeVMHo@public.gmane.org>
  1 sibling, 1 reply; 25+ messages in thread
From: Kuehling, Felix @ 2019-09-04 22:47 UTC (permalink / raw)
  To: Christian König, amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW

On 2019-09-04 11:02 a.m., Christian König wrote:
> Next step towards HMM support. For now just silence the retry fault and
> optionally redirect the request to the dummy page.
>
> v2: make sure the VM is not destroyed while we handle the fault.
>
> Signed-off-by: Christian König <christian.koenig@amd.com>
> ---
>   drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 74 ++++++++++++++++++++++++++
>   drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h |  2 +
>   drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c  |  4 ++
>   3 files changed, 80 insertions(+)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
> index 951608fc1925..410d89966a66 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
> @@ -3142,3 +3142,77 @@ void amdgpu_vm_set_task_info(struct amdgpu_vm *vm)
>   		}
>   	}
>   }
> +
> +/**
> + * amdgpu_vm_handle_fault - graceful handling of VM faults.
> + * @adev: amdgpu device pointer
> + * @pasid: PASID of the VM
> + * @addr: Address of the fault
> + *
> + * Try to gracefully handle a VM fault. Return true if the fault was handled and
> + * shouldn't be reported any more.
> + */
> +bool amdgpu_vm_handle_fault(struct amdgpu_device *adev, unsigned int pasid,
> +			    uint64_t addr)
> +{
> +	struct amdgpu_ring *ring = &adev->sdma.instance[0].page;
> +	struct amdgpu_bo *root;
> +	uint64_t value, flags;
> +	struct amdgpu_vm *vm;
> +	long r;
> +
> +	if (!ring->sched.ready)
> +		return false;
> +
> +	spin_lock(&adev->vm_manager.pasid_lock);
> +	vm = idr_find(&adev->vm_manager.pasid_idr, pasid);
> +	if (vm)
> +		root = amdgpu_bo_ref(vm->root.base.bo);
> +	else
> +		root = NULL;
> +	spin_unlock(&adev->vm_manager.pasid_lock);
> +
> +	if (!root)
> +		return false;
> +
> +	r = amdgpu_bo_reserve(root, true);
> +	if (r)
> +		goto error_unref;
> +
> +	spin_lock(&adev->vm_manager.pasid_lock);
> +	vm = idr_find(&adev->vm_manager.pasid_idr, pasid);
> +	spin_unlock(&adev->vm_manager.pasid_lock);

I think this deserves a comment. If I understand it correctly, you're 
looking up the vm twice so that you have the VM root reservation to 
protect against use-after-free. Otherwise the vm pointer is only valid 
as long as you're holding the spin-lock.


> +
> +	if (!vm || vm->root.base.bo != root)

The check of vm->root.base.bo should probably still be under the 
spin_lock. Because you're not sure yet it's the right VM, you can't rely 
on the reservation here to prevent use-after-free.
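
A minimal sketch of that suggestion (keeping the re-check under the lock):

	spin_lock(&adev->vm_manager.pasid_lock);
	vm = idr_find(&adev->vm_manager.pasid_idr, pasid);
	if (vm && vm->root.base.bo != root)
		vm = NULL;
	spin_unlock(&adev->vm_manager.pasid_lock);

	if (!vm)
		goto error_unlock;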


> +		goto error_unlock;
> +
> +	addr /= AMDGPU_GPU_PAGE_SIZE;
> +	flags = AMDGPU_PTE_VALID | AMDGPU_PTE_SNOOPED |
> +		AMDGPU_PTE_SYSTEM;
> +
> +	if (amdgpu_vm_fault_stop == AMDGPU_VM_FAULT_STOP_NEVER) {
> +		/* Redirect the access to the dummy page */
> +		value = adev->dummy_page_addr;
> +		flags |= AMDGPU_PTE_EXECUTABLE | AMDGPU_PTE_READABLE |
> +			AMDGPU_PTE_WRITEABLE;
> +	} else {
> +		value = 0;
> +	}
> +
> +	r = amdgpu_vm_bo_update_mapping(adev, vm, true, NULL, addr, addr + 1,
> +					flags, value, NULL, NULL);
> +	if (r)
> +		goto error_unlock;
> +
> +	r = amdgpu_vm_update_pdes(adev, vm, true);
> +
> +error_unlock:
> +	amdgpu_bo_unreserve(root);
> +	if (r < 0)
> +		DRM_ERROR("Can't handle page fault (%ld)\n", r);
> +
> +error_unref:
> +	amdgpu_bo_unref(&root);
> +
> +	return false;
> +}
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h
> index 0a97dc839f3b..4dbbe1b6b413 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h
> @@ -413,6 +413,8 @@ void amdgpu_vm_check_compute_bug(struct amdgpu_device *adev);
>   
>   void amdgpu_vm_get_task_info(struct amdgpu_device *adev, unsigned int pasid,
>   			     struct amdgpu_task_info *task_info);
> +bool amdgpu_vm_handle_fault(struct amdgpu_device *adev, unsigned int pasid,
> +			    uint64_t addr);
>   
>   void amdgpu_vm_set_task_info(struct amdgpu_vm *vm);
>   
> diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c b/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c
> index 9d15679df6e0..15a1ce51befa 100644
> --- a/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c
> +++ b/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c
> @@ -353,6 +353,10 @@ static int gmc_v9_0_process_interrupt(struct amdgpu_device *adev,
>   	}
>   
>   	/* If it's the first fault for this address, process it normally */
> +	if (retry_fault && !in_interrupt() &&
> +	    amdgpu_vm_handle_fault(adev, entry->pasid, addr))
> +		return 1; /* This also prevents sending it to KFD */

The !in_interrupt() is meant to only do this on the rerouted interrupt 
ring that's handled by a worker function?

Looks like amdgpu_vm_handle_fault never returns true for now. So we'll 
never get to the "return 1" here.

Regards,
   Felix


> +
>   	if (!amdgpu_sriov_vf(adev)) {
>   		/*
>   		 * Issue a dummy read to wait for the status register to

* Re: [PATCH 4/9] drm/amdgpu: allow direct submission of PDE updates.
       [not found]     ` <20190904150230.13885-5-christian.koenig-5C7GfCeVMHo@public.gmane.org>
@ 2019-09-04 22:48       ` Kuehling, Felix
  0 siblings, 0 replies; 25+ messages in thread
From: Kuehling, Felix @ 2019-09-04 22:48 UTC (permalink / raw)
  To: Christian König, amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW

On 2019-09-04 11:02 a.m., Christian König wrote:
> For handling PDE updates directly in the fault handler.
>
> Signed-off-by: Christian König <christian.koenig@amd.com>
> ---
>   drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c | 2 +-
>   drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c           | 2 +-
>   drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c          | 2 +-
>   drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c           | 8 +++++---
>   drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h           | 4 ++--
>   5 files changed, 10 insertions(+), 8 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
> index b0f0e060ded6..d3942d9306c4 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
> @@ -343,7 +343,7 @@ static int vm_update_pds(struct amdgpu_vm *vm, struct amdgpu_sync *sync)
>   	struct amdgpu_device *adev = amdgpu_ttm_adev(pd->tbo.bdev);
>   	int ret;
>   
> -	ret = amdgpu_vm_update_directories(adev, vm);
> +	ret = amdgpu_vm_update_pdes(adev, vm, false);
>   	if (ret)
>   		return ret;
>   
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
> index 51f3db08b8eb..bd6b88827447 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
> @@ -845,7 +845,7 @@ static int amdgpu_cs_vm_handling(struct amdgpu_cs_parser *p)
>   	if (r)
>   		return r;
>   
> -	r = amdgpu_vm_update_directories(adev, vm);
> +	r = amdgpu_vm_update_pdes(adev, vm, false);
>   	if (r)
>   		return r;
>   
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
> index e7af35c7080d..a621e629d876 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
> @@ -521,7 +521,7 @@ static void amdgpu_gem_va_update_vm(struct amdgpu_device *adev,
>   			goto error;
>   	}
>   
> -	r = amdgpu_vm_update_directories(adev, vm);
> +	r = amdgpu_vm_update_pdes(adev, vm, false);
>   
>   error:
>   	if (r && r != -ERESTARTSYS)
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
> index fc103a9f20c5..b6c89ba9281c 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
> @@ -1221,18 +1221,19 @@ static void amdgpu_vm_invalidate_pds(struct amdgpu_device *adev,
>   }
>   
>   /*
> - * amdgpu_vm_update_directories - make sure that all directories are valid
> + * amdgpu_vm_update_ - make sure that all directories are valid

Typo.

Regards,
   Felix


>    *
>    * @adev: amdgpu_device pointer
>    * @vm: requested vm
> + * @direct: submit directly to the paging queue
>    *
>    * Makes sure all directories are up to date.
>    *
>    * Returns:
>    * 0 for success, error for failure.
>    */
> -int amdgpu_vm_update_directories(struct amdgpu_device *adev,
> -				 struct amdgpu_vm *vm)
> +int amdgpu_vm_update_pdes(struct amdgpu_device *adev,
> +			  struct amdgpu_vm *vm, bool direct)
>   {
>   	struct amdgpu_vm_update_params params;
>   	int r;
> @@ -1243,6 +1244,7 @@ int amdgpu_vm_update_directories(struct amdgpu_device *adev,
>   	memset(&params, 0, sizeof(params));
>   	params.adev = adev;
>   	params.vm = vm;
> +	params.direct = direct;
>   
>   	r = vm->update_funcs->prepare(&params, AMDGPU_FENCE_OWNER_VM, NULL);
>   	if (r)
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h
> index 54dcd0bcce1a..0a97dc839f3b 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h
> @@ -366,8 +366,8 @@ int amdgpu_vm_validate_pt_bos(struct amdgpu_device *adev, struct amdgpu_vm *vm,
>   			      int (*callback)(void *p, struct amdgpu_bo *bo),
>   			      void *param);
>   int amdgpu_vm_flush(struct amdgpu_ring *ring, struct amdgpu_job *job, bool need_pipe_sync);
> -int amdgpu_vm_update_directories(struct amdgpu_device *adev,
> -				 struct amdgpu_vm *vm);
> +int amdgpu_vm_update_pdes(struct amdgpu_device *adev,
> +			  struct amdgpu_vm *vm, bool direct);
>   int amdgpu_vm_clear_freed(struct amdgpu_device *adev,
>   			  struct amdgpu_vm *vm,
>   			  struct dma_fence **fence);

* Re: Graceful page fault handling for Vega/Navi
       [not found] ` <20190904150230.13885-1-christian.koenig-5C7GfCeVMHo@public.gmane.org>
                     ` (8 preceding siblings ...)
  2019-09-04 15:02   ` [PATCH 9/9] drm/amdgpu: add graceful VM fault handling v2 Christian König
@ 2019-09-04 22:52   ` Kuehling, Felix
       [not found]     ` <03e2274c-a0dd-41f4-c5e0-26e371d01d23-5C7GfCeVMHo@public.gmane.org>
  2019-09-04 23:03   ` Huang, Ray
  10 siblings, 1 reply; 25+ messages in thread
From: Kuehling, Felix @ 2019-09-04 22:52 UTC (permalink / raw)
  To: Christian König, amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW


On 2019-09-04 11:02 a.m., Christian König wrote:
> Hi everyone,
>
> this series is the next puzzle piece for recoverable page fault handling on Vega and Navi.
>
> It adds a new direct scheduler entity for VM updates which is then used to update page tables during a fault.
>
> In other words previously an application doing an invalid memory access would just hang and/or repeat the invalid access over and over again. Now the handling is modified so that the invalid memory access is redirected to the dummy page.
>
> This needs the following prerequisites:
> a) The firmware must be new enough so allow re-routing of page faults.
> b) Fault retry must be enabled using the amdgpu.noretry=0 parameter.
> c) Enough free VRAM to allocate page tables to point to the dummy page.
>
> The re-routing of page faults current only works on Vega10, so Vega20 and Navi will still need some more time.

Wait, we don't do the page fault rerouting on Vega20 yet? So we're 
getting the full brunt of the fault storm on the main interrupt ring? In 
that case, we should probably change the default setting to 
amdgpu.noretry=1 at least until that's done.

Other than that the patch series looks reasonable to me. I commented on 
patches 4 and 9 separately.

Patch 1 is Acked-by: Felix Kuehling <Felix.Kuehling@amd.com>

With the issues addressed that I pointed out, the rest is

Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>

Regards,
   Felix


> Please review and/or comment,
> Christian.
>
>
> _______________________________________________
> amd-gfx mailing list
> amd-gfx@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/amd-gfx
_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: Graceful page fault handling for Vega/Navi
       [not found] ` <20190904150230.13885-1-christian.koenig-5C7GfCeVMHo@public.gmane.org>
                     ` (9 preceding siblings ...)
  2019-09-04 22:52   ` Graceful page fault handling for Vega/Navi Kuehling, Felix
@ 2019-09-04 23:03   ` Huang, Ray
       [not found]     ` <MN2PR12MB3309C53565E6D4E2E9FE23C7ECB80-rweVpJHSKTpWdvXm18W95QdYzm3356FpvxpqHgZTriW3zl9H0oFU5g@public.gmane.org>
  10 siblings, 1 reply; 25+ messages in thread
From: Huang, Ray @ 2019-09-04 23:03 UTC (permalink / raw)
  To: Christian König; +Cc: amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW

On Wed, Sep 04, 2019 at 05:02:21PM +0200, Christian König wrote:
> Hi everyone,
> 
> this series is the next puzzle piece for recoverable page fault handling on Vega and Navi.
> 
> It adds a new direct scheduler entity for VM updates which is then used to update page tables during a fault.
> 
> In other words previously an application doing an invalid memory access would just hang and/or repeat the invalid access over and over again. Now the handling is modified so that the invalid memory access is redirected to the dummy page.
> 
> This needs the following prerequisites:
> a) The firmware must be new enough so allow re-routing of page faults.
> b) Fault retry must be enabled using the amdgpu.noretry=0 parameter.

On my side, I found the "noretry" parameter does not work for vmid 0 vm faults.
If you see the same on your side, I'd like to take a closer look.

Thanks,
Ray


> c) Enough free VRAM to allocate page tables to point to the dummy page.
> 
> The re-routing of page faults current only works on Vega10, so Vega20 and Navi will still need some more time.
> 
> Please review and/or comment,
> Christian.
> 
> 
> _______________________________________________
> amd-gfx mailing list
> amd-gfx@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/amd-gfx
_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: Graceful page fault handling for Vega/Navi
       [not found]     ` <MN2PR12MB3309C53565E6D4E2E9FE23C7ECB80-rweVpJHSKTpWdvXm18W95QdYzm3356FpvxpqHgZTriW3zl9H0oFU5g@public.gmane.org>
@ 2019-09-04 23:28       ` Kuehling, Felix
  0 siblings, 0 replies; 25+ messages in thread
From: Kuehling, Felix @ 2019-09-04 23:28 UTC (permalink / raw)
  To: Huang, Ray, Christian König; +Cc: amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW

On 2019-09-04 7:03 p.m., Huang, Ray wrote:
> On Wed, Sep 04, 2019 at 05:02:21PM +0200, Christian König wrote:
>> Hi everyone,
>>
>> this series is the next puzzle piece for recoverable page fault handling on Vega and Navi.
>>
>> It adds a new direct scheduler entity for VM updates which is then used to update page tables during a fault.
>>
>> In other words previously an application doing an invalid memory access would just hang and/or repeat the invalid access over and over again. Now the handling is modified so that the invalid memory access is redirected to the dummy page.
>>
>> This needs the following prerequisites:
>> a) The firmware must be new enough so allow re-routing of page faults.
>> b) Fault retry must be enabled using the amdgpu.noretry=0 parameter.
> On my side, I found the "noretry" parameter does not work for vmid 0 vm faults.
> If you see the same on your side, I'd like to take a closer look.

I think the noretry parameter is not meant to affect VMID0. I just find 
it surprising that retry faults are happening at all on VMID0. It 
doesn't make a lot of sense. I can't think of any good reason to retry 
any page faults in VMID0.

I see that the HW default for the RETRY_PERMISSION_OR_INVALID_PAGE_FAULT 
bit is 1 in VM_CONTEXT0_CNTL. I don't see us changing that value in the 
driver. We probably should. I'll send out a patch for that.
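
Roughly something like this in the hub setup (sketch only; the register
access helpers exist in the driver, but the exact field name for context 0
is an assumption based on the bit described above):

/* Hedged sketch: turn off retry for permission/invalid-page faults on
 * VMID0 so such faults get reported instead of retried. */
u32 tmp = RREG32_SOC15(GC, 0, mmVM_CONTEXT0_CNTL);

tmp = REG_SET_FIELD(tmp, VM_CONTEXT0_CNTL,
		    RETRY_PERMISSION_OR_INVALID_PAGE_FAULT_DEFAULT, 0);
WREG32_SOC15(GC, 0, mmVM_CONTEXT0_CNTL, tmp);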

Regards,
   Felix


>
> Thanks,
> Ray
>
>
>> c) Enough free VRAM to allocate page tables to point to the dummy page.
>>
>> The re-routing of page faults current only works on Vega10, so Vega20 and Navi will still need some more time.
>>
>> Please review and/or comment,
>> Christian.
>>
>>
>> _______________________________________________
>> amd-gfx mailing list
>> amd-gfx@lists.freedesktop.org
>> https://lists.freedesktop.org/mailman/listinfo/amd-gfx
> _______________________________________________
> amd-gfx mailing list
> amd-gfx@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/amd-gfx
_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 9/9] drm/amdgpu: add graceful VM fault handling v2
       [not found]         ` <844c2c90-8be4-df05-9df9-c9c9cde9b186-5C7GfCeVMHo@public.gmane.org>
@ 2019-09-09 12:03           ` Christian König
       [not found]             ` <db73b357-b7e4-f818-e57c-234a45f7c5cb-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
  0 siblings, 1 reply; 25+ messages in thread
From: Christian König @ 2019-09-09 12:03 UTC (permalink / raw)
  To: Yang, Philip, amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW

Am 04.09.19 um 22:12 schrieb Yang, Philip:
> This series looks nice and clear to me; two questions are embedded below.
>
> Are we going to use a dedicated sdma page queue for the direct VM update
> path during a fault?
>
> Thanks,
> Philip
>
> On 2019-09-04 11:02 a.m., Christian König wrote:
>> Next step towards HMM support. For now just silence the retry fault and
>> optionally redirect the request to the dummy page.
>>
>> v2: make sure the VM is not destroyed while we handle the fault.
>>
>> Signed-off-by: Christian König <christian.koenig@amd.com>
>> ---
>>    drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 74 ++++++++++++++++++++++++++
>>    drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h |  2 +
>>    drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c  |  4 ++
>>    3 files changed, 80 insertions(+)
>>
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
>> index 951608fc1925..410d89966a66 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
>> @@ -3142,3 +3142,77 @@ void amdgpu_vm_set_task_info(struct amdgpu_vm *vm)
>>    		}
>>    	}
>>    }
>> +
>> +/**
>> + * amdgpu_vm_handle_fault - graceful handling of VM faults.
>> + * @adev: amdgpu device pointer
>> + * @pasid: PASID of the VM
>> + * @addr: Address of the fault
>> + *
>> + * Try to gracefully handle a VM fault. Return true if the fault was handled and
>> + * shouldn't be reported any more.
>> + */
>> +bool amdgpu_vm_handle_fault(struct amdgpu_device *adev, unsigned int pasid,
>> +			    uint64_t addr)
>> +{
>> +	struct amdgpu_ring *ring = &adev->sdma.instance[0].page;
>> +	struct amdgpu_bo *root;
>> +	uint64_t value, flags;
>> +	struct amdgpu_vm *vm;
>> +	long r;
>> +
>> +	if (!ring->sched.ready)
>> +		return false;
>> +
>> +	spin_lock(&adev->vm_manager.pasid_lock);
>> +	vm = idr_find(&adev->vm_manager.pasid_idr, pasid);
>> +	if (vm)
>> +		root = amdgpu_bo_ref(vm->root.base.bo);
>> +	else
>> +		root = NULL;
>> +	spin_unlock(&adev->vm_manager.pasid_lock);
>> +
>> +	if (!root)
>> +		return false;
>> +
>> +	r = amdgpu_bo_reserve(root, true);
>> +	if (r)
>> +		goto error_unref;
>> +
>> +	spin_lock(&adev->vm_manager.pasid_lock);
>> +	vm = idr_find(&adev->vm_manager.pasid_idr, pasid);
>> +	spin_unlock(&adev->vm_manager.pasid_lock);
>> +
> Here we get the vm from the pasid a second time and check whether the PD
> bo has changed; is this to handle a vm fault racing with vm destroy?

Yes, exactly.

>
>> +	if (!vm || vm->root.base.bo != root)
>> +		goto error_unlock;
>> +
>> +	addr /= AMDGPU_GPU_PAGE_SIZE;
>> +	flags = AMDGPU_PTE_VALID | AMDGPU_PTE_SNOOPED |
>> +		AMDGPU_PTE_SYSTEM;
>> +
>> +	if (amdgpu_vm_fault_stop == AMDGPU_VM_FAULT_STOP_NEVER) {
>> +		/* Redirect the access to the dummy page */
>> +		value = adev->dummy_page_addr;
>> +		flags |= AMDGPU_PTE_EXECUTABLE | AMDGPU_PTE_READABLE |
>> +			AMDGPU_PTE_WRITEABLE;
>> +	} else {
>> +		value = 0;
>> +	}
>> +
>> +	r = amdgpu_vm_bo_update_mapping(adev, vm, true, NULL, addr, addr + 1,
>> +					flags, value, NULL, NULL);
>> +	if (r)
>> +		goto error_unlock;
>> +
> After the fault address is redirected to the dummy page, will the fault
> recover and the retry continue to execute?

Yes, the read/write operation will just retry and use the value from the 
dummy page instead.

> Is it dangerous to update the PTE to use system
> memory address 0?

What are you talking about? The dummy page is a page allocated by TTM 
to which we redirect faulty accesses.

Regards,
Christian.

>
>> +	r = amdgpu_vm_update_pdes(adev, vm, true);
>> +
>> +error_unlock:
>> +	amdgpu_bo_unreserve(root);
>> +	if (r < 0)
>> +		DRM_ERROR("Can't handle page fault (%ld)\n", r);
>> +
>> +error_unref:
>> +	amdgpu_bo_unref(&root);
>> +
>> +	return false;
>> +}
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h
>> index 0a97dc839f3b..4dbbe1b6b413 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h
>> @@ -413,6 +413,8 @@ void amdgpu_vm_check_compute_bug(struct amdgpu_device *adev);
>>    
>>    void amdgpu_vm_get_task_info(struct amdgpu_device *adev, unsigned int pasid,
>>    			     struct amdgpu_task_info *task_info);
>> +bool amdgpu_vm_handle_fault(struct amdgpu_device *adev, unsigned int pasid,
>> +			    uint64_t addr);
>>    
>>    void amdgpu_vm_set_task_info(struct amdgpu_vm *vm);
>>    
>> diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c b/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c
>> index 9d15679df6e0..15a1ce51befa 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c
>> @@ -353,6 +353,10 @@ static int gmc_v9_0_process_interrupt(struct amdgpu_device *adev,
>>    	}
>>    
>>    	/* If it's the first fault for this address, process it normally */
>> +	if (retry_fault && !in_interrupt() &&
>> +	    amdgpu_vm_handle_fault(adev, entry->pasid, addr))
>> +		return 1; /* This also prevents sending it to KFD */
>> +
>>    	if (!amdgpu_sriov_vf(adev)) {
>>    		/*
>>    		 * Issue a dummy read to wait for the status register to
>>

_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 9/9] drm/amdgpu: add graceful VM fault handling v2
       [not found]         ` <4b5365d3-cc8c-1c2a-2675-f74baa4b9e8b-5C7GfCeVMHo@public.gmane.org>
@ 2019-09-09 12:07           ` Christian König
       [not found]             ` <3051c690-fabb-953a-3ae6-2cccfe8a502c-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
  0 siblings, 1 reply; 25+ messages in thread
From: Christian König @ 2019-09-09 12:07 UTC (permalink / raw)
  To: Kuehling, Felix, amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW

Am 05.09.19 um 00:47 schrieb Kuehling, Felix:
> On 2019-09-04 11:02 a.m., Christian König wrote:
>> Next step towards HMM support. For now just silence the retry fault and
>> optionally redirect the request to the dummy page.
>>
>> v2: make sure the VM is not destroyed while we handle the fault.
>>
>> Signed-off-by: Christian König <christian.koenig@amd.com>
>> ---
>>    drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 74 ++++++++++++++++++++++++++
>>    drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h |  2 +
>>    drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c  |  4 ++
>>    3 files changed, 80 insertions(+)
>>
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
>> index 951608fc1925..410d89966a66 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
>> @@ -3142,3 +3142,77 @@ void amdgpu_vm_set_task_info(struct amdgpu_vm *vm)
>>    		}
>>    	}
>>    }
>> +
>> +/**
>> + * amdgpu_vm_handle_fault - graceful handling of VM faults.
>> + * @adev: amdgpu device pointer
>> + * @pasid: PASID of the VM
>> + * @addr: Address of the fault
>> + *
>> + * Try to gracefully handle a VM fault. Return true if the fault was handled and
>> + * shouldn't be reported any more.
>> + */
>> +bool amdgpu_vm_handle_fault(struct amdgpu_device *adev, unsigned int pasid,
>> +			    uint64_t addr)
>> +{
>> +	struct amdgpu_ring *ring = &adev->sdma.instance[0].page;
>> +	struct amdgpu_bo *root;
>> +	uint64_t value, flags;
>> +	struct amdgpu_vm *vm;
>> +	long r;
>> +
>> +	if (!ring->sched.ready)
>> +		return false;
>> +
>> +	spin_lock(&adev->vm_manager.pasid_lock);
>> +	vm = idr_find(&adev->vm_manager.pasid_idr, pasid);
>> +	if (vm)
>> +		root = amdgpu_bo_ref(vm->root.base.bo);
>> +	else
>> +		root = NULL;
>> +	spin_unlock(&adev->vm_manager.pasid_lock);
>> +
>> +	if (!root)
>> +		return false;
>> +
>> +	r = amdgpu_bo_reserve(root, true);
>> +	if (r)
>> +		goto error_unref;
>> +
>> +	spin_lock(&adev->vm_manager.pasid_lock);
>> +	vm = idr_find(&adev->vm_manager.pasid_idr, pasid);
>> +	spin_unlock(&adev->vm_manager.pasid_lock);
> I think this deserves a comment. If I understand it correctly, you're
> looking up the vm twice so that you have the VM root reservation to
> protect against use-after-free. Otherwise the vm pointer is only valid
> as long as you're holding the spin-lock.
>
>
>> +
>> +	if (!vm || vm->root.base.bo != root)
> The check of vm->root.base.bo should probably still be under the
> spin_lock. Because you're not sure yet it's the right VM, you can't rely
> on the reservation here to prevent use-after-free.

Good point, going to fix that.
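
Something along these lines, i.e. keep the comparison inside the critical
section (sketch only; all names are taken from the hunk quoted above):

	spin_lock(&adev->vm_manager.pasid_lock);
	vm = idr_find(&adev->vm_manager.pasid_idr, pasid);
	/* Re-validate under the lock: the reservation only protects us
	 * once we know this is still the VM whose root we reserved. */
	if (!vm || vm->root.base.bo != root) {
		spin_unlock(&adev->vm_manager.pasid_lock);
		goto error_unlock;
	}
	spin_unlock(&adev->vm_manager.pasid_lock);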

>
>
>> +		goto error_unlock;
>> +
>> +	addr /= AMDGPU_GPU_PAGE_SIZE;
>> +	flags = AMDGPU_PTE_VALID | AMDGPU_PTE_SNOOPED |
>> +		AMDGPU_PTE_SYSTEM;
>> +
>> +	if (amdgpu_vm_fault_stop == AMDGPU_VM_FAULT_STOP_NEVER) {
>> +		/* Redirect the access to the dummy page */
>> +		value = adev->dummy_page_addr;
>> +		flags |= AMDGPU_PTE_EXECUTABLE | AMDGPU_PTE_READABLE |
>> +			AMDGPU_PTE_WRITEABLE;
>> +	} else {
>> +		value = 0;
>> +	}
>> +
>> +	r = amdgpu_vm_bo_update_mapping(adev, vm, true, NULL, addr, addr + 1,
>> +					flags, value, NULL, NULL);
>> +	if (r)
>> +		goto error_unlock;
>> +
>> +	r = amdgpu_vm_update_pdes(adev, vm, true);
>> +
>> +error_unlock:
>> +	amdgpu_bo_unreserve(root);
>> +	if (r < 0)
>> +		DRM_ERROR("Can't handle page fault (%ld)\n", r);
>> +
>> +error_unref:
>> +	amdgpu_bo_unref(&root);
>> +
>> +	return false;
>> +}
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h
>> index 0a97dc839f3b..4dbbe1b6b413 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h
>> @@ -413,6 +413,8 @@ void amdgpu_vm_check_compute_bug(struct amdgpu_device *adev);
>>    
>>    void amdgpu_vm_get_task_info(struct amdgpu_device *adev, unsigned int pasid,
>>    			     struct amdgpu_task_info *task_info);
>> +bool amdgpu_vm_handle_fault(struct amdgpu_device *adev, unsigned int pasid,
>> +			    uint64_t addr);
>>    
>>    void amdgpu_vm_set_task_info(struct amdgpu_vm *vm);
>>    
>> diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c b/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c
>> index 9d15679df6e0..15a1ce51befa 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c
>> @@ -353,6 +353,10 @@ static int gmc_v9_0_process_interrupt(struct amdgpu_device *adev,
>>    	}
>>    
>>    	/* If it's the first fault for this address, process it normally */
>> +	if (retry_fault && !in_interrupt() &&
>> +	    amdgpu_vm_handle_fault(adev, entry->pasid, addr))
>> +		return 1; /* This also prevents sending it to KFD */
> The !in_interrupt() is meant to only do this on the rerouted interrupt
> ring that's handled by a worker function?

Yes, exactly. But I plan to add a workaround where the CPU redirects the 
fault to the other ring buffer for firmware versions which don't have 
that re-routing.

That adds quite a bit of overhead on Vega10 because of the fault storm, 
but it should work fine on Vega20.

The key point is that we already released firmware without the redirection, 
but it's still better to have the workaround than to run into the storm.
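
Very roughly, the idea is something like this in gmc_v9_0_process_interrupt()
(purely illustrative; the reroute helper does not exist yet):

	/* Hypothetical sketch: if the fault still arrives on the primary
	 * ring because the firmware can't re-route it, hand the IV entry
	 * over to the delayed IH processing instead of handling it here. */
	if (retry_fault && in_interrupt()) {
		amdgpu_fault_reroute(adev, entry);
		return 1;
	}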

> Looks like amdgpu_vm_handle_fault never returns true for now. So we'll
> never get to the "return 1" here.

Oh, yes that actually belongs into a follow up patch.

Thanks,
Christian.

>
> Regards,
>     Felix
>
>
>> +
>>    	if (!amdgpu_sriov_vf(adev)) {
>>    		/*
>>    		 * Issue a dummy read to wait for the status register to

_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: Graceful page fault handling for Vega/Navi
       [not found]     ` <03e2274c-a0dd-41f4-c5e0-26e371d01d23-5C7GfCeVMHo@public.gmane.org>
@ 2019-09-09 12:09       ` Christian König
  0 siblings, 0 replies; 25+ messages in thread
From: Christian König @ 2019-09-09 12:09 UTC (permalink / raw)
  To: Kuehling, Felix, amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW

Am 05.09.19 um 00:52 schrieb Kuehling, Felix:
> On 2019-09-04 11:02 a.m., Christian König wrote:
>> Hi everyone,
>>
>> this series is the next puzzle piece for recoverable page fault handling on Vega and Navi.
>>
>> It adds a new direct scheduler entity for VM updates which is then used to update page tables during a fault.
>>
>> In other words previously an application doing an invalid memory access would just hang and/or repeat the invalid access over and over again. Now the handling is modified so that the invalid memory access is redirected to the dummy page.
>>
>> This needs the following prerequisites:
>> a) The firmware must be new enough so allow re-routing of page faults.
>> b) Fault retry must be enabled using the amdgpu.noretry=0 parameter.
>> c) Enough free VRAM to allocate page tables to point to the dummy page.
>>
>> The re-routing of page faults current only works on Vega10, so Vega20 and Navi will still need some more time.
> Wait, we don't do the page fault rerouting on Vega20 yet? So we're
> getting the full brunt of the fault storm on the main interrupt ring?

It's implemented, but the Vega20 firmware fails to enable the 
re-routing for some reason.

I haven't had time yet to talk to the firmware guys why that happens.

> In that case, we should probably change the default setting of
> amdgpu.noretry=1 at least until that's done.
>
> Other than that the patch series looks reasonable to me. I commented on
> patches 4 and 9 separately.
>
> Patch 1 is Acked-by: Felix Kuehling <Felix.Kuehling@amd.com>
>
> With the issues addressed that I pointed out, the rest is
>
> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>

Thanks,
Christian.

>
> Regards,
>     Felix
>
>
>> Please review and/or comment,
>> Christian.
>>
>>
>> _______________________________________________
>> amd-gfx mailing list
>> amd-gfx@lists.freedesktop.org
>> https://lists.freedesktop.org/mailman/listinfo/amd-gfx

_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 9/9] drm/amdgpu: add graceful VM fault handling v2
       [not found]             ` <db73b357-b7e4-f818-e57c-234a45f7c5cb-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
@ 2019-09-09 13:58               ` Yang, Philip
       [not found]                 ` <4ebc27be-a403-0a0b-e9ff-4e5e18c5417c-5C7GfCeVMHo@public.gmane.org>
  0 siblings, 1 reply; 25+ messages in thread
From: Yang, Philip @ 2019-09-09 13:58 UTC (permalink / raw)
  To: Koenig, Christian, amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW



On 2019-09-09 8:03 a.m., Christian König wrote:
> Am 04.09.19 um 22:12 schrieb Yang, Philip:
>> This series looks nice and clear for me, two questions embedded below.
>>
>> Are we going to use dedicated sdma page queue for direct VM update path
>> during a fault?
>>
>> Thanks,
>> Philip
>>
>> On 2019-09-04 11:02 a.m., Christian König wrote:
>>> Next step towards HMM support. For now just silence the retry fault and
>>> optionally redirect the request to the dummy page.
>>>
>>> v2: make sure the VM is not destroyed while we handle the fault.
>>>
>>> Signed-off-by: Christian König <christian.koenig@amd.com>
>>> ---
>>>    drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 74 
>>> ++++++++++++++++++++++++++
>>>    drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h |  2 +
>>>    drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c  |  4 ++
>>>    3 files changed, 80 insertions(+)
>>>
>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c 
>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
>>> index 951608fc1925..410d89966a66 100644
>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
>>> @@ -3142,3 +3142,77 @@ void amdgpu_vm_set_task_info(struct amdgpu_vm 
>>> *vm)
>>>            }
>>>        }
>>>    }
>>> +
>>> +/**
>>> + * amdgpu_vm_handle_fault - graceful handling of VM faults.
>>> + * @adev: amdgpu device pointer
>>> + * @pasid: PASID of the VM
>>> + * @addr: Address of the fault
>>> + *
>>> + * Try to gracefully handle a VM fault. Return true if the fault was 
>>> handled and
>>> + * shouldn't be reported any more.
>>> + */
>>> +bool amdgpu_vm_handle_fault(struct amdgpu_device *adev, unsigned int 
>>> pasid,
>>> +                uint64_t addr)
>>> +{
>>> +    struct amdgpu_ring *ring = &adev->sdma.instance[0].page;
>>> +    struct amdgpu_bo *root;
>>> +    uint64_t value, flags;
>>> +    struct amdgpu_vm *vm;
>>> +    long r;
>>> +
>>> +    if (!ring->sched.ready)
>>> +        return false;
>>> +
>>> +    spin_lock(&adev->vm_manager.pasid_lock);
>>> +    vm = idr_find(&adev->vm_manager.pasid_idr, pasid);
>>> +    if (vm)
>>> +        root = amdgpu_bo_ref(vm->root.base.bo);
>>> +    else
>>> +        root = NULL;
>>> +    spin_unlock(&adev->vm_manager.pasid_lock);
>>> +
>>> +    if (!root)
>>> +        return false;
>>> +
>>> +    r = amdgpu_bo_reserve(root, true);
>>> +    if (r)
>>> +        goto error_unref;
>>> +
>>> +    spin_lock(&adev->vm_manager.pasid_lock);
>>> +    vm = idr_find(&adev->vm_manager.pasid_idr, pasid);
>>> +    spin_unlock(&adev->vm_manager.pasid_lock);
>>> +
>> Here get vm from pasid second time, and check if PD bo is changed, is
>> this to handle vm fault race with vm destory?
> 
> Yes, exactly.
> 
>>
>>> +    if (!vm || vm->root.base.bo != root)
>>> +        goto error_unlock;
>>> +
>>> +    addr /= AMDGPU_GPU_PAGE_SIZE;
>>> +    flags = AMDGPU_PTE_VALID | AMDGPU_PTE_SNOOPED |
>>> +        AMDGPU_PTE_SYSTEM;
>>> +
>>> +    if (amdgpu_vm_fault_stop == AMDGPU_VM_FAULT_STOP_NEVER) {
>>> +        /* Redirect the access to the dummy page */
>>> +        value = adev->dummy_page_addr;
>>> +        flags |= AMDGPU_PTE_EXECUTABLE | AMDGPU_PTE_READABLE |
>>> +            AMDGPU_PTE_WRITEABLE;
>>> +    } else {
>>> +        value = 0;
>>> +    }
>>> +
>>> +    r = amdgpu_vm_bo_update_mapping(adev, vm, true, NULL, addr, addr 
>>> + 1,
>>> +                    flags, value, NULL, NULL);
>>> +    if (r)
>>> +        goto error_unlock;
>>> +
>> After fault address redirect to dummy page, will the fault recover and
>> retry continue to execute?
> 
> Yes, the read/write operation will just retry and use the value from the 
> dummy page instead.
> 
>> Is this dangerous to update PTE to use system
>> memory address 0?
> 
> What are you talking about? The dummy page is a page allocate by TTM 
> where we redirect faulty accesses to.
> 
For the AMDGPU_VM_FAULT_STOP_FIRST/ALWAYS case of amdgpu_vm_fault_stop, 
the value is 0, so this would redirect to system memory address 0. Maybe 
the redirect is only needed for AMDGPU_VM_FAULT_STOP_NEVER?

Regards,
Philip

> Regards,
> Christian.
> 
>>
>>> +    r = amdgpu_vm_update_pdes(adev, vm, true);
>>> +
>>> +error_unlock:
>>> +    amdgpu_bo_unreserve(root);
>>> +    if (r < 0)
>>> +        DRM_ERROR("Can't handle page fault (%ld)\n", r);
>>> +
>>> +error_unref:
>>> +    amdgpu_bo_unref(&root);
>>> +
>>> +    return false;
>>> +}
>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h 
>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h
>>> index 0a97dc839f3b..4dbbe1b6b413 100644
>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h
>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h
>>> @@ -413,6 +413,8 @@ void amdgpu_vm_check_compute_bug(struct 
>>> amdgpu_device *adev);
>>>    void amdgpu_vm_get_task_info(struct amdgpu_device *adev, unsigned 
>>> int pasid,
>>>                     struct amdgpu_task_info *task_info);
>>> +bool amdgpu_vm_handle_fault(struct amdgpu_device *adev, unsigned int 
>>> pasid,
>>> +                uint64_t addr);
>>>    void amdgpu_vm_set_task_info(struct amdgpu_vm *vm);
>>> diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c 
>>> b/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c
>>> index 9d15679df6e0..15a1ce51befa 100644
>>> --- a/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c
>>> +++ b/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c
>>> @@ -353,6 +353,10 @@ static int gmc_v9_0_process_interrupt(struct 
>>> amdgpu_device *adev,
>>>        }
>>>        /* If it's the first fault for this address, process it 
>>> normally */
>>> +    if (retry_fault && !in_interrupt() &&
>>> +        amdgpu_vm_handle_fault(adev, entry->pasid, addr))
>>> +        return 1; /* This also prevents sending it to KFD */
>>> +
>>>        if (!amdgpu_sriov_vf(adev)) {
>>>            /*
>>>             * Issue a dummy read to wait for the status register to
>>>
> 
_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 25+ messages in thread

* RE: [PATCH 9/9] drm/amdgpu: add graceful VM fault handling v2
       [not found]             ` <3051c690-fabb-953a-3ae6-2cccfe8a502c-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
@ 2019-09-09 14:08               ` Zeng, Oak
       [not found]                 ` <BL0PR12MB2580535133C5BF8E771CE30680B70-b4cIHhjg/p/XzH18dTCKOgdYzm3356FpvxpqHgZTriW3zl9H0oFU5g@public.gmane.org>
  0 siblings, 1 reply; 25+ messages in thread
From: Zeng, Oak @ 2019-09-09 14:08 UTC (permalink / raw)
  To: Koenig, Christian, Kuehling, Felix,
	amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW

Is looking up the vm twice necessary? I think we are in interrupt context; is it possible for the user space application to be switched in in between? My understanding is that if the user space application can't kick in during interrupt handling, it shouldn't have a chance to exit (and therefore its vm can't be destroyed).

Regards,
Oak

-----Original Message-----
From: amd-gfx <amd-gfx-bounces@lists.freedesktop.org> On Behalf Of Christian König
Sent: Monday, September 9, 2019 8:08 AM
To: Kuehling, Felix <Felix.Kuehling@amd.com>; amd-gfx@lists.freedesktop.org
Subject: Re: [PATCH 9/9] drm/amdgpu: add graceful VM fault handling v2

Am 05.09.19 um 00:47 schrieb Kuehling, Felix:
> On 2019-09-04 11:02 a.m., Christian König wrote:
>> Next step towards HMM support. For now just silence the retry fault 
>> and optionally redirect the request to the dummy page.
>>
>> v2: make sure the VM is not destroyed while we handle the fault.
>>
>> Signed-off-by: Christian König <christian.koenig@amd.com>
>> ---
>>    drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 74 ++++++++++++++++++++++++++
>>    drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h |  2 +
>>    drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c  |  4 ++
>>    3 files changed, 80 insertions(+)
>>
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c 
>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
>> index 951608fc1925..410d89966a66 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
>> @@ -3142,3 +3142,77 @@ void amdgpu_vm_set_task_info(struct amdgpu_vm *vm)
>>    		}
>>    	}
>>    }
>> +
>> +/**
>> + * amdgpu_vm_handle_fault - graceful handling of VM faults.
>> + * @adev: amdgpu device pointer
>> + * @pasid: PASID of the VM
>> + * @addr: Address of the fault
>> + *
>> + * Try to gracefully handle a VM fault. Return true if the fault was 
>> +handled and
>> + * shouldn't be reported any more.
>> + */
>> +bool amdgpu_vm_handle_fault(struct amdgpu_device *adev, unsigned int pasid,
>> +			    uint64_t addr)
>> +{
>> +	struct amdgpu_ring *ring = &adev->sdma.instance[0].page;
>> +	struct amdgpu_bo *root;
>> +	uint64_t value, flags;
>> +	struct amdgpu_vm *vm;
>> +	long r;
>> +
>> +	if (!ring->sched.ready)
>> +		return false;
>> +
>> +	spin_lock(&adev->vm_manager.pasid_lock);
>> +	vm = idr_find(&adev->vm_manager.pasid_idr, pasid);
>> +	if (vm)
>> +		root = amdgpu_bo_ref(vm->root.base.bo);
>> +	else
>> +		root = NULL;
>> +	spin_unlock(&adev->vm_manager.pasid_lock);
>> +
>> +	if (!root)
>> +		return false;
>> +
>> +	r = amdgpu_bo_reserve(root, true);
>> +	if (r)
>> +		goto error_unref;
>> +
>> +	spin_lock(&adev->vm_manager.pasid_lock);
>> +	vm = idr_find(&adev->vm_manager.pasid_idr, pasid);
>> +	spin_unlock(&adev->vm_manager.pasid_lock);
> I think this deserves a comment. If I understand it correctly, you're 
> looking up the vm twice so that you have the VM root reservation to 
> protect against user-after-free. Otherwise the vm pointer is only 
> valid as long as you're holding the spin-lock.
>
>
>> +
>> +	if (!vm || vm->root.base.bo != root)
> The check of vm->root.base.bo should probably still be under the 
> spin_lock. Because you're not sure yet it's the right VM, you can't 
> rely on the reservation here to prevent use-after-free.

Good point, going to fix that.

>
>
>> +		goto error_unlock;
>> +
>> +	addr /= AMDGPU_GPU_PAGE_SIZE;
>> +	flags = AMDGPU_PTE_VALID | AMDGPU_PTE_SNOOPED |
>> +		AMDGPU_PTE_SYSTEM;
>> +
>> +	if (amdgpu_vm_fault_stop == AMDGPU_VM_FAULT_STOP_NEVER) {
>> +		/* Redirect the access to the dummy page */
>> +		value = adev->dummy_page_addr;
>> +		flags |= AMDGPU_PTE_EXECUTABLE | AMDGPU_PTE_READABLE |
>> +			AMDGPU_PTE_WRITEABLE;
>> +	} else {
>> +		value = 0;
>> +	}
>> +
>> +	r = amdgpu_vm_bo_update_mapping(adev, vm, true, NULL, addr, addr + 1,
>> +					flags, value, NULL, NULL);
>> +	if (r)
>> +		goto error_unlock;
>> +
>> +	r = amdgpu_vm_update_pdes(adev, vm, true);
>> +
>> +error_unlock:
>> +	amdgpu_bo_unreserve(root);
>> +	if (r < 0)
>> +		DRM_ERROR("Can't handle page fault (%ld)\n", r);
>> +
>> +error_unref:
>> +	amdgpu_bo_unref(&root);
>> +
>> +	return false;
>> +}
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h 
>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h
>> index 0a97dc839f3b..4dbbe1b6b413 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h
>> @@ -413,6 +413,8 @@ void amdgpu_vm_check_compute_bug(struct 
>> amdgpu_device *adev);
>>    
>>    void amdgpu_vm_get_task_info(struct amdgpu_device *adev, unsigned int pasid,
>>    			     struct amdgpu_task_info *task_info);
>> +bool amdgpu_vm_handle_fault(struct amdgpu_device *adev, unsigned int pasid,
>> +			    uint64_t addr);
>>    
>>    void amdgpu_vm_set_task_info(struct amdgpu_vm *vm);
>>    
>> diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c 
>> b/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c
>> index 9d15679df6e0..15a1ce51befa 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c
>> @@ -353,6 +353,10 @@ static int gmc_v9_0_process_interrupt(struct amdgpu_device *adev,
>>    	}
>>    
>>    	/* If it's the first fault for this address, process it normally 
>> */
>> +	if (retry_fault && !in_interrupt() &&
>> +	    amdgpu_vm_handle_fault(adev, entry->pasid, addr))
>> +		return 1; /* This also prevents sending it to KFD */
> The !in_interrupt() is meant to only do this on the rerouted interrupt 
> ring that's handled by a worker function?

Yes, exactly. But I plan to add a workaround where the CPU redirects the fault to the other ring buffer for firmware versions which doesn't have that.

Adds quite a bunch of overhead on Vega10, because of the fault storm but should work fine on Vega20.

Key point is that we already released firmware without the redirection, but it's still better to have that than to run into the storm.

> Looks like amdgpu_vm_handle_fault never returns true for now. So we'll 
> never get to the "return 1" here.

Oh, yes that actually belongs into a follow up patch.

Thanks,
Christian.

>
> Regards,
>     Felix
>
>
>> +
>>    	if (!amdgpu_sriov_vf(adev)) {
>>    		/*
>>    		 * Issue a dummy read to wait for the status register to

_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx
_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 9/9] drm/amdgpu: add graceful VM fault handling v2
       [not found]                 ` <BL0PR12MB2580535133C5BF8E771CE30680B70-b4cIHhjg/p/XzH18dTCKOgdYzm3356FpvxpqHgZTriW3zl9H0oFU5g@public.gmane.org>
@ 2019-09-09 17:14                   ` Koenig, Christian
       [not found]                     ` <8d28a385-6ad2-20fe-12df-72b3fbf034e5-5C7GfCeVMHo@public.gmane.org>
  0 siblings, 1 reply; 25+ messages in thread
From: Koenig, Christian @ 2019-09-09 17:14 UTC (permalink / raw)
  To: Zeng, Oak, Kuehling, Felix, amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW

Well, first of all we are not in interrupt context here; this is handled 
by a work item, otherwise we couldn't do all the locking.

But even in interrupt context another CPU can easily destroy the VM while 
we are just handling a stale fault, or because the process was killed.

So this extra double checking is strictly necessary.

Regards,
Christian.

Am 09.09.19 um 16:08 schrieb Zeng, Oak:
> Is looking up vm twice necessary? I think we are in interrupt context, is it possible that the user space application can be switched in between? My understanding is, if user space application is can't kick in during interrupt handling, application shouldn't have chance to exit (then their vm being destroyed).
>
> Regards,
> Oak
>
> -----Original Message-----
> From: amd-gfx <amd-gfx-bounces@lists.freedesktop.org> On Behalf Of Christian König
> Sent: Monday, September 9, 2019 8:08 AM
> To: Kuehling, Felix <Felix.Kuehling@amd.com>; amd-gfx@lists.freedesktop.org
> Subject: Re: [PATCH 9/9] drm/amdgpu: add graceful VM fault handling v2
>
> Am 05.09.19 um 00:47 schrieb Kuehling, Felix:
>> On 2019-09-04 11:02 a.m., Christian König wrote:
>>> Next step towards HMM support. For now just silence the retry fault
>>> and optionally redirect the request to the dummy page.
>>>
>>> v2: make sure the VM is not destroyed while we handle the fault.
>>>
>>> Signed-off-by: Christian König <christian.koenig@amd.com>
>>> ---
>>>     drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 74 ++++++++++++++++++++++++++
>>>     drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h |  2 +
>>>     drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c  |  4 ++
>>>     3 files changed, 80 insertions(+)
>>>
>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
>>> index 951608fc1925..410d89966a66 100644
>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
>>> @@ -3142,3 +3142,77 @@ void amdgpu_vm_set_task_info(struct amdgpu_vm *vm)
>>>     		}
>>>     	}
>>>     }
>>> +
>>> +/**
>>> + * amdgpu_vm_handle_fault - graceful handling of VM faults.
>>> + * @adev: amdgpu device pointer
>>> + * @pasid: PASID of the VM
>>> + * @addr: Address of the fault
>>> + *
>>> + * Try to gracefully handle a VM fault. Return true if the fault was
>>> +handled and
>>> + * shouldn't be reported any more.
>>> + */
>>> +bool amdgpu_vm_handle_fault(struct amdgpu_device *adev, unsigned int pasid,
>>> +			    uint64_t addr)
>>> +{
>>> +	struct amdgpu_ring *ring = &adev->sdma.instance[0].page;
>>> +	struct amdgpu_bo *root;
>>> +	uint64_t value, flags;
>>> +	struct amdgpu_vm *vm;
>>> +	long r;
>>> +
>>> +	if (!ring->sched.ready)
>>> +		return false;
>>> +
>>> +	spin_lock(&adev->vm_manager.pasid_lock);
>>> +	vm = idr_find(&adev->vm_manager.pasid_idr, pasid);
>>> +	if (vm)
>>> +		root = amdgpu_bo_ref(vm->root.base.bo);
>>> +	else
>>> +		root = NULL;
>>> +	spin_unlock(&adev->vm_manager.pasid_lock);
>>> +
>>> +	if (!root)
>>> +		return false;
>>> +
>>> +	r = amdgpu_bo_reserve(root, true);
>>> +	if (r)
>>> +		goto error_unref;
>>> +
>>> +	spin_lock(&adev->vm_manager.pasid_lock);
>>> +	vm = idr_find(&adev->vm_manager.pasid_idr, pasid);
>>> +	spin_unlock(&adev->vm_manager.pasid_lock);
>> I think this deserves a comment. If I understand it correctly, you're
>> looking up the vm twice so that you have the VM root reservation to
>> protect against user-after-free. Otherwise the vm pointer is only
>> valid as long as you're holding the spin-lock.
>>
>>
>>> +
>>> +	if (!vm || vm->root.base.bo != root)
>> The check of vm->root.base.bo should probably still be under the
>> spin_lock. Because you're not sure yet it's the right VM, you can't
>> rely on the reservation here to prevent use-after-free.
> Good point, going to fix that.
>
>>
>>> +		goto error_unlock;
>>> +
>>> +	addr /= AMDGPU_GPU_PAGE_SIZE;
>>> +	flags = AMDGPU_PTE_VALID | AMDGPU_PTE_SNOOPED |
>>> +		AMDGPU_PTE_SYSTEM;
>>> +
>>> +	if (amdgpu_vm_fault_stop == AMDGPU_VM_FAULT_STOP_NEVER) {
>>> +		/* Redirect the access to the dummy page */
>>> +		value = adev->dummy_page_addr;
>>> +		flags |= AMDGPU_PTE_EXECUTABLE | AMDGPU_PTE_READABLE |
>>> +			AMDGPU_PTE_WRITEABLE;
>>> +	} else {
>>> +		value = 0;
>>> +	}
>>> +
>>> +	r = amdgpu_vm_bo_update_mapping(adev, vm, true, NULL, addr, addr + 1,
>>> +					flags, value, NULL, NULL);
>>> +	if (r)
>>> +		goto error_unlock;
>>> +
>>> +	r = amdgpu_vm_update_pdes(adev, vm, true);
>>> +
>>> +error_unlock:
>>> +	amdgpu_bo_unreserve(root);
>>> +	if (r < 0)
>>> +		DRM_ERROR("Can't handle page fault (%ld)\n", r);
>>> +
>>> +error_unref:
>>> +	amdgpu_bo_unref(&root);
>>> +
>>> +	return false;
>>> +}
>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h
>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h
>>> index 0a97dc839f3b..4dbbe1b6b413 100644
>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h
>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h
>>> @@ -413,6 +413,8 @@ void amdgpu_vm_check_compute_bug(struct
>>> amdgpu_device *adev);
>>>     
>>>     void amdgpu_vm_get_task_info(struct amdgpu_device *adev, unsigned int pasid,
>>>     			     struct amdgpu_task_info *task_info);
>>> +bool amdgpu_vm_handle_fault(struct amdgpu_device *adev, unsigned int pasid,
>>> +			    uint64_t addr);
>>>     
>>>     void amdgpu_vm_set_task_info(struct amdgpu_vm *vm);
>>>     
>>> diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c
>>> b/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c
>>> index 9d15679df6e0..15a1ce51befa 100644
>>> --- a/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c
>>> +++ b/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c
>>> @@ -353,6 +353,10 @@ static int gmc_v9_0_process_interrupt(struct amdgpu_device *adev,
>>>     	}
>>>     
>>>     	/* If it's the first fault for this address, process it normally
>>> */
>>> +	if (retry_fault && !in_interrupt() &&
>>> +	    amdgpu_vm_handle_fault(adev, entry->pasid, addr))
>>> +		return 1; /* This also prevents sending it to KFD */
>> The !in_interrupt() is meant to only do this on the rerouted interrupt
>> ring that's handled by a worker function?
> Yes, exactly. But I plan to add a workaround where the CPU redirects the fault to the other ring buffer for firmware versions which doesn't have that.
>
> Adds quite a bunch of overhead on Vega10, because of the fault storm but should work fine on Vega20.
>
> Key point is that we already released firmware without the redirection, but it's still better to have that than to run into the storm.
>
>> Looks like amdgpu_vm_handle_fault never returns true for now. So we'll
>> never get to the "return 1" here.
> Oh, yes that actually belongs into a follow up patch.
>
> Thanks,
> Christian.
>
>> Regards,
>>      Felix
>>
>>
>>> +
>>>     	if (!amdgpu_sriov_vf(adev)) {
>>>     		/*
>>>     		 * Issue a dummy read to wait for the status register to
> _______________________________________________
> amd-gfx mailing list
> amd-gfx@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/amd-gfx

_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 9/9] drm/amdgpu: add graceful VM fault handling v2
       [not found]                 ` <4ebc27be-a403-0a0b-e9ff-4e5e18c5417c-5C7GfCeVMHo@public.gmane.org>
@ 2019-09-09 17:15                   ` Koenig, Christian
  0 siblings, 0 replies; 25+ messages in thread
From: Koenig, Christian @ 2019-09-09 17:15 UTC (permalink / raw)
  To: Yang, Philip, amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW

Am 09.09.19 um 15:58 schrieb Yang, Philip:
>
> On 2019-09-09 8:03 a.m., Christian König wrote:
>> Am 04.09.19 um 22:12 schrieb Yang, Philip:
>>> This series looks nice and clear for me, two questions embedded below.
>>>
>>> Are we going to use dedicated sdma page queue for direct VM update path
>>> during a fault?
>>>
>>> Thanks,
>>> Philip
>>>
>>> On 2019-09-04 11:02 a.m., Christian König wrote:
>>>> Next step towards HMM support. For now just silence the retry fault and
>>>> optionally redirect the request to the dummy page.
>>>>
>>>> v2: make sure the VM is not destroyed while we handle the fault.
>>>>
>>>> Signed-off-by: Christian König <christian.koenig@amd.com>
>>>> ---
>>>>     drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 74
>>>> ++++++++++++++++++++++++++
>>>>     drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h |  2 +
>>>>     drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c  |  4 ++
>>>>     3 files changed, 80 insertions(+)
>>>>
>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
>>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
>>>> index 951608fc1925..410d89966a66 100644
>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
>>>> @@ -3142,3 +3142,77 @@ void amdgpu_vm_set_task_info(struct amdgpu_vm
>>>> *vm)
>>>>             }
>>>>         }
>>>>     }
>>>> +
>>>> +/**
>>>> + * amdgpu_vm_handle_fault - graceful handling of VM faults.
>>>> + * @adev: amdgpu device pointer
>>>> + * @pasid: PASID of the VM
>>>> + * @addr: Address of the fault
>>>> + *
>>>> + * Try to gracefully handle a VM fault. Return true if the fault was
>>>> handled and
>>>> + * shouldn't be reported any more.
>>>> + */
>>>> +bool amdgpu_vm_handle_fault(struct amdgpu_device *adev, unsigned int
>>>> pasid,
>>>> +                uint64_t addr)
>>>> +{
>>>> +    struct amdgpu_ring *ring = &adev->sdma.instance[0].page;
>>>> +    struct amdgpu_bo *root;
>>>> +    uint64_t value, flags;
>>>> +    struct amdgpu_vm *vm;
>>>> +    long r;
>>>> +
>>>> +    if (!ring->sched.ready)
>>>> +        return false;
>>>> +
>>>> +    spin_lock(&adev->vm_manager.pasid_lock);
>>>> +    vm = idr_find(&adev->vm_manager.pasid_idr, pasid);
>>>> +    if (vm)
>>>> +        root = amdgpu_bo_ref(vm->root.base.bo);
>>>> +    else
>>>> +        root = NULL;
>>>> +    spin_unlock(&adev->vm_manager.pasid_lock);
>>>> +
>>>> +    if (!root)
>>>> +        return false;
>>>> +
>>>> +    r = amdgpu_bo_reserve(root, true);
>>>> +    if (r)
>>>> +        goto error_unref;
>>>> +
>>>> +    spin_lock(&adev->vm_manager.pasid_lock);
>>>> +    vm = idr_find(&adev->vm_manager.pasid_idr, pasid);
>>>> +    spin_unlock(&adev->vm_manager.pasid_lock);
>>>> +
>>> Here get vm from pasid second time, and check if PD bo is changed, is
>>> this to handle vm fault race with vm destory?
>> Yes, exactly.
>>
>>>> +    if (!vm || vm->root.base.bo != root)
>>>> +        goto error_unlock;
>>>> +
>>>> +    addr /= AMDGPU_GPU_PAGE_SIZE;
>>>> +    flags = AMDGPU_PTE_VALID | AMDGPU_PTE_SNOOPED |
>>>> +        AMDGPU_PTE_SYSTEM;
>>>> +
>>>> +    if (amdgpu_vm_fault_stop == AMDGPU_VM_FAULT_STOP_NEVER) {
>>>> +        /* Redirect the access to the dummy page */
>>>> +        value = adev->dummy_page_addr;
>>>> +        flags |= AMDGPU_PTE_EXECUTABLE | AMDGPU_PTE_READABLE |
>>>> +            AMDGPU_PTE_WRITEABLE;
>>>> +    } else {
>>>> +        value = 0;
>>>> +    }
>>>> +
>>>> +    r = amdgpu_vm_bo_update_mapping(adev, vm, true, NULL, addr, addr
>>>> + 1,
>>>> +                    flags, value, NULL, NULL);
>>>> +    if (r)
>>>> +        goto error_unlock;
>>>> +
>>> After fault address redirect to dummy page, will the fault recover and
>>> retry continue to execute?
>> Yes, the read/write operation will just retry and use the value from the
>> dummy page instead.
>>
>>> Is this dangerous to update PTE to use system
>>> memory address 0?
>> What are you talking about? The dummy page is a page allocate by TTM
>> where we redirect faulty accesses to.
>>
> For amdgpu_vm_fault_stop equals to AMDGPU_VM_FAULT_STOP_FIRST/ALWAYS
> case, value is 0, this will redirect to system memory 0. Maybe redirect
> is only needed for AMDGPU_VM_FAULT_STOP_NEVER?

The value 0 doesn't redirect to system memory; it results in a silent 
retry when neither the R nor the W bit is set in a PTE.

Regards,
Christian.

>
> Regards,
> Philip
>
>> Regards,
>> Christian.
>>
>>>> +    r = amdgpu_vm_update_pdes(adev, vm, true);
>>>> +
>>>> +error_unlock:
>>>> +    amdgpu_bo_unreserve(root);
>>>> +    if (r < 0)
>>>> +        DRM_ERROR("Can't handle page fault (%ld)\n", r);
>>>> +
>>>> +error_unref:
>>>> +    amdgpu_bo_unref(&root);
>>>> +
>>>> +    return false;
>>>> +}
>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h
>>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h
>>>> index 0a97dc839f3b..4dbbe1b6b413 100644
>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h
>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h
>>>> @@ -413,6 +413,8 @@ void amdgpu_vm_check_compute_bug(struct
>>>> amdgpu_device *adev);
>>>>     void amdgpu_vm_get_task_info(struct amdgpu_device *adev, unsigned
>>>> int pasid,
>>>>                      struct amdgpu_task_info *task_info);
>>>> +bool amdgpu_vm_handle_fault(struct amdgpu_device *adev, unsigned int
>>>> pasid,
>>>> +                uint64_t addr);
>>>>     void amdgpu_vm_set_task_info(struct amdgpu_vm *vm);
>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c
>>>> b/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c
>>>> index 9d15679df6e0..15a1ce51befa 100644
>>>> --- a/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c
>>>> +++ b/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c
>>>> @@ -353,6 +353,10 @@ static int gmc_v9_0_process_interrupt(struct
>>>> amdgpu_device *adev,
>>>>         }
>>>>         /* If it's the first fault for this address, process it
>>>> normally */
>>>> +    if (retry_fault && !in_interrupt() &&
>>>> +        amdgpu_vm_handle_fault(adev, entry->pasid, addr))
>>>> +        return 1; /* This also prevents sending it to KFD */
>>>> +
>>>>         if (!amdgpu_sriov_vf(adev)) {
>>>>             /*
>>>>              * Issue a dummy read to wait for the status register to
>>>>

_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 25+ messages in thread

* RE: [PATCH 9/9] drm/amdgpu: add graceful VM fault handling v2
       [not found]                     ` <8d28a385-6ad2-20fe-12df-72b3fbf034e5-5C7GfCeVMHo@public.gmane.org>
@ 2019-09-10 13:30                       ` Zeng, Oak
       [not found]                         ` <BL0PR12MB258025DD4004AB34CB02A1AB80B60-b4cIHhjg/p/XzH18dTCKOgdYzm3356FpvxpqHgZTriW3zl9H0oFU5g@public.gmane.org>
  0 siblings, 1 reply; 25+ messages in thread
From: Zeng, Oak @ 2019-09-10 13:30 UTC (permalink / raw)
  To: Koenig, Christian, Kuehling, Felix,
	amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW



Regards,
Oak

-----Original Message-----
From: Koenig, Christian <Christian.Koenig@amd.com> 
Sent: Monday, September 9, 2019 1:14 PM
To: Zeng, Oak <Oak.Zeng@amd.com>; Kuehling, Felix <Felix.Kuehling@amd.com>; amd-gfx@lists.freedesktop.org
Subject: Re: [PATCH 9/9] drm/amdgpu: add graceful VM fault handling v2

> Well first of all we are not in interrupt context here, this is handled by a work item or otherwise we couldn't do all the locking.

This is called from amdgpu_irq_handler, so I think this is interrupt context. This is also the reason why we use a spin lock instead of a sleepable lock like a semaphore.

> But even in interrupt context another CPU can easily destroy the VM when we just handle a stale fault or the process was killed.

Agree with this point.

So this extra double checking is strictly necessary.

Regards,
Christian.

Am 09.09.19 um 16:08 schrieb Zeng, Oak:
> Is looking up vm twice necessary? I think we are in interrupt context, is it possible that the user space application can be switched in between? My understanding is, if user space application is can't kick in during interrupt handling, application shouldn't have chance to exit (then their vm being destroyed).
>
> Regards,
> Oak
>
> -----Original Message-----
> From: amd-gfx <amd-gfx-bounces@lists.freedesktop.org> On Behalf Of 
> Christian König
> Sent: Monday, September 9, 2019 8:08 AM
> To: Kuehling, Felix <Felix.Kuehling@amd.com>; 
> amd-gfx@lists.freedesktop.org
> Subject: Re: [PATCH 9/9] drm/amdgpu: add graceful VM fault handling v2
>
> Am 05.09.19 um 00:47 schrieb Kuehling, Felix:
>> On 2019-09-04 11:02 a.m., Christian König wrote:
>>> Next step towards HMM support. For now just silence the retry fault 
>>> and optionally redirect the request to the dummy page.
>>>
>>> v2: make sure the VM is not destroyed while we handle the fault.
>>>
>>> Signed-off-by: Christian König <christian.koenig@amd.com>
>>> ---
>>>     drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 74 ++++++++++++++++++++++++++
>>>     drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h |  2 +
>>>     drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c  |  4 ++
>>>     3 files changed, 80 insertions(+)
>>>
>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
>>> index 951608fc1925..410d89966a66 100644
>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
>>> @@ -3142,3 +3142,77 @@ void amdgpu_vm_set_task_info(struct amdgpu_vm *vm)
>>>     		}
>>>     	}
>>>     }
>>> +
>>> +/**
>>> + * amdgpu_vm_handle_fault - graceful handling of VM faults.
>>> + * @adev: amdgpu device pointer
>>> + * @pasid: PASID of the VM
>>> + * @addr: Address of the fault
>>> + *
>>> + * Try to gracefully handle a VM fault. Return true if the fault 
>>> +was handled and
>>> + * shouldn't be reported any more.
>>> + */
>>> +bool amdgpu_vm_handle_fault(struct amdgpu_device *adev, unsigned int pasid,
>>> +			    uint64_t addr)
>>> +{
>>> +	struct amdgpu_ring *ring = &adev->sdma.instance[0].page;
>>> +	struct amdgpu_bo *root;
>>> +	uint64_t value, flags;
>>> +	struct amdgpu_vm *vm;
>>> +	long r;
>>> +
>>> +	if (!ring->sched.ready)
>>> +		return false;
>>> +
>>> +	spin_lock(&adev->vm_manager.pasid_lock);
>>> +	vm = idr_find(&adev->vm_manager.pasid_idr, pasid);
>>> +	if (vm)
>>> +		root = amdgpu_bo_ref(vm->root.base.bo);
>>> +	else
>>> +		root = NULL;
>>> +	spin_unlock(&adev->vm_manager.pasid_lock);
>>> +
>>> +	if (!root)
>>> +		return false;
>>> +
>>> +	r = amdgpu_bo_reserve(root, true);
>>> +	if (r)
>>> +		goto error_unref;
>>> +
>>> +	spin_lock(&adev->vm_manager.pasid_lock);
>>> +	vm = idr_find(&adev->vm_manager.pasid_idr, pasid);
>>> +	spin_unlock(&adev->vm_manager.pasid_lock);
>> I think this deserves a comment. If I understand it correctly, you're 
>> looking up the vm twice so that you have the VM root reservation to 
>> protect against user-after-free. Otherwise the vm pointer is only 
>> valid as long as you're holding the spin-lock.
>>
>>
>>> +
>>> +	if (!vm || vm->root.base.bo != root)
>> The check of vm->root.base.bo should probably still be under the 
>> spin_lock. Because you're not sure yet it's the right VM, you can't 
>> rely on the reservation here to prevent use-after-free.
> Good point, going to fix that.
>
>>
>>> +		goto error_unlock;
>>> +
>>> +	addr /= AMDGPU_GPU_PAGE_SIZE;
>>> +	flags = AMDGPU_PTE_VALID | AMDGPU_PTE_SNOOPED |
>>> +		AMDGPU_PTE_SYSTEM;
>>> +
>>> +	if (amdgpu_vm_fault_stop == AMDGPU_VM_FAULT_STOP_NEVER) {
>>> +		/* Redirect the access to the dummy page */
>>> +		value = adev->dummy_page_addr;
>>> +		flags |= AMDGPU_PTE_EXECUTABLE | AMDGPU_PTE_READABLE |
>>> +			AMDGPU_PTE_WRITEABLE;
>>> +	} else {
>>> +		value = 0;
>>> +	}
>>> +
>>> +	r = amdgpu_vm_bo_update_mapping(adev, vm, true, NULL, addr, addr + 1,
>>> +					flags, value, NULL, NULL);
>>> +	if (r)
>>> +		goto error_unlock;
>>> +
>>> +	r = amdgpu_vm_update_pdes(adev, vm, true);
>>> +
>>> +error_unlock:
>>> +	amdgpu_bo_unreserve(root);
>>> +	if (r < 0)
>>> +		DRM_ERROR("Can't handle page fault (%ld)\n", r);
>>> +
>>> +error_unref:
>>> +	amdgpu_bo_unref(&root);
>>> +
>>> +	return false;
>>> +}
>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h
>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h
>>> index 0a97dc839f3b..4dbbe1b6b413 100644
>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h
>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h
>>> @@ -413,6 +413,8 @@ void amdgpu_vm_check_compute_bug(struct
>>> amdgpu_device *adev);
>>>     
>>>     void amdgpu_vm_get_task_info(struct amdgpu_device *adev, unsigned int pasid,
>>>     			     struct amdgpu_task_info *task_info);
>>> +bool amdgpu_vm_handle_fault(struct amdgpu_device *adev, unsigned int pasid,
>>> +			    uint64_t addr);
>>>     
>>>     void amdgpu_vm_set_task_info(struct amdgpu_vm *vm);
>>>     
>>> diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c
>>> b/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c
>>> index 9d15679df6e0..15a1ce51befa 100644
>>> --- a/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c
>>> +++ b/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c
>>> @@ -353,6 +353,10 @@ static int gmc_v9_0_process_interrupt(struct amdgpu_device *adev,
>>>     	}
>>>     
>>>     	/* If it's the first fault for this address, process it 
>>> normally */
>>> +	if (retry_fault && !in_interrupt() &&
>>> +	    amdgpu_vm_handle_fault(adev, entry->pasid, addr))
>>> +		return 1; /* This also prevents sending it to KFD */
>> The !in_interrupt() is meant to only do this on the rerouted 
>> interrupt ring that's handled by a worker function?
> Yes, exactly. But I plan to add a workaround where the CPU redirects the fault to the other ring buffer for firmware versions which doesn't have that.
>
> Adds quite a bunch of overhead on Vega10, because of the fault storm but should work fine on Vega20.
>
> Key point is that we already released firmware without the redirection, but it's still better to have that than to run into the storm.
>
>> Looks like amdgpu_vm_handle_fault never returns true for now. So 
>> we'll never get to the "return 1" here.
> Oh, yes that actually belongs into a follow up patch.
>
> Thanks,
> Christian.
>
>> Regards,
>>      Felix
>>
>>
>>> +
>>>     	if (!amdgpu_sriov_vf(adev)) {
>>>     		/*
>>>     		 * Issue a dummy read to wait for the status register to
> _______________________________________________
> amd-gfx mailing list
> amd-gfx@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/amd-gfx

_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 9/9] drm/amdgpu: add graceful VM fault handling v2
       [not found]                         ` <BL0PR12MB258025DD4004AB34CB02A1AB80B60-b4cIHhjg/p/XzH18dTCKOgdYzm3356FpvxpqHgZTriW3zl9H0oFU5g@public.gmane.org>
@ 2019-09-11  9:31                           ` Koenig, Christian
  0 siblings, 0 replies; 25+ messages in thread
From: Koenig, Christian @ 2019-09-11  9:31 UTC (permalink / raw)
  To: Zeng, Oak, Kuehling, Felix, amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW

Am 10.09.19 um 15:30 schrieb Zeng, Oak:
>
> Regards,
> Oak
>
> -----Original Message-----
> From: Koenig, Christian <Christian.Koenig@amd.com>
> Sent: Monday, September 9, 2019 1:14 PM
> To: Zeng, Oak <Oak.Zeng@amd.com>; Kuehling, Felix <Felix.Kuehling@amd.com>; amd-gfx@lists.freedesktop.org
> Subject: Re: [PATCH 9/9] drm/amdgpu: add graceful VM fault handling v2
>
>> Well first of all we are not in interrupt context here, this is handled by a work item or otherwise we couldn't do all the locking.
> This is called from amdgpu_irq_handler. I think this is interrupt context.

No, that's incorrect. Only the primary interrupt ring is handled in 
interrupt context. Rings 1 and 2 are handled with a work item.
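
The pattern is roughly the following (a simplified sketch with made-up 
names, not the actual amdgpu code):

#include <linux/interrupt.h>
#include <linux/workqueue.h>

/* One work item per re-routed IH ring; set up once at init time with
 * INIT_WORK(&ih->ring1_work, my_ih_ring1_work). */
struct my_ih_state {
	struct work_struct ring1_work;
};

static void my_ih_ring1_work(struct work_struct *work)
{
	struct my_ih_state *ih = container_of(work, struct my_ih_state,
					      ring1_work);

	/* Drain the re-routed fault ring here.  This runs in process
	 * context, so it may sleep, reserve BOs and submit VM updates. */
	(void)ih;
}

static irqreturn_t my_irq_handler(int irq, void *arg)
{
	struct my_ih_state *ih = arg;

	/* The primary ring is drained right here, in interrupt context,
	 * so no sleeping is allowed on this path. */

	/* For the re-routed rings we only kick the work item. */
	schedule_work(&ih->ring1_work);
	return IRQ_HANDLED;
}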

>   This is also the reason why we use a spin lock instead of a sleepable lock like a semaphore.

Well, we do have a sleepable lock here to make it possible to update 
the page tables and directories.
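
Condensed, the flow in amdgpu_vm_handle_fault() is supposed to look like 
this (a sketch based on the patch quoted below, with error handling 
trimmed and Felix's suggestion of checking the root BO under the lock 
folded in):

	/* Look the VM up under the spin lock and grab a reference to its
	 * root PD so the BO can't go away while we wait. */
	spin_lock(&adev->vm_manager.pasid_lock);
	vm = idr_find(&adev->vm_manager.pasid_idr, pasid);
	root = vm ? amdgpu_bo_ref(vm->root.base.bo) : NULL;
	spin_unlock(&adev->vm_manager.pasid_lock);
	if (!root)
		return false;

	/* Sleepable reservation lock, needed for the page table update. */
	r = amdgpu_bo_reserve(root, true);
	if (r)
		goto error_unref;

	/* Look the PASID up again: the VM may have been destroyed or the
	 * PASID reused while we were waiting for the reservation. */
	spin_lock(&adev->vm_manager.pasid_lock);
	vm = idr_find(&adev->vm_manager.pasid_idr, pasid);
	if (!vm || vm->root.base.bo != root) {
		spin_unlock(&adev->vm_manager.pasid_lock);
		goto error_unlock;
	}
	spin_unlock(&adev->vm_manager.pasid_lock);

	/* From here on the reservation protects the VM structures and we
	 * can safely update the page tables and directories. */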

That's also the reason why we protect the call with the !in_interrupt() check.

Regards,
Christian.

>
>> But even in interrupt context, another CPU can easily destroy the VM while we are handling a stale fault or after the process was killed.
> Agree with this point.
>
> So this extra double checking is strictly necessary.
>
> Regards,
> Christian.
>
> On 09.09.19 at 16:08, Zeng, Oak wrote:
>> Is looking up the vm twice necessary? I think we are in interrupt context; is it possible that the user space application can be scheduled in between? My understanding is that if the user space application can't kick in during interrupt handling, the application shouldn't have a chance to exit (and thus have its vm destroyed).
>>
>> Regards,
>> Oak
>>
>> -----Original Message-----
>> From: amd-gfx <amd-gfx-bounces@lists.freedesktop.org> On Behalf Of
>> Christian König
>> Sent: Monday, September 9, 2019 8:08 AM
>> To: Kuehling, Felix <Felix.Kuehling@amd.com>;
>> amd-gfx@lists.freedesktop.org
>> Subject: Re: [PATCH 9/9] drm/amdgpu: add graceful VM fault handling v2
>>
>> On 05.09.19 at 00:47, Kuehling, Felix wrote:
>>> On 2019-09-04 11:02 a.m., Christian König wrote:
>>>> Next step towards HMM support. For now just silence the retry fault
>>>> and optionally redirect the request to the dummy page.
>>>>
>>>> v2: make sure the VM is not destroyed while we handle the fault.
>>>>
>>>> Signed-off-by: Christian König <christian.koenig@amd.com>
>>>> ---
>>>>      drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 74 ++++++++++++++++++++++++++
>>>>      drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h |  2 +
>>>>      drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c  |  4 ++
>>>>      3 files changed, 80 insertions(+)
>>>>
>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
>>>> index 951608fc1925..410d89966a66 100644
>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
>>>> @@ -3142,3 +3142,77 @@ void amdgpu_vm_set_task_info(struct amdgpu_vm *vm)
>>>>      		}
>>>>      	}
>>>>      }
>>>> +
>>>> +/**
>>>> + * amdgpu_vm_handle_fault - graceful handling of VM faults.
>>>> + * @adev: amdgpu device pointer
>>>> + * @pasid: PASID of the VM
>>>> + * @addr: Address of the fault
>>>> + *
>>>> + * Try to gracefully handle a VM fault. Return true if the fault was handled and
>>>> + * shouldn't be reported any more.
>>>> + */
>>>> +bool amdgpu_vm_handle_fault(struct amdgpu_device *adev, unsigned int pasid,
>>>> +			    uint64_t addr)
>>>> +{
>>>> +	struct amdgpu_ring *ring = &adev->sdma.instance[0].page;
>>>> +	struct amdgpu_bo *root;
>>>> +	uint64_t value, flags;
>>>> +	struct amdgpu_vm *vm;
>>>> +	long r;
>>>> +
>>>> +	if (!ring->sched.ready)
>>>> +		return false;
>>>> +
>>>> +	spin_lock(&adev->vm_manager.pasid_lock);
>>>> +	vm = idr_find(&adev->vm_manager.pasid_idr, pasid);
>>>> +	if (vm)
>>>> +		root = amdgpu_bo_ref(vm->root.base.bo);
>>>> +	else
>>>> +		root = NULL;
>>>> +	spin_unlock(&adev->vm_manager.pasid_lock);
>>>> +
>>>> +	if (!root)
>>>> +		return false;
>>>> +
>>>> +	r = amdgpu_bo_reserve(root, true);
>>>> +	if (r)
>>>> +		goto error_unref;
>>>> +
>>>> +	spin_lock(&adev->vm_manager.pasid_lock);
>>>> +	vm = idr_find(&adev->vm_manager.pasid_idr, pasid);
>>>> +	spin_unlock(&adev->vm_manager.pasid_lock);
>>> I think this deserves a comment. If I understand it correctly, you're
>>> looking up the vm twice so that you have the VM root reservation to
>>> protect against use-after-free. Otherwise the vm pointer is only
>>> valid as long as you're holding the spin-lock.
>>>
>>>
>>>> +
>>>> +	if (!vm || vm->root.base.bo != root)
>>> The check of vm->root.base.bo should probably still be under the
>>> spin_lock. Because you're not sure yet it's the right VM, you can't
>>> rely on the reservation here to prevent use-after-free.
>> Good point, going to fix that.
>>
>>>> +		goto error_unlock;
>>>> +
>>>> +	addr /= AMDGPU_GPU_PAGE_SIZE;
>>>> +	flags = AMDGPU_PTE_VALID | AMDGPU_PTE_SNOOPED |
>>>> +		AMDGPU_PTE_SYSTEM;
>>>> +
>>>> +	if (amdgpu_vm_fault_stop == AMDGPU_VM_FAULT_STOP_NEVER) {
>>>> +		/* Redirect the access to the dummy page */
>>>> +		value = adev->dummy_page_addr;
>>>> +		flags |= AMDGPU_PTE_EXECUTABLE | AMDGPU_PTE_READABLE |
>>>> +			AMDGPU_PTE_WRITEABLE;
>>>> +	} else {
>>>> +		value = 0;
>>>> +	}
>>>> +
>>>> +	r = amdgpu_vm_bo_update_mapping(adev, vm, true, NULL, addr, addr + 1,
>>>> +					flags, value, NULL, NULL);
>>>> +	if (r)
>>>> +		goto error_unlock;
>>>> +
>>>> +	r = amdgpu_vm_update_pdes(adev, vm, true);
>>>> +
>>>> +error_unlock:
>>>> +	amdgpu_bo_unreserve(root);
>>>> +	if (r < 0)
>>>> +		DRM_ERROR("Can't handle page fault (%ld)\n", r);
>>>> +
>>>> +error_unref:
>>>> +	amdgpu_bo_unref(&root);
>>>> +
>>>> +	return false;
>>>> +}
>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h
>>>> index 0a97dc839f3b..4dbbe1b6b413 100644
>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h
>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h
>>>> @@ -413,6 +413,8 @@ void amdgpu_vm_check_compute_bug(struct amdgpu_device *adev);
>>>>      
>>>>      void amdgpu_vm_get_task_info(struct amdgpu_device *adev, unsigned int pasid,
>>>>      			     struct amdgpu_task_info *task_info);
>>>> +bool amdgpu_vm_handle_fault(struct amdgpu_device *adev, unsigned int pasid,
>>>> +			    uint64_t addr);
>>>>      
>>>>      void amdgpu_vm_set_task_info(struct amdgpu_vm *vm);
>>>>      
>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c b/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c
>>>> index 9d15679df6e0..15a1ce51befa 100644
>>>> --- a/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c
>>>> +++ b/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c
>>>> @@ -353,6 +353,10 @@ static int gmc_v9_0_process_interrupt(struct amdgpu_device *adev,
>>>>      	}
>>>>      
>>>>      	/* If it's the first fault for this address, process it normally */
>>>> +	if (retry_fault && !in_interrupt() &&
>>>> +	    amdgpu_vm_handle_fault(adev, entry->pasid, addr))
>>>> +		return 1; /* This also prevents sending it to KFD */
>>> The !in_interrupt() is meant to only do this on the rerouted
>>> interrupt ring that's handled by a worker function?
>> Yes, exactly. But I plan to add a workaround where the CPU redirects the fault to the other ring buffer for firmware versions which don't have the re-routing.
>>
>> That adds quite a bit of overhead on Vega10 because of the fault storm, but it should work fine on Vega20.
>>
>> The key point is that we have already released firmware without the redirection, but it's still better to have the workaround than to run into the fault storm.
>>
>>> Looks like amdgpu_vm_handle_fault never returns true for now. So
>>> we'll never get to the "return 1" here.
>> Oh, yes, that actually belongs in a follow-up patch.
>>
>> Thanks,
>> Christian.
>>
>>> Regards,
>>>       Felix
>>>
>>>
>>>> +
>>>>      	if (!amdgpu_sriov_vf(adev)) {
>>>>      		/*
>>>>      		 * Issue a dummy read to wait for the status register to

_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 25+ messages in thread

end of thread, other threads:[~2019-09-11  9:31 UTC | newest]

Thread overview: 25+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-09-04 15:02 Graceful page fault handling for Vega/Navi Christian König
     [not found] ` <20190904150230.13885-1-christian.koenig-5C7GfCeVMHo@public.gmane.org>
2019-09-04 15:02   ` [PATCH 1/9] drm/ttm: return -EBUSY on pipelining with no_gpu_wait Christian König
2019-09-04 15:02   ` [PATCH 2/9] drm/amdgpu: split the VM entity into direct and delayed Christian König
2019-09-04 15:02   ` [PATCH 3/9] drm/amdgpu: allow direct submission in the VM backends v2 Christian König
2019-09-04 15:02   ` [PATCH 4/9] drm/amdgpu: allow direct submission of PDE updates Christian König
     [not found]     ` <20190904150230.13885-5-christian.koenig-5C7GfCeVMHo@public.gmane.org>
2019-09-04 22:48       ` Kuehling, Felix
2019-09-04 15:02   ` [PATCH 5/9] drm/amdgpu: allow direct submission of PTE updates Christian König
2019-09-04 15:02   ` [PATCH 6/9] drm/amdgpu: allow direct submission of clears Christian König
2019-09-04 15:02   ` [PATCH 7/9] drm/amdgpu: allocate PDs/PTs with no_gpu_wait in a page fault Christian König
2019-09-04 15:02   ` [PATCH 8/9] drm/amdgpu: reserve the root PD while freeing PASIDs Christian König
2019-09-04 15:02   ` [PATCH 9/9] drm/amdgpu: add graceful VM fault handling v2 Christian König
     [not found]     ` <20190904150230.13885-10-christian.koenig-5C7GfCeVMHo@public.gmane.org>
2019-09-04 20:12       ` Yang, Philip
     [not found]         ` <844c2c90-8be4-df05-9df9-c9c9cde9b186-5C7GfCeVMHo@public.gmane.org>
2019-09-09 12:03           ` Christian König
     [not found]             ` <db73b357-b7e4-f818-e57c-234a45f7c5cb-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
2019-09-09 13:58               ` Yang, Philip
     [not found]                 ` <4ebc27be-a403-0a0b-e9ff-4e5e18c5417c-5C7GfCeVMHo@public.gmane.org>
2019-09-09 17:15                   ` Koenig, Christian
2019-09-04 22:47       ` Kuehling, Felix
     [not found]         ` <4b5365d3-cc8c-1c2a-2675-f74baa4b9e8b-5C7GfCeVMHo@public.gmane.org>
2019-09-09 12:07           ` Christian König
     [not found]             ` <3051c690-fabb-953a-3ae6-2cccfe8a502c-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
2019-09-09 14:08               ` Zeng, Oak
     [not found]                 ` <BL0PR12MB2580535133C5BF8E771CE30680B70-b4cIHhjg/p/XzH18dTCKOgdYzm3356FpvxpqHgZTriW3zl9H0oFU5g@public.gmane.org>
2019-09-09 17:14                   ` Koenig, Christian
     [not found]                     ` <8d28a385-6ad2-20fe-12df-72b3fbf034e5-5C7GfCeVMHo@public.gmane.org>
2019-09-10 13:30                       ` Zeng, Oak
     [not found]                         ` <BL0PR12MB258025DD4004AB34CB02A1AB80B60-b4cIHhjg/p/XzH18dTCKOgdYzm3356FpvxpqHgZTriW3zl9H0oFU5g@public.gmane.org>
2019-09-11  9:31                           ` Koenig, Christian
2019-09-04 22:52   ` Graceful page fault handling for Vega/Navi Kuehling, Felix
     [not found]     ` <03e2274c-a0dd-41f4-c5e0-26e371d01d23-5C7GfCeVMHo@public.gmane.org>
2019-09-09 12:09       ` Christian König
2019-09-04 23:03   ` Huang, Ray
     [not found]     ` <MN2PR12MB3309C53565E6D4E2E9FE23C7ECB80-rweVpJHSKTpWdvXm18W95QdYzm3356FpvxpqHgZTriW3zl9H0oFU5g@public.gmane.org>
2019-09-04 23:28       ` Kuehling, Felix
