Dmesg doesn't contain anything. There is no backtrace because it's not a crash. The VA map ioctl just fails with the new flag. It looks like the flag is considered invalid. Marek On Mon., May 16, 2022, 12:13 Christian König, < ckoenig.leichtzumerken@gmail.com> wrote: > I don't have access to any gfx10 hardware. > > Can you give me a dmesg and/or backtrace, etc..? > > I can't push this unless it's working properly. > > Christian. > > Am 16.05.22 um 14:56 schrieb Marek Olšák: > > Reproduction steps: > - use mesa/main on gfx10.3 (not sure what other GPUs do) > - run: radeonsi_mall_noalloc=true glxgears > > Marek > > On Mon, May 16, 2022 at 7:53 AM Christian König < > ckoenig.leichtzumerken@gmail.com> wrote: > >> Crap, do you have a link to the failure? >> >> Am 16.05.22 um 13:10 schrieb Marek Olšák: >> >> I forgot to say: The NOALLOC flag causes an allocation failure, so there >> is a kernel bug somewhere. >> >> Marek >> >> On Mon, May 16, 2022 at 7:06 AM Marek Olšák wrote: >> >>> FYI, I think it's time to merge this because the Mesa commits are going >>> to be merged in ~30 minutes if Gitlab CI is green, and that includes >>> updated amdgpu_drm.h. >>> >>> Marek >>> >>> On Wed, May 11, 2022 at 2:55 PM Marek Olšák wrote: >>> >>>> Ok sounds good. >>>> >>>> Marek >>>> >>>> On Wed., May 11, 2022, 03:43 Christian König, < >>>> ckoenig.leichtzumerken@gmail.com> wrote: >>>> >>>>> It really *is* a NOALLOC feature. In other words there is no latency >>>>> improvement on reads because the cache is always checked, even with the >>>>> noalloc flag set. >>>>> >>>>> The only thing it affects is that misses not enter the cache and so >>>>> don't cause any additional pressure on evicting cache lines. >>>>> >>>>> You might want to double check with the hardware guys, but I'm >>>>> something like 95% sure that it works this way. >>>>> >>>>> Christian. >>>>> >>>>> Am 11.05.22 um 09:22 schrieb Marek Olšák: >>>>> >>>>> Bypass means that the contents of the cache are ignored, which >>>>> decreases latency at the cost of no coherency between bypassed and normal >>>>> memory requests. NOA (noalloc) means that the cache is checked and can give >>>>> you cache hits, but misses are not cached and the overall latency is >>>>> higher. I don't know what the hw does, but I hope it was misnamed and it >>>>> really means bypass because there is no point in doing cache lookups on >>>>> every memory request if the driver wants to disable caching to *decrease* >>>>> latency in the situations when the cache isn't helping. >>>>> >>>>> Marek >>>>> >>>>> On Wed, May 11, 2022 at 2:15 AM Lazar, Lijo >>>>> wrote: >>>>> >>>>>> >>>>>> >>>>>> On 5/11/2022 11:36 AM, Christian König wrote: >>>>>> > Mhm, it doesn't really bypass MALL. It just doesn't allocate any >>>>>> MALL >>>>>> > entries on write. >>>>>> > >>>>>> > How about AMDGPU_VM_PAGE_NO_MALL ? >>>>>> >>>>>> One more - AMDGPU_VM_PAGE_LLC_* [ LLC = last level cache, * = some >>>>>> sort >>>>>> of attribute which decides LLC behaviour] >>>>>> >>>>>> Thanks, >>>>>> Lijo >>>>>> >>>>>> > >>>>>> > Christian. >>>>>> > >>>>>> > Am 10.05.22 um 23:21 schrieb Marek Olšák: >>>>>> >> A better name would be: >>>>>> >> AMDGPU_VM_PAGE_BYPASS_MALL >>>>>> >> >>>>>> >> Marek >>>>>> >> >>>>>> >> On Fri, May 6, 2022 at 7:23 AM Christian König >>>>>> >> wrote: >>>>>> >> >>>>>> >> Add the AMDGPU_VM_NOALLOC flag to let userspace control MALL >>>>>> >> allocation. >>>>>> >> >>>>>> >> Only compile tested! >>>>>> >> >>>>>> >> Signed-off-by: Christian König >>>>>> >> --- >>>>>> >> drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c | 2 ++ >>>>>> >> drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c | 3 +++ >>>>>> >> drivers/gpu/drm/amd/amdgpu/gmc_v11_0.c | 3 +++ >>>>>> >> include/uapi/drm/amdgpu_drm.h | 2 ++ >>>>>> >> 4 files changed, 10 insertions(+) >>>>>> >> >>>>>> >> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c >>>>>> >> b/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c >>>>>> >> index bf97d8f07f57..d8129626581f 100644 >>>>>> >> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c >>>>>> >> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c >>>>>> >> @@ -650,6 +650,8 @@ uint64_t amdgpu_gem_va_map_flags(struct >>>>>> >> amdgpu_device *adev, uint32_t flags) >>>>>> >> pte_flag |= AMDGPU_PTE_WRITEABLE; >>>>>> >> if (flags & AMDGPU_VM_PAGE_PRT) >>>>>> >> pte_flag |= AMDGPU_PTE_PRT; >>>>>> >> + if (flags & AMDGPU_VM_PAGE_NOALLOC) >>>>>> >> + pte_flag |= AMDGPU_PTE_NOALLOC; >>>>>> >> >>>>>> >> if (adev->gmc.gmc_funcs->map_mtype) >>>>>> >> pte_flag |= amdgpu_gmc_map_mtype(adev, >>>>>> >> diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c >>>>>> >> b/drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c >>>>>> >> index b8c79789e1e4..9077dfccaf3c 100644 >>>>>> >> --- a/drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c >>>>>> >> +++ b/drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c >>>>>> >> @@ -613,6 +613,9 @@ static void gmc_v10_0_get_vm_pte(struct >>>>>> >> amdgpu_device *adev, >>>>>> >> *flags &= ~AMDGPU_PTE_MTYPE_NV10_MASK; >>>>>> >> *flags |= (mapping->flags & >>>>>> AMDGPU_PTE_MTYPE_NV10_MASK); >>>>>> >> >>>>>> >> + *flags &= ~AMDGPU_PTE_NOALLOC; >>>>>> >> + *flags |= (mapping->flags & AMDGPU_PTE_NOALLOC); >>>>>> >> + >>>>>> >> if (mapping->flags & AMDGPU_PTE_PRT) { >>>>>> >> *flags |= AMDGPU_PTE_PRT; >>>>>> >> *flags |= AMDGPU_PTE_SNOOPED; >>>>>> >> diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v11_0.c >>>>>> >> b/drivers/gpu/drm/amd/amdgpu/gmc_v11_0.c >>>>>> >> index 8d733eeac556..32ee56adb602 100644 >>>>>> >> --- a/drivers/gpu/drm/amd/amdgpu/gmc_v11_0.c >>>>>> >> +++ b/drivers/gpu/drm/amd/amdgpu/gmc_v11_0.c >>>>>> >> @@ -508,6 +508,9 @@ static void gmc_v11_0_get_vm_pte(struct >>>>>> >> amdgpu_device *adev, >>>>>> >> *flags &= ~AMDGPU_PTE_MTYPE_NV10_MASK; >>>>>> >> *flags |= (mapping->flags & >>>>>> AMDGPU_PTE_MTYPE_NV10_MASK); >>>>>> >> >>>>>> >> + *flags &= ~AMDGPU_PTE_NOALLOC; >>>>>> >> + *flags |= (mapping->flags & AMDGPU_PTE_NOALLOC); >>>>>> >> + >>>>>> >> if (mapping->flags & AMDGPU_PTE_PRT) { >>>>>> >> *flags |= AMDGPU_PTE_PRT; >>>>>> >> *flags |= AMDGPU_PTE_SNOOPED; >>>>>> >> diff --git a/include/uapi/drm/amdgpu_drm.h >>>>>> >> b/include/uapi/drm/amdgpu_drm.h >>>>>> >> index 57b9d8f0133a..9d71d6330687 100644 >>>>>> >> --- a/include/uapi/drm/amdgpu_drm.h >>>>>> >> +++ b/include/uapi/drm/amdgpu_drm.h >>>>>> >> @@ -533,6 +533,8 @@ struct drm_amdgpu_gem_op { >>>>>> >> #define AMDGPU_VM_MTYPE_UC (4 << 5) >>>>>> >> /* Use Read Write MTYPE instead of default MTYPE */ >>>>>> >> #define AMDGPU_VM_MTYPE_RW (5 << 5) >>>>>> >> +/* don't allocate MALL */ >>>>>> >> +#define AMDGPU_VM_PAGE_NOALLOC (1 << 9) >>>>>> >> >>>>>> >> struct drm_amdgpu_gem_va { >>>>>> >> /** GEM object handle */ >>>>>> >> -- >>>>>> >> 2.25.1 >>>>>> >> >>>>>> > >>>>>> >>>>> >>>>> >> >