All of lore.kernel.org
 help / color / mirror / Atom feed
From: Tvrtko Ursulin <tursulin@igalia.com>
To: amd-gfx@lists.freedesktop.org, dri-devel@lists.freedesktop.org
Cc: kernel-dev@igalia.com,
	"Tvrtko Ursulin" <tvrtko.ursulin@igalia.com>,
	"Christian König" <christian.koenig@amd.com>,
	"Friedrich Vock" <friedrich.vock@gmx.de>
Subject: [RFC 0/5] Discussion around eviction improvements
Date: Wed,  8 May 2024 19:09:40 +0100	[thread overview]
Message-ID: <20240508180946.96863-1-tursulin@igalia.com> (raw)

From: Tvrtko Ursulin <tvrtko.ursulin@igalia.com>

Last few days I was looking at the situation with VRAM over subscription, what
happens versus what perhaps should happen. Browsing through the driver and
running some simple experiments.

I ended up with this patch series which, as a disclaimer, may be completely
wrong but as I found some suspicious things, to me at least, I thought it was a
good point to stop and request some comments.

To perhaps summarise what are the main issues I think I found:

 * Migration rate limiting does not bother knowing if actual migration happened
   and so can over-account and unfairly penalise.

 * Migration rate limiting does not even work, at least not for the common case
   where userspace configures VRAM+GTT. It thinks it can stop migration attempts
   by playing with bo->allowed_domains vs bo->preferred domains but, both from
   the code, and from empirical experiments, I see that not working at all. Both
   masks are identical so fiddling with them achieves nothing.

 * Idea of the fallback placement only works when VRAM has free space. As soon
   as it does not, ttm_resource_compatible is happy to leave the buffers in the
   secondary placement forever.

 * Driver thinks it will be re-validating evicted buffers on the next submission
   but it does not for the very common case of VRAM+GTT because it only checks
   if current placement is *none* of the preferred placements.

All those problems are addressed in individual patches.

End result of this series appears to be driver which will try harder to move
buffers back into VRAM, but will be (more) correctly throttled in doing so by
the existing rate limiting logic.

I have run a quick benchmark of Cyberpunk 2077 and cannot say that I saw a
change but that could be a good thing too. At least I did not break anything,
perhaps.. On one occassion I did see the rate limiting logic get confused while
for a period of few minutes it went to a mode where it was constantly giving a
high migration budget. But that recovered itself when I switched clients and did
not come back so I don't know. If there is something wrong there I don't think
it would be caused by any patches in this series.

Series is probably rough but should be good enough for dicsussion. I am curious
to hear if I identified at least something correctly as a real problem.

It would also be good to hear what are the suggested games to check and see
whether there is any improvement.

Cc: Christian König <christian.koenig@amd.com>
Cc: Friedrich Vock <friedrich.vock@gmx.de>

Tvrtko Ursulin (5):
  drm/amdgpu: Fix migration rate limiting accounting
  drm/amdgpu: Actually respect buffer migration budget
  drm/ttm: Add preferred placement flag
  drm/amdgpu: Use preferred placement for VRAM+GTT
  drm/amdgpu: Re-validate evicted buffers

 drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c     | 38 +++++++++++++++++-----
 drivers/gpu/drm/amd/amdgpu/amdgpu_object.c |  8 +++--
 drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c     | 21 ++++++++++--
 drivers/gpu/drm/ttm/ttm_resource.c         | 13 +++++---
 include/drm/ttm/ttm_placement.h            |  3 ++
 5 files changed, 65 insertions(+), 18 deletions(-)

-- 
2.44.0


             reply	other threads:[~2024-05-08 18:10 UTC|newest]

Thread overview: 20+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-05-08 18:09 Tvrtko Ursulin [this message]
2024-05-08 18:09 ` [RFC 1/5] drm/amdgpu: Fix migration rate limiting accounting Tvrtko Ursulin
2024-05-08 19:08   ` Friedrich Vock
2024-05-09  9:19     ` Tvrtko Ursulin
2024-05-13 14:36       ` Friedrich Vock
2024-05-15  7:14   ` Christian König
2024-05-15 10:51     ` Tvrtko Ursulin
2024-05-08 18:09 ` [RFC 2/5] drm/amdgpu: Actually respect buffer migration budget Tvrtko Ursulin
2024-05-15  7:20   ` Christian König
2024-05-15 10:59     ` Tvrtko Ursulin
2024-05-15 14:31       ` Christian König
2024-05-15 15:13         ` Tvrtko Ursulin
2024-05-08 18:09 ` [RFC 3/5] drm/ttm: Add preferred placement flag Tvrtko Ursulin
2024-05-08 18:09 ` [RFC 4/5] drm/amdgpu: Use preferred placement for VRAM+GTT Tvrtko Ursulin
2024-05-08 18:09 ` [RFC 5/5] drm/amdgpu: Re-validate evicted buffers Tvrtko Ursulin
2024-05-09 12:40 ` [RFC 0/5] Discussion around eviction improvements Tvrtko Ursulin
2024-05-13 13:49   ` Tvrtko Ursulin
2024-05-14 15:14     ` Tvrtko Ursulin
2024-05-14 15:47       ` Christian König
2024-05-13  6:50 ` Christian König

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20240508180946.96863-1-tursulin@igalia.com \
    --to=tursulin@igalia.com \
    --cc=amd-gfx@lists.freedesktop.org \
    --cc=christian.koenig@amd.com \
    --cc=dri-devel@lists.freedesktop.org \
    --cc=friedrich.vock@gmx.de \
    --cc=kernel-dev@igalia.com \
    --cc=tvrtko.ursulin@igalia.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.