* [Bug 214425] New: [drm][amdgpu][TTM] Page pool memory never gets freed
@ 2021-09-15 21:09 bugzilla-daemon
  2022-10-15 14:34 ` [Bug 214425] " bugzilla-daemon
                   ` (3 more replies)
  0 siblings, 4 replies; 5+ messages in thread
From: bugzilla-daemon @ 2021-09-15 21:09 UTC (permalink / raw)
  To: dri-devel

https://bugzilla.kernel.org/show_bug.cgi?id=214425

            Bug ID: 214425
           Summary: [drm][amdgpu][TTM] Page pool memory never gets freed
           Product: Drivers
           Version: 2.5
    Kernel Version: 5.14.3
          Hardware: x86-64
                OS: Linux
              Tree: Mainline
            Status: NEW
          Severity: normal
          Priority: P1
         Component: Video(DRI - non Intel)
          Assignee: drivers_video-dri@kernel-bugs.osdl.org
          Reporter: doucha@swarmtech.cz
        Regression: No

Hello,
while playing certain WebGL games, I've noticed what appears to be a memory
leak in the kernel. Further investigation revealed that after about an hour of
gameplay, over 3 GB of memory (half of all available RAM on my machine) is
taken by the TTM page pool.

While the excessive allocation may be caused by a resource leak in the game
itself (I need to investigate that further), the larger problem is that TTM
never releases the memory even after I quit the game. Closing the game only
moves the allocated memory from active buffer objects to the idle memory pool,
where it stays until I reboot the system. Shutting down the X server doesn't
release the memory either.

System specs:
HP Probook 455 G7
AMD Ryzen 5 4500U CPU
AMD Renoir GPU (Mesa 21.2.1, LLVM 12.0)
Gentoo Linux

TTM statistics before quitting the game:
/sys/kernel/debug/ttm/buffer_objects:
3116

/sys/kernel/debug/ttm/page_pool:
          --- 0--- --- 1--- --- 2--- --- 3--- --- 4--- --- 5--- --- 6--- --- 7--- --- 8--- --- 9--- ---10---
wc      :        2        2        1        1        8        2        0        1        2        1        2
uc      :        0        0        0        0        0        0        0        0        0        0        0
wc 32   :        0        0        0        0        0        0        0        0        0        0        0
uc 32   :        0        0        0        0        0        0        0        0        0        0        0

total   :     3410 of   939433

/sys/kernel/debug/ttm/page_pool_shrink:
2898/512


=======================================

TTM statistics after quitting the game (until reboot):
/sys/kernel/debug/ttm/buffer_objects:
403

/sys/kernel/debug/ttm/page_pool:
          --- 0--- --- 1--- --- 2--- --- 3--- --- 4--- --- 5--- --- 6--- --- 7--- --- 8--- --- 9--- ---10---
wc      :      151      134       20        5      255      241      790      193      416     1121       83
uc      :        0        0        0        0        0        0        0        0        0        0        0
wc 32   :        0        0        0        0        0        0        0        0        0        0        0
uc 32   :        0        0        0        0        0        0        0        0        0        0        0

total   :   853035 of   939433

/sys/kernel/debug/ttm/page_pool_shrink:
853034/1
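
To relate the `total` line to the "over 3 GB" figure in the report, the page count can be converted to MiB. The following is a minimal sketch, not part of the original report: the function name `ttm_pool_mib` is hypothetical, and the 4 KiB page size is an assumption (the usual base page size on x86-64).

```shell
# Print the TTM page pool size in MiB, rounded down.
# $1: path to the page_pool file (defaults to the debugfs path from the report)
# $2: page size in KiB (assumed 4, the usual x86-64 base page size)
ttm_pool_mib() {
    awk -v page_kib="${2:-4}" '
        # Matches the summary line, e.g. "total   :   853035 of   939433"
        $1 == "total" { printf "%d\n", $3 * page_kib / 1024; found = 1 }
        END { exit !found }   # non-zero status if no summary line was seen
    ' "${1:-/sys/kernel/debug/ttm/page_pool}"
}
```

Under that assumption, the 3410 pages reported before quitting the game correspond to 13 MiB, and the 853035 pages reported afterwards correspond to 3332 MiB.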

-- 
You may reply to this email to add a comment.

You are receiving this mail because:
You are watching the assignee of the bug.


* [Bug 214425] [drm][amdgpu][TTM] Page pool memory never gets freed
  2021-09-15 21:09 [Bug 214425] New: [drm][amdgpu][TTM] Page pool memory never gets freed bugzilla-daemon
@ 2022-10-15 14:34 ` bugzilla-daemon
  2022-10-15 14:47 ` bugzilla-daemon
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 5+ messages in thread
From: bugzilla-daemon @ 2022-10-15 14:34 UTC (permalink / raw)
  To: dri-devel

https://bugzilla.kernel.org/show_bug.cgi?id=214425

Rafael Ristovski (rafael.ristovski@gmail.com) changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |rafael.ristovski@gmail.com

--- Comment #1 from Rafael Ristovski (rafael.ristovski@gmail.com) ---
According to amdgpu devs, this is a feature where the allocated pages are kept
around in case they are needed later on. TTM is able to release the memory in
case the memory pressure increases.

See comment here:
https://gitlab.freedesktop.org/drm/amd/-/issues/1942#note_1311016


* [Bug 214425] [drm][amdgpu][TTM] Page pool memory never gets freed
  2021-09-15 21:09 [Bug 214425] New: [drm][amdgpu][TTM] Page pool memory never gets freed bugzilla-daemon
  2022-10-15 14:34 ` [Bug 214425] " bugzilla-daemon
@ 2022-10-15 14:47 ` bugzilla-daemon
  2022-10-15 15:05 ` bugzilla-daemon
  2022-10-15 16:14 ` bugzilla-daemon
  3 siblings, 0 replies; 5+ messages in thread
From: bugzilla-daemon @ 2022-10-15 14:47 UTC (permalink / raw)
  To: dri-devel

https://bugzilla.kernel.org/show_bug.cgi?id=214425

--- Comment #2 from Martin Doucha (doucha@swarmtech.cz) ---
(In reply to Rafael Ristovski from comment #1)
> According to amdgpu devs, this is a feature where the allocated pages are
> kept around in case they are needed later on. TTM is able to release the
> memory in case the memory pressure increases.

I understand the logic behind keeping idle buffers allocated for a while. But
it does not make sense to keep them for hours after their last use, and the
release mechanism on increased memory pressure does not seem to be working.

When I run a large compilation overnight, starting from a fresh reboot and
shutting down all graphics software including the X server, I'll often come
back in the morning to find that 70% of all RAM is allocated in idle TTM
buffers and GCC is stuck swapping for hours. The TTM buffers were likely
allocated by some GPU-accelerated build computation halfway through the night.
But this is harder to reproduce than the games I've mentioned in the initial
bug report.


* [Bug 214425] [drm][amdgpu][TTM] Page pool memory never gets freed
  2021-09-15 21:09 [Bug 214425] New: [drm][amdgpu][TTM] Page pool memory never gets freed bugzilla-daemon
  2022-10-15 14:34 ` [Bug 214425] " bugzilla-daemon
  2022-10-15 14:47 ` bugzilla-daemon
@ 2022-10-15 15:05 ` bugzilla-daemon
  2022-10-15 16:14 ` bugzilla-daemon
  3 siblings, 0 replies; 5+ messages in thread
From: bugzilla-daemon @ 2022-10-15 15:05 UTC (permalink / raw)
  To: dri-devel

https://bugzilla.kernel.org/show_bug.cgi?id=214425

--- Comment #3 from Rafael Ristovski (rafael.ristovski@gmail.com) ---
(In reply to Martin Doucha from comment #2)
> (In reply to Rafael Ristovski from comment #1)
> > According to amdgpu devs, this is a feature where the allocated pages are
> > kept around in case they are needed later on. TTM is able to release the
> > memory in case the memory pressure increases.
> 
> I understand the logic behind keeping idle buffers allocated for a while.
> But it does not make sense to keep them for hours after last use and the
> release mechanism on increased memory pressure does not seem to be working.
> 
> When I run a large compilation overnight, starting from a fresh reboot and
> shutting down all graphics software including the X server, I'll often come
> back in the morning to find that 70% of all RAM is allocated in idle TTM
> buffers and GCC is stuck swapping for hours. The TTM buffers were likely
> allocated by some GPU-accelerated build computation halfway through the
> night. But this is harder to reproduce than the games I've mentioned in the
> initial bugreport.

Indeed, I too run into situations where even if I purposefully trigger an OOM
situation just to get the TTM "cache" to evict itself through memory pressure,
_it still does not end up releasing all of the memory_.

There are also the following two debugfs files; simply reading them triggers an
eviction of GTT/VRAM:
> cat /sys/kernel/debug/dri/0/amdgpu_evict_vram
> cat /sys/kernel/debug/dri/0/amdgpu_evict_gtt

This can be confirmed as working with tools like `radeontop`/`nvtop`.

However, this once again does not release the TTM buffers.
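
The two reads above could be wrapped in a small helper that fails loudly when the files are missing or unreadable. This is only a sketch: the function name `amdgpu_evict` is hypothetical, and the default directory assumes the GPU is DRI device 0, as in the commands above.

```shell
# Trigger VRAM and GTT eviction by reading the two debugfs files.
# $1: DRI debugfs directory (defaults to device 0, as in the comment above)
amdgpu_evict() {
    evict_dir="${1:-/sys/kernel/debug/dri/0}"
    for evict_file in amdgpu_evict_vram amdgpu_evict_gtt; do
        if [ ! -r "$evict_dir/$evict_file" ]; then
            echo "cannot read $evict_dir/$evict_file (root and debugfs needed?)" >&2
            return 1
        fi
        # The read itself is the trigger; the returned value is discarded here.
        cat "$evict_dir/$evict_file" > /dev/null || return 1
    done
}
```

Calling `amdgpu_evict` as root uses the default path; passing another directory selects a different DRI device. As noted above, this evicts GTT/VRAM but does not shrink the TTM page pool.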

As you can see in the issue I linked, I never got a reply about a mechanism to
manually release TTM memory. I will attempt coercing an answer on IRC, perhaps
I will have better luck asking directly there.


* [Bug 214425] [drm][amdgpu][TTM] Page pool memory never gets freed
  2021-09-15 21:09 [Bug 214425] New: [drm][amdgpu][TTM] Page pool memory never gets freed bugzilla-daemon
                   ` (2 preceding siblings ...)
  2022-10-15 15:05 ` bugzilla-daemon
@ 2022-10-15 16:14 ` bugzilla-daemon
  3 siblings, 0 replies; 5+ messages in thread
From: bugzilla-daemon @ 2022-10-15 16:14 UTC (permalink / raw)
  To: dri-devel

https://bugzilla.kernel.org/show_bug.cgi?id=214425

--- Comment #4 from Rafael Ristovski (rafael.ristovski@gmail.com) ---
For what it's worth, the following horrible incantation managed to release 2+GB
of TTM buffers on one of my machines, after I purposefully ran a VRAM-intensive
game:
> for i in {1..1000}; do cat /sys/kernel/debug/ttm/page_pool_shrink; done

This seems to be the only debugfs mechanism to cause the memory to get released,
and as of now I am not aware of a... better and mainly "cleaner" alternative.
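
A slightly less blunt variant of the loop above stops as soon as a read frees nothing, instead of always reading 1000 times. This is a sketch under assumptions: the function name `ttm_pool_shrink` is hypothetical, and the number after the `/` in the file's output (512 and 1 in the samples from the initial report) is assumed to be the number of pages freed by that read.

```shell
# Read page_pool_shrink until a read frees nothing or the read budget is spent.
# $1: maximum number of reads (default 1000, as in the one-liner above)
# $2: path to page_pool_shrink (defaults to the debugfs path from the report)
# Prints the number of reads performed.
ttm_pool_shrink() {
    shrink_max="${1:-1000}"
    shrink_file="${2:-/sys/kernel/debug/ttm/page_pool_shrink}"
    shrink_reads=0
    while [ "$shrink_reads" -lt "$shrink_max" ]; do
        shrink_line=$(cat "$shrink_file") || return 1
        shrink_reads=$((shrink_reads + 1))
        # Output looks like "853034/1"; the part after "/" is assumed to be
        # the number of pages freed by this read.
        case "${shrink_line##*/}" in
            0) break ;;
        esac
    done
    echo "$shrink_reads"
}
```

Running `ttm_pool_shrink 5000` as root would allow up to 5000 reads. Comparing the `total` line of `/sys/kernel/debug/ttm/page_pool` before and after shows how much was actually released.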

Newer kernel versions seem to feature
https://www.kernel.org/doc/html/next/admin-guide/mm/shrinker_debugfs.html,
which might be a better alternative, but I have not tested it yet, and its
usage is not exactly clear to me.
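
For reference, the linked documentation describes a per-shrinker `scan` file that accepts a line of the form `<cgroup inode id> <numa id> <number of objects to scan>`. Below is an untested sketch of how that might be driven. The function name `shrinker_scan` is hypothetical, the directory name of the TTM pool shrinker is system-specific and not known here, and passing `0 0` assumes the shrinker is neither memcg-aware nor NUMA-aware.

```shell
# Ask one shrinker to scan (and possibly free) up to N objects.
# $1: shrinker directory under /sys/kernel/debug/shrinker/ (name is system-specific)
# $2: number of objects to scan
shrinker_scan() {
    # Format: "<cgroup inode id> <numa id> <number of objects to scan>".
    # "0 0" is an assumption: a shrinker that is neither memcg- nor NUMA-aware.
    printf '0 0 %d\n' "$2" > "$1/scan"
}
```

Listing `/sys/kernel/debug/shrinker/` on a kernel built with `CONFIG_SHRINKER_DEBUG` should show which directory, if any, belongs to the TTM pool. Whether this releases more memory per call than reading `page_pool_shrink` has not been verified here.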

