* Per file OOM badness
From: Christian König @ 2022-05-31  9:59 UTC (permalink / raw)
  To: linux-media, linux-kernel, intel-gfx, amd-gfx, nouveau,
	linux-tegra, linux-fsdevel, linux-mm
  Cc: christian.koenig, alexander.deucher, daniel, viro, akpm, hughd,
	andrey.grodzovsky

Hello everyone, 

To summarize the issue I'm trying to address here: Processes can allocate
resources through a file descriptor without being held responsible for
them.

Especially for the DRM graphics driver subsystem this is rather
problematic. Modern games tend to allocate huge amounts of system memory
through the DRM drivers to make it accessible to GPU rendering.

But even outside of the DRM subsystem this problem exists and it is
trivial to exploit. See the following simple example of
using memfd_create():

         fd = memfd_create("test", 0);
         while (1)
                 write(fd, page, 4096);

Compile this and you can bring down any standard desktop system within
seconds.

The background is that the OOM killer will kill every other process in the
system, but not the one which holds the only reference to the memory
allocated by the memfd.

These problems were brought up on the mailing list multiple times now
[1][2][3], but without any final conclusion on how to address them. Since
file descriptors are considered shared, a process cannot be held directly
accountable for allocations made through them. In addition, file
descriptors can easily move between processes.

So instead of trying to account the allocated memory to a specific
process, this patch set adds a callback to struct file_operations which
the OOM killer can use to query the specific OOM badness of a file
reference. This badness is then divided by the file_count, so that every
process using a shmem file, DMA-buf or DRM driver gets its equal share of
the OOM badness.
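
To illustrate the division by file_count, here is a minimal user-space
sketch. It is not the kernel code from the patches: struct model_file,
model_file_badness() and model_process_badness() are hypothetical names
made up for this example, and the assumption that the per-file share is
simply added to the resident pages of a process is mine.

```c
#include <stddef.h>

/* Hypothetical user-space model; none of these names come from the patches. */
struct model_file {
	long pages;       /* pages allocated through this file */
	long file_count;  /* number of references to this file */
};

/* Badness contributed by one reference: the total divided by the refcount. */
static long model_file_badness(const struct model_file *f)
{
	if (f->file_count <= 0)
		return 0;
	return f->pages / f->file_count;
}

/*
 * Assumed scoring: the resident pages of the process plus its share of
 * every file it holds open.
 */
static long model_process_badness(long rss_pages,
				  const struct model_file *const *files,
				  size_t nfiles)
{
	long points = rss_pages;
	size_t i;

	for (i = 0; i < nfiles; i++)
		points += model_file_badness(files[i]);

	return points;
}
```

With a file of 1000 pages referenced four times, each holder is charged
250 pages for it. A process that additionally holds a private 300 page
file and has 100 resident pages ends up with 650 points in this model.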

Callbacks are then implemented for the two core users (memfd and DMA-buf)
as well as 72 DRM based graphics drivers.

The result is that the OOM killer can now judge much better whether a
process is worth killing to free up memory. This results in noticeably
better system stability in OOM situations, especially while running games.

The only other possibility I can see would be to change the accounting of
resources whenever references to the file structure change, but this would
mean quite some additional overhead for a rather common operation.

Additionally, I think trying to limit device driver allocations using
cgroups is orthogonal to this effort. While cgroups are very useful, they
work on per-process limits and try to enforce a collaborative model on
memory management, while the OOM killer enforces a competitive model.

Please comment and/or review. We have had this problem flying around for
years now and are now at a point where we finally need to find a solution
for it.

Regards,
Christian.

[1] https://lists.freedesktop.org/archives/dri-devel/2015-September/089778.html
[2] https://lkml.org/lkml/2018/1/18/543
[3] https://lkml.org/lkml/2021/2/4/799


