From: "Christian König" <ckoenig.leichtzumerken@gmail.com>
To: "Michal Hocko" <mhocko@suse.com>,
	"Christian König" <christian.koenig@amd.com>
Cc: andrey.grodzovsky@amd.com, linux-mm@kvack.org,
	nouveau@lists.freedesktop.org, intel-gfx@lists.freedesktop.org,
	hughd@google.com, linux-kernel@vger.kernel.org,
	amd-gfx@lists.freedesktop.org, linux-fsdevel@vger.kernel.org,
	viro@zeniv.linux.org.uk, daniel@ffwll.ch,
	linux-tegra@vger.kernel.org, alexander.deucher@amd.com,
	akpm@linux-foundation.org, linux-media@vger.kernel.org
Subject: Re: [Nouveau] [PATCH 03/13] mm: shmem: provide oom badness for shmem files
Date: Mon, 13 Jun 2022 14:55:54 +0200	[thread overview]
Message-ID: <34daa8ab-a9f4-8f7b-0ea7-821bc36b9497@gmail.com> (raw)
In-Reply-To: <YqcpZY3Xx7Mk2ROH@dhcp22.suse.cz>

Am 13.06.22 um 14:11 schrieb Michal Hocko:
> [SNIP]
>>>> Alternatively I could try to track the "owner" of a buffer (e.g. a shmem
>>>> file), but then it can happen that one process creates the object and
>>>> another one is writing to it and actually allocating the memory.
>>> If you can enforce that the owner is really responsible for the
>>> allocation then all should be fine. That would require MAP_POPULATE like
>>> semantic and I suspect this is not really feasible with the existing
>>> userspace. It would be certainly hard to enforce for bad players.
>> I've tried this today and the result was: "BUG: Bad rss-counter state
>> mm:000000008751d9ff type:MM_FILEPAGES val:-571286".
>>
>> The problem is once more that files are not informed when the process
>> clones. So what happened is that somebody called fork() with an mm_struct
>> I've accounted my pages to. The result is just that we messed up the
>> rss_stats and got the "BUG..." above.
>>
>> The key difference between normal allocated pages and the resources here is
>> just that we are not bound to an mm_struct in any way.
> It is not really clear to me what exactly you have tried.

I've tried to track the "owner" of a driver connection by keeping a 
reference to the mm_struct that created this connection inside our file 
private, and then using add_mm_counter() to account all of the driver's 
allocations to this mm_struct.

This works to the extent that the right process is now killed in an OOM 
situation. The problem with this approach is that the driver is not 
informed about operations like fork() or clone(), so what happens is 
that after a fork()/clone() we have an unbalanced rss-counter.
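
For reference, the accounting I tried looks roughly like the sketch below. This is simplified, hypothetical driver code, not the actual amdgpu patch; the my_drv_* names and the struct layout are made up for illustration, only mmgrab()/mmdrop() and add_mm_counter() are real kernel APIs:

```c
/* Hypothetical sketch: pin the opener's mm_struct in the driver's
 * file private and charge driver page allocations to it. */
struct my_drv_file {
	struct mm_struct *mm;	/* owner mm, taken at open() time */
};

static int my_drv_open(struct inode *inode, struct file *filp)
{
	struct my_drv_file *fpriv = kzalloc(sizeof(*fpriv), GFP_KERNEL);

	if (!fpriv)
		return -ENOMEM;

	fpriv->mm = current->mm;
	mmgrab(fpriv->mm);	/* pin the mm_struct, not the address space */
	filp->private_data = fpriv;
	return 0;
}

/* Called for every page the driver allocates on behalf of this file. */
static void my_drv_account_page(struct my_drv_file *fpriv)
{
	add_mm_counter(fpriv->mm, MM_FILEPAGES, 1);
}

/* Inverse on free.  This is where it breaks: the driver is never told
 * about fork()/clone(), so increments and decrements can end up on
 * different mm_structs, leaving a counter negative and triggering the
 * "BUG: Bad rss-counter state" splat when the mm is torn down. */
static void my_drv_unaccount_page(struct my_drv_file *fpriv)
{
	add_mm_counter(fpriv->mm, MM_FILEPAGES, -1);
}

static int my_drv_release(struct inode *inode, struct file *filp)
{
	struct my_drv_file *fpriv = filp->private_data;

	mmdrop(fpriv->mm);	/* drop the pin taken at open() */
	kfree(fpriv);
	return 0;
}
```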

Let me get back to the initial question: we have resources which 
are not related to the virtual address space of a process; how should we 
tell the OOM killer about them?

Thanks for all the input so far,
Christian.


  reply	other threads:[~2022-06-13 12:56 UTC|newest]

Thread overview: 145+ messages
2022-05-31  9:59 Per file OOM badness Christian König
2022-05-31  9:59 ` [PATCH 01/13] fs: add OOM badness callback to file_operatrations struct Christian König
2022-05-31  9:59 ` [PATCH 02/13] oom: take per file badness into account Christian König
2022-05-31  9:59 ` [PATCH 03/13] mm: shmem: provide oom badness for shmem files Christian König
2022-06-09  9:18   ` Michal Hocko
2022-06-09 12:16     ` Christian König
2022-06-09 12:57       ` Michal Hocko
2022-06-09 14:10         ` Christian König
2022-06-09 14:21           ` Michal Hocko
2022-06-09 14:29             ` Christian König
2022-06-09 15:07               ` Michal Hocko
2022-06-10 10:58                 ` Christian König
2022-06-10 11:44                   ` Michal Hocko
2022-06-10 12:17                     ` Christian König
2022-06-10 14:16                       ` Michal Hocko
2022-06-11  8:06                         ` Christian König
2022-06-13  7:45                           ` Michal Hocko
2022-06-13 11:50                             ` Christian König
2022-06-13 12:11                               ` Michal Hocko
2022-06-13 12:55                                 ` Christian König [this message]
2022-06-13 14:11                                   ` Michal Hocko
2022-06-15 12:35                                     ` Christian König
2022-06-15 13:15                                       ` Michal Hocko
2022-06-15 14:24                                         ` Christian König
2022-06-13  9:08                           ` Michel Dänzer
2022-06-13  9:11                             ` Christian König
2022-06-09 15:19             ` Felix Kuehling
2022-06-09 15:22               ` Christian König
2022-06-09 15:54                 ` Michal Hocko
2022-05-31  9:59 ` [PATCH 04/13] dma-buf: provide oom badness for DMA-buf files Christian König
2022-05-31  9:59 ` [PATCH 05/13] drm/gem: adjust per file OOM badness on handling buffers Christian König
2022-05-31 10:00 ` [PATCH 06/13] drm/gma500: use drm_oom_badness Christian König
2022-05-31 10:00 ` [PATCH 07/13] drm/amdgpu: Use drm_oom_badness for amdgpu Christian König
2022-05-31 10:00 ` [PATCH 08/13] drm/radeon: use drm_oom_badness Christian König
2022-05-31 10:00 ` [PATCH 09/13] drm/i915: " Christian König
2022-05-31 10:00 ` [PATCH 10/13] drm/nouveau: " Christian König
2022-05-31 10:00 ` [PATCH 11/13] drm/omap: " Christian König
2022-05-31 10:00 ` [PATCH 12/13] drm/vmwgfx: " Christian König
2022-05-31 10:00 ` [PATCH 13/13] drm/tegra: " Christian König
2022-05-31 22:00 ` Per file OOM badness Alex Deucher
