All of lore.kernel.org
 help / color / mirror / Atom feed
From: Michal Hocko <mhocko@suse.com>
To: "T.J. Mercier" <tjmercier@google.com>
Cc: "Tejun Heo" <tj@kernel.org>, "Zefan Li" <lizefan.x@bytedance.com>,
	"Johannes Weiner" <hannes@cmpxchg.org>,
	"Jonathan Corbet" <corbet@lwn.net>,
	"Sumit Semwal" <sumit.semwal@linaro.org>,
	"Christian König" <christian.koenig@amd.com>,
	"Roman Gushchin" <roman.gushchin@linux.dev>,
	"Shakeel Butt" <shakeelb@google.com>,
	"Muchun Song" <muchun.song@linux.dev>,
	"Andrew Morton" <akpm@linux-foundation.org>,
	daniel.vetter@ffwll.ch, android-mm@google.com,
	jstultz@google.com, jeffv@google.com, cmllamas@google.com,
	linux-security-module@vger.kernel.org, selinux@vger.kernel.org,
	cgroups@vger.kernel.org, linux-doc@vger.kernel.org,
	linux-kernel@vger.kernel.org, linux-media@vger.kernel.org,
	dri-devel@lists.freedesktop.org, linaro-mm-sig@lists.linaro.org,
	linux-mm@kvack.org
Subject: Re: [PATCH v2 1/4] memcg: Track exported dma-buffers
Date: Wed, 25 Jan 2023 13:05:49 +0100	[thread overview]
Message-ID: <Y9EbHW84ydBzpTTO@dhcp22.suse.cz> (raw)
In-Reply-To: <CABdmKX1c_8LdJJboENnZhwGjrszDWOOVt-Do93-sJW46mZMD6A@mail.gmail.com>

On Tue 24-01-23 10:55:21, T.J. Mercier wrote:
> On Tue, Jan 24, 2023 at 7:00 AM Michal Hocko <mhocko@suse.com> wrote:
> >
> > On Mon 23-01-23 19:17:23, T.J. Mercier wrote:
> > > When a buffer is exported to userspace, use memcg to attribute the
> > > buffer to the allocating cgroup until all buffer references are
> > > released.
> >
> > Is there any reason why this memory cannot be charged during the
> > allocation (__GFP_ACCOUNT used)?
> 
> My main motivation was to keep code changes away from exporters and
> implement the accounting in one common spot for all of them. This is a
> bit of a carryover from a previous approach [1] where there was some
> objection to pushing off this work onto exporters and forcing them to
> adapt, but __GFP_ACCOUNT does seem like a smaller burden than before
> at least initially. However in order to support charge transfer
> between cgroups with __GFP_ACCOUNT we'd need to be able to get at the
> pages backing dmabuf objects, and the exporters are the ones with that
> access. Meaning I think we'd have to add some additional dma_buf_ops
> to achieve that, which was the objection from [1].
> 
> [1] https://lore.kernel.org/lkml/5cc27a05-8131-ce9b-dea1-5c75e994216d@amd.com/
> 
> >
> > Also you do charge and account the memory but underlying pages do not
> > know about their memcg (this is normally done with commit_charge for
> > user mapped pages). This would become a problem if the memory is
> > migrated for example.
> 
> Hmm, what problem do you see in this situation? If the backing pages
> are to be migrated that requires the cooperation of the exporter,
> which currently has no influence on how the cgroup charging is done
> and that seems fine. (Unless you mean migrating the charge across
> cgroups? In which case that's the next patch.)

My main concern was that page migration could lose the external tracking
without some additional steps on the dmabuf front.

> > This also means that you have to maintain memcg
> > reference outside of the memcg proper which is not really nice either.
> > This mimicks tcp kmem limit implementation which I really have to say I
> > am not a great fan of and this pattern shouldn't be coppied.
> >
> Ah, what can I say. This way looked simple to me. I think otherwise
> we're back to making all exporters do more stuff for the accounting.
> 
> > Also you are not really saying anything about the oom behavior. With
> > this implementation the kernel will try to reclaim the memory and even
> > trigger the memcg oom killer if the request size is <= 8 pages. Is this
> > a desirable behavior?
> 
> It will try to reclaim some memory, but not the dmabuf pages right?
> Not *yet* anyway. This behavior sounds expected to me.

Yes, we have discussed that shrinkers will follow up later which is
fine. The question is how much reclaim actually makes sense at this
stage. Charging interface usually copes with sizes resulting from
allocation requests (so usually 1<<order based). I can imagine that a
batch charge like implemented here could easily be 100s of MBs and it is
much harder to define reclaim targets for. At least that is something
the memcg charging hasn't really considered yet.  Maybe the existing
try_charge implementation can cope with that just fine but it would be
really great to have the expected behavior described.

E.g. should be memcg OOM killer be invoked? Should reclaim really target
regular memory at all costs or just a lightweight memory reclaim is
preferred (is the dmabuf charge failure an expensive operation wrt.
memory refault due to reclaim).
-- 
Michal Hocko
SUSE Labs

WARNING: multiple messages have this Message-ID (diff)
From: Michal Hocko <mhocko@suse.com>
To: "T.J. Mercier" <tjmercier@google.com>
Cc: linux-doc@vger.kernel.org, daniel.vetter@ffwll.ch,
	"Roman Gushchin" <roman.gushchin@linux.dev>,
	cmllamas@google.com, dri-devel@lists.freedesktop.org,
	linux-mm@kvack.org, jstultz@google.com,
	"Zefan Li" <lizefan.x@bytedance.com>,
	"Sumit Semwal" <sumit.semwal@linaro.org>,
	android-mm@google.com, "Jonathan Corbet" <corbet@lwn.net>,
	jeffv@google.com, linux-media@vger.kernel.org,
	selinux@vger.kernel.org, linaro-mm-sig@lists.linaro.org,
	"Shakeel Butt" <shakeelb@google.com>,
	cgroups@vger.kernel.org, "Muchun Song" <muchun.song@linux.dev>,
	linux-kernel@vger.kernel.org,
	linux-security-module@vger.kernel.org,
	"Johannes Weiner" <hannes@cmpxchg.org>,
	"Tejun Heo" <tj@kernel.org>,
	"Andrew Morton" <akpm@linux-foundation.org>,
	"Christian König" <christian.koenig@amd.com>
Subject: Re: [PATCH v2 1/4] memcg: Track exported dma-buffers
Date: Wed, 25 Jan 2023 13:05:49 +0100	[thread overview]
Message-ID: <Y9EbHW84ydBzpTTO@dhcp22.suse.cz> (raw)
In-Reply-To: <CABdmKX1c_8LdJJboENnZhwGjrszDWOOVt-Do93-sJW46mZMD6A@mail.gmail.com>

On Tue 24-01-23 10:55:21, T.J. Mercier wrote:
> On Tue, Jan 24, 2023 at 7:00 AM Michal Hocko <mhocko@suse.com> wrote:
> >
> > On Mon 23-01-23 19:17:23, T.J. Mercier wrote:
> > > When a buffer is exported to userspace, use memcg to attribute the
> > > buffer to the allocating cgroup until all buffer references are
> > > released.
> >
> > Is there any reason why this memory cannot be charged during the
> > allocation (__GFP_ACCOUNT used)?
> 
> My main motivation was to keep code changes away from exporters and
> implement the accounting in one common spot for all of them. This is a
> bit of a carryover from a previous approach [1] where there was some
> objection to pushing off this work onto exporters and forcing them to
> adapt, but __GFP_ACCOUNT does seem like a smaller burden than before
> at least initially. However in order to support charge transfer
> between cgroups with __GFP_ACCOUNT we'd need to be able to get at the
> pages backing dmabuf objects, and the exporters are the ones with that
> access. Meaning I think we'd have to add some additional dma_buf_ops
> to achieve that, which was the objection from [1].
> 
> [1] https://lore.kernel.org/lkml/5cc27a05-8131-ce9b-dea1-5c75e994216d@amd.com/
> 
> >
> > Also you do charge and account the memory but underlying pages do not
> > know about their memcg (this is normally done with commit_charge for
> > user mapped pages). This would become a problem if the memory is
> > migrated for example.
> 
> Hmm, what problem do you see in this situation? If the backing pages
> are to be migrated that requires the cooperation of the exporter,
> which currently has no influence on how the cgroup charging is done
> and that seems fine. (Unless you mean migrating the charge across
> cgroups? In which case that's the next patch.)

My main concern was that page migration could lose the external tracking
without some additional steps on the dmabuf front.

> > This also means that you have to maintain memcg
> > reference outside of the memcg proper which is not really nice either.
> > This mimicks tcp kmem limit implementation which I really have to say I
> > am not a great fan of and this pattern shouldn't be coppied.
> >
> Ah, what can I say. This way looked simple to me. I think otherwise
> we're back to making all exporters do more stuff for the accounting.
> 
> > Also you are not really saying anything about the oom behavior. With
> > this implementation the kernel will try to reclaim the memory and even
> > trigger the memcg oom killer if the request size is <= 8 pages. Is this
> > a desirable behavior?
> 
> It will try to reclaim some memory, but not the dmabuf pages right?
> Not *yet* anyway. This behavior sounds expected to me.

Yes, we have discussed that shrinkers will follow up later which is
fine. The question is how much reclaim actually makes sense at this
stage. Charging interface usually copes with sizes resulting from
allocation requests (so usually 1<<order based). I can imagine that a
batch charge like implemented here could easily be 100s of MBs and it is
much harder to define reclaim targets for. At least that is something
the memcg charging hasn't really considered yet.  Maybe the existing
try_charge implementation can cope with that just fine but it would be
really great to have the expected behavior described.

E.g. should be memcg OOM killer be invoked? Should reclaim really target
regular memory at all costs or just a lightweight memory reclaim is
preferred (is the dmabuf charge failure an expensive operation wrt.
memory refault due to reclaim).
-- 
Michal Hocko
SUSE Labs

  reply	other threads:[~2023-01-25 12:06 UTC|newest]

Thread overview: 61+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-01-23 19:17 [PATCH v2 0/4] Track exported dma-buffers with memcg T.J. Mercier
2023-01-23 19:17 ` T.J. Mercier
2023-01-23 19:17 ` T.J. Mercier
2023-01-23 19:17 ` [PATCH v2 1/4] memcg: Track exported dma-buffers T.J. Mercier
2023-01-23 19:17   ` T.J. Mercier
2023-01-23 19:17   ` T.J. Mercier
2023-01-24 14:59   ` Michal Hocko
2023-01-24 14:59     ` Michal Hocko
2023-01-24 18:55     ` T.J. Mercier
2023-01-24 18:55       ` T.J. Mercier
2023-01-25 12:05       ` Michal Hocko [this message]
2023-01-25 12:05         ` Michal Hocko
2023-01-25 20:04         ` T.J. Mercier
2023-01-25 20:04           ` T.J. Mercier
2023-01-24 19:46     ` Shakeel Butt
2023-01-24 19:46       ` Shakeel Butt
2023-01-24 19:46       ` Shakeel Butt
2023-01-25 11:52       ` Michal Hocko
2023-01-25 11:52         ` Michal Hocko
2023-01-25 17:30         ` Tvrtko Ursulin
2023-01-25 17:30           ` Tvrtko Ursulin
2023-01-25 20:04           ` T.J. Mercier
2023-01-25 20:04             ` T.J. Mercier
2023-01-25 20:04             ` T.J. Mercier
2023-01-31 14:00             ` Tvrtko Ursulin
2023-01-31 14:00               ` Tvrtko Ursulin
2023-01-31 14:00               ` Tvrtko Ursulin
2023-02-01  1:49               ` T.J. Mercier
2023-02-01  1:49                 ` T.J. Mercier
2023-02-01  1:49                 ` T.J. Mercier
2023-02-01 14:23                 ` Tvrtko Ursulin
2023-02-01 14:23                   ` Tvrtko Ursulin
2023-02-01 14:23                   ` Tvrtko Ursulin
2023-02-01 14:52                   ` Tvrtko Ursulin
2023-02-01 14:52                     ` Tvrtko Ursulin
2023-02-01 14:52                     ` Tvrtko Ursulin
2023-02-02 23:43                     ` T.J. Mercier
2023-02-02 23:43                       ` T.J. Mercier
2023-02-02 23:43                       ` T.J. Mercier
2023-02-03  9:27                       ` Tvrtko Ursulin
2023-02-03  9:27                         ` Tvrtko Ursulin
2023-02-03  9:27                         ` Tvrtko Ursulin
2023-02-02 23:43                   ` T.J. Mercier
2023-02-02 23:43                     ` T.J. Mercier
2023-02-02 23:43                     ` T.J. Mercier
2023-02-03  9:46                     ` Tvrtko Ursulin
2023-02-03  9:46                       ` Tvrtko Ursulin
2023-02-03  9:46                       ` Tvrtko Ursulin
2023-01-23 19:17 ` [PATCH v2 2/4] dmabuf: Add cgroup charge transfer function T.J. Mercier
2023-01-23 19:17   ` T.J. Mercier
2023-01-23 19:17   ` T.J. Mercier
2023-01-23 19:17 ` [PATCH v2 3/4] binder: Add flags to relinquish ownership of fds T.J. Mercier
2023-01-25  4:20   ` kernel test robot
2023-01-25 17:30   ` Carlos Llamas
2023-01-25 22:07     ` T.J. Mercier
2023-01-23 19:17 ` [PATCH v2 4/4] security: binder: Add binder object flags to selinux_binder_transfer_file T.J. Mercier
2023-01-23 19:17   ` T.J. Mercier
2023-01-23 21:36   ` Paul Moore
2023-01-23 21:36     ` Paul Moore
     [not found]     ` <CABdmKX0Jc3OTnSMv_GoL0eEo=7W9dP29+r5K=PfF84xAUHviBw@mail.gmail.com>
2023-01-24  4:47       ` T.J. Mercier
2023-01-24  4:47         ` T.J. Mercier

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=Y9EbHW84ydBzpTTO@dhcp22.suse.cz \
    --to=mhocko@suse.com \
    --cc=akpm@linux-foundation.org \
    --cc=android-mm@google.com \
    --cc=cgroups@vger.kernel.org \
    --cc=christian.koenig@amd.com \
    --cc=cmllamas@google.com \
    --cc=corbet@lwn.net \
    --cc=daniel.vetter@ffwll.ch \
    --cc=dri-devel@lists.freedesktop.org \
    --cc=hannes@cmpxchg.org \
    --cc=jeffv@google.com \
    --cc=jstultz@google.com \
    --cc=linaro-mm-sig@lists.linaro.org \
    --cc=linux-doc@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-media@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=linux-security-module@vger.kernel.org \
    --cc=lizefan.x@bytedance.com \
    --cc=muchun.song@linux.dev \
    --cc=roman.gushchin@linux.dev \
    --cc=selinux@vger.kernel.org \
    --cc=shakeelb@google.com \
    --cc=sumit.semwal@linaro.org \
    --cc=tj@kernel.org \
    --cc=tjmercier@google.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.