linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Linus Torvalds <torvalds@linux-foundation.org>
To: Vlastimil Babka <vbabka@suse.cz>
Cc: Josh Poimboeuf <jpoimboe@kernel.org>,
	Jeff Layton <jlayton@kernel.org>,
	 Chuck Lever <chuck.lever@oracle.com>,
	Kees Cook <kees@kernel.org>,  Christoph Lameter <cl@linux.com>,
	Pekka Enberg <penberg@kernel.org>,
	David Rientjes <rientjes@google.com>,
	 Joonsoo Kim <iamjoonsoo.kim@lge.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	 Roman Gushchin <roman.gushchin@linux.dev>,
	Hyeonggon Yoo <42.hyeyoo@gmail.com>,
	 Johannes Weiner <hannes@cmpxchg.org>,
	Michal Hocko <mhocko@kernel.org>,
	 Shakeel Butt <shakeelb@google.com>,
	Muchun Song <muchun.song@linux.dev>,
	 Alexander Viro <viro@zeniv.linux.org.uk>,
	Christian Brauner <brauner@kernel.org>, Jan Kara <jack@suse.cz>,
	 linux-mm@kvack.org, linux-kernel@vger.kernel.org,
	cgroups@vger.kernel.org,  linux-fsdevel@vger.kernel.org
Subject: Re: [PATCH RFC 4/4] UNFINISHED mm, fs: use kmem_cache_charge() in path_openat()
Date: Fri, 1 Mar 2024 09:51:18 -0800	[thread overview]
Message-ID: <CAHk-=whgFtbTxCAg2CWQtDj7n6CEyzvdV1wcCj2qpMfpw0=m1A@mail.gmail.com> (raw)
In-Reply-To: <20240301-slab-memcg-v1-4-359328a46596@suse.cz>

On Fri, 1 Mar 2024 at 09:07, Vlastimil Babka <vbabka@suse.cz> wrote:
>
> This is just an example of using the kmem_cache_charge() API.  I think
> it's placed in a place that's applicable for Linus's example [1]
> although he mentions do_dentry_open() - I have followed from strace()
> showing openat(2) to path_openat() doing the alloc_empty_file().

Thanks. This is not the right patch,  but yes, patches 1-3 look very nice to me.

> The idea is that filp_cachep stops being SLAB_ACCOUNT. Allocations that
> want to be accounted immediately can use GFP_KERNEL_ACCOUNT. I did that
> in alloc_empty_file_noaccount() (despite the contradictory name but the
> noaccount refers to something else, right?) as IIUC it's about
> kernel-internal opens.

Yeah, the "noaccount" function is about not accounting it towards nr_files.

That said, I don't think it necessarily needs to do the memory
accounting either - it's literally for cases where we're never going
to install the file descriptor in any user space.

Your change to use GFP_KERNEL_ACCOUNT isn't exactly wrong, but I don't
think it's really the right thing either, because

> Why is this unfinished:
>
> - there are other callers of alloc_empty_file() which I didn't adjust so
>   they simply became memcg-unaccounted. I haven't investigated for which
>   ones it would make also sense to separate the allocation and accounting.
>   Maybe alloc_empty_file() would need to get a parameter to control
>   this.

Right. I think the natural and logical way to deal with this is to
just say "we account when we add the file to the fdtable".

IOW, just have fd_install() do it. That's the really natural point,
and also makes it very logical why alloc_empty_file_noaccount()
wouldn't need to do the GFP_KERNEL_ACCOUNT.

> - I don't know how to properly unwind the accounting failure case. It
>   seems like a new case because when we succeed the open, there's no
>   further error path at least in path_openat().

Yeah, let me think about this part. Becasue fd_install() is the right
point, but that too does not really allow for error handling.

Yes, we could close things and fail it, but it really is much too late
at this point.

What I *think* I'd want for this case is

 (a) allow the accounting to go over by a bit

 (b) make sure there's a cheap way to ask (before) about "did we go
over the limit"

IOW, the accounting never needed to be byte-accurate to begin with,
and making it fail (cheaply and early) on the next file allocation is
fine.

Just make it really cheap. Can we do that?

For example, maybe don't bother with the whole "bytes and pages"
stuff. Just a simple "are we more than one page over?" kind of
question. Without the 'stock_lock' mess for sub-page bytes etc

How would that look? Would it result in something that can be done
cheaply without locking and atomics and without excessive pointer
indirection through many levels of memcg data structures?

             Linus

  reply	other threads:[~2024-03-01 17:51 UTC|newest]

Thread overview: 21+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-03-01 17:07 [PATCH RFC 0/4] memcg_kmem hooks refactoring and kmem_cache_charge() Vlastimil Babka
2024-03-01 17:07 ` [PATCH RFC 1/4] mm, slab: move memcg charging to post-alloc hook Vlastimil Babka
2024-03-12 18:52   ` Roman Gushchin
2024-03-12 18:59     ` Matthew Wilcox
2024-03-12 20:35       ` Roman Gushchin
2024-03-13 10:55     ` Vlastimil Babka
2024-03-13 17:34       ` Roman Gushchin
2024-03-15  3:23   ` Chengming Zhou
2024-03-01 17:07 ` [PATCH RFC 2/4] mm, slab: move slab_memcg hooks to mm/memcontrol.c Vlastimil Babka
2024-03-12 18:56   ` Roman Gushchin
2024-03-12 19:32     ` Matthew Wilcox
2024-03-12 20:36       ` Roman Gushchin
2024-03-01 17:07 ` [PATCH RFC 3/4] mm, slab: introduce kmem_cache_charge() Vlastimil Babka
2024-03-01 17:07 ` [PATCH RFC 4/4] UNFINISHED mm, fs: use kmem_cache_charge() in path_openat() Vlastimil Babka
2024-03-01 17:51   ` Linus Torvalds [this message]
2024-03-01 18:53     ` Roman Gushchin
2024-03-12  9:22       ` Vlastimil Babka
2024-03-12 19:05         ` Roman Gushchin
2024-03-04 12:47     ` Christian Brauner
2024-03-24  2:27     ` Al Viro
2024-03-24 17:44       ` Linus Torvalds

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to='CAHk-=whgFtbTxCAg2CWQtDj7n6CEyzvdV1wcCj2qpMfpw0=m1A@mail.gmail.com' \
    --to=torvalds@linux-foundation.org \
    --cc=42.hyeyoo@gmail.com \
    --cc=akpm@linux-foundation.org \
    --cc=brauner@kernel.org \
    --cc=cgroups@vger.kernel.org \
    --cc=chuck.lever@oracle.com \
    --cc=cl@linux.com \
    --cc=hannes@cmpxchg.org \
    --cc=iamjoonsoo.kim@lge.com \
    --cc=jack@suse.cz \
    --cc=jlayton@kernel.org \
    --cc=jpoimboe@kernel.org \
    --cc=kees@kernel.org \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=mhocko@kernel.org \
    --cc=muchun.song@linux.dev \
    --cc=penberg@kernel.org \
    --cc=rientjes@google.com \
    --cc=roman.gushchin@linux.dev \
    --cc=shakeelb@google.com \
    --cc=vbabka@suse.cz \
    --cc=viro@zeniv.linux.org.uk \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).