From: Adrian Vovk <adrianvovk@gmail.com>
To: John Hubbard <jhubbard@nvidia.com>, Dave Chinner <david@fromorbit.com>
Cc: Jan Kara <jack@suse.cz>, Matthew Wilcox <willy@infradead.org>,
	Christian Brauner <brauner@kernel.org>,
	lsf-pc@lists.linux-foundation.org, linux-fsdevel@vger.kernel.org,
	linux-mm@kvack.org, linux-btrfs@vger.kernel.org,
	linux-block@vger.kernel.org,
	Christoph Hellwig <hch@infradead.org>
Subject: Re: init_on_alloc digression: [LSF/MM/BPF TOPIC] Dropping page cache of individual fs
Date: Fri, 16 Feb 2024 16:11:20 -0500
Message-ID: <67eef60c-b0fe-4034-a2e5-b09c7ef38a5a@gmail.com>
In-Reply-To: <edec0ef8-00f5-4457-a1aa-59fd6bc9f6bf@nvidia.com>

On 2/16/24 15:38, John Hubbard wrote:
> On 2/15/24 17:14, Adrian Vovk wrote:
> ...
>>> Typical distro configuration is:
>>>
>>> $ sudo dmesg |grep auto-init
>>> [    0.018882] mem auto-init: stack:all(zero), heap alloc:on, heap free:off
>>> $
>>>
>>> So this kernel zeroes all stack memory, page and heap memory on
>>> allocation, and does nothing on free...
>>
>> I see. Thank you for all the information.
>>
>> So ~5% performance penalty isn't trivial, especially to protect against 
>
> And it's more like 600% or more, on some systems. For example, imagine if
> someone had a memory-coherent system that included both CPUs and GPUs,
> each with their own NUMA memory nodes. The GPU has fast DMA engines that
> can zero a lot of that memory very very quickly, order(s) of magnitude
> faster than the CPU can clear it.
>
> So, the GPU driver is going to clear that memory before handing it
> out to user space, and all is well so far.
>
> But init_on_alloc forces the CPU to clear the memory first, because of
> the belief here that this is somehow required in order to get defense
> in depth. (True, if you can convince yourself that some parts of the
> kernel are in a different trust boundary than others. I lack faith
> here and am not a believer in such make-believe boundaries.)

As far as I can tell, init_on_alloc isn't about drawing a trust boundary
between parts of the kernel, but about hardening the kernel against
mistakes made by developers, e.g. forgetting to initialize some
memory. If the memory isn't zeroed and the developer forgets to
initialize it, then stale data that was under user control (from the
page cache or so) can end up influencing the flow of execution in the
kernel. Zeroing the memory provides a second layer of defense even in
situations where the first layer (not using uninitialized memory)
failed. Hence, defense in depth.
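
To make that failure mode concrete, here is a minimal userspace sketch.
It is not kernel code: struct request, toy_alloc() and the single
recycled slot are made up for illustration, and the init_on_alloc
argument only mimics what the real boot option does.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical object with a field that is easy to forget. */
struct request {
    int id;
    int is_privileged;
};

/* One recycled slot stands in for the page/slab allocator. */
static struct request slot;

/* Simulates a previous owner leaving user-controlled data behind. */
static void stale_user_data(int value)
{
    slot.id = value;
    slot.is_privileged = value;
}

/* Toy allocator: zeroes the memory only if init_on_alloc is set. */
static struct request *toy_alloc(int init_on_alloc)
{
    if (init_on_alloc)
        memset(&slot, 0, sizeof(slot));
    return &slot;
}

/* Buggy caller: sets id but never assigns is_privileged. */
static int buggy_new_request(int init_on_alloc)
{
    struct request *r = toy_alloc(init_on_alloc);

    assert(r != NULL);
    r->id = 42;
    /* BUG: r->is_privileged is left as whatever was there before. */
    return r->is_privileged;
}
```

Calling stale_user_data(1) and then buggy_new_request(0) returns the
stale value, while buggy_new_request(1) returns 0, so the bug is still
there but no longer lets old user data steer the decision.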

Is this just an NVIDIA embedded thing (AFAIK your desktop/laptop cards 
don't share memory with the CPU), or would it affect something like 
Intel/AMD APUs as well?

If the GPU is so much faster at zeroing out blocks of memory in these
systems, maybe the kernel should use the GPU's DMA engine whenever it
needs to zero out some blocks of memory. (I'm joking, mostly; I can
imagine it's not quite so simple.)

> Anyway, this situation has wasted much time, and at this point, I
> wish I could delete the whole init_on_alloc feature.
>
> Just in case you wanted an alt perspective. :)

This is all good to know, thanks.

I'm not particularly interested in init_on_alloc since it doesn't help 
against cold-boot scenarios. Does init_on_free have similar performance 
issues on such systems? (i.e. are you often freeing memory and then 
immediately allocating the same memory in the GPU driver?)

Either way, I'd much prefer to have both turned off and only zero out
freed memory periodically or on user request, not on every
allocation/free.

> thanks,

Best,
Adrian
