From: David Hildenbrand <>
To: Nick Kossifidis <>,
Cc: Andrew Morton <>,
	Mike Rapoport <>,
	Alexander Viro <>,
	Andy Lutomirski <>, Arnd Bergmann <>,
	Borislav Petkov <>,
	Catalin Marinas <>,
	Christopher Lameter <>,
	Dave Hansen <>,
	Elena Reshetova <>,
	"H. Peter Anvin" <>, Ingo Molnar <>,
	"Kirill A. Shutemov" <>,
	Matthew Wilcox <>,
	Matthew Garrett <>,
	Mark Rutland <>,
	Michal Hocko <>,
	Mike Rapoport <>,
	Michael Kerrisk <>,
	Palmer Dabbelt <>,
	Paul Walmsley <>,
	Peter Zijlstra <>,
	"Rafael J. Wysocki" <>,
	Rick Edgecombe <>,
	Roman Gushchin <>, Shakeel Butt <>,
	Shuah  Khan <>,
	Thomas Gleixner <>,
	Tycho Andersen <>, Will Deacon <>
Subject: Re: [PATCH v18 0/9] mm: introduce memfd_secret system call to create "secret" memory areas
Date: Fri, 7 May 2021 09:35:45 +0200
Message-ID: <>
In-Reply-To: <>

On 07.05.21 01:16, Nick Kossifidis wrote:
> On 2021-05-06 20:05, James Bottomley wrote:
>> On Thu, 2021-05-06 at 18:45 +0200, David Hildenbrand wrote:
>>> Also, there is a way to still read that memory when root by
>>> 1. Having kdump active (which would often be the case, but maybe not
>>> to dump user pages)
>>> 2. Triggering a kernel crash (easy via proc as root)
>>> 3. Waiting for the reboot after kdump created the dump and then
>>> reading the content from disk.
>> Anything that can leave physical memory intact but boot to a kernel
>> where the missing direct map entry is restored could theoretically
>> extract the secret.  However, it's not exactly going to be a stealthy
>> extraction ...
>>> Or, as an attacker, load a custom kexec() kernel and read memory
>>> from the new environment. Of course, the latter two are advanced
>>> mechanisms, but they are possible when root. We might be able to
>>> mitigate, for example, by zeroing out secretmem pages before booting
>>> into the kexec kernel, if we care :)
>> I think we could handle it by marking the region, yes, and a zero on
>> shutdown might be useful ... it would prevent all warm reboot type
>> attacks.
> I had similar concerns about recovering secrets with kdump, and
> considered cleaning up keyrings before jumping to the new kernel. The
> problem is we can't provide guarantees in that case, once the kernel has
> crashed and we are on our way to run crashkernel, we can't be sure we
> can reliably zero-out anything, the more code we add to that path the

Well, I think it depends. Assume we do the following

1) Zero out any secretmem pages when handing them back to the buddy 
(alternative: init_on_free=1) -- if not already done, I didn't check the 
code.

2) On kdump, zero out all allocated secretmem. It'd be easier if we 
just allocated from a fixed physical memory area; otherwise we have to 
walk process page tables or use a PFN walker. And zeroing out secretmem 
pages without a direct mapping is a different challenge.

Now, during 2) it can happen that

a) We crash in our clearing code (e.g., something is seriously messed 
up) and fail to start the kdump kernel. That's actually good, instead of 
leaking data we fail hard.

b) We don't find all secretmem pages, for example, because process page 
tables are messed up or something messed up our memmap (if we'd use that 
to identify secretmem pages via a PFN walker somehow)

But for the simple cases (e.g., malicious root tries to crash the kernel 
via /proc/sysrq-trigger) both a) and b) wouldn't apply.
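
To make the trade-off in 2) concrete, here is a toy userspace model in 
Python. It is not kernel code, and every name in it (ToyMemory, 
scrub_tracked, scrub_fixed_area) is made up for illustration. It contrasts 
scrubbing that relies on tracking metadata, which can miss pages as in b), 
with scrubbing a fixed physical range, which needs no metadata at all.

```python
PAGE_SIZE = 4096  # illustrative page size


class ToyMemory:
    """Toy model of physical memory with a reserved secretmem area."""

    def __init__(self, nr_pages, secret_start, secret_end):
        self.mem = bytearray(nr_pages * PAGE_SIZE)
        # Fixed PFN range [secret_start, secret_end) reserved for secretmem.
        self.secret_range = range(secret_start, secret_end)
        # Metadata: PFNs currently allocated as secretmem.
        self.tracked = set()

    def _zero(self, pfn):
        start = pfn * PAGE_SIZE
        self.mem[start:start + PAGE_SIZE] = bytes(PAGE_SIZE)

    def alloc_secret(self, data):
        """Allocate one secretmem page and copy data into it."""
        if len(data) > PAGE_SIZE:
            raise ValueError("data does not fit into one page")
        for pfn in self.secret_range:
            if pfn not in self.tracked:
                self.tracked.add(pfn)
                start = pfn * PAGE_SIZE
                self.mem[start:start + len(data)] = data
                return pfn
        raise MemoryError("secretmem area exhausted")

    def free_secret(self, pfn):
        """Step 1): zero the page before handing it back."""
        self._zero(pfn)
        self.tracked.discard(pfn)

    def scrub_tracked(self):
        """Step 2) via metadata: only reaches pages the tracking still knows."""
        for pfn in list(self.tracked):
            self._zero(pfn)

    def scrub_fixed_area(self):
        """Step 2) via a fixed area: zero the whole range unconditionally."""
        for pfn in self.secret_range:
            self._zero(pfn)

    def contains(self, needle):
        """Return True if the byte string is still present anywhere in memory."""
        return needle in self.mem
```

If the tracking set is cleared to simulate corrupted metadata, 
scrub_tracked() leaves the secret in place, while scrub_fixed_area() still 
removes it. In the toy model that is the whole argument for a fixed area.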

Obviously, an admin who wants to mitigate this right now would have to 
disable kdump completely, meaning any attempt to load a crashkernel 
would fail and could not be enabled again for that kernel (also not via 
a cmdline that an attacker could modify to reboot into a system with the 
option for a crashkernel). Disabling kdump in the kernel while secretmem 
pages are allocated is one approach, although sub-optimal.
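
As a rough sketch of how an admin could check the current state from 
userspace, the Python snippet below reads /sys/kernel/kexec_crash_loaded 
and the kernel.kexec_load_disabled sysctl. The function names and the 
"root" parameter are hypothetical (the latter only exists so the check can 
be pointed at a fake tree). It says nothing about what a modified cmdline 
could enable after a reboot.

```python
from pathlib import Path


def read_flag(path):
    """Return the integer stored in a sysfs/procfs flag file, or None if unreadable."""
    try:
        return int(Path(path).read_text().strip())
    except (OSError, ValueError):
        return None


def kdump_exposure(root="/"):
    """Summarize whether a crash kernel is loaded and whether kexec loading is disabled."""
    root = Path(root)
    crash_loaded = read_flag(root / "sys/kernel/kexec_crash_loaded")
    load_disabled = read_flag(root / "proc/sys/kernel/kexec_load_disabled")
    return {
        "crash_kernel_loaded": crash_loaded == 1,
        "kexec_load_disabled": load_disabled == 1,
        # Conservative: only report "locked down" when both files were readable,
        # no crash kernel is loaded, and no new image can be loaded.
        "locked_down": crash_loaded == 0 and load_disabled == 1,
    }
```

On a live system, kdump_exposure() with no arguments inspects the running 
kernel. Setting kernel.kexec_load_disabled=1 is one-way until reboot, but 
an image that was loaded before the toggle was set stays loaded, which is 
why the sketch checks both values.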

> more risky it gets. However during reboot/normal kexec() we should do
> some cleanup, it makes sense and secretmem can indeed be useful in that
> case. Regarding loading custom kexec() kernels, we mitigate this with
> the kexec file-based API where we can verify the signature of the loaded
> kimage (assuming the system runs a kernel provided by a trusted 3rd
> party and we've maintained a chain of trust since booting).

For example, in VMs (like QEMU), we often don't clear physical memory 
during a reboot. So if an attacker manages to load a kernel that can be 
tricked into reading arbitrary physical memory areas, we can leak 
secretmem data, I think.

And there might be ways to achieve that just using the cmdline, not 
necessarily loading a different kernel. For example, if you limit the 
kernel footprint ("mem=256M") and relax the strict iomem checks 
("iomem=relaxed"), you can just extract that memory via 
/dev/mem, if I am not wrong.

So as an attacker, modify the (grub) cmdline to "mem=256M 
iomem=relaxed", reboot, and read all memory via /dev/mem. 
Or load a signed kexec kernel with that cmdline and boot into it.
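
A minimal sketch of auditing a kernel command line for the options 
discussed here, in Python. It assumes the relaxed-iomem option is spelled 
"iomem=relaxed", as in the kernel's parameter documentation. The function 
name is hypothetical, and the list of options is limited to the ones named 
in this thread rather than being exhaustive.

```python
def risky_cmdline_options(cmdline):
    """Return the tokens of a kernel command line that match the options from this thread."""
    findings = []
    for token in cmdline.split():
        key, _, value = token.partition("=")
        if key == "mem" and value:
            # Limits the memory the kernel manages; the rest stays untouched.
            findings.append(token)
        elif key == "iomem" and value == "relaxed":
            # Relaxes the strict checks on userspace access to I/O memory.
            findings.append(token)
        elif key == "crashkernel" and value:
            # Reserves memory for a kdump kernel.
            findings.append(token)
    return findings
```

On a running system this could be fed with the contents of /proc/cmdline. 
A hit only means the option is present, not that an attack took place.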

Interesting problem :)


David / dhildenb

