All of lore.kernel.org
 help / color / mirror / Atom feed
From: Kent Overstreet <kent.overstreet@linux.dev>
To: Christian Brauner <brauner@kernel.org>
Cc: lsf-pc@lists.linux-foundation.org, linux-fsdevel@vger.kernel.org,
	 linux-mm@kvack.org, linux-btrfs@vger.kernel.org,
	linux-block@vger.kernel.org,
	 Matthew Wilcox <willy@infradead.org>, Jan Kara <jack@suse.cz>,
	Christoph Hellwig <hch@infradead.org>
Subject: Re: [LSF/MM/BPF TOPIC] Dropping page cache of individual fs
Date: Fri, 16 Feb 2024 23:04:28 -0500	[thread overview]
Message-ID: <h5wq7dsi6r7cjjmkpo2dvn5x662eseluzd2kmzbkzegntzlptd@ncjzyaurmiwb> (raw)
In-Reply-To: <20240116-tagelang-zugnummer-349edd1b5792@brauner>

On Tue, Jan 16, 2024 at 11:50:32AM +0100, Christian Brauner wrote:
> Hey,
> 
> I'm not sure this even needs a full LSFMM discussion but since I
> currently don't have time to work on the patch I may as well submit it.
> 
> Gnome recently got awared 1M Euro by the Sovereign Tech Fund (STF). The
> STF was created by the German government to fund public infrastructure:
> 
> "The Sovereign Tech Fund supports the development, improvement and
>  maintenance of open digital infrastructure. Our goal is to sustainably
>  strengthen the open source ecosystem. We focus on security, resilience,
>  technological diversity, and the people behind the code." (cf. [1])
> 
> Gnome has proposed various specific projects including integrating
> systemd-homed with Gnome. Systemd-homed provides various features and if
> you're interested in details then you might find it useful to read [2].
> It makes use of various new VFS and fs specific developments over the
> last years.
> 
> One feature is encrypting the home directory via LUKS. An approriate
> image or device must contain a GPT partition table. Currently there's
> only one partition which is a LUKS2 volume. Inside that LUKS2 volume is
> a Linux filesystem. Currently supported are btrfs (see [4] though),
> ext4, and xfs.
> 
> The following issue isn't specific to systemd-homed. Gnome wants to be
> able to support locking encrypted home directories. For example, when
> the laptop is suspended. To do this the luksSuspend command can be used.
> 
> The luksSuspend call is nothing else than a device mapper ioctl to
> suspend the block device and it's owning superblock/filesystem. Which in
> turn is nothing but a freeze initiated from the block layer:
> 
> dm_suspend()
> -> __dm_suspend()
>    -> lock_fs()
>       -> bdev_freeze()
> 
> So when we say luksSuspend we really mean block layer initiated freeze.
> The overall goal or expectation of userspace is that after a luksSuspend
> call all sensitive material has been evicted from relevant caches to
> harden against various attacks. And luksSuspend does wipe the encryption
> key and suspend the block device. However, the encryption key can still
> be available clear-text in the page cache. To illustrate this problem
> more simply:
> 
> truncate -s 500M /tmp/img
> echo password | cryptsetup luksFormat /tmp/img --force-password
> echo password | cryptsetup open /tmp/img test
> mkfs.xfs /dev/mapper/test
> mount /dev/mapper/test /mnt
> echo "secrets" > /mnt/data
> cryptsetup luksSuspend test
> cat /mnt/data
> 
> This will still happily print the contents of /mnt/data even though the
> block device and the owning filesystem are frozen because the data is
> still in the page cache.
> 
> To my knowledge, the only current way to get the contents of /mnt/data
> or the encryption key out of the page cache is via
> /proc/sys/vm/drop_caches which is a big hammer.
> 
> My initial reaction is to give userspace an API to drop the page cache
> of a specific filesystem which may have additional uses. I initially had
> started drafting an ioctl() and then got swayed towards a
> posix_fadvise() flag. I found out that this was already proposed a few
> years ago but got rejected as it was suspected this might just be
> someone toying around without a real world use-case. I think this here
> might qualify as a real-world use-case.
> 
> This may at least help securing users with a regular dm-crypt setup
> where dm-crypt is the top layer. Users that stack additional layers on
> top of dm-crypt may still leak plaintext of course if they introduce
> additional caching. But that's on them.
> 
> Of course other ideas welcome.

This isn't entirely unlike snapshot deletion, where we also need to
shoot down the pagecache.

Technically, the code I have now for snapshot deletion isn't quite what
I want; snapshot deletion probably wants something closer to revoke()
instead of waiting for files to be closed. But maybe the code I have is
close to what you need - maybe we could turn this into a common shared
API?

https://evilpiepirate.org/git/bcachefs.git/tree/fs/bcachefs/fs.c#n1569

The need for page zeroing is pretty orthogonal; if you want page zeroing
you want that enabled for all page cache folios at all times.

      parent reply	other threads:[~2024-02-17  4:04 UTC|newest]

Thread overview: 27+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-01-16 10:50 [LSF/MM/BPF TOPIC] Dropping page cache of individual fs Christian Brauner
2024-01-16 11:45 ` Jan Kara
2024-01-17 12:53   ` Christian Brauner
2024-01-17 14:35     ` Jan Kara
2024-01-17 14:52       ` Matthew Wilcox
2024-01-17 20:51         ` Phillip Susi
2024-01-17 20:58           ` Matthew Wilcox
2024-01-18 14:26         ` Christian Brauner
2024-01-30  0:13         ` Adrian Vovk
2024-02-15 13:57           ` Jan Kara
2024-02-15 19:46             ` Adrian Vovk
2024-02-15 23:17               ` Dave Chinner
2024-02-16  1:14                 ` Adrian Vovk
2024-02-16 20:38                   ` init_on_alloc digression: " John Hubbard
2024-02-16 21:11                     ` Adrian Vovk
2024-02-16 21:19                       ` John Hubbard
2024-01-16 15:25 ` James Bottomley
2024-01-16 15:40   ` Matthew Wilcox
2024-01-16 15:54     ` James Bottomley
2024-01-16 20:56 ` Dave Chinner
2024-01-17  6:17   ` Theodore Ts'o
2024-01-30  1:14     ` Adrian Vovk
2024-01-17 13:19   ` Christian Brauner
2024-01-17 22:26     ` Dave Chinner
2024-01-18 14:09       ` Christian Brauner
2024-02-05 17:39     ` Russell Haley
2024-02-17  4:04 ` Kent Overstreet [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=h5wq7dsi6r7cjjmkpo2dvn5x662eseluzd2kmzbkzegntzlptd@ncjzyaurmiwb \
    --to=kent.overstreet@linux.dev \
    --cc=brauner@kernel.org \
    --cc=hch@infradead.org \
    --cc=jack@suse.cz \
    --cc=linux-block@vger.kernel.org \
    --cc=linux-btrfs@vger.kernel.org \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=lsf-pc@lists.linux-foundation.org \
    --cc=willy@infradead.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.