linux-fsdevel.vger.kernel.org archive mirror
From: Dave Chinner <david@fromorbit.com>
To: Dan Williams <dan.j.williams@intel.com>
Cc: Dave Hansen <dave.hansen@intel.com>,
	lsf-pc@lists.linux-foundation.org,
	linux-fsdevel <linux-fsdevel@vger.kernel.org>,
	Linux-MM <linux-mm@kvack.org>,
	"Shutemov, Kirill" <kirill.shutemov@intel.com>,
	"Schofield, Alison" <alison.schofield@intel.com>,
	"Darrick J. Wong" <darrick.wong@oracle.com>,
	Jan Kara <jack@suse.cz>, Christoph Hellwig <hch@infradead.org>
Subject: Re: [LSF/MM TOPIC] Memory Encryption on top of filesystems
Date: Wed, 13 Feb 2019 13:13:18 +1100
Message-ID: <20190213021318.GN20493@dastard>
In-Reply-To: <CAPcyv4jhbYfrdTOyh90-u-gEUV7QEgF_HrNid5w5WbPPGr=axw@mail.gmail.com>

On Tue, Feb 12, 2019 at 04:27:20PM -0800, Dan Williams wrote:
> On Tue, Feb 12, 2019 at 3:51 PM Dave Chinner <david@fromorbit.com> wrote:
> >
> > On Tue, Feb 12, 2019 at 08:55:57AM -0800, Dave Hansen wrote:
> > > Multi-Key Total Memory Encryption (MKTME) [1] is a feature of a memory
> > > controller that allows memory to be selectively encrypted with a
> > > user-controlled key, in hardware, at a very low runtime cost.  However,
> > > it is implemented using AES-XTS which encrypts each block with a key
> > > that is generated based on the physical address of the data being
> > > encrypted.  This has nice security properties, making some replay and
> > > substitution attacks harder, but it means that encrypted data can not be
> > > naively relocated.
> >
> > The subject is "Memory Encryption on top of filesystems", but really
> > what you are talking about is "physical memory encryption /below/
> > filesystems".
> >
> > i.e. it's encryption of the physical storage the filesystem manages,
> > not encryption within the filesystem (like fscrypt) or of user data
> > on top of the filesystem (ecryptfs or userspace).
> >
> > > Combined with persistent memory, MKTME allows data to be unlocked at the
> > > device (DIMM or namespace) level, but left encrypted until it actually
> > > needs to be used.
> >
> > This sounds more like full disk encryption (either in software in
> > the IO path by dm-crypt, or in the hardware itself), where the contents
> > are decrypted/encrypted in the IO path as the data is moved between
> > physical storage and the filesystem's memory (page/buffer caches).
> >
> > Is there any finer granularity than a DIMM or pmem namespace for
> > specifying encrypted regions? Note that filesystems are not aware of
> > the physical layout of the memory address space (i.e. what DIMM
> > corresponds to which sector in the block device), so DIMM-level
> > granularity doesn't seem particularly useful right now....
> >
> > Also, how many different hardware encryption keys are available for
> > use, and how many separate memory regions can a single key have
> > associated with it?
> >
> > > However, if encrypted data were placed on a
> > > filesystem, it might be in its encrypted state for long periods of time
> > > and could not be moved by the filesystem during that time.
> >
> > I'm not sure what you mean by "if encrypted data were placed on a
> > filesystem", given that the memory encryption is transparent to the
> > filesystem (i.e. happens in the memory controller on its way
> > to/from the physical storage).
> >
> > > The “easy” solution to this is to just require that the encryption key
> > > be present and programmed into the memory controller before data is
> > > moved.  However, this means that filesystems would need to know when a
> > > given block has been encrypted and can not be moved.
> >
> > I'm missing something here - how does the filesystem even get
> > mounted if we haven't unlocked the device the filesystem is stored
> > on? i.e. we need to unlock the entire memory region containing the
> > filesystem so it can read and write its metadata (which can be
> > randomly spread all over the block device).
> >
> > And if we have to do that to mount the filesystem, then aren't we
> > also unlocking all the same memory regions that contain user data
> > and hence they can be moved?
> 
> Yes, and this is the most likely scenario for enabling MKTME with
> persistent memory. The filesystem will not be able to mount until the
> entire physical address range (namespace device) is unlocked, and the
> filesystem is kept unaware of the encryption. One key per namespace
> device.
> 
> > At what point do we end up with a filesystem mounted and trying to
> > access a locked memory region?
> 
> Another option is to enable encryption to be specified at mmap time
> with the motivation of being able to use the file system for
> provisioning instead of managing multiple namespaces.

I'm assuming you are talking about DAX here, yes?

Because fscrypt....

> The filesystem
> would need to be careful to use the key for any physical block
> management, and a decision would need to be made about when/whether
> read(2)/write(2) access cipher text.

... already handles all this via page cache coherency for
mmap/read/write IO.

> The current thinking is that
> this would be too invasive / restrictive for the filesystem, but it's
> otherwise an interesting thought experiment for allowing the
> filesystem to take on more physical-storage allocation
> responsibilities.

Actually what we want in the filesystem world is /hardware offload/
abstractions in the filesystems, not "filesystem controls hardware
specific physical storage features" mechanisms.

i.e. if the filesystem/fscrypt can offload the encryption of the
data to the IO path by passing the fscrypt key/info with the IO,
then it works with everything, not just pmem.

In the case of pmem+DAX+mmap(), it needs to associate the correct
key with the memory region that is to be encrypted when it is
mmap()d. Then the DAX subsystem can associate the key with the
physical pages that are faulted during DAX access. If it's bio based
IO going to the DAX driver, then the keys should be attached to the
bio....
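
To make the mmap()/DAX key association above concrete, here is a toy
sketch in Python. It is only an illustration under stated assumptions,
not kernel code: Mapping, DaxKeyTable, key_id and fault() are all
hypothetical names, and a plain dict stands in for whatever the DAX
subsystem would really use to track physical pages.

```python
from dataclasses import dataclass, field
from typing import Dict

PAGE_SIZE = 4096  # assumed page size for the illustration


@dataclass
class Mapping:
    """Hypothetical record of one mmap() of a DAX file."""
    file_offset: int  # byte offset of the mapping within the file
    length: int       # mapping length in bytes
    key_id: int       # hypothetical handle for the key chosen at mmap time


@dataclass
class DaxKeyTable:
    """Hypothetical table: physical page frame number -> key handle."""
    page_keys: Dict[int, int] = field(default_factory=dict)

    def fault(self, mapping: Mapping, offset: int, pfn: int) -> int:
        """Simulate a DAX fault at file offset `offset` resolving to `pfn`.

        The key recorded on the mapping is bound to the physical page
        the first time that page is faulted in.
        """
        end = mapping.file_offset + mapping.length
        if not (mapping.file_offset <= offset < end):
            raise ValueError("fault outside mapping")
        existing = self.page_keys.get(pfn)
        if existing is not None and existing != mapping.key_id:
            # A page cannot silently change keys underneath a user.
            raise PermissionError("page already bound to a different key")
        self.page_keys[pfn] = mapping.key_id
        return mapping.key_id
```

The point of the sketch is only that the key travels with the mapping
and is attached to physical pages at fault time, so the filesystem
never has to know which physical pages are involved.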

fscrypt encrypt/decrypt is already done at the filesystem/bio
interface layer via bounce buffers - it's not a great stretch to
push this down a layer so that it can be offloaded to the underlying
device if it is hardware encryption capable. fscrypt would really
only be used for key management (likely needs work to support
arbitrary hardware keys) and in filesystem metadata encryption (e.g.
filenames) in that case....
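
A rough sketch of that offload idea, again as toy Python rather than
kernel code. Everything here is an assumption made for illustration:
IoRequest, Device, submit_write() and the can_encrypt flag are
hypothetical, and toy_xor_cipher() is an insecure stand-in for a real
cipher. The only thing it tries to show is the shape of the
abstraction: the key is passed down with the IO, and software
encryption through a bounce buffer is the fallback when the device
cannot do it.

```python
import hashlib
from dataclasses import dataclass
from typing import Optional


def toy_xor_cipher(key: bytes, data: bytes) -> bytes:
    """Toy stand-in for a real cipher (NOT secure).

    XORs data with a SHA-256-derived keystream, so applying it twice
    with the same key returns the original bytes.
    """
    stream = b""
    counter = 0
    while len(stream) < len(data):
        stream += hashlib.sha256(key + counter.to_bytes(8, "little")).digest()
        counter += 1
    return bytes(a ^ b for a, b in zip(data, stream))


@dataclass
class IoRequest:
    """Hypothetical bio-like request; `key` is passed down with the IO."""
    data: bytes
    key: Optional[bytes] = None


class Device:
    """Hypothetical device that may or may not encrypt in hardware."""

    def __init__(self, can_encrypt: bool) -> None:
        self.can_encrypt = can_encrypt
        self.media = b""
        self.offloaded = False

    def write(self, req: IoRequest) -> None:
        if req.key is None:
            self.media = req.data  # caller already encrypted, or no crypto
            return
        if not self.can_encrypt:
            raise RuntimeError("device cannot honour an attached key")
        self.media = toy_xor_cipher(req.key, req.data)
        self.offloaded = True


def submit_write(dev: Device, plaintext: bytes, key: bytes) -> None:
    """Offload when the device is capable, else encrypt in software."""
    if dev.can_encrypt:
        dev.write(IoRequest(plaintext, key))
    else:
        bounce = toy_xor_cipher(key, plaintext)  # software bounce buffer
        dev.write(IoRequest(bounce))
```

Either path leaves the same ciphertext on the media, which is why the
caller does not need to care what kind of device sits underneath.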

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com

