From: "Sven Peter" <sven@svenpeter.dev>
To: "Arnd Bergmann" <arnd@arndb.de>, "Christoph Hellwig" <hch@infradead.org>
Cc: linux-nvme@lists.infradead.org
Subject: Re: [asahilinux:nvme/dev 13/17] drivers/nvme/host/pci.c:2249:2-3: Unneeded semicolon
Date: Wed, 18 Aug 2021 08:26:29 +0200	[thread overview]
Message-ID: <d133c3f0-1805-4f23-9710-45b31ed5fb64@www.fastmail.com> (raw)
In-Reply-To: <05dfa9d8-7cee-4431-abe3-4cc583985773@www.fastmail.com>

Alright, I've observed what macOS does by using the simple hypervisor we
built and tracing its MMIO accesses. There actually is a single entry per
page in both the TCB struct and the command queue, and all entries are used.
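
For reference, this is a simplified sketch of how the TCB looks from the
driver's point of view; padding/unknown fields are left out and the exact
field names, sizes and offsets should be taken with a grain of salt:

#include <linux/types.h>

/* Simplified sketch only -- padding/unknown fields left out, names and
 * offsets approximate; see the real apple_nvmmu_tcb in the series. */
struct apple_nvmmu_tcb {
	u8	opcode;		/* mirrors the opcode of the queued command */
	u8	dma_flags;	/* DMA direction to/from the device */
	u8	command_id;	/* tag, indexes the matching SQ slot */
	__le32	length;		/* full transfer length in bytes */
	__le64	prp1;		/* PRPs as in a regular SQE: for larger */
	__le64	prp2;		/* transfers prp2 points to a PRP list  */
				/* with one entry per page              */
};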

The reason for that needs a little more background on how XNU and its
security architecture work:

On these machines, XNU is split into two parts: The main kernel with all its
extensions and hardware drivers, and a small section called "Page Protection Layer"
or PPL.
These machines have CPU extensions that allow them to prevent the normal kernel
from writing to pagetables or accessing any MMIO related to IOMMUs. Switching to
the PPL section is done with a custom instruction ("genter") which changes memory
permissions, such that PPL is then able to modify pagetables and configure the
IOMMUs. It kinda works like a low-overhead hypervisor that controls pagetables.
There are some writeups available about this if you are curious about
the details [1][2][3].

The TCB structs and the NVMMU MMIO registers cannot be accessed by the part of the
kernel that contains the NVMe driver. The NVMe driver can only prepare the
command structure and fill the PRP list with entries for the DMA buffers.
It then calls out to PPL, which verifies that all the pages listed in the PRP
list are allowed for DMA and then constructs the same structure again inside
protected memory that can no longer be touched by the regular kernel.

The NVMMU is then configured with this secondary PRP list from inside PPL
before control is returned to the NVMe driver. Effectively, this prevents
someone who has broken into the normal kernel from simply DMAing over any
buffer they want.
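
To put that flow into (pseudo-)code, this is roughly what the PPL side
appears to do whenever a command is queued. I'm writing it in Linux-kernel
C for readability; every function name below (ppl_*, nvmmu_*) is made up
for illustration, this is not actual XNU/PPL code:

#include <linux/types.h>
#include <linux/errno.h>

/* Hypothetical helpers, named for illustration only. */
u64 *ppl_alloc_protected_prp_list(unsigned int nr_pages);
void ppl_free_protected_prp_list(u64 *prps);
bool ppl_page_allowed_for_dma(u64 paddr);
void nvmmu_program_tcb(unsigned int tag, const u64 *prps,
		       unsigned int nr_pages, u32 length);

/* Conceptual sketch of the PPL side, reconstructed from MMIO traces. */
static int ppl_nvme_map_command(unsigned int tag, const u64 *prps,
				unsigned int nr_pages, u32 length)
{
	/* Runs inside PPL (entered via "genter"), the only context that
	 * may touch pagetables and the NVMMU MMIO. */
	u64 *shadow = ppl_alloc_protected_prp_list(nr_pages);
	unsigned int i;

	if (!shadow)
		return -ENOMEM;

	for (i = 0; i < nr_pages; i++) {
		/* Refuse any page the kernel may not use for DMA. */
		if (!ppl_page_allowed_for_dma(prps[i])) {
			ppl_free_protected_prp_list(shadow);
			return -EPERM;
		}
		shadow[i] = prps[i];
	}

	/* Point the NVMMU at the shadow copy in PPL-protected memory,
	 * not at the kernel-writable original, then hand control back
	 * to the regular NVMe driver. */
	nvmmu_program_tcb(tag, shadow, nr_pages, length);
	return 0;
}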

For Linux we can ignore all this and just point both the NVMMU and the
queue entry to the same PRP list.
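
In driver terms that boils down to roughly the following at submission
time. This is only a sketch: the TCB field names and the TCB_DMA_* flags
are assumptions along the lines of the struct above and may not match the
series exactly, while nvme_is_write() and the SQE dptr fields are the
usual ones from the NVMe core:

#include <linux/kernel.h>
#include <linux/bits.h>
#include <linux/nvme.h>

/* Hypothetical flag names for the TCB DMA direction bits. */
#define TCB_DMA_TO_DEVICE	BIT(0)
#define TCB_DMA_FROM_DEVICE	BIT(1)

/* Sketch: make the TCB reference the very same PRPs the SQE uses. */
static void nvmmu_fill_tcb(struct apple_nvmmu_tcb *tcb,
			   struct nvme_command *cmd,
			   unsigned int tag, u32 total_len)
{
	tcb->opcode     = cmd->common.opcode;
	tcb->command_id = tag;			 /* must match the SQ slot */
	tcb->length     = cpu_to_le32(total_len);
	tcb->prp1       = cmd->common.dptr.prp1; /* same PRP entries as */
	tcb->prp2       = cmd->common.dptr.prp2; /* in the queue entry  */
	tcb->dma_flags  = nvme_is_write(cmd) ? TCB_DMA_TO_DEVICE
					     : TCB_DMA_FROM_DEVICE;
}

The PRP list itself can then be built exactly the way the core NVMe code
already does it.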



Sven


[1] Jonathan Levin's writeup about the Page Protection Layer http://newosxbook.com/articles/CasaDePPL.html
[2] siguza's writeup about how this was done on iOS https://siguza.github.io/APRR/
[3] my writeup about the CPU extensions https://blog.svenpeter.dev/posts/m1_sprr_gxf/


On Mon, Aug 9, 2021, at 22:11, Sven Peter wrote:
> 
> 
> On Mon, Aug 9, 2021, at 17:53, Arnd Bergmann wrote:
> > On Mon, Aug 9, 2021 at 4:29 PM Christoph Hellwig <hch@infradead.org> wrote:
> > > Also can one of you look how PRPs are actually used by MacOS?  Given
> > > that this device always seems to be behind a IOMMU creating one entry
> > > per page seems rather weird given that the apple_nvmmu_tcb structure
> > > already contains the full length.  Maybe it actually ignores all but
> > > the first PRP?
> > 
> > I'll leave this up to Sven to answer. He also wrote the iommu driver,
> > so he probably has a good idea of what is going on here already.
> > 
> >       Arnd
> > 
> 
> Not yet, but figuring out how this NVMe-IOMMU works in detail was
> already on my TODO list :-)
> 
> Some background - the M1 has at least four different IOMMU-like
> HW blocks:
> DART (for which I wrote a driver and where I'd actually know what's going
> on in detail), SART (simple DMA address filter required by the NVMe
> co-processor for non-nvme transactions), this weird NVMe IOMMU (that
> also seems to be somehow related to disk encryption) and GART for their GPU.
> 
> 
> 
> Sven
> 


-- 
Sven Peter

_______________________________________________
Linux-nvme mailing list
Linux-nvme@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-nvme

Thread overview: 5+ messages
     [not found] <202108051646.vdMMUBea-lkp@intel.com>
     [not found] ` <YQ7GVu1TqpqbHZOb@infradead.org>
2021-08-09 10:02   ` [asahilinux:nvme/dev 13/17] drivers/nvme/host/pci.c:2249:2-3: Unneeded semicolon Arnd Bergmann
2021-08-09 14:29     ` Christoph Hellwig
2021-08-09 15:53       ` Arnd Bergmann
2021-08-09 20:11         ` Sven Peter
2021-08-18  6:26           ` Sven Peter [this message]
