All of lore.kernel.org
 help / color / mirror / Atom feed
From: "Kirill A. Shutemov" <kirill@shutemov.name>
To: Sean Christopherson <seanjc@google.com>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>,
	Dave Hansen <dave.hansen@linux.intel.com>,
	Andy Lutomirski <luto@kernel.org>,
	Peter Zijlstra <peterz@infradead.org>,
	Jim Mattson <jmattson@google.com>,
	David Rientjes <rientjes@google.com>,
	"Edgecombe, Rick P" <rick.p.edgecombe@intel.com>,
	"Kleen, Andi" <andi.kleen@intel.com>,
	"Yamahata, Isaku" <isaku.yamahata@intel.com>,
	Erdem Aktas <erdemaktas@google.com>,
	Steve Rutherford <srutherford@google.com>,
	Peter Gonda <pgonda@google.com>,
	David Hildenbrand <david@redhat.com>,
	Chao Peng <chao.p.peng@linux.intel.com>,
	x86@kernel.org, kvm@vger.kernel.org, linux-mm@kvack.org,
	linux-kernel@vger.kernel.org
Subject: Re: [RFCv2 13/13] KVM: unmap guest memory using poisoned pages
Date: Mon, 31 May 2021 23:07:12 +0300	[thread overview]
Message-ID: <20210531200712.qjxghakcaj4s6ara@box.shutemov.name> (raw)
In-Reply-To: <YK6lrHeaeUZvHMJC@google.com>

On Wed, May 26, 2021 at 07:46:52PM +0000, Sean Christopherson wrote:
> On Fri, May 21, 2021, Kirill A. Shutemov wrote:
> > Hi Sean,
> > 
> > The core patch of the approach we've discussed before is below. It
> > introduce a new page type with the required semantics.
> > 
> > The full patchset can be found here:
> > 
> >  git://git.kernel.org/pub/scm/linux/kernel/git/kas/linux.git kvm-unmapped-guest-only
> > 
> > but only the patch below is relevant for TDX. QEMU patch is attached.
> 
> Can you post the whole series?

I hoped to get it posted as part of TDX host enabling.

As it is the feature is incomplete for pure KVM. I didn't implement on KVM
side checks that provided by TDX module/hardware, so nothing prevents the
same page to be added to multiple KVM instances.

> The KVM behavior and usage of FOLL_GUEST is very relevant to TDX.

The patch can be found here:

https://git.kernel.org/pub/scm/linux/kernel/git/kas/linux.git/commit/?h=kvm-unmapped-guest-only&id=2cd6c2c20528696a46a2a59383ca81638bf856b5

> > CONFIG_HAVE_KVM_PROTECTED_MEMORY has to be changed to what is appropriate
> > for TDX and FOLL_GUEST has to be used in hva_to_pfn_slow() when running
> > TDX guest.
> 
> This behavior in particular is relevant; KVM should provide FOLL_GUEST iff the
> access is private or the VM type doesn't differentiate between private and
> shared.

I added FOL_GUEST if the KVM instance has the feature enabled.

On top of that TDX-specific code has to check that the page is in fact
PageGuest() before inserting it into private SEPT.

The scheme makes sure that user-accessible memory cannot be not added as
private to TD.

> > When page get inserted into private sept we must make sure it is
> > PageGuest() or SIGBUS otherwise.
> 
> More KVM feedback :-)
> 
> Ideally, KVM will synchronously exit to userspace with detailed information on
> the bad behavior, not do SIGBUS.  Hopefully that infrastructure will be in place
> sooner than later.
> 
> https://lkml.kernel.org/r/YKxJLcg/WomPE422@google.com

My experiments are still v5.11, but I can rebase to whatever needed once
the infrastructure hits upstream.

> > Inserting PageGuest() into shared is fine, but the page will not be accessible
> > from userspace.
> 
> Even if it can be functionally fine, I don't think we want to allow KVM to map
> PageGuest() as shared memory.  The only reason to map memory shared is to share
> it with something, e.g. the host, that doesn't have access to private memory, so
> I can't envision a use case.
> 
> On the KVM side, it's trivially easy to omit FOLL_GUEST for shared memory, while
> always passing FOLL_GUEST would require manually zapping.  Manual zapping isn't
> a big deal, but I do think it can be avoided if userspace must either remap the
> hva or define a new KVM memslot (new gpa->hva), both of which will automatically
> zap any existing translations.
> 
> Aha, thought of a concrete problem.  If KVM maps PageGuest() into shared memory,
> then KVM must ensure that the page is not mapped private via a different hva/gpa,
> and is not mapped _any_ other guest because the TDX-Module's 1:1 PFN:TD+GPA
> enforcement only applies to private memory.  The explicit "VM_WRITE | VM_SHARED"
> requirement below makes me think this wouldn't be prevented.

Hm. I didn't realize that TDX module doesn't prevent the same page to be
used as shared and private at the same time.

Omitting FOLL_GUEST for shared memory doesn't look like a right approach.
IIUC, it would require the kernel to track what memory is share and what
private, which defeat the purpose of the rework. I would rather enforce
!PageGuest() when share SEPT is populated in addition to enforcing
PageGuest() fro private SEPT.

Do you see any problems with this?

> Oh, and the other nicety is that I think it would avoid having to explicitly
> handle PageGuest() memory that is being accessed from kernel/KVM, i.e. if all
> memory exposed to KVM must be !PageGuest(), then it is also eligible for
> copy_{to,from}_user().

copy_{to,from}_user() enforce by setting PTE entries to PROT_NONE.
Or do I miss your point?

> 
> > Any feedback is welcome.
> > 
> > -------------------------------8<-------------------------------------------
> > 
> > From: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
> > Date: Fri, 16 Apr 2021 01:30:48 +0300
> > Subject: [PATCH] mm: Introduce guest-only pages
> > 
> > PageGuest() pages are only allowed to be used as guest memory. Userspace
> > is not allowed read from or write to such pages.
> > 
> > On page fault, PageGuest() pages produce PROT_NONE page table entries.
> > Read or write there will trigger SIGBUS. Access to such pages via
> > syscall leads to -EIO.
> > 
> > The new mprotect(2) flag PROT_GUEST translates to VM_GUEST. Any page
> > fault to VM_GUEST VMA produces PageGuest() page.
> > 
> > Only shared tmpfs/shmem mappings are supported.
> 
> Is limiting this to tmpfs/shmem only for the PoC/RFC, or is it also expected to
> be the long-term behavior?

I expect it to be enough to cover all relevant cases, no?

Note that MAP_ANONYMOUS|MAP_SHARED also fits here.

-- 
 Kirill A. Shutemov

  reply	other threads:[~2021-05-31 20:07 UTC|newest]

Thread overview: 39+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-04-16 15:40 [RFCv2 00/13] TDX and guest memory unmapping Kirill A. Shutemov
2021-04-16 15:40 ` [RFCv2 01/13] x86/mm: Move force_dma_unencrypted() to common code Kirill A. Shutemov
2021-04-16 15:40 ` [RFCv2 02/13] x86/kvm: Introduce KVM memory protection feature Kirill A. Shutemov
2021-04-16 16:10   ` Borislav Petkov
2021-04-19 10:10     ` Kirill A. Shutemov
2021-04-16 15:40 ` [RFCv2 03/13] x86/kvm: Make DMA pages shared Kirill A. Shutemov
2021-04-16 15:40 ` [RFCv2 04/13] x86/kvm: Use bounce buffers for KVM memory protection Kirill A. Shutemov
2021-04-16 16:21   ` Dave Hansen
2021-04-16 15:40 ` [RFCv2 05/13] x86/kvmclock: Share hvclock memory with the host Kirill A. Shutemov
2021-04-16 15:40 ` [RFCv2 06/13] x86/realmode: Share trampoline area if KVM memory protection enabled Kirill A. Shutemov
2021-04-19 16:49   ` Dave Hansen
2021-04-16 15:41 ` [RFCv2 07/13] mm: Add hwpoison_entry_to_pfn() and hwpoison_entry_to_page() Kirill A. Shutemov
2021-04-16 15:41 ` [RFCv2 08/13] mm/gup: Add FOLL_ALLOW_POISONED Kirill A. Shutemov
2021-04-16 15:41 ` [RFCv2 09/13] shmem: Fail shmem_getpage_gfp() on poisoned pages Kirill A. Shutemov
2021-04-16 15:41 ` [RFCv2 10/13] mm: Keep page reference for hwpoison entries Kirill A. Shutemov
2021-04-16 15:41 ` [RFCv2 11/13] mm: Replace hwpoison entry with present PTE if page got unpoisoned Kirill A. Shutemov
2021-04-16 15:41 ` [RFCv2 12/13] KVM: passdown struct kvm to hva_to_pfn_slow() Kirill A. Shutemov
2021-04-16 15:41 ` [RFCv2 13/13] KVM: unmap guest memory using poisoned pages Kirill A. Shutemov
2021-04-16 17:30   ` Sean Christopherson
2021-04-19 11:32     ` Xiaoyao Li
2021-04-19 14:26     ` Kirill A. Shutemov
2021-04-19 16:01       ` Sean Christopherson
2021-04-19 16:40         ` Kirill A. Shutemov
2021-04-19 18:09           ` Sean Christopherson
2021-04-19 18:12             ` David Hildenbrand
2021-04-19 18:53             ` Kirill A. Shutemov
2021-04-19 20:09               ` Sean Christopherson
2021-04-19 22:57                 ` Kirill A. Shutemov
2021-04-20 17:13                   ` Sean Christopherson
2021-05-21 12:31                     ` Kirill A. Shutemov
2021-05-26 19:46                       ` Sean Christopherson
2021-05-31 20:07                         ` Kirill A. Shutemov [this message]
2021-06-02 17:51                           ` Sean Christopherson
2021-06-02 23:33                             ` Kirill A. Shutemov
2021-06-03 19:46                               ` Sean Christopherson
2021-06-04 14:29                                 ` Kirill A. Shutemov
2021-06-04 17:16                       ` Andy Lutomirski
2021-06-04 17:54                         ` Kirill A. Shutemov
2021-04-16 16:46 ` [RFCv2 00/13] TDX and guest memory unmapping Matthew Wilcox

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20210531200712.qjxghakcaj4s6ara@box.shutemov.name \
    --to=kirill@shutemov.name \
    --cc=andi.kleen@intel.com \
    --cc=chao.p.peng@linux.intel.com \
    --cc=dave.hansen@linux.intel.com \
    --cc=david@redhat.com \
    --cc=erdemaktas@google.com \
    --cc=isaku.yamahata@intel.com \
    --cc=jmattson@google.com \
    --cc=kirill.shutemov@linux.intel.com \
    --cc=kvm@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=luto@kernel.org \
    --cc=peterz@infradead.org \
    --cc=pgonda@google.com \
    --cc=rick.p.edgecombe@intel.com \
    --cc=rientjes@google.com \
    --cc=seanjc@google.com \
    --cc=srutherford@google.com \
    --cc=x86@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.