dri-devel Archive on lore.kernel.org
 help / color / Atom feed
From: Chia-I Wu <olvaffe@gmail.com>
To: Sean Christopherson <sean.j.christopherson@intel.com>
Cc: wanpengli@tencent.com, kvm@vger.kernel.org, joro@8bytes.org,
	ML dri-devel <dri-devel@lists.freedesktop.org>,
	Gurchetan Singh <gurchetansingh@chromium.org>,
	Gerd Hoffmann <kraxel@redhat.com>,
	Paolo Bonzini <pbonzini@redhat.com>,
	vkuznets@redhat.com, jmattson@google.com
Subject: Re: [RFC PATCH 0/3] KVM: x86: honor guest memory type
Date: Fri, 14 Feb 2020 13:47:08 -0800
Message-ID: <CAPaKu7Q4gehyhEgG_Nw=tiZiTh+7A8-uuXq1w4he6knp6NWErQ@mail.gmail.com> (raw)
In-Reply-To: <20200214195229.GF20690@linux.intel.com>

On Fri, Feb 14, 2020 at 11:52 AM Sean Christopherson
<sean.j.christopherson@intel.com> wrote:
>
> On Fri, Feb 14, 2020 at 11:26:06AM +0100, Paolo Bonzini wrote:
> > On 13/02/20 23:18, Chia-I Wu wrote:
> > >
> > > The bug you mentioned was probably this one
> > >
> > >   https://bugzilla.kernel.org/show_bug.cgi?id=104091
> >
> > Yes, indeed.
> >
> > > From what I can tell, the commit allowed the guests to create cached
> > > mappings to MMIO regions and caused MCEs.  That is different than what
> > > I need, which is to allow guests to create uncached mappings to system
> > > ram (i.e., !kvm_is_mmio_pfn) when the host userspace also has uncached
> > > mappings.  But it is true that this still allows the userspace & guest
> > > kernel to create conflicting memory types.
>
> This is ok.
>
> > Right, the question is whether the MCEs were tied to MMIO regions
> > specifically and if so why.
>
> 99.99999% likelihood the answer is "yes".  Cacheable accesses to non-existent
> memory and most (all?) MMIO regions will cause a #MC.  This includes
> speculative accesses.
>
> Commit fd717f11015f ("KVM: x86: apply guest MTRR virtualization on host
> reserved pages") explicitly had a comment "1. MMIO: trust guest MTRR",
> which is basically a direct avenue to generating #MCs.
>
> IIRC, WC accesses to non-existent memory will also cause #MC, but KVM has
> bigger problems if it has PRESENT EPTEs pointing at garbage.
>
> > An interesting remark is in the footnote of table 11-7 in the SDM.
> > There, for the MTRR (EPT for us) memory type UC you can read:
> >
> >   The UC attribute comes from the MTRRs and the processors are not
> >   required to snoop their caches since the data could never have
> >   been cached. This attribute is preferred for performance reasons.
> >
> > There are two possibilities:
> >
> > 1) the footnote doesn't apply to UC mode coming from EPT page tables.
> > That would make your change safe.
> >
> > 2) the footnote also applies when the UC attribute comes from the EPT
> > page tables rather than the MTRRs.  In that case, the host should use
> > UC as the EPT page attribute if and only if it's consistent with the host
> > MTRRs; it would be more or less impossible to honor UC in the guest MTRRs.
> > In that case, something like the patch below would be needed.
>
> (2), the EPTs effectively replace the MTRRs.  The expectation being that
> the VMM will use always use EPT memtypes consistent with the MTRRs.
This is my understanding as well.

> > It is not clear from the manual why the footnote would not apply to WC; that
> > is, the manual doesn't say explicitly that the processor does not do snooping
> > for accesses to WC memory.  But I guess that must be the case, which is why I
> > used MTRR_TYPE_WRCOMB in the patch below.
>
> A few paragraphs below table 11-12 states:
>
>   In particular, a WC page must never be aliased to a cacheable page because
>   WC writes may not check the processor caches.
>
> > Either way, we would have an explanation of why creating cached mapping to
> > MMIO regions would, and why in practice we're not seeing MCEs for guest RAM
> > (the guest would have set WB for that memory in its MTRRs, not UC).
>
> Aliasing (physical) RAM with different memtypes won't cause #MC, just
> memory corruption.

What we need potentially gives the userspace (the guest kernel, to be
exact) the ability to create conflicting memory types.  If we can be
sure that the worst scenario is for a guest to corrupt its own memory,
by only allowing aliases on physical ram, that seems alright.

AFAICT, it is currently allowed on ARM (verified) and AMD (not
verified, but svm_get_mt_mask returns 0 which supposedly means the NPT
does not restrict what the guest PAT can do).  This diff would do the
trick for Intel without needing any uapi change:

diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index e3394c839dea..88af11cc551a 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -6905,12 +6905,6 @@ static u64 vmx_get_mt_mask(struct kvm_vcpu
*vcpu, gfn_t gfn, bool is_mmio)
                goto exit;
        }

-       if (!kvm_arch_has_noncoherent_dma(vcpu->kvm)) {
-               ipat = VMX_EPT_IPAT_BIT;
-               cache = MTRR_TYPE_WRBACK;
-               goto exit;
-       }
-
        if (kvm_read_cr0(vcpu) & X86_CR0_CD) {
                ipat = VMX_EPT_IPAT_BIT;
                if (kvm_check_has_quirk(vcpu->kvm, KVM_X86_QUIRK_CD_NW_CLEARED))



> > One thing you didn't say: how would userspace use KVM_MEM_DMA?  On which
> > regions would it be set?
> >
> > Thanks,
> >
> > Paolo
> >
> > diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
> > index dc331fb06495..2be6f7effa1d 100644
> > --- a/arch/x86/kvm/vmx/vmx.c
> > +++ b/arch/x86/kvm/vmx/vmx.c
> > @@ -6920,8 +6920,16 @@ static u64 vmx_get_mt_mask(struct kvm_vcpu *vcpu, gfn_t gfn, bool is_mmio)
> >       }
> >
> >       cache = kvm_mtrr_get_guest_memory_type(vcpu, gfn);
> > -
> >  exit:
> > +     if (cache == MTRR_TYPE_UNCACHABLE && !is_mmio) {
> > +             /*
> > +              * We cannot set UC in the EPT page tables as it can cause
> > +              * machine check exceptions (??).  Hopefully the guest is
> > +              * using PAT.
> > +              */
> > +             cache = MTRR_TYPE_WRCOMB;
>
> This is unnecessary.  Setting UC in the EPT tables will never directly lead
> to #MC.  Forcing WC is likely more dangerous.
>
> > +     }
> > +
> >       return (cache << VMX_EPT_MT_EPTE_SHIFT) | ipat;
> >  }
> >
> >
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

  reply index

Thread overview: 31+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-02-13 21:30 Chia-I Wu
2020-02-13 21:30 ` [RFC PATCH 1/3] KVM: vmx: rewrite the comment in vmx_get_mt_mask Chia-I Wu
2020-02-14  9:36   ` Paolo Bonzini
2020-02-13 21:30 ` [RFC PATCH 2/3] RFC: KVM: add KVM_MEM_DMA Chia-I Wu
2020-02-13 21:30 ` [RFC PATCH 3/3] RFC: KVM: x86: support KVM_CAP_DMA_MEM Chia-I Wu
2020-02-13 21:41 ` [RFC PATCH 0/3] KVM: x86: honor guest memory type Paolo Bonzini
2020-02-13 22:18   ` Chia-I Wu
2020-02-14 10:26     ` Paolo Bonzini
2020-02-14 19:52       ` Sean Christopherson
2020-02-14 21:47         ` Chia-I Wu [this message]
2020-02-14 21:56           ` Jim Mattson
2020-02-14 22:03             ` Sean Christopherson
2020-02-18 16:28               ` Paolo Bonzini
2020-02-18 22:58                 ` Sean Christopherson
2020-02-19  9:52                 ` Tian, Kevin
2020-02-19 19:36                   ` Chia-I Wu
2020-02-20  2:04                     ` Tian, Kevin
2020-02-20  2:38                       ` Tian, Kevin
2020-02-20 22:23                         ` Chia-I Wu
2020-02-21  0:23                           ` Tian, Kevin
2020-02-21  4:45                             ` Chia-I Wu
2020-02-21  4:51                               ` Chia-I Wu
2020-02-21  5:39                                 ` Tian, Kevin
2020-02-21 15:59                                   ` Sean Christopherson
2020-02-21 18:21                                     ` Chia-I Wu
2020-02-14 21:15       ` Chia-I Wu
2020-02-19 10:00         ` Tian, Kevin
2020-02-19 19:18           ` Chia-I Wu
2020-02-20  2:13             ` Tian, Kevin
2020-02-20 23:02               ` Chia-I Wu
2020-02-24 10:57               ` Gerd Hoffmann

Reply instructions:

You may reply publically to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to='CAPaKu7Q4gehyhEgG_Nw=tiZiTh+7A8-uuXq1w4he6knp6NWErQ@mail.gmail.com' \
    --to=olvaffe@gmail.com \
    --cc=dri-devel@lists.freedesktop.org \
    --cc=gurchetansingh@chromium.org \
    --cc=jmattson@google.com \
    --cc=joro@8bytes.org \
    --cc=kraxel@redhat.com \
    --cc=kvm@vger.kernel.org \
    --cc=pbonzini@redhat.com \
    --cc=sean.j.christopherson@intel.com \
    --cc=vkuznets@redhat.com \
    --cc=wanpengli@tencent.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

dri-devel Archive on lore.kernel.org

Archives are clonable:
	git clone --mirror https://lore.kernel.org/dri-devel/0 dri-devel/git/0.git

	# If you have public-inbox 1.1+ installed, you may
	# initialize and index your mirror using the following commands:
	public-inbox-init -V2 dri-devel dri-devel/ https://lore.kernel.org/dri-devel \
		dri-devel@lists.freedesktop.org
	public-inbox-index dri-devel

Example config snippet for mirrors

Newsgroup available over NNTP:
	nntp://nntp.lore.kernel.org/org.freedesktop.lists.dri-devel


AGPL code for this site: git clone https://public-inbox.org/public-inbox.git