dri-devel.lists.freedesktop.org archive mirror
 help / color / mirror / Atom feed
From: Chia-I Wu <olvaffe@gmail.com>
To: "Tian, Kevin" <kevin.tian@intel.com>
Cc: Wanpeng Li <wanpengli@tencent.com>,
	ML dri-devel <dri-devel@lists.freedesktop.org>,
	kvm list <kvm@vger.kernel.org>, Joerg Roedel <joro@8bytes.org>,
	"Christopherson, Sean J" <sean.j.christopherson@intel.com>,
	Gurchetan Singh <gurchetansingh@chromium.org>,
	Gerd Hoffmann <kraxel@redhat.com>,
	Paolo Bonzini <pbonzini@redhat.com>,
	Vitaly Kuznetsov <vkuznets@redhat.com>,
	Jim Mattson <jmattson@google.com>
Subject: Re: [RFC PATCH 0/3] KVM: x86: honor guest memory type
Date: Thu, 20 Feb 2020 14:23:51 -0800	[thread overview]
Message-ID: <CAPaKu7RFY3nar9hmAdx6RYdZFPK3Cdg1O3cS+OvsEOT=yupyrQ@mail.gmail.com> (raw)
In-Reply-To: <AADFC41AFE54684AB9EE6CBC0274A5D19D78EF58@SHSMSX104.ccr.corp.intel.com>

On Wed, Feb 19, 2020 at 6:38 PM Tian, Kevin <kevin.tian@intel.com> wrote:
>
> > From: Tian, Kevin
> > Sent: Thursday, February 20, 2020 10:05 AM
> >
> > > From: Chia-I Wu <olvaffe@gmail.com>
> > > Sent: Thursday, February 20, 2020 3:37 AM
> > >
> > > On Wed, Feb 19, 2020 at 1:52 AM Tian, Kevin <kevin.tian@intel.com> wrote:
> > > >
> > > > > From: Paolo Bonzini
> > > > > Sent: Wednesday, February 19, 2020 12:29 AM
> > > > >
> > > > > On 14/02/20 23:03, Sean Christopherson wrote:
> > > > > >> On Fri, Feb 14, 2020 at 1:47 PM Chia-I Wu <olvaffe@gmail.com>
> > wrote:
> > > > > >>> AFAICT, it is currently allowed on ARM (verified) and AMD (not
> > > > > >>> verified, but svm_get_mt_mask returns 0 which supposedly means
> > > the
> > > > > NPT
> > > > > >>> does not restrict what the guest PAT can do).  This diff would do the
> > > > > >>> trick for Intel without needing any uapi change:
> > > > > >> I would be concerned about Intel CPU errata such as SKX40 and
> > SKX59.
> > > > > > The part KVM cares about, #MC, is already addressed by forcing UC
> > for
> > > > > MMIO.
> > > > > > The data corruption issue is on the guest kernel to correctly use WC
> > > > > > and/or non-temporal writes.
> > > > >
> > > > > What about coherency across live migration?  The userspace process
> > > would
> > > > > use cached accesses, and also a WBINVD could potentially corrupt guest
> > > > > memory.
> > > > >
> > > >
> > > > In such case the userspace process possibly should conservatively use
> > > > UC mapping, as if for MMIO regions on a passthrough device. However
> > > > there remains a problem. the definition of KVM_MEM_DMA implies
> > > > favoring guest setting, which could be whatever type in concept. Then
> > > > assuming UC is also problematic. I'm not sure whether inventing another
> > > > interface to query effective memory type from KVM is a good idea. There
> > > > is no guarantee that the guest will use same type for every page in the
> > > > same slot, then such interface might be messy. Alternatively, maybe
> > > > we could just have an interface for KVM userspace to force memory type
> > > > for a given slot, if it is mainly used in para-virtualized scenarios (e.g.
> > > > virtio-gpu) where the guest is enlightened to use a forced type (e.g. WC)?
> > > KVM forcing the memory type for a given slot should work too.  But the
> > > ignore-guest-pat bit seems to be Intel-specific.  We will need to
> > > define how the second-level page attributes combine with the guest
> > > page attributes somehow.
> >
> > oh, I'm not aware of that difference. without an ipat-equivalent
> > capability, I'm not sure how to forcing random type here. If you look at
> > table 11-7 in Intel SDM, none of MTRR (EPT) memory type can lead to
> > consistent effective type when combining with random PAT value. So
> >  it is definitely a dead end.
> >
> > >
> > > KVM should in theory be able to tell that the userspace region is
> > > mapped with a certain memory type and can force the same memory type
> > > onto the guest.  The userspace does not need to be involved.  But that
> > > sounds very slow?  This may be a dumb question, but would it help to
> > > add KVM_SET_DMA_BUF and let KVM negotiate the memory type with the
> > > in-kernel GPU drivers?
> > >
> > >
> >
> > KVM_SET_DMA_BUF looks more reasonable. But I guess we don't need
> > KVM to be aware of such negotiation. We can continue your original
> > proposal to have KVM simply favor guest memory type (maybe still call
> > KVM_MEM_DMA). On the other hand, Qemu should just mmap on the
> > fd handle of the dmabuf passed from the virtio-gpu device backend,  e.g.
> > to conduct migration. That way the mmap request is finally served by
> > DRM and underlying GPU drivers, with proper type enforced automatically.
> >
>
> Thinking more possibly we don't need introduce new interface to KVM.
> As long as Qemu uses dmabuf interface to mmap the specific region,
> KVM can simply check memory type in host page table given hva of a
> memslot. If the type is UC or WC, it implies that userspace wants a
> non-coherent mapping which should be reflected in the guest side too.
> In such case, KVM can go to non-cohenrent DMA path and favor guest
> memory type automatically.
Sorry, I mixed two things together.

Userspace access to dmabuf mmap must be guarded by
DMA_BUF_SYNC_{START,END} ioctls.  It is possible that the GPU driver
always picks a WB mapping and let the ioctls flush/invalidate CPU
caches.  We actually want the guest memory type to match vkMapMemory's
memory type, which can be different from dmabuf mmap's memory type.
It is not enough for KVM to inspect the hva's memory type.

KVM_SET_DMA_BUF, if supported, is a signal to KVM that the guest
memory type should be honored (or forced if there is a new op in
dma_buf_ops that tells KVM which memory type to force).  KVM_MEM_DMA
flag in this RFC sends the same signal.  Unless KVM_SET_DMA_BUF gives
the userspace other features such as setting unlimited number of
dmabufs to subregions of a memslot, it is not very useful.

If uapi change is to be avoided, it is the easiest that guest memory
type is always honored unless it causes #MC (i.e.,is_mmio==true).


>
> Thanks
> Kevin
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

  reply	other threads:[~2020-02-20 22:24 UTC|newest]

Thread overview: 32+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-02-13 21:30 [RFC PATCH 0/3] KVM: x86: honor guest memory type Chia-I Wu
2020-02-13 21:30 ` [RFC PATCH 1/3] KVM: vmx: rewrite the comment in vmx_get_mt_mask Chia-I Wu
2020-02-14  9:36   ` Paolo Bonzini
2020-02-13 21:30 ` [RFC PATCH 2/3] RFC: KVM: add KVM_MEM_DMA Chia-I Wu
2020-02-13 21:30 ` [RFC PATCH 3/3] RFC: KVM: x86: support KVM_CAP_DMA_MEM Chia-I Wu
2020-02-13 21:41 ` [RFC PATCH 0/3] KVM: x86: honor guest memory type Paolo Bonzini
2020-02-13 22:18   ` Chia-I Wu
2020-02-14 10:26     ` Paolo Bonzini
2020-02-14 19:52       ` Sean Christopherson
2020-02-14 21:47         ` Chia-I Wu
2020-02-14 21:56           ` Jim Mattson
2020-02-14 22:03             ` Sean Christopherson
2020-02-18 16:28               ` Paolo Bonzini
2020-02-18 22:58                 ` Sean Christopherson
2020-02-19  9:52                 ` Tian, Kevin
2020-02-19 19:36                   ` Chia-I Wu
2020-02-20  2:04                     ` Tian, Kevin
2020-02-20  2:38                       ` Tian, Kevin
2020-02-20 22:23                         ` Chia-I Wu [this message]
2020-02-21  0:23                           ` Tian, Kevin
2020-02-21  4:45                             ` Chia-I Wu
2020-02-21  4:51                               ` Chia-I Wu
2020-02-21  5:39                                 ` Tian, Kevin
2020-02-21 15:59                                   ` Sean Christopherson
2020-02-21 18:21                                     ` Chia-I Wu
2020-02-25  1:29                                       ` Tian, Kevin
2020-02-14 21:15       ` Chia-I Wu
2020-02-19 10:00         ` Tian, Kevin
2020-02-19 19:18           ` Chia-I Wu
2020-02-20  2:13             ` Tian, Kevin
2020-02-20 23:02               ` Chia-I Wu
2020-02-24 10:57               ` Gerd Hoffmann

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to='CAPaKu7RFY3nar9hmAdx6RYdZFPK3Cdg1O3cS+OvsEOT=yupyrQ@mail.gmail.com' \
    --to=olvaffe@gmail.com \
    --cc=dri-devel@lists.freedesktop.org \
    --cc=gurchetansingh@chromium.org \
    --cc=jmattson@google.com \
    --cc=joro@8bytes.org \
    --cc=kevin.tian@intel.com \
    --cc=kraxel@redhat.com \
    --cc=kvm@vger.kernel.org \
    --cc=pbonzini@redhat.com \
    --cc=sean.j.christopherson@intel.com \
    --cc=vkuznets@redhat.com \
    --cc=wanpengli@tencent.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).