KVM Archive on lore.kernel.org
 help / color / Atom feed
From: "Michael S. Tsirkin" <mst@redhat.com>
To: Peter Xu <peterx@redhat.com>
Cc: kvm@vger.kernel.org, linux-kernel@vger.kernel.org,
	Christophe de Dinechin <dinechin@redhat.com>,
	Paolo Bonzini <pbonzini@redhat.com>,
	Sean Christopherson <sean.j.christopherson@intel.com>,
	Yan Zhao <yan.y.zhao@intel.com>,
	Alex Williamson <alex.williamson@redhat.com>,
	Jason Wang <jasowang@redhat.com>,
	Kevin Kevin <kevin.tian@intel.com>,
	Vitaly Kuznetsov <vkuznets@redhat.com>,
	"Dr . David Alan Gilbert" <dgilbert@redhat.com>,
	Lei Cao <lei.cao@stratus.com>
Subject: Re: [PATCH v3 12/21] KVM: X86: Implement ring-based dirty memory tracking
Date: Sun, 12 Jan 2020 01:24:06 -0500
Message-ID: <20200112012239-mutt-send-email-mst@kernel.org> (raw)
In-Reply-To: <20200110152959.GC53397@xz-x1>

On Fri, Jan 10, 2020 at 10:29:59AM -0500, Peter Xu wrote:
> On Thu, Jan 09, 2020 at 05:18:24PM -0500, Michael S. Tsirkin wrote:
> > On Thu, Jan 09, 2020 at 03:19:16PM -0500, Peter Xu wrote:
> > > > > while for virtio, both sides (hypervisor,
> > > > > and the guest driver) are trusted.
> > > > 
> > > > What gave you the impression guest is trusted in virtio?
> > > 
> > > Hmm... maybe when I know virtio can bypass vIOMMU as long as it
> > > doesn't provide IOMMU_PLATFORM flag? :)
> > 
> > If guest driver does not provide IOMMU_PLATFORM, and device does,
> > then negotiation fails.
> 
> I mean it's still possible to specify "!IOMMU_PLATFORM" for the virtio
> device even if vIOMMU is enabled in the guest (rather than the
> negociation procedures).  Again I think it's fair, just the same
> reason as why we tend to even make "iommu=pt" by default for all the
> kernel drivers, because we should trust all the drivers as kernel
> itself.  The only thing we want to protect using vIOMMU is the
> userspace driver because we do have a line between the userspace and
> the kernel, and IMHO it's the same thing here for the new kvm
> interface.
> 
> > 
> > > I think it's logical to trust a virtio guest kernel driver, could you
> > > guide me on what I've missed?
> > 
> > 
> > guest driver is assumed to be part of guest kernel. It can't
> > do anything kernel can't do anyway.
> 
> Right, I think all things belongs to the kernel will have the same
> level of trust.  However again, userspace should be differently
> treated, and that's why I tend to prefer the index solution that we
> expose less to userspace to write (read is far safer comparing to
> writes from userspace).

You are mixing up different userspace kinds here. vIOMMU
prtects guest kernel from guest userspace.
Protecting guest kernel against userspace hypervisors
(e.g. QEMU) is mostly futile.


> > 
> > > > 
> > > > 
> > > > >  Above means we need to do these to
> > > > > change to the new design:
> > > > > 
> > > > >   - Allow the GFN array to be mapped as writable by userspace (so that
> > > > >     userspace can publish bit 2),
> > > > > 
> > > > >   - The userspace must be trusted to follow the design (just imagine
> > > > >     what if the userspace overwrites a GFN when it publishes bit 2
> > > > >     over a valid dirty gfn entry?  KVM could wrongly unprotect a page
> > > > >     for the guest...).
> > > > 
> > > > You mean protect, right?  So what?
> > > 
> > > Yes, I mean with that, more things are uncertain from userspace.  It
> > > seems easier to me that we restrict the userspace with one index.
> > 
> > Donnu how to treat vague statements like this.  You need to be specific
> > with threat models. Otherwise there's no way to tell whether code is
> > secure.
> > 
> > > > 
> > > > > While if we use the indices, we restrict the userspace to only be able
> > > > > to write to one index only (which is the reset_index).  That's all it
> > > > > can do to mess things up (and it could never as long as we properly
> > > > > validate the reset_index when read, which only happens during
> > > > > KVM_RESET_DIRTY_RINGS and is very rare).  From that pov, it seems the
> > > > > indices solution still has its benefits.
> > > > 
> > > > So if you mess up index how is this different?
> > > 
> > > We can't mess up much with that.  We simply check fetch_index (sorry I
> > > meant this when I said reset_index, anyway it's the only index that we
> > > expose to userspace) to make sure:
> > > 
> > >   reset_index <= fetch_index <= dirty_index
> > > 
> > > Otherwise we fail the ioctl.  With that, we're 100% safe.
> > 
> > safe from what? userspace can mess up guest memory trivially.
> > for example skip sending some memory or send junk.
> 
> Yes, QEMU can mess the guest up, but it should never mess the host up,
> am I right?  Regarding to QEMU as an userspace, KVM should see it as
> untrusted as well from host-wise.  However guest security is another
> thing, imho.
> 
> > 
> > > > 
> > > > I agree RO page kind of feels safer generally though.
> > > > 
> > > > I will have to re-read how does the ring works though,
> > > > my comments were based on the old assumption of mmaped
> > > > page with indices.
> > > 
> > > Yes, sorry again for a bad cover letter.
> > > 
> > > It's basically the same as before, just that we only have per-vcpu
> > > ring now, and the indices are exposed from kvm_run so we don't need
> > > the extra page, but we still expose that via mmap.
> > 
> > So that's why changelogs are useful.
> > Can you please write a changelog for this version so I don't
> > need to re-read all of it? Thanks!
> 
> Sure, actually I've got a changelog in the cover letter for this
> version [1]... it's:
> 
> V3 changelog:
> 
> - fail userspace writable maps on dirty ring ranges [Jason]
> - commit message fixups [Paolo]
> - change __x86_set_memory_region to return hva [Paolo]
> - cacheline align for indices [Paolo, Jason]
> - drop waitqueue, global lock, etc., include kvmgt rework patchset
> - take lock for __x86_set_memory_region() (otherwise it triggers a
>   lockdep in latest kvm/queue) [Paolo]
> - check KVM_DIRTY_LOG_PAGE_OFFSET in kvm_vm_ioctl_enable_dirty_log_ring
> - one more patch to drop x86_set_memory_region [Paolo]
> - one more patch to remove extra srcu usage in init_rmode_identity_map()
> - add some r-bs for Paolo
> 
> I didn't have detailed changelog for v2 because it could be a long
> list with trivial details which can hide the major things, but I've
> got a small write-up in the cover letter trying to mention the major
> changes [2].
> 
> Again, I'm very sorry for either missing a complete changelog in v2,
> or the high-level overview of v3 in the cover letter.  I'll make it
> better in v4.
> 
> Thanks,
> 
> [1] https://lore.kernel.org/kvm/20200109145729.32898-1-peterx@redhat.com/
> [2] https://lore.kernel.org/kvm/20191220211634.51231-1-peterx@redhat.com/
> 
> -- 
> Peter Xu


  reply index

Thread overview: 82+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-01-09 14:57 [PATCH v3 00/21] KVM: Dirty ring interface Peter Xu
2020-01-09 14:57 ` [PATCH v3 01/21] vfio: introduce vfio_iova_rw to read/write a range of IOVAs Peter Xu
2020-01-09 14:57 ` [PATCH v3 02/21] drm/i915/gvt: subsitute kvm_read/write_guest with vfio_iova_rw Peter Xu
2020-01-09 14:57 ` [PATCH v3 03/21] KVM: Remove kvm_read_guest_atomic() Peter Xu
2020-01-09 14:57 ` [PATCH v3 04/21] KVM: Add build-time error check on kvm_run size Peter Xu
2020-01-09 14:57 ` [PATCH v3 05/21] KVM: X86: Change parameter for fast_page_fault tracepoint Peter Xu
2020-01-09 14:57 ` [PATCH v3 06/21] KVM: X86: Don't take srcu lock in init_rmode_identity_map() Peter Xu
2020-01-09 14:57 ` [PATCH v3 07/21] KVM: Cache as_id in kvm_memory_slot Peter Xu
2020-01-09 14:57 ` [PATCH v3 08/21] KVM: X86: Drop x86_set_memory_region() Peter Xu
2020-01-09 14:57 ` [PATCH v3 09/21] KVM: X86: Don't track dirty for KVM_SET_[TSS_ADDR|IDENTITY_MAP_ADDR] Peter Xu
2020-01-19  9:01   ` Paolo Bonzini
2020-01-20  6:45     ` Peter Xu
2020-01-21 15:56   ` Sean Christopherson
2020-01-21 16:14     ` Paolo Bonzini
2020-01-28  5:50     ` Peter Xu
2020-01-28 18:24       ` Sean Christopherson
2020-01-31 15:08         ` Peter Xu
2020-01-31 19:33           ` Sean Christopherson
2020-01-31 20:28             ` Peter Xu
2020-01-31 20:36               ` Sean Christopherson
2020-01-31 20:55                 ` Peter Xu
2020-01-31 21:29                   ` Sean Christopherson
2020-01-31 22:16                     ` Peter Xu
2020-01-31 22:20                       ` Sean Christopherson
2020-01-09 14:57 ` [PATCH v3 10/21] KVM: Pass in kvm pointer into mark_page_dirty_in_slot() Peter Xu
2020-01-09 14:57 ` [PATCH v3 11/21] KVM: Move running VCPU from ARM to common code Peter Xu
2020-01-09 14:57 ` [PATCH v3 12/21] KVM: X86: Implement ring-based dirty memory tracking Peter Xu
2020-01-09 16:29   ` Michael S. Tsirkin
2020-01-09 16:56     ` Alex Williamson
2020-01-09 19:21       ` Peter Xu
2020-01-09 19:36         ` Michael S. Tsirkin
2020-01-09 19:15     ` Peter Xu
2020-01-09 19:35       ` Michael S. Tsirkin
2020-01-09 20:19         ` Peter Xu
2020-01-09 22:18           ` Michael S. Tsirkin
2020-01-10 15:29             ` Peter Xu
2020-01-12  6:24               ` Michael S. Tsirkin [this message]
2020-01-14 20:01         ` Peter Xu
2020-01-15  6:50           ` Michael S. Tsirkin
2020-01-15 15:20             ` Peter Xu
2020-01-19  9:09       ` Paolo Bonzini
2020-01-19 10:12         ` Michael S. Tsirkin
2020-01-20  7:29           ` Peter Xu
2020-01-20  7:47             ` Michael S. Tsirkin
2020-01-21  8:29               ` Peter Xu
2020-01-21 10:25                 ` Paolo Bonzini
2020-01-21 10:24             ` Paolo Bonzini
2020-01-11  4:49   ` kbuild test robot
2020-01-11 23:19   ` kbuild test robot
2020-01-15  6:47   ` Michael S. Tsirkin
2020-01-15 15:27     ` Peter Xu
2020-01-16  8:38   ` Michael S. Tsirkin
2020-01-16 16:27     ` Peter Xu
2020-01-17  9:50       ` Michael S. Tsirkin
2020-01-20  6:48         ` Peter Xu
2020-01-09 14:57 ` [PATCH v3 13/21] KVM: Make dirty ring exclusive to dirty bitmap log Peter Xu
2020-01-09 14:57 ` [PATCH v3 14/21] KVM: Don't allocate dirty bitmap if dirty ring is enabled Peter Xu
2020-01-09 16:41   ` Peter Xu
2020-01-09 14:57 ` [PATCH v3 15/21] KVM: selftests: Always clear dirty bitmap after iteration Peter Xu
2020-01-09 14:57 ` [PATCH v3 16/21] KVM: selftests: Sync uapi/linux/kvm.h to tools/ Peter Xu
2020-01-09 14:57 ` [PATCH v3 17/21] KVM: selftests: Use a single binary for dirty/clear log test Peter Xu
2020-01-09 14:57 ` [PATCH v3 18/21] KVM: selftests: Introduce after_vcpu_run hook for dirty " Peter Xu
2020-01-09 14:57 ` [PATCH v3 19/21] KVM: selftests: Add dirty ring buffer test Peter Xu
2020-01-09 14:57 ` [PATCH v3 20/21] KVM: selftests: Let dirty_log_test async for dirty ring test Peter Xu
2020-01-09 14:57 ` [PATCH v3 21/21] KVM: selftests: Add "-c" parameter to dirty log test Peter Xu
2020-01-09 15:59 ` [PATCH v3 00/21] KVM: Dirty ring interface Michael S. Tsirkin
2020-01-09 16:17   ` Peter Xu
2020-01-09 16:40     ` Michael S. Tsirkin
2020-01-09 17:08       ` Peter Xu
2020-01-09 19:08         ` Michael S. Tsirkin
2020-01-09 19:39           ` Peter Xu
2020-01-09 20:42             ` Paolo Bonzini
2020-01-09 22:28             ` Michael S. Tsirkin
2020-01-10 15:10               ` Peter Xu
2020-01-09 16:47 ` Alex Williamson
2020-01-09 17:58   ` Peter Xu
2020-01-09 19:13     ` Michael S. Tsirkin
2020-01-09 19:23       ` Peter Xu
2020-01-09 19:37         ` Michael S. Tsirkin
2020-01-09 20:51       ` Paolo Bonzini
2020-01-09 22:21         ` Michael S. Tsirkin
2020-01-19  9:11 ` Paolo Bonzini

Reply instructions:

You may reply publically to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20200112012239-mutt-send-email-mst@kernel.org \
    --to=mst@redhat.com \
    --cc=alex.williamson@redhat.com \
    --cc=dgilbert@redhat.com \
    --cc=dinechin@redhat.com \
    --cc=jasowang@redhat.com \
    --cc=kevin.tian@intel.com \
    --cc=kvm@vger.kernel.org \
    --cc=lei.cao@stratus.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=pbonzini@redhat.com \
    --cc=peterx@redhat.com \
    --cc=sean.j.christopherson@intel.com \
    --cc=vkuznets@redhat.com \
    --cc=yan.y.zhao@intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

KVM Archive on lore.kernel.org

Archives are clonable:
	git clone --mirror https://lore.kernel.org/kvm/0 kvm/git/0.git

	# If you have public-inbox 1.1+ installed, you may
	# initialize and index your mirror using the following commands:
	public-inbox-init -V2 kvm kvm/ https://lore.kernel.org/kvm \
		kvm@vger.kernel.org
	public-inbox-index kvm

Example config snippet for mirrors

Newsgroup available over NNTP:
	nntp://nntp.lore.kernel.org/org.kernel.vger.kvm


AGPL code for this site: git clone https://public-inbox.org/public-inbox.git