On Fri, 2021-11-12 at 15:56 +0100, Paolo Bonzini wrote:
> On 11/12/21 14:28, David Woodhouse wrote:
> > A back-to-back write_lock/write_unlock *without* setting the address to
> > KVM_UNMAPPED_PAGE? I'm not sure I see how that protects the IRQ
> > delivery from accessing the (now stale) physical page after the MMU
> > notifier has completed? Not unless it's going to call hva_to_pfn again
> > for itself under the read_lock, every time it delivers an IRQ?
>
> Yeah, you're right, it still has to invalidate it somehow. So
> KVM_UNMAPPED_PAGE would go in the hva field of the gfn_to_pfn cache
> (merged with kvm_host_map). Or maybe one can use an invalid generation,
> too.

Right. Or now you have me adding *flags* anyway, so I might just use one
of those to mark it invalid.

We do need to keep the original hva pointer around to unmap it anyway,
even if it's become invalid. That's why my existing code has a separate
kvm->arch.xen.shared_info pointer in *addition* to the map->hva pointer;
the latter of which *isn't* wiped on invalidation.

> > > 2) for memremap/memunmap, all you really care about is reacting to
> > > changes in the memslots, so the MMU notifier integration has nothing
> > > to do. You still need to call the same hook as
> > > kvm_mmu_notifier_invalidate_range() when memslots change, so that
> > > the update is done outside atomic context.
> >
> > Hm, we definitely *do* care about reacting to MMU notifiers in this
> > case too. Userspace can do memory overcommit / ballooning etc.
> > *without* changing the memslots, and only mmap/munmap/userfault_fd on
> > the corresponding HVA ranges.
>
> Can it do so for VM_IO/VM_PFNMAP memory?

It can, yes.
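
To make the flags idea concrete, here's a minimal sketch of the sort of
thing I'm describing; the struct layout, the GPC_VALID bit and the
helper names are invented purely for illustration, not the actual
gfn_to_pfn cache / kvm_host_map code:

#include <linux/kvm_types.h>
#include <linux/spinlock.h>
#include <linux/bits.h>

#define GPC_VALID	BIT(0)	/* illustrative flag name only */

struct gfn_to_pfn_cache {
	rwlock_t lock;
	gpa_t gpa;
	unsigned long uhva;	/* userspace HVA, kept for the later unmap */
	void *khva;		/* kernel mapping of the page */
	kvm_pfn_t pfn;
	unsigned long flags;
};

/*
 * MMU notifier path: mark the cache invalid but leave uhva/khva intact
 * so the eventual teardown can still unmap the (now stale) mapping.
 */
static void gpc_invalidate(struct gfn_to_pfn_cache *gpc)
{
	write_lock(&gpc->lock);
	gpc->flags &= ~GPC_VALID;
	write_unlock(&gpc->lock);
}

/*
 * IRQ delivery path: only touch the mapping while it's marked valid;
 * otherwise punt to a context that can redo hva_to_pfn() and remap.
 */
static bool gpc_deliver(struct gfn_to_pfn_cache *gpc)
{
	bool ok;

	read_lock(&gpc->lock);
	ok = gpc->flags & GPC_VALID;
	if (ok) {
		/* ... poke the event channel bits via gpc->khva ... */
	}
	read_unlock(&gpc->lock);
	return ok;
}

The point being that invalidation only clears the valid bit; the hva
fields stay intact so the unmap can still happen later, which is why
the existing code keeps the separate shared_info pointer alongside
map->hva today.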