* [PATCH] KVM: arm64: Skip CMOs when updating a PTE pointing to non-memory
@ 2021-04-26 10:36 Marc Zyngier
  2021-04-26 10:41 ` Marc Zyngier
  2021-04-27 14:52 ` Alexandru Elisei
  0 siblings, 2 replies; 5+ messages in thread
From: Marc Zyngier @ 2021-04-26 10:36 UTC (permalink / raw)
  To: linux-arm-kernel, kvmarm, kvm
  Cc: James Morse, Suzuki K Poulose, Alexandru Elisei, kernel-team,
	Jean-Philippe Brucker, Krishna Reddy, Sumit Gupta

Sumit Gupta and Krishna Reddy both reported that for MMIO regions
mapped into userspace using VFIO, a PTE update can trigger an MMU
notifier reaching kvm_set_spte_hva().

There is an assumption baked into kvm_set_spte_hva() that it only
deals with memory pages, and not MMIO. For this purpose, it
performs a cache clean of the potentially newly mapped page.
However, for an MMIO range, this explodes as there is no linear
mapping for this range (and doing cache maintenance on it would
make little sense anyway).

Checking for the validity of the page before performing the CMO
addresses the problem.

Reported-by: Krishna Reddy <vdumpa@nvidia.com>
Reported-by: Sumit Gupta <sumitg@nvidia.com>
Tested-by: Sumit Gupta <sumitg@nvidia.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/5a8825bc-286e-b316-515f-3bd3c9c70a80@nvidia.com
---
 arch/arm64/kvm/mmu.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/arch/arm64/kvm/mmu.c b/arch/arm64/kvm/mmu.c
index cd4d51ae3d4a..564a0f7fcd05 100644
--- a/arch/arm64/kvm/mmu.c
+++ b/arch/arm64/kvm/mmu.c
@@ -1236,7 +1236,8 @@ int kvm_set_spte_hva(struct kvm *kvm, unsigned long hva, pte_t pte)
 	 * We've moved a page around, probably through CoW, so let's treat it
 	 * just like a translation fault and clean the cache to the PoC.
 	 */
-	clean_dcache_guest_page(pfn, PAGE_SIZE);
+	if (!kvm_is_device_pfn(pfn))
+		clean_dcache_guest_page(pfn, PAGE_SIZE);
 	handle_hva_to_gpa(kvm, hva, end, &kvm_set_spte_handler, &pfn);
 	return 0;
 }
-- 
2.30.2
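
For context, the helper used by the fix, kvm_is_device_pfn(), is around
v5.12 essentially a pfn_valid() check: a pfn without a struct page has no
linear-map alias, so there is nothing to clean. A rough sketch (based on
arch/arm64/kvm/mmu.c of that era, shown for reference rather than verbatim):

	static bool kvm_is_device_pfn(unsigned long pfn)
	{
		/* No struct page, hence no linear mapping: treat as device/MMIO */
		return !pfn_valid(pfn);
	}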



* Re: [PATCH] KVM: arm64: Skip CMOs when updating a PTE pointing to non-memory
  2021-04-26 10:36 [PATCH] KVM: arm64: Skip CMOs when updating a PTE pointing to non-memory Marc Zyngier
@ 2021-04-26 10:41 ` Marc Zyngier
  2021-04-27 14:52 ` Alexandru Elisei
  1 sibling, 0 replies; 5+ messages in thread
From: Marc Zyngier @ 2021-04-26 10:41 UTC (permalink / raw)
  To: linux-arm-kernel, kvmarm, kvm
  Cc: Jean-Philippe Brucker, Suzuki K Poulose, kernel-team,
	James Morse, Sumit Gupta, Alexandru Elisei

On 2021-04-26 11:36, Marc Zyngier wrote:
> Sumit Gupta and Krishna Reddy both reported that for MMIO regions
> mapped into userspace using VFIO, a PTE update can trigger an MMU
> notifier reaching kvm_set_spte_hva().
> 
> There is an assumption baked into kvm_set_spte_hva() that it only
> deals with memory pages, and not MMIO. For this purpose, it
> performs a cache clean of the potentially newly mapped page.
> However, for an MMIO range, this explodes as there is no linear
> mapping for this range (and doing cache maintenance on it would
> make little sense anyway).
> 
> Checking for the validity of the page before performing the CMO
> addresses the problem.
> 
> Reported-by: Krishna Reddy <vdumpa@nvidia.com>
> Reported-by: Sumit Gupta <sumitg@nvidia.com>
> Tested-by: Sumit Gupta <sumitg@nvidia.com>
> Signed-off-by: Marc Zyngier <maz@kernel.org>
> Link: 
> https://lore.kernel.org/r/5a8825bc-286e-b316-515f-3bd3c9c70a80@nvidia.com

FWIW, I've locally added:

Fixes: 694556d54f35 ("KVM: arm/arm64: Clean dcache to PoC when changing PTE due to CoW")
Cc: stable@vger.kernel.org

         M.
-- 
Jazz is not dead. It just smells funny...


* Re: [PATCH] KVM: arm64: Skip CMOs when updating a PTE pointing to non-memory
  2021-04-26 10:36 [PATCH] KVM: arm64: Skip CMOs when updating a PTE pointing to non-memory Marc Zyngier
  2021-04-26 10:41 ` Marc Zyngier
@ 2021-04-27 14:52 ` Alexandru Elisei
  2021-04-27 15:23   ` Jean-Philippe Brucker
  1 sibling, 1 reply; 5+ messages in thread
From: Alexandru Elisei @ 2021-04-27 14:52 UTC (permalink / raw)
  To: Marc Zyngier, linux-arm-kernel, kvmarm, kvm
  Cc: James Morse, Suzuki K Poulose, kernel-team,
	Jean-Philippe Brucker, Krishna Reddy, Sumit Gupta

Hi,

I've been trying to reproduce the panic, but I haven't had any success.

With a known working PCI passthrough device, this is how I changed kvmtool:

diff --git a/vfio/core.c b/vfio/core.c
index 3ff2c0b..b4ee7e9 100644
--- a/vfio/core.c
+++ b/vfio/core.c
@@ -261,6 +261,9 @@ int vfio_map_region(struct kvm *kvm, struct vfio_device *vdev,
                return ret;
        }
 
+       char c = *(char *)base;
+       fprintf(stderr, "c = %c\n", c);
+
        return 0;
 }
 
The change reads from the BAR region after it has been mmap'ed into
userspace. I can see that the read hits vfio_pci_mmap_fault(), which
calls io_remap_pfn_range(), but I can't figure out how to trigger the MMU
notifiers. Any suggestions?

The comment [1] suggested that the panic is triggered during page aging.
vfio_pci_mmap() sets VM_PFNMAP for the VMA, and I see in the Documentation that
pages with VM_PFNMAP are added to the unevictable LRU list; doesn't that mean they
are not subject to page aging? I feel like there's something I'm missing.

[1]
https://lore.kernel.org/kvm/BY5PR12MB37642B9AC7E5D907F5A664F6B3459@BY5PR12MB3764.namprd12.prod.outlook.com/
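
For reference, vfio_pci_mmap() marks the BAR VMA roughly as follows (a sketch
based on drivers/vfio/pci/vfio_pci.c around v5.12, not a verbatim excerpt);
VM_PFNMAP is what keeps such mappings off the normal LRU lists:

	/* Sketch: BAR mmap sets up a raw pfn mapping, no struct pages involved */
	vma->vm_page_prot = pgprot_noncached(vma->vm_page_prot);
	vma->vm_flags |= VM_IO | VM_PFNMAP | VM_DONTEXPAND | VM_DONTDUMP;
	vma->vm_ops = &vfio_pci_mmap_ops;	/* faults land in vfio_pci_mmap_fault() */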

Thanks,

Alex

On 4/26/21 11:36 AM, Marc Zyngier wrote:
> Sumit Gupta and Krishna Reddy both reported that for MMIO regions
> mapped into userspace using VFIO, a PTE update can trigger an MMU
> notifier reaching kvm_set_spte_hva().
>
> There is an assumption baked into kvm_set_spte_hva() that it only
> deals with memory pages, and not MMIO. For this purpose, it
> performs a cache clean of the potentially newly mapped page.
> However, for an MMIO range, this explodes as there is no linear
> mapping for this range (and doing cache maintenance on it would
> make little sense anyway).
>
> Checking for the validity of the page before performing the CMO
> addresses the problem.
>
> Reported-by: Krishna Reddy <vdumpa@nvidia.com>
> Reported-by: Sumit Gupta <sumitg@nvidia.com>
> Tested-by: Sumit Gupta <sumitg@nvidia.com>
> Signed-off-by: Marc Zyngier <maz@kernel.org>
> Link: https://lore.kernel.org/r/5a8825bc-286e-b316-515f-3bd3c9c70a80@nvidia.com
> ---
>  arch/arm64/kvm/mmu.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/arch/arm64/kvm/mmu.c b/arch/arm64/kvm/mmu.c
> index cd4d51ae3d4a..564a0f7fcd05 100644
> --- a/arch/arm64/kvm/mmu.c
> +++ b/arch/arm64/kvm/mmu.c
> @@ -1236,7 +1236,8 @@ int kvm_set_spte_hva(struct kvm *kvm, unsigned long hva, pte_t pte)
>  	 * We've moved a page around, probably through CoW, so let's treat it
>  	 * just like a translation fault and clean the cache to the PoC.
>  	 */
> -	clean_dcache_guest_page(pfn, PAGE_SIZE);
> +	if (!kvm_is_device_pfn(pfn))
> +		clean_dcache_guest_page(pfn, PAGE_SIZE);
>  	handle_hva_to_gpa(kvm, hva, end, &kvm_set_spte_handler, &pfn);
>  	return 0;
>  }


* Re: [PATCH] KVM: arm64: Skip CMOs when updating a PTE pointing to non-memory
  2021-04-27 14:52 ` Alexandru Elisei
@ 2021-04-27 15:23   ` Jean-Philippe Brucker
  2021-04-29 10:32     ` Alexandru Elisei
  0 siblings, 1 reply; 5+ messages in thread
From: Jean-Philippe Brucker @ 2021-04-27 15:23 UTC (permalink / raw)
  To: Alexandru Elisei
  Cc: Marc Zyngier, linux-arm-kernel, kvmarm, kvm, James Morse,
	Suzuki K Poulose, kernel-team, Krishna Reddy, Sumit Gupta

On Tue, Apr 27, 2021 at 03:52:46PM +0100, Alexandru Elisei wrote:
> The comment [1] suggested that the panic is triggered during page aging.

I think only with an out-of-tree patch applied
https://jpbrucker.net/git/linux/commit/?h=sva/2021-03-01&id=d32d8baaf293aaefef8a1c9b8a4508ab2ec46c61
which probably is not going upstream.

Thanks,
Jean

> vfio_pci_mmap() sets VM_PFNMAP for the VMA, and I see in the Documentation that
> pages with VM_PFNMAP are added to the unevictable LRU list; doesn't that mean they
> are not subject to page aging? I feel like there's something I'm missing.
> 
> [1]
> https://lore.kernel.org/kvm/BY5PR12MB37642B9AC7E5D907F5A664F6B3459@BY5PR12MB3764.namprd12.prod.outlook.com/
> 
> Thanks,
> 
> Alex


* Re: [PATCH] KVM: arm64: Skip CMOs when updating a PTE pointing to non-memory
  2021-04-27 15:23   ` Jean-Philippe Brucker
@ 2021-04-29 10:32     ` Alexandru Elisei
  0 siblings, 0 replies; 5+ messages in thread
From: Alexandru Elisei @ 2021-04-29 10:32 UTC (permalink / raw)
  To: Jean-Philippe Brucker
  Cc: Marc Zyngier, linux-arm-kernel, kvmarm, kvm, James Morse,
	Suzuki K Poulose, kernel-team, Krishna Reddy, Sumit Gupta

Hi Jean,

On 4/27/21 4:23 PM, Jean-Philippe Brucker wrote:
> On Tue, Apr 27, 2021 at 03:52:46PM +0100, Alexandru Elisei wrote:
>> The comment [1] suggested that the panic is triggered during page aging.
> I think only with an out-of-tree patch applied
> https://jpbrucker.net/git/linux/commit/?h=sva/2021-03-01&id=d32d8baaf293aaefef8a1c9b8a4508ab2ec46c61
> which probably is not going upstream.

Thanks for that, that explains why I wasn't able to trigger the notification.

I did a grep for all the places where mmu_notifier_change_pte() and
set_pte_at_notify() are used in the kernel, and from what I can tell they are only
called for a new pte that has a struct page. From my investigation, the notifiers
are called from KSM (which deals with physical memory), swap migration (so still
pages in memory) and on copy-on-write.
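
For reference, the path from the notifier into the arm64 code looks roughly
like this (a simplified sketch of kvm_mmu_notifier_change_pte() in
virt/kvm/kvm_main.c around v5.12; locking and sequence-count handling elided):

	static void kvm_mmu_notifier_change_pte(struct mmu_notifier *mn,
						struct mm_struct *mm,
						unsigned long address,
						pte_t pte)
	{
		struct kvm *kvm = mmu_notifier_to_kvm(mn);

		/* ... srcu and mmu_lock taken, mmu_notifier_seq bumped ... */

		/* Arch callback: on arm64 this is the kvm_set_spte_hva() patched above */
		if (kvm_set_spte_hva(kvm, address, pte))
			kvm_flush_remote_tlbs(kvm);

		/* ... locks dropped ... */
	}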

On Linux v5.12, I tried to trigger the copy-on-write notification by forking
kvmtool right after the BAR region is mapped and then reading from the userspace
BAR address, but the new pte (for which the notifier is called) is valid.

I also looked at what x86 does, but I couldn't find where cache maintenance is
performed (wouldn't surprise me if it's not necessary at all).

So I guess my question is what kind of pfns the MMU notifiers for the secondary
MMUs are required to handle. If the requirement is that they should handle both
device and struct page backed pfns, then the patch looks correct to me
(user_mem_abort() also uses kvm_is_device_pfn() to decide if dcache maintenance is
needed).
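
For comparison, the corresponding decision in user_mem_abort() is roughly as
below (a sketch based on arch/arm64/kvm/mmu.c around v5.12, not verbatim):

	if (kvm_is_device_pfn(pfn))
		device = true;		/* no struct page, map as device memory */

	/* ... stage-2 mapping is built ... */

	if (fault_status != FSC_PERM && !device)
		clean_dcache_guest_page(pfn, vma_pagesize);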

Thanks,

Alex

> Thanks,
> Jean
>
>> vfio_pci_mmap() sets VM_PFNMAP for the VMA, and I see in the Documentation that
>> pages with VM_PFNMAP are added to the unevictable LRU list; doesn't that mean they
>> are not subject to page aging? I feel like there's something I'm missing.
>>
>> [1]
>> https://lore.kernel.org/kvm/BY5PR12MB37642B9AC7E5D907F5A664F6B3459@BY5PR12MB3764.namprd12.prod.outlook.com/
>>
>> Thanks,
>>
>> Alex

