All of lore.kernel.org
 help / color / mirror / Atom feed
From: David Hildenbrand <david@redhat.com>
To: Catalin Marinas <catalin.marinas@arm.com>,
	Steven Price <steven.price@arm.com>
Cc: Mark Rutland <mark.rutland@arm.com>,
	Peter Maydell <peter.maydell@linaro.org>,
	"Dr. David Alan Gilbert" <dgilbert@redhat.com>,
	Andrew Jones <drjones@redhat.com>, Haibo Xu <Haibo.Xu@arm.com>,
	Suzuki K Poulose <suzuki.poulose@arm.com>,
	qemu-devel@nongnu.org, Marc Zyngier <maz@kernel.org>,
	Juan Quintela <quintela@redhat.com>,
	Richard Henderson <richard.henderson@linaro.org>,
	linux-kernel@vger.kernel.org, Dave Martin <Dave.Martin@arm.com>,
	James Morse <james.morse@arm.com>,
	linux-arm-kernel@lists.infradead.org,
	Thomas Gleixner <tglx@linutronix.de>,
	Will Deacon <will@kernel.org>,
	kvmarm@lists.cs.columbia.edu,
	Julien Thierry <julien.thierry.kdev@gmail.com>
Subject: Re: [PATCH v10 2/6] arm64: kvm: Introduce MTE VM feature
Date: Wed, 31 Mar 2021 09:34:44 +0200	[thread overview]
Message-ID: <8977120b-841d-4882-2472-6e403bc9c797@redhat.com> (raw)
In-Reply-To: <20210330103013.GD18075@arm.com>

On 30.03.21 12:30, Catalin Marinas wrote:
> On Mon, Mar 29, 2021 at 05:06:51PM +0100, Steven Price wrote:
>> On 28/03/2021 13:21, Catalin Marinas wrote:
>>> On Sat, Mar 27, 2021 at 03:23:24PM +0000, Catalin Marinas wrote:
>>>> On Fri, Mar 12, 2021 at 03:18:58PM +0000, Steven Price wrote:
>>>>> diff --git a/arch/arm64/kvm/mmu.c b/arch/arm64/kvm/mmu.c
>>>>> index 77cb2d28f2a4..b31b7a821f90 100644
>>>>> --- a/arch/arm64/kvm/mmu.c
>>>>> +++ b/arch/arm64/kvm/mmu.c
>>>>> @@ -879,6 +879,22 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
>>>>>    	if (vma_pagesize == PAGE_SIZE && !force_pte)
>>>>>    		vma_pagesize = transparent_hugepage_adjust(memslot, hva,
>>>>>    							   &pfn, &fault_ipa);
>>>>> +
>>>>> +	if (fault_status != FSC_PERM && kvm_has_mte(kvm) && pfn_valid(pfn)) {
>>>>> +		/*
>>>>> +		 * VM will be able to see the page's tags, so we must ensure
>>>>> +		 * they have been initialised. if PG_mte_tagged is set, tags
>>>>> +		 * have already been initialised.
>>>>> +		 */
>>>>> +		struct page *page = pfn_to_page(pfn);
>>>>> +		unsigned long i, nr_pages = vma_pagesize >> PAGE_SHIFT;
>>>>> +
>>>>> +		for (i = 0; i < nr_pages; i++, page++) {
>>>>> +			if (!test_and_set_bit(PG_mte_tagged, &page->flags))
>>>>> +				mte_clear_page_tags(page_address(page));
>>>>> +		}
>>>>> +	}
>>>>
>>>> This pfn_valid() check may be problematic. Following commit eeb0753ba27b
>>>> ("arm64/mm: Fix pfn_valid() for ZONE_DEVICE based memory"), it returns
>>>> true for ZONE_DEVICE memory but such memory is allowed not to support
>>>> MTE.
>>>
>>> Some more thinking, this should be safe as any ZONE_DEVICE would be
>>> mapped as untagged memory in the kernel linear map. It could be slightly
>>> inefficient if it unnecessarily tries to clear tags in ZONE_DEVICE,
>>> untagged memory. Another overhead is pfn_valid() which will likely end
>>> up calling memblock_is_map_memory().
>>>
>>> However, the bigger issue is that Stage 2 cannot disable tagging for
>>> Stage 1 unless the memory is Non-cacheable or Device at S2. Is there a
>>> way to detect what gets mapped in the guest as Normal Cacheable memory
>>> and make sure it's only early memory or hotplug but no ZONE_DEVICE (or
>>> something else like on-chip memory)?  If we can't guarantee that all
>>> Cacheable memory given to a guest supports tags, we should disable the
>>> feature altogether.
>>
>> In stage 2 I believe we only have two types of mapping - 'normal' or
>> DEVICE_nGnRE (see stage2_map_set_prot_attr()). Filtering out the latter is a
>> case of checking the 'device' variable, and makes sense to avoid the
>> overhead you describe.
>>
>> This should also guarantee that all stage-2 cacheable memory supports tags,
>> as kvm_is_device_pfn() is simply !pfn_valid(), and pfn_valid() should only
>> be true for memory that Linux considers "normal".

If you think "normal" == "normal System RAM", that's wrong; see below.

> 
> That's the problem. With Anshuman's commit I mentioned above,
> pfn_valid() returns true for ZONE_DEVICE mappings (e.g. persistent
> memory, not talking about some I/O mapping that requires Device_nGnRE).
> So kvm_is_device_pfn() is false for such memory and it may be mapped as
> Normal but it is not guaranteed to support tagging.

pfn_valid() means "there is a struct page"; if you do pfn_to_page() and 
touch the page, you won't fault. So Anshuman's commit is correct.

pfn_to_online_page() means, "there is a struct page and it's system RAM 
that's in use; the memmap has a sane content"

> 
> For user MTE, we get away with this as the MAP_ANONYMOUS requirement
> would filter it out while arch_add_memory() will ensure it's mapped as
> untagged in the linear map. See another recent fix for hotplugged
> memory: d15dfd31384b ("arm64: mte: Map hotplugged memory as Normal
> Tagged"). We needed to ensure that ZONE_DEVICE doesn't end up as tagged,
> only hoplugged memory. Both handled via arch_add_memory() in the arch
> code with ZONE_DEVICE starting at devm_memremap_pages().
> 
>>>> I now wonder if we can get a MAP_ANONYMOUS mapping of ZONE_DEVICE pfn
>>>> even without virtualisation.
>>>
>>> I haven't checked all the code paths but I don't think we can get a
>>> MAP_ANONYMOUS mapping of ZONE_DEVICE memory as we normally need a file
>>> descriptor.
>>
>> I certainly hope this is the case - it's the weird corner cases of device
>> drivers that worry me. E.g. I know i915 has a "hidden" mmap behind an ioctl
>> (see i915_gem_mmap_ioctl(), although this case is fine - it's MAP_SHARED).
>> Mali's kbase did something similar in the past.
> 
> I think this should be fine since it's not a MAP_ANONYMOUS (we do allow
> MAP_SHARED to be tagged).
> 


-- 
Thanks,

David / dhildenb


WARNING: multiple messages have this Message-ID (diff)
From: David Hildenbrand <david@redhat.com>
To: Catalin Marinas <catalin.marinas@arm.com>,
	Steven Price <steven.price@arm.com>
Cc: Mark Rutland <mark.rutland@arm.com>,
	Peter Maydell <peter.maydell@linaro.org>,
	Andrew Jones <drjones@redhat.com>, Haibo Xu <Haibo.Xu@arm.com>,
	Suzuki K Poulose <suzuki.poulose@arm.com>,
	Marc Zyngier <maz@kernel.org>,
	Juan Quintela <quintela@redhat.com>,
	Richard Henderson <richard.henderson@linaro.org>,
	"Dr. David Alan Gilbert" <dgilbert@redhat.com>,
	qemu-devel@nongnu.org, James Morse <james.morse@arm.com>,
	linux-arm-kernel@lists.infradead.org,
	kvmarm@lists.cs.columbia.edu,
	Thomas Gleixner <tglx@linutronix.de>,
	Julien Thierry <julien.thierry.kdev@gmail.com>,
	Will Deacon <will@kernel.org>, Dave Martin <Dave.Martin@arm.com>,
	linux-kernel@vger.kernel.org
Subject: Re: [PATCH v10 2/6] arm64: kvm: Introduce MTE VM feature
Date: Wed, 31 Mar 2021 09:34:44 +0200	[thread overview]
Message-ID: <8977120b-841d-4882-2472-6e403bc9c797@redhat.com> (raw)
In-Reply-To: <20210330103013.GD18075@arm.com>

On 30.03.21 12:30, Catalin Marinas wrote:
> On Mon, Mar 29, 2021 at 05:06:51PM +0100, Steven Price wrote:
>> On 28/03/2021 13:21, Catalin Marinas wrote:
>>> On Sat, Mar 27, 2021 at 03:23:24PM +0000, Catalin Marinas wrote:
>>>> On Fri, Mar 12, 2021 at 03:18:58PM +0000, Steven Price wrote:
>>>>> diff --git a/arch/arm64/kvm/mmu.c b/arch/arm64/kvm/mmu.c
>>>>> index 77cb2d28f2a4..b31b7a821f90 100644
>>>>> --- a/arch/arm64/kvm/mmu.c
>>>>> +++ b/arch/arm64/kvm/mmu.c
>>>>> @@ -879,6 +879,22 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
>>>>>    	if (vma_pagesize == PAGE_SIZE && !force_pte)
>>>>>    		vma_pagesize = transparent_hugepage_adjust(memslot, hva,
>>>>>    							   &pfn, &fault_ipa);
>>>>> +
>>>>> +	if (fault_status != FSC_PERM && kvm_has_mte(kvm) && pfn_valid(pfn)) {
>>>>> +		/*
>>>>> +		 * VM will be able to see the page's tags, so we must ensure
>>>>> +		 * they have been initialised. if PG_mte_tagged is set, tags
>>>>> +		 * have already been initialised.
>>>>> +		 */
>>>>> +		struct page *page = pfn_to_page(pfn);
>>>>> +		unsigned long i, nr_pages = vma_pagesize >> PAGE_SHIFT;
>>>>> +
>>>>> +		for (i = 0; i < nr_pages; i++, page++) {
>>>>> +			if (!test_and_set_bit(PG_mte_tagged, &page->flags))
>>>>> +				mte_clear_page_tags(page_address(page));
>>>>> +		}
>>>>> +	}
>>>>
>>>> This pfn_valid() check may be problematic. Following commit eeb0753ba27b
>>>> ("arm64/mm: Fix pfn_valid() for ZONE_DEVICE based memory"), it returns
>>>> true for ZONE_DEVICE memory but such memory is allowed not to support
>>>> MTE.
>>>
>>> Some more thinking, this should be safe as any ZONE_DEVICE would be
>>> mapped as untagged memory in the kernel linear map. It could be slightly
>>> inefficient if it unnecessarily tries to clear tags in ZONE_DEVICE,
>>> untagged memory. Another overhead is pfn_valid() which will likely end
>>> up calling memblock_is_map_memory().
>>>
>>> However, the bigger issue is that Stage 2 cannot disable tagging for
>>> Stage 1 unless the memory is Non-cacheable or Device at S2. Is there a
>>> way to detect what gets mapped in the guest as Normal Cacheable memory
>>> and make sure it's only early memory or hotplug but no ZONE_DEVICE (or
>>> something else like on-chip memory)?  If we can't guarantee that all
>>> Cacheable memory given to a guest supports tags, we should disable the
>>> feature altogether.
>>
>> In stage 2 I believe we only have two types of mapping - 'normal' or
>> DEVICE_nGnRE (see stage2_map_set_prot_attr()). Filtering out the latter is a
>> case of checking the 'device' variable, and makes sense to avoid the
>> overhead you describe.
>>
>> This should also guarantee that all stage-2 cacheable memory supports tags,
>> as kvm_is_device_pfn() is simply !pfn_valid(), and pfn_valid() should only
>> be true for memory that Linux considers "normal".

If you think "normal" == "normal System RAM", that's wrong; see below.

> 
> That's the problem. With Anshuman's commit I mentioned above,
> pfn_valid() returns true for ZONE_DEVICE mappings (e.g. persistent
> memory, not talking about some I/O mapping that requires Device_nGnRE).
> So kvm_is_device_pfn() is false for such memory and it may be mapped as
> Normal but it is not guaranteed to support tagging.

pfn_valid() means "there is a struct page"; if you do pfn_to_page() and 
touch the page, you won't fault. So Anshuman's commit is correct.

pfn_to_online_page() means, "there is a struct page and it's system RAM 
that's in use; the memmap has a sane content"

> 
> For user MTE, we get away with this as the MAP_ANONYMOUS requirement
> would filter it out while arch_add_memory() will ensure it's mapped as
> untagged in the linear map. See another recent fix for hotplugged
> memory: d15dfd31384b ("arm64: mte: Map hotplugged memory as Normal
> Tagged"). We needed to ensure that ZONE_DEVICE doesn't end up as tagged,
> only hoplugged memory. Both handled via arch_add_memory() in the arch
> code with ZONE_DEVICE starting at devm_memremap_pages().
> 
>>>> I now wonder if we can get a MAP_ANONYMOUS mapping of ZONE_DEVICE pfn
>>>> even without virtualisation.
>>>
>>> I haven't checked all the code paths but I don't think we can get a
>>> MAP_ANONYMOUS mapping of ZONE_DEVICE memory as we normally need a file
>>> descriptor.
>>
>> I certainly hope this is the case - it's the weird corner cases of device
>> drivers that worry me. E.g. I know i915 has a "hidden" mmap behind an ioctl
>> (see i915_gem_mmap_ioctl(), although this case is fine - it's MAP_SHARED).
>> Mali's kbase did something similar in the past.
> 
> I think this should be fine since it's not a MAP_ANONYMOUS (we do allow
> MAP_SHARED to be tagged).
> 


-- 
Thanks,

David / dhildenb



WARNING: multiple messages have this Message-ID (diff)
From: David Hildenbrand <david@redhat.com>
To: Catalin Marinas <catalin.marinas@arm.com>,
	Steven Price <steven.price@arm.com>
Cc: Marc Zyngier <maz@kernel.org>,
	Juan Quintela <quintela@redhat.com>,
	Richard Henderson <richard.henderson@linaro.org>,
	"Dr. David Alan Gilbert" <dgilbert@redhat.com>,
	qemu-devel@nongnu.org, linux-arm-kernel@lists.infradead.org,
	kvmarm@lists.cs.columbia.edu,
	Thomas Gleixner <tglx@linutronix.de>,
	Will Deacon <will@kernel.org>, Dave Martin <Dave.Martin@arm.com>,
	linux-kernel@vger.kernel.org
Subject: Re: [PATCH v10 2/6] arm64: kvm: Introduce MTE VM feature
Date: Wed, 31 Mar 2021 09:34:44 +0200	[thread overview]
Message-ID: <8977120b-841d-4882-2472-6e403bc9c797@redhat.com> (raw)
In-Reply-To: <20210330103013.GD18075@arm.com>

On 30.03.21 12:30, Catalin Marinas wrote:
> On Mon, Mar 29, 2021 at 05:06:51PM +0100, Steven Price wrote:
>> On 28/03/2021 13:21, Catalin Marinas wrote:
>>> On Sat, Mar 27, 2021 at 03:23:24PM +0000, Catalin Marinas wrote:
>>>> On Fri, Mar 12, 2021 at 03:18:58PM +0000, Steven Price wrote:
>>>>> diff --git a/arch/arm64/kvm/mmu.c b/arch/arm64/kvm/mmu.c
>>>>> index 77cb2d28f2a4..b31b7a821f90 100644
>>>>> --- a/arch/arm64/kvm/mmu.c
>>>>> +++ b/arch/arm64/kvm/mmu.c
>>>>> @@ -879,6 +879,22 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
>>>>>    	if (vma_pagesize == PAGE_SIZE && !force_pte)
>>>>>    		vma_pagesize = transparent_hugepage_adjust(memslot, hva,
>>>>>    							   &pfn, &fault_ipa);
>>>>> +
>>>>> +	if (fault_status != FSC_PERM && kvm_has_mte(kvm) && pfn_valid(pfn)) {
>>>>> +		/*
>>>>> +		 * VM will be able to see the page's tags, so we must ensure
>>>>> +		 * they have been initialised. if PG_mte_tagged is set, tags
>>>>> +		 * have already been initialised.
>>>>> +		 */
>>>>> +		struct page *page = pfn_to_page(pfn);
>>>>> +		unsigned long i, nr_pages = vma_pagesize >> PAGE_SHIFT;
>>>>> +
>>>>> +		for (i = 0; i < nr_pages; i++, page++) {
>>>>> +			if (!test_and_set_bit(PG_mte_tagged, &page->flags))
>>>>> +				mte_clear_page_tags(page_address(page));
>>>>> +		}
>>>>> +	}
>>>>
>>>> This pfn_valid() check may be problematic. Following commit eeb0753ba27b
>>>> ("arm64/mm: Fix pfn_valid() for ZONE_DEVICE based memory"), it returns
>>>> true for ZONE_DEVICE memory but such memory is allowed not to support
>>>> MTE.
>>>
>>> Some more thinking, this should be safe as any ZONE_DEVICE would be
>>> mapped as untagged memory in the kernel linear map. It could be slightly
>>> inefficient if it unnecessarily tries to clear tags in ZONE_DEVICE,
>>> untagged memory. Another overhead is pfn_valid() which will likely end
>>> up calling memblock_is_map_memory().
>>>
>>> However, the bigger issue is that Stage 2 cannot disable tagging for
>>> Stage 1 unless the memory is Non-cacheable or Device at S2. Is there a
>>> way to detect what gets mapped in the guest as Normal Cacheable memory
>>> and make sure it's only early memory or hotplug but no ZONE_DEVICE (or
>>> something else like on-chip memory)?  If we can't guarantee that all
>>> Cacheable memory given to a guest supports tags, we should disable the
>>> feature altogether.
>>
>> In stage 2 I believe we only have two types of mapping - 'normal' or
>> DEVICE_nGnRE (see stage2_map_set_prot_attr()). Filtering out the latter is a
>> case of checking the 'device' variable, and makes sense to avoid the
>> overhead you describe.
>>
>> This should also guarantee that all stage-2 cacheable memory supports tags,
>> as kvm_is_device_pfn() is simply !pfn_valid(), and pfn_valid() should only
>> be true for memory that Linux considers "normal".

If you think "normal" == "normal System RAM", that's wrong; see below.

> 
> That's the problem. With Anshuman's commit I mentioned above,
> pfn_valid() returns true for ZONE_DEVICE mappings (e.g. persistent
> memory, not talking about some I/O mapping that requires Device_nGnRE).
> So kvm_is_device_pfn() is false for such memory and it may be mapped as
> Normal but it is not guaranteed to support tagging.

pfn_valid() means "there is a struct page"; if you do pfn_to_page() and 
touch the page, you won't fault. So Anshuman's commit is correct.

pfn_to_online_page() means, "there is a struct page and it's system RAM 
that's in use; the memmap has a sane content"

> 
> For user MTE, we get away with this as the MAP_ANONYMOUS requirement
> would filter it out while arch_add_memory() will ensure it's mapped as
> untagged in the linear map. See another recent fix for hotplugged
> memory: d15dfd31384b ("arm64: mte: Map hotplugged memory as Normal
> Tagged"). We needed to ensure that ZONE_DEVICE doesn't end up as tagged,
> only hoplugged memory. Both handled via arch_add_memory() in the arch
> code with ZONE_DEVICE starting at devm_memremap_pages().
> 
>>>> I now wonder if we can get a MAP_ANONYMOUS mapping of ZONE_DEVICE pfn
>>>> even without virtualisation.
>>>
>>> I haven't checked all the code paths but I don't think we can get a
>>> MAP_ANONYMOUS mapping of ZONE_DEVICE memory as we normally need a file
>>> descriptor.
>>
>> I certainly hope this is the case - it's the weird corner cases of device
>> drivers that worry me. E.g. I know i915 has a "hidden" mmap behind an ioctl
>> (see i915_gem_mmap_ioctl(), although this case is fine - it's MAP_SHARED).
>> Mali's kbase did something similar in the past.
> 
> I think this should be fine since it's not a MAP_ANONYMOUS (we do allow
> MAP_SHARED to be tagged).
> 


-- 
Thanks,

David / dhildenb

_______________________________________________
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm

WARNING: multiple messages have this Message-ID (diff)
From: David Hildenbrand <david@redhat.com>
To: Catalin Marinas <catalin.marinas@arm.com>,
	Steven Price <steven.price@arm.com>
Cc: Mark Rutland <mark.rutland@arm.com>,
	Peter Maydell <peter.maydell@linaro.org>,
	"Dr. David Alan Gilbert" <dgilbert@redhat.com>,
	Andrew Jones <drjones@redhat.com>, Haibo Xu <Haibo.Xu@arm.com>,
	Suzuki K Poulose <suzuki.poulose@arm.com>,
	qemu-devel@nongnu.org, Marc Zyngier <maz@kernel.org>,
	Juan Quintela <quintela@redhat.com>,
	Richard Henderson <richard.henderson@linaro.org>,
	linux-kernel@vger.kernel.org, Dave Martin <Dave.Martin@arm.com>,
	James Morse <james.morse@arm.com>,
	linux-arm-kernel@lists.infradead.org,
	Thomas Gleixner <tglx@linutronix.de>,
	Will Deacon <will@kernel.org>,
	kvmarm@lists.cs.columbia.edu,
	Julien Thierry <julien.thierry.kdev@gmail.com>
Subject: Re: [PATCH v10 2/6] arm64: kvm: Introduce MTE VM feature
Date: Wed, 31 Mar 2021 09:34:44 +0200	[thread overview]
Message-ID: <8977120b-841d-4882-2472-6e403bc9c797@redhat.com> (raw)
In-Reply-To: <20210330103013.GD18075@arm.com>

On 30.03.21 12:30, Catalin Marinas wrote:
> On Mon, Mar 29, 2021 at 05:06:51PM +0100, Steven Price wrote:
>> On 28/03/2021 13:21, Catalin Marinas wrote:
>>> On Sat, Mar 27, 2021 at 03:23:24PM +0000, Catalin Marinas wrote:
>>>> On Fri, Mar 12, 2021 at 03:18:58PM +0000, Steven Price wrote:
>>>>> diff --git a/arch/arm64/kvm/mmu.c b/arch/arm64/kvm/mmu.c
>>>>> index 77cb2d28f2a4..b31b7a821f90 100644
>>>>> --- a/arch/arm64/kvm/mmu.c
>>>>> +++ b/arch/arm64/kvm/mmu.c
>>>>> @@ -879,6 +879,22 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
>>>>>    	if (vma_pagesize == PAGE_SIZE && !force_pte)
>>>>>    		vma_pagesize = transparent_hugepage_adjust(memslot, hva,
>>>>>    							   &pfn, &fault_ipa);
>>>>> +
>>>>> +	if (fault_status != FSC_PERM && kvm_has_mte(kvm) && pfn_valid(pfn)) {
>>>>> +		/*
>>>>> +		 * VM will be able to see the page's tags, so we must ensure
>>>>> +		 * they have been initialised. if PG_mte_tagged is set, tags
>>>>> +		 * have already been initialised.
>>>>> +		 */
>>>>> +		struct page *page = pfn_to_page(pfn);
>>>>> +		unsigned long i, nr_pages = vma_pagesize >> PAGE_SHIFT;
>>>>> +
>>>>> +		for (i = 0; i < nr_pages; i++, page++) {
>>>>> +			if (!test_and_set_bit(PG_mte_tagged, &page->flags))
>>>>> +				mte_clear_page_tags(page_address(page));
>>>>> +		}
>>>>> +	}
>>>>
>>>> This pfn_valid() check may be problematic. Following commit eeb0753ba27b
>>>> ("arm64/mm: Fix pfn_valid() for ZONE_DEVICE based memory"), it returns
>>>> true for ZONE_DEVICE memory but such memory is allowed not to support
>>>> MTE.
>>>
>>> Some more thinking, this should be safe as any ZONE_DEVICE would be
>>> mapped as untagged memory in the kernel linear map. It could be slightly
>>> inefficient if it unnecessarily tries to clear tags in ZONE_DEVICE,
>>> untagged memory. Another overhead is pfn_valid() which will likely end
>>> up calling memblock_is_map_memory().
>>>
>>> However, the bigger issue is that Stage 2 cannot disable tagging for
>>> Stage 1 unless the memory is Non-cacheable or Device at S2. Is there a
>>> way to detect what gets mapped in the guest as Normal Cacheable memory
>>> and make sure it's only early memory or hotplug but no ZONE_DEVICE (or
>>> something else like on-chip memory)?  If we can't guarantee that all
>>> Cacheable memory given to a guest supports tags, we should disable the
>>> feature altogether.
>>
>> In stage 2 I believe we only have two types of mapping - 'normal' or
>> DEVICE_nGnRE (see stage2_map_set_prot_attr()). Filtering out the latter is a
>> case of checking the 'device' variable, and makes sense to avoid the
>> overhead you describe.
>>
>> This should also guarantee that all stage-2 cacheable memory supports tags,
>> as kvm_is_device_pfn() is simply !pfn_valid(), and pfn_valid() should only
>> be true for memory that Linux considers "normal".

If you think "normal" == "normal System RAM", that's wrong; see below.

> 
> That's the problem. With Anshuman's commit I mentioned above,
> pfn_valid() returns true for ZONE_DEVICE mappings (e.g. persistent
> memory, not talking about some I/O mapping that requires Device_nGnRE).
> So kvm_is_device_pfn() is false for such memory and it may be mapped as
> Normal but it is not guaranteed to support tagging.

pfn_valid() means "there is a struct page"; if you do pfn_to_page() and 
touch the page, you won't fault. So Anshuman's commit is correct.

pfn_to_online_page() means, "there is a struct page and it's system RAM 
that's in use; the memmap has a sane content"

> 
> For user MTE, we get away with this as the MAP_ANONYMOUS requirement
> would filter it out while arch_add_memory() will ensure it's mapped as
> untagged in the linear map. See another recent fix for hotplugged
> memory: d15dfd31384b ("arm64: mte: Map hotplugged memory as Normal
> Tagged"). We needed to ensure that ZONE_DEVICE doesn't end up as tagged,
> only hoplugged memory. Both handled via arch_add_memory() in the arch
> code with ZONE_DEVICE starting at devm_memremap_pages().
> 
>>>> I now wonder if we can get a MAP_ANONYMOUS mapping of ZONE_DEVICE pfn
>>>> even without virtualisation.
>>>
>>> I haven't checked all the code paths but I don't think we can get a
>>> MAP_ANONYMOUS mapping of ZONE_DEVICE memory as we normally need a file
>>> descriptor.
>>
>> I certainly hope this is the case - it's the weird corner cases of device
>> drivers that worry me. E.g. I know i915 has a "hidden" mmap behind an ioctl
>> (see i915_gem_mmap_ioctl(), although this case is fine - it's MAP_SHARED).
>> Mali's kbase did something similar in the past.
> 
> I think this should be fine since it's not a MAP_ANONYMOUS (we do allow
> MAP_SHARED to be tagged).
> 


-- 
Thanks,

David / dhildenb


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

  reply	other threads:[~2021-03-31  7:35 UTC|newest]

Thread overview: 112+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-03-12 15:18 [PATCH v10 0/6] MTE support for KVM guest Steven Price
2021-03-12 15:18 ` Steven Price
2021-03-12 15:18 ` Steven Price
2021-03-12 15:18 ` Steven Price
2021-03-12 15:18 ` [PATCH v10 1/6] arm64: mte: Sync tags for pages where PTE is untagged Steven Price
2021-03-12 15:18   ` Steven Price
2021-03-12 15:18   ` Steven Price
2021-03-12 15:18   ` Steven Price
2021-03-26 18:56   ` Catalin Marinas
2021-03-26 18:56     ` Catalin Marinas
2021-03-26 18:56     ` Catalin Marinas
2021-03-26 18:56     ` Catalin Marinas
2021-03-29 15:55     ` Steven Price
2021-03-29 15:55       ` Steven Price
2021-03-29 15:55       ` Steven Price
2021-03-29 15:55       ` Steven Price
2021-03-30 10:13       ` Catalin Marinas
2021-03-30 10:13         ` Catalin Marinas
2021-03-30 10:13         ` Catalin Marinas
2021-03-30 10:13         ` Catalin Marinas
2021-03-31 10:09         ` Steven Price
2021-03-31 10:09           ` Steven Price
2021-03-31 10:09           ` Steven Price
2021-03-31 10:09           ` Steven Price
2021-03-12 15:18 ` [PATCH v10 2/6] arm64: kvm: Introduce MTE VM feature Steven Price
2021-03-12 15:18   ` Steven Price
2021-03-12 15:18   ` Steven Price
2021-03-12 15:18   ` Steven Price
2021-03-27 15:23   ` Catalin Marinas
2021-03-27 15:23     ` Catalin Marinas
2021-03-27 15:23     ` Catalin Marinas
2021-03-27 15:23     ` Catalin Marinas
2021-03-28 12:21     ` Catalin Marinas
2021-03-28 12:21       ` Catalin Marinas
2021-03-28 12:21       ` Catalin Marinas
2021-03-28 12:21       ` Catalin Marinas
2021-03-29 16:06       ` Steven Price
2021-03-29 16:06         ` Steven Price
2021-03-29 16:06         ` Steven Price
2021-03-29 16:06         ` Steven Price
2021-03-30 10:30         ` Catalin Marinas
2021-03-30 10:30           ` Catalin Marinas
2021-03-30 10:30           ` Catalin Marinas
2021-03-30 10:30           ` Catalin Marinas
2021-03-31  7:34           ` David Hildenbrand [this message]
2021-03-31  7:34             ` David Hildenbrand
2021-03-31  7:34             ` David Hildenbrand
2021-03-31  7:34             ` David Hildenbrand
2021-03-31  9:21             ` Catalin Marinas
2021-03-31  9:21               ` Catalin Marinas
2021-03-31  9:21               ` Catalin Marinas
2021-03-31  9:21               ` Catalin Marinas
2021-03-31  9:32               ` David Hildenbrand
2021-03-31  9:32                 ` David Hildenbrand
2021-03-31  9:32                 ` David Hildenbrand
2021-03-31  9:32                 ` David Hildenbrand
2021-03-31 10:41                 ` Steven Price
2021-03-31 10:41                   ` Steven Price
2021-03-31 10:41                   ` Steven Price
2021-03-31 10:41                   ` Steven Price
2021-03-31 14:14                   ` David Hildenbrand
2021-03-31 14:14                     ` David Hildenbrand
2021-03-31 14:14                     ` David Hildenbrand
2021-03-31 14:14                     ` David Hildenbrand
2021-03-31 18:43                   ` Catalin Marinas
2021-03-31 18:43                     ` Catalin Marinas
2021-03-31 18:43                     ` Catalin Marinas
2021-03-31 18:43                     ` Catalin Marinas
2021-04-07 10:20                     ` Steven Price
2021-04-07 10:20                       ` Steven Price
2021-04-07 10:20                       ` Steven Price
2021-04-07 10:20                       ` Steven Price
2021-04-07 15:14                       ` Catalin Marinas
2021-04-07 15:14                         ` Catalin Marinas
2021-04-07 15:14                         ` Catalin Marinas
2021-04-07 15:14                         ` Catalin Marinas
2021-04-07 15:30                         ` David Hildenbrand
2021-04-07 15:30                           ` David Hildenbrand
2021-04-07 15:30                           ` David Hildenbrand
2021-04-07 15:30                           ` David Hildenbrand
2021-04-07 15:52                         ` Steven Price
2021-04-07 15:52                           ` Steven Price
2021-04-07 15:52                           ` Steven Price
2021-04-07 15:52                           ` Steven Price
2021-04-08 14:18                           ` Catalin Marinas
2021-04-08 14:18                             ` Catalin Marinas
2021-04-08 14:18                             ` Catalin Marinas
2021-04-08 14:18                             ` Catalin Marinas
2021-04-08 18:16                             ` David Hildenbrand
2021-04-08 18:16                               ` David Hildenbrand
2021-04-08 18:16                               ` David Hildenbrand
2021-04-08 18:16                               ` David Hildenbrand
2021-04-08 18:21                               ` Catalin Marinas
2021-04-08 18:21                                 ` Catalin Marinas
2021-04-08 18:21                                 ` Catalin Marinas
2021-04-08 18:21                                 ` Catalin Marinas
2021-03-12 15:18 ` [PATCH v10 3/6] arm64: kvm: Save/restore MTE registers Steven Price
2021-03-12 15:18   ` Steven Price
2021-03-12 15:18   ` Steven Price
2021-03-12 15:18   ` Steven Price
2021-03-12 15:19 ` [PATCH v10 4/6] arm64: kvm: Expose KVM_ARM_CAP_MTE Steven Price
2021-03-12 15:19   ` Steven Price
2021-03-12 15:19   ` Steven Price
2021-03-12 15:19   ` Steven Price
2021-03-12 15:19 ` [PATCH v10 5/6] KVM: arm64: ioctl to fetch/store tags in a guest Steven Price
2021-03-12 15:19   ` Steven Price
2021-03-12 15:19   ` Steven Price
2021-03-12 15:19   ` Steven Price
2021-03-12 15:19 ` [PATCH v10 6/6] KVM: arm64: Document MTE capability and ioctl Steven Price
2021-03-12 15:19   ` Steven Price
2021-03-12 15:19   ` Steven Price
2021-03-12 15:19   ` Steven Price

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=8977120b-841d-4882-2472-6e403bc9c797@redhat.com \
    --to=david@redhat.com \
    --cc=Dave.Martin@arm.com \
    --cc=Haibo.Xu@arm.com \
    --cc=catalin.marinas@arm.com \
    --cc=dgilbert@redhat.com \
    --cc=drjones@redhat.com \
    --cc=james.morse@arm.com \
    --cc=julien.thierry.kdev@gmail.com \
    --cc=kvmarm@lists.cs.columbia.edu \
    --cc=linux-arm-kernel@lists.infradead.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mark.rutland@arm.com \
    --cc=maz@kernel.org \
    --cc=peter.maydell@linaro.org \
    --cc=qemu-devel@nongnu.org \
    --cc=quintela@redhat.com \
    --cc=richard.henderson@linaro.org \
    --cc=steven.price@arm.com \
    --cc=suzuki.poulose@arm.com \
    --cc=tglx@linutronix.de \
    --cc=will@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.