linux-kernel.vger.kernel.org archive mirror
* [RFC PATCH] KVM: arm64: Force a PTE mapping when logging is enabled
@ 2019-03-02  3:35 Zenghui Yu
  2019-03-03 15:14 ` Zenghui Yu
  0 siblings, 1 reply; 8+ messages in thread
From: Zenghui Yu @ 2019-03-02  3:35 UTC (permalink / raw)
  To: christoffer.dall, marc.zyngier
  Cc: james.morse, julien.thierry, suzuki.poulose, wanghaibin.wang,
	kvmarm, linux-arm-kernel, linux-kernel, Zenghui Yu,
	Punit Agrawal

The idea behind this is: we don't want to keep track of huge pages when
logging_active is true, because doing so degrades performance.  We still
need to set vma_pagesize to PAGE_SIZE, so that we can make use of it to
force a PTE mapping.

Cc: Suzuki K Poulose <suzuki.poulose@arm.com>
Cc: Punit Agrawal <punit.agrawal@arm.com>
Signed-off-by: Zenghui Yu <yuzenghui@huawei.com>

---
After looking into https://patchwork.codeaurora.org/patch/647985/ , it
seems the "vma_pagesize = PAGE_SIZE" logic was not intended to be deleted.
As far as I can tell, we used to have "hugetlb" to force the PTE mapping,
but we now have "vma_pagesize" instead. We should set it properly for
performance reasons (e.g., in VM migration). Did I miss something important?
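
A rough illustration of why the mapping size matters while logging is
active. This is only a sketch under stated assumptions: 4K base pages and
2M PMD blocks, and the names EX_PAGE_SHIFT, EX_PAGE_SIZE, EX_PMD_SIZE and
ex_pages_covered() are made up for this note, not kernel symbols. It just
counts how many base pages a single stage2 mapping spans, which is
presumably the amount of memory that has to be treated as dirty when a
write hits that mapping.

```c
#include <assert.h>
#include <stddef.h>

/* Illustrative values only: 4K base pages, 2M PMD blocks (assumption). */
#define EX_PAGE_SHIFT 12
#define EX_PAGE_SIZE  ((size_t)1 << EX_PAGE_SHIFT)
#define EX_PMD_SIZE   ((size_t)1 << 21)

/* Number of base pages spanned by one mapping of map_size bytes. */
size_t ex_pages_covered(size_t map_size)
{
	/* A mapping is at least one page and a whole number of pages. */
	assert(map_size >= EX_PAGE_SIZE && map_size % EX_PAGE_SIZE == 0);
	return map_size >> EX_PAGE_SHIFT;
}
```

With these values a PTE mapping spans one page and a PMD block spans 512,
so tracking at block granularity could mean handling hundreds of pages per
write during migration instead of one.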

---
 virt/kvm/arm/mmu.c | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/virt/kvm/arm/mmu.c b/virt/kvm/arm/mmu.c
index 30251e2..7d41b16 100644
--- a/virt/kvm/arm/mmu.c
+++ b/virt/kvm/arm/mmu.c
@@ -1705,6 +1705,13 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
 	     (vma_pagesize == PUD_SIZE && kvm_stage2_has_pmd(kvm))) &&
 	    !force_pte) {
 		gfn = (fault_ipa & huge_page_mask(hstate_vma(vma))) >> PAGE_SHIFT;
+	} else {
+		/*
+		 * Fallback to PTE if it's not one of the stage2
+		 * supported hugepage sizes or the corresponding level
+		 * doesn't exist, or logging is enabled.
+		 */
+		vma_pagesize = PAGE_SIZE;
 	}
 	up_read(&current->mm->mmap_sem);
 
-- 
1.8.3.1




* Re: [RFC PATCH] KVM: arm64: Force a PTE mapping when logging is enabled
  2019-03-02  3:35 [RFC PATCH] KVM: arm64: Force a PTE mapping when logging is enabled Zenghui Yu
@ 2019-03-03 15:14 ` Zenghui Yu
  2019-03-04 17:13   ` Suzuki K Poulose
  0 siblings, 1 reply; 8+ messages in thread
From: Zenghui Yu @ 2019-03-03 15:14 UTC (permalink / raw)
  To: christoffer.dall, marc.zyngier
  Cc: Punit Agrawal, suzuki.poulose, julien.thierry, LKML, james.morse,
	Zenghui Yu, wanghaibin.wang, kvmarm, linux-arm-kernel

I think there are still some problems in this patch. Details below.

On Sat, Mar 2, 2019 at 11:39 AM Zenghui Yu <yuzenghui@huawei.com> wrote:
>
> The idea behind this is: we don't want to keep tracking of huge pages when
> logging_active is true, which will result in performance degradation.  We
> still need to set vma_pagesize to PAGE_SIZE, so that we can make use of it
> to force a PTE mapping.
>
> Cc: Suzuki K Poulose <suzuki.poulose@arm.com>
> Cc: Punit Agrawal <punit.agrawal@arm.com>
> Signed-off-by: Zenghui Yu <yuzenghui@huawei.com>
>
> ---
> After looking into https://patchwork.codeaurora.org/patch/647985/ , the
> "vma_pagesize = PAGE_SIZE" logic was not intended to be deleted. As far
> as I can tell, we used to have "hugetlb" to force the PTE mapping, but
> we have "vma_pagesize" currently instead. We should set it properly for
> performance reasons (e.g, in VM migration). Did I miss something important?
>
> ---
>  virt/kvm/arm/mmu.c | 7 +++++++
>  1 file changed, 7 insertions(+)
>
> diff --git a/virt/kvm/arm/mmu.c b/virt/kvm/arm/mmu.c
> index 30251e2..7d41b16 100644
> --- a/virt/kvm/arm/mmu.c
> +++ b/virt/kvm/arm/mmu.c
> @@ -1705,6 +1705,13 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
>              (vma_pagesize == PUD_SIZE && kvm_stage2_has_pmd(kvm))) &&
>             !force_pte) {
>                 gfn = (fault_ipa & huge_page_mask(hstate_vma(vma))) >> PAGE_SHIFT;
> +       } else {
> +               /*
> +                * Fallback to PTE if it's not one of the stage2
> +                * supported hugepage sizes or the corresponding level
> +                * doesn't exist, or logging is enabled.

First, instead of "logging is enabled", it should be "force_pte is true",
since "force_pte" will be true when:

        1) fault_supports_stage2_pmd_mappings() returns false; or
        2) "logging is enabled" (e.g., in VM migration).

Second, falling back to PTE for unsupported hugepage sizes (e.g., a 64K
hugepage with 4K pages) is somewhat strange. It would then _unexpectedly_
reach transparent_hugepage_adjust(), though no real adjustment will happen
since commit fd2ef358282c ("KVM: arm/arm64: Ensure only THP is candidate
for adjustment"). Keeping "vma_pagesize" as it is there would be better,
right?

So I'd just simplify the logic like this:

        } else if (force_pte) {
                vma_pagesize = PAGE_SIZE;
        }
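
The two conditions for force_pte and the simplified branch above can be
modelled in a small standalone sketch. This is not the kernel code: the
EX_* sizes assume 4K base pages, and ex_force_pte() and
ex_adjust_pagesize() are hypothetical helpers that only mirror the
decision structure discussed here.

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative sizes for a 4K-page configuration (assumption). */
#define EX_PAGE_SIZE     (1UL << 12)
#define EX_CONT_PTE_SIZE (1UL << 16)	/* the 64K hugepage case */
#define EX_PMD_SIZE      (1UL << 21)
#define EX_PUD_SIZE      (1UL << 30)

/* force_pte is true if PMD mappings are not possible or logging is on. */
bool ex_force_pte(bool supports_pmd_mappings, bool logging_active)
{
	return !supports_pmd_mappings || logging_active;
}

/*
 * Model of the proposed branch. Returns the resulting vma_pagesize and
 * sets *align_gfn when the huge-page branch (gfn alignment) is taken.
 */
unsigned long ex_adjust_pagesize(unsigned long vma_pagesize, bool has_pmd,
				 bool force_pte, bool *align_gfn)
{
	*align_gfn = false;

	if ((vma_pagesize == EX_PMD_SIZE ||
	     (vma_pagesize == EX_PUD_SIZE && has_pmd)) && !force_pte) {
		*align_gfn = true;
	} else if (force_pte) {
		/* Only a forced PTE mapping shrinks the size. */
		vma_pagesize = EX_PAGE_SIZE;
	}

	return vma_pagesize;
}
```

In this model a PMD-sized VMA with logging active ends up with
EX_PAGE_SIZE, while an unsupported size such as EX_CONT_PTE_SIZE is left
untouched when force_pte is false.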


I will send a v2 later; waiting for your comments :)


thanks,

zenghui


> +                */
> +               vma_pagesize = PAGE_SIZE;
>         }
>         up_read(&current->mm->mmap_sem);
>
> --
> 1.8.3.1
>
>
>


* Re: [RFC PATCH] KVM: arm64: Force a PTE mapping when logging is enabled
  2019-03-03 15:14 ` Zenghui Yu
@ 2019-03-04 17:13   ` Suzuki K Poulose
  2019-03-04 17:34     ` Marc Zyngier
  0 siblings, 1 reply; 8+ messages in thread
From: Suzuki K Poulose @ 2019-03-04 17:13 UTC (permalink / raw)
  To: Zenghui Yu
  Cc: christoffer.dall, marc.zyngier, Punit Agrawal, julien.thierry,
	LKML, james.morse, Zenghui Yu, wanghaibin.wang, kvmarm,
	linux-arm-kernel

Hi Zenghui,

On Sun, Mar 03, 2019 at 11:14:38PM +0800, Zenghui Yu wrote:
> I think there're still some problems in this patch... Details below.
> 
> On Sat, Mar 2, 2019 at 11:39 AM Zenghui Yu <yuzenghui@huawei.com> wrote:
> >
> > The idea behind this is: we don't want to keep tracking of huge pages when
> > logging_active is true, which will result in performance degradation.  We
> > still need to set vma_pagesize to PAGE_SIZE, so that we can make use of it
> > to force a PTE mapping.

Yes, you're right. We are indeed ignoring the force_pte flag.

> >
> > Cc: Suzuki K Poulose <suzuki.poulose@arm.com>
> > Cc: Punit Agrawal <punit.agrawal@arm.com>
> > Signed-off-by: Zenghui Yu <yuzenghui@huawei.com>
> >
> > ---
> > After looking into https://patchwork.codeaurora.org/patch/647985/ , the
> > "vma_pagesize = PAGE_SIZE" logic was not intended to be deleted. As far
> > as I can tell, we used to have "hugetlb" to force the PTE mapping, but
> > we have "vma_pagesize" currently instead. We should set it properly for
> > performance reasons (e.g, in VM migration). Did I miss something important?
> >
> > ---
> >  virt/kvm/arm/mmu.c | 7 +++++++
> >  1 file changed, 7 insertions(+)
> >
> > diff --git a/virt/kvm/arm/mmu.c b/virt/kvm/arm/mmu.c
> > index 30251e2..7d41b16 100644
> > --- a/virt/kvm/arm/mmu.c
> > +++ b/virt/kvm/arm/mmu.c
> > @@ -1705,6 +1705,13 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
> >              (vma_pagesize == PUD_SIZE && kvm_stage2_has_pmd(kvm))) &&
> >             !force_pte) {
> >                 gfn = (fault_ipa & huge_page_mask(hstate_vma(vma))) >> PAGE_SHIFT;
> > +       } else {
> > +               /*
> > +                * Fallback to PTE if it's not one of the stage2
> > +                * supported hugepage sizes or the corresponding level
> > +                * doesn't exist, or logging is enabled.
> 
> First, Instead of "logging is enabled", it should be "force_pte is true",
> since "force_pte" will be true when:
> 
>         1) fault_supports_stage2_pmd_mappings() return false; or
>         2) "logging is enabled" (e.g, in VM migration).
> 
> Second, fallback some unsupported hugepage sizes (e.g, 64K hugepage with
> 4K pages) to PTE is somewhat strange. And it will then _unexpectedly_
> reach transparent_hugepage_adjust(), though no real adjustment will happen
> since commit fd2ef358282c ("KVM: arm/arm64: Ensure only THP is candidate
> for adjustment"). Keeping "vma_pagesize" there as it is will be better,
> right?
> 
> So I'd just simplify the logic like:

We could fix this right in the beginning. See patch below:

> 
>         } else if (force_pte) {
>                 vma_pagesize = PAGE_SIZE;
>         }
> 
> 
> Will send a V2 later and waiting for your comments :)


diff --git a/virt/kvm/arm/mmu.c b/virt/kvm/arm/mmu.c
index 30251e2..529331e 100644
--- a/virt/kvm/arm/mmu.c
+++ b/virt/kvm/arm/mmu.c
@@ -1693,7 +1693,9 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
 		return -EFAULT;
 	}
 
-	vma_pagesize = vma_kernel_pagesize(vma);
+	/* If we are forced to map at page granularity, force the pagesize here */
+	vma_pagesize = force_pte ? PAGE_SIZE : vma_kernel_pagesize(vma);
+
 	/*
 	 * The stage2 has a minimum of 2 level table (For arm64 see
 	 * kvm_arm_setup_stage2()). Hence, we are guaranteed that we can
@@ -1701,11 +1703,10 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
 	 * As for PUD huge maps, we must make sure that we have at least
 	 * 3 levels, i.e, PMD is not folded.
 	 */
-	if ((vma_pagesize == PMD_SIZE ||
-	     (vma_pagesize == PUD_SIZE && kvm_stage2_has_pmd(kvm))) &&
-	    !force_pte) {
+	if (vma_pagesize == PMD_SIZE ||
+	    (vma_pagesize == PUD_SIZE && kvm_stage2_has_pmd(kvm)))
 		gfn = (fault_ipa & huge_page_mask(hstate_vma(vma))) >> PAGE_SHIFT;
-	}
+
 	up_read(&current->mm->mmap_sem);
 
 	/* We need minimum second+third level pages */

---
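
As a sanity check of the idea (a standalone model, not the kernel code):
choosing the page size up front should take the huge-page branch in
exactly the same cases as the old "&& !force_pte" condition, while also
leaving vma_pagesize at PAGE_SIZE whenever force_pte is set. The EX_*
sizes assume 4K base pages and all ex_* names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Illustrative sizes for a 4K-page configuration (assumption). */
#define EX_PAGE_SIZE (1UL << 12)
#define EX_PMD_SIZE  (1UL << 21)
#define EX_PUD_SIZE  (1UL << 30)

/* Old condition: size check combined with !force_pte. */
bool ex_old_huge(unsigned long kernel_pagesize, bool has_pmd, bool force_pte)
{
	return (kernel_pagesize == EX_PMD_SIZE ||
		(kernel_pagesize == EX_PUD_SIZE && has_pmd)) && !force_pte;
}

/* New approach, step 1: pick the page size before any size checks. */
unsigned long ex_initial_pagesize(bool force_pte, unsigned long kernel_pagesize)
{
	return force_pte ? EX_PAGE_SIZE : kernel_pagesize;
}

/* New approach, step 2: the size check no longer looks at force_pte. */
bool ex_new_huge(unsigned long vma_pagesize, bool has_pmd)
{
	return vma_pagesize == EX_PMD_SIZE ||
	       (vma_pagesize == EX_PUD_SIZE && has_pmd);
}

/* Compare both conditions over every combination of the modelled inputs. */
bool ex_conditions_agree(void)
{
	const unsigned long sizes[] = { EX_PAGE_SIZE, EX_PMD_SIZE, EX_PUD_SIZE };

	for (size_t i = 0; i < sizeof(sizes) / sizeof(sizes[0]); i++)
		for (int has_pmd = 0; has_pmd <= 1; has_pmd++)
			for (int force = 0; force <= 1; force++) {
				unsigned long sz =
					ex_initial_pagesize(force, sizes[i]);

				if (ex_old_huge(sizes[i], has_pmd, force) !=
				    ex_new_huge(sz, has_pmd))
					return false;
			}
	return true;
}
```

ex_conditions_agree() returning true only says the two forms match for the
modelled inputs; it says nothing about the remaining PUD-related issues.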

Suzuki


* Re: [RFC PATCH] KVM: arm64: Force a PTE mapping when logging is enabled
  2019-03-04 17:13   ` Suzuki K Poulose
@ 2019-03-04 17:34     ` Marc Zyngier
  2019-03-05 11:09       ` Zenghui Yu
  0 siblings, 1 reply; 8+ messages in thread
From: Marc Zyngier @ 2019-03-04 17:34 UTC (permalink / raw)
  To: Suzuki K Poulose, Zenghui Yu
  Cc: christoffer.dall, Punit Agrawal, julien.thierry, LKML,
	james.morse, Zenghui Yu, wanghaibin.wang, kvmarm,
	linux-arm-kernel

Hi Zenghui, Suzuki,

On 04/03/2019 17:13, Suzuki K Poulose wrote:
> Hi Zenghui,
> 
> On Sun, Mar 03, 2019 at 11:14:38PM +0800, Zenghui Yu wrote:
>> I think there're still some problems in this patch... Details below.
>>
>> On Sat, Mar 2, 2019 at 11:39 AM Zenghui Yu <yuzenghui@huawei.com> wrote:
>>>
>>> The idea behind this is: we don't want to keep tracking of huge pages when
>>> logging_active is true, which will result in performance degradation.  We
>>> still need to set vma_pagesize to PAGE_SIZE, so that we can make use of it
>>> to force a PTE mapping.
> 
> Yes, you're right. We are indeed ignoring the force_pte flag.
> 
>>>
>>> Cc: Suzuki K Poulose <suzuki.poulose@arm.com>
>>> Cc: Punit Agrawal <punit.agrawal@arm.com>
>>> Signed-off-by: Zenghui Yu <yuzenghui@huawei.com>
>>>
>>> ---
>>> After looking into https://patchwork.codeaurora.org/patch/647985/ , the
>>> "vma_pagesize = PAGE_SIZE" logic was not intended to be deleted. As far
>>> as I can tell, we used to have "hugetlb" to force the PTE mapping, but
>>> we have "vma_pagesize" currently instead. We should set it properly for
>>> performance reasons (e.g, in VM migration). Did I miss something important?
>>>
>>> ---
>>>  virt/kvm/arm/mmu.c | 7 +++++++
>>>  1 file changed, 7 insertions(+)
>>>
>>> diff --git a/virt/kvm/arm/mmu.c b/virt/kvm/arm/mmu.c
>>> index 30251e2..7d41b16 100644
>>> --- a/virt/kvm/arm/mmu.c
>>> +++ b/virt/kvm/arm/mmu.c
>>> @@ -1705,6 +1705,13 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
>>>              (vma_pagesize == PUD_SIZE && kvm_stage2_has_pmd(kvm))) &&
>>>             !force_pte) {
>>>                 gfn = (fault_ipa & huge_page_mask(hstate_vma(vma))) >> PAGE_SHIFT;
>>> +       } else {
>>> +               /*
>>> +                * Fallback to PTE if it's not one of the stage2
>>> +                * supported hugepage sizes or the corresponding level
>>> +                * doesn't exist, or logging is enabled.
>>
>> First, Instead of "logging is enabled", it should be "force_pte is true",
>> since "force_pte" will be true when:
>>
>>         1) fault_supports_stage2_pmd_mappings() return false; or
>>         2) "logging is enabled" (e.g, in VM migration).
>>
>> Second, fallback some unsupported hugepage sizes (e.g, 64K hugepage with
>> 4K pages) to PTE is somewhat strange. And it will then _unexpectedly_
>> reach transparent_hugepage_adjust(), though no real adjustment will happen
>> since commit fd2ef358282c ("KVM: arm/arm64: Ensure only THP is candidate
>> for adjustment"). Keeping "vma_pagesize" there as it is will be better,
>> right?
>>
>> So I'd just simplify the logic like:
> 
> We could fix this right in the beginning. See patch below:
> 
>>
>>         } else if (force_pte) {
>>                 vma_pagesize = PAGE_SIZE;
>>         }
>>
>>
>> Will send a V2 later and waiting for your comments :)
> 
> 
> diff --git a/virt/kvm/arm/mmu.c b/virt/kvm/arm/mmu.c
> index 30251e2..529331e 100644
> --- a/virt/kvm/arm/mmu.c
> +++ b/virt/kvm/arm/mmu.c
> @@ -1693,7 +1693,9 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
>  		return -EFAULT;
>  	}
>  
> -	vma_pagesize = vma_kernel_pagesize(vma);
> +	/* If we are forced to map at page granularity, force the pagesize here */
> +	vma_pagesize = force_pte ? PAGE_SIZE : vma_kernel_pagesize(vma);
> +
>  	/*
>  	 * The stage2 has a minimum of 2 level table (For arm64 see
>  	 * kvm_arm_setup_stage2()). Hence, we are guaranteed that we can
> @@ -1701,11 +1703,10 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
>  	 * As for PUD huge maps, we must make sure that we have at least
>  	 * 3 levels, i.e, PMD is not folded.
>  	 */
> -	if ((vma_pagesize == PMD_SIZE ||
> -	     (vma_pagesize == PUD_SIZE && kvm_stage2_has_pmd(kvm))) &&
> -	    !force_pte) {
> +	if (vma_pagesize == PMD_SIZE ||
> +	    (vma_pagesize == PUD_SIZE && kvm_stage2_has_pmd(kvm)))
>  		gfn = (fault_ipa & huge_page_mask(hstate_vma(vma))) >> PAGE_SHIFT;
> -	}
> +
>  	up_read(&current->mm->mmap_sem);
>  
> 	/* We need minimum second+third level pages */

That's pretty interesting, because this is almost what we already have
in the NV code:

https://git.kernel.org/pub/scm/linux/kernel/git/maz/arm-platforms.git/tree/virt/kvm/arm/mmu.c?h=kvm-arm64/nv-wip-v5.0-rc7#n1752

(note that force_pte is gone in that branch).

Thanks,

	M.
-- 
Jazz is not dead. It just smells funny...


* Re: [RFC PATCH] KVM: arm64: Force a PTE mapping when logging is enabled
  2019-03-04 17:34     ` Marc Zyngier
@ 2019-03-05 11:09       ` Zenghui Yu
  2019-03-05 11:13         ` Suzuki K Poulose
  2019-03-05 11:51         ` Marc Zyngier
  0 siblings, 2 replies; 8+ messages in thread
From: Zenghui Yu @ 2019-03-05 11:09 UTC (permalink / raw)
  To: Marc Zyngier, Suzuki K Poulose, Zenghui Yu
  Cc: christoffer.dall, Punit Agrawal, julien.thierry, LKML,
	james.morse, wanghaibin.wang, kvmarm, linux-arm-kernel

Hi Marc, Suzuki,

On 2019/3/5 1:34, Marc Zyngier wrote:
> Hi Zenghui, Suzuki,
> 
> On 04/03/2019 17:13, Suzuki K Poulose wrote:
>> Hi Zenghui,
>>
>> On Sun, Mar 03, 2019 at 11:14:38PM +0800, Zenghui Yu wrote:
>>> I think there're still some problems in this patch... Details below.
>>>
>>> On Sat, Mar 2, 2019 at 11:39 AM Zenghui Yu <yuzenghui@huawei.com> wrote:
>>>>
>>>> The idea behind this is: we don't want to keep tracking of huge pages when
>>>> logging_active is true, which will result in performance degradation.  We
>>>> still need to set vma_pagesize to PAGE_SIZE, so that we can make use of it
>>>> to force a PTE mapping.
>>
>> Yes, you're right. We are indeed ignoring the force_pte flag.
>>
>>>>
>>>> Cc: Suzuki K Poulose <suzuki.poulose@arm.com>
>>>> Cc: Punit Agrawal <punit.agrawal@arm.com>
>>>> Signed-off-by: Zenghui Yu <yuzenghui@huawei.com>
>>>>
>>>> ---
>>>> After looking into https://patchwork.codeaurora.org/patch/647985/ , the
>>>> "vma_pagesize = PAGE_SIZE" logic was not intended to be deleted. As far
>>>> as I can tell, we used to have "hugetlb" to force the PTE mapping, but
>>>> we have "vma_pagesize" currently instead. We should set it properly for
>>>> performance reasons (e.g, in VM migration). Did I miss something important?
>>>>
>>>> ---
>>>>   virt/kvm/arm/mmu.c | 7 +++++++
>>>>   1 file changed, 7 insertions(+)
>>>>
>>>> diff --git a/virt/kvm/arm/mmu.c b/virt/kvm/arm/mmu.c
>>>> index 30251e2..7d41b16 100644
>>>> --- a/virt/kvm/arm/mmu.c
>>>> +++ b/virt/kvm/arm/mmu.c
>>>> @@ -1705,6 +1705,13 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
>>>>               (vma_pagesize == PUD_SIZE && kvm_stage2_has_pmd(kvm))) &&
>>>>              !force_pte) {
>>>>                  gfn = (fault_ipa & huge_page_mask(hstate_vma(vma))) >> PAGE_SHIFT;
>>>> +       } else {
>>>> +               /*
>>>> +                * Fallback to PTE if it's not one of the stage2
>>>> +                * supported hugepage sizes or the corresponding level
>>>> +                * doesn't exist, or logging is enabled.
>>>
>>> First, Instead of "logging is enabled", it should be "force_pte is true",
>>> since "force_pte" will be true when:
>>>
>>>          1) fault_supports_stage2_pmd_mappings() return false; or
>>>          2) "logging is enabled" (e.g, in VM migration).
>>>
>>> Second, fallback some unsupported hugepage sizes (e.g, 64K hugepage with
>>> 4K pages) to PTE is somewhat strange. And it will then _unexpectedly_
>>> reach transparent_hugepage_adjust(), though no real adjustment will happen
>>> since commit fd2ef358282c ("KVM: arm/arm64: Ensure only THP is candidate
>>> for adjustment"). Keeping "vma_pagesize" there as it is will be better,
>>> right?
>>>
>>> So I'd just simplify the logic like:
>>
>> We could fix this right in the beginning. See patch below:
>>
>>>
>>>          } else if (force_pte) {
>>>                  vma_pagesize = PAGE_SIZE;
>>>          }
>>>
>>>
>>> Will send a V2 later and waiting for your comments :)
>>
>>
>> diff --git a/virt/kvm/arm/mmu.c b/virt/kvm/arm/mmu.c
>> index 30251e2..529331e 100644
>> --- a/virt/kvm/arm/mmu.c
>> +++ b/virt/kvm/arm/mmu.c
>> @@ -1693,7 +1693,9 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
>>   		return -EFAULT;
>>   	}
>>   
>> -	vma_pagesize = vma_kernel_pagesize(vma);
>> +	/* If we are forced to map at page granularity, force the pagesize here */
>> +	vma_pagesize = force_pte ? PAGE_SIZE : vma_kernel_pagesize(vma);
>> +
>>   	/*
>>   	 * The stage2 has a minimum of 2 level table (For arm64 see
>>   	 * kvm_arm_setup_stage2()). Hence, we are guaranteed that we can
>> @@ -1701,11 +1703,10 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
>>   	 * As for PUD huge maps, we must make sure that we have at least
>>   	 * 3 levels, i.e, PMD is not folded.
>>   	 */
>> -	if ((vma_pagesize == PMD_SIZE ||
>> -	     (vma_pagesize == PUD_SIZE && kvm_stage2_has_pmd(kvm))) &&
>> -	    !force_pte) {
>> +	if (vma_pagesize == PMD_SIZE ||
>> +	    (vma_pagesize == PUD_SIZE && kvm_stage2_has_pmd(kvm)))
>>   		gfn = (fault_ipa & huge_page_mask(hstate_vma(vma))) >> PAGE_SHIFT;
>> -	}
>> +
>>   	up_read(&current->mm->mmap_sem);
>>   
>>   	/* We need minimum second+third level pages */

A nicer implementation and easier to understand, thanks!

> That's pretty interesting, because this is almost what we already have
> in the NV code:
> 
> https://git.kernel.org/pub/scm/linux/kernel/git/maz/arm-platforms.git/tree/virt/kvm/arm/mmu.c?h=kvm-arm64/nv-wip-v5.0-rc7#n1752
> 
> (note that force_pte is gone in that branch).

haha :-) sorry about that. I haven't looked into the NV code yet, so ...

But I'm still wondering: should we fix this wrong mapping size problem
before NV is introduced? The problem has little to do with NV, and 5.0
has already been released with it (and 5.1 will be too, without a
fix ...).

Just a personal idea, ignore it if unnecessary.


thanks,

zenghui



* Re: [RFC PATCH] KVM: arm64: Force a PTE mapping when logging is enabled
  2019-03-05 11:09       ` Zenghui Yu
@ 2019-03-05 11:13         ` Suzuki K Poulose
  2019-03-05 11:32           ` Zenghui Yu
  2019-03-05 11:51         ` Marc Zyngier
  1 sibling, 1 reply; 8+ messages in thread
From: Suzuki K Poulose @ 2019-03-05 11:13 UTC (permalink / raw)
  To: yuzenghui, marc.zyngier, zenghuiyu96
  Cc: christoffer.dall, punit.agrawal, julien.thierry, linux-kernel,
	james.morse, wanghaibin.wang, kvmarm, linux-arm-kernel

Hi Zenghui,

On 05/03/2019 11:09, Zenghui Yu wrote:
> Hi Marc, Suzuki,
> 
> On 2019/3/5 1:34, Marc Zyngier wrote:
>> Hi Zenghui, Suzuki,
>>
>> On 04/03/2019 17:13, Suzuki K Poulose wrote:
>>> Hi Zenghui,
>>>
>>> On Sun, Mar 03, 2019 at 11:14:38PM +0800, Zenghui Yu wrote:
>>>> I think there're still some problems in this patch... Details below.
>>>>
>>>> On Sat, Mar 2, 2019 at 11:39 AM Zenghui Yu <yuzenghui@huawei.com> wrote:
>>>>>
>>>>> The idea behind this is: we don't want to keep tracking of huge pages when
>>>>> logging_active is true, which will result in performance degradation.  We
>>>>> still need to set vma_pagesize to PAGE_SIZE, so that we can make use of it
>>>>> to force a PTE mapping.
>>>
>>> Yes, you're right. We are indeed ignoring the force_pte flag.
>>>
>>>>>
>>>>> Cc: Suzuki K Poulose <suzuki.poulose@arm.com>
>>>>> Cc: Punit Agrawal <punit.agrawal@arm.com>
>>>>> Signed-off-by: Zenghui Yu <yuzenghui@huawei.com>
>>>>>
>>>>> ---
>>>>> After looking into https://patchwork.codeaurora.org/patch/647985/ , the
>>>>> "vma_pagesize = PAGE_SIZE" logic was not intended to be deleted. As far
>>>>> as I can tell, we used to have "hugetlb" to force the PTE mapping, but
>>>>> we have "vma_pagesize" currently instead. We should set it properly for
>>>>> performance reasons (e.g, in VM migration). Did I miss something important?
>>>>>
>>>>> ---
>>>>>    virt/kvm/arm/mmu.c | 7 +++++++
>>>>>    1 file changed, 7 insertions(+)
>>>>>
>>>>> diff --git a/virt/kvm/arm/mmu.c b/virt/kvm/arm/mmu.c
>>>>> index 30251e2..7d41b16 100644
>>>>> --- a/virt/kvm/arm/mmu.c
>>>>> +++ b/virt/kvm/arm/mmu.c
>>>>> @@ -1705,6 +1705,13 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
>>>>>                (vma_pagesize == PUD_SIZE && kvm_stage2_has_pmd(kvm))) &&
>>>>>               !force_pte) {
>>>>>                   gfn = (fault_ipa & huge_page_mask(hstate_vma(vma))) >> PAGE_SHIFT;
>>>>> +       } else {
>>>>> +               /*
>>>>> +                * Fallback to PTE if it's not one of the stage2
>>>>> +                * supported hugepage sizes or the corresponding level
>>>>> +                * doesn't exist, or logging is enabled.
>>>>
>>>> First, Instead of "logging is enabled", it should be "force_pte is true",
>>>> since "force_pte" will be true when:
>>>>
>>>>           1) fault_supports_stage2_pmd_mappings() return false; or
>>>>           2) "logging is enabled" (e.g, in VM migration).
>>>>
>>>> Second, fallback some unsupported hugepage sizes (e.g, 64K hugepage with
>>>> 4K pages) to PTE is somewhat strange. And it will then _unexpectedly_
>>>> reach transparent_hugepage_adjust(), though no real adjustment will happen
>>>> since commit fd2ef358282c ("KVM: arm/arm64: Ensure only THP is candidate
>>>> for adjustment"). Keeping "vma_pagesize" there as it is will be better,
>>>> right?
>>>>
>>>> So I'd just simplify the logic like:
>>>
>>> We could fix this right in the beginning. See patch below:
>>>
>>>>
>>>>           } else if (force_pte) {
>>>>                   vma_pagesize = PAGE_SIZE;
>>>>           }
>>>>
>>>>
>>>> Will send a V2 later and waiting for your comments :)
>>>
>>>
>>> diff --git a/virt/kvm/arm/mmu.c b/virt/kvm/arm/mmu.c
>>> index 30251e2..529331e 100644
>>> --- a/virt/kvm/arm/mmu.c
>>> +++ b/virt/kvm/arm/mmu.c
>>> @@ -1693,7 +1693,9 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
>>>    		return -EFAULT;
>>>    	}
>>>    
>>> -	vma_pagesize = vma_kernel_pagesize(vma);
>>> +	/* If we are forced to map at page granularity, force the pagesize here */
>>> +	vma_pagesize = force_pte ? PAGE_SIZE : vma_kernel_pagesize(vma);
>>> +
>>>    	/*
>>>    	 * The stage2 has a minimum of 2 level table (For arm64 see
>>>    	 * kvm_arm_setup_stage2()). Hence, we are guaranteed that we can
>>> @@ -1701,11 +1703,10 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
>>>    	 * As for PUD huge maps, we must make sure that we have at least
>>>    	 * 3 levels, i.e, PMD is not folded.
>>>    	 */
>>> -	if ((vma_pagesize == PMD_SIZE ||
>>> -	     (vma_pagesize == PUD_SIZE && kvm_stage2_has_pmd(kvm))) &&
>>> -	    !force_pte) {
>>> +	if (vma_pagesize == PMD_SIZE ||
>>> +	    (vma_pagesize == PUD_SIZE && kvm_stage2_has_pmd(kvm)))
>>>    		gfn = (fault_ipa & huge_page_mask(hstate_vma(vma))) >> PAGE_SHIFT;
>>> -	}
>>> +
>>>    	up_read(&current->mm->mmap_sem);
>>>    
>>>    	/* We need minimum second+third level pages */
> 
> A nicer implementation and easier to understand, thanks!
> 
>> That's pretty interesting, because this is almost what we already have
>> in the NV code:
>>
>> https://git.kernel.org/pub/scm/linux/kernel/git/maz/arm-platforms.git/tree/virt/kvm/arm/mmu.c?h=kvm-arm64/nv-wip-v5.0-rc7#n1752
>>
>> (note that force_pte is gone in that branch).
> 
> haha :-) sorry about that. I haven't looked into the NV code yet, so ...
> 
> But I'm still wondering: should we fix this wrong mapping size problem
> before NV is introduced? Since this problem has not much to do with NV,
> and 5.0 has already been released with this problem (and 5.1 will
> without fix ...).

Yes, we must fix it. It's just that I found some more issues around
forcing PTE mappings with PUD huge pages. I will send out a patch
soon and copy you on it.

Cheers
Suzuki



* Re: [RFC PATCH] KVM: arm64: Force a PTE mapping when logging is enabled
  2019-03-05 11:13         ` Suzuki K Poulose
@ 2019-03-05 11:32           ` Zenghui Yu
  0 siblings, 0 replies; 8+ messages in thread
From: Zenghui Yu @ 2019-03-05 11:32 UTC (permalink / raw)
  To: Suzuki K Poulose, marc.zyngier, zenghuiyu96
  Cc: christoffer.dall, punit.agrawal, julien.thierry, linux-kernel,
	james.morse, wanghaibin.wang, kvmarm, linux-arm-kernel



On 2019/3/5 19:13, Suzuki K Poulose wrote:
> Hi Zenghui,
> 
> On 05/03/2019 11:09, Zenghui Yu wrote:
>> Hi Marc, Suzuki,
>>
>> On 2019/3/5 1:34, Marc Zyngier wrote:
>>> Hi Zenghui, Suzuki,
>>>
>>> On 04/03/2019 17:13, Suzuki K Poulose wrote:
>>>> Hi Zenghui,
>>>>
>>>> On Sun, Mar 03, 2019 at 11:14:38PM +0800, Zenghui Yu wrote:
>>>>> I think there're still some problems in this patch... Details below.
>>>>>
>>>>> On Sat, Mar 2, 2019 at 11:39 AM Zenghui Yu <yuzenghui@huawei.com> wrote:
>>>>>>
>>>>>> The idea behind this is: we don't want to keep tracking of huge pages when
>>>>>> logging_active is true, which will result in performance degradation.  We
>>>>>> still need to set vma_pagesize to PAGE_SIZE, so that we can make use of it
>>>>>> to force a PTE mapping.
>>>>
>>>> Yes, you're right. We are indeed ignoring the force_pte flag.
>>>>
>>>>>>
>>>>>> Cc: Suzuki K Poulose <suzuki.poulose@arm.com>
>>>>>> Cc: Punit Agrawal <punit.agrawal@arm.com>
>>>>>> Signed-off-by: Zenghui Yu <yuzenghui@huawei.com>
>>>>>>
>>>>>> ---
>>>>>> After looking into https://patchwork.codeaurora.org/patch/647985/ , the
>>>>>> "vma_pagesize = PAGE_SIZE" logic was not intended to be deleted. As far
>>>>>> as I can tell, we used to have "hugetlb" to force the PTE mapping, but
>>>>>> we have "vma_pagesize" currently instead. We should set it properly for
>>>>>> performance reasons (e.g, in VM migration). Did I miss something important?
>>>>>>
>>>>>> ---
>>>>>>    virt/kvm/arm/mmu.c | 7 +++++++
>>>>>>    1 file changed, 7 insertions(+)
>>>>>>
>>>>>> diff --git a/virt/kvm/arm/mmu.c b/virt/kvm/arm/mmu.c
>>>>>> index 30251e2..7d41b16 100644
>>>>>> --- a/virt/kvm/arm/mmu.c
>>>>>> +++ b/virt/kvm/arm/mmu.c
>>>>>> @@ -1705,6 +1705,13 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
>>>>>>                (vma_pagesize == PUD_SIZE && kvm_stage2_has_pmd(kvm))) &&
>>>>>>               !force_pte) {
>>>>>>                   gfn = (fault_ipa & huge_page_mask(hstate_vma(vma))) >> PAGE_SHIFT;
>>>>>> +       } else {
>>>>>> +               /*
>>>>>> +                * Fallback to PTE if it's not one of the stage2
>>>>>> +                * supported hugepage sizes or the corresponding level
>>>>>> +                * doesn't exist, or logging is enabled.
>>>>>
>>>>> First, Instead of "logging is enabled", it should be "force_pte is true",
>>>>> since "force_pte" will be true when:
>>>>>
>>>>>           1) fault_supports_stage2_pmd_mappings() return false; or
>>>>>           2) "logging is enabled" (e.g, in VM migration).
>>>>>
>>>>> Second, fallback some unsupported hugepage sizes (e.g, 64K hugepage with
>>>>> 4K pages) to PTE is somewhat strange. And it will then _unexpectedly_
>>>>> reach transparent_hugepage_adjust(), though no real adjustment will happen
>>>>> since commit fd2ef358282c ("KVM: arm/arm64: Ensure only THP is candidate
>>>>> for adjustment"). Keeping "vma_pagesize" there as it is will be better,
>>>>> right?
>>>>>
>>>>> So I'd just simplify the logic like:
>>>>
>>>> We could fix this right in the beginning. See patch below:
>>>>
>>>>>
>>>>>           } else if (force_pte) {
>>>>>                   vma_pagesize = PAGE_SIZE;
>>>>>           }
>>>>>
>>>>>
>>>>> Will send a V2 later and waiting for your comments :)
>>>>
>>>>
>>>> diff --git a/virt/kvm/arm/mmu.c b/virt/kvm/arm/mmu.c
>>>> index 30251e2..529331e 100644
>>>> --- a/virt/kvm/arm/mmu.c
>>>> +++ b/virt/kvm/arm/mmu.c
>>>> @@ -1693,7 +1693,9 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
>>>>    		return -EFAULT;
>>>>    	}
>>>>    
>>>> -	vma_pagesize = vma_kernel_pagesize(vma);
>>>> +	/* If we are forced to map at page granularity, force the pagesize here */
>>>> +	vma_pagesize = force_pte ? PAGE_SIZE : vma_kernel_pagesize(vma);
>>>> +
>>>>    	/*
>>>>    	 * The stage2 has a minimum of 2 level table (For arm64 see
>>>>    	 * kvm_arm_setup_stage2()). Hence, we are guaranteed that we can
>>>> @@ -1701,11 +1703,10 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
>>>>    	 * As for PUD huge maps, we must make sure that we have at least
>>>>    	 * 3 levels, i.e, PMD is not folded.
>>>>    	 */
>>>> -	if ((vma_pagesize == PMD_SIZE ||
>>>> -	     (vma_pagesize == PUD_SIZE && kvm_stage2_has_pmd(kvm))) &&
>>>> -	    !force_pte) {
>>>> +	if (vma_pagesize == PMD_SIZE ||
>>>> +	    (vma_pagesize == PUD_SIZE && kvm_stage2_has_pmd(kvm)))
>>>>    		gfn = (fault_ipa & huge_page_mask(hstate_vma(vma))) >> PAGE_SHIFT;
>>>> -	}
>>>> +
>>>>    	up_read(&current->mm->mmap_sem);
>>>>    
>>>>    	/* We need minimum second+third level pages */
>>
>> A nicer implementation and easier to understand, thanks!
>>
>>> That's pretty interesting, because this is almost what we already have
>>> in the NV code:
>>>
>>> https://git.kernel.org/pub/scm/linux/kernel/git/maz/arm-platforms.git/tree/virt/kvm/arm/mmu.c?h=kvm-arm64/nv-wip-v5.0-rc7#n1752 
>>>
>>>
>>> (note that force_pte is gone in that branch).
>>
>> haha :-) sorry about that. I haven't looked into the NV code yet, so ...
>>
>> But I'm still wondering: should we fix this wrong mapping size problem
>> before NV is introduced? This problem has little to do with NV,
>> and 5.0 has already been released with it (and 5.1 will be released
>> without a fix ...).
> 
> Yes, we must fix it. I will soon send out a patch and copy you on it.
> It's just that I found some more issues around forcing PTE
> mappings with PUD huge pages. I will send something out soon.

Sounds good!
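
For reference, the gfn computation in the patch quoted above rounds the
faulting IPA down to the start of the huge mapping before converting it to
a frame number. Below is a minimal user-space sketch of that arithmetic,
not the kernel code: it assumes 4K base pages (so a PMD block is 2M), and
mapping_mask() and fault_gfn() are hypothetical helpers standing in for
huge_page_mask(hstate_vma(vma)) and the shift by PAGE_SHIFT.

```c
#include <stdint.h>

#define PAGE_SHIFT 12                          /* assumption: 4K base pages */
#define PAGE_SIZE  (UINT64_C(1) << PAGE_SHIFT)
#define PMD_SIZE   (UINT64_C(1) << 21)         /* 2M block with 4K pages */

/* Hypothetical helper: mask that clears the offset within one mapping. */
static uint64_t mapping_mask(uint64_t map_size)
{
	return ~(map_size - 1);
}

/*
 * Hypothetical helper: first guest frame number of the mapping of size
 * map_size that covers fault_ipa.
 */
static uint64_t fault_gfn(uint64_t fault_ipa, uint64_t map_size)
{
	return (fault_ipa & mapping_mask(map_size)) >> PAGE_SHIFT;
}
```

With these assumptions, fault_gfn(0x40234567, PMD_SIZE) gives 0x40200 (the
start of the 2M block), while fault_gfn(0x40234567, PAGE_SIZE) gives
0x40234 (the faulting page itself).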


zenghui



^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [RFC PATCH] KVM: arm64: Force a PTE mapping when logging is enabled
  2019-03-05 11:09       ` Zenghui Yu
  2019-03-05 11:13         ` Suzuki K Poulose
@ 2019-03-05 11:51         ` Marc Zyngier
  1 sibling, 0 replies; 8+ messages in thread
From: Marc Zyngier @ 2019-03-05 11:51 UTC (permalink / raw)
  To: Zenghui Yu, Suzuki K Poulose, Zenghui Yu
  Cc: christoffer.dall, Punit Agrawal, julien.thierry, LKML,
	james.morse, wanghaibin.wang, kvmarm, linux-arm-kernel

On 05/03/2019 11:09, Zenghui Yu wrote:
> Hi Marc, Suzuki,
> 
> On 2019/3/5 1:34, Marc Zyngier wrote:
>> Hi Zenghui, Suzuki,
>>
>> On 04/03/2019 17:13, Suzuki K Poulose wrote:
>>> Hi Zenghui,
>>>
>>> On Sun, Mar 03, 2019 at 11:14:38PM +0800, Zenghui Yu wrote:
>>>> I think there're still some problems in this patch... Details below.
>>>>
>>>> On Sat, Mar 2, 2019 at 11:39 AM Zenghui Yu <yuzenghui@huawei.com> wrote:
>>>>>
>>>>> The idea behind this is: we don't want to keep track of huge pages when
>>>>> logging_active is true, since that results in performance degradation.
>>>>> We still need to set vma_pagesize to PAGE_SIZE, so that we can make use
>>>>> of it to force a PTE mapping.
>>>
>>> Yes, you're right. We are indeed ignoring the force_pte flag.
>>>
>>>>>
>>>>> Cc: Suzuki K Poulose <suzuki.poulose@arm.com>
>>>>> Cc: Punit Agrawal <punit.agrawal@arm.com>
>>>>> Signed-off-by: Zenghui Yu <yuzenghui@huawei.com>
>>>>>
>>>>> ---
>>>>> After looking into https://patchwork.codeaurora.org/patch/647985/ , the
>>>>> "vma_pagesize = PAGE_SIZE" logic was not intended to be deleted. As far
>>>>> as I can tell, we used to have "hugetlb" to force the PTE mapping, but
>>>>> we have "vma_pagesize" currently instead. We should set it properly for
>>>>> performance reasons (e.g, in VM migration). Did I miss something important?
>>>>>
>>>>> ---
>>>>>   virt/kvm/arm/mmu.c | 7 +++++++
>>>>>   1 file changed, 7 insertions(+)
>>>>>
>>>>> diff --git a/virt/kvm/arm/mmu.c b/virt/kvm/arm/mmu.c
>>>>> index 30251e2..7d41b16 100644
>>>>> --- a/virt/kvm/arm/mmu.c
>>>>> +++ b/virt/kvm/arm/mmu.c
>>>>> @@ -1705,6 +1705,13 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
>>>>>               (vma_pagesize == PUD_SIZE && kvm_stage2_has_pmd(kvm))) &&
>>>>>              !force_pte) {
>>>>>                  gfn = (fault_ipa & huge_page_mask(hstate_vma(vma))) >> PAGE_SHIFT;
>>>>> +       } else {
>>>>> +               /*
>>>>> +                * Fallback to PTE if it's not one of the stage2
>>>>> +                * supported hugepage sizes or the corresponding level
>>>>> +                * doesn't exist, or logging is enabled.
>>>>
>>>> First, instead of "logging is enabled", it should be "force_pte is true",
>>>> since "force_pte" will be true when:
>>>>
>>>>          1) fault_supports_stage2_pmd_mappings() returns false; or
>>>>          2) "logging is enabled" (e.g., in VM migration).
>>>>
>>>> Second, falling back to PTE for some unsupported hugepage sizes (e.g., 64K
>>>> hugepage with 4K pages) is somewhat strange. It would then _unexpectedly_
>>>> reach transparent_hugepage_adjust(), though no real adjustment will happen
>>>> since commit fd2ef358282c ("KVM: arm/arm64: Ensure only THP is candidate
>>>> for adjustment"). Keeping "vma_pagesize" there as it is would be better,
>>>> right?
>>>>
>>>> So I'd just simplify the logic like:
>>>
>>> We could fix this right in the beginning. See patch below:
>>>
>>>>
>>>>          } else if (force_pte) {
>>>>                  vma_pagesize = PAGE_SIZE;
>>>>          }
>>>>
>>>>
>>>> Will send a V2 later and waiting for your comments :)
>>>
>>>
>>> diff --git a/virt/kvm/arm/mmu.c b/virt/kvm/arm/mmu.c
>>> index 30251e2..529331e 100644
>>> --- a/virt/kvm/arm/mmu.c
>>> +++ b/virt/kvm/arm/mmu.c
>>> @@ -1693,7 +1693,9 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
>>>   		return -EFAULT;
>>>   	}
>>>   
>>> -	vma_pagesize = vma_kernel_pagesize(vma);
>>> +	/* If we are forced to map at page granularity, force the pagesize here */
>>> +	vma_pagesize = force_pte ? PAGE_SIZE : vma_kernel_pagesize(vma);
>>> +
>>>   	/*
>>>   	 * The stage2 has a minimum of 2 level table (For arm64 see
>>>   	 * kvm_arm_setup_stage2()). Hence, we are guaranteed that we can
>>> @@ -1701,11 +1703,10 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
>>>   	 * As for PUD huge maps, we must make sure that we have at least
>>>   	 * 3 levels, i.e, PMD is not folded.
>>>   	 */
>>> -	if ((vma_pagesize == PMD_SIZE ||
>>> -	     (vma_pagesize == PUD_SIZE && kvm_stage2_has_pmd(kvm))) &&
>>> -	    !force_pte) {
>>> +	if (vma_pagesize == PMD_SIZE ||
>>> +	    (vma_pagesize == PUD_SIZE && kvm_stage2_has_pmd(kvm)))
>>>   		gfn = (fault_ipa & huge_page_mask(hstate_vma(vma))) >> PAGE_SHIFT;
>>> -	}
>>> +
>>>   	up_read(&current->mm->mmap_sem);
>>>   
>>>   	/* We need minimum second+third level pages */
> 
> A nicer implementation and easier to understand, thanks!
> 
>> That's pretty interesting, because this is almost what we already have
>> in the NV code:
>>
>> https://git.kernel.org/pub/scm/linux/kernel/git/maz/arm-platforms.git/tree/virt/kvm/arm/mmu.c?h=kvm-arm64/nv-wip-v5.0-rc7#n1752
>>
>> (note that force_pte is gone in that branch).
> 
> haha :-) sorry about that. I haven't looked into the NV code yet, so ...
> 
> But I'm still wondering: should we fix this wrong mapping size problem
> before NV is introduced? This problem has little to do with NV,
> and 5.0 has already been released with it (and 5.1 will be released
> without a fix ...).
> 
> Just a personal idea, ignore it if unnecessary.

We definitely want to fix it now, and have the fix backported to older
versions. I can always rebase the NV branch on top of the current state
of mainline.
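
To make the problem concrete, here is a minimal user-space model of the
mapping-size decision discussed in this thread. It is a sketch under
stated assumptions rather than the kernel code: 4K base pages are assumed,
and mapping_size_before() and mapping_size_after() are hypothetical names
that model the vma_pagesize value before and after the diff quoted above.

```c
#include <stdbool.h>

#define PAGE_SIZE (1UL << 12)   /* assumption: 4K base pages */
#define PMD_SIZE  (1UL << 21)   /* 2M block with 4K pages */

/*
 * Model of the current behaviour: force_pte is only consulted when the
 * gfn is aligned, so the size taken from the VMA is used unchanged and a
 * hugetlbfs-backed fault still ends up with a block mapping.
 */
static unsigned long mapping_size_before(unsigned long vma_size, bool force_pte)
{
	(void)force_pte;	/* ignored when choosing the mapping size */
	return vma_size;
}

/*
 * Model of the proposed fix: decide the size once, up front, so every
 * later check on the page size sees PAGE_SIZE when force_pte is set.
 */
static unsigned long mapping_size_after(unsigned long vma_size, bool force_pte)
{
	return force_pte ? PAGE_SIZE : vma_size;
}
```

In this model, a PMD-sized VMA with force_pte set keeps PMD_SIZE in the
"before" case and drops to PAGE_SIZE in the "after" case; with force_pte
clear, both return the VMA size unchanged.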

Thanks,

	M.
-- 
Jazz is not dead. It just smells funny...

^ permalink raw reply	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2019-03-05 11:51 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz / follow: Atom feed)
2019-03-02  3:35 [RFC PATCH] KVM: arm64: Force a PTE mapping when logging is enabled Zenghui Yu
2019-03-03 15:14 ` Zenghui Yu
2019-03-04 17:13   ` Suzuki K Poulose
2019-03-04 17:34     ` Marc Zyngier
2019-03-05 11:09       ` Zenghui Yu
2019-03-05 11:13         ` Suzuki K Poulose
2019-03-05 11:32           ` Zenghui Yu
2019-03-05 11:51         ` Marc Zyngier
