* [PATCH] KVM: arm64: Clarify explanation of STAGE2_PGTABLE_LEVELS
@ 2018-11-02  7:53 ` Christoffer Dall
  0 siblings, 0 replies; 18+ messages in thread
From: Christoffer Dall @ 2018-11-02  7:53 UTC (permalink / raw)
  To: kvmarm, linux-arm-kernel; +Cc: Marc Zyngier, kvm

In attempting to reconstruct the logic for our stage 2 page table
layout, I found the reasoning in the comment explaining how we calculate
the number of levels used for stage 2 page tables a bit backwards.

This commit attempts to clarify the comment, to make it slightly easier
to read without having the Arm ARM open on the right page.

While we're at it, fix up a typo in a comment that was recently changed.

Signed-off-by: Christoffer Dall <christoffer.dall@arm.com>
---
 arch/arm64/include/asm/stage2_pgtable.h | 17 ++++++++++-------
 virt/kvm/arm/mmu.c                      |  2 +-
 2 files changed, 11 insertions(+), 8 deletions(-)

diff --git a/arch/arm64/include/asm/stage2_pgtable.h b/arch/arm64/include/asm/stage2_pgtable.h
index d352f6df8d2c..9c387320b28c 100644
--- a/arch/arm64/include/asm/stage2_pgtable.h
+++ b/arch/arm64/include/asm/stage2_pgtable.h
@@ -31,15 +31,18 @@
 
 /*
  * The hardware supports concatenation of up to 16 tables at stage2 entry level
- * and we use the feature whenever possible.
+ * and we use the feature whenever possible, which means we resolve 4 bits of
+ * address at the entry level.
  *
- * Now, the minimum number of bits resolved at any level is (PAGE_SHIFT - 3).
+ * This implies, the total number of page table levels required for
+ * IPA_SHIFT at stage2 expected by the hardware can be calculated using
+ * the same logic used for the (non-collapsable) stage1 page tables but for
+ * (IPA_SHIFT - 4).
+ *
+ * Note, the minimum number of bits resolved at any level is (PAGE_SHIFT - 3).
  * On arm64, the smallest PAGE_SIZE supported is 4k, which means
- *             (PAGE_SHIFT - 3) > 4 holds for all page sizes.
- * This implies, the total number of page table levels at stage2 expected
- * by the hardware is actually the number of levels required for (IPA_SHIFT - 4)
- * in normal translations(e.g, stage1), since we cannot have another level in
- * the range (IPA_SHIFT, IPA_SHIFT - 4).
+ *             (PAGE_SHIFT - 3) > 4 holds for all page sizes
+ * and therefore we will need a minimum of two levels for stage2 in all cases.
  */
 #define stage2_pgtable_levels(ipa)	ARM64_HW_PGTABLE_LEVELS((ipa) - 4)
 #define kvm_stage2_levels(kvm)		VTCR_EL2_LVLS(kvm->arch.vtcr)
diff --git a/virt/kvm/arm/mmu.c b/virt/kvm/arm/mmu.c
index 4e7572656b5c..78d8020df4a4 100644
--- a/virt/kvm/arm/mmu.c
+++ b/virt/kvm/arm/mmu.c
@@ -1234,7 +1234,7 @@ static bool transparent_hugepage_adjust(kvm_pfn_t *pfnp, phys_addr_t *ipap)
 	struct page *page = pfn_to_page(pfn);
 
 	/*
-	 * PageTransCompoungMap() returns true for THP and
+	 * PageTransCompoundMap() returns true for THP and
 	 * hugetlbfs. Make sure the adjustment is done only for THP
 	 * pages.
 	 */
-- 
2.18.0

* Re: [PATCH] KVM: arm64: Clarify explanation of STAGE2_PGTABLE_LEVELS
  2018-11-02  7:53 ` Christoffer Dall
@ 2018-11-02 11:02   ` Suzuki K Poulose
  -1 siblings, 0 replies; 18+ messages in thread
From: Suzuki K Poulose @ 2018-11-02 11:02 UTC (permalink / raw)
  To: Christoffer Dall, kvmarm, linux-arm-kernel; +Cc: Marc Zyngier, kvm

Hi

On 02/11/18 07:53, Christoffer Dall wrote:
> In attempting to re-construct the logic for our stage 2 page table
> layout I found the reasoning in the comment explaining how we calculate
> the number of levels used for stage 2 page tables a bit backwards.
> 
> This commit attempts to clarify the comment, to make it slightly easier
> to read without having the Arm ARM open on the right page.
> 
> While we're at it, fixup a typo in a comment that was recently changed.
> 
> Signed-off-by: Christoffer Dall <christoffer.dall@arm.com>
> ---
>   arch/arm64/include/asm/stage2_pgtable.h | 17 ++++++++++-------
>   virt/kvm/arm/mmu.c                      |  2 +-
>   2 files changed, 11 insertions(+), 8 deletions(-)
> 
> diff --git a/arch/arm64/include/asm/stage2_pgtable.h b/arch/arm64/include/asm/stage2_pgtable.h
> index d352f6df8d2c..9c387320b28c 100644
> --- a/arch/arm64/include/asm/stage2_pgtable.h
> +++ b/arch/arm64/include/asm/stage2_pgtable.h
> @@ -31,15 +31,18 @@
>   
>   /*
>    * The hardware supports concatenation of up to 16 tables at stage2 entry level
> - * and we use the feature whenever possible.
> + * and we use the feature whenever possible, which means we resolve 4 bits of

s/we resolve 4 bits/we resolve 4 additional bits/ ?

> + * address at the entry level.
>    *
> - * Now, the minimum number of bits resolved at any level is (PAGE_SHIFT - 3).
> + * This implies, the total number of page table levels required for
> + * IPA_SHIFT at stage2 expected by the hardware can be calculated using
> + * the same logic used for the (non-collapsable) stage1 page tables but for
> + * (IPA_SHIFT - 4).
> + *
> + * Note, the minimum number of bits resolved at any level is (PAGE_SHIFT - 3).

Maybe we could improve it further with:

s/resolved at any level/resolved at any *non-entry* level/

since we could resolve as little as 1 bit at the entry level.


>    * On arm64, the smallest PAGE_SIZE supported is 4k, which means
> - *             (PAGE_SHIFT - 3) > 4 holds for all page sizes.
> - * This implies, the total number of page table levels at stage2 expected
> - * by the hardware is actually the number of levels required for (IPA_SHIFT - 4)
> - * in normal translations(e.g, stage1), since we cannot have another level in
> - * the range (IPA_SHIFT, IPA_SHIFT - 4).
> + *             (PAGE_SHIFT - 3) > 4 holds for all page sizes
> + * and therefore we will need a minimum of two levels for stage2 in all cases.

I think the above statement is misleading. The minimum number of
levels has nothing to do with the concatenation. For example, we could
still create a stage2 with 1 level (32-bit IPA on 64K pages: 29 bits +
3 bits concatenated), going by the same rules above. The only reason we
limit the number of levels to a minimum of 2 is to prevent splitting
stage1 PMD huge mappings (which are quite common) at stage2.
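To make the arithmetic concrete, here is a minimal user-space sketch of the level calculation. It is not the kernel macro itself: the helper names are made up for illustration, the page shift is passed as a parameter instead of using PAGE_SHIFT, and the rounding-up division is my assumption of what ARM64_HW_PGTABLE_LEVELS() computes.

```c
#include <assert.h>

/* Bits resolved by one full table: a page holds 2^(page_shift - 3) 8-byte entries. */
static unsigned int bits_per_level(unsigned int page_shift)
{
	assert(page_shift > 3);
	return page_shift - 3;
}

/*
 * Levels needed by a normal (non-concatenated) table for va_bits:
 * (va_bits - page_shift) divided by bits_per_level(), rounded up.
 */
static unsigned int hw_pgtable_levels(unsigned int va_bits, unsigned int page_shift)
{
	unsigned int b = bits_per_level(page_shift);

	assert(va_bits > page_shift);
	return (va_bits - page_shift + b - 1) / b;
}

/* Stage 2 levels when up to 16 tables (4 bits) are concatenated at the entry level. */
static unsigned int stage2_levels(unsigned int ipa_bits, unsigned int page_shift)
{
	return hw_pgtable_levels(ipa_bits - 4, page_shift);
}
```

With these helpers, stage2_levels(32, 16) gives 1 for the 32-bit IPA on 64K case above, where a normal table would need 2 levels, and stage2_levels(40, 12) gives 3 where a normal table would need 4.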

>    */
>   #define stage2_pgtable_levels(ipa)	ARM64_HW_PGTABLE_LEVELS((ipa) - 4)
>   #define kvm_stage2_levels(kvm)		VTCR_EL2_LVLS(kvm->arch.vtcr)
> diff --git a/virt/kvm/arm/mmu.c b/virt/kvm/arm/mmu.c
> index 4e7572656b5c..78d8020df4a4 100644
> --- a/virt/kvm/arm/mmu.c
> +++ b/virt/kvm/arm/mmu.c
> @@ -1234,7 +1234,7 @@ static bool transparent_hugepage_adjust(kvm_pfn_t *pfnp, phys_addr_t *ipap)
>   	struct page *page = pfn_to_page(pfn);
>   
>   	/*
> -	 * PageTransCompoungMap() returns true for THP and
> +	 * PageTransCompoundMap() returns true for THP and
>   	 * hugetlbfs. Make sure the adjustment is done only for THP
>   	 * pages.
>   	 */
> 

Thanks
Suzuki

* Re: [PATCH] KVM: arm64: Clarify explanation of STAGE2_PGTABLE_LEVELS
  2018-11-02 11:02   ` Suzuki K Poulose
@ 2018-11-02 14:25     ` Christoffer Dall
  -1 siblings, 0 replies; 18+ messages in thread
From: Christoffer Dall @ 2018-11-02 14:25 UTC (permalink / raw)
  To: Suzuki K Poulose; +Cc: Marc Zyngier, kvmarm, linux-arm-kernel, kvm

On Fri, Nov 02, 2018 at 11:02:38AM +0000, Suzuki K Poulose wrote:
> Hi
> 
> On 02/11/18 07:53, Christoffer Dall wrote:
> >In attempting to re-construct the logic for our stage 2 page table
> >layout I found the reasoning in the comment explaining how we calculate
> >the number of levels used for stage 2 page tables a bit backwards.
> >
> >This commit attempts to clarify the comment, to make it slightly easier
> >to read without having the Arm ARM open on the right page.
> >
> >While we're at it, fixup a typo in a comment that was recently changed.
> >
> >Signed-off-by: Christoffer Dall <christoffer.dall@arm.com>
> >---
> >  arch/arm64/include/asm/stage2_pgtable.h | 17 ++++++++++-------
> >  virt/kvm/arm/mmu.c                      |  2 +-
> >  2 files changed, 11 insertions(+), 8 deletions(-)
> >
> >diff --git a/arch/arm64/include/asm/stage2_pgtable.h b/arch/arm64/include/asm/stage2_pgtable.h
> >index d352f6df8d2c..9c387320b28c 100644
> >--- a/arch/arm64/include/asm/stage2_pgtable.h
> >+++ b/arch/arm64/include/asm/stage2_pgtable.h
> >@@ -31,15 +31,18 @@
> >  /*
> >   * The hardware supports concatenation of up to 16 tables at stage2 entry level
> >- * and we use the feature whenever possible.
> >+ * and we use the feature whenever possible, which means we resolve 4 bits of
> 
> s/we resolve 4 bits/we resolve 4 additional bits/ ?
> 

yes

> >+ * address at the entry level.
> >   *
> >- * Now, the minimum number of bits resolved at any level is (PAGE_SHIFT - 3).
> >+ * This implies, the total number of page table levels required for
> >+ * IPA_SHIFT at stage2 expected by the hardware can be calculated using
> >+ * the same logic used for the (non-collapsable) stage1 page tables but for
> >+ * (IPA_SHIFT - 4).
> >+ *
> >+ * Note, the minimum number of bits resolved at any level is (PAGE_SHIFT - 3).
> 
> May be we could improve it further by :
> 
> s/resolved at any level/resolved at any *non-entry* level/
> 
> as, we could resolve as small as 1 bit at the entry level.
> 
> 

yes

> >   * On arm64, the smallest PAGE_SIZE supported is 4k, which means
> >- *             (PAGE_SHIFT - 3) > 4 holds for all page sizes.
> >- * This implies, the total number of page table levels at stage2 expected
> >- * by the hardware is actually the number of levels required for (IPA_SHIFT - 4)
> >- * in normal translations(e.g, stage1), since we cannot have another level in
> >- * the range (IPA_SHIFT, IPA_SHIFT - 4).
> >+ *             (PAGE_SHIFT - 3) > 4 holds for all page sizes
> >+ * and therefore we will need a minimum of two levels for stage2 in all cases.
> 
> I think the above statement is misleading. The minimum number of
> levels has nothing to do with the concatenation.

Architecturally surely it does?  (The point of concatenation is to
reduce the minimal number of levels required.)

Maybe you mean the minimum number of levels imposed by KVM here?

> For e.g, we could
> still create a stage2 with 1 level (32bit IPA on 64K, 29bit + 3bit
> concatenated), going by the same rules above. The only reason why
> we limit the number of levels to 2, is to prevent splitting stage1 PMD
> huge mappings (which are quite common) at stage2.
> 

So it wasn't entirely clear to me what the comment was trying to say with
the "(PAGE_SHIFT - 3) > 4 holds for all page sizes" statement. I thought
it was there to show that we'll need a minimum of two levels, but
maybe it was written under the assumption of the limitations of
IPA_SHIFT (was KVM_PHYS_SIZE).

Since you wrote the original comment, and I couldn't correctly parse
it and apparently still don't fully understand it, can you suggest an
alternative wording?

> >   */
> >  #define stage2_pgtable_levels(ipa)	ARM64_HW_PGTABLE_LEVELS((ipa) - 4)
> >  #define kvm_stage2_levels(kvm)		VTCR_EL2_LVLS(kvm->arch.vtcr)
> >diff --git a/virt/kvm/arm/mmu.c b/virt/kvm/arm/mmu.c
> >index 4e7572656b5c..78d8020df4a4 100644
> >--- a/virt/kvm/arm/mmu.c
> >+++ b/virt/kvm/arm/mmu.c
> >@@ -1234,7 +1234,7 @@ static bool transparent_hugepage_adjust(kvm_pfn_t *pfnp, phys_addr_t *ipap)
> >  	struct page *page = pfn_to_page(pfn);
> >  	/*
> >-	 * PageTransCompoungMap() returns true for THP and
> >+	 * PageTransCompoundMap() returns true for THP and
> >  	 * hugetlbfs. Make sure the adjustment is done only for THP
> >  	 * pages.
> >  	 */
> >
> 

Thanks!

    Christoffer

* Re: [PATCH] KVM: arm64: Clarify explanation of STAGE2_PGTABLE_LEVELS
  2018-11-02 14:25     ` Christoffer Dall
@ 2018-11-05 15:00       ` Suzuki K Poulose
  -1 siblings, 0 replies; 18+ messages in thread
From: Suzuki K Poulose @ 2018-11-05 15:00 UTC (permalink / raw)
  To: Christoffer Dall; +Cc: Marc Zyngier, kvmarm, linux-arm-kernel, kvm



On 02/11/18 14:25, Christoffer Dall wrote:
> On Fri, Nov 02, 2018 at 11:02:38AM +0000, Suzuki K Poulose wrote:
>> Hi
>>
>> On 02/11/18 07:53, Christoffer Dall wrote:
>>> In attempting to re-construct the logic for our stage 2 page table
>>> layout I found the reasoning in the comment explaining how we calculate
>>> the number of levels used for stage 2 page tables a bit backwards.
>>>
>>> This commit attempts to clarify the comment, to make it slightly easier
>>> to read without having the Arm ARM open on the right page.
>>>
>>> While we're at it, fixup a typo in a comment that was recently changed.
>>>
>>> Signed-off-by: Christoffer Dall <christoffer.dall@arm.com>
>>> ---
>>>   arch/arm64/include/asm/stage2_pgtable.h | 17 ++++++++++-------
>>>   virt/kvm/arm/mmu.c                      |  2 +-
>>>   2 files changed, 11 insertions(+), 8 deletions(-)
>>>
>>> diff --git a/arch/arm64/include/asm/stage2_pgtable.h b/arch/arm64/include/asm/stage2_pgtable.h
>>> index d352f6df8d2c..9c387320b28c 100644
>>> --- a/arch/arm64/include/asm/stage2_pgtable.h
>>> +++ b/arch/arm64/include/asm/stage2_pgtable.h
>>> @@ -31,15 +31,18 @@
>>>   /*
>>>    * The hardware supports concatenation of up to 16 tables at stage2 entry level
>>> - * and we use the feature whenever possible.
>>> + * and we use the feature whenever possible, which means we resolve 4 bits of
>>
>> s/we resolve 4 bits/we resolve 4 additional bits/ ?
>>
> 
> yes
> 
>>> + * address at the entry level.
>>>    *
>>> - * Now, the minimum number of bits resolved at any level is (PAGE_SHIFT - 3).
>>> + * This implies, the total number of page table levels required for
>>> + * IPA_SHIFT at stage2 expected by the hardware can be calculated using
>>> + * the same logic used for the (non-collapsable) stage1 page tables but for
>>> + * (IPA_SHIFT - 4).
>>> + *
>>> + * Note, the minimum number of bits resolved at any level is (PAGE_SHIFT - 3).
>>
>> May be we could improve it further by :
>>
>> s/resolved at any level/resolved at any *non-entry* level/
>>
>> as, we could resolve as small as 1 bit at the entry level.
>>
>>
> 
> yes
> 
>>>    * On arm64, the smallest PAGE_SIZE supported is 4k, which means
>>> - *             (PAGE_SHIFT - 3) > 4 holds for all page sizes.
>>> - * This implies, the total number of page table levels at stage2 expected
>>> - * by the hardware is actually the number of levels required for (IPA_SHIFT - 4)
>>> - * in normal translations(e.g, stage1), since we cannot have another level in
>>> - * the range (IPA_SHIFT, IPA_SHIFT - 4).
>>> + *             (PAGE_SHIFT - 3) > 4 holds for all page sizes
>>> + * and therefore we will need a minimum of two levels for stage2 in all cases.
>>
>> I think the above statement is misleading. The minimum number of
>> levels has nothing to do with the concatenation.
> 
> Architecturally surely it does?  (The point of concatenation is to
> reduce the minimal number of levels required.)
> 
> Maybe you mean the minimum number of levels imposed by KVM here?
> 
>> For e.g, we could
>> still create a stage2 with 1 level (32bit IPA on 64K, 29bit + 3bit
>> concatenated), going by the same rules above. The only reason why
>> we limit the number of levels to 2, is to prevent splitting stage1 PMD
>> huge mappings (which are quite common) at stage2.
>>
> 
> So I wasn't entirely clear what the comment was trying to say with the
> "(PAGE_SHIFT - 3) > 4 holds for all page size" statement, so I thought
> that was there to show that we'll need a minimum of two levels, but
> maybe that was written under the assumption of the limitations of
> IPA_SHIFT (was KVM_PHYS_SIZE).

See below.

> 
> Since you wrote the original comment, and I couldn't correctly parse
> that, and I apparently still didn't fully understand, can you suggest an
> alternative wording?

I think trying to over-explain the concept has created more confusion.
The whole paragraph is trying to prove that we only need:

	ARM64_HW_PGTABLE_LEVELS(IPA_SHIFT - 4)

levels to map IPA_SHIFT bits at stage2 with maximum utilization of the
concatenation at entry level.

Right now the comment tries to establish it via the hard route, by
proving that there cannot be an intermediate level in the range
[IPA_SHIFT, IPA_SHIFT - 4]. In other words:

	(ARM64_HW_PGTABLE_LEVELS(IPA_SHIFT - 4) + 1) >=
			ARM64_HW_PGTABLE_LEVELS(IPA_SHIFT)

I don't know if the explanation is worth it if it only causes further
confusion.
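
For what it's worth, the inequality can be checked by brute force. The sketch below is user-space C, not kernel code: hw_pgtable_levels() is a hypothetical stand-in that assumes ARM64_HW_PGTABLE_LEVELS() is a rounding-up division by (PAGE_SHIFT - 3), and the IPA range to scan is chosen by the caller.

```c
#include <assert.h>
#include <stdbool.h>

/* Levels for a normal table: (va_bits - page_shift) / (page_shift - 3), rounded up. */
static unsigned int hw_pgtable_levels(unsigned int va_bits, unsigned int page_shift)
{
	unsigned int b = page_shift - 3;

	assert(va_bits > page_shift);
	return (va_bits - page_shift + b - 1) / b;
}

/*
 * Return true if LEVELS(ipa - 4) + 1 >= LEVELS(ipa) for every IPA size
 * in [lo, hi], i.e. dropping 4 bits never removes more than one level.
 */
static bool no_intermediate_level(unsigned int page_shift,
				  unsigned int lo, unsigned int hi)
{
	for (unsigned int ipa = lo; ipa <= hi; ipa++) {
		if (hw_pgtable_levels(ipa - 4, page_shift) + 1 <
		    hw_pgtable_levels(ipa, page_shift))
			return false;
	}
	return true;
}
```

Scanning IPA sizes 32 to 52 with page shifts 12, 14 and 16 finds no counterexample, which is what you would expect given that every full level resolves at least 9 bits, more than the 4 bits removed.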

Maybe I could replace the confusing text with something like:

"A normal page table with ARM64_HW_PGTABLE_LEVELS(IPA_SHIFT - 4) levels
is guaranteed to resolve a minimum of (IPA_SHIFT - 4) bits (when the entry
level is fully used, and more bits otherwise). For an input address of
size IPA_SHIFT bits, we could cover the remaining 4 bits using the same
number of levels, or using the concatenation if needed."
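
As an illustration of that wording, here is a rough sketch that works out how many entry-level tables end up concatenated for a given IPA size. The helper names are invented for this example, and the level formula is again an assumed rounding-up division standing in for ARM64_HW_PGTABLE_LEVELS().

```c
#include <assert.h>

/* Levels for a normal table: (va_bits - page_shift) / (page_shift - 3), rounded up. */
static unsigned int hw_pgtable_levels(unsigned int va_bits, unsigned int page_shift)
{
	unsigned int b = page_shift - 3;

	assert(va_bits > page_shift);
	return (va_bits - page_shift + b - 1) / b;
}

/* Address bits covered by a normal table with `levels` fully used levels. */
static unsigned int bits_resolved(unsigned int levels, unsigned int page_shift)
{
	return page_shift + levels * (page_shift - 3);
}

/* Entry-level tables needed for ipa_bits at stage 2 (1 means no concatenation). */
static unsigned int entry_tables(unsigned int ipa_bits, unsigned int page_shift)
{
	unsigned int levels = hw_pgtable_levels(ipa_bits - 4, page_shift);
	unsigned int covered = bits_resolved(levels, page_shift);
	unsigned int extra = ipa_bits > covered ? ipa_bits - covered : 0;

	assert(extra <= 4);	/* never more than 16 concatenated tables */
	return 1u << extra;
}
```

For example, entry_tables(40, 12) is 2 (3 levels cover 39 bits, 1 bit left over), entry_tables(32, 16) is 8, and entry_tables(48, 12) is 1 because 4 levels already cover all 48 bits.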

Does that help?

Suzuki

^ permalink raw reply	[flat|nested] 18+ messages in thread

* [PATCH] KVM: arm64: Clarify explanation of STAGE2_PGTABLE_LEVELS
@ 2018-11-05 15:00       ` Suzuki K Poulose
  0 siblings, 0 replies; 18+ messages in thread
From: Suzuki K Poulose @ 2018-11-05 15:00 UTC (permalink / raw)
  To: linux-arm-kernel



On 02/11/18 14:25, Christoffer Dall wrote:
> On Fri, Nov 02, 2018 at 11:02:38AM +0000, Suzuki K Poulose wrote:
>> Hi
>>
>> On 02/11/18 07:53, Christoffer Dall wrote:
>>> In attempting to re-construct the logic for our stage 2 page table
>>> layout I found the reaoning in the comment explaining how we calculate
>>> the number of levels used for stage 2 page tables a bit backwards.
>>>
>>> This commit attempts to clarify the comment, to make it slightly easier
>>> to read without having the Arm ARM open on the right page.
>>>
>>> While we're at it, fixup a typo in a comment that was recently changed.
>>>
>>> Signed-off-by: Christoffer Dall <christoffer.dall@arm.com>
>>> ---
>>>   arch/arm64/include/asm/stage2_pgtable.h | 17 ++++++++++-------
>>>   virt/kvm/arm/mmu.c                      |  2 +-
>>>   2 files changed, 11 insertions(+), 8 deletions(-)
>>>
>>> diff --git a/arch/arm64/include/asm/stage2_pgtable.h b/arch/arm64/include/asm/stage2_pgtable.h
>>> index d352f6df8d2c..9c387320b28c 100644
>>> --- a/arch/arm64/include/asm/stage2_pgtable.h
>>> +++ b/arch/arm64/include/asm/stage2_pgtable.h
>>> @@ -31,15 +31,18 @@
>>>   /*
>>>    * The hardware supports concatenation of up to 16 tables at stage2 entry level
>>> - * and we use the feature whenever possible.
>>> + * and we use the feature whenever possible, which means we resolve 4 bits of
>>
>> s/we resolve 4 bits/we resolve 4 additional bits/ ?
>>
> 
> yes
> 
>>> + * address at the entry level.
>>>    *
>>> - * Now, the minimum number of bits resolved at any level is (PAGE_SHIFT - 3).
>>> + * This implies, the total number of page table levels required for
>>> + * IPA_SHIFT at stage2 expected by the hardware can be calculated using
>>> + * the same logic used for the (non-collapsable) stage1 page tables but for
>>> + * (IPA_SHIFT - 4).
>>> + *
>>> + * Note, the minimum number of bits resolved at any level is (PAGE_SHIFT - 3).
>>
>> May be we could improve it further by :
>>
>> s/resolved at any level/resolved at any *non-entry* level/
>>
>> as, we could resolve as small as 1 bit at the entry level.
>>
>>
> 
> yes
> 
>>>    * On arm64, the smallest PAGE_SIZE supported is 4k, which means
>>> - *             (PAGE_SHIFT - 3) > 4 holds for all page sizes.
>>> - * This implies, the total number of page table levels at stage2 expected
>>> - * by the hardware is actually the number of levels required for (IPA_SHIFT - 4)
>>> - * in normal translations(e.g, stage1), since we cannot have another level in
>>> - * the range (IPA_SHIFT, IPA_SHIFT - 4).
>>> + *             (PAGE_SHIFT - 3) > 4 holds for all page sizes
>>> + * and therefore we will need a minimum of two levels for stage2 in all cases.
>>
>> I think the above statement is misleading. The minimum number of
>> levels has nothing to do with the concatenation.
> 
> Architecturally surely it does?  (The point of concatenation is to
> reduce the minimal number of levels required.)
> 
> Maybe you mean the minimum number of levels imposed by KVM here?
> 
>> For e.g, we could
>> still create a stage2 with 1 level (32bit IPA on 64K, 29bit + 3bit
>> concatenated), going by the same rules above. The only reason why
>> we limit the number of levels to 2, is to prevent splitting stage1 PMD
>> huge mappings (which are quite common) at stage2.
>>
> 
> So I wasn't entirely clear what the comment was trying to say with the
> "(PAGE_SHIFT - 3) > 4 holds for all page size" statement, so I though
> that was there to show that we'll need a minimum of two levels, but
> maybe that was written under the assumption of the limitations of
> IPA_SHIFT (was KVM_PHYS_SIZE).

See below.

> 
> Since you wrote the original comment, and I couldn't correctly parse
> that, and I apparently still didn't fully understand, can you suggest an
> alternative wording?

I think trying to over-explain the concept has created more confusion.
The whole paragraph is trying to prove that we only need:

	ARM64_HW_PGTABLE_LEVELS(IPA_SHIFT - 4)
to map IPA_SHIFT bits at stage2 with maximum utilization of the
concatenation at entry level.
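
For illustration, the calculation above can be sketched in plain C. This
is only a minimal sketch, not the kernel header: hw_pgtable_levels() and
stage2_levels() are hypothetical helpers that take the page shift as a
parameter, and the arithmetic assumes the usual definition of
ARM64_HW_PGTABLE_LEVELS(), i.e. (va_bits - 4) / (PAGE_SHIFT - 3).

```c
#include <assert.h>

/*
 * Hypothetical stand-in for ARM64_HW_PGTABLE_LEVELS(), with the page
 * shift passed in explicitly. Assumed arithmetic:
 *   DIV_ROUND_UP(bits - page_shift, page_shift - 3)
 *     == (bits - 4) / (page_shift - 3)
 */
static inline int hw_pgtable_levels(int page_shift, int addr_bits)
{
	return (addr_bits - 4) / (page_shift - 3);
}

/*
 * Stage 2 with full use of entry-level concatenation: the entry level
 * absorbs up to 4 extra bits, so size the table for (ipa_shift - 4).
 */
static inline int stage2_levels(int page_shift, int ipa_shift)
{
	return hw_pgtable_levels(page_shift, ipa_shift - 4);
}

/* Spot checks for a few configurations (illustrative only). */
static inline void stage2_levels_selfcheck(void)
{
	assert(hw_pgtable_levels(12, 48) == 4);	/* 4K pages, 48 bits */
	assert(stage2_levels(12, 40) == 3);	/* 4K pages, 40-bit IPA */
	assert(stage2_levels(16, 40) == 2);	/* 64K pages, 40-bit IPA */
	assert(stage2_levels(16, 32) == 1);	/* 64K pages, 32-bit IPA */
}
```

The last case is the 1 level layout mentioned earlier in the thread
(32bit IPA on 64K), before any minimum imposed by KVM is applied.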

Right now the comment tries to establish it via the hard route, by
proving that there cannot be an intermediate level in the range
[IPA_SHIFT, IPA_SHIFT - 4]. Or in other words:

(ARM64_HW_PGTABLE_LEVELS(IPA_SHIFT - 4) + 1) >=
			ARM64_HW_PGTABLE_LEVELS(IPA_SHIFT)

I don't know if the explanation is worth it if it only causes further
confusion.

Maybe I could replace the confusing text with something like:

"A normal page table with ARM64_HW_PGTABLE_LEVELS(IPA_SHIFT - 4) levels
is guaranteed to resolve a minimum of (IPA_SHIFT - 4) bits (when the entry
level is fully used, and more bits otherwise). For an input address of
size IPA_SHIFT bits, we could cover the remaining 4 bits using the same
number of levels or using the concatenation if needed."

Does that help?

Suzuki


* Re: [PATCH] KVM: arm64: Clarify explanation of STAGE2_PGTABLE_LEVELS
  2018-11-05 15:00       ` Suzuki K Poulose
@ 2018-11-06  8:42         ` Christoffer Dall
  -1 siblings, 0 replies; 18+ messages in thread
From: Christoffer Dall @ 2018-11-06  8:42 UTC (permalink / raw)
  To: Suzuki K Poulose; +Cc: Marc Zyngier, kvmarm, linux-arm-kernel, kvm

On Mon, Nov 05, 2018 at 03:00:34PM +0000, Suzuki K Poulose wrote:
> 
> 
> On 02/11/18 14:25, Christoffer Dall wrote:
> >On Fri, Nov 02, 2018 at 11:02:38AM +0000, Suzuki K Poulose wrote:
> >>Hi
> >>
> >>On 02/11/18 07:53, Christoffer Dall wrote:
> >>>In attempting to re-construct the logic for our stage 2 page table
> >>>layout I found the reaoning in the comment explaining how we calculate
> >>>the number of levels used for stage 2 page tables a bit backwards.
> >>>
> >>>This commit attempts to clarify the comment, to make it slightly easier
> >>>to read without having the Arm ARM open on the right page.
> >>>
> >>>While we're at it, fixup a typo in a comment that was recently changed.
> >>>
> >>>Signed-off-by: Christoffer Dall <christoffer.dall@arm.com>
> >>>---
> >>>  arch/arm64/include/asm/stage2_pgtable.h | 17 ++++++++++-------
> >>>  virt/kvm/arm/mmu.c                      |  2 +-
> >>>  2 files changed, 11 insertions(+), 8 deletions(-)
> >>>
> >>>diff --git a/arch/arm64/include/asm/stage2_pgtable.h b/arch/arm64/include/asm/stage2_pgtable.h
> >>>index d352f6df8d2c..9c387320b28c 100644
> >>>--- a/arch/arm64/include/asm/stage2_pgtable.h
> >>>+++ b/arch/arm64/include/asm/stage2_pgtable.h
> >>>@@ -31,15 +31,18 @@
> >>>  /*
> >>>   * The hardware supports concatenation of up to 16 tables at stage2 entry level
> >>>- * and we use the feature whenever possible.
> >>>+ * and we use the feature whenever possible, which means we resolve 4 bits of
> >>
> >>s/we resolve 4 bits/we resolve 4 additional bits/ ?
> >>
> >
> >yes
> >
> >>>+ * address at the entry level.
> >>>   *
> >>>- * Now, the minimum number of bits resolved at any level is (PAGE_SHIFT - 3).
> >>>+ * This implies, the total number of page table levels required for
> >>>+ * IPA_SHIFT at stage2 expected by the hardware can be calculated using
> >>>+ * the same logic used for the (non-collapsable) stage1 page tables but for
> >>>+ * (IPA_SHIFT - 4).
> >>>+ *
> >>>+ * Note, the minimum number of bits resolved at any level is (PAGE_SHIFT - 3).
> >>
> >>May be we could improve it further by :
> >>
> >>s/resolved at any level/resolved at any *non-entry* level/
> >>
> >>as, we could resolve as small as 1 bit at the entry level.
> >>
> >>
> >
> >yes
> >
> >>>   * On arm64, the smallest PAGE_SIZE supported is 4k, which means
> >>>- *             (PAGE_SHIFT - 3) > 4 holds for all page sizes.
> >>>- * This implies, the total number of page table levels at stage2 expected
> >>>- * by the hardware is actually the number of levels required for (IPA_SHIFT - 4)
> >>>- * in normal translations(e.g, stage1), since we cannot have another level in
> >>>- * the range (IPA_SHIFT, IPA_SHIFT - 4).
> >>>+ *             (PAGE_SHIFT - 3) > 4 holds for all page sizes
> >>>+ * and therefore we will need a minimum of two levels for stage2 in all cases.
> >>
> >>I think the above statement is misleading. The minimum number of
> >>levels has nothing to do with the concatenation.
> >
> >Architecturally surely it does?  (The point of concatenation is to
> >reduce the minimal number of levels required.)
> >
> >Maybe you mean the minimum number of levels imposed by KVM here?
> >
> >>For e.g, we could
> >>still create a stage2 with 1 level (32bit IPA on 64K, 29bit + 3bit
> >>concatenated), going by the same rules above. The only reason why
> >>we limit the number of levels to 2, is to prevent splitting stage1 PMD
> >>huge mappings (which are quite common) at stage2.
> >>
> >
> >So I wasn't entirely clear what the comment was trying to say with the
> >"(PAGE_SHIFT - 3) > 4 holds for all page size" statement, so I though
> >that was there to show that we'll need a minimum of two levels, but
> >maybe that was written under the assumption of the limitations of
> >IPA_SHIFT (was KVM_PHYS_SIZE).
> 
> See below.
> 
> >
> >Since you wrote the original comment, and I couldn't correctly parse
> >that, and I apparently still didn't fully understand, can you suggest an
> >alternative wording?
> 
> I think trying to over explain the concept has created more confusion.
> The whole paragraph is trying to prove that we only need :
> 
> 	ARM64_HW_PGTABLE_LEVELS(IPA_SHIFT - 4)
> to map IPA_SHIFT bits at stage2 with maximum utilization of the
> concatenation at entry level.

Yes, I think that comes across more clearly in my rewording up to the
"Note, ...".

> 
> Right now the comment tries to establish it via the hard route, by
> proving that there cannot be an intermediate level in the range
> [IPA_SHIFT, IPA_SHIFT - 4]. Or in other words :
> 
> (ARM64_HW_PGTABLE_LEVELS(IPA_SHIFT - 4) + 1) >= 	
> 			ARM64_HW_PGTABLE_LEVELS(IPA_SHIFT)
> 
> I don't know if it is worth the explanation and then causing further
> confusion.

I think you're then actually trying to explain two things.

First, we can get the number of levels by using the stage 1 calculation
and adjusting for concatenation by subtracting 4 from the number of bits
we need to translate.

Second, some further reasoning about *why* that is true.

It remains unclear to me exactly what your point about
    '(PAGE_SHIFT - 3) > 4'
is and how that supports the second point.  Also I'm not entirely sure
why we need that.

I was trying to preserve all the information you had in the original
comment (assuming it was important), but I honestly think that we only
need to explain the first part (because the confusing part of the code
is the reuse of a stage 1 macro and subtracting 4).
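
To make the subtraction of 4 concrete, here is a rough sketch of how many
entry-level tables end up concatenated for a given configuration. The
helper name stage2_entry_tables() is made up for this example, and it
assumes the level count is the stage 1 calculation applied to
(IPA_SHIFT - 4), with each level resolving (PAGE_SHIFT - 3) bits.

```c
#include <assert.h>

/*
 * Hypothetical helper: number of tables concatenated at the stage 2
 * entry level for a given page shift and IPA size.
 */
static inline int stage2_entry_tables(int page_shift, int ipa_shift)
{
	int bits_per_level = page_shift - 3;
	/* Levels for (ipa_shift - 4), using the assumed stage 1 formula. */
	int levels = (ipa_shift - 4 - 4) / bits_per_level;
	/* Bits covered by that many levels without concatenation. */
	int resolved = page_shift + levels * bits_per_level;
	/* Left-over bits (at most 4) are absorbed by concatenation. */
	int extra = ipa_shift - resolved;

	return extra > 0 ? 1 << extra : 1;
}

/* Spot checks (illustrative only). */
static inline void stage2_entry_tables_selfcheck(void)
{
	assert(stage2_entry_tables(12, 40) == 2);	/* 39 bits + 1 bit */
	assert(stage2_entry_tables(16, 32) == 8);	/* 29 bits + 3 bits */
	assert(stage2_entry_tables(12, 43) == 16);	/* 39 bits + 4 bits */
	assert(stage2_entry_tables(12, 48) == 1);	/* no concatenation */
}
```

The second check matches the "29bit + 3bit concatenated" case quoted
earlier in the thread.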

> 
> May be I could replace the confusing text with something like :
> 
> "A normal page table with ARM64_HW_PGTABLE_LEVELS(IPA_SHIFT - 4) levels
> is guaranteed to resolve minimum of (IPA_SHIFT - 4)bits (when the entry
> level is fully used, and more bits otherwise.). For an input address of
> size IPA_SHIFT bits, we could cover the remaining 4 bits using the same
> number of levels or using the concatenation if needed."
> 

I prefer my version above to explain this particular aspect :)

But this seems to exclude the stuff you originally had about
'(PAGE_SHIFT - 3) > 4'?


Thanks,

    Christoffer


* Re: [PATCH] KVM: arm64: Clarify explanation of STAGE2_PGTABLE_LEVELS
  2018-11-06  8:42         ` Christoffer Dall
@ 2018-11-06  9:52           ` Suzuki K Poulose
  -1 siblings, 0 replies; 18+ messages in thread
From: Suzuki K Poulose @ 2018-11-06  9:52 UTC (permalink / raw)
  To: Christoffer Dall; +Cc: Marc Zyngier, kvmarm, linux-arm-kernel, kvm



On 06/11/2018 08:42, Christoffer Dall wrote:
> On Mon, Nov 05, 2018 at 03:00:34PM +0000, Suzuki K Poulose wrote:
>>
>>
>> On 02/11/18 14:25, Christoffer Dall wrote:
>>> On Fri, Nov 02, 2018 at 11:02:38AM +0000, Suzuki K Poulose wrote:
>>>> Hi
>>>>
>>>> On 02/11/18 07:53, Christoffer Dall wrote:
>>>>> In attempting to re-construct the logic for our stage 2 page table
>>>>> layout I found the reaoning in the comment explaining how we calculate
>>>>> the number of levels used for stage 2 page tables a bit backwards.
>>>>>
>>>>> This commit attempts to clarify the comment, to make it slightly easier
>>>>> to read without having the Arm ARM open on the right page.
>>>>>
>>>>> While we're at it, fixup a typo in a comment that was recently changed.
>>>>>
>>>>> Signed-off-by: Christoffer Dall <christoffer.dall@arm.com>
>>>>> ---
>>>>>   arch/arm64/include/asm/stage2_pgtable.h | 17 ++++++++++-------
>>>>>   virt/kvm/arm/mmu.c                      |  2 +-
>>>>>   2 files changed, 11 insertions(+), 8 deletions(-)
>>>>>
>>>>> diff --git a/arch/arm64/include/asm/stage2_pgtable.h b/arch/arm64/include/asm/stage2_pgtable.h
>>>>> index d352f6df8d2c..9c387320b28c 100644
>>>>> --- a/arch/arm64/include/asm/stage2_pgtable.h
>>>>> +++ b/arch/arm64/include/asm/stage2_pgtable.h
>>>>> @@ -31,15 +31,18 @@
>>>>>   /*
>>>>>    * The hardware supports concatenation of up to 16 tables at stage2 entry level
>>>>> - * and we use the feature whenever possible.
>>>>> + * and we use the feature whenever possible, which means we resolve 4 bits of
>>>>
>>>> s/we resolve 4 bits/we resolve 4 additional bits/ ?
>>>>
>>>
>>> yes
>>>
>>>>> + * address at the entry level.
>>>>>    *
>>>>> - * Now, the minimum number of bits resolved at any level is (PAGE_SHIFT - 3).
>>>>> + * This implies, the total number of page table levels required for
>>>>> + * IPA_SHIFT at stage2 expected by the hardware can be calculated using
>>>>> + * the same logic used for the (non-collapsable) stage1 page tables but for
>>>>> + * (IPA_SHIFT - 4).
>>>>> + *
>>>>> + * Note, the minimum number of bits resolved at any level is (PAGE_SHIFT - 3).
>>>>
>>>> May be we could improve it further by :
>>>>
>>>> s/resolved at any level/resolved at any *non-entry* level/
>>>>
>>>> as, we could resolve as small as 1 bit at the entry level.
>>>>
>>>>
>>>
>>> yes
>>>
>>>>>    * On arm64, the smallest PAGE_SIZE supported is 4k, which means
>>>>> - *             (PAGE_SHIFT - 3) > 4 holds for all page sizes.
>>>>> - * This implies, the total number of page table levels at stage2 expected
>>>>> - * by the hardware is actually the number of levels required for (IPA_SHIFT - 4)
>>>>> - * in normal translations(e.g, stage1), since we cannot have another level in
>>>>> - * the range (IPA_SHIFT, IPA_SHIFT - 4).
>>>>> + *             (PAGE_SHIFT - 3) > 4 holds for all page sizes
>>>>> + * and therefore we will need a minimum of two levels for stage2 in all cases.
>>>>
>>>> I think the above statement is misleading. The minimum number of
>>>> levels has nothing to do with the concatenation.
>>>
>>> Architecturally surely it does?  (The point of concatenation is to
>>> reduce the minimal number of levels required.)
>>>
>>> Maybe you mean the minimum number of levels imposed by KVM here?
>>>
>>>> For e.g, we could
>>>> still create a stage2 with 1 level (32bit IPA on 64K, 29bit + 3bit
>>>> concatenated), going by the same rules above. The only reason why
>>>> we limit the number of levels to 2, is to prevent splitting stage1 PMD
>>>> huge mappings (which are quite common) at stage2.
>>>>
>>>
>>> So I wasn't entirely clear what the comment was trying to say with the
>>> "(PAGE_SHIFT - 3) > 4 holds for all page size" statement, so I though
>>> that was there to show that we'll need a minimum of two levels, but
>>> maybe that was written under the assumption of the limitations of
>>> IPA_SHIFT (was KVM_PHYS_SIZE).
>>
>> See below.
>>
>>>
>>> Since you wrote the original comment, and I couldn't correctly parse
>>> that, and I apparently still didn't fully understand, can you suggest an
>>> alternative wording?
>>
>> I think trying to over explain the concept has created more confusion.
>> The whole paragraph is trying to prove that we only need :
>>
>> 	ARM64_HW_PGTABLE_LEVELS(IPA_SHIFT - 4)
>> to map IPA_SHIFT bits at stage2 with maximum utilization of the
>> concatenation at entry level.
> 
> Yes, I think that comes across more clearly in my rewording up to the
> "Note, ...".

Yes it does, and I prefer your version to mine.

> 
>>
>> Right now the comment tries to establish it via the hard route, by
>> proving that there cannot be an intermediate level in the range
>> [IPA_SHIFT, IPA_SHIFT - 4]. Or in other words :
>>
>> (ARM64_HW_PGTABLE_LEVELS(IPA_SHIFT - 4) + 1) >= 	
>> 			ARM64_HW_PGTABLE_LEVELS(IPA_SHIFT)
>>
>> I don't know if it is worth the explanation and then causing further
>> confusion.
> 
> I think you're then actually trying to explain two things.
> 
> First, we can get the number of levels by using the stage 1 calculation
> and adjust for concatenation by subtracting 4 from the number of bits we
> need to translate.
> 
> Second, some further reasoning about *why* that is true.
> 
> It remains unclear to me exactly what your point about
>      '(PAGE_SHIFT - 3) > 4'
> is and how that supports the second point.  Also I'm not entirely sure
> why we need that.

So, we have to prove that:
   x + 1 >= y, where:

  x = ARM64_HW_PGTABLE_LEVELS(IPA_SHIFT - 4)
  y = ARM64_HW_PGTABLE_LEVELS(IPA_SHIFT)

We can prove it by contradiction, i.e. let us assume y > x + 1.
Then there is an additional level between (y) and (x + 1), which
implies there is an intermediate level covering only "bits" of the
input address in the range [IPA_SHIFT, IPA_SHIFT - 4]. But an
intermediate level must resolve exactly (PAGE_SHIFT - 3) bits (9 bits
minimum), and that range holds only 4 bits, so that level can't exist.
Thus our assumption can never be true.

I guess we don't need to go to that level of craziness, just to prove it
formally.
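
For anyone who prefers a brute-force check over the argument above, here
is a small sketch that walks the three arm64 page sizes and a range of
IPA sizes and counts the cases where x + 1 >= y fails. The helper names
and the 32..52 bit IPA range are assumptions made for this example, and
hw_pgtable_levels() is a stand-in for ARM64_HW_PGTABLE_LEVELS() with the
page shift passed explicitly.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed arithmetic of ARM64_HW_PGTABLE_LEVELS(), page shift explicit. */
static inline int hw_pgtable_levels(int page_shift, int addr_bits)
{
	return (addr_bits - 4) / (page_shift - 3);
}

/* Count configurations that would violate x + 1 >= y. */
static inline int count_level_violations(void)
{
	static const int page_shifts[] = { 12, 14, 16 };	/* 4K, 16K, 64K */
	int bad = 0;

	for (size_t i = 0; i < sizeof(page_shifts) / sizeof(page_shifts[0]); i++) {
		for (int ipa_shift = 32; ipa_shift <= 52; ipa_shift++) {
			int x = hw_pgtable_levels(page_shifts[i], ipa_shift - 4);
			int y = hw_pgtable_levels(page_shifts[i], ipa_shift);

			if (x + 1 < y)
				bad++;
		}
	}
	return bad;
}

/* The inequality should hold everywhere in the range checked. */
static inline void levels_selfcheck(void)
{
	assert(count_level_violations() == 0);
}
```

This works because the two arguments differ by 4 while every level
resolves at least 9 bits, so the integer division can step by at most
one between them.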

> 
> I was trying to preserve all the information you had in the original
> comment (assuming it was important), but I honestly think that we only
> need to explain the first part (because the confusing part of the code
> is the reuse of a stage 1 macro and subtracting 4).
> 
>>
>> May be I could replace the confusing text with something like :
>>
>> "A normal page table with ARM64_HW_PGTABLE_LEVELS(IPA_SHIFT - 4) levels
>> is guaranteed to resolve minimum of (IPA_SHIFT - 4)bits (when the entry
>> level is fully used, and more bits otherwise.). For an input address of
>> size IPA_SHIFT bits, we could cover the remaining 4 bits using the same
>> number of levels or using the concatenation if needed."
>>
> 
> I prefer my version above to explain this particular aspect :)
> 
> But this seems to exclude the stuff you originally had about
> (PAGE_SHIFT - 3) > 4' ?

Does the above make it clear?

Cheers
Suzuki


* Re: [PATCH] KVM: arm64: Clarify explanation of STAGE2_PGTABLE_LEVELS
  2018-11-06  9:52           ` Suzuki K Poulose
@ 2018-11-06 11:45             ` Christoffer Dall
  -1 siblings, 0 replies; 18+ messages in thread
From: Christoffer Dall @ 2018-11-06 11:45 UTC (permalink / raw)
  To: Suzuki K Poulose; +Cc: Marc Zyngier, kvmarm, linux-arm-kernel, kvm

On Tue, Nov 06, 2018 at 09:52:59AM +0000, Suzuki K Poulose wrote:
> 
> 
> On 06/11/2018 08:42, Christoffer Dall wrote:
> >On Mon, Nov 05, 2018 at 03:00:34PM +0000, Suzuki K Poulose wrote:
> >>
> >>
> >>On 02/11/18 14:25, Christoffer Dall wrote:
> >>>On Fri, Nov 02, 2018 at 11:02:38AM +0000, Suzuki K Poulose wrote:
> >>>>Hi
> >>>>
> >>>>On 02/11/18 07:53, Christoffer Dall wrote:
> >>>>>In attempting to re-construct the logic for our stage 2 page table
> >>>>>layout I found the reaoning in the comment explaining how we calculate
> >>>>>the number of levels used for stage 2 page tables a bit backwards.
> >>>>>
> >>>>>This commit attempts to clarify the comment, to make it slightly easier
> >>>>>to read without having the Arm ARM open on the right page.
> >>>>>
> >>>>>While we're at it, fixup a typo in a comment that was recently changed.
> >>>>>
> >>>>>Signed-off-by: Christoffer Dall <christoffer.dall@arm.com>
> >>>>>---
> >>>>>  arch/arm64/include/asm/stage2_pgtable.h | 17 ++++++++++-------
> >>>>>  virt/kvm/arm/mmu.c                      |  2 +-
> >>>>>  2 files changed, 11 insertions(+), 8 deletions(-)
> >>>>>
> >>>>>diff --git a/arch/arm64/include/asm/stage2_pgtable.h b/arch/arm64/include/asm/stage2_pgtable.h
> >>>>>index d352f6df8d2c..9c387320b28c 100644
> >>>>>--- a/arch/arm64/include/asm/stage2_pgtable.h
> >>>>>+++ b/arch/arm64/include/asm/stage2_pgtable.h
> >>>>>@@ -31,15 +31,18 @@
> >>>>>  /*
> >>>>>   * The hardware supports concatenation of up to 16 tables at stage2 entry level
> >>>>>- * and we use the feature whenever possible.
> >>>>>+ * and we use the feature whenever possible, which means we resolve 4 bits of
> >>>>
> >>>>s/we resolve 4 bits/we resolve 4 additional bits/ ?
> >>>>
> >>>
> >>>yes
> >>>
> >>>>>+ * address at the entry level.
> >>>>>   *
> >>>>>- * Now, the minimum number of bits resolved at any level is (PAGE_SHIFT - 3).
> >>>>>+ * This implies, the total number of page table levels required for
> >>>>>+ * IPA_SHIFT at stage2 expected by the hardware can be calculated using
> >>>>>+ * the same logic used for the (non-collapsable) stage1 page tables but for
> >>>>>+ * (IPA_SHIFT - 4).
> >>>>>+ *
> >>>>>+ * Note, the minimum number of bits resolved at any level is (PAGE_SHIFT - 3).
> >>>>
> >>>>May be we could improve it further by :
> >>>>
> >>>>s/resolved at any level/resolved at any *non-entry* level/
> >>>>
> >>>>as, we could resolve as small as 1 bit at the entry level.
> >>>>
> >>>>
> >>>
> >>>yes
> >>>
> >>>>>   * On arm64, the smallest PAGE_SIZE supported is 4k, which means
> >>>>>- *             (PAGE_SHIFT - 3) > 4 holds for all page sizes.
> >>>>>- * This implies, the total number of page table levels at stage2 expected
> >>>>>- * by the hardware is actually the number of levels required for (IPA_SHIFT - 4)
> >>>>>- * in normal translations(e.g, stage1), since we cannot have another level in
> >>>>>- * the range (IPA_SHIFT, IPA_SHIFT - 4).
> >>>>>+ *             (PAGE_SHIFT - 3) > 4 holds for all page sizes
> >>>>>+ * and therefore we will need a minimum of two levels for stage2 in all cases.
> >>>>
> >>>>I think the above statement is misleading. The minimum number of
> >>>>levels has nothing to do with the concatenation.
> >>>
> >>>Architecturally surely it does?  (The point of concatenation is to
> >>>reduce the minimal number of levels required.)
> >>>
> >>>Maybe you mean the minimum number of levels imposed by KVM here?
> >>>
> >>>>For e.g, we could
> >>>>still create a stage2 with 1 level (32bit IPA on 64K, 29bit + 3bit
> >>>>concatenated), going by the same rules above. The only reason why
> >>>>we limit the number of levels to 2, is to prevent splitting stage1 PMD
> >>>>huge mappings (which are quite common) at stage2.
> >>>>
> >>>
> >>>So I wasn't entirely clear what the comment was trying to say with the
> >>>"(PAGE_SHIFT - 3) > 4 holds for all page size" statement, so I though
> >>>that was there to show that we'll need a minimum of two levels, but
> >>>maybe that was written under the assumption of the limitations of
> >>>IPA_SHIFT (was KVM_PHYS_SIZE).
> >>
> >>See below.
> >>
> >>>
> >>>Since you wrote the original comment, and I couldn't correctly parse
> >>>that, and I apparently still didn't fully understand, can you suggest an
> >>>alternative wording?
> >>
> >>I think trying to over explain the concept has created more confusion.
> >>The whole paragraph is trying to prove that we only need :
> >>
> >>	ARM64_HW_PGTABLE_LEVELS(IPA_SHIFT - 4)
> >>to map IPA_SHIFT bits at stage2 with maximum utilization of the
> >>concatenation at entry level.
> >
> >Yes, I think that comes across more clearly in my rewording up to the
> >"Note, ...".
> 
> Yes it does and I prefer your version than mine.
> 
> >
> >>
> >>Right now the comment tries to establish it via the hard route, by
> >>proving that there cannot be an intermediate level in the range
> >>[IPA_SHIFT, IPA_SHIFT - 4]. Or in other words :
> >>
> >>(ARM64_HW_PGTABLE_LEVELS(IPA_SHIFT - 4) + 1) >= 	
> >>			ARM64_HW_PGTABLE_LEVELS(IPA_SHIFT)
> >>
> >>I don't know if it is worth the explanation and then causing further
> >>confusion.
> >
> >I think you're then actually trying to explain two things.
> >
> >First, we can get the number of levels by using the stage 1 calculation
> >and adjust for concatenation by subtracting 4 from the number of bits we
> >need to translate.
> >
> >Second, some further reasoning about *why* that is true.
> >
> >It remains unclear to me exactly what your point about
> >     '(PAGE_SHIFT - 3) > 4'
> >is and how that supports the second point.  Also I'm not entirely sure
> >why we need that.
> 
> So, we have to prove that :
>   x + 1 >= y, where :

Why do we have to prove that?

> 
>  x = ARM64_HW_PGTABLE_LEVELS(IPA_SHIFT - 4)
>  y = ARM64_HW_PGTABLE_LEVELS(IPA_SHIFT)
> 
> We can prove it by contradiction. i.e let us assume
> y > x + 1, => There is an additional level between,
> (y) and (x + 1), which implies, there is an intermediate
> level covering the "bits" of the input address in the
> range [IPA_SHIFT, IPA_SHIFT - 4]. But since we must
> resolve exactly (PAGE_SHIFT - 3) bits (9bits minimum) in an intermediate
> level, that level can't exist. And thus our assumption can never
> be true.
> 
> I guess we don't need to go to that level of craziness, just to prove it
> formally.
> 

I can follow the math to prove x + 1 >= y, but I'm not sure how that
helps me understand anything or how it relates to the architecture?

Are you trying to explain why the architecture works the way it does, or
prove a specific relationship between architectural concepts, or what is
it we're trying to achieve here?


Thanks,

    Christoffer

* Re: [PATCH] KVM: arm64: Clarify explanation of STAGE2_PGTABLE_LEVELS
  2018-11-06 11:45             ` Christoffer Dall
@ 2018-11-06 12:13               ` Suzuki K Poulose
  -1 siblings, 0 replies; 18+ messages in thread
From: Suzuki K Poulose @ 2018-11-06 12:13 UTC (permalink / raw)
  To: Christoffer Dall; +Cc: Marc Zyngier, kvmarm, linux-arm-kernel, kvm



On 06/11/2018 11:45, Christoffer Dall wrote:
> On Tue, Nov 06, 2018 at 09:52:59AM +0000, Suzuki K Poulose wrote:
>>
>>
>> On 06/11/2018 08:42, Christoffer Dall wrote:
>>> On Mon, Nov 05, 2018 at 03:00:34PM +0000, Suzuki K Poulose wrote:
>>>>
>>>>
>>>> On 02/11/18 14:25, Christoffer Dall wrote:
>>>>> On Fri, Nov 02, 2018 at 11:02:38AM +0000, Suzuki K Poulose wrote:
>>>>>> Hi
>>>>>>
>>>>>> On 02/11/18 07:53, Christoffer Dall wrote:
>>>>>>> In attempting to re-construct the logic for our stage 2 page table
>>>>>>> layout I found the reaoning in the comment explaining how we calculate
>>>>>>> the number of levels used for stage 2 page tables a bit backwards.
>>>>>>>
>>>>>>> This commit attempts to clarify the comment, to make it slightly easier
>>>>>>> to read without having the Arm ARM open on the right page.
>>>>>>>
>>>>>>> While we're at it, fixup a typo in a comment that was recently changed.
>>>>>>>
>>>>>>> Signed-off-by: Christoffer Dall <christoffer.dall@arm.com>
>>>>>>> ---
>>>>>>>   arch/arm64/include/asm/stage2_pgtable.h | 17 ++++++++++-------
>>>>>>>   virt/kvm/arm/mmu.c                      |  2 +-
>>>>>>>   2 files changed, 11 insertions(+), 8 deletions(-)
>>>>>>>
>>>>>>> diff --git a/arch/arm64/include/asm/stage2_pgtable.h b/arch/arm64/include/asm/stage2_pgtable.h
>>>>>>> index d352f6df8d2c..9c387320b28c 100644
>>>>>>> --- a/arch/arm64/include/asm/stage2_pgtable.h
>>>>>>> +++ b/arch/arm64/include/asm/stage2_pgtable.h
>>>>>>> @@ -31,15 +31,18 @@
>>>>>>>   /*
>>>>>>>    * The hardware supports concatenation of up to 16 tables at stage2 entry level
>>>>>>> - * and we use the feature whenever possible.
>>>>>>> + * and we use the feature whenever possible, which means we resolve 4 bits of
>>>>>>
>>>>>> s/we resolve 4 bits/we resolve 4 additional bits/ ?
>>>>>>
>>>>>
>>>>> yes
>>>>>
>>>>>>> + * address at the entry level.
>>>>>>>    *
>>>>>>> - * Now, the minimum number of bits resolved at any level is (PAGE_SHIFT - 3).
>>>>>>> + * This implies, the total number of page table levels required for
>>>>>>> + * IPA_SHIFT at stage2 expected by the hardware can be calculated using
>>>>>>> + * the same logic used for the (non-collapsable) stage1 page tables but for
>>>>>>> + * (IPA_SHIFT - 4).
>>>>>>> + *
>>>>>>> + * Note, the minimum number of bits resolved at any level is (PAGE_SHIFT - 3).
>>>>>>
>>>>>> May be we could improve it further by :
>>>>>>
>>>>>> s/resolved at any level/resolved at any *non-entry* level/
>>>>>>
>>>>>> as, we could resolve as small as 1 bit at the entry level.
>>>>>>
>>>>>>
>>>>>
>>>>> yes
>>>>>
>>>>>>>    * On arm64, the smallest PAGE_SIZE supported is 4k, which means
>>>>>>> - *             (PAGE_SHIFT - 3) > 4 holds for all page sizes.
>>>>>>> - * This implies, the total number of page table levels at stage2 expected
>>>>>>> - * by the hardware is actually the number of levels required for (IPA_SHIFT - 4)
>>>>>>> - * in normal translations(e.g, stage1), since we cannot have another level in
>>>>>>> - * the range (IPA_SHIFT, IPA_SHIFT - 4).
>>>>>>> + *             (PAGE_SHIFT - 3) > 4 holds for all page sizes
>>>>>>> + * and therefore we will need a minimum of two levels for stage2 in all cases.
>>>>>>
>>>>>> I think the above statement is misleading. The minimum number of
>>>>>> levels has nothing to do with the concatenation.
>>>>>
>>>>> Architecturally surely it does?  (The point of concatenation is to
>>>>> reduce the minimal number of levels required.)
>>>>>
>>>>> Maybe you mean the minimum number of levels imposed by KVM here?
>>>>>
>>>>>> For e.g, we could
>>>>>> still create a stage2 with 1 level (32bit IPA on 64K, 29bit + 3bit
>>>>>> concatenated), going by the same rules above. The only reason why
>>>>>> we limit the number of levels to 2, is to prevent splitting stage1 PMD
>>>>>> huge mappings (which are quite common) at stage2.
>>>>>>
>>>>>
>>>>> So I wasn't entirely clear what the comment was trying to say with the
>>>>> "(PAGE_SHIFT - 3) > 4 holds for all page size" statement, so I though
>>>>> that was there to show that we'll need a minimum of two levels, but
>>>>> maybe that was written under the assumption of the limitations of
>>>>> IPA_SHIFT (was KVM_PHYS_SIZE).
>>>>
>>>> See below.
>>>>
>>>>>
>>>>> Since you wrote the original comment, and I couldn't correctly parse
>>>>> that, and I apparently still didn't fully understand, can you suggest an
>>>>> alternative wording?
>>>>
>>>> I think trying to over explain the concept has created more confusion.
>>>> The whole paragraph is trying to prove that we only need :
>>>>
>>>> 	ARM64_HW_PGTABLE_LEVELS(IPA_SHIFT - 4)
>>>> to map IPA_SHIFT bits at stage2 with maximum utilization of the
>>>> concatenation at entry level.
>>>
>>> Yes, I think that comes across more clearly in my rewording up to the
>>> "Note, ...".
>>
>> Yes it does and I prefer your version than mine.
>>
>>>
>>>>
>>>> Right now the comment tries to establish it via the hard route, by
>>>> proving that there cannot be an intermediate level in the range
>>>> [IPA_SHIFT, IPA_SHIFT - 4]. Or in other words :
>>>>
>>>> (ARM64_HW_PGTABLE_LEVELS(IPA_SHIFT - 4) + 1) >= 	
>>>> 			ARM64_HW_PGTABLE_LEVELS(IPA_SHIFT)
>>>>
>>>> I don't know if it is worth the explanation and then causing further
>>>> confusion.
>>>
>>> I think you're then actually trying to explain two things.
>>>
>>> First, we can get the number of levels by using the stage 1 calculation
>>> and adjust for concatenation by subtracting 4 from the number of bits we
>>> need to translate.
>>>
>>> Second, some further reasoning about *why* that is true.
>>>
>>> It remains unclear to me exactly what your point about
>>>      '(PAGE_SHIFT - 3) > 4'
>>> is and how that supports the second point.  Also I'm not entirely sure
>>> why we need that.
>>
>> So, we have to prove that :
>>    x + 1 >= y, where :
> 
> Why do we have to prove that?

To make sure that we never concatenate at a level where doing so is not
architecturally valid, i.e. where we would end up concatenating more
tables than the architecture supports. If we were to go two levels
down, we would be concatenating more than 16 tables at the entry
level, which the architecture does not support.

We need to be sure the logic is correct, because the stage2 page table
code does not explicitly check how many tables are concatenated for a
given IPA_SHIFT and page size combination.

I guess it is simpler to think of it the "non-formal" way: a normal
table with Levels(IPA_SHIFT - 4) leaves at most 4 bits unresolved, and
those are the bits covered by concatenation at the entry level.
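
For what it's worth, the informal argument can be checked with a small
standalone C sketch. This is not the kernel code: hw_pgtable_levels() is
a hypothetical helper that mirrors what ARM64_HW_PGTABLE_LEVELS() computes,
with PAGE_SHIFT passed in as a parameter, and stage2_levels(),
stage2_concat_tables() and check_stage2_levels() are names made up for
illustration. It assumes every non-entry level resolves exactly
(PAGE_SHIFT - 3) bits and that the input size is larger than PAGE_SHIFT.

```c
#include <assert.h>

/* Round-up integer division (same idea as the kernel's DIV_ROUND_UP()). */
static int div_round_up(int n, int d)
{
	return (n + d - 1) / d;
}

/*
 * Hypothetical stand-in for ARM64_HW_PGTABLE_LEVELS(): number of levels a
 * normal (non-concatenated) walk needs to resolve 'bits' of input address.
 * Assumes bits > page_shift.
 */
static int hw_pgtable_levels(int bits, int page_shift)
{
	return div_round_up(bits - page_shift, page_shift - 3);
}

/* Stage2 levels when up to 4 bits are left to entry-level concatenation. */
static int stage2_levels(int ipa_shift, int page_shift)
{
	return hw_pgtable_levels(ipa_shift - 4, page_shift);
}

/* Number of tables that end up concatenated at the entry level. */
static int stage2_concat_tables(int ipa_shift, int page_shift)
{
	int levels = stage2_levels(ipa_shift, page_shift);
	/* Bits resolved by 'levels' full tables plus the page offset. */
	int covered = page_shift + levels * (page_shift - 3);
	int extra = ipa_shift - covered;

	return extra > 0 ? 1 << extra : 1;
}

/* Check x + 1 >= y and the 16-table limit for 4K, 16K and 64K pages. */
static void check_stage2_levels(void)
{
	static const int page_shifts[] = { 12, 14, 16 };
	int i, ipa;

	for (i = 0; i < 3; i++) {
		for (ipa = 32; ipa <= 52; ipa++) {
			int x = hw_pgtable_levels(ipa - 4, page_shifts[i]);
			int y = hw_pgtable_levels(ipa, page_shifts[i]);

			assert(x + 1 >= y);
			assert(stage2_concat_tables(ipa, page_shifts[i]) <= 16);
		}
	}
}
```

With 4K pages and a 40-bit IPA this gives 3 levels with 2 concatenated
tables. With 64K pages and a 32-bit IPA it gives 1 level with 8
concatenated tables (29 bits + 3 bits), i.e. the case mentioned earlier
in the thread, before KVM's own minimum of 2 levels is applied.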

>>
>>   x = ARM64_HW_PGTABLE_LEVELS(IPA_SHIFT - 4)
>>   y = ARM64_HW_PGTABLE_LEVELS(IPA_SHIFT)
>>
>> We can prove it by contradiction. i.e let us assume
>> y > x + 1, => There is an additional level between,
>> (y) and (x + 1), which implies, there is an intermediate
>> level covering the "bits" of the input address in the
>> range [IPA_SHIFT, IPA_SHIFT - 4]. But since we must
>> resolve exactly (PAGE_SHIFT - 3) bits (9bits minimum) in an intermediate
>> level, that level can't exist. And thus our assumption can never
>> be true.
>>
>> I guess we don't need to go to that level of craziness, just to prove it
>> formally.
>>
> 
> I can follow the math to prove x + 1 >= y, but I'm not sure how that
> helps me understand anything or how it relates to the architecture?
> 
> Are you trying to explain why the architecture works the way it does, or
> prove a specific relationship between architectural concepts, or what is
> it we're trying to achieve here?

Cheers
Suzuki

* Re: [PATCH] KVM: arm64: Clarify explanation of STAGE2_PGTABLE_LEVELS
  2018-11-06 12:13               ` Suzuki K Poulose
@ 2018-11-06 12:30                 ` Christoffer Dall
  -1 siblings, 0 replies; 18+ messages in thread
From: Christoffer Dall @ 2018-11-06 12:30 UTC (permalink / raw)
  To: Suzuki K Poulose; +Cc: Marc Zyngier, kvmarm, linux-arm-kernel, kvm

On Tue, Nov 06, 2018 at 12:13:18PM +0000, Suzuki K Poulose wrote:
> 
> 
> On 06/11/2018 11:45, Christoffer Dall wrote:
> >On Tue, Nov 06, 2018 at 09:52:59AM +0000, Suzuki K Poulose wrote:
> >>
> >>
> >>On 06/11/2018 08:42, Christoffer Dall wrote:
> >>>On Mon, Nov 05, 2018 at 03:00:34PM +0000, Suzuki K Poulose wrote:
> >>>>
> >>>>
> >>>>On 02/11/18 14:25, Christoffer Dall wrote:
> >>>>>On Fri, Nov 02, 2018 at 11:02:38AM +0000, Suzuki K Poulose wrote:
> >>>>>>Hi
> >>>>>>
> >>>>>>On 02/11/18 07:53, Christoffer Dall wrote:
> >>>>>>>In attempting to re-construct the logic for our stage 2 page table
> >>>>>>>layout I found the reaoning in the comment explaining how we calculate
> >>>>>>>the number of levels used for stage 2 page tables a bit backwards.
> >>>>>>>
> >>>>>>>This commit attempts to clarify the comment, to make it slightly easier
> >>>>>>>to read without having the Arm ARM open on the right page.
> >>>>>>>
> >>>>>>>While we're at it, fixup a typo in a comment that was recently changed.
> >>>>>>>
> >>>>>>>Signed-off-by: Christoffer Dall <christoffer.dall@arm.com>
> >>>>>>>---
> >>>>>>>  arch/arm64/include/asm/stage2_pgtable.h | 17 ++++++++++-------
> >>>>>>>  virt/kvm/arm/mmu.c                      |  2 +-
> >>>>>>>  2 files changed, 11 insertions(+), 8 deletions(-)
> >>>>>>>
> >>>>>>>diff --git a/arch/arm64/include/asm/stage2_pgtable.h b/arch/arm64/include/asm/stage2_pgtable.h
> >>>>>>>index d352f6df8d2c..9c387320b28c 100644
> >>>>>>>--- a/arch/arm64/include/asm/stage2_pgtable.h
> >>>>>>>+++ b/arch/arm64/include/asm/stage2_pgtable.h
> >>>>>>>@@ -31,15 +31,18 @@
> >>>>>>>  /*
> >>>>>>>   * The hardware supports concatenation of up to 16 tables at stage2 entry level
> >>>>>>>- * and we use the feature whenever possible.
> >>>>>>>+ * and we use the feature whenever possible, which means we resolve 4 bits of
> >>>>>>
> >>>>>>s/we resolve 4 bits/we resolve 4 additional bits/ ?
> >>>>>>
> >>>>>
> >>>>>yes
> >>>>>
> >>>>>>>+ * address at the entry level.
> >>>>>>>   *
> >>>>>>>- * Now, the minimum number of bits resolved at any level is (PAGE_SHIFT - 3).
> >>>>>>>+ * This implies, the total number of page table levels required for
> >>>>>>>+ * IPA_SHIFT at stage2 expected by the hardware can be calculated using
> >>>>>>>+ * the same logic used for the (non-collapsable) stage1 page tables but for
> >>>>>>>+ * (IPA_SHIFT - 4).
> >>>>>>>+ *
> >>>>>>>+ * Note, the minimum number of bits resolved at any level is (PAGE_SHIFT - 3).
> >>>>>>
> >>>>>>May be we could improve it further by :
> >>>>>>
> >>>>>>s/resolved at any level/resolved at any *non-entry* level/
> >>>>>>
> >>>>>>as, we could resolve as small as 1 bit at the entry level.
> >>>>>>
> >>>>>>
> >>>>>
> >>>>>yes
> >>>>>
> >>>>>>>   * On arm64, the smallest PAGE_SIZE supported is 4k, which means
> >>>>>>>- *             (PAGE_SHIFT - 3) > 4 holds for all page sizes.
> >>>>>>>- * This implies, the total number of page table levels at stage2 expected
> >>>>>>>- * by the hardware is actually the number of levels required for (IPA_SHIFT - 4)
> >>>>>>>- * in normal translations(e.g, stage1), since we cannot have another level in
> >>>>>>>- * the range (IPA_SHIFT, IPA_SHIFT - 4).
> >>>>>>>+ *             (PAGE_SHIFT - 3) > 4 holds for all page sizes
> >>>>>>>+ * and therefore we will need a minimum of two levels for stage2 in all cases.
> >>>>>>
> >>>>>>I think the above statement is misleading. The minimum number of
> >>>>>>levels has nothing to do with the concatenation.
> >>>>>
> >>>>>Architecturally surely it does?  (The point of concatenation is to
> >>>>>reduce the minimal number of levels required.)
> >>>>>
> >>>>>Maybe you mean the minimum number of levels imposed by KVM here?
> >>>>>
> >>>>>>For e.g, we could
> >>>>>>still create a stage2 with 1 level (32bit IPA on 64K, 29bit + 3bit
> >>>>>>concatenated), going by the same rules above. The only reason why
> >>>>>>we limit the number of levels to 2, is to prevent splitting stage1 PMD
> >>>>>>huge mappings (which are quite common) at stage2.
> >>>>>>
> >>>>>
> >>>>>So I wasn't entirely clear what the comment was trying to say with the
> >>>>>"(PAGE_SHIFT - 3) > 4 holds for all page size" statement, so I though
> >>>>>that was there to show that we'll need a minimum of two levels, but
> >>>>>maybe that was written under the assumption of the limitations of
> >>>>>IPA_SHIFT (was KVM_PHYS_SIZE).
> >>>>
> >>>>See below.
> >>>>
> >>>>>
> >>>>>Since you wrote the original comment, and I couldn't correctly parse
> >>>>>that, and I apparently still didn't fully understand, can you suggest an
> >>>>>alternative wording?
> >>>>
> >>>>I think trying to over explain the concept has created more confusion.
> >>>>The whole paragraph is trying to prove that we only need :
> >>>>
> >>>>	ARM64_HW_PGTABLE_LEVELS(IPA_SHIFT - 4)
> >>>>to map IPA_SHIFT bits at stage2 with maximum utilization of the
> >>>>concatenation at entry level.
> >>>
> >>>Yes, I think that comes across more clearly in my rewording up to the
> >>>"Note, ...".
> >>
> >>Yes it does, and I prefer your version to mine.
> >>
> >>>
> >>>>
> >>>>Right now the comment tries to establish it via the hard route, by
> >>>>proving that there cannot be an intermediate level in the range
> >>>>[IPA_SHIFT, IPA_SHIFT - 4]. Or in other words :
> >>>>
> >>>>(ARM64_HW_PGTABLE_LEVELS(IPA_SHIFT - 4) + 1) >= 	
> >>>>			ARM64_HW_PGTABLE_LEVELS(IPA_SHIFT)
> >>>>
> >>>>I don't know if it is worth the explanation and then causing further
> >>>>confusion.
> >>>
> >>>I think you're then actually trying to explain two things.
> >>>
> >>>First, we can get the number of levels by using the stage 1 calculation
> >>>and adjust for concatenation by subtracting 4 from the number of bits we
> >>>need to translate.
> >>>
> >>>Second, some further reasoning about *why* that is true.
> >>>
> >>>It remains unclear to me exactly what your point about
> >>>     '(PAGE_SHIFT - 3) > 4'
> >>>is and how that supports the second point.  Also I'm not entirely sure
> >>>why we need that.
> >>
> >>So, we have to prove that :
> >>   x + 1 >= y, where :
> >
> >Why do we have to prove that?
> 
> To make sure that we don't do concatenation at a level which is not
> architecturally correct and end up concatenating more tables than
> the architecture supports. That is, if we were to go down two
> levels, we would be concatenating more than 16 tables at the entry
> level and thus doing something not supported by the architecture.
> 
> We need to be sure about the correctness of the logic, as we don't
> explicitly "check" how many tables are concatenated for any input
> IPA_SHIFT+Page_Size combination in the stage2 page table code.
> 
> I guess it is simpler to think of it the "non-formal" way, i.e.,
> we could at most leave 4 bits unresolved using a normal table
> with Levels(IPA_SHIFT - 4).
> 

Yes :)

v2 on its way.

Thanks,

    Christoffer
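
For readers following the argument above, here is a minimal sketch in plain
userspace C (not kernel code) of the property being discussed. It assumes the
stage 1 level count is DIV_ROUND_UP(bits - PAGE_SHIFT, PAGE_SHIFT - 3), and it
takes the page shift as a parameter rather than a build-time constant. The
helper names hw_levels(), stage2_levels(), entry_tables() and
count_violations(), as well as the 32..52 bit IPA range that is scanned, are
hypothetical and chosen only for illustration.

```c
#include <assert.h>

/*
 * Levels needed to translate `bits` of address without concatenation:
 * DIV_ROUND_UP(bits - page_shift, page_shift - 3).
 * Hypothetical helper; page_shift is a parameter here for illustration.
 */
static int hw_levels(int bits, int page_shift)
{
	int per_level = page_shift - 3;	/* bits resolved per non-entry level */

	return (bits - page_shift + per_level - 1) / per_level;
}

/* Stage 2 levels when up to 16 tables (4 bits) are concatenated at entry. */
static int stage2_levels(int ipa_shift, int page_shift)
{
	return hw_levels(ipa_shift - 4, page_shift);
}

/* Number of tables that end up concatenated at the entry level. */
static int entry_tables(int ipa_shift, int page_shift)
{
	int per_level = page_shift - 3;
	int covered = page_shift +
		      stage2_levels(ipa_shift, page_shift) * per_level;

	return ipa_shift > covered ? 1 << (ipa_shift - covered) : 1;
}

/*
 * Scan 4K, 16K and 64K pages over an assumed IPA range of 32..52 bits and
 * count the combinations that break either property from the discussion:
 *   levels(IPA_SHIFT - 4) + 1 >= levels(IPA_SHIFT)
 *   concatenated entry tables <= 16
 */
static int count_violations(void)
{
	static const int page_shifts[] = { 12, 14, 16 };
	int violations = 0;

	for (unsigned int i = 0; i < sizeof(page_shifts) / sizeof(page_shifts[0]); i++) {
		for (int ipa = 32; ipa <= 52; ipa++) {
			int x = stage2_levels(ipa, page_shifts[i]);
			int y = hw_levels(ipa, page_shifts[i]);

			if (x + 1 < y || entry_tables(ipa, page_shifts[i]) > 16)
				violations++;
		}
	}
	return violations;
}
```

With these definitions, stage2_levels(32, 16) evaluates to 1 with 8
concatenated entry tables, which matches the "29bit + 3bit" example quoted
above, and stage2_levels(40, 12) evaluates to 3 with 2 entry tables.
count_violations() returns 0 over the scanned range.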

Thread overview: 18+ messages
2018-11-02  7:53 [PATCH] KVM: arm64: Clarify explanation of STAGE2_PGTABLE_LEVELS Christoffer Dall
2018-11-02  7:53 ` Christoffer Dall
2018-11-02 11:02 ` Suzuki K Poulose
2018-11-02 11:02   ` Suzuki K Poulose
2018-11-02 14:25   ` Christoffer Dall
2018-11-02 14:25     ` Christoffer Dall
2018-11-05 15:00     ` Suzuki K Poulose
2018-11-05 15:00       ` Suzuki K Poulose
2018-11-06  8:42       ` Christoffer Dall
2018-11-06  8:42         ` Christoffer Dall
2018-11-06  9:52         ` Suzuki K Poulose
2018-11-06  9:52           ` Suzuki K Poulose
2018-11-06 11:45           ` Christoffer Dall
2018-11-06 11:45             ` Christoffer Dall
2018-11-06 12:13             ` Suzuki K Poulose
2018-11-06 12:13               ` Suzuki K Poulose
2018-11-06 12:30               ` Christoffer Dall
2018-11-06 12:30                 ` Christoffer Dall
