* [PATCH 0/2] flag contiguous PTEs in linear mapping
@ 2016-02-12 16:06 Jeremy Linton
  2016-02-12 16:06 ` [PATCH 1/2] arm64: mm: Enable CONT_SIZE aligned sections for 64k page kernels Jeremy Linton
  2016-02-12 16:06 ` [PATCH 2/2] arm64: Mark kernel page ranges contiguous Jeremy Linton
  0 siblings, 2 replies; 15+ messages in thread
From: Jeremy Linton @ 2016-02-12 16:06 UTC (permalink / raw)
  To: linux-arm-kernel

This is a rebase of the previous contiguous-PTEs-in-linear-map patches on top
of Mark Rutland's fixmap changes. Those changes appear to be sufficient to
allow this patch set to boot on JunoR2, Seattle and the xgene/m400. I've also
done basic testing with RODATA turned on, in all cases on ACPI systems.

This series also adds the ability to align 64k kernels on the 2M CONT boundary,
which helps ensure that a number of the sections are completely mapped with
the CONT bit set. A number of holes remain, caused by smaller remapping
operations that don't actually change the page protection state. Those could
be worked around in many cases if the code were smart enough to detect that
the break doesn't result in an actual permission change. Some of these holes
are visible in the example below, where it's not initially obvious why a
contiguous 2M region isn't marked as such until you dig into it.
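
For reference, here is a minimal stand-alone sketch (not kernel code, and not
part of this series) of the eligibility test the second patch applies when
deciding whether a run of PTEs can carry the contiguous hint. The values
assume a 64k-page kernel (32 contiguous PTEs, i.e. a 2M CONT_SIZE); the
addresses in main() are only illustrative:

#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

#define PAGE_SHIFT	16	/* 64k pages (assumption for this sketch) */
#define CONT_PTES	32	/* ARMv8 64k granule: 32 contiguous PTEs */
#define CONT_SIZE	((uint64_t)CONT_PTES << PAGE_SHIFT)	/* 2M */

/* A run may carry the hint only if VA, end of run and PA are all 2M aligned. */
static bool can_use_cont(uint64_t addr, uint64_t next, uint64_t phys)
{
	return ((addr | next | phys) & (CONT_SIZE - 1)) == 0;
}

int main(void)
{
	/* 2M-aligned VA and PA: eligible for the contiguous hint */
	printf("%d\n", can_use_cont(0xfffffe0000000000ULL,
				    0xfffffe0000200000ULL, 0x80000000ULL));
	/* PA only 64k aligned: falls back to plain PTEs, leaving a hole */
	printf("%d\n", can_use_cont(0xfffffe0000000000ULL,
				    0xfffffe0000200000ULL, 0x80010000ULL));
	return 0;
}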

With 64k pages and section alignment enabled, the kernel mapping looks like:
---[ Kernel Mapping ]---
0xfffffe0000000000-0xfffffe0000200000           2M     RW NX SHD AF    CON     UXN MEM/NORMAL
0xfffffe0000200000-0xfffffe0001200000          16M     ro x  SHD AF    CON     UXN MEM/NORMAL
0xfffffe0001200000-0xfffffe0001400000           2M     RW NX SHD AF    CON     UXN MEM/NORMAL
0xfffffe0001400000-0xfffffe0001600000           2M     RW NX SHD AF            UXN MEM/NORMAL
0xfffffe0001600000-0xfffffe0002600000          16M     RW NX SHD AF    CON     UXN MEM/NORMAL
0xfffffe0002600000-0xfffffe0002800000           2M     RW NX SHD AF            UXN MEM/NORMAL
0xfffffe0002800000-0xfffffe0020000000         472M     RW NX SHD AF    CON     UXN MEM/NORMAL
0xfffffe0020000000-0xfffffe0060000000           1G     RW NX SHD AF        BLK UXN MEM/NORMAL
0xfffffe00600f0000-0xfffffe0060200000        1088K     RW NX SHD AF            UXN MEM/NORMAL
0xfffffe0060200000-0xfffffe0076400000         354M     RW NX SHD AF    CON     UXN MEM/NORMAL
0xfffffe0076400000-0xfffffe0076600000           2M     RW NX SHD AF            UXN MEM/NORMAL
0xfffffe0076600000-0xfffffe0078e00000          40M     RW NX SHD AF    CON     UXN MEM/NORMAL
0xfffffe00793b0000-0xfffffe0079400000         320K     RW NX SHD AF            UXN MEM/NORMAL
0xfffffe0079400000-0xfffffe007e200000          78M     RW NX SHD AF    CON     UXN MEM/NORMAL
0xfffffe007e200000-0xfffffe007e3d0000        1856K     RW NX SHD AF            UXN MEM/NORMAL
0xfffffe007e420000-0xfffffe007e600000        1920K     RW NX SHD AF            UXN MEM/NORMAL
0xfffffe007e600000-0xfffffe007f000000          10M     RW NX SHD AF    CON     UXN MEM/NORMAL
0xfffffe0800000000-0xfffffe0980000000           6G     RW NX SHD AF        BLK UXN MEM/NORMAL

With 4k pages:
---[ Kernel Mapping ]---
0xffffffc000000000-0xffffffc000200000           2M     RW NX SHD AF        BLK UXN MEM/NORMAL
0xffffffc000200000-0xffffffc001200000          16M     ro x  SHD AF        BLK UXN MEM/NORMAL
0xffffffc001200000-0xffffffc001400000           2M     RW NX SHD AF        BLK UXN MEM/NORMAL
0xffffffc001400000-0xffffffc0015c0000        1792K     RW NX SHD AF    CON     UXN MEM/NORMAL
0xffffffc0015c0000-0xffffffc0015d0000          64K     RW NX SHD AF            UXN MEM/NORMAL
0xffffffc0015d0000-0xffffffc001600000         192K     RW NX SHD AF    CON     UXN MEM/NORMAL
0xffffffc001600000-0xffffffc002600000          16M     RW NX SHD AF        BLK UXN MEM/NORMAL
0xffffffc002600000-0xffffffc002620000         128K     RW NX SHD AF    CON     UXN MEM/NORMAL
0xffffffc002620000-0xffffffc002630000          64K     RW NX SHD AF            UXN MEM/NORMAL
0xffffffc002630000-0xffffffc002800000        1856K     RW NX SHD AF    CON     UXN MEM/NORMAL
0xffffffc002800000-0xffffffc060000000        1496M     RW NX SHD AF        BLK UXN MEM/NORMAL
0xffffffc0600f0000-0xffffffc060200000        1088K     RW NX SHD AF    CON     UXN MEM/NORMAL
0xffffffc060200000-0xffffffc076400000         354M     RW NX SHD AF        BLK UXN MEM/NORMAL
0xffffffc076400000-0xffffffc076590000        1600K     RW NX SHD AF    CON     UXN MEM/NORMAL
0xffffffc076590000-0xffffffc076595000          20K     RW NX SHD AF            UXN MEM/NORMAL
0xffffffc076596000-0xffffffc0765a0000          40K     RW NX SHD AF            UXN MEM/NORMAL
0xffffffc0765a0000-0xffffffc076600000         384K     RW NX SHD AF    CON     UXN MEM/NORMAL
0xffffffc076600000-0xffffffc078e00000          40M     RW NX SHD AF        BLK UXN MEM/NORMAL
0xffffffc0793b0000-0xffffffc079400000         320K     RW NX SHD AF    CON     UXN MEM/NORMAL
0xffffffc079400000-0xffffffc07e200000          78M     RW NX SHD AF        BLK UXN MEM/NORMAL
0xffffffc07e200000-0xffffffc07e3d0000        1856K     RW NX SHD AF    CON     UXN MEM/NORMAL
0xffffffc07e420000-0xffffffc07e600000        1920K     RW NX SHD AF    CON     UXN MEM/NORMAL
0xffffffc07e600000-0xffffffc07f000000          10M     RW NX SHD AF        BLK UXN MEM/NORMAL
0xffffffc800000000-0xffffffc980000000           6G     RW NX SHD AF        BLK UXN MEM/NORMAL

Jeremy Linton (2):
  arm64: mm: Enable CONT_SIZE aligned sections for 64k page kernels.
  arm64: Mark kernel page ranges contiguous

 arch/arm64/Kconfig.debug        | 12 ++++----
 arch/arm64/kernel/vmlinux.lds.S | 11 +++----
 arch/arm64/mm/mmu.c             | 64 +++++++++++++++++++++++++++++++++++++----
 3 files changed, 70 insertions(+), 17 deletions(-)

-- 
2.4.3


* [PATCH 1/2] arm64: mm: Enable CONT_SIZE aligned sections for 64k page kernels.
  2016-02-12 16:06 [PATCH 0/2] flag contiguous PTEs in linear mapping Jeremy Linton
@ 2016-02-12 16:06 ` Jeremy Linton
  2016-02-12 16:11   ` Ard Biesheuvel
  2016-02-12 16:06 ` [PATCH 2/2] arm64: Mark kernel page ranges contiguous Jeremy Linton
  1 sibling, 1 reply; 15+ messages in thread
From: Jeremy Linton @ 2016-02-12 16:06 UTC (permalink / raw)
  To: linux-arm-kernel

This change allows ALIGN_RODATA for 16k and 64k kernels.
In the case of 64k kernels it aligns to CONT_SIZE rather than
SECTION_SIZE (which is 512M), which makes the option generally
more useful, especially for CONT-enabled kernels.
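
For reference, the alignment each page-size configuration ends up with under
this change can be worked out with a quick stand-alone calculation (not part
of the patch). The SECTION_SIZE formula below is the usual arm64 PMD coverage
and is an assumption of this sketch rather than something quoted from the tree:

#include <stdio.h>
#include <stdint.h>

static uint64_t rodata_align(unsigned int page_shift)
{
	/* SECTION_SIZE == PMD coverage == 1 << (2 * PAGE_SHIFT - 3) */
	uint64_t section_size = 1ULL << (2 * page_shift - 3);

	if (page_shift == 16)			/* 64k pages: align to CONT_SIZE */
		return 32ULL << page_shift;	/* 32 contiguous PTEs -> 2M */
	return section_size;			/* 4k: 2M, 16k: 32M */
}

int main(void)
{
	printf("4k  pages: align to %llu MB\n", (unsigned long long)(rodata_align(12) >> 20));
	printf("16k pages: align to %llu MB\n", (unsigned long long)(rodata_align(14) >> 20));
	printf("64k pages: align to %llu MB\n", (unsigned long long)(rodata_align(16) >> 20));
	return 0;
}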

Signed-off-by: Jeremy Linton <jeremy.linton@arm.com>
---
 arch/arm64/Kconfig.debug        | 12 ++++++------
 arch/arm64/kernel/vmlinux.lds.S | 11 ++++++-----
 2 files changed, 12 insertions(+), 11 deletions(-)

diff --git a/arch/arm64/Kconfig.debug b/arch/arm64/Kconfig.debug
index e13c4bf..65705ee 100644
--- a/arch/arm64/Kconfig.debug
+++ b/arch/arm64/Kconfig.debug
@@ -59,15 +59,15 @@ config DEBUG_RODATA
           If in doubt, say Y
 
 config DEBUG_ALIGN_RODATA
-	depends on DEBUG_RODATA && ARM64_4K_PAGES
+	depends on DEBUG_RODATA
 	bool "Align linker sections up to SECTION_SIZE"
 	help
 	  If this option is enabled, sections that may potentially be marked as
-	  read only or non-executable will be aligned up to the section size of
-	  the kernel. This prevents sections from being split into pages and
-	  avoids a potential TLB penalty. The downside is an increase in
-	  alignment and potentially wasted space. Turn on this option if
-	  performance is more important than memory pressure.
+	  read only or non-executable will be aligned up to the section size
+	  or contiguous hint size of the kernel. This prevents sections from
+	  being split into pages and avoids a potential TLB penalty. The downside
+	  is an increase in alignment and potentially wasted space. Turn on
+	  this option if performance is more important than memory pressure.
 
 	  If in doubt, say N
 
diff --git a/arch/arm64/kernel/vmlinux.lds.S b/arch/arm64/kernel/vmlinux.lds.S
index b78a3c7..ab4e436 100644
--- a/arch/arm64/kernel/vmlinux.lds.S
+++ b/arch/arm64/kernel/vmlinux.lds.S
@@ -63,13 +63,14 @@ PECOFF_FILE_ALIGNMENT = 0x200;
 #endif
 
 #if defined(CONFIG_DEBUG_ALIGN_RODATA)
-#define ALIGN_DEBUG_RO			. = ALIGN(1<<SECTION_SHIFT);
-#define ALIGN_DEBUG_RO_MIN(min)		ALIGN_DEBUG_RO
+#if defined(CONFIG_ARM64_64K_PAGES)
+#define ALIGN_DEBUG_RO_MIN(min)		. = ALIGN(CONT_SIZE);
+#else
+#define ALIGN_DEBUG_RO_MIN(min)		. = ALIGN(SECTION_SIZE);
+#endif
 #elif defined(CONFIG_DEBUG_RODATA)
-#define ALIGN_DEBUG_RO			. = ALIGN(1<<PAGE_SHIFT);
-#define ALIGN_DEBUG_RO_MIN(min)		ALIGN_DEBUG_RO
+#define ALIGN_DEBUG_RO_MIN(min)		. = ALIGN(PAGE_SIZE);
 #else
-#define ALIGN_DEBUG_RO
 #define ALIGN_DEBUG_RO_MIN(min)		. = ALIGN(min);
 #endif
 
-- 
2.4.3


* [PATCH 2/2] arm64: Mark kernel page ranges contiguous
  2016-02-12 16:06 [PATCH 0/2] flag contiguous PTEs in linear mapping Jeremy Linton
  2016-02-12 16:06 ` [PATCH 1/2] arm64: mm: Enable CONT_SIZE aligned sections for 64k page kernels Jeremy Linton
@ 2016-02-12 16:06 ` Jeremy Linton
  2016-02-12 16:57   ` Mark Rutland
  2016-02-13 16:43   ` Ard Biesheuvel
  1 sibling, 2 replies; 15+ messages in thread
From: Jeremy Linton @ 2016-02-12 16:06 UTC (permalink / raw)
  To: linux-arm-kernel

With 64k pages, the next larger segment size is 512M. The Linux
kernel also uses different protection flags to cover its code and data.
Because of this, the vast majority of the kernel code and data
structures end up being mapped with 64k pages instead of the larger
pages common with a 4k page kernel.

Recent ARM processors support a contiguous bit in the page tables
which allows a single TLB entry to cover a range larger than a
single PTE, provided that range is mapped into physically contiguous
RAM.

So, for the kernel it's a good idea to set this flag. Some basic
micro-benchmarks show it can significantly reduce the number of
L1 dTLB refills.
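
As a rough illustration of where the refill reduction comes from (this is not
part of the patch), here is a small stand-alone calculation of how many
distinct translations are needed to cover the 16M text region shown in the
cover letter, with and without the contiguous hint, assuming a 64k-page
kernel where one hint spans 32 PTEs:

#include <stdio.h>
#include <stdint.h>

int main(void)
{
	const uint64_t region    = 16ULL << 20;	/* 16M of kernel text */
	const uint64_t page      = 64ULL << 10;	/* 64k pages */
	const uint64_t cont_ptes = 32;		/* ARMv8 64k granule */

	printf("plain PTEs: %llu TLB translations\n",
	       (unsigned long long)(region / page));
	printf("PTE_CONT  : %llu TLB translations\n",
	       (unsigned long long)(region / (page * cont_ptes)));
	return 0;
}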

Signed-off-by: Jeremy Linton <jeremy.linton@arm.com>
---
 arch/arm64/mm/mmu.c | 64 ++++++++++++++++++++++++++++++++++++++++++++++++-----
 1 file changed, 58 insertions(+), 6 deletions(-)

diff --git a/arch/arm64/mm/mmu.c b/arch/arm64/mm/mmu.c
index 7711554..ab69a99 100644
--- a/arch/arm64/mm/mmu.c
+++ b/arch/arm64/mm/mmu.c
@@ -1,3 +1,4 @@
+
 /*
  * Based on arch/arm/mm/mmu.c
  *
@@ -103,17 +104,49 @@ static void split_pmd(pmd_t *pmd, pte_t *pte)
 		 * Need to have the least restrictive permissions available
 		 * permissions will be fixed up later
 		 */
-		set_pte(pte, pfn_pte(pfn, PAGE_KERNEL_EXEC));
+		set_pte(pte, pfn_pte(pfn, PAGE_KERNEL_EXEC_CONT));
 		pfn++;
 	} while (pte++, i++, i < PTRS_PER_PTE);
 }
 
+static void clear_cont_pte_range(pte_t *pte, unsigned long addr)
+{
+	int i;
+
+	pte -= CONT_RANGE_OFFSET(addr);
+	for (i = 0; i < CONT_PTES; i++) {
+		if (pte_cont(*pte))
+			set_pte(pte, pte_mknoncont(*pte));
+		pte++;
+	}
+	flush_tlb_all();
+}
+
+/*
+ * Given a range of PTEs set the pfn and provided page protection flags
+ */
+static void __populate_init_pte(pte_t *pte, unsigned long addr,
+			       unsigned long end, phys_addr_t phys,
+			       pgprot_t prot)
+{
+	unsigned long pfn = __phys_to_pfn(phys);
+
+	do {
+		/* clear all the bits except the pfn, then apply the prot */
+		set_pte(pte, pfn_pte(pfn, prot));
+		pte++;
+		pfn++;
+		addr += PAGE_SIZE;
+	} while (addr != end);
+}
+
 static void alloc_init_pte(pmd_t *pmd, unsigned long addr,
-				  unsigned long end, unsigned long pfn,
+				  unsigned long end, phys_addr_t phys,
 				  pgprot_t prot,
 				  phys_addr_t (*pgtable_alloc)(void))
 {
 	pte_t *pte;
+	unsigned long next;
 
 	if (pmd_none(*pmd) || pmd_sect(*pmd)) {
 		phys_addr_t pte_phys = pgtable_alloc();
@@ -127,10 +160,29 @@ static void alloc_init_pte(pmd_t *pmd, unsigned long addr,
 	BUG_ON(pmd_bad(*pmd));
 
 	pte = pte_set_fixmap_offset(pmd, addr);
+
 	do {
-		set_pte(pte, pfn_pte(pfn, prot));
-		pfn++;
-	} while (pte++, addr += PAGE_SIZE, addr != end);
+		next = min(end, (addr + CONT_SIZE) & CONT_MASK);
+		if (((addr | next | phys) & ~CONT_MASK) == 0) {
+			/* a block of CONT_PTES	 */
+			__populate_init_pte(pte, addr, next, phys,
+					    prot | __pgprot(PTE_CONT));
+		} else {
+			/*
+			 * If the range being split is already inside of a
+			 * contiguous range but this PTE isn't going to be
+			 * contiguous, then we want to unmark the adjacent
+			 * ranges, then update the portion of the range we
+			 * are interested in.
+			 */
+			clear_cont_pte_range(pte, addr);
+			__populate_init_pte(pte, addr, next, phys, prot);
+		}
+
+		pte += (next - addr) >> PAGE_SHIFT;
+		phys += next - addr;
+		addr = next;
+	} while (addr != end);
 
 	pte_clear_fixmap();
 }
@@ -194,7 +246,7 @@ static void alloc_init_pmd(pud_t *pud, unsigned long addr, unsigned long end,
 				}
 			}
 		} else {
-			alloc_init_pte(pmd, addr, next, __phys_to_pfn(phys),
+			alloc_init_pte(pmd, addr, next, phys,
 				       prot, pgtable_alloc);
 		}
 		phys += next - addr;
-- 
2.4.3


* [PATCH 1/2] arm64: mm: Enable CONT_SIZE aligned sections for 64k page kernels.
  2016-02-12 16:06 ` [PATCH 1/2] arm64: mm: Enable CONT_SIZE aligned sections for 64k page kernels Jeremy Linton
@ 2016-02-12 16:11   ` Ard Biesheuvel
  2016-02-12 16:21     ` Jeremy Linton
  0 siblings, 1 reply; 15+ messages in thread
From: Ard Biesheuvel @ 2016-02-12 16:11 UTC (permalink / raw)
  To: linux-arm-kernel

Hi Jeremy,

Question below:

On 12 February 2016 at 17:06, Jeremy Linton <jeremy.linton@arm.com> wrote:
> This change allows ALIGN_RODATA for 16k and 64k kernels.
> In the case of 64k kernels it actually aligns to the CONT_SIZE
> rather than the SECTION_SIZE (which is 512M). This makes it generally
> more useful, especially for CONT enabled kernels.
>
> Signed-off-by: Jeremy Linton <jeremy.linton@arm.com>
> ---
>  arch/arm64/Kconfig.debug        | 12 ++++++------
>  arch/arm64/kernel/vmlinux.lds.S | 11 ++++++-----
>  2 files changed, 12 insertions(+), 11 deletions(-)
>
> diff --git a/arch/arm64/Kconfig.debug b/arch/arm64/Kconfig.debug
> index e13c4bf..65705ee 100644
> --- a/arch/arm64/Kconfig.debug
> +++ b/arch/arm64/Kconfig.debug
> @@ -59,15 +59,15 @@ config DEBUG_RODATA
>            If in doubt, say Y
>
>  config DEBUG_ALIGN_RODATA
> -       depends on DEBUG_RODATA && ARM64_4K_PAGES
> +       depends on DEBUG_RODATA
>         bool "Align linker sections up to SECTION_SIZE"
>         help
>           If this option is enabled, sections that may potentially be marked as
> -         read only or non-executable will be aligned up to the section size of
> -         the kernel. This prevents sections from being split into pages and
> -         avoids a potential TLB penalty. The downside is an increase in
> -         alignment and potentially wasted space. Turn on this option if
> -         performance is more important than memory pressure.
> +         read only or non-executable will be aligned up to the section size
> +         or contiguous hint size of the kernel. This prevents sections from
> +         being split into pages and avoids a potential TLB penalty. The downside
> +         is an increase in alignment and potentially wasted space. Turn on
> +         this option if performance is more important than memory pressure.
>
>           If in doubt, say N
>
> diff --git a/arch/arm64/kernel/vmlinux.lds.S b/arch/arm64/kernel/vmlinux.lds.S
> index b78a3c7..ab4e436 100644
> --- a/arch/arm64/kernel/vmlinux.lds.S
> +++ b/arch/arm64/kernel/vmlinux.lds.S
> @@ -63,13 +63,14 @@ PECOFF_FILE_ALIGNMENT = 0x200;
>  #endif
>
>  #if defined(CONFIG_DEBUG_ALIGN_RODATA)
> -#define ALIGN_DEBUG_RO                 . = ALIGN(1<<SECTION_SHIFT);
> -#define ALIGN_DEBUG_RO_MIN(min)                ALIGN_DEBUG_RO
> +#if defined(CONFIG_ARM64_64K_PAGES)
> +#define ALIGN_DEBUG_RO_MIN(min)                . = ALIGN(CONT_SIZE);
> +#else
> +#define ALIGN_DEBUG_RO_MIN(min)                . = ALIGN(SECTION_SIZE);

Doesn't this align to 32 MB on 16k pages kernels?

> +#endif
>  #elif defined(CONFIG_DEBUG_RODATA)
> -#define ALIGN_DEBUG_RO                 . = ALIGN(1<<PAGE_SHIFT);
> -#define ALIGN_DEBUG_RO_MIN(min)                ALIGN_DEBUG_RO
> +#define ALIGN_DEBUG_RO_MIN(min)                . = ALIGN(PAGE_SIZE);
>  #else
> -#define ALIGN_DEBUG_RO
>  #define ALIGN_DEBUG_RO_MIN(min)                . = ALIGN(min);
>  #endif
>
> --
> 2.4.3
>


* [PATCH 1/2] arm64: mm: Enable CONT_SIZE aligned sections for 64k page kernels.
  2016-02-12 16:11   ` Ard Biesheuvel
@ 2016-02-12 16:21     ` Jeremy Linton
  2016-02-12 16:28       ` Ard Biesheuvel
  0 siblings, 1 reply; 15+ messages in thread
From: Jeremy Linton @ 2016-02-12 16:21 UTC (permalink / raw)
  To: linux-arm-kernel

On 02/12/2016 10:11 AM, Ard Biesheuvel wrote:
> On 12 February 2016 at 17:06, Jeremy Linton <jeremy.linton@arm.com> wrote:
(trimming)
>>   #if defined(CONFIG_DEBUG_ALIGN_RODATA)
>> -#define ALIGN_DEBUG_RO                 . = ALIGN(1<<SECTION_SHIFT);
>> -#define ALIGN_DEBUG_RO_MIN(min)                ALIGN_DEBUG_RO
>> +#if defined(CONFIG_ARM64_64K_PAGES)
>> +#define ALIGN_DEBUG_RO_MIN(min)                . = ALIGN(CONT_SIZE);
>> +#else
>> +#define ALIGN_DEBUG_RO_MIN(min)                . = ALIGN(SECTION_SIZE);
>
> Doesn't this align to 32 MB on 16k pages kernels?

Yes, I considered whether it was more appropriate to use CONT_SIZE for 
16k as well.

Opinions?

>
>> +#endif
>>   #elif defined(CONFIG_DEBUG_RODATA)
>> -#define ALIGN_DEBUG_RO                 . = ALIGN(1<<PAGE_SHIFT);
>> -#define ALIGN_DEBUG_RO_MIN(min)                ALIGN_DEBUG_RO
>> +#define ALIGN_DEBUG_RO_MIN(min)                . = ALIGN(PAGE_SIZE);
>>   #else
>> -#define ALIGN_DEBUG_RO
>>   #define ALIGN_DEBUG_RO_MIN(min)                . = ALIGN(min);
>>   #endif
>>
>> --
>> 2.4.3
>>
>


* [PATCH 1/2] arm64: mm: Enable CONT_SIZE aligned sections for 64k page kernels.
  2016-02-12 16:21     ` Jeremy Linton
@ 2016-02-12 16:28       ` Ard Biesheuvel
  2016-02-12 16:43         ` Jeremy Linton
  0 siblings, 1 reply; 15+ messages in thread
From: Ard Biesheuvel @ 2016-02-12 16:28 UTC (permalink / raw)
  To: linux-arm-kernel

On 12 February 2016 at 17:21, Jeremy Linton <jeremy.linton@arm.com> wrote:
> On 02/12/2016 10:11 AM, Ard Biesheuvel wrote:
>>
>> On 12 February 2016 at 17:06, Jeremy Linton <jeremy.linton@arm.com> wrote:
>
> (trimming)
>>>
>>>   #if defined(CONFIG_DEBUG_ALIGN_RODATA)
>>> -#define ALIGN_DEBUG_RO                 . = ALIGN(1<<SECTION_SHIFT);
>>> -#define ALIGN_DEBUG_RO_MIN(min)                ALIGN_DEBUG_RO
>>> +#if defined(CONFIG_ARM64_64K_PAGES)
>>> +#define ALIGN_DEBUG_RO_MIN(min)                . = ALIGN(CONT_SIZE);
>>> +#else
>>> +#define ALIGN_DEBUG_RO_MIN(min)                . = ALIGN(SECTION_SIZE);
>>
>>
>> Doesn't this align to 32 MB on 16k pages kernels?
>
>
> Yes, I considered whether it was more appropriate to use CONT_SIZE for 16k
> as well.
>
> Opinions?
>

Looking at vmlinux.lds.S, I see that that would put _stext and
__init_begin at 32 MB aligned boundaries, making the size of the
kernel at least 64 MB. If I take your .rodata patch into account,
which adds a third instance of ALIGN_DEBUG_RO_MIN, the Image footprint
will rise to ~100 MB. Or am I missing something?



>
>>
>>> +#endif
>>>   #elif defined(CONFIG_DEBUG_RODATA)
>>> -#define ALIGN_DEBUG_RO                 . = ALIGN(1<<PAGE_SHIFT);
>>> -#define ALIGN_DEBUG_RO_MIN(min)                ALIGN_DEBUG_RO
>>> +#define ALIGN_DEBUG_RO_MIN(min)                . = ALIGN(PAGE_SIZE);
>>>   #else
>>> -#define ALIGN_DEBUG_RO
>>>   #define ALIGN_DEBUG_RO_MIN(min)                . = ALIGN(min);
>>>   #endif
>>>
>>> --
>>> 2.4.3
>>>
>>
>


* [PATCH 1/2] arm64: mm: Enable CONT_SIZE aligned sections for 64k page kernels.
  2016-02-12 16:28       ` Ard Biesheuvel
@ 2016-02-12 16:43         ` Jeremy Linton
  2016-02-12 16:46           ` Ard Biesheuvel
  0 siblings, 1 reply; 15+ messages in thread
From: Jeremy Linton @ 2016-02-12 16:43 UTC (permalink / raw)
  To: linux-arm-kernel

On 02/12/2016 10:28 AM, Ard Biesheuvel wrote:
> On 12 February 2016 at 17:21, Jeremy Linton <jeremy.linton@arm.com> wrote:
>> On 02/12/2016 10:11 AM, Ard Biesheuvel wrote:
>>>
>>> On 12 February 2016 at 17:06, Jeremy Linton <jeremy.linton@arm.com> wrote:
>>
>> (trimming)
>>>>
>>>>    #if defined(CONFIG_DEBUG_ALIGN_RODATA)
>>>> -#define ALIGN_DEBUG_RO                 . = ALIGN(1<<SECTION_SHIFT);
>>>> -#define ALIGN_DEBUG_RO_MIN(min)                ALIGN_DEBUG_RO
>>>> +#if defined(CONFIG_ARM64_64K_PAGES)
>>>> +#define ALIGN_DEBUG_RO_MIN(min)                . = ALIGN(CONT_SIZE);
>>>> +#else
>>>> +#define ALIGN_DEBUG_RO_MIN(min)                . = ALIGN(SECTION_SIZE);
>>>
>>>
>>> Doesn't this align to 32 MB on 16k pages kernels?
>>
>>
>> Yes, I considered whether it was more appropriate to use CONT_SIZE for 16k
>> as well.
>>
>> Opinions?
>>
>
> Looking at vmlinux.lds.S, I see that that would put _stext and
> __init_begin at 32 MB aligned boundaries. making the size of the
> kernel at least 64 MB. If I take your .rodata patch into account,
> which adds a third instance of ALIGN_DEBUG_RO_MIN, the Image footprint
> will rise to ~100 MB. Or am I missing something?
>

No, I think you're correct. But it's an option, and it sort of depends on
the use case. In a system with 100+ GB of RAM it might be useful; not so
much on a phone or small embedded system, and I don't really see those
people enabling ALIGN_RODATA anyway. Worse, I expect the loss of RAM
efficiency going from 4k to 16k pages in a RAM-constrained system to be a
pretty big hit too.

I don't have any hard data one way or the other, and I don't have a
strong opinion, although I suspect at the moment the potential users of
16k pages may tend toward the small side.


* [PATCH 1/2] arm64: mm: Enable CONT_SIZE aligned sections for 64k page kernels.
  2016-02-12 16:43         ` Jeremy Linton
@ 2016-02-12 16:46           ` Ard Biesheuvel
  2016-02-12 17:32             ` Ard Biesheuvel
  0 siblings, 1 reply; 15+ messages in thread
From: Ard Biesheuvel @ 2016-02-12 16:46 UTC (permalink / raw)
  To: linux-arm-kernel

On 12 February 2016 at 17:43, Jeremy Linton <jeremy.linton@arm.com> wrote:
> On 02/12/2016 10:28 AM, Ard Biesheuvel wrote:
>>
>> On 12 February 2016 at 17:21, Jeremy Linton <jeremy.linton@arm.com> wrote:
>>>
>>> On 02/12/2016 10:11 AM, Ard Biesheuvel wrote:
>>>>
>>>>
>>>> On 12 February 2016 at 17:06, Jeremy Linton <jeremy.linton@arm.com>
>>>> wrote:
>>>
>>>
>>> (trimming)
>>>>>
>>>>>
>>>>>    #if defined(CONFIG_DEBUG_ALIGN_RODATA)
>>>>> -#define ALIGN_DEBUG_RO                 . = ALIGN(1<<SECTION_SHIFT);
>>>>> -#define ALIGN_DEBUG_RO_MIN(min)                ALIGN_DEBUG_RO
>>>>> +#if defined(CONFIG_ARM64_64K_PAGES)
>>>>> +#define ALIGN_DEBUG_RO_MIN(min)                . = ALIGN(CONT_SIZE);
>>>>> +#else
>>>>> +#define ALIGN_DEBUG_RO_MIN(min)                . =
>>>>> ALIGN(SECTION_SIZE);
>>>>
>>>>
>>>>
>>>> Doesn't this align to 32 MB on 16k pages kernels?
>>>
>>>
>>>
>>> Yes, I considered whether it was more appropriate to use CONT_SIZE for
>>> 16k
>>> as well.
>>>
>>> Opinions?
>>>
>>
>> Looking at vmlinux.lds.S, I see that that would put _stext and
>> __init_begin at 32 MB aligned boundaries. making the size of the
>> kernel at least 64 MB. If I take your .rodata patch into account,
>> which adds a third instance of ALIGN_DEBUG_RO_MIN, the Image footprint
>> will rise to ~100 MB. Or am I missing something?
>>
>
> No, I think your correct. But, its an option, and it sort of depends on use
> case. In a system with 100+GB of RAM it might be useful. Not so much on a
> phone or small embedded system. I don't really see those people enabling
> ALIGN_RODATA anyway. Worse, I expect the loss of RAM efficiently going from
> 4k-16k pages in a RAM constrained system to be a pretty big hit too.
>
> I don't have any hard data one way or the other, and I don't have a strong
> opinion. Although, I suspect at the moment the potential users of 16k pages
> may tend toward the small side.
>

Well, the thing is, if you only use 2 MB of each 32 MB on average, you
gain nothing by going from contiguous pages to proper sections, since
the TLB footprint is the same. So my opinion would be to use CONT_SIZE
for 16k pages as well.


* [PATCH 2/2] arm64: Mark kernel page ranges contiguous
  2016-02-12 16:06 ` [PATCH 2/2] arm64: Mark kernel page ranges contiguous Jeremy Linton
@ 2016-02-12 16:57   ` Mark Rutland
  2016-02-12 17:35     ` Jeremy Linton
  2016-02-13 16:43   ` Ard Biesheuvel
  1 sibling, 1 reply; 15+ messages in thread
From: Mark Rutland @ 2016-02-12 16:57 UTC (permalink / raw)
  To: linux-arm-kernel

Hi,

On Fri, Feb 12, 2016 at 10:06:48AM -0600, Jeremy Linton wrote:
> With 64k pages, the next larger segment size is 512M. The linux
> kernel also uses different protection flags to cover its code and data.
> Because of this requirement, the vast majority of the kernel code and
> data structures end up being mapped with 64k pages instead of the larger
> pages common with a 4k page kernel.
> 
> Recent ARM processors support a contiguous bit in the
> page tables which allows the a TLB to cover a range larger than a
> single PTE if that range is mapped into physically contiguous
> ram.
> 
> So, for the kernel its a good idea to set this flag. Some basic
> micro benchmarks show it can significantly reduce the number of
> L1 dTLB refills.
> 
> Signed-off-by: Jeremy Linton <jeremy.linton@arm.com>
> ---
>  arch/arm64/mm/mmu.c | 64 ++++++++++++++++++++++++++++++++++++++++++++++++-----
>  1 file changed, 58 insertions(+), 6 deletions(-)

This generally looks good.

As a heads-up, I have one concern:

> +static void clear_cont_pte_range(pte_t *pte, unsigned long addr)
> +{
> +	int i;
> +
> +	pte -= CONT_RANGE_OFFSET(addr);
> +	for (i = 0; i < CONT_PTES; i++) {
> +		if (pte_cont(*pte))
> +			set_pte(pte, pte_mknoncont(*pte));
> +		pte++;
> +	}
> +	flush_tlb_all();
> +}

As far as I can tell, "splitting" contiguous entries comes with the same
caveats as splitting sections. In the absence of a BBM sequence we might
end up with conflicting TLB entries.

However, I think we're OK for now.

The way we consistently map/unmap/modify image/linear "chunks" should
prevent us from trying to split those, and if/when we do this for the
EFI runtime page tables they aren't live.

It would be good to figure out how to get rid of the splitting entirely.

Otherwise, this looks good to me; I'll try to give this a spin next
week.

Thanks,
Mark.


* [PATCH 1/2] arm64: mm: Enable CONT_SIZE aligned sections for 64k page kernels.
  2016-02-12 16:46           ` Ard Biesheuvel
@ 2016-02-12 17:32             ` Ard Biesheuvel
  0 siblings, 0 replies; 15+ messages in thread
From: Ard Biesheuvel @ 2016-02-12 17:32 UTC (permalink / raw)
  To: linux-arm-kernel

On 12 February 2016 at 17:46, Ard Biesheuvel <ard.biesheuvel@linaro.org> wrote:
> On 12 February 2016 at 17:43, Jeremy Linton <jeremy.linton@arm.com> wrote:
>> On 02/12/2016 10:28 AM, Ard Biesheuvel wrote:
>>>
>>> On 12 February 2016 at 17:21, Jeremy Linton <jeremy.linton@arm.com> wrote:
>>>>
>>>> On 02/12/2016 10:11 AM, Ard Biesheuvel wrote:
>>>>>
>>>>>
>>>>> On 12 February 2016 at 17:06, Jeremy Linton <jeremy.linton@arm.com>
>>>>> wrote:
>>>>
>>>>
>>>> (trimming)
>>>>>>
>>>>>>
>>>>>>    #if defined(CONFIG_DEBUG_ALIGN_RODATA)
>>>>>> -#define ALIGN_DEBUG_RO                 . = ALIGN(1<<SECTION_SHIFT);
>>>>>> -#define ALIGN_DEBUG_RO_MIN(min)                ALIGN_DEBUG_RO
>>>>>> +#if defined(CONFIG_ARM64_64K_PAGES)
>>>>>> +#define ALIGN_DEBUG_RO_MIN(min)                . = ALIGN(CONT_SIZE);
>>>>>> +#else
>>>>>> +#define ALIGN_DEBUG_RO_MIN(min)                . =
>>>>>> ALIGN(SECTION_SIZE);
>>>>>
>>>>>
>>>>>
>>>>> Doesn't this align to 32 MB on 16k pages kernels?
>>>>
>>>>
>>>>
>>>> Yes, I considered whether it was more appropriate to use CONT_SIZE for
>>>> 16k
>>>> as well.
>>>>
>>>> Opinions?
>>>>
>>>
>>> Looking at vmlinux.lds.S, I see that that would put _stext and
>>> __init_begin at 32 MB aligned boundaries. making the size of the
>>> kernel at least 64 MB. If I take your .rodata patch into account,
>>> which adds a third instance of ALIGN_DEBUG_RO_MIN, the Image footprint
>>> will rise to ~100 MB. Or am I missing something?
>>>
>>
>> No, I think your correct. But, its an option, and it sort of depends on use
>> case. In a system with 100+GB of RAM it might be useful. Not so much on a
>> phone or small embedded system. I don't really see those people enabling
>> ALIGN_RODATA anyway. Worse, I expect the loss of RAM efficiently going from
>> 4k-16k pages in a RAM constrained system to be a pretty big hit too.
>>
>> I don't have any hard data one way or the other, and I don't have a strong
>> opinion. Although, I suspect at the moment the potential users of 16k pages
>> may tend toward the small side.
>>
>
> Well, the thing is, if you only use 2 MB of each 32 MB on average, you
> gain nothing by going from contiguous pages to proper sections, since
> the TLB footprint is the same. So my opinion would be to use CONT_SIZE
> for 16k pages as well.

Actually, none of this matters, since the physical alignment is not
guaranteed to be 32 MB, so anything beyond 2 MB is not really
meaningful, IMO.


* [PATCH 2/2] arm64: Mark kernel page ranges contiguous
  2016-02-12 16:57   ` Mark Rutland
@ 2016-02-12 17:35     ` Jeremy Linton
  2016-02-12 17:58       ` Mark Rutland
  0 siblings, 1 reply; 15+ messages in thread
From: Jeremy Linton @ 2016-02-12 17:35 UTC (permalink / raw)
  To: linux-arm-kernel

On 02/12/2016 10:57 AM, Mark Rutland wrote:
(trimming)
 > On Fri, Feb 12, 2016 at 10:06:48AM -0600, Jeremy Linton wrote:
>> +static void clear_cont_pte_range(pte_t *pte, unsigned long addr)
>> +{
>> +	int i;
>> +
>> +	pte -= CONT_RANGE_OFFSET(addr);
>> +	for (i = 0; i < CONT_PTES; i++) {
>> +		if (pte_cont(*pte))
>> +			set_pte(pte, pte_mknoncont(*pte));
>> +		pte++;
>> +	}
>> +	flush_tlb_all();
>> +}
>
> As far as I can tell, "splitting" contiguous entries comes with the same
> caveats as splitting sections. In the absence of a BBM sequence we might
> end up with conflicting TLB entries.

As I mentioned a couple of weeks ago, I'm not sure that inverting BBM into
a full "make a partial copy of the whole table -> break by switching TTBR to
the copy" sequence is so bad, if the copy process keeps references to the
original table entries that aren't on the modification path. It might even
work with all the CPUs spun up, because the break sequence would just be
IPIs to the remaining CPUs telling them to replace their TTBR and flush.
I think you mentioned the ugly part is arbitrating access to the update
functionality (and all the implied rules about when it can be done), but
doing it that way doesn't require stalling the CPUs during the "make
partial copy" portion.

> However, I think we're OK for now.
>
> The way we consistently map/unmap/modify image/linear "chunks" should
> prevent us from trying to split those, and if/when we do this for the
> EFI runtime page tables thy aren't live.
>
> It would be good to figure out how to get rid of the splitting entirely.

Well, we could hoist some of it earlier by taking the
create_mapping_late() calls and doing them earlier with RWX permissions,
and then applying RO, ROX or RW later as necessary.

Which is ugly, but it might solve particular late-splitting cases.


* [PATCH 2/2] arm64: Mark kernel page ranges contiguous
  2016-02-12 17:35     ` Jeremy Linton
@ 2016-02-12 17:58       ` Mark Rutland
  2016-02-12 18:09         ` Jeremy Linton
  0 siblings, 1 reply; 15+ messages in thread
From: Mark Rutland @ 2016-02-12 17:58 UTC (permalink / raw)
  To: linux-arm-kernel

On Fri, Feb 12, 2016 at 11:35:05AM -0600, Jeremy Linton wrote:
> On 02/12/2016 10:57 AM, Mark Rutland wrote:
> (trimming)
> > On Fri, Feb 12, 2016 at 10:06:48AM -0600, Jeremy Linton wrote:
> >>+static void clear_cont_pte_range(pte_t *pte, unsigned long addr)
> >>+{
> >>+	int i;
> >>+
> >>+	pte -= CONT_RANGE_OFFSET(addr);
> >>+	for (i = 0; i < CONT_PTES; i++) {
> >>+		if (pte_cont(*pte))
> >>+			set_pte(pte, pte_mknoncont(*pte));
> >>+		pte++;
> >>+	}
> >>+	flush_tlb_all();
> >>+}
> >
> >As far as I can tell, "splitting" contiguous entries comes with the same
> >caveats as splitting sections. In the absence of a BBM sequence we might
> >end up with conflicting TLB entries.
> 
> As I mentioned a couple weeks ago, I'm not sure that inverting a BBM
> to a full "make partial copy of the whole table->break TTBR to copy
> sequence" is so bad if the copy process maintains references to the
> original table entries when they aren't in the modification path. It
> might even work with all the CPU's spun up because the break
> sequence would just be IPI's to the remaining cpu's to replace their
> TTBR/flush with a new value. I think you mentioned the ugly part is
> arbitrating access to the update functionality (and all the implied
> rules of when it could be done). But doing it that way doesn't
> require stalling the CPU's during the "make partial copy" portion.

That may be true, and worthy of investigation.

One problem I envisaged with that is concurrent kernel pagetable
modification (e.g. vmalloc, DEBUG_PAGEALLOC). To handle that correctly
you require global serialization (or your copy may be stale), though as
you point out that doesn't mean stop-the-world entirely.

For the above, I was simply pointing out that in general,
splitting/fusing contiguous ranges comes with the same issues as
splitting/fusing sections, as that may not be immediately obvious.

> >However, I think we're OK for now.
> >
> >The way we consistently map/unmap/modify image/linear "chunks" should
> >prevent us from trying to split those, and if/when we do this for the
> >EFI runtime page tables thy aren't live.
> >
> >It would be good to figure out how to get rid of the splitting entirely.
> 
> Well we could hoist some of it earlier by taking the
> create_mapping_late() calls and doing them earlier with RWX
> permissions, and then applying the RO,ROX,RW later as necessarily.
> 
> Which is ugly, but it might solve particular late splitting cases.

I'm not sure I follow.

The aim was that after my changes we should only split/fuse for EFI page
tables, and only for !4K page kernels. See [1] for why. Avoiding that in
the EFI case is very painful, so for now we kept split_pud and
split_pmd.

All create_mapping_late() calls should be performed with the same
physical/virtual start/end as earlier "chunk" mappings, and thus should
never result in a fuse/split or translation change -- only permission
changes (which we believe do not result in TLB conflicts, or we'd need
to do far more work to fix those up).

If we split/fuse in any case other than EFI runtime table creation, that
is a bug that we need to fix. If you're seeing a case we do that, then
please let me know!

Thanks,
Mark.

[1] http://lists.infradead.org/pipermail/linux-arm-kernel/2016-January/398178.html


* [PATCH 2/2] arm64: Mark kernel page ranges contiguous
  2016-02-12 17:58       ` Mark Rutland
@ 2016-02-12 18:09         ` Jeremy Linton
  2016-02-12 18:11           ` Mark Rutland
  0 siblings, 1 reply; 15+ messages in thread
From: Jeremy Linton @ 2016-02-12 18:09 UTC (permalink / raw)
  To: linux-arm-kernel

On 02/12/2016 11:58 AM, Mark Rutland wrote:

(trimming)

> All create_mapping_late() calls should be performed with the same
> physical/virtual start/end as earlier "chunk" mappings, and thus should
> never result in a fuse/split or translation change -- only permission
> changes (which we believe do not result in TLB conflicts, or we'd need
> to do far more work to fix those up).
>
> If we split/fuse in any case other than EFI runtime table creation, that
> is a bug that we need to fix. If you're seeing a case we do that, then
> please let me know!

We are saying the same thing, and right now the biggest violator is 
probably the .rodata patch I just posted!


* [PATCH 2/2] arm64: Mark kernel page ranges contiguous
  2016-02-12 18:09         ` Jeremy Linton
@ 2016-02-12 18:11           ` Mark Rutland
  0 siblings, 0 replies; 15+ messages in thread
From: Mark Rutland @ 2016-02-12 18:11 UTC (permalink / raw)
  To: linux-arm-kernel

On Fri, Feb 12, 2016 at 12:09:41PM -0600, Jeremy Linton wrote:
> On 02/12/2016 11:58 AM, Mark Rutland wrote:
> 
> (trimming)
> 
> >All create_mapping_late() calls should be performed with the same
> >physical/virtual start/end as earlier "chunk" mappings, and thus should
> >never result in a fuse/split or translation change -- only permission
> >changes (which we believe do not result in TLB conflicts, or we'd need
> >to do far more work to fix those up).
> >
> >If we split/fuse in any case other than EFI runtime table creation, that
> >is a bug that we need to fix. If you're seeing a case we do that, then
> >please let me know!
> 
> We are saying the same thing, and right now the biggest violator is
> probably the .rodata patch I just posted!

Ok, phew!

The simple fix is to make .text and .rodata separate "chunks", then it
all falls out in the wash.

Mark.


* [PATCH 2/2] arm64: Mark kernel page ranges contiguous
  2016-02-12 16:06 ` [PATCH 2/2] arm64: Mark kernel page ranges contiguous Jeremy Linton
  2016-02-12 16:57   ` Mark Rutland
@ 2016-02-13 16:43   ` Ard Biesheuvel
  1 sibling, 0 replies; 15+ messages in thread
From: Ard Biesheuvel @ 2016-02-13 16:43 UTC (permalink / raw)
  To: linux-arm-kernel

Hi Jeremy,

On 12 February 2016 at 17:06, Jeremy Linton <jeremy.linton@arm.com> wrote:
> With 64k pages, the next larger segment size is 512M. The linux
> kernel also uses different protection flags to cover its code and data.
> Because of this requirement, the vast majority of the kernel code and
> data structures end up being mapped with 64k pages instead of the larger
> pages common with a 4k page kernel.
>
> Recent ARM processors support a contiguous bit in the
> page tables which allows the a TLB to cover a range larger than a
> single PTE if that range is mapped into physically contiguous
> ram.
>
> So, for the kernel its a good idea to set this flag. Some basic
> micro benchmarks show it can significantly reduce the number of
> L1 dTLB refills.
>
> Signed-off-by: Jeremy Linton <jeremy.linton@arm.com>

AFAICT, extending this patch to implement contiguous PMDs for 16 KB
granule kernels should be fairly straightforward, right? Level 2
contiguous block size on 16 KB is 1 GB, which would be useful for the
linear mapping.

> ---
>  arch/arm64/mm/mmu.c | 64 ++++++++++++++++++++++++++++++++++++++++++++++++-----
>  1 file changed, 58 insertions(+), 6 deletions(-)
>
> diff --git a/arch/arm64/mm/mmu.c b/arch/arm64/mm/mmu.c
> index 7711554..ab69a99 100644
> --- a/arch/arm64/mm/mmu.c
> +++ b/arch/arm64/mm/mmu.c
> @@ -1,3 +1,4 @@
> +
>  /*
>   * Based on arch/arm/mm/mmu.c
>   *
> @@ -103,17 +104,49 @@ static void split_pmd(pmd_t *pmd, pte_t *pte)
>                  * Need to have the least restrictive permissions available
>                  * permissions will be fixed up later
>                  */
> -               set_pte(pte, pfn_pte(pfn, PAGE_KERNEL_EXEC));
> +               set_pte(pte, pfn_pte(pfn, PAGE_KERNEL_EXEC_CONT));
>                 pfn++;
>         } while (pte++, i++, i < PTRS_PER_PTE);
>  }
>
> +static void clear_cont_pte_range(pte_t *pte, unsigned long addr)
> +{
> +       int i;
> +
> +       pte -= CONT_RANGE_OFFSET(addr);
> +       for (i = 0; i < CONT_PTES; i++) {
> +               if (pte_cont(*pte))
> +                       set_pte(pte, pte_mknoncont(*pte));
> +               pte++;
> +       }
> +       flush_tlb_all();
> +}
> +
> +/*
> + * Given a range of PTEs set the pfn and provided page protection flags
> + */
> +static void __populate_init_pte(pte_t *pte, unsigned long addr,
> +                              unsigned long end, phys_addr_t phys,
> +                              pgprot_t prot)
> +{
> +       unsigned long pfn = __phys_to_pfn(phys);
> +
> +       do {
> +               /* clear all the bits except the pfn, then apply the prot */
> +               set_pte(pte, pfn_pte(pfn, prot));
> +               pte++;
> +               pfn++;
> +               addr += PAGE_SIZE;
> +       } while (addr != end);
> +}
> +
>  static void alloc_init_pte(pmd_t *pmd, unsigned long addr,
> -                                 unsigned long end, unsigned long pfn,
> +                                 unsigned long end, phys_addr_t phys,
>                                   pgprot_t prot,
>                                   phys_addr_t (*pgtable_alloc)(void))
>  {
>         pte_t *pte;
> +       unsigned long next;
>
>         if (pmd_none(*pmd) || pmd_sect(*pmd)) {
>                 phys_addr_t pte_phys = pgtable_alloc();
> @@ -127,10 +160,29 @@ static void alloc_init_pte(pmd_t *pmd, unsigned long addr,
>         BUG_ON(pmd_bad(*pmd));
>
>         pte = pte_set_fixmap_offset(pmd, addr);
> +
>         do {
> -               set_pte(pte, pfn_pte(pfn, prot));
> -               pfn++;
> -       } while (pte++, addr += PAGE_SIZE, addr != end);
> +               next = min(end, (addr + CONT_SIZE) & CONT_MASK);
> +               if (((addr | next | phys) & ~CONT_MASK) == 0) {
> +                       /* a block of CONT_PTES  */
> +                       __populate_init_pte(pte, addr, next, phys,
> +                                           prot | __pgprot(PTE_CONT));
> +               } else {
> +                       /*
> +                        * If the range being split is already inside of a
> +                        * contiguous range but this PTE isn't going to be
> +                        * contiguous, then we want to unmark the adjacent
> +                        * ranges, then update the portion of the range we
> +                        * are interrested in.
> +                        */
> +                       clear_cont_pte_range(pte, addr);
> +                       __populate_init_pte(pte, addr, next, phys, prot);
> +               }
> +
> +               pte += (next - addr) >> PAGE_SHIFT;
> +               phys += next - addr;
> +               addr = next;
> +       } while (addr != end);
>
>         pte_clear_fixmap();
>  }
> @@ -194,7 +246,7 @@ static void alloc_init_pmd(pud_t *pud, unsigned long addr, unsigned long end,
>                                 }
>                         }
>                 } else {
> -                       alloc_init_pte(pmd, addr, next, __phys_to_pfn(phys),
> +                       alloc_init_pte(pmd, addr, next, phys,
>                                        prot, pgtable_alloc);
>                 }
>                 phys += next - addr;
> --
> 2.4.3
>


end of thread (newest message: 2016-02-13 16:43 UTC)

Thread overview: 15+ messages
2016-02-12 16:06 [PATCH 0/2] flag contiguous PTEs in linear mapping Jeremy Linton
2016-02-12 16:06 ` [PATCH 1/2] arm64: mm: Enable CONT_SIZE aligned sections for 64k page kernels Jeremy Linton
2016-02-12 16:11   ` Ard Biesheuvel
2016-02-12 16:21     ` Jeremy Linton
2016-02-12 16:28       ` Ard Biesheuvel
2016-02-12 16:43         ` Jeremy Linton
2016-02-12 16:46           ` Ard Biesheuvel
2016-02-12 17:32             ` Ard Biesheuvel
2016-02-12 16:06 ` [PATCH 2/2] arm64: Mark kernel page ranges contiguous Jeremy Linton
2016-02-12 16:57   ` Mark Rutland
2016-02-12 17:35     ` Jeremy Linton
2016-02-12 17:58       ` Mark Rutland
2016-02-12 18:09         ` Jeremy Linton
2016-02-12 18:11           ` Mark Rutland
2016-02-13 16:43   ` Ard Biesheuvel
