linux-arm-kernel.lists.infradead.org archive mirror
* kernel_page_tables issue with CONFIG_ARM64_USER_VA_BITS_52=y
@ 2019-02-01  4:01 Qian Cai
  2019-02-01 13:48 ` Will Deacon
  0 siblings, 1 reply; 8+ messages in thread
From: Qian Cai @ 2019-02-01  4:01 UTC
  To: steve.capper; +Cc: Catalin Marinas, Will Deacon, Linux ARM

On this ThunderX2 server with both of the following enabled:
CONFIG_ARM64_USER_VA_BITS_52=y
CONFIG_ARM64_PTDUMP_DEBUGFS=y

kernel_page_tables prints only the linear mapping; every other region is empty.

# cat /sys/kernel/debug/kernel_page_tables
---[ Kasan shadow start ]---
---[ Kasan shadow end ]---
---[ Modules start ]---
---[ Modules end ]---
---[ vmalloc() area ]---
---[ vmalloc() end ]---
---[ Fixmap start ]---
---[ Fixmap end ]---
---[ PCI I/O start ]---
---[ PCI I/O end ]---
---[ vmemmap start ]---
---[ vmemmap end ]---
---[ Linear mapping ]---
0x000e000000000000-0x000e040001000000     4194320M PTE       ro NX SHD AF UXN MEM/NORMAL
0x000e0400011a0000-0x000e040001270000         832K PTE       RW NX SHD AF UXN MEM/NORMAL
0x000e040001280000-0x000e0400012e0000         384K PTE       RW NX SHD AF UXN MEM/NORMAL
...
0x000e809718f10000-0x000e809718f30000         128K PTE       RW NX SHD AF UXN MEM/NORMAL
0x000e809718f30000-0x000e809718f40000          64K PTE F     RW NX SHD AF UXN MEM/NORMAL
0x000e809718f40000-0x000e80977d000000     1639168K PTE       RW NX SHD AF UXN MEM/NORMAL

With CONFIG_ARM64_VA_BITS_48=y instead, every region is populated as expected:

---[ Kasan shadow start ]---
0xffff000000000000-0xffff040001000000     4194320M PTE       ro NX SHD AF UXN MEM/NORMAL
0xffff0400011a0000-0xffff040001250000         704K PTE       RW NX SHD AF UXN MEM/NORMAL
0xffff040001270000-0xffff0400012e0000         448K PTE       RW NX SHD AF UXN MEM/NORMAL
...
0xffff100000050000-0xffff10000fe00000      259776K PTE       RW NX SHD AF UXN MEM/NORMAL
0xffff100100000000-0xffff1001f0000000        3840M PTE       RW NX SHD AF UXN MEM/NORMAL
0xffff1010f0000000-0xffff1012efa00000        8186M PTE       RW NX SHD AF UXN MEM/NORMAL
---[ Kasan shadow end ]---
---[ Modules start ]---
0xffff200008d00000-0xffff200008d10000          64K PTE       ro x  SHD AF UXN MEM/NORMAL
0xffff200008d10000-0xffff200008d20000          64K PTE       ro NX SHD AF UXN MEM/NORMAL
0xffff200008d20000-0xffff200008d40000         128K PTE       RW NX SHD AF UXN MEM/NORMAL
...
0xffff200009780000-0xffff200009790000          64K PTE       ro x  SHD AF UXN MEM/NORMAL
0xffff200009790000-0xffff2000097a0000          64K PTE       ro NX SHD AF UXN MEM/NORMAL
0xffff2000097a0000-0xffff2000097c0000         128K PTE       RW NX SHD AF UXN MEM/NORMAL
---[ Modules end ]---
---[ vmalloc() area ]---
0xffff200010000000-0xffff200010010000          64K PTE       RW NX SHD AF UXN DEVICE/nGnRE
0xffff200010020000-0xffff200010040000         128K PTE       RW NX SHD AF UXN DEVICE/nGnRE
0xffff200010050000-0xffff200010060000          64K PTE       ro NX SHD AF UXN MEM/NORMAL
...
0xffff200030000000-0xffff200038000000         128M PTE       RW NX SHD AF UXN DEVICE/nGnRnE
0xffff7bd3ff5e0000-0xffff7bd401fe0000          42M PTE       RW NX SHD AF UXN MEM/NORMAL
0xffff7bdffd5f0000-0xffff7bdfffff0000          42M PTE       RW NX SHD AF UXN MEM/NORMAL
---[ vmalloc() end ]---
---[ Fixmap start ]---
0xffff7fdffe7f0000-0xffff7fdffe800000          64K PTE       RW NX SHD AF UXN DEVICE/nGnRE
0xffff7fdffe800000-0xffff7fdffe810000          64K PTE       ro NX SHD AF UXN MEM/NORMAL
---[ Fixmap end ]---
---[ PCI I/O start ]---
---[ PCI I/O end ]---
---[ vmemmap start ]---
0xffff7fe000000000-0xffff7fe000200000           2M PTE       RW NX SHD AF UXN MEM/NORMAL
0xffff7fe002000000-0xffff7fe003e00000          30M PTE       RW NX SHD AF UXN MEM/NORMAL
0xffff7fe021e00000-0xffff7fe025e00000          64M PTE       RW NX SHD AF UXN MEM/NORMAL
---[ vmemmap end ]---
---[ Linear mapping ]---
0xffff800000310000-0xffff800000480000        1472K PTE F     RW NX SHD AF UXN MEM/NORMAL
0xffff800000480000-0xffff800001480000          16M PTE       ro NX SHD AF UXN MEM/NORMAL
0xffff800001480000-0xffff800002250000       14144K PTE F     RW NX SHD AF UXN MEM/NORMAL
...
0xffff8097189e0000-0xffff809718b20000        1280K PTE       RW NX SHD AF UXN MEM/NORMAL
0xffff809718b20000-0xffff809718b30000          64K PTE F     RW NX SHD AF UXN MEM/NORMAL
0xffff809718b30000-0xffff80977d000000     1643328K PTE       RW NX SHD AF UXN MEM/NORMAL

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel


* Re: kernel_page_tables issue with CONFIG_ARM64_USER_VA_BITS_52=y
  2019-02-01  4:01 kernel_page_tables issue with CONFIG_ARM64_USER_VA_BITS_52=y Qian Cai
@ 2019-02-01 13:48 ` Will Deacon
  2019-02-01 18:53   ` Steve Capper
  2019-02-01 20:49   ` Qian Cai
  0 siblings, 2 replies; 8+ messages in thread
From: Will Deacon @ 2019-02-01 13:48 UTC
  To: Qian Cai; +Cc: Catalin Marinas, Linux ARM, steve.capper

Hi Qian Cai,

Thanks for reporting this.

On Thu, Jan 31, 2019 at 11:01:50PM -0500, Qian Cai wrote:
> On this ThunderX2 server with both,
> CONFIG_ARM64_USER_VA_BITS_52=y
> CONFIG_ARM64_PTDUMP_DEBUGFS=y
> 
> kernel_page_tables could only print out linear mappings.

I think this is because the ptdump code is abusing both PTRS_PER_PGD and
pgd_offset(mm, 0UL) for walking kernel (TTBR1) page tables. That's not going
to work, because the kernel page tables are still configured as 48-bit.

Please can you try the diff below? (Steve -- can you have a look as well,
please?).

Thanks,

Will

--->8

diff --git a/arch/arm64/mm/dump.c b/arch/arm64/mm/dump.c
index fcb1f2a6d7c6..00ca2adeb5ef 100644
--- a/arch/arm64/mm/dump.c
+++ b/arch/arm64/mm/dump.c
@@ -286,74 +286,72 @@ static void note_page(struct pg_state *st, unsigned long addr, unsigned level,
 
 }
 
-static void walk_pte(struct pg_state *st, pmd_t *pmdp, unsigned long start)
+static void walk_pte(struct pg_state *st, pmd_t *pmdp, unsigned long start,
+		     unsigned long end)
 {
-	pte_t *ptep = pte_offset_kernel(pmdp, 0UL);
-	unsigned long addr;
-	unsigned i;
+	unsigned long addr = start;
+	pte_t *ptep = pte_offset_kernel(pmdp, start);
 
-	for (i = 0; i < PTRS_PER_PTE; i++, ptep++) {
-		addr = start + i * PAGE_SIZE;
+	do {
 		note_page(st, addr, 4, READ_ONCE(pte_val(*ptep)));
-	}
+	} while (ptep++, addr += PAGE_SIZE, addr != end);
 }
 
-static void walk_pmd(struct pg_state *st, pud_t *pudp, unsigned long start)
+static void walk_pmd(struct pg_state *st, pud_t *pudp, unsigned long start,
+		     unsigned long end)
 {
-	pmd_t *pmdp = pmd_offset(pudp, 0UL);
-	unsigned long addr;
-	unsigned i;
+	unsigned long next, addr = start;
+	pmd_t *pmdp = pmd_offset(pudp, start);
 
-	for (i = 0; i < PTRS_PER_PMD; i++, pmdp++) {
+	do {
 		pmd_t pmd = READ_ONCE(*pmdp);
+		next = pmd_addr_end(addr, end);
 
-		addr = start + i * PMD_SIZE;
 		if (pmd_none(pmd) || pmd_sect(pmd)) {
 			note_page(st, addr, 3, pmd_val(pmd));
 		} else {
 			BUG_ON(pmd_bad(pmd));
-			walk_pte(st, pmdp, addr);
+			walk_pte(st, pmdp, addr, next);
 		}
-	}
+	} while (pmdp++, addr = next, addr != end);
 }
 
-static void walk_pud(struct pg_state *st, pgd_t *pgdp, unsigned long start)
+static void walk_pud(struct pg_state *st, pgd_t *pgdp, unsigned long start,
+		     unsigned long end)
 {
-	pud_t *pudp = pud_offset(pgdp, 0UL);
-	unsigned long addr;
-	unsigned i;
+	unsigned long next, addr = start;
+	pud_t *pudp = pud_offset(pgdp, start);
 
-	for (i = 0; i < PTRS_PER_PUD; i++, pudp++) {
+	do {
 		pud_t pud = READ_ONCE(*pudp);
+		next = pud_addr_end(addr, end);
 
-		addr = start + i * PUD_SIZE;
 		if (pud_none(pud) || pud_sect(pud)) {
 			note_page(st, addr, 2, pud_val(pud));
 		} else {
 			BUG_ON(pud_bad(pud));
-			walk_pmd(st, pudp, addr);
+			walk_pmd(st, pudp, addr, next);
 		}
-	}
+	} while (pudp++, addr = next, addr != end);
 }
 
 static void walk_pgd(struct pg_state *st, struct mm_struct *mm,
 		     unsigned long start)
 {
-	pgd_t *pgdp = pgd_offset(mm, 0UL);
-	unsigned i;
-	unsigned long addr;
+	unsigned long next, addr = start;
+	pgd_t *pgdp = pgd_offset(mm, addr);
 
-	for (i = 0; i < PTRS_PER_PGD; i++, pgdp++) {
+	do {
 		pgd_t pgd = READ_ONCE(*pgdp);
+		next = pgd_addr_end(addr, 0);
 
-		addr = start + i * PGDIR_SIZE;
 		if (pgd_none(pgd)) {
 			note_page(st, addr, 1, pgd_val(pgd));
 		} else {
 			BUG_ON(pgd_bad(pgd));
-			walk_pud(st, pgdp, addr);
+			walk_pud(st, pgdp, addr, next);
 		}
-	}
+	} while (pgdp++, addr = next, addr);
 }
 
 void ptdump_walk_pgd(struct seq_file *m, struct ptdump_info *info)


* Re: kernel_page_tables issue with CONFIG_ARM64_USER_VA_BITS_52=y
  2019-02-01 13:48 ` Will Deacon
@ 2019-02-01 18:53   ` Steve Capper
  2019-02-01 20:49   ` Qian Cai
  1 sibling, 0 replies; 8+ messages in thread
From: Steve Capper @ 2019-02-01 18:53 UTC
  To: Will Deacon; +Cc: Catalin Marinas, nd, Qian Cai, Linux ARM

On Fri, Feb 01, 2019 at 01:48:47PM +0000, Will Deacon wrote:
> Hi Qian Cai,
> 
> Thanks for reporting this.
> 
> On Thu, Jan 31, 2019 at 11:01:50PM -0500, Qian Cai wrote:
> > On this ThunderX2 server with both,
> > CONFIG_ARM64_USER_VA_BITS_52=y
> > CONFIG_ARM64_PTDUMP_DEBUGFS=y
> > 
> > kernel_page_tables could only print out linear mappings.
> 
> I think this is because the ptdump code is abusing both PTRS_PER_PGD and
> pgd_offset(mm, 0UL) for walking kernel (TTBR1) page tables. That's not going
> to work, because those are still configured as 48-bit.
> 
> Please can you try the diff below? (Steve -- can you have a look as well,
> please?).
> 
> Thanks,
> 
> Will

Hi Will,
This patch looks good to me and worked well for my tests in a model.

FWIW:
Acked-by: Steve Capper <steve.capper@arm.com>
Tested-by: Steve Capper <steve.capper@arm.com>

Cheers,
-- 
Steve


* Re: kernel_page_tables issue with CONFIG_ARM64_USER_VA_BITS_52=y
  2019-02-01 13:48 ` Will Deacon
  2019-02-01 18:53   ` Steve Capper
@ 2019-02-01 20:49   ` Qian Cai
  2019-02-04 10:27     ` Steve Capper
  1 sibling, 1 reply; 8+ messages in thread
From: Qian Cai @ 2019-02-01 20:49 UTC
  To: Will Deacon; +Cc: Catalin Marinas, Linux ARM, steve.capper

On Fri, 2019-02-01 at 13:48 +0000, Will Deacon wrote:
> Hi Qian Cai,
> 
> Thanks for reporting this.
> 
> On Thu, Jan 31, 2019 at 11:01:50PM -0500, Qian Cai wrote:
> > On this ThunderX2 server with both,
> > CONFIG_ARM64_USER_VA_BITS_52=y
> > CONFIG_ARM64_PTDUMP_DEBUGFS=y
> > 
> > kernel_page_tables could only print out linear mappings.
> 
> I think this is because the ptdump code is abusing both PTRS_PER_PGD and
> pgd_offset(mm, 0UL) for walking kernel (TTBR1) page tables. That's not going
> to work, because those are still configured as 48-bit.
> 
> Please can you try the diff below? (Steve -- can you have a look as well,
> please?).
> 
> Thanks,
> 
> Will
> 
> --->8
> 
> diff --git a/arch/arm64/mm/dump.c b/arch/arm64/mm/dump.c
> index fcb1f2a6d7c6..00ca2adeb5ef 100644
> --- a/arch/arm64/mm/dump.c
> +++ b/arch/arm64/mm/dump.c
> @@ -286,74 +286,72 @@ static void note_page(struct pg_state *st, unsigned long addr, unsigned level,
>  
>  }
>  
> -static void walk_pte(struct pg_state *st, pmd_t *pmdp, unsigned long start)
> +static void walk_pte(struct pg_state *st, pmd_t *pmdp, unsigned long start,
> +		     unsigned long end)
>  {
> -	pte_t *ptep = pte_offset_kernel(pmdp, 0UL);
> -	unsigned long addr;
> -	unsigned i;
> +	unsigned long addr = start;
> +	pte_t *ptep = pte_offset_kernel(pmdp, start);
>  
> -	for (i = 0; i < PTRS_PER_PTE; i++, ptep++) {
> -		addr = start + i * PAGE_SIZE;
> +	do {
>  		note_page(st, addr, 4, READ_ONCE(pte_val(*ptep)));
> -	}
> +	} while (ptep++, addr += PAGE_SIZE, addr != end);
>  }
>  
> -static void walk_pmd(struct pg_state *st, pud_t *pudp, unsigned long start)
> +static void walk_pmd(struct pg_state *st, pud_t *pudp, unsigned long start,
> +		     unsigned long end)
>  {
> -	pmd_t *pmdp = pmd_offset(pudp, 0UL);
> -	unsigned long addr;
> -	unsigned i;
> +	unsigned long next, addr = start;
> +	pmd_t *pmdp = pmd_offset(pudp, start);
>  
> -	for (i = 0; i < PTRS_PER_PMD; i++, pmdp++) {
> +	do {
>  		pmd_t pmd = READ_ONCE(*pmdp);
> +		next = pmd_addr_end(addr, end);
>  
> -		addr = start + i * PMD_SIZE;
>  		if (pmd_none(pmd) || pmd_sect(pmd)) {
>  			note_page(st, addr, 3, pmd_val(pmd));
>  		} else {
>  			BUG_ON(pmd_bad(pmd));
> -			walk_pte(st, pmdp, addr);
> +			walk_pte(st, pmdp, addr, next);
>  		}
> -	}
> +	} while (pmdp++, addr = next, addr != end);
>  }
>  
> -static void walk_pud(struct pg_state *st, pgd_t *pgdp, unsigned long start)
> +static void walk_pud(struct pg_state *st, pgd_t *pgdp, unsigned long start,
> +		     unsigned long end)
>  {
> -	pud_t *pudp = pud_offset(pgdp, 0UL);
> -	unsigned long addr;
> -	unsigned i;
> +	unsigned long next, addr = start;
> +	pud_t *pudp = pud_offset(pgdp, start);
>  
> -	for (i = 0; i < PTRS_PER_PUD; i++, pudp++) {
> +	do {
>  		pud_t pud = READ_ONCE(*pudp);
> +		next = pud_addr_end(addr, end);
>  
> -		addr = start + i * PUD_SIZE;
>  		if (pud_none(pud) || pud_sect(pud)) {
>  			note_page(st, addr, 2, pud_val(pud));
>  		} else {
>  			BUG_ON(pud_bad(pud));
> -			walk_pmd(st, pudp, addr);
> +			walk_pmd(st, pudp, addr, next);
>  		}
> -	}
> +	} while (pudp++, addr = next, addr != end);
>  }
>  
>  static void walk_pgd(struct pg_state *st, struct mm_struct *mm,
>  		     unsigned long start)
>  {
> -	pgd_t *pgdp = pgd_offset(mm, 0UL);
> -	unsigned i;
> -	unsigned long addr;
> +	unsigned long next, addr = start;
> +	pgd_t *pgdp = pgd_offset(mm, addr);
>  
> -	for (i = 0; i < PTRS_PER_PGD; i++, pgdp++) {
> +	do {
>  		pgd_t pgd = READ_ONCE(*pgdp);
> +		next = pgd_addr_end(addr, 0);
>  
> -		addr = start + i * PGDIR_SIZE;
>  		if (pgd_none(pgd)) {
>  			note_page(st, addr, 1, pgd_val(pgd));
>  		} else {
>  			BUG_ON(pgd_bad(pgd));
> -			walk_pud(st, pgdp, addr);
> +			walk_pud(st, pgdp, addr, next);
>  		}
> -	}
> +	} while (pgdp++, addr = next, addr);
>  }
>  
>  void ptdump_walk_pgd(struct seq_file *m, struct ptdump_info *info)

Unfortunately, although this takes care of kernel_page_tables, it breaks
efi_page_tables.

# cat /sys/kernel/debug/efi_page_tables
BUG()

# ./scripts/faddr2line vmlinux walk_pgd+0x84/0x254
walk_pgd+0x84/0x254:
__read_once_size at include/linux/compiler.h:193 (discriminator 1)
(inlined by) walk_pgd at arch/arm64/mm/dump.c:345 (discriminator 1)

344:        do {
345:                pgd_t pgd = READ_ONCE(*pgdp);
346:                next = pgd_addr_end(addr, 0);

kernel BUG at arch/arm64/mm/dump.c:332!

                } else {
                        BUG_ON(pud_bad(pud));
                        walk_pmd(st, pudp, addr, next);

[  181.397157] BUG: KASAN: slab-out-of-bounds in walk_pgd+0x84/0x254
[  181.403243] Read of size 8 at addr ffff808ba07a2000 by task cat/4046
[  181.409586] 
[  181.411073] CPU: 76 PID: 4046 Comm: cat Not tainted 5.0.0-rc4+ #11
[  181.428709] Call trace:
[  181.431150]  dump_backtrace+0x0/0x298
[  181.434805]  show_stack+0x24/0x30
[  181.438114]  dump_stack+0xb0/0xdc
[  181.441423]  print_address_description+0x64/0x2b0
[  181.446118]  kasan_report+0x150/0x1a4
[  181.449772]  __asan_report_load8_noabort+0x30/0x3c
[  181.454555]  walk_pgd+0x84/0x254
[  181.457774]  ptdump_walk_pgd+0xec/0x140
[  181.461602]  ptdump_show+0x40/0x50
[  181.464997]  seq_read+0x3f8/0xad0
[  181.468305]  full_proxy_read+0x9c/0xc0
[  181.472045]  __vfs_read+0xfc/0x4c8
[  181.475438]  vfs_read+0xec/0x208
[  181.478657]  ksys_read+0xd0/0x15c
[  181.481963]  __arm64_sys_read+0x84/0x94
[  181.485791]  el0_svc_handler+0x258/0x304
[  181.489705]  el0_svc+0x8/0xc
[  181.492576] 
[  181.494060] Allocated by task 1:
[  181.497281]  __kasan_kmalloc.isra.0.part.0+0x58/0x108
[  181.502324]  __kasan_kmalloc.isra.0+0x88/0xa4
[  181.506673]  kasan_slab_alloc+0x38/0x48
[  181.510500]  kmem_cache_alloc+0x2d0/0x430
[  181.514501]  pgd_alloc+0x24/0x2c
[  181.517723]  arm_enable_runtime_services+0x204/0x4a4
[  181.522679]  do_one_initcall+0x44c/0x9f4
[  181.526596]  kernel_init_freeable+0xde8/0xec8
[  181.530945]  kernel_init+0x18/0x134
[  181.534425]  ret_from_fork+0x10/0x18
[  181.537990] 
[  181.539473] Freed by task 0:
[  181.542344] (stack is not available)
[  181.545909] 
[  181.547393] The buggy address belongs to the object at ffff808ba07a0000
[  181.547393]  which belongs to the cache pgd_cache of size 8192
[  181.559814] The buggy address is located 0 bytes to the right of
[  181.559814]  8192-byte region [ffff808ba07a0000, ffff808ba07a2000)
[  181.571973] The buggy address belongs to the page:
[  181.576757] page:ffff7fe022e81e00 count:1 mapcount:0 mapping:ffff80082001cc80 index:0xffff808ba079a000 compound_mapcount: 0
[  181.587879] flags: 0x17ffffc00010200(slab|head)
[  181.592404] raw: 017ffffc00010200 ffff7fe025740608 ffff7fe0256a6a08 ffff80082001cc80
[  181.600138] raw: ffff808ba079a000 0000000000150009 00000001ffffffff 0000000000000000
[  181.607870] page dumped because: kasan: bad access detected
[  181.613431] 
[  181.614913] Memory state around the buggy address:
[  181.619696]  ffff808ba07a1f00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[  181.626909]  ffff808ba07a1f80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[  181.634121] >ffff808ba07a2000: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
[  181.641333]                    ^
[  181.644552]  ffff808ba07a2080: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
[  181.651765]  ffff808ba07a2100: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
[  181.658976] ==================================================================
[  181.666187] Disabling lock debugging due to kernel taint
[  181.671535] ------------[ cut here ]------------
[  181.676143] kernel BUG at arch/arm64/mm/dump.c:332!
[  181.681012] Internal error: Oops - BUG: 0 [#1] SMP
[  181.685795] Modules linked in: iptable_filter thunderx2_pmu ip_tables ext4 mbcache jbd2 sd_mod ahci mlx5_core libahci libata dm_mod efivarfs
[  181.698407] CPU: 76 PID: 4046 Comm: cat Tainted: G    B             5.0.0-rc4+ #11
[  181.705966] Hardware name: To be filled by O.E.M. To be filled by O.E.M./To be filled by O.E.M., BIOS L50_5.13_1.0.8 11/07/2018
[  181.717431] pstate: 10400009 (nzcV daif +PAN -UAO)
[  181.722214] pc : walk_pgd+0xe0/0x254
[  181.725780] lr : walk_pgd+0x84/0x254
[  181.729345] sp : ffff8089ab5d7910
[  181.732649] x29: ffff8089ab5d7910 x28: ffff0400021fca25 
[  181.737953] x27: 0010000000000000 x26: ffff808ba0760000 
[  181.743256] x25: dfff200000000000 x24: ffff8089ab5d79e0 
[  181.748558] x23: 0000040000000000 x22: ffff200010fe5128 
[  181.753861] x21: 0010040000000000 x20: ffff808ba07a2000 
[  181.759163] x19: ffff808ba0740000 x18: 0000000000000000 
[  181.764466] x17: 0000000000000000 x16: 0000000000000000 
[  181.769768] x15: ffff80896ecdba40 x14: 3d3d3d3d3d3d3d3d 
[  181.775070] x13: 3d3d3d3d3d3d3d3d x12: 1fffe400026894df 
[  181.780373] x11: ffff0400026894df x10: 3d3d3d3d3d3d3d3d 
[  181.785676] x9 : dfff200000000000 x8 : 746e696174206c65 
[  181.790978] x7 : 0000000000000000 x6 : ffff20001021b2c0 
[  181.796280] x5 : 0000000000000000 x4 : 0000000000000080 
[  181.801582] x3 : 0000000000000000 x2 : 0000000000000000 
[  181.806885] x1 : 21a4741323e59b00 x0 : cccccccccccccccc 
[  181.812190] Process cat (pid: 4046, stack limit = 0x000000002e8086a5)
[  181.818619] Call trace:
[  181.821057]  walk_pgd+0xe0/0x254
[  181.824276]  ptdump_walk_pgd+0xec/0x140
[  181.828102]  ptdump_show+0x40/0x50
[  181.831495]  seq_read+0x3f8/0xad0
[  181.834800]  full_proxy_read+0x9c/0xc0
[  181.838540]  __vfs_read+0xfc/0x4c8
[  181.841932]  vfs_read+0xec/0x208
[  181.845150]  ksys_read+0xd0/0x15c
[  181.848455]  __arm64_sys_read+0x84/0x94
[  181.852282]  el0_svc_handler+0x258/0x304
[  181.856195]  el0_svc+0x8/0xc
[  181.859069] Code: d65f03c0 aa1503fb 17ffffe6 370800c0 (d4210000) 
[  181.865365] ---[ end trace 2a43332e785e8f92 ]---
[  181.869973] Kernel panic - not syncing: Fatal exception
[  181.875320] SMP: stopping secondary CPUs
[  181.879281] Kernel Offset: disabled
[  181.882767] CPU features: 0x002,20000c38
[  181.886684] Memory Limit: none
[  181.889976] ---[ end Kernel panic - not syncing: Fatal exception ]---


* Re: kernel_page_tables issue with CONFIG_ARM64_USER_VA_BITS_52=y
  2019-02-01 20:49   ` Qian Cai
@ 2019-02-04 10:27     ` Steve Capper
  2019-02-04 13:51       ` Ard Biesheuvel
  2019-02-04 13:57       ` Qian Cai
  0 siblings, 2 replies; 8+ messages in thread
From: Steve Capper @ 2019-02-04 10:27 UTC
  To: Qian Cai; +Cc: Catalin Marinas, nd, Will Deacon, Linux ARM

On Fri, Feb 01, 2019 at 03:49:02PM -0500, Qian Cai wrote:
> On Fri, 2019-02-01 at 13:48 +0000, Will Deacon wrote:
> > Hi Qian Cai,

[...]

> Unfortunately, although this takes care of kernel_page_tables, it breaks
> efi_page_tables.
> 
> # cat /sys/kernel/debug/efi_page_tables
> BUG()
> 
> # ./scripts/faddr2line vmlinux walk_pgd+0x84/0x254
> walk_pgd+0x84/0x254:
> __read_once_size at include/linux/compiler.h:193 (discriminator 1)
> (inlined by) walk_pgd at arch/arm64/mm/dump.c:345 (discriminator 1)
> 
> 344:        do {
> 345:                pgd_t pgd = READ_ONCE(*pgdp);
> 346:                next = pgd_addr_end(addr, 0);
> 
> kernel BUG at arch/arm64/mm/dump.c:332!
> 
>                 } else {
>                         BUG_ON(pud_bad(pud));
>                         walk_pmd(st, pudp, addr, next);
>

Hi Qian Cai,
Apologies for missing the efi_page_tables.

Does the following (applied to Will's fix) help?

Cheers,
-- 
Steve

--->8

diff --git a/arch/arm64/mm/dump.c b/arch/arm64/mm/dump.c
index 00ca2adeb5ef..42ac8fd15e59 100644
--- a/arch/arm64/mm/dump.c
+++ b/arch/arm64/mm/dump.c
@@ -339,11 +339,12 @@ static void walk_pgd(struct pg_state *st, struct mm_struct *mm,
 		     unsigned long start)
 {
 	unsigned long next, addr = start;
+	unsigned long end = (start < TASK_SIZE_64) ? TASK_SIZE_64 : 0;
 	pgd_t *pgdp = pgd_offset(mm, addr);
 
 	do {
 		pgd_t pgd = READ_ONCE(*pgdp);
-		next = pgd_addr_end(addr, 0);
+		next = pgd_addr_end(addr, end);
 
 		if (pgd_none(pgd)) {
 			note_page(st, addr, 1, pgd_val(pgd));
@@ -351,7 +352,7 @@ static void walk_pgd(struct pg_state *st, struct mm_struct *mm,
 			BUG_ON(pgd_bad(pgd));
 			walk_pud(st, pgdp, addr, next);
 		}
-	} while (pgdp++, addr = next, addr);
+	} while (pgdp++, addr = next, addr != end);
 }
 
 void ptdump_walk_pgd(struct seq_file *m, struct ptdump_info *info)
diff --git a/drivers/firmware/efi/arm-runtime.c b/drivers/firmware/efi/arm-runtime.c
index 23ea1ed409d1..16757966c3bf 100644
--- a/drivers/firmware/efi/arm-runtime.c
+++ b/drivers/firmware/efi/arm-runtime.c
@@ -38,7 +38,8 @@ static struct ptdump_info efi_ptdump_info = {
 	.mm		= &efi_mm,
 	.markers	= (struct addr_marker[]){
 		{ 0,		"UEFI runtime start" },
-		{ DEFAULT_MAP_WINDOW_64, "UEFI runtime end" }
+		{ DEFAULT_MAP_WINDOW_64, "UEFI runtime end" },
+		{ -1, NULL }
 	},
 	.base_addr	= 0,
 };


* Re: kernel_page_tables issue with CONFIG_ARM64_USER_VA_BITS_52=y
  2019-02-04 10:27     ` Steve Capper
@ 2019-02-04 13:51       ` Ard Biesheuvel
  2019-02-04 13:57       ` Qian Cai
  1 sibling, 0 replies; 8+ messages in thread
From: Ard Biesheuvel @ 2019-02-04 13:51 UTC
  To: Steve Capper; +Cc: Catalin Marinas, nd, Qian Cai, Linux ARM, Will Deacon

Hi Steve,

On Mon, 4 Feb 2019 at 11:27, Steve Capper <Steve.Capper@arm.com> wrote:
>
> On Fri, Feb 01, 2019 at 03:49:02PM -0500, Qian Cai wrote:
> > On Fri, 2019-02-01 at 13:48 +0000, Will Deacon wrote:
> > > Hi Qian Cai,
>
> [...]
>
> > Unfortunately, although this takes care of kernel_page_tables, it breaks
> > efi_page_tables.
> >
> > # cat /sys/kernel/debug/efi_page_tables
> > BUG()
> >
> > # ./scripts/faddr2line vmlinux walk_pgd+0x84/0x254
> > walk_pgd+0x84/0x254:
> > __read_once_size at include/linux/compiler.h:193 (discriminator 1)
> > (inlined by) walk_pgd at arch/arm64/mm/dump.c:345 (discriminator 1)
> >
> > 344:        do {
> > 345:                pgd_t pgd = READ_ONCE(*pgdp);
> > 346:                next = pgd_addr_end(addr, 0);
> >
> > kernel BUG at arch/arm64/mm/dump.c:332!
> >
> >                 } else {
> >                         BUG_ON(pud_bad(pud));
> >                         walk_pmd(st, pudp, addr, next);
> >
>
> Hi Qian Cai,
> Apologies for missing the efi_page_tables.
>
> Does the following (applied to Will's fix) help?
>
> Cheers,
> --
> Steve
>
> --->8
>
> diff --git a/arch/arm64/mm/dump.c b/arch/arm64/mm/dump.c
> index 00ca2adeb5ef..42ac8fd15e59 100644
> --- a/arch/arm64/mm/dump.c
> +++ b/arch/arm64/mm/dump.c
> @@ -339,11 +339,12 @@ static void walk_pgd(struct pg_state *st, struct mm_struct *mm,
>                      unsigned long start)
>  {
>         unsigned long next, addr = start;
> +       unsigned long end = (start < TASK_SIZE_64) ? TASK_SIZE_64 : 0;
>         pgd_t *pgdp = pgd_offset(mm, addr);
>
>         do {
>                 pgd_t pgd = READ_ONCE(*pgdp);
> -               next = pgd_addr_end(addr, 0);
> +               next = pgd_addr_end(addr, end);
>
>                 if (pgd_none(pgd)) {
>                         note_page(st, addr, 1, pgd_val(pgd));
> @@ -351,7 +352,7 @@ static void walk_pgd(struct pg_state *st, struct mm_struct *mm,
>                         BUG_ON(pgd_bad(pgd));
>                         walk_pud(st, pgdp, addr, next);
>                 }
> -       } while (pgdp++, addr = next, addr);
> +       } while (pgdp++, addr = next, addr != end);
>  }
>
>  void ptdump_walk_pgd(struct seq_file *m, struct ptdump_info *info)
> diff --git a/drivers/firmware/efi/arm-runtime.c b/drivers/firmware/efi/arm-runtime.c
> index 23ea1ed409d1..16757966c3bf 100644
> --- a/drivers/firmware/efi/arm-runtime.c
> +++ b/drivers/firmware/efi/arm-runtime.c
> @@ -38,7 +38,8 @@ static struct ptdump_info efi_ptdump_info = {
>         .mm             = &efi_mm,
>         .markers        = (struct addr_marker[]){
>                 { 0,            "UEFI runtime start" },
> -               { DEFAULT_MAP_WINDOW_64, "UEFI runtime end" }
> +               { DEFAULT_MAP_WINDOW_64, "UEFI runtime end" },
> +               { -1, NULL }
>         },
>         .base_addr      = 0,
>  };
>

FYI, this last hunk (the sentinel marker in arm-runtime.c) went into v5.0-rc5
as a fix (proposed by Qian).


* Re: kernel_page_tables issue with CONFIG_ARM64_USER_VA_BITS_52=y
  2019-02-04 10:27     ` Steve Capper
  2019-02-04 13:51       ` Ard Biesheuvel
@ 2019-02-04 13:57       ` Qian Cai
  2019-02-04 14:15         ` Will Deacon
  1 sibling, 1 reply; 8+ messages in thread
From: Qian Cai @ 2019-02-04 13:57 UTC
  To: Steve Capper; +Cc: Catalin Marinas, nd, Will Deacon, Linux ARM

On Mon, 2019-02-04 at 10:27 +0000, Steve Capper wrote:
> On Fri, Feb 01, 2019 at 03:49:02PM -0500, Qian Cai wrote:
> > On Fri, 2019-02-01 at 13:48 +0000, Will Deacon wrote:
> > > Hi Qian Cai,
> 
> [...]
> 
> > Unfortunately, although this takes care of kernel_page_tables, it breaks
> > efi_page_tables.
> > 
> > # cat /sys/kernel/debug/efi_page_tables
> > BUG()
> > 
> > # ./scripts/faddr2line vmlinux walk_pgd+0x84/0x254
> > walk_pgd+0x84/0x254:
> > __read_once_size at include/linux/compiler.h:193 (discriminator 1)
> > (inlined by) walk_pgd at arch/arm64/mm/dump.c:345 (discriminator 1)
> > 
> > 344:        do {
> > 345:                pgd_t pgd = READ_ONCE(*pgdp);
> > 346:                next = pgd_addr_end(addr, 0);
> > 
> > kernel BUG at arch/arm64/mm/dump.c:332!
> > 
> >                 } else {
> >                         BUG_ON(pud_bad(pud));
> >                         walk_pmd(st, pudp, addr, next);
> > 
> 
> Hi Qian Cai,
> Apologies for missing the efi_page_tables.
> 
> Does the following (applied to Will's fix) help?
> 

Yes, it works great!


* Re: kernel_page_tables issue with CONFIG_ARM64_USER_VA_BITS_52=y
  2019-02-04 13:57       ` Qian Cai
@ 2019-02-04 14:15         ` Will Deacon
  0 siblings, 0 replies; 8+ messages in thread
From: Will Deacon @ 2019-02-04 14:15 UTC
  To: Qian Cai; +Cc: Catalin Marinas, nd, Linux ARM, Steve Capper

On Mon, Feb 04, 2019 at 08:57:54AM -0500, Qian Cai wrote:
> On Mon, 2019-02-04 at 10:27 +0000, Steve Capper wrote:
> > On Fri, Feb 01, 2019 at 03:49:02PM -0500, Qian Cai wrote:
> > > On Fri, 2019-02-01 at 13:48 +0000, Will Deacon wrote:
> > > > Hi Qian Cai,
> > 
> > [...]
> > 
> > > Unfortunately, although this takes care of kernel_page_tables, it breaks
> > > efi_page_tables.
> > > 
> > > # cat /sys/kernel/debug/efi_page_tables
> > > BUG()
> > > 
> > > # ./scripts/faddr2line vmlinux walk_pgd+0x84/0x254
> > > walk_pgd+0x84/0x254:
> > > __read_once_size at include/linux/compiler.h:193 (discriminator 1)
> > > (inlined by) walk_pgd at arch/arm64/mm/dump.c:345 (discriminator 1)
> > > 
> > > 344:        do {
> > > 345:                pgd_t pgd = READ_ONCE(*pgdp);
> > > 346:                next = pgd_addr_end(addr, 0);
> > > 
> > > kernel BUG at arch/arm64/mm/dump.c:332!
> > > 
> > >                 } else {
> > >                         BUG_ON(pud_bad(pud));
> > >                         walk_pmd(st, pudp, addr, next);
> > > 
> > 
> > Hi Qian Cai,
> > Apologies for missing the efi_page_tables.
> > 
> > Does the following (applied to Will's fix) help?
> > 
> 
> Yes, it works great!

Thanks for testing this so quickly. I'll fold the change (minus the sentinel
entry) into my patch and queue it up as a fix.

Will

