linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH] arm64: mm: add missing PTE_SPECIAL in pte_mkdevmap on arm64
@ 2019-08-07  4:58 Jia He
  2019-08-07  9:09 ` Will Deacon
  2019-08-08  5:18 ` Anshuman Khandual
  0 siblings, 2 replies; 6+ messages in thread
From: Jia He @ 2019-08-07  4:58 UTC (permalink / raw)
  To: Catalin Marinas, Will Deacon, Mark Rutland, James Morse
  Cc: Christoffer Dall, Punit Agrawal, Qian Cai, Anshuman Khandual,
	Jun Yao, Alex Van Brunt, Robin Murphy, Thomas Gleixner,
	linux-arm-kernel, linux-kernel, Jia He

Without this patch, the MAP_SYNC test case will cause a print_bad_pte
warning on arm64 as follows:
[   25.542693] BUG: Bad page map in process mapdax333
pte:2e8000448800f53 pmd:41ff5f003
[   25.546360] page:ffff7e0010220000 refcount:1 mapcount:-1
mapping:ffff8003e29c7440 index:0x0
[   25.550281] ext4_dax_aops
[   25.550282] name:"__aaabbbcccddd__"
[   25.551553] flags: 0x3ffff0000001002(referenced|reserved)
[   25.555802] raw: 03ffff0000001002 ffff8003dfffa908 0000000000000000
ffff8003e29c7440
[   25.559446] raw: 0000000000000000 0000000000000000 00000001fffffffe
0000000000000000
[   25.563075] page dumped because: bad pte
[   25.564938] addr:0000ffffbe05b000 vm_flags:208000fb
anon_vma:0000000000000000 mapping:ffff8003e29c7440 index:0
[   25.574272] file:__aaabbbcccddd__ fault:ext4_dax_fault
mmmmap:ext4_file_mmap readpage:0x0
[   25.578799] CPU: 1 PID: 1180 Comm: mapdax333 Not tainted 5.2.0+ #21
[   25.581702] Hardware name: QEMU KVM Virtual Machine, BIOS 0.0.0
02/06/2015
[   25.585624] Call trace:
[   25.587008]  dump_backtrace+0x0/0x178
[   25.588799]  show_stack+0x24/0x30
[   25.590328]  dump_stack+0xa8/0xcc
[   25.591901]  print_bad_pte+0x18c/0x218
[   25.593628]  unmap_page_range+0x778/0xc00
[   25.595506]  unmap_single_vma+0x94/0xe8
[   25.597304]  unmap_vmas+0x90/0x108
[   25.598901]  unmap_region+0xc0/0x128
[   25.600566]  __do_munmap+0x284/0x3f0
[   25.602245]  __vm_munmap+0x78/0xe0
[   25.603820]  __arm64_sys_munmap+0x34/0x48
[   25.605709]  el0_svc_common.constprop.0+0x78/0x168
[   25.607956]  el0_svc_handler+0x34/0x90
[   25.609698]  el0_svc+0x8/0xc
[   25.611103] Disabling lock debugging due to kernel taint
[   25.613573] BUG: Bad page state in process mapdax333  pfn:448800
[   25.616359] page:ffff7e0010220000 refcount:0 mapcount:-1
mapping:ffff8003e29c7440 index:0x1
[   25.620236] ext4_dax_aops
[   25.620237] name:"__aaabbbcccddd__"
[   25.621495] flags: 0x3ffff0000000000()
[   25.624912] raw: 03ffff0000000000 dead000000000100 dead000000000200
ffff8003e29c7440
[   25.628502] raw: 0000000000000001 0000000000000000 00000000fffffffe
0000000000000000
[   25.632097] page dumped because: non-NULL mapping
[...]
[   25.656567] CPU: 1 PID: 1180 Comm: mapdax333 Tainted: G    B
5.2.0+ #21
[   25.660131] Hardware name: QEMU KVM Virtual Machine, BIOS 0.0.0
02/06/2015
[   25.663324] Call trace:
[   25.664466]  dump_backtrace+0x0/0x178
[   25.666163]  show_stack+0x24/0x30
[   25.667721]  dump_stack+0xa8/0xcc
[   25.669270]  bad_page+0xf0/0x150
[   25.670772]  free_pages_check_bad+0x84/0xa0
[   25.672724]  free_pcppages_bulk+0x45c/0x708
[   25.674675]  free_unref_page_commit+0xcc/0x100
[   25.676751]  free_unref_page_list+0x13c/0x200
[   25.678801]  release_pages+0x350/0x420
[   25.680539]  free_pages_and_swap_cache+0xf8/0x128
[   25.682738]  tlb_flush_mmu+0x164/0x2b0
[   25.684485]  unmap_page_range+0x648/0xc00
[   25.686349]  unmap_single_vma+0x94/0xe8
[   25.688131]  unmap_vmas+0x90/0x108
[   25.689739]  unmap_region+0xc0/0x128
[   25.691392]  __do_munmap+0x284/0x3f0
[   25.693079]  __vm_munmap+0x78/0xe0
[   25.694658]  __arm64_sys_munmap+0x34/0x48
[   25.696530]  el0_svc_common.constprop.0+0x78/0x168
[   25.698772]  el0_svc_handler+0x34/0x90
[   25.700512]  el0_svc+0x8/0xc

The root cause is in _vm_normal_page, without the PTE_SPECIAL bit,
the return value will be incorrectly set to pfn_to_page(pfn) instead
of NULL. Besides, this patch also rewrite the pmd_mkdevmap to avoid
setting PTE_SPECIAL for pmd

The MAP_SYNC test case is as follows(Provided by Yibo Cai)
$#include <stdio.h>
$#include <string.h>
$#include <unistd.h>
$#include <sys/file.h>
$#include <sys/mman.h>

$#ifndef MAP_SYNC
$#define MAP_SYNC 0x80000
$#endif

/* mount -o dax /dev/pmem0 /mnt */
$#define F "/mnt/__aaabbbcccddd__"

int main(void)
{
    int fd;
    char buf[4096];
    void *addr;

    if ((fd = open(F, O_CREAT|O_TRUNC|O_RDWR, 0644)) < 0) {
        perror("open1");
        return 1;
    }

    if (write(fd, buf, 4096) != 4096) {
        perror("lseek");
        return 1;
    }

    addr = mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_SHARED|MAP_SYNC,
fd, 0);
    if (addr == MAP_FAILED) {
        perror("mmap");
        printf("did you mount with '-o dax'?\n");
        return 1;
    }

    memset(addr, 0x55, 4096);

    if (munmap(addr, 4096) == -1) {
        perror("munmap");
        return 1;
    }

    close(fd);

    return 0;
}

Fixes: 73b20c84d42d ("arm64: mm: implement pte_devmap support")
Reported-by: Yibo Cai <Yibo.Cai@arm.com>
Signed-off-by: Jia He <justin.he@arm.com>
Acked-by: Robin Murphy <Robin.Murphy@arm.com>
---
 arch/arm64/include/asm/pgtable.h | 7 +++++--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/arch/arm64/include/asm/pgtable.h b/arch/arm64/include/asm/pgtable.h
index 5fdcfe237338..e09760ece844 100644
--- a/arch/arm64/include/asm/pgtable.h
+++ b/arch/arm64/include/asm/pgtable.h
@@ -209,7 +209,7 @@ static inline pmd_t pmd_mkcont(pmd_t pmd)
 
 static inline pte_t pte_mkdevmap(pte_t pte)
 {
-	return set_pte_bit(pte, __pgprot(PTE_DEVMAP));
+	return set_pte_bit(pte, __pgprot(PTE_DEVMAP | PTE_SPECIAL));
 }
 
 static inline void set_pte(pte_t *ptep, pte_t pte)
@@ -396,7 +396,10 @@ static inline int pmd_protnone(pmd_t pmd)
 #ifdef CONFIG_TRANSPARENT_HUGEPAGE
 #define pmd_devmap(pmd)		pte_devmap(pmd_pte(pmd))
 #endif
-#define pmd_mkdevmap(pmd)	pte_pmd(pte_mkdevmap(pmd_pte(pmd)))
+static inline pmd_t pmd_mkdevmap(pmd_t pmd)
+{
+	return pte_pmd(set_pte_bit(pmd_pte(pmd), __pgprot(PTE_DEVMAP)));
+}
 
 #define __pmd_to_phys(pmd)	__pte_to_phys(pmd_pte(pmd))
 #define __phys_to_pmd_val(phys)	__phys_to_pte_val(phys)
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 6+ messages in thread

* Re: [PATCH] arm64: mm: add missing PTE_SPECIAL in pte_mkdevmap on arm64
  2019-08-07  4:58 [PATCH] arm64: mm: add missing PTE_SPECIAL in pte_mkdevmap on arm64 Jia He
@ 2019-08-07  9:09 ` Will Deacon
  2019-08-07  9:24   ` Catalin Marinas
  2019-08-08  5:18 ` Anshuman Khandual
  1 sibling, 1 reply; 6+ messages in thread
From: Will Deacon @ 2019-08-07  9:09 UTC (permalink / raw)
  To: Jia He
  Cc: Catalin Marinas, Mark Rutland, James Morse, Christoffer Dall,
	Punit Agrawal, Qian Cai, Anshuman Khandual, Jun Yao,
	Alex Van Brunt, Robin Murphy, Thomas Gleixner, linux-arm-kernel,
	linux-kernel

On Wed, Aug 07, 2019 at 12:58:51PM +0800, Jia He wrote:
> diff --git a/arch/arm64/include/asm/pgtable.h b/arch/arm64/include/asm/pgtable.h
> index 5fdcfe237338..e09760ece844 100644
> --- a/arch/arm64/include/asm/pgtable.h
> +++ b/arch/arm64/include/asm/pgtable.h
> @@ -209,7 +209,7 @@ static inline pmd_t pmd_mkcont(pmd_t pmd)
>  
>  static inline pte_t pte_mkdevmap(pte_t pte)
>  {
> -	return set_pte_bit(pte, __pgprot(PTE_DEVMAP));
> +	return set_pte_bit(pte, __pgprot(PTE_DEVMAP | PTE_SPECIAL));
>  }
>  
>  static inline void set_pte(pte_t *ptep, pte_t pte)
> @@ -396,7 +396,10 @@ static inline int pmd_protnone(pmd_t pmd)
>  #ifdef CONFIG_TRANSPARENT_HUGEPAGE
>  #define pmd_devmap(pmd)		pte_devmap(pmd_pte(pmd))
>  #endif
> -#define pmd_mkdevmap(pmd)	pte_pmd(pte_mkdevmap(pmd_pte(pmd)))
> +static inline pmd_t pmd_mkdevmap(pmd_t pmd)
> +{
> +	return pte_pmd(set_pte_bit(pmd_pte(pmd), __pgprot(PTE_DEVMAP)));
> +}
>  
>  #define __pmd_to_phys(pmd)	__pte_to_phys(pmd_pte(pmd))
>  #define __phys_to_pmd_val(phys)	__phys_to_pte_val(phys)

Acked-by: Will Deacon <will@kernel.org>

I think Catalin can take this as a fix, although the commit message should
probably be trimmed down a bit to remove the two call traces etc.

Will

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH] arm64: mm: add missing PTE_SPECIAL in pte_mkdevmap on arm64
  2019-08-07  9:09 ` Will Deacon
@ 2019-08-07  9:24   ` Catalin Marinas
  0 siblings, 0 replies; 6+ messages in thread
From: Catalin Marinas @ 2019-08-07  9:24 UTC (permalink / raw)
  To: Will Deacon
  Cc: Jia He, Mark Rutland, James Morse, Christoffer Dall,
	Punit Agrawal, Qian Cai, Anshuman Khandual, Jun Yao,
	Alex Van Brunt, Robin Murphy, Thomas Gleixner, linux-arm-kernel,
	linux-kernel

On Wed, Aug 07, 2019 at 10:09:29AM +0100, Will Deacon wrote:
> On Wed, Aug 07, 2019 at 12:58:51PM +0800, Jia He wrote:
> > diff --git a/arch/arm64/include/asm/pgtable.h b/arch/arm64/include/asm/pgtable.h
> > index 5fdcfe237338..e09760ece844 100644
> > --- a/arch/arm64/include/asm/pgtable.h
> > +++ b/arch/arm64/include/asm/pgtable.h
> > @@ -209,7 +209,7 @@ static inline pmd_t pmd_mkcont(pmd_t pmd)
> >  
> >  static inline pte_t pte_mkdevmap(pte_t pte)
> >  {
> > -	return set_pte_bit(pte, __pgprot(PTE_DEVMAP));
> > +	return set_pte_bit(pte, __pgprot(PTE_DEVMAP | PTE_SPECIAL));
> >  }
> >  
> >  static inline void set_pte(pte_t *ptep, pte_t pte)
> > @@ -396,7 +396,10 @@ static inline int pmd_protnone(pmd_t pmd)
> >  #ifdef CONFIG_TRANSPARENT_HUGEPAGE
> >  #define pmd_devmap(pmd)		pte_devmap(pmd_pte(pmd))
> >  #endif
> > -#define pmd_mkdevmap(pmd)	pte_pmd(pte_mkdevmap(pmd_pte(pmd)))
> > +static inline pmd_t pmd_mkdevmap(pmd_t pmd)
> > +{
> > +	return pte_pmd(set_pte_bit(pmd_pte(pmd), __pgprot(PTE_DEVMAP)));
> > +}
> >  
> >  #define __pmd_to_phys(pmd)	__pte_to_phys(pmd_pte(pmd))
> >  #define __phys_to_pmd_val(phys)	__phys_to_pte_val(phys)
> 
> Acked-by: Will Deacon <will@kernel.org>
> 
> I think Catalin can take this as a fix, although the commit message should
> probably be trimmed down a bit to remove the two call traces etc.

I'll queue this for -rc4 and sort out the commit message. Thanks.

-- 
Catalin

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH] arm64: mm: add missing PTE_SPECIAL in pte_mkdevmap on arm64
  2019-08-07  4:58 [PATCH] arm64: mm: add missing PTE_SPECIAL in pte_mkdevmap on arm64 Jia He
  2019-08-07  9:09 ` Will Deacon
@ 2019-08-08  5:18 ` Anshuman Khandual
  2019-08-08  6:20   ` Justin He (Arm Technology China)
  1 sibling, 1 reply; 6+ messages in thread
From: Anshuman Khandual @ 2019-08-08  5:18 UTC (permalink / raw)
  To: Jia He, Catalin Marinas, Will Deacon, Mark Rutland, James Morse
  Cc: Christoffer Dall, Punit Agrawal, Qian Cai, Jun Yao,
	Alex Van Brunt, Robin Murphy, Thomas Gleixner, linux-arm-kernel,
	linux-kernel



On 08/07/2019 10:28 AM, Jia He wrote:
> Without this patch, the MAP_SYNC test case will cause a print_bad_pte
> warning on arm64 as follows:
> [   25.542693] BUG: Bad page map in process mapdax333
> pte:2e8000448800f53 pmd:41ff5f003
> [   25.546360] page:ffff7e0010220000 refcount:1 mapcount:-1
> mapping:ffff8003e29c7440 index:0x0
> [   25.550281] ext4_dax_aops
> [   25.550282] name:"__aaabbbcccddd__"
> [   25.551553] flags: 0x3ffff0000001002(referenced|reserved)
> [   25.555802] raw: 03ffff0000001002 ffff8003dfffa908 0000000000000000
> ffff8003e29c7440
> [   25.559446] raw: 0000000000000000 0000000000000000 00000001fffffffe
> 0000000000000000
> [   25.563075] page dumped because: bad pte
> [   25.564938] addr:0000ffffbe05b000 vm_flags:208000fb
> anon_vma:0000000000000000 mapping:ffff8003e29c7440 index:0
> [   25.574272] file:__aaabbbcccddd__ fault:ext4_dax_fault
> mmmmap:ext4_file_mmap readpage:0x0
> [   25.578799] CPU: 1 PID: 1180 Comm: mapdax333 Not tainted 5.2.0+ #21
> [   25.581702] Hardware name: QEMU KVM Virtual Machine, BIOS 0.0.0
> 02/06/2015
> [   25.585624] Call trace:
> [   25.587008]  dump_backtrace+0x0/0x178
> [   25.588799]  show_stack+0x24/0x30
> [   25.590328]  dump_stack+0xa8/0xcc
> [   25.591901]  print_bad_pte+0x18c/0x218
> [   25.593628]  unmap_page_range+0x778/0xc00
> [   25.595506]  unmap_single_vma+0x94/0xe8
> [   25.597304]  unmap_vmas+0x90/0x108
> [   25.598901]  unmap_region+0xc0/0x128
> [   25.600566]  __do_munmap+0x284/0x3f0
> [   25.602245]  __vm_munmap+0x78/0xe0
> [   25.603820]  __arm64_sys_munmap+0x34/0x48
> [   25.605709]  el0_svc_common.constprop.0+0x78/0x168
> [   25.607956]  el0_svc_handler+0x34/0x90
> [   25.609698]  el0_svc+0x8/0xc
> [   25.611103] Disabling lock debugging due to kernel taint
> [   25.613573] BUG: Bad page state in process mapdax333  pfn:448800
> [   25.616359] page:ffff7e0010220000 refcount:0 mapcount:-1
> mapping:ffff8003e29c7440 index:0x1
> [   25.620236] ext4_dax_aops
> [   25.620237] name:"__aaabbbcccddd__"
> [   25.621495] flags: 0x3ffff0000000000()
> [   25.624912] raw: 03ffff0000000000 dead000000000100 dead000000000200
> ffff8003e29c7440
> [   25.628502] raw: 0000000000000001 0000000000000000 00000000fffffffe
> 0000000000000000
> [   25.632097] page dumped because: non-NULL mapping
> [...]
> [   25.656567] CPU: 1 PID: 1180 Comm: mapdax333 Tainted: G    B
> 5.2.0+ #21
> [   25.660131] Hardware name: QEMU KVM Virtual Machine, BIOS 0.0.0
> 02/06/2015
> [   25.663324] Call trace:
> [   25.664466]  dump_backtrace+0x0/0x178
> [   25.666163]  show_stack+0x24/0x30
> [   25.667721]  dump_stack+0xa8/0xcc
> [   25.669270]  bad_page+0xf0/0x150
> [   25.670772]  free_pages_check_bad+0x84/0xa0
> [   25.672724]  free_pcppages_bulk+0x45c/0x708
> [   25.674675]  free_unref_page_commit+0xcc/0x100
> [   25.676751]  free_unref_page_list+0x13c/0x200
> [   25.678801]  release_pages+0x350/0x420
> [   25.680539]  free_pages_and_swap_cache+0xf8/0x128
> [   25.682738]  tlb_flush_mmu+0x164/0x2b0
> [   25.684485]  unmap_page_range+0x648/0xc00
> [   25.686349]  unmap_single_vma+0x94/0xe8
> [   25.688131]  unmap_vmas+0x90/0x108
> [   25.689739]  unmap_region+0xc0/0x128
> [   25.691392]  __do_munmap+0x284/0x3f0
> [   25.693079]  __vm_munmap+0x78/0xe0
> [   25.694658]  __arm64_sys_munmap+0x34/0x48
> [   25.696530]  el0_svc_common.constprop.0+0x78/0x168
> [   25.698772]  el0_svc_handler+0x34/0x90
> [   25.700512]  el0_svc+0x8/0xc
> 
> The root cause is in _vm_normal_page, without the PTE_SPECIAL bit,
> the return value will be incorrectly set to pfn_to_page(pfn) instead
> of NULL. Besides, this patch also rewrite the pmd_mkdevmap to avoid
> setting PTE_SPECIAL for pmd
> 
> The MAP_SYNC test case is as follows(Provided by Yibo Cai)
> $#include <stdio.h>
> $#include <string.h>
> $#include <unistd.h>
> $#include <sys/file.h>
> $#include <sys/mman.h>
> 
> $#ifndef MAP_SYNC
> $#define MAP_SYNC 0x80000
> $#endif
> 
> /* mount -o dax /dev/pmem0 /mnt */
> $#define F "/mnt/__aaabbbcccddd__"
> 
> int main(void)
> {
>     int fd;
>     char buf[4096];
>     void *addr;
> 
>     if ((fd = open(F, O_CREAT|O_TRUNC|O_RDWR, 0644)) < 0) {
>         perror("open1");
>         return 1;
>     }
> 
>     if (write(fd, buf, 4096) != 4096) {
>         perror("lseek");
>         return 1;
>     }
> 
>     addr = mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_SHARED|MAP_SYNC,
> fd, 0);
>     if (addr == MAP_FAILED) {
>         perror("mmap");
>         printf("did you mount with '-o dax'?\n");
>         return 1;
>     }
> 
>     memset(addr, 0x55, 4096);
> 
>     if (munmap(addr, 4096) == -1) {
>         perror("munmap");
>         return 1;
>     }
> 
>     close(fd);
> 
>     return 0;
> }
> 
> Fixes: 73b20c84d42d ("arm64: mm: implement pte_devmap support")
> Reported-by: Yibo Cai <Yibo.Cai@arm.com>
> Signed-off-by: Jia He <justin.he@arm.com>
> Acked-by: Robin Murphy <Robin.Murphy@arm.com>
> ---
>  arch/arm64/include/asm/pgtable.h | 7 +++++--
>  1 file changed, 5 insertions(+), 2 deletions(-)
> 
> diff --git a/arch/arm64/include/asm/pgtable.h b/arch/arm64/include/asm/pgtable.h
> index 5fdcfe237338..e09760ece844 100644
> --- a/arch/arm64/include/asm/pgtable.h
> +++ b/arch/arm64/include/asm/pgtable.h
> @@ -209,7 +209,7 @@ static inline pmd_t pmd_mkcont(pmd_t pmd)
>  
>  static inline pte_t pte_mkdevmap(pte_t pte)
>  {
> -	return set_pte_bit(pte, __pgprot(PTE_DEVMAP));
> +	return set_pte_bit(pte, __pgprot(PTE_DEVMAP | PTE_SPECIAL));
>  }
>  
>  static inline void set_pte(pte_t *ptep, pte_t pte)
> @@ -396,7 +396,10 @@ static inline int pmd_protnone(pmd_t pmd)
>  #ifdef CONFIG_TRANSPARENT_HUGEPAGE
>  #define pmd_devmap(pmd)		pte_devmap(pmd_pte(pmd))
>  #endif
> -#define pmd_mkdevmap(pmd)	pte_pmd(pte_mkdevmap(pmd_pte(pmd)))
> +static inline pmd_t pmd_mkdevmap(pmd_t pmd)
> +{
> +	return pte_pmd(set_pte_bit(pmd_pte(pmd), __pgprot(PTE_DEVMAP)));
> +}

Though I could see other platforms like powerpc and x86 following same
approach (DEVMAP + SPECIAL) for pte so that it checks positive for
pte_special() but then just DEVMAP for pmd which could never have a
pmd_special(). But a more fundamental question is - why should a devmap
be a special pte as well ?

Also in vm_normal_page() why cannot it tests for pte_devmap() before it
starts looking for CONFIG_ARCH_HAS_PTE_SPECIAL. Is this the only path for
which we need to set SPECIAL bit on a devmap pte or there are other paths
where this semantics is assumed ?

^ permalink raw reply	[flat|nested] 6+ messages in thread

* RE: [PATCH] arm64: mm: add missing PTE_SPECIAL in pte_mkdevmap on arm64
  2019-08-08  5:18 ` Anshuman Khandual
@ 2019-08-08  6:20   ` Justin He (Arm Technology China)
  2019-08-08  6:58     ` Anshuman Khandual
  0 siblings, 1 reply; 6+ messages in thread
From: Justin He (Arm Technology China) @ 2019-08-08  6:20 UTC (permalink / raw)
  To: Anshuman Khandual, Catalin Marinas, Will Deacon, Mark Rutland,
	James Morse
  Cc: Christoffer Dall, Punit Agrawal, Qian Cai, Jun Yao,
	Alex Van Brunt, Robin Murphy, Thomas Gleixner, linux-arm-kernel,
	linux-kernel

Hi Anshuman
Thanks for the comments, please see my comments below

> -----Original Message-----
> From: Anshuman Khandual <anshuman.khandual@arm.com>
> Sent: 2019年8月8日 13:19
> To: Justin He (Arm Technology China) <Justin.He@arm.com>; Catalin
> Marinas <Catalin.Marinas@arm.com>; Will Deacon <will@kernel.org>;
> Mark Rutland <Mark.Rutland@arm.com>; James Morse
> <James.Morse@arm.com>
> Cc: Christoffer Dall <Christoffer.Dall@arm.com>; Punit Agrawal
> <punitagrawal@gmail.com>; Qian Cai <cai@lca.pw>; Jun Yao
> <yaojun8558363@gmail.com>; Alex Van Brunt <avanbrunt@nvidia.com>;
> Robin Murphy <Robin.Murphy@arm.com>; Thomas Gleixner
> <tglx@linutronix.de>; linux-arm-kernel@lists.infradead.org; linux-
> kernel@vger.kernel.org
> Subject: Re: [PATCH] arm64: mm: add missing PTE_SPECIAL in
> pte_mkdevmap on arm64
>
[...]
> > diff --git a/arch/arm64/include/asm/pgtable.h
> b/arch/arm64/include/asm/pgtable.h
> > index 5fdcfe237338..e09760ece844 100644
> > --- a/arch/arm64/include/asm/pgtable.h
> > +++ b/arch/arm64/include/asm/pgtable.h
> > @@ -209,7 +209,7 @@ static inline pmd_t pmd_mkcont(pmd_t pmd)
> >
> >  static inline pte_t pte_mkdevmap(pte_t pte)
> >  {
> > -   return set_pte_bit(pte, __pgprot(PTE_DEVMAP));
> > +   return set_pte_bit(pte, __pgprot(PTE_DEVMAP | PTE_SPECIAL));
> >  }
> >
> >  static inline void set_pte(pte_t *ptep, pte_t pte)
> > @@ -396,7 +396,10 @@ static inline int pmd_protnone(pmd_t pmd)
> >  #ifdef CONFIG_TRANSPARENT_HUGEPAGE
> >  #define pmd_devmap(pmd)            pte_devmap(pmd_pte(pmd))
> >  #endif
> > -#define pmd_mkdevmap(pmd)
>       pte_pmd(pte_mkdevmap(pmd_pte(pmd)))
> > +static inline pmd_t pmd_mkdevmap(pmd_t pmd)
> > +{
> > +   return pte_pmd(set_pte_bit(pmd_pte(pmd),
> __pgprot(PTE_DEVMAP)));
> > +}
>
> Though I could see other platforms like powerpc and x86 following same
> approach (DEVMAP + SPECIAL) for pte so that it checks positive for
> pte_special() but then just DEVMAP for pmd which could never have a
> pmd_special(). But a more fundamental question is - why should a devmap
> be a special pte as well ?

IIUC, special pte bit make things handling easier compare with those arches which
have no special bit. The memory codes will regard devmap page as a special one
compared with normal page.
Devmap page structure can be stored in ram/pmem/none.

>
> Also in vm_normal_page() why cannot it tests for pte_devmap() before it
> starts looking for CONFIG_ARCH_HAS_PTE_SPECIAL. Is this the only path
> for

AFAICT, yes, but it changes to much besides arm codes. 😊

> which we need to set SPECIAL bit on a devmap pte or there are other paths
> where this semantics is assumed ?

No idea


--
Cheers,
Justin (Jia He)


IMPORTANT NOTICE: The contents of this email and any attachments are confidential and may also be privileged. If you are not the intended recipient, please notify the sender immediately and do not disclose the contents to any other person, use it for any purpose, or store or copy the information in any medium. Thank you.

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH] arm64: mm: add missing PTE_SPECIAL in pte_mkdevmap on arm64
  2019-08-08  6:20   ` Justin He (Arm Technology China)
@ 2019-08-08  6:58     ` Anshuman Khandual
  0 siblings, 0 replies; 6+ messages in thread
From: Anshuman Khandual @ 2019-08-08  6:58 UTC (permalink / raw)
  To: Justin He (Arm Technology China),
	Catalin Marinas, Will Deacon, Mark Rutland, James Morse
  Cc: Christoffer Dall, Punit Agrawal, Qian Cai, Jun Yao,
	Alex Van Brunt, Robin Murphy, Thomas Gleixner, linux-arm-kernel,
	linux-kernel, linux-mm, Dan Williams, Jérôme Glisse,
	Logan Gunthorpe, Christoph Hellwig



On 08/08/2019 11:50 AM, Justin He (Arm Technology China) wrote:
> Hi Anshuman
> Thanks for the comments, please see my comments below
> 
>> -----Original Message-----
>> From: Anshuman Khandual <anshuman.khandual@arm.com>
>> Sent: 2019年8月8日 13:19
>> To: Justin He (Arm Technology China) <Justin.He@arm.com>; Catalin
>> Marinas <Catalin.Marinas@arm.com>; Will Deacon <will@kernel.org>;
>> Mark Rutland <Mark.Rutland@arm.com>; James Morse
>> <James.Morse@arm.com>
>> Cc: Christoffer Dall <Christoffer.Dall@arm.com>; Punit Agrawal
>> <punitagrawal@gmail.com>; Qian Cai <cai@lca.pw>; Jun Yao
>> <yaojun8558363@gmail.com>; Alex Van Brunt <avanbrunt@nvidia.com>;
>> Robin Murphy <Robin.Murphy@arm.com>; Thomas Gleixner
>> <tglx@linutronix.de>; linux-arm-kernel@lists.infradead.org; linux-
>> kernel@vger.kernel.org
>> Subject: Re: [PATCH] arm64: mm: add missing PTE_SPECIAL in
>> pte_mkdevmap on arm64
>>
> [...]
>>> diff --git a/arch/arm64/include/asm/pgtable.h
>> b/arch/arm64/include/asm/pgtable.h
>>> index 5fdcfe237338..e09760ece844 100644
>>> --- a/arch/arm64/include/asm/pgtable.h
>>> +++ b/arch/arm64/include/asm/pgtable.h
>>> @@ -209,7 +209,7 @@ static inline pmd_t pmd_mkcont(pmd_t pmd)
>>>
>>>  static inline pte_t pte_mkdevmap(pte_t pte)
>>>  {
>>> -	return set_pte_bit(pte, __pgprot(PTE_DEVMAP));
>>> +	return set_pte_bit(pte, __pgprot(PTE_DEVMAP | PTE_SPECIAL));
>>>  }
>>>
>>>  static inline void set_pte(pte_t *ptep, pte_t pte)
>>> @@ -396,7 +396,10 @@ static inline int pmd_protnone(pmd_t pmd)
>>>  #ifdef CONFIG_TRANSPARENT_HUGEPAGE
>>>  #define pmd_devmap(pmd)		pte_devmap(pmd_pte(pmd))
>>>  #endif
>>> -#define pmd_mkdevmap(pmd)
>> 	pte_pmd(pte_mkdevmap(pmd_pte(pmd)))
>>> +static inline pmd_t pmd_mkdevmap(pmd_t pmd)
>>> +{
>>> +	return pte_pmd(set_pte_bit(pmd_pte(pmd),
>> __pgprot(PTE_DEVMAP)));
>>> +}
>>
>> Though I could see other platforms like powerpc and x86 following same
>> approach (DEVMAP + SPECIAL) for pte so that it checks positive for
>> pte_special() but then just DEVMAP for pmd which could never have a
>> pmd_special(). But a more fundamental question is - why should a devmap
>> be a special pte as well ?
> 
> IIUC, special pte bit make things handling easier compare with those arches which
> have no special bit. The memory codes will regard devmap page as a special one 
> compared with normal page.

For that we have PTE_DEVMAP on arm64 which differentiates device memory
entries from others and it should not again need PTE_SPECIAL as well for
that. We set both bits while creating the entries with pte_mkdevmap()
and check just one bit PTE_DEVMAP with pte_devmap(). Problem is it will
also test positive for pte_special() and risks being identified as one.

> Devmap page structure can be stored in ram/pmem/none.

That is altogether a different aspect which is handled with vmem_altmap
during hotplug and nothing to do with how device memory is mapped in the
page table. I am not sure about "none" though. IIUC unlike traditional
device pfn all ZONE_DEVICE memory will have struct page backing either
on system RAM or in the device memory itself.

> 
>>
>> Also in vm_normal_page() why cannot it tests for pte_devmap() before it
>> starts looking for CONFIG_ARCH_HAS_PTE_SPECIAL. Is this the only path
>> for
> 
> AFAICT, yes, but it changes to much besides arm codes. 😊

If this is the only path for which all platforms have to set PTE_SPECIAL
in their device mapping, then it should just be fixed in vm_normal_page().

> 
>> which we need to set SPECIAL bit on a devmap pte or there are other paths
>> where this semantics is assumed ?
> 
> No idea

Probably something to be asked in the mm community.

1. Why pte_mkdevmap() should set SPECIAL bit for a positive pte_special()
   check. Why the same mapping be identified as pte_devmap() as well as
   pte_special().

2. Can pte_devmap() and pte_special() re-ordering at vm_normal_page() will
   remove this dependency or there are other commons MM paths which assume
   this behavior ?

+ linux-mm@kvack.org <linux-mm@kvack.org>
+ Dan Williams <dan.j.williams@intel.com>
+ Jérôme Glisse <jglisse@redhat.com>
+ Logan Gunthorpe <logang@deltatee.com>
+ Christoph Hellwig <hch@lst.de>

^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2019-08-08  6:59 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-08-07  4:58 [PATCH] arm64: mm: add missing PTE_SPECIAL in pte_mkdevmap on arm64 Jia He
2019-08-07  9:09 ` Will Deacon
2019-08-07  9:24   ` Catalin Marinas
2019-08-08  5:18 ` Anshuman Khandual
2019-08-08  6:20   ` Justin He (Arm Technology China)
2019-08-08  6:58     ` Anshuman Khandual

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).