All of lore.kernel.org
 help / color / mirror / Atom feed
* jemalloc testsuite stalls in memset
@ 2016-12-14 14:34 ` Andreas Schwab
  0 siblings, 0 replies; 17+ messages in thread
From: Andreas Schwab @ 2016-12-14 14:34 UTC (permalink / raw)
  To: linux-arm-kernel; +Cc: linux-kernel, Minchan Kim, mbrugger

When running the jemalloc-4.4.0 testsuite on aarch64 with glibc 2.24 the
test/unit/junk test hangs in memset:

(gdb) r
Starting program: /tmp/jemalloc/jemalloc-4.4.0/test/unit/junk
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
test_junk_small: pass
test_junk_large: pass
^C
Program received signal SIGINT, Interrupt.
memset () at ../sysdeps/aarch64/memset.S:91
91              str     q0, [dstin]
(gdb) x/i $pc
=> 0xffffb7ddf54c <memset+140>: str     q0, [x0]

x0 is pointing to the start of this mmap'd block:

      0xffffb7400000     0xffffb7600000   0x200000        0x0

Any attempt to contine execution or step over the insn still causes the
process to hang here.  Only after accessing the memory through the
debugger the test successfully continues to completion.

The kernel has been configured with transparent hugepages.

CONFIG_TRANSPARENT_HUGEPAGE=y
CONFIG_TRANSPARENT_HUGEPAGE_ALWAYS=y
# CONFIG_TRANSPARENT_HUGEPAGE_MADVISE is not set
CONFIG_TRANSPARENT_HUGE_PAGECACHE=y

This issue has been bisected to commit
b8d3c4c3009d42869dc03a1da0efc2aa687d0ab4 ("mm/huge_memory.c: don't split
THP page when MADV_FREE syscall is called").

Andreas.

-- 
Andreas Schwab, SUSE Labs, schwab@suse.de
GPG Key fingerprint = 0196 BAD8 1CE9 1970 F4BE  1748 E4D4 88E3 0EEA B9D7
"And now for something completely different."

^ permalink raw reply	[flat|nested] 17+ messages in thread

* jemalloc testsuite stalls in memset
@ 2016-12-14 14:34 ` Andreas Schwab
  0 siblings, 0 replies; 17+ messages in thread
From: Andreas Schwab @ 2016-12-14 14:34 UTC (permalink / raw)
  To: linux-arm-kernel

When running the jemalloc-4.4.0 testsuite on aarch64 with glibc 2.24 the
test/unit/junk test hangs in memset:

(gdb) r
Starting program: /tmp/jemalloc/jemalloc-4.4.0/test/unit/junk
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
test_junk_small: pass
test_junk_large: pass
^C
Program received signal SIGINT, Interrupt.
memset () at ../sysdeps/aarch64/memset.S:91
91              str     q0, [dstin]
(gdb) x/i $pc
=> 0xffffb7ddf54c <memset+140>: str     q0, [x0]

x0 is pointing to the start of this mmap'd block:

      0xffffb7400000     0xffffb7600000   0x200000        0x0

Any attempt to contine execution or step over the insn still causes the
process to hang here.  Only after accessing the memory through the
debugger the test successfully continues to completion.

The kernel has been configured with transparent hugepages.

CONFIG_TRANSPARENT_HUGEPAGE=y
CONFIG_TRANSPARENT_HUGEPAGE_ALWAYS=y
# CONFIG_TRANSPARENT_HUGEPAGE_MADVISE is not set
CONFIG_TRANSPARENT_HUGE_PAGECACHE=y

This issue has been bisected to commit
b8d3c4c3009d42869dc03a1da0efc2aa687d0ab4 ("mm/huge_memory.c: don't split
THP page when MADV_FREE syscall is called").

Andreas.

-- 
Andreas Schwab, SUSE Labs, schwab at suse.de
GPG Key fingerprint = 0196 BAD8 1CE9 1970 F4BE  1748 E4D4 88E3 0EEA B9D7
"And now for something completely different."

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: jemalloc testsuite stalls in memset
  2016-12-14 14:34 ` Andreas Schwab
  (?)
@ 2016-12-14 23:50   ` Minchan Kim
  -1 siblings, 0 replies; 17+ messages in thread
From: Minchan Kim @ 2016-12-14 23:50 UTC (permalink / raw)
  To: Andreas Schwab
  Cc: linux-arm-kernel, linux-kernel, mbrugger, linux-mm, Jason Evans

Hello,

First of all, thanks for the report and sorry I have no time now so maybe
I should investigate the problem next week.

On Wed, Dec 14, 2016 at 03:34:54PM +0100, Andreas Schwab wrote:
> When running the jemalloc-4.4.0 testsuite on aarch64 with glibc 2.24 the
> test/unit/junk test hangs in memset:
> 
> (gdb) r
> Starting program: /tmp/jemalloc/jemalloc-4.4.0/test/unit/junk
> [Thread debugging using libthread_db enabled]
> Using host libthread_db library "/lib64/libthread_db.so.1".
> test_junk_small: pass
> test_junk_large: pass
> ^C
> Program received signal SIGINT, Interrupt.
> memset () at ../sysdeps/aarch64/memset.S:91
> 91              str     q0, [dstin]
> (gdb) x/i $pc
> => 0xffffb7ddf54c <memset+140>: str     q0, [x0]
> 
> x0 is pointing to the start of this mmap'd block:
> 
>       0xffffb7400000     0xffffb7600000   0x200000        0x0
> 
> Any attempt to contine execution or step over the insn still causes the
> process to hang here.  Only after accessing the memory through the
> debugger the test successfully continues to completion.

You mean program itself access the address(ie, 0xffffb7400000) is hang
while access the address from the debugger is OK?

Scratch head. :/

Can you reproduce it easily?
Did you test it in real machine or qemu on x86?
Could you show me how I can reproduce it?
I want to test it in x86 machine, first of all.
Unfortunately, I don't have any aarch64 platform now so maybe I have to
run it on qemu on x86 until I can set up aarch64 platform if it is reproducible
on real machine only.

> 
> The kernel has been configured with transparent hugepages.
> 
> CONFIG_TRANSPARENT_HUGEPAGE=y
> CONFIG_TRANSPARENT_HUGEPAGE_ALWAYS=y
> # CONFIG_TRANSPARENT_HUGEPAGE_MADVISE is not set
> CONFIG_TRANSPARENT_HUGE_PAGECACHE=y

What's the exact kernel version?
I don't think it's HUGE_PAGECACHE problem but to narrow down the scope,
could you test it without CONFIG_TRANSPARENT_HUGE_PAGECACHE?

Thanks.

> 
> This issue has been bisected to commit
> b8d3c4c3009d42869dc03a1da0efc2aa687d0ab4 ("mm/huge_memory.c: don't split
> THP page when MADV_FREE syscall is called").
> 
> Andreas.
> 
> -- 
> Andreas Schwab, SUSE Labs, schwab@suse.de
> GPG Key fingerprint = 0196 BAD8 1CE9 1970 F4BE  1748 E4D4 88E3 0EEA B9D7
> "And now for something completely different."

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: jemalloc testsuite stalls in memset
@ 2016-12-14 23:50   ` Minchan Kim
  0 siblings, 0 replies; 17+ messages in thread
From: Minchan Kim @ 2016-12-14 23:50 UTC (permalink / raw)
  To: Andreas Schwab
  Cc: linux-arm-kernel, linux-kernel, mbrugger, linux-mm, Jason Evans

Hello,

First of all, thanks for the report and sorry I have no time now so maybe
I should investigate the problem next week.

On Wed, Dec 14, 2016 at 03:34:54PM +0100, Andreas Schwab wrote:
> When running the jemalloc-4.4.0 testsuite on aarch64 with glibc 2.24 the
> test/unit/junk test hangs in memset:
> 
> (gdb) r
> Starting program: /tmp/jemalloc/jemalloc-4.4.0/test/unit/junk
> [Thread debugging using libthread_db enabled]
> Using host libthread_db library "/lib64/libthread_db.so.1".
> test_junk_small: pass
> test_junk_large: pass
> ^C
> Program received signal SIGINT, Interrupt.
> memset () at ../sysdeps/aarch64/memset.S:91
> 91              str     q0, [dstin]
> (gdb) x/i $pc
> => 0xffffb7ddf54c <memset+140>: str     q0, [x0]
> 
> x0 is pointing to the start of this mmap'd block:
> 
>       0xffffb7400000     0xffffb7600000   0x200000        0x0
> 
> Any attempt to contine execution or step over the insn still causes the
> process to hang here.  Only after accessing the memory through the
> debugger the test successfully continues to completion.

You mean program itself access the address(ie, 0xffffb7400000) is hang
while access the address from the debugger is OK?

Scratch head. :/

Can you reproduce it easily?
Did you test it in real machine or qemu on x86?
Could you show me how I can reproduce it?
I want to test it in x86 machine, first of all.
Unfortunately, I don't have any aarch64 platform now so maybe I have to
run it on qemu on x86 until I can set up aarch64 platform if it is reproducible
on real machine only.

> 
> The kernel has been configured with transparent hugepages.
> 
> CONFIG_TRANSPARENT_HUGEPAGE=y
> CONFIG_TRANSPARENT_HUGEPAGE_ALWAYS=y
> # CONFIG_TRANSPARENT_HUGEPAGE_MADVISE is not set
> CONFIG_TRANSPARENT_HUGE_PAGECACHE=y

What's the exact kernel version?
I don't think it's HUGE_PAGECACHE problem but to narrow down the scope,
could you test it without CONFIG_TRANSPARENT_HUGE_PAGECACHE?

Thanks.

> 
> This issue has been bisected to commit
> b8d3c4c3009d42869dc03a1da0efc2aa687d0ab4 ("mm/huge_memory.c: don't split
> THP page when MADV_FREE syscall is called").
> 
> Andreas.
> 
> -- 
> Andreas Schwab, SUSE Labs, schwab@suse.de
> GPG Key fingerprint = 0196 BAD8 1CE9 1970 F4BE  1748 E4D4 88E3 0EEA B9D7
> "And now for something completely different."

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 17+ messages in thread

* jemalloc testsuite stalls in memset
@ 2016-12-14 23:50   ` Minchan Kim
  0 siblings, 0 replies; 17+ messages in thread
From: Minchan Kim @ 2016-12-14 23:50 UTC (permalink / raw)
  To: linux-arm-kernel

Hello,

First of all, thanks for the report and sorry I have no time now so maybe
I should investigate the problem next week.

On Wed, Dec 14, 2016 at 03:34:54PM +0100, Andreas Schwab wrote:
> When running the jemalloc-4.4.0 testsuite on aarch64 with glibc 2.24 the
> test/unit/junk test hangs in memset:
> 
> (gdb) r
> Starting program: /tmp/jemalloc/jemalloc-4.4.0/test/unit/junk
> [Thread debugging using libthread_db enabled]
> Using host libthread_db library "/lib64/libthread_db.so.1".
> test_junk_small: pass
> test_junk_large: pass
> ^C
> Program received signal SIGINT, Interrupt.
> memset () at ../sysdeps/aarch64/memset.S:91
> 91              str     q0, [dstin]
> (gdb) x/i $pc
> => 0xffffb7ddf54c <memset+140>: str     q0, [x0]
> 
> x0 is pointing to the start of this mmap'd block:
> 
>       0xffffb7400000     0xffffb7600000   0x200000        0x0
> 
> Any attempt to contine execution or step over the insn still causes the
> process to hang here.  Only after accessing the memory through the
> debugger the test successfully continues to completion.

You mean program itself access the address(ie, 0xffffb7400000) is hang
while access the address from the debugger is OK?

Scratch head. :/

Can you reproduce it easily?
Did you test it in real machine or qemu on x86?
Could you show me how I can reproduce it?
I want to test it in x86 machine, first of all.
Unfortunately, I don't have any aarch64 platform now so maybe I have to
run it on qemu on x86 until I can set up aarch64 platform if it is reproducible
on real machine only.

> 
> The kernel has been configured with transparent hugepages.
> 
> CONFIG_TRANSPARENT_HUGEPAGE=y
> CONFIG_TRANSPARENT_HUGEPAGE_ALWAYS=y
> # CONFIG_TRANSPARENT_HUGEPAGE_MADVISE is not set
> CONFIG_TRANSPARENT_HUGE_PAGECACHE=y

What's the exact kernel version?
I don't think it's HUGE_PAGECACHE problem but to narrow down the scope,
could you test it without CONFIG_TRANSPARENT_HUGE_PAGECACHE?

Thanks.

> 
> This issue has been bisected to commit
> b8d3c4c3009d42869dc03a1da0efc2aa687d0ab4 ("mm/huge_memory.c: don't split
> THP page when MADV_FREE syscall is called").
> 
> Andreas.
> 
> -- 
> Andreas Schwab, SUSE Labs, schwab at suse.de
> GPG Key fingerprint = 0196 BAD8 1CE9 1970 F4BE  1748 E4D4 88E3 0EEA B9D7
> "And now for something completely different."

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: jemalloc testsuite stalls in memset
  2016-12-14 23:50   ` Minchan Kim
  (?)
@ 2016-12-15  9:24     ` Andreas Schwab
  -1 siblings, 0 replies; 17+ messages in thread
From: Andreas Schwab @ 2016-12-15  9:24 UTC (permalink / raw)
  To: Minchan Kim
  Cc: linux-arm-kernel, linux-kernel, mbrugger, linux-mm, Jason Evans

On Dez 15 2016, Minchan Kim <minchan@kernel.org> wrote:

> You mean program itself access the address(ie, 0xffffb7400000) is hang
> while access the address from the debugger is OK?

Yes.

> Can you reproduce it easily?

100%

> Did you test it in real machine or qemu on x86?

Both real and kvm.

> Could you show me how I can reproduce it?

Just run make check.

> I want to test it in x86 machine, first of all.
> Unfortunately, I don't have any aarch64 platform now so maybe I have to
> run it on qemu on x86 until I can set up aarch64 platform if it is reproducible
> on real machine only.
>
>> 
>> The kernel has been configured with transparent hugepages.
>> 
>> CONFIG_TRANSPARENT_HUGEPAGE=y
>> CONFIG_TRANSPARENT_HUGEPAGE_ALWAYS=y
>> # CONFIG_TRANSPARENT_HUGEPAGE_MADVISE is not set
>> CONFIG_TRANSPARENT_HUGE_PAGECACHE=y
>
> What's the exact kernel version?

Anything >= your commit.

> I don't think it's HUGE_PAGECACHE problem but to narrow down the scope,
> could you test it without CONFIG_TRANSPARENT_HUGE_PAGECACHE?

That cannot be deselected.

Andreas.

-- 
Andreas Schwab, SUSE Labs, schwab@suse.de
GPG Key fingerprint = 0196 BAD8 1CE9 1970 F4BE  1748 E4D4 88E3 0EEA B9D7
"And now for something completely different."

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: jemalloc testsuite stalls in memset
@ 2016-12-15  9:24     ` Andreas Schwab
  0 siblings, 0 replies; 17+ messages in thread
From: Andreas Schwab @ 2016-12-15  9:24 UTC (permalink / raw)
  To: Minchan Kim
  Cc: linux-arm-kernel, linux-kernel, mbrugger, linux-mm, Jason Evans

On Dez 15 2016, Minchan Kim <minchan@kernel.org> wrote:

> You mean program itself access the address(ie, 0xffffb7400000) is hang
> while access the address from the debugger is OK?

Yes.

> Can you reproduce it easily?

100%

> Did you test it in real machine or qemu on x86?

Both real and kvm.

> Could you show me how I can reproduce it?

Just run make check.

> I want to test it in x86 machine, first of all.
> Unfortunately, I don't have any aarch64 platform now so maybe I have to
> run it on qemu on x86 until I can set up aarch64 platform if it is reproducible
> on real machine only.
>
>> 
>> The kernel has been configured with transparent hugepages.
>> 
>> CONFIG_TRANSPARENT_HUGEPAGE=y
>> CONFIG_TRANSPARENT_HUGEPAGE_ALWAYS=y
>> # CONFIG_TRANSPARENT_HUGEPAGE_MADVISE is not set
>> CONFIG_TRANSPARENT_HUGE_PAGECACHE=y
>
> What's the exact kernel version?

Anything >= your commit.

> I don't think it's HUGE_PAGECACHE problem but to narrow down the scope,
> could you test it without CONFIG_TRANSPARENT_HUGE_PAGECACHE?

That cannot be deselected.

Andreas.

-- 
Andreas Schwab, SUSE Labs, schwab@suse.de
GPG Key fingerprint = 0196 BAD8 1CE9 1970 F4BE  1748 E4D4 88E3 0EEA B9D7
"And now for something completely different."

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 17+ messages in thread

* jemalloc testsuite stalls in memset
@ 2016-12-15  9:24     ` Andreas Schwab
  0 siblings, 0 replies; 17+ messages in thread
From: Andreas Schwab @ 2016-12-15  9:24 UTC (permalink / raw)
  To: linux-arm-kernel

On Dez 15 2016, Minchan Kim <minchan@kernel.org> wrote:

> You mean program itself access the address(ie, 0xffffb7400000) is hang
> while access the address from the debugger is OK?

Yes.

> Can you reproduce it easily?

100%

> Did you test it in real machine or qemu on x86?

Both real and kvm.

> Could you show me how I can reproduce it?

Just run make check.

> I want to test it in x86 machine, first of all.
> Unfortunately, I don't have any aarch64 platform now so maybe I have to
> run it on qemu on x86 until I can set up aarch64 platform if it is reproducible
> on real machine only.
>
>> 
>> The kernel has been configured with transparent hugepages.
>> 
>> CONFIG_TRANSPARENT_HUGEPAGE=y
>> CONFIG_TRANSPARENT_HUGEPAGE_ALWAYS=y
>> # CONFIG_TRANSPARENT_HUGEPAGE_MADVISE is not set
>> CONFIG_TRANSPARENT_HUGE_PAGECACHE=y
>
> What's the exact kernel version?

Anything >= your commit.

> I don't think it's HUGE_PAGECACHE problem but to narrow down the scope,
> could you test it without CONFIG_TRANSPARENT_HUGE_PAGECACHE?

That cannot be deselected.

Andreas.

-- 
Andreas Schwab, SUSE Labs, schwab at suse.de
GPG Key fingerprint = 0196 BAD8 1CE9 1970 F4BE  1748 E4D4 88E3 0EEA B9D7
"And now for something completely different."

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: jemalloc testsuite stalls in memset
  2016-12-15  9:24     ` Andreas Schwab
  (?)
@ 2016-12-16  6:39       ` Minchan Kim
  -1 siblings, 0 replies; 17+ messages in thread
From: Minchan Kim @ 2016-12-16  6:39 UTC (permalink / raw)
  To: Andreas Schwab
  Cc: linux-arm-kernel, linux-kernel, mbrugger, linux-mm, Jason Evans

Hello,

On Thu, Dec 15, 2016 at 10:24:47AM +0100, Andreas Schwab wrote:
> On Dez 15 2016, Minchan Kim <minchan@kernel.org> wrote:
> 
> > You mean program itself access the address(ie, 0xffffb7400000) is hang
> > while access the address from the debugger is OK?
> 
> Yes.
> 
> > Can you reproduce it easily?
> 
> 100%
> 
> > Did you test it in real machine or qemu on x86?
> 
> Both real and kvm.
> 
> > Could you show me how I can reproduce it?
> 
> Just run make check.
> 
> > I want to test it in x86 machine, first of all.
> > Unfortunately, I don't have any aarch64 platform now so maybe I have to
> > run it on qemu on x86 until I can set up aarch64 platform if it is reproducible
> > on real machine only.
> >
> >> 
> >> The kernel has been configured with transparent hugepages.
> >> 
> >> CONFIG_TRANSPARENT_HUGEPAGE=y
> >> CONFIG_TRANSPARENT_HUGEPAGE_ALWAYS=y
> >> # CONFIG_TRANSPARENT_HUGEPAGE_MADVISE is not set
> >> CONFIG_TRANSPARENT_HUGE_PAGECACHE=y
> >
> > What's the exact kernel version?
> 
> Anything >= your commit.

Thanks for the info. I cannot setup testing enviroment but when I read code,
it seems we need pmd_wrprotect for non-hardware dirty architecture.

Below helps?

diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index e10a4fe..dc37c9a 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -1611,6 +1611,7 @@ int madvise_free_huge_pmd(struct mmu_gather *tlb, struct vm_area_struct *vma,
 			tlb->fullmm);
 		orig_pmd = pmd_mkold(orig_pmd);
 		orig_pmd = pmd_mkclean(orig_pmd);
+		orig_pmd = pmd_wrprotect(orig_pmd);
 
 		set_pmd_at(mm, addr, pmd, orig_pmd);
 		tlb_remove_pmd_tlb_entry(tlb, pmd, addr);

^ permalink raw reply related	[flat|nested] 17+ messages in thread

* Re: jemalloc testsuite stalls in memset
@ 2016-12-16  6:39       ` Minchan Kim
  0 siblings, 0 replies; 17+ messages in thread
From: Minchan Kim @ 2016-12-16  6:39 UTC (permalink / raw)
  To: Andreas Schwab
  Cc: linux-arm-kernel, linux-kernel, mbrugger, linux-mm, Jason Evans

Hello,

On Thu, Dec 15, 2016 at 10:24:47AM +0100, Andreas Schwab wrote:
> On Dez 15 2016, Minchan Kim <minchan@kernel.org> wrote:
> 
> > You mean program itself access the address(ie, 0xffffb7400000) is hang
> > while access the address from the debugger is OK?
> 
> Yes.
> 
> > Can you reproduce it easily?
> 
> 100%
> 
> > Did you test it in real machine or qemu on x86?
> 
> Both real and kvm.
> 
> > Could you show me how I can reproduce it?
> 
> Just run make check.
> 
> > I want to test it in x86 machine, first of all.
> > Unfortunately, I don't have any aarch64 platform now so maybe I have to
> > run it on qemu on x86 until I can set up aarch64 platform if it is reproducible
> > on real machine only.
> >
> >> 
> >> The kernel has been configured with transparent hugepages.
> >> 
> >> CONFIG_TRANSPARENT_HUGEPAGE=y
> >> CONFIG_TRANSPARENT_HUGEPAGE_ALWAYS=y
> >> # CONFIG_TRANSPARENT_HUGEPAGE_MADVISE is not set
> >> CONFIG_TRANSPARENT_HUGE_PAGECACHE=y
> >
> > What's the exact kernel version?
> 
> Anything >= your commit.

Thanks for the info. I cannot setup testing enviroment but when I read code,
it seems we need pmd_wrprotect for non-hardware dirty architecture.

Below helps?

diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index e10a4fe..dc37c9a 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -1611,6 +1611,7 @@ int madvise_free_huge_pmd(struct mmu_gather *tlb, struct vm_area_struct *vma,
 			tlb->fullmm);
 		orig_pmd = pmd_mkold(orig_pmd);
 		orig_pmd = pmd_mkclean(orig_pmd);
+		orig_pmd = pmd_wrprotect(orig_pmd);
 
 		set_pmd_at(mm, addr, pmd, orig_pmd);
 		tlb_remove_pmd_tlb_entry(tlb, pmd, addr);

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply related	[flat|nested] 17+ messages in thread

* jemalloc testsuite stalls in memset
@ 2016-12-16  6:39       ` Minchan Kim
  0 siblings, 0 replies; 17+ messages in thread
From: Minchan Kim @ 2016-12-16  6:39 UTC (permalink / raw)
  To: linux-arm-kernel

Hello,

On Thu, Dec 15, 2016 at 10:24:47AM +0100, Andreas Schwab wrote:
> On Dez 15 2016, Minchan Kim <minchan@kernel.org> wrote:
> 
> > You mean program itself access the address(ie, 0xffffb7400000) is hang
> > while access the address from the debugger is OK?
> 
> Yes.
> 
> > Can you reproduce it easily?
> 
> 100%
> 
> > Did you test it in real machine or qemu on x86?
> 
> Both real and kvm.
> 
> > Could you show me how I can reproduce it?
> 
> Just run make check.
> 
> > I want to test it in x86 machine, first of all.
> > Unfortunately, I don't have any aarch64 platform now so maybe I have to
> > run it on qemu on x86 until I can set up aarch64 platform if it is reproducible
> > on real machine only.
> >
> >> 
> >> The kernel has been configured with transparent hugepages.
> >> 
> >> CONFIG_TRANSPARENT_HUGEPAGE=y
> >> CONFIG_TRANSPARENT_HUGEPAGE_ALWAYS=y
> >> # CONFIG_TRANSPARENT_HUGEPAGE_MADVISE is not set
> >> CONFIG_TRANSPARENT_HUGE_PAGECACHE=y
> >
> > What's the exact kernel version?
> 
> Anything >= your commit.

Thanks for the info. I cannot setup testing enviroment but when I read code,
it seems we need pmd_wrprotect for non-hardware dirty architecture.

Below helps?

diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index e10a4fe..dc37c9a 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -1611,6 +1611,7 @@ int madvise_free_huge_pmd(struct mmu_gather *tlb, struct vm_area_struct *vma,
 			tlb->fullmm);
 		orig_pmd = pmd_mkold(orig_pmd);
 		orig_pmd = pmd_mkclean(orig_pmd);
+		orig_pmd = pmd_wrprotect(orig_pmd);
 
 		set_pmd_at(mm, addr, pmd, orig_pmd);
 		tlb_remove_pmd_tlb_entry(tlb, pmd, addr);

^ permalink raw reply related	[flat|nested] 17+ messages in thread

* Re: jemalloc testsuite stalls in memset
  2016-12-16  6:39       ` Minchan Kim
  (?)
@ 2016-12-16 14:16         ` Andreas Schwab
  -1 siblings, 0 replies; 17+ messages in thread
From: Andreas Schwab @ 2016-12-16 14:16 UTC (permalink / raw)
  To: Minchan Kim
  Cc: linux-arm-kernel, linux-kernel, mbrugger, linux-mm, Jason Evans

On Dez 16 2016, Minchan Kim <minchan@kernel.org> wrote:

> Below helps?
>
> diff --git a/mm/huge_memory.c b/mm/huge_memory.c
> index e10a4fe..dc37c9a 100644
> --- a/mm/huge_memory.c
> +++ b/mm/huge_memory.c
> @@ -1611,6 +1611,7 @@ int madvise_free_huge_pmd(struct mmu_gather *tlb, struct vm_area_struct *vma,
>  			tlb->fullmm);
>  		orig_pmd = pmd_mkold(orig_pmd);
>  		orig_pmd = pmd_mkclean(orig_pmd);
> +		orig_pmd = pmd_wrprotect(orig_pmd);
>  
>  		set_pmd_at(mm, addr, pmd, orig_pmd);
>  		tlb_remove_pmd_tlb_entry(tlb, pmd, addr);

Thanks, this fixes the issue (tested with 4.9).

Andreas.

-- 
Andreas Schwab, SUSE Labs, schwab@suse.de
GPG Key fingerprint = 0196 BAD8 1CE9 1970 F4BE  1748 E4D4 88E3 0EEA B9D7
"And now for something completely different."

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: jemalloc testsuite stalls in memset
@ 2016-12-16 14:16         ` Andreas Schwab
  0 siblings, 0 replies; 17+ messages in thread
From: Andreas Schwab @ 2016-12-16 14:16 UTC (permalink / raw)
  To: Minchan Kim
  Cc: linux-arm-kernel, linux-kernel, mbrugger, linux-mm, Jason Evans

On Dez 16 2016, Minchan Kim <minchan@kernel.org> wrote:

> Below helps?
>
> diff --git a/mm/huge_memory.c b/mm/huge_memory.c
> index e10a4fe..dc37c9a 100644
> --- a/mm/huge_memory.c
> +++ b/mm/huge_memory.c
> @@ -1611,6 +1611,7 @@ int madvise_free_huge_pmd(struct mmu_gather *tlb, struct vm_area_struct *vma,
>  			tlb->fullmm);
>  		orig_pmd = pmd_mkold(orig_pmd);
>  		orig_pmd = pmd_mkclean(orig_pmd);
> +		orig_pmd = pmd_wrprotect(orig_pmd);
>  
>  		set_pmd_at(mm, addr, pmd, orig_pmd);
>  		tlb_remove_pmd_tlb_entry(tlb, pmd, addr);

Thanks, this fixes the issue (tested with 4.9).

Andreas.

-- 
Andreas Schwab, SUSE Labs, schwab@suse.de
GPG Key fingerprint = 0196 BAD8 1CE9 1970 F4BE  1748 E4D4 88E3 0EEA B9D7
"And now for something completely different."

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 17+ messages in thread

* jemalloc testsuite stalls in memset
@ 2016-12-16 14:16         ` Andreas Schwab
  0 siblings, 0 replies; 17+ messages in thread
From: Andreas Schwab @ 2016-12-16 14:16 UTC (permalink / raw)
  To: linux-arm-kernel

On Dez 16 2016, Minchan Kim <minchan@kernel.org> wrote:

> Below helps?
>
> diff --git a/mm/huge_memory.c b/mm/huge_memory.c
> index e10a4fe..dc37c9a 100644
> --- a/mm/huge_memory.c
> +++ b/mm/huge_memory.c
> @@ -1611,6 +1611,7 @@ int madvise_free_huge_pmd(struct mmu_gather *tlb, struct vm_area_struct *vma,
>  			tlb->fullmm);
>  		orig_pmd = pmd_mkold(orig_pmd);
>  		orig_pmd = pmd_mkclean(orig_pmd);
> +		orig_pmd = pmd_wrprotect(orig_pmd);
>  
>  		set_pmd_at(mm, addr, pmd, orig_pmd);
>  		tlb_remove_pmd_tlb_entry(tlb, pmd, addr);

Thanks, this fixes the issue (tested with 4.9).

Andreas.

-- 
Andreas Schwab, SUSE Labs, schwab at suse.de
GPG Key fingerprint = 0196 BAD8 1CE9 1970 F4BE  1748 E4D4 88E3 0EEA B9D7
"And now for something completely different."

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: jemalloc testsuite stalls in memset
  2016-12-16 14:16         ` Andreas Schwab
  (?)
@ 2016-12-21 23:54           ` Minchan Kim
  -1 siblings, 0 replies; 17+ messages in thread
From: Minchan Kim @ 2016-12-21 23:54 UTC (permalink / raw)
  To: Andreas Schwab
  Cc: linux-arm-kernel, linux-kernel, mbrugger, linux-mm, Jason Evans

Hello, Andreas

Sorry for long delay. I was on vacation.

On Fri, Dec 16, 2016 at 03:16:20PM +0100, Andreas Schwab wrote:
> On Dez 16 2016, Minchan Kim <minchan@kernel.org> wrote:
> 
> > Below helps?
> >
> > diff --git a/mm/huge_memory.c b/mm/huge_memory.c
> > index e10a4fe..dc37c9a 100644
> > --- a/mm/huge_memory.c
> > +++ b/mm/huge_memory.c
> > @@ -1611,6 +1611,7 @@ int madvise_free_huge_pmd(struct mmu_gather *tlb, struct vm_area_struct *vma,
> >  			tlb->fullmm);
> >  		orig_pmd = pmd_mkold(orig_pmd);
> >  		orig_pmd = pmd_mkclean(orig_pmd);
> > +		orig_pmd = pmd_wrprotect(orig_pmd);
> >  
> >  		set_pmd_at(mm, addr, pmd, orig_pmd);
> >  		tlb_remove_pmd_tlb_entry(tlb, pmd, addr);
> 
> Thanks, this fixes the issue (tested with 4.9).

It was a quick hack to know what exact problem is there and your confirming
helped a lot to understand the problem clear.

More right approach is to support pmd dirty handling in general page fault
handler rather than tweaking MADV_FREE. I just sent a new patch with Ccing
you.

Could you test it, please?
Thanks!

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: jemalloc testsuite stalls in memset
@ 2016-12-21 23:54           ` Minchan Kim
  0 siblings, 0 replies; 17+ messages in thread
From: Minchan Kim @ 2016-12-21 23:54 UTC (permalink / raw)
  To: Andreas Schwab
  Cc: linux-arm-kernel, linux-kernel, mbrugger, linux-mm, Jason Evans

Hello, Andreas

Sorry for long delay. I was on vacation.

On Fri, Dec 16, 2016 at 03:16:20PM +0100, Andreas Schwab wrote:
> On Dez 16 2016, Minchan Kim <minchan@kernel.org> wrote:
> 
> > Below helps?
> >
> > diff --git a/mm/huge_memory.c b/mm/huge_memory.c
> > index e10a4fe..dc37c9a 100644
> > --- a/mm/huge_memory.c
> > +++ b/mm/huge_memory.c
> > @@ -1611,6 +1611,7 @@ int madvise_free_huge_pmd(struct mmu_gather *tlb, struct vm_area_struct *vma,
> >  			tlb->fullmm);
> >  		orig_pmd = pmd_mkold(orig_pmd);
> >  		orig_pmd = pmd_mkclean(orig_pmd);
> > +		orig_pmd = pmd_wrprotect(orig_pmd);
> >  
> >  		set_pmd_at(mm, addr, pmd, orig_pmd);
> >  		tlb_remove_pmd_tlb_entry(tlb, pmd, addr);
> 
> Thanks, this fixes the issue (tested with 4.9).

It was a quick hack to know what exact problem is there and your confirming
helped a lot to understand the problem clear.

More right approach is to support pmd dirty handling in general page fault
handler rather than tweaking MADV_FREE. I just sent a new patch with Ccing
you.

Could you test it, please?
Thanks!

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 17+ messages in thread

* jemalloc testsuite stalls in memset
@ 2016-12-21 23:54           ` Minchan Kim
  0 siblings, 0 replies; 17+ messages in thread
From: Minchan Kim @ 2016-12-21 23:54 UTC (permalink / raw)
  To: linux-arm-kernel

Hello, Andreas

Sorry for long delay. I was on vacation.

On Fri, Dec 16, 2016 at 03:16:20PM +0100, Andreas Schwab wrote:
> On Dez 16 2016, Minchan Kim <minchan@kernel.org> wrote:
> 
> > Below helps?
> >
> > diff --git a/mm/huge_memory.c b/mm/huge_memory.c
> > index e10a4fe..dc37c9a 100644
> > --- a/mm/huge_memory.c
> > +++ b/mm/huge_memory.c
> > @@ -1611,6 +1611,7 @@ int madvise_free_huge_pmd(struct mmu_gather *tlb, struct vm_area_struct *vma,
> >  			tlb->fullmm);
> >  		orig_pmd = pmd_mkold(orig_pmd);
> >  		orig_pmd = pmd_mkclean(orig_pmd);
> > +		orig_pmd = pmd_wrprotect(orig_pmd);
> >  
> >  		set_pmd_at(mm, addr, pmd, orig_pmd);
> >  		tlb_remove_pmd_tlb_entry(tlb, pmd, addr);
> 
> Thanks, this fixes the issue (tested with 4.9).

It was a quick hack to know what exact problem is there and your confirming
helped a lot to understand the problem clear.

More right approach is to support pmd dirty handling in general page fault
handler rather than tweaking MADV_FREE. I just sent a new patch with Ccing
you.

Could you test it, please?
Thanks!

^ permalink raw reply	[flat|nested] 17+ messages in thread

end of thread, other threads:[~2016-12-21 23:54 UTC | newest]

Thread overview: 17+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-12-14 14:34 jemalloc testsuite stalls in memset Andreas Schwab
2016-12-14 14:34 ` Andreas Schwab
2016-12-14 23:50 ` Minchan Kim
2016-12-14 23:50   ` Minchan Kim
2016-12-14 23:50   ` Minchan Kim
2016-12-15  9:24   ` Andreas Schwab
2016-12-15  9:24     ` Andreas Schwab
2016-12-15  9:24     ` Andreas Schwab
2016-12-16  6:39     ` Minchan Kim
2016-12-16  6:39       ` Minchan Kim
2016-12-16  6:39       ` Minchan Kim
2016-12-16 14:16       ` Andreas Schwab
2016-12-16 14:16         ` Andreas Schwab
2016-12-16 14:16         ` Andreas Schwab
2016-12-21 23:54         ` Minchan Kim
2016-12-21 23:54           ` Minchan Kim
2016-12-21 23:54           ` Minchan Kim

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.