linux-arm-kernel.lists.infradead.org archive mirror
* LTP hugemmap05 test case failure on arm64 with linux-next (next-20190613)
@ 2019-06-13 21:34 Qian Cai
  2019-06-14 10:20 ` Will Deacon
  0 siblings, 1 reply; 11+ messages in thread
From: Qian Cai @ 2019-06-13 21:34 UTC (permalink / raw)
  To: Will Deacon, Catalin Marinas
  Cc: linux-mm, linux-kernel, linux-arm-kernel, Anshuman Khandual

The LTP hugemmap05 test case [1] fails to exit cleanly and then degrades system
performance on arm64 with linux-next (next-20190613). The bisection so far
indicates:

BAD:  30bafbc357f1 Merge remote-tracking branch 'arm64/for-next/core'
GOOD: 0c3d124a3043 Merge remote-tracking branch 'arm64-fixes/for-next/fixes'

I don't see anything obvious between those two pull requests, so I guess
something in 'arm64/for-next/core' is wrong.

$ git log --oneline 361413ee1992..9b6047220590
9b6047220590 arm64: mm: avoid redundant READ_ONCE(*ptep)
4745224b4509 arm64/mm: Refactor __do_page_fault()
c49bd02f4c74 arm64/mm: Document write abort detection from ESR
8e01076afd97 arm64: Fix comment after #endif
f086f67485c5 arm64: ptrace: add support for syscall emulation
fd3866381be2 arm64: add PTRACE_SYSEMU{,SINGLESTEP} definations to uapi headers
15532fd6f57c ptrace: move clearing of TIF_SYSCALL_EMU flag to core
616810360043 arm64/mm: Drop task_struct argument from __do_page_fault()
a0509313d5de arm64/mm: Drop mmap_sem before calling __do_kernel_fault()
01de1776f62e arm64/mm: Identify user instruction aborts
87dedf7c61ab arm64/mm: Change BUG_ON() to VM_BUG_ON() in [pmd|pud]_set_huge()
2e6aee5af330 arm64: kernel: use aff3 instead of aff2 in comment
27e6e7d63fc2 arm64/cpufeature: Convert hook_lock to raw_spin_lock_t in cpu_enable_ssbs()
0c1f14ed1226 arm64: mm: make CONFIG_ZONE_DMA32 configurable
f7f0097af67c arm64/mm: Simplify protection flag creation for kernel huge mappings
7b8c87b297a7 arm64: cacheinfo: Update cache_line_size detected from DT or PPTT
9a83c84c3a49 drivers: base: cacheinfo: Add variable to record max cache line size
6dcdefcde413 arm64/fpsimd: Don't disable softirq when touching FPSIMD/SVE state
54b8c7cbc57c arm64/fpsimd: Introduce fpsimd_save_and_flush_cpu_state() and use it
6fa9b41f6f15 arm64/fpsimd: Remove the prototype for sve_flush_cpu_state()
201d355c15c1 arm64/mm: Move PTE_VALID from SW defined to HW page table entry definitions
441a62780687 arm64/hugetlb: Use macros for contiguous huge page sizes

[1] https://github.com/linux-test-project/ltp/blob/master/testcases/kernel/mem/hugetlb/hugemmap/hugemmap05.c

# /opt/ltp/testcases/bin/hugemmap05 -s -m
tst_test.c:1111: INFO: Timeout per run is 0h 05m 00s
hugemmap05.c:235: INFO: original nr_hugepages is 0
hugemmap05.c:248: INFO: original nr_overcommit_hugepages is 0
Test timeouted, sending SIGKILL!
Test timeouted, sending SIGKILL!
Test timeouted, sending SIGKILL!
Test timeouted, sending SIGKILL!
Test timeouted, sending SIGKILL!
Test timeouted, sending SIGKILL!
Test timeouted, sending SIGKILL!
Test timeouted, sending SIGKILL!
Test timeouted, sending SIGKILL!
Test timeouted, sending SIGKILL!
Test timeouted, sending SIGKILL!
Cannot kill test processes!
Congratulation, likely test hit a kernel bug.
Exitting uncleanly...

[ 7792.681691][ T5025] LTP: starting hugemmap05_3 (hugemmap05 -s -m)
[ 7911.149058][ T1309] INFO: task hugemmap05:51035 can't die for more than 122 seconds.
[ 7911.156833][ T1309] hugemmap05      R  running task    27648 51035      1 0x0000000d
[ 7911.164654][ T1309] Call trace:
[ 7911.167823][ T1309]  __switch_to+0x2e0/0x37c
[ 7911.172128][ T1309]  0x3e4ca
[ 7911.175033][ T1309] 
[ 7911.175033][ T1309] Showing all locks held in the system:
[ 7911.182888][ T1309] 1 lock held by khungtaskd/1309:
[ 7911.187778][ T1309]  #0: 0000000037a3e572 (rcu_read_lock){....}, at: rcu_lock_acquire+0x8/0x38
[ 7911.196655][ T1309] 4 locks held by hugemmap05/51035:
[ 7911.201731][ T1309] 4 locks held by hugemmap05/51038:
[ 7911.206814][ T1309] 
[ 7911.209025][ T1309] =============================================
[ 7911.209025][ T1309] 

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel


* Re: LTP hugemmap05 test case failure on arm64 with linux-next (next-20190613)
  2019-06-13 21:34 LTP hugemmap05 test case failure on arm64 with linux-next (next-20190613) Qian Cai
@ 2019-06-14 10:20 ` Will Deacon
  2019-06-14 12:15   ` Qian Cai
  0 siblings, 1 reply; 11+ messages in thread
From: Will Deacon @ 2019-06-14 10:20 UTC (permalink / raw)
  To: Qian Cai
  Cc: linux-mm, Catalin Marinas, linux-kernel, linux-arm-kernel,
	Anshuman Khandual

Hi Qian,

On Thu, Jun 13, 2019 at 05:34:01PM -0400, Qian Cai wrote:
> LTP hugemmap05 test case [1] could not exit itself properly and then degrade the
> system performance on arm64 with linux-next (next-20190613). The bisection so
> far indicates,
> 
> BAD:  30bafbc357f1 Merge remote-tracking branch 'arm64/for-next/core'
> GOOD: 0c3d124a3043 Merge remote-tracking branch 'arm64-fixes/for-next/fixes'

Did you finish the bisection in the end? Also, what config are you using
(you usually have something fairly esoteric ;)?

Thanks,

Will


* Re: LTP hugemmap05 test case failure on arm64 with linux-next (next-20190613)
  2019-06-14 10:20 ` Will Deacon
@ 2019-06-14 12:15   ` Qian Cai
  2019-06-17  1:32     ` Anshuman Khandual
  0 siblings, 1 reply; 11+ messages in thread
From: Qian Cai @ 2019-06-14 12:15 UTC (permalink / raw)
  To: Will Deacon
  Cc: linux-mm, Catalin Marinas, linux-kernel, linux-arm-kernel,
	Anshuman Khandual

On Fri, 2019-06-14 at 11:20 +0100, Will Deacon wrote:
> Hi Qian,
> 
> On Thu, Jun 13, 2019 at 05:34:01PM -0400, Qian Cai wrote:
> > LTP hugemmap05 test case [1] could not exit itself properly and then degrade
> > the
> > system performance on arm64 with linux-next (next-20190613). The bisection
> > so
> > far indicates,
> > 
> > BAD:  30bafbc357f1 Merge remote-tracking branch 'arm64/for-next/core'
> > GOOD: 0c3d124a3043 Merge remote-tracking branch 'arm64-fixes/for-next/fixes'
> 
> Did you finish the bisection in the end? Also, what config are you using
> (you usually have something fairly esoteric ;)?

No, it is still running.

https://raw.githubusercontent.com/cailca/linux-mm/master/arm64.config


* Re: LTP hugemmap05 test case failure on arm64 with linux-next (next-20190613)
  2019-06-14 12:15   ` Qian Cai
@ 2019-06-17  1:32     ` Anshuman Khandual
  2019-06-17  1:41       ` Qian Cai
  0 siblings, 1 reply; 11+ messages in thread
From: Anshuman Khandual @ 2019-06-17  1:32 UTC (permalink / raw)
  To: Qian Cai, Will Deacon
  Cc: Catalin Marinas, linux-kernel, linux-arm-kernel, linux-mm

Hello Qian,

On 06/14/2019 05:45 PM, Qian Cai wrote:
> On Fri, 2019-06-14 at 11:20 +0100, Will Deacon wrote:
>> Hi Qian,
>>
>> On Thu, Jun 13, 2019 at 05:34:01PM -0400, Qian Cai wrote:
>>> LTP hugemmap05 test case [1] could not exit itself properly and then degrade
>>> the
>>> system performance on arm64 with linux-next (next-20190613). The bisection
>>> so
>>> far indicates,
>>>
>>> BAD:  30bafbc357f1 Merge remote-tracking branch 'arm64/for-next/core'
>>> GOOD: 0c3d124a3043 Merge remote-tracking branch 'arm64-fixes/for-next/fixes'
>>
>> Did you finish the bisection in the end? Also, what config are you using
>> (you usually have something fairly esoteric ;)?
> 
> No, it is still running.
> 
> https://raw.githubusercontent.com/cailca/linux-mm/master/arm64.config
> 

Were you able to bisect the problem down to a particular commit?

- Anshuman


* Re: LTP hugemmap05 test case failure on arm64 with linux-next (next-20190613)
  2019-06-17  1:32     ` Anshuman Khandual
@ 2019-06-17  1:41       ` Qian Cai
  2019-06-24  9:35         ` Will Deacon
  0 siblings, 1 reply; 11+ messages in thread
From: Qian Cai @ 2019-06-17  1:41 UTC (permalink / raw)
  To: Anshuman Khandual
  Cc: Catalin Marinas, Will Deacon, linux-kernel, linux-arm-kernel, linux-mm



> On Jun 16, 2019, at 9:32 PM, Anshuman Khandual <anshuman.khandual@arm.com> wrote:
> 
> Hello Qian,
> 
> On 06/14/2019 05:45 PM, Qian Cai wrote:
>> On Fri, 2019-06-14 at 11:20 +0100, Will Deacon wrote:
>>> Hi Qian,
>>> 
>>> On Thu, Jun 13, 2019 at 05:34:01PM -0400, Qian Cai wrote:
>>>> LTP hugemmap05 test case [1] could not exit itself properly and then degrade
>>>> the
>>>> system performance on arm64 with linux-next (next-20190613). The bisection
>>>> so
>>>> far indicates,
>>>> 
>>>> BAD:  30bafbc357f1 Merge remote-tracking branch 'arm64/for-next/core'
>>>> GOOD: 0c3d124a3043 Merge remote-tracking branch 'arm64-fixes/for-next/fixes'
>>> 
>>> Did you finish the bisection in the end? Also, what config are you using
>>> (you usually have something fairly esoteric ;)?
>> 
>> No, it is still running.
>> 
>> https://raw.githubusercontent.com/cailca/linux-mm/master/arm64.config
>> 
> 
> Were you able to bisect the problem till a particular commit ?

Not yet. It turned out that the test case needs to run a few times (usually within 5) to reproduce the problem, so the previous bisection was totally wrong because it assumed the bad commit would fail every time. Once reproduced, the test case becomes unkillable, stuck in the D state.

I am still in the middle of running a new round of bisection. The current progress is:

35c99ffa20ed GOOD (survived 20 times)
def0fdae813d BAD
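
A note on bisecting an intermittent failure like this one: a commit can only be
marked GOOD after it survives enough repeated runs. The sketch below is a
hypothetical helper (the names classify_commit and survival_probability are
illustrative and come from neither LTP nor git). The 0.2 per-run failure rate is
an assumption loosely based on "usually within 5" above; under that assumption a
bad commit would survive 20 runs only about 1% of the time.

```python
from typing import Callable


def survival_probability(per_run_failure_rate: float, runs: int) -> float:
    """Chance that a truly bad commit passes `runs` independent runs in a row."""
    return (1.0 - per_run_failure_rate) ** runs


def classify_commit(run_once: Callable[[], bool], runs: int = 20) -> str:
    """Return "bad" at the first failing run, "good" only after `runs` passes.

    `run_once` is a hypothetical callable that runs the reproducer once and
    returns True if it exited cleanly.
    """
    for _ in range(runs):
        if not run_once():
            return "bad"
    return "good"


if __name__ == "__main__":
    # Assumed per-run failure rate of 0.2 ("usually within 5" runs).
    print(f"{survival_probability(0.2, 20):.4f}")
```

A wrapper script for "git bisect run" could map "good" to exit status 0 and
"bad" to exit status 1.
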

* Re: LTP hugemmap05 test case failure on arm64 with linux-next (next-20190613)
  2019-06-17  1:41       ` Qian Cai
@ 2019-06-24  9:35         ` Will Deacon
  2019-06-24 12:58           ` Qian Cai
  0 siblings, 1 reply; 11+ messages in thread
From: Will Deacon @ 2019-06-24  9:35 UTC (permalink / raw)
  To: Qian Cai
  Cc: Anshuman Khandual, Catalin Marinas, Will Deacon, linux-kernel,
	linux-mm, linux-arm-kernel

Hi Qian Cai,

On Sun, Jun 16, 2019 at 09:41:09PM -0400, Qian Cai wrote:
> > On Jun 16, 2019, at 9:32 PM, Anshuman Khandual <anshuman.khandual@arm.com> wrote:
> > On 06/14/2019 05:45 PM, Qian Cai wrote:
> >> On Fri, 2019-06-14 at 11:20 +0100, Will Deacon wrote:
> >>> On Thu, Jun 13, 2019 at 05:34:01PM -0400, Qian Cai wrote:
> >>>> LTP hugemmap05 test case [1] could not exit itself properly and then degrade
> >>>> the
> >>>> system performance on arm64 with linux-next (next-20190613). The bisection
> >>>> so
> >>>> far indicates,
> >>>> 
> >>>> BAD:  30bafbc357f1 Merge remote-tracking branch 'arm64/for-next/core'
> >>>> GOOD: 0c3d124a3043 Merge remote-tracking branch 'arm64-fixes/for-next/fixes'
> >>> 
> >>> Did you finish the bisection in the end? Also, what config are you using
> >>> (you usually have something fairly esoteric ;)?
> >> 
> >> No, it is still running.
> >> 
> >> https://raw.githubusercontent.com/cailca/linux-mm/master/arm64.config
> >> 
> > 
> > Were you able to bisect the problem till a particular commit ?
> 
> Not yet, it turned out the test case needs to run a few times (usually
> within 5) to reproduce, so the previous bisection was totally wrong where
> it assume the bad commit will fail every time. Once reproduced, the test
> case becomes unkillable stuck in the D state.
> 
> I am still in the middle of running a new round of bisection. The current
> progress is,
> 
> 35c99ffa20ed GOOD (survived 20 times)
> def0fdae813d BAD

Just wondering if you got anywhere with this? We've failed to reproduce the
problem locally.

Will


* Re: LTP hugemmap05 test case failure on arm64 with linux-next (next-20190613)
  2019-06-24  9:35         ` Will Deacon
@ 2019-06-24 12:58           ` Qian Cai
  2019-06-24 21:30             ` Qian Cai
  0 siblings, 1 reply; 11+ messages in thread
From: Qian Cai @ 2019-06-24 12:58 UTC (permalink / raw)
  To: Will Deacon
  Cc: Anshuman Khandual, Catalin Marinas, Will Deacon, linux-kernel,
	linux-mm, linux-arm-kernel

On Mon, 2019-06-24 at 10:35 +0100, Will Deacon wrote:
> Hi Qian Cai,
> 
> On Sun, Jun 16, 2019 at 09:41:09PM -0400, Qian Cai wrote:
> > > On Jun 16, 2019, at 9:32 PM, Anshuman Khandual <anshuman.khandual@arm.com>
> > > wrote:
> > > On 06/14/2019 05:45 PM, Qian Cai wrote:
> > > > On Fri, 2019-06-14 at 11:20 +0100, Will Deacon wrote:
> > > > > On Thu, Jun 13, 2019 at 05:34:01PM -0400, Qian Cai wrote:
> > > > > > > LTP hugemmap05 test case [1] could not exit itself properly and then
> > > > > > > degrade the system performance on arm64 with linux-next
> > > > > > > (next-20190613). The bisection so far indicates,
> > > > > > > 
> > > > > > > BAD:  30bafbc357f1 Merge remote-tracking branch 'arm64/for-next/core'
> > > > > > > GOOD: 0c3d124a3043 Merge remote-tracking branch 'arm64-fixes/for-next/fixes'
> > > > > > 
> > > > > > Did you finish the bisection in the end? Also, what config are you using
> > > > > > (you usually have something fairly esoteric ;)?
> > > > 
> > > > No, it is still running.
> > > > 
> > > > https://raw.githubusercontent.com/cailca/linux-mm/master/arm64.config
> > > > 
> > > 
> > > Were you able to bisect the problem till a particular commit ?
> > 
> > Not yet, it turned out the test case needs to run a few times (usually
> > within 5) to reproduce, so the previous bisection was totally wrong where
> > it assume the bad commit will fail every time. Once reproduced, the test
> > case becomes unkillable stuck in the D state.
> > 
> > I am still in the middle of running a new round of bisection. The current
> > progress is,
> > 
> > 35c99ffa20ed GOOD (survived 20 times)
> > def0fdae813d BAD
> 
> Just wondering if you got anywhere with this? We've failed to reproduce the
> problem locally.

Unfortunately, I have not had a chance to dig into this yet. The progress I have
so far is:

The issue has been there for a long time; it goes back to 4.20 and probably
earlier. It does not fail every time. The script below usually reproduces it
within 100 tries.

i=0; while :; do ./hugemmap05 -m -s; echo $((i++)); sleep 5; done

This can be reproduced in an error path, i.e., shmget() in the test case will
fail every time before triggering the soft lockups.

# ./hugemmap05 -s -m
tst_test.c:1112: INFO: Timeout per run is 0h 05m 00s
hugemmap05.c:235: INFO: original nr_hugepages is 0
hugemmap05.c:248: INFO: original nr_overcommit_hugepages is 0
tst_safe_sysv_ipc.c:111: BROK: hugemmap05.c:97: shmget(218366029, 103079215104, b80) failed: ENOMEM
hugemmap05.c:192: INFO: restore nr_hugepages to 0.
hugemmap05.c:201: INFO: restore nr_overcommit_hugepages to 0.

Summary:
passed   0
failed   0
skipped  0
warnings 0
0

My understanding is that the soft lockups are triggered in this path,

ipcget
  ipcget_public
    ops->getnew
      newseg
        hugetlb_file_setup <- return ENOMEM

[ 1521.471216][ T1309] INFO: task hugemmap05:4718 blocked for more than 860 seconds.
[ 1521.478731][ T1309]       Tainted: G        W         5.2.0-rc4+ #8
[ 1521.485023][ T1309] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 1521.493568][ T1309] hugemmap05      D27168  4718      1 0x00000001
[ 1521.499815][ T1309] Call trace:
[ 1521.502985][ T1309]  __switch_to+0x2e0/0x37c
[ 1521.507278][ T1309]  __schedule+0xa0c/0xd9c
[ 1521.511484][ T1309]  schedule+0x60/0x168
[ 1521.515430][ T1309]  __rwsem_down_write_failed_common+0x484/0x7b8
[ 1521.521546][ T1309]  rwsem_down_write_failed+0x20/0x2c
[ 1521.526717][ T1309]  down_write+0xa0/0xa4
[ 1521.530747][ T1309]  ipcget+0x74/0x414
[ 1521.534518][ T1309]  ksys_shmget+0x90/0xc4
[ 1521.538638][ T1309]  __arm64_sys_shmget+0x54/0x88
[ 1521.543366][ T1309]  el0_svc_handler+0x198/0x260
[ 1521.548005][ T1309]  el0_svc+0x8/0xc
[ 1521.551605][ T1309] 
[ 1521.551605][ T1309] Showing all locks held in the system:
[ 1521.559349][ T1309] 1 lock held by khungtaskd/1309:
[ 1521.564251][ T1309]  #0: 00000000033dd0e2 (rcu_read_lock){....}, at: rcu_lock_acquire+0x8/0x38
[ 1521.573014][ T1309] 2 locks held by hugemmap05/4694:
[ 1521.578010][ T1309] 1 lock held by hugemmap05/4718:
[ 1521.582904][ T1309]  #0: 00000000c62a3d44 (&ids->rwsem){....}, at: ipcget+0x74/0x414
[ 1521.590707][ T1309] 1 lock held by hugemmap05/4755:
[ 1521.595595][ T1309]  #0: 00000000c62a3d44 (&ids->rwsem){....}, at: ipcget+0x74/0x414
[ 1521.603373][ T1309] 1 lock held by hugemmap05/4781:
[ 1521.608270][ T1309]  #0: 00000000c62a3d44 (&ids->rwsem){....}, at: ipcget+0x74/0x414
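
The lock dump above lists several tasks against the same lock address. A small,
hypothetical helper (not part of any kernel tooling) can group such lines by
lock. It assumes each wrapped log line has first been joined back onto a single
line, and the sample input is taken from the dump above.

```python
import re
from collections import defaultdict

# "N lock(s) held by comm/pid:" header line from the held-locks dump.
HOLDER_RE = re.compile(r"\d+ locks? held by (?P<task>\S+/\d+):")
# "#0: <address> (<lock name>){....}, at: <symbol>" detail line.
LOCK_RE = re.compile(r"#\d+: (?P<addr>[0-9a-f]+) \((?P<name>[^)]+)\)")


def group_tasks_by_lock(lines):
    """Map (lock address, lock name) to the tasks listed against that lock."""
    groups = defaultdict(list)
    current_task = None
    for line in lines:
        holder = HOLDER_RE.search(line)
        if holder:
            current_task = holder.group("task")
            continue
        lock = LOCK_RE.search(line)
        if lock and current_task is not None:
            groups[(lock.group("addr"), lock.group("name"))].append(current_task)
    return dict(groups)


SAMPLE = [
    "[ 1521.578010][ T1309] 1 lock held by hugemmap05/4718:",
    "[ 1521.582904][ T1309]  #0: 00000000c62a3d44 (&ids->rwsem){....}, at: ipcget+0x74/0x414",
    "[ 1521.590707][ T1309] 1 lock held by hugemmap05/4755:",
    "[ 1521.595595][ T1309]  #0: 00000000c62a3d44 (&ids->rwsem){....}, at: ipcget+0x74/0x414",
]

if __name__ == "__main__":
    for (addr, name), tasks in group_tasks_by_lock(SAMPLE).items():
        print(addr, name, tasks)
```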


* Re: LTP hugemmap05 test case failure on arm64 with linux-next (next-20190613)
  2019-06-24 12:58           ` Qian Cai
@ 2019-06-24 21:30             ` Qian Cai
  2019-06-24 21:53               ` Mike Kravetz
  0 siblings, 1 reply; 11+ messages in thread
From: Qian Cai @ 2019-06-24 21:30 UTC (permalink / raw)
  To: Will Deacon
  Cc: Anshuman Khandual, Catalin Marinas, linux-kernel, linux-mm,
	linux-arm-kernel, Mike Kravetz

So the problem is that ipcget_public() holds the semaphore "ids->rwsem" for too
long, seemingly unnecessarily, and sometimes goes to sleep in direct reclaim
while holding it (other times, in LTP hugemmap05 [1], hugetlb_file_setup()
returns -ENOMEM):

[  788.765739][ T1315] INFO: task hugemmap05:5001 can't die for more than 122 seconds.
[  788.773512][ T1315] hugemmap05      R  running task    25600  5001      1 0x0000000d
[  788.781348][ T1315] Call trace:
[  788.784536][ T1315]  __switch_to+0x2e0/0x37c
[  788.788848][ T1315]  try_to_free_pages+0x614/0x934
[  788.793679][ T1315]  __alloc_pages_nodemask+0xe88/0x1d60
[  788.799030][ T1315]  alloc_fresh_huge_page+0x16c/0x588
[  788.804206][ T1315]  alloc_surplus_huge_page+0x9c/0x278
[  788.809468][ T1315]  hugetlb_acct_memory+0x114/0x5c4
[  788.814469][ T1315]  hugetlb_reserve_pages+0x170/0x2b0
[  788.819662][ T1315]  hugetlb_file_setup+0x26c/0x3a8
[  788.824600][ T1315]  newseg+0x220/0x63c
[  788.828490][ T1315]  ipcget+0x570/0x674
[  788.832377][ T1315]  ksys_shmget+0x90/0xc4
[  788.836525][ T1315]  __arm64_sys_shmget+0x54/0x88
[  788.841282][ T1315]  el0_svc_handler+0x19c/0x26c
[  788.845952][ T1315]  el0_svc+0x8/0xc

and then all the other processes wait on the semaphore, which causes lock
contention:

[  788.849583][ T1315] INFO: task hugemmap05:5027 blocked for more than 122 seconds.
[  788.857119][ T1315]       Tainted: G        W         5.2.0-rc6-next-20190624 #2
[  788.864566][ T1315] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[  788.873139][ T1315] hugemmap05      D26960  5027   5026 0x00000000
[  788.879395][ T1315] Call trace:
[  788.882576][ T1315]  __switch_to+0x2e0/0x37c
[  788.886901][ T1315]  __schedule+0xb74/0xf0c
[  788.891136][ T1315]  schedule+0x60/0x168
[  788.895097][ T1315]  rwsem_down_write_slowpath+0x5a0/0x8c8
[  788.900653][ T1315]  down_write+0xc0/0xc4
[  788.904715][ T1315]  ipcget+0x74/0x674
[  788.908516][ T1315]  ksys_shmget+0x90/0xc4
[  788.912664][ T1315]  __arm64_sys_shmget+0x54/0x88
[  788.917420][ T1315]  el0_svc_handler+0x19c/0x26c
[  788.922088][ T1315]  el0_svc+0x8/0xc

Ideally, it seems that only ipc_findkey() and newseg() in this path need to hold
the semaphore to protect against concurrent access, so it could just be
converted to a spinlock instead.
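
As a userspace analogy only (Python threads, not kernel code, and not a proposed
patch), the sketch below contrasts holding a lock across a slow allocation with
holding it only around the lookup and insert steps. All names are hypothetical.
A second thread probes the lock while the first one is in its slow phase.

```python
import threading


def probe_during_slow_phase(hold_lock_across_slow_phase: bool) -> bool:
    """Return True if a second thread can take the lock during the slow phase."""
    ids_lock = threading.Lock()          # stands in for ids->rwsem
    in_slow_phase = threading.Event()    # creator has entered the slow step
    slow_phase_done = threading.Event()  # lets the creator finish

    def slow_allocation():
        in_slow_phase.set()
        slow_phase_done.wait()           # stands in for reclaim/allocation

    def creator():
        if hold_lock_across_slow_phase:
            with ids_lock:               # lookup, slow setup, insert: one section
                slow_allocation()
        else:
            with ids_lock:               # lookup only
                pass
            slow_allocation()            # slow setup without the lock
            with ids_lock:               # insert only
                pass

    worker = threading.Thread(target=creator)
    worker.start()
    in_slow_phase.wait()
    acquired = ids_lock.acquire(blocking=False)
    if acquired:
        ids_lock.release()
    slow_phase_done.set()
    worker.join()
    return acquired


if __name__ == "__main__":
    print("wide critical section, lock available:", probe_during_slow_phase(True))
    print("narrow critical section, lock available:", probe_during_slow_phase(False))
```

The narrow variant would also have to re-check, after re-taking the lock,
whether another task created the same key in the meantime. The sketch leaves
that out.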

[1] ./hugemmap05 -s -m

https://github.com/linux-test-project/ltp/blob/master/testcases/kernel/mem/hugetlb/hugemmap/hugemmap05.c


* Re: LTP hugemmap05 test case failure on arm64 with linux-next (next-20190613)
  2019-06-24 21:30             ` Qian Cai
@ 2019-06-24 21:53               ` Mike Kravetz
  2019-06-27 18:09                 ` Mike Kravetz
  0 siblings, 1 reply; 11+ messages in thread
From: Mike Kravetz @ 2019-06-24 21:53 UTC (permalink / raw)
  To: Qian Cai, Will Deacon
  Cc: linux-mm, Catalin Marinas, linux-kernel, linux-arm-kernel,
	Anshuman Khandual

On 6/24/19 2:30 PM, Qian Cai wrote:
> So the problem is that ipcget_public() has held the semaphore "ids->rwsem" for
> too long seems unnecessarily and then goes to sleep sometimes due to direct
> reclaim (other times LTP hugemmap05 [1] has hugetlb_file_setup() returns
> -ENOMEM),

Thanks for looking into this!  I noticed that recent kernels could take a
VERY long time trying to do high order allocations.  In my case it was trying
to do dynamic hugetlb page allocations as well [1].  But, IMO this is more
of a general direct reclaim/compaction issue than something hugetlb-specific.

> 
> [  788.765739][ T1315] INFO: task hugemmap05:5001 can't die for more than 122
> seconds.
> [  788.773512][ T1315] hugemmap05      R  running task    25600  5001      1
> 0x0000000d
> [  788.781348][ T1315] Call trace:
> [  788.784536][ T1315]  __switch_to+0x2e0/0x37c
> [  788.788848][ T1315]  try_to_free_pages+0x614/0x934
> [  788.793679][ T1315]  __alloc_pages_nodemask+0xe88/0x1d60
> [  788.799030][ T1315]  alloc_fresh_huge_page+0x16c/0x588
> [  788.804206][ T1315]  alloc_surplus_huge_page+0x9c/0x278
> [  788.809468][ T1315]  hugetlb_acct_memory+0x114/0x5c4
> [  788.814469][ T1315]  hugetlb_reserve_pages+0x170/0x2b0
> [  788.819662][ T1315]  hugetlb_file_setup+0x26c/0x3a8
> [  788.824600][ T1315]  newseg+0x220/0x63c
> [  788.828490][ T1315]  ipcget+0x570/0x674
> [  788.832377][ T1315]  ksys_shmget+0x90/0xc4
> [  788.836525][ T1315]  __arm64_sys_shmget+0x54/0x88
> [  788.841282][ T1315]  el0_svc_handler+0x19c/0x26c
> [  788.845952][ T1315]  el0_svc+0x8/0xc
> 
> and then all other processes are waiting on the semaphore causes lock
> contentions,

That call to hugetlb_file_setup() via ipcget certainly could take a long
time to execute.  In the default case huge pages are reserved to back the
shared memory segment.  If these pages were not preallocated, then the
code will try to dynamically allocate the required number of huge pages.
So, even if [1] were not an issue I think a change here makes sense.
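
The reservation step described above can be modelled roughly as follows. This is
a simplified Python sketch, not the kernel's code: the Pool fields and reserve()
are hypothetical names, and locking, NUMA, subpools and the release of partially
allocated pages are all ignored. try_alloc stands in for the dynamic huge page
allocation, which is the step that can take a long time.

```python
import errno
from dataclasses import dataclass
from typing import Callable


@dataclass
class Pool:
    free: int = 0              # preallocated huge pages currently free
    reserved: int = 0          # free pages already promised to mappings
    surplus: int = 0           # pages allocated dynamically beyond the pool
    overcommit_limit: int = 0  # cap on surplus pages


def reserve(pool: Pool, pages: int, try_alloc: Callable[[], bool]) -> int:
    """Reserve `pages` huge pages; return 0 or a negative errno value."""
    shortfall = pool.reserved + pages - pool.free
    allocated = 0
    while allocated < shortfall:
        if pool.surplus + allocated >= pool.overcommit_limit:
            return -errno.ENOMEM   # not allowed to grow the pool further
        if not try_alloc():        # the potentially slow step
            return -errno.ENOMEM
        allocated += 1
    # Commit only once every needed page is available.
    pool.free += allocated
    pool.surplus += allocated
    pool.reserved += pages
    return 0
```

With enough free pages the loop never runs and the call returns immediately;
the cost appears only when pages have to be allocated on demand.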

> [  788.849583][ T1315] INFO: task hugemmap05:5027 blocked for more than 122
> seconds.
> [  788.857119][ T1315]       Tainted: G        W         5.2.0-rc6-next-20190624 
> #2
> [  788.864566][ T1315] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
> disables this message.
> [  788.873139][ T1315] hugemmap05      D26960  5027   5026 0x00000000
> [  788.879395][ T1315] Call trace:
> [  788.882576][ T1315]  __switch_to+0x2e0/0x37c
> [  788.886901][ T1315]  __schedule+0xb74/0xf0c
> [  788.891136][ T1315]  schedule+0x60/0x168
> [  788.895097][ T1315]  rwsem_down_write_slowpath+0x5a0/0x8c8
> [  788.900653][ T1315]  down_write+0xc0/0xc4
> [  788.904715][ T1315]  ipcget+0x74/0x674
> [  788.908516][ T1315]  ksys_shmget+0x90/0xc4
> [  788.912664][ T1315]  __arm64_sys_shmget+0x54/0x88
> [  788.917420][ T1315]  el0_svc_handler+0x19c/0x26c
> [  788.922088][ T1315]  el0_svc+0x8/0xc
> 
> Ideally, it seems only ipc_findkey() and newseg() in this path needs to hold the
> semaphore to protect concurrency access, so it could just be converted to a
> spinlock instead.

I do not have enough experience with this ipc code to comment on your proposed
change.  But, I will look into it.

[1] https://lkml.org/lkml/2019/4/23/2
-- 
Mike Kravetz


* Re: LTP hugemmap05 test case failure on arm64 with linux-next (next-20190613)
  2019-06-24 21:53               ` Mike Kravetz
@ 2019-06-27 18:09                 ` Mike Kravetz
  2019-06-27 18:54                   ` Qian Cai
  0 siblings, 1 reply; 11+ messages in thread
From: Mike Kravetz @ 2019-06-27 18:09 UTC (permalink / raw)
  To: Qian Cai, Will Deacon
  Cc: linux-mm, Catalin Marinas, linux-kernel, linux-arm-kernel,
	Anshuman Khandual

On 6/24/19 2:53 PM, Mike Kravetz wrote:
> On 6/24/19 2:30 PM, Qian Cai wrote:
>> So the problem is that ipcget_public() has held the semaphore "ids->rwsem" for
>> too long seems unnecessarily and then goes to sleep sometimes due to direct
>> reclaim (other times LTP hugemmap05 [1] has hugetlb_file_setup() returns
>> -ENOMEM),
> 
> Thanks for looking into this!  I noticed that recent kernels could take a
> VERY long time trying to do high order allocations.  In my case it was trying
> to do dynamic hugetlb page allocations as well [1].  But, IMO this is more
> of a general direct reclaim/compation issue than something hugetlb specific.
> 

<snip>

>> Ideally, it seems only ipc_findkey() and newseg() in this path needs to hold the
>> semaphore to protect concurrency access, so it could just be converted to a
>> spinlock instead.
> 
> I do not have enough experience with this ipc code to comment on your proposed
> change.  But, I will look into it.
> 
> [1] https://lkml.org/lkml/2019/4/23/2

I only took a quick look at the ipc code, but there does not appear to be
a quick/easy change to make.  The issue is that shared memory creation could
take a long time.  With issue [1] above unresolved, creation of hugetlb backed
shared memory segments could take a VERY long time.

I do not believe the test failure is arm specific.  Most likely, it is just
because testing was done on a system with a memory size that triggers this issue?

My plan is to focus on [1].  When that is resolved, this issue should go away.
-- 
Mike Kravetz


* Re: LTP hugemmap05 test case failure on arm64 with linux-next (next-20190613)
  2019-06-27 18:09                 ` Mike Kravetz
@ 2019-06-27 18:54                   ` Qian Cai
  0 siblings, 0 replies; 11+ messages in thread
From: Qian Cai @ 2019-06-27 18:54 UTC (permalink / raw)
  To: Mike Kravetz, Will Deacon
  Cc: linux-mm, Catalin Marinas, linux-kernel, linux-arm-kernel,
	Anshuman Khandual

On Thu, 2019-06-27 at 11:09 -0700, Mike Kravetz wrote:
> On 6/24/19 2:53 PM, Mike Kravetz wrote:
> > On 6/24/19 2:30 PM, Qian Cai wrote:
> > > So the problem is that ipcget_public() has held the semaphore "ids->rwsem" 
> > > for
> > > too long seems unnecessarily and then goes to sleep sometimes due to
> > > direct
> > > reclaim (other times LTP hugemmap05 [1] has hugetlb_file_setup() returns
> > > -ENOMEM),
> > 
> > Thanks for looking into this!  I noticed that recent kernels could take a
> > VERY long time trying to do high order allocations.  In my case it was
> > trying
> > to do dynamic hugetlb page allocations as well [1].  But, IMO this is more
> > of a general direct reclaim/compation issue than something hugetlb specific.
> > 
> 
> <snip>
> 
> > > Ideally, it seems only ipc_findkey() and newseg() in this path needs to
> > > hold the
> > > semaphore to protect concurrency access, so it could just be converted to
> > > a
> > > spinlock instead.
> > 
> > I do not have enough experience with this ipc code to comment on your
> > proposed
> > change.  But, I will look into it.
> > 
> > [1] https://lkml.org/lkml/2019/4/23/2
> 
> I only took a quick look at the ipc code, but there does not appear to be
> a quick/easy change to make.  The issue is that shared memory creation could
> take a long time.  With issue [1] above unresolved, creation of hugetlb backed
> shared memory segments could take a VERY long time.
> 
> I do not believe the test failure is arm specific.  Most likely, it is just
> because testing was done on a system with memory size to trigger this issue?

I think it is because the arm64 machine has a default hugepage size of 512M
instead of the 2M on other arches, but the test case still blindly tries to
allocate around 200 hugepages, which the system can't handle gracefully, i.e.,
return -ENOMEM in a reasonable time.
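
For a sense of scale, the size passed to shmget() in the earlier log
(103079215104 bytes) can be divided by the hugepage size. The snippet below is
plain arithmetic. The assumption that the test asks for the same number of pages
on a 2M-hugepage system is mine, based on the "around 200" figure above.

```python
MIB = 1024 * 1024
GIB = 1024 * MIB

SHMGET_BYTES = 103079215104   # size argument from the shmget() failure log
HUGEPAGE_512M = 512 * MIB     # default hugepage size on this arm64 machine
HUGEPAGE_2M = 2 * MIB         # default hugepage size on other arches

pages = SHMGET_BYTES // HUGEPAGE_512M
total_gib_512m = SHMGET_BYTES / GIB
# Assumption: the same page count is requested regardless of hugepage size.
total_mib_2m = pages * HUGEPAGE_2M / MIB

if __name__ == "__main__":
    print(pages, total_gib_512m, total_mib_2m)
```

That works out to 192 pages, or 96 GiB with 512M hugepages, against 384 MiB for
the same page count with 2M hugepages.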

> 
> My plan is to focus on [1].  When that is resolved, this issue should go away.

