* Resend: 3.10.32-rt31: problem with "rcu: Eliminate softirq processing from rcutree"
@ 2014-03-06 19:05 Helmut Buchsbaum
  2014-03-07 16:21 ` Sebastian Andrzej Siewior
  0 siblings, 1 reply; 9+ messages in thread
From: Helmut Buchsbaum @ 2014-03-06 19:05 UTC (permalink / raw)
  To: linux-rt-users, rostedt

I am working with a 3.10-rt based kernel on a custom device based on
a Xilinx Zynq Z-7010 (dual Cortex-A9) with 64MB DDR2 RAM.
Today I rebased my work (3.10-rt, mainly enhanced by Xilinx drivers,
but otherwise unchanged) from 3.10.32-rt30 to 3.10.32-rt31. There I
ran a dohell derivative (from
http://git.xenomai.org/xenomai-2.6.git/plain/src/testsuite/xeno-test/dohell?id=v2.6.3,
using a customized hackbench call and cyclictest as the only test)
when the OOM killer struck, which definitely did not happen with
3.10.32-rt30. Bisecting, I identified
8cf5b0e982cba4fd0e469989a420a15c8a378fa2 "rcu: Eliminate softirq
processing from rcutree" as the change that introduced the unwanted
behavior:

[   73.865213] ls invoked oom-killer: gfp_mask=0x2000d0, order=0, oom_score_adj=0
[   73.877229] CPU: 1 PID: 623 Comm: ls Not tainted 3.10.32-rt30+ #1
[   73.877280] [<c0013c80>] (unwind_backtrace+0x0/0x138) from [<c0011d14>] (show_stack+0x10/0x14)
[   73.877310] [<c0011d14>] (show_stack+0x10/0x14) from [<c0394d88>] (dump_header.isra.12+0x6c/0xa8)
[   73.877334] [<c0394d88>] (dump_header.isra.12+0x6c/0xa8) from [<c0394e10>] (oom_kill_process.part.14+0x4c/0x384)
[   73.877357] [<c0394e10>] (oom_kill_process.part.14+0x4c/0x384) from [<c007301c>] (out_of_memory+0x128/0x1c8)
[   73.877380] [<c007301c>] (out_of_memory+0x128/0x1c8) from [<c0076f10>] (__alloc_pages_nodemask+0x6f4/0x71c)
[   73.877406] [<c0076f10>] (__alloc_pages_nodemask+0x6f4/0x71c) from [<c009fe9c>] (allocate_slab+0xe4/0xfc)
[   73.877426] [<c009fe9c>] (allocate_slab+0xe4/0xfc) from [<c009fee4>] (new_slab+0x30/0x154)
[   73.877447] [<c009fee4>] (new_slab+0x30/0x154) from [<c0396608>] (__slab_alloc.isra.52.constprop.54+0x2f0/0x3c4)
[   73.877467] [<c0396608>] (__slab_alloc.isra.52.constprop.54+0x2f0/0x3c4) from [<c00a0614>] (kmem_cache_alloc+0x130/0x138)
[   73.877489] [<c00a0614>] (kmem_cache_alloc+0x130/0x138) from [<c00a5d84>] (get_empty_filp+0x6c/0x1f4)
[   73.877509] [<c00a5d84>] (get_empty_filp+0x6c/0x1f4) from [<c00b18b8>] (path_openat+0x2c/0x43c)
[   73.877527] [<c00b18b8>] (path_openat+0x2c/0x43c) from [<c00b1f80>] (do_filp_open+0x2c/0x80)
[   73.877544] [<c00b1f80>] (do_filp_open+0x2c/0x80) from [<c00a40a4>] (do_sys_open+0xe8/0x174)
[   73.877564] [<c00a40a4>] (do_sys_open+0xe8/0x174) from [<c000e9c0>] (ret_fast_syscall+0x0/0x30)
[   73.877574] Mem-info:
[   73.877582] Normal per-cpu:
[   73.877593] CPU    0: hi:    6, btch:   1 usd:   0
[   73.877603] CPU    1: hi:    6, btch:   1 usd:   0
[   73.877625] active_anon:1827 inactive_anon:5 isolated_anon:0
[   73.877625]  active_file:174 inactive_file:1158 isolated_file:0
[   73.877625]  unevictable:2493 dirty:1 writeback:1160 unstable:0
[   73.877625]  free:3978 slab_reclaimable:1161 slab_unreclaimable:1345
[   73.877625]  mapped:440 shmem:512 pagetables:526 bounce:0
[   73.877625]  free_cma:3833
[   73.877669] Normal free:15912kB min:824kB low:1028kB high:1236kB active_anon:7308kB inactive_anon:20kB active_file:696kB inacs
[   73.877679] lowmem_reserve[]: 0 0
[   73.877695] Normal: 234*4kB (ERC) 208*8kB (URC) 200*16kB (C) 189*32kB (RC) 61*64kB (RC) 1*128kB (R) 0*256kB 0*512kB 0*1024kB B
[   73.877769] 2196 total pagecache pages
[   73.877780] 0 pages in swap cache
[   73.877789] Swap cache stats: add 0, delete 0, find 0/0
[   73.877796] Free swap  = 0kB
[   73.877804] Total swap = 0kB
[   73.884950] 16384 pages of RAM
[   73.884966] 5008 free pages
[   73.884975] 1558 reserved pages
[   73.884983] 1674 slab pages
[   73.884990] 532633 pages shared
[   73.884998] 0 pages swap cached
[   73.885007] [ pid ]   uid  tgid total_vm      rss nr_ptes swapents oom_score_adj name
[   73.908257] [  502]     0   502      723      119       5        0           0 klogd
[   73.936201] [  504]     0   504      723      112       5        0           0 syslogd
[   73.980352] [  506]     0   506      723       76       5        0           0 telnetd
[   74.028534] [  566]     0   566      724      166       5        0           0 ash
[   74.053796] [  606]     0   606      723       57       4        0           0 udhcpc
[   74.081419] [  607]     0   607      724      161       5        0           0 dohell
[   74.090873] [  609]     0   609      724       68       5        0           0 dohell
[   74.106407] [  610]     0   610      724       61       5        0           0 dohell
[   74.118017] [  611]     0   611      724       58       5        0           0 dohell
[   74.128390] [  612]     0   612      723      110       5        0           0 nc
[   74.198260] [  613]     0   613      724       61       5        0           0 dohell
[   74.214133] [  614]     0   614      724       60       5        0           0 dohell
[   74.227782] [  615]     0   615      724       60       5        0           0 dohell
[   74.240921] [  616]     0   616      723      104       4        0           0 dd
[   74.264528] [  617]     0   617      941      360       4        0           0 dd
[   74.284988] [  618]     0   618      724       60       5        0           0 dohell
[   74.312808] [  620]     0   620     2523     2494       9        0           0 cyclictest
[   74.333122] [  623]     0   623      724      174       5        0           0 ls
[   74.361071] [ 1623]     0  1623      690       95       4        0           0 sleep
[   74.382116] [ 2008]     0  2008      448      133       5        0           0 hackbench
[   74.402224] [ 2011]     0  2011      448       54       5        0           0 hackbench
[   74.426216] [ 2012]     0  2012      448       54       5        0           0 hackbench
[   74.452513] [ 2013]     0  2013      448       54       5        0           0 hackbench
[   74.468437] [ 2014]     0  2014      448       54       5        0           0 hackbench
[   74.481603] [ 2015]     0  2015      448       54       5        0           0 hackbench
[   74.495083] [ 2016]     0  2016      448       54       5        0           0 hackbench
[   74.528757] [ 2017]     0  2017      448       54       5        0           0 hackbench
[   74.542685] [ 2019]     0  2019      448       54       5        0           0 hackbench
[   74.561320] [ 2020]     0  2020      448       54       5        0           0 hackbench
[   74.576906] [ 2021]     0  2021      448       54       5        0           0 hackbench
[   74.586537] [ 2022]     0  2022      448       54       5        0           0 hackbench
[   74.610881] [ 2023]     0  2023      448       54       5        0           0 hackbench
[   74.696010] [ 2026]     0  2026      448       59       5        0           0 hackbench
[   74.746412] [ 2032]     0  2032      448       64       5        0           0 hackbench
[   74.781200] [ 2095]     0  2095      689       36       3        0           0 cat
[   74.795979] [ 2098]     0  2098      246        1       2        0           0 ps
[   74.815133] [ 2099]     0  2099      448      133       4        0           0 hackbench
[   74.829415] [ 2100]     0  2100      690       98       4        0           0 cat
[   74.844691] [ 2101]     0  2101      448       54       4        0           0 hackbench
[   74.855449] [ 2102]     0  2102      448       54       4        0           0 hackbench
[   74.876889] [ 2103]     0  2103      448       54       4        0           0 hackbench
[   74.893808] [ 2104]     0  2104      448       54       4        0           0 hackbench
[   74.915100] [ 2105]     0  2105      448       54       4        0           0 hackbench
[   74.932730] [ 2106]     0  2106      448       54       4        0           0 hackbench
[   74.965216] [ 2107]     0  2107      448       54       4        0           0 hackbench
[   75.027941] [ 2108]     0  2108      448       54       4        0           0 hackbench
[   75.046197] [ 2109]     0  2109      448       54       4        0           0 hackbench
[   75.063038] [ 2110]     0  2110      448       54       4        0           0 hackbench
[   75.079646] [ 2111]     0  2111      448       55       4        0           0 hackbench
[   75.115715] [ 2112]     0  2112      448       55       4        0           0 hackbench
[   75.125629] [ 2113]     0  2113      448       55       4        0           0 hackbench
[   75.136509] [ 2114]     0  2114      448       55       4        0           0 hackbench
[   75.146358] [ 2115]     0  2115      448       55       4        0           0 hackbench
[   75.157808] [ 2116]     0  2116      448       55       4        0           0 hackbench
[   75.167646] [ 2117]     0  2117      448       55       4        0           0 hackbench
[   75.177705] [ 2118]     0  2118      448       55       4        0           0 hackbench
[   75.187688] [ 2119]     0  2119      448       55       4        0           0 hackbench
[   75.202440] [ 2120]     0  2120      448       55       4        0           0 hackbench
[   75.226442] [ 2121]     0  2121      448       54       4        0           0 hackbench
[   75.299323] [ 2190]     0  2190      724      152       5        0           0 ps
[   75.309857] [ 2191]     0  2191      392       31       3        0           0 cat
[   75.320704] [ 2192]     0  2192      116       33       3        0           0 hackbench
[   75.331584] [ 2193]     0  2193      246        1       2        0           0 ps
[   75.341056] Out of memory: Kill process 620 (cyclictest) score 163 or sacrifice child
[   75.350899] Killed process 620 (cyclictest) total-vm:10092kB, anon-rss:8544kB, file-rss:1432kB

Unfortunately I'm rather busy at the moment, so I couldn't investigate
this issue any further; I'm sorry.
Steven, this issue probably also affects 3.8.13.14-rt28-rc1 (I just
discovered your post on the list).

Regards,
Helmut

PS: Running this test with hackbench alone (no dohell), I get nearly
the same result, with kmem_cache_alloc() failing.
PPS: I had to resend this post since my first attempt didn't get
through to the list. Sorry about any noise!
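[Editor's note: the bisect hunt described above can be sketched with `git bisect run`. The toy repository below stands in for the kernel tree, and the pass/fail check is invented; in the real hunt each step would build, boot, and run the dohell/cyclictest load instead.]

```shell
# Sketch of an automated bisect. A file f plays the role of the kernel:
# commits where f >= 4 are the "regressed" ones, mimicking the real hunt
# between v3.10.32-rt30 (good) and v3.10.32-rt31 (bad).
set -e
tmp=$(mktemp -d)
cd "$tmp"
git init -q repo && cd repo
for i in 1 2 3 4 5; do
    echo "$i" > f
    git add f
    git -c user.email=t@e -c user.name=t commit -q -m "commit $i"
done
good=$(git rev-list HEAD | tail -n 1)   # oldest commit is known good
git bisect start HEAD "$good"           # HEAD is known bad
# Exit 0 marks a commit good, non-zero marks it bad; git narrows down
# to the first bad commit automatically.
result=$(git bisect run sh -c 'test "$(cat f)" -lt 4')
echo "$result" | grep "first bad commit"
```

The real session would substitute the actual tags and a script that reboots the Zynq board and checks dmesg for the OOM killer.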

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: Resend: 3.10.32-rt31: problem with "rcu: Eliminate softirq processing from rcutree"
  2014-03-06 19:05 Resend: 3.10.32-rt31: problem with "rcu: Eliminate softirq processing from rcutree" Helmut Buchsbaum
@ 2014-03-07 16:21 ` Sebastian Andrzej Siewior
  2014-03-07 17:04   ` Paul E. McKenney
  0 siblings, 1 reply; 9+ messages in thread
From: Sebastian Andrzej Siewior @ 2014-03-07 16:21 UTC (permalink / raw)
  To: Helmut Buchsbaum; +Cc: linux-rt-users, rostedt, paulmck

+ paulmck

* Helmut Buchsbaum | 2014-03-06 20:05:13 [+0100]:

>I am working with a 3.10-rt based kernel on a custom device based on
>Xilinx Zynq Z-7010 (dual Cortex-A9) with 64MB DDR2 RAM.
>[... full report and OOM log snipped; see the original message above ...]

Do you have RCU_BOOST enabled? My guess here is that something is
running at a higher priority, not allowing rcuc/ to run and do its job.
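[Editor's note: both points can be checked on the running system. The sketch below is a hedged diagnostic, assuming procps `ps` is available; the location of the kernel config varies by distro.]

```shell
# Is RCU_BOOST configured, and at what RT priority do the per-CPU RCU
# kthreads (rcuc/N, rcub/N) run? Paths are distro-dependent.
check_rcu_boost() {
    cfg=""
    [ -r /proc/config.gz ] && cfg="zcat /proc/config.gz"
    [ -z "$cfg" ] && [ -r "/boot/config-$(uname -r)" ] && cfg="cat /boot/config-$(uname -r)"
    if [ -n "$cfg" ]; then
        eval "$cfg" | grep -E '^CONFIG_RCU_BOOST' || echo "CONFIG_RCU_BOOST is not set"
    else
        echo "kernel config not accessible on this system"
    fi
    # RT priority of the RCU kthreads; nothing below the header means
    # no such threads are visible (e.g. RCU_BOOST off or non-RT kernel)
    ps -eo pid,rtprio,comm | awk 'NR==1 || /rcu[cb]\//'
}
check_rcu_boost
```

If the rcuc/ threads show a lower rtprio than the CPU-bound test load (here: cyclictest and hackbench), they can be starved indefinitely.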

Sebastian


* Re: Resend: 3.10.32-rt31: problem with "rcu: Eliminate softirq processing from rcutree"
  2014-03-07 16:21 ` Sebastian Andrzej Siewior
@ 2014-03-07 17:04   ` Paul E. McKenney
  2014-03-08 19:08     ` Helmut Buchsbaum
  0 siblings, 1 reply; 9+ messages in thread
From: Paul E. McKenney @ 2014-03-07 17:04 UTC (permalink / raw)
  To: Sebastian Andrzej Siewior; +Cc: Helmut Buchsbaum, linux-rt-users, rostedt

On Fri, Mar 07, 2014 at 05:21:45PM +0100, Sebastian Andrzej Siewior wrote:
> + paulmck
> 
> * Helmut Buchsbaum | 2014-03-06 20:05:13 [+0100]:
> 
> >I am working with a 3.10-rt based kernel on a custom device based on
> >Xilinx Zynq Z-7010 (dual Cortex-A9) with 64MB DDR2 RAM.
> >[... full report and OOM log snipped; see the original message above ...]
> 
> Do you have RCU_BOOST enabled? My guess here is that something is
> running at a higher priority, not allowing rcuc/ to run and do its job.

The major effect of that commit is to move processing from softirq to
a kthread.  If your workload has quite a few interrupts, it is possible
that the softirq version would -usually- make progress, but the kthread
version would not.  (You could still get this to happen with softirq should
ksoftirqd kick in.)

Sebastian's suggestion of RCU_BOOST makes sense, and you will also need to
set CONFIG_RCU_BOOST_PRIO to be higher priority than your highest-priority
CPU-bound thread.
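Taken together, Sebastian's and Paul's suggestions correspond to a .config fragment along these lines (a sketch for the 3.10-rt series; the priority value is only an example, it merely needs to exceed the RT priority of your highest-priority CPU-bound thread):

```
CONFIG_PREEMPT_RT_FULL=y
CONFIG_RCU_BOOST=y
# Example value: must be above the RT priority of any CPU-bound thread
CONFIG_RCU_BOOST_PRIO=4
```

With boosting enabled, readers (and the RCU kthreads) that hold up a grace period get priority-boosted, so callbacks can eventually be invoked and memory freed instead of piling up until the OOM killer fires.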

							Thanx, Paul



* Re: Resend: 3.10.32-rt31: problem with "rcu: Eliminate softirq processing from rcutree"
  2014-03-07 17:04   ` Paul E. McKenney
@ 2014-03-08 19:08     ` Helmut Buchsbaum
  2014-03-08 19:19       ` Helmut Buchsbaum
  2014-03-08 20:04       ` Paul E. McKenney
  0 siblings, 2 replies; 9+ messages in thread
From: Helmut Buchsbaum @ 2014-03-08 19:08 UTC (permalink / raw)
  To: paulmck; +Cc: Sebastian Andrzej Siewior, linux-rt-users, rostedt

This is exactly the right place to tweak the system, which I was not yet aware of!
Thanks a lot for the hint. I just enabled RCU_BOOST with its default
settings and the system runs smoothly and stably again, even under heavy
load. Maybe the default for RCU_BOOST should be set to Y on
systems using PREEMPT_RT_FULL?
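For completeness, here is a quick way to check what priorities the RCU kthreads ended up with (a sketch only: thread names differ between kernel versions, `chrt` may not be installed, and kernel threads are invisible from inside a PID namespace):

```shell
# List kernel threads named rcu* together with their scheduling attributes.
for f in /proc/[0-9]*/comm; do
  name=$(cat "$f" 2>/dev/null) || continue   # thread may have exited; skip
  case "$name" in
    rcu*)
      pid=${f#/proc/}; pid=${pid%/comm}
      # chrt -p prints the thread's current policy and RT priority
      prio=$(chrt -p "$pid" 2>/dev/null | tail -n 1)
      echo "$pid $name ${prio:-(chrt not available)}"
      ;;
  esac
done
# To raise a thread's FIFO priority at runtime (as root; 4 is an example):
#   chrt -f -p 4 <pid-of-rcuc-thread>
```

On a box where boosting works as intended, the rcub/rcuc threads should report an RT priority above that of every CPU-bound SCHED_FIFO task.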

Again, many thanx,
Helmut

2014-03-07 18:04 GMT+01:00 Paul E. McKenney <paulmck@linux.vnet.ibm.com>:
> On Fri, Mar 07, 2014 at 05:21:45PM +0100, Sebastian Andrzej Siewior wrote:
>> + paulmck
>>
>> * Helmut Buchsbaum | 2014-03-06 20:05:13 [+0100]:
>>
>> >I am working with a 3.10-rt based kernel on a custom device based on
>> >Xilinx Zynq Z-7010 (dual Cortex-A9) with 64MB DDR2 RAM.
>> >Today I rebased my work (3.10-rt mainly enhanced by Xilinx drivers,
>> >but otherwise unchanged) from 3.10.32-rt30 to 3.10.32-rt31. There I
>> >ran a dohell derivate (from
>> >http://git.xenomai.org/xenomai-2.6.git/plain/src/testsuite/xeno-test/dohell?id=v2.6.3,
>> >just using customized hackbench call and cyclictest as the only test)
>> >when the OOM killer struck, which definitely did not happen with
>> >3.10.32-rt30. Bisecting I detected
>> >8cf5b0e982cba4fd0e469989a420a15c8a378fa2 "rcu: Eliminate softirq
>> >processing from rcutree" as the change which introduced the unwanted
>> >behavior:
>> >
>> >[   73.865213] ls invoked oom-killer: gfp_mask=0x2000d0, order=0, oom_score_adj=0
>> >[   73.877229] CPU: 1 PID: 623 Comm: ls Not tainted 3.10.32-rt30+ #1
>> >[   73.877280] [<c0013c80>] (unwind_backtrace+0x0/0x138) from [<c0011d14>] (show_stack+0x10/0x14)
>> >[   73.877310] [<c0011d14>] (show_stack+0x10/0x14) from [<c0394d88>] (dump_header.isra.12+0x6c/0xa8)
>> >[   73.877334] [<c0394d88>] (dump_header.isra.12+0x6c/0xa8) from [<c0394e10>] (oom_kill_process.part.14+0x4c/0x384)
>> >[   73.877357] [<c0394e10>] (oom_kill_process.part.14+0x4c/0x384) from [<c007301c>] (out_of_memory+0x128/0x1c8)
>> >[   73.877380] [<c007301c>] (out_of_memory+0x128/0x1c8) from >[<c0076f10>] (__alloc_pages_nodemask+0x6f4/0x71c)
>> >[   73.877406] [<c0076f10>] (__alloc_pages_nodemask+0x6f4/0x71c) from [<c009fe9c>] (allocate_slab+0xe4/0xfc)
>> >[   73.877426] [<c009fe9c>] (allocate_slab+0xe4/0xfc) from [<c009fee4>] (new_slab+0x30/0x154)
>> >[   73.877447] [<c009fee4>] (new_slab+0x30/0x154) from [<c0396608>] (__slab_alloc.isra.52.constprop.54+0x2f0/0x3c4)
>> >[   73.877467] [<c0396608>] (__slab_alloc.isra.52.constprop.54+0x2f0/0x3c4) from [<c00a0614>] (kmem_cache_alloc+0x130/0x138)
>> >[   73.877489] [<c00a0614>] (kmem_cache_alloc+0x130/0x138) from [<c00a5d84>] (get_empty_filp+0x6c/0x1f4)
>> >[   73.877509] [<c00a5d84>] (get_empty_filp+0x6c/0x1f4) from [<c00b18b8>] (path_openat+0x2c/0x43c)
>> >[   73.877527] [<c00b18b8>] (path_openat+0x2c/0x43c) from [<c00b1f80>] (do_filp_open+0x2c/0x80)
>> >[   73.877544] [<c00b1f80>] (do_filp_open+0x2c/0x80) from [<c00a40a4>] (do_sys_open+0xe8/0x174)
>> >[   73.877564] [<c00a40a4>] (do_sys_open+0xe8/0x174) from [<c000e9c0>] (ret_fast_syscall+0x0/0x30)
>> >[   73.877574] Mem-info:
>> >[   73.877582] Normal per-cpu:
>> >[   73.877593] CPU    0: hi:    6, btch:   1 usd:   0
>> >[   73.877603] CPU    1: hi:    6, btch:   1 usd:   0
>> >[   73.877625] active_anon:1827 inactive_anon:5 isolated_anon:0
>> >[   73.877625]  active_file:174 inactive_file:1158 isolated_file:0
>> >[   73.877625]  unevictable:2493 dirty:1 writeback:1160 unstable:0
>> >[   73.877625]  free:3978 slab_reclaimable:1161 slab_unreclaimable:1345
>> >[   73.877625]  mapped:440 shmem:512 pagetables:526 bounce:0
>> >[   73.877625]  free_cma:3833
>> >[   73.877669] Normal free:15912kB min:824kB low:1028kB high:1236kB active_anon:7308kB inactive_anon:20kB active_file:696kB inacs
>> >[   73.877679] lowmem_reserve[]: 0 0
>> >[   73.877695] Normal: 234*4kB (ERC) 208*8kB (URC) 200*16kB (C) 189*32kB (RC) 61*64kB (RC) 1*128kB (R) 0*256kB 0*512kB 0*1024kB B
>> >[   73.877769] 2196 total pagecache pages
>> >[   73.877780] 0 pages in swap cache
>> >[   73.877789] Swap cache stats: add 0, delete 0, find 0/0
>> >[   73.877796] Free swap  = 0kB
>> >[   73.877804] Total swap = 0kB
>> >[   73.884950] 16384 pages of RAM
>> >[   73.884966] 5008 free pages
>> >[   73.884975] 1558 reserved pages
>> >[   73.884983] 1674 slab pages
>> >[   73.884990] 532633 pages shared
>> >[   73.884998] 0 pages swap cached
>> >[   73.885007] [ pid ]   uid  tgid total_vm      rss nr_ptes swapents oom_score_adj name
>> >[   73.908257] [  502]     0   502      723      119       5        0           0 klogd
>> >[   73.936201] [  504]     0   504      723      112       5        0           0 syslogd
>> >[   73.980352] [  506]     0   506      723       76       5        0           0 telnetd
>> >[   74.028534] [  566]     0   566      724      166       5        0           0 ash
>> >[   74.053796] [  606]     0   606      723       57       4        0           0 udhcpc
>> >[   74.081419] [  607]     0   607      724      161       5        0           0 dohell
>> >[   74.090873] [  609]     0   609      724       68       5        0           0 dohell
>> >[   74.106407] [  610]     0   610      724       61       5        0           0 dohell
>> >[   74.118017] [  611]     0   611      724       58       5        0           0 dohell
>> >[   74.128390] [  612]     0   612      723      110       5        0           0 nc
>> >[   74.198260] [  613]     0   613      724       61       5        0           0 dohell
>> >[   74.214133] [  614]     0   614      724       60       5        0           0 dohell
>> >[   74.227782] [  615]     0   615      724       60       5        0           0 dohell
>> >[   74.240921] [  616]     0   616      723      104       4        0           0 dd
>> >[   74.264528] [  617]     0   617      941      360       4        0           0 dd
>> >[   74.284988] [  618]     0   618      724       60       5        0           0 dohell
>> >[   74.312808] [  620]     0   620     2523     2494       9        0           0 cyclictest
>> >[   74.333122] [  623]     0   623      724      174       5        0           0 ls
>> >[   74.361071] [ 1623]     0  1623      690       95       4        0           0 sleep
>> >[   74.382116] [ 2008]     0  2008      448      133       5        0           0 hackbench
>> >[   74.402224] [ 2011]     0  2011      448       54       5        0           0 hackbench
>> >[   74.426216] [ 2012]     0  2012      448       54       5        0           0 hackbench
>> >[   74.452513] [ 2013]     0  2013      448       54       5        0           0 hackbench
>> >[   74.468437] [ 2014]     0  2014      448       54       5        0           0 hackbench
>> >[   74.481603] [ 2015]     0  2015      448       54       5        0           0 hackbench
>> >[   74.495083] [ 2016]     0  2016      448       54       5        0           0 hackbench
>> >[   74.528757] [ 2017]     0  2017      448       54       5        0           0 hackbench
>> >[   74.542685] [ 2019]     0  2019      448       54       5        0           0 hackbench
>> >[   74.561320] [ 2020]     0  2020      448       54       5        0           0 hackbench
>> >[   74.576906] [ 2021]     0  2021      448       54       5        0           0 hackbench
>> >[   74.586537] [ 2022]     0  2022      448       54       5        0           0 hackbench
>> >[   74.610881] [ 2023]     0  2023      448       54       5        0           0 hackbench
>> >[   74.696010] [ 2026]     0  2026      448       59       5        0           0 hackbench
>> >[   74.746412] [ 2032]     0  2032      448       64       5        0           0 hackbench
>> >[   74.781200] [ 2095]     0  2095      689       36       3        0           0 cat
>> >[   74.795979] [ 2098]     0  2098      246        1       2        0           0 ps
>> >[   74.815133] [ 2099]     0  2099      448      133       4        0           0 hackbench
>> >[   74.829415] [ 2100]     0  2100      690       98       4        0           0 cat
>> >[   74.844691] [ 2101]     0  2101      448       54       4        0           0 hackbench
>> >[   74.855449] [ 2102]     0  2102      448       54       4        0           0 hackbench
>> >[   74.876889] [ 2103]     0  2103      448       54       4        0           0 hackbench
>> >[   74.893808] [ 2104]     0  2104      448       54       4        0           0 hackbench
>> >[   74.915100] [ 2105]     0  2105      448       54       4        0           0 hackbench
>> >[   74.932730] [ 2106]     0  2106      448       54       4        0           0 hackbench
>> >[   74.965216] [ 2107]     0  2107      448       54       4        0           0 hackbench
>> >[   75.027941] [ 2108]     0  2108      448       54       4        0           0 hackbench
>> >[   75.046197] [ 2109]     0  2109      448       54       4        0           0 hackbench
>> >[   75.063038] [ 2110]     0  2110      448       54       4        0           0 hackbench
>> >[   75.079646] [ 2111]     0  2111      448       55       4        0           0 hackbench
>> >[   75.115715] [ 2112]     0  2112      448       55       4        0           0 hackbench
>> >[   75.125629] [ 2113]     0  2113      448       55       4        0           0 hackbench
>> >[   75.136509] [ 2114]     0  2114      448       55       4        0           0 hackbench
>> >[   75.146358] [ 2115]     0  2115      448       55       4        0           0 hackbench
>> >[   75.157808] [ 2116]     0  2116      448       55       4        0           0 hackbench
>> >[   75.167646] [ 2117]     0  2117      448       55       4        0           0 hackbench
>> >[   75.177705] [ 2118]     0  2118      448       55       4        0           0 hackbench
>> >[   75.187688] [ 2119]     0  2119      448       55       4        0           0 hackbench
>> >[   75.202440] [ 2120]     0  2120      448       55       4        0           0 hackbench
>> >[   75.226442] [ 2121]     0  2121      448       54       4        0           0 hackbench
>> >[   75.299323] [ 2190]     0  2190      724      152       5        0           0 ps
>> >[   75.309857] [ 2191]     0  2191      392       31       3        0           0 cat
>> >[   75.320704] [ 2192]     0  2192      116       33       3        0           0 hackbench
>> >[   75.331584] [ 2193]     0  2193      246        1       2        0           0 ps
>> >[   75.341056] Out of memory: Kill process 620 (cyclictest) score 163 or sacrifice child
>> >[   75.350899] Killed process 620 (cyclictest) total-vm:10092kB, anon-rss:8544kB, file-rss:1432kB
>> >
>> >Unfortunately I'm rather busy at the moment, so I couldn't investigate
>> >this issue at all, I'm sorry.
>> >Steven, probably this issue also concerns 3.8.13.14-rt28-rc1 (I just
>> >discovered your post on the list).
>> >
>> >Regards,
>> >Helmut
>> >
>> >PS: running this test with hackbench alone (no dohell), I get nearly
>> >the same result with kmem_cache_alloc() failing.
>> >PPS: I had to resend this post since I didn't get through to the list.
>> >Sorry about any noise!
>>
>> Do you have RCU_BOOST enabled? My guess here is that something is
>> running at a higher priority, not allowing rcuc/ to run and do its job.
>
> The major effect of that commit is to move processing from softirq to
> a kthread.  If your workload has quite a few interrupts, it is possible
> that the softirq version would -usually- make progress, but the kthread
> version would not.  (You could still get this to happen with softirq should
> ksoftirqd kick in.)
>
> Sebastian's suggestion of RCU_BOOST makes sense, and you will also need to
> set CONFIG_RCU_BOOST_PRIO to be higher priority than your highest-priority
> CPU-bound thread.
>
>                                                         Thanx, Paul
>


* Re: Resend: 3.10.32-rt31: problem with "rcu: Eliminate softirq processing from rcutree"
  2014-03-08 19:08     ` Helmut Buchsbaum
@ 2014-03-08 19:19       ` Helmut Buchsbaum
  2014-03-18 10:58         ` Sebastian Andrzej Siewior
  2014-03-08 20:04       ` Paul E. McKenney
  1 sibling, 1 reply; 9+ messages in thread
From: Helmut Buchsbaum @ 2014-03-08 19:19 UTC (permalink / raw)
  To: paulmck; +Cc: Sebastian Andrzej Siewior, linux-rt-users, rostedt

2014-03-08 20:08 GMT+01:00 Helmut Buchsbaum <helmut.buchsbaum@gmail.com>:
> This is exactly the right place to tweak the system, which I was not yet aware of!
> Thanks a lot for the hint. I just enabled RCU_BOOST with its default
> settings and the system runs smoothly and stably again, even under heavy
> load. Maybe the default for RCU_BOOST should be set to Y on
> systems using PREEMPT_RT_FULL?
>
> Again, many thanx,
> Helmut
>
> 2014-03-07 18:04 GMT+01:00 Paul E. McKenney <paulmck@linux.vnet.ibm.com>:
>> On Fri, Mar 07, 2014 at 05:21:45PM +0100, Sebastian Andrzej Siewior wrote:
>>> + paulmck
>>>
>>> * Helmut Buchsbaum | 2014-03-06 20:05:13 [+0100]:
>>>
>>> >I am working with a 3.10-rt based kernel on a custom device based on
>>> >Xilinx Zynq Z-7010 (dual Cortex-A9) with 64MB DDR2 RAM.
>>> >Today I rebased my work (3.10-rt mainly enhanced by Xilinx drivers,
>>> >but otherwise unchanged) from 3.10.32-rt30 to 3.10.32-rt31. There I
>>> >ran a dohell derivate (from
>>> >http://git.xenomai.org/xenomai-2.6.git/plain/src/testsuite/xeno-test/dohell?id=v2.6.3,
>>> >just using customized hackbench call and cyclictest as the only test)
>>> >when the OOM killer struck, which definitely did not happen with
>>> >3.10.32-rt30. Bisecting I detected
>>> >8cf5b0e982cba4fd0e469989a420a15c8a378fa2 "rcu: Eliminate softirq
>>> >processing from rcutree" as the change which introduced the unwanted
>>> >behavior:
>>> >
>>> >[   73.865213] ls invoked oom-killer: gfp_mask=0x2000d0, order=0, oom_score_adj=0
>>> >[   73.877229] CPU: 1 PID: 623 Comm: ls Not tainted 3.10.32-rt30+ #1
>>> >[   73.877280] [<c0013c80>] (unwind_backtrace+0x0/0x138) from [<c0011d14>] (show_stack+0x10/0x14)
>>> >[   73.877310] [<c0011d14>] (show_stack+0x10/0x14) from [<c0394d88>] (dump_header.isra.12+0x6c/0xa8)
>>> >[   73.877334] [<c0394d88>] (dump_header.isra.12+0x6c/0xa8) from [<c0394e10>] (oom_kill_process.part.14+0x4c/0x384)
>>> >[   73.877357] [<c0394e10>] (oom_kill_process.part.14+0x4c/0x384) from [<c007301c>] (out_of_memory+0x128/0x1c8)
>>> >[   73.877380] [<c007301c>] (out_of_memory+0x128/0x1c8) from >[<c0076f10>] (__alloc_pages_nodemask+0x6f4/0x71c)
>>> >[   73.877406] [<c0076f10>] (__alloc_pages_nodemask+0x6f4/0x71c) from [<c009fe9c>] (allocate_slab+0xe4/0xfc)
>>> >[   73.877426] [<c009fe9c>] (allocate_slab+0xe4/0xfc) from [<c009fee4>] (new_slab+0x30/0x154)
>>> >[   73.877447] [<c009fee4>] (new_slab+0x30/0x154) from [<c0396608>] (__slab_alloc.isra.52.constprop.54+0x2f0/0x3c4)
>>> >[   73.877467] [<c0396608>] (__slab_alloc.isra.52.constprop.54+0x2f0/0x3c4) from [<c00a0614>] (kmem_cache_alloc+0x130/0x138)
>>> >[   73.877489] [<c00a0614>] (kmem_cache_alloc+0x130/0x138) from [<c00a5d84>] (get_empty_filp+0x6c/0x1f4)
>>> >[   73.877509] [<c00a5d84>] (get_empty_filp+0x6c/0x1f4) from [<c00b18b8>] (path_openat+0x2c/0x43c)
>>> >[   73.877527] [<c00b18b8>] (path_openat+0x2c/0x43c) from [<c00b1f80>] (do_filp_open+0x2c/0x80)
>>> >[   73.877544] [<c00b1f80>] (do_filp_open+0x2c/0x80) from [<c00a40a4>] (do_sys_open+0xe8/0x174)
>>> >[   73.877564] [<c00a40a4>] (do_sys_open+0xe8/0x174) from [<c000e9c0>] (ret_fast_syscall+0x0/0x30)
>>> >[   73.877574] Mem-info:
>>> >[   73.877582] Normal per-cpu:
>>> >[   73.877593] CPU    0: hi:    6, btch:   1 usd:   0
>>> >[   73.877603] CPU    1: hi:    6, btch:   1 usd:   0
>>> >[   73.877625] active_anon:1827 inactive_anon:5 isolated_anon:0
>>> >[   73.877625]  active_file:174 inactive_file:1158 isolated_file:0
>>> >[   73.877625]  unevictable:2493 dirty:1 writeback:1160 unstable:0
>>> >[   73.877625]  free:3978 slab_reclaimable:1161 slab_unreclaimable:1345
>>> >[   73.877625]  mapped:440 shmem:512 pagetables:526 bounce:0
>>> >[   73.877625]  free_cma:3833
>>> >[   73.877669] Normal free:15912kB min:824kB low:1028kB high:1236kB active_anon:7308kB inactive_anon:20kB active_file:696kB inacs
>>> >[   73.877679] lowmem_reserve[]: 0 0
>>> >[   73.877695] Normal: 234*4kB (ERC) 208*8kB (URC) 200*16kB (C) 189*32kB (RC) 61*64kB (RC) 1*128kB (R) 0*256kB 0*512kB 0*1024kB B
>>> >[   73.877769] 2196 total pagecache pages
>>> >[   73.877780] 0 pages in swap cache
>>> >[   73.877789] Swap cache stats: add 0, delete 0, find 0/0
>>> >[   73.877796] Free swap  = 0kB
>>> >[   73.877804] Total swap = 0kB
>>> >[   73.884950] 16384 pages of RAM
>>> >[   73.884966] 5008 free pages
>>> >[   73.884975] 1558 reserved pages
>>> >[   73.884983] 1674 slab pages
>>> >[   73.884990] 532633 pages shared
>>> >[   73.884998] 0 pages swap cached
>>> >[   73.885007] [ pid ]   uid  tgid total_vm      rss nr_ptes swapents oom_score_adj name
>>> >[   73.908257] [  502]     0   502      723      119       5        0           0 klogd
>>> >[   73.936201] [  504]     0   504      723      112       5        0           0 syslogd
>>> >[   73.980352] [  506]     0   506      723       76       5        0           0 telnetd
>>> >[   74.028534] [  566]     0   566      724      166       5        0           0 ash
>>> >[   74.053796] [  606]     0   606      723       57       4        0           0 udhcpc
>>> >[   74.081419] [  607]     0   607      724      161       5        0           0 dohell
>>> >[   74.090873] [  609]     0   609      724       68       5        0           0 dohell
>>> >[   74.106407] [  610]     0   610      724       61       5        0           0 dohell
>>> >[   74.118017] [  611]     0   611      724       58       5        0           0 dohell
>>> >[   74.128390] [  612]     0   612      723      110       5        0           0 nc
>>> >[   74.198260] [  613]     0   613      724       61       5        0           0 dohell
>>> >[   74.214133] [  614]     0   614      724       60       5        0           0 dohell
>>> >[   74.227782] [  615]     0   615      724       60       5        0           0 dohell
>>> >[   74.240921] [  616]     0   616      723      104       4        0           0 dd
>>> >[   74.264528] [  617]     0   617      941      360       4        0           0 dd
>>> >[   74.284988] [  618]     0   618      724       60       5        0           0 dohell
>>> >[   74.312808] [  620]     0   620     2523     2494       9        0           0 cyclictest
>>> >[   74.333122] [  623]     0   623      724      174       5        0           0 ls
>>> >[   74.361071] [ 1623]     0  1623      690       95       4        0           0 sleep
>>> >[   74.382116] [ 2008]     0  2008      448      133       5        0           0 hackbench
>>> >[   74.402224] [ 2011]     0  2011      448       54       5        0           0 hackbench
>>> >[   74.426216] [ 2012]     0  2012      448       54       5        0           0 hackbench
>>> >[   74.452513] [ 2013]     0  2013      448       54       5        0           0 hackbench
>>> >[   74.468437] [ 2014]     0  2014      448       54       5        0           0 hackbench
>>> >[   74.481603] [ 2015]     0  2015      448       54       5        0           0 hackbench
>>> >[   74.495083] [ 2016]     0  2016      448       54       5        0           0 hackbench
>>> >[   74.528757] [ 2017]     0  2017      448       54       5        0           0 hackbench
>>> >[   74.542685] [ 2019]     0  2019      448       54       5        0           0 hackbench
>>> >[   74.561320] [ 2020]     0  2020      448       54       5        0           0 hackbench
>>> >[   74.576906] [ 2021]     0  2021      448       54       5        0           0 hackbench
>>> >[   74.586537] [ 2022]     0  2022      448       54       5        0           0 hackbench
>>> >[   74.610881] [ 2023]     0  2023      448       54       5        0           0 hackbench
>>> >[   74.696010] [ 2026]     0  2026      448       59       5        0           0 hackbench
>>> >[   74.746412] [ 2032]     0  2032      448       64       5        0           0 hackbench
>>> >[   74.781200] [ 2095]     0  2095      689       36       3        0           0 cat
>>> >[   74.795979] [ 2098]     0  2098      246        1       2        0           0 ps
>>> >[   74.815133] [ 2099]     0  2099      448      133       4        0           0 hackbench
>>> >[   74.829415] [ 2100]     0  2100      690       98       4        0           0 cat
>>> >[   74.844691] [ 2101]     0  2101      448       54       4        0           0 hackbench
>>> >[   74.855449] [ 2102]     0  2102      448       54       4        0           0 hackbench
>>> >[   74.876889] [ 2103]     0  2103      448       54       4        0           0 hackbench
>>> >[   74.893808] [ 2104]     0  2104      448       54       4        0           0 hackbench
>>> >[   74.915100] [ 2105]     0  2105      448       54       4        0           0 hackbench
>>> >[   74.932730] [ 2106]     0  2106      448       54       4        0           0 hackbench
>>> >[   74.965216] [ 2107]     0  2107      448       54       4        0           0 hackbench
>>> >[   75.027941] [ 2108]     0  2108      448       54       4        0           0 hackbench
>>> >[   75.046197] [ 2109]     0  2109      448       54       4        0           0 hackbench
>>> >[   75.063038] [ 2110]     0  2110      448       54       4        0           0 hackbench
>>> >[   75.079646] [ 2111]     0  2111      448       55       4        0           0 hackbench
>>> >[   75.115715] [ 2112]     0  2112      448       55       4        0           0 hackbench
>>> >[   75.125629] [ 2113]     0  2113      448       55       4        0           0 hackbench
>>> >[   75.136509] [ 2114]     0  2114      448       55       4        0           0 hackbench
>>> >[   75.146358] [ 2115]     0  2115      448       55       4        0           0 hackbench
>>> >[   75.157808] [ 2116]     0  2116      448       55       4        0           0 hackbench
>>> >[   75.167646] [ 2117]     0  2117      448       55       4        0           0 hackbench
>>> >[   75.177705] [ 2118]     0  2118      448       55       4        0           0 hackbench
>>> >[   75.187688] [ 2119]     0  2119      448       55       4        0           0 hackbench
>>> >[   75.202440] [ 2120]     0  2120      448       55       4        0           0 hackbench
>>> >[   75.226442] [ 2121]     0  2121      448       54       4        0           0 hackbench
>>> >[   75.299323] [ 2190]     0  2190      724      152       5        0           0 ps
>>> >[   75.309857] [ 2191]     0  2191      392       31       3        0           0 cat
>>> >[   75.320704] [ 2192]     0  2192      116       33       3        0           0 hackbench
>>> >[   75.331584] [ 2193]     0  2193      246        1       2        0           0 ps
>>> >[   75.341056] Out of memory: Kill process 620 (cyclictest) score 163 or sacrifice child
>>> >[   75.350899] Killed process 620 (cyclictest) total-vm:10092kB, anon-rss:8544kB, file-rss:1432kB
>>> >
>>> >Unfortunately I'm rather busy at the moment, so I couldn't investigate
>>> >this issue at all, I'm sorry.
>>> >Steven, probably this issue also concerns 3.8.13.14-rt28-rc1 (I just
>>> >discovered your post on the list).
>>> >
>>> >Regards,
>>> >Helmut
>>> >
>>> >PS: running this test with hackbench alone (no dohell), I get nearly
>>> >the same result with kmem_cache_alloc() failing.
>>> >PPS: I had to resend this post since I didn't get through to the list.
>>> >Sorry about any noise!
>>>
>>> Do you have RCU_BOOST enabled? My guess here is that something is
>>> running at a higher priority, not allowing rcuc/ to run and do its job.
>>
>> The major effect of that commit is to move processing from softirq to
>> a kthread.  If your workload has quite a few interrupts, it is possible
>> that the softirq version would -usually- make progress, but the kthread
>> version would not.  (You could still get this to happen with softirq should
>> ksoftirqd kick in.)
>>
>> Sebastian's suggestion of RCU_BOOST makes sense, and you will also need to
>> set CONFIG_RCU_BOOST_PRIO to be higher priority than your highest-priority
>> CPU-bound thread.
>>
>>                                                         Thanx, Paul
>>

Sorry about top posting; I am just not used to Gmail's web interface ;-)


* Re: Resend: 3.10.32-rt31: problem with "rcu: Eliminate softirq processing from rcutree"
  2014-03-08 19:08     ` Helmut Buchsbaum
  2014-03-08 19:19       ` Helmut Buchsbaum
@ 2014-03-08 20:04       ` Paul E. McKenney
  2014-03-18 11:05         ` Sebastian Andrzej Siewior
  1 sibling, 1 reply; 9+ messages in thread
From: Paul E. McKenney @ 2014-03-08 20:04 UTC (permalink / raw)
  To: Helmut Buchsbaum; +Cc: Sebastian Andrzej Siewior, linux-rt-users, rostedt

On Sat, Mar 08, 2014 at 08:08:35PM +0100, Helmut Buchsbaum wrote:
> This is exactly the right place to tweak the system, which I was not yet aware of!
> Thanks a lot for the hint. I just enabled RCU_BOOST with its default
> settings and the system runs smoothly and stably again, even under heavy
> load. Maybe the default for RCU_BOOST should be set to Y on
> systems using PREEMPT_RT_FULL?

I must defer to Sebastian on this, but it seems like a good approach.

							Thanx, Paul

> Again, many thanx,
> Helmut
> 
> 2014-03-07 18:04 GMT+01:00 Paul E. McKenney <paulmck@linux.vnet.ibm.com>:
> > On Fri, Mar 07, 2014 at 05:21:45PM +0100, Sebastian Andrzej Siewior wrote:
> >> + paulmck
> >>
> >> * Helmut Buchsbaum | 2014-03-06 20:05:13 [+0100]:
> >>
> >> >I am working with a 3.10-rt based kernel on a custom device based on
> >> >Xilinx Zynq Z-7010 (dual Cortex-A9) with 64MB DDR2 RAM.
> >> >Today I rebased my work (3.10-rt mainly enhanced by Xilinx drivers,
> >> >but otherwise unchanged) from 3.10.32-rt30 to 3.10.32-rt31. There I
> >> >ran a dohell derivate (from
> >> >http://git.xenomai.org/xenomai-2.6.git/plain/src/testsuite/xeno-test/dohell?id=v2.6.3,
> >> >just using customized hackbench call and cyclictest as the only test)
> >> >when the OOM killer struck, which definitely did not happen with
> >> >3.10.32-rt30. Bisecting I detected
> >> >8cf5b0e982cba4fd0e469989a420a15c8a378fa2 "rcu: Eliminate softirq
> >> >processing from rcutree" as the change which introduced the unwanted
> >> >behavior:
> >> >
> >> >[   73.865213] ls invoked oom-killer: gfp_mask=0x2000d0, order=0, oom_score_adj=0
> >> >[   73.877229] CPU: 1 PID: 623 Comm: ls Not tainted 3.10.32-rt30+ #1
> >> >[   73.877280] [<c0013c80>] (unwind_backtrace+0x0/0x138) from [<c0011d14>] (show_stack+0x10/0x14)
> >> >[   73.877310] [<c0011d14>] (show_stack+0x10/0x14) from [<c0394d88>] (dump_header.isra.12+0x6c/0xa8)
> >> >[   73.877334] [<c0394d88>] (dump_header.isra.12+0x6c/0xa8) from [<c0394e10>] (oom_kill_process.part.14+0x4c/0x384)
> >> >[   73.877357] [<c0394e10>] (oom_kill_process.part.14+0x4c/0x384) from [<c007301c>] (out_of_memory+0x128/0x1c8)
> >> >[   73.877380] [<c007301c>] (out_of_memory+0x128/0x1c8) from >[<c0076f10>] (__alloc_pages_nodemask+0x6f4/0x71c)
> >> >[   73.877406] [<c0076f10>] (__alloc_pages_nodemask+0x6f4/0x71c) from [<c009fe9c>] (allocate_slab+0xe4/0xfc)
> >> >[   73.877426] [<c009fe9c>] (allocate_slab+0xe4/0xfc) from [<c009fee4>] (new_slab+0x30/0x154)
> >> >[   73.877447] [<c009fee4>] (new_slab+0x30/0x154) from [<c0396608>] (__slab_alloc.isra.52.constprop.54+0x2f0/0x3c4)
> >> >[   73.877467] [<c0396608>] (__slab_alloc.isra.52.constprop.54+0x2f0/0x3c4) from [<c00a0614>] (kmem_cache_alloc+0x130/0x138)
> >> >[   73.877489] [<c00a0614>] (kmem_cache_alloc+0x130/0x138) from [<c00a5d84>] (get_empty_filp+0x6c/0x1f4)
> >> >[   73.877509] [<c00a5d84>] (get_empty_filp+0x6c/0x1f4) from [<c00b18b8>] (path_openat+0x2c/0x43c)
> >> >[   73.877527] [<c00b18b8>] (path_openat+0x2c/0x43c) from [<c00b1f80>] (do_filp_open+0x2c/0x80)
> >> >[   73.877544] [<c00b1f80>] (do_filp_open+0x2c/0x80) from [<c00a40a4>] (do_sys_open+0xe8/0x174)
> >> >[   73.877564] [<c00a40a4>] (do_sys_open+0xe8/0x174) from [<c000e9c0>] (ret_fast_syscall+0x0/0x30)
> >> >[   73.877574] Mem-info:
> >> >[   73.877582] Normal per-cpu:
> >> >[   73.877593] CPU    0: hi:    6, btch:   1 usd:   0
> >> >[   73.877603] CPU    1: hi:    6, btch:   1 usd:   0
> >> >[   73.877625] active_anon:1827 inactive_anon:5 isolated_anon:0
> >> >[   73.877625]  active_file:174 inactive_file:1158 isolated_file:0
> >> >[   73.877625]  unevictable:2493 dirty:1 writeback:1160 unstable:0
> >> >[   73.877625]  free:3978 slab_reclaimable:1161 slab_unreclaimable:1345
> >> >[   73.877625]  mapped:440 shmem:512 pagetables:526 bounce:0
> >> >[   73.877625]  free_cma:3833
> >> >[   73.877669] Normal free:15912kB min:824kB low:1028kB high:1236kB active_anon:7308kB inactive_anon:20kB active_file:696kB inacs
> >> >[   73.877679] lowmem_reserve[]: 0 0
> >> >[   73.877695] Normal: 234*4kB (ERC) 208*8kB (URC) 200*16kB (C) 189*32kB (RC) 61*64kB (RC) 1*128kB (R) 0*256kB 0*512kB 0*1024kB B
> >> >[   73.877769] 2196 total pagecache pages
> >> >[   73.877780] 0 pages in swap cache
> >> >[   73.877789] Swap cache stats: add 0, delete 0, find 0/0
> >> >[   73.877796] Free swap  = 0kB
> >> >[   73.877804] Total swap = 0kB
> >> >[   73.884950] 16384 pages of RAM
> >> >[   73.884966] 5008 free pages
> >> >[   73.884975] 1558 reserved pages
> >> >[   73.884983] 1674 slab pages
> >> >[   73.884990] 532633 pages shared
> >> >[   73.884998] 0 pages swap cached
> >> >[   73.885007] [ pid ]   uid  tgid total_vm      rss nr_ptes swapents oom_score_adj name
> >> >[   73.908257] [  502]     0   502      723      119       5        0           0 klogd
> >> >[   73.936201] [  504]     0   504      723      112       5        0           0 syslogd
> >> >[   73.980352] [  506]     0   506      723       76       5        0           0 telnetd
> >> >[   74.028534] [  566]     0   566      724      166       5        0           0 ash
> >> >[   74.053796] [  606]     0   606      723       57       4        0           0 udhcpc
> >> >[   74.081419] [  607]     0   607      724      161       5        0           0 dohell
> >> >[   74.090873] [  609]     0   609      724       68       5        0           0 dohell
> >> >[   74.106407] [  610]     0   610      724       61       5        0           0 dohell
> >> >[   74.118017] [  611]     0   611      724       58       5        0           0 dohell
> >> >[   74.128390] [  612]     0   612      723      110       5        0           0 nc
> >> >[   74.198260] [  613]     0   613      724       61       5        0           0 dohell
> >> >[   74.214133] [  614]     0   614      724       60       5        0           0 dohell
> >> >[   74.227782] [  615]     0   615      724       60       5        0           0 dohell
> >> >[   74.240921] [  616]     0   616      723      104       4        0           0 dd
> >> >[   74.264528] [  617]     0   617      941      360       4        0           0 dd
> >> >[   74.284988] [  618]     0   618      724       60       5        0           0 dohell
> >> >[   74.312808] [  620]     0   620     2523     2494       9        0           0 cyclictest
> >> >[   74.333122] [  623]     0   623      724      174       5        0           0 ls
> >> >[   74.361071] [ 1623]     0  1623      690       95       4        0           0 sleep
> >> >[   74.382116] [ 2008]     0  2008      448      133       5        0           0 hackbench
> >> >[   74.402224] [ 2011]     0  2011      448       54       5        0           0 hackbench
> >> >[   74.426216] [ 2012]     0  2012      448       54       5        0           0 hackbench
> >> >[   74.452513] [ 2013]     0  2013      448       54       5        0           0 hackbench
> >> >[   74.468437] [ 2014]     0  2014      448       54       5        0           0 hackbench
> >> >[   74.481603] [ 2015]     0  2015      448       54       5        0           0 hackbench
> >> >[   74.495083] [ 2016]     0  2016      448       54       5        0           0 hackbench
> >> >[   74.528757] [ 2017]     0  2017      448       54       5        0           0 hackbench
> >> >[   74.542685] [ 2019]     0  2019      448       54       5        0           0 hackbench
> >> >[   74.561320] [ 2020]     0  2020      448       54       5        0           0 hackbench
> >> >[   74.576906] [ 2021]     0  2021      448       54       5        0           0 hackbench
> >> >[   74.586537] [ 2022]     0  2022      448       54       5        0           0 hackbench
> >> >[   74.610881] [ 2023]     0  2023      448       54       5        0           0 hackbench
> >> >[   74.696010] [ 2026]     0  2026      448       59       5        0           0 hackbench
> >> >[   74.746412] [ 2032]     0  2032      448       64       5        0           0 hackbench
> >> >[   74.781200] [ 2095]     0  2095      689       36       3        0           0 cat
> >> >[   74.795979] [ 2098]     0  2098      246        1       2        0           0 ps
> >> >[   74.815133] [ 2099]     0  2099      448      133       4        0           0 hackbench
> >> >[   74.829415] [ 2100]     0  2100      690       98       4        0           0 cat
> >> >[   74.844691] [ 2101]     0  2101      448       54       4        0           0 hackbench
> >> >[   74.855449] [ 2102]     0  2102      448       54       4        0           0 hackbench
> >> >[   74.876889] [ 2103]     0  2103      448       54       4        0           0 hackbench
> >> >[   74.893808] [ 2104]     0  2104      448       54       4        0           0 hackbench
> >> >[   74.915100] [ 2105]     0  2105      448       54       4        0           0 hackbench
> >> >[   74.932730] [ 2106]     0  2106      448       54       4        0           0 hackbench
> >> >[   74.965216] [ 2107]     0  2107      448       54       4        0           0 hackbench
> >> >[   75.027941] [ 2108]     0  2108      448       54       4        0           0 hackbench
> >> >[   75.046197] [ 2109]     0  2109      448       54       4        0           0 hackbench
> >> >[   75.063038] [ 2110]     0  2110      448       54       4        0           0 hackbench
> >> >[   75.079646] [ 2111]     0  2111      448       55       4        0           0 hackbench
> >> >[   75.115715] [ 2112]     0  2112      448       55       4        0           0 hackbench
> >> >[   75.125629] [ 2113]     0  2113      448       55       4        0           0 hackbench
> >> >[   75.136509] [ 2114]     0  2114      448       55       4        0           0 hackbench
> >> >[   75.146358] [ 2115]     0  2115      448       55       4        0           0 hackbench
> >> >[   75.157808] [ 2116]     0  2116      448       55       4        0           0 hackbench
> >> >[   75.167646] [ 2117]     0  2117      448       55       4        0           0 hackbench
> >> >[   75.177705] [ 2118]     0  2118      448       55       4        0           0 hackbench
> >> >[   75.187688] [ 2119]     0  2119      448       55       4        0           0 hackbench
> >> >[   75.202440] [ 2120]     0  2120      448       55       4        0           0 hackbench
> >> >[   75.226442] [ 2121]     0  2121      448       54       4        0           0 hackbench
> >> >[   75.299323] [ 2190]     0  2190      724      152       5        0           0 ps
> >> >[   75.309857] [ 2191]     0  2191      392       31       3        0           0 cat
> >> >[   75.320704] [ 2192]     0  2192      116       33       3        0           0 hackbench
> >> >[   75.331584] [ 2193]     0  2193      246        1       2        0           0 ps
> >> >[   75.341056] Out of memory: Kill process 620 (cyclictest) score 163 or sacrifice child
> >> >[   75.350899] Killed process 620 (cyclictest) total-vm:10092kB, anon-rss:8544kB, file-rss:1432kB
> >> >
> >> >Unfortunately I'm rather busy at the moment, so I couldn't investigate
> >> >this issue at all, I'm sorry.
> >> >Steven, probably this issue also concerns 3.8.13.14-rt28-rc1 (I just
> >> >discovered your post on the list).
> >> >
> >> >Regards,
> >> >Helmut
> >> >
> >> >PS: running this test with hackbench alone (no dohell), I get nearly
> >> >the same result with kmem_cache_alloc() failing.
> >> >PPS: I had to resend this post since I didn't get through to the list.
> >> >Sorry about any noise!
> >>
> >> Do you have RCU_BOOST enabled? My guess here is that something is
> >> running at a higher priority, not allowing rcuc/ to run and do its job.
> >
> > The major effect of that commit is to move processing from softirq to
> > kthread.  If your workload has quite a few interrupts, it is possible
> > that the softirq version would -usually- make progress, but the kthread
> > version not.  (You could still get this to happen with softirq should
> > ksoftirqd kick in.)
> >
> > Sebastian's suggestion of RCU_BOOST makes sense, and you will also need to
> > set CONFIG_RCU_BOOST_PRIO to be higher priority than your highest-priority
> > CPU-bound thread.
> >
> >                                                         Thanx, Paul
> >
> 


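For reference, the two options Paul mentions would appear in a 3.10-era
.config roughly like this (the priority value 50 is purely illustrative;
it just has to be higher than the workload's highest-priority CPU-bound
RT thread):

    CONFIG_RCU_BOOST=y
    # Priority at which RCU readers (and the rcuc/rcub kthreads) are
    # boosted; must exceed any CPU-bound RT thread, 99 is the maximum.
    CONFIG_RCU_BOOST_PRIO=50
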
^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: Resend: 3.10.32-rt31: problem with "rcu: Eliminate softirq processing from rcutree"
  2014-03-08 19:19       ` Helmut Buchsbaum
@ 2014-03-18 10:58         ` Sebastian Andrzej Siewior
  0 siblings, 0 replies; 9+ messages in thread
From: Sebastian Andrzej Siewior @ 2014-03-18 10:58 UTC (permalink / raw)
  To: Helmut Buchsbaum; +Cc: paulmck, linux-rt-users, rostedt

* Helmut Buchsbaum | 2014-03-08 20:19:00 [+0100]:

>Sorry about top posting, I am just not used to Gmail's web interface ;-)

Top-posting is bad enough. But quoting the complete email just to add
one line of complete nonsense is a waste of resources, including
readers' time and hardware processing.

Sebastian

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: Resend: 3.10.32-rt31: problem with "rcu: Eliminate softirq processing from rcutree"
  2014-03-08 20:04       ` Paul E. McKenney
@ 2014-03-18 11:05         ` Sebastian Andrzej Siewior
  2014-03-20 14:50           ` Steven Rostedt
  0 siblings, 1 reply; 9+ messages in thread
From: Sebastian Andrzej Siewior @ 2014-03-18 11:05 UTC (permalink / raw)
  To: Paul E. McKenney; +Cc: Helmut Buchsbaum, linux-rt-users, rostedt, tglx

* Paul E. McKenney | 2014-03-08 12:04:00 [-0800]:

>On Sat, Mar 08, 2014 at 08:08:35PM +0100, Helmut Buchsbaum wrote:
>> This is the correct place to tweak the system; I was not yet aware of it!
>> Thanks a lot for the hint. I just enabled RCU_BOOST with its default
>> settings and the system runs smoothly and stably again, even under
>> heavy load. Maybe the default for RCU_BOOST should be set to Y on
>> systems using PREEMPT_RT_FULL?
>
>I must defer to Sebastian on this, but seems like a good approach.

Hmm. There are a few ways to break an RT system. Afaik RCU_BOOST was
recommended even before this change. 

Steven, tglx any opinion on that?

Another thing we could do is use register_shrinker(); the registered
shrinkers are invoked before the OOM killer runs. But then, boosting is
simple enough.

>							Thanx, Paul

Sebastian

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: Resend: 3.10.32-rt31: problem with "rcu: Eliminate softirq processing from rcutree"
  2014-03-18 11:05         ` Sebastian Andrzej Siewior
@ 2014-03-20 14:50           ` Steven Rostedt
  0 siblings, 0 replies; 9+ messages in thread
From: Steven Rostedt @ 2014-03-20 14:50 UTC (permalink / raw)
  To: Sebastian Andrzej Siewior
  Cc: Paul E. McKenney, Helmut Buchsbaum, linux-rt-users, tglx

On Tue, 18 Mar 2014 12:05:31 +0100
Sebastian Andrzej Siewior <bigeasy@linutronix.de> wrote:

> * Paul E. McKenney | 2014-03-08 12:04:00 [-0800]:
> 
> >On Sat, Mar 08, 2014 at 08:08:35PM +0100, Helmut Buchsbaum wrote:
> >> This is the correct place to tweak the system; I was not yet aware of it!
> >> Thanks a lot for the hint. I just enabled RCU_BOOST with its default
> >> settings and the system runs smoothly and stably again, even under
> >> heavy load. Maybe the default for RCU_BOOST should be set to Y on
> >> systems using PREEMPT_RT_FULL?
> >
> >I must defer to Sebastian on this, but seems like a good approach.
> 
> Hmm. There are a few ways to break an RT system. Afaik RCU_BOOST was
> recommended even before this change. 
> 
> Steven, tglx any opinion on that?
> 
> Another thing we could do is to register_shrinker() which is called
> before OOM. But then boosting is simple enough.

I thought RCU_BOOST was already set to default y when PREEMPT_RT is
set. If not, then please add it. Note, the default should be set, but
not selected; still let the user disable it.

 default y if PREEMPT_RT_FULL

Thanks,

-- Steve

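The Kconfig change Steve describes would be a one-line addition to the
existing RCU_BOOST entry in init/Kconfig; a rough sketch of the
resulting stanza (abbreviated, help text omitted) on a 3.10-era tree:

    config RCU_BOOST
            bool "Enable RCU priority boosting"
            depends on RT_MUTEXES && PREEMPT_RCU
            default y if PREEMPT_RT_FULL
            default n

With multiple "default" lines, Kconfig uses the first whose condition
holds, so PREEMPT_RT_FULL builds get y while the option remains visible
and can still be disabled by the user.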
^ permalink raw reply	[flat|nested] 9+ messages in thread

end of thread, other threads:[~2014-03-20 14:50 UTC | newest]

Thread overview: 9+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2014-03-06 19:05 Resend: 3.10.32-rt31: problem with "rcu: Eliminate softirq processing from rcutree" Helmut Buchsbaum
2014-03-07 16:21 ` Sebastian Andrzej Siewior
2014-03-07 17:04   ` Paul E. McKenney
2014-03-08 19:08     ` Helmut Buchsbaum
2014-03-08 19:19       ` Helmut Buchsbaum
2014-03-18 10:58         ` Sebastian Andrzej Siewior
2014-03-08 20:04       ` Paul E. McKenney
2014-03-18 11:05         ` Sebastian Andrzej Siewior
2014-03-20 14:50           ` Steven Rostedt
