* Re: kswapd craziness round 2 @ 2013-02-18 6:18 Daniel J Blueman 2013-02-18 11:42 ` Hillf Danton 0 siblings, 1 reply; 25+ messages in thread From: Daniel J Blueman @ 2013-02-18 6:18 UTC (permalink / raw) To: Jiri Slaby; +Cc: Linux Kernel, Steffen Persvold On Monday, 18 February 2013 06:10:02 UTC+8, Jiri Slaby wrote: > Hi, > > You still feel the sour taste of the "kswapd craziness in v3.7" thread, > right? Welcome to the hell, part two :{. > > I believe this started happening after update from > 3.8.0-rc4-next-20130125 to 3.8.0-rc7-next-20130211. The same as before, > many hours of uptime are needed and perhaps some suspend/resume cycles > too. Memory pressure is not high, plenty of I/O cache: > # free > total used free shared buffers cached > Mem: 6026692 5571184 455508 0 351252 2016648 > -/+ buffers/cache: 3203284 2823408 > Swap: 0 0 0 > > kswap is working very toughly though: > root 580 0.6 0.0 0 0 ? S úno12 46:21 [kswapd0] > > This happens on I/O activity right now. For example by updatedb or find > /. This is what the stack trace of kswapd0 looks like: > [<ffffffff8113c431>] shrink_slab+0xa1/0x2d0 > [<ffffffff8113ecd1>] kswapd+0x541/0x930 > [<ffffffff810a3000>] kthread+0xc0/0xd0 > [<ffffffff816beb5c>] ret_from_fork+0x7c/0xb0 > [<ffffffffffffffff>] 0xffffffffffffffff Likewise with 3.8-rc, I've been able to reproduce [1] a livelock scenario which hoses the box and observe RCU stalls [2]. There may be a connection; I'll do a bit more debugging in the next few days. Daniel --- [1] 1. live-booted image using ramdisk 2. boot 3.8-rc with <16GB memory and without swap 3. run OpenMP NAS Parallel Benchmark dc.B against local disk (i.e. not ramdisk) 4. observe hang O(30) mins later --- [2] [ 2675.587878] INFO: rcu_sched self-detected stall on CPU { 5} (t=24000 jiffies g=6313 c=6312 q=68) -- Daniel J Blueman Principal Software Engineer, Numascale Asia ^ permalink raw reply [flat|nested] 25+ messages in thread
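For readers checking the "memory pressure is not high" remark: the "-/+ buffers/cache" row of the quoted free output can be reproduced from the "Mem:" row. A minimal sketch in user-space C; the function names are hypothetical and only illustrate the arithmetic, they are not part of procps or the kernel.

```c
/* Values are in KiB, as printed by the old procps "free" layout quoted above. */

/* "used" with buffers and page cache subtracted (first column of the -/+ row). */
unsigned long used_excluding_cache(unsigned long used,
                                   unsigned long buffers,
                                   unsigned long cached)
{
    return used - buffers - cached;
}

/* "free" with buffers and page cache added back (second column of the -/+ row). */
unsigned long free_including_cache(unsigned long free_kib,
                                   unsigned long buffers,
                                   unsigned long cached)
{
    return free_kib + buffers + cached;
}
```

With the numbers from the report, 5571184 - 351252 - 2016648 gives 3203284 and 455508 + 351252 + 2016648 gives 2823408, matching the quoted row: roughly 2.7 GiB of the 6 GB machine is free or held in buffers and cache.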
* Re: kswapd craziness round 2 2013-02-18 6:18 kswapd craziness round 2 Daniel J Blueman @ 2013-02-18 11:42 ` Hillf Danton 2013-02-18 15:05 ` Daniel J Blueman 2013-02-20 22:14 ` Jiri Slaby 0 siblings, 2 replies; 25+ messages in thread From: Hillf Danton @ 2013-02-18 11:42 UTC (permalink / raw) To: Daniel J Blueman; +Cc: Jiri Slaby, Linux Kernel, Steffen Persvold On Mon, Feb 18, 2013 at 2:18 PM, Daniel J Blueman <daniel@numascale-asia.com> wrote: > On Monday, 18 February 2013 06:10:02 UTC+8, Jiri Slaby wrote: > >> Hi, >> >> You still feel the sour taste of the "kswapd craziness in v3.7" thread, >> right? Welcome to the hell, part two :{. >> >> I believe this started happening after update from >> 3.8.0-rc4-next-20130125 to 3.8.0-rc7-next-20130211. The same as before, >> many hours of uptime are needed and perhaps some suspend/resume cycles >> too. Memory pressure is not high, plenty of I/O cache: >> # free >> total used free shared buffers cached >> Mem: 6026692 5571184 455508 0 351252 2016648 >> -/+ buffers/cache: 3203284 2823408 >> Swap: 0 0 0 >> >> kswap is working very toughly though: >> root 580 0.6 0.0 0 0 ? S úno12 46:21 [kswapd0] >> >> This happens on I/O activity right now. For example by updatedb or find >> /. This is what the stack trace of kswapd0 looks like: >> [<ffffffff8113c431>] shrink_slab+0xa1/0x2d0 >> [<ffffffff8113ecd1>] kswapd+0x541/0x930 >> [<ffffffff810a3000>] kthread+0xc0/0xd0 >> [<ffffffff816beb5c>] ret_from_fork+0x7c/0xb0 >> [<ffffffffffffffff>] 0xffffffffffffffff > > Likewise with 3.8-rc, I've been able to reproduce [1] a livelock scenario > which hoses the box and observe RCU stalls are observed [2]. > > There may be a connection; I'll do a bit more debugging in the next few > days. > > Daniel > > --- [1] > > 1. live-booted image using ramdisk > 2. boot 3.8-rc with <16GB memory and without swap > 3. run OpenMP NAS Parallel Benchmark dc.B against local disk (ie not > ramdisk) > 4. 
observe hang O(30) mins later > > --- [2] > > [ 2675.587878] INFO: rcu_sched self-detected stall on CPU { 5} (t=24000 > jiffies g=6313 c=6312 q=68) Does Ingo's revert help? https://lkml.org/lkml/2013/2/15/168 ^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: kswapd craziness round 2 2013-02-18 11:42 ` Hillf Danton @ 2013-02-18 15:05 ` Daniel J Blueman 2013-02-20 22:14 ` Jiri Slaby 1 sibling, 0 replies; 25+ messages in thread From: Daniel J Blueman @ 2013-02-18 15:05 UTC (permalink / raw) To: Hillf Danton Cc: Jiri Slaby, Linux Kernel, Steffen Persvold, Ingo Molnar, Linus Torvalds On 18/02/2013 19:42, Hillf Danton wrote: > On Mon, Feb 18, 2013 at 2:18 PM, Daniel J Blueman > <daniel@numascale-asia.com> wrote: >> On Monday, 18 February 2013 06:10:02 UTC+8, Jiri Slaby wrote: >> >>> Hi, >>> >>> You still feel the sour taste of the "kswapd craziness in v3.7" thread, >>> right? Welcome to the hell, part two :{. >>> >>> I believe this started happening after update from >>> 3.8.0-rc4-next-20130125 to 3.8.0-rc7-next-20130211. The same as before, >>> many hours of uptime are needed and perhaps some suspend/resume cycles >>> too. Memory pressure is not high, plenty of I/O cache: >>> # free >>> total used free shared buffers cached >>> Mem: 6026692 5571184 455508 0 351252 2016648 >>> -/+ buffers/cache: 3203284 2823408 >>> Swap: 0 0 0 >>> >>> kswap is working very toughly though: >>> root 580 0.6 0.0 0 0 ? S úno12 46:21 [kswapd0] >>> >>> This happens on I/O activity right now. For example by updatedb or find >>> /. This is what the stack trace of kswapd0 looks like: >>> [<ffffffff8113c431>] shrink_slab+0xa1/0x2d0 >>> [<ffffffff8113ecd1>] kswapd+0x541/0x930 >>> [<ffffffff810a3000>] kthread+0xc0/0xd0 >>> [<ffffffff816beb5c>] ret_from_fork+0x7c/0xb0 >>> [<ffffffffffffffff>] 0xffffffffffffffff >> >> Likewise with 3.8-rc, I've been able to reproduce [1] a livelock scenario >> which hoses the box and observe RCU stalls [2]. >> >> There may be a connection; I'll do a bit more debugging in the next few >> days. >> >> Daniel >> >> --- [1] >> >> 1. live-booted image using ramdisk >> 2. boot 3.8-rc with <16GB memory and without swap >> 3. run OpenMP NAS Parallel Benchmark dc.B against local disk (ie not >> ramdisk) >> 4. 
observe hang O(30) mins later >> >> --- [2] >> >> [ 2675.587878] INFO: rcu_sched self-detected stall on CPU { 5} (t=24000 >> jiffies g=6313 c=6312 q=68) > > Does Ingo's revert help? https://lkml.org/lkml/2013/2/15/168 Close, but no cigar; I still hit this livelock on 3.8-rc7 with Ingo's revert or Linus's fix. However, I am unable to reproduce the hang with 3.7.9, so will begin bisection tomorrow, probably automating via pexpect. Thanks, Daniel -- Daniel J Blueman Principal Software Engineer, Numascale Asia ^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: kswapd craziness round 2 2013-02-18 11:42 ` Hillf Danton 2013-02-18 15:05 ` Daniel J Blueman @ 2013-02-20 22:14 ` Jiri Slaby 2013-02-21 12:07 ` Hillf Danton 1 sibling, 1 reply; 25+ messages in thread From: Jiri Slaby @ 2013-02-20 22:14 UTC (permalink / raw) To: Hillf Danton, Daniel J Blueman; +Cc: Linux Kernel, Steffen Persvold On 02/18/2013 12:42 PM, Hillf Danton wrote: > On Mon, Feb 18, 2013 at 2:18 PM, Daniel J Blueman > <daniel@numascale-asia.com> wrote: >> On Monday, 18 February 2013 06:10:02 UTC+8, Jiri Slaby wrote: >> >>> Hi, >>> >>> You still feel the sour taste of the "kswapd craziness in v3.7" thread, >>> right? Welcome to the hell, part two :{. ... >>> kswap is working very toughly though: >>> root 580 0.6 0.0 0 0 ? S úno12 46:21 [kswapd0] ... >> [ 2675.587878] INFO: rcu_sched self-detected stall on CPU { 5} (t=24000 >> jiffies g=6313 c=6312 q=68) > > Does Ingo's revert help? https://lkml.org/lkml/2013/2/15/168 Not at all... -- js suse labs ^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: kswapd craziness round 2 2013-02-20 22:14 ` Jiri Slaby @ 2013-02-21 12:07 ` Hillf Danton 2013-02-24 21:27 ` Jiri Slaby 2013-02-28 17:02 ` Jiri Slaby 0 siblings, 2 replies; 25+ messages in thread From: Hillf Danton @ 2013-02-21 12:07 UTC (permalink / raw) To: Jiri Slaby; +Cc: Daniel J Blueman, Linux Kernel, Steffen Persvold On Thu, Feb 21, 2013 at 6:14 AM, Jiri Slaby <jslaby@suse.cz> wrote: >> >> Does Ingo's revert help? https://lkml.org/lkml/2013/2/15/168 > > Not at all... > Then mind taking a try? --- a/mm/vmscan.c Thu Feb 21 20:01:02 2013 +++ b/mm/vmscan.c Thu Feb 21 20:05:58 2013 @@ -1715,7 +1715,7 @@ static void get_scan_count(struct lruvec * to swap. Better start now and leave the - probably heavily * thrashing - remaining file pages alone. */ - if (global_reclaim(sc)) { + if (global_reclaim(sc) && sc->priority >= DEF_PRIORITY - 2) { free = zone_page_state(zone, NR_FREE_PAGES); if (unlikely(file + free <= high_wmark_pages(zone))) { scan_balance = SCAN_ANON; @@ -2840,9 +2840,10 @@ out: * reclaim if they wish. */ if (sc.nr_reclaimed < SWAP_CLUSTER_MAX) - order = sc.order = 0; - - goto loop_again; + if (order != 0) { + sc.order = order = 0; + goto loop_again; + } } /* -- ^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: kswapd craziness round 2 2013-02-21 12:07 ` Hillf Danton @ 2013-02-24 21:27 ` Jiri Slaby 2013-02-28 17:02 ` Jiri Slaby 1 sibling, 0 replies; 25+ messages in thread From: Jiri Slaby @ 2013-02-24 21:27 UTC (permalink / raw) To: Hillf Danton; +Cc: Daniel J Blueman, Linux Kernel, Steffen Persvold On 02/21/2013 01:07 PM, Hillf Danton wrote: > On Thu, Feb 21, 2013 at 6:14 AM, Jiri Slaby <jslaby@suse.cz> wrote: >>> >>> Does Ingo's revert help? https://lkml.org/lkml/2013/2/15/168 >> >> Not at all... >> > Then mind taking a try? Applied now, I'll report in a week or so as it needs a couple of days of uptime to occur. thanks, -- js suse labs ^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: kswapd craziness round 2 2013-02-21 12:07 ` Hillf Danton 2013-02-24 21:27 ` Jiri Slaby @ 2013-02-28 17:02 ` Jiri Slaby 2013-03-01 14:02 ` Hillf Danton 1 sibling, 1 reply; 25+ messages in thread From: Jiri Slaby @ 2013-02-28 17:02 UTC (permalink / raw) To: Hillf Danton; +Cc: Daniel J Blueman, Linux Kernel, Steffen Persvold On 02/21/2013 01:07 PM, Hillf Danton wrote: > On Thu, Feb 21, 2013 at 6:14 AM, Jiri Slaby <jslaby@suse.cz> wrote: >>> >>> Does Ingo's revert help? https://lkml.org/lkml/2013/2/15/168 >> >> Not at all... >> > Then mind taking a try? Ok, no difference, kswap is still crazy. I'm attaching the output of "grep -vw '0' /proc/vmstat" if you see something there. > --- a/mm/vmscan.c Thu Feb 21 20:01:02 2013 > +++ b/mm/vmscan.c Thu Feb 21 20:05:58 2013 > @@ -1715,7 +1715,7 @@ static void get_scan_count(struct lruvec > * to swap. Better start now and leave the - probably heavily > * thrashing - remaining file pages alone. > */ > - if (global_reclaim(sc)) { > + if (global_reclaim(sc) && sc->priority >= DEF_PRIORITY - 2) { > free = zone_page_state(zone, NR_FREE_PAGES); > if (unlikely(file + free <= high_wmark_pages(zone))) { > scan_balance = SCAN_ANON; > @@ -2840,9 +2840,10 @@ out: > * reclaim if they wish. 
> */ > if (sc.nr_reclaimed < SWAP_CLUSTER_MAX) > - order = sc.order = 0; > - > - goto loop_again; > + if (order != 0) { > + sc.order = order = 0; > + goto loop_again; > + } > } > > /* nr_free_pages 36767 nr_inactive_anon 209253 nr_active_anon 1000355 nr_inactive_file 130500 nr_active_file 82677 nr_anon_pages 781334 nr_mapped 94443 nr_file_pages 554906 nr_dirty 29 nr_slab_reclaimable 13104 nr_slab_unreclaimable 9202 nr_page_table_pages 11694 nr_kernel_stack 477 nr_vmscan_write 114 nr_vmscan_immediate_reclaim 831 nr_shmem 341734 nr_dirtied 13492560 nr_written 13388832 nr_anon_transparent_hugepages 169 nr_dirty_threshold 20063 nr_dirty_background_threshold 10031 pgpgin 29026221 pgpgout 55166319 pgalloc_dma 256 pgalloc_dma32 75887179 pgalloc_normal 127591749 pgfree 212204191 pgactivate 5665900 pgdeactivate 1370274 pgfault 130946292 pgmajfault 91443 pgrefill_dma32 582854 pgrefill_normal 1140727 pgsteal_kswapd_dma32 6244454 pgsteal_kswapd_normal 6341734 pgsteal_direct_dma32 1209055 pgsteal_direct_normal 2280164 pgscan_kswapd_dma32 6271350 pgscan_kswapd_normal 6403760 pgscan_direct_dma32 1213349 pgscan_direct_normal 2300634 pginodesteal 190690 slabs_scanned 5139200 kswapd_inodesteal 456779 kswapd_low_wmark_hit_quickly 5042 kswapd_high_wmark_hit_quickly 156125 pageoutrun 170524 allocstall 32073 pgrotated 1321 pgmigrate_success 890843 pgmigrate_fail 282 compact_migrate_scanned 7776871 compact_free_scanned 565089036 compact_isolated 10590951 compact_stall 3114 compact_fail 2675 compact_success 439 unevictable_pgs_culled 658 unevictable_pgs_rescued 5309 unevictable_pgs_mlocked 5309 unevictable_pgs_munlocked 5309 thp_fault_alloc 6071 thp_fault_fallback 34735 thp_collapse_alloc 1817 thp_collapse_alloc_failed 2822 thp_split 292 thp_zero_page_alloc 2 thp_zero_page_alloc_failed 243 thanks, -- js suse labs ^ permalink raw reply [flat|nested] 25+ messages in thread
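One quick way to read the vmstat dump above is to compare pages reclaimed with pages scanned, summed over the dma32 and normal zones. The sketch below is plain user-space C with a hypothetical helper name; it only does the division and is not a diagnosis of the bug.

```c
/*
 * Fraction of scanned pages that were actually reclaimed.
 * Returns 0.0 when nothing was scanned, to avoid dividing by zero.
 */
double reclaim_efficiency(unsigned long long stolen, unsigned long long scanned)
{
    if (scanned == 0)
        return 0.0;
    return (double)stolen / (double)scanned;
}
```

From the counters above, kswapd stole 6244454 + 6341734 = 12586188 pages out of 6271350 + 6403760 = 12675110 scanned, and direct reclaim stole 1209055 + 2280164 = 3489219 out of 1213349 + 2300634 = 3513983. Both ratios come out at about 0.993, so on this reading the LRU scanning itself is not being wasted; the kswapd0 stack trace reported at the start of the thread was in shrink_slab().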
* Re: kswapd craziness round 2 2013-02-28 17:02 ` Jiri Slaby @ 2013-03-01 14:02 ` Hillf Danton 2013-03-07 19:37 ` Jiri Slaby 0 siblings, 1 reply; 25+ messages in thread From: Hillf Danton @ 2013-03-01 14:02 UTC (permalink / raw) To: Jiri Slaby; +Cc: Daniel J Blueman, Linux Kernel, Steffen Persvold On Fri, Mar 1, 2013 at 1:02 AM, Jiri Slaby <jslaby@suse.cz> wrote: > > Ok, no difference, kswap is still crazy. I'm attaching the output of > "grep -vw '0' /proc/vmstat" if you see something there. > Thanks for the test and data. Let's try to restore the deleted nap, then. Hillf --- a/mm/vmscan.c Thu Feb 21 20:01:02 2013 +++ b/mm/vmscan.c Fri Mar 1 21:55:40 2013 @@ -2817,6 +2817,10 @@ loop_again: */ if (sc.nr_reclaimed >= SWAP_CLUSTER_MAX) break; + + if (sc.priority < DEF_PRIORITY - 2) + congestion_wait(BLK_RW_ASYNC, HZ/10); + } while (--sc.priority >= 0); out: -- ^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: kswapd craziness round 2 2013-03-01 14:02 ` Hillf Danton @ 2013-03-07 19:37 ` Jiri Slaby 2013-03-08 6:42 ` Hillf Danton 0 siblings, 1 reply; 25+ messages in thread From: Jiri Slaby @ 2013-03-07 19:37 UTC (permalink / raw) To: Hillf Danton; +Cc: Daniel J Blueman, Linux Kernel, Steffen Persvold On 03/01/2013 03:02 PM, Hillf Danton wrote: > On Fri, Mar 1, 2013 at 1:02 AM, Jiri Slaby <jslaby@suse.cz> wrote: >> >> Ok, no difference, kswap is still crazy. I'm attaching the output of >> "grep -vw '0' /proc/vmstat" if you see something there. >> > Thanks to you for test and data. > > Lets try to restore the deleted nap, then. Oh, it seems to be nice now: root 579 0.0 0.0 0 0 ? S Mar04 0:13 [kswapd0] Thanks. > Hillf > --- a/mm/vmscan.c Thu Feb 21 20:01:02 2013 > +++ b/mm/vmscan.c Fri Mar 1 21:55:40 2013 > @@ -2817,6 +2817,10 @@ loop_again: > */ > if (sc.nr_reclaimed >= SWAP_CLUSTER_MAX) > break; > + > + if (sc.priority < DEF_PRIORITY - 2) > + congestion_wait(BLK_RW_ASYNC, HZ/10); > + > } while (--sc.priority >= 0); > > out: > -- > -- js suse labs ^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: kswapd craziness round 2 2013-03-07 19:37 ` Jiri Slaby @ 2013-03-08 6:42 ` Hillf Danton 0 siblings, 0 replies; 25+ messages in thread From: Hillf Danton @ 2013-03-08 6:42 UTC (permalink / raw) To: Jiri Slaby Cc: Daniel J Blueman, Linux Kernel, Steffen Persvold, mm, Mel Gorman, Andrew Morton On Fri, Mar 8, 2013 at 3:37 AM, Jiri Slaby <jslaby@suse.cz> wrote: > On 03/01/2013 03:02 PM, Hillf Danton wrote: >> On Fri, Mar 1, 2013 at 1:02 AM, Jiri Slaby <jslaby@suse.cz> wrote: >>> >>> Ok, no difference, kswap is still crazy. I'm attaching the output of >>> "grep -vw '0' /proc/vmstat" if you see something there. >>> >> Thanks to you for test and data. >> >> Lets try to restore the deleted nap, then. > > Oh, it seems to be nice now: > root 579 0.0 0.0 0 0 ? S Mar04 0:13 [kswapd0] > Double thanks. But Mel probably does not like it. Let's try the nap another way. Hillf --- a/mm/vmscan.c Thu Feb 21 20:01:02 2013 +++ b/mm/vmscan.c Fri Mar 8 14:36:10 2013 @@ -2793,6 +2793,10 @@ loop_again: * speculatively avoid congestion waits */ zone_clear_flag(zone, ZONE_CONGESTED); + + else if (sc.priority > 2 && + sc.priority < DEF_PRIORITY - 2) + wait_iff_congested(zone, BLK_RW_ASYNC, HZ/10); } /* -- >> >> --- a/mm/vmscan.c Thu Feb 21 20:01:02 2013 >> +++ b/mm/vmscan.c Fri Mar 1 21:55:40 2013 >> @@ -2817,6 +2817,10 @@ loop_again: >> */ >> if (sc.nr_reclaimed >= SWAP_CLUSTER_MAX) >> break; >> + >> + if (sc.priority < DEF_PRIORITY - 2) >> + congestion_wait(BLK_RW_ASYNC, HZ/10); >> + >> } while (--sc.priority >= 0); >> >> out: >> -- >> > > > -- > js > suse labs ^ permalink raw reply [flat|nested] 25+ messages in thread
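The two "nap" patches in this thread differ in which scan priorities they apply to. A small user-space sketch of just the priority conditions follows; the function names are hypothetical, and DEF_PRIORITY is assumed to be 12, as I recall it from include/linux/mmzone.h of that era. It does not model the rest of either patch: the second one sits on an else branch and calls wait_iff_congested(), which sleeps only under congestion, while the first calls congestion_wait() unconditionally.

```c
#include <stdbool.h>

/* Assumed value of the kernel's DEF_PRIORITY; kswapd counts down from it to 0. */
#define DEF_PRIORITY 12

/* Patch of 2013-03-01: nap once priority has dropped below DEF_PRIORITY - 2. */
bool naps_in_first_patch(int priority)
{
    return priority < DEF_PRIORITY - 2;
}

/* Patch of 2013-03-08: same upper bound, but no nap at priorities 0..2. */
bool naps_in_second_patch(int priority)
{
    return priority > 2 && priority < DEF_PRIORITY - 2;
}
```

Under that assumption the first patch naps at priorities 9 down to 0, and the second only at 9 down to 3, so the most urgent passes are never delayed by it.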
* Re: kswapd craziness round 2 2013-03-08 6:42 ` Hillf Danton @ 2013-03-08 7:29 ` Zlatko Calusic -1 siblings, 0 replies; 25+ messages in thread From: Zlatko Calusic @ 2013-03-08 7:29 UTC (permalink / raw) To: Hillf Danton Cc: Jiri Slaby, Daniel J Blueman, Linux Kernel, Steffen Persvold, mm, Mel Gorman, Andrew Morton On 08.03.2013 07:42, Hillf Danton wrote: > On Fri, Mar 8, 2013 at 3:37 AM, Jiri Slaby <jslaby@suse.cz> wrote: >> On 03/01/2013 03:02 PM, Hillf Danton wrote: >>> On Fri, Mar 1, 2013 at 1:02 AM, Jiri Slaby <jslaby@suse.cz> wrote: >>>> >>>> Ok, no difference, kswap is still crazy. I'm attaching the output of >>>> "grep -vw '0' /proc/vmstat" if you see something there. >>>> >>> Thanks to you for test and data. >>> >>> Lets try to restore the deleted nap, then. >> >> Oh, it seems to be nice now: >> root 579 0.0 0.0 0 0 ? S Mar04 0:13 [kswapd0] >> > Double thanks. > > But Mel does not like it, probably. > Lets try nap in another way. > > Hillf > > --- a/mm/vmscan.c Thu Feb 21 20:01:02 2013 > +++ b/mm/vmscan.c Fri Mar 8 14:36:10 2013 > @@ -2793,6 +2793,10 @@ loop_again: > * speculatively avoid congestion waits > */ > zone_clear_flag(zone, ZONE_CONGESTED); > + > + else if (sc.priority > 2 && > + sc.priority < DEF_PRIORITY - 2) > + wait_iff_congested(zone, BLK_RW_ASYNC, HZ/10); > } > > /* > -- > There's another bug in there, which I'm still chasing. Artificial sleeps like this just mask the real bug and introduce new problems (on my 4GB server kswapd spends all the time in those congestion wait calls). The problem is that the bug needs about 5 days of uptime to reveal its ugly head. So far I can only tell that it was introduced somewhere between 3.1 & 3.4. Also, check shrink_inactive_list(), it already sleeps if really needed: if (nr_writeback && nr_writeback >= (nr_taken >> (DEF_PRIORITY - sc->priority))) wait_iff_congested(zone, BLK_RW_ASYNC, HZ/10); Regards, -- Zlatko ^ permalink raw reply [flat|nested] 25+ messages in thread
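The shrink_inactive_list() condition quoted above can be tried out in user space to see how the threshold moves with priority. This is a sketch only: the function name is hypothetical, DEF_PRIORITY is assumed to be 12, and the real code goes on to call wait_iff_congested() rather than returning a flag.

```c
#include <stdbool.h>

/* Assumed value of the kernel's DEF_PRIORITY. */
#define DEF_PRIORITY 12

/*
 * Mirrors the quoted test: throttle when at least
 * nr_taken >> (DEF_PRIORITY - priority) of the isolated pages are under
 * writeback. Each step down in priority halves the threshold.
 * priority is expected to be in the range 0..DEF_PRIORITY.
 */
bool writeback_throttle(unsigned long nr_writeback,
                        unsigned long nr_taken,
                        int priority)
{
    return nr_writeback &&
           nr_writeback >= (nr_taken >> (DEF_PRIORITY - priority));
}
```

For a batch of 32 isolated pages, all 32 must be under writeback at priority 12, 16 at priority 11, and by priority 6 the shifted threshold reaches zero, so any page under writeback is enough. The check never fires when nr_writeback is zero.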
* Re: kswapd craziness round 2 2013-03-08 7:29 ` Zlatko Calusic @ 2013-03-08 8:27 ` Hillf Danton -1 siblings, 0 replies; 25+ messages in thread From: Hillf Danton @ 2013-03-08 8:27 UTC (permalink / raw) To: Zlatko Calusic Cc: Jiri Slaby, Daniel J Blueman, Linux Kernel, Steffen Persvold, mm, Mel Gorman, Andrew Morton On Fri, Mar 8, 2013 at 3:29 PM, Zlatko Calusic <zcalusic@bitsync.net> wrote: > There's another bug in there, which I'm still chasing. > I am busy looking for an employer (really hard work?), so I dunno how many hours I will have for that bug. Hmm, take a look at Mel's thoughts? http://marc.info/?l=linux-mm&m=136189593423501&w=2 BTW, he will be online next week. Hillf ^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: kswapd craziness round 2 2013-03-08 6:42 ` Hillf Danton @ 2013-03-08 23:21 ` Jiri Slaby -1 siblings, 0 replies; 25+ messages in thread From: Jiri Slaby @ 2013-03-08 23:21 UTC (permalink / raw) To: Hillf Danton Cc: Daniel J Blueman, Linux Kernel, Steffen Persvold, mm, Mel Gorman, Andrew Morton On 03/08/2013 07:42 AM, Hillf Danton wrote: > On Fri, Mar 8, 2013 at 3:37 AM, Jiri Slaby <jslaby@suse.cz> wrote: >> On 03/01/2013 03:02 PM, Hillf Danton wrote: >>> On Fri, Mar 1, 2013 at 1:02 AM, Jiri Slaby <jslaby@suse.cz> wrote: >>>> >>>> Ok, no difference, kswap is still crazy. I'm attaching the output of >>>> "grep -vw '0' /proc/vmstat" if you see something there. >>>> >>> Thanks to you for test and data. >>> >>> Lets try to restore the deleted nap, then. >> >> Oh, it seems to be nice now: >> root 579 0.0 0.0 0 0 ? S Mar04 0:13 [kswapd0] >> > Double thanks. There is one downside. I'm not sure whether that patch was the culprit. My Thunderbird is jerky when scrolling and lags while writing this message. The letters sometimes appear later than typed and in groups. Like I (kbd): My Thunder TB: My Thunder I (kbd): b-i-r-d TB: is silent I (kbd): still typing... TB: bird is Perhaps it's not only TB. > But Mel does not like it, probably. > Lets try nap in another way. Will try next week. > --- a/mm/vmscan.c Thu Feb 21 20:01:02 2013 > +++ b/mm/vmscan.c Fri Mar 8 14:36:10 2013 > @@ -2793,6 +2793,10 @@ loop_again: > * speculatively avoid congestion waits > */ > zone_clear_flag(zone, ZONE_CONGESTED); > + > + else if (sc.priority > 2 && > + sc.priority < DEF_PRIORITY - 2) > + wait_iff_congested(zone, BLK_RW_ASYNC, HZ/10); > } > > /* -- js suse labs ^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: kswapd craziness round 2 2013-03-08 23:21 ` Jiri Slaby @ 2013-03-19 16:59 ` Pádraig Brady -1 siblings, 0 replies; 25+ messages in thread From: Pádraig Brady @ 2013-03-19 16:59 UTC (permalink / raw) To: Jiri Slaby Cc: Hillf Danton, Daniel J Blueman, Linux Kernel, Steffen Persvold, mm, Mel Gorman, Andrew Morton On 03/08/2013 11:21 PM, Jiri Slaby wrote: > On 03/08/2013 07:42 AM, Hillf Danton wrote: >> On Fri, Mar 8, 2013 at 3:37 AM, Jiri Slaby <jslaby@suse.cz> wrote: >>> On 03/01/2013 03:02 PM, Hillf Danton wrote: >>>> On Fri, Mar 1, 2013 at 1:02 AM, Jiri Slaby <jslaby@suse.cz> wrote: >>>>> >>>>> Ok, no difference, kswap is still crazy. I'm attaching the output of >>>>> "grep -vw '0' /proc/vmstat" if you see something there. >>>>> >>>> Thanks to you for test and data. >>>> >>>> Lets try to restore the deleted nap, then. >>> >>> Oh, it seems to be nice now: >>> root 579 0.0 0.0 0 0 ? S Mar04 0:13 [kswapd0] >>> >> Double thanks. > > There is one downside. I'm not sure whether that patch was the culprit. > My Thunderbird is jerky when scrolling and lags while writing this > message. The letters sometimes appear later than typed and in groups. Like > I (kbd): My Thunder > TB: My Thunder > I (kbd): b-i-r-d > TB: is silent > I (kbd): still typing... > TB: bird is > > Perhaps it's not only TB. I notice the same thunderbird issue on the much older 2.6.40.4-5.fc15.x86_64 which I'd hoped would be fixed on upgrade :( My Thunderbird is using 1957m virt, 722m RSS on my 3G system. What are your corresponding mem values? For reference: http://marc.info/?t=130865025500001&r=1&w=2 https://bugzilla.redhat.com/show_bug.cgi?id=712019 thanks, Pádraig. ^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: kswapd craziness round 2 2013-03-19 16:59 ` Pádraig Brady @ 2013-03-20 4:12 ` Hillf Danton -1 siblings, 0 replies; 25+ messages in thread From: Hillf Danton @ 2013-03-20 4:12 UTC (permalink / raw) To: Pádraig Brady Cc: Jiri Slaby, Daniel J Blueman, Linux Kernel, Steffen Persvold, mm, Mel Gorman On Wed, Mar 20, 2013 at 12:59 AM, Pádraig Brady <P@draigbrady.com> wrote: > > I notice the same thunderbird issue on the much older 2.6.40.4-5.fc15.x86_64 > which I'd hoped would be fixed on upgrade :( > > My Thunderbird is using 1957m virt, 722m RSS on my 3G system. > What are your corresponding mem values? > > For reference: > http://marc.info/?t=130865025500001&r=1&w=2 > https://bugzilla.redhat.com/show_bug.cgi?id=712019 > Hey, would you all please try Mel's new work? http://marc.info/?l=linux-mm&m=136352546814642&w=4 thanks Hillf ^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: kswapd craziness round 2 2013-03-20 4:12 ` Hillf Danton @ 2013-03-20 8:39 ` Jiri Slaby -1 siblings, 0 replies; 25+ messages in thread From: Jiri Slaby @ 2013-03-20 8:39 UTC (permalink / raw) To: Hillf Danton, Pádraig Brady Cc: Daniel J Blueman, Linux Kernel, Steffen Persvold, mm, Mel Gorman On 03/20/2013 05:12 AM, Hillf Danton wrote: > Hey, would you all please try Mels new work? > http://marc.info/?l=linux-mm&m=136352546814642&w=4 Yeah, I was in CC and also asked Mel if I should apply those. I will as soon as I'm back home (next week). thanks, -- js suse labs ^ permalink raw reply [flat|nested] 25+ messages in thread
* kswapd craziness round 2 @ 2013-02-17 22:02 ` Jiri Slaby 0 siblings, 0 replies; 25+ messages in thread From: Jiri Slaby @ 2013-02-17 22:02 UTC (permalink / raw) To: linux-mm; +Cc: Mel Gorman, Andrew Morton, Valdis Kletnieks, LKML, Rik van Riel Hi, You still feel the sour taste of the "kswapd craziness in v3.7" thread, right? Welcome to the hell, part two :{. I believe this started happening after update from 3.8.0-rc4-next-20130125 to 3.8.0-rc7-next-20130211. The same as before, many hours of uptime are needed and perhaps some suspend/resume cycles too. Memory pressure is not high, plenty of I/O cache: # free total used free shared buffers cached Mem: 6026692 5571184 455508 0 351252 2016648 -/+ buffers/cache: 3203284 2823408 Swap: 0 0 0 kswap is working very toughly though: root 580 0.6 0.0 0 0 ? S úno12 46:21 [kswapd0] This happens on I/O activity right now. For example by updatedb or find /. This is what the stack trace of kswapd0 looks like: [<ffffffff8113c431>] shrink_slab+0xa1/0x2d0 [<ffffffff8113ecd1>] kswapd+0x541/0x930 [<ffffffff810a3000>] kthread+0xc0/0xd0 [<ffffffff816beb5c>] ret_from_fork+0x7c/0xb0 [<ffffffffffffffff>] 0xffffffffffffffff Any ideas? thanks, -- js suse labs ^ permalink raw reply [flat|nested] 25+ messages in thread