linux-mm.kvack.org archive mirror
* Re: mm: kswapd struggles reclaiming the pages on 64GB server
       [not found] <CAK-uSPo9Nc-1HaURvwstOGYGuMEx4CXhPRv+cZevYLZX6URzYw@mail.gmail.com>
@ 2016-08-17 11:43 ` Michal Hocko
  2016-08-22 18:16   ` Andriy Tkachuk
  0 siblings, 1 reply; 2+ messages in thread
From: Michal Hocko @ 2016-08-17 11:43 UTC (permalink / raw)
  To: Andriy Tkachuk; +Cc: linux-kernel, Mel Gorman, linux-mm, Johannes Weiner

[CCing linux-mm and Johannes]

On Fri 12-08-16 21:52:20, Andriy Tkachuk wrote:
> Hi,
> 
> our user-space application uses a large amount of anon pages (private
> mapping of a large file, more than 64GB RAM available in the system)
> which are rarely accessed and are supposed to be swapped out.
> Instead, we see that most of these pages are kept in memory while the
> system suffers from a lack of free memory and poor overall performance
> (especially disk I/O; vm.swappiness=100 does not help). kswapd
> scans millions of pages per second but reclaims only hundreds per second.

I haven't looked at your numbers deeply but this smells like the long
standing problem/limitation we have. We are trying really hard to not
swap out and rather reclaim the page cache because the swap refault
tends to be more disruptive in many cases. Not all, though, and the
thrashing-like behavior you see is certainly undesirable.

Johannes has been looking into that area recently. Have a look at
http://lkml.kernel.org/r/20160606194836.3624-1-hannes@cmpxchg.org

> Here are the 5 secs interval snapshots of some counters:
> 
> $ egrep 'Cached|nr_.*active_anon|pgsteal_.*_normal|pgscan_kswapd_normal|pgrefill_normal|nr_vmscan_write|nr_swap|pgact'
> proc-*-0616-1605[345]* | sed 's/:/ /' | sort -sk 2,2
> proc-meminfo-0616-160539.txt Cached:           347936 kB
> proc-meminfo-0616-160549.txt Cached:           316316 kB
> proc-meminfo-0616-160559.txt Cached:           322264 kB
> proc-meminfo-0616-160539.txt SwapCached:      2853064 kB
> proc-meminfo-0616-160549.txt SwapCached:      2853168 kB
> proc-meminfo-0616-160559.txt SwapCached:      2853280 kB
> proc-vmstat-0616-160535.txt nr_active_anon 14508616
> proc-vmstat-0616-160545.txt nr_active_anon 14513725
> proc-vmstat-0616-160555.txt nr_active_anon 14515197
> proc-vmstat-0616-160535.txt nr_inactive_anon 747407
> proc-vmstat-0616-160545.txt nr_inactive_anon 744846
> proc-vmstat-0616-160555.txt nr_inactive_anon 744509
> proc-vmstat-0616-160535.txt nr_vmscan_write 5589095
> proc-vmstat-0616-160545.txt nr_vmscan_write 5589097
> proc-vmstat-0616-160555.txt nr_vmscan_write 5589097
> proc-vmstat-0616-160535.txt pgactivate 246016824
> proc-vmstat-0616-160545.txt pgactivate 246033242
> proc-vmstat-0616-160555.txt pgactivate 246042064
> proc-vmstat-0616-160535.txt pgrefill_normal 22763262
> proc-vmstat-0616-160545.txt pgrefill_normal 22768020
> proc-vmstat-0616-160555.txt pgrefill_normal 22768178
> proc-vmstat-0616-160535.txt pgscan_kswapd_normal 111985367420
> proc-vmstat-0616-160545.txt pgscan_kswapd_normal 111996845554
> proc-vmstat-0616-160555.txt pgscan_kswapd_normal 112028276639
> proc-vmstat-0616-160535.txt pgsteal_direct_normal 344064
> proc-vmstat-0616-160545.txt pgsteal_direct_normal 344064
> proc-vmstat-0616-160555.txt pgsteal_direct_normal 344064
> proc-vmstat-0616-160535.txt pgsteal_kswapd_normal 53817848
> proc-vmstat-0616-160545.txt pgsteal_kswapd_normal 53818626
> proc-vmstat-0616-160555.txt pgsteal_kswapd_normal 53818637
> 
> The pgrefill_normal and pgactivate counters show that only a few
> hundred pages/sec move from the active to the inactive list and vice
> versa - that is comparable with what was reclaimed. So it looks like
> kswapd mostly scans the pages from the inactive list in a kind of loop
> and does not even get a chance to look at the pages from the active
> list (where most of the application's anon pages are located).
> 
> The kernel version: linux-3.10.0-229.14.1.el7.
> 
> Any ideas? Would it be useful to change inactive_ratio dynamically in
> such cases so that more pages could be moved from the active to the
> inactive list and get a chance to be reclaimed? (Note: when the
> application is restarted, the problem disappears for a while (days)
> until the corresponding number of privately mapped pages are dirtied
> again.)
> 
> Thank you,
>    Andriy
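
As a rough cross-check of the quoted counters, the deltas between snapshots can be worked out with a short script. This is only a sketch: the values are copied from the snapshots above, the 10 second interval is an assumption read off the file-name timestamps, and the names SNAPSHOTS, INTERVAL_SECS and deltas are made up for illustration.

```python
# Counter values copied from the quoted /proc/vmstat snapshots
# (files ...160535, ...160545, ...160555).
SNAPSHOTS = {
    "pgscan_kswapd_normal": [111985367420, 111996845554, 112028276639],
    "pgsteal_kswapd_normal": [53817848, 53818626, 53818637],
    "pgrefill_normal": [22763262, 22768020, 22768178],
    "pgactivate": [246016824, 246033242, 246042064],
    "nr_vmscan_write": [5589095, 5589097, 5589097],
}

# Assumption: the snapshots are 10 seconds apart, per the file-name timestamps.
INTERVAL_SECS = 10


def deltas(values):
    """Return the differences between consecutive counter samples."""
    return [later - earlier for earlier, later in zip(values, values[1:])]


if __name__ == "__main__":
    for name, values in SNAPSHOTS.items():
        per_sec = [d / INTERVAL_SECS for d in deltas(values)]
        print(f"{name:24s} per-second rates: {per_sec}")

    # Fraction of scanned pages that kswapd actually reclaimed per interval.
    scanned = deltas(SNAPSHOTS["pgscan_kswapd_normal"])
    stolen = deltas(SNAPSHOTS["pgsteal_kswapd_normal"])
    for sc, st in zip(scanned, stolen):
        print(f"reclaimed {st} of {sc} scanned pages ({st / sc:.6%})")
```

With these samples kswapd scans on the order of a million pages per second in each interval while reclaiming well under a hundred per second, which matches the description in the quoted mail.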

-- 
Michal Hocko
SUSE Labs

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .


* Re: mm: kswapd struggles reclaiming the pages on 64GB server
  2016-08-17 11:43 ` mm: kswapd struggles reclaiming the pages on 64GB server Michal Hocko
@ 2016-08-22 18:16   ` Andriy Tkachuk
  0 siblings, 0 replies; 2+ messages in thread
From: Andriy Tkachuk @ 2016-08-22 18:16 UTC (permalink / raw)
  To: Michal Hocko; +Cc: linux-kernel, Mel Gorman, linux-mm, Johannes Weiner

Hi Michal.

Thank you for the reply.

It looks like the root cause of the problems we are facing is a bit
different, although the ultimate effect is similar - bad swapping
effectiveness.

As far as I could understand, Johannes is trying to fix the balancing
between the anon and file lists. But in my case it looks like the anon
pages which have been idle for a long time and could be swapped out
are all just sitting in the active list. They never move to the
inactive list, so they never get a chance to be scanned and eventually
swapped out. (See the /proc/vmstat samples and explanations in my
previous mail. BTW, the sampling interval there is 10 secs, not 5. My
typo.)

It looks like in my case the system load enters a steady state in
which all the scanned pages from the inactive list become referenced
again very soon. So kswapd scans aggressively, but mostly the inactive
list, where it can hardly find anything to reclaim. So the inactive
list is not shortened and, as a result, is not refilled from the active
one. That's why the anon pages from the active list do not even get a
chance to be scanned. Note: the zone's inactive_ratio is more than 10
on 64GB RAM systems, so the inactive list is much smaller than the
active one in my case.

  Andriy
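
To put a number on the inactive_ratio remark above: as far as I recall, kernels of that era derive the per-zone ratio as int_sqrt(10 * zone size in GB), falling back to 1 for zones smaller than 1GB. Treat that formula as an assumption to check against the 3.10 source rather than as established fact; the function name below is illustrative only. The two counter values are the first nr_active_anon and nr_inactive_anon samples from the earlier mail. They are system-wide, so they are only roughly comparable to a per-zone ratio.

```python
import math


def inactive_ratio(zone_gb: int) -> int:
    """Assumed formula: int_sqrt(10 * zone size in GB), or 1 below 1GB."""
    return math.isqrt(10 * zone_gb) if zone_gb > 0 else 1


# First samples from the quoted /proc/vmstat output (system-wide counters).
NR_ACTIVE_ANON = 14508616
NR_INACTIVE_ANON = 747407

if __name__ == "__main__":
    print("assumed target ratio for a 64GB zone:", inactive_ratio(64))
    observed = NR_ACTIVE_ANON / NR_INACTIVE_ANON
    print(f"observed active/inactive anon ratio: {observed:.1f}")
```

Under that assumption a 64GB zone would get a target ratio of 25, and the sampled lists sit at roughly 19:1. Both are consistent with "more than 10". The real Normal zone is somewhat smaller than the full 64GB, so the exact target may differ slightly.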

