From: Jia He <hejianet@gmail.com>
To: linux-mm@kvack.org
Cc: linux-kernel@vger.kernel.org,
Andrew Morton <akpm@linux-foundation.org>,
Johannes Weiner <hannes@cmpxchg.org>,
Mel Gorman <mgorman@techsingularity.net>,
Vlastimil Babka <vbabka@suse.cz>, Michal Hocko <mhocko@suse.com>,
Minchan Kim <minchan@kernel.org>, Rik van Riel <riel@redhat.com>,
Jia He <hejianet@gmail.com>
Subject: [RFC PATCH] mm/vmscan: fix high cpu usage of kswapd if there
Date: Wed, 22 Feb 2017 17:04:48 +0800
Message-ID: <1487754288-5149-1-git-send-email-hejianet@gmail.com>
When I try to dynamically allocate more hugepages than the system has
free memory for, e.g.:
echo 4000 >/proc/sys/vm/nr_hugepages
kswapd takes 100% CPU for a long time (more than 3 hours, with no sign
of finishing).
Output of top:
top - 13:42:59 up 3:37, 1 user, load average: 1.09, 1.03, 1.01
Tasks: 1 total, 1 running, 0 sleeping, 0 stopped, 0 zombie
%Cpu(s): 0.0 us, 12.5 sy, 0.0 ni, 85.5 id, 2.0 wa, 0.0 hi, 0.0 si, 0.0 st
KiB Mem: 31371520 total, 30915136 used, 456384 free, 320 buffers
KiB Swap: 6284224 total, 115712 used, 6168512 free. 48192 cached Mem
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
76 root 20 0 0 0 0 R 100.0 0.000 217:17.29 kswapd3
The root cause is that kswapd3 tries to reclaim again and again but
makes no progress.
# numactl -H
available: 3 nodes (0,2-3)
node 0 cpus:
node 0 size: 0 MB
node 0 free: 0 MB
node 2 cpus: 0 1 2 3 4 5 6 7
node 2 size: 15299 MB
node 2 free: 289 MB
node 3 cpus:
node 3 size: 15336 MB
node 3 free: 184 MB <--- kswapd works
node distances:
node 0 2 3
0: 10 40 40
2: 40 10 20
3: 40 20 10
At that time, there are no reclaimable pages in that node:
Node 3, zone DMA
per-node stats
nr_inactive_anon 0
nr_active_anon 0
nr_inactive_file 0
nr_active_file 0
nr_unevictable 0
nr_isolated_anon 0
nr_isolated_file 0
nr_pages_scanned 0
workingset_refault 0
workingset_activate 0
workingset_nodereclaim 0
nr_anon_pages 0
nr_mapped 0
nr_file_pages 0
nr_dirty 0
nr_writeback 0
nr_writeback_temp 0
nr_shmem 0
nr_shmem_hugepages 0
nr_shmem_pmdmapped 0
nr_anon_transparent_hugepages 0
nr_unstable 0
nr_vmscan_write 0
nr_vmscan_immediate_reclaim 0
nr_dirtied 0
nr_written 0
pages free 2951
min 2821
low 3526
high 4231
node_scanned 0
spanned 245760
present 245760
managed 245388
nr_free_pages 2951
nr_zone_inactive_anon 0
nr_zone_active_anon 0
nr_zone_inactive_file 0
nr_zone_active_file 0
nr_zone_unevictable 0
nr_zone_write_pending 0
nr_mlock 0
nr_slab_reclaimable 46
nr_slab_unreclaimable 90
nr_page_table_pages 0
nr_kernel_stack 0
nr_bounce 0
nr_zspages 0
numa_hit 2257
numa_miss 0
numa_foreign 0
numa_interleave 982
numa_local 0
numa_other 2257
nr_free_cma 0
protection: (0, 0, 0, 0)
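To make the "no reclaimable pages" reading of the counters above concrete, here is a minimal userspace sketch. It assumes reclaimable pages are counted roughly the way zone_reclaimable_pages() counts them: the file LRU lists always, the anon LRU lists only when swap space is available. The struct and function names (zone_lru_counts, model_reclaimable_pages) are hypothetical and exist only for illustration; this is not kernel code.

```c
#include <stdbool.h>

/* Hypothetical container for the four per-zone LRU counters shown above. */
struct zone_lru_counts {
	unsigned long inactive_anon;	/* nr_zone_inactive_anon */
	unsigned long active_anon;	/* nr_zone_active_anon */
	unsigned long inactive_file;	/* nr_zone_inactive_file */
	unsigned long active_file;	/* nr_zone_active_file */
};

/*
 * Assumed model: file pages always count as reclaimable, anon pages
 * count only if there is swap space to move them to.
 */
static unsigned long model_reclaimable_pages(const struct zone_lru_counts *c,
					     bool swap_available)
{
	unsigned long nr = c->inactive_file + c->active_file;

	if (swap_available)
		nr += c->inactive_anon + c->active_anon;
	return nr;
}
```

With all four counters at 0, as in the dump for Node 3, the model returns 0 whether or not swap is available, so reclaim has nothing to work on even though free (2951) is below the high watermark (4231).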
This patch addresses the issue in two ways:
1. In prepare_kswapd_sleep(), kswapd skips sleeping and goes on to
reclaim only when a zone is not balanced and that zone has reclaimable
pages.
2. Don't wake up kswapd if there are no reclaimable pages in that node.
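The two rules above can be sketched as a small userspace model. This is only an illustration under simplifying assumptions: "balanced" is reduced to free >= high watermark (the real zone_balanced() also takes order and classzone_idx into account), and struct zone_model and the model_* functions are hypothetical names, not kernel interfaces.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for the kernel's struct zone. */
struct zone_model {
	unsigned long managed;		/* pages managed by the allocator */
	unsigned long free;		/* nr_free_pages */
	unsigned long high_wmark;	/* "high" watermark */
	unsigned long reclaimable;	/* pages that reclaim could free */
};

/* Simplifying assumption: balanced means free pages reach the high watermark. */
static bool model_zone_balanced(const struct zone_model *z)
{
	return z->free >= z->high_wmark;
}

/* Rule 1: kswapd stays awake only for a zone that is unbalanced AND reclaimable. */
static bool model_prepare_kswapd_sleep(const struct zone_model *zones, size_t n)
{
	for (size_t i = 0; i < n; i++) {
		if (!zones[i].managed)
			continue;
		if (!model_zone_balanced(&zones[i]) && zones[i].reclaimable)
			return false;	/* there is work to do, do not sleep */
	}
	return true;
}

/* Rule 2: do not wake kswapd when no zone in the node has reclaimable pages. */
static bool model_should_wake_kswapd(const struct zone_model *zones, size_t n)
{
	bool has_reclaimable = false;

	for (size_t i = 0; i < n; i++) {
		if (!zones[i].managed)
			continue;
		if (model_zone_balanced(&zones[i]))
			return false;	/* a balanced zone means no wakeup is needed */
		if (zones[i].reclaimable)
			has_reclaimable = true;
	}
	return has_reclaimable;
}
```

Feeding in the Node 3 values from the zoneinfo dump (managed 245388, free 2951, high 4231, reclaimable 0), the model lets kswapd sleep and suppresses the wakeup, which is the behaviour this patch is aiming for.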
After this patch:
top - 07:13:40 up 28 min, 1 user, load average: 0.00, 0.00, 0.00
Tasks: 1 total, 0 running, 1 sleeping, 0 stopped, 0 zombie
%Cpu(s): 0.0 us, 0.0 sy, 0.0 ni, 99.9 id, 0.1 wa, 0.0 hi, 0.0 si, 0.0 st
KiB Mem: 31371520 total, 30908096 used, 463424 free, 384 buffers
KiB Swap: 6284224 total, 77504 used, 6206720 free. 131328 cached Mem
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
77 root 20 0 0 0 0 S 0.000 0.000 0:00.00 kswapd3
Signed-off-by: Jia He <hejianet@gmail.com>
---
mm/vmscan.c | 11 ++++++++++-
1 file changed, 10 insertions(+), 1 deletion(-)
diff --git a/mm/vmscan.c b/mm/vmscan.c
index 532a2a7..a05e3ab 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -3139,7 +3139,8 @@ static bool prepare_kswapd_sleep(pg_data_t *pgdat, int order, int classzone_idx)
if (!managed_zone(zone))
continue;
- if (!zone_balanced(zone, order, classzone_idx))
+ if (!zone_balanced(zone, order, classzone_idx)
+ && zone_reclaimable_pages(zone))
return false;
}
@@ -3502,6 +3503,7 @@ void wakeup_kswapd(struct zone *zone, int order, enum zone_type classzone_idx)
{
pg_data_t *pgdat;
int z;
+ int node_has_relaimable_pages = 0;
if (!managed_zone(zone))
return;
@@ -3522,8 +3524,15 @@ void wakeup_kswapd(struct zone *zone, int order, enum zone_type classzone_idx)
if (zone_balanced(zone, order, classzone_idx))
return;
+
+ if (zone_reclaimable_pages(zone))
+ node_has_relaimable_pages = 1;
}
+ /* Don't wake kswapd if no reclaimable pages */
+ if (!node_has_relaimable_pages)
+ return;
+
trace_mm_vmscan_wakeup_kswapd(pgdat->node_id, zone_idx(zone), order);
wake_up_interruptible(&pgdat->kswapd_wait);
}
--
1.8.5.6