* [PATCH v2 0/3] improvements about lowmem_reserve and /proc/zoneinfo
@ 2020-04-02 14:01 Baoquan He
2020-04-02 14:01 ` [PATCH v2 1/3] mm/page_alloc.c: only tune sysctl_lowmem_reserve_ratio value once when changing it Baoquan He
` (2 more replies)
0 siblings, 3 replies; 4+ messages in thread
From: Baoquan He @ 2020-04-02 14:01 UTC (permalink / raw)
To: linux-kernel
Cc: linux-mm, akpm, iamjoonsoo.kim, mhocko, bhe, mgorman, rientjes
In this version, I have dropped patches 4 and 5 from v1, since David
and Michal were worried that moving the per-node stats to the front of
/proc/zoneinfo could break existing user space scripts. Patches 1~3 are
unchanged; no risk has been found in them so far, so they are kept and
reposted as is.
The v1 thread can be found here:
https://lore.kernel.org/linux-mm/20200324142229.12028-1-bhe@redhat.com/
Baoquan He (3):
mm/page_alloc.c: only tune sysctl_lowmem_reserve_ratio value once when
changing it
mm/page_alloc.c: clear out zone->lowmem_reserve[] if the zone is empty
mm/vmstat.c: do not show lowmem reserve protection information of
empty zone
mm/page_alloc.c | 13 +++++++++++--
mm/vmstat.c | 12 ++++++------
2 files changed, 17 insertions(+), 8 deletions(-)
--
2.17.2
* [PATCH v2 1/3] mm/page_alloc.c: only tune sysctl_lowmem_reserve_ratio value once when changing it
From: Baoquan He @ 2020-04-02 14:01 UTC (permalink / raw)
To: linux-kernel
Cc: linux-mm, akpm, iamjoonsoo.kim, mhocko, bhe, mgorman, rientjes
When people write to /proc/sys/vm/lowmem_reserve_ratio to change
sysctl_lowmem_reserve_ratio[], setup_per_zone_lowmem_reserve()
is called to recalculate ->lowmem_reserve[] for every zone of every
node, as below:
static void setup_per_zone_lowmem_reserve(void)
{
	...
	for_each_online_pgdat(pgdat) {
		for (j = 0; j < MAX_NR_ZONES; j++) {
			...
			while (idx) {
				...
				if (sysctl_lowmem_reserve_ratio[idx] < 1) {
					sysctl_lowmem_reserve_ratio[idx] = 0;
					lower_zone->lowmem_reserve[j] = 0;
				} else {
					...
				}
			}
		}
	}
}
Here, sysctl_lowmem_reserve_ratio[idx] is tuned (reset to 0) if its
value is smaller than 1. However, sysctl_lowmem_reserve_ratio[] is
indexed by zone type only, regardless of which node a zone belongs to.
That means the same tuning is repeated for every node, even though it
has already been done while handling the first node.
The tuning is also performed when init_per_zone_wmark_min() calls
setup_per_zone_lowmem_reserve(), a path where nobody is trying to
change sysctl_lowmem_reserve_ratio[].
So move the tuning into lowmem_reserve_ratio_sysctl_handler(), so that
it is done only once, when the value is actually written.
Signed-off-by: Baoquan He <bhe@redhat.com>
---
mm/page_alloc.c | 11 +++++++++--
1 file changed, 9 insertions(+), 2 deletions(-)
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index ca1453204e66..c0c788798d8b 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -7840,8 +7840,7 @@ static void setup_per_zone_lowmem_reserve(void)
idx--;
lower_zone = pgdat->node_zones + idx;
- if (sysctl_lowmem_reserve_ratio[idx] < 1) {
- sysctl_lowmem_reserve_ratio[idx] = 0;
+ if (!sysctl_lowmem_reserve_ratio[idx]) {
lower_zone->lowmem_reserve[j] = 0;
} else {
lower_zone->lowmem_reserve[j] =
@@ -8106,7 +8105,15 @@ int sysctl_min_slab_ratio_sysctl_handler(struct ctl_table *table, int write,
int lowmem_reserve_ratio_sysctl_handler(struct ctl_table *table, int write,
void __user *buffer, size_t *length, loff_t *ppos)
{
+ int i;
+
proc_dointvec_minmax(table, write, buffer, length, ppos);
+
+ for (i = 0; i < MAX_NR_ZONES; i++) {
+ if (sysctl_lowmem_reserve_ratio[i] < 1)
+ sysctl_lowmem_reserve_ratio[i] = 0;
+ }
+
setup_per_zone_lowmem_reserve();
return 0;
}
--
2.17.2
* [PATCH v2 2/3] mm/page_alloc.c: clear out zone->lowmem_reserve[] if the zone is empty
From: Baoquan He @ 2020-04-02 14:01 UTC (permalink / raw)
To: linux-kernel
Cc: linux-mm, akpm, iamjoonsoo.kim, mhocko, bhe, mgorman, rientjes
When a memory allocation request cannot be satisfied from a specific
zone, the allocator falls back to a lower zone. In this case, the lower
zone's ->lowmem_reserve[] helps protect its own memory. The higher the
relevant ->lowmem_reserve[] entry is, the harder it is for requests
targeting the upper zone to get memory from this lower zone.
However, this protection mechanism should only apply to a populated
zone, not to an empty one. Filling ->lowmem_reserve[] for an empty zone
is unnecessary, and may mislead people into thinking it is valid data
for that zone. For example:
Node 2, zone      DMA
  pages free     0
        min      0
        low      0
        high     0
        spanned  0
        present  0
        managed  0
        protection: (0, 0, 1024, 1024)
Node 2, zone    DMA32
  pages free     0
        min      0
        low      0
        high     0
        spanned  0
        present  0
        managed  0
        protection: (0, 0, 1024, 1024)
Node 2, zone   Normal
  per-node stats
      nr_inactive_anon 0
      nr_active_anon 143
      nr_inactive_file 0
      nr_active_file 0
      nr_unevictable 0
      nr_slab_reclaimable 45
      nr_slab_unreclaimable 254
So clear out zone->lowmem_reserve[] if the zone is empty.
Signed-off-by: Baoquan He <bhe@redhat.com>
---
mm/page_alloc.c | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index c0c788798d8b..138a56c0f48f 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -7840,8 +7840,10 @@ static void setup_per_zone_lowmem_reserve(void)
idx--;
lower_zone = pgdat->node_zones + idx;
- if (!sysctl_lowmem_reserve_ratio[idx]) {
+ if (!sysctl_lowmem_reserve_ratio[idx] ||
+ !zone_managed_pages(lower_zone)) {
lower_zone->lowmem_reserve[j] = 0;
+ continue;
} else {
lower_zone->lowmem_reserve[j] =
managed_pages / sysctl_lowmem_reserve_ratio[idx];
--
2.17.2
* [PATCH v2 3/3] mm/vmstat.c: do not show lowmem reserve protection information of empty zone
From: Baoquan He @ 2020-04-02 14:01 UTC (permalink / raw)
To: linux-kernel
Cc: linux-mm, akpm, iamjoonsoo.kim, mhocko, bhe, mgorman, rientjes
The lowmem reserve protection of a zone tells us nothing if the zone is
empty; it only adds one more line to /proc/zoneinfo. So do not show it
for an empty zone.
Signed-off-by: Baoquan He <bhe@redhat.com>
---
mm/vmstat.c | 12 ++++++------
1 file changed, 6 insertions(+), 6 deletions(-)
diff --git a/mm/vmstat.c b/mm/vmstat.c
index 96d21a792b57..6fd1407f4632 100644
--- a/mm/vmstat.c
+++ b/mm/vmstat.c
@@ -1590,6 +1590,12 @@ static void zoneinfo_show_print(struct seq_file *m, pg_data_t *pgdat,
zone->present_pages,
zone_managed_pages(zone));
+ /* If unpopulated, no other information is useful */
+ if (!populated_zone(zone)) {
+ seq_putc(m, '\n');
+ return;
+ }
+
seq_printf(m,
"\n protection: (%ld",
zone->lowmem_reserve[0]);
@@ -1597,12 +1603,6 @@ static void zoneinfo_show_print(struct seq_file *m, pg_data_t *pgdat,
seq_printf(m, ", %ld", zone->lowmem_reserve[i]);
seq_putc(m, ')');
- /* If unpopulated, no other information is useful */
- if (!populated_zone(zone)) {
- seq_putc(m, '\n');
- return;
- }
-
for (i = 0; i < NR_VM_ZONE_STAT_ITEMS; i++)
seq_printf(m, "\n %-12s %lu", zone_stat_name(i),
zone_page_state(zone, i));
--
2.17.2