All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH] mm/page_alloc: don't reserve ZONE_HIGHMEM for ZONE_MOVABLE request
@ 2017-09-06  4:35 ` js1304
  0 siblings, 0 replies; 44+ messages in thread
From: js1304 @ 2017-09-06  4:35 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Michal Hocko, Vlastimil Babka, Mel Gorman, Johannes Weiner,
	Aneesh Kumar K . V, Minchan Kim, linux-mm, linux-kernel,
	Joonsoo Kim

From: Joonsoo Kim <iamjoonsoo.kim@lge.com>

Freepage on ZONE_HIGHMEM doesn't work for kernel memory so it's not that
important to reserve. When ZONE_MOVABLE is used, this problem would
theorectically cause to decrease usable memory for GFP_HIGHUSER_MOVABLE
allocation request which is mainly used for page cache and anon page
allocation. So, fix it by setting 0 to
sysctl_lowmem_reserve_ratio[ZONE_HIGHMEM].

And, defining sysctl_lowmem_reserve_ratio array by MAX_NR_ZONES - 1 size
makes code complex. For example, if there is highmem system, following
reserve ratio is activated for *NORMAL ZONE* which would be easyily
misleading people.

 #ifdef CONFIG_HIGHMEM
 32
 #endif

This patch also fix this situation by defining sysctl_lowmem_reserve_ratio
array by MAX_NR_ZONES and place "#ifdef" to right place.

Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com>
---
 Documentation/sysctl/vm.txt |  5 ++---
 include/linux/mmzone.h      |  2 +-
 mm/page_alloc.c             | 25 ++++++++++++++-----------
 3 files changed, 17 insertions(+), 15 deletions(-)

diff --git a/Documentation/sysctl/vm.txt b/Documentation/sysctl/vm.txt
index 9baf66a..e9059d3 100644
--- a/Documentation/sysctl/vm.txt
+++ b/Documentation/sysctl/vm.txt
@@ -336,8 +336,6 @@ The lowmem_reserve_ratio is an array. You can see them by reading this file.
 % cat /proc/sys/vm/lowmem_reserve_ratio
 256     256     32
 -
-Note: # of this elements is one fewer than number of zones. Because the highest
-      zone's value is not necessary for following calculation.
 
 But, these values are not used directly. The kernel calculates # of protection
 pages for each zones from them. These are shown as array of protection pages
@@ -388,7 +386,8 @@ As above expression, they are reciprocal number of ratio.
 pages of higher zones on the node.
 
 If you would like to protect more pages, smaller values are effective.
-The minimum value is 1 (1/1 -> 100%).
+The minimum value is 1 (1/1 -> 100%). The value less than 1 completely
+disables protection of the pages.
 
 ==============================================================
 
diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
index 356a814..d549c4e 100644
--- a/include/linux/mmzone.h
+++ b/include/linux/mmzone.h
@@ -890,7 +890,7 @@ int min_free_kbytes_sysctl_handler(struct ctl_table *, int,
 					void __user *, size_t *, loff_t *);
 int watermark_scale_factor_sysctl_handler(struct ctl_table *, int,
 					void __user *, size_t *, loff_t *);
-extern int sysctl_lowmem_reserve_ratio[MAX_NR_ZONES-1];
+extern int sysctl_lowmem_reserve_ratio[MAX_NR_ZONES];
 int lowmem_reserve_ratio_sysctl_handler(struct ctl_table *, int,
 					void __user *, size_t *, loff_t *);
 int percpu_pagelist_fraction_sysctl_handler(struct ctl_table *, int,
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 0f34356..2a7f7e9 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -203,17 +203,18 @@ static void __free_pages_ok(struct page *page, unsigned int order);
  * TBD: should special case ZONE_DMA32 machines here - in those we normally
  * don't need any ZONE_NORMAL reservation
  */
-int sysctl_lowmem_reserve_ratio[MAX_NR_ZONES-1] = {
+int sysctl_lowmem_reserve_ratio[MAX_NR_ZONES] = {
 #ifdef CONFIG_ZONE_DMA
-	 256,
+	[ZONE_DMA] = 256,
 #endif
 #ifdef CONFIG_ZONE_DMA32
-	 256,
+	[ZONE_DMA32] = 256,
 #endif
+	[ZONE_NORMAL] = 32,
 #ifdef CONFIG_HIGHMEM
-	 32,
+	[ZONE_HIGHMEM] = 0,
 #endif
-	 32,
+	[ZONE_MOVABLE] = 0,
 };
 
 EXPORT_SYMBOL(totalram_pages);
@@ -6921,13 +6922,15 @@ static void setup_per_zone_lowmem_reserve(void)
 				struct zone *lower_zone;
 
 				idx--;
-
-				if (sysctl_lowmem_reserve_ratio[idx] < 1)
-					sysctl_lowmem_reserve_ratio[idx] = 1;
-
 				lower_zone = pgdat->node_zones + idx;
-				lower_zone->lowmem_reserve[j] = managed_pages /
-					sysctl_lowmem_reserve_ratio[idx];
+
+				if (sysctl_lowmem_reserve_ratio[idx] < 1) {
+					sysctl_lowmem_reserve_ratio[idx] = 0;
+					lower_zone->lowmem_reserve[j] = 0;
+				} else {
+					lower_zone->lowmem_reserve[j] =
+						managed_pages / sysctl_lowmem_reserve_ratio[idx];
+				}
 				managed_pages += lower_zone->managed_pages;
 			}
 		}
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 44+ messages in thread

* [PATCH] mm/page_alloc: don't reserve ZONE_HIGHMEM for ZONE_MOVABLE request
@ 2017-09-06  4:35 ` js1304
  0 siblings, 0 replies; 44+ messages in thread
From: js1304 @ 2017-09-06  4:35 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Michal Hocko, Vlastimil Babka, Mel Gorman, Johannes Weiner,
	Aneesh Kumar K . V, Minchan Kim, linux-mm, linux-kernel,
	Joonsoo Kim

From: Joonsoo Kim <iamjoonsoo.kim@lge.com>

Freepage on ZONE_HIGHMEM doesn't work for kernel memory so it's not that
important to reserve. When ZONE_MOVABLE is used, this problem would
theorectically cause to decrease usable memory for GFP_HIGHUSER_MOVABLE
allocation request which is mainly used for page cache and anon page
allocation. So, fix it by setting 0 to
sysctl_lowmem_reserve_ratio[ZONE_HIGHMEM].

And, defining sysctl_lowmem_reserve_ratio array by MAX_NR_ZONES - 1 size
makes code complex. For example, if there is highmem system, following
reserve ratio is activated for *NORMAL ZONE* which would be easyily
misleading people.

 #ifdef CONFIG_HIGHMEM
 32
 #endif

This patch also fix this situation by defining sysctl_lowmem_reserve_ratio
array by MAX_NR_ZONES and place "#ifdef" to right place.

Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com>
---
 Documentation/sysctl/vm.txt |  5 ++---
 include/linux/mmzone.h      |  2 +-
 mm/page_alloc.c             | 25 ++++++++++++++-----------
 3 files changed, 17 insertions(+), 15 deletions(-)

diff --git a/Documentation/sysctl/vm.txt b/Documentation/sysctl/vm.txt
index 9baf66a..e9059d3 100644
--- a/Documentation/sysctl/vm.txt
+++ b/Documentation/sysctl/vm.txt
@@ -336,8 +336,6 @@ The lowmem_reserve_ratio is an array. You can see them by reading this file.
 % cat /proc/sys/vm/lowmem_reserve_ratio
 256     256     32
 -
-Note: # of this elements is one fewer than number of zones. Because the highest
-      zone's value is not necessary for following calculation.
 
 But, these values are not used directly. The kernel calculates # of protection
 pages for each zones from them. These are shown as array of protection pages
@@ -388,7 +386,8 @@ As above expression, they are reciprocal number of ratio.
 pages of higher zones on the node.
 
 If you would like to protect more pages, smaller values are effective.
-The minimum value is 1 (1/1 -> 100%).
+The minimum value is 1 (1/1 -> 100%). The value less than 1 completely
+disables protection of the pages.
 
 ==============================================================
 
diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
index 356a814..d549c4e 100644
--- a/include/linux/mmzone.h
+++ b/include/linux/mmzone.h
@@ -890,7 +890,7 @@ int min_free_kbytes_sysctl_handler(struct ctl_table *, int,
 					void __user *, size_t *, loff_t *);
 int watermark_scale_factor_sysctl_handler(struct ctl_table *, int,
 					void __user *, size_t *, loff_t *);
-extern int sysctl_lowmem_reserve_ratio[MAX_NR_ZONES-1];
+extern int sysctl_lowmem_reserve_ratio[MAX_NR_ZONES];
 int lowmem_reserve_ratio_sysctl_handler(struct ctl_table *, int,
 					void __user *, size_t *, loff_t *);
 int percpu_pagelist_fraction_sysctl_handler(struct ctl_table *, int,
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 0f34356..2a7f7e9 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -203,17 +203,18 @@ static void __free_pages_ok(struct page *page, unsigned int order);
  * TBD: should special case ZONE_DMA32 machines here - in those we normally
  * don't need any ZONE_NORMAL reservation
  */
-int sysctl_lowmem_reserve_ratio[MAX_NR_ZONES-1] = {
+int sysctl_lowmem_reserve_ratio[MAX_NR_ZONES] = {
 #ifdef CONFIG_ZONE_DMA
-	 256,
+	[ZONE_DMA] = 256,
 #endif
 #ifdef CONFIG_ZONE_DMA32
-	 256,
+	[ZONE_DMA32] = 256,
 #endif
+	[ZONE_NORMAL] = 32,
 #ifdef CONFIG_HIGHMEM
-	 32,
+	[ZONE_HIGHMEM] = 0,
 #endif
-	 32,
+	[ZONE_MOVABLE] = 0,
 };
 
 EXPORT_SYMBOL(totalram_pages);
@@ -6921,13 +6922,15 @@ static void setup_per_zone_lowmem_reserve(void)
 				struct zone *lower_zone;
 
 				idx--;
-
-				if (sysctl_lowmem_reserve_ratio[idx] < 1)
-					sysctl_lowmem_reserve_ratio[idx] = 1;
-
 				lower_zone = pgdat->node_zones + idx;
-				lower_zone->lowmem_reserve[j] = managed_pages /
-					sysctl_lowmem_reserve_ratio[idx];
+
+				if (sysctl_lowmem_reserve_ratio[idx] < 1) {
+					sysctl_lowmem_reserve_ratio[idx] = 0;
+					lower_zone->lowmem_reserve[j] = 0;
+				} else {
+					lower_zone->lowmem_reserve[j] =
+						managed_pages / sysctl_lowmem_reserve_ratio[idx];
+				}
 				managed_pages += lower_zone->managed_pages;
 			}
 		}
-- 
2.7.4

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply related	[flat|nested] 44+ messages in thread

* Re: [PATCH] mm/page_alloc: don't reserve ZONE_HIGHMEM for ZONE_MOVABLE request
  2017-09-06  4:35 ` js1304
@ 2017-09-06  7:54   ` Vlastimil Babka
  -1 siblings, 0 replies; 44+ messages in thread
From: Vlastimil Babka @ 2017-09-06  7:54 UTC (permalink / raw)
  To: js1304, Andrew Morton
  Cc: Michal Hocko, Mel Gorman, Johannes Weiner, Aneesh Kumar K . V,
	Minchan Kim, linux-mm, linux-kernel, Joonsoo Kim, Linux API

+CC linux-api

On 09/06/2017 06:35 AM, js1304@gmail.com wrote:
> From: Joonsoo Kim <iamjoonsoo.kim@lge.com>
> 
> Freepage on ZONE_HIGHMEM doesn't work for kernel memory so it's not that
> important to reserve. When ZONE_MOVABLE is used, this problem would
> theorectically cause to decrease usable memory for GFP_HIGHUSER_MOVABLE
> allocation request which is mainly used for page cache and anon page
> allocation. So, fix it by setting 0 to
> sysctl_lowmem_reserve_ratio[ZONE_HIGHMEM].
> 
> And, defining sysctl_lowmem_reserve_ratio array by MAX_NR_ZONES - 1 size
> makes code complex. For example, if there is highmem system, following
> reserve ratio is activated for *NORMAL ZONE* which would be easyily
> misleading people.
> 
>  #ifdef CONFIG_HIGHMEM
>  32
>  #endif
> 
> This patch also fix this situation by defining sysctl_lowmem_reserve_ratio
> array by MAX_NR_ZONES and place "#ifdef" to right place.
> 
> Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
> Acked-by: Vlastimil Babka <vbabka@suse.cz>
> Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com>
> ---
>  Documentation/sysctl/vm.txt |  5 ++---
>  include/linux/mmzone.h      |  2 +-
>  mm/page_alloc.c             | 25 ++++++++++++++-----------
>  3 files changed, 17 insertions(+), 15 deletions(-)
> 
> diff --git a/Documentation/sysctl/vm.txt b/Documentation/sysctl/vm.txt
> index 9baf66a..e9059d3 100644
> --- a/Documentation/sysctl/vm.txt
> +++ b/Documentation/sysctl/vm.txt
> @@ -336,8 +336,6 @@ The lowmem_reserve_ratio is an array. You can see them by reading this file.
>  % cat /proc/sys/vm/lowmem_reserve_ratio
>  256     256     32
>  -
> -Note: # of this elements is one fewer than number of zones. Because the highest
> -      zone's value is not necessary for following calculation.
>  
>  But, these values are not used directly. The kernel calculates # of protection
>  pages for each zones from them. These are shown as array of protection pages
> @@ -388,7 +386,8 @@ As above expression, they are reciprocal number of ratio.
>  pages of higher zones on the node.
>  
>  If you would like to protect more pages, smaller values are effective.
> -The minimum value is 1 (1/1 -> 100%).
> +The minimum value is 1 (1/1 -> 100%). The value less than 1 completely
> +disables protection of the pages.
>  
>  ==============================================================
>  
> diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
> index 356a814..d549c4e 100644
> --- a/include/linux/mmzone.h
> +++ b/include/linux/mmzone.h
> @@ -890,7 +890,7 @@ int min_free_kbytes_sysctl_handler(struct ctl_table *, int,
>  					void __user *, size_t *, loff_t *);
>  int watermark_scale_factor_sysctl_handler(struct ctl_table *, int,
>  					void __user *, size_t *, loff_t *);
> -extern int sysctl_lowmem_reserve_ratio[MAX_NR_ZONES-1];
> +extern int sysctl_lowmem_reserve_ratio[MAX_NR_ZONES];
>  int lowmem_reserve_ratio_sysctl_handler(struct ctl_table *, int,
>  					void __user *, size_t *, loff_t *);
>  int percpu_pagelist_fraction_sysctl_handler(struct ctl_table *, int,
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index 0f34356..2a7f7e9 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -203,17 +203,18 @@ static void __free_pages_ok(struct page *page, unsigned int order);
>   * TBD: should special case ZONE_DMA32 machines here - in those we normally
>   * don't need any ZONE_NORMAL reservation
>   */
> -int sysctl_lowmem_reserve_ratio[MAX_NR_ZONES-1] = {
> +int sysctl_lowmem_reserve_ratio[MAX_NR_ZONES] = {
>  #ifdef CONFIG_ZONE_DMA
> -	 256,
> +	[ZONE_DMA] = 256,
>  #endif
>  #ifdef CONFIG_ZONE_DMA32
> -	 256,
> +	[ZONE_DMA32] = 256,
>  #endif
> +	[ZONE_NORMAL] = 32,
>  #ifdef CONFIG_HIGHMEM
> -	 32,
> +	[ZONE_HIGHMEM] = 0,
>  #endif
> -	 32,
> +	[ZONE_MOVABLE] = 0,
>  };
>  
>  EXPORT_SYMBOL(totalram_pages);
> @@ -6921,13 +6922,15 @@ static void setup_per_zone_lowmem_reserve(void)
>  				struct zone *lower_zone;
>  
>  				idx--;
> -
> -				if (sysctl_lowmem_reserve_ratio[idx] < 1)
> -					sysctl_lowmem_reserve_ratio[idx] = 1;
> -
>  				lower_zone = pgdat->node_zones + idx;
> -				lower_zone->lowmem_reserve[j] = managed_pages /
> -					sysctl_lowmem_reserve_ratio[idx];
> +
> +				if (sysctl_lowmem_reserve_ratio[idx] < 1) {
> +					sysctl_lowmem_reserve_ratio[idx] = 0;
> +					lower_zone->lowmem_reserve[j] = 0;
> +				} else {
> +					lower_zone->lowmem_reserve[j] =
> +						managed_pages / sysctl_lowmem_reserve_ratio[idx];
> +				}
>  				managed_pages += lower_zone->managed_pages;
>  			}
>  		}
> 

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH] mm/page_alloc: don't reserve ZONE_HIGHMEM for ZONE_MOVABLE request
@ 2017-09-06  7:54   ` Vlastimil Babka
  0 siblings, 0 replies; 44+ messages in thread
From: Vlastimil Babka @ 2017-09-06  7:54 UTC (permalink / raw)
  To: js1304, Andrew Morton
  Cc: Michal Hocko, Mel Gorman, Johannes Weiner, Aneesh Kumar K . V,
	Minchan Kim, linux-mm, linux-kernel, Joonsoo Kim, Linux API

+CC linux-api

On 09/06/2017 06:35 AM, js1304@gmail.com wrote:
> From: Joonsoo Kim <iamjoonsoo.kim@lge.com>
> 
> Freepage on ZONE_HIGHMEM doesn't work for kernel memory so it's not that
> important to reserve. When ZONE_MOVABLE is used, this problem would
> theorectically cause to decrease usable memory for GFP_HIGHUSER_MOVABLE
> allocation request which is mainly used for page cache and anon page
> allocation. So, fix it by setting 0 to
> sysctl_lowmem_reserve_ratio[ZONE_HIGHMEM].
> 
> And, defining sysctl_lowmem_reserve_ratio array by MAX_NR_ZONES - 1 size
> makes code complex. For example, if there is highmem system, following
> reserve ratio is activated for *NORMAL ZONE* which would be easyily
> misleading people.
> 
>  #ifdef CONFIG_HIGHMEM
>  32
>  #endif
> 
> This patch also fix this situation by defining sysctl_lowmem_reserve_ratio
> array by MAX_NR_ZONES and place "#ifdef" to right place.
> 
> Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
> Acked-by: Vlastimil Babka <vbabka@suse.cz>
> Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com>
> ---
>  Documentation/sysctl/vm.txt |  5 ++---
>  include/linux/mmzone.h      |  2 +-
>  mm/page_alloc.c             | 25 ++++++++++++++-----------
>  3 files changed, 17 insertions(+), 15 deletions(-)
> 
> diff --git a/Documentation/sysctl/vm.txt b/Documentation/sysctl/vm.txt
> index 9baf66a..e9059d3 100644
> --- a/Documentation/sysctl/vm.txt
> +++ b/Documentation/sysctl/vm.txt
> @@ -336,8 +336,6 @@ The lowmem_reserve_ratio is an array. You can see them by reading this file.
>  % cat /proc/sys/vm/lowmem_reserve_ratio
>  256     256     32
>  -
> -Note: # of this elements is one fewer than number of zones. Because the highest
> -      zone's value is not necessary for following calculation.
>  
>  But, these values are not used directly. The kernel calculates # of protection
>  pages for each zones from them. These are shown as array of protection pages
> @@ -388,7 +386,8 @@ As above expression, they are reciprocal number of ratio.
>  pages of higher zones on the node.
>  
>  If you would like to protect more pages, smaller values are effective.
> -The minimum value is 1 (1/1 -> 100%).
> +The minimum value is 1 (1/1 -> 100%). The value less than 1 completely
> +disables protection of the pages.
>  
>  ==============================================================
>  
> diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
> index 356a814..d549c4e 100644
> --- a/include/linux/mmzone.h
> +++ b/include/linux/mmzone.h
> @@ -890,7 +890,7 @@ int min_free_kbytes_sysctl_handler(struct ctl_table *, int,
>  					void __user *, size_t *, loff_t *);
>  int watermark_scale_factor_sysctl_handler(struct ctl_table *, int,
>  					void __user *, size_t *, loff_t *);
> -extern int sysctl_lowmem_reserve_ratio[MAX_NR_ZONES-1];
> +extern int sysctl_lowmem_reserve_ratio[MAX_NR_ZONES];
>  int lowmem_reserve_ratio_sysctl_handler(struct ctl_table *, int,
>  					void __user *, size_t *, loff_t *);
>  int percpu_pagelist_fraction_sysctl_handler(struct ctl_table *, int,
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index 0f34356..2a7f7e9 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -203,17 +203,18 @@ static void __free_pages_ok(struct page *page, unsigned int order);
>   * TBD: should special case ZONE_DMA32 machines here - in those we normally
>   * don't need any ZONE_NORMAL reservation
>   */
> -int sysctl_lowmem_reserve_ratio[MAX_NR_ZONES-1] = {
> +int sysctl_lowmem_reserve_ratio[MAX_NR_ZONES] = {
>  #ifdef CONFIG_ZONE_DMA
> -	 256,
> +	[ZONE_DMA] = 256,
>  #endif
>  #ifdef CONFIG_ZONE_DMA32
> -	 256,
> +	[ZONE_DMA32] = 256,
>  #endif
> +	[ZONE_NORMAL] = 32,
>  #ifdef CONFIG_HIGHMEM
> -	 32,
> +	[ZONE_HIGHMEM] = 0,
>  #endif
> -	 32,
> +	[ZONE_MOVABLE] = 0,
>  };
>  
>  EXPORT_SYMBOL(totalram_pages);
> @@ -6921,13 +6922,15 @@ static void setup_per_zone_lowmem_reserve(void)
>  				struct zone *lower_zone;
>  
>  				idx--;
> -
> -				if (sysctl_lowmem_reserve_ratio[idx] < 1)
> -					sysctl_lowmem_reserve_ratio[idx] = 1;
> -
>  				lower_zone = pgdat->node_zones + idx;
> -				lower_zone->lowmem_reserve[j] = managed_pages /
> -					sysctl_lowmem_reserve_ratio[idx];
> +
> +				if (sysctl_lowmem_reserve_ratio[idx] < 1) {
> +					sysctl_lowmem_reserve_ratio[idx] = 0;
> +					lower_zone->lowmem_reserve[j] = 0;
> +				} else {
> +					lower_zone->lowmem_reserve[j] =
> +						managed_pages / sysctl_lowmem_reserve_ratio[idx];
> +				}
>  				managed_pages += lower_zone->managed_pages;
>  			}
>  		}
> 

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH] mm/page_alloc: don't reserve ZONE_HIGHMEM for ZONE_MOVABLE request
  2017-09-06  4:35 ` js1304
@ 2017-09-14 13:24   ` Michal Hocko
  -1 siblings, 0 replies; 44+ messages in thread
From: Michal Hocko @ 2017-09-14 13:24 UTC (permalink / raw)
  To: js1304
  Cc: Andrew Morton, Vlastimil Babka, Mel Gorman, Johannes Weiner,
	Aneesh Kumar K . V, Minchan Kim, linux-mm, linux-kernel,
	Joonsoo Kim, linux-api

[Sorry for a later reply]

On Wed 06-09-17 13:35:25, Joonsoo Kim wrote:
> From: Joonsoo Kim <iamjoonsoo.kim@lge.com>
> 
> Freepage on ZONE_HIGHMEM doesn't work for kernel memory so it's not that
> important to reserve.

I am still not convinced this is a good idea. I do agree that reserving
memory in both HIGHMEM and MOVABLE is just wasting memory but removing
the reserve from the highmem as well will result that an oom victim will
allocate from lower zones and that might have unexpected side effects.

Can we simply leave HIGHMEM reserve and only remove it from the movable
zone if both are present?

> When ZONE_MOVABLE is used, this problem would
> theorectically cause to decrease usable memory for GFP_HIGHUSER_MOVABLE
> allocation request which is mainly used for page cache and anon page
> allocation. So, fix it by setting 0 to
> sysctl_lowmem_reserve_ratio[ZONE_HIGHMEM].
> 
> And, defining sysctl_lowmem_reserve_ratio array by MAX_NR_ZONES - 1 size
> makes code complex. For example, if there is highmem system, following
> reserve ratio is activated for *NORMAL ZONE* which would be easyily

s@easyily@easily@

> misleading people.
> 
>  #ifdef CONFIG_HIGHMEM
>  32
>  #endif
> 
> This patch also fix this situation by defining sysctl_lowmem_reserve_ratio
> array by MAX_NR_ZONES and place "#ifdef" to right place.

I would probably split this patch into two but this is up to you

> Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
> Acked-by: Vlastimil Babka <vbabka@suse.cz>
> Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com>
> ---
>  Documentation/sysctl/vm.txt |  5 ++---
>  include/linux/mmzone.h      |  2 +-
>  mm/page_alloc.c             | 25 ++++++++++++++-----------
>  3 files changed, 17 insertions(+), 15 deletions(-)
> 
> diff --git a/Documentation/sysctl/vm.txt b/Documentation/sysctl/vm.txt
> index 9baf66a..e9059d3 100644
> --- a/Documentation/sysctl/vm.txt
> +++ b/Documentation/sysctl/vm.txt
> @@ -336,8 +336,6 @@ The lowmem_reserve_ratio is an array. You can see them by reading this file.
>  % cat /proc/sys/vm/lowmem_reserve_ratio
>  256     256     32
>  -
> -Note: # of this elements is one fewer than number of zones. Because the highest
> -      zone's value is not necessary for following calculation.
>  
>  But, these values are not used directly. The kernel calculates # of protection
>  pages for each zones from them. These are shown as array of protection pages
> @@ -388,7 +386,8 @@ As above expression, they are reciprocal number of ratio.
>  pages of higher zones on the node.
>  
>  If you would like to protect more pages, smaller values are effective.
> -The minimum value is 1 (1/1 -> 100%).
> +The minimum value is 1 (1/1 -> 100%). The value less than 1 completely
> +disables protection of the pages.
>  
>  ==============================================================
>  
> diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
> index 356a814..d549c4e 100644
> --- a/include/linux/mmzone.h
> +++ b/include/linux/mmzone.h
> @@ -890,7 +890,7 @@ int min_free_kbytes_sysctl_handler(struct ctl_table *, int,
>  					void __user *, size_t *, loff_t *);
>  int watermark_scale_factor_sysctl_handler(struct ctl_table *, int,
>  					void __user *, size_t *, loff_t *);
> -extern int sysctl_lowmem_reserve_ratio[MAX_NR_ZONES-1];
> +extern int sysctl_lowmem_reserve_ratio[MAX_NR_ZONES];
>  int lowmem_reserve_ratio_sysctl_handler(struct ctl_table *, int,
>  					void __user *, size_t *, loff_t *);
>  int percpu_pagelist_fraction_sysctl_handler(struct ctl_table *, int,
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index 0f34356..2a7f7e9 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -203,17 +203,18 @@ static void __free_pages_ok(struct page *page, unsigned int order);
>   * TBD: should special case ZONE_DMA32 machines here - in those we normally
>   * don't need any ZONE_NORMAL reservation
>   */
> -int sysctl_lowmem_reserve_ratio[MAX_NR_ZONES-1] = {
> +int sysctl_lowmem_reserve_ratio[MAX_NR_ZONES] = {
>  #ifdef CONFIG_ZONE_DMA
> -	 256,
> +	[ZONE_DMA] = 256,
>  #endif
>  #ifdef CONFIG_ZONE_DMA32
> -	 256,
> +	[ZONE_DMA32] = 256,
>  #endif
> +	[ZONE_NORMAL] = 32,
>  #ifdef CONFIG_HIGHMEM
> -	 32,
> +	[ZONE_HIGHMEM] = 0,
>  #endif
> -	 32,
> +	[ZONE_MOVABLE] = 0,
>  };
>  
>  EXPORT_SYMBOL(totalram_pages);
> @@ -6921,13 +6922,15 @@ static void setup_per_zone_lowmem_reserve(void)
>  				struct zone *lower_zone;
>  
>  				idx--;
> -
> -				if (sysctl_lowmem_reserve_ratio[idx] < 1)
> -					sysctl_lowmem_reserve_ratio[idx] = 1;
> -
>  				lower_zone = pgdat->node_zones + idx;
> -				lower_zone->lowmem_reserve[j] = managed_pages /
> -					sysctl_lowmem_reserve_ratio[idx];
> +
> +				if (sysctl_lowmem_reserve_ratio[idx] < 1) {
> +					sysctl_lowmem_reserve_ratio[idx] = 0;
> +					lower_zone->lowmem_reserve[j] = 0;
> +				} else {
> +					lower_zone->lowmem_reserve[j] =
> +						managed_pages / sysctl_lowmem_reserve_ratio[idx];
> +				}
>  				managed_pages += lower_zone->managed_pages;
>  			}
>  		}
> -- 
> 2.7.4
> 

-- 
Michal Hocko
SUSE Labs

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH] mm/page_alloc: don't reserve ZONE_HIGHMEM for ZONE_MOVABLE request
@ 2017-09-14 13:24   ` Michal Hocko
  0 siblings, 0 replies; 44+ messages in thread
From: Michal Hocko @ 2017-09-14 13:24 UTC (permalink / raw)
  To: js1304
  Cc: Andrew Morton, Vlastimil Babka, Mel Gorman, Johannes Weiner,
	Aneesh Kumar K . V, Minchan Kim, linux-mm, linux-kernel,
	Joonsoo Kim, linux-api

[Sorry for a later reply]

On Wed 06-09-17 13:35:25, Joonsoo Kim wrote:
> From: Joonsoo Kim <iamjoonsoo.kim@lge.com>
> 
> Freepage on ZONE_HIGHMEM doesn't work for kernel memory so it's not that
> important to reserve.

I am still not convinced this is a good idea. I do agree that reserving
memory in both HIGHMEM and MOVABLE is just wasting memory but removing
the reserve from the highmem as well will result that an oom victim will
allocate from lower zones and that might have unexpected side effects.

Can we simply leave HIGHMEM reserve and only remove it from the movable
zone if both are present?

> When ZONE_MOVABLE is used, this problem would
> theorectically cause to decrease usable memory for GFP_HIGHUSER_MOVABLE
> allocation request which is mainly used for page cache and anon page
> allocation. So, fix it by setting 0 to
> sysctl_lowmem_reserve_ratio[ZONE_HIGHMEM].
> 
> And, defining sysctl_lowmem_reserve_ratio array by MAX_NR_ZONES - 1 size
> makes code complex. For example, if there is highmem system, following
> reserve ratio is activated for *NORMAL ZONE* which would be easyily

s@easyily@easily@

> misleading people.
> 
>  #ifdef CONFIG_HIGHMEM
>  32
>  #endif
> 
> This patch also fix this situation by defining sysctl_lowmem_reserve_ratio
> array by MAX_NR_ZONES and place "#ifdef" to right place.

I would probably split this patch into two but this is up to you

> Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
> Acked-by: Vlastimil Babka <vbabka@suse.cz>
> Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com>
> ---
>  Documentation/sysctl/vm.txt |  5 ++---
>  include/linux/mmzone.h      |  2 +-
>  mm/page_alloc.c             | 25 ++++++++++++++-----------
>  3 files changed, 17 insertions(+), 15 deletions(-)
> 
> diff --git a/Documentation/sysctl/vm.txt b/Documentation/sysctl/vm.txt
> index 9baf66a..e9059d3 100644
> --- a/Documentation/sysctl/vm.txt
> +++ b/Documentation/sysctl/vm.txt
> @@ -336,8 +336,6 @@ The lowmem_reserve_ratio is an array. You can see them by reading this file.
>  % cat /proc/sys/vm/lowmem_reserve_ratio
>  256     256     32
>  -
> -Note: # of this elements is one fewer than number of zones. Because the highest
> -      zone's value is not necessary for following calculation.
>  
>  But, these values are not used directly. The kernel calculates # of protection
>  pages for each zones from them. These are shown as array of protection pages
> @@ -388,7 +386,8 @@ As above expression, they are reciprocal number of ratio.
>  pages of higher zones on the node.
>  
>  If you would like to protect more pages, smaller values are effective.
> -The minimum value is 1 (1/1 -> 100%).
> +The minimum value is 1 (1/1 -> 100%). The value less than 1 completely
> +disables protection of the pages.
>  
>  ==============================================================
>  
> diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
> index 356a814..d549c4e 100644
> --- a/include/linux/mmzone.h
> +++ b/include/linux/mmzone.h
> @@ -890,7 +890,7 @@ int min_free_kbytes_sysctl_handler(struct ctl_table *, int,
>  					void __user *, size_t *, loff_t *);
>  int watermark_scale_factor_sysctl_handler(struct ctl_table *, int,
>  					void __user *, size_t *, loff_t *);
> -extern int sysctl_lowmem_reserve_ratio[MAX_NR_ZONES-1];
> +extern int sysctl_lowmem_reserve_ratio[MAX_NR_ZONES];
>  int lowmem_reserve_ratio_sysctl_handler(struct ctl_table *, int,
>  					void __user *, size_t *, loff_t *);
>  int percpu_pagelist_fraction_sysctl_handler(struct ctl_table *, int,
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index 0f34356..2a7f7e9 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -203,17 +203,18 @@ static void __free_pages_ok(struct page *page, unsigned int order);
>   * TBD: should special case ZONE_DMA32 machines here - in those we normally
>   * don't need any ZONE_NORMAL reservation
>   */
> -int sysctl_lowmem_reserve_ratio[MAX_NR_ZONES-1] = {
> +int sysctl_lowmem_reserve_ratio[MAX_NR_ZONES] = {
>  #ifdef CONFIG_ZONE_DMA
> -	 256,
> +	[ZONE_DMA] = 256,
>  #endif
>  #ifdef CONFIG_ZONE_DMA32
> -	 256,
> +	[ZONE_DMA32] = 256,
>  #endif
> +	[ZONE_NORMAL] = 32,
>  #ifdef CONFIG_HIGHMEM
> -	 32,
> +	[ZONE_HIGHMEM] = 0,
>  #endif
> -	 32,
> +	[ZONE_MOVABLE] = 0,
>  };
>  
>  EXPORT_SYMBOL(totalram_pages);
> @@ -6921,13 +6922,15 @@ static void setup_per_zone_lowmem_reserve(void)
>  				struct zone *lower_zone;
>  
>  				idx--;
> -
> -				if (sysctl_lowmem_reserve_ratio[idx] < 1)
> -					sysctl_lowmem_reserve_ratio[idx] = 1;
> -
>  				lower_zone = pgdat->node_zones + idx;
> -				lower_zone->lowmem_reserve[j] = managed_pages /
> -					sysctl_lowmem_reserve_ratio[idx];
> +
> +				if (sysctl_lowmem_reserve_ratio[idx] < 1) {
> +					sysctl_lowmem_reserve_ratio[idx] = 0;
> +					lower_zone->lowmem_reserve[j] = 0;
> +				} else {
> +					lower_zone->lowmem_reserve[j] =
> +						managed_pages / sysctl_lowmem_reserve_ratio[idx];
> +				}
>  				managed_pages += lower_zone->managed_pages;
>  			}
>  		}
> -- 
> 2.7.4
> 

-- 
Michal Hocko
SUSE Labs

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH] mm/page_alloc: don't reserve ZONE_HIGHMEM for ZONE_MOVABLE request
  2017-09-14 13:24   ` Michal Hocko
  (?)
@ 2018-04-04  0:24   ` Joonsoo Kim
  2018-04-12 12:01     ` Michal Hocko
  -1 siblings, 1 reply; 44+ messages in thread
From: Joonsoo Kim @ 2018-04-04  0:24 UTC (permalink / raw)
  To: Michal Hocko
  Cc: Andrew Morton, Vlastimil Babka, Mel Gorman, Johannes Weiner,
	Aneesh Kumar K . V, Minchan Kim, Linux Memory Management List,
	LKML, Joonsoo Kim, linux-api

Hello, Michal.

Sorry for a really long delay.

2017-09-14 22:24 GMT+09:00 Michal Hocko <mhocko@kernel.org>:
> [Sorry for a later reply]
>
> On Wed 06-09-17 13:35:25, Joonsoo Kim wrote:
>> From: Joonsoo Kim <iamjoonsoo.kim@lge.com>
>>
>> Freepage on ZONE_HIGHMEM doesn't work for kernel memory so it's not that
>> important to reserve.
>
> I am still not convinced this is a good idea. I do agree that reserving
> memory in both HIGHMEM and MOVABLE is just wasting memory but removing
> the reserve from the highmem as well will result that an oom victim will
> allocate from lower zones and that might have unexpected side effects.

Looks like you are confused.

This patch only affects the situation that ZONE_HIGHMEM and ZONE_MOVABLE is
used at the same time. In that case, before this patch, ZONE_HIGHMEM has
reserve for GFP_HIGHMEM | GFP_MOVABLE request, but, with this patch,  no reserve
in ZONE_HIGHMEM for GFP_HIGHMEM | GFP_MOVABLE request. This perfectly
matchs with your hope. :)

> Can we simply leave HIGHMEM reserve and only remove it from the movable
> zone if both are present?

There is no higher zone than ZONE_MOVABLE so ZONE_MOVABLE has no reserve
with/without this patch. To save memory, we need to remove the reserve in
ZONE_HIGHMEM.

Thanks.

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH] mm/page_alloc: don't reserve ZONE_HIGHMEM for ZONE_MOVABLE request
  2018-04-04  0:24   ` Joonsoo Kim
@ 2018-04-12 12:01     ` Michal Hocko
  0 siblings, 0 replies; 44+ messages in thread
From: Michal Hocko @ 2018-04-12 12:01 UTC (permalink / raw)
  To: Joonsoo Kim
  Cc: Andrew Morton, Vlastimil Babka, Mel Gorman, Johannes Weiner,
	Aneesh Kumar K . V, Minchan Kim, Linux Memory Management List,
	LKML, Joonsoo Kim, linux-api

On Wed 04-04-18 09:24:06, Joonsoo Kim wrote:
> 2017-09-14 22:24 GMT+09:00 Michal Hocko <mhocko@kernel.org>:
> > [Sorry for a later reply]
> >
> > On Wed 06-09-17 13:35:25, Joonsoo Kim wrote:
> >> From: Joonsoo Kim <iamjoonsoo.kim@lge.com>
> >>
> >> Freepage on ZONE_HIGHMEM doesn't work for kernel memory so it's not that
> >> important to reserve.
> >
> > I am still not convinced this is a good idea. I do agree that reserving
> > memory in both HIGHMEM and MOVABLE is just wasting memory but removing
> > the reserve from the highmem as well will result that an oom victim will
> > allocate from lower zones and that might have unexpected side effects.
> 
> Looks like you are confused.
> 
> This patch only affects the situation that ZONE_HIGHMEM and ZONE_MOVABLE is
> used at the same time. In that case, before this patch, ZONE_HIGHMEM has
> reserve for GFP_HIGHMEM | GFP_MOVABLE request, but, with this patch,  no reserve
> in ZONE_HIGHMEM for GFP_HIGHMEM | GFP_MOVABLE request. This perfectly
> matchs with your hope. :)

I have forgot all the details but my vague recollection is that the
concern was that GFP_HIGHUSER_MOVABLE etc. wouldn't keep any reserve in
the highmem zone and so emergency allocations - e.g. those during OOM
will have to fallback to kernel zones and might lead to hard to predict
results. Am I still confused and this will not happen after the patch?
-- 
Michal Hocko
SUSE Labs

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH] mm/page_alloc: don't reserve ZONE_HIGHMEM for ZONE_MOVABLE request
  2017-08-29 13:39               ` Michal Hocko
@ 2017-08-31  1:45                 ` Joonsoo Kim
  -1 siblings, 0 replies; 44+ messages in thread
From: Joonsoo Kim @ 2017-08-31  1:45 UTC (permalink / raw)
  To: Michal Hocko
  Cc: Vlastimil Babka, Andrew Morton, Mel Gorman, Johannes Weiner,
	Aneesh Kumar K . V, Minchan Kim, linux-mm, linux-kernel

On Tue, Aug 29, 2017 at 03:39:45PM +0200, Michal Hocko wrote:
> On Tue 29-08-17 09:45:47, Joonsoo Kim wrote:
> > On Mon, Aug 28, 2017 at 11:56:16AM +0200, Michal Hocko wrote:
> > > On Mon 28-08-17 09:15:52, Joonsoo Kim wrote:
> > > > On Fri, Aug 25, 2017 at 09:38:42AM +0200, Michal Hocko wrote:
> > > > > On Fri 25-08-17 09:20:31, Joonsoo Kim wrote:
> > > > > > On Thu, Aug 24, 2017 at 11:41:58AM +0200, Vlastimil Babka wrote:
> > > > > > > On 08/24/2017 07:45 AM, js1304@gmail.com wrote:
> > > > > > > > From: Joonsoo Kim <iamjoonsoo.kim@lge.com>
> > > > > > > > 
> > > > > > > > Freepage on ZONE_HIGHMEM doesn't work for kernel memory so it's not that
> > > > > > > > important to reserve. When ZONE_MOVABLE is used, this problem would
> > > > > > > > theorectically cause to decrease usable memory for GFP_HIGHUSER_MOVABLE
> > > > > > > > allocation request which is mainly used for page cache and anon page
> > > > > > > > allocation. So, fix it.
> > > > > > > > 
> > > > > > > > And, defining sysctl_lowmem_reserve_ratio array by MAX_NR_ZONES - 1 size
> > > > > > > > makes code complex. For example, if there is highmem system, following
> > > > > > > > reserve ratio is activated for *NORMAL ZONE* which would be easyily
> > > > > > > > misleading people.
> > > > > > > > 
> > > > > > > >  #ifdef CONFIG_HIGHMEM
> > > > > > > >  32
> > > > > > > >  #endif
> > > > > > > > 
> > > > > > > > This patch also fix this situation by defining sysctl_lowmem_reserve_ratio
> > > > > > > > array by MAX_NR_ZONES and place "#ifdef" to right place.
> > > > > > > > 
> > > > > > > > Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
> > > > > > > > Acked-by: Vlastimil Babka <vbabka@suse.cz>
> > > > > > > 
> > > > > > > Looks like I did that almost year ago, so definitely had to refresh my
> > > > > > > memory now :)
> > > > > > > 
> > > > > > > Anyway now I looked more thoroughly and noticed that this change leaks
> > > > > > > into the reported sysctl. On a 64bit system with ZONE_MOVABLE:
> > > > > > > 
> > > > > > > before the patch:
> > > > > > > vm.lowmem_reserve_ratio = 256   256     32
> > > > > > > 
> > > > > > > after the patch:
> > > > > > > vm.lowmem_reserve_ratio = 256   256     32      2147483647
> > > > > > > 
> > > > > > > So if we indeed remove HIGHMEM from protection (c.f. Michal's mail), we
> > > > > > > should do that differently than with the INT_MAX trick, IMHO.
> > > > > > 
> > > > > > Hmm, this is already pointed by Minchan and I have answered that.
> > > > > > 
> > > > > > lkml.kernel.org/r/<20170421013243.GA13966@js1304-desktop>
> > > > > > 
> > > > > > If you have a better idea, please let me know.
> > > > > 
> > > > > Why don't we just use 0. In fact we are reserving 0 pages... Using
> > > > > INT_MAX is just wrong.
> > > > 
> > > > The number of reserved pages is calculated by "managed_pages /
> > > > ratio". Using INT_MAX, net result would be 0.
> > > 
> > > Why cannot we simply special case 0?
> > > 
> > > > There is a logic converting ratio 0 to ratio 1.
> > > > 
> > > > if (sysctl_lowmem_reserve_ratio[idx] < 1)
> > > >         sysctl_lowmem_reserve_ratio[idx] = 1
> > > 
> > > This code just tries to prevent from division by 0 but I am wondering
> > > we should simply set lowmem_reserve to 0 in that case.
> > > 
> > > > If I use 0 to represent 0 reserved page, there would be a user
> > > > who is affected by this change. So, I don't use 0 for this patch.
> > > 
> > > I am sorry but I do not understand? Could you be more specific please?
> > 
> > If there is a user that manually set sysctl_lowmem_reserve_ratio and
> > he/she uses '0' to set ratio to '1', your suggestion making '0' as
> > a special value changes his/her system behaviour. I'm afraid this
> > case.
> 
> Documentation (Documentation/sysctl/vm.txt) explicitly states that 1
> is minimum. So I wouldn't afraid all that much. And you can actually
> printk_once if 0 is set and explain that this disables memory reserve
> for the particular zone altogether.

Great! If documentation says that, we can freely use the value, zero.
I will do it as this way.

Thanks.

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH] mm/page_alloc: don't reserve ZONE_HIGHMEM for ZONE_MOVABLE request
@ 2017-08-31  1:45                 ` Joonsoo Kim
  0 siblings, 0 replies; 44+ messages in thread
From: Joonsoo Kim @ 2017-08-31  1:45 UTC (permalink / raw)
  To: Michal Hocko
  Cc: Vlastimil Babka, Andrew Morton, Mel Gorman, Johannes Weiner,
	Aneesh Kumar K . V, Minchan Kim, linux-mm, linux-kernel

On Tue, Aug 29, 2017 at 03:39:45PM +0200, Michal Hocko wrote:
> On Tue 29-08-17 09:45:47, Joonsoo Kim wrote:
> > On Mon, Aug 28, 2017 at 11:56:16AM +0200, Michal Hocko wrote:
> > > On Mon 28-08-17 09:15:52, Joonsoo Kim wrote:
> > > > On Fri, Aug 25, 2017 at 09:38:42AM +0200, Michal Hocko wrote:
> > > > > On Fri 25-08-17 09:20:31, Joonsoo Kim wrote:
> > > > > > On Thu, Aug 24, 2017 at 11:41:58AM +0200, Vlastimil Babka wrote:
> > > > > > > On 08/24/2017 07:45 AM, js1304@gmail.com wrote:
> > > > > > > > From: Joonsoo Kim <iamjoonsoo.kim@lge.com>
> > > > > > > > 
> > > > > > > > Freepage on ZONE_HIGHMEM doesn't work for kernel memory so it's not that
> > > > > > > > important to reserve. When ZONE_MOVABLE is used, this problem would
> > > > > > > > theorectically cause to decrease usable memory for GFP_HIGHUSER_MOVABLE
> > > > > > > > allocation request which is mainly used for page cache and anon page
> > > > > > > > allocation. So, fix it.
> > > > > > > > 
> > > > > > > > And, defining sysctl_lowmem_reserve_ratio array by MAX_NR_ZONES - 1 size
> > > > > > > > makes code complex. For example, if there is highmem system, following
> > > > > > > > reserve ratio is activated for *NORMAL ZONE* which would be easyily
> > > > > > > > misleading people.
> > > > > > > > 
> > > > > > > >  #ifdef CONFIG_HIGHMEM
> > > > > > > >  32
> > > > > > > >  #endif
> > > > > > > > 
> > > > > > > > This patch also fix this situation by defining sysctl_lowmem_reserve_ratio
> > > > > > > > array by MAX_NR_ZONES and place "#ifdef" to right place.
> > > > > > > > 
> > > > > > > > Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
> > > > > > > > Acked-by: Vlastimil Babka <vbabka@suse.cz>
> > > > > > > 
> > > > > > > Looks like I did that almost year ago, so definitely had to refresh my
> > > > > > > memory now :)
> > > > > > > 
> > > > > > > Anyway now I looked more thoroughly and noticed that this change leaks
> > > > > > > into the reported sysctl. On a 64bit system with ZONE_MOVABLE:
> > > > > > > 
> > > > > > > before the patch:
> > > > > > > vm.lowmem_reserve_ratio = 256   256     32
> > > > > > > 
> > > > > > > after the patch:
> > > > > > > vm.lowmem_reserve_ratio = 256   256     32      2147483647
> > > > > > > 
> > > > > > > So if we indeed remove HIGHMEM from protection (c.f. Michal's mail), we
> > > > > > > should do that differently than with the INT_MAX trick, IMHO.
> > > > > > 
> > > > > > Hmm, this is already pointed by Minchan and I have answered that.
> > > > > > 
> > > > > > lkml.kernel.org/r/<20170421013243.GA13966@js1304-desktop>
> > > > > > 
> > > > > > If you have a better idea, please let me know.
> > > > > 
> > > > > Why don't we just use 0. In fact we are reserving 0 pages... Using
> > > > > INT_MAX is just wrong.
> > > > 
> > > > The number of reserved pages is calculated by "managed_pages /
> > > > ratio". Using INT_MAX, net result would be 0.
> > > 
> > > Why cannot we simply special case 0?
> > > 
> > > > There is a logic converting ratio 0 to ratio 1.
> > > > 
> > > > if (sysctl_lowmem_reserve_ratio[idx] < 1)
> > > >         sysctl_lowmem_reserve_ratio[idx] = 1
> > > 
> > > This code just tries to prevent from division by 0 but I am wondering
> > > we should simply set lowmem_reserve to 0 in that case.
> > > 
> > > > If I use 0 to represent 0 reserved page, there would be a user
> > > > who is affected by this change. So, I don't use 0 for this patch.
> > > 
> > > I am sorry but I do not understand? Could you be more specific please?
> > 
> > If there is a user that manually set sysctl_lowmem_reserve_ratio and
> > he/she uses '0' to set ratio to '1', your suggestion making '0' as
> > a special value changes his/her system behaviour. I'm afraid this
> > case.
> 
> Documentation (Documentation/sysctl/vm.txt) explicitly states that 1
> is minimum. So I wouldn't afraid all that much. And you can actually
> printk_once if 0 is set and explain that this disables memory reserve
> for the particular zone altogether.

Great! If documentation says that, we can freely use the value, zero.
I will do it as this way.

Thanks.

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH] mm/page_alloc: don't reserve ZONE_HIGHMEM for ZONE_MOVABLE request
  2017-08-29  0:45             ` Joonsoo Kim
@ 2017-08-29 13:39               ` Michal Hocko
  -1 siblings, 0 replies; 44+ messages in thread
From: Michal Hocko @ 2017-08-29 13:39 UTC (permalink / raw)
  To: Joonsoo Kim
  Cc: Vlastimil Babka, Andrew Morton, Mel Gorman, Johannes Weiner,
	Aneesh Kumar K . V, Minchan Kim, linux-mm, linux-kernel

On Tue 29-08-17 09:45:47, Joonsoo Kim wrote:
> On Mon, Aug 28, 2017 at 11:56:16AM +0200, Michal Hocko wrote:
> > On Mon 28-08-17 09:15:52, Joonsoo Kim wrote:
> > > On Fri, Aug 25, 2017 at 09:38:42AM +0200, Michal Hocko wrote:
> > > > On Fri 25-08-17 09:20:31, Joonsoo Kim wrote:
> > > > > On Thu, Aug 24, 2017 at 11:41:58AM +0200, Vlastimil Babka wrote:
> > > > > > On 08/24/2017 07:45 AM, js1304@gmail.com wrote:
> > > > > > > From: Joonsoo Kim <iamjoonsoo.kim@lge.com>
> > > > > > > 
> > > > > > > Freepage on ZONE_HIGHMEM doesn't work for kernel memory so it's not that
> > > > > > > important to reserve. When ZONE_MOVABLE is used, this problem would
> > > > > > > theorectically cause to decrease usable memory for GFP_HIGHUSER_MOVABLE
> > > > > > > allocation request which is mainly used for page cache and anon page
> > > > > > > allocation. So, fix it.
> > > > > > > 
> > > > > > > And, defining sysctl_lowmem_reserve_ratio array by MAX_NR_ZONES - 1 size
> > > > > > > makes code complex. For example, if there is highmem system, following
> > > > > > > reserve ratio is activated for *NORMAL ZONE* which would be easyily
> > > > > > > misleading people.
> > > > > > > 
> > > > > > >  #ifdef CONFIG_HIGHMEM
> > > > > > >  32
> > > > > > >  #endif
> > > > > > > 
> > > > > > > This patch also fix this situation by defining sysctl_lowmem_reserve_ratio
> > > > > > > array by MAX_NR_ZONES and place "#ifdef" to right place.
> > > > > > > 
> > > > > > > Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
> > > > > > > Acked-by: Vlastimil Babka <vbabka@suse.cz>
> > > > > > 
> > > > > > Looks like I did that almost year ago, so definitely had to refresh my
> > > > > > memory now :)
> > > > > > 
> > > > > > Anyway now I looked more thoroughly and noticed that this change leaks
> > > > > > into the reported sysctl. On a 64bit system with ZONE_MOVABLE:
> > > > > > 
> > > > > > before the patch:
> > > > > > vm.lowmem_reserve_ratio = 256   256     32
> > > > > > 
> > > > > > after the patch:
> > > > > > vm.lowmem_reserve_ratio = 256   256     32      2147483647
> > > > > > 
> > > > > > So if we indeed remove HIGHMEM from protection (c.f. Michal's mail), we
> > > > > > should do that differently than with the INT_MAX trick, IMHO.
> > > > > 
> > > > > Hmm, this is already pointed by Minchan and I have answered that.
> > > > > 
> > > > > lkml.kernel.org/r/<20170421013243.GA13966@js1304-desktop>
> > > > > 
> > > > > If you have a better idea, please let me know.
> > > > 
> > > > Why don't we just use 0. In fact we are reserving 0 pages... Using
> > > > INT_MAX is just wrong.
> > > 
> > > The number of reserved pages is calculated by "managed_pages /
> > > ratio". Using INT_MAX, net result would be 0.
> > 
> > Why cannot we simply special case 0?
> > 
> > > There is a logic converting ratio 0 to ratio 1.
> > > 
> > > if (sysctl_lowmem_reserve_ratio[idx] < 1)
> > >         sysctl_lowmem_reserve_ratio[idx] = 1
> > 
> > This code just tries to prevent from division by 0 but I am wondering
> > we should simply set lowmem_reserve to 0 in that case.
> > 
> > > If I use 0 to represent 0 reserved page, there would be a user
> > > who is affected by this change. So, I don't use 0 for this patch.
> > 
> > I am sorry but I do not understand? Could you be more specific please?
> 
> If there is a user that manually set sysctl_lowmem_reserve_ratio and
> he/she uses '0' to set ratio to '1', your suggestion making '0' as
> a special value changes his/her system behaviour. I'm afraid this
> case.

Documentation (Documentation/sysctl/vm.txt) explicitly states that 1
is minimum. So I wouldn't afraid all that much. And you can actually
printk_once if 0 is set and explain that this disables memory reserve
for the particular zone altogether.

> However, if you and Vlastimil agree with this making '0' as a special
> value, I will go this way.

I do agree that INT_MAX is just too ugly.
-- 
Michal Hocko
SUSE Labs

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH] mm/page_alloc: don't reserve ZONE_HIGHMEM for ZONE_MOVABLE request
@ 2017-08-29 13:39               ` Michal Hocko
  0 siblings, 0 replies; 44+ messages in thread
From: Michal Hocko @ 2017-08-29 13:39 UTC (permalink / raw)
  To: Joonsoo Kim
  Cc: Vlastimil Babka, Andrew Morton, Mel Gorman, Johannes Weiner,
	Aneesh Kumar K . V, Minchan Kim, linux-mm, linux-kernel

On Tue 29-08-17 09:45:47, Joonsoo Kim wrote:
> On Mon, Aug 28, 2017 at 11:56:16AM +0200, Michal Hocko wrote:
> > On Mon 28-08-17 09:15:52, Joonsoo Kim wrote:
> > > On Fri, Aug 25, 2017 at 09:38:42AM +0200, Michal Hocko wrote:
> > > > On Fri 25-08-17 09:20:31, Joonsoo Kim wrote:
> > > > > On Thu, Aug 24, 2017 at 11:41:58AM +0200, Vlastimil Babka wrote:
> > > > > > On 08/24/2017 07:45 AM, js1304@gmail.com wrote:
> > > > > > > From: Joonsoo Kim <iamjoonsoo.kim@lge.com>
> > > > > > > 
> > > > > > > Freepage on ZONE_HIGHMEM doesn't work for kernel memory so it's not that
> > > > > > > important to reserve. When ZONE_MOVABLE is used, this problem would
> > > > > > > theorectically cause to decrease usable memory for GFP_HIGHUSER_MOVABLE
> > > > > > > allocation request which is mainly used for page cache and anon page
> > > > > > > allocation. So, fix it.
> > > > > > > 
> > > > > > > And, defining sysctl_lowmem_reserve_ratio array by MAX_NR_ZONES - 1 size
> > > > > > > makes code complex. For example, if there is highmem system, following
> > > > > > > reserve ratio is activated for *NORMAL ZONE* which would be easyily
> > > > > > > misleading people.
> > > > > > > 
> > > > > > >  #ifdef CONFIG_HIGHMEM
> > > > > > >  32
> > > > > > >  #endif
> > > > > > > 
> > > > > > > This patch also fix this situation by defining sysctl_lowmem_reserve_ratio
> > > > > > > array by MAX_NR_ZONES and place "#ifdef" to right place.
> > > > > > > 
> > > > > > > Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
> > > > > > > Acked-by: Vlastimil Babka <vbabka@suse.cz>
> > > > > > 
> > > > > > Looks like I did that almost year ago, so definitely had to refresh my
> > > > > > memory now :)
> > > > > > 
> > > > > > Anyway now I looked more thoroughly and noticed that this change leaks
> > > > > > into the reported sysctl. On a 64bit system with ZONE_MOVABLE:
> > > > > > 
> > > > > > before the patch:
> > > > > > vm.lowmem_reserve_ratio = 256   256     32
> > > > > > 
> > > > > > after the patch:
> > > > > > vm.lowmem_reserve_ratio = 256   256     32      2147483647
> > > > > > 
> > > > > > So if we indeed remove HIGHMEM from protection (c.f. Michal's mail), we
> > > > > > should do that differently than with the INT_MAX trick, IMHO.
> > > > > 
> > > > > Hmm, this is already pointed by Minchan and I have answered that.
> > > > > 
> > > > > lkml.kernel.org/r/<20170421013243.GA13966@js1304-desktop>
> > > > > 
> > > > > If you have a better idea, please let me know.
> > > > 
> > > > Why don't we just use 0. In fact we are reserving 0 pages... Using
> > > > INT_MAX is just wrong.
> > > 
> > > The number of reserved pages is calculated by "managed_pages /
> > > ratio". Using INT_MAX, net result would be 0.
> > 
> > Why cannot we simply special case 0?
> > 
> > > There is a logic converting ratio 0 to ratio 1.
> > > 
> > > if (sysctl_lowmem_reserve_ratio[idx] < 1)
> > >         sysctl_lowmem_reserve_ratio[idx] = 1
> > 
> > This code just tries to prevent from division by 0 but I am wondering
> > we should simply set lowmem_reserve to 0 in that case.
> > 
> > > If I use 0 to represent 0 reserved page, there would be a user
> > > who is affected by this change. So, I don't use 0 for this patch.
> > 
> > I am sorry but I do not understand? Could you be more specific please?
> 
> If there is a user that manually set sysctl_lowmem_reserve_ratio and
> he/she uses '0' to set ratio to '1', your suggestion making '0' as
> a special value changes his/her system behaviour. I'm afraid this
> case.

Documentation (Documentation/sysctl/vm.txt) explicitly states that 1
is minimum. So I wouldn't afraid all that much. And you can actually
printk_once if 0 is set and explain that this disables memory reserve
for the particular zone altogether.

> However, if you and Vlastimil agree with this making '0' as a special
> value, I will go this way.

I do agree that INT_MAX is just too ugly.
-- 
Michal Hocko
SUSE Labs

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH] mm/page_alloc: don't reserve ZONE_HIGHMEM for ZONE_MOVABLE request
  2017-08-29  0:36             ` Joonsoo Kim
@ 2017-08-29  7:00               ` Vlastimil Babka
  -1 siblings, 0 replies; 44+ messages in thread
From: Vlastimil Babka @ 2017-08-29  7:00 UTC (permalink / raw)
  To: Joonsoo Kim
  Cc: Andrew Morton, Michal Hocko, Mel Gorman, Johannes Weiner,
	Aneesh Kumar K . V, Minchan Kim, linux-mm, linux-kernel,
	Linux API

On 08/29/2017 02:36 AM, Joonsoo Kim wrote:
> On Mon, Aug 28, 2017 at 08:45:07AM +0200, Vlastimil Babka wrote:
>> +CC linux-api
>>
>> On 08/28/2017 02:28 AM, Joonsoo Kim wrote:
>>> On Fri, Aug 25, 2017 at 09:56:10AM +0200, Vlastimil Babka wrote:
>>>
>>> Seems reasonable. However, if there is a user who checks
>>> sysctl_lowmem_reserve_ratio entry for HIGHMEM and change it, suggested
>>> interface will cause a problem since it doesn't expose ratio for
>>> HIGHMEM. Am I missing something?
>>
>> As you explained, it makes little sense to change it for HIGHMEM which
>> only affects MOVABLE allocations. Also I doubt there are many systems
>> with both HIGHMEM (implies 32bit) *and* MOVABLE (implies NUMA, memory
>> hotplug...) zones. So I would just remove it, and if somebody will
>> really miss it, we can always add it back. In any case, please CC
>> linux-api on the next version.
> 
> If we will accept a change that potentially breaks the user, I think
> that making zero as a special value for sysctl_lowmem_reserve_ratio
> is better solution. How about this way?

I'd prefer removal, but won't object to zero. Certainly much better than
UINT_MAX.

> Thanks.
> --
> To unsubscribe from this list: send the line "unsubscribe linux-api" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> 

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH] mm/page_alloc: don't reserve ZONE_HIGHMEM for ZONE_MOVABLE request
@ 2017-08-29  7:00               ` Vlastimil Babka
  0 siblings, 0 replies; 44+ messages in thread
From: Vlastimil Babka @ 2017-08-29  7:00 UTC (permalink / raw)
  To: Joonsoo Kim
  Cc: Andrew Morton, Michal Hocko, Mel Gorman, Johannes Weiner,
	Aneesh Kumar K . V, Minchan Kim, linux-mm, linux-kernel,
	Linux API

On 08/29/2017 02:36 AM, Joonsoo Kim wrote:
> On Mon, Aug 28, 2017 at 08:45:07AM +0200, Vlastimil Babka wrote:
>> +CC linux-api
>>
>> On 08/28/2017 02:28 AM, Joonsoo Kim wrote:
>>> On Fri, Aug 25, 2017 at 09:56:10AM +0200, Vlastimil Babka wrote:
>>>
>>> Seems reasonable. However, if there is a user who checks
>>> sysctl_lowmem_reserve_ratio entry for HIGHMEM and change it, suggested
>>> interface will cause a problem since it doesn't expose ratio for
>>> HIGHMEM. Am I missing something?
>>
>> As you explained, it makes little sense to change it for HIGHMEM which
>> only affects MOVABLE allocations. Also I doubt there are many systems
>> with both HIGHMEM (implies 32bit) *and* MOVABLE (implies NUMA, memory
>> hotplug...) zones. So I would just remove it, and if somebody will
>> really miss it, we can always add it back. In any case, please CC
>> linux-api on the next version.
> 
> If we will accept a change that potentially breaks the user, I think
> that making zero as a special value for sysctl_lowmem_reserve_ratio
> is better solution. How about this way?

I'd prefer removal, but won't object to zero. Certainly much better than
UINT_MAX.

> Thanks.
> --
> To unsubscribe from this list: send the line "unsubscribe linux-api" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> 

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH] mm/page_alloc: don't reserve ZONE_HIGHMEM for ZONE_MOVABLE request
  2017-08-28  9:56           ` Michal Hocko
@ 2017-08-29  0:45             ` Joonsoo Kim
  -1 siblings, 0 replies; 44+ messages in thread
From: Joonsoo Kim @ 2017-08-29  0:45 UTC (permalink / raw)
  To: Michal Hocko
  Cc: Vlastimil Babka, Andrew Morton, Mel Gorman, Johannes Weiner,
	Aneesh Kumar K . V, Minchan Kim, linux-mm, linux-kernel

On Mon, Aug 28, 2017 at 11:56:16AM +0200, Michal Hocko wrote:
> On Mon 28-08-17 09:15:52, Joonsoo Kim wrote:
> > On Fri, Aug 25, 2017 at 09:38:42AM +0200, Michal Hocko wrote:
> > > On Fri 25-08-17 09:20:31, Joonsoo Kim wrote:
> > > > On Thu, Aug 24, 2017 at 11:41:58AM +0200, Vlastimil Babka wrote:
> > > > > On 08/24/2017 07:45 AM, js1304@gmail.com wrote:
> > > > > > From: Joonsoo Kim <iamjoonsoo.kim@lge.com>
> > > > > > 
> > > > > > Freepage on ZONE_HIGHMEM doesn't work for kernel memory so it's not that
> > > > > > important to reserve. When ZONE_MOVABLE is used, this problem would
> > > > > > theorectically cause to decrease usable memory for GFP_HIGHUSER_MOVABLE
> > > > > > allocation request which is mainly used for page cache and anon page
> > > > > > allocation. So, fix it.
> > > > > > 
> > > > > > And, defining sysctl_lowmem_reserve_ratio array by MAX_NR_ZONES - 1 size
> > > > > > makes code complex. For example, if there is highmem system, following
> > > > > > reserve ratio is activated for *NORMAL ZONE* which would be easyily
> > > > > > misleading people.
> > > > > > 
> > > > > >  #ifdef CONFIG_HIGHMEM
> > > > > >  32
> > > > > >  #endif
> > > > > > 
> > > > > > This patch also fix this situation by defining sysctl_lowmem_reserve_ratio
> > > > > > array by MAX_NR_ZONES and place "#ifdef" to right place.
> > > > > > 
> > > > > > Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
> > > > > > Acked-by: Vlastimil Babka <vbabka@suse.cz>
> > > > > 
> > > > > Looks like I did that almost year ago, so definitely had to refresh my
> > > > > memory now :)
> > > > > 
> > > > > Anyway now I looked more thoroughly and noticed that this change leaks
> > > > > into the reported sysctl. On a 64bit system with ZONE_MOVABLE:
> > > > > 
> > > > > before the patch:
> > > > > vm.lowmem_reserve_ratio = 256   256     32
> > > > > 
> > > > > after the patch:
> > > > > vm.lowmem_reserve_ratio = 256   256     32      2147483647
> > > > > 
> > > > > So if we indeed remove HIGHMEM from protection (c.f. Michal's mail), we
> > > > > should do that differently than with the INT_MAX trick, IMHO.
> > > > 
> > > > Hmm, this is already pointed by Minchan and I have answered that.
> > > > 
> > > > lkml.kernel.org/r/<20170421013243.GA13966@js1304-desktop>
> > > > 
> > > > If you have a better idea, please let me know.
> > > 
> > > Why don't we just use 0. In fact we are reserving 0 pages... Using
> > > INT_MAX is just wrong.
> > 
> > The number of reserved pages is calculated by "managed_pages /
> > ratio". Using INT_MAX, net result would be 0.
> 
> Why cannot we simply special case 0?
> 
> > There is a logic converting ratio 0 to ratio 1.
> > 
> > if (sysctl_lowmem_reserve_ratio[idx] < 1)
> >         sysctl_lowmem_reserve_ratio[idx] = 1
> 
> This code just tries to prevent from division by 0 but I am wondering
> we should simply set lowmem_reserve to 0 in that case.
> 
> > If I use 0 to represent 0 reserved page, there would be a user
> > who is affected by this change. So, I don't use 0 for this patch.
> 
> I am sorry but I do not understand? Could you be more specific please?

If there is a user that manually set sysctl_lowmem_reserve_ratio and
he/she uses '0' to set ratio to '1', your suggestion making '0' as
a special value changes his/her system behaviour. I'm afraid this
case.

However, if you and Vlastimil agree with this making '0' as a special
value, I will go this way.

Thanks.

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH] mm/page_alloc: don't reserve ZONE_HIGHMEM for ZONE_MOVABLE request
@ 2017-08-29  0:45             ` Joonsoo Kim
  0 siblings, 0 replies; 44+ messages in thread
From: Joonsoo Kim @ 2017-08-29  0:45 UTC (permalink / raw)
  To: Michal Hocko
  Cc: Vlastimil Babka, Andrew Morton, Mel Gorman, Johannes Weiner,
	Aneesh Kumar K . V, Minchan Kim, linux-mm, linux-kernel

On Mon, Aug 28, 2017 at 11:56:16AM +0200, Michal Hocko wrote:
> On Mon 28-08-17 09:15:52, Joonsoo Kim wrote:
> > On Fri, Aug 25, 2017 at 09:38:42AM +0200, Michal Hocko wrote:
> > > On Fri 25-08-17 09:20:31, Joonsoo Kim wrote:
> > > > On Thu, Aug 24, 2017 at 11:41:58AM +0200, Vlastimil Babka wrote:
> > > > > On 08/24/2017 07:45 AM, js1304@gmail.com wrote:
> > > > > > From: Joonsoo Kim <iamjoonsoo.kim@lge.com>
> > > > > > 
> > > > > > Freepage on ZONE_HIGHMEM doesn't work for kernel memory so it's not that
> > > > > > important to reserve. When ZONE_MOVABLE is used, this problem would
> > > > > > theorectically cause to decrease usable memory for GFP_HIGHUSER_MOVABLE
> > > > > > allocation request which is mainly used for page cache and anon page
> > > > > > allocation. So, fix it.
> > > > > > 
> > > > > > And, defining sysctl_lowmem_reserve_ratio array by MAX_NR_ZONES - 1 size
> > > > > > makes code complex. For example, if there is highmem system, following
> > > > > > reserve ratio is activated for *NORMAL ZONE* which would be easyily
> > > > > > misleading people.
> > > > > > 
> > > > > >  #ifdef CONFIG_HIGHMEM
> > > > > >  32
> > > > > >  #endif
> > > > > > 
> > > > > > This patch also fix this situation by defining sysctl_lowmem_reserve_ratio
> > > > > > array by MAX_NR_ZONES and place "#ifdef" to right place.
> > > > > > 
> > > > > > Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
> > > > > > Acked-by: Vlastimil Babka <vbabka@suse.cz>
> > > > > 
> > > > > Looks like I did that almost year ago, so definitely had to refresh my
> > > > > memory now :)
> > > > > 
> > > > > Anyway now I looked more thoroughly and noticed that this change leaks
> > > > > into the reported sysctl. On a 64bit system with ZONE_MOVABLE:
> > > > > 
> > > > > before the patch:
> > > > > vm.lowmem_reserve_ratio = 256   256     32
> > > > > 
> > > > > after the patch:
> > > > > vm.lowmem_reserve_ratio = 256   256     32      2147483647
> > > > > 
> > > > > So if we indeed remove HIGHMEM from protection (c.f. Michal's mail), we
> > > > > should do that differently than with the INT_MAX trick, IMHO.
> > > > 
> > > > Hmm, this is already pointed by Minchan and I have answered that.
> > > > 
> > > > lkml.kernel.org/r/<20170421013243.GA13966@js1304-desktop>
> > > > 
> > > > If you have a better idea, please let me know.
> > > 
> > > Why don't we just use 0. In fact we are reserving 0 pages... Using
> > > INT_MAX is just wrong.
> > 
> > The number of reserved pages is calculated by "managed_pages /
> > ratio". Using INT_MAX, net result would be 0.
> 
> Why cannot we simply special case 0?
> 
> > There is a logic converting ratio 0 to ratio 1.
> > 
> > if (sysctl_lowmem_reserve_ratio[idx] < 1)
> >         sysctl_lowmem_reserve_ratio[idx] = 1
> 
> This code just tries to prevent from division by 0 but I am wondering
> we should simply set lowmem_reserve to 0 in that case.
> 
> > If I use 0 to represent 0 reserved page, there would be a user
> > who is affected by this change. So, I don't use 0 for this patch.
> 
> I am sorry but I do not understand? Could you be more specific please?

If there is a user that manually set sysctl_lowmem_reserve_ratio and
he/she uses '0' to set ratio to '1', your suggestion making '0' as
a special value changes his/her system behaviour. I'm afraid this
case.

However, if you and Vlastimil agree with this making '0' as a special
value, I will go this way.

Thanks.

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH] mm/page_alloc: don't reserve ZONE_HIGHMEM for ZONE_MOVABLE request
@ 2017-08-29  0:36             ` Joonsoo Kim
  0 siblings, 0 replies; 44+ messages in thread
From: Joonsoo Kim @ 2017-08-29  0:36 UTC (permalink / raw)
  To: Vlastimil Babka
  Cc: Andrew Morton, Michal Hocko, Mel Gorman, Johannes Weiner,
	Aneesh Kumar K . V, Minchan Kim, linux-mm, linux-kernel,
	Linux API

On Mon, Aug 28, 2017 at 08:45:07AM +0200, Vlastimil Babka wrote:
> +CC linux-api
> 
> On 08/28/2017 02:28 AM, Joonsoo Kim wrote:
> > On Fri, Aug 25, 2017 at 09:56:10AM +0200, Vlastimil Babka wrote:
> >> On 08/25/2017 02:20 AM, Joonsoo Kim wrote:
> >>> On Thu, Aug 24, 2017 at 11:41:58AM +0200, Vlastimil Babka wrote:
> >>>
> >>> Hmm, this is already pointed by Minchan and I have answered that.
> >>>
> >>> lkml.kernel.org/r/<20170421013243.GA13966@js1304-desktop>
> >>>
> >>> If you have a better idea, please let me know.
> >>
> >> My idea is that size of sysctl_lowmem_reserve_ratio is ZONE_NORMAL+1 and
> >> it has no entries for zones > NORMAL. The
> >> setup_per_zone_lowmem_reserve() is adjusted to only set
> >> lower_zone->lowmem_reserve[j] for idx <= ZONE_NORMAL.
> >>
> >> I can't imagine somebody would want override the ratio for HIGHMEM or
> >> MOVABLE
> >> (where it has no effect anyway) so the simplest thing is not to expose
> >> it at all.
> > 
> > Seems reasonable. However, if there is a user who checks
> > sysctl_lowmem_reserve_ratio entry for HIGHMEM and change it, suggested
> > interface will cause a problem since it doesn't expose ratio for
> > HIGHMEM. Am I missing something?
> 
> As you explained, it makes little sense to change it for HIGHMEM which
> only affects MOVABLE allocations. Also I doubt there are many systems
> with both HIGHMEM (implies 32bit) *and* MOVABLE (implies NUMA, memory
> hotplug...) zones. So I would just remove it, and if somebody will
> really miss it, we can always add it back. In any case, please CC
> linux-api on the next version.

If we will accept a change that potentially breaks the user, I think
that making zero as a special value for sysctl_lowmem_reserve_ratio
is better solution. How about this way?

Thanks.

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH] mm/page_alloc: don't reserve ZONE_HIGHMEM for ZONE_MOVABLE request
@ 2017-08-29  0:36             ` Joonsoo Kim
  0 siblings, 0 replies; 44+ messages in thread
From: Joonsoo Kim @ 2017-08-29  0:36 UTC (permalink / raw)
  To: Vlastimil Babka
  Cc: Andrew Morton, Michal Hocko, Mel Gorman, Johannes Weiner,
	Aneesh Kumar K . V, Minchan Kim, linux-mm-Bw31MaZKKs3YtjvyW6yDsg,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA, Linux API

On Mon, Aug 28, 2017 at 08:45:07AM +0200, Vlastimil Babka wrote:
> +CC linux-api
> 
> On 08/28/2017 02:28 AM, Joonsoo Kim wrote:
> > On Fri, Aug 25, 2017 at 09:56:10AM +0200, Vlastimil Babka wrote:
> >> On 08/25/2017 02:20 AM, Joonsoo Kim wrote:
> >>> On Thu, Aug 24, 2017 at 11:41:58AM +0200, Vlastimil Babka wrote:
> >>>
> >>> Hmm, this is already pointed by Minchan and I have answered that.
> >>>
> >>> lkml.kernel.org/r/<20170421013243.GA13966@js1304-desktop>
> >>>
> >>> If you have a better idea, please let me know.
> >>
> >> My idea is that size of sysctl_lowmem_reserve_ratio is ZONE_NORMAL+1 and
> >> it has no entries for zones > NORMAL. The
> >> setup_per_zone_lowmem_reserve() is adjusted to only set
> >> lower_zone->lowmem_reserve[j] for idx <= ZONE_NORMAL.
> >>
> >> I can't imagine somebody would want override the ratio for HIGHMEM or
> >> MOVABLE
> >> (where it has no effect anyway) so the simplest thing is not to expose
> >> it at all.
> > 
> > Seems reasonable. However, if there is a user who checks
> > sysctl_lowmem_reserve_ratio entry for HIGHMEM and change it, suggested
> > interface will cause a problem since it doesn't expose ratio for
> > HIGHMEM. Am I missing something?
> 
> As you explained, it makes little sense to change it for HIGHMEM which
> only affects MOVABLE allocations. Also I doubt there are many systems
> with both HIGHMEM (implies 32bit) *and* MOVABLE (implies NUMA, memory
> hotplug...) zones. So I would just remove it, and if somebody will
> really miss it, we can always add it back. In any case, please CC
> linux-api on the next version.

If we will accept a change that potentially breaks the user, I think
that making zero as a special value for sysctl_lowmem_reserve_ratio
is better solution. How about this way?

Thanks.

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH] mm/page_alloc: don't reserve ZONE_HIGHMEM for ZONE_MOVABLE request
@ 2017-08-29  0:36             ` Joonsoo Kim
  0 siblings, 0 replies; 44+ messages in thread
From: Joonsoo Kim @ 2017-08-29  0:36 UTC (permalink / raw)
  To: Vlastimil Babka
  Cc: Andrew Morton, Michal Hocko, Mel Gorman, Johannes Weiner,
	Aneesh Kumar K . V, Minchan Kim, linux-mm, linux-kernel,
	Linux API

On Mon, Aug 28, 2017 at 08:45:07AM +0200, Vlastimil Babka wrote:
> +CC linux-api
> 
> On 08/28/2017 02:28 AM, Joonsoo Kim wrote:
> > On Fri, Aug 25, 2017 at 09:56:10AM +0200, Vlastimil Babka wrote:
> >> On 08/25/2017 02:20 AM, Joonsoo Kim wrote:
> >>> On Thu, Aug 24, 2017 at 11:41:58AM +0200, Vlastimil Babka wrote:
> >>>
> >>> Hmm, this is already pointed by Minchan and I have answered that.
> >>>
> >>> lkml.kernel.org/r/<20170421013243.GA13966@js1304-desktop>
> >>>
> >>> If you have a better idea, please let me know.
> >>
> >> My idea is that size of sysctl_lowmem_reserve_ratio is ZONE_NORMAL+1 and
> >> it has no entries for zones > NORMAL. The
> >> setup_per_zone_lowmem_reserve() is adjusted to only set
> >> lower_zone->lowmem_reserve[j] for idx <= ZONE_NORMAL.
> >>
> >> I can't imagine somebody would want override the ratio for HIGHMEM or
> >> MOVABLE
> >> (where it has no effect anyway) so the simplest thing is not to expose
> >> it at all.
> > 
> > Seems reasonable. However, if there is a user who checks
> > sysctl_lowmem_reserve_ratio entry for HIGHMEM and change it, suggested
> > interface will cause a problem since it doesn't expose ratio for
> > HIGHMEM. Am I missing something?
> 
> As you explained, it makes little sense to change it for HIGHMEM which
> only affects MOVABLE allocations. Also I doubt there are many systems
> with both HIGHMEM (implies 32bit) *and* MOVABLE (implies NUMA, memory
> hotplug...) zones. So I would just remove it, and if somebody will
> really miss it, we can always add it back. In any case, please CC
> linux-api on the next version.

If we will accept a change that potentially breaks the user, I think
that making zero as a special value for sysctl_lowmem_reserve_ratio
is better solution. How about this way?

Thanks.

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH] mm/page_alloc: don't reserve ZONE_HIGHMEM for ZONE_MOVABLE request
  2017-08-28  0:15         ` Joonsoo Kim
@ 2017-08-28  9:56           ` Michal Hocko
  -1 siblings, 0 replies; 44+ messages in thread
From: Michal Hocko @ 2017-08-28  9:56 UTC (permalink / raw)
  To: Joonsoo Kim
  Cc: Vlastimil Babka, Andrew Morton, Mel Gorman, Johannes Weiner,
	Aneesh Kumar K . V, Minchan Kim, linux-mm, linux-kernel

On Mon 28-08-17 09:15:52, Joonsoo Kim wrote:
> On Fri, Aug 25, 2017 at 09:38:42AM +0200, Michal Hocko wrote:
> > On Fri 25-08-17 09:20:31, Joonsoo Kim wrote:
> > > On Thu, Aug 24, 2017 at 11:41:58AM +0200, Vlastimil Babka wrote:
> > > > On 08/24/2017 07:45 AM, js1304@gmail.com wrote:
> > > > > From: Joonsoo Kim <iamjoonsoo.kim@lge.com>
> > > > > 
> > > > > Freepage on ZONE_HIGHMEM doesn't work for kernel memory so it's not that
> > > > > important to reserve. When ZONE_MOVABLE is used, this problem would
> > > > > theorectically cause to decrease usable memory for GFP_HIGHUSER_MOVABLE
> > > > > allocation request which is mainly used for page cache and anon page
> > > > > allocation. So, fix it.
> > > > > 
> > > > > And, defining sysctl_lowmem_reserve_ratio array by MAX_NR_ZONES - 1 size
> > > > > makes code complex. For example, if there is highmem system, following
> > > > > reserve ratio is activated for *NORMAL ZONE* which would be easyily
> > > > > misleading people.
> > > > > 
> > > > >  #ifdef CONFIG_HIGHMEM
> > > > >  32
> > > > >  #endif
> > > > > 
> > > > > This patch also fix this situation by defining sysctl_lowmem_reserve_ratio
> > > > > array by MAX_NR_ZONES and place "#ifdef" to right place.
> > > > > 
> > > > > Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
> > > > > Acked-by: Vlastimil Babka <vbabka@suse.cz>
> > > > 
> > > > Looks like I did that almost year ago, so definitely had to refresh my
> > > > memory now :)
> > > > 
> > > > Anyway now I looked more thoroughly and noticed that this change leaks
> > > > into the reported sysctl. On a 64bit system with ZONE_MOVABLE:
> > > > 
> > > > before the patch:
> > > > vm.lowmem_reserve_ratio = 256   256     32
> > > > 
> > > > after the patch:
> > > > vm.lowmem_reserve_ratio = 256   256     32      2147483647
> > > > 
> > > > So if we indeed remove HIGHMEM from protection (c.f. Michal's mail), we
> > > > should do that differently than with the INT_MAX trick, IMHO.
> > > 
> > > Hmm, this is already pointed by Minchan and I have answered that.
> > > 
> > > lkml.kernel.org/r/<20170421013243.GA13966@js1304-desktop>
> > > 
> > > If you have a better idea, please let me know.
> > 
> > Why don't we just use 0. In fact we are reserving 0 pages... Using
> > INT_MAX is just wrong.
> 
> The number of reserved pages is calculated by "managed_pages /
> ratio". Using INT_MAX, net result would be 0.

Why cannot we simply special case 0?

> There is a logic converting ratio 0 to ratio 1.
> 
> if (sysctl_lowmem_reserve_ratio[idx] < 1)
>         sysctl_lowmem_reserve_ratio[idx] = 1

This code just tries to prevent from division by 0 but I am wondering
we should simply set lowmem_reserve to 0 in that case.

> If I use 0 to represent 0 reserved page, there would be a user
> who is affected by this change. So, I don't use 0 for this patch.

I am sorry but I do not understand? Could you be more specific please?
-- 
Michal Hocko
SUSE Labs

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH] mm/page_alloc: don't reserve ZONE_HIGHMEM for ZONE_MOVABLE request
@ 2017-08-28  9:56           ` Michal Hocko
  0 siblings, 0 replies; 44+ messages in thread
From: Michal Hocko @ 2017-08-28  9:56 UTC (permalink / raw)
  To: Joonsoo Kim
  Cc: Vlastimil Babka, Andrew Morton, Mel Gorman, Johannes Weiner,
	Aneesh Kumar K . V, Minchan Kim, linux-mm, linux-kernel

On Mon 28-08-17 09:15:52, Joonsoo Kim wrote:
> On Fri, Aug 25, 2017 at 09:38:42AM +0200, Michal Hocko wrote:
> > On Fri 25-08-17 09:20:31, Joonsoo Kim wrote:
> > > On Thu, Aug 24, 2017 at 11:41:58AM +0200, Vlastimil Babka wrote:
> > > > On 08/24/2017 07:45 AM, js1304@gmail.com wrote:
> > > > > From: Joonsoo Kim <iamjoonsoo.kim@lge.com>
> > > > > 
> > > > > Freepage on ZONE_HIGHMEM doesn't work for kernel memory so it's not that
> > > > > important to reserve. When ZONE_MOVABLE is used, this problem would
> > > > > theorectically cause to decrease usable memory for GFP_HIGHUSER_MOVABLE
> > > > > allocation request which is mainly used for page cache and anon page
> > > > > allocation. So, fix it.
> > > > > 
> > > > > And, defining sysctl_lowmem_reserve_ratio array by MAX_NR_ZONES - 1 size
> > > > > makes code complex. For example, if there is highmem system, following
> > > > > reserve ratio is activated for *NORMAL ZONE* which would be easyily
> > > > > misleading people.
> > > > > 
> > > > >  #ifdef CONFIG_HIGHMEM
> > > > >  32
> > > > >  #endif
> > > > > 
> > > > > This patch also fix this situation by defining sysctl_lowmem_reserve_ratio
> > > > > array by MAX_NR_ZONES and place "#ifdef" to right place.
> > > > > 
> > > > > Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
> > > > > Acked-by: Vlastimil Babka <vbabka@suse.cz>
> > > > 
> > > > Looks like I did that almost year ago, so definitely had to refresh my
> > > > memory now :)
> > > > 
> > > > Anyway now I looked more thoroughly and noticed that this change leaks
> > > > into the reported sysctl. On a 64bit system with ZONE_MOVABLE:
> > > > 
> > > > before the patch:
> > > > vm.lowmem_reserve_ratio = 256   256     32
> > > > 
> > > > after the patch:
> > > > vm.lowmem_reserve_ratio = 256   256     32      2147483647
> > > > 
> > > > So if we indeed remove HIGHMEM from protection (c.f. Michal's mail), we
> > > > should do that differently than with the INT_MAX trick, IMHO.
> > > 
> > > Hmm, this is already pointed by Minchan and I have answered that.
> > > 
> > > lkml.kernel.org/r/<20170421013243.GA13966@js1304-desktop>
> > > 
> > > If you have a better idea, please let me know.
> > 
> > Why don't we just use 0. In fact we are reserving 0 pages... Using
> > INT_MAX is just wrong.
> 
> The number of reserved pages is calculated by "managed_pages /
> ratio". Using INT_MAX, net result would be 0.

Why cannot we simply special case 0?

> There is a logic converting ratio 0 to ratio 1.
> 
> if (sysctl_lowmem_reserve_ratio[idx] < 1)
>         sysctl_lowmem_reserve_ratio[idx] = 1

This code just tries to prevent from division by 0 but I am wondering
we should simply set lowmem_reserve to 0 in that case.

> If I use 0 to represent 0 reserved page, there would be a user
> who is affected by this change. So, I don't use 0 for this patch.

I am sorry but I do not understand? Could you be more specific please?
-- 
Michal Hocko
SUSE Labs

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH] mm/page_alloc: don't reserve ZONE_HIGHMEM for ZONE_MOVABLE request
  2017-08-28  0:28         ` Joonsoo Kim
  (?)
@ 2017-08-28  6:45           ` Vlastimil Babka
  -1 siblings, 0 replies; 44+ messages in thread
From: Vlastimil Babka @ 2017-08-28  6:45 UTC (permalink / raw)
  To: Joonsoo Kim
  Cc: Andrew Morton, Michal Hocko, Mel Gorman, Johannes Weiner,
	Aneesh Kumar K . V, Minchan Kim, linux-mm, linux-kernel,
	Linux API

+CC linux-api

On 08/28/2017 02:28 AM, Joonsoo Kim wrote:
> On Fri, Aug 25, 2017 at 09:56:10AM +0200, Vlastimil Babka wrote:
>> On 08/25/2017 02:20 AM, Joonsoo Kim wrote:
>>> On Thu, Aug 24, 2017 at 11:41:58AM +0200, Vlastimil Babka wrote:
>>>
>>> Hmm, this is already pointed by Minchan and I have answered that.
>>>
>>> lkml.kernel.org/r/<20170421013243.GA13966@js1304-desktop>
>>>
>>> If you have a better idea, please let me know.
>>
>> My idea is that size of sysctl_lowmem_reserve_ratio is ZONE_NORMAL+1 and
>> it has no entries for zones > NORMAL. The
>> setup_per_zone_lowmem_reserve() is adjusted to only set
>> lower_zone->lowmem_reserve[j] for idx <= ZONE_NORMAL.
>>
>> I can't imagine somebody would want override the ratio for HIGHMEM or
>> MOVABLE
>> (where it has no effect anyway) so the simplest thing is not to expose
>> it at all.
> 
> Seems reasonable. However, if there is a user who checks
> sysctl_lowmem_reserve_ratio entry for HIGHMEM and change it, suggested
> interface will cause a problem since it doesn't expose ratio for
> HIGHMEM. Am I missing something?

As you explained, it makes little sense to change it for HIGHMEM which
only affects MOVABLE allocations. Also I doubt there are many systems
with both HIGHMEM (implies 32bit) *and* MOVABLE (implies NUMA, memory
hotplug...) zones. So I would just remove it, and if somebody will
really miss it, we can always add it back. In any case, please CC
linux-api on the next version.

> Thanks.
> 
> 
>>
>>> Thanks.
>>>
>>
>> --
>> To unsubscribe, send a message with 'unsubscribe linux-mm' in
>> the body to majordomo@kvack.org.  For more info on Linux MM,
>> see: http://www.linux-mm.org/ .
>> Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH] mm/page_alloc: don't reserve ZONE_HIGHMEM for ZONE_MOVABLE request
@ 2017-08-28  6:45           ` Vlastimil Babka
  0 siblings, 0 replies; 44+ messages in thread
From: Vlastimil Babka @ 2017-08-28  6:45 UTC (permalink / raw)
  To: Joonsoo Kim
  Cc: Andrew Morton, Michal Hocko, Mel Gorman, Johannes Weiner,
	Aneesh Kumar K . V, Minchan Kim, linux-mm-Bw31MaZKKs3YtjvyW6yDsg,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA, Linux API

+CC linux-api

On 08/28/2017 02:28 AM, Joonsoo Kim wrote:
> On Fri, Aug 25, 2017 at 09:56:10AM +0200, Vlastimil Babka wrote:
>> On 08/25/2017 02:20 AM, Joonsoo Kim wrote:
>>> On Thu, Aug 24, 2017 at 11:41:58AM +0200, Vlastimil Babka wrote:
>>>
>>> Hmm, this is already pointed by Minchan and I have answered that.
>>>
>>> lkml.kernel.org/r/<20170421013243.GA13966@js1304-desktop>
>>>
>>> If you have a better idea, please let me know.
>>
>> My idea is that size of sysctl_lowmem_reserve_ratio is ZONE_NORMAL+1 and
>> it has no entries for zones > NORMAL. The
>> setup_per_zone_lowmem_reserve() is adjusted to only set
>> lower_zone->lowmem_reserve[j] for idx <= ZONE_NORMAL.
>>
>> I can't imagine somebody would want override the ratio for HIGHMEM or
>> MOVABLE
>> (where it has no effect anyway) so the simplest thing is not to expose
>> it at all.
> 
> Seems reasonable. However, if there is a user who checks
> sysctl_lowmem_reserve_ratio entry for HIGHMEM and change it, suggested
> interface will cause a problem since it doesn't expose ratio for
> HIGHMEM. Am I missing something?

As you explained, it makes little sense to change it for HIGHMEM which
only affects MOVABLE allocations. Also I doubt there are many systems
with both HIGHMEM (implies 32bit) *and* MOVABLE (implies NUMA, memory
hotplug...) zones. So I would just remove it, and if somebody will
really miss it, we can always add it back. In any case, please CC
linux-api on the next version.

> Thanks.
> 
> 
>>
>>> Thanks.
>>>
>>
>> --
>> To unsubscribe, send a message with 'unsubscribe linux-mm' in
>> the body to majordomo-Bw31MaZKKs0EbZ0PF+XxCw@public.gmane.org  For more info on Linux MM,
>> see: http://www.linux-mm.org/ .
>> Don't email: <a href=mailto:"dont-Bw31MaZKKs3YtjvyW6yDsg@public.gmane.org"> email-Bw31MaZKKs3YtjvyW6yDsg@public.gmane.org </a>

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH] mm/page_alloc: don't reserve ZONE_HIGHMEM for ZONE_MOVABLE request
@ 2017-08-28  6:45           ` Vlastimil Babka
  0 siblings, 0 replies; 44+ messages in thread
From: Vlastimil Babka @ 2017-08-28  6:45 UTC (permalink / raw)
  To: Joonsoo Kim
  Cc: Andrew Morton, Michal Hocko, Mel Gorman, Johannes Weiner,
	Aneesh Kumar K . V, Minchan Kim, linux-mm, linux-kernel,
	Linux API

+CC linux-api

On 08/28/2017 02:28 AM, Joonsoo Kim wrote:
> On Fri, Aug 25, 2017 at 09:56:10AM +0200, Vlastimil Babka wrote:
>> On 08/25/2017 02:20 AM, Joonsoo Kim wrote:
>>> On Thu, Aug 24, 2017 at 11:41:58AM +0200, Vlastimil Babka wrote:
>>>
>>> Hmm, this is already pointed by Minchan and I have answered that.
>>>
>>> lkml.kernel.org/r/<20170421013243.GA13966@js1304-desktop>
>>>
>>> If you have a better idea, please let me know.
>>
>> My idea is that size of sysctl_lowmem_reserve_ratio is ZONE_NORMAL+1 and
>> it has no entries for zones > NORMAL. The
>> setup_per_zone_lowmem_reserve() is adjusted to only set
>> lower_zone->lowmem_reserve[j] for idx <= ZONE_NORMAL.
>>
>> I can't imagine somebody would want override the ratio for HIGHMEM or
>> MOVABLE
>> (where it has no effect anyway) so the simplest thing is not to expose
>> it at all.
> 
> Seems reasonable. However, if there is a user who checks
> sysctl_lowmem_reserve_ratio entry for HIGHMEM and change it, suggested
> interface will cause a problem since it doesn't expose ratio for
> HIGHMEM. Am I missing something?

As you explained, it makes little sense to change it for HIGHMEM which
only affects MOVABLE allocations. Also I doubt there are many systems
with both HIGHMEM (implies 32bit) *and* MOVABLE (implies NUMA, memory
hotplug...) zones. So I would just remove it, and if somebody will
really miss it, we can always add it back. In any case, please CC
linux-api on the next version.

> Thanks.
> 
> 
>>
>>> Thanks.
>>>
>>
>> --
>> To unsubscribe, send a message with 'unsubscribe linux-mm' in
>> the body to majordomo@kvack.org.  For more info on Linux MM,
>> see: http://www.linux-mm.org/ .
>> Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH] mm/page_alloc: don't reserve ZONE_HIGHMEM for ZONE_MOVABLE request
  2017-08-25  7:56       ` Vlastimil Babka
@ 2017-08-28  0:28         ` Joonsoo Kim
  -1 siblings, 0 replies; 44+ messages in thread
From: Joonsoo Kim @ 2017-08-28  0:28 UTC (permalink / raw)
  To: Vlastimil Babka
  Cc: Andrew Morton, Michal Hocko, Mel Gorman, Johannes Weiner,
	Aneesh Kumar K . V, Minchan Kim, linux-mm, linux-kernel

On Fri, Aug 25, 2017 at 09:56:10AM +0200, Vlastimil Babka wrote:
> On 08/25/2017 02:20 AM, Joonsoo Kim wrote:
> > On Thu, Aug 24, 2017 at 11:41:58AM +0200, Vlastimil Babka wrote:
> > 
> > Hmm, this is already pointed by Minchan and I have answered that.
> > 
> > lkml.kernel.org/r/<20170421013243.GA13966@js1304-desktop>
> > 
> > If you have a better idea, please let me know.
> 
> My idea is that size of sysctl_lowmem_reserve_ratio is ZONE_NORMAL+1 and
> it has no entries for zones > NORMAL. The
> setup_per_zone_lowmem_reserve() is adjusted to only set
> lower_zone->lowmem_reserve[j] for idx <= ZONE_NORMAL.
> 
> I can't imagine somebody would want override the ratio for HIGHMEM or
> MOVABLE
> (where it has no effect anyway) so the simplest thing is not to expose
> it at all.

Seems reasonable. However, if there is a user who checks
sysctl_lowmem_reserve_ratio entry for HIGHMEM and change it, suggested
interface will cause a problem since it doesn't expose ratio for
HIGHMEM. Am I missing something?

Thanks.


> 
> > Thanks.
> > 
> 
> --
> To unsubscribe, send a message with 'unsubscribe linux-mm' in
> the body to majordomo@kvack.org.  For more info on Linux MM,
> see: http://www.linux-mm.org/ .
> Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH] mm/page_alloc: don't reserve ZONE_HIGHMEM for ZONE_MOVABLE request
@ 2017-08-28  0:28         ` Joonsoo Kim
  0 siblings, 0 replies; 44+ messages in thread
From: Joonsoo Kim @ 2017-08-28  0:28 UTC (permalink / raw)
  To: Vlastimil Babka
  Cc: Andrew Morton, Michal Hocko, Mel Gorman, Johannes Weiner,
	Aneesh Kumar K . V, Minchan Kim, linux-mm, linux-kernel

On Fri, Aug 25, 2017 at 09:56:10AM +0200, Vlastimil Babka wrote:
> On 08/25/2017 02:20 AM, Joonsoo Kim wrote:
> > On Thu, Aug 24, 2017 at 11:41:58AM +0200, Vlastimil Babka wrote:
> > 
> > Hmm, this is already pointed by Minchan and I have answered that.
> > 
> > lkml.kernel.org/r/<20170421013243.GA13966@js1304-desktop>
> > 
> > If you have a better idea, please let me know.
> 
> My idea is that size of sysctl_lowmem_reserve_ratio is ZONE_NORMAL+1 and
> it has no entries for zones > NORMAL. The
> setup_per_zone_lowmem_reserve() is adjusted to only set
> lower_zone->lowmem_reserve[j] for idx <= ZONE_NORMAL.
> 
> I can't imagine somebody would want override the ratio for HIGHMEM or
> MOVABLE
> (where it has no effect anyway) so the simplest thing is not to expose
> it at all.

Seems reasonable. However, if there is a user who checks
sysctl_lowmem_reserve_ratio entry for HIGHMEM and change it, suggested
interface will cause a problem since it doesn't expose ratio for
HIGHMEM. Am I missing something?

Thanks.


> 
> > Thanks.
> > 
> 
> --
> To unsubscribe, send a message with 'unsubscribe linux-mm' in
> the body to majordomo@kvack.org.  For more info on Linux MM,
> see: http://www.linux-mm.org/ .
> Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH] mm/page_alloc: don't reserve ZONE_HIGHMEM for ZONE_MOVABLE request
  2017-08-25  7:38       ` Michal Hocko
@ 2017-08-28  0:15         ` Joonsoo Kim
  -1 siblings, 0 replies; 44+ messages in thread
From: Joonsoo Kim @ 2017-08-28  0:15 UTC (permalink / raw)
  To: Michal Hocko
  Cc: Vlastimil Babka, Andrew Morton, Mel Gorman, Johannes Weiner,
	Aneesh Kumar K . V, Minchan Kim, linux-mm, linux-kernel

On Fri, Aug 25, 2017 at 09:38:42AM +0200, Michal Hocko wrote:
> On Fri 25-08-17 09:20:31, Joonsoo Kim wrote:
> > On Thu, Aug 24, 2017 at 11:41:58AM +0200, Vlastimil Babka wrote:
> > > On 08/24/2017 07:45 AM, js1304@gmail.com wrote:
> > > > From: Joonsoo Kim <iamjoonsoo.kim@lge.com>
> > > > 
> > > > Freepage on ZONE_HIGHMEM doesn't work for kernel memory so it's not that
> > > > important to reserve. When ZONE_MOVABLE is used, this problem would
> > > > theorectically cause to decrease usable memory for GFP_HIGHUSER_MOVABLE
> > > > allocation request which is mainly used for page cache and anon page
> > > > allocation. So, fix it.
> > > > 
> > > > And, defining sysctl_lowmem_reserve_ratio array by MAX_NR_ZONES - 1 size
> > > > makes code complex. For example, if there is highmem system, following
> > > > reserve ratio is activated for *NORMAL ZONE* which would be easyily
> > > > misleading people.
> > > > 
> > > >  #ifdef CONFIG_HIGHMEM
> > > >  32
> > > >  #endif
> > > > 
> > > > This patch also fix this situation by defining sysctl_lowmem_reserve_ratio
> > > > array by MAX_NR_ZONES and place "#ifdef" to right place.
> > > > 
> > > > Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
> > > > Acked-by: Vlastimil Babka <vbabka@suse.cz>
> > > 
> > > Looks like I did that almost year ago, so definitely had to refresh my
> > > memory now :)
> > > 
> > > Anyway now I looked more thoroughly and noticed that this change leaks
> > > into the reported sysctl. On a 64bit system with ZONE_MOVABLE:
> > > 
> > > before the patch:
> > > vm.lowmem_reserve_ratio = 256   256     32
> > > 
> > > after the patch:
> > > vm.lowmem_reserve_ratio = 256   256     32      2147483647
> > > 
> > > So if we indeed remove HIGHMEM from protection (c.f. Michal's mail), we
> > > should do that differently than with the INT_MAX trick, IMHO.
> > 
> > Hmm, this is already pointed by Minchan and I have answered that.
> > 
> > lkml.kernel.org/r/<20170421013243.GA13966@js1304-desktop>
> > 
> > If you have a better idea, please let me know.
> 
> Why don't we just use 0. In fact we are reserving 0 pages... Using
> INT_MAX is just wrong.

The number of reserved pages is calculated by "managed_pages /
ratio". Using INT_MAX, net result would be 0.

There is a logic converting ratio 0 to ratio 1.

if (sysctl_lowmem_reserve_ratio[idx] < 1)
        sysctl_lowmem_reserve_ratio[idx] = 1

If I use 0 to represent 0 reserved page, there would be a user
who is affected by this change. So, I don't use 0 for this patch.

Thanks.

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH] mm/page_alloc: don't reserve ZONE_HIGHMEM for ZONE_MOVABLE request
@ 2017-08-28  0:15         ` Joonsoo Kim
  0 siblings, 0 replies; 44+ messages in thread
From: Joonsoo Kim @ 2017-08-28  0:15 UTC (permalink / raw)
  To: Michal Hocko
  Cc: Vlastimil Babka, Andrew Morton, Mel Gorman, Johannes Weiner,
	Aneesh Kumar K . V, Minchan Kim, linux-mm, linux-kernel

On Fri, Aug 25, 2017 at 09:38:42AM +0200, Michal Hocko wrote:
> On Fri 25-08-17 09:20:31, Joonsoo Kim wrote:
> > On Thu, Aug 24, 2017 at 11:41:58AM +0200, Vlastimil Babka wrote:
> > > On 08/24/2017 07:45 AM, js1304@gmail.com wrote:
> > > > From: Joonsoo Kim <iamjoonsoo.kim@lge.com>
> > > > 
> > > > Freepage on ZONE_HIGHMEM doesn't work for kernel memory so it's not that
> > > > important to reserve. When ZONE_MOVABLE is used, this problem would
> > > > theorectically cause to decrease usable memory for GFP_HIGHUSER_MOVABLE
> > > > allocation request which is mainly used for page cache and anon page
> > > > allocation. So, fix it.
> > > > 
> > > > And, defining sysctl_lowmem_reserve_ratio array by MAX_NR_ZONES - 1 size
> > > > makes code complex. For example, if there is highmem system, following
> > > > reserve ratio is activated for *NORMAL ZONE* which would be easyily
> > > > misleading people.
> > > > 
> > > >  #ifdef CONFIG_HIGHMEM
> > > >  32
> > > >  #endif
> > > > 
> > > > This patch also fix this situation by defining sysctl_lowmem_reserve_ratio
> > > > array by MAX_NR_ZONES and place "#ifdef" to right place.
> > > > 
> > > > Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
> > > > Acked-by: Vlastimil Babka <vbabka@suse.cz>
> > > 
> > > Looks like I did that almost year ago, so definitely had to refresh my
> > > memory now :)
> > > 
> > > Anyway now I looked more thoroughly and noticed that this change leaks
> > > into the reported sysctl. On a 64bit system with ZONE_MOVABLE:
> > > 
> > > before the patch:
> > > vm.lowmem_reserve_ratio = 256   256     32
> > > 
> > > after the patch:
> > > vm.lowmem_reserve_ratio = 256   256     32      2147483647
> > > 
> > > So if we indeed remove HIGHMEM from protection (c.f. Michal's mail), we
> > > should do that differently than with the INT_MAX trick, IMHO.
> > 
> > Hmm, this is already pointed by Minchan and I have answered that.
> > 
> > lkml.kernel.org/r/<20170421013243.GA13966@js1304-desktop>
> > 
> > If you have a better idea, please let me know.
> 
> Why don't we just use 0. In fact we are reserving 0 pages... Using
> INT_MAX is just wrong.

The number of reserved pages is calculated by "managed_pages /
ratio". Using INT_MAX, net result would be 0.

There is a logic converting ratio 0 to ratio 1.

if (sysctl_lowmem_reserve_ratio[idx] < 1)
        sysctl_lowmem_reserve_ratio[idx] = 1

If I use 0 to represent 0 reserved page, there would be a user
who is affected by this change. So, I don't use 0 for this patch.

Thanks.

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH] mm/page_alloc: don't reserve ZONE_HIGHMEM for ZONE_MOVABLE request
  2017-08-25  0:20     ` Joonsoo Kim
@ 2017-08-25  7:56       ` Vlastimil Babka
  -1 siblings, 0 replies; 44+ messages in thread
From: Vlastimil Babka @ 2017-08-25  7:56 UTC (permalink / raw)
  To: Joonsoo Kim
  Cc: Andrew Morton, Michal Hocko, Mel Gorman, Johannes Weiner,
	Aneesh Kumar K . V, Minchan Kim, linux-mm, linux-kernel

On 08/25/2017 02:20 AM, Joonsoo Kim wrote:
> On Thu, Aug 24, 2017 at 11:41:58AM +0200, Vlastimil Babka wrote:
> 
> Hmm, this is already pointed by Minchan and I have answered that.
> 
> lkml.kernel.org/r/<20170421013243.GA13966@js1304-desktop>
> 
> If you have a better idea, please let me know.

My idea is that size of sysctl_lowmem_reserve_ratio is ZONE_NORMAL+1 and
it has no entries for zones > NORMAL. The
setup_per_zone_lowmem_reserve() is adjusted to only set
lower_zone->lowmem_reserve[j] for idx <= ZONE_NORMAL.

I can't imagine somebody would want override the ratio for HIGHMEM or
MOVABLE
(where it has no effect anyway) so the simplest thing is not to expose
it at all.

> Thanks.
> 

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH] mm/page_alloc: don't reserve ZONE_HIGHMEM for ZONE_MOVABLE request
@ 2017-08-25  7:56       ` Vlastimil Babka
  0 siblings, 0 replies; 44+ messages in thread
From: Vlastimil Babka @ 2017-08-25  7:56 UTC (permalink / raw)
  To: Joonsoo Kim
  Cc: Andrew Morton, Michal Hocko, Mel Gorman, Johannes Weiner,
	Aneesh Kumar K . V, Minchan Kim, linux-mm, linux-kernel

On 08/25/2017 02:20 AM, Joonsoo Kim wrote:
> On Thu, Aug 24, 2017 at 11:41:58AM +0200, Vlastimil Babka wrote:
> 
> Hmm, this is already pointed by Minchan and I have answered that.
> 
> lkml.kernel.org/r/<20170421013243.GA13966@js1304-desktop>
> 
> If you have a better idea, please let me know.

My idea is that size of sysctl_lowmem_reserve_ratio is ZONE_NORMAL+1 and
it has no entries for zones > NORMAL. The
setup_per_zone_lowmem_reserve() is adjusted to only set
lower_zone->lowmem_reserve[j] for idx <= ZONE_NORMAL.

I can't imagine somebody would want override the ratio for HIGHMEM or
MOVABLE
(where it has no effect anyway) so the simplest thing is not to expose
it at all.

> Thanks.
> 

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH] mm/page_alloc: don't reserve ZONE_HIGHMEM for ZONE_MOVABLE request
  2017-08-25  0:20     ` Joonsoo Kim
@ 2017-08-25  7:38       ` Michal Hocko
  -1 siblings, 0 replies; 44+ messages in thread
From: Michal Hocko @ 2017-08-25  7:38 UTC (permalink / raw)
  To: Joonsoo Kim
  Cc: Vlastimil Babka, Andrew Morton, Mel Gorman, Johannes Weiner,
	Aneesh Kumar K . V, Minchan Kim, linux-mm, linux-kernel

On Fri 25-08-17 09:20:31, Joonsoo Kim wrote:
> On Thu, Aug 24, 2017 at 11:41:58AM +0200, Vlastimil Babka wrote:
> > On 08/24/2017 07:45 AM, js1304@gmail.com wrote:
> > > From: Joonsoo Kim <iamjoonsoo.kim@lge.com>
> > > 
> > > Freepage on ZONE_HIGHMEM doesn't work for kernel memory so it's not that
> > > important to reserve. When ZONE_MOVABLE is used, this problem would
> > > theorectically cause to decrease usable memory for GFP_HIGHUSER_MOVABLE
> > > allocation request which is mainly used for page cache and anon page
> > > allocation. So, fix it.
> > > 
> > > And, defining sysctl_lowmem_reserve_ratio array by MAX_NR_ZONES - 1 size
> > > makes code complex. For example, if there is highmem system, following
> > > reserve ratio is activated for *NORMAL ZONE* which would be easyily
> > > misleading people.
> > > 
> > >  #ifdef CONFIG_HIGHMEM
> > >  32
> > >  #endif
> > > 
> > > This patch also fix this situation by defining sysctl_lowmem_reserve_ratio
> > > array by MAX_NR_ZONES and place "#ifdef" to right place.
> > > 
> > > Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
> > > Acked-by: Vlastimil Babka <vbabka@suse.cz>
> > 
> > Looks like I did that almost year ago, so definitely had to refresh my
> > memory now :)
> > 
> > Anyway now I looked more thoroughly and noticed that this change leaks
> > into the reported sysctl. On a 64bit system with ZONE_MOVABLE:
> > 
> > before the patch:
> > vm.lowmem_reserve_ratio = 256   256     32
> > 
> > after the patch:
> > vm.lowmem_reserve_ratio = 256   256     32      2147483647
> > 
> > So if we indeed remove HIGHMEM from protection (c.f. Michal's mail), we
> > should do that differently than with the INT_MAX trick, IMHO.
> 
> Hmm, this is already pointed by Minchan and I have answered that.
> 
> lkml.kernel.org/r/<20170421013243.GA13966@js1304-desktop>
> 
> If you have a better idea, please let me know.

Why don't we just use 0. In fact we are reserving 0 pages... Using
INT_MAX is just wrong.
-- 
Michal Hocko
SUSE Labs

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH] mm/page_alloc: don't reserve ZONE_HIGHMEM for ZONE_MOVABLE request
@ 2017-08-25  7:38       ` Michal Hocko
  0 siblings, 0 replies; 44+ messages in thread
From: Michal Hocko @ 2017-08-25  7:38 UTC (permalink / raw)
  To: Joonsoo Kim
  Cc: Vlastimil Babka, Andrew Morton, Mel Gorman, Johannes Weiner,
	Aneesh Kumar K . V, Minchan Kim, linux-mm, linux-kernel

On Fri 25-08-17 09:20:31, Joonsoo Kim wrote:
> On Thu, Aug 24, 2017 at 11:41:58AM +0200, Vlastimil Babka wrote:
> > On 08/24/2017 07:45 AM, js1304@gmail.com wrote:
> > > From: Joonsoo Kim <iamjoonsoo.kim@lge.com>
> > > 
> > > Freepage on ZONE_HIGHMEM doesn't work for kernel memory so it's not that
> > > important to reserve. When ZONE_MOVABLE is used, this problem would
> > > theorectically cause to decrease usable memory for GFP_HIGHUSER_MOVABLE
> > > allocation request which is mainly used for page cache and anon page
> > > allocation. So, fix it.
> > > 
> > > And, defining sysctl_lowmem_reserve_ratio array by MAX_NR_ZONES - 1 size
> > > makes code complex. For example, if there is highmem system, following
> > > reserve ratio is activated for *NORMAL ZONE* which would be easyily
> > > misleading people.
> > > 
> > >  #ifdef CONFIG_HIGHMEM
> > >  32
> > >  #endif
> > > 
> > > This patch also fix this situation by defining sysctl_lowmem_reserve_ratio
> > > array by MAX_NR_ZONES and place "#ifdef" to right place.
> > > 
> > > Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
> > > Acked-by: Vlastimil Babka <vbabka@suse.cz>
> > 
> > Looks like I did that almost year ago, so definitely had to refresh my
> > memory now :)
> > 
> > Anyway now I looked more thoroughly and noticed that this change leaks
> > into the reported sysctl. On a 64bit system with ZONE_MOVABLE:
> > 
> > before the patch:
> > vm.lowmem_reserve_ratio = 256   256     32
> > 
> > after the patch:
> > vm.lowmem_reserve_ratio = 256   256     32      2147483647
> > 
> > So if we indeed remove HIGHMEM from protection (c.f. Michal's mail), we
> > should do that differently than with the INT_MAX trick, IMHO.
> 
> Hmm, this is already pointed by Minchan and I have answered that.
> 
> lkml.kernel.org/r/<20170421013243.GA13966@js1304-desktop>
> 
> If you have a better idea, please let me know.

Why don't we just use 0. In fact we are reserving 0 pages... Using
INT_MAX is just wrong.
-- 
Michal Hocko
SUSE Labs

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH] mm/page_alloc: don't reserve ZONE_HIGHMEM for ZONE_MOVABLE request
  2017-08-25  0:15     ` Joonsoo Kim
@ 2017-08-25  7:33       ` Michal Hocko
  -1 siblings, 0 replies; 44+ messages in thread
From: Michal Hocko @ 2017-08-25  7:33 UTC (permalink / raw)
  To: Joonsoo Kim
  Cc: Andrew Morton, Vlastimil Babka, Mel Gorman, Johannes Weiner,
	Aneesh Kumar K . V, Minchan Kim, linux-mm, linux-kernel

On Fri 25-08-17 09:15:43, Joonsoo Kim wrote:
> On Thu, Aug 24, 2017 at 11:30:50AM +0200, Michal Hocko wrote:
> > On Thu 24-08-17 14:45:46, Joonsoo Kim wrote:
> > > From: Joonsoo Kim <iamjoonsoo.kim@lge.com>
> > > 
> > > Freepage on ZONE_HIGHMEM doesn't work for kernel memory so it's not that
> > > important to reserve. When ZONE_MOVABLE is used, this problem would
> > > theorectically cause to decrease usable memory for GFP_HIGHUSER_MOVABLE
> > > allocation request which is mainly used for page cache and anon page
> > > allocation. So, fix it.
> > 
> > I do not really understand what is the problem you are trying to fix.
> > Yes the memory is reserved for a higher priority consumer and that is
> > deliberate AFAICT. Just consider that an OOM victim wants to make
> > further progress and rely on memory reserve while doing
> > GFP_HIGHUSER_MOVABLE request.
> > 
> > So what is the real problem you are trying to address here?
> 
> If the system has the both, ZONE_HIGHMEM and ZONE_MOVABLE,
> ZONE_HIGHMEM will reserve the memory for ZONE_MOVABLE request.

Ohh, right. I forgot that __GFP_MOVABLE doesn't really enforce the
movable zone. It does so only if __GFP_HIGHMEM is specified as well when
ZONE_HIGHMEM is enabled. So indeed reserving memory in both is somehow
awkward. So why don't we simply remove reserves from the movable zone
when the highmem zone is enabled?
-- 
Michal Hocko
SUSE Labs

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH] mm/page_alloc: don't reserve ZONE_HIGHMEM for ZONE_MOVABLE request
@ 2017-08-25  7:33       ` Michal Hocko
  0 siblings, 0 replies; 44+ messages in thread
From: Michal Hocko @ 2017-08-25  7:33 UTC (permalink / raw)
  To: Joonsoo Kim
  Cc: Andrew Morton, Vlastimil Babka, Mel Gorman, Johannes Weiner,
	Aneesh Kumar K . V, Minchan Kim, linux-mm, linux-kernel

On Fri 25-08-17 09:15:43, Joonsoo Kim wrote:
> On Thu, Aug 24, 2017 at 11:30:50AM +0200, Michal Hocko wrote:
> > On Thu 24-08-17 14:45:46, Joonsoo Kim wrote:
> > > From: Joonsoo Kim <iamjoonsoo.kim@lge.com>
> > > 
> > > Freepage on ZONE_HIGHMEM doesn't work for kernel memory so it's not that
> > > important to reserve. When ZONE_MOVABLE is used, this problem would
> > > theorectically cause to decrease usable memory for GFP_HIGHUSER_MOVABLE
> > > allocation request which is mainly used for page cache and anon page
> > > allocation. So, fix it.
> > 
> > I do not really understand what is the problem you are trying to fix.
> > Yes the memory is reserved for a higher priority consumer and that is
> > deliberate AFAICT. Just consider that an OOM victim wants to make
> > further progress and rely on memory reserve while doing
> > GFP_HIGHUSER_MOVABLE request.
> > 
> > So what is the real problem you are trying to address here?
> 
> If the system has the both, ZONE_HIGHMEM and ZONE_MOVABLE,
> ZONE_HIGHMEM will reserve the memory for ZONE_MOVABLE request.

Ohh, right. I forgot that __GFP_MOVABLE doesn't really enforce the
movable zone. It does so only if __GFP_HIGHMEM is specified as well when
ZONE_HIGHMEM is enabled. So indeed reserving memory in both is somehow
awkward. So why don't we simply remove reserves from the movable zone
when the highmem zone is enabled?
-- 
Michal Hocko
SUSE Labs

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH] mm/page_alloc: don't reserve ZONE_HIGHMEM for ZONE_MOVABLE request
  2017-08-24  9:41   ` Vlastimil Babka
@ 2017-08-25  0:20     ` Joonsoo Kim
  -1 siblings, 0 replies; 44+ messages in thread
From: Joonsoo Kim @ 2017-08-25  0:20 UTC (permalink / raw)
  To: Vlastimil Babka
  Cc: Andrew Morton, Michal Hocko, Mel Gorman, Johannes Weiner,
	Aneesh Kumar K . V, Minchan Kim, linux-mm, linux-kernel

On Thu, Aug 24, 2017 at 11:41:58AM +0200, Vlastimil Babka wrote:
> On 08/24/2017 07:45 AM, js1304@gmail.com wrote:
> > From: Joonsoo Kim <iamjoonsoo.kim@lge.com>
> > 
> > Freepage on ZONE_HIGHMEM doesn't work for kernel memory so it's not that
> > important to reserve. When ZONE_MOVABLE is used, this problem would
> > theorectically cause to decrease usable memory for GFP_HIGHUSER_MOVABLE
> > allocation request which is mainly used for page cache and anon page
> > allocation. So, fix it.
> > 
> > And, defining sysctl_lowmem_reserve_ratio array by MAX_NR_ZONES - 1 size
> > makes code complex. For example, if there is highmem system, following
> > reserve ratio is activated for *NORMAL ZONE* which would be easyily
> > misleading people.
> > 
> >  #ifdef CONFIG_HIGHMEM
> >  32
> >  #endif
> > 
> > This patch also fix this situation by defining sysctl_lowmem_reserve_ratio
> > array by MAX_NR_ZONES and place "#ifdef" to right place.
> > 
> > Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
> > Acked-by: Vlastimil Babka <vbabka@suse.cz>
> 
> Looks like I did that almost year ago, so definitely had to refresh my
> memory now :)
> 
> Anyway now I looked more thoroughly and noticed that this change leaks
> into the reported sysctl. On a 64bit system with ZONE_MOVABLE:
> 
> before the patch:
> vm.lowmem_reserve_ratio = 256   256     32
> 
> after the patch:
> vm.lowmem_reserve_ratio = 256   256     32      2147483647
> 
> So if we indeed remove HIGHMEM from protection (c.f. Michal's mail), we
> should do that differently than with the INT_MAX trick, IMHO.

Hmm, this is already pointed by Minchan and I have answered that.

lkml.kernel.org/r/<20170421013243.GA13966@js1304-desktop>

If you have a better idea, please let me know.

Thanks.

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH] mm/page_alloc: don't reserve ZONE_HIGHMEM for ZONE_MOVABLE request
@ 2017-08-25  0:20     ` Joonsoo Kim
  0 siblings, 0 replies; 44+ messages in thread
From: Joonsoo Kim @ 2017-08-25  0:20 UTC (permalink / raw)
  To: Vlastimil Babka
  Cc: Andrew Morton, Michal Hocko, Mel Gorman, Johannes Weiner,
	Aneesh Kumar K . V, Minchan Kim, linux-mm, linux-kernel

On Thu, Aug 24, 2017 at 11:41:58AM +0200, Vlastimil Babka wrote:
> On 08/24/2017 07:45 AM, js1304@gmail.com wrote:
> > From: Joonsoo Kim <iamjoonsoo.kim@lge.com>
> > 
> > Freepage on ZONE_HIGHMEM doesn't work for kernel memory so it's not that
> > important to reserve. When ZONE_MOVABLE is used, this problem would
> > theorectically cause to decrease usable memory for GFP_HIGHUSER_MOVABLE
> > allocation request which is mainly used for page cache and anon page
> > allocation. So, fix it.
> > 
> > And, defining sysctl_lowmem_reserve_ratio array by MAX_NR_ZONES - 1 size
> > makes code complex. For example, if there is highmem system, following
> > reserve ratio is activated for *NORMAL ZONE* which would be easyily
> > misleading people.
> > 
> >  #ifdef CONFIG_HIGHMEM
> >  32
> >  #endif
> > 
> > This patch also fix this situation by defining sysctl_lowmem_reserve_ratio
> > array by MAX_NR_ZONES and place "#ifdef" to right place.
> > 
> > Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
> > Acked-by: Vlastimil Babka <vbabka@suse.cz>
> 
> Looks like I did that almost year ago, so definitely had to refresh my
> memory now :)
> 
> Anyway now I looked more thoroughly and noticed that this change leaks
> into the reported sysctl. On a 64bit system with ZONE_MOVABLE:
> 
> before the patch:
> vm.lowmem_reserve_ratio = 256   256     32
> 
> after the patch:
> vm.lowmem_reserve_ratio = 256   256     32      2147483647
> 
> So if we indeed remove HIGHMEM from protection (c.f. Michal's mail), we
> should do that differently than with the INT_MAX trick, IMHO.

Hmm, this is already pointed by Minchan and I have answered that.

lkml.kernel.org/r/<20170421013243.GA13966@js1304-desktop>

If you have a better idea, please let me know.

Thanks.

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH] mm/page_alloc: don't reserve ZONE_HIGHMEM for ZONE_MOVABLE request
  2017-08-24  9:30   ` Michal Hocko
@ 2017-08-25  0:15     ` Joonsoo Kim
  -1 siblings, 0 replies; 44+ messages in thread
From: Joonsoo Kim @ 2017-08-25  0:15 UTC (permalink / raw)
  To: Michal Hocko
  Cc: Andrew Morton, Vlastimil Babka, Mel Gorman, Johannes Weiner,
	Aneesh Kumar K . V, Minchan Kim, linux-mm, linux-kernel

On Thu, Aug 24, 2017 at 11:30:50AM +0200, Michal Hocko wrote:
> On Thu 24-08-17 14:45:46, Joonsoo Kim wrote:
> > From: Joonsoo Kim <iamjoonsoo.kim@lge.com>
> > 
> > Freepage on ZONE_HIGHMEM doesn't work for kernel memory so it's not that
> > important to reserve. When ZONE_MOVABLE is used, this problem would
> > theorectically cause to decrease usable memory for GFP_HIGHUSER_MOVABLE
> > allocation request which is mainly used for page cache and anon page
> > allocation. So, fix it.
> 
> I do not really understand what is the problem you are trying to fix.
> Yes the memory is reserved for a higher priority consumer and that is
> deliberate AFAICT. Just consider that an OOM victim wants to make
> further progress and rely on memory reserve while doing
> GFP_HIGHUSER_MOVABLE request.
> 
> So what is the real problem you are trying to address here?

If the system has the both, ZONE_HIGHMEM and ZONE_MOVABLE,
ZONE_HIGHMEM will reserve the memory for ZONE_MOVABLE request.
However, they are consumed by nearly equivalent priority consumer who
uses GFP_HIGHMEM + GFP_MOVABLE. In that case, reserved memory in
ZONE_HIGHMEM would not be used and it means just waste of the memory.
This patch try to fix it to nullify reserving memory in ZONE_HIGHMEM.

And, I think that all this problem is caused by the complex code in
lowmem reserve calculation. So, did some clean-up.

Thanks.

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH] mm/page_alloc: don't reserve ZONE_HIGHMEM for ZONE_MOVABLE request
@ 2017-08-25  0:15     ` Joonsoo Kim
  0 siblings, 0 replies; 44+ messages in thread
From: Joonsoo Kim @ 2017-08-25  0:15 UTC (permalink / raw)
  To: Michal Hocko
  Cc: Andrew Morton, Vlastimil Babka, Mel Gorman, Johannes Weiner,
	Aneesh Kumar K . V, Minchan Kim, linux-mm, linux-kernel

On Thu, Aug 24, 2017 at 11:30:50AM +0200, Michal Hocko wrote:
> On Thu 24-08-17 14:45:46, Joonsoo Kim wrote:
> > From: Joonsoo Kim <iamjoonsoo.kim@lge.com>
> > 
> > Freepage on ZONE_HIGHMEM doesn't work for kernel memory so it's not that
> > important to reserve. When ZONE_MOVABLE is used, this problem would
> > theorectically cause to decrease usable memory for GFP_HIGHUSER_MOVABLE
> > allocation request which is mainly used for page cache and anon page
> > allocation. So, fix it.
> 
> I do not really understand what is the problem you are trying to fix.
> Yes the memory is reserved for a higher priority consumer and that is
> deliberate AFAICT. Just consider that an OOM victim wants to make
> further progress and rely on memory reserve while doing
> GFP_HIGHUSER_MOVABLE request.
> 
> So what is the real problem you are trying to address here?

If the system has the both, ZONE_HIGHMEM and ZONE_MOVABLE,
ZONE_HIGHMEM will reserve the memory for ZONE_MOVABLE request.
However, they are consumed by nearly equivalent priority consumer who
uses GFP_HIGHMEM + GFP_MOVABLE. In that case, reserved memory in
ZONE_HIGHMEM would not be used and it means just waste of the memory.
This patch try to fix it to nullify reserving memory in ZONE_HIGHMEM.

And, I think that all this problem is caused by the complex code in
lowmem reserve calculation. So, did some clean-up.

Thanks.

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH] mm/page_alloc: don't reserve ZONE_HIGHMEM for ZONE_MOVABLE request
  2017-08-24  5:45 ` js1304
@ 2017-08-24  9:41   ` Vlastimil Babka
  -1 siblings, 0 replies; 44+ messages in thread
From: Vlastimil Babka @ 2017-08-24  9:41 UTC (permalink / raw)
  To: js1304, Andrew Morton
  Cc: Michal Hocko, Mel Gorman, Johannes Weiner, Aneesh Kumar K . V,
	Minchan Kim, linux-mm, linux-kernel, Joonsoo Kim

On 08/24/2017 07:45 AM, js1304@gmail.com wrote:
> From: Joonsoo Kim <iamjoonsoo.kim@lge.com>
> 
> Freepage on ZONE_HIGHMEM doesn't work for kernel memory so it's not that
> important to reserve. When ZONE_MOVABLE is used, this problem would
> theorectically cause to decrease usable memory for GFP_HIGHUSER_MOVABLE
> allocation request which is mainly used for page cache and anon page
> allocation. So, fix it.
> 
> And, defining sysctl_lowmem_reserve_ratio array by MAX_NR_ZONES - 1 size
> makes code complex. For example, if there is highmem system, following
> reserve ratio is activated for *NORMAL ZONE* which would be easyily
> misleading people.
> 
>  #ifdef CONFIG_HIGHMEM
>  32
>  #endif
> 
> This patch also fix this situation by defining sysctl_lowmem_reserve_ratio
> array by MAX_NR_ZONES and place "#ifdef" to right place.
> 
> Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
> Acked-by: Vlastimil Babka <vbabka@suse.cz>

Looks like I did that almost year ago, so definitely had to refresh my
memory now :)

Anyway now I looked more thoroughly and noticed that this change leaks
into the reported sysctl. On a 64bit system with ZONE_MOVABLE:

before the patch:
vm.lowmem_reserve_ratio = 256   256     32

after the patch:
vm.lowmem_reserve_ratio = 256   256     32      2147483647

So if we indeed remove HIGHMEM from protection (c.f. Michal's mail), we
should do that differently than with the INT_MAX trick, IMHO.

> Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com>
> ---
>  include/linux/mmzone.h |  2 +-
>  mm/page_alloc.c        | 11 ++++++-----
>  2 files changed, 7 insertions(+), 6 deletions(-)
> 
> diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
> index e7e92c8..e5f134b 100644
> --- a/include/linux/mmzone.h
> +++ b/include/linux/mmzone.h
> @@ -882,7 +882,7 @@ int min_free_kbytes_sysctl_handler(struct ctl_table *, int,
>  					void __user *, size_t *, loff_t *);
>  int watermark_scale_factor_sysctl_handler(struct ctl_table *, int,
>  					void __user *, size_t *, loff_t *);
> -extern int sysctl_lowmem_reserve_ratio[MAX_NR_ZONES-1];
> +extern int sysctl_lowmem_reserve_ratio[MAX_NR_ZONES];
>  int lowmem_reserve_ratio_sysctl_handler(struct ctl_table *, int,
>  					void __user *, size_t *, loff_t *);
>  int percpu_pagelist_fraction_sysctl_handler(struct ctl_table *, int,
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index 90b1996..6faa53d 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -202,17 +202,18 @@ static void __free_pages_ok(struct page *page, unsigned int order);
>   * TBD: should special case ZONE_DMA32 machines here - in those we normally
>   * don't need any ZONE_NORMAL reservation
>   */
> -int sysctl_lowmem_reserve_ratio[MAX_NR_ZONES-1] = {
> +int sysctl_lowmem_reserve_ratio[MAX_NR_ZONES] = {
>  #ifdef CONFIG_ZONE_DMA
> -	 256,
> +	[ZONE_DMA] = 256,
>  #endif
>  #ifdef CONFIG_ZONE_DMA32
> -	 256,
> +	[ZONE_DMA32] = 256,
>  #endif
> +	[ZONE_NORMAL] = 32,
>  #ifdef CONFIG_HIGHMEM
> -	 32,
> +	[ZONE_HIGHMEM] = INT_MAX,
>  #endif
> -	 32,
> +	[ZONE_MOVABLE] = INT_MAX,
>  };
>  
>  EXPORT_SYMBOL(totalram_pages);
> 

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH] mm/page_alloc: don't reserve ZONE_HIGHMEM for ZONE_MOVABLE request
@ 2017-08-24  9:41   ` Vlastimil Babka
  0 siblings, 0 replies; 44+ messages in thread
From: Vlastimil Babka @ 2017-08-24  9:41 UTC (permalink / raw)
  To: js1304, Andrew Morton
  Cc: Michal Hocko, Mel Gorman, Johannes Weiner, Aneesh Kumar K . V,
	Minchan Kim, linux-mm, linux-kernel, Joonsoo Kim

On 08/24/2017 07:45 AM, js1304@gmail.com wrote:
> From: Joonsoo Kim <iamjoonsoo.kim@lge.com>
> 
> Freepage on ZONE_HIGHMEM doesn't work for kernel memory so it's not that
> important to reserve. When ZONE_MOVABLE is used, this problem would
> theorectically cause to decrease usable memory for GFP_HIGHUSER_MOVABLE
> allocation request which is mainly used for page cache and anon page
> allocation. So, fix it.
> 
> And, defining sysctl_lowmem_reserve_ratio array by MAX_NR_ZONES - 1 size
> makes code complex. For example, if there is highmem system, following
> reserve ratio is activated for *NORMAL ZONE* which would be easyily
> misleading people.
> 
>  #ifdef CONFIG_HIGHMEM
>  32
>  #endif
> 
> This patch also fix this situation by defining sysctl_lowmem_reserve_ratio
> array by MAX_NR_ZONES and place "#ifdef" to right place.
> 
> Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
> Acked-by: Vlastimil Babka <vbabka@suse.cz>

Looks like I did that almost year ago, so definitely had to refresh my
memory now :)

Anyway now I looked more thoroughly and noticed that this change leaks
into the reported sysctl. On a 64bit system with ZONE_MOVABLE:

before the patch:
vm.lowmem_reserve_ratio = 256   256     32

after the patch:
vm.lowmem_reserve_ratio = 256   256     32      2147483647

So if we indeed remove HIGHMEM from protection (c.f. Michal's mail), we
should do that differently than with the INT_MAX trick, IMHO.

> Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com>
> ---
>  include/linux/mmzone.h |  2 +-
>  mm/page_alloc.c        | 11 ++++++-----
>  2 files changed, 7 insertions(+), 6 deletions(-)
> 
> diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
> index e7e92c8..e5f134b 100644
> --- a/include/linux/mmzone.h
> +++ b/include/linux/mmzone.h
> @@ -882,7 +882,7 @@ int min_free_kbytes_sysctl_handler(struct ctl_table *, int,
>  					void __user *, size_t *, loff_t *);
>  int watermark_scale_factor_sysctl_handler(struct ctl_table *, int,
>  					void __user *, size_t *, loff_t *);
> -extern int sysctl_lowmem_reserve_ratio[MAX_NR_ZONES-1];
> +extern int sysctl_lowmem_reserve_ratio[MAX_NR_ZONES];
>  int lowmem_reserve_ratio_sysctl_handler(struct ctl_table *, int,
>  					void __user *, size_t *, loff_t *);
>  int percpu_pagelist_fraction_sysctl_handler(struct ctl_table *, int,
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index 90b1996..6faa53d 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -202,17 +202,18 @@ static void __free_pages_ok(struct page *page, unsigned int order);
>   * TBD: should special case ZONE_DMA32 machines here - in those we normally
>   * don't need any ZONE_NORMAL reservation
>   */
> -int sysctl_lowmem_reserve_ratio[MAX_NR_ZONES-1] = {
> +int sysctl_lowmem_reserve_ratio[MAX_NR_ZONES] = {
>  #ifdef CONFIG_ZONE_DMA
> -	 256,
> +	[ZONE_DMA] = 256,
>  #endif
>  #ifdef CONFIG_ZONE_DMA32
> -	 256,
> +	[ZONE_DMA32] = 256,
>  #endif
> +	[ZONE_NORMAL] = 32,
>  #ifdef CONFIG_HIGHMEM
> -	 32,
> +	[ZONE_HIGHMEM] = INT_MAX,
>  #endif
> -	 32,
> +	[ZONE_MOVABLE] = INT_MAX,
>  };
>  
>  EXPORT_SYMBOL(totalram_pages);
> 

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH] mm/page_alloc: don't reserve ZONE_HIGHMEM for ZONE_MOVABLE request
  2017-08-24  5:45 ` js1304
@ 2017-08-24  9:30   ` Michal Hocko
  -1 siblings, 0 replies; 44+ messages in thread
From: Michal Hocko @ 2017-08-24  9:30 UTC (permalink / raw)
  To: js1304
  Cc: Andrew Morton, Vlastimil Babka, Mel Gorman, Johannes Weiner,
	Aneesh Kumar K . V, Minchan Kim, linux-mm, linux-kernel,
	Joonsoo Kim

On Thu 24-08-17 14:45:46, Joonsoo Kim wrote:
> From: Joonsoo Kim <iamjoonsoo.kim@lge.com>
> 
> Freepage on ZONE_HIGHMEM doesn't work for kernel memory so it's not that
> important to reserve. When ZONE_MOVABLE is used, this problem would
> theorectically cause to decrease usable memory for GFP_HIGHUSER_MOVABLE
> allocation request which is mainly used for page cache and anon page
> allocation. So, fix it.

I do not really understand what is the problem you are trying to fix.
Yes the memory is reserved for a higher priority consumer and that is
deliberate AFAICT. Just consider that an OOM victim wants to make
further progress and rely on memory reserve while doing
GFP_HIGHUSER_MOVABLE request.

So what is the real problem you are trying to address here?

> And, defining sysctl_lowmem_reserve_ratio array by MAX_NR_ZONES - 1 size
> makes code complex. For example, if there is highmem system, following
> reserve ratio is activated for *NORMAL ZONE* which would be easyily
> misleading people.
> 
>  #ifdef CONFIG_HIGHMEM
>  32
>  #endif
> 
> This patch also fix this situation by defining sysctl_lowmem_reserve_ratio
> array by MAX_NR_ZONES and place "#ifdef" to right place.
> 
> Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
> Acked-by: Vlastimil Babka <vbabka@suse.cz>
> Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com>
> ---
>  include/linux/mmzone.h |  2 +-
>  mm/page_alloc.c        | 11 ++++++-----
>  2 files changed, 7 insertions(+), 6 deletions(-)
> 
> diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
> index e7e92c8..e5f134b 100644
> --- a/include/linux/mmzone.h
> +++ b/include/linux/mmzone.h
> @@ -882,7 +882,7 @@ int min_free_kbytes_sysctl_handler(struct ctl_table *, int,
>  					void __user *, size_t *, loff_t *);
>  int watermark_scale_factor_sysctl_handler(struct ctl_table *, int,
>  					void __user *, size_t *, loff_t *);
> -extern int sysctl_lowmem_reserve_ratio[MAX_NR_ZONES-1];
> +extern int sysctl_lowmem_reserve_ratio[MAX_NR_ZONES];
>  int lowmem_reserve_ratio_sysctl_handler(struct ctl_table *, int,
>  					void __user *, size_t *, loff_t *);
>  int percpu_pagelist_fraction_sysctl_handler(struct ctl_table *, int,
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index 90b1996..6faa53d 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -202,17 +202,18 @@ static void __free_pages_ok(struct page *page, unsigned int order);
>   * TBD: should special case ZONE_DMA32 machines here - in those we normally
>   * don't need any ZONE_NORMAL reservation
>   */
> -int sysctl_lowmem_reserve_ratio[MAX_NR_ZONES-1] = {
> +int sysctl_lowmem_reserve_ratio[MAX_NR_ZONES] = {
>  #ifdef CONFIG_ZONE_DMA
> -	 256,
> +	[ZONE_DMA] = 256,
>  #endif
>  #ifdef CONFIG_ZONE_DMA32
> -	 256,
> +	[ZONE_DMA32] = 256,
>  #endif
> +	[ZONE_NORMAL] = 32,
>  #ifdef CONFIG_HIGHMEM
> -	 32,
> +	[ZONE_HIGHMEM] = INT_MAX,
>  #endif
> -	 32,
> +	[ZONE_MOVABLE] = INT_MAX,
>  };
>  
>  EXPORT_SYMBOL(totalram_pages);
> -- 
> 2.7.4
> 

-- 
Michal Hocko
SUSE Labs

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH] mm/page_alloc: don't reserve ZONE_HIGHMEM for ZONE_MOVABLE request
@ 2017-08-24  9:30   ` Michal Hocko
  0 siblings, 0 replies; 44+ messages in thread
From: Michal Hocko @ 2017-08-24  9:30 UTC (permalink / raw)
  To: js1304
  Cc: Andrew Morton, Vlastimil Babka, Mel Gorman, Johannes Weiner,
	Aneesh Kumar K . V, Minchan Kim, linux-mm, linux-kernel,
	Joonsoo Kim

On Thu 24-08-17 14:45:46, Joonsoo Kim wrote:
> From: Joonsoo Kim <iamjoonsoo.kim@lge.com>
> 
> Freepage on ZONE_HIGHMEM doesn't work for kernel memory so it's not that
> important to reserve. When ZONE_MOVABLE is used, this problem would
> theorectically cause to decrease usable memory for GFP_HIGHUSER_MOVABLE
> allocation request which is mainly used for page cache and anon page
> allocation. So, fix it.

I do not really understand what is the problem you are trying to fix.
Yes the memory is reserved for a higher priority consumer and that is
deliberate AFAICT. Just consider that an OOM victim wants to make
further progress and rely on memory reserve while doing
GFP_HIGHUSER_MOVABLE request.

So what is the real problem you are trying to address here?

> And, defining sysctl_lowmem_reserve_ratio array by MAX_NR_ZONES - 1 size
> makes code complex. For example, if there is highmem system, following
> reserve ratio is activated for *NORMAL ZONE* which would be easyily
> misleading people.
> 
>  #ifdef CONFIG_HIGHMEM
>  32
>  #endif
> 
> This patch also fix this situation by defining sysctl_lowmem_reserve_ratio
> array by MAX_NR_ZONES and place "#ifdef" to right place.
> 
> Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
> Acked-by: Vlastimil Babka <vbabka@suse.cz>
> Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com>
> ---
>  include/linux/mmzone.h |  2 +-
>  mm/page_alloc.c        | 11 ++++++-----
>  2 files changed, 7 insertions(+), 6 deletions(-)
> 
> diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
> index e7e92c8..e5f134b 100644
> --- a/include/linux/mmzone.h
> +++ b/include/linux/mmzone.h
> @@ -882,7 +882,7 @@ int min_free_kbytes_sysctl_handler(struct ctl_table *, int,
>  					void __user *, size_t *, loff_t *);
>  int watermark_scale_factor_sysctl_handler(struct ctl_table *, int,
>  					void __user *, size_t *, loff_t *);
> -extern int sysctl_lowmem_reserve_ratio[MAX_NR_ZONES-1];
> +extern int sysctl_lowmem_reserve_ratio[MAX_NR_ZONES];
>  int lowmem_reserve_ratio_sysctl_handler(struct ctl_table *, int,
>  					void __user *, size_t *, loff_t *);
>  int percpu_pagelist_fraction_sysctl_handler(struct ctl_table *, int,
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index 90b1996..6faa53d 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -202,17 +202,18 @@ static void __free_pages_ok(struct page *page, unsigned int order);
>   * TBD: should special case ZONE_DMA32 machines here - in those we normally
>   * don't need any ZONE_NORMAL reservation
>   */
> -int sysctl_lowmem_reserve_ratio[MAX_NR_ZONES-1] = {
> +int sysctl_lowmem_reserve_ratio[MAX_NR_ZONES] = {
>  #ifdef CONFIG_ZONE_DMA
> -	 256,
> +	[ZONE_DMA] = 256,
>  #endif
>  #ifdef CONFIG_ZONE_DMA32
> -	 256,
> +	[ZONE_DMA32] = 256,
>  #endif
> +	[ZONE_NORMAL] = 32,
>  #ifdef CONFIG_HIGHMEM
> -	 32,
> +	[ZONE_HIGHMEM] = INT_MAX,
>  #endif
> -	 32,
> +	[ZONE_MOVABLE] = INT_MAX,
>  };
>  
>  EXPORT_SYMBOL(totalram_pages);
> -- 
> 2.7.4
> 

-- 
Michal Hocko
SUSE Labs

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 44+ messages in thread

* [PATCH] mm/page_alloc: don't reserve ZONE_HIGHMEM for ZONE_MOVABLE request
@ 2017-08-24  5:45 ` js1304
  0 siblings, 0 replies; 44+ messages in thread
From: js1304 @ 2017-08-24  5:45 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Michal Hocko, Vlastimil Babka, Mel Gorman, Johannes Weiner,
	Aneesh Kumar K . V, Minchan Kim, linux-mm, linux-kernel,
	Joonsoo Kim

From: Joonsoo Kim <iamjoonsoo.kim@lge.com>

Freepage on ZONE_HIGHMEM doesn't work for kernel memory so it's not that
important to reserve. When ZONE_MOVABLE is used, this problem would
theorectically cause to decrease usable memory for GFP_HIGHUSER_MOVABLE
allocation request which is mainly used for page cache and anon page
allocation. So, fix it.

And, defining sysctl_lowmem_reserve_ratio array by MAX_NR_ZONES - 1 size
makes code complex. For example, if there is highmem system, following
reserve ratio is activated for *NORMAL ZONE* which would be easyily
misleading people.

 #ifdef CONFIG_HIGHMEM
 32
 #endif

This patch also fix this situation by defining sysctl_lowmem_reserve_ratio
array by MAX_NR_ZONES and place "#ifdef" to right place.

Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com>
---
 include/linux/mmzone.h |  2 +-
 mm/page_alloc.c        | 11 ++++++-----
 2 files changed, 7 insertions(+), 6 deletions(-)

diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
index e7e92c8..e5f134b 100644
--- a/include/linux/mmzone.h
+++ b/include/linux/mmzone.h
@@ -882,7 +882,7 @@ int min_free_kbytes_sysctl_handler(struct ctl_table *, int,
 					void __user *, size_t *, loff_t *);
 int watermark_scale_factor_sysctl_handler(struct ctl_table *, int,
 					void __user *, size_t *, loff_t *);
-extern int sysctl_lowmem_reserve_ratio[MAX_NR_ZONES-1];
+extern int sysctl_lowmem_reserve_ratio[MAX_NR_ZONES];
 int lowmem_reserve_ratio_sysctl_handler(struct ctl_table *, int,
 					void __user *, size_t *, loff_t *);
 int percpu_pagelist_fraction_sysctl_handler(struct ctl_table *, int,
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 90b1996..6faa53d 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -202,17 +202,18 @@ static void __free_pages_ok(struct page *page, unsigned int order);
  * TBD: should special case ZONE_DMA32 machines here - in those we normally
  * don't need any ZONE_NORMAL reservation
  */
-int sysctl_lowmem_reserve_ratio[MAX_NR_ZONES-1] = {
+int sysctl_lowmem_reserve_ratio[MAX_NR_ZONES] = {
 #ifdef CONFIG_ZONE_DMA
-	 256,
+	[ZONE_DMA] = 256,
 #endif
 #ifdef CONFIG_ZONE_DMA32
-	 256,
+	[ZONE_DMA32] = 256,
 #endif
+	[ZONE_NORMAL] = 32,
 #ifdef CONFIG_HIGHMEM
-	 32,
+	[ZONE_HIGHMEM] = INT_MAX,
 #endif
-	 32,
+	[ZONE_MOVABLE] = INT_MAX,
 };
 
 EXPORT_SYMBOL(totalram_pages);
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 44+ messages in thread

* [PATCH] mm/page_alloc: don't reserve ZONE_HIGHMEM for ZONE_MOVABLE request
@ 2017-08-24  5:45 ` js1304
  0 siblings, 0 replies; 44+ messages in thread
From: js1304 @ 2017-08-24  5:45 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Michal Hocko, Vlastimil Babka, Mel Gorman, Johannes Weiner,
	Aneesh Kumar K . V, Minchan Kim, linux-mm, linux-kernel,
	Joonsoo Kim

From: Joonsoo Kim <iamjoonsoo.kim@lge.com>

Freepage on ZONE_HIGHMEM doesn't work for kernel memory so it's not that
important to reserve. When ZONE_MOVABLE is used, this problem would
theorectically cause to decrease usable memory for GFP_HIGHUSER_MOVABLE
allocation request which is mainly used for page cache and anon page
allocation. So, fix it.

And, defining sysctl_lowmem_reserve_ratio array by MAX_NR_ZONES - 1 size
makes code complex. For example, if there is highmem system, following
reserve ratio is activated for *NORMAL ZONE* which would be easyily
misleading people.

 #ifdef CONFIG_HIGHMEM
 32
 #endif

This patch also fix this situation by defining sysctl_lowmem_reserve_ratio
array by MAX_NR_ZONES and place "#ifdef" to right place.

Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com>
---
 include/linux/mmzone.h |  2 +-
 mm/page_alloc.c        | 11 ++++++-----
 2 files changed, 7 insertions(+), 6 deletions(-)

diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
index e7e92c8..e5f134b 100644
--- a/include/linux/mmzone.h
+++ b/include/linux/mmzone.h
@@ -882,7 +882,7 @@ int min_free_kbytes_sysctl_handler(struct ctl_table *, int,
 					void __user *, size_t *, loff_t *);
 int watermark_scale_factor_sysctl_handler(struct ctl_table *, int,
 					void __user *, size_t *, loff_t *);
-extern int sysctl_lowmem_reserve_ratio[MAX_NR_ZONES-1];
+extern int sysctl_lowmem_reserve_ratio[MAX_NR_ZONES];
 int lowmem_reserve_ratio_sysctl_handler(struct ctl_table *, int,
 					void __user *, size_t *, loff_t *);
 int percpu_pagelist_fraction_sysctl_handler(struct ctl_table *, int,
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 90b1996..6faa53d 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -202,17 +202,18 @@ static void __free_pages_ok(struct page *page, unsigned int order);
  * TBD: should special case ZONE_DMA32 machines here - in those we normally
  * don't need any ZONE_NORMAL reservation
  */
-int sysctl_lowmem_reserve_ratio[MAX_NR_ZONES-1] = {
+int sysctl_lowmem_reserve_ratio[MAX_NR_ZONES] = {
 #ifdef CONFIG_ZONE_DMA
-	 256,
+	[ZONE_DMA] = 256,
 #endif
 #ifdef CONFIG_ZONE_DMA32
-	 256,
+	[ZONE_DMA32] = 256,
 #endif
+	[ZONE_NORMAL] = 32,
 #ifdef CONFIG_HIGHMEM
-	 32,
+	[ZONE_HIGHMEM] = INT_MAX,
 #endif
-	 32,
+	[ZONE_MOVABLE] = INT_MAX,
 };
 
 EXPORT_SYMBOL(totalram_pages);
-- 
2.7.4

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply related	[flat|nested] 44+ messages in thread

end of thread, other threads:[~2018-04-12 12:01 UTC | newest]

Thread overview: 44+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-09-06  4:35 [PATCH] mm/page_alloc: don't reserve ZONE_HIGHMEM for ZONE_MOVABLE request js1304
2017-09-06  4:35 ` js1304
2017-09-06  7:54 ` Vlastimil Babka
2017-09-06  7:54   ` Vlastimil Babka
2017-09-14 13:24 ` Michal Hocko
2017-09-14 13:24   ` Michal Hocko
2018-04-04  0:24   ` Joonsoo Kim
2018-04-12 12:01     ` Michal Hocko
  -- strict thread matches above, loose matches on Subject: below --
2017-08-24  5:45 js1304
2017-08-24  5:45 ` js1304
2017-08-24  9:30 ` Michal Hocko
2017-08-24  9:30   ` Michal Hocko
2017-08-25  0:15   ` Joonsoo Kim
2017-08-25  0:15     ` Joonsoo Kim
2017-08-25  7:33     ` Michal Hocko
2017-08-25  7:33       ` Michal Hocko
2017-08-24  9:41 ` Vlastimil Babka
2017-08-24  9:41   ` Vlastimil Babka
2017-08-25  0:20   ` Joonsoo Kim
2017-08-25  0:20     ` Joonsoo Kim
2017-08-25  7:38     ` Michal Hocko
2017-08-25  7:38       ` Michal Hocko
2017-08-28  0:15       ` Joonsoo Kim
2017-08-28  0:15         ` Joonsoo Kim
2017-08-28  9:56         ` Michal Hocko
2017-08-28  9:56           ` Michal Hocko
2017-08-29  0:45           ` Joonsoo Kim
2017-08-29  0:45             ` Joonsoo Kim
2017-08-29 13:39             ` Michal Hocko
2017-08-29 13:39               ` Michal Hocko
2017-08-31  1:45               ` Joonsoo Kim
2017-08-31  1:45                 ` Joonsoo Kim
2017-08-25  7:56     ` Vlastimil Babka
2017-08-25  7:56       ` Vlastimil Babka
2017-08-28  0:28       ` Joonsoo Kim
2017-08-28  0:28         ` Joonsoo Kim
2017-08-28  6:45         ` Vlastimil Babka
2017-08-28  6:45           ` Vlastimil Babka
2017-08-28  6:45           ` Vlastimil Babka
2017-08-29  0:36           ` Joonsoo Kim
2017-08-29  0:36             ` Joonsoo Kim
2017-08-29  0:36             ` Joonsoo Kim
2017-08-29  7:00             ` Vlastimil Babka
2017-08-29  7:00               ` Vlastimil Babka

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.