[PATCH 1/2] mm/slub: wake up kswapd for initial high order allocation

* [PATCH 1/2] mm/slub: wake up kswapd for initial high order allocation
@ 2017-08-28  1:11 ` js1304
  0 siblings, 0 replies; 20+ messages in thread
From: js1304 @ 2017-08-28  1:11 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Christoph Lameter, Pekka Enberg, David Rientjes, Joonsoo Kim,
	linux-mm, linux-kernel, Mel Gorman, Vlastimil Babka

From: Joonsoo Kim <iamjoonsoo.kim@lge.com>

slub first tries a higher-order allocation than it actually needs. For
this attempt we don't want to do direct reclaim to produce such a
high-order page, since that causes a big latency for the user. Instead,
we would like to fall back to the lower-order allocation that it
actually needs.

However, we also want to get the higher-order page next time in order
to get the best performance, and providing it is the role of background
threads such as kswapd and kcompactd. To wake them up, we should not
clear __GFP_KSWAPD_RECLAIM.

Contrary to this intention, the current code clears
__GFP_KSWAPD_RECLAIM, so fix it.

Note that this patch does some cleanup, too: __GFP_NOFAIL is cleared
twice, so remove one of the two.

Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com>
---
 mm/slub.c | 8 ++++++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/mm/slub.c b/mm/slub.c
index 0dc7397..e1e442c 100644
--- a/mm/slub.c
+++ b/mm/slub.c
@@ -1578,8 +1578,12 @@ static struct page *allocate_slab(struct kmem_cache *s, gfp_t flags, int node)
 	 * so we fall-back to the minimum order allocation.
 	 */
 	alloc_gfp = (flags | __GFP_NOWARN | __GFP_NORETRY) & ~__GFP_NOFAIL;
-	if ((alloc_gfp & __GFP_DIRECT_RECLAIM) && oo_order(oo) > oo_order(s->min))
-		alloc_gfp = (alloc_gfp | __GFP_NOMEMALLOC) & ~(__GFP_RECLAIM|__GFP_NOFAIL);
+	if (oo_order(oo) > oo_order(s->min)) {
+		if (alloc_gfp & __GFP_DIRECT_RECLAIM) {
+			alloc_gfp |= __GFP_NOMEMALLOC;
+			alloc_gfp &= ~__GFP_DIRECT_RECLAIM;
+		}
+	}

 	page = alloc_slab_page(s, alloc_gfp, node, oo);
 	if (unlikely(!page)) {
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH 2/2] mm/slub: don't use reserved highatomic pageblock for optimistic try
  2017-08-28  1:11 ` js1304
@ 2017-08-28  1:11   ` js1304
  -1 siblings, 0 replies; 20+ messages in thread
From: js1304 @ 2017-08-28  1:11 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Christoph Lameter, Pekka Enberg, David Rientjes, Joonsoo Kim,
	linux-mm, linux-kernel, Mel Gorman, Vlastimil Babka

From: Joonsoo Kim <iamjoonsoo.kim@lge.com>

A high-order atomic allocation is unlikely to succeed since we cannot
reclaim anything in this context. So, we reserve a pageblock for this
kind of request.

In slub, we try to allocate a higher-order page than is actually needed
in order to get the best performance. If this optimistic try is made
with GFP_ATOMIC, alloc_flags will include ALLOC_HARDER and the
pageblock reserved for high-order atomic allocations will be used.
Moreover, if it succeeds, this request will reserve a
MIGRATE_HIGHATOMIC pageblock to prepare for further requests. Using a
MIGRATE_HIGHATOMIC pageblock here is not good for fragmentation
management, since unreserving the pageblock unconditionally sets its
migratetype to the request's migratetype without considering the
migratetype of the used pages in the pageblock.

This is not what we intend, so fix it by unconditionally setting
__GFP_NOMEMALLOC in order to not set ALLOC_HARDER.

Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com>
---
 mm/slub.c | 6 ++----
 1 file changed, 2 insertions(+), 4 deletions(-)

diff --git a/mm/slub.c b/mm/slub.c
index e1e442c..fd8dd89 100644
--- a/mm/slub.c
+++ b/mm/slub.c
@@ -1579,10 +1579,8 @@ static struct page *allocate_slab(struct kmem_cache *s, gfp_t flags, int node)
 	 */
 	alloc_gfp = (flags | __GFP_NOWARN | __GFP_NORETRY) & ~__GFP_NOFAIL;
 	if (oo_order(oo) > oo_order(s->min)) {
-		if (alloc_gfp & __GFP_DIRECT_RECLAIM) {
-			alloc_gfp |= __GFP_NOMEMALLOC;
-			alloc_gfp &= ~__GFP_DIRECT_RECLAIM;
-		}
+		alloc_gfp |= __GFP_NOMEMALLOC;
+		alloc_gfp &= ~__GFP_DIRECT_RECLAIM;
 	}

 	page = alloc_slab_page(s, alloc_gfp, node, oo);
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 20+ messages in thread

* Re: [PATCH 1/2] mm/slub: wake up kswapd for initial high order allocation
  2017-08-28  1:11 ` js1304
@ 2017-08-28 10:04   ` Vlastimil Babka
  -1 siblings, 0 replies; 20+ messages in thread
From: Vlastimil Babka @ 2017-08-28 10:04 UTC (permalink / raw)
  To: js1304, Andrew Morton
  Cc: Christoph Lameter, Pekka Enberg, David Rientjes, Joonsoo Kim,
	linux-mm, linux-kernel, Mel Gorman

On 08/28/2017 03:11 AM, js1304@gmail.com wrote:
> From: Joonsoo Kim <iamjoonsoo.kim@lge.com>
> 
> slub uses higher order allocation than it actually needs. In this case,
> we don't want to do direct reclaim to make such a high order page since
> it causes a big latency to the user. Instead, we would like to fallback
> lower order allocation that it actually needs.
> 
> However, we also want to get this higher order page in the next time
> in order to get the best performance and it would be a role of
> the background thread like as kswapd and kcompactd. To wake up them,
> we should not clear __GFP_KSWAPD_RECLAIM.
> 
> Unlike this intention, current code clears __GFP_KSWAPD_RECLAIM so fix it.
> 
> Note that this patch does some clean up, too.
> __GFP_NOFAIL is cleared twice so remove one.
> 
> Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com>

Hm, so this seems to revert Mel's 444eb2a449ef ("mm: thp: set THP defrag
by default to madvise and add a stall-free defrag option") wrt the slub
allocate_slab() part. AFAICS the intention in Mel's patch was that he
removed a special case in __alloc_pages_slowpath() where including
__GFP_THISNODE and lacking __GFP_DIRECT_RECLAIM effectively means also
lacking __GFP_KSWAPD_RECLAIM. The commit log claims that slab/slub might
change behavior so he moved the removal of __GFP_KSWAPD_RECLAIM to them.

But AFAICS, only slab uses __GFP_THISNODE, while slub doesn't. So your
patch would indeed revert an unintentional change of Mel's commit. Is
that right, or am I missing something?

> ---
>  mm/slub.c | 8 ++++++--
>  1 file changed, 6 insertions(+), 2 deletions(-)
> 
> diff --git a/mm/slub.c b/mm/slub.c
> index 0dc7397..e1e442c 100644
> --- a/mm/slub.c
> +++ b/mm/slub.c
> @@ -1578,8 +1578,12 @@ static struct page *allocate_slab(struct kmem_cache *s, gfp_t flags, int node)
>  	 * so we fall-back to the minimum order allocation.
>  	 */
>  	alloc_gfp = (flags | __GFP_NOWARN | __GFP_NORETRY) & ~__GFP_NOFAIL;
> -	if ((alloc_gfp & __GFP_DIRECT_RECLAIM) && oo_order(oo) > oo_order(s->min))
> -		alloc_gfp = (alloc_gfp | __GFP_NOMEMALLOC) & ~(__GFP_RECLAIM|__GFP_NOFAIL);
> +	if (oo_order(oo) > oo_order(s->min)) {
> +		if (alloc_gfp & __GFP_DIRECT_RECLAIM) {
> +			alloc_gfp |= __GFP_NOMEMALLOC;
> +			alloc_gfp &= ~__GFP_DIRECT_RECLAIM;
> +		}
> +	}
>  
>  	page = alloc_slab_page(s, alloc_gfp, node, oo);
>  	if (unlikely(!page)) {
> 

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH 2/2] mm/slub: don't use reserved highatomic pageblock for optimistic try
  2017-08-28  1:11   ` js1304
@ 2017-08-28 11:29     ` Vlastimil Babka
  -1 siblings, 0 replies; 20+ messages in thread
From: Vlastimil Babka @ 2017-08-28 11:29 UTC (permalink / raw)
  To: js1304, Andrew Morton
  Cc: Christoph Lameter, Pekka Enberg, David Rientjes, Joonsoo Kim,
	linux-mm, linux-kernel, Mel Gorman, Michal Hocko

On 08/28/2017 03:11 AM, js1304@gmail.com wrote:
> From: Joonsoo Kim <iamjoonsoo.kim@lge.com>
> 
> High-order atomic allocation is difficult to succeed since we cannot
> reclaim anything in this context. So, we reserves the pageblock for
> this kind of request.
> 
> In slub, we try to allocate higher-order page more than it actually
> needs in order to get the best performance. If this optimistic try is
> used with GFP_ATOMIC, alloc_flags will be set as ALLOC_HARDER and
> the pageblock reserved for high-order atomic allocation would be used.
> Moreover, this request would reserve the MIGRATE_HIGHATOMIC pageblock
> ,if succeed, to prepare further request. It would not be good to use
> MIGRATE_HIGHATOMIC pageblock in terms of fragmentation management
> since it unconditionally set a migratetype to request's migratetype
> when unreserving the pageblock without considering the migratetype of
> used pages in the pageblock.
> 
> This is not what we don't intend so fix it by unconditionally setting
> __GFP_NOMEMALLOC in order to not set ALLOC_HARDER.

I wonder if it would be more robust to strip GFP_ATOMIC from alloc_gfp.
E.g. __GFP_NOMEMALLOC does seem to prevent ALLOC_HARDER, but not
ALLOC_HIGH. Or maybe we should adjust __GFP_NOMEMALLOC implementation
and document it more thoroughly? CC Michal Hocko

Also, were these 2 patches done via code inspection, or did you notice
suboptimal behavior which got fixed? Thanks.

> Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com>
> ---
>  mm/slub.c | 6 ++----
>  1 file changed, 2 insertions(+), 4 deletions(-)
> 
> diff --git a/mm/slub.c b/mm/slub.c
> index e1e442c..fd8dd89 100644
> --- a/mm/slub.c
> +++ b/mm/slub.c
> @@ -1579,10 +1579,8 @@ static struct page *allocate_slab(struct kmem_cache *s, gfp_t flags, int node)
>  	 */
>  	alloc_gfp = (flags | __GFP_NOWARN | __GFP_NORETRY) & ~__GFP_NOFAIL;
>  	if (oo_order(oo) > oo_order(s->min)) {
> -		if (alloc_gfp & __GFP_DIRECT_RECLAIM) {
> -			alloc_gfp |= __GFP_NOMEMALLOC;
> -			alloc_gfp &= ~__GFP_DIRECT_RECLAIM;
> -		}
> +		alloc_gfp |= __GFP_NOMEMALLOC;
> +		alloc_gfp &= ~__GFP_DIRECT_RECLAIM;
>  	}
>  
>  	page = alloc_slab_page(s, alloc_gfp, node, oo);
> 

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH 2/2] mm/slub: don't use reserved highatomic pageblock for optimistic try
  2017-08-28 11:29     ` Vlastimil Babka
@ 2017-08-28 13:08       ` Michal Hocko
  -1 siblings, 0 replies; 20+ messages in thread
From: Michal Hocko @ 2017-08-28 13:08 UTC (permalink / raw)
  To: Vlastimil Babka
  Cc: js1304, Andrew Morton, Christoph Lameter, Pekka Enberg,
	David Rientjes, Joonsoo Kim, linux-mm, linux-kernel, Mel Gorman

On Mon 28-08-17 13:29:29, Vlastimil Babka wrote:
> On 08/28/2017 03:11 AM, js1304@gmail.com wrote:
> > From: Joonsoo Kim <iamjoonsoo.kim@lge.com>
> > 
> > High-order atomic allocation is difficult to succeed since we cannot
> > reclaim anything in this context. So, we reserves the pageblock for
> > this kind of request.
> > 
> > In slub, we try to allocate higher-order page more than it actually
> > needs in order to get the best performance. If this optimistic try is
> > used with GFP_ATOMIC, alloc_flags will be set as ALLOC_HARDER and
> > the pageblock reserved for high-order atomic allocation would be used.
> > Moreover, this request would reserve the MIGRATE_HIGHATOMIC pageblock
> > ,if succeed, to prepare further request. It would not be good to use
> > MIGRATE_HIGHATOMIC pageblock in terms of fragmentation management
> > since it unconditionally set a migratetype to request's migratetype
> > when unreserving the pageblock without considering the migratetype of
> > used pages in the pageblock.
> > 
> > This is not what we don't intend so fix it by unconditionally setting
> > __GFP_NOMEMALLOC in order to not set ALLOC_HARDER.
> 
> I wonder if it would be more robust to strip GFP_ATOMIC from alloc_gfp.
> E.g. __GFP_NOMEMALLOC does seem to prevent ALLOC_HARDER, but not
> ALLOC_HIGH. Or maybe we should adjust __GFP_NOMEMALLOC implementation
> and document it more thoroughly? CC Michal Hocko

Yeah, __GFP_NOMEMALLOC is rather inconsistent. It has been added to
override __GFP_MEMALLOC resp. PF_MEMALLOC AFAIK. In this particular
case I would agree that dropping __GFP_HIGH and __GFP_ATOMIC would
be more precise. I am not sure we want to touch the existing semantics of
__GFP_NOMEMALLOC though. This would require auditing all the existing
users (something tells me that quite a few of those will be incorrect...)

> Also, were these 2 patches done via code inspection or you noticed
> suboptimal behavior which got fixed? Thanks.

The patch description is not very clear to me either, but I guess that
Joonsoo sees too many larger-order pages backing slab objects when the
system is not under heavy memory pressure, and that increases internal
fragmentation?

> > Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com>
> > ---
> >  mm/slub.c | 6 ++----
> >  1 file changed, 2 insertions(+), 4 deletions(-)
> > 
> > diff --git a/mm/slub.c b/mm/slub.c
> > index e1e442c..fd8dd89 100644
> > --- a/mm/slub.c
> > +++ b/mm/slub.c
> > @@ -1579,10 +1579,8 @@ static struct page *allocate_slab(struct kmem_cache *s, gfp_t flags, int node)
> >  	 */
> >  	alloc_gfp = (flags | __GFP_NOWARN | __GFP_NORETRY) & ~__GFP_NOFAIL;
> >  	if (oo_order(oo) > oo_order(s->min)) {
> > -		if (alloc_gfp & __GFP_DIRECT_RECLAIM) {
> > -			alloc_gfp |= __GFP_NOMEMALLOC;
> > -			alloc_gfp &= ~__GFP_DIRECT_RECLAIM;
> > -		}
> > +		alloc_gfp |= __GFP_NOMEMALLOC;
> > +		alloc_gfp &= ~__GFP_DIRECT_RECLAIM;
> >  	}
> >  
> >  	page = alloc_slab_page(s, alloc_gfp, node, oo);
> > 

-- 
Michal Hocko
SUSE Labs

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH 1/2] mm/slub: wake up kswapd for initial high order allocation
  2017-08-28 10:04   ` Vlastimil Babka
@ 2017-08-29  0:22     ` Joonsoo Kim
  -1 siblings, 0 replies; 20+ messages in thread
From: Joonsoo Kim @ 2017-08-29  0:22 UTC (permalink / raw)
  To: Vlastimil Babka
  Cc: Andrew Morton, Christoph Lameter, Pekka Enberg, David Rientjes,
	linux-mm, linux-kernel, Mel Gorman

On Mon, Aug 28, 2017 at 12:04:41PM +0200, Vlastimil Babka wrote:
> On 08/28/2017 03:11 AM, js1304@gmail.com wrote:
> > From: Joonsoo Kim <iamjoonsoo.kim@lge.com>
> > 
> > slub uses higher order allocation than it actually needs. In this case,
> > we don't want to do direct reclaim to make such a high order page since
> > it causes a big latency to the user. Instead, we would like to fallback
> > lower order allocation that it actually needs.
> > 
> > However, we also want to get this higher order page in the next time
> > in order to get the best performance and it would be a role of
> > the background thread like as kswapd and kcompactd. To wake up them,
> > we should not clear __GFP_KSWAPD_RECLAIM.
> > 
> > Unlike this intention, current code clears __GFP_KSWAPD_RECLAIM so fix it.
> > 
> > Note that this patch does some clean up, too.
> > __GFP_NOFAIL is cleared twice so remove one.
> > 
> > Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com>
> 
> Hm, so this seems to revert Mel's 444eb2a449ef ("mm: thp: set THP defrag
> by default to madvise and add a stall-free defrag option") wrt the slub
> allocate_slab() part. AFAICS the intention in Mel's patch was that he
> removed a special case in __alloc_page_slowpath() where including
> __GFP_THISNODE and lacking ~__GFP_DIRECT_RECLAIM effectively means also
> lacking __GFP_KSWAPD_RECLAIM. The commit log claims that slab/slub might
> change behavior so he moved the removal of __GFP_KSWAPD_RECLAIM to them.
> 
> But AFAICS, only slab uses __GFP_THISNODE, while slub doesn't. So your
> patch would indeed revert an unintentional change of Mel's commit. Is it
> right or do I miss something?

I didn't look at that patch. What I tried here is just restoring the
original intention of this code. I now realize that Mel did it for a
specific purpose. Thanks for pointing it out.

Anyway, your analysis looks correct, and this change doesn't hurt Mel's
intention and restores the original behaviour of the code. I will add your
analysis to the commit description and resubmit it. Is that okay with you?

Thanks.

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH 1/2] mm/slub: wake up kswapd for initial high order allocation
@ 2017-08-29  0:22     ` Joonsoo Kim
  0 siblings, 0 replies; 20+ messages in thread
From: Joonsoo Kim @ 2017-08-29  0:22 UTC (permalink / raw)
  To: Vlastimil Babka
  Cc: Andrew Morton, Christoph Lameter, Pekka Enberg, David Rientjes,
	linux-mm, linux-kernel, Mel Gorman

On Mon, Aug 28, 2017 at 12:04:41PM +0200, Vlastimil Babka wrote:
> On 08/28/2017 03:11 AM, js1304@gmail.com wrote:
> > From: Joonsoo Kim <iamjoonsoo.kim@lge.com>
> > 
> > slub uses higher order allocation than it actually needs. In this case,
> > we don't want to do direct reclaim to make such a high order page since
> > it causes a big latency to the user. Instead, we would like to fallback
> > lower order allocation that it actually needs.
> > 
> > However, we also want to get this higher order page in the next time
> > in order to get the best performance and it would be a role of
> > the background thread like as kswapd and kcompactd. To wake up them,
> > we should not clear __GFP_KSWAPD_RECLAIM.
> > 
> > Unlike this intention, current code clears __GFP_KSWAPD_RECLAIM so fix it.
> > 
> > Note that this patch does some clean up, too.
> > __GFP_NOFAIL is cleared twice so remove one.
> > 
> > Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com>
> 
> Hm, so this seems to revert Mel's 444eb2a449ef ("mm: thp: set THP defrag
> by default to madvise and add a stall-free defrag option") wrt the slub
> allocate_slab() part. AFAICS the intention in Mel's patch was that he
> removed a special case in __alloc_page_slowpath() where including
> __GFP_THISNODE and lacking ~__GFP_DIRECT_RECLAIM effectively means also
> lacking __GFP_KSWAPD_RECLAIM. The commit log claims that slab/slub might
> change behavior so he moved the removal of __GFP_KSWAPD_RECLAIM to them.
> 
> But AFAICS, only slab uses __GFP_THISNODE, while slub doesn't. So your
> patch would indeed revert an unintentional change of Mel's commit. Is it
> right or do I miss something?

I didn't look at that patch. What I tried here was just to restore the
original intention of this code. I now realize that Mel did it for a
specific purpose. Thanks for pointing it out.

Anyway, your analysis looks correct: this change doesn't hurt Mel's
intention and restores the original behaviour of the code. I will add your
analysis to the commit description and resubmit it. Is that okay with you?
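
For readers following along, the difference between the old and the
patched mask computation in allocate_slab() can be sketched in userspace
roughly as below. This is only an illustration: the X_GFP_* bit values
and the two function names are made up for the example and are not the
kernel's definitions; only the flag arithmetic follows the diff in the
patch.

```c
#include <stdint.h>

typedef uint32_t gfp_t;

/* Hypothetical stand-ins for the kernel flags; bit values are invented. */
#define X_GFP_DIRECT_RECLAIM ((gfp_t)1u << 0)
#define X_GFP_KSWAPD_RECLAIM ((gfp_t)1u << 1)
#define X_GFP_RECLAIM        (X_GFP_DIRECT_RECLAIM | X_GFP_KSWAPD_RECLAIM)
#define X_GFP_NOMEMALLOC     ((gfp_t)1u << 2)
#define X_GFP_NOWARN         ((gfp_t)1u << 3)
#define X_GFP_NORETRY        ((gfp_t)1u << 4)
#define X_GFP_NOFAIL         ((gfp_t)1u << 5)

/* Mask for the optimistic high-order attempt before the patch. */
gfp_t old_optimistic_mask(gfp_t flags, unsigned int oo_order,
                          unsigned int min_order)
{
	gfp_t alloc_gfp = (flags | X_GFP_NOWARN | X_GFP_NORETRY) & ~X_GFP_NOFAIL;

	/* Clears both reclaim bits, so kswapd is never woken. */
	if ((alloc_gfp & X_GFP_DIRECT_RECLAIM) && oo_order > min_order)
		alloc_gfp = (alloc_gfp | X_GFP_NOMEMALLOC) &
			    ~(X_GFP_RECLAIM | X_GFP_NOFAIL);
	return alloc_gfp;
}

/* Mask for the optimistic high-order attempt after the patch. */
gfp_t new_optimistic_mask(gfp_t flags, unsigned int oo_order,
                          unsigned int min_order)
{
	gfp_t alloc_gfp = (flags | X_GFP_NOWARN | X_GFP_NORETRY) & ~X_GFP_NOFAIL;

	if (oo_order > min_order) {
		if (alloc_gfp & X_GFP_DIRECT_RECLAIM) {
			alloc_gfp |= X_GFP_NOMEMALLOC;
			/* Only direct reclaim is dropped; kswapd wakeup stays. */
			alloc_gfp &= ~X_GFP_DIRECT_RECLAIM;
		}
	}
	return alloc_gfp;
}
```

With a caller that passes both reclaim bits and oo_order > min_order, the
old function returns a mask without the kswapd bit, while the new one
keeps it and drops only the direct reclaim bit.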

Thanks.

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH 2/2] mm/slub: don't use reserved highatomic pageblock for optimistic try
  2017-08-28 13:08       ` Michal Hocko
@ 2017-08-29  0:33         ` Joonsoo Kim
  -1 siblings, 0 replies; 20+ messages in thread
From: Joonsoo Kim @ 2017-08-29  0:33 UTC (permalink / raw)
  To: Michal Hocko
  Cc: Vlastimil Babka, Andrew Morton, Christoph Lameter, Pekka Enberg,
	David Rientjes, linux-mm, linux-kernel, Mel Gorman

On Mon, Aug 28, 2017 at 03:08:29PM +0200, Michal Hocko wrote:
> On Mon 28-08-17 13:29:29, Vlastimil Babka wrote:
> > On 08/28/2017 03:11 AM, js1304@gmail.com wrote:
> > > From: Joonsoo Kim <iamjoonsoo.kim@lge.com>
> > > 
> > > High-order atomic allocation is difficult to succeed since we cannot
> > > reclaim anything in this context. So, we reserves the pageblock for
> > > this kind of request.
> > > 
> > > In slub, we try to allocate higher-order page more than it actually
> > > needs in order to get the best performance. If this optimistic try is
> > > used with GFP_ATOMIC, alloc_flags will be set as ALLOC_HARDER and
> > > the pageblock reserved for high-order atomic allocation would be used.
> > > Moreover, this request would reserve the MIGRATE_HIGHATOMIC pageblock
> > > ,if succeed, to prepare further request. It would not be good to use
> > > MIGRATE_HIGHATOMIC pageblock in terms of fragmentation management
> > > since it unconditionally set a migratetype to request's migratetype
> > > when unreserving the pageblock without considering the migratetype of
> > > used pages in the pageblock.
> > > 
> > > This is not what we don't intend so fix it by unconditionally setting
> > > __GFP_NOMEMALLOC in order to not set ALLOC_HARDER.
> > 
> > I wonder if it would be more robust to strip GFP_ATOMIC from alloc_gfp.
> > E.g. __GFP_NOMEMALLOC does seem to prevent ALLOC_HARDER, but not
> > ALLOC_HIGH. Or maybe we should adjust __GFP_NOMEMALLOC implementation
> > and document it more thoroughly? CC Michal Hocko
> 
> Yeah, __GFP_NOMEMALLOC is rather inconsistent. It has been added to
> override __GFP_MEMALLOC resp. PF_MEMALLOC AFAIK. In this particular
> case I would agree that dropping __GFP_HIGH and __GFP_ATOMIC would
> be more precise. I am not sure we want to touch the existing semantic of
> __GFP_NOMEMALLOC though. This would require auditing all the existing
> users (something tells me that quite some of those will be incorrect...)

Hmm... now I realize that there is another reason why we need to use
__GFP_NOMEMALLOC. Even if this allocation comes from a PF_MEMALLOC user,
this optimistic try should not use the reserved memory below the
watermark. That is, it should not use ALLOC_NO_WATERMARKS. This can
only be accomplished by using __GFP_NOMEMALLOC.
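
To make the point concrete, here is a deliberately simplified userspace
model of the behaviour discussed in this thread. It is not the page
allocator's code: the X_* names, their bit values and
model_alloc_flags() are all assumptions made up for the illustration.
It encodes only what was said above: __GFP_NOMEMALLOC prevents
ALLOC_HARDER and ALLOC_NO_WATERMARKS (even for a PF_MEMALLOC task), but
it does not prevent ALLOC_HIGH.

```c
#include <stdbool.h>
#include <stdint.h>

typedef uint32_t gfp_t;

/* Hypothetical stand-ins for gfp flags; bit values are invented. */
#define X_GFP_HIGH       ((gfp_t)1u << 0)
#define X_GFP_ATOMIC     ((gfp_t)1u << 1)
#define X_GFP_MEMALLOC   ((gfp_t)1u << 2)
#define X_GFP_NOMEMALLOC ((gfp_t)1u << 3)

/* Hypothetical stand-ins for the allocator's internal alloc flags. */
#define X_ALLOC_HIGH          (1u << 0)
#define X_ALLOC_HARDER        (1u << 1)
#define X_ALLOC_NO_WATERMARKS (1u << 2)

/*
 * Simplified model: which reserve-related alloc flags a request gets.
 * task_is_pf_memalloc models a caller running with PF_MEMALLOC set.
 */
unsigned int model_alloc_flags(gfp_t gfp, bool task_is_pf_memalloc)
{
	unsigned int alloc_flags = 0;
	bool nomemalloc = (gfp & X_GFP_NOMEMALLOC) != 0;

	/* __GFP_HIGH maps to ALLOC_HIGH regardless of __GFP_NOMEMALLOC. */
	if (gfp & X_GFP_HIGH)
		alloc_flags |= X_ALLOC_HIGH;

	/* __GFP_ATOMIC gives ALLOC_HARDER unless __GFP_NOMEMALLOC is set. */
	if ((gfp & X_GFP_ATOMIC) && !nomemalloc)
		alloc_flags |= X_ALLOC_HARDER;

	/* __GFP_NOMEMALLOC overrides both __GFP_MEMALLOC and PF_MEMALLOC. */
	if (!nomemalloc && ((gfp & X_GFP_MEMALLOC) || task_is_pf_memalloc))
		alloc_flags |= X_ALLOC_NO_WATERMARKS;

	return alloc_flags;
}
```

In this model, stripping __GFP_HIGH and __GFP_ATOMIC alone would still
leave a PF_MEMALLOC caller with ALLOC_NO_WATERMARKS, which is why
__GFP_NOMEMALLOC is needed for the optimistic try.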

> 
> > Also, were these 2 patches done via code inspection or you noticed
> > suboptimal behavior which got fixed? Thanks.
> 
> The patch description is not very clear to me either but I guess that
> Joonsoo sees to many larger order pages to back slab objects when the
> system is not under heavy memory pressure and that increases internal
> fragmentation?

Your guess is right. I found this problem when I checked the
fragmentation ratio with a benchmark some months ago. I don't
remember the detailed system state in that benchmark.

Thanks.

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH 1/2] mm/slub: wake up kswapd for initial high order allocation
  2017-08-29  0:22     ` Joonsoo Kim
@ 2017-08-29  7:14       ` Vlastimil Babka
  -1 siblings, 0 replies; 20+ messages in thread
From: Vlastimil Babka @ 2017-08-29  7:14 UTC (permalink / raw)
  To: Joonsoo Kim
  Cc: Andrew Morton, Christoph Lameter, Pekka Enberg, David Rientjes,
	linux-mm, linux-kernel, Mel Gorman

On 08/29/2017 02:22 AM, Joonsoo Kim wrote:
> On Mon, Aug 28, 2017 at 12:04:41PM +0200, Vlastimil Babka wrote:
>>
>> Hm, so this seems to revert Mel's 444eb2a449ef ("mm: thp: set THP defrag
>> by default to madvise and add a stall-free defrag option") wrt the slub
>> allocate_slab() part. AFAICS the intention in Mel's patch was that he
>> removed a special case in __alloc_page_slowpath() where including
>> __GFP_THISNODE and lacking ~__GFP_DIRECT_RECLAIM effectively means also
>> lacking __GFP_KSWAPD_RECLAIM. The commit log claims that slab/slub might
>> change behavior so he moved the removal of __GFP_KSWAPD_RECLAIM to them.
>>
>> But AFAICS, only slab uses __GFP_THISNODE, while slub doesn't. So your
>> patch would indeed revert an unintentional change of Mel's commit. Is it
>> right or do I miss something?
> 
> I didn't look at that patch. What I tried here is just restoring first
> intention of this code. I now realize that Mel did it for specific
> purpose. Thanks for notifying it.
> 
> Anyway, your analysis looks correct and this change doesn't hurt Mel's
> intention and restores original behaviour of the code. I will add your
> analysis on the commit description and resubmit it. Is it okay to you?

Yeah, no problem.

> Thanks.
> 

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH 2/2] mm/slub: don't use reserved highatomic pageblock for optimistic try
  2017-08-29  0:33         ` Joonsoo Kim
@ 2017-08-31  1:42           ` Joonsoo Kim
  -1 siblings, 0 replies; 20+ messages in thread
From: Joonsoo Kim @ 2017-08-31  1:42 UTC (permalink / raw)
  To: Michal Hocko
  Cc: Vlastimil Babka, Andrew Morton, Christoph Lameter, Pekka Enberg,
	David Rientjes, linux-mm, linux-kernel, Mel Gorman

On Tue, Aug 29, 2017 at 09:33:44AM +0900, Joonsoo Kim wrote:
> On Mon, Aug 28, 2017 at 03:08:29PM +0200, Michal Hocko wrote:
> > On Mon 28-08-17 13:29:29, Vlastimil Babka wrote:
> > > On 08/28/2017 03:11 AM, js1304@gmail.com wrote:
> > > > From: Joonsoo Kim <iamjoonsoo.kim@lge.com>
> > > > 
> > > > High-order atomic allocation is difficult to succeed since we cannot
> > > > reclaim anything in this context. So, we reserves the pageblock for
> > > > this kind of request.
> > > > 
> > > > In slub, we try to allocate higher-order page more than it actually
> > > > needs in order to get the best performance. If this optimistic try is
> > > > used with GFP_ATOMIC, alloc_flags will be set as ALLOC_HARDER and
> > > > the pageblock reserved for high-order atomic allocation would be used.
> > > > Moreover, this request would reserve the MIGRATE_HIGHATOMIC pageblock
> > > > ,if succeed, to prepare further request. It would not be good to use
> > > > MIGRATE_HIGHATOMIC pageblock in terms of fragmentation management
> > > > since it unconditionally set a migratetype to request's migratetype
> > > > when unreserving the pageblock without considering the migratetype of
> > > > used pages in the pageblock.
> > > > 
> > > > This is not what we don't intend so fix it by unconditionally setting
> > > > __GFP_NOMEMALLOC in order to not set ALLOC_HARDER.
> > > 
> > > I wonder if it would be more robust to strip GFP_ATOMIC from alloc_gfp.
> > > E.g. __GFP_NOMEMALLOC does seem to prevent ALLOC_HARDER, but not
> > > ALLOC_HIGH. Or maybe we should adjust __GFP_NOMEMALLOC implementation
> > > and document it more thoroughly? CC Michal Hocko
> > 
> > Yeah, __GFP_NOMEMALLOC is rather inconsistent. It has been added to
> > override __GFP_MEMALLOC resp. PF_MEMALLOC AFAIK. In this particular
> > case I would agree that dropping __GFP_HIGH and __GFP_ATOMIC would
> > be more precise. I am not sure we want to touch the existing semantic of
> > __GFP_NOMEMALLOC though. This would require auditing all the existing
> > users (something tells me that quite some of those will be incorrect...)
> 
> Hmm... now I realize that there is another reason that we need to use
> __GFP_NOMEMALLOC. Even if this allocation comes from PF_MEMALLOC user,
> this optimistic try should not use the reserved memory below the
> watermark. That is, it should not use ALLOC_NO_WATERMARKS. It can
> only be accomplished by using __GFP_NOMEMALLOC.

Michal, Vlastimil, any thoughts?

Thanks.

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH 2/2] mm/slub: don't use reserved highatomic pageblock for optimistic try
  2017-08-31  1:42           ` Joonsoo Kim
@ 2017-08-31  5:21             ` Michal Hocko
  -1 siblings, 0 replies; 20+ messages in thread
From: Michal Hocko @ 2017-08-31  5:21 UTC (permalink / raw)
  To: Joonsoo Kim
  Cc: Vlastimil Babka, Andrew Morton, Christoph Lameter, Pekka Enberg,
	David Rientjes, linux-mm, linux-kernel, Mel Gorman

On Thu 31-08-17 10:42:41, Joonsoo Kim wrote:
> On Tue, Aug 29, 2017 at 09:33:44AM +0900, Joonsoo Kim wrote:
> > On Mon, Aug 28, 2017 at 03:08:29PM +0200, Michal Hocko wrote:
> > > On Mon 28-08-17 13:29:29, Vlastimil Babka wrote:
> > > > On 08/28/2017 03:11 AM, js1304@gmail.com wrote:
> > > > > From: Joonsoo Kim <iamjoonsoo.kim@lge.com>
> > > > > 
> > > > > High-order atomic allocation is difficult to succeed since we cannot
> > > > > reclaim anything in this context. So, we reserves the pageblock for
> > > > > this kind of request.
> > > > > 
> > > > > In slub, we try to allocate higher-order page more than it actually
> > > > > needs in order to get the best performance. If this optimistic try is
> > > > > used with GFP_ATOMIC, alloc_flags will be set as ALLOC_HARDER and
> > > > > the pageblock reserved for high-order atomic allocation would be used.
> > > > > Moreover, this request would reserve the MIGRATE_HIGHATOMIC pageblock
> > > > > ,if succeed, to prepare further request. It would not be good to use
> > > > > MIGRATE_HIGHATOMIC pageblock in terms of fragmentation management
> > > > > since it unconditionally set a migratetype to request's migratetype
> > > > > when unreserving the pageblock without considering the migratetype of
> > > > > used pages in the pageblock.
> > > > > 
> > > > > This is not what we don't intend so fix it by unconditionally setting
> > > > > __GFP_NOMEMALLOC in order to not set ALLOC_HARDER.
> > > > 
> > > > I wonder if it would be more robust to strip GFP_ATOMIC from alloc_gfp.
> > > > E.g. __GFP_NOMEMALLOC does seem to prevent ALLOC_HARDER, but not
> > > > ALLOC_HIGH. Or maybe we should adjust __GFP_NOMEMALLOC implementation
> > > > and document it more thoroughly? CC Michal Hocko
> > > 
> > > Yeah, __GFP_NOMEMALLOC is rather inconsistent. It has been added to
> > > override __GFP_MEMALLOC resp. PF_MEMALLOC AFAIK. In this particular
> > > case I would agree that dropping __GFP_HIGH and __GFP_ATOMIC would
> > > be more precise. I am not sure we want to touch the existing semantic of
> > > __GFP_NOMEMALLOC though. This would require auditing all the existing
> > > users (something tells me that quite some of those will be incorrect...)
> > 
> > Hmm... now I realize that there is another reason that we need to use
> > __GFP_NOMEMALLOC. Even if this allocation comes from PF_MEMALLOC user,
> > this optimistic try should not use the reserved memory below the
> > watermark. That is, it should not use ALLOC_NO_WATERMARKS. It can
> > only be accomplished by using __GFP_NOMEMALLOC.
> 
> Michal, Vlastimil, Any thought?

Hmm, I would go with a helper like the one below and use it in slub:

gfp_t gfp_drop_reserves(gfp_t mask)
{
	mask &= ~(__GFP_HIGH|__GFP_ATOMIC);
	mask |= __GFP_NOMEMALLOC;

	return mask;
}
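
As an illustration only, a userspace sketch of how such a helper might
be applied to the optimistic attempt could look like the following. The
X_GFP_* bit values are invented, and optimistic_gfp() is a hypothetical
wrapper that is not part of any posted patch; it only assumes that the
reserves should be dropped when the attempted order exceeds the minimum
order the cache needs.

```c
#include <stdint.h>

typedef uint32_t gfp_t;

/* Hypothetical stand-ins for the kernel flags; bit values are invented. */
#define X_GFP_HIGH       ((gfp_t)1u << 0)
#define X_GFP_ATOMIC     ((gfp_t)1u << 1)
#define X_GFP_NOMEMALLOC ((gfp_t)1u << 2)

/* Same logic as the helper proposed above, on the stand-in flags. */
gfp_t gfp_drop_reserves(gfp_t mask)
{
	mask &= ~(X_GFP_HIGH | X_GFP_ATOMIC);
	mask |= X_GFP_NOMEMALLOC;

	return mask;
}

/*
 * Hypothetical caller: drop access to reserves only for the optimistic
 * try, i.e. when the attempted order is larger than the minimum order.
 */
gfp_t optimistic_gfp(gfp_t flags, unsigned int oo_order,
                     unsigned int min_order)
{
	if (oo_order > min_order)
		return gfp_drop_reserves(flags);

	return flags;
}
```

The fallback allocation at the minimum order would keep using the
caller's original flags, so a GFP_ATOMIC user can still reach the
reserves when it really needs the memory.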
-- 
Michal Hocko
SUSE Labs

^ permalink raw reply	[flat|nested] 20+ messages in thread

end of thread, other threads:[~2017-08-31  5:21 UTC | newest]

Thread overview: 20+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-08-28  1:11 [PATCH 1/2] mm/slub: wake up kswapd for initial high order allocation js1304
2017-08-28  1:11 ` js1304
2017-08-28  1:11 ` [PATCH 2/2] mm/slub: don't use reserved highatomic pageblock for optimistic try js1304
2017-08-28  1:11   ` js1304
2017-08-28 11:29   ` Vlastimil Babka
2017-08-28 11:29     ` Vlastimil Babka
2017-08-28 13:08     ` Michal Hocko
2017-08-28 13:08       ` Michal Hocko
2017-08-29  0:33       ` Joonsoo Kim
2017-08-29  0:33         ` Joonsoo Kim
2017-08-31  1:42         ` Joonsoo Kim
2017-08-31  1:42           ` Joonsoo Kim
2017-08-31  5:21           ` Michal Hocko
2017-08-31  5:21             ` Michal Hocko
2017-08-28 10:04 ` [PATCH 1/2] mm/slub: wake up kswapd for initial high order allocation Vlastimil Babka
2017-08-28 10:04   ` Vlastimil Babka
2017-08-29  0:22   ` Joonsoo Kim
2017-08-29  0:22     ` Joonsoo Kim
2017-08-29  7:14     ` Vlastimil Babka
2017-08-29  7:14       ` Vlastimil Babka

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.