* [RFC PATCH] mm: Micro-optimisation: Save two branches on hot - page allocation path
@ 2020-03-04 14:22 mateusznosek0
2020-03-04 14:47 ` Vlastimil Babka
2020-03-04 15:15 ` Mel Gorman
0 siblings, 2 replies; 3+ messages in thread
From: mateusznosek0 @ 2020-03-04 14:22 UTC (permalink / raw)
To: linux-mm, linux-kernel; +Cc: Mateusz Nosek, akpm
From: Mateusz Nosek <mateusznosek0@gmail.com>
This patch makes ALLOC_KSWAPD
equal to __GFP_KSWAPD_RACLAIM (cast to 'int').
Thanks to that code like:
```if (gfp_mask & __GFP_KSWAPD_RECLAIM)
alloc_flags |= ALLOC_KSWAPD;```
can be changed to:
```alloc_flags |= (__force int) (gfp_mask &__GFP_KSWAPD_RECLAIM);```
Thanks to this one branch less is generated in the assembly.
In case of ALLOC_KSWAPD flag two branches are saved,
first one in code that always executes in the beggining of page allocation
and the second one in loop in page allocator slowpath.
Signed-off-by: Mateusz Nosek <mateusznosek0@gmail.com>
---
mm/internal.h | 2 +-
mm/page_alloc.c | 23 +++++++++++++++--------
2 files changed, 16 insertions(+), 9 deletions(-)
diff --git a/mm/internal.h b/mm/internal.h
index 86372d164476..7fb724977743 100644
--- a/mm/internal.h
+++ b/mm/internal.h
@@ -535,7 +535,7 @@ unsigned long reclaim_clean_pages_from_list(struct zone *zone,
#else
#define ALLOC_NOFRAGMENT 0x0
#endif
-#define ALLOC_KSWAPD 0x200 /* allow waking of kswapd */
+#define ALLOC_KSWAPD 0x800 /* allow waking of kswapd */
enum ttu_flags;
struct tlbflush_unmap_batch;
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 79e950d76ffc..73afd883eab5 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -3609,10 +3609,14 @@ static bool zone_allows_reclaim(struct zone *local_zone, struct zone *zone)
static inline unsigned int
alloc_flags_nofragment(struct zone *zone, gfp_t gfp_mask)
{
- unsigned int alloc_flags = 0;
+ unsigned int alloc_flags;
- if (gfp_mask & __GFP_KSWAPD_RECLAIM)
- alloc_flags |= ALLOC_KSWAPD;
+ /*
+ * __GFP_KSWAPD_RECLAIM is assumed to be the same as ALLOC_KSWAPD
+ * to save a branch.
+ */
+ BUILD_BUG_ON(__GFP_KSWAPD_RECLAIM != (__force gfp_t) ALLOC_KSWAPD);
+ alloc_flags = (__force int) (gfp_mask & __GFP_KSWAPD_RECLAIM);
#ifdef CONFIG_ZONE_DMA32
if (!zone)
@@ -4248,8 +4252,13 @@ gfp_to_alloc_flags(gfp_t gfp_mask)
{
unsigned int alloc_flags = ALLOC_WMARK_MIN | ALLOC_CPUSET;
- /* __GFP_HIGH is assumed to be the same as ALLOC_HIGH to save a branch. */
+ /*
+ * __GFP_HIGH is assumed to be the same as ALLOC_HIGH
+ * and __GFP_KSWAPD_RECLAIM is assumed to be the same as ALLOC_KSWAPD
+ * to save two branches.
+ */
BUILD_BUG_ON(__GFP_HIGH != (__force gfp_t) ALLOC_HIGH);
+ BUILD_BUG_ON(__GFP_KSWAPD_RECLAIM != (__force gfp_t) ALLOC_KSWAPD);
/*
* The caller may dip into page reserves a bit more if the caller
@@ -4257,7 +4266,8 @@ gfp_to_alloc_flags(gfp_t gfp_mask)
* policy or is asking for __GFP_HIGH memory. GFP_ATOMIC requests will
* set both ALLOC_HARDER (__GFP_ATOMIC) and ALLOC_HIGH (__GFP_HIGH).
*/
- alloc_flags |= (__force int) (gfp_mask & __GFP_HIGH);
+ alloc_flags |= (__force int)
+ (gfp_mask & (__GFP_HIGH | __GFP_KSWAPD_RECLAIM));
if (gfp_mask & __GFP_ATOMIC) {
/*
@@ -4274,9 +4284,6 @@ gfp_to_alloc_flags(gfp_t gfp_mask)
} else if (unlikely(rt_task(current)) && !in_interrupt())
alloc_flags |= ALLOC_HARDER;
- if (gfp_mask & __GFP_KSWAPD_RECLAIM)
- alloc_flags |= ALLOC_KSWAPD;
-
#ifdef CONFIG_CMA
if (gfpflags_to_migratetype(gfp_mask) == MIGRATE_MOVABLE)
alloc_flags |= ALLOC_CMA;
--
2.17.1
^ permalink raw reply related [flat|nested] 3+ messages in thread
* Re: [RFC PATCH] mm: Micro-optimisation: Save two branches on hot - page allocation path
2020-03-04 14:22 [RFC PATCH] mm: Micro-optimisation: Save two branches on hot - page allocation path mateusznosek0
@ 2020-03-04 14:47 ` Vlastimil Babka
2020-03-04 15:15 ` Mel Gorman
1 sibling, 0 replies; 3+ messages in thread
From: Vlastimil Babka @ 2020-03-04 14:47 UTC (permalink / raw)
To: mateusznosek0, linux-mm, linux-kernel; +Cc: akpm, Mel Gorman
Let's CC Mel.
On 3/4/20 3:22 PM, mateusznosek0@gmail.com wrote:
> From: Mateusz Nosek <mateusznosek0@gmail.com>
>
> This patch makes ALLOC_KSWAPD
> equal to __GFP_KSWAPD_RACLAIM (cast to 'int').
>
> Thanks to that code like:
> ```if (gfp_mask & __GFP_KSWAPD_RECLAIM)
> alloc_flags |= ALLOC_KSWAPD;```
> can be changed to:
> ```alloc_flags |= (__force int) (gfp_mask &__GFP_KSWAPD_RECLAIM);```
> Thanks to this one branch less is generated in the assembly.
>
> In case of ALLOC_KSWAPD flag two branches are saved,
> first one in code that always executes in the beggining of page allocation
> and the second one in loop in page allocator slowpath.
>
> Signed-off-by: Mateusz Nosek <mateusznosek0@gmail.com>
I think it's fine and in line with similar optimisations done by Mel in the past.
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Some nits below.
> ---
> mm/internal.h | 2 +-
> mm/page_alloc.c | 23 +++++++++++++++--------
> 2 files changed, 16 insertions(+), 9 deletions(-)
>
> diff --git a/mm/internal.h b/mm/internal.h
> index 86372d164476..7fb724977743 100644
> --- a/mm/internal.h
> +++ b/mm/internal.h
> @@ -535,7 +535,7 @@ unsigned long reclaim_clean_pages_from_list(struct zone *zone,
> #else
> #define ALLOC_NOFRAGMENT 0x0
> #endif
> -#define ALLOC_KSWAPD 0x200 /* allow waking of kswapd */
> +#define ALLOC_KSWAPD 0x800 /* allow waking of kswapd */
Add a comment that the value has to match the GFP flag?
>
> enum ttu_flags;
> struct tlbflush_unmap_batch;
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index 79e950d76ffc..73afd883eab5 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -3609,10 +3609,14 @@ static bool zone_allows_reclaim(struct zone *local_zone, struct zone *zone)
> static inline unsigned int
> alloc_flags_nofragment(struct zone *zone, gfp_t gfp_mask)
> {
> - unsigned int alloc_flags = 0;
> + unsigned int alloc_flags;
>
> - if (gfp_mask & __GFP_KSWAPD_RECLAIM)
> - alloc_flags |= ALLOC_KSWAPD;
> + /*
> + * __GFP_KSWAPD_RECLAIM is assumed to be the same as ALLOC_KSWAPD
> + * to save a branch.
> + */
> + BUILD_BUG_ON(__GFP_KSWAPD_RECLAIM != (__force gfp_t) ALLOC_KSWAPD);
I think one BUILD_BUG_ON is enough...
> + alloc_flags = (__force int) (gfp_mask & __GFP_KSWAPD_RECLAIM);
>
> #ifdef CONFIG_ZONE_DMA32
> if (!zone)
> @@ -4248,8 +4252,13 @@ gfp_to_alloc_flags(gfp_t gfp_mask)
> {
> unsigned int alloc_flags = ALLOC_WMARK_MIN | ALLOC_CPUSET;
>
> - /* __GFP_HIGH is assumed to be the same as ALLOC_HIGH to save a branch. */
> + /*
> + * __GFP_HIGH is assumed to be the same as ALLOC_HIGH
> + * and __GFP_KSWAPD_RECLAIM is assumed to be the same as ALLOC_KSWAPD
> + * to save two branches.
> + */
> BUILD_BUG_ON(__GFP_HIGH != (__force gfp_t) ALLOC_HIGH);
> + BUILD_BUG_ON(__GFP_KSWAPD_RECLAIM != (__force gfp_t) ALLOC_KSWAPD);
... and this would be the better one of the two.
Thanks.
>
> /*
> * The caller may dip into page reserves a bit more if the caller
> @@ -4257,7 +4266,8 @@ gfp_to_alloc_flags(gfp_t gfp_mask)
> * policy or is asking for __GFP_HIGH memory. GFP_ATOMIC requests will
> * set both ALLOC_HARDER (__GFP_ATOMIC) and ALLOC_HIGH (__GFP_HIGH).
> */
> - alloc_flags |= (__force int) (gfp_mask & __GFP_HIGH);
> + alloc_flags |= (__force int)
> + (gfp_mask & (__GFP_HIGH | __GFP_KSWAPD_RECLAIM));
>
> if (gfp_mask & __GFP_ATOMIC) {
> /*
> @@ -4274,9 +4284,6 @@ gfp_to_alloc_flags(gfp_t gfp_mask)
> } else if (unlikely(rt_task(current)) && !in_interrupt())
> alloc_flags |= ALLOC_HARDER;
>
> - if (gfp_mask & __GFP_KSWAPD_RECLAIM)
> - alloc_flags |= ALLOC_KSWAPD;
> -
> #ifdef CONFIG_CMA
> if (gfpflags_to_migratetype(gfp_mask) == MIGRATE_MOVABLE)
> alloc_flags |= ALLOC_CMA;
>
^ permalink raw reply [flat|nested] 3+ messages in thread
* Re: [RFC PATCH] mm: Micro-optimisation: Save two branches on hot - page allocation path
2020-03-04 14:22 [RFC PATCH] mm: Micro-optimisation: Save two branches on hot - page allocation path mateusznosek0
2020-03-04 14:47 ` Vlastimil Babka
@ 2020-03-04 15:15 ` Mel Gorman
1 sibling, 0 replies; 3+ messages in thread
From: Mel Gorman @ 2020-03-04 15:15 UTC (permalink / raw)
To: mateusznosek0; +Cc: linux-mm, linux-kernel, akpm
On Wed, Mar 04, 2020 at 03:22:30PM +0100, mateusznosek0@gmail.com wrote:
> From: Mateusz Nosek <mateusznosek0@gmail.com>
>
> This patch makes ALLOC_KSWAPD
> equal to __GFP_KSWAPD_RACLAIM (cast to 'int').
>
s/RACLAIM/RECLAIM/
Very strange word wrapping
> Thanks to that code like:
> ```if (gfp_mask & __GFP_KSWAPD_RECLAIM)
> alloc_flags |= ALLOC_KSWAPD;```
> can be changed to:
> ```alloc_flags |= (__force int) (gfp_mask &__GFP_KSWAPD_RECLAIM);```
Not sure what the multiple ``` are about. It does not appear to be an
encoding error but I think you can drop them. Maybe some mail readers
render this differently but it looks weird in plain text.
> Thanks to this one branch less is generated in the assembly.
>
> In case of ALLOC_KSWAPD flag two branches are saved,
> first one in code that always executes in the beggining of page allocation
s/beggining/beginning/
> and the second one in loop in page allocator slowpath.
Also strange word wrapping.
>
> Signed-off-by: Mateusz Nosek <mateusznosek0@gmail.com>
> ---
> mm/internal.h | 2 +-
> mm/page_alloc.c | 23 +++++++++++++++--------
> 2 files changed, 16 insertions(+), 9 deletions(-)
>
> diff --git a/mm/internal.h b/mm/internal.h
> index 86372d164476..7fb724977743 100644
> --- a/mm/internal.h
> +++ b/mm/internal.h
> @@ -535,7 +535,7 @@ unsigned long reclaim_clean_pages_from_list(struct zone *zone,
> #else
> #define ALLOC_NOFRAGMENT 0x0
> #endif
> -#define ALLOC_KSWAPD 0x200 /* allow waking of kswapd */
> +#define ALLOC_KSWAPD 0x800 /* allow waking of kswapd */
>
> enum ttu_flags;
> struct tlbflush_unmap_batch;
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index 79e950d76ffc..73afd883eab5 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -3609,10 +3609,14 @@ static bool zone_allows_reclaim(struct zone *local_zone, struct zone *zone)
> static inline unsigned int
> alloc_flags_nofragment(struct zone *zone, gfp_t gfp_mask)
> {
> - unsigned int alloc_flags = 0;
> + unsigned int alloc_flags;
>
> - if (gfp_mask & __GFP_KSWAPD_RECLAIM)
> - alloc_flags |= ALLOC_KSWAPD;
> + /*
> + * __GFP_KSWAPD_RECLAIM is assumed to be the same as ALLOC_KSWAPD
> + * to save a branch.
> + */
> + BUILD_BUG_ON(__GFP_KSWAPD_RECLAIM != (__force gfp_t) ALLOC_KSWAPD);
> + alloc_flags = (__force int) (gfp_mask & __GFP_KSWAPD_RECLAIM);
>
> #ifdef CONFIG_ZONE_DMA32
> if (!zone)
Ok, that looks reasonable.
> @@ -4248,8 +4252,13 @@ gfp_to_alloc_flags(gfp_t gfp_mask)
> {
> unsigned int alloc_flags = ALLOC_WMARK_MIN | ALLOC_CPUSET;
>
> - /* __GFP_HIGH is assumed to be the same as ALLOC_HIGH to save a branch. */
> + /*
> + * __GFP_HIGH is assumed to be the same as ALLOC_HIGH
> + * and __GFP_KSWAPD_RECLAIM is assumed to be the same as ALLOC_KSWAPD
> + * to save two branches.
> + */
> BUILD_BUG_ON(__GFP_HIGH != (__force gfp_t) ALLOC_HIGH);
> + BUILD_BUG_ON(__GFP_KSWAPD_RECLAIM != (__force gfp_t) ALLOC_KSWAPD);
>
> /*
> * The caller may dip into page reserves a bit more if the caller
As Vlastimil pointed out, you do not need to make two compiler-based
checks. This seems like a reasonable location given that gfp_to_alloc_flags
is the most obvious place to document tricks with the flags.
> @@ -4257,7 +4266,8 @@ gfp_to_alloc_flags(gfp_t gfp_mask)
> * policy or is asking for __GFP_HIGH memory. GFP_ATOMIC requests will
> * set both ALLOC_HARDER (__GFP_ATOMIC) and ALLOC_HIGH (__GFP_HIGH).
> */
> - alloc_flags |= (__force int) (gfp_mask & __GFP_HIGH);
> + alloc_flags |= (__force int)
> + (gfp_mask & (__GFP_HIGH | __GFP_KSWAPD_RECLAIM));
>
> if (gfp_mask & __GFP_ATOMIC) {
> /*
Slightly harder to read but it does potentially generate better code.
If you correct the typos in the changelog, the slightly strange formatting
of the changelog and drop one of the build checks then you can add
Acked-by: Mel Gorman <mgorman@techsingularity.net>
--
Mel Gorman
SUSE Labs
^ permalink raw reply [flat|nested] 3+ messages in thread
end of thread, other threads:[~2020-03-04 15:15 UTC | newest]
Thread overview: 3+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-03-04 14:22 [RFC PATCH] mm: Micro-optimisation: Save two branches on hot - page allocation path mateusznosek0
2020-03-04 14:47 ` Vlastimil Babka
2020-03-04 15:15 ` Mel Gorman
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).