* Re: [PATCH] mm, page_alloc: Fix a division by zero error when boosting watermarks
2019-02-13 13:19 [PATCH] mm, page_alloc: Fix a division by zero error when boosting watermarks Mel Gorman
@ 2019-02-13 13:20 ` Will Deacon
2019-02-13 13:42 ` Vlastimil Babka
2019-02-13 14:30 ` [PATCH] mm, page_alloc: Fix a division by zero error when boosting watermarks v2 Mel Gorman
2 siblings, 0 replies; 6+ messages in thread
From: Will Deacon @ 2019-02-13 13:20 UTC (permalink / raw)
To: Mel Gorman
Cc: Andrew Morton, Yury Norov, Vlastimil Babka, Andrea Arcangeli,
David Rientjes, Michal Hocko, Catalin Marinas, linux-kernel,
linux-arm-kernel, linux-mm
On Wed, Feb 13, 2019 at 01:19:23PM +0000, Mel Gorman wrote:
> Yury Norov reported that an arm64 KVM instance could not boot since after
> v5.0-rc1 and could addressed by reverting the patches
>
> 1c30844d2dfe272d58c ("mm: reclaim small amounts of memory when an external
> 73444bc4d8f92e46a20 ("mm, page_alloc: do not wake kswapd with zone lock held")
>
> The problem is that a division by zero error is possible if boosting occurs
> either very early in boot or if the high watermark is very small. This
> patch checks for the conditions and avoids boosting in those cases.
>
> Fixes: 1c30844d2dfe ("mm: reclaim small amounts of memory when an external fragmentation event occurs")
> Reported-and-tested-by: Yury Norov <yury.norov@gmail.com>
> Signed-off-by: Mel Gorman <mgorman@techsingularity.net>
> ---
> mm/page_alloc.c | 5 +++++
> 1 file changed, 5 insertions(+)
>
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index d295c9bc01a8..ae7e4ba5b9f5 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -2170,6 +2170,11 @@ static inline void boost_watermark(struct zone *zone)
>
> max_boost = mult_frac(zone->_watermark[WMARK_HIGH],
> watermark_boost_factor, 10000);
> +
> + /* high watermark be be uninitialised or very small */
> + if (!max_boost)
> + return;
> +
> max_boost = max(pageblock_nr_pages, max_boost);
>
> zone->watermark_boost = min(zone->watermark_boost + pageblock_nr_pages,
I can confirm that this also allows my KVM guest to boot:
Tested-by: Will Deacon <will.deacon@arm.com>
Will
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [PATCH] mm, page_alloc: Fix a division by zero error when boosting watermarks
2019-02-13 13:19 [PATCH] mm, page_alloc: Fix a division by zero error when boosting watermarks Mel Gorman
2019-02-13 13:20 ` Will Deacon
@ 2019-02-13 13:42 ` Vlastimil Babka
2019-02-13 14:15 ` Mel Gorman
2019-02-13 14:30 ` [PATCH] mm, page_alloc: Fix a division by zero error when boosting watermarks v2 Mel Gorman
2 siblings, 1 reply; 6+ messages in thread
From: Vlastimil Babka @ 2019-02-13 13:42 UTC (permalink / raw)
To: Mel Gorman, Andrew Morton
Cc: Yury Norov, Andrea Arcangeli, David Rientjes, Michal Hocko,
Will Deacon, Catalin Marinas, linux-kernel, linux-arm-kernel,
linux-mm
On 2/13/19 2:19 PM, Mel Gorman wrote:
> Yury Norov reported that an arm64 KVM instance could not boot since after
> v5.0-rc1 and could addressed by reverting the patches
>
> 1c30844d2dfe272d58c ("mm: reclaim small amounts of memory when an external
> 73444bc4d8f92e46a20 ("mm, page_alloc: do not wake kswapd with zone lock held")
>
> The problem is that a division by zero error is possible if boosting occurs
> either very early in boot or if the high watermark is very small. This
> patch checks for the conditions and avoids boosting in those cases.
Hmm is it really a division by zero? The following line sets max_boost to
pageblock_nr_pages if it's zero. And where would the division happen anyway?
So I wonder what's going on, your patch should AFAICS only take effect when
zone->_watermark[WMARK_HIGH] is 0 or 1 to begin with, otherwise max_boost is at
least 2?
Also upon closer look, I think that (prior to the patch), boost_watermark()
could be reduced (thanks to the max+min capping) to
zone->watermark_boost = pageblock_nr_pages
?
>
> Fixes: 1c30844d2dfe ("mm: reclaim small amounts of memory when an external fragmentation event occurs")
> Reported-and-tested-by: Yury Norov <yury.norov@gmail.com>
> Signed-off-by: Mel Gorman <mgorman@techsingularity.net>
> ---
> mm/page_alloc.c | 5 +++++
> 1 file changed, 5 insertions(+)
>
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index d295c9bc01a8..ae7e4ba5b9f5 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -2170,6 +2170,11 @@ static inline void boost_watermark(struct zone *zone)
>
> max_boost = mult_frac(zone->_watermark[WMARK_HIGH],
> watermark_boost_factor, 10000);
> +
> + /* high watermark be be uninitialised or very small */
> + if (!max_boost)
> + return;
> +
> max_boost = max(pageblock_nr_pages, max_boost);
>
> zone->watermark_boost = min(zone->watermark_boost + pageblock_nr_pages,
>
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [PATCH] mm, page_alloc: Fix a division by zero error when boosting watermarks
2019-02-13 13:42 ` Vlastimil Babka
@ 2019-02-13 14:15 ` Mel Gorman
0 siblings, 0 replies; 6+ messages in thread
From: Mel Gorman @ 2019-02-13 14:15 UTC (permalink / raw)
To: Vlastimil Babka
Cc: Andrew Morton, Yury Norov, Andrea Arcangeli, David Rientjes,
Michal Hocko, Will Deacon, Catalin Marinas, linux-kernel,
linux-arm-kernel, linux-mm
On Wed, Feb 13, 2019 at 02:42:36PM +0100, Vlastimil Babka wrote:
> On 2/13/19 2:19 PM, Mel Gorman wrote:
> > Yury Norov reported that an arm64 KVM instance could not boot since after
> > v5.0-rc1 and could addressed by reverting the patches
> >
> > 1c30844d2dfe272d58c ("mm: reclaim small amounts of memory when an external
> > 73444bc4d8f92e46a20 ("mm, page_alloc: do not wake kswapd with zone lock held")
> >
> > The problem is that a division by zero error is possible if boosting occurs
> > either very early in boot or if the high watermark is very small. This
> > patch checks for the conditions and avoids boosting in those cases.
>
> Hmm is it really a division by zero? The following line sets max_boost to
> pageblock_nr_pages if it's zero. And where would the division happen anyway?
>
> So I wonder what's going on, your patch should AFAICS only take effect when
> zone->_watermark[WMARK_HIGH] is 0 or 1 to begin with, otherwise max_boost is at
> least 2?
>
The issue can occur if pageblock_nr_pages is also zero or not yet
initialised. It means the changelog is misleading because it has to
trigger very early in boot as happened with Yury.
> Also upon closer look, I think that (prior to the patch), boost_watermark()
> could be reduced (thanks to the max+min capping) to
>
> zone->watermark_boost = pageblock_nr_pages
>
I don't think it's worth being fancy about it if we're hitting
fragmentation issues that early in boot.
--
Mel Gorman
SUSE Labs
^ permalink raw reply [flat|nested] 6+ messages in thread
* [PATCH] mm, page_alloc: Fix a division by zero error when boosting watermarks v2
2019-02-13 13:19 [PATCH] mm, page_alloc: Fix a division by zero error when boosting watermarks Mel Gorman
2019-02-13 13:20 ` Will Deacon
2019-02-13 13:42 ` Vlastimil Babka
@ 2019-02-13 14:30 ` Mel Gorman
2019-02-13 14:31 ` Vlastimil Babka
2 siblings, 1 reply; 6+ messages in thread
From: Mel Gorman @ 2019-02-13 14:30 UTC (permalink / raw)
To: Andrew Morton
Cc: Yury Norov, Vlastimil Babka, Andrea Arcangeli, David Rientjes,
Michal Hocko, Will Deacon, Catalin Marinas, linux-kernel,
linux-arm-kernel, linux-mm
Yury Norov reported that an arm64 KVM instance could not boot since after
v5.0-rc1 and could addressed by reverting the patches
1c30844d2dfe272d58c ("mm: reclaim small amounts of memory when an external
73444bc4d8f92e46a20 ("mm, page_alloc: do not wake kswapd with zone lock held")
The problem is that a division by zero error is possible if boosting
occurs very early in boot if the system has very little memory. This
patch avoids the division by zero error.
Fixes: 1c30844d2dfe ("mm: reclaim small amounts of memory when an external fragmentation event occurs")
Reported-and-tested-by: Yury Norov <yury.norov@gmail.com>
Tested-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Mel Gorman <mgorman@techsingularity.net>
---
mm/page_alloc.c | 12 ++++++++++++
1 file changed, 12 insertions(+)
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index d295c9bc01a8..bb1c7d843ebf 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -2170,6 +2170,18 @@ static inline void boost_watermark(struct zone *zone)
max_boost = mult_frac(zone->_watermark[WMARK_HIGH],
watermark_boost_factor, 10000);
+
+ /*
+ * high watermark may be uninitialised if fragmentation occurs
+ * very early in boot so do not boost. We do not fall
+ * through and boost by pageblock_nr_pages as failing
+ * allocations that early means that reclaim is not going
+ * to help and it may even be impossible to reclaim the
+ * boosted watermark resulting in a hang.
+ */
+ if (!max_boost)
+ return;
+
max_boost = max(pageblock_nr_pages, max_boost);
zone->watermark_boost = min(zone->watermark_boost + pageblock_nr_pages,
--
Mel Gorman
SUSE Labs
^ permalink raw reply related [flat|nested] 6+ messages in thread
* Re: [PATCH] mm, page_alloc: Fix a division by zero error when boosting watermarks v2
2019-02-13 14:30 ` [PATCH] mm, page_alloc: Fix a division by zero error when boosting watermarks v2 Mel Gorman
@ 2019-02-13 14:31 ` Vlastimil Babka
0 siblings, 0 replies; 6+ messages in thread
From: Vlastimil Babka @ 2019-02-13 14:31 UTC (permalink / raw)
To: Mel Gorman, Andrew Morton
Cc: Yury Norov, Andrea Arcangeli, David Rientjes, Michal Hocko,
Will Deacon, Catalin Marinas, linux-kernel, linux-arm-kernel,
linux-mm
On 2/13/19 3:30 PM, Mel Gorman wrote:
> Yury Norov reported that an arm64 KVM instance could not boot since after
> v5.0-rc1 and could addressed by reverting the patches
>
> 1c30844d2dfe272d58c ("mm: reclaim small amounts of memory when an external
> 73444bc4d8f92e46a20 ("mm, page_alloc: do not wake kswapd with zone lock held")
>
> The problem is that a division by zero error is possible if boosting
> occurs very early in boot if the system has very little memory. This
> patch avoids the division by zero error.
>
> Fixes: 1c30844d2dfe ("mm: reclaim small amounts of memory when an external fragmentation event occurs")
> Reported-and-tested-by: Yury Norov <yury.norov@gmail.com>
> Tested-by: Will Deacon <will.deacon@arm.com>
> Signed-off-by: Mel Gorman <mgorman@techsingularity.net>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Thanks, sorry for the noise before.
> ---
> mm/page_alloc.c | 12 ++++++++++++
> 1 file changed, 12 insertions(+)
>
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index d295c9bc01a8..bb1c7d843ebf 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -2170,6 +2170,18 @@ static inline void boost_watermark(struct zone *zone)
>
> max_boost = mult_frac(zone->_watermark[WMARK_HIGH],
> watermark_boost_factor, 10000);
> +
> + /*
> + * high watermark may be uninitialised if fragmentation occurs
> + * very early in boot so do not boost. We do not fall
> + * through and boost by pageblock_nr_pages as failing
> + * allocations that early means that reclaim is not going
> + * to help and it may even be impossible to reclaim the
> + * boosted watermark resulting in a hang.
> + */
> + if (!max_boost)
> + return;
> +
> max_boost = max(pageblock_nr_pages, max_boost);
>
> zone->watermark_boost = min(zone->watermark_boost + pageblock_nr_pages,
>
^ permalink raw reply [flat|nested] 6+ messages in thread