* [PATCH v1] mm/page_alloc: drop pr_info_ratelimited() in alloc_contig_range()
From: David Hildenbrand @ 2021-03-01 15:09 UTC
To: linux-kernel
Cc: linux-mm, David Hildenbrand, Andrew Morton, Minchan Kim,
Oscar Salvador, Michal Hocko, Vlastimil Babka
The information that some PFNs are busy is:
a) not helpful for ordinary users: we don't even know *who* called
   alloc_contig_range(). This is certainly not worth a pr_info.*().
b) not really helpful for debugging: we don't have any details *why*
   these PFNs are busy, and that is what we usually care about.
c) not complete: there are other cases where we fail alloc_contig_range()
   using different paths that are not getting recorded.

For example, we reach this path once we succeeded in isolating pageblocks,
but failed to migrate some pages - which can happen easily on
ZONE_NORMAL (i.e., has_unmovable_pages() is racy) but also on ZONE_MOVABLE
(i.e., we would have to retry longer to migrate).

For example via virtio-mem when unplugging memory, we can create quite
some noise (especially with ZONE_NORMAL) that is not of interest to
users - it's expected that some allocations may fail as memory is busy.

Let's just drop that pr_info_ratelimited() and rather implement a dynamic
debugging mechanism in the future that can give us a better reason why
alloc_contig_range() failed on specific pages.

Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Oscar Salvador <osalvador@suse.de>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: David Hildenbrand <david@redhat.com>
---
mm/page_alloc.c | 2 --
1 file changed, 2 deletions(-)
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 519a60d5b6f7..efb924fb13e8 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -8647,8 +8647,6 @@ int alloc_contig_range(unsigned long start, unsigned long end,
/* Make sure the range is really isolated. */
if (test_pages_isolated(outer_start, end, 0)) {
- pr_info_ratelimited("%s: [%lx, %lx) PFNs busy\n",
- __func__, outer_start, end);
ret = -EBUSY;
goto done;
}
--
2.29.2
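
As an illustration of the kind of "better reason" the commit message asks for, here is a minimal user-space sketch in C. It is not kernel code and not part of the patch: the page map, the busy_reason enum, the contig_debug_enabled switch and model_alloc_contig_range() are all hypothetical names made up for this example. It only models the idea that the failure path stays silent by default and reports which page blocked the range, and why, when debugging is switched on.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical reasons a page may block a contiguous allocation. */
enum busy_reason { PAGE_FREE = 0, PAGE_PINNED, PAGE_UNMOVABLE };

/* Hypothetical runtime switch, standing in for a dynamic debug facility. */
static bool contig_debug_enabled;

static const char *reason_name(enum busy_reason r)
{
	switch (r) {
	case PAGE_PINNED:
		return "pinned";
	case PAGE_UNMOVABLE:
		return "unmovable";
	default:
		return "free";
	}
}

/*
 * User-space model only: scan [start, end) of a toy page map and fail with
 * -EBUSY at the first busy page. Nothing is printed unless debugging is
 * enabled; when it is, the message names the page and the reason instead of
 * just the range.
 */
static int model_alloc_contig_range(const enum busy_reason *pages,
				    unsigned long start, unsigned long end)
{
	assert(start < end);

	for (unsigned long pfn = start; pfn < end; pfn++) {
		if (pages[pfn] == PAGE_FREE)
			continue;
		if (contig_debug_enabled)
			fprintf(stderr, "pfn %lx busy: %s\n", pfn,
				reason_name(pages[pfn]));
		return -EBUSY;
	}
	return 0;
}
```

A caller that expects occasional failures can check for -EBUSY and carry on without any log output, and set contig_debug_enabled only while investigating a specific failure.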
* Re: [PATCH v1] mm/page_alloc: drop pr_info_ratelimited() in alloc_contig_range()
From: Zi Yan @ 2021-03-01 15:35 UTC
To: David Hildenbrand
Cc: linux-kernel, linux-mm, Andrew Morton, Minchan Kim,
Oscar Salvador, Michal Hocko, Vlastimil Babka
On 1 Mar 2021, at 10:09, David Hildenbrand wrote:
> The information that some PFNs are busy is:
> a) not helpful for ordinary users: we don't even know *who* called
> alloc_contig_range(). This is certainly not worth a pr_info.*().
> b) not really helpful for debugging: we don't have any details *why*
> these PFNs are busy, and that is what we usually care about.
> c) not complete: there are other cases where we fail alloc_contig_range()
> using different paths that are not getting recorded.
>
> For example, we reach this path once we succeeded in isolating pageblocks,
> but failed to migrate some pages - which can happen easily on
> ZONE_NORMAL (i.e., has_unmovable_pages() is racy) but also on ZONE_MOVABLE
> (i.e., we would have to retry longer to migrate).
>
> For example via virtio-mem when unplugging memory, we can create quite
> some noise (especially with ZONE_NORMAL) that is not of interest to
> users - it's expected that some allocations may fail as memory is busy.
>
> Let's just drop that pr_info_ratelimited() and rather implement a dynamic
> debugging mechanism in the future that can give us a better reason why
> alloc_contig_range() failed on specific pages.
>
> Cc: Andrew Morton <akpm@linux-foundation.org>
> Cc: Minchan Kim <minchan@kernel.org>
> Cc: Oscar Salvador <osalvador@suse.de>
> Cc: Michal Hocko <mhocko@kernel.org>
> Cc: Vlastimil Babka <vbabka@suse.cz>
> Signed-off-by: David Hildenbrand <david@redhat.com>
> ---
LGTM. I agree that the printout is not quite useful.
Reviewed-by: Zi Yan <ziy@nvidia.com>
—
Best Regards,
Yan Zi
* Re: [PATCH v1] mm/page_alloc: drop pr_info_ratelimited() in alloc_contig_range()
From: Michal Hocko @ 2021-03-01 15:52 UTC
To: David Hildenbrand
Cc: linux-kernel, linux-mm, Andrew Morton, Minchan Kim,
Oscar Salvador, Vlastimil Babka
On Mon 01-03-21 16:09:45, David Hildenbrand wrote:
> The information that some PFNs are busy is:
> a) not helpful for ordinary users: we don't even know *who* called
> alloc_contig_range(). This is certainly not worth a pr_info.*().
> b) not really helpful for debugging: we don't have any details *why*
> these PFNs are busy, and that is what we usually care about.
> c) not complete: there are other cases where we fail alloc_contig_range()
> using different paths that are not getting recorded.
>
> For example, we reach this path once we succeeded in isolating pageblocks,
> but failed to migrate some pages - which can happen easily on
> ZONE_NORMAL (i.e., has_unmovable_pages() is racy) but also on ZONE_MOVABLE
> (i.e., we would have to retry longer to migrate).
>
> For example via virtio-mem when unplugging memory, we can create quite
> some noise (especially with ZONE_NORMAL) that is not of interest to
> users - it's expected that some allocations may fail as memory is busy.
>
> Let's just drop that pr_info_ratelimited() and rather implement a dynamic
> debugging mechanism in the future that can give us a better reason why
> alloc_contig_range() failed on specific pages.
>
> Cc: Andrew Morton <akpm@linux-foundation.org>
> Cc: Minchan Kim <minchan@kernel.org>
> Cc: Oscar Salvador <osalvador@suse.de>
> Cc: Michal Hocko <mhocko@kernel.org>
> Cc: Vlastimil Babka <vbabka@suse.cz>
> Signed-off-by: David Hildenbrand <david@redhat.com>
Acked-by: Michal Hocko <mhocko@suse.com>
> ---
> mm/page_alloc.c | 2 --
> 1 file changed, 2 deletions(-)
>
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index 519a60d5b6f7..efb924fb13e8 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -8647,8 +8647,6 @@ int alloc_contig_range(unsigned long start, unsigned long end,
>
> /* Make sure the range is really isolated. */
> if (test_pages_isolated(outer_start, end, 0)) {
> - pr_info_ratelimited("%s: [%lx, %lx) PFNs busy\n",
> - __func__, outer_start, end);
> ret = -EBUSY;
> goto done;
> }
> --
> 2.29.2
--
Michal Hocko
SUSE Labs
* Re: [PATCH v1] mm/page_alloc: drop pr_info_ratelimited() in alloc_contig_range()
From: Oscar Salvador @ 2021-03-01 22:05 UTC
To: David Hildenbrand
Cc: linux-kernel, linux-mm, Andrew Morton, Minchan Kim, Michal Hocko,
Vlastimil Babka
On Mon, Mar 01, 2021 at 04:09:45PM +0100, David Hildenbrand wrote:
> The information that some PFNs are busy is:
> a) not helpful for ordinary users: we don't even know *who* called
> alloc_contig_range(). This is certainly not worth a pr_info.*().
> b) not really helpful for debugging: we don't have any details *why*
> these PFNs are busy, and that is what we usually care about.
> c) not complete: there are other cases where we fail alloc_contig_range()
> using different paths that are not getting recorded.
>
> For example, we reach this path once we succeeded in isolating pageblocks,
> but failed to migrate some pages - which can happen easily on
> ZONE_NORMAL (i.e., has_unmovable_pages() is racy) but also on ZONE_MOVABLE
> (i.e., we would have to retry longer to migrate).
>
> For example via virtio-mem when unplugging memory, we can create quite
> some noise (especially with ZONE_NORMAL) that is not of interest to
> users - it's expected that some allocations may fail as memory is busy.
>
> Let's just drop that pr_info_ratelimited() and rather implement a dynamic
> debugging mechanism in the future that can give us a better reason why
> alloc_contig_range() failed on specific pages.
>
> Cc: Andrew Morton <akpm@linux-foundation.org>
> Cc: Minchan Kim <minchan@kernel.org>
> Cc: Oscar Salvador <osalvador@suse.de>
> Cc: Michal Hocko <mhocko@kernel.org>
> Cc: Vlastimil Babka <vbabka@suse.cz>
> Signed-off-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Oscar Salvador <osalvador@suse.de>
--
Oscar Salvador
SUSE L3
* Re: [PATCH v1] mm/page_alloc: drop pr_info_ratelimited() in alloc_contig_range()
From: Minchan Kim @ 2021-03-02 17:29 UTC
To: David Hildenbrand
Cc: linux-kernel, linux-mm, Andrew Morton, Oscar Salvador,
Michal Hocko, Vlastimil Babka
On Mon, Mar 01, 2021 at 04:09:45PM +0100, David Hildenbrand wrote:
> The information that some PFNs are busy is:
> a) not helpful for ordinary users: we don't even know *who* called
> alloc_contig_range(). This is certainly not worth a pr_info.*().
> b) not really helpful for debugging: we don't have any details *why*
> these PFNs are busy, and that is what we usually care about.
> c) not complete: there are other cases where we fail alloc_contig_range()
> using different paths that are not getting recorded.
>
> For example, we reach this path once we succeeded in isolating pageblocks,
> but failed to migrate some pages - which can happen easily on
> ZONE_NORMAL (i.e., has_unmovable_pages() is racy) but also on ZONE_MOVABLE
> (i.e., we would have to retry longer to migrate).
>
> For example via virtio-mem when unplugging memory, we can create quite
> some noise (especially with ZONE_NORMAL) that is not of interest to
> users - it's expected that some allocations may fail as memory is busy.
>
> Let's just drop that pr_info_ratelimited() and rather implement a dynamic
> debugging mechanism in the future that can give us a better reason why
> alloc_contig_range() failed on specific pages.
>
> Cc: Andrew Morton <akpm@linux-foundation.org>
> Cc: Minchan Kim <minchan@kernel.org>
> Cc: Oscar Salvador <osalvador@suse.de>
> Cc: Michal Hocko <mhocko@kernel.org>
> Cc: Vlastimil Babka <vbabka@suse.cz>
> Signed-off-by: David Hildenbrand <david@redhat.com>
I agree it's not useful to dump just the range. Instead, dump_page()
would be much more helpful.
Acked-by: Minchan Kim <minchan@kernel.org>