From: Minchan Kim <minchan@kernel.org>
To: David Hildenbrand <david@redhat.com>
Cc: Michal Hocko <mhocko@suse.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	linux-mm <linux-mm@kvack.org>,
	LKML <linux-kernel@vger.kernel.org>,
	joaodias@google.com
Subject: Re: [PATCH] mm: be more verbose for alloc_contig_range faliures
Date: Thu, 4 Mar 2021 10:11:35 -0800
Message-ID: <YEEi1+TREGBElE5H@google.com>
In-Reply-To: <c08662f3-6ae1-4fb5-1c4f-840a70fad035@redhat.com>

On Thu, Mar 04, 2021 at 06:23:09PM +0100, David Hildenbrand wrote:
> > > You want to debug something, so you try triggering it and capturing debug
> > > data. There are not that many alloc_contig_range() users such that this
> > > would really be an issue to isolate ...
> > 
> > cma_alloc uses alloc_contig_range and cma_alloc has lots of users.
> > It is even exported by dmabuf, so any userspace could trigger the
> > allocation on its own. Some of them could be tolerant of the failure,
> > the rest of them could be critical. We shouldn't assume a limited set
> > of kernel use cases.
> 
> Assume you are debugging allocation failures. You either collect the data
> yourself or ask someone to send you that output. You care about any
> alloc_contig_range() allocation failures that shouldn't happen, don't you?
> 
> > 
> > > 
> > > Strictly speaking: any allocation failure on ZONE_MOVABLE or CMA is
> > > problematic (putting NORETRY logic and similar aside). So any such
> > > page you hit is worth investigating and, therefore, worth getting logged for
> > > debugging purposes.
> > 
> > If you believe that every alloc_contig_range failure is problematic
> 
> Every one where we should have guarantees I guess: ZONE_MOVABLE or
> MIGRATE_CMA. On ZONE_NORMAL, there are no guarantees.

Indeed.
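
To make the "only where we have guarantees" condition concrete, here is
a minimal userspace sketch. It is not kernel code: the enums and the
failure_worth_dumping() helper are hypothetical stand-ins, and the rule
it encodes is just the one stated above (ZONE_MOVABLE or MIGRATE_CMA).

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the kernel's zone and migratetype values. */
enum sketch_zone { SKETCH_ZONE_NORMAL, SKETCH_ZONE_MOVABLE };
enum sketch_migratetype { SKETCH_MIGRATE_MOVABLE, SKETCH_MIGRATE_CMA };

/*
 * Assumption: a failed contiguous allocation is only worth dumping when
 * the range is expected to be migratable, i.e. it lies in ZONE_MOVABLE
 * or the request targets a CMA area. Elsewhere there is no guarantee.
 */
static bool failure_worth_dumping(enum sketch_zone zone,
				  enum sketch_migratetype mt)
{
	return zone == SKETCH_ZONE_MOVABLE || mt == SKETCH_MIGRATE_CMA;
}
```

A caller would check this predicate before dumping the failed pages, so
that failures on ZONE_NORMAL without CMA stay silent.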

> 
> > and that no real-world example like the one I mentioned above exists,
> > I am happy to add this chunk to support dynamic debugging.
> > Okay?
> > 
> > +#if defined(CONFIG_DYNAMIC_DEBUG) || \
> > +        (defined(CONFIG_DYNAMIC_DEBUG_CORE) && defined(DYNAMIC_DEBUG_MODULE))
> > +static DEFINE_RATELIMIT_STATE(alloc_contig_ratelimit_state,
> > +               DEFAULT_RATELIMIT_INTERVAL, DEFAULT_RATELIMIT_BURST);
> > +int alloc_contig_ratelimit(void)
> > +{
> > +       return __ratelimit(&alloc_contig_ratelimit_state);
> > +}
> > +
> 
> ^ do we need ratelimiting with dynamic debugging enabled?

The main argument was debug message flooding. Even with dynamic
debugging enabled, that issue does not disappear.
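
To illustrate the interval/burst idea behind that rate limit, here is a
minimal userspace sketch. It is only loosely modeled on the kernel's
__ratelimit() and is not its implementation: struct rl_state, rl_init()
and rl_allow() are hypothetical names, there is no locking, and the
caller passes the current time explicitly.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical userspace analogue of a rate-limit state. */
struct rl_state {
	uint64_t interval;	/* window length, in caller-defined ticks */
	unsigned int burst;	/* events allowed per window */
	uint64_t begin;		/* start of the current window */
	unsigned int printed;	/* events allowed in the current window */
	unsigned int missed;	/* events suppressed in the current window */
	bool started;		/* false until the first event arrives */
};

static void rl_init(struct rl_state *rs, uint64_t interval, unsigned int burst)
{
	rs->interval = interval;
	rs->burst = burst;
	rs->begin = 0;
	rs->printed = 0;
	rs->missed = 0;
	rs->started = false;
}

/* Returns true if the caller may emit a message at time 'now'. */
static bool rl_allow(struct rl_state *rs, uint64_t now)
{
	/* An interval of zero disables limiting in this sketch. */
	if (rs->interval == 0)
		return true;

	/* Open a fresh window on first use or once the old one expired. */
	if (!rs->started || now - rs->begin >= rs->interval) {
		rs->begin = now;
		rs->printed = 0;
		rs->missed = 0;
		rs->started = true;
	}

	if (rs->printed < rs->burst) {
		rs->printed++;
		return true;
	}

	rs->missed++;
	return false;
}
```

With interval 5 and burst 2, the first two calls inside a window are
allowed and further calls are suppressed until the window expires. The
point for this patch is that a single failing range can put many pages
on the list, so the dump is gated per failure rather than per page.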

> 
> > +void dump_migrate_failure_pages(struct list_head *page_list)
> > +{
> > +       DEFINE_DYNAMIC_DEBUG_METADATA(descriptor,
> > +                       "migrate failure");
> > +       if (DYNAMIC_DEBUG_BRANCH(descriptor) &&
> > +                       alloc_contig_ratelimit()) {
> > +               struct page *page;
> > +
> > +               WARN(1, "failed callstack");
> > +               list_for_each_entry(page, page_list, lru)
> > +                       dump_page(page, "migration failure");
> 
> Are all pages on the list guaranteed to be problematic, or only the first
> entry? I assume all.

All.

> 
> > +       }
> > +}
> > +#else
> > +static inline void dump_migrate_failure_pages(struct list_head *page_list)
> > +{
> > +}
> > +#endif
> > +
> >   /* [start, end) must belong to a single zone. */
> >   static int __alloc_contig_migrate_range(struct compact_control *cc,
> >                                          unsigned long start, unsigned long end)
> > @@ -8496,6 +8522,7 @@ static int __alloc_contig_migrate_range(struct compact_control *cc,
> >                                  NULL, (unsigned long)&mtc, cc->mode, MR_CONTIG_RANGE);
> >          }
> >          if (ret < 0) {
> > +               dump_migrate_failure_pages(&cc->migratepages);
> >                  putback_movable_pages(&cc->migratepages);
> >                  return ret;
> >          }
> > 
> > 
> 
> If that's the way dynamic debugging is configured/enabled (still have to
> look into it) - yes, that goes in the right direction. As I said above,
> you should dump only where we have some kind of guarantees I assume.

Sure, let me wait for your review before sending the next revision.
Thanks for the review!
