linux-mm.kvack.org archive mirror
From: Minchan Kim <minchan@kernel.org>
To: David Hildenbrand <david@redhat.com>
Cc: Michal Hocko <mhocko@suse.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	linux-mm <linux-mm@kvack.org>,
	LKML <linux-kernel@vger.kernel.org>,
	joaodias@google.com
Subject: Re: [PATCH] mm: be more verbose for alloc_contig_range faliures
Date: Thu, 4 Mar 2021 10:11:35 -0800
Message-ID: <YEEi1+TREGBElE5H@google.com>
In-Reply-To: <c08662f3-6ae1-4fb5-1c4f-840a70fad035@redhat.com>

On Thu, Mar 04, 2021 at 06:23:09PM +0100, David Hildenbrand wrote:
> > > You want to debug something, so you try triggering it and capturing debug
> > > data. There are not that many alloc_contig_range() users such that this
> > > would really be an issue to isolate ...
> > 
> > cma_alloc uses alloc_contig_range and cma_alloc has lots of users.
> > It is even exported through dmabuf, so any userspace process can trigger
> > the allocation on its own. Some of them can tolerate the failure; for
> > the rest it could be critical. We shouldn't assume only a limited set of
> > kernel use cases.
> 
> Assume you are debugging allocation failures. You either collect the data
> yourself or ask someone to send you that output. You care about any
> alloc_contig_range() allocation failures that shouldn't happen, don't you?
> 
> > 
> > > 
> > > Strictly speaking: any allocation failure on ZONE_MOVABLE or CMA is
> > > problematic (putting NORETRY logic and similar aside). So any such
> > > page you hit is worth investigating and, therefore, worth getting logged for
> > > debugging purposes.
> > 
> > If you believe that every alloc_contig_range failure is problematic
> 
> Every one where we should have guarantees I guess: ZONE_MOVABLE or
> MIGRATE_CMA. On ZONE_NORMAL, there are no guarantees.

Indeed.
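
The rule agreed on here (dump only where a guarantee exists, that is
ZONE_MOVABLE or MIGRATE_CMA, and not ZONE_NORMAL) can be written as a
small predicate. This is only a user-space sketch: the enums and the
function name are hypothetical stand-ins, not the kernel's own zone or
migratetype definitions, and the patch quoted below does not contain such
a check.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins; the kernel's zone and migratetype enums differ. */
enum sketch_zone { SKETCH_ZONE_NORMAL, SKETCH_ZONE_MOVABLE };
enum sketch_migratetype { SKETCH_MIGRATE_MOVABLE, SKETCH_MIGRATE_CMA };

/*
 * Returns true when a failed contiguous allocation is worth dumping,
 * i.e. when the range was in a movable zone or a CMA area, where
 * migration is expected to succeed.
 */
bool sketch_should_dump(enum sketch_zone zone, enum sketch_migratetype mt)
{
	/* Reject values outside the two hypothetical enums. */
	assert(zone == SKETCH_ZONE_NORMAL || zone == SKETCH_ZONE_MOVABLE);
	assert(mt == SKETCH_MIGRATE_MOVABLE || mt == SKETCH_MIGRATE_CMA);

	return zone == SKETCH_ZONE_MOVABLE || mt == SKETCH_MIGRATE_CMA;
}
```

With this shape, a failure on a normal zone without CMA is skipped, and
every other combination is reported.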

> 
> > and that no real-world example like the one I mentioned above exists,
> > I am happy to add this chunk to support dynamic debugging.
> > Okay?
> > 
> > +#if defined(CONFIG_DYNAMIC_DEBUG) || \
> > +        (defined(CONFIG_DYNAMIC_DEBUG_CORE) && defined(DYNAMIC_DEBUG_MODULE))
> > +static DEFINE_RATELIMIT_STATE(alloc_contig_ratelimit_state,
> > +               DEFAULT_RATELIMIT_INTERVAL, DEFAULT_RATELIMIT_BURST);
> > +int alloc_contig_ratelimit(void)
> > +{
> > +       return __ratelimit(&alloc_contig_ratelimit_state);
> > +}
> > +
> 
> ^ do we need ratelimiting with dynamic debugging enabled?

The main argument was debug message flooding. Even with dynamic
debugging, that issue does not go away.
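
To make the flooding concern concrete, here is a minimal user-space sketch
of windowed rate limiting in the spirit of an interval/burst pair. It is
not the kernel's __ratelimit() implementation: the struct, the function
names and the tick-based clock are all hypothetical and exist only for
illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a ratelimit state; not the kernel struct. */
struct rl_state {
	uint64_t interval;    /* window length, in caller-defined ticks */
	unsigned int burst;   /* messages allowed per window */
	uint64_t begin;       /* tick at which the current window started */
	unsigned int printed; /* messages allowed so far in this window */
	unsigned int missed;  /* messages suppressed so far in this window */
	bool started;         /* false until the first call opens a window */
};

void rl_init(struct rl_state *rs, uint64_t interval, unsigned int burst)
{
	assert(rs != NULL);
	rs->interval = interval;
	rs->burst = burst;
	rs->begin = 0;
	rs->printed = 0;
	rs->missed = 0;
	rs->started = false;
}

/* Returns true if the caller may print at tick `now`, false to suppress. */
bool rl_allow(struct rl_state *rs, uint64_t now)
{
	assert(rs != NULL);

	/* Open a fresh window on first use or once the interval has elapsed. */
	if (!rs->started || now - rs->begin >= rs->interval) {
		rs->begin = now;
		rs->printed = 0;
		rs->missed = 0;
		rs->started = true;
	}

	if (rs->printed < rs->burst) {
		rs->printed++;
		return true;
	}

	rs->missed++;
	return false;
}
```

With a burst of 2 and an interval of 5 ticks, the first two calls inside a
window are allowed and later ones are suppressed until the window rolls
over. That is why a dump of many pages per failure still needs a cap even
when it is switched on on demand.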

> 
> > +void dump_migrate_failure_pages(struct list_head *page_list)
> > +{
> > +       DEFINE_DYNAMIC_DEBUG_METADATA(descriptor,
> > +                       "migrate failure");
> > +       if (DYNAMIC_DEBUG_BRANCH(descriptor) &&
> > +                       alloc_contig_ratelimit()) {
> > +               struct page *page;
> > +
> > +               WARN(1, "failed callstack");
> > +               list_for_each_entry(page, page_list, lru)
> > +                       dump_page(page, "migration failure");
> 
> Are all pages on the list guaranteed to be problematic, or only the first
> entry? I assume all.

All.

> 
> > +       }
> > +}
> > +#else
> > +static inline void dump_migrate_failure_pages(struct list_head *page_list)
> > +{
> > +}
> > +#endif
> > +
> >   /* [start, end) must belong to a single zone. */
> >   static int __alloc_contig_migrate_range(struct compact_control *cc,
> >                                          unsigned long start, unsigned long end)
> > @@ -8496,6 +8522,7 @@ static int __alloc_contig_migrate_range(struct compact_control *cc,
> >                                  NULL, (unsigned long)&mtc, cc->mode, MR_CONTIG_RANGE);
> >          }
> >          if (ret < 0) {
> > +               dump_migrate_failure_pages(&cc->migratepages);
> >                  putback_movable_pages(&cc->migratepages);
> >                  return ret;
> >          }
> > 
> > 
> 
> If that's the way dynamic debugging is configured/enabled (still have to
> look into it) - yes, that goes in the right direction. As I said above,
> you should dump only where we have some kind of guarantees I assume.

Sure, let me wait for your review before sending the next revision.
Thanks for the review!

