From: Suren Baghdasaryan <surenb@google.com>
To: Johannes Weiner <hannes@cmpxchg.org>
Cc: Minchan Kim <minchan@kernel.org>,
	akpm@linux-foundation.org, mhocko@suse.com, peterz@infradead.org,
	guro@fb.com, shakeelb@google.com, timmurray@google.com,
	linux-mm@kvack.org, linux-kernel@vger.kernel.org,
	kernel-team@android.com
Subject: Re: [PATCH 1/1] mm: count time in drain_all_pages during direct reclaim as memory pressure
Date: Wed, 23 Feb 2022 11:06:18 -0800	[thread overview]
Message-ID: <CAJuCfpEOHKnsZW+Yo-p8PEPTyO_CK-cV1FOresT+skUAuEhXRw@mail.gmail.com> (raw)
In-Reply-To: <YhaDACTHpIT5rDB1@cmpxchg.org>

On Wed, Feb 23, 2022 at 10:54 AM Johannes Weiner <hannes@cmpxchg.org> wrote:
>
> On Sun, Feb 20, 2022 at 08:52:38AM -0800, Suren Baghdasaryan wrote:
> > On Sat, Feb 19, 2022 at 4:40 PM Minchan Kim <minchan@kernel.org> wrote:
> > >
> > > On Sat, Feb 19, 2022 at 09:49:40AM -0800, Suren Baghdasaryan wrote:
> > > > When page allocation in the direct reclaim path fails, the system will
> > > > make one attempt to shrink per-cpu page lists and free pages from
> > > > high alloc reserves. Draining per-cpu pages into the buddy allocator can
> > > > be a very slow operation because it's done using workqueues and the
> > > > task in direct reclaim waits for all of them to finish before
> > >
> > > Yes, drain_all_pages is seriously slow (100ms - 150ms on Android),
> > > especially when CPUs are fully packed. It was also spotted in CMA
> > > allocation even when there was no memory pressure.
> >
> > Thanks for the input, Minchan!
> > In my tests I've seen 50-60ms delays in a single drain_all_pages call,
> > but I can imagine there are worse cases.
> >
> > >
> > > > proceeding. Currently this time is not accounted as a psi memory stall.
> > >
> > > Good spot.
> > >
> > > >
> > > > While testing mobile devices under extreme memory pressure, when
> > > > allocations are failing during direct reclaim, we noticed that psi
> > > > events which would be expected in such conditions were not triggered.
> > > > After profiling these cases it was determined that the reason for
> > > > missing psi events was that a big chunk of the time spent in direct
> > > > reclaim is not accounted as memory stall, so psi would not
> > > > reach the levels at which an event is generated. Further investigation
> > > > revealed that the bulk of that unaccounted time was spent inside
> > > > the drain_all_pages call.
> > > >
> > > > Annotate drain_all_pages and unreserve_highatomic_pageblock during
> > > > page allocation failure in the direct reclaim path so that delays
> > > > caused by these calls are accounted as memory stall.
> > > >
> > > > Reported-by: Tim Murray <timmurray@google.com>
> > > > Signed-off-by: Suren Baghdasaryan <surenb@google.com>
> > > > ---
> > > >  mm/page_alloc.c | 4 ++++
> > > >  1 file changed, 4 insertions(+)
> > > >
> > > > diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> > > > index 3589febc6d31..7fd0d392b39b 100644
> > > > --- a/mm/page_alloc.c
> > > > +++ b/mm/page_alloc.c
> > > > @@ -4639,8 +4639,12 @@ __alloc_pages_direct_reclaim(gfp_t gfp_mask, unsigned int order,
> > > >        * Shrink them and try again
> > > >        */
> > > >       if (!page && !drained) {
> > > > +             unsigned long pflags;
> > > > +
> > > > +             psi_memstall_enter(&pflags);
> > > >               unreserve_highatomic_pageblock(ac, false);
> > > >               drain_all_pages(NULL);
> > > > +             psi_memstall_leave(&pflags);
> > >
> > > Instead of annotating the specific drain_all_pages, how about
> > > moving the annotation from __perform_reclaim to
> > > __alloc_pages_direct_reclaim?
> >
> > I'm fine with that approach too. Let's wait for Johannes' input before
> > I make any changes.
>
> I think the change makes sense, even if the workqueue fix speeds up
> the drain. I agree with Minchan about moving the annotation upward.
>
> With it moved, please feel free to add
> Acked-by: Johannes Weiner <hannes@cmpxchg.org>

Thanks Johannes!
I'll move psi_memstall_enter/psi_memstall_leave from __perform_reclaim
into __alloc_pages_direct_reclaim to cover it completely. After that I
will continue fixing the workqueue issue.
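
Roughly, the move would result in something like the sketch below (based
on the current shape of __alloc_pages_direct_reclaim; a sketch of the
idea, not the final patch):

static inline struct page *
__alloc_pages_direct_reclaim(gfp_t gfp_mask, unsigned int order,
		unsigned int alloc_flags, const struct alloc_context *ac,
		unsigned long *did_some_progress)
{
	struct page *page = NULL;
	unsigned long pflags;
	bool drained = false;

	/*
	 * Enter the memstall annotation here instead of inside
	 * __perform_reclaim() so that the unreserve/drain retry path
	 * below is also counted as memory stall.
	 */
	psi_memstall_enter(&pflags);
	*did_some_progress = __perform_reclaim(gfp_mask, order, ac);
	if (unlikely(!(*did_some_progress)))
		goto out;

retry:
	page = get_page_from_freelist(gfp_mask, order, alloc_flags, ac);

	/*
	 * If an allocation failed after direct reclaim, it could be because
	 * pages are pinned on the per-cpu lists or in high alloc reserves.
	 * Shrink them and try again.
	 */
	if (!page && !drained) {
		unreserve_highatomic_pageblock(ac, false);
		drain_all_pages(NULL);
		drained = true;
		goto retry;
	}
out:
	psi_memstall_leave(&pflags);

	return page;
}

__perform_reclaim() would drop its own psi_memstall_enter()/
psi_memstall_leave() pair accordingly.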

Thread overview: 13+ messages
2022-02-19 17:49 [PATCH 1/1] mm: count time in drain_all_pages during direct reclaim as memory pressure Suren Baghdasaryan
2022-02-20  0:40 ` Minchan Kim
2022-02-20 16:52   ` Suren Baghdasaryan
2022-02-23 18:54     ` Johannes Weiner
2022-02-23 19:06       ` Suren Baghdasaryan [this message]
2022-02-23 19:42         ` Suren Baghdasaryan
2022-02-21  8:55 ` Michal Hocko
2022-02-21 10:41   ` Petr Mladek
2022-02-21 19:13     ` Suren Baghdasaryan
2022-02-21 19:09   ` Suren Baghdasaryan
2022-02-22 19:47   ` Tim Murray
2022-02-23  0:15     ` Suren Baghdasaryan
2022-03-03  2:59     ` Hillf Danton
