All of lore.kernel.org
 help / color / mirror / Atom feed
From: "Figo.zhang" <figo1802@gmail.com>
To: Aaron Lu <aaron.lu@intel.com>
Cc: Linux MM <linux-mm@kvack.org>,
	LKML <linux-kernel@vger.kernel.org>,
	Andrew Morton <akpm@linux-foundation.org>,
	Huang Ying <ying.huang@intel.com>,
	Dave Hansen <dave.hansen@linux.intel.com>,
	Kemi Wang <kemi.wang@intel.com>,
	Tim Chen <tim.c.chen@linux.intel.com>,
	Andi Kleen <ak@linux.intel.com>, Michal Hocko <mhocko@suse.com>,
	Vlastimil Babka <vbabka@suse.cz>,
	Mel Gorman <mgorman@techsingularity.net>,
	Matthew Wilcox <willy@infradead.org>,
	Daniel Jordan <daniel.m.jordan@oracle.com>
Subject: Re: [RFC PATCH v2 2/4] mm/__free_one_page: skip merge for order-0 page unless compaction failed
Date: Tue, 20 Mar 2018 21:21:33 -0700	[thread overview]
Message-ID: <CAF7GXvrQG0+iPu8h13coo2QW7WxNhjHA1JAaOYoEBBB9-obRSQ@mail.gmail.com> (raw)
In-Reply-To: <20180321015944.GB28705@intel.com>

[-- Attachment #1: Type: text/plain, Size: 5113 bytes --]

2018-03-20 18:59 GMT-07:00 Aaron Lu <aaron.lu@intel.com>:

> On Tue, Mar 20, 2018 at 03:58:51PM -0700, Figo.zhang wrote:
> > 2018-03-20 1:54 GMT-07:00 Aaron Lu <aaron.lu@intel.com>:
> >
> > > Running will-it-scale/page_fault1 process mode workload on a 2 sockets
> > > Intel Skylake server showed severe lock contention of zone->lock, as
> > > high as about 80%(42% on allocation path and 35% on free path) CPU
> > > cycles are burnt spinning. With perf, the most time consuming part
> inside
> > > that lock on free path is cache missing on page structures, mostly on
> > > the to-be-freed page's buddy due to merging.
> > >
> > > One way to avoid this overhead is not do any merging at all for order-0
> > > pages. With this approach, the lock contention for zone->lock on free
> > > path dropped to 1.1% but allocation side still has as high as 42% lock
> > > contention. In the meantime, the dropped lock contention on free side
> > > doesn't translate to performance increase, instead, it's consumed by
> > > increased lock contention of the per node lru_lock(rose from 5% to 37%)
> > > and the final performance slightly dropped about 1%.
> > >
> > > Though performance dropped a little, it almost eliminated zone lock
> > > contention on free path and it is the foundation for the next patch
> > > that eliminates zone lock contention for allocation path.
> > >
> > > A new document file called "struct_page_filed" is added to explain
> > > the newly reused field in "struct page".
> > >
> > > Suggested-by: Dave Hansen <dave.hansen@intel.com>
> > > Signed-off-by: Aaron Lu <aaron.lu@intel.com>
> > > ---
> > >  Documentation/vm/struct_page_field |  5 +++
> > >  include/linux/mm_types.h           |  1 +
> > >  mm/compaction.c                    | 13 +++++-
> > >  mm/internal.h                      | 27 ++++++++++++
> > >  mm/page_alloc.c                    | 89 ++++++++++++++++++++++++++++++
> > > +++-----
> > >  5 files changed, 122 insertions(+), 13 deletions(-)
> > >  create mode 100644 Documentation/vm/struct_page_field
> > >
> > > diff --git a/Documentation/vm/struct_page_field
> b/Documentation/vm/struct_
> > > page_field
> > > new file mode 100644
> > > index 000000000000..1ab6c19ccc7a
> > > --- /dev/null
> > > +++ b/Documentation/vm/struct_page_field
> > > @@ -0,0 +1,5 @@
> > > +buddy_merge_skipped:
> > > +Used to indicate this page skipped merging when added to buddy. This
> > > +field only makes sense if the page is in Buddy and is order zero.
> > > +It's a bug if any higher order pages in Buddy has this field set.
> > > +Shares space with index.
> > > diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h
> > > index fd1af6b9591d..7edc4e102a8e 100644
> > > --- a/include/linux/mm_types.h
> > > +++ b/include/linux/mm_types.h
> > > @@ -91,6 +91,7 @@ struct page {
> > >                 pgoff_t index;          /* Our offset within mapping.
> */
> > >                 void *freelist;         /* sl[aou]b first free object
> */
> > >                 /* page_deferred_list().prev    -- second tail page */
> > > +               bool buddy_merge_skipped; /* skipped merging when
> added to
> > > buddy */
> > >         };
> > >
> > >         union {
> > > diff --git a/mm/compaction.c b/mm/compaction.c
> > > index 2c8999d027ab..fb9031fdca41 100644
> > > --- a/mm/compaction.c
> > > +++ b/mm/compaction.c
> > > @@ -776,8 +776,19 @@ isolate_migratepages_block(struct compact_control
> > > *cc, unsigned long low_pfn,
> > >                  * potential isolation targets.
> > >                  */
> > >                 if (PageBuddy(page)) {
> > > -                       unsigned long freepage_order =
> > > page_order_unsafe(page);
> > > +                       unsigned long freepage_order;
> > >
> > > +                       /*
> > > +                        * If this is a merge_skipped page, do merge
> now
> > > +                        * since high-order pages are needed. zone lock
> > > +                        * isn't taken for the merge_skipped check so
> the
> > > +                        * check could be wrong but the worst case is
> we
> > > +                        * lose a merge opportunity.
> > > +                        */
> > > +                       if (page_merge_was_skipped(page))
> > > +                               try_to_merge_page(page);
> > > +
> > > +                       freepage_order = page_order_unsafe(page);
> > >                         /*
> > >                          * Without lock, we cannot be sure that what we
> > > got is
> > >                          * a valid page order. Consider only values in
> the
> > >
> >
> > when the system memory is very very low and try a lot of failures and
> then
>
> >If the system memory is very very low, it doesn't appear there is a need
> >to do compaction since compaction needs to have enough order 0 pages to
> >make a high order one.
>
>
suppose that in free_one_page() will try to merge to high order anytime ,
but now in your patch,
those merge has postponed when system in low memory status, it is very easy
let system trigger
low memory state and get poor performance.

[-- Attachment #2: Type: text/html, Size: 6879 bytes --]

  reply	other threads:[~2018-03-21  4:21 UTC|newest]

Thread overview: 30+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-03-20  8:54 [RFC PATCH v2 0/4] Eliminate zone->lock contention for will-it-scale/page_fault1 and parallel free Aaron Lu
2018-03-20  8:54 ` [RFC PATCH v2 1/4] mm/page_alloc: use helper functions to add/remove a page to/from buddy Aaron Lu
2018-03-20 11:35   ` Vlastimil Babka
2018-03-20 13:50     ` Aaron Lu
2018-03-20  8:54 ` [RFC PATCH v2 2/4] mm/__free_one_page: skip merge for order-0 page unless compaction failed Aaron Lu
2018-03-20 11:45   ` Vlastimil Babka
2018-03-20 14:11     ` Aaron Lu
2018-03-21  7:53       ` Vlastimil Babka
2018-03-22 17:15       ` Matthew Wilcox
2018-03-22 18:39         ` Daniel Jordan
2018-03-22 18:50           ` Matthew Wilcox
2018-03-20 22:58   ` Figo.zhang
2018-03-21  1:59     ` Aaron Lu
2018-03-21  4:21       ` Figo.zhang [this message]
2018-03-21  4:53         ` Aaron Lu
2018-03-21  5:59           ` Figo.zhang
2018-03-21  7:42             ` Aaron Lu
2018-03-20  8:54 ` [RFC PATCH v2 3/4] mm/rmqueue_bulk: alloc without touching individual page structure Aaron Lu
2018-03-20 22:29   ` Figo.zhang
2018-03-21  1:52     ` Aaron Lu
2018-03-21 12:55   ` Vlastimil Babka
2018-03-21 15:01     ` Aaron Lu
2018-03-29 19:16       ` Daniel Jordan
2018-03-20  8:54 ` [RFC PATCH v2 4/4] mm/free_pcppages_bulk: reduce overhead of cluster operation on free path Aaron Lu
2018-03-21 17:44 ` [RFC PATCH v2 0/4] Eliminate zone->lock contention for will-it-scale/page_fault1 and parallel free Daniel Jordan
2018-03-22  1:30   ` Aaron Lu
2018-03-22 11:20     ` Daniel Jordan
2018-03-29 19:19 ` Daniel Jordan
2018-03-30  1:42   ` Aaron Lu
2018-03-30 14:27     ` Daniel Jordan

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=CAF7GXvrQG0+iPu8h13coo2QW7WxNhjHA1JAaOYoEBBB9-obRSQ@mail.gmail.com \
    --to=figo1802@gmail.com \
    --cc=aaron.lu@intel.com \
    --cc=ak@linux.intel.com \
    --cc=akpm@linux-foundation.org \
    --cc=daniel.m.jordan@oracle.com \
    --cc=dave.hansen@linux.intel.com \
    --cc=kemi.wang@intel.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=mgorman@techsingularity.net \
    --cc=mhocko@suse.com \
    --cc=tim.c.chen@linux.intel.com \
    --cc=vbabka@suse.cz \
    --cc=willy@infradead.org \
    --cc=ying.huang@intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.