From: Aaron Lu <aaron.lu@intel.com>
To: "Figo.zhang" <figo1802@gmail.com>
Cc: Linux MM <linux-mm@kvack.org>,
	LKML <linux-kernel@vger.kernel.org>,
	Andrew Morton <akpm@linux-foundation.org>,
	Huang Ying <ying.huang@intel.com>,
	Dave Hansen <dave.hansen@linux.intel.com>,
	Kemi Wang <kemi.wang@intel.com>,
	Tim Chen <tim.c.chen@linux.intel.com>,
	Andi Kleen <ak@linux.intel.com>, Michal Hocko <mhocko@suse.com>,
	Vlastimil Babka <vbabka@suse.cz>,
	Mel Gorman <mgorman@techsingularity.net>,
	Matthew Wilcox <willy@infradead.org>,
	Daniel Jordan <daniel.m.jordan@oracle.com>
Subject: Re: [RFC PATCH v2 3/4] mm/rmqueue_bulk: alloc without touching individual page structure
Date: Wed, 21 Mar 2018 09:52:23 +0800
Message-ID: <20180321015223.GA28705@intel.com>
In-Reply-To: <CAF7GXvpzgc0vsJemUYQPhPFte8b8a4nBFo=iwZBTdM1Y2eoHYw@mail.gmail.com>

On Tue, Mar 20, 2018 at 03:29:33PM -0700, Figo.zhang wrote:
> 2018-03-20 1:54 GMT-07:00 Aaron Lu <aaron.lu@intel.com>:
> 
> > Profile on Intel Skylake server shows the most time consuming part
> > under zone->lock on allocation path is accessing those to-be-returned
> > page's "struct page" on the free_list inside zone->lock. One explanation
> > is, different CPUs are releasing pages to the head of free_list and
> > those page's 'struct page' may very well be cache cold for the allocating
> > CPU when it grabs these pages from free_list' head. The purpose here
> > is to avoid touching these pages one by one inside zone->lock.
> >
> > One idea is, we just take the requested number of pages off free_list
> > with something like list_cut_position() and then adjust nr_free of
> > free_area accordingly inside zone->lock and other operations like
> > clearing PageBuddy flag for these pages are done outside of zone->lock.
> >
> 
> sounds good!
> your idea is reducing the lock contention in rmqueue_bulk() function by

Right, the idea is to reduce the lock hold time.

> split the order-0
> freelist into two list, one is without zone->lock, other is need zone->lock?

But not by splitting the freelist into two lists; I didn't do that.
I moved part of the work previously done inside the lock to outside it,
i.e. clearing the PageBuddy flag etc. is now done outside, so that we do
not pay the cache miss penalty on those "struct page"s while holding the
lock and making all other CPUs wait.
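
To make the shape of that change concrete, here is a minimal userspace
sketch. It is not the patch: fake_page, fake_zone and fake_rmqueue_bulk
are made-up names, a pthread mutex stands in for zone->lock, and the free
list is simplified to an array-backed stack so that detaching a batch
does not dereference any page. Only the bookkeeping happens under the
lock; the per-page flag clearing (the likely cache-cold accesses) happens
after the lock is dropped.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <string.h>

#define FAKE_ZONE_MAX 64

/* Hypothetical stand-in for struct page; not a kernel type. */
struct fake_page {
	bool buddy;		/* stands in for the PageBuddy flag */
};

/* Hypothetical stand-in for struct zone plus one free_area. */
struct fake_zone {
	pthread_mutex_t lock;			/* stands in for zone->lock */
	struct fake_page *free[FAKE_ZONE_MAX];	/* simplified free list */
	unsigned long nr_free;
};

/* Put a page on the free list and mark it as a buddy page. */
static void fake_zone_add(struct fake_zone *zone, struct fake_page *page)
{
	pthread_mutex_lock(&zone->lock);
	assert(zone->nr_free < FAKE_ZONE_MAX);
	page->buddy = true;
	zone->free[zone->nr_free++] = page;
	pthread_mutex_unlock(&zone->lock);
}

/*
 * Take up to 'count' pages into 'out' and return how many were taken.
 * Under the lock we only adjust nr_free and copy pointers; no page is
 * dereferenced until the lock has been released.
 */
static unsigned long fake_rmqueue_bulk(struct fake_zone *zone,
				       unsigned long count,
				       struct fake_page **out)
{
	unsigned long i, n;

	pthread_mutex_lock(&zone->lock);
	n = count < zone->nr_free ? count : zone->nr_free;
	zone->nr_free -= n;
	memcpy(out, &zone->free[zone->nr_free], n * sizeof(*out));
	pthread_mutex_unlock(&zone->lock);

	/* Per-page work, done without holding the lock. */
	for (i = 0; i < n; i++)
		out[i]->buddy = false;

	return n;
}
```

The detached pages are private to the caller once they are off the free
list, which is why touching them after the unlock is safe in this sketch.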

> 
> it seems that it is a big lock granularity holding the zone->lock in
> rmqueue_bulk() ,
> why not we change like it?

It is believed that frequently taking and dropping the lock is worse
than taking it once, doing everything needed, and then dropping it.

> 
> static int rmqueue_bulk(struct zone *zone, unsigned int order,
>             unsigned long count, struct list_head *list,
>             int migratetype, bool cold)
> {
> 
>     for (i = 0; i < count; ++i) {
>         spin_lock(&zone->lock);
>         struct page *page = __rmqueue(zone, order, migratetype);
>        spin_unlock(&zone->lock);
>        ...
>     }

In this case, spin_lock() and spin_unlock() should be outside the loop.

>     __mod_zone_page_state(zone, NR_FREE_PAGES, -(i << order));
> 
>     return i;
> }
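
As a rough illustration of the two locking patterns discussed for the
loop above, here is a small userspace sketch. Everything in it is
hypothetical (counted_lock, take_lock_per_item, take_lock_once); a
pthread mutex stands in for zone->lock and a plain counter stands in for
the free pages. It only counts how often the lock is acquired for the
same batch; it does not measure contention.

```c
#include <pthread.h>

/* Hypothetical mutex wrapper that counts acquisitions. */
struct counted_lock {
	pthread_mutex_t mutex;
	unsigned long acquisitions;
};

static void counted_lock_acquire(struct counted_lock *l)
{
	pthread_mutex_lock(&l->mutex);
	l->acquisitions++;	/* updated while the mutex is held */
}

static void counted_lock_release(struct counted_lock *l)
{
	pthread_mutex_unlock(&l->mutex);
}

/* Variant from the quoted code: lock and unlock once per item. */
static unsigned long take_lock_per_item(struct counted_lock *l,
					unsigned long *pool,
					unsigned long count)
{
	unsigned long i;

	for (i = 0; i < count; i++) {
		counted_lock_acquire(l);
		if (*pool == 0) {
			counted_lock_release(l);
			break;
		}
		(*pool)--;
		counted_lock_release(l);
	}
	return i;
}

/* Variant with the lock outside the loop: one acquisition per batch. */
static unsigned long take_lock_once(struct counted_lock *l,
				    unsigned long *pool,
				    unsigned long count)
{
	unsigned long i;

	counted_lock_acquire(l);
	for (i = 0; i < count; i++) {
		if (*pool == 0)
			break;
		(*pool)--;
	}
	counted_lock_release(l);
	return i;
}
```

For a batch of N items the first variant acquires the lock N times and
the second acquires it once, at the cost of a longer single hold.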
