linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
From: "Kirill A. Shutemov" <kirill@shutemov.name>
To: Michal Hocko <mhocko@kernel.org>
Cc: Cannon Matthews <cannonmatthews@google.com>,
	Mike Kravetz <mike.kravetz@oracle.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	Matthew Wilcox <willy@infradead.org>,
	David Rientjes <rientjes@google.com>,
	Greg Thelen <gthelen@google.com>, Salman Qazi <sqazi@google.com>,
	linux-mm@kvack.org, linux-kernel@vger.kernel.org,
	ak@linux.intel.com, x86@kernel.org
Subject: Re: [PATCH] mm: clear 1G pages with streaming stores on x86
Date: Mon, 9 Mar 2020 12:35:33 +0300	[thread overview]
Message-ID: <20200309093533.qj255nrgyofmtiqz@box> (raw)
In-Reply-To: <20200309090630.GC8447@dhcp22.suse.cz>

On Mon, Mar 09, 2020 at 10:06:30AM +0100, Michal Hocko wrote:
> On Mon 09-03-20 03:08:20, Kirill A. Shutemov wrote:
> > On Fri, Mar 06, 2020 at 05:03:53PM -0800, Cannon Matthews wrote:
> > > Reimplement clear_gigantic_page() to clear gigabytes pages using the
> > > non-temporal streaming store instructions that bypass the cache
> > > (movnti), since an entire 1GiB region will not fit in the cache anyway.
> > > 
> > > Doing an mlock() on a 512GiB 1G-hugetlb region previously would take on
> > > average 134 seconds, about 260ms/GiB which is quite slow. Using `movnti`
> > > and optimizing the control flow over the constituent small pages, this
> > > can be improved roughly by a factor of 3-4x, with the 512GiB mlock()
> > > taking only 34 seconds on average, or 67ms/GiB.
> > > 
> > > The assembly code for the __clear_page_nt routine is more or less
> > > taken directly from the output of gcc with -O3 for this function with
> > > some tweaks to support arbitrary sizes and moving memory barriers:
> > > 
> > > void clear_page_nt_64i (void *page)
> > > {
> > >   for (int i = 0; i < GiB /sizeof(long long int); ++i)
> > >     {
> > >       _mm_stream_si64 (((long long int*)page) + i, 0);
> > >     }
> > >   sfence();
> > > }
> > > 
> > > Tested:
> > > 	Time to `mlock()` a 512GiB region on broadwell CPU
> > > 				AVG time (s)	% imp.	ms/page
> > > 	clear_page_erms		133.584		-	261
> > > 	clear_page_nt		34.154		74.43%	67
> > 
> > Some macrobenchmark would be great too.
> > 
> > > An earlier version of this code was sent as an RFC patch ~July 2018
> > > https://patchwork.kernel.org/patch/10543193/ but never merged.
> > 
> > Andi and I tried to use MOVNTI for large/gigantic page clearing back in
> > 2012[1]. Maybe it can be useful.
> > 
> > That patchset is somewhat more complex trying to keep the memory around
> > the fault address hot in cache. In theory it should help to reduce latency
> > on the first access to the memory.
> > 
> > I was not able to get convincing numbers back then for the hardware of the
> > time. Maybe it's better now.
> > 
> > https://lore.kernel.org/r/1345470757-12005-1-git-send-email-kirill.shutemov@linux.intel.com
> 
> Thanks for the reminder. I've had only a very vague recollection. Your
> series had a much wider scope indeed. Since then we have gained
> process_huge_page which tries to optimize normal huge pages.
> 
> Gigantic huge pages are a bit different. They are much less dynamic from
> the usage POV in my experience. Micro-optimizations for the first access
> tends to not matter at all as it is usually pre-allocation scenario. On
> the other hand, speeding up the initialization sounds like a good thing
> in general. It will be a single time benefit but if the additional code
> is not hard to maintain then I would be inclined to take it even with
> "artificial" numbers state above. There really shouldn't be other downsides
> except for the code maintenance, right?

I cannot think of any, no.

-- 
 Kirill A. Shutemov


  reply	other threads:[~2020-03-09  9:35 UTC|newest]

Thread overview: 24+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-03-07  1:03 [PATCH] mm: clear 1G pages with streaming stores on x86 Cannon Matthews
2020-03-07 15:36 ` kbuild test robot
2020-03-07 22:06 ` Andrew Morton
2020-03-09  0:08 ` Kirill A. Shutemov
2020-03-09  9:06   ` Michal Hocko
2020-03-09  9:35     ` Kirill A. Shutemov [this message]
2020-03-09 11:36     ` Kirill A. Shutemov
2020-03-09 12:26       ` Michal Hocko
2020-03-09 18:01         ` Mike Kravetz
2020-03-09 15:38     ` Andi Kleen
2020-03-09 18:37       ` Matthew Wilcox
2020-03-11  0:21         ` Cannon Matthews
2020-03-11  0:54           ` Kirill A. Shutemov
2020-03-11  3:35             ` Arvind Sankar
2020-03-11  8:16               ` Kirill A. Shutemov
2020-03-11 18:32                 ` Arvind Sankar
2020-03-11 20:32                   ` Arvind Sankar
2020-03-12  0:52                     ` Kirill A. Shutemov
2020-03-31  0:40                   ` Elliott, Robert (Servers)
2020-03-16 10:18             ` Michal Hocko
2020-03-16 12:19               ` Kirill A. Shutemov
2020-03-26 19:46                 ` Matthew Wilcox
2020-03-11 15:07       ` David Laight
2020-03-09 15:33   ` Andi Kleen

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20200309093533.qj255nrgyofmtiqz@box \
    --to=kirill@shutemov.name \
    --cc=ak@linux.intel.com \
    --cc=akpm@linux-foundation.org \
    --cc=cannonmatthews@google.com \
    --cc=gthelen@google.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=mhocko@kernel.org \
    --cc=mike.kravetz@oracle.com \
    --cc=rientjes@google.com \
    --cc=sqazi@google.com \
    --cc=willy@infradead.org \
    --cc=x86@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).