From: Vlastimil Babka <vbabka@suse.cz>
To: Will Deacon <will@kernel.org>,
	Prathu Baronia <prathu.baronia@oneplus.com>
Cc: catalin.marinas@arm.com, alexander.duyck@gmail.com,
	chintan.pandya@oneplus.com, mhocko@suse.com,
	akpm@linux-foundation.org, linux-mm@kvack.org,
	gregkh@linuxfoundation.com, gthelen@google.com, jack@suse.cz,
	ken.lin@oneplus.com, gasine.xu@oneplus.com, ying.huang@intel.com,
	mark.rutland@arm.com
Subject: Re: [PATCH v2] mm: Optimized hugepage zeroing & copying from user
Date: Tue, 21 Apr 2020 14:47:13 +0200
Message-ID: <fa663410-7eef-a10a-4e9d-42bde91451a6@suse.cz>
In-Reply-To: <20200421100932.GC17256@willie-the-truck>

On 4/21/20 12:09 PM, Will Deacon wrote:
> On Tue, Apr 21, 2020 at 03:06:21PM +0530, Prathu Baronia wrote:
>> With the v2 patch below we observe a significantly (~65%) improved zeroing
>> time for hugepages.
> 
> What patch? I assume you mean:
> 
> https://lore.kernel.org/linux-mm/20200414153829.GA15230@oneplus.com/
> 
> but you've trimmed all the details!
> 
>> We profiled clear_huge_page() using ftrace on Qualcomm's SM8150 platform
>> under controlled conditions (i.e. only CPU0 and 6 turned on and set to max
>> frequency, and DDR set to performance governor).
>> 
>> The existing method uses a reverse traversal of a section of a hugepage which,
>> based on our series of experiments, proves slower than a oneshot (v2) approach
>> on ARM64 (more details in mail thread).
>> 
>> We didn't see any benefit on x86 so v2 probably won't find any place in the main
>> memory.c code.
> 
> Do you know why you don't see any benefit on x86? It seems unusual that
> something like this would vary so wildly between two modern architectures.
> I'd like to understand what's going on.

It was suspected that current Intel can prefetch both forwards and backwards,
while the tested ARM64 microarchitecture can only prefetch forwards. Can that
be true? The current code does the clearing backwards.

>> We are currently thinking of making this optimization ARM64 specific for better
>> performance by placing this in arch/arm64/mm/memory.c(to be created) file. We
>> would really appreciate if you can share your opinion on this.
> 
> There's no need for arch-specific optimisation. Please do it in core code,
> and allow architectures to opt-out if necessary. That means you probably
> need to respond to:
> 
> https://lore.kernel.org/linux-mm/20200417074851.GE26326@shao2-debian/

Note that this can also be viewed differently. It was commit c79b57e462b5 ("mm:
hugetlb: clear target sub-page last when clearing huge page") that introduced
the existing implementation, based on x86 numbers and probably the same test
that generated the regression report. It's likely that said commit thus
regressed arm64.

In that case the generic implementation should just be reverted to be simple and
not assume any (micro)architectural details. If any architecture wants an
optimized version, it could add one as opt-in and justify it using real
workloads, not microbenchmarks.
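
To make the two orderings concrete, here is a rough user-space sketch. It is
not the kernel code: order_forward() and order_target_last() are hypothetical
helpers that only record the order in which sub-page indices would be visited.
The second one is written from my reading of the "clear target sub-page last"
idea and assumes the far end is walked first and the rest is then approached
in a left-right pattern that ends on the target.

```c
#include <assert.h>

/* Hypothetical helper: simple one-shot order, sub-pages 0..nr-1 ascending. */
static void order_forward(int nr, int *out)
{
	for (int i = 0; i < nr; i++)
		out[i] = i;
}

/*
 * Hypothetical helper: visit every sub-page exactly once, ending on 'target'
 * so that its cache lines are the most recently touched. Assumed scheme, not
 * copied from the kernel source.
 */
static void order_target_last(int nr, int target, int *out)
{
	int k = 0, base, l;

	assert(nr > 0 && target >= 0 && target < nr);

	if (2 * target <= nr) {
		/* Target in first half: walk the tail of the page backwards. */
		base = 0;
		l = target;
		for (int i = nr - 1; i >= 2 * target; i--)
			out[k++] = i;
	} else {
		/* Target in second half: walk the head of the page forwards. */
		base = nr - 2 * (nr - target);
		l = nr - target;
		for (int i = 0; i < base; i++)
			out[k++] = i;
	}

	/* Remaining 2*l sub-pages: alternate left, right, converging on target. */
	for (int i = 0; i < l; i++) {
		out[k++] = base + i;
		out[k++] = base + 2 * l - 1 - i;
	}
}
```

With nr = 8 and target = 2 the second helper yields 7 6 5 4 0 3 1 2, so part of
the walk runs through descending addresses, which is the part a forward-only
prefetcher would not help with. The first helper is the simple order a revert
would go back to.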

> because that doesn't look as rosy as the numbers you're seeing.
> 
> Thanks,
> 
> Will
> 
