linux-mm.kvack.org archive mirror
From: Michal Hocko <mhocko@suse.com>
To: Prathu Baronia <prathu.baronia@oneplus.com>
Cc: alexander.duyck@gmail.com, chintan.pandya@oneplus.com,
	ying.huang@intel.com, akpm@linux-foundation.org,
	linux-mm@kvack.org, gregkh@linuxfoundation.com,
	gthelen@google.com, jack@suse.cz, ken.lin@oneplus.com,
	gasine.xu@oneplus.com
Subject: Re: [PATCH v2] mm: Optimized hugepage zeroing & copying from user
Date: Tue, 14 Apr 2020 21:40:33 +0200	[thread overview]
Message-ID: <20200414194033.GU4629@dhcp22.suse.cz> (raw)
In-Reply-To: <20200414184743.GB2097@oneplus.com>

On Wed 15-04-20 00:17:44, Prathu Baronia wrote:
> On 04/14/2020 19:03, Michal Hocko wrote:
> > I still have a hard time seeing why the kmap machinery should introduce
> > any slowdown here. Previous data posted while discussing v1 didn't
> > really show anything outside of the noise.
> > 
> You are right, the multiple barriers are not responsible for the slowdown,
> but the removal of kmap_atomic() allows us to call memset and memcpy for
> larger sizes. I will reframe this part of the commit text when we move to
> v3 to present it more cleanly.

While this might be OK for 2MB huge pages, does the same apply to other,
larger sizes, e.g. 512MB, 1GB or even larger huge pages? You should also
consider !PREEMPT kernels.
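
For reference, a rough sketch of the two approaches under discussion
(simplified pseudocode, not the actual mm/memory.c code; the loop is
roughly what the current per-sub-page clearing does):

	int i;

	/*
	 * Serial approach: clear one PAGE_SIZE sub-page at a time, with a
	 * reschedule point between sub-pages so that a !PREEMPT kernel
	 * does not stall for the duration of the whole huge page.
	 */
	for (i = 0; i < pages_per_huge_page; i++) {
		cond_resched();
		clear_user_highpage(page + i, addr + i * PAGE_SIZE);
	}

	/*
	 * One-shot approach: a single large clear over the whole huge page.
	 * This is what dropping kmap_atomic() makes possible on !HIGHMEM,
	 * but there is no rescheduling point inside the memset, which is
	 * why 512MB or 1GB pages on !PREEMPT kernels are a concern.
	 */
	memset(page_address(page), 0, pages_per_huge_page * PAGE_SIZE);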

[...]

> > No. There is absolutely zero reason to add a config option for this. The
> > kernel should have all the information to make an educated guess.
> > 
> I will try to incorporate this in v3, but currently I don't have any idea
> how to implement the guessing logic. I would really appreciate it if you
> could suggest a way to go about it.

If you cannot guess the proper sizing, then how is a poor user who tries
to configure the kernel supposed to do it?

> > Also before going any further. The patch which has introduced the
> > optimization was c79b57e462b5 ("mm: hugetlb: clear target sub-page last
> > when clearing huge page"). It is based on an artificial benchmark which
> > to my knowledge doesn't represent any real workload. Your measurements
> > are based on a different benchmark. Your numbers clearly show that some
> > assumptions used for the optimization are not architecture neutral.
> > 
> But the one-shot numbers are significantly better on both archs. I think
> that, theoretically, the one-shot approach should provide better results
> on all architectures when compared with the serial approach. Isn't it a
> fair assumption to go ahead with the one-shot approach?

What is this assumption based on? Also, please consider that all of these
numbers are based on artificial microbenchmarks. Can you see any
difference for real-world huge page users? The same applies to the
regression you can see with the existing code.
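
For context, the ordering that c79b57e462b5 introduced, as rough
pseudocode (simplified; the real clear_huge_page() additionally clears
the remaining sub-pages converging towards the target one for better
locality):

	int i;

	/*
	 * Clear every sub-page except the one the fault targeted, then
	 * clear the target sub-page last, so the cachelines the faulting
	 * thread touches first are still hot when the fault returns.
	 */
	for (i = 0; i < pages_per_huge_page; i++) {
		if (i == target)
			continue;
		cond_resched();
		clear_user_highpage(page + i, addr + i * PAGE_SIZE);
	}
	cond_resched();
	clear_user_highpage(page + target, addr + target * PAGE_SIZE);
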
-- 
Michal Hocko
SUSE Labs



Thread overview: 27+ messages
2020-04-14 15:38 [PATCH v2] mm: Optimized hugepage zeroing & copying from user Prathu Baronia
2020-04-14 17:03 ` Michal Hocko
2020-04-14 17:41   ` Daniel Jordan
     [not found]   ` <20200414184743.GB2097@oneplus.com>
2020-04-14 19:32     ` Alexander Duyck
2020-04-15  3:40       ` Huang, Ying
2020-04-15 11:09         ` Michal Hocko
2020-04-19 12:05       ` Prathu Baronia
2020-04-14 19:40     ` Michal Hocko [this message]
2020-04-15  3:27 ` Huang, Ying
2020-04-16  1:21   ` Huang, Ying
2020-04-19 15:58   ` Prathu Baronia
2020-04-20  0:18     ` Huang, Ying
2020-04-21  9:36       ` Prathu Baronia
2020-04-21 10:09         ` Will Deacon
2020-04-21 12:47           ` Vlastimil Babka
2020-04-21 12:48             ` Vlastimil Babka
2020-04-21 13:39               ` Will Deacon
2020-04-21 13:48                 ` Vlastimil Babka
2020-04-21 13:56                   ` Chintan Pandya
2020-04-22  8:18                   ` Will Deacon
2020-04-22 11:19                     ` Will Deacon
2020-04-22 14:38                       ` Prathu Baronia
2020-05-01  8:58                         ` Prathu Baronia
2020-05-05  8:59                           ` Will Deacon
2020-04-21 13:00             ` Michal Hocko
2020-04-21 13:10               ` Will Deacon
2020-04-17  7:48 ` [mm] 134c8b410f: vm-scalability.median -7.9% regression kernel test robot
