linux-kernel.vger.kernel.org archive mirror
From: Nathan Chancellor <nathan@kernel.org>
To: Rik van Riel <riel@surriel.com>
Cc: Thorsten Leemhuis <regressions@leemhuis.info>,
	Andrew Morton <akpm@linux-foundation.org>,
	Yang Shi <shy828301@gmail.com>,
	"Huang, Ying" <ying.huang@intel.com>,
	kernel test robot <yujie.liu@intel.com>,
	lkp@lists.01.org, lkp@intel.com,
	Matthew Wilcox <willy@infradead.org>,
	linux-kernel@vger.kernel.org, linux-mm@kvack.org,
	feng.tang@intel.com, zhengjun.xing@linux.intel.com,
	fengwei.yin@intel.com
Subject: Re: [mm] f35b5d7d67: will-it-scale.per_process_ops -95.5% regression
Date: Thu, 1 Dec 2022 14:35:31 -0700
Message-ID: <Y4keIyIK6OA3nOwT@dev-arch.thelio-3990X>
In-Reply-To: <07adee081a70c2b4b44d9bf93a0ad3142e091086.camel@surriel.com>

On Thu, Dec 01, 2022 at 03:29:41PM -0500, Rik van Riel wrote:
> On Thu, 2022-12-01 at 19:33 +0100, Thorsten Leemhuis wrote:
> > Hi, this is your Linux kernel regression tracker.
> > 
> > On 28.11.22 07:40, Nathan Chancellor wrote:
> > > Hi Rik,
> > 
> > I wonder what we should do about the performance regression below.
> > Is reverting the culprit now and reapplying it later together with a
> > fix a viable option? Or was anything done/is anybody doing something
> > already to address the problem and I just missed it?
> 
> The changeset in question speeds up kernel compiles with
> GCC, as well as the runtime speed of other programs, due
> to being able to use THPs more. However, it slows down kernel
> compiles with clang, due to ... something clang does.
> 
> I have not figured out what that something is yet.
> 
> I don't know if I have the wrong version of clang here,
> but I have not seen any smoking gun at all when tracing
> clang system calls. I see predominantly small mmap and
> unmap calls, and nothing that even triggers 2MB alignment.

Sorry about that :/ What version of clang are you trying to reproduce
with? I was able to see this with 14.x and 16.x; it is possible that
older versions do not do whatever is causing this.

I cannot do much testing on my main workstation, but I will see if
I can reproduce this behavior on one of my other test systems or in a
virtual machine. Once I do that, if you are still unable to reproduce
it, I can try to help you debug this, although I will likely need some
hand-holding.

Cheers,
Nathan
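
For anyone else trying to reproduce this: below is a rough sketch of one way to check whether private anonymous mappings come back 2MB-aligned on a given kernel. It is an illustration only and not something posted in this thread. The helper names are made up, the 2MB THP size is an assumption that holds for x86_64, and it relies on Linux with CPython.

```python
import ctypes
import mmap

# Assumption: PMD-sized THPs are 2MB, as on x86_64.
THP_SIZE = 2 * 1024 * 1024


def is_thp_aligned(addr: int, thp_size: int = THP_SIZE) -> bool:
    """Return True if addr sits on a THP-sized boundary."""
    return addr % thp_size == 0


def anon_mapping_address(length: int) -> int:
    """Map private anonymous memory, return its start address, then unmap it."""
    m = mmap.mmap(
        -1,  # no backing file
        length,
        flags=mmap.MAP_PRIVATE | mmap.MAP_ANONYMOUS,
        prot=mmap.PROT_READ | mmap.PROT_WRITE,
    )
    try:
        buf = ctypes.c_char.from_buffer(m)
        addr = ctypes.addressof(buf)
        del buf  # release the buffer export so close() does not fail
    finally:
        m.close()
    return addr


if __name__ == "__main__":
    # Hypothetical sizes on both sides of the THP size.
    for size in (64 * 1024, 1024 * 1024, THP_SIZE, 8 * THP_SIZE):
        addr = anon_mapping_address(size)
        print(f"{size:>10} bytes -> {addr:#x} aligned={is_thp_aligned(addr)}")
```

Going by the commit title, one would expect mappings at or above the THP size to report aligned=True on a kernel with f35b5d7d676e and mostly aligned=False with it reverted, but that expectation has not been verified here.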

> > Yang Shi, Andrew, what's your opinion on this? I ask you directly,
> > because it looks like Rik hasn't posted anything to lists archived on
> > lore during the last few weeks. :-/
> > 
> > Ciao, Thorsten (wearing his 'the Linux kernel's regression tracker' hat)
> > 
> > P.S.: As the Linux kernel's regression tracker I deal with a lot of
> > reports and sometimes miss something important when writing mails like
> > this. If that's the case here, don't hesitate to tell me in a public
> > reply, it's in everyone's interest to set the public record straight.
> > 
> > > On Thu, Oct 20, 2022 at 10:16:20AM -0700, Nathan Chancellor wrote:
> > > > On Thu, Oct 20, 2022 at 11:28:16AM -0400, Rik van Riel wrote:
> > > > > On Thu, 2022-10-20 at 13:07 +0800, Huang, Ying wrote:
> > > > > > Nathan Chancellor <nathan@kernel.org> writes:
> > > > > > > 
> > > > > > > > For what it's worth, I just bisected a massive and visible
> > > > > > > > performance regression on my Threadripper 3990X workstation to
> > > > > > > > commit f35b5d7d676e ("mm: align larger anonymous mappings on THP
> > > > > > > > boundaries"), which seems directly related to this
> > > > > > > > report/analysis. I initially noticed this because my full set of
> > > > > > > > kernel builds against mainline went from 2 hours and 20 minutes
> > > > > > > > or so to over 3 hours. Zeroing in on x86_64 allmodconfig, which I
> > > > > > > > used for the bisect:
> > > > > > > > 
> > > > > > > > @ 7b5a0b664ebe ("mm/page_ext: remove unused variable in offline_page_ext"):
> > > > > > > > 
> > > > > > > > Benchmark 1: make -skj128 LLVM=1 allmodconfig all
> > > > > > > >   Time (mean ± σ):     318.172 s ±  0.730 s    [User: 31750.902 s, System: 4564.246 s]
> > > > > > > >   Range (min … max):   317.332 s … 318.662 s    3 runs
> > > > > > > > 
> > > > > > > > @ f35b5d7d676e ("mm: align larger anonymous mappings on THP boundaries"):
> > > > > > > > 
> > > > > > > > Benchmark 1: make -skj128 LLVM=1 allmodconfig all
> > > > > > > >   Time (mean ± σ):     406.688 s ±  0.676 s    [User: 31819.526 s, System: 16327.022 s]
> > > > > > > >   Range (min … max):   405.954 s … 407.284 s    3 runs
> > > > > > 
> > > > > > > Have you tried to build with gcc?  Want to check whether this is
> > > > > > > a clang-specific issue or not.
> > > > > 
> > > > > This may indeed be something LLVM specific. In previous tests,
> > > > > GCC has generally seen a benefit from increased THP usage.
> > > > > Many other applications also benefit from getting more THPs.
> > > > 
> > > > > Indeed, GCC builds actually appear to be slightly faster on my
> > > > > system now, apologies for not trying that before reporting :/
> > > > 
> > > > 7b5a0b664ebe:
> > > > 
> > > > Benchmark 1: make -skj128 allmodconfig all
> > > > >   Time (mean ± σ):     355.294 s ±  0.931 s    [User: 33620.469 s, System: 6390.064 s]
> > > >   Range (min … max):   354.571 s … 356.344 s    3 runs
> > > > 
> > > > f35b5d7d676e:
> > > > 
> > > > Benchmark 1: make -skj128 allmodconfig all
> > > > >   Time (mean ± σ):     347.400 s ±  2.029 s    [User: 34389.724 s, System: 4603.175 s]
> > > >   Range (min … max):   345.815 s … 349.686 s    3 runs
> > > > 
> > > > > > LLVM showing 10% system time before this change, and a whopping
> > > > > > 30% system time after that change, suggests that LLVM is behaving
> > > > > > quite differently from GCC in some ways.
> > > > 
> > > > > The above tests were done with GCC 12.2.0 from Arch Linux. The
> > > > > previous LLVM tests were done with a self-compiled version of LLVM
> > > > > from the main branch (16.0.0), optimized with BOLT [1]. To eliminate
> > > > > that as a source of issues, I used my distribution's version of
> > > > > clang (14.0.6) and saw similar results as before:
> > > > 
> > > > 7b5a0b664ebe:
> > > > 
> > > > Benchmark 1: make -skj128 LLVM=/usr/bin/ allmodconfig all
> > > > >   Time (mean ± σ):     462.517 s ±  1.214 s    [User: 48544.240 s, System: 5586.212 s]
> > > >   Range (min … max):   461.115 s … 463.245 s    3 runs
> > > > 
> > > > f35b5d7d676e:
> > > > 
> > > > Benchmark 1: make -skj128 LLVM=/usr/bin/ allmodconfig all
> > > > >   Time (mean ± σ):     547.927 s ±  0.862 s    [User: 47913.709 s, System: 17682.514 s]
> > > >   Range (min … max):   547.429 s … 548.922 s    3 runs
> > > > 
> > > > > If we can figure out what these differences are, maybe we can
> > > > > just fine tune the code to avoid this issue.
> > > > > 
> > > > > > I'll try to play around with LLVM compilation a little bit next
> > > > > > week, to see if I can figure out what might be going on. I wonder
> > > > > > if LLVM is doing lots of mremap calls or something...
> > > > 
> > > > > If there is any further information I can provide or patches I
> > > > > can test, I am more than happy to do so.
> > > > 
> > > > [1]:
> > > > https://github.com/llvm/llvm-project/tree/96552e73900176d65ee6650facae8d669d6f9498/bolt
> > > 
> > > Was there ever a follow up to this report that I missed? I just
> > > noticed that I am still reverting f35b5d7d676e in my mainline
> > > kernel.
> > > 
> > > Cheers,
> > > Nathan
> > > 
> > 
> > #regzbot ignore-activity
> > 
> 
> -- 
> All Rights Reversed.
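
As a sanity check on the "10% before, 30% after" system time figures quoted above, the share can be recomputed from the User and System values in the benchmark output. The sketch below does that. The function names are illustrative, and taking the share as system / (user + system) is an assumption about how those percentages were derived.

```python
def system_share(user_s: float, system_s: float) -> float:
    """Fraction of total CPU time (user + system) spent in the kernel."""
    return system_s / (user_s + system_s)


def slowdown(before_s: float, after_s: float) -> float:
    """Relative wall-clock increase, e.g. 0.25 means 25% slower."""
    return after_s / before_s - 1.0


# (mean wall time, user, system) in seconds, copied from the quoted runs.
RUNS = {
    "clang 16.0.0 @ 7b5a0b664ebe": (318.172, 31750.902, 4564.246),
    "clang 16.0.0 @ f35b5d7d676e": (406.688, 31819.526, 16327.022),
    "gcc 12.2.0   @ 7b5a0b664ebe": (355.294, 33620.469, 6390.064),
    "gcc 12.2.0   @ f35b5d7d676e": (347.400, 34389.724, 4603.175),
    "clang 14.0.6 @ 7b5a0b664ebe": (462.517, 48544.240, 5586.212),
    "clang 14.0.6 @ f35b5d7d676e": (547.927, 47913.709, 17682.514),
}

if __name__ == "__main__":
    for label, (wall, user, system) in RUNS.items():
        share = system_share(user, system)
        print(f"{label}: wall {wall:8.3f} s, system share {share:.1%}")
```

By this arithmetic the system share goes from roughly 13% to 34% with clang 16, from roughly 10% to 27% with clang 14, and from roughly 16% down to 12% with GCC, which is in line with the rounded figures in the thread.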



Thread overview: 19+ messages
2022-10-18  8:44 [mm] f35b5d7d67: will-it-scale.per_process_ops -95.5% regression kernel test robot
2022-10-19  2:05 ` Huang, Ying
2022-10-20  4:23   ` Nathan Chancellor
2022-10-20  5:07     ` Huang, Ying
2022-10-20 15:28       ` Rik van Riel
2022-10-20 17:16         ` Nathan Chancellor
2022-11-28  6:40           ` Nathan Chancellor
2022-12-01 18:33             ` Thorsten Leemhuis
2022-12-01 20:29               ` Rik van Riel
2022-12-01 21:22                 ` Andrew Morton
2022-12-01 21:44                   ` Yang Shi
2022-12-02  8:46                   ` Thorsten Leemhuis
2022-12-02 18:44                     ` Andrew Morton
2022-12-02 19:37                       ` Thorsten Leemhuis
2022-12-01 21:35                 ` Nathan Chancellor [this message]
2022-12-16 11:48                 ` Yin, Fengwei
2022-10-20 16:40       ` Yujie Liu
2022-11-29  8:59     ` [mm] f35b5d7d67: will-it-scale.per_process_ops -95.5% regression #forregzbot Thorsten Leemhuis
2022-12-02  6:43       ` Thorsten Leemhuis
