From: Feng Tang <feng.tang@intel.com>
To: Andrew Morton <akpm@linux-foundation.org>
Cc: Masahiro Yamada <masahiroy@kernel.org>,
	Michal Marek <michal.lkml@markovi.net>,
	linux-kbuild@vger.kernel.org, linux-kernel@vger.kernel.org,
	Linus Torvalds <torvalds@linux-foundation.org>,
	andi.kleen@intel.com, ying.huang@intel.com,
	andriy.shevchenko@intel.com
Subject: Re: [RFC PATCH] makefile: add debug option to enable function aligned on 32 bytes
Date: Thu, 23 Jul 2020 13:13:48 +0800
Message-ID: <20200723051348.GA5150@feng-iot>
In-Reply-To: <20200722203919.8b7c9b35ff51d66550c3846c@linux-foundation.org>

Hi Andrew,

Thanks for the review.

On Wed, Jul 22, 2020 at 08:39:19PM -0700, Andrew Morton wrote:
> On Thu, 23 Jul 2020 11:30:01 +0800 Feng Tang <feng.tang@intel.com> wrote:
> 
> > Recently 0day reported many strange performance changes (regression
> > or improvement), in which there was no obvious relation between
> > the culprit commit and the benchmark at the first look, and it causes
> > people to doubt the test itself is wrong.
> > 
> > Upon further check, many of these cases are caused by the change
> > to the alignment of kernel text or data, as whole text/data of kernel
> > are linked together, change in one domain may affect alignments of
> > other domains.
> > 
> > gcc has an option '-falign-functions=n' to force text aligned, and with
> > that option enabled, some of those performance changes will be gone,
> > like [1][2][3].
> > 
> > Add this option so that developers and 0day can easily find performance
> > bump caused by text alignment change,
> 
> Would they use it this way, or would they simply always enable the
> option to reduce the variability

We had concerns about side effects: the increased kernel size may not be
acceptable for embedded systems, and i-cache usage/contention may increase.

And I've only done limited benchmark testing, so I thought it would be safer
to keep it off by default. Though my bold thought was that it could be default on :)
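
To put a number on the size concern, here is a rough Python sketch that
computes the per-section deltas from the `size` output quoted further down
in this mail. The byte counts are copied from that table; the helper name
`delta` is made up for illustration.

```python
# Section sizes in bytes, copied from the `size` output quoted in this mail.
noalign = {"text": 19738771, "data": 13292906, "bss": 5554236}
align32 = {"text": 19758591, "data": 13297002, "bss": 5529660}


def delta(section):
    """Return (byte difference, percent change) of align32 vs. noalign."""
    diff = align32[section] - noalign[section]
    return diff, diff / noalign[section] * 100.0


for name in ("text", "data", "bss"):
    diff, pct = delta(name)
    print(f"{name:5s} {diff:+8d} bytes ({pct:+.2f}%)")
```

By this arithmetic the text section grows by 19820 bytes, roughly 0.1%.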

> It makes sense, but is it actually known that this does reduce the
> variability?

Yes, at least for the strange performance bumps reported by 0day, like
those in [1][2][3].

> > as tracking these strange bump
> > is quite time consuming. Though it can't help in other cases like data
> > alignment changes like [4].
> > 
> > Following is some size data for v5.7 kernel built with a RHEL config
> > used in 0day:
> > 
> >       text      data      bss       dec  filename
> >   19738771  13292906  5554236  38585913  vmlinux.noalign
> >   19758591  13297002  5529660  38585253  vmlinux.align32
> > 
> > Raw vmlinux size in bytes:
> > 
> > 	v5.7		v5.7+align32
> > 	253950832	254018000	+0.02%
> > 
> > Some benchmark data, most of them have no big change:
> > 
> >   * hackbench:		[ -1.8%,  +0.5%]
> > 
> >   * fsmark:		[ -3.2%,  +3.4%]  # ext4/xfs/btrfs
> > 
> >   * kbuild:		[ -2.0%,  +0.9%]
> > 
> >   * will-it-scale:	[ -0.5%,  +1.8%]  # mmap1/pagefault3
> > 
> >   * netperf:
> >     - TCP_CRR		[+16.6%, +97.4%]
> >     - TCP_RR		[-18.5%,  -1.8%]
> >     - TCP_STREAM	[ -1.1%,  +1.9%]
> 
> What do the numbers in [] mean?  The TCP_CRR results look remarkable?
 
For each of the benchmarks listed above, I took 2 or 3 test platforms
and ran it with different parameters. So each benchmark has several
test cases, and [] lists the lowest and highest results among them.

For the netperf/TCP_CRR case, the lowest is +16.6% on a Skylake server
with 16 testing threads, and the highest is +97.4% on a Cascade Lake server
with 96 testing threads.
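
To make the [] notation concrete, here is a minimal Python sketch of how
such a range could be derived. The function name and the sample numbers are
hypothetical, not real measurements; only the idea (one percent change per
test case, then take the lowest and highest) comes from the description above.

```python
def change_range(baseline, patched):
    """Return (lowest, highest) percent change across paired test cases."""
    if len(baseline) != len(patched):
        raise ValueError("each baseline case needs a matching patched case")
    changes = [(p - b) / b * 100.0 for b, p in zip(baseline, patched)]
    return min(changes), max(changes)


# Hypothetical throughput for three cases (platform/parameter combinations).
baseline = [100.0, 200.0, 400.0]
patched = [98.0, 201.0, 408.0]

low, high = change_range(baseline, patched)
print(f"[{low:+.1f}%, {high:+.1f}%]")
```

A range like this hides how many cases sit near each end, so the bounds
should be read as extremes rather than typical values.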

Thanks,
Feng

> > [1] https://lore.kernel.org/lkml/20200114085637.GA29297@shao2-debian/
> > [2] https://lore.kernel.org/lkml/20200330011254.GA14393@feng-iot/
> > [3] https://lore.kernel.org/lkml/1d98d1f0-fe84-6df7-f5bd-f4cb2cdb7f45@intel.com/
> > [4] https://lore.kernel.org/lkml/20200205123216.GO12867@shao2-debian/
> > 

Thread overview: 6+ messages
2020-07-23  3:30 [RFC PATCH] makefile: add debug option to enable function aligned on 32 bytes Feng Tang
2020-07-23  3:39 ` Andrew Morton
2020-07-23  5:13   ` Feng Tang [this message]
2020-07-23  6:29   ` Feng Tang
2020-07-24  0:57     ` Andrew Morton
2020-07-24  1:06       ` Feng Tang
