LKML Archive on lore.kernel.org
From: Feng Tang <feng.tang@intel.com>
To: Linus Torvalds <torvalds@linux-foundation.org>
Cc: "Chen, Rong A" <rong.a.chen@intel.com>,
	Jann Horn <jannh@google.com>, LKML <linux-kernel@vger.kernel.org>,
	"lkp@lists.01.org" <lkp@lists.01.org>
Subject: Re: [LKP] Re: [mm] fd4d9c7d0c: stress-ng.switch.ops_per_sec -30.5% regression
Date: Mon, 30 Mar 2020 09:12:54 +0800
Message-ID: <20200330011254.GA14393@feng-iot>
In-Reply-To: <CAHk-=wi2c3UcK4fjUR2nM-7iUOAyQijq9ETfQHaN0WwFh2Bm9A@mail.gmail.com>

Hi Linus,

On Fri, Mar 27, 2020 at 12:57:45AM +0800, Linus Torvalds wrote:
> On Wed, Mar 25, 2020 at 10:57 PM kernel test robot
> <rong.a.chen@intel.com> wrote:
> >
> > FYI, we noticed a -30.5% regression of stress-ng.switch.ops_per_sec due to commit:
> >
> > commit: fd4d9c7d0c71866ec0c2825189ebd2ce35bd95b8 ("mm: slub: add missing TID bump in kmem_cache_alloc_bulk()")
> 
> This looks odd.
> 
> I would not expect the update of c->tid to have that noticeable an
> impact, even on a big machine that might be close to some scaling
> limit.
 
The test machine is a 2-socket Cascade Lake platform, 48C/96T.

> It doesn't add any expensive atomic ops, and while it _could_ make a
> percpu cacheline dirty, I think that cacheline should already be dirty
> anyway under any load where this is noticeable. Plus this should be a
> relatively cold path anyway.
> 
> So mind humoring me, and double-check that regression?
> 
> Of course, it might be another "just magic cache placement" detail
> where code moved enough to make a difference.
> 
> Or maybe it really ends up causing new tid mismatches and we end up
> failing the fast path in slub as a result. But looking at the stats
> that changed in your message doesn't make me go "yeah, that looks like
> a slub difference".

We checked, and the code movement does exist.

From the system map:

old map:
	ffffffff812a1880 T kmem_cache_alloc_bulk
	ffffffff812a1a80 t kmalloc_large_node
	ffffffff812a1b10 t calculate_sizes
	ffffffff812a1eb0 t store_user_store
	ffffffff812a1f20 t poison_store
	ffffffff812a1f90 t red_zone_store
	ffffffff812a2000 t order_store

new map:
	ffffffff812a1880 T kmem_cache_alloc_bulk
	ffffffff812a1a90 t kmalloc_large_node
	ffffffff812a1b20 T __kmalloc_node	---> relocated
	ffffffff812a1e40 t calculate_sizes
	ffffffff812a21e0 t store_user_store
	ffffffff812a2250 t poison_store
	ffffffff812a22c0 t red_zone_store
	ffffffff812a2330 t order_store

In the old map, 'kmem_cache_alloc_bulk' is cache-line aligned and
occupies 0x200 bytes, so the next function, 'kmalloc_large_node', also
starts at an aligned address. In the new map, 'kmem_cache_alloc_bulk'
occupies 0x210 bytes, which shifts the alignment of many of the
functions following it. (Please let us know if you need the full
system maps for the 2 vmlinux images.)
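
The sizes and alignments above can be re-derived from the two map
excerpts. The following is only a minimal sketch: the 64-byte cache line
size is an assumption (typical for x86-64), the helper names `parse` and
`layout` are made up for illustration, and the "---> relocated" note is
left out of the pasted data so each line has three fields.

```python
# Symbol addresses copied from the two System.map excerpts in this mail.
OLD_MAP = """
ffffffff812a1880 T kmem_cache_alloc_bulk
ffffffff812a1a80 t kmalloc_large_node
ffffffff812a1b10 t calculate_sizes
ffffffff812a1eb0 t store_user_store
ffffffff812a1f20 t poison_store
ffffffff812a1f90 t red_zone_store
ffffffff812a2000 t order_store
"""

NEW_MAP = """
ffffffff812a1880 T kmem_cache_alloc_bulk
ffffffff812a1a90 t kmalloc_large_node
ffffffff812a1b20 T __kmalloc_node
ffffffff812a1e40 t calculate_sizes
ffffffff812a21e0 t store_user_store
ffffffff812a2250 t poison_store
ffffffff812a22c0 t red_zone_store
ffffffff812a2330 t order_store
"""

CACHE_LINE = 64  # assumption: 64-byte cache lines


def parse(text):
    """Turn 'address type name' lines into a list of (name, address)."""
    symbols = []
    for line in text.strip().splitlines():
        addr, _type, name = line.split()
        symbols.append((name, int(addr, 16)))
    return symbols


def layout(symbols, line_size=CACHE_LINE):
    """Map name -> (size, offset within cache line).

    Size is the distance to the next symbol, so it includes any padding;
    it is None for the last symbol in the excerpt.
    """
    info = {}
    for i, (name, addr) in enumerate(symbols):
        size = symbols[i + 1][1] - addr if i + 1 < len(symbols) else None
        info[name] = (size, addr % line_size)
    return info


old = layout(parse(OLD_MAP))
new = layout(parse(NEW_MAP))

for name, (old_size, old_off) in old.items():
    new_size, new_off = new[name]
    fmt = lambda s: "n/a" if s is None else hex(s)
    print(f"{name:24s} size {fmt(old_size):>6} -> {fmt(new_size):>6}   "
          f"line offset {old_off:#04x} -> {new_off:#04x}")
```

Note that a shifted offset is not always worse: in this excerpt some of
the later functions move closer to a cache line boundary and others
move away from it, which fits the "magic cache placement" explanation.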

In the objdump, the direct change from "c->tid = next_tid(c->tid);"
is one added instruction: "49 83 40 08 01	addq   $0x1,0x8(%r8)"
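
For readers who want to check the encoding, here is a small sketch that
decodes those five bytes by hand. It handles only this one instruction
form (REX prefix, opcode 0x83, ModRM with an 8-bit displacement, 8-bit
immediate) and is not a general disassembler; the variable names are
illustrative. The 0x8 displacement is presumably the offset of the tid
field in the per-cpu structure, but that is an assumption, not something
the objdump line alone shows.

```python
# The instruction bytes quoted from the objdump output.
insn = bytes.fromhex("49 83 40 08 01")
rex, opcode, modrm, disp8, imm8 = insn

# REX prefix: 0100WRXB
assert rex >> 4 == 0b0100, "expected a REX prefix"
rex_w = (rex >> 3) & 1   # 1 -> 64-bit operand size
rex_b = rex & 1          # extends the ModRM.rm field to reach r8..r15

# Opcode 0x83 is the "group 1" form: op r/m, imm8; ModRM.reg picks the op.
assert opcode == 0x83
GROUP1 = ["add", "or", "adc", "sbb", "and", "sub", "xor", "cmp"]

mod = modrm >> 6                    # 0b01 -> memory operand with disp8
reg = (modrm >> 3) & 0b111          # selects the group-1 operation
rm = (modrm & 0b111) | (rex_b << 3)  # base register number

assert mod == 0b01, "this sketch only handles the disp8 form"
assert (modrm & 0b111) != 0b100, "a SIB byte would follow; not handled"

suffix = "q" if rex_w else "l"
# disp8 and imm8 are sign-extended by the CPU; both are positive here.
text = f"{GROUP1[reg]}{suffix} $0x{imm8:x},0x{disp8:x}(%r{rm})"
print(text)
```

The point of the exercise is that the source change costs five bytes of
code, while the function size seen in the system map grows by 0x10.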

We experimented with making all kernel functions 32-byte aligned:
----------------------------------------------------------------
diff --git a/Makefile b/Makefile
index 171f2b004c8a..63f28aaf78c9 100644
--- a/Makefile
+++ b/Makefile
 
 KBUILD_AFLAGS   := -D__ASSEMBLY__ -fno-PIE
-KBUILD_CFLAGS   := -Wall -Wundef -Werror=strict-prototypes -Wno-trigraphs \
+KBUILD_CFLAGS   := -Wall -Wundef -falign-functions=32 -Werror=strict-prototypes -Wno-trigraphs \
----------------------------------------------------------------
 
With this alignment change, the regression is reduced to about 3%:

2060457 ±  4%      -3.2%    1993685 ±  2%  stress-ng.switch.ops_per_sec

which is pretty small for a micro-benchmark.
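
As a quick sanity check on that number, the relative change can be
recomputed from the two ops_per_sec values in the result line. This is
a rough sketch, not a statistical test: it assumes the "±" columns are
relative run-to-run variation in percent, and the variable names are
illustrative.

```python
# Values copied from the stress-ng.switch.ops_per_sec result line.
base_ops, base_var_pct = 2060457, 4.0        # before the commit
patched_ops, patched_var_pct = 1993685, 2.0  # with the commit

# Relative change of the patched kernel versus the base, in percent.
change_pct = (patched_ops - base_ops) / base_ops * 100
print(f"change: {change_pct:+.1f}%")

# Rough comparison only: is the change smaller than the reported
# variation of the base measurement?
within_base_variation = abs(change_pct) <= base_var_pct
print(f"within base variation: {within_base_variation}")
```

Under that reading of "±", the remaining difference is no larger than
the variation of the base run itself.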

Thanks,
Feng

Thread overview: 4+ messages
2020-03-26  5:57 kernel test robot
2020-03-26 16:57 ` Linus Torvalds
2020-03-27  8:46   ` Rong Chen
2020-03-30  1:12   ` Feng Tang [this message]