linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Borislav Petkov <bp@suse.de>
To: Feng Tang <feng.tang@intel.com>
Cc: "Luck, Tony" <tony.luck@intel.com>,
	kernel test robot <rong.a.chen@intel.com>,
	LKML <linux-kernel@vger.kernel.org>,
	lkp@lists.01.org, Mel Gorman <mgorman@suse.com>
Subject: Re: [LKP] Re: [x86/mce] 1de08dccd3: will-it-scale.per_process_ops -14.1% regression
Date: Mon, 24 Aug 2020 18:12:38 +0200	[thread overview]
Message-ID: <20200824161238.GI4794@zn.tnic> (raw)
In-Reply-To: <20200824153300.GA56944@shbuild999.sh.intel.com>

On Mon, Aug 24, 2020 at 11:33:00PM +0800, Feng Tang wrote:
> diff --git a/arch/x86/kernel/cpu/mce/core.c b/arch/x86/kernel/cpu/mce/core.c
> index 43b1519..2c020ef 100644
> --- a/arch/x86/kernel/cpu/mce/core.c
> +++ b/arch/x86/kernel/cpu/mce/core.c
> @@ -95,7 +95,7 @@ struct mca_config mca_cfg __read_mostly = {
>  	.monarch_timeout = -1
>  };
>  
> -static DEFINE_PER_CPU(struct mce, mces_seen);
> +static DEFINE_PER_CPU_ALIGNED(struct mce, mces_seen);
>  static unsigned long mce_need_notify;
>  static int cpu_missing;
>  
> @@ -148,7 +148,7 @@ void mce_setup(struct mce *m)
>  	m->microcode = boot_cpu_data.microcode;
>  }
>  
> -DEFINE_PER_CPU(struct mce, injectm);
> +DEFINE_PER_CPU_ALIGNED(struct mce, injectm);
>  EXPORT_PER_CPU_SYMBOL_GPL(injectm);

I don't think this is the right fix. Lemme quote Tony from a previous
email:

"The answer isn't to tinker with "struct mce". Other changes could
trigger this same change in alignment. Anything that is this perfomance
sensitive needs to have some "__attribute__((aligned(64)))" (or
whatever) to make sure arbitrary changes elsewhere don't do this."

And yes, your diff is not tinkering with struct mce but it is tinkering
with percpu vars which are of type struct mce.

However, the proper fix is...

> :)  Right, this is what I'm doing right now. Some test job is queued on
> the test box, and it may needs some iterations of new patch. Hopefully we
> can isolate some specific variable given some luck.

... yes, exactly, you need to identify the contention where this
happens, causing a cacheline to bounce or a variable straddles across a
cacheline boundary, causing the read to fetch two cachelines and thus
causes that slowdown. And then align that var to the beginning of a
cacheline.

Also, maybe I missed this but, do you trigger this only on Xeon Phi or
on "normal" x86 too?

Because if it is Xeon Phi only, then that might explain the size of the
slowdown and that it happens only there because it is a, well, "strange"
machine. :-)

Thx.

-- 
Regards/Gruss,
    Boris.

SUSE Software Solutions Germany GmbH, GF: Felix Imendörffer, HRB 36809, AG Nürnberg

  parent reply	other threads:[~2020-08-24 16:12 UTC|newest]

Thread overview: 25+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-04-25 11:44 [x86/mce] 1de08dccd3: will-it-scale.per_process_ops -14.1% regression kernel test robot
2020-04-25 13:01 ` Borislav Petkov
2020-08-18  8:29   ` [LKP] " Feng Tang
2020-08-18 20:06     ` Luck, Tony
2020-08-19  2:04       ` Feng Tang
2020-08-19  2:23         ` Luck, Tony
2020-08-19  3:04           ` Feng Tang
2020-08-19  3:15           ` Feng Tang
2020-08-21  2:02         ` Feng Tang
2020-08-24 15:14           ` Borislav Petkov
2020-08-24 15:33             ` Feng Tang
2020-08-24 15:38               ` Luck, Tony
2020-08-24 15:48                 ` Feng Tang
2020-08-24 16:12               ` Borislav Petkov [this message]
2020-08-24 16:56                 ` Mel Gorman
2020-08-25  6:49                   ` Feng Tang
2020-08-25  6:23                 ` Feng Tang
2020-08-25 16:44                   ` Luck, Tony
2020-08-26  1:45                     ` Feng Tang
2020-08-28 17:48                   ` Borislav Petkov
2020-08-31  2:16                     ` Feng Tang
2020-08-31  7:56                       ` Mel Gorman
2020-08-31  8:23                         ` Feng Tang
2020-08-31  8:55                           ` Mel Gorman
2020-08-31 12:53                             ` Feng Tang

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20200824161238.GI4794@zn.tnic \
    --to=bp@suse.de \
    --cc=feng.tang@intel.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=lkp@lists.01.org \
    --cc=mgorman@suse.com \
    --cc=rong.a.chen@intel.com \
    --cc=tony.luck@intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).