archive mirror
 help / color / mirror / Atom feed
From: Thomas Gleixner <>
To: Tom Lendacky <>,
	Dave Hansen <>,
	LKML <>
Cc:, Andrew Cooper <>,
	"Edgecombe, Rick P" <>
Subject: Re: [patch 3/3] x86/fpu/xsave: Optimize XSAVEC/S when XGETBV1 is supported
Date: Fri, 22 Apr 2022 21:30:19 +0200	[thread overview]
Message-ID: <87bkws6hmc.ffs@tglx> (raw)
In-Reply-To: <>

On Wed, Apr 20 2022 at 13:15, Tom Lendacky wrote:
> On 4/19/22 16:22, Thomas Gleixner wrote:
>>> That was bare metal and I just checked that this was a production config
>>> and not some weird debug muck which breaks large pages. I'll look deeper
>>> into that.
>> I can't find any reasonable explanation. The pages are definitely large
>> pages, so yes the dTLB miss count does not make sense, but it's
>> consistently faster and it's always the dTLB miss count which makes the
>> big difference according to perf.
>> For enhanced fun, I ran the lot on a AMD Zen3 machine and with the same
>> test case (hackbench -l 10000) repeated 10 times by perf stat this is
>> consistently slower than the non optimized variant. There is at least an
>> explanation for that. A tight loop of 1 Mio xgetbv(1) invocations takes
>> 9 Mio cycles on a SKL-X and 50 Mio cycles on a AMD Zen3.
> I'll take a look into this and see what I find. Might be interesting to 
> see if the actual XSAVES is slower or quicker, too, based on the input mask.
> If the performance slowdown shows up in real world benchmarks, we might 
> want to consider not using the xgetbv() call on AMD.

As things stand now, I'm not going to pursue this further at the moment.

The effect on SKL-X is not explainable especially the dTLB miss count
decrease does not make any sense. Aside of that I just figured out that
it is very sensitive to kernel configurations and I have no idea yet
what exactly is the screw to turn to make the effect come and go.

So I just go and add the XSAVEC support alone as that's actually
something which _is_ beneficial for guests.



  reply	other threads:[~2022-04-22 19:51 UTC|newest]

Thread overview: 16+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-04-04 12:11 [patch 0/3] x86/fpu/xsave: Add XSAVEC support and XGETBV1 utilization Thomas Gleixner
2022-04-04 12:11 ` [patch 1/3] x86/fpu/xsave: Support XSAVEC in the kernel Thomas Gleixner
2022-04-04 16:10   ` Andrew Cooper
2022-04-14 14:43   ` Dave Hansen
2022-04-25 13:11   ` [tip: x86/fpu] " tip-bot2 for Thomas Gleixner
2022-04-04 12:11 ` [patch 2/3] x86/fpu/xsave: Prepare for optimized compaction Thomas Gleixner
2022-04-14 15:46   ` Dave Hansen
2022-04-19 12:39     ` Thomas Gleixner
2022-04-19 13:33       ` Thomas Gleixner
2022-04-04 12:11 ` [patch 3/3] x86/fpu/xsave: Optimize XSAVEC/S when XGETBV1 is supported Thomas Gleixner
2022-04-14 17:24   ` Dave Hansen
2022-04-19 13:43     ` Thomas Gleixner
2022-04-19 21:22       ` Thomas Gleixner
2022-04-20 18:15         ` Tom Lendacky
2022-04-22 19:30           ` Thomas Gleixner [this message]
2022-04-23 15:20             ` Dave Hansen

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=87bkws6hmc.ffs@tglx \ \ \ \ \ \ \ \

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).