From: Thomas Gleixner <firstname.lastname@example.org>
To: Tom Lendacky <email@example.com>,
Dave Hansen <firstname.lastname@example.org>,
Cc: email@example.com, Andrew Cooper <firstname.lastname@example.org>,
"Edgecombe, Rick P" <email@example.com>
Subject: Re: [patch 3/3] x86/fpu/xsave: Optimize XSAVEC/S when XGETBV1 is supported
Date: Fri, 22 Apr 2022 21:30:19 +0200
Message-ID: <87bkws6hmc.ffs@tglx>
On Wed, Apr 20 2022 at 13:15, Tom Lendacky wrote:
> On 4/19/22 16:22, Thomas Gleixner wrote:
>>> That was bare metal and I just checked that this was a production config
>>> and not some weird debug muck which breaks large pages. I'll look deeper
>>> into that.
>> I can't find any reasonable explanation. The pages are definitely large
>> pages, so yes the dTLB miss count does not make sense, but it's
>> consistently faster and it's always the dTLB miss count which makes the
>> big difference according to perf.
>> For enhanced fun, I ran the lot on an AMD Zen3 machine and with the same
>> test case (hackbench -l 10000) repeated 10 times by perf stat this is
>> consistently slower than the non-optimized variant. There is at least an
>> explanation for that. A tight loop of 1 million xgetbv(1) invocations
>> takes 9 million cycles on a SKL-X and 50 million cycles on an AMD Zen3.
> I'll take a look into this and see what I find. Might be interesting to
> see if the actual XSAVES is slower or quicker, too, based on the input mask.
> If the performance slowdown shows up in real world benchmarks, we might
> want to consider not using the xgetbv() call on AMD.
I'm not going to pursue this further at the moment. The effect on SKL-X
is not explainable; in particular, the dTLB miss count decrease does not
make any sense. Aside from that, I just figured out that it is very
sensitive to the kernel configuration, and I have no idea yet which
screw to turn to make the effect come and go.
So I'll just add the XSAVEC support alone, as that's actually something
which _is_ beneficial for guests.
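
For reference, the tight-loop measurement quoted above (1 million
xgetbv(1) invocations) can be approximated from user space. What follows
is a minimal sketch, not the code that produced the numbers in this
thread. All helper names are hypothetical, it assumes an x86 machine
with GCC or Clang, and it counts TSC ticks via __rdtsc(), which are
reference cycles and may differ from core cycles.

```c
#include <cpuid.h>
#include <stdint.h>
#include <x86intrin.h>

/* Hypothetical helper: 1 if XGETBV with ECX=1 may be executed here. */
int xgetbv1_supported(void)
{
	unsigned int eax, ebx, ecx, edx;

	/* CPUID.1:ECX bit 27 (OSXSAVE): the OS has enabled XSAVE/XGETBV */
	if (!__get_cpuid(1, &eax, &ebx, &ecx, &edx) || !(ecx & (1u << 27)))
		return 0;

	/* CPUID.(EAX=0xD,ECX=1):EAX bit 2: XGETBV with ECX=1 is supported */
	if (!__get_cpuid_count(0xd, 1, &eax, &ebx, &ecx, &edx))
		return 0;

	return !!(eax & (1u << 2));
}

/* Read the extended control register selected by @index. */
static inline uint64_t xgetbv(uint32_t index)
{
	uint32_t lo, hi;

	__asm__ volatile("xgetbv" : "=a" (lo), "=d" (hi) : "c" (index));
	return ((uint64_t)hi << 32) | lo;
}

/* TSC ticks spent in @iterations back-to-back xgetbv(1) invocations. */
uint64_t time_xgetbv1(unsigned long iterations)
{
	uint64_t start = __rdtsc();

	for (unsigned long i = 0; i < iterations; i++)
		(void)xgetbv(1);

	return __rdtsc() - start;
}

/* Average cost of one invocation. */
double cycles_per_call(uint64_t total_cycles, unsigned long iterations)
{
	return (double)total_cycles / (double)iterations;
}
```

A caller would check xgetbv1_supported() first and then evaluate
cycles_per_call(time_xgetbv1(1000000), 1000000). Applied to the totals
quoted above, that works out to 9 cycles per invocation on SKL-X versus
50 on Zen3.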