linux-kernel.vger.kernel.org archive mirror
From: Thomas Gleixner <tglx@linutronix.de>
To: tip-bot2 for Kan Liang <tip-bot2@linutronix.de>,
	linux-tip-commits@vger.kernel.org
Cc: Kan Liang <kan.liang@linux.intel.com>,
	"Peter Zijlstra \(Intel\)" <peterz@infradead.org>,
	Dave Hansen <dave.hansen@intel.com>, x86 <x86@kernel.org>,
	LKML <linux-kernel@vger.kernel.org>
Subject: Re: [tip: perf/core] x86/fpu/xstate: Support dynamic supervisor feature for LBR
Date: Fri, 28 May 2021 00:15:08 +0200
Message-ID: <87y2bz7rir.ffs@nanos.tec.linutronix.de>
In-Reply-To: <159420190464.4006.9196645532990660696.tip-bot2@tip-bot2>

Peter,

On Wed, Jul 08 2020 at 09:51, tip-bot wrote:
> The following commit has been merged into the perf/core branch of tip:
>
> Commit-ID:     f0dccc9da4c0fda049e99326f85db8c242fd781f
> Gitweb:        https://git.kernel.org/tip/f0dccc9da4c0fda049e99326f85db8c242fd781f
> Author:        Kan Liang <kan.liang@linux.intel.com>
> AuthorDate:    Fri, 03 Jul 2020 05:49:26 -07:00
> Committer:     Peter Zijlstra <peterz@infradead.org>
> CommitterDate: Wed, 08 Jul 2020 11:38:56 +02:00
>
> x86/fpu/xstate: Support dynamic supervisor feature for LBR
>
> Last Branch Records (LBR) registers are used to log taken branches and
> other control flows. In perf with call stack mode, LBR information is
> used to reconstruct a call stack. To get the complete call stack, perf
> has to save/restore all LBR registers during a context switch. Due to
> the large number of the LBR registers, e.g., the current platform has
> 96 LBR registers, this process causes a high CPU overhead. To reduce
> the CPU overhead during a context switch, an LBR state component that
> contains all the LBR related registers is introduced in hardware. All
> LBR registers can be saved/restored together using one XSAVES/XRSTORS
> instruction.
>
> However, the kernel should not save/restore the LBR state component at
> each context switch, like other state components, because of the
> following unique features of LBR:
> - The LBR state component only contains valuable information when LBR
>   is enabled in the perf subsystem, but for most of the time, LBR is
>   disabled.
> - The size of the LBR state component is huge. For the current
>   platform, it's 808 bytes.
> If the kernel saves/restores the LBR state at each context switch, for
> most of the time, it is just a waste of space and cycles.
>
> To efficiently support the LBR state component, it is desired to have:
> - only context-switch the LBR when the LBR feature is enabled in perf.
> - only allocate an LBR-specific XSAVE buffer on demand.
>   (Besides the LBR state, a legacy region and an XSAVE header have to be
>    included in the buffer as well. There is a total of (808+576) byte
>    overhead for the LBR-specific XSAVE buffer. The overhead only happens
>    when the perf is actively using LBRs. There is still a space-saving,
>    on average, when it replaces the constant 808 bytes of overhead for
>    every task, all the time on the systems that support architectural
>    LBR.)
> - be able to use XSAVES/XRSTORS for accessing LBR at run time.
>   However, the IA32_XSS should not be adjusted at run time.
>   (The XCR0 | IA32_XSS are used to determine the requested-feature
>   bitmap (RFBM) of XSAVES.)
>
> A solution, called dynamic supervisor feature, is introduced to address
> this issue, which
> - does not allocate a buffer in each task->fpu;
> - does not save/restore a state component at each context switch;
> - sets the bit corresponding to the dynamic supervisor feature in
>   IA32_XSS at boot time, and avoids setting it at run time.
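
[Illustration, not part of the original message.] The numbers in the quoted
commit message can be checked with a small sketch. It assumes the standard
XSAVE area layout (a 512-byte legacy region followed by a 64-byte header,
which gives the 576 bytes quoted above), and it assumes that the LBR state is
supervisor component bit 15 in IA32_XSS. The helper names are hypothetical and
are not kernel APIs; the RFBM rule is the one stated in the commit message,
combined with the mask that XSAVES takes as its operand.

```python
# Sketch only: models the size and mask arithmetic from the commit message.
LEGACY_REGION_BYTES = 512  # legacy region at the start of an XSAVE area
XSAVE_HEADER_BYTES = 64    # XSAVE header that follows the legacy region
LBR_STATE_BYTES = 808      # LBR state component size quoted in the commit

# Assumption: the arch LBR state component is bit 15 of IA32_XSS.
XFEATURE_MASK_LBR = 1 << 15


def lbr_xsave_buffer_bytes() -> int:
    """Size of an on-demand buffer that holds only the LBR component."""
    return LEGACY_REGION_BYTES + XSAVE_HEADER_BYTES + LBR_STATE_BYTES


def rfbm(xcr0: int, ia32_xss: int, instruction_mask: int) -> int:
    """Requested-feature bitmap: (XCR0 | IA32_XSS) limited by the operand mask."""
    return (xcr0 | ia32_xss) & instruction_mask


if __name__ == "__main__":
    xcr0 = 0x7                    # example value: x87, SSE and AVX enabled
    ia32_xss = XFEATURE_MASK_LBR  # LBR bit set once at boot, never changed

    # Context switch: the mask leaves out LBR, so LBR state is not touched.
    context_switch_mask = 0xFFFFFFFFFFFFFFFF & ~XFEATURE_MASK_LBR
    print(hex(rfbm(xcr0, ia32_xss, context_switch_mask)))

    # Perf with LBR active: the mask selects only the LBR component.
    print(hex(rfbm(xcr0, ia32_xss, XFEATURE_MASK_LBR)))

    print(lbr_xsave_buffer_bytes())
```

In this model IA32_XSS stays constant and only the mask passed to the
instruction decides whether the LBR component is saved or restored.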

This needs to be put on hold until the whole fpu signal restore mess is
sorted. The current failure modes are 'harmless', but once XSS comes
into play it becomes dangerous.

Please revert that stuff ASAP until the underlying issues of XSTATE are
sorted and then this wants to be posted again according to the rules I
laid out here:

  https://lore.kernel.org/lkml/874keo80bh.ffs@nanos.tec.linutronix.de/

No ifs, no buts.

Thanks,

        tglx

