linux-edac.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v9 00/36] x86: enable FRED for x86-64
@ 2023-07-31  6:32 Xin Li
  2023-07-31  6:32 ` [PATCH v9 01/36] Documentation/x86/64: Add documentation for FRED Xin Li
                   ` (25 more replies)
  0 siblings, 26 replies; 32+ messages in thread
From: Xin Li @ 2023-07-31  6:32 UTC (permalink / raw)
  To: linux-doc, linux-kernel, linux-edac, linux-hyperv, kvm, xen-devel
  Cc: Jonathan Corbet, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
	Dave Hansen, x86, H . Peter Anvin, Andy Lutomirski,
	Oleg Nesterov, Tony Luck, K . Y . Srinivasan, Haiyang Zhang,
	Wei Liu, Dexuan Cui, Paolo Bonzini, Wanpeng Li, Vitaly Kuznetsov,
	Sean Christopherson, Peter Zijlstra, Juergen Gross,
	Stefano Stabellini, Oleksandr Tyshchenko, Josh Poimboeuf,
	Paul E . McKenney, Catalin Marinas, Randy Dunlap, Steven Rostedt,
	Kim Phillips, Xin Li, Hyeonggon Yoo, Liam R . Howlett,
	Sebastian Reichel, Kirill A . Shutemov, Suren Baghdasaryan,
	Pawan Gupta, Jiaxi Chen, Babu Moger, Jim Mattson, Sandipan Das,
	Lai Jiangshan, Hans de Goede, Reinette Chatre, Daniel Sneddon,
	Breno Leitao, Nikunj A Dadhania, Brian Gerst, Sami Tolvanen,
	Alexander Potapenko, Andrew Morton, Arnd Bergmann,
	Eric W . Biederman, Kees Cook, Masami Hiramatsu, Masahiro Yamada,
	Ze Gao, Fei Li, Conghui, Ashok Raj, Jason A . Donenfeld,
	Mark Rutland, Jacob Pan, Jiapeng Chong, Jane Malalane,
	David Woodhouse, Boris Ostrovsky, Arnaldo Carvalho de Melo,
	Yantengsi, Christophe Leroy, Sathvika Vasireddy

This patch set enables the Intel flexible return and event delivery
(FRED) architecture for x86-64.

The FRED architecture defines simple new transitions that change
privilege level (ring transitions). The FRED architecture was
designed with the following goals:

1) Improve overall performance and response time by replacing event
   delivery through the interrupt descriptor table (IDT event
   delivery) and event return by the IRET instruction with lower
   latency transitions.

2) Improve software robustness by ensuring that event delivery
   establishes the full supervisor context and that event return
   establishes the full user context.

The new transitions defined by the FRED architecture are FRED event
delivery and, for returning from events, two FRED return instructions.
FRED event delivery can effect a transition from ring 3 to ring 0, but
it is used also to deliver events incident to ring 0. One FRED
instruction (ERETU) effects a return from ring 0 to ring 3, while the
other (ERETS) returns while remaining in ring 0. Collectively, FRED
event delivery and the FRED return instructions are FRED transitions.

Search for the latest FRED spec in most search engines with this search pattern:

  site:intel.com FRED (flexible return and event delivery) specification

As of now there is no publicly avaiable CPU supporting FRED, thus the Intel
Simics® Simulator is used as software development and testing vehicles. And
it can be downloaded from:
  https://www.intel.com/content/www/us/en/developer/articles/tool/simics-simulator.html

To enable FRED, the Simics package 8112 QSP-CPU needs to be installed with CPU
model configured as:
	$cpu_comp_class = "x86-experimental-fred"


Changes since v8:
* Move the FRED initialization patch after all required changes are in
  place (Thomas Gleixner).
* Don't do syscall early out in fred_entry_from_user() before there are
  proper performance numbers and justifications (Thomas Gleixner).
* Add the control exception handler to the FRED exception handler table
  (Thomas Gleixner).
* Introduce a macro sysvec_install() to derive the asm handler name from
  a C handler, which simplifies the code and avoids an ugly typecast
  (Thomas Gleixner).
* Remove junk code that assumes no local APIC on x86_64 (Thomas Gleixner).
* Put IDTENTRY changes in a separate patch (Thomas Gleixner).
* Use high-order 48 bits above the lowest 16 bit SS only when FRED is
  enabled (Thomas Gleixner).
* Explain why writing directly to the IA32_KERNEL_GS_BASE MSR is
  doing the right thing (Thomas Gleixner).
* Reword some patch descriptions (Thomas Gleixner).
* Add a new macro VMX_DO_FRED_EVENT_IRQOFF for FRED instead of
  refactoring VMX_DO_EVENT_IRQOFF (Sean Christopherson).
* Do NOT use a trampoline, just LEA+PUSH the return RIP, PUSH the error
  code, and jump to the FRED kernel entry point for NMI or call
  external_interrupt() for IRQs (Sean Christopherson).
* Call external_interrupt() only when FRED is enabled, and convert the
  non-FRED handling to external_interrupt() after FRED lands (Sean
  Christopherson).
* Use __packed instead of __attribute__((__packed__)) (Borislav Petkov).
* Put all comments above the members, like the rest of the file does
  (Borislav Petkov).
* Reflect the FRED spec 5.0 change that ERETS and ERETU add 8 to %rsp
  before popping the return context from the stack.
* Reflect stack frame definition changes from FRED spec 3.0 to 5.0.
* Add ENDBR to the FRED_ENTER asm macro after kernel IBT is added to
  FRED base line in FRED spec 5.0.
* Add a document which briefly introduces FRED features.
* Remove 2 patches, "allow FRED systems to use interrupt vectors
  0x10-0x1f" and "allow dynamic stack frame size", from this patch set,
  as they are "optimizations" only.
* Send 2 patches, "header file for event types" and "do not modify the
  DPL bits for a null selector", as pre-FRED patches.

Changes since v7:
* Always call external_interrupt() for VMX IRQ handling on x86_64, thus avoid
  re-entering the noinstr code.
* Create a FRED stack frame when FRED is compiled-in but not enabled, which
  uses some extra stack space but simplifies the code.
* Add a log message when FRED is enabled.

Changes since v6:
* Add a comment to explain why it is safe to write to a previous FRED stack
  frame. (Lai Jiangshan).
* Export fred_entrypoint_kernel(), required when kvm-intel built as a module.
* Reserve a REDZONE for CALL emulation and Align RSP to a 64-byte boundary
  before pushing a new FRED stack frame.
* Replace pt_regs csx flags prefix FRED_CSL_ with FRED_CSX_.

Changes since v5:
* Initialize system_interrupt_handlers with dispatch_table_spurious_interrupt()
  instead of NULL to get rid of a branch (Peter Zijlstra).
* Disallow #DB inside #MCE for robustness sake (Peter Zijlstra).
* Add a comment for FRED stack level settings (Lai Jiangshan).
* Move the NMI bit from an invalid stack frame, which caused ERETU to fault,
  to the fault handler's stack frame, thus to unblock NMI ASAP if NMI is blocked
  (Lai Jiangshan).
* Refactor VMX_DO_EVENT_IRQOFF to handle IRQ/NMI in IRQ/NMI induced VM exits
  when FRED is enabled (Sean Christopherson).

Changes since v4:
* Do NOT use the term "injection", which in the KVM context means to
  reinject an event into the guest (Sean Christopherson).
* Add the explanation of why to execute "int $2" to invoke the NMI handler
  in NMI caused VM exits (Sean Christopherson).
* Use cs/ss instead of csx/ssx when initializing the pt_regs structure
  for calling external_interrupt(), otherwise it breaks i386 build.

Changes since v3:
* Call external_interrupt() to handle IRQ in IRQ caused VM exits.
* Execute "int $2" to handle NMI in NMI caused VM exits.
* Rename csl/ssl of the pt_regs structure to csx/ssx (x for extended)
  (Andrew Cooper).

Changes since v2:
* Improve comments for changes in arch/x86/include/asm/idtentry.h.

Changes since v1:
* call irqentry_nmi_{enter,exit}() in both IDT and FRED debug fault kernel
  handler (Peter Zijlstra).
* Initialize a FRED exception handler to fred_bad_event() instead of NULL
  if no FRED handler defined for an exception vector (Peter Zijlstra).
* Push calling irqentry_{enter,exit}() and instrumentation_{begin,end}()
  down into individual FRED exception handlers, instead of in the dispatch
  framework (Peter Zijlstra).

H. Peter Anvin (Intel) (22):
  x86/fred: Add Kconfig option for FRED (CONFIG_X86_FRED)
  x86/fred: Disable FRED support if CONFIG_X86_FRED is disabled
  x86/cpufeatures: Add the cpu feature bit for FRED
  x86/opcode: Add ERETU, ERETS instructions to x86-opcode-map
  x86/objtool: Teach objtool about ERETU and ERETS
  x86/cpu: Add X86_CR4_FRED macro
  x86/cpu: Add MSR numbers for FRED configuration
  x86/fred: Make unions for the cs and ss fields in struct pt_regs
  x86/fred: Add a new header file for FRED definitions
  x86/fred: Reserve space for the FRED stack frame
  x86/fred: Update MSR_IA32_FRED_RSP0 during task switch
  x86/fred: Let ret_from_fork_asm() jmp to fred_exit_user when FRED is
    enabled
  x86/fred: Disallow the swapgs instruction when FRED is enabled
  x86/fred: No ESPFIX needed when FRED is enabled
  x86/fred: Allow single-step trap and NMI when starting a new task
  x86/fred: Add a page fault entry stub for FRED
  x86/fred: Add a debug fault entry stub for FRED
  x86/fred: Add a NMI entry stub for FRED
  x86/traps: Add a system interrupt handler table for system interrupt
    dispatch
  x86/traps: Add external_interrupt() to dispatch external interrupts
  x86/fred: FRED entry/exit and dispatch code
  x86/fred: FRED initialization code

Xin Li (14):
  Documentation/x86/64: Add documentation for FRED
  x86/fred: Define a common function type fred_handler
  x86/fred: Add a machine check entry stub for FRED
  x86/fred: Add a double fault entry stub for FRED
  x86/entry: Remove idtentry_sysvec from entry_{32,64}.S
  x86/idtentry: Incorporate definitions/declarations of the FRED
    external interrupt handler type
  x86/traps: Add sysvec_install() to install a system interrupt handler
  x86/idtentry: Incorporate declaration/definition of the FRED exception
    handler type
  x86/fred: Fixup fault on ERETU by jumping to fred_entrypoint_user
  x86/traps: Export external_interrupt() for handling IRQ in IRQ induced
    VM exits
  x86/fred: Export fred_entrypoint_kernel() for handling NMI in NMI
    induced VM exits
  KVM: VMX: Add VMX_DO_FRED_EVENT_IRQOFF for IRQ/NMI handling
  x86/syscall: Split IDT syscall setup code into idt_syscall_init()
  x86/fred: Disable FRED by default in its early stage

 .../admin-guide/kernel-parameters.txt         |   4 +
 Documentation/arch/x86/x86_64/fred.rst        | 102 ++++++++
 Documentation/arch/x86/x86_64/index.rst       |   1 +
 arch/x86/Kconfig                              |   9 +
 arch/x86/entry/Makefile                       |   5 +-
 arch/x86/entry/entry_32.S                     |   4 -
 arch/x86/entry/entry_64.S                     |  14 +-
 arch/x86/entry/entry_64_fred.S                |  58 +++++
 arch/x86/entry/entry_fred.c                   | 220 ++++++++++++++++++
 arch/x86/entry/vsyscall/vsyscall_64.c         |   2 +-
 arch/x86/include/asm/asm-prototypes.h         |   1 +
 arch/x86/include/asm/cpufeatures.h            |   1 +
 arch/x86/include/asm/disabled-features.h      |   8 +-
 arch/x86/include/asm/extable_fixup_types.h    |   4 +-
 arch/x86/include/asm/fred.h                   | 157 +++++++++++++
 arch/x86/include/asm/idtentry.h               | 115 ++++++++-
 arch/x86/include/asm/msr-index.h              |  13 +-
 arch/x86/include/asm/ptrace.h                 |  57 ++++-
 arch/x86/include/asm/switch_to.h              |  11 +-
 arch/x86/include/asm/thread_info.h            |  12 +-
 arch/x86/include/asm/traps.h                  |  23 ++
 arch/x86/include/uapi/asm/processor-flags.h   |   2 +
 arch/x86/kernel/Makefile                      |   1 +
 arch/x86/kernel/cpu/acrn.c                    |   5 +-
 arch/x86/kernel/cpu/common.c                  |  47 +++-
 arch/x86/kernel/cpu/mce/core.c                |  15 ++
 arch/x86/kernel/cpu/mshyperv.c                |  16 +-
 arch/x86/kernel/espfix_64.c                   |   8 +
 arch/x86/kernel/fred.c                        |  67 ++++++
 arch/x86/kernel/irqinit.c                     |   7 +-
 arch/x86/kernel/kvm.c                         |   2 +-
 arch/x86/kernel/nmi.c                         |  19 ++
 arch/x86/kernel/process_64.c                  |  31 ++-
 arch/x86/kernel/traps.c                       | 153 ++++++++++--
 arch/x86/kvm/vmx/vmenter.S                    |  88 +++++++
 arch/x86/kvm/vmx/vmx.c                        |  19 +-
 arch/x86/lib/x86-opcode-map.txt               |   2 +-
 arch/x86/mm/extable.c                         |  79 +++++++
 arch/x86/mm/fault.c                           |  18 +-
 drivers/xen/events/events_base.c              |   3 +-
 tools/arch/x86/include/asm/cpufeatures.h      |   1 +
 .../arch/x86/include/asm/disabled-features.h  |   8 +-
 tools/arch/x86/include/asm/msr-index.h        |  13 +-
 tools/arch/x86/lib/x86-opcode-map.txt         |   2 +-
 tools/objtool/arch/x86/decode.c               |  19 +-
 45 files changed, 1348 insertions(+), 98 deletions(-)
 create mode 100644 Documentation/arch/x86/x86_64/fred.rst
 create mode 100644 arch/x86/entry/entry_64_fred.S
 create mode 100644 arch/x86/entry/entry_fred.c
 create mode 100644 arch/x86/include/asm/fred.h
 create mode 100644 arch/x86/kernel/fred.c

-- 
2.34.1


^ permalink raw reply	[flat|nested] 32+ messages in thread

* [PATCH v9 01/36] Documentation/x86/64: Add documentation for FRED
  2023-07-31  6:32 [PATCH v9 00/36] x86: enable FRED for x86-64 Xin Li
@ 2023-07-31  6:32 ` Xin Li
  2023-07-31  6:32 ` [PATCH v9 02/36] x86/fred: Add Kconfig option for FRED (CONFIG_X86_FRED) Xin Li
                   ` (24 subsequent siblings)
  25 siblings, 0 replies; 32+ messages in thread
From: Xin Li @ 2023-07-31  6:32 UTC (permalink / raw)
  To: linux-doc, linux-kernel, linux-edac, linux-hyperv, kvm, xen-devel
  Cc: Jonathan Corbet, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
	Dave Hansen, x86, H . Peter Anvin, Andy Lutomirski,
	Oleg Nesterov, Tony Luck, K . Y . Srinivasan, Haiyang Zhang,
	Wei Liu, Dexuan Cui, Paolo Bonzini, Wanpeng Li, Vitaly Kuznetsov,
	Sean Christopherson, Peter Zijlstra, Juergen Gross,
	Stefano Stabellini, Oleksandr Tyshchenko, Josh Poimboeuf,
	Paul E . McKenney, Catalin Marinas, Randy Dunlap, Steven Rostedt,
	Kim Phillips, Xin Li, Hyeonggon Yoo, Liam R . Howlett,
	Sebastian Reichel, Kirill A . Shutemov, Suren Baghdasaryan,
	Pawan Gupta, Jiaxi Chen, Babu Moger, Jim Mattson, Sandipan Das,
	Lai Jiangshan, Hans de Goede, Reinette Chatre, Daniel Sneddon,
	Breno Leitao, Nikunj A Dadhania, Brian Gerst, Sami Tolvanen,
	Alexander Potapenko, Andrew Morton, Arnd Bergmann,
	Eric W . Biederman, Kees Cook, Masami Hiramatsu, Masahiro Yamada,
	Ze Gao, Fei Li, Conghui, Ashok Raj, Jason A . Donenfeld,
	Mark Rutland, Jacob Pan, Jiapeng Chong, Jane Malalane,
	David Woodhouse, Boris Ostrovsky, Arnaldo Carvalho de Melo,
	Yantengsi, Christophe Leroy, Sathvika Vasireddy

Briefly introduce FRED, its advantages compared to IDT, and its
Linux enabling.

Signed-off-by: Xin Li <xin3.li@intel.com>
---
 Documentation/arch/x86/x86_64/fred.rst  | 102 ++++++++++++++++++++++++
 Documentation/arch/x86/x86_64/index.rst |   1 +
 2 files changed, 103 insertions(+)
 create mode 100644 Documentation/arch/x86/x86_64/fred.rst

diff --git a/Documentation/arch/x86/x86_64/fred.rst b/Documentation/arch/x86/x86_64/fred.rst
new file mode 100644
index 000000000000..27c980e882ba
--- /dev/null
+++ b/Documentation/arch/x86/x86_64/fred.rst
@@ -0,0 +1,102 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+=========================================
+Flexible Return and Event Delivery (FRED)
+=========================================
+
+Overview
+========
+
+The FRED architecture defines simple new transitions that change
+privilege level (ring transitions). The FRED architecture was
+designed with the following goals:
+
+1) Improve overall performance and response time by replacing event
+   delivery through the interrupt descriptor table (IDT event
+   delivery) and event return by the IRET instruction with lower
+   latency transitions.
+
+2) Improve software robustness by ensuring that event delivery
+   establishes the full supervisor context and that event return
+   establishes the full user context.
+
+The new transitions defined by the FRED architecture are FRED event
+delivery and, for returning from events, two FRED return instructions.
+FRED event delivery can effect a transition from ring 3 to ring 0, but
+it is used also to deliver events incident to ring 0. One FRED
+instruction (ERETU) effects a return from ring 0 to ring 3, while the
+other (ERETS) returns while remaining in ring 0. Collectively, FRED
+event delivery and the FRED return instructions are FRED transitions.
+
+In addition to these transitions, the FRED architecture defines a new
+instruction (LKGS) for managing the state of the GS segment register.
+The LKGS instruction can be used by 64-bit operating systems that do
+not use the new FRED transitions.
+
+Software based event dispatching
+================================
+
+FRED operates differently from IDT in terms of event handling. Instead
+of directly dispatching an event to its handler based on the event
+vector, FRED requires the software to dispatch an event to its handler
+based on both the event's type and vector. Therefore, an event
+dispatch framework must be implemented to facilitate the
+event-to-handler dispatch process. The FRED event dispatch framework
+assumes control once an event is delivered, starting from two FRED
+entry points, after which several event dispatch tables are introduced
+to facilitate the dispatching.
+
+The first level dispatching is event type based, and two tables need
+to be defined, one for ring 3 event dispatching, and the other
+for ring 0.
+
+The second level dispatching is event vector based, and
+several tables need to be defined, e.g., an exception handler table
+for exception dispatching.
+
+Full supervisor/user context
+============================
+
+FRED event delivery atomically save and restore full supervisor/user
+context upon event delivery and return. Thus it avoids the problem of
+transient states due to %cr2 and/or %dr6, thus it is no longer needed
+to handle all the ugly corner cases caused by half baked CPU states.
+
+FRED allows explicit unblock of NMI with new event return instructions
+ERETS/ERETU, avoiding the mess caused by IRET which unconditionally
+unblocks NMI, when an exception happens during NMI handling.
+
+FRED always restores the full value of %rsp, thus ESPFIX is no longer
+needed when FRED is enabled.
+
+LKGS
+====
+
+LKGS behaves like the MOV to GS instruction except that it loads the
+base address into the IA32_KERNEL_GS_BASE MSR instead of the GS
+segment’s descriptor cache, which is exactly what Linux kernel does
+to load user level GS base. With LKGS, it ends up with avoiding
+mucking with kernel GS.
+
+Because FRED event delivery from ring 3 swaps the value of the GS base
+address and that of the IA32_KERNEL_GS_BASE MSR, and ERETU swaps the
+value of the GS base address and that of the IA32_KERNEL_GS_BASE MSR,
+plus the introduction of LKGS instruction, the SWAPGS instruction is
+no longer needed when FRED is enabled, thus is disallowed (#UD).
+
+Stack levels
+============
+
+4 stack levels 0~3 are introduced to replace the un-reentrant IST for
+handling events. Each stack level could be configured to use a
+dedicated stack.
+
+The current stack level could be unchanged or go higher upon FRED
+event delivery. If unchanged, the CPU keeps using the current event
+stack. If higher, the CPU switches to a new stack specified by the
+stack MSR of the new stack level.
+
+Only execution of a FRED return instruction ERETU or ERETS could lower
+the current stack level, causing the CPU to switch back to the stack
+it was on before a previous event delivery.
+satck.
diff --git a/Documentation/arch/x86/x86_64/index.rst b/Documentation/arch/x86/x86_64/index.rst
index a56070fc8e77..ad15e9bd623f 100644
--- a/Documentation/arch/x86/x86_64/index.rst
+++ b/Documentation/arch/x86/x86_64/index.rst
@@ -15,3 +15,4 @@ x86_64 Support
    cpu-hotplug-spec
    machinecheck
    fsgs
+   fred
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 32+ messages in thread

* [PATCH v9 02/36] x86/fred: Add Kconfig option for FRED (CONFIG_X86_FRED)
  2023-07-31  6:32 [PATCH v9 00/36] x86: enable FRED for x86-64 Xin Li
  2023-07-31  6:32 ` [PATCH v9 01/36] Documentation/x86/64: Add documentation for FRED Xin Li
@ 2023-07-31  6:32 ` Xin Li
  2023-07-31  6:32 ` [PATCH v9 03/36] x86/fred: Disable FRED support if CONFIG_X86_FRED is disabled Xin Li
                   ` (23 subsequent siblings)
  25 siblings, 0 replies; 32+ messages in thread
From: Xin Li @ 2023-07-31  6:32 UTC (permalink / raw)
  To: linux-doc, linux-kernel, linux-edac, linux-hyperv, kvm, xen-devel
  Cc: Jonathan Corbet, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
	Dave Hansen, x86, H . Peter Anvin, Andy Lutomirski,
	Oleg Nesterov, Tony Luck, K . Y . Srinivasan, Haiyang Zhang,
	Wei Liu, Dexuan Cui, Paolo Bonzini, Wanpeng Li, Vitaly Kuznetsov,
	Sean Christopherson, Peter Zijlstra, Juergen Gross,
	Stefano Stabellini, Oleksandr Tyshchenko, Josh Poimboeuf,
	Paul E . McKenney, Catalin Marinas, Randy Dunlap, Steven Rostedt,
	Kim Phillips, Xin Li, Hyeonggon Yoo, Liam R . Howlett,
	Sebastian Reichel, Kirill A . Shutemov, Suren Baghdasaryan,
	Pawan Gupta, Jiaxi Chen, Babu Moger, Jim Mattson, Sandipan Das,
	Lai Jiangshan, Hans de Goede, Reinette Chatre, Daniel Sneddon,
	Breno Leitao, Nikunj A Dadhania, Brian Gerst, Sami Tolvanen,
	Alexander Potapenko, Andrew Morton, Arnd Bergmann,
	Eric W . Biederman, Kees Cook, Masami Hiramatsu, Masahiro Yamada,
	Ze Gao, Fei Li, Conghui, Ashok Raj, Jason A . Donenfeld,
	Mark Rutland, Jacob Pan, Jiapeng Chong, Jane Malalane,
	David Woodhouse, Boris Ostrovsky, Arnaldo Carvalho de Melo,
	Yantengsi, Christophe Leroy, Sathvika Vasireddy

From: "H. Peter Anvin (Intel)" <hpa@zytor.com>

Add the configuration option CONFIG_X86_FRED to enable FRED.

Signed-off-by: H. Peter Anvin (Intel) <hpa@zytor.com>
Tested-by: Shan Kang <shan.kang@intel.com>
Signed-off-by: Xin Li <xin3.li@intel.com>
---
 arch/x86/Kconfig | 9 +++++++++
 1 file changed, 9 insertions(+)

diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index 7422db409770..700d94cb8330 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -494,6 +494,15 @@ config X86_CPU_RESCTRL
 
 	  Say N if unsure.
 
+config X86_FRED
+	bool "Flexible Return and Event Delivery"
+	depends on X86_64
+	help
+	  When enabled, try to use Flexible Return and Event Delivery
+	  instead of the legacy SYSCALL/SYSENTER/IDT architecture for
+	  ring transitions and exception/interrupt handling if the
+	  system supports.
+
 if X86_32
 config X86_BIGSMP
 	bool "Support for big SMP systems with more than 8 CPUs"
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 32+ messages in thread

* [PATCH v9 03/36] x86/fred: Disable FRED support if CONFIG_X86_FRED is disabled
  2023-07-31  6:32 [PATCH v9 00/36] x86: enable FRED for x86-64 Xin Li
  2023-07-31  6:32 ` [PATCH v9 01/36] Documentation/x86/64: Add documentation for FRED Xin Li
  2023-07-31  6:32 ` [PATCH v9 02/36] x86/fred: Add Kconfig option for FRED (CONFIG_X86_FRED) Xin Li
@ 2023-07-31  6:32 ` Xin Li
  2023-07-31  6:32 ` [PATCH v9 04/36] x86/cpufeatures: Add the cpu feature bit for FRED Xin Li
                   ` (22 subsequent siblings)
  25 siblings, 0 replies; 32+ messages in thread
From: Xin Li @ 2023-07-31  6:32 UTC (permalink / raw)
  To: linux-doc, linux-kernel, linux-edac, linux-hyperv, kvm, xen-devel
  Cc: Jonathan Corbet, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
	Dave Hansen, x86, H . Peter Anvin, Andy Lutomirski,
	Oleg Nesterov, Tony Luck, K . Y . Srinivasan, Haiyang Zhang,
	Wei Liu, Dexuan Cui, Paolo Bonzini, Wanpeng Li, Vitaly Kuznetsov,
	Sean Christopherson, Peter Zijlstra, Juergen Gross,
	Stefano Stabellini, Oleksandr Tyshchenko, Josh Poimboeuf,
	Paul E . McKenney, Catalin Marinas, Randy Dunlap, Steven Rostedt,
	Kim Phillips, Xin Li, Hyeonggon Yoo, Liam R . Howlett,
	Sebastian Reichel, Kirill A . Shutemov, Suren Baghdasaryan,
	Pawan Gupta, Jiaxi Chen, Babu Moger, Jim Mattson, Sandipan Das,
	Lai Jiangshan, Hans de Goede, Reinette Chatre, Daniel Sneddon,
	Breno Leitao, Nikunj A Dadhania, Brian Gerst, Sami Tolvanen,
	Alexander Potapenko, Andrew Morton, Arnd Bergmann,
	Eric W . Biederman, Kees Cook, Masami Hiramatsu, Masahiro Yamada,
	Ze Gao, Fei Li, Conghui, Ashok Raj, Jason A . Donenfeld,
	Mark Rutland, Jacob Pan, Jiapeng Chong, Jane Malalane,
	David Woodhouse, Boris Ostrovsky, Arnaldo Carvalho de Melo,
	Yantengsi, Christophe Leroy, Sathvika Vasireddy

From: "H. Peter Anvin (Intel)" <hpa@zytor.com>

Add CONFIG_X86_FRED to <asm/disabled-features.h> to make
cpu_feature_enabled() work correctly with FRED.

Originally-by: Megha Dey <megha.dey@intel.com>
Signed-off-by: H. Peter Anvin (Intel) <hpa@zytor.com>
Tested-by: Shan Kang <shan.kang@intel.com>
Signed-off-by: Xin Li <xin3.li@intel.com>
---
 arch/x86/include/asm/disabled-features.h       | 8 +++++++-
 tools/arch/x86/include/asm/disabled-features.h | 8 +++++++-
 2 files changed, 14 insertions(+), 2 deletions(-)

diff --git a/arch/x86/include/asm/disabled-features.h b/arch/x86/include/asm/disabled-features.h
index fafe9be7a6f4..85fd67c67ce1 100644
--- a/arch/x86/include/asm/disabled-features.h
+++ b/arch/x86/include/asm/disabled-features.h
@@ -105,6 +105,12 @@
 # define DISABLE_TDX_GUEST	(1 << (X86_FEATURE_TDX_GUEST & 31))
 #endif
 
+#ifdef CONFIG_X86_FRED
+# define DISABLE_FRED 0
+#else
+# define DISABLE_FRED (1 << (X86_FEATURE_FRED & 31))
+#endif
+
 /*
  * Make sure to add features to the correct mask
  */
@@ -122,7 +128,7 @@
 #define DISABLED_MASK11	(DISABLE_RETPOLINE|DISABLE_RETHUNK|DISABLE_UNRET| \
 			 DISABLE_CALL_DEPTH_TRACKING)
 #define DISABLED_MASK12	(DISABLE_LAM)
-#define DISABLED_MASK13	0
+#define DISABLED_MASK13	(DISABLE_FRED)
 #define DISABLED_MASK14	0
 #define DISABLED_MASK15	0
 #define DISABLED_MASK16	(DISABLE_PKU|DISABLE_OSPKE|DISABLE_LA57|DISABLE_UMIP| \
diff --git a/tools/arch/x86/include/asm/disabled-features.h b/tools/arch/x86/include/asm/disabled-features.h
index fafe9be7a6f4..85fd67c67ce1 100644
--- a/tools/arch/x86/include/asm/disabled-features.h
+++ b/tools/arch/x86/include/asm/disabled-features.h
@@ -105,6 +105,12 @@
 # define DISABLE_TDX_GUEST	(1 << (X86_FEATURE_TDX_GUEST & 31))
 #endif
 
+#ifdef CONFIG_X86_FRED
+# define DISABLE_FRED 0
+#else
+# define DISABLE_FRED (1 << (X86_FEATURE_FRED & 31))
+#endif
+
 /*
  * Make sure to add features to the correct mask
  */
@@ -122,7 +128,7 @@
 #define DISABLED_MASK11	(DISABLE_RETPOLINE|DISABLE_RETHUNK|DISABLE_UNRET| \
 			 DISABLE_CALL_DEPTH_TRACKING)
 #define DISABLED_MASK12	(DISABLE_LAM)
-#define DISABLED_MASK13	0
+#define DISABLED_MASK13	(DISABLE_FRED)
 #define DISABLED_MASK14	0
 #define DISABLED_MASK15	0
 #define DISABLED_MASK16	(DISABLE_PKU|DISABLE_OSPKE|DISABLE_LA57|DISABLE_UMIP| \
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 32+ messages in thread

* [PATCH v9 04/36] x86/cpufeatures: Add the cpu feature bit for FRED
  2023-07-31  6:32 [PATCH v9 00/36] x86: enable FRED for x86-64 Xin Li
                   ` (2 preceding siblings ...)
  2023-07-31  6:32 ` [PATCH v9 03/36] x86/fred: Disable FRED support if CONFIG_X86_FRED is disabled Xin Li
@ 2023-07-31  6:32 ` Xin Li
  2023-07-31  6:32 ` [PATCH v9 05/36] x86/opcode: Add ERETU, ERETS instructions to x86-opcode-map Xin Li
                   ` (21 subsequent siblings)
  25 siblings, 0 replies; 32+ messages in thread
From: Xin Li @ 2023-07-31  6:32 UTC (permalink / raw)
  To: linux-doc, linux-kernel, linux-edac, linux-hyperv, kvm, xen-devel
  Cc: Jonathan Corbet, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
	Dave Hansen, x86, H . Peter Anvin, Andy Lutomirski,
	Oleg Nesterov, Tony Luck, K . Y . Srinivasan, Haiyang Zhang,
	Wei Liu, Dexuan Cui, Paolo Bonzini, Wanpeng Li, Vitaly Kuznetsov,
	Sean Christopherson, Peter Zijlstra, Juergen Gross,
	Stefano Stabellini, Oleksandr Tyshchenko, Josh Poimboeuf,
	Paul E . McKenney, Catalin Marinas, Randy Dunlap, Steven Rostedt,
	Kim Phillips, Xin Li, Hyeonggon Yoo, Liam R . Howlett,
	Sebastian Reichel, Kirill A . Shutemov, Suren Baghdasaryan,
	Pawan Gupta, Jiaxi Chen, Babu Moger, Jim Mattson, Sandipan Das,
	Lai Jiangshan, Hans de Goede, Reinette Chatre, Daniel Sneddon,
	Breno Leitao, Nikunj A Dadhania, Brian Gerst, Sami Tolvanen,
	Alexander Potapenko, Andrew Morton, Arnd Bergmann,
	Eric W . Biederman, Kees Cook, Masami Hiramatsu, Masahiro Yamada,
	Ze Gao, Fei Li, Conghui, Ashok Raj, Jason A . Donenfeld,
	Mark Rutland, Jacob Pan, Jiapeng Chong, Jane Malalane,
	David Woodhouse, Boris Ostrovsky, Arnaldo Carvalho de Melo,
	Yantengsi, Christophe Leroy, Sathvika Vasireddy

From: "H. Peter Anvin (Intel)" <hpa@zytor.com>

Add the CPU feature bit for FRED.

Signed-off-by: H. Peter Anvin (Intel) <hpa@zytor.com>
Tested-by: Shan Kang <shan.kang@intel.com>
Signed-off-by: Xin Li <xin3.li@intel.com>
---
 arch/x86/include/asm/cpufeatures.h       | 1 +
 tools/arch/x86/include/asm/cpufeatures.h | 1 +
 2 files changed, 2 insertions(+)

diff --git a/arch/x86/include/asm/cpufeatures.h b/arch/x86/include/asm/cpufeatures.h
index cb8ca46213be..fd3ddd5c0283 100644
--- a/arch/x86/include/asm/cpufeatures.h
+++ b/arch/x86/include/asm/cpufeatures.h
@@ -317,6 +317,7 @@
 #define X86_FEATURE_FZRM		(12*32+10) /* "" Fast zero-length REP MOVSB */
 #define X86_FEATURE_FSRS		(12*32+11) /* "" Fast short REP STOSB */
 #define X86_FEATURE_FSRC		(12*32+12) /* "" Fast short REP {CMPSB,SCASB} */
+#define X86_FEATURE_FRED		(12*32+17) /* Flexible Return and Event Delivery */
 #define X86_FEATURE_LKGS		(12*32+18) /* "" Load "kernel" (userspace) GS */
 #define X86_FEATURE_AMX_FP16		(12*32+21) /* "" AMX fp16 Support */
 #define X86_FEATURE_AVX_IFMA            (12*32+23) /* "" Support for VPMADD52[H,L]UQ */
diff --git a/tools/arch/x86/include/asm/cpufeatures.h b/tools/arch/x86/include/asm/cpufeatures.h
index cb8ca46213be..fd3ddd5c0283 100644
--- a/tools/arch/x86/include/asm/cpufeatures.h
+++ b/tools/arch/x86/include/asm/cpufeatures.h
@@ -317,6 +317,7 @@
 #define X86_FEATURE_FZRM		(12*32+10) /* "" Fast zero-length REP MOVSB */
 #define X86_FEATURE_FSRS		(12*32+11) /* "" Fast short REP STOSB */
 #define X86_FEATURE_FSRC		(12*32+12) /* "" Fast short REP {CMPSB,SCASB} */
+#define X86_FEATURE_FRED		(12*32+17) /* Flexible Return and Event Delivery */
 #define X86_FEATURE_LKGS		(12*32+18) /* "" Load "kernel" (userspace) GS */
 #define X86_FEATURE_AMX_FP16		(12*32+21) /* "" AMX fp16 Support */
 #define X86_FEATURE_AVX_IFMA            (12*32+23) /* "" Support for VPMADD52[H,L]UQ */
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 32+ messages in thread

* [PATCH v9 05/36] x86/opcode: Add ERETU, ERETS instructions to x86-opcode-map
  2023-07-31  6:32 [PATCH v9 00/36] x86: enable FRED for x86-64 Xin Li
                   ` (3 preceding siblings ...)
  2023-07-31  6:32 ` [PATCH v9 04/36] x86/cpufeatures: Add the cpu feature bit for FRED Xin Li
@ 2023-07-31  6:32 ` Xin Li
  2023-07-31  9:43   ` Masami Hiramatsu
  2023-07-31  6:32 ` [PATCH v9 06/36] x86/objtool: Teach objtool about ERETU and ERETS Xin Li
                   ` (20 subsequent siblings)
  25 siblings, 1 reply; 32+ messages in thread
From: Xin Li @ 2023-07-31  6:32 UTC (permalink / raw)
  To: linux-doc, linux-kernel, linux-edac, linux-hyperv, kvm, xen-devel
  Cc: Jonathan Corbet, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
	Dave Hansen, x86, H . Peter Anvin, Andy Lutomirski,
	Oleg Nesterov, Tony Luck, K . Y . Srinivasan, Haiyang Zhang,
	Wei Liu, Dexuan Cui, Paolo Bonzini, Wanpeng Li, Vitaly Kuznetsov,
	Sean Christopherson, Peter Zijlstra, Juergen Gross,
	Stefano Stabellini, Oleksandr Tyshchenko, Josh Poimboeuf,
	Paul E . McKenney, Catalin Marinas, Randy Dunlap, Steven Rostedt,
	Kim Phillips, Xin Li, Hyeonggon Yoo, Liam R . Howlett,
	Sebastian Reichel, Kirill A . Shutemov, Suren Baghdasaryan,
	Pawan Gupta, Jiaxi Chen, Babu Moger, Jim Mattson, Sandipan Das,
	Lai Jiangshan, Hans de Goede, Reinette Chatre, Daniel Sneddon,
	Breno Leitao, Nikunj A Dadhania, Brian Gerst, Sami Tolvanen,
	Alexander Potapenko, Andrew Morton, Arnd Bergmann,
	Eric W . Biederman, Kees Cook, Masami Hiramatsu, Masahiro Yamada,
	Ze Gao, Fei Li, Conghui, Ashok Raj, Jason A . Donenfeld,
	Mark Rutland, Jacob Pan, Jiapeng Chong, Jane Malalane,
	David Woodhouse, Boris Ostrovsky, Arnaldo Carvalho de Melo,
	Yantengsi, Christophe Leroy, Sathvika Vasireddy

From: "H. Peter Anvin (Intel)" <hpa@zytor.com>

Add instruction opcodes used by FRED ERETU/ERETS to x86-opcode-map.

Opcode numbers are per FRED spec v5.0.

Signed-off-by: H. Peter Anvin (Intel) <hpa@zytor.com>
Tested-by: Shan Kang <shan.kang@intel.com>
Signed-off-by: Xin Li <xin3.li@intel.com>
---
 arch/x86/lib/x86-opcode-map.txt       | 2 +-
 tools/arch/x86/lib/x86-opcode-map.txt | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/x86/lib/x86-opcode-map.txt b/arch/x86/lib/x86-opcode-map.txt
index 5168ee0360b2..7a269e269dc0 100644
--- a/arch/x86/lib/x86-opcode-map.txt
+++ b/arch/x86/lib/x86-opcode-map.txt
@@ -1052,7 +1052,7 @@ EndTable
 
 GrpTable: Grp7
 0: SGDT Ms | VMCALL (001),(11B) | VMLAUNCH (010),(11B) | VMRESUME (011),(11B) | VMXOFF (100),(11B) | PCONFIG (101),(11B) | ENCLV (000),(11B)
-1: SIDT Ms | MONITOR (000),(11B) | MWAIT (001),(11B) | CLAC (010),(11B) | STAC (011),(11B) | ENCLS (111),(11B)
+1: SIDT Ms | MONITOR (000),(11B) | MWAIT (001),(11B) | CLAC (010),(11B) | STAC (011),(11B) | ENCLS (111),(11B) | ERETU (F3),(010),(11B) | ERETS (F2),(010),(11B)
 2: LGDT Ms | XGETBV (000),(11B) | XSETBV (001),(11B) | VMFUNC (100),(11B) | XEND (101)(11B) | XTEST (110)(11B) | ENCLU (111),(11B)
 3: LIDT Ms
 4: SMSW Mw/Rv
diff --git a/tools/arch/x86/lib/x86-opcode-map.txt b/tools/arch/x86/lib/x86-opcode-map.txt
index 5168ee0360b2..7a269e269dc0 100644
--- a/tools/arch/x86/lib/x86-opcode-map.txt
+++ b/tools/arch/x86/lib/x86-opcode-map.txt
@@ -1052,7 +1052,7 @@ EndTable
 
 GrpTable: Grp7
 0: SGDT Ms | VMCALL (001),(11B) | VMLAUNCH (010),(11B) | VMRESUME (011),(11B) | VMXOFF (100),(11B) | PCONFIG (101),(11B) | ENCLV (000),(11B)
-1: SIDT Ms | MONITOR (000),(11B) | MWAIT (001),(11B) | CLAC (010),(11B) | STAC (011),(11B) | ENCLS (111),(11B)
+1: SIDT Ms | MONITOR (000),(11B) | MWAIT (001),(11B) | CLAC (010),(11B) | STAC (011),(11B) | ENCLS (111),(11B) | ERETU (F3),(010),(11B) | ERETS (F2),(010),(11B)
 2: LGDT Ms | XGETBV (000),(11B) | XSETBV (001),(11B) | VMFUNC (100),(11B) | XEND (101)(11B) | XTEST (110)(11B) | ENCLU (111),(11B)
 3: LIDT Ms
 4: SMSW Mw/Rv
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 32+ messages in thread

* [PATCH v9 06/36] x86/objtool: Teach objtool about ERETU and ERETS
  2023-07-31  6:32 [PATCH v9 00/36] x86: enable FRED for x86-64 Xin Li
                   ` (4 preceding siblings ...)
  2023-07-31  6:32 ` [PATCH v9 05/36] x86/opcode: Add ERETU, ERETS instructions to x86-opcode-map Xin Li
@ 2023-07-31  6:32 ` Xin Li
  2023-07-31  6:32 ` [PATCH v9 07/36] x86/cpu: Add X86_CR4_FRED macro Xin Li
                   ` (19 subsequent siblings)
  25 siblings, 0 replies; 32+ messages in thread
From: Xin Li @ 2023-07-31  6:32 UTC (permalink / raw)
  To: linux-doc, linux-kernel, linux-edac, linux-hyperv, kvm, xen-devel
  Cc: Jonathan Corbet, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
	Dave Hansen, x86, H . Peter Anvin, Andy Lutomirski,
	Oleg Nesterov, Tony Luck, K . Y . Srinivasan, Haiyang Zhang,
	Wei Liu, Dexuan Cui, Paolo Bonzini, Wanpeng Li, Vitaly Kuznetsov,
	Sean Christopherson, Peter Zijlstra, Juergen Gross,
	Stefano Stabellini, Oleksandr Tyshchenko, Josh Poimboeuf,
	Paul E . McKenney, Catalin Marinas, Randy Dunlap, Steven Rostedt,
	Kim Phillips, Xin Li, Hyeonggon Yoo, Liam R . Howlett,
	Sebastian Reichel, Kirill A . Shutemov, Suren Baghdasaryan,
	Pawan Gupta, Jiaxi Chen, Babu Moger, Jim Mattson, Sandipan Das,
	Lai Jiangshan, Hans de Goede, Reinette Chatre, Daniel Sneddon,
	Breno Leitao, Nikunj A Dadhania, Brian Gerst, Sami Tolvanen,
	Alexander Potapenko, Andrew Morton, Arnd Bergmann,
	Eric W . Biederman, Kees Cook, Masami Hiramatsu, Masahiro Yamada,
	Ze Gao, Fei Li, Conghui, Ashok Raj, Jason A . Donenfeld,
	Mark Rutland, Jacob Pan, Jiapeng Chong, Jane Malalane,
	David Woodhouse, Boris Ostrovsky, Arnaldo Carvalho de Melo,
	Yantengsi, Christophe Leroy, Sathvika Vasireddy

From: "H. Peter Anvin (Intel)" <hpa@zytor.com>

Update the objtool decoder to know about the ERETU and ERETS
instructions (type INSN_CONTEXT_SWITCH).

Signed-off-by: H. Peter Anvin (Intel) <hpa@zytor.com>
Tested-by: Shan Kang <shan.kang@intel.com>
Signed-off-by: Xin Li <xin3.li@intel.com>
---
 tools/objtool/arch/x86/decode.c | 19 ++++++++++++++-----
 1 file changed, 14 insertions(+), 5 deletions(-)

diff --git a/tools/objtool/arch/x86/decode.c b/tools/objtool/arch/x86/decode.c
index 2e1caabecb18..a486485cff20 100644
--- a/tools/objtool/arch/x86/decode.c
+++ b/tools/objtool/arch/x86/decode.c
@@ -509,11 +509,20 @@ int arch_decode_instruction(struct objtool_file *file, const struct section *sec
 
 		if (op2 == 0x01) {
 
-			if (modrm == 0xca)
-				insn->type = INSN_CLAC;
-			else if (modrm == 0xcb)
-				insn->type = INSN_STAC;
-
+			switch (insn_last_prefix_id(&ins)) {
+			case INAT_PFX_REPE:
+			case INAT_PFX_REPNE:
+				if (modrm == 0xca)
+					/* eretu/erets */
+					insn->type = INSN_CONTEXT_SWITCH;
+				break;
+			default:
+				if (modrm == 0xca)
+					insn->type = INSN_CLAC;
+				else if (modrm == 0xcb)
+					insn->type = INSN_STAC;
+				break;
+			}
 		} else if (op2 >= 0x80 && op2 <= 0x8f) {
 
 			insn->type = INSN_JUMP_CONDITIONAL;
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 32+ messages in thread

* [PATCH v9 07/36] x86/cpu: Add X86_CR4_FRED macro
  2023-07-31  6:32 [PATCH v9 00/36] x86: enable FRED for x86-64 Xin Li
                   ` (5 preceding siblings ...)
  2023-07-31  6:32 ` [PATCH v9 06/36] x86/objtool: Teach objtool about ERETU and ERETS Xin Li
@ 2023-07-31  6:32 ` Xin Li
  2023-07-31  6:32 ` [PATCH v9 08/36] x86/cpu: Add MSR numbers for FRED configuration Xin Li
                   ` (18 subsequent siblings)
  25 siblings, 0 replies; 32+ messages in thread
From: Xin Li @ 2023-07-31  6:32 UTC (permalink / raw)
  To: linux-doc, linux-kernel, linux-edac, linux-hyperv, kvm, xen-devel
  Cc: Jonathan Corbet, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
	Dave Hansen, x86, H . Peter Anvin, Andy Lutomirski,
	Oleg Nesterov, Tony Luck, K . Y . Srinivasan, Haiyang Zhang,
	Wei Liu, Dexuan Cui, Paolo Bonzini, Wanpeng Li, Vitaly Kuznetsov,
	Sean Christopherson, Peter Zijlstra, Juergen Gross,
	Stefano Stabellini, Oleksandr Tyshchenko, Josh Poimboeuf,
	Paul E . McKenney, Catalin Marinas, Randy Dunlap, Steven Rostedt,
	Kim Phillips, Xin Li, Hyeonggon Yoo, Liam R . Howlett,
	Sebastian Reichel, Kirill A . Shutemov, Suren Baghdasaryan,
	Pawan Gupta, Jiaxi Chen, Babu Moger, Jim Mattson, Sandipan Das,
	Lai Jiangshan, Hans de Goede, Reinette Chatre, Daniel Sneddon,
	Breno Leitao, Nikunj A Dadhania, Brian Gerst, Sami Tolvanen,
	Alexander Potapenko, Andrew Morton, Arnd Bergmann,
	Eric W . Biederman, Kees Cook, Masami Hiramatsu, Masahiro Yamada,
	Ze Gao, Fei Li, Conghui, Ashok Raj, Jason A . Donenfeld,
	Mark Rutland, Jacob Pan, Jiapeng Chong, Jane Malalane,
	David Woodhouse, Boris Ostrovsky, Arnaldo Carvalho de Melo,
	Yantengsi, Christophe Leroy, Sathvika Vasireddy

From: "H. Peter Anvin (Intel)" <hpa@zytor.com>

Add X86_CR4_FRED macro for the FRED bit in %cr4. This bit should be a
pinned bit, not to be changed after initialization.

CR4 macros are defined in arch/x86/include/uapi/asm/processor-flags.h,
which is uapi, and thus cannot depend on CONFIG_X86_64.

Using _BITUL() causes build errors on 32 bits, and there is no
guarantee that user space applications (e.g. something like Qemu)
might not want to use this declaration even when building for i386 or
x32.

However, %cr4 is a machine word (unsigned long), so to avoid build
warnings on 32 bits, explicitly cast the value to unsigned long,
truncating upper 32 bits.

The other alternative would be to use CONFIG_X86_64 around the
definition of cr4_pinned_mask. It is probably not desirable to make
cr4_pinned_mask non-const.

Another option, which may be preferable, to be honest: explicitly
enumerate the CR4 bits which *may* be changed (a whitelist), instead
of the ones that may not. That would be a separate, pre-FRED, patch,
and would automatically resolve this problem as a side effect.

The following flags probably should have been in this set all along,
as they are all controls affecting the kernel runtime environment as
opposed to user space:

X86_CR4_DE, X86_CR4_PAE, X86_CR4_PSE, X86_CR4_MCE, X86_CR4_PGE,
X86_CR4_OSFXSR, X86_CR4_OSXMMEXCPT, X86_CR4_LA57, X86_CR4_PCIDE,
X86_CR4_LAM_SUP

Possibly X86_CR4_VMXE as well, which seems harmless even if KVM is
not loaded; X86_CR4_PKE can be fixed as long as the PKE configuration
registers are at least initialized to disabled.

It is relatively simple to do an audit of which flags are allowed to
be modified at runtime and whitelist only those. There is no reason
why we should allow bits in CR4 to be toggled by default.

Signed-off-by: H. Peter Anvin (Intel) <hpa@zytor.com>
Tested-by: Shan Kang <shan.kang@intel.com>
Signed-off-by: Xin Li <xin3.li@intel.com>
---
 arch/x86/include/uapi/asm/processor-flags.h | 2 ++
 arch/x86/kernel/cpu/common.c                | 5 +++--
 2 files changed, 5 insertions(+), 2 deletions(-)

diff --git a/arch/x86/include/uapi/asm/processor-flags.h b/arch/x86/include/uapi/asm/processor-flags.h
index d898432947ff..ce08c2ca70b5 100644
--- a/arch/x86/include/uapi/asm/processor-flags.h
+++ b/arch/x86/include/uapi/asm/processor-flags.h
@@ -138,6 +138,8 @@
 #define X86_CR4_CET		_BITUL(X86_CR4_CET_BIT)
 #define X86_CR4_LAM_SUP_BIT	28 /* LAM for supervisor pointers */
 #define X86_CR4_LAM_SUP		_BITUL(X86_CR4_LAM_SUP_BIT)
+#define X86_CR4_FRED_BIT	32 /* enable FRED kernel entry */
+#define X86_CR4_FRED		_BITULL(X86_CR4_FRED_BIT)
 
 /*
  * x86-64 Task Priority Register, CR8
diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c
index 0ba1067f4e5f..331b06d19f7f 100644
--- a/arch/x86/kernel/cpu/common.c
+++ b/arch/x86/kernel/cpu/common.c
@@ -402,8 +402,9 @@ static __always_inline void setup_umip(struct cpuinfo_x86 *c)
 
 /* These bits should not change their value after CPU init is finished. */
 static const unsigned long cr4_pinned_mask =
-	X86_CR4_SMEP | X86_CR4_SMAP | X86_CR4_UMIP |
-	X86_CR4_FSGSBASE | X86_CR4_CET;
+	(unsigned long)
+	(X86_CR4_SMEP | X86_CR4_SMAP | X86_CR4_UMIP |
+	 X86_CR4_FSGSBASE | X86_CR4_CET | X86_CR4_FRED);
 static DEFINE_STATIC_KEY_FALSE_RO(cr_pinning);
 static unsigned long cr4_pinned_bits __ro_after_init;
 
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 32+ messages in thread

* [PATCH v9 08/36] x86/cpu: Add MSR numbers for FRED configuration
  2023-07-31  6:32 [PATCH v9 00/36] x86: enable FRED for x86-64 Xin Li
                   ` (6 preceding siblings ...)
  2023-07-31  6:32 ` [PATCH v9 07/36] x86/cpu: Add X86_CR4_FRED macro Xin Li
@ 2023-07-31  6:32 ` Xin Li
  2023-07-31  6:32 ` [PATCH v9 09/36] x86/fred: Make unions for the cs and ss fields in struct pt_regs Xin Li
                   ` (17 subsequent siblings)
  25 siblings, 0 replies; 32+ messages in thread
From: Xin Li @ 2023-07-31  6:32 UTC (permalink / raw)
  To: linux-doc, linux-kernel, linux-edac, linux-hyperv, kvm, xen-devel
  Cc: Jonathan Corbet, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
	Dave Hansen, x86, H . Peter Anvin, Andy Lutomirski,
	Oleg Nesterov, Tony Luck, K . Y . Srinivasan, Haiyang Zhang,
	Wei Liu, Dexuan Cui, Paolo Bonzini, Wanpeng Li, Vitaly Kuznetsov,
	Sean Christopherson, Peter Zijlstra, Juergen Gross,
	Stefano Stabellini, Oleksandr Tyshchenko, Josh Poimboeuf,
	Paul E . McKenney, Catalin Marinas, Randy Dunlap, Steven Rostedt,
	Kim Phillips, Xin Li, Hyeonggon Yoo, Liam R . Howlett,
	Sebastian Reichel, Kirill A . Shutemov, Suren Baghdasaryan,
	Pawan Gupta, Jiaxi Chen, Babu Moger, Jim Mattson, Sandipan Das,
	Lai Jiangshan, Hans de Goede, Reinette Chatre, Daniel Sneddon,
	Breno Leitao, Nikunj A Dadhania, Brian Gerst, Sami Tolvanen,
	Alexander Potapenko, Andrew Morton, Arnd Bergmann,
	Eric W . Biederman, Kees Cook, Masami Hiramatsu, Masahiro Yamada,
	Ze Gao, Fei Li, Conghui, Ashok Raj, Jason A . Donenfeld,
	Mark Rutland, Jacob Pan, Jiapeng Chong, Jane Malalane,
	David Woodhouse, Boris Ostrovsky, Arnaldo Carvalho de Melo,
	Yantengsi, Christophe Leroy, Sathvika Vasireddy

From: "H. Peter Anvin (Intel)" <hpa@zytor.com>

Add MSR numbers for the FRED configuration registers.

Originally-by: Megha Dey <megha.dey@intel.com>
Signed-off-by: H. Peter Anvin (Intel) <hpa@zytor.com>
Tested-by: Shan Kang <shan.kang@intel.com>
Signed-off-by: Xin Li <xin3.li@intel.com>
---
 arch/x86/include/asm/msr-index.h       | 13 ++++++++++++-
 tools/arch/x86/include/asm/msr-index.h | 13 ++++++++++++-
 2 files changed, 24 insertions(+), 2 deletions(-)

diff --git a/arch/x86/include/asm/msr-index.h b/arch/x86/include/asm/msr-index.h
index a00a53e15ab7..111fb76f6dbe 100644
--- a/arch/x86/include/asm/msr-index.h
+++ b/arch/x86/include/asm/msr-index.h
@@ -36,8 +36,19 @@
 #define EFER_FFXSR		(1<<_EFER_FFXSR)
 #define EFER_AUTOIBRS		(1<<_EFER_AUTOIBRS)
 
-/* Intel MSRs. Some also available on other CPUs */
+/* FRED MSRs */
+#define MSR_IA32_FRED_RSP0	0x1cc /* Level 0 stack pointer */
+#define MSR_IA32_FRED_RSP1	0x1cd /* Level 1 stack pointer */
+#define MSR_IA32_FRED_RSP2	0x1ce /* Level 2 stack pointer */
+#define MSR_IA32_FRED_RSP3	0x1cf /* Level 3 stack pointer */
+#define MSR_IA32_FRED_STKLVLS	0x1d0 /* Exception stack levels */
+#define MSR_IA32_FRED_SSP0	MSR_IA32_PL0_SSP /* Level 0 shadow stack pointer */
+#define MSR_IA32_FRED_SSP1	0x1d1 /* Level 1 shadow stack pointer */
+#define MSR_IA32_FRED_SSP2	0x1d2 /* Level 2 shadow stack pointer */
+#define MSR_IA32_FRED_SSP3	0x1d3 /* Level 3 shadow stack pointer */
+#define MSR_IA32_FRED_CONFIG	0x1d4 /* Entrypoint and interrupt stack level */
 
+/* Intel MSRs. Some also available on other CPUs */
 #define MSR_TEST_CTRL				0x00000033
 #define MSR_TEST_CTRL_SPLIT_LOCK_DETECT_BIT	29
 #define MSR_TEST_CTRL_SPLIT_LOCK_DETECT		BIT(MSR_TEST_CTRL_SPLIT_LOCK_DETECT_BIT)
diff --git a/tools/arch/x86/include/asm/msr-index.h b/tools/arch/x86/include/asm/msr-index.h
index 3aedae61af4f..565cade0785a 100644
--- a/tools/arch/x86/include/asm/msr-index.h
+++ b/tools/arch/x86/include/asm/msr-index.h
@@ -36,8 +36,19 @@
 #define EFER_FFXSR		(1<<_EFER_FFXSR)
 #define EFER_AUTOIBRS		(1<<_EFER_AUTOIBRS)
 
-/* Intel MSRs. Some also available on other CPUs */
+/* FRED MSRs */
+#define MSR_IA32_FRED_RSP0	0x1cc /* Level 0 stack pointer */
+#define MSR_IA32_FRED_RSP1	0x1cd /* Level 1 stack pointer */
+#define MSR_IA32_FRED_RSP2	0x1ce /* Level 2 stack pointer */
+#define MSR_IA32_FRED_RSP3	0x1cf /* Level 3 stack pointer */
+#define MSR_IA32_FRED_STKLVLS	0x1d0 /* Exception stack levels */
+#define MSR_IA32_FRED_SSP0	MSR_IA32_PL0_SSP /* Level 0 shadow stack pointer */
+#define MSR_IA32_FRED_SSP1	0x1d1 /* Level 1 shadow stack pointer */
+#define MSR_IA32_FRED_SSP2	0x1d2 /* Level 2 shadow stack pointer */
+#define MSR_IA32_FRED_SSP3	0x1d3 /* Level 3 shadow stack pointer */
+#define MSR_IA32_FRED_CONFIG	0x1d4 /* Entrypoint and interrupt stack level */
 
+/* Intel MSRs. Some also available on other CPUs */
 #define MSR_TEST_CTRL				0x00000033
 #define MSR_TEST_CTRL_SPLIT_LOCK_DETECT_BIT	29
 #define MSR_TEST_CTRL_SPLIT_LOCK_DETECT		BIT(MSR_TEST_CTRL_SPLIT_LOCK_DETECT_BIT)
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 32+ messages in thread

* [PATCH v9 09/36] x86/fred: Make unions for the cs and ss fields in struct pt_regs
  2023-07-31  6:32 [PATCH v9 00/36] x86: enable FRED for x86-64 Xin Li
                   ` (7 preceding siblings ...)
  2023-07-31  6:32 ` [PATCH v9 08/36] x86/cpu: Add MSR numbers for FRED configuration Xin Li
@ 2023-07-31  6:32 ` Xin Li
  2023-07-31  6:32 ` [PATCH v9 10/36] x86/fred: Add a new header file for FRED definitions Xin Li
                   ` (16 subsequent siblings)
  25 siblings, 0 replies; 32+ messages in thread
From: Xin Li @ 2023-07-31  6:32 UTC (permalink / raw)
  To: linux-doc, linux-kernel, linux-edac, linux-hyperv, kvm, xen-devel
  Cc: Jonathan Corbet, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
	Dave Hansen, x86, H . Peter Anvin, Andy Lutomirski,
	Oleg Nesterov, Tony Luck, K . Y . Srinivasan, Haiyang Zhang,
	Wei Liu, Dexuan Cui, Paolo Bonzini, Wanpeng Li, Vitaly Kuznetsov,
	Sean Christopherson, Peter Zijlstra, Juergen Gross,
	Stefano Stabellini, Oleksandr Tyshchenko, Josh Poimboeuf,
	Paul E . McKenney, Catalin Marinas, Randy Dunlap, Steven Rostedt,
	Kim Phillips, Xin Li, Hyeonggon Yoo, Liam R . Howlett,
	Sebastian Reichel, Kirill A . Shutemov, Suren Baghdasaryan,
	Pawan Gupta, Jiaxi Chen, Babu Moger, Jim Mattson, Sandipan Das,
	Lai Jiangshan, Hans de Goede, Reinette Chatre, Daniel Sneddon,
	Breno Leitao, Nikunj A Dadhania, Brian Gerst, Sami Tolvanen,
	Alexander Potapenko, Andrew Morton, Arnd Bergmann,
	Eric W . Biederman, Kees Cook, Masami Hiramatsu, Masahiro Yamada,
	Ze Gao, Fei Li, Conghui, Ashok Raj, Jason A . Donenfeld,
	Mark Rutland, Jacob Pan, Jiapeng Chong, Jane Malalane,
	David Woodhouse, Boris Ostrovsky, Arnaldo Carvalho de Melo,
	Yantengsi, Christophe Leroy, Sathvika Vasireddy

From: "H. Peter Anvin (Intel)" <hpa@zytor.com>

Make the cs and ss fields in struct pt_regs unions between the actual
selector and the unsigned long stack slot. FRED uses this space to
store additional flags.

The printk changes are simply due to the cs and ss fields changed to
unsigned short from unsigned long.

Signed-off-by: H. Peter Anvin (Intel) <hpa@zytor.com>
Tested-by: Shan Kang <shan.kang@intel.com>
Signed-off-by: Xin Li <xin3.li@intel.com>
---

Changes since v8:
* Reflect stack frame definition changes from FRED spec 3.0 to 5.0.
* Use __packed instead of __attribute__((__packed__)) (Borislav Petkov).
* Put all comments above the members, like the rest of the file does
  (Borislav Petkov).

Changes since v3:
* Rename csl/ssl of the pt_regs structure to csx/ssx (x for extended)
  (Andrew Cooper).
---
 arch/x86/entry/vsyscall/vsyscall_64.c |  2 +-
 arch/x86/include/asm/ptrace.h         | 57 +++++++++++++++++++++++++--
 arch/x86/kernel/process_64.c          |  2 +-
 3 files changed, 56 insertions(+), 5 deletions(-)

diff --git a/arch/x86/entry/vsyscall/vsyscall_64.c b/arch/x86/entry/vsyscall/vsyscall_64.c
index e0ca8120aea8..a3c0df11d0e6 100644
--- a/arch/x86/entry/vsyscall/vsyscall_64.c
+++ b/arch/x86/entry/vsyscall/vsyscall_64.c
@@ -76,7 +76,7 @@ static void warn_bad_vsyscall(const char *level, struct pt_regs *regs,
 	if (!show_unhandled_signals)
 		return;
 
-	printk_ratelimited("%s%s[%d] %s ip:%lx cs:%lx sp:%lx ax:%lx si:%lx di:%lx\n",
+	printk_ratelimited("%s%s[%d] %s ip:%lx cs:%x sp:%lx ax:%lx si:%lx di:%lx\n",
 			   level, current->comm, task_pid_nr(current),
 			   message, regs->ip, regs->cs,
 			   regs->sp, regs->ax, regs->si, regs->di);
diff --git a/arch/x86/include/asm/ptrace.h b/arch/x86/include/asm/ptrace.h
index f4db78b09c8f..f1690beffd15 100644
--- a/arch/x86/include/asm/ptrace.h
+++ b/arch/x86/include/asm/ptrace.h
@@ -80,15 +80,66 @@ struct pt_regs {
 /*
  * On syscall entry, this is syscall#. On CPU exception, this is error code.
  * On hw interrupt, it's IRQ number:
+ *
+ * A FRED stack frame starts here:
+ *   1) It _always_ includes an error code;
+ *   2) The return frame for eretu/erets starts here.
  */
 	unsigned long orig_ax;
 /* Return frame for iretq */
 	unsigned long ip;
-	unsigned long cs;
+	union {
+/* CS extended: CS + any fields above it */
+		unsigned long csx;
+		struct {
+/* CS selector proper */
+			unsigned short cs;
+/* The stack level (SL) at the time the event occurred */
+			unsigned int sl		: 2;
+/* Set to indicate that indirect branch tracker in WAIT_FOR_ENDBRANCH state */
+			unsigned int wfe	: 1;
+			unsigned int __csx_resv1: 13;
+			unsigned int __csx_resv2: 32;
+		} __packed;
+	};
 	unsigned long flags;
 	unsigned long sp;
-	unsigned long ss;
-/* top of stack page */
+	union {
+/* SS extended: SS + any fields above it */
+		unsigned long ssx;
+		struct {
+/* SS selector proper */
+			unsigned short ss;
+/* Set to indicate that interrupt blocking by STI was in effect */
+			unsigned int sti	: 1;
+/* For SYSCALL, SYSENTER, or INT n (for any value of n) */
+			unsigned int sys	: 1;
+			unsigned int nmi	: 1;
+			unsigned int __ssx_resv1: 13;
+/* Event information fields, ignored by the FRED return instructions */
+			unsigned int vector	: 8;
+			unsigned int __ssx_resv2: 8;
+			unsigned int type	: 4;
+			unsigned int __ssx_resv3: 4;
+/* Set to indicate that the event was incident to enclave execution */
+			unsigned int enc	: 1;
+/* Set to indicate that the logical processor had been in 64-bit mode */
+			unsigned int l		: 1;
+/*
+ * Set to indicate the event is a nested exception encountered during FRED
+ * event delivery of another event. This bit is not set if the event is
+ * double fault (#DF).
+ */
+			unsigned int nst	: 1;
+			unsigned int __ssx_resv4: 1;
+/* The length of the instruction causing the event */
+			unsigned int instr_len	: 4;
+		} __packed;
+	};
+/*
+ * Top of stack page on IDT systems, while FRED systems have extra fields
+ * defined above, see <asm/fred.h>.
+ */
 };
 
 #endif /* !__i386__ */
diff --git a/arch/x86/kernel/process_64.c b/arch/x86/kernel/process_64.c
index 3d181c16a2f6..265ab8fcb146 100644
--- a/arch/x86/kernel/process_64.c
+++ b/arch/x86/kernel/process_64.c
@@ -117,7 +117,7 @@ void __show_regs(struct pt_regs *regs, enum show_regs_mode mode,
 
 	printk("%sFS:  %016lx(%04x) GS:%016lx(%04x) knlGS:%016lx\n",
 	       log_lvl, fs, fsindex, gs, gsindex, shadowgs);
-	printk("%sCS:  %04lx DS: %04x ES: %04x CR0: %016lx\n",
+	printk("%sCS:  %04x DS: %04x ES: %04x CR0: %016lx\n",
 		log_lvl, regs->cs, ds, es, cr0);
 	printk("%sCR2: %016lx CR3: %016lx CR4: %016lx\n",
 		log_lvl, cr2, cr3, cr4);
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 32+ messages in thread

* [PATCH v9 10/36] x86/fred: Add a new header file for FRED definitions
  2023-07-31  6:32 [PATCH v9 00/36] x86: enable FRED for x86-64 Xin Li
                   ` (8 preceding siblings ...)
  2023-07-31  6:32 ` [PATCH v9 09/36] x86/fred: Make unions for the cs and ss fields in struct pt_regs Xin Li
@ 2023-07-31  6:32 ` Xin Li
  2023-07-31  6:32 ` [PATCH v9 11/36] x86/fred: Reserve space for the FRED stack frame Xin Li
                   ` (15 subsequent siblings)
  25 siblings, 0 replies; 32+ messages in thread
From: Xin Li @ 2023-07-31  6:32 UTC (permalink / raw)
  To: linux-doc, linux-kernel, linux-edac, linux-hyperv, kvm, xen-devel
  Cc: Jonathan Corbet, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
	Dave Hansen, x86, H . Peter Anvin, Andy Lutomirski,
	Oleg Nesterov, Tony Luck, K . Y . Srinivasan, Haiyang Zhang,
	Wei Liu, Dexuan Cui, Paolo Bonzini, Wanpeng Li, Vitaly Kuznetsov,
	Sean Christopherson, Peter Zijlstra, Juergen Gross,
	Stefano Stabellini, Oleksandr Tyshchenko, Josh Poimboeuf,
	Paul E . McKenney, Catalin Marinas, Randy Dunlap, Steven Rostedt,
	Kim Phillips, Xin Li, Hyeonggon Yoo, Liam R . Howlett,
	Sebastian Reichel, Kirill A . Shutemov, Suren Baghdasaryan,
	Pawan Gupta, Jiaxi Chen, Babu Moger, Jim Mattson, Sandipan Das,
	Lai Jiangshan, Hans de Goede, Reinette Chatre, Daniel Sneddon,
	Breno Leitao, Nikunj A Dadhania, Brian Gerst, Sami Tolvanen,
	Alexander Potapenko, Andrew Morton, Arnd Bergmann,
	Eric W . Biederman, Kees Cook, Masami Hiramatsu, Masahiro Yamada,
	Ze Gao, Fei Li, Conghui, Ashok Raj, Jason A . Donenfeld,
	Mark Rutland, Jacob Pan, Jiapeng Chong, Jane Malalane,
	David Woodhouse, Boris Ostrovsky, Arnaldo Carvalho de Melo,
	Yantengsi, Christophe Leroy, Sathvika Vasireddy

From: "H. Peter Anvin (Intel)" <hpa@zytor.com>

Add a header file for FRED prototypes and definitions.

Signed-off-by: H. Peter Anvin (Intel) <hpa@zytor.com>
Tested-by: Shan Kang <shan.kang@intel.com>
Signed-off-by: Xin Li <xin3.li@intel.com>
---

Changes since v6:
* Replace pt_regs csx flags prefix FRED_CSL_ with FRED_CSX_.
---
 arch/x86/include/asm/fred.h | 104 ++++++++++++++++++++++++++++++++++++
 1 file changed, 104 insertions(+)
 create mode 100644 arch/x86/include/asm/fred.h

diff --git a/arch/x86/include/asm/fred.h b/arch/x86/include/asm/fred.h
new file mode 100644
index 000000000000..d76e681a806f
--- /dev/null
+++ b/arch/x86/include/asm/fred.h
@@ -0,0 +1,104 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * Macros for Flexible Return and Event Delivery (FRED)
+ */
+
+#ifndef ASM_X86_FRED_H
+#define ASM_X86_FRED_H
+
+#include <linux/const.h>
+#include <asm/asm.h>
+
+/*
+ * FRED return instructions
+ *
+ * Replace with "ERETS"/"ERETU" once binutils support FRED return instructions.
+ * The binutils version supporting FRED instructions is still TBD, and will
+ * update once we have it.
+ */
+#define ERETS			_ASM_BYTES(0xf2,0x0f,0x01,0xca)
+#define ERETU			_ASM_BYTES(0xf3,0x0f,0x01,0xca)
+
+/*
+ * RSP is aligned to a 64-byte boundary before used to push a new stack frame
+ */
+#define FRED_STACK_FRAME_RSP_MASK	_AT(unsigned long, (~0x3f))
+
+/*
+ * Event stack level macro for the FRED_STKLVLS MSR.
+ * Usage example: FRED_STKLVL(X86_TRAP_DF, 3)
+ * Multiple values can be ORd together.
+ */
+#define FRED_STKLVL(v,l)	(_AT(unsigned long, l) << (2*(v)))
+
+/* FRED_CONFIG MSR */
+#define FRED_CONFIG_CSL_MASK		0x3
+/*
+ * Used for the return address for call emulation during code patching,
+ * and measured in 64-byte cache lines.
+ */
+#define FRED_CONFIG_REDZONE_AMOUNT	1
+#define FRED_CONFIG_REDZONE		(_AT(unsigned long, FRED_CONFIG_REDZONE_AMOUNT) << 6)
+#define FRED_CONFIG_INT_STKLVL(l)	(_AT(unsigned long, l) << 9)
+#define FRED_CONFIG_ENTRYPOINT(p)	_AT(unsigned long, (p))
+
+/*
+ * FRED event type and vector bit width and counts.
+ *
+ * There is space in the stack frame making it possible to extend event type
+ * and vector fields in the future.
+ */
+#define FRED_EVENT_TYPE_BITS		3
+#define FRED_EVENT_TYPE_COUNT		_BITUL(FRED_EVENT_TYPE_BITS)
+#define FRED_EVENT_VECTOR_BITS		8
+#define FRED_EVENT_VECTOR_COUNT		_BITUL(FRED_EVENT_VECTOR_BITS)
+
+/* FRED EVENT_TYPE_OTHER vector numbers */
+#define FRED_SYSCALL			1
+#define FRED_SYSENTER			2
+#define FRED_NUM_OTHER_VECTORS		3
+
+/* Flags above the SS selector (regs->ssx) */
+#define FRED_SSX_INTERRUPT_SHADOW_BIT	16
+#define FRED_SSX_INTERRUPT_SHADOW	_BITUL(FRED_SSX_INTERRUPT_SHADOW_BIT)
+#define FRED_SSX_SOFTWARE_INITIATED_BIT	17
+#define FRED_SSX_SOFTWARE_INITIATED	_BITUL(FRED_SSX_SOFTWARE_INITIATED_BIT)
+#define FRED_SSX_NMI_BIT		18
+#define FRED_SSX_NMI			_BITUL(FRED_SSX_NMI_BIT)
+#define FRED_SSX_64_BIT_MODE_BIT	57
+#define FRED_SSX_64_BIT_MODE		_BITUL(FRED_SSX_64_BIT_MODE_BIT)
+
+#ifdef CONFIG_X86_FRED
+
+#ifndef __ASSEMBLY__
+
+#include <linux/kernel.h>
+#include <asm/ptrace.h>
+
+struct fred_info {
+	/* Event data: CR2, DR6, ... */
+	unsigned long edata;
+	unsigned long resv;
+};
+
+/* Full format of the FRED stack frame */
+struct fred_frame {
+	struct pt_regs   regs;
+	struct fred_info info;
+};
+
+static __always_inline struct fred_info *fred_info(struct pt_regs *regs)
+{
+	return &container_of(regs, struct fred_frame, regs)->info;
+}
+
+static __always_inline unsigned long fred_event_data(struct pt_regs *regs)
+{
+	return fred_info(regs)->edata;
+}
+
+#endif /* __ASSEMBLY__ */
+
+#endif /* CONFIG_X86_FRED */
+
+#endif /* ASM_X86_FRED_H */
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 32+ messages in thread

* [PATCH v9 11/36] x86/fred: Reserve space for the FRED stack frame
  2023-07-31  6:32 [PATCH v9 00/36] x86: enable FRED for x86-64 Xin Li
                   ` (9 preceding siblings ...)
  2023-07-31  6:32 ` [PATCH v9 10/36] x86/fred: Add a new header file for FRED definitions Xin Li
@ 2023-07-31  6:32 ` Xin Li
  2023-07-31  6:32 ` [PATCH v9 12/36] x86/fred: Update MSR_IA32_FRED_RSP0 during task switch Xin Li
                   ` (14 subsequent siblings)
  25 siblings, 0 replies; 32+ messages in thread
From: Xin Li @ 2023-07-31  6:32 UTC (permalink / raw)
  To: linux-doc, linux-kernel, linux-edac, linux-hyperv, kvm, xen-devel
  Cc: Jonathan Corbet, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
	Dave Hansen, x86, H . Peter Anvin, Andy Lutomirski,
	Oleg Nesterov, Tony Luck, K . Y . Srinivasan, Haiyang Zhang,
	Wei Liu, Dexuan Cui, Paolo Bonzini, Wanpeng Li, Vitaly Kuznetsov,
	Sean Christopherson, Peter Zijlstra, Juergen Gross,
	Stefano Stabellini, Oleksandr Tyshchenko, Josh Poimboeuf,
	Paul E . McKenney, Catalin Marinas, Randy Dunlap, Steven Rostedt,
	Kim Phillips, Xin Li, Hyeonggon Yoo, Liam R . Howlett,
	Sebastian Reichel, Kirill A . Shutemov, Suren Baghdasaryan,
	Pawan Gupta, Jiaxi Chen, Babu Moger, Jim Mattson, Sandipan Das,
	Lai Jiangshan, Hans de Goede, Reinette Chatre, Daniel Sneddon,
	Breno Leitao, Nikunj A Dadhania, Brian Gerst, Sami Tolvanen,
	Alexander Potapenko, Andrew Morton, Arnd Bergmann,
	Eric W . Biederman, Kees Cook, Masami Hiramatsu, Masahiro Yamada,
	Ze Gao, Fei Li, Conghui, Ashok Raj, Jason A . Donenfeld,
	Mark Rutland, Jacob Pan, Jiapeng Chong, Jane Malalane,
	David Woodhouse, Boris Ostrovsky, Arnaldo Carvalho de Melo,
	Yantengsi, Christophe Leroy, Sathvika Vasireddy

From: "H. Peter Anvin (Intel)" <hpa@zytor.com>

When using FRED, reserve space at the top of the stack frame, just
like i386 does.

Signed-off-by: H. Peter Anvin (Intel) <hpa@zytor.com>
Tested-by: Shan Kang <shan.kang@intel.com>
Signed-off-by: Xin Li <xin3.li@intel.com>
---
 arch/x86/include/asm/thread_info.h | 12 +++++++++---
 1 file changed, 9 insertions(+), 3 deletions(-)

diff --git a/arch/x86/include/asm/thread_info.h b/arch/x86/include/asm/thread_info.h
index d63b02940747..089cab875cba 100644
--- a/arch/x86/include/asm/thread_info.h
+++ b/arch/x86/include/asm/thread_info.h
@@ -31,7 +31,9 @@
  * In vm86 mode, the hardware frame is much longer still, so add 16
  * bytes to make room for the real-mode segments.
  *
- * x86_64 has a fixed-length stack frame.
+ * x86-64 has a fixed-length stack frame, but it depends on whether
+ * or not FRED is enabled. Future versions of FRED might make this
+ * dynamic, but for now it is always 2 words longer.
  */
 #ifdef CONFIG_X86_32
 # ifdef CONFIG_VM86
@@ -39,8 +41,12 @@
 # else
 #  define TOP_OF_KERNEL_STACK_PADDING 8
 # endif
-#else
-# define TOP_OF_KERNEL_STACK_PADDING 0
+#else /* x86-64 */
+# ifdef CONFIG_X86_FRED
+#  define TOP_OF_KERNEL_STACK_PADDING (2*8)
+# else
+#  define TOP_OF_KERNEL_STACK_PADDING 0
+# endif
 #endif
 
 /*
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 32+ messages in thread

* [PATCH v9 12/36] x86/fred: Update MSR_IA32_FRED_RSP0 during task switch
  2023-07-31  6:32 [PATCH v9 00/36] x86: enable FRED for x86-64 Xin Li
                   ` (10 preceding siblings ...)
  2023-07-31  6:32 ` [PATCH v9 11/36] x86/fred: Reserve space for the FRED stack frame Xin Li
@ 2023-07-31  6:32 ` Xin Li
  2023-07-31  6:32 ` [PATCH v9 13/36] x86/fred: Let ret_from_fork_asm() jmp to fred_exit_user when FRED is enabled Xin Li
                   ` (13 subsequent siblings)
  25 siblings, 0 replies; 32+ messages in thread
From: Xin Li @ 2023-07-31  6:32 UTC (permalink / raw)
  To: linux-doc, linux-kernel, linux-edac, linux-hyperv, kvm, xen-devel
  Cc: Jonathan Corbet, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
	Dave Hansen, x86, H . Peter Anvin, Andy Lutomirski,
	Oleg Nesterov, Tony Luck, K . Y . Srinivasan, Haiyang Zhang,
	Wei Liu, Dexuan Cui, Paolo Bonzini, Wanpeng Li, Vitaly Kuznetsov,
	Sean Christopherson, Peter Zijlstra, Juergen Gross,
	Stefano Stabellini, Oleksandr Tyshchenko, Josh Poimboeuf,
	Paul E . McKenney, Catalin Marinas, Randy Dunlap, Steven Rostedt,
	Kim Phillips, Xin Li, Hyeonggon Yoo, Liam R . Howlett,
	Sebastian Reichel, Kirill A . Shutemov, Suren Baghdasaryan,
	Pawan Gupta, Jiaxi Chen, Babu Moger, Jim Mattson, Sandipan Das,
	Lai Jiangshan, Hans de Goede, Reinette Chatre, Daniel Sneddon,
	Breno Leitao, Nikunj A Dadhania, Brian Gerst, Sami Tolvanen,
	Alexander Potapenko, Andrew Morton, Arnd Bergmann,
	Eric W . Biederman, Kees Cook, Masami Hiramatsu, Masahiro Yamada,
	Ze Gao, Fei Li, Conghui, Ashok Raj, Jason A . Donenfeld,
	Mark Rutland, Jacob Pan, Jiapeng Chong, Jane Malalane,
	David Woodhouse, Boris Ostrovsky, Arnaldo Carvalho de Melo,
	Yantengsi, Christophe Leroy, Sathvika Vasireddy

From: "H. Peter Anvin (Intel)" <hpa@zytor.com>

MSR_IA32_FRED_RSP0 is used during ring 3 event delivery, and needs to
be updated to point to the top of next task stack during task switch.

Update MSR_IA32_FRED_RSP0 with WRMSR instruction for now, and will use
WRMSRNS/WRMSRLIST for performance once it gets upstreamed.

Signed-off-by: H. Peter Anvin (Intel) <hpa@zytor.com>
Tested-by: Shan Kang <shan.kang@intel.com>
Signed-off-by: Xin Li <xin3.li@intel.com>
---
 arch/x86/include/asm/switch_to.h | 11 +++++++++--
 1 file changed, 9 insertions(+), 2 deletions(-)

diff --git a/arch/x86/include/asm/switch_to.h b/arch/x86/include/asm/switch_to.h
index f42dbf17f52b..6c911fd400b2 100644
--- a/arch/x86/include/asm/switch_to.h
+++ b/arch/x86/include/asm/switch_to.h
@@ -70,9 +70,16 @@ static inline void update_task_stack(struct task_struct *task)
 #ifdef CONFIG_X86_32
 	this_cpu_write(cpu_tss_rw.x86_tss.sp1, task->thread.sp0);
 #else
-	/* Xen PV enters the kernel on the thread stack. */
-	if (cpu_feature_enabled(X86_FEATURE_XENPV))
+	if (cpu_feature_enabled(X86_FEATURE_FRED)) {
+		/*
+		 * Will use WRMSRNS/WRMSRLIST for performance once it's upstreamed.
+		 */
+		wrmsrl(MSR_IA32_FRED_RSP0,
+		       (unsigned long)task_stack_page(task) + THREAD_SIZE);
+	} else if (cpu_feature_enabled(X86_FEATURE_XENPV)) {
+		/* Xen PV enters the kernel on the thread stack. */
 		load_sp0(task_top_of_stack(task));
+	}
 #endif
 }
 
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 32+ messages in thread

* [PATCH v9 13/36] x86/fred: Let ret_from_fork_asm() jmp to fred_exit_user when FRED is enabled
  2023-07-31  6:32 [PATCH v9 00/36] x86: enable FRED for x86-64 Xin Li
                   ` (11 preceding siblings ...)
  2023-07-31  6:32 ` [PATCH v9 12/36] x86/fred: Update MSR_IA32_FRED_RSP0 during task switch Xin Li
@ 2023-07-31  6:32 ` Xin Li
  2023-07-31  6:32 ` [PATCH v9 14/36] x86/fred: Disallow the swapgs instruction " Xin Li
                   ` (12 subsequent siblings)
  25 siblings, 0 replies; 32+ messages in thread
From: Xin Li @ 2023-07-31  6:32 UTC (permalink / raw)
  To: linux-doc, linux-kernel, linux-edac, linux-hyperv, kvm, xen-devel
  Cc: Jonathan Corbet, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
	Dave Hansen, x86, H . Peter Anvin, Andy Lutomirski,
	Oleg Nesterov, Tony Luck, K . Y . Srinivasan, Haiyang Zhang,
	Wei Liu, Dexuan Cui, Paolo Bonzini, Wanpeng Li, Vitaly Kuznetsov,
	Sean Christopherson, Peter Zijlstra, Juergen Gross,
	Stefano Stabellini, Oleksandr Tyshchenko, Josh Poimboeuf,
	Paul E . McKenney, Catalin Marinas, Randy Dunlap, Steven Rostedt,
	Kim Phillips, Xin Li, Hyeonggon Yoo, Liam R . Howlett,
	Sebastian Reichel, Kirill A . Shutemov, Suren Baghdasaryan,
	Pawan Gupta, Jiaxi Chen, Babu Moger, Jim Mattson, Sandipan Das,
	Lai Jiangshan, Hans de Goede, Reinette Chatre, Daniel Sneddon,
	Breno Leitao, Nikunj A Dadhania, Brian Gerst, Sami Tolvanen,
	Alexander Potapenko, Andrew Morton, Arnd Bergmann,
	Eric W . Biederman, Kees Cook, Masami Hiramatsu, Masahiro Yamada,
	Ze Gao, Fei Li, Conghui, Ashok Raj, Jason A . Donenfeld,
	Mark Rutland, Jacob Pan, Jiapeng Chong, Jane Malalane,
	David Woodhouse, Boris Ostrovsky, Arnaldo Carvalho de Melo,
	Yantengsi, Christophe Leroy, Sathvika Vasireddy

From: "H. Peter Anvin (Intel)" <hpa@zytor.com>

Let ret_from_fork_asm() jmp to fred_exit_user when FRED is enabled,
otherwise the existing IDT code is chosen.

Signed-off-by: H. Peter Anvin (Intel) <hpa@zytor.com>
Tested-by: Shan Kang <shan.kang@intel.com>
Signed-off-by: Xin Li <xin3.li@intel.com>
---
 arch/x86/entry/entry_64.S | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/arch/x86/entry/entry_64.S b/arch/x86/entry/entry_64.S
index 43606de22511..8069151176f2 100644
--- a/arch/x86/entry/entry_64.S
+++ b/arch/x86/entry/entry_64.S
@@ -309,7 +309,13 @@ SYM_CODE_START(ret_from_fork_asm)
 	 * and unwind should work normally.
 	 */
 	UNWIND_HINT_REGS
+
+#ifdef CONFIG_X86_FRED
+	ALTERNATIVE "jmp swapgs_restore_regs_and_return_to_usermode", \
+		    "jmp fred_exit_user", X86_FEATURE_FRED
+#else
 	jmp	swapgs_restore_regs_and_return_to_usermode
+#endif
 SYM_CODE_END(ret_from_fork_asm)
 .popsection
 
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 32+ messages in thread

* [PATCH v9 14/36] x86/fred: Disallow the swapgs instruction when FRED is enabled
  2023-07-31  6:32 [PATCH v9 00/36] x86: enable FRED for x86-64 Xin Li
                   ` (12 preceding siblings ...)
  2023-07-31  6:32 ` [PATCH v9 13/36] x86/fred: Let ret_from_fork_asm() jmp to fred_exit_user when FRED is enabled Xin Li
@ 2023-07-31  6:32 ` Xin Li
  2023-07-31  6:32 ` [PATCH v9 15/36] x86/fred: No ESPFIX needed " Xin Li
                   ` (11 subsequent siblings)
  25 siblings, 0 replies; 32+ messages in thread
From: Xin Li @ 2023-07-31  6:32 UTC (permalink / raw)
  To: linux-doc, linux-kernel, linux-edac, linux-hyperv, kvm, xen-devel
  Cc: Jonathan Corbet, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
	Dave Hansen, x86, H . Peter Anvin, Andy Lutomirski,
	Oleg Nesterov, Tony Luck, K . Y . Srinivasan, Haiyang Zhang,
	Wei Liu, Dexuan Cui, Paolo Bonzini, Wanpeng Li, Vitaly Kuznetsov,
	Sean Christopherson, Peter Zijlstra, Juergen Gross,
	Stefano Stabellini, Oleksandr Tyshchenko, Josh Poimboeuf,
	Paul E . McKenney, Catalin Marinas, Randy Dunlap, Steven Rostedt,
	Kim Phillips, Xin Li, Hyeonggon Yoo, Liam R . Howlett,
	Sebastian Reichel, Kirill A . Shutemov, Suren Baghdasaryan,
	Pawan Gupta, Jiaxi Chen, Babu Moger, Jim Mattson, Sandipan Das,
	Lai Jiangshan, Hans de Goede, Reinette Chatre, Daniel Sneddon,
	Breno Leitao, Nikunj A Dadhania, Brian Gerst, Sami Tolvanen,
	Alexander Potapenko, Andrew Morton, Arnd Bergmann,
	Eric W . Biederman, Kees Cook, Masami Hiramatsu, Masahiro Yamada,
	Ze Gao, Fei Li, Conghui, Ashok Raj, Jason A . Donenfeld,
	Mark Rutland, Jacob Pan, Jiapeng Chong, Jane Malalane,
	David Woodhouse, Boris Ostrovsky, Arnaldo Carvalho de Melo,
	Yantengsi, Christophe Leroy, Sathvika Vasireddy

From: "H. Peter Anvin (Intel)" <hpa@zytor.com>

The FRED architecture establishes the full supervisor/user through:
1) FRED event delivery from ring 3 swaps the value of the GS base
   address and that of the IA32_KERNEL_GS_BASE MSR.
2) ERETU swaps the value of the GS base address and that of the
   IA32_KERNEL_GS_BASE MSR.
3) LKGS is already upstreamed and automatically enabled with FRED to
   load the GS base address directly into the IA32_KERNEL_GS_BASE MSR
   instead of the GS segment’s descriptor cache.

As a result, there is no need to SWAPGS away from the kernel GS base,
i.e., the swapgs instruction is no longer needed when FRED is enabled,
thus is disallowed. Otherwise it causes #UD.

Signed-off-by: H. Peter Anvin (Intel) <hpa@zytor.com>
Tested-by: Shan Kang <shan.kang@intel.com>
Signed-off-by: Xin Li <xin3.li@intel.com>
---

Changes since v8:
* Explain why writing directly to the IA32_KERNEL_GS_BASE MSR is
  doing the right thing (Thomas Gleixner).
---
 arch/x86/kernel/process_64.c | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/arch/x86/kernel/process_64.c b/arch/x86/kernel/process_64.c
index 265ab8fcb146..6d5fed29f552 100644
--- a/arch/x86/kernel/process_64.c
+++ b/arch/x86/kernel/process_64.c
@@ -166,7 +166,8 @@ static noinstr unsigned long __rdgsbase_inactive(void)
 
 	lockdep_assert_irqs_disabled();
 
-	if (!cpu_feature_enabled(X86_FEATURE_XENPV)) {
+	if (!cpu_feature_enabled(X86_FEATURE_FRED) &&
+	    !cpu_feature_enabled(X86_FEATURE_XENPV)) {
 		native_swapgs();
 		gsbase = rdgsbase();
 		native_swapgs();
@@ -191,7 +192,8 @@ static noinstr void __wrgsbase_inactive(unsigned long gsbase)
 {
 	lockdep_assert_irqs_disabled();
 
-	if (!cpu_feature_enabled(X86_FEATURE_XENPV)) {
+	if (!cpu_feature_enabled(X86_FEATURE_FRED) &&
+	    !cpu_feature_enabled(X86_FEATURE_XENPV)) {
 		native_swapgs();
 		wrgsbase(gsbase);
 		native_swapgs();
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 32+ messages in thread

* [PATCH v9 15/36] x86/fred: No ESPFIX needed when FRED is enabled
  2023-07-31  6:32 [PATCH v9 00/36] x86: enable FRED for x86-64 Xin Li
                   ` (13 preceding siblings ...)
  2023-07-31  6:32 ` [PATCH v9 14/36] x86/fred: Disallow the swapgs instruction " Xin Li
@ 2023-07-31  6:32 ` Xin Li
  2023-07-31  6:32 ` [PATCH v9 16/36] x86/fred: Allow single-step trap and NMI when starting a new task Xin Li
                   ` (10 subsequent siblings)
  25 siblings, 0 replies; 32+ messages in thread
From: Xin Li @ 2023-07-31  6:32 UTC (permalink / raw)
  To: linux-doc, linux-kernel, linux-edac, linux-hyperv, kvm, xen-devel
  Cc: Jonathan Corbet, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
	Dave Hansen, x86, H . Peter Anvin, Andy Lutomirski,
	Oleg Nesterov, Tony Luck, K . Y . Srinivasan, Haiyang Zhang,
	Wei Liu, Dexuan Cui, Paolo Bonzini, Wanpeng Li, Vitaly Kuznetsov,
	Sean Christopherson, Peter Zijlstra, Juergen Gross,
	Stefano Stabellini, Oleksandr Tyshchenko, Josh Poimboeuf,
	Paul E . McKenney, Catalin Marinas, Randy Dunlap, Steven Rostedt,
	Kim Phillips, Xin Li, Hyeonggon Yoo, Liam R . Howlett,
	Sebastian Reichel, Kirill A . Shutemov, Suren Baghdasaryan,
	Pawan Gupta, Jiaxi Chen, Babu Moger, Jim Mattson, Sandipan Das,
	Lai Jiangshan, Hans de Goede, Reinette Chatre, Daniel Sneddon,
	Breno Leitao, Nikunj A Dadhania, Brian Gerst, Sami Tolvanen,
	Alexander Potapenko, Andrew Morton, Arnd Bergmann,
	Eric W . Biederman, Kees Cook, Masami Hiramatsu, Masahiro Yamada,
	Ze Gao, Fei Li, Conghui, Ashok Raj, Jason A . Donenfeld,
	Mark Rutland, Jacob Pan, Jiapeng Chong, Jane Malalane,
	David Woodhouse, Boris Ostrovsky, Arnaldo Carvalho de Melo,
	Yantengsi, Christophe Leroy, Sathvika Vasireddy

From: "H. Peter Anvin (Intel)" <hpa@zytor.com>

Because FRED always restores the full value of %rsp, ESPFIX is
no longer needed when it's enabled.

Signed-off-by: H. Peter Anvin (Intel) <hpa@zytor.com>
Tested-by: Shan Kang <shan.kang@intel.com>
Signed-off-by: Xin Li <xin3.li@intel.com>
---
 arch/x86/kernel/espfix_64.c | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/arch/x86/kernel/espfix_64.c b/arch/x86/kernel/espfix_64.c
index 16f9814c9be0..48d133a54f45 100644
--- a/arch/x86/kernel/espfix_64.c
+++ b/arch/x86/kernel/espfix_64.c
@@ -106,6 +106,10 @@ void __init init_espfix_bsp(void)
 	pgd_t *pgd;
 	p4d_t *p4d;
 
+	/* FRED systems don't need ESPFIX */
+	if (cpu_feature_enabled(X86_FEATURE_FRED))
+		return;
+
 	/* Install the espfix pud into the kernel page directory */
 	pgd = &init_top_pgt[pgd_index(ESPFIX_BASE_ADDR)];
 	p4d = p4d_alloc(&init_mm, pgd, ESPFIX_BASE_ADDR);
@@ -129,6 +133,10 @@ void init_espfix_ap(int cpu)
 	void *stack_page;
 	pteval_t ptemask;
 
+	/* FRED systems don't need ESPFIX */
+	if (cpu_feature_enabled(X86_FEATURE_FRED))
+		return;
+
 	/* We only have to do this once... */
 	if (likely(per_cpu(espfix_stack, cpu)))
 		return;		/* Already initialized */
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 32+ messages in thread

* [PATCH v9 16/36] x86/fred: Allow single-step trap and NMI when starting a new task
  2023-07-31  6:32 [PATCH v9 00/36] x86: enable FRED for x86-64 Xin Li
                   ` (14 preceding siblings ...)
  2023-07-31  6:32 ` [PATCH v9 15/36] x86/fred: No ESPFIX needed " Xin Li
@ 2023-07-31  6:32 ` Xin Li
  2023-07-31  6:32 ` [PATCH v9 17/36] x86/fred: Define a common function type fred_handler Xin Li
                   ` (9 subsequent siblings)
  25 siblings, 0 replies; 32+ messages in thread
From: Xin Li @ 2023-07-31  6:32 UTC (permalink / raw)
  To: linux-doc, linux-kernel, linux-edac, linux-hyperv, kvm, xen-devel
  Cc: Jonathan Corbet, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
	Dave Hansen, x86, H . Peter Anvin, Andy Lutomirski,
	Oleg Nesterov, Tony Luck, K . Y . Srinivasan, Haiyang Zhang,
	Wei Liu, Dexuan Cui, Paolo Bonzini, Wanpeng Li, Vitaly Kuznetsov,
	Sean Christopherson, Peter Zijlstra, Juergen Gross,
	Stefano Stabellini, Oleksandr Tyshchenko, Josh Poimboeuf,
	Paul E . McKenney, Catalin Marinas, Randy Dunlap, Steven Rostedt,
	Kim Phillips, Xin Li, Hyeonggon Yoo, Liam R . Howlett,
	Sebastian Reichel, Kirill A . Shutemov, Suren Baghdasaryan,
	Pawan Gupta, Jiaxi Chen, Babu Moger, Jim Mattson, Sandipan Das,
	Lai Jiangshan, Hans de Goede, Reinette Chatre, Daniel Sneddon,
	Breno Leitao, Nikunj A Dadhania, Brian Gerst, Sami Tolvanen,
	Alexander Potapenko, Andrew Morton, Arnd Bergmann,
	Eric W . Biederman, Kees Cook, Masami Hiramatsu, Masahiro Yamada,
	Ze Gao, Fei Li, Conghui, Ashok Raj, Jason A . Donenfeld,
	Mark Rutland, Jacob Pan, Jiapeng Chong, Jane Malalane,
	David Woodhouse, Boris Ostrovsky, Arnaldo Carvalho de Melo,
	Yantengsi, Christophe Leroy, Sathvika Vasireddy

From: "H. Peter Anvin (Intel)" <hpa@zytor.com>

Entering a new task is logically speaking a return from a system call
(exec, fork, clone, etc.). As such, if ptrace enables single stepping
a single step exception should be allowed to trigger immediately upon
entering user space. This is not optional.

NMI should *never* be disabled in user space. As such, this is an
optional, opportunistic way to catch errors.

Allow single-step trap and NMI when starting a new task, thus once
the new task enters user space, single-step trap and NMI are both
enabled immediately.

Signed-off-by: H. Peter Anvin (Intel) <hpa@zytor.com>
Tested-by: Shan Kang <shan.kang@intel.com>
Signed-off-by: Xin Li <xin3.li@intel.com>
---

Changes since v8:
* Use high-order 48 bits above the lowest 16 bit SS only when FRED
  is enabled (Thomas Gleixner).
---
 arch/x86/kernel/process_64.c | 23 +++++++++++++++++------
 1 file changed, 17 insertions(+), 6 deletions(-)

diff --git a/arch/x86/kernel/process_64.c b/arch/x86/kernel/process_64.c
index 6d5fed29f552..0b47871a6141 100644
--- a/arch/x86/kernel/process_64.c
+++ b/arch/x86/kernel/process_64.c
@@ -56,6 +56,7 @@
 #include <asm/resctrl.h>
 #include <asm/unistd.h>
 #include <asm/fsgsbase.h>
+#include <asm/fred.h>
 #ifdef CONFIG_IA32_EMULATION
 /* Not included via unistd.h */
 #include <asm/unistd_32_ia32.h>
@@ -507,8 +508,18 @@ void x86_gsbase_write_task(struct task_struct *task, unsigned long gsbase)
 static void
 start_thread_common(struct pt_regs *regs, unsigned long new_ip,
 		    unsigned long new_sp,
-		    unsigned int _cs, unsigned int _ss, unsigned int _ds)
+		    u16 _cs, u16 _ss, u16 _ds)
 {
+	/*
+	 * Paranoia: High-order 48 bits above the lowest 16 bit SS are
+	 * discarded by the legacy IRET instruction on all Intel, AMD,
+	 * and Cyrix/Centaur/VIA CPUs, thus can be set unconditionally,
+	 * even when FRED is not enabled. But we choose the safer side
+	 * to use these bits only when FRED is enabled.
+	 */
+	const unsigned long ssx_flags = cpu_feature_enabled(X86_FEATURE_FRED) ?
+		(FRED_SSX_SOFTWARE_INITIATED | FRED_SSX_NMI) : 0;
+
 	WARN_ON_ONCE(regs != current_pt_regs());
 
 	if (static_cpu_has(X86_BUG_NULL_SEG)) {
@@ -522,11 +533,11 @@ start_thread_common(struct pt_regs *regs, unsigned long new_ip,
 	loadsegment(ds, _ds);
 	load_gs_index(0);
 
-	regs->ip		= new_ip;
-	regs->sp		= new_sp;
-	regs->cs		= _cs;
-	regs->ss		= _ss;
-	regs->flags		= X86_EFLAGS_IF;
+	regs->ip	= new_ip;
+	regs->sp	= new_sp;
+	regs->csx	= _cs;
+	regs->ssx	= _ss | ssx_flags;
+	regs->flags	= X86_EFLAGS_IF | X86_EFLAGS_FIXED;
 }
 
 void
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 32+ messages in thread

* [PATCH v9 17/36] x86/fred: Define a common function type fred_handler
  2023-07-31  6:32 [PATCH v9 00/36] x86: enable FRED for x86-64 Xin Li
                   ` (15 preceding siblings ...)
  2023-07-31  6:32 ` [PATCH v9 16/36] x86/fred: Allow single-step trap and NMI when starting a new task Xin Li
@ 2023-07-31  6:32 ` Xin Li
  2023-07-31  6:32 ` [PATCH v9 18/36] x86/fred: Add a page fault entry stub for FRED Xin Li
                   ` (8 subsequent siblings)
  25 siblings, 0 replies; 32+ messages in thread
From: Xin Li @ 2023-07-31  6:32 UTC (permalink / raw)
  To: linux-doc, linux-kernel, linux-edac, linux-hyperv, kvm, xen-devel
  Cc: Jonathan Corbet, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
	Dave Hansen, x86, H . Peter Anvin, Andy Lutomirski,
	Oleg Nesterov, Tony Luck, K . Y . Srinivasan, Haiyang Zhang,
	Wei Liu, Dexuan Cui, Paolo Bonzini, Wanpeng Li, Vitaly Kuznetsov,
	Sean Christopherson, Peter Zijlstra, Juergen Gross,
	Stefano Stabellini, Oleksandr Tyshchenko, Josh Poimboeuf,
	Paul E . McKenney, Catalin Marinas, Randy Dunlap, Steven Rostedt,
	Kim Phillips, Xin Li, Hyeonggon Yoo, Liam R . Howlett,
	Sebastian Reichel, Kirill A . Shutemov, Suren Baghdasaryan,
	Pawan Gupta, Jiaxi Chen, Babu Moger, Jim Mattson, Sandipan Das,
	Lai Jiangshan, Hans de Goede, Reinette Chatre, Daniel Sneddon,
	Breno Leitao, Nikunj A Dadhania, Brian Gerst, Sami Tolvanen,
	Alexander Potapenko, Andrew Morton, Arnd Bergmann,
	Eric W . Biederman, Kees Cook, Masami Hiramatsu, Masahiro Yamada,
	Ze Gao, Fei Li, Conghui, Ashok Raj, Jason A . Donenfeld,
	Mark Rutland, Jacob Pan, Jiapeng Chong, Jane Malalane,
	David Woodhouse, Boris Ostrovsky, Arnaldo Carvalho de Melo,
	Yantengsi, Christophe Leroy, Sathvika Vasireddy

FRED event delivery establishes a full supervisor context by saving
the essential information about an event to a FRED stack frame, e.g.,
the faulting linear address of a #PF is saved as event data of a FRED
stack frame. Thus a struct pt_regs has all the needed data to handle
an event and it's the only input argument of a FRED event handler.

Define fred_handler, a common function type used in the FRED event
dispatch framework, which makes it easier to find the entry points
(via grep), allows the prototype to change if necessary without
requiring changing changes everywhere, and makes sure that all the
entry points have the proper decorations (currently noinstr, but
could change in the future.)

Signed-off-by: Xin Li <xin3.li@intel.com>
---
 arch/x86/include/asm/fred.h | 15 +++++++++++++++
 1 file changed, 15 insertions(+)

diff --git a/arch/x86/include/asm/fred.h b/arch/x86/include/asm/fred.h
index d76e681a806f..b45c1bea5b7f 100644
--- a/arch/x86/include/asm/fred.h
+++ b/arch/x86/include/asm/fred.h
@@ -68,6 +68,19 @@
 #define FRED_SSX_64_BIT_MODE_BIT	57
 #define FRED_SSX_64_BIT_MODE		_BITUL(FRED_SSX_64_BIT_MODE_BIT)
 
+/*
+ * FRED event delivery establishes a full supervisor context by
+ * saving the essential information about an event to a FRED
+ * stack frame, e.g., the faulting linear address of a #PF is
+ * saved as event data of a FRED #PF stack frame. Thus a struct
+ * pt_regs has all the needed data to handle an event and it's
+ * the only input argument of a FRED event handler.
+ *
+ * FRED handlers need to be placed in the noinstr text section.
+ */
+#define DECLARE_FRED_HANDLER(f) void f (struct pt_regs *regs)
+#define DEFINE_FRED_HANDLER(f) noinstr DECLARE_FRED_HANDLER(f)
+
 #ifdef CONFIG_X86_FRED
 
 #ifndef __ASSEMBLY__
@@ -97,6 +110,8 @@ static __always_inline unsigned long fred_event_data(struct pt_regs *regs)
 	return fred_info(regs)->edata;
 }
 
+typedef DECLARE_FRED_HANDLER((*fred_handler));
+
 #endif /* __ASSEMBLY__ */
 
 #endif /* CONFIG_X86_FRED */
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 32+ messages in thread

* [PATCH v9 18/36] x86/fred: Add a page fault entry stub for FRED
  2023-07-31  6:32 [PATCH v9 00/36] x86: enable FRED for x86-64 Xin Li
                   ` (16 preceding siblings ...)
  2023-07-31  6:32 ` [PATCH v9 17/36] x86/fred: Define a common function type fred_handler Xin Li
@ 2023-07-31  6:32 ` Xin Li
  2023-07-31  6:33 ` [PATCH v9 19/36] x86/fred: Add a debug " Xin Li
                   ` (7 subsequent siblings)
  25 siblings, 0 replies; 32+ messages in thread
From: Xin Li @ 2023-07-31  6:32 UTC (permalink / raw)
  To: linux-doc, linux-kernel, linux-edac, linux-hyperv, kvm, xen-devel
  Cc: Jonathan Corbet, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
	Dave Hansen, x86, H . Peter Anvin, Andy Lutomirski,
	Oleg Nesterov, Tony Luck, K . Y . Srinivasan, Haiyang Zhang,
	Wei Liu, Dexuan Cui, Paolo Bonzini, Wanpeng Li, Vitaly Kuznetsov,
	Sean Christopherson, Peter Zijlstra, Juergen Gross,
	Stefano Stabellini, Oleksandr Tyshchenko, Josh Poimboeuf,
	Paul E . McKenney, Catalin Marinas, Randy Dunlap, Steven Rostedt,
	Kim Phillips, Xin Li, Hyeonggon Yoo, Liam R . Howlett,
	Sebastian Reichel, Kirill A . Shutemov, Suren Baghdasaryan,
	Pawan Gupta, Jiaxi Chen, Babu Moger, Jim Mattson, Sandipan Das,
	Lai Jiangshan, Hans de Goede, Reinette Chatre, Daniel Sneddon,
	Breno Leitao, Nikunj A Dadhania, Brian Gerst, Sami Tolvanen,
	Alexander Potapenko, Andrew Morton, Arnd Bergmann,
	Eric W . Biederman, Kees Cook, Masami Hiramatsu, Masahiro Yamada,
	Ze Gao, Fei Li, Conghui, Ashok Raj, Jason A . Donenfeld,
	Mark Rutland, Jacob Pan, Jiapeng Chong, Jane Malalane,
	David Woodhouse, Boris Ostrovsky, Arnaldo Carvalho de Melo,
	Yantengsi, Christophe Leroy, Sathvika Vasireddy

From: "H. Peter Anvin (Intel)" <hpa@zytor.com>

Add a page fault entry stub for FRED.

On a FRED system, the faulting address (CR2) is passed on the stack,
to avoid the problem of transient state. Thus we get the page fault
address from the stack instead of CR2.

Signed-off-by: H. Peter Anvin (Intel) <hpa@zytor.com>
Tested-by: Shan Kang <shan.kang@intel.com>
Signed-off-by: Xin Li <xin3.li@intel.com>
---
 arch/x86/include/asm/fred.h |  2 ++
 arch/x86/mm/fault.c         | 18 ++++++++++++++++--
 2 files changed, 18 insertions(+), 2 deletions(-)

diff --git a/arch/x86/include/asm/fred.h b/arch/x86/include/asm/fred.h
index b45c1bea5b7f..fb8e7b4f2d38 100644
--- a/arch/x86/include/asm/fred.h
+++ b/arch/x86/include/asm/fred.h
@@ -112,6 +112,8 @@ static __always_inline unsigned long fred_event_data(struct pt_regs *regs)
 
 typedef DECLARE_FRED_HANDLER((*fred_handler));
 
+DECLARE_FRED_HANDLER(fred_exc_page_fault);
+
 #endif /* __ASSEMBLY__ */
 
 #endif /* CONFIG_X86_FRED */
diff --git a/arch/x86/mm/fault.c b/arch/x86/mm/fault.c
index e8711b2cafaf..dd3df092d0f2 100644
--- a/arch/x86/mm/fault.c
+++ b/arch/x86/mm/fault.c
@@ -34,6 +34,7 @@
 #include <asm/kvm_para.h>		/* kvm_handle_async_pf		*/
 #include <asm/vdso.h>			/* fixup_vdso_exception()	*/
 #include <asm/irq_stack.h>
+#include <asm/fred.h>
 
 #define CREATE_TRACE_POINTS
 #include <asm/trace/exceptions.h>
@@ -1495,9 +1496,10 @@ handle_page_fault(struct pt_regs *regs, unsigned long error_code,
 	}
 }
 
-DEFINE_IDTENTRY_RAW_ERRORCODE(exc_page_fault)
+static __always_inline void page_fault_common(struct pt_regs *regs,
+					      unsigned int error_code,
+					      unsigned long address)
 {
-	unsigned long address = read_cr2();
 	irqentry_state_t state;
 
 	prefetchw(&current->mm->mmap_lock);
@@ -1544,3 +1546,15 @@ DEFINE_IDTENTRY_RAW_ERRORCODE(exc_page_fault)
 
 	irqentry_exit(regs, state);
 }
+
+DEFINE_IDTENTRY_RAW_ERRORCODE(exc_page_fault)
+{
+	page_fault_common(regs, error_code, read_cr2());
+}
+
+#ifdef CONFIG_X86_FRED
+DEFINE_FRED_HANDLER(fred_exc_page_fault)
+{
+	page_fault_common(regs, regs->orig_ax, fred_event_data(regs));
+}
+#endif
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 32+ messages in thread

* [PATCH v9 19/36] x86/fred: Add a debug fault entry stub for FRED
  2023-07-31  6:32 [PATCH v9 00/36] x86: enable FRED for x86-64 Xin Li
                   ` (17 preceding siblings ...)
  2023-07-31  6:32 ` [PATCH v9 18/36] x86/fred: Add a page fault entry stub for FRED Xin Li
@ 2023-07-31  6:33 ` Xin Li
  2023-07-31  6:33 ` [PATCH v9 20/36] x86/fred: Add a NMI " Xin Li
                   ` (6 subsequent siblings)
  25 siblings, 0 replies; 32+ messages in thread
From: Xin Li @ 2023-07-31  6:33 UTC (permalink / raw)
  To: linux-doc, linux-kernel, linux-edac, linux-hyperv, kvm, xen-devel
  Cc: Jonathan Corbet, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
	Dave Hansen, x86, H . Peter Anvin, Andy Lutomirski,
	Oleg Nesterov, Tony Luck, K . Y . Srinivasan, Haiyang Zhang,
	Wei Liu, Dexuan Cui, Paolo Bonzini, Wanpeng Li, Vitaly Kuznetsov,
	Sean Christopherson, Peter Zijlstra, Juergen Gross,
	Stefano Stabellini, Oleksandr Tyshchenko, Josh Poimboeuf,
	Paul E . McKenney, Catalin Marinas, Randy Dunlap, Steven Rostedt,
	Kim Phillips, Xin Li, Hyeonggon Yoo, Liam R . Howlett,
	Sebastian Reichel, Kirill A . Shutemov, Suren Baghdasaryan,
	Pawan Gupta, Jiaxi Chen, Babu Moger, Jim Mattson, Sandipan Das,
	Lai Jiangshan, Hans de Goede, Reinette Chatre, Daniel Sneddon,
	Breno Leitao, Nikunj A Dadhania, Brian Gerst, Sami Tolvanen,
	Alexander Potapenko, Andrew Morton, Arnd Bergmann,
	Eric W . Biederman, Kees Cook, Masami Hiramatsu, Masahiro Yamada,
	Ze Gao, Fei Li, Conghui, Ashok Raj, Jason A . Donenfeld,
	Mark Rutland, Jacob Pan, Jiapeng Chong, Jane Malalane,
	David Woodhouse, Boris Ostrovsky, Arnaldo Carvalho de Melo,
	Yantengsi, Christophe Leroy, Sathvika Vasireddy

From: "H. Peter Anvin (Intel)" <hpa@zytor.com>

Add a debug fault entry stub for FRED.

On a FRED system, the debug trap status information (DR6) is passed
on the stack, to avoid the problem of transient state. Furthermore,
FRED transitions avoid a lot of ugly corner cases the handling of which
can, and should be, skipped.

The FRED debug trap status information saved on the stack differs from DR6
in both stickiness and polarity; it is exactly what debug_read_clear_dr6()
returns, and exc_debug_user()/exc_debug_kernel() expect.

Signed-off-by: H. Peter Anvin (Intel) <hpa@zytor.com>
Tested-by: Shan Kang <shan.kang@intel.com>
Signed-off-by: Xin Li <xin3.li@intel.com>
---

Changes since v1:
* call irqentry_nmi_{enter,exit}() in both IDT and FRED debug fault kernel
  handler (Peter Zijlstra).
---
 arch/x86/include/asm/fred.h |  1 +
 arch/x86/kernel/traps.c     | 56 +++++++++++++++++++++++++++----------
 2 files changed, 42 insertions(+), 15 deletions(-)

diff --git a/arch/x86/include/asm/fred.h b/arch/x86/include/asm/fred.h
index fb8e7b4f2d38..ad7b79130b1e 100644
--- a/arch/x86/include/asm/fred.h
+++ b/arch/x86/include/asm/fred.h
@@ -112,6 +112,7 @@ static __always_inline unsigned long fred_event_data(struct pt_regs *regs)
 
 typedef DECLARE_FRED_HANDLER((*fred_handler));
 
+DECLARE_FRED_HANDLER(fred_exc_debug);
 DECLARE_FRED_HANDLER(fred_exc_page_fault);
 
 #endif /* __ASSEMBLY__ */
diff --git a/arch/x86/kernel/traps.c b/arch/x86/kernel/traps.c
index 4a817d20ce3b..b10464966a81 100644
--- a/arch/x86/kernel/traps.c
+++ b/arch/x86/kernel/traps.c
@@ -47,6 +47,7 @@
 #include <asm/debugreg.h>
 #include <asm/realmode.h>
 #include <asm/text-patching.h>
+#include <asm/fred.h>
 #include <asm/ftrace.h>
 #include <asm/traps.h>
 #include <asm/desc.h>
@@ -1021,21 +1022,9 @@ static bool notify_debug(struct pt_regs *regs, unsigned long *dr6)
 	return false;
 }
 
-static __always_inline void exc_debug_kernel(struct pt_regs *regs,
-					     unsigned long dr6)
+static __always_inline void debug_kernel_common(struct pt_regs *regs,
+						unsigned long dr6)
 {
-	/*
-	 * Disable breakpoints during exception handling; recursive exceptions
-	 * are exceedingly 'fun'.
-	 *
-	 * Since this function is NOKPROBE, and that also applies to
-	 * HW_BREAKPOINT_X, we can't hit a breakpoint before this (XXX except a
-	 * HW_BREAKPOINT_W on our stack)
-	 *
-	 * Entry text is excluded for HW_BP_X and cpu_entry_area, which
-	 * includes the entry stack is excluded for everything.
-	 */
-	unsigned long dr7 = local_db_save();
 	irqentry_state_t irq_state = irqentry_nmi_enter(regs);
 	instrumentation_begin();
 
@@ -1063,7 +1052,8 @@ static __always_inline void exc_debug_kernel(struct pt_regs *regs,
 	 * Catch SYSENTER with TF set and clear DR_STEP. If this hit a
 	 * watchpoint at the same time then that will still be handled.
 	 */
-	if ((dr6 & DR_STEP) && is_sysenter_singlestep(regs))
+	if (!cpu_feature_enabled(X86_FEATURE_FRED) &&
+	    (dr6 & DR_STEP) && is_sysenter_singlestep(regs))
 		dr6 &= ~DR_STEP;
 
 	/*
@@ -1091,7 +1081,25 @@ static __always_inline void exc_debug_kernel(struct pt_regs *regs,
 out:
 	instrumentation_end();
 	irqentry_nmi_exit(regs, irq_state);
+}
 
+static __always_inline void exc_debug_kernel(struct pt_regs *regs,
+					     unsigned long dr6)
+{
+	/*
+	 * Disable breakpoints during exception handling; recursive exceptions
+	 * are exceedingly 'fun'.
+	 *
+	 * Since this function is NOKPROBE, and that also applies to
+	 * HW_BREAKPOINT_X, we can't hit a breakpoint before this (XXX except a
+	 * HW_BREAKPOINT_W on our stack)
+	 *
+	 * Entry text is excluded for HW_BP_X and cpu_entry_area, which
+	 * includes the entry stack is excluded for everything.
+	 */
+	unsigned long dr7 = local_db_save();
+
+	debug_kernel_common(regs, dr6);
 	local_db_restore(dr7);
 }
 
@@ -1180,6 +1188,24 @@ DEFINE_IDTENTRY_DEBUG_USER(exc_debug)
 {
 	exc_debug_user(regs, debug_read_clear_dr6());
 }
+
+# ifdef CONFIG_X86_FRED
+DEFINE_FRED_HANDLER(fred_exc_debug)
+{
+	/*
+	 * The FRED debug information saved onto stack differs from
+	 * DR6 in both stickiness and polarity; it is exactly what
+	 * debug_read_clear_dr6() returns.
+	 */
+	unsigned long dr6 = fred_event_data(regs);
+
+	if (user_mode(regs))
+		exc_debug_user(regs, dr6);
+	else
+		debug_kernel_common(regs, dr6);
+}
+# endif /* CONFIG_X86_FRED */
+
 #else
 /* 32 bit does not have separate entry points. */
 DEFINE_IDTENTRY_RAW(exc_debug)
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 32+ messages in thread

* [PATCH v9 20/36] x86/fred: Add a NMI entry stub for FRED
  2023-07-31  6:32 [PATCH v9 00/36] x86: enable FRED for x86-64 Xin Li
                   ` (18 preceding siblings ...)
  2023-07-31  6:33 ` [PATCH v9 19/36] x86/fred: Add a debug " Xin Li
@ 2023-07-31  6:33 ` Xin Li
  2023-07-31  6:33 ` [PATCH v9 21/36] x86/fred: Add a machine check " Xin Li
                   ` (5 subsequent siblings)
  25 siblings, 0 replies; 32+ messages in thread
From: Xin Li @ 2023-07-31  6:33 UTC (permalink / raw)
  To: linux-doc, linux-kernel, linux-edac, linux-hyperv, kvm, xen-devel
  Cc: Jonathan Corbet, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
	Dave Hansen, x86, H . Peter Anvin, Andy Lutomirski,
	Oleg Nesterov, Tony Luck, K . Y . Srinivasan, Haiyang Zhang,
	Wei Liu, Dexuan Cui, Paolo Bonzini, Wanpeng Li, Vitaly Kuznetsov,
	Sean Christopherson, Peter Zijlstra, Juergen Gross,
	Stefano Stabellini, Oleksandr Tyshchenko, Josh Poimboeuf,
	Paul E . McKenney, Catalin Marinas, Randy Dunlap, Steven Rostedt,
	Kim Phillips, Xin Li, Hyeonggon Yoo, Liam R . Howlett,
	Sebastian Reichel, Kirill A . Shutemov, Suren Baghdasaryan,
	Pawan Gupta, Jiaxi Chen, Babu Moger, Jim Mattson, Sandipan Das,
	Lai Jiangshan, Hans de Goede, Reinette Chatre, Daniel Sneddon,
	Breno Leitao, Nikunj A Dadhania, Brian Gerst, Sami Tolvanen,
	Alexander Potapenko, Andrew Morton, Arnd Bergmann,
	Eric W . Biederman, Kees Cook, Masami Hiramatsu, Masahiro Yamada,
	Ze Gao, Fei Li, Conghui, Ashok Raj, Jason A . Donenfeld,
	Mark Rutland, Jacob Pan, Jiapeng Chong, Jane Malalane,
	David Woodhouse, Boris Ostrovsky, Arnaldo Carvalho de Melo,
	Yantengsi, Christophe Leroy, Sathvika Vasireddy

From: "H. Peter Anvin (Intel)" <hpa@zytor.com>

On a FRED system, NMIs nest both with themselves and faults, transient
information is saved into the stack frame, and NMI unblocking only
happens when the stack frame indicates that so should happen.

Thus, the NMI entry stub for FRED is really quite small...

Signed-off-by: H. Peter Anvin (Intel) <hpa@zytor.com>
Tested-by: Shan Kang <shan.kang@intel.com>
Signed-off-by: Xin Li <xin3.li@intel.com>
---
 arch/x86/include/asm/fred.h |  1 +
 arch/x86/kernel/nmi.c       | 19 +++++++++++++++++++
 2 files changed, 20 insertions(+)

diff --git a/arch/x86/include/asm/fred.h b/arch/x86/include/asm/fred.h
index ad7b79130b1e..2a7c47dfd733 100644
--- a/arch/x86/include/asm/fred.h
+++ b/arch/x86/include/asm/fred.h
@@ -112,6 +112,7 @@ static __always_inline unsigned long fred_event_data(struct pt_regs *regs)
 
 typedef DECLARE_FRED_HANDLER((*fred_handler));
 
+DECLARE_FRED_HANDLER(fred_exc_nmi);
 DECLARE_FRED_HANDLER(fred_exc_debug);
 DECLARE_FRED_HANDLER(fred_exc_page_fault);
 
diff --git a/arch/x86/kernel/nmi.c b/arch/x86/kernel/nmi.c
index a0c551846b35..f803e2bcd024 100644
--- a/arch/x86/kernel/nmi.c
+++ b/arch/x86/kernel/nmi.c
@@ -34,6 +34,7 @@
 #include <asm/cache.h>
 #include <asm/nospec-branch.h>
 #include <asm/sev.h>
+#include <asm/fred.h>
 
 #define CREATE_TRACE_POINTS
 #include <trace/events/nmi.h>
@@ -643,6 +644,24 @@ void nmi_backtrace_stall_check(const struct cpumask *btp)
 
 #endif
 
+#ifdef CONFIG_X86_FRED
+DEFINE_FRED_HANDLER(fred_exc_nmi)
+{
+	/*
+	 * With FRED, CR2 and DR6 are pushed atomically on faults,
+	 * so we don't have to worry about saving and restoring them.
+	 * Breakpoint faults nest, so assume it is OK to leave DR7
+	 * enabled.
+	 */
+	irqentry_state_t irq_state = irqentry_nmi_enter(regs);
+
+	inc_irq_stat(__nmi_count);
+	default_do_nmi(regs);
+
+	irqentry_nmi_exit(regs, irq_state);
+}
+#endif
+
 void stop_nmi(void)
 {
 	ignore_nmis++;
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 32+ messages in thread

* [PATCH v9 21/36] x86/fred: Add a machine check entry stub for FRED
  2023-07-31  6:32 [PATCH v9 00/36] x86: enable FRED for x86-64 Xin Li
                   ` (19 preceding siblings ...)
  2023-07-31  6:33 ` [PATCH v9 20/36] x86/fred: Add a NMI " Xin Li
@ 2023-07-31  6:33 ` Xin Li
  2023-07-31  6:33 ` [PATCH v9 22/36] x86/fred: Add a double fault " Xin Li
                   ` (4 subsequent siblings)
  25 siblings, 0 replies; 32+ messages in thread
From: Xin Li @ 2023-07-31  6:33 UTC (permalink / raw)
  To: linux-doc, linux-kernel, linux-edac, linux-hyperv, kvm, xen-devel
  Cc: Jonathan Corbet, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
	Dave Hansen, x86, H . Peter Anvin, Andy Lutomirski,
	Oleg Nesterov, Tony Luck, K . Y . Srinivasan, Haiyang Zhang,
	Wei Liu, Dexuan Cui, Paolo Bonzini, Wanpeng Li, Vitaly Kuznetsov,
	Sean Christopherson, Peter Zijlstra, Juergen Gross,
	Stefano Stabellini, Oleksandr Tyshchenko, Josh Poimboeuf,
	Paul E . McKenney, Catalin Marinas, Randy Dunlap, Steven Rostedt,
	Kim Phillips, Xin Li, Hyeonggon Yoo, Liam R . Howlett,
	Sebastian Reichel, Kirill A . Shutemov, Suren Baghdasaryan,
	Pawan Gupta, Jiaxi Chen, Babu Moger, Jim Mattson, Sandipan Das,
	Lai Jiangshan, Hans de Goede, Reinette Chatre, Daniel Sneddon,
	Breno Leitao, Nikunj A Dadhania, Brian Gerst, Sami Tolvanen,
	Alexander Potapenko, Andrew Morton, Arnd Bergmann,
	Eric W . Biederman, Kees Cook, Masami Hiramatsu, Masahiro Yamada,
	Ze Gao, Fei Li, Conghui, Ashok Raj, Jason A . Donenfeld,
	Mark Rutland, Jacob Pan, Jiapeng Chong, Jane Malalane,
	David Woodhouse, Boris Ostrovsky, Arnaldo Carvalho de Melo,
	Yantengsi, Christophe Leroy, Sathvika Vasireddy

Add a machine check entry stub for FRED.

Tested-by: Shan Kang <shan.kang@intel.com>
Signed-off-by: Xin Li <xin3.li@intel.com>
---

Changes since v5:
* Disallow #DB inside #MCE for robustness sake (Peter Zijlstra).
---
 arch/x86/include/asm/fred.h    |  1 +
 arch/x86/kernel/cpu/mce/core.c | 15 +++++++++++++++
 2 files changed, 16 insertions(+)

diff --git a/arch/x86/include/asm/fred.h b/arch/x86/include/asm/fred.h
index 2a7c47dfd733..f559dd9dc4f2 100644
--- a/arch/x86/include/asm/fred.h
+++ b/arch/x86/include/asm/fred.h
@@ -115,6 +115,7 @@ typedef DECLARE_FRED_HANDLER((*fred_handler));
 DECLARE_FRED_HANDLER(fred_exc_nmi);
 DECLARE_FRED_HANDLER(fred_exc_debug);
 DECLARE_FRED_HANDLER(fred_exc_page_fault);
+DECLARE_FRED_HANDLER(fred_exc_machine_check);
 
 #endif /* __ASSEMBLY__ */
 
diff --git a/arch/x86/kernel/cpu/mce/core.c b/arch/x86/kernel/cpu/mce/core.c
index b8ad5a5b4026..98456e20f155 100644
--- a/arch/x86/kernel/cpu/mce/core.c
+++ b/arch/x86/kernel/cpu/mce/core.c
@@ -52,6 +52,7 @@
 #include <asm/mce.h>
 #include <asm/msr.h>
 #include <asm/reboot.h>
+#include <asm/fred.h>
 
 #include "internal.h"
 
@@ -2118,6 +2119,20 @@ DEFINE_IDTENTRY_MCE_USER(exc_machine_check)
 	exc_machine_check_user(regs);
 	local_db_restore(dr7);
 }
+
+#ifdef CONFIG_X86_FRED
+DEFINE_FRED_HANDLER(fred_exc_machine_check)
+{
+	unsigned long dr7;
+
+	dr7 = local_db_save();
+	if (user_mode(regs))
+		exc_machine_check_user(regs);
+	else
+		exc_machine_check_kernel(regs);
+	local_db_restore(dr7);
+}
+#endif
 #else
 /* 32bit unified entry point */
 DEFINE_IDTENTRY_RAW(exc_machine_check)
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 32+ messages in thread

* [PATCH v9 22/36] x86/fred: Add a double fault entry stub for FRED
  2023-07-31  6:32 [PATCH v9 00/36] x86: enable FRED for x86-64 Xin Li
                   ` (20 preceding siblings ...)
  2023-07-31  6:33 ` [PATCH v9 21/36] x86/fred: Add a machine check " Xin Li
@ 2023-07-31  6:33 ` Xin Li
  2023-07-31  6:33 ` [PATCH v9 23/36] x86/entry: Remove idtentry_sysvec from entry_{32,64}.S Xin Li
                   ` (3 subsequent siblings)
  25 siblings, 0 replies; 32+ messages in thread
From: Xin Li @ 2023-07-31  6:33 UTC (permalink / raw)
  To: linux-doc, linux-kernel, linux-edac, linux-hyperv, kvm, xen-devel
  Cc: Jonathan Corbet, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
	Dave Hansen, x86, H . Peter Anvin, Andy Lutomirski,
	Oleg Nesterov, Tony Luck, K . Y . Srinivasan, Haiyang Zhang,
	Wei Liu, Dexuan Cui, Paolo Bonzini, Wanpeng Li, Vitaly Kuznetsov,
	Sean Christopherson, Peter Zijlstra, Juergen Gross,
	Stefano Stabellini, Oleksandr Tyshchenko, Josh Poimboeuf,
	Paul E . McKenney, Catalin Marinas, Randy Dunlap, Steven Rostedt,
	Kim Phillips, Xin Li, Hyeonggon Yoo, Liam R . Howlett,
	Sebastian Reichel, Kirill A . Shutemov, Suren Baghdasaryan,
	Pawan Gupta, Jiaxi Chen, Babu Moger, Jim Mattson, Sandipan Das,
	Lai Jiangshan, Hans de Goede, Reinette Chatre, Daniel Sneddon,
	Breno Leitao, Nikunj A Dadhania, Brian Gerst, Sami Tolvanen,
	Alexander Potapenko, Andrew Morton, Arnd Bergmann,
	Eric W . Biederman, Kees Cook, Masami Hiramatsu, Masahiro Yamada,
	Ze Gao, Fei Li, Conghui, Ashok Raj, Jason A . Donenfeld,
	Mark Rutland, Jacob Pan, Jiapeng Chong, Jane Malalane,
	David Woodhouse, Boris Ostrovsky, Arnaldo Carvalho de Melo,
	Yantengsi, Christophe Leroy, Sathvika Vasireddy

The IDT event delivery of a double fault pushes an error code into the
orig_ax member of the pt_regs structure, and the error code is passed
as the second argument of its C-handler exc_double_fault(), although
the pt_regs structure is already passed as the first argument.

The existing IDT double fault asm entry code does the following

  movq ORIG_RAX(%rsp), %rsi	/* get error code into 2nd argument*/
  movq $-1, ORIG_RAX(%rsp)	/* no syscall to restart */

to set the orig_ax member to -1 just before calling the C-handler.

X86_TRAP_TS, X86_TRAP_NP, X86_TRAP_SS, X86_TRAP_GP, X86_TRAP_AC and
X86_TRAP_CP are all handled in the same way because the IDT event
delivery pushes an error code into their stack frame for them.

The commit d99015b1abbad ("x86: move entry_64.S register saving out of
the macros") introduced the changes to set orig_ax to -1, but I can't
see why. Our tests with FRED seem fine if orig_ax is left unchanged
instead of set to -1. It's probably cleaner and simpler to remove the
second argument from exc_double_fault() while leave orig_ax unchanged
to pass the error code inside the first argument, at least on native
x86_64. That would be a separate, pre-FRED, patch.

For now just add a double fault entry stub for FRED, which simply
calls the existing exc_double_fault().

Tested-by: Shan Kang <shan.kang@intel.com>
Signed-off-by: Xin Li <xin3.li@intel.com>
---
 arch/x86/include/asm/fred.h | 1 +
 arch/x86/kernel/traps.c     | 7 +++++++
 2 files changed, 8 insertions(+)

diff --git a/arch/x86/include/asm/fred.h b/arch/x86/include/asm/fred.h
index f559dd9dc4f2..bd701ac87528 100644
--- a/arch/x86/include/asm/fred.h
+++ b/arch/x86/include/asm/fred.h
@@ -116,6 +116,7 @@ DECLARE_FRED_HANDLER(fred_exc_nmi);
 DECLARE_FRED_HANDLER(fred_exc_debug);
 DECLARE_FRED_HANDLER(fred_exc_page_fault);
 DECLARE_FRED_HANDLER(fred_exc_machine_check);
+DECLARE_FRED_HANDLER(fred_exc_double_fault);
 
 #endif /* __ASSEMBLY__ */
 
diff --git a/arch/x86/kernel/traps.c b/arch/x86/kernel/traps.c
index b10464966a81..49dd92458eb0 100644
--- a/arch/x86/kernel/traps.c
+++ b/arch/x86/kernel/traps.c
@@ -555,6 +555,13 @@ DEFINE_IDTENTRY_DF(exc_double_fault)
 	instrumentation_end();
 }
 
+#ifdef CONFIG_X86_FRED
+DEFINE_FRED_HANDLER(fred_exc_double_fault)
+{
+	exc_double_fault(regs, regs->orig_ax);
+}
+#endif
+
 DEFINE_IDTENTRY(exc_bounds)
 {
 	if (notify_die(DIE_TRAP, "bounds", regs, 0,
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 32+ messages in thread

* [PATCH v9 23/36] x86/entry: Remove idtentry_sysvec from entry_{32,64}.S
  2023-07-31  6:32 [PATCH v9 00/36] x86: enable FRED for x86-64 Xin Li
                   ` (21 preceding siblings ...)
  2023-07-31  6:33 ` [PATCH v9 22/36] x86/fred: Add a double fault " Xin Li
@ 2023-07-31  6:33 ` Xin Li
  2023-07-31  6:33 ` [PATCH v9 24/36] x86/idtentry: Incorporate definitions/declarations of the FRED external interrupt handler type Xin Li
                   ` (2 subsequent siblings)
  25 siblings, 0 replies; 32+ messages in thread
From: Xin Li @ 2023-07-31  6:33 UTC (permalink / raw)
  To: linux-doc, linux-kernel, linux-edac, linux-hyperv, kvm, xen-devel
  Cc: Jonathan Corbet, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
	Dave Hansen, x86, H . Peter Anvin, Andy Lutomirski,
	Oleg Nesterov, Tony Luck, K . Y . Srinivasan, Haiyang Zhang,
	Wei Liu, Dexuan Cui, Paolo Bonzini, Wanpeng Li, Vitaly Kuznetsov,
	Sean Christopherson, Peter Zijlstra, Juergen Gross,
	Stefano Stabellini, Oleksandr Tyshchenko, Josh Poimboeuf,
	Paul E . McKenney, Catalin Marinas, Randy Dunlap, Steven Rostedt,
	Kim Phillips, Xin Li, Hyeonggon Yoo, Liam R . Howlett,
	Sebastian Reichel, Kirill A . Shutemov, Suren Baghdasaryan,
	Pawan Gupta, Jiaxi Chen, Babu Moger, Jim Mattson, Sandipan Das,
	Lai Jiangshan, Hans de Goede, Reinette Chatre, Daniel Sneddon,
	Breno Leitao, Nikunj A Dadhania, Brian Gerst, Sami Tolvanen,
	Alexander Potapenko, Andrew Morton, Arnd Bergmann,
	Eric W . Biederman, Kees Cook, Masami Hiramatsu, Masahiro Yamada,
	Ze Gao, Fei Li, Conghui, Ashok Raj, Jason A . Donenfeld,
	Mark Rutland, Jacob Pan, Jiapeng Chong, Jane Malalane,
	David Woodhouse, Boris Ostrovsky, Arnaldo Carvalho de Melo,
	Yantengsi, Christophe Leroy, Sathvika Vasireddy

idtentry_sysvec is really just DECLARE_IDTENTRY defined in
<asm/idtentry.h>, no need to define it separately.

Signed-off-by: Xin Li <xin3.li@intel.com>
---
 arch/x86/entry/entry_32.S       | 4 ----
 arch/x86/entry/entry_64.S       | 8 --------
 arch/x86/include/asm/idtentry.h | 2 +-
 3 files changed, 1 insertion(+), 13 deletions(-)

diff --git a/arch/x86/entry/entry_32.S b/arch/x86/entry/entry_32.S
index 6e6af42e044a..e0f22ad8ff7e 100644
--- a/arch/x86/entry/entry_32.S
+++ b/arch/x86/entry/entry_32.S
@@ -649,10 +649,6 @@ SYM_CODE_START_LOCAL(asm_\cfunc)
 SYM_CODE_END(asm_\cfunc)
 .endm
 
-.macro idtentry_sysvec vector cfunc
-	idtentry \vector asm_\cfunc \cfunc has_error_code=0
-.endm
-
 /*
  * Include the defines which emit the idt entries which are shared
  * shared between 32 and 64 bit and emit the __irqentry_text_* markers
diff --git a/arch/x86/entry/entry_64.S b/arch/x86/entry/entry_64.S
index 8069151176f2..44f14b990597 100644
--- a/arch/x86/entry/entry_64.S
+++ b/arch/x86/entry/entry_64.S
@@ -438,14 +438,6 @@ SYM_CODE_END(\asmsym)
 	idtentry \vector asm_\cfunc \cfunc has_error_code=1
 .endm
 
-/*
- * System vectors which invoke their handlers directly and are not
- * going through the regular common device interrupt handling code.
- */
-.macro idtentry_sysvec vector cfunc
-	idtentry \vector asm_\cfunc \cfunc has_error_code=0
-.endm
-
 /**
  * idtentry_mce_db - Macro to generate entry stubs for #MC and #DB
  * @vector:		Vector number
diff --git a/arch/x86/include/asm/idtentry.h b/arch/x86/include/asm/idtentry.h
index cd5c10a74071..6817c0f8e323 100644
--- a/arch/x86/include/asm/idtentry.h
+++ b/arch/x86/include/asm/idtentry.h
@@ -447,7 +447,7 @@ __visible noinstr void func(struct pt_regs *regs,			\
 
 /* System vector entries */
 #define DECLARE_IDTENTRY_SYSVEC(vector, func)				\
-	idtentry_sysvec vector func
+	DECLARE_IDTENTRY(vector, func)
 
 #ifdef CONFIG_X86_64
 # define DECLARE_IDTENTRY_MCE(vector, func)				\
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 32+ messages in thread

* [PATCH v9 24/36] x86/idtentry: Incorporate definitions/declarations of the FRED external interrupt handler type
  2023-07-31  6:32 [PATCH v9 00/36] x86: enable FRED for x86-64 Xin Li
                   ` (22 preceding siblings ...)
  2023-07-31  6:33 ` [PATCH v9 23/36] x86/entry: Remove idtentry_sysvec from entry_{32,64}.S Xin Li
@ 2023-07-31  6:33 ` Xin Li
  2023-07-31  6:33 ` [PATCH v9 25/36] x86/traps: Add a system interrupt handler table for system interrupt dispatch Xin Li
  2023-07-31 22:29 ` [PATCH v9 00/36] x86: enable FRED for x86-64 Sean Christopherson
  25 siblings, 0 replies; 32+ messages in thread
From: Xin Li @ 2023-07-31  6:33 UTC (permalink / raw)
  To: linux-doc, linux-kernel, linux-edac, linux-hyperv, kvm, xen-devel
  Cc: Jonathan Corbet, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
	Dave Hansen, x86, H . Peter Anvin, Andy Lutomirski,
	Oleg Nesterov, Tony Luck, K . Y . Srinivasan, Haiyang Zhang,
	Wei Liu, Dexuan Cui, Paolo Bonzini, Wanpeng Li, Vitaly Kuznetsov,
	Sean Christopherson, Peter Zijlstra, Juergen Gross,
	Stefano Stabellini, Oleksandr Tyshchenko, Josh Poimboeuf,
	Paul E . McKenney, Catalin Marinas, Randy Dunlap, Steven Rostedt,
	Kim Phillips, Xin Li, Hyeonggon Yoo, Liam R . Howlett,
	Sebastian Reichel, Kirill A . Shutemov, Suren Baghdasaryan,
	Pawan Gupta, Jiaxi Chen, Babu Moger, Jim Mattson, Sandipan Das,
	Lai Jiangshan, Hans de Goede, Reinette Chatre, Daniel Sneddon,
	Breno Leitao, Nikunj A Dadhania, Brian Gerst, Sami Tolvanen,
	Alexander Potapenko, Andrew Morton, Arnd Bergmann,
	Eric W . Biederman, Kees Cook, Masami Hiramatsu, Masahiro Yamada,
	Ze Gao, Fei Li, Conghui, Ashok Raj, Jason A . Donenfeld,
	Mark Rutland, Jacob Pan, Jiapeng Chong, Jane Malalane,
	David Woodhouse, Boris Ostrovsky, Arnaldo Carvalho de Melo,
	Yantengsi, Christophe Leroy, Sathvika Vasireddy

FRED operates differently from IDT in terms of interrupt handling.
Instead of directly dispatching an interrupt to its handler based
on the interrupt vector, FRED requires the software to dispatch
an event to its handler based on both the event's type and vector.
Therefore, an event dispatch framework must be implemented to
facilitate the event-to-handler dispatch process.

The FRED event dispatch framework assumes control once an event is
delivered, starting from two FRED entry points, after which several
event dispatch tables are introduced to facilitate the dispatching.
The first level dispatching is event type based, and two tables need
to be defined, one for ring 3 event dispatching, and the other for
ring 0. The second level dispatching is event vector based, and
several tables need to be defined, e.g., an exception handler table
for exception dispatching.

Handlers in these tables are typically noinstr. However for external
interrupt dispatching, irqentry_{enter,exit}() and
instrumentation_{begin,end}() can be extracted from respective interrupt
handler to the dispatch framework. As a result, FRED external interrupt
handlers don't need to be noinstr.

Incorporate definitions/declarations of FRED external interrupt handler
types into the IDT entry macros.

It is probably better to rename idtentry as event_entry.

Tested-by: Shan Kang <shan.kang@intel.com>
Signed-off-by: Xin Li <xin3.li@intel.com>
---

Changes since v8:
* Put IDTENTRY changes in a separate patch (Thomas Gleixner).
---
 arch/x86/include/asm/idtentry.h | 91 +++++++++++++++++++++++++++++----
 1 file changed, 82 insertions(+), 9 deletions(-)

diff --git a/arch/x86/include/asm/idtentry.h b/arch/x86/include/asm/idtentry.h
index 6817c0f8e323..e67d111bf932 100644
--- a/arch/x86/include/asm/idtentry.h
+++ b/arch/x86/include/asm/idtentry.h
@@ -167,17 +167,22 @@ __visible noinstr void func(struct pt_regs *regs, unsigned long error_code)
 
 /**
  * DECLARE_IDTENTRY_IRQ - Declare functions for device interrupt IDT entry
- *			  points (common/spurious)
+ *			  points (common/spurious) and their corresponding
+ *			  software based dispatch handlers in the non-noinstr
+ *			  text section
  * @vector:	Vector number (ignored for C)
  * @func:	Function name of the entry point
  *
- * Maps to DECLARE_IDTENTRY_ERRORCODE()
+ * Maps to DECLARE_IDTENTRY_ERRORCODE(), plus a dispatch function prototype
  */
 #define DECLARE_IDTENTRY_IRQ(vector, func)				\
-	DECLARE_IDTENTRY_ERRORCODE(vector, func)
+	DECLARE_IDTENTRY_ERRORCODE(vector, func);			\
+	void dispatch_##func(struct pt_regs *regs, unsigned long error_code)
 
 /**
  * DEFINE_IDTENTRY_IRQ - Emit code for device interrupt IDT entry points
+ *			 and their corresponding software based dispatch
+ *			 handlers in the non-noinstr text section
  * @func:	Function name of the entry point
  *
  * The vector number is pushed by the low level entry stub and handed
@@ -187,6 +192,11 @@ __visible noinstr void func(struct pt_regs *regs, unsigned long error_code)
  * irq_enter/exit_rcu() are invoked before the function body and the
  * KVM L1D flush request is set. Stack switching to the interrupt stack
  * has to be done in the function body if necessary.
+ *
+ * dispatch_func() is a software based dispatch handler in the non-noinstr
+ * text section, assuming the irqentry_{enter,exit}() and
+ * instrumentation_{begin,end}() helpers are invoked in the external
+ * interrupt dispatch framework before and after dispatch_func().
  */
 #define DEFINE_IDTENTRY_IRQ(func)					\
 static void __##func(struct pt_regs *regs, u32 vector);			\
@@ -204,31 +214,68 @@ __visible noinstr void func(struct pt_regs *regs,			\
 	irqentry_exit(regs, state);					\
 }									\
 									\
+__visible void dispatch_##func(struct pt_regs *regs,			\
+			       unsigned long error_code)		\
+{									\
+	u32 vector = (u32)(u8)error_code;				\
+									\
+	kvm_set_cpu_l1tf_flush_l1d();					\
+	run_irq_on_irqstack_cond(__##func, regs, vector);		\
+}									\
+									\
 static noinline void __##func(struct pt_regs *regs, u32 vector)
 
+/*
+ * Define a function type system_interrupt_handler as the element type of
+ * the table system_interrupt_handlers.
+ *
+ * System interrupt handlers don't take any interrupt vector number, or
+ * any interrupt error code as arguments, as a system interrupt handler
+ * is defined to handle a specific interrupt vector, and no error code
+ * is defined for external interrupts. It takes only one argument of type
+ * struct pt_regs *.
+ */
+#define DECLARE_SYSTEM_INTERRUPT_HANDLER(f) \
+	void f (struct pt_regs *regs)
+#define DEFINE_SYSTEM_INTERRUPT_HANDLER(f) \
+	__visible DECLARE_SYSTEM_INTERRUPT_HANDLER(f)
+typedef DECLARE_SYSTEM_INTERRUPT_HANDLER((*system_interrupt_handler));
+
 /**
  * DECLARE_IDTENTRY_SYSVEC - Declare functions for system vector entry points
+ *			     and their corresponding software based dispatch
+ *			     handlers in the non-noinstr text section
  * @vector:	Vector number (ignored for C)
  * @func:	Function name of the entry point
  *
- * Declares three functions:
+ * Declares four functions:
  * - The ASM entry point: asm_##func
  * - The XEN PV trap entry point: xen_##func (maybe unused)
  * - The C handler called from the ASM entry point
+ * - The C handler used in the system interrupt handler table
  *
- * Maps to DECLARE_IDTENTRY().
+ * Maps to DECLARE_IDTENTRY(), plus a dispatch table function prototype
  */
 #define DECLARE_IDTENTRY_SYSVEC(vector, func)				\
-	DECLARE_IDTENTRY(vector, func)
+	DECLARE_IDTENTRY(vector, func);					\
+	DECLARE_SYSTEM_INTERRUPT_HANDLER(dispatch_table_##func)
 
 /**
  * DEFINE_IDTENTRY_SYSVEC - Emit code for system vector IDT entry points
+ *			    and their corresponding software based dispatch
+ *			    handlers in the non-noinstr text section
  * @func:	Function name of the entry point
  *
  * irqentry_enter/exit() and irq_enter/exit_rcu() are invoked before the
  * function body. KVM L1D flush request is set.
  *
- * Runs the function on the interrupt stack if the entry hit kernel mode
+ * Runs the function on the interrupt stack if the entry hit kernel mode.
+ *
+ * dispatch_table_func() is used to fill the system interrupt handler table
+ * for system interrupts dispatching, assuming the irqentry_{enter,exit}()
+ * and instrumentation_{begin,end}() helpers are invoked in the external
+ * interrupt dispatch framework before and after dispatch_table_func(),
+ * thus in the non-noinstr text section.
  */
 #define DEFINE_IDTENTRY_SYSVEC(func)					\
 static void __##func(struct pt_regs *regs);				\
@@ -244,11 +291,19 @@ __visible noinstr void func(struct pt_regs *regs)			\
 	irqentry_exit(regs, state);					\
 }									\
 									\
+DEFINE_SYSTEM_INTERRUPT_HANDLER(dispatch_table_##func)			\
+{									\
+	kvm_set_cpu_l1tf_flush_l1d();					\
+	run_sysvec_on_irqstack_cond(__##func, regs);			\
+}									\
+									\
 static noinline void __##func(struct pt_regs *regs)
 
 /**
  * DEFINE_IDTENTRY_SYSVEC_SIMPLE - Emit code for simple system vector IDT
- *				   entry points
+ *				   entry points and their corresponding
+ *				   software based dispatch handlers in
+ *				   the non-noinstr text section
  * @func:	Function name of the entry point
  *
  * Runs the function on the interrupted stack. No switch to IRQ stack and
@@ -256,6 +311,12 @@ static noinline void __##func(struct pt_regs *regs)
  *
  * Only use for 'empty' vectors like reschedule IPI and KVM posted
  * interrupt vectors.
+ *
+ * dispatch_table_func() is used to fill the system interrupt handler table
+ * for system interrupts dispatching, assuming the irqentry_{enter,exit}()
+ * and instrumentation_{begin,end}() helpers are invoked in the external
+ * interrupt dispatch framework before and after dispatch_table_func(),
+ * thus in the non-noinstr text section.
  */
 #define DEFINE_IDTENTRY_SYSVEC_SIMPLE(func)				\
 static __always_inline void __##func(struct pt_regs *regs);		\
@@ -273,6 +334,14 @@ __visible noinstr void func(struct pt_regs *regs)			\
 	irqentry_exit(regs, state);					\
 }									\
 									\
+DEFINE_SYSTEM_INTERRUPT_HANDLER(dispatch_table_##func)			\
+{									\
+	__irq_enter_raw();						\
+	kvm_set_cpu_l1tf_flush_l1d();					\
+	__##func (regs);						\
+	__irq_exit_raw();						\
+}									\
+									\
 static __always_inline void __##func(struct pt_regs *regs)
 
 /**
@@ -647,7 +716,11 @@ DECLARE_IDTENTRY_SYSVEC(X86_PLATFORM_IPI_VECTOR,	sysvec_x86_platform_ipi);
 #endif
 
 #ifdef CONFIG_SMP
-DECLARE_IDTENTRY(RESCHEDULE_VECTOR,			sysvec_reschedule_ipi);
+/*
+ * Use DECLARE_IDTENTRY_SYSVEC instead of DECLARE_IDTENTRY to add a
+ * software based dispatch handler declaration for RESCHEDULE_VECTOR.
+ */
+DECLARE_IDTENTRY_SYSVEC(RESCHEDULE_VECTOR,		sysvec_reschedule_ipi);
 DECLARE_IDTENTRY_SYSVEC(REBOOT_VECTOR,			sysvec_reboot);
 DECLARE_IDTENTRY_SYSVEC(CALL_FUNCTION_SINGLE_VECTOR,	sysvec_call_function_single);
 DECLARE_IDTENTRY_SYSVEC(CALL_FUNCTION_VECTOR,		sysvec_call_function);
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 32+ messages in thread

* [PATCH v9 25/36] x86/traps: Add a system interrupt handler table for system interrupt dispatch
  2023-07-31  6:32 [PATCH v9 00/36] x86: enable FRED for x86-64 Xin Li
                   ` (23 preceding siblings ...)
  2023-07-31  6:33 ` [PATCH v9 24/36] x86/idtentry: Incorporate definitions/declarations of the FRED external interrupt handler type Xin Li
@ 2023-07-31  6:33 ` Xin Li
  2023-07-31 22:29 ` [PATCH v9 00/36] x86: enable FRED for x86-64 Sean Christopherson
  25 siblings, 0 replies; 32+ messages in thread
From: Xin Li @ 2023-07-31  6:33 UTC (permalink / raw)
  To: linux-doc, linux-kernel, linux-edac, linux-hyperv, kvm, xen-devel
  Cc: Jonathan Corbet, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
	Dave Hansen, x86, H . Peter Anvin, Andy Lutomirski,
	Oleg Nesterov, Tony Luck, K . Y . Srinivasan, Haiyang Zhang,
	Wei Liu, Dexuan Cui, Paolo Bonzini, Wanpeng Li, Vitaly Kuznetsov,
	Sean Christopherson, Peter Zijlstra, Juergen Gross,
	Stefano Stabellini, Oleksandr Tyshchenko, Josh Poimboeuf,
	Paul E . McKenney, Catalin Marinas, Randy Dunlap, Steven Rostedt,
	Kim Phillips, Xin Li, Hyeonggon Yoo, Liam R . Howlett,
	Sebastian Reichel, Kirill A . Shutemov, Suren Baghdasaryan,
	Pawan Gupta, Jiaxi Chen, Babu Moger, Jim Mattson, Sandipan Das,
	Lai Jiangshan, Hans de Goede, Reinette Chatre, Daniel Sneddon,
	Breno Leitao, Nikunj A Dadhania, Brian Gerst, Sami Tolvanen,
	Alexander Potapenko, Andrew Morton, Arnd Bergmann,
	Eric W . Biederman, Kees Cook, Masami Hiramatsu, Masahiro Yamada,
	Ze Gao, Fei Li, Conghui, Ashok Raj, Jason A . Donenfeld,
	Mark Rutland, Jacob Pan, Jiapeng Chong, Jane Malalane,
	David Woodhouse, Boris Ostrovsky, Arnaldo Carvalho de Melo,
	Yantengsi, Christophe Leroy, Sathvika Vasireddy

From: "H. Peter Anvin (Intel)" <hpa@zytor.com>

On x86, external interrupts can be categorized into two groups:
  1) System interrupts
  2) External device interrupts

All external device interrupts are directed to the common_interrupt(),
which, in turn, dispatches these external device interrupts using a
per-CPU external device interrupt dispatch table vector_irq.

To handle system interrupts, a system interrupt handler table needs to
be introduced. This table enables the direct dispatching of a system
interrupt to its corresponding handler. As a result, a software-based
dispatch function will be implemented as:

  void external_interrupt(struct pt_regs *regs)
  {
    u8 vector = regs->vector;
    if (is_system_interrupt(vector))
      system_interrupt_handlers[vector_to_sysvec(vector)](regs);
    else /* external device interrupt */
      common_interrupt(regs);
  }

Signed-off-by: H. Peter Anvin (Intel) <hpa@zytor.com>
Co-developed-by: Xin Li <xin3.li@intel.com>
Tested-by: Shan Kang <shan.kang@intel.com>
Signed-off-by: Xin Li <xin3.li@intel.com>
---

Changes since v8:
* Remove junk code that assumes no local APIC on x86_64 (Thomas Gleixner).

Changes since v5:
* Initialize system_interrupt_handlers with dispatch_table_spurious_interrupt()
  instead of NULL to get rid of any NULL check (Peter Zijlstra).
---
 arch/x86/kernel/traps.c | 50 +++++++++++++++++++++++++++++++++++++++++
 1 file changed, 50 insertions(+)

diff --git a/arch/x86/kernel/traps.c b/arch/x86/kernel/traps.c
index 49dd92458eb0..e430a8c47931 100644
--- a/arch/x86/kernel/traps.c
+++ b/arch/x86/kernel/traps.c
@@ -1488,6 +1488,56 @@ DEFINE_IDTENTRY_SW(iret_error)
 }
 #endif
 
+#ifdef CONFIG_X86_64
+
+static void dispatch_table_spurious_interrupt(struct pt_regs *regs)
+{
+	dispatch_spurious_interrupt(regs, regs->vector);
+}
+
+#define SYSV(x,y) [(x) - FIRST_SYSTEM_VECTOR] = y
+
+static system_interrupt_handler system_interrupt_handlers[NR_SYSTEM_VECTORS] = {
+	[0 ... NR_SYSTEM_VECTORS-1]		= dispatch_table_spurious_interrupt,
+#ifdef CONFIG_SMP
+	SYSV(RESCHEDULE_VECTOR,			dispatch_table_sysvec_reschedule_ipi),
+	SYSV(CALL_FUNCTION_VECTOR,		dispatch_table_sysvec_call_function),
+	SYSV(CALL_FUNCTION_SINGLE_VECTOR,	dispatch_table_sysvec_call_function_single),
+	SYSV(REBOOT_VECTOR,			dispatch_table_sysvec_reboot),
+#endif
+
+#ifdef CONFIG_X86_THERMAL_VECTOR
+	SYSV(THERMAL_APIC_VECTOR,		dispatch_table_sysvec_thermal),
+#endif
+
+#ifdef CONFIG_X86_MCE_THRESHOLD
+	SYSV(THRESHOLD_APIC_VECTOR,		dispatch_table_sysvec_threshold),
+#endif
+
+#ifdef CONFIG_X86_MCE_AMD
+	SYSV(DEFERRED_ERROR_VECTOR,		dispatch_table_sysvec_deferred_error),
+#endif
+
+#ifdef CONFIG_X86_LOCAL_APIC
+	SYSV(LOCAL_TIMER_VECTOR,		dispatch_table_sysvec_apic_timer_interrupt),
+	SYSV(X86_PLATFORM_IPI_VECTOR,		dispatch_table_sysvec_x86_platform_ipi),
+# ifdef CONFIG_HAVE_KVM
+	SYSV(POSTED_INTR_VECTOR,		dispatch_table_sysvec_kvm_posted_intr_ipi),
+	SYSV(POSTED_INTR_WAKEUP_VECTOR,		dispatch_table_sysvec_kvm_posted_intr_wakeup_ipi),
+	SYSV(POSTED_INTR_NESTED_VECTOR,		dispatch_table_sysvec_kvm_posted_intr_nested_ipi),
+# endif
+# ifdef CONFIG_IRQ_WORK
+	SYSV(IRQ_WORK_VECTOR,			dispatch_table_sysvec_irq_work),
+# endif
+	SYSV(SPURIOUS_APIC_VECTOR,		dispatch_table_sysvec_spurious_apic_interrupt),
+	SYSV(ERROR_APIC_VECTOR,			dispatch_table_sysvec_error_interrupt),
+#endif
+};
+
+#undef SYSV
+
+#endif /* CONFIG_X86_64 */
+
 void __init trap_init(void)
 {
 	/* Init cpu_entry_area before IST entries are set up */
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 32+ messages in thread

* Re: [PATCH v9 05/36] x86/opcode: Add ERETU, ERETS instructions to x86-opcode-map
  2023-07-31  6:32 ` [PATCH v9 05/36] x86/opcode: Add ERETU, ERETS instructions to x86-opcode-map Xin Li
@ 2023-07-31  9:43   ` Masami Hiramatsu
  2023-07-31 16:36     ` Li, Xin3
  0 siblings, 1 reply; 32+ messages in thread
From: Masami Hiramatsu @ 2023-07-31  9:43 UTC (permalink / raw)
  To: Xin Li
  Cc: linux-doc, linux-kernel, linux-edac, linux-hyperv, kvm,
	xen-devel, Jonathan Corbet, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, Dave Hansen, x86, H . Peter Anvin,
	Andy Lutomirski, Oleg Nesterov, Tony Luck, K . Y . Srinivasan,
	Haiyang Zhang, Wei Liu, Dexuan Cui, Paolo Bonzini, Wanpeng Li,
	Vitaly Kuznetsov, Sean Christopherson, Peter Zijlstra,
	Juergen Gross, Stefano Stabellini, Oleksandr Tyshchenko,
	Josh Poimboeuf, Paul E . McKenney, Catalin Marinas, Randy Dunlap,
	Steven Rostedt, Kim Phillips, Hyeonggon Yoo, Liam R . Howlett,
	Sebastian Reichel, Kirill A . Shutemov, Suren Baghdasaryan,
	Pawan Gupta, Jiaxi Chen, Babu Moger, Jim Mattson, Sandipan Das,
	Lai Jiangshan, Hans de Goede, Reinette Chatre, Daniel Sneddon,
	Breno Leitao, Nikunj A Dadhania, Brian Gerst, Sami Tolvanen,
	Alexander Potapenko, Andrew Morton, Arnd Bergmann,
	Eric W . Biederman, Kees Cook, Masami Hiramatsu, Masahiro Yamada,
	Ze Gao, Fei Li, Conghui, Ashok Raj, Jason A . Donenfeld,
	Mark Rutland, Jacob Pan, Jiapeng Chong, Jane Malalane,
	David Woodhouse, Boris Ostrovsky, Arnaldo Carvalho de Melo,
	Yantengsi, Christophe Leroy, Sathvika Vasireddy

On Sun, 30 Jul 2023 23:32:46 -0700
Xin Li <xin3.li@intel.com> wrote:

> From: "H. Peter Anvin (Intel)" <hpa@zytor.com>
> 
> Add instruction opcodes used by FRED ERETU/ERETS to x86-opcode-map.
> 
> Opcode numbers are per FRED spec v5.0.
> 
> Signed-off-by: H. Peter Anvin (Intel) <hpa@zytor.com>
> Tested-by: Shan Kang <shan.kang@intel.com>
> Signed-off-by: Xin Li <xin3.li@intel.com>

This looks good to me. (ERETS has the opcode F2 0F 01 CA, ERETU
has the opcode F3 0F 01 CA)

Reviewed-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>

Thank you,

> ---
>  arch/x86/lib/x86-opcode-map.txt       | 2 +-
>  tools/arch/x86/lib/x86-opcode-map.txt | 2 +-
>  2 files changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/arch/x86/lib/x86-opcode-map.txt b/arch/x86/lib/x86-opcode-map.txt
> index 5168ee0360b2..7a269e269dc0 100644
> --- a/arch/x86/lib/x86-opcode-map.txt
> +++ b/arch/x86/lib/x86-opcode-map.txt
> @@ -1052,7 +1052,7 @@ EndTable
>  
>  GrpTable: Grp7
>  0: SGDT Ms | VMCALL (001),(11B) | VMLAUNCH (010),(11B) | VMRESUME (011),(11B) | VMXOFF (100),(11B) | PCONFIG (101),(11B) | ENCLV (000),(11B)
> -1: SIDT Ms | MONITOR (000),(11B) | MWAIT (001),(11B) | CLAC (010),(11B) | STAC (011),(11B) | ENCLS (111),(11B)
> +1: SIDT Ms | MONITOR (000),(11B) | MWAIT (001),(11B) | CLAC (010),(11B) | STAC (011),(11B) | ENCLS (111),(11B) | ERETU (F3),(010),(11B) | ERETS (F2),(010),(11B)
>  2: LGDT Ms | XGETBV (000),(11B) | XSETBV (001),(11B) | VMFUNC (100),(11B) | XEND (101)(11B) | XTEST (110)(11B) | ENCLU (111),(11B)
>  3: LIDT Ms
>  4: SMSW Mw/Rv
> diff --git a/tools/arch/x86/lib/x86-opcode-map.txt b/tools/arch/x86/lib/x86-opcode-map.txt
> index 5168ee0360b2..7a269e269dc0 100644
> --- a/tools/arch/x86/lib/x86-opcode-map.txt
> +++ b/tools/arch/x86/lib/x86-opcode-map.txt
> @@ -1052,7 +1052,7 @@ EndTable
>  
>  GrpTable: Grp7
>  0: SGDT Ms | VMCALL (001),(11B) | VMLAUNCH (010),(11B) | VMRESUME (011),(11B) | VMXOFF (100),(11B) | PCONFIG (101),(11B) | ENCLV (000),(11B)
> -1: SIDT Ms | MONITOR (000),(11B) | MWAIT (001),(11B) | CLAC (010),(11B) | STAC (011),(11B) | ENCLS (111),(11B)
> +1: SIDT Ms | MONITOR (000),(11B) | MWAIT (001),(11B) | CLAC (010),(11B) | STAC (011),(11B) | ENCLS (111),(11B) | ERETU (F3),(010),(11B) | ERETS (F2),(010),(11B)
>  2: LGDT Ms | XGETBV (000),(11B) | XSETBV (001),(11B) | VMFUNC (100),(11B) | XEND (101)(11B) | XTEST (110)(11B) | ENCLU (111),(11B)
>  3: LIDT Ms
>  4: SMSW Mw/Rv
> -- 
> 2.34.1
> 


-- 
Masami Hiramatsu (Google) <mhiramat@kernel.org>

^ permalink raw reply	[flat|nested] 32+ messages in thread

* RE: [PATCH v9 05/36] x86/opcode: Add ERETU, ERETS instructions to x86-opcode-map
  2023-07-31  9:43   ` Masami Hiramatsu
@ 2023-07-31 16:36     ` Li, Xin3
  0 siblings, 0 replies; 32+ messages in thread
From: Li, Xin3 @ 2023-07-31 16:36 UTC (permalink / raw)
  To: Masami Hiramatsu
  Cc: linux-doc, linux-kernel, linux-edac, linux-hyperv, kvm,
	xen-devel, Jonathan Corbet, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, Dave Hansen, x86, H . Peter Anvin, Lutomirski,
	Andy, Oleg Nesterov, Luck, Tony, K . Y . Srinivasan,
	Haiyang Zhang, Wei Liu, Cui, Dexuan, Paolo Bonzini, Wanpeng Li,
	Vitaly Kuznetsov, Christopherson,,
	Sean, Peter Zijlstra, Gross, Jurgen, Stefano Stabellini,
	Oleksandr Tyshchenko, Josh Poimboeuf, Paul E . McKenney,
	Catalin Marinas, Randy Dunlap, Steven Rostedt, Kim Phillips,
	Hyeonggon Yoo, Liam R . Howlett, Sebastian Reichel,
	Kirill A . Shutemov, Suren Baghdasaryan, Pawan Gupta, Jiaxi Chen,
	Babu Moger, Jim Mattson, Sandipan Das, Lai Jiangshan,
	Hans de Goede, Chatre, Reinette, Daniel Sneddon, Breno Leitao,
	Nikunj A Dadhania, Brian Gerst, Sami Tolvanen,
	Alexander Potapenko, Andrew Morton, Arnd Bergmann,
	Eric W . Biederman, Kees Cook, Masahiro Yamada, Ze Gao, Li, Fei1,
	Conghui, Raj, Ashok, Jason A . Donenfeld, Mark Rutland,
	Jacob Pan, Jiapeng Chong, Jane Malalane, Woodhouse, David,
	Ostrovsky, Boris, Arnaldo Carvalho de Melo, Yantengsi,
	Christophe Leroy, Sathvika Vasireddy

> > Add instruction opcodes used by FRED ERETU/ERETS to x86-opcode-map.
> >
> > Opcode numbers are per FRED spec v5.0.
> >
> > Signed-off-by: H. Peter Anvin (Intel) <hpa@zytor.com>
> > Tested-by: Shan Kang <shan.kang@intel.com>
> > Signed-off-by: Xin Li <xin3.li@intel.com>
> 
> This looks good to me. (ERETS has the opcode F2 0F 01 CA, ERETU has the opcode
> F3 0F 01 CA)
> 
> Reviewed-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
> 
> Thank you,

Thanks! Will add your RB.
  Xin


^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH v9 00/36] x86: enable FRED for x86-64
  2023-07-31  6:32 [PATCH v9 00/36] x86: enable FRED for x86-64 Xin Li
                   ` (24 preceding siblings ...)
  2023-07-31  6:33 ` [PATCH v9 25/36] x86/traps: Add a system interrupt handler table for system interrupt dispatch Xin Li
@ 2023-07-31 22:29 ` Sean Christopherson
  2023-07-31 23:10   ` Li, Xin3
  25 siblings, 1 reply; 32+ messages in thread
From: Sean Christopherson @ 2023-07-31 22:29 UTC (permalink / raw)
  To: Xin Li
  Cc: linux-doc, linux-kernel, linux-edac, linux-hyperv, kvm,
	xen-devel, Jonathan Corbet, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, Dave Hansen, x86, H . Peter Anvin,
	Andy Lutomirski, Oleg Nesterov, Tony Luck, K . Y . Srinivasan,
	Haiyang Zhang, Wei Liu, Dexuan Cui, Paolo Bonzini, Wanpeng Li,
	Vitaly Kuznetsov, Peter Zijlstra, Juergen Gross,
	Stefano Stabellini, Oleksandr Tyshchenko, Josh Poimboeuf,
	Paul E . McKenney, Catalin Marinas, Randy Dunlap, Steven Rostedt,
	Kim Phillips, Hyeonggon Yoo, Liam R . Howlett, Sebastian Reichel,
	Kirill A . Shutemov, Suren Baghdasaryan, Pawan Gupta, Jiaxi Chen,
	Babu Moger, Jim Mattson, Sandipan Das, Lai Jiangshan,
	Hans de Goede, Reinette Chatre, Daniel Sneddon, Breno Leitao,
	Nikunj A Dadhania, Brian Gerst, Sami Tolvanen,
	Alexander Potapenko, Andrew Morton, Arnd Bergmann,
	Eric W . Biederman, Kees Cook, Masami Hiramatsu, Masahiro Yamada,
	Ze Gao, Fei Li, Conghui, Ashok Raj, Jason A . Donenfeld,
	Mark Rutland, Jacob Pan, Jiapeng Chong, Jane Malalane,
	David Woodhouse, Boris Ostrovsky, Arnaldo Carvalho de Melo,
	Yantengsi, Christophe Leroy, Sathvika Vasireddy

On Sun, Jul 30, 2023, Xin Li wrote:
> This patch set enables the Intel flexible return and event delivery
> (FRED) architecture for x86-64.

...

> -- 
> 2.34.1

What is this based on?	FYI, you're using a version of git that will (mostly)
automatically generate the based, e.g. I do 

  git format-patch --base=HEAD~$nr ...

in my scripts, where $nr is the number of patches I am sending.  My specific
approaches requires HEAD-$nr to be a publicly visible object/commit, but that
should be the case the vast majority of the time anyways.

^ permalink raw reply	[flat|nested] 32+ messages in thread

* RE: [PATCH v9 00/36] x86: enable FRED for x86-64
  2023-07-31 22:29 ` [PATCH v9 00/36] x86: enable FRED for x86-64 Sean Christopherson
@ 2023-07-31 23:10   ` Li, Xin3
  2023-07-31 23:17     ` Sean Christopherson
  0 siblings, 1 reply; 32+ messages in thread
From: Li, Xin3 @ 2023-07-31 23:10 UTC (permalink / raw)
  To: Christopherson,, Sean
  Cc: linux-doc, linux-kernel, linux-edac, linux-hyperv, kvm,
	xen-devel, Jonathan Corbet, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, Dave Hansen, x86, H . Peter Anvin, Lutomirski,
	Andy, Oleg Nesterov, Luck, Tony, K . Y . Srinivasan,
	Haiyang Zhang, Wei Liu, Cui, Dexuan, Paolo Bonzini, Wanpeng Li,
	Vitaly Kuznetsov, Peter Zijlstra, Gross, Jurgen,
	Stefano Stabellini, Oleksandr Tyshchenko, Josh Poimboeuf,
	Paul E . McKenney, Catalin Marinas, Randy Dunlap, Steven Rostedt,
	Kim Phillips, Hyeonggon Yoo, Liam R . Howlett, Sebastian Reichel,
	Kirill A . Shutemov, Suren Baghdasaryan, Pawan Gupta, Jiaxi Chen,
	Babu Moger, Jim Mattson, Sandipan Das, Lai Jiangshan,
	Hans de Goede, Chatre, Reinette, Daniel Sneddon, Breno Leitao,
	Nikunj A Dadhania, Brian Gerst, Sami Tolvanen,
	Alexander Potapenko, Andrew Morton, Arnd Bergmann,
	Eric W . Biederman, Kees Cook, Masami Hiramatsu, Masahiro Yamada,
	Ze Gao, Li, Fei1, Conghui, Raj, Ashok, Jason A . Donenfeld,
	Mark Rutland, Jacob Pan, Jiapeng Chong, Jane Malalane, Woodhouse,
	David, Ostrovsky, Boris, Arnaldo Carvalho de Melo, Yantengsi,
	Christophe Leroy, Sathvika Vasireddy

> > This patch set enables the Intel flexible return and event delivery
> > (FRED) architecture for x86-64.
> 
> ...
> 
> > --
> > 2.34.1
> 
> What is this based on?

The tip tree master branch.

> FYI, you're using a version of git that will (mostly)
> automatically generate the based, e.g. I do
> 
>   git format-patch --base=HEAD~$nr ...
> 
> in my scripts, where $nr is the number of patches I am sending.  My specific
> approaches requires HEAD-$nr to be a publicly visible object/commit, but that
> should be the case the vast majority of the time anyways.

Are you talking about that you only got a subset of this patch set?

HPA told me he only got patches 0-25/36.

And I got several undeliverable email notifications, saying
"
The following message to <tglx@linutronix.de> was undeliverable.
The reason for the problem:
5.x.1 - Maximum number of delivery attempts exceeded. [Default] 450-'4.7.25 Client host rejected: cannot find your hostname, [134.134.136.31]'
"

I guess there were some problems with the Intel mail system last night,
probably I should resend this patch set later.

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH v9 00/36] x86: enable FRED for x86-64
  2023-07-31 23:10   ` Li, Xin3
@ 2023-07-31 23:17     ` Sean Christopherson
  2023-07-31 23:56       ` Li, Xin3
  0 siblings, 1 reply; 32+ messages in thread
From: Sean Christopherson @ 2023-07-31 23:17 UTC (permalink / raw)
  To: Xin3 Li
  Cc: linux-doc, linux-kernel, linux-edac, linux-hyperv, kvm,
	xen-devel, Jonathan Corbet, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, Dave Hansen, x86, H . Peter Anvin,
	Andy Lutomirski, Oleg Nesterov, Tony Luck, K . Y . Srinivasan,
	Haiyang Zhang, Wei Liu, Dexuan Cui, Paolo Bonzini, Wanpeng Li,
	Vitaly Kuznetsov, Peter Zijlstra, Jurgen Gross,
	Stefano Stabellini, Oleksandr Tyshchenko, Josh Poimboeuf,
	Paul E . McKenney, Catalin Marinas, Randy Dunlap, Steven Rostedt,
	Kim Phillips, Hyeonggon Yoo, Liam R . Howlett, Sebastian Reichel,
	Kirill A . Shutemov, Suren Baghdasaryan, Pawan Gupta, Jiaxi Chen,
	Babu Moger, Jim Mattson, Sandipan Das, Lai Jiangshan,
	Hans de Goede, Reinette Chatre, Daniel Sneddon, Breno Leitao,
	Nikunj A Dadhania, Brian Gerst, Sami Tolvanen,
	Alexander Potapenko, Andrew Morton, Arnd Bergmann,
	Eric W . Biederman, Kees Cook, Masami Hiramatsu, Masahiro Yamada,
	Ze Gao, Fei1 Li, Conghui, Ashok Raj, Jason A . Donenfeld,
	Mark Rutland, Jacob Pan, Jiapeng Chong, Jane Malalane,
	David Woodhouse, Boris Ostrovsky, Arnaldo Carvalho de Melo,
	Yantengsi, Christophe Leroy, Sathvika Vasireddy

On Mon, Jul 31, 2023, Xin3 Li wrote:
> > > This patch set enables the Intel flexible return and event delivery
> > > (FRED) architecture for x86-64.
> > 
> > ...
> > 
> > > --
> > > 2.34.1
> > 
> > What is this based on?
> 
> The tip tree master branch.
> 
> > FYI, you're using a version of git that will (mostly)
> > automatically generate the based, e.g. I do
> > 
> >   git format-patch --base=HEAD~$nr ...
> > 
> > in my scripts, where $nr is the number of patches I am sending.  My specific
> > approaches requires HEAD-$nr to be a publicly visible object/commit, but that
> > should be the case the vast majority of the time anyways.
> 
> Are you talking about that you only got a subset of this patch set?

No, I'm saying I don't want to waste a bunch of time tracking down exactly which
commit a 36 patch series is based on.  E.g. I just refreshed tip/master and still
get:

Applying: x86/idtentry: Incorporate definitions/declarations of the FRED external interrupt handler type
error: sha1 information is lacking or useless (arch/x86/include/asm/idtentry.h).
error: could not build fake ancestor
Patch failed at 0024 x86/idtentry: Incorporate definitions/declarations of the FRED external interrupt handler type
hint: Use 'git am --show-current-patch=diff' to see the failed patch

> HPA told me he only got patches 0-25/36.
> 
> And I got several undeliverable email notifications, saying
> "
> The following message to <tglx@linutronix.de> was undeliverable.
> The reason for the problem:
> 5.x.1 - Maximum number of delivery attempts exceeded. [Default] 450-'4.7.25 Client host rejected: cannot find your hostname, [134.134.136.31]'
> "
> 
> I guess there were some problems with the Intel mail system last night,
> probably I should resend this patch set later.

Yes, lore also appears to be missing patches.  I grabbed the mbox off of KVM's
patchwork instance.

^ permalink raw reply	[flat|nested] 32+ messages in thread

* RE: [PATCH v9 00/36] x86: enable FRED for x86-64
  2023-07-31 23:17     ` Sean Christopherson
@ 2023-07-31 23:56       ` Li, Xin3
  0 siblings, 0 replies; 32+ messages in thread
From: Li, Xin3 @ 2023-07-31 23:56 UTC (permalink / raw)
  To: Christopherson,, Sean
  Cc: linux-doc, linux-kernel, linux-edac, linux-hyperv, kvm,
	xen-devel, Jonathan Corbet, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, Dave Hansen, x86, H . Peter Anvin, Lutomirski,
	Andy, Oleg Nesterov, Luck, Tony, K . Y . Srinivasan,
	Haiyang Zhang, Wei Liu, Cui, Dexuan, Paolo Bonzini, Wanpeng Li,
	Vitaly Kuznetsov, Peter Zijlstra, Gross, Jurgen,
	Stefano Stabellini, Oleksandr Tyshchenko, Josh Poimboeuf,
	Paul E . McKenney, Catalin Marinas, Randy Dunlap, Steven Rostedt,
	Kim Phillips, Hyeonggon Yoo, Liam R . Howlett, Sebastian Reichel,
	Kirill A . Shutemov, Suren Baghdasaryan, Pawan Gupta, Jiaxi Chen,
	Babu Moger, Jim Mattson, Sandipan Das, Lai Jiangshan,
	Hans de Goede, Chatre, Reinette, Daniel Sneddon, Breno Leitao,
	Nikunj A Dadhania, Brian Gerst, Sami Tolvanen,
	Alexander Potapenko, Andrew Morton, Arnd Bergmann,
	Eric W . Biederman, Kees Cook, Masami Hiramatsu, Masahiro Yamada,
	Ze Gao, Li, Fei1, Conghui, Raj, Ashok, Jason A . Donenfeld,
	Mark Rutland, Jacob Pan, Jiapeng Chong, Jane Malalane, Woodhouse,
	David, Ostrovsky, Boris, Arnaldo Carvalho de Melo, Yantengsi,
	Christophe Leroy, Sathvika Vasireddy

> > Are you talking about that you only got a subset of this patch set?
> 
> No, I'm saying I don't want to waste a bunch of time tracking down exactly which
> commit a 36 patch series is based on.  E.g. I just refreshed tip/master and still
> get:
> 
> Applying: x86/idtentry: Incorporate definitions/declarations of the FRED external
> interrupt handler type
> error: sha1 information is lacking or useless (arch/x86/include/asm/idtentry.h).
> error: could not build fake ancestor
> Patch failed at 0024 x86/idtentry: Incorporate definitions/declarations of the FRED
> external interrupt handler type
> hint: Use 'git am --show-current-patch=diff' to see the failed patch

That is due to the following patch set (originally from tglx) is not
merged yet:

https://lore.kernel.org/lkml/20230621171248.6805-1-xin3.li@intel.com/

Sigh, I should have mentioned it in the cover letter.

As mentioned in the cover letter, 2 patches are sent out separately
as pre-FRED patches:
https://lore.kernel.org/lkml/20230706051443.2054-1-xin3.li@intel.com/
https://lore.kernel.org/lkml/20230706052231.2183-1-xin3.li@intel.com/

Sorry it's a bit complicated.

Got to mention, just in case you want to try out FRED, the current
public Intel Simics emulator has not updated to support FRED 5.0 yet
(it only supports FRED 3.0). The plan is late Q3, or early Q4.

^ permalink raw reply	[flat|nested] 32+ messages in thread

end of thread, other threads:[~2023-07-31 23:56 UTC | newest]

Thread overview: 32+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-07-31  6:32 [PATCH v9 00/36] x86: enable FRED for x86-64 Xin Li
2023-07-31  6:32 ` [PATCH v9 01/36] Documentation/x86/64: Add documentation for FRED Xin Li
2023-07-31  6:32 ` [PATCH v9 02/36] x86/fred: Add Kconfig option for FRED (CONFIG_X86_FRED) Xin Li
2023-07-31  6:32 ` [PATCH v9 03/36] x86/fred: Disable FRED support if CONFIG_X86_FRED is disabled Xin Li
2023-07-31  6:32 ` [PATCH v9 04/36] x86/cpufeatures: Add the cpu feature bit for FRED Xin Li
2023-07-31  6:32 ` [PATCH v9 05/36] x86/opcode: Add ERETU, ERETS instructions to x86-opcode-map Xin Li
2023-07-31  9:43   ` Masami Hiramatsu
2023-07-31 16:36     ` Li, Xin3
2023-07-31  6:32 ` [PATCH v9 06/36] x86/objtool: Teach objtool about ERETU and ERETS Xin Li
2023-07-31  6:32 ` [PATCH v9 07/36] x86/cpu: Add X86_CR4_FRED macro Xin Li
2023-07-31  6:32 ` [PATCH v9 08/36] x86/cpu: Add MSR numbers for FRED configuration Xin Li
2023-07-31  6:32 ` [PATCH v9 09/36] x86/fred: Make unions for the cs and ss fields in struct pt_regs Xin Li
2023-07-31  6:32 ` [PATCH v9 10/36] x86/fred: Add a new header file for FRED definitions Xin Li
2023-07-31  6:32 ` [PATCH v9 11/36] x86/fred: Reserve space for the FRED stack frame Xin Li
2023-07-31  6:32 ` [PATCH v9 12/36] x86/fred: Update MSR_IA32_FRED_RSP0 during task switch Xin Li
2023-07-31  6:32 ` [PATCH v9 13/36] x86/fred: Let ret_from_fork_asm() jmp to fred_exit_user when FRED is enabled Xin Li
2023-07-31  6:32 ` [PATCH v9 14/36] x86/fred: Disallow the swapgs instruction " Xin Li
2023-07-31  6:32 ` [PATCH v9 15/36] x86/fred: No ESPFIX needed " Xin Li
2023-07-31  6:32 ` [PATCH v9 16/36] x86/fred: Allow single-step trap and NMI when starting a new task Xin Li
2023-07-31  6:32 ` [PATCH v9 17/36] x86/fred: Define a common function type fred_handler Xin Li
2023-07-31  6:32 ` [PATCH v9 18/36] x86/fred: Add a page fault entry stub for FRED Xin Li
2023-07-31  6:33 ` [PATCH v9 19/36] x86/fred: Add a debug " Xin Li
2023-07-31  6:33 ` [PATCH v9 20/36] x86/fred: Add a NMI " Xin Li
2023-07-31  6:33 ` [PATCH v9 21/36] x86/fred: Add a machine check " Xin Li
2023-07-31  6:33 ` [PATCH v9 22/36] x86/fred: Add a double fault " Xin Li
2023-07-31  6:33 ` [PATCH v9 23/36] x86/entry: Remove idtentry_sysvec from entry_{32,64}.S Xin Li
2023-07-31  6:33 ` [PATCH v9 24/36] x86/idtentry: Incorporate definitions/declarations of the FRED external interrupt handler type Xin Li
2023-07-31  6:33 ` [PATCH v9 25/36] x86/traps: Add a system interrupt handler table for system interrupt dispatch Xin Li
2023-07-31 22:29 ` [PATCH v9 00/36] x86: enable FRED for x86-64 Sean Christopherson
2023-07-31 23:10   ` Li, Xin3
2023-07-31 23:17     ` Sean Christopherson
2023-07-31 23:56       ` Li, Xin3

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).