linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v1 Part2 0/5] Declare safe late loadable microcode
@ 2023-01-13 17:29 Ashok Raj
  2023-01-13 17:29 ` [PATCH v1 Part2 1/5] x86/microcode: Move late load warning to the same function that taints kernel Ashok Raj
                   ` (4 more replies)
  0 siblings, 5 replies; 20+ messages in thread
From: Ashok Raj @ 2023-01-13 17:29 UTC (permalink / raw)
  To: Borislav Petkov, Thomas Gleixner
  Cc: Ashok Raj, LKML, x86, Ingo Molnar, Tony Luck, Dave Hansen,
	Alison Schofield, Reinette Chatre, Tom Lendacky,
	Stefan Talpalaru, David Woodhouse, Benjamin Herrenschmidt,
	Jonathan Corbet, Rafael J . Wysocki, Peter Zilstra,
	Andy Lutomirski, Andrew Cooper

Hi Boris & Thomas,

Attached is a series that adds support for microcode to declare a minimum
revision number that is safe for late loading.

Late loading was disabled[1] in 5.19 due to lack of an ability to declare
when a microcode is safe for late loading. 

This series is inspired by recommendations from Thomas [2]. 

This series is part2 and applies on top of that cleanup series part1[3].

- Move where the warning and tainting is done to the same function. It
  helps when later patches enable late loading, to issue the warning only
  when the vendor doesn't support a mechanism for declaring a microcode
  for safe late loading.

- Add meta-data for Intel microcode to specify a minimum revision that the
  CPU must be on before attempting to load this microcode.

- Extend this as a generic mechanism to allow the common infrastructure to
  permit late loading.

- Some CPUs required a wbinvd() to be issued before attempting to load a
  microcode. These were later addressed via microcode patches. Now there
  exists a mechanism to declare a minimum revision, those flushes aren't
  required any longer.

- Provide a mechanism to override minrev as a way to bypass during the
  transition. It is recommended to always use early loading, especially if
  one is moving to a newer kernel. This is a chicken bit option :) and
  completely optional.

[1] https://lore.kernel.org/lkml/20220524185324.28395-3-bp@alien8.de/
[2] https://lore.kernel.org/linux-kernel/alpine.DEB.2.21.1909062237580.1902@nanos.tec.linutronix.de/
[3] https://lore.kernel.org/lkml/20230109153555.4986-1-ashok.raj@intel.com/

Cheers,
Ashok

Cc: LKML <linux-kernel@vger.kernel.org>
Cc: x86 <x86@kernel.org>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Alison Schofield <alison.schofield@intel.com>
Cc: Reinette Chatre <reinette.chatre@intel.com>
Cc: Thomas Gleixner (Intel) <tglx@linutronix.de>
Cc: Tom Lendacky <thomas.lendacky@amd.com>
Cc: Stefan Talpalaru <stefantalpalaru@yahoo.com>
Cc: David Woodhouse <dwmw2@infradead.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Rafael J. Wysocki <rafael@kernel.org>
Cc: Peter Zilstra (Intel) <peterz@infradead.org>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Andrew Cooper <Andrew.Cooper3@citrix.com>


Ashok Raj (5):
  x86/microcode: Move late load warning to the same function that taints
    kernel
  x86/microcode/intel: Add minimum required revision to microcode header
  x86/microcode: Add a generic mechanism to declare support for minrev
  x86/microcode/intel: Drop wbinvd() from microcode loading
  x86/microcode: Provide an option to override minrev enforcement

 arch/x86/include/asm/microcode.h       |  4 ++
 arch/x86/include/asm/microcode_intel.h |  3 +-
 arch/x86/kernel/cpu/microcode/core.c   | 36 ++++++++++++++---
 arch/x86/kernel/cpu/microcode/intel.c  | 55 ++++++++++++++++++++------
 arch/x86/Kconfig                       |  7 ++--
 5 files changed, 83 insertions(+), 22 deletions(-)

-- 
2.34.1


^ permalink raw reply	[flat|nested] 20+ messages in thread

* [PATCH v1 Part2 1/5] x86/microcode: Move late load warning to the same function that taints kernel
  2023-01-13 17:29 [PATCH v1 Part2 0/5] Declare safe late loadable microcode Ashok Raj
@ 2023-01-13 17:29 ` Ashok Raj
  2023-01-19 21:48   ` Thomas Gleixner
  2023-01-13 17:29 ` [PATCH v1 Part2 2/5] x86/microcode/intel: Add minimum required revision to microcode header Ashok Raj
                   ` (3 subsequent siblings)
  4 siblings, 1 reply; 20+ messages in thread
From: Ashok Raj @ 2023-01-13 17:29 UTC (permalink / raw)
  To: Borislav Petkov, Thomas Gleixner
  Cc: Ashok Raj, Tony Luck, LKML, x86, Ingo Molnar, Dave Hansen,
	Alison Schofield, Reinette Chatre, Tom Lendacky,
	Stefan Talpalaru, David Woodhouse, Benjamin Herrenschmidt,
	Jonathan Corbet, Rafael J . Wysocki, Peter Zilstra,
	Andy Lutomirski, Andrew Cooper

Currently the warning about late loading and tainting are issued from two
different functions.

Later patches will re-enable microcode late-loading.

Having both messages in the same function helps issuing warnings only
when required.

Move the warning from microcode_reload_late() -> reload_store() where the
kernel tainting also happens.

No functional changes.

Signed-off-by: Ashok Raj <ashok.raj@intel.com>
Reviewed-by: Tony Luck <tony.luck@intel.com>
Cc: LKML <linux-kernel@vger.kernel.org>
Cc: x86 <x86@kernel.org>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Alison Schofield <alison.schofield@intel.com>
Cc: Reinette Chatre <reinette.chatre@intel.com>
Cc: Thomas Gleixner (Intel) <tglx@linutronix.de>
Cc: Tom Lendacky <thomas.lendacky@amd.com>
Cc: Stefan Talpalaru <stefantalpalaru@yahoo.com>
Cc: David Woodhouse <dwmw2@infradead.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Rafael J. Wysocki <rafael@kernel.org>
Cc: Peter Zilstra (Intel) <peterz@infradead.org>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Andrew Cooper <Andrew.Cooper3@citrix.com>
---
 arch/x86/kernel/cpu/microcode/core.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/arch/x86/kernel/cpu/microcode/core.c b/arch/x86/kernel/cpu/microcode/core.c
index d7cbc83df9b6..c361882baf63 100644
--- a/arch/x86/kernel/cpu/microcode/core.c
+++ b/arch/x86/kernel/cpu/microcode/core.c
@@ -441,9 +441,6 @@ static int microcode_reload_late(void)
 	int old = boot_cpu_data.microcode, ret;
 	struct cpuinfo_x86 prev_info;
 
-	pr_err("Attempting late microcode loading - it is dangerous and taints the kernel.\n");
-	pr_err("You should switch to early loading, if possible.\n");
-
 	atomic_set(&late_cpus_in,  0);
 	atomic_set(&late_cpus_out, 0);
 
@@ -494,6 +491,9 @@ static ssize_t reload_store(struct device *dev,
 	if (tmp_ret != UCODE_NEW)
 		goto put;
 
+	pr_err("Attempting late microcode loading - it is dangerous and taints the kernel.\n");
+	pr_err("You should switch to early loading, if possible.\n");
+
 	mutex_lock(&microcode_mutex);
 	ret = microcode_reload_late();
 	mutex_unlock(&microcode_mutex);
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH v1 Part2 2/5] x86/microcode/intel: Add minimum required revision to microcode header
  2023-01-13 17:29 [PATCH v1 Part2 0/5] Declare safe late loadable microcode Ashok Raj
  2023-01-13 17:29 ` [PATCH v1 Part2 1/5] x86/microcode: Move late load warning to the same function that taints kernel Ashok Raj
@ 2023-01-13 17:29 ` Ashok Raj
  2023-01-19 22:03   ` Thomas Gleixner
  2023-01-13 17:29 ` [PATCH v1 Part2 3/5] x86/microcode: Add a generic mechanism to declare support for minrev Ashok Raj
                   ` (2 subsequent siblings)
  4 siblings, 1 reply; 20+ messages in thread
From: Ashok Raj @ 2023-01-13 17:29 UTC (permalink / raw)
  To: Borislav Petkov, Thomas Gleixner
  Cc: Ashok Raj, LKML, x86, Ingo Molnar, Tony Luck, Dave Hansen,
	Alison Schofield, Reinette Chatre, Tom Lendacky,
	Stefan Talpalaru, David Woodhouse, Benjamin Herrenschmidt,
	Jonathan Corbet, Rafael J . Wysocki, Peter Zilstra,
	Andy Lutomirski, Andrew Cooper

In general users don't have the necessary information to determine
whether a late loading of a new microcode version has removed any feature
(MSR, CPUID etc) between what is currently loaded and this new microcode.
To address this issue, Intel has added a "minimum required version" field
to a previously reserved field in the file header. Microcode updates
should only be applied if the current microcode version is equal
to, or greater than this minimum required version.

Thomas made some suggestions[1] on how meta-data in the microcode file
could provide Linux with information to decide if the new microcode is
suitable candidate for late loading. But even the "simpler" option#1
requires a lot of metadata and corresponding kernel code to parse it.

The proposal here is an even simpler option. Simply "OS visible features"
such as CPUID and MSRs are the only two examples. The microcode must not
change these OS visible features because they cause problems after late
loading. When microcode changes features, microcode will change the min_rev
to prevent such microcodes from being late loaded.

Pseudo code for late loading is as follows:

if header.min_required_id == 0
        This is old format microcode, block late loading
else if current_ucode_version < header.min_required_id
        Current version is too old, block late loading of this microcode.
else
        OK to proceed with late loading.

Any microcode that modifies the interface to an OS-visible feature
will set the min_version to itself. This will enforce this microcode is
not suitable for late loading unless the currently loaded revision is
greater or equal to the new microcode affecting the change.

The enforcement is not in hardware and limited to kernel loader enforcing
the requirement. It is not required for early loading of microcode to
enforce this requirement, since the new features are only
evaluated after early loading in the boot process.

Check if the new microcode specifies the minimum version for safe late
loading. Otherwise reject late load.

Test cases covered:

1. With new kernel, attempting to load an older format microcode with the
   min_rev=0 should be blocked by kernel.

   [  210.541802] Late loading denied: Microcode header does not specify a
   required min version.

2. New microcode with a non-zero min_rev in the header, but the specified
   min_rev is greater than what is currently loaded in the CPU should be
   blocked by kernel.

   245.139828] microcode: Late loading denied: Current revision 0x8f685300 is too old to update, must be at 0xaa000050 version or higher. Use early loading instead.

3. New microcode with a min_rev < currently loaded should allow loading the
   microcode

4. Build initrd with microcode that has min_rev=0, or min_rev > currently
   loaded should permit early loading microcode from initrd.

[1] https://lore.kernel.org/linux-kernel/alpine.DEB.2.21.1909062237580.1902@nanos.tec.linutronix.de/

Suggested-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ashok Raj <ashok.raj@intel.com>
Cc: LKML <linux-kernel@vger.kernel.org>
Cc: x86 <x86@kernel.org>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Alison Schofield <alison.schofield@intel.com>
Cc: Reinette Chatre <reinette.chatre@intel.com>
Cc: Thomas Gleixner (Intel) <tglx@linutronix.de>
Cc: Tom Lendacky <thomas.lendacky@amd.com>
Cc: Stefan Talpalaru <stefantalpalaru@yahoo.com>
Cc: David Woodhouse <dwmw2@infradead.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Rafael J. Wysocki <rafael@kernel.org>
Cc: Peter Zilstra (Intel) <peterz@infradead.org>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Andrew Cooper <Andrew.Cooper3@citrix.com>
---
 arch/x86/include/asm/microcode_intel.h |  3 ++-
 arch/x86/kernel/cpu/microcode/intel.c  | 34 +++++++++++++++++++++++++-
 2 files changed, 35 insertions(+), 2 deletions(-)

diff --git a/arch/x86/include/asm/microcode_intel.h b/arch/x86/include/asm/microcode_intel.h
index f1fa979e05bf..e83afe919b10 100644
--- a/arch/x86/include/asm/microcode_intel.h
+++ b/arch/x86/include/asm/microcode_intel.h
@@ -15,7 +15,8 @@ struct microcode_header_intel {
 	unsigned int            datasize;
 	unsigned int            totalsize;
 	unsigned int            metasize;
-	unsigned int            reserved[2];
+	unsigned int		min_req_ver;
+	unsigned int		reserved3;
 };
 
 struct microcode_intel {
diff --git a/arch/x86/kernel/cpu/microcode/intel.c b/arch/x86/kernel/cpu/microcode/intel.c
index 0cdff9ed2a4e..6046f90a47b2 100644
--- a/arch/x86/kernel/cpu/microcode/intel.c
+++ b/arch/x86/kernel/cpu/microcode/intel.c
@@ -137,6 +137,33 @@ static void save_microcode_patch(struct ucode_cpu_info *uci, void *data, unsigne
 		intel_ucode_patch = p->data;
 }
 
+static int is_lateload_safe(struct microcode_header_intel *mc_header)
+{
+	struct ucode_cpu_info uci;
+
+	/*
+	 * When late-loading, ensure the header declares a minimum revision
+	 * required to perform a late-load.
+	 */
+	if (!mc_header->min_req_ver) {
+		pr_warn("Late loading denied: Microcode header does not specify a required min version\n");
+		return -EINVAL;
+	}
+
+	intel_cpu_collect_info(&uci);
+
+	/*
+	 * Enforce the minimum revision specified in the header is either
+	 * greater or equal to the current revision.
+	 */
+	if (uci.cpu_sig.rev < mc_header->min_req_ver) {
+		pr_warn("Late loading denied: Current revision 0x%x too old to update, must be at 0x%x or higher. Use early loading instead\n",
+			uci.cpu_sig.rev, mc_header->min_req_ver);
+		return -EINVAL;
+	}
+	return 0;
+}
+
 /*
  * Get microcode matching with BSP's model. Only CPUs with the same model as
  * BSP can stay in the platform.
@@ -678,7 +705,9 @@ static enum ucode_state generic_load_microcode(int cpu, struct iov_iter *iter)
 		memcpy(mc, &mc_header, sizeof(mc_header));
 		data = mc + sizeof(mc_header);
 		if (!copy_from_iter_full(data, data_size, iter) ||
-		    intel_microcode_sanity_check(mc, true, MC_HEADER_TYPE_MICROCODE) < 0) {
+		    intel_microcode_sanity_check(mc, true, MC_HEADER_TYPE_MICROCODE) < 0 ||
+		    is_lateload_safe(&mc_header)) {
+			ret = UCODE_ERROR;
 			break;
 		}
 
@@ -701,6 +730,9 @@ static enum ucode_state generic_load_microcode(int cpu, struct iov_iter *iter)
 		return UCODE_ERROR;
 	}
 
+	if (ret == UCODE_ERROR)
+		return ret;
+
 	if (!new_mc)
 		return UCODE_NFOUND;
 
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH v1 Part2 3/5] x86/microcode: Add a generic mechanism to declare support for minrev
  2023-01-13 17:29 [PATCH v1 Part2 0/5] Declare safe late loadable microcode Ashok Raj
  2023-01-13 17:29 ` [PATCH v1 Part2 1/5] x86/microcode: Move late load warning to the same function that taints kernel Ashok Raj
  2023-01-13 17:29 ` [PATCH v1 Part2 2/5] x86/microcode/intel: Add minimum required revision to microcode header Ashok Raj
@ 2023-01-13 17:29 ` Ashok Raj
  2023-01-20  0:15   ` Thomas Gleixner
  2023-01-13 17:29 ` [PATCH v1 Part2 4/5] x86/microcode/intel: Drop wbinvd() from microcode loading Ashok Raj
  2023-01-13 17:29 ` [PATCH v1 Part2 5/5] x86/microcode: Provide an option to override minrev enforcement Ashok Raj
  4 siblings, 1 reply; 20+ messages in thread
From: Ashok Raj @ 2023-01-13 17:29 UTC (permalink / raw)
  To: Borislav Petkov, Thomas Gleixner
  Cc: Ashok Raj, Tony Luck, LKML, x86, Ingo Molnar, Dave Hansen,
	Alison Schofield, Reinette Chatre, Tom Lendacky,
	Stefan Talpalaru, David Woodhouse, Benjamin Herrenschmidt,
	Jonathan Corbet, Rafael J . Wysocki, Peter Zilstra,
	Andy Lutomirski, Andrew Cooper

Intel microcode adds some meta-data to report a minimum required revision
before this new microcode can be safely late loaded. There are no generic
mechanism to declare support for all vendors.

Add generic support to microcode core to declare such support, this allows
late-loading to be permitted in those architectures that report support
for safe late loading.

Late loading has added support for

- New images declaring a required minimum base version before a late-load
  is performed.

Tainting only happens on architectures that don't support minimum required
version reporting.

Add a new variable in microcode_ops to allow an architecture to declare
support for safe microcode late loading.

Signed-off-by: Ashok Raj <ashok.raj@intel.com>
Reviewed-by: Tony Luck <tony.luck@intel.com>
Cc: LKML <linux-kernel@vger.kernel.org>
Cc: x86 <x86@kernel.org>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Alison Schofield <alison.schofield@intel.com>
Cc: Reinette Chatre <reinette.chatre@intel.com>
Cc: Thomas Gleixner (Intel) <tglx@linutronix.de>
Cc: Tom Lendacky <thomas.lendacky@amd.com>
Cc: Stefan Talpalaru <stefantalpalaru@yahoo.com>
Cc: David Woodhouse <dwmw2@infradead.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Rafael J. Wysocki <rafael@kernel.org>
Cc: Peter Zilstra (Intel) <peterz@infradead.org>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Andrew Cooper <Andrew.Cooper3@citrix.com>
---
 arch/x86/include/asm/microcode.h      |  2 ++
 arch/x86/kernel/cpu/microcode/core.c  | 25 ++++++++++++++++++++-----
 arch/x86/kernel/cpu/microcode/intel.c |  1 +
 arch/x86/Kconfig                      |  7 ++++---
 4 files changed, 27 insertions(+), 8 deletions(-)

diff --git a/arch/x86/include/asm/microcode.h b/arch/x86/include/asm/microcode.h
index d5a58bde091c..3d48143e84a9 100644
--- a/arch/x86/include/asm/microcode.h
+++ b/arch/x86/include/asm/microcode.h
@@ -33,6 +33,8 @@ enum ucode_state {
 };
 
 struct microcode_ops {
+	bool safe_late_load;
+
 	enum ucode_state (*request_microcode_fw) (int cpu, struct device *);
 
 	void (*microcode_fini_cpu) (int cpu);
diff --git a/arch/x86/kernel/cpu/microcode/core.c b/arch/x86/kernel/cpu/microcode/core.c
index c361882baf63..446ddf3fcc29 100644
--- a/arch/x86/kernel/cpu/microcode/core.c
+++ b/arch/x86/kernel/cpu/microcode/core.c
@@ -472,6 +472,7 @@ static ssize_t reload_store(struct device *dev,
 	enum ucode_state tmp_ret = UCODE_OK;
 	int bsp = boot_cpu_data.cpu_index;
 	unsigned long val;
+	bool safe_late_load = false;
 	ssize_t ret = 0;
 
 	ret = kstrtoul(buf, 0, &val);
@@ -487,13 +488,22 @@ static ssize_t reload_store(struct device *dev,
 	if (ret)
 		goto put;
 
+	safe_late_load = microcode_ops->safe_late_load;
+
+	/*
+	 * If safe loading indication isn't present, bail out.
+	 */
+	if (!safe_late_load) {
+		pr_err("Attempting late microcode loading - it is dangerous and taints the kernel.\n");
+		pr_err("You should switch to early loading, if possible.\n");
+		ret = -EINVAL;
+		goto put;
+	}
+
 	tmp_ret = microcode_ops->request_microcode_fw(bsp, &microcode_pdev->dev);
 	if (tmp_ret != UCODE_NEW)
 		goto put;
 
-	pr_err("Attempting late microcode loading - it is dangerous and taints the kernel.\n");
-	pr_err("You should switch to early loading, if possible.\n");
-
 	mutex_lock(&microcode_mutex);
 	ret = microcode_reload_late();
 	mutex_unlock(&microcode_mutex);
@@ -501,11 +511,16 @@ static ssize_t reload_store(struct device *dev,
 put:
 	cpus_read_unlock();
 
+	/*
+	 * Only taint if a successful load and vendor doesn't support
+	 * safe_late_load
+	 */
+	if (!(ret && safe_late_load))
+		add_taint(TAINT_CPU_OUT_OF_SPEC, LOCKDEP_STILL_OK);
+
 	if (ret == 0)
 		ret = size;
 
-	add_taint(TAINT_CPU_OUT_OF_SPEC, LOCKDEP_STILL_OK);
-
 	return ret;
 }
 
diff --git a/arch/x86/kernel/cpu/microcode/intel.c b/arch/x86/kernel/cpu/microcode/intel.c
index 6046f90a47b2..eba4f463ef1c 100644
--- a/arch/x86/kernel/cpu/microcode/intel.c
+++ b/arch/x86/kernel/cpu/microcode/intel.c
@@ -806,6 +806,7 @@ static enum ucode_state request_microcode_fw(int cpu, struct device *device)
 }
 
 static struct microcode_ops microcode_intel_ops = {
+	.safe_late_load			  = true,
 	.request_microcode_fw             = request_microcode_fw,
 	.collect_cpu_info                 = collect_cpu_info,
 	.apply_microcode                  = apply_microcode_intel,
diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index 3604074a878b..ddc4130e6f8c 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -1352,15 +1352,16 @@ config MICROCODE_AMD
 	  processors will be enabled.
 
 config MICROCODE_LATE_LOADING
-	bool "Late microcode loading (DANGEROUS)"
-	default n
+	bool "Late microcode loading"
+	default y
 	depends on MICROCODE
 	help
 	  Loading microcode late, when the system is up and executing instructions
 	  is a tricky business and should be avoided if possible. Just the sequence
 	  of synchronizing all cores and SMT threads is one fragile dance which does
 	  not guarantee that cores might not softlock after the loading. Therefore,
-	  use this at your own risk. Late loading taints the kernel too.
+	  use this at your own risk. Late loading taints the kernel, if it
+	  doesn't support a minimum required base version before an update.
 
 config X86_MSR
 	tristate "/dev/cpu/*/msr - Model-specific register support"
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH v1 Part2 4/5] x86/microcode/intel: Drop wbinvd() from microcode loading
  2023-01-13 17:29 [PATCH v1 Part2 0/5] Declare safe late loadable microcode Ashok Raj
                   ` (2 preceding siblings ...)
  2023-01-13 17:29 ` [PATCH v1 Part2 3/5] x86/microcode: Add a generic mechanism to declare support for minrev Ashok Raj
@ 2023-01-13 17:29 ` Ashok Raj
  2023-01-13 17:29 ` [PATCH v1 Part2 5/5] x86/microcode: Provide an option to override minrev enforcement Ashok Raj
  4 siblings, 0 replies; 20+ messages in thread
From: Ashok Raj @ 2023-01-13 17:29 UTC (permalink / raw)
  To: Borislav Petkov, Thomas Gleixner
  Cc: Ashok Raj, Tony Luck, LKML, x86, Ingo Molnar, Dave Hansen,
	Alison Schofield, Reinette Chatre, Tom Lendacky,
	Stefan Talpalaru, David Woodhouse, Benjamin Herrenschmidt,
	Jonathan Corbet, Rafael J . Wysocki, Peter Zilstra,
	Andy Lutomirski, Andrew Cooper

Some older processors had a bad interaction when updating microcode if the
caches were dirty causing machine checks. The wbinvd() was added to
mitigate that before performing microcode updates. Now that Linux checks
for the minimum version before performing an update, those microcode
revisions can't be loaded.

Remove calls to wbinvd().

Signed-off-by: Ashok Raj <ashok.raj@intel.com>
Reviewed-by: Tony Luck <tony.luck@intel.com>
Cc: LKML <linux-kernel@vger.kernel.org>
Cc: x86 <x86@kernel.org>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Alison Schofield <alison.schofield@intel.com>
Cc: Reinette Chatre <reinette.chatre@intel.com>
Cc: Thomas Gleixner (Intel) <tglx@linutronix.de>
Cc: Tom Lendacky <thomas.lendacky@amd.com>
Cc: Stefan Talpalaru <stefantalpalaru@yahoo.com>
Cc: David Woodhouse <dwmw2@infradead.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Rafael J. Wysocki <rafael@kernel.org>
Cc: Peter Zilstra (Intel) <peterz@infradead.org>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Andrew Cooper <Andrew.Cooper3@citrix.com>
---
 arch/x86/kernel/cpu/microcode/intel.c | 12 ------------
 1 file changed, 12 deletions(-)

diff --git a/arch/x86/kernel/cpu/microcode/intel.c b/arch/x86/kernel/cpu/microcode/intel.c
index eba4f463ef1c..68a3c5569cd2 100644
--- a/arch/x86/kernel/cpu/microcode/intel.c
+++ b/arch/x86/kernel/cpu/microcode/intel.c
@@ -415,12 +415,6 @@ static int apply_microcode_early(struct ucode_cpu_info *uci, bool early)
 		return UCODE_OK;
 	}
 
-	/*
-	 * Writeback and invalidate caches before updating microcode to avoid
-	 * internal issues depending on what the microcode is updating.
-	 */
-	native_wbinvd();
-
 	/* write microcode via MSR 0x79 */
 	native_wrmsrl(MSR_IA32_UCODE_WRITE, (unsigned long)mc->bits);
 
@@ -624,12 +618,6 @@ static enum ucode_state apply_microcode_intel(int cpu)
 		goto out;
 	}
 
-	/*
-	 * Writeback and invalidate caches before updating microcode to avoid
-	 * internal issues depending on what the microcode is updating.
-	 */
-	native_wbinvd();
-
 	/* write microcode via MSR 0x79 */
 	wrmsrl(MSR_IA32_UCODE_WRITE, (unsigned long)mc->bits);
 
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH v1 Part2 5/5] x86/microcode: Provide an option to override minrev enforcement
  2023-01-13 17:29 [PATCH v1 Part2 0/5] Declare safe late loadable microcode Ashok Raj
                   ` (3 preceding siblings ...)
  2023-01-13 17:29 ` [PATCH v1 Part2 4/5] x86/microcode/intel: Drop wbinvd() from microcode loading Ashok Raj
@ 2023-01-13 17:29 ` Ashok Raj
  4 siblings, 0 replies; 20+ messages in thread
From: Ashok Raj @ 2023-01-13 17:29 UTC (permalink / raw)
  To: Borislav Petkov, Thomas Gleixner
  Cc: Ashok Raj, LKML, x86, Ingo Molnar, Tony Luck, Dave Hansen,
	Alison Schofield, Reinette Chatre, Tom Lendacky,
	Stefan Talpalaru, David Woodhouse, Benjamin Herrenschmidt,
	Jonathan Corbet, Rafael J . Wysocki, Peter Zilstra,
	Andy Lutomirski, Andrew Cooper

Minimum Required Revision (minrev) is enforced strictly. All new patches
will have a minrev that is not zero. But there might be a transition time
for some that need this enforcement to be relaxed.

When the override is enabled, the kernel will be tainted.

Provide a debugfs variable to override the minrev enforcement.

Signed-off-by: Ashok Raj <ashok.raj@intel.com>
Cc: LKML <linux-kernel@vger.kernel.org>
Cc: x86 <x86@kernel.org>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Alison Schofield <alison.schofield@intel.com>
Cc: Reinette Chatre <reinette.chatre@intel.com>
Cc: Thomas Gleixner (Intel) <tglx@linutronix.de>
Cc: Tom Lendacky <thomas.lendacky@amd.com>
Cc: Stefan Talpalaru <stefantalpalaru@yahoo.com>
Cc: David Woodhouse <dwmw2@infradead.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Rafael J. Wysocki <rafael@kernel.org>
Cc: Peter Zilstra (Intel) <peterz@infradead.org>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Andrew Cooper <Andrew.Cooper3@citrix.com>
---
This patch is optional.
---
 arch/x86/include/asm/microcode.h      |  2 ++
 arch/x86/kernel/cpu/microcode/core.c  | 13 ++++++++++++-
 arch/x86/kernel/cpu/microcode/intel.c |  8 ++++++++
 3 files changed, 22 insertions(+), 1 deletion(-)

diff --git a/arch/x86/include/asm/microcode.h b/arch/x86/include/asm/microcode.h
index 3d48143e84a9..d82f22d50ebd 100644
--- a/arch/x86/include/asm/microcode.h
+++ b/arch/x86/include/asm/microcode.h
@@ -16,6 +16,8 @@ struct ucode_patch {
 
 extern struct list_head microcode_cache;
 
+extern bool override_minrev;
+
 struct cpu_signature {
 	unsigned int sig;
 	unsigned int pf;
diff --git a/arch/x86/kernel/cpu/microcode/core.c b/arch/x86/kernel/cpu/microcode/core.c
index 446ddf3fcc29..5ed60c6c8e8d 100644
--- a/arch/x86/kernel/cpu/microcode/core.c
+++ b/arch/x86/kernel/cpu/microcode/core.c
@@ -24,6 +24,7 @@
 #include <linux/miscdevice.h>
 #include <linux/capability.h>
 #include <linux/firmware.h>
+#include <linux/debugfs.h>
 #include <linux/kernel.h>
 #include <linux/delay.h>
 #include <linux/mutex.h>
@@ -44,7 +45,9 @@
 #define DRIVER_VERSION	"2.2"
 
 static struct microcode_ops	*microcode_ops;
+static struct dentry		*dentry_ucode;
 static bool dis_ucode_ldr = true;
+bool override_minrev;
 
 bool initrd_gone;
 
@@ -497,7 +500,11 @@ static ssize_t reload_store(struct device *dev,
 		pr_err("Attempting late microcode loading - it is dangerous and taints the kernel.\n");
 		pr_err("You should switch to early loading, if possible.\n");
 		ret = -EINVAL;
-		goto put;
+
+		if (!override_minrev)
+			goto put;
+
+		pr_err("Overriding minrev\n");
 	}
 
 	tmp_ret = microcode_ops->request_microcode_fw(bsp, &microcode_pdev->dev);
@@ -688,7 +695,11 @@ static int __init microcode_init(void)
 	cpuhp_setup_state_nocalls(CPUHP_AP_ONLINE_DYN, "x86/microcode:online",
 				  mc_cpu_online, mc_cpu_down_prep);
 
+	dentry_ucode = debugfs_create_dir("microcode", NULL);
+	debugfs_create_bool("override_minrev", 0644, dentry_ucode, &override_minrev);
+
 	pr_info("Microcode Update Driver: v%s.", DRIVER_VERSION);
+	pr_info("Override minrev %s\n", override_minrev ? "enabled" : "disabled");
 
 	return 0;
 
diff --git a/arch/x86/kernel/cpu/microcode/intel.c b/arch/x86/kernel/cpu/microcode/intel.c
index 68a3c5569cd2..172e1f166844 100644
--- a/arch/x86/kernel/cpu/microcode/intel.c
+++ b/arch/x86/kernel/cpu/microcode/intel.c
@@ -141,6 +141,14 @@ static int is_lateload_safe(struct microcode_header_intel *mc_header)
 {
 	struct ucode_cpu_info uci;
 
+	/*
+	 * If minrev is bypassed via debugfs, then allow late-load.
+	 */
+	if (override_minrev) {
+		pr_info("Bypassing minrev enforcement via debugfs\n");
+		return 0;
+	}
+
 	/*
 	 * When late-loading, ensure the header declares a minimum revision
 	 * required to perform a late-load.
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 20+ messages in thread

* Re: [PATCH v1 Part2 1/5] x86/microcode: Move late load warning to the same function that taints kernel
  2023-01-13 17:29 ` [PATCH v1 Part2 1/5] x86/microcode: Move late load warning to the same function that taints kernel Ashok Raj
@ 2023-01-19 21:48   ` Thomas Gleixner
  2023-01-19 22:05     ` Ashok Raj
  0 siblings, 1 reply; 20+ messages in thread
From: Thomas Gleixner @ 2023-01-19 21:48 UTC (permalink / raw)
  To: Ashok Raj, Borislav Petkov
  Cc: Ashok Raj, Tony Luck, LKML, x86, Ingo Molnar, Dave Hansen,
	Alison Schofield, Reinette Chatre, Tom Lendacky,
	Stefan Talpalaru, David Woodhouse, Benjamin Herrenschmidt,
	Jonathan Corbet, Rafael J . Wysocki, Peter Zilstra,
	Andy Lutomirski, Andrew Cooper

On Fri, Jan 13 2023 at 09:29, Ashok Raj wrote:
> Currently the warning about late loading and tainting are issued from two
> different functions.
>
> Later patches will re-enable microcode late-loading.
>
> Having both messages in the same function helps issuing warnings only
> when required.
>
> Move the warning from microcode_reload_late() -> reload_store() where the
> kernel tainting also happens.
>
> No functional change.

I had to read this more than once to make sense of it. Let me try a
translation:

  Late microcode loading issues a warning and taints the
  kernel. Tainting the kernel and emitting the warning happens in two
  different functions.

  The upcoming support for safe late loading under certain conditions
  needs to prevent both the warning and the tainting when the safe
  conditions are met. That would require to hand the result of the safe
  condition check into the function which emits the warning.

  To avoid this awkward construct, move the warning into reload_store()
  next to the taint() invocation as that is also the function which will
  later contain the safe condition check.

  No functional change.

Did my decoder get that right?

Thanks,

        tglx

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH v1 Part2 2/5] x86/microcode/intel: Add minimum required revision to microcode header
  2023-01-13 17:29 ` [PATCH v1 Part2 2/5] x86/microcode/intel: Add minimum required revision to microcode header Ashok Raj
@ 2023-01-19 22:03   ` Thomas Gleixner
  2023-01-19 22:34     ` Ashok Raj
  0 siblings, 1 reply; 20+ messages in thread
From: Thomas Gleixner @ 2023-01-19 22:03 UTC (permalink / raw)
  To: Ashok Raj, Borislav Petkov
  Cc: Ashok Raj, LKML, x86, Ingo Molnar, Tony Luck, Dave Hansen,
	Alison Schofield, Reinette Chatre, Tom Lendacky,
	Stefan Talpalaru, David Woodhouse, Benjamin Herrenschmidt,
	Jonathan Corbet, Rafael J . Wysocki, Peter Zilstra,
	Andy Lutomirski, Andrew Cooper

On Fri, Jan 13 2023 at 09:29, Ashok Raj wrote:
> In general users don't have the necessary information to determine
> whether a late loading of a new microcode version has removed any feature
> (MSR, CPUID etc) between what is currently loaded and this new microcode.

s/this new microcode/a newer microcode revision/

> To address this issue, Intel has added a "minimum required version" field
> to a previously reserved field in the file header. Microcode updates

s/file header/microcode header/ perhaps?

> should only be applied if the current microcode version is equal
> to, or greater than this minimum required version.
>
> Thomas made some suggestions[1] on how meta-data in the microcode file
> could provide Linux with information to decide if the new microcode is
> suitable candidate for late loading. But even the "simpler" option#1
> requires a lot of metadata and corresponding kernel code to parse it.
>
> The proposal here is an even simpler option.

IIRC this was also suggested by this Thomas dude, right?

> Simply "OS visible features" such as CPUID and MSRs are the only two
> examples. The microcode must not change these OS visible features
> because they cause problems after late loading. When microcode changes
> features, microcode will change the min_rev to prevent such microcodes
> from being late loaded.
>
> Pseudo code for late loading is as follows:
>
> if header.min_required_id == 0
>         This is old format microcode, block late loading
> else if current_ucode_version < header.min_required_id
>         Current version is too old, block late loading of this microcode.
> else
>         OK to proceed with late loading.
>
> Any microcode that modifies the interface to an OS-visible feature
> will set the min_version to itself. This will enforce this microcode is
> not suitable for late loading unless the currently loaded revision is
> greater or equal to the new microcode affecting the change.

Up to this paragraph the changelog made sense.

If the currently loaded revision is the same as the to be loaded
revision, then there is nothing to do.

If the currently loaded revision is greater than the to be loaded
revision then it is not loaded as the kernel does not support
downgrading in the first place.

Even if it would support downgrading then this would be outright wrong
for this case:

Rev:        10
Min-Rev:    10

Rev:        20
Min-Rev:    20

If Rev 20 is loaded, then you absolutely cannot load Rev 10 because that
would have the reverse side effects due to which Rev 20 prevents late
loading.

See?

Thanks,

        tglx

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH v1 Part2 1/5] x86/microcode: Move late load warning to the same function that taints kernel
  2023-01-19 21:48   ` Thomas Gleixner
@ 2023-01-19 22:05     ` Ashok Raj
  0 siblings, 0 replies; 20+ messages in thread
From: Ashok Raj @ 2023-01-19 22:05 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: Borislav Petkov, Tony Luck, LKML, x86, Ingo Molnar, Dave Hansen,
	Alison Schofield, Reinette Chatre, Tom Lendacky,
	Stefan Talpalaru, David Woodhouse, Benjamin Herrenschmidt,
	Jonathan Corbet, Rafael J . Wysocki, Peter Zilstra,
	Andy Lutomirski, Andrew Cooper, Ashok Raj

On Thu, Jan 19, 2023 at 10:48:30PM +0100, Thomas Gleixner wrote:
> On Fri, Jan 13 2023 at 09:29, Ashok Raj wrote:
> > Currently the warning about late loading and tainting are issued from two
> > different functions.
> >
> > Later patches will re-enable microcode late-loading.
> >
> > Having both messages in the same function helps issuing warnings only
> > when required.
> >
> > Move the warning from microcode_reload_late() -> reload_store() where the
> > kernel tainting also happens.
> >
> > No functional change.
> 
> I had to read this more than once to make sense of it. Let me try a
> translation:
> 
>   Late microcode loading issues a warning and taints the
>   kernel. Tainting the kernel and emitting the warning happens in two
>   different functions.
> 
>   The upcoming support for safe late loading under certain conditions
>   needs to prevent both the warning and the tainting when the safe
>   conditions are met. That would require to hand the result of the safe
>   condition check into the function which emits the warning.
> 
>   To avoid this awkward construct, move the warning into reload_store()
>   next to the taint() invocation as that is also the function which will
>   later contain the safe condition check.
> 
>   No functional change.
> 
> Did my decoder get that right?
> 

Yes, that is accurate.. inspite of my awkward phrasing :-(

I should copy this commit verbatim before i resend :-)

Cheers,
Ashok

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH v1 Part2 2/5] x86/microcode/intel: Add minimum required revision to microcode header
  2023-01-19 22:03   ` Thomas Gleixner
@ 2023-01-19 22:34     ` Ashok Raj
  0 siblings, 0 replies; 20+ messages in thread
From: Ashok Raj @ 2023-01-19 22:34 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: Borislav Petkov, LKML, x86, Ingo Molnar, Tony Luck, Dave Hansen,
	Alison Schofield, Reinette Chatre, Tom Lendacky,
	Stefan Talpalaru, David Woodhouse, Benjamin Herrenschmidt,
	Jonathan Corbet, Rafael J . Wysocki, Peter Zilstra,
	Andy Lutomirski, Andrew Cooper, Ashok Raj

On Thu, Jan 19, 2023 at 11:03:14PM +0100, Thomas Gleixner wrote:
> On Fri, Jan 13 2023 at 09:29, Ashok Raj wrote:
> > In general users don't have the necessary information to determine
> > whether a late loading of a new microcode version has removed any feature
> > (MSR, CPUID etc) between what is currently loaded and this new microcode.
> 
> s/this new microcode/a newer microcode revision/

Yes.

> 
> > To address this issue, Intel has added a "minimum required version" field
> > to a previously reserved field in the file header. Microcode updates
> 
> s/file header/microcode header/ perhaps?

Yep!
> 
> > should only be applied if the current microcode version is equal
> > to, or greater than this minimum required version.
> >
> > Thomas made some suggestions[1] on how meta-data in the microcode file
> > could provide Linux with information to decide if the new microcode is
> > suitable candidate for late loading. But even the "simpler" option#1
> > requires a lot of metadata and corresponding kernel code to parse it.
> >
> > The proposal here is an even simpler option.
> 
> IIRC this was also suggested by this Thomas dude, right?

Same dude.. might have been your twin :-)

I'll fix it.

> 
> > Simply "OS visible features" such as CPUID and MSRs are the only two
> > examples. The microcode must not change these OS visible features
> > because they cause problems after late loading. When microcode changes
> > features, microcode will change the min_rev to prevent such microcodes
> > from being late loaded.
> >
> > Pseudo code for late loading is as follows:
> >
> > if header.min_required_id == 0
> >         This is old format microcode, block late loading
> > else if current_ucode_version < header.min_required_id
> >         Current version is too old, block late loading of this microcode.
> > else
> >         OK to proceed with late loading.
> >
> > Any microcode that modifies the interface to an OS-visible feature
> > will set the min_version to itself. This will enforce this microcode is
> > not suitable for late loading unless the currently loaded revision is
> > greater or equal to the new microcode affecting the change.
> 
> Up to this paragraph the changelog made sense.
> 
> If the currently loaded revision is the same as the to be loaded
> revision, then there is nothing to do.
> 
> If the currently loaded revision is greater than the to be loaded
> revision then it is not loaded as the kernel does not support
> downgrading in the first place.
> 
> Even if it would support downgrading then this would be outright wrong
> for this case:
> 
> Rev:        10
> Min-Rev:    10
> 
> Rev:        20
> Min-Rev:    20
> 
> If Rev 20 is loaded, then you absolutely cannot load Rev 10 because that
> would have the reverse side effects due to which Rev 20 prevents late
> loading.
> 
> See?

Yes, that's accurate, and in sprit it works that way.

The current_rev > mc_hdr->rev is done in apply_microcode_intel() but I
suppose we could do that check early. 

I didn't touch those parts to make sure only minimal changes were done and
we can do cleanup's later.  I should certainly add a note to make sure we
aren't breaking the rev is always greater than what's in the CPU for
clarity.

I do have several cleanups lined up, but didn't want to hold the minrev and
the nmi series.

Cheers,
Ashok

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH v1 Part2 3/5] x86/microcode: Add a generic mechanism to declare support for minrev
  2023-01-13 17:29 ` [PATCH v1 Part2 3/5] x86/microcode: Add a generic mechanism to declare support for minrev Ashok Raj
@ 2023-01-20  0:15   ` Thomas Gleixner
  2023-01-20  2:19     ` Ashok Raj
  2023-01-21 21:35     ` [Part 2 v2[cleanup] 0/4] Some additional cleanups in microcode Ashok Raj
  0 siblings, 2 replies; 20+ messages in thread
From: Thomas Gleixner @ 2023-01-20  0:15 UTC (permalink / raw)
  To: Ashok Raj, Borislav Petkov
  Cc: Ashok Raj, Tony Luck, LKML, x86, Ingo Molnar, Dave Hansen,
	Alison Schofield, Reinette Chatre, Tom Lendacky,
	Stefan Talpalaru, David Woodhouse, Benjamin Herrenschmidt,
	Jonathan Corbet, Rafael J . Wysocki, Peter Zilstra,
	Andy Lutomirski, Andrew Cooper

Ashok!

On Fri, Jan 13 2023 at 09:29, Ashok Raj wrote:
> Intel microcode adds some meta-data to report a minimum required revision
> before this new microcode can be safely late loaded. There are no generic

s/this new microcode/a new microcode revision/

Changelogs are not restricted by twitter posting rules.

> mechanism to declare support for all vendors.
>
> Add generic support to microcode core to declare such support, this allows
> late-loading to be permitted in those architectures that report support
> for safe late loading.
>
> Late loading has added support for
>
> - New images declaring a required minimum base version before a late-load
>   is performed.
>
> Tainting only happens on architectures that don't support minimum required
> version reporting.
>
> Add a new variable in microcode_ops to allow an architecture to declare
> support for safe microcode late loading.
> @@ -487,13 +488,22 @@ static ssize_t reload_store(struct device *dev,
>  	if (ret)
>  		goto put;
>  
> +	safe_late_load = microcode_ops->safe_late_load;
> +
> +	/*
> +	 * If safe loading indication isn't present, bail out.
> +	 */
> +	if (!safe_late_load) {
> +		pr_err("Attempting late microcode loading - it is dangerous and taints the kernel.\n");
> +		pr_err("You should switch to early loading, if possible.\n");
> +		ret = -EINVAL;
> +		goto put;
> +	}
> +
>  	tmp_ret = microcode_ops->request_microcode_fw(bsp, &microcode_pdev->dev);
>  	if (tmp_ret != UCODE_NEW)
>  		goto put;
>  
> -	pr_err("Attempting late microcode loading - it is dangerous and taints the kernel.\n");
> -	pr_err("You should switch to early loading, if possible.\n");
> -

Why are you not moving the pr_err()s right away (in 1/5) to the place
where you move it now?

>  	mutex_lock(&microcode_mutex);
>  	ret = microcode_reload_late();
>  	mutex_unlock(&microcode_mutex);
> @@ -501,11 +511,16 @@ static ssize_t reload_store(struct device *dev,
>  put:
>  	cpus_read_unlock();
>  
> +	/*
> +	 * Only taint if a successful load and vendor doesn't support
> +	 * safe_late_load
> +	 */
> +	if (!(ret && safe_late_load))
> +		add_taint(TAINT_CPU_OUT_OF_SPEC, LOCKDEP_STILL_OK);

The resulting code is undecodable garbage. Whats worse is that the
existing logic in this code is broken already.

#1
	ssize_t ret = 0;

This 'ret = 0' assignment is pointless as ret is immediately overwritten
by the next line:

	ret = kstrtoul(buf, 0, &val);
	if (ret)
		return ret;

	if (val != 1)
		return size;

Now this is really useful. If the value is invalid, i.e. it causes the
function to abort immediately it returns 'size' which means the write
was successful. Oh well.

Now lets look at a few lines further down:

#2

	ssize_t ret = 0;
        ...
        ret = check_online_cpus();
	if (ret)
		goto put;
        ...
put:
        ...
	add_taint(TAINT_CPU_OUT_OF_SPEC, LOCKDEP_STILL_OK);
        ...
        return ret;

Why are we tainting the kernel when there was absolutely ZERO action
done here? All what check_online_cpus() figured out was that not enough
CPUs were online, right? That justfies a error return, but the taint is
bogus, no?

The next bogosity is:

	ssize_t ret = 0;
        ...
        tmp_ret = microcode_ops->request_microcode_fw(bsp, &microcode_pdev->dev);
	if (tmp_ret != UCODE_NEW)
		goto put;
        ...    
put:
        ...
	add_taint(TAINT_CPU_OUT_OF_SPEC, LOCKDEP_STILL_OK);

	if (ret == 0)
		ret = size;

        return ret;

IOW, the microcode request can fail for whatever reason and the return
value is unconditionally 'size' which means the write to the sysfs file
is successfull.

#3

Not to talk about the completely broken error handling in the actual
microcode loading case in __reload_late()::wait_for_siblings code path.

Maybe more #...

How does any of this make sense and allows sensible scripting of this
interface?

Surely you spent several orders of magnitude more time to stare at this
code than I did during this review, no?

Now instead of noticing and fixing any of this nonsense you are duct
taping this whole safe_late_load handling into that mess to make it even
more incomprehensible.

If you expected an alternative patch here, then I have to disappoint
you.

I'm not presenting you the proper solution this time on a silver tablet
because I'm in the process of taming my 'let me fix this for you' reflex
to prepare for my retirement some years down the road.

But you should have enough hints to fix all of this for real, right?

Thanks,

        tglx

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH v1 Part2 3/5] x86/microcode: Add a generic mechanism to declare support for minrev
  2023-01-20  0:15   ` Thomas Gleixner
@ 2023-01-20  2:19     ` Ashok Raj
  2023-01-21 21:35     ` [Part 2 v2[cleanup] 0/4] Some additional cleanups in microcode Ashok Raj
  1 sibling, 0 replies; 20+ messages in thread
From: Ashok Raj @ 2023-01-20  2:19 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: Borislav Petkov, Tony Luck, LKML, x86, Ingo Molnar, Dave Hansen,
	Alison Schofield, Reinette Chatre, Tom Lendacky,
	Stefan Talpalaru, David Woodhouse, Benjamin Herrenschmidt,
	Jonathan Corbet, Rafael J . Wysocki, Peter Zilstra,
	Andy Lutomirski, Andrew Cooper, Ashok Raj

On Fri, Jan 20, 2023 at 01:15:04AM +0100, Thomas Gleixner wrote:
> Ashok!

I know I'm in trouble when it starts like this :-(

> 
> On Fri, Jan 13 2023 at 09:29, Ashok Raj wrote:
> > Intel microcode adds some meta-data to report a minimum required revision
> > before this new microcode can be safely late loaded. There are no generic
> 
> s/this new microcode/a new microcode revision/
> 
> Changelogs are not restricted by twitter posting rules.

:-)

> 
> > mechanism to declare support for all vendors.
> >
> > Add generic support to microcode core to declare such support, this allows
> > late-loading to be permitted in those architectures that report support
> > for safe late loading.
> >
> > Late loading has added support for
> >
> > - New images declaring a required minimum base version before a late-load
> >   is performed.
> >
> > Tainting only happens on architectures that don't support minimum required
> > version reporting.
> >
> > Add a new variable in microcode_ops to allow an architecture to declare
> > support for safe microcode late loading.
> > @@ -487,13 +488,22 @@ static ssize_t reload_store(struct device *dev,
> >  	if (ret)
> >  		goto put;
> >  
> > +	safe_late_load = microcode_ops->safe_late_load;
> > +
> > +	/*
> > +	 * If safe loading indication isn't present, bail out.
> > +	 */
> > +	if (!safe_late_load) {
> > +		pr_err("Attempting late microcode loading - it is dangerous and taints the kernel.\n");
> > +		pr_err("You should switch to early loading, if possible.\n");
> > +		ret = -EINVAL;
> > +		goto put;
> > +	}
> > +
> >  	tmp_ret = microcode_ops->request_microcode_fw(bsp, &microcode_pdev->dev);
> >  	if (tmp_ret != UCODE_NEW)
> >  		goto put;
> >  
> > -	pr_err("Attempting late microcode loading - it is dangerous and taints the kernel.\n");
> > -	pr_err("You should switch to early loading, if possible.\n");
> > -
> 
> Why are you not moving the pr_err()s right away (in 1/5) to the place
> where you move it now?

Could have, didn't occur then. But I can move them to the proper place in
patch1.

> 
> >  	mutex_lock(&microcode_mutex);
> >  	ret = microcode_reload_late();
> >  	mutex_unlock(&microcode_mutex);
> > @@ -501,11 +511,16 @@ static ssize_t reload_store(struct device *dev,
> >  put:
> >  	cpus_read_unlock();
> >  
> > +	/*
> > +	 * Only taint if a successful load and vendor doesn't support
> > +	 * safe_late_load
> > +	 */
> > +	if (!(ret && safe_late_load))
> > +		add_taint(TAINT_CPU_OUT_OF_SPEC, LOCKDEP_STILL_OK);
> 
> The resulting code is undecodable garbage. Whats worse is that the
> existing logic in this code is broken already.

Yes, I agree, its hard to comprehend. I'll open it up a little to it makes
sense.

if successfully loaded, and !safe_late_load
       add_taint()


> 
> #1
> 	ssize_t ret = 0;
> 
> This 'ret = 0' assignment is pointless as ret is immediately overwritten
> by the next line:

This was existing code, but I can certainly remove the unneeded
initialization.

> 
> 	ret = kstrtoul(buf, 0, &val);
> 	if (ret)
> 		return ret;
> 
> 	if (val != 1)
> 		return size;
> 
> Now this is really useful. If the value is invalid, i.e. it causes the
> function to abort immediately it returns 'size' which means the write
> was successful. Oh well.

Yes, its a bit awkward. This is how its been forever. 

I wasn't sure if the purpose was values other than 1 don't throw error, so
it could be used to accommodate some extended functionality say "echo X"
in the future.

I'm not suggesting such use :-), but thought that maybe the reason to not
report error. 

If its acceptable to return like -EINVAL or something we could do that, so
there is some error user can catch in user space.

> 
> Now lets look at a few lines further down:
> 
> #2
> 
> 	ssize_t ret = 0;
>         ...
>         ret = check_online_cpus();
> 	if (ret)
> 		goto put;
>         ...
> put:
>         ...
> 	add_taint(TAINT_CPU_OUT_OF_SPEC, LOCKDEP_STILL_OK);
>         ...
>         return ret;
> 
> Why are we tainting the kernel when there was absolutely ZERO action
> done here? All what check_online_cpus() figured out was that not enough
> CPUs were online, right? That justfies a error return, but the taint is
> bogus, no?

Agree!

This was the code that was introduced in 5.19 when we turned off
late-loading in the kernel. We try to fix it here, i.e only taint if the
loading was successful and safe_late_load wasn't set.

It should change after this patch? Or maybe you meant fix it to not taint
always before doing this change? 

> 
> The next bogosity is:
> 
> 	ssize_t ret = 0;
>         ...
>         tmp_ret = microcode_ops->request_microcode_fw(bsp, &microcode_pdev->dev);
> 	if (tmp_ret != UCODE_NEW)
> 		goto put;
>         ...    
> put:
>         ...
> 	add_taint(TAINT_CPU_OUT_OF_SPEC, LOCKDEP_STILL_OK);
> 
> 	if (ret == 0)
> 		ret = size;
> 
>         return ret;
> 
> IOW, the microcode request can fail for whatever reason and the return
> value is unconditionally 'size' which means the write to the sysfs file
> is successfull.

Loading can fail for some known reasons

- No file found
- File is either same or older rev than loaded

Should we return proper codes? Certainly can, but since this has been
around all this time, I'm worried someone who depends on this working this
way will now see failures when it didn't in the past.

> 
> #3
> 
> Not to talk about the completely broken error handling in the actual
> microcode loading case in __reload_late()::wait_for_siblings code path.
> 
> Maybe more #...

I'll need to stare at it more than I have .. 

If its busted, its not popping out.

It's a path that all CPUs go through for the exit rendezvous.

We let the secondary also do an apply_microcode(), just to update the
revision in the per-cpu structures. I could be wrong, but if we didn't
update the per-cpu rev, /proc/cpuinfo was reporting the old values. I
remember doing this way back in 2018 Spectre time.

Guess a multiple choice might be useful :-).. I'll keep looking though!

> 
> How does any of this make sense and allows sensible scripting of this
> interface?
> 
> Surely you spent several orders of magnitude more time to stare at this
> code than I did during this review, no?

Sadly yes!

> 
> Now instead of noticing and fixing any of this nonsense you are duct
> taping this whole safe_late_load handling into that mess to make it even
> more incomprehensible.

safe_late_loading didn't change any of the old algorithms for late-loading
itself. All I used it was a mechanism to inform the core that the vendor
supports some way to tell the minrev is comprehended. This doesn't change
any of the code paths we take for late-load.

When safe_late_load is supported, 

- we don't issue a warning, or taint the kernel.
- Vendor provides a way to check if the new microcode has the proper
  meta-data and honor that.

Did you have something more in mind?

> 
> If you expected an alternative patch here, then I have to disappoint
> you.

Disappointed .. No.. I'm glad this is coming up after 4 years. The next
Part3 that has the NMI handling sort of has something similar to hold HT
siblings in NMI before the update completes in primary. Better now than
late .. 

Thanks for all the direction.
> 
> I'm not presenting you the proper solution this time on a silver tablet
> because I'm in the process of taming my 'let me fix this for you' reflex
> to prepare for my retirement some years down the road.
> 
> But you should have enough hints to fix all of this for real, right?

Yes, once i can spot all those holes :-)

Cheers,
Ashok

^ permalink raw reply	[flat|nested] 20+ messages in thread

* [Part 2 v2[cleanup] 0/4] Some additional cleanups in microcode
  2023-01-20  0:15   ` Thomas Gleixner
  2023-01-20  2:19     ` Ashok Raj
@ 2023-01-21 21:35     ` Ashok Raj
  2023-01-21 21:35       ` [Part 2 v2[cleanup] 1/4] x86/microcode: Taint kernel only if microcode loading was successful Ashok Raj
                         ` (4 more replies)
  1 sibling, 5 replies; 20+ messages in thread
From: Ashok Raj @ 2023-01-21 21:35 UTC (permalink / raw)
  To: Thomas Gleixner, Borislav Petkov
  Cc: Ashok Raj, LKML, x86, Ingo Molnar, Tony Luck, Dave Hansen,
	Alison Schofield, Reinette Chatre, Tom Lendacky,
	Stefan Talpalaru, David Woodhouse, Benjamin Herrenschmidt,
	Jonathan Corbet, Rafael J . Wysocki, Peter Zilstra,
	Andy Lutomirski, Andrew Cooper, Boris Ostrovsky, Martin Pohlack

Hi Thomas and Boris,

This mini series is in response to Thomas's feedback here[1].

I hope this addresses all/most of your concerns you raised in this thread.

Sorry if I missed any, drop more hints and I'll fix them up.

Patch3 needs an AMD change as well. I wasn't confident looking at the
source that was returning the patchid vs reading fromt he real MSR. I'll
check with Boris to confirm before I submit this again.

[1] https://lore.kernel.org/lkml/87y1pygiyf.ffs@tglx/

Now that the other patches are now staged in tip/x86/microcode,
this series applies on top of that.

Cheers,
Ashok

Ashok Raj (4):
  x86/microcode: Taint kernel only if microcode loading was successful
  x86/microcode: Report invalid writes to reload sysfs file
  x86/microcode/intel: Fix collect_cpu_info() to reflect current
    microcode
  x86/microcode: Do not call apply_microde() on sibling threads

 arch/x86/kernel/cpu/microcode/core.c  | 28 ++++++++++++++-------------
 arch/x86/kernel/cpu/microcode/intel.c | 10 +++++++++-
 2 files changed, 24 insertions(+), 14 deletions(-)

Cc: LKML <linux-kernel@vger.kernel.org>
Cc: x86 <x86@kernel.org>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Alison Schofield <alison.schofield@intel.com>
Cc: Reinette Chatre <reinette.chatre@intel.com>
Cc: Thomas Gleixner (Intel) <tglx@linutronix.de>
Cc: Tom Lendacky <thomas.lendacky@amd.com>
Cc: Stefan Talpalaru <stefantalpalaru@yahoo.com>
Cc: David Woodhouse <dwmw2@infradead.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Rafael J. Wysocki <rafael@kernel.org>
Cc: Peter Zilstra (Intel) <peterz@infradead.org>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Andrew Cooper <Andrew.Cooper3@citrix.com>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: Martin Pohlack <mpohlack@amazon.de>

-- 
2.34.1


^ permalink raw reply	[flat|nested] 20+ messages in thread

* [Part 2 v2[cleanup] 1/4] x86/microcode: Taint kernel only if microcode loading was successful
  2023-01-21 21:35     ` [Part 2 v2[cleanup] 0/4] Some additional cleanups in microcode Ashok Raj
@ 2023-01-21 21:35       ` Ashok Raj
  2023-01-21 21:35       ` [Part 2 v2[cleanup] 2/4] x86/microcode: Report invalid writes to reload sysfs file Ashok Raj
                         ` (3 subsequent siblings)
  4 siblings, 0 replies; 20+ messages in thread
From: Ashok Raj @ 2023-01-21 21:35 UTC (permalink / raw)
  To: Thomas Gleixner, Borislav Petkov
  Cc: Ashok Raj, LKML, x86, Ingo Molnar, Tony Luck, Dave Hansen,
	Alison Schofield, Reinette Chatre, Tom Lendacky,
	Stefan Talpalaru, David Woodhouse, Benjamin Herrenschmidt,
	Jonathan Corbet, Rafael J . Wysocki, Peter Zilstra,
	Andy Lutomirski, Andrew Cooper, Boris Ostrovsky, Martin Pohlack

Currently when late loading is aborted due to check_online_cpu(), kernel
still ends up tainting the kernel.

Taint only when microcode loading was successful.

Suggested-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ashok Raj <ashok.raj@intel.com>
Cc: LKML <linux-kernel@vger.kernel.org>
Cc: x86 <x86@kernel.org>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Alison Schofield <alison.schofield@intel.com>
Cc: Reinette Chatre <reinette.chatre@intel.com>
Cc: Thomas Gleixner (Intel) <tglx@linutronix.de>
Cc: Tom Lendacky <thomas.lendacky@amd.com>
Cc: Stefan Talpalaru <stefantalpalaru@yahoo.com>
Cc: David Woodhouse <dwmw2@infradead.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Rafael J. Wysocki <rafael@kernel.org>
Cc: Peter Zilstra (Intel) <peterz@infradead.org>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Andrew Cooper <Andrew.Cooper3@citrix.com>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: Martin Pohlack <mpohlack@amazon.de>
---
 arch/x86/kernel/cpu/microcode/core.c | 13 +++++++++----
 1 file changed, 9 insertions(+), 4 deletions(-)

diff --git a/arch/x86/kernel/cpu/microcode/core.c b/arch/x86/kernel/cpu/microcode/core.c
index d7cbc83df9b6..c5d80ff00b4e 100644
--- a/arch/x86/kernel/cpu/microcode/core.c
+++ b/arch/x86/kernel/cpu/microcode/core.c
@@ -475,6 +475,7 @@ static ssize_t reload_store(struct device *dev,
 	enum ucode_state tmp_ret = UCODE_OK;
 	int bsp = boot_cpu_data.cpu_index;
 	unsigned long val;
+	int load_ret = -1;
 	ssize_t ret = 0;
 
 	ret = kstrtoul(buf, 0, &val);
@@ -495,16 +496,20 @@ static ssize_t reload_store(struct device *dev,
 		goto put;
 
 	mutex_lock(&microcode_mutex);
-	ret = microcode_reload_late();
+	load_ret = microcode_reload_late();
 	mutex_unlock(&microcode_mutex);
 
 put:
 	cpus_read_unlock();
 
-	if (ret == 0)
+	/*
+	 * Taint only when loading was successful
+	 */
+	if (load_ret == 0) {
 		ret = size;
-
-	add_taint(TAINT_CPU_OUT_OF_SPEC, LOCKDEP_STILL_OK);
+		add_taint(TAINT_CPU_OUT_OF_SPEC, LOCKDEP_STILL_OK);
+		pr_warn("Microcode late loading tainted the kernel\n");
+	}
 
 	return ret;
 }
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [Part 2 v2[cleanup] 2/4] x86/microcode: Report invalid writes to reload sysfs file
  2023-01-21 21:35     ` [Part 2 v2[cleanup] 0/4] Some additional cleanups in microcode Ashok Raj
  2023-01-21 21:35       ` [Part 2 v2[cleanup] 1/4] x86/microcode: Taint kernel only if microcode loading was successful Ashok Raj
@ 2023-01-21 21:35       ` Ashok Raj
  2023-01-21 21:35       ` [Part 2 v2[cleanup] 3/4] x86/microcode/intel: Fix collect_cpu_info() to reflect current microcode Ashok Raj
                         ` (2 subsequent siblings)
  4 siblings, 0 replies; 20+ messages in thread
From: Ashok Raj @ 2023-01-21 21:35 UTC (permalink / raw)
  To: Thomas Gleixner, Borislav Petkov
  Cc: Ashok Raj, LKML, x86, Ingo Molnar, Tony Luck, Dave Hansen,
	Alison Schofield, Reinette Chatre, Tom Lendacky,
	Stefan Talpalaru, David Woodhouse, Benjamin Herrenschmidt,
	Jonathan Corbet, Rafael J . Wysocki, Peter Zilstra,
	Andy Lutomirski, Andrew Cooper, Boris Ostrovsky, Martin Pohlack

Currently microcode reload sysfs file only accepts a value of 1. But when
user writes other values we don't report error, but just treat them as
silent success without performing a load.

Report those erroneous writes back to user.

Suggested-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ashok Raj <ashok.raj@intel.com>
Cc: LKML <linux-kernel@vger.kernel.org>
Cc: x86 <x86@kernel.org>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Alison Schofield <alison.schofield@intel.com>
Cc: Reinette Chatre <reinette.chatre@intel.com>
Cc: Thomas Gleixner (Intel) <tglx@linutronix.de>
Cc: Tom Lendacky <thomas.lendacky@amd.com>
Cc: Stefan Talpalaru <stefantalpalaru@yahoo.com>
Cc: David Woodhouse <dwmw2@infradead.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Rafael J. Wysocki <rafael@kernel.org>
Cc: Peter Zilstra (Intel) <peterz@infradead.org>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Andrew Cooper <Andrew.Cooper3@citrix.com>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: Martin Pohlack <mpohlack@amazon.de>
---
 arch/x86/kernel/cpu/microcode/core.c | 7 ++-----
 1 file changed, 2 insertions(+), 5 deletions(-)

diff --git a/arch/x86/kernel/cpu/microcode/core.c b/arch/x86/kernel/cpu/microcode/core.c
index c5d80ff00b4e..6ade3d59c404 100644
--- a/arch/x86/kernel/cpu/microcode/core.c
+++ b/arch/x86/kernel/cpu/microcode/core.c
@@ -479,11 +479,8 @@ static ssize_t reload_store(struct device *dev,
 	ssize_t ret = 0;
 
 	ret = kstrtoul(buf, 0, &val);
-	if (ret)
-		return ret;
-
-	if (val != 1)
-		return size;
+	if (ret || val != 1)
+		return -EINVAL;
 
 	cpus_read_lock();
 
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [Part 2 v2[cleanup] 3/4] x86/microcode/intel: Fix collect_cpu_info() to reflect current microcode
  2023-01-21 21:35     ` [Part 2 v2[cleanup] 0/4] Some additional cleanups in microcode Ashok Raj
  2023-01-21 21:35       ` [Part 2 v2[cleanup] 1/4] x86/microcode: Taint kernel only if microcode loading was successful Ashok Raj
  2023-01-21 21:35       ` [Part 2 v2[cleanup] 2/4] x86/microcode: Report invalid writes to reload sysfs file Ashok Raj
@ 2023-01-21 21:35       ` Ashok Raj
  2023-01-21 21:35       ` [Part 2 v2[cleanup] 4/4] x86/microcode: Do not call apply_microde() on sibling threads Ashok Raj
  2023-01-26 16:52       ` [Part 2 v2[cleanup] 0/4] Some additional cleanups in microcode Ashok Raj
  4 siblings, 0 replies; 20+ messages in thread
From: Ashok Raj @ 2023-01-21 21:35 UTC (permalink / raw)
  To: Thomas Gleixner, Borislav Petkov
  Cc: Ashok Raj, LKML, x86, Ingo Molnar, Tony Luck, Dave Hansen,
	Alison Schofield, Reinette Chatre, Tom Lendacky,
	Stefan Talpalaru, David Woodhouse, Benjamin Herrenschmidt,
	Jonathan Corbet, Rafael J . Wysocki, Peter Zilstra,
	Andy Lutomirski, Andrew Cooper, Boris Ostrovsky, Martin Pohlack

Currently collect_cpu_info() is only returning what was cached earlier
instead of reading the current revision from the proper MSR.

Collect the current revision and report that value instead of reflecting
what was cached in the past.

[TBD:
    Need to change microcode/amd.c. I didn't quite follow the logic since
    it reports the patch from the pathfile instead of reading the real
    PATCH_LEVEL MSR.
]

Signed-off-by: Ashok Raj <ashok.raj@intel.com>
Cc: LKML <linux-kernel@vger.kernel.org>
Cc: x86 <x86@kernel.org>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Alison Schofield <alison.schofield@intel.com>
Cc: Reinette Chatre <reinette.chatre@intel.com>
Cc: Thomas Gleixner (Intel) <tglx@linutronix.de>
Cc: Tom Lendacky <thomas.lendacky@amd.com>
Cc: Stefan Talpalaru <stefantalpalaru@yahoo.com>
Cc: David Woodhouse <dwmw2@infradead.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Rafael J. Wysocki <rafael@kernel.org>
Cc: Peter Zilstra (Intel) <peterz@infradead.org>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Andrew Cooper <Andrew.Cooper3@citrix.com>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: Martin Pohlack <mpohlack@amazon.de>
---
 arch/x86/kernel/cpu/microcode/intel.c | 10 +++++++++-
 1 file changed, 9 insertions(+), 1 deletion(-)

diff --git a/arch/x86/kernel/cpu/microcode/intel.c b/arch/x86/kernel/cpu/microcode/intel.c
index 4471d418f28a..be830944178c 100644
--- a/arch/x86/kernel/cpu/microcode/intel.c
+++ b/arch/x86/kernel/cpu/microcode/intel.c
@@ -543,6 +543,13 @@ static int collect_cpu_info(int cpu_num, struct cpu_signature *csig)
 {
 	struct cpuinfo_x86 *c = &cpu_data(cpu_num);
 	unsigned int val[2];
+	int rev;
+
+	/*
+	 * intel_get_microcode_revision() reads a per-core MSR
+	 * to read the revision (MSR_IA32_UCODE_REV).
+	 */
+	WARN_ON_ONCE(cpu_num != smp_processor_id());
 
 	memset(csig, 0, sizeof(*csig));
 
@@ -554,7 +561,8 @@ static int collect_cpu_info(int cpu_num, struct cpu_signature *csig)
 		csig->pf = 1 << ((val[1] >> 18) & 7);
 	}
 
-	csig->rev = c->microcode;
+	rev = intel_get_microcode_revision();
+	csig->rev = c->microcode = rev;
 
 	return 0;
 }
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [Part 2 v2[cleanup] 4/4] x86/microcode: Do not call apply_microde() on sibling threads
  2023-01-21 21:35     ` [Part 2 v2[cleanup] 0/4] Some additional cleanups in microcode Ashok Raj
                         ` (2 preceding siblings ...)
  2023-01-21 21:35       ` [Part 2 v2[cleanup] 3/4] x86/microcode/intel: Fix collect_cpu_info() to reflect current microcode Ashok Raj
@ 2023-01-21 21:35       ` Ashok Raj
  2023-01-22  3:36         ` Ashok Raj
  2023-01-26 16:52       ` [Part 2 v2[cleanup] 0/4] Some additional cleanups in microcode Ashok Raj
  4 siblings, 1 reply; 20+ messages in thread
From: Ashok Raj @ 2023-01-21 21:35 UTC (permalink / raw)
  To: Thomas Gleixner, Borislav Petkov
  Cc: Ashok Raj, LKML, x86, Ingo Molnar, Tony Luck, Dave Hansen,
	Alison Schofield, Reinette Chatre, Tom Lendacky,
	Stefan Talpalaru, David Woodhouse, Benjamin Herrenschmidt,
	Jonathan Corbet, Rafael J . Wysocki, Peter Zilstra,
	Andy Lutomirski, Andrew Cooper, Boris Ostrovsky, Martin Pohlack

Microcode updates are applied at the core, and both threads of HT siblings
will notice the update.

During late-load, after the primary has updated the microcode, it also
reflects that in the per-cpu structure (cpuinfo_x86) holding the current
revision.

Since the sibling hasn't had a chance to update the per-cpu revision,
the current code calls apply_microcode() just as a way to verify and also
update the per-cpu revision number.

But in the odd case when primary returned with an error, the secondary will
try to perform a patchload and the primary has already been released to the
system. This could be problematic.

Replace apply_microcode() with a call to collect_cpu_info() and let that
call also update the per-cpu structure instead of returning the previously
cached values.

Suggested-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ashok Raj <ashok.raj@intel.com>
Cc: LKML <linux-kernel@vger.kernel.org>
Cc: x86 <x86@kernel.org>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Alison Schofield <alison.schofield@intel.com>
Cc: Reinette Chatre <reinette.chatre@intel.com>
Cc: Thomas Gleixner (Intel) <tglx@linutronix.de>
Cc: Tom Lendacky <thomas.lendacky@amd.com>
Cc: Stefan Talpalaru <stefantalpalaru@yahoo.com>
Cc: David Woodhouse <dwmw2@infradead.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Rafael J. Wysocki <rafael@kernel.org>
Cc: Peter Zilstra (Intel) <peterz@infradead.org>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Andrew Cooper <Andrew.Cooper3@citrix.com>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: Martin Pohlack <mpohlack@amazon.de>
---
 arch/x86/kernel/cpu/microcode/core.c | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/arch/x86/kernel/cpu/microcode/core.c b/arch/x86/kernel/cpu/microcode/core.c
index 6ade3d59c404..089636b1643f 100644
--- a/arch/x86/kernel/cpu/microcode/core.c
+++ b/arch/x86/kernel/cpu/microcode/core.c
@@ -386,6 +386,7 @@ static int __wait_for_cpus(atomic_t *t, long long timeout)
  */
 static int __reload_late(void *info)
 {
+	struct ucode_cpu_info *uci = ucode_cpu_info + cpu;
 	int cpu = smp_processor_id();
 	enum ucode_state err;
 	int ret = 0;
@@ -422,12 +423,11 @@ static int __reload_late(void *info)
 
 	/*
 	 * At least one thread has completed update on each core.
-	 * For others, simply call the update to make sure the
-	 * per-cpu cpuinfo can be updated with right microcode
-	 * revision.
+	 * For siblings, collect the cpuinfo and update the
+	 * per-cpu cpuinfo with the current microcode revision.
 	 */
 	if (cpumask_first(topology_sibling_cpumask(cpu)) != cpu)
-		err = microcode_ops->apply_microcode(cpu);
+		microcode_ops->collect_cpu_info(cpu, &uci->cpu_sig);
 
 	return ret;
 }
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 20+ messages in thread

* Re: [Part 2 v2[cleanup] 4/4] x86/microcode: Do not call apply_microde() on sibling threads
  2023-01-21 21:35       ` [Part 2 v2[cleanup] 4/4] x86/microcode: Do not call apply_microde() on sibling threads Ashok Raj
@ 2023-01-22  3:36         ` Ashok Raj
  0 siblings, 0 replies; 20+ messages in thread
From: Ashok Raj @ 2023-01-22  3:36 UTC (permalink / raw)
  To: Thomas Gleixner, Borislav Petkov
  Cc: LKML, x86, Ingo Molnar, Tony Luck, Dave Hansen, Alison Schofield,
	Reinette Chatre, Tom Lendacky, Stefan Talpalaru, David Woodhouse,
	Benjamin Herrenschmidt, Jonathan Corbet, Rafael J . Wysocki,
	Peter Zilstra, Andy Lutomirski, Andrew Cooper, Boris Ostrovsky,
	Martin Pohlack, Ashok Raj

On Sat, Jan 21, 2023 at 01:35:12PM -0800, Ashok Raj wrote:

[snip]

> ---
>  arch/x86/kernel/cpu/microcode/core.c | 8 ++++----
>  1 file changed, 4 insertions(+), 4 deletions(-)
> 
> diff --git a/arch/x86/kernel/cpu/microcode/core.c b/arch/x86/kernel/cpu/microcode/core.c
> index 6ade3d59c404..089636b1643f 100644
> --- a/arch/x86/kernel/cpu/microcode/core.c
> +++ b/arch/x86/kernel/cpu/microcode/core.c
> @@ -386,6 +386,7 @@ static int __wait_for_cpus(atomic_t *t, long long timeout)
>   */
>  static int __reload_late(void *info)
>  {
> +	struct ucode_cpu_info *uci = ucode_cpu_info + cpu;

In a quest to keep the christmas tree effect, screwed up the ordering. 
I fixed it before i resent the next time. Modified diff below.

>  	int cpu = smp_processor_id();
>  	enum ucode_state err;
>  	int ret = 0;
> @@ -422,12 +423,11 @@ static int __reload_late(void *info)
>  
>  	/*
>  	 * At least one thread has completed update on each core.
> -	 * For others, simply call the update to make sure the
> -	 * per-cpu cpuinfo can be updated with right microcode
> -	 * revision.
> +	 * For siblings, collect the cpuinfo and update the
> +	 * per-cpu cpuinfo with the current microcode revision.
>  	 */
>  	if (cpumask_first(topology_sibling_cpumask(cpu)) != cpu)
> -		err = microcode_ops->apply_microcode(cpu);
> +		microcode_ops->collect_cpu_info(cpu, &uci->cpu_sig);
>  
>  	return ret;
>  }
> -- 
> 2.34.1
> 

diff --git a/arch/x86/kernel/cpu/microcode/core.c b/arch/x86/kernel/cpu/microcode/core.c
index 6ade3d59c404..07764c1a2dd3 100644
--- a/arch/x86/kernel/cpu/microcode/core.c
+++ b/arch/x86/kernel/cpu/microcode/core.c
@@ -387,6 +387,7 @@ static int __wait_for_cpus(atomic_t *t, long long timeout)
 static int __reload_late(void *info)
 {
        int cpu = smp_processor_id();
+       struct ucode_cpu_info *uci;
        enum ucode_state err;
        int ret = 0;

@@ -422,12 +423,13 @@ static int __reload_late(void *info)

        /*
         * At least one thread has completed update on each core.
-        * For others, simply call the update to make sure the
-        * per-cpu cpuinfo can be updated with right microcode
-        * revision.
+        * For siblings, collect the cpuinfo and update the
+        * per-cpu cpuinfo with the current microcode revision.
         */
-       if (cpumask_first(topology_sibling_cpumask(cpu)) != cpu)
-               err = microcode_ops->apply_microcode(cpu);
+       if (cpumask_first(topology_sibling_cpumask(cpu)) != cpu) {
+               uci = ucode_cpu_info + cpu;
+               microcode_ops->collect_cpu_info(cpu, &uci->cpu_sig);
+       }

        return ret;
 }

^ permalink raw reply related	[flat|nested] 20+ messages in thread

* Re: [Part 2 v2[cleanup] 0/4] Some additional cleanups in microcode
  2023-01-21 21:35     ` [Part 2 v2[cleanup] 0/4] Some additional cleanups in microcode Ashok Raj
                         ` (3 preceding siblings ...)
  2023-01-21 21:35       ` [Part 2 v2[cleanup] 4/4] x86/microcode: Do not call apply_microde() on sibling threads Ashok Raj
@ 2023-01-26 16:52       ` Ashok Raj
  2023-01-27 13:49         ` Borislav Petkov
  4 siblings, 1 reply; 20+ messages in thread
From: Ashok Raj @ 2023-01-26 16:52 UTC (permalink / raw)
  To: Thomas Gleixner, Borislav Petkov
  Cc: LKML, x86, Ingo Molnar, Tony Luck, Dave Hansen, Alison Schofield,
	Reinette Chatre, Tom Lendacky, Stefan Talpalaru, David Woodhouse,
	Benjamin Herrenschmidt, Jonathan Corbet, Rafael J . Wysocki,
	Peter Zilstra, Andy Lutomirski, Andrew Cooper, Boris Ostrovsky,
	Martin Pohlack, Ashok Raj

Hi Thomas,

On Sat, Jan 21, 2023 at 01:35:08PM -0800, Ashok Raj wrote:
> Hi Thomas and Boris,
> 
> This mini series is in response to Thomas's feedback here[1].
> 
> I hope this addresses all/most of your concerns you raised in this thread.
> 
> Sorry if I missed any, drop more hints and I'll fix them up.
> 
> Patch3 needs an AMD change as well. I wasn't confident looking at the
> source that was returning the patchid vs reading fromt he real MSR. I'll
> check with Boris to confirm before I submit this again.
> 
> [1] https://lore.kernel.org/lkml/87y1pygiyf.ffs@tglx/

I can send you a new series with these cleanups. If you think I left
anything out please let me know.

Boris: Do you have any feedback on this Part2 of the path series? I can
wrap those as well with these that address Thomas's feedback if possible.

Thanks,
Ashok

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [Part 2 v2[cleanup] 0/4] Some additional cleanups in microcode
  2023-01-26 16:52       ` [Part 2 v2[cleanup] 0/4] Some additional cleanups in microcode Ashok Raj
@ 2023-01-27 13:49         ` Borislav Petkov
  0 siblings, 0 replies; 20+ messages in thread
From: Borislav Petkov @ 2023-01-27 13:49 UTC (permalink / raw)
  To: Ashok Raj
  Cc: Thomas Gleixner, LKML, x86, Ingo Molnar, Tony Luck, Dave Hansen,
	Alison Schofield, Reinette Chatre, Tom Lendacky,
	Stefan Talpalaru, David Woodhouse, Benjamin Herrenschmidt,
	Jonathan Corbet, Rafael J . Wysocki, Peter Zilstra,
	Andy Lutomirski, Andrew Cooper, Boris Ostrovsky, Martin Pohlack

On Thu, Jan 26, 2023 at 08:52:30AM -0800, Ashok Raj wrote:
> Boris: Do you have any feedback on this Part2 of the path series? I can
> wrap those as well with these that address Thomas's feedback if possible.

I have no clue what I need to look at first - the part 2 or the cleanup? And
what goes where.

I think you should send a proper, tested(!), new revision of the whole set for
review.

Thx.

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette

^ permalink raw reply	[flat|nested] 20+ messages in thread

end of thread, other threads:[~2023-01-27 13:49 UTC | newest]

Thread overview: 20+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-01-13 17:29 [PATCH v1 Part2 0/5] Declare safe late loadable microcode Ashok Raj
2023-01-13 17:29 ` [PATCH v1 Part2 1/5] x86/microcode: Move late load warning to the same function that taints kernel Ashok Raj
2023-01-19 21:48   ` Thomas Gleixner
2023-01-19 22:05     ` Ashok Raj
2023-01-13 17:29 ` [PATCH v1 Part2 2/5] x86/microcode/intel: Add minimum required revision to microcode header Ashok Raj
2023-01-19 22:03   ` Thomas Gleixner
2023-01-19 22:34     ` Ashok Raj
2023-01-13 17:29 ` [PATCH v1 Part2 3/5] x86/microcode: Add a generic mechanism to declare support for minrev Ashok Raj
2023-01-20  0:15   ` Thomas Gleixner
2023-01-20  2:19     ` Ashok Raj
2023-01-21 21:35     ` [Part 2 v2[cleanup] 0/4] Some additional cleanups in microcode Ashok Raj
2023-01-21 21:35       ` [Part 2 v2[cleanup] 1/4] x86/microcode: Taint kernel only if microcode loading was successful Ashok Raj
2023-01-21 21:35       ` [Part 2 v2[cleanup] 2/4] x86/microcode: Report invalid writes to reload sysfs file Ashok Raj
2023-01-21 21:35       ` [Part 2 v2[cleanup] 3/4] x86/microcode/intel: Fix collect_cpu_info() to reflect current microcode Ashok Raj
2023-01-21 21:35       ` [Part 2 v2[cleanup] 4/4] x86/microcode: Do not call apply_microde() on sibling threads Ashok Raj
2023-01-22  3:36         ` Ashok Raj
2023-01-26 16:52       ` [Part 2 v2[cleanup] 0/4] Some additional cleanups in microcode Ashok Raj
2023-01-27 13:49         ` Borislav Petkov
2023-01-13 17:29 ` [PATCH v1 Part2 4/5] x86/microcode/intel: Drop wbinvd() from microcode loading Ashok Raj
2023-01-13 17:29 ` [PATCH v1 Part2 5/5] x86/microcode: Provide an option to override minrev enforcement Ashok Raj

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).