From: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
To: linux-kernel@vger.kernel.org
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	stable@vger.kernel.org, Erwin Tsaur <erwin.tsaur@intel.com>,
	0day robot <lkp@intel.com>,
	Dan Williams <dan.j.williams@intel.com>,
	Borislav Petkov <bp@suse.de>, Tony Luck <tony.luck@intel.com>
Subject: [PATCH 5.8 25/70] x86/copy_mc: Introduce copy_mc_enhanced_fast_string()
Date: Sat, 31 Oct 2020 12:35:57 +0100
Message-ID: <20201031113500.710656518@linuxfoundation.org>
In-Reply-To: <20201031113459.481803250@linuxfoundation.org>

From: Dan Williams <dan.j.williams@intel.com>

commit 5da8e4a658109e3b7e1f45ae672b7c06ac3e7158 upstream.

The motivations for reworking memcpy_mcsafe() are that the benefit of
doing slow and careful copies is obviated on newer CPUs, and that the
current opt-in list of CPUs to instrument recovery is broken relative to
those CPUs.  There is no need to keep an opt-in list up to date on an
ongoing basis if pmem/dax operations are instrumented for recovery by
default. With recovery enabled by default, the old "mcsafe_key" opt-in to
careful copying can be made a "fragile" opt-out, where the "fragile"
list takes steps to not consume poison across cachelines.

The discussion with Linus made clear that the current "_mcsafe" suffix
was imprecise to a fault. The operations that are needed by pmem/dax are
to copy from a source address that might throw #MC to a destination that
may write-fault, if it is a user page.

So copy_to_user_mcsafe() becomes copy_mc_to_user() to indicate
the separate precautions taken on source and destination.
copy_mc_to_kernel() is introduced as a non-SMAP version that does not
expect write-faults on the destination, but is still prepared to abort
with an error code upon taking #MC.
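
As a rough userspace illustration of the return convention these helpers
share (0 on success, otherwise the number of bytes not copied), consider
the sketch below. It is a model only: model_copy_mc(), bytes_copied() and
the 'poison_off' parameter are hypothetical names invented for this
example, and the "poison" is simulated with an offset rather than a real
machine check.

```c
#include <stddef.h>
#include <string.h>

/*
 * Userspace model only: 'poison_off' marks the first source byte that
 * cannot be read, standing in for a machine check on real hardware.
 */
static unsigned long model_copy_mc(void *dst, const void *src,
				   unsigned long len, unsigned long poison_off)
{
	/* Number of bytes that can be read before hitting the poison. */
	unsigned long ok = len < poison_off ? len : poison_off;

	memcpy(dst, src, ok);
	return len - ok;	/* 0 on success, else bytes not copied */
}

/* A caller can recover how many bytes did arrive from the remainder. */
static unsigned long bytes_copied(unsigned long len, unsigned long rem)
{
	return len - rem;
}
```

A caller in this model treats any non-zero return as a short copy and
uses bytes_copied() to learn how much of the destination is valid.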

The original copy_mc_fragile() implementation had negative performance
implications since it did not use the fast-string instruction sequence
to perform copies. For this reason copy_mc_to_kernel() fell back to
plain memcpy() to preserve performance on platforms that did not indicate
the capability to recover from machine check exceptions. However, that
capability detection was not architectural, and now that some platforms
can recover from fast-string consumption of memory errors, the memcpy()
fallback causes these more capable platforms to fail.

Introduce copy_mc_enhanced_fast_string() as the fast default
implementation of copy_mc_to_kernel() and finalize the transition of
copy_mc_fragile() to be a platform quirk to indicate 'copy-carefully'.
With this in place, copy_mc_to_kernel() is fast and recovery-ready by
default regardless of hardware capability.
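
The selection order described above can be sketched in userspace as
follows. This is an illustration, not kernel code: struct cpu_model,
enum copy_backend and pick_backend() are hypothetical stand-ins, with
the two booleans modelling copy_mc_fragile_enabled and
static_cpu_has(X86_FEATURE_ERMS) from the patch below.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical labels for the three copy paths named in the patch. */
enum copy_backend {
	BACKEND_FRAGILE,	/* copy_mc_fragile() */
	BACKEND_FAST_STRING,	/* copy_mc_enhanced_fast_string() */
	BACKEND_MEMCPY,		/* plain memcpy(), no recovery */
};

/* Hypothetical model of the per-platform state the kernel consults. */
struct cpu_model {
	bool fragile_quirk;	/* models copy_mc_fragile_enabled */
	bool has_erms;		/* models static_cpu_has(X86_FEATURE_ERMS) */
};

/* Mirrors the order of checks in copy_mc_to_kernel() after this patch. */
static enum copy_backend pick_backend(const struct cpu_model *cpu)
{
	if (cpu->fragile_quirk)
		return BACKEND_FRAGILE;
	if (cpu->has_erms)
		return BACKEND_FAST_STRING;
	return BACKEND_MEMCPY;
}
```

Note that the quirk check comes first, so a platform flagged as
'fragile' keeps the careful copy even when it also advertises ERMS.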

Thanks to Vivek for identifying that copy_user_generic() is not suitable
as the copy_mc_to_user() backend since the #MC handler explicitly checks
ex_has_fault_handler(). Thanks to the 0day robot for catching a
performance bug in the x86/copy_mc_to_user implementation.

 [ bp: Add the "why" for this change from the 0/2th message, massage. ]

Fixes: 92b0729c34ca ("x86/mm, x86/mce: Add memcpy_mcsafe()")
Reported-by: Erwin Tsaur <erwin.tsaur@intel.com>
Reported-by: 0day robot <lkp@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Tony Luck <tony.luck@intel.com>
Tested-by: Erwin Tsaur <erwin.tsaur@intel.com>
Cc: <stable@vger.kernel.org>
Link: https://lkml.kernel.org/r/160195562556.2163339.18063423034951948973.stgit@dwillia2-desk3.amr.corp.intel.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 arch/x86/lib/copy_mc.c    |   32 +++++++++++++++++++++++---------
 arch/x86/lib/copy_mc_64.S |   36 ++++++++++++++++++++++++++++++++++++
 tools/objtool/check.c     |    1 +
 3 files changed, 60 insertions(+), 9 deletions(-)

--- a/arch/x86/lib/copy_mc.c
+++ b/arch/x86/lib/copy_mc.c
@@ -45,6 +45,8 @@ void enable_copy_mc_fragile(void)
 #define copy_mc_fragile_enabled (0)
 #endif
 
+unsigned long copy_mc_enhanced_fast_string(void *dst, const void *src, unsigned len);
+
 /**
  * copy_mc_to_kernel - memory copy that handles source exceptions
  *
@@ -52,9 +54,11 @@ void enable_copy_mc_fragile(void)
  * @src:	source address
  * @len:	number of bytes to copy
  *
- * Call into the 'fragile' version on systems that have trouble
- * actually do machine check recovery. Everyone else can just
- * use memcpy().
+ * Call into the 'fragile' version on systems that benefit from avoiding
+ * corner case poison consumption scenarios. For example, accessing
+ * poison across 2 cachelines with a single instruction. Almost all
+ * other use cases can use copy_mc_enhanced_fast_string() for a fast
+ * recoverable copy, or fall back to plain memcpy.
  *
  * Return 0 for success, or number of bytes not copied if there was an
  * exception.
@@ -63,6 +67,8 @@ unsigned long __must_check copy_mc_to_ke
 {
 	if (copy_mc_fragile_enabled)
 		return copy_mc_fragile(dst, src, len);
+	if (static_cpu_has(X86_FEATURE_ERMS))
+		return copy_mc_enhanced_fast_string(dst, src, len);
 	memcpy(dst, src, len);
 	return 0;
 }
@@ -72,11 +78,19 @@ unsigned long __must_check copy_mc_to_us
 {
 	unsigned long ret;
 
-	if (!copy_mc_fragile_enabled)
-		return copy_user_generic(dst, src, len);
+	if (copy_mc_fragile_enabled) {
+		__uaccess_begin();
+		ret = copy_mc_fragile(dst, src, len);
+		__uaccess_end();
+		return ret;
+	}
+
+	if (static_cpu_has(X86_FEATURE_ERMS)) {
+		__uaccess_begin();
+		ret = copy_mc_enhanced_fast_string(dst, src, len);
+		__uaccess_end();
+		return ret;
+	}
 
-	__uaccess_begin();
-	ret = copy_mc_fragile(dst, src, len);
-	__uaccess_end();
-	return ret;
+	return copy_user_generic(dst, src, len);
 }
--- a/arch/x86/lib/copy_mc_64.S
+++ b/arch/x86/lib/copy_mc_64.S
@@ -124,4 +124,40 @@ EXPORT_SYMBOL_GPL(copy_mc_fragile)
 	_ASM_EXTABLE(.L_write_words, .E_write_words)
 	_ASM_EXTABLE(.L_write_trailing_bytes, .E_trailing_bytes)
 #endif /* CONFIG_X86_MCE */
+
+/*
+ * copy_mc_enhanced_fast_string - memory copy with exception handling
+ *
+ * Fast string copy + fault / exception handling. If the CPU does
+ * support machine check exception recovery, but does not support
+ * recovering from fast-string exceptions then this CPU needs to be
+ * added to the copy_mc_fragile_key set of quirks. Otherwise, absent any
+ * machine check recovery support this version should be no slower than
+ * standard memcpy.
+ */
+SYM_FUNC_START(copy_mc_enhanced_fast_string)
+	movq %rdi, %rax
+	movq %rdx, %rcx
+.L_copy:
+	rep movsb
+	/* Copy successful. Return zero */
+	xorl %eax, %eax
+	ret
+SYM_FUNC_END(copy_mc_enhanced_fast_string)
+
+	.section .fixup, "ax"
+.E_copy:
+	/*
+	 * On fault %rcx is updated such that the copy instruction could
+	 * optionally be restarted at the fault position, i.e. it
+	 * contains 'bytes remaining'. A non-zero return indicates error
+	 * to copy_mc_generic() users, or indicates short transfers to
+	 * user-copy routines.
+	 */
+	movq %rcx, %rax
+	ret
+
+	.previous
+
+	_ASM_EXTABLE_FAULT(.L_copy, .E_copy)
 #endif /* !CONFIG_UML */
--- a/tools/objtool/check.c
+++ b/tools/objtool/check.c
@@ -550,6 +550,7 @@ static const char *uaccess_safe_builtin[
 	"csum_partial_copy_generic",
 	"copy_mc_fragile",
 	"copy_mc_fragile_handle_tail",
+	"copy_mc_enhanced_fast_string",
 	"ftrace_likely_update", /* CONFIG_TRACE_BRANCH_PROFILING */
 	NULL
 };



