LKML Archive on lore.kernel.org
 help / color / Atom feed
From: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
To: linux-kernel@vger.kernel.org
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	stable@vger.kernel.org,
	Eugeniy Paltsev <Eugeniy.Paltsev@synopsys.com>,
	Vineet Gupta <vgupta@synopsys.com>
Subject: [PATCH 4.14 18/68] ARCv2: lib: memeset: fix doing prefetchw outside of buffer
Date: Tue, 29 Jan 2019 12:35:40 +0100
Message-ID: <20190129113133.145596929@linuxfoundation.org> (raw)
In-Reply-To: <20190129113131.751891514@linuxfoundation.org>

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Eugeniy Paltsev <Eugeniy.Paltsev@synopsys.com>

commit e6a72b7daeeb521753803550f0ed711152bb2555 upstream.

ARCv2 optimized memset uses PREFETCHW instruction for prefetching the
next cache line but doesn't ensure that the line is not past the end of
the buffer. PRETECHW changes the line ownership and marks it dirty,
which can cause issues in SMP config when next line was already owned by
other core. Fix the issue by avoiding the PREFETCHW

Some more details:

The current code has 3 logical loops (ignroing the unaligned part)
  (a) Big loop for doing aligned 64 bytes per iteration with PREALLOC
  (b) Loop for 32 x 2 bytes with PREFETCHW
  (c) any left over bytes

loop (a) was already eliding the last 64 bytes, so PREALLOC was
safe. The fix was removing PREFETCW from (b).

Another potential issue (applicable to configs with 32 or 128 byte L1
cache line) is that PREALLOC assumes 64 byte cache line and may not do
the right thing specially for 32b. While it would be easy to adapt,
there are no known configs with those lie sizes, so for now, just
compile out PREALLOC in such cases.

Signed-off-by: Eugeniy Paltsev <Eugeniy.Paltsev@synopsys.com>
Cc: stable@vger.kernel.org #4.4+
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
[vgupta: rewrote changelog, used asm .macro vs. "C" macro]
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 arch/arc/lib/memset-archs.S |   40 ++++++++++++++++++++++++++++++++--------
 1 file changed, 32 insertions(+), 8 deletions(-)

--- a/arch/arc/lib/memset-archs.S
+++ b/arch/arc/lib/memset-archs.S
@@ -7,11 +7,39 @@
  */
 
 #include <linux/linkage.h>
+#include <asm/cache.h>
 
-#undef PREALLOC_NOT_AVAIL
+/*
+ * The memset implementation below is optimized to use prefetchw and prealloc
+ * instruction in case of CPU with 64B L1 data cache line (L1_CACHE_SHIFT == 6)
+ * If you want to implement optimized memset for other possible L1 data cache
+ * line lengths (32B and 128B) you should rewrite code carefully checking
+ * we don't call any prefetchw/prealloc instruction for L1 cache lines which
+ * don't belongs to memset area.
+ */
+
+#if L1_CACHE_SHIFT == 6
+
+.macro PREALLOC_INSTR	reg, off
+	prealloc	[\reg, \off]
+.endm
+
+.macro PREFETCHW_INSTR	reg, off
+	prefetchw	[\reg, \off]
+.endm
+
+#else
+
+.macro PREALLOC_INSTR
+.endm
+
+.macro PREFETCHW_INSTR
+.endm
+
+#endif
 
 ENTRY_CFI(memset)
-	prefetchw [r0]		; Prefetch the write location
+	PREFETCHW_INSTR	r0, 0	; Prefetch the first write location
 	mov.f	0, r2
 ;;; if size is zero
 	jz.d	[blink]
@@ -48,11 +76,8 @@ ENTRY_CFI(memset)
 
 	lpnz	@.Lset64bytes
 	;; LOOP START
-#ifdef PREALLOC_NOT_AVAIL
-	prefetchw [r3, 64]	;Prefetch the next write location
-#else
-	prealloc  [r3, 64]
-#endif
+	PREALLOC_INSTR	r3, 64	; alloc next line w/o fetching
+
 #ifdef CONFIG_ARC_HAS_LL64
 	std.ab	r4, [r3, 8]
 	std.ab	r4, [r3, 8]
@@ -85,7 +110,6 @@ ENTRY_CFI(memset)
 	lsr.f	lp_count, r2, 5 ;Last remaining  max 124 bytes
 	lpnz	.Lset32bytes
 	;; LOOP START
-	prefetchw   [r3, 32]	;Prefetch the next write location
 #ifdef CONFIG_ARC_HAS_LL64
 	std.ab	r4, [r3, 8]
 	std.ab	r4, [r3, 8]



  parent reply index

Thread overview: 78+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-01-29 11:35 [PATCH 4.14 00/68] 4.14.97-stable review Greg Kroah-Hartman
2019-01-29 11:35 ` [PATCH 4.14 01/68] amd-xgbe: Fix mdio access for non-zero ports and clause 45 PHYs Greg Kroah-Hartman
2019-01-29 11:35 ` [PATCH 4.14 02/68] net: bridge: Fix ethernet header pointer before check skb forwardable Greg Kroah-Hartman
2019-01-29 11:35 ` [PATCH 4.14 03/68] net: Fix usage of pskb_trim_rcsum Greg Kroah-Hartman
2019-01-29 11:35 ` [PATCH 4.14 04/68] net: phy: mdio_bus: add missing device_del() in mdiobus_register() error handling Greg Kroah-Hartman
2019-01-29 11:35 ` [PATCH 4.14 05/68] net_sched: refetch skb protocol for each filter Greg Kroah-Hartman
2019-01-29 11:35 ` [PATCH 4.14 06/68] openvswitch: Avoid OOB read when parsing flow nlattrs Greg Kroah-Hartman
2019-01-29 11:35 ` [PATCH 4.14 07/68] vhost: log dirty page correctly Greg Kroah-Hartman
2019-01-29 11:35 ` [PATCH 4.14 08/68] net: ipv4: Fix memory leak in network namespace dismantle Greg Kroah-Hartman
2019-01-29 11:35 ` [PATCH 4.14 09/68] tcp: allow MSG_ZEROCOPY transmission also in CLOSE_WAIT state Greg Kroah-Hartman
2019-01-29 11:35 ` [PATCH 4.14 10/68] ipfrag: really prevent allocation on netns exit Greg Kroah-Hartman
2019-01-29 11:35 ` [PATCH 4.14 11/68] mmc: Kconfig: Enable CONFIG_MMC_SDHCI_IO_ACCESSORS Greg Kroah-Hartman
2019-01-29 11:35 ` [PATCH 4.14 12/68] mei: me: add denverton innovation engine device IDs Greg Kroah-Hartman
2019-01-29 11:35 ` [PATCH 4.14 13/68] USB: serial: simple: add Motorola Tetra TPG2200 device id Greg Kroah-Hartman
2019-01-29 11:35 ` [PATCH 4.14 14/68] USB: serial: pl2303: add new PID to support PL2303TB Greg Kroah-Hartman
2019-01-29 11:35 ` [PATCH 4.14 15/68] ASoC: atom: fix a missing check of snd_pcm_lib_malloc_pages Greg Kroah-Hartman
2019-01-29 11:35 ` [PATCH 4.14 16/68] ASoC: rt5514-spi: Fix potential NULL pointer dereference Greg Kroah-Hartman
2019-01-29 11:35 ` [PATCH 4.14 17/68] ALSA: hda - Add mute LED support for HP ProBook 470 G5 Greg Kroah-Hartman
2019-01-29 11:35 ` Greg Kroah-Hartman [this message]
2019-01-29 11:35 ` [PATCH 4.14 19/68] ARC: adjust memblock_reserve of kernel memory Greg Kroah-Hartman
2019-01-29 11:35 ` [PATCH 4.14 20/68] ARC: perf: map generic branches to correct hardware condition Greg Kroah-Hartman
2019-01-29 11:35 ` [PATCH 4.14 21/68] s390/early: improve machine detection Greg Kroah-Hartman
2019-01-29 11:35 ` [PATCH 4.14 22/68] s390/smp: fix CPU hotplug deadlock with CPU rescan Greg Kroah-Hartman
2019-01-29 11:35 ` [PATCH 4.14 23/68] char/mwave: fix potential Spectre v1 vulnerability Greg Kroah-Hartman
2019-01-29 11:35 ` [PATCH 4.14 24/68] staging: rtl8188eu: Add device code for D-Link DWA-121 rev B1 Greg Kroah-Hartman
2019-01-29 11:35 ` [PATCH 4.14 25/68] tty: Handle problem if line discipline does not have receive_buf Greg Kroah-Hartman
2019-01-29 11:35 ` [PATCH 4.14 26/68] uart: Fix crash in uart_write and uart_put_char Greg Kroah-Hartman
2019-01-29 11:35 ` [PATCH 4.14 27/68] tty/n_hdlc: fix __might_sleep warning Greg Kroah-Hartman
2019-01-29 11:35 ` [PATCH 4.14 28/68] hv_balloon: avoid touching uninitialized struct page during tail onlining Greg Kroah-Hartman
2019-01-29 11:35 ` [PATCH 4.14 29/68] Drivers: hv: vmbus: Check for ring when getting debug info Greg Kroah-Hartman
2019-01-29 11:35 ` [PATCH 4.14 30/68] CIFS: Fix possible hang during async MTU reads and writes Greg Kroah-Hartman
2019-01-29 11:35 ` [PATCH 4.14 31/68] CIFS: Fix credits calculations for reads with errors Greg Kroah-Hartman
2019-01-29 11:35 ` [PATCH 4.14 32/68] CIFS: Fix credit calculation for encrypted " Greg Kroah-Hartman
2019-01-29 11:35 ` [PATCH 4.14 33/68] CIFS: Do not reconnect TCP session in add_credits() Greg Kroah-Hartman
2019-01-29 11:35 ` [PATCH 4.14 34/68] Input: xpad - add support for SteelSeries Stratus Duo Greg Kroah-Hartman
2019-01-29 11:35 ` [PATCH 4.14 35/68] compiler.h: enable builtin overflow checkers and add fallback code Greg Kroah-Hartman
2019-01-29 11:35 ` [PATCH 4.14 36/68] Input: uinput - fix undefined behavior in uinput_validate_absinfo() Greg Kroah-Hartman
2019-01-29 11:35 ` [PATCH 4.14 37/68] acpi/nfit: Block function zero DSMs Greg Kroah-Hartman
2019-01-29 11:36 ` [PATCH 4.14 38/68] acpi/nfit: Fix command-supported detection Greg Kroah-Hartman
2019-01-29 11:36 ` [PATCH 4.14 39/68] dm thin: fix passdown_double_checking_shared_status() Greg Kroah-Hartman
2019-01-29 11:36 ` [PATCH 4.14 40/68] dm crypt: fix parsing of extended IV arguments Greg Kroah-Hartman
2019-01-29 11:36 ` [PATCH 4.14 41/68] KVM: x86: Fix single-step debugging Greg Kroah-Hartman
2019-01-29 11:36 ` [PATCH 4.14 42/68] x86/pkeys: Properly copy pkey state at fork() Greg Kroah-Hartman
2019-01-29 11:36 ` [PATCH 4.14 43/68] x86/selftests/pkeys: Fork() to check for state being preserved Greg Kroah-Hartman
2019-01-29 11:36 ` [PATCH 4.14 44/68] x86/kaslr: Fix incorrect i8254 outb() parameters Greg Kroah-Hartman
2019-01-29 11:36 ` [PATCH 4.14 45/68] posix-cpu-timers: Unbreak timer rearming Greg Kroah-Hartman
2019-01-29 11:36 ` [PATCH 4.14 46/68] irqchip/gic-v3-its: Align PCI Multi-MSI allocation on their size Greg Kroah-Hartman
2019-01-29 11:36 ` [PATCH 4.14 47/68] can: dev: __can_get_echo_skb(): fix bogous check for non-existing skb by removing it Greg Kroah-Hartman
2019-01-29 11:36 ` [PATCH 4.14 48/68] can: bcm: check timer values before ktime conversion Greg Kroah-Hartman
2019-01-29 11:36 ` [PATCH 4.14 49/68] vt: invoke notifier on screen size change Greg Kroah-Hartman
2019-01-29 11:36 ` [PATCH 4.14 50/68] perf unwind: Unwind with libdw doesnt take symfs into account Greg Kroah-Hartman
2019-01-29 11:36 ` [PATCH 4.14 51/68] perf unwind: Take pgoff into account when reporting elf to libdwfl Greg Kroah-Hartman
2019-01-29 11:36 ` [PATCH 4.14 52/68] Revert "seccomp: add a selftest for get_metadata" Greg Kroah-Hartman
2019-01-29 11:36 ` [PATCH 4.14 53/68] net: stmmac: Use correct values in TQS/RQS fields Greg Kroah-Hartman
2019-01-29 11:36 ` [PATCH 4.14 54/68] KVM: x86: Fix a 4.14 backport regression related to userspace/guest FPU Greg Kroah-Hartman
2019-01-29 11:36 ` [PATCH 4.14 55/68] s390/smp: Fix calling smp_call_ipl_cpu() from ipl CPU Greg Kroah-Hartman
2019-01-29 11:36 ` [PATCH 4.14 56/68] nvmet-rdma: Add unlikely for response allocated check Greg Kroah-Hartman
2019-01-29 11:36 ` [PATCH 4.14 57/68] nvmet-rdma: fix null dereference under heavy load Greg Kroah-Hartman
2019-01-29 11:36 ` [PATCH 4.14 58/68] usb: dwc3: gadget: Clear req->needs_extra_trb flag on cleanup Greg Kroah-Hartman
2019-01-29 11:36 ` [PATCH 4.14 59/68] xhci: Fix leaking USB3 shared_hcd at xhci removal Greg Kroah-Hartman
2019-01-29 11:36 ` [PATCH 4.14 60/68] ptp_kvm: probe for kvm guest availability Greg Kroah-Hartman
2019-01-29 11:36 ` [PATCH 4.14 61/68] x86/pvclock: add setter for pvclock_pvti_cpu0_va Greg Kroah-Hartman
2019-01-29 11:36 ` [PATCH 4.14 62/68] x86/xen/time: set pvclock flags on xen_time_init() Greg Kroah-Hartman
2019-01-29 11:36 ` [PATCH 4.14 63/68] x86/xen/time: setup vcpu 0 time info page Greg Kroah-Hartman
2019-01-29 11:36 ` [PATCH 4.14 64/68] x86/xen/time: Output xen sched_clock time from 0 Greg Kroah-Hartman
2019-01-29 11:36 ` [PATCH 4.14 65/68] xen: Fix x86 sched_clock() interface for xen Greg Kroah-Hartman
2019-01-29 11:36 ` [PATCH 4.14 66/68] f2fs: read page index before freeing Greg Kroah-Hartman
2019-01-29 11:36 ` [PATCH 4.14 67/68] btrfs: fix error handling in btrfs_dev_replace_start Greg Kroah-Hartman
2019-01-29 11:36 ` [PATCH 4.14 68/68] btrfs: dev-replace: go back to suspended state if target device is missing Greg Kroah-Hartman
2019-01-30  2:06 ` [PATCH 4.14 00/68] 4.14.97-stable review shuah
2019-01-30 12:51 ` Jon Hunter
2019-01-31  7:51   ` Greg Kroah-Hartman
2019-01-30 12:55 ` Naresh Kamboju
2019-01-30 18:49   ` Amir Goldstein
2019-01-30 19:32     ` Greg Kroah-Hartman
2019-02-04 10:12       ` Amir Goldstein
2019-02-04 10:35         ` Greg Kroah-Hartman
2019-01-30 22:13 ` Guenter Roeck

Reply instructions:

You may reply publically to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20190129113133.145596929@linuxfoundation.org \
    --to=gregkh@linuxfoundation.org \
    --cc=Eugeniy.Paltsev@synopsys.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=stable@vger.kernel.org \
    --cc=vgupta@synopsys.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

LKML Archive on lore.kernel.org

Archives are clonable:
	git clone --mirror https://lore.kernel.org/lkml/0 lkml/git/0.git
	git clone --mirror https://lore.kernel.org/lkml/1 lkml/git/1.git
	git clone --mirror https://lore.kernel.org/lkml/2 lkml/git/2.git
	git clone --mirror https://lore.kernel.org/lkml/3 lkml/git/3.git
	git clone --mirror https://lore.kernel.org/lkml/4 lkml/git/4.git
	git clone --mirror https://lore.kernel.org/lkml/5 lkml/git/5.git
	git clone --mirror https://lore.kernel.org/lkml/6 lkml/git/6.git
	git clone --mirror https://lore.kernel.org/lkml/7 lkml/git/7.git
	git clone --mirror https://lore.kernel.org/lkml/8 lkml/git/8.git

	# If you have public-inbox 1.1+ installed, you may
	# initialize and index your mirror using the following commands:
	public-inbox-init -V2 lkml lkml/ https://lore.kernel.org/lkml \
		linux-kernel@vger.kernel.org
	public-inbox-index lkml

Example config snippet for mirrors

Newsgroup available over NNTP:
	nntp://nntp.lore.kernel.org/org.kernel.vger.linux-kernel


AGPL code for this site: git clone https://public-inbox.org/public-inbox.git