From: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
To: linux-kernel@vger.kernel.org
Cc: "Greg Kroah-Hartman" <gregkh@linuxfoundation.org>,
	stable@vger.kernel.org,
	"John David Anglin " <dave.anglin@bell.net>,
	"Helge Deller" <deller@gmx.de>
Subject: [PATCH 4.9 12/65] parisc: Fix ordering of cache and TLB flushes
Date: Fri,  9 Mar 2018 16:18:12 -0800
Message-ID: <20180310001825.894672685@linuxfoundation.org>
In-Reply-To: <20180310001824.927996722@linuxfoundation.org>

4.9-stable review patch.  If anyone has any objections, please let me know.

------------------

From: John David Anglin <dave.anglin@bell.net>

commit 0adb24e03a124b79130c9499731936b11ce2677d upstream.

The change to flush_kernel_vmap_range() wasn't sufficient to avoid the
SMP stalls.  The problem is some drivers call these routines with
interrupts disabled.  Interrupts need to be enabled for flush_tlb_all()
and flush_cache_all() to work.  This version adds checks to ensure
interrupts are not disabled before calling routines that need IPI
interrupts.  When interrupts are disabled, we now drop into slower code.
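
As a rough illustration only (a user-space model, not kernel code), the
check that guards the global flushes can be read as the predicate below.
The function and parameter names are hypothetical; they stand in for
IS_ENABLED(CONFIG_SMP), arch_irqs_disabled() and the
parisc_*_flush_threshold variables used in the diff.

```c
#include <stdbool.h>
#include <stddef.h>

/*
 * User-space model of the guard added by this patch; not kernel code.
 * smp           ~ IS_ENABLED(CONFIG_SMP)
 * irqs_disabled ~ arch_irqs_disabled()
 * threshold     ~ parisc_cache_flush_threshold / parisc_tlb_flush_threshold
 */
static bool can_use_global_flush(bool smp, bool irqs_disabled,
				 size_t size, size_t threshold)
{
	/* On SMP the global flush relies on IPIs, so interrupts must be on. */
	bool ipi_safe = !smp || !irqs_disabled;

	/* Only large ranges are worth a global flush. */
	return ipi_safe && size >= threshold;
}
```

When the predicate is false, the patched routines fall through to the
slower per-range code.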

The attached change fixes the ordering of cache and TLB flushes in
several cases.  When we flush the cache using the existing PTE/TLB
entries, we need to flush the TLB after doing the cache flush.  We don't
need to do this when we flush the entire instruction and data caches as
these flushes don't use the existing TLB entries.  The same is true for
tmpalias region flushes.
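
To make the ordering concrete, here is a small user-space sketch of the
sequence that flush_cache_range() follows after this patch.  The enum and
function names are invented for illustration and only record the order
of operations; they do not touch any cache or TLB.

```c
#include <stdbool.h>

/* Hypothetical labels for the three kinds of operation involved. */
enum flush_op { OP_CACHE_RANGE, OP_CACHE_ALL, OP_TLB };

/*
 * Fill seq[0..1] with the order of operations.
 * whole_cache == true  models the flush_cache_all() path;
 * whole_cache == false models the range flush through existing PTE/TLB
 * entries.
 */
static void plan_flush(bool whole_cache, enum flush_op seq[2])
{
	if (whole_cache) {
		/* Whole-cache flushes don't use the TLB entries. */
		seq[0] = OP_TLB;
		seq[1] = OP_CACHE_ALL;
	} else {
		/* The range flush needs the translation, so the TLB goes last. */
		seq[0] = OP_CACHE_RANGE;
		seq[1] = OP_TLB;
	}
}
```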

The flush_kernel_vmap_range() and invalidate_kernel_vmap_range()
routines have been updated.

We also added a new purge_kernel_dcache_range_asm() routine to
pacache.S and use it in invalidate_kernel_vmap_range().  Purges are
nominally faster than flushes because the cache lines don't have to be
written back to memory.
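
For readers less familiar with PA-RISC assembly, the address walk in the
new routine can be modelled in C roughly as follows.  This is my reading
of the loop, written as a user-space sketch: the function name is
hypothetical, `stride` stands in for dcache_stride and is assumed to be
a power of two, and the function only counts lines instead of issuing
pdc.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Model of the loop in purge_kernel_dcache_range_asm().
 * Returns how many cache lines would be purged for [start, end).
 */
static size_t count_purged_lines(uintptr_t start, uintptr_t end,
				 uintptr_t stride)
{
	size_t lines = 0;
	uintptr_t addr = start & ~(stride - 1);	/* ANDCM: align down */

	while (addr < end) {	/* cmpb,COND(<<): unsigned compare */
		lines++;	/* pdc,m would purge the line at addr */
		addr += stride;	/* ,m post-increments the base register */
	}
	return lines;
}
```

For example, with a 64-byte stride, the range 0x1010..0x1080 is aligned
down to 0x1000 and covers two lines.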

Hopefully, this is sufficient to resolve the remaining problems due to
cache speculation.  So far, testing indicates that this is the case.  I
did work up a patch using tmpalias flushes, but there is a performance
hit because we need the physical address for each page, and we also need
to sequence access to the tmpalias flush code.  This increases the
probability of stalls.

Signed-off-by: John David Anglin <dave.anglin@bell.net>
Cc: stable@vger.kernel.org # 4.9+
Signed-off-by: Helge Deller <deller@gmx.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 arch/parisc/include/asm/cacheflush.h |    1 
 arch/parisc/kernel/cache.c           |   57 +++++++++++++++++++----------------
 arch/parisc/kernel/pacache.S         |   22 +++++++++++++
 3 files changed, 54 insertions(+), 26 deletions(-)

--- a/arch/parisc/include/asm/cacheflush.h
+++ b/arch/parisc/include/asm/cacheflush.h
@@ -25,6 +25,7 @@ void flush_user_icache_range_asm(unsigne
 void flush_kernel_icache_range_asm(unsigned long, unsigned long);
 void flush_user_dcache_range_asm(unsigned long, unsigned long);
 void flush_kernel_dcache_range_asm(unsigned long, unsigned long);
+void purge_kernel_dcache_range_asm(unsigned long, unsigned long);
 void flush_kernel_dcache_page_asm(void *);
 void flush_kernel_icache_page(void *);
 void flush_user_dcache_range(unsigned long, unsigned long);
--- a/arch/parisc/kernel/cache.c
+++ b/arch/parisc/kernel/cache.c
@@ -464,10 +464,10 @@ EXPORT_SYMBOL(copy_user_page);
 int __flush_tlb_range(unsigned long sid, unsigned long start,
 		      unsigned long end)
 {
-	unsigned long flags, size;
+	unsigned long flags;
 
-	size = (end - start);
-	if (size >= parisc_tlb_flush_threshold) {
+	if ((!IS_ENABLED(CONFIG_SMP) || !arch_irqs_disabled()) &&
+	    end - start >= parisc_tlb_flush_threshold) {
 		flush_tlb_all();
 		return 1;
 	}
@@ -538,13 +538,11 @@ void flush_cache_mm(struct mm_struct *mm
 	struct vm_area_struct *vma;
 	pgd_t *pgd;
 
-	/* Flush the TLB to avoid speculation if coherency is required. */
-	if (parisc_requires_coherency())
-		flush_tlb_all();
-
 	/* Flushing the whole cache on each cpu takes forever on
 	   rp3440, etc.  So, avoid it if the mm isn't too big.  */
-	if (mm_total_size(mm) >= parisc_cache_flush_threshold) {
+	if ((!IS_ENABLED(CONFIG_SMP) || !arch_irqs_disabled()) &&
+	    mm_total_size(mm) >= parisc_cache_flush_threshold) {
+		flush_tlb_all();
 		flush_cache_all();
 		return;
 	}
@@ -552,9 +550,9 @@ void flush_cache_mm(struct mm_struct *mm
 	if (mm->context == mfsp(3)) {
 		for (vma = mm->mmap; vma; vma = vma->vm_next) {
 			flush_user_dcache_range_asm(vma->vm_start, vma->vm_end);
-			if ((vma->vm_flags & VM_EXEC) == 0)
-				continue;
-			flush_user_icache_range_asm(vma->vm_start, vma->vm_end);
+			if (vma->vm_flags & VM_EXEC)
+				flush_user_icache_range_asm(vma->vm_start, vma->vm_end);
+			flush_tlb_range(vma, vma->vm_start, vma->vm_end);
 		}
 		return;
 	}
@@ -598,14 +596,9 @@ flush_user_icache_range(unsigned long st
 void flush_cache_range(struct vm_area_struct *vma,
 		unsigned long start, unsigned long end)
 {
-	BUG_ON(!vma->vm_mm->context);
-
-	/* Flush the TLB to avoid speculation if coherency is required. */
-	if (parisc_requires_coherency())
+	if ((!IS_ENABLED(CONFIG_SMP) || !arch_irqs_disabled()) &&
+	    end - start >= parisc_cache_flush_threshold) {
 		flush_tlb_range(vma, start, end);
-
-	if ((end - start) >= parisc_cache_flush_threshold
-	    || vma->vm_mm->context != mfsp(3)) {
 		flush_cache_all();
 		return;
 	}
@@ -613,6 +606,7 @@ void flush_cache_range(struct vm_area_st
 	flush_user_dcache_range_asm(start, end);
 	if (vma->vm_flags & VM_EXEC)
 		flush_user_icache_range_asm(start, end);
+	flush_tlb_range(vma, start, end);
 }
 
 void
@@ -621,8 +615,7 @@ flush_cache_page(struct vm_area_struct *
 	BUG_ON(!vma->vm_mm->context);
 
 	if (pfn_valid(pfn)) {
-		if (parisc_requires_coherency())
-			flush_tlb_page(vma, vmaddr);
+		flush_tlb_page(vma, vmaddr);
 		__flush_cache_page(vma, vmaddr, PFN_PHYS(pfn));
 	}
 }
@@ -630,21 +623,33 @@ flush_cache_page(struct vm_area_struct *
 void flush_kernel_vmap_range(void *vaddr, int size)
 {
 	unsigned long start = (unsigned long)vaddr;
+	unsigned long end = start + size;
 
-	if ((unsigned long)size > parisc_cache_flush_threshold)
+	if ((!IS_ENABLED(CONFIG_SMP) || !arch_irqs_disabled()) &&
+	    (unsigned long)size >= parisc_cache_flush_threshold) {
+		flush_tlb_kernel_range(start, end);
 		flush_data_cache();
-	else
-		flush_kernel_dcache_range_asm(start, start + size);
+		return;
+	}
+
+	flush_kernel_dcache_range_asm(start, end);
+	flush_tlb_kernel_range(start, end);
 }
 EXPORT_SYMBOL(flush_kernel_vmap_range);
 
 void invalidate_kernel_vmap_range(void *vaddr, int size)
 {
 	unsigned long start = (unsigned long)vaddr;
+	unsigned long end = start + size;
 
-	if ((unsigned long)size > parisc_cache_flush_threshold)
+	if ((!IS_ENABLED(CONFIG_SMP) || !arch_irqs_disabled()) &&
+	    (unsigned long)size >= parisc_cache_flush_threshold) {
+		flush_tlb_kernel_range(start, end);
 		flush_data_cache();
-	else
-		flush_kernel_dcache_range_asm(start, start + size);
+		return;
+	}
+
+	purge_kernel_dcache_range_asm(start, end);
+	flush_tlb_kernel_range(start, end);
 }
 EXPORT_SYMBOL(invalidate_kernel_vmap_range);
--- a/arch/parisc/kernel/pacache.S
+++ b/arch/parisc/kernel/pacache.S
@@ -1110,6 +1110,28 @@ ENTRY_CFI(flush_kernel_dcache_range_asm)
 	.procend
 ENDPROC_CFI(flush_kernel_dcache_range_asm)
 
+ENTRY_CFI(purge_kernel_dcache_range_asm)
+	.proc
+	.callinfo NO_CALLS
+	.entry
+
+	ldil		L%dcache_stride, %r1
+	ldw		R%dcache_stride(%r1), %r23
+	ldo		-1(%r23), %r21
+	ANDCM		%r26, %r21, %r26
+
+1:      cmpb,COND(<<),n	%r26, %r25,1b
+	pdc,m		%r23(%r26)
+
+	sync
+	syncdma
+	bv		%r0(%r2)
+	nop
+	.exit
+
+	.procend
+ENDPROC_CFI(purge_kernel_dcache_range_asm)
+
 ENTRY_CFI(flush_user_icache_range_asm)
 	.proc
 	.callinfo NO_CALLS
