Re: [PATCH v4 4/5] riscv: rewrite tlb flush for performance

From: Christoph Hellwig <hch@infradead.org>
To: Gary Guo <gary@garyguo.net>
Cc: Palmer Dabbelt <palmer@sifive.com>,
	Anup Patel <Anup.Patel@wdc.com>,
	Christoph Hellwig <hch@infradead.org>,
	Atish Patra <atish.patra@wdc.com>,
	Albert Ou <aou@eecs.berkeley.edu>,
	"linux-riscv@lists.infradead.org"
	<linux-riscv@lists.infradead.org>
Subject: Re: [PATCH v4 4/5] riscv: rewrite tlb flush for performance
Date: Wed, 27 Mar 2019 00:25:57 -0700
Message-ID: <20190327072557.GE3210@infradead.org>
In-Reply-To: <d60a62cfbbf63382a47e3c2226c5dd6148f8b814.1553647082.git.gary@garyguo.net>

> @@ -27,53 +19,47 @@ static inline void local_flush_tlb_all(void)
>  	__asm__ __volatile__ ("sfence.vma" : : : "memory");
>  }
>  
> -/* Flush one page from local TLB */
> -static inline void local_flush_tlb_page(unsigned long addr)
> +static inline void local_flush_tlb_mm(struct mm_struct *mm)
>  {
> -	__asm__ __volatile__ ("sfence.vma %0" : : "r" (addr) : "memory");
> +	/* Flush ASID 0 so that global mappings are not affected */
> +	__asm__ __volatile__ ("sfence.vma x0, %0" : : "r" (0) : "memory");
>  }
>  
> -#ifndef CONFIG_SMP
> -
> -#define flush_tlb_all() local_flush_tlb_all()
> -#define flush_tlb_page(vma, addr) local_flush_tlb_page(addr)
> +static inline void local_flush_tlb_page(struct vm_area_struct *vma,
> +	unsigned long addr)
> +{
> +	__asm__ __volatile__ ("sfence.vma %0, %1"
> +			      : : "r" (addr), "r" (0)
> +			      : "memory");
> +}

Why do we pass the vma argument here even though it is never used?  That
just seems to create some rather pointless churn.  Also I'd add
local_flush_tlb_mm below local_flush_tlb_page to avoid churn as well,
never mind that it seems the more logical order to me.

> +void local_flush_tlb_range(struct vm_area_struct *vma, unsigned long start,
> +	unsigned long end);
> +void local_flush_tlb_kernel_range(unsigned long start, unsigned long end);

As far as I can tell these are only used for the !SMP case and only
to implement the versions without the local_ prefix.  In that case we
should just drop the local_ prefix and implement those APIs directly,
and only for !SMP builds.

> +
> +#include <linux/mm.h>
> +#include <asm/sbi.h>
> +
> +#define SFENCE_VMA_FLUSH_ALL ((unsigned long) -1)
> +
> +/*
> + * This controls the maximum amount of page-level sfence.vma that the kernel
> + * can issue when the kernel needs to flush a range from the TLB.  If the size
> + * of range goes beyond this threshold, a full sfence.vma is issued.
> + *
> + * Increase this number can negatively impact performance on implementations
> + * where sfence.vma's address operand is ignored and always perform a global
> + * TLB flush.  On the other hand, implementations with page-level TLB flush
> + * support can benefit from a larger number.
> + */
> +static unsigned long tlbi_range_threshold = PAGE_SIZE;

I really hate having this as a tunable in the kernel code.  I think
the right answer is to have a device tree entry to carry this number
so that the platform can supply it.  Btw, what are examples of
platforms that flush globally vs per-page at the moment?  What is a good
larger value for the latter based on your testing?

Also I wonder if we should split this tunable and the optional
global flush into a separate patch.  That is, this first patch would
just make use of the asid, and then another patch would add the
threshold for doing the full flush.

> +void local_flush_tlb_range(struct vm_area_struct *vma, unsigned long start,
> +			   unsigned long end)
> +{
> +	if (end - start > tlbi_range_threshold) {
> +		local_flush_tlb_mm(vma->vm_mm);
> +		return;
> +	}
> +
> +	while (start < end) {
> +		__asm__ __volatile__ ("sfence.vma %0, %1"
> +				      : : "r" (start), "r" (0)
> +				      : "memory");

I think this should just call local_flush_tlb_page.

> +		start += PAGE_SIZE;
> +	}

And maybe use a for loop to shorten it a bit:

	for (; start < end; start += PAGE_SIZE)
		local_flush_tlb_page(start);

> +void local_flush_tlb_kernel_range(unsigned long start, unsigned long end)
> +{
> +	if (end - start > tlbi_range_threshold) {
> +		local_flush_tlb_all();
> +		return;
> +	}
> +
> +	while (start < end) {
> +		__asm__ __volatile__ ("sfence.vma %0"
> +				      : : "r" (start)
> +				      : "memory");
> +		start += PAGE_SIZE;

Same here, just with local_flush_tlb_kernel_page.
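
Putting the comments above together, the !SMP range flush could look
roughly like the sketch below.  This is a host-side model only, not
kernel code: the two counters are stand-ins for the sfence.vma
instructions so the control flow can be compiled and checked anywhere,
and the PAGE_SIZE value, the argument-less local_flush_tlb_mm() and the
dropped vma/mm arguments are simplifying assumptions, not what the
patch has to end up with.

```c
/*
 * Host-side model of the suggested !SMP range flush.  The counters
 * below are hypothetical stand-ins for the sfence.vma instructions.
 */
#ifndef PAGE_SIZE
#define PAGE_SIZE 4096UL	/* assumed page size for the model */
#endif

static unsigned long tlbi_range_threshold = PAGE_SIZE;
static unsigned long page_flushes;	/* models "sfence.vma addr, asid" */
static unsigned long full_flushes;	/* models "sfence.vma x0, asid" */

/* No unused vma argument, as suggested in the review. */
static void local_flush_tlb_page(unsigned long addr)
{
	(void)addr;
	page_flushes++;
}

/* Simplified: the real helper takes a struct mm_struct pointer. */
static void local_flush_tlb_mm(void)
{
	full_flushes++;
}

#ifndef CONFIG_SMP
/* Implemented directly under its final name, without a local_ variant. */
static void flush_tlb_range(unsigned long start, unsigned long end)
{
	/* Ranges larger than the threshold fall back to one full flush. */
	if (end - start > tlbi_range_threshold) {
		local_flush_tlb_mm();
		return;
	}

	for (; start < end; start += PAGE_SIZE)
		local_flush_tlb_page(start);
}
#endif
```

With the default threshold of one page, a single-page range issues one
per-page flush and anything larger issues one full flush; raising the
threshold moves that cut-over point accordingly.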
