Re: [PATCH v4 4/5] riscv: rewrite tlb flush for performance

From: Gary Guo <gary@garyguo.net>
To: Christoph Hellwig <hch@infradead.org>
Cc: "linux-riscv@lists.infradead.org" <linux-riscv@lists.infradead.org>
Subject: Re: [PATCH v4 4/5] riscv: rewrite tlb flush for performance
Date: Wed, 27 Mar 2019 13:56:28 +0000	[thread overview]
Message-ID: <3bef1053-34b5-08f5-6a90-f9270445129e@garyguo.net> (raw)
In-Reply-To: <20190327072557.GE3210@infradead.org>


On 27/03/2019 07:25, Christoph Hellwig wrote:
>> @@ -27,53 +19,47 @@ static inline void local_flush_tlb_all(void)
>>   	__asm__ __volatile__ ("sfence.vma" : : : "memory");
>>   }
>>   
>> -/* Flush one page from local TLB */
>> -static inline void local_flush_tlb_page(unsigned long addr)
>> +static inline void local_flush_tlb_mm(struct mm_struct *mm)
>>   {
>> -	__asm__ __volatile__ ("sfence.vma %0" : : "r" (addr) : "memory");
>> +	/* Flush ASID 0 so that global mappings are not affected */
>> +	__asm__ __volatile__ ("sfence.vma x0, %0" : : "r" (0) : "memory");
>>   }
>>   
>> -#ifndef CONFIG_SMP
>> -
>> -#define flush_tlb_all() local_flush_tlb_all()
>> -#define flush_tlb_page(vma, addr) local_flush_tlb_page(addr)
>> +static inline void local_flush_tlb_page(struct vm_area_struct *vma,
>> +	unsigned long addr)
>> +{
>> +	__asm__ __volatile__ ("sfence.vma %0, %1"
>> +			      : : "r" (addr), "r" (0)
>> +			      : "memory");
>> +}
> 
> Why do we pass the vma argument here even if it is never used?  That
> just seems to create some rather pointless churn.  Also I'd add
> local_flush_tlb_mm below local_flush_tlb_page to avoid churn as well,
> nevermind that it seems the more logical order to me >
This isn't used now, but we need that for ASID support. It also more 
consistent with the non-SMP flush signature, and more consistent with 
code of other architectures.
> 
>> +void local_flush_tlb_range(struct vm_area_struct *vma, unsigned long start,
>> +	unsigned long end);
>> +void local_flush_tlb_kernel_range(unsigned long start, unsigned long end);
> 
> As far as I can tell these are only used for the !SMP case and only
> to implement the non-local prefixed versions.  In that case we should
> just drop the local_prefix and implement those APIs directly, and only
> for !SMP builds.
> 
Ok, in that case I'll also move it to tlbflush.c.
>> +
>> +#include <linux/mm.h>
>> +#include <asm/sbi.h>
>> +
>> +#define SFENCE_VMA_FLUSH_ALL ((unsigned long) -1)
>> +
>> +/*
>> + * This controls the maximum amount of page-level sfence.vma that the kernel
>> + * can issue when the kernel needs to flush a range from the TLB.  If the size
>> + * of range goes beyond this threshold, a full sfence.vma is issued.
>> + *
>> + * Increase this number can negatively impact performance on implementations
>> + * where sfence.vma's address operand is ignored and always perform a global
>> + * TLB flush.  On the other hand, implementations with page-level TLB flush
>> + * support can benefit from a larger number.
>> + */
>> +static unsigned long tlbi_range_threshold = PAGE_SIZE;
> 
> I really hate having this is a tunable in the kernel code.  I think
> the right answer is to have a device tree entry to carry this number
> so that the platform can supply it.  Btw, what are examples of
> platforms that flush globalls vs per-page at the moment?  What is a good
> larger value for the latter based on your testing?
> 
This is discussed in previous versions of this patch, and we arrived at 
the conclusion that a boot parameter is the best way to do it now, as at 
the moment we have no other ways to get this information. The actual 
value really depends on the actual implementation. If the implementation 
has a super large TLB where full invalidation would be super expensive, 
they might even want the value to be 511.

> Also I wonder if we should also split this tunable and the optional
> global flush into a separate patch.  This is in this first patch
> just make use of the asid,  and then another patch to add the threshold
> for doing the full flush.
I don't think we should. This patch is more like a rewrite to old logic 
rather than patching things up incrementally.
> 
>> +void local_flush_tlb_range(struct vm_area_struct *vma, unsigned long start,
>> +			   unsigned long end)
>> +{
>> +	if (end - start > tlbi_range_threshold) {
>> +		local_flush_tlb_mm(vma->vm_mm);
>> +		return;
>> +	}
>> +
>> +	while (start < end) {
>> +		__asm__ __volatile__ ("sfence.vma %0, %1"
>> +				      : : "r" (start), "r" (0)
>> +				      : "memory");
> 
> I think this should just call local_flush_tlb_page.
> 
I do this to minimise changes we need if we want to add ASID (in which 
case we want to avoid retrieving ASID from atomic variable multiple times).
>> +		start += PAGE_SIZE;
>> +	}
> 
> And maybe use a for loop to short cut it a bit:
> 
> 	for (; start < end; start += PAGE_SIZE)
> 		local_flush_tlb_page(start);
> 
Ok
>> +void local_flush_tlb_kernel_range(unsigned long start, unsigned long end)
>> +{
>> +	if (end - start > tlbi_range_threshold) {
>> +		local_flush_tlb_all();
>> +		return;
>> +	}
>> +
>> +	while (start < end) {
>> +		__asm__ __volatile__ ("sfence.vma %0"
>> +				      : : "r" (start)
>> +				      : "memory");
>> +		start += PAGE_SIZE;
> 
> Same here, just with local_flush_tlb_kernel_page.
> 
Ok
_______________________________________________
linux-riscv mailing list
linux-riscv@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-riscv