at 12:16 AM, Peter Zijlstra <peterz@infradead.org> wrote:

> On Tue, Oct 09, 2018 at 10:02:50AM +0530, Ashish Mhetre wrote:
>> From: Shaohua Li <shli@kernel.org>
>> 
>> We use the accessed bit to age a page at page reclaim time,
>> and currently we also flush the TLB when doing so.
>> 
>> But in some workloads TLB flush overhead is very heavy. In my
>> simple multithreaded app with a lot of swap to several pcie
>> SSDs, removing the tlb flush gives about 20% ~ 30% swapout
>> speedup.
>> 
>> Fortunately just removing the TLB flush is a valid optimization:
>> on x86 CPUs, clearing the accessed bit without a TLB flush
>> doesn't cause data corruption.
>> 
>> It could cause incorrect page aging and the (mistaken) reclaim of
>> hot pages, but the chance of that should be relatively low.
>> 
>> So as a performance optimization don't flush the TLB when
>> clearing the accessed bit, it will eventually be flushed by
>> a context switch or a VM operation anyway. [ In the rare
>> event of it not getting flushed for a long time the delay
>> shouldn't really matter because there's no real memory
>> pressure for swapout to react to. ]
> 
> Note that context switches (and here I'm talking about switch_mm(), not
> the cheaper switch_to()) do not unconditionally imply a TLB invalidation
> these days (on PCID enabled hardware).
> 
> So in that regard, the changelog (and the comment) is a little
> misleading.
> 
> I don't see anything fundamentally wrong with the patch though; just the
> wording.

What am I missing? This is a patch from 2014, no? b13b1d2d8692b?
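
For anyone following along, here is a rough userspace sketch of the behaviour
the changelog describes: test-and-clear the accessed bit, with and without a
following TLB flush. It is not the kernel code. Every sketch_* name is
hypothetical, the "flush" is only a counter, and the only detail taken from
real hardware is that the x86 accessed flag is bit 5 of the PTE.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* x86 "accessed" flag: bit 5 of a page-table entry. */
#define SKETCH_PAGE_ACCESSED (UINT64_C(1) << 5)

/* Hypothetical stand-in for a PTE; the kernel uses pte_t instead. */
typedef _Atomic uint64_t sketch_pte_t;

/* Counts simulated flushes; a real flush invalidates a TLB entry. */
static unsigned long sketch_tlb_flushes;

static inline void sketch_flush_tlb_page(void)
{
	sketch_tlb_flushes++;
}

/* Atomically clear the accessed bit and report whether it was set. */
static inline bool sketch_test_and_clear_young(sketch_pte_t *pte)
{
	uint64_t old = atomic_fetch_and(pte, ~SKETCH_PAGE_ACCESSED);

	return (old & SKETCH_PAGE_ACCESSED) != 0;
}

/* Behaviour before the change (as assumed here): flush when the page was young. */
static inline bool sketch_clear_young_with_flush(sketch_pte_t *pte)
{
	bool young = sketch_test_and_clear_young(pte);

	if (young)
		sketch_flush_tlb_page();
	return young;
}

/* Behaviour after the change: clear the bit and skip the flush entirely. */
static inline bool sketch_clear_young_no_flush(sketch_pte_t *pte)
{
	return sketch_test_and_clear_young(pte);
}
```

In the no-flush variant a stale TLB entry can keep the CPU from setting the
bit again for a while, which is the "incorrect page aging" trade-off the
changelog mentions; the sketch cannot model that, it only shows where the
flush is dropped.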