On 06/04/2018 06:11 AM, speck for Konrad Rzeszutek Wilk wrote:
> On Mon, Jun 04, 2018 at 10:24:59AM +0200, speck for Martin Pohlack wrote:
>> [resending as new message as the reply seems to have been lost on at
>> least some mail paths]
>>
>> On 30.05.2018 11:01, speck for Paolo Bonzini wrote:
>>> On 30/05/2018 01:54, speck for Andrew Cooper wrote:
>>>> Other bits I don't understand are the 64k limit in the first place, why
>>>> it gets walked over in 4k strides to begin with (I'm not aware of any
>>>> prefetching which would benefit that...) and why a particularly
>>>> obfuscated piece of magic is used for the 64byte strides.
>>>
>>> That is the only part I understood, :) the 4k strides ensure that the
>>> source data is in the TLB. Why that is needed is still a mystery though.
>>
>> I think the reasoning is that you first want to populate the TLB for the
>> whole flush array, then fence, to make sure TLB walks do not interfere
>> with the actual flushing later, either for performance reasons or for
>> preventing leakage of partial walk results.
>>
>> Not sure about the 64K, it likely is about the LRU implementation for L1
>> replacement not being perfect (but pseudo LRU), so you need to flush
>> more than the L1 size (32K) in software. But I have also seen smaller
>> recommendations for that (52K).
>
> Isn't Tim Chen from Intel on this mailing list? Tim, could you find out
> please?
>

Will do.

Tim
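
For readers following the thread, the sequence Martin describes (touch the flush array in 4k strides to populate the TLB, fence, then walk it in 64-byte strides) can be sketched in portable C roughly as below. This is an illustration only and not the code under review: the names `prime_tlb`, `sweep_lines` and `l1d_flush_sketch` are hypothetical, the 64K size and the 4k/64-byte strides are simply the figures quoted above, and `atomic_thread_fence` is an assumed stand-in for whatever fencing or serializing instruction a real implementation would use.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

#define FLUSH_SIZE  (64u * 1024u) /* 64K figure from the thread (2x a 32K L1D) */
#define PAGE_STRIDE 4096u         /* one read per 4k page */
#define LINE_STRIDE 64u           /* one read per 64-byte cache line */

/* Pass 1: read one byte per page so every translation for the flush
 * array is loaded into the TLB before the real sweep starts. */
static size_t prime_tlb(const volatile uint8_t *buf)
{
    size_t touches = 0;
    for (size_t off = 0; off < FLUSH_SIZE; off += PAGE_STRIDE) {
        (void)buf[off];           /* volatile read, cannot be optimized away */
        touches++;
    }
    return touches;
}

/* Pass 2: read one byte per cache line so the flush array displaces
 * whatever was previously held in the L1 data cache. */
static size_t sweep_lines(const volatile uint8_t *buf)
{
    size_t touches = 0;
    for (size_t off = 0; off < FLUSH_SIZE; off += LINE_STRIDE) {
        (void)buf[off];
        touches++;
    }
    return touches;
}

/* Runs both passes with a fence in between. Returns the number of
 * cache-line reads and stores the number of page reads in *page_touches. */
size_t l1d_flush_sketch(const uint8_t *buf, size_t *page_touches)
{
    assert(buf != NULL && page_touches != NULL);
    *page_touches = prime_tlb(buf);
    /* Assumption: a portable fence standing in for the fence between the
     * two passes; it keeps the page walks ordered before the sweep. */
    atomic_thread_fence(memory_order_seq_cst);
    return sweep_lines(buf);
}
```

With a 64K buffer the first pass issues 16 reads (one per 4k page) and the second issues 1024 reads (one per 64-byte line). Whether 64K is really required, rather than something closer to the 32K L1 size or the 52K figure mentioned above, is exactly the open question in the thread; the sketch does not answer it.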