[RFC 0/4]x86: allocate up to 32 tlb invalidate vectors

* [RFC 0/4]x86: allocate up to 32 tlb invalidate vectors
@ 2010-11-03  6:44 Shaohua Li
  2010-11-15 14:02 ` Shaohua Li
  0 siblings, 1 reply; 3+ messages in thread
From: Shaohua Li @ 2010-11-03  6:44 UTC (permalink / raw)
  To: lkml; +Cc: Ingo Molnar, Andi Kleen, hpa

Hi,
In workload with heavy page reclaim, flush_tlb_page() is frequently
used. We currently have 8 vectors for tlb flush, which is fine for small
machines. But for big machines with a lot of CPUs, the 8 vectors are
shared by all CPUs and we need lock to protect them. This will cause a
lot of lock contentions. please see the patch 3 for detailed number of
the lock contention.
Andi Kleen suggests we can use 32 vectors for tlb flush, which should be
fine for even 8 socket machines. Test shows this reduces lock contention
dramatically (see patch 3 for number).
One might argue if this will waste too many vectors and leave less
vectors for devices. This could be a problem. But even we use 32
vectors, we still leave 78 vectors for devices. And we now have per-cpu
vector, vector isn't scarce any more, but I'm open if anybody has
objections.

Thanks,
Shaohua

^ permalink raw reply	[flat|nested] 3+ messages in thread