linux-kernel.vger.kernel.org archive mirror
* [RFC 0/4] x86: allocate up to 32 tlb invalidate vectors
@ 2010-11-03  6:44 Shaohua Li
  2010-11-15 14:02 ` Shaohua Li
  0 siblings, 1 reply; 3+ messages in thread
From: Shaohua Li @ 2010-11-03  6:44 UTC (permalink / raw)
  To: lkml; +Cc: Ingo Molnar, Andi Kleen, hpa

Hi,
In workloads with heavy page reclaim, flush_tlb_page() is used
frequently. We currently have 8 vectors for TLB flush, which is fine for
small machines. But on big machines with many CPUs, the 8 vectors are
shared by all CPUs, and a lock is needed to protect them. This causes a
lot of lock contention; please see patch 3 for detailed lock contention
numbers.
Andi Kleen suggests we use 32 vectors for TLB flush, which should be
fine even for 8-socket machines. Tests show this reduces lock contention
dramatically (see patch 3 for the numbers).
One might argue that this wastes too many vectors and leaves fewer
vectors for devices. That could be a problem, but even with 32 vectors
we still leave 78 vectors for devices. And now that we have per-CPU
vector allocation, vectors are no longer scarce. I'm open to objections,
though.
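
To make the contention concrete: every sender CPU is mapped onto one of
the shared flush slots (one per vector) and takes that slot's lock
before sending the IPI, so the fewer vectors there are, the more CPUs
pile up on each lock. Below is a minimal user-space sketch of that
mapping; it is not the actual arch/x86/mm/tlb.c code, and the names
(the NR_CPUS value, show_sharing(), the modulo mapping) are simplifying
assumptions made only to show how 8 versus 32 slots changes the sharing.

/*
 * Simplified, user-space sketch of cpu -> flush-slot sharing.
 * NOT the real kernel code; names and the modulo mapping are
 * assumptions made for illustration only.
 */
#include <stdio.h>

#define NR_CPUS 64	/* assumed CPU count, for illustration only */

static void show_sharing(int nr_vectors)
{
	int cpus_per_slot[32] = { 0 };
	int cpu, i, max = 0;

	/*
	 * Each sender CPU picks a flush slot by simple modulo hashing;
	 * all CPUs mapped to the same slot contend on that slot's lock.
	 */
	for (cpu = 0; cpu < NR_CPUS; cpu++)
		cpus_per_slot[cpu % nr_vectors]++;

	for (i = 0; i < nr_vectors; i++)
		if (cpus_per_slot[i] > max)
			max = cpus_per_slot[i];

	printf("%2d vectors: up to %d CPUs share one flush slot/lock\n",
	       nr_vectors, max);
}

int main(void)
{
	show_sharing(8);	/* current: 8 shared slots */
	show_sharing(32);	/* proposed: 32 slots */
	return 0;
}

With the assumed 64 CPUs, the sketch prints 8 CPUs per slot for 8
vectors and 2 per slot for 32, which is the sharing reduction this
series is after.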

Thanks,
Shaohua



* Re: [RFC 0/4] x86: allocate up to 32 tlb invalidate vectors
  2010-11-03  6:44 [RFC 0/4] x86: allocate up to 32 tlb invalidate vectors Shaohua Li
@ 2010-11-15 14:02 ` Shaohua Li
  2010-11-15 17:53   ` H. Peter Anvin
  0 siblings, 1 reply; 3+ messages in thread
From: Shaohua Li @ 2010-11-15 14:02 UTC (permalink / raw)
  To: lkml; +Cc: Ingo Molnar, Andi Kleen, hpa

On Wed, 2010-11-03 at 14:44 +0800, Shaohua Li wrote:
> Hi,
> In workloads with heavy page reclaim, flush_tlb_page() is used
> frequently. We currently have 8 vectors for TLB flush, which is fine for
> small machines. But on big machines with many CPUs, the 8 vectors are
> shared by all CPUs, and a lock is needed to protect them. This causes a
> lot of lock contention; please see patch 3 for detailed lock contention
> numbers.
> Andi Kleen suggests we use 32 vectors for TLB flush, which should be
> fine even for 8-socket machines. Tests show this reduces lock contention
> dramatically (see patch 3 for the numbers).
> One might argue that this wastes too many vectors and leaves fewer
> vectors for devices. That could be a problem, but even with 32 vectors
> we still leave 78 vectors for devices. And now that we have per-CPU
> vector allocation, vectors are no longer scarce. I'm open to objections,
> though.
> 
Hi Ingo & hpa, any comments about this series?

Thanks,
Shaohua



* Re: [RFC 0/4] x86: allocate up to 32 tlb invalidate vectors
  2010-11-15 14:02 ` Shaohua Li
@ 2010-11-15 17:53   ` H. Peter Anvin
  0 siblings, 0 replies; 3+ messages in thread
From: H. Peter Anvin @ 2010-11-15 17:53 UTC (permalink / raw)
  To: Shaohua Li; +Cc: lkml, Ingo Molnar, Andi Kleen

On 11/15/2010 06:02 AM, Shaohua Li wrote:
> On Wed, 2010-11-03 at 14:44 +0800, Shaohua Li wrote:
>> Hi,
>> In workloads with heavy page reclaim, flush_tlb_page() is used
>> frequently. We currently have 8 vectors for TLB flush, which is fine for
>> small machines. But on big machines with many CPUs, the 8 vectors are
>> shared by all CPUs, and a lock is needed to protect them. This causes a
>> lot of lock contention; please see patch 3 for detailed lock contention
>> numbers.
>> Andi Kleen suggests we use 32 vectors for TLB flush, which should be
>> fine even for 8-socket machines. Tests show this reduces lock contention
>> dramatically (see patch 3 for the numbers).
>> One might argue that this wastes too many vectors and leaves fewer
>> vectors for devices. That could be a problem, but even with 32 vectors
>> we still leave 78 vectors for devices. And now that we have per-CPU
>> vector allocation, vectors are no longer scarce. I'm open to objections,
>> though.
>>
> Hi Ingo & hpa, any comments about this series?
> 

Hi Shaohua,

It looks good... I need to do a more thorough review and put it in; I've
just been consumed a bit too much by a certain internal project.

	-hpa

-- 
H. Peter Anvin, Intel Open Source Technology Center
I work for Intel.  I don't speak on their behalf.


