linux-kernel.vger.kernel.org archive mirror
* x86: Is there still value in having a special tlb flush IPI vector?
@ 2008-07-28 23:16 Jeremy Fitzhardinge
  2008-07-28 23:20 ` Jeremy Fitzhardinge
  2008-07-28 23:34 ` Ingo Molnar
  0 siblings, 2 replies; 22+ messages in thread
From: Jeremy Fitzhardinge @ 2008-07-28 23:16 UTC (permalink / raw)
  To: Ingo Molnar; +Cc: Jens Axboe, Andi Kleen, Linux Kernel Mailing List

Now that normal smp_call_function is no longer an enormous bottleneck, 
is there still value in having a specialised IPI vector for tlb 
flushes?  It seems like quite a lot of duplicate code.

The 64-bit tlb flush multiplexes the various cpus across 8 vectors to 
increase scalability. If this is a big issue, then the smp function call 
code can (and should) do the same thing.  (Though looking at it more 
closely, the way the code uses the 8 vectors is actually a less general 
way of doing what smp_call_function is doing anyway.)

Thoughts?

(And uv should definitely be hooking pvops if it wants its own 
flush_tlb_others; vsmp sets the precedent for a subarch-like use of pvops.)
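
To illustrate the direction (a rough, untested sketch only -- the struct and 
function names here are made up for illustration, not from an actual patch):

	struct flush_info {
		struct mm_struct *mm;
		unsigned long va;
	};

	/* Runs on each target cpu via the generic IPI path. */
	static void remote_flush_tlb(void *arg)
	{
		struct flush_info *fi = arg;

		/* The real handler also checks that this cpu is actually
		   using fi->mm before flushing; elided here for brevity. */
		if (fi->va == TLB_FLUSH_ALL)
			local_flush_tlb();
		else
			__flush_tlb_one(fi->va);
	}

	static void generic_flush_tlb_others(cpumask_t mask,
					     struct mm_struct *mm,
					     unsigned long va)
	{
		struct flush_info fi = { .mm = mm, .va = va };

		/* wait=1: callers may free page tables right after this. */
		smp_call_function_mask(mask, remote_flush_tlb, &fi, 1);
	}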

    J


* Re: x86: Is there still value in having a special tlb flush IPI vector?
  2008-07-28 23:16 x86: Is there still value in having a special tlb flush IPI vector? Jeremy Fitzhardinge
@ 2008-07-28 23:20 ` Jeremy Fitzhardinge
  2008-07-29  2:12   ` Andi Kleen
  2008-07-28 23:34 ` Ingo Molnar
  1 sibling, 1 reply; 22+ messages in thread
From: Jeremy Fitzhardinge @ 2008-07-28 23:20 UTC (permalink / raw)
  To: Ingo Molnar; +Cc: Jens Axboe, Andi Kleen, Linux Kernel Mailing List

Resend to cc: Andi on an address which actually works.

Jeremy Fitzhardinge wrote:
> Now that normal smp_call_function is no longer an enormous bottleneck, 
> is there still value in having a specialised IPI vector for tlb 
> flushes?  It seems like quite a lot of duplicate code.
>
> The 64-bit tlb flush multiplexes the various cpus across 8 vectors to 
> increase scalability. If this is a big issue, then the smp function 
> call code can (and should) do the same thing.  (Though looking at it 
> more closely, the way the code uses the 8 vectors is actually a less 
> general way of doing what smp_call_function is doing anyway.)
>
> Thoughts?
>
> (And uv should definitely be hooking pvops if it wants its own 
> flush_tlb_others; vsmp sets the precedent for a subarch-like use of 
> pvops.)
>
>    J



* Re: x86: Is there still value in having a special tlb flush IPI vector?
  2008-07-28 23:16 x86: Is there still value in having a special tlb flush IPI vector? Jeremy Fitzhardinge
  2008-07-28 23:20 ` Jeremy Fitzhardinge
@ 2008-07-28 23:34 ` Ingo Molnar
  2008-07-29  4:30   ` Nick Piggin
  1 sibling, 1 reply; 22+ messages in thread
From: Ingo Molnar @ 2008-07-28 23:34 UTC (permalink / raw)
  To: Jeremy Fitzhardinge
  Cc: Jens Axboe, Andi Kleen, Linux Kernel Mailing List,
	Thomas Gleixner, H. Peter Anvin


* Jeremy Fitzhardinge <jeremy@goop.org> wrote:

> Now that normal smp_call_function is no longer an enormous bottleneck, 
> is there still value in having a specialised IPI vector for tlb 
> flushes?  It seems like quite a lot of duplicate code.
>
> The 64-bit tlb flush multiplexes the various cpus across 8 vectors to 
> increase scalability. If this is a big issue, then the smp function 
> call code can (and should) do the same thing.  (Though looking at it 
> more closely, the way the code uses the 8 vectors is actually a less 
> general way of doing what smp_call_function is doing anyway.)

yep, and we could eliminate the reschedule IPI as well.

	Ingo


* Re: x86: Is there still value in having a special tlb flush IPI vector?
  2008-07-28 23:20 ` Jeremy Fitzhardinge
@ 2008-07-29  2:12   ` Andi Kleen
  2008-07-29  6:29     ` Jeremy Fitzhardinge
  0 siblings, 1 reply; 22+ messages in thread
From: Andi Kleen @ 2008-07-29  2:12 UTC (permalink / raw)
  To: Jeremy Fitzhardinge
  Cc: Ingo Molnar, Jens Axboe, Andi Kleen, Linux Kernel Mailing List

On Mon, Jul 28, 2008 at 04:20:53PM -0700, Jeremy Fitzhardinge wrote:
> Resend to cc: Andi on an address which actually works.
> 
> Jeremy Fitzhardinge wrote:
> >Now that normal smp_call_function is no longer an enormous bottleneck, 

Hmm? It still uses a global lock at least as of current git tree.

-Andi


* Re: x86: Is there still value in having a special tlb flush IPI vector?
  2008-07-28 23:34 ` Ingo Molnar
@ 2008-07-29  4:30   ` Nick Piggin
  2008-07-29  6:19     ` Jeremy Fitzhardinge
  2008-07-29  9:54     ` Peter Zijlstra
  0 siblings, 2 replies; 22+ messages in thread
From: Nick Piggin @ 2008-07-29  4:30 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Jeremy Fitzhardinge, Jens Axboe, Andi Kleen,
	Linux Kernel Mailing List, Thomas Gleixner, H. Peter Anvin

On Tuesday 29 July 2008 09:34, Ingo Molnar wrote:
> * Jeremy Fitzhardinge <jeremy@goop.org> wrote:
> > Now that normal smp_call_function is no longer an enormous bottleneck,
> > is there still value in having a specialised IPI vector for tlb
> > flushes?  It seems like quite a lot of duplicate code.
> >
> > The 64-bit tlb flush multiplexes the various cpus across 8 vectors to
> > increase scalability. If this is a big issue, then the smp function
> > call code can (and should) do the same thing.  (Though looking at it
> > more closely, the way the code uses the 8 vectors is actually a less
> > general way of doing what smp_call_function is doing anyway.)

It definitely is not a clear win. They do not have the same characteristics.
So numbers will be needed.

smp_call_function is now properly scalable in smp_call_function_single
form. The more general case of multiple targets is not so easy and it still
takes a global lock and touches global cachelines.

I don't think it is a good use of time, honestly. Do you have a good reason?


> yep, and we could eliminate the reschedule IPI as well.

No. The rewrite makes it now very good at synchronously sending a function
to a single other CPU.

Sending asynchronously requires a slab allocation and then a remote slab free
(which is nasty for slab) at the other end, and bouncing of locks and
cachelines. No way you want to do that in the reschedule IPI.

Not to mention the minor problem that it still deadlocks when called with
interrupts disabled ;)
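
(The async path is roughly this -- a simplified sketch from memory of the
current kernel/smp.c, not a verbatim quote:)

	struct call_single_data d;	/* on-stack, for the sync fallback */
	struct call_single_data *data = NULL;

	if (!wait) {
		data = kmalloc(sizeof(*data), GFP_ATOMIC);
		if (data)
			data->flags = CSD_FLAG_ALLOC;	/* target cpu kfree()s it */
	}
	if (!data) {
		/* Allocation failed (or the caller asked to wait): fall back
		   to the on-stack descriptor and spin until the target cpu
		   has run the function.  Spinning like that with interrupts
		   disabled is where the deadlock comes from. */
		data = &d;
		data->flags = CSD_FLAG_WAIT;
	}
	data->func = func;
	data->info = info;
	generic_exec_single(cpu, data);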


* Re: x86: Is there still value in having a special tlb flush IPI vector?
  2008-07-29  4:30   ` Nick Piggin
@ 2008-07-29  6:19     ` Jeremy Fitzhardinge
  2008-07-29  9:47       ` Nick Piggin
  2008-07-29  9:54     ` Peter Zijlstra
  1 sibling, 1 reply; 22+ messages in thread
From: Jeremy Fitzhardinge @ 2008-07-29  6:19 UTC (permalink / raw)
  To: Nick Piggin
  Cc: Ingo Molnar, Jens Axboe, Andi Kleen, Linux Kernel Mailing List,
	Thomas Gleixner, H. Peter Anvin

Nick Piggin wrote:
> It definitely is not a clear win. They do not have the same characteristics.
> So numbers will be needed.
>
> smp_call_function is now properly scalable in smp_call_function_single
> form. The more general case of multiple targets is not so easy and it still
> takes a global lock and touches global cachelines.
>
> I don't think it is a good use of time, honestly. Do you have a good reason?
>   

Code cleanup, unification.  It took about 20 minutes to do.  It probably 
won't take too much longer to unify kernel/tlb.c.  It seems that if 
there's any performance loss in making the transition, then we can make 
it up again by tuning smp_call_function_mask, benefiting all users.

But, truth be told, the real reason is that I think there may be some 
correctness issue around smp_call_function* - I've seen occasional 
inexplicable crashes, all within generic_smp_call_function() - and I 
just can't exercise that code enough to get a solid reproducing case.  
But if it gets used for tlb flushes, then any bug is going to become 
pretty obvious.  Regardless of whether these patches get accepted, I can 
use it as a test vehicle.

> No. The rewrite makes it now very good at synchronously sending a function
> to a single other CPU.
>
> Sending asynchronously requires a slab allocation and then a remote slab free
> (which is nasty for slab) at the other end, and bouncing of locks and
> cachelines. No way you want to do that in the reschedule IPI.
>
> Not to mention the minor problem that it still deadlocks when called with
> interrupts disabled ;)
>   

In the async case?  Or because it can become spontaneously sync if 
there's an allocation failure?

    J


* Re: x86: Is there still value in having a special tlb flush IPI vector?
  2008-07-29  2:12   ` Andi Kleen
@ 2008-07-29  6:29     ` Jeremy Fitzhardinge
  2008-07-29 12:02       ` Andi Kleen
  0 siblings, 1 reply; 22+ messages in thread
From: Jeremy Fitzhardinge @ 2008-07-29  6:29 UTC (permalink / raw)
  To: Andi Kleen; +Cc: Ingo Molnar, Jens Axboe, Linux Kernel Mailing List

Andi Kleen wrote:
>>> Now that normal smp_call_function is no longer an enormous bottleneck, 
>>>       
>
> Hmm? It still uses a global lock at least as of current git tree.

Yes, but it's only held briefly to put things onto the list.  It doesn't 
get held over the whole IPI transaction as the old smp_call_function 
did, and the tlb flush code still does.  RCU is used to manage the list 
walk and freeing, so there are no long-held locks there either.
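
The pattern is roughly (a simplified sketch of kernel/smp.c, from memory --
names may be slightly off):

	/* Sender: the lock is held only long enough to queue the element,
	   not across the IPI and the wait. */
	spin_lock_irqsave(&call_function_lock, flags);
	list_add_tail_rcu(&data->csd.list, &call_function_queue);
	spin_unlock_irqrestore(&call_function_lock, flags);

	arch_send_call_function_ipi(mask);

	/* Receiver: walks the queue under RCU, no global lock at all. */
	rcu_read_lock();
	list_for_each_entry_rcu(data, &call_function_queue, csd.list) {
		/* run data->func(data->info) if this cpu is in the mask */
	}
	rcu_read_unlock();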

    J


* Re: x86: Is there still value in having a special tlb flush IPI vector?
  2008-07-29  6:19     ` Jeremy Fitzhardinge
@ 2008-07-29  9:47       ` Nick Piggin
  0 siblings, 0 replies; 22+ messages in thread
From: Nick Piggin @ 2008-07-29  9:47 UTC (permalink / raw)
  To: Jeremy Fitzhardinge
  Cc: Ingo Molnar, Jens Axboe, Andi Kleen, Linux Kernel Mailing List,
	Thomas Gleixner, H. Peter Anvin

On Tuesday 29 July 2008 16:19, Jeremy Fitzhardinge wrote:
> Nick Piggin wrote:
> > It definitely is not a clear win. They do not have the same
> > characteristics. So numbers will be needed.
> 
> > smp_call_function is now properly scalable in smp_call_function_single
> > form. The more general case of multiple targets is not so easy and it
> > still takes a global lock and touches global cachelines.
> >
> > I don't think it is a good use of time, honestly. Do you have a good
> > reason?
>
> Code cleanup, unification.  It took about 20 minutes to do.  It probably

OK, so nothing terribly important.


> won't take too much longer to unify kernel/tlb.c.  It seems that if
> there's any performance loss in making the transition, then we can make
> it up again by tuning smp_call_function_mask, benefiting all users.

No, I don't think that is the right way to go for such important
functionality. There are no ifs: smp_call_function does touch global
cachelines and locks.

smp_call_function is barely used, as should be very obvious because it
was allowed to languish with such horrible performance for so long. So
there aren't too many users.

But if you get smp_call_function_mask performance at the same time,
then there is less to argue about I guess (although it will always
be necessarily more complex than plain tlb flushing).


> But, truth be told, the real reason is that I think there may be some
> correctness issue around smp_call_function* - I've seen occasional
> inexplicable crashes, all within generic_smp_call_function() - and I
> just can't exercise that code enough to get a solid reproducing case.
> But if it gets used for tlb flushes, then any bug is going to become
> pretty obvious.  Regardless of whether these patches get accepted, I can
> use it as a test vehicle.

That's fair enough. Better still might be a test harness specifically
to exercise it.
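
(Something as dumb as this would probably do -- an illustrative sketch only,
not an existing module; it just hammers the single-call path from a kthread:)

	static atomic_t hits;

	static void test_func(void *info)
	{
		atomic_inc(&hits);
	}

	/* One of these per cpu, each targeting another cpu, started from
	   module init; mixes sync and async calls in a tight loop. */
	static int hammer_thread(void *arg)
	{
		int target = (long)arg;

		while (!kthread_should_stop()) {
			smp_call_function_single(target, test_func, NULL, 1);
			smp_call_function_single(target, test_func, NULL, 0);
			cond_resched();
		}
		return 0;
	}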


> > No. The rewrite makes it now very good at synchronously sending a
> > function to a single other CPU.
> >
> > Sending asynchronously requires a slab allocation and then a remote slab
> > free (which is nasty for slab) at the other end, and bouncing of locks
> > and cachelines. No way you want to do that in the reschedule IPI.
> >
> > Not to mention the minor problem that it still deadlocks when called with
> > interrupts disabled ;)
>
> In the async case?  Or because it can become spontaneously sync if
> there's an allocation failure?

In both sync and async case, yes.


* Re: x86: Is there still value in having a special tlb flush IPI vector?
  2008-07-29  4:30   ` Nick Piggin
  2008-07-29  6:19     ` Jeremy Fitzhardinge
@ 2008-07-29  9:54     ` Peter Zijlstra
  2008-07-29 10:00       ` Nick Piggin
  1 sibling, 1 reply; 22+ messages in thread
From: Peter Zijlstra @ 2008-07-29  9:54 UTC (permalink / raw)
  To: Nick Piggin
  Cc: Ingo Molnar, Jeremy Fitzhardinge, Jens Axboe, Andi Kleen,
	Linux Kernel Mailing List, Thomas Gleixner, H. Peter Anvin

On Tue, 2008-07-29 at 14:30 +1000, Nick Piggin wrote:

> Not to mention the minor problem that it still deadlocks when called with
> interrupts disabled ;)

__smp_call_function_single has potential though.. 



* Re: x86: Is there still value in having a special tlb flush IPI vector?
  2008-07-29  9:54     ` Peter Zijlstra
@ 2008-07-29 10:00       ` Nick Piggin
  2008-07-29 10:04         ` Peter Zijlstra
  2008-07-29 10:45         ` Peter Zijlstra
  0 siblings, 2 replies; 22+ messages in thread
From: Nick Piggin @ 2008-07-29 10:00 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Ingo Molnar, Jeremy Fitzhardinge, Jens Axboe, Andi Kleen,
	Linux Kernel Mailing List, Thomas Gleixner, H. Peter Anvin

On Tuesday 29 July 2008 19:54, Peter Zijlstra wrote:
> On Tue, 2008-07-29 at 14:30 +1000, Nick Piggin wrote:
> > Not to mention the minor problem that it still deadlocks when called with
> > interrupts disabled ;)
>
> __smp_call_function_single has potential though..

For reschedule interrupt? I don't really agree.


* Re: x86: Is there still value in having a special tlb flush IPI vector?
  2008-07-29 10:00       ` Nick Piggin
@ 2008-07-29 10:04         ` Peter Zijlstra
  2008-07-29 10:17           ` Nick Piggin
  2008-07-29 10:45         ` Peter Zijlstra
  1 sibling, 1 reply; 22+ messages in thread
From: Peter Zijlstra @ 2008-07-29 10:04 UTC (permalink / raw)
  To: Nick Piggin
  Cc: Ingo Molnar, Jeremy Fitzhardinge, Jens Axboe, Andi Kleen,
	Linux Kernel Mailing List, Thomas Gleixner, H. Peter Anvin

On Tue, 2008-07-29 at 20:00 +1000, Nick Piggin wrote:
> On Tuesday 29 July 2008 19:54, Peter Zijlstra wrote:
> > On Tue, 2008-07-29 at 14:30 +1000, Nick Piggin wrote:
> > > Not to mention the minor problem that it still deadlocks when called with
> > > interrupts disabled ;)
> >
> > __smp_call_function_single has potential though..
> 
> For reschedule interrupt? I don't really agree.

Not specifically, for not deadlocking from irq-off, more so.



* Re: x86: Is there still value in having a special tlb flush IPI vector?
  2008-07-29 10:04         ` Peter Zijlstra
@ 2008-07-29 10:17           ` Nick Piggin
  2008-07-29 10:23             ` Peter Zijlstra
  0 siblings, 1 reply; 22+ messages in thread
From: Nick Piggin @ 2008-07-29 10:17 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Ingo Molnar, Jeremy Fitzhardinge, Jens Axboe, Andi Kleen,
	Linux Kernel Mailing List, Thomas Gleixner, H. Peter Anvin

On Tuesday 29 July 2008 20:04, Peter Zijlstra wrote:
> On Tue, 2008-07-29 at 20:00 +1000, Nick Piggin wrote:
> > On Tuesday 29 July 2008 19:54, Peter Zijlstra wrote:
> > > On Tue, 2008-07-29 at 14:30 +1000, Nick Piggin wrote:
> > > > Not to mention the minor problem that it still deadlocks when called
> > > > with interrupts disabled ;)
> > >
> > > __smp_call_function_single has potential though..
> >
> > For reschedule interrupt? I don't really agree.
>
> Not specifically, for not deadlocking from irq-off, more so.

Oh, well yes it already does work from irq-off, so it has already
realised its potential :)

Not sure exactly what kinds of users it is going to attract, but
it should be interesting to see!


* Re: x86: Is there still value in having a special tlb flush IPI vector?
  2008-07-29 10:17           ` Nick Piggin
@ 2008-07-29 10:23             ` Peter Zijlstra
  0 siblings, 0 replies; 22+ messages in thread
From: Peter Zijlstra @ 2008-07-29 10:23 UTC (permalink / raw)
  To: Nick Piggin
  Cc: Ingo Molnar, Jeremy Fitzhardinge, Jens Axboe, Andi Kleen,
	Linux Kernel Mailing List, Thomas Gleixner, H. Peter Anvin

On Tue, 2008-07-29 at 20:17 +1000, Nick Piggin wrote:
> On Tuesday 29 July 2008 20:04, Peter Zijlstra wrote:
> > On Tue, 2008-07-29 at 20:00 +1000, Nick Piggin wrote:
> > > On Tuesday 29 July 2008 19:54, Peter Zijlstra wrote:
> > > > On Tue, 2008-07-29 at 14:30 +1000, Nick Piggin wrote:
> > > > > Not to mention the minor problem that it still deadlocks when called
> > > > > with interrupts disabled ;)
> > > >
> > > > __smp_call_function_single has potential though..
> > >
> > > For reschedule interrupt? I don't really agree.
> >
> > Not specifically, for not deadlocking from irq-off, more so.
> 
> Oh, well yes it already does work from irq-off, so it has already
> realised its potential :)
> 
> Not sure exactly what kinds of users it is going to attract, but
> it should be interesting to see!

grep __smp_call_function_single kernel/sched.c




* Re: x86: Is there still value in having a special tlb flush IPI vector?
  2008-07-29 10:00       ` Nick Piggin
  2008-07-29 10:04         ` Peter Zijlstra
@ 2008-07-29 10:45         ` Peter Zijlstra
  2008-07-31 16:48           ` Ingo Molnar
  2008-07-31 17:48           ` Jeremy Fitzhardinge
  1 sibling, 2 replies; 22+ messages in thread
From: Peter Zijlstra @ 2008-07-29 10:45 UTC (permalink / raw)
  To: Nick Piggin
  Cc: Ingo Molnar, Jeremy Fitzhardinge, Jens Axboe, Andi Kleen,
	Linux Kernel Mailing List, Thomas Gleixner, H. Peter Anvin

On Tue, 2008-07-29 at 20:00 +1000, Nick Piggin wrote:
> On Tuesday 29 July 2008 19:54, Peter Zijlstra wrote:
> > On Tue, 2008-07-29 at 14:30 +1000, Nick Piggin wrote:
> > > Not to mention the minor problem that it still deadlocks when called with
> > > interrupts disabled ;)
> >
> > __smp_call_function_single has potential though..
> 
> For reschedule interrupt? I don't really agree.

How about using just arch_send_call_function_single_ipi() to implement
smp_send_reschedule() ?

The overhead of that is a smp_mb() and a list_empty() check in
generic_smp_call_function_single_interrupt() if there is indeed no work
to do.
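
Concretely, something like this (sketch):

	void smp_send_reschedule(int cpu)
	{
		/* Nothing is queued, so the handler on the other side just
		   sees an empty list and returns; the actual reschedule
		   happens via need_resched on the interrupt return path. */
		arch_send_call_function_single_ipi(cpu);
	}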





* Re: x86: Is there still value in having a special tlb flush IPI vector?
  2008-07-29  6:29     ` Jeremy Fitzhardinge
@ 2008-07-29 12:02       ` Andi Kleen
  2008-07-29 14:46         ` Jeremy Fitzhardinge
  0 siblings, 1 reply; 22+ messages in thread
From: Andi Kleen @ 2008-07-29 12:02 UTC (permalink / raw)
  To: Jeremy Fitzhardinge
  Cc: Andi Kleen, Ingo Molnar, Jens Axboe, Linux Kernel Mailing List

On Mon, Jul 28, 2008 at 11:29:18PM -0700, Jeremy Fitzhardinge wrote:
> Andi Kleen wrote:
> >>>Now that normal smp_call_function is no longer an enormous bottleneck, 
> >>>      
> >
> >Hmm? It still uses a global lock at least as of current git tree.
> 
> Yes, but it's only held briefly to put things onto the list.  It doesn't 
> get held over the whole IPI transaction as the old smp_call_function 
> did, and the tlb flush code still does.  RCU is used to manage the list 
> walk and freeing, so there are no long-held locks there either.

If it bounces regularly it will still hurt.

-Andi


* Re: x86: Is there still value in having a special tlb flush IPI vector?
  2008-07-29 12:02       ` Andi Kleen
@ 2008-07-29 14:46         ` Jeremy Fitzhardinge
  2008-07-29 14:58           ` Andi Kleen
  0 siblings, 1 reply; 22+ messages in thread
From: Jeremy Fitzhardinge @ 2008-07-29 14:46 UTC (permalink / raw)
  To: Andi Kleen; +Cc: Ingo Molnar, Jens Axboe, Linux Kernel Mailing List

Andi Kleen wrote:
>> Yes, but it's only held briefly to put things onto the list.  It doesn't 
>> get held over the whole IPI transaction as the old smp_call_function 
>> did, and the tlb flush code still does.  RCU is used to manage the list 
>> walk and freeing, so there are no long-held locks there either.
>>     
>
> If it bounces regularly it will still hurt.
>   

We could convert smp_call_function_mask to use a multi-vector scheme 
like tlb_64.c if that turns out to be an issue.
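
(For reference, the tlb_64.c scheme is roughly this -- a sketch from memory,
names approximate: the sender picks one of the 8 vectors by its cpu number,
so concurrent flushes from different cpus rarely contend on the same lock:)

	sender = smp_processor_id() % NUM_INVALIDATE_TLB_VECTORS;
	f = &flush_state[sender];

	spin_lock(&f->tlbstate_lock);
	f->flush_mm = mm;
	f->flush_va = va;
	cpus_or(f->flush_cpumask, cpumask, f->flush_cpumask);
	send_IPI_mask(cpumask, INVALIDATE_TLB_VECTOR_START + sender);

	/* Wait for all targets to clear themselves from the mask. */
	while (!cpus_empty(f->flush_cpumask))
		cpu_relax();

	f->flush_mm = NULL;
	f->flush_va = 0;
	spin_unlock(&f->tlbstate_lock);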

    J


* Re: x86: Is there still value in having a special tlb flush IPI vector?
  2008-07-29 14:46         ` Jeremy Fitzhardinge
@ 2008-07-29 14:58           ` Andi Kleen
  0 siblings, 0 replies; 22+ messages in thread
From: Andi Kleen @ 2008-07-29 14:58 UTC (permalink / raw)
  To: Jeremy Fitzhardinge
  Cc: Andi Kleen, Ingo Molnar, Jens Axboe, Linux Kernel Mailing List

On Tue, Jul 29, 2008 at 07:46:32AM -0700, Jeremy Fitzhardinge wrote:
> Andi Kleen wrote:
> >>Yes, but it's only held briefly to put things onto the list.  It doesn't 
> >>get held over the whole IPI transaction as the old smp_call_function 
> >>did, and the tlb flush code still does.  RCU is used to manage the list 
> >>walk and freeing, so there are no long-held locks there either.
> >>    
> >
> >If it bounces regularly it will still hurt.
> >  
> 
> We could convert smp_call_function_mask to use a multi-vector scheme 
> like tlb_64.c if that turns out to be an issue.

Converting it first would be fine. Or rather in parallel, because 
you would need to reuse the TLB vectors (there are not that many 
free).

But waiting first for a report would seem wrong to me.

I can just see some poor performance person spending a lot of time tracking
down such a regression. While there's a lot of development manpower available
for Linux, there's still no reason to waste it. I think if you want to change 
such performance-critical paths you should make sure the new code is roughly
performance-equivalent first. And with the global lock I don't see that
at all.

-Andi


* Re: x86: Is there still value in having a special tlb flush IPI vector?
  2008-07-29 10:45         ` Peter Zijlstra
@ 2008-07-31 16:48           ` Ingo Molnar
  2008-08-01  1:32             ` Nick Piggin
  2008-07-31 17:48           ` Jeremy Fitzhardinge
  1 sibling, 1 reply; 22+ messages in thread
From: Ingo Molnar @ 2008-07-31 16:48 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Nick Piggin, Jeremy Fitzhardinge, Jens Axboe, Andi Kleen,
	Linux Kernel Mailing List, Thomas Gleixner, H. Peter Anvin


* Peter Zijlstra <peterz@infradead.org> wrote:

> On Tue, 2008-07-29 at 20:00 +1000, Nick Piggin wrote:
> > On Tuesday 29 July 2008 19:54, Peter Zijlstra wrote:
> > > On Tue, 2008-07-29 at 14:30 +1000, Nick Piggin wrote:
> > > > Not to mention the minor problem that it still deadlocks when called with
> > > > interrupts disabled ;)
> > >
> > > __smp_call_function_single has potential though..
> > 
> > For reschedule interrupt? I don't really agree.
> 
> How about using just arch_send_call_function_single_ipi() to implement 
> smp_send_reschedule() ?

agreed, that's just a single IPI which kicks the need_resched logic on 
return-from-interrupt.

> The overhead of that is a smp_mb() and a list_empty() check in 
> generic_smp_call_function_single_interrupt() if there is indeed no 
> work to do.

that would be a minuscule cost - the cacheline is read-shared amongst cpus, 
so there's no real bouncing there. So I'm all for it ...

	Ingo


* Re: x86: Is there still value in having a special tlb flush IPI vector?
  2008-07-29 10:45         ` Peter Zijlstra
  2008-07-31 16:48           ` Ingo Molnar
@ 2008-07-31 17:48           ` Jeremy Fitzhardinge
  2008-07-31 20:57             ` Ingo Molnar
  1 sibling, 1 reply; 22+ messages in thread
From: Jeremy Fitzhardinge @ 2008-07-31 17:48 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Nick Piggin, Ingo Molnar, Jens Axboe, Andi Kleen,
	Linux Kernel Mailing List, Thomas Gleixner, H. Peter Anvin

Peter Zijlstra wrote:
> How about using just arch_send_call_function_single_ipi() to implement
> smp_send_reschedule() ?
>
> The overhead of that is a smp_mb() and a list_empty() check in
> generic_smp_call_function_single_interrupt() if there is indeed no work
> to do.
>   

Is doing a no-op interrupt sufficient on all architectures?  Is there
some chance a function call IPI might not go through the normal
reschedule interrupt exit path?

    J



* Re: x86: Is there still value in having a special tlb flush IPI vector?
  2008-07-31 17:48           ` Jeremy Fitzhardinge
@ 2008-07-31 20:57             ` Ingo Molnar
  2008-07-31 21:15               ` Peter Zijlstra
  0 siblings, 1 reply; 22+ messages in thread
From: Ingo Molnar @ 2008-07-31 20:57 UTC (permalink / raw)
  To: Jeremy Fitzhardinge
  Cc: Peter Zijlstra, Nick Piggin, Jens Axboe, Andi Kleen,
	Linux Kernel Mailing List, Thomas Gleixner, H. Peter Anvin


* Jeremy Fitzhardinge <jeremy@goop.org> wrote:

> Peter Zijlstra wrote:
> > How about using just arch_send_call_function_single_ipi() to implement
> > smp_send_reschedule() ?
> >
> > The overhead of that is a smp_mb() and a list_empty() check in
> > generic_smp_call_function_single_interrupt() if there is indeed no work
> > to do.
> 
> Is doing a no-op interrupt sufficient on all architectures?  Is there 
> some chance a function call IPI might not go through the normal 
> reschedule interrupt exit path?

We'd still use the smp_send_reschedule(cpu) API, so it's an architecture 
detail. On x86 we'd use arch_send_call_function_single_ipi().

	Ingo


* Re: x86: Is there still value in having a special tlb flush IPI vector?
  2008-07-31 20:57             ` Ingo Molnar
@ 2008-07-31 21:15               ` Peter Zijlstra
  0 siblings, 0 replies; 22+ messages in thread
From: Peter Zijlstra @ 2008-07-31 21:15 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Jeremy Fitzhardinge, Nick Piggin, Jens Axboe, Andi Kleen,
	Linux Kernel Mailing List, Thomas Gleixner, H. Peter Anvin

On Thu, 2008-07-31 at 22:57 +0200, Ingo Molnar wrote:
> * Jeremy Fitzhardinge <jeremy@goop.org> wrote:
> 
> > Peter Zijlstra wrote:
> > > How about using just arch_send_call_function_single_ipi() to implement
> > > smp_send_reschedule() ?
> > >
> > > The overhead of that is a smp_mb() and a list_empty() check in
> > > generic_smp_call_function_single_interrupt() if there is indeed no work
> > > to do.
> > 
> > Is doing a no-op interrupt sufficient on all architectures?  Is there 
> > some chance a function call IPI might not go through the normal 
> > reschedule interrupt exit path?
> 
> We'd still use the smp_send_reschedule(cpu) API, so it's an architecture 
> detail. On x86 we'd use arch_send_call_function_single_ipi().

Also, all interrupts _should_ go through the regular interrupt enter/exit
paths; we fix up stuff there, like jiffies and such.

We had a fun NO_HZ bug the other day because some sparc64 IPIs didn't.
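
(i.e. IPI handlers should look roughly like the x86 one -- sketch:)

	void smp_call_function_single_interrupt(struct pt_regs *regs)
	{
		ack_APIC_irq();
		irq_enter();	/* jiffies / NO_HZ fixups happen in here */
		generic_smp_call_function_single_interrupt();
		irq_exit();
	}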



* Re: x86: Is there still value in having a special tlb flush IPI vector?
  2008-07-31 16:48           ` Ingo Molnar
@ 2008-08-01  1:32             ` Nick Piggin
  0 siblings, 0 replies; 22+ messages in thread
From: Nick Piggin @ 2008-08-01  1:32 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Peter Zijlstra, Jeremy Fitzhardinge, Jens Axboe, Andi Kleen,
	Linux Kernel Mailing List, Thomas Gleixner, H. Peter Anvin

On Friday 01 August 2008 02:48, Ingo Molnar wrote:
> * Peter Zijlstra <peterz@infradead.org> wrote:

> > The overhead of that is a smp_mb() and a list_empty() check in
> > generic_smp_call_function_single_interrupt() if there is indeed no
> > work to do.
>
> that would be a minuscule cost - the cacheline is read-shared amongst cpus,
> so there's no real bouncing there. So I'm all for it ...

smp_mb would cost some cycles. So would the branch mispredict, because
the list_empty branch would otherwise normally be taken, I think. The queue
(q) is likely not in cache either.

I'm not in favour.

