* [RFC] sched, x86: Prevent resched interrupts if task in kernel mode and !CONFIG_PREEMPT @ 2015-01-23 15:53 Kirill Tkhai 2015-01-23 16:07 ` Peter Zijlstra 0 siblings, 1 reply; 7+ messages in thread From: Kirill Tkhai @ 2015-01-23 15:53 UTC (permalink / raw) To: linux-kernel; +Cc: Peter Zijlstra, Thomas Gleixner, Ingo Molnar, H. Peter Anvin It's useless to send reschedule interrupts in such situations. The earliest point, where schedule() call is possible, is sysret_careful(). But in that function we directly test TIF_NEED_RESCHED. So it's possible to get rid of that type of interrupts. How about this idea? Is set_bit() cheap on x86 machines? --- arch/x86/kernel/entry_64.S | 10 ++++++++++ 1 file changed, 10 insertions(+) diff --git a/arch/x86/kernel/entry_64.S b/arch/x86/kernel/entry_64.S index c653dc4..a046ba8 100644 --- a/arch/x86/kernel/entry_64.S +++ b/arch/x86/kernel/entry_64.S @@ -409,6 +409,13 @@ GLOBAL(system_call_after_swapgs) movq_cfi rax,(ORIG_RAX-ARGOFFSET) movq %rcx,RIP-ARGOFFSET(%rsp) CFI_REL_OFFSET rip,RIP-ARGOFFSET +#if !defined(CONFIG_PREEMPT) || !defined(SMP) + /* + * Tell resched_curr() do not send useless interrupts to us. + * Kernel isn't preemptible till sysret_careful() anyway. + */ + LOCK ; bts $TIF_POLLING_NRFLAG,TI_flags+THREAD_INFO(%rsp,RIP-ARGOFFSET) +#endif testl $_TIF_WORK_SYSCALL_ENTRY,TI_flags+THREAD_INFO(%rsp,RIP-ARGOFFSET) jnz tracesys system_call_fastpath: @@ -427,6 +434,9 @@ GLOBAL(system_call_after_swapgs) * Has incomplete stack frame and undefined top of stack. */ ret_from_sys_call: +#if !defined(CONFIG_PREEMPT) || !defined(SMP) + LOCK ; btr $TIF_POLLING_NRFLAG,TI_flags+THREAD_INFO(%rsp,RIP-ARGOFFSET) +#endif movl $_TIF_ALLWORK_MASK,%edi /* edi: flagmask */ sysret_check: ^ permalink raw reply related [flat|nested] 7+ messages in thread
* Re: [RFC] sched, x86: Prevent resched interrupts if task in kernel mode and !CONFIG_PREEMPT 2015-01-23 15:53 [RFC] sched, x86: Prevent resched interrupts if task in kernel mode and !CONFIG_PREEMPT Kirill Tkhai @ 2015-01-23 16:07 ` Peter Zijlstra 2015-01-23 16:24 ` Andy Lutomirski 0 siblings, 1 reply; 7+ messages in thread From: Peter Zijlstra @ 2015-01-23 16:07 UTC (permalink / raw) To: Kirill Tkhai Cc: linux-kernel, Thomas Gleixner, Ingo Molnar, H. Peter Anvin, luto On Fri, Jan 23, 2015 at 06:53:32PM +0300, Kirill Tkhai wrote: > It's useless to send reschedule interrupts in such situations. The earliest > point, where schedule() call is possible, is sysret_careful(). But in that > function we directly test TIF_NEED_RESCHED. > > So it's possible to get rid of that type of interrupts. > > How about this idea? Is set_bit() cheap on x86 machines? So you set TIF_POLLING_NRFLAG on syscall entry and clear it again on exit? Thereby we avoid the IPI, because the exit path already checks for TIF_NEED_RESCHED. Should work I suppose, but I'm not too familiar with all that entry.S muck. Andy might know and appreciate this. > --- > arch/x86/kernel/entry_64.S | 10 ++++++++++ > 1 file changed, 10 insertions(+) > > diff --git a/arch/x86/kernel/entry_64.S b/arch/x86/kernel/entry_64.S > index c653dc4..a046ba8 100644 > --- a/arch/x86/kernel/entry_64.S > +++ b/arch/x86/kernel/entry_64.S > @@ -409,6 +409,13 @@ GLOBAL(system_call_after_swapgs) > movq_cfi rax,(ORIG_RAX-ARGOFFSET) > movq %rcx,RIP-ARGOFFSET(%rsp) > CFI_REL_OFFSET rip,RIP-ARGOFFSET > +#if !defined(CONFIG_PREEMPT) || !defined(SMP) > + /* > + * Tell resched_curr() do not send useless interrupts to us. > + * Kernel isn't preemptible till sysret_careful() anyway. 
> + */ > + LOCK ; bts $TIF_POLLING_NRFLAG,TI_flags+THREAD_INFO(%rsp,RIP-ARGOFFSET) > +#endif > testl $_TIF_WORK_SYSCALL_ENTRY,TI_flags+THREAD_INFO(%rsp,RIP-ARGOFFSET) > jnz tracesys > system_call_fastpath: > @@ -427,6 +434,9 @@ GLOBAL(system_call_after_swapgs) > * Has incomplete stack frame and undefined top of stack. > */ > ret_from_sys_call: > +#if !defined(CONFIG_PREEMPT) || !defined(SMP) > + LOCK ; btr $TIF_POLLING_NRFLAG,TI_flags+THREAD_INFO(%rsp,RIP-ARGOFFSET) > +#endif > movl $_TIF_ALLWORK_MASK,%edi > /* edi: flagmask */ > sysret_check: > > > ^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [RFC] sched, x86: Prevent resched interrupts if task in kernel mode and !CONFIG_PREEMPT 2015-01-23 16:07 ` Peter Zijlstra @ 2015-01-23 16:24 ` Andy Lutomirski 2015-01-23 17:09 ` Kirill Tkhai 0 siblings, 1 reply; 7+ messages in thread From: Andy Lutomirski @ 2015-01-23 16:24 UTC (permalink / raw) To: Peter Zijlstra Cc: Kirill Tkhai, linux-kernel, Thomas Gleixner, Ingo Molnar, H. Peter Anvin On Fri, Jan 23, 2015 at 8:07 AM, Peter Zijlstra <peterz@infradead.org> wrote: > On Fri, Jan 23, 2015 at 06:53:32PM +0300, Kirill Tkhai wrote: >> It's useless to send reschedule interrupts in such situations. The earliest >> point, where schedule() call is possible, is sysret_careful(). But in that >> function we directly test TIF_NEED_RESCHED. >> >> So it's possible to get rid of that type of interrupts. >> >> How about this idea? Is set_bit() cheap on x86 machines? > > So you set TIF_POLLING_NRFLAG on syscall entry and clear it again on > exit? Thereby we avoid the IPI, because the exit path already checks for > TIF_NEED_RESCHED. The idle code says: /* * If the arch has a polling bit, we maintain an invariant: * * Our polling bit is clear if we're not scheduled (i.e. if * rq->curr != rq->idle). This means that, if rq->idle has * the polling bit set, then setting need_resched is * guaranteed to cause the cpu to reschedule. */ Setting polling on non-idle tasks like this will either involve weakening this a bit (it'll still be true for rq->idle) or changing the polling state on context switch. > > Should work I suppose, but I'm not too familiar with all that entry.S > muck. Andy might know and appreciate this. 
> >> --- >> arch/x86/kernel/entry_64.S | 10 ++++++++++ >> 1 file changed, 10 insertions(+) >> >> diff --git a/arch/x86/kernel/entry_64.S b/arch/x86/kernel/entry_64.S >> index c653dc4..a046ba8 100644 >> --- a/arch/x86/kernel/entry_64.S >> +++ b/arch/x86/kernel/entry_64.S >> @@ -409,6 +409,13 @@ GLOBAL(system_call_after_swapgs) >> movq_cfi rax,(ORIG_RAX-ARGOFFSET) >> movq %rcx,RIP-ARGOFFSET(%rsp) >> CFI_REL_OFFSET rip,RIP-ARGOFFSET >> +#if !defined(CONFIG_PREEMPT) || !defined(SMP) >> + /* >> + * Tell resched_curr() do not send useless interrupts to us. >> + * Kernel isn't preemptible till sysret_careful() anyway. >> + */ >> + LOCK ; bts $TIF_POLLING_NRFLAG,TI_flags+THREAD_INFO(%rsp,RIP-ARGOFFSET) >> +#endif That's kind of expensive. What's the !SMP part for? >> testl $_TIF_WORK_SYSCALL_ENTRY,TI_flags+THREAD_INFO(%rsp,RIP-ARGOFFSET) >> jnz tracesys >> system_call_fastpath: >> @@ -427,6 +434,9 @@ GLOBAL(system_call_after_swapgs) >> * Has incomplete stack frame and undefined top of stack. >> */ >> ret_from_sys_call: >> +#if !defined(CONFIG_PREEMPT) || !defined(SMP) >> + LOCK ; btr $TIF_POLLING_NRFLAG,TI_flags+THREAD_INFO(%rsp,RIP-ARGOFFSET) >> +#endif If only it were this simple. There are lots of ways out of syscalls, and this is only one of them :( If we did this, I'd rather do it through the do_notify_resume mechanism or something. I don't see any way to do this without at least one atomic op or smp_mb per syscall, and that's kind of expensive. Would it make sense to try to use context tracking instead? On systems that use context tracking, syscalls are already expensive, and we're already keeping track of which CPUs are in user mode. --Andy ^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [RFC] sched, x86: Prevent resched interrupts if task in kernel mode and !CONFIG_PREEMPT 2015-01-23 16:24 ` Andy Lutomirski @ 2015-01-23 17:09 ` Kirill Tkhai 2015-01-24 2:36 ` Andy Lutomirski 0 siblings, 1 reply; 7+ messages in thread From: Kirill Tkhai @ 2015-01-23 17:09 UTC (permalink / raw) To: Andy Lutomirski Cc: Peter Zijlstra, linux-kernel, Thomas Gleixner, Ingo Molnar, H. Peter Anvin В Пт, 23/01/2015 в 08:24 -0800, Andy Lutomirski пишет: > On Fri, Jan 23, 2015 at 8:07 AM, Peter Zijlstra <peterz@infradead.org> wrote: > > On Fri, Jan 23, 2015 at 06:53:32PM +0300, Kirill Tkhai wrote: > >> It's useless to send reschedule interrupts in such situations. The earliest > >> point, where schedule() call is possible, is sysret_careful(). But in that > >> function we directly test TIF_NEED_RESCHED. > >> > >> So it's possible to get rid of that type of interrupts. > >> > >> How about this idea? Is set_bit() cheap on x86 machines? > > > > So you set TIF_POLLING_NRFLAG on syscall entry and clear it again on > > exit? Thereby we avoid the IPI, because the exit path already checks for > > TIF_NEED_RESCHED. > > The idle code says: > > /* > * If the arch has a polling bit, we maintain an invariant: > * > * Our polling bit is clear if we're not scheduled (i.e. if > * rq->curr != rq->idle). This means that, if rq->idle has > * the polling bit set, then setting need_resched is > * guaranteed to cause the cpu to reschedule. > */ > > Setting polling on non-idle tasks like this will either involve > weakening this a bit (it'll still be true for rq->idle) or changing > the polling state on context switch. > > > > > Should work I suppose, but I'm not too familiar with all that entry.S > > muck. Andy might know and appreciate this. 
> > > >> --- > >> arch/x86/kernel/entry_64.S | 10 ++++++++++ > >> 1 file changed, 10 insertions(+) > >> > >> diff --git a/arch/x86/kernel/entry_64.S b/arch/x86/kernel/entry_64.S > >> index c653dc4..a046ba8 100644 > >> --- a/arch/x86/kernel/entry_64.S > >> +++ b/arch/x86/kernel/entry_64.S > >> @@ -409,6 +409,13 @@ GLOBAL(system_call_after_swapgs) > >> movq_cfi rax,(ORIG_RAX-ARGOFFSET) > >> movq %rcx,RIP-ARGOFFSET(%rsp) > >> CFI_REL_OFFSET rip,RIP-ARGOFFSET > >> +#if !defined(CONFIG_PREEMPT) || !defined(SMP) > >> + /* > >> + * Tell resched_curr() do not send useless interrupts to us. > >> + * Kernel isn't preemptible till sysret_careful() anyway. > >> + */ > >> + LOCK ; bts $TIF_POLLING_NRFLAG,TI_flags+THREAD_INFO(%rsp,RIP-ARGOFFSET) > >> +#endif > > That's kind of expensive. What's the !SMP part for? smp_send_reschedule() is NOP on UP. There is no problem. > > >> testl $_TIF_WORK_SYSCALL_ENTRY,TI_flags+THREAD_INFO(%rsp,RIP-ARGOFFSET) > >> jnz tracesys > >> system_call_fastpath: > >> @@ -427,6 +434,9 @@ GLOBAL(system_call_after_swapgs) > >> * Has incomplete stack frame and undefined top of stack. > >> */ > >> ret_from_sys_call: > >> +#if !defined(CONFIG_PREEMPT) || !defined(SMP) > >> + LOCK ; btr $TIF_POLLING_NRFLAG,TI_flags+THREAD_INFO(%rsp,RIP-ARGOFFSET) > >> +#endif > > If only it were this simple. There are lots of ways out of syscalls, > and this is only one of them :( If we did this, I'd rather do it > through the do_notify_resume mechanism or something. Yes, syscall is the only thing I did as an example. > I don't see any way to do this without at least one atomic op or > smp_mb per syscall, and that's kind of expensive. JFI, doesn't x86 set_bit() lock a small area of memory? I thought it's not very expensive on this arch (some bus optimizations or something like this). > Would it make sense to try to use context tracking instead? 
On > systems that use context tracking, syscalls are already expensive, and > we're already keeping track of which CPUs are in user mode. I'll look at context_tracking, but I'm not sure about the SMP synchronization there. Thanks, Kirill ^ permalink raw reply [flat|nested] 7+ messages in thread

* Re: [RFC] sched, x86: Prevent resched interrupts if task in kernel mode and !CONFIG_PREEMPT 2015-01-23 17:09 ` Kirill Tkhai @ 2015-01-24 2:36 ` Andy Lutomirski 2015-01-26 11:58 ` Kirill Tkhai 0 siblings, 1 reply; 7+ messages in thread From: Andy Lutomirski @ 2015-01-24 2:36 UTC (permalink / raw) To: Kirill Tkhai Cc: Peter Zijlstra, linux-kernel, Thomas Gleixner, Ingo Molnar, H. Peter Anvin On Fri, Jan 23, 2015 at 9:09 AM, Kirill Tkhai <ktkhai@parallels.com> wrote: > В Пт, 23/01/2015 в 08:24 -0800, Andy Lutomirski пишет: >> On Fri, Jan 23, 2015 at 8:07 AM, Peter Zijlstra <peterz@infradead.org> wrote: >> > On Fri, Jan 23, 2015 at 06:53:32PM +0300, Kirill Tkhai wrote: >> >> --- >> >> arch/x86/kernel/entry_64.S | 10 ++++++++++ >> >> 1 file changed, 10 insertions(+) >> >> >> >> diff --git a/arch/x86/kernel/entry_64.S b/arch/x86/kernel/entry_64.S >> >> index c653dc4..a046ba8 100644 >> >> --- a/arch/x86/kernel/entry_64.S >> >> +++ b/arch/x86/kernel/entry_64.S >> >> @@ -409,6 +409,13 @@ GLOBAL(system_call_after_swapgs) >> >> movq_cfi rax,(ORIG_RAX-ARGOFFSET) >> >> movq %rcx,RIP-ARGOFFSET(%rsp) >> >> CFI_REL_OFFSET rip,RIP-ARGOFFSET >> >> +#if !defined(CONFIG_PREEMPT) || !defined(SMP) >> >> + /* >> >> + * Tell resched_curr() do not send useless interrupts to us. >> >> + * Kernel isn't preemptible till sysret_careful() anyway. >> >> + */ >> >> + LOCK ; bts $TIF_POLLING_NRFLAG,TI_flags+THREAD_INFO(%rsp,RIP-ARGOFFSET) >> >> +#endif >> >> That's kind of expensive. What's the !SMP part for? > > smp_send_reschedule() is NOP on UP. There is no problem. Shouldn't it be #if !defined(CONFIG_PREEMPT) && defined(CONFIG_SMP) then? > >> >> >> testl $_TIF_WORK_SYSCALL_ENTRY,TI_flags+THREAD_INFO(%rsp,RIP-ARGOFFSET) >> >> jnz tracesys >> >> system_call_fastpath: >> >> @@ -427,6 +434,9 @@ GLOBAL(system_call_after_swapgs) >> >> * Has incomplete stack frame and undefined top of stack. 
>> >> */ >> >> ret_from_sys_call: >> >> +#if !defined(CONFIG_PREEMPT) || !defined(SMP) >> >> + LOCK ; btr $TIF_POLLING_NRFLAG,TI_flags+THREAD_INFO(%rsp,RIP-ARGOFFSET) >> >> +#endif >> >> If only it were this simple. There are lots of ways out of syscalls, >> and this is only one of them :( If we did this, I'd rather do it >> through the do_notify_resume mechanism or something. > > Yes, syscall is the only thing I did as an example. > >> I don't see any way to do this without at least one atomic op or >> smp_mb per syscall, and that's kind of expensive. > > JFI, doesn't x86 set_bit() lock a small area of memory? I thought > it's not very expensive on this arch (some bus optimizations or > something like this). An entire syscall on x86 is well under 200 cycles. lock addl is >20 cycles for me, and I don't see why the atomic bitops would be faster. (Oddly, mfence is slower than lock addl, which is really odd, since lock addl implies mfence.) So this overhead may actually matter. > >> Would it make sense to try to use context tracking instead? On >> systems that use context tracking, syscalls are already expensive, and >> we're already keeping track of which CPUs are in user mode. > > I'll look at context_tracking, but I'm not sure some smp synchronization > there. It could be combinable with existing synchronization there. --Andy ^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [RFC] sched, x86: Prevent resched interrupts if task in kernel mode and !CONFIG_PREEMPT 2015-01-24 2:36 ` Andy Lutomirski @ 2015-01-26 11:58 ` Kirill Tkhai 2015-02-03 17:14 ` Kirill Tkhai 0 siblings, 1 reply; 7+ messages in thread From: Kirill Tkhai @ 2015-01-26 11:58 UTC (permalink / raw) To: Andy Lutomirski Cc: Peter Zijlstra, linux-kernel, Thomas Gleixner, Ingo Molnar, H. Peter Anvin В Пт, 23/01/2015 в 18:36 -0800, Andy Lutomirski пишет: > On Fri, Jan 23, 2015 at 9:09 AM, Kirill Tkhai <ktkhai@parallels.com> wrote: > > В Пт, 23/01/2015 в 08:24 -0800, Andy Lutomirski пишет: > >> On Fri, Jan 23, 2015 at 8:07 AM, Peter Zijlstra <peterz@infradead.org> wrote: > >> > On Fri, Jan 23, 2015 at 06:53:32PM +0300, Kirill Tkhai wrote: > >> >> --- > >> >> arch/x86/kernel/entry_64.S | 10 ++++++++++ > >> >> 1 file changed, 10 insertions(+) > >> >> > >> >> diff --git a/arch/x86/kernel/entry_64.S b/arch/x86/kernel/entry_64.S > >> >> index c653dc4..a046ba8 100644 > >> >> --- a/arch/x86/kernel/entry_64.S > >> >> +++ b/arch/x86/kernel/entry_64.S > >> >> @@ -409,6 +409,13 @@ GLOBAL(system_call_after_swapgs) > >> >> movq_cfi rax,(ORIG_RAX-ARGOFFSET) > >> >> movq %rcx,RIP-ARGOFFSET(%rsp) > >> >> CFI_REL_OFFSET rip,RIP-ARGOFFSET > >> >> +#if !defined(CONFIG_PREEMPT) || !defined(SMP) > >> >> + /* > >> >> + * Tell resched_curr() do not send useless interrupts to us. > >> >> + * Kernel isn't preemptible till sysret_careful() anyway. > >> >> + */ > >> >> + LOCK ; bts $TIF_POLLING_NRFLAG,TI_flags+THREAD_INFO(%rsp,RIP-ARGOFFSET) > >> >> +#endif > >> > >> That's kind of expensive. What's the !SMP part for? > > > > smp_send_reschedule() is NOP on UP. There is no problem. > > Shouldn't it be #if !defined(CONFIG_PREEMPT) && defined(CONFIG_SMP) then? Definitely, thanks. 
> > > > >> > >> >> testl $_TIF_WORK_SYSCALL_ENTRY,TI_flags+THREAD_INFO(%rsp,RIP-ARGOFFSET) > >> >> jnz tracesys > >> >> system_call_fastpath: > >> >> @@ -427,6 +434,9 @@ GLOBAL(system_call_after_swapgs) > >> >> * Has incomplete stack frame and undefined top of stack. > >> >> */ > >> >> ret_from_sys_call: > >> >> +#if !defined(CONFIG_PREEMPT) || !defined(SMP) > >> >> + LOCK ; btr $TIF_POLLING_NRFLAG,TI_flags+THREAD_INFO(%rsp,RIP-ARGOFFSET) > >> >> +#endif > >> > >> If only it were this simple. There are lots of ways out of syscalls, > >> and this is only one of them :( If we did this, I'd rather do it > >> through the do_notify_resume mechanism or something. > > > > Yes, syscall is the only thing I did as an example. > > > >> I don't see any way to do this without at least one atomic op or > >> smp_mb per syscall, and that's kind of expensive. > > > > JFI, doesn't x86 set_bit() lock a small area of memory? I thought > > it's not very expensive on this arch (some bus optimizations or > > something like this). > > An entire syscall on x86 is well under 200 cycles. lock addl is >20 > cycles for me, and I don't see why the atomic bitops would be faster. > (Oddly, mfence is slower than lock addl, which is really odd, since > lock addl implies mfence.) So this overhead may actually matter. Yeah, it's really big overhead. > > > >> Would it make sense to try to use context tracking instead? On > >> systems that use context tracking, syscalls are already expensive, and > >> we're already keeping track of which CPUs are in user mode. > > > > I'll look at context_tracking, but I'm not sure some smp synchronization > > there. > > It could be combinable with existing synchronization there. I'll look at this. Thanks! Kirill ^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [RFC] sched, x86: Prevent resched interrupts if task in kernel mode and !CONFIG_PREEMPT 2015-01-26 11:58 ` Kirill Tkhai @ 2015-02-03 17:14 ` Kirill Tkhai 0 siblings, 0 replies; 7+ messages in thread From: Kirill Tkhai @ 2015-02-03 17:14 UTC (permalink / raw) To: Andy Lutomirski Cc: Peter Zijlstra, linux-kernel, Thomas Gleixner, Ingo Molnar, H. Peter Anvin В Пн, 26/01/2015 в 14:58 +0300, Kirill Tkhai пишет: > В Пт, 23/01/2015 в 18:36 -0800, Andy Lutomirski пишет: > > On Fri, Jan 23, 2015 at 9:09 AM, Kirill Tkhai <ktkhai@parallels.com> wrote: > > > В Пт, 23/01/2015 в 08:24 -0800, Andy Lutomirski пишет: > > >> On Fri, Jan 23, 2015 at 8:07 AM, Peter Zijlstra <peterz@infradead.org> wrote: > > >> > On Fri, Jan 23, 2015 at 06:53:32PM +0300, Kirill Tkhai wrote: > > >> >> --- > > >> >> arch/x86/kernel/entry_64.S | 10 ++++++++++ > > >> >> 1 file changed, 10 insertions(+) > > >> >> > > >> >> diff --git a/arch/x86/kernel/entry_64.S b/arch/x86/kernel/entry_64.S > > >> >> index c653dc4..a046ba8 100644 > > >> >> --- a/arch/x86/kernel/entry_64.S > > >> >> +++ b/arch/x86/kernel/entry_64.S > > >> >> @@ -409,6 +409,13 @@ GLOBAL(system_call_after_swapgs) > > >> >> movq_cfi rax,(ORIG_RAX-ARGOFFSET) > > >> >> movq %rcx,RIP-ARGOFFSET(%rsp) > > >> >> CFI_REL_OFFSET rip,RIP-ARGOFFSET > > >> >> +#if !defined(CONFIG_PREEMPT) || !defined(SMP) > > >> >> + /* > > >> >> + * Tell resched_curr() do not send useless interrupts to us. > > >> >> + * Kernel isn't preemptible till sysret_careful() anyway. > > >> >> + */ > > >> >> + LOCK ; bts $TIF_POLLING_NRFLAG,TI_flags+THREAD_INFO(%rsp,RIP-ARGOFFSET) > > >> >> +#endif > > >> > > >> That's kind of expensive. What's the !SMP part for? > > > > > > smp_send_reschedule() is NOP on UP. There is no problem. > > > > Shouldn't it be #if !defined(CONFIG_PREEMPT) && defined(CONFIG_SMP) then? > > Definitely, thanks. 
> > > > > > > > >> > > >> >> testl $_TIF_WORK_SYSCALL_ENTRY,TI_flags+THREAD_INFO(%rsp,RIP-ARGOFFSET) > > >> >> jnz tracesys > > >> >> system_call_fastpath: > > >> >> @@ -427,6 +434,9 @@ GLOBAL(system_call_after_swapgs) > > >> >> * Has incomplete stack frame and undefined top of stack. > > >> >> */ > > >> >> ret_from_sys_call: > > >> >> +#if !defined(CONFIG_PREEMPT) || !defined(SMP) > > >> >> + LOCK ; btr $TIF_POLLING_NRFLAG,TI_flags+THREAD_INFO(%rsp,RIP-ARGOFFSET) > > >> >> +#endif > > >> > > >> If only it were this simple. There are lots of ways out of syscalls, > > >> and this is only one of them :( If we did this, I'd rather do it > > >> through the do_notify_resume mechanism or something. > > > > > > Yes, syscall is the only thing I did as an example. > > > > > >> I don't see any way to do this without at least one atomic op or > > >> smp_mb per syscall, and that's kind of expensive. > > > > > > JFI, doesn't x86 set_bit() lock a small area of memory? I thought > > > it's not very expensive on this arch (some bus optimizations or > > > something like this). > > > > An entire syscall on x86 is well under 200 cycles. lock addl is >20 > > cycles for me, and I don't see why the atomic bitops would be faster. > > (Oddly, mfence is slower than lock addl, which is really odd, since > > lock addl implies mfence.) So this overhead may actually matter. > > Yeah, it's really big overhead. > > > > > > >> Would it make sense to try to use context tracking instead? On > > >> systems that use context tracking, syscalls are already expensive, and > > >> we're already keeping track of which CPUs are in user mode. > > > > > > I'll look at context_tracking, but I'm not sure some smp synchronization > > > there. > > > > It could be combinable with existing synchronization there. > > I'll look at this. Thanks! Сontinuing the theme. I've tried the idea with RCU. Fast & dirty patch which prevents unnecessary interrupts. It prevents 2% of reschedule IPIs. 
The cost is an atomic_read() in resched_curr(). It looks like the gain is not very large... Kirill ^ permalink raw reply [flat|nested] 7+ messages in thread