linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH] ksoftirqd: Enable IRQs and call cond_resched() before poking RCU
@ 2015-01-07  1:37 Calvin Owens
  2015-01-07  1:49 ` [PATCH v2] " Calvin Owens
  0 siblings, 1 reply; 19+ messages in thread
From: Calvin Owens @ 2015-01-07  1:37 UTC (permalink / raw)
  To: Thomas Gleixner, Andrew Morton, Joe Perches, Paul E. McKenney,
	Peter Zijlstra
  Cc: linux-kernel, kernel-team, Calvin Owens

While debugging an issue with excessive softirq usage, I encountered the
following note in commit 3e339b5dae24a706 ("softirq: Use hotplug thread
infrastructure"):

    [ paulmck: Call rcu_note_context_switch() with interrupts enabled. ]

...but despite this note, the patch still calls RCU with IRQs disabled.

This seemingly innocuous change caused a significant regression in softirq
CPU usage on the sending side of a large TCP transfer (~1 GB/s): when
introducing 0.01% packet loss, the softirq usage would jump to around 25%,
spiking as high as 50%. Before the change, the usage would never exceed 5%.

Moving the call to rcu_note_context_switch() after the cond_sched() call,
as it was originally before the hotplug patch, completely eliminated this
problem.

Signed-off-by: Calvin Owens <calvinowens@fb.com>
---
 kernel/softirq.c | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/kernel/softirq.c b/kernel/softirq.c
index 501baa9..9e787d8 100644
--- a/kernel/softirq.c
+++ b/kernel/softirq.c
@@ -656,9 +656,13 @@ static void run_ksoftirqd(unsigned int cpu)
 		 * in the task stack here.
 		 */
 		__do_softirq();
-		rcu_note_context_switch();
 		local_irq_enable();
 		cond_resched();
+
+		preempt_disable();
+		rcu_note_context_switch(cpu);
+		preempt_enable();
+
 		return;
 	}
 	local_irq_enable();
-- 
2.1.1


^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [PATCH v2] ksoftirqd: Enable IRQs and call cond_resched() before poking RCU
  2015-01-07  1:37 [PATCH] ksoftirqd: Enable IRQs and call cond_resched() before poking RCU Calvin Owens
@ 2015-01-07  1:49 ` Calvin Owens
  2015-01-07  2:19   ` Paul E. McKenney
  0 siblings, 1 reply; 19+ messages in thread
From: Calvin Owens @ 2015-01-07  1:49 UTC (permalink / raw)
  To: Thomas Gleixner, Andrew Morton, Joe Perches, Paul E. McKenney,
	Peter Zijlstra
  Cc: linux-kernel, kernel-team, calvinowens

While debugging an issue with excessive softirq usage, I encountered the
following note in commit 3e339b5dae24a706 ("softirq: Use hotplug thread
infrastructure"):

    [ paulmck: Call rcu_note_context_switch() with interrupts enabled. ]

...but despite this note, the patch still calls RCU with IRQs disabled.

This seemingly innocuous change caused a significant regression in softirq
CPU usage on the sending side of a large TCP transfer (~1 GB/s): when
introducing 0.01% packet loss, the softirq usage would jump to around 25%,
spiking as high as 50%. Before the change, the usage would never exceed 5%.

Moving the call to rcu_note_context_switch() after the cond_sched() call,
as it was originally before the hotplug patch, completely eliminated this
problem.

Signed-off-by: Calvin Owens <calvinowens@fb.com>
---
Changes since v1:
	I mixed up the kernel versions I was patching against, sorry!

 kernel/softirq.c | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/kernel/softirq.c b/kernel/softirq.c
index 501baa9..9e787d8 100644
--- a/kernel/softirq.c
+++ b/kernel/softirq.c
@@ -656,9 +656,13 @@ static void run_ksoftirqd(unsigned int cpu)
 		 * in the task stack here.
 		 */
 		__do_softirq();
-		rcu_note_context_switch();
 		local_irq_enable();
 		cond_resched();
+
+		preempt_disable();
+		rcu_note_context_switch();
+		preempt_enable();
+
 		return;
 	}
 	local_irq_enable();
-- 
2.1.1


^ permalink raw reply related	[flat|nested] 19+ messages in thread

* Re: [PATCH v2] ksoftirqd: Enable IRQs and call cond_resched() before poking RCU
  2015-01-07  1:49 ` [PATCH v2] " Calvin Owens
@ 2015-01-07  2:19   ` Paul E. McKenney
  2015-01-07 16:52     ` Paul E. McKenney
  0 siblings, 1 reply; 19+ messages in thread
From: Paul E. McKenney @ 2015-01-07  2:19 UTC (permalink / raw)
  To: Calvin Owens
  Cc: Thomas Gleixner, Andrew Morton, Joe Perches, Peter Zijlstra,
	linux-kernel, kernel-team

On Tue, Jan 06, 2015 at 05:49:06PM -0800, Calvin Owens wrote:
> While debugging an issue with excessive softirq usage, I encountered the
> following note in commit 3e339b5dae24a706 ("softirq: Use hotplug thread
> infrastructure"):
> 
>     [ paulmck: Call rcu_note_context_switch() with interrupts enabled. ]
> 
> ...but despite this note, the patch still calls RCU with IRQs disabled.
> 
> This seemingly innocuous change caused a significant regression in softirq
> CPU usage on the sending side of a large TCP transfer (~1 GB/s): when
> introducing 0.01% packet loss, the softirq usage would jump to around 25%,
> spiking as high as 50%. Before the change, the usage would never exceed 5%.
> 
> Moving the call to rcu_note_context_switch() after the cond_sched() call,
> as it was originally before the hotplug patch, completely eliminated this
> problem.
> 
> Signed-off-by: Calvin Owens <calvinowens@fb.com>
> ---
> Changes since v1:
> 	I mixed up the kernel versions I was patching against, sorry!
> 
>  kernel/softirq.c | 6 +++++-
>  1 file changed, 5 insertions(+), 1 deletion(-)
> 
> diff --git a/kernel/softirq.c b/kernel/softirq.c
> index 501baa9..9e787d8 100644
> --- a/kernel/softirq.c
> +++ b/kernel/softirq.c
> @@ -656,9 +656,13 @@ static void run_ksoftirqd(unsigned int cpu)
>  		 * in the task stack here.
>  		 */
>  		__do_softirq();
> -		rcu_note_context_switch();
>  		local_irq_enable();
>  		cond_resched();

If this is for 3.20, we can just replace cond_resched() with
cond_resched_rcu_qs(), and get rid of the direct call to
rcu_note_context_switch().  This has the benefit of avoiding
needless rcu_note_context_switch() overhead if cond_resched()
actually did a reschedule.

But don't try it in 3.19 or earlier.  ;-)

							Thanx, Paul

> +
> +		preempt_disable();
> +		rcu_note_context_switch();
> +		preempt_enable();
> +
>  		return;
>  	}
>  	local_irq_enable();
> -- 
> 2.1.1
> 


^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v2] ksoftirqd: Enable IRQs and call cond_resched() before poking RCU
  2015-01-07  2:19   ` Paul E. McKenney
@ 2015-01-07 16:52     ` Paul E. McKenney
  2015-01-08  4:33       ` Calvin Owens
  0 siblings, 1 reply; 19+ messages in thread
From: Paul E. McKenney @ 2015-01-07 16:52 UTC (permalink / raw)
  To: Calvin Owens
  Cc: Thomas Gleixner, Andrew Morton, Joe Perches, Peter Zijlstra,
	linux-kernel, kernel-team

On Tue, Jan 06, 2015 at 06:19:26PM -0800, Paul E. McKenney wrote:
> On Tue, Jan 06, 2015 at 05:49:06PM -0800, Calvin Owens wrote:
> > While debugging an issue with excessive softirq usage, I encountered the
> > following note in commit 3e339b5dae24a706 ("softirq: Use hotplug thread
> > infrastructure"):
> > 
> >     [ paulmck: Call rcu_note_context_switch() with interrupts enabled. ]
> > 
> > ...but despite this note, the patch still calls RCU with IRQs disabled.
> > 
> > This seemingly innocuous change caused a significant regression in softirq
> > CPU usage on the sending side of a large TCP transfer (~1 GB/s): when
> > introducing 0.01% packet loss, the softirq usage would jump to around 25%,
> > spiking as high as 50%. Before the change, the usage would never exceed 5%.
> > 
> > Moving the call to rcu_note_context_switch() after the cond_sched() call,
> > as it was originally before the hotplug patch, completely eliminated this
> > problem.
> > 
> > Signed-off-by: Calvin Owens <calvinowens@fb.com>
> > ---
> > Changes since v1:
> > 	I mixed up the kernel versions I was patching against, sorry!
> > 
> >  kernel/softirq.c | 6 +++++-
> >  1 file changed, 5 insertions(+), 1 deletion(-)
> > 
> > diff --git a/kernel/softirq.c b/kernel/softirq.c
> > index 501baa9..9e787d8 100644
> > --- a/kernel/softirq.c
> > +++ b/kernel/softirq.c
> > @@ -656,9 +656,13 @@ static void run_ksoftirqd(unsigned int cpu)
> >  		 * in the task stack here.
> >  		 */
> >  		__do_softirq();
> > -		rcu_note_context_switch();
> >  		local_irq_enable();
> >  		cond_resched();
> 
> If this is for 3.20, we can just replace cond_resched() with
> cond_resched_rcu_qs(), and get rid of the direct call to
> rcu_note_context_switch().  This has the benefit of avoiding
> needless rcu_note_context_switch() overhead if cond_resched()
> actually did a reschedule.
> 
> But don't try it in 3.19 or earlier.  ;-)

As in the following for 3.20.  Does this version work for you?

							Thanx, Paul

------------------------------------------------------------------------

ksoftirqd: Enable IRQs and call cond_resched() before poking RCU

While debugging an issue with excessive softirq usage, I encountered the
following note in commit 3e339b5dae24a706 ("softirq: Use hotplug thread
infrastructure"):
    
        [ paulmck: Call rcu_note_context_switch() with interrupts enabled. ]
    
...but despite this note, the patch still calls RCU with IRQs disabled.

This seemingly innocuous change caused a significant regression in softirq
CPU usage on the sending side of a large TCP transfer (~1 GB/s): when
introducing 0.01% packet loss, the softirq usage would jump to around 25%,
spiking as high as 50%. Before the change, the usage would never exceed 5%.

Moving the call to rcu_note_context_switch() after the cond_sched() call,
as it was originally before the hotplug patch, completely eliminated this
problem, but the new cond_resched_rcu_qs() provides shorter code and
avoids double RCU notification in the case where cond_resched() really
did a context switch.

Signed-off-by: Calvin Owens <calvinowens@fb.com>
[ paulmck: Substituted shiny new cond_resched_rcu_qs() primitive. ]
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>

diff --git a/kernel/softirq.c b/kernel/softirq.c
index 501baa9ac1be..8cdb98847c7b 100644
--- a/kernel/softirq.c
+++ b/kernel/softirq.c
@@ -656,9 +656,8 @@ static void run_ksoftirqd(unsigned int cpu)
 		 * in the task stack here.
 		 */
 		__do_softirq();
-		rcu_note_context_switch();
 		local_irq_enable();
-		cond_resched();
+		cond_resched_rcu_qs();
 		return;
 	}
 	local_irq_enable();


^ permalink raw reply related	[flat|nested] 19+ messages in thread

* Re: [PATCH v2] ksoftirqd: Enable IRQs and call cond_resched() before poking RCU
  2015-01-07 16:52     ` Paul E. McKenney
@ 2015-01-08  4:33       ` Calvin Owens
  2015-01-08  4:53         ` Paul E. McKenney
  0 siblings, 1 reply; 19+ messages in thread
From: Calvin Owens @ 2015-01-08  4:33 UTC (permalink / raw)
  To: Paul E. McKenney
  Cc: Thomas Gleixner, Andrew Morton, Joe Perches, Peter Zijlstra,
	linux-kernel, kernel-team, calvinowens

On Wednesday 01/07 at 08:52 -0800, Paul E. McKenney wrote:
> On Tue, Jan 06, 2015 at 06:19:26PM -0800, Paul E. McKenney wrote:
> > On Tue, Jan 06, 2015 at 05:49:06PM -0800, Calvin Owens wrote:
> > > While debugging an issue with excessive softirq usage, I encountered the
> > > following note in commit 3e339b5dae24a706 ("softirq: Use hotplug thread
> > > infrastructure"):
> > > 
> > >     [ paulmck: Call rcu_note_context_switch() with interrupts enabled. ]
> > > 
> > > ...but despite this note, the patch still calls RCU with IRQs disabled.
> > > 
> > > This seemingly innocuous change caused a significant regression in softirq
> > > CPU usage on the sending side of a large TCP transfer (~1 GB/s): when
> > > introducing 0.01% packet loss, the softirq usage would jump to around 25%,
> > > spiking as high as 50%. Before the change, the usage would never exceed 5%.
> > > 
> > > Moving the call to rcu_note_context_switch() after the cond_sched() call,
> > > as it was originally before the hotplug patch, completely eliminated this
> > > problem.
> > > 
> > > Signed-off-by: Calvin Owens <calvinowens@fb.com>
> > > ---
> > > Changes since v1:
> > > 	I mixed up the kernel versions I was patching against, sorry!
> > > 
> > >  kernel/softirq.c | 6 +++++-
> > >  1 file changed, 5 insertions(+), 1 deletion(-)
> > > 
> > > diff --git a/kernel/softirq.c b/kernel/softirq.c
> > > index 501baa9..9e787d8 100644
> > > --- a/kernel/softirq.c
> > > +++ b/kernel/softirq.c
> > > @@ -656,9 +656,13 @@ static void run_ksoftirqd(unsigned int cpu)
> > >  		 * in the task stack here.
> > >  		 */
> > >  		__do_softirq();
> > > -		rcu_note_context_switch();
> > >  		local_irq_enable();
> > >  		cond_resched();
> > 
> > If this is for 3.20, we can just replace cond_resched() with
> > cond_resched_rcu_qs(), and get rid of the direct call to
> > rcu_note_context_switch().  This has the benefit of avoiding
> > needless rcu_note_context_switch() overhead if cond_resched()
> > actually did a reschedule.
> > 
> > But don't try it in 3.19 or earlier.  ;-)
> 
> As in the following for 3.20.  Does this version work for you?

That's great, thanks :)

Should this go to stable as well? It is technically a regression, albeit
a rather long-standing one.

My original scenario was a bit contrived, but I tested this on some real
loads and it makes a difference: on a heavily loaded Proxygen server,
the aggregate softirq CPU usage decreases by roughly 10% (relative)
given the same amount of traffic with the patch. It also produces
statistically significant performance wins at higher loads on
webservers: about a 1% reduction in overall CPU utilization and improved
latency metrics.

Thanks,
Calvin
 
> 							Thanx, Paul
> 
> ------------------------------------------------------------------------
> 
> ksoftirqd: Enable IRQs and call cond_resched() before poking RCU
> 
> While debugging an issue with excessive softirq usage, I encountered the
> following note in commit 3e339b5dae24a706 ("softirq: Use hotplug thread
> infrastructure"):
>     
>         [ paulmck: Call rcu_note_context_switch() with interrupts enabled. ]
>     
> ...but despite this note, the patch still calls RCU with IRQs disabled.
> 
> This seemingly innocuous change caused a significant regression in softirq
> CPU usage on the sending side of a large TCP transfer (~1 GB/s): when
> introducing 0.01% packet loss, the softirq usage would jump to around 25%,
> spiking as high as 50%. Before the change, the usage would never exceed 5%.
> 
> Moving the call to rcu_note_context_switch() after the cond_sched() call,
> as it was originally before the hotplug patch, completely eliminated this
> problem, but the new cond_resched_rcu_qs() provides shorter code and
> avoids double RCU notification in the case where cond_resched() really
> did a context switch.
> 
> Signed-off-by: Calvin Owens <calvinowens@fb.com>
> [ paulmck: Substituted shiny new cond_resched_rcu_qs() primitive. ]
> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
> 
> diff --git a/kernel/softirq.c b/kernel/softirq.c
> index 501baa9ac1be..8cdb98847c7b 100644
> --- a/kernel/softirq.c
> +++ b/kernel/softirq.c
> @@ -656,9 +656,8 @@ static void run_ksoftirqd(unsigned int cpu)
>  		 * in the task stack here.
>  		 */
>  		__do_softirq();
> -		rcu_note_context_switch();
>  		local_irq_enable();
> -		cond_resched();
> +		cond_resched_rcu_qs();
>  		return;
>  	}
>  	local_irq_enable();
> 

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v2] ksoftirqd: Enable IRQs and call cond_resched() before poking RCU
  2015-01-08  4:33       ` Calvin Owens
@ 2015-01-08  4:53         ` Paul E. McKenney
  2015-01-08 21:46           ` Calvin Owens
  0 siblings, 1 reply; 19+ messages in thread
From: Paul E. McKenney @ 2015-01-08  4:53 UTC (permalink / raw)
  To: Calvin Owens
  Cc: Thomas Gleixner, Andrew Morton, Joe Perches, Peter Zijlstra,
	linux-kernel, kernel-team

On Wed, Jan 07, 2015 at 08:33:29PM -0800, Calvin Owens wrote:
> On Wednesday 01/07 at 08:52 -0800, Paul E. McKenney wrote:
> > On Tue, Jan 06, 2015 at 06:19:26PM -0800, Paul E. McKenney wrote:
> > > On Tue, Jan 06, 2015 at 05:49:06PM -0800, Calvin Owens wrote:
> > > > While debugging an issue with excessive softirq usage, I encountered the
> > > > following note in commit 3e339b5dae24a706 ("softirq: Use hotplug thread
> > > > infrastructure"):
> > > > 
> > > >     [ paulmck: Call rcu_note_context_switch() with interrupts enabled. ]
> > > > 
> > > > ...but despite this note, the patch still calls RCU with IRQs disabled.
> > > > 
> > > > This seemingly innocuous change caused a significant regression in softirq
> > > > CPU usage on the sending side of a large TCP transfer (~1 GB/s): when
> > > > introducing 0.01% packet loss, the softirq usage would jump to around 25%,
> > > > spiking as high as 50%. Before the change, the usage would never exceed 5%.
> > > > 
> > > > Moving the call to rcu_note_context_switch() after the cond_sched() call,
> > > > as it was originally before the hotplug patch, completely eliminated this
> > > > problem.
> > > > 
> > > > Signed-off-by: Calvin Owens <calvinowens@fb.com>
> > > > ---
> > > > Changes since v1:
> > > > 	I mixed up the kernel versions I was patching against, sorry!
> > > > 
> > > >  kernel/softirq.c | 6 +++++-
> > > >  1 file changed, 5 insertions(+), 1 deletion(-)
> > > > 
> > > > diff --git a/kernel/softirq.c b/kernel/softirq.c
> > > > index 501baa9..9e787d8 100644
> > > > --- a/kernel/softirq.c
> > > > +++ b/kernel/softirq.c
> > > > @@ -656,9 +656,13 @@ static void run_ksoftirqd(unsigned int cpu)
> > > >  		 * in the task stack here.
> > > >  		 */
> > > >  		__do_softirq();
> > > > -		rcu_note_context_switch();
> > > >  		local_irq_enable();
> > > >  		cond_resched();
> > > 
> > > If this is for 3.20, we can just replace cond_resched() with
> > > cond_resched_rcu_qs(), and get rid of the direct call to
> > > rcu_note_context_switch().  This has the benefit of avoiding
> > > needless rcu_note_context_switch() overhead if cond_resched()
> > > actually did a reschedule.
> > > 
> > > But don't try it in 3.19 or earlier.  ;-)
> > 
> > As in the following for 3.20.  Does this version work for you?
> 
> That's great, thanks :)
> 
> Should this go to stable as well? It is technically a regression, albeit
> a rather long-standing one.

Your original patch could go into stable, but my updated version depends
on functionality not present before 3.20.

> My original scenario was a bit contrived, but I tested this on some real
> loads and it makes a difference: on a heavily loaded Proxygen server,
> the aggregate softirq CPU usage decreases by roughly 10% (relative)
> given the same amount of traffic with the patch. It also produces
> statistically significant performance wins at higher loads on
> webservers: about a 1% reduction in overall CPU utilization and improved
> latency metrics.

OK, good to know.  I have added this information to the commit log, please
see below for updated commit.

							Thanx, Paul

------------------------------------------------------------------------

ksoftirqd: Enable IRQs and call cond_resched() before poking RCU

While debugging an issue with excessive softirq usage, I encountered the
following note in commit 3e339b5dae24a706 ("softirq: Use hotplug thread
infrastructure"):
    
        [ paulmck: Call rcu_note_context_switch() with interrupts enabled. ]
    
...but despite this note, the patch still calls RCU with IRQs disabled.

This seemingly innocuous change caused a significant regression in softirq
CPU usage on the sending side of a large TCP transfer (~1 GB/s): when
introducing 0.01% packet loss, the softirq usage would jump to around
25%, spiking as high as 50%. Before the change, the usage would never
exceed 5%.  On a heavily loaded Proxygen server, the aggregate softirq
CPU usage decreases by roughly 10% (relative) given the same amount
of traffic with the patch. It also produces statistically significant
performance wins at higher loads on webservers: about a 1% reduction in
overall CPU utilization and improved latency metrics.

Moving the call to rcu_note_context_switch() after the cond_sched() call,
as it was originally before the hotplug patch, completely eliminated this
problem, but the new cond_resched_rcu_qs() provides shorter code and
avoids double RCU notification in the case where cond_resched() really
did a context switch.

Signed-off-by: Calvin Owens <calvinowens@fb.com>
[ paulmck: Substituted shiny new cond_resched_rcu_qs() primitive. ]
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
[ paulmck: Added Calvin's measurements on Proxygen server and webservers. ]

diff --git a/kernel/softirq.c b/kernel/softirq.c
index 501baa9ac1be..8cdb98847c7b 100644
--- a/kernel/softirq.c
+++ b/kernel/softirq.c
@@ -656,9 +656,8 @@ static void run_ksoftirqd(unsigned int cpu)
 		 * in the task stack here.
 		 */
 		__do_softirq();
-		rcu_note_context_switch();
 		local_irq_enable();
-		cond_resched();
+		cond_resched_rcu_qs();
 		return;
 	}
 	local_irq_enable();


^ permalink raw reply related	[flat|nested] 19+ messages in thread

* Re: [PATCH v2] ksoftirqd: Enable IRQs and call cond_resched() before poking RCU
  2015-01-08  4:53         ` Paul E. McKenney
@ 2015-01-08 21:46           ` Calvin Owens
  2015-01-13 11:17             ` Thomas Gleixner
  0 siblings, 1 reply; 19+ messages in thread
From: Calvin Owens @ 2015-01-08 21:46 UTC (permalink / raw)
  To: Paul E. McKenney
  Cc: Thomas Gleixner, Andrew Morton, Joe Perches, Peter Zijlstra,
	linux-kernel, kernel-team

On Wednesday 01/07 at 20:53 -0800, Paul E. McKenney wrote:
> On Wed, Jan 07, 2015 at 08:33:29PM -0800, Calvin Owens wrote:
> > On Wednesday 01/07 at 08:52 -0800, Paul E. McKenney wrote:
> > > On Tue, Jan 06, 2015 at 06:19:26PM -0800, Paul E. McKenney wrote:
> > > > On Tue, Jan 06, 2015 at 05:49:06PM -0800, Calvin Owens wrote:
> > > > > While debugging an issue with excessive softirq usage, I encountered the
> > > > > following note in commit 3e339b5dae24a706 ("softirq: Use hotplug thread
> > > > > infrastructure"):
> > > > > 
> > > > >     [ paulmck: Call rcu_note_context_switch() with interrupts enabled. ]
> > > > > 
> > > > > ...but despite this note, the patch still calls RCU with IRQs disabled.
> > > > > 
> > > > > This seemingly innocuous change caused a significant regression in softirq
> > > > > CPU usage on the sending side of a large TCP transfer (~1 GB/s): when
> > > > > introducing 0.01% packet loss, the softirq usage would jump to around 25%,
> > > > > spiking as high as 50%. Before the change, the usage would never exceed 5%.
> > > > > 
> > > > > Moving the call to rcu_note_context_switch() after the cond_sched() call,
> > > > > as it was originally before the hotplug patch, completely eliminated this
> > > > > problem.
> > > > > 
> > > > > Signed-off-by: Calvin Owens <calvinowens@fb.com>
> > > > > ---
> > > > > Changes since v1:
> > > > > 	I mixed up the kernel versions I was patching against, sorry!
> > > > > 
> > > > >  kernel/softirq.c | 6 +++++-
> > > > >  1 file changed, 5 insertions(+), 1 deletion(-)
> > > > > 
> > > > > diff --git a/kernel/softirq.c b/kernel/softirq.c
> > > > > index 501baa9..9e787d8 100644
> > > > > --- a/kernel/softirq.c
> > > > > +++ b/kernel/softirq.c
> > > > > @@ -656,9 +656,13 @@ static void run_ksoftirqd(unsigned int cpu)
> > > > >  		 * in the task stack here.
> > > > >  		 */
> > > > >  		__do_softirq();
> > > > > -		rcu_note_context_switch();
> > > > >  		local_irq_enable();
> > > > >  		cond_resched();
> > > > 
> > > > If this is for 3.20, we can just replace cond_resched() with
> > > > cond_resched_rcu_qs(), and get rid of the direct call to
> > > > rcu_note_context_switch().  This has the benefit of avoiding
> > > > needless rcu_note_context_switch() overhead if cond_resched()
> > > > actually did a reschedule.
> > > > 
> > > > But don't try it in 3.19 or earlier.  ;-)
> > > 
> > > As in the following for 3.20.  Does this version work for you?
> > 
> > That's great, thanks :)
> > 
> > Should this go to stable as well? It is technically a regression, albeit
> > a rather long-standing one.
> 
> Your original patch could go into stable, but my updated version depends
> on functionality not present before 3.20.

Right. I'll wait until 3.20-rc1, and then I'll send the original to
stable with a reference to the commit.
 
> > My original scenario was a bit contrived, but I tested this on some real
> > loads and it makes a difference: on a heavily loaded Proxygen server,
> > the aggregate softirq CPU usage decreases by roughly 10% (relative)
> > given the same amount of traffic with the patch. It also produces
> > statistically significant performance wins at higher loads on
> > webservers: about a 1% reduction in overall CPU utilization and improved
> > latency metrics.
> 
> OK, good to know.  I have added this information to the commit log, please
> see below for updated commit.

Looks good!

Thanks,
Calvin

> 							Thanx, Paul
> 
> ------------------------------------------------------------------------
> 
> ksoftirqd: Enable IRQs and call cond_resched() before poking RCU
> 
> While debugging an issue with excessive softirq usage, I encountered the
> following note in commit 3e339b5dae24a706 ("softirq: Use hotplug thread
> infrastructure"):
>     
>         [ paulmck: Call rcu_note_context_switch() with interrupts enabled. ]
>     
> ...but despite this note, the patch still calls RCU with IRQs disabled.
> 
> This seemingly innocuous change caused a significant regression in softirq
> CPU usage on the sending side of a large TCP transfer (~1 GB/s): when
> introducing 0.01% packet loss, the softirq usage would jump to around
> 25%, spiking as high as 50%. Before the change, the usage would never
> exceed 5%.  On a heavily loaded Proxygen server, the aggregate softirq
> CPU usage decreases by roughly 10% (relative) given the same amount
> of traffic with the patch. It also produces statistically significant
> performance wins at higher loads on webservers: about a 1% reduction in
> overall CPU utilization and improved latency metrics.
> 
> Moving the call to rcu_note_context_switch() after the cond_sched() call,
> as it was originally before the hotplug patch, completely eliminated this
> problem, but the new cond_resched_rcu_qs() provides shorter code and
> avoids double RCU notification in the case where cond_resched() really
> did a context switch.
> 
> Signed-off-by: Calvin Owens <calvinowens@fb.com>
> [ paulmck: Substituted shiny new cond_resched_rcu_qs() primitive. ]
> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
> [ paulmck: Added Calvin's measurements on Proxygen server and webservers. ]
> 
> diff --git a/kernel/softirq.c b/kernel/softirq.c
> index 501baa9ac1be..8cdb98847c7b 100644
> --- a/kernel/softirq.c
> +++ b/kernel/softirq.c
> @@ -656,9 +656,8 @@ static void run_ksoftirqd(unsigned int cpu)
>  		 * in the task stack here.
>  		 */
>  		__do_softirq();
> -		rcu_note_context_switch();
>  		local_irq_enable();
> -		cond_resched();
> +		cond_resched_rcu_qs();
>  		return;
>  	}
>  	local_irq_enable();
> 

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v2] ksoftirqd: Enable IRQs and call cond_resched() before poking RCU
  2015-01-08 21:46           ` Calvin Owens
@ 2015-01-13 11:17             ` Thomas Gleixner
  2015-01-13 18:43               ` Paul E. McKenney
  0 siblings, 1 reply; 19+ messages in thread
From: Thomas Gleixner @ 2015-01-13 11:17 UTC (permalink / raw)
  To: Calvin Owens
  Cc: Paul E. McKenney, Andrew Morton, Joe Perches, Peter Zijlstra,
	linux-kernel, kernel-team

On Thu, 8 Jan 2015, Calvin Owens wrote:
> On Wednesday 01/07 at 20:53 -0800, Paul E. McKenney wrote:
> > > Should this go to stable as well? It is technically a regression, albeit
> > > a rather long-standing one.
> > 
> > Your original patch could go into stable, but my updated version depends
> > on functionality not present before 3.20.
> 
> Right. I'll wait until 3.20-rc1, and then I'll send the original to
> stable with a reference to the commit.

We better do that in two steps:

   1) The original patch tagged for stable
   2) Pauls version as a delta patch

Can one of you please provide such a series?

Thanks,

	tglx

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v2] ksoftirqd: Enable IRQs and call cond_resched() before poking RCU
  2015-01-13 11:17             ` Thomas Gleixner
@ 2015-01-13 18:43               ` Paul E. McKenney
  2015-01-13 21:16                 ` [PATCH] " Calvin Owens
  0 siblings, 1 reply; 19+ messages in thread
From: Paul E. McKenney @ 2015-01-13 18:43 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: Calvin Owens, Andrew Morton, Joe Perches, Peter Zijlstra,
	linux-kernel, kernel-team

On Tue, Jan 13, 2015 at 12:17:17PM +0100, Thomas Gleixner wrote:
> On Thu, 8 Jan 2015, Calvin Owens wrote:
> > On Wednesday 01/07 at 20:53 -0800, Paul E. McKenney wrote:
> > > > Should this go to stable as well? It is technically a regression, albeit
> > > > a rather long-standing one.
> > > 
> > > Your original patch could go into stable, but my updated version depends
> > > on functionality not present before 3.20.
> > 
> > Right. I'll wait until 3.20-rc1, and then I'll send the original to
> > stable with a reference to the commit.
> 
> We better do that in two steps:
> 
>    1) The original patch tagged for stable
>    2) Pauls version as a delta patch
> 
> Can one of you please provide such a series?

Carl, could you please send me the original with the proper -stable
reference?  I will then queue that one and rebase mine on top of it.

							Thanx, Paul


^ permalink raw reply	[flat|nested] 19+ messages in thread

* [PATCH] ksoftirqd: Enable IRQs and call cond_resched() before poking RCU
  2015-01-13 18:43               ` Paul E. McKenney
@ 2015-01-13 21:16                 ` Calvin Owens
  2015-01-14 22:12                   ` Paul E. McKenney
  2015-01-20 13:21                   ` Thomas Gleixner
  0 siblings, 2 replies; 19+ messages in thread
From: Calvin Owens @ 2015-01-13 21:16 UTC (permalink / raw)
  To: Paul E. McKenney
  Cc: Thomas Gleixner, Andrew Morton, Joe Perches, Peter Zijlstra,
	linux-kernel, kernel-team, calvinowens

While debugging an issue with excessive softirq usage, I encountered the
following note in commit 3e339b5dae24a706 ("softirq: Use hotplug thread
infrastructure"):

    [ paulmck: Call rcu_note_context_switch() with interrupts enabled. ]

...but despite this note, the patch still calls RCU with IRQs disabled.

This seemingly innocuous change caused a significant regression in softirq
CPU usage on the sending side of a large TCP transfer (~1 GB/s): when
introducing 0.01% packet loss, the softirq usage would jump to around 25%,
spiking as high as 50%. Before the change, the usage would never exceed 5%.

Moving the call to rcu_note_context_switch() after the cond_sched() call,
as it was originally before the hotplug patch, completely eliminated this
problem.

Signed-off-by: Calvin Owens <calvinowens@fb.com>
Cc: stable@vger.kernel.org
---
This version includes the "cpu" argument to rcu_note_context_switch() in
order to apply cleanly to stable kernels. It will need to be removed to
apply to 3.18+ and 3.19 (upstream commit 38200cf2 removed the argument).

 kernel/softirq.c | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/kernel/softirq.c b/kernel/softirq.c
index 501baa9..9e787d8 100644
--- a/kernel/softirq.c
+++ b/kernel/softirq.c
@@ -656,9 +656,13 @@ static void run_ksoftirqd(unsigned int cpu)
 		 * in the task stack here.
 		 */
 		__do_softirq();
-		rcu_note_context_switch(cpu);
 		local_irq_enable();
 		cond_resched();
+
+		preempt_disable();
+		rcu_note_context_switch(cpu);
+		preempt_enable();
+
 		return;
 	}
 	local_irq_enable();
-- 
2.1.1


^ permalink raw reply related	[flat|nested] 19+ messages in thread

* Re: [PATCH] ksoftirqd: Enable IRQs and call cond_resched() before poking RCU
  2015-01-13 21:16                 ` [PATCH] " Calvin Owens
@ 2015-01-14 22:12                   ` Paul E. McKenney
  2015-01-20 13:21                   ` Thomas Gleixner
  1 sibling, 0 replies; 19+ messages in thread
From: Paul E. McKenney @ 2015-01-14 22:12 UTC (permalink / raw)
  To: Calvin Owens
  Cc: Thomas Gleixner, Andrew Morton, Joe Perches, Peter Zijlstra,
	linux-kernel, kernel-team

On Tue, Jan 13, 2015 at 01:16:18PM -0800, Calvin Owens wrote:
> While debugging an issue with excessive softirq usage, I encountered the
> following note in commit 3e339b5dae24a706 ("softirq: Use hotplug thread
> infrastructure"):
> 
>     [ paulmck: Call rcu_note_context_switch() with interrupts enabled. ]
> 
> ...but despite this note, the patch still calls RCU with IRQs disabled.
> 
> This seemingly innocuous change caused a significant regression in softirq
> CPU usage on the sending side of a large TCP transfer (~1 GB/s): when
> introducing 0.01% packet loss, the softirq usage would jump to around 25%,
> spiking as high as 50%. Before the change, the usage would never exceed 5%.
> 
> Moving the call to rcu_note_context_switch() after the cond_sched() call,
> as it was originally before the hotplug patch, completely eliminated this
> problem.
> 
> Signed-off-by: Calvin Owens <calvinowens@fb.com>
> Cc: stable@vger.kernel.org

Applied, thank you!  I deleted the "cpu" arguments from the calls to
rcu_note_context_switch() below in order for the patch to apply.

I the put the patch following that on top, as Thomas suggested.

							Thanx, Paul

> ---
> This version includes the "cpu" argument to rcu_note_context_switch() in
> order to apply cleanly to stable kernels. It will need to be removed to
> apply to 3.18+ and 3.19 (upstream commit 38200cf2 removed the argument).
> 
>  kernel/softirq.c | 6 +++++-
>  1 file changed, 5 insertions(+), 1 deletion(-)
> 
> diff --git a/kernel/softirq.c b/kernel/softirq.c
> index 501baa9..9e787d8 100644
> --- a/kernel/softirq.c
> +++ b/kernel/softirq.c
> @@ -656,9 +656,13 @@ static void run_ksoftirqd(unsigned int cpu)
>  		 * in the task stack here.
>  		 */
>  		__do_softirq();
> -		rcu_note_context_switch(cpu);
>  		local_irq_enable();
>  		cond_resched();
> +
> +		preempt_disable();
> +		rcu_note_context_switch(cpu);
> +		preempt_enable();
> +
>  		return;
>  	}
>  	local_irq_enable();
> -- 
> 2.1.1

------------------------------------------------------------------------

ksoftirqd: Use new cond_resched_rcu_qs() function

Simplify run_ksoftirqd() by using the new cond_resched_rcu_qs() function
that conditionally reschedules, but unconditionally supplies an RCU
quiescent state.  This commit is separate from the previous commit by
Calvin Owens because Calvin's approach can be backported, while this
commit cannot be.  The reason that this commit cannot be backported is
that cond_resched_rcu_qs() does not always provide the needed quiescent
state in earlier kernels.

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>

diff --git a/kernel/softirq.c b/kernel/softirq.c
index c497fcdf0d1e..8cdb98847c7b 100644
--- a/kernel/softirq.c
+++ b/kernel/softirq.c
@@ -657,12 +657,7 @@ static void run_ksoftirqd(unsigned int cpu)
 		 */
 		__do_softirq();
 		local_irq_enable();
-		cond_resched();
-
-		preempt_disable();
-		rcu_note_context_switch();
-		preempt_enable();
-
+		cond_resched_rcu_qs();
 		return;
 	}
 	local_irq_enable();


^ permalink raw reply related	[flat|nested] 19+ messages in thread

* Re: [PATCH] ksoftirqd: Enable IRQs and call cond_resched() before poking RCU
  2015-01-13 21:16                 ` [PATCH] " Calvin Owens
  2015-01-14 22:12                   ` Paul E. McKenney
@ 2015-01-20 13:21                   ` Thomas Gleixner
  2015-01-20 20:30                     ` Paul E. McKenney
  1 sibling, 1 reply; 19+ messages in thread
From: Thomas Gleixner @ 2015-01-20 13:21 UTC (permalink / raw)
  To: Calvin Owens
  Cc: Paul E. McKenney, Andrew Morton, Joe Perches, Peter Zijlstra,
	linux-kernel, kernel-team

On Tue, 13 Jan 2015, Calvin Owens wrote:

> While debugging an issue with excessive softirq usage, I encountered the
> following note in commit 3e339b5dae24a706 ("softirq: Use hotplug thread
> infrastructure"):
> 
>     [ paulmck: Call rcu_note_context_switch() with interrupts enabled. ]
> 
> ...but despite this note, the patch still calls RCU with IRQs disabled.
> 
> This seemingly innocuous change caused a significant regression in softirq
> CPU usage on the sending side of a large TCP transfer (~1 GB/s): when
> introducing 0.01% packet loss, the softirq usage would jump to around 25%,
> spiking as high as 50%. Before the change, the usage would never exceed 5%.
> 
> Moving the call to rcu_note_context_switch() after the cond_sched() call,
> as it was originally before the hotplug patch, completely eliminated this
> problem.
> 
> Signed-off-by: Calvin Owens <calvinowens@fb.com>
> Cc: stable@vger.kernel.org
> ---
> This version includes the "cpu" argument to rcu_note_context_switch() in
> order to apply cleanly to stable kernels. It will need to be removed to
> apply to 3.18+ and 3.19 (upstream commit 38200cf2 removed the argument).
> 
>  kernel/softirq.c | 6 +++++-
>  1 file changed, 5 insertions(+), 1 deletion(-)
> 
> diff --git a/kernel/softirq.c b/kernel/softirq.c
> index 501baa9..9e787d8 100644
> --- a/kernel/softirq.c
> +++ b/kernel/softirq.c
> @@ -656,9 +656,13 @@ static void run_ksoftirqd(unsigned int cpu)
>  		 * in the task stack here.
>  		 */
>  		__do_softirq();
> -		rcu_note_context_switch(cpu);
>  		local_irq_enable();
>  		cond_resched();
> +
> +		preempt_disable();
> +		rcu_note_context_switch(cpu);
> +		preempt_enable();
> +

The whole rcu_note_context_switch() in run_ksoftirqd() is silly.

    cond_resched()
	__preempt_count_add(PREEMPT_ACTIVE);

	__schedule();
	     preempt_disable();
	     rcu_note_context_switch();
	     ....

	__preempt_count_sub(PREEMPT_ACTIVE);

Thanks,

	tglx

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH] ksoftirqd: Enable IRQs and call cond_resched() before poking RCU
  2015-01-20 13:21                   ` Thomas Gleixner
@ 2015-01-20 20:30                     ` Paul E. McKenney
  2015-01-21  3:40                       ` Mike Galbraith
  2015-01-21  9:30                       ` Thomas Gleixner
  0 siblings, 2 replies; 19+ messages in thread
From: Paul E. McKenney @ 2015-01-20 20:30 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: Calvin Owens, Andrew Morton, Joe Perches, Peter Zijlstra,
	linux-kernel, kernel-team

On Tue, Jan 20, 2015 at 02:21:51PM +0100, Thomas Gleixner wrote:
> On Tue, 13 Jan 2015, Calvin Owens wrote:
> 
> > While debugging an issue with excessive softirq usage, I encountered the
> > following note in commit 3e339b5dae24a706 ("softirq: Use hotplug thread
> > infrastructure"):
> > 
> >     [ paulmck: Call rcu_note_context_switch() with interrupts enabled. ]
> > 
> > ...but despite this note, the patch still calls RCU with IRQs disabled.
> > 
> > This seemingly innocuous change caused a significant regression in softirq
> > CPU usage on the sending side of a large TCP transfer (~1 GB/s): when
> > introducing 0.01% packet loss, the softirq usage would jump to around 25%,
> > spiking as high as 50%. Before the change, the usage would never exceed 5%.
> > 
> > Moving the call to rcu_note_context_switch() after the cond_sched() call,
> > as it was originally before the hotplug patch, completely eliminated this
> > problem.
> > 
> > Signed-off-by: Calvin Owens <calvinowens@fb.com>
> > Cc: stable@vger.kernel.org
> > ---
> > This version includes the "cpu" argument to rcu_note_context_switch() in
> > order to apply cleanly to stable kernels. It will need to be removed to
> > apply to 3.18+ and 3.19 (upstream commit 38200cf2 removed the argument).
> > 
> >  kernel/softirq.c | 6 +++++-
> >  1 file changed, 5 insertions(+), 1 deletion(-)
> > 
> > diff --git a/kernel/softirq.c b/kernel/softirq.c
> > index 501baa9..9e787d8 100644
> > --- a/kernel/softirq.c
> > +++ b/kernel/softirq.c
> > @@ -656,9 +656,13 @@ static void run_ksoftirqd(unsigned int cpu)
> >  		 * in the task stack here.
> >  		 */
> >  		__do_softirq();
> > -		rcu_note_context_switch(cpu);
> >  		local_irq_enable();
> >  		cond_resched();
> > +
> > +		preempt_disable();
> > +		rcu_note_context_switch(cpu);
> > +		preempt_enable();
> > +
> 
> The whole rcu_note_context_switch() in run_ksoftirqd() is silly.
> 
>     cond_resched()
> 	__preempt_count_add(PREEMPT_ACTIVE);
> 
> 	__schedule();
> 	     preempt_disable();
> 	     rcu_note_context_switch();
> 	     ....
> 
> 	__preempt_count_sub(PREEMPT_ACTIVE);

I agree that if should_resched() returns true as assumed above, then there
is no point to invoking rcu_note_context_switch().  However, the case that
this code applies to is when should_resched() returns false, but RCU is
waiting for a quiescent state from the current CPU.  In that case,
cond_resched() won't do anything for RCU, and we do need the
rcu_note_context_switch().

Of course, it would be better to avoid the extra RCU work in the common
case where cond_resched() does inovke the scheduler.  And that is the
point of the following patch, which uses cond_resched_rcu_qs().
However, this use of cond_resched_rcu_qs() doesn't work in older
kernels.  So Calvin's patch is for backporting, and the follow-up
patch for future kernels.

Make sense, or am I missing something?

							Thanx, Paul


^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH] ksoftirqd: Enable IRQs and call cond_resched() before poking RCU
  2015-01-20 20:30                     ` Paul E. McKenney
@ 2015-01-21  3:40                       ` Mike Galbraith
  2015-01-21  5:10                         ` Paul E. McKenney
  2015-01-21  9:30                       ` Thomas Gleixner
  1 sibling, 1 reply; 19+ messages in thread
From: Mike Galbraith @ 2015-01-21  3:40 UTC (permalink / raw)
  To: paulmck
  Cc: Thomas Gleixner, Calvin Owens, Andrew Morton, Joe Perches,
	Peter Zijlstra, linux-kernel, kernel-team

On Tue, 2015-01-20 at 12:30 -0800, Paul E. McKenney wrote: 
> On Tue, Jan 20, 2015 at 02:21:51PM +0100, Thomas Gleixner wrote:

> > > diff --git a/kernel/softirq.c b/kernel/softirq.c
> > > index 501baa9..9e787d8 100644
> > > --- a/kernel/softirq.c
> > > +++ b/kernel/softirq.c
> > > @@ -656,9 +656,13 @@ static void run_ksoftirqd(unsigned int cpu)
> > >  		 * in the task stack here.
> > >  		 */
> > >  		__do_softirq();
> > > -		rcu_note_context_switch(cpu);
> > >  		local_irq_enable();
> > >  		cond_resched();
> > > +
> > > +		preempt_disable();
> > > +		rcu_note_context_switch(cpu);
> > > +		preempt_enable();
> > > +
> > 
> > The whole rcu_note_context_switch() in run_ksoftirqd() is silly.
> > 
> >     cond_resched()
> > 	__preempt_count_add(PREEMPT_ACTIVE);
> > 
> > 	__schedule();
> > 	     preempt_disable();
> > 	     rcu_note_context_switch();
> > 	     ....
> > 
> > 	__preempt_count_sub(PREEMPT_ACTIVE);
> 
> I agree that if should_resched() returns true as assumed above, then there
> is no point to invoking rcu_note_context_switch().  However, the case that
> this code applies to is when should_resched() returns false, but RCU is
> waiting for a quiescent state from the current CPU.  In that case,
> cond_resched() won't do anything for RCU, and we do need the
> rcu_note_context_switch().

I've been curious about this for ages, so now is a great time to bite
the bullet and ask TheMan.  A context switch is not far away, why do we
need that quiescent state badly enough to tell what looks like a little
white lie to get it immediately?

(I commented it out in an -rt kernel I was testing yesterday, beat it
enthusiastically for a while, and box didn't _seem_ to notice that it
was missing anything)

-Mike


^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH] ksoftirqd: Enable IRQs and call cond_resched() before poking RCU
  2015-01-21  3:40                       ` Mike Galbraith
@ 2015-01-21  5:10                         ` Paul E. McKenney
  2015-01-21  6:29                           ` Mike Galbraith
  0 siblings, 1 reply; 19+ messages in thread
From: Paul E. McKenney @ 2015-01-21  5:10 UTC (permalink / raw)
  To: Mike Galbraith
  Cc: Thomas Gleixner, Calvin Owens, Andrew Morton, Joe Perches,
	Peter Zijlstra, linux-kernel, kernel-team

On Wed, Jan 21, 2015 at 04:40:39AM +0100, Mike Galbraith wrote:
> On Tue, 2015-01-20 at 12:30 -0800, Paul E. McKenney wrote: 
> > On Tue, Jan 20, 2015 at 02:21:51PM +0100, Thomas Gleixner wrote:
> 
> > > > diff --git a/kernel/softirq.c b/kernel/softirq.c
> > > > index 501baa9..9e787d8 100644
> > > > --- a/kernel/softirq.c
> > > > +++ b/kernel/softirq.c
> > > > @@ -656,9 +656,13 @@ static void run_ksoftirqd(unsigned int cpu)
> > > >  		 * in the task stack here.
> > > >  		 */
> > > >  		__do_softirq();
> > > > -		rcu_note_context_switch(cpu);
> > > >  		local_irq_enable();
> > > >  		cond_resched();
> > > > +
> > > > +		preempt_disable();
> > > > +		rcu_note_context_switch(cpu);
> > > > +		preempt_enable();
> > > > +
> > > 
> > > The whole rcu_note_context_switch() in run_ksoftirqd() is silly.
> > > 
> > >     cond_resched()
> > > 	__preempt_count_add(PREEMPT_ACTIVE);
> > > 
> > > 	__schedule();
> > > 	     preempt_disable();
> > > 	     rcu_note_context_switch();
> > > 	     ....
> > > 
> > > 	__preempt_count_sub(PREEMPT_ACTIVE);
> > 
> > I agree that if should_resched() returns true as assumed above, then there
> > is no point to invoking rcu_note_context_switch().  However, the case that
> > this code applies to is when should_resched() returns false, but RCU is
> > waiting for a quiescent state from the current CPU.  In that case,
> > cond_resched() won't do anything for RCU, and we do need the
> > rcu_note_context_switch().
> 
> I've been curious about this for ages, so now is a great time to bite
> the bullet and ask TheMan.  A context switch is not far away, why do we
> need that quiescent state badly enough to tell what looks like a little
> white lie to get it immediately?
> 
> (I commented it out in an -rt kernel I was testing yesterday, beat it
> enthusiastically for a while, and box didn't _seem_ to notice that it
> was missing anything)

Yeah, you do have to have a fairly violent network-based DoS attack
to see the difference.  Robert Olsson was the first to make this happen
back in the day.

							Thanx, Paul


^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH] ksoftirqd: Enable IRQs and call cond_resched() before poking RCU
  2015-01-21  5:10                         ` Paul E. McKenney
@ 2015-01-21  6:29                           ` Mike Galbraith
  0 siblings, 0 replies; 19+ messages in thread
From: Mike Galbraith @ 2015-01-21  6:29 UTC (permalink / raw)
  To: paulmck
  Cc: Thomas Gleixner, Calvin Owens, Andrew Morton, Joe Perches,
	Peter Zijlstra, linux-kernel, kernel-team

On Tue, 2015-01-20 at 21:10 -0800, Paul E. McKenney wrote: 
> On Wed, Jan 21, 2015 at 04:40:39AM +0100, Mike Galbraith wrote:

> > I've been curious about this for ages, so now is a great time to bite
> > the bullet and ask TheMan.  A context switch is not far away, why do we
> > need that quiescent state badly enough to tell what looks like a little
> > white lie to get it immediately?
> > 
> > (I commented it out in an -rt kernel I was testing yesterday, beat it
> > enthusiastically for a while, and box didn't _seem_ to notice that it
> > was missing anything)
> 
> Yeah, you do have to have a fairly violent network-based DoS attack
> to see the difference.  Robert Olsson was the first to make this happen
> back in the day.

Ah, my little NIC doesn't have enough poop to do that.  Thanks.

-Mike


^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH] ksoftirqd: Enable IRQs and call cond_resched() before poking RCU
  2015-01-20 20:30                     ` Paul E. McKenney
  2015-01-21  3:40                       ` Mike Galbraith
@ 2015-01-21  9:30                       ` Thomas Gleixner
  2015-01-21 10:27                         ` Paul E. McKenney
  1 sibling, 1 reply; 19+ messages in thread
From: Thomas Gleixner @ 2015-01-21  9:30 UTC (permalink / raw)
  To: Paul E. McKenney
  Cc: Calvin Owens, Andrew Morton, Joe Perches, Peter Zijlstra,
	linux-kernel, kernel-team

On Tue, 20 Jan 2015, Paul E. McKenney wrote:
> On Tue, Jan 20, 2015 at 02:21:51PM +0100, Thomas Gleixner wrote:
> > The whole rcu_note_context_switch() in run_ksoftirqd() is silly.
> > 
> >     cond_resched()
> > 	__preempt_count_add(PREEMPT_ACTIVE);
> > 
> > 	__schedule();
> > 	     preempt_disable();
> > 	     rcu_note_context_switch();
> > 	     ....
> > 
> > 	__preempt_count_sub(PREEMPT_ACTIVE);
> 
> I agree that if should_resched() returns true as assumed above, then there
> is no point to invoking rcu_note_context_switch().  However, the case that
> this code applies to is when should_resched() returns false, but RCU is
> waiting for a quiescent state from the current CPU.  In that case,
> cond_resched() won't do anything for RCU, and we do need the
> rcu_note_context_switch().

So this should be:

   if (!cond_resched())
      rcu_note_context_switch();

Hmm?
 
> Of course, it would be better to avoid the extra RCU work in the common
> case where cond_resched() does inovke the scheduler.  And that is the
> point of the following patch, which uses cond_resched_rcu_qs().
> However, this use of cond_resched_rcu_qs() doesn't work in older
> kernels.  So Calvin's patch is for backporting, and the follow-up
> patch for future kernels.

I see.

Thanks,

	tglx

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH] ksoftirqd: Enable IRQs and call cond_resched() before poking RCU
  2015-01-21  9:30                       ` Thomas Gleixner
@ 2015-01-21 10:27                         ` Paul E. McKenney
  2015-01-22 10:39                           ` Thomas Gleixner
  0 siblings, 1 reply; 19+ messages in thread
From: Paul E. McKenney @ 2015-01-21 10:27 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: Calvin Owens, Andrew Morton, Joe Perches, Peter Zijlstra,
	linux-kernel, kernel-team

On Wed, Jan 21, 2015 at 10:30:07AM +0100, Thomas Gleixner wrote:
> On Tue, 20 Jan 2015, Paul E. McKenney wrote:
> > On Tue, Jan 20, 2015 at 02:21:51PM +0100, Thomas Gleixner wrote:
> > > The whole rcu_note_context_switch() in run_ksoftirqd() is silly.
> > > 
> > >     cond_resched()
> > > 	__preempt_count_add(PREEMPT_ACTIVE);
> > > 
> > > 	__schedule();
> > > 	     preempt_disable();
> > > 	     rcu_note_context_switch();
> > > 	     ....
> > > 
> > > 	__preempt_count_sub(PREEMPT_ACTIVE);
> > 
> > I agree that if should_resched() returns true as assumed above, then there
> > is no point to invoking rcu_note_context_switch().  However, the case that
> > this code applies to is when should_resched() returns false, but RCU is
> > waiting for a quiescent state from the current CPU.  In that case,
> > cond_resched() won't do anything for RCU, and we do need the
> > rcu_note_context_switch().
> 
> So this should be:
> 
>    if (!cond_resched()) {
	preempt_disable();
>       rcu_note_context_switch();
	preempt_enable();
     }
> 
> Hmm?

Going forward, yes, and cond_resched_rcu_qs() in fact does something
very similar.  For backporting, which is what this patch is for, we are
preserving the same double-quiescent-state behavior that existed earlier,
meaning minimal perturbation of old releases.

Seem reasonable, or do you really feel strongly about pushing this
optimization into -stable?

> > Of course, it would be better to avoid the extra RCU work in the common
> > case where cond_resched() does inovke the scheduler.  And that is the
> > point of the following patch, which uses cond_resched_rcu_qs().
> > However, this use of cond_resched_rcu_qs() doesn't work in older
> > kernels.  So Calvin's patch is for backporting, and the follow-up
> > patch for future kernels.
> 
> I see.

							Thanx, Paul


^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH] ksoftirqd: Enable IRQs and call cond_resched() before poking RCU
  2015-01-21 10:27                         ` Paul E. McKenney
@ 2015-01-22 10:39                           ` Thomas Gleixner
  0 siblings, 0 replies; 19+ messages in thread
From: Thomas Gleixner @ 2015-01-22 10:39 UTC (permalink / raw)
  To: Paul E. McKenney
  Cc: Calvin Owens, Andrew Morton, Joe Perches, Peter Zijlstra,
	linux-kernel, kernel-team

On Wed, 21 Jan 2015, Paul E. McKenney wrote:
> On Wed, Jan 21, 2015 at 10:30:07AM +0100, Thomas Gleixner wrote:
> > On Tue, 20 Jan 2015, Paul E. McKenney wrote:
> > > On Tue, Jan 20, 2015 at 02:21:51PM +0100, Thomas Gleixner wrote:
> > > > The whole rcu_note_context_switch() in run_ksoftirqd() is silly.
> > > > 
> > > >     cond_resched()
> > > > 	__preempt_count_add(PREEMPT_ACTIVE);
> > > > 
> > > > 	__schedule();
> > > > 	     preempt_disable();
> > > > 	     rcu_note_context_switch();
> > > > 	     ....
> > > > 
> > > > 	__preempt_count_sub(PREEMPT_ACTIVE);
> > > 
> > > I agree that if should_resched() returns true as assumed above, then there
> > > is no point to invoking rcu_note_context_switch().  However, the case that
> > > this code applies to is when should_resched() returns false, but RCU is
> > > waiting for a quiescent state from the current CPU.  In that case,
> > > cond_resched() won't do anything for RCU, and we do need the
> > > rcu_note_context_switch().
> > 
> > So this should be:
> > 
> >    if (!cond_resched()) {
> 	preempt_disable();
> >       rcu_note_context_switch();
> 	preempt_enable();
>      }
> > 
> > Hmm?
> 
> Going forward, yes, and cond_resched_rcu_qs() in fact does something
> very similar.  For backporting, which is what this patch is for, we are
> preserving the same double-quiescent-state behavior that existed earlier,
> meaning minimal perturbation of old releases.
> 
> Seem reasonable, or do you really feel strongly about pushing this
> optimization into -stable?

No.

Thanks,

	tglx

^ permalink raw reply	[flat|nested] 19+ messages in thread

end of thread, other threads:[~2015-01-22 10:40 UTC | newest]

Thread overview: 19+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2015-01-07  1:37 [PATCH] ksoftirqd: Enable IRQs and call cond_resched() before poking RCU Calvin Owens
2015-01-07  1:49 ` [PATCH v2] " Calvin Owens
2015-01-07  2:19   ` Paul E. McKenney
2015-01-07 16:52     ` Paul E. McKenney
2015-01-08  4:33       ` Calvin Owens
2015-01-08  4:53         ` Paul E. McKenney
2015-01-08 21:46           ` Calvin Owens
2015-01-13 11:17             ` Thomas Gleixner
2015-01-13 18:43               ` Paul E. McKenney
2015-01-13 21:16                 ` [PATCH] " Calvin Owens
2015-01-14 22:12                   ` Paul E. McKenney
2015-01-20 13:21                   ` Thomas Gleixner
2015-01-20 20:30                     ` Paul E. McKenney
2015-01-21  3:40                       ` Mike Galbraith
2015-01-21  5:10                         ` Paul E. McKenney
2015-01-21  6:29                           ` Mike Galbraith
2015-01-21  9:30                       ` Thomas Gleixner
2015-01-21 10:27                         ` Paul E. McKenney
2015-01-22 10:39                           ` Thomas Gleixner

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).