linux-kernel.vger.kernel.org archive mirror
* Re: [PATCH RFC] r8152: Ensure that napi_schedule() is handled
       [not found]         ` <20210514144130.7287af8e@kicinski-fedora-PC1C0HJN>
@ 2021-05-14 23:23           ` Thomas Gleixner
  2021-05-14 23:36             ` Jakub Kicinski
  2021-05-15 13:09             ` Peter Zijlstra
  0 siblings, 2 replies; 4+ messages in thread
From: Thomas Gleixner @ 2021-05-14 23:23 UTC (permalink / raw)
  To: Jakub Kicinski
  Cc: linux-usb, netdev, Michal Svec, David S. Miller, Hayes Wang,
	Thierry Reding, Lee Jones, Borislav Petkov, LKML, Peter Zijlstra,
	Boqun Feng

On Fri, May 14 2021 at 14:41, Jakub Kicinski wrote:
> On Fri, 14 May 2021 23:10:43 +0200 Thomas Gleixner wrote:
>> On Fri, May 14 2021 at 13:46, Jakub Kicinski wrote:
>> > On Fri, 14 May 2021 22:25:50 +0200 Thomas Gleixner wrote:  
>> >> Except that some instruction cycle beancounters might complain about
>> >> the extra conditional for the sane cases.
>> >> 
>> >> But yes, I'm fine with that as well. That's why this patch is marked RFC :)  
>> >
>> > When we're in the right context (irq/bh disabled etc.) the cost is just
>> > read of preempt_count() and jump, right? And presumably preempt_count()
>> > is in the cache already, because those sections aren't very long. Let me
>> > make this change locally and see if it is in any way perceivable.  
>> 
>> Right. Just wanted to mention it :)
>> 
>> > Obviously if anyone sees a way to solve the problem without much
>> > ifdefinery and force_irqthreads checks that'd be great - I don't.  
>> 
>> This is not related to force_irqthreads at all. This very driver invokes
>> it from plain thread context.
>
> I see, but a driver calling __napi_schedule_irqoff() from its IRQ
> handler _would_ be an issue, right? Or do irq threads trigger softirq
> processing on exit?

Yes, they do. See irq_forced_thread_fn(). It has a local_bh_disable() /
local_bh_enable() pair around the invocation to ensure that.

>> > I'd rather avoid pushing this kind of stuff out to the drivers.  
>> 
>> You could have napi_schedule_intask() or something like that which would
>> do the local_bh_disable()/enable() dance around the invocation of
>> napi_schedule(). That would also document it clearly in the drivers. A
>> quick grep shows a bunch of instances which could be replaced:
>> 
>> drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c-5704-		local_bh_disable();
>> drivers/net/ethernet/mellanox/mlx4/en_netdev.c-1830-		local_bh_disable();
>> drivers/net/usb/r8152.c-1552-	local_bh_disable();
>> drivers/net/virtio_net.c-1355-	local_bh_disable();
>> drivers/net/wireless/intel/iwlwifi/pcie/rx.c-1650-	local_bh_disable();
>> drivers/net/wireless/intel/iwlwifi/pcie/rx.c-2015-		local_bh_disable();
>> drivers/net/wireless/intel/iwlwifi/pcie/rx.c-2225-		local_bh_disable();
>> drivers/net/wireless/intel/iwlwifi/pcie/rx.c-2235-		local_bh_disable();
>> drivers/s390/net/qeth_core_main.c-3515-	local_bh_disable();
>
> Very well aware, I've just sent a patch for mlx5 last week :)
>
> My initial reaction was the same as yours - we should add lockdep
> check, and napi_schedule_intask(). But then I started wondering
> if it's all for nothing on rt or with force_irqthreads, and therefore
> we should just eat the extra check.

We can make that work, but sure, I'm not going to argue if you decide to
just go for raise_softirq_irqoff().

I just hacked up that check, which is actually useful beyond NAPI. It's
straightforward except for the flush_smp_call_function_from_idle()
oddball, which immediately triggered the assert because blk-mq uses
__raise_softirq_irqoff() in an SMP function call...

See below. Peter might have opinions though :)

Thanks,

        tglx
---
 include/linux/lockdep.h |   21 +++++++++++++++++++++
 include/linux/sched.h   |    1 +
 kernel/smp.c            |    2 ++
 kernel/softirq.c        |   18 ++++++++++++++----
 4 files changed, 38 insertions(+), 4 deletions(-)

--- a/include/linux/lockdep.h
+++ b/include/linux/lockdep.h
@@ -636,6 +636,23 @@ do {									\
 		     (!in_softirq() || in_irq() || in_nmi()));		\
 } while (0)
 
+#define lockdep_set_softirq_raise_safe()				\
+do {									\
+	current->softirq_raise_safe = 1;				\
+} while (0)
+
+#define lockdep_clear_softirq_raise_safe()				\
+do {									\
+	current->softirq_raise_safe = 0;				\
+} while (0)
+
+#define lockdep_assert_softirq_raise_ok()				\
+do {									\
+	WARN_ON_ONCE(__lockdep_enabled &&				\
+		     !current->softirq_raise_safe &&			\
+		     !(softirq_count() | hardirq_count()));		\
+} while (0)
+
 #else
 # define might_lock(lock) do { } while (0)
 # define might_lock_read(lock) do { } while (0)
@@ -648,6 +665,10 @@ do {									\
 # define lockdep_assert_preemption_enabled() do { } while (0)
 # define lockdep_assert_preemption_disabled() do { } while (0)
 # define lockdep_assert_in_softirq() do { } while (0)
+
+# define lockdep_set_softirq_raise_safe()	do { } while (0)
+# define lockdep_clear_softirq_raise_safe()	do { } while (0)
+# define lockdep_assert_softirq_raise_ok()	do { } while (0)
 #endif
 
 #ifdef CONFIG_PROVE_RAW_LOCK_NESTING
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -1058,6 +1058,7 @@ struct task_struct {
 	u64				curr_chain_key;
 	int				lockdep_depth;
 	unsigned int			lockdep_recursion;
+	unsigned int			softirq_raise_safe;
 	struct held_lock		held_locks[MAX_LOCK_DEPTH];
 #endif
 
--- a/kernel/smp.c
+++ b/kernel/smp.c
@@ -691,7 +691,9 @@ void flush_smp_call_function_from_idle(v
 	cfd_seq_store(this_cpu_ptr(&cfd_seq_local)->idle, CFD_SEQ_NOCPU,
 		      smp_processor_id(), CFD_SEQ_IDLE);
 	local_irq_save(flags);
+	lockdep_set_softirq_raise_safe();
 	flush_smp_call_function_queue(true);
+	lockdep_clear_softirq_raise_safe();
 	if (local_softirq_pending())
 		do_softirq();
 
--- a/kernel/softirq.c
+++ b/kernel/softirq.c
@@ -664,12 +664,19 @@ void irq_exit(void)
 	lockdep_hardirq_exit();
 }
 
+static inline void ____raise_softirq_irqoff(unsigned int nr)
+{
+	lockdep_assert_irqs_disabled();
+	trace_softirq_raise(nr);
+	or_softirq_pending(1UL << nr);
+}
+
 /*
  * This function must run with irqs disabled!
  */
 inline void raise_softirq_irqoff(unsigned int nr)
 {
-	__raise_softirq_irqoff(nr);
+	____raise_softirq_irqoff(nr);
 
 	/*
 	 * If we're in an interrupt or softirq, we're done
@@ -693,11 +700,14 @@ void raise_softirq(unsigned int nr)
 	local_irq_restore(flags);
 }
 
+/*
+ * Must be invoked with interrupts disabled, and either from softirq-serving
+ * context or with local bottom halves disabled.
+ */
 void __raise_softirq_irqoff(unsigned int nr)
 {
-	lockdep_assert_irqs_disabled();
-	trace_softirq_raise(nr);
-	or_softirq_pending(1UL << nr);
+	lockdep_assert_softirq_raise_ok();
+	____raise_softirq_irqoff(nr);
 }
 
 void open_softirq(int nr, void (*action)(struct softirq_action *))



* Re: [PATCH RFC] r8152: Ensure that napi_schedule() is handled
  2021-05-14 23:23           ` [PATCH RFC] r8152: Ensure that napi_schedule() is handled Thomas Gleixner
@ 2021-05-14 23:36             ` Jakub Kicinski
  2021-05-15 13:09             ` Peter Zijlstra
  1 sibling, 0 replies; 4+ messages in thread
From: Jakub Kicinski @ 2021-05-14 23:36 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: linux-usb, netdev, Michal Svec, David S. Miller, Hayes Wang,
	Thierry Reding, Lee Jones, Borislav Petkov, LKML, Peter Zijlstra,
	Boqun Feng

On Sat, 15 May 2021 01:23:02 +0200 Thomas Gleixner wrote:
> On Fri, May 14 2021 at 14:41, Jakub Kicinski wrote:
> >> This is not related to force_irqthreads at all. This very driver invokes
> >> it from plain thread context.  
> >
> > I see, but a driver calling __napi_schedule_irqoff() from its IRQ
> > handler _would_ be an issue, right? Or do irq threads trigger softirq
> > processing on exit?  
> 
> Yes, they do. See irq_forced_thread_fn(). It has a local_bh_disable() /
> local_bh_enable() pair around the invocation to ensure that.

Ah, excellent!

> >> You could have napi_schedule_intask() or something like that which would
> >> do the local_bh_disable()/enable() dance around the invocation of
> >> napi_schedule(). That would also document it clearly in the drivers. A
> >> quick grep shows a bunch of instances which could be replaced:
> >> 
> >> drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c-5704-		local_bh_disable();
> >> drivers/net/ethernet/mellanox/mlx4/en_netdev.c-1830-		local_bh_disable();
> >> drivers/net/usb/r8152.c-1552-	local_bh_disable();
> >> drivers/net/virtio_net.c-1355-	local_bh_disable();
> >> drivers/net/wireless/intel/iwlwifi/pcie/rx.c-1650-	local_bh_disable();
> >> drivers/net/wireless/intel/iwlwifi/pcie/rx.c-2015-		local_bh_disable();
> >> drivers/net/wireless/intel/iwlwifi/pcie/rx.c-2225-		local_bh_disable();
> >> drivers/net/wireless/intel/iwlwifi/pcie/rx.c-2235-		local_bh_disable();
> >> drivers/s390/net/qeth_core_main.c-3515-	local_bh_disable();  
> >
> > Very well aware, I've just sent a patch for mlx5 last week :)
> >
> > My initial reaction was the same as yours - we should add lockdep
> > check, and napi_schedule_intask(). But then I started wondering
> > if it's all for nothing on rt or with force_irqthreads, and therefore
> > we should just eat the extra check.  
> 
> We can make that work, but sure, I'm not going to argue if you decide to
> just go for raise_softirq_irqoff().
> 
> I just hacked up that check, which is actually useful beyond NAPI. It's
> straightforward except for the flush_smp_call_function_from_idle()
> oddball, which immediately triggered the assert because blk-mq uses
> __raise_softirq_irqoff() in an SMP function call...
> 
> See below. Peter might have opinions though :)

Looks good to me. Since my thinking that RT complicates things here was
wrong, I'm perfectly happy with the lockdep check + napi_schedule_intask().


* Re: [PATCH RFC] r8152: Ensure that napi_schedule() is handled
  2021-05-14 23:23           ` [PATCH RFC] r8152: Ensure that napi_schedule() is handled Thomas Gleixner
  2021-05-14 23:36             ` Jakub Kicinski
@ 2021-05-15 13:09             ` Peter Zijlstra
  2021-05-15 19:06               ` Thomas Gleixner
  1 sibling, 1 reply; 4+ messages in thread
From: Peter Zijlstra @ 2021-05-15 13:09 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: Jakub Kicinski, linux-usb, netdev, Michal Svec, David S. Miller,
	Hayes Wang, Thierry Reding, Lee Jones, Borislav Petkov, LKML,
	Boqun Feng

On Sat, May 15, 2021 at 01:23:02AM +0200, Thomas Gleixner wrote:
> We can make that work, but sure, I'm not going to argue if you decide to
> just go for raise_softirq_irqoff().
> 
> I just hacked up that check, which is actually useful beyond NAPI. It's
> straightforward except for the flush_smp_call_function_from_idle()
> oddball, which immediately triggered the assert because blk-mq uses
> __raise_softirq_irqoff() in an SMP function call...
> 
> See below. Peter might have opinions though :)

Yeah, lovely stuff :-)


> +#define lockdep_assert_softirq_raise_ok()				\
> +do {									\
> +	WARN_ON_ONCE(__lockdep_enabled &&				\
> +		     !current->softirq_raise_safe &&			\
> +		     !(softirq_count() | hardirq_count()));		\
> +} while (0)

> --- a/kernel/smp.c
> +++ b/kernel/smp.c
> @@ -691,7 +691,9 @@ void flush_smp_call_function_from_idle(v
>  	cfd_seq_store(this_cpu_ptr(&cfd_seq_local)->idle, CFD_SEQ_NOCPU,
>  		      smp_processor_id(), CFD_SEQ_IDLE);
>  	local_irq_save(flags);
> +	lockdep_set_softirq_raise_safe();
>  	flush_smp_call_function_queue(true);
> +	lockdep_clear_softirq_raise_safe();
>  	if (local_softirq_pending())
>  		do_softirq();

I think it might make more sense to raise hardirq_count() in/for
flush_smp_call_function_queue() callers that aren't already from hardirq
context. That's this site and smpcfd_dying_cpu().

Then we can do away with this new special case.


* Re: [PATCH RFC] r8152: Ensure that napi_schedule() is handled
  2021-05-15 13:09             ` Peter Zijlstra
@ 2021-05-15 19:06               ` Thomas Gleixner
  0 siblings, 0 replies; 4+ messages in thread
From: Thomas Gleixner @ 2021-05-15 19:06 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Jakub Kicinski, linux-usb, netdev, Michal Svec, David S. Miller,
	Hayes Wang, Thierry Reding, Lee Jones, Borislav Petkov, LKML,
	Boqun Feng

On Sat, May 15 2021 at 15:09, Peter Zijlstra wrote:
> On Sat, May 15, 2021 at 01:23:02AM +0200, Thomas Gleixner wrote:
>> --- a/kernel/smp.c
>> +++ b/kernel/smp.c
>> @@ -691,7 +691,9 @@ void flush_smp_call_function_from_idle(v
>>  	cfd_seq_store(this_cpu_ptr(&cfd_seq_local)->idle, CFD_SEQ_NOCPU,
>>  		      smp_processor_id(), CFD_SEQ_IDLE);
>>  	local_irq_save(flags);
>> +	lockdep_set_softirq_raise_safe();
>>  	flush_smp_call_function_queue(true);
>> +	lockdep_clear_softirq_raise_safe();
>>  	if (local_softirq_pending())
>>  		do_softirq();
>
> I think it might make more sense to raise hardirq_count() in/for
> flush_smp_call_function_queue() callers that aren't already from hardirq
> context. That's this site and smpcfd_dying_cpu().
>
> Then we can do away with this new special case.

Right.

Though I just checked smpcfd_dying_cpu(). That one does not run softirqs
after flushing the function queue, and it cannot do so because it runs in
the CPU dying phase with interrupts disabled, where the CPU is already
half torn down.

Especially as softirq processing enables interrupts, which might cause
even more havoc.

Anyway, how is it safe to run arbitrary functions there after the CPU
has removed itself from the online mask? That's daft, to put it mildly.

Thanks,

        tglx
