netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* seqlock lockdep false positives?
@ 2021-03-04  0:40 Jakub Kicinski
  2021-03-07  9:20 ` Ahmed S. Darwish
  0 siblings, 1 reply; 8+ messages in thread
From: Jakub Kicinski @ 2021-03-04  0:40 UTC (permalink / raw)
  To: Ahmed S. Darwish; +Cc: erhard_f, netdev

Hi Ahmed!

Erhard is reporting a lockdep splat in drivers/net/ethernet/realtek/8139too.c

https://bugzilla.kernel.org/show_bug.cgi?id=211575

I can't quite grasp how that happens it looks like it's the Rx
lock/syncp on one side and the Tx lock on the other side :S

================================
WARNING: inconsistent lock state
5.12.0-rc1-Pentium4 #2 Not tainted
--------------------------------
inconsistent {IN-HARDIRQ-W} -> {HARDIRQ-ON-W} usage.
swapper/0/0 [HC0[0]:SC1[1]:HE1:SE0] takes:
c113c804 (&syncp->seq#2){?.-.}-{0:0}, at: rtl8139_poll+0x251/0x350
{IN-HARDIRQ-W} state was registered at:
  lock_acquire+0x239/0x2c5
  do_write_seqcount_begin_nested.constprop.0+0x1a/0x1f
  rtl8139_interrupt+0x346/0x3cb
  __handle_irq_event_percpu+0xe5/0x20c
  handle_irq_event_percpu+0x17/0x3d
  handle_irq_event+0x29/0x42
  handle_fasteoi_irq+0x67/0xd7
  __handle_irq+0x7d/0x9c
  __common_interrupt+0x68/0xc3
  common_interrupt+0x22/0x35
  asm_common_interrupt+0x106/0x180
  _raw_spin_unlock_irqrestore+0x41/0x45
  __mod_timer+0x1cd/0x1d8
  mod_timer+0xa/0xc
  mld_ifc_start_timer+0x24/0x37
  mld_ifc_timer_expire+0x1b0/0x1c0
  call_timer_fn+0xfe/0x201
  __run_timers+0x134/0x159
  run_timer_softirq+0x14/0x27
  __do_softirq+0x15f/0x307
  call_on_stack+0x40/0x46
  do_softirq_own_stack+0x1c/0x1e
  __irq_exit_rcu+0x4f/0x85
  irq_exit_rcu+0x8/0x11
  sysvec_apic_timer_interrupt+0x20/0x2e
  handle_exception_return+0x0/0xaf
  default_idle+0xa/0xc
  arch_cpu_idle+0xd/0xf
  default_idle_call+0x48/0x74
  do_idle+0xb7/0x1c3
  cpu_startup_entry+0x19/0x1b
  rest_init+0x11d/0x120
  arch_call_rest_init+0x8/0xb
  start_kernel+0x417/0x425
  i386_start_kernel+0x43/0x45
  startup_32_smp+0x164/0x168
irq event stamp: 26328
hardirqs last  enabled at (26328): [<c4362e64>] __slab_alloc.constprop.0+0x3e/0x59
hardirqs last disabled at (26327): [<c4362e47>] __slab_alloc.constprop.0+0x21/0x59
softirqs last  enabled at (26314): [<c4789f1f>] __do_softirq+0x2d7/0x307
softirqs last disabled at (26321): [<c420fecb>] call_on_stack+0x40/0x46

other info that might help us debug this:
 Possible unsafe locking scenario:

       CPU0
       ----
  lock(&syncp->seq#2);
  <Interrupt>
    lock(&syncp->seq#2);

 *** DEADLOCK ***

1 lock held by swapper/0/0:
 #0: c113c8a4 (&tp->rx_lock){+.-.}-{2:2}, at: rtl8139_poll+0x31/0x350

stack backtrace:
CPU: 0 PID: 0 Comm: swapper/0 Not tainted 5.12.0-rc1-Pentium4 #2
Hardware name:  /FS51, BIOS 6.00 PG 12/02/2003
Call Trace:
 <SOFTIRQ>
 dump_stack+0x78/0xa5
 print_usage_bug+0x17d/0x188
 mark_lock.part.0+0xfd/0x27a
 ? hlock_class+0x18/0x58
 ? mark_lock.part.0+0x33/0x27a
 ? ___slab_alloc.constprop.0+0x2b7/0x2d1
 __lock_acquire+0x458/0x1488
 ? rcu_read_lock_sched_held+0x23/0x4a
 ? trace_kmalloc+0x8c/0xb9
 ? __kmalloc_track_caller+0x130/0x143
 lock_acquire+0x239/0x2c5
 ? rtl8139_poll+0x251/0x350
 ? __alloc_skb+0xb7/0x102
 do_write_seqcount_begin_nested.constprop.0+0x1a/0x1f
 ? rtl8139_poll+0x251/0x350
 rtl8139_poll+0x251/0x350
 __napi_poll+0x24/0xf1
 net_rx_action+0xbb/0x177
 __do_softirq+0x15f/0x307
 ? __entry_text_end+0x5/0x5
 call_on_stack+0x40/0x46
 </SOFTIRQ>
 ? __irq_exit_rcu+0x4f/0x85
 ? irq_exit_rcu+0x8/0x11
 ? common_interrupt+0x27/0x35
 ? asm_common_interrupt+0x106/0x180
 ? ldsem_down_write+0x1f/0x1f
 ? newidle_balance+0x1d0/0x3ab
 ? default_idle+0xa/0xc
 ? __pci_setup_bridge+0x4e/0x64
 ? default_idle+0xa/0xc
 ? arch_cpu_idle+0xd/0xf
 ? default_idle_call+0x48/0x74
 ? do_idle+0xb7/0x1c3
 ? cpu_startup_entry+0x19/0x1b
 ? rest_init+0x11d/0x120
 ? arch_call_rest_init+0x8/0xb
 ? start_kernel+0x417/0x425
 ? i386_start_kernel+0x43/0x45
 ? startup_32_smp+0x164/0x168

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: seqlock lockdep false positives?
  2021-03-04  0:40 seqlock lockdep false positives? Jakub Kicinski
@ 2021-03-07  9:20 ` Ahmed S. Darwish
  2021-03-08  8:38   ` Peter Zijlstra
  2021-03-08 18:28   ` Erhard F.
  0 siblings, 2 replies; 8+ messages in thread
From: Ahmed S. Darwish @ 2021-03-07  9:20 UTC (permalink / raw)
  To: Jakub Kicinski
  Cc: erhard_f, netdev, Sebastian Andrzej Siewior, Peter Zijlstra

Hi Jakub,

On Wed, Mar 03, 2021 at 04:40:35PM -0800, Jakub Kicinski wrote:
> Hi Ahmed!
>
> Erhard is reporting a lockdep splat in drivers/net/ethernet/realtek/8139too.c
>
> https://bugzilla.kernel.org/show_bug.cgi?id=211575
>
> I can't quite grasp how that happens it looks like it's the Rx
> lock/syncp on one side and the Tx lock on the other side :S
>
> ================================
> WARNING: inconsistent lock state
> 5.12.0-rc1-Pentium4 #2 Not tainted
> --------------------------------
> inconsistent {IN-HARDIRQ-W} -> {HARDIRQ-ON-W} usage.
> swapper/0/0 [HC0[0]:SC1[1]:HE1:SE0] takes:
> c113c804 (&syncp->seq#2){?.-.}-{0:0}, at: rtl8139_poll+0x251/0x350
> {IN-HARDIRQ-W} state was registered at:
>   lock_acquire+0x239/0x2c5
>   do_write_seqcount_begin_nested.constprop.0+0x1a/0x1f
>   rtl8139_interrupt+0x346/0x3cb

That's really weird.

The only way I can see this happening is lockdep mistakenly treating
both "tx_stats->syncp.seq" and "rx_stats->syncp.seq" as the same lockdep
class key... somehow.

It is claiming that the softirq code path at rtl8139_poll() is acquiring
the *tx*_stats sequence counter. But at rtl8139_poll(), I can only see
the *rx*_stats sequence counter getting acquired.

I've re-checked where tx/rx stats sequence counters are initialized, and
I see:

  static struct net_device *rtl8139_init_board(struct pci_dev *pdev)
  {
	...
	u64_stats_init(&tp->rx_stats.syncp);
	u64_stats_init(&tp->tx_stats.syncp);
	...
  }

which means they should have different lockdep class keys.  The
u64_stats sequence counters are also initialized way before any IRQ
handlers are registered.

@Erhard, can you please try below patch? Just want to confirm if this
theory has any validity to it:

diff --git a/drivers/net/ethernet/realtek/8139too.c b/drivers/net/ethernet/realtek/8139too.c
index 1e5a453dea14..c0dbb0418e9d 100644
--- a/drivers/net/ethernet/realtek/8139too.c
+++ b/drivers/net/ethernet/realtek/8139too.c
@@ -715,6 +715,11 @@ static const unsigned int rtl8139_rx_config =
 static const unsigned int rtl8139_tx_config =
 	TxIFG96 | (TX_DMA_BURST << TxDMAShift) | (TX_RETRY << TxRetryShift);

+#if BITS_PER_LONG==32 && defined(CONFIG_SMP)
+static struct lock_class_key rx_stats_key;
+static struct lock_class_key tx_stats_key;
+#endif
+
 static void __rtl8139_cleanup_dev (struct net_device *dev)
 {
 	struct rtl8139_private *tp = netdev_priv(dev);
@@ -794,8 +799,17 @@ static struct net_device *rtl8139_init_board(struct pci_dev *pdev)

 	pci_set_master (pdev);

-	u64_stats_init(&tp->rx_stats.syncp);
-	u64_stats_init(&tp->tx_stats.syncp);
+#if BITS_PER_LONG==32 && defined(CONFIG_SMP)
+	dev_warn(d, "Manually intializing tx/rx stats sequence counters\n");
+
+	tp->rx_stats.syncp.seq.sequence = 0;
+	lockdep_set_class_and_name(&tp->rx_stats.syncp.seq,
+				   &rx_stats_key, "RX stats");
+
+	tp->tx_stats.syncp.seq.sequence = 0;
+	lockdep_set_class_and_name(&tp->tx_stats.syncp.seq,
+				   &tx_stats_key, "TX stats");
+#endif

 retry:
 	/* PIO bar register comes first. */

I've added Sebastian and Peter in Cc too. Maybe they can provide some
further input.

[ Rest of the lockdep report is left, as-is, below... ]

>   __handle_irq_event_percpu+0xe5/0x20c
>   handle_irq_event_percpu+0x17/0x3d
>   handle_irq_event+0x29/0x42
>   handle_fasteoi_irq+0x67/0xd7
>   __handle_irq+0x7d/0x9c
>   __common_interrupt+0x68/0xc3
>   common_interrupt+0x22/0x35
>   asm_common_interrupt+0x106/0x180
>   _raw_spin_unlock_irqrestore+0x41/0x45
>   __mod_timer+0x1cd/0x1d8
>   mod_timer+0xa/0xc
>   mld_ifc_start_timer+0x24/0x37
>   mld_ifc_timer_expire+0x1b0/0x1c0
>   call_timer_fn+0xfe/0x201
>   __run_timers+0x134/0x159
>   run_timer_softirq+0x14/0x27
>   __do_softirq+0x15f/0x307
>   call_on_stack+0x40/0x46
>   do_softirq_own_stack+0x1c/0x1e
>   __irq_exit_rcu+0x4f/0x85
>   irq_exit_rcu+0x8/0x11
>   sysvec_apic_timer_interrupt+0x20/0x2e
>   handle_exception_return+0x0/0xaf
>   default_idle+0xa/0xc
>   arch_cpu_idle+0xd/0xf
>   default_idle_call+0x48/0x74
>   do_idle+0xb7/0x1c3
>   cpu_startup_entry+0x19/0x1b
>   rest_init+0x11d/0x120
>   arch_call_rest_init+0x8/0xb
>   start_kernel+0x417/0x425
>   i386_start_kernel+0x43/0x45
>   startup_32_smp+0x164/0x168
> irq event stamp: 26328
> hardirqs last  enabled at (26328): [<c4362e64>] __slab_alloc.constprop.0+0x3e/0x59
> hardirqs last disabled at (26327): [<c4362e47>] __slab_alloc.constprop.0+0x21/0x59
> softirqs last  enabled at (26314): [<c4789f1f>] __do_softirq+0x2d7/0x307
> softirqs last disabled at (26321): [<c420fecb>] call_on_stack+0x40/0x46
>
> other info that might help us debug this:
>  Possible unsafe locking scenario:
>
>        CPU0
>        ----
>   lock(&syncp->seq#2);
>   <Interrupt>
>     lock(&syncp->seq#2);
>
>  *** DEADLOCK ***
>
> 1 lock held by swapper/0/0:
>  #0: c113c8a4 (&tp->rx_lock){+.-.}-{2:2}, at: rtl8139_poll+0x31/0x350
>
> stack backtrace:
> CPU: 0 PID: 0 Comm: swapper/0 Not tainted 5.12.0-rc1-Pentium4 #2
> Hardware name:  /FS51, BIOS 6.00 PG 12/02/2003
> Call Trace:
>  <SOFTIRQ>
>  dump_stack+0x78/0xa5
>  print_usage_bug+0x17d/0x188
>  mark_lock.part.0+0xfd/0x27a
>  ? hlock_class+0x18/0x58
>  ? mark_lock.part.0+0x33/0x27a
>  ? ___slab_alloc.constprop.0+0x2b7/0x2d1
>  __lock_acquire+0x458/0x1488
>  ? rcu_read_lock_sched_held+0x23/0x4a
>  ? trace_kmalloc+0x8c/0xb9
>  ? __kmalloc_track_caller+0x130/0x143
>  lock_acquire+0x239/0x2c5
>  ? rtl8139_poll+0x251/0x350
>  ? __alloc_skb+0xb7/0x102
>  do_write_seqcount_begin_nested.constprop.0+0x1a/0x1f
>  ? rtl8139_poll+0x251/0x350
>  rtl8139_poll+0x251/0x350
>  __napi_poll+0x24/0xf1
>  net_rx_action+0xbb/0x177
>  __do_softirq+0x15f/0x307
>  ? __entry_text_end+0x5/0x5
>  call_on_stack+0x40/0x46
>  </SOFTIRQ>
>  ? __irq_exit_rcu+0x4f/0x85
>  ? irq_exit_rcu+0x8/0x11
>  ? common_interrupt+0x27/0x35
>  ? asm_common_interrupt+0x106/0x180
>  ? ldsem_down_write+0x1f/0x1f
>  ? newidle_balance+0x1d0/0x3ab
>  ? default_idle+0xa/0xc
>  ? __pci_setup_bridge+0x4e/0x64
>  ? default_idle+0xa/0xc
>  ? arch_cpu_idle+0xd/0xf
>  ? default_idle_call+0x48/0x74
>  ? do_idle+0xb7/0x1c3
>  ? cpu_startup_entry+0x19/0x1b
>  ? rest_init+0x11d/0x120
>  ? arch_call_rest_init+0x8/0xb
>  ? start_kernel+0x417/0x425
>  ? i386_start_kernel+0x43/0x45
>  ? startup_32_smp+0x164/0x168

^ permalink raw reply related	[flat|nested] 8+ messages in thread

* Re: seqlock lockdep false positives?
  2021-03-07  9:20 ` Ahmed S. Darwish
@ 2021-03-08  8:38   ` Peter Zijlstra
  2021-03-08 20:42     ` Erhard F.
  2021-03-08 18:28   ` Erhard F.
  1 sibling, 1 reply; 8+ messages in thread
From: Peter Zijlstra @ 2021-03-08  8:38 UTC (permalink / raw)
  To: Ahmed S. Darwish
  Cc: Jakub Kicinski, erhard_f, netdev, Sebastian Andrzej Siewior

On Sun, Mar 07, 2021 at 10:20:08AM +0100, Ahmed S. Darwish wrote:
> Hi Jakub,
> 
> On Wed, Mar 03, 2021 at 04:40:35PM -0800, Jakub Kicinski wrote:
> > Hi Ahmed!
> >
> > Erhard is reporting a lockdep splat in drivers/net/ethernet/realtek/8139too.c
> >
> > https://bugzilla.kernel.org/show_bug.cgi?id=211575
> >
> > I can't quite grasp how that happens it looks like it's the Rx
> > lock/syncp on one side and the Tx lock on the other side :S
> >
> > ================================
> > WARNING: inconsistent lock state
> > 5.12.0-rc1-Pentium4 #2 Not tainted
> > --------------------------------
> > inconsistent {IN-HARDIRQ-W} -> {HARDIRQ-ON-W} usage.
> > swapper/0/0 [HC0[0]:SC1[1]:HE1:SE0] takes:
> > c113c804 (&syncp->seq#2){?.-.}-{0:0}, at: rtl8139_poll+0x251/0x350
> > {IN-HARDIRQ-W} state was registered at:
> >   lock_acquire+0x239/0x2c5
> >   do_write_seqcount_begin_nested.constprop.0+0x1a/0x1f
> >   rtl8139_interrupt+0x346/0x3cb
> 
> That's really weird.
> 
> The only way I can see this happening is lockdep mistakenly treating
> both "tx_stats->syncp.seq" and "rx_stats->syncp.seq" as the same lockdep
> class key... somehow.
> 
> It is claiming that the softirq code path at rtl8139_poll() is acquiring
> the *tx*_stats sequence counter. But at rtl8139_poll(), I can only see
> the *rx*_stats sequence counter getting acquired.
> 
> I've re-checked where tx/rx stats sequence counters are initialized, and
> I see:
> 
>   static struct net_device *rtl8139_init_board(struct pci_dev *pdev)
>   {
> 	...
> 	u64_stats_init(&tp->rx_stats.syncp);
> 	u64_stats_init(&tp->tx_stats.syncp);
> 	...
>   }
> 
> which means they should have different lockdep class keys.  The
> u64_stats sequence counters are also initialized way before any IRQ
> handlers are registered.

Indeed, that's one area where inlines are very much not equivalent to
macros. Static variables in inline functions aren't exact, but they very
much do not get to be one per invocation.

Something like the below ought to be the right fix I think.

diff --git a/include/linux/u64_stats_sync.h b/include/linux/u64_stats_sync.h
index c6abb79501b3..e81856c0ba13 100644
--- a/include/linux/u64_stats_sync.h
+++ b/include/linux/u64_stats_sync.h
@@ -115,12 +115,13 @@ static inline void u64_stats_inc(u64_stats_t *p)
 }
 #endif
 
+#if BITS_PER_LONG == 32 && defined(CONFIG_SMP)
+#define u64_stats_init(syncp)	seqcount_init(&(syncp)->seq)
+#else
 static inline void u64_stats_init(struct u64_stats_sync *syncp)
 {
-#if BITS_PER_LONG == 32 && defined(CONFIG_SMP)
-	seqcount_init(&syncp->seq);
-#endif
 }
+#endif
 
 static inline void u64_stats_update_begin(struct u64_stats_sync *syncp)
 {

^ permalink raw reply related	[flat|nested] 8+ messages in thread

* Re: seqlock lockdep false positives?
  2021-03-07  9:20 ` Ahmed S. Darwish
  2021-03-08  8:38   ` Peter Zijlstra
@ 2021-03-08 18:28   ` Erhard F.
  1 sibling, 0 replies; 8+ messages in thread
From: Erhard F. @ 2021-03-08 18:28 UTC (permalink / raw)
  To: Ahmed S. Darwish
  Cc: Jakub Kicinski, netdev, Sebastian Andrzej Siewior, Peter Zijlstra

On Sun, 7 Mar 2021 10:20:08 +0100
"Ahmed S. Darwish" <a.darwish@linutronix.de> wrote:

> @Erhard, can you please try below patch? Just want to confirm if this
> theory has any validity to it:
> 
> diff --git a/drivers/net/ethernet/realtek/8139too.c b/drivers/net/ethernet/realtek/8139too.c
> index 1e5a453dea14..c0dbb0418e9d 100644
> --- a/drivers/net/ethernet/realtek/8139too.c
> +++ b/drivers/net/ethernet/realtek/8139too.c
> @@ -715,6 +715,11 @@ static const unsigned int rtl8139_rx_config =
>  static const unsigned int rtl8139_tx_config =
>  	TxIFG96 | (TX_DMA_BURST << TxDMAShift) | (TX_RETRY << TxRetryShift);
> 
> +#if BITS_PER_LONG==32 && defined(CONFIG_SMP)
> +static struct lock_class_key rx_stats_key;
> +static struct lock_class_key tx_stats_key;
> +#endif
> +
>  static void __rtl8139_cleanup_dev (struct net_device *dev)
>  {
>  	struct rtl8139_private *tp = netdev_priv(dev);
> @@ -794,8 +799,17 @@ static struct net_device *rtl8139_init_board(struct pci_dev *pdev)
> 
>  	pci_set_master (pdev);
> 
> -	u64_stats_init(&tp->rx_stats.syncp);
> -	u64_stats_init(&tp->tx_stats.syncp);
> +#if BITS_PER_LONG==32 && defined(CONFIG_SMP)
> +	dev_warn(d, "Manually intializing tx/rx stats sequence counters\n");
> +
> +	tp->rx_stats.syncp.seq.sequence = 0;
> +	lockdep_set_class_and_name(&tp->rx_stats.syncp.seq,
> +				   &rx_stats_key, "RX stats");
> +
> +	tp->tx_stats.syncp.seq.sequence = 0;
> +	lockdep_set_class_and_name(&tp->tx_stats.syncp.seq,
> +				   &tx_stats_key, "TX stats");
> +#endif

Hi Ahmed!

With your patch on top of 5.12-rc2 the lockdep splat is gone in the kernel dmesg and I only get:

[...]
8139too: 8139too Fast Ethernet driver 0.9.28
8139too 0000:00:0f.0: Manually intializing tx/rx stats sequence counters
8139too 0000:00:0f.0 eth0: RealTek RTL8139 at 0x(ptrval), 00:30:1b:2f:2c:58, IRQ 18
[...]

Trying Peter's patch next.

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: seqlock lockdep false positives?
  2021-03-08  8:38   ` Peter Zijlstra
@ 2021-03-08 20:42     ` Erhard F.
  2021-03-09  7:54       ` Peter Zijlstra
  0 siblings, 1 reply; 8+ messages in thread
From: Erhard F. @ 2021-03-08 20:42 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Ahmed S. Darwish, Jakub Kicinski, netdev, Sebastian Andrzej Siewior

On Mon, 8 Mar 2021 09:38:11 +0100
Peter Zijlstra <peterz@infradead.org> wrote:

> Indeed, that's one area where inlines are very much not equivalent to
> macros. Static variables in inline functions aren't exact, but they very
> much do not get to be one per invocation.
> 
> Something like the below ought to be the right fix I think.
> 
> diff --git a/include/linux/u64_stats_sync.h b/include/linux/u64_stats_sync.h
> index c6abb79501b3..e81856c0ba13 100644
> --- a/include/linux/u64_stats_sync.h
> +++ b/include/linux/u64_stats_sync.h
> @@ -115,12 +115,13 @@ static inline void u64_stats_inc(u64_stats_t *p)
>  }
>  #endif
>  
> +#if BITS_PER_LONG == 32 && defined(CONFIG_SMP)
> +#define u64_stats_init(syncp)	seqcount_init(&(syncp)->seq)
> +#else
>  static inline void u64_stats_init(struct u64_stats_sync *syncp)
>  {
> -#if BITS_PER_LONG == 32 && defined(CONFIG_SMP)
> -	seqcount_init(&syncp->seq);
> -#endif
>  }
> +#endif
>  
>  static inline void u64_stats_update_begin(struct u64_stats_sync *syncp)
>  {

Hi Peter!

I can confirm that your patch on top of 5.12-rc2 makes the lockdep splat disappear (Ahmeds' 1st patch not installed).

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: seqlock lockdep false positives?
  2021-03-08 20:42     ` Erhard F.
@ 2021-03-09  7:54       ` Peter Zijlstra
  2021-03-09 10:12         ` Eric Dumazet
  0 siblings, 1 reply; 8+ messages in thread
From: Peter Zijlstra @ 2021-03-09  7:54 UTC (permalink / raw)
  To: Erhard F.
  Cc: Ahmed S. Darwish, Jakub Kicinski, netdev, Sebastian Andrzej Siewior

On Mon, Mar 08, 2021 at 09:42:08PM +0100, Erhard F. wrote:

> I can confirm that your patch on top of 5.12-rc2 makes the lockdep
> splat disappear (Ahmeds' 1st patch not installed).

Excellent, I'll queue the below in locking/urgent then.


---
Subject: u64_stats,lockdep: Fix u64_stats_init() vs lockdep
From: Peter Zijlstra <peterz@infradead.org>
Date: Mon, 8 Mar 2021 09:38:12 +0100

Jakub reported that:

    static struct net_device *rtl8139_init_board(struct pci_dev *pdev)
    {
	    ...
	    u64_stats_init(&tp->rx_stats.syncp);
	    u64_stats_init(&tp->tx_stats.syncp);
	    ...
    }

results in lockdep getting confused between the RX and TX stats lock.
This is because u64_stats_init() is an inline calling seqcount_init(),
which is a macro using a static variable to generate a lockdep class.

By wrapping that in an inline, we negate the effect of the macro and
fold the static key variable, hence the confusion.

Fix by also making u64_stats_init() a macro for the case where it
matters, leaving the other case an inline for argument validation
etc.

Reported-by: Jakub Kicinski <kuba@kernel.org>
Debugged-by: "Ahmed S. Darwish" <a.darwish@linutronix.de>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Tested-by: "Erhard F." <erhard_f@mailbox.org>
Link: https://lkml.kernel.org/r/YEXicy6+9MksdLZh@hirez.programming.kicks-ass.net
---
 include/linux/u64_stats_sync.h |    7 ++++---
 1 file changed, 4 insertions(+), 3 deletions(-)

--- a/include/linux/u64_stats_sync.h
+++ b/include/linux/u64_stats_sync.h
@@ -115,12 +115,13 @@ static inline void u64_stats_inc(u64_sta
 }
 #endif
 
+#if BITS_PER_LONG == 32 && defined(CONFIG_SMP)
+#define u64_stats_init(syncp)	seqcount_init(&(syncp)->seq)
+#else
 static inline void u64_stats_init(struct u64_stats_sync *syncp)
 {
-#if BITS_PER_LONG == 32 && defined(CONFIG_SMP)
-	seqcount_init(&syncp->seq);
-#endif
 }
+#endif
 
 static inline void u64_stats_update_begin(struct u64_stats_sync *syncp)
 {

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: seqlock lockdep false positives?
  2021-03-09  7:54       ` Peter Zijlstra
@ 2021-03-09 10:12         ` Eric Dumazet
  2021-03-09 14:24           ` Peter Zijlstra
  0 siblings, 1 reply; 8+ messages in thread
From: Eric Dumazet @ 2021-03-09 10:12 UTC (permalink / raw)
  To: Peter Zijlstra, Erhard F.
  Cc: Ahmed S. Darwish, Jakub Kicinski, netdev, Sebastian Andrzej Siewior



On 3/9/21 8:54 AM, Peter Zijlstra wrote:
> On Mon, Mar 08, 2021 at 09:42:08PM +0100, Erhard F. wrote:
> 
>> I can confirm that your patch on top of 5.12-rc2 makes the lockdep
>> splat disappear (Ahmeds' 1st patch not installed).
> 
> Excellent, I'll queue the below in locking/urgent then.
> 
> 
> ---
> Subject: u64_stats,lockdep: Fix u64_stats_init() vs lockdep
> From: Peter Zijlstra <peterz@infradead.org>
> Date: Mon, 8 Mar 2021 09:38:12 +0100
> 
> Jakub reported that:
> 
>     static struct net_device *rtl8139_init_board(struct pci_dev *pdev)
>     {
> 	    ...
> 	    u64_stats_init(&tp->rx_stats.syncp);
> 	    u64_stats_init(&tp->tx_stats.syncp);
> 	    ...
>     }
> 
> results in lockdep getting confused between the RX and TX stats lock.
> This is because u64_stats_init() is an inline calling seqcount_init(),
> which is a macro using a static variable to generate a lockdep class.
> 
> By wrapping that in an inline, we negate the effect of the macro and
> fold the static key variable, hence the confusion.
> 
> Fix by also making u64_stats_init() a macro for the case where it
> matters, leaving the other case an inline for argument validation
> etc.
> 
> Reported-by: Jakub Kicinski <kuba@kernel.org>
> Debugged-by: "Ahmed S. Darwish" <a.darwish@linutronix.de>
> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
> Tested-by: "Erhard F." <erhard_f@mailbox.org>
> Link: https://lkml.kernel.org/r/YEXicy6+9MksdLZh@hirez.programming.kicks-ass.net
> ---
>  include/linux/u64_stats_sync.h |    7 ++++---
>  1 file changed, 4 insertions(+), 3 deletions(-)
> 
> --- a/include/linux/u64_stats_sync.h
> +++ b/include/linux/u64_stats_sync.h
> @@ -115,12 +115,13 @@ static inline void u64_stats_inc(u64_sta
>  }
>  #endif
>  
> +#if BITS_PER_LONG == 32 && defined(CONFIG_SMP)
> +#define u64_stats_init(syncp)	seqcount_init(&(syncp)->seq)
> +#else
>  static inline void u64_stats_init(struct u64_stats_sync *syncp)
>  {
> -#if BITS_PER_LONG == 32 && defined(CONFIG_SMP)
> -	seqcount_init(&syncp->seq);
> -#endif
>  }
> +#endif
>  
>  static inline void u64_stats_update_begin(struct u64_stats_sync *syncp)
>  {
> 

Interesting !

It seems seqcount_latch_init() might benefit from something similar.



^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: seqlock lockdep false positives?
  2021-03-09 10:12         ` Eric Dumazet
@ 2021-03-09 14:24           ` Peter Zijlstra
  0 siblings, 0 replies; 8+ messages in thread
From: Peter Zijlstra @ 2021-03-09 14:24 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: Erhard F.,
	Ahmed S. Darwish, Jakub Kicinski, netdev,
	Sebastian Andrzej Siewior

On Tue, Mar 09, 2021 at 11:12:34AM +0100, Eric Dumazet wrote:
> Interesting !
> 
> It seems seqcount_latch_init() might benefit from something similar.

Indeed so. I've added the below on top.

---
Subject: seqlock,lockdep: Fix seqcount_latch_init()
From: Peter Zijlstra <peterz@infradead.org>
Date: Tue Mar 9 15:21:18 CET 2021

seqcount_init() must be a macro in order to preserve the static
variable that is used for the lockdep key. Don't then wrap it in an
inline function, which destroys that.

Luckily there aren't many users of this function, but fix it before it
becomes a problem.

Reported-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
---
 include/linux/seqlock.h |    5 +----
 1 file changed, 1 insertion(+), 4 deletions(-)

--- a/include/linux/seqlock.h
+++ b/include/linux/seqlock.h
@@ -664,10 +664,7 @@ typedef struct {
  * seqcount_latch_init() - runtime initializer for seqcount_latch_t
  * @s: Pointer to the seqcount_latch_t instance
  */
-static inline void seqcount_latch_init(seqcount_latch_t *s)
-{
-	seqcount_init(&s->seqcount);
-}
+#define seqcount_latch_init(s) seqcount_init(&(s)->seqcount)
 
 /**
  * raw_read_seqcount_latch() - pick even/odd latch data copy

^ permalink raw reply	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2021-03-09 14:25 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-03-04  0:40 seqlock lockdep false positives? Jakub Kicinski
2021-03-07  9:20 ` Ahmed S. Darwish
2021-03-08  8:38   ` Peter Zijlstra
2021-03-08 20:42     ` Erhard F.
2021-03-09  7:54       ` Peter Zijlstra
2021-03-09 10:12         ` Eric Dumazet
2021-03-09 14:24           ` Peter Zijlstra
2021-03-08 18:28   ` Erhard F.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).