All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH v2 00/11] TRACE_IRQFLAGS wreckage
@ 2020-08-21  8:47 Peter Zijlstra
  2020-08-21  8:47 ` [PATCH v2 01/11] lockdep: Use raw_cpu_*() for per-cpu variables Peter Zijlstra
                   ` (12 more replies)
  0 siblings, 13 replies; 53+ messages in thread
From: Peter Zijlstra @ 2020-08-21  8:47 UTC (permalink / raw)
  To: linux-kernel, mingo, will
  Cc: npiggin, elver, jgross, paulmck, rostedt, rjw, joel, svens, tglx, peterz


TRACE_IRQFLAGS

local_irq_*() keeps a software state that mirrors the hardware state,
used for lockdep, includes tracepoints.

raw_local_irq_*() does not update the software state, no tracing.

---

Problem 1:

	raw_local_irq_save(); // software state on
	local_irq_save(); // software state off
	...
	local_irq_restore(); // software state still off, because we don't enable IRQs
	raw_local_irq_restore(); // software state still off, *whoopsie*

existing instances:

 - lock_acquire()
     raw_local_irq_save()
     __lock_acquire()
       arch_spin_lock(&graph_lock)
         pv_wait() := kvm_wait() (same or worse for Xen/HyperV)
           local_irq_save()

 - trace_clock_global()
     raw_local_irq_save()
     arch_spin_lock()
       pv_wait() := kvm_wait()
	 local_irq_save()

 - apic_retrigger_irq()
     raw_local_irq_save()
     apic->send_IPI() := default_send_IPI_single_phys()
       local_irq_save()

Possible solutions:

 A) make it work by enabling the tracing inside raw_*()
 B) make it work by keeping tracing disabled inside raw_*()
 C) call it broken and clean it up now

Now, given that the only reason to use the raw_* variant is because you don't
want tracing. Therefore A) seems like a weird option (although it can be done).
C) is tempting, but OTOH it ends up converting a _lot_ of code to raw just
because there is one raw user, this strips the validation/tracing off for all
the other users.

So we pick B) and declare any code that ends up doing:

	raw_local_irq_save()
	local_irq_save()
	lockdep_assert_irqs_disabled();

broken. AFAICT this problem has existed forever, the only reason it came
up is because I changed IRQ tracing vs lockdep recursion and the first
instance is fairly common, the other cases hardly ever happen.

---

Problem 2:

	raw_local_irq_save(); // software state on
	trace_*()
          ...
          perf_tp_event()
            ...
            perf_callchain()
	    <#PF>
	      trace_hardirqs_off(); // software state off
	      ...
	      if (regs_irqs_disabled(regs)) // false
	        trace_hardirqs_on();
	    </#PF>
	raw_local_irq_restore(); // software state stays off, *whoopsie*

existing instances:

 - lock_acquire() / lock_release()
     raw_local_irq_save()
     trace_lock_acquire() / trace_lock_release()

 - function tracing

Possible solutions:

 A) fix every architecture's entry code
 B) only fix kernel/entry/common.c
 C) fix lockdep tracepoints and pray

This series does C, AFAICT this problem has existed forever.

---

Problem 3:

	raw_local_irq_save(); // software state on
	<#NMI>
	  trace_hardirqs_off(); // software state off
	  ...
	  if (regs_irqs_disabled(regs)) // false
	    trace_hardirqs_on();
	</#NMI>
	raw_local_irq_restore(); // software state stays off, *whoopsie*

Possible solutions:

This *should* not be a problem if an architecture has it's entry ordering
right. In particular we rely on the architecture doing nmi_enter() before
trace_hardirqs_off().

In that case, in_nmi() will be true, and lockdep_hardirqs_*() should NO-OP,
except if CONFIG_TRACE_IRQFLAGS_NMI (x86).

There might be a problem with using lockdep_assert_irqs_disabled() from NMI
context, if so, those needs a little TLC.

---

The patches in this series do (in reverse order):

 - 2C
 - 1B
 - fix fallout in idle due to the trace_lock_*() tracepoints suddenly
   being visible to rcu-lockdep.

---

Change since -v1:

 - typo (rostedt)
 - split WARN (rostedt)
 - reorder start_critical_section / rcu_idle_enter (rostedt)
 - added arm64 patch (kernel test robot)

---

 arch/arm/mach-omap2/pm34xx.c      |    4 --
 arch/arm64/include/asm/irqflags.h |    5 ++
 arch/arm64/kernel/process.c       |    2 -
 arch/nds32/include/asm/irqflags.h |    5 ++
 arch/powerpc/include/asm/hw_irq.h |   11 ++---
 arch/s390/kernel/idle.c           |    3 -
 arch/x86/entry/thunk_32.S         |    5 --
 arch/x86/include/asm/mmu.h        |    1 
 arch/x86/kernel/process.c         |    4 --
 arch/x86/mm/tlb.c                 |   13 +-----
 drivers/cpuidle/cpuidle.c         |   19 +++++++--
 drivers/idle/intel_idle.c         |   16 --------
 include/linux/cpuidle.h           |   13 +++---
 include/linux/irqflags.h          |   73 ++++++++++++++++++++------------------
 include/linux/lockdep.h           |   18 ++++++---
 include/linux/mmu_context.h       |    5 ++
 kernel/locking/lockdep.c          |   18 +++++----
 kernel/sched/idle.c               |   25 +++++--------
 18 files changed, 118 insertions(+), 122 deletions(-)



^ permalink raw reply	[flat|nested] 53+ messages in thread

end of thread, other threads:[~2020-09-08 20:17 UTC | newest]

Thread overview: 53+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-08-21  8:47 [PATCH v2 00/11] TRACE_IRQFLAGS wreckage Peter Zijlstra
2020-08-21  8:47 ` [PATCH v2 01/11] lockdep: Use raw_cpu_*() for per-cpu variables Peter Zijlstra
2020-08-27  1:05   ` Joel Fernandes
2020-08-27  7:54   ` [tip: locking/core] " tip-bot2 for Peter Zijlstra
2020-08-21  8:47 ` [PATCH v2 02/11] cpuidle: Fixup IRQ state Peter Zijlstra
2020-08-21 17:36   ` Rafael J. Wysocki
2020-08-27  1:06     ` Joel Fernandes
2020-08-27  7:54   ` [tip: locking/core] " tip-bot2 for Peter Zijlstra
2020-08-21  8:47 ` [PATCH v2 03/11] sched,idle,rcu: Push rcu_idle deeper into the idle path Peter Zijlstra
2020-08-27  1:18   ` Joel Fernandes
2020-08-27  1:24     ` Joel Fernandes
2020-08-27  7:47       ` peterz
2020-08-27  7:53         ` Thomas Gleixner
2020-08-27 22:30         ` Joel Fernandes
2020-09-02  7:08           ` peterz
2020-08-27  7:54   ` [tip: locking/core] " tip-bot2 for Peter Zijlstra
2020-08-21  8:47 ` [PATCH v2 04/11] cpuidle: Make CPUIDLE_FLAG_TLB_FLUSHED generic Peter Zijlstra
2020-08-27  7:54   ` [tip: locking/core] " tip-bot2 for Peter Zijlstra
2020-08-21  8:47 ` [PATCH v2 05/11] cpuidle: Move trace_cpu_idle() into generic code Peter Zijlstra
2020-08-27  7:54   ` [tip: locking/core] " tip-bot2 for Peter Zijlstra
2020-08-21  8:47 ` [PATCH v2 06/11] x86/entry: Remove unused THUNKs Peter Zijlstra
2020-08-27  7:54   ` [tip: locking/core] " tip-bot2 for Peter Zijlstra
2020-08-21  8:47 ` [PATCH v2 07/11] locking/lockdep: Cleanup Peter Zijlstra
2020-08-27  7:54   ` [tip: locking/core] " tip-bot2 for Peter Zijlstra
2020-08-21  8:47 ` [PATCH v2 08/11] nds32: Implement arch_irqs_disabled() Peter Zijlstra
2020-08-27  7:54   ` [tip: locking/core] " tip-bot2 for Peter Zijlstra
2020-08-21  8:47 ` [PATCH v2 09/11] arm64: " Peter Zijlstra
2020-08-27  7:54   ` [tip: locking/core] " tip-bot2 for Peter Zijlstra
2020-08-21  8:47 ` [PATCH v2 10/11] lockdep: Only trace IRQ edges Peter Zijlstra
2020-08-26  0:47   ` Thomas Gleixner
2020-09-02  4:21   ` Guenter Roeck
2020-09-02  9:09     ` peterz
2020-09-02  9:12       ` peterz
2020-09-02 13:48         ` Guenter Roeck
2020-09-08 14:22           ` peterz
2020-09-08 14:40             ` Guenter Roeck
2020-09-08 15:41               ` [PATCH] sparc64: Fix irqtrace warnings on Ultra-S peterz
2020-08-21  8:47 ` [PATCH v2 11/11] lockdep,trace: Expose tracepoints Peter Zijlstra
2020-08-27  7:54   ` [tip: locking/core] " tip-bot2 for Peter Zijlstra
2020-09-02  3:51   ` [PATCH v2 11/11] " Guenter Roeck
2020-09-02  7:24     ` peterz
2020-09-02 13:21       ` Guenter Roeck
2020-09-02  8:56     ` peterz
2020-09-02 13:57       ` Guenter Roeck
2020-09-03 14:00         ` peterz
2020-09-03 14:13           ` peterz
2020-09-03 15:19           ` Guenter Roeck
2020-09-03 15:36             ` peterz
2020-09-03 15:54               ` Paul E. McKenney
2020-09-03 18:11               ` Guenter Roeck
2020-08-21 12:58 ` [PATCH v2 00/11] TRACE_IRQFLAGS wreckage peterz
2020-08-26 10:16 ` [PATCH v2 12/11] mips: Implement arch_irqs_disabled() peterz
2020-08-27  7:54   ` [tip: locking/core] " tip-bot2 for Peter Zijlstra

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.