[patch 00/13] preempt: Make preempt count unconditional

* [patch 00/13] preempt: Make preempt count unconditional
@ 2020-09-14 20:42 Thomas Gleixner
  2020-09-14 20:42 ` [patch 01/13] lib/debug: Remove pointless ARCH_NO_PREEMPT dependencies Thomas Gleixner
                   ` (14 more replies)
  0 siblings, 15 replies; 50+ messages in thread
From: Thomas Gleixner @ 2020-09-14 20:42 UTC (permalink / raw)
  To: LKML
  Cc: Juri Lelli, Peter Zijlstra, Sebastian Andrzej Siewior,
	Lai Jiangshan, dri-devel, Ben Segall, linux-mm, linux-kselftest,
	linux-hexagon, Will Deacon, Ingo Molnar, Anton Ivanov,
	linux-arch, Vincent Guittot, Brian Cain, Richard Weinberger,
	Russell King, David Airlie, Ingo Molnar, Geert Uytterhoeven,
	Mel Gorman, intel-gfx, Matt Turner, Valentin Schneider,
	linux-xtensa, Shuah Khan, Paul E. McKenney, Jeff Dike, linux-um,
	Josh Triplett, Steven Rostedt, rcu, linux-m68k, Ivan Kokshaysky,
	Rodrigo Vivi, Dietmar Eggemann, linux-arm-kernel,
	Richard Henderson, Chris Zankel, Max Filippov, Linus Torvalds,
	linux-alpha, Mathieu Desnoyers, Andrew Morton,
	Daniel Bristot de Oliveira

Folks!

While working on various preempt count related things, I stumbled (again)
over the inconsistency of our preempt count handling.

The handling of preempt_count() is inconsistent accross kernel
configurations. On kernels which have PREEMPT_COUNT=n
preempt_disable/enable() and the lock/unlock functions are not affecting
the preempt count, only local_bh_disable/enable() and _bh variants of
locking, soft interrupt delivery, hard interrupt and NMI context affect it.

It's therefore impossible to have a consistent set of checks which provide
information about the context in which a function is called. In many cases
it makes sense to have seperate functions for seperate contexts, but there
are valid reasons to avoid that and handle different calling contexts
conditionally.

The lack of such indicators which work on all kernel configuratios is a
constant source of trouble because developers either do not understand the
implications or try to work around this inconsistency in weird
ways. Neither seem these issues be catched by reviewers and testing.

Recently merged code does:

	 gfp = preemptible() ? GFP_KERNEL : GFP_ATOMIC;

Looks obviously correct, except for the fact that preemptible() is
unconditionally false for CONFIF_PREEMPT_COUNT=n, i.e. all allocations in
that code use GFP_ATOMIC on such kernels.

Attempts to make preempt count unconditional and consistent have been
rejected in the past with handwaving performance arguments.

Freshly conducted benchmarks did not reveal any measurable impact from
enabling preempt count unconditionally. On kernels with CONFIG_PREEMPT_NONE
or CONFIG_PREEMPT_VOLUNTARY the preempt count is only incremented and
decremented but the result of the decrement is not tested. Contrary to that
enabling CONFIG_PREEMPT which tests the result has a small but measurable
impact due to the conditional branch/call.

It's about time to make essential functionality of the kernel consistent
accross the various preemption models.

The series is also available from git:

   git://git.kernel.org/pub/scm/linux/kernel/git/tglx/devel.git preempt

That's the first part of a larger effort related to preempt count:

 1) The analysis of the usage sites of in_interrupt(), in_atomic(),
    in_softirq() is still ongoing, but so far the number of buggy users is
    clearly the vast majority. There will be seperate patch series
    (currently 46 and counting) to address these issues once the analysis
    is complete in the next days.

 2) The long discussed state tracking of local irq disable in preempt count
    which accounts interrupt disabled sections as atomic and avoids issuing
    costly instructions (sti, cli, popf or their non X86 counterparts) when
    the state does not change, i.e. nested irq_save() or irq_restore(). I
    have this working on X86 already and contrary to my earlier attempts
    this was reasonably straight forward due to the recent entry/exit code
    consolidation.

    What I've not done yet is to optimize the preempt count handling
    of the [un]lock_irq* operations so they handle the interrupt disabled
    state and the preempt count modification in one go. That's an obvious
    add on, but correctness first ...

 3) Lazy interrupt disabling as a straight forward extension to #2. This
    avoids the actual disabling at the CPU level completely and catches an
    incoming interrupt in the low level entry code, modifies the interrupt
    disabled state on the return stack, notes the interrupt as pending in
    software and raises it again when interrupts are reenabled. This has
    still a few issues which I'm hunting down (cpuidle is unhappy ...)

Thanks,

	tglx
---
 arch/arm/include/asm/assembler.h                                 |   11 --
 arch/arm/kernel/iwmmxt.S                                         |    2 
 arch/arm/mach-ep93xx/crunch-bits.S                               |    2 
 arch/xtensa/kernel/entry.S                                       |    2 
 drivers/gpu/drm/i915/Kconfig.debug                               |    1 
 drivers/gpu/drm/i915/i915_utils.h                                |    3 
 include/linux/bit_spinlock.h                                     |    4 -
 include/linux/lockdep.h                                          |    6 -
 include/linux/pagemap.h                                          |    4 -
 include/linux/preempt.h                                          |   37 +---------
 include/linux/uaccess.h                                          |    6 -
 kernel/Kconfig.preempt                                           |    4 -
 kernel/sched/core.c                                              |    6 -
 lib/Kconfig.debug                                                |    3 
 lib/Kconfig.debug.rej                                            |   14 +--
 tools/testing/selftests/rcutorture/configs/rcu/SRCU-t            |    1 
 tools/testing/selftests/rcutorture/configs/rcu/SRCU-u            |    1 
 tools/testing/selftests/rcutorture/configs/rcu/TINY01            |    1 
 tools/testing/selftests/rcutorture/doc/TINY_RCU.txt              |    5 -
 tools/testing/selftests/rcutorture/doc/TREE_RCU-kconfig.txt      |    1 
 tools/testing/selftests/rcutorture/formal/srcu-cbmc/src/config.h |    1 
 21 files changed, 23 insertions(+), 92 deletions(-)

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 50+ messages in thread