From: Andy Lutomirski <luto@kernel.org>
To: Andrew Morton <akpm@linux-foundation.org>, Linux-MM <linux-mm@kvack.org>
Cc: Nicholas Piggin <npiggin@gmail.com>,
	Anton Blanchard <anton@ozlabs.org>,
	Benjamin Herrenschmidt <benh@kernel.crashing.org>,
	Paul Mackerras <paulus@ozlabs.org>,
	Randy Dunlap <rdunlap@infradead.org>,
	linux-arch <linux-arch@vger.kernel.org>,
	x86@kernel.org, Rik van Riel <riel@surriel.com>,
	Dave Hansen <dave.hansen@intel.com>,
	Peter Zijlstra <peterz@infradead.org>,
	Nadav Amit <nadav.amit@gmail.com>,
	Mathieu Desnoyers <mathieu.desnoyers@efficios.com>,
	Andy Lutomirski <luto@kernel.org>
Subject: [PATCH 00/23] mm, sched: Rework lazy mm handling
Date: Sat,  8 Jan 2022 08:43:45 -0800
Message-ID: <cover.1641659630.git.luto@kernel.org>

Hi all-

Sorry I've been sitting on this so long.  I think it's in decent shape, it
has no *known* bugs, and I think it's time to get the show on the road.
This series needs more eyeballs, too.

The overall point of this series is to get rid of the scalability
problems with mm_count, and my goal is to solve it once and for all,
for all architectures, in a way that doesn't have any gotchas for
unwary users of ->active_mm.

Most of this series is just cleanup, though.  mmgrab(), mmdrop(), and
->active_mm are a mess.  A number of ->active_mm users are simply
wrong.  kthread lazy mm handling is inconsistent with user thread lazy
mm handling (by accident, as far as I can tell).  And membarrier()
relies on the barrier semantics of mmdrop() and mmgrab(), such that
anything that gets rid of those barriers risks breaking membarrier().
x86 is sometimes non-lazy when the core thinks it's lazy because the
core mm code didn't offer any mechanism by which x86 could tell the core
that it's exiting lazy mode.

So most of this series is just cleanup.  Bogus users of ->active_mm
are fixed, and membarrier() is reworked so that its barriers are
explicit instead of depending on mmdrop() and mmgrab().  x86 lazy
handling is extensively tidied up, and x86's EFI mm code gets tidied
up a bit too.  I think I've done this all in a way that introduces
little or no overhead.


Additionally, all the code paths that change current->mm are consolidated
so that there is only one path to start using an mm and only one path
to stop using it.
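
[Illustration] A minimal userspace sketch of what "one path to start
using an mm and one path to stop using it" could look like.  This is not
the code from the series: struct mm, struct task, start_using_mm(),
stop_using_mm() and the sketch_*() callers are hypothetical stand-ins,
and the user count is a plain int with no locking or barriers.

```c
#include <assert.h>
#include <stddef.h>

struct mm { int users; };          /* simplified: a plain user count */
struct task { struct mm *mm; };    /* simplified stand-in for a task */

/* Hypothetical: the only place that starts using an mm. */
static void start_using_mm(struct task *t, struct mm *mm)
{
	assert(t->mm == NULL);     /* caller must have stopped first */
	mm->users++;
	t->mm = mm;
}

/* Hypothetical: the only place that stops using an mm. */
static void stop_using_mm(struct task *t)
{
	struct mm *old = t->mm;

	if (old) {
		t->mm = NULL;
		old->users--;
	}
}

/* Hypothetical callers: each is a thin wrapper around the two helpers. */
static void sketch_exec(struct task *t, struct mm *new_mm)
{
	stop_using_mm(t);
	start_using_mm(t, new_mm);
}

static void sketch_kthread_use_mm(struct task *t, struct mm *mm)
{
	start_using_mm(t, mm);
}

static void sketch_kthread_unuse_mm(struct task *t)
{
	stop_using_mm(t);
}

/* Walk-through; returns 0 if the counts end up as expected. */
int mm_paths_demo(void)
{
	struct mm a = { 0 }, b = { 0 };
	struct task user = { NULL }, kthread = { NULL };

	start_using_mm(&user, &a);
	sketch_exec(&user, &b);               /* a drops to 0, b goes to 1 */
	sketch_kthread_use_mm(&kthread, &b);  /* b goes to 2 */
	sketch_kthread_unuse_mm(&kthread);    /* b back to 1 */

	return (a.users == 0 && b.users == 1 && kthread.mm == NULL) ? 0 : 1;
}
```

With every caller funnelled through the same two helpers, any barrier or
bookkeeping that has to accompany an mm change needs to be written in
only one place.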

Once that's done, the actual meat (the hazard pointers) isn't so bad, and
the x86 optimization on top that should eliminate scanning of remote CPUs
in __mmput() is about two lines of code.  Other architectures with
sufficiently accurate mm_cpumask() tracking should be able to do the same
thing.
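
[Illustration] A simplified userspace model of the hazard-pointer idea,
using C11 atomics.  It is a sketch under stated assumptions, not the
implementation in this series: struct mm, lazy_slot[], HANDED_OFF,
lazy_borrow(), lazy_release() and mm_teardown() are all hypothetical
names, the cpumask field only loosely stands in for mm_cpumask(), and the
memory-ordering details that matter in the kernel are ignored.  The
lazy CPU publishes a per-CPU pointer without touching the reference
count, and teardown converts any surviving pointer into a real reference
that the lazy CPU drops later.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

#define NR_CPUS 4                    /* hypothetical fixed CPU count */
#define HANDED_OFF ((uintptr_t)1)    /* tag bit: a real ref was taken for this CPU */

struct mm {
	atomic_int refs;             /* stand-in for mm_count */
	atomic_uint cpumask;         /* loose stand-in for mm_cpumask() */
	int freed;                   /* set instead of actually freeing */
};

/* One hazard pointer per CPU: the mm this CPU borrows lazily, or 0. */
static atomic_uintptr_t lazy_slot[NR_CPUS];

static void mm_put(struct mm *mm)
{
	if (atomic_fetch_sub(&mm->refs, 1) == 1)
		mm->freed = 1;
}

/*
 * Go lazy on an mm that is known to be alive.  The reference count is
 * not touched; the mask bit models a CPU that has run this mm.
 */
static void lazy_borrow(int cpu, struct mm *mm)
{
	assert(cpu >= 0 && cpu < NR_CPUS);
	atomic_fetch_or(&mm->cpumask, 1u << cpu);
	atomic_store(&lazy_slot[cpu], (uintptr_t)mm);
}

/* Stop being lazy; drop a reference only if teardown handed one over. */
static void lazy_release(int cpu)
{
	uintptr_t old = atomic_exchange(&lazy_slot[cpu], 0);

	if (old & HANDED_OFF)
		mm_put((struct mm *)(old & ~HANDED_OFF));
}

/* Last user goes away: turn surviving hazard pointers into real refs. */
static void mm_teardown(struct mm *mm)
{
	unsigned int mask = atomic_load(&mm->cpumask);

	for (int cpu = 0; cpu < NR_CPUS; cpu++) {
		uintptr_t expected = (uintptr_t)mm;

		if (!(mask & (1u << cpu)))
			continue;       /* CPU never ran this mm: skip it */
		atomic_fetch_add(&mm->refs, 1);
		if (!atomic_compare_exchange_strong(&lazy_slot[cpu], &expected,
						    (uintptr_t)mm | HANDED_OFF))
			mm_put(mm);     /* slot held something else: undo */
	}
	mm_put(mm);                     /* drop the caller's own reference */
}

/* Single-threaded walk-through; returns 0 if each step behaves as described. */
int hazard_demo(void)
{
	struct mm mm = { .refs = 1, .cpumask = 0, .freed = 0 };

	lazy_borrow(2, &mm);
	mm_teardown(&mm);
	if (mm.freed || atomic_load(&mm.refs) != 1)
		return 1;               /* hazard pointer must keep the mm alive */
	lazy_release(2);
	return mm.freed ? 0 : 2;        /* the handed-off ref was the last one */
}
```

In this model the mask check is what keeps teardown from looking at CPUs
that never ran the mm, which is the role the text above gives to accurate
mm_cpumask() tracking.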

akpm, this is intended to mostly replace Nick Piggin's lazy shootdown
series.  This series implements lazy shootdown on x86 implicitly, and
powerpc should be able to do the same thing in just a couple lines
of code if it wants to.  The result is IMO much cleaner and more
maintainable.

Once this is all reviewed, I'm hoping it can go in -tip (and -next) after
the merge window or go in -mm.  This is not intended for v5.16.  I suspect
-tip is easier in case other arch maintainers want to optimize their
code in the same release.

Andy Lutomirski (23):
  membarrier: Document why membarrier() works
  x86/mm: Handle unlazying membarrier core sync in the arch code
  membarrier: Remove membarrier_arch_switch_mm() prototype in core code
  membarrier: Make the post-switch-mm barrier explicit
  membarrier, kthread: Use _ONCE accessors for task->mm
  powerpc/membarrier: Remove special barrier on mm switch
  membarrier: Rewrite sync_core_before_usermode() and improve
    documentation
  membarrier: Remove redundant clear of mm->membarrier_state in
    exec_mmap()
  membarrier: Fix incorrect barrier positions during exec and
    kthread_use_mm()
  x86/events, x86/insn-eval: Remove incorrect active_mm references
  sched/scs: Initialize shadow stack on idle thread bringup, not
    shutdown
  Rework "sched/core: Fix illegal RCU from offline CPUs"
  exec: Remove unnecessary vmacache_seqnum clear in exec_mmap()
  sched, exec: Factor current mm changes out from exec
  kthread: Switch to __change_current_mm()
  sched: Use lightweight hazard pointers to grab lazy mms
  x86/mm: Make use/unuse_temporary_mm() non-static
  x86/mm: Allow temporary mms when IRQs are on
  x86/efi: Make efi_enter/leave_mm use the temporary_mm machinery
  x86/mm: Remove leave_mm() in favor of unlazy_mm_irqs_off()
  x86/mm: Use unlazy_mm_irqs_off() in TLB flush IPIs
  x86/mm: Optimize for_each_possible_lazymm_cpu()
  x86/mm: Opt in to IRQs-off activate_mm()

 .../membarrier-sync-core/arch-support.txt     |  69 +--
 arch/arm/include/asm/membarrier.h             |  21 +
 arch/arm/kernel/smp.c                         |   2 -
 arch/arm64/include/asm/membarrier.h           |  19 +
 arch/arm64/kernel/smp.c                       |   2 -
 arch/csky/kernel/smp.c                        |   2 -
 arch/ia64/kernel/process.c                    |   1 -
 arch/mips/cavium-octeon/smp.c                 |   1 -
 arch/mips/kernel/smp-bmips.c                  |   2 -
 arch/mips/kernel/smp-cps.c                    |   1 -
 arch/mips/loongson64/smp.c                    |   2 -
 arch/powerpc/include/asm/membarrier.h         |  28 +-
 arch/powerpc/mm/mmu_context.c                 |   1 -
 arch/powerpc/platforms/85xx/smp.c             |   2 -
 arch/powerpc/platforms/powermac/smp.c         |   2 -
 arch/powerpc/platforms/powernv/smp.c          |   1 -
 arch/powerpc/platforms/pseries/hotplug-cpu.c  |   2 -
 arch/powerpc/platforms/pseries/pmem.c         |   1 -
 arch/riscv/kernel/cpu-hotplug.c               |   2 -
 arch/s390/kernel/smp.c                        |   1 -
 arch/sh/kernel/smp.c                          |   1 -
 arch/sparc/kernel/smp_64.c                    |   2 -
 arch/x86/Kconfig                              |   2 +-
 arch/x86/events/core.c                        |   9 +-
 arch/x86/include/asm/membarrier.h             |  25 ++
 arch/x86/include/asm/mmu.h                    |   6 +-
 arch/x86/include/asm/mmu_context.h            |  15 +-
 arch/x86/include/asm/sync_core.h              |  20 -
 arch/x86/kernel/alternative.c                 |  67 +--
 arch/x86/kernel/cpu/mce/core.c                |   2 +-
 arch/x86/kernel/smpboot.c                     |   2 -
 arch/x86/lib/insn-eval.c                      |  13 +-
 arch/x86/mm/tlb.c                             | 155 +++++--
 arch/x86/platform/efi/efi_64.c                |   9 +-
 arch/x86/xen/mmu_pv.c                         |   2 +-
 arch/xtensa/kernel/smp.c                      |   1 -
 drivers/cpuidle/cpuidle.c                     |   2 +-
 drivers/idle/intel_idle.c                     |   4 +-
 drivers/misc/sgi-gru/grufault.c               |   2 +-
 drivers/misc/sgi-gru/gruhandles.c             |   2 +-
 drivers/misc/sgi-gru/grukservices.c           |   2 +-
 fs/exec.c                                     |  28 +-
 include/linux/mmu_context.h                   |   4 +-
 include/linux/sched/hotplug.h                 |   6 -
 include/linux/sched/mm.h                      |  58 ++-
 include/linux/sync_core.h                     |  21 -
 init/Kconfig                                  |   3 -
 kernel/cpu.c                                  |  21 +-
 kernel/exit.c                                 |   2 +-
 kernel/fork.c                                 |  11 +
 kernel/kthread.c                              |  50 +--
 kernel/sched/core.c                           | 409 +++++++++++++++---
 kernel/sched/idle.c                           |   1 +
 kernel/sched/membarrier.c                     |  97 ++++-
 kernel/sched/sched.h                          |  11 +-
 55 files changed, 745 insertions(+), 482 deletions(-)
 create mode 100644 arch/arm/include/asm/membarrier.h
 create mode 100644 arch/arm64/include/asm/membarrier.h
 create mode 100644 arch/x86/include/asm/membarrier.h
 delete mode 100644 include/linux/sync_core.h

-- 
2.33.1

