linux-mm.kvack.org archive mirror
* incoming
@ 2021-11-09  2:30 Andrew Morton
  2021-11-09  2:31 ` [patch 01/87] vfs: keep inodes with page cache off the inode shrinker LRU Andrew Morton
                   ` (86 more replies)
  0 siblings, 87 replies; 98+ messages in thread
From: Andrew Morton @ 2021-11-09  2:30 UTC
  To: Linus Torvalds; +Cc: linux-mm, mm-commits

87 patches, based on 8bb7eca972ad531c9b149c0a51ab43a417385813, plus
previously sent material.

Subsystems affected by this patch series:

  mm/pagecache
  mm/hugetlb
  procfs
  misc
  MAINTAINERS
  lib
  checkpatch
  binfmt
  kallsyms
  ramfs
  init
  codafs
  nilfs2
  hfs
  crash_dump
  signals
  seq_file
  fork
  sysvfs
  kcov
  gdb
  resource
  selftests
  ipc

Subsystem: mm/pagecache

    Johannes Weiner <hannes@cmpxchg.org>:
      vfs: keep inodes with page cache off the inode shrinker LRU

Subsystem: mm/hugetlb

    zhangyiru <zhangyiru3@huawei.com>:
      mm,hugetlb: remove mlock ulimit for SHM_HUGETLB

Subsystem: procfs

    Florian Weimer <fweimer@redhat.com>:
      procfs: do not list TID 0 in /proc/<pid>/task

    David Hildenbrand <david@redhat.com>:
      x86/xen: update xen_oldmem_pfn_is_ram() documentation
      x86/xen: simplify xen_oldmem_pfn_is_ram()
      x86/xen: print a warning when HVMOP_get_mem_type fails
      proc/vmcore: let pfn_is_ram() return a bool
      proc/vmcore: convert oldmem_pfn_is_ram callback to more generic vmcore callbacks
      virtio-mem: factor out hotplug specifics from virtio_mem_init() into virtio_mem_init_hotplug()
      virtio-mem: factor out hotplug specifics from virtio_mem_probe() into virtio_mem_init_hotplug()
      virtio-mem: factor out hotplug specifics from virtio_mem_remove() into virtio_mem_deinit_hotplug()
      virtio-mem: kdump mode to sanitize /proc/vmcore access

    Stephen Brennan <stephen.s.brennan@oracle.com>:
      proc: allow pid_revalidate() during LOOKUP_RCU

Subsystem: misc

    Andy Shevchenko <andriy.shevchenko@linux.intel.com>:
    Patch series "kernel.h further split", v5:
      kernel.h: drop unneeded <linux/kernel.h> inclusion from other headers
      kernel.h: split out container_of() and typeof_member() macros
      include/kunit/test.h: replace kernel.h with the necessary inclusions
      include/linux/list.h: replace kernel.h with the necessary inclusions
      include/linux/llist.h: replace kernel.h with the necessary inclusions
      include/linux/plist.h: replace kernel.h with the necessary inclusions
      include/media/media-entity.h: replace kernel.h with the necessary inclusions
      include/linux/delay.h: replace kernel.h with the necessary inclusions
      include/linux/sbitmap.h: replace kernel.h with the necessary inclusions
      include/linux/radix-tree.h: replace kernel.h with the necessary inclusions
      include/linux/generic-radix-tree.h: replace kernel.h with the necessary inclusions

    Stephen Rothwell <sfr@canb.auug.org.au>:
      kernel.h: split out instruction pointer accessors

    Rasmus Villemoes <linux@rasmusvillemoes.dk>:
      linux/container_of.h: switch to static_assert

    Colin Ian King <colin.i.king@googlemail.com>:
      mailmap: update email address for Colin King

Subsystem: MAINTAINERS

    Kees Cook <keescook@chromium.org>:
      MAINTAINERS: add "exec & binfmt" section with myself and Eric

    Lukas Bulwahn <lukas.bulwahn@gmail.com>:
    Patch series "Rectify file references for dt-bindings in MAINTAINERS", v5:
      MAINTAINERS: rectify entry for ARM/TOSHIBA VISCONTI ARCHITECTURE
      MAINTAINERS: rectify entry for HIKEY960 ONBOARD USB GPIO HUB DRIVER
      MAINTAINERS: rectify entry for INTEL KEEM BAY DRM DRIVER
      MAINTAINERS: rectify entry for ALLWINNER HARDWARE SPINLOCK SUPPORT

Subsystem: lib

    Imran Khan <imran.f.khan@oracle.com>:
    Patch series "lib, stackdepot: check stackdepot handle before accessing slabs", v2:
      lib, stackdepot: check stackdepot handle before accessing slabs
      lib, stackdepot: add helper to print stack entries
      lib, stackdepot: add helper to print stack entries into buffer

    Lucas De Marchi <lucas.demarchi@intel.com>:
      include/linux/string_helpers.h: add linux/string.h for strlen()

    Alexey Dobriyan <adobriyan@gmail.com>:
      lib: uninline simple_strntoull() as well

    Thomas Gleixner <tglx@linutronix.de>:
      mm/scatterlist: replace the !preemptible warning in sg_miter_stop()

Subsystem: checkpatch

    Rikard Falkeborn <rikard.falkeborn@gmail.com>:
      const_structs.checkpatch: add a few sound ops structs

    Joe Perches <joe@perches.com>:
      checkpatch: improve EXPORT_SYMBOL test for EXPORT_SYMBOL_NS uses

    Peter Ujfalusi <peter.ujfalusi@linux.intel.com>:
      checkpatch: get default codespell dictionary path from package location

Subsystem: binfmt

    Kees Cook <keescook@chromium.org>:
      binfmt_elf: reintroduce using MAP_FIXED_NOREPLACE

    Alexey Dobriyan <adobriyan@gmail.com>:
      ELF: simplify STACK_ALLOC macro

Subsystem: kallsyms

    Kefeng Wang <wangkefeng.wang@huawei.com>:
    Patch series "sections: Unify kernel sections range check and use", v4:
      kallsyms: remove arch specific text and data check
      kallsyms: fix address-checks for kernel related range
      sections: move and rename core_kernel_data() to is_kernel_core_data()
      sections: move is_kernel_inittext() into sections.h
      x86: mm: rename __is_kernel_text() to is_x86_32_kernel_text()
      sections: provide internal __is_kernel() and __is_kernel_text() helper
      mm: kasan: use is_kernel() helper
      extable: use is_kernel_text() helper
      powerpc/mm: use core_kernel_text() helper
      microblaze: use is_kernel_text() helper
      alpha: use is_kernel_text() helper

Subsystem: ramfs

    yangerkun <yangerkun@huawei.com>:
      ramfs: fix mount source show for ramfs

Subsystem: init

    Andrew Halaney <ahalaney@redhat.com>:
      init: make unknown command line param message clearer

Subsystem: codafs

    Jan Harkes <jaharkes@cs.cmu.edu>:
    Patch series "Coda updates for -next":
      coda: avoid NULL pointer dereference from a bad inode
      coda: check for async upcall request using local state

    Alex Shi <alex.shi@linux.alibaba.com>:
      coda: remove err which no one care

    Jan Harkes <jaharkes@cs.cmu.edu>:
      coda: avoid flagging NULL inodes
      coda: avoid hidden code duplication in rename
      coda: avoid doing bad things on inode type changes during revalidation

    Xiyu Yang <xiyuyang19@fudan.edu.cn>:
      coda: convert from atomic_t to refcount_t on coda_vm_ops->refcnt

    Jing Yangyang <jing.yangyang@zte.com.cn>:
      coda: use vmemdup_user to replace the open code

    Jan Harkes <jaharkes@cs.cmu.edu>:
      coda: bump module version to 7.2

Subsystem: nilfs2

    Qing Wang <wangqing@vivo.com>:
    Patch series "nilfs2 updates":
      nilfs2: replace snprintf in show functions with sysfs_emit

    Ryusuke Konishi <konishi.ryusuke@gmail.com>:
      nilfs2: remove filenames from file comments

Subsystem: hfs

    Arnd Bergmann <arnd@arndb.de>:
      hfs/hfsplus: use WARN_ON for sanity check

Subsystem: crash_dump

    Changcheng Deng <deng.changcheng@zte.com.cn>:
      crash_dump: fix boolreturn.cocci warning

    Ye Guojin <ye.guojin@zte.com.cn>:
      crash_dump: remove duplicate include in crash_dump.h

Subsystem: signals

    Ye Guojin <ye.guojin@zte.com.cn>:
      signal: remove duplicate include in signal.h

Subsystem: seq_file

    Andy Shevchenko <andriy.shevchenko@linux.intel.com>:
      seq_file: move seq_escape() to a header

    Muchun Song <songmuchun@bytedance.com>:
      seq_file: fix passing wrong private data

Subsystem: fork

    Ran Xiaokai <ran.xiaokai@zte.com.cn>:
      kernel/fork.c: unshare(): use swap() to make code cleaner

Subsystem: sysvfs

    Pavel Skripkin <paskripkin@gmail.com>:
      sysv: use BUILD_BUG_ON instead of runtime check

Subsystem: kcov

    Sebastian Andrzej Siewior <bigeasy@linutronix.de>:
    Patch series "kcov: PREEMPT_RT fixup + misc", v2:
      Documentation/kcov: include types.h in the example
      Documentation/kcov: define `ip' in the example
      kcov: allocate per-CPU memory on the relevant node
      kcov: avoid enable+disable interrupts if !in_task()
      kcov: replace local_irq_save() with a local_lock_t

Subsystem: gdb

    Douglas Anderson <dianders@chromium.org>:
      scripts/gdb: handle split debug for vmlinux

Subsystem: resource

    David Hildenbrand <david@redhat.com>:
    Patch series "virtio-mem: disallow mapping virtio-mem memory via /dev/mem", v5:
      kernel/resource: clean up and optimize iomem_is_exclusive()
      kernel/resource: disallow access to exclusive system RAM regions
      virtio-mem: disallow mapping virtio-mem memory via /dev/mem

Subsystem: selftests

    SeongJae Park <sjpark@amazon.de>:
      selftests/kselftest/runner/run_one(): allow running non-executable files

Subsystem: ipc

    Michal Clapinski <mclapinski@google.com>:
      ipc: check checkpoint_restore_ns_capable() to modify C/R proc files

    Manfred Spraul <manfred@colorfullife.com>:
      ipc/ipc_sysctl.c: remove fallback for !CONFIG_PROC_SYSCTL

 .mailmap                                             |    2 
 Documentation/dev-tools/kcov.rst                     |    5 
 MAINTAINERS                                          |   21 +
 arch/alpha/kernel/traps.c                            |    4 
 arch/microblaze/mm/pgtable.c                         |    3 
 arch/powerpc/mm/pgtable_32.c                         |    7 
 arch/riscv/lib/delay.c                               |    4 
 arch/s390/include/asm/facility.h                     |    4 
 arch/x86/kernel/aperture_64.c                        |   13 
 arch/x86/kernel/unwind_orc.c                         |    2 
 arch/x86/mm/init_32.c                                |   14 
 arch/x86/xen/mmu_hvm.c                               |   39 --
 drivers/gpu/drm/drm_dp_mst_topology.c                |    5 
 drivers/gpu/drm/drm_mm.c                             |    5 
 drivers/gpu/drm/i915/i915_vma.c                      |    5 
 drivers/gpu/drm/i915/intel_runtime_pm.c              |   20 -
 drivers/media/dvb-frontends/cxd2880/cxd2880_common.h |    1 
 drivers/virtio/Kconfig                               |    1 
 drivers/virtio/virtio_mem.c                          |  321 +++++++++++++------
 fs/binfmt_elf.c                                      |   33 +
 fs/coda/cnode.c                                      |   13 
 fs/coda/coda_linux.c                                 |   39 +-
 fs/coda/coda_linux.h                                 |    6 
 fs/coda/dir.c                                        |   20 -
 fs/coda/file.c                                       |   12 
 fs/coda/psdev.c                                      |   14 
 fs/coda/upcall.c                                     |    3 
 fs/hfs/inode.c                                       |    6 
 fs/hfsplus/inode.c                                   |   12 
 fs/hugetlbfs/inode.c                                 |   23 -
 fs/inode.c                                           |   46 +-
 fs/internal.h                                        |    1 
 fs/nilfs2/alloc.c                                    |    2 
 fs/nilfs2/alloc.h                                    |    2 
 fs/nilfs2/bmap.c                                     |    2 
 fs/nilfs2/bmap.h                                     |    2 
 fs/nilfs2/btnode.c                                   |    2 
 fs/nilfs2/btnode.h                                   |    2 
 fs/nilfs2/btree.c                                    |    2 
 fs/nilfs2/btree.h                                    |    2 
 fs/nilfs2/cpfile.c                                   |    2 
 fs/nilfs2/cpfile.h                                   |    2 
 fs/nilfs2/dat.c                                      |    2 
 fs/nilfs2/dat.h                                      |    2 
 fs/nilfs2/dir.c                                      |    2 
 fs/nilfs2/direct.c                                   |    2 
 fs/nilfs2/direct.h                                   |    2 
 fs/nilfs2/file.c                                     |    2 
 fs/nilfs2/gcinode.c                                  |    2 
 fs/nilfs2/ifile.c                                    |    2 
 fs/nilfs2/ifile.h                                    |    2 
 fs/nilfs2/inode.c                                    |    2 
 fs/nilfs2/ioctl.c                                    |    2 
 fs/nilfs2/mdt.c                                      |    2 
 fs/nilfs2/mdt.h                                      |    2 
 fs/nilfs2/namei.c                                    |    2 
 fs/nilfs2/nilfs.h                                    |    2 
 fs/nilfs2/page.c                                     |    2 
 fs/nilfs2/page.h                                     |    2 
 fs/nilfs2/recovery.c                                 |    2 
 fs/nilfs2/segbuf.c                                   |    2 
 fs/nilfs2/segbuf.h                                   |    2 
 fs/nilfs2/segment.c                                  |    2 
 fs/nilfs2/segment.h                                  |    2 
 fs/nilfs2/sufile.c                                   |    2 
 fs/nilfs2/sufile.h                                   |    2 
 fs/nilfs2/super.c                                    |    2 
 fs/nilfs2/sysfs.c                                    |   78 ++--
 fs/nilfs2/sysfs.h                                    |    2 
 fs/nilfs2/the_nilfs.c                                |    2 
 fs/nilfs2/the_nilfs.h                                |    2 
 fs/proc/base.c                                       |   21 -
 fs/proc/vmcore.c                                     |  109 ++++--
 fs/ramfs/inode.c                                     |   11 
 fs/seq_file.c                                        |   16 
 fs/sysv/super.c                                      |    6 
 include/asm-generic/sections.h                       |   75 +++-
 include/kunit/test.h                                 |   13 
 include/linux/bottom_half.h                          |    3 
 include/linux/container_of.h                         |   52 ++-
 include/linux/crash_dump.h                           |   30 +
 include/linux/delay.h                                |    2 
 include/linux/fs.h                                   |    1 
 include/linux/fwnode.h                               |    1 
 include/linux/generic-radix-tree.h                   |    3 
 include/linux/hugetlb.h                              |    6 
 include/linux/instruction_pointer.h                  |    8 
 include/linux/kallsyms.h                             |   21 -
 include/linux/kernel.h                               |   39 --
 include/linux/list.h                                 |    4 
 include/linux/llist.h                                |    4 
 include/linux/pagemap.h                              |   50 ++
 include/linux/plist.h                                |    5 
 include/linux/radix-tree.h                           |    4 
 include/linux/rwsem.h                                |    1 
 include/linux/sbitmap.h                              |   11 
 include/linux/seq_file.h                             |   19 +
 include/linux/signal.h                               |    1 
 include/linux/smp.h                                  |    1 
 include/linux/spinlock.h                             |    1 
 include/linux/stackdepot.h                           |    5 
 include/linux/string_helpers.h                       |    1 
 include/media/media-entity.h                         |    3 
 init/main.c                                          |    4 
 ipc/ipc_sysctl.c                                     |   42 +-
 ipc/shm.c                                            |    8 
 kernel/extable.c                                     |   33 -
 kernel/fork.c                                        |    9 
 kernel/kcov.c                                        |   40 +-
 kernel/locking/lockdep.c                             |    3 
 kernel/resource.c                                    |   54 ++-
 kernel/trace/ftrace.c                                |    2 
 lib/scatterlist.c                                    |   11 
 lib/stackdepot.c                                     |   46 ++
 lib/vsprintf.c                                       |    3 
 mm/Kconfig                                           |    7 
 mm/filemap.c                                         |    8 
 mm/kasan/report.c                                    |   17 -
 mm/memfd.c                                           |    4 
 mm/mmap.c                                            |    3 
 mm/page_owner.c                                      |   18 -
 mm/truncate.c                                        |   19 +
 mm/vmscan.c                                          |    7 
 mm/workingset.c                                      |   10 
 net/sysctl_net.c                                     |    2 
 scripts/checkpatch.pl                                |   33 +
 scripts/const_structs.checkpatch                     |    4 
 scripts/gdb/linux/symbols.py                         |    3 
 tools/testing/selftests/kselftest/runner.sh          |   28 +
 tools/testing/selftests/proc/.gitignore              |    1 
 tools/testing/selftests/proc/Makefile                |    2 
 tools/testing/selftests/proc/proc-tid0.c             |   81 ++++
 132 files changed, 1206 insertions(+), 681 deletions(-)




* [patch 01/87] vfs: keep inodes with page cache off the inode shrinker LRU
  2021-11-09  2:30 incoming Andrew Morton
@ 2021-11-09  2:31 ` Andrew Morton
  2021-11-09  2:31 ` [patch 02/87] mm,hugetlb: remove mlock ulimit for SHM_HUGETLB Andrew Morton
                   ` (85 subsequent siblings)
  86 siblings, 0 replies; 98+ messages in thread
From: Andrew Morton @ 2021-11-09  2:31 UTC
  To: akpm, david, guro, hannes, linux-mm, mm-commits, tj, torvalds

From: Johannes Weiner <hannes@cmpxchg.org>
Subject: vfs: keep inodes with page cache off the inode shrinker LRU

Historically (pre-2.5), the inode shrinker used to reclaim only empty
inodes and skip over those that still contained page cache.  This caused
problems on highmem hosts: struct inodes could fill up the lowmem zones
before the cache was getting reclaimed in the highmem zones.

To address this, the inode shrinker started to strip page cache to
facilitate reclaiming lowmem.  However, this comes with its own set of
problems: the shrinkers may drop actively used page cache just because the
inodes are not currently open or dirty - think working with a large git
tree.  It further doesn't respect cgroup memory protection settings and
can cause priority inversions between containers.

Nowadays, the page cache also holds non-resident info for evicted cache
pages in order to detect refaults.  We've come to rely heavily on this
data inside reclaim for protecting the cache workingset and driving swap
behavior.  We also use it to quantify and report workload health through
psi.  The latter in turn is used for fleet health monitoring, as well as
driving automated memory sizing of workloads and containers, proactive
reclaim and memory offloading schemes.

The consequence of dropping page cache prematurely is that we're seeing
subtle and not-so-subtle failures in all of the above-mentioned scenarios,
with the workload generally entering unexpected thrashing states while
losing the ability to reliably detect it.

To fix this on non-highmem systems at least, going back to rotating inodes
on the LRU isn't feasible.  We've tried (commit a76cf1a474d7 ("mm: don't
reclaim inodes with many attached pages")) and failed (commit 69056ee6a8a3
("Revert "mm: don't reclaim inodes with many attached pages"")).  The
issue is mostly that shrinker pools attract pressure based on their size,
and when objects get skipped the shrinkers remember this as deferred
reclaim work.  This accumulates excessive pressure on the remaining
inodes, and we can quickly eat into heavily used ones, or dirty ones that
require IO to reclaim, when there potentially is plenty of cold, clean
cache around still.
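
As a rough illustration of how skipped objects turn into extra pressure,
here is a toy user-space model.  It is only a sketch: struct toy_shrinker,
toy_shrink_round() and the pool/4 scaling are made up for this example and
are not the kernel's do_shrink_slab() arithmetic.

```c
#include <assert.h>

/* Hypothetical toy shrinker state; not a kernel structure. */
struct toy_shrinker {
	unsigned long deferred;	/* scan work carried over from earlier rounds */
};

/*
 * One reclaim round over a pool of `pool` objects, of which `skipped`
 * are walked but refused.  Returns how many objects the shrinker wants
 * to scan this round.  The pool / 4 factor is an arbitrary assumption
 * standing in for "pressure proportional to pool size".
 */
static unsigned long toy_shrink_round(struct toy_shrinker *s,
				      unsigned long pool,
				      unsigned long skipped)
{
	unsigned long target = s->deferred + pool / 4;

	/* Work that was requested but not done is remembered. */
	s->deferred = skipped < target ? skipped : target;
	return target;
}
```

With a pool of 1000 objects and 200 of them skipped every round, the
first round asks for 250 scans and the second for 450, even though the
pool did not grow.  The extra 200 land on whatever remains reclaimable.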

Instead, this patch keeps populated inodes off the inode LRU in the first
place - just like an open file or dirty state would.  An otherwise clean
and unused inode then gets queued when the last cache entry disappears. 
This solves the problem without reintroducing the reclaim issues, and
generally is a bit more scalable than having to wade through potentially
hundreds of thousands of busy inodes.
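
The queueing rule can be sketched in user space as follows.  This is a
simplified model, not the patch itself: struct toy_inode and the toy_*()
helpers are invented for illustration, and the shrinkable test only
covers the "cache is empty" case (the real mapping_shrinkable() also
handles highmem and lone shadow entries).

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for struct inode; not the kernel type. */
struct toy_inode {
	int refcount;			/* models i_count */
	bool dirty;			/* models the dirty state bits */
	unsigned long nr_cached;	/* models entries in the page cache */
	bool on_lru;			/* models inode LRU membership */
};

/* Simplified: only an empty cache counts as shrinkable. */
static bool toy_mapping_shrinkable(const struct toy_inode *inode)
{
	return inode->nr_cached == 0;
}

/* Queue the inode only if nothing pins it. */
static void toy_inode_add_lru(struct toy_inode *inode)
{
	if (inode->dirty || inode->refcount > 0)
		return;
	if (!toy_mapping_shrinkable(inode))
		return;
	inode->on_lru = true;
}

/* Dropping the last reference tries to queue the inode. */
static void toy_iput(struct toy_inode *inode)
{
	if (inode->refcount > 0 && --inode->refcount == 0)
		toy_inode_add_lru(inode);
}

/* Each cache deletion re-checks; the last one queues the inode. */
static void toy_cache_delete(struct toy_inode *inode)
{
	if (inode->nr_cached > 0)
		inode->nr_cached--;
	toy_inode_add_lru(inode);
}
```

In this model, an inode with two cache entries stays off the LRU after
its last reference is dropped and is queued only when the second entry
is deleted.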

Locking is a bit tricky because the locks protecting the inode state
(i_lock) and the inode LRU (lru_list.lock) don't nest inside the irq-safe
page cache lock (i_pages.xa_lock).  Page cache deletions are serialized
through i_lock, taken before the i_pages lock, to make sure depopulated
inodes are queued reliably.  Additions may race with deletions, but we'll
check again in the shrinker.  If additions race with the shrinker itself,
we're protected by the i_lock: if find_inode() or iput() win, the shrinker
will bail on the elevated i_count or I_REFERENCED; if the shrinker wins
and goes ahead with the inode, it will set I_FREEING and inhibit further
igets(), which will cause the other side to create a new instance of the
inode instead.
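
The deletion-side ordering described above can be modelled like this.
It is a single-threaded sketch with made-up names (struct toy_locks,
struct toy_mapping, toy_delete_from_page_cache()); the "locks" only
record whether they are held so that an ordering violation can be
detected, and they provide no real mutual exclusion.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical toy locks that only track their held state. */
struct toy_locks {
	bool i_lock;		/* models inode->i_lock */
	bool i_pages_lock;	/* models the irq-safe i_pages.xa_lock */
	bool order_ok;		/* cleared if i_lock is taken inside i_pages */
};

static void toy_lock_i_lock(struct toy_locks *l)
{
	if (l->i_pages_lock)	/* i_lock must not nest inside i_pages */
		l->order_ok = false;
	l->i_lock = true;
}

static void toy_unlock_i_lock(struct toy_locks *l)
{
	l->i_lock = false;
}

static void toy_lock_i_pages(struct toy_locks *l)
{
	l->i_pages_lock = true;
}

static void toy_unlock_i_pages(struct toy_locks *l)
{
	l->i_pages_lock = false;
}

/* Hypothetical mapping: lock state plus a cache entry count. */
struct toy_mapping {
	struct toy_locks locks;
	unsigned long nr_cached;
	bool on_lru;
};

/* Deletion path: i_lock outermost, LRU check after i_pages is dropped. */
static void toy_delete_from_page_cache(struct toy_mapping *m)
{
	toy_lock_i_lock(&m->locks);
	toy_lock_i_pages(&m->locks);
	if (m->nr_cached > 0)
		m->nr_cached--;
	toy_unlock_i_pages(&m->locks);
	/* Still under i_lock, so the emptiness check and queueing pair up. */
	if (m->nr_cached == 0)
		m->on_lru = true;
	toy_unlock_i_lock(&m->locks);
}
```

Holding the outer lock across both the deletion and the emptiness check
is what lets the last deleter reliably queue the inode; the check runs
after the inner lock is released because the LRU lock cannot nest
inside it.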

Link: https://lkml.kernel.org/r/20210614211904.14420-4-hannes@cmpxchg.org
Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Cc: Roman Gushchin <guro@fb.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: Dave Chinner <david@fromorbit.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 fs/inode.c              |   46 ++++++++++++++++++++--------------
 fs/internal.h           |    1 
 include/linux/fs.h      |    1 
 include/linux/pagemap.h |   50 ++++++++++++++++++++++++++++++++++++++
 mm/filemap.c            |    8 ++++++
 mm/truncate.c           |   19 ++++++++++++--
 mm/vmscan.c             |    7 +++++
 mm/workingset.c         |   10 +++++++
 8 files changed, 120 insertions(+), 22 deletions(-)

--- a/fs/inode.c~vfs-keep-inodes-with-page-cache-off-the-inode-shrinker-lru
+++ a/fs/inode.c
@@ -428,11 +428,20 @@ void ihold(struct inode *inode)
 }
 EXPORT_SYMBOL(ihold);
 
-static void inode_lru_list_add(struct inode *inode)
+static void __inode_add_lru(struct inode *inode, bool rotate)
 {
+	if (inode->i_state & (I_DIRTY_ALL | I_SYNC | I_FREEING | I_WILL_FREE))
+		return;
+	if (atomic_read(&inode->i_count))
+		return;
+	if (!(inode->i_sb->s_flags & SB_ACTIVE))
+		return;
+	if (!mapping_shrinkable(&inode->i_data))
+		return;
+
 	if (list_lru_add(&inode->i_sb->s_inode_lru, &inode->i_lru))
 		this_cpu_inc(nr_unused);
-	else
+	else if (rotate)
 		inode->i_state |= I_REFERENCED;
 }
 
@@ -443,16 +452,11 @@ static void inode_lru_list_add(struct in
  */
 void inode_add_lru(struct inode *inode)
 {
-	if (!(inode->i_state & (I_DIRTY_ALL | I_SYNC |
-				I_FREEING | I_WILL_FREE)) &&
-	    !atomic_read(&inode->i_count) && inode->i_sb->s_flags & SB_ACTIVE)
-		inode_lru_list_add(inode);
+	__inode_add_lru(inode, false);
 }
 
-
 static void inode_lru_list_del(struct inode *inode)
 {
-
 	if (list_lru_del(&inode->i_sb->s_inode_lru, &inode->i_lru))
 		this_cpu_dec(nr_unused);
 }
@@ -728,10 +732,6 @@ again:
 /*
  * Isolate the inode from the LRU in preparation for freeing it.
  *
- * Any inodes which are pinned purely because of attached pagecache have their
- * pagecache removed.  If the inode has metadata buffers attached to
- * mapping->private_list then try to remove them.
- *
  * If the inode has the I_REFERENCED flag set, then it means that it has been
  * used recently - the flag is set in iput_final(). When we encounter such an
  * inode, clear the flag and move it to the back of the LRU so it gets another
@@ -747,31 +747,39 @@ static enum lru_status inode_lru_isolate
 	struct inode	*inode = container_of(item, struct inode, i_lru);
 
 	/*
-	 * we are inverting the lru lock/inode->i_lock here, so use a trylock.
-	 * If we fail to get the lock, just skip it.
+	 * We are inverting the lru lock/inode->i_lock here, so use a
+	 * trylock. If we fail to get the lock, just skip it.
 	 */
 	if (!spin_trylock(&inode->i_lock))
 		return LRU_SKIP;
 
 	/*
-	 * Referenced or dirty inodes are still in use. Give them another pass
-	 * through the LRU as we canot reclaim them now.
+	 * Inodes can get referenced, redirtied, or repopulated while
+	 * they're already on the LRU, and this can make them
+	 * unreclaimable for a while. Remove them lazily here; iput,
+	 * sync, or the last page cache deletion will requeue them.
 	 */
 	if (atomic_read(&inode->i_count) ||
-	    (inode->i_state & ~I_REFERENCED)) {
+	    (inode->i_state & ~I_REFERENCED) ||
+	    !mapping_shrinkable(&inode->i_data)) {
 		list_lru_isolate(lru, &inode->i_lru);
 		spin_unlock(&inode->i_lock);
 		this_cpu_dec(nr_unused);
 		return LRU_REMOVED;
 	}
 
-	/* recently referenced inodes get one more pass */
+	/* Recently referenced inodes get one more pass */
 	if (inode->i_state & I_REFERENCED) {
 		inode->i_state &= ~I_REFERENCED;
 		spin_unlock(&inode->i_lock);
 		return LRU_ROTATE;
 	}
 
+	/*
+	 * On highmem systems, mapping_shrinkable() permits dropping
+	 * page cache in order to free up struct inodes: lowmem might
+	 * be under pressure before the cache inside the highmem zone.
+	 */
 	if (inode_has_buffers(inode) || !mapping_empty(&inode->i_data)) {
 		__iget(inode);
 		spin_unlock(&inode->i_lock);
@@ -1638,7 +1646,7 @@ static void iput_final(struct inode *ino
 	if (!drop &&
 	    !(inode->i_state & I_DONTCACHE) &&
 	    (sb->s_flags & SB_ACTIVE)) {
-		inode_add_lru(inode);
+		__inode_add_lru(inode, true);
 		spin_unlock(&inode->i_lock);
 		return;
 	}
--- a/fs/internal.h~vfs-keep-inodes-with-page-cache-off-the-inode-shrinker-lru
+++ a/fs/internal.h
@@ -149,7 +149,6 @@ extern int vfs_open(const struct path *,
  * inode.c
  */
 extern long prune_icache_sb(struct super_block *sb, struct shrink_control *sc);
-extern void inode_add_lru(struct inode *inode);
 extern int dentry_needs_remove_privs(struct dentry *dentry);
 
 /*
--- a/include/linux/fs.h~vfs-keep-inodes-with-page-cache-off-the-inode-shrinker-lru
+++ a/include/linux/fs.h
@@ -3193,6 +3193,7 @@ static inline void remove_inode_hash(str
 }
 
 extern void inode_sb_list_add(struct inode *inode);
+extern void inode_add_lru(struct inode *inode);
 
 extern int sb_set_blocksize(struct super_block *, int);
 extern int sb_min_blocksize(struct super_block *, int);
--- a/include/linux/pagemap.h~vfs-keep-inodes-with-page-cache-off-the-inode-shrinker-lru
+++ a/include/linux/pagemap.h
@@ -24,6 +24,56 @@ static inline bool mapping_empty(struct
 }
 
 /*
+ * mapping_shrinkable - test if page cache state allows inode reclaim
+ * @mapping: the page cache mapping
+ *
+ * This checks the mapping's cache state for the purpose of inode
+ * reclaim and LRU management.
+ *
+ * The caller is expected to hold the i_lock, but is not required to
+ * hold the i_pages lock, which usually protects cache state. That's
+ * because the i_lock and the list_lru lock that protect the inode and
+ * its LRU state don't nest inside the irq-safe i_pages lock.
+ *
+ * Cache deletions are performed under the i_lock, which ensures that
+ * when an inode goes empty, it will reliably get queued on the LRU.
+ *
+ * Cache additions do not acquire the i_lock and may race with this
+ * check, in which case we'll report the inode as shrinkable when it
+ * has cache pages. This is okay: the shrinker also checks the
+ * refcount and the referenced bit, which will be elevated or set in
+ * the process of adding new cache pages to an inode.
+ */
+static inline bool mapping_shrinkable(struct address_space *mapping)
+{
+	void *head;
+
+	/*
+	 * On highmem systems, there could be lowmem pressure from the
+	 * inodes before there is highmem pressure from the page
+	 * cache. Make inodes shrinkable regardless of cache state.
+	 */
+	if (IS_ENABLED(CONFIG_HIGHMEM))
+		return true;
+
+	/* Cache completely empty? Shrink away. */
+	head = rcu_access_pointer(mapping->i_pages.xa_head);
+	if (!head)
+		return true;
+
+	/*
+	 * The xarray stores single offset-0 entries directly in the
+	 * head pointer, which allows non-resident page cache entries
+	 * to escape the shadow shrinker's list of xarray nodes. The
+	 * inode shrinker needs to pick them up under memory pressure.
+	 */
+	if (!xa_is_node(head) && xa_is_value(head))
+		return true;
+
+	return false;
+}
+
+/*
  * Bits in mapping->flags.
  */
 enum mapping_flags {
--- a/mm/filemap.c~vfs-keep-inodes-with-page-cache-off-the-inode-shrinker-lru
+++ a/mm/filemap.c
@@ -262,9 +262,13 @@ void delete_from_page_cache(struct page
 	struct address_space *mapping = page_mapping(page);
 
 	BUG_ON(!PageLocked(page));
+	spin_lock(&mapping->host->i_lock);
 	xa_lock_irq(&mapping->i_pages);
 	__delete_from_page_cache(page, NULL);
 	xa_unlock_irq(&mapping->i_pages);
+	if (mapping_shrinkable(mapping))
+		inode_add_lru(mapping->host);
+	spin_unlock(&mapping->host->i_lock);
 
 	page_cache_free_page(mapping, page);
 }
@@ -340,6 +344,7 @@ void delete_from_page_cache_batch(struct
 	if (!pagevec_count(pvec))
 		return;
 
+	spin_lock(&mapping->host->i_lock);
 	xa_lock_irq(&mapping->i_pages);
 	for (i = 0; i < pagevec_count(pvec); i++) {
 		trace_mm_filemap_delete_from_page_cache(pvec->pages[i]);
@@ -348,6 +353,9 @@ void delete_from_page_cache_batch(struct
 	}
 	page_cache_delete_batch(mapping, pvec);
 	xa_unlock_irq(&mapping->i_pages);
+	if (mapping_shrinkable(mapping))
+		inode_add_lru(mapping->host);
+	spin_unlock(&mapping->host->i_lock);
 
 	for (i = 0; i < pagevec_count(pvec); i++)
 		page_cache_free_page(mapping, pvec->pages[i]);
--- a/mm/truncate.c~vfs-keep-inodes-with-page-cache-off-the-inode-shrinker-lru
+++ a/mm/truncate.c
@@ -45,9 +45,13 @@ static inline void __clear_shadow_entry(
 static void clear_shadow_entry(struct address_space *mapping, pgoff_t index,
 			       void *entry)
 {
+	spin_lock(&mapping->host->i_lock);
 	xa_lock_irq(&mapping->i_pages);
 	__clear_shadow_entry(mapping, index, entry);
 	xa_unlock_irq(&mapping->i_pages);
+	if (mapping_shrinkable(mapping))
+		inode_add_lru(mapping->host);
+	spin_unlock(&mapping->host->i_lock);
 }
 
 /*
@@ -73,8 +77,10 @@ static void truncate_exceptional_pvec_en
 		return;
 
 	dax = dax_mapping(mapping);
-	if (!dax)
+	if (!dax) {
+		spin_lock(&mapping->host->i_lock);
 		xa_lock_irq(&mapping->i_pages);
+	}
 
 	for (i = j; i < pagevec_count(pvec); i++) {
 		struct page *page = pvec->pages[i];
@@ -93,8 +99,12 @@ static void truncate_exceptional_pvec_en
 		__clear_shadow_entry(mapping, index, page);
 	}
 
-	if (!dax)
+	if (!dax) {
 		xa_unlock_irq(&mapping->i_pages);
+		if (mapping_shrinkable(mapping))
+			inode_add_lru(mapping->host);
+		spin_unlock(&mapping->host->i_lock);
+	}
 	pvec->nr = j;
 }
 
@@ -567,6 +577,7 @@ invalidate_complete_page2(struct address
 	if (page_has_private(page) && !try_to_release_page(page, GFP_KERNEL))
 		return 0;
 
+	spin_lock(&mapping->host->i_lock);
 	xa_lock_irq(&mapping->i_pages);
 	if (PageDirty(page))
 		goto failed;
@@ -574,6 +585,9 @@ invalidate_complete_page2(struct address
 	BUG_ON(page_has_private(page));
 	__delete_from_page_cache(page, NULL);
 	xa_unlock_irq(&mapping->i_pages);
+	if (mapping_shrinkable(mapping))
+		inode_add_lru(mapping->host);
+	spin_unlock(&mapping->host->i_lock);
 
 	if (mapping->a_ops->freepage)
 		mapping->a_ops->freepage(page);
@@ -582,6 +596,7 @@ invalidate_complete_page2(struct address
 	return 1;
 failed:
 	xa_unlock_irq(&mapping->i_pages);
+	spin_unlock(&mapping->host->i_lock);
 	return 0;
 }
 
--- a/mm/vmscan.c~vfs-keep-inodes-with-page-cache-off-the-inode-shrinker-lru
+++ a/mm/vmscan.c
@@ -1190,6 +1190,8 @@ static int __remove_mapping(struct addre
 	BUG_ON(!PageLocked(page));
 	BUG_ON(mapping != page_mapping(page));
 
+	if (!PageSwapCache(page))
+		spin_lock(&mapping->host->i_lock);
 	xa_lock_irq(&mapping->i_pages);
 	/*
 	 * The non racy check for a busy page.
@@ -1258,6 +1260,9 @@ static int __remove_mapping(struct addre
 			shadow = workingset_eviction(page, target_memcg);
 		__delete_from_page_cache(page, shadow);
 		xa_unlock_irq(&mapping->i_pages);
+		if (mapping_shrinkable(mapping))
+			inode_add_lru(mapping->host);
+		spin_unlock(&mapping->host->i_lock);
 
 		if (freepage != NULL)
 			freepage(page);
@@ -1267,6 +1272,8 @@ static int __remove_mapping(struct addre
 
 cannot_free:
 	xa_unlock_irq(&mapping->i_pages);
+	if (!PageSwapCache(page))
+		spin_unlock(&mapping->host->i_lock);
 	return 0;
 }
 
--- a/mm/workingset.c~vfs-keep-inodes-with-page-cache-off-the-inode-shrinker-lru
+++ a/mm/workingset.c
@@ -543,6 +543,13 @@ static enum lru_status shadow_lru_isolat
 		goto out;
 	}
 
+	if (!spin_trylock(&mapping->host->i_lock)) {
+		xa_unlock(&mapping->i_pages);
+		spin_unlock_irq(lru_lock);
+		ret = LRU_RETRY;
+		goto out;
+	}
+
 	list_lru_isolate(lru, item);
 	__dec_lruvec_kmem_state(node, WORKINGSET_NODES);
 
@@ -562,6 +569,9 @@ static enum lru_status shadow_lru_isolat
 
 out_invalid:
 	xa_unlock_irq(&mapping->i_pages);
+	if (mapping_shrinkable(mapping))
+		inode_add_lru(mapping->host);
+	spin_unlock(&mapping->host->i_lock);
 	ret = LRU_REMOVED_RETRY;
 out:
 	cond_resched();
_



* [patch 02/87] mm,hugetlb: remove mlock ulimit for SHM_HUGETLB
  2021-11-09  2:30 incoming Andrew Morton
  2021-11-09  2:31 ` [patch 01/87] vfs: keep inodes with page cache off the inode shrinker LRU Andrew Morton
@ 2021-11-09  2:31 ` Andrew Morton
  2021-11-09  2:31 ` [patch 03/87] procfs: do not list TID 0 in /proc/<pid>/task Andrew Morton
                   ` (84 subsequent siblings)
  86 siblings, 0 replies; 98+ messages in thread
From: Andrew Morton @ 2021-11-09  2:31 UTC (permalink / raw)
  To: akpm, hughd, linux-mm, liuzixian4, mhocko, mike.kravetz,
	mm-commits, torvalds, wuxu.wu, zhangyiru3

From: zhangyiru <zhangyiru3@huawei.com>
Subject: mm,hugetlb: remove mlock ulimit for SHM_HUGETLB

Commit 21a3c273f88c9cbbaf7e ("mm, hugetlb: add thread name and pid to
SHM_HUGETLB mlock rlimit warning") marked this as deprecated in 2012, but
it has not been removed yet.

Mike says he still sees that message in log files on occasion, so keep
the warning (now worded as "obsolete") while failing the request with
-EPERM.

Also remove the hugetlbfs-related user_shm_unlock() calls in ipc/shm.c and
the user_shm_unlock() after the "out" label in hugetlb_file_setup().

Link: https://lkml.kernel.org/r/20211103105857.25041-1-zhangyiru3@huawei.com
Signed-off-by: zhangyiru <zhangyiru3@huawei.com>
Reviewed-by: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Liu Zixian <liuzixian4@huawei.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: wuxu.wu <wuxu.wu@huawei.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 fs/hugetlbfs/inode.c    |   23 ++++++++---------------
 include/linux/hugetlb.h |    6 ++----
 ipc/shm.c               |    8 +-------
 mm/memfd.c              |    4 +---
 mm/mmap.c               |    3 +--
 5 files changed, 13 insertions(+), 31 deletions(-)

--- a/fs/hugetlbfs/inode.c~mmhugetlb-remove-mlock-ulimit-for-shm_hugetlb
+++ a/fs/hugetlbfs/inode.c
@@ -1446,8 +1446,8 @@ static int get_hstate_idx(int page_size_
  * otherwise hugetlb_reserve_pages reserves one less hugepages than intended.
  */
 struct file *hugetlb_file_setup(const char *name, size_t size,
-				vm_flags_t acctflag, struct ucounts **ucounts,
-				int creat_flags, int page_size_log)
+				vm_flags_t acctflag, int creat_flags,
+				int page_size_log)
 {
 	struct inode *inode;
 	struct vfsmount *mnt;
@@ -1458,22 +1458,19 @@ struct file *hugetlb_file_setup(const ch
 	if (hstate_idx < 0)
 		return ERR_PTR(-ENODEV);
 
-	*ucounts = NULL;
 	mnt = hugetlbfs_vfsmount[hstate_idx];
 	if (!mnt)
 		return ERR_PTR(-ENOENT);
 
 	if (creat_flags == HUGETLB_SHMFS_INODE && !can_do_hugetlb_shm()) {
-		*ucounts = current_ucounts();
-		if (user_shm_lock(size, *ucounts)) {
-			task_lock(current);
-			pr_warn_once("%s (%d): Using mlock ulimits for SHM_HUGETLB is deprecated\n",
+		struct ucounts *ucounts = current_ucounts();
+
+		if (user_shm_lock(size, ucounts)) {
+			pr_warn_once("%s (%d): Using mlock ulimits for SHM_HUGETLB is obsolete\n",
 				current->comm, current->pid);
-			task_unlock(current);
-		} else {
-			*ucounts = NULL;
-			return ERR_PTR(-EPERM);
+			user_shm_unlock(size, ucounts);
 		}
+		return ERR_PTR(-EPERM);
 	}
 
 	file = ERR_PTR(-ENOSPC);
@@ -1498,10 +1495,6 @@ struct file *hugetlb_file_setup(const ch
 
 	iput(inode);
 out:
-	if (*ucounts) {
-		user_shm_unlock(size, *ucounts);
-		*ucounts = NULL;
-	}
 	return file;
 }
 
--- a/include/linux/hugetlb.h~mmhugetlb-remove-mlock-ulimit-for-shm_hugetlb
+++ a/include/linux/hugetlb.h
@@ -477,8 +477,7 @@ static inline struct hugetlbfs_inode_inf
 extern const struct file_operations hugetlbfs_file_operations;
 extern const struct vm_operations_struct hugetlb_vm_ops;
 struct file *hugetlb_file_setup(const char *name, size_t size, vm_flags_t acct,
-				struct ucounts **ucounts, int creat_flags,
-				int page_size_log);
+				int creat_flags, int page_size_log);
 
 static inline bool is_file_hugepages(struct file *file)
 {
@@ -497,8 +496,7 @@ static inline struct hstate *hstate_inod
 #define is_file_hugepages(file)			false
 static inline struct file *
 hugetlb_file_setup(const char *name, size_t size, vm_flags_t acctflag,
-		struct ucounts **ucounts, int creat_flags,
-		int page_size_log)
+		int creat_flags, int page_size_log)
 {
 	return ERR_PTR(-ENOSYS);
 }
--- a/ipc/shm.c~mmhugetlb-remove-mlock-ulimit-for-shm_hugetlb
+++ a/ipc/shm.c
@@ -287,9 +287,6 @@ static void shm_destroy(struct ipc_names
 	shm_unlock(shp);
 	if (!is_file_hugepages(shm_file))
 		shmem_lock(shm_file, 0, shp->mlock_ucounts);
-	else if (shp->mlock_ucounts)
-		user_shm_unlock(i_size_read(file_inode(shm_file)),
-				shp->mlock_ucounts);
 	fput(shm_file);
 	ipc_update_pid(&shp->shm_cprid, NULL);
 	ipc_update_pid(&shp->shm_lprid, NULL);
@@ -650,8 +647,7 @@ static int newseg(struct ipc_namespace *
 		if (shmflg & SHM_NORESERVE)
 			acctflag = VM_NORESERVE;
 		file = hugetlb_file_setup(name, hugesize, acctflag,
-				  &shp->mlock_ucounts, HUGETLB_SHMFS_INODE,
-				(shmflg >> SHM_HUGE_SHIFT) & SHM_HUGE_MASK);
+				HUGETLB_SHMFS_INODE, (shmflg >> SHM_HUGE_SHIFT) & SHM_HUGE_MASK);
 	} else {
 		/*
 		 * Do not allow no accounting for OVERCOMMIT_NEVER, even
@@ -698,8 +694,6 @@ static int newseg(struct ipc_namespace *
 no_id:
 	ipc_update_pid(&shp->shm_cprid, NULL);
 	ipc_update_pid(&shp->shm_lprid, NULL);
-	if (is_file_hugepages(file) && shp->mlock_ucounts)
-		user_shm_unlock(size, shp->mlock_ucounts);
 	fput(file);
 	ipc_rcu_putref(&shp->shm_perm, shm_rcu_free);
 	return error;
--- a/mm/memfd.c~mmhugetlb-remove-mlock-ulimit-for-shm_hugetlb
+++ a/mm/memfd.c
@@ -297,9 +297,7 @@ SYSCALL_DEFINE2(memfd_create,
 	}
 
 	if (flags & MFD_HUGETLB) {
-		struct ucounts *ucounts = NULL;
-
-		file = hugetlb_file_setup(name, 0, VM_NORESERVE, &ucounts,
+		file = hugetlb_file_setup(name, 0, VM_NORESERVE,
 					HUGETLB_ANONHUGE_INODE,
 					(flags >> MFD_HUGE_SHIFT) &
 					MFD_HUGE_MASK);
--- a/mm/mmap.c~mmhugetlb-remove-mlock-ulimit-for-shm_hugetlb
+++ a/mm/mmap.c
@@ -1599,7 +1599,6 @@ unsigned long ksys_mmap_pgoff(unsigned l
 			goto out_fput;
 		}
 	} else if (flags & MAP_HUGETLB) {
-		struct ucounts *ucounts = NULL;
 		struct hstate *hs;
 
 		hs = hstate_sizelog((flags >> MAP_HUGE_SHIFT) & MAP_HUGE_MASK);
@@ -1615,7 +1614,7 @@ unsigned long ksys_mmap_pgoff(unsigned l
 		 */
 		file = hugetlb_file_setup(HUGETLB_ANON_FILE, len,
 				VM_NORESERVE,
-				&ucounts, HUGETLB_ANONHUGE_INODE,
+				HUGETLB_ANONHUGE_INODE,
 				(flags >> MAP_HUGE_SHIFT) & MAP_HUGE_MASK);
 		if (IS_ERR(file))
 			return PTR_ERR(file);
_



* [patch 03/87] procfs: do not list TID 0 in /proc/<pid>/task
  2021-11-09  2:30 incoming Andrew Morton
  2021-11-09  2:31 ` [patch 01/87] vfs: keep inodes with page cache off the inode shrinker LRU Andrew Morton
  2021-11-09  2:31 ` [patch 02/87] mm,hugetlb: remove mlock ulimit for SHM_HUGETLB Andrew Morton
@ 2021-11-09  2:31 ` Andrew Morton
  2021-11-09  2:31 ` [patch 04/87] x86/xen: update xen_oldmem_pfn_is_ram() documentation Andrew Morton
                   ` (83 subsequent siblings)
  86 siblings, 0 replies; 98+ messages in thread
From: Andrew Morton @ 2021-11-09  2:31 UTC (permalink / raw)
  To: adobriyan, akpm, christian.brauner, ebiederm, fweimer, keescook,
	linux-mm, mm-commits, torvalds

From: Florian Weimer <fweimer@redhat.com>
Subject: procfs: do not list TID 0 in /proc/<pid>/task

If a task exits concurrently, task_pid_nr_ns() may return 0.  Skip such
tasks instead of listing a bogus "0" entry.
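
For illustration only (not part of the patch): a minimal userspace sketch
of the skip, assuming a plain array in which a zero entry stands for a task
that exited before its TID could be resolved.  The function name
list_tids() and its parameters are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/*
 * Formats every non-zero TID into out[] and returns how many names were
 * written.  A zero entry models a task that has just exited, so it is
 * skipped rather than emitted as "0".
 */
size_t list_tids(const unsigned int *tids, size_t n,
		 char out[][11], size_t out_cap)
{
	size_t written = 0;

	for (size_t i = 0; i < n && written < out_cap; i++) {
		if (!tids[i])
			continue;	/* hypothetical "just exited" marker */
		snprintf(out[written], sizeof(out[written]), "%u", tids[i]);
		written++;
	}
	return written;
}
```

Feeding it { 101, 0, 103 } yields two names, "101" and "103"; the zero
entry never reaches the output.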

[akpm@linux-foundation.org: coding style tweaks]
[adobriyan@gmail.com: test that /proc/*/task doesn't contain "0"]
  Link: https://lkml.kernel.org/r/YV88AnVzHxPafQ9o@localhost.localdomain
Link: https://lkml.kernel.org/r/8735pn5dx7.fsf@oldenburg.str.redhat.com
Signed-off-by: Florian Weimer <fweimer@redhat.com>
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Acked-by: Christian Brauner <christian.brauner@ubuntu.com>
Reviewed-by: Alexey Dobriyan <adobriyan@gmail.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 fs/proc/base.c                           |    3 
 tools/testing/selftests/proc/.gitignore  |    1 
 tools/testing/selftests/proc/Makefile    |    2 
 tools/testing/selftests/proc/proc-tid0.c |   81 +++++++++++++++++++++
 4 files changed, 87 insertions(+)

--- a/fs/proc/base.c~procfs-do-not-list-tid-0-in-proc-pid-task
+++ a/fs/proc/base.c
@@ -3799,7 +3799,10 @@ static int proc_task_readdir(struct file
 	     task = next_tid(task), ctx->pos++) {
 		char name[10 + 1];
 		unsigned int len;
+
 		tid = task_pid_nr_ns(task, ns);
+		if (!tid)
+			continue;	/* The task has just exited. */
 		len = snprintf(name, sizeof(name), "%u", tid);
 		if (!proc_fill_cache(file, ctx, name, len,
 				proc_task_instantiate, task, NULL)) {
--- a/tools/testing/selftests/proc/.gitignore~procfs-do-not-list-tid-0-in-proc-pid-task
+++ a/tools/testing/selftests/proc/.gitignore
@@ -11,6 +11,7 @@
 /proc-self-syscall
 /proc-self-wchan
 /proc-subset-pid
+/proc-tid0
 /proc-uptime-001
 /proc-uptime-002
 /read
--- a/tools/testing/selftests/proc/Makefile~procfs-do-not-list-tid-0-in-proc-pid-task
+++ a/tools/testing/selftests/proc/Makefile
@@ -1,6 +1,7 @@
 # SPDX-License-Identifier: GPL-2.0-only
 CFLAGS += -Wall -O2 -Wno-unused-function
 CFLAGS += -D_GNU_SOURCE
+LDFLAGS += -pthread
 
 TEST_GEN_PROGS :=
 TEST_GEN_PROGS += fd-001-lookup
@@ -13,6 +14,7 @@ TEST_GEN_PROGS += proc-self-map-files-00
 TEST_GEN_PROGS += proc-self-syscall
 TEST_GEN_PROGS += proc-self-wchan
 TEST_GEN_PROGS += proc-subset-pid
+TEST_GEN_PROGS += proc-tid0
 TEST_GEN_PROGS += proc-uptime-001
 TEST_GEN_PROGS += proc-uptime-002
 TEST_GEN_PROGS += read
--- /dev/null
+++ a/tools/testing/selftests/proc/proc-tid0.c
@@ -0,0 +1,81 @@
+/*
+ * Copyright (c) 2021 Alexey Dobriyan <adobriyan@gmail.com>
+ *
+ * Permission to use, copy, modify, and distribute this software for any
+ * purpose with or without fee is hereby granted, provided that the above
+ * copyright notice and this permission notice appear in all copies.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS" AND THE AUTHOR DISCLAIMS ALL WARRANTIES
+ * WITH REGARD TO THIS SOFTWARE INCLUDING ALL IMPLIED WARRANTIES OF
+ * MERCHANTABILITY AND FITNESS. IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR
+ * ANY SPECIAL, DIRECT, INDIRECT, OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES
+ * WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER IN AN
+ * ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT OF
+ * OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE.
+ */
+// Test that /proc/*/task never contains "0".
+#include <sys/types.h>
+#include <dirent.h>
+#include <signal.h>
+#include <stdio.h>
+#include <stdlib.h>
+#include <string.h>
+#include <unistd.h>
+#include <pthread.h>
+
+static pid_t pid = -1;
+
+static void atexit_hook(void)
+{
+	if (pid > 0) {
+		kill(pid, SIGKILL);
+	}
+}
+
+static void *f(void *_)
+{
+	return NULL;
+}
+
+static void sigalrm(int _)
+{
+	exit(0);
+}
+
+int main(void)
+{
+	pid = fork();
+	if (pid == 0) {
+		/* child */
+		while (1) {
+			pthread_t pth;
+			pthread_create(&pth, NULL, f, NULL);
+			pthread_join(pth, NULL);
+		}
+	} else if (pid > 0) {
+		/* parent */
+		atexit(atexit_hook);
+
+		char buf[64];
+		snprintf(buf, sizeof(buf), "/proc/%u/task", pid);
+
+		signal(SIGALRM, sigalrm);
+		alarm(1);
+
+		while (1) {
+			DIR *d = opendir(buf);
+			struct dirent *de;
+			while ((de = readdir(d))) {
+				if (strcmp(de->d_name, "0") == 0) {
+					exit(1);
+				}
+			}
+			closedir(d);
+		}
+
+		return 0;
+	} else {
+		perror("fork");
+		return 1;
+	}
+}
_



* [patch 04/87] x86/xen: update xen_oldmem_pfn_is_ram() documentation
  2021-11-09  2:30 incoming Andrew Morton
                   ` (2 preceding siblings ...)
  2021-11-09  2:31 ` [patch 03/87] procfs: do not list TID 0 in /proc/<pid>/task Andrew Morton
@ 2021-11-09  2:31 ` Andrew Morton
  2021-11-09  2:31 ` [patch 05/87] x86/xen: simplify xen_oldmem_pfn_is_ram() Andrew Morton
                   ` (82 subsequent siblings)
  86 siblings, 0 replies; 98+ messages in thread
From: Andrew Morton @ 2021-11-09  2:31 UTC (permalink / raw)
  To: akpm, bhe, boris.ostrvsky, bp, david, dyoung, hpa, jasowang,
	jgross, linux-mm, mhocko, mingo, mm-commits, mst, osalvador,
	rafael.j.wysocki, rppt, sstabellini, tglx, torvalds, vgoyal

From: David Hildenbrand <david@redhat.com>
Subject: x86/xen: update xen_oldmem_pfn_is_ram() documentation

After removing /dev/kmem, sanitizing /proc/kcore and handling /dev/mem,
this series tackles the last sane way a VM could accidentally access
logically unplugged memory managed by a virtio-mem device: /proc/vmcore.

When dumping memory via "makedumpfile", PG_offline pages, used by
virtio-mem to flag logically unplugged memory, are already properly
excluded; however, especially when accessing/copying /proc/vmcore "the
usual way", we can still end up reading logically unplugged memory that is
part of a virtio-mem device.

Patches #1-#3 are cleanups.  Patch #4 extends the existing oldmem_pfn_is_ram
mechanism.  Patches #5-#7 are virtio-mem refactorings for patch #8, which
implements the virtio-mem logic to query the state of device blocks.

Patch #8:

"
Although virtio-mem currently supports reading unplugged memory in the
hypervisor, this will change in the future, indicated to the device via
a new feature flag. We similarly sanitized /proc/kcore access recently.
[...]
Distributions that support virtio-mem+kdump have to make sure that the
virtio_mem module will be part of the kdump kernel or the kdump initrd;
dracut was recently [2] extended to include virtio-mem in the generated
initrd. As long as no special kdump kernels are used, this will
automatically make sure that virtio-mem will be around in the kdump initrd
and sanitize /proc/vmcore access -- with dracut.
"

This is the last remaining bit to support
VIRTIO_MEM_F_UNPLUGGED_INACCESSIBLE [3] in the Linux implementation of
virtio-mem.

Note: this is best-effort.  We'll never be able to control what runs
inside the second kernel, really, but we also don't have to care: we only
care about sane setups where we don't want our VM getting zapped once we
touch the wrong memory location while dumping.  While we usually expect
sane setups to use "makedumpfile", nothing really speaks against just
copying /proc/vmcore, especially in environments where HWpoisoning isn't
typically expected.  Also, we really don't want to put all our trust
completely on the memmap, so sanitizing also makes sense when just using
"makedumpfile".

[1] https://lkml.kernel.org/r/20210526093041.8800-1-david@redhat.com
[2] https://github.com/dracutdevs/dracut/pull/1157
[3] https://lists.oasis-open.org/archives/virtio-comment/202109/msg00021.html


This patch (of 9):

The callback is only used for the vmcore nowadays.

Link: https://lkml.kernel.org/r/20211005121430.30136-1-david@redhat.com
Link: https://lkml.kernel.org/r/20211005121430.30136-2-david@redhat.com
Signed-off-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Juergen Gross <jgross@suse.com>
Cc: Stefano Stabellini <sstabellini@kernel.org>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Jason Wang <jasowang@redhat.com>
Cc: Dave Young <dyoung@redhat.com>
Cc: Baoquan He <bhe@redhat.com>
Cc: Vivek Goyal <vgoyal@redhat.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Oscar Salvador <osalvador@suse.de>
Cc: Mike Rapoport <rppt@kernel.org>
Cc: "Rafael J. Wysocki" <rafael.j.wysocki@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 arch/x86/xen/mmu_hvm.c |    9 +++------
 1 file changed, 3 insertions(+), 6 deletions(-)

--- a/arch/x86/xen/mmu_hvm.c~x86-xen-update-xen_oldmem_pfn_is_ram-documentation
+++ a/arch/x86/xen/mmu_hvm.c
@@ -9,12 +9,9 @@
 
 #ifdef CONFIG_PROC_VMCORE
 /*
- * This function is used in two contexts:
- * - the kdump kernel has to check whether a pfn of the crashed kernel
- *   was a ballooned page. vmcore is using this function to decide
- *   whether to access a pfn of the crashed kernel.
- * - the kexec kernel has to check whether a pfn was ballooned by the
- *   previous kernel. If the pfn is ballooned, handle it properly.
+ * The kdump kernel has to check whether a pfn of the crashed kernel
+ * was a ballooned page. vmcore is using this function to decide
+ * whether to access a pfn of the crashed kernel.
  * Returns 0 if the pfn is not backed by a RAM page, the caller may
  * handle the pfn special in this case.
  */
_



* [patch 05/87] x86/xen: simplify xen_oldmem_pfn_is_ram()
  2021-11-09  2:30 incoming Andrew Morton
                   ` (3 preceding siblings ...)
  2021-11-09  2:31 ` [patch 04/87] x86/xen: update xen_oldmem_pfn_is_ram() documentation Andrew Morton
@ 2021-11-09  2:31 ` Andrew Morton
  2021-11-09  2:31 ` [patch 06/87] x86/xen: print a warning when HVMOP_get_mem_type fails Andrew Morton
                   ` (81 subsequent siblings)
  86 siblings, 0 replies; 98+ messages in thread
From: Andrew Morton @ 2021-11-09  2:31 UTC (permalink / raw)
  To: akpm, bhe, boris.ostrovsky, bp, david, dyoung, hpa, jasowang,
	jgross, linux-mm, mhocko, mingo, mm-commits, mst, osalvador,
	rafael.j.wysocki, rppt, sstabellini, tglx, torvalds, vgoyal

From: David Hildenbrand <david@redhat.com>
Subject: x86/xen: simplify xen_oldmem_pfn_is_ram()

Let's simplify return handling.
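
For illustration only (not part of the patch): a minimal userspace sketch
of why the switch and the single comparison agree.  The enum below is a
stand-in with made-up names and values, not the Xen interface definition.

```c
#include <assert.h>

/* Stand-in for the memory types; names and values are illustrative only. */
enum mem_type { MEM_RAM_RW, MEM_RAM_RO, MEM_MMIO_DM, MEM_OTHER };

/* Shape of the old code: one type maps to 0, everything else to 1. */
int is_ram_switch(enum mem_type t)
{
	int ram;

	switch (t) {
	case MEM_MMIO_DM:
		ram = 0;
		break;
	case MEM_RAM_RW:
	case MEM_RAM_RO:
	default:
		ram = 1;
		break;
	}
	return ram;
}

/* Shape of the new code: a single comparison. */
int is_ram_compare(enum mem_type t)
{
	return t != MEM_MMIO_DM;
}

/* Walks every stand-in type and checks that both forms agree. */
void check_equivalence(void)
{
	for (int t = MEM_RAM_RW; t <= MEM_OTHER; t++)
		assert(is_ram_switch((enum mem_type)t) ==
		       is_ram_compare((enum mem_type)t));
}
```

Because the default label already shares the "ram = 1" arm, only the
mmio_dm case is distinguishable, which is exactly what the comparison
tests.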

Link: https://lkml.kernel.org/r/20211005121430.30136-3-david@redhat.com
Signed-off-by: David Hildenbrand <david@redhat.com>
Cc: Baoquan He <bhe@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: Dave Young <dyoung@redhat.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jason Wang <jasowang@redhat.com>
Cc: Juergen Gross <jgross@suse.com>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Mike Rapoport <rppt@kernel.org>
Cc: Oscar Salvador <osalvador@suse.de>
Cc: "Rafael J. Wysocki" <rafael.j.wysocki@intel.com>
Cc: Stefano Stabellini <sstabellini@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vivek Goyal <vgoyal@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 arch/x86/xen/mmu_hvm.c |   15 +--------------
 1 file changed, 1 insertion(+), 14 deletions(-)

--- a/arch/x86/xen/mmu_hvm.c~x86-xen-simplify-xen_oldmem_pfn_is_ram
+++ a/arch/x86/xen/mmu_hvm.c
@@ -21,23 +21,10 @@ static int xen_oldmem_pfn_is_ram(unsigne
 		.domid = DOMID_SELF,
 		.pfn = pfn,
 	};
-	int ram;
 
 	if (HYPERVISOR_hvm_op(HVMOP_get_mem_type, &a))
 		return -ENXIO;
-
-	switch (a.mem_type) {
-	case HVMMEM_mmio_dm:
-		ram = 0;
-		break;
-	case HVMMEM_ram_rw:
-	case HVMMEM_ram_ro:
-	default:
-		ram = 1;
-		break;
-	}
-
-	return ram;
+	return a.mem_type != HVMMEM_mmio_dm;
 }
 #endif
 
_



* [patch 06/87] x86/xen: print a warning when HVMOP_get_mem_type fails
  2021-11-09  2:30 incoming Andrew Morton
                   ` (4 preceding siblings ...)
  2021-11-09  2:31 ` [patch 05/87] x86/xen: simplify xen_oldmem_pfn_is_ram() Andrew Morton
@ 2021-11-09  2:31 ` Andrew Morton
  2021-11-09  2:31 ` [patch 07/87] proc/vmcore: let pfn_is_ram() return a bool Andrew Morton
                   ` (80 subsequent siblings)
  86 siblings, 0 replies; 98+ messages in thread
From: Andrew Morton @ 2021-11-09  2:31 UTC (permalink / raw)
  To: akpm, bhe, boris.ostrovsky, bp, david, dyoung, hpa, jasowang,
	jgross, linux-mm, mhocko, mingo, mm-commits, mst, osalvador,
	rafael.j.wysocki, rppt, sstabellini, tglx, torvalds, vgoyal

From: David Hildenbrand <david@redhat.com>
Subject: x86/xen: print a warning when HVMOP_get_mem_type fails

HVMOP_get_mem_type is not expected to fail: "This call failing is
indication of something going quite wrong and it would be good to know
about this." [1]

Let's add a pr_warn_once().

[1] https://lkml.kernel.org/r/3b935aa0-6d85-0bcd-100e-15098add3c4c@oracle.com

Link: https://lkml.kernel.org/r/20211005121430.30136-4-david@redhat.com
Signed-off-by: David Hildenbrand <david@redhat.com>
Suggested-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: Baoquan He <bhe@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Dave Young <dyoung@redhat.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jason Wang <jasowang@redhat.com>
Cc: Juergen Gross <jgross@suse.com>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Mike Rapoport <rppt@kernel.org>
Cc: Oscar Salvador <osalvador@suse.de>
Cc: "Rafael J. Wysocki" <rafael.j.wysocki@intel.com>
Cc: Stefano Stabellini <sstabellini@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vivek Goyal <vgoyal@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 arch/x86/xen/mmu_hvm.c |    4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

--- a/arch/x86/xen/mmu_hvm.c~x86-xen-print-a-warning-when-hvmop_get_mem_type-fails
+++ a/arch/x86/xen/mmu_hvm.c
@@ -22,8 +22,10 @@ static int xen_oldmem_pfn_is_ram(unsigne
 		.pfn = pfn,
 	};
 
-	if (HYPERVISOR_hvm_op(HVMOP_get_mem_type, &a))
+	if (HYPERVISOR_hvm_op(HVMOP_get_mem_type, &a)) {
+		pr_warn_once("Unexpected HVMOP_get_mem_type failure\n");
 		return -ENXIO;
+	}
 	return a.mem_type != HVMMEM_mmio_dm;
 }
 #endif
_



* [patch 07/87] proc/vmcore: let pfn_is_ram() return a bool
  2021-11-09  2:30 incoming Andrew Morton
                   ` (5 preceding siblings ...)
  2021-11-09  2:31 ` [patch 06/87] x86/xen: print a warning when HVMOP_get_mem_type fails Andrew Morton
@ 2021-11-09  2:31 ` Andrew Morton
  2021-11-09  2:31 ` [patch 08/87] proc/vmcore: convert oldmem_pfn_is_ram callback to more generic vmcore callbacks Andrew Morton
                   ` (79 subsequent siblings)
  86 siblings, 0 replies; 98+ messages in thread
From: Andrew Morton @ 2021-11-09  2:31 UTC (permalink / raw)
  To: akpm, bhe, boris.ostrovsky, bp, david, dyoung, hpa, jasowang,
	jgross, linux-mm, mhocko, mingo, mm-commits, mst, osalvador,
	rafael.j.wysocki, rppt, sstabellini, tglx, torvalds, vgoyal

From: David Hildenbrand <david@redhat.com>
Subject: proc/vmcore: let pfn_is_ram() return a bool

The callback should deal with errors internally; it doesn't make sense to
expose these via pfn_is_ram().  We'll rework the callbacks next.  Right
now we treat errors as "it's RAM"; no functional change.
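
For illustration only (not part of the patch): a minimal userspace sketch
of the "errors count as RAM" behaviour that results from the !! conversion.
The callbacks cb_ram(), cb_not_ram() and cb_error() and the wrapper
pfn_is_ram_sketch() are hypothetical names.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical callbacks following the old int-returning convention. */
int cb_ram(unsigned long pfn)     { (void)pfn; return 1; }
int cb_not_ram(unsigned long pfn) { (void)pfn; return 0; }
int cb_error(unsigned long pfn)   { (void)pfn; return -ENXIO; }

/*
 * Same shape as the reworked helper: default to "RAM", and collapse the
 * callback's int result to a bool.  Any non-zero value, including a
 * negative errno, becomes true.
 */
bool pfn_is_ram_sketch(int (*fn)(unsigned long pfn), unsigned long pfn)
{
	bool ret = true;

	if (fn)
		ret = !!fn(pfn);
	return ret;
}
```

With no callback or with cb_error() the result is true; only a callback
returning exactly 0 makes the page count as non-RAM.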

Link: https://lkml.kernel.org/r/20211005121430.30136-5-david@redhat.com
Signed-off-by: David Hildenbrand <david@redhat.com>
Cc: Baoquan He <bhe@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: Dave Young <dyoung@redhat.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jason Wang <jasowang@redhat.com>
Cc: Juergen Gross <jgross@suse.com>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Mike Rapoport <rppt@kernel.org>
Cc: Oscar Salvador <osalvador@suse.de>
Cc: "Rafael J. Wysocki" <rafael.j.wysocki@intel.com>
Cc: Stefano Stabellini <sstabellini@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vivek Goyal <vgoyal@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 fs/proc/vmcore.c |    8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

--- a/fs/proc/vmcore.c~proc-vmcore-let-pfn_is_ram-return-a-bool
+++ a/fs/proc/vmcore.c
@@ -84,11 +84,11 @@ void unregister_oldmem_pfn_is_ram(void)
 }
 EXPORT_SYMBOL_GPL(unregister_oldmem_pfn_is_ram);
 
-static int pfn_is_ram(unsigned long pfn)
+static bool pfn_is_ram(unsigned long pfn)
 {
 	int (*fn)(unsigned long pfn);
 	/* pfn is ram unless fn() checks pagetype */
-	int ret = 1;
+	bool ret = true;
 
 	/*
 	 * Ask hypervisor if the pfn is really ram.
@@ -97,7 +97,7 @@ static int pfn_is_ram(unsigned long pfn)
 	 */
 	fn = oldmem_pfn_is_ram;
 	if (fn)
-		ret = fn(pfn);
+		ret = !!fn(pfn);
 
 	return ret;
 }
@@ -124,7 +124,7 @@ ssize_t read_from_oldmem(char *buf, size
 			nr_bytes = count;
 
 		/* If pfn is not ram, return zeros for sparse dump files */
-		if (pfn_is_ram(pfn) == 0)
+		if (!pfn_is_ram(pfn))
 			memset(buf, 0, nr_bytes);
 		else {
 			if (encrypted)
_



* [patch 08/87] proc/vmcore: convert oldmem_pfn_is_ram callback to more generic vmcore callbacks
  2021-11-09  2:30 incoming Andrew Morton
                   ` (6 preceding siblings ...)
  2021-11-09  2:31 ` [patch 07/87] proc/vmcore: let pfn_is_ram() return a bool Andrew Morton
@ 2021-11-09  2:31 ` Andrew Morton
  2021-11-09  3:59   ` Dave Young
  2021-11-10  7:22   ` Baoquan He
  2021-11-09  2:31 ` [patch 09/87] virtio-mem: factor out hotplug specifics from virtio_mem_init() into virtio_mem_init_hotplug() Andrew Morton
                   ` (78 subsequent siblings)
  86 siblings, 2 replies; 98+ messages in thread
From: Andrew Morton @ 2021-11-09  2:31 UTC (permalink / raw)
  To: akpm, bhe, boris.ostrovsky, bp, david, dyoung, hpa, jasowang,
	jgross, linux-mm, mhocko, mingo, mm-commits, mst, osalvador,
	rafael.j.wysocki, rppt, sstabellini, tglx, torvalds, vgoyal

From: David Hildenbrand <david@redhat.com>
Subject: proc/vmcore: convert oldmem_pfn_is_ram callback to more generic vmcore callbacks

Let's support multiple registered callbacks, making sure that registering
vmcore callbacks cannot fail.  Make the callback return a bool instead of
an int, so that each callback handles errors internally.  Drop the unused
HAVE_OLDMEM_PFN_IS_RAM.

We soon want to make use of this infrastructure from other drivers:
virtio-mem, registering one callback for each virtio-mem device, to
prevent reading unplugged virtio-mem memory.

Handle it via a generic vmcore_cb structure, prepared for future
extensions: for example, once we support virtio-mem on s390x where the
vmcore is completely constructed in the second kernel, we want to detect
and add plugged virtio-mem memory ranges to the vmcore in order for them
to get dumped properly.

Handle corner cases that are unexpected and shouldn't happen in sane
setups: registering a callback after the vmcore has already been opened
(warn only) and unregistering a callback after the vmcore has already been
opened (warn and essentially read only zeroes from that point on).
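
For illustration only (not part of the patch): a minimal userspace sketch
of the semantics described above, without any locking.  The types
cb_sketch and cb_registry, the registry_*() helpers and the hole_at_5()
callback are hypothetical; the sketch links new callbacks at the head of a
singly linked list, which does not change the result because any single
"not RAM" answer wins.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Illustrative stand-in for a registered callback; not the kernel struct. */
struct cb_sketch {
	bool (*pfn_is_ram)(struct cb_sketch *cb, unsigned long pfn);
	struct cb_sketch *next;
};

/* Illustrative registry state: callback list plus the two flags. */
struct cb_registry {
	struct cb_sketch *head;
	bool opened;	/* the dump file has been opened at least once */
	bool unstable;	/* a callback vanished after the first open */
};

/* Registration cannot fail: it only links the callback into the list. */
void registry_add(struct cb_registry *r, struct cb_sketch *cb)
{
	cb->next = r->head;
	r->head = cb;
}

/* Unlinks cb; if readers may already rely on it, mark the registry unstable. */
void registry_del(struct cb_registry *r, struct cb_sketch *cb)
{
	struct cb_sketch **p = &r->head;

	while (*p && *p != cb)
		p = &(*p)->next;
	if (*p)
		*p = cb->next;
	if (r->opened)
		r->unstable = true;
}

/* A pfn counts as RAM only if no registered callback objects. */
bool registry_pfn_is_ram(const struct cb_registry *r, unsigned long pfn)
{
	if (r->unstable)
		return false;	/* readers see only zeroes from now on */
	for (struct cb_sketch *cb = r->head; cb; cb = cb->next) {
		if (!cb->pfn_is_ram)
			continue;
		if (!cb->pfn_is_ram(cb, pfn))
			return false;
	}
	return true;
}

/* Example callback: treats pfn 5 as a hole, everything else as RAM. */
bool hole_at_5(struct cb_sketch *cb, unsigned long pfn)
{
	(void)cb;
	return pfn != 5;
}
```

An empty registry reports every pfn as RAM; once a callback is removed
after the first open, every pfn is reported as non-RAM, matching the
"read only zeroes from that point on" corner case.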

Link: https://lkml.kernel.org/r/20211005121430.30136-6-david@redhat.com
Signed-off-by: David Hildenbrand <david@redhat.com>
Cc: Baoquan He <bhe@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: Dave Young <dyoung@redhat.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jason Wang <jasowang@redhat.com>
Cc: Juergen Gross <jgross@suse.com>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Mike Rapoport <rppt@kernel.org>
Cc: Oscar Salvador <osalvador@suse.de>
Cc: "Rafael J. Wysocki" <rafael.j.wysocki@intel.com>
Cc: Stefano Stabellini <sstabellini@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vivek Goyal <vgoyal@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 arch/x86/kernel/aperture_64.c |   13 +++-
 arch/x86/xen/mmu_hvm.c        |   11 ++-
 fs/proc/vmcore.c              |  101 ++++++++++++++++++++++----------
 include/linux/crash_dump.h    |   26 +++++++-
 4 files changed, 112 insertions(+), 39 deletions(-)

--- a/arch/x86/kernel/aperture_64.c~proc-vmcore-convert-oldmem_pfn_is_ram-callback-to-more-generic-vmcore-callbacks
+++ a/arch/x86/kernel/aperture_64.c
@@ -73,12 +73,23 @@ static int gart_mem_pfn_is_ram(unsigned
 		      (pfn >= aperture_pfn_start + aperture_page_count));
 }
 
+#ifdef CONFIG_PROC_VMCORE
+static bool gart_oldmem_pfn_is_ram(struct vmcore_cb *cb, unsigned long pfn)
+{
+	return !!gart_mem_pfn_is_ram(pfn);
+}
+
+static struct vmcore_cb gart_vmcore_cb = {
+	.pfn_is_ram = gart_oldmem_pfn_is_ram,
+};
+#endif
+
 static void __init exclude_from_core(u64 aper_base, u32 aper_order)
 {
 	aperture_pfn_start = aper_base >> PAGE_SHIFT;
 	aperture_page_count = (32 * 1024 * 1024) << aper_order >> PAGE_SHIFT;
 #ifdef CONFIG_PROC_VMCORE
-	WARN_ON(register_oldmem_pfn_is_ram(&gart_mem_pfn_is_ram));
+	register_vmcore_cb(&gart_vmcore_cb);
 #endif
 #ifdef CONFIG_PROC_KCORE
 	WARN_ON(register_mem_pfn_is_ram(&gart_mem_pfn_is_ram));
--- a/arch/x86/xen/mmu_hvm.c~proc-vmcore-convert-oldmem_pfn_is_ram-callback-to-more-generic-vmcore-callbacks
+++ a/arch/x86/xen/mmu_hvm.c
@@ -12,10 +12,10 @@
  * The kdump kernel has to check whether a pfn of the crashed kernel
  * was a ballooned page. vmcore is using this function to decide
  * whether to access a pfn of the crashed kernel.
- * Returns 0 if the pfn is not backed by a RAM page, the caller may
+ * Returns "false" if the pfn is not backed by a RAM page, the caller may
  * handle the pfn special in this case.
  */
-static int xen_oldmem_pfn_is_ram(unsigned long pfn)
+static bool xen_vmcore_pfn_is_ram(struct vmcore_cb *cb, unsigned long pfn)
 {
 	struct xen_hvm_get_mem_type a = {
 		.domid = DOMID_SELF,
@@ -24,10 +24,13 @@ static int xen_oldmem_pfn_is_ram(unsigne
 
 	if (HYPERVISOR_hvm_op(HVMOP_get_mem_type, &a)) {
 		pr_warn_once("Unexpected HVMOP_get_mem_type failure\n");
-		return -ENXIO;
+		return true;
 	}
 	return a.mem_type != HVMMEM_mmio_dm;
 }
+static struct vmcore_cb xen_vmcore_cb = {
+	.pfn_is_ram = xen_vmcore_pfn_is_ram,
+};
 #endif
 
 static void xen_hvm_exit_mmap(struct mm_struct *mm)
@@ -61,6 +64,6 @@ void __init xen_hvm_init_mmu_ops(void)
 	if (is_pagetable_dying_supported())
 		pv_ops.mmu.exit_mmap = xen_hvm_exit_mmap;
 #ifdef CONFIG_PROC_VMCORE
-	WARN_ON(register_oldmem_pfn_is_ram(&xen_oldmem_pfn_is_ram));
+	register_vmcore_cb(&xen_vmcore_cb);
 #endif
 }
--- a/fs/proc/vmcore.c~proc-vmcore-convert-oldmem_pfn_is_ram-callback-to-more-generic-vmcore-callbacks
+++ a/fs/proc/vmcore.c
@@ -62,46 +62,75 @@ core_param(novmcoredd, vmcoredd_disabled
 /* Device Dump Size */
 static size_t vmcoredd_orig_sz;
 
-/*
- * Returns > 0 for RAM pages, 0 for non-RAM pages, < 0 on error
- * The called function has to take care of module refcounting.
- */
-static int (*oldmem_pfn_is_ram)(unsigned long pfn);
-
-int register_oldmem_pfn_is_ram(int (*fn)(unsigned long pfn))
-{
-	if (oldmem_pfn_is_ram)
-		return -EBUSY;
-	oldmem_pfn_is_ram = fn;
-	return 0;
+static DECLARE_RWSEM(vmcore_cb_rwsem);
+/* List of registered vmcore callbacks. */
+static LIST_HEAD(vmcore_cb_list);
+/* Whether we had a surprise unregistration of a callback. */
+static bool vmcore_cb_unstable;
+/* Whether the vmcore has been opened once. */
+static bool vmcore_opened;
+
+void register_vmcore_cb(struct vmcore_cb *cb)
+{
+	down_write(&vmcore_cb_rwsem);
+	INIT_LIST_HEAD(&cb->next);
+	list_add_tail(&cb->next, &vmcore_cb_list);
+	/*
+	 * Registering a vmcore callback after the vmcore was opened is
+	 * very unusual (e.g., manual driver loading).
+	 */
+	if (vmcore_opened)
+		pr_warn_once("Unexpected vmcore callback registration\n");
+	up_write(&vmcore_cb_rwsem);
 }
-EXPORT_SYMBOL_GPL(register_oldmem_pfn_is_ram);
+EXPORT_SYMBOL_GPL(register_vmcore_cb);
 
-void unregister_oldmem_pfn_is_ram(void)
+void unregister_vmcore_cb(struct vmcore_cb *cb)
 {
-	oldmem_pfn_is_ram = NULL;
-	wmb();
+	down_write(&vmcore_cb_rwsem);
+	list_del(&cb->next);
+	/*
+	 * Unregistering a vmcore callback after the vmcore was opened is
+	 * very unusual (e.g., forced driver removal), but we cannot stop
+	 * unregistering.
+	 */
+	if (vmcore_opened) {
+		pr_warn_once("Unexpected vmcore callback unregistration\n");
+		vmcore_cb_unstable = true;
+	}
+	up_write(&vmcore_cb_rwsem);
 }
-EXPORT_SYMBOL_GPL(unregister_oldmem_pfn_is_ram);
+EXPORT_SYMBOL_GPL(unregister_vmcore_cb);
 
 static bool pfn_is_ram(unsigned long pfn)
 {
-	int (*fn)(unsigned long pfn);
-	/* pfn is ram unless fn() checks pagetype */
+	struct vmcore_cb *cb;
 	bool ret = true;
 
-	/*
-	 * Ask hypervisor if the pfn is really ram.
-	 * A ballooned page contains no data and reading from such a page
-	 * will cause high load in the hypervisor.
-	 */
-	fn = oldmem_pfn_is_ram;
-	if (fn)
-		ret = !!fn(pfn);
+	lockdep_assert_held_read(&vmcore_cb_rwsem);
+	if (unlikely(vmcore_cb_unstable))
+		return false;
+
+	list_for_each_entry(cb, &vmcore_cb_list, next) {
+		if (unlikely(!cb->pfn_is_ram))
+			continue;
+		ret = cb->pfn_is_ram(cb, pfn);
+		if (!ret)
+			break;
+	}
 
 	return ret;
 }
 
+static int open_vmcore(struct inode *inode, struct file *file)
+{
+	down_read(&vmcore_cb_rwsem);
+	vmcore_opened = true;
+	up_read(&vmcore_cb_rwsem);
+
+	return 0;
+}
+
 /* Reads a page from the oldmem device from given offset. */
 ssize_t read_from_oldmem(char *buf, size_t count,
 			 u64 *ppos, int userbuf,
@@ -117,6 +146,7 @@ ssize_t read_from_oldmem(char *buf, size
 	offset = (unsigned long)(*ppos % PAGE_SIZE);
 	pfn = (unsigned long)(*ppos / PAGE_SIZE);
 
+	down_read(&vmcore_cb_rwsem);
 	do {
 		if (count > (PAGE_SIZE - offset))
 			nr_bytes = PAGE_SIZE - offset;
@@ -136,8 +166,10 @@ ssize_t read_from_oldmem(char *buf, size
 				tmp = copy_oldmem_page(pfn, buf, nr_bytes,
 						       offset, userbuf);
 
-			if (tmp < 0)
+			if (tmp < 0) {
+				up_read(&vmcore_cb_rwsem);
 				return tmp;
+			}
 		}
 		*ppos += nr_bytes;
 		count -= nr_bytes;
@@ -147,6 +179,7 @@ ssize_t read_from_oldmem(char *buf, size
 		offset = 0;
 	} while (count);
 
+	up_read(&vmcore_cb_rwsem);
 	return read;
 }
 
@@ -537,14 +570,19 @@ static int vmcore_remap_oldmem_pfn(struc
 			    unsigned long from, unsigned long pfn,
 			    unsigned long size, pgprot_t prot)
 {
+	int ret;
+
 	/*
 	 * Check if oldmem_pfn_is_ram was registered to avoid
 	 * looping over all pages without a reason.
 	 */
-	if (oldmem_pfn_is_ram)
-		return remap_oldmem_pfn_checked(vma, from, pfn, size, prot);
+	down_read(&vmcore_cb_rwsem);
+	if (!list_empty(&vmcore_cb_list) || vmcore_cb_unstable)
+		ret = remap_oldmem_pfn_checked(vma, from, pfn, size, prot);
 	else
-		return remap_oldmem_pfn_range(vma, from, pfn, size, prot);
+		ret = remap_oldmem_pfn_range(vma, from, pfn, size, prot);
+	up_read(&vmcore_cb_rwsem);
+	return ret;
 }
 
 static int mmap_vmcore(struct file *file, struct vm_area_struct *vma)
@@ -668,6 +706,7 @@ static int mmap_vmcore(struct file *file
 #endif
 
 static const struct proc_ops vmcore_proc_ops = {
+	.proc_open	= open_vmcore,
 	.proc_read	= read_vmcore,
 	.proc_lseek	= default_llseek,
 	.proc_mmap	= mmap_vmcore,
--- a/include/linux/crash_dump.h~proc-vmcore-convert-oldmem_pfn_is_ram-callback-to-more-generic-vmcore-callbacks
+++ a/include/linux/crash_dump.h
@@ -91,9 +91,29 @@ static inline void vmcore_unusable(void)
 		elfcorehdr_addr = ELFCORE_ADDR_ERR;
 }
 
-#define HAVE_OLDMEM_PFN_IS_RAM 1
-extern int register_oldmem_pfn_is_ram(int (*fn)(unsigned long pfn));
-extern void unregister_oldmem_pfn_is_ram(void);
+/**
+ * struct vmcore_cb - driver callbacks for /proc/vmcore handling
+ * @pfn_is_ram: check whether a PFN really is RAM and should be accessed when
+ *              reading the vmcore. Will return "true" if it is RAM or if the
+ *              callback cannot tell. If any callback returns "false", it's not
+ *              RAM and the page must not be accessed; zeroes should be
+ *              indicated in the vmcore instead. For example, a ballooned page
+ *              contains no data and reading from such a page will cause high
+ *              load in the hypervisor.
+ * @next: List head to manage registered callbacks internally; initialized by
+ *        register_vmcore_cb().
+ *
+ * vmcore callbacks allow drivers managing physical memory ranges to
+ * coordinate with vmcore handling code, for example, to prevent accessing
+ * physical memory ranges that should not be accessed when reading the vmcore,
+ * although included in the vmcore header as memory ranges to dump.
+ */
+struct vmcore_cb {
+	bool (*pfn_is_ram)(struct vmcore_cb *cb, unsigned long pfn);
+	struct list_head next;
+};
+extern void register_vmcore_cb(struct vmcore_cb *cb);
+extern void unregister_vmcore_cb(struct vmcore_cb *cb);
 
 #else /* !CONFIG_CRASH_DUMP */
 static inline bool is_kdump_kernel(void) { return 0; }
_
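
The callback semantics documented for struct vmcore_cb in the patch above
can be modelled in plain userspace C.  This is only a minimal sketch, not
the kernel code: the rwsem is left out, the list is a simplified singly
linked list instead of list_head, and register_cb(), unregister_cb() and the
two example "drivers" are hypothetical names made up for illustration.  It
shows that a PFN counts as RAM only if no registered callback objects, and
that an unregistration after the vmcore was opened makes every PFN read as
!ram.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Userspace model of struct vmcore_cb (the kernel uses struct list_head). */
struct vmcore_cb {
	bool (*pfn_is_ram)(struct vmcore_cb *cb, unsigned long pfn);
	struct vmcore_cb *next;
};

struct vmcore_cb *cb_list;	/* registered callbacks, in registration order */
bool cb_unstable;		/* set on a surprise unregistration */
bool vmcore_opened;		/* set once the vmcore has been opened */

void register_cb(struct vmcore_cb *cb)
{
	struct vmcore_cb **pos = &cb_list;

	assert(cb != NULL);
	cb->next = NULL;
	while (*pos)
		pos = &(*pos)->next;
	*pos = cb;		/* append at the tail */
}

void unregister_cb(struct vmcore_cb *cb)
{
	struct vmcore_cb **pos = &cb_list;

	while (*pos && *pos != cb)
		pos = &(*pos)->next;
	if (*pos)
		*pos = cb->next;
	/* We cannot refuse to unregister, so stop trusting any PFN instead. */
	if (vmcore_opened)
		cb_unstable = true;
}

bool pfn_is_ram(unsigned long pfn)
{
	struct vmcore_cb *cb;

	if (cb_unstable)
		return false;
	for (cb = cb_list; cb; cb = cb->next) {
		if (!cb->pfn_is_ram)
			continue;
		/* The first callback that says "not RAM" wins. */
		if (!cb->pfn_is_ram(cb, pfn))
			return false;
	}
	return true;	/* no callback objected, or none is registered */
}

/* Two hypothetical drivers, each vetoing a different set of PFNs. */
bool even_pfns_only(struct vmcore_cb *cb, unsigned long pfn)
{
	(void)cb;
	return (pfn % 2) == 0;
}

bool below_100(struct vmcore_cb *cb, unsigned long pfn)
{
	(void)cb;
	return pfn < 100;
}
```

With both example callbacks registered, PFN 4 is reported as RAM while PFNs
5 and 200 are not; after an unregistration with vmcore_opened set, every PFN
is reported as !ram.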



* [patch 09/87] virtio-mem: factor out hotplug specifics from virtio_mem_init() into virtio_mem_init_hotplug()
  2021-11-09  2:30 incoming Andrew Morton
                   ` (7 preceding siblings ...)
  2021-11-09  2:31 ` [patch 08/87] proc/vmcore: convert oldmem_pfn_is_ram callback to more generic vmcore callbacks Andrew Morton
@ 2021-11-09  2:31 ` Andrew Morton
  2021-11-09  2:31 ` [patch 10/87] virtio-mem: factor out hotplug specifics from virtio_mem_probe() " Andrew Morton
                   ` (77 subsequent siblings)
  86 siblings, 0 replies; 98+ messages in thread
From: Andrew Morton @ 2021-11-09  2:31 UTC (permalink / raw)
  To: akpm, bhe, boris.ostrovsky, bp, david, dyoung, hpa, jasowang,
	jgross, linux-mm, mhocko, mingo, mm-commits, mst, osalvador,
	rafael.j.wysocki, rppt, sstabellini, tglx, torvalds, vgoyal

From: David Hildenbrand <david@redhat.com>
Subject: virtio-mem: factor out hotplug specifics from virtio_mem_init() into virtio_mem_init_hotplug()

Let's prepare for a new virtio-mem kdump mode in which we don't actually
hot(un)plug any memory but only observe the state of device blocks.

Link: https://lkml.kernel.org/r/20211005121430.30136-7-david@redhat.com
Signed-off-by: David Hildenbrand <david@redhat.com>
Cc: Baoquan He <bhe@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: Dave Young <dyoung@redhat.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jason Wang <jasowang@redhat.com>
Cc: Juergen Gross <jgross@suse.com>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Mike Rapoport <rppt@kernel.org>
Cc: Oscar Salvador <osalvador@suse.de>
Cc: "Rafael J. Wysocki" <rafael.j.wysocki@intel.com>
Cc: Stefano Stabellini <sstabellini@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vivek Goyal <vgoyal@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 drivers/virtio/virtio_mem.c |   81 ++++++++++++++++++----------------
 1 file changed, 44 insertions(+), 37 deletions(-)

--- a/drivers/virtio/virtio_mem.c~virtio-mem-factor-out-hotplug-specifics-from-virtio_mem_init-into-virtio_mem_init_hotplug
+++ a/drivers/virtio/virtio_mem.c
@@ -2392,41 +2392,10 @@ static int virtio_mem_init_vq(struct vir
 	return 0;
 }
 
-static int virtio_mem_init(struct virtio_mem *vm)
+static int virtio_mem_init_hotplug(struct virtio_mem *vm)
 {
 	const struct range pluggable_range = mhp_get_pluggable_range(true);
 	uint64_t sb_size, addr;
-	uint16_t node_id;
-
-	if (!vm->vdev->config->get) {
-		dev_err(&vm->vdev->dev, "config access disabled\n");
-		return -EINVAL;
-	}
-
-	/*
-	 * We don't want to (un)plug or reuse any memory when in kdump. The
-	 * memory is still accessible (but not mapped).
-	 */
-	if (is_kdump_kernel()) {
-		dev_warn(&vm->vdev->dev, "disabled in kdump kernel\n");
-		return -EBUSY;
-	}
-
-	/* Fetch all properties that can't change. */
-	virtio_cread_le(vm->vdev, struct virtio_mem_config, plugged_size,
-			&vm->plugged_size);
-	virtio_cread_le(vm->vdev, struct virtio_mem_config, block_size,
-			&vm->device_block_size);
-	virtio_cread_le(vm->vdev, struct virtio_mem_config, node_id,
-			&node_id);
-	vm->nid = virtio_mem_translate_node_id(vm, node_id);
-	virtio_cread_le(vm->vdev, struct virtio_mem_config, addr, &vm->addr);
-	virtio_cread_le(vm->vdev, struct virtio_mem_config, region_size,
-			&vm->region_size);
-
-	/* Determine the nid for the device based on the lowest address. */
-	if (vm->nid == NUMA_NO_NODE)
-		vm->nid = memory_add_physaddr_to_nid(vm->addr);
 
 	/* bad device setup - warn only */
 	if (!IS_ALIGNED(vm->addr, memory_block_size_bytes()))
@@ -2496,10 +2465,6 @@ static int virtio_mem_init(struct virtio
 					      vm->offline_threshold);
 	}
 
-	dev_info(&vm->vdev->dev, "start address: 0x%llx", vm->addr);
-	dev_info(&vm->vdev->dev, "region size: 0x%llx", vm->region_size);
-	dev_info(&vm->vdev->dev, "device block size: 0x%llx",
-		 (unsigned long long)vm->device_block_size);
 	dev_info(&vm->vdev->dev, "memory block size: 0x%lx",
 		 memory_block_size_bytes());
 	if (vm->in_sbm)
@@ -2508,10 +2473,52 @@ static int virtio_mem_init(struct virtio
 	else
 		dev_info(&vm->vdev->dev, "big block size: 0x%llx",
 			 (unsigned long long)vm->bbm.bb_size);
+
+	return 0;
+}
+
+static int virtio_mem_init(struct virtio_mem *vm)
+{
+	uint16_t node_id;
+
+	if (!vm->vdev->config->get) {
+		dev_err(&vm->vdev->dev, "config access disabled\n");
+		return -EINVAL;
+	}
+
+	/*
+	 * We don't want to (un)plug or reuse any memory when in kdump. The
+	 * memory is still accessible (but not mapped).
+	 */
+	if (is_kdump_kernel()) {
+		dev_warn(&vm->vdev->dev, "disabled in kdump kernel\n");
+		return -EBUSY;
+	}
+
+	/* Fetch all properties that can't change. */
+	virtio_cread_le(vm->vdev, struct virtio_mem_config, plugged_size,
+			&vm->plugged_size);
+	virtio_cread_le(vm->vdev, struct virtio_mem_config, block_size,
+			&vm->device_block_size);
+	virtio_cread_le(vm->vdev, struct virtio_mem_config, node_id,
+			&node_id);
+	vm->nid = virtio_mem_translate_node_id(vm, node_id);
+	virtio_cread_le(vm->vdev, struct virtio_mem_config, addr, &vm->addr);
+	virtio_cread_le(vm->vdev, struct virtio_mem_config, region_size,
+			&vm->region_size);
+
+	/* Determine the nid for the device based on the lowest address. */
+	if (vm->nid == NUMA_NO_NODE)
+		vm->nid = memory_add_physaddr_to_nid(vm->addr);
+
+	dev_info(&vm->vdev->dev, "start address: 0x%llx", vm->addr);
+	dev_info(&vm->vdev->dev, "region size: 0x%llx", vm->region_size);
+	dev_info(&vm->vdev->dev, "device block size: 0x%llx",
+		 (unsigned long long)vm->device_block_size);
 	if (vm->nid != NUMA_NO_NODE && IS_ENABLED(CONFIG_NUMA))
 		dev_info(&vm->vdev->dev, "nid: %d", vm->nid);
 
-	return 0;
+	return virtio_mem_init_hotplug(vm);
 }
 
 static int virtio_mem_create_resource(struct virtio_mem *vm)
_



* [patch 10/87] virtio-mem: factor out hotplug specifics from virtio_mem_probe() into virtio_mem_init_hotplug()
  2021-11-09  2:30 incoming Andrew Morton
                   ` (8 preceding siblings ...)
  2021-11-09  2:31 ` [patch 09/87] virtio-mem: factor out hotplug specifics from virtio_mem_init() into virtio_mem_init_hotplug() Andrew Morton
@ 2021-11-09  2:31 ` Andrew Morton
  2021-11-09  2:31 ` [patch 11/87] virtio-mem: factor out hotplug specifics from virtio_mem_remove() into virtio_mem_deinit_hotplug() Andrew Morton
                   ` (76 subsequent siblings)
  86 siblings, 0 replies; 98+ messages in thread
From: Andrew Morton @ 2021-11-09  2:31 UTC (permalink / raw)
  To: akpm, bhe, boris.ostrovsky, bp, david, dyoung, hpa, jasowang,
	jgross, linux-mm, mhocko, mingo, mm-commits, mst, osalvador,
	rafael.j.wysocki, rppt, sstabellini, tglx, torvalds, vgoyal

From: David Hildenbrand <david@redhat.com>
Subject: virtio-mem: factor out hotplug specifics from virtio_mem_probe() into virtio_mem_init_hotplug()

Let's prepare for a new virtio-mem kdump mode in which we don't actually
hot(un)plug any memory but only observe the state of device blocks.

Link: https://lkml.kernel.org/r/20211005121430.30136-8-david@redhat.com
Signed-off-by: David Hildenbrand <david@redhat.com>
Cc: Baoquan He <bhe@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: Dave Young <dyoung@redhat.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jason Wang <jasowang@redhat.com>
Cc: Juergen Gross <jgross@suse.com>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Mike Rapoport <rppt@kernel.org>
Cc: Oscar Salvador <osalvador@suse.de>
Cc: "Rafael J. Wysocki" <rafael.j.wysocki@intel.com>
Cc: Stefano Stabellini <sstabellini@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vivek Goyal <vgoyal@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 drivers/virtio/virtio_mem.c |   87 +++++++++++++++++-----------------
 1 file changed, 45 insertions(+), 42 deletions(-)

--- a/drivers/virtio/virtio_mem.c~virtio-mem-factor-out-hotplug-specifics-from-virtio_mem_probe-into-virtio_mem_init_hotplug
+++ a/drivers/virtio/virtio_mem.c
@@ -260,6 +260,8 @@ static void virtio_mem_fake_offline_goin
 static void virtio_mem_fake_offline_cancel_offline(unsigned long pfn,
 						   unsigned long nr_pages);
 static void virtio_mem_retry(struct virtio_mem *vm);
+static int virtio_mem_create_resource(struct virtio_mem *vm);
+static void virtio_mem_delete_resource(struct virtio_mem *vm);
 
 /*
  * Register a virtio-mem device so it will be considered for the online_page
@@ -2395,7 +2397,8 @@ static int virtio_mem_init_vq(struct vir
 static int virtio_mem_init_hotplug(struct virtio_mem *vm)
 {
 	const struct range pluggable_range = mhp_get_pluggable_range(true);
-	uint64_t sb_size, addr;
+	uint64_t unit_pages, sb_size, addr;
+	int rc;
 
 	/* bad device setup - warn only */
 	if (!IS_ALIGNED(vm->addr, memory_block_size_bytes()))
@@ -2474,7 +2477,48 @@ static int virtio_mem_init_hotplug(struc
 		dev_info(&vm->vdev->dev, "big block size: 0x%llx",
 			 (unsigned long long)vm->bbm.bb_size);
 
+	/* create the parent resource for all memory */
+	rc = virtio_mem_create_resource(vm);
+	if (rc)
+		return rc;
+
+	/* use a single dynamic memory group to cover the whole memory device */
+	if (vm->in_sbm)
+		unit_pages = PHYS_PFN(memory_block_size_bytes());
+	else
+		unit_pages = PHYS_PFN(vm->bbm.bb_size);
+	rc = memory_group_register_dynamic(vm->nid, unit_pages);
+	if (rc < 0)
+		goto out_del_resource;
+	vm->mgid = rc;
+
+	/*
+	 * If we still have memory plugged, we have to unplug all memory first.
+	 * Registering our parent resource makes sure that this memory isn't
+	 * actually in use (e.g., trying to reload the driver).
+	 */
+	if (vm->plugged_size) {
+		vm->unplug_all_required = true;
+		dev_info(&vm->vdev->dev, "unplugging all memory is required\n");
+	}
+
+	/* register callbacks */
+	vm->memory_notifier.notifier_call = virtio_mem_memory_notifier_cb;
+	rc = register_memory_notifier(&vm->memory_notifier);
+	if (rc)
+		goto out_unreg_group;
+	rc = register_virtio_mem_device(vm);
+	if (rc)
+		goto out_unreg_mem;
+
 	return 0;
+out_unreg_mem:
+	unregister_memory_notifier(&vm->memory_notifier);
+out_unreg_group:
+	memory_group_unregister(vm->mgid);
+out_del_resource:
+	virtio_mem_delete_resource(vm);
+	return rc;
 }
 
 static int virtio_mem_init(struct virtio_mem *vm)
@@ -2578,7 +2622,6 @@ static bool virtio_mem_has_memory_added(
 static int virtio_mem_probe(struct virtio_device *vdev)
 {
 	struct virtio_mem *vm;
-	uint64_t unit_pages;
 	int rc;
 
 	BUILD_BUG_ON(sizeof(struct virtio_mem_req) != 24);
@@ -2608,40 +2651,6 @@ static int virtio_mem_probe(struct virti
 	if (rc)
 		goto out_del_vq;
 
-	/* create the parent resource for all memory */
-	rc = virtio_mem_create_resource(vm);
-	if (rc)
-		goto out_del_vq;
-
-	/* use a single dynamic memory group to cover the whole memory device */
-	if (vm->in_sbm)
-		unit_pages = PHYS_PFN(memory_block_size_bytes());
-	else
-		unit_pages = PHYS_PFN(vm->bbm.bb_size);
-	rc = memory_group_register_dynamic(vm->nid, unit_pages);
-	if (rc < 0)
-		goto out_del_resource;
-	vm->mgid = rc;
-
-	/*
-	 * If we still have memory plugged, we have to unplug all memory first.
-	 * Registering our parent resource makes sure that this memory isn't
-	 * actually in use (e.g., trying to reload the driver).
-	 */
-	if (vm->plugged_size) {
-		vm->unplug_all_required = true;
-		dev_info(&vm->vdev->dev, "unplugging all memory is required\n");
-	}
-
-	/* register callbacks */
-	vm->memory_notifier.notifier_call = virtio_mem_memory_notifier_cb;
-	rc = register_memory_notifier(&vm->memory_notifier);
-	if (rc)
-		goto out_unreg_group;
-	rc = register_virtio_mem_device(vm);
-	if (rc)
-		goto out_unreg_mem;
-
 	virtio_device_ready(vdev);
 
 	/* trigger a config update to start processing the requested_size */
@@ -2649,12 +2658,6 @@ static int virtio_mem_probe(struct virti
 	queue_work(system_freezable_wq, &vm->wq);
 
 	return 0;
-out_unreg_mem:
-	unregister_memory_notifier(&vm->memory_notifier);
-out_unreg_group:
-	memory_group_unregister(vm->mgid);
-out_del_resource:
-	virtio_mem_delete_resource(vm);
 out_del_vq:
 	vdev->config->del_vqs(vdev);
 out_free_vm:
_



* [patch 11/87] virtio-mem: factor out hotplug specifics from virtio_mem_remove() into virtio_mem_deinit_hotplug()
  2021-11-09  2:30 incoming Andrew Morton
                   ` (9 preceding siblings ...)
  2021-11-09  2:31 ` [patch 10/87] virtio-mem: factor out hotplug specifics from virtio_mem_probe() " Andrew Morton
@ 2021-11-09  2:31 ` Andrew Morton
  2021-11-09  2:32 ` [patch 12/87] virtio-mem: kdump mode to sanitize /proc/vmcore access Andrew Morton
                   ` (75 subsequent siblings)
  86 siblings, 0 replies; 98+ messages in thread
From: Andrew Morton @ 2021-11-09  2:31 UTC (permalink / raw)
  To: akpm, bhe, boris.ostrovsky, bp, david, dyoung, hpa, jasowang,
	jgross, linux-mm, mhocko, mingo, mm-commits, mst, osalvador,
	rafael.j.wysocki, rppt, sstabellini, tglx, torvalds, vgoyal

From: David Hildenbrand <david@redhat.com>
Subject: virtio-mem: factor out hotplug specifics from virtio_mem_remove() into virtio_mem_deinit_hotplug()

Let's prepare for a new virtio-mem kdump mode in which we don't actually
hot(un)plug any memory but only observe the state of device blocks.

Link: https://lkml.kernel.org/r/20211005121430.30136-9-david@redhat.com
Signed-off-by: David Hildenbrand <david@redhat.com>
Cc: Baoquan He <bhe@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: Dave Young <dyoung@redhat.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jason Wang <jasowang@redhat.com>
Cc: Juergen Gross <jgross@suse.com>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Mike Rapoport <rppt@kernel.org>
Cc: Oscar Salvador <osalvador@suse.de>
Cc: "Rafael J. Wysocki" <rafael.j.wysocki@intel.com>
Cc: Stefano Stabellini <sstabellini@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vivek Goyal <vgoyal@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 drivers/virtio/virtio_mem.c |   13 ++++++++++---
 1 file changed, 10 insertions(+), 3 deletions(-)

--- a/drivers/virtio/virtio_mem.c~virtio-mem-factor-out-hotplug-specifics-from-virtio_mem_remove-into-virtio_mem_deinit_hotplug
+++ a/drivers/virtio/virtio_mem.c
@@ -2667,9 +2667,8 @@ out_free_vm:
 	return rc;
 }
 
-static void virtio_mem_remove(struct virtio_device *vdev)
+static void virtio_mem_deinit_hotplug(struct virtio_mem *vm)
 {
-	struct virtio_mem *vm = vdev->priv;
 	unsigned long mb_id;
 	int rc;
 
@@ -2716,7 +2715,8 @@ static void virtio_mem_remove(struct vir
 	 * away. Warn at least.
 	 */
 	if (virtio_mem_has_memory_added(vm)) {
-		dev_warn(&vdev->dev, "device still has system memory added\n");
+		dev_warn(&vm->vdev->dev,
+			 "device still has system memory added\n");
 	} else {
 		virtio_mem_delete_resource(vm);
 		kfree_const(vm->resource_name);
@@ -2730,6 +2730,13 @@ static void virtio_mem_remove(struct vir
 	} else {
 		vfree(vm->bbm.bb_states);
 	}
+}
+
+static void virtio_mem_remove(struct virtio_device *vdev)
+{
+	struct virtio_mem *vm = vdev->priv;
+
+	virtio_mem_deinit_hotplug(vm);
 
 	/* reset the device and cleanup the queues */
 	vdev->config->reset(vdev);
_



* [patch 12/87] virtio-mem: kdump mode to sanitize /proc/vmcore access
  2021-11-09  2:30 incoming Andrew Morton
                   ` (10 preceding siblings ...)
  2021-11-09  2:31 ` [patch 11/87] virtio-mem: factor out hotplug specifics from virtio_mem_remove() into virtio_mem_deinit_hotplug() Andrew Morton
@ 2021-11-09  2:32 ` Andrew Morton
  2021-11-09  2:32 ` [patch 13/87] proc: allow pid_revalidate() during LOOKUP_RCU Andrew Morton
                   ` (74 subsequent siblings)
  86 siblings, 0 replies; 98+ messages in thread
From: Andrew Morton @ 2021-11-09  2:32 UTC (permalink / raw)
  To: akpm, bhe, boris.ostrovsky, bp, david, dyoung, hpa, jasowang,
	jgross, linux-mm, mhocko, mingo, mm-commits, mst, osalvador,
	rafael.j.wysocki, rppt, sstabellini, tglx, torvalds, vgoyal

From: David Hildenbrand <david@redhat.com>
Subject: virtio-mem: kdump mode to sanitize /proc/vmcore access

Although virtio-mem currently supports reading unplugged memory in the
hypervisor, this will change in the future, indicated to the device via a
new feature flag.  We similarly sanitized /proc/kcore access recently [1].

Let's register a vmcore callback, to allow vmcore code to check if a PFN
belonging to a virtio-mem device is either currently plugged and should be
dumped or is currently unplugged and should not be accessed, instead
mapping the shared zeropage or returning zeroes when reading.

This is important when /proc/vmcore is not captured via tools like
"makedumpfile", which can identify logically unplugged virtio-mem memory
via PG_offline in the memmap, but is instead captured by, e.g., simply
copying the file.

Distributions that support virtio-mem+kdump have to make sure that the
virtio_mem module will be part of the kdump kernel or the kdump initrd;
dracut was recently [2] extended to include virtio-mem in the generated
initrd.  As long as no special kdump kernels are used, this will
automatically make sure that virtio-mem will be around in the kdump initrd
and sanitize /proc/vmcore access -- with dracut.

With this series, we'll send one virtio-mem state request for every ~2 MiB
chunk of virtio-mem memory indicated in the vmcore that we intend to
read/map.
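
As a rough illustration of why one request per device block suffices, the
lookup can be modelled in userspace C.  This is a minimal sketch under
stated assumptions, not the driver code: struct vm_model,
model_query_plugged() and the "every other block is plugged" device are
hypothetical, the hotplug_mutex and the plugged_size shortcut are left out,
and a last_block_valid flag is added so that the model does not depend on
the initial value of last_block_addr.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MODEL_PAGE_SIZE 4096ULL	/* assumed page size for this model */

struct vm_model {
	uint64_t addr;			/* start of the device-managed region */
	uint64_t region_size;
	uint64_t device_block_size;	/* e.g., 2 MiB; must be a power of two */
	uint64_t last_block_addr;	/* block queried last */
	bool last_block_valid;		/* model-only: a block has been queried */
	bool last_block_plugged;
	unsigned int nr_requests;	/* model-only: counts device round trips */
};

/* Hypothetical stand-in for the device: every other block is plugged. */
bool model_query_plugged(struct vm_model *vm, uint64_t block_addr)
{
	vm->nr_requests++;
	return ((block_addr - vm->addr) / vm->device_block_size) % 2 == 0;
}

bool model_pfn_is_ram(struct vm_model *vm, unsigned long pfn)
{
	uint64_t addr = (uint64_t)pfn * MODEL_PAGE_SIZE;

	assert(vm->device_block_size &&
	       !(vm->device_block_size & (vm->device_block_size - 1)));

	/* PFNs outside the device region are not ours: answer "RAM". */
	if (addr < vm->addr ||
	    addr + MODEL_PAGE_SIZE > vm->addr + vm->region_size)
		return true;

	/* Align down to the device block and reuse the cached answer. */
	addr &= ~(vm->device_block_size - 1);
	if (!vm->last_block_valid || addr != vm->last_block_addr) {
		vm->last_block_plugged = model_query_plugged(vm, addr);
		vm->last_block_addr = addr;
		vm->last_block_valid = true;
	}
	return vm->last_block_plugged;
}
```

With a 2 MiB device block size and 4 KiB pages, reading the 512 PFNs of one
block sequentially issues a single request in this model; the next request
is only sent once the read crosses into the following block.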

In the future, we might want to allow building virtio-mem for kdump mode
only, even without CONFIG_MEMORY_HOTPLUG and friends: this way, we could
support special stripped-down kdump kernels that have many other config
options disabled; we'll tackle that once required.  Further, we might want
to try sensing bigger blocks (e.g., memory sections) first before falling
back to device blocks on demand.

Tested with Fedora rawhide, which contains a recent kexec-tools version
(considering "System RAM (virtio_mem)" when creating the vmcore header)
and a recent dracut version (including the virtio_mem module in the kdump
initrd).

[1] https://lkml.kernel.org/r/20210526093041.8800-1-david@redhat.com
[2] https://github.com/dracutdevs/dracut/pull/1157

Link: https://lkml.kernel.org/r/20211005121430.30136-10-david@redhat.com
Signed-off-by: David Hildenbrand <david@redhat.com>
Cc: Baoquan He <bhe@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: Dave Young <dyoung@redhat.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jason Wang <jasowang@redhat.com>
Cc: Juergen Gross <jgross@suse.com>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Mike Rapoport <rppt@kernel.org>
Cc: Oscar Salvador <osalvador@suse.de>
Cc: "Rafael J. Wysocki" <rafael.j.wysocki@intel.com>
Cc: Stefano Stabellini <sstabellini@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vivek Goyal <vgoyal@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 drivers/virtio/virtio_mem.c |  136 +++++++++++++++++++++++++++++++---
 1 file changed, 124 insertions(+), 12 deletions(-)

--- a/drivers/virtio/virtio_mem.c~virtio-mem-kdump-mode-to-sanitize-proc-vmcore-access
+++ a/drivers/virtio/virtio_mem.c
@@ -223,6 +223,9 @@ struct virtio_mem {
 	 * When this lock is held the pointers can't change, ONLINE and
 	 * OFFLINE blocks can't change the state and no subblocks will get
 	 * plugged/unplugged.
+	 *
+	 * In kdump mode, used to serialize requests, last_block_addr and
+	 * last_block_plugged.
 	 */
 	struct mutex hotplug_mutex;
 	bool hotplug_active;
@@ -230,6 +233,9 @@ struct virtio_mem {
 	/* An error occurred we cannot handle - stop processing requests. */
 	bool broken;
 
+	/* Cached value of is_kdump_kernel() when the device was probed. */
+	bool in_kdump;
+
 	/* The driver is being removed. */
 	spinlock_t removal_lock;
 	bool removing;
@@ -243,6 +249,13 @@ struct virtio_mem {
 	/* Memory notifier (online/offline events). */
 	struct notifier_block memory_notifier;
 
+#ifdef CONFIG_PROC_VMCORE
+	/* vmcore callback for /proc/vmcore handling in kdump mode */
+	struct vmcore_cb vmcore_cb;
+	uint64_t last_block_addr;
+	bool last_block_plugged;
+#endif /* CONFIG_PROC_VMCORE */
+
 	/* Next device in the list of virtio-mem devices. */
 	struct list_head next;
 };
@@ -2293,6 +2306,12 @@ static void virtio_mem_run_wq(struct wor
 	uint64_t diff;
 	int rc;
 
+	if (unlikely(vm->in_kdump)) {
+		dev_warn_once(&vm->vdev->dev,
+			     "unexpected workqueue run in kdump kernel\n");
+		return;
+	}
+
 	hrtimer_cancel(&vm->retry_timer);
 
 	if (vm->broken)
@@ -2521,6 +2540,86 @@ out_del_resource:
 	return rc;
 }
 
+#ifdef CONFIG_PROC_VMCORE
+static int virtio_mem_send_state_request(struct virtio_mem *vm, uint64_t addr,
+					 uint64_t size)
+{
+	const uint64_t nb_vm_blocks = size / vm->device_block_size;
+	const struct virtio_mem_req req = {
+		.type = cpu_to_virtio16(vm->vdev, VIRTIO_MEM_REQ_STATE),
+		.u.state.addr = cpu_to_virtio64(vm->vdev, addr),
+		.u.state.nb_blocks = cpu_to_virtio16(vm->vdev, nb_vm_blocks),
+	};
+	int rc = -ENOMEM;
+
+	dev_dbg(&vm->vdev->dev, "requesting state: 0x%llx - 0x%llx\n", addr,
+		addr + size - 1);
+
+	switch (virtio_mem_send_request(vm, &req)) {
+	case VIRTIO_MEM_RESP_ACK:
+		return virtio16_to_cpu(vm->vdev, vm->resp.u.state.state);
+	case VIRTIO_MEM_RESP_ERROR:
+		rc = -EINVAL;
+		break;
+	default:
+		break;
+	}
+
+	dev_dbg(&vm->vdev->dev, "requesting state failed: %d\n", rc);
+	return rc;
+}
+
+static bool virtio_mem_vmcore_pfn_is_ram(struct vmcore_cb *cb,
+					 unsigned long pfn)
+{
+	struct virtio_mem *vm = container_of(cb, struct virtio_mem,
+					     vmcore_cb);
+	uint64_t addr = PFN_PHYS(pfn);
+	bool is_ram;
+	int rc;
+
+	if (!virtio_mem_contains_range(vm, addr, PAGE_SIZE))
+		return true;
+	if (!vm->plugged_size)
+		return false;
+
+	/*
+	 * We have to serialize device requests and access to the information
+	 * about the block queried last.
+	 */
+	mutex_lock(&vm->hotplug_mutex);
+
+	addr = ALIGN_DOWN(addr, vm->device_block_size);
+	if (addr != vm->last_block_addr) {
+		rc = virtio_mem_send_state_request(vm, addr,
+						   vm->device_block_size);
+		/* On any kind of error, we're going to signal !ram. */
+		if (rc == VIRTIO_MEM_STATE_PLUGGED)
+			vm->last_block_plugged = true;
+		else
+			vm->last_block_plugged = false;
+		vm->last_block_addr = addr;
+	}
+
+	is_ram = vm->last_block_plugged;
+	mutex_unlock(&vm->hotplug_mutex);
+	return is_ram;
+}
+#endif /* CONFIG_PROC_VMCORE */
+
+static int virtio_mem_init_kdump(struct virtio_mem *vm)
+{
+#ifdef CONFIG_PROC_VMCORE
+	dev_info(&vm->vdev->dev, "memory hot(un)plug disabled in kdump kernel\n");
+	vm->vmcore_cb.pfn_is_ram = virtio_mem_vmcore_pfn_is_ram;
+	register_vmcore_cb(&vm->vmcore_cb);
+	return 0;
+#else /* CONFIG_PROC_VMCORE */
+	dev_warn(&vm->vdev->dev, "disabled in kdump kernel without vmcore\n");
+	return -EBUSY;
+#endif /* CONFIG_PROC_VMCORE */
+}
+
 static int virtio_mem_init(struct virtio_mem *vm)
 {
 	uint16_t node_id;
@@ -2530,15 +2629,6 @@ static int virtio_mem_init(struct virtio
 		return -EINVAL;
 	}
 
-	/*
-	 * We don't want to (un)plug or reuse any memory when in kdump. The
-	 * memory is still accessible (but not mapped).
-	 */
-	if (is_kdump_kernel()) {
-		dev_warn(&vm->vdev->dev, "disabled in kdump kernel\n");
-		return -EBUSY;
-	}
-
 	/* Fetch all properties that can't change. */
 	virtio_cread_le(vm->vdev, struct virtio_mem_config, plugged_size,
 			&vm->plugged_size);
@@ -2562,6 +2652,12 @@ static int virtio_mem_init(struct virtio
 	if (vm->nid != NUMA_NO_NODE && IS_ENABLED(CONFIG_NUMA))
 		dev_info(&vm->vdev->dev, "nid: %d", vm->nid);
 
+	/*
+	 * We don't want to (un)plug or reuse any memory when in kdump. The
+	 * memory is still accessible (but not exposed to Linux).
+	 */
+	if (vm->in_kdump)
+		return virtio_mem_init_kdump(vm);
 	return virtio_mem_init_hotplug(vm);
 }
 
@@ -2640,6 +2736,7 @@ static int virtio_mem_probe(struct virti
 	hrtimer_init(&vm->retry_timer, CLOCK_MONOTONIC, HRTIMER_MODE_REL);
 	vm->retry_timer.function = virtio_mem_timer_expired;
 	vm->retry_timer_ms = VIRTIO_MEM_RETRY_TIMER_MIN_MS;
+	vm->in_kdump = is_kdump_kernel();
 
 	/* register the virtqueue */
 	rc = virtio_mem_init_vq(vm);
@@ -2654,8 +2751,10 @@ static int virtio_mem_probe(struct virti
 	virtio_device_ready(vdev);
 
 	/* trigger a config update to start processing the requested_size */
-	atomic_set(&vm->config_changed, 1);
-	queue_work(system_freezable_wq, &vm->wq);
+	if (!vm->in_kdump) {
+		atomic_set(&vm->config_changed, 1);
+		queue_work(system_freezable_wq, &vm->wq);
+	}
 
 	return 0;
 out_del_vq:
@@ -2732,11 +2831,21 @@ static void virtio_mem_deinit_hotplug(st
 	}
 }
 
+static void virtio_mem_deinit_kdump(struct virtio_mem *vm)
+{
+#ifdef CONFIG_PROC_VMCORE
+	unregister_vmcore_cb(&vm->vmcore_cb);
+#endif /* CONFIG_PROC_VMCORE */
+}
+
 static void virtio_mem_remove(struct virtio_device *vdev)
 {
 	struct virtio_mem *vm = vdev->priv;
 
-	virtio_mem_deinit_hotplug(vm);
+	if (vm->in_kdump)
+		virtio_mem_deinit_kdump(vm);
+	else
+		virtio_mem_deinit_hotplug(vm);
 
 	/* reset the device and cleanup the queues */
 	vdev->config->reset(vdev);
@@ -2750,6 +2859,9 @@ static void virtio_mem_config_changed(st
 {
 	struct virtio_mem *vm = vdev->priv;
 
+	if (unlikely(vm->in_kdump))
+		return;
+
 	atomic_set(&vm->config_changed, 1);
 	virtio_mem_retry(vm);
 }
_



* [patch 13/87] proc: allow pid_revalidate() during LOOKUP_RCU
  2021-11-09  2:30 incoming Andrew Morton
                   ` (11 preceding siblings ...)
  2021-11-09  2:32 ` [patch 12/87] virtio-mem: kdump mode to sanitize /proc/vmcore access Andrew Morton
@ 2021-11-09  2:32 ` Andrew Morton
  2021-11-09  2:32 ` [patch 14/87] kernel.h: drop unneeded <linux/kernel.h> inclusion from other headers Andrew Morton
                   ` (73 subsequent siblings)
  86 siblings, 0 replies; 98+ messages in thread
From: Andrew Morton @ 2021-11-09  2:32 UTC (permalink / raw)
  To: adobriyan, akpm, konrad.wilk, linux-mm, mm-commits,
	stephen.s.brennan, torvalds, viro, willy

From: Stephen Brennan <stephen.s.brennan@oracle.com>
Subject: proc: allow pid_revalidate() during LOOKUP_RCU

Problem Description:

When running ~128 parallel instances of "TZ=/etc/localtime ps -fe
>/dev/null" on a 128-CPU machine, the %sys utilization reaches 97%, and
perf shows the following code path as being responsible for heavy
contention on the d_lockref spinlock:

      walk_component()
        lookup_fast()
          d_revalidate()
            pid_revalidate() // returns -ECHILD
          unlazy_child()
            lockref_get_not_dead(&nd->path.dentry->d_lockref) <-- contention

The reason is that pid_revalidate() is triggering a drop from RCU to ref
path walk mode.  All concurrent path lookups thus try to grab a reference
to the dentry for /proc/, before re-executing pid_revalidate() and then
stepping into the /proc/$pid directory.  Thus there is huge spinlock
contention.  This patch allows pid_revalidate() to execute in RCU mode,
meaning that the path lookup can successfully enter the /proc/$pid
directory while still in RCU mode.  Later on, the path lookup may still
drop into ref mode, but the contention will be much reduced at this point.

By applying this patch, %sys utilization falls to around 85% under the
same workload, and the number of ps processes executed per unit time
increases by 3x-4x.  Although this particular workload is a bit contrived,
we have seen some large collections of eager monitoring scripts which
produced similarly high %sys time due to contention in the /proc
directory.

As a result of this patch, Al noted that several procfs methods which were
only called in ref-walk mode could now be called from RCU mode.  To ensure
that this patch is safe, I audited all the inode get_link and permission()
implementations, as well as dentry d_revalidate() implementations, in
fs/proc.  The purpose here is to ensure that they either are safe to call
in RCU (i.e.  don't sleep) or correctly bail out of RCU mode if they don't
support it.  My analysis shows that all at-risk procfs methods are safe to
call under RCU, and thus this patch is safe.

Procfs RCU-walk Analysis:

This analysis is up-to-date with 5.15-rc3.  When called under RCU mode,
these functions have arguments as follows:

* get_link() receives a NULL dentry pointer when called in RCU mode.
* permission() receives MAY_NOT_BLOCK in the mode parameter when called
  from RCU.
* d_revalidate() receives LOOKUP_RCU in flags.
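
The three conventions above can be sketched in plain userspace C.  This is
only an illustration of the "bail out" pattern, not kernel code: the
demo_* names, the struct and the flag values are hypothetical stand-ins,
and the get_link-style hook returns an int here for simplicity, whereas
the real hook returns a pointer.

```c
#include <errno.h>
#include <stddef.h>

/*
 * Stand-in flag bits for illustration only; the kernel defines the real
 * LOOKUP_RCU and MAY_NOT_BLOCK values in its own headers.
 */
#define DEMO_LOOKUP_RCU    0x1u
#define DEMO_MAY_NOT_BLOCK 0x2u

/* Hypothetical placeholder for a dentry. */
struct demo_dentry {
	int unused;
};

/*
 * d_revalidate()-style hook that does not support RCU-walk: it returns
 * -ECHILD so that the caller can retry in ref-walk mode.
 */
static int demo_revalidate_bails(struct demo_dentry *dentry, unsigned int flags)
{
	(void)dentry;
	if (flags & DEMO_LOOKUP_RCU)
		return -ECHILD;
	return 1;	/* entry is still valid */
}

/* get_link()-style hook: a NULL dentry signals RCU-walk mode. */
static int demo_get_link_bails(struct demo_dentry *dentry)
{
	if (!dentry)
		return -ECHILD;
	return 0;
}

/* permission()-style hook: MAY_NOT_BLOCK in the mask signals RCU-walk. */
static int demo_permission_bails(unsigned int mask)
{
	if (mask & DEMO_MAY_NOT_BLOCK)
		return -ECHILD;
	return 0;
}
```

A hook that is RCU safe simply omits the early return and must not sleep
on the RCU path.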

The following functions are either trivially RCU safe, or they explicitly
bail out at the beginning of the function when called in RCU mode:

proc_ns_get_link       (bails out)
proc_get_link          (RCU safe)
proc_pid_get_link      (bails out)
map_files_d_revalidate (bails out)
map_misc_d_revalidate  (bails out)
proc_net_d_revalidate  (RCU safe)
proc_sys_revalidate    (bails out, also not under /proc/$pid)
tid_fd_revalidate      (bails out)
proc_sys_permission    (not under /proc/$pid)

The remainder of the functions require a bit more detail:

* proc_fd_permission: RCU safe. All of the body of this function is
  under rcu_read_lock(), except generic_permission() which declares
  itself RCU safe in its documentation string.
* proc_self_get_link uses GFP_ATOMIC in the RCU case, so it is RCU aware
  and otherwise looks safe. The same is true of proc_thread_self_get_link.
* proc_map_files_get_link: calls ns_capable, which calls capable(), and
  thus calls into the audit code (see note #1 below). The remainder is
  just a call to the trivially safe proc_pid_get_link().
* proc_pid_permission: calls ptrace_may_access(), which appears RCU
  safe, although it does call into the "security_ptrace_access_check()"
  hook, which looks safe under smack and selinux. Just the audit code is
  of concern. Also uses get_task_struct() and put_task_struct(), see
  note #2 below.
* proc_tid_comm_permission: Appears safe, though calls put_task_struct
  (see note #2 below).

Note #1:
  Most of the concern of RCU safety has centered around the audit code.
  However, since b17ec22fb339 ("selinux: slow_avc_audit has become
  non-blocking"), it's safe to call this code under RCU. So all of the
  above are safe by my estimation.

Note #2: get_task_struct() and put_task_struct():
  The majority of get_task_struct() is under RCU read lock, and in any
  case it is a simple increment. But put_task_struct() is complex, given
  that it could at some point free the task struct, and this process has
  many steps which I couldn't manually verify. However, several other
  places call put_task_struct() under RCU, so it appears safe to use
  here too (see kernel/hung_task.c:165 or rcu/tree-stall.h:296)

Patch description:

pid_revalidate() drops from RCU into REF lookup mode.  When many threads
are resolving paths within /proc in parallel, this can result in heavy
spinlock contention on d_lockref as each thread tries to grab a reference
to the /proc dentry (and drop it shortly thereafter).

Investigation indicates that it is not necessary to drop RCU in
pid_revalidate(), as no RCU data is modified and the function never
sleeps.  So, remove the LOOKUP_RCU check.

Link: https://lkml.kernel.org/r/20211004175629.292270-2-stephen.s.brennan@oracle.com
Signed-off-by: Stephen Brennan <stephen.s.brennan@oracle.com>
Cc: Konrad Wilk <konrad.wilk@oracle.com>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 fs/proc/base.c |   18 ++++++++++--------
 1 file changed, 10 insertions(+), 8 deletions(-)

--- a/fs/proc/base.c~proc-allow-pid_revalidate-during-lookup_rcu
+++ a/fs/proc/base.c
@@ -1979,19 +1979,21 @@ static int pid_revalidate(struct dentry
 {
 	struct inode *inode;
 	struct task_struct *task;
+	int ret = 0;
 
-	if (flags & LOOKUP_RCU)
-		return -ECHILD;
-
-	inode = d_inode(dentry);
-	task = get_proc_task(inode);
+	rcu_read_lock();
+	inode = d_inode_rcu(dentry);
+	if (!inode)
+		goto out;
+	task = pid_task(proc_pid(inode), PIDTYPE_PID);
 
 	if (task) {
 		pid_update_inode(task, inode);
-		put_task_struct(task);
-		return 1;
+		ret = 1;
 	}
-	return 0;
+out:
+	rcu_read_unlock();
+	return ret;
 }
 
 static inline bool proc_inode_is_dead(struct inode *inode)
_



* [patch 14/87] kernel.h: drop unneeded <linux/kernel.h> inclusion from other headers
  2021-11-09  2:30 incoming Andrew Morton
                   ` (12 preceding siblings ...)
  2021-11-09  2:32 ` [patch 13/87] proc: allow pid_revalidate() during LOOKUP_RCU Andrew Morton
@ 2021-11-09  2:32 ` Andrew Morton
  2021-11-09  2:32 ` [patch 15/87] kernel.h: split out container_of() and typeof_member() macros Andrew Morton
                   ` (72 subsequent siblings)
  86 siblings, 0 replies; 98+ messages in thread
From: Andrew Morton @ 2021-11-09  2:32 UTC (permalink / raw)
  To: akpm, andriy.shevchenko, boqun.feng, brendanhiggins, jic23,
	laurent.pinchart, linux-mm, linux, longman, mchehab,
	miguel.ojeda.sandonis, mingo, mm-commits, peterz, regressions,
	sakari.ailus, sfr, tglx, torvalds, will

From: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Subject: kernel.h: drop unneeded <linux/kernel.h> inclusion from other headers

Patch series "kernel.h further split", v5.

kernel.h is a collection of things which are not related to each other
and are often used in disjoint compilation units, especially when drivers
need only one or two macro definitions from it.


This patch (of 7):

There is no evidence we need kernel.h inclusion in certain headers.  Drop
unneeded <linux/kernel.h> inclusion from other headers.

[sfr@canb.auug.org.au: bottom_half.h needs kernel]
  Link: https://lkml.kernel.org/r/20211015202908.1c417ae2@canb.auug.org.au
Link: https://lkml.kernel.org/r/20211013170417.87909-1-andriy.shevchenko@linux.intel.com
Link: https://lkml.kernel.org/r/20211013170417.87909-2-andriy.shevchenko@linux.intel.com
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Brendan Higgins <brendanhiggins@google.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Will Deacon <will@kernel.org>
Cc: Waiman Long <longman@redhat.com>
Cc: Boqun Feng <boqun.feng@gmail.com>
Cc: Sakari Ailus <sakari.ailus@linux.intel.com>
Cc: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Cc: Mauro Carvalho Chehab <mchehab@kernel.org>
Cc: Miguel Ojeda <miguel.ojeda.sandonis@gmail.com>
Cc: Jonathan Cameron <jic23@kernel.org>
Cc: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Cc: Thorsten Leemhuis <regressions@leemhuis.info>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 include/linux/bottom_half.h |    1 +
 include/linux/rwsem.h       |    1 -
 include/linux/smp.h         |    1 -
 include/linux/spinlock.h    |    1 -
 4 files changed, 1 insertion(+), 3 deletions(-)

--- a/include/linux/bottom_half.h~kernelh-drop-unneeded-linux-kernelh-inclusion-from-other-headers
+++ a/include/linux/bottom_half.h
@@ -2,6 +2,7 @@
 #ifndef _LINUX_BH_H
 #define _LINUX_BH_H
 
+#include <linux/kernel.h>
 #include <linux/preempt.h>
 
 #if defined(CONFIG_PREEMPT_RT) || defined(CONFIG_TRACE_IRQFLAGS)
--- a/include/linux/rwsem.h~kernelh-drop-unneeded-linux-kernelh-inclusion-from-other-headers
+++ a/include/linux/rwsem.h
@@ -11,7 +11,6 @@
 #include <linux/linkage.h>
 
 #include <linux/types.h>
-#include <linux/kernel.h>
 #include <linux/list.h>
 #include <linux/spinlock.h>
 #include <linux/atomic.h>
--- a/include/linux/smp.h~kernelh-drop-unneeded-linux-kernelh-inclusion-from-other-headers
+++ a/include/linux/smp.h
@@ -108,7 +108,6 @@ static inline void on_each_cpu_cond(smp_
 #ifdef CONFIG_SMP
 
 #include <linux/preempt.h>
-#include <linux/kernel.h>
 #include <linux/compiler.h>
 #include <linux/thread_info.h>
 #include <asm/smp.h>
--- a/include/linux/spinlock.h~kernelh-drop-unneeded-linux-kernelh-inclusion-from-other-headers
+++ a/include/linux/spinlock.h
@@ -57,7 +57,6 @@
 #include <linux/compiler.h>
 #include <linux/irqflags.h>
 #include <linux/thread_info.h>
-#include <linux/kernel.h>
 #include <linux/stringify.h>
 #include <linux/bottom_half.h>
 #include <linux/lockdep.h>
_



* [patch 15/87] kernel.h: split out container_of() and typeof_member() macros
  2021-11-09  2:30 incoming Andrew Morton
                   ` (13 preceding siblings ...)
  2021-11-09  2:32 ` [patch 14/87] kernel.h: drop unneeded <linux/kernel.h> inclusion from other headers Andrew Morton
@ 2021-11-09  2:32 ` Andrew Morton
  2021-11-09  2:32 ` [patch 16/87] include/kunit/test.h: replace kernel.h with the necessary inclusions Andrew Morton
                   ` (71 subsequent siblings)
  86 siblings, 0 replies; 98+ messages in thread
From: Andrew Morton @ 2021-11-09  2:32 UTC (permalink / raw)
  To: akpm, andriy.shevchenko, boqun.feng, brendanhiggins, jic23,
	laurent.pinchart, linux-mm, linux, longman, mchehab,
	miguel.ojeda.sandonis, mingo, mm-commits, peterz, regressions,
	sakari.ailus, tglx, torvalds, will

From: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Subject: kernel.h: split out container_of() and typeof_member() macros

kernel.h has been used as a dumping ground for all kinds of stuff for a
long time.  Here is an attempt to clean it up by splitting out the
container_of() and typeof_member() macros.

For the time being, include the new header back in kernel.h to avoid
twisted indirect includes for existing users.

Note, there are _a lot_ of headers and modules that include kernel.h
solely for one of these macros.  Splitting them out unburdens the compiler
from the twisted inclusion paths and makes new code cleaner in the future.
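
For readers unfamiliar with the macro being moved, here is a minimal
userspace sketch of what container_of() does.  It is a simplified
stand-in under stated assumptions: demo_container_of() and the demo_*
structs are hypothetical names, and the compile-time type check done by
the kernel macro is left out.

```c
#include <stddef.h>

/*
 * Simplified stand-in for container_of(): step back from a pointer to a
 * member to the start of the structure that embeds it.  No type checking
 * is done here, unlike in the kernel macro.
 */
#define demo_container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Hypothetical list link embedded in a larger object. */
struct demo_node {
	struct demo_node *next;
};

struct demo_item {
	int value;
	struct demo_node link;
};

/* Recover the enclosing demo_item from a pointer to its embedded link. */
static int demo_value_from_link(struct demo_node *n)
{
	struct demo_item *item = demo_container_of(n, struct demo_item, link);

	return item->value;
}
```

A header that only needs this kind of lookup can then include the new,
small header instead of all of kernel.h.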

Link: https://lkml.kernel.org/r/20211013170417.87909-3-andriy.shevchenko@linux.intel.com
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Cc: Boqun Feng <boqun.feng@gmail.com>
Cc: Brendan Higgins <brendanhiggins@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jonathan Cameron <jic23@kernel.org>
Cc: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Cc: Mauro Carvalho Chehab <mchehab@kernel.org>
Cc: Miguel Ojeda <miguel.ojeda.sandonis@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Cc: Sakari Ailus <sakari.ailus@linux.intel.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Thorsten Leemhuis <regressions@leemhuis.info>
Cc: Waiman Long <longman@redhat.com>
Cc: Will Deacon <will@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 include/linux/container_of.h |   40 +++++++++++++++++++++++++++++++++
 include/linux/kernel.h       |   33 ---------------------------
 2 files changed, 41 insertions(+), 32 deletions(-)

--- /dev/null
+++ a/include/linux/container_of.h
@@ -0,0 +1,40 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef _LINUX_CONTAINER_OF_H
+#define _LINUX_CONTAINER_OF_H
+
+#include <linux/build_bug.h>
+#include <linux/err.h>
+
+#define typeof_member(T, m)	typeof(((T*)0)->m)
+
+/**
+ * container_of - cast a member of a structure out to the containing structure
+ * @ptr:	the pointer to the member.
+ * @type:	the type of the container struct this is embedded in.
+ * @member:	the name of the member within the struct.
+ *
+ */
+#define container_of(ptr, type, member) ({				\
+	void *__mptr = (void *)(ptr);					\
+	BUILD_BUG_ON_MSG(!__same_type(*(ptr), ((type *)0)->member) &&	\
+			 !__same_type(*(ptr), void),			\
+			 "pointer type mismatch in container_of()");	\
+	((type *)(__mptr - offsetof(type, member))); })
+
+/**
+ * container_of_safe - cast a member of a structure out to the containing structure
+ * @ptr:	the pointer to the member.
+ * @type:	the type of the container struct this is embedded in.
+ * @member:	the name of the member within the struct.
+ *
+ * If IS_ERR_OR_NULL(ptr), ptr is returned unchanged.
+ */
+#define container_of_safe(ptr, type, member) ({				\
+	void *__mptr = (void *)(ptr);					\
+	BUILD_BUG_ON_MSG(!__same_type(*(ptr), ((type *)0)->member) &&	\
+			 !__same_type(*(ptr), void),			\
+			 "pointer type mismatch in container_of()");	\
+	IS_ERR_OR_NULL(__mptr) ? ERR_CAST(__mptr) :			\
+		((type *)(__mptr - offsetof(type, member))); })
+
+#endif	/* _LINUX_CONTAINER_OF_H */
--- a/include/linux/kernel.h~kernelh-split-out-container_of-and-typeof_member-macros
+++ a/include/linux/kernel.h
@@ -9,6 +9,7 @@
 #include <linux/stddef.h>
 #include <linux/types.h>
 #include <linux/compiler.h>
+#include <linux/container_of.h>
 #include <linux/bitops.h>
 #include <linux/kstrtox.h>
 #include <linux/log2.h>
@@ -52,8 +53,6 @@
 }					\
 )
 
-#define typeof_member(T, m)	typeof(((T*)0)->m)
-
 #define _RET_IP_		(unsigned long)__builtin_return_address(0)
 #define _THIS_IP_  ({ __label__ __here; __here: (unsigned long)&&__here; })
 
@@ -484,36 +483,6 @@ static inline void ftrace_dump(enum ftra
 #define __CONCAT(a, b) a ## b
 #define CONCATENATE(a, b) __CONCAT(a, b)
 
-/**
- * container_of - cast a member of a structure out to the containing structure
- * @ptr:	the pointer to the member.
- * @type:	the type of the container struct this is embedded in.
- * @member:	the name of the member within the struct.
- *
- */
-#define container_of(ptr, type, member) ({				\
-	void *__mptr = (void *)(ptr);					\
-	BUILD_BUG_ON_MSG(!__same_type(*(ptr), ((type *)0)->member) &&	\
-			 !__same_type(*(ptr), void),			\
-			 "pointer type mismatch in container_of()");	\
-	((type *)(__mptr - offsetof(type, member))); })
-
-/**
- * container_of_safe - cast a member of a structure out to the containing structure
- * @ptr:	the pointer to the member.
- * @type:	the type of the container struct this is embedded in.
- * @member:	the name of the member within the struct.
- *
- * If IS_ERR_OR_NULL(ptr), ptr is returned unchanged.
- */
-#define container_of_safe(ptr, type, member) ({				\
-	void *__mptr = (void *)(ptr);					\
-	BUILD_BUG_ON_MSG(!__same_type(*(ptr), ((type *)0)->member) &&	\
-			 !__same_type(*(ptr), void),			\
-			 "pointer type mismatch in container_of()");	\
-	IS_ERR_OR_NULL(__mptr) ? ERR_CAST(__mptr) :			\
-		((type *)(__mptr - offsetof(type, member))); })
-
 /* Rebuild everything on CONFIG_FTRACE_MCOUNT_RECORD */
 #ifdef CONFIG_FTRACE_MCOUNT_RECORD
 # define REBUILD_DUE_TO_FTRACE_MCOUNT_RECORD
_



* [patch 16/87] include/kunit/test.h: replace kernel.h with the necessary inclusions
  2021-11-09  2:30 incoming Andrew Morton
                   ` (14 preceding siblings ...)
  2021-11-09  2:32 ` [patch 15/87] kernel.h: split out container_of() and typeof_member() macros Andrew Morton
@ 2021-11-09  2:32 ` Andrew Morton
  2021-11-09  2:32 ` [patch 17/87] include/linux/list.h: " Andrew Morton
                   ` (70 subsequent siblings)
  86 siblings, 0 replies; 98+ messages in thread
From: Andrew Morton @ 2021-11-09  2:32 UTC (permalink / raw)
  To: akpm, andriy.shevchenko, boqun.feng, brendanhiggins, jic23,
	laurent.pinchart, linux-mm, linux, longman, mchehab,
	miguel.ojeda.sandonis, mingo, mm-commits, peterz, regressions,
	sakari.ailus, tglx, torvalds, will

From: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Subject: include/kunit/test.h: replace kernel.h with the necessary inclusions

When kernel.h is used in the headers it adds a lot to dependency hell,
especially when circular dependencies are involved.

Replace kernel.h inclusion with the list of what is really being used.

Link: https://lkml.kernel.org/r/20211013170417.87909-4-andriy.shevchenko@linux.intel.com
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Cc: Boqun Feng <boqun.feng@gmail.com>
Cc: Brendan Higgins <brendanhiggins@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jonathan Cameron <jic23@kernel.org>
Cc: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Cc: Mauro Carvalho Chehab <mchehab@kernel.org>
Cc: Miguel Ojeda <miguel.ojeda.sandonis@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Cc: Sakari Ailus <sakari.ailus@linux.intel.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Thorsten Leemhuis <regressions@leemhuis.info>
Cc: Waiman Long <longman@redhat.com>
Cc: Will Deacon <will@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 include/kunit/test.h |   13 +++++++++++--
 1 file changed, 11 insertions(+), 2 deletions(-)

--- a/include/kunit/test.h~kunit-replace-kernelh-with-the-necessary-inclusions
+++ a/include/kunit/test.h
@@ -11,11 +11,20 @@
 
 #include <kunit/assert.h>
 #include <kunit/try-catch.h>
-#include <linux/kernel.h>
+
+#include <linux/container_of.h>
+#include <linux/err.h>
+#include <linux/init.h>
+#include <linux/kconfig.h>
+#include <linux/kref.h>
+#include <linux/list.h>
 #include <linux/module.h>
 #include <linux/slab.h>
+#include <linux/spinlock.h>
+#include <linux/string.h>
 #include <linux/types.h>
-#include <linux/kref.h>
+
+#include <asm/rwonce.h>
 
 struct kunit_resource;
 
_



* [patch 17/87] include/linux/list.h: replace kernel.h with the necessary inclusions
  2021-11-09  2:30 incoming Andrew Morton
                   ` (15 preceding siblings ...)
  2021-11-09  2:32 ` [patch 16/87] include/kunit/test.h: replace kernel.h with the necessary inclusions Andrew Morton
@ 2021-11-09  2:32 ` Andrew Morton
  2021-11-09  2:32 ` [patch 18/87] include/linux/llist.h: " Andrew Morton
                   ` (69 subsequent siblings)
  86 siblings, 0 replies; 98+ messages in thread
From: Andrew Morton @ 2021-11-09  2:32 UTC (permalink / raw)
  To: akpm, andriy.shevchenko, boqun.feng, brendanhiggins, jic23,
	laurent.pinchart, linux-mm, linux, longman, mchehab,
	miguel.ojeda.sandonis, mingo, mm-commits, peterz, regressions,
	sakari.ailus, tglx, torvalds, will

From: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Subject: include/linux/list.h: replace kernel.h with the necessary inclusions

When kernel.h is used in the headers it adds a lot to dependency hell,
especially when circular dependencies are involved.

Replace kernel.h inclusion with the list of what is really being used.

Link: https://lkml.kernel.org/r/20211013170417.87909-5-andriy.shevchenko@linux.intel.com
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Cc: Boqun Feng <boqun.feng@gmail.com>
Cc: Brendan Higgins <brendanhiggins@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jonathan Cameron <jic23@kernel.org>
Cc: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Cc: Mauro Carvalho Chehab <mchehab@kernel.org>
Cc: Miguel Ojeda <miguel.ojeda.sandonis@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Cc: Sakari Ailus <sakari.ailus@linux.intel.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Thorsten Leemhuis <regressions@leemhuis.info>
Cc: Waiman Long <longman@redhat.com>
Cc: Will Deacon <will@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 include/linux/list.h |    4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

--- a/include/linux/list.h~list-replace-kernelh-with-the-necessary-inclusions
+++ a/include/linux/list.h
@@ -2,11 +2,13 @@
 #ifndef _LINUX_LIST_H
 #define _LINUX_LIST_H
 
+#include <linux/container_of.h>
 #include <linux/types.h>
 #include <linux/stddef.h>
 #include <linux/poison.h>
 #include <linux/const.h>
-#include <linux/kernel.h>
+
+#include <asm/barrier.h>
 
 /*
  * Circular doubly linked list implementation.
_



* [patch 18/87] include/linux/llist.h: replace kernel.h with the necessary inclusions
  2021-11-09  2:30 incoming Andrew Morton
                   ` (16 preceding siblings ...)
  2021-11-09  2:32 ` [patch 17/87] include/linux/list.h: " Andrew Morton
@ 2021-11-09  2:32 ` Andrew Morton
  2021-11-09  2:32 ` [patch 19/87] include/linux/plist.h: " Andrew Morton
                   ` (68 subsequent siblings)
  86 siblings, 0 replies; 98+ messages in thread
From: Andrew Morton @ 2021-11-09  2:32 UTC (permalink / raw)
  To: akpm, andriy.shevchenko, boqun.feng, brendanhiggins, jic23,
	laurent.pinchart, linux-mm, linux, longman, mchehab,
	miguel.ojeda.sandonis, mingo, mm-commits, peterz, regressions,
	sakari.ailus, tglx, torvalds, will

From: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Subject: include/linux/llist.h: replace kernel.h with the necessary inclusions

When kernel.h is used in the headers it adds a lot to dependency hell,
especially when circular dependencies are involved.

Replace kernel.h inclusion with the list of what is really being used.

Link: https://lkml.kernel.org/r/20211013170417.87909-6-andriy.shevchenko@linux.intel.com
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Cc: Boqun Feng <boqun.feng@gmail.com>
Cc: Brendan Higgins <brendanhiggins@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jonathan Cameron <jic23@kernel.org>
Cc: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Cc: Mauro Carvalho Chehab <mchehab@kernel.org>
Cc: Miguel Ojeda <miguel.ojeda.sandonis@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Cc: Sakari Ailus <sakari.ailus@linux.intel.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Thorsten Leemhuis <regressions@leemhuis.info>
Cc: Waiman Long <longman@redhat.com>
Cc: Will Deacon <will@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 include/linux/llist.h |    4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

--- a/include/linux/llist.h~llist-replace-kernelh-with-the-necessary-inclusions
+++ a/include/linux/llist.h
@@ -49,7 +49,9 @@
  */
 
 #include <linux/atomic.h>
-#include <linux/kernel.h>
+#include <linux/container_of.h>
+#include <linux/stddef.h>
+#include <linux/types.h>
 
 struct llist_head {
 	struct llist_node *first;
_



* [patch 19/87] include/linux/plist.h: replace kernel.h with the necessary inclusions
  2021-11-09  2:30 incoming Andrew Morton
                   ` (17 preceding siblings ...)
  2021-11-09  2:32 ` [patch 18/87] include/linux/llist.h: " Andrew Morton
@ 2021-11-09  2:32 ` Andrew Morton
  2021-11-09  2:32 ` [patch 20/87] include/media/media-entity.h: " Andrew Morton
                   ` (67 subsequent siblings)
  86 siblings, 0 replies; 98+ messages in thread
From: Andrew Morton @ 2021-11-09  2:32 UTC (permalink / raw)
  To: akpm, andriy.shevchenko, boqun.feng, brendanhiggins, jic23,
	laurent.pinchart, linux-mm, linux, longman, mchehab,
	miguel.ojeda.sandonis, mingo, mm-commits, peterz, regressions,
	sakari.ailus, tglx, torvalds, will

From: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Subject: include/linux/plist.h: replace kernel.h with the necessary inclusions

When kernel.h is used in the headers it adds a lot to dependency hell,
especially when circular dependencies are involved.

Replace kernel.h inclusion with the list of what is really being used.

Link: https://lkml.kernel.org/r/20211013170417.87909-7-andriy.shevchenko@linux.intel.com
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Cc: Boqun Feng <boqun.feng@gmail.com>
Cc: Brendan Higgins <brendanhiggins@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jonathan Cameron <jic23@kernel.org>
Cc: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Cc: Mauro Carvalho Chehab <mchehab@kernel.org>
Cc: Miguel Ojeda <miguel.ojeda.sandonis@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Cc: Sakari Ailus <sakari.ailus@linux.intel.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Thorsten Leemhuis <regressions@leemhuis.info>
Cc: Waiman Long <longman@redhat.com>
Cc: Will Deacon <will@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 include/linux/plist.h |    5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

--- a/include/linux/plist.h~plist-replace-kernelh-with-the-necessary-inclusions
+++ a/include/linux/plist.h
@@ -73,8 +73,11 @@
 #ifndef _LINUX_PLIST_H_
 #define _LINUX_PLIST_H_
 
-#include <linux/kernel.h>
+#include <linux/container_of.h>
 #include <linux/list.h>
+#include <linux/types.h>
+
+#include <asm/bug.h>
 
 struct plist_head {
 	struct list_head node_list;
_



* [patch 20/87] include/media/media-entity.h: replace kernel.h with the necessary inclusions
  2021-11-09  2:30 incoming Andrew Morton
                   ` (18 preceding siblings ...)
  2021-11-09  2:32 ` [patch 19/87] include/linux/plist.h: " Andrew Morton
@ 2021-11-09  2:32 ` Andrew Morton
  2021-11-09  2:32 ` [patch 21/87] include/linux/delay.h: " Andrew Morton
                   ` (66 subsequent siblings)
  86 siblings, 0 replies; 98+ messages in thread
From: Andrew Morton @ 2021-11-09  2:32 UTC (permalink / raw)
  To: akpm, andriy.shevchenko, boqun.feng, brendanhiggins, jic23,
	laurent.pinchart, linux-mm, linux, longman, mchehab,
	miguel.ojeda.sandonis, mingo, mm-commits, peterz, regressions,
	sakari.ailus, tglx, torvalds, will

From: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Subject: include/media/media-entity.h: replace kernel.h with the necessary inclusions

When kernel.h is used in the headers it adds a lot to dependency hell,
especially when circular dependencies are involved.

Replace kernel.h inclusion with the list of what is really being used.

Link: https://lkml.kernel.org/r/20211013170417.87909-8-andriy.shevchenko@linux.intel.com
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Acked-by: Sakari Ailus <sakari.ailus@linux.intel.com>
Cc: Boqun Feng <boqun.feng@gmail.com>
Cc: Brendan Higgins <brendanhiggins@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jonathan Cameron <jic23@kernel.org>
Cc: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Cc: Mauro Carvalho Chehab <mchehab@kernel.org>
Cc: Miguel Ojeda <miguel.ojeda.sandonis@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Thorsten Leemhuis <regressions@leemhuis.info>
Cc: Waiman Long <longman@redhat.com>
Cc: Will Deacon <will@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 include/media/media-entity.h |    3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

--- a/include/media/media-entity.h~media-entity-replace-kernelh-with-the-necessary-inclusions
+++ a/include/media/media-entity.h
@@ -13,10 +13,11 @@
 
 #include <linux/bitmap.h>
 #include <linux/bug.h>
+#include <linux/container_of.h>
 #include <linux/fwnode.h>
-#include <linux/kernel.h>
 #include <linux/list.h>
 #include <linux/media.h>
+#include <linux/types.h>
 
 /* Enums used internally at the media controller to represent graphs */
 
_



* [patch 21/87] include/linux/delay.h: replace kernel.h with the necessary inclusions
  2021-11-09  2:30 incoming Andrew Morton
                   ` (19 preceding siblings ...)
  2021-11-09  2:32 ` [patch 20/87] include/media/media-entity.h: " Andrew Morton
@ 2021-11-09  2:32 ` Andrew Morton
  2021-11-09  2:32 ` [patch 22/87] include/linux/sbitmap.h: " Andrew Morton
                   ` (65 subsequent siblings)
  86 siblings, 0 replies; 98+ messages in thread
From: Andrew Morton @ 2021-11-09  2:32 UTC (permalink / raw)
  To: akpm, andriy.shevchenko, linux-mm, mm-commits, torvalds

From: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Subject: include/linux/delay.h: replace kernel.h with the necessary inclusions

When kernel.h is used in the headers it adds a lot to dependency hell,
especially when circular dependencies are involved.

Replace kernel.h inclusion with the list of what is really being used.

[akpm@linux-foundation.org: cxd2880_common.h needs bits.h for GENMASK()]
[andriy.shevchenko@linux.intel.com: delay.h: fix for removed kernel.h]
  Link: https://lkml.kernel.org/r/20211028170143.56523-1-andriy.shevchenko@linux.intel.com
[akpm@linux-foundation.org: include/linux/fwnode.h needs bits.h for BIT()]
Link: https://lkml.kernel.org/r/20211027150324.79827-1-andriy.shevchenko@linux.intel.com
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 arch/riscv/lib/delay.c                               |    4 ++++
 arch/s390/include/asm/facility.h                     |    4 ++++
 drivers/media/dvb-frontends/cxd2880/cxd2880_common.h |    1 +
 include/linux/delay.h                                |    2 +-
 include/linux/fwnode.h                               |    1 +
 5 files changed, 11 insertions(+), 1 deletion(-)

--- a/arch/riscv/lib/delay.c~delay-replace-kernelh-with-the-necessary-inclusions
+++ a/arch/riscv/lib/delay.c
@@ -4,10 +4,14 @@
  */
 
 #include <linux/delay.h>
+#include <linux/math.h>
 #include <linux/param.h>
 #include <linux/timex.h>
+#include <linux/types.h>
 #include <linux/export.h>
 
+#include <asm/processor.h>
+
 /*
  * This is copies from arch/arm/include/asm/delay.h
  *
--- a/arch/s390/include/asm/facility.h~delay-replace-kernelh-with-the-necessary-inclusions
+++ a/arch/s390/include/asm/facility.h
@@ -9,8 +9,12 @@
 #define __ASM_FACILITY_H
 
 #include <asm/facility-defs.h>
+
+#include <linux/minmax.h>
 #include <linux/string.h>
+#include <linux/types.h>
 #include <linux/preempt.h>
+
 #include <asm/lowcore.h>
 
 #define MAX_FACILITY_BIT (sizeof(stfle_fac_list) * 8)
--- a/drivers/media/dvb-frontends/cxd2880/cxd2880_common.h~delay-replace-kernelh-with-the-necessary-inclusions
+++ a/drivers/media/dvb-frontends/cxd2880/cxd2880_common.h
@@ -12,6 +12,7 @@
 #include <linux/types.h>
 #include <linux/errno.h>
 #include <linux/delay.h>
+#include <linux/bits.h>
 #include <linux/string.h>
 
 int cxd2880_convert2s_complement(u32 value, u32 bitlen);
--- a/include/linux/delay.h~delay-replace-kernelh-with-the-necessary-inclusions
+++ a/include/linux/delay.h
@@ -19,7 +19,7 @@
  *   https://lists.openwall.net/linux-kernel/2011/01/09/56
  */
 
-#include <linux/kernel.h>
+#include <linux/math.h>
 
 extern unsigned long loops_per_jiffy;
 
--- a/include/linux/fwnode.h~delay-replace-kernelh-with-the-necessary-inclusions
+++ a/include/linux/fwnode.h
@@ -11,6 +11,7 @@
 
 #include <linux/types.h>
 #include <linux/list.h>
+#include <linux/bits.h>
 #include <linux/err.h>
 
 struct fwnode_operations;
_



* [patch 22/87] include/linux/sbitmap.h: replace kernel.h with the necessary inclusions
  2021-11-09  2:30 incoming Andrew Morton
                   ` (20 preceding siblings ...)
  2021-11-09  2:32 ` [patch 21/87] include/linux/delay.h: " Andrew Morton
@ 2021-11-09  2:32 ` Andrew Morton
  2021-11-09  2:32 ` [patch 23/87] include/linux/radix-tree.h: " Andrew Morton
                   ` (64 subsequent siblings)
  86 siblings, 0 replies; 98+ messages in thread
From: Andrew Morton @ 2021-11-09  2:32 UTC (permalink / raw)
  To: akpm, andriy.shevchenko, linux-mm, mm-commits, torvalds

From: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Subject: include/linux/sbitmap.h: replace kernel.h with the necessary inclusions

When kernel.h is used in headers it adds a lot to dependency hell,
especially when circular dependencies are involved.

Replace kernel.h inclusion with the list of what is really being used.

Link: https://lkml.kernel.org/r/20211027150437.79921-1-andriy.shevchenko@linux.intel.com
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 include/linux/sbitmap.h |   11 ++++++++++-
 1 file changed, 10 insertions(+), 1 deletion(-)

--- a/include/linux/sbitmap.h~sbitmap-replace-kernelh-with-the-necessary-inclusions
+++ a/include/linux/sbitmap.h
@@ -9,8 +9,17 @@
 #ifndef __LINUX_SCALE_BITMAP_H
 #define __LINUX_SCALE_BITMAP_H
 
-#include <linux/kernel.h>
+#include <linux/atomic.h>
+#include <linux/bitops.h>
+#include <linux/cache.h>
+#include <linux/list.h>
+#include <linux/log2.h>
+#include <linux/minmax.h>
+#include <linux/percpu.h>
 #include <linux/slab.h>
+#include <linux/smp.h>
+#include <linux/types.h>
+#include <linux/wait.h>
 
 struct seq_file;
 
_



* [patch 23/87] include/linux/radix-tree.h: replace kernel.h with the necessary inclusions
  2021-11-09  2:30 incoming Andrew Morton
                   ` (21 preceding siblings ...)
  2021-11-09  2:32 ` [patch 22/87] include/linux/sbitmap.h: " Andrew Morton
@ 2021-11-09  2:32 ` Andrew Morton
  2021-11-09  2:32 ` [patch 24/87] include/linux/generic-radix-tree.h: " Andrew Morton
                   ` (63 subsequent siblings)
  86 siblings, 0 replies; 98+ messages in thread
From: Andrew Morton @ 2021-11-09  2:32 UTC (permalink / raw)
  To: akpm, andriy.shevchenko, linux-mm, mm-commits, torvalds

From: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Subject: include/linux/radix-tree.h: replace kernel.h with the necessary inclusions

When kernel.h is used in headers it adds a lot to dependency hell,
especially when circular dependencies are involved.

Replace kernel.h inclusion with the list of what is really being used.

Link: https://lkml.kernel.org/r/20211027150528.80003-1-andriy.shevchenko@linux.intel.com
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 include/linux/radix-tree.h |    4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

--- a/include/linux/radix-tree.h~radix-tree-replace-kernelh-with-the-necessary-inclusions
+++ a/include/linux/radix-tree.h
@@ -9,8 +9,10 @@
 #define _LINUX_RADIX_TREE_H
 
 #include <linux/bitops.h>
-#include <linux/kernel.h>
+#include <linux/gfp.h>
 #include <linux/list.h>
+#include <linux/lockdep.h>
+#include <linux/math.h>
 #include <linux/percpu.h>
 #include <linux/preempt.h>
 #include <linux/rcupdate.h>
_



* [patch 24/87] include/linux/generic-radix-tree.h: replace kernel.h with the necessary inclusions
  2021-11-09  2:30 incoming Andrew Morton
                   ` (22 preceding siblings ...)
  2021-11-09  2:32 ` [patch 23/87] include/linux/radix-tree.h: " Andrew Morton
@ 2021-11-09  2:32 ` Andrew Morton
  2021-11-09  2:32 ` [patch 25/87] kernel.h: split out instruction pointer accessors Andrew Morton
                   ` (62 subsequent siblings)
  86 siblings, 0 replies; 98+ messages in thread
From: Andrew Morton @ 2021-11-09  2:32 UTC (permalink / raw)
  To: akpm, andriy.shevchenko, linux-mm, mm-commits, torvalds

From: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Subject: include/linux/generic-radix-tree.h: replace kernel.h with the necessary inclusions

When kernel.h is used in headers it adds a lot to dependency hell,
especially when circular dependencies are involved.

Replace kernel.h inclusion with the list of what is really being used.

[akpm@linux-foundation.org: include math.h for round_up()]
Link: https://lkml.kernel.org/r/20211027150548.80042-1-andriy.shevchenko@linux.intel.com
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 include/linux/generic-radix-tree.h |    3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

--- a/include/linux/generic-radix-tree.h~generic-radix-tree-replace-kernelh-with-the-necessary-inclusions
+++ a/include/linux/generic-radix-tree.h
@@ -38,8 +38,9 @@
 
 #include <asm/page.h>
 #include <linux/bug.h>
-#include <linux/kernel.h>
 #include <linux/log2.h>
+#include <linux/math.h>
+#include <linux/types.h>
 
 struct genradix_root;
 
_



* [patch 25/87] kernel.h: split out instruction pointer accessors
  2021-11-09  2:30 incoming Andrew Morton
                   ` (23 preceding siblings ...)
  2021-11-09  2:32 ` [patch 24/87] include/linux/generic-radix-tree.h: " Andrew Morton
@ 2021-11-09  2:32 ` Andrew Morton
  2021-11-09  2:32 ` [patch 26/87] linux/container_of.h: switch to static_assert Andrew Morton
                   ` (61 subsequent siblings)
  86 siblings, 0 replies; 98+ messages in thread
From: Andrew Morton @ 2021-11-09  2:32 UTC (permalink / raw)
  To: akpm, andriy.shevchenko, linux-mm, mm-commits, sfr, torvalds

From: Stephen Rothwell <sfr@canb.auug.org.au>
Subject: kernel.h: split out instruction pointer accessors

bottom_half.h needs _THIS_IP_ to be standalone, so split that and _RET_IP_
out from kernel.h into the new instruction_pointer.h.  kernel.h itself
needs them, so include the new header there, and replace the include of
kernel.h with the new header in bottom_half.h.

Link: https://lkml.kernel.org/r/20211028161248.45232-1-andriy.shevchenko@linux.intel.com
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 include/linux/bottom_half.h         |    2 +-
 include/linux/instruction_pointer.h |    8 ++++++++
 include/linux/kernel.h              |    4 +---
 3 files changed, 10 insertions(+), 4 deletions(-)

--- a/include/linux/bottom_half.h~kernelh-split-out-instruction-pointer-accessors
+++ a/include/linux/bottom_half.h
@@ -2,7 +2,7 @@
 #ifndef _LINUX_BH_H
 #define _LINUX_BH_H
 
-#include <linux/kernel.h>
+#include <linux/instruction_pointer.h>
 #include <linux/preempt.h>
 
 #if defined(CONFIG_PREEMPT_RT) || defined(CONFIG_TRACE_IRQFLAGS)
--- /dev/null
+++ a/include/linux/instruction_pointer.h
@@ -0,0 +1,8 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef _LINUX_INSTRUCTION_POINTER_H
+#define _LINUX_INSTRUCTION_POINTER_H
+
+#define _RET_IP_		(unsigned long)__builtin_return_address(0)
+#define _THIS_IP_  ({ __label__ __here; __here: (unsigned long)&&__here; })
+
+#endif /* _LINUX_INSTRUCTION_POINTER_H */
--- a/include/linux/kernel.h~kernelh-split-out-instruction-pointer-accessors
+++ a/include/linux/kernel.h
@@ -20,6 +20,7 @@
 #include <linux/printk.h>
 #include <linux/build_bug.h>
 #include <linux/static_call_types.h>
+#include <linux/instruction_pointer.h>
 #include <asm/byteorder.h>
 
 #include <uapi/linux/kernel.h>
@@ -53,9 +54,6 @@
 }					\
 )
 
-#define _RET_IP_		(unsigned long)__builtin_return_address(0)
-#define _THIS_IP_  ({ __label__ __here; __here: (unsigned long)&&__here; })
-
 /**
  * upper_32_bits - return bits 32-63 of a number
  * @n: the number we're accessing
_
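
For readers unfamiliar with the two accessors, here is a minimal userspace
sketch of what they evaluate to.  It assumes GCC or Clang, since it relies
on statement expressions, local labels and labels-as-values.  The DEMO_*
macro names and the two wrapper functions are hypothetical and exist only
for illustration; the macro bodies mirror the ones moved by this patch.

```c
#include <stdio.h>

/* Illustrative copies of the kernel macros under hypothetical names. */
#define DEMO_RET_IP	(unsigned long)__builtin_return_address(0)
#define DEMO_THIS_IP	({ __label__ demo_here; demo_here: (unsigned long)&&demo_here; })

/* noinline so the function keeps a real return address of its own. */
__attribute__((noinline)) unsigned long demo_caller_ip(void)
{
	/* Address the CPU will return to in the caller. */
	return DEMO_RET_IP;
}

__attribute__((noinline)) unsigned long demo_current_ip(void)
{
	/* Address of a local label placed at this point in the code. */
	return DEMO_THIS_IP;
}

/* Hypothetical helper that prints both values for inspection. */
void demo_show_ips(void)
{
	printf("called from %#lx, running at %#lx\n",
	       demo_caller_ip(), demo_current_ip());
}
```

Neither macro needs anything beyond compiler builtins, which is presumably
why they can live in a header with no further includes.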



* [patch 26/87] linux/container_of.h: switch to static_assert
  2021-11-09  2:30 incoming Andrew Morton
                   ` (24 preceding siblings ...)
  2021-11-09  2:32 ` [patch 25/87] kernel.h: split out instruction pointer accessors Andrew Morton
@ 2021-11-09  2:32 ` Andrew Morton
  2021-11-09  2:32 ` [patch 27/87] mailmap: update email address for Colin King Andrew Morton
                   ` (60 subsequent siblings)
  86 siblings, 0 replies; 98+ messages in thread
From: Andrew Morton @ 2021-11-09  2:32 UTC (permalink / raw)
  To: akpm, andriy.shevchenko, linux-mm, linux, mm-commits,
	ndesaulniers, ojeda, torvalds

From: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Subject: linux/container_of.h: switch to static_assert

_Static_assert() is evaluated already in the compiler's frontend, and
gives a somewhat more to-the-point error, compared to the BUILD_BUG_ON
macro, which only fires after the optimizer has had a chance to eliminate
calls to functions marked with __attribute__((error)).  In theory, this
might make builds a tiny bit faster.

There's also a little less gunk in the error message emitted:

lib/sort.c: In function `foo':
./include/linux/build_bug.h:78:41: error: static assertion failed: "pointer type mismatch in container_of()"
   78 | #define __static_assert(expr, msg, ...) _Static_assert(expr, msg)

compared to

lib/sort.c: In function `foo':
././include/linux/compiler_types.h:322:38: error: call to `__compiletime_assert_2' declared with attribute error: pointer type mismatch in container_of()
  322 |  _compiletime_assert(condition, msg, __compiletime_assert_, __COUNTER__)

While at it, fix the copy-pasto in container_of_safe().

Link: https://lkml.kernel.org/r/20211015090530.2774079-1-linux@rasmusvillemoes.dk
Link: https://lore.kernel.org/lkml/20211014132331.GA4811@kernel.org/T/
Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Reviewed-by: Miguel Ojeda <ojeda@kernel.org>
Acked-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Reviewed-by: Nick Desaulniers <ndesaulniers@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 include/linux/container_of.h |   12 ++++++------
 1 file changed, 6 insertions(+), 6 deletions(-)

--- a/include/linux/container_of.h~linux-container_ofh-switch-to-static_assert
+++ a/include/linux/container_of.h
@@ -16,9 +16,9 @@
  */
 #define container_of(ptr, type, member) ({				\
 	void *__mptr = (void *)(ptr);					\
-	BUILD_BUG_ON_MSG(!__same_type(*(ptr), ((type *)0)->member) &&	\
-			 !__same_type(*(ptr), void),			\
-			 "pointer type mismatch in container_of()");	\
+	static_assert(__same_type(*(ptr), ((type *)0)->member) ||	\
+		      __same_type(*(ptr), void),			\
+		      "pointer type mismatch in container_of()");	\
 	((type *)(__mptr - offsetof(type, member))); })
 
 /**
@@ -31,9 +31,9 @@
  */
 #define container_of_safe(ptr, type, member) ({				\
 	void *__mptr = (void *)(ptr);					\
-	BUILD_BUG_ON_MSG(!__same_type(*(ptr), ((type *)0)->member) &&	\
-			 !__same_type(*(ptr), void),			\
-			 "pointer type mismatch in container_of()");	\
+	static_assert(__same_type(*(ptr), ((type *)0)->member) ||	\
+		      __same_type(*(ptr), void),			\
+		      "pointer type mismatch in container_of_safe()");	\
 	IS_ERR_OR_NULL(__mptr) ? ERR_CAST(__mptr) :			\
 		((type *)(__mptr - offsetof(type, member))); })
 
_
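
A minimal userspace sketch of the same idea follows, assuming GCC or Clang
in C11 mode.  The names demo_same_type(), demo_container_of(), struct
demo_node and struct demo_item are hypothetical stand-ins and not kernel
API.  Note that the condition is the De Morgan inverse of the old one:
BUILD_BUG_ON_MSG() took the failing condition (!A && !B), while
static_assert() takes the condition that must hold (A || B).

```c
#include <stddef.h>

/* Hypothetical stand-in for the kernel's __same_type(). */
#define demo_same_type(a, b) \
	__builtin_types_compatible_p(__typeof__(a), __typeof__(b))

#define demo_container_of(ptr, type, member) ({				\
	void *demo_mptr = (void *)(ptr);				\
	_Static_assert(demo_same_type(*(ptr), ((type *)0)->member) ||	\
		       demo_same_type(*(ptr), void),			\
		       "pointer type mismatch in demo_container_of()");	\
	((type *)((char *)demo_mptr - offsetof(type, member))); })

struct demo_node {
	int key;
};

struct demo_item {
	long payload;
	struct demo_node node;	/* embedded member we navigate from */
};

/* Recover the enclosing demo_item from a pointer to its node member. */
struct demo_item *demo_item_from_node(struct demo_node *n)
{
	return demo_container_of(n, struct demo_item, node);
}
```

Passing a pointer of the wrong type, for example a long * for the node
member, should be rejected by the compiler's frontend with the assertion
message rather than at a later optimization stage.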



* [patch 27/87] mailmap: update email address for Colin King
  2021-11-09  2:30 incoming Andrew Morton
                   ` (25 preceding siblings ...)
  2021-11-09  2:32 ` [patch 26/87] linux/container_of.h: switch to static_assert Andrew Morton
@ 2021-11-09  2:32 ` Andrew Morton
  2021-11-09  2:32 ` [patch 28/87] MAINTAINERS: add "exec & binfmt" section with myself and Eric Andrew Morton
                   ` (59 subsequent siblings)
  86 siblings, 0 replies; 98+ messages in thread
From: Andrew Morton @ 2021-11-09  2:32 UTC (permalink / raw)
  To: akpm, colin.i.king, colin.i.king, linux-mm, mm-commits, torvalds

From: Colin Ian King <colin.i.king@googlemail.com>
Subject: mailmap: update email address for Colin King

Colin King has moved to Intel, so update the gmail and Canonical email
addresses.

Link: https://lkml.kernel.org/r/20211102231617.78569-1-colin.i.king@gmail.com
Signed-off-by: Colin Ian King <colin.i.king@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 .mailmap |    2 ++
 1 file changed, 2 insertions(+)

--- a/.mailmap~mailmap-update-email-address-for-colin-king
+++ a/.mailmap
@@ -73,6 +73,8 @@ Chris Chiu <chris.chiu@canonical.com> <c
 Chris Chiu <chris.chiu@canonical.com> <chiu@endlessos.org>
 Christophe Ricard <christophe.ricard@gmail.com>
 Christoph Hellwig <hch@lst.de>
+Colin Ian King <colin.king@intel.com> <colin.king@canonical.com>
+Colin Ian King <colin.king@intel.com> <colin.i.king@gmail.com>
 Corey Minyard <minyard@acm.org>
 Damian Hobson-Garcia <dhobsong@igel.co.jp>
 Daniel Borkmann <daniel@iogearbox.net> <danborkmann@googlemail.com>
_



* [patch 28/87] MAINTAINERS: add "exec & binfmt" section with myself and Eric
  2021-11-09  2:30 incoming Andrew Morton
                   ` (26 preceding siblings ...)
  2021-11-09  2:32 ` [patch 27/87] mailmap: update email address for Colin King Andrew Morton
@ 2021-11-09  2:32 ` Andrew Morton
  2021-11-09  2:32 ` [patch 29/87] MAINTAINERS: rectify entry for ARM/TOSHIBA VISCONTI ARCHITECTURE Andrew Morton
                   ` (58 subsequent siblings)
  86 siblings, 0 replies; 98+ messages in thread
From: Andrew Morton @ 2021-11-09  2:32 UTC (permalink / raw)
  To: akpm, ebiederm, keescook, linux-mm, mm-commits, torvalds

From: Kees Cook <keescook@chromium.org>
Subject: MAINTAINERS: add "exec & binfmt" section with myself and Eric

I'd like more continuity of review for the exec and binfmt (and ELF)
stuff.  Eric and I have been the most active lately, so list us as
reviewers.

Link: https://lkml.kernel.org/r/20211006180200.1178142-1-keescook@chromium.org
Signed-off-by: Kees Cook <keescook@chromium.org>
Cc: Eric Biederman <ebiederm@xmission.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 MAINTAINERS |   14 ++++++++++++++
 1 file changed, 14 insertions(+)

--- a/MAINTAINERS~maintainers-add-exec-binfmt-section-with-myself-and-eric
+++ a/MAINTAINERS
@@ -7037,6 +7037,20 @@ F:	include/trace/events/mdio.h
 F:	include/uapi/linux/mdio.h
 F:	include/uapi/linux/mii.h
 
+EXEC & BINFMT API
+R:	Eric Biederman <ebiederm@xmission.com>
+R:	Kees Cook <keescook@chromium.org>
+F:	arch/alpha/kernel/binfmt_loader.c
+F:	arch/x86/ia32/ia32_aout.c
+F:	fs/*binfmt_*.c
+F:	fs/exec.c
+F:	include/linux/binfmts.h
+F:	include/linux/elf.h
+F:	include/uapi/linux/binfmts.h
+F:	tools/testing/selftests/exec/
+N:	asm/elf.h
+N:	binfmt
+
 EXFAT FILE SYSTEM
 M:	Namjae Jeon <linkinjeon@kernel.org>
 M:	Sungjong Seo <sj1557.seo@samsung.com>
_



* [patch 29/87] MAINTAINERS: rectify entry for ARM/TOSHIBA VISCONTI ARCHITECTURE
  2021-11-09  2:30 incoming Andrew Morton
                   ` (27 preceding siblings ...)
  2021-11-09  2:32 ` [patch 28/87] MAINTAINERS: add "exec & binfmt" section with myself and Eric Andrew Morton
@ 2021-11-09  2:32 ` Andrew Morton
  2021-11-09  2:32 ` [patch 30/87] MAINTAINERS: rectify entry for HIKEY960 ONBOARD USB GPIO HUB DRIVER Andrew Morton
                   ` (57 subsequent siblings)
  86 siblings, 0 replies; 98+ messages in thread
From: Andrew Morton @ 2021-11-09  2:32 UTC (permalink / raw)
  To: akpm, anitha.chrisanthus, chenyu56, edmund.j.dea, gregkh, joe,
	john.stultz, linux-mm, lukas.bulwahn, mchehab+huawei, mm-commits,
	nobuhiro1.iwamatsu, punit1.agrawal, ralf.ramsauer, robh+dt, sam,
	torvalds, wilken.gottwalt

From: Lukas Bulwahn <lukas.bulwahn@gmail.com>
Subject: MAINTAINERS: rectify entry for ARM/TOSHIBA VISCONTI ARCHITECTURE

Patch series "Rectify file references for dt-bindings in MAINTAINERS", v5.

This patch series cleans up some file references for dt-bindings in
MAINTAINERS.


This patch (of 4):

Commit 836863a08c99 ("MAINTAINERS: Add information for Toshiba Visconti
ARM SoCs") refers to the non-existing file toshiba,tmpv7700-pinctrl.yaml
in ./Documentation/devicetree/bindings/pinctrl/.  Commit 1825c1fe0057
("pinctrl: Add DT bindings for Toshiba Visconti TMPV7700 SoC") originating
from the same patch series however adds the file
toshiba,visconti-pinctrl.yaml in that directory instead.

So, refer to toshiba,visconti-pinctrl.yaml in the ARM/TOSHIBA VISCONTI
ARCHITECTURE section instead.

Link: https://lkml.kernel.org/r/20211026141902.4865-1-lukas.bulwahn@gmail.com
Link: https://lkml.kernel.org/r/20211026141902.4865-2-lukas.bulwahn@gmail.com
Fixes: 836863a08c99 ("MAINTAINERS: Add information for Toshiba Visconti ARM SoCs")
Signed-off-by: Lukas Bulwahn <lukas.bulwahn@gmail.com>
Acked-by: Nobuhiro Iwamatsu <nobuhiro1.iwamatsu@toshiba.co.jp>
Reviewed-by: Nobuhiro Iwamatsu <nobuhiro1.iwamatsu@toshiba.co.jp>
Cc: Rob Herring <robh+dt@kernel.org>
Cc: Punit Agrawal <punit1.agrawal@toshiba.co.jp>
Cc: Anitha Chrisanthus <anitha.chrisanthus@intel.com>
Cc: Wilken Gottwalt <wilken.gottwalt@posteo.net>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: John Stultz <john.stultz@linaro.org>
Cc: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Cc: Yu Chen <chenyu56@huawei.com>
Cc: Sam Ravnborg <sam@ravnborg.org>
Cc: Edmund Dea <edmund.j.dea@intel.com>
Cc: Joe Perches <joe@perches.com>
Cc: Ralf Ramsauer <ralf.ramsauer@oth-regensburg.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 MAINTAINERS |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- a/MAINTAINERS~maintainers-rectify-entry-for-arm-toshiba-visconti-architecture
+++ a/MAINTAINERS
@@ -2745,7 +2745,7 @@ F:	Documentation/devicetree/bindings/arm
 F:	Documentation/devicetree/bindings/net/toshiba,visconti-dwmac.yaml
 F:	Documentation/devicetree/bindings/gpio/toshiba,gpio-visconti.yaml
 F:	Documentation/devicetree/bindings/pci/toshiba,visconti-pcie.yaml
-F:	Documentation/devicetree/bindings/pinctrl/toshiba,tmpv7700-pinctrl.yaml
+F:	Documentation/devicetree/bindings/pinctrl/toshiba,visconti-pinctrl.yaml
 F:	Documentation/devicetree/bindings/watchdog/toshiba,visconti-wdt.yaml
 F:	arch/arm64/boot/dts/toshiba/
 F:	drivers/net/ethernet/stmicro/stmmac/dwmac-visconti.c
_



* [patch 30/87] MAINTAINERS: rectify entry for HIKEY960 ONBOARD USB GPIO HUB DRIVER
  2021-11-09  2:30 incoming Andrew Morton
                   ` (28 preceding siblings ...)
  2021-11-09  2:32 ` [patch 29/87] MAINTAINERS: rectify entry for ARM/TOSHIBA VISCONTI ARCHITECTURE Andrew Morton
@ 2021-11-09  2:32 ` Andrew Morton
  2021-11-09  2:33 ` [patch 31/87] MAINTAINERS: rectify entry for INTEL KEEM BAY DRM DRIVER Andrew Morton
                   ` (56 subsequent siblings)
  86 siblings, 0 replies; 98+ messages in thread
From: Andrew Morton @ 2021-11-09  2:32 UTC (permalink / raw)
  To: akpm, anitha.chrisanthus, chenyu56, edmund.j.dea, gregkh, joe,
	john.stultz, linux-mm, lukas.bulwahn, mchehab+huawei, mm-commits,
	nobuhiro1.iwamatsu, punit1.agrawal, ralf.ramsauer, robh+dt, sam,
	torvalds, wilken.gottwalt

From: Lukas Bulwahn <lukas.bulwahn@gmail.com>
Subject: MAINTAINERS: rectify entry for HIKEY960 ONBOARD USB GPIO HUB DRIVER

Commit 7a6ff4c4cbc3 ("misc: hisi_hikey_usb: Driver to support onboard USB
gpio hub on Hikey960") refers to the non-existing file
./Documentation/devicetree/bindings/misc/hisilicon-hikey-usb.yaml, but
this commit's patch series does not add any related devicetree binding in
misc.

So, just drop this file reference in HIKEY960 ONBOARD USB GPIO HUB DRIVER.

Link: https://lkml.kernel.org/r/20211026141902.4865-3-lukas.bulwahn@gmail.com
Fixes: 7a6ff4c4cbc3 ("misc: hisi_hikey_usb: Driver to support onboard USB gpio hub on Hikey960")
Signed-off-by: Lukas Bulwahn <lukas.bulwahn@gmail.com>
Cc: Anitha Chrisanthus <anitha.chrisanthus@intel.com>
Cc: Edmund Dea <edmund.j.dea@intel.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Joe Perches <joe@perches.com>
Cc: John Stultz <john.stultz@linaro.org>
Cc: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Cc: Nobuhiro Iwamatsu <nobuhiro1.iwamatsu@toshiba.co.jp>
Cc: Punit Agrawal <punit1.agrawal@toshiba.co.jp>
Cc: Ralf Ramsauer <ralf.ramsauer@oth-regensburg.de>
Cc: Rob Herring <robh+dt@kernel.org>
Cc: Sam Ravnborg <sam@ravnborg.org>
Cc: Wilken Gottwalt <wilken.gottwalt@posteo.net>
Cc: Yu Chen <chenyu56@huawei.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 MAINTAINERS |    1 -
 1 file changed, 1 deletion(-)

--- a/MAINTAINERS~maintainers-rectify-entry-for-hikey960-onboard-usb-gpio-hub-driver
+++ a/MAINTAINERS
@@ -8483,7 +8483,6 @@ M:	John Stultz <john.stultz@linaro.org>
 L:	linux-kernel@vger.kernel.org
 S:	Maintained
 F:	drivers/misc/hisi_hikey_usb.c
-F:	Documentation/devicetree/bindings/misc/hisilicon-hikey-usb.yaml
 
 HISILICON PMU DRIVER
 M:	Shaokun Zhang <zhangshaokun@hisilicon.com>
_



* [patch 31/87] MAINTAINERS: rectify entry for INTEL KEEM BAY DRM DRIVER
  2021-11-09  2:30 incoming Andrew Morton
                   ` (29 preceding siblings ...)
  2021-11-09  2:32 ` [patch 30/87] MAINTAINERS: rectify entry for HIKEY960 ONBOARD USB GPIO HUB DRIVER Andrew Morton
@ 2021-11-09  2:33 ` Andrew Morton
  2021-11-09  2:33 ` [patch 32/87] MAINTAINERS: rectify entry for ALLWINNER HARDWARE SPINLOCK SUPPORT Andrew Morton
                   ` (55 subsequent siblings)
  86 siblings, 0 replies; 98+ messages in thread
From: Andrew Morton @ 2021-11-09  2:33 UTC (permalink / raw)
  To: akpm, anitha.chrisanthus, chenyu56, edmund.j.dea, gregkh, joe,
	john.stultz, linux-mm, lukas.bulwahn, mchehab+huawei, mm-commits,
	nobuhiro1.iwamatsu, punit1.agrawal, ralf.ramsauer, robh+dt, sam,
	torvalds, wilken.gottwalt

From: Lukas Bulwahn <lukas.bulwahn@gmail.com>
Subject: MAINTAINERS: rectify entry for INTEL KEEM BAY DRM DRIVER

Commit ed794057b052 ("drm/kmb: Build files for KeemBay Display driver")
refers to the non-existing file intel,kmb_display.yaml in
./Documentation/devicetree/bindings/display/.

Commit 5a76b1ed73b9 ("dt-bindings: display: Add support for Intel KeemBay
Display") originating from the same patch series however adds the file
intel,keembay-display.yaml in that directory instead.

So, refer to intel,keembay-display.yaml in the INTEL KEEM BAY DRM DRIVER
section instead.

Link: https://lkml.kernel.org/r/20211026141902.4865-4-lukas.bulwahn@gmail.com
Fixes: ed794057b052 ("drm/kmb: Build files for KeemBay Display driver")
Signed-off-by: Lukas Bulwahn <lukas.bulwahn@gmail.com>
Cc: Anitha Chrisanthus <anitha.chrisanthus@intel.com>
Cc: Edmund Dea <edmund.j.dea@intel.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Joe Perches <joe@perches.com>
Cc: John Stultz <john.stultz@linaro.org>
Cc: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Cc: Nobuhiro Iwamatsu <nobuhiro1.iwamatsu@toshiba.co.jp>
Cc: Punit Agrawal <punit1.agrawal@toshiba.co.jp>
Cc: Ralf Ramsauer <ralf.ramsauer@oth-regensburg.de>
Cc: Rob Herring <robh+dt@kernel.org>
Cc: Sam Ravnborg <sam@ravnborg.org>
Cc: Wilken Gottwalt <wilken.gottwalt@posteo.net>
Cc: Yu Chen <chenyu56@huawei.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 MAINTAINERS |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- a/MAINTAINERS~maintainers-rectify-entry-for-intel-keem-bay-drm-driver
+++ a/MAINTAINERS
@@ -9530,7 +9530,7 @@ INTEL KEEM BAY DRM DRIVER
 M:	Anitha Chrisanthus <anitha.chrisanthus@intel.com>
 M:	Edmund Dea <edmund.j.dea@intel.com>
 S:	Maintained
-F:	Documentation/devicetree/bindings/display/intel,kmb_display.yaml
+F:	Documentation/devicetree/bindings/display/intel,keembay-display.yaml
 F:	drivers/gpu/drm/kmb/
 
 INTEL KEEM BAY OCS AES/SM4 CRYPTO DRIVER
_



* [patch 32/87] MAINTAINERS: rectify entry for ALLWINNER HARDWARE SPINLOCK SUPPORT
  2021-11-09  2:30 incoming Andrew Morton
                   ` (30 preceding siblings ...)
  2021-11-09  2:33 ` [patch 31/87] MAINTAINERS: rectify entry for INTEL KEEM BAY DRM DRIVER Andrew Morton
@ 2021-11-09  2:33 ` Andrew Morton
  2021-11-09  2:33 ` [patch 33/87] lib, stackdepot: check stackdepot handle before accessing slabs Andrew Morton
                   ` (54 subsequent siblings)
  86 siblings, 0 replies; 98+ messages in thread
From: Andrew Morton @ 2021-11-09  2:33 UTC (permalink / raw)
  To: akpm, anitha.chrisanthus, chenyu56, edmund.j.dea, gregkh, joe,
	john.stultz, linux-mm, lukas.bulwahn, mchehab+huawei, mm-commits,
	nobuhiro1.iwamatsu, punit1.agrawal, ralf.ramsauer, robh+dt, sam,
	torvalds, wilken.gottwalt

From: Lukas Bulwahn <lukas.bulwahn@gmail.com>
Subject: MAINTAINERS: rectify entry for ALLWINNER HARDWARE SPINLOCK SUPPORT

Commit f9e784dcb63f ("dt-bindings: hwlock: add sun6i_hwspinlock") adds
Documentation/devicetree/bindings/hwlock/allwinner,sun6i-a31-hwspinlock.yaml,
but the related commit 3c881e05c814 ("hwspinlock: add sun6i hardware
spinlock support") adds a file reference to
allwinner,sun6i-hwspinlock.yaml instead.

Hence, ./scripts/get_maintainer.pl --self-test=patterns complains:

  warning: no file matches  F:  Documentation/devicetree/bindings/hwlock/allwinner,sun6i-hwspinlock.yaml

Rectify this file reference in ALLWINNER HARDWARE SPINLOCK SUPPORT.

Link: https://lkml.kernel.org/r/20211026141902.4865-5-lukas.bulwahn@gmail.com
Reviewed-by: Wilken Gottwalt <wilken.gottwalt@posteo.net>
Signed-off-by: Lukas Bulwahn <lukas.bulwahn@gmail.com>
Cc: Anitha Chrisanthus <anitha.chrisanthus@intel.com>
Cc: Edmund Dea <edmund.j.dea@intel.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Joe Perches <joe@perches.com>
Cc: John Stultz <john.stultz@linaro.org>
Cc: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Cc: Nobuhiro Iwamatsu <nobuhiro1.iwamatsu@toshiba.co.jp>
Cc: Punit Agrawal <punit1.agrawal@toshiba.co.jp>
Cc: Ralf Ramsauer <ralf.ramsauer@oth-regensburg.de>
Cc: Rob Herring <robh+dt@kernel.org>
Cc: Sam Ravnborg <sam@ravnborg.org>
Cc: Yu Chen <chenyu56@huawei.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 MAINTAINERS |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- a/MAINTAINERS~maintainers-rectify-entry-for-allwinner-hardware-spinlock-support
+++ a/MAINTAINERS
@@ -761,7 +761,7 @@ F:	drivers/crypto/allwinner/
 ALLWINNER HARDWARE SPINLOCK SUPPORT
 M:	Wilken Gottwalt <wilken.gottwalt@posteo.net>
 S:	Maintained
-F:	Documentation/devicetree/bindings/hwlock/allwinner,sun6i-hwspinlock.yaml
+F:	Documentation/devicetree/bindings/hwlock/allwinner,sun6i-a31-hwspinlock.yaml
 F:	drivers/hwspinlock/sun6i_hwspinlock.c
 
 ALLWINNER THERMAL DRIVER
_



* [patch 33/87] lib, stackdepot: check stackdepot handle before accessing slabs
  2021-11-09  2:30 incoming Andrew Morton
                   ` (31 preceding siblings ...)
  2021-11-09  2:33 ` [patch 32/87] MAINTAINERS: rectify entry for ALLWINNER HARDWARE SPINLOCK SUPPORT Andrew Morton
@ 2021-11-09  2:33 ` Andrew Morton
  2021-11-09  2:33 ` [patch 34/87] lib, stackdepot: add helper to print stack entries Andrew Morton
                   ` (53 subsequent siblings)
  86 siblings, 0 replies; 98+ messages in thread
From: Andrew Morton @ 2021-11-09  2:33 UTC (permalink / raw)
  To: airlied, akpm, andreyknvl, daniel, dvyukov, geert, glider,
	imran.f.khan, linux-mm, maarten.lankhorst, mm-commits, mripard,
	ryabinin.a.a, torvalds, tzimmermann, vbabka

From: Imran Khan <imran.f.khan@oracle.com>
Subject: lib, stackdepot: check stackdepot handle before accessing slabs

Patch series "lib, stackdepot: check stackdepot handle before accessing slabs", v2.

PATCH-1: Checks validity of a stackdepot handle before proceeding to
access stackdepot slab/objects.

PATCH-2: Adds a helper in stackdepot, to allow users to print stack
entries just by specifying the stackdepot handle.  It also changes such
users to use this new interface.  

PATCH-3: Adds a helper in stackdepot, to allow users to print stack
entries into buffers just by specifying the stackdepot handle and
destination buffer.  It also changes such users to use this new interface.


This patch (of 3):

stack_depot_save allocates slabs that will be used for storing objects in
the future.  If this slab allocation fails, space allocation for a new
stack_record may also fail, causing stack_depot_save to return 0 as the
handle.  If a user of this handle ends up invoking stack_depot_fetch with
this handle value, the current implementation of stack_depot_fetch will
end up using a slab from the wrong index.  To avoid this, check the handle
value at the beginning.

Link: https://lkml.kernel.org/r/20210915175321.3472770-1-imran.f.khan@oracle.com
Link: https://lkml.kernel.org/r/20210915014806.3206938-1-imran.f.khan@oracle.com
Link: https://lkml.kernel.org/r/20210915014806.3206938-2-imran.f.khan@oracle.com
Signed-off-by: Imran Khan <imran.f.khan@oracle.com>
Suggested-by: Vlastimil Babka <vbabka@suse.cz>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com>
Cc: Alexander Potapenko <glider@google.com>
Cc: Andrey Konovalov <andreyknvl@gmail.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Cc: Maxime Ripard <mripard@kernel.org>
Cc: Thomas Zimmermann <tzimmermann@suse.de>
Cc: David Airlie <airlied@linux.ie>
Cc: Daniel Vetter <daniel@ffwll.ch>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 lib/stackdepot.c |    3 +++
 1 file changed, 3 insertions(+)

--- a/lib/stackdepot.c~lib-stackdepot-check-stackdepot-handle-before-accessing-slabs
+++ a/lib/stackdepot.c
@@ -231,6 +231,9 @@ unsigned int stack_depot_fetch(depot_sta
 	struct stack_record *stack;
 
 	*entries = NULL;
+	if (!handle)
+		return 0;
+
 	if (parts.slabindex > depot_index) {
 		WARN(1, "slab index %d out of bounds (%d) for stack id %08x\n",
 			parts.slabindex, depot_index, handle);
_
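
The pattern can be sketched in userspace with a toy depot in which handle 0
is reserved to mean "save failed".  Everything here (demo_handle_t,
demo_save(), demo_fetch() and the fixed-size table) is a hypothetical
simplification for illustration and not the real stackdepot layout; only
the early "if (!handle) return 0;" check corresponds to this patch.

```c
#include <stddef.h>

typedef unsigned int demo_handle_t;	/* stand-in for depot_stack_handle_t */

#define DEMO_MAX_RECORDS 4
#define DEMO_MAX_ENTRIES 8

struct demo_record {
	unsigned int nr_entries;
	unsigned long entries[DEMO_MAX_ENTRIES];
};

static struct demo_record demo_records[DEMO_MAX_RECORDS];
static unsigned int demo_used;

/* Returns a non-zero handle on success, 0 when no space is left. */
demo_handle_t demo_save(const unsigned long *entries, unsigned int nr)
{
	struct demo_record *rec;
	unsigned int i;

	if (demo_used == DEMO_MAX_RECORDS || nr > DEMO_MAX_ENTRIES)
		return 0;

	rec = &demo_records[demo_used++];
	rec->nr_entries = nr;
	for (i = 0; i < nr; i++)
		rec->entries[i] = entries[i];

	return demo_used;	/* handle is index + 1, so 0 stays reserved */
}

unsigned int demo_fetch(demo_handle_t handle, unsigned long **entries)
{
	*entries = NULL;
	if (!handle)		/* reject the "failed save" handle up front */
		return 0;
	if (handle > demo_used)	/* out-of-range handle */
		return 0;

	*entries = demo_records[handle - 1].entries;
	return demo_records[handle - 1].nr_entries;
}
```

With the early check in place, a caller that blindly passes on a failed
handle gets zero entries back instead of a lookup at an unrelated index.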



* [patch 34/87] lib, stackdepot: add helper to print stack entries
  2021-11-09  2:30 incoming Andrew Morton
                   ` (32 preceding siblings ...)
  2021-11-09  2:33 ` [patch 33/87] lib, stackdepot: check stackdepot handle before accessing slabs Andrew Morton
@ 2021-11-09  2:33 ` Andrew Morton
  2021-11-09  2:33 ` [patch 35/87] lib, stackdepot: add helper to print stack entries into buffer Andrew Morton
                   ` (52 subsequent siblings)
  86 siblings, 0 replies; 98+ messages in thread
From: Andrew Morton @ 2021-11-09  2:33 UTC (permalink / raw)
  To: airlied, akpm, andreyknvl, daniel, dvyukov, geert, glider,
	imran.f.khan, linux-mm, maarten.lankhorst, mm-commits, mripard,
	ryabinin.a.a, torvalds, tzimmermann, vbabka

From: Imran Khan <imran.f.khan@oracle.com>
Subject: lib, stackdepot: add helper to print stack entries

To print stack entries, users of stackdepot first use stack_depot_fetch
to get a list of stack entries and then use stack_trace_print to print
this list.  Provide a helper in stackdepot to print stack entries based on
a stackdepot handle.  Also change the above-mentioned users to use this
helper.
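
As a rough userspace illustration of the refactoring, the sketch below
folds the "fetch, then print" sequence into one helper.  toy_depot_fetch()
and toy_depot_print() are hypothetical names backed by a single hard-coded
trace, and unlike the kernel helper (which returns void) the toy print
function returns the number of entries printed so it can be checked.

```c
#include <stddef.h>
#include <stdio.h>

typedef unsigned int toy_handle_t;	/* stand-in for depot_stack_handle_t */

/* One hard-coded "stack trace", reachable through handle 1 only. */
static unsigned long toy_trace[] = { 0x1000, 0x2000, 0x3000 };

unsigned int toy_depot_fetch(toy_handle_t handle, unsigned long **entries)
{
	*entries = NULL;
	if (handle != 1)
		return 0;

	*entries = toy_trace;
	return sizeof(toy_trace) / sizeof(toy_trace[0]);
}

/* The helper: callers pass a handle and no longer touch the entries. */
int toy_depot_print(toy_handle_t handle)
{
	unsigned long *entries;
	unsigned int i, nr;

	nr = toy_depot_fetch(handle, &entries);
	for (i = 0; i < nr; i++)
		printf(" [<%#lx>]\n", entries[i]);

	return (int)nr;
}
```

Each caller then shrinks to a single call, which matches the net line
reduction this patch shows for mm/kasan/report.c and mm/page_owner.c.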

Link: https://lkml.kernel.org/r/20210915014806.3206938-3-imran.f.khan@oracle.com
Signed-off-by: Imran Khan <imran.f.khan@oracle.com>
Suggested-by: Vlastimil Babka <vbabka@suse.cz>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Reviewed-by: Alexander Potapenko <glider@google.com>
Cc: Andrey Konovalov <andreyknvl@gmail.com>
Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com>
Cc: Daniel Vetter <daniel@ffwll.ch>
Cc: David Airlie <airlied@linux.ie>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Cc: Maxime Ripard <mripard@kernel.org>
Cc: Thomas Zimmermann <tzimmermann@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 include/linux/stackdepot.h |    2 ++
 lib/stackdepot.c           |   18 ++++++++++++++++++
 mm/kasan/report.c          |   15 +++------------
 mm/page_owner.c            |   13 ++++---------
 4 files changed, 27 insertions(+), 21 deletions(-)

--- a/include/linux/stackdepot.h~lib-stackdepot-add-helper-to-print-stack-entries
+++ a/include/linux/stackdepot.h
@@ -25,6 +25,8 @@ depot_stack_handle_t stack_depot_save(un
 unsigned int stack_depot_fetch(depot_stack_handle_t handle,
 			       unsigned long **entries);
 
+void stack_depot_print(depot_stack_handle_t stack);
+
 #ifdef CONFIG_STACKDEPOT
 int stack_depot_init(void);
 #else
--- a/lib/stackdepot.c~lib-stackdepot-add-helper-to-print-stack-entries
+++ a/lib/stackdepot.c
@@ -214,6 +214,24 @@ static inline struct stack_record *find_
 }
 
 /**
+ * stack_depot_print - print stack entries from a depot
+ *
+ * @stack:		Stack depot handle which was returned from
+ *			stack_depot_save().
+ *
+ */
+void stack_depot_print(depot_stack_handle_t stack)
+{
+	unsigned long *entries;
+	unsigned int nr_entries;
+
+	nr_entries = stack_depot_fetch(stack, &entries);
+	if (nr_entries > 0)
+		stack_trace_print(entries, nr_entries, 0);
+}
+EXPORT_SYMBOL_GPL(stack_depot_print);
+
+/**
  * stack_depot_fetch - Fetch stack entries from a depot
  *
  * @handle:		Stack depot handle which was returned from
--- a/mm/kasan/report.c~lib-stackdepot-add-helper-to-print-stack-entries
+++ a/mm/kasan/report.c
@@ -132,20 +132,11 @@ static void end_report(unsigned long *fl
 	kasan_enable_current();
 }
 
-static void print_stack(depot_stack_handle_t stack)
-{
-	unsigned long *entries;
-	unsigned int nr_entries;
-
-	nr_entries = stack_depot_fetch(stack, &entries);
-	stack_trace_print(entries, nr_entries, 0);
-}
-
 static void print_track(struct kasan_track *track, const char *prefix)
 {
 	pr_err("%s by task %u:\n", prefix, track->pid);
 	if (track->stack) {
-		print_stack(track->stack);
+		stack_depot_print(track->stack);
 	} else {
 		pr_err("(stack is not available)\n");
 	}
@@ -214,12 +205,12 @@ static void describe_object_stacks(struc
 		return;
 	if (alloc_meta->aux_stack[0]) {
 		pr_err("Last potentially related work creation:\n");
-		print_stack(alloc_meta->aux_stack[0]);
+		stack_depot_print(alloc_meta->aux_stack[0]);
 		pr_err("\n");
 	}
 	if (alloc_meta->aux_stack[1]) {
 		pr_err("Second to last potentially related work creation:\n");
-		print_stack(alloc_meta->aux_stack[1]);
+		stack_depot_print(alloc_meta->aux_stack[1]);
 		pr_err("\n");
 	}
 #endif
--- a/mm/page_owner.c~lib-stackdepot-add-helper-to-print-stack-entries
+++ a/mm/page_owner.c
@@ -394,8 +394,6 @@ void __dump_page_owner(const struct page
 	struct page_ext *page_ext = lookup_page_ext(page);
 	struct page_owner *page_owner;
 	depot_stack_handle_t handle;
-	unsigned long *entries;
-	unsigned int nr_entries;
 	gfp_t gfp_mask;
 	int mt;
 
@@ -423,20 +421,17 @@ void __dump_page_owner(const struct page
 		 page_owner->pid, page_owner->ts_nsec, page_owner->free_ts_nsec);
 
 	handle = READ_ONCE(page_owner->handle);
-	if (!handle) {
+	if (!handle)
 		pr_alert("page_owner allocation stack trace missing\n");
-	} else {
-		nr_entries = stack_depot_fetch(handle, &entries);
-		stack_trace_print(entries, nr_entries, 0);
-	}
+	else
+		stack_depot_print(handle);
 
 	handle = READ_ONCE(page_owner->free_handle);
 	if (!handle) {
 		pr_alert("page_owner free stack trace missing\n");
 	} else {
-		nr_entries = stack_depot_fetch(handle, &entries);
 		pr_alert("page last free stack trace:\n");
-		stack_trace_print(entries, nr_entries, 0);
+		stack_depot_print(handle);
 	}
 
 	if (page_owner->last_migrate_reason != -1)
_


^ permalink raw reply	[flat|nested] 98+ messages in thread

* [patch 35/87] lib, stackdepot: add helper to print stack entries into buffer
  2021-11-09  2:30 incoming Andrew Morton
                   ` (33 preceding siblings ...)
  2021-11-09  2:33 ` [patch 34/87] lib, stackdepot: add helper to print stack entries Andrew Morton
@ 2021-11-09  2:33 ` Andrew Morton
  2021-11-09  2:33 ` [patch 36/87] include/linux/string_helpers.h: add linux/string.h for strlen() Andrew Morton
                   ` (51 subsequent siblings)
  86 siblings, 0 replies; 98+ messages in thread
From: Andrew Morton @ 2021-11-09  2:33 UTC (permalink / raw)
  To: airlied, akpm, andreyknvl, daniel, dvyukov, geert, glider,
	imran.f.khan, jani.nikula, linux-mm, maarten.lankhorst,
	mm-commits, mripard, ryabinin.a.a, torvalds, tzimmermann, vbabka

From: Imran Khan <imran.f.khan@oracle.com>
Subject: lib, stackdepot: add helper to print stack entries into buffer

To print stack entries into a buffer, users of stackdepot first get a
list of stack entries using stack_depot_fetch() and then print this list
into a buffer using stack_trace_snprint().  Provide a helper in stackdepot
for this purpose.  Also change the above-mentioned users to use this
helper.
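
For illustration only, the shape of the helper can be sketched as a
userspace mock-up.  Nothing below is kernel code: the mock_* names and the
single fake stack are assumptions standing in for stack_depot_fetch() and
stack_trace_snprint(); only the fetch, guard, then print structure mirrors
the patch.

```c
/* Userspace mock-up, not kernel code: every mock_* name is hypothetical. */
#include <stdio.h>
#include <stddef.h>

typedef unsigned int depot_stack_handle_t;

/* One fake saved stack: handle 1 refers to it, any other handle is empty. */
static unsigned long mock_stack[] = { 0x1000, 0x2000, 0x3000 };

/* Stand-in for stack_depot_fetch(): returns the number of entries. */
unsigned int mock_depot_fetch(depot_stack_handle_t handle,
			      unsigned long **entries)
{
	*entries = NULL;
	if (handle != 1)
		return 0;
	*entries = mock_stack;
	return sizeof(mock_stack) / sizeof(mock_stack[0]);
}

/* Stand-in for stack_trace_snprint(): one indented address per line. */
int mock_trace_snprint(char *buf, size_t size, const unsigned long *entries,
		       unsigned int nr_entries, int spaces)
{
	int total = 0;
	unsigned int i;

	for (i = 0; i < nr_entries && size; i++) {
		int n = snprintf(buf, size, "%*c%#lx\n", 1 + spaces, ' ',
				 entries[i]);
		if (n < 0)
			break;
		total += n;
		if ((size_t)n >= size) {
			size = 0;	/* output was truncated, stop */
		} else {
			buf += n;
			size -= (size_t)n;
		}
	}
	return total;
}

/* The helper pattern: fetch, then print only if there is something. */
int mock_depot_snprint(depot_stack_handle_t handle, char *buf, size_t size,
		       int spaces)
{
	unsigned long *entries;
	unsigned int nr_entries;

	nr_entries = mock_depot_fetch(handle, &entries);
	return nr_entries ? mock_trace_snprint(buf, size, entries, nr_entries,
					       spaces) : 0;
}
```

With an empty handle the mock helper returns 0 and leaves the buffer
untouched, so callers that append to a partially filled buffer need no
extra check.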

[imran.f.khan@oracle.com: fix build error]
  Link: https://lkml.kernel.org/r/20210915175321.3472770-4-imran.f.khan@oracle.com
[imran.f.khan@oracle.com: export stack_depot_snprint() to modules]
  Link: https://lkml.kernel.org/r/20210916133535.3592491-4-imran.f.khan@oracle.com
Link: https://lkml.kernel.org/r/20210915014806.3206938-4-imran.f.khan@oracle.com
Signed-off-by: Imran Khan <imran.f.khan@oracle.com>
Suggested-by: Vlastimil Babka <vbabka@suse.cz>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Acked-by: Jani Nikula <jani.nikula@intel.com>	[i915]
Cc: Alexander Potapenko <glider@google.com>
Cc: Andrey Konovalov <andreyknvl@gmail.com>
Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com>
Cc: Daniel Vetter <daniel@ffwll.ch>
Cc: David Airlie <airlied@linux.ie>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Cc: Maxime Ripard <mripard@kernel.org>
Cc: Thomas Zimmermann <tzimmermann@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 drivers/gpu/drm/drm_dp_mst_topology.c   |    5 ----
 drivers/gpu/drm/drm_mm.c                |    5 ----
 drivers/gpu/drm/i915/i915_vma.c         |    5 ----
 drivers/gpu/drm/i915/intel_runtime_pm.c |   20 ++++-------------
 include/linux/stackdepot.h              |    3 ++
 lib/stackdepot.c                        |   25 ++++++++++++++++++++++
 mm/page_owner.c                         |    5 ----
 7 files changed, 37 insertions(+), 31 deletions(-)

--- a/drivers/gpu/drm/drm_dp_mst_topology.c~lib-stackdepot-add-helper-to-print-stack-entries-into-buffer
+++ a/drivers/gpu/drm/drm_dp_mst_topology.c
@@ -1668,13 +1668,10 @@ __dump_topology_ref_history(struct drm_d
 	for (i = 0; i < history->len; i++) {
 		const struct drm_dp_mst_topology_ref_entry *entry =
 			&history->entries[i];
-		ulong *entries;
-		uint nr_entries;
 		u64 ts_nsec = entry->ts_nsec;
 		u32 rem_nsec = do_div(ts_nsec, 1000000000);
 
-		nr_entries = stack_depot_fetch(entry->backtrace, &entries);
-		stack_trace_snprint(buf, PAGE_SIZE, entries, nr_entries, 4);
+		stack_depot_snprint(entry->backtrace, buf, PAGE_SIZE, 4);
 
 		drm_printf(&p, "  %d %ss (last at %5llu.%06u):\n%s",
 			   entry->count,
--- a/drivers/gpu/drm/drm_mm.c~lib-stackdepot-add-helper-to-print-stack-entries-into-buffer
+++ a/drivers/gpu/drm/drm_mm.c
@@ -118,8 +118,6 @@ static noinline void save_stack(struct d
 static void show_leaks(struct drm_mm *mm)
 {
 	struct drm_mm_node *node;
-	unsigned long *entries;
-	unsigned int nr_entries;
 	char *buf;
 
 	buf = kmalloc(BUFSZ, GFP_KERNEL);
@@ -133,8 +131,7 @@ static void show_leaks(struct drm_mm *mm
 			continue;
 		}
 
-		nr_entries = stack_depot_fetch(node->stack, &entries);
-		stack_trace_snprint(buf, BUFSZ, entries, nr_entries, 0);
+		stack_depot_snprint(node->stack, buf, BUFSZ, 0);
 		DRM_ERROR("node [%08llx + %08llx]: inserted at\n%s",
 			  node->start, node->size, buf);
 	}
--- a/drivers/gpu/drm/i915/i915_vma.c~lib-stackdepot-add-helper-to-print-stack-entries-into-buffer
+++ a/drivers/gpu/drm/i915/i915_vma.c
@@ -56,8 +56,6 @@ void i915_vma_free(struct i915_vma *vma)
 
 static void vma_print_allocator(struct i915_vma *vma, const char *reason)
 {
-	unsigned long *entries;
-	unsigned int nr_entries;
 	char buf[512];
 
 	if (!vma->node.stack) {
@@ -66,8 +64,7 @@ static void vma_print_allocator(struct i
 		return;
 	}
 
-	nr_entries = stack_depot_fetch(vma->node.stack, &entries);
-	stack_trace_snprint(buf, sizeof(buf), entries, nr_entries, 0);
+	stack_depot_snprint(vma->node.stack, buf, sizeof(buf), 0);
 	DRM_DEBUG_DRIVER("vma.node [%08llx + %08llx] %s: inserted at %s\n",
 			 vma->node.start, vma->node.size, reason, buf);
 }
--- a/drivers/gpu/drm/i915/intel_runtime_pm.c~lib-stackdepot-add-helper-to-print-stack-entries-into-buffer
+++ a/drivers/gpu/drm/i915/intel_runtime_pm.c
@@ -65,16 +65,6 @@ static noinline depot_stack_handle_t __s
 	return stack_depot_save(entries, n, GFP_NOWAIT | __GFP_NOWARN);
 }
 
-static void __print_depot_stack(depot_stack_handle_t stack,
-				char *buf, int sz, int indent)
-{
-	unsigned long *entries;
-	unsigned int nr_entries;
-
-	nr_entries = stack_depot_fetch(stack, &entries);
-	stack_trace_snprint(buf, sz, entries, nr_entries, indent);
-}
-
 static void init_intel_runtime_pm_wakeref(struct intel_runtime_pm *rpm)
 {
 	spin_lock_init(&rpm->debug.lock);
@@ -146,12 +136,12 @@ static void untrack_intel_runtime_pm_wak
 		if (!buf)
 			return;
 
-		__print_depot_stack(stack, buf, PAGE_SIZE, 2);
+		stack_depot_snprint(stack, buf, PAGE_SIZE, 2);
 		DRM_DEBUG_DRIVER("wakeref %x from\n%s", stack, buf);
 
 		stack = READ_ONCE(rpm->debug.last_release);
 		if (stack) {
-			__print_depot_stack(stack, buf, PAGE_SIZE, 2);
+			stack_depot_snprint(stack, buf, PAGE_SIZE, 2);
 			DRM_DEBUG_DRIVER("wakeref last released at\n%s", buf);
 		}
 
@@ -183,12 +173,12 @@ __print_intel_runtime_pm_wakeref(struct
 		return;
 
 	if (dbg->last_acquire) {
-		__print_depot_stack(dbg->last_acquire, buf, PAGE_SIZE, 2);
+		stack_depot_snprint(dbg->last_acquire, buf, PAGE_SIZE, 2);
 		drm_printf(p, "Wakeref last acquired:\n%s", buf);
 	}
 
 	if (dbg->last_release) {
-		__print_depot_stack(dbg->last_release, buf, PAGE_SIZE, 2);
+		stack_depot_snprint(dbg->last_release, buf, PAGE_SIZE, 2);
 		drm_printf(p, "Wakeref last released:\n%s", buf);
 	}
 
@@ -203,7 +193,7 @@ __print_intel_runtime_pm_wakeref(struct
 		rep = 1;
 		while (i + 1 < dbg->count && dbg->owners[i + 1] == stack)
 			rep++, i++;
-		__print_depot_stack(stack, buf, PAGE_SIZE, 2);
+		stack_depot_snprint(stack, buf, PAGE_SIZE, 2);
 		drm_printf(p, "Wakeref x%lu taken at:\n%s", rep, buf);
 	}
 
--- a/include/linux/stackdepot.h~lib-stackdepot-add-helper-to-print-stack-entries-into-buffer
+++ a/include/linux/stackdepot.h
@@ -25,6 +25,9 @@ depot_stack_handle_t stack_depot_save(un
 unsigned int stack_depot_fetch(depot_stack_handle_t handle,
 			       unsigned long **entries);
 
+int stack_depot_snprint(depot_stack_handle_t handle, char *buf, size_t size,
+		       int spaces);
+
 void stack_depot_print(depot_stack_handle_t stack);
 
 #ifdef CONFIG_STACKDEPOT
--- a/lib/stackdepot.c~lib-stackdepot-add-helper-to-print-stack-entries-into-buffer
+++ a/lib/stackdepot.c
@@ -214,6 +214,31 @@ static inline struct stack_record *find_
 }
 
 /**
+ * stack_depot_snprint - print stack entries from a depot into a buffer
+ *
+ * @handle:	Stack depot handle which was returned from
+ *		stack_depot_save().
+ * @buf:	Pointer to the print buffer
+ *
+ * @size:	Size of the print buffer
+ *
+ * @spaces:	Number of leading spaces to print
+ *
+ * Return:	Number of bytes printed.
+ */
+int stack_depot_snprint(depot_stack_handle_t handle, char *buf, size_t size,
+		       int spaces)
+{
+	unsigned long *entries;
+	unsigned int nr_entries;
+
+	nr_entries = stack_depot_fetch(handle, &entries);
+	return nr_entries ? stack_trace_snprint(buf, size, entries, nr_entries,
+						spaces) : 0;
+}
+EXPORT_SYMBOL_GPL(stack_depot_snprint);
+
+/**
  * stack_depot_print - print stack entries from a depot
  *
  * @stack:		Stack depot handle which was returned from
--- a/mm/page_owner.c~lib-stackdepot-add-helper-to-print-stack-entries-into-buffer
+++ a/mm/page_owner.c
@@ -329,8 +329,6 @@ print_page_owner(char __user *buf, size_
 		depot_stack_handle_t handle)
 {
 	int ret, pageblock_mt, page_mt;
-	unsigned long *entries;
-	unsigned int nr_entries;
 	char *kbuf;
 
 	count = min_t(size_t, count, PAGE_SIZE);
@@ -361,8 +359,7 @@ print_page_owner(char __user *buf, size_
 	if (ret >= count)
 		goto err;
 
-	nr_entries = stack_depot_fetch(handle, &entries);
-	ret += stack_trace_snprint(kbuf + ret, count - ret, entries, nr_entries, 0);
+	ret += stack_depot_snprint(handle, kbuf + ret, count - ret, 0);
 	if (ret >= count)
 		goto err;
 
_


^ permalink raw reply	[flat|nested] 98+ messages in thread

* [patch 36/87] include/linux/string_helpers.h: add linux/string.h for strlen()
  2021-11-09  2:30 incoming Andrew Morton
                   ` (34 preceding siblings ...)
  2021-11-09  2:33 ` [patch 35/87] lib, stackdepot: add helper to print stack entries into buffer Andrew Morton
@ 2021-11-09  2:33 ` Andrew Morton
  2021-11-09  2:33 ` [patch 37/87] lib: uninline simple_strntoull() as well Andrew Morton
                   ` (50 subsequent siblings)
  86 siblings, 0 replies; 98+ messages in thread
From: Andrew Morton @ 2021-11-09  2:33 UTC (permalink / raw)
  To: akpm, linux-mm, lucas.demarchi, mm-commits, torvalds

From: Lucas De Marchi <lucas.demarchi@intel.com>
Subject: include/linux/string_helpers.h: add linux/string.h for strlen()

linux/string_helpers.h uses strlen(), so include the corresponding header.
Otherwise we get a compilation error if it's not also included by whoever
included the helper.

Link: https://lkml.kernel.org/r/20211005212634.3223113-1-lucas.demarchi@intel.com
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 include/linux/string_helpers.h |    1 +
 1 file changed, 1 insertion(+)

--- a/include/linux/string_helpers.h~lib-string_helpers-add-linux-stringh-for-strlen
+++ a/include/linux/string_helpers.h
@@ -4,6 +4,7 @@
 
 #include <linux/bits.h>
 #include <linux/ctype.h>
+#include <linux/string.h>
 #include <linux/types.h>
 
 struct file;
_


^ permalink raw reply	[flat|nested] 98+ messages in thread

* [patch 37/87] lib: uninline simple_strntoull() as well
  2021-11-09  2:30 incoming Andrew Morton
                   ` (35 preceding siblings ...)
  2021-11-09  2:33 ` [patch 36/87] include/linux/string_helpers.h: add linux/string.h for strlen() Andrew Morton
@ 2021-11-09  2:33 ` Andrew Morton
  2021-11-09  2:33 ` [patch 38/87] mm/scatterlist: replace the !preemptible warning in sg_miter_stop() Andrew Morton
                   ` (49 subsequent siblings)
  86 siblings, 0 replies; 98+ messages in thread
From: Andrew Morton @ 2021-11-09  2:33 UTC (permalink / raw)
  To: adobriyan, akpm, linux-mm, mm-commits, rf, torvalds

From: Alexey Dobriyan <adobriyan@gmail.com>
Subject: lib: uninline simple_strntoull() as well

Codegen became bloated again after the introduction of simple_strntoull().
Marking it noinline gives:

	add/remove: 0/0 grow/shrink: 0/4 up/down: 0/-224 (-224)
	Function                                     old     new   delta
	simple_strtoul                                 5       2      -3
	simple_strtol                                 23      20      -3
	simple_strtoull                              119      15    -104
	simple_strtoll                               155      41    -114

Link: https://lkml.kernel.org/r/YVmlB9yY4lvbNKYt@localhost.localdomain
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Cc: Richard Fitzgerald <rf@opensource.cirrus.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 lib/vsprintf.c |    3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

--- a/lib/vsprintf.c~lib-uninline-simple_strntoull-as-well
+++ a/lib/vsprintf.c
@@ -53,8 +53,7 @@
 #include <linux/string_helpers.h>
 #include "kstrtox.h"
 
-static unsigned long long simple_strntoull(const char *startp, size_t max_chars,
-					   char **endp, unsigned int base)
+static noinline unsigned long long simple_strntoull(const char *startp, size_t max_chars, char **endp, unsigned int base)
 {
 	const char *cp;
 	unsigned long long result = 0ULL;
_


^ permalink raw reply	[flat|nested] 98+ messages in thread

* [patch 38/87] mm/scatterlist: replace the !preemptible warning in sg_miter_stop()
  2021-11-09  2:30 incoming Andrew Morton
                   ` (36 preceding siblings ...)
  2021-11-09  2:33 ` [patch 37/87] lib: uninline simple_strntoull() as well Andrew Morton
@ 2021-11-09  2:33 ` Andrew Morton
  2021-11-09  2:33 ` [patch 39/87] const_structs.checkpatch: add a few sound ops structs Andrew Morton
                   ` (48 subsequent siblings)
  86 siblings, 0 replies; 98+ messages in thread
From: Andrew Morton @ 2021-11-09  2:33 UTC (permalink / raw)
  To: akpm, bigeasy, linux-mm, mm-commits, tglx, torvalds

From: Thomas Gleixner <tglx@linutronix.de>
Subject: mm/scatterlist: replace the !preemptible warning in sg_miter_stop()

sg_miter_stop() checks for disabled preemption before unmapping a page via
kunmap_atomic().  The kernel doc mentions under context that preemption
must be disabled if SG_MITER_ATOMIC is set.

There is no active requirement for the caller to have preemption disabled
before invoking sg_miter_stop().  The sg_miter_*() implementation itself
has no such requirement.

In fact, preemption is disabled by kmap_atomic() as part of
sg_miter_next() and remains disabled as long as there is an active
SG_MITER_ATOMIC mapping.  This is a consequence of kmap_atomic() and not a
requirement for sg_miter_*() itself.

The user chooses SG_MITER_ATOMIC either because the API is used in a
context where blocking is not possible, or because blocking is possible
but the user prefers a lighter-weight mapping which is not available on
all CPUs and so may need less overhead to set up, at the price that
preemption will be disabled.

The kmap_atomic() implementation on PREEMPT_RT does not disable
preemption.  It simply disables CPU migration to ensure that the task
remains on the same CPU while the caller remains preemptible.  This in
turn triggers the warning in sg_miter_stop() because preemption is
allowed.

Both the PREEMPT_RT and !PREEMPT_RT implementations of kmap_atomic()
disable pagefaults as a requirement.  It is sufficient to check for this
instead of disabled preemption.

Check for disabled pagefault handler in the SG_MITER_ATOMIC case.  Remove
the "preemption disabled" part from the kernel doc as the sg_miter_*()
implementation does not care.

[bigeasy@linutronix.de: commit description]
Link: https://lkml.kernel.org/r/20211015211409.cqopacv3pxdwn2ty@linutronix.de
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 lib/scatterlist.c |   11 ++++-------
 1 file changed, 4 insertions(+), 7 deletions(-)

--- a/lib/scatterlist.c~mm-scatterlist-replace-the-preemptible-warning-in-sg_miter_stop
+++ a/lib/scatterlist.c
@@ -828,8 +828,7 @@ static bool sg_miter_get_next_page(struc
  *   stops @miter.
  *
  * Context:
- *   Don't care if @miter is stopped, or not proceeded yet.
- *   Otherwise, preemption disabled if the SG_MITER_ATOMIC is set.
+ *   Don't care.
  *
  * Returns:
  *   true if @miter contains the valid mapping.  false if end of sg
@@ -865,8 +864,7 @@ EXPORT_SYMBOL(sg_miter_skip);
  *   @miter->addr and @miter->length point to the current mapping.
  *
  * Context:
- *   Preemption disabled if SG_MITER_ATOMIC.  Preemption must stay disabled
- *   till @miter is stopped.  May sleep if !SG_MITER_ATOMIC.
+ *   May sleep if !SG_MITER_ATOMIC.
  *
  * Returns:
  *   true if @miter contains the next mapping.  false if end of sg
@@ -906,8 +904,7 @@ EXPORT_SYMBOL(sg_miter_next);
  *   need to be released during iteration.
  *
  * Context:
- *   Preemption disabled if the SG_MITER_ATOMIC is set.  Don't care
- *   otherwise.
+ *   Don't care otherwise.
  */
 void sg_miter_stop(struct sg_mapping_iter *miter)
 {
@@ -922,7 +919,7 @@ void sg_miter_stop(struct sg_mapping_ite
 			flush_dcache_page(miter->page);
 
 		if (miter->__flags & SG_MITER_ATOMIC) {
-			WARN_ON_ONCE(preemptible());
+			WARN_ON_ONCE(!pagefault_disabled());
 			kunmap_atomic(miter->addr);
 		} else
 			kunmap(miter->page);
_


^ permalink raw reply	[flat|nested] 98+ messages in thread

* [patch 39/87] const_structs.checkpatch: add a few sound ops structs
  2021-11-09  2:30 incoming Andrew Morton
                   ` (37 preceding siblings ...)
  2021-11-09  2:33 ` [patch 38/87] mm/scatterlist: replace the !preemptible warning in sg_miter_stop() Andrew Morton
@ 2021-11-09  2:33 ` Andrew Morton
  2021-11-09  2:33 ` [patch 40/87] checkpatch: improve EXPORT_SYMBOL test for EXPORT_SYMBOL_NS uses Andrew Morton
                   ` (47 subsequent siblings)
  86 siblings, 0 replies; 98+ messages in thread
From: Andrew Morton @ 2021-11-09  2:33 UTC (permalink / raw)
  To: akpm, broonie, joe, lgirdwood, linux-mm, mm-commits, perex,
	rikard.falkeborn, tiwai, torvalds

From: Rikard Falkeborn <rikard.falkeborn@gmail.com>
Subject: const_structs.checkpatch: add a few sound ops structs

Add a couple of commonly used (>50 instances) sound ops structs that are
typically const.

Link: https://lkml.kernel.org/r/20210922211042.38017-1-rikard.falkeborn@gmail.com
Signed-off-by: Rikard Falkeborn <rikard.falkeborn@gmail.com>
Cc: Joe Perches <joe@perches.com>
Cc: Liam Girdwood <lgirdwood@gmail.com>
Cc: Mark Brown <broonie@kernel.org>
Cc: Jaroslav Kysela <perex@perex.cz>
Cc: Takashi Iwai <tiwai@suse.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 scripts/const_structs.checkpatch |    4 ++++
 1 file changed, 4 insertions(+)

--- a/scripts/const_structs.checkpatch~const_structscheckpatch-add-a-few-sound-ops-structs
+++ a/scripts/const_structs.checkpatch
@@ -54,7 +54,11 @@ sd_desc
 seq_operations
 sirfsoc_padmux
 snd_ac97_build_ops
+snd_pcm_ops
+snd_rawmidi_ops
 snd_soc_component_driver
+snd_soc_dai_ops
+snd_soc_ops
 soc_pcmcia_socket_ops
 stacktrace_ops
 sysfs_ops
_


^ permalink raw reply	[flat|nested] 98+ messages in thread

* [patch 40/87] checkpatch: improve EXPORT_SYMBOL test for EXPORT_SYMBOL_NS uses
  2021-11-09  2:30 incoming Andrew Morton
                   ` (38 preceding siblings ...)
  2021-11-09  2:33 ` [patch 39/87] const_structs.checkpatch: add a few sound ops structs Andrew Morton
@ 2021-11-09  2:33 ` Andrew Morton
  2021-11-09  2:33 ` [patch 41/87] checkpatch: get default codespell dictionary path from package location Andrew Morton
                   ` (46 subsequent siblings)
  86 siblings, 0 replies; 98+ messages in thread
From: Andrew Morton @ 2021-11-09  2:33 UTC (permalink / raw)
  To: akpm, arequipeno, dwaipayanray1, joe, linux-mm, lukas.bulwahn,
	mm-commits, torvalds

From: Joe Perches <joe@perches.com>
Subject: checkpatch: improve EXPORT_SYMBOL test for EXPORT_SYMBOL_NS uses

The EXPORT_SYMBOL test expects a single argument but definitions of
EXPORT_SYMBOL_NS have multiple arguments.

Update the test to extract only the first argument from any
EXPORT_SYMBOL related definition.
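
As a rough sketch of what "extract only the first argument" means, the
same step can be written in C.  The function name first_ident() is
hypothetical, and it accepts only a plain C identifier, which is an
assumption; checkpatch's $Ident pattern may accept more.  Unlike the Perl
substitution, which leaves the string unchanged when nothing matches, this
sketch returns 0 and an empty string.

```c
#include <ctype.h>
#include <stddef.h>

/*
 * Copy the first identifier found in args into out, skipping leading
 * whitespace and dropping everything after the identifier.
 * Returns the identifier length, or 0 if args does not start with one.
 */
size_t first_ident(const char *args, char *out, size_t outsz)
{
	size_t n = 0;

	if (outsz == 0)
		return 0;

	while (isspace((unsigned char)*args))
		args++;

	/* An identifier must start with a letter or an underscore. */
	if (!(isalpha((unsigned char)*args) || *args == '_')) {
		out[0] = '\0';
		return 0;
	}

	while ((isalnum((unsigned char)*args) || *args == '_') &&
	       n + 1 < outsz)
		out[n++] = *args++;

	out[n] = '\0';
	return n;
}
```

For an argument list such as "my_func, MY_NAMESPACE" this yields
"my_func", which is the name the later comparison against the function
definition needs.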

Link: https://lkml.kernel.org/r/9e8f251b42e405f460f26a23ba9b33ef45a94adc.camel@perches.com
Signed-off-by: Joe Perches <joe@perches.com>
Reported-by: Ian Pilcher <arequipeno@gmail.com>
Cc: Dwaipayan Ray <dwaipayanray1@gmail.com>
Cc: Lukas Bulwahn <lukas.bulwahn@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 scripts/checkpatch.pl |    1 +
 1 file changed, 1 insertion(+)

--- a/scripts/checkpatch.pl~checkpatch-improve-export_symbol-test-for-export_symbol_ns-uses
+++ a/scripts/checkpatch.pl
@@ -4449,6 +4449,7 @@ sub process {
 			#   XXX(foo);
 			#   EXPORT_SYMBOL(something_foo);
 			my $name = $1;
+			$name =~ s/^\s*($Ident).*/$1/;
 			if ($stat =~ /^(?:.\s*}\s*\n)?.([A-Z_]+)\s*\(\s*($Ident)/ &&
 			    $name =~ /^${Ident}_$2/) {
 #print "FOO C name<$name>\n";
_


^ permalink raw reply	[flat|nested] 98+ messages in thread

* [patch 41/87] checkpatch: get default codespell dictionary path from package location
  2021-11-09  2:30 incoming Andrew Morton
                   ` (39 preceding siblings ...)
  2021-11-09  2:33 ` [patch 40/87] checkpatch: improve EXPORT_SYMBOL test for EXPORT_SYMBOL_NS uses Andrew Morton
@ 2021-11-09  2:33 ` Andrew Morton
  2021-11-09  2:33 ` [patch 42/87] binfmt_elf: reintroduce using MAP_FIXED_NOREPLACE Andrew Morton
                   ` (45 subsequent siblings)
  86 siblings, 0 replies; 98+ messages in thread
From: Andrew Morton @ 2021-11-09  2:33 UTC (permalink / raw)
  To: akpm, joe, linux-mm, mm-commits, peter.ujfalusi, torvalds

From: Peter Ujfalusi <peter.ujfalusi@linux.intel.com>
Subject: checkpatch: get default codespell dictionary path from package location

The standard location of dictionary.txt is under codespell's package, on
my machine atm (codespell 2.1, Artix Linux):
/usr/lib/python3.9/site-packages/codespell_lib/data/dictionary.txt

Since we enable codespell by default for SOF, I constantly get:
No codespell typos will be found - \
file '/usr/share/codespell/dictionary.txt': No such file or directory

The patch proposes to try to fix up the path following the recommendation
found here: https://github.com/codespell-project/codespell/issues/1540

Link: https://lkml.kernel.org/r/29e25d1364c8ad7f7657cc0660f60c568074d438.camel@perches.com
Signed-off-by: Peter Ujfalusi <peter.ujfalusi@linux.intel.com>
Acked-by: Joe Perches <joe@perches.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 scripts/checkpatch.pl |   32 ++++++++++++++++++++++++++++----
 1 file changed, 28 insertions(+), 4 deletions(-)

--- a/scripts/checkpatch.pl~checkpatch-get-default-codespell-dictionary-path-from-package-location
+++ a/scripts/checkpatch.pl
@@ -63,6 +63,7 @@ my $min_conf_desc_length = 4;
 my $spelling_file = "$D/spelling.txt";
 my $codespell = 0;
 my $codespellfile = "/usr/share/codespell/dictionary.txt";
+my $user_codespellfile = "";
 my $conststructsfile = "$D/const_structs.checkpatch";
 my $docsfile = "$D/../Documentation/dev-tools/checkpatch.rst";
 my $typedefsfile;
@@ -130,7 +131,7 @@ Options:
   --ignore-perl-version      override checking of perl version.  expect
                              runtime errors.
   --codespell                Use the codespell dictionary for spelling/typos
-                             (default:/usr/share/codespell/dictionary.txt)
+                             (default:$codespellfile)
   --codespellfile            Use this codespell dictionary
   --typedefsfile             Read additional types from this file
   --color[=WHEN]             Use colors 'always', 'never', or only when output
@@ -317,7 +318,7 @@ GetOptions(
 	'debug=s'	=> \%debug,
 	'test-only=s'	=> \$tst_only,
 	'codespell!'	=> \$codespell,
-	'codespellfile=s'	=> \$codespellfile,
+	'codespellfile=s'	=> \$user_codespellfile,
 	'typedefsfile=s'	=> \$typedefsfile,
 	'color=s'	=> \$color,
 	'no-color'	=> \$color,	#keep old behaviors of -nocolor
@@ -325,9 +326,32 @@ GetOptions(
 	'kconfig-prefix=s'	=> \${CONFIG_},
 	'h|help'	=> \$help,
 	'version'	=> \$help
-) or help(1);
+) or $help = 2;
 
-help(0) if ($help);
+if ($user_codespellfile) {
+	# Use the user provided codespell file unconditionally
+	$codespellfile = $user_codespellfile;
+} elsif (!(-f $codespellfile)) {
+	# If /usr/share/codespell/dictionary.txt is not present, try to find it
+	# under codespell's install directory: <codespell_root>/data/dictionary.txt
+	if (($codespell || $help) && which("codespell") ne "" && which("python") ne "") {
+		my $python_codespell_dict = << "EOF";
+
+import os.path as op
+import codespell_lib
+codespell_dir = op.dirname(codespell_lib.__file__)
+codespell_file = op.join(codespell_dir, 'data', 'dictionary.txt')
+print(codespell_file, end='')
+EOF
+
+		my $codespell_dict = `python -c "$python_codespell_dict" 2> /dev/null`;
+		$codespellfile = $codespell_dict if (-f $codespell_dict);
+	}
+}
+
+# $help is 1 if either -h, --help or --version is passed as option - exitcode: 0
+# $help is 2 if invalid option is passed - exitcode: 1
+help($help - 1) if ($help);
 
 die "$P: --git cannot be used with --file or --fix\n" if ($git && ($file || $fix));
 die "$P: --verbose cannot be used with --terse\n" if ($verbose && $terse);
_


^ permalink raw reply	[flat|nested] 98+ messages in thread

* [patch 42/87] binfmt_elf: reintroduce using MAP_FIXED_NOREPLACE
  2021-11-09  2:30 incoming Andrew Morton
                   ` (40 preceding siblings ...)
  2021-11-09  2:33 ` [patch 41/87] checkpatch: get default codespell dictionary path from package location Andrew Morton
@ 2021-11-09  2:33 ` Andrew Morton
  2021-11-09  2:33 ` [patch 43/87] ELF: simplify STACK_ALLOC macro Andrew Morton
                   ` (44 subsequent siblings)
  86 siblings, 0 replies; 98+ messages in thread
From: Andrew Morton @ 2021-11-09  2:33 UTC (permalink / raw)
  To: akpm, anthony.yznaga, avagin, chenjingwen6, ebiederm, keescook,
	khalid.aziz, linux-mm, linux, mhocko, mm-commits, mpe, torvalds,
	viro

From: Kees Cook <keescook@chromium.org>
Subject: binfmt_elf: reintroduce using MAP_FIXED_NOREPLACE

Commit b212921b13bd ("elf: don't use MAP_FIXED_NOREPLACE for elf
executable mappings") reverted back to using MAP_FIXED to map ELF LOAD
segments because it was found that the segments in some binaries overlap
and can cause MAP_FIXED_NOREPLACE to fail.

The original intent of MAP_FIXED_NOREPLACE in the ELF loader was to
prevent the silent clobbering of an existing mapping (e.g.  stack) by the
ELF image, which could lead to exploitable conditions.  Quoting commit
4ed28639519c ("fs, elf: drop MAP_FIXED usage from elf_map"), which
originally introduced the use of MAP_FIXED_NOREPLACE in the loader:

    Both load_elf_interp and load_elf_binary rely on elf_map to map
    segments [to a specific] address and they use MAP_FIXED to enforce
    that. This is however [a] dangerous thing prone to silent data
    corruption which can be even exploitable.
    ...
    Let's take CVE-2017-1000253 as an example ... we could end up mapping
    [the executable] over the existing stack ... The [stack layout] issue
    has been fixed since then ... So we should be safe and any [similar]
    attack should be impractical. On the other hand this is just too
    subtle [an] assumption ... it can break quite easily and [be] hard to
    spot.
    ...
    Address this [weakness] by changing MAP_FIXED to the newly added
    MAP_FIXED_NOREPLACE. This will mean that mmap will fail if there is
    an existing mapping clashing with the requested one [instead of
    silently] clobbering it.

When processing ET_DYN binaries the loader already calculates a total size
for the image when the first segment is mapped, maps the entire image, and
then unmaps the remainder before the remaining segments are then
individually mapped.  To avoid the earlier problems (legitimate
overlapping LOAD segments specified in the ELF), apply the same logic to
ET_EXEC binaries as well.  For both ET_EXEC and ET_DYN+INTERP use
MAP_FIXED_NOREPLACE for the initial total size mapping and then use
MAP_FIXED to build the final (possibly legitimately overlapping) mappings.
For ET_DYN w/out INTERP, continue to map at a system-selected address in
the mmap region.
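
The difference between the two flags can be seen from userspace with a
small sketch.  This is not the loader code: demo_noreplace() is a
hypothetical name, and the fallback value for MAP_FIXED_NOREPLACE is the
generic/x86 one, which is an assumption for other architectures.  It first
reserves a region, shows that MAP_FIXED_NOREPLACE refuses to map over it,
and then shows that MAP_FIXED replaces it.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <errno.h>
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

#ifndef MAP_FIXED_NOREPLACE
#define MAP_FIXED_NOREPLACE 0x100000	/* assumed generic/x86 value */
#endif

/*
 * Returns  1 if the overlapping MAP_FIXED_NOREPLACE request was refused
 *            with EEXIST and MAP_FIXED then replaced the mapping,
 *          0 if the kernel did not behave that way (for example a kernel
 *            that predates the flag and treats the address as a hint),
 *         -1 if the initial mapping could not be created.
 */
int demo_noreplace(void)
{
	size_t len = (size_t)sysconf(_SC_PAGESIZE);
	int refused, replaced;
	void *base, *p;

	/* Stand-in for an existing mapping such as the stack. */
	base = mmap(NULL, len, PROT_READ | PROT_WRITE,
		    MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (base == MAP_FAILED)
		return -1;

	/* Asking for the same range without replacing must fail. */
	p = mmap(base, len, PROT_READ,
		 MAP_PRIVATE | MAP_ANONYMOUS | MAP_FIXED_NOREPLACE, -1, 0);
	refused = (p == MAP_FAILED && errno == EEXIST);
	if (p != MAP_FAILED && p != base)
		munmap(p, len);	/* flag was ignored, mapping landed elsewhere */

	/* MAP_FIXED silently replaces whatever is at the address. */
	p = mmap(base, len, PROT_READ,
		 MAP_PRIVATE | MAP_ANONYMOUS | MAP_FIXED, -1, 0);
	replaced = (p == base);

	munmap(base, len);
	return refused && replaced;
}
```

This mirrors the split described above: the no-replace variant is used
once to claim the whole range safely, and plain MAP_FIXED is used
afterwards inside a range already known to belong to the image.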

Link: https://lkml.kernel.org/r/20210916215947.3993776-1-keescook@chromium.org
Link: https://lore.kernel.org/lkml/1595869887-23307-2-git-send-email-anthony.yznaga@oracle.com
Co-developed-by: Anthony Yznaga <anthony.yznaga@oracle.com>
Signed-off-by: Anthony Yznaga <anthony.yznaga@oracle.com>
Signed-off-by: Kees Cook <keescook@chromium.org>
Cc: Russell King <linux@armlinux.org.uk>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Eric Biederman <ebiederm@xmission.com>
Cc: Chen Jingwen <chenjingwen6@huawei.com>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: Andrei Vagin <avagin@openvz.org>
Cc: Khalid Aziz <khalid.aziz@oracle.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 fs/binfmt_elf.c |   31 ++++++++++++++++++++++---------
 1 file changed, 22 insertions(+), 9 deletions(-)

--- a/fs/binfmt_elf.c~binfmt_elf-reintroduce-using-map_fixed_noreplace
+++ a/fs/binfmt_elf.c
@@ -1074,20 +1074,26 @@ out_free_interp:
 
 		vaddr = elf_ppnt->p_vaddr;
 		/*
-		 * If we are loading ET_EXEC or we have already performed
-		 * the ET_DYN load_addr calculations, proceed normally.
+		 * The first time through the loop, load_addr_set is false:
+		 * layout will be calculated. Once set, use MAP_FIXED since
+		 * we know we've already safely mapped the entire region with
+		 * MAP_FIXED_NOREPLACE in the once-per-binary logic following.
 		 */
-		if (elf_ex->e_type == ET_EXEC || load_addr_set) {
+		if (load_addr_set) {
 			elf_flags |= MAP_FIXED;
+		} else if (elf_ex->e_type == ET_EXEC) {
+			/*
+			 * This logic is run once for the first LOAD Program
+			 * Header for ET_EXEC binaries. No special handling
+			 * is needed.
+			 */
+			elf_flags |= MAP_FIXED_NOREPLACE;
 		} else if (elf_ex->e_type == ET_DYN) {
 			/*
 			 * This logic is run once for the first LOAD Program
 			 * Header for ET_DYN binaries to calculate the
 			 * randomization (load_bias) for all the LOAD
-			 * Program Headers, and to calculate the entire
-			 * size of the ELF mapping (total_size). (Note that
-			 * load_addr_set is set to true later once the
-			 * initial mapping is performed.)
+			 * Program Headers.
 			 *
 			 * There are effectively two types of ET_DYN
 			 * binaries: programs (i.e. PIE: ET_DYN with INTERP)
@@ -1108,7 +1114,7 @@ out_free_interp:
 			 * Therefore, programs are loaded offset from
 			 * ELF_ET_DYN_BASE and loaders are loaded into the
 			 * independently randomized mmap region (0 load_bias
-			 * without MAP_FIXED).
+			 * without MAP_FIXED nor MAP_FIXED_NOREPLACE).
 			 */
 			if (interpreter) {
 				load_bias = ELF_ET_DYN_BASE;
@@ -1117,7 +1123,7 @@ out_free_interp:
 				alignment = maximum_alignment(elf_phdata, elf_ex->e_phnum);
 				if (alignment)
 					load_bias &= ~(alignment - 1);
-				elf_flags |= MAP_FIXED;
+				elf_flags |= MAP_FIXED_NOREPLACE;
 			} else
 				load_bias = 0;
 
@@ -1129,7 +1135,14 @@ out_free_interp:
 			 * is then page aligned.
 			 */
 			load_bias = ELF_PAGESTART(load_bias - vaddr);
+		}
 
+		/*
+		 * Calculate the entire size of the ELF mapping (total_size).
+		 * (Note that load_addr_set is set to true later once the
+		 * initial mapping is performed.)
+		 */
+		if (!load_addr_set) {
 			total_size = total_mapping_size(elf_phdata,
 							elf_ex->e_phnum);
 			if (!total_size) {
_


^ permalink raw reply	[flat|nested] 98+ messages in thread

* [patch 43/87] ELF: simplify STACK_ALLOC macro
  2021-11-09  2:30 incoming Andrew Morton
                   ` (41 preceding siblings ...)
  2021-11-09  2:33 ` [patch 42/87] binfmt_elf: reintroduce using MAP_FIXED_NOREPLACE Andrew Morton
@ 2021-11-09  2:33 ` Andrew Morton
  2021-11-09  2:33 ` [patch 44/87] kallsyms: remove arch specific text and data check Andrew Morton
                   ` (43 subsequent siblings)
  86 siblings, 0 replies; 98+ messages in thread
From: Andrew Morton @ 2021-11-09  2:33 UTC (permalink / raw)
  To: adobriyan, akpm, linux-mm, mm-commits, torvalds

From: Alexey Dobriyan <adobriyan@gmail.com>
Subject: ELF: simplify STACK_ALLOC macro

"A -= B; A" is equivalent to "A -= B".

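The equivalence can be checked outside the kernel.  This is a small sketch,
not part of the patch: the two macro bodies are copied from the diff under
the illustrative names STACK_ALLOC_OLD and STACK_ALLOC_NEW, the helper
stack_alloc_forms_agree() is hypothetical, and the old form needs a
compiler that supports GNU statement expressions (gcc or clang).

```c
/* Old and new macro bodies from the diff, renamed for comparison. */
#define STACK_ALLOC_OLD(sp, len) ({ sp -= len ; sp; })
#define STACK_ALLOC_NEW(sp, len) (sp -= len)

/*
 * Hypothetical helper: returns 1 if both forms evaluate to the same
 * value and leave the stack pointer variable in the same state.
 */
static int stack_alloc_forms_agree(unsigned long start, unsigned long len)
{
	unsigned long sp_old = start, sp_new = start;
	unsigned long ret_old = STACK_ALLOC_OLD(sp_old, len);
	unsigned long ret_new = STACK_ALLOC_NEW(sp_new, len);

	return ret_old == ret_new && sp_old == sp_new &&
	       sp_new == start - len;
}
```

The value of a compound assignment expression in C is the value of the
left operand after the assignment, which is why the trailing "sp;" in the
statement expression adds nothing.
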
Link: https://lkml.kernel.org/r/YVmcP256fRMqCwgK@localhost.localdomain
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 fs/binfmt_elf.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- a/fs/binfmt_elf.c~elf-simplify-stack_alloc-macro
+++ a/fs/binfmt_elf.c
@@ -156,7 +156,7 @@ static int padzero(unsigned long elf_bss
 #define STACK_ADD(sp, items) ((elf_addr_t __user *)(sp) - (items))
 #define STACK_ROUND(sp, items) \
 	(((unsigned long) (sp - items)) &~ 15UL)
-#define STACK_ALLOC(sp, len) ({ sp -= len ; sp; })
+#define STACK_ALLOC(sp, len) (sp -= len)
 #endif
 
 #ifndef ELF_BASE_PLATFORM
_


^ permalink raw reply	[flat|nested] 98+ messages in thread

* [patch 44/87] kallsyms: remove arch specific text and data check
  2021-11-09  2:30 incoming Andrew Morton
                   ` (42 preceding siblings ...)
  2021-11-09  2:33 ` [patch 43/87] ELF: simplify STACK_ALLOC macro Andrew Morton
@ 2021-11-09  2:33 ` Andrew Morton
  2021-11-09  2:33 ` [patch 45/87] kallsyms: fix address-checks for kernel related range Andrew Morton
                   ` (42 subsequent siblings)
  86 siblings, 0 replies; 98+ messages in thread
From: Andrew Morton @ 2021-11-09  2:33 UTC (permalink / raw)
  To: akpm, andreyknvl, arnd, ast, benh, bp, christophe.leroy, davem,
	dvyukov, glider, ink, linux-mm, mattst88, mingo, mm-commits,
	monstr, mpe, paulus, pmladek, rostedt, rth, ryabinin.a.a,
	senozhatsky, tglx, torvalds, wangkefeng.wang

From: Kefeng Wang <wangkefeng.wang@huawei.com>
Subject: kallsyms: remove arch specific text and data check

Patch series "sections: Unify kernel sections range check and use", v4.

There are three header files (kallsyms.h, kernel.h and sections.h) which
include kernel section range checks; let's clean them up and unify
them.

1. clean up the arch specific text/data check and fix the address boundary
   check in kallsyms.h

2. move all the basic/core kernel range check functions into sections.h

3. update all the callers, and use the helpers in sections.h to simplify
   the code

After this series, there are 5 APIs for kernel section range checks in
sections.h:

 * is_kernel_rodata()		--- already in sections.h
 * is_kernel_core_data()	--- come from core_kernel_data() in kernel.h
 * is_kernel_inittext()		--- come from kernel.h and kallsyms.h
 * __is_kernel_text()		--- add new internal helper
 * __is_kernel()		--- add new internal helper

Note: The last two helpers should not be used directly; consider using
      the corresponding functions in kallsyms.h instead.


This patch (of 11):

After commit 4ba66a976072 ("arch: remove blackfin port") there is no
longer any need for an arch-specific text/data check, so remove it.

Link: https://lkml.kernel.org/r/20210930071143.63410-1-wangkefeng.wang@huawei.com
Link: https://lkml.kernel.org/r/20210930071143.63410-2-wangkefeng.wang@huawei.com
Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
Reviewed-by: Sergey Senozhatsky <senozhatsky@chromium.org>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Christophe Leroy <christophe.leroy@csgroup.eu>
Cc: Kefeng Wang <wangkefeng.wang@huawei.com>
Cc: Alexander Potapenko <glider@google.com>
Cc: Andrey Konovalov <andreyknvl@gmail.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
Cc: Matt Turner <mattst88@gmail.com>
Cc: Michal Simek <monstr@monstr.eu>
Cc: Petr Mladek <pmladek@suse.com>
Cc: Richard Henderson <rth@twiddle.net>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 include/asm-generic/sections.h |   16 ----------------
 include/linux/kallsyms.h       |    3 +--
 kernel/locking/lockdep.c       |    3 ---
 3 files changed, 1 insertion(+), 21 deletions(-)

--- a/include/asm-generic/sections.h~kallsyms-remove-arch-specific-text-and-data-check
+++ a/include/asm-generic/sections.h
@@ -64,22 +64,6 @@ extern __visible const void __nosave_beg
 #define dereference_kernel_function_descriptor(p) ((void *)(p))
 #endif
 
-/* random extra sections (if any).  Override
- * in asm/sections.h */
-#ifndef arch_is_kernel_text
-static inline int arch_is_kernel_text(unsigned long addr)
-{
-	return 0;
-}
-#endif
-
-#ifndef arch_is_kernel_data
-static inline int arch_is_kernel_data(unsigned long addr)
-{
-	return 0;
-}
-#endif
-
 /**
  * memory_contains - checks if an object is contained within a memory region
  * @begin: virtual address of the beginning of the memory region
--- a/include/linux/kallsyms.h~kallsyms-remove-arch-specific-text-and-data-check
+++ a/include/linux/kallsyms.h
@@ -34,8 +34,7 @@ static inline int is_kernel_inittext(uns
 
 static inline int is_kernel_text(unsigned long addr)
 {
-	if ((addr >= (unsigned long)_stext && addr <= (unsigned long)_etext) ||
-	    arch_is_kernel_text(addr))
+	if ((addr >= (unsigned long)_stext && addr <= (unsigned long)_etext))
 		return 1;
 	return in_gate_area_no_mm(addr);
 }
--- a/kernel/locking/lockdep.c~kallsyms-remove-arch-specific-text-and-data-check
+++ a/kernel/locking/lockdep.c
@@ -818,9 +818,6 @@ static int static_obj(const void *obj)
 	if ((addr >= start) && (addr < end))
 		return 1;
 
-	if (arch_is_kernel_data(addr))
-		return 1;
-
 	/*
 	 * in-kernel percpu var?
 	 */
_


^ permalink raw reply	[flat|nested] 98+ messages in thread

* [patch 45/87] kallsyms: fix address-checks for kernel related range
  2021-11-09  2:30 incoming Andrew Morton
                   ` (43 preceding siblings ...)
  2021-11-09  2:33 ` [patch 44/87] kallsyms: remove arch specific text and data check Andrew Morton
@ 2021-11-09  2:33 ` Andrew Morton
  2021-11-09  2:33 ` [patch 46/87] sections: move and rename core_kernel_data() to is_kernel_core_data() Andrew Morton
                   ` (41 subsequent siblings)
  86 siblings, 0 replies; 98+ messages in thread
From: Andrew Morton @ 2021-11-09  2:33 UTC (permalink / raw)
  To: akpm, andreyknvl, arnd, ast, benh, bp, christophe.leroy, davem,
	dvyukov, glider, ink, linux-mm, mattst88, mingo, mm-commits,
	monstr, mpe, paulus, pmladek, rostedt, rth, ryabinin.a.a,
	senozhatsky, tglx, torvalds, wangkefeng.wang

From: Kefeng Wang <wangkefeng.wang@huawei.com>
Subject: kallsyms: fix address-checks for kernel related range

The is_kernel_inittext(), is_kernel_text() and is_kernel() functions
should not include the end address (the labels _einittext, _etext and
_end) when checking the address range.  The issue has existed since Linux
v2.6.12.
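
The off-by-one can be illustrated with plain integers.  This is a sketch
with hypothetical helper names, where 'start' plays the role of a label
such as _stext and 'end' the role of _etext, assuming (as the fix does)
that the end label is the first address past the section.

```c
#include <stdbool.h>

/* Old behaviour: the end label itself is wrongly treated as inside. */
static bool in_range_inclusive(unsigned long addr, unsigned long start,
			       unsigned long end)
{
	return addr >= start && addr <= end;
}

/* Fixed behaviour: half-open interval [start, end). */
static bool in_range_half_open(unsigned long addr, unsigned long start,
			       unsigned long end)
{
	return addr >= start && addr < end;
}
```

For a section spanning 0x1000 to 0x2000, both helpers accept 0x1000 and
0x1fff, but only the inclusive form accepts 0x2000, which already belongs
to whatever follows the section.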

Link: https://lkml.kernel.org/r/20210930071143.63410-3-wangkefeng.wang@huawei.com
Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
Reviewed-by: Petr Mladek <pmladek@suse.com>
Reviewed-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Acked-by: Sergey Senozhatsky <senozhatsky@chromium.org>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Alexander Potapenko <glider@google.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Andrey Konovalov <andreyknvl@gmail.com>
Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Christophe Leroy <christophe.leroy@csgroup.eu>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
Cc: Matt Turner <mattst88@gmail.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Michal Simek <monstr@monstr.eu>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Richard Henderson <rth@twiddle.net>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 include/linux/kallsyms.h |    6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

--- a/include/linux/kallsyms.h~kallsyms-fix-address-checks-for-kernel-related-range
+++ a/include/linux/kallsyms.h
@@ -27,21 +27,21 @@ struct module;
 static inline int is_kernel_inittext(unsigned long addr)
 {
 	if (addr >= (unsigned long)_sinittext
-	    && addr <= (unsigned long)_einittext)
+	    && addr < (unsigned long)_einittext)
 		return 1;
 	return 0;
 }
 
 static inline int is_kernel_text(unsigned long addr)
 {
-	if ((addr >= (unsigned long)_stext && addr <= (unsigned long)_etext))
+	if ((addr >= (unsigned long)_stext && addr < (unsigned long)_etext))
 		return 1;
 	return in_gate_area_no_mm(addr);
 }
 
 static inline int is_kernel(unsigned long addr)
 {
-	if (addr >= (unsigned long)_stext && addr <= (unsigned long)_end)
+	if (addr >= (unsigned long)_stext && addr < (unsigned long)_end)
 		return 1;
 	return in_gate_area_no_mm(addr);
 }
_


^ permalink raw reply	[flat|nested] 98+ messages in thread

* [patch 46/87] sections: move and rename core_kernel_data() to is_kernel_core_data()
  2021-11-09  2:30 incoming Andrew Morton
                   ` (44 preceding siblings ...)
  2021-11-09  2:33 ` [patch 45/87] kallsyms: fix address-checks for kernel related range Andrew Morton
@ 2021-11-09  2:33 ` Andrew Morton
  2021-11-09  2:33 ` [patch 47/87] sections: move is_kernel_inittext() into sections.h Andrew Morton
                   ` (40 subsequent siblings)
  86 siblings, 0 replies; 98+ messages in thread
From: Andrew Morton @ 2021-11-09  2:33 UTC (permalink / raw)
  To: akpm, andreyknvl, arnd, ast, benh, bp, christophe.leroy, davem,
	dvyukov, glider, ink, linux-mm, mattst88, mingo, mm-commits,
	monstr, mpe, paulus, pmladek, rostedt, rth, ryabinin.a.a,
	senozhatsky, tglx, torvalds, wangkefeng.wang

From: Kefeng Wang <wangkefeng.wang@huawei.com>
Subject: sections: move and rename core_kernel_data() to is_kernel_core_data()

Move core_kernel_data() into sections.h and rename it to
is_kernel_core_data(); also make it return a bool value, then update all
the callers.

Link: https://lkml.kernel.org/r/20210930071143.63410-4-wangkefeng.wang@huawei.com
Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
Reviewed-by: Sergey Senozhatsky <senozhatsky@chromium.org>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Alexander Potapenko <glider@google.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Andrey Konovalov <andreyknvl@gmail.com>
Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Christophe Leroy <christophe.leroy@csgroup.eu>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
Cc: Matt Turner <mattst88@gmail.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Michal Simek <monstr@monstr.eu>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Petr Mladek <pmladek@suse.com>
Cc: Richard Henderson <rth@twiddle.net>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 include/asm-generic/sections.h |   16 ++++++++++++++++
 include/linux/kernel.h         |    1 -
 kernel/extable.c               |   18 ------------------
 kernel/trace/ftrace.c          |    2 +-
 net/sysctl_net.c               |    2 +-
 5 files changed, 18 insertions(+), 21 deletions(-)

--- a/include/asm-generic/sections.h~sections-move-and-rename-core_kernel_data-to-is_kernel_core_data
+++ a/include/asm-generic/sections.h
@@ -129,6 +129,22 @@ static inline bool init_section_intersec
 }
 
 /**
+ * is_kernel_core_data - checks if the pointer address is located in the
+ *			 .data section
+ *
+ * @addr: address to check
+ *
+ * Returns: true if the address is located in .data, false otherwise.
+ * Note: On some archs it may return true for core RODATA, and false
+ *       for others. But will always be true for core RW data.
+ */
+static inline bool is_kernel_core_data(unsigned long addr)
+{
+	return addr >= (unsigned long)_sdata &&
+	       addr < (unsigned long)_edata;
+}
+
+/**
  * is_kernel_rodata - checks if the pointer address is located in the
  *                    .rodata section
  *
--- a/include/linux/kernel.h~sections-move-and-rename-core_kernel_data-to-is_kernel_core_data
+++ a/include/linux/kernel.h
@@ -227,7 +227,6 @@ extern char *next_arg(char *args, char *
 
 extern int core_kernel_text(unsigned long addr);
 extern int init_kernel_text(unsigned long addr);
-extern int core_kernel_data(unsigned long addr);
 extern int __kernel_text_address(unsigned long addr);
 extern int kernel_text_address(unsigned long addr);
 extern int func_ptr_is_kernel_text(void *ptr);
--- a/kernel/extable.c~sections-move-and-rename-core_kernel_data-to-is_kernel_core_data
+++ a/kernel/extable.c
@@ -82,24 +82,6 @@ int notrace core_kernel_text(unsigned lo
 	return 0;
 }
 
-/**
- * core_kernel_data - tell if addr points to kernel data
- * @addr: address to test
- *
- * Returns true if @addr passed in is from the core kernel data
- * section.
- *
- * Note: On some archs it may return true for core RODATA, and false
- *  for others. But will always be true for core RW data.
- */
-int core_kernel_data(unsigned long addr)
-{
-	if (addr >= (unsigned long)_sdata &&
-	    addr < (unsigned long)_edata)
-		return 1;
-	return 0;
-}
-
 int __kernel_text_address(unsigned long addr)
 {
 	if (kernel_text_address(addr))
--- a/kernel/trace/ftrace.c~sections-move-and-rename-core_kernel_data-to-is_kernel_core_data
+++ a/kernel/trace/ftrace.c
@@ -323,7 +323,7 @@ int __register_ftrace_function(struct ft
 	if (!ftrace_enabled && (ops->flags & FTRACE_OPS_FL_PERMANENT))
 		return -EBUSY;
 
-	if (!core_kernel_data((unsigned long)ops))
+	if (!is_kernel_core_data((unsigned long)ops))
 		ops->flags |= FTRACE_OPS_FL_DYNAMIC;
 
 	add_ftrace_ops(&ftrace_ops_list, ops);
--- a/net/sysctl_net.c~sections-move-and-rename-core_kernel_data-to-is_kernel_core_data
+++ a/net/sysctl_net.c
@@ -144,7 +144,7 @@ static void ensure_safe_net_sysctl(struc
 		addr = (unsigned long)ent->data;
 		if (is_module_address(addr))
 			where = "module";
-		else if (core_kernel_data(addr))
+		else if (is_kernel_core_data(addr))
 			where = "kernel";
 		else
 			continue;
_


^ permalink raw reply	[flat|nested] 98+ messages in thread

* [patch 47/87] sections: move is_kernel_inittext() into sections.h
  2021-11-09  2:30 incoming Andrew Morton
                   ` (45 preceding siblings ...)
  2021-11-09  2:33 ` [patch 46/87] sections: move and rename core_kernel_data() to is_kernel_core_data() Andrew Morton
@ 2021-11-09  2:33 ` Andrew Morton
  2021-11-09  2:33 ` [patch 48/87] x86: mm: rename __is_kernel_text() to is_x86_32_kernel_text() Andrew Morton
                   ` (39 subsequent siblings)
  86 siblings, 0 replies; 98+ messages in thread
From: Andrew Morton @ 2021-11-09  2:33 UTC (permalink / raw)
  To: akpm, andreyknvl, arnd, ast, benh, bp, christophe.leroy, davem,
	dvyukov, glider, ink, linux-mm, mattst88, mingo, mm-commits,
	monstr, mpe, paulus, pmladek, rostedt, rth, ryabinin.a.a,
	senozhatsky, tglx, torvalds, wangkefeng.wang

From: Kefeng Wang <wangkefeng.wang@huawei.com>
Subject: sections: move is_kernel_inittext() into sections.h

is_kernel_inittext() and init_kernel_text() have the same functionality,
so keep only is_kernel_inittext(), move it into sections.h, and update
all the callers.

Link: https://lkml.kernel.org/r/20210930071143.63410-5-wangkefeng.wang@huawei.com
Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
Reviewed-by: Sergey Senozhatsky <senozhatsky@chromium.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Alexander Potapenko <glider@google.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Andrey Konovalov <andreyknvl@gmail.com>
Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Christophe Leroy <christophe.leroy@csgroup.eu>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
Cc: Matt Turner <mattst88@gmail.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Michal Simek <monstr@monstr.eu>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Petr Mladek <pmladek@suse.com>
Cc: Richard Henderson <rth@twiddle.net>
Cc: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 arch/x86/kernel/unwind_orc.c   |    2 +-
 include/asm-generic/sections.h |   14 ++++++++++++++
 include/linux/kallsyms.h       |    8 --------
 include/linux/kernel.h         |    1 -
 kernel/extable.c               |   12 ++----------
 5 files changed, 17 insertions(+), 20 deletions(-)

--- a/arch/x86/kernel/unwind_orc.c~sections-move-is_kernel_inittext-into-sectionsh
+++ a/arch/x86/kernel/unwind_orc.c
@@ -175,7 +175,7 @@ static struct orc_entry *orc_find(unsign
 	}
 
 	/* vmlinux .init slow lookup: */
-	if (init_kernel_text(ip))
+	if (is_kernel_inittext(ip))
 		return __orc_find(__start_orc_unwind_ip, __start_orc_unwind,
 				  __stop_orc_unwind_ip - __start_orc_unwind_ip, ip);
 
--- a/include/asm-generic/sections.h~sections-move-is_kernel_inittext-into-sectionsh
+++ a/include/asm-generic/sections.h
@@ -158,4 +158,18 @@ static inline bool is_kernel_rodata(unsi
 	       addr < (unsigned long)__end_rodata;
 }
 
+/**
+ * is_kernel_inittext - checks if the pointer address is located in the
+ *                      .init.text section
+ *
+ * @addr: address to check
+ *
+ * Returns: true if the address is located in .init.text, false otherwise.
+ */
+static inline bool is_kernel_inittext(unsigned long addr)
+{
+	return addr >= (unsigned long)_sinittext &&
+	       addr < (unsigned long)_einittext;
+}
+
 #endif /* _ASM_GENERIC_SECTIONS_H_ */
--- a/include/linux/kallsyms.h~sections-move-is_kernel_inittext-into-sectionsh
+++ a/include/linux/kallsyms.h
@@ -24,14 +24,6 @@
 struct cred;
 struct module;
 
-static inline int is_kernel_inittext(unsigned long addr)
-{
-	if (addr >= (unsigned long)_sinittext
-	    && addr < (unsigned long)_einittext)
-		return 1;
-	return 0;
-}
-
 static inline int is_kernel_text(unsigned long addr)
 {
 	if ((addr >= (unsigned long)_stext && addr < (unsigned long)_etext))
--- a/include/linux/kernel.h~sections-move-is_kernel_inittext-into-sectionsh
+++ a/include/linux/kernel.h
@@ -226,7 +226,6 @@ extern bool parse_option_str(const char
 extern char *next_arg(char *args, char **param, char **val);
 
 extern int core_kernel_text(unsigned long addr);
-extern int init_kernel_text(unsigned long addr);
 extern int __kernel_text_address(unsigned long addr);
 extern int kernel_text_address(unsigned long addr);
 extern int func_ptr_is_kernel_text(void *ptr);
--- a/kernel/extable.c~sections-move-is_kernel_inittext-into-sectionsh
+++ a/kernel/extable.c
@@ -62,14 +62,6 @@ const struct exception_table_entry *sear
 	return e;
 }
 
-int init_kernel_text(unsigned long addr)
-{
-	if (addr >= (unsigned long)_sinittext &&
-	    addr < (unsigned long)_einittext)
-		return 1;
-	return 0;
-}
-
 int notrace core_kernel_text(unsigned long addr)
 {
 	if (addr >= (unsigned long)_stext &&
@@ -77,7 +69,7 @@ int notrace core_kernel_text(unsigned lo
 		return 1;
 
 	if (system_state < SYSTEM_FREEING_INITMEM &&
-	    init_kernel_text(addr))
+	    is_kernel_inittext(addr))
 		return 1;
 	return 0;
 }
@@ -94,7 +86,7 @@ int __kernel_text_address(unsigned long
 	 * Since we are after the module-symbols check, there's
 	 * no danger of address overlap:
 	 */
-	if (init_kernel_text(addr))
+	if (is_kernel_inittext(addr))
 		return 1;
 	return 0;
 }
_


^ permalink raw reply	[flat|nested] 98+ messages in thread

* [patch 48/87] x86: mm: rename __is_kernel_text() to is_x86_32_kernel_text()
  2021-11-09  2:30 incoming Andrew Morton
                   ` (46 preceding siblings ...)
  2021-11-09  2:33 ` [patch 47/87] sections: move is_kernel_inittext() into sections.h Andrew Morton
@ 2021-11-09  2:33 ` Andrew Morton
  2021-11-09  2:34 ` [patch 49/87] sections: provide internal __is_kernel() and __is_kernel_text() helper Andrew Morton
                   ` (38 subsequent siblings)
  86 siblings, 0 replies; 98+ messages in thread
From: Andrew Morton @ 2021-11-09  2:33 UTC (permalink / raw)
  To: akpm, andreyknvl, arnd, ast, benh, bp, christophe.leroy, davem,
	dvyukov, glider, ink, linux-mm, mattst88, mingo, mm-commits,
	monstr, mpe, paulus, pmladek, rostedt, rth, ryabinin.a.a,
	senozhatsky, tglx, torvalds, wangkefeng.wang

From: Kefeng Wang <wangkefeng.wang@huawei.com>
Subject: x86: mm: rename __is_kernel_text() to is_x86_32_kernel_text()

Commit b56cd05c55a1 ("x86/mm: Rename is_kernel_text to __is_kernel_text")
added the '__' prefix to avoid a conflict with the existing
is_kernel_text() in <linux/kallsyms.h>.

The next patch adds __is_kernel_text() for the basic kernel text range
check, so use the private name is_x86_32_kernel_text() for the
x86-specific check.

Link: https://lkml.kernel.org/r/20210930071143.63410-6-wangkefeng.wang@huawei.com
Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
Reviewed-by: Sergey Senozhatsky <senozhatsky@chromium.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Alexander Potapenko <glider@google.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Andrey Konovalov <andreyknvl@gmail.com>
Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Christophe Leroy <christophe.leroy@csgroup.eu>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
Cc: Matt Turner <mattst88@gmail.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Michal Simek <monstr@monstr.eu>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Petr Mladek <pmladek@suse.com>
Cc: Richard Henderson <rth@twiddle.net>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 arch/x86/mm/init_32.c |   14 +++++---------
 1 file changed, 5 insertions(+), 9 deletions(-)

--- a/arch/x86/mm/init_32.c~x86-mm-rename-__is_kernel_text-to-is_x86_32_kernel_text
+++ a/arch/x86/mm/init_32.c
@@ -238,11 +238,7 @@ page_table_range_init(unsigned long star
 	}
 }
 
-/*
- * The <linux/kallsyms.h> already defines is_kernel_text,
- * using '__' prefix not to get in conflict.
- */
-static inline int __is_kernel_text(unsigned long addr)
+static inline int is_x86_32_kernel_text(unsigned long addr)
 {
 	if (addr >= (unsigned long)_text && addr <= (unsigned long)__init_end)
 		return 1;
@@ -333,8 +329,8 @@ repeat:
 				addr2 = (pfn + PTRS_PER_PTE-1) * PAGE_SIZE +
 					PAGE_OFFSET + PAGE_SIZE-1;
 
-				if (__is_kernel_text(addr) ||
-				    __is_kernel_text(addr2))
+				if (is_x86_32_kernel_text(addr) ||
+				    is_x86_32_kernel_text(addr2))
 					prot = PAGE_KERNEL_LARGE_EXEC;
 
 				pages_2m++;
@@ -359,7 +355,7 @@ repeat:
 				 */
 				pgprot_t init_prot = __pgprot(PTE_IDENT_ATTR);
 
-				if (__is_kernel_text(addr))
+				if (is_x86_32_kernel_text(addr))
 					prot = PAGE_KERNEL_EXEC;
 
 				pages_4k++;
@@ -789,7 +785,7 @@ static void mark_nxdata_nx(void)
 	 */
 	unsigned long start = PFN_ALIGN(_etext);
 	/*
-	 * This comes from __is_kernel_text upper limit. Also HPAGE where used:
+	 * This comes from is_x86_32_kernel_text upper limit. Also HPAGE where used:
 	 */
 	unsigned long size = (((unsigned long)__init_end + HPAGE_SIZE) & HPAGE_MASK) - start;
 
_


^ permalink raw reply	[flat|nested] 98+ messages in thread

* [patch 49/87] sections: provide internal __is_kernel() and __is_kernel_text() helper
  2021-11-09  2:30 incoming Andrew Morton
                   ` (47 preceding siblings ...)
  2021-11-09  2:33 ` [patch 48/87] x86: mm: rename __is_kernel_text() to is_x86_32_kernel_text() Andrew Morton
@ 2021-11-09  2:34 ` Andrew Morton
  2021-11-09  2:34 ` [patch 50/87] mm: kasan: use is_kernel() helper Andrew Morton
                   ` (37 subsequent siblings)
  86 siblings, 0 replies; 98+ messages in thread
From: Andrew Morton @ 2021-11-09  2:34 UTC (permalink / raw)
  To: akpm, andreyknvl, arnd, ast, benh, bp, christophe.leroy, davem,
	dvyukov, glider, ink, linux-mm, mattst88, mingo, mm-commits,
	monstr, mpe, paulus, pmladek, rostedt, rth, ryabinin.a.a,
	senozhatsky, tglx, torvalds, wangkefeng.wang

From: Kefeng Wang <wangkefeng.wang@huawei.com>
Subject: sections: provide internal __is_kernel() and __is_kernel_text() helper

Add an internal __is_kernel() helper which only checks the kernel address
range, and an internal __is_kernel_text() helper which only checks the
text section range.
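
The layering this introduces can be sketched in userspace as follows.  All
names and address constants below are hypothetical stand-ins: the demo_*
bounds replace the linker symbols _stext and _etext, and
demo_in_gate_area() replaces in_gate_area_no_mm().

```c
#include <stdbool.h>

/* Hypothetical bounds standing in for _stext and _etext. */
static const unsigned long demo_stext = 0x1000;
static const unsigned long demo_etext = 0x2000;

/* Hypothetical extra region standing in for the gate area. */
static const unsigned long demo_gate_start = 0x9000;
static const unsigned long demo_gate_end = 0x9100;

/* Internal helper: pure range check, nothing else. */
static bool demo_range_is_text(unsigned long addr)
{
	return addr >= demo_stext && addr < demo_etext;
}

/* Stand-in for the additional gate area check. */
static int demo_in_gate_area(unsigned long addr)
{
	return addr >= demo_gate_start && addr < demo_gate_end;
}

/* Public wrapper: range check first, then fall back to the gate area. */
static int demo_is_text(unsigned long addr)
{
	if (demo_range_is_text(addr))
		return 1;
	return demo_in_gate_area(addr);
}
```

Callers that only care about the section bounds can use the internal
helper, while the wrapper keeps the existing behaviour of also accepting
gate area addresses.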

Link: https://lkml.kernel.org/r/20210930071143.63410-7-wangkefeng.wang@huawei.com
Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
Reviewed-by: Sergey Senozhatsky <senozhatsky@chromium.org>
Cc: Alexander Potapenko <glider@google.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Andrey Konovalov <andreyknvl@gmail.com>
Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Christophe Leroy <christophe.leroy@csgroup.eu>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
Cc: Matt Turner <mattst88@gmail.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Michal Simek <monstr@monstr.eu>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Petr Mladek <pmladek@suse.com>
Cc: Richard Henderson <rth@twiddle.net>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 include/asm-generic/sections.h |   29 +++++++++++++++++++++++++++++
 include/linux/kallsyms.h       |    4 ++--
 2 files changed, 31 insertions(+), 2 deletions(-)

--- a/include/asm-generic/sections.h~sections-provide-internal-__is_kernel-and-__is_kernel_text-helper
+++ a/include/asm-generic/sections.h
@@ -172,4 +172,33 @@ static inline bool is_kernel_inittext(un
 	       addr < (unsigned long)_einittext;
 }
 
+/**
+ * __is_kernel_text - checks if the pointer address is located in the
+ *                    .text section
+ *
+ * @addr: address to check
+ *
+ * Returns: true if the address is located in .text, false otherwise.
+ * Note: an internal helper, only check the range of _stext to _etext.
+ */
+static inline bool __is_kernel_text(unsigned long addr)
+{
+	return addr >= (unsigned long)_stext &&
+	       addr < (unsigned long)_etext;
+}
+
+/**
+ * __is_kernel - checks if the pointer address is located in the kernel range
+ *
+ * @addr: address to check
+ *
+ * Returns: true if the address is located in the kernel range, false otherwise.
+ * Note: an internal helper, only check the range of _stext to _end.
+ */
+static inline bool __is_kernel(unsigned long addr)
+{
+	return addr >= (unsigned long)_stext &&
+	       addr < (unsigned long)_end;
+}
+
 #endif /* _ASM_GENERIC_SECTIONS_H_ */
--- a/include/linux/kallsyms.h~sections-provide-internal-__is_kernel-and-__is_kernel_text-helper
+++ a/include/linux/kallsyms.h
@@ -26,14 +26,14 @@ struct module;
 
 static inline int is_kernel_text(unsigned long addr)
 {
-	if ((addr >= (unsigned long)_stext && addr < (unsigned long)_etext))
+	if (__is_kernel_text(addr))
 		return 1;
 	return in_gate_area_no_mm(addr);
 }
 
 static inline int is_kernel(unsigned long addr)
 {
-	if (addr >= (unsigned long)_stext && addr < (unsigned long)_end)
+	if (__is_kernel(addr))
 		return 1;
 	return in_gate_area_no_mm(addr);
 }
_


^ permalink raw reply	[flat|nested] 98+ messages in thread

* [patch 50/87] mm: kasan: use is_kernel() helper
  2021-11-09  2:30 incoming Andrew Morton
                   ` (48 preceding siblings ...)
  2021-11-09  2:34 ` [patch 49/87] sections: provide internal __is_kernel() and __is_kernel_text() helper Andrew Morton
@ 2021-11-09  2:34 ` Andrew Morton
  2021-11-09  2:34 ` [patch 51/87] extable: use is_kernel_text() helper Andrew Morton
                   ` (36 subsequent siblings)
  86 siblings, 0 replies; 98+ messages in thread
From: Andrew Morton @ 2021-11-09  2:34 UTC (permalink / raw)
  To: akpm, andreyknvl, arnd, ast, benh, bp, christophe.leroy, davem,
	dvyukov, glider, ink, linux-mm, mattst88, mingo, mm-commits,
	monstr, mpe, paulus, pmladek, rostedt, rth, ryabinin.a.a,
	senozhatsky, tglx, torvalds, wangkefeng.wang

From: Kefeng Wang <wangkefeng.wang@huawei.com>
Subject: mm: kasan: use is_kernel() helper

Use the is_kernel() helper directly in kernel_or_module_addr().

Link: https://lkml.kernel.org/r/20210930071143.63410-8-wangkefeng.wang@huawei.com
Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
Reviewed-by: Alexander Potapenko <glider@google.com>
Reviewed-by: Andrey Konovalov <andreyknvl@gmail.com>
Reviewed-by: Sergey Senozhatsky <senozhatsky@chromium.org>
Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Christophe Leroy <christophe.leroy@csgroup.eu>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
Cc: Matt Turner <mattst88@gmail.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Michal Simek <monstr@monstr.eu>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Petr Mladek <pmladek@suse.com>
Cc: Richard Henderson <rth@twiddle.net>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 mm/kasan/report.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- a/mm/kasan/report.c~mm-kasan-use-is_kernel-helper
+++ a/mm/kasan/report.c
@@ -226,7 +226,7 @@ static void describe_object(struct kmem_
 
 static inline bool kernel_or_module_addr(const void *addr)
 {
-	if (addr >= (void *)_stext && addr < (void *)_end)
+	if (is_kernel((unsigned long)addr))
 		return true;
 	if (is_module_address((unsigned long)addr))
 		return true;
_



* [patch 51/87] extable: use is_kernel_text() helper
  2021-11-09  2:30 incoming Andrew Morton
                   ` (49 preceding siblings ...)
  2021-11-09  2:34 ` [patch 50/87] mm: kasan: use is_kernel() helper Andrew Morton
@ 2021-11-09  2:34 ` Andrew Morton
  2021-11-09  2:34 ` [patch 52/87] powerpc/mm: use core_kernel_text() helper Andrew Morton
                   ` (35 subsequent siblings)
  86 siblings, 0 replies; 98+ messages in thread
From: Andrew Morton @ 2021-11-09  2:34 UTC (permalink / raw)
  To: akpm, andreyknvl, arnd, ast, benh, bp, christophe.leroy, davem,
	dvyukov, glider, ink, linux-mm, mattst88, mingo, mm-commits,
	monstr, mpe, paulus, pmladek, rostedt, rth, ryabinin.a.a,
	senozhatsky, tglx, torvalds, wangkefeng.wang

From: Kefeng Wang <wangkefeng.wang@huawei.com>
Subject: extable: use is_kernel_text() helper

core_kernel_text() should also check the gate area, since it is part of
the kernel text range.  Use is_kernel_text() in core_kernel_text() to do so.

Link: https://lkml.kernel.org/r/20210930071143.63410-9-wangkefeng.wang@huawei.com
Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
Reviewed-by: Sergey Senozhatsky <senozhatsky@chromium.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Alexander Potapenko <glider@google.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Andrey Konovalov <andreyknvl@gmail.com>
Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Christophe Leroy <christophe.leroy@csgroup.eu>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
Cc: Matt Turner <mattst88@gmail.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Michal Simek <monstr@monstr.eu>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Petr Mladek <pmladek@suse.com>
Cc: Richard Henderson <rth@twiddle.net>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 kernel/extable.c |    3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

--- a/kernel/extable.c~extable-use-is_kernel_text-helper
+++ a/kernel/extable.c
@@ -64,8 +64,7 @@ const struct exception_table_entry *sear
 
 int notrace core_kernel_text(unsigned long addr)
 {
-	if (addr >= (unsigned long)_stext &&
-	    addr < (unsigned long)_etext)
+	if (is_kernel_text(addr))
 		return 1;
 
 	if (system_state < SYSTEM_FREEING_INITMEM &&
_
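The behavioural difference in this patch is confined to gate-area
addresses. A rough user-space sketch follows, assuming a made-up address
layout; the SKETCH_* constants and sketch_* functions are hypothetical, and
the init-text check that core_kernel_text() also performs is left out.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical address layout, chosen only for illustration. */
#define SKETCH_STEXT      0x1000UL
#define SKETCH_ETEXT      0x2000UL
#define SKETCH_GATE_START 0x9000UL
#define SKETCH_GATE_END   0xa000UL

/* Models in_gate_area_no_mm(): true inside the (hypothetical) gate page. */
bool sketch_in_gate_area(unsigned long addr)
{
	return addr >= SKETCH_GATE_START && addr < SKETCH_GATE_END;
}

/* The range test that core_kernel_text() open-coded before the patch. */
bool sketch_old_text_check(unsigned long addr)
{
	return addr >= SKETCH_STEXT && addr < SKETCH_ETEXT;
}

/* Models is_kernel_text(): the same range, plus the gate area. */
bool sketch_text_or_gate(unsigned long addr)
{
	return sketch_old_text_check(addr) || sketch_in_gate_area(addr);
}

/* Only the gate-area address is classified differently. */
void sketch_gate_selfcheck(void)
{
	assert(sketch_old_text_check(0x1800UL) && sketch_text_or_gate(0x1800UL));
	assert(!sketch_old_text_check(0x9800UL) && sketch_text_or_gate(0x9800UL));
	assert(!sketch_text_or_gate(0x5000UL));
}
```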



* [patch 52/87] powerpc/mm: use core_kernel_text() helper
  2021-11-09  2:30 incoming Andrew Morton
                   ` (50 preceding siblings ...)
  2021-11-09  2:34 ` [patch 51/87] extable: use is_kernel_text() helper Andrew Morton
@ 2021-11-09  2:34 ` Andrew Morton
  2021-11-09  2:34 ` [patch 53/87] microblaze: use is_kernel_text() helper Andrew Morton
                   ` (34 subsequent siblings)
  86 siblings, 0 replies; 98+ messages in thread
From: Andrew Morton @ 2021-11-09  2:34 UTC (permalink / raw)
  To: akpm, andreyknvl, arnd, ast, benh, bp, christophe.leroy, davem,
	dvyukov, glider, ink, linux-mm, mattst88, mingo, mm-commits,
	monstr, mpe, paulus, pmladek, rostedt, rth, ryabinin.a.a,
	senozhatsky, tglx, torvalds, wangkefeng.wang

From: Kefeng Wang <wangkefeng.wang@huawei.com>
Subject: powerpc/mm: use core_kernel_text() helper

Use the core_kernel_text() helper to simplify the code.  Also drop the
etext, _stext, _sinittext and _einittext declarations, which are already
declared in sections.h.

Link: https://lkml.kernel.org/r/20210930071143.63410-10-wangkefeng.wang@huawei.com
Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
Reviewed-by: Sergey Senozhatsky <senozhatsky@chromium.org>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Christophe Leroy <christophe.leroy@csgroup.eu>
Cc: Alexander Potapenko <glider@google.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Andrey Konovalov <andreyknvl@gmail.com>
Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Borislav Petkov <bp@alien8.de>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
Cc: Matt Turner <mattst88@gmail.com>
Cc: Michal Simek <monstr@monstr.eu>
Cc: Petr Mladek <pmladek@suse.com>
Cc: Richard Henderson <rth@twiddle.net>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 arch/powerpc/mm/pgtable_32.c |    7 ++-----
 1 file changed, 2 insertions(+), 5 deletions(-)

--- a/arch/powerpc/mm/pgtable_32.c~powerpc-mm-use-core_kernel_text-helper
+++ a/arch/powerpc/mm/pgtable_32.c
@@ -33,8 +33,6 @@
 
 #include <mm/mmu_decl.h>
 
-extern char etext[], _stext[], _sinittext[], _einittext[];
-
 static u8 early_fixmap_pagetable[FIXMAP_PTE_SIZE] __page_aligned_data;
 
 notrace void __init early_ioremap_init(void)
@@ -104,14 +102,13 @@ static void __init __mapin_ram_chunk(uns
 {
 	unsigned long v, s;
 	phys_addr_t p;
-	int ktext;
+	bool ktext;
 
 	s = offset;
 	v = PAGE_OFFSET + s;
 	p = memstart_addr + s;
 	for (; s < top; s += PAGE_SIZE) {
-		ktext = ((char *)v >= _stext && (char *)v < etext) ||
-			((char *)v >= _sinittext && (char *)v < _einittext);
+		ktext = core_kernel_text(v);
 		map_kernel_page(v, p, ktext ? PAGE_KERNEL_TEXT : PAGE_KERNEL);
 		v += PAGE_SIZE;
 		p += PAGE_SIZE;
_



* [patch 53/87] microblaze: use is_kernel_text() helper
  2021-11-09  2:30 incoming Andrew Morton
                   ` (51 preceding siblings ...)
  2021-11-09  2:34 ` [patch 52/87] powerpc/mm: use core_kernel_text() helper Andrew Morton
@ 2021-11-09  2:34 ` Andrew Morton
  2021-11-09  2:34 ` [patch 54/87] alpha: " Andrew Morton
                   ` (33 subsequent siblings)
  86 siblings, 0 replies; 98+ messages in thread
From: Andrew Morton @ 2021-11-09  2:34 UTC (permalink / raw)
  To: akpm, andreyknvl, arnd, ast, benh, bp, christophe.leroy, davem,
	dvyukov, glider, ink, linux-mm, mattst88, michal.simek, mingo,
	mm-commits, mpe, paulus, pmladek, rostedt, rth, ryabinin.a.a,
	senozhatsky, tglx, torvalds, wangkefeng.wang

From: Kefeng Wang <wangkefeng.wang@huawei.com>
Subject: microblaze: use is_kernel_text() helper

Use the is_kernel_text() helper to simplify the code.

Link: https://lkml.kernel.org/r/20210930071143.63410-11-wangkefeng.wang@huawei.com
Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
Acked-by: Michal Simek <michal.simek@xilinx.com>
Reviewed-by: Sergey Senozhatsky <senozhatsky@chromium.org>
Cc: Alexander Potapenko <glider@google.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Andrey Konovalov <andreyknvl@gmail.com>
Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Christophe Leroy <christophe.leroy@csgroup.eu>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
Cc: Matt Turner <mattst88@gmail.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Petr Mladek <pmladek@suse.com>
Cc: Richard Henderson <rth@twiddle.net>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 arch/microblaze/mm/pgtable.c |    3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

--- a/arch/microblaze/mm/pgtable.c~microblaze-use-is_kernel_text-helper
+++ a/arch/microblaze/mm/pgtable.c
@@ -34,6 +34,7 @@
 #include <linux/mm_types.h>
 #include <linux/pgtable.h>
 #include <linux/memblock.h>
+#include <linux/kallsyms.h>
 
 #include <asm/pgalloc.h>
 #include <linux/io.h>
@@ -171,7 +172,7 @@ void __init mapin_ram(void)
 	for (s = 0; s < lowmem_size; s += PAGE_SIZE) {
 		f = _PAGE_PRESENT | _PAGE_ACCESSED |
 				_PAGE_SHARED | _PAGE_HWEXEC;
-		if ((char *) v < _stext || (char *) v >= _etext)
+		if (!is_kernel_text(v))
 			f |= _PAGE_WRENABLE;
 		else
 			/* On the MicroBlaze, no user access
_



* [patch 54/87] alpha: use is_kernel_text() helper
  2021-11-09  2:30 incoming Andrew Morton
                   ` (52 preceding siblings ...)
  2021-11-09  2:34 ` [patch 53/87] microblaze: use is_kernel_text() helper Andrew Morton
@ 2021-11-09  2:34 ` Andrew Morton
  2021-11-09  2:34 ` [patch 55/87] ramfs: fix mount source show for ramfs Andrew Morton
                   ` (32 subsequent siblings)
  86 siblings, 0 replies; 98+ messages in thread
From: Andrew Morton @ 2021-11-09  2:34 UTC (permalink / raw)
  To: akpm, andreyknvl, arnd, ast, benh, bp, christophe.leroy, davem,
	dvyukov, glider, ink, linux-mm, mattst88, mingo, mm-commits,
	monstr, mpe, paulus, pmladek, rostedt, rth, ryabinin.a.a,
	senozhatsky, tglx, torvalds, wangkefeng.wang

From: Kefeng Wang <wangkefeng.wang@huawei.com>
Subject: alpha: use is_kernel_text() helper

Use the is_kernel_text() helper to simplify the code.

Link: https://lkml.kernel.org/r/20210930071143.63410-12-wangkefeng.wang@huawei.com
Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
Reviewed-by: Sergey Senozhatsky <senozhatsky@chromium.org>
Cc: Richard Henderson <rth@twiddle.net>
Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
Cc: Matt Turner <mattst88@gmail.com>
Cc: Alexander Potapenko <glider@google.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Andrey Konovalov <andreyknvl@gmail.com>
Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Christophe Leroy <christophe.leroy@csgroup.eu>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Michal Simek <monstr@monstr.eu>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Petr Mladek <pmladek@suse.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 arch/alpha/kernel/traps.c |    4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)

--- a/arch/alpha/kernel/traps.c~alpha-use-is_kernel_text-helper
+++ a/arch/alpha/kernel/traps.c
@@ -129,9 +129,7 @@ dik_show_trace(unsigned long *sp, const
 		extern char _stext[], _etext[];
 		unsigned long tmp = *sp;
 		sp++;
-		if (tmp < (unsigned long) &_stext)
-			continue;
-		if (tmp >= (unsigned long) &_etext)
+		if (!is_kernel_text(tmp))
 			continue;
 		printk("%s[<%lx>] %pSR\n", loglvl, tmp, (void *)tmp);
 		if (i > 40) {
_



* [patch 55/87] ramfs: fix mount source show for ramfs
  2021-11-09  2:30 incoming Andrew Morton
                   ` (53 preceding siblings ...)
  2021-11-09  2:34 ` [patch 54/87] alpha: " Andrew Morton
@ 2021-11-09  2:34 ` Andrew Morton
  2021-11-09  2:34 ` [patch 56/87] init: make unknown command line param message clearer Andrew Morton
                   ` (31 subsequent siblings)
  86 siblings, 0 replies; 98+ messages in thread
From: Andrew Morton @ 2021-11-09  2:34 UTC (permalink / raw)
  To: akpm, linux-mm, mm-commits, torvalds, viro, yangerkun

From: yangerkun <yangerkun@huawei.com>
Subject: ramfs: fix mount source show for ramfs

ramfs_parse_param() does not parse the "source" key and converts -ENOPARAM
to 0.  This makes vfs_parse_fs_param() skip vfs_parse_fs_param_source(),
so the mount source for ramfs always shows as "none".  Fix it by parsing
"source" in ramfs_parse_param(), as cgroup1_parse_param() does.

Link: https://lkml.kernel.org/r/20210924091756.1906118-1-yangerkun@huawei.com
Signed-off-by: yangerkun <yangerkun@huawei.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 fs/ramfs/inode.c |   11 +++++++----
 1 file changed, 7 insertions(+), 4 deletions(-)

--- a/fs/ramfs/inode.c~ramfs-fix-mount-source-show-for-ramfs
+++ a/fs/ramfs/inode.c
@@ -203,17 +203,20 @@ static int ramfs_parse_param(struct fs_c
 	int opt;
 
 	opt = fs_parse(fc, ramfs_fs_parameters, param, &result);
-	if (opt < 0) {
+	if (opt == -ENOPARAM) {
+		opt = vfs_parse_fs_param_source(fc, param);
+		if (opt != -ENOPARAM)
+			return opt;
 		/*
 		 * We might like to report bad mount options here;
 		 * but traditionally ramfs has ignored all mount options,
 		 * and as it is used as a !CONFIG_SHMEM simple substitute
 		 * for tmpfs, better continue to ignore other mount options.
 		 */
-		if (opt == -ENOPARAM)
-			opt = 0;
-		return opt;
+		return 0;
 	}
+	if (opt < 0)
+		return opt;
 
 	switch (opt) {
 	case Opt_mode:
_
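The control flow of the fixed ramfs_parse_param() can be modelled in user
space. The sketch below is an approximation under stated assumptions: the
sketch_* functions only imitate fs_parse() and vfs_parse_fs_param_source(),
SKETCH_ENOPARAM stands in for the kernel-internal ENOPARAM, and only a
"mode" option is recognised.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <string.h>

#define SKETCH_ENOPARAM 519	/* stand-in for the kernel-internal ENOPARAM */
enum { SKETCH_OPT_MODE = 0 };

struct sketch_fs_context {
	const char *source;
	unsigned int mode;
};

/* Imitates fs_parse(): option index, -SKETCH_ENOPARAM, or another error. */
int sketch_fs_parse(const char *key, const char *value, unsigned int *mode)
{
	char *end;

	if (strcmp(key, "mode") != 0)
		return -SKETCH_ENOPARAM;
	*mode = (unsigned int)strtoul(value, &end, 8);
	if (*value == '\0' || *end != '\0')
		return -EINVAL;
	return SKETCH_OPT_MODE;
}

/* Imitates the source parser: handles "source" only, rejects a second one. */
int sketch_parse_source(struct sketch_fs_context *fc, const char *key,
			const char *value)
{
	if (strcmp(key, "source") != 0)
		return -SKETCH_ENOPARAM;
	if (fc->source)
		return -EINVAL;
	fc->source = value;
	return 0;
}

int sketch_ramfs_parse_param(struct sketch_fs_context *fc, const char *key,
			     const char *value)
{
	unsigned int mode = 0;
	int opt = sketch_fs_parse(key, value, &mode);

	if (opt == -SKETCH_ENOPARAM) {
		/* Give the source parser a chance before ignoring the key. */
		opt = sketch_parse_source(fc, key, value);
		if (opt != -SKETCH_ENOPARAM)
			return opt;
		return 0;	/* other unknown options stay ignored */
	}
	if (opt < 0)
		return opt;	/* known key, bad value */

	fc->mode = mode & 07777;
	return 0;
}
```

With this ordering "source" is recorded, unknown keys are still accepted
silently, and a malformed value for a known key is reported as an error.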



* [patch 56/87] init: make unknown command line param message clearer
  2021-11-09  2:30 incoming Andrew Morton
                   ` (54 preceding siblings ...)
  2021-11-09  2:34 ` [patch 55/87] ramfs: fix mount source show for ramfs Andrew Morton
@ 2021-11-09  2:34 ` Andrew Morton
  2021-11-09  2:34 ` [patch 57/87] coda: avoid NULL pointer dereference from a bad inode Andrew Morton
                   ` (30 subsequent siblings)
  86 siblings, 0 replies; 98+ messages in thread
From: Andrew Morton @ 2021-11-09  2:34 UTC (permalink / raw)
  To: ahalaney, akpm, bp, linux-mm, mm-commits, rdunlap, rostedt, torvalds

From: Andrew Halaney <ahalaney@redhat.com>
Subject: init: make unknown command line param message clearer

The prior message is confusing users, which is the exact opposite of the
goal.  If the message is being seen, one of the following situations is
happening:

 1. the param is misspelled
 2. the param is not valid due to the kernel configuration
 3. the param is intended for init but isn't after the '--'
    delineator on the command line

To make that more clear to the user, explicitly mention "kernel command
line" and also note that the params are still passed to user space to
avoid causing any alarm over params intended for init.

Link: https://lkml.kernel.org/r/20211013223502.96756-1-ahalaney@redhat.com
Fixes: 86d1919a4fb0 ("init: print out unknown kernel parameters")
Signed-off-by: Andrew Halaney <ahalaney@redhat.com>
Suggested-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Acked-by: Randy Dunlap <rdunlap@infradead.org>
Cc: Borislav Petkov <bp@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 init/main.c |    4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

--- a/init/main.c~init-make-unknown-command-line-param-message-clearer
+++ a/init/main.c
@@ -924,7 +924,9 @@ static void __init print_unknown_bootopt
 	for (p = &envp_init[2]; *p; p++)
 		end += sprintf(end, " %s", *p);
 
-	pr_notice("Unknown command line parameters:%s\n", unknown_options);
+	/* Start at unknown_options[1] to skip the initial space */
+	pr_notice("Unknown kernel command line parameters \"%s\", will be passed to user space.\n",
+		&unknown_options[1]);
 	memblock_free(unknown_options, len);
 }
 
_
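The "skip the initial space" detail can be seen in a small user-space
sketch. It assumes a single NULL-terminated parameter list and a fixed
128-byte scratch buffer; sketch_unknown_bootoptions() is a hypothetical
helper, and the trailing newline of the kernel message is left out.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical user-space helper: appends each param with a leading space,
 * then formats the notice starting at unknown[1] to skip the first space.
 * Returns the message length, 0 if there are no params, -1 if too long.
 */
int sketch_unknown_bootoptions(char *out, size_t outlen,
			       const char *const *params)
{
	char unknown[128];
	size_t used = 0;
	int n;

	unknown[0] = '\0';
	for (; *params; params++) {
		n = snprintf(unknown + used, sizeof(unknown) - used,
			     " %s", *params);
		if (n < 0 || (size_t)n >= sizeof(unknown) - used)
			return -1;
		used += (size_t)n;
	}
	if (used == 0)
		return 0;

	n = snprintf(out, outlen,
		     "Unknown kernel command line parameters \"%s\", "
		     "will be passed to user space.", &unknown[1]);
	return (n < 0 || (size_t)n >= outlen) ? -1 : n;
}

/* The quoted list starts with the first param, not with a space. */
void sketch_bootoptions_selfcheck(void)
{
	const char *const params[] = { "foo", "bar=1", NULL };
	char msg[160];

	assert(sketch_unknown_bootoptions(msg, sizeof(msg), params) > 0);
	assert(strcmp(msg, "Unknown kernel command line parameters "
			   "\"foo bar=1\", will be passed to user space.") == 0);
}
```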



* [patch 57/87] coda: avoid NULL pointer dereference from a bad inode
  2021-11-09  2:30 incoming Andrew Morton
                   ` (55 preceding siblings ...)
  2021-11-09  2:34 ` [patch 56/87] init: make unknown command line param message clearer Andrew Morton
@ 2021-11-09  2:34 ` Andrew Morton
  2021-11-09  2:34 ` [patch 58/87] coda: check for async upcall request using local state Andrew Morton
                   ` (29 subsequent siblings)
  86 siblings, 0 replies; 98+ messages in thread
From: Andrew Morton @ 2021-11-09  2:34 UTC (permalink / raw)
  To: akpm, alex.shi, jaharkes, jing.yangyang, linux-mm, mm-commits,
	tanxin.ctf, torvalds, xiyuyang19, zealci

From: Jan Harkes <jaharkes@cs.cmu.edu>
Subject: coda: avoid NULL pointer dereference from a bad inode

Patch series "Coda updates for -next".

The following patch series contains some fixes for the Coda kernel module
that I've had sitting around and that were tested extensively in a
development version of the Coda kernel module that lives outside of the
main kernel.


This patch (of 9):

Avoid accessing coda_inode_info from a dentry with a bad inode.

Link: https://lkml.kernel.org/r/20210908140308.18491-1-jaharkes@cs.cmu.edu
Link: https://lkml.kernel.org/r/20210908140308.18491-2-jaharkes@cs.cmu.edu
Signed-off-by: Jan Harkes <jaharkes@cs.cmu.edu>
Cc: Alex Shi <alex.shi@linux.alibaba.com>
Cc: Jing Yangyang <jing.yangyang@zte.com.cn>
Cc: Xin Tan <tanxin.ctf@gmail.com>
Cc: Xiyu Yang <xiyuyang19@fudan.edu.cn>
Cc: Zeal Robot <zealci@zte.com.cn>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 fs/coda/dir.c |   13 +++++++++----
 1 file changed, 9 insertions(+), 4 deletions(-)

--- a/fs/coda/dir.c~coda-avoid-null-pointer-dereference-from-a-bad-inode
+++ a/fs/coda/dir.c
@@ -499,15 +499,20 @@ out:
  */
 static int coda_dentry_delete(const struct dentry * dentry)
 {
-	int flags;
+	struct inode *inode;
+	struct coda_inode_info *cii;
 
 	if (d_really_is_negative(dentry)) 
 		return 0;
 
-	flags = (ITOC(d_inode(dentry))->c_flags) & C_PURGE;
-	if (is_bad_inode(d_inode(dentry)) || flags) {
+	inode = d_inode(dentry);
+	if (!inode || is_bad_inode(inode))
 		return 1;
-	}
+
+	cii = ITOC(inode);
+	if (cii->c_flags & C_PURGE)
+		return 1;
+
 	return 0;
 }
 
_
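The point of the reordering is that the per-inode Coda state is only read
once the inode is known to be usable. A simplified user-space model
follows; the sketch_* types are hypothetical, a bad inode is modelled as
having no private state at all, and the two NULL checks of the patch are
collapsed into one.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define SKETCH_C_PURGE 0x8	/* hypothetical flag value */

struct sketch_cii {
	int c_flags;
};

struct sketch_inode {
	bool bad;
	struct sketch_cii *cii;	/* modelled as NULL for a bad inode */
};

struct sketch_dentry {
	struct sketch_inode *inode;
};

/* Returns 1 if the dentry should be dropped, 0 if it may be kept. */
int sketch_dentry_delete(const struct sketch_dentry *dentry)
{
	const struct sketch_inode *inode = dentry->inode;

	if (!inode)		/* negative dentry: nothing to inspect */
		return 0;
	if (inode->bad)		/* decide before touching inode->cii */
		return 1;
	if (inode->cii->c_flags & SKETCH_C_PURGE)
		return 1;
	return 0;
}

void sketch_dentry_selfcheck(void)
{
	struct sketch_cii purge = { SKETCH_C_PURGE };
	struct sketch_inode bad = { true, NULL };
	struct sketch_inode purged = { false, &purge };
	struct sketch_dentry d_bad = { &bad };
	struct sketch_dentry d_purged = { &purged };

	assert(sketch_dentry_delete(&d_bad) == 1);	/* no NULL dereference */
	assert(sketch_dentry_delete(&d_purged) == 1);
}
```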



* [patch 58/87] coda: check for async upcall request using local state
  2021-11-09  2:30 incoming Andrew Morton
                   ` (56 preceding siblings ...)
  2021-11-09  2:34 ` [patch 57/87] coda: avoid NULL pointer dereference from a bad inode Andrew Morton
@ 2021-11-09  2:34 ` Andrew Morton
  2021-11-09  2:34 ` [patch 59/87] coda: remove err which no one care Andrew Morton
                   ` (28 subsequent siblings)
  86 siblings, 0 replies; 98+ messages in thread
From: Andrew Morton @ 2021-11-09  2:34 UTC (permalink / raw)
  To: akpm, alex.shi, jaharkes, jing.yangyang, linux-mm, mm-commits,
	tanxin.ctf, torvalds, xiyuyang19, zealci

From: Jan Harkes <jaharkes@cs.cmu.edu>
Subject: coda: check for async upcall request using local state

Originally flagged by Smatch because the code implicitly assumed outSize
is not NULL for non-async upcalls because of a flag that was (not) set in
req->uc_flags.

However, the req->uc_flags field is shared state, and although the current
code will not allow it to be changed before the async request check, the
code is more robust when it tests against the local outSize variable.

Link: https://lkml.kernel.org/r/20210908140308.18491-3-jaharkes@cs.cmu.edu
Signed-off-by: Jan Harkes <jaharkes@cs.cmu.edu>
Cc: Alex Shi <alex.shi@linux.alibaba.com>
Cc: Jing Yangyang <jing.yangyang@zte.com.cn>
Cc: Xin Tan <tanxin.ctf@gmail.com>
Cc: Xiyu Yang <xiyuyang19@fudan.edu.cn>
Cc: Zeal Robot <zealci@zte.com.cn>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 fs/coda/upcall.c |    3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

--- a/fs/coda/upcall.c~coda-check-for-async-upcall-request-using-local-state
+++ a/fs/coda/upcall.c
@@ -744,7 +744,8 @@ static int coda_upcall(struct venus_comm
 	list_add_tail(&req->uc_chain, &vcp->vc_pending);
 	wake_up_interruptible(&vcp->vc_waitq);
 
-	if (req->uc_flags & CODA_REQ_ASYNC) {
+	/* We can return early on asynchronous requests */
+	if (outSize == NULL) {
 		mutex_unlock(&vcp->vc_mutex);
 		return 0;
 	}
_



* [patch 59/87] coda: remove err which no one care
  2021-11-09  2:30 incoming Andrew Morton
                   ` (57 preceding siblings ...)
  2021-11-09  2:34 ` [patch 58/87] coda: check for async upcall request using local state Andrew Morton
@ 2021-11-09  2:34 ` Andrew Morton
  2021-11-09  2:34 ` [patch 60/87] coda: avoid flagging NULL inodes Andrew Morton
                   ` (27 subsequent siblings)
  86 siblings, 0 replies; 98+ messages in thread
From: Andrew Morton @ 2021-11-09  2:34 UTC (permalink / raw)
  To: akpm, alex.shi, jaharkes, jing.yangyang, linux-mm, mm-commits,
	tanxin.ctf, torvalds, xiyuyang19, zealci

From: Alex Shi <alex.shi@linux.alibaba.com>
Subject: coda: remove err which no one care

Nobody uses 'err' in coda_release(), so remove it.

Link: https://lkml.kernel.org/r/20210908140308.18491-4-jaharkes@cs.cmu.edu
Signed-off-by: Alex Shi <alex.shi@linux.alibaba.com>
Signed-off-by: Jan Harkes <jaharkes@cs.cmu.edu>
Cc: Jing Yangyang <jing.yangyang@zte.com.cn>
Cc: Xin Tan <tanxin.ctf@gmail.com>
Cc: Xiyu Yang <xiyuyang19@fudan.edu.cn>
Cc: Zeal Robot <zealci@zte.com.cn>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 fs/coda/file.c |    3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

--- a/fs/coda/file.c~coda-remove-err-which-no-one-care
+++ a/fs/coda/file.c
@@ -238,11 +238,10 @@ int coda_release(struct inode *coda_inod
 	struct coda_file_info *cfi;
 	struct coda_inode_info *cii;
 	struct inode *host_inode;
-	int err;
 
 	cfi = coda_ftoc(coda_file);
 
-	err = venus_close(coda_inode->i_sb, coda_i2f(coda_inode),
+	venus_close(coda_inode->i_sb, coda_i2f(coda_inode),
 			  coda_flags, coda_file->f_cred->fsuid);
 
 	host_inode = file_inode(cfi->cfi_container);
_



* [patch 60/87] coda: avoid flagging NULL inodes
  2021-11-09  2:30 incoming Andrew Morton
                   ` (58 preceding siblings ...)
  2021-11-09  2:34 ` [patch 59/87] coda: remove err which no one care Andrew Morton
@ 2021-11-09  2:34 ` Andrew Morton
  2021-11-09  2:34 ` [patch 61/87] coda: avoid hidden code duplication in rename Andrew Morton
                   ` (26 subsequent siblings)
  86 siblings, 0 replies; 98+ messages in thread
From: Andrew Morton @ 2021-11-09  2:34 UTC (permalink / raw)
  To: akpm, alex.shi, jaharkes, jing.yangyang, linux-mm, mm-commits,
	tanxin.ctf, torvalds, xiyuyang19, zealci

From: Jan Harkes <jaharkes@cs.cmu.edu>
Subject: coda: avoid flagging NULL inodes

Somehow we hit a negative dentry in coda_rename even after checking with
d_really_is_positive.  Maybe something raced and turned the new_dentry
negative while we were fixing up directory link counts.

Link: https://lkml.kernel.org/r/20210908140308.18491-5-jaharkes@cs.cmu.edu
Signed-off-by: Jan Harkes <jaharkes@cs.cmu.edu>
Cc: Alex Shi <alex.shi@linux.alibaba.com>
Cc: Jing Yangyang <jing.yangyang@zte.com.cn>
Cc: Xin Tan <tanxin.ctf@gmail.com>
Cc: Xiyu Yang <xiyuyang19@fudan.edu.cn>
Cc: Zeal Robot <zealci@zte.com.cn>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 fs/coda/coda_linux.h |    3 +++
 1 file changed, 3 insertions(+)

--- a/fs/coda/coda_linux.h~coda-avoid-flagging-null-inodes
+++ a/fs/coda/coda_linux.h
@@ -83,6 +83,9 @@ static __inline__ void coda_flag_inode(s
 {
 	struct coda_inode_info *cii = ITOC(inode);
 
+	if (!inode)
+		return;
+
 	spin_lock(&cii->c_lock);
 	cii->c_flags |= flag;
 	spin_unlock(&cii->c_lock);
_



* [patch 61/87] coda: avoid hidden code duplication in rename
  2021-11-09  2:30 incoming Andrew Morton
                   ` (59 preceding siblings ...)
  2021-11-09  2:34 ` [patch 60/87] coda: avoid flagging NULL inodes Andrew Morton
@ 2021-11-09  2:34 ` Andrew Morton
  2021-11-09  2:34 ` [patch 62/87] coda: avoid doing bad things on inode type changes during revalidation Andrew Morton
                   ` (25 subsequent siblings)
  86 siblings, 0 replies; 98+ messages in thread
From: Andrew Morton @ 2021-11-09  2:34 UTC (permalink / raw)
  To: akpm, alex.shi, jaharkes, jing.yangyang, linux-mm, mm-commits,
	tanxin.ctf, torvalds, xiyuyang19, zealci

From: Jan Harkes <jaharkes@cs.cmu.edu>
Subject: coda: avoid hidden code duplication in rename

We were actually fixing up the directory mtime in both branches after the
negative dentry test.  One branch only flagged the directory inodes to
refresh their attributes, while the other branch used the optional
optimization of setting mtime to the current time without going back to
the Coda client.

Link: https://lkml.kernel.org/r/20210908140308.18491-6-jaharkes@cs.cmu.edu
Signed-off-by: Jan Harkes <jaharkes@cs.cmu.edu>
Cc: Alex Shi <alex.shi@linux.alibaba.com>
Cc: Jing Yangyang <jing.yangyang@zte.com.cn>
Cc: Xin Tan <tanxin.ctf@gmail.com>
Cc: Xiyu Yang <xiyuyang19@fudan.edu.cn>
Cc: Zeal Robot <zealci@zte.com.cn>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 fs/coda/dir.c |    7 ++-----
 1 file changed, 2 insertions(+), 5 deletions(-)

--- a/fs/coda/dir.c~coda-avoid-hidden-code-duplication-in-rename
+++ a/fs/coda/dir.c
@@ -317,13 +317,10 @@ static int coda_rename(struct user_names
 				coda_dir_drop_nlink(old_dir);
 				coda_dir_inc_nlink(new_dir);
 			}
-			coda_dir_update_mtime(old_dir);
-			coda_dir_update_mtime(new_dir);
 			coda_flag_inode(d_inode(new_dentry), C_VATTR);
-		} else {
-			coda_flag_inode(old_dir, C_VATTR);
-			coda_flag_inode(new_dir, C_VATTR);
 		}
+		coda_dir_update_mtime(old_dir);
+		coda_dir_update_mtime(new_dir);
 	}
 	return error;
 }
_



* [patch 62/87] coda: avoid doing bad things on inode type changes during revalidation
  2021-11-09  2:30 incoming Andrew Morton
                   ` (60 preceding siblings ...)
  2021-11-09  2:34 ` [patch 61/87] coda: avoid hidden code duplication in rename Andrew Morton
@ 2021-11-09  2:34 ` Andrew Morton
  2021-11-09  2:34 ` [patch 63/87] coda: convert from atomic_t to refcount_t on coda_vm_ops->refcnt Andrew Morton
                   ` (24 subsequent siblings)
  86 siblings, 0 replies; 98+ messages in thread
From: Andrew Morton @ 2021-11-09  2:34 UTC (permalink / raw)
  To: akpm, alex.shi, jaharkes, jing.yangyang, linux-mm, mm-commits,
	tanxin.ctf, torvalds, xiyuyang19, zealci

From: Jan Harkes <jaharkes@cs.cmu.edu>
Subject: coda: avoid doing bad things on inode type changes during revalidation

When Coda discovers an inconsistent object, it turns it into a symlink.
However, we can't just follow this change in the kernel on an existing
file or directory inode that may still have references.

This patch removes the inconsistent inode from the inode hash and
allocates a new inode for the symlink object.

Link: https://lkml.kernel.org/r/20210908140308.18491-7-jaharkes@cs.cmu.edu
Signed-off-by: Jan Harkes <jaharkes@cs.cmu.edu>
Cc: Alex Shi <alex.shi@linux.alibaba.com>
Cc: Jing Yangyang <jing.yangyang@zte.com.cn>
Cc: Xin Tan <tanxin.ctf@gmail.com>
Cc: Xiyu Yang <xiyuyang19@fudan.edu.cn>
Cc: Zeal Robot <zealci@zte.com.cn>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 fs/coda/cnode.c      |   13 +++++++++----
 fs/coda/coda_linux.c |   39 +++++++++++++++++++--------------------
 fs/coda/coda_linux.h |    3 ++-
 3 files changed, 30 insertions(+), 25 deletions(-)

--- a/fs/coda/cnode.c~coda-avoid-doing-bad-things-on-inode-type-changes-during-revalidation
+++ a/fs/coda/cnode.c
@@ -63,9 +63,10 @@ struct inode * coda_iget(struct super_bl
 	struct inode *inode;
 	struct coda_inode_info *cii;
 	unsigned long hash = coda_f2i(fid);
+	umode_t inode_type = coda_inode_type(attr);
 
+retry:
 	inode = iget5_locked(sb, hash, coda_test_inode, coda_set_inode, fid);
-
 	if (!inode)
 		return ERR_PTR(-ENOMEM);
 
@@ -75,11 +76,15 @@ struct inode * coda_iget(struct super_bl
 		inode->i_ino = hash;
 		/* inode is locked and unique, no need to grab cii->c_lock */
 		cii->c_mapcount = 0;
+		coda_fill_inode(inode, attr);
 		unlock_new_inode(inode);
+	} else if ((inode->i_mode & S_IFMT) != inode_type) {
+		/* Inode has changed type, mark bad and grab a new one */
+		remove_inode_hash(inode);
+		coda_flag_inode(inode, C_PURGE);
+		iput(inode);
+		goto retry;
 	}
-
-	/* always replace the attributes, type might have changed */
-	coda_fill_inode(inode, attr);
 	return inode;
 }
 
--- a/fs/coda/coda_linux.c~coda-avoid-doing-bad-things-on-inode-type-changes-during-revalidation
+++ a/fs/coda/coda_linux.c
@@ -87,28 +87,27 @@ static struct coda_timespec timespec64_t
 }
 
 /* utility functions below */
+umode_t coda_inode_type(struct coda_vattr *attr)
+{
+	switch (attr->va_type) {
+	case C_VREG:
+		return S_IFREG;
+	case C_VDIR:
+		return S_IFDIR;
+	case C_VLNK:
+		return S_IFLNK;
+	case C_VNON:
+	default:
+		return 0;
+	}
+}
+
 void coda_vattr_to_iattr(struct inode *inode, struct coda_vattr *attr)
 {
-        int inode_type;
-        /* inode's i_flags, i_ino are set by iget 
-           XXX: is this all we need ??
-           */
-        switch (attr->va_type) {
-        case C_VNON:
-                inode_type  = 0;
-                break;
-        case C_VREG:
-                inode_type = S_IFREG;
-                break;
-        case C_VDIR:
-                inode_type = S_IFDIR;
-                break;
-        case C_VLNK:
-                inode_type = S_IFLNK;
-                break;
-        default:
-                inode_type = 0;
-        }
+	/* inode's i_flags, i_ino are set by iget
+	 * XXX: is this all we need ??
+	 */
+	umode_t inode_type = coda_inode_type(attr);
 	inode->i_mode |= inode_type;
 
 	if (attr->va_mode != (u_short) -1)
--- a/fs/coda/coda_linux.h~coda-avoid-doing-bad-things-on-inode-type-changes-during-revalidation
+++ a/fs/coda/coda_linux.h
@@ -53,10 +53,11 @@ int coda_getattr(struct user_namespace *
 		 u32, unsigned int);
 int coda_setattr(struct user_namespace *, struct dentry *, struct iattr *);
 
-/* this file:  heloers */
+/* this file:  helpers */
 char *coda_f2s(struct CodaFid *f);
 int coda_iscontrol(const char *name, size_t length);
 
+umode_t coda_inode_type(struct coda_vattr *attr);
 void coda_vattr_to_iattr(struct inode *, struct coda_vattr *);
 void coda_iattr_to_vattr(struct iattr *, struct coda_vattr *);
 unsigned short coda_flags_to_cflags(unsigned short);
_



* [patch 63/87] coda: convert from atomic_t to refcount_t on coda_vm_ops->refcnt
  2021-11-09  2:30 incoming Andrew Morton
                   ` (61 preceding siblings ...)
  2021-11-09  2:34 ` [patch 62/87] coda: avoid doing bad things on inode type changes during revalidation Andrew Morton
@ 2021-11-09  2:34 ` Andrew Morton
  2021-11-09  2:34 ` [patch 64/87] coda: use vmemdup_user to replace the open code Andrew Morton
                   ` (23 subsequent siblings)
  86 siblings, 0 replies; 98+ messages in thread
From: Andrew Morton @ 2021-11-09  2:34 UTC (permalink / raw)
  To: akpm, alex.shi, jaharkes, jing.yangyang, linux-mm, mm-commits,
	tanxin.ctf, torvalds, xiyuyang19, zealci

From: Xiyu Yang <xiyuyang19@fudan.edu.cn>
Subject: coda: convert from atomic_t to refcount_t on coda_vm_ops->refcnt

The refcount_t type and its corresponding API protect reference counters
from accidental underflow and overflow, and from the use-after-free
situations that can follow.
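
As a rough illustration of that protection, here is a minimal userspace
sketch of a saturating counter, assuming C11 atomics.  It is not the
kernel's refcount_t implementation: struct sat_ref and the sat_ref_*()
helpers are hypothetical names invented for this example, and the sketch
only refuses a bad operation where the kernel API also emits a warning.

```c
#include <assert.h>
#include <limits.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical userspace stand-in for a saturating reference counter. */
struct sat_ref {
	atomic_uint count;
};

/* Sentinel value: once reached, the counter is pinned and never freed. */
#define SAT_REF_SATURATED UINT_MAX

static void sat_ref_set(struct sat_ref *r, unsigned int n)
{
	/* A counter must not start out at the saturation sentinel. */
	assert(n != SAT_REF_SATURATED);
	atomic_store(&r->count, n);
}

/*
 * Take a reference.  Returns false and leaves the count untouched when
 * the count is 0 (the object was already released) or saturated.
 */
static bool sat_ref_inc(struct sat_ref *r)
{
	unsigned int old = atomic_load(&r->count);

	do {
		if (old == 0 || old == SAT_REF_SATURATED)
			return false;
	} while (!atomic_compare_exchange_weak(&r->count, &old, old + 1));

	return true;
}

/*
 * Drop a reference.  Returns true only for the caller that dropped the
 * last reference; a count of 0 or a saturated count is never decremented.
 */
static bool sat_ref_dec_and_test(struct sat_ref *r)
{
	unsigned int old = atomic_load(&r->count);

	do {
		if (old == 0 || old == SAT_REF_SATURATED)
			return false;
	} while (!atomic_compare_exchange_weak(&r->count, &old, old - 1));

	return old == 1;
}
```

With a plain atomic counter, an extra decrement wraps around and a later
increment "revives" a freed object.  In the sketch both operations fail
instead, which is the kind of behaviour the conversion below relies on.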

Link: https://lkml.kernel.org/r/20210908140308.18491-8-jaharkes@cs.cmu.edu
Signed-off-by: Xiyu Yang <xiyuyang19@fudan.edu.cn>
Signed-off-by: Xin Tan <tanxin.ctf@gmail.com>
Signed-off-by: Jan Harkes <jaharkes@cs.cmu.edu>
Cc: Alex Shi <alex.shi@linux.alibaba.com>
Cc: Jing Yangyang <jing.yangyang@zte.com.cn>
Cc: Zeal Robot <zealci@zte.com.cn>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 fs/coda/file.c |    9 +++++----
 1 file changed, 5 insertions(+), 4 deletions(-)

--- a/fs/coda/file.c~coda-convert-from-atomic_t-to-refcount_t-on-coda_vm_ops-refcnt
+++ a/fs/coda/file.c
@@ -8,6 +8,7 @@
  * to the Coda project. Contact Peter Braam <coda@cs.cmu.edu>.
  */
 
+#include <linux/refcount.h>
 #include <linux/types.h>
 #include <linux/kernel.h>
 #include <linux/time.h>
@@ -28,7 +29,7 @@
 #include "coda_int.h"
 
 struct coda_vm_ops {
-	atomic_t refcnt;
+	refcount_t refcnt;
 	struct file *coda_file;
 	const struct vm_operations_struct *host_vm_ops;
 	struct vm_operations_struct vm_ops;
@@ -98,7 +99,7 @@ coda_vm_open(struct vm_area_struct *vma)
 	struct coda_vm_ops *cvm_ops =
 		container_of(vma->vm_ops, struct coda_vm_ops, vm_ops);
 
-	atomic_inc(&cvm_ops->refcnt);
+	refcount_inc(&cvm_ops->refcnt);
 
 	if (cvm_ops->host_vm_ops && cvm_ops->host_vm_ops->open)
 		cvm_ops->host_vm_ops->open(vma);
@@ -113,7 +114,7 @@ coda_vm_close(struct vm_area_struct *vma
 	if (cvm_ops->host_vm_ops && cvm_ops->host_vm_ops->close)
 		cvm_ops->host_vm_ops->close(vma);
 
-	if (atomic_dec_and_test(&cvm_ops->refcnt)) {
+	if (refcount_dec_and_test(&cvm_ops->refcnt)) {
 		vma->vm_ops = cvm_ops->host_vm_ops;
 		fput(cvm_ops->coda_file);
 		kfree(cvm_ops);
@@ -189,7 +190,7 @@ coda_file_mmap(struct file *coda_file, s
 		cvm_ops->vm_ops.open = coda_vm_open;
 		cvm_ops->vm_ops.close = coda_vm_close;
 		cvm_ops->coda_file = coda_file;
-		atomic_set(&cvm_ops->refcnt, 1);
+		refcount_set(&cvm_ops->refcnt, 1);
 
 		vma->vm_ops = &cvm_ops->vm_ops;
 	}
_



* [patch 64/87] coda: use vmemdup_user to replace the open code
  2021-11-09  2:30 incoming Andrew Morton
                   ` (62 preceding siblings ...)
  2021-11-09  2:34 ` [patch 63/87] coda: convert from atomic_t to refcount_t on coda_vm_ops->refcnt Andrew Morton
@ 2021-11-09  2:34 ` Andrew Morton
  2021-11-09  2:34 ` [patch 65/87] coda: bump module version to 7.2 Andrew Morton
                   ` (22 subsequent siblings)
  86 siblings, 0 replies; 98+ messages in thread
From: Andrew Morton @ 2021-11-09  2:34 UTC (permalink / raw)
  To: akpm, alex.shi, jaharkes, jing.yangyang, linux-mm, mm-commits,
	tanxin.ctf, torvalds, xiyuyang19, zealci

From: Jing Yangyang <jing.yangyang@zte.com.cn>
Subject: coda: use vmemdup_user to replace the open code

vmemdup_user() is better than duplicating its implementation, so just
replace the open-coded version.

./fs/coda/psdev.c:125:10-18:WARNING:opportunity for vmemdup_user

The issue was detected with the help of Coccinelle.
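
For readers unfamiliar with the pattern, the following is a userspace
sketch of "allocate, copy, and report failure through the returned
pointer".  It only imitates the kernel idiom: err_ptr(), is_err(),
ptr_err() and dup_or_err() are hypothetical names made up for this
example, malloc()/memcpy() stand in for the kernel allocation and user
copy, and a NULL source is used to simulate a faulting copy.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Largest errno value that is encoded into a pointer (assumed here). */
#define MAX_ERRNO 4095

/* Encode a negative errno value as a pointer. */
static inline void *err_ptr(long err)
{
	return (void *)(intptr_t)err;
}

/* True if the pointer lies in the top MAX_ERRNO addresses. */
static inline int is_err(const void *p)
{
	return (uintptr_t)p >= (uintptr_t)-MAX_ERRNO;
}

/* Decode the errno value; only valid for error pointers. */
static inline long ptr_err(const void *p)
{
	assert(is_err(p));
	return (long)(intptr_t)p;
}

/*
 * Hypothetical helper: duplicate len bytes from src.  Returns the new
 * buffer, or an encoded -EFAULT / -ENOMEM on failure, so the caller has
 * a single result to check.
 */
static void *dup_or_err(const void *src, size_t len)
{
	void *p;

	if (!src)
		return err_ptr(-EFAULT);	/* simulated copy fault */

	p = malloc(len ? len : 1);
	if (!p)
		return err_ptr(-ENOMEM);

	memcpy(p, src, len);
	return p;
}
```

A caller then needs one check on the returned pointer instead of
separate allocation and copy error paths, which is the shape the hunk
below gives coda_psdev_write().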

Link: https://lkml.kernel.org/r/20210908140308.18491-9-jaharkes@cs.cmu.edu
Reported-by: Zeal Robot <zealci@zte.com.cn>
Signed-off-by: Jing Yangyang <jing.yangyang@zte.com.cn>
Signed-off-by: Jan Harkes <jaharkes@cs.cmu.edu>
Cc: Alex Shi <alex.shi@linux.alibaba.com>
Cc: Xin Tan <tanxin.ctf@gmail.com>
Cc: Xiyu Yang <xiyuyang19@fudan.edu.cn>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 fs/coda/psdev.c |   12 ++++--------
 1 file changed, 4 insertions(+), 8 deletions(-)

--- a/fs/coda/psdev.c~coda-use-vmemdup_user-to-replace-the-open-code
+++ a/fs/coda/psdev.c
@@ -122,14 +122,10 @@ static ssize_t coda_psdev_write(struct f
 				hdr.opcode, hdr.unique);
 		        nbytes = size;
 		}
-		dcbuf = kvmalloc(nbytes, GFP_KERNEL);
-		if (!dcbuf) {
-			retval = -ENOMEM;
-			goto out;
-		}
-		if (copy_from_user(dcbuf, buf, nbytes)) {
-			kvfree(dcbuf);
-			retval = -EFAULT;
+
+		dcbuf = vmemdup_user(buf, nbytes);
+		if (IS_ERR(dcbuf)) {
+			retval = PTR_ERR(dcbuf);
 			goto out;
 		}
 
_



* [patch 65/87] coda: bump module version to 7.2
  2021-11-09  2:30 incoming Andrew Morton
                   ` (63 preceding siblings ...)
  2021-11-09  2:34 ` [patch 64/87] coda: use vmemdup_user to replace the open code Andrew Morton
@ 2021-11-09  2:34 ` Andrew Morton
  2021-11-09  2:34 ` [patch 66/87] nilfs2: replace snprintf in show functions with sysfs_emit Andrew Morton
                   ` (21 subsequent siblings)
  86 siblings, 0 replies; 98+ messages in thread
From: Andrew Morton @ 2021-11-09  2:34 UTC (permalink / raw)
  To: akpm, alex.shi, jaharkes, jing.yangyang, linux-mm, mm-commits,
	tanxin.ctf, torvalds, xiyuyang19, zealci

From: Jan Harkes <jaharkes@cs.cmu.edu>
Subject: coda: bump module version to 7.2

Bumping the version helps track which patches have been propagated
upstream and whether users are running the latest known version.

Link: https://lkml.kernel.org/r/20210908140308.18491-10-jaharkes@cs.cmu.edu
Signed-off-by: Jan Harkes <jaharkes@cs.cmu.edu>
Cc: Alex Shi <alex.shi@linux.alibaba.com>
Cc: Jing Yangyang <jing.yangyang@zte.com.cn>
Cc: Xin Tan <tanxin.ctf@gmail.com>
Cc: Xiyu Yang <xiyuyang19@fudan.edu.cn>
Cc: Zeal Robot <zealci@zte.com.cn>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 fs/coda/psdev.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- a/fs/coda/psdev.c~coda-bump-module-version-to-72
+++ a/fs/coda/psdev.c
@@ -384,7 +384,7 @@ MODULE_AUTHOR("Jan Harkes, Peter J. Braa
 MODULE_DESCRIPTION("Coda Distributed File System VFS interface");
 MODULE_ALIAS_CHARDEV_MAJOR(CODA_PSDEV_MAJOR);
 MODULE_LICENSE("GPL");
-MODULE_VERSION("7.0");
+MODULE_VERSION("7.2");
 
 static int __init init_coda(void)
 {
_



* [patch 66/87] nilfs2: replace snprintf in show functions with sysfs_emit
  2021-11-09  2:30 incoming Andrew Morton
                   ` (64 preceding siblings ...)
  2021-11-09  2:34 ` [patch 65/87] coda: bump module version to 7.2 Andrew Morton
@ 2021-11-09  2:34 ` Andrew Morton
  2021-11-09  2:35 ` [patch 67/87] nilfs2: remove filenames from file comments Andrew Morton
                   ` (20 subsequent siblings)
  86 siblings, 0 replies; 98+ messages in thread
From: Andrew Morton @ 2021-11-09  2:34 UTC (permalink / raw)
  To: akpm, konishi.ryusuke, linux-mm, mm-commits, torvalds, wangqing

From: Qing Wang <wangqing@vivo.com>
Subject: nilfs2: replace snprintf in show functions with sysfs_emit

Patch series "nilfs2 updates".


This patch (of 2):

coccicheck complains about the use of snprintf() in sysfs show functions.

Fix the coccicheck warning:
WARNING: use scnprintf or sprintf.

Using sysfs_emit() instead of scnprintf() or sprintf() makes more sense.
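
The underlying concern is that snprintf() returns the length the output
would have had, which can exceed the buffer size, while a show function
should return the number of bytes actually stored.  A minimal userspace
sketch of the difference, assuming only standard C: emit_bounded() is a
hypothetical helper written for this example and is not the kernel's
sysfs_emit().

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>

/*
 * Hypothetical bounded formatter.  Returns the number of characters
 * actually stored in buf (excluding the terminating NUL), which is
 * never more than size - 1.
 */
static int emit_bounded(char *buf, size_t size, const char *fmt, ...)
{
	va_list args;
	int len;

	assert(buf != NULL && size > 0);

	va_start(args, fmt);
	len = vsnprintf(buf, size, fmt, args);
	va_end(args);

	if (len < 0) {
		buf[0] = '\0';		/* formatting error: emit nothing */
		return 0;
	}
	if ((size_t)len >= size)
		return (int)(size - 1);	/* output was truncated */

	return len;
}
```

With an 8-byte buffer and an 11-character result, snprintf() reports 11
although only 7 characters fit, whereas emit_bounded() reports 7.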

Link: https://lkml.kernel.org/r/1635151862-11547-1-git-send-email-konishi.ryusuke@gmail.com
Link: https://lkml.kernel.org/r/1634095759-4625-1-git-send-email-wangqing@vivo.com
Link: https://lkml.kernel.org/r/1635151862-11547-2-git-send-email-konishi.ryusuke@gmail.com
Signed-off-by: Qing Wang <wangqing@vivo.com>
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 fs/nilfs2/sysfs.c |   76 ++++++++++++++++++++++----------------------
 1 file changed, 38 insertions(+), 38 deletions(-)

--- a/fs/nilfs2/sysfs.c~nilfs2-replace-snprintf-in-show-functions-with-sysfs_emit
+++ a/fs/nilfs2/sysfs.c
@@ -95,7 +95,7 @@ static ssize_t
 nilfs_snapshot_inodes_count_show(struct nilfs_snapshot_attr *attr,
 				 struct nilfs_root *root, char *buf)
 {
-	return snprintf(buf, PAGE_SIZE, "%llu\n",
+	return sysfs_emit(buf, "%llu\n",
 			(unsigned long long)atomic64_read(&root->inodes_count));
 }
 
@@ -103,7 +103,7 @@ static ssize_t
 nilfs_snapshot_blocks_count_show(struct nilfs_snapshot_attr *attr,
 				 struct nilfs_root *root, char *buf)
 {
-	return snprintf(buf, PAGE_SIZE, "%llu\n",
+	return sysfs_emit(buf, "%llu\n",
 			(unsigned long long)atomic64_read(&root->blocks_count));
 }
 
@@ -116,7 +116,7 @@ static ssize_t
 nilfs_snapshot_README_show(struct nilfs_snapshot_attr *attr,
 			    struct nilfs_root *root, char *buf)
 {
-	return snprintf(buf, PAGE_SIZE, snapshot_readme_str);
+	return sysfs_emit(buf, snapshot_readme_str);
 }
 
 NILFS_SNAPSHOT_RO_ATTR(inodes_count);
@@ -217,7 +217,7 @@ static ssize_t
 nilfs_mounted_snapshots_README_show(struct nilfs_mounted_snapshots_attr *attr,
 				    struct the_nilfs *nilfs, char *buf)
 {
-	return snprintf(buf, PAGE_SIZE, mounted_snapshots_readme_str);
+	return sysfs_emit(buf, mounted_snapshots_readme_str);
 }
 
 NILFS_MOUNTED_SNAPSHOTS_RO_ATTR(README);
@@ -255,7 +255,7 @@ nilfs_checkpoints_checkpoints_number_sho
 
 	ncheckpoints = cpstat.cs_ncps;
 
-	return snprintf(buf, PAGE_SIZE, "%llu\n", ncheckpoints);
+	return sysfs_emit(buf, "%llu\n", ncheckpoints);
 }
 
 static ssize_t
@@ -278,7 +278,7 @@ nilfs_checkpoints_snapshots_number_show(
 
 	nsnapshots = cpstat.cs_nsss;
 
-	return snprintf(buf, PAGE_SIZE, "%llu\n", nsnapshots);
+	return sysfs_emit(buf, "%llu\n", nsnapshots);
 }
 
 static ssize_t
@@ -292,7 +292,7 @@ nilfs_checkpoints_last_seg_checkpoint_sh
 	last_cno = nilfs->ns_last_cno;
 	spin_unlock(&nilfs->ns_last_segment_lock);
 
-	return snprintf(buf, PAGE_SIZE, "%llu\n", last_cno);
+	return sysfs_emit(buf, "%llu\n", last_cno);
 }
 
 static ssize_t
@@ -306,7 +306,7 @@ nilfs_checkpoints_next_checkpoint_show(s
 	cno = nilfs->ns_cno;
 	up_read(&nilfs->ns_segctor_sem);
 
-	return snprintf(buf, PAGE_SIZE, "%llu\n", cno);
+	return sysfs_emit(buf, "%llu\n", cno);
 }
 
 static const char checkpoints_readme_str[] =
@@ -322,7 +322,7 @@ static ssize_t
 nilfs_checkpoints_README_show(struct nilfs_checkpoints_attr *attr,
 				struct the_nilfs *nilfs, char *buf)
 {
-	return snprintf(buf, PAGE_SIZE, checkpoints_readme_str);
+	return sysfs_emit(buf, checkpoints_readme_str);
 }
 
 NILFS_CHECKPOINTS_RO_ATTR(checkpoints_number);
@@ -353,7 +353,7 @@ nilfs_segments_segments_number_show(stru
 				     struct the_nilfs *nilfs,
 				     char *buf)
 {
-	return snprintf(buf, PAGE_SIZE, "%lu\n", nilfs->ns_nsegments);
+	return sysfs_emit(buf, "%lu\n", nilfs->ns_nsegments);
 }
 
 static ssize_t
@@ -361,7 +361,7 @@ nilfs_segments_blocks_per_segment_show(s
 					struct the_nilfs *nilfs,
 					char *buf)
 {
-	return snprintf(buf, PAGE_SIZE, "%lu\n", nilfs->ns_blocks_per_segment);
+	return sysfs_emit(buf, "%lu\n", nilfs->ns_blocks_per_segment);
 }
 
 static ssize_t
@@ -375,7 +375,7 @@ nilfs_segments_clean_segments_show(struc
 	ncleansegs = nilfs_sufile_get_ncleansegs(nilfs->ns_sufile);
 	up_read(&NILFS_MDT(nilfs->ns_dat)->mi_sem);
 
-	return snprintf(buf, PAGE_SIZE, "%lu\n", ncleansegs);
+	return sysfs_emit(buf, "%lu\n", ncleansegs);
 }
 
 static ssize_t
@@ -395,7 +395,7 @@ nilfs_segments_dirty_segments_show(struc
 		return err;
 	}
 
-	return snprintf(buf, PAGE_SIZE, "%llu\n", sustat.ss_ndirtysegs);
+	return sysfs_emit(buf, "%llu\n", sustat.ss_ndirtysegs);
 }
 
 static const char segments_readme_str[] =
@@ -411,7 +411,7 @@ nilfs_segments_README_show(struct nilfs_
 			    struct the_nilfs *nilfs,
 			    char *buf)
 {
-	return snprintf(buf, PAGE_SIZE, segments_readme_str);
+	return sysfs_emit(buf, segments_readme_str);
 }
 
 NILFS_SEGMENTS_RO_ATTR(segments_number);
@@ -448,7 +448,7 @@ nilfs_segctor_last_pseg_block_show(struc
 	last_pseg = nilfs->ns_last_pseg;
 	spin_unlock(&nilfs->ns_last_segment_lock);
 
-	return snprintf(buf, PAGE_SIZE, "%llu\n",
+	return sysfs_emit(buf, "%llu\n",
 			(unsigned long long)last_pseg);
 }
 
@@ -463,7 +463,7 @@ nilfs_segctor_last_seg_sequence_show(str
 	last_seq = nilfs->ns_last_seq;
 	spin_unlock(&nilfs->ns_last_segment_lock);
 
-	return snprintf(buf, PAGE_SIZE, "%llu\n", last_seq);
+	return sysfs_emit(buf, "%llu\n", last_seq);
 }
 
 static ssize_t
@@ -477,7 +477,7 @@ nilfs_segctor_last_seg_checkpoint_show(s
 	last_cno = nilfs->ns_last_cno;
 	spin_unlock(&nilfs->ns_last_segment_lock);
 
-	return snprintf(buf, PAGE_SIZE, "%llu\n", last_cno);
+	return sysfs_emit(buf, "%llu\n", last_cno);
 }
 
 static ssize_t
@@ -491,7 +491,7 @@ nilfs_segctor_current_seg_sequence_show(
 	seg_seq = nilfs->ns_seg_seq;
 	up_read(&nilfs->ns_segctor_sem);
 
-	return snprintf(buf, PAGE_SIZE, "%llu\n", seg_seq);
+	return sysfs_emit(buf, "%llu\n", seg_seq);
 }
 
 static ssize_t
@@ -505,7 +505,7 @@ nilfs_segctor_current_last_full_seg_show
 	segnum = nilfs->ns_segnum;
 	up_read(&nilfs->ns_segctor_sem);
 
-	return snprintf(buf, PAGE_SIZE, "%llu\n", segnum);
+	return sysfs_emit(buf, "%llu\n", segnum);
 }
 
 static ssize_t
@@ -519,7 +519,7 @@ nilfs_segctor_next_full_seg_show(struct
 	nextnum = nilfs->ns_nextnum;
 	up_read(&nilfs->ns_segctor_sem);
 
-	return snprintf(buf, PAGE_SIZE, "%llu\n", nextnum);
+	return sysfs_emit(buf, "%llu\n", nextnum);
 }
 
 static ssize_t
@@ -533,7 +533,7 @@ nilfs_segctor_next_pseg_offset_show(stru
 	pseg_offset = nilfs->ns_pseg_offset;
 	up_read(&nilfs->ns_segctor_sem);
 
-	return snprintf(buf, PAGE_SIZE, "%lu\n", pseg_offset);
+	return sysfs_emit(buf, "%lu\n", pseg_offset);
 }
 
 static ssize_t
@@ -547,7 +547,7 @@ nilfs_segctor_next_checkpoint_show(struc
 	cno = nilfs->ns_cno;
 	up_read(&nilfs->ns_segctor_sem);
 
-	return snprintf(buf, PAGE_SIZE, "%llu\n", cno);
+	return sysfs_emit(buf, "%llu\n", cno);
 }
 
 static ssize_t
@@ -575,7 +575,7 @@ nilfs_segctor_last_seg_write_time_secs_s
 	ctime = nilfs->ns_ctime;
 	up_read(&nilfs->ns_segctor_sem);
 
-	return snprintf(buf, PAGE_SIZE, "%llu\n", ctime);
+	return sysfs_emit(buf, "%llu\n", ctime);
 }
 
 static ssize_t
@@ -603,7 +603,7 @@ nilfs_segctor_last_nongc_write_time_secs
 	nongc_ctime = nilfs->ns_nongc_ctime;
 	up_read(&nilfs->ns_segctor_sem);
 
-	return snprintf(buf, PAGE_SIZE, "%llu\n", nongc_ctime);
+	return sysfs_emit(buf, "%llu\n", nongc_ctime);
 }
 
 static ssize_t
@@ -617,7 +617,7 @@ nilfs_segctor_dirty_data_blocks_count_sh
 	ndirtyblks = atomic_read(&nilfs->ns_ndirtyblks);
 	up_read(&nilfs->ns_segctor_sem);
 
-	return snprintf(buf, PAGE_SIZE, "%u\n", ndirtyblks);
+	return sysfs_emit(buf, "%u\n", ndirtyblks);
 }
 
 static const char segctor_readme_str[] =
@@ -654,7 +654,7 @@ static ssize_t
 nilfs_segctor_README_show(struct nilfs_segctor_attr *attr,
 			  struct the_nilfs *nilfs, char *buf)
 {
-	return snprintf(buf, PAGE_SIZE, segctor_readme_str);
+	return sysfs_emit(buf, segctor_readme_str);
 }
 
 NILFS_SEGCTOR_RO_ATTR(last_pseg_block);
@@ -723,7 +723,7 @@ nilfs_superblock_sb_write_time_secs_show
 	sbwtime = nilfs->ns_sbwtime;
 	up_read(&nilfs->ns_sem);
 
-	return snprintf(buf, PAGE_SIZE, "%llu\n", sbwtime);
+	return sysfs_emit(buf, "%llu\n", sbwtime);
 }
 
 static ssize_t
@@ -737,7 +737,7 @@ nilfs_superblock_sb_write_count_show(str
 	sbwcount = nilfs->ns_sbwcount;
 	up_read(&nilfs->ns_sem);
 
-	return snprintf(buf, PAGE_SIZE, "%u\n", sbwcount);
+	return sysfs_emit(buf, "%u\n", sbwcount);
 }
 
 static ssize_t
@@ -751,7 +751,7 @@ nilfs_superblock_sb_update_frequency_sho
 	sb_update_freq = nilfs->ns_sb_update_freq;
 	up_read(&nilfs->ns_sem);
 
-	return snprintf(buf, PAGE_SIZE, "%u\n", sb_update_freq);
+	return sysfs_emit(buf, "%u\n", sb_update_freq);
 }
 
 static ssize_t
@@ -799,7 +799,7 @@ static ssize_t
 nilfs_superblock_README_show(struct nilfs_superblock_attr *attr,
 				struct the_nilfs *nilfs, char *buf)
 {
-	return snprintf(buf, PAGE_SIZE, sb_readme_str);
+	return sysfs_emit(buf, sb_readme_str);
 }
 
 NILFS_SUPERBLOCK_RO_ATTR(sb_write_time);
@@ -834,7 +834,7 @@ ssize_t nilfs_dev_revision_show(struct n
 	u32 major = le32_to_cpu(sbp[0]->s_rev_level);
 	u16 minor = le16_to_cpu(sbp[0]->s_minor_rev_level);
 
-	return snprintf(buf, PAGE_SIZE, "%d.%d\n", major, minor);
+	return sysfs_emit(buf, "%d.%d\n", major, minor);
 }
 
 static
@@ -842,7 +842,7 @@ ssize_t nilfs_dev_blocksize_show(struct
 				 struct the_nilfs *nilfs,
 				 char *buf)
 {
-	return snprintf(buf, PAGE_SIZE, "%u\n", nilfs->ns_blocksize);
+	return sysfs_emit(buf, "%u\n", nilfs->ns_blocksize);
 }
 
 static
@@ -853,7 +853,7 @@ ssize_t nilfs_dev_device_size_show(struc
 	struct nilfs_super_block **sbp = nilfs->ns_sbp;
 	u64 dev_size = le64_to_cpu(sbp[0]->s_dev_size);
 
-	return snprintf(buf, PAGE_SIZE, "%llu\n", dev_size);
+	return sysfs_emit(buf, "%llu\n", dev_size);
 }
 
 static
@@ -864,7 +864,7 @@ ssize_t nilfs_dev_free_blocks_show(struc
 	sector_t free_blocks = 0;
 
 	nilfs_count_free_blocks(nilfs, &free_blocks);
-	return snprintf(buf, PAGE_SIZE, "%llu\n",
+	return sysfs_emit(buf, "%llu\n",
 			(unsigned long long)free_blocks);
 }
 
@@ -875,7 +875,7 @@ ssize_t nilfs_dev_uuid_show(struct nilfs
 {
 	struct nilfs_super_block **sbp = nilfs->ns_sbp;
 
-	return snprintf(buf, PAGE_SIZE, "%pUb\n", sbp[0]->s_uuid);
+	return sysfs_emit(buf, "%pUb\n", sbp[0]->s_uuid);
 }
 
 static
@@ -903,7 +903,7 @@ static ssize_t nilfs_dev_README_show(str
 				     struct the_nilfs *nilfs,
 				     char *buf)
 {
-	return snprintf(buf, PAGE_SIZE, dev_readme_str);
+	return sysfs_emit(buf, dev_readme_str);
 }
 
 NILFS_DEV_RO_ATTR(revision);
@@ -1047,7 +1047,7 @@ void nilfs_sysfs_delete_device_group(str
 static ssize_t nilfs_feature_revision_show(struct kobject *kobj,
 					    struct attribute *attr, char *buf)
 {
-	return snprintf(buf, PAGE_SIZE, "%d.%d\n",
+	return sysfs_emit(buf, "%d.%d\n",
 			NILFS_CURRENT_REV, NILFS_MINOR_REV);
 }
 
@@ -1060,7 +1060,7 @@ static ssize_t nilfs_feature_README_show
 					 struct attribute *attr,
 					 char *buf)
 {
-	return snprintf(buf, PAGE_SIZE, features_readme_str);
+	return sysfs_emit(buf, features_readme_str);
 }
 
 NILFS_FEATURE_RO_ATTR(revision);
_



* [patch 67/87] nilfs2: remove filenames from file comments
  2021-11-09  2:30 incoming Andrew Morton
                   ` (65 preceding siblings ...)
  2021-11-09  2:34 ` [patch 66/87] nilfs2: replace snprintf in show functions with sysfs_emit Andrew Morton
@ 2021-11-09  2:35 ` Andrew Morton
  2021-11-09  2:35 ` [patch 68/87] hfs/hfsplus: use WARN_ON for sanity check Andrew Morton
                   ` (19 subsequent siblings)
  86 siblings, 0 replies; 98+ messages in thread
From: Andrew Morton @ 2021-11-09  2:35 UTC (permalink / raw)
  To: akpm, konishi.ryusuke, linux-mm, mm-commits, torvalds, wangqing

From: Ryusuke Konishi <konishi.ryusuke@gmail.com>
Subject: nilfs2: remove filenames from file comments

Remove filenames that are not particularly useful in file comments, and
suppress checkpatch warnings "WARNING: It's generally not useful to have
the filename in the file".

Link: https://lkml.kernel.org/r/1635151862-11547-3-git-send-email-konishi.ryusuke@gmail.com
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@gmail.com>
Cc: Qing Wang <wangqing@vivo.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 fs/nilfs2/alloc.c     |    2 +-
 fs/nilfs2/alloc.h     |    2 +-
 fs/nilfs2/bmap.c      |    2 +-
 fs/nilfs2/bmap.h      |    2 +-
 fs/nilfs2/btnode.c    |    2 +-
 fs/nilfs2/btnode.h    |    2 +-
 fs/nilfs2/btree.c     |    2 +-
 fs/nilfs2/btree.h     |    2 +-
 fs/nilfs2/cpfile.c    |    2 +-
 fs/nilfs2/cpfile.h    |    2 +-
 fs/nilfs2/dat.c       |    2 +-
 fs/nilfs2/dat.h       |    2 +-
 fs/nilfs2/dir.c       |    2 +-
 fs/nilfs2/direct.c    |    2 +-
 fs/nilfs2/direct.h    |    2 +-
 fs/nilfs2/file.c      |    2 +-
 fs/nilfs2/gcinode.c   |    2 +-
 fs/nilfs2/ifile.c     |    2 +-
 fs/nilfs2/ifile.h     |    2 +-
 fs/nilfs2/inode.c     |    2 +-
 fs/nilfs2/ioctl.c     |    2 +-
 fs/nilfs2/mdt.c       |    2 +-
 fs/nilfs2/mdt.h       |    2 +-
 fs/nilfs2/namei.c     |    2 +-
 fs/nilfs2/nilfs.h     |    2 +-
 fs/nilfs2/page.c      |    2 +-
 fs/nilfs2/page.h      |    2 +-
 fs/nilfs2/recovery.c  |    2 +-
 fs/nilfs2/segbuf.c    |    2 +-
 fs/nilfs2/segbuf.h    |    2 +-
 fs/nilfs2/segment.c   |    2 +-
 fs/nilfs2/segment.h   |    2 +-
 fs/nilfs2/sufile.c    |    2 +-
 fs/nilfs2/sufile.h    |    2 +-
 fs/nilfs2/super.c     |    2 +-
 fs/nilfs2/sysfs.c     |    2 +-
 fs/nilfs2/sysfs.h     |    2 +-
 fs/nilfs2/the_nilfs.c |    2 +-
 fs/nilfs2/the_nilfs.h |    2 +-
 39 files changed, 39 insertions(+), 39 deletions(-)

--- a/fs/nilfs2/alloc.c~nilfs2-remove-filenames-from-file-comments
+++ a/fs/nilfs2/alloc.c
@@ -1,6 +1,6 @@
 // SPDX-License-Identifier: GPL-2.0+
 /*
- * alloc.c - NILFS dat/inode allocator
+ * NILFS dat/inode allocator
  *
  * Copyright (C) 2006-2008 Nippon Telegraph and Telephone Corporation.
  *
--- a/fs/nilfs2/alloc.h~nilfs2-remove-filenames-from-file-comments
+++ a/fs/nilfs2/alloc.h
@@ -1,6 +1,6 @@
 /* SPDX-License-Identifier: GPL-2.0+ */
 /*
- * alloc.h - persistent object (dat entry/disk inode) allocator/deallocator
+ * Persistent object (dat entry/disk inode) allocator/deallocator
  *
  * Copyright (C) 2006-2008 Nippon Telegraph and Telephone Corporation.
  *
--- a/fs/nilfs2/bmap.c~nilfs2-remove-filenames-from-file-comments
+++ a/fs/nilfs2/bmap.c
@@ -1,6 +1,6 @@
 // SPDX-License-Identifier: GPL-2.0+
 /*
- * bmap.c - NILFS block mapping.
+ * NILFS block mapping.
  *
  * Copyright (C) 2006-2008 Nippon Telegraph and Telephone Corporation.
  *
--- a/fs/nilfs2/bmap.h~nilfs2-remove-filenames-from-file-comments
+++ a/fs/nilfs2/bmap.h
@@ -1,6 +1,6 @@
 /* SPDX-License-Identifier: GPL-2.0+ */
 /*
- * bmap.h - NILFS block mapping.
+ * NILFS block mapping.
  *
  * Copyright (C) 2006-2008 Nippon Telegraph and Telephone Corporation.
  *
--- a/fs/nilfs2/btnode.c~nilfs2-remove-filenames-from-file-comments
+++ a/fs/nilfs2/btnode.c
@@ -1,6 +1,6 @@
 // SPDX-License-Identifier: GPL-2.0+
 /*
- * btnode.c - NILFS B-tree node cache
+ * NILFS B-tree node cache
  *
  * Copyright (C) 2005-2008 Nippon Telegraph and Telephone Corporation.
  *
--- a/fs/nilfs2/btnode.h~nilfs2-remove-filenames-from-file-comments
+++ a/fs/nilfs2/btnode.h
@@ -1,6 +1,6 @@
 /* SPDX-License-Identifier: GPL-2.0+ */
 /*
- * btnode.h - NILFS B-tree node cache
+ * NILFS B-tree node cache
  *
  * Copyright (C) 2005-2008 Nippon Telegraph and Telephone Corporation.
  *
--- a/fs/nilfs2/btree.c~nilfs2-remove-filenames-from-file-comments
+++ a/fs/nilfs2/btree.c
@@ -1,6 +1,6 @@
 // SPDX-License-Identifier: GPL-2.0+
 /*
- * btree.c - NILFS B-tree.
+ * NILFS B-tree.
  *
  * Copyright (C) 2005-2008 Nippon Telegraph and Telephone Corporation.
  *
--- a/fs/nilfs2/btree.h~nilfs2-remove-filenames-from-file-comments
+++ a/fs/nilfs2/btree.h
@@ -1,6 +1,6 @@
 /* SPDX-License-Identifier: GPL-2.0+ */
 /*
- * btree.h - NILFS B-tree.
+ * NILFS B-tree.
  *
  * Copyright (C) 2005-2008 Nippon Telegraph and Telephone Corporation.
  *
--- a/fs/nilfs2/cpfile.c~nilfs2-remove-filenames-from-file-comments
+++ a/fs/nilfs2/cpfile.c
@@ -1,6 +1,6 @@
 // SPDX-License-Identifier: GPL-2.0+
 /*
- * cpfile.c - NILFS checkpoint file.
+ * NILFS checkpoint file.
  *
  * Copyright (C) 2006-2008 Nippon Telegraph and Telephone Corporation.
  *
--- a/fs/nilfs2/cpfile.h~nilfs2-remove-filenames-from-file-comments
+++ a/fs/nilfs2/cpfile.h
@@ -1,6 +1,6 @@
 /* SPDX-License-Identifier: GPL-2.0+ */
 /*
- * cpfile.h - NILFS checkpoint file.
+ * NILFS checkpoint file.
  *
  * Copyright (C) 2006-2008 Nippon Telegraph and Telephone Corporation.
  *
--- a/fs/nilfs2/dat.c~nilfs2-remove-filenames-from-file-comments
+++ a/fs/nilfs2/dat.c
@@ -1,6 +1,6 @@
 // SPDX-License-Identifier: GPL-2.0+
 /*
- * dat.c - NILFS disk address translation.
+ * NILFS disk address translation.
  *
  * Copyright (C) 2006-2008 Nippon Telegraph and Telephone Corporation.
  *
--- a/fs/nilfs2/dat.h~nilfs2-remove-filenames-from-file-comments
+++ a/fs/nilfs2/dat.h
@@ -1,6 +1,6 @@
 /* SPDX-License-Identifier: GPL-2.0+ */
 /*
- * dat.h - NILFS disk address translation.
+ * NILFS disk address translation.
  *
  * Copyright (C) 2006-2008 Nippon Telegraph and Telephone Corporation.
  *
--- a/fs/nilfs2/dir.c~nilfs2-remove-filenames-from-file-comments
+++ a/fs/nilfs2/dir.c
@@ -1,6 +1,6 @@
 // SPDX-License-Identifier: GPL-2.0+
 /*
- * dir.c - NILFS directory entry operations
+ * NILFS directory entry operations
  *
  * Copyright (C) 2005-2008 Nippon Telegraph and Telephone Corporation.
  *
--- a/fs/nilfs2/direct.c~nilfs2-remove-filenames-from-file-comments
+++ a/fs/nilfs2/direct.c
@@ -1,6 +1,6 @@
 // SPDX-License-Identifier: GPL-2.0+
 /*
- * direct.c - NILFS direct block pointer.
+ * NILFS direct block pointer.
  *
  * Copyright (C) 2006-2008 Nippon Telegraph and Telephone Corporation.
  *
--- a/fs/nilfs2/direct.h~nilfs2-remove-filenames-from-file-comments
+++ a/fs/nilfs2/direct.h
@@ -1,6 +1,6 @@
 /* SPDX-License-Identifier: GPL-2.0+ */
 /*
- * direct.h - NILFS direct block pointer.
+ * NILFS direct block pointer.
  *
  * Copyright (C) 2006-2008 Nippon Telegraph and Telephone Corporation.
  *
--- a/fs/nilfs2/file.c~nilfs2-remove-filenames-from-file-comments
+++ a/fs/nilfs2/file.c
@@ -1,6 +1,6 @@
 // SPDX-License-Identifier: GPL-2.0+
 /*
- * file.c - NILFS regular file handling primitives including fsync().
+ * NILFS regular file handling primitives including fsync().
  *
  * Copyright (C) 2005-2008 Nippon Telegraph and Telephone Corporation.
  *
--- a/fs/nilfs2/gcinode.c~nilfs2-remove-filenames-from-file-comments
+++ a/fs/nilfs2/gcinode.c
@@ -1,6 +1,6 @@
 // SPDX-License-Identifier: GPL-2.0+
 /*
- * gcinode.c - dummy inodes to buffer blocks for garbage collection
+ * Dummy inodes to buffer blocks for garbage collection
  *
  * Copyright (C) 2005-2008 Nippon Telegraph and Telephone Corporation.
  *
--- a/fs/nilfs2/ifile.c~nilfs2-remove-filenames-from-file-comments
+++ a/fs/nilfs2/ifile.c
@@ -1,6 +1,6 @@
 // SPDX-License-Identifier: GPL-2.0+
 /*
- * ifile.c - NILFS inode file
+ * NILFS inode file
  *
  * Copyright (C) 2006-2008 Nippon Telegraph and Telephone Corporation.
  *
--- a/fs/nilfs2/ifile.h~nilfs2-remove-filenames-from-file-comments
+++ a/fs/nilfs2/ifile.h
@@ -1,6 +1,6 @@
 /* SPDX-License-Identifier: GPL-2.0+ */
 /*
- * ifile.h - NILFS inode file
+ * NILFS inode file
  *
  * Copyright (C) 2006-2008 Nippon Telegraph and Telephone Corporation.
  *
--- a/fs/nilfs2/inode.c~nilfs2-remove-filenames-from-file-comments
+++ a/fs/nilfs2/inode.c
@@ -1,6 +1,6 @@
 // SPDX-License-Identifier: GPL-2.0+
 /*
- * inode.c - NILFS inode operations.
+ * NILFS inode operations.
  *
  * Copyright (C) 2005-2008 Nippon Telegraph and Telephone Corporation.
  *
--- a/fs/nilfs2/ioctl.c~nilfs2-remove-filenames-from-file-comments
+++ a/fs/nilfs2/ioctl.c
@@ -1,6 +1,6 @@
 // SPDX-License-Identifier: GPL-2.0+
 /*
- * ioctl.c - NILFS ioctl operations.
+ * NILFS ioctl operations.
  *
  * Copyright (C) 2007, 2008 Nippon Telegraph and Telephone Corporation.
  *
--- a/fs/nilfs2/mdt.c~nilfs2-remove-filenames-from-file-comments
+++ a/fs/nilfs2/mdt.c
@@ -1,6 +1,6 @@
 // SPDX-License-Identifier: GPL-2.0+
 /*
- * mdt.c - meta data file for NILFS
+ * Meta data file for NILFS
  *
  * Copyright (C) 2005-2008 Nippon Telegraph and Telephone Corporation.
  *
--- a/fs/nilfs2/mdt.h~nilfs2-remove-filenames-from-file-comments
+++ a/fs/nilfs2/mdt.h
@@ -1,6 +1,6 @@
 /* SPDX-License-Identifier: GPL-2.0+ */
 /*
- * mdt.h - NILFS meta data file prototype and definitions
+ * NILFS meta data file prototype and definitions
  *
  * Copyright (C) 2005-2008 Nippon Telegraph and Telephone Corporation.
  *
--- a/fs/nilfs2/namei.c~nilfs2-remove-filenames-from-file-comments
+++ a/fs/nilfs2/namei.c
@@ -1,6 +1,6 @@
 // SPDX-License-Identifier: GPL-2.0+
 /*
- * namei.c - NILFS pathname lookup operations.
+ * NILFS pathname lookup operations.
  *
  * Copyright (C) 2005-2008 Nippon Telegraph and Telephone Corporation.
  *
--- a/fs/nilfs2/nilfs.h~nilfs2-remove-filenames-from-file-comments
+++ a/fs/nilfs2/nilfs.h
@@ -1,6 +1,6 @@
 /* SPDX-License-Identifier: GPL-2.0+ */
 /*
- * nilfs.h - NILFS local header file.
+ * NILFS local header file.
  *
  * Copyright (C) 2005-2008 Nippon Telegraph and Telephone Corporation.
  *
--- a/fs/nilfs2/page.c~nilfs2-remove-filenames-from-file-comments
+++ a/fs/nilfs2/page.c
@@ -1,6 +1,6 @@
 // SPDX-License-Identifier: GPL-2.0+
 /*
- * page.c - buffer/page management specific to NILFS
+ * Buffer/page management specific to NILFS
  *
  * Copyright (C) 2005-2008 Nippon Telegraph and Telephone Corporation.
  *
--- a/fs/nilfs2/page.h~nilfs2-remove-filenames-from-file-comments
+++ a/fs/nilfs2/page.h
@@ -1,6 +1,6 @@
 /* SPDX-License-Identifier: GPL-2.0+ */
 /*
- * page.h - buffer/page management specific to NILFS
+ * Buffer/page management specific to NILFS
  *
  * Copyright (C) 2005-2008 Nippon Telegraph and Telephone Corporation.
  *
--- a/fs/nilfs2/recovery.c~nilfs2-remove-filenames-from-file-comments
+++ a/fs/nilfs2/recovery.c
@@ -1,6 +1,6 @@
 // SPDX-License-Identifier: GPL-2.0+
 /*
- * recovery.c - NILFS recovery logic
+ * NILFS recovery logic
  *
  * Copyright (C) 2005-2008 Nippon Telegraph and Telephone Corporation.
  *
--- a/fs/nilfs2/segbuf.c~nilfs2-remove-filenames-from-file-comments
+++ a/fs/nilfs2/segbuf.c
@@ -1,6 +1,6 @@
 // SPDX-License-Identifier: GPL-2.0+
 /*
- * segbuf.c - NILFS segment buffer
+ * NILFS segment buffer
  *
  * Copyright (C) 2005-2008 Nippon Telegraph and Telephone Corporation.
  *
--- a/fs/nilfs2/segbuf.h~nilfs2-remove-filenames-from-file-comments
+++ a/fs/nilfs2/segbuf.h
@@ -1,6 +1,6 @@
 /* SPDX-License-Identifier: GPL-2.0+ */
 /*
- * segbuf.h - NILFS Segment buffer prototypes and definitions
+ * NILFS Segment buffer prototypes and definitions
  *
  * Copyright (C) 2005-2008 Nippon Telegraph and Telephone Corporation.
  *
--- a/fs/nilfs2/segment.c~nilfs2-remove-filenames-from-file-comments
+++ a/fs/nilfs2/segment.c
@@ -1,6 +1,6 @@
 // SPDX-License-Identifier: GPL-2.0+
 /*
- * segment.c - NILFS segment constructor.
+ * NILFS segment constructor.
  *
  * Copyright (C) 2005-2008 Nippon Telegraph and Telephone Corporation.
  *
--- a/fs/nilfs2/segment.h~nilfs2-remove-filenames-from-file-comments
+++ a/fs/nilfs2/segment.h
@@ -1,6 +1,6 @@
 /* SPDX-License-Identifier: GPL-2.0+ */
 /*
- * segment.h - NILFS Segment constructor prototypes and definitions
+ * NILFS Segment constructor prototypes and definitions
  *
  * Copyright (C) 2005-2008 Nippon Telegraph and Telephone Corporation.
  *
--- a/fs/nilfs2/sufile.c~nilfs2-remove-filenames-from-file-comments
+++ a/fs/nilfs2/sufile.c
@@ -1,6 +1,6 @@
 // SPDX-License-Identifier: GPL-2.0+
 /*
- * sufile.c - NILFS segment usage file.
+ * NILFS segment usage file.
  *
  * Copyright (C) 2006-2008 Nippon Telegraph and Telephone Corporation.
  *
--- a/fs/nilfs2/sufile.h~nilfs2-remove-filenames-from-file-comments
+++ a/fs/nilfs2/sufile.h
@@ -1,6 +1,6 @@
 /* SPDX-License-Identifier: GPL-2.0+ */
 /*
- * sufile.h - NILFS segment usage file.
+ * NILFS segment usage file.
  *
  * Copyright (C) 2006-2008 Nippon Telegraph and Telephone Corporation.
  *
--- a/fs/nilfs2/super.c~nilfs2-remove-filenames-from-file-comments
+++ a/fs/nilfs2/super.c
@@ -1,6 +1,6 @@
 // SPDX-License-Identifier: GPL-2.0+
 /*
- * super.c - NILFS module and super block management.
+ * NILFS module and super block management.
  *
  * Copyright (C) 2005-2008 Nippon Telegraph and Telephone Corporation.
  *
--- a/fs/nilfs2/sysfs.c~nilfs2-remove-filenames-from-file-comments
+++ a/fs/nilfs2/sysfs.c
@@ -1,6 +1,6 @@
 // SPDX-License-Identifier: GPL-2.0+
 /*
- * sysfs.c - sysfs support implementation.
+ * Sysfs support implementation.
  *
  * Copyright (C) 2005-2014 Nippon Telegraph and Telephone Corporation.
  * Copyright (C) 2014 HGST, Inc., a Western Digital Company.
--- a/fs/nilfs2/sysfs.h~nilfs2-remove-filenames-from-file-comments
+++ a/fs/nilfs2/sysfs.h
@@ -1,6 +1,6 @@
 /* SPDX-License-Identifier: GPL-2.0+ */
 /*
- * sysfs.h - sysfs support declarations.
+ * Sysfs support declarations.
  *
  * Copyright (C) 2005-2014 Nippon Telegraph and Telephone Corporation.
  * Copyright (C) 2014 HGST, Inc., a Western Digital Company.
--- a/fs/nilfs2/the_nilfs.c~nilfs2-remove-filenames-from-file-comments
+++ a/fs/nilfs2/the_nilfs.c
@@ -1,6 +1,6 @@
 // SPDX-License-Identifier: GPL-2.0+
 /*
- * the_nilfs.c - the_nilfs shared structure.
+ * the_nilfs shared structure.
  *
  * Copyright (C) 2005-2008 Nippon Telegraph and Telephone Corporation.
  *
--- a/fs/nilfs2/the_nilfs.h~nilfs2-remove-filenames-from-file-comments
+++ a/fs/nilfs2/the_nilfs.h
@@ -1,6 +1,6 @@
 /* SPDX-License-Identifier: GPL-2.0+ */
 /*
- * the_nilfs.h - the_nilfs shared structure.
+ * the_nilfs shared structure.
  *
  * Copyright (C) 2005-2008 Nippon Telegraph and Telephone Corporation.
  *
_



* [patch 68/87] hfs/hfsplus: use WARN_ON for sanity check
  2021-11-09  2:30 incoming Andrew Morton
                   ` (66 preceding siblings ...)
  2021-11-09  2:35 ` [patch 67/87] nilfs2: remove filenames from file comments Andrew Morton
@ 2021-11-09  2:35 ` Andrew Morton
  2021-11-09  2:35 ` [patch 69/87] crash_dump: fix boolreturn.cocci warning Andrew Morton
                   ` (18 subsequent siblings)
  86 siblings, 0 replies; 98+ messages in thread
From: Andrew Morton @ 2021-11-09  2:35 UTC (permalink / raw)
  To: akpm, arnd, christian.brauner, gregkh, jack, linux-mm,
	mm-commits, torvalds, viro

From: Arnd Bergmann <arnd@arndb.de>
Subject: hfs/hfsplus: use WARN_ON for sanity check

gcc warns about a couple of instances in which a sanity check exists but
the author wasn't sure how to react to it failing, which makes it look
like a possible bug:

fs/hfsplus/inode.c: In function 'hfsplus_cat_read_inode':
fs/hfsplus/inode.c:503:37: error: suggest braces around empty body in an 'if' statement [-Werror=empty-body]
  503 |                         /* panic? */;
      |                                     ^
fs/hfsplus/inode.c:524:37: error: suggest braces around empty body in an 'if' statement [-Werror=empty-body]
  524 |                         /* panic? */;
      |                                     ^
fs/hfsplus/inode.c: In function 'hfsplus_cat_write_inode':
fs/hfsplus/inode.c:582:37: error: suggest braces around empty body in an 'if' statement [-Werror=empty-body]
  582 |                         /* panic? */;
      |                                     ^
fs/hfsplus/inode.c:608:37: error: suggest braces around empty body in an 'if' statement [-Werror=empty-body]
  608 |                         /* panic? */;
      |                                     ^
fs/hfs/inode.c: In function 'hfs_write_inode':
fs/hfs/inode.c:464:37: error: suggest braces around empty body in an 'if' statement [-Werror=empty-body]
  464 |                         /* panic? */;
      |                                     ^
fs/hfs/inode.c:485:37: error: suggest braces around empty body in an 'if' statement [-Werror=empty-body]
  485 |                         /* panic? */;
      |                                     ^

panic() is probably not the correct choice here, but a WARN_ON
seems appropriate and avoids the compile-time warning.
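
For readers outside the kernel tree, the difference between the silent
empty-body check and a reporting one can be sketched in userspace.  This
is only an illustration under stated assumptions: DEMO_WARN_ON,
demo_warn_on(), struct demo_record and demo_entry_too_short() are
hypothetical names, and the real WARN_ON() also dumps a stack trace.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/*
 * Userspace stand-in for the kernel's WARN_ON(): report the failed
 * condition and return whether it was true.  DEMO_WARN_ON is a
 * hypothetical name used only in this sketch.
 */
#define DEMO_WARN_ON(cond) demo_warn_on((cond), #cond, __FILE__, __LINE__)

static bool demo_warn_on(bool cond, const char *expr, const char *file,
			 int line)
{
	if (cond)
		fprintf(stderr, "WARNING: %s at %s:%d\n", expr, file, line);
	return cond;
}

/* Hypothetical 70-byte record standing in for a catalog entry. */
struct demo_record {
	unsigned char bytes[70];
};

/* Returns true when the entry is too short; the check is never silent. */
static bool demo_entry_too_short(size_t entrylength)
{
	return DEMO_WARN_ON(entrylength < sizeof(struct demo_record));
}
```

Unlike "if (cond) /* panic? */;", the statement has a visible effect when
the check fails, so the compiler has no empty body to complain about.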

Link: https://lkml.kernel.org/r/20210927102149.1809384-1-arnd@kernel.org
Link: https://lore.kernel.org/all/20210322223249.2632268-1-arnd@kernel.org/
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Christian Brauner <christian.brauner@ubuntu.com>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: Christian Brauner <christian.brauner@ubuntu.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Jan Kara <jack@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 fs/hfs/inode.c     |    6 ++----
 fs/hfsplus/inode.c |   12 ++++--------
 2 files changed, 6 insertions(+), 12 deletions(-)

--- a/fs/hfs/inode.c~hfs-hfsplus-use-warn_on-for-sanity-check
+++ a/fs/hfs/inode.c
@@ -462,8 +462,7 @@ int hfs_write_inode(struct inode *inode,
 		goto out;
 
 	if (S_ISDIR(main_inode->i_mode)) {
-		if (fd.entrylength < sizeof(struct hfs_cat_dir))
-			/* panic? */;
+		WARN_ON(fd.entrylength < sizeof(struct hfs_cat_dir));
 		hfs_bnode_read(fd.bnode, &rec, fd.entryoffset,
 			   sizeof(struct hfs_cat_dir));
 		if (rec.type != HFS_CDR_DIR ||
@@ -483,8 +482,7 @@ int hfs_write_inode(struct inode *inode,
 		hfs_bnode_write(fd.bnode, &rec, fd.entryoffset,
 				sizeof(struct hfs_cat_file));
 	} else {
-		if (fd.entrylength < sizeof(struct hfs_cat_file))
-			/* panic? */;
+		WARN_ON(fd.entrylength < sizeof(struct hfs_cat_file));
 		hfs_bnode_read(fd.bnode, &rec, fd.entryoffset,
 			   sizeof(struct hfs_cat_file));
 		if (rec.type != HFS_CDR_FIL ||
--- a/fs/hfsplus/inode.c~hfs-hfsplus-use-warn_on-for-sanity-check
+++ a/fs/hfsplus/inode.c
@@ -509,8 +509,7 @@ int hfsplus_cat_read_inode(struct inode
 	if (type == HFSPLUS_FOLDER) {
 		struct hfsplus_cat_folder *folder = &entry.folder;
 
-		if (fd->entrylength < sizeof(struct hfsplus_cat_folder))
-			/* panic? */;
+		WARN_ON(fd->entrylength < sizeof(struct hfsplus_cat_folder));
 		hfs_bnode_read(fd->bnode, &entry, fd->entryoffset,
 					sizeof(struct hfsplus_cat_folder));
 		hfsplus_get_perms(inode, &folder->permissions, 1);
@@ -530,8 +529,7 @@ int hfsplus_cat_read_inode(struct inode
 	} else if (type == HFSPLUS_FILE) {
 		struct hfsplus_cat_file *file = &entry.file;
 
-		if (fd->entrylength < sizeof(struct hfsplus_cat_file))
-			/* panic? */;
+		WARN_ON(fd->entrylength < sizeof(struct hfsplus_cat_file));
 		hfs_bnode_read(fd->bnode, &entry, fd->entryoffset,
 					sizeof(struct hfsplus_cat_file));
 
@@ -588,8 +586,7 @@ int hfsplus_cat_write_inode(struct inode
 	if (S_ISDIR(main_inode->i_mode)) {
 		struct hfsplus_cat_folder *folder = &entry.folder;
 
-		if (fd.entrylength < sizeof(struct hfsplus_cat_folder))
-			/* panic? */;
+		WARN_ON(fd.entrylength < sizeof(struct hfsplus_cat_folder));
 		hfs_bnode_read(fd.bnode, &entry, fd.entryoffset,
 					sizeof(struct hfsplus_cat_folder));
 		/* simple node checks? */
@@ -614,8 +611,7 @@ int hfsplus_cat_write_inode(struct inode
 	} else {
 		struct hfsplus_cat_file *file = &entry.file;
 
-		if (fd.entrylength < sizeof(struct hfsplus_cat_file))
-			/* panic? */;
+		WARN_ON(fd.entrylength < sizeof(struct hfsplus_cat_file));
 		hfs_bnode_read(fd.bnode, &entry, fd.entryoffset,
 					sizeof(struct hfsplus_cat_file));
 		hfsplus_inode_write_fork(inode, &file->data_fork);
_



* [patch 69/87] crash_dump: fix boolreturn.cocci warning
  2021-11-09  2:30 incoming Andrew Morton
                   ` (67 preceding siblings ...)
  2021-11-09  2:35 ` [patch 68/87] hfs/hfsplus: use WARN_ON for sanity check Andrew Morton
@ 2021-11-09  2:35 ` Andrew Morton
  2021-11-09  2:35 ` [patch 70/87] crash_dump: remove duplicate include in crash_dump.h Andrew Morton
                   ` (17 subsequent siblings)
  86 siblings, 0 replies; 98+ messages in thread
From: Andrew Morton @ 2021-11-09  2:35 UTC (permalink / raw)
  To: akpm, bhe, deng.changcheng, dyoung, horms, linux-mm, mm-commits,
	rppt, torvalds, vgoyal, ye.guojin, zealci

From: Changcheng Deng <deng.changcheng@zte.com.cn>
Subject: crash_dump: fix boolreturn.cocci warning

./include/linux/crash_dump.h: 119: 50-51: WARNING: return of 0/1 in
function 'is_kdump_kernel' with return type bool

Return statements in functions returning bool should use true/false
instead of 1/0.

Link: https://lkml.kernel.org/r/20211020083905.1037952-1-deng.changcheng@zte.com.cn
Signed-off-by: Changcheng Deng <deng.changcheng@zte.com.cn>
Reported-by: Zeal Robot <zealci@zte.com.cn>
Reviewed-by: Simon Horman <horms@kernel.org>
Acked-by: Baoquan He <bhe@redhat.com>
Cc: Dave Young <dyoung@redhat.com>
Cc: Mike Rapoport <rppt@linux.ibm.com>
Cc: Vivek Goyal <vgoyal@redhat.com>
Cc: Ye Guojin <ye.guojin@zte.com.cn>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 include/linux/crash_dump.h |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- a/include/linux/crash_dump.h~crash_dump-fix-boolreturncocci-warning
+++ a/include/linux/crash_dump.h
@@ -116,7 +116,7 @@ extern void register_vmcore_cb(struct vm
 extern void unregister_vmcore_cb(struct vmcore_cb *cb);
 
 #else /* !CONFIG_CRASH_DUMP */
-static inline bool is_kdump_kernel(void) { return 0; }
+static inline bool is_kdump_kernel(void) { return false; }
 #endif /* CONFIG_CRASH_DUMP */
 
 /* Device Dump information to be filled by drivers */
_



* [patch 70/87] crash_dump: remove duplicate include in crash_dump.h
  2021-11-09  2:30 incoming Andrew Morton
                   ` (68 preceding siblings ...)
  2021-11-09  2:35 ` [patch 69/87] crash_dump: fix boolreturn.cocci warning Andrew Morton
@ 2021-11-09  2:35 ` Andrew Morton
  2021-11-09  2:35 ` [patch 71/87] signal: remove duplicate include in signal.h Andrew Morton
                   ` (16 subsequent siblings)
  86 siblings, 0 replies; 98+ messages in thread
From: Andrew Morton @ 2021-11-09  2:35 UTC (permalink / raw)
  To: akpm, bhe, deng.changcheng, dyoung, horms, linux-mm, mm-commits,
	rppt, torvalds, vgoyal, ye.guojin, zealci

From: Ye Guojin <ye.guojin@zte.com.cn>
Subject: crash_dump: remove duplicate include in crash_dump.h

In crash_dump.h, header file <linux/pgtable.h> is included twice.  This
duplication was introduced in commit 65fddcfca8ad ("mm: reorder includes
after introduction of linux/pgtable.h") where the order of the header
files was adjusted but the old include was not removed.

Clean it up here.

Link: https://lkml.kernel.org/r/20211020090659.1038877-1-ye.guojin@zte.com.cn
Signed-off-by: Ye Guojin <ye.guojin@zte.com.cn>
Reported-by: Zeal Robot <zealci@zte.com.cn>
Acked-by: Baoquan He <bhe@redhat.com>
Cc: Dave Young <dyoung@redhat.com>
Cc: Mike Rapoport <rppt@linux.ibm.com>
Cc: Vivek Goyal <vgoyal@redhat.com>
Cc: Changcheng Deng <deng.changcheng@zte.com.cn>
Cc: Simon Horman <horms@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 include/linux/crash_dump.h |    2 --
 1 file changed, 2 deletions(-)

--- a/include/linux/crash_dump.h~crash_dump-remove-duplicate-include-in-crash_dumph
+++ a/include/linux/crash_dump.h
@@ -8,8 +8,6 @@
 #include <linux/pgtable.h>
 #include <uapi/linux/vmcore.h>
 
-#include <linux/pgtable.h> /* for pgprot_t */
-
 /* For IS_ENABLED(CONFIG_CRASH_DUMP) */
 #define ELFCORE_ADDR_MAX	(-1ULL)
 #define ELFCORE_ADDR_ERR	(-2ULL)
_



* [patch 71/87] signal: remove duplicate include in signal.h
  2021-11-09  2:30 incoming Andrew Morton
                   ` (69 preceding siblings ...)
  2021-11-09  2:35 ` [patch 70/87] crash_dump: remove duplicate include in crash_dump.h Andrew Morton
@ 2021-11-09  2:35 ` Andrew Morton
  2021-11-09  2:35 ` [patch 72/87] seq_file: move seq_escape() to a header Andrew Morton
                   ` (15 subsequent siblings)
  86 siblings, 0 replies; 98+ messages in thread
From: Andrew Morton @ 2021-11-09  2:35 UTC (permalink / raw)
  To: akpm, linux-mm, mm-commits, torvalds, ye.guojin, zealci

From: Ye Guojin <ye.guojin@zte.com.cn>
Subject: signal: remove duplicate include in signal.h

'linux/string.h' is included twice in 'signal.h': the include removed
here duplicates the one at line 7.

Link: https://lkml.kernel.org/r/20211019024934.973008-1-ye.guojin@zte.com.cn
Signed-off-by: Ye Guojin <ye.guojin@zte.com.cn>
Reported-by: Zeal Robot <zealci@zte.com.cn>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 include/linux/signal.h |    1 -
 1 file changed, 1 deletion(-)

--- a/include/linux/signal.h~signal-remove-duplicate-include-in-signalh
+++ a/include/linux/signal.h
@@ -126,7 +126,6 @@ static inline int sigequalsets(const sig
 #define sigmask(sig)	(1UL << ((sig) - 1))
 
 #ifndef __HAVE_ARCH_SIG_SETOPS
-#include <linux/string.h>
 
 #define _SIG_SET_BINOP(name, op)					\
 static inline void name(sigset_t *r, const sigset_t *a, const sigset_t *b) \
_



* [patch 72/87] seq_file: move seq_escape() to a header
  2021-11-09  2:30 incoming Andrew Morton
                   ` (70 preceding siblings ...)
  2021-11-09  2:35 ` [patch 71/87] signal: remove duplicate include in signal.h Andrew Morton
@ 2021-11-09  2:35 ` Andrew Morton
  2021-11-09  2:35 ` [patch 73/87] seq_file: fix passing wrong private data Andrew Morton
                   ` (14 subsequent siblings)
  86 siblings, 0 replies; 98+ messages in thread
From: Andrew Morton @ 2021-11-09  2:35 UTC (permalink / raw)
  To: akpm, andriy.shevchenko, linux-mm, mm-commits, torvalds

From: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Subject: seq_file: move seq_escape() to a header

Move seq_escape() to the header as an inline function, for a small kernel
text size reduction.
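
As a reminder of what the helper does according to its kerneldoc (each
character from @esc is replaced with an octal escape), here is a rough
userspace sketch.  demo_escape_octal() is a hypothetical function written
for this note; it is not the kernel implementation and it reports overflow
through its return value instead of seq_has_overflowed().

```c
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/*
 * Copy src into dst, replacing every character that appears in esc
 * with a backslash followed by three octal digits.  Returns the number
 * of bytes written (excluding the NUL), or -1 if dst is too small.
 * Hypothetical userspace helper, not a kernel API.
 */
static int demo_escape_octal(char *dst, size_t size, const char *src,
			     const char *esc)
{
	size_t out = 0;

	for (; *src; src++) {
		unsigned char c = (unsigned char)*src;
		size_t need = strchr(esc, c) ? 4 : 1;

		if (out + need >= size)
			return -1;	/* keep room for the trailing NUL */
		if (need == 4)
			out += (size_t)snprintf(dst + out, size - out,
						"\\%03o", (unsigned int)c);
		else
			dst[out++] = (char)c;
	}
	if (out >= size)
		return -1;
	dst[out] = '\0';
	return (int)out;
}
```

With an escape set of " \n", the input "a b\n" becomes a\040b\012, since
space is octal 040 and newline is octal 012.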

Link: https://lkml.kernel.org/r/20211001122917.67228-1-andriy.shevchenko@linux.intel.com
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 fs/seq_file.c            |   16 ----------------
 include/linux/seq_file.h |   17 ++++++++++++++++-
 2 files changed, 16 insertions(+), 17 deletions(-)

--- a/fs/seq_file.c~seq_file-move-seq_escape-to-a-header
+++ a/fs/seq_file.c
@@ -383,22 +383,6 @@ void seq_escape_mem(struct seq_file *m,
 }
 EXPORT_SYMBOL(seq_escape_mem);
 
-/**
- *	seq_escape -	print string into buffer, escaping some characters
- *	@m:	target buffer
- *	@s:	string
- *	@esc:	set of characters that need escaping
- *
- *	Puts string into buffer, replacing each occurrence of character from
- *	@esc with usual octal escape.
- *	Use seq_has_overflowed() to check for errors.
- */
-void seq_escape(struct seq_file *m, const char *s, const char *esc)
-{
-	seq_escape_str(m, s, ESCAPE_OCTAL, esc);
-}
-EXPORT_SYMBOL(seq_escape);
-
 void seq_vprintf(struct seq_file *m, const char *f, va_list args)
 {
 	int len;
--- a/include/linux/seq_file.h~seq_file-move-seq_escape-to-a-header
+++ a/include/linux/seq_file.h
@@ -4,6 +4,7 @@
 
 #include <linux/types.h>
 #include <linux/string.h>
+#include <linux/string_helpers.h>
 #include <linux/bug.h>
 #include <linux/mutex.h>
 #include <linux/cpumask.h>
@@ -135,7 +136,21 @@ static inline void seq_escape_str(struct
 	seq_escape_mem(m, src, strlen(src), flags, esc);
 }
 
-void seq_escape(struct seq_file *m, const char *s, const char *esc);
+/**
+ * seq_escape - print string into buffer, escaping some characters
+ * @m: target buffer
+ * @s: NULL-terminated string
+ * @esc: set of characters that need escaping
+ *
+ * Puts string into buffer, replacing each occurrence of character from
+ * @esc with usual octal escape.
+ *
+ * Use seq_has_overflowed() to check for errors.
+ */
+static inline void seq_escape(struct seq_file *m, const char *s, const char *esc)
+{
+	seq_escape_str(m, s, ESCAPE_OCTAL, esc);
+}
 
 void seq_hex_dump(struct seq_file *m, const char *prefix_str, int prefix_type,
 		  int rowsize, int groupsize, const void *buf, size_t len,
_



* [patch 73/87] seq_file: fix passing wrong private data
  2021-11-09  2:30 incoming Andrew Morton
                   ` (71 preceding siblings ...)
  2021-11-09  2:35 ` [patch 72/87] seq_file: move seq_escape() to a header Andrew Morton
@ 2021-11-09  2:35 ` Andrew Morton
  2021-11-09  2:35 ` [patch 74/87] kernel/fork.c: unshare(): use swap() to make code cleaner Andrew Morton
                   ` (13 subsequent siblings)
  86 siblings, 0 replies; 98+ messages in thread
From: Andrew Morton @ 2021-11-09  2:35 UTC (permalink / raw)
  To: adobriyan, akpm, andriy.shevchenko, christian.brauner, linux-mm,
	mm-commits, revest, sfr, songmuchun, torvalds

From: Muchun Song <songmuchun@bytedance.com>
Subject: seq_file: fix passing wrong private data

DEFINE_PROC_SHOW_ATTRIBUTE() is supposed to be used to define a series
of functions and variables to register a proc file easily.  Users can
call proc_create_data() to pass their own private data and get it back
via seq->private in the callback.  Unfortunately, the proc file system
uses PDE_DATA() to get the private data instead of inode->i_private, so
the macro passes the wrong pointer.  Fix it.  Fortunately, there is only
one user of the macro and it does not pass any private data, so this bug
does not break any in-tree code.

Link: https://lkml.kernel.org/r/20211029032638.84884-1-songmuchun@bytedance.com
Fixes: 97a32539b956 ("proc: convert everything to "struct proc_ops"")
Signed-off-by: Muchun Song <songmuchun@bytedance.com>
Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
Cc: Florent Revest <revest@chromium.org>
Cc: Alexey Dobriyan <adobriyan@gmail.com>
Cc: Christian Brauner <christian.brauner@ubuntu.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 include/linux/seq_file.h |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- a/include/linux/seq_file.h~seq_file-fix-passing-wrong-private-data
+++ a/include/linux/seq_file.h
@@ -209,7 +209,7 @@ static const struct file_operations __na
 #define DEFINE_PROC_SHOW_ATTRIBUTE(__name)				\
 static int __name ## _open(struct inode *inode, struct file *file)	\
 {									\
-	return single_open(file, __name ## _show, inode->i_private);	\
+	return single_open(file, __name ## _show, PDE_DATA(inode));	\
 }									\
 									\
 static const struct proc_ops __name ## _proc_ops = {			\
_



* [patch 74/87] kernel/fork.c: unshare(): use swap() to make code cleaner
  2021-11-09  2:30 incoming Andrew Morton
                   ` (72 preceding siblings ...)
  2021-11-09  2:35 ` [patch 73/87] seq_file: fix passing wrong private data Andrew Morton
@ 2021-11-09  2:35 ` Andrew Morton
  2021-11-09  2:35 ` [patch 75/87] sysv: use BUILD_BUG_ON instead of runtime check Andrew Morton
                   ` (12 subsequent siblings)
  86 siblings, 0 replies; 98+ messages in thread
From: Andrew Morton @ 2021-11-09  2:35 UTC (permalink / raw)
  To: akpm, axboe, ebiederm, krisman, legion, linux-mm, mm-commits,
	peterz, ran.xiaokai, torvalds

From: Ran Xiaokai <ran.xiaokai@zte.com.cn>
Subject: kernel/fork.c: unshare(): use swap() to make code cleaner

Use swap() instead of reimplementing it.
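
The pattern can be tried outside the kernel with a small sketch.  It
assumes a GNU C compiler (GCC or Clang) for __typeof__; DEMO_SWAP,
struct demo_files, struct demo_task and demo_install_files() are
hypothetical stand-ins, not the kernel's definitions.

```c
#include <stddef.h>

/*
 * Userspace version of the idea behind the kernel's swap() helper.
 * __typeof__ is a GNU C extension; DEMO_SWAP is a hypothetical name
 * used only for this sketch.
 */
#define DEMO_SWAP(a, b) \
	do { __typeof__(a) tmp_ = (a); (a) = (b); (b) = tmp_; } while (0)

/* Hypothetical stand-ins for struct files_struct and task_struct. */
struct demo_files { int id; };
struct demo_task { struct demo_files *files; };

/*
 * Install *new_files into the task and hand the old table back through
 * the same pointer, mirroring the "if (new_fd) swap(...)" pattern.
 */
static void demo_install_files(struct demo_task *task,
			       struct demo_files **new_files)
{
	if (*new_files)
		DEMO_SWAP(task->files, *new_files);
}
```

After the call the caller's pointer holds the old table, which is what
lets the common cleanup path release it without a separate temporary.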

Link: https://lkml.kernel.org/r/20210909022046.8151-1-ran.xiaokai@zte.com.cn
Signed-off-by: Ran Xiaokai <ran.xiaokai@zte.com.cn>
Cc: Gabriel Krisman Bertazi <krisman@collabora.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Eric W. Biederman <ebiederm@xmission.com>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Alexey Gladkov <legion@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 kernel/fork.c |    9 +++------
 1 file changed, 3 insertions(+), 6 deletions(-)

--- a/kernel/fork.c~unshare-use-swap-to-make-code-cleaner
+++ a/kernel/fork.c
@@ -3027,7 +3027,7 @@ int unshare_fd(unsigned long unshare_fla
 int ksys_unshare(unsigned long unshare_flags)
 {
 	struct fs_struct *fs, *new_fs = NULL;
-	struct files_struct *fd, *new_fd = NULL;
+	struct files_struct *new_fd = NULL;
 	struct cred *new_cred = NULL;
 	struct nsproxy *new_nsproxy = NULL;
 	int do_sysvsem = 0;
@@ -3114,11 +3114,8 @@ int ksys_unshare(unsigned long unshare_f
 			spin_unlock(&fs->lock);
 		}
 
-		if (new_fd) {
-			fd = current->files;
-			current->files = new_fd;
-			new_fd = fd;
-		}
+		if (new_fd)
+			swap(current->files, new_fd);
 
 		task_unlock(current);
 
_



* [patch 75/87] sysv: use BUILD_BUG_ON instead of runtime check
  2021-11-09  2:30 incoming Andrew Morton
                   ` (73 preceding siblings ...)
  2021-11-09  2:35 ` [patch 74/87] kernel/fork.c: unshare(): use swap() to make code cleaner Andrew Morton
@ 2021-11-09  2:35 ` Andrew Morton
  2021-11-09  2:35 ` [patch 76/87] Documentation/kcov: include types.h in the example Andrew Morton
                   ` (11 subsequent siblings)
  86 siblings, 0 replies; 98+ messages in thread
From: Andrew Morton @ 2021-11-09  2:35 UTC (permalink / raw)
  To: akpm, hch, linux-mm, mm-commits, paskripkin, torvalds

From: Pavel Skripkin <paskripkin@gmail.com>
Subject: sysv: use BUILD_BUG_ON instead of runtime check

There were runtime checks on the sizes of struct v7_super_block and struct
sysv_inode.  If one of these checks fails the kernel will panic.  Since
these values are known at compile time let's use BUILD_BUG_ON(), because
it's the standard mechanism for validation checking at build time.
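
The same idea can be shown in plain C11 with _Static_assert, which plays
the role that BUILD_BUG_ON() plays in the kernel.  This is a sketch only:
struct demo_disk_inode and its field names are invented for the example
and do not match struct sysv_inode.

```c
#include <stdint.h>

/*
 * Hypothetical 64-byte on-disk record; the field names are invented
 * for this sketch.
 */
struct demo_disk_inode {
	uint16_t mode;
	uint16_t nlink;
	uint16_t uid;
	uint16_t gid;
	uint32_t size;
	uint8_t  data[40];
	uint32_t atime;
	uint32_t mtime;
	uint32_t ctime;
};

/* Compilation stops here if the layout ever drifts from 64 bytes. */
_Static_assert(sizeof(struct demo_disk_inode) == 64,
	       "demo_disk_inode must be exactly 64 bytes");

/* Runtime accessor, only so the size can be inspected by a caller. */
static unsigned long demo_inode_size(void)
{
	return sizeof(struct demo_disk_inode);
}
```

A layout mistake is then caught by the build rather than by a panic on
the first mount.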

Link: https://lkml.kernel.org/r/20210813123020.22971-1-paskripkin@gmail.com
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Signed-off-by: Pavel Skripkin <paskripkin@gmail.com>
Cc: Christoph Hellwig <hch@lst.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 fs/sysv/super.c |    6 ++----
 1 file changed, 2 insertions(+), 4 deletions(-)

--- a/fs/sysv/super.c~sysv-use-build_bug_on-instead-of-runtime-check
+++ a/fs/sysv/super.c
@@ -474,10 +474,8 @@ static int v7_fill_super(struct super_bl
 	struct sysv_sb_info *sbi;
 	struct buffer_head *bh;
 
-	if (440 != sizeof (struct v7_super_block))
-		panic("V7 FS: bad super-block size");
-	if (64 != sizeof (struct sysv_inode))
-		panic("sysv fs: bad i-node size");
+	BUILD_BUG_ON(sizeof(struct v7_super_block) != 440);
+	BUILD_BUG_ON(sizeof(struct sysv_inode) != 64);
 
 	sbi = kzalloc(sizeof(struct sysv_sb_info), GFP_KERNEL);
 	if (!sbi)
_



* [patch 76/87] Documentation/kcov: include types.h in the example
  2021-11-09  2:30 incoming Andrew Morton
                   ` (74 preceding siblings ...)
  2021-11-09  2:35 ` [patch 75/87] sysv: use BUILD_BUG_ON instead of runtime check Andrew Morton
@ 2021-11-09  2:35 ` Andrew Morton
  2021-11-09  2:35 ` [patch 77/87] Documentation/kcov: define `ip' " Andrew Morton
                   ` (10 subsequent siblings)
  86 siblings, 0 replies; 98+ messages in thread
From: Andrew Morton @ 2021-11-09  2:35 UTC (permalink / raw)
  To: akpm, andreyknvl, bigeasy, dvyukov, elver, linux-mm, mm-commits,
	rostedt, torvalds, williams

From: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Subject: Documentation/kcov: include types.h in the example

Patch series "kcov: PREEMPT_RT fixup + misc", v2.

The last patch in the series is a follow-up to address the PREEMPT_RT
issue within kcov reported by Clark [1].  Patches 1-3 are smaller things
that I noticed while staring at it.  Patch 4 is a small change which makes
the replacement in #5 simpler / more obvious.

[1] https://lkml.kernel.org/r/20210809155909.333073de@theseus.lan


This patch (of 5):

The first example code has includes at the top, the following two
examples share that part.  The last example (remote coverage collection)
requires the linux/types.h header file due to its __aligned_u64 usage.

Add linux/types.h to the topmost example and a comment that the header
files from above are required, as is done in the second example.

Link: https://lkml.kernel.org/r/20210923164741.1859522-1-bigeasy@linutronix.de
Link: https://lore.kernel.org/r/20210830172627.267989-2-bigeasy@linutronix.de
Link: https://lkml.kernel.org/r/20210923164741.1859522-2-bigeasy@linutronix.de
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Acked-by: Dmitry Vyukov <dvyukov@google.com>
Acked-by: Marco Elver <elver@google.com>
Tested-by: Marco Elver <elver@google.com>
Reviewed-by: Andrey Konovalov <andreyknvl@gmail.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Clark Williams <williams@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 Documentation/dev-tools/kcov.rst |    3 +++
 1 file changed, 3 insertions(+)

--- a/Documentation/dev-tools/kcov.rst~documentation-kcov-include-typesh-in-the-example
+++ a/Documentation/dev-tools/kcov.rst
@@ -50,6 +50,7 @@ program using kcov:
     #include <sys/mman.h>
     #include <unistd.h>
     #include <fcntl.h>
+    #include <linux/types.h>
 
     #define KCOV_INIT_TRACE			_IOR('c', 1, unsigned long)
     #define KCOV_ENABLE			_IO('c', 100)
@@ -251,6 +252,8 @@ selectively from different subsystems.
 
 .. code-block:: c
 
+    /* Same includes and defines as above. */
+
     struct kcov_remote_arg {
 	__u32		trace_mode;
 	__u32		area_size;
_



* [patch 77/87] Documentation/kcov: define `ip' in the example
  2021-11-09  2:30 incoming Andrew Morton
                   ` (75 preceding siblings ...)
  2021-11-09  2:35 ` [patch 76/87] Documentation/kcov: include types.h in the example Andrew Morton
@ 2021-11-09  2:35 ` Andrew Morton
  2021-11-09  2:35 ` [patch 78/87] kcov: allocate per-CPU memory on the relevant node Andrew Morton
                   ` (9 subsequent siblings)
  86 siblings, 0 replies; 98+ messages in thread
From: Andrew Morton @ 2021-11-09  2:35 UTC (permalink / raw)
  To: akpm, andreyknvl, bigeasy, dvyukov, elver, linux-mm, mm-commits,
	rostedt, torvalds, williams

From: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Subject: Documentation/kcov: define `ip' in the example

The example code uses the variable `ip' but never declares it.

Declare `ip' as a 64bit variable which is the same type as the array
from which it loads its value.

Link: https://lkml.kernel.org/r/20210923164741.1859522-3-bigeasy@linutronix.de
Link: https://lore.kernel.org/r/20210830172627.267989-3-bigeasy@linutronix.de
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Acked-by: Dmitry Vyukov <dvyukov@google.com>
Acked-by: Marco Elver <elver@google.com>
Tested-by: Marco Elver <elver@google.com>
Reviewed-by: Andrey Konovalov <andreyknvl@gmail.com>
Cc: Clark Williams <williams@redhat.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 Documentation/dev-tools/kcov.rst |    2 ++
 1 file changed, 2 insertions(+)

--- a/Documentation/dev-tools/kcov.rst~documentation-kcov-define-ip-in-the-example
+++ a/Documentation/dev-tools/kcov.rst
@@ -178,6 +178,8 @@ Comparison operands collection is simila
 	/* Read number of comparisons collected. */
 	n = __atomic_load_n(&cover[0], __ATOMIC_RELAXED);
 	for (i = 0; i < n; i++) {
+		uint64_t ip;
+
 		type = cover[i * KCOV_WORDS_PER_CMP + 1];
 		/* arg1 and arg2 - operands of the comparison. */
 		arg1 = cover[i * KCOV_WORDS_PER_CMP + 2];
_



* [patch 78/87] kcov: allocate per-CPU memory on the relevant node
  2021-11-09  2:30 incoming Andrew Morton
                   ` (76 preceding siblings ...)
  2021-11-09  2:35 ` [patch 77/87] Documentation/kcov: define `ip' " Andrew Morton
@ 2021-11-09  2:35 ` Andrew Morton
  2021-11-09  2:35 ` [patch 79/87] kcov: avoid enable+disable interrupts if !in_task() Andrew Morton
                   ` (8 subsequent siblings)
  86 siblings, 0 replies; 98+ messages in thread
From: Andrew Morton @ 2021-11-09  2:35 UTC (permalink / raw)
  To: akpm, andreyknvl, bigeasy, dvyukov, elver, linux-mm, mm-commits,
	rostedt, torvalds, williams

From: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Subject: kcov: allocate per-CPU memory on the relevant node

During boot kcov allocates per-CPU memory which is used later if remote/
softirq processing is enabled.

Allocate the per-CPU memory on the CPU local node to avoid cross node
memory access.

Link: https://lkml.kernel.org/r/20210923164741.1859522-4-bigeasy@linutronix.de
Link: https://lore.kernel.org/r/20210830172627.267989-4-bigeasy@linutronix.de
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Acked-by: Dmitry Vyukov <dvyukov@google.com>
Acked-by: Marco Elver <elver@google.com>
Tested-by: Marco Elver <elver@google.com>
Reviewed-by: Andrey Konovalov <andreyknvl@gmail.com>
Cc: Clark Williams <williams@redhat.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 kernel/kcov.c |    4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

--- a/kernel/kcov.c~kcov-allocate-per-cpu-memory-on-the-relevant-node
+++ a/kernel/kcov.c
@@ -1034,8 +1034,8 @@ static int __init kcov_init(void)
 	int cpu;
 
 	for_each_possible_cpu(cpu) {
-		void *area = vmalloc(CONFIG_KCOV_IRQ_AREA_SIZE *
-				sizeof(unsigned long));
+		void *area = vmalloc_node(CONFIG_KCOV_IRQ_AREA_SIZE *
+				sizeof(unsigned long), cpu_to_node(cpu));
 		if (!area)
 			return -ENOMEM;
 		per_cpu_ptr(&kcov_percpu_data, cpu)->irq_area = area;
_



* [patch 79/87] kcov: avoid enable+disable interrupts if !in_task()
  2021-11-09  2:30 incoming Andrew Morton
                   ` (77 preceding siblings ...)
  2021-11-09  2:35 ` [patch 78/87] kcov: allocate per-CPU memory on the relevant node Andrew Morton
@ 2021-11-09  2:35 ` Andrew Morton
  2021-11-09  2:35 ` [patch 80/87] kcov: replace local_irq_save() with a local_lock_t Andrew Morton
                   ` (7 subsequent siblings)
  86 siblings, 0 replies; 98+ messages in thread
From: Andrew Morton @ 2021-11-09  2:35 UTC (permalink / raw)
  To: akpm, andreyknvl, bigeasy, dvyukov, elver, linux-mm, mm-commits,
	rostedt, torvalds, williams

From: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Subject: kcov: avoid enable+disable interrupts if !in_task()

kcov_remote_start() may need to allocate memory in the in_task() case
(otherwise per-CPU memory has been pre-allocated) and therefore requires
enabled interrupts.

The interrupts are enabled before checking if the allocation is required
so if no allocation is required then the interrupts are needlessly enabled
and disabled again.

Enable interrupts only if memory allocation is performed.

Link: https://lkml.kernel.org/r/20210923164741.1859522-5-bigeasy@linutronix.de
Link: https://lore.kernel.org/r/20210830172627.267989-5-bigeasy@linutronix.de
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Acked-by: Dmitry Vyukov <dvyukov@google.com>
Acked-by: Marco Elver <elver@google.com>
Tested-by: Marco Elver <elver@google.com>
Reviewed-by: Andrey Konovalov <andreyknvl@gmail.com>
Cc: Clark Williams <williams@redhat.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 kernel/kcov.c |    6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

--- a/kernel/kcov.c~kcov-avoid-enabledisable-interrupts-if-in_task
+++ a/kernel/kcov.c
@@ -869,19 +869,19 @@ void kcov_remote_start(u64 handle)
 		size = CONFIG_KCOV_IRQ_AREA_SIZE;
 		area = this_cpu_ptr(&kcov_percpu_data)->irq_area;
 	}
-	spin_unlock_irqrestore(&kcov_remote_lock, flags);
+	spin_unlock(&kcov_remote_lock);
 
 	/* Can only happen when in_task(). */
 	if (!area) {
+		local_irq_restore(flags);
 		area = vmalloc(size * sizeof(unsigned long));
 		if (!area) {
 			kcov_put(kcov);
 			return;
 		}
+		local_irq_save(flags);
 	}
 
-	local_irq_save(flags);
-
 	/* Reset coverage size. */
 	*(u64 *)area = 0;
 
_



* [patch 80/87] kcov: replace local_irq_save() with a local_lock_t
  2021-11-09  2:30 incoming Andrew Morton
                   ` (78 preceding siblings ...)
  2021-11-09  2:35 ` [patch 79/87] kcov: avoid enable+disable interrupts if !in_task() Andrew Morton
@ 2021-11-09  2:35 ` Andrew Morton
  2021-11-09  2:35 ` [patch 81/87] scripts/gdb: handle split debug for vmlinux Andrew Morton
                   ` (6 subsequent siblings)
  86 siblings, 0 replies; 98+ messages in thread
From: Andrew Morton @ 2021-11-09  2:35 UTC (permalink / raw)
  To: akpm, andreyknvl, bigeasy, dvyukov, elver, linux-mm, mm-commits,
	rostedt, torvalds, williams

From: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Subject: kcov: replace local_irq_save() with a local_lock_t

The kcov code mixes local_irq_save() and spin_lock() in
kcov_remote_{start|end}().  This creates a warning on PREEMPT_RT because
local_irq_save() disables interrupts and spin_lock_t is turned into a
sleeping lock which can not be acquired in a section with disabled
interrupts.

The kcov_remote_lock is used to synchronize the access to the hash-list
kcov_remote_map.  The local_irq_save() block protects access to the
per-CPU data kcov_percpu_data.

There is no compelling reason to change the lock type to raw_spin_lock_t
to make it work with local_irq_save().  Changing it would require moving
memory allocation (in kcov_remote_add()) and deallocation outside of the
locked section.

Adding an unlimited number of entries to the hashlist will increase the
IRQ-off time during lookup.  It could be argued that this is debug code
and the latency does not matter.  There is, however, no need to accept
that latency, and avoiding it allows this facility to be used in an
RT-enabled build.

Using a local_lock_t instead of local_irq_save() has the benefit of adding a
protection scope within the source which makes it obvious what is
protected.  On a !PREEMPT_RT && !LOCKDEP build the local_lock_irqsave()
maps directly to local_irq_save() so there is no overhead at runtime.

Replace the local_irq_save() section with a local_lock_t.
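
The lock/unlock pairing around the allocation can be sketched in user
space.  This is only a model under stated assumptions: struct model_lock,
model_lock_irqsave(), model_unlock_irqrestore() and model_remote_start()
are hypothetical stand-ins that merely record whether the lock is held;
they are not the kernel API and do not touch interrupts.  The sketch shows
the protected section being left before allocating and re-entered
afterwards, as kcov_remote_start() does around vmalloc().

```c
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for local_lock_t: only records whether it is held. */
struct model_lock {
	bool held;
};

static void model_lock_irqsave(struct model_lock *l)
{
	l->held = true;
}

static void model_unlock_irqrestore(struct model_lock *l)
{
	l->held = false;
}

/* Records the lock state observed right before the allocation. */
static bool lock_held_during_alloc;

/*
 * Model of the control flow in kcov_remote_start(): the protected section
 * is left before allocating and re-entered afterwards.  Returns the area,
 * or NULL if the allocation failed (the lock is not held in that case).
 */
static void *model_remote_start(struct model_lock *l, void *prealloc,
				size_t size)
{
	void *area;

	model_lock_irqsave(l);
	area = prealloc;
	if (!area) {
		model_unlock_irqrestore(l);
		lock_held_during_alloc = l->held;
		area = malloc(size);
		if (!area)
			return NULL;
		model_lock_irqsave(l);
	}
	/* ... per-CPU data would be touched here, with the lock held ... */
	model_unlock_irqrestore(l);
	return area;
}
```

Every return path in the model leaves the lock released, which is the
property the scoped lock/unlock calls in the patch make visible in the
source.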

Link: https://lkml.kernel.org/r/20210923164741.1859522-6-bigeasy@linutronix.de
Link: https://lore.kernel.org/r/20210830172627.267989-6-bigeasy@linutronix.de
Reported-by: Clark Williams <williams@redhat.com>
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Acked-by: Dmitry Vyukov <dvyukov@google.com>
Acked-by: Marco Elver <elver@google.com>
Tested-by: Marco Elver <elver@google.com>
Reviewed-by: Andrey Konovalov <andreyknvl@gmail.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 kernel/kcov.c |   30 +++++++++++++++++-------------
 1 file changed, 17 insertions(+), 13 deletions(-)

--- a/kernel/kcov.c~kcov-replace-local_irq_save-with-a-local_lock_t
+++ a/kernel/kcov.c
@@ -88,6 +88,7 @@ static struct list_head kcov_remote_area
 
 struct kcov_percpu_data {
 	void			*irq_area;
+	local_lock_t		lock;
 
 	unsigned int		saved_mode;
 	unsigned int		saved_size;
@@ -96,7 +97,9 @@ struct kcov_percpu_data {
 	int			saved_sequence;
 };
 
-static DEFINE_PER_CPU(struct kcov_percpu_data, kcov_percpu_data);
+static DEFINE_PER_CPU(struct kcov_percpu_data, kcov_percpu_data) = {
+	.lock = INIT_LOCAL_LOCK(lock),
+};
 
 /* Must be called with kcov_remote_lock locked. */
 static struct kcov_remote *kcov_remote_find(u64 handle)
@@ -824,7 +827,7 @@ void kcov_remote_start(u64 handle)
 	if (!in_task() && !in_serving_softirq())
 		return;
 
-	local_irq_save(flags);
+	local_lock_irqsave(&kcov_percpu_data.lock, flags);
 
 	/*
 	 * Check that kcov_remote_start() is not called twice in background
@@ -832,7 +835,7 @@ void kcov_remote_start(u64 handle)
 	 */
 	mode = READ_ONCE(t->kcov_mode);
 	if (WARN_ON(in_task() && kcov_mode_enabled(mode))) {
-		local_irq_restore(flags);
+		local_unlock_irqrestore(&kcov_percpu_data.lock, flags);
 		return;
 	}
 	/*
@@ -841,14 +844,15 @@ void kcov_remote_start(u64 handle)
 	 * happened while collecting coverage from a background thread.
 	 */
 	if (WARN_ON(in_serving_softirq() && t->kcov_softirq)) {
-		local_irq_restore(flags);
+		local_unlock_irqrestore(&kcov_percpu_data.lock, flags);
 		return;
 	}
 
 	spin_lock(&kcov_remote_lock);
 	remote = kcov_remote_find(handle);
 	if (!remote) {
-		spin_unlock_irqrestore(&kcov_remote_lock, flags);
+		spin_unlock(&kcov_remote_lock);
+		local_unlock_irqrestore(&kcov_percpu_data.lock, flags);
 		return;
 	}
 	kcov_debug("handle = %llx, context: %s\n", handle,
@@ -873,13 +877,13 @@ void kcov_remote_start(u64 handle)
 
 	/* Can only happen when in_task(). */
 	if (!area) {
-		local_irqrestore(flags);
+		local_unlock_irqrestore(&kcov_percpu_data.lock, flags);
 		area = vmalloc(size * sizeof(unsigned long));
 		if (!area) {
 			kcov_put(kcov);
 			return;
 		}
-		local_irq_save(flags);
+		local_lock_irqsave(&kcov_percpu_data.lock, flags);
 	}
 
 	/* Reset coverage size. */
@@ -891,7 +895,7 @@ void kcov_remote_start(u64 handle)
 	}
 	kcov_start(t, kcov, size, area, mode, sequence);
 
-	local_irq_restore(flags);
+	local_unlock_irqrestore(&kcov_percpu_data.lock, flags);
 
 }
 EXPORT_SYMBOL(kcov_remote_start);
@@ -965,12 +969,12 @@ void kcov_remote_stop(void)
 	if (!in_task() && !in_serving_softirq())
 		return;
 
-	local_irq_save(flags);
+	local_lock_irqsave(&kcov_percpu_data.lock, flags);
 
 	mode = READ_ONCE(t->kcov_mode);
 	barrier();
 	if (!kcov_mode_enabled(mode)) {
-		local_irq_restore(flags);
+		local_unlock_irqrestore(&kcov_percpu_data.lock, flags);
 		return;
 	}
 	/*
@@ -978,12 +982,12 @@ void kcov_remote_stop(void)
 	 * actually found the remote handle and started collecting coverage.
 	 */
 	if (in_serving_softirq() && !t->kcov_softirq) {
-		local_irq_restore(flags);
+		local_unlock_irqrestore(&kcov_percpu_data.lock, flags);
 		return;
 	}
 	/* Make sure that kcov_softirq is only set when in softirq. */
 	if (WARN_ON(!in_serving_softirq() && t->kcov_softirq)) {
-		local_irq_restore(flags);
+		local_unlock_irqrestore(&kcov_percpu_data.lock, flags);
 		return;
 	}
 
@@ -1013,7 +1017,7 @@ void kcov_remote_stop(void)
 		spin_unlock(&kcov_remote_lock);
 	}
 
-	local_irq_restore(flags);
+	local_unlock_irqrestore(&kcov_percpu_data.lock, flags);
 
 	/* Get in kcov_remote_start(). */
 	kcov_put(kcov);
_



* [patch 81/87] scripts/gdb: handle split debug for vmlinux
  2021-11-09  2:30 incoming Andrew Morton
                   ` (79 preceding siblings ...)
  2021-11-09  2:35 ` [patch 80/87] kcov: replace local_irq_save() with a local_lock_t Andrew Morton
@ 2021-11-09  2:35 ` Andrew Morton
  2021-11-09  2:35 ` [patch 82/87] kernel/resource: clean up and optimize iomem_is_exclusive() Andrew Morton
                   ` (5 subsequent siblings)
  86 siblings, 0 replies; 98+ messages in thread
From: Andrew Morton @ 2021-11-09  2:35 UTC (permalink / raw)
  To: akpm, dianders, jan.kiszka, johannes.berg, kbingham, linux-mm,
	mm-commits, swboyd, torvalds

From: Douglas Anderson <dianders@chromium.org>
Subject: scripts/gdb: handle split debug for vmlinux

This is related to two previous changes: commit dfe4529ee4d3
("scripts/gdb: find vmlinux where it was before") and commit da036ae14762
("scripts/gdb: handle split debug").

Although Chrome OS has been using the debug suffix for modules for a
while, it has just recently started using it for vmlinux as well.  That
means we've now got to improve the detection of "vmlinux" to also handle
that it might end with ".debug".
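
The suffix test itself is small.  The patch implements it in Python inside
symbols.py; the following is a C rendering of the same check, offered only
as a sketch.  The helper names ends_with() and is_vmlinux_objfile() are
hypothetical and do not exist in the kernel tree.

```c
#include <stdbool.h>
#include <string.h>

/* Return true if @s ends with @suffix. */
static bool ends_with(const char *s, const char *suffix)
{
	size_t n = strlen(s), m = strlen(suffix);

	return n >= m && strcmp(s + n - m, suffix) == 0;
}

/* Same condition as the patch: accept "vmlinux" or "vmlinux.debug". */
static bool is_vmlinux_objfile(const char *filename)
{
	return ends_with(filename, "vmlinux") ||
	       ends_with(filename, "vmlinux.debug");
}
```

A module's split debug file such as "foo.ko.debug" does not match either
suffix, so it is not mistaken for vmlinux.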

Link: https://lkml.kernel.org/r/20211028151120.v2.1.Ie6bd5a232f770acd8c9ffae487a02170bad3e963@changeid
Signed-off-by: Douglas Anderson <dianders@chromium.org>
Reviewed-by: Stephen Boyd <swboyd@chromium.org>
Cc: Jan Kiszka <jan.kiszka@siemens.com>
Cc: Kieran Bingham <kbingham@kernel.org>
Cc: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 scripts/gdb/linux/symbols.py |    3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

--- a/scripts/gdb/linux/symbols.py~scripts-gdb-handle-split-debug-for-vmlinux
+++ a/scripts/gdb/linux/symbols.py
@@ -148,7 +148,8 @@ lx-symbols command."""
         # drop all current symbols and reload vmlinux
         orig_vmlinux = 'vmlinux'
         for obj in gdb.objfiles():
-            if obj.filename.endswith('vmlinux'):
+            if (obj.filename.endswith('vmlinux') or
+                obj.filename.endswith('vmlinux.debug')):
                 orig_vmlinux = obj.filename
         gdb.execute("symbol-file", to_string=True)
         gdb.execute("symbol-file {0}".format(orig_vmlinux))
_



* [patch 82/87] kernel/resource: clean up and optimize iomem_is_exclusive()
  2021-11-09  2:30 incoming Andrew Morton
                   ` (80 preceding siblings ...)
  2021-11-09  2:35 ` [patch 81/87] scripts/gdb: handle split debug for vmlinux Andrew Morton
@ 2021-11-09  2:35 ` Andrew Morton
  2021-11-09  2:35 ` [patch 83/87] kernel/resource: disallow access to exclusive system RAM regions Andrew Morton
                   ` (4 subsequent siblings)
  86 siblings, 0 replies; 98+ messages in thread
From: Andrew Morton @ 2021-11-09  2:35 UTC (permalink / raw)
  To: akpm, andy.shevchenko, arnd, dan.j.williams, david, gregkh,
	guohanjun, jasowang, linux-mm, mm-commits, mst, rafael.j.wysocki,
	torvalds

From: David Hildenbrand <david@redhat.com>
Subject: kernel/resource: clean up and optimize iomem_is_exclusive()

Patch series "virtio-mem: disallow mapping virtio-mem memory via /dev/mem", v5.

Let's add the basic infrastructure to exclude some physical memory regions
marked as "IORESOURCE_SYSTEM_RAM" completely from /dev/mem access, even
though they are not marked IORESOURCE_BUSY and even though "iomem=relaxed"
is set.  Reuse IORESOURCE_EXCLUSIVE for that purpose instead of adding
new flags to express something similar to "soft-busy" or "not busy yet,
but already prepared by a driver and not to be mapped by user space".

Use it for virtio-mem, to disallow mapping any virtio-mem memory via
/dev/mem to user space after the virtio-mem driver was loaded.


This patch (of 3):

We end up traversing subtrees of ranges we are not interested in; let's
optimize this case, skipping such subtrees, cleaning up the function a
bit.

For example, in the following configuration (/proc/iomem):

00000000-00000fff : Reserved
00001000-00057fff : System RAM
00058000-00058fff : Reserved
00059000-0009cfff : System RAM
0009d000-000fffff : Reserved
   000a0000-000bffff : PCI Bus 0000:00
   000c0000-000c3fff : PCI Bus 0000:00
   000c4000-000c7fff : PCI Bus 0000:00
   000c8000-000cbfff : PCI Bus 0000:00
   000cc000-000cffff : PCI Bus 0000:00
   000d0000-000d3fff : PCI Bus 0000:00
   000d4000-000d7fff : PCI Bus 0000:00
   000d8000-000dbfff : PCI Bus 0000:00
   000dc000-000dffff : PCI Bus 0000:00
   000e0000-000e3fff : PCI Bus 0000:00
   000e4000-000e7fff : PCI Bus 0000:00
   000e8000-000ebfff : PCI Bus 0000:00
   000ec000-000effff : PCI Bus 0000:00
   000f0000-000fffff : PCI Bus 0000:00
     000f0000-000fffff : System ROM
00100000-3fffffff : System RAM
40000000-403fffff : Reserved
   40000000-403fffff : pnp 00:00
40400000-80a79fff : System RAM
...

We don't have to look at any children of "0009d000-000fffff : Reserved":
because the parent range is not of interest, we can skip these 15 items
directly.
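
The saving can be illustrated with a small user-space model of the
traversal.  This is a sketch under assumptions: struct res is a simplified
stand-in for struct resource, the ranges built by build_example() are made
up, and count_visits() is a hypothetical helper that only counts how many
nodes the walk inspects.  The two successor functions follow the logic of
next_resource() and next_resource_skip_children().

```c
#include <stddef.h>

/* Simplified stand-in for struct resource: just the range and tree links. */
struct res {
	unsigned long start, end;
	struct res *parent, *sibling, *child;
};

/* Depth-first successor that descends into children first. */
static struct res *next_res(struct res *p)
{
	if (p->child)
		return p->child;
	while (!p->sibling && p->parent)
		p = p->parent;
	return p->sibling;
}

/* Successor that never descends into the children of @p. */
static struct res *next_res_skip_children(struct res *p)
{
	while (!p->sibling && p->parent)
		p = p->parent;
	return p->sibling;
}

/*
 * Walk the tree looking for the range that contains @addr and return how
 * many nodes were inspected.  With @optimize set, subtrees of ranges that
 * end before @addr are skipped.
 */
static int count_visits(struct res *root, unsigned long addr, int optimize)
{
	struct res *p = root->child;
	int skip, visited = 0;

	while (p) {
		visited++;
		if (p->start > addr)
			break;
		skip = optimize && p->end < addr;
		p = skip ? next_res_skip_children(p) : next_res(p);
	}
	return visited;
}

/* Build: root -> A, B (with two children), C.  Ranges are made up. */
static void build_example(struct res n[6])
{
	static const unsigned long r[6][2] = {
		{ 0x0000, 0xffff },	/* n[0]: root */
		{ 0x0000, 0x0fff },	/* n[1]: A */
		{ 0x1000, 0x1fff },	/* n[2]: B */
		{ 0x1000, 0x17ff },	/* n[3]: child of B */
		{ 0x1800, 0x1fff },	/* n[4]: child of B */
		{ 0x2000, 0x2fff },	/* n[5]: C */
	};
	int i;

	for (i = 0; i < 6; i++)
		n[i] = (struct res){ .start = r[i][0], .end = r[i][1] };
	n[0].child = &n[1];
	n[1].parent = n[2].parent = n[5].parent = &n[0];
	n[1].sibling = &n[2];
	n[2].sibling = &n[5];
	n[2].child = &n[3];
	n[3].parent = n[4].parent = &n[2];
	n[3].sibling = &n[4];
}
```

Looking up an address inside C inspects all five non-root nodes without the
optimization, but only A, B and C with it, because the children of B are
never entered.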

Link: https://lkml.kernel.org/r/20210920142856.17758-1-david@redhat.com
Link: https://lkml.kernel.org/r/20210920142856.17758-2-david@redhat.com
Signed-off-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Dan Williams <dan.j.williams@intel.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Jason Wang <jasowang@redhat.com>
Cc: "Rafael J. Wysocki" <rafael.j.wysocki@intel.com>
Cc: Hanjun Guo <guohanjun@huawei.com>
Cc: Andy Shevchenko <andy.shevchenko@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 kernel/resource.c |   25 ++++++++++++++++++++-----
 1 file changed, 20 insertions(+), 5 deletions(-)

--- a/kernel/resource.c~kernel-resource-clean-up-and-optimize-iomem_is_exclusive
+++ a/kernel/resource.c
@@ -73,6 +73,18 @@ static struct resource *next_resource(st
 	return p->sibling;
 }
 
+static struct resource *next_resource_skip_children(struct resource *p)
+{
+	while (!p->sibling && p->parent)
+		p = p->parent;
+	return p->sibling;
+}
+
+#define for_each_resource(_root, _p, _skip_children) \
+	for ((_p) = (_root)->child; (_p); \
+	     (_p) = (_skip_children) ? next_resource_skip_children(_p) : \
+				       next_resource(_p))
+
 static void *r_next(struct seq_file *m, void *v, loff_t *pos)
 {
 	struct resource *p = v;
@@ -1712,10 +1724,9 @@ static int strict_iomem_checks;
  */
 bool iomem_is_exclusive(u64 addr)
 {
-	struct resource *p = &iomem_resource;
-	bool err = false;
-	loff_t l;
+	bool skip_children = false, err = false;
 	int size = PAGE_SIZE;
+	struct resource *p;
 
 	if (!strict_iomem_checks)
 		return false;
@@ -1723,15 +1734,19 @@ bool iomem_is_exclusive(u64 addr)
 	addr = addr & PAGE_MASK;
 
 	read_lock(&resource_lock);
-	for (p = p->child; p ; p = r_next(NULL, p, &l)) {
+	for_each_resource(&iomem_resource, p, skip_children) {
 		/*
 		 * We can probably skip the resources without
 		 * IORESOURCE_IO attribute?
 		 */
 		if (p->start >= addr + size)
 			break;
-		if (p->end < addr)
+		if (p->end < addr) {
+			skip_children = true;
 			continue;
+		}
+		skip_children = false;
+
 		/*
 		 * A resource is exclusive if IORESOURCE_EXCLUSIVE is set
 		 * or CONFIG_IO_STRICT_DEVMEM is enabled and the
_



* [patch 83/87] kernel/resource: disallow access to exclusive system RAM regions
  2021-11-09  2:30 incoming Andrew Morton
                   ` (81 preceding siblings ...)
  2021-11-09  2:35 ` [patch 82/87] kernel/resource: clean up and optimize iomem_is_exclusive() Andrew Morton
@ 2021-11-09  2:35 ` Andrew Morton
  2021-11-09  2:35 ` [patch 84/87] virtio-mem: disallow mapping virtio-mem memory via /dev/mem Andrew Morton
                   ` (3 subsequent siblings)
  86 siblings, 0 replies; 98+ messages in thread
From: Andrew Morton @ 2021-11-09  2:35 UTC (permalink / raw)
  To: akpm, andy.shevchenko, arnd, dan.j.williams, david, gregkh,
	guohanjun, jasowang, linux-mm, mm-commits, mst, rafael.j.wysocki,
	torvalds

From: David Hildenbrand <david@redhat.com>
Subject: kernel/resource: disallow access to exclusive system RAM regions

virtio-mem dynamically exposes memory inside a device memory region as
system RAM to Linux, coordinating with the hypervisor which parts are
actually "plugged" and consequently usable/accessible.  On the one hand,
the virtio-mem driver adds/removes whole memory blocks, creating/removing
busy IORESOURCE_SYSTEM_RAM resources; on the other hand, it logically
(un)plugs memory inside added memory blocks, dynamically either exposing
them to the buddy or hiding them from the buddy and marking them
PG_offline.

In contrast to physical devices, like a DIMM, the virtio-mem driver is
required to actually make use of any of the device-provided memory,
because it performs the handshake with the hypervisor.  virtio-mem memory
cannot simply be accessed via /dev/mem without a driver.

There is no safe way to:
a) Access plugged memory blocks via /dev/mem, as they might contain
   unplugged holes or might get silently unplugged by the virtio-mem
   driver and consequently turned inaccessible.
b) Access unplugged memory blocks via /dev/mem because the virtio-mem
   driver is required to make them actually accessible first.

The virtio-spec states that unplugged memory blocks MUST NOT be written,
and only selected unplugged memory blocks MAY be read.  We want to make
sure this is the case in sane environments -- where the virtio-mem driver
was loaded.

We want to make sure that in a sane environment, nobody "accidentally"
accesses unplugged memory inside the device managed region.  For example,
a user might spot a memory region in /proc/iomem and try accessing it via
/dev/mem using gdb or dumping it via something else.  By the time the mmap()
happens, the memory might already have been removed by the virtio-mem
driver silently: the mmap() would succeed and user space might
accidentally access unplugged memory.

So once the driver was loaded and detected the device along the
device-managed region, we just want to disallow any access via /dev/mem to
it.

In an ideal world, we would mark the whole region as busy ("owned by a
driver") and exclude it; however, that would be wrong, as we don't really
have actual system RAM at these ranges added to Linux ("busy system RAM").
Instead, we want to mark such ranges as "not actual busy system RAM but
still soft-reserved and prepared by a driver for future use."

Let's teach iomem_is_exclusive() to reject access to any range with
"IORESOURCE_SYSTEM_RAM | IORESOURCE_EXCLUSIVE", even if not busy and even
if "iomem=relaxed" is set.  Introduce EXCLUSIVE_SYSTEM_RAM to make it
easier for applicable drivers to depend on this setting in their Kconfig.

For now, there are no applicable ranges and we'll modify virtio-mem next
to properly set IORESOURCE_EXCLUSIVE on the parent resource container it
creates to contain all actual busy system RAM added via
add_memory_driver_managed().
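
The resulting order of checks for a single resource that overlaps the
address can be sketched as follows.  This is a user-space model, not the
kernel function: the M_* flag bits are illustrative values chosen for the
sketch (not the kernel's actual IORESOURCE_* values), and
model_res_is_exclusive() is a hypothetical helper.  Where the model returns
false, the real loop in iomem_is_exclusive() continues with the next
resource instead.

```c
#include <stdbool.h>

/* Illustrative flag bits for this sketch; not the kernel's actual values. */
#define M_SYSTEM_RAM	0x1u
#define M_EXCLUSIVE	0x2u
#define M_BUSY		0x4u

/*
 * Decision for one resource overlapping the address, following the order
 * of checks in the patched iomem_is_exclusive().
 */
static bool model_res_is_exclusive(unsigned int flags,
				   bool strict_iomem_checks,
				   bool io_strict_devmem)
{
	const unsigned int excl_ram = M_SYSTEM_RAM | M_EXCLUSIVE;

	/* Exclusive system RAM is rejected even if not busy or if relaxed. */
	if ((flags & excl_ram) == excl_ram)
		return true;
	/* Otherwise only busy resources count, and only with strict checks. */
	if (!strict_iomem_checks || !(flags & M_BUSY))
		return false;
	return io_strict_devmem || (flags & M_EXCLUSIVE);
}
```

The first check is the new one: it fires before strict_iomem_checks is
consulted, which is why the early return on !strict_iomem_checks had to
move from the top of the function into the loop.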

Link: https://lkml.kernel.org/r/20210920142856.17758-3-david@redhat.com
Signed-off-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Dan Williams <dan.j.williams@intel.com>
Cc: Andy Shevchenko <andy.shevchenko@gmail.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Hanjun Guo <guohanjun@huawei.com>
Cc: Jason Wang <jasowang@redhat.com>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: "Rafael J. Wysocki" <rafael.j.wysocki@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 kernel/resource.c |   29 +++++++++++++++++++----------
 mm/Kconfig        |    7 +++++++
 2 files changed, 26 insertions(+), 10 deletions(-)

--- a/kernel/resource.c~kernel-resource-disallow-access-to-exclusive-system-ram-regions
+++ a/kernel/resource.c
@@ -1719,26 +1719,23 @@ static int strict_iomem_checks;
 #endif
 
 /*
- * check if an address is reserved in the iomem resource tree
- * returns true if reserved, false if not reserved.
+ * Check if an address is exclusive to the kernel and must not be mapped to
+ * user space, for example, via /dev/mem.
+ *
+ * Returns true if exclusive to the kernel, otherwise returns false.
  */
 bool iomem_is_exclusive(u64 addr)
 {
+	const unsigned int exclusive_system_ram = IORESOURCE_SYSTEM_RAM |
+						  IORESOURCE_EXCLUSIVE;
 	bool skip_children = false, err = false;
 	int size = PAGE_SIZE;
 	struct resource *p;
 
-	if (!strict_iomem_checks)
-		return false;
-
 	addr = addr & PAGE_MASK;
 
 	read_lock(&resource_lock);
 	for_each_resource(&iomem_resource, p, skip_children) {
-		/*
-		 * We can probably skip the resources without
-		 * IORESOURCE_IO attribute?
-		 */
 		if (p->start >= addr + size)
 			break;
 		if (p->end < addr) {
@@ -1748,11 +1745,23 @@ bool iomem_is_exclusive(u64 addr)
 		skip_children = false;
 
 		/*
+		 * IORESOURCE_SYSTEM_RAM resources are exclusive if
+		 * IORESOURCE_EXCLUSIVE is set, even if they
+		 * are not busy and even if "iomem=relaxed" is set. The
+		 * responsible driver dynamically adds/removes system RAM within
+		 * such an area and uncontrolled access is dangerous.
+		 */
+		if ((p->flags & exclusive_system_ram) == exclusive_system_ram) {
+			err = true;
+			break;
+		}
+
+		/*
 		 * A resource is exclusive if IORESOURCE_EXCLUSIVE is set
 		 * or CONFIG_IO_STRICT_DEVMEM is enabled and the
 		 * resource is busy.
 		 */
-		if ((p->flags & IORESOURCE_BUSY) == 0)
+		if (!strict_iomem_checks || !(p->flags & IORESOURCE_BUSY))
 			continue;
 		if (IS_ENABLED(CONFIG_IO_STRICT_DEVMEM)
 				|| p->flags & IORESOURCE_EXCLUSIVE) {
--- a/mm/Kconfig~kernel-resource-disallow-access-to-exclusive-system-ram-regions
+++ a/mm/Kconfig
@@ -109,6 +109,13 @@ config NUMA_KEEP_MEMINFO
 config MEMORY_ISOLATION
 	bool
 
+# IORESOURCE_SYSTEM_RAM regions in the kernel resource tree that are marked
+# IORESOURCE_EXCLUSIVE cannot be mapped to user space, for example, via
+# /dev/mem.
+config EXCLUSIVE_SYSTEM_RAM
+	def_bool y
+	depends on !DEVMEM || STRICT_DEVMEM
+
 #
 # Only be set on architectures that have completely implemented memory hotplug
 # feature. If you are not sure, don't touch it.
_



* [patch 84/87] virtio-mem: disallow mapping virtio-mem memory via /dev/mem
  2021-11-09  2:30 incoming Andrew Morton
                   ` (82 preceding siblings ...)
  2021-11-09  2:35 ` [patch 83/87] kernel/resource: disallow access to exclusive system RAM regions Andrew Morton
@ 2021-11-09  2:35 ` Andrew Morton
  2021-11-09  2:35 ` [patch 85/87] selftests/kselftest/runner/run_one(): allow running non-executable files Andrew Morton
                   ` (2 subsequent siblings)
  86 siblings, 0 replies; 98+ messages in thread
From: Andrew Morton @ 2021-11-09  2:35 UTC (permalink / raw)
  To: akpm, andy.shevchenko, arnd, dan.j.williams, david, gregkh,
	guohanjun, jasowang, linux-mm, mm-commits, mst, rafael.j.wysocki,
	torvalds

From: David Hildenbrand <david@redhat.com>
Subject: virtio-mem: disallow mapping virtio-mem memory via /dev/mem

We don't want user space to be able to map virtio-mem device memory
directly (e.g., via /dev/mem) in order to have guarantees that in a sane
setup we'll never accidentally access unplugged memory within the
device-managed region of a virtio-mem device, just as required by the
virtio-spec.

As soon as the virtio-mem driver is loaded, the device region is visible
in /proc/iomem via the parent device region.  From that point on user
space is aware of the device region and we want to disallow mapping
anything inside that region (where we will dynamically (un)plug memory)
until the driver has been unloaded cleanly and e.g., another driver might
take over.

By creating our parent IORESOURCE_SYSTEM_RAM resource with
IORESOURCE_EXCLUSIVE, we will disallow any /dev/mem access to our device
region until the driver was unloaded cleanly and removed the parent
region.  This will work even though only some memory blocks are actually
currently added to Linux and appear as busy in the resource tree.

So access to the region from user space is only possible
a) if we don't load the virtio-mem driver.
b) after unloading the virtio-mem driver cleanly.

Don't build virtio-mem if access to /dev/mem cannot be restricted -- if
we have CONFIG_DEVMEM=y but CONFIG_STRICT_DEVMEM is not set.
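
The build condition comes from the "depends on !DEVMEM || STRICT_DEVMEM"
line of the EXCLUSIVE_SYSTEM_RAM Kconfig entry.  As a sketch, the same
boolean can be written out in C; exclusive_system_ram_available() is a
hypothetical name used only to make the truth table easy to check.

```c
#include <stdbool.h>

/* Models "depends on !DEVMEM || STRICT_DEVMEM" for EXCLUSIVE_SYSTEM_RAM. */
static bool exclusive_system_ram_available(bool devmem, bool strict_devmem)
{
	return !devmem || strict_devmem;
}
```

Only the combination DEVMEM=y with STRICT_DEVMEM unset yields false, which
is exactly the configuration in which virtio-mem is not built.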

Link: https://lkml.kernel.org/r/20210920142856.17758-4-david@redhat.com
Signed-off-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Dan Williams <dan.j.williams@intel.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Cc: Andy Shevchenko <andy.shevchenko@gmail.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Hanjun Guo <guohanjun@huawei.com>
Cc: Jason Wang <jasowang@redhat.com>
Cc: "Rafael J. Wysocki" <rafael.j.wysocki@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 drivers/virtio/Kconfig      |    1 +
 drivers/virtio/virtio_mem.c |    4 +++-
 2 files changed, 4 insertions(+), 1 deletion(-)

--- a/drivers/virtio/Kconfig~virtio-mem-disallow-mapping-virtio-mem-memory-via-dev-mem
+++ a/drivers/virtio/Kconfig
@@ -101,6 +101,7 @@ config VIRTIO_MEM
 	depends on MEMORY_HOTPLUG
 	depends on MEMORY_HOTREMOVE
 	depends on CONTIG_ALLOC
+	depends on EXCLUSIVE_SYSTEM_RAM
 	help
 	 This driver provides access to virtio-mem paravirtualized memory
 	 devices, allowing to hotplug and hotunplug memory.
--- a/drivers/virtio/virtio_mem.c~virtio-mem-disallow-mapping-virtio-mem-memory-via-dev-mem
+++ a/drivers/virtio/virtio_mem.c
@@ -2672,8 +2672,10 @@ static int virtio_mem_create_resource(st
 	if (!name)
 		return -ENOMEM;
 
+	/* Disallow mapping device memory via /dev/mem completely. */
 	vm->parent_resource = __request_mem_region(vm->addr, vm->region_size,
-						   name, IORESOURCE_SYSTEM_RAM);
+						   name, IORESOURCE_SYSTEM_RAM |
+						   IORESOURCE_EXCLUSIVE);
 	if (!vm->parent_resource) {
 		kfree(name);
 		dev_warn(&vm->vdev->dev, "could not reserve device region\n");
_



* [patch 85/87] selftests/kselftest/runner/run_one(): allow running non-executable files
  2021-11-09  2:30 incoming Andrew Morton
                   ` (83 preceding siblings ...)
  2021-11-09  2:35 ` [patch 84/87] virtio-mem: disallow mapping virtio-mem memory via /dev/mem Andrew Morton
@ 2021-11-09  2:35 ` Andrew Morton
  2021-11-09  2:35 ` [patch 86/87] ipc: check checkpoint_restore_ns_capable() to modify C/R proc files Andrew Morton
  2021-11-09  2:36 ` [patch 87/87] ipc/ipc_sysctl.c: remove fallback for !CONFIG_PROC_SYSCTL Andrew Morton
  86 siblings, 0 replies; 98+ messages in thread
From: Andrew Morton @ 2021-11-09  2:35 UTC (permalink / raw)
  To: akpm, gregkh, linux-mm, mm-commits, shuah, sjpark, torvalds

From: SeongJae Park <sjpark@amazon.de>
Subject: selftests/kselftest/runner/run_one(): allow running non-executable files

When running a test program, 'run_one()' checks if the program has the
execution permission and fails if it doesn't.  However, it's easy to
mistakenly lose the permissions, as some common tools like 'diff' don't
support the permission change well[1].  Compared to that, making mistakes
in the test program's path should be rare, as those are explicitly listed
in 'TEST_PROGS'.  Therefore, it might make more sense to resolve the
situation on our own and run the program.

For this reason, this commit makes the test program runner function still
print the warning message, but then try to parse the program's interpreter
and explicitly run the program with that interpreter in this case.
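
The interpreter lookup is done in shell in runner.sh, using head and cut on
the first line of the file.  The same parsing step can be sketched in C;
parse_interpreter() is a hypothetical helper written for illustration and
is not part of the selftests.

```c
#include <stddef.h>
#include <string.h>

/*
 * If @first_line starts with "#!", copy what follows (up to a newline)
 * into @out and return 0.  Return -1 if there is no "#!" prefix or if
 * @out is too small.
 */
static int parse_interpreter(const char *first_line, char *out, size_t outlen)
{
	size_t len;

	if (strncmp(first_line, "#!", 2) != 0)
		return -1;
	first_line += 2;
	len = strcspn(first_line, "\n");
	if (len + 1 > outlen)
		return -1;
	memcpy(out, first_line, len);
	out[len] = '\0';
	return 0;
}
```

Like "cut -c 3-", the sketch keeps everything after "#!", including any
interpreter arguments.  A file without that prefix, such as a binary, is
reported as not runnable.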

[1] https://lore.kernel.org/mm-commits/YRJisBs9AunccCD4@kroah.com/

Link: https://lkml.kernel.org/r/20210810164534.25902-1-sj38.park@gmail.com
Signed-off-by: SeongJae Park <sjpark@amazon.de>
Suggested-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Shuah Khan <shuah@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 tools/testing/selftests/kselftest/runner.sh |   28 +++++++++++-------
 1 file changed, 18 insertions(+), 10 deletions(-)

--- a/tools/testing/selftests/kselftest/runner.sh~selftests-kselftest-runner-run_one-allow-running-non-executable-files
+++ a/tools/testing/selftests/kselftest/runner.sh
@@ -33,9 +33,9 @@ tap_timeout()
 {
 	# Make sure tests will time out if utility is available.
 	if [ -x /usr/bin/timeout ] ; then
-		/usr/bin/timeout --foreground "$kselftest_timeout" "$1"
+		/usr/bin/timeout --foreground "$kselftest_timeout" $1
 	else
-		"$1"
+		$1
 	fi
 }
 
@@ -65,17 +65,25 @@ run_one()
 
 	TEST_HDR_MSG="selftests: $DIR: $BASENAME_TEST"
 	echo "# $TEST_HDR_MSG"
-	if [ ! -x "$TEST" ]; then
-		echo -n "# Warning: file $TEST is "
-		if [ ! -e "$TEST" ]; then
-			echo "missing!"
-		else
-			echo "not executable, correct this."
-		fi
+	if [ ! -e "$TEST" ]; then
+		echo "# Warning: file $TEST is missing!"
 		echo "not ok $test_num $TEST_HDR_MSG"
 	else
+		cmd="./$BASENAME_TEST"
+		if [ ! -x "$TEST" ]; then
+			echo "# Warning: file $TEST is not executable"
+
+			if [ $(head -n 1 "$TEST" | cut -c -2) = "#!" ]
+			then
+				interpreter=$(head -n 1 "$TEST" | cut -c 3-)
+				cmd="$interpreter ./$BASENAME_TEST"
+			else
+				echo "not ok $test_num $TEST_HDR_MSG"
+				return
+			fi
+		fi
 		cd `dirname $TEST` > /dev/null
-		((((( tap_timeout ./$BASENAME_TEST 2>&1; echo $? >&3) |
+		((((( tap_timeout "$cmd" 2>&1; echo $? >&3) |
 			tap_prefix >&4) 3>&1) |
 			(read xs; exit $xs)) 4>>"$logfile" &&
 		echo "ok $test_num $TEST_HDR_MSG") ||
_



* [patch 86/87] ipc: check checkpoint_restore_ns_capable() to modify C/R proc files
  2021-11-09  2:30 incoming Andrew Morton
                   ` (84 preceding siblings ...)
  2021-11-09  2:35 ` [patch 85/87] selftests/kselftest/runner/run_one(): allow running non-executable files Andrew Morton
@ 2021-11-09  2:35 ` Andrew Morton
  2021-11-09  2:36 ` [patch 87/87] ipc/ipc_sysctl.c: remove fallback for !CONFIG_PROC_SYSCTL Andrew Morton
  86 siblings, 0 replies; 98+ messages in thread
From: Andrew Morton @ 2021-11-09  2:35 UTC (permalink / raw)
  To: akpm, dbueso, ebiederm, linux-mm, manfred, mclapinski,
	mm-commits, torvalds

From: Michal Clapinski <mclapinski@google.com>
Subject: ipc: check checkpoint_restore_ns_capable() to modify C/R proc files

This commit removes the requirement to be root to modify sem_next_id,
msg_next_id and shm_next_id and checks checkpoint_restore_ns_capable
instead.

Since those files are specific to the IPC namespace, there is no reason
they should require root privileges.  This is similar to ns_last_pid,
which also only checks checkpoint_restore_ns_capable.
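
The gate placed in front of the sysctl handler reduces to one condition.
The following user-space sketch models it; model_next_id_access() is a
hypothetical name, and the cr_capable argument stands in for the result of
checkpoint_restore_ns_capable() on the IPC namespace's user namespace.

```c
#include <errno.h>
#include <stdbool.h>

/*
 * Reads always pass; writes pass only with the checkpoint/restore
 * capability.  Returns 0 if the request may proceed, -EPERM otherwise.
 */
static int model_next_id_access(bool write, bool cr_capable)
{
	if (write && !cr_capable)
		return -EPERM;
	return 0;
}
```

Because the handler now performs the permission check itself, the file mode
can be relaxed from 0644 to 0666 without letting unprivileged writers
through.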

[akpm@linux-foundation.org: ipc/ipc_sysctl.c needs capability.h for checkpoint_restore_ns_capable()]
Link: https://lkml.kernel.org/r/20210916163717.3179496-1-mclapinski@google.com
Signed-off-by: Michal Clapinski <mclapinski@google.com>
Reviewed-by: Davidlohr Bueso <dbueso@suse.de>
Reviewed-by: Manfred Spraul <manfred@colorfullife.com>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 ipc/ipc_sysctl.c |   29 +++++++++++++++++++++++------
 1 file changed, 23 insertions(+), 6 deletions(-)

--- a/ipc/ipc_sysctl.c~ipc-check-checkpoint_restore_ns_capable-to-modify-c-r-proc-files
+++ a/ipc/ipc_sysctl.c
@@ -10,6 +10,7 @@
 #include <linux/nsproxy.h>
 #include <linux/sysctl.h>
 #include <linux/uaccess.h>
+#include <linux/capability.h>
 #include <linux/ipc_namespace.h>
 #include <linux/msg.h>
 #include "util.h"
@@ -104,6 +105,19 @@ static int proc_ipc_sem_dointvec(struct
 	return ret;
 }
 
+#ifdef CONFIG_CHECKPOINT_RESTORE
+static int proc_ipc_dointvec_minmax_checkpoint_restore(struct ctl_table *table,
+		int write, void *buffer, size_t *lenp, loff_t *ppos)
+{
+	struct user_namespace *user_ns = current->nsproxy->ipc_ns->user_ns;
+
+	if (write && !checkpoint_restore_ns_capable(user_ns))
+		return -EPERM;
+
+	return proc_ipc_dointvec_minmax(table, write, buffer, lenp, ppos);
+}
+#endif
+
 #else
 #define proc_ipc_doulongvec_minmax NULL
 #define proc_ipc_dointvec	   NULL
@@ -111,6 +125,9 @@ static int proc_ipc_sem_dointvec(struct
 #define proc_ipc_dointvec_minmax_orphans   NULL
 #define proc_ipc_auto_msgmni	   NULL
 #define proc_ipc_sem_dointvec	   NULL
+#ifdef CONFIG_CHECKPOINT_RESTORE
+#define proc_ipc_dointvec_minmax_checkpoint_restore	NULL
+#endif	/* CONFIG_CHECKPOINT_RESTORE */
 #endif
 
 int ipc_mni = IPCMNI;
@@ -198,8 +215,8 @@ static struct ctl_table ipc_kern_table[]
 		.procname	= "sem_next_id",
 		.data		= &init_ipc_ns.ids[IPC_SEM_IDS].next_id,
 		.maxlen		= sizeof(init_ipc_ns.ids[IPC_SEM_IDS].next_id),
-		.mode		= 0644,
-		.proc_handler	= proc_ipc_dointvec_minmax,
+		.mode		= 0666,
+		.proc_handler	= proc_ipc_dointvec_minmax_checkpoint_restore,
 		.extra1		= SYSCTL_ZERO,
 		.extra2		= SYSCTL_INT_MAX,
 	},
@@ -207,8 +224,8 @@ static struct ctl_table ipc_kern_table[]
 		.procname	= "msg_next_id",
 		.data		= &init_ipc_ns.ids[IPC_MSG_IDS].next_id,
 		.maxlen		= sizeof(init_ipc_ns.ids[IPC_MSG_IDS].next_id),
-		.mode		= 0644,
-		.proc_handler	= proc_ipc_dointvec_minmax,
+		.mode		= 0666,
+		.proc_handler	= proc_ipc_dointvec_minmax_checkpoint_restore,
 		.extra1		= SYSCTL_ZERO,
 		.extra2		= SYSCTL_INT_MAX,
 	},
@@ -216,8 +233,8 @@ static struct ctl_table ipc_kern_table[]
 		.procname	= "shm_next_id",
 		.data		= &init_ipc_ns.ids[IPC_SHM_IDS].next_id,
 		.maxlen		= sizeof(init_ipc_ns.ids[IPC_SHM_IDS].next_id),
-		.mode		= 0644,
-		.proc_handler	= proc_ipc_dointvec_minmax,
+		.mode		= 0666,
+		.proc_handler	= proc_ipc_dointvec_minmax_checkpoint_restore,
 		.extra1		= SYSCTL_ZERO,
 		.extra2		= SYSCTL_INT_MAX,
 	},
_



* [patch 87/87] ipc/ipc_sysctl.c: remove fallback for !CONFIG_PROC_SYSCTL
  2021-11-09  2:30 incoming Andrew Morton
                   ` (85 preceding siblings ...)
  2021-11-09  2:35 ` [patch 86/87] ipc: check checkpoint_restore_ns_capable() to modify C/R proc files Andrew Morton
@ 2021-11-09  2:36 ` Andrew Morton
  86 siblings, 0 replies; 98+ messages in thread
From: Andrew Morton @ 2021-11-09  2:36 UTC (permalink / raw)
  To: akpm, dbueso, ebiederm, linux-mm, manfred, mm-commits, torvalds

From: Manfred Spraul <manfred@colorfullife.com>
Subject: ipc/ipc_sysctl.c: remove fallback for !CONFIG_PROC_SYSCTL

Compilation of ipc/ipc_sysctl.c is controlled by
obj-$(CONFIG_SYSVIPC_SYSCTL)
[see ipc/Makefile]

And CONFIG_SYSVIPC_SYSCTL depends on SYSCTL
[see init/Kconfig]

And SYSCTL is selected by PROC_SYSCTL.
[see fs/proc/Kconfig]

Thus: #ifndef CONFIG_PROC_SYSCTL in ipc/ipc_sysctl.c is impossible, so the
fallback can be removed.

Link: https://lkml.kernel.org/r/20210918145337.3369-1-manfred@colorfullife.com
Signed-off-by: Manfred Spraul <manfred@colorfullife.com>
Reviewed-by: "Eric W. Biederman" <ebiederm@xmission.com>
Acked-by: Davidlohr Bueso <dbueso@suse.de>
Cc: Manfred Spraul <manfred@colorfullife.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 ipc/ipc_sysctl.c |   13 -------------
 1 file changed, 13 deletions(-)

--- a/ipc/ipc_sysctl.c~ipc-ipc_sysctlc-remove-fallback-for-config_proc_sysctl
+++ a/ipc/ipc_sysctl.c
@@ -23,7 +23,6 @@ static void *get_ipc(struct ctl_table *t
 	return which;
 }
 
-#ifdef CONFIG_PROC_SYSCTL
 static int proc_ipc_dointvec(struct ctl_table *table, int write,
 		void *buffer, size_t *lenp, loff_t *ppos)
 {
@@ -118,18 +117,6 @@ static int proc_ipc_dointvec_minmax_chec
 }
 #endif
 
-#else
-#define proc_ipc_doulongvec_minmax NULL
-#define proc_ipc_dointvec	   NULL
-#define proc_ipc_dointvec_minmax   NULL
-#define proc_ipc_dointvec_minmax_orphans   NULL
-#define proc_ipc_auto_msgmni	   NULL
-#define proc_ipc_sem_dointvec	   NULL
-#ifdef CONFIG_CHECKPOINT_RESTORE
-#define proc_ipc_dointvec_minmax_checkpoint_restore	NULL
-#endif	/* CONFIG_CHECKPOINT_RESTORE */
-#endif
-
 int ipc_mni = IPCMNI;
 int ipc_mni_shift = IPCMNI_SHIFT;
 int ipc_min_cycle = RADIX_TREE_MAP_SIZE;
_


^ permalink raw reply	[flat|nested] 98+ messages in thread

* Re: [patch 08/87] proc/vmcore: convert oldmem_pfn_is_ram callback to more generic vmcore callbacks
  2021-11-09  2:31 ` [patch 08/87] proc/vmcore: convert oldmem_pfn_is_ram callback to more generic vmcore callbacks Andrew Morton
@ 2021-11-09  3:59   ` Dave Young
  2021-11-09  6:40     ` David Hildenbrand
  2021-11-10  7:22   ` Baoquan He
  1 sibling, 1 reply; 98+ messages in thread
From: Dave Young @ 2021-11-09  3:59 UTC (permalink / raw)
  To: Andrew Morton
  Cc: bhe, boris.ostrovsky, bp, david, hpa, jasowang, jgross, linux-mm,
	mhocko, mingo, mm-commits, mst, osalvador, rafael.j.wysocki,
	rppt, sstabellini, tglx, torvalds, vgoyal

Hi Andrew,
On 11/08/21 at 06:31pm, Andrew Morton wrote:
> From: David Hildenbrand <david@redhat.com>
> Subject: proc/vmcore: convert oldmem_pfn_is_ram callback to more generic vmcore callbacks
> 
> Let's support multiple registered callbacks, making sure that registering
> vmcore callbacks cannot fail.  Make the callback return a bool instead of
> an int, handling how to deal with errors internally.  Drop unused
> HAVE_OLDMEM_PFN_IS_RAM.
> 
> We soon want to make use of this infrastructure from other drivers:
> virtio-mem, registering one callback for each virtio-mem device, to
> prevent reading unplugged virtio-mem memory.
> 
> Handle it via a generic vmcore_cb structure, prepared for future
> extensions: for example, once we support virtio-mem on s390x where the
> vmcore is completely constructed in the second kernel, we want to detect
> and add plugged virtio-mem memory ranges to the vmcore in order for them
> to get dumped properly.
> 
> Handle corner cases that are unexpected and shouldn't happen in sane
> setups: registering a callback after the vmcore has already been opened
> (warn only) and unregistering a callback after the vmcore has already been
> opened (warn and essentially read only zeroes from that point on).

This is a nice improvement, thanks David.  But I have not had time to
review it yet.  The overall idea is good, but I would prefer to hold off
on the patches for some time and wait for more review.

Sorry for jumping in late.

> 
> Link: https://lkml.kernel.org/r/20211005121430.30136-6-david@redhat.com
> Signed-off-by: David Hildenbrand <david@redhat.com>
> Cc: Baoquan He <bhe@redhat.com>
> Cc: Borislav Petkov <bp@alien8.de>
> Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
> Cc: Dave Young <dyoung@redhat.com>
> Cc: "H. Peter Anvin" <hpa@zytor.com>
> Cc: Ingo Molnar <mingo@redhat.com>
> Cc: Jason Wang <jasowang@redhat.com>
> Cc: Juergen Gross <jgross@suse.com>
> Cc: "Michael S. Tsirkin" <mst@redhat.com>
> Cc: Michal Hocko <mhocko@suse.com>
> Cc: Mike Rapoport <rppt@kernel.org>
> Cc: Oscar Salvador <osalvador@suse.de>
> Cc: "Rafael J. Wysocki" <rafael.j.wysocki@intel.com>
> Cc: Stefano Stabellini <sstabellini@kernel.org>
> Cc: Thomas Gleixner <tglx@linutronix.de>
> Cc: Vivek Goyal <vgoyal@redhat.com>
> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
> ---
> 
>  arch/x86/kernel/aperture_64.c |   13 +++-
>  arch/x86/xen/mmu_hvm.c        |   11 ++-
>  fs/proc/vmcore.c              |  101 ++++++++++++++++++++++----------
>  include/linux/crash_dump.h    |   26 +++++++-
>  4 files changed, 112 insertions(+), 39 deletions(-)
> 
> --- a/arch/x86/kernel/aperture_64.c~proc-vmcore-convert-oldmem_pfn_is_ram-callback-to-more-generic-vmcore-callbacks
> +++ a/arch/x86/kernel/aperture_64.c
> @@ -73,12 +73,23 @@ static int gart_mem_pfn_is_ram(unsigned
>  		      (pfn >= aperture_pfn_start + aperture_page_count));
>  }
>  
> +#ifdef CONFIG_PROC_VMCORE
> +static bool gart_oldmem_pfn_is_ram(struct vmcore_cb *cb, unsigned long pfn)
> +{
> +	return !!gart_mem_pfn_is_ram(pfn);
> +}
> +
> +static struct vmcore_cb gart_vmcore_cb = {
> +	.pfn_is_ram = gart_oldmem_pfn_is_ram,
> +};
> +#endif
> +
>  static void __init exclude_from_core(u64 aper_base, u32 aper_order)
>  {
>  	aperture_pfn_start = aper_base >> PAGE_SHIFT;
>  	aperture_page_count = (32 * 1024 * 1024) << aper_order >> PAGE_SHIFT;
>  #ifdef CONFIG_PROC_VMCORE
> -	WARN_ON(register_oldmem_pfn_is_ram(&gart_mem_pfn_is_ram));
> +	register_vmcore_cb(&gart_vmcore_cb);
>  #endif
>  #ifdef CONFIG_PROC_KCORE
>  	WARN_ON(register_mem_pfn_is_ram(&gart_mem_pfn_is_ram));
> --- a/arch/x86/xen/mmu_hvm.c~proc-vmcore-convert-oldmem_pfn_is_ram-callback-to-more-generic-vmcore-callbacks
> +++ a/arch/x86/xen/mmu_hvm.c
> @@ -12,10 +12,10 @@
>   * The kdump kernel has to check whether a pfn of the crashed kernel
>   * was a ballooned page. vmcore is using this function to decide
>   * whether to access a pfn of the crashed kernel.
> - * Returns 0 if the pfn is not backed by a RAM page, the caller may
> + * Returns "false" if the pfn is not backed by a RAM page, the caller may
>   * handle the pfn special in this case.
>   */
> -static int xen_oldmem_pfn_is_ram(unsigned long pfn)
> +static bool xen_vmcore_pfn_is_ram(struct vmcore_cb *cb, unsigned long pfn)
>  {
>  	struct xen_hvm_get_mem_type a = {
>  		.domid = DOMID_SELF,
> @@ -24,10 +24,13 @@ static int xen_oldmem_pfn_is_ram(unsigne
>  
>  	if (HYPERVISOR_hvm_op(HVMOP_get_mem_type, &a)) {
>  		pr_warn_once("Unexpected HVMOP_get_mem_type failure\n");
> -		return -ENXIO;
> +		return true;
>  	}
>  	return a.mem_type != HVMMEM_mmio_dm;
>  }
> +static struct vmcore_cb xen_vmcore_cb = {
> +	.pfn_is_ram = xen_vmcore_pfn_is_ram,
> +};
>  #endif
>  
>  static void xen_hvm_exit_mmap(struct mm_struct *mm)
> @@ -61,6 +64,6 @@ void __init xen_hvm_init_mmu_ops(void)
>  	if (is_pagetable_dying_supported())
>  		pv_ops.mmu.exit_mmap = xen_hvm_exit_mmap;
>  #ifdef CONFIG_PROC_VMCORE
> -	WARN_ON(register_oldmem_pfn_is_ram(&xen_oldmem_pfn_is_ram));
> +	register_vmcore_cb(&xen_vmcore_cb);
>  #endif
>  }
> --- a/fs/proc/vmcore.c~proc-vmcore-convert-oldmem_pfn_is_ram-callback-to-more-generic-vmcore-callbacks
> +++ a/fs/proc/vmcore.c
> @@ -62,46 +62,75 @@ core_param(novmcoredd, vmcoredd_disabled
>  /* Device Dump Size */
>  static size_t vmcoredd_orig_sz;
>  
> -/*
> - * Returns > 0 for RAM pages, 0 for non-RAM pages, < 0 on error
> - * The called function has to take care of module refcounting.
> - */
> -static int (*oldmem_pfn_is_ram)(unsigned long pfn);
> -
> -int register_oldmem_pfn_is_ram(int (*fn)(unsigned long pfn))
> -{
> -	if (oldmem_pfn_is_ram)
> -		return -EBUSY;
> -	oldmem_pfn_is_ram = fn;
> -	return 0;
> +static DECLARE_RWSEM(vmcore_cb_rwsem);
> +/* List of registered vmcore callbacks. */
> +static LIST_HEAD(vmcore_cb_list);
> +/* Whether we had a surprise unregistration of a callback. */
> +static bool vmcore_cb_unstable;
> +/* Whether the vmcore has been opened once. */
> +static bool vmcore_opened;
> +
> +void register_vmcore_cb(struct vmcore_cb *cb)
> +{
> +	down_write(&vmcore_cb_rwsem);
> +	INIT_LIST_HEAD(&cb->next);
> +	list_add_tail(&cb->next, &vmcore_cb_list);
> +	/*
> +	 * Registering a vmcore callback after the vmcore was opened is
> +	 * very unusual (e.g., manual driver loading).
> +	 */
> +	if (vmcore_opened)
> +		pr_warn_once("Unexpected vmcore callback registration\n");
> +	up_write(&vmcore_cb_rwsem);
>  }
> -EXPORT_SYMBOL_GPL(register_oldmem_pfn_is_ram);
> +EXPORT_SYMBOL_GPL(register_vmcore_cb);
>  
> -void unregister_oldmem_pfn_is_ram(void)
> +void unregister_vmcore_cb(struct vmcore_cb *cb)
>  {
> -	oldmem_pfn_is_ram = NULL;
> -	wmb();
> +	down_write(&vmcore_cb_rwsem);
> +	list_del(&cb->next);
> +	/*
> +	 * Unregistering a vmcore callback after the vmcore was opened is
> +	 * very unusual (e.g., forced driver removal), but we cannot stop
> +	 * unregistering.
> +	 */
> +	if (vmcore_opened) {
> +		pr_warn_once("Unexpected vmcore callback unregistration\n");
> +		vmcore_cb_unstable = true;
> +	}
> +	up_write(&vmcore_cb_rwsem);
>  }
> -EXPORT_SYMBOL_GPL(unregister_oldmem_pfn_is_ram);
> +EXPORT_SYMBOL_GPL(unregister_vmcore_cb);
>  
>  static bool pfn_is_ram(unsigned long pfn)
>  {
> -	int (*fn)(unsigned long pfn);
> -	/* pfn is ram unless fn() checks pagetype */
> +	struct vmcore_cb *cb;
>  	bool ret = true;
>  
> -	/*
> -	 * Ask hypervisor if the pfn is really ram.
> -	 * A ballooned page contains no data and reading from such a page
> -	 * will cause high load in the hypervisor.
> -	 */
> -	fn = oldmem_pfn_is_ram;
> -	if (fn)
> -		ret = !!fn(pfn);
> +	lockdep_assert_held_read(&vmcore_cb_rwsem);
> +	if (unlikely(vmcore_cb_unstable))
> +		return false;
> +
> +	list_for_each_entry(cb, &vmcore_cb_list, next) {
> +		if (unlikely(!cb->pfn_is_ram))
> +			continue;
> +		ret = cb->pfn_is_ram(cb, pfn);
> +		if (!ret)
> +			break;
> +	}
>  
>  	return ret;
>  }
>  
> +static int open_vmcore(struct inode *inode, struct file *file)
> +{
> +	down_read(&vmcore_cb_rwsem);
> +	vmcore_opened = true;
> +	up_read(&vmcore_cb_rwsem);
> +
> +	return 0;
> +}
> +
>  /* Reads a page from the oldmem device from given offset. */
>  ssize_t read_from_oldmem(char *buf, size_t count,
>  			 u64 *ppos, int userbuf,
> @@ -117,6 +146,7 @@ ssize_t read_from_oldmem(char *buf, size
>  	offset = (unsigned long)(*ppos % PAGE_SIZE);
>  	pfn = (unsigned long)(*ppos / PAGE_SIZE);
>  
> +	down_read(&vmcore_cb_rwsem);
>  	do {
>  		if (count > (PAGE_SIZE - offset))
>  			nr_bytes = PAGE_SIZE - offset;
> @@ -136,8 +166,10 @@ ssize_t read_from_oldmem(char *buf, size
>  				tmp = copy_oldmem_page(pfn, buf, nr_bytes,
>  						       offset, userbuf);
>  
> -			if (tmp < 0)
> +			if (tmp < 0) {
> +				up_read(&vmcore_cb_rwsem);
>  				return tmp;
> +			}
>  		}
>  		*ppos += nr_bytes;
>  		count -= nr_bytes;
> @@ -147,6 +179,7 @@ ssize_t read_from_oldmem(char *buf, size
>  		offset = 0;
>  	} while (count);
>  
> +	up_read(&vmcore_cb_rwsem);
>  	return read;
>  }
>  
> @@ -537,14 +570,19 @@ static int vmcore_remap_oldmem_pfn(struc
>  			    unsigned long from, unsigned long pfn,
>  			    unsigned long size, pgprot_t prot)
>  {
> +	int ret;
> +
>  	/*
>  	 * Check if oldmem_pfn_is_ram was registered to avoid
>  	 * looping over all pages without a reason.
>  	 */
> -	if (oldmem_pfn_is_ram)
> -		return remap_oldmem_pfn_checked(vma, from, pfn, size, prot);
> +	down_read(&vmcore_cb_rwsem);
> +	if (!list_empty(&vmcore_cb_list) || vmcore_cb_unstable)
> +		ret = remap_oldmem_pfn_checked(vma, from, pfn, size, prot);
>  	else
> -		return remap_oldmem_pfn_range(vma, from, pfn, size, prot);
> +		ret = remap_oldmem_pfn_range(vma, from, pfn, size, prot);
> +	up_read(&vmcore_cb_rwsem);
> +	return ret;
>  }
>  
>  static int mmap_vmcore(struct file *file, struct vm_area_struct *vma)
> @@ -668,6 +706,7 @@ static int mmap_vmcore(struct file *file
>  #endif
>  
>  static const struct proc_ops vmcore_proc_ops = {
> +	.proc_open	= open_vmcore,
>  	.proc_read	= read_vmcore,
>  	.proc_lseek	= default_llseek,
>  	.proc_mmap	= mmap_vmcore,
> --- a/include/linux/crash_dump.h~proc-vmcore-convert-oldmem_pfn_is_ram-callback-to-more-generic-vmcore-callbacks
> +++ a/include/linux/crash_dump.h
> @@ -91,9 +91,29 @@ static inline void vmcore_unusable(void)
>  		elfcorehdr_addr = ELFCORE_ADDR_ERR;
>  }
>  
> -#define HAVE_OLDMEM_PFN_IS_RAM 1
> -extern int register_oldmem_pfn_is_ram(int (*fn)(unsigned long pfn));
> -extern void unregister_oldmem_pfn_is_ram(void);
> +/**
> + * struct vmcore_cb - driver callbacks for /proc/vmcore handling
> + * @pfn_is_ram: check whether a PFN really is RAM and should be accessed when
> + *              reading the vmcore. Will return "true" if it is RAM or if the
> + *              callback cannot tell. If any callback returns "false", it's not
> + *              RAM and the page must not be accessed; zeroes should be
> + *              indicated in the vmcore instead. For example, a ballooned page
> + *              contains no data and reading from such a page will cause high
> + *              load in the hypervisor.
> + * @next: List head to manage registered callbacks internally; initialized by
> + *        register_vmcore_cb().
> + *
> + * vmcore callbacks allow drivers managing physical memory ranges to
> + * coordinate with vmcore handling code, for example, to prevent accessing
> + * physical memory ranges that should not be accessed when reading the vmcore,
> + * although included in the vmcore header as memory ranges to dump.
> + */
> +struct vmcore_cb {
> +	bool (*pfn_is_ram)(struct vmcore_cb *cb, unsigned long pfn);
> +	struct list_head next;
> +};
> +extern void register_vmcore_cb(struct vmcore_cb *cb);
> +extern void unregister_vmcore_cb(struct vmcore_cb *cb);
>  
>  #else /* !CONFIG_CRASH_DUMP */
>  static inline bool is_kdump_kernel(void) { return 0; }
> _
> 



^ permalink raw reply	[flat|nested] 98+ messages in thread

* Re: [patch 08/87] proc/vmcore: convert oldmem_pfn_is_ram callback to more generic vmcore callbacks
  2021-11-09  3:59   ` Dave Young
@ 2021-11-09  6:40     ` David Hildenbrand
  2021-11-09 10:30       ` Dave Young
  0 siblings, 1 reply; 98+ messages in thread
From: David Hildenbrand @ 2021-11-09  6:40 UTC (permalink / raw)
  To: Dave Young, Andrew Morton
  Cc: bhe, boris.ostrovsky, bp, hpa, jasowang, jgross, linux-mm,
	mhocko, mingo, mm-commits, mst, osalvador, rafael.j.wysocki,
	rppt, sstabellini, tglx, torvalds, vgoyal

On 09.11.21 04:59, Dave Young wrote:
> Hi Andrew,
> On 11/08/21 at 06:31pm, Andrew Morton wrote:
>> From: David Hildenbrand <david@redhat.com>
>> Subject: proc/vmcore: convert oldmem_pfn_is_ram callback to more generic vmcore callbacks
>>
>> Let's support multiple registered callbacks, making sure that registering
>> vmcore callbacks cannot fail.  Make the callback return a bool instead of
>> an int, handling how to deal with errors internally.  Drop unused
>> HAVE_OLDMEM_PFN_IS_RAM.
>>
>> We soon want to make use of this infrastructure from other drivers:
>> virtio-mem, registering one callback for each virtio-mem device, to
>> prevent reading unplugged virtio-mem memory.
>>
>> Handle it via a generic vmcore_cb structure, prepared for future
>> extensions: for example, once we support virtio-mem on s390x where the
>> vmcore is completely constructed in the second kernel, we want to detect
>> and add plugged virtio-mem memory ranges to the vmcore in order for them
>> to get dumped properly.
>>
>> Handle corner cases that are unexpected and shouldn't happen in sane
>> setups: registering a callback after the vmcore has already been opened
>> (warn only) and unregistering a callback after the vmcore has already been
>> opened (warn and essentially read only zeroes from that point on).
> 
> This is a nice improvement, thanks David.  But I did not get time to
> review it yet.  The overall idea is good, I would prefer to hold on the
> patches for some time and waiting for more review.
> 
> Sorry for jumping in late.

I really want this in v5.16. Please see the comment in

https://lkml.kernel.org/r/20211006122709.27885-1-david@redhat.com

Can we just fix any fallout (if any) as usual after the merge window?

-- 
Thanks,

David / dhildenb



^ permalink raw reply	[flat|nested] 98+ messages in thread

* Re: [patch 08/87] proc/vmcore: convert oldmem_pfn_is_ram callback to more generic vmcore callbacks
  2021-11-09  6:40     ` David Hildenbrand
@ 2021-11-09 10:30       ` Dave Young
  0 siblings, 0 replies; 98+ messages in thread
From: Dave Young @ 2021-11-09 10:30 UTC (permalink / raw)
  To: David Hildenbrand
  Cc: Andrew Morton, bhe, boris.ostrovsky, Borislav Petkov,
	H. Peter Anvin, jasowang, jgross, linux-mm, mhocko, Ingo Molnar,
	mm-commits, MST, osalvador, rafael.j.wysocki, rppt, sstabellini,
	Thomas Gleixner, torvalds, Goyal, Vivek

On Tue, 9 Nov 2021 at 14:40, David Hildenbrand <david@redhat.com> wrote:

> On 09.11.21 04:59, Dave Young wrote:
> > Hi Andrew,
> > On 11/08/21 at 06:31pm, Andrew Morton wrote:
> >> From: David Hildenbrand <david@redhat.com>
> >> Subject: proc/vmcore: convert oldmem_pfn_is_ram callback to more generic vmcore callbacks
> >>
> >> Let's support multiple registered callbacks, making sure that registering
> >> vmcore callbacks cannot fail.  Make the callback return a bool instead of
> >> an int, handling how to deal with errors internally.  Drop unused
> >> HAVE_OLDMEM_PFN_IS_RAM.
> >>
> >> We soon want to make use of this infrastructure from other drivers:
> >> virtio-mem, registering one callback for each virtio-mem device, to
> >> prevent reading unplugged virtio-mem memory.
> >>
> >> Handle it via a generic vmcore_cb structure, prepared for future
> >> extensions: for example, once we support virtio-mem on s390x where the
> >> vmcore is completely constructed in the second kernel, we want to detect
> >> and add plugged virtio-mem memory ranges to the vmcore in order for them
> >> to get dumped properly.
> >>
> >> Handle corner cases that are unexpected and shouldn't happen in sane
> >> setups: registering a callback after the vmcore has already been opened
> >> (warn only) and unregistering a callback after the vmcore has already been
> >> opened (warn and essentially read only zeroes from that point on).
> >
> > This is a nice improvement, thanks David.  But I did not get time to
> > review it yet.  The overall idea is good, I would prefer to hold on the
> > patches for some time and waiting for more review.
> >
> > Sorry for jumping in late.
>
> I really want this in v5.16. Please see the comment in
>
> https://lkml.kernel.org/r/20211006122709.27885-1-david@redhat.com
>
> Can we just fix any fallout (if any) as usual after the merge window?
>

Hi David, sounds good to me if there are no other objections.

As we discussed offline in another thread, we can address any issues later
if they come up, e.g. by measuring the dump performance.

Thanks
Dave

>
> --
> Thanks,
>
> David / dhildenb
>
>

^ permalink raw reply	[flat|nested] 98+ messages in thread

* Re: [patch 08/87] proc/vmcore: convert oldmem_pfn_is_ram callback to more generic vmcore callbacks
  2021-11-09  2:31 ` [patch 08/87] proc/vmcore: convert oldmem_pfn_is_ram callback to more generic vmcore callbacks Andrew Morton
  2021-11-09  3:59   ` Dave Young
@ 2021-11-10  7:22   ` Baoquan He
  2021-11-10  8:10     ` David Hildenbrand
  1 sibling, 1 reply; 98+ messages in thread
From: Baoquan He @ 2021-11-10  7:22 UTC (permalink / raw)
  To: david
  Cc: boris.ostrovsky, bp, Andrew Morton, dyoung, hpa, jasowang,
	jgross, linux-mm, mhocko, mingo, mm-commits, mst, osalvador,
	rafael.j.wysocki, rppt, sstabellini, tglx, torvalds, vgoyal

On 11/08/21 at 06:31pm, Andrew Morton wrote:
> From: David Hildenbrand <david@redhat.com>
> Subject: proc/vmcore: convert oldmem_pfn_is_ram callback to more generic vmcore callbacks
> 
> Let's support multiple registered callbacks, making sure that registering
> vmcore callbacks cannot fail.  Make the callback return a bool instead of
> an int, handling how to deal with errors internally.  Drop unused
> HAVE_OLDMEM_PFN_IS_RAM.
> 
> We soon want to make use of this infrastructure from other drivers:
> virtio-mem, registering one callback for each virtio-mem device, to
> prevent reading unplugged virtio-mem memory.
> 
> Handle it via a generic vmcore_cb structure, prepared for future
> extensions: for example, once we support virtio-mem on s390x where the
> vmcore is completely constructed in the second kernel, we want to detect
> and add plugged virtio-mem memory ranges to the vmcore in order for them
> to get dumped properly.
> 
> Handle corner cases that are unexpected and shouldn't happen in sane
> setups: registering a callback after the vmcore has already been opened
> (warn only) and unregistering a callback after the vmcore has already been
> opened (warn and essentially read only zeroes from that point on).
                               ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

I am fine with the whole patch except for one concern. As the sentence
underlined above states, if a callback is unregistered after the vmcore
has been opened, reads return zeroes from that point on. This is done by
checking the global variable 'vmcore_cb_unstable' in pfn_is_ram(). As a
result, makedumpfile will only be able to read zero pages from then on,
and the dump may take much longer to finish.
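
A rough userspace sketch of the check described above may help readers
following along. It is not the kernel code: struct sketch_cb,
sketch_hole_cb() and sketch_pfn_is_ram() are hypothetical stand-ins, and
an array replaces the kernel's linked list. It only models the rule from
the patch: once the "unstable" flag is set every PFN is reported as
non-RAM, otherwise a PFN is RAM unless some callback says it is not.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the kernel's struct vmcore_cb. */
struct sketch_cb {
	bool (*pfn_is_ram)(struct sketch_cb *cb, unsigned long pfn);
};

/* Hypothetical callback: PFNs 100..199 are not backed by RAM. */
static bool sketch_hole_cb(struct sketch_cb *cb, unsigned long pfn)
{
	(void)cb;
	return pfn < 100 || pfn >= 200;
}

/* Models the decision made by pfn_is_ram() in the quoted patch. */
static bool sketch_pfn_is_ram(struct sketch_cb **cbs, size_t n,
			      bool unstable, unsigned long pfn)
{
	size_t i;

	/* After a surprise unregistration, nothing is treated as RAM. */
	if (unstable)
		return false;

	for (i = 0; i < n; i++) {
		if (!cbs[i]->pfn_is_ram)
			continue;	/* callback has no opinion */
		if (!cbs[i]->pfn_is_ram(cbs[i], pfn))
			return false;	/* any veto wins */
	}
	return true;			/* default: it is RAM */
}
```

With no callbacks registered and the flag clear, every PFN is reported
as RAM, which matches the "bool ret = true" default in the patch.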

Please see remap_oldmem_pfn_checked(). In makedumpfile, we mmap a 4M
memory region at a time by default, then copy it out. With this patch, if
vmcore_cb_unstable is true, the kernel will mmap page by page. The extra
time could be huge, e.g. on a machine with TBs of memory, and with high
probability we only get a useless vmcore because the core data is lost.
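
To put a number on the per-page work, here is a small userspace sketch.
It assumes 4 KiB pages, and struct sketch_state and
sketch_checks_for_mmap() are made-up names, not kernel symbols. It
mirrors the branch in vmcore_remap_oldmem_pfn() from the patch: the
checked, page-by-page path is taken whenever a callback is registered or
the unstable flag is set.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define SKETCH_PAGE_SIZE 4096UL	/* assumed page size */

struct sketch_state {
	size_t nr_callbacks;	/* number of registered callbacks */
	bool unstable;		/* set after a surprise unregistration */
};

/* Number of per-page checks needed to mmap a region of len bytes. */
static size_t sketch_checks_for_mmap(const struct sketch_state *s, size_t len)
{
	/* Fast path: one range remap, no per-page work. */
	if (s->nr_callbacks == 0 && !s->unstable)
		return 0;

	/* Checked path: every page in the region is examined. */
	return (len + SKETCH_PAGE_SIZE - 1) / SKETCH_PAGE_SIZE;
}
```

Under these assumptions a single 4M mmap window costs 1024 per-page
checks on the checked path, versus none on the fast path.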

I am wondering whether we can simply panic in this case: since the rest of
the dump is all zeroes, the vmcore is very likely unusable anyway.

......
  
>  static bool pfn_is_ram(unsigned long pfn)
>  {
> -	int (*fn)(unsigned long pfn);
> -	/* pfn is ram unless fn() checks pagetype */
> +	struct vmcore_cb *cb;
>  	bool ret = true;
>  
> -	/*
> -	 * Ask hypervisor if the pfn is really ram.
> -	 * A ballooned page contains no data and reading from such a page
> -	 * will cause high load in the hypervisor.
> -	 */
> -	fn = oldmem_pfn_is_ram;
> -	if (fn)
> -		ret = !!fn(pfn);
> +	lockdep_assert_held_read(&vmcore_cb_rwsem);
> +	if (unlikely(vmcore_cb_unstable))
> +		return false;
> +
> +	list_for_each_entry(cb, &vmcore_cb_list, next) {
> +		if (unlikely(!cb->pfn_is_ram))
> +			continue;
> +		ret = cb->pfn_is_ram(cb, pfn);
> +		if (!ret)
> +			break;
> +	}
>  
>  	return ret;
>  }
>  
......



^ permalink raw reply	[flat|nested] 98+ messages in thread

* Re: [patch 08/87] proc/vmcore: convert oldmem_pfn_is_ram callback to more generic vmcore callbacks
  2021-11-10  7:22   ` Baoquan He
@ 2021-11-10  8:10     ` David Hildenbrand
  2021-11-10 11:11       ` Dave Young
  0 siblings, 1 reply; 98+ messages in thread
From: David Hildenbrand @ 2021-11-10  8:10 UTC (permalink / raw)
  To: Baoquan He
  Cc: boris.ostrovsky, bp, Andrew Morton, dyoung, hpa, jasowang,
	jgross, linux-mm, mhocko, mingo, mm-commits, mst, osalvador,
	rafael.j.wysocki, rppt, sstabellini, tglx, torvalds, vgoyal

On 10.11.21 08:22, Baoquan He wrote:
> On 11/08/21 at 06:31pm, Andrew Morton wrote:
>> From: David Hildenbrand <david@redhat.com>
>> Subject: proc/vmcore: convert oldmem_pfn_is_ram callback to more generic vmcore callbacks
>>
>> Let's support multiple registered callbacks, making sure that registering
>> vmcore callbacks cannot fail.  Make the callback return a bool instead of
>> an int, handling how to deal with errors internally.  Drop unused
>> HAVE_OLDMEM_PFN_IS_RAM.
>>
>> We soon want to make use of this infrastructure from other drivers:
>> virtio-mem, registering one callback for each virtio-mem device, to
>> prevent reading unplugged virtio-mem memory.
>>
>> Handle it via a generic vmcore_cb structure, prepared for future
>> extensions: for example, once we support virtio-mem on s390x where the
>> vmcore is completely constructed in the second kernel, we want to detect
>> and add plugged virtio-mem memory ranges to the vmcore in order for them
>> to get dumped properly.
>>
>> Handle corner cases that are unexpected and shouldn't happen in sane
>> setups: registering a callback after the vmcore has already been opened
>> (warn only) and unregistering a callback after the vmcore has already been
>> opened (warn and essentially read only zeroes from that point on).
>                                ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> 
> I am fine with the whole patch except of one concern. As above sentence
> underscored states, if a callback is unregistered when vmcore has been
> opened, it will read out zeros from that point on. And it's done by
> judging global variable 'vmcore_cb_unstable' in pfn_is_ram(). This will
> cause vmcore dumping in makedumpfile only being able to read out zero
> page since then, and may cost long extra time to finish.
> 
> Please see remap_oldmem_pfn_checked(). In makedumpfile, we default to
> mmap 4M memory region at one time, then copy out. With this patch, and if
> vmcore_cb_unstable is true, kernel will mmap page by page. The extra
> time could be huge, e.g on machine with TBs memory, and we only get a
> useless vmcore because of loss of core data with high probability.

Thanks Baoquan for the quick review!

This code is really just to handle the unlikely case of a driver getting
unbound from a device that has a callback registered (e.g., a
virtio-mem-pci device). Something like this will never happen in
practice in a *sane* environment.

The only way I know of is userspace manually unbinding the driver
from a virtio-mem-pci device -- which is possible, but has no sane use
case, especially in a kdump environment. In that case, we'll

pr_warn_once("Unexpected vmcore callback unregistration\n");

to let user space know that something weird/unsupported is going on.

Long story short: if user space does something nasty, I don't see a
problem in some action taking a little longer.


> 
> I am thinking if we can simply panic in the case, since the left dumping
> are all zeroed, very likely the vmcore is unavailable any more.

IMHO panic() is a little bit too much. Instead of returning zeroes, we
could fail the read/mmap operation -- I considered that as an option
when I crafted/tested this patch, however, this approach here turned out
to be the easiest way to handle something that's really not
supported/advised and won't really happen in a sane environment.
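
The two options mentioned above (returning zeroes versus failing the
operation) can be contrasted in a small userspace sketch. Nothing here
is kernel code: enum sketch_policy and sketch_read() are hypothetical,
and -EIO is only an assumed error code, since the thread does not say
which error a failing read would return.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <string.h>
#include <sys/types.h>

/* Hypothetical policies for reads after a surprise unregistration. */
enum sketch_policy { SKETCH_ZERO_FILL, SKETCH_FAIL_READ };

/* Copies len bytes from src to dst, honoring the chosen policy. */
static ssize_t sketch_read(bool unstable, enum sketch_policy policy,
			   const char *src, char *dst, size_t len)
{
	if (unstable) {
		if (policy == SKETCH_FAIL_READ)
			return -EIO;	/* assumed error code */
		memset(dst, 0, len);	/* what the patch effectively does */
		return (ssize_t)len;
	}
	memcpy(dst, src, len);		/* normal case: real data */
	return (ssize_t)len;
}
```

With zero-fill the reader cannot tell lost data from genuinely empty
pages, while a failing read makes the problem visible to the dump tool
at the cost of an extra error path.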

-- 
Thanks,

David / dhildenb



^ permalink raw reply	[flat|nested] 98+ messages in thread

* Re: [patch 08/87] proc/vmcore: convert oldmem_pfn_is_ram callback to more generic vmcore callbacks
  2021-11-10  8:10     ` David Hildenbrand
@ 2021-11-10 11:11       ` Dave Young
  2021-11-10 11:21         ` David Hildenbrand
  0 siblings, 1 reply; 98+ messages in thread
From: Dave Young @ 2021-11-10 11:11 UTC (permalink / raw)
  To: David Hildenbrand
  Cc: Baoquan He, boris.ostrovsky, bp, Andrew Morton, hpa, jasowang,
	jgross, linux-mm, mhocko, mingo, mm-commits, mst, osalvador,
	rafael.j.wysocki, rppt, sstabellini, tglx, torvalds, vgoyal

Hi David,
On 11/10/21 at 09:10am, David Hildenbrand wrote:
> On 10.11.21 08:22, Baoquan He wrote:
> > On 11/08/21 at 06:31pm, Andrew Morton wrote:
> >> From: David Hildenbrand <david@redhat.com>
> >> Subject: proc/vmcore: convert oldmem_pfn_is_ram callback to more generic vmcore callbacks
> >>
> >> Let's support multiple registered callbacks, making sure that registering
> >> vmcore callbacks cannot fail.  Make the callback return a bool instead of
> >> an int, handling how to deal with errors internally.  Drop unused
> >> HAVE_OLDMEM_PFN_IS_RAM.
> >>
> >> We soon want to make use of this infrastructure from other drivers:
> >> virtio-mem, registering one callback for each virtio-mem device, to
> >> prevent reading unplugged virtio-mem memory.
> >>
> >> Handle it via a generic vmcore_cb structure, prepared for future
> >> extensions: for example, once we support virtio-mem on s390x where the
> >> vmcore is completely constructed in the second kernel, we want to detect
> >> and add plugged virtio-mem memory ranges to the vmcore in order for them
> >> to get dumped properly.
> >>
> >> Handle corner cases that are unexpected and shouldn't happen in sane
> >> setups: registering a callback after the vmcore has already been opened
> >> (warn only) and unregistering a callback after the vmcore has already been
> >> opened (warn and essentially read only zeroes from that point on).
> >                                ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> > 
> > I am fine with the whole patch except of one concern. As above sentence
> > underscored states, if a callback is unregistered when vmcore has been
> > opened, it will read out zeros from that point on. And it's done by
> > judging global variable 'vmcore_cb_unstable' in pfn_is_ram(). This will
> > cause vmcore dumping in makedumpfile only being able to read out zero
> > page since then, and may cost long extra time to finish.
> > 
> > Please see remap_oldmem_pfn_checked(). In makedumpfile, we default to
> > mmap 4M memory region at one time, then copy out. With this patch, and if
> > vmcore_cb_unstable is true, kernel will mmap page by page. The extra
> > time could be huge, e.g on machine with TBs memory, and we only get a
> > useless vmcore because of loss of core data with high probability.
> 
> Thanks Baoquan for the quick review!
> 
> This code is really just to handle the unlikely case of a driver getting
> unbound from a device that has a callback registered (e.g., a
> virtio-mem-pci device). Something like this will never happen in
> practice in a *sane* environment.
> 
> The only known way I know is if userspace manually unbinds the driver
> from a virtio-mem-pci device -- which is possible but especially in a
> kdump environment something without any sane use case. In that case, we'll
> 
> pr_warn_once("Unexpected vmcore callback unregistration\n");
> 
> to let user space know that something weird/unsupported is going on.
> 
> Long story short: if user space does something nasty, I don't see a
> problem in some action taking a little longer.
> 
> 
> > 
> > I am thinking if we can simply panic in the case, since the left dumping
> > are all zeroed, very likely the vmcore is unavailable any more.
> 
> IMHO panic() is a little bit too much. Instead of returning zeroes, we
> could fail the read/mmap operation -- I considered that as an option
> when I crafted/tested this patch, however, this approach here turned out
> to be the easiest way to handle something that's really not
> supported/advised and won't really happen in a sane environment.

I would still say that the most important task for kdump is to save the
vmcore successfully.  Even the above issue is not a common case it could
cause the vmcore to be useless.  It is understandable if the zeroed part
is only the virtio-mem part, but if all the remaining vmcore is zeroed
that it is bad and not acceptable for kdump. 

Sometimes a panic is not reproducible, so kdump may have only one chance
to save the vmcore.  So I think we should try our best to save useful
data for later debugging.

I would still suggest acquiring the lock when the vmcore is opened and
blocking driver vmcore_cb updates.  All drivers should be ready before the
vmcore is saved in the kdump OS/initramfs.  Since the case we are talking
about is not a common one, this should be the better approach.

> 
> -- 
> Thanks,
> 
> David / dhildenb
> 

Thanks
Dave




* Re: [patch 08/87] proc/vmcore: convert oldmem_pfn_is_ram callback to more generic vmcore callbacks
  2021-11-10 11:11       ` Dave Young
@ 2021-11-10 11:21         ` David Hildenbrand
  2021-11-10 11:28           ` Dave Young
  0 siblings, 1 reply; 98+ messages in thread
From: David Hildenbrand @ 2021-11-10 11:21 UTC (permalink / raw)
  To: Dave Young
  Cc: Baoquan He, boris.ostrovsky, bp, Andrew Morton, hpa, jasowang,
	jgross, linux-mm, mhocko, mingo, mm-commits, mst, osalvador,
	rafael.j.wysocki, rppt, sstabellini, tglx, torvalds, vgoyal

On 10.11.21 12:11, Dave Young wrote:
> Hi David,
> On 11/10/21 at 09:10am, David Hildenbrand wrote:
>> On 10.11.21 08:22, Baoquan He wrote:
>>> On 11/08/21 at 06:31pm, Andrew Morton wrote:
>>>> From: David Hildenbrand <david@redhat.com>
>>>> Subject: proc/vmcore: convert oldmem_pfn_is_ram callback to more generic vmcore callbacks
>>>>
>>>> Let's support multiple registered callbacks, making sure that registering
>>>> vmcore callbacks cannot fail.  Make the callback return a bool instead of
>>>> an int, handling how to deal with errors internally.  Drop unused
>>>> HAVE_OLDMEM_PFN_IS_RAM.
>>>>
>>>> We soon want to make use of this infrastructure from other drivers:
>>>> virtio-mem, registering one callback for each virtio-mem device, to
>>>> prevent reading unplugged virtio-mem memory.
>>>>
>>>> Handle it via a generic vmcore_cb structure, prepared for future
>>>> extensions: for example, once we support virtio-mem on s390x where the
>>>> vmcore is completely constructed in the second kernel, we want to detect
>>>> and add plugged virtio-mem memory ranges to the vmcore in order for them
>>>> to get dumped properly.
>>>>
>>>> Handle corner cases that are unexpected and shouldn't happen in sane
>>>> setups: registering a callback after the vmcore has already been opened
>>>> (warn only) and unregistering a callback after the vmcore has already been
>>>> opened (warn and essentially read only zeroes from that point on).
>>>                                ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
>>>
>>> I am fine with the whole patch except of one concern. As above sentence
>>> underscored states, if a callback is unregistered when vmcore has been
>>> opened, it will read out zeros from that point on. And it's done by
>>> judging global variable 'vmcore_cb_unstable' in pfn_is_ram(). This will
>>> cause vmcore dumping in makedumpfile only being able to read out zero
>>> page since then, and may cost long extra time to finish.
>>>
>>> Please see remap_oldmem_pfn_checked(). In makedumpfile, we default to
>>> mmap 4M memory region at one time, then copy out. With this patch, and if
>>> vmcore_cb_unstable is true, kernel will mmap page by page. The extra
>>> time could be huge, e.g on machine with TBs memory, and we only get a
>>> useless vmcore because of loss of core data with high probability.
>>
>> Thanks Baoquan for the quick review!
>>
>> This code is really just to handle the unlikely case of a driver getting
>> unbound from a device that has a callback registered (e.g., a
>> virtio-mem-pci device). Something like this will never happen in
>> practice in a *sane* environment.
>>
>> The only known way I know is if userspace manually unbinds the driver
>> from a virtio-mem-pci device -- which is possible but especially in a
>> kdump environment something without any sane use case. In that case, we'll
>>
>> pr_warn_once("Unexpected vmcore callback unregistration\n");
>>
>> to let user space know that something weird/unsupported is going on.
>>
>> Long story short: if user space does something nasty, I don't see a
>> problem in some action taking a little longer.
>>
>>
>>>
>>> I am thinking if we can simply panic in the case, since the left dumping
>>> are all zeroed, very likely the vmcore is unavailable any more.
>>
>> IMHO panic() is a little bit too much. Instead of returning zeroes, we
>> could fail the read/mmap operation -- I considered that as an option
>> when I crafted/tested this patch, however, this approach here turned out
>> to be the easiest way to handle something that's really not
>> supported/advised and won't really happen in a sane environment.
> 
> I would still say that the most important task for kdump is to save the
> vmcore successfully.  Even the above issue is not a common case it could
> cause the vmcore to be useless.  It is understandable if the zeroed part
> is only the virtio-mem part, but if all the remaining vmcore is zeroed
> that it is bad and not acceptable for kdump. 

Again, in a sane environment this will never happen.

Why are we discussing how to optimize a scenario where user space
does something that's clearly unsupported and will not happen in real life?

My take is to warn and fail as simply as possible, without hacking
around the issue (like blocking driver unloading while user space has
/proc/vmcore opened).

"remaining vmcore is zeroed that it is bad and not acceptable for kdump."

Which scenario are you concerned about? User space plays stupid games
(unbinding a driver from a virtio-mem device in a *kdump kernel* after
opening /proc/vmcore) and wins stupid prizes (a warning and a vmcore
filled (partially) with zeroes). Why isn't a warning sufficient for
something like that?

I appreciate all the feedback (even if it comes in late :) ), but I'm
missing why we are trying to optimize something here.

I'm happy to send a patch that does whatever we decide to do, but I
really don't see the need for a change. Most probably I'm missing
something important?

(the patch landed mainline in the meantime)

-- 
Thanks,

David / dhildenb




* Re: [patch 08/87] proc/vmcore: convert oldmem_pfn_is_ram callback to more generic vmcore callbacks
  2021-11-10 11:21         ` David Hildenbrand
@ 2021-11-10 11:28           ` Dave Young
  2021-11-10 12:05             ` David Hildenbrand
  0 siblings, 1 reply; 98+ messages in thread
From: Dave Young @ 2021-11-10 11:28 UTC (permalink / raw)
  To: David Hildenbrand
  Cc: Baoquan He, boris.ostrovsky, bp, Andrew Morton, hpa, jasowang,
	jgross, linux-mm, mhocko, mingo, mm-commits, mst, osalvador,
	rafael.j.wysocki, rppt, sstabellini, tglx, torvalds, vgoyal

On 11/10/21 at 12:21pm, David Hildenbrand wrote:
> On 10.11.21 12:11, Dave Young wrote:
> > Hi David,
> > On 11/10/21 at 09:10am, David Hildenbrand wrote:
> >> On 10.11.21 08:22, Baoquan He wrote:
> >>> On 11/08/21 at 06:31pm, Andrew Morton wrote:
> >>>> From: David Hildenbrand <david@redhat.com>
> >>>> Subject: proc/vmcore: convert oldmem_pfn_is_ram callback to more generic vmcore callbacks
> >>>>
> >>>> Let's support multiple registered callbacks, making sure that registering
> >>>> vmcore callbacks cannot fail.  Make the callback return a bool instead of
> >>>> an int, handling how to deal with errors internally.  Drop unused
> >>>> HAVE_OLDMEM_PFN_IS_RAM.
> >>>>
> >>>> We soon want to make use of this infrastructure from other drivers:
> >>>> virtio-mem, registering one callback for each virtio-mem device, to
> >>>> prevent reading unplugged virtio-mem memory.
> >>>>
> >>>> Handle it via a generic vmcore_cb structure, prepared for future
> >>>> extensions: for example, once we support virtio-mem on s390x where the
> >>>> vmcore is completely constructed in the second kernel, we want to detect
> >>>> and add plugged virtio-mem memory ranges to the vmcore in order for them
> >>>> to get dumped properly.
> >>>>
> >>>> Handle corner cases that are unexpected and shouldn't happen in sane
> >>>> setups: registering a callback after the vmcore has already been opened
> >>>> (warn only) and unregistering a callback after the vmcore has already been
> >>>> opened (warn and essentially read only zeroes from that point on).
> >>>                                ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> >>>
> >>> I am fine with the whole patch except of one concern. As above sentence
> >>> underscored states, if a callback is unregistered when vmcore has been
> >>> opened, it will read out zeros from that point on. And it's done by
> >>> judging global variable 'vmcore_cb_unstable' in pfn_is_ram(). This will
> >>> cause vmcore dumping in makedumpfile only being able to read out zero
> >>> page since then, and may cost long extra time to finish.
> >>>
> >>> Please see remap_oldmem_pfn_checked(). In makedumpfile, we default to
> >>> mmap 4M memory region at one time, then copy out. With this patch, and if
> >>> vmcore_cb_unstable is true, kernel will mmap page by page. The extra
> >>> time could be huge, e.g on machine with TBs memory, and we only get a
> >>> useless vmcore because of loss of core data with high probability.
> >>
> >> Thanks Baoquan for the quick review!
> >>
> >> This code is really just to handle the unlikely case of a driver getting
> >> unbound from a device that has a callback registered (e.g., a
> >> virtio-mem-pci device). Something like this will never happen in
> >> practice in a *sane* environment.
> >>
> >> The only known way I know is if userspace manually unbinds the driver
> >> from a virtio-mem-pci device -- which is possible but especially in a
> >> kdump environment something without any sane use case. In that case, we'll
> >>
> >> pr_warn_once("Unexpected vmcore callback unregistration\n");
> >>
> >> to let user space know that something weird/unsupported is going on.
> >>
> >> Long story short: if user space does something nasty, I don't see a
> >> problem in some action taking a little longer.
> >>
> >>
> >>>
> >>> I am thinking if we can simply panic in the case, since the left dumping
> >>> are all zeroed, very likely the vmcore is unavailable any more.
> >>
> >> IMHO panic() is a little bit too much. Instead of returning zeroes, we
> >> could fail the read/mmap operation -- I considered that as an option
> >> when I crafted/tested this patch, however, this approach here turned out
> >> to be the easiest way to handle something that's really not
> >> supported/advised and won't really happen in a sane environment.
> > 
> > I would still say that the most important task for kdump is to save the
> > vmcore successfully.  Even the above issue is not a common case it could
> > cause the vmcore to be useless.  It is understandable if the zeroed part
> > is only the virtio-mem part, but if all the remaining vmcore is zeroed
> > that it is bad and not acceptable for kdump. 
> 
> Again, in a sane environment this will never happen.
> 
> Why are we discussing on how to optimize a scenario where user space
> does something that's clearly unsupported and will not happen in real life?
> 
> My take is to warn and fail as simple as possible, without hacking
> around the issue (like blocking driver unloading while user space has
> /proc/vmcore opened.
> 
> "remaining vmcore is zeroed that it is bad and not acceptable for kdump."
> 
> Which scenario are you concerned about? User space plays stupid games
> (unbining a driver from a virtio-mem device in a *kdump kernel* after
> opening /proc/vmcore) and wins stupid prices (a warning and a vmcore
> filled (partially) with zeroes). Why isn't a warning sufficient for
> something like that?

Hi David,

Suppose we have the use case below:

A user plays the game (probably on the hypervisor side, but the user is
not aware that the guest panicked and is in a kdump kernel), then we get a
zeroed vmcore.  But the panic cannot easily be reproduced any more, so
the warning is not useful.

But if you think the user is playing the game in the kdump kernel, e.g. in
the guest OS while kdump is saving the vmcore, then it is nearly impossible
to happen and I agree with you that it is a very trivial problem.

Probably we have some misunderstanding, but it would be good to make it
clear :)
> 
> I appreciate all the feedback (even if it comes in late :) ), but I'm
> missing why we are trying to optimize something here.
> 
> I'm happy to send a patch that does whatever we decide to do, but I
> really don't see the need for a change. Most probably I'm missing
> something important?
> 
> (the patch landed mainline in the meantime)
> 
> -- 
> Thanks,
> 
> David / dhildenb
> 
Thanks
Dave




* Re: [patch 08/87] proc/vmcore: convert oldmem_pfn_is_ram callback to more generic vmcore callbacks
  2021-11-10 11:28           ` Dave Young
@ 2021-11-10 12:05             ` David Hildenbrand
  2021-11-10 13:11               ` Dave Young
  0 siblings, 1 reply; 98+ messages in thread
From: David Hildenbrand @ 2021-11-10 12:05 UTC (permalink / raw)
  To: Dave Young
  Cc: Baoquan He, boris.ostrovsky, bp, Andrew Morton, hpa, jasowang,
	jgross, linux-mm, mhocko, mingo, mm-commits, mst, osalvador,
	rafael.j.wysocki, rppt, sstabellini, tglx, torvalds, vgoyal

>> "remaining vmcore is zeroed that it is bad and not acceptable for kdump."
>>
>> Which scenario are you concerned about? User space plays stupid games
>> (unbining a driver from a virtio-mem device in a *kdump kernel* after
>> opening /proc/vmcore) and wins stupid prices (a warning and a vmcore
>> filled (partially) with zeroes). Why isn't a warning sufficient for
>> something like that?
> 
> Hi David,
> 
> Suppose we have the use case below:
> 

Hi Dave,

thanks for elaborating, it helps a lot to understand your concerns.

> A user plays with the game (Probably in hypervisor part, but the user is
> not aware that the guest panicked and in a kdump kernel), then we get a
> zeroed vmcore.   But the panic can not be easily reproduced any more,
> then the warning is not useful.

I can only speak about virtio-mem (well, that's the only currently known
"dynamic vmcore_cb registration" user :) ).

virtio-mem devices cannot get hotunplugged in the hypervisor (i.e.,
QEMU) -- you can only hot(un)plug device memory, but not the device
itself, it will stick around. Hotunplugging the device is completely
blocked and not supported.

The reason is simple: unplugging a virtio-mem device will also remove
the device memory. It's similar to other memory devices, such as DIMMs
-- I would not recommend forced, physical removal of a DIMM to anybody
-- not while the OS is running and not while kdump is saving
/proc/vmcore. Which is also the reason why hypervisors don't generally
support forced removal of such devices. :)

So for the currently known vmcore_cb users, hypervisor action cannot
result in driver unbinding and consequently vmcore_cb changes.

Note: virtio-mem-pci devices might eventually get hotplugged while kdump
is active. I assume we don't disable PCI hotplug in kdump kernels. While
this will trigger a warning ("Unexpected vmcore callback registration"),
the vmcore will not be affected and be complete.
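
For readers following along, the asymmetry described above (late
registration only warns, late unregistration warns and marks the vmcore
unstable) can be modelled in userspace roughly as follows. This is a
simplified sketch, not the kernel code: the names vmcore_cb, vmcore_opened,
vmcore_cb_unstable and pfn_is_ram() are taken from this thread, while the
singly linked list, the missing locking and example_cb_pfn_is_ram() are
assumptions made only for illustration.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Userspace model of a vmcore callback; the list handling is simplified. */
struct vmcore_cb {
	bool (*pfn_is_ram)(struct vmcore_cb *cb, unsigned long pfn);
	struct vmcore_cb *next;
};

static struct vmcore_cb *vmcore_cb_list;
static bool vmcore_opened;      /* set once /proc/vmcore has been opened */
static bool vmcore_cb_unstable; /* set after an unexpected unregistration */

/* Registering cannot fail; a late registration only triggers a warning. */
static void register_vmcore_cb(struct vmcore_cb *cb)
{
	cb->next = vmcore_cb_list;
	vmcore_cb_list = cb;
	if (vmcore_opened)
		fprintf(stderr, "Unexpected vmcore callback registration\n");
}

/* A late unregistration warns and marks the vmcore as unstable. */
static void unregister_vmcore_cb(struct vmcore_cb *cb)
{
	struct vmcore_cb **p;

	for (p = &vmcore_cb_list; *p; p = &(*p)->next) {
		if (*p == cb) {
			*p = cb->next;
			break;
		}
	}
	if (vmcore_opened) {
		fprintf(stderr, "Unexpected vmcore callback unregistration\n");
		vmcore_cb_unstable = true;
	}
}

/* Once unstable, no PFN is treated as RAM, so readers only see zeroes. */
static bool pfn_is_ram(unsigned long pfn)
{
	struct vmcore_cb *cb;

	if (vmcore_cb_unstable)
		return false;
	for (cb = vmcore_cb_list; cb; cb = cb->next) {
		if (cb->pfn_is_ram && !cb->pfn_is_ram(cb, pfn))
			return false;
	}
	return true;
}

/* Hypothetical callback: pretends PFNs 100..199 are unplugged memory. */
static bool example_cb_pfn_is_ram(struct vmcore_cb *cb, unsigned long pfn)
{
	(void)cb;
	return pfn < 100 || pfn >= 200;
}
```

In this model, registering a second callback after setting vmcore_opened
leaves pfn_is_ram() results unchanged, whereas unregistering any callback
at that point makes pfn_is_ram() return false for every PFN.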

> 
> But if you think user is playing the game in kdump kernel, eg. in guest
> os while kdump is saving vmcore then it is nearly not possible to happen
> I agree with you it is a very trival problem.

Yes, that's the only thing I think can happen. For example, doing a:

# echo 1 > /sys/devices/pci0000\:00/0000\:00\:03.0/remove

in a kdump kernel after opening the vmcore.

> 
> Probably we have some misunderstanding, but it would be good to make it
> clear :)

Understanding your concern, it could be future proof (for future
vmcore_cb users?) to fail the read/mmap operations instead of returning
zeroes. But even for new memory devices, unplug is usually something to be
fenced off by the hypervisor, just like not allowing forced DIMM removal.
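
To make the two options concrete, here is a small userspace sketch that
contrasts the current behaviour (hand back zeroes once the callbacks are
unstable) with the alternative of failing the access. Everything in it is
hypothetical: enum unstable_policy, read_page_model() and the choice of
-EIO as the error code are illustration only and are not taken from the
patch.

```c
#include <errno.h>
#include <stdbool.h>
#include <string.h>
#include <sys/types.h>

/* Hypothetical policy switch, not part of the patch under discussion. */
enum unstable_policy {
	UNSTABLE_ZERO_FILL, /* behaviour described in the patch: zeroes */
	UNSTABLE_FAIL,      /* alternative from this thread: fail the access */
};

/*
 * Model of reading one chunk of old memory. 'unstable' stands in for
 * vmcore_cb_unstable; -EIO is an assumed error code.
 */
static ssize_t read_page_model(char *buf, size_t len, const char *src,
			       bool unstable, enum unstable_policy policy)
{
	if (unstable) {
		if (policy == UNSTABLE_FAIL)
			return -EIO;
		/* Zero-fill: the caller cannot tell that data was lost. */
		memset(buf, 0, len);
		return (ssize_t)len;
	}
	memcpy(buf, src, len);
	return (ssize_t)len;
}
```

With UNSTABLE_FAIL, a dump tool would see the error on the first access
after the unregistration instead of silently writing zero pages, at the
cost of aborting the dump.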

The only thing I could imagine is having, e.g., a virtio-balloon device
register a vmcore_cb dynamically and provide a new mechanism to query
if a page is backed by a real page in the hypervisor (similar to Xen's
hypercall). Such a device could be unplugged without harm, as it doesn't
actually provide device memory.

-- 
Thanks,

David / dhildenb




* Re: [patch 08/87] proc/vmcore: convert oldmem_pfn_is_ram callback to more generic vmcore callbacks
  2021-11-10 12:05             ` David Hildenbrand
@ 2021-11-10 13:11               ` Dave Young
  0 siblings, 0 replies; 98+ messages in thread
From: Dave Young @ 2021-11-10 13:11 UTC (permalink / raw)
  To: David Hildenbrand
  Cc: Baoquan He, boris.ostrovsky, Borislav Petkov, Andrew Morton,
	H. Peter Anvin, Jason Wang, jgross, linux-mm, mhocko,
	Ingo Molnar, mm-commits, MST, osalvador, rafael.j.wysocki, rppt,
	sstabellini, Thomas Gleixner, torvalds, Goyal, Vivek


On Wed, 10 Nov 2021 at 20:06, David Hildenbrand <david@redhat.com> wrote:

> >> "remaining vmcore is zeroed that it is bad and not acceptable for
> kdump."
> >>
> >> Which scenario are you concerned about? User space plays stupid games
> >> (unbining a driver from a virtio-mem device in a *kdump kernel* after
> >> opening /proc/vmcore) and wins stupid prices (a warning and a vmcore
> >> filled (partially) with zeroes). Why isn't a warning sufficient for
> >> something like that?
> >
> > Hi David,
> >
> > Suppose we have the use case below:
> >
>
> Hi Dave,
>
> thanks for elaborating, it helps a lot to understand your concerns.
>
> > A user plays with the game (Probably in hypervisor part, but the user is
> > not aware that the guest panicked and in a kdump kernel), then we get a
> > zeroed vmcore.   But the panic can not be easily reproduced any more,
> > then the warning is not useful.
>
> I can only speak about virtio-mem (well, that's the only current known
> "dynamic vmcore_cb registration" user :) ).
>
> virtio-mem devices cannot get hotunplugged in the hypervisor (i.e.,
> QEMU)-- you can only hot(un)plug device memory, but not the device
> itself, it will stick around. Hotunplugging the device is completely
> blocked and not supported.
>
> The reason is simple: unplugging a virtio-mem device will also remove
> the device memory. It's similar to other memory devices, such as DIMMs
> -- I would not recommend forced, physical removal of a DIMM to anybody
> -- not while the OS is running and not while kdump is saving
> /proc/vmcore. Which is also the reason why hypervisors don't generally
> support forced removal of such devices. :)
>
> So for the currently known vmcore_cb users, hypervisor action cannot
> result in driver unbinding and consequently vmcore_cb changes.
>
> Note: virtio-mem-pci devices might eventually get hotplugged while kdump
> is active. I assume we don't disable PCI hotplug in kdump kernels. While
> this will trigger a warning ("Unexpected vmcore callback registration"),
> the vmcore will not be affected and be complete.
>

Ok, thanks for the details, it sounds safe for the time being then.


>
> >
> > But if you think user is playing the game in kdump kernel, eg. in guest
> > os while kdump is saving vmcore then it is nearly not possible to happen
> > I agree with you it is a very trival problem.
>
> Yes, that's the only thing I consider can happen. For example, doing a:
>
> # echo 1 > /sys/devices/pci0000\:00/0000\:00\:03.0/remove
>
> in a kdump kernel after opening the vmcore.
>
> >
> > Probably we have some misunderstanding, but it would be good to make it
> > clear :)
>
> Understanding your concern, it could be future proof (for future
> vmcore_cb users?) to fail the ioctls instead of returning 0. But even
> for new memory devices, unplug is usually something to be fenced off by
> the hypervisor, just like not allowing forced DIMM removal.
>

Yes,  there could be some future issues, not only for virt users, who
knows...


> The only think I could imagine is having e.g., virtio-balloon device
> register a vmcore_cb dynamically and providing a new mechanism to query
> if a page is backed by a real page in the hypervisor (similar to XENs
> hypercall). Such a device could be unplugged without harm, as it doesn't
> actually provide device memory.
>

> --
> Thanks,
>
> David / dhildenb
>
>



